Approximate optimality and the risk/reward tradeoff in a class of bandit problems