A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning