Tuesday, July 21, 2009

Yes son, you can compare apples to oranges...


One of the things that bothered me while tweaking and tuning the Keltis AI heuristics, was that ultimately things sometimes boiled down to the need to compare apples to oranges, unfortunately I do not remember the exact details and I am too lazy to dig them up, but I know that I had to compare values that I was not able to reduce to a common unit to measure by (like risk per example), it was really a matter of preference, this is not a new problem, and with my head in the details, I failed to notice the obvious, this is an old topic called utility that economists have been using for decades, of course, as usual, I said 'aha' just after getting my head out of the details and shipping.
It was no big deal though, I ended up using utility without knowing it.

Utility is 'the' way to compare apples to oranges, but what brings me to today's rant is that I remembered this while reading in the context of my ongoing current research in applying Reinforcement Learning to Animation planning.
The question in question is about a very valid question [ :) :D :P ] about the 'essence' of Reinforcement Learning (similar to http://rlai.cs.ualberta.ca/RLAI/rewardhypothesis.html) :

Is it sensible to treat all preferences as numeric rewards on a single scale? Theoretically, yes. There is a theorem (North [4]) that if you believe four fairly simple axioms about preferences, then you can derive the existence of a real-valued utility function. (The only mildly controversial axiom is substitutability: that if you prefer A to B, then you must prefer a coin flip between A and C to a coin flip between B and C.) Practically, it depends. Users often find it hard to articulate their preferences as numbers. (Example: you have to design the controller for a nuclear power plant. How many dollars is a human life worth?)
source: http://www.eecs.umich.edu/~baveja/RLMasses/node5.html#SECTION00032000000000000000

I could not find the original in free electronic format: "D. W. North. A tutorial introduction to decision theory. IEEE Transactions on Systems Man and Cybernetics, SSC-4(3), Sept. 1968. "

If anyone can provide it I would be grateful, it is always very insightful to read about the essence of these things, this usually involves reading very old papers, and from my experience it is always worth it, it gives lots of confidence when applying things later and when doubts appear, because much thought and critical thinking went into each and every 'fact' we take today for granted, and for too naive tomorrow.

.

No comments: