In reinforcement learning, a branch of machine learning, there is the concept of an optimal action-value function, represented by the symbol Q*. For a given state (read: situation) and action, it gives the expected cumulative reward (read: success) that the agent (read: decision maker) can achieve by taking that action and acting optimally from then on. The best action in any state is therefore the one with the highest Q* value. As this blog post neatly explains:

"If we know Q*, then we’re basically done. It tells us the right action to take. The optimal value function specifies the best possible performance in the MDP. An MDP is solved when we know the optimal value function."

MDP stands for Markov Decision Process: a sequence of discrete steps in which each outcome is partly up to chance and partly up to the decision maker, and depends only on the current situation rather than on the full history. That is not unlike many of the decisions business leaders have to make on a daily basis.
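To make the quoted claim a little more concrete, here is a minimal sketch in Python. It assumes a made-up MDP with two states and two actions; every state name, action name, probability, and reward is invented purely for illustration. It estimates Q* by Q-value iteration (repeatedly applying the Bellman optimality update, which requires knowing the transition probabilities, unlike Q-learning) and then reads off the best action in each state.

```python
# Hypothetical toy MDP: all names and numbers below are invented for illustration.
# transitions[state][action] -> list of (probability, next_state, reward)
transitions = {
    "quiet": {
        "wait":   [(1.0, "quiet", 0.0)],
        "invest": [(0.7, "busy", -1.0), (0.3, "quiet", -1.0)],
    },
    "busy": {
        "wait":   [(0.6, "busy", 2.0), (0.4, "quiet", 2.0)],
        "invest": [(0.9, "busy", 1.0), (0.1, "quiet", 1.0)],
    },
}
GAMMA = 0.9  # discount factor: how much future reward counts relative to immediate reward


def q_value_iteration(transitions, gamma, sweeps=500):
    """Approximate Q* by repeatedly applying the Bellman optimality update."""
    q = {s: {a: 0.0 for a in acts} for s, acts in transitions.items()}
    for _ in range(sweeps):
        # Each sweep builds a new table from the previous one.
        q = {
            s: {
                a: sum(p * (r + gamma * max(q[s2].values())) for p, s2, r in outcomes)
                for a, outcomes in acts.items()
            }
            for s, acts in transitions.items()
        }
    return q


def greedy_policy(q):
    """Once Q* is known, the best action in each state is the one with the highest value."""
    return {s: max(acts, key=acts.get) for s, acts in q.items()}


q_star = q_value_iteration(transitions, GAMMA)
policy = greedy_policy(q_star)
print(policy)
```

The last two lines are the point of the quote: once the table of Q* values exists, choosing the right action is a simple lookup of the largest entry for the current state.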

This blog is called Q* as its aim is to help you take optimal actions through data. As in Q-learning, in life we approach Q* one exploratory step at a time. Enjoy the journey.

— Ryan