Abstract
This paper models a decision maker as a rational
probabilistic decider (RPD) and investigates its behavior in stationary
and symmetric Markov switch environments. RPDs make decisions
based on penalty functions defined by the environment. The quality of
decision making depends on a parameter referred to as the level of
rationality. The dynamic behavior of RPDs is described by an ergodic
Markov chain. Two classes of RPDs are considered: local and
global. The former make decisions based only on the penalty in
the current state, while the latter take all states into account. It is shown that
asymptotically (in time and in the level of rationality) both classes behave
quite similarly. However, the second-largest eigenvalue of the Markov transition
matrix is smaller for global RPDs than for local ones, indicating
faster convergence to the optimal state. As an illustration, the behavior
of a chief executive officer, modeled as a global RPD, is considered, and
it is shown that the company's performance may or may not be
optimized, depending on the pay structure employed. While the
current paper investigates individual RPDs, a companion paper will
address collective behavior.
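
As a rough numerical sketch of the eigenvalue comparison above, the following
Python fragment contrasts a local and a global RPD in a hypothetical two-state
environment. The penalty values, the level of rationality N, and both
transition-matrix forms are assumptions chosen purely for illustration; they
are not the constructions used in the paper.

    import numpy as np

    # Hypothetical two-state environment: state 0 carries penalty 0.2 and
    # state 1 carries penalty 0.8, so state 0 is optimal. N plays the role
    # of the level of rationality (assumed value).
    p = np.array([0.2, 0.8])
    N = 4

    # Illustrative "local" RPD: the probability of leaving state i depends
    # only on the penalty observed in that state (assumed functional form).
    leave = 0.5 * p**N
    P_local = np.array([[1 - leave[0], leave[0]],
                        [leave[1],     1 - leave[1]]])

    # Illustrative "global" RPD: each row reselects a state using weights
    # based on all penalties (again an assumed form; identical rows give a
    # rank-one matrix, i.e., the fastest possible mixing).
    w = (1.0 / p)**N
    w /= w.sum()
    P_global = np.tile(w, (2, 1))

    def second_eig(P):
        """Magnitude of the second-largest eigenvalue of a stochastic matrix."""
        return np.sort(np.abs(np.linalg.eigvals(P)))[-2]

    print("local :", second_eig(P_local))   # closer to 1 -> slower convergence
    print("global:", second_eig(P_global))  # ~0 here -> near-immediate mixing

In this toy setup the local RPD's second-largest eigenvalue is about 0.79,
while the global RPD's is 0, consistent with the qualitative claim that global
RPDs converge faster to the optimal state.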