FDT was developed as an alternative to causal decision theory (CDT) and evidential decision theory (EDT), aiming to address perceived shortcomings in both theories when applied to certain decision problems and game-theoretic scenarios. It builds upon and formalizes concepts from Yudkowsky's earlier Timeless Decision Theory (TDT).
The core principle of Functional Decision Theory is that rational agents should conceptualize their decision-making process as implementing a mathematical function. Rather than asking "What should I do?" an FDT agent asks "What output from the function that I implement would lead to the best outcomes?"[4]
This approach differs fundamentally from traditional decision theories in how it conceptualizes the relationship between an agent's decision and outcomes:
Causal Decision Theory: recommends actions based on their direct causal consequences.[5] CDT agents ask "What will happen if I take this action?" focusing on the causal chain that flows from their decision.
Evidential Decision Theory: recommends the action the agent would most want to learn it is going to take.[6] EDT agents treat their choice as evidence about the state of the world and choose the action they would be most pleased to discover they had chosen.
Functional Decision Theory: recommends actions by treating the agent's decision as determining the output of all computationally similar processes.[1] FDT agents ask "What would happen if the mathematical function I implement returned this output?", considering logical rather than merely causal connections; the three decision rules are contrasted schematically below.
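The following sketch loosely follows the notation of Yudkowsky and Soares rather than reproducing their full formal definition: A stands for the agent's act, V for the agent's utility, and FDT(P, G) for the output of the agent's decision function given its beliefs P and the decision problem G. The "intervention" in the FDT rule is on the output of the decision function itself, a logical rather than a physical dependence.

```latex
\begin{align*}
\text{CDT:}\quad & \arg\max_{a}\ \mathbb{E}\bigl[V \mid \operatorname{do}(A = a)\bigr]\\
\text{EDT:}\quad & \arg\max_{a}\ \mathbb{E}\bigl[V \mid A = a\bigr]\\
\text{FDT:}\quad & \arg\max_{a}\ \mathbb{E}\bigl[V \mid \operatorname{do}\bigl(\mathrm{FDT}(P, G) = a\bigr)\bigr]
\end{align*}
```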
FDT is grounded in three main philosophical arguments:[1][7]
Precommitment. FDT proponents argue that rational agents should be willing to precommit to certain strategies when they know doing so will lead to better outcomes. FDT naturally incorporates this willingness to precommit without requiring separate justification.
Information value. Traditional decision theories can treat free information as harmful: in some decision problems a CDT or EDT agent would pay to avoid learning relevant facts, in tension with the principle of total evidence.[8] FDT avoids this by focusing on functional relationships rather than merely causal or evidential ones.
Utility. FDT is designed to maximize expected utility across a broader range of scenarios than competing theories, particularly in cases involving prediction, simulation, or strategic interaction with similar agents.
FDT builds upon and supersedes Yudkowsky's earlier Timeless Decision Theory (TDT), introduced in 2010.[9] Yudkowsky and Soares describe FDT as a replacement for TDT that gives a more formal and precise treatment of the same underlying intuitions.[1]
In Newcomb's problem, an agent faces two boxes: one transparent containing $1,000, and one opaque containing either $1,000,000 or nothing. A reliable predictor, who has made similar predictions in the past and has been correct 99% of the time, claims to have placed $1,000,000 in the opaque box if she predicted that the agent would leave the transparent box behind. The predictor has already made her prediction and left. The agent can take either just the opaque box or both boxes.[10]
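The four possible outcomes can be tabulated as follows (amounts as stated above):

```latex
\begin{array}{l|cc}
 & \text{Predicted one-boxing} & \text{Predicted two-boxing}\\
\hline
\text{Take only the opaque box} & \$1{,}000{,}000 & \$0\\
\text{Take both boxes} & \$1{,}001{,}000 & \$1{,}000
\end{array}
```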
CDT recommends taking both boxes because the opaque box's contents are already causally determined by the predictor's past action. Since the agent's current choice cannot causally influence what was already placed in the box, CDT reasons that taking both boxes always yields $1,000 more than taking only the opaque box, regardless of what's inside it.[11]
EDT recommends taking only the opaque box because the agent's choice serves as evidence about what the predictor likely placed in the box. Since the predictor is highly accurate, choosing to one-box is strong evidence that the predictor predicted this choice and therefore placed $1,000,000 in the opaque box. EDT asks: "What choice would I be happiest to learn that I made?" and concludes that learning one has one-boxed (and thus likely received $1,000,000) is preferable to learning one has two-boxed (and thus likely received only $1,000).
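With the 99% accuracy figure given above, the evidential expected values work out as follows (a direct calculation from the stated payoffs, not taken from the cited sources):

```latex
\begin{align*}
\mathbb{E}[\text{one-box}] &= 0.99 \times \$1{,}000{,}000 + 0.01 \times \$0 = \$990{,}000\\
\mathbb{E}[\text{two-box}] &= 0.99 \times \$1{,}000 + 0.01 \times \$1{,}001{,}000 = \$11{,}000
\end{align*}
```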
FDT recommends taking only the opaque box,[12] reasoning that the agent's decision-making process and the predictor's prediction process are functionally linked through computational similarity. FDT treats the question not as "What should I choose, given that the box contents are fixed?" but rather "What output from my decision function would lead to the best overall outcome?" Since the predictor bases her prediction on modeling the agent's decision function, choosing to one-box functionally determines that the predictor placed $1,000,000 in the box, while choosing to two-box functionally determines that she placed nothing.[1]
The key distinction is that while EDT relies on evidential correlation between choice and outcome, FDT posits a functional connection: the same computational process that determines the agent's choice also determines (through the predictor's modeling) what was placed in the box.
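The reasoning of the three theories in this problem can be made concrete with a short, illustrative calculation. The sketch below is not from the cited sources; the payoffs, the 99% accuracy figure, and the function names are assumptions chosen to mirror the setup above. It shows that CDT evaluates each act against a fixed belief about the prediction (and two-boxing dominates for every such belief), while EDT and FDT both arrive at one-boxing, though for the different reasons just described.

```python
# Illustrative sketch (not from the cited sources): expected payoffs for
# one-boxing vs. two-boxing in Newcomb's problem under CDT-, EDT-, and
# FDT-style reasoning, using the 99%-accurate predictor described above.

ACCURACY = 0.99            # probability the prediction matches the agent's actual choice
OPAQUE_PRIZE = 1_000_000   # placed in the opaque box iff one-boxing was predicted
TRANSPARENT_PRIZE = 1_000  # always in the transparent box

def payoff(choice: str, predicted: str) -> int:
    """Money received given the agent's choice and the predictor's prediction."""
    opaque = OPAQUE_PRIZE if predicted == "one-box" else 0
    return opaque + (TRANSPARENT_PRIZE if choice == "two-box" else 0)

def cdt_value(choice: str, p_predicted_one_box: float) -> float:
    """CDT: the prediction is already fixed and causally independent of the
    current choice, so each act is evaluated against a fixed belief about it."""
    return (p_predicted_one_box * payoff(choice, "one-box")
            + (1 - p_predicted_one_box) * payoff(choice, "two-box"))

def edt_value(choice: str) -> float:
    """EDT: the choice is evidence about the prediction, so condition on it."""
    other = "two-box" if choice == "one-box" else "one-box"
    return ACCURACY * payoff(choice, choice) + (1 - ACCURACY) * payoff(choice, other)

def fdt_value(choice: str) -> float:
    """FDT: the predictor runs a model of the same decision function, so fixing
    the function's output also fixes the prediction (up to the 1% model error).
    The numbers coincide with EDT here, but the dependence is treated as
    functional/subjunctive rather than merely evidential."""
    other = "two-box" if choice == "one-box" else "one-box"
    return ACCURACY * payoff(choice, choice) + (1 - ACCURACY) * payoff(choice, other)

if __name__ == "__main__":
    for p in (0.0, 0.5, 1.0):
        # Two-boxing yields exactly $1,000 more for every fixed belief p.
        print(f"CDT, P(predicted one-box)={p}:",
              {c: cdt_value(c, p) for c in ("one-box", "two-box")})
    print("EDT:", {c: edt_value(c) for c in ("one-box", "two-box")})
    print("FDT:", {c: fdt_value(c) for c in ("one-box", "two-box")})
```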
^ Yudkowsky, Eliezer; Soares, Nate (2017). "Functional Decision Theory: A New Theory of Instrumental Rationality". arXiv:1710.05060.
^ Good, I. J. (1967). "On the Principle of Total Evidence". The British Journal for the Philosophy of Science. 17 (4): 319–321. doi:10.1093/bjps/17.4.319.
^ Yudkowsky, Eliezer (2010). "Timeless Decision Theory" (PDF). Machine Intelligence Research Institute (previously known as the Singularity Institute).