cs188-note13
Decision Networks
Nodes:
- Chance: 类比贝叶斯网络
- Action: 我们能控制并做出选择的节点
- Utility: 前两种的children,基于前两种输出一个效能
在决策网络中,我们的目标还是选择能产生maximum expected utility(MEU) 的行动:
- calculate the posterior probabilities of all chance node parents of the utility node into which the action node feeds
- compute the expected utility of taking that action given the posterior probabilities computed in the previous step. The expected utility of taking an action a given evidence e and n chance nodes is computed with the following formula:
- Finally, select the action which yielded the highest utility to get the MEU.
能产生最大预期效用的行动就是 “采取”,因此这也是决策网络向我们推荐的行动。更正式地说,可以通过求预期效用的 argmax 来确定能产生 MEU 的行动
Outcome Trees
就是把上面的内容拆成了图,下面是图
The Value of Perfect Information
General Formula
\[\begin{align*} MEU(e) &= \max_a \sum_s P(s|e) U(s, a) \\ MEU(e, e') &= \max_a \sum_s P(s|e, e') U(s, a) \\ MEU(e, E') &= \sum_{e'} P(e'|e) MEU(e, e') \\ V{PI}(E'|e) &= MEU(e, E') - MEU(e) \end{align*}\]Properties of VPI
- Nonnegativity.
- Nonadditivity. It’s true because generally observing some new evidence E j might change how much we care about Ek
- Order-independence.
本文由作者按照
CC BY 4.0
进行授权