文章

cs188-note13

Decision Networks

Nodes:

  • Chance: 类比贝叶斯网络
  • Action: 我们能控制并做出选择的节点
  • Utility: 前两种的children,基于前两种输出一个效能

在决策网络中,我们的目标还是选择能产生maximum expected utility(MEU) 的行动:

  • calculate the posterior probabilities of all chance node parents of the utility node into which the action node feeds
  • compute the expected utility of taking that action given the posterior probabilities computed in the previous step. The expected utility of taking an action a given evidence e and n chance nodes is computed with the following formula:
\[EU(a|e) = \sum_{x_1, \ldots, x_n} P(x_1, \ldots, x_n | e) U(a, x_1, \ldots, x_n)\]
  • Finally, select the action which yielded the highest utility to get the MEU.

能产生最大预期效用的行动就是 “采取”,因此这也是决策网络向我们推荐的行动。更正式地说,可以通过求预期效用的 argmax 来确定能产生 MEU 的行动

Outcome Trees

就是把上面的内容拆成了图,下面是图

pACTg2T.png

The Value of Perfect Information

General Formula

\[\begin{align*} MEU(e) &= \max_a \sum_s P(s|e) U(s, a) \\ MEU(e, e') &= \max_a \sum_s P(s|e, e') U(s, a) \\ MEU(e, E') &= \sum_{e'} P(e'|e) MEU(e, e') \\ V{PI}(E'|e) &= MEU(e, E') - MEU(e) \end{align*}\]

Properties of VPI

  • Nonnegativity.
  • Nonadditivity. It’s true because generally observing some new evidence E j might change how much we care about Ek
  • Order-independence.
本文由作者按照 CC BY 4.0 进行授权