
cs188-note13

Decision Networks

Nodes:

  • Chance nodes: behave just like the nodes in a Bayes' net
  • Action nodes: nodes we control, representing the choices we can make
  • Utility nodes: children of the first two kinds; they output a utility based on the values of their parents

In a decision network, our goal is still to choose the action that yields the maximum expected utility (MEU):

  • Given the evidence, calculate the posterior probabilities of all chance node parents of the utility node into which the action node feeds.
  • Compute the expected utility of taking each action given the posterior probabilities computed in the previous step. The expected utility of taking an action a given evidence e and n chance nodes is computed with the following formula:
$$EU(a \mid e) = \sum_{x_1, \ldots, x_n} P(x_1, \ldots, x_n \mid e)\, U(a, x_1, \ldots, x_n)$$
  • Finally, select the action that yields the highest expected utility to get the MEU.

The action that yields the maximum expected utility is the one to take, and is therefore the action the decision network recommends. More formally, the MEU action can be determined by taking the argmax of the expected utilities over the available actions.
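
As a rough illustration, here is a minimal Python sketch of this procedure for a single hypothetical chance node (the weather) and two actions; the posterior and the utility table are made-up numbers, not values from any particular network:

```python
# Minimal sketch of MEU action selection, assuming a single chance node
# ("weather") whose posterior P(x|e) has already been computed by inference.
# All names and numbers below are hypothetical.

posterior = {"sun": 0.7, "rain": 0.3}             # P(x | e)

utility = {                                       # U(a, x), hypothetical values
    ("leave_umbrella", "sun"): 100, ("leave_umbrella", "rain"): 0,
    ("take_umbrella", "sun"): 20,   ("take_umbrella", "rain"): 70,
}
actions = ["leave_umbrella", "take_umbrella"]

def expected_utility(action):
    """EU(a|e) = sum_x P(x|e) * U(a, x)."""
    return sum(p * utility[(action, x)] for x, p in posterior.items())

eu = {a: expected_utility(a) for a in actions}    # EU of every action
meu_action = max(eu, key=eu.get)                  # argmax_a EU(a|e)
print(eu)                                         # roughly 70 vs 35 here
print("MEU action:", meu_action, "with expected utility", eu[meu_action])
```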

Outcome Trees

An outcome tree simply unrolls the computation above into a tree diagram; an example is shown below.

(Figure: example outcome tree; original image: pACTg2T.png)

The Value of Perfect Information

General Formula

$$\begin{aligned}
MEU(e) &= \max_a \sum_s P(s \mid e)\, U(s, a) \\
MEU(e, e') &= \max_a \sum_s P(s \mid e, e')\, U(s, a) \\
MEU(e, E') &= \sum_{e'} P(e' \mid e)\, MEU(e, e') \\
VPI(E' \mid e) &= MEU(e, E') - MEU(e)
\end{aligned}$$
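
To make these formulas concrete, here is a sketch of the VPI computation on a similar toy model. The tables $P(s \mid e)$, $P(e' \mid e)$, $P(s \mid e, e')$ and the utilities are hypothetical; in a real decision network they would come out of inference over the chance nodes:

```python
# Sketch of VPI(E'|e) for a toy model with one state variable S (weather)
# and one candidate observation E' (a forecast). All tables below are
# hypothetical; in a real decision network they come from inference.

p_s_given_e = {"sun": 0.7, "rain": 0.3}                      # P(s | e)
p_ep_given_e = {"good_forecast": 0.6, "bad_forecast": 0.4}   # P(e' | e)
p_s_given_e_ep = {                                           # P(s | e, e')
    "good_forecast": {"sun": 0.95, "rain": 0.05},
    "bad_forecast":  {"sun": 0.325, "rain": 0.675},
}
utility = {                                                  # utility of (action, state)
    ("leave_umbrella", "sun"): 100, ("leave_umbrella", "rain"): 0,
    ("take_umbrella", "sun"): 20,   ("take_umbrella", "rain"): 70,
}
actions = ["leave_umbrella", "take_umbrella"]

def meu(p_s):
    """MEU under a posterior over S: max_a sum_s P(s) * U(s, a)."""
    return max(sum(p * utility[(a, s)] for s, p in p_s.items()) for a in actions)

meu_e = meu(p_s_given_e)                                     # MEU(e)
meu_e_Ep = sum(p * meu(p_s_given_e_ep[ep])                   # MEU(e, E')
               for ep, p in p_ep_given_e.items())
print("MEU(e) =", meu_e)
print("MEU(e, E') =", meu_e_Ep)
print("VPI(E'|e) =", meu_e_Ep - meu_e)                       # MEU(e, E') - MEU(e)
```

With these made-up numbers, observing the forecast raises the achievable expected utility from 70 to 78.5, so $VPI(E' \mid e) = 8.5$, which previews the nonnegativity property below.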

Properties of VPI

  • Nonnegativity: $\forall E', e \;\; VPI(E' \mid e) \geq 0$. Observing new evidence can never decrease the maximum expected utility you can achieve.
  • Nonadditivity: in general $VPI(E_j, E_k \mid e) \neq VPI(E_j \mid e) + VPI(E_k \mid e)$, because observing some new evidence $E_j$ might change how much we care about $E_k$.
  • Order-independence: $VPI(E_j, E_k \mid e) = VPI(E_j \mid e) + VPI(E_k \mid e, E_j) = VPI(E_k \mid e) + VPI(E_j \mid e, E_k)$; observing multiple pieces of evidence yields the same total gain regardless of the order in which they are observed.
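
The first and third properties can be checked numerically. The sketch below assumes a toy model with a hidden state $S$ and two conditionally independent noisy sensors $E_j$ and $E_k$ (all numbers invented for the example), and computes VPI directly from the joint distribution:

```python
# Numeric check of the VPI properties on a toy joint P(S, Ej, Ek), where S
# is a hidden state and Ej, Ek are two noisy sensors for S, conditionally
# independent given S. All numbers are invented for the example.
from itertools import product

states, e_vals = ["s0", "s1"], [0, 1]
acc = {"ej": 0.8, "ek": 0.7}                      # hypothetical sensor accuracies

def p_sensor(var, val, s):
    """P(var = val | S = s): the sensor reads 1 for s1 with probability acc[var]."""
    return acc[var] if (val == 1) == (s == "s1") else 1 - acc[var]

# Joint distribution P(S, Ej, Ek) with a uniform prior over S.
joint = {(s, ej, ek): 0.5 * p_sensor("ej", ej, s) * p_sensor("ek", ek, s)
         for s, ej, ek in product(states, e_vals, e_vals)}

utility = {("a0", "s0"): 10, ("a0", "s1"): 0,     # U(a, s): reward for guessing the state
           ("a1", "s0"): 0,  ("a1", "s1"): 10}
actions = ["a0", "a1"]

def mass(ev):
    """Total joint probability consistent with the partial evidence ev."""
    return sum(p for (s, ej, ek), p in joint.items()
               if ev.get("ej", ej) == ej and ev.get("ek", ek) == ek)

def meu(ev):
    """MEU(ev) = max_a sum_s P(s | ev) * U(a, s)."""
    post = {s: sum(p for (s2, ej, ek), p in joint.items()
                   if s2 == s and ev.get("ej", ej) == ej and ev.get("ek", ek) == ek)
            for s in states}
    z = sum(post.values())
    return max(sum(post[s] / z * utility[(a, s)] for s in states) for a in actions)

def vpi(var, ev):
    """VPI(var | ev) = sum_v P(var = v | ev) * MEU(ev, var = v) - MEU(ev)."""
    return sum(mass({**ev, var: v}) / mass(ev) * meu({**ev, var: v})
               for v in e_vals) - meu(ev)

# Nonnegativity: each sensor's VPI is >= 0.
print(vpi("ej", {}), vpi("ek", {}))
# Order-independence: the total gain from observing both sensors is the same
# whichever is read first (the second VPI is averaged over the first reading).
both_j_first = vpi("ej", {}) + sum(mass({"ej": v}) * vpi("ek", {"ej": v}) for v in e_vals)
both_k_first = vpi("ek", {}) + sum(mass({"ek": v}) * vpi("ej", {"ek": v}) for v in e_vals)
print(round(both_j_first, 9), round(both_k_first, 9))   # equal
```

With these invented numbers, $VPI(E_j \mid e) \approx 3$ and $VPI(E_k \mid e) \approx 2$ individually, yet observing both sensors together is worth only about 3, so the example also exhibits nonadditivity.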
This post is licensed under CC BY 4.0 by the author.