CS188 Lecture 4: Adversarial Search

Deterministic Games

Many possible formalizations, one is:

  • States: S (start at \(s_0\))
  • Players: P={1,2,...,N} (usually take turns)
  • Actions: A (may depend on player / state)
  • Transition Function: \(S\times A \rightarrow S\)
  • Terminal Test: \(S \rightarrow \{t,f\}\)
  • Terminal Utilities: \(S \times P \rightarrow \mathbb{R}\)

Solution for a player is a policy: \(S \rightarrow A\)
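
A minimal sketch of this formalization as a Python interface (the class and method names here are my own illustrative assumptions, not part of the lecture):

```python
from typing import Hashable, List

State = Hashable   # opaque game state
Action = Hashable  # opaque move
Player = int       # index of a player in P = {1, ..., N}

class DeterministicGame:
    """Illustrative interface mirroring the formalization above."""

    def initial_state(self) -> State: ...                 # s0
    def players(self) -> List[Player]: ...                # P
    def player_to_move(self, s: State) -> Player: ...     # whose turn it is
    def actions(self, s: State) -> List[Action]: ...      # A (may depend on state)
    def result(self, s: State, a: Action) -> State: ...   # transition S x A -> S
    def is_terminal(self, s: State) -> bool: ...          # terminal test S -> {t, f}
    def utility(self, s: State, p: Player) -> float: ...  # terminal utility S x P -> R
```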

Zero-Sum Games

Zero-Sum Games:

  • Agents have opposite utilities (values on outcomes)
  • Let's think of a single value that one maximizes and the other minimizes
  • Adversarial, pure competition

General Games:

  • Agents have independent utilities (values on outcomes)
  • Cooperation, indifference, competition, and more are all possible
  • More later on non-zero-sum games

Value of a state

Value of a state: The best achievable outcome (utility) from that state.

Non-Terminal States: \[ V(s) = \max_{s' \in \text{children}(s)} V(s') \]
Terminal States: \[ V(s) = \text{known} \]

Minimax Values

  • States Under Agent's Control: \[ V(s) = \max_{s' \in \text{successors}(s)} V(s') \]

  • States Under Opponent's Control: \[ V(s') = \min_{s \in \text{successors}(s')} V(s) \]

Solving the game in this way means we can compute the value of the root state.
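
Putting the terminal and non-terminal cases together, the minimax value is a single recursion over the game tree:

\[
V(s) =
\begin{cases}
\text{terminal utility of } s & \text{if } s \text{ is terminal} \\
\max_{s' \in \text{successors}(s)} V(s') & \text{if it is the agent's (MAX's) turn at } s \\
\min_{s' \in \text{successors}(s)} V(s') & \text{if it is the opponent's (MIN's) turn at } s
\end{cases}
\]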

Adversarial Search (Minimax)

  • Deterministic, zero-sum games:
    • One player maximizes result
    • The other minimizes result
  • Minimax search:
    • A state-space search tree
    • Players alternate turns
    • Compute each node's minimax value: the best achievable utility against a rational (optimal) adversary
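
A minimal recursive implementation of this search, assuming the illustrative `DeterministicGame` interface sketched earlier and a two-player zero-sum game where player 0 is MAX:

```python
def minimax_value(game, state, max_player=0):
    """Minimax value of `state` from MAX's point of view."""
    if game.is_terminal(state):
        return game.utility(state, max_player)
    values = [minimax_value(game, game.result(state, a), max_player)
              for a in game.actions(state)]
    if game.player_to_move(state) == max_player:
        return max(values)   # MAX node: best achievable value
    return min(values)       # MIN node: the adversary minimizes MAX's value

def minimax_decision(game, state, max_player=0):
    """Action at the root with the highest minimax value."""
    return max(game.actions(state),
               key=lambda a: minimax_value(game, game.result(state, a), max_player))
```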

Minimax Efficiency

  • How efficient is minimax?
    • Time: \(O(b^m)\), where \(b\) is the branching factor and \(m\) is the maximum depth of the tree
    • Space: \(O(bm)\)
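
For intuition, using the usual rough chess figures (my numbers, not this slide's): with branching factor \(b \approx 35\) and game length \(m \approx 100\) plies,

\[ b^m \approx 35^{100} \approx 10^{154} \text{ nodes,} \]

so exhaustive minimax is completely infeasible, even though the DFS-like \(O(bm)\) space requirement is negligible.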

Alpha-Beta Pruning

  • General configuration (MIN version)
    • We’re computing the MIN-VALUE at some node n
    • We’re looping over n’s children
    • n’s estimate of the children’s min is dropping
    • Who cares about n’s value? MAX
    • Let \(\alpha\) be the best value that MAX can get at any choice point along the current path from the root
    • If n becomes worse than \(\alpha\), MAX will avoid it, so we can stop considering n’s other children (it’s already bad enough that it won’t be played)
  • MAX version is symmetric
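
A sketch of the minimax recursion from above with alpha-beta pruning added (same assumed game interface; \(\alpha\) is the best value MAX can already guarantee along the current path, \(\beta\) the analogous bound for MIN):

```python
def alphabeta_value(game, state, alpha=float("-inf"), beta=float("inf"), max_player=0):
    """Minimax value of `state` with alpha-beta pruning."""
    if game.is_terminal(state):
        return game.utility(state, max_player)
    if game.player_to_move(state) == max_player:
        value = float("-inf")
        for a in game.actions(state):
            value = max(value, alphabeta_value(game, game.result(state, a),
                                               alpha, beta, max_player))
            if value >= beta:      # the MIN ancestor will never let play reach here
                return value       # prune the remaining children
            alpha = max(alpha, value)
        return value
    else:
        value = float("inf")
        for a in game.actions(state):
            value = min(value, alphabeta_value(game, game.result(state, a),
                                               alpha, beta, max_player))
            if value <= alpha:     # the MAX ancestor already has something better
                return value       # prune the remaining children
            beta = min(beta, value)
        return value
```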

Alpha-Beta Pruning Properties

  • This pruning has no effect on minimax value computed for the root!
  • Values of intermediate nodes might be wrong
    • Important: children of the root may have the wrong value
    • So the most naive version won't let you do action selection
  • Good child ordering improves effectiveness of pruning
  • With "perfect ordering":
    • Time complexity drops to \(O(b^{\frac{m}{2}})\)
    • Doubles solvable depth
    • Full search of, e.g. chess, is still hopeless...
  • This is a simple example of metareasoning (computing about what to compute)
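
One common way to get good child ordering in practice (a sketch, assuming a cheap static evaluation function like the one discussed below) is to sort successors by their estimated value before recursing, so the likely-best moves are searched first and cutoffs happen sooner:

```python
def ordered_actions(game, state, eval_fn, maximizing):
    """Order moves by a cheap static evaluation so alpha-beta prunes earlier."""
    return sorted(game.actions(state),
                  key=lambda a: eval_fn(game.result(state, a)),
                  reverse=maximizing)  # best-looking moves first for MAX, worst for MIN
```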

Resource Limits

  • Problem: In realistic games, cannot search to leaves!
  • Solution: Depth-limited search
    • Instead, search only to a limited depth in the tree
    • Replace terminal utilities with an evaluation function for non-terminal positions
  • Guarantee of optimal play is gone
  • More plies makes a BIG difference
  • Use iterative deepening for an anytime algorithm
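
A sketch of the depth-limited version (the `eval_fn` parameter is an assumed evaluation function that scores non-terminal states for MAX; `depth` counts remaining plies):

```python
def depth_limited_value(game, state, depth, eval_fn, max_player=0):
    """Minimax with a depth cutoff: non-terminals at depth 0 get a heuristic score."""
    if game.is_terminal(state):
        return game.utility(state, max_player)
    if depth == 0:
        return eval_fn(state)      # heuristic estimate, not a true minimax value
    values = [depth_limited_value(game, game.result(state, a), depth - 1,
                                  eval_fn, max_player)
              for a in game.actions(state)]
    if game.player_to_move(state) == max_player:
        return max(values)
    return min(values)
```

Iterative deepening then just calls this with depth 1, 2, 3, ... until time runs out, keeping the best move found so far.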

Evaluation Functions

  • Evaluation functions score non-terminals in depth-limited search
  • Ideal function: returns the actual minimax value of the position
  • In practice: typically a weighted linear sum of features:

\[ \mathrm{Eval}(s) = w_1 f_1(s) + w_2 f_2(s) + \cdots + w_n f_n(s) \]
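
A hedged example of such a weighted linear evaluation, with chess-like features and weights invented purely for illustration (the `state.count` accessor is assumed, not part of the lecture):

```python
def eval_material(state):
    """Eval(s) = w1*f1(s) + ... + wn*fn(s): a simple material-difference evaluation."""
    weights = {"pawn": 1.0, "knight": 3.0, "bishop": 3.0, "rook": 5.0, "queen": 9.0}
    # f_i(s): how many more pieces of each kind MAX has than MIN (assumed accessor)
    features = {piece: state.count(piece, "MAX") - state.count(piece, "MIN")
                for piece in weights}
    return sum(weights[p] * features[p] for p in weights)
```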

