The nested abnormality theory described by
not only solves the problem
of projecting a specific strategy; it
also describes a particular use of
circumscription that allows reasoning
about declarative formalizations of
strategies of the sort proposed in this
paper. The expression
(strategy) denotes the
nested abnormality theory described by
, parameterized for different
strategy descriptions
. The idea
is to replace axioms
to
with other sets of axioms describing different
strategies, so that we can reason about the
behavior of each strategy.
We describe a scenario in which a program uses
(strategy) to reason about
declarative formalizations of strategies.
The program starts with an empty strategy.
As different heuristics are suggested by
the advisor, the program considers how
they may affect its problem-solving
behavior, and reacts accordingly.
The scenario illustrates the idea that a program capable of reasoning non-monotonically about declarative formalizations of strategies can exhibit interesting reflective behavior [McCarthy1990b] [McCarthy1995] [Steels1996] [Sierra1996]. For example, it can save computational resources by detecting uncomputable or incorrect strategies. It can determine which of the heuristics it is told about improve its behavior, and which are redundant, partially redundant, or inconsistent with its current strategy. It can improve its problem-solving strategy accordingly, by adding and substituting action selection rules and axioms. It can avoid inconsistencies, which might otherwise cause it to behave arbitrarily. And it can learn by taking advice [McCarthy1959] and reflecting on it.
Figure: Behavior of different strategies
for solving planning problems in the blocks
world: (1)
Strategy-1 and Strategy-3 are
both uncomputable, since they allow cyclic
behaviors;
(2) Strategy-2 describes a computable
but incorrect strategy: its unique terminal
situation is not a solution;
(3) Strategy-4 and Strategy-5
describe two computable and correct
strategies, together with their projections
for a particular blocks world problem. It is
easy to observe that Strategy-5 is
better than Strategy-4.
Initially, the advisor suggests using
the following heuristic:
If a block can be moved to final
position, this should be done right away.
The program constructs Strategy-1,
which is described by axiom .
The projection of Strategy-1 allows
cyclic behaviors, such as the one described
in fig.
. The program
concludes that Strategy-1 is not
computable.
We use the following formula to identify
cycles in the projections of
state-based strategies.
The expression
means that
there is a nonempty
sequence of selectable
situations that leads from s to
.
A state-based strategy contains a
cycle (axiom
) if there
is a nonempty sequence of selectable situations
that leads from a situation s to a different
situation
with the same associated state
(i.e., whose associated database contains the same
formulas as the database associated with s).
In particular, the program can prove that
holds
in the projection of Strategy-1 by
finding appropriate values for s and
.
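As a rough computational illustration (not part of the paper's circumscriptive formalism), the cycle condition can be checked over a finite projection: a strategy admits a cyclic behavior exactly when some associated state repeats along a path of selectable situations. The `has_cycle` helper and the toy successor functions below are hypothetical sketches.

```python
def has_cycle(initial, successors):
    """Return True if the projection admits a cyclic behavior, i.e. a
    nonempty sequence of selectable situations leading from a situation s
    to a distinct situation s' with the same associated state."""
    on_path, done = set(), set()

    def dfs(state):
        if state in on_path:       # state repeats along the current path: cycle
            return True
        if state in done:          # already fully explored, no cycle through here
            return False
        on_path.add(state)
        found = any(dfs(nxt) for nxt in successors(state))
        on_path.remove(state)
        done.add(state)
        return found

    return dfs(initial)
```

For instance, a two-state projection in which each state selects the other contains a cycle, while a finite chain does not.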
Instead of trying to apply Strategy-1 to
solve the problem, the program asks for
more advice. The advisor proposes a
different heuristic:
If a block is not in final
position and cannot be moved to final
position, it is better to move it to the
table than anywhere else.
The program constructs Strategy-2,
which is described by axiom
.
The projection of Strategy-2 is shown
in fig.
. Strategy-2 is
an incorrect strategy, because
it allows terminal situations which do not
satisfy the goal conditions.
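The incorrectness argument used here can be sketched as a reachability check over a finite projection, assuming toy `successors` and `is_solution` functions: a strategy is incorrect when some reachable terminal situation fails the goal.

```python
def terminal_states(initial, successors):
    """All terminal situations (states with no selectable move) reachable
    from the initial state in the projection of a strategy."""
    seen, frontier, terminals = {initial}, [initial], []
    while frontier:
        state = frontier.pop()
        moves = list(successors(state))
        if not moves:
            terminals.append(state)   # no selectable move: terminal situation
        for nxt in moves:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return terminals

def is_correct(initial, successors, is_solution):
    # Correct: every terminal situation of the projection satisfies the goal.
    return all(is_solution(s) for s in terminal_states(initial, successors))
```

A projection like Strategy-2's, whose unique terminal situation is not a solution, makes `is_correct` return False.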
The program still needs more advice. The
advisor now suggests considering both
heuristics together.
The program constructs Strategy-3,
which is described by axioms
and
.
The projection of Strategy-3 still
allows cyclic behaviors, such as the one
described in fig.
,
i.e. it satisfies
.
The advisor suggests a third heuristic:
If a block is in final position, do not move
it. The program constructs Strategy-4
as the set of axioms to
.
The set of situations that are selectable
according to the projection of Strategy-4
(see fig.
) is finite.
The program knows the strategy is correct,
since all its terminal situations happen to be
solutions. The second-order axiom of induction
for situations (
) is needed here
to prove that
holds
for all the situations not mentioned in the
theorem below.
The advisor suggests a fourth heuristic:
If there is a block that is above a block it
ought to be above in the goal configuration
but it is not in final position
(tower-deadlock), put it on the table.
The program constructs Strategy-5
as the set of axioms to
. The set of situations that
are selectable according to the projection of
Strategy-5
(see fig.
) is finite.
The program can prove that Strategy-5 is
correct. It can also conclude that
Strategy-5 is better than Strategy-4,
since it always solves the problem performing
an equal or smaller number of actions
.
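The comparison between the two strategies can be mimicked by running each deterministic strategy and counting the actions it performs before reaching a solution. The `step` functions below are hypothetical stand-ins for action-selection rules, not the paper's formal machinery.

```python
def plan_length(initial, step, is_solution, limit=1000):
    """Number of actions a deterministic strategy performs before reaching
    a solution, or None if it fails to terminate within the limit."""
    state, actions = initial, 0
    while not is_solution(state):
        if actions >= limit:
            return None            # give up: possibly an uncomputable strategy
        state = step(state)        # apply the strategy's selected action
        actions += 1
    return actions
```

One strategy is at least as good as another on a given problem when its `plan_length` is less than or equal to the other's.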
The advisor then suggests a fifth heuristic: If a block is on the table but not in final position, do not move anything onto that block.
The program constructs Strategy-6
as the set of axioms
to
and
. It can
check that the projections of
Strategy-5 and Strategy-6 are
identical. This means that suggestion
is redundant with its
current strategy. Therefore, including
it in the database will not improve the
program's behavior.
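The redundancy check can be pictured as comparing the two projections as sets of reachable edges: if adding the heuristic leaves the projection unchanged, the suggestion is redundant. The helpers below are illustrative assumptions, not the paper's circumscriptive machinery.

```python
def projection_edges(initial, successors):
    """The projection as the set of (state, successor-state) edges
    reachable from the initial state."""
    seen, frontier, edges = {initial}, [initial], set()
    while frontier:
        state = frontier.pop()
        for nxt in successors(state):
            edges.add((state, nxt))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return edges

def redundant(initial, succ_old, succ_new):
    # A new heuristic is redundant iff it leaves the projection unchanged.
    return projection_edges(initial, succ_old) == projection_edges(initial, succ_new)
```

A program that finds the old and new projections identical can conclude, as here, that adding the new heuristic will not change its behavior.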
The advisor finally suggests a sixth heuristic: If there is a block that is above a block it ought to be above in the goal configuration but it is not in final position (tower-deadlock), it is better to move it on top of a clear block that is in final position and should be clear on the goal configuration than anywhere else.
We use the following formula to detect
inconsistencies in the projection
of a strategy. The formulas
,
and
are examples of
verification axioms that can be added
to
in order to reason about the
behavior of different strategies.
The program constructs Strategy-7
as the set of axioms
to
and
.
Although axiom
describes
a plausible heuristic, it contradicts
axiom
. For example, axiom
implies that Move(A,T) is
good in the initial situation, whereas axioms
and
imply
that Move(A,T) is bad
.
Using axiom
, the program
can prove that Strategy-7 is
inconsistent.
Therefore, it rejects suggestion ,
because it is inconsistent with its current
strategy.
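The kind of contradiction detected here, where one axiom classifies Move(A,T) as good in a situation while others classify it as bad, can be sketched as a finite check over (situation, move) pairs. The `good` and `bad` predicates below are hypothetical stand-ins for the strategy's combined action-selection axioms.

```python
def inconsistent_pairs(situations, moves, good, bad):
    """The (situation, move) pairs that the combined action-selection
    rules classify as both good and bad; any such pair makes the
    strategy's projection inconsistent."""
    return [(s, m) for s in situations for m in moves
            if good(s, m) and bad(s, m)]
```

A nonempty result corresponds to a proof that the strategy is inconsistent, in which case the offending suggestion should be rejected.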