In this section and the next, we show how
the ideas outlined above can be applied to
formalize and reason about heuristics for
moving blocks in order to solve planning
problems in the blocks world. The strategies
formalized in the paper describe different
algorithms for solving planning problems in
the elementary blocks world domain [Gupta & Nau1991].
First, we summarize a very elegant and simple
formalization of STRIPS in the situation
calculus, and its application to reasoning
about action in the blocks world, described
in [McCarthy1985]. Then, we propose a
nested abnormality theory (NAT)
[Lifschitz1995]
that solves the problem of projecting the
strategy described by axioms to
. In section 3, this theory
is generalized into a class of NAT's
which apply circumscription in a particular way
that is useful for studying the projections of
declarative formalizations of strategies of the
sort described in this paper.
In [McCarthy1985], John McCarthy proposes a very simple formalization of STRIPS [Fikes & Nilsson1971] in the situation calculus. The formalization is as follows. STRIPS is a planning system that uses a database of logical formulas to represent information about a state. Each action has a precondition, an add list, and a delete list. When an action is considered, it is first determined whether its precondition is satisfied. If the precondition is met, then the sentences on the delete list are deleted from the database, and the sentences on the add list are added to it.
There are variables of the following sorts:
for situations (s, , ...), for
actions (a,
, ...), and for
propositional fluents (p,
, ...
)
.
Associated with each situation is a database
of propositions, and this gives us the wff
DB(p,s) asserting that p is in
the database associated with s. The
function Result maps a situation s
and an action a into the situation that
results when action a is performed in
situation s.
STRIPS is characterized by three predicates: (1) Prec(a,s) is true provided action a can be performed in situation s; (2) Delete(p,a,s) is true if proposition p is to be deleted when action a is performed in situation s; (3) Add(p,a,s) is true if proposition p is to be added when action a is performed in situation s. STRIPS has the single axiom
In [McCarthy1985], an example of how to
use this formalization of STRIPS to reason
about action in the blocks world is given.
The example is as follows (we have modified
the initial conditions, and added a
uniqueness of names axiom). The variables
x, y and z range over
blocks. The constant blocks used in the
example are A, B, C, D, E, F, and
T (for Table). The one kind
of propositional fluent is
On(x,y) describing the fact that
block x is on block y.
The one kind of action
is Move(x,y) denoting the act of
moving block x on top of block y.
We assume uniqueness of names for every
function symbol, and every pair of distinct
function symbols. The initial situation
is described by axiom
.
The precondition, delete and add
lists of Move(x,y) are characterized
by axioms
,
and
, respectively.
For example, from axioms to
we can prove that
In order to interpret action selection rules,
such as axioms to
,
in terms of the theory of action described above,
we need to establish a connection between what
holds at a situation and what is in the
database associated with that situation.
The databases in the formalization of the
blocks world presented above contain only
propositional fluents of the form On(x,y)
for x, y {A,B,C,D,E,F,T}. These
propositional fluents are called frame
fluents [McCarthy & Hayes1969] [Lifschitz1990],
since any configuration of the blocks
A,B,C,D,E, and F can be
described by combinations of their
values
.
The following axiom states that a frame fluent holds at a particular situation if and only if it is in the database associated with that situation.
The expression means that
can be reached [Reiter1993] from s
by executing a nonempty sequence of actions
(
is an abbreviation for
)
.
We include domain closure axioms for blocks,
actions and situations. Axiom
restricts the domain of situations to those
that can be reached from the initial situation
or from the goal situation
.
Finally, we introduce an axiom of induction
for situations
, which allows
us to prove that a property holds for all the
situations that can be reached from a given
situation.
In addition to frame fluents, we use a
number of derived fluents, such as
clear, final, above, tower-deadlock,
terminal, and achieved, which are
partially
defined in terms of the frame fluents.
Axiom describes the configuration
of the goal situation
.
In general, a problem will be described by
a set of conditions on some initial and goal
situations. The particular problem described
by axioms
and
characterizes completely the state of the
initial and goal situations, but this needs
not be the case. Specifying a set of
constrains on initial and goal situations
allows defining classes of problems, instead
of particular instances. Such constraints can
be used then to reason about the behavior of
different strategies on a class of problems.
The formulas presented so far allow us to
prove that some actions are good, bad
or better than others for a given goal
and a particular situation. But we aren't
still able to decide which situations are
selectable according to a particular
strategy.
Action selection rules do not give us
complete information. They don't tell us
which actions are ``not good'', ``not bad'',
or ``not better'' than others. In order to
interpret them in terms of action selection
(using axiom ),
we need to be able to jump to the conclusion
that an action is ''not good'' or ``not bad''
unless the heuristics known so far (axioms
to
) imply
that it is so. This incompleteness
of our formalization is also one of its main
advantages, because it allows us to refine
the problem solving strategy of a program by
simple additions of better heuristics.
We illustrate this idea with an advice
taking scenario later on.
The nested abnormality theory described
below characterizes the behavior of the
strategy described by axioms to
when it is applied to solve the
problem described by axioms
and
. That is, it allows us to
determine what situations (and, therefore,
what sequences of actions) can be selected
according to the strategy.
Let
be the conjunction of the
universal closures of the formulas
to
.
The expression describes a
nested abnormality theory in which
circumscription is
applied in the following way. The predicate
Good is circumscribed with respect to
the universal closures of the axioms describing
the strategy of the program (axioms
to
). The predicate
Bad is circumscribed with respect to
the result of the circumscription described
above and the universal closure of axiom
, which contributes to the
definition of Bad with positive
instances of
Better. We need to let Better
vary, because minimizing the extension of
Bad may affect (through axiom
) the extension of
Better. The latter circumscription,
which characterizes the extensions of the
predicates Good and Bad, is
conjoined with
, which describes the
theory of action assumed for the blocks world
(axioms
to
), the
specific problem reasoned about (axioms
and
), and the
mechanism for action selection (axiom
).