% This preamble comes from /u/jmc/let/latex-95.tex which is copied into
% latex sources by the macro latex-include-95 which is itself called
% by the macro mklatex-95 executed in /u/jmc/let/files
\documentclass[12pt]{article}
% BEGIN stuff for times and dates
\newcount\hours
\newcount\minutes
\newcount\temp
\newtoks\ampm
% set the real time of day
\def\setdaytime{%
\temp\time
\divide\temp 60
\hours\temp
\multiply\temp -60
\minutes\time
\advance\minutes \temp
\ifnum\hours =12 \ampm={p.m.} \else
\ifnum\hours >12 \advance\hours -12 \ampm={p.m.} \else
\ampm={a.m.}\ifnum\hours =0 \hours=12 \fi % print the midnight hour as 12, not 0
\fi \fi
}
\setdaytime
\def\theMonth{\relax
\ifcase\month\or
Jan\or Feb\or Mar\or Apr\or May\or Jun\or
Jul\or Aug\or Sep\or Oct\or Nov\or Dec\fi}
\def\theTime{\the\hours:\ifnum \minutes < 10 0\fi \the\minutes\ \the\ampm}
\def\jmcdate{\number\year\space\theMonth\space\number\day}
% END stuff for times and dates
%begin stuff for version to be read on-line
%output-xwindow
%\textheight 8.0in
%output-xwindow
%\textwidth 6.5in
%output-powerbook
%\textheight 4.1in % This height setting is for reading on the
%%Powerbook.
%output-xwindow
%\textheight 7.5in % This is for reading on an X-window.
%output-xwindow
%\textwidth 5.5in
%output-xwindow
%\oddsidemargin 0.0in
%output-xwindow
%\topmargin -0.5in
%output-xwindow
%\headheight 0.0in
% end stuff for version to be read on-line, but note
% that the whole text up to the bibliography
% is bracketed to be printed large.
%******************************
%******************************
%******************************
%******************************
\begin{document}
\bibliographystyle{alpha}
\title{A LOGICAL AI APPROACH TO CONTEXT}
\author{
{\Large\bf John McCarthy} \\
Computer Science Department \\
Stanford University \\
Stanford, CA 94305 \\
{\tt jmc@cs.stanford.edu} \\
{\tt http://www-formal.stanford.edu/jmc/}}
\date{\jmcdate ,\ \theTime}
\maketitle
%For reading on-line
%output-xwindow
% {\Large % On-line requires \Large
% Van Benthem wants /u/jmc/RMAIL.F95==120
\begin{abstract}
Logical AI develops computer programs that represent what they know
about the world primarily by logical formulas and decide what to do
primarily by logical reasoning---including nonmonotonic logical
reasoning. It is convenient to use logical sentences and terms
whose meaning depends on context. The reasons for this are similar
to those that lead human language to use context-dependent meanings.
This note gives elements of some of the formalisms to which we have
been led. Fuller treatments are in \cite{McC93}, \cite{guha-thesis}
and \cite{McCBuvac94} and the references cited in the Web page
\cite{Buvac95}. The first main idea is to make contexts first class
objects in the logic and use the formula $ist(c,p)$ to assert that
the \emph{proposition} $p$ is true in the \emph{context} $c$. A
second idea is to formalize how propositions true in one context
transform when they are moved to different but related contexts.
An ability to \emph{transcend} the outermost context is needed to
give computer programs the ability to reason about the totality of
all they have thought about so far \cite{McC96}.
\end{abstract}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Introduction}
As requested by Johan van Benthem, this is a brief introduction to the
logical formalism for context being explored by John McCarthy and
Sa\v{s}a Buva\v{c} at Stanford University. It is motivated by the
need to use contexts as first class objects for artificial
intelligence. I hope the description is suitable for comparison with
other approaches to context that often have other motivations.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Features of the Formalism}
Here are some features of our formalizations.
% begin here
\begin{enumerate}
\item We offer no \emph{definition} of context. There are mathematical
context structures with different properties, some of which are
useful. Asking what a context is is like asking what a group
element is. See section \ref{sec:abstract} for more on this.
\item Sentences about propositions and contexts are built up from a
formula $ist(c,p)$ which is to be understood as asserting that the
{\em proposition} $p$ is true in the {\em context} $c$. When we
have entered the context $c$, we can write
\begin{equation}
c:\quad\quad p.
\end{equation}
\item
Once a program has inferred a sentence $q$ from $p$, it can
\emph{leave} the context $c$ and have $ist(c,q)$. This generalizes
natural deduction.
\item Reasoning and communicating in context permits taking only
limited phenomena into account. Treating contexts as objects
permits stating the limitations explicitly within the formalism.
\item Statements about contexts are themselves in contexts.
\item There is no universal context. This is a fact of epistemology
(both of the physical world and the mathematical world). It is
always possible to generalize the concepts one has used up to the
present. Attempts at ultimate definitions always fail---and usually
in uninteresting ways. Humans and machines must start at middle
levels of the conceptual world and both specialize and
generalize.
\item We can deal with this phenomenon in our formalism by
ensuring that it is always possible to \emph{transcend} the
outermost context used so far. Thus a robot designed in this way is
not stuck with the concepts it has been given.
\item Because of the possibility of transcendence, the use of contexts
as objects is not just a matter of efficiency. Any given set of
sentences including contexts can always be \emph{flattened} (at the
cost of lengthening) to eliminate explicit contexts. However, the
resulting flat theory can no longer be transcended within the
formalism, because it is not an object that can be referred to as a
whole.
\item There is often a theory associated with a context---the set of
sentences true in the context. However, two contexts with the same
theory need not be the same, because they may have different
relations with other contexts. Not all useful contexts will be
closed under logical inference.
\item We advocate using \emph{propositions} as discussed in
\cite{McC79b} for the objects true in contexts rather than logical
or natural language sentences. This has the advantage that the set
of propositions true in a context may be finite when the set of
sentences that can express these propositions will be infinite.
However, our present applications of context would work equally well
if sentences were used. Buva\v{c} and Mason
\cite{buvac-buvac-mason-95} treat $ist(c,p)$ as a modal logic
formula in a propositional theory.
\item Besides the truth of propositions in contexts, we consider the
value $value(c,exp)$ of a term $exp$ representing an
\emph{individual concept} in a context $c$ as discussed in
\cite{McC79b}. This presents problems beyond those presented by
propositions, because in general the space of values of individual
concepts will depend on some outer context.
\end{enumerate}
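The steps of entering and leaving a context described above can be
laid out as a short derivation. This is only a sketch in the notation
of this section. The names $p$ and $q$ are placeholders, and we assume
that $q$ follows from $p$ by reasoning available inside $c$.
\begin{displaymath}
\begin{array}{rll}
1. & ist(c,p) & \mbox{given in the outer context} \\
2. & c:\quad p & \mbox{enter $c$, from 1} \\
3. & c:\quad q & \mbox{inferred from 2 inside $c$} \\
4. & ist(c,q) & \mbox{leave $c$, from 3}
\end{array}
\end{displaymath}
One way to read the analogy with natural deduction is that entering
$c$ plays the role of making an assumption and leaving $c$ plays the
role of discharging it.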
\section{Applications}
Here are some applications of the logical theory of contexts.
\begin{enumerate}
\item Conventional linguistic applications like the referents of
pronouns can be treated using contexts as objects, but
formalized contexts are also useful for more complex anaphora. For
example, we need to relate the surgeon's ``Scalpel'' to the sentence
``Please hand me a number 3 scalpel''. See \cite{Buvac96}. These
applications require associating contexts with sentences or parts of
sentences.
\item Defining a theory in a narrow context in a way that permits it
to be \emph{lifted} to a richer outer context and applied.
\cite{McC93} discusses lifting a simple theory of $above(x,y)$ as the
transitive closure of $on(x,y)$ to an outer situation calculus
context that uses $on(x,y,s)$ and $above(x,y,s)$. A key formula of
that paper is
\begin{equation}
c:\hspace{0.3in}
(\forall x y s)(on(x,y,s)
\equiv ist(context\hbox{-}of\hbox{-}situation(s),on(x,y))),\label{on1}
\end{equation}
which relates the three argument situation calculus predicate
$on(x,y,s)$
and the two argument predicate
$on(x,y)$ of the specialized
theory of $on$ and $above$. The use of contexts to implement
``microtheories'' in Cyc is described in \cite{guha-thesis}.
This allowed people entering knowledge about some phenomenon,
e.g. automobiles, to do it in a limited context, but leave open
the ability to use the knowledge in a larger context.
\item Defining a narrow context for a problem and importing facts that
permit the problem to be solved by considering only a small set of
possibilities. For example, in formulating the missionaries and
cannibals problem a person or program must take a number of common
sense facts into account, but ends up with a 32-state space, because
all that is relevant in this context is the numbers of missionaries,
cannibals and boats on each bank of the river.
\item Relating databases with different conventions \cite{McCBuvac94}.
Imagine that the Air Force and the General Electric Company have
databases both of which include prices for the jet engines that the
company sells the Air Force. However, suppose the databases don't
agree on what the price covers, e.g. spare parts. We use one
context $c_{AF}$ for the Air Force database, another $c_{GE}$ for
the GE database, and a third context $c0$ that needs to relate
information from both. \emph{Lifting} formulas true in the context
$c0$ relate information in the different databases to the context
in which reasoning is done, e.g. they tell about the relation of
the prices listed in $c_{AF}$ and $c_{GE}$ to the inclusion or not
of spare parts.
\item Buva\v{c} and McCarthy have also discussed using context to
combine aspects of plans generated by different planners not
originally designed to work together---or plans originally intended
to work together but which have drifted apart in the course of
separate development.
\end{enumerate}
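The lifting described in item 2 above can be followed through on a
small instance. What follows is a sketch under stated assumptions
rather than a quotation from \cite{McC93}. The constants $A$, $B$,
$C$ and $s0$ are hypothetical, and $c_s$ abbreviates
$context\hbox{-}of\hbox{-}situation(s0)$. We assume a lifting formula
for $above$ of the same form as (\ref{on1}),
\begin{displaymath}
c:\hspace{0.3in}
(\forall x y s)(above(x,y,s)
\equiv ist(context\hbox{-}of\hbox{-}situation(s),above(x,y))),
\end{displaymath}
and we assume that the two axioms
$(\forall x y)(on(x,y) \supset above(x,y))$ and
$(\forall x y z)(above(x,y) \wedge above(y,z) \supset above(x,z))$
hold in $c_s$. These give only the half of the transitive closure
definition that the derivation needs.
\begin{displaymath}
\begin{array}{rll}
1. & c:\ on(A,B,s0) \wedge on(B,C,s0) & \mbox{assumed facts} \\
2. & c:\ ist(c_s,on(A,B)) \wedge ist(c_s,on(B,C)) & \mbox{1 and (\ref{on1})} \\
3. & c_s:\ on(A,B) \wedge on(B,C) & \mbox{enter $c_s$} \\
4. & c_s:\ above(A,B) \wedge above(B,C) & \mbox{3 and first axiom} \\
5. & c_s:\ above(A,C) & \mbox{4 and second axiom} \\
6. & c:\ ist(c_s,above(A,C)) & \mbox{leave $c_s$} \\
7. & c:\ above(A,C,s0) & \mbox{6 and lifting of $above$}
\end{array}
\end{displaymath}
Steps 3 to 5 use only the two argument predicates. The situation
argument reappears only when the result is lifted back to $c$.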
\section{Desiderata for a Mathematical Logic of Context}
\label{sec:abstract}
The simplest approach to a logic of context is to treat $ist(c,p)$ as
a modal operator with $p$ quantifier free. Sa\v{s}a Buva\v{c} and Ian
Mason \cite{buvac-buvac-mason-95} did this. However, the applications to
natural language, to databases and to formalizing common sense
knowledge and reasoning require a lot more. Here are some desiderata
for a formal theory.\footnote{Just so Johan doesn't get off
too easily in keeping his promise to make one.}
\begin{itemize}
\item $truths(c)$ is the set of $p$ such that $ist(c,p)$. In some
formalizations it will be a first class object. In any case we can
think about it in the metatheory.
\item The simplest possibility for $truths(c)$ for a particular
context $c$ is that it is an arbitrary set of propositions, i.e. not
required to be closed under some logical operations.
\item The second possibility is that $truths(c)$ is closed under
deduction in some logical system---perhaps the theory of contexts.
\item $truths(c)$ may be the set of propositions true about some
subject matter. We can assert propositions about this set of
propositions without knowing which propositions are in it.
\item Associated with at least some contexts is a domain $domain(c)$.
As with $truths(c)$, $domain(c)$ may be an object, presumably in a
higher level context, or it may be only in the metalanguage.
\end{itemize}
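The first three desiderata can be written out in set notation. This
is a sketch of one possible reading, not a settled formalization. The
symbol $\vdash_L$, standing for derivability in the unspecified
logical system $L$, is an assumption introduced only for this
illustration.
\begin{displaymath}
\begin{array}{l}
truths(c) = \{\, p \mid ist(c,p) \,\} \\[1ex]
\mbox{Closure under deduction in $L$:} \\
\quad \mbox{if } p_1,\ldots,p_n \in truths(c)
\mbox{ and } p_1,\ldots,p_n \vdash_L q,
\mbox{ then } q \in truths(c).
\end{array}
\end{displaymath}
Under the simplest possibility only the first line is imposed. Since
two contexts with the same theory need not be the same,
$truths(c1) = truths(c2)$ should not by itself imply $c1 = c2$.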
The variety of potential applications of contexts as objects suggests
looking at contexts as mathematics looks at group elements. Groups
were first identified as sets of transformations closed under certain
operations. However, it was noticed that the integers with addition
as an operation, the non-zero rationals with multiplication as an
operation and many others had the same algebraic properties. This
motivated the definition of an abstract group around the turn of the
twentieth century. In a corresponding abstract theory of contexts,
formulas expressing relations among contexts would be primary rather
than the propositions true in the contexts.
Thus the theory would emphasize $specializes(c1,c2,time)$ rather than
$ist(c,p)$.
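
To make the analogy concrete, recall that the abstract group axioms
constrain only how elements combine and say nothing about what the
elements are:
\begin{displaymath}
(x \cdot y)\cdot z = x \cdot (y \cdot z), \qquad
e \cdot x = x \cdot e = x, \qquad
x \cdot x^{-1} = x^{-1} \cdot x = e .
\end{displaymath}
An abstract theory of contexts would likewise consist of axioms about
relations such as $specializes$. As a purely hypothetical
illustration, not taken from the papers cited here, one candidate
axiom is transitivity at a fixed time:
\begin{displaymath}
\begin{array}{l}
(\forall c1\; c2\; c3\; t)(specializes(c1,c2,t) \wedge specializes(c2,c3,t) \\
\qquad\qquad \supset specializes(c1,c3,t)).
\end{array}
\end{displaymath}
Whether such an axiom should hold, and whether it should hold only by
default, is left open here.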
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Remarks}
Johan van Benthem asked for the following in soliciting this essay and
John Perry's.
\begin{quotation}
\emph{My proposal is the following. I would like to invite the two Johns
to send me a rough outline of their contribution. It would be good
if you could bring out (1) what the notion of context \textbf{is} and what
it \textbf{does} according to you: in both cases, I think you want it to
achieve `efficiency' and `portability' of information, (2) what is
involved in the dynamics of \textbf{changing contexts}, perhaps with
attendant changes in linguistic formulation (add or drop variables,
etcetera). I would then like to comment on this, adding some
thoughts on possible logical formalizations, emphasizing the
interplay between what is said in a formula and what remains
implicit in the models where it gets evaluated.}
\end{quotation}
I have rejected the idea of defining what a context \textbf{is}, but I
hope I have given some idea of what contexts do. The example relating the
three argument $on$ and the two argument $on$ should provide a basis
for comments. In the formulation of the ideas, the ability to combine
formulas arising in different contexts has been more important than
computational efficiency.
%output-xwindow
%} % This is the right bracket ending \Large
%\bibliography{/u/jmc/1/biblio}
% Its full name is /u/jmc/1/biblio.bib
\cite{McC93} and \cite{McCBuvac94} have additional references. Also
Sa\v{s}a Buva\v{c} has several other papers on context on his Web page
http://www-formal.stanford.edu/buvac/.
\begin{thebibliography}{BBM95}
\bibitem[BBM95]{buvac-buvac-mason-95}
Sa\v{s}a Buva\v{c}, Vanja Buva\v{c}, and Ian~A. Mason.
\newblock Metamathematics of contexts.
\newblock {\em Fundamenta Informaticae}, 23(3), 1995.
\bibitem[Buv95]{Buvac95}
Sa\v{s}a Buva\v{c}.
\newblock Sa\v{s}a Buva\v{c}'s Web page, 1995.
\newblock http://www-formal.stanford.edu/buvac/.
\bibitem[Buv96]{Buvac96}
Sa\v{s}a Buva\v{c}.
\newblock Resolving lexical ambiguity using a formal theory of context.
\newblock In {\em Semantic Ambiguity and Underspecification}. CSLI Lecture
Notes, Center for Studies in Language and Information, Stanford, CA, 1996.
\bibitem[Guh91]{guha-thesis}
R.~V. Guha.
\newblock {\em Contexts: A Formalization and Some Applications}.
\newblock PhD thesis, Stanford University, 1991.
\newblock Also published as technical report STAN-CS-91-1399-Thesis, and MCC
Technical Report Number ACT-CYC-423-91.
\bibitem[MB94]{McCBuvac94}
John McCarthy and Sa\v{s}a Buva\v{c}.
\newblock {Formalizing Context (Expanded Notes)}.
\newblock Technical Note STAN-CS-TN-94-13, Stanford University, 1994.
\bibitem[McC79]{McC79b}
John McCarthy.
\newblock First order theories of individual concepts and propositions.
\newblock In Donald Michie, editor, {\em Machine Intelligence}, volume~9.
Edinburgh University Press, Edinburgh, 1979.
\newblock Reprinted in \cite{mccarthy-book}.
\bibitem[McC90]{mccarthy-book}
John McCarthy.
\newblock {\em Formalizing Common Sense: Papers by John McCarthy}.
\newblock Ablex Publishing Corporation, 355 Chestnut Street, Norwood, NJ 07648,
1990.
\bibitem[McC93]{McC93}
John McCarthy.
\newblock Notes on formalizing context.
\newblock In {\em IJCAI-93}, 1993.
\newblock Available on http://www-formal.stanford.edu/jmc/.
\bibitem[McC96]{McC96}
John McCarthy.
\newblock Making robots conscious of their mental states.
\newblock In Stephen Muggleton, editor, {\em Machine Intelligence 15}. Oxford
University Press, 1996.
\newblock to appear, available on http://www-formal.stanford.edu/jmc/.
\end{thebibliography}
\vfill
{\tiny\rm\noindent /@steam.stanford.edu:/u/jmc/f95/context.tex: begun 1995 Sep 22, latexed
\jmcdate\ at \theTime}
\end{document}