
Relations between reality and appearance
John McCarthy, Stanford University

Apology: My knowledge of machine learning is no more recent than Tom Mitchell's book. Its chapters describe, except for inductive logic programming, programs aimed at classifying appearances.

We live in a complicated world that existed for billions of years before there were humans, and our sense organs give us limited opportunities to observe it directly. Four centuries of science tell us that we and the objects we perceive are built in a complicated way from atoms and, below atoms, quarks.

Science, since 1700, is far better established than any kind of philosophy. Bad philosophy has stunted AI, just as behaviorism stunted psychology for many decades.

Besides the fundamental realities behind appearance studied by science, there are hidden everyday realities: the three-dimensional reality behind two-dimensional images, hidden surfaces, objects in boxes, people's names, what people really think of us.

Appearance is quite different from reality. Most machine learning research has concerned the classification of appearances and has not involved inferring relations between reality and appearance. Robots and other AI systems will have to infer such relations.

Human common sense also reasons in terms of the realities that give rise to the appearances our senses provide us. Indeed young babies have some initial knowledge of the permanence of physical objects.

Perhaps if your philosophy rejects the notion of reality as a fundamental concept, you'll accept a notion of relative reality appropriate for the design and debugging of robots. Thus the robot needs to be designed to determine this relative reality from the appearance given by its inputs.

We'll discuss:

Dalton's atomic theory as a discovery of the reality behind appearance.

The use of touch in finding the shape of an object. Results of an experiment in drawing an object which one is only allowed to touch - not see.

A simple problem involving changeable two dimensional appearances and a three dimensional reality.

Some formulas relating appearance and reality in particular cases.

What can one know about a three dimensional object and how to represent this knowledge.

How scientific study and the use of instruments extend what can be learned from the senses. Thus a doctor's training involving dissection of cadavers enables him to determine something about the liver by palpation.


Some scientific discoveries like Galileo's involve discovering the relations between known entities. Patrick Langley's Bacon program did that.

John Dalton's postulation of atoms and molecules made up of fixed numbers of atoms of two or more kinds was much more creative and will be harder to make computers do.

The ancient ideas of Democritus and Lucretius that matter was made up from atoms had no important or even testable consequences. Dalton's did.

Giving each kind of atom its own atomic mass explained the complicated ratios of masses in a compound as representing small numbers of atoms in a molecule. Thus a sodium chloride (NaCl) molecule would have one atom of each of its elements. Water came out as HO.
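The arithmetic behind this step can be sketched in a few lines. The sketch below is only an illustration, not part of the original talk: it takes hydrogen's atomic mass as 1 and uses the approximate modern figure of 8 grams of oxygen per gram of hydrogen in water (an assumption of this sketch, not Dalton's own measurement). The function name `relative_mass` is hypothetical. It shows how the assumed molecular formula determines the relative atomic mass one infers.

```python
# Illustrative sketch: how an assumed molecular formula turns a measured
# mass ratio into a relative atomic mass (hydrogen taken as 1).

def relative_mass(mass_ratio, atoms_of_element, atoms_of_hydrogen):
    """Relative atomic mass of an element combined with hydrogen.

    mass_ratio: measured mass of the element per unit mass of hydrogen.
    atoms_of_element, atoms_of_hydrogen: atom counts in the assumed formula.

    From mass_ratio = (atoms_of_element * m) / (atoms_of_hydrogen * 1),
    solve for m.
    """
    return mass_ratio * atoms_of_hydrogen / atoms_of_element

# Approximate modern value for water; an assumption of this sketch.
OXYGEN_PER_HYDROGEN = 8.0

mass_if_HO = relative_mass(OXYGEN_PER_HYDROGEN, 1, 1)   # water assumed to be HO
mass_if_H2O = relative_mass(OXYGEN_PER_HYDROGEN, 1, 2)  # water assumed to be H2O

print(mass_if_HO, mass_if_H2O)
```

The same observation supports either value for oxygen, depending on the formula assumed. Choosing between the formulas required evidence beyond the combining masses themselves.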

The simplest forms of the atomic theory were inaccurate. [Early 19th century chemists were slow to realize that the hydrogen and oxygen molecules are H₂ and O₂ and not just H and O.] Computers also need to be able to propose theories adventurously and fix their inaccuracies later.

Only the relative masses of atoms could be proposed in Dalton's time. The first actual estimates of these masses were made by Maxwell and Boltzmann about 60 years after Dalton's proposal. They realized that the coefficients of viscosity, heat conductivity, and diffusion of gases as explained by the kinetic theory of gases depended on the actual sizes of molecules.

The last important scientific holdout against the reality of atoms, the chemist Wilhelm Ostwald, was convinced by Einstein's 1905 explanation of Brownian motion. The philosopher Ernst Mach was unconvinced.

The first actual pictures of atoms in the 1990s were a big surprise. An actual picture of a proton showing the quarks would be even more surprising and seems quite unlikely.

Philosophical point: Atoms cannot be regarded as just an explanation of the observations that led Dalton to propose them. Maxwell and Boltzmann used the notion to explain entirely different observations, and modern explanations of atoms are not at all based on the law of combining proportions. In short, atoms were discovered, not invented.


Most likely, it is still too hard to make programs that will invent elements, atoms, and molecules. Let's therefore try to write logical sentences that will introduce these concepts to a knowledge base that has no ideas of them.

We assume that the notions of a body being composed of parts and of mass have already been formalized, but the idea of atom has not. The idea of bodies being disjoint is also assumed to be formalized.

The following formulas approximate a fragment of high school chemistry and should be somewhat elaboration tolerant, e.g. should admit additional information about the structure of molecules. The situation argument is included only to point out that material bodies change in chemical reactions.





Getting reality from appearance is an inverse problem. Formulas and programs giving appearance as a function of reality and the circumstances of observation are easier to state and less likely to be ambiguous.

Reality is more stable than appearance. Formulas giving the effects of events (including actions) are almost always written in terms of reality.

The formulas that follow will need a situation or time argument once we consider changing appearances.


We begin with a little bit about touch rather than with vision. Imagine putting one's hand into one's pocket in order to take out one of the objects.


For now we needn't say anything about the texture except that it is distinguishable from other textures. Textures for touch have similarities to and differences from textures for vision. Both are very scale dependent.


How can we best express what a human can know and a robot should know about a three dimensional object? We start from a standard kind of object with particular types of objects and individual objects defined by successive approximations.

I propose starting with a rectangular parallelepiped, which we'll abbreviate rppd. An object is an rppd modified by dimension information, shape modifications, attached objects, information about its internal structure, location information, folding information, information about surfaces, and physical information like mass. Perhaps one should start even more simply with just a size: a ball too large to be included in the object and another too small to include it.

My small Swiss army knife is an rppd, 5cm by 2cm by 1.5cm, rounded in the width dimension at each end. Its largest surface has a smooth plastic surface texture, and its other surfaces are metallic with stripes parallel to the long axis, i.e. the backs of the blades. This description should suffice to find the knife in my pocket and get it out, even though it says nothing about the blades.
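One way to picture such a description as a data structure is sketched below. This is a minimal illustration under stated assumptions, not the author's formalism: the class names `Rppd` and `Surface`, the function `find_by_touch`, the texture labels, the tolerance, and the coin are all hypothetical. Textures are bare labels, since all that matters here is that they can be told apart.

```python
from dataclasses import dataclass, field

@dataclass
class Surface:
    name: str      # which face or faces of the rppd
    texture: str   # a bare label; only distinguishability matters

@dataclass
class Rppd:
    """An object described as a rectangular parallelepiped plus modifications."""
    name: str
    dims_cm: tuple                                   # (length, width, thickness)
    shape_mods: list = field(default_factory=list)   # free-text modifications
    surfaces: list = field(default_factory=list)

    def textures(self):
        return {s.texture for s in self.surfaces}

knife = Rppd(
    name="small Swiss army knife",
    dims_cm=(5.0, 2.0, 1.5),
    shape_mods=["rounded in the width dimension at each end"],
    surfaces=[Surface("largest", "smooth plastic"),
              Surface("others", "striped metal")],
)

# A second pocket object, invented purely for the example.
coin = Rppd("coin", (2.4, 2.4, 0.2), surfaces=[Surface("all", "ridged metal")])

def find_by_touch(pocket, felt_texture, felt_length_cm, tolerance_cm=0.5):
    """Names of objects consistent with a felt texture and a rough length."""
    return [obj.name for obj in pocket
            if felt_texture in obj.textures()
            and abs(max(obj.dims_cm) - felt_length_cm) <= tolerance_cm]

print(find_by_touch([knife, coin], "smooth plastic", 5.0))
```

The description is deliberately incomplete. Nothing about the blades is recorded, yet it is enough to pick the knife out of the pocket, and further fields can be added without changing what is already there.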

Consider a baby and a doll of the same size. Each may be described as an rppd with attached rppds in appropriate places for the arms, legs, and head. The most obvious and significant differences come in texture, motion, and family relationships.


Here's the appearance. The puzzle is: What is the reality behind the appearance? Clicking on the and signs is how one experiments.

The reality is three dimensional, while the appearance is two dimensional.

Those who implement displays know that computing appearance is difficult. Those who do computer vision know that inverting the relation is even more difficult.


The appearance in the puzzle is a genuine appearance. The reality behind the appearance is rather abstract. Thus the bodies have no thickness or mass. This doesn't seem to bother people; we're used to abstractions.

We use concepts like solid body, behind, part of, length, etc.

Some of these concepts may be learned by babies from experience, as Locke proposed. However, there is good evidence that many of them, e.g. solid body and behind, were learned by evolution and are built into human and most animal infants.

The quickest and most articulate human solution was by Donald Michie. Eventually machines will do better.

We introduce positions. There is a string of 13 positions. Bodies are also represented by strings of squares of length appropriate to the body. is either a color or a letter depending on the version of the puzzle.
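The relation between such a reality and its appearance can be sketched in code. The sketch below is an assumption-laden illustration, not the puzzle's actual rules: bodies are taken to be tuples of a letter, a start position, a length, and a depth, with smaller depth nearer the viewer. The names `render` and `bodies_behind` are hypothetical. Computing the appearance from the reality is a simple loop. Going the other way yields several candidate realities for one appearance.

```python
WIDTH = 13  # the puzzle's string of 13 positions

def render(bodies, width=WIDTH):
    """Appearance of a scene: the frontmost body's label at each position.

    bodies: list of (label, start, length, depth); smaller depth is nearer
    the viewer. Positions covered by no body show '.'.
    """
    view = ["."] * width
    nearest = [None] * width
    for label, start, length, depth in bodies:
        for p in range(start, min(start + length, width)):
            if nearest[p] is None or depth < nearest[p]:
                nearest[p] = depth
                view[p] = label
    return "".join(view)

def bodies_behind(front, label, target, width=WIDTH):
    """All (start, length) for one body behind `front` that reproduce `target`."""
    found = []
    for start in range(width):
        for length in range(1, width - start + 1):
            if render([front, (label, start, length, 1)], width) == target:
                found.append((start, length))
    return found

front = ("A", 3, 4, 0)
reality_1 = [front, ("B", 5, 5, 1)]   # B is partly hidden behind A
reality_2 = [front, ("B", 4, 6, 1)]   # a longer B, hidden further

print(render(reality_1))
print(render(reality_2))
print(bodies_behind(front, "B", render(reality_1)))
```

Both realities produce the same appearance, and the search finds every length of B compatible with it. Narrowing these down takes an experiment, such as moving a body and observing the new appearance.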


Here's the formula for the effect of counter-clockwise motion.

The last parts of the last two formulas tell what doesn't change.


A point of view common (and maybe dominant) in the machine learning community is that the computer should solve the problem from scratch, e.g. inventing body and behind as needed. It is not dominant in the computer vision community.

Our opinion, and that of the knowledge representation community, is that it is better to provide computer programs with common sense concepts, suitably formalized. There is some success, but the formalisms tend to be limited in the contexts in which they apply. I think, but won't argue here, that formalizing context itself is a necessary step. Here are two sample formulas relevant to the present problem but perhaps not general enough to be put in a knowledge base of common sense.



Solving the puzzle involves inferring formulas like

We haven't put in effects of actions and some relations among the predicates.

The lengths and colors of the bodies are assumed not to depend on the situation. Human language tolerates elaborations such as actions that affect color better than do present AI formalisms.

The ideas of the last two slides about what knowledge should be given to the program have benefitted from discussions with Stephen Muggleton and Ramon Otero.


The most obvious example is a tune. Maybe jokes, especially practical jokes, are another example.
