Commonsense reasoning is a central part of intelligent behavior. In contrast to expert knowledge,
which is usually explicit, most commonsense knowledge is implicit. One of the prerequisites to
developing commonsense systems is making this knowledge explicit. The goal of the formal
commonsense reasoning community is to encode this knowledge using formal logic.
Formalizing the commonsense knowledge needed for even simple reasoning problems is a huge undertaking.
For this reason, researchers often study small toy problems, such as planning in the blocks
world domain. Because such toy problems can gloss over some of the more interesting research issues,
there has been a recent trend toward working on more realistic challenge
problems. This page contains a collection of these challenge problems, solutions
to some of these problems, and some other useful links.
The Common Sense Problem Page was originally created by Rob Miller.
It is currently maintained by Leora Morgenstern.
Please send email to leora@steam.stanford.edu if you would like to
contribute additional problems, solutions, or suggestions.
The Commonsense Symposia of 1991, 1993, and 1996 did not have websites associated with them. We are currently working on creating websites using existing proceedings and other sources.
Many of these problems are listed together with a number
of variants.
An acceptable solution to a problem should
Note that the categorization below is approximate. Many problems could be placed in
more than one category.
Useful Links
This comprises two scenarios, the zoo and traffic worlds.
Challenge Problems for Commonsense Reasoning
Some challenge commonsense problems are listed below. The full text for these problems
can be found on this page; you can either scroll down the page or click on the
highlighted links. A link to a printable version (a web page containing only that problem)
can be found at the beginning of the full text version of the problem.
Solving one of these challenge problems should result in the discovery of new representational
issues and problems that would not appear in an artificially small toy problem. If one encounters
no difficulty along the way, one should be suspicious of the adequacy of the solution.
Indeed, many of these problems are more difficult than they appear at first glance.
Ernest Davis, who contributed
many of these problems, has suggested that very few of these problems are currently solvable
without considerable simplification.
Two problems that he believes are solvable are The Surprise Birthday Present
and the first half of Wolves and Rabbits.
Planning
Physical Reasoning
Spatial Reasoning
Naive Psychology
Understanding Language
Miscellaneous
Full Text of Common Sense Challenge Problems
Alice and Bob want to surprise their sister Carol with a joint present for her birthday, two weeks from now. They therefore go into a closed room to decide on the present and to plan how they will buy it.
Variants:
The plan will not work:
Contributed by Ernie Davis (davise@cs.nyu.edu), New York University, U.S.A. (29th May 2001)
A small girl is walking through a forest to visit her grandmother, and she passes a bush behind which a Wolf is hiding, planning to pounce out and eat her. Just as she gets close, however, the Wolf hears the singing of the woodcutters as they start work nearby. The Wolf therefore decides to stay hidden and not pounce on the little girl after all. The problem is to explain why the Wolf decides to stay behind the bush.
Contributed by
Pat Hayes (
phayes@ai.uwf.edu), Institute for the Interdisciplinary
Study of Human and Machine Cognition, University of West Florida, U.S.A.
and
Lokendra Shastri
(shastri@icsi.berkeley.edu),
International Computer Science Institute, Berkeley, California, U.S.A.
(9th July, 1997)
The combination of a safe consists of 3 numbers between 1 and 50, with a tolerance of plus/minus two. No one knows what the combination is. Infer (a) that it will not be possible to open the safe using the combination within 5 minutes (unless you are very lucky); (b) that it will be possible in a couple of days' work.
Contributed by Ernie Davis (davise@cs.nyu.edu), New York University, U.S.A. (18th September 1997)
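The arithmetic behind the two inferences in the safe problem can be sketched as follows. This is only an illustrative calculation, not a formalization of the problem: the one-minute-per-attempt figure is an assumption that does not appear in the problem statement, and the function names are hypothetical. The sketch also assumes that a trial value opens a dial whenever it lies within two of the true number, so a single trial value covers five adjacent numbers.

```python
import math

DIAL_MAX = 50          # numbers run from 1 to 50 (from the problem)
TOLERANCE = 2          # plus/minus two (from the problem)
DIALS = 3              # three numbers in the combination (from the problem)
SECONDS_PER_TRY = 60   # ASSUMPTION: about one minute per attempt


def distinct_settings_per_dial(dial_max=DIAL_MAX, tolerance=TOLERANCE):
    """Number of trial values needed to cover every number on one dial."""
    # One trial value covers 2 * tolerance + 1 adjacent numbers.
    return math.ceil(dial_max / (2 * tolerance + 1))


def total_tries(dials=DIALS):
    """Attempts needed to exhaust every distinguishable combination."""
    return distinct_settings_per_dial() ** dials


def chance_within(seconds, seconds_per_try=SECONDS_PER_TRY):
    """Probability of opening the safe by systematic search in the given time."""
    tries = min(seconds // seconds_per_try, total_tries())
    return tries / total_tries()


def worst_case_hours(seconds_per_try=SECONDS_PER_TRY):
    """Time needed to try every distinguishable combination, in hours."""
    return total_tries() * seconds_per_try / 3600


if __name__ == "__main__":
    print("distinguishable combinations:", total_tries())
    print("chance of success in 5 minutes:", chance_within(5 * 60))
    print("worst-case hours of work:", worst_case_hours())
```

Under these assumptions there are 1000 distinguishable combinations, so five minutes allows only a handful of attempts, while an exhaustive search takes on the order of two working days. A commonsense solution would still have to justify the assumptions themselves, for example why an attempt takes about a minute.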
It is morning and you recall that you have to take out the garbage tonight. You are afraid that tonight it will slip your mind. Infer that you would do well to write a memo reminding yourself and attach it in some place where you are sure to look tonight.
Contributed by Ernie Davis (davise@cs.nyu.edu), New York University, U.S.A. (19th September 1997)
It is 9:00 PM, you are very tired, and you are settling down in a comfortable armchair with a book of conference proceedings. You are supposed to call your mother at 9:30. Infer that you would do well to set the alarm on your watch for 9:25.
Contributed by Ernie Davis (davise@cs.nyu.edu), New York University, U.S.A. (19th September 1997)
A humanoid robot is flying economy class on a major airline and is required to "eat" the packaged meal that has been served to it. Like its fellow human travellers, the robot can be assumed to be in a standard seat and to have two arms which function similarly to theirs, with similar restrictions on mobility; e.g. because of the cramped conditions, the robot's elbows have to remain close to its chest. In front of the robot is the familiar small table, occupied almost entirely by the tray containing the meal, neatly packaged in little plastic containers with transparent lids, along with a small plastic cup containing a foil-sealed tub of water, and a cellophane envelope containing a set of plastic cutlery, napkin, condiments, etc. For simplicity, assume that eating can be taken to consist of manipulating the food and drink to the robot's mouth, where the utensils are emptied at typical human diner rate. The robot is conventional in eating habits; thus, it tries at all times to not spill, to use the appropriate utensils, and to obey conventions as to when it is permissible to eat with its fingers (chicken, no; asparagus, yes). Moreover, it begins its meal with the starter, follows this with the main course (along with the mini roll which it has spread with butter), then it has dessert, and finally the cheese and biscuits. To complicate matters, the robot drinks water at various stages of the meal. Everything must be kept on the tray or table, including the packaging for the plastic cutlery, the tops of containers, and the containers and their contents. So, like its human companions, the robot quickly becomes involved in an elaborate Chess game, continually manoeuvring the containers so that the chosen one is in position.
The problem is to formalise some aspect of this problem: e.g. the problem of food manipulation, or of planning how to eat the next part of the meal. (For example, consider the situation if the only way to tear through the plastic wrapping is with a sharp object such as a key, and the robot's keys are either in the back pocket of its trousers, or in its purse on the floor.) Initially this might be done at a fairly abstract level. However, the eventual aim is an epistemologically adequate formalisation. Toward this end, formalising the robot's mental life is interesting. For example, the robot's beliefs, desires, and preferences might lead it to try to eat the portion of processed cheese, and this goal might persist until the robot realises that it cannot open the cheese, or assuming that it manages to do so, until it decides that the cheese tastes worse than it looks. Those interested in multi-agent systems might care to formalise the arrival of coffee, served by a member of the cabin crew, usually on a small tray held over the no-man's land of the adjacent seat, where the robot helps itself to milk and sugar.
Contributed by John Bell (jb@dcs.qmw.ac.uk), Queen Mary and Westfield College, London, U.K. (9th June, 1998)
Characterize the following physical operation:
A gardener who has valuable plants with long delicate stems protects them against the wind by staking them; that is, by plunging a stake into the ground near them and attaching the plants to the stake with string.
Variants:
What would happen: If the stake is only placed upright on the ground,
not stuck into the ground? If the string were attached only to the plant, not
to the stake? To the stake, but not to the plant? If the plant is
growing out of rock? Or in water? If, instead of string, you use a
rubber band? Or a wire twist-tie? Or a light chain? Or a metal ring?
Or a cobweb? If instead of tying the ends of the string, you twist them
together? Or glue them? Or place them side by side? If you use a large rock rather
than a stake? If the string is very much longer, or very much shorter,
than the distance from the stake to the plant? If the distance from the
stake to the plant is large as compared to the height of the plant?
If the stake is also made out of string? Trees are sometimes blown
over in heavy storms; can they be staked against this?
Contributed by Ernie Davis (davise@cs.nyu.edu), New York University, U.S.A. (18th September 1997)
Characterize the following:
A cook is cracking a raw egg against a glass bowl. Properly performed, the impact of the egg against the edge of the bowl will crack the eggshell in half. Holding the egg over the bowl, the cook will then separate the two halves of the shell with his fingers, enlarging the crack, and the contents of the egg will fall gently into the bowl. The end result is that the entire contents of the egg will be in the bowl, with the yolk unbroken, and that the two halves of the shell are held in the cook's fingers.
Variants:
What happens if: The cook brings the egg to impact very quickly?
Very slowly? The cook lays the egg in the bowl and exerts steady pressure
with his hand? The cook, having cracked the egg, attempts to peel it
off its contents like a hard-boiled egg? The bowl is made of looseleaf
paper? of soft clay? The bowl is smaller than the egg? The bowl is
upside down? The cook tries this procedure with a hard-boiled egg?
With a coconut? With an M & M?
Contributed by
Ernie Davis (
davise@cs.nyu.edu), New York University,
U.S.A. (18th September 1997)
When baking cookies, after you prepare the cookie dough, you lightly
spread flour over a large flat surface; then roll out the dough on
the surface with a rolling pin; then cut out cookie shapes with a
cookie cutter; then place the cut-out cookies separately on a
cookie sheet and bake.
Variants:
What happens if the surface is covered with sand? Or covered with
sandpaper? If the rolling pin has bumps? or cavities? or is square?
If the cookie cutter does not fit within the dough? What happens if you use
the rolling pin just in the middle of the dough and leave the edges
alone? If, rather than roll, you pick up the rolling pin and press it
down into the dough in various spots? Ordinarily the cutting part of
the cookie cutter is a thin vertical wall above a simple closed curve
in the plane; suppose it is not thin? or not vertical? or not closed?
or a multiple curve? If the cuts with the cutter overlap one another?
Does the dough end up thinner or thicker if you exert more force on
the rolling pin? If you roll it out more times? If you roll the
pin faster or slower? Do you get more or fewer cookies if the dough
is rolled thinner? If a larger cookie cutter is used? If there is
more dough? If the cuts with the cutter are spread further apart?
What is the point of placing waxed paper on the surface?
What happens if the above procedure is
tried with a recipe for drop cookies? bar cookies? refrigerator cookies?
Contributed by
Characterize the following:
The following experiment can be used to estimate absolute zero using
household objects. Prepare a pot of boiling water and a pot of ice
water. Take a graduated baby bottle and hold it (using tongs) in the
boiling water. After a few minutes, when it has stopped bubbling, remove
it and plunge it rapidly into the ice water. Water will then stream into
the baby bottle through the nipple, as the gas contracts. (Actually,
the nipple collapses: to allow the flow of water, you have to
manipulate the nipple.) When the flow of water stops, the volume of
the water that has entered the bottle may be measured by holding the bottle
right-side up; the final volume of the gas at 0 degrees C may be measured
by holding the bottle upside down. The initial volume of the gas at 100
degrees C is the sum of the final volume of the gas plus the volume of
the water. By doing a linear extrapolation between these two values to
the point where the volume of the gas would be zero, one can find the
value of absolute zero.
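The linear extrapolation in the final step of the experiment can be written out as a short sketch. The function name and the sample measurements are hypothetical, chosen only for illustration; the calculation simply finds where the straight line through the two measured points (0 degrees C, final gas volume) and (100 degrees C, initial gas volume) reaches zero volume.

```python
def estimate_absolute_zero(volume_at_0c, volume_at_100c):
    """Extrapolate the line through (0 C, V0) and (100 C, V100) to V = 0.

    Returns the temperature, in degrees C, at which the gas volume
    would reach zero if volume varied linearly with temperature.
    """
    slope = (volume_at_100c - volume_at_0c) / 100.0  # volume change per degree C
    if slope <= 0:
        raise ValueError("gas volume must be larger at 100 C than at 0 C")
    return -volume_at_0c / slope


if __name__ == "__main__":
    # HYPOTHETICAL readings from the graduated bottle, in millilitres.
    final_gas_ml = 183.0                     # gas volume at 0 C (bottle upside down)
    water_ml = 67.0                          # water that entered (bottle right-side up)
    initial_gas_ml = final_gas_ml + water_ml  # gas volume at 100 C, as in the text
    print(estimate_absolute_zero(final_gas_ml, initial_gas_ml))
```

With these made-up readings the estimate comes out at roughly -273 degrees C. The commonsense challenge lies less in this arithmetic than in explaining why each step of the procedure yields the two volumes that the extrapolation needs.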
Variants:
In connection with
Ernie Davis's 'Absolute Zero' problem,
here's an experiment with a very counter-intuitive outcome.
The apparatus is two containers, with one fitting loosely into the other.
Hot water is placed into the inner one, and iced water into the outer one,
forming a cooling jacket. The experiment measures how long it takes for the
water in the inner container to cool to room temperature. If the initial
temperature of the hot (inner) water is very high, it cools to room
temperature in less time than if its initial temperature is lower.
Why is this surprising? More difficult, and maybe outside 'common sense,'
what explanation could be given for it?
Contributed by
Pat Hayes (
phayes@ai.uwf.edu), Institute for the Interdisciplinary
Study of Human and Machine Cognition, University of West Florida, U.S.A.
(19th September 1997)
[printable version]
Consider dropping the following objects on the floor from a height
of five feet: (1) a chalk eraser; (2) a raw egg; (3) fine glassware;
(4) a lump of clay; (5) a feather; (6) a flat piece of paper;
(7) a crumpled piece of paper. Develop a theory that connects
the final state of these being dropped and their behavior while
falling to their other material properties.
Contributed by
Ernie Davis (
davise@cs.nyu.edu), New York University,
U.S.A.
(18th September 1997)
Formally characterize the structure of a linked chain, and
infer (a) that pulling on one end will cause the whole chain
to follow; (b) that the chain is very flexible; (c) that cutting one
link will give two shorter chains and that linking two chains together
end to end gives a longer chain.
Contributed by
Ernie Davis (
davise@cs.nyu.edu), New York University,
U.S.A.
(18th September 1997)
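As a very rough illustration of inference (c) in the linked-chain problem, a chain can be modelled as an ordered sequence of links. This is a deliberately simplified sketch, not a proposed solution: the `Chain` class and its methods are hypothetical, the model ignores geometry and physics entirely, and it says nothing about inferences (a) and (b), which require reasoning about forces and flexibility.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chain:
    """A chain as an ordered tuple of link identifiers, end to end."""
    links: tuple

    def __len__(self):
        return len(self.links)

    def cut(self, index):
        """Cut (and discard) the link at `index`, returning the two remaining chains."""
        if not 0 < index < len(self.links) - 1:
            raise ValueError("cut an interior link to get two non-empty chains")
        return Chain(self.links[:index]), Chain(self.links[index + 1:])

    def join(self, other):
        """Link the last link of this chain to the first link of `other`."""
        return Chain(self.links + other.links)


if __name__ == "__main__":
    chain = Chain(tuple(range(7)))
    left, right = chain.cut(3)
    print(len(left), len(right))   # both pieces are shorter than the original
    print(len(left.join(right)))   # joining them gives a longer chain than either
```

Even this toy model makes one assumption explicit: cutting destroys a link, so the two pieces together contain one link fewer than the original chain.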
It is necessary to walk several hundred yards in rain. Explain why if the
rain is moderate then one should run, but not if one has an umbrella;
but if the rain is very heavy then running is of no use unless one has an
umbrella, and even then it is best to hurry; and if there is also a strong
wind one is likely to get more wet than if not, even with an umbrella.
Contributed by
Pat Hayes (
phayes@ai.uwf.edu), Institute for the Interdisciplinary
Study of Human and Machine Cognition, University of West Florida, U.S.A.
(13th November 1997)
Give a general purpose characterisation of what constitutes a handle,
in the ordinary sense of door-handle or drawer-handle, which is sufficient
to enable one to infer from a qualitative description of the shape of a
part of an object whether or not it can be a handle for that object.
In particular, it should be possible to infer that a blunt conical
projection cannot be a handle, but an inverted conical projection can
be; that a simple rectangular projection can be a drawer handle, but not
a suitable handle for lifting a heavy object; that a piece of rope
attached at one end can be a door handle; and that a hooked or u-shaped
projection, or a rope fastened at both ends, can be a handle for almost
anything.
Contributed by
Pat Hayes (
phayes@ai.uwf.edu), Institute for the Interdisciplinary
Study of Human and Machine Cognition, University of West Florida, U.S.A.
(2nd October 1997)
Sam got straight C's in high school math and has not thought for a moment
about math in the 20 years since. Infer that Sam is not the person
to ask about a calculus problem.
Contributed by
Ernie Davis (
davise@cs.nyu.edu), New York University,
U.S.A.
(18th September 1997)
You are riding Black Beauty (a well-trained horse) in the dark,
and you come to a bridge that he has often crossed before. Black Beauty
absolutely refuses to set foot on the bridge. Infer that something
may be wrong with the bridge.
Contributed by
Ernie Davis (
davise@cs.nyu.edu), New York University,
U.S.A.
(18th September 1997)
As an addendum to
Ernie Davis's 'Trusting the Horse'
problem, here's a real incident from my own youth. Black Beauty
(in this version, a mare),
while walking, develops a limp and is reluctant to step on her front
leg, holding that hoof slightly above the ground. Her driver, who claims
to be an expert on horses, tries to force her to lift her front
right leg, on the grounds that a horse placing too much weight
on a hoof is often a sign of a problem in that hoof.
Infer that the driver is a fool.
Contributed by
Pat Hayes (
phayes@ai.uwf.edu), Institute for the Interdisciplinary
Study of Human and Machine Cognition, University of West Florida, U.S.A.
(19th September 1997)
There are many ways in which the meaning of a two word noun phrase can be
related to the meanings of the individual nouns, and syntax gives little
indication of which applies in any given case. Some such phrases are
purely idiomatic and must be individually learned (e.g. "tag sale,"
"mustard gas") but in most cases a speaker who has never seen
the particular phrase can figure out its meaning from semantic constraints
and commonsense knowledge.
Characterize the commonsense knowledge used in determining that the correct
meaning of the following noun phrases is more plausible than any
of the alternative readings:
Contributed by
Ernie Davis (
davise@cs.nyu.edu), New York University,
U.S.A.
(19th September 1997)
Develop a theory justifying the following:
If you put a half dozen rabbits in a pen and care for them suitably for a period
of a few months, you will generally end up with more than a half dozen rabbits
in the pen. If, however, you fail to feed them, then you will end up with
no (live) rabbits in the pen. If they are all of one sex,
and none of the rabbits is pregnant to start with, you will end up with
no more than a half dozen rabbits no matter how long you wait.
If you put a couple of wolves with a half dozen rabbits in a pen overnight,
then in the morning, you will have two wolves and no rabbits. If, however,
the wolves are chained by a short chain at one end of the pen, you
will probably have as many animals in the morning as you started with.
A metal chain will work for this purpose; a rope is not reliable.
Contributed by
Ernie Davis (
davise@cs.nyu.edu), New York University,
U.S.A.
(19th September 1997)
A says that he witnessed B murdering C. Infer that the evidence
that B actually did murder C is stronger
Drew McDermott
hears about some dastardly deed. What conclusion should be drawn in each
of the following cases:
A. A gun is loaded, then something else happens to it, then the gun is
aimed at Fred (a turkey), and the trigger is pulled.
B. A branding iron is heated red-hot, something else happens to it,
then the iron is placed against some of Fred's feathers.
C. A small earthenware crucible is filled with a volatile poison, then
something else happens to it, then the cup (if it exists) is put to
Fred's beak, and Fred swallows.
The "something else" is one of the following list:
Contributed by
Pat Hayes (
phayes@ai.uwf.edu), Institute for the Interdisciplinary
Study of Human and Machine Cognition, University of West Florida, U.S.A.
(2nd October 1997)
Three solutions to the Egg-Cracking Problem:
Physical Reasoning
Baking Cookies
[printable version]
What happens if: You do not flour the surface? You use too much
flour? You do not roll out the dough, but cut the cookies from the
original mass? You roll out the dough but don't cut it? You
cut the dough but don't separate the pieces?
Leora Morgenstern
(leora@steam.stanford.edu)
,
IBM T.J. Watson Research Center, U.S.A.
and
Ernie Davis (
davise@cs.nyu.edu), New York University,
U.S.A.
(18th September 1997)
Physical Reasoning
Estimating Absolute Zero
[printable version]
What would happen: If the bottle is immersed only very briefly in
the hot water? Or only very briefly in the cold water? If it is laid on
top of the pots of water rather than immersed in them? If the bottle
is left in the outside air a long time between being in the hot
water and being in the ice water? If the bottle has an open end with no nipple?
If the bottle has other holes besides this nipple? If the bottle is opaque?
If you use containers with air at 100 degrees and 0 degrees rather than
water? If the quantity of ice water in the second pot is very small?
very large? or if the quantity of hot water in the first pot is very
small or very large? If the bottle is coated with styrofoam? If the
bottle is not graduated? Why is the following not a reasonable experiment:
"Take a volume of gas in your hands; cool it; see how much it shrinks."
Contributed by
Ernie Davis (
davise@cs.nyu.edu), New York University,
U.S.A.
(18th September 1997)
Physical Reasoning
A Failure of Common Sense --- cooling water to room temperature
[printable version]
Physical Reasoning
Falling Objects
Physical Reasoning
Linked Chains
[printable version]
Physical Reasoning
Singin' in the Rain
[printable version]
Spatial Reasoning
The Handle Problem
[printable version]
Naive Psychology
Sam's Calculus
[printable version]
A solution to Sam's Calculus Problem
John Campbell and Vladimir Lifschitz: Reinforcing a claim in commonsense reasoning,
Common Sense 2003, Logical Formalizations of
Commonsense Reasoning: Papers from 2003 AAAI Spring Symposium, 2003, pp. 51-56.
Note that this solution focuses more on the issue of elaboration tolerance than the
naive psychology aspects of the problem.
Naive Psychology
Trusting the Horse
[printable version]
Naive Psychology
Trusting the Horse II
[printable version]
Understanding Language
The Meaning of Noun Phrases
[printable version]
Miscellaneous
Wolves and Rabbits
[printable version]
Miscellaneous
Strength of Evidence
[printable version]
Contributed by
Ernie Davis (
davise@cs.nyu.edu), New York University,
U.S.A.
(19th September 1997)
Miscellaneous
The Cruel and Unusual Yale Shooting Problem
[printable version]