AlanTuring.net Reference Articles on Turing

What is Artificial Intelligence?

By Jack Copeland

© Copyright B.J. Copeland, May 2000

 

The CYC Project

CYC (the name comes from "encyclopaedia") is the largest experiment yet in symbolic AI. The project began in 1984 at the Microelectronics and Computer Technology Corporation in Texas, under the direction of Douglas Lenat and with an initial budget of U.S.$50 million; it now continues at Cycorp Inc. The goal is to build a knowledge base (KB) containing a significant percentage of the common-sense knowledge of a human being. Lenat hopes that the CYC project will culminate in a KB that can serve as the foundation for future generations of expert systems. His expectation is that when expert systems are equipped with common sense they will achieve an even higher level of performance and be less prone to errors of the sort just mentioned.

By "common sense", AI researchers mean that large corpus of worldly knowledge that human beings use to get along in daily life. A moment's reflection reveals that even the simplest activities and transactions presuppose a mass of trivial-seeming knowledge: to get to a place one should (on the whole) move in its direction; one can pass by an object by moving first towards it and then away from it; one can pull with a string, but not push; pushing something usually affects its position; an object resting on a pushed object usually but not always moves with the pushed object; water flows downhill; city dwellers do not usually go outside undressed; causes generally precede their effects; time constantly passes and future events become past events ... and so on and so on. A computer that is to get along intelligently in the real world must somehow be given access to millions of such facts. Winograd, the creator of SHRDLU, has remarked "It has long been recognised that it is much easier to write a program to carry out abstruse formal operations than to capture the common sense of a dog".

The CYC project involves "hand-coding" many millions of assertions. By the end of the first six years, over one million assertions had been entered manually into the KB. Lenat estimates that it will require some 2 person-centuries of work to increase this figure to the 100 million assertions that he believes are necessary before CYC can begin learning usefully from written material for itself. At any one time as many as 30 people may be logged into CYC, all simultaneously entering data. These knowledge-enterers (or "cyclists") go through newspaper and magazine articles, encyclopaedia entries, advertisements, and so forth, asking themselves what the writer assumed the reader would already know: living things get diseases, the products of a commercial process are more valuable than the inputs, and so on. Lenat describes CYC as "the complement of an encyclopaedia": the primary goal of the project is to encode the knowledge that any person or machine must have before they can begin to understand an encyclopaedia. He has predicted that in the early years of the new millennium, CYC will become "a system with human-level breadth and depth of knowledge".

CYC uses its common-sense knowledge to draw inferences that would defeat simpler systems. For example, CYC can infer "Garcia is wet" from the statement "Garcia is finishing a marathon run", employing its knowledge that running a marathon entails high exertion, that people sweat at high levels of exertion, and that when something sweats it is wet.
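The chain of reasoning in the Garcia example can be illustrated with a toy forward-chaining loop. This is only a minimal sketch: the predicate names, the (predicate, individual) encoding of facts, and the function forward_chain are all hypothetical, invented for illustration, and are not CYC's actual representation language or inference engine.

```python
# Toy forward chaining over (predicate, individual) facts.
# All names here are hypothetical; this is not how CYC itself is implemented.

# Each rule: if every premise predicate holds of an individual,
# then the conclusion predicate holds of that individual too.
RULES = [
    (("finishing_marathon",), "high_exertion"),   # running a marathon entails high exertion
    (("person", "high_exertion"), "sweating"),    # people sweat at high levels of exertion
    (("sweating",), "wet"),                       # when something sweats it is wet
]


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new facts can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        individuals = {individual for _, individual in known}
        for premises, conclusion in rules:
            for individual in individuals:
                holds = all((p, individual) in known for p in premises)
                if holds and (conclusion, individual) not in known:
                    known.add((conclusion, individual))
                    changed = True
    return known


start = {("person", "Garcia"), ("finishing_marathon", "Garcia")}
derived = forward_chain(start, RULES)
print(("wet", "Garcia") in derived)  # True
```

Even in this toy version the conclusion depends on background facts that the original statement never mentions, such as Garcia being a person. Supplying that kind of unstated knowledge is what the hand-coded assertions are for.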

Two fundamental problems with CYC remain outstanding. The first concerns search and problem-solving: for example, how to search the KB automatically for information that is relevant to a given problem (these issues are aspects of the frame problem, described in the section Nouvelle AI). The second concerns knowledge representation: for example, how basic concepts such as those of substance and causation are to be analysed and represented within the KB. Lenat emphasises the importance of large-scale knowledge-entry and is devoting only some 20 percent of the project's effort to the development of mechanisms for searching, updating, reasoning, learning, and analogising. Critics argue that this strategy puts the cart before the horse.
