Recently, many attempts have been made to create AI systems that understand natural language, that is, adequately interpret texts in natural language.
To assess what is possible in this area and how difficult it is, consider the role of language in the human environment from this point of view.
A person begins to use language as a means of communication long after acquiring a significant amount of information about the world and learning many types of actions and activities (tacit, or implicit, knowledge). This shows that the internal representation of knowledge is non-verbal: knowledge and skills are already present at this initial stage, while verbal communication is not.
As a person starts to learn verbal communication, words and then phrases become associated with already accumulated knowledge that is represented in non-verbal form. The role of language at this stage is limited to exchanging information about entities that are familiar to both participants in the communication.
Specialists refer to these two phases together as the holistic education period, since reading is not yet a channel of knowledge acquisition.
Only in the next phase, as a result of learning to read and write, does a person acquire the ability to master new concepts, not previously represented in the brain in non-verbal form, from their textual description; that is, to form an internal non-verbal representation of knowledge from a verbal one.
By this time, the lion's share of all the knowledge a person will ever acquire has already been accumulated, represented non-verbally, and partially associated with verbal structures. Much of this accumulated knowledge remains unassociated with the concepts represented in language, partly because there is no need for it and partly because natural language is inadequate for the purpose.
Natural language developed much later than the non-verbal internal representation of knowledge, as a tool for communication between people who share a largely identical body of non-verbalized knowledge. Linguistic concepts are initially synchronized between different people by non-verbal means: words are linked to demonstrations of objects and actions.
As a result, natural language, developed as an instrument of communication, is poorly suited to representing knowledge. This is the reason for the emergence and use of specialized non-verbal languages: maps, diagrams, drawings, musical notation, mathematical notation, etc.
This also applies to vision, the most informative channel of information flow. Many elementary actions that are easy to teach by demonstration (directly by a teacher or on video) are extremely difficult, and often impossible, to teach using a verbal description. Examples are close at hand: learning to tie shoelaces, braid hair, draw a star, or do yoga exercises from text alone is unrealistic, whereas with a demonstration, video, or pictures, and no verbal description at all, it is easily accomplished. Text or voice instructions are applicable only when they operate with the names of actions and objects that the learner already knows. The most elementary of these everyday objects, activities, and phenomena are learned non-verbally.
Thus, even after complete mastery of the language, the brain contains a large amount of knowledge that is not verbally represented. Interpreting a text actively uses such non-verbal knowledge, both to resolve the ambiguities typical of language and to "logically unfold" implied but not explicitly stated information (context) during logical analysis of the text. The bulk of this verbally unrepresented knowledge belongs to what is usually called "common sense", which is an essential element of the context in the analysis of phrases.
Given this, an attempt to achieve text understanding, that is, a rational interpretation of a text in the same way a person performs it, without using the component of human knowledge that is not represented verbally, can fully succeed only within special thematic "niches" (for example, programming or mathematics) where, owing to their specificity, one can do without non-verbalized knowledge.
It is easy to see that, in most cases, the missing non-verbal information cannot be extracted from the available texts: it is simply not there, either because people have no need for such descriptions or because of "technical" difficulties (it is hard to adequately describe smells, tactile sensations, many visual elements, and so on). This means that to teach an AGI system a natural language, it is not enough to study the available texts; it is also necessary to find a way to introduce the knowledge that is represented non-verbally in the human brain.
Such non-verbalized knowledge can either be prepared by humans or acquired by an AI system in the same way as in human society, through practical training in actions in an ordinary human environment. The second option obviously requires the same sensory capabilities that humans have, which have not yet been fully implemented technically. Both options share a common problem: since language is not a universal way of representing knowledge, it is necessary to develop a universal knowledge representation, suitable for storing any kind of knowledge (including non-verbalized knowledge), together with conversion rules between this internal representation and external communicative representations (natural language, formal languages, graphics of various sorts, and so on).
The basic principles of our version of the universal representation of knowledge are described in SEMANTIC STORAGE and subsequent chapters.
SUMMATION
A person accumulates the lion's share of knowledge and skills before mastering the language.
A person's internal representation of knowledge is non-verbal.
The non-verbal component of knowledge is used by a person when interpreting a text.
For an AGI system to understand a text completely, it should use the knowledge that is represented in humans non-verbally.
The knowledge presented in people non-verbally cannot be extracted from existing texts.
An AGI system should use a universal way of representing knowledge, suitable for manipulating knowledge represented in people non-verbally.
Good read, Mykola,
There are too many projects in development that assume the road to general-purpose artificial intelligence runs through text alone. To accomplish anything close to human-level intelligence, text must be grounded.
Brett