AGI: OBJECT DETECTION+IDENTIFICATION VS RECOGNITION
Today's most common approach in AI systems is to search the observed scene for known objects, with the set of such predefined objects formed in preliminary "training." The disadvantages of this approach are the subject of debate in the AI community, so without dwelling on them, we list the advantages of the alternative method described in AGI: STRUCTURING THE OBSERVABLE, which distinguishes between the stages of detection and identification:
An object unknown to the system is detected and tracked, and its description is generated and remembered if necessary, forming a new concept while the system operates; i.e., using the system and learning are not mutually exclusive phases/modes.
A single observation of a previously unknown object is enough to construct its description, provided the observation lasts for some (very short) time interval.
Object detection is based on the explicit separation of objects and environments; as a result, the object's description does not depend on the environment in which the object was observed when the description was generated.
The description of an object (concept) splits the object's properties into two sets of attributes:
(a) Constant attributes characterizing the object itself (dimensions, proportions, coloring, etc.);
(b) Time-varying object state parameters (position, orientation in space, etc.).
This split makes it possible to predict position and orientation, thereby creating a dynamic object model.
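The split into constant attributes and time-varying state can be sketched as a small data structure. This is only an illustration under stated assumptions: the class and field names are hypothetical, orientation is omitted for brevity, and the constant-velocity extrapolation is one simple choice of prediction, not necessarily the one the method uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConstantAttributes:
    """(a) Attributes of the object itself; they do not change over time."""
    dimensions_m: tuple[float, float, float]  # hypothetical: width, height, depth
    coloring: str


@dataclass
class ObjectState:
    """(b) Time-varying state parameters."""
    t: float                                  # time of the observation, seconds
    position: tuple[float, float, float]
    velocity: tuple[float, float, float]      # assumed to be estimated from tracking


@dataclass
class ObjectDescription:
    name: str
    constant: ConstantAttributes
    state: ObjectState

    def predict_position(self, t: float) -> tuple[float, float, float]:
        """Extrapolate the position to time t, assuming constant velocity."""
        dt = t - self.state.t
        return tuple(
            p + v * dt for p, v in zip(self.state.position, self.state.velocity)
        )


ball = ObjectDescription(
    name="ball",
    constant=ConstantAttributes(dimensions_m=(0.2, 0.2, 0.2), coloring="red"),
    state=ObjectState(t=0.0, position=(0.0, 1.0, 0.0), velocity=(2.0, 0.0, 0.5)),
)
predicted = ball.predict_position(1.5)
print(predicted)
```

Keeping the constant attributes in a frozen (immutable) structure mirrors the idea that only the state changes between observations.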
There is no need to train the system in a separate phase that is fundamentally different from the subsequent stage of use; no preparation of training datasets is required, and the dependency of the system's quality on the volume and composition of the training dataset disappears too. Learning can instead take place as the initial stage of system operation in the presence of an instructor, as is the case in driver training, the training of equipment operators, etc.
The collection of objects known to the system is replenished continuously, not only during the optional learning stage.
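The idea that use and learning happen in the same mode of operation can be sketched as a single "identify or learn" step. Everything here is a hypothetical illustration: the function name, the dictionary-based descriptions, and especially the exact-match comparison, which stands in for whatever identification criterion the real system applies.

```python
def identify_or_learn(known: dict[str, dict], observed: dict) -> str:
    """Return the name of a matching known object, or register a new concept."""
    for name, attributes in known.items():
        if attributes == observed:  # deliberately naive: exact attribute match
            return name
    # Unknown object: remember its description under a placeholder name,
    # which an instructor could later replace with a meaningful one.
    new_name = f"object_{len(known) + 1}"
    known[new_name] = observed
    return new_name


known_objects: dict[str, dict] = {}
first = identify_or_learn(known_objects, {"coloring": "red", "shape": "sphere"})
second = identify_or_learn(known_objects, {"coloring": "red", "shape": "sphere"})
third = identify_or_learn(known_objects, {"coloring": "blue", "shape": "cube"})
print(first, second, third)
```

The collection starts empty and grows during ordinary operation; no separate training call exists in this sketch.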
Each known object has an explicit, human-readable description/definition that can be accessed separately, and the list of known objects is accessible as well; i.e., full explainability is provided.
The description of any particular detected object can be detached (exported) and then transferred (imported) into any other instance of the system, which creates the possibility of collective accumulation of experience/knowledge through the regular exchange of descriptions of previously unknown objects.
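The export/import exchange between instances can be sketched with plain JSON serialization. The function names, the JSON format, and the rule "keep the local description if the name is already known" are assumptions made for this illustration, not details given by the method.

```python
import json


def export_description(description: dict) -> str:
    """Serialize one object description to a JSON string for transfer."""
    return json.dumps(description, sort_keys=True)


def import_description(known_objects: dict, payload: str) -> bool:
    """Add a received description to this instance; return True if it was new."""
    description = json.loads(payload)
    name = description["name"]
    if name in known_objects:
        return False  # already known here; the local description is kept
    known_objects[name] = description
    return True


# Instance A learned a new object during operation (hypothetical content).
instance_a = {
    "cone": {"name": "cone", "constant": {"height_m": 0.7, "coloring": "orange"}}
}
instance_b: dict = {}

payload = export_description(instance_a["cone"])
was_new = import_description(instance_b, payload)
print(was_new, sorted(instance_b))
```

Because the exchanged payload is ordinary text, the same description that is transferred between instances also remains readable by a human.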
The approach makes it possible to "assemble" a set of objects into a new, complex one; this means the natural formation of a hierarchical system of concepts (descriptions/definitions of objects).
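Assembling objects into a more complex one can be sketched as a tree of concepts, where each concept may list other concepts as its parts. The `Concept` class and the example objects are hypothetical and serve only to show how a hierarchy arises from composition.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    parts: list["Concept"] = field(default_factory=list)

    def depth(self) -> int:
        """Number of levels in the hierarchy rooted at this concept."""
        return 1 + max((part.depth() for part in self.parts), default=0)

    def flatten(self) -> list[str]:
        """Names of this concept and all nested parts, in depth-first order."""
        names = [self.name]
        for part in self.parts:
            names.extend(part.flatten())
        return names


wheel = Concept("wheel")
frame = Concept("frame")
bicycle = Concept("bicycle", parts=[wheel, wheel, frame])
rider_on_bicycle = Concept("rider_on_bicycle", parts=[Concept("rider"), bicycle])

print(rider_on_bicycle.depth())
print(rider_on_bicycle.flatten())
```

A simple concept such as `wheel` is reused as a part without being redefined, so higher-level concepts are built from descriptions the system already holds.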
It is possible to detect and identify objects that move behind obstacles which alternately hide one part of the object, then another.
In later chapters, we will take a closer look at the benefits listed above. They enable the generation of new knowledge, and not just the use of knowledge acquired by a human or another AI system and assimilated through knowledge transfer at the training stage/phase. This makes it possible to create truly intelligent systems.