Running the logical processes of an AGI system in parallel naturally implies shared data that some processes create or update and others use. Such systems have been under development since the first operating systems appeared, so one might expect the associated problems to have been solved by now. Alas, they have not.
Parallel processes that work with shared data must coordinate so that no process changes data while another is using it. Methods for regulating this interaction have long been known: C++ and other programming languages provide mutexes and their derivatives (in C++, the lock wrappers lock_guard, shared_lock, and unique_lock) as part of their standard libraries. The principle of operation looks simple: if data is reserved for modification by one thread, the others must wait. Waiting obviously slows the system down somewhat, which is unwelcome but expected and understandable. However, this is just the tip of the iceberg: in multiprocessor systems with many parallel threads and much shared data, the slowdown can become substantial (livelock or starvation), or the system as a whole can stop. The latter is a deadlock: every thread is waiting for something, and none makes progress.
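To make the waiting and deadlock behavior described above concrete, here is a minimal C++17 sketch. The Account type and the function names are hypothetical, introduced only for this illustration. std::scoped_lock is a standard-library wrapper not mentioned in the text; it is used here because it acquires several mutexes with a deadlock-avoidance algorithm.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical shared record; each instance is guarded by its own mutex.
struct Account {
    std::mutex m;
    long balance = 0;
};

// Moves `amount` from one account to the other.
// std::scoped_lock acquires both mutexes without risking deadlock, even when
// another thread requests the same two mutexes in the opposite order.
void transfer(Account& from, Account& to, long amount) {
    assert(&from != &to);  // locking the same mutex twice would be undefined behavior
    std::scoped_lock lock(from.m, to.m);
    from.balance -= amount;
    to.balance += amount;
}

// Runs two threads that transfer in opposite directions and returns the
// combined balance, which must be unchanged if the locking is correct.
long run_opposite_transfers(int iterations) {
    Account a, b;
    a.balance = 1000;
    b.balance = 1000;
    std::thread t1([&] { for (int i = 0; i < iterations; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < iterations; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance + b.balance;  // safe to read: both threads have been joined
}
```

If transfer instead locked from.m and then to.m one after the other, the two threads could each hold one mutex and wait indefinitely for the other, which is exactly the deadlock scenario described above. Whether that happens on a given run depends on timing.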
This is aggravated by the fact that even in relatively simple systems, the cause of such problems is difficult to detect and eliminate. First, these situations usually arise at random times: sometimes they occur, and sometimes they do not, and anything that cannot be reproduced deterministically is extremely hard to track down. Second, using any debugging tool changes the timing and thus the program's behavior. The unpleasant situation often "disappears" under the debugger but, of course, occurs again in normal operation. Even printing debug information can have this effect, since the print call delays the thread that issues it. Lastly, eliminating the problem once does not mean it will not reappear tomorrow, after the code of one of the parallel logical processes is modified or a new logical process, and with it one or more threads, is added.
There is one more problem: the C++ standard does not specify precisely how the coordination facilities (mutexes and their derivatives) must behave. The most advanced variant is expected to provide fairly sophisticated behavior: multiple threads may hold "read-only" access simultaneously, alternating with exclusive access for modification. This involves denying new "read-only" requests once a modification request has arrived and granting modification access once no "readers" are left. Because the standard does not mandate this policy, a system that functions normally on one platform (a combination of compiler, operating system, and processor type) may encounter problems on another.
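The reader/writer access pattern described above can be sketched with std::shared_mutex (C++17). This is only an illustration under stated assumptions: the SharedValue class and the run_readers_and_writers function are hypothetical names, and the code deliberately does not depend on whether waiting writers or new readers are served first, since that ordering is left to the implementation.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hypothetical shared datum with "read-only" and exclusive access paths.
class SharedValue {
public:
    // Many threads may hold the shared lock at the same time.
    int read() const {
        std::shared_lock<std::shared_mutex> lock(m_);
        return value_;
    }
    // The exclusive lock is granted only when no reader or writer holds the mutex.
    void increment() {
        std::unique_lock<std::shared_mutex> lock(m_);
        ++value_;
    }
private:
    mutable std::shared_mutex m_;
    int value_ = 0;
};

// Starts the given number of writer and reader threads and returns the final value.
int run_readers_and_writers(int writers, int readers, int iterations) {
    assert(writers >= 0 && readers >= 0 && iterations >= 0);
    SharedValue v;
    std::vector<std::thread> threads;
    for (int w = 0; w < writers; ++w)
        threads.emplace_back([&v, iterations] {
            for (int i = 0; i < iterations; ++i) v.increment();
        });
    for (int r = 0; r < readers; ++r)
        threads.emplace_back([&v, iterations] {
            for (int i = 0; i < iterations; ++i) (void)v.read();
        });
    for (auto& t : threads) t.join();
    return v.read();
}
```

The final value is the same on every conforming platform, but how long readers or writers wait for one another is not. Code whose correctness or responsiveness depends on a particular priority policy is where the portability problems described above tend to appear.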
According to the standard, a thread that is refused access is suspended until the requested access is obtained, so the implementation must in practice rely on operating system functions. This introduces further variation in system behavior, caused by differences between operating systems or between settings on the same system. In addition, because application and operating system code may be compiled to exploit processor-specific features, the implementation of mutexes and their derivatives also depends on the type of processor.
Ultimately, the "classic" implementation of parallelization boils down to running each logical process in a separate thread; as a result, the number of threads is many times greater than the number of processors or cores. This means that most threads are in a waiting state at any given time. The operating system scheduler is kept busy activating and deactivating threads, which, first, consumes a certain amount of resources and, second, affects the behavior of the AI/AGI system.
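As a rough way to quantify the imbalance between threads and cores described above, the following sketch compares a number of logical processes with the hardware concurrency reported by the standard library. The function name oversubscription_ratio is hypothetical, and the fallback to one hardware thread is an assumption made for this example.

```cpp
#include <cassert>
#include <thread>

// Returns how many threads would compete for each hardware thread if every
// logical process ran in its own thread.
double oversubscription_ratio(unsigned logical_processes) {
    // hardware_concurrency() is only a hint and may return 0 when unknown.
    unsigned hw = std::thread::hardware_concurrency();
    if (hw == 0) hw = 1;  // assumption for this sketch: fall back to one hardware thread
    assert(hw > 0);       // guards the division below
    return static_cast<double>(logical_processes) / hw;
}
```

A ratio well above 1 means that, at any moment, most of the threads cannot be running and the scheduler must keep switching between them.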
Let's summarize the above:
The interaction of many logical processes creates opportunities for internal logical conflicts.
Such conflicts can lead to significant performance degradation and even to a complete blocking of the system.
The cause of a conflict is difficult to detect, and once it is discovered, there is not always an obvious way to eliminate it.
The presence or absence of problems depends not only on the source code (and therefore the programmer's qualifications) but also on the compiler, multithreading support library, operating system, and processor.