A natural way to increase the performance of AI computing is to use parallel computing wherever possible. This, in turn, requires mechanisms that guarantee the correct use of shared data, i.e., tools for synchronizing access to shared data from several parallel computations, and appropriate software technologies for such synchronization have naturally been developed.
However, there is an aspect that has not yet received due attention from developers of systems that use parallelism. The easiest way to explain it is the example of breakfast at a motel. When we come to breakfast, we have a specific plan of action: warm up toast in the toaster, pour coffee, pour juice, take sausages and jam, and so on. But hardly anyone strictly adheres to a predetermined sequence of these operations: if the toaster is busy, we pour coffee; if the coffee machine is busy, we go get the sausages, and so on. In other words, when a particular shared resource needed for an operation is unavailable, we simply perform another operation that can be done at that moment. Naturally, when there are competitors for the resources, this approach lets us complete the planned set of actions faster than following a rigid, predetermined sequence. It is so natural that we do not even notice that it ultimately saves time, or, in terms of parallel computing, increases performance.
Yet when programming parallel computations, we just as unthinkingly write code in which the sequence of steps is rigidly fixed, even when a different order of steps, driven by the availability of the required shared resources, is logically possible.
The reason for this is not our slow-wittedness (although that cannot be ruled out) but two entirely objective factors. Firstly, the approach with a flexible choice of the current action from several possible ones requires inventing and implementing an algorithm for making that choice, which is naturally harder than writing code with a strict order of actions. Secondly, off-the-shelf high-level library tools for synchronizing access to shared data (lock_guard, futures, etc.) take a straightforward approach: if access to data is required but not currently available, you simply wait until it becomes available.
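For illustration, the conventional blocking style with such tools looks roughly as follows (a minimal sketch; std::mutex and std::lock_guard are standard C++, but the toaster resource and function name are purely hypothetical):

```cpp
#include <mutex>

std::mutex toaster_mutex;   // hypothetical shared resource

void make_toast() {
    // Blocking style: if the toaster is busy, the thread simply waits
    // here until the mutex is released, doing nothing useful meanwhile.
    std::lock_guard<std::mutex> lock(toaster_mutex);
    // ... use the toaster ...
}
```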
We have developed a tool that supports the flexible execution of a set of operations. Each step of a process (which runs in parallel with other, competing processes) is implemented in the code as a function paired with an auxiliary function that specifies the conditions under which the step may be executed, based on which steps have already been completed, i.e., on the current state of the computational process. The process as a whole, naturally, carries variables describing its current state; a sketch of this pairing is given below.
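The following fragment is a minimal sketch of this idea, not the actual interface of our tool; BreakfastState and all names are hypothetical and chosen to match the motel-breakfast example:

```cpp
#include <functional>

// Hypothetical process state: which steps have already been completed.
struct BreakfastState {
    bool toast_done  = false;
    bool coffee_done = false;
    bool juice_done  = false;
};

// A step couples the action itself with an auxiliary function that says
// whether the step may run, given the current state of the process.
struct Step {
    std::function<bool(const BreakfastState&)> ready;  // precondition
    std::function<void(BreakfastState&)>       run;    // the step itself
};
```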
The algorithm for choosing the current step is isolated in a universal control module suitable for any such process. From among the steps eligible for execution at the moment, it picks one at random and executes it if the shared data it requires is available; otherwise it picks another at random. There is no implicit waiting for a specific resource to become available anywhere in the computational process.
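Continuing the sketch above, the control loop might look roughly like this; BreakfastState is reused from the previous fragment, each step carries a pointer to the mutex guarding the shared data it needs, and all names remain illustrative rather than the actual API of our module:

```cpp
#include <algorithm>
#include <functional>
#include <mutex>
#include <random>
#include <vector>

struct GuardedStep {
    std::function<bool(const BreakfastState&)> ready;    // precondition
    std::function<void(BreakfastState&)>       run;      // the step itself
    std::mutex*                                resource; // shared data it needs
    bool                                       done = false;
};

void run_process(std::vector<GuardedStep>& steps, BreakfastState& state) {
    std::mt19937 rng{std::random_device{}()};
    for (;;) {
        // Collect the steps that have not run yet and whose preconditions hold.
        std::vector<GuardedStep*> eligible;
        for (auto& s : steps)
            if (!s.done && s.ready(state)) eligible.push_back(&s);
        if (eligible.empty()) break;            // nothing left to run

        // Try the eligible steps in random order; run the first one whose
        // shared resource can be acquired without waiting. If every resource
        // is busy, the outer loop simply tries again: there is no blocking
        // wait on any particular resource.
        std::shuffle(eligible.begin(), eligible.end(), rng);
        for (auto* s : eligible) {
            std::unique_lock<std::mutex> lk(*s->resource, std::try_to_lock);
            if (lk.owns_lock()) {
                s->run(state);
                s->done = true;
                break;
            }
        }
    }
}
```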
Practice shows that, depending on the situation, the performance gain can reach 50% under high contention for shared data.
The computational load in AI systems that control robots, cars, and drones varies significantly with the current situation; in stressful situations the load rises sharply, and with it the contention for shared data. Using the flexible approach described here mitigates the negative impact of the growing computational load in such situations, especially when the system's reserve of computational capacity is exhausted.
Coding a multi-step parallel computing process is quite routine since the algorithm for choosing steps is encapsulated in a universal control module. If readers want to try this approach, we are ready to share the source code (C++).