PROBABILITY IN DECISION MAKING

To have and have not

Apr 30, 2021

The previous sections on Semantic Storage raised questions from readers about why the probabilistic characteristics of relationships between entities were not mentioned. The main reason is that probabilities, when used, are numeric values, and according to the accepted architecture, numeric data is stored in Data Storage. Separating Data Storage from Semantic Storage is technically convenient because the set of attributes (including the probability) differs from entity to entity. The numerical values change independently of the relationship, as a result of measurements and/or calculations. So using probabilistic characteristics is certainly possible, but they are stored separately, and there is no rational reason to assign a probability value to each relation.

However, since probabilities and degrees of confidence serve in many approaches as a significant or even essential element of decision-making logic, it makes sense to analyze this aspect separately.

Since decision-making is based on accumulated information, it is natural to start the analysis with the structuring of this information. The accumulated information consists mainly of facts of different kinds, primarily the facts that certain events occurred in a specific sequence, that measurements yielded particular values, and so on. In addition, there are facts based on established natural-scientific laws of the macro-world and on mathematical laws (the relationship between speed and acceleration, between the radius of a circle and its area). For such factual information, it makes no sense to talk about probability; if a probability is nevertheless required, the only suitable value is exactly one.

The accumulated numerical data on facts can be processed to obtain statistical data, including the frequency of certain events in certain situations. That is, estimates of probabilities appear along with the aggregation of facts, in other words, when statistical models are formed. The corresponding frequencies can be treated as estimates of probabilities and used in many decision-making algorithms. At the same time, estimates of probability are in essence hypotheses that correspond to reality to a greater or lesser extent: they are not facts, although they are based on facts. For example, suppose data is collected on whether a person was burned by the first sip of liquid from a cup. The total number of incidents and the number of burns are facts, but the estimate of the likelihood of such an event in the future is a hypothesis and may change as data accumulates.

The essential point is that in the macro-world the probabilistic nature of events reflects, in most cases, not a fundamental feature of the environment or the process but the fact that observations from different situations have been collected into one set. In our case (a burn from the first sip), knowing what kind of liquid is being drunk changes the situation significantly. If it is beer or sparkling water, there will be no burns at all. Adding the measured temperature of the liquid to the situation description turns the process into a completely deterministic one.
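The effect of pooling can be illustrated with a small sketch. The observations, the 65 °C burn rule, and the helper name `burn_frequency` are all invented for illustration; none of this is measured data.

```python
from collections import defaultdict

# Hypothetical observations: (liquid, temperature in °C, burned?).
# Assumed toy rule behind the data: a sip burns if and only if the
# liquid is at 65 °C or hotter.
observations = [
    ("tea", 85, True), ("tea", 40, False), ("coffee", 90, True),
    ("coffee", 55, False), ("beer", 6, False), ("water", 10, False),
    ("tea", 70, True), ("beer", 8, False),
]

def burn_frequency(rows):
    """Fraction of observations in which the first sip caused a burn."""
    rows = list(rows)
    return sum(burned for _, _, burned in rows) / len(rows)

# All situations pooled into one set: the outcome looks random.
pooled = burn_frequency(observations)

# Situations classified by liquid: partly random, partly certain.
by_liquid = defaultdict(list)
for row in observations:
    by_liquid[row[0]].append(row)
per_liquid = {liquid: burn_frequency(rows) for liquid, rows in by_liquid.items()}

# Situations classified by measured temperature: fully deterministic.
per_temperature = {
    "hot": burn_frequency(r for r in observations if r[1] >= 65),
    "cold": burn_frequency(r for r in observations if r[1] < 65),
}

print(pooled)           # 0.375
print(per_liquid)
print(per_temperature)  # {'hot': 1.0, 'cold': 0.0}
```

The same eight facts give a frequency of 0.375 when pooled, but only 0 or 1 once the temperature is part of the situation description.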

Thus, the more detailed the description and classification of situations, the less random the consequences are, and the less reasonable it is to use a probabilistic model (probabilistic hypothesis). Theoretically, with a model that considers all the factors influencing the predicted events, the situation becomes completely deterministic. In reality, however, situations regularly occur in which the factors influencing the consequences cannot, in principle, be known (measured, detected). This is the main reason for using a probabilistic approach to decision making. However, the "obvious" conclusion "if the situation is non-deterministic, then it is useful to use a probabilistic approach" is not as logical as it might seem at first glance.

First, a decision is still required when a situation arises that has not occurred before, yet there is no data from which to estimate the probabilities. So the probabilistic approach needs a "spare" decision-making method that does not involve the probabilities of possible consequences.

Secondly, if the number of past cases similar to the current one is small, statistical estimates of probabilities will inevitably be inadequate. A certain minimum amount of statistical data must therefore be set, and it must be reached before switching to probability-based decision-making.
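These two requirements can be sketched as follows. The names `Stats`, `fallback_choice`, and `choose`, the minimum of 30 samples, and the choice of the cautious action as the fallback are assumptions made for illustration; the article does not prescribe any of them.

```python
from dataclasses import dataclass

MIN_SAMPLES = 30       # assumed minimum before estimates are trusted
RISK_THRESHOLD = 0.1   # assumed acceptable frequency of bad outcomes

@dataclass
class Stats:
    """Facts accumulated about one action in one situation."""
    trials: int = 0
    failures: int = 0

    def estimate(self):
        """Frequency-based estimate, or None while data is too scarce."""
        if self.trials < MIN_SAMPLES:
            return None
        return self.failures / self.trials

def fallback_choice():
    """The "spare" method; a placeholder that ignores probabilities."""
    return "cautious"

def choose(risky_stats):
    """Use the probability estimate only when enough data exists."""
    p = risky_stats.estimate()
    if p is None:
        return fallback_choice()
    return "cautious" if p > RISK_THRESHOLD else "risky"

print(choose(Stats()))                          # no data: spare method
print(choose(Stats(trials=5, failures=0)))      # too little data: spare method
print(choose(Stats(trials=100, failures=5)))    # estimate 0.05: risky
print(choose(Stats(trials=100, failures=20)))   # estimate 0.20: cautious
```

The point of the sketch is only that the probability-based branch is reachable solely after the sample minimum is met; everything before that is decided by the spare method.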

The third, no less important, aspect is that a reasonable approach to decision-making assumes that the more detailed the description of the situation, the better the grounds for making an optimal decision. However, the more factors are used to define the situation, the more distinct combinations of factor values there are, and each combination requires its own probability estimates. For example, if there are only eight factors, each of which has one of three possible values, the number of possible situations exceeds six thousand (3^8 = 6561). Cases in which the statistical data for the current situation are insufficient to obtain an adequate probability estimate will then be frequent rather than rare. Accordingly, under real conditions the "spare" decision-making method becomes the frequently used one.
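A short sketch makes the scale of the problem concrete. It assumes, purely for illustration, that situations occur uniformly at random, that 10,000 observations have been collected, and that 30 samples are the minimum for an adequate estimate.

```python
import random
from itertools import product

FACTORS = 8         # number of factors describing a situation
VALUES = 3          # possible values per factor
N_OBS = 10_000      # assumed number of collected observations
MIN_SAMPLES = 30    # assumed minimum for an adequate estimate

# Enumerate every distinct situation: 3 ** 8 = 6561 combinations.
total = sum(1 for _ in product(range(VALUES), repeat=FACTORS))

# Draw observations uniformly over situations (an assumption) and
# count how many times each situation has been seen.
rng = random.Random(0)
counts = {}
for _ in range(N_OBS):
    situation = tuple(rng.randrange(VALUES) for _ in range(FACTORS))
    counts[situation] = counts.get(situation, 0) + 1

well_covered = sum(1 for c in counts.values() if c >= MIN_SAMPLES)

print(f"possible situations: {total}")
print(f"situations seen at least once: {len(counts)}")
print(f"situations with >= {MIN_SAMPLES} samples: {well_covered}")
```

With roughly 1.5 observations per situation on average, essentially no situation reaches the minimum, so nearly every decision would fall to the spare method.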

Finally, there is the last aspect, the most logically complex and least apparent. The use of probability estimates for decision-making is based on the implicit assumption that as experience and relevant statistical data are accumulated, the probability estimates will approach specific actual values, providing an opportunity to make optimal decisions. However, a more thorough analysis leads to the conclusion that this assumption is erroneous. The reason for this is that the source data for assessing the probabilities are the collected statistical data on the consequences of decision-making. The choice of actions depends on the probability estimates, the probability estimates depend on the collected statistics, and the statistics depend on the decisions made; there is a logical loop.

Consider a naive iterative process in which current estimates of probabilities are used to make decisions, and the consequences of each decision update the statistics and, accordingly, the estimates. Such a process is essentially a way of solving a system of equations. If “f” is the decision-making function, “u” is the selected action, “P” is the vector of probability estimates, and “g” is the function that updates the probabilities according to the outcome of executing action “u”, we obtain the following set of relations for a given situation:

u = f( P )
P = g( u )

This is nothing more than a system of equations with unknowns “u” and “P”. The mathematical point is that the iterative process of recalculating the probability estimates will not necessarily bring the estimates closer to the actual values. The reason is that decisions aim to avoid undesirable consequences, which reduces the frequency of choosing the actions that lead to them (down to zero in some cases), so the statistics for these actions are updated less often (or never). As a result, the initial "classification" of actions into desirable and undesirable tends to be preserved, even though it was formed at the initial stage from statistical data with a small number of samples.

The repository https://github.com/mrabchevskiy/probability contains a Python script that demonstrates the above features with numerical simulations. There are two possible actions: a risky one that is more beneficial and a cautious one that is less beneficial but safer. If the estimated probability of the undesirable consequences of the risky action exceeds a predetermined threshold, the less risky action is selected. The first action (when there are no statistics and therefore no estimates of the probability) is chosen randomly; a series of tests is carried out. Results:

Threshold: 0.100
Number of tests: 600
Steps per test: 500
Cautious action:
True probability of the unwanted consequences 0.010;
Range of probability estimation: 0.000 .. 0.026 .
Risky action:
True probability of the unwanted consequences 0.200;
Range of probability estimation: 0.100 .. 1.000 .

As we can see, the estimates of probabilities can radically differ from the actual values, while there are no obvious ways of detecting such a situation.
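The mechanism can be reproduced with a minimal sketch. It is not the script from the repository: the function name `run_test`, the rule of trying the risky action while it has no statistics, the random seed, and the strict comparison with the threshold are assumptions made for illustration, so the numbers it prints need not match the results above.

```python
import random

THRESHOLD = 0.1
TRUE_P = {"cautious": 0.01, "risky": 0.2}   # true failure probabilities
TESTS, STEPS = 600, 500

def run_test(rng):
    """One test: decisions and estimates feed each other (naive loop)."""
    trials = {"cautious": 0, "risky": 0}
    failures = {"cautious": 0, "risky": 0}
    for step in range(STEPS):
        if step == 0:
            # No statistics yet: the first action is chosen at random.
            action = "risky" if rng.random() < 0.5 else "cautious"
        elif trials["risky"] and failures["risky"] / trials["risky"] > THRESHOLD:
            # Risky action looks too dangerous: avoid it from now on,
            # which also stops its statistics from being updated.
            action = "cautious"
        else:
            # Assumption: with no data, or an acceptable estimate,
            # the more beneficial risky action is taken.
            action = "risky"
        trials[action] += 1
        failures[action] += rng.random() < TRUE_P[action]
    # Final estimates for every action that was tried at least once.
    return {a: failures[a] / trials[a] for a in trials if trials[a]}

rng = random.Random(1)
results = [run_test(rng) for _ in range(TESTS)]
risky = [r["risky"] for r in results if "risky" in r]
cautious = [r["cautious"] for r in results if "cautious" in r]

print(f"risky estimates:    {min(risky):.3f} .. {max(risky):.3f}")
print(f"cautious estimates: {min(cautious):.3f} .. {max(cautious):.3f}")
```

In this sketch the estimate for the risky action freezes at whatever value it had when it first crossed the threshold, which may be as high as 1.0 after a single unlucky trial, because the action is never selected again.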

The only reliable way to avoid developing stable non-optimal behavior is to choose actions not on the basis of their estimated consequences but on the basis of the need to accumulate enough statistical data, over a sufficiently large number of decisions, for each specific situation. This eliminates the mutual dependence of decisions and probability estimates and amounts to switching to an experimental study of the situation in order to form an adequate statistical model. Since there are many different situations, and forming an adequate statistical model for each of them is time-consuming, such a self-learning process, which guarantees the development of optimal behavior, becomes complex and resource-hungry. Obviously, in many cases such an approach is undesirable or unacceptable because of the high risks.
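A sketch of such an experimental phase is shown below. The function name `explore_then_decide`, the minimum of 200 samples per action, the step count, and the seed are assumptions chosen for illustration, using the same two hypothetical actions with true failure probabilities 0.01 and 0.2.

```python
import random

THRESHOLD = 0.1
MIN_SAMPLES = 200   # assumed number of trials required per action
TRUE_P = {"cautious": 0.01, "risky": 0.2}

def explore_then_decide(rng, steps=500):
    """Collect data for every action first; use estimates only afterwards."""
    trials = {"cautious": 0, "risky": 0}
    failures = {"cautious": 0, "risky": 0}
    for _ in range(steps):
        under_sampled = [a for a in trials if trials[a] < MIN_SAMPLES]
        if under_sampled:
            # Study phase: the choice depends on data needs only,
            # not on the estimated consequences of the action.
            action = min(under_sampled, key=trials.get)
        elif failures["risky"] / trials["risky"] > THRESHOLD:
            action = "cautious"
        else:
            action = "risky"
        trials[action] += 1
        failures[action] += rng.random() < TRUE_P[action]
    return {a: failures[a] / trials[a] for a in trials}, failures

rng = random.Random(2)
runs = [explore_then_decide(rng) for _ in range(200)]
risky = [estimates["risky"] for estimates, _ in runs]
cost = sum(f["risky"] for _, f in runs) / len(runs)

print(f"risky estimates: {min(risky):.3f} .. {max(risky):.3f}")
print(f"average undesirable outcomes from the risky action: {cost:.1f}")
```

The estimates now cluster around the true value, but every run pays for it with dozens of undesirable outcomes from the risky action, which is exactly the cost that makes this approach unacceptable in many cases.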

This means, at the very least, that approaches based on probability estimates have no a priori superiority over other approaches (and such approaches exist). This is especially so given that in many cases the optimal solution does not depend on the probability estimates at all but is determined only by the nonzero probability of one consequence or another, that is, by the fact that an outcome is possible, regardless of how probable it is. For example, a hungry predator may decide to chase prey regardless of the specific probability of success; the mere possibility of success is sufficient. Likewise, its potential prey decides to run away regardless of the likelihood of success (subjective estimates of which, for an obvious reason, will always be 100%).

Decision-making methods without using probability estimates are the subject of one of the following chapters.

SUMMATION

The described architecture allows the use of a probabilistic approach to decision making.

It is irrational to endow each relation with a probability attribute since, for a significant fraction of them, the probability is equal to one.

The use of decision-making algorithms based on estimates of probability also requires an alternative "spare" method.

The naive option of refining the probability estimates as these estimates are used for decision-making can lead to the development of stable estimates that are significantly different from the actual values, leading to non-optimal behavior.

The optimal solution does not always depend on the probability of consequences.

There are approaches to decision-making that do not use estimates of the likelihood of consequences.

AGI engineering
