History Partitioning To Determine Probability Always Is Possible, And Always Is Wrong (This Time Never Is Different)

Consider as motivation for the discussion below: someone comes to you in the late 1980s, as the Soviet bloc is breaking up, and says the end of history is here and there will be no more great evil empires. The justification: the recent tragedies, along with the wide availability of history thanks to a free press, will motivate people to take the actions that avoid repeating the tragic past.

That was not what happened. Hence we can see that the argument is fallacious, but the technique involved is critical to answering people who want to say “this time is different”.

Right off there is a problem: history always grows and the situation always changes in some way – so you could always argue that previous prediction techniques are invalid – but back-testing this line of thinking fails miserably.

For physical quantities, we know that if a certain situation is set, the results will always be the same. If you build a series of upstream dams and don’t maintain them, you will have catastrophic failures. Moreover, even if the physical situation changes – the geographic region, aspects of the weather, the type of mining – building the same weak structures produces the same outcomes. Hence the concept of partitioning history into sections to improve accuracy does not make sense for 100% correlations/physical situations.

For human quantities, first we must recall how human behavior partitions: there are routines, in which behavior is almost perfectly predictable; then situations in which certain individuals’ behavior is highly predictable given sophisticated analysis; and then a series of situations where behavior becomes harder and harder to predict, until eventually it becomes effectively random. Where human behavior is routine, asserting that this time is different would be wrong almost 100% of the time. In the situations that are highly predictable with sophisticated analysis, the assertion would again be wrong almost 100% of the time. Only when human reaction becomes genuinely difficult to predict would the assertion have any chance of passing a back-test.
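For a concrete (if toy) back-test of that claim, here is a minimal sketch in Python; the repeat probabilities for the three regimes are assumptions chosen purely for illustration, not measurements of any real behavior:

    # Back-test the blanket claim "the next step will differ from the routine"
    # against regimes of decreasing predictability. The repeat probabilities
    # are illustrative assumptions, not measured values.
    import random

    def different_claim_hit_rate(p_repeat, trials=100_000, seed=0):
        """How often 'this time is different' turns out to be true when
        behavior repeats with probability p_repeat."""
        rng = random.Random(seed)
        hits = sum(rng.random() > p_repeat for _ in range(trials))
        return hits / trials

    for label, p in [("routine", 0.99),
                     ("predictable with analysis", 0.90),
                     ("near random", 0.50)]:
        rate = different_claim_hit_rate(p)
        print(f"{label:>25}: claim of difference is right {rate:.1%} of the time")

Only in the near-random regime does the claim come close to even odds; in the routine and analyzable regimes it fails the back-test almost every time.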

Why do we say that in these situations human behavior is difficult to predict? Structurally, the poor prediction performance is due to measurement issues – but those measurement issues also apply to the near-100% cases. The actual argument is based on frequentist statistics, which, even at their simplest level, predict the future with high accuracy in the near-100% cases. Such descriptive statistics, however, are not of much use once correlation starts to drop.
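A minimal sketch of that statistical point, with assumed numbers: a frequentist rule that simply predicts the most frequently observed outcome for each situation is nearly perfect when the situation-to-outcome correlation is close to 100%, and decays toward coin-flipping as the correlation drops.

    # Accuracy of a simple frequentist (modal-outcome) rule as the
    # situation/outcome correlation drops. All parameters are assumptions.
    import random

    def simulate(rho, n=50_000, seed=1):
        """Binary situation -> binary outcome; the outcome matches the
        situation with probability (1 + rho) / 2, giving correlation rho."""
        rng = random.Random(seed)
        pairs = []
        for _ in range(n):
            situation = rng.randint(0, 1)
            matches = rng.random() < (1 + rho) / 2
            pairs.append((situation, situation if matches else 1 - situation))
        return pairs

    def modal_rule_accuracy(rho):
        pairs = simulate(rho)
        train, test = pairs[: len(pairs) // 2], pairs[len(pairs) // 2 :]
        counts = {0: [0, 0], 1: [0, 0]}          # descriptive statistics only
        for s, o in train:
            counts[s][o] += 1
        rule = {s: max((0, 1), key=lambda o: counts[s][o]) for s in (0, 1)}
        return sum(rule[s] == o for s, o in test) / len(test)

    for rho in (0.99, 0.9, 0.5, 0.1, 0.0):
        print(f"correlation {rho:.2f}: accuracy {modal_rule_accuracy(rho):.1%}")

At a correlation of 0.99 the rule is right roughly 99.5% of the time; at zero correlation it is a coin flip.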

The argument that recent history will change behavior from the preceding epochs asserts that recent history has high predictive value while the previous recent histories did not (because otherwise this time would not be different, it would be the same – unless the situation in question has simply never yet occurred in history). This is argued from the relative uniqueness of the situation, which is the generalization of the argument.

However, every point in history is unique, and for most of human history the future has been significantly different from the past – not merely over the timeframe of a few years. So if it is truly the case that only cases very similar to x-1 predict outcomes (versus, say, a business cycle, where x-3 through x-1 determine the outcome), then the actual outcomes of history in these non-100% cases cannot follow any pattern or have any meaningful, non-volatile average performance. There are two ways to demonstrate the error of this assertion: a straightforward recall of history, and a mathematical argument.
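As a brief aside on the business-cycle parenthetical before turning to those two demonstrations – a toy series with assumed dynamics, not real economic data – the difference between an x-1-only rule and a short-window rule is easy to see on a cyclical process:

    # On a cyclical series the single last value says little about the next
    # move, while a short window of prior values recovers the phase.
    # The series and its parameters are assumptions for illustration only.
    import math
    import random

    rng = random.Random(2)
    series = [math.sin(0.1 * t) + rng.gauss(0, 0.01) for t in range(5000)]

    x1_only_correct = 0   # rule seeing only x-1 (here: always guess "up")
    window_correct = 0    # rule seeing x-2 and x-1: guess the last move continues
    total = 0
    for t in range(2, len(series) - 1):
        went_up = series[t + 1] > series[t]
        x1_only_correct += went_up
        window_correct += went_up == (series[t] > series[t - 1])
        total += 1

    print(f"direction accuracy, x-1 only    : {x1_only_correct / total:.1%}")
    print(f"direction accuracy, x-2 and x-1 : {window_correct / total:.1%}")

The last value alone leaves the phase ambiguous, so any rule built on it hovers near 50%; the two-point window recovers the direction most of the time.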

Consider the matter of inflation and money-printing as one illustration of patterns across distinct situations. As a macroeconomic question with implications for government finance, it invokes non-100% behaviors. Consider the countries undershooting inflation targets of 0-2% in the 21st century: Japan, the 48 states, many countries in Europe. The economic histories and general structures of these economies are fairly different, within the bounds of a roughly free-marketish economic superstructure. The martial policies of each region differ greatly, as does their general culture. Yet not only are they all failing to hit their targets, but in the cases of Japan and the 48 states, large increases in spending and money supply have not drastically changed the situation. Perhaps this is not so deterministic – perhaps developed economies are simply in some sort of trap – so we should consider other countries, ones that print money and experience hyperinflation or other outcomes, to assess whether there are patterns. Despite the historical certainty that printing disproportionate amounts of money will ruin an economy, countries like Zimbabwe, Venezuela and Argentina ran the presses. These three countries, in their respective situations, are not similar, and Weimar Germany differs even further. Even so, the outcomes were the same.

We could spill many electrons arguing the relative similarities and patterns seen in this aspect of economic history. That we have a basis for the argument at all indicates that there are patterns, and that some similarities seem to overcome the many differences between these situations.

As for the math: to assert that only x-1 (and a very small equivalence class in which x-1 sits) really matters, and then to say with high confidence that the equivalence class for x-1 has a certain probability distribution, is a contradiction. The question is how to justify that x-1 has this probability distribution. If, as noted above, the history must be essentially random, then in order to satisfy the no-patterns condition, each x-1 must be picked from the set of all probability distributions viable under the physical situation. Hence you wind up in a situation like Pascal’s wager, where dividing the probability mass across all these candidates erodes confidence in any one of them. However, this is not even the most important point: if x-1 really is being drawn from this set, then it is impossible to predict its corresponding x. Moreover, if over time the x-1s encounter no distributional shift – i.e., you are not picking a series of x-1s that forms a pattern of its own and contradicts the previous hypothesis at some future time – then the actual results really are random, and all you can say is that the decisions appear to be random (or random along the larger historical trend, once adjustments for physical constraints are made), not that they conform to one specific distribution or another.
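A minimal sketch of that situation, with assumed distributions: if the distribution governing each outcome is itself drawn fresh from the set of viable distributions, then a forecaster who commits to one specific distribution does no better – in fact worse, by log-loss – than one who only says the process looks random, and any fixed point prediction lands at chance.

    # Each step's outcome comes from a distribution that is itself drawn at
    # random, so committing to one specific distribution buys nothing over
    # declaring the process random. All numbers here are assumptions.
    import math
    import random

    rng = random.Random(3)
    steps = 100_000

    committed_logloss = 0.0    # forecaster who insists P(outcome = 1) = 0.8
    agnostic_logloss = 0.0     # forecaster who only says "it looks random": P = 0.5
    fixed_guess_correct = 0    # fixed point prediction of 1 every time

    for _ in range(steps):
        bias = rng.random()                     # this step's own distribution
        outcome = 1 if rng.random() < bias else 0
        committed_logloss += -math.log(0.8 if outcome == 1 else 0.2)
        agnostic_logloss += -math.log(0.5)
        fixed_guess_correct += outcome == 1

    print(f"avg log-loss, committed distribution: {committed_logloss / steps:.3f}")
    print(f"avg log-loss, 'it looks random'     : {agnostic_logloss / steps:.3f}")
    print(f"fixed point-prediction accuracy     : {fixed_guess_correct / steps:.1%}")

Under these assumed numbers, asserting the specific distribution is penalized (roughly 0.92 versus 0.69 nats per step) and the point predictions hover at 50% – the best available statement is simply that the outcomes appear random.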

One could postulate the existence of a technique to predict every x from every x-1, even if the distributions as a whole are random. The largest issue is how such a proposed technique would differ from asserting the random hypothesis, given the assumptions. If the x-1 -> x transitions have no relationship to each other, and history is composed of such transitions, then there is no conventionally computable technique that could show a history of prior successful performance: it would functionally amount to a lookup table, which does not systematically generalize to the future, since the future points still have to be computed.
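To make the lookup-table point concrete – a hypothetical toy, with states and outcomes invented purely for illustration – such a “technique” replays the stored past perfectly and has nothing to say about any state it has not already seen:

    # A "predictor" that is only a lookup table of past x-1 -> x transitions:
    # it replays history perfectly and cannot compute anything for the future.
    # The states and outcomes below are invented for illustration.
    history = [("state_1987", "A"), ("state_1991", "C"),
               ("state_2001", "B"), ("state_2008", "C")]

    lookup_table = {prior: outcome for prior, outcome in history}

    def predict(prior):
        # No computation occurs; unseen states simply have no entry.
        return lookup_table.get(prior, "no prediction possible")

    print(predict("state_2008"))   # "C" -> flawless on the stored past
    print(predict("state_2027"))   # "no prediction possible" -> useless ahead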

A variant – not the same as the original argument, but often phrased the same way – is that this time is different specifically because x-1 is a member of a new equivalence class that the previous patterns in history do not cover. The first point: if x-1 really is in a new equivalence class, how do you know it is new? You do not yet know the outcome, and many points in history could also be deemed a new equivalence class under whatever situational distance metric you invoke. Those prior points conform to the patterns you recognize, yet this new one will not, even though it sits at about the same situational distance? The usual response is that some input of the situational distance metric was not previously exercised – typically relating to a new technology or some other non-human development. The follow-up is that, at some point, the same was true of every other x-1 that does conform to the pattern, and yet the patterns held. Moreover, this new x-1 lacks inputs the old ones had, which could just as well have caused them not to conform to the pattern. Hence, if we consider a meta-history of this situational distance metric, there is a long pattern of changed situations not resulting in changes to the overall pattern. Without the evidence from the x that follows this new x-1, there is no measurement that can contradict the overall pattern.
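One way to see the point about the distance metric – a toy with made-up situational coordinates, not a real dataset – is to ask how novel the new x-1 is compared with how novel each past point looked when it first appeared:

    # How novel is the new x-1 under a distance metric, compared with how novel
    # past points looked when they first occurred? (Those past points went on
    # to conform to the pattern.) Coordinates are invented for illustration.
    import math
    import random

    rng = random.Random(4)
    # Each situation is a point in a 3-dimensional "situation space".
    history = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
    new_x1 = [rng.gauss(0, 1) for _ in range(3)]

    def distance(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

    def novelty(point, earlier):
        """Distance to the nearest situation that had already occurred."""
        return min(distance(point, e) for e in earlier)

    past_novelty = [novelty(history[i], history[:i]) for i in range(1, len(history))]
    new_novelty = novelty(new_x1, history)

    at_least_as_novel = sum(d >= new_novelty for d in past_novelty)
    print(f"novelty of the new x-1: {new_novelty:.2f}")
    print(f"{at_least_as_novel} of {len(past_novelty)} past points looked at least "
          "as novel when they appeared - and they conformed to the pattern anyway")

Under these assumptions, distance from everything seen so far does not by itself certify a new equivalence class; only the eventual x can do that.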