I wouldn’t say it is that good in terms of execution (50% overall?), but it is far superior to most other records in certain areas and, most importantly, I will outline below the key improvements that would allow me, or you, to get better results in the future.
In case it isn’t obvious, I recognize the role luck has played up to this point in my being right about these things. As you can see on the other pages of this website, I am attempting to demonstrate how to achieve systematic accuracy, not simple heuristics with some luck.
I tend to be strongest relative to others when it comes to nasty geopolitical issues. Unfortunately, I also tend not to make many predictions, or to formalize them very often – hence you will find only a few below:
1997 and onwards: Whether the PRC (mainland China) meaningfully moves towards liberty in the next 10 years – so far I am a “winner” on this one. Currently, the other countries are applying a form of what I call the “get-lucky strategy”, which, if well executed, uses isolation, covert operations and propaganda, cultural exchange, and other means to inspire the population towards change and to destabilize the regime generally. However, in this situation, the pressure being applied is very light: there are few restrictions on trade, the PRC is able to build its military with few obstacles, and overall the PRC is able to maintain a significant amount of information control. Consequently, the poor implementation of the get-lucky strategy does not greatly increase the odds of regime change. As the PRC tends to operate on a 5-10 year leadership cycle, to get good odds of the get-lucky strategy working, you are looking for 10 percentage points of additional probability of major improvement in leadership per cycle. The only real negative change in the prospects of the current leadership is in information flow, and there the results are not very good (e.g. the Great Firewall of China). As such, it is hard to say that even a 5 percentage point overall improvement per leadership cycle has been achieved by the improved information flow. Meanwhile, the PRC is able to execute its economic and other social control strategies without much interference, counteracting much of the positive effect of this openness through effective execution of economic growth strategies and through the increased effectiveness of legitimized regime propaganda (i.e. the regime can correctly state that the situation is improving and thereby insinuate that political change is not needed).
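The per-cycle figures above compound over several leadership cycles. As a rough sketch, assuming each cycle is an independent trial with the same probability (a strong simplification, and not something the real situation guarantees), the chance of at least one major improvement within n cycles is 1 - (1 - p)^n. The function name and the cycle counts below are hypothetical, chosen only for illustration.

```python
def chance_of_change(per_cycle: float, cycles: int) -> float:
    """Probability of at least one major improvement in `cycles` leadership cycles.

    Assumption: every cycle is independent and has the same probability.
    """
    if not 0.0 <= per_cycle <= 1.0:
        raise ValueError("per_cycle must be a probability between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return 1.0 - (1.0 - per_cycle) ** cycles


if __name__ == "__main__":
    # Compare the 10-point target with the 5-point figure discussed above.
    for per_cycle in (0.10, 0.05):
        for cycles in (1, 2, 4):
            total = chance_of_change(per_cycle, cycles)
            print(f"{per_cycle:.0%} per cycle over {cycles} cycle(s): {total:.1%}")
```

Under these assumptions, 10 points per cycle works out to about 19% over two cycles and about 34% over four, while 5 points per cycle gives only about 10% and about 19% respectively.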
1997 and onwards: Whether the North Korean leadership will respond to the current program of six-party talks, and the carrots and sticks being offered, with a program of meaningful reform (particularly in reducing nuclear weapons proliferation) and a reduction in belligerence. This is, unfortunately, a very clear win for me in predicting that this will have little effect on regime behavior. The six-party strategy has not even given rise to a cause for war or retaliatory actions to deter the North Korean provocations, such as the shelling of South Korean territory. The issue here is that none of the fundamentals have changed; indeed, North Korea’s belligerence, followed by food aid, has become a pattern, a repeatable exploit. Without movement in those fundamentals, useful change remains highly unlikely.
2002: Whether the then-current war plans of the US and its allies will be sufficient to pacify Iraq and lead to a relatively peaceful end-state for Iraq, with some flavor of improved government. As of 2006, the temporary government was a mild improvement on the Saddam Hussein/Baathist regime in some ways, but the security situation was a disaster, moving towards conventional civil war. The troop and other resource allocations clearly proved insufficient to the need. The Americans exerted a lot of effort, but my prediction was based on the fact that resource allocation for the rebuilding was nowhere near the pessimistic military estimates released at the time. Consequently, the effort should have been expected to have a high (at least 20%) chance of failure: pessimistic estimates tend to account for more factors, and for less favorable measurements of unknowns, than do the more optimistic predictions, so a pessimistic estimate covers more possible futures than an optimistic one.
2002: What are the prospects for Afghanistan in the next few years, given that available American forces are being allocated to go to Iraq? Here I predicted no meaningful improvement in the overall fundamentals of the situation, based on the lack of additional resources being committed, and was correct. Unfortunately, I didn’t also try (or don’t remember whether I tried) to assess how many schools, etc. would get built, or how the minor factors in the conflict would be affected. Lazy, I know.
2006: Whether the Iraq “surge” strategy will prove to be enough to achieve something like the original goals of the campaign to nation-build in Iraq. I incorrectly predicted that it would not be enough to change the fundamentals. According to media reports, the combined improvement was due to the commitment of tens of thousands of additional troops, a change in tactics to focus more on counterinsurgency, the effectiveness of our diplomacy and bribes in turning factions of the Sunni opposition against their former allies (i.e. the Awakening movement), and classified programs (which obviously are not suitable for consideration in an Internet post). My mistake was to take only the troop numbers into account, failing to account for the implications of the strategic shifts, particularly the diplomatic efforts. Had I done that, I would have gotten much closer to predicting that the violence would not end, but would continue at a greatly reduced level.
2008: Whether a surge strategy comparable to that implemented in Iraq by America and its allies will achieve a comparable result in the Afghanistan-Pakistan region. Here I said no – it would definitely have positive effects, but the scope (only a few years) and resources (not even as much as in Iraq, despite facing more intractable obstacles) of the effort would not be sufficient to alter the key problematic fundamentals in the region. These stem from what is effectively Pakistani shielding of insurgents across the border, the overall military and economic weakness of the allied governments in the region, the highly unfavorable physical terrain, and the general cultural problems (e.g. the quick resort to violence and lack of trust) that make cooperation between the various tribes troubled and infrequent. With the exception of tribal politics, all of these were also significant factors in the debacle in Vietnam. It is too early to assess the correctness of this prediction; so far there are positive effects, but the end-state is really the key here, and we can’t measure that yet. We know there are many ongoing problems, but those might (though it is not likely) be resolved in the remaining timeline leading to the withdrawal of major Western fighting forces.
2011: Under what conditions will the Libyan insurgency succeed in overthrowing the Gaddafi regime? I assessed the odds of the insurgents prevailing without significant outside intervention at 50-50, and assessed it as highly likely that, with outside intervention, the insurgents would prevail. The first assessment was based on relative parity and the seizure of several cities by insurgents early in the conflict; the improved odds with outside support then build on that relatively favorable situation. As it happened, the insurgents alone would probably have lost the struggle in the end, without the outside intervention, because of Gaddafi’s substantial lead in initial resources, e.g. foreign bank accounts. As for the outside intervention, the NATO forces brought a lot of advanced firepower, so much so that once you knew that was going to happen, it was hard to ever see Gaddafi prevailing, simply because no pro-regime force of any size would ever be able to move openly without getting a bomb dropped on their heads.
2011-2012: What will be the result of reducing medical interns’/residents’ work hours? I assessed that the likely effect would be to improve the quality of patient care, based upon the well-known results of fatigue studies in other, far less intellectually taxing, industries. However, the managers in healthcare generally disagreed with this hypothesis, citing issues around ensuring enough training time and increased patient handovers. Two prominent studies came out in 2013 following up on the 2011 reductions in work hours; one was led by Srijan Sen, the other by Sanjay V. Desai. The two studies came to roughly similar conclusions: the quality of patient care was not really improved. Several factors were believed to be implicated:
– Changes to shift work were not accompanied by re-work of time and motion, leading to inefficiencies.
– Handoffs of patients between providers occurred far more frequently.
– Interns/residents weren’t getting much more sleep; they were spending on the order of 3 hours additional time sleeping, which was nowhere near the amount of time theoretically freed by the regulations.
Based on my own experience with management and how companies account for employees’ time, I also suspect that, where self-reporting was used, there is a possibility of spill-over from work, meaning people worked on their educational expectations off the clock. Furthermore, in my experience of knowing people, they don’t allocate enough time to rest and recovery, particularly when young; they look for opportunities to socialize and have adventures instead of focusing on their primary responsibilities. Viewing work as a chore, they consequently seek out alternatives as a means to do what they think is meant by “living”.
Consequently, my prediction on this question was incorrect. This is a good illustration of two concepts: the difficulty of predicting by judgment or qualitative factors, and the complexity of implementing a change as a predictor of the effectiveness of reform. All of the people involved on both the pro- and anti-reform sides believed that each factor cited was roughly negative, but disagreed as to the relative weights. The implementation of these changes, particularly the end result of very little additional sleep time, meant that these studies did not settle the question of which weight was greater. This illustrates a further difficulty: in this case, what was actually tested was a long-honed system and routine of sleep denial vs. a quickly thrown together system of shift work, when what is most important to know is how the sleep-denial approach compares against a well-structured system (maybe not even shift work as such) where clinician fatigue, as well as patient care factors, are managed in a roughly Pareto-optimal way. Naturally, this involves a series of tradeoffs, which have to be driven by a series of studies. Hence, a quick study probably can’t get us a definitive answer on these points, adding to the difficulty of resolving the question of whether limiting clinician fatigue is worth the effort.
2016: Like everyone else I screwed up and assumed Hillary would prevail over Trump.
—
As for my profession, working in software, my estimates of when things will complete are pretty bad overall. Some of that is just my inexperience, mainly the lack of accurate measurements of typical values. However, a lot of it is also due to the limited amount of time that is spent developing the estimates, particularly in developing alternative scenarios, vs. the nominal plan where there are no major disasters. There is a fundamental truth about almost any intellectual endeavor: you can make anybody stupid by reducing the amount of time they have to crank something out. (Likewise, you can make somebody physically weak by making them crank things out at a high enough rate.) The level of complexity involved in knowledge work is not insurmountable, but because it is knowledge work, where the work is largely in figuring out what needs to be done, estimating it tends to be of the same order of magnitude as actually doing it. For various reasons, I tend to be biased towards work-doing vs. spending more time estimating to get better numbers, and it really shows. It’s not much consolation, but my pessimistic approach tends to yield better results than those of my peers.
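To make the gap between the nominal plan and a scenario-based estimate concrete, here is a minimal sketch. It weights a handful of alternative scenarios by probability and compares the result with the no-disaster number. The scenario list, its probabilities, and its durations are entirely hypothetical, and the function name is an assumption made for this example rather than any established estimating method.

```python
def expected_duration(scenarios: list[tuple[float, float]]) -> float:
    """Probability-weighted duration over (probability, days) scenarios."""
    total_probability = sum(p for p, _ in scenarios)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return sum(p * days for p, days in scenarios)


if __name__ == "__main__":
    # Hypothetical numbers: (probability, working days to finish).
    scenarios = [
        (0.6, 10.0),  # nominal plan, no major disasters
        (0.3, 15.0),  # one significant surprise
        (0.1, 30.0),  # a major redesign is needed
    ]
    nominal = scenarios[0][1]
    expected = expected_duration(scenarios)
    print(f"nominal plan: {nominal:.1f} days")
    print(f"scenario-weighted estimate: {expected:.1f} days")
```

With these made-up inputs the weighted estimate lands well above the nominal plan, which is the direction a pessimistic approach pushes in. The hard part, as noted above, is finding the time to come up with credible scenarios in the first place.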
In my personal life, I wouldn’t say my prediction record is good; a lot of it is simply lack of experience. Some predictions are dead-on, but mostly I have lacked the data necessary to make any really useful prediction. My approach has been to try a lot of new things and, by doing so, to get more information about what will be most effective in my remaining lifespan.