Bounding The Limits Of Appeals To Expertise As A Proxy For Understanding

Conceptually, appealing to the authority of an expert is sound and economically optimal if:

  • Accepting the expert’s advice is quick/cheap
  • Validating that the expert really is an expert is quick/cheap (validating credentials)
    • And you can either consult multiple experts, or do a background check to verify that the expert is trustworthy and is not misrepresenting themselves or acting unethically
  • There is a valid assumption that the expertise correlates highly with accurate advice, so that at minimum you achieve improved accuracy for the time spent listening to the reporting/advice
    • Which also means that the experts must have near-consensus on the topic; otherwise the correlation does not exist
  • You know you can obtain the required accuracy via this technique, as opposed to, e.g., the expert needing to build a custom model or one-off analysis to reach that accuracy
  • The economics (e.g. study time) do not make it viable to gain your own understanding, or the stakes are relatively low, so it also does not make sense to invest in creating a highly formalized framework, akin to a mathematical proof, that is easier to verify than expertise

The key point is that, even before the following analysis, there is an envelope of investment within which accepting expert advice makes sense, and it is a surprisingly narrow one (unlike reporting, where you may have no choice but to rely on your sources, or doing the work yourself, where you have a functional performance parameter). In other words, for any easy matter, any large and societally important question, or any expensive business proposition, it is a mistake to appeal to or rely on expert authority as the solution, because you have better options. After you review the following analysis, you will also realize that in many cases it is effectively futile: only when an expert, or their model, is clearly dominant over others can you actually recognize and make use of their recommendations.

So, assuming that accepting the expert’s advice is no costlier than any other approach, and that the expert’s advice has known acceptable limits, we only need to solve for validating credentials, validating that the expertise leads to accurate advice, and weighing those costs against the cost of gaining the expertise yourself.

Validating credentials: this is usually the easiest part, given the availability of the Internet. You can check the certifying institution, and you can get a quick sanity check via second opinions from other experts. If needed, you can make a few phone calls. Note that this does not hold if there are only a handful of experts; in that case you may need to do a very costly background check and prediction-record search.

Validating that the expertise leads to accurate advice: here there are several fundamental issues, treated in more detail below:

  • That you picked the right type of expertise
  • That you picked a measure of effectiveness (MOE) relevant to the subject of the expert advice, one that would tell you whether the historical record indicates the expert advice is accurate
    • And the prerequisites for picking it: that such an MOE exists and is readily available
  • The meaning of census, population, polling, samples, and similar concepts in assessing expert agreement. In other words, what constitutes a malicious, planted, specious, or corrupt minority body of experts vs. an ostensibly reliable majority, and what thresholds or means are used to determine this.

First, let us consider the matters of picking the domain(s) of expertise and the MOEs. Let us start with the MOEs, as I think that is the more constructive approach; picking the domains abstractly opens up more problems. An MOE is a measurement like any other, so reading a book or other document will allow you, as a layman, to perform that rudimentary data evaluation. In other words, if you have a series of climate measurements (which for simplicity let us deem an easy thing to obtain; it is not), murder cases, revenues, or another such table, you could inspect these and then make your decision…or could you?

Even assuming that the MOE is not itself wrong or misleading, that says nothing about confounders or changes in the relative situation. Gun control is a classic example: putting the lower 48 states (worse still if you consider them individually), Alaska, Japan, Switzerland, and all the other countries for which we have reliable murder data, many of them culturally quite different, into one basket and then attempting to draw some simple analysis out of it is itself an extremely complicated task, and may not even be feasible. Consider the analysis of COVID-19 spread: if densely populated New York City has the bulk of the cases, how could you have confidence in any model that doesn’t project similar results for any other populated area? How do you disentangle patterns of behavior from the intrinsic contagiousness of the pathogen, and then reconcile that with other densely populated cities that didn’t see such a spread?
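
To make the confounding problem concrete, here is a minimal synthetic sketch in Python (the variables and numbers are invented for illustration, not real crime or epidemiological data) of how a hidden factor such as population density can manufacture an apparent policy effect out of nothing, which is exactly the kind of artifact that simply reading the MOE table cannot reveal:

    # Synthetic illustration: a confounder creates an apparent "policy effect"
    # even though the policy has no direct effect on the outcome at all.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000

    density = rng.normal(size=n)                  # hidden confounder (e.g. urban density)
    policy = (density + rng.normal(size=n)) > 0   # denser places adopt the policy more often
    outcome = 2.0 * density + rng.normal(size=n)  # outcome is driven only by density

    # Naive comparison: the outcome looks very different with vs. without the policy...
    print("naive gap:", outcome[policy].mean() - outcome[~policy].mean())

    # ...but within a narrow band of the confounder, the gap largely disappears.
    band = np.abs(density) < 0.1
    print("gap within a density band:",
          outcome[policy & band].mean() - outcome[~policy & band].mean())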

To give perspective, let’s momentarily put that question aside and approach the matter assuming that the MOE really is reliable, that trends can be drawn year to year, and so on. In that case, what can an expert’s advice add to the situation? Nothing: the assumption that an MOE is more or less easy to understand and relate to policy actions (or personal choices, another subject for which people rely on expert advice) also means that the expert’s judgment doesn’t add useful information, and that her communication of the facts isn’t any more succinct or insightful than just looking at the numbers directly. Experts can still have a role as educators or in fixing mistakes, but there is no need to trust them in these cases.

Hence we recognize that only where the MOEs or situations are complicated, often with sparse data and without lopsided trends, can an appeal to expert authority make sense. However, this is not the only implication: putting aside how it is achieved, if a person were to attempt to deconflict the data in order to generate a reliable MOE, they would effectively be building their own competence and expertise in the subject. This doesn’t put them on the same level as the experts of whom we speak, but it does presume that they have sufficient skill and expertise to perform the task. And what is the output of this MOE-generation task? It would be some sort of correlation or predictive framework that generates the MOE given the inputs of the situation: in other words, something that a layman could evaluate in the same way as a simple MOE. The layman might not be able to evaluate the model’s operation itself, as that could require considerable expertise, but the layman can verify that the situations and results line up. So, momentarily ignoring the cases where there could be multiple such physically relevant models, this provides a constructive approach to the problem of having MOEs that can be used to validate the expert advice. Of course, if such models exist, then there is no need to rely on expert authority.
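
To make that verification step concrete, here is a minimal Python sketch, assuming a hypothetical black-box predict function and an invented toy history, of how a layman could score a model purely by whether its outputs line up with recorded outcomes, without ever inspecting its internals:

    # The layman never looks inside the model; they only compare its outputs
    # against what actually happened in situations they can verify themselves.
    def layman_check(predict, history, tolerance):
        """history: list of (situation, observed_outcome) pairs the layman trusts."""
        hits = 0
        for situation, observed in history:
            if abs(predict(situation) - observed) <= tolerance:
                hits += 1
        return hits / len(history)  # fraction of cases where model and reality agree

    # Usage with an invented toy model and toy history:
    toy_model = lambda situation: 2.0 * situation["x"]
    toy_history = [({"x": 1.0}, 2.1), ({"x": 2.0}, 3.9), ({"x": 3.0}, 9.0)]
    print(layman_check(toy_model, toy_history, tolerance=0.5))  # ~0.67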

There was a gotcha in the preceding discussion. The claim was that a proper predictive model got around the issue of complicated MOEs. The flaw is that the layman would not be able to understand the model within the economic constraints, so really this just displaced the complexity from the data itself to the model that interprets it. Notionally this re-asserts the original problem, and sometimes that is where you have to leave it, because you cannot tell whether the experts are helping. However, there is a case where this can be ignored: when the model in question is so much better (in historical prediction, backtesting, cross-validation, etc.) than the alternatives that it presumptively is an expert, or oracle. “Trust” is not the correct word for relying on a mechanical or informatic process, but appealing to the authority (however trivialized, as you understand from the preceding discussion) of such a model is valid in this case. Notionally you can extend the concept to a human being, claiming that their mind is as good as or better than such a model, based on the similar “they are so much better than everyone else” argument. With a large amount of historical data (to increase the confidence level, e.g. that the person is not merely lucky and the model is not overfitted) and tight, relevant parameters, you could make that claim.
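
The following is a minimal Python sketch, using invented backtest error scores rather than any real data, of what such a dominance check might look like: a model only earns oracle status if it beats every alternative in the overwhelming majority of comparable trials:

    # Grant "oracle" status only to a model that wins nearly every trial
    # against all alternatives evaluated on the same historical situations.
    import numpy as np

    def clearly_dominant(errors_by_model, win_fraction=0.9):
        """errors_by_model: dict of name -> array of per-trial errors (lower is better)."""
        names = list(errors_by_model)
        errors = np.array([errors_by_model[name] for name in names])  # (models, trials)
        best_per_trial = errors.argmin(axis=0)                        # winner of each trial
        for i, name in enumerate(names):
            if (best_per_trial == i).mean() >= win_fraction:
                return name
        return None  # no model is dominant enough to invoke as an authority

    # Usage with invented backtest errors:
    rng = np.random.default_rng(1)
    scores = {
        "model_a": rng.normal(1.0, 0.1, 500),  # consistently lowest error
        "model_b": rng.normal(2.0, 0.1, 500),
        "model_c": rng.normal(2.5, 0.1, 500),
    }
    print(clearly_dominant(scores))  # -> "model_a"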

To complete the argument by cases, we must return to the assumed point: that multiple such models with relevant parameters do not exist or are not easily offered as counterpoints. This is a very complicated question, and I will not recount here the fundamentals behind claiming that a model or technique embodies knowledge. The particular cases where this comes into play are when:

  • Two different models with two different key parameter sets have roughly similar accuracy.
  • Two different models with the same parameter set have roughly similar overall accuracy, but differ in key cases covering the situations under consideration.

With only this information in hand, neither model can be said to be more correct; the math is the same either way. If you were to add the “goodness” of the selection of the parameters, using criteria such as relative relevance to related problems, tendency to overfit when applied to other problems, or ostensible physical distance from the result (e.g. using income level to predict educational performance), you could have an expert tell you which are the “better” parameters. However, because these models each give similar results, you rapidly hit diminishing returns in terms of increased confidence. Mechanically, this is because if you were to replace one model’s parameters with another’s (one by one), when both sets were originally optimal, the replacement will by definition give a suboptimal or no-better result (see the sketch following the list below). Hence if you said the “goodness” of physical distance was worse for one parameter set, but you replaced the parameters in that set with ones that were physically more proximal, you would decrease your accuracy. In other words, by these substitutions you are refuting the notion that physical distance, or other such metrics, actually are “better” except as some sort of complicated ensemble, which, practically speaking, amounts to asserting the conclusion, because there is no stepwise methodology that gets you to those particular answers. By this, I do not mean to imply that physics-based explanations and the like are not valid; rather, when models disagree by significant amounts, to the point that the input parameters are different but the results are the same, likely either:

  • Novel or unknown physics, missing measurements, and the like are keeping the correlation below 100%
  • Human behaviors or similar factors are involved

and once you drill down on these cases, if you still have not been able to improve on the accuracy of one parameter set vs. another, you have probably run into one or more of these phenomena, and you would not even have a good idea of which “goodness of parameter” tests to run.
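
Here is a minimal synthetic sketch in Python of the parameter-substitution argument referenced above (the data and parameter names are invented for illustration): two models built on different parameter sets reach similar accuracy, and swapping a parameter from one tuned set into the other is, in this construction, either no better or strictly worse:

    # Two parameter sets that are different noisy proxies for the same drivers:
    # each set does about equally well, and cross-substituting parameters
    # cannot improve on either originally optimal set.
    import numpy as np

    rng = np.random.default_rng(2)
    n = 5_000
    u, v = rng.normal(size=n), rng.normal(size=n)
    y = u + v + 0.3 * rng.normal(size=n)  # outcome driven by two latent factors

    features = {
        "a_u": u + 0.5 * rng.normal(size=n), "a_v": v + 0.5 * rng.normal(size=n),  # set A
        "b_u": u + 0.5 * rng.normal(size=n), "b_v": v + 0.5 * rng.normal(size=n),  # set B
    }

    def test_error(names):
        """Fit ordinary least squares on the first half, report error on the second."""
        X = np.column_stack([features[k] for k in names] + [np.ones(n)])
        train, test = slice(0, n // 2), slice(n // 2, n)
        coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        return np.mean((X[test] @ coef - y[test]) ** 2)

    print("set A          :", test_error(["a_u", "a_v"]))  # similar accuracy
    print("set B          :", test_error(["b_u", "b_v"]))  # similar accuracy
    print("swap a_v -> b_v:", test_error(["a_u", "b_v"]))  # no better
    print("swap a_v -> b_u:", test_error(["a_u", "b_u"]))  # strictly worse (information lost)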

As for the case where two different models with the same parameters give different probability spectra with similar risk-weighted outcomes, that is not even a question that can be addressed, as there is no basis for additional “goodness” once you have added risk-weighting and similar shortcuts on top of the raw model outcomes.

To finish with the topic of using experts and expert models as proxies to overcome confounders: only dominant accuracy can indicate the best MOE in a way that a layman can recognize. Hence for any heavily confounded MOE without an explanation or model of dominant accuracy, neither an expert nor a layman will be able to pick a single best MOE.

Returning to the original points: if you have such an MOE, then you can pick the correct type of expertise by its historical performance according to that MOE.

Turning to the final issue of expertise leading to accurate advice: ensuring your experts actually agree with each other, so that you are picking the “right” experts rather than the “wrong” experts. At its coarsest, we could talk about the obviously fallacious majoritarian argument and similar appeals, which trivially fail to give confidence. To be explicit about the narrowed scope: assume that constructive credentialing (i.e. not just deeming your opponents heretics because they disagree with you, belong to groups that disagree with you, or cannot join your group because they are heretics), as well as the MOEs, do not clearly disambiguate between the two groups. In that case, only a detailed review of the controversies between the groups could reveal points of difference that could be used to deem one set of experts “right”. To gain such detailed expertise, and to sort out the detailed facts accurately, means that you have become an expert in order to determine who the experts are; hence it is not a constructive approach compared with just learning the issue directly. In other words, if you have significant questions about which group of experts holding contrary viewpoints is correct, and in what respects, you have no way to rely on either group’s assertions, as you don’t even know what the “expert” opinion is.

To summarize how the problem of expert validation narrows the scope of usable expertise:

  • An expert (or their model) that clearly outperforms all others over a sizable number of trials and situations can be relied on; hence its authority can be invoked.
  • Anything else is not viable as an authority, because you cannot make effective choices among multiple likely options without gaining your own expertise.

and to re-integrate into an economic context:

It may be optimal to invoke or rely on general expert authority when you need to make a quick purchase decision or some sort of emergency decision, where you can get a quick sense of what is good and bad in the situation, but not a complete accounting, within your available research time. In any other situation, you can only consider invoking a clearly dominant expert.