
Why low confidence doesn't always mean inaccurate

Continuing the discussion from Media Lens 'climate change', vs Daily Skeptic, somewhat of a poll of 5F posters:

I’m spinning this little post off from the main thread, as I want to discuss a single point that I hope will help to clarify the somewhat tricky questions of confidence, plausibility and accuracy as applied to a mathematical model.

Let’s take a real-life example that has nothing to do with climate change, and that will hopefully be a lot easier to get to grips with while still illustrating the situation we’re facing. I’m leaving out almost all the maths and just concentrating on the basic point about certainty and accuracy.

In 2004, Burnham and Roberts published their (in)famous Lancet study estimating Iraq war deaths. Because of the difficulty of gathering data in a war zone, and other more mundane problems, the authors were unable to gather much data, and as a result their estimate came with a very large confidence interval: anywhere from 8,000 to 200,000 deaths resulting from the war.

The sparsity of the data meant that the confidence interval was very large.

The central estimate (the statistical best fit) was somewhere around 100K deaths resulting from the war. However, when questioned further, the authors said they thought the true number was a lot higher than that, let’s say somewhere closer to 150K deaths. That statement was criticised by many people who didn’t want to accept the data (particularly the Iraq Body Count folk).
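If you want to see the mechanism for yourself, here is a minimal simulation sketch of why a cluster survey with only a handful of clusters produces a huge confidence interval. All the numbers (rates, cluster counts, population) are invented for illustration; nothing here comes from the actual Lancet dataset.

```python
# Illustrative only: sparse cluster samples => wide confidence intervals.
import numpy as np

rng = np.random.default_rng(42)

TRUE_RATE = 0.005        # hypothetical excess-death rate per person
POPULATION = 25_000_000  # hypothetical national population

def estimate_total_deaths(n_clusters, people_per_cluster=1_000):
    # Violence is geographically clumped, so cluster-level rates vary a lot.
    cluster_rates = rng.gamma(shape=0.5, scale=TRUE_RATE / 0.5, size=n_clusters)
    deaths = rng.poisson(cluster_rates * people_per_cluster)
    point = deaths.sum() / (n_clusters * people_per_cluster) * POPULATION
    # Bootstrap over clusters for a rough 95% confidence interval.
    boot = np.empty(2_000)
    for i in range(boot.size):
        resample = deaths[rng.integers(0, n_clusters, n_clusters)]
        boot[i] = resample.sum() / (n_clusters * people_per_cluster) * POPULATION
    lo, hi = np.percentile(boot, [2.5, 97.5])
    return point, lo, hi

for k in (30, 300):
    point, lo, hi = estimate_total_deaths(k)
    print(f"{k:4d} clusters: estimate {point:,.0f}, 95% CI [{lo:,.0f}, {hi:,.0f}]")
```

With ten times as many clusters the interval narrows dramatically; the 2004 team simply couldn’t safely visit enough of them, which is the whole story of the wide interval.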

This situation is very, very similar to the hockey stick graph and Mann’s statement that he thought the 1990s were the warmest decade in 1,000 years: sparse data, wide confidence intervals, and a claim criticised by a lot of people who didn’t want to accept it.

Consider a hearing where a different scientist is asked (under oath) to comment on the Lancet study (Q = questioner, A = scientist):

Fictional Senate hearing

Q: Do you think that the study is well done, and scientifically valid?
A: Mostly yes. The data was sparse, and not well distributed, but overall the study is important.

Q: The authors have stated that, based on the study, their estimate is that over 100K, maybe even 150K people died. What do you think of those numbers?
A: Well… it’s a bit tricky to be sure from the confidence intervals of the published model, but I would say that yes, the number of dead could well be that high. It’s a plausible number.

Q: But the study also says that the true number of dead could be as low as 8K - is that true?
A: Yes, that’s also true, but it seems a lot less likely than the 150K number.

Q: Really? What certainty, or confidence, do you have that the 150K number is more accurate? 95%? 90%? 50%?
A: Actually, given the sparsity of the data, it’s very hard to put any quantified confidence on the 150K number at all. The error bars are wide and the data were very sparse. It’s possible that 150K is true, and given that the study couldn’t accurately survey the most dangerous areas, where the deaths were highest, it’s quite plausible that the true toll is closer to 150K than to 8K. But the numbers in the study don’t give me any way of determining the certainty of that estimate.

Q: So you are saying that you have no actual certainty that the deaths were 150K? Not even 10% certainty?
A: Correct, I can’t give any certainty to the 150K. It’s not possible to quantify how certain that number is.

and … scene
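To put some rough numbers on the scientist’s intuition that 8K “seems a lot less likely” than 150K, here is a back-of-the-envelope sketch. It assumes a crude normal approximation to the published result (centre around 100K, 95% interval roughly 8K to 200K, as above); the real model was nothing so simple, so treat this purely as illustration.

```python
# Illustrative only: compare the relative plausibility of 8K vs 150K
# under a crude normal approximation of the published estimate.
from scipy.stats import norm

POINT, LO95, HI95 = 100_000, 8_000, 200_000
SE = (HI95 - LO95) / (2 * 1.96)  # a 95% normal interval spans ~3.92 standard errors

for value in (8_000, 150_000):
    rel = norm.pdf(value, loc=POINT, scale=SE) / norm.pdf(POINT, loc=POINT, scale=SE)
    print(f"{value:>7,}: plausibility relative to the centre {rel:.2f}")
```

150K sits about one standard error from the centre, while 8K is nearly two out, so even under this cartoon model 150K is several times more plausible than 8K. Yet neither comes with a usable “percent certainty”, which is exactly the scientist’s predicament.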

This is very close to what happened in real life with the Iraqi mortality study, and very similar to what happened with Mann’s hockey stick study.

There was no way, under oath, for the scientist to come up with a concrete measure of the certainty. It’s a very difficult question. However, it is pretty clear that the 150K number is not only plausible but, given the circumstances, actually likely. In fact, when the study was redone two years later, the new results confirmed that the 150K number had been pretty close to the truth.

The study had high accuracy but low precision. That’s why it was hard to be certain, yet still reasonable to find the numbers quite plausible, especially given other lines of evidence not directly in the paper.
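If the accuracy/precision distinction sounds odd, this tiny simulation (invented numbers again) contrasts a survey-style estimator that is centred on the truth but noisy with a passive tally that is very stable but systematically low because it only counts reported deaths:

```python
# Illustrative only: high accuracy / low precision vs low accuracy / high precision.
import numpy as np

rng = np.random.default_rng(0)
TRUTH = 150_000

# Survey-style estimator: unbiased (accurate) but with a huge spread (imprecise).
survey = rng.normal(loc=TRUTH, scale=50_000, size=10_000)
# Passive tally: tiny spread (precise) but misses unreported deaths (inaccurate).
tally = rng.normal(loc=40_000, scale=2_000, size=10_000)

for name, draws in [("sparse survey", survey), ("passive tally", tally)]:
    print(f"{name}: mean error {draws.mean() - TRUTH:+,.0f}, spread {draws.std():,.0f}")
```

The survey’s average lands on the truth even though any single run can miss badly; the tally never wobbles, but it is always wrong in the same direction.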

Now imagine that an anonymous poster on a message board were to say something like:

If [Burnham and Roberts] are incapable or unwilling to see that the data they are using is too meagre and uncertain to support their statements then they are either delusional or incompetent.

Is this true? Does the inability of a scientist to ascribe a degree of certainty about the numbers mean that they are incompetent or delusional? Does this mean that the 150K number was wrong, or that they were wrong to suggest it?

Clearly it is not true: Roberts and Burnham are neither delusional nor incompetent. They were doing the best they could, making the most of a difficult data situation. In fact, upon replication of the study, it turned out that they were largely correct! This is exactly the same situation as Mann saying that he thought the 1990s were the warmest decade in 1,000 years. It wasn’t possible to quantify the certainty of the statement, but given all the data and the lines of inquiry, it was a very plausible statement to make. And upon replication, it has turned out to be true.

So I hope that this little example shows how it’s possible to be uncertain, but still accurate and plausible.

A little glimpse into one little intricacy of mathematical modelling.

Cheers
PP


Actually the analogy goes even deeper: there were several attempts to get Roberts and Burnham to release their data and so on, and several groups with a definite axe to grind hounded the authors for ages.

Sincere scientists who publish results that the status quo would prefer to hide end up having a terrible time… just like many climate scientists.


All this still sounds like ‘reasons to be certain’ in a situation where certainty is simply not justified, P. Very uncomfortable for human psychology to have to live with uncertainty, I know. But simply unavoidable sometimes.

The current upshot of these two issues seems to be: the exact death-toll of the Anglozionist empire’s aggression against the Iraqis, to steal their oil, is still uncertain (in the dictionary meaning of the word); as is the precise near-future behaviour of the climate.

It may well be that things will get hotter in the immediate future. Or, alternatively, by means still obscure to us, the current ice age in which we’re still supposed to be may reassert itself. That was certainly an idea that was afloat earlier in my lifetime. Remember that hit film, ‘The Day After Tomorrow’? Crappy Hollyshite, sure; but well within the zeitgeist of that - recent - time.

I’m still inclined to give moderately good odds for a hotter immediate future. But that’s all it would be: a gamble, on processes which it’s beyond us to be able to call precisely. Enforcing large-scale changes? Yes, highly likely. But predictable in detail, with peremptory certainty? No, not doable. Not justified.

The need for caution in conclusions is unavoidable. Just be prepared, in ways which I know we both understand, so as to be able to do a Taoist flowing with the realities - whatever they turn out to be. :slight_smile:


Hi RG

How certain are you that the death toll in Iraq a year or two after the invasion was greater than 8,000 people? I’m 100% certain, even if you can’t mathematically prove it. You don’t need to know all the details to know that the big picture is certainly correct.
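For what it’s worth, even taking the study’s own interval at face value, under the same crude normal approximation as before (centre ~100K, 95% interval ~8K to 200K), that certainty is hardly a stretch:

```python
# Illustrative only: probability the true toll exceeded 8,000 under a crude
# normal approximation of the published estimate.
from scipy.stats import norm

POINT, LO95, HI95 = 100_000, 8_000, 200_000
SE = (HI95 - LO95) / (2 * 1.96)
print(f"P(deaths > 8,000) ~ {norm.sf(8_000, loc=POINT, scale=SE):.2f}")  # ~0.97
```

And the model can’t capture everything we also know from outside it, which is why “100% certain” is a reasonable human summary even when the maths will only hand you about 97%.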

The same is true for the big picture of climate change.

Cheers

PS The second point is that it’s usually a mistake to use the dictionary definition of “certain” when talking to scientists about a mathematical model. Certainty and confidence have more specific, technical meanings there: a 95% confidence interval, for example, is a statement about the estimation procedure, not about anyone’s gut-level conviction.


Please could you c.c. this to the lovely Devi Sridhar :wink:
