Historical methodology

Note: It is possible that, as I continue with these essays, I may find additional methodological points that I forgot to cover in the present essay. I will add any such points as I find them. I probably, however, will not start a revision history for these documents until I have completed a first draft of the entire series.

Richard Carrier’s grand project to investigate the historicity of Jesus consists primarily of two large books. The first, Proving History: Bayes’s Theorem and the Quest for the Historical Jesus, centers squarely on historical methodology. The second, the one I am summarizing in the present series of essays, On the Historicity of Jesus: Why We Might Have Reason for Doubt, applies the methodology he developed in the first book to the question of whether Jesus Christ began as a historical figure or a mythological one. In order for my series of essays to make sense, I will therefore need to include at least some discussion of methodology.

In the interest of full disclosure, I will note right from the start that I have not read Proving History. I already have a passing familiarity with Bayes’s Theorem, and thus did not find reading the first book to be necessary to my understanding of On the Historicity of Jesus. Therefore, it is likely that my summary of the methodology that Carrier used in his second book will be contextualized differently from how Carrier himself would characterize it, and it is certainly possible that I might make some mistakes in my presentation. I naturally welcome all criticisms and corrections.

Carrier shows little love for current and past methodologies used in the field of historical studies, particularly with respect to the ancient world, where evidence is significantly sparser than it is for more modern time periods. It is my understanding that Proving History contains lengthy discussions and debunkings of many commonly used methodologies in the field of ancient history. Since I haven’t read that book, I don’t have access to his examples and arguments. I will therefore make the case in this essay for Carrier’s methodology, but will largely leave aside the case against the more commonly accepted methodologies.

The Role of Probability in Historical Analysis

History is largely about determining what happened in the past. What is often not acknowledged, however, is that any such determination is inherently probabilistic. If we are absolutely confident that some event happened in the past, then we are 100%, or close to 100%, certain of it. If we think it is likely that some event happened in the past, but acknowledge that there are some alternate possibilities that are non-negligible, we may be 70% certain of that event. If we think that it is barely possible that some event happened, but that there are other explanations that are much more likely, then we might be only 2% certain of that event. Any historical conclusion can, and according to Carrier should, be labeled with a probabilistic confidence level.

Of course, multiple experts who reach the same conclusions qualitatively may differ significantly on their assigned probabilities. Two such experts may agree that X probably happened, but one may think that it is 95% likely, and the other may think it is 70% likely. Our methodology needs to have a way to account for the inherent subjectivity in the evaluation of evidence.

A natural approach to this would be, for every historical conclusion, to explicitly state three probabilities. If we are comparing hypothesis X with hypothesis Y, we would therefore state our estimated probability of X over Y if we interpret the evidence as favorably toward X as seems possible, as favorably toward Y as seems possible, and as unbiased as possible. The first two give the range of possibilities that reality has to fall between, and the last gives the author’s most reasonable interpretation. If, after accounting for all of the evidence, the determination is that hypothesis X is 73% to 98% likely, with the most reasonable interpretation of the author that it is 92% likely, then we can infer that the best conclusion is that X is what happened. If, however, we determine that hypothesis X is 36% to 92% likely, then wherever the author’s most reasonable interpretation is, that author will have to acknowledge that it is reasonable for experts to disagree about X vs. Y, and it perhaps is most reasonable to withhold judgement.

Bayes’s Theorem is a rigorous method for combining probabilities from multiple sources of evidence into an overall probability measurement. Even though Carrier spends a great deal of time going through the details of Bayes’s Theorem and its application to the evidence that he presents, I will not be spending a huge amount of time on this aspect of the argument. I think that the gist of the argument can be grasped without going through the math.

Finally, Carrier further simplifies the probability reporting and calculation by only explicitly addressing two of the three probability categories. He interprets the data as generously as possible in favor of the minimal historicity hypothesis, and he interprets the data as realistically as possible, leaving out the interpretation that is as generous as possible in favor of minimal mythicism. This is a reasonable move by Carrier since even his realistic interpretation yields only an overall 0.008% chance of historicity being true. Since that chance is negligible already, it isn’t particularly meaningful to calculate an even lower value that would be the lower bound on this calculation. As a result, his conclusion is that minimal historicity is somewhere between 0.008% and 33% likely to be true (the former being his realistic estimate and the latter being the interpretation when being as generous as possible to minimal historicity).

Comparing Hypotheses

Imagine a shooting where there is eyewitness testimony that two shots were heard from different locations, and yet only a single bullet was found (this is my example, not Carrier’s). According to Carrier, it is common in the historical literature, in situations analogous to this one, to propose an explanation, such as that one of the bullets missed and was lost, and then to examine only the evidence in support of that explanation. If the evidence is what that theory would lead us to expect, then it is counted as support for the theory.

There is, however, a logical fallacy contained in this conclusion. Consider an alternate explanation, that the second audible shot was an echo of the first audible shot. If these are the only two reasonable explanations, then the eyewitness testimony and the finding of a single bullet are equally expected under both explanations. The evidence, therefore, does not favor either explanation. Other evidence might, such as the outcome of a test firing of a shot at the same site under similar weather conditions to evaluate the echo hypothesis, but the original evidence is neutral between the two explanations. Bayes’s Theorem shows this rigorously, but the idea should be clear enough without going through the math.

The take-home point is that a historical methodology that examines evidence for a hypothesis without explicitly comparing it to competing hypotheses is fatally flawed. A further take-home point from the analysis of Bayes’s Theorem is that in such an analysis, it is only the relative likelihood that matters, not the absolute probabilities. Suppose there is a 1 in 10,000 chance that you will be struck by lightning sometime over the course of your life. Let’s also suppose that people spend 1% of their lives in physical contact with another person, meaning that there is only a 1 in 1,000,000 chance that you will be struck by lightning while touching another person. So, if you are struck by lightning, what is the chance that you were touching someone else at the time? 1%, because we’ve already said that you were struck by lightning, so the tiny probability of the strike itself no longer matters. In the case of historical evidence, we already know that we have that evidence. What matters is how likely it is that we would see that evidence under hypothesis A compared to how likely it is that we would see it under hypothesis B.
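The lightning arithmetic can be checked in a few lines. This is only a sketch using the illustrative numbers from the example (they are not real statistics), the variable names are my own, and it assumes that touching someone is independent of being struck.

```python
from fractions import Fraction

# Illustrative numbers from the example, not real statistics.
p_struck = Fraction(1, 10_000)   # chance of being struck by lightning in a lifetime
p_touching = Fraction(1, 100)    # fraction of life spent touching another person

# Joint probability: struck AND touching someone at that moment
# (assumes the two are independent).
p_struck_and_touching = p_struck * p_touching

# Conditional probability: given that we WERE struck, were we touching someone?
p_touching_given_struck = p_struck_and_touching / p_struck

print(p_struck_and_touching)     # 1/1000000
print(p_touching_given_struck)   # 1/100
```

The rarity of the lightning strike cancels out of the division, which is why only the relative likelihood survives.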

So, Carrier’s strategy is to compare each type of evidence that could bear on the question of historicity vs. myth and determine how likely it is that this evidence would be expected under each of those two hypotheses. If we would expect that evidence equally under those two hypotheses, then it’s a wash… that particular evidence doesn’t favor one over the other. If, however, the evidence makes perfect sense under historicity but is a stretch under mythicism, then the probability has to favor historicity.
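For readers who do want to see the arithmetic, here is a minimal sketch of that comparison for two competing hypotheses. The function name and all of the numbers are hypothetical choices of mine for illustration; they are not Carrier’s figures.

```python
def update(prior_h: float, p_e_given_h: float, p_e_given_m: float) -> float:
    """Probability of hypothesis H (vs. its single rival M) after seeing evidence E.

    prior_h     -- probability of H before considering this evidence
    p_e_given_h -- how expected the evidence is if H is true
    p_e_given_m -- how expected the evidence is if M is true
    """
    numerator = prior_h * p_e_given_h
    denominator = numerator + (1.0 - prior_h) * p_e_given_m
    return numerator / denominator

# Evidence equally expected under both hypotheses: a wash, the prior is unchanged.
print(update(0.5, 0.8, 0.8))              # 0.5

# Evidence fully expected under H but a stretch under M: H gains ground.
print(update(0.5, 1.0, 0.25))             # 0.8

# Same 4-to-1 ratio with much smaller absolute probabilities: same answer.
print(round(update(0.5, 0.1, 0.025), 6))  # 0.8
```

The last two calls differ only in the absolute size of the probabilities, not in their ratio, and they land in the same place.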

When is Evidence less than 100% Expected?

Clearly, for this kind of analysis to work, it is key to be able to identify when some kind of evidence is expected under a given hypothesis and when it is unexpected. The way to evaluate this is to put the evidence out of your mind, get yourself in the mindset that hypothesis A is true, and then ask how likely it is that you would then see that evidence. If that evidence is directly predicted by the hypothesis, then it is 100% expected. If it is consistent with the hypothesis, but not fully expected, then its probability should be somewhat less than 100%. If the evidence seems surprising in light of hypothesis A, then its probability has to be considerably lower still. An important pitfall to avoid is introducing ad hoc assumptions. “Well, I’d expect that evidence if this other fact were also true,” is an ad hoc assumption. Such an ad hoc assumption necessarily reduces the probability below 100%, because we have to account for the possibility that the other fact might not be true.
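The cost of an ad hoc assumption can be made concrete with the law of total probability. The sketch below is my own illustration, with a hypothetical function name and made-up numbers: the evidence is certain if the extra fact holds, unlikely if it does not, and the extra fact itself is a coin flip.

```python
def expectation_with_assumption(p_assumption: float,
                                p_e_if_assumption: float,
                                p_e_if_not: float) -> float:
    """How expected the evidence is under a hypothesis when it is only
    well explained if some extra, unverified assumption also holds."""
    return (p_assumption * p_e_if_assumption
            + (1.0 - p_assumption) * p_e_if_not)

# Hypothetical numbers: evidence certain (1.0) if the extra fact is true,
# 25% likely otherwise, and the extra fact has a 50% chance of being true.
print(expectation_with_assumption(0.5, 1.0, 0.25))   # 0.625
```

Unless the extra fact is itself certain, the result always falls below what it would be if we simply took the assumption for granted.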

Absence of Evidence and Evidence of Absence

It is often stated that absence of evidence is not evidence of absence. However, there are some cases where this pithy phrase is completely wrong.

Imagine that you were trying to reconstruct the succession of emperors of some long-lost empire. Through a variety of sources you hear about Emperor Frank, Emperor Charlie, and Emperor Tony. Then, in the course of your research, you come across a tome purporting to be a history of the empire that contains a separate chapter for each emperor, fifteen in all, running from the original founding of the empire to its final overthrow by a neighboring empire. Within this document you find a chapter on Emperor Frank and a chapter on Emperor Tony, and chapters on thirteen other emperors of whom you had previously found no evidence. But the tome does not contain a chapter on Emperor Charlie. There is, very specifically, an absence of evidence concerning Charlie. But, very specifically, that absence of evidence is in a document where we have very good reason to expect that evidence to be. Its absence is in fact evidence that Charlie didn’t exist.

There are, of course, alternate explanations. Perhaps the author or a later editor of the history tome removed the chapter on Charlie. Perhaps he was a pretender to the throne during a period when another emperor rightfully held it. These are, however, ad hoc hypotheses, and each carries a reduced likelihood for that reason unless outside evidence supports it over its alternatives.

If more documents are found that similarly outline the history of this empire, none of which mention Charlie, and which together provide reliable dating of the reigns of the various emperors with no gaps within which Charlie’s reign could have fit, it becomes less and less likely that Charlie was ever an emperor. But again notice, this is an absence of evidence that is directly providing evidence of absence.

One must clearly be careful when invoking this type of argument. The expectation that the missing evidence would be present has to be exceedingly well supported for this type of argument to work. But it is fallacious to rule out the argument in those cases where the expectation is extremely well supported.

Combining Probabilities

Bayes’s Theorem, as I mentioned, provides a strict mathematical framework within which probabilities from independent sources of evidence can be combined into overall probabilities. As I also mentioned, I’m not going to go into the details of Bayes’s Theorem at this point. Nonetheless, there is one point that is worth bringing up about how probabilities are combined, which I will illustrate with an example.

Let’s suppose that we have a pen pal that we have never met. Let’s further suppose that we have heard from another source that this pen pal has a spouse. Then, at some point, we notice that we never seem to hear anything in the letters we get from this pen pal about this spouse, and thus start to question whether the information about the existence of this spouse is accurate. How can we evaluate the evidence?

How likely is it that, if the pen pal doesn’t have a spouse, the pen pal would not mention a spouse in a given letter? Close to 100%. It may, however, be reasonably likely that the spouse wouldn’t be mentioned even if the spouse exists. Maybe we estimate that it’s at most 95% likely that, in a given letter, the spouse wouldn’t come up even if the spouse exists. I would consider that to be an overestimate, but using it gives us an idea of the maximum chance that the spouse exists.

But how many letters do we have from this pen pal? Let’s say 20. If it is 95% likely in a given letter (or a 0.95 out of 1 chance), then for two letters it is 0.95 times 0.95 out of 1, or 0.9025 out of 1, or 90.25% likely that the spouse wouldn’t be mentioned in those two letters. Multiplying twenty factors of 0.95 together (that is, raising 0.95 to the 20th power) gives us the chance that we estimate the spouse wouldn’t be mentioned in any of those letters. Do the math and you find that it’s only 35.85% likely that the spouse wouldn’t be mentioned in any of the letters. A few letters with no mention can be discounted. A lot of letters with no mention becomes suspicious. And multiplying our probabilities tells us how suspicious we should be.
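The pen pal arithmetic can be reproduced with the short sketch below. It treats the letters as independent of one another, which is an assumption, and the final step uses a 50/50 starting point for whether the spouse exists, which is purely my own hypothetical choice to show how the silence would shift our confidence.

```python
p_silent_if_spouse = 0.95   # chance one letter omits the spouse, if the spouse exists
p_silent_if_none = 1.0      # chance one letter omits the spouse, if there is no spouse
letters = 20

# Assuming the letters are independent, per-letter probabilities multiply.
p_two_silent = p_silent_if_spouse ** 2
p_all_silent_if_spouse = p_silent_if_spouse ** letters
p_all_silent_if_none = p_silent_if_none ** letters

print(f"{p_two_silent:.4f}")             # 0.9025
print(f"{p_all_silent_if_spouse:.4f}")   # 0.3585

# Hypothetical 50/50 prior on the spouse existing, updated on 20 silent letters.
prior_spouse = 0.5
posterior_spouse = (prior_spouse * p_all_silent_if_spouse) / (
    prior_spouse * p_all_silent_if_spouse
    + (1.0 - prior_spouse) * p_all_silent_if_none
)
print(f"{posterior_spouse:.2f}")         # 0.26
```

With a different starting point the final number changes, but the direction of the shift does not: every additional silent letter counts a little more against the spouse.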

Final Thoughts on Methodology

I believe that the points I have addressed above should be sufficient background to understand the techniques that Carrier employs in analyzing the evidence to compare minimal historicity to minimal mythicism. In short, Carrier examines each class of evidence from the perspective of how expected that evidence is under the minimal historicity hypothesis and how expected it is under the minimal mythicism hypothesis. He converts those expectations into a pair of probabilities, his best estimate and an a fortiori estimate. This latter estimate is Carrier’s evaluation of the probability given the most generous-to-historicity reading of the evidence, and so represents the best case for historicity being true. He combines these probabilities using Bayes’s Theorem, and we get an overall estimate of how likely he thinks it is that historicity (vs. mythicism) is true (based on his realistic estimates) and the maximum likelihood of it being true (based on his a fortiori estimates).
