LSE professor argues Bayesian reasoning can make predictions even with small sample sizes


In 1854, a mother washed her baby’s diaper at a town well on Broad Street, London. Fecal matter contaminated the water in the well, resulting in the deaths of more than 500 people in just 10 days.

British physician John Snow, who investigated the cholera outbreak, called it the most terrible outbreak of cholera ever to occur in the UK.

People had previously thought that “miasma,” colloquially known as “bad air,” rose from cesspools filled with feces, waste, and other toxic substances and caused the outbreak.

Snow’s study found that victims of all ages, genders, and socioeconomic statuses had been sickened by ingesting water contaminated with bacteria that leaked from cesspools and sewage dumps across London.

However, Snow’s theories were challenged multiple times because the evidence was incomplete. Tasha Fairfield, associate professor of international development and political science at the London School of Economics, asked in her Oct. 31 lecture whether Bayesian statistics could have helped solve this problem.

“We come up with a few plausible explanations for whatever we’re studying,” Fairfield said. “Based on our background knowledge, we have some initial sense of what those explanations are.” 

Bayesian statistics treats probability as a degree of belief based on a state of knowledge.

This approach allows researchers to learn about the plausibility of a causal hypothesis that makes predictions for cases that researchers haven’t studied yet, Fairfield explained. In simpler terms, this approach can support predictions from only a small amount of data.

To illustrate this idea, Fairfield gave the example of “twenty questions,” a game in which participants ask up to 20 yes-or-no questions to figure out what subject another player has in mind.

“To efficiently reduce uncertainty, instead of asking questions designed to eliminate one specific possibility at a time (e.g., Barack Obama, duck-billed platypus, earl-grey gelato) we should aim to ask questions that halve the remaining possible hypotheses at each stage (e.g., something like: is or was the subject a living organism?),” Fairfield wrote in her paper, “A Bayesian Perspective on Case Selection.” 

She explained that although this strategy may not single out one hypothesis, it can narrow the field down to a smaller set of hypotheses.
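The arithmetic behind the halving strategy can be sketched in a few lines of Python. This is only an illustration, not code from Fairfield's paper: the function names and the numbered candidate list are hypothetical, and the sketch assumes every question splits the remaining candidates exactly in half.

```python
def questions_needed_halving(n_candidates: int) -> int:
    """Worst-case yes/no questions when each question halves the candidates.

    Equals ceil(log2(n)), computed with integer arithmetic to avoid
    floating-point rounding.
    """
    return (n_candidates - 1).bit_length() if n_candidates > 1 else 0


def questions_needed_one_at_a_time(n_candidates: int) -> int:
    """Worst case when each question rules out a single candidate."""
    return max(n_candidates - 1, 0)


def play_halving(candidates, secret):
    """Find `secret` by repeatedly asking: 'Is it in the first half?'"""
    remaining = list(candidates)
    asked = 0
    while len(remaining) > 1:
        middle = len(remaining) // 2
        first_half = remaining[:middle]
        asked += 1
        # A "yes" keeps the first half; a "no" keeps the second half.
        remaining = first_half if secret in first_half else remaining[middle:]
    return remaining[0], asked


if __name__ == "__main__":
    n = 2 ** 20  # roughly a million possible subjects
    print(questions_needed_halving(n))        # questions if each one halves
    print(questions_needed_one_at_a_time(n))  # questions if guessing one by one
    print(play_halving(range(8), secret=5))   # (answer, questions asked)
```

Under that assumption, 20 well-chosen questions are enough to separate about a million candidates, while guessing one subject at a time could take about a million questions.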

If the cholera outbreak on Broad Street were studied today using Bayesian statistics, Fairfield would build her hypotheses around two different causal mechanisms.

In the language of statistical hypothesis testing, the null hypothesis, against which the alternative is compared, is that drinking bad water from a pump gives you cholera. The alternative is that the real cause is eating contaminated meat from the butcher shop.

In terms of a causal variable X (here, proximity to the pump) and an outcome Y (here, illness), Fairfield said that high-X, high-Y cases are sick people living near the pump, and low-X, low-Y cases are healthy people living far from the pump.

“Individuals may make shopping trips to get a good price on meat from the butcher, but they don’t stay in the community long enough to be drinking the pump water,” Fairfield said. “That would provide a strong weight of evidence in favor of H1 over H2. Alternatively, we may find that these people are drinking water from the pump from tea time visits with their friends, but they’re not staying long enough to be eating contaminated meat at dinner. That would have strong weight evidence of H2.”
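A short sketch can make “weight of evidence” concrete. In the odds form of Bayes’ rule, the posterior odds of one hypothesis over a rival equal the prior odds multiplied by the likelihood ratio, and the logarithm of that ratio is often called the weight of evidence. The code below assumes the decibel convention (10 times the base-10 logarithm); the function names and all probabilities are invented for illustration and are not taken from the lecture or the paper.

```python
import math


def weight_of_evidence_db(p_e_given_h1: float, p_e_given_h2: float) -> float:
    """Weight of evidence for H1 over H2 in decibels: 10 * log10(likelihood ratio)."""
    return 10 * math.log10(p_e_given_h1 / p_e_given_h2)


def posterior_odds(prior_odds: float, p_e_given_h1: float, p_e_given_h2: float) -> float:
    """Odds form of Bayes' rule: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * (p_e_given_h1 / p_e_given_h2)


def odds_to_probability(odds: float) -> float:
    """Convert odds of H1 over H2 into P(H1), assuming H1 and H2 are the only options."""
    return odds / (1 + odds)


if __name__ == "__main__":
    # Hypothetical numbers: the observed evidence E is ten times more
    # likely if H1 is true than if H2 is true.
    p_e_h1, p_e_h2 = 0.8, 0.08
    prior = 1.0  # equal prior odds: no initial preference between H1 and H2

    odds = posterior_odds(prior, p_e_h1, p_e_h2)
    print(round(weight_of_evidence_db(p_e_h1, p_e_h2), 1))  # weight of evidence in dB
    print(round(odds_to_probability(odds), 3))               # posterior probability of H1
```

With these made-up inputs, a single observation that is ten times more expected under H1 moves an even prior to roughly 10-to-1 odds in favor of H1, which is why one well-chosen case can carry substantial weight.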

This is an example of case selection using Bayesian statistics, and it highlights the approach’s use in explaining unique events that cannot naturally be treated as the product of random processes in a population.

Because the approach is flexible, it has been used in social science case studies to build predictive models for presidential elections, understand cancer rates in children living near power lines, and estimate the likelihood of a spontaneous mass protest in relation to economic decline, according to her paper.

Reach contributing writer Anh Nguyen at Twitter: @thedailyanh
