Saturday, March 17, 2018

Logical Fallacies

I'm developing new respect for the field of Statistics, nowadays rebranding as Data Science, given its 21st Century willingness to amalgamate (converge) what had been considered two mutually exclusive approaches:  frequentist versus Bayesian.

As Terry Bristol and I discussed at Tom's over breakfast not long ago, sometimes the most mature science is the one that overcomes a core either/or mentality.  Reality is made of particles.  Reality is made of waves.  Rather than a single Grand Unified Theory, why not have two?  Part of our GUT is that we need two ways of looking, at a minimum.

Operation DuckRabbit.

The prejudice against Bayesian thinking, expressed as antipathy towards its champion, Laplace, might trace in part to a school days lesson most of us learn.

If A then B does not imply B therefore A.  Example:  if it's raining, I will not go to the zoo.  I'm not at the zoo, ergo it's raining.  That does not follow.  It's a bright sunny day, but I didn't feel like going to the zoo, OK?

However, a Bayesian would say, the fact "I'm not at the zoo" constitutes new information vis-a-vis the hypothesis "it's raining".  P(it's raining, given I'm not at the zoo) > P(it's raining).  Given I'm not at the zoo, I'm more willing to bet that it's raining.
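Here's a minimal sketch of that bet in Python.  The numbers are made up for illustration (a 30% chance of rain, I never go to the zoo when it rains, and I skip the zoo 80% of dry days anyway), and the helper name `posterior` is my own.  The point is only that the inequality holds under such assumptions, not that these are anyone's real odds.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule for a yes/no hypothesis and one piece of evidence."""
    # Probability of seeing the evidence for any reason at all.
    p_evidence = prior * p_evidence_if_true + (1 - prior) * p_evidence_if_false
    return prior * p_evidence_if_true / p_evidence

# Assumed numbers, not from any data set.
p_rain = 0.30                 # prior: chance it's raining
p_no_zoo_given_rain = 1.00    # "if it's raining, I will not go to the zoo"
p_no_zoo_given_dry = 0.80     # sunny, but I often don't feel like going

p_rain_given_no_zoo = posterior(p_rain, p_no_zoo_given_rain, p_no_zoo_given_dry)

print(f"P(rain)                = {p_rain:.3f}")
print(f"P(rain | not at zoo)   = {p_rain_given_no_zoo:.3f}")
```

Not being at the zoo doesn't prove it's raining, which is the logician's point, but it does nudge the odds upward, which is the Bayesian's.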

Shifting to a more eugenic set of memes, what is the probability a randomly selected member of the population has blue eyes?  Let's say 36%, regardless of hair color.  Now I tell you said person has blond hair.  The chance said person's eyes are blue just went up to 45%.  Why?  Because having blond hair increases the likelihood of having blue eyes.
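One way to see where numbers like 36% and 45% could come from is to count heads in a hypothetical population of 1,000.  The four counts below are invented so that they reproduce the two percentages above; only the 36% and 45% come from the example itself.

```python
# Hypothetical head counts, chosen to match the 36% and 45% in the text.
counts = {
    ("blond", "blue"): 90,
    ("blond", "other"): 110,
    ("not blond", "blue"): 270,
    ("not blond", "other"): 530,
}

total = sum(counts.values())
blue = sum(n for (hair, eyes), n in counts.items() if eyes == "blue")
blond = sum(n for (hair, eyes), n in counts.items() if hair == "blond")

# Prior: blue eyes among everyone.
p_blue = blue / total

# Posterior: blue eyes among blonds only (conditioning shrinks the population).
p_blue_given_blond = counts[("blond", "blue")] / blond

# Same answer by way of Bayes' rule: P(blond | blue) * P(blue) / P(blond).
p_blond_given_blue = counts[("blond", "blue")] / blue
p_blond = blond / total
via_bayes = p_blond_given_blue * p_blue / p_blond

print(f"P(blue)         = {p_blue:.2f}")
print(f"P(blue | blond) = {p_blue_given_blond:.2f}")
print(f"via Bayes' rule = {via_bayes:.2f}")
```

Learning "blond" just means we stop looking at all 1,000 people and look only at the blonds, among whom blue eyes are more common.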

Draw some probability distribution.  That's my reality right now.  I just draw an invisible landscape of what I consider likely.

Now let the data stream in for a while.  Roll the dice a few times.  What's my probability distribution now?  How about now?

My prior beliefs, "compromised" by subsequent data, yield my posterior beliefs.

The credibility curve, in light of new data, is the old curve rescaled by a ratio:  the likelihood of said data given old beliefs, over the probability of said data for any reason.
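To make the streaming-data picture concrete, here's a small sketch under assumptions of my own choosing:  someone hands me one die from a bag holding a 4-, 6-, 8-, 12- and 20-sided die, and I start out believing each is equally likely.  The rolls are invented too.  After each roll, every belief gets multiplied by the likelihood of that roll and the whole curve is renormalized by the probability of the roll for any reason.

```python
from fractions import Fraction

def update(beliefs, roll):
    """One Bayesian update: prior times likelihood, divided by the evidence."""
    unnormalized = {}
    for sides, prior in beliefs.items():
        # A fair die with `sides` faces shows `roll` with probability 1/sides,
        # or never, if the roll is higher than the die goes.
        likelihood = Fraction(1, sides) if roll <= sides else Fraction(0)
        unnormalized[sides] = prior * likelihood
    evidence = sum(unnormalized.values())  # P(roll) for any reason
    return {sides: p / evidence for sides, p in unnormalized.items()}

dice = [4, 6, 8, 12, 20]                                      # assumed bag
beliefs = {sides: Fraction(1, len(dice)) for sides in dice}   # flat prior

for roll in [6, 8, 7, 5]:                                     # invented data
    beliefs = update(beliefs, roll)
    snapshot = ", ".join(f"d{s}: {float(p):.3f}" for s, p in beliefs.items())
    print(f"after rolling {roll}: {snapshot}")
```

Each posterior becomes the prior for the next roll, so the landscape keeps reshaping itself as data arrives.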

My old belief is there's one chance in a thousand that I have medical condition X.  Then I take a test that's almost always positive when a person has X.  The test registers positive.  My old belief modifies somewhat, but not a lot, because it was already close to certain that I don't have X.
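A quick sketch of the arithmetic, with two numbers I'm assuming because the story doesn't give them:  "almost always positive" becomes 99%, and the test is taken to come back positive 5% of the time for people who don't have X.  Only the one-in-a-thousand prior comes from the example.

```python
prior = 0.001                 # one chance in a thousand that I have X
p_pos_given_x = 0.99          # assumed: "almost always positive" with X
p_pos_given_no_x = 0.05       # assumed: false positive rate

# Probability of a positive test for any reason.
p_pos = prior * p_pos_given_x + (1 - prior) * p_pos_given_no_x

# Bayes' rule: prior times likelihood, over the evidence.
p_x_given_pos = prior * p_pos_given_x / p_pos

print(f"P(X) before the test     = {prior:.4f}")
print(f"P(X | positive test)     = {p_x_given_pos:.4f}")
```

Under these assumptions the belief moves from 0.1% to roughly 2%:  a twentyfold jump, yet still far from certain, because the healthy majority generates most of the positives.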

For years, per the sources I'm studying, Bayesian thinking was delegitimized.  But in the 21st Century, Bayesian thinking was finally accepted, keeping the door open to forms of Machine Learning that had been developed to a high level at Bletchley Park.