Thursday, December 12, 2013

Some Thoughts on Data, Zeno's Paradox, and Leaps of Faith

Some highly educated people are seekers of statistical truth, approaching their quest as passionately as spiritually driven people might pursue transcendence in a place of worship, on a yoga mat, or on an Appalachian Trail thru-hike. The more data these people have, the more faith (though they would never call it that) they have in the rightness of their scientific or economic conclusions. And yet certainty is elusive, and its elusiveness drives them on toward ever more, ever better, real-time data points.

A data-driven mind can never bridge that last Zeno-paradoxical gap between the data it has and the data it wants. The data it wants is that which would be possessed by an all-knowing god who understood every variable, every permutation, every possible combination, with absolute certainty. Many of these people, though not all, I suspect, reject the whole concept of a divine power. Perhaps that is because deep down they think of themselves as having the potential to be all-knowing, if only they had more data, so who needs god?

Of course, some of these data geeks are okay with there being a data gap. They rely on confidence intervals and statistical significance to deem their conclusions valid (not 'true,' mind you, but more or less likely to be repeatable given infinite trials). But there is always uncertainty; there is no faith. To statisticians, 99.9999% confidence is a pretty sure bet, but there is still that one time in a million when the thing they don't expect will happen. And let's face it, there are plenty of things that only happen once in a million times.

Adding to their uncertainty, these data-driven people remain ever wary of bad data. Bad data can make you erroneously conclude that the chance of something isn't one in a million, it's two in a million, which is twice as bad (or as good, depending...). It's like the devil, that bad data, always lurking in dark places, ready to catch them unawares. And some of these statisticians suspect the sources of their data - the respondents, test subjects, recordings, and measurements - are deliberately out to trick them, throw off their conclusions, mess with their tendencies to the center - more devils.

And they're right. The data is undoubtedly flawed, the people who gathered it are flawed as well, and the sources from which it came - instrumentation, people, animals, financial records - are flawed, too. They don't behave as they would in a perfect world. Don't get me wrong: I believe in the power of statistics, and have seen that, when used appropriately, it is far better at predicting the future or explaining the past than any human brain or gut could ever be. It is an extremely useful tool. My point is that people who place all their faith in numbers and little in people, and who pursue ever narrower bands of reliability and significance to the point of costly absurdity, should count themselves among the flawed. Use the data, strive for more, better, faster within reason, but then take that leap of faith across Zeno's paradoxical gap. The rest of us mortals will be there to grab your hand on the other side.
