Papingo Weblog
The Papingo 'blog is not intended to sell our services. Any anecdotes related are "true" in the abstract, but are altered to ensure that nothing (sensitive or otherwise) about our engagements with our clients, or about their businesses, can be inferred. So a story about what happened at a sprocket firm may in fact be a composite of stories from separate widget, doohickey, and thingamabob businesses.
The lukewarm statistician: or, how a cold and burned toes can be useful
Somewhere out there is a book of mathematical and statistical jokes. That's not a statistical inference, by the way; it exists. One of the gems in this probably little-read tome concerns a man visiting his statistician friend, only to find him with his feet in the oven and his head in the ice box. Nonplussed, his guest enquires what the meaning of this strange behaviour might be, to which the answer is of course "On average, I feel perfectly comfortable".
The use of measures of central tendency in business without a concomitant understanding of probability distributions can lead to this kind of absurdity (without the drollery). But the over-use of the arithmetic mean has another ill consequence in business: when asking for forecasts, we naturally ask for an "average estimate" (from a single person) or an "average of estimates" (from a group of people).
One particularly pernicious manifestation of our unhealthy obsession with a single number is in project planning. The project planner asks the team leader, the team leader asks the task owners, and the project planner receives a number he or she can put in a spreadsheet or (if lucky) a project management tool. But people are terrible at answering questions of the form "how long will this take?" or "how long does this usually take?". What they are usually quite good at, if probed, is answering questions of the form "what's the minimum time this could possibly take?" and "what's the longest something like this has taken you in the past?". Even so small a change as obtaining [minimum, maximum]-based ranges (they could just as easily be [good case, bad case]-type ranges) can yield dramatic improvements. Not only do we get a much better feeling for project risk, but we can start to use approaches which consider the distributions of the things which matter to us, in particular Bayesian approaches.
There are sources out there on range-based estimation for the interested. For a more hands-on introduction, have a look at this spreadsheet which we rather enjoy. Of course, one doesn't need a specialist tool for this kind of approach: Excel and Project, for instance, can be used (with some work) to generate robust Monte Carlo estimates for projects. In any event, it is not the tools which are important, but the explicit acknowledgement that the convenience of a single number to represent a distribution has certain costs, especially when that number is something we are particularly poor at generating accurately.
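For the curious, here is a minimal sketch of what a Monte Carlo estimate built from [minimum, maximum] ranges might look like in Python. It rests on several assumptions of ours, none of which come from any particular tool: the task names and ranges are invented, each task is taken to be equally likely to land anywhere in its range (a uniform distribution), and the tasks are treated as independent and done one after another. A real project would want better-chosen distributions and some account of dependencies.

```python
import random
import statistics

# Hypothetical tasks with (minimum, maximum) estimates in days.
# These figures are invented purely for illustration.
TASKS = {
    "design": (3, 8),
    "build": (10, 25),
    "test": (4, 12),
}


def simulate_project(tasks, trials=10_000, seed=42):
    """Return a sorted list of simulated total durations.

    Assumes tasks run sequentially, are independent of one another,
    and are uniformly distributed over their [minimum, maximum] range.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    totals = []
    for _ in range(trials):
        total = sum(rng.uniform(low, high) for low, high in tasks.values())
        totals.append(total)
    totals.sort()
    return totals


def percentile(sorted_values, fraction):
    """Pick the value at the given fraction (0 to 1) of a sorted list."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]


totals = simulate_project(TASKS)
print(f"mean:            {statistics.mean(totals):.1f} days")
print(f"50th percentile: {percentile(totals, 0.50):.1f} days")
print(f"90th percentile: {percentile(totals, 0.90):.1f} days")
```

The point of printing percentiles alongside the mean is that the planner gets a statement of the form "nine times in ten we finish within so many days", which is a rather more honest thing to put in a plan than a single number.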