# Estimation as Hypothesis

This content is syndicated from George Dinwiddie's blog.

Experimentation is a powerful learning tool. When I was young, I performed scientific experiments by mixing chemicals together to see what they would do. I learned that most random concoctions from my chemistry set would make a brown liquid that was often hard to clean out of a test tube. I learned that sometimes they would create very smelly brown liquids. These were not really experiments, however, and I didn’t really learn from them. Instead, these were activities and I collected anecdotes and experiences from them.

The scientific method rests on the performance of experiments to confirm or deny a proposed hypothesis. Unless you can propose a hypothesis in advance, you cannot design an experiment to test it. Until you test the hypothesis, you haven’t really learned anything.

“In general, we look for a new law by the following process: First we guess it; then we compute the consequences of the guess to see what would be implied if this law that we guessed is right; then we compare the result of the computation to nature, with experiment or experience, compare it directly with observation, to see if it works. If it disagrees with experiment, it is wrong. In that simple statement is the key to science. It does not make any difference how beautiful your guess is, it does not make any difference how smart you are, who made the guess, or what his name is — if it disagrees with experiment, it is wrong.” — Richard Feynman, The Character of Physical Law

When we estimate how long it will take, or how much it will cost, to implement a desired amount of software functionality, we create a hypothesis that we can test. Our hypothesis may not be of as enduring and universal value as a hypothesis predicting physical laws, but it may still be extremely valuable to us.

For example, suppose we have a number of features we’d like to get into our next software release. And suppose we have a date in mind for that release, and a team ready to work on it. We could then ask that team to estimate the features relative to each other, bucketing them into groups of similar sizes. We could also ask them to estimate how much bigger (or smaller) the features in one size group are than those in another. If this team has previous experience working together, they might be able to draw on it to estimate how long one feature might take to implement. Otherwise, they might just take a guess at it.

I would expect these numbers to be simple, with only one or two significant digits. After all, we don’t have much data to base them on. Their precision should not pretend that we do.
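As a rough sketch of that arithmetic, the snippet below turns size buckets and a single per-feature guess into a first projection, rounded to two significant digits. The bucket names, the relative sizes, the feature list, and the three-day guess are all made-up assumptions for illustration, not figures from any real team.

```python
from math import floor, log10

# Hypothetical relative sizes per bucket (an assumption for illustration):
# a "medium" feature is guessed to be about 2x a "small" one, a "large" about 5x.
RELATIVE_SIZE = {"small": 1, "medium": 2, "large": 5}


def round_sig(x, digits=2):
    """Round x to the given number of significant digits."""
    if x == 0:
        return 0
    return round(x, digits - 1 - floor(log10(abs(x))))


def initial_projection(feature_buckets, days_per_small):
    """Project total days from each feature's bucket and a guess for one small feature."""
    total_units = sum(RELATIVE_SIZE[bucket] for bucket in feature_buckets)
    return round_sig(total_units * days_per_small)


# Hypothetical release: six features, bucketed by the team.
features = ["small", "small", "medium", "medium", "medium", "large"]

# Hypothetical guess: one small feature takes about 3 days.
projected_days = initial_projection(features, days_per_small=3)
print(projected_days)
```

Rounding the result keeps the projection from advertising more precision than the guesses behind it can support.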

If we were practicing a plan-driven serial software development cycle, we might treat these estimates as promises and try to manage the work to meet them. In such a case, I would expect them to have padding for the unknown, and higher precision to hide the fact that they’re padded guesses.

Using an empirical software development approach, we’ll instead treat this projection as a first hypothesis. When we finish the first feature, we’ll have some better data on the rate at which we’re progressing, and can project into the future with a bit more confidence. Does this data confirm our hypothesis of when we’ll be done?
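One way to picture that check, as a minimal sketch: measure the rate observed so far, project the remaining work at that rate, and compare the result with the target. Every number here (the total size, the target, the work completed, the days elapsed) is a hypothetical assumption, and a simple linear projection is only one of several reasonable ways to extrapolate.

```python
def reproject(completed_units, elapsed_days, remaining_units):
    """Project the remaining days, assuming the observed rate continues."""
    observed_days_per_unit = elapsed_days / completed_units
    return remaining_units * observed_days_per_unit


# Hypothetical figures, for illustration only.
total_units = 13      # relative size of all the features in the release
target_days = 45      # working days available before the desired release date
completed_units = 2   # relative size of the first finished feature
elapsed_days = 8      # working days it actually took

remaining_days = reproject(completed_units, elapsed_days, total_units - completed_units)
projected_total = elapsed_days + remaining_days

# Does the new data still support the hypothesis that we finish by the target?
hypothesis_holds = projected_total <= target_days
print(projected_total, hypothesis_holds)
```

With a single finished feature, the observed rate is itself a very rough number, so a result near the target says little either way. A result far from it is the more useful signal.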

This experiment helps us make decisions. If completing the features by the target date looks unlikely, we’ll want to take drastic action. Perhaps we’ll eliminate some features, or make them all simpler, in order to trim scope and achieve some success. Perhaps we’ll decide to cancel the project altogether, cutting our losses with only a fraction of the budget spent.

If the target still looks feasible, we can continue the experiment. We’ll still have uncertainty about both the rate of progress and the size of the work, but we can reason about those uncertainties. Are our errors in sizing likely to be additive, or random? Is the current rate of progress sustainable? Is it depressed because of one-time startup work? Or is it optimistic because we’ve been cutting corners?

Poorly handled estimation is a means to fool ourselves, but handled with care, it gives us tools to experiment and learn.