Estimating a Full Backlog Based on a Sample of It

I want to address a question I was sent recently and that I get asked about once a month. The question has to do with how we estimate how many hours it will take to deliver a given product backlog if we have no historical data at all.

My first bit of advice is always to try to put off answering until you’re able to get even one sprint of historical data. But that’s not always possible. When it’s not my general recommendation is to conduct one or more commitment-driven sprint planning meetings and use those to forecast likely velocity.

However, some people are still more comfortable thinking in hours and rather than forecast velocity, they want to forecast the number of hours a product backlog will take. From there they will usually estimate the duration of the project (something that could be done more directly with a velocity estimate). But, since it’s a common question, I’d like to address it.

Here’s essentially the question I was asked. I’ve paraphrased, simplified and left out extraneous information:

We have no historical velocity data. We have a product backlog of 300 user stories. Each user story has been estimated, most in the range of 3-13 points. Would the following approach be reasonable:

  1. Grab a random sample of 40 stories.
  2. Break each of those stories into tasks and estimate the tasks.
  3. From the task estimates come up with an average number of hours per story.

For example, suppose the randomly selected 40 user stories total 150 points and that the tasks identified for those 40 user stories total 600 hours. We would then theorize that it is about 4 hours per story point?

Well, yes, the idea here is fine with two problems:

  1. It’s important to remember that the relationship between points and hours is not an equivalence. It is not that 1 point equals 4 hours (which is what our example showed above). It is that 1 point equals a mean of four hours with a standard deviation of plus or minus say 45 minutes. This would mean that most of the time (68% of the time) the relationship would be that 1 point takes from 3:15 to 4:45 to finish. It would mean that almost always (98% of the time), one point would take between 2-1/2 and 5-1/2 hours to finish (two standard deviations).
  2. The above approach assumes that 1-point stories and 13-point stories are estimated perfectly in relative terms. In other words, it assumes that if the mean duration of a one-point story is 4 hours, the mean of a 13-point story will be 13×4=52 hours. For many reasons, this is unlikely to be true. And the data I’ve collected from a variety of teams, shows that–as we’d expect–teams are not perfect, even though many are amazingly consistent.

So, what can we do to address these problems? A first really simple improvement is to calculate the average number of hours for stories of each size rather than one overall average. For example, in the example above we said that the 40 stories were 150 points in total and 600 hours for an average of 4 hours per point. But, if we averaged the 1-point stories on their own we might find that they were 3.2 hours per point, and the 2-point stories that were broken into tasks were 4.3 hours per point, and the 3-point stories were 4.1 hours, and so on.

We can then multiply that average number of hours by the number of stories on the product backlog of each size. An example using the average hours given above is shown in the following table:

Points Hours Per Story # of Stories Total Hours
1 3.2 5 16.0
2 8.6 8 68.8
3 12.3 7 86.1

First, notice in the above that the second column is hours per story, not hours per point. The two-point stories were assumed to take 4.3 hours per point so 8.6 (4.3×2) hours per story is shown in that column.

This table shows that we have five 1-point user stories. Each is expected to be about 3.2 hours so we expect to spend 16 hours total on the 1-point stories. Summing the last column in this table gives an expected total of 170.9 hours.

Note that this approach is subject to all the shortcomings that will have been introduced during the task identification and estimation step. Most importantly that the team will fail to identify all tasks. So this approach will estimate the number of hours to deliver the tasks identified. Some adjustment will need to be made to estimate the amount of work that the team failed to identify and the duration adjusted accordingly.

I’ll write about other improvements to this simple approach in upcoming blog posts.

Share on facebook
Facebook
Share on twitter
Twitter
Share on linkedin
LinkedIn

Leave a Reply

Your email address will not be published. Required fields are marked *

Recent Posts

Culture, Skills, and Capabilities // How to become a more data-driven organisation

In our whitepaper “How to become a more data-driven organisation”, we wrote about the five steps that an organisation would need to take, which are: Outcomes: Defining goals and metrics to ensure clear and measurable outcomes Analytics: Implementing and sharing the analytics to improve data-driven decision making Innovation: Testing assumptions through hypothesis testing and learning Data Platform: Gaining new insights

Read More »

Data Platform // How to become a more data-driven organisation

This is the fourth article in our series on “How to become a more data-driven organisation”, and we are going to be focusing on Data Platforms. It is at this point that most people start to dive deep into the technical aspects of Data Lakes vs Data Warehouses, but we want to bring us back up a level and ask

Read More »

Innovation // How to become a more data-driven organisation

In our white paper “How to become a more data-driven organisation”, we wrote about the five steps that an organisation would need to take, which are: Outcomes: Defining goals and metrics to ensure clear and measurable outcomes Analytics: Implementing and sharing the analytics to improve data-driven decision making Innovation: Testing assumptions through hypothesis testing and learning Data Platform: Gaining new

Read More »

Search the Blog

Agile Management Made Easy!

All About Agile

By Kelly Waters

“’Agile’ is one of the biggest buzzwords of the last decade. Agile methods often come across as rather more complicated than they really are. This book is an attempt to unravel that complexity. To simplify the concepts. This book breaks the concepts into small bite-sized pieces that are easy to understand and easy to implement and delivers the message in a friendly and conversational style. Allaboutagile.com is one of the most popular blogs about agile on the web. ”

Kelly Waters

Agile 101 is available to purchase. GAME ON!

Agile 101

Emma Hopkinson-Spark

“Whilst there are lots of ways you can vary the game depending on the teams you have and the learning outcomes you want, the basic flow of the game play is common to all.”
Emma Hopkinson-Spark

Why did we make the game?

How to play the game?

London

101 Ways Limited
145 City Rd
London
EC1V 1AZ
United Kingdom

Amsterdam

101 Ways BV
Weesperstraat 61-105
1018 VN Amsterdam
Netherlands

Contact Us

If you would like to get in touch with one of the team at 101 Ways, then please fill out the form below or email us at contact-us@101ways.com.