Dear Junior - Letters to a Junior Programmer: backlog

Wednesday, 10 November 2010

4 Points of Story Points

Dear Junior

For quite some time there has been a debate on "what story points really are". To some extent there have been insightful discussions about important things to consider when sizing stories. However, the "really are" part of the debate just leaves me tired.

To discuss the true nature of story points is pretty pointless: they are a construct, created by us, and can be given any arbitrary meaning. It does not matter. What matters is whether that construct is useful for a purpose. This is the core of pragmatism.

In other words, the relevant question is not "what story points are", but "what story points are useful for". Or, if you excuse the pun: "What is the point of story points?"

If I try to pry things apart I can distinguish four different situations where I have found story points useful. Basically they cover the questions "Is it small enough?", "What does it cost?", "Which should we do first?", and "When can I have it?".

Small Enough

One of the cardinal mistakes a team can make is to start working on a story that is way too big. If they do, they will surely fail to finish it within the sprint, and having a demo with nothing to show is pretty depressing. Having a lot of half-finished work also ties their hands as to what they can do next sprint. No good.

A mature team might have developed a gut feeling of how big a bite they can take and still eat it. The stories considered too big are simply sent back to the product owner for delimitation or splitting.

A less mature team might see that bites have different sizes, but not yet have the insight of how big a bite they can take without choking.

Here story points can come in useful. The team can size the stories as 13, 20, 3, 8 etc. without needing to know their limits. Instead we can observe the velocity over a few sprints and see what "small enough" means. For example, if the team has a velocity of 18, I would advise setting the limit to 9 (half the velocity). So, the stories sized 3 and 8 are small enough, and those of 13 and 20 will need some more pre-sprint work to make them manageable.
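As a minimal sketch of that rule of thumb in Python: the function name and the `limit_factor` parameter are invented for illustration, and the factor of one half is just my advice from above, not a law.

```python
def split_by_size(story_sizes, velocity, limit_factor=0.5):
    """Separate stories that are small enough from those needing more work.

    limit_factor=0.5 encodes the "half the velocity" rule of thumb;
    treat it as an assumption to tune for your own team.
    """
    limit = velocity * limit_factor
    small_enough = [size for size in story_sizes if size <= limit]
    too_big = [size for size in story_sizes if size > limit]
    return small_enough, too_big


ready, needs_splitting = split_by_size([13, 20, 3, 8], velocity=18)
print(ready)            # [3, 8]
print(needs_splitting)  # [13, 20]
```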

It can be handy to reserve the top of the backlog for stories that are small enough, a "backlog shortlist" of ready-to-develop stories. Or, if you prefer the kanban style, you can have a "delimit and split" stage followed by a "ready-to-develop" queue from which stories are pulled into active development.

Cost

Whether we like it or not, the question about money always comes up. It can be hard to just look at a requested feature and say how much it would cost to develop it. In a lot of development efforts the dominating cost is the cost of labour. Either the work will consume the available time of the employees, or there will be contractor bills to be paid.

Here story points can come in useful. Someone probably knows what the team costs per week, or can calculate it. If you have some historical velocity data you can make a rough estimate of what each point costs. So, if you know roughly the size of the story, you can calculate the cost.

Say that your team costs EUR 27 000 a week, you run two-week sprints, and your sprint velocity hovers around 18. Then each sprint costs EUR 54 000, so each point costs around EUR 3 000. So, a story of size 40 will cost roughly EUR 120 000.

Well, not all of the time will be "pure development work". There will be meetings, phone calls, administration (filling out time reports) etc. But, that does not matter. Pulling through a story of size 40 will take roughly two sprints and during that time the team will cost that amount of money.

Of course, in practice you will rather want to give an interval than a precise number. The velocity might be within 17-19 (with 90% confidence) and a "40" story can be anything from 21 to 40. So the cost will rather be in the range EUR 60 000 - 130 000. Still, it will be a figure good enough for business to decide if it is remotely interesting to proceed or not.
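The arithmetic can be sketched in a few lines of Python. The function names are made up for this letter; the figures are the ones from the example, and the cheapest case simply pairs the smallest plausible size with the highest velocity (and the other way around for the most expensive case).

```python
def cost_per_point(team_cost_per_week, weeks_per_sprint, velocity):
    """Rough cost of one story point, from historical velocity."""
    return team_cost_per_week * weeks_per_sprint / velocity


def story_cost_range(size_range, velocity_range, team_cost_per_week, weeks_per_sprint):
    """Cheapest and most expensive plausible cost of a story."""
    smallest, largest = size_range
    slowest, fastest = velocity_range
    cheapest = smallest * cost_per_point(team_cost_per_week, weeks_per_sprint, fastest)
    dearest = largest * cost_per_point(team_cost_per_week, weeks_per_sprint, slowest)
    return cheapest, dearest


print(cost_per_point(27_000, 2, 18))       # 3000.0
print(40 * cost_per_point(27_000, 2, 18))  # 120000.0

low, high = story_cost_range((21, 40), (17, 19), 27_000, 2)
# Rounded to the nearest thousand; the letter rounds the upper end up to 130 000.
print(round(low, -3), round(high, -3))     # 60000.0 127000.0
```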

Priorities

A misconception among business-side "customers" is that they should set priorities on business value alone. But economics is really about alternatives - the cost of a bar of chocolate is that you cannot get two lollipops. In the same way, for the product owner to strike a wise balance between what different stakeholders want, she must be able to compare their costs.

Interestingly enough, to set priorities we do not need to know the absolute cost of each story. It is enough to know their relative cost.

Here story points can come in useful. If feature A is size 20, feature B is 8, and feature C is 13, then we know that we can swap out feature A from a release and switch in features B and C instead (8 + 13 = 21, roughly the same size). And in doing so we do not need to care about the details of how much money it is about - all we need is aid in choosing.

Planning

The business needs to look ahead to synchronize different parts of its work. E.g. there are some benefits in having a marketing campaign at the same time as you release some new feature of your software. But waiting for the software to be complete before ordering the marketing will obviously make you lose market time.

Here story points come in useful. If you observe the velocity of the team over some time you can apply some not-too-advanced statistics to predict how much work will be completed at some future point in time. Of course, each such prediction will have a probability of failing, and the surer you want to be, the lower you must set the prediction.

For example, if you have observed the velocities 36, 28, 36, 38, 24, 35, 32, 35 and you have five sprints to go, you can calculate the average (33 points per sprint) and predict the team will finish 165 points. However, that prediction is just a prediction - the real result will be as likely to be higher as it is to be lower. In other words, your prediction has a confidence of 50 % - or a 50 % risk of failing.

If you want to make a safer prediction, say taking just a 5% risk of failing, you can calculate an interval with 95% confidence. In this case it will be the interval 144-186. Now you can mark the backlog, colouring all stories up to 144 as green (very likely to be delivered), those from 144-186 as yellow (totally "depends-on") and those from 186 and up as red (very unlikely to be delivered).
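One way to arrive at such an interval is sketched below in Python. This is a simplification resting on assumptions I should state loudly: sprint velocities are treated as independent and roughly normally distributed, and the spread is estimated from the sample standard deviation. The function name `forecast` is invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def forecast(velocities, sprints_left, confidence=0.95):
    """Return (low, expected, high) points completed over sprints_left.

    Assumes independent, roughly normal sprint velocities.
    """
    expected = mean(velocities) * sprints_left
    # The spread of a sum of independent sprints grows with the square root.
    spread = stdev(velocities) * sqrt(sprints_left)
    # Two-sided interval: split the remaining risk evenly on both sides.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return expected - z * spread, expected, expected + z * spread


low, expected, high = forecast([36, 28, 36, 38, 24, 35, 32, 35], sprints_left=5)
print(round(low), round(expected), round(high))  # 144 165 186
```

Note that the interval is two-sided: the 5% risk is the risk of landing outside 144-186 in either direction, so under these assumptions the risk of falling short of 144 is only about half of that.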

In Summary

It is very hard to talk about the "true nature" of story points. They are abstractions that say something about development work. And, the question of "what they really are" is not a very interesting one. Working with story points is a model, and a model should not be evaluated on "how true" it is - but on how useful it is.

Story points might be useful for a team to decide whether a story is small enough to fit in a sprint, or if it should be pushed back to the product owner - if they cannot do it by gut feeling.

Story points might be useful in assessing the cost of developing a story - if it is questionable whether its value justifies the cost.

Story points might be useful for setting priorities - if it is difficult balancing the stakeholder interests.

Story points might be useful for planning at release level - if the organisation needs to synch the work of different departments or groups.

Apart from these four points, story points might be useful in doing other useful things - if it helps the organisation to act more wisely.

It might well be that in your particular setting, none of the "ifs" apply - and in that case story points are not useful to you. And if they are not useful, they steal time and attention from other things that would serve you better - i.e. they are waste that should be discarded.

In the end "what story points really are" is not as interesting as "what is the point of story points".

Yours

Dan

ps Story points being helpful for planning is of course key to why release planning works.

pps If story points help in planning it is interesting to see what factors drive high story points. For planning, it is the amount of effort that differs, but that is to a large extent driven by the complexity.

ppps Unless you are really good at making statistical computation, it helps to have a spreadsheet to aid in the planning.

pppps One way to ensure the team does not embark upon developing something "too big" is to make a backlog shortlist of the top part of the product backlog.

Friday, 21 August 2009

Heapsort, the Binary Mafia, and Product Backlog Priorities

Dear Junior

I recently picked up my university book on algorithms and by accident came across the section on heapsort and its data structures. When reading it, I was struck by a strong association to product backlogs and how product owners keep them sorted, often spending substantial time on that task.

A common piece of advice to product owners is to give each user story a priority by attributing it an “importance number”. A high importance number means an important story, so if you want to send a story to the top, you can always change its number to be higher than that of the currently most important one. This is a neat trick eliminating the common problem when things are given priority numbers (where “priority one” is most important) - because what do you do when something even more important arrives? Give it “prio zero”?

In the same breath, product owners are usually advised that all stories should have unique importance numbers, thus distinguishing them. For those stories that are to be developed in the near future (what I call the “Backlog Shortlist”), it is important to sort out their relative priorities, and I agree that assigning unique numbers to those stories is a feasible way to do it.

However, for the rest of the backlog, setting unique numbers is unnecessary work. Following the advice forces us to determine the exact order of even the bottom ten, most unimportant, stories - and to do this at an early stage. Guess what - those priorities might, and will, change many times before those stories are ripe for development. Chances are high that those stories will not even be developed, at least not in their current shape. So basically we have wasted precious product-owner time on establishing details that were not needed yet - what I call “premature exactness”.

Going back to the algorithms and data structures, this reminds me of how a list is sorted using insertion sort, where a new item “scans” through the list to find its place in the sorted list. Doing this one element at a time with all the elements you want to sort gives you a sorted list with the most important element at the beginning. If you extract the most important element one at a time from that list you obviously get them in priority order. Also, if you let new elements enter the list during the process, you have what computer scientists call a “priority queue”. The connection to the backlog should be obvious.

However, it turns out that insertion sort is a pretty bad way of implementing a priority queue. This is because inserting a new element requires you to scan half the list on average - in other words, inserting a new element is linear in the number of elements in the list, or O(n) using computer science terminology. So, the complete work for processing n original items through the priority processing will be O(n*n).
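To make the scanning concrete, here is a small Python sketch of a backlog kept fully sorted by importance number. The function name is invented for illustration, and it returns the number of comparisons so that the linear cost shows.

```python
def insert_sorted(backlog, importance):
    """Insert into a list kept in descending order of importance.

    Returns the number of comparisons made while scanning for the place.
    """
    comparisons = 0
    for position, existing in enumerate(backlog):
        comparisons += 1
        if importance > existing:
            break  # found the first less important story: go in before it
    else:
        position = len(backlog)  # least important so far: goes last
    backlog.insert(position, importance)
    return comparisons


backlog = []
for importance in [5, 9, 1, 7]:
    insert_sorted(backlog, importance)
print(backlog)  # [9, 7, 5, 1]
```

In the worst case, a story less important than all the others, the scan touches every element already in the list.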

Heapsort, on the other hand, uses another data structure called the “heap”. This heap is a binary tree with some extra restrictions on how nodes can sit relative to each other - not to be confused with the garbage-collected memory area used in runtime environments such as Java or LISP.

What makes the binary tree into a heap is an extra property - each node has a value that is larger than its immediate children’s. Thus, the largest value will be at the top. However, the heap does not care whether the right or the left child is larger; nor does it care about the values in the nodes two steps down. And it is exactly this ignorance of irrelevant details that happens to make the heap an excellent structure for implementing a priority queue.

I think of a heap as a Mafia organisation with branches. In each cell (tree node), the toughest guy or gal is the cell boss with two subordinates - each being the boss of a sub-branch. The other way around, the tough boss is part of another cell (the tree node above) where the boss of that cell is yet tougher. And of course, at the top is the head boss (root node) who is the toughest of them all.

So, as there are two sub-bosses that are directly under the head boss - are those two the second and third toughest in the org? Well, one of them is the second toughest, no doubt - but the other one might not be the third toughest. He or she might just be lucky enough that the third toughest is not in his or her part of the org. The third toughest might be one of the underdogs to the other sub-boss. In the same way, it is not necessarily the second best team that plays the final in a cup. The winner of the final should be the best, but the second best team could have lost against the best in some early round.

Back to the Mafia: it is easy to know who’s the toughest; it’s the boss right at the root of the heap. But what happens when the head boss gets shot ("extracted from the heap")? Well, the two sub-bosses will challenge each other, and the tougher one will step up to take the empty place. Now, there is a vacancy in the promoted sub-boss's position. So, the tougher of the two in that cell will step up to fill that vacancy, leaving a vacancy another step down in the hierarchy - rattling down through one “leg” in the tree. At each level there will be a challenge between the peers on who is to be the new boss, and the number of challenges in total will be the height of the tree, which is roughly the logarithm of the size of the org. So, appointing a new head boss when the old one gets shot is an O(lg n) operation.

So, what happens when a newbie enters the org? Well, the newbie will enter as a leaf node in one of the cells, reporting to some lowest-level local cell boss. Now it may well be that the newbie is tougher than the local cell boss, in which case the latter will be challenged, and they will switch places, the newbie being the new local cell boss. Of course, the newbie might be tougher than the next-level boss as well, whereupon there will be a new challenge and another switch. At worst, the newbie will be tougher than everybody in the org, and will challenge his or her way all the way to the top, becoming the new head boss. This will take as many challenges as the number of levels in the org - i.e. again, the tree height. So, accepting a new member is an O(lg n) operation.
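If you want to play with this, Python's standard `heapq` module implements a binary heap on top of a plain list. It is a min-heap, so the sketch below stores negated importance numbers to get the toughest element at the top. The class and the story names are made up for illustration.

```python
import heapq


class BacklogHeap:
    """Priority queue of stories, most important first."""

    def __init__(self):
        self._heap = []

    def add(self, importance, story):
        # heapq keeps the smallest item first, so negate the importance.
        heapq.heappush(self._heap, (-importance, story))

    def pop_most_important(self):
        negated, story = heapq.heappop(self._heap)
        return -negated, story

    def __len__(self):
        return len(self._heap)


backlog = BacklogHeap()
backlog.add(40, "export to spreadsheet")
backlog.add(90, "login")
backlog.add(10, "dark mode")
backlog.add(70, "search")

print(backlog.pop_most_important())  # (90, 'login')
print(backlog.pop_most_important())  # (70, 'search')
```

Both `add` and `pop_most_important` do work proportional to the height of the tree, and nothing in between the calls keeps the rest of the list perfectly sorted.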

If inserting a new element is O(lg n) and extracting the highest-priority element is O(lg n), then processing all elements of a list takes O(n lg n). As product backlogs often can be some thirty to a hundred elements (if not more), the difference between n*n and n*lg n is substantial - especially if we talk about using precious product owner time. So, heapsort is a much better role model than insertion sort when grooming our product backlog.
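To get a feeling for the gap at backlog sizes, the two growth rates can be tabulated. This is only a sketch that ignores constant factors, so the numbers are orders of magnitude rather than actual counts of product-owner decisions; the function name is invented.

```python
from math import log2


def growth(n):
    """Return (n*n, n*lg n) for a backlog of n stories, constants ignored."""
    return n * n, n * log2(n)


for n in (30, 100):
    quadratic, linearithmic = growth(n)
    print(n, quadratic, round(linearithmic))
# 30 900 147
# 100 10000 664
```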

How come the heap performed so well? Its efficiency stems from not keeping the entire queue perfectly sorted. Instead it focuses on the most important thing - to keep the “toughest” element at the top, while keeping the rest of the queue roughly sorted, where “tough” elements tend to hang around at the top, and “less tough” ones somewhere lower down. Thus it avoids premature exactness, i.e. decisions that are not needed yet. I also think of it as some kind of "fuzzy late evaluation" - if you get what I mean.

Applying this back to our product backlog, we might go so far as to structure our backlog as a heap. That would be cool; I have never done it, but I am definitely looking for the opportunity. Taking some of the wisdom back, we have learned that it suffices to have the lower part of the backlog “roughly sorted”. One way of applying that would be to relax the requirement on unique importance of each story. We should still require unique numbers for the shortlist, but for the rest it suffices that there are strata of stories that are roughly as important and which might share the same importance number.

Yours

Dan

Saturday, 5 April 2008

Keeping Backlog in Shape Using a Backlog Shortlist

Dear Junior

Most products and projects tend to sooner or later get a really long list of things to be done, the Product Backlog to use the Scrum terminology. Keeping the backlog in shape can be a really tedious job. This is certainly true for all the places where I have worked. We need something more agile.

The backlog can take many shapes - Excel spreadsheets, Jira items, wiki pages or all of the above - but please try to collect them in one place. In an ideal world all items on the list should be well-written, well-understood and reliably estimated (preferably using "points", "trees" or some other unit not based on physical time).

We cannot really stop the list from growing. The reason is simple: it is easier to generate ideas than to realise them. Either you inherit a long backlog or you will soon have one. So, we cannot keep it from growing, and we cannot spend all our time walking through that list polishing each item. Doing so would be both frustrating and a waste of time.

However, there are some reasons why estimating stories is important. And these are the ones we should focus on, so we can get most of the benefits without paying the full price. I see three main good effects of estimation.

The most obvious benefit is that when the product owner selects what should be developed next, she needs to know the cost associated with each item - to be able to get the most bang (business value) for the buck (development effort). Secondly, estimation forces the developers to understand what is requested - which is helpful so they do not have to spend time within the sprint finding out (of course, they will need to refine their understanding and iron out wrinkles together with the business people, but at least they get the scope right). Thirdly, and not much talked about: estimation also drives a discussion on architecture and design, if allowed. To estimate you must have a rough idea about how to design it, and if people's perceptions of that differ a lot they will give very different estimates. This is an excellent way to find areas where the team needs some time to synch their ideas. Unfortunately, it happens that project leads do not realise the necessity of this.

To get the benefits of estimation without paying the price of maintaining 1000 stories (change requests, bugs and enhancement ideas), I use what I call a "Backlog Shortlist". The Backlog Shortlist is simply the upper-most (i.e. highest prioritized) part of the product backlog. These are the items that the product owner will keep in crisp condition and ensure are properly ordered according to importance. E.g., when working with the BDD story format I usually insist that these should have at least two example scenarios, which also gives a good idea about how the story should be demoed. Outside the shortlist I am far more tolerant of sliminess and other non-crispiness.

At each planning occasion I want to walk through all the items on the shortlist, ensure that everybody understands them and re-estimate them. This re-estimation further improves the benefits of estimation I mentioned: product owner's awareness of development cost, scope understanding, and synchronized ideas about design within the team.

Firstly, the cost estimates benefit a lot from re-estimation. From time to time the team realises that a story has become much easier because of work already done, and the estimate might drop significantly. For example, during one sprint we implemented a story that made us create some objects. At the next planning we realised that another story had become almost trivial now that the object structure was in place, and the estimate dropped. I do not remember the precise numbers, but it was something like dropping from thirteen to two.

Secondly, repeating the estimation a few times when the story is just a hot candidate, but not yet to be developed, gives the product owner repeated opportunities to get feedback about the cost, and perhaps modify the story by simplifying it or breaking it into parts. It also keeps the developers up to date about what is hot, so that the selection for the next sprint does not come as a surprise. They already recognise the topic and are pretty familiar with it.

Thirdly, as long as the team has a common understanding of the architecture (should we use a domain-modelled OO structure, or go directly to the database through DAOs?) the re-estimation is fast. However, if the team has very different views on how a feature should be implemented, it is better to resolve those issues before it is selected for a sprint.

To get these benefits I try to size the shortlist to be double the size of what we can do in a sprint. In that way, each item will be re-discussed and re-estimated twice before being selected for development. At the same time, the re-estimation does not become too long and burdensome.
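Sizing the shortlist can be sketched as walking down the prioritised backlog until twice the velocity is used up. This is a minimal illustration in Python: the function name, the story names and the choice to stop at the first story that does not fit are all my assumptions here, not a prescribed procedure.

```python
def backlog_shortlist(backlog, velocity, factor=2):
    """Pick the top of the backlog, up to `factor` sprints' worth of points.

    backlog: list of (story, points) tuples, most important first.
    Stops at the first story that would exceed the budget, so the
    shortlist stays an unbroken top slice of the backlog.
    """
    budget = velocity * factor
    shortlist, total = [], 0
    for story, points in backlog:
        if total + points > budget:
            break
        shortlist.append((story, points))
        total += points
    return shortlist


backlog = [("login", 8), ("search", 5), ("checkout", 13), ("reports", 8), ("dark mode", 3)]
print(backlog_shortlist(backlog, velocity=18))
# [('login', 8), ('search', 5), ('checkout', 13), ('reports', 8)]
```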

In short: using a backlog shortlist is just a way to avoid premature exactness, or the other way around: to enforce exactness when it becomes productive but not before.

So, next time you try to get a backlog in shape: focus the effort on where it gives the best result - on the shortlist.

Yours

Dan
