Story pointing during backlog grooming is well-intended but can be a waste of the Product Owner’s (PO) time. Points are a tool for the development team to figure out what they can take on, and generally are assessed during Sprint Planning, not before. A PO may have opinions and insights when a particular story’s points appear out of proportion, and can support Sprint Planning, but POs should defend precious Backlog Grooming time by not allowing story pointing to intrude.

One common approach for backlog grooming is “three amigos”, in which the Product Owner, Tech Lead, and Test Lead (at a minimum) form the core backlog grooming group. The order of items to be discussed is maintained in advance, and people whose input would be critical may be invited as needed, with sufficient advance notice. That approach addresses concerns about over-populating the grooming sessions. Some teams put rosters together with a condensed attendance to supplement large-scale backlog grooming meetings, and to efficiently refine the outcomes of those large meetings. Those leaned-down rosters may be the best quorum for all backlog grooming sessions.

A compact group of people can be best for initial backlog definition, but not the best group for estimating and refining stories. Under Scrum and SAFe, the Increment Planning/Sprint Planning session involves the Product Owners, Scrum Master, and all development team members. Questions and clarifications about the stories are discussed, their relative complexity evaluated, and then they receive pointing. (Remember that pointing is a metaphor for complexity, not directly for hours.)

Points as a “thing” can be insidious. Teams deliver systems, not points.

Once points are assessed, the development team discusses how many of those points they think they can commit to for the upcoming iteration. The points completed in the last iteration (velocity) inform that discussion, but the team also considers various other factors: technical debt, interruptions, people’s absences, etc. This human element makes planning slightly messy, and prevents Velocity from being a mathematically precise predictor. And that’s ok. Pretty close to the truth is better than delusions of precision.

Under agile philosophy, this collective estimating provides many benefits. Not only does it help all team members learn how to contribute to estimating, it improves the estimates themselves. Estimating as a group ensures that complications and snags are brought out into the open and not forgotten. It ensures they are discussed from the viewpoint of the entire team’s capability and not estimated as if Super #1 Star Programmer will be doing all the work. We have observed teams in which the same three people (out of a team of 20 or more) always provide story pointing input. Story pointing requires the input of the entire development team, not just a few. Having two or three people out of 15 to 20 doing all the pointing may seem efficient, but generally indicates one of several situations:

The team is disciplined enough to not take initial estimates too seriously. These teams view the “Three Amigos” pointing as efforts to understand the work’s complexity. For every story, these teams will refine the points as a group. This requires force of will to resist the temptation to go along with the initial number. As much as developers may criticize clients for treating estimates as hard facts, they can be just as guilty of it!
The team just hasn't learned how, or why, to point as a team (this is a very common reason).
Teams are behind the curve and are trying to make up lost time by not doing "planning poker" and other pointing exercises. That is a false economy, as a team under pressure is unlikely to have its work estimated well by just a few representatives. Time saved by not pointing may well be lost in struggling to meet Sprint commitments.
The agile management tool is being manipulated to ensure that points and hours estimates are always clean.
Leads in “hero mode” don’t want to relinquish control over pointing. Unfortunately, the collective approach to estimating is hard for some people to accept, particularly those who have spent years estimating on behalf of large groups of other people.
The only people doing meaningful work on the team are the 3 or 4 that show up at backlog refinement; the rest may not be effectively used or are incapable. This results in skill stagnation, whereas the customer would benefit from skill growth. Granted, a project may kick off with only 2 out of 20 developers being truly expert on the system, code language, data and functionality. At the end of the year, however, there should be 20 people in the development work force with higher expertise. It does not benefit the sponsors for a year to go by with the same few high-end experts doling out work to a raft of medium performers, and not expanding the worker base qualified to lead development on the system.
There is a high-performing development team that operates so smoothly, with parametric estimating, that a couple of people can point for the entire team; only minor adjustments are needed. I haven't seen that level of sophistication more than a few times in 30 years, and it's only a flu bug away from being completely disrupted, and it's a resignation away from disappearing.

Story Pointing is one example of the act of planning being more valuable than the plan itself. Story Pointing serves to call out the complexities and aspects of the story, and makes everyone justify their position on the relative complexity of that story. It's not about how long the story will take, it's about how complex the problem is. The whole team needs to hear that. Estimating hours should be an outgrowth of story pointing, not the leading activity — if we bother doing that at all.

Remember that iterations and sprints reduce the risk of undertaking a huge block of work, committing to an estimate, and then finding out over the subsequent months or years that the desired scope is more complex than originally imagined. Groups working with shorter sprints should resist the temptation to have points-to-hours rubrics and parametric estimating; some could resist the urge to estimate at all.

Preliminary Estimates

Assigning preliminary estimates to stories during backlog refinement can be valuable. Rough-order-of-magnitude estimates check priorities and Release plans. Rather than points, however, a less confusing practice is to use T-Shirt sizes. This lessens the temptation to just take the preliminary estimate and “run with it” during Iteration planning.

Use T-Shirt Size estimates as a sanity check during direct story pointing, but not as a strict constraint. There should be no T-Shirt Size-to-Story Point rubrics that will say what size equals how many points. Such rubrics, especially when used too early in the organization’s maturity, lead to a false sense of precision. False precision leads to flawed metrics programs, people struggling for power over the measurement program, pushing misleading data, and other problems.

False precision and wasted effort are why premature pointing is a dangerous habit.

Product Owners and other backlog refinement participants can, however, look at points assigned to stories to check whether their rough, relative T-Shirt Sizing correlates to the proportions of the finished work. For example:

Suppose Feature A has a T-Shirt size of “Small.” The stories created for that feature wind up taking five people 4,500 keyboard hours to complete.
Suppose Feature B has a T-Shirt size of “Extra Large.” The stories implemented to create that feature wind up taking two people 80 keyboard hours to complete.

In this example, the hours to complete seem disproportionate to the original T-Shirt sizing. The people evolving the backlog would want to know that so they could discuss what factors they missed for Feature A when evaluating its rough magnitude, or what changed that enabled Feature B to be completed with relatively little effort.
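That kind of after-the-fact check can be sketched in a few lines of Python. This is only an illustration: the size scale, the `find_inversions` helper, and the reading of the hours above as totals per feature are all assumptions, not part of any tool. It compares the order of sizes with the order of actual hours and nothing more, so it does not turn into a size-to-points rubric.

```python
# Ordinal ranks for T-shirt sizes. The scale itself is an assumption;
# only the ordering matters, not the numeric gaps between sizes.
SIZE_RANK = {"XS": 0, "S": 1, "M": 2, "L": 3, "XL": 4}


def find_inversions(features):
    """Return (smaller, larger) name pairs where the feature that was
    sized smaller ended up consuming more actual hours."""
    inversions = []
    for a in features:
        for b in features:
            sized_smaller = SIZE_RANK[a["size"]] < SIZE_RANK[b["size"]]
            if sized_smaller and a["hours"] > b["hours"]:
                inversions.append((a["name"], b["name"]))
    return inversions


# Figures taken from the example above, treated as total keyboard hours.
features = [
    {"name": "Feature A", "size": "S", "hours": 4500},
    {"name": "Feature B", "size": "XL", "hours": 80},
]

for smaller, larger in find_inversions(features):
    print(f"{smaller} was sized below {larger} but took more hours; worth a conversation.")
```

The output is a prompt for discussion in backlog refinement, not a score. Any pair it reports simply means the group should ask what it missed or what changed.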

Refining a T-Shirt-to-Points Rubric would not help anyone; T-Shirt sizing assumes that some details are unknown. The Rubric therefore would always be unstable. On the other hand, a list of common system attributes and risks, an assessment of interconnecting systems, and an overview of the Architectural Runway are examples of data that would be useful.

Giving In to Points Pressure

Much pressure for this behavior comes from outside the team. We have heard managers tell teams, “you need to increase your points count so you can get more credit.” This goes back to wanting to treat story points as a “thing” much as credit hours in school are a thing. But, as one colleague commented, “This isn’t high school! You’re not collecting credits towards your diploma.”

We’ve also heard teams pressured under the rationale, “You need to point earlier so you can assign tasks further out.” Almost invariably, this comes from a member of the PMI community, anxious to get their Gantt Chart* fleshed out twelve months in advance, with 20 layers of breakout, before Sprint 0 is over. “The points may be premature and half-baked, but it doesn’t matter, we’ll now have a plan!” That’s not useful, either.

*(I refuse to refer to a MS-Project file as a WBS. That’s for another post.)

Story points have no empirical value. None. Zip. A point is not equal to 20 lines of code, 1/15th of a microservice, a 30-column table, or any other measure of work product. There is no Point Standard cast in gold and stored alongside the International Prototype Metre bar near Paris. A point doesn’t firmly equate to an hour’s worth of work either, although some groups use that as a “get started” rough equivalent. Points are an imprecise and malleable way of sizing stories relative to each other.

There is no fixed duration or complexity for a given point value.

Just as story points do not have empirical value, neither does it directly matter how many points you deliver. It doesn’t matter! What does matter is how much functioning product you deliver, how much refinement of the backlog you help with, how much refactoring and innovation you perform, how much infrastructure readiness and stability you deliver. That’s what the Product Owner is buying, not points. We see many business sponsors finally backing away from asking for points delivery, even on contracts that mandate a fixed number of points. They are sick of all the effort and wrangling over such an imprecise indicator.

Points are a team’s internal tool for guesstimating, tracking and reporting its own progress relative to a single sprint. Remember that next time you’re pressured to assign points in a session where the team isn’t present. There are limited circumstances in which groups external to the team have any reasonable concern about story points. They may include:

When there is a radical (not incremental) change in the number of points taken on in a sprint, or left out, compared to the previous one. (Emphasis on compared to the previous one, not compared to a sprint taking place a year ago, and never compared to some other team’s sprint). Outside people may need to know what events cause radical fluctuations in points estimation. Sometimes they can help resolve the root cause faster than the development team can. Just a few causes of radical fluctuations include:
- Research spikes, especially when several occur at the same time
- Team attrition, vacations, illness
- Large-scale refactoring (which you should try to avoid like the plague)
- Product Owner or Subject Matter Expert unavailability
- Low responsiveness from external groups that keeps critical volumes of work from progressing
- Problems with the development or staging environments outside the team’s control
When the relative number of Defect points compared to those for User Stories/Spike Stories changes radically. This could indicate a quality problem that people outside the team naturally have an interest in.
When the relative number of points delivered and the rate of burndown are perfectly steady and consistent. Bean counters think they like this, but it probably indicates a problem. Software teams are not factory lines, and I can tell you from my experience in manufacturing that even the best production lines have variation. When we see super-perfect velocity, we usually see Product Owners who are dissatisfied with the quality and quantity of working software delivered. This situation often comes from a Scrum Master being pressured to have perfect delivery for reasons other than supporting open and honest communication.
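For anyone who wants to spot the first two situations without staring at a chart, here is a rough Python sketch. Everything in it is hypothetical: the function names, the sample numbers, and especially the 50% threshold, which is an arbitrary stand-in for “radical” and not a recommended value. It compares each sprint only with the one immediately before it, as described above.

```python
def radical_changes(committed_points, threshold=0.5):
    """Compare each sprint only with the sprint immediately before it.

    Returns (sprint_index, fractional_change) for every change whose
    size meets the threshold. The 0.5 default is an arbitrary
    illustration of "radical", not a standard.
    """
    flagged = []
    for i in range(1, len(committed_points)):
        previous, current = committed_points[i - 1], committed_points[i]
        if previous == 0:
            continue  # no meaningful ratio against an empty sprint
        change = (current - previous) / previous
        if abs(change) >= threshold:
            flagged.append((i, change))
    return flagged


def defect_share(defect_points, story_points):
    """Fraction of a sprint's points that went to defects."""
    total = defect_points + story_points
    return defect_points / total if total else 0.0


# Made-up sample data for one team's consecutive sprints.
committed = [40, 42, 18, 41]
for sprint, change in radical_changes(committed):
    print(f"Sprint {sprint}: {change:+.0%} versus the previous sprint")

print(f"Defect share this sprint: {defect_share(5, 35):.1%}")
```

A flag here is a reason to ask what happened (a research spike, an absence, an environment outage), not a verdict on the team. It should never be run across different teams’ numbers.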

Once again, this pressure for manufacturing line evenness and predictability is a holdover from classical project management and its misapplication of Total Quality Management. As someone who has worked in manufacturing and warehousing, I can say they are very different from the design-intensive thought processes in software engineering. Even the best-marketed “software factories” have to contend with uncertainty, shifting customer priorities, and market shifts. The problem isn’t how uniformly you can crank out product and estimate your productivity; the problem is the probability of change. Work that is so uniform and predictable that you can point early with high reliability is not the sort of work Agile methods were devised to handle.

Summary

Story points have no real relevance outside of the development team. Trying to point too far in advance misses the idea of why points are used in the first place. Instead of mutating points into productivity, budgeting, and other macro information, use them as just-in-time, imperfect checks on what we can commit to this iteration.