On Tech

Month: July 2015

Release Testing Is Risk Management Theatre

Continuous Delivery often leads to the discovery of suboptimal practices within an organisation, and the Release Testing antipattern is a common example. What is Release Testing, and why is it an example of Risk Management Theatre?

Pre-Agile Testing

“I was a principal test analyst. I worked in a separate testing team to the developers. I spent most of my time talking to them to understand their changes, and had to work long hours to do my testing” – Suzy

The traditional testing strategy of many IT organisations was predicated upon a misguided belief described by Elisabeth Hendrickson as “testers test, programmers code, and the separation of the two disciplines is important”. Segregated development and testing teams worked in sequential phases of the value stream, with each product increment handed over to the testers for a prolonged period of testing prior to sign-off.

Release Testing Is Risk Management Theatre - Pre Agile Testing

This strategy was certainly capable of uncovering defects, but it also had a detrimental impact on lead times and quality. The handover period between development and testing inserted delays into the value stream, creating large feedback loops that increased rework. Furthermore, the segregation of development and testing implicitly assigned authority for changes to developers and responsibility for quality to testers. This disassociated developers from defect consequences and testers from business requirements, invariably resulting in higher defect counts and lower quality over time.

Agile Testing

“I was a product tester. I worked in an agile team with developers and a business analyst. I contributed to acceptance tests and did exploratory testing. I don’t miss the old ways” – Dwayne

The publication of the Agile Manifesto in 2001 led to a range of lightweight development processes that introduced a radically different testing approach. Agile methods advocate cross-functional teams of co-located developers and testers, in which testing is considered a continuous activity and there is a shared commitment to product quality.

Release Testing Is Risk Management Theatre - Agile Testing

In an agile team developers and testers collaborate on practices such as Test Driven Development and Acceptance Test Driven Development in accordance with the Test Pyramid strategy, which recommends a large number of automated unit and acceptance tests in proportion to a small number of automated end-to-end and manual tests.

Release Testing Is Risk Management Theatre - Test Pyramid

The Test Pyramid favours automated unit and acceptance tests as they offer a greater value at a lower cost. Test execution time and non-determinism are directly proportional to System Under Test size, and as automated unit and acceptance tests have minimal scope they provide fast, deterministic feedback. Automated end-to-end tests and exploratory testing are also valuable, but the larger System Under Test means feedback is slower and less reliable.
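One rough way to see why feedback becomes less reliable as the System Under Test grows is to assume every component a test touches behaves deterministically with some fixed probability, independently of the others. The sketch below uses that assumption; the 99.9% figure and the component counts are hypothetical and not taken from any measurement.

```python
def trustworthy_result_rate(component_reliability: float, components: int) -> float:
    """Chance a test result can be trusted when every component it touches
    must behave deterministically (components assumed independent)."""
    return component_reliability ** components


# Hypothetical figures: each component is deterministic 99.9% of the time.
unit_test = trustworthy_result_rate(0.999, 1)         # one class or function
acceptance_test = trustworthy_result_rate(0.999, 5)   # a handful of collaborators
end_to_end_test = trustworthy_result_rate(0.999, 50)  # the whole deployed system

print(f"unit: {unit_test:.3f}")
print(f"acceptance: {acceptance_test:.3f}")
print(f"end-to-end: {end_to_end_test:.3f}")
```

Under these assumptions the end-to-end test is the one most likely to fail for reasons unrelated to the change being tested, which is the behaviour the Test Pyramid tries to limit.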

This testing strategy is a vast improvement upon its predecessor. Uniting developers and testers in a product team eliminates handover delays and recombines authority with responsibility, resulting in a continual emphasis upon product quality and lower lead times.

Release Testing Is Risk Management Theatre - Agile Testing Test Pyramid

Release Testing

“I was an operational acceptance tester. I worked in a separate testing team to the developers and functional testers. I never had time to find defects or understand requirements, and always got the blame” – Jamie

The transition from siloed development and testing teams to cross-functional product teams is a textbook example of how organisational change enables Continuous Delivery – faster feedback and improved quality will unlock substantial cycle time gains and decrease opportunity costs. However, all too often Continuous Delivery is impeded by Release Testing – an additional phase of automated and/or manual end-to-end regression testing, performed on the critical path independent of the product team.

Release Testing Is Risk Management Theatre - Release Testing

Release Testing is often justified as a guarantee of product quality, but in reality it is a disproportionately costly practice with little potential for defect discovery. The segregation of release testers from the product team reinserts handover delays into the value stream and dilutes responsibility for quality, increasing feedback loops and rework. Furthermore, as release testers must rely upon end-to-end tests their testing invariably becomes a Test Ice Cream Cone of slow, brittle tests with long execution times and high maintenance costs.

Release Testing Is Risk Management Theatre - Test Ice Cream Cone

The reliance of Release Testing upon end-to-end testing on the critical path means a low degree of test coverage is inevitable. Release testers will always be working to a pre-arranged business deadline outside their control, and consequently test coverage will often be curtailed to such an extent that the blameless testers will find it difficult to uncover any significant defects.

Release Testing Is Risk Management Theatre - Release Testing Test Ice Cream Cone

When viewed through a Continuous Delivery frame the high cost and low value of Release Testing become evident, and attempting to redress that imbalance is a zero-sum game. Decreasing the cost of Release Testing means fewer end-to-end tests, which will decrease execution time but also decrease test coverage. Increasing the value of Release Testing means more end-to-end tests, which will increase test coverage but also increase execution time. Release Testing can therefore be considered an example of what Jez Humble describes as Risk Management Theatre – an overly-costly practice with an artificial sense of value.
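The zero-sum trade-off can be sketched with a toy model in which execution time and coverage both scale with the number of end-to-end tests that are run. The scenario count and the minutes per test below are hypothetical, and real suites will not scale this linearly.

```python
def release_testing_tradeoff(tests_run: int, minutes_per_test: float, scenarios: int):
    """Return (execution time in hours, fraction of regression scenarios covered)."""
    hours = tests_run * minutes_per_test / 60
    coverage = min(tests_run, scenarios) / scenarios
    return hours, coverage


# Hypothetical suite: 2,000 regression scenarios, 3 minutes per end-to-end test.
for tests_run in (200, 1000, 2000):
    hours, coverage = release_testing_tradeoff(tests_run, 3, 2000)
    print(f"{tests_run} tests: {hours:.0f} hours on the critical path, {coverage:.0%} coverage")
```

In this model every hour removed from the critical path removes coverage in the same proportion, so neither cost nor value can be improved without worsening the other.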

Release Testing is high cost, low value Risk Management Theatre

Build Quality In

Continuous Delivery is founded upon the Lean Manufacturing principle of Build Quality In, and the advice of Dr. W. Edwards Deming that “we cannot rely on mass inspection to improve quality” is especially pertinent to Release Testing. An organisation should build quality into its product rather than expect testers to inspect quality in at a later date, and that means eliminating Release Testing by moving release testers back into the product team.

Release Testing Is Risk Management Theatre - No Release Testing

Folding release testers into product development removes the handover delays and responsibility barriers imposed by Release Testing. End-to-end regression tests can be audited by all stakeholders, with valuable tests retained and the remainder discarded. More importantly, ex-release testers will be freed up to work on higher-value activities off the critical path, such as exploratory testing and business analysis.

Batch Size Reduction

Given the limited value of Release Testing it is prudent to consider other risk reduction strategies, and a viable alternative supported by Continuous Delivery is Batch Size Reduction – releasing smaller changesets more frequently into production. Splitting a large experiment into smaller independent experiments reduces variation in outcomes, so by decomposing large changesets into smaller unrelated changesets we can reduce the probability of failure associated with any one changeset.

For example, assume an organisation has a median cycle time of 12 weeks – perhaps due to Release Testing – and a pending release of 4 features. The probability of failure for this release has been estimated as 1 in 2 (50%), and there is a desire to reduce that level of risk.

Release Testing Is Risk Management Theatre - Probability One Release

As the 50% estimate is aggregated from 4 features it can be improved by reducing delivery costs – perhaps by eliminating Release Testing – and releasing features independently every 3 weeks. While this theoretically produces 4 homogeneous releases with a 1 in 8 (12.5%) failure probability, the heterogeneity of product development creates variable feature complexity – and smaller changesets enable more accurate estimation of comparative failure probabilities. In this example the 4 changesets allow a more detailed risk assessment that assigns features 2 and 3 a higher failure probability, which means more exploratory testing time can be allocated to those specific features to reduce overall failure probability.
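The arithmetic above can be sketched as follows. The even split reproduces the 1 in 8 figure from the example. The uneven weights for features 2 and 3 are hypothetical, chosen only to illustrate a more detailed risk assessment, and are not the values behind the original diagram.

```python
from fractions import Fraction


def split_failure_probability(aggregate: Fraction, weights: list) -> list:
    """Share an aggregate failure estimate across releases in proportion to
    each feature's relative risk weight (a simple additive split)."""
    total = sum(weights)
    return [aggregate * Fraction(weight, total) for weight in weights]


aggregate = Fraction(1, 2)  # the 1 in 2 estimate for the single large release

# Four homogeneous releases: 1 in 8 each.
homogeneous = split_failure_probability(aggregate, [1, 1, 1, 1])

# Hypothetical assessment: features 2 and 3 judged three times as risky.
heterogeneous = split_failure_probability(aggregate, [1, 3, 3, 1])

print([f"{float(p):.2%}" for p in homogeneous])
print([f"{float(p):.2%}" for p in heterogeneous])
```

With the uneven weights the riskier features stand out, which is what allows exploratory testing time to be directed at them.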

Release Testing Is Risk Management Theatre - Probability Multiple Heterogenous Releases

When a production defect does occur, batch size reduction has the ability to significantly reduce defect cost. The cost of a defect comprises the sunk cost incurred between activation and discovery, and the opportunity cost incurred between discovery and resolution. Those costs are a function of cost per unit time and duration, where cost per unit time represents economic impact and duration represents elapsed time.

For example, assume the organisation unwisely retained its 12 week lead time and a production defect D1 has been found 3 weeks after release. An assessment of external market conditions calculates a static cost per unit time of £10,000 a week, which means a sunk cost of £30,000 has already been incurred and a £120,000 opportunity cost is looming.

Release Testing Is Risk Management Theatre - Opportunity Cost Long Lead Time

As cost per unit time is governed by external market conditions it is difficult to influence, but duration is controlled by Little’s Law which states that lead time is directly proportional to work in progress. This means the opportunity cost duration of a defect can be decreased by releasing the defect fix in a smaller changeset, which will result in a shorter lead time and a reduced defect cost. If a fix for D1 is released in its own changeset in 1 week, that would decrease the opportunity cost by 92% to £10,000 and produce a 73% overall reduction in defect cost to £40,000.
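The D1 figures can be checked with a short calculation. The function name and structure below are illustrative only; the inputs are the £10,000 a week cost, the 3 weeks to discovery, and the 12 week and 1 week lead times from the example.

```python
COST_PER_WEEK = 10_000  # pounds per week, the static cost per unit time from the example


def defect_cost(weeks_to_discovery: int, weeks_to_resolution: int,
                cost_per_week: int = COST_PER_WEEK):
    """Return (sunk cost, opportunity cost, total cost) in pounds."""
    sunk = weeks_to_discovery * cost_per_week
    opportunity = weeks_to_resolution * cost_per_week
    return sunk, opportunity, sunk + opportunity


long_lead = defect_cost(3, 12)  # fix waits for the 12 week lead time
short_lead = defect_cost(3, 1)  # fix released in its own changeset in 1 week

opportunity_reduction = 1 - short_lead[1] / long_lead[1]
overall_reduction = 1 - short_lead[2] / long_lead[2]

print(long_lead)   # (30000, 120000, 150000)
print(short_lead)  # (30000, 10000, 40000)
print(f"{opportunity_reduction:.0%} opportunity cost reduction")  # 92%
print(f"{overall_reduction:.0%} overall reduction")               # 73%
```

The sunk cost is identical in both cases, so the whole saving comes from shortening the time between discovery and resolution.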

Release Testing Is Risk Management Theatre - Opportunity Cost Short Lead Time

Conclusion

Release Testing is the definitive example of Risk Management Theatre in the IT industry today and a significant barrier to Continuous Delivery. End-to-end regression testing on the critical path cannot provide any meaningful reduction in defect probability without incurring costs that harm product quality and inflate lead times. Continuous Delivery advocates a lower cost, higher value alternative in which the product team owns responsibility for product quality, with an emphasis upon exploratory testing and batch size reduction to decrease risk.

Tester names have been altered

Further Reading

  1. Leading Lean Software Development by Mary and Tom Poppendieck
  2. Assign Responsibility And Authority by Shelley Doll
  3. Integrated Tests Are A Scam by JB Rainsberger
  4. Continuous Delivery by Dave Farley and Jez Humble
  5. Organisation Antipattern – Release Testing by Steve Smith
  6. The Essential Deming by W. Edwards Deming
  7. Explore It! by Elisabeth Hendrickson
  8. Principles Of Product Development Flow by Don Reinertsen

Organisation antipattern: Build Feature Branching

The Version Control Strategies series

  1. Organisation antipattern – Release Feature Branching
  2. Organisation pattern – Trunk Based Development
  3. Organisation antipattern – Integration Feature Branching
  4. Organisation antipattern – Build Feature Branching

Build Feature Branching is oft-incompatible with Continuous Integration

Build Feature Branching is a version control strategy where developers commit their changes to individual remote branches of a source code repository prior to the shared trunk. Build Feature Branching is possible with centralised Version Control Systems (VCSs) such as Subversion and TFS, but it is normally associated with Distributed Version Control Systems (DVCSs) such as Git and Mercurial – particularly GitHub and GitHub Flow.

In Build Feature Branching Trunk is considered a flawless representation of all previously released work, and new features are developed on short-lived feature branches cut from Trunk. A developer will commit changes to their feature branch, and upon completion those changes are either directly merged into Trunk or reviewed and merged by another developer using a process such as a GitHub Pull Request. Automated tests are then executed on Trunk, testers manually verify the changes, and the new feature is released into production. When a production defect occurs it is fixed on a release branch cut from Trunk and merged back upon production release.

Consider an organisation that provides an online Company Accounts Service, with its codebase maintained by a team practising Build Feature Branching. Initially two features are requested – F1 Computations and F2 Write Offs – so F1 and F2 feature branches are cut from Trunk and developers commit their changes to F1 and F2.

Organisation Antipattern - Build Feature Branching - 1

Two more features – F3 Bank Details and F4 Accounting Periods – then begin development, with F3 and F4 feature branches cut from Trunk and developers committing to F3 and F4. F2 is completed and merged into Trunk by a non-F2 developer following a code review, and once testing is signed off on Trunk + F2 it is released into production. The F1 branch grows to encompass a Computations refactoring, which briefly breaks the F1 branch.

Organisation Antipattern - Build Feature Branching - 2

A production defect is found in F2, so an F2.1 fix for Write Offs is made on a release branch cut from Trunk + F2 and merged back when the fix is in production. F3 is deemed complete and merged into Trunk + F2 + F2.1 by a non-F3 developer, and after testing it is released into production. The F1 branch grows further as the Computations refactoring increases in scope, and the F4 branch is temporarily broken by an architectural change to the submissions system for Accounting Periods.

Organisation Antipattern - Build Feature Branching - 3

When F1 is completed the amount of modified code means a lengthy code review by a non-F1 developer and some rework are required before F1 can be merged into Trunk + F2 + F2.1 + F3, after which it is successfully tested and released into production. The architectural changes made in F4 also mean a time-consuming code review and merge into Trunk + F2 + F2.1 + F3 + F1 by a non-F4 developer, and after testing F4 goes into production. However, a production defect is then found in F4, and an F4.1 fix for Accounting Periods is made on a release branch and merged into Trunk + F2 + F2.1 + F3 + F1 + F4 once the defect is resolved.

Organisation Antipattern - Build Feature Branching - 4

In this example F1, F2, F3, and F4 all enjoy uninterrupted development on their own feature branches. The emphasis upon short-lived feature branches reduces merge complexity into Trunk, and the use of code reviews lowers the probability of Trunk build failures. However, the F1 and F4 feature branches grow unchecked until they both require a complex, risky merge into Trunk.

The Company Accounts Service team might have used Promiscuous Integration to reduce the complexity of merging each feature branch into Trunk, but that does not prevent the same code deviating on different branches. For example, integrating F2 and F3 into F1 and F4 would simplify merging F1 and F4 into Trunk later on, but it would not restrain F1 and F4 from generating Semantic Conflicts if they both modified the same code.
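A Semantic Conflict is a pair of changes that merge without any textual conflict yet produce the wrong behaviour together. The sketch below is a hypothetical illustration in the spirit of the example: the function names, the pence refactoring, and the 20% rate are all invented, and each branch is represented by a separate function so the snippet can run in one file.

```python
def tax_due_trunk(profit_pounds: int) -> int:
    """Trunk: computation takes profit in pounds and returns pounds."""
    return profit_pounds * 20 // 100


def tax_due_f1(profit_pence: int) -> int:
    """F1 branch: the Computations refactoring switches the unit to pence."""
    return profit_pence * 20 // 100


def period_tax_f4(tax_due, period_profit_pounds: int) -> int:
    """F4 branch: new Accounting Periods code, written against Trunk,
    which still passes pounds to whichever computation it is given."""
    return tax_due(period_profit_pounds)


# Each branch is correct in isolation.
assert period_tax_f4(tax_due_trunk, 1_000) == 200  # 200 pounds on Trunk
assert tax_due_f1(100_000) == 20_000               # 20,000 pence on F1

# After both branches merge there is no textual conflict, but F4 now
# passes pounds to a function that expects pence.
merged = period_tax_f4(tax_due_f1, 1_000)
print(merged)  # 200, now read as pence: 2 pounds instead of 200 pounds
```

Neither a clean merge nor a code review of either branch alone would necessarily reveal this. It only shows up when both changes are integrated and tested together, which is why earlier integration matters.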

Organisation Antipattern - Build Feature Branching - 4 Promiscuous Merge

This example shows how Build Feature Branching typically inserts a costly integration phase into software delivery. Short-lived feature branches with Promiscuous Integration should ensure minimal integration costs, but the reality is feature branch duration is limited only by developer discipline – and even with the best of intentions that discipline is all too easily lost. A feature branch might be intended to last only for a day, but all too often it will grow to include bug fixes, usability tweaks, and/or refactorings until it has lasted longer than expected and requires a complex merge into Trunk. This is why Build Feature Branching is normally incompatible with Continuous Integration, which requires every team member to integrate and test their changes on Trunk on at least a daily basis. It is highly unlikely every member of a Build Feature Branching team will merge to Trunk daily as it is too easy to go astray, and while using a build server to continuously verify branch integrity is a good step it does not equate to shared feedback on the whole system.
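The at-least-daily integration rule is easy to check mechanically. The sketch below assumes the time of each branch's last merge into Trunk is already known; the branch data and dates are hypothetical, and a real check would read them from the version control system.

```python
from datetime import datetime, timedelta


def stale_branches(last_trunk_merge: dict, now: datetime,
                   limit: timedelta = timedelta(days=1)) -> list:
    """Return the branches whose last merge into Trunk is older than the limit."""
    return sorted(name for name, merged_at in last_trunk_merge.items()
                  if now - merged_at > limit)


now = datetime(2015, 7, 20, 9, 0)

# Hypothetical state of the team's feature branches.
last_trunk_merge = {
    "F1": now - timedelta(days=9),    # grew to include a refactoring
    "F2": now - timedelta(hours=20),  # still within a day
    "F4": now - timedelta(days=4),    # blocked by an architectural change
}

print(stale_branches(last_trunk_merge, now))  # ['F1', 'F4']
```

A report like this only makes drift visible. It does not replace the shared feedback that comes from every developer actually integrating on Trunk.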

Build Feature Branching advocates that the developer of a feature branch should have their changes reviewed and merged into Trunk by another developer, and this process is well-managed by tools such as GitHub Pull Requests. However, each code review represents a handover period full of opportunities for delay – the developer might wait for reviewer availability, the reviewer might wait for developer context, the developer might wait for reviewer feedback, and/or the reviewer might wait for developer rework. As Allan Kelly has remarked “code reviews lose their efficacy when they are not conducted promptly”, and when a code review is slow the feature branch grows stale and Trunk merge complexity increases. A better technique to adopt would be Pair Programming, which is a form of continuous code review with minimal rework.

Asking developers working on orthogonal tasks to share responsibility for integrating a feature into Trunk dilutes responsibility. When one developer has authority for a feature branch and another is responsible for its Trunk merge both individuals will naturally feel less responsible for the overall outcome, and less motivated to obtain rapid feedback on the feature. It is for this reason Build Feature Branching often leads to what Jim Shore refers to as Asynchronous Integration, where the developer of a feature branch starts work on the next feature immediately after asking for a review, as opposed to waiting for a successful review and Trunk build. In the short-term Asynchronous Integration leads to more costly build failures, as the original developer must interrupt their new feature and context switch back to the old feature to resolve a Trunk build failure. In the long-term it results in a slower Trunk build, as a slow build is more tolerable when it is monitored asynchronously. Developers will resist running a full build locally, developers will then check in less often, and builds will gradually slow down until the entire team grinds to a halt. A better solution is for developers to adopt Synchronous Integration in spite of Build Feature Branching, and by waiting on Trunk builds they will be compelled to optimise those builds using techniques such as acceptance test parallelisation.
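As a minimal sketch of acceptance test parallelisation, the snippet below runs a set of stand-in tests serially and then across a thread pool. The test function only sleeps, the test names and worker count are hypothetical, and the approach assumes the tests are independent and share no mutable state.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_acceptance_test(name: str, seconds: float = 0.05):
    """Stand-in for a slow acceptance test; a real one would drive the application."""
    time.sleep(seconds)
    return name, True


tests = [f"acceptance_{i}" for i in range(8)]

# Serial run: total time is roughly the sum of all test durations.
start = time.perf_counter()
serial_results = [run_acceptance_test(test) for test in tests]
serial_seconds = time.perf_counter() - start

# Parallel run: four workers share the same eight tests.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_results = list(pool.map(run_acceptance_test, tests))
parallel_seconds = time.perf_counter() - start

print(f"serial: {serial_seconds:.2f}s, parallel: {parallel_seconds:.2f}s")
```

The results are the same in both runs, since `pool.map` preserves input order. Only the time spent waiting on the build changes.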

Build Feature Branching works well for open-source projects where a small team of experienced developers must integrate changes from a disparate group of contributors, and the need to mitigate different timezones and different levels of expertise outweighs the need for Continuous Integration. However, for commercial software development Build Feature Branching fits the Wikipedia definition of an antipattern – “a common response to a recurring problem that is usually ineffective and risks being highly counterproductive”. A small, experienced team practising Build Feature Branching could theoretically accomplish Continuous Integration given a well-structured architecture and a predictable flow of features, but it would be unusual. For the vast majority of co-located teams working on commercial software Build Feature Branching is a costly practice that discourages collaboration, inhibits refactoring, and by implicitly sacrificing Continuous Integration acts as a significant impediment to Continuous Delivery. As Paul Hammant has said, “you should not make branches for features regardless of how long they are going to take”.

Organisation antipattern: Integration Feature Branching

Integration Feature Branching is overly-costly and unpredictable

Integration Feature Branching is a version control strategy where developers commit their changes to a shared remote branch of a source code repository prior to the shared trunk. Integration Feature Branching is applicable to both centralised Version Control Systems (VCSs) and Distributed Version Control Systems (DVCSs), with multiple variants of increasing complexity:

  • Type 1 – Integration branch and Trunk. This was originally used with VCSs such as Subversion and TFS
  • Type 2 – Feature branches, an Integration branch, and Trunk. This is used today with DVCSs such as Git and Mercurial
  • Type 3 – Feature release branches, feature branches, an Integration branch, and Trunk. This is advocated by Git Flow

In all Integration Feature Branching variants Trunk represents the latest production-ready state and Integration represents the latest completed changes ready for release. New features are developed on Integration (Type 1), or on short-lived feature branches cut from Integration and merged back into Integration on completion (Types 2 and 3). When Integration contains a new feature it is merged into Trunk for release (Types 1 and 2), or a short-lived feature release branch is cut from Integration and merged into Trunk and Integration on release (Type 3). When a production defect occurs it is fixed on a release branch cut from Trunk, then merged back to Integration (Types 1 and 2) or to a feature release branch if one exists (Type 3).

Consider an organisation that provides an online Company Accounts Service, with its codebase maintained by a team practising Type 2 Integration Feature Branching. Initially two features are requested – F1 Computations and F2 Write Offs – so F1 and F2 feature branches are cut from Integration and developers commit their changes to F1 and F2.

Organisation Antipattern - Integration Feature Branching - Type 2 - 1

Two more features – F3 Bank Details and F4 Accounting Periods – then begin development, with F3 and F4 feature branches cut from Integration and developers committing to F3 and F4. F2 is completed and merged into Integration, and after testing it is merged into Trunk and regression tested before its production release. The F1 branch is briefly broken by a computations refactoring, with no impact on Integration.

Organisation Antipattern - Integration Feature Branching - Type 2 - 2

When F3 is completed it is merged into Integration + F2 and tested, but in the meantime a production defect is found in F2. An F2.1 fix is made on an F2.1 release branch cut from Trunk + F2, and after its release F2.1 is merged into and regression tested on both Integration + F2 + F3 and Trunk + F2. F3 is then merged into Trunk and regression tested, after which it is released into production. F1 continues development, and the F4 branch is temporarily broken by changes to the submissions system.

Organisation Antipattern - Integration Feature Branching - Type 2 - 3

When F1 is completed and merged into Integration + F2 + F3 + F2.1 it is ready for production release, but a business decision is made to release F4 first. F4 is completed and after being merged into and tested on both Integration + F2 + F3 + F2.1 + F1 and Trunk + F2 + F3 + F2.1 it is released into production. Soon afterwards F1 is merged into and regression tested on Trunk + F2 + F2.1 + F3, then released into production. A production defect is found in F4, and an F4.1 fix is made on a release branch cut from Trunk + F2 + F2.1 + F3 + F4 + F1. Once F4.1 is released it is merged into and regression tested on both Integration + F2 + F3 + F2.1 + F1 + F4 and Trunk + F2 + F2.1 + F3 + F4 + F1.

Organisation Antipattern - Integration Feature Branching - Type 2 - 4

In this example F1, F2, F3, and F4 all enjoy uninterrupted development on their own feature branches. The use of an Integration branch reduces the complexity of each merge into Trunk, and allows the business stakeholders to re-schedule the F1 and F4 releases when circumstances change. However, the isolated development of F1, F2, F3, and F4 causes complex, time-consuming merges into Integration, and Trunk requires regression testing as it can differ from Integration – such as F4 being merged into Integration + F2 + F3 + F2.1 + F1 and Trunk + F2 + F2.1 + F3. The Company Accounts Service team might have used Promiscuous Integration on feature release to reduce the complexity of merging into Integration, but there would still be a need for regression testing on Trunk.
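The reason Trunk needs its own regression testing can be shown by replaying the merges from the example and comparing what each branch contains at the moment F4 is released. This is a simplified sketch that treats each branch as an ordered list of merged changes and ignores the content of the code itself.

```python
# Merge events in the order described in the example, up to the F4 release.
events = [
    ("Integration", "F2"), ("Trunk", "F2"),
    ("Integration", "F3"),
    ("Integration", "F2.1"), ("Trunk", "F2.1"),
    ("Trunk", "F3"),
    ("Integration", "F1"),  # F1 is complete but its release is postponed
    ("Integration", "F4"), ("Trunk", "F4"),
]

branches = {"Integration": [], "Trunk": []}
for branch, change in events:
    branches[branch].append(change)

# Changes F4 was tested alongside on Integration that are absent from Trunk.
only_on_integration = set(branches["Integration"]) - set(branches["Trunk"])

print(branches["Integration"])  # ['F2', 'F3', 'F2.1', 'F1', 'F4']
print(branches["Trunk"])        # ['F2', 'F2.1', 'F3', 'F4']
print(only_on_integration)      # {'F1'}
```

F4 passed its tests on a codebase that included F1, but it goes into production from a Trunk that does not, so the Integration test results cannot be relied upon for the release.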

Organisation Antipattern - Integration Feature Branching - Type 2 - 4 Promiscuous

If the Company Accounts Service team used Type 3 Integration Feature Branching the use of feature release branches between Integration and Trunk could reduce the complexity of merging into Trunk, but regression testing would still be required on Trunk to garner confidence in a production release. Type 3 Integration Feature Branching also makes the version control strategy more convoluted for developers, as highlighted by Adam Ruka criticising Git Flow’s ability to “create more useless merge commits that make your history even less readable, and add significant complexity to the workflow”.

Organisation Antipattern - Integration Feature Branching - Type 3 - 4 Promiscuous

The above example shows how Integration Feature Branching adds a costly, unpredictable phase into software development for little gain. The use of an Integration branch in Type 1 creates wasteful activities such as Integration merges and Trunk regression testing, which insert per-feature variability into delivery schedules. The use of feature branches in Type 2 discourages collaborative design and refactoring, leading to a gradual deterioration in codebase quality. The use of feature release branches in Type 3 lengthens feedback loops, increasing rework and lead times when defects occur.

Integration Feature Branching is entirely incompatible with Continuous Integration. Continuous Integration requires every team member to integrate and test their code on Trunk at least once a day in order to minimise feedback loops, and Integration Feature Branching is the polar opposite of this. While Integration Feature Branching can involve commits to Integration on a daily basis and a build server constantly verifying both Integration and Trunk integrity, it is vastly inferior to continuously integrating changes into Trunk. As observed by Dave Farley, “you must have a single shared picture of the state of the system… there is no point having a separate integration branch”.

© 2026 Steve Smith
