Page 6 of 11

Discontinuous Delivery

20 August, 2017 / Steve Smith

Continuous Delivery is a set of principles and practices to improve the stability and throughput of a release process. But what does it mean to be practising Continuous Delivery? What comes beforehand, what comes afterwards, and how many deploys a day do you actually need?

Measuring Continuous Delivery describes how to guide the adoption of Continuous Delivery, using stability and throughput measurements. The book introduces a new term into the lexicon of Continuous Delivery – Discontinuous Delivery.

Discontinuous Delivery is when an organisation has a release process that lacks the stability and speed required to satisfy business demand

An organisation that cannot release product increments sufficiently reliably or quickly for its customers is in a state of Discontinuous Delivery. By applying the principles and practices of Continuous Delivery to its unique circumstances and constraints, an organisation can continuously improve the stability and throughput of its release process until it is in a state of Continuous Delivery.

The definition of Discontinuous Delivery leads to some interesting conclusions:

Business demand must be understood before success criteria for Continuous Delivery can be defined
Continuous Delivery does not ask for a fixed amount of deploys per unit time – 3 deploys a day might be too slow, 1 deploy a month might be too fast
It is possible to move from Discontinuous Delivery to Continuous Delivery and vice versa multiple times, depending on market conditions

Measuring Continuous Delivery contains more detailed information on Discontinuous Delivery, and how to use the Improvement Kata within the context of an organisation to successfully adopt Continuous Delivery principles and practices.

Aim for Operability, not DevOps as a Cult

20 April, 2017 / Steve Smith

The DevOps Handbook describes an admirable DevOps as a Philosophy based on flow, feedback, continual learning and experimentation. However, a near-decade of naivety, confusion, and profiteering surrounding DevOps has left the IT industry with DevOps as a Cult, and the benefits of Operability are all too often overlooked.

Why is DevOps as a Philosophy is a laudable ideal? Why is DevOps as a Cult the unpleasant reality? Why should organisations instead focus on Operability as an enabler of Continuous Delivery?

Introduction

When an IT organisation has separate Development and Operations departments it will inevitably suffer from a serious conflict of interest. The Development teams will be told to keep pace with the market and incentivised by features, while the Operations teams will be told to provide reliability and incentivised by uptime. This creates a troubled relationship in which one party tries to maximise production changes, and the other tries to minimise them.

This conflict of interest has a devastating impact on the stability, throughput, and quality of IT services. It produces unstable, unreliable, and insecure services vulnerable to costly outages. It ensures production changes are delayed by days, weeks, or even months due to endless coordination between teams, convoluted change approvals, and fear of failure. It results in significant amounts of functional and operational rework, and constant firefighting just to keep systems up and running. It means the organisation loses out in the marketplace, due to the high opportunity costs incurred and the high attrition rate of employees.

DevOps As A Philosophy

In 2008, Patrick Debois and Andrew Schafer discussed at Agile 2008 the application of Agile practices to infrastructure. In 2009, John Allspaw and Paul Hammond shared with Velocity 2009 their famous “10 Deploys per Day: Dev and Ops Cooperation at Flickr” story, and Patrick Debois subsequently created the first DevOpsDays conference. The DevOps philosophy of collaboration between Development and Operations had begun.

In 2016, the DevOps Handbook was published by Gene Kim, Jez Humble, Patrick Debois, and John Willis. The DevOps Handbook builds on the Phoenix Project novel by Gene Kim, Kevin Behr, and George Spafford in 2013, and it describes how the Three Ways of DevOps can help organisations to succeed:

The First Way: The Principles of Flow – create a continuous flow of value-add from Development to Operations
The Second Way: The Principles of Feedback – create a constant flow of feedback from Operations to Development
The Third Way: The Principles of Continual Learning and Experimentation – create a culture of ever-increasing knowledge within Development and Operations

The DevOps Handbook advocates long-lived product teams frequently deploying changes during normal business hours, using ubiquitous monitoring to quickly resolve errors, and building a shared culture of Continuous Improvement. It is a seminal work that describes what DevOps should be – DevOps As A Philosophy.

DevOps As A Cult

Unfortunately, there were 7 years between the creation of the DevOps meme and the publication of The DevOps Handbook. In the meantime, a different kind of DevOps has emerged that is entirely distinct from DevOps As A Philosophy yet regrettably popular within the IT industry. This bastardisation of DevOps is a cult based on confusion, naivety, and profiteering.

There has been a great deal of confusion about what DevOps actually is, and many organisations have unwittingly increased their disorder by attempting to adopt DevOps without understanding it. For example, there is now the notion of a DevOps Engineer, in which DevOps is equated with a Infrastructure As Code specialist and any need for further change is ignored. Another example is the DevOps Team, in which a team of DevOps Engineers or similar is inserted between Development and Operations teams and becomes yet another delivery impediment. As Jez Humble has remarked, “creating another functional silo that sits between Dev and Ops is clearly a poor (and ironic) way to try and solve these problems“.

Many people have naively latched onto DevOps via misinformation and with little appreciation of their organisational complexity and context. In a complex, adaptive system every individual has limited information, the cause and effect of an event cannot be predicted, and the system must be probed for insights. One common error is to literally assume the conflict of interest between Development and Operations is always the key constraint, when other emerging conflicts can be equally ruinous such as between separate Development and Testing departments. Another error is to assume large enterprise organisations need some kind of Enterprise DevOps roadmap, despite the ineffectuality of blueprints in a complex system and Dave Roberts pointing out “flow and continuous improvement are equally applicable to a large enterprise as they are to an agile web startup“.

Finally, the lack of clarity on DevOps has led to unabashed profiteering from some recruitment firms and vendors. This can be seen when recruitment firms rebrand sysadmins as DevOps Engineers, or when vendors market their automation tools as DevOps tools. DevOps certification has even been launched by the DevOps Institute, which sells one interpretation of a complex cultural movement and of which Sam Newman complained “aside from perhaps three practitioners, the rest of the group are either professional trainers or sales and marketing people“.

Many organisations that have attempted to adopt DevOps still suffer from short-lived project teams infrequently deploying changes out of business hours, manual regression testing without telemetry, and an antagonistic culture with minimal knowledge sharing. The application of confusion, naivety, and profiteering to the DevOps meme has resulted in what DevOps should not be – DevOps As A Cult.

Aim for Operability, not DevOps As A Cult

The rise of DevOps coincided with the rise of Continuous Delivery, which is explicitly focussed on the improvement of IT stability and throughput to satisfy business demand. Continuous Delivery does not need DevOps As A Philosophy but they can be thought of as complementary, due to their shared emphasis on fast feedback loops, cultural change, and task automation. DevOps As A Cult has no such standing, as shown by Dave Farley stating that “DevOps rarely says enough about the goal of delivering valuable software… this is no place for cargo-cultism“.

Continuous Delivery requires operational excellence to be built into organisations. If a service is unstable, a high level of throughput is impossible to sustain as the rework incurred during periods of instability will restrict the delivery of new features. This means Operability is of critical importance to Continuous Delivery, as throughput is dependent upon the ability of the organisation to maintain safe and reliable systems according to its operational requirements.

Both Continuous Delivery and DevOps As A Philosophy advocate the following operational practices to improve Operability:

Prioritisation of operational requirements – plan and prioritise work on configuration, infrastructure, performance, security, etc. alongside new features
Automated infrastructure – automate production infrastructure and build a self-service provisioning capability for on-demand pre-production environments
Deployment health checks – incorporate system health checks and functional smoke tests into pre-production and production deployments
Pervasive telemetry – establish a logging/monitoring platform for the aggregation, visualisation, anomaly detection, and alerting of business-level, application-level, and operational-level events
Failure injection – introduce simulated errors under controlled conditions into production systems, and rehearse incident response scenarios
Incident swarming – encourage people to work together to identify and resolve production incidents as soon as they occur
Blameless post-mortems – hold post-incident reviews to understand the context, cause and effect, and remediation of a production incident, and propose countermeasures for the future
Shared on-call responsibilities – ensure all team members are on rotation for production incidents, and empowered to handle incidents when they occur

Teams need to adopt a “You Build It, You Run It” culture, in which everyone contributes to operational practices and everyone is responsible for Operability. This means teams will need guidance on how to build, deploy, and run services plus how to create the operational toolchain to support those services. For this reason operability engineers should be embedded into teams, to share their expertise on the delivery of operational requirements and coach other team members on architecting for resilience, establishing a telemetry platform, adopting a mindset of operational excellence, etc. If there are more teams than available operability engineers then every team should have an operability engineer assigned in a liaison role.

Conclusion

In many organisations the conflict of interest between Development and Operations is enormously damaging, and DevOps As A Philosophy as described in the DevOps Handbook is an admirable model for improving organisations via fast flow, fast feedback, and a culture of learning and experimentation. However, the confusion, naivety, and profiteering surrounding DevOps has led to DevOps As A Cult within the IT industry, and unfortunately its popularity is matched only by its inability to improve organisations.

An organisation that wishes to improve its time to market should adopt Continuous Delivery and aim for Operability. That means operability engineers working on teams to teach others how to adopt an operational mindset and build the necessary tools. Continuous Delivery needs operability, and by achieving operational excellence an organisation can improve its throughput and obtain a strategic competitive advantage in the marketplace.

Thanks to Beccy Stafford, Charles Kubicek, Chris O’ Dell, Edd Grant, John Clapham, and Martin Jackson for their feedback

Publishing the Measuring Continuous Delivery book

4 November, 2016 / Steve Smith

An book on the what, why, and how of measuring Continuous Delivery adoption within an organisation

The latest ebook from Steve Smith is now available – “Measuring Continuous Delivery“.

“Measuring Continuous Delivery” covers the what, why, and how of measuring Continuous Delivery adoption within an organisation. It is aimed at executives, managers, practitioners, and anyone else involved in Continuous Delivery adoption efforts.

From the introduction:

Continuous Delivery is a set of holistic principles and practices to reduce time to market and provide an organisation with a strategic competitive advantage, but adoption is invariably a challenging and time-consuming journey. Before adoption, the current time to market and desired time to market are often unknown, which makes alignment and collaboration between individuals, teams, and departments difficult. During adoption practices, techniques, and tools are often introduced without acceptance criteria, with makes it hard to assess and learn from the impact of changes.

What does a successful Continuous Delivery outcome look like, how do we move towards that outcome, and how do we measure our progress along the way?

“Measuring Continuous Delivery” is being incrementally published on Leanpub in the weeks and months to come. Buy your copy today!

Buy “Measuring Continuous Delivery”

Announcing the Measuring Continuous Delivery book

3 October, 2016 / Steve Smith

An book on the what, why, and how of measuring Continuous Delivery adoption within an organisation

Announcing the latest ebook from our founder Steve Smith – “Measuring Continuous Delivery“.

“Measuring Continuous Delivery” will cover the what, why, and how of measuring Continuous Delivery adoption within an organisation. It is aimed at executives, managers, practitioners, and anyone else involved in Continuous Delivery adoption efforts.

From the introduction:

Continuous Delivery is a set of holistic principles and practices to reduce time to market and provide an organisation with a strategic competitive advantage, but adoption is invariably a challenging and time-consuming journey. Before adoption, the current time to market and desired time to market are often unknown, which makes alignment and collaboration between individuals, teams, and departments difficult. During adoption practices, techniques, and tools are often introduced without acceptance criteria, with makes it hard to assess and learn from the impact of changes.

What does a successful Continuous Delivery outcome look like, how do we move towards that outcome, and how do we measure our progress along the way?

“Measuring Continuous Delivery” will be incrementally published on Leanpub in the weeks and months to come. Register today and indicate your interest in this book!
Register your interest

End-To-End Testing considered harmful

12 December, 2015 / Steve Smith

End-To-End Testing is used by many organisations, but relying on extensive end-to-end tests is fundamentally incompatible with Continuous Delivery. Why is End-To-End Testing so commonplace, and yet so ineffective? How is Continuous Testing a lower cost, higher value testing strategy?

NOTE: The latter half of this article was superseded by the talk “End-To-End Testing Considered Harmful” in September 2016

Introduction

“Good testing involves balancing the need to mitigate risk against the risk of trying to gather too much information” Jerry Weinberg

Continuous Delivery is a set of holistic principles and practices to reduce time to market, and it is predicated upon rapid and reliable test feedback. Continuous Delivery mandates any change to code, configuration, data, or infrastructure must pass a series of automated and exploratory tests in a Deployment Pipeline to evaluate production readiness, so test execution times must be low and test results must be deterministic if an organisation is to achieve shorter lead times.

For example, consider a Company Accounts service in which year end payments are submitted to a downstream Payments service.

The behaviour of the Company Accounts service could be checked at build time by the following types of automated test:

Unit tests check intent against implementation by verifying a discrete unit of code
Acceptance tests check implementation against requirements by verifying a functional slice of the system
End-to-end tests check implementation against requirements by verifying a functional slice of the system, including unowned dependent services

While unit tests and acceptance tests vary in terms of purpose and scope, acceptance tests and end-to-end tests vary solely in scope. Acceptance tests exclude unowned dependent services, so an acceptance test of a Company Accounts user journey would use a System Under Test comprised of the latest Company Accounts code and a Payments Stub.

End-to-end tests include unowned dependent services, so an end-to-end test of a Company Accounts user journey would use a System Under Test comprised of the latest Company Accounts code and a running version of Payments.

If a testing strategy is to be compatible with Continuous Delivery it must have an appropriate ratio of unit tests, acceptance tests, and end-to-end tests that balances the need for information discovery against the need for fast, deterministic feedback. If testing does not yield new information then defects will go undetected, but if testing takes too long delivery will be slow and opportunity costs will be incurred.

The folly of End-To-End Testing

“Any advantage you gain by talking to the real system is overwhelmed by the need to stamp out non-determinism” Martin Fowler

End-To-End Testing is a testing practice in which a large number of automated end-to-end tests and manual regression tests are used at build time with a small number of automated unit and acceptance tests. The End-To-End Testing test ratio can be visualised as a Test Ice Cream Cone.

End-To-End Testing Considered Harmful - The Test Ice Cream Cone

End-To-End Testing often seems attractive due to the perceived benefits of an end-to-end test:

An end-to-end test maximises its System Under Test, suggesting a high degree of test coverage
An end-to-end test uses the system itself as a test client, suggesting a low investment in test infrastructure

Given the above it is perhaps understandable why so many organisations adopt End-To-End Testing – as observed by Don Reinertsen, “this combination of low investment and high validity creates the illusion that system tests are more economical“. However, the End-To-End Testing value proposition is fatally flawed as both assumptions are incorrect:

The idea that testing a whole system will simultaneously test its constituent parts is a Decomposition Fallacy. Checking implementation against requirements is not the same as checking intent against implementation, which means an end-to-end test will check the interactions between code pathways but not the behaviours within those pathways
The idea that testing a whole system will be cheaper than testing its constituent parts is a Cheap Investment Fallacy. Test execution time and non-determinism are directly proportional to System Under Test scope, which means an end-to-end test will be slow and prone to non-determinism

Martin Fowler has warned before that “non-deterministic tests can completely destroy the value of an automated regression suite“, and Stephen Covey’s Circles of Control, Influence, and Concern highlights how the multiple actors in an end-to-end test make non-determinism difficult to identify and resolve. If different teams in the same Companies R Us organisation owned the Company Accounts and Payments services the Company Accounts team would control its own service in an end-to-end test, but would only be able to influence the second-party Payments service.

The lead time to improve an end-to-end test depends on where the change is located in the System Under Test, so the Company Accounts team could analyse and implement a change in the Company Accounts service in a relatively short lead time. However, the lead time for a change to the Payments service would be constrained by the extent to which the Company Accounts team could persuade the Payments team to take action.

Alternatively, if a separate Payments R Us organisation owned the Payments service it would be a third-party service and merely a concern of the Company Accounts team.

In this situation a change to the Payments service would take much longer as the Company Accounts team would have zero control or influence over Payments R Us. Furthermore, the Payments service could be arbitrarily updated with little or no warning, which would increase non-determinism in Company Accounts end-to-end tests and make it impossible to establish a predictable test baseline.

A reliance upon End-To-End Testing is often a symptom of long-term underinvestment producing a fragile system that is resistant to change, has long lead times, and optimised for Mean Time Between Failures instead of Mean Time To Repair. Customer experience and operational performance cannot be accurately predicted in a fragile system due to variations caused by external circumstances, and focussing on failure probability instead of failure cost creates an exposure to extremely low probability, extremely high cost events known as Black Swans such as Knights Capital losing $440 million in 45 minutes. For example, if the Payments data centre suffered a catastrophic outage then all customer payments made by the Company Accounts service would fail.

An unavailable Payments service would leave customers of the Company Accounts service with their money locked up in in-flight payments, and a slow restoration of service would encourage dissatisfied customers to take their business elsewhere. If any in-flight payments were lost and it became public knowledge it could trigger an enormous loss of customer confidence.

End-To-End Testing is an uncomprehensive, high cost testing strategy. An end-to-end test will not check behaviours, will take time to execute, and will intermittently fail, so a test suite largely composed of end-to-end tests will result in poor test coverage, slow execution times, and non-deterministic results. Defects will go undetected, feedback will be slow and unreliable, maintenance costs will escalate, and as a result testers will be forced to rely on their own manual end-to-end regression tests. End-To-End Testing cannot produce short lead times, and it is utterly incompatible with Continuous Delivery.

The value of Continuous Testing

“Cease dependence on inspection to achieve quality. Eliminate the need for inspection on a mass basis by building quality into the product in the first place” Dr W Edwards Deming

Continuous Delivery advocates Continuous Testing – a testing strategy in which a large number of automated unit and acceptance tests are complemented by a small number of automated end-to-end tests and focussed exploratory testing. The Continuous Testing test ratio can be visualised as a Test Pyramid, which might be considered the antithesis of the Test Ice Cream Cone.

Continuous Testing is aligned with Test-Driven Development and Acceptance Test Driven Development, and by advocating cross-functional testing as part of a shared commitment to quality it embodies the Continuous Delivery principle of Build Quality In. However, Continuous Testing can seem daunting due to the perceived drawbacks of unit tests and acceptance tests:

A unit test or acceptance test minimises its System Under Test, suggesting a low degree of test coverage
A unit test or acceptance test uses its own test client, suggesting a high investment in test infrastructure

While the End-To-End Testing value proposition is invalidated by incorrect assumptions of high test coverage and low maintenance costs, the inverse is true of Continuous Testing – its value proposition is validated by incorrect assumptions of low test coverage and high maintenance costs:

A unit test will check intent against implementation and an acceptance test will check implementation against requirements, which means both the behaviour of a code pathway and its interactions with other pathways can be checked
A unit test will restrict its System Under Test scope to a single pathway and an acceptance test will restrict itself to a single service, which means both can have the shortest possible execution time and deterministic results

A non-deterministic acceptance test can be resolved in a much shorter period of time than an end-to-end test as the System Under Test has a single owner. If Companies R Us owned the Company Accounts service and Payments R Us owned the Payments service a Company Accounts acceptance test would only use services controlled by the Company Accounts team.

If the Company Accounts team attempted to identify and resolve non-determinism in an acceptance test they would be able to make the necessary changes in a short period of time. There would also be no danger of unexpected changes to the Payments service impeding an acceptance test of the latest Company Accounts code, which would allow a predictable test baseline to be established.

End-to-end tests are a part of Continuous Testing, not least because the idea that testing the constituent parts of a system will simultaneously test the whole system is a Composition Fallacy. A small number of automated end-to-end tests should be used to validate core user journeys, but not at build time when unowned dependent services are unreliable and unrepresentative. The end-to-end tests should be used for release time smoke testing and runtime production monitoring, with synthetic transactions used to simulate user activity. This approach will increase confidence in production releases and should be combined with real-time monitoring of business and operational metrics to accelerate feedback loops and understand user behaviours.

In Continuous Delivery there is a recognition that optimising for Mean Time To Repair is more valuable than optimising for Mean Time Between Failures as it enables an organisation to minimise the impact of production defects, and it is more easily achievable. Defect cost can be controlled as Little’s Law guarantees smaller production releases will shorten lead times to defect resolution, and Continuous Testing provides the necessary infrastructure to shrink feedback loops for smaller releases. The combination of Continuous Testing and Continuous Delivery practices such as Blue Green Releases and Canary Releases empower an organisation to create a robust system capable of neutralising unanticipated events, and advanced practices such as Dark Launching and Chaos Engineering can lead to antifragile systems that seek to benefit from Black Swans. For example, if Chaos Engineering surfaced concerns about the Payments service the Company Accounts team might Dark Launch its Payments Stub into production and use it in the unlikely event of a Payments data centre outage.

While the Payments data centre was offline the Company Accounts service would gracefully degrade to collecting customer payments in the Payments Stub until the Payments service was operational again. Customers would be unaffected by the production incident, and if competitors to the Company Accounts service were also dependent on the same third-party Payments service that would constitute a strategic advantage in the marketplace. Redundant operational capabilities might seem wasteful, but Continuous Testing promotes operational excellence and as Nassim Nicholas Taleb has remarked “something unusual happens – usually“.

Continuous Testing can be a comprehensive and low cost testing strategy. According to Dave Farley and Jez Humble “building quality in means writing automated tests at multiple levels“, and a test suite largely comprised of unit and acceptance tests will contain meticulously tested scenarios with a high degree of test coverage, low execution times, and predictable test results. This means end-to-end tests can be reserved for smoke testing and production monitoring, and testers can be freed up from manual regression testing for higher value activities such as exploratory testing. This will result in fewer production defects, fast and reliable feedback, shorter lead times to market, and opportunities for revenue growth.

From end-to-end testing to continuous testing

“Push tests as low as they can go for the highest return in investment and quickest feedback” Janet Gregory and Lisa Crispin

Moving from End-To-End Testing to Continuous Testing is a long-term investment, and should be based on the notion that an end-to-end test can be pushed down the Test Pyramid by decoupling its concerns as follows:

Connectivity – can services connect to one another
Conversation – can services talk with one another
Conduct – can services behave with one another

Assume the Company Accounts service depends on a Pay endpoint on the Payments service, which accepts a company id and payment amount before returning a confirmation code and days until payment. The Company Accounts service sends the id and amount request fields and silently depends on the code response field.

The connection between the services could be unit tested using Test Doubles, which would allow the Company Accounts service to test its reaction to different Payments behaviours. Company Accounts unit tests would replace the Payments connector with a Mock or Stub connector to ensure scenarios such as an unexpected Pay timeout were appropriately handled.

The conversation between the services could be unit tested using Consumer Driven Contracts, which would enable the Company Accounts service to have its interactions continually verified by the Payments service. The Payments service would issue a Provider Contract describing its Pay API at build time, the Company Accounts service would return a Consumer Contract describing its usage, and the Payments service would create a Consumer Driven Contract to be checked during every build.

With the Company Accounts service not using the days response field it would be excluded from the Consumer Contract and Consumer Driven Contract, so a build of the Payments service that removed days or added a new comments response field would be successful. If the code response field was removed the Consumer Driven Contract would fail, and the Payments team would have to collaborate with the Company Accounts team on a different approach.

The conduct of the services could be unit tested using API Examples, which would permit the Company Accounts service to check for behavioural changes in new releases of the Payments service. Each release of the Payments service would be accompanied by a sibling artifact containing example API requests and responses for the Pay endpoint, which would be plugged into Company Accounts unit tests to act as representative test data and warn of behavioural changes.

If a new version of the Payments service changed the format of the code response field from alphanumeric to numeric it would cause the Company Accounts service to fail at build time, indicating a behavioural change within the Payments service and prompting a conversation between the teams.

Conclusion

“Not only won’t system testing catch all the bugs, but it will take longer and cost more – more than you save by skipping effective acceptance testing” – Jerry Weinberg

End-To-End Testing seems attractive to organisations due to its promise of high test coverage and low maintenance costs, but the extensive use of automated end-to-end tests and manual regression tests can only produce a fragile system with slow, unreliable test feedback that inflates lead times and is incompatible with Continuous Delivery. Continuous Testing requires an upfront and ongoing investment in test automation, but a comprehensive suite of automated unit tests and acceptance tests will ensure fast, deterministic test feedback that reduces production defects, shortens lead times, and encourages the Continuous Delivery of robust or antifragile systems.

Acknowledgements

Thanks to Amy Phillips, Beccy Stafford, Charles Kubicek, and Chris O’Dell for their early feedback on this article.

Release Testing Is Risk Management Theatre

21 July, 2015 / Steve Smith

Continuous Delivery often leads to the discovery of suboptimal practices within an organisation, and the Release Testing antipattern is a common example. What is Release Testing, and why is it an example of Risk Management Theatre?

Pre-Agile Testing

“I was a principal test analyst. I worked in a separate testing team to the developers. I spent most of my time talking to them to understand their changes, and had to work long hours to do my testing” – Suzy

The traditional testing strategy of many IT organisations was predicated upon a misguided belief described by Elisabeth Hendrickson as “testers test, programmers code, and the separation of the two disciplines is important“. Segregated development and testing teams worked in sequential phases of the value stream, with each product increment handed over to the testers for a prolonged period of testing prior to sign-off.

This strategy was certainly capable of uncovering defects, but it also had a detrimental impact on lead times and quality. The handover period between development and testing inserted delays into the value stream, creating large feedback loops that increased rework. Furthermore, the segregation of development and testing implicitly assigned authority for changes to developers and responsibility for quality to testers. This disassociated developers from defect consequences and testers from business requirements, invariably resulting in higher defect counts and lower quality over time.

Agile Testing

“I was a product tester. I worked in an agile team with developers and a business analyst. I contributed to acceptance tests and did exploratory testing. I don’t miss the old ways” – Dwayne

The publication of the Agile Manifesto in 2001 led to a range of lightweight development processes that introduced a radically different testing approach. Agile methods advocate cross-functional teams of co-located developers and testers, in which testing is considered a continuous activity and there is a shared commitment to product quality.

Release Testing Is Risk Management Theatre - Agile Testing

In an agile team developers and testers collaborate on practices such as Test Driven Development and Acceptance Test Driven Development in accordance with the Test Pyramid strategy, which recommends a large number of automated unit and acceptance tests in proportion to a small number of automated end-to-end and manual tests.

The Test Pyramid favours automated unit and acceptance tests as they offer a greater value at a lower cost. Test execution time and determinism are directly proportional to System Under Test size, and as automated unit and acceptance tests have minimal scope they provide fast, deterministic feedback. Automated end-to-end tests and exploratory testing are also valuable, but the larger System Under Test means feedback is slower and less reliable.

This testing strategy is a vast improvement upon its predecessor. Uniting developers and testers in a product team eliminates handover delays and recombines authority with responsibility, resulting in a continual emphasis upon product quality and lower lead times.

Release Testing

“I was an operational acceptance tester. I worked in a separate testing team to the developers and functional testers. I never had time to find defects or understand requirements, and always got the blame” – Jamie

The transition from siloed development and testing teams to cross-functional product teams is a textbook example of how organisational change enables Continuous Delivery – faster feedback and improved quality will unlock substantial cycle time gains and decrease opportunity costs. However, all too often Continuous Delivery is impeded by Release Testing – an additional phase of automated and/or manual end-to-end regression testing, performed on the critical path independent of the product team.

Release Testing Is Risk Management Theatre - Release Testing

Release Testing is often justified as a guarantee of product quality, but in reality it is a disproportionately costly practice with little potential for defect discovery. The segregation of release testers from the product team reinserts handover delays into the value stream and dilutes responsibility for quality, increasing feedback loops and rework. Furthermore, as release testers must rely upon end-to-end tests their testing invariably becomes a Test Ice Cream Cone of slow, brittle tests with long execution times and high maintenance costs.

The reliance of Release Testing upon end-to-end testing on the critical path means a low degree of test coverage is inevitable. Release testers will always be working to a pre-arranged business deadline outside their control, and consequently test coverage will often be curtailed to such an extent the blameless testers will find it difficult to uncover any significant defects.

When viewed through a Continuous Delivery frame the high cost and low value of Release Testing become evident, and attempting to redress that imbalance is a zero-sum game. Decreasing the cost of Release Testing means fewer end-to-end tests, which will decrease execution time but also decrease test coverage. Increasing the value of Release Testing means more end-to-end tests, which will increase test coverage but also increase execution time. Release Testing can therefore be considered an example of what Jez Humble describes as Risk Management Theatre – an overly-costly practice with an artificial sense of value.

Release Testing is high cost, low value Risk Management Theatre

Build Quality In

Continuous Delivery is founded upon the Lean Manufacturing principle of Build Quality In, and the advice of Dr. W. Edwards Deming that “we cannot rely on mass inspection to improve quality” is especially pertinent to Release Testing. An organisation should build quality into its product rather than expect testers to inspect quality in at a later date, and that means eliminating Release Testing by moving release testers back into the product team.

Folding release testers into product development removes the handover delays and responsibility barriers imposed by Release Testing. End-to-end regression tests can be audited by all stakeholders, with valuable tests retained and the remainder discarded. More importantly, ex-release testers will be freed up to work on higher-value activities off the critical path, such as exploratory testing and business analysis.

Batch Size Reduction

Given the limited value of Release Testing it is prudent to consider other risk reduction strategies, and a viable alternative supported by Continuous Delivery is Batch Size Reduction – releasing smaller changesets more frequently into production. Splitting a large experiment into smaller independent experiments reduces variation in outcomes, so by decomposing large changesets into smaller unrelated changesets we can reduce the probability of failure associated with any one changeset.

For example, assume an organisation has a median cycle time of 12 weeks – perhaps due to Release Testing – and a pending release of 4 features. The probability of failure for this release has been estimated as 1 in 2 (50%), and there is a desire to reduce that level of risk.

As the 50% estimate is aggregated from 4 features it can be improved by reducing delivery costs – perhaps by eliminating Release Testing – and releasing features independently every 3 weeks. While this theoretically produces 4 homogeneous releases with a 1 in 8 (12.5%) failure probability, the heterogeneity of product development creates variable feature complexity – and smaller changesets enable more accurate estimation of comparative failure probabilities. In this example the 4 changesets allow a more detailed risk assessment that assigns features 2 and 3 a higher failure probability, which means more exploratory testing time can be allocated to those specific features to reduce overall failure probability.

When a production defect does occur, batch size reduction has the ability to significantly reduce defect cost. The cost of a defect is comprised of the sunk cost incurred between activation and discovery, and the opportunity cost incurred between discovery and resolution. Those costs are a function of cost per unit time and duration, where cost per unit time represents economic impact and duration represents time.

For example, assume the organisation unwisely retained its 12 week lead time and a production defect D1 has been found 3 weeks after release. An assessment of external market conditions calculates a static cost per unit time of £10,000 a week, which means a sunk cost of £30,000 has already been incurred and a £120,000 opportunity cost is looming.

As cost per unit time is governed by external market conditions it is difficult to influence, but duration is controlled by Little’s Law which states that lead time is directly proportional to work in progress. This means the opportunity cost duration of a defect can be decreased by releasing the defect fix in a smaller changeset, which will result in a shorter lead time and a reduced defect cost. If a fix for D1 is released in its own changeset in 1 week, that would decrease the opportunity cost by 92% to £10,000 and produce a 73% overall reduction in defect cost to £40,000.

Conclusion

Release Testing is the definitive example of Risk Management Theatre in the IT industry today and a significant barrier to Continuous Delivery. End-to-end regression testing on the critical path cannot provide any meaningful reduction in defect probability without incurring costs that harm product quality and inflate lead times. Continuous Delivery advocates a lower cost, higher value alternative in which the product team owns responsibility for product quality, with an emphasis upon exploratory testing and batch size reduction to decrease risk.

Tester names have been altered

Organisation antipattern: Build Feature Branching

20 July, 2015 / Steve Smith

The Version Control Strategies series

Organisation antipattern – Release Feature Branching

Organisation pattern – Trunk Based Development

Organisation antipattern – Integration Feature Branching

Organisation antipattern – Build Feature Branching

Build Feature Branching is oft-incompatible with Continuous Integration

Build Feature Branching is a version control strategy where developers commit their changes to individual remote branches of a source code repository prior to the shared trunk. Build Feature Branching is possible with centralised Version Control Systems (VCSs) such as Subversion and TFS, but it is normally associated with Distributed Version Control Systems (DVCSs) such as Git and Mercurial – particularly GitHub and GitHub Flow.

In Build Feature Branching Trunk is considered a flawless representation of all previously released work, and new features are developed on short-lived feature branches cut from Trunk. A developer will commit changes to their feature branch, and upon completion those changes are either directly merged into Trunk or reviewed and merged by another developer using a process such as a GitHub Pull Request. Automated tests are then executed on Trunk, testers manually verify the changes, and the new feature is released into production. When a production defect occurs it is fixed on a release branch cut from Trunk and merged back upon production release.

Consider an organisation that provides an online Company Accounts Service, with its codebase maintained by a team practising Build Feature Branching. Initially two features are requested – F1 Computations and F2 Write Offs – so F1 and F2 feature branches are cut from Trunk and developers commit their changes to F1 and F2.

Two more features – F3 Bank Details and F4 Accounting Periods – then begin development, with F3 and F4 feature branches cut from Trunk and developers committing to F3 and F4. F2 is completed and merged into Trunk by a non-F2 developer following a code review, and once testing is signed off on Trunk + F2 it is released into production. The F1 branch grows to encompass a Computations refactoring, which briefly breaks the F1 branch.

A production defect is found in F2, so a F2.1 fix for Write Offs is made on a release branch cut from Trunk + F2 and merged back when the fix is in production. F3 is deemed complete and merged into Trunk + F2 + F2.1 by a non-F3 developer, and after testing it is released into production. The F1 branch grows further as the Computations refactoring increases in scope, and the F4 branch is temporarily broken by an architectural change to the submissions system for Accounting Periods.

When F1 is completed the amount of modified code means a lengthy code review by a non-F1 developer and some rework are required before F1 can be merged into Trunk + F2 + F2.1 + F3, after which it is successfully tested and released into production. The architectural changes made in F4 also mean a time-consuming code review and merge into Trunk + F2 + F2.1 + F3 + F1 by a non-F4 developer, and after testing F4 goes into production. However, a production defect is then found in F4, and a F4.1 fix for Accounting Periods is made on a release branch and merged into Trunk + F2 + F2.1 + F3 + F1 + F4 once the defect is resolved.

In this example F1, F2, F3, and F4 all enjoy uninterrupted development on their own feature branches. The emphasis upon short-lived feature branches reduces merge complexity into Trunk, and the use of code reviews lowers the probability of Trunk build failures. However, the F1 and F4 feature branches grow unchecked until they both require a complex, risky merge into Trunk.

The Company Accounts Service team might have used Promiscuous Integration to reduce the complexity of merging each feature branch into Trunk, but that does not prevent the same code deviating on different branches. For example, integrating F2 and F3 into F1 and F4 would simplify merging F1 and F4 into Trunk later on, but it would not restrain F1 and F4 from generating Semantic Conflicts if they both modified the same code.

This example shows how Build Feature Branching typically inserts a costly integration phase into software delivery. Short-lived feature branches with Promiscuous Integration should ensure minimal integration costs, but the reality is feature branch duration is limited only by developer discipline – and even with the best of intentions that discipline is all too easily lost. A feature branch might be intended to last only for a day, but all too often it will grow to include bug fixes, usability tweaks, and/or refactorings until it has lasted longer than expected and requires a complex merge into Trunk. This is why Build Feature Branching is normally incompatible with Continuous Integration, which requires every team member to integrate and test their changes on Trunk on at least a daily basis. It is highly unlikely every member of a Build Feature Branching team will merge to Trunk daily as it is too easy to go astray, and while using a build server to continuously verify branch integrity is a good step it does not equate to shared feedback on the whole system.

Build Feature Branching advocates that the developer of a feature branch should have their changes reviewed and merged into Trunk by another developer, and this process is well-managed by tools such as GitHub Pull Requests. However, each code review represents a handover period full of opportunities for delay – the developer might wait for reviewer availability, the reviewer might wait for developer context, the developer might wait for reviewer feedback, and/or the reviewer might wait for developer rework. As Allan Kelly has remarked “code reviews lose their efficacy when they are not conducted promptly“, and when a code review is slow the feature branch grows stale and Trunk merge complexity increases. A better technique to adopt would be Pair Programming, which is a form of continuous code review with minimal rework.

Asking developers working on orthogonal tasks to share responsibility for integrating a feature into Trunk dilutes responsibility. When one developer has authority for a feature branch and another is responsible for its Trunk merge both individuals will naturally feel less responsible for the overall outcome, and less motivated to obtain rapid feedback on the feature. It is for this reason Build Feature Branching often leads to what Jim Shore refers to as Asynchronous Integration, where the developer of a feature branch starts work on the next feature immediately after asking for a review, as opposed to waiting for a successful review and Trunk build. In the short-term Asynchronous Integration leads to more costly build failures, as the original developer must interrupt their new feature and context switch back to the old feature to resolve a Trunk build failure. In the long-term it results in a slower Trunk build, as a slow build is more tolerable when it is monitored asynchronously. Developers will resist running a full build locally, developers will then checkin less often, and builds will gradually slowdown until the entire team grinds to a halt. A better solution is for developers to adopt Synchronous Integration in spite of Build Feature Branching, and by waiting on Trunk builds they will be compelled to optimise it using techniques such as acceptance test parallelisation.

Build Feature Branching works well for open-source projects where a small team of experienced developers must integrate changes from a disparate group of contributors, and the need to mitigate different timezones and different levels of expertise outweighs the need for Continuous Integration. However, for commercial software development Build Feature Branching fits the Wikipedia definition of an antipattern – “a common response to a recurring problem that is usually ineffective and risks being highly counterproductive“. A small, experienced team practising Build Feature Branching could theoretically accomplish Continuous Integration given a well-structured architecture and a predictable flow of features, but it would be unusual. For the vast majority of co-located teams working on commercial software Build Feature Branching is a costly practice that discourages collaboration, inhibits refactoring, and by implicitly sacrificing Continuous Integration acts as a significant impediment to Continuous Delivery. As Paul Hammant has said, “you should not make branches for features regardless of how long they are going to take“.

Organisation antipattern: Integration Feature Branching

5 July, 2015 / Steve Smith

The Version Control Strategies series

Organisation antipattern – Release Feature Branching

Organisation pattern – Trunk Based Development

Organisation antipattern – Integration Feature Branching

Organisation antipattern – Build Feature Branching

Integration Feature Branching is overly-costly and unpredictable

Integration Feature Branching is a version control strategy where developers commit their changes to a shared remote branch of a source code repository prior to the shared trunk. Integration Feature Branching is applicable to both centralised Version Control Systems (VCS) and Distributed Version Control Systems (DVCS), with multiple variants of increasing complexity:

Type 1 – Integration branch and Trunk. This was originally used with VCSs such as Subversion and TFS
Type 2 – Feature branches, an Integration branch, and Trunk. This is used today with DVCSs such as Git and Mercurial
Type 3 – Feature release branches, feature branches, an Integration branch, and Trunk. This is advocated by Git Flow

In all Integration Feature Branching variants Trunk represents the latest production-ready state and Integration represents the latest completed changes ready for release. New features are developed on Integration (Type 1), or short-lived feature branches cut from Integration and merged back into Integration on completion (Types 2 and 3). When Integration contains a new feature it is merged into Trunk for release (Types 1 and 2), or a short-lived feature release branch cut from Integration and merged into Trunk and Integration on release (Type 3). When a production defect occurs it is fixed on a release branch cut from Trunk, then merged back to Integration (Types 1 and 2) or a feature release branch if one exists (Type 3).

Consider an organisation that provides an online Company Accounts Service, with its codebase maintained by a team practising Type 2 Integration Feature Branching. Initially two features are requested – F1 Computations and F2 Write Offs – so F1 and F2 feature branches are cut from Integration and developers commit their changes to F1 and F2.

Two more features – F3 Bank Details and F4 Accounting Periods – then begin development, with F3 and F4 feature branches cut from Integration and developers committing to F3 and F4. F2 is completed and merged into Integration, and after testing it is merged into Trunk and regression tested before its production release. The F1 branch is briefly broken by a computations refactoring, with no impact on Integration.

When F3 is completed it is merged into Integration + F2 and tested, but in the meantime a production defect is found in F2. A F2.1 fix is made on a F2.1 release branch cut from Trunk + F2, and after its release F2.1 is merged into and regression tested on both Integration + F2 + F3 and Trunk + F2. F3 is then merged into Trunk and regression tested, after which it is released into production. F1 continues development, and the F4 branch is temporarily broken by changes to the submissions system.

When F1 is completed and merged into Integration + F2 + F3 + F2.1 it is ready for production release, but a business decision is made to release F4 first. F4 is completed and after being merged into and tested on both Integration + F2 + F3 + F2.1 + F1 and Trunk + F2 + F3 + F2.1 it is released into production. Soon afterwards F1 is merged into and regression tested on Trunk + F2 + F2.1 + F3, then released into production. A production defect is found in F4, and a F4.1 fix is made on a release branch cut from Trunk + F2 + F2.1 + F3 + F4 + F1. Once F4.1 is released it is merged into and regression tested on both Integration + F2 + F3 + F2.1 + F1 + F4 and Trunk + F2 + F2.1 + F3 + F4 + F1.

In this example F1, F2, F3, and F4 all enjoy uninterrupted development on their own feature branches. The use of an Integration branch reduces the complexity of each merge into Trunk, and allows the business stakeholders to re-schedule the F1 and F4 releases when circumstances change. However, the isolated development of F1, F2, F3, and F4 causes complex, time-consuming merges into Integration, and Trunk requires regression testing as it can differ from Integration – such as F4 being merged into Integration + F2 + F3 + F2.1 + F1 and Trunk + F2 + F2.1 + F3. The Company Accounts Service team might have used Promiscuous Integration on feature release to reduce the complexity of merging into Integration, but there would still be a need for regression testing on Trunk.

If the Company Accounts Service team used Type 3 Integration Feature Branching the use of feature release branches between Integration and Trunk could reduce the complexity of merging into Trunk, but regression testing would still be required on Trunk to garner confidence in a production release. Type 3 Integration Feature Branching also makes the version control strategy more convoluted for developers, as highlighted by Adam Ruka criticising Git Flow’s ability to “create more useless merge commits that make your history even less readable, and add significant complexity to the workflow“.

The above example shows how Integration Feature Branching adds a costly, unpredictable phase into software development for little gain. The use of an Integration branch in Type 1 creates wasteful activities such as Integration merges and Trunk regression testing, which insert per-feature variability into delivery schedules. The use of feature branches in Type 2 discourages collaborative design and refactoring, leading to a gradual deterioration in codebase quality. The use of feature release branches in Type 3 lengthens feedback loops, increasing rework and lead times when defects occur.

Integration Feature Branching is entirely incompatible with Continuous Integration. Continuous Integration requires every team member to integrate and test their code on Trunk at least once a day in order to minimise feedback loops, and Integration Feature Branching is the polar opposite of this. While Integration Feature Branching can involve commits to Integration on a daily basis and a build server constantly verifying both Integration and Trunk integrity, it is vastly inferior to continuously integrating changes into Trunk. As observed by Dave Farley, “you must have a single shared picture of the state of the system… there is no point having a separate integration branch“.

Organisation pattern: Trunk Based Development

16 June, 2015 / Steve Smith

The Version Control Strategies series

Organisation antipattern – Release Feature Branching

Organisation pattern – Trunk Based Development

Organisation antipattern – Integration Feature Branching

Organisation antipattern – Build Feature Branching

Trunk Based Development minimises development costs and risk

Trunk Based Development is a version control strategy in which developers commit their changes to the shared trunk of a source code repository with minimal branching. Trunk Based Development became well known in the mid 2000s as Continuous Integration became a mainstream development practice, and today it is equally applicable to centralised Version Control Systems (VCS) and Distributed Version Control Systems (DVCS).

In Trunk Based Development new features are developed concurrently on trunk as a series of small, incremental steps that preserve existing functionality and minimise merge complexity. Features are always released from trunk, and defect fixes are either released from trunk or a short-lived release branch.

When development of a feature spans multiple releases its entry point is concealed to ensure the ongoing changes do not impede release cadence. The addition of a new feature can be concealed with a Feature Toggle, which means a configuration parameter or business rule is used to turn a feature on or off at runtime. As shown below a Feature Toggle is turned off while its feature is in development (v1), turned on when its feature is in production (v2), and removed after a period of time (v3).

Updates to an existing feature can be concealed with a Branch By Abstraction, which means an abstraction layer is temporarily introduced to encapsulate both the old behaviour in use and the new behaviour in development. As shown below a Branch By Abstraction routes requests to the old behaviour while the new behaviour is in development (v1-v2), reroutes requests to the new behaviour when it is in production (v3), and is removed after a period of time (v4).

Trunk Based Development is synonymous with Continuous Integration, which has been described by Jez Humble et al as “the most important technical practice in the agile canon“. Continuous Integration is a development practice where all members of a team integrate and test their changes together on at least a daily basis, resulting in a shared mindset of collaboration and an always releasable codebase. This is verified by an automated build server continuously building the latest changes, and can include pre- and post-build actions such as code reviews and auto-revert on failure.

Consider an organisation that provides an online Company Accounts Service, with its codebase maintained by a team practising Trunk Based Development and Continuous Integration. In iteration 1 two features are requested – F1 Computations and F2 Write Offs – so the team discuss their concurrent development and decide on a Feature Toggle for F1 as it is a larger change. The developers commit their changes for F1 and F2 to trunk multiple times a day, with F1 tested in its on and off states to verify its progress alongside F2.

In iteration 2 more features – F3 Bank Details and F4 Accounting Periods – begin development. F4 requires a different downstream submissions system, so the team design a Branch By Abstraction for submissions to ensure F1 and F3 can continue with the legacy submissions system until F4 is complete. F2 is signed off and released into production with F1 still toggled off at runtime. Some changes for F3 break the build, which triggers an automatic revert and a team discussion on a better design for F3.

In iteration 3 a production defect is found in F2, and after the defect is fixed on trunk a release branch is agreed for risk mitigation. An F2.1 release branch is created from the last commit of the F2 release, the fix is merged to the branch, and F2.1 is released into production. F4 continues on trunk, with the submissions Branch By Abstraction tested in both modes. F3 is signed off and released into production using the legacy submissions system.

In iteration 4 F1 is signed off and its Feature Toggle is turned on in production following a release. F4 is signed off and released into production, but when the Branch By Abstraction is switched to the new submissions system a defect is found. As a result the Branch By Abstraction is reverted at runtime to the legacy submissions system, and a F4.1 fix is released from trunk.

In this example F1, F2, F3, and F4 clearly benefit from being developed by a team collaborating on a single shared code stream. For F1 the team agrees on the why and how of the Feature Toggle, with F1 tested in both its on and off states. For F2 the defect fix is made available from trunk and everyone is aware of the decision to use a release branch for risk mitigation. For F3 the prominence of a reverted build failure encourages people to contribute to a better design. For F4 there is a team decision to create a submissions Branch By Abstraction, with the new abstraction layer offering fresh insights into the legacy system and incremental commits enabling regular feedback on the new approach. Furthermore, when the new submissions system is switched on and a defect is found in F4 the ability to revert at runtime to the legacy submissions means the Company Accounts Service can remain online with zero downtime.

This highlights the advantages of Trunk Based Development:

Continuous Integration – incremental commits to trunk ensure an always integrated, always tested codebase with minimal integration costs and a predictable flow of features
Adaptive scheduling – an always releasable codebase separates the release schedule from development efforts, meaning features can be released on demand according to customer needs
Collaborative design – everyone working on the same code encourages constant communication, with team members sharing responsibility for design changes and a cohesive Evolutionary Architecture
Operational and business empowerment – techniques such as Feature Toggle and Branch By Abstraction decouple release from launch, providing the operational benefit of graceful degradation on failure and the business benefit of Dark Launching features

Breaking down features and re-architecting an existing system in incremental steps requires discipline, planning, and ingenuity from an entire team on a daily basis, and Trunk Based Development can incur a development overhead for some time if multiple technologies are in play and/or the codebase is poorly structured. However, those additional efforts will substantially reduce integration costs and gradually push the codebase in the right direction – as shown by Dave Farley and Jez Humble praising Trunk Based Development for “the gentle, subtle pressure it applies to make the design of your software better“.

A common misconception of Trunk Based Development is that it is slow, as features take longer to complete and team velocity is often lower than expected. However, an organisation should optimise globally for cycle time not locally for velocity, and by mandating a single code stream Trunk Based Development ensures developers work at the maximum rate of the team not the individual, with reduced integration costs resulting in lower lead times.

Trunk Based Development is simple, but not easy. It has a steep learning curve but the continuous integration of small changesets into trunk will minimise integration costs, encourage collaborative design, empower runtime operational and business decisions, and ultimately drive the engine of Continuous Delivery. It is for this reason Dave Farley and Jez Humble declared “we can’t emphasise enough how important this practice is in enabling continuous delivery of valuable, working software“.

What Is Continuous Delivery

12 May, 2015 / Steve Smith

What is Continuous Delivery, and how could it help your organisation?

Continuous Delivery is often cited by organisations with a high performance IT capability, yet it is difficult to find a concise explanation. Executives and managers want to understand its business case, and practitioners want to understand its impact.

Introduction

“Software is eating the world” – Marc Andreessen

If there is one constant in the 21st Century, it is the ever-accelerating rate of technology change. British Gas manages 75,000 thermostats online, Tesla delivers over-the-air fixes to 29,000 cars, and with the Deloitte Shift Index reporting the last 50 years have seen the life expectancy of a Fortune 500 company decline from 75 years to 15 years it is evident the business world has been permanently disrupted by the ubiquity of software. Every business is an IT business now.

If an organisation wants a competitive advantage today it must strategically position IT at the core of its business to rapidly deliver new capabilities and respond to customer demands faster. If successful the rewards are great, as shown by the 2014 State Of DevOps Report declaring organisations with a high performance IT capability were twice as likely to exceed profitability, market share, and productivity goals. However, in many organisations IT has historically been a cost centre used solely for efficiency gains, and as a result years of underinvestment in software delivery must now be rectified.

The Last Mile

The Last Mile is a term used in IT to describe the value stream in an organisation from development to production, and was originally used in the telecoms industry to describe broadband provisioning from the telephone exchange to the customer premises. In IT the Last Mile is traditionally formed of discrete sequential phases, with work managed via a phase-gate project process.

In telecoms a long Last Mile to the customer means a slower, inferior service and IT is no different – many quality problems occur in the Last Mile, and a longer time to market means slower customer feedback and increased opportunity costs. Unfortunately many organisations have not invested in the Last Mile for years and consequently rely upon a manual release process with developers, testers, system administrators, and database administrators occupied by the following tasks:

Manual configuration – system administrators manually configure releases, causing runtime application errors
Manual infrastructure – system administrators manually provision servers and networks, introducing environmental errors
Manual testing – testers manually regression test releases without production reference data, leading to slow and inaccurate feedback
Manual database management – database administrators manually apply unversioned database scripts, resulting in faulty and/or nonperformant changes
Manual operations – systems administrators manually deploy, start, and stop releases, increasing potential for human error
Manual monitoring – system administrators manually configure operational checks, leading to unreliable alerts
Manual rollback – systems administrators manually unwind releases upon failure, leaving an inconsistent system state
Manual audit – testers, database administrators, and systems administrators manually log their actions, producing an inaccurate audit trail

Such a release process is likely to be controlled by a heavyweight change management process, with extensive documentation required and collaboration between teams restricted to a ticketing system. Production releases will take hours or even days, and will be conducted out of business hours due to fear of failure. This results in a stressful, infrequent release process that increases costs and prevents organisations from delivering features to customers on predictable timelines.

For example, consider an organisation with fortnightly iterations and quarterly production releases. If a product increment is estimated to generate £70,000 per week and is not released for 3 months of 28 days it will incur an opportunity cost of £840,000. In this scenario it is unlikely the Last Mile activities performed would outweigh the opportunity cost.

The IT capability of an organisation can be measured by its cycle time, which is the average lead time from development to production. This is sometimes framed as the Poppendieck question:

“How long would it take your organisation to deploy a change that involves just one single line of code?” Mary Poppendieck

A low cycle time can furnish an organisation with a short, repeatable, and reliable Last Mile that represents a strategic advantage over the competition – and Continuous Delivery is the means to accomplish that goal.

Continuous Delivery

Inspired by the first principle of the Agile Manifesto stating “our highest priority is to satisfy the customer through early and continuous delivery of valuable software“, Continuous Delivery is a set of holistic principles and practices that advocate automating a deployment pipeline to rapidly and reliably release software into production. Creating a Continuous Delivery pipeline enables smaller, more frequent production releases that show the business what customers want – and do not want – much faster, reducing opportunity costs and increasing product revenues.

The seminal book “Continuous Delivery” by Dave Farley and Jez Humble describes how Continuous Delivery is composed of the following principles:

Principle	Description
Repeatable Reliable Process	Use the same deterministic release mechanism in all environments
Automate Almost Everything	Automate as much of the release workflow as possible
Keep Everything In Version Control	Version control code, config, schemas, infrastructure, etc.
Build Quality In	Develop and test work items in a single continuous activity
Bring The Pain Forward	Increase cadence of infrequent, costly events to reduce errors
Done Means Released	Do not consider a feature complete until it is in production
Everybody Is Responsible	Align individuals and teams with the release process
Continuous Improvement	Continuously improve the people and technology involved

It is important to note that only the first 3 of these principles are technology-focussed, which is an early indication of the impact of organisational structures and communication pathways upon Continuous Delivery.

The Deployment Pipeline

At the heart of Continuous Delivery is the Deployment Pipeline pattern, which extends the development practice of Continuous Integration to establish an automated workflow of build, test, and release activities from checkin to production. Any change to code, configuration, infrastructure, reference data, or database schema triggers a pipeline run, which packages a new artifact version and stores it in the artifact repository. That artifact is then subjected to a series of automated and exploratory tests to evaluate its production readiness, progressing to the next stage on success or halting upon failure. The result of this rigorous and repeatable testing process is a release candidate that meets well-defined quality standards and instils confidence prior to production release.

Automating a deployment pipeline offers the following advantages:

Automated configuration – application behaviour is modified consistently, reducing runtime errors
Automated infrastructure – operating systems, middleware, and networks are automatically provisioned, preventing environmental errors
Automated testing – acceptance and performance tests are automated with production reference data, providing fast feedback
Automated database management – database scripts are versioned and applied from development to production, uncovering errors sooner
Automated operations – releases are a push button process for authenticated users, reducing human error
Automated monitoring – operational checks are automated, increasing confidence in the production environment
Automated rollback – rollback uses the standard release mechanism, ensuring a reliable rollback on error
Automated statistics – lead time data is easily collected, enabling visualisation of metrics such as cycle time
Automated audit – all actions are recorded, forming a timely and accurate audit trail

A deployment pipeline tightens feedback loops, reduces error rates, and automates repetitive tasks so that humans are freed up to work on higher-value activities – testers can perform exploratory testing, database administrators can plan for capacity, and systems administrators can create business-facing monitoring checks to learn from customers. Automated configuration and automated infrastructure provisioning are particularly important as they allow operational changes to be developed, tested, and released using exactly the same process as functional changes.

There is a wide range of practices that can be applied throughout a deployment pipeline, including the following

Practice	Description	Advantage
Build Artifacts Once	Build immutable artifacts on commit	Prevents compiler errors post-development
Stop The Line	Prioritise a releaseable codebase over new features	Improves flow of features to production on predictable timelines
Trunk Based Development	Commit changes directly to trunk	Minimises merge costs and encourages good design practices
Feature Toggle	Turn features on/off at runtime	Reduces operational fragility and permits fine-grained feature launches
Atomic Tests	Write automated tests that own all their data	Enables tests to be parallelised and speed up developer feedback
Consumer Driven Contracts	Test consumer/provider interactions as unit tests	Shrinks integration costs and constantly tests conversational integrity
Separate Schema And Data	Split schema and reference data origins	Improves reliability of database changes and data quality
Expand/Contract	Stagger constructive and destructive schema changes	Enables database migrations to occur with zero downtime
Blue Green Releases	Perform new release into standby production servers	Allows production server upgrades to occur with zero downtime
Canary Releasing	Perform new release one server at a time	Reduces risk of production upgrade error affecting customers

Together these practices enable an organisation to rapidly develop, test, and release high quality software with a low error rate and zero downtime in production.

Batch Size Reduction

Once a deployment pipeline is established it can be used to reduce batch size and release smaller changesets into production more frequently, resulting in a lower cycle time that delivers value-add faster at a lower level of risk. In the earlier example, assume the product increment with an estimated value of £70,000 is comprised of 4 features and the estimated failure probability is 1 in 2 (50%).

Now assume the organisation has adopted Continuous Delivery to the extent it can release 1 feature at a time every 3 weeks via its deployment pipeline. A smaller batch size reduces changeset complexity, so more accurate estimates of value-add and risk per release become possible – and as product development is heterogeneous different features offer different amounts of value-add and risk, meaning releases can be ordered by relative value-add and risk.

This breakdown of value-add and risk shows that Feature 2 should be released prior to Feature 1, as it will realise 50% of the original value-add in 25% of the original timeline with only a 6.25% failure probability. Exploratory testing of Features 3 and 4 can be increased to mitigate their heightened risk, and if a production defect does occur the smaller changeset size and lower cycle time means defects can be identified and resolved at a much lower cost.

Combining small production releases with a well-defined authorisation model and automated audit trail means a deployment pipeline can be a definitive compliance tool entirely compatible with ITIL. The ITIL v3 Service Transition definition of a Standard Change as a pre-authorised, low risk, and common change matches the Continuous Delivery concept of frequently releasing small changesets into production. The change management approval process can be implemented as an automated button push for authorised users, and in the rare situation where a small production release is not possible a more involved Normal Change procedure can be followed.

Organisational Change

While a wide range of tools are available to build a deployment pipeline – such as Zookeeper for configuration management, Puppet/AWS/Docker for infrastructure provisioning, Capistrano for deployments, and Graphite/Logstash for traceability – tool selection will not have a significant impact upon the success or failure of a Continuous Delivery programme. The central challenge of Continuous Delivery is undoubtedly organisational change, as many organisations consist of siloed teams with their own local incentives and priorities. Business stakeholders handover requirements to Development, who handover release candidates to Testing, who handover releases to Operations, who handover features to customers – and those handovers introduce significant delays into the value stream that will comprise the majority of cycle time.

In the earlier example, the 12 week cycle time would likely contain delays caused by issues such as test hardware procurement, database administrator unavailability, and change advisory board scheduling. Such delays are entirely unrelated to technology choices and must be addressed via organisational change.

If an organisation is to successfully adopt Continuous Delivery it must harmonise its communication pathways with Conway’s Law, which postulates a correlation between organisational and system architecture now accepted as canon within IT:

“Any organisation that designs a system (defined broadly) will produce a design whose structure is a copy of the organisation’s communication structure” Mel Conway

Conway’s Law explains why siloed teams within the same value stream inevitably use their own tooling and processes – such as developers using a different database migrator to database administrators – and strongly implies that the most effective method of software delivery is cross-functional product teams with complete authority and responsibility for their deployment pipeline.

If an organisation wants to reap the rewards of Continuous Delivery it must change itself, and as Linda Rising and Mary Lynn Manns have observed “change is best introduced bottom-up with support at appropriate points from management“. Executives must communicate well-defined business outcomes, encourage innovators and early adopters, and help their staff with the transition. As Edgar Schein suggests in his book “Organisational Culture and Leadership” change often triggers learning anxiety and survival anxiety in individuals, so for people to commit to Continuous Delivery they must feel part of a culture of continuous improvement in a blame-free environment.

The DevOps philosophy conceived by Patrick Debois in 2009 is a popular method of fostering a culture of collaboration, and its emphasis upon Development and Operations working together on feature development and operability while sharing incentives can be a powerful force for change. However, as Dave Farley has observed “DevOps rarely says enough about the goal of delivering valuable software” and misinterpretations of its collaboration philosophy such as the DevOps Team antipattern are common.

While a deployment pipeline cannot directly reduce cycle time, it can indirectly contribute by acting as a central communication tool that facilitates organisational change. Radiating lead times and real-time customer monitoring from the deployment pipeline will show people the direct correlation between value stream bottlenecks and loss of customer revenue, and as a result people will better understand how change will benefit the entire organisation. This means organisational change for Continuous Delivery has an implicit dependence upon automation that can be summarised as follows:

Continuous Delivery is 10% automation and 90% organisational change – but don’t try it without that 10%

Dual Value Streams

While Continuous Delivery appears to be a daunting prospect many organisations already contain evidence of their potential for success, as they have two different value streams – a Feature Value Stream of siloed teams that will take weeks or months, and a Fix Value Stream of natural collaboration that will take days.

The reason Fix Value Streams have a much lower cycle time than Feature Value Streams is defect fixes are more easily assigned an estimated Cost Of Delay that can be communicated throughout an organisation. When people know a defect has caused a sunk cost and an opportunity cost is pending there is a shared sense of urgency that encourages collaboration and a truncated value stream. This is a leading indicator of organisational potential for Continuous Delivery, and highlights how inappropriate the project paradigm is for product development. When an organisation works on features in smaller batches it can use Cost Of Delay to better prioritise work and improve flow through its Continuous Delivery pipeline.

Conclusion

It is time for businesses to recognise the strategic value of an IT capability that can rapidly innovate and respond to customer feedback in existing and emerging markets. Continuous Delivery enables an organisation to significantly reduce its time to market for new features, resulting in improved quality and increased product revenues.

Automating a deployment pipeline and accomplishing organisational change for Continuous Delivery is a long-term investment. Jez Humble at al say “the key is to find ways to make small, incremental changes that deliver improved customer outcomes“, and Dave Farley says “break down barriers, increase automation, increase collaboration and iterate“. Regardless of approach, a successful adoption of Continuous Delivery will provide an organisation with an enormous advantage over competitors. If an organisation does not adopt Continuous Delivery, it will eventually lose out to a competitor that can deliver faster, learn from customers faster, and make money faster. You can ignore the economics, but the economics won’t ignore you…

Introduction

DevOps As A Philosophy

DevOps As A Cult

Aim for Operability, not DevOps As A Cult

Conclusion

Introduction

The folly of End-To-End Testing

The value of Continuous Testing

From end-to-end testing to continuous testing

Conclusion

Further Reading

Acknowledgements

Pre-Agile Testing

Agile Testing

Release Testing

Build Quality In

Batch Size Reduction

Conclusion

Further Reading

Introduction

The Last Mile

Continuous Delivery

The Deployment Pipeline

Batch Size Reduction

Organisational Change

Dual Value Streams

Conclusion

Further Reading

Recent Posts

Tags

Categories

Archives