On Tech

Month: September 2012

The Strangler Pipeline – Challenges

The Strangler Pipeline introduced a Repeatable Reliable Process for start/stop, deployment, and database migration

Previous entries in the Strangler Pipeline series:

  1. The Strangler Pipeline – Introduction

To start our Continuous Delivery journey at Sky Network Services, we created a cross-team working group and identified the following challenges:

  • Slow platform build times. Developers used brittle, slow Maven/Ruby scripts to construct platforms of applications
  • Different start/stop methods. Developers used a Ruby script to start/stop individual applications, server administrators used a Perl script to start/stop platforms of applications
  • Different deployment methods. Developers used a Ruby script to deploy applications, server administrators used a Perl script to deploy platforms of applications driven by a Subversion tag
  • Different database migration methods. Developers used Maven to migrate applications, database administrators used a set of Perl scripts to migrate platforms of applications driven by the same Subversion tag

As automated release management is not our core business function, we initially examined a number of commercial and open-source off-the-shelf products such as ThoughtWorks Go, LinkedIn Glu, AnthillPro, and Jenkins. However, despite identifying Go as an attractive option we reluctantly decided to build a custom pipeline. As our application estate already consisted of ~30 applications, we were concerned that the migration cost of introducing a new release management product would be disproportionately high. Furthermore, a well-established Continuous Integration solution of Artifactory Pro and a 24-agent TeamCity build farm was in situ, and to recommend discarding such a large financial investment with no identifiable upfront value would have been professional irresponsibility bordering upon consultancy. We listened to Bodart’s Law and reconciled ourselves to building a low-cost, highly scalable pipeline capable of supporting our applications in order of business and operational value.

With trust between Development and Operations at a low ebb, our first priority was to improve platform build times. With Maven used to build and release the entire application estate, the use of non-unique snapshots in conjunction with the Maven Release plugin meant that a platform build could take up to 60 minutes, recompile the application binaries, and frequently fail due to transitive dependencies. To overcome this problem we decreed that using the Maven Release plugin violated Build Your Binaries Only Once, and we placed Maven in a bounded CI context of clean-verify. Standalone application binaries were built at fixed versions using the Axel Fontaine solution, and a custom Ant script was written to transform Maven snapshots into releasable artifacts. As a result of these changes platform build times shrank from 60 minutes to 10 minutes, improving release cadence and restoring trust between Development and Operations.
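As a rough sketch of that snapshot-to-release transformation (the real work was done by a custom Ant script; the function name and naming convention below are illustrative only):

```python
import re

def release_name(snapshot_jar: str, build_number: int) -> str:
    """Illustrative only: rewrite a Maven snapshot file name into a
    fixed-version release name, in the spirit of the Axel Fontaine
    approach of baking a unique build number into each version."""
    m = re.fullmatch(r"(.+)-(\d+(?:\.\d+)*)-SNAPSHOT\.jar", snapshot_jar)
    if not m:
        raise ValueError(f"not a snapshot artifact: {snapshot_jar}")
    artifact, version = m.groups()
    # Build Your Binaries Only Once: the binary is untouched, only its
    # identity becomes fixed and releasable.
    return f"{artifact}-{version}.{build_number}.jar"
```

For example, `release_name("fruit-service-1.0-SNAPSHOT.jar", 123)` yields `fruit-service-1.0.123.jar`, a fixed version that can flow through every pipeline stage without recompilation.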

In the meantime, some of our senior Operations staff had been drawing up a new process for starting/stopping applications. While the existing release procedure of deploy -> stop -> migrate -> set current version -> start was compatible with the Decouple Deployment From Release principle, the start/stop scripts used by Operations were coupled to Apache Tomcat wrapper scripts due to prior use. The Operations team were aware that new applications were being developed for Jetty and Java Web Server, and collectively it was acknowledged that the existing model left Operations in the undesirable state of Responsibility Without Authority. To resolve this Operations proposed that all future application binaries should be ZIP archives containing zero-parameter start and stop shell scripts, and this became the first version of our Binary Interface. This strategy empowered Development teams to choose whichever technology was most appropriate to solve business problems, and decoupled Operations teams from knowledge of different start/stop implementations.
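A minimal sketch of how pipeline or Operations tooling might drive the Binary Interface, assuming only what the interface guarantees: an extracted archive containing zero-parameter start and stop shell scripts (the helper name is hypothetical):

```python
import subprocess
from pathlib import Path

def run_binary_interface(app_dir: str, action: str) -> int:
    """Hypothetical helper: invoke the zero-parameter start/stop script
    shipped inside an application ZIP archive, extracted to app_dir.
    The Binary Interface guarantees only that start.sh and stop.sh
    exist and take no arguments."""
    if action not in ("start", "stop"):
        raise ValueError(f"unsupported action: {action}")
    script = Path(app_dir) / f"{action}.sh"
    if not script.is_file():
        raise FileNotFoundError(f"{script} missing: archive violates the Binary Interface")
    # Operations tooling needs no knowledge of Tomcat, Jetty, or Java Web
    # Server -- it simply executes the script with zero parameters.
    return subprocess.call(["sh", str(script)], cwd=app_dir)
```

The point of the design is visible in the code: the caller never branches on application technology, so adding a new server technology requires no change to Operations tooling.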

Although the Binary Interface proved over time to be successful, the understandable desire to decommission the Perl deployment scripts meant that early versions of the Binary Interface also called for deployment, database migration, and symlinking scripts to be provided in each ZIP archive. It was successfully argued that this conflated the need for binary-specific start/stop policies with application-neutral deploy/migrate policies, and as a result the latter responsibilities were earmarked for our pipeline.

Implementing a cross-team plan of action for database migration has proven far more challenging. The considerable amount of customer-sensitive data in our Production databases encouraged risk aversion, and there was a sizeable technology gap. Different Development teams used different Maven plugins and database administrators used a set of unfathomable Perl scripts run from a Subversion tag. That risk aversion and gulf in knowledge meant that a cross-team migration strategy was slow to emerge, and its implementation remains in progress. However, we did experience a Quick Win and resolve the insidious Subversion coupling when a source code move in Subversion caused an unnecessary database migration failure. A pipeline stage was introduced to deliver application SQL from Artifactory to the Perl script source directories on the database servers. While this solution did not provide full database migration, it resolved an immediate problem for all teams and better positioned us for full database migration at a later date.
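That SQL delivery stage might look something like the following sketch, assuming the application artifact has already been fetched from Artifactory to a local directory (function and directory names are illustrative):

```python
import shutil
from pathlib import Path

def deliver_sql(artifact_dir: str, perl_scripts_dir: str) -> list:
    """Illustrative sketch of the pipeline stage that copies application
    SQL out of a fetched artifact into the Perl migration scripts
    directory, replacing the coupling to Subversion source paths."""
    target = Path(perl_scripts_dir)
    target.mkdir(parents=True, exist_ok=True)
    delivered = []
    # Deliver only the SQL; everything else in the artifact is of no
    # interest to the database administrators' existing scripts.
    for sql in sorted(Path(artifact_dir).glob("*.sql")):
        shutil.copy2(sql, target / sql.name)
        delivered.append(sql.name)
    return delivered
```

Because the Perl scripts now read from a directory the pipeline populates, a source code move in Subversion can no longer break a database migration.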

With the benefit of hindsight, it is clear that the above tooling discrepancies, disparate release processes, and communications issues were rooted in Development and Operations historically working in separate silos, as forewarned by Conway’s Law. These problems were solved by Development and Operations teams coming together to create and implement cross-team policies, and this formed a template for future co-operation on the Strangler Pipeline.

Pipeline Pattern: Stage Strangler

The Strangler Pattern reduces the pipeline entry cost for multiple applications

When adding an application into a Continuous Delivery pipeline, we must assess its compatibility with the Repeatable Reliable Process already used by the pipeline to release application artifacts. If the new application produces artifacts that are deemed incompatible, then we can use an Artifact Interface to hide the implementation details. However, if the new application has an existing release mechanism that is radically different, then we must balance our desire for a uniform Repeatable Reliable Process with business expectations.

Assuming that the rationale for pipelining the new application is to de-risk its release process and improve its time-to-market, spending a significant amount of time re-engineering the pipeline and/or application would conflict with Bodart’s Law and harm our value proposition. In this situation we should be pragmatic and adopt a separate, application-specific Repeatable Reliable Process and manage the multiple release mechanisms within the pipeline via a Stage Interface and the Strangler Pattern.

The Strangler Pattern is a legacy code pattern named after Strangler Fig plants, which grow in rainforests where there is intense competition for sunlight. Strangler plants germinate in the rainforest canopy, growing down and around a host tree an inch at a time until the roots are reached and the host tree dies. The Strangler Pattern uses this as an analogy to describe how to replace legacy systems, with a Strangler application created to wrap around the legacy application and gradually replace it one feature at a time until decommissioning. The incremental progress of the Strangler Pattern facilitates a higher release cadence and de-risks system cutover, as well as allowing new features to be developed alongside the transfer of existing features.

To use the Strangler Pattern in Continuous Delivery, we first define a Stage Interface as follows:

Stage#run(Application, Version, Environment)

For each pipeline stage we can then create a default implementation to act as the Repeatable Reliable Process for as many applications as possible, and consider each incoming application on its merits. If the existing release mechanism of a new application is unwanted, then we can use our default stage implementation. If the legacy release mechanism retains some value or is too costly to replace at this point in time, then we can use our Stage Interface to conceal a fresh implementation that wraps around the legacy release mechanism until a strangulation time of our choosing.
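A minimal sketch of that Stage Interface (our pipeline was not written in Python, so this is illustrative rather than the actual implementation):

```python
from abc import ABC, abstractmethod

class Stage(ABC):
    """The Stage Interface: every pipeline stage, whether the default
    Repeatable Reliable Process or an application-specific wrapper
    around a legacy release mechanism, is invoked identically."""

    @abstractmethod
    def run(self, application: str, version: str, environment: str) -> bool:
        """Release the given application version into an environment,
        returning True on success."""
```

Callers depend only on `Stage#run`, so a legacy wrapper and the standard implementation are interchangeable from the pipeline's point of view.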

In the below example, our pipeline supports three applications – Apples, Oranges, and Pears. Apples and Oranges delegate to their own specific implementations, whereas Pears uses our standard Repeatable Reliable Process. A deploy of Apples will delegate to the Apples-specific pipeline stage implementation, which wraps the Apples legacy release mechanism.

In a similar fashion, deploying Oranges to an environment will delegate to the Oranges-specific pipeline stage implementation and its legacy release mechanism.

Whereas deploying Pears to an environment uses the standard Repeatable Reliable Process.
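The three cases above can be sketched as a registry of application-specific stage implementations with a fallback to the standard one (all class and application names are illustrative):

```python
class DeployStage:
    """Default implementation: the standard Repeatable Reliable Process."""
    def run(self, application, version, environment):
        return f"standard deploy of {application} {version} to {environment}"

class ApplesDeployStage:
    """Wraps the Apples legacy release mechanism until strangulation."""
    def run(self, application, version, environment):
        return f"legacy Apples deploy of {version} to {environment}"

class OrangesDeployStage:
    """Wraps the Oranges legacy release mechanism until strangulation."""
    def run(self, application, version, environment):
        return f"legacy Oranges deploy of {version} to {environment}"

# Only applications with a legacy mechanism need an entry; everything
# else (such as Pears) falls through to the default stage.
SPECIFIC_STAGES = {"Apples": ApplesDeployStage(), "Oranges": OrangesDeployStage()}

def deploy(application, version, environment):
    stage = SPECIFIC_STAGES.get(application, DeployStage())
    return stage.run(application, version, environment)
```

Strangling a legacy mechanism then amounts to deleting its registry entry, after which its deploys fall through to the standard Repeatable Reliable Process while the other applications are untouched.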

If and when we consider it valuable, we can update the pipeline and/or Apples application to support the standard Repeatable Reliable Process and subsequently strangle the Apples-specific pipeline stage implementation. Both Oranges and Pears are unaffected by this change.

Finally, we can strangle the Oranges-specific pipeline stage implementation at a time of our choosing and attain a single Repeatable Reliable Process for all applications.

It is important to note that even if the legacy pipeline stage implementations are never strangled, a significant amount of return on investment has still been delivered. Our applications are managed by our Continuous Delivery pipeline with a minimum of integration effort and a minimum of impact upon both applications and pipeline.

Continuous Delivery and organisational change

Continuous Delivery unaccompanied by organisational change will not reduce cycle time

Our Continuous Delivery value proposition describes a goal of reducing cycle time – the average time for a software release to propagate through to Production – in order to improve our time-to-market, saving time and money that can be invested back into product development and growing revenues. However, it is important to bear in mind that like any cross-organisation transformational programme Continuous Delivery is susceptible to Conway’s Law:

Any organisation that designs a system (defined broadly) will produce a design whose structure is a copy of the organisation’s communication structure

This extraordinary sociological observation predicts that multiple teams working on the same problem will produce disparate solutions, and that the structure of an organisation must be adaptable if product development is to remain sustainable. As a Continuous Delivery pipeline will likely traverse multiple organisational units (particularly in silo-based organisations), these are pertinent warnings that were addressed by Dave Farley and Jez Humble in the principles of Continuous Delivery:

  1. Repeatable Reliable Process
  2. Automate Almost Everything
  3. Keep Everything In Version Control
  4. Bring The Pain Forward
  5. Build Quality In
  6. Done Means Released
  7. Everybody Is Responsible
  8. Continuous Improvement

The majority of these principles are clearly focussed upon culture and behaviours, yet some Continuous Delivery implementations are entirely based upon Repeatable Reliable Process and Automate Almost Everything at the expense of more challenging principles such as Everybody Is Responsible.

For example, in our siloed organisation we are asked to improve the cycle time of an application from 28 days to 14 days, where the existing deployment and migration mechanisms are manual processes that each take 20 minutes to perform. We introduce a Continuous Delivery pipeline in which we Automate Almost Everything, we Keep Everything In Version Control, and we establish our Repeatable Reliable Process. However, despite deployment and migration now taking only 5 minutes each, our cycle time is unaffected! How is this possible?

To explain this disheartening situation, we need to use Lean Thinking and examine the value stream of our application. While our new release mechanism has reduced the machine time of each pipeline stage (i.e. time spent releasing an artifact), the process lead time (i.e. time required to release and sign off an artifact) is largely unaffected. This is because process lead time includes wait time, and in a siloed organisation there are likely to be significant handoff periods both during and between pipeline stages which are “fraught with opportunities for waste”. If the deployment and migration mechanisms have each been reduced to 5 minutes but a 3 hour handoff from server administrator to database administrator remains, our Repeatable Reliable Process will never affect our cycle time.
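A back-of-the-envelope version of that arithmetic, treating the handoff as the only wait time in this simplified stage:

```python
# Illustrative figures from the example: deploy and migrate were 20-minute
# manual steps, automated down to 5 minutes each, while the 3-hour
# (180-minute) handoff between server and database administrators remains.
HANDOFF_MINUTES = 180

def process_lead_time(deploy_minutes, migrate_minutes, handoff_minutes=HANDOFF_MINUTES):
    """Process lead time = machine time plus wait time."""
    return deploy_minutes + handoff_minutes + migrate_minutes

before = process_lead_time(20, 20)  # 220 minutes
after = process_lead_time(5, 5)     # 190 minutes
# Machine time fell by 75% (40 -> 10 minutes), yet lead time fell by
# under 14%: the untouched handoff now dominates, so cycle time is
# barely affected.
```

The numbers make the point starkly: after automation, roughly 95% of the remaining lead time is the organisational handoff, which no amount of tooling alone will remove.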

To accomplish organisational change alongside Continuous Delivery, the most effective method of breaking down silo barriers is to visualise your value stream and act upon waste. Donella Meadows recommended that to effect organisational change you must “arrange the structures and conditions to reduce the probability of destructive behaviours and to encourage the possibility of beneficial ones”, and a pipeline containing a Repeatable Reliable Process is an excellent starting point – but it is not the end. Visualise your pipeline, educate people on the unseen inefficiencies caused by your organisational structure, and encourage an Everybody Is Responsible mentality.

© 2025 Steve Smith
