DevOps Case Studies
This is a bank of case studies on DevOps to help those tasked with implementing CI or CD. DevOps can be a reality for organisations large and small, independent of sector and technology. Because Daysha is focused on larger indigenous organisations, our case studies are more oriented towards financial institutions with legacy technologies.
This material has been assembled over the last 2 years by the team at Daysha through attendance at meetups in Dublin and London, conferences in Europe and the US, and interactions with our various technology partners.
In total, you would need 4-6 hours to digest all the material. Many of the presentations run for at least 30-40 minutes.
More probably you will want to save this page as a bookmark and return here as you need more information.
Notes from meetings are available as Word files. If you would like this material, please email us and we will send it on.
The content is ordered by growing maturity in DevOps.
Companies that are starting their journey are near the top. Those that are fully implemented, using microservices and thinking about dynamic auto-provisioning etc., are nearer the bottom.
1. Standard Bank discover DevOps
This is a story about a bank that grew frustrated with the pace of delivery from IT and decided to accelerate their journey to production through faster release cycles. There are a total of 6 blog articles outlining how they fast-tracked their way to DevOps through piloting.
Since this blog series, Standard Bank now have internet banking applications in a continuous delivery pipeline. A further 3 projects are adopting the practices. There is a white paper from Chef on the fast-track approach; please email me for this.
2. IG’s 4-year DevOps journey
IG runs an online financial investment trading application and has applied DevOps to four different projects, including a legacy system that was quite fragile. The first presentation, made by Joe McKevitt, focuses on the incentive for people to buy into the processes and how their 4-year journey panned out. It includes some interesting insights into developing a community of best practice and engaging partners in the process. There is interesting content on Joe’s personal motivations that will resonate with all IT professionals.
The second presentation (below) is given by Gustavo Elias, the person who led the legacy system project at IG. This is aimed at a more technical audience.
3. Just Eat’s DevOps since 2010
This is a mature DevOps team working at the online food retailer that doesn’t produce any food! They scaled from 20 to 200 people, moved from Denmark to London in 2010, and put a CD process in place over the last 6 years. This team didn’t have a ‘legacy’ as such – their software processes up to 1200 food orders per minute. En route to a release cycle of double-digit features per day, they migrated from a data centre to the AWS cloud.
This video runs 85 minutes but we have notes of this presentation – email us.
4. HMRC DevOps at pace
This is a presentation on how HMRC have adopted DevOps for the delivery of tax collection services on a microservices architecture that is hosted in the cloud. Note – on the second busiest tax collection day they switched cloud providers. The presentation builds towards that point, so its primary focus is on how different architectures in two different companies contrast. The HMRC material starts at around minute 8. There are circa 30 scrum teams involved.
5. Daily Telegraph
This presentation was made at the Atlassian Summit conference in Barcelona on May 3rd 2017 by Carol Johnson.
What is immediately clear is how the business culture drove IT to be lean. The paper has been around since the 1850s but in recent times has won a series of firsts as a digital product, and in doing so has had to adapt its content production and business models. This business culture made it easier for IT leadership to succeed as innovators on their DevOps journey.
The next most striking feature was the persistence and patience required to get to ‘what good looked like’. There were several cul-de-sacs, and while they are still not finished, one of the most significant recent breakthroughs was empowered teams: teams who owned the delivery of their code into production and its support through the initial production release.
The newspaper business has been digitally disrupted since 2009. Back then at DT there was an enterprise focus from an ops perspective, which meant clunky change processes, poor communication with the business and a product backlog that was First In First Out … in effect chaotic. There was no sense of the cost or value of a feature, only a reaction to the business owner shouting loudest. There was a one-year backlog.
The 1st digital strategy… GIOT
This was to start moving some apps towards a SaaS model. They also started moving VMs into their own data centre. The catch phrase at the time was GIOT – Get It Out There.
They built fast, reduced costs and used self-healing and auto-scaling/load-balancing systems. However, this resulted in snowflake configuration problems, leading to fragility in applications. They experienced significant problems through poor test automation. Operations and service management bore the brunt of these failings.
The 2nd digital journey … More Engineers.
This led to more dev capacity, more products and faster delivery. There were more teams, including offshore, but there was still a handover to Ops in the process via ‘DevOps’. Ops teams did not see any benefit from this approach. There were further challenges in the form of website speed. So how did they get to faster releases and improved response times on their site?
One of the learnings from this phase was around solution architectures based on their 3 R’s design principle. When building and releasing software they design so that they can recover in the following sequence:
1. Restart the app – this is the fastest
2. Reboot an instance – this is fast
3. Relaunch the service – this is slowest but fixes most things
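The recovery sequence above can be read as an escalation ladder: try the cheapest fix first and only move to the next one if the service is still unhealthy. The following is a minimal sketch of that idea, not the Telegraph's actual tooling. The function name `recover` and the stand-in actions are hypothetical; in practice each action would call whatever restart, reboot or relaunch mechanism the platform provides and then run a health check.

```python
from typing import Callable, List, Optional, Tuple

# Each recovery step is a (name, action) pair. The action performs the fix
# and returns True if the service is healthy afterwards.
# All names here are hypothetical and used for illustration only.
RecoveryStep = Tuple[str, Callable[[], bool]]


def recover(steps: List[RecoveryStep]) -> Optional[str]:
    """Try each step in order (fastest first).

    Returns the name of the first step that restores health,
    or None if every step fails.
    """
    for name, action in steps:
        if action():
            return name
    return None


if __name__ == "__main__":
    # Stand-in actions: here the app restart does not help,
    # so the ladder escalates to rebooting the instance.
    ladder = [
        ("restart app", lambda: False),
        ("reboot instance", lambda: True),
        ("relaunch service", lambda: True),
    ]
    print(recover(ladder))
```

Ordering the steps from fastest to slowest keeps the typical recovery time low, while the final relaunch step acts as the catch-all that "fixes most things".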
This phase led to 25% more releases, 10% fewer failures on new features and 7% less downtime.
The 3rd stage … Working Together.
This stage was the most arduous for all concerned, and they arrived in the end at a good solution somewhat by chance but mostly through good management. They were still somewhat unclear on what DevOps meant. They had read the books, but it still didn’t feel right and further change was needed. They looked at what other companies were doing and felt they were off.
In all earlier phases Dev and Ops were siloed, and this had to change if releases were to be delivered more quickly with higher quality and lower downtime. There was something of an accident of birth about how this finally came together.
Dev teams were working on a series of API projects which were described as POCs. Out of the blue they explained that they would like to put these into production, which made the Ops folks very nervous, but Dev stepped up to the plate and said they were prepared to see these through to production.
They agreed that the dev who wrote the release notes would be on call. This was the start of integrated teams. They had ‘DevOps’ guys (release automation) in phase 2, but they became a bottleneck, so they started to assign a release lead to each dev stream and this emerged as an optimal structure.
Today the DT teams say ‘we build, we support, we own’. Their rules are:
- Fast build, fail and fix
- Self select tools and guilds
- Seamless and clean processes
Ops keep systems healthy and they rotate around the dev teams. Ops folks are trained in the guilds (in-house experts who train others to their own level so there are fewer points of failure). Ops own logging, monitoring and security and, for example, improve error messaging and scripting; in the future there is a thought that Ops will code.
Changing change control was a big job, as there was a culture of clunkiness in how this was done, but it did provide the business with a level of confidence in how production was progressing. Starting in 2009 the CAB (Change Advisory Board) met weekly and a change took 8 days to pass through a 4-step process. The next iteration was a twice-weekly meeting with a 2-step approval process, and it took 5 days on average. Now the CAB is daily: if a change is automated it doesn’t go through CAB; otherwise there is a 1-step process which takes one day to pass/fail.
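The current change-control rule described above is simple enough to express as a small routing function. This is only an illustrative sketch under stated assumptions: the `Change` type, the `automated` flag and the returned labels are hypothetical names, not part of any Telegraph system.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """A proposed production change (hypothetical structure)."""
    description: str
    automated: bool  # True if the release is fully automated


def route_change(change: Change) -> str:
    """Apply the rule from the text: automated changes bypass the CAB,
    everything else goes through the daily 1-step review."""
    if change.automated:
        return "deploy without CAB"
    return "daily CAB, 1-step review"


if __name__ == "__main__":
    print(route_change(Change("pipeline release of API service", automated=True)))
    print(route_change(Change("manual database migration", automated=False)))
```

Keeping the rule this small makes the incentive explicit: the quickest route to production is to automate the release.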
Today there are daily stand-ups and feedback loops, fortnightly sprints and daily releases.
Carol’s parting advice
Integrate the teams early – start with experiments, short feedback loops and fix what hurts. Don’t be afraid to change process, but verify that the new process is working through strong metrics.
Above all else empower teams – they want to be empowered more than you want them to be.