One development team’s journey into DevOps (Part 2)


In my last post, we talked about the first principle in continuous delivery: building a good safety net. We started with a fast and reliable rollback system so we could quickly get healthy again when we made mistakes.

For this post, I am going to talk about step two: building a culture of test automation. This one seems pretty obvious, right? After all, who would argue with having good test automation for our product’s features so that we could be sure we are doing the job well?  Well, nobody would really argue, but resistance showed up in lots of passive forms:

  • “We have deadlines to meet and won’t get all of our code done if we also have to write a bunch more test automation.”
  • “Talk to ‘the other guys’ who have the job to make that stuff.”
  • “Running test automation will slow down our end-to-end build and therefore our team’s productivity will go down.”

Given these reactions from the team, we developed a sort of “Excuse Buster” set of principles and responses:

  1. Deadlines can be like the call of a siren – watch out!  This one seems obvious, and it is the subject of pretty much every book or article you read about good software development.  We all know that if you don’t bake quality in from the start, it is far more costly to try to beat it in toward the end.  Despite this, the lure of the nearest milestone is so tempting – maybe just this time I can go ahead and slam my code into the library and it will be fine.  I know what I am doing!  Before adopting a DevOps model, this kind of thinking simply killed us.  Our build success rates were very low, and because so many changes were going in each day, tracking down build problems was exhausting and slow.  The team simply thrashed.  The other disastrous side effect is that you create far more uncertainty about when you will be done in the later parts of your cycle, which is also the worst time to be discussing your ability to hit deadlines with your stakeholders.  Testing up front lets you really predict where you will land – you know what you have, and you know that it works.
  2. The ‘Other Guys’ aren’t coming.  They are you!  In the end, this is about accountability – do you, as a developer, feel truly accountable for delivering high-quality, highly consumable features to a user?  If you do, you will want to spend time on great test automation to prove to yourself that you can be proud of your work.  If you don’t, then you think it is someone else’s job (which really means you think we are only doing this because some manager said so).  Again, the literature is clear on this topic – if you think from the beginning about how to test something and prove it works well, your designs and implementations will benefit from that thinking process alone.  Add a good test harness to prove it day after day and to ensure that changes elsewhere in the system don’t ripple side effects into your feature, and you can sleep with a smile on your face each night.
  3. Builds have to be like breathing – you don’t even think about it.  If you run test automation suites after your builds finish, it is certainly true that it will take longer before a “green build” can be declared.  The real question you have to ask is, “How many builds are going to break, and what’s the productivity cost of each?”  As I said before, our team simply thrashed – we had 30 or 40 code changes going in each day.  So, we would finish a build, run a minimal “sniff test,” and then hand it over to the test team.  They would install it in several environments (which would take many hours), start to test, and then after an hour or so run into some significant blocking issue.  The chief programmer would then start collecting evidence (blood spatter, hair samples) in order to find the guilty party and bring them to justice.  That might take a couple more hours.  Then the person would make the fix and check it into the library.  Respin.  Install again.  Begin the tests again.  Meanwhile, the rest of the team was prevented from checking in other changes that were needed to enable other tests.  In the end, this was an incredibly costly way to do business – far too many builds, far too long a cycle to find a blocking problem, and that cost multiplied quickly given the number of different environments and topologies we needed to try.  Implementing a strong automated test system, so that we were confident each build could be used, tested and enhanced, saved us countless hours per week across hundreds of team members worldwide.  It’s the difference between playing offense on your project rather than being forced to play defense.
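The cycle in point three boils down to a simple gate: a build is not “green” until the automated suite has run against it, so a blocking problem surfaces in minutes rather than after a multi-hour install.  As a minimal sketch – the stage names and `make` targets here are illustrative, not our actual pipeline – such a gate might look like this:

```python
import subprocess
import sys

def green_build(stages):
    """Run each pipeline stage in order; the build is 'green' only if all pass.

    `stages` is a list of commands, e.g. [["make", "build"], ["make", "test"]].
    Stops at the first failure so the broken stage is obvious immediately.
    """
    for cmd in stages:
        if subprocess.run(cmd).returncode != 0:
            print(f"Build red at stage: {' '.join(cmd)}", file=sys.stderr)
            return False
    return True

# Hypothetical stages: compile, fast sanity checks, then the full suite.
PIPELINE = [
    ["make", "build"],
    ["make", "smoke-test"],
    ["make", "regression"],
]
```

Only when `green_build(PIPELINE)` returns True would the build be handed to the test team for environment installs, which is what keeps the expensive later stages from becoming the place where blocking problems are discovered.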

The method for change

In the end, we chose to dictate this from the top to overcome our passive resistance.  We needed to give our team members “permission” to invest in test automation by stating it as a mandatory part of their job.  Code check-ins could not happen without this also being present.  This eliminated much of the deadline tradeoff problem and also aligned the accountability and pride where we wanted it – with each of our team members.  To overcome the startup costs of putting this in place (the ROI problem), we also defined a baseline of tests that had to be done immediately – before any further enhancements could be checked in.  This baseline, although not exhaustive, represented our main use cases and allowed us to start to capture 80 to 90 percent of our problems immediately.  After that, all check-ins from that point forward were required to include test automation to cover that feature or that bug fix.  Code committers reviewing those check-ins were asked to also review the test automation to ensure appropriate coverage.
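A check-in gate like the one described above can itself be automated, so the rule is enforced mechanically rather than left to reviewer memory.  Here is a minimal sketch, assuming a repository layout where production code lives under `src/` and test automation under `tests/` (both paths, and the base branch name, are illustrative); it could run as a pre-push hook or an early CI step:

```python
import subprocess

def changed_files(base="origin/main"):
    """List files changed relative to the base branch, via `git diff --name-only`."""
    out = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True, text=True, check=True,
    )
    return [f for f in out.stdout.splitlines() if f]

def has_test_coverage(files, code_prefix="src/", test_prefix="tests/"):
    """True when the changeset touches no production code, or also touches tests."""
    code = [f for f in files if f.startswith(code_prefix)]
    tests = [f for f in files if f.startswith(test_prefix)]
    return not code or bool(tests)

def gate(base="origin/main"):
    """Exit non-zero so the hook or CI job blocks the check-in."""
    if not has_test_coverage(changed_files(base)):
        raise SystemExit("Rejected: code changes must include test automation.")
```

A directory-prefix check like this is deliberately crude – it confirms that *some* test changed, not that coverage is appropriate – which is why human reviewers were still asked to judge whether the accompanying automation actually covered the feature or fix.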

Next time, I will talk about how we are applying the output of our cloud tools in changing the way we do our development and support work.

What did you use to embrace test automation as a way of life?  Carrot or stick?
