Cultivate a culture of discovery and fail-fast to ensure DevOps success

Transitioning to a DevOps culture is never an easy task, given the number of stakeholders you need to get on board just to get it going. To reduce resistance and remove obstacles to adopting a DevOps culture, you need to get higher-ups involved and also work at the grassroots level to convince development teams and operations teams that they can work more efficiently and quickly under the new culture. To successfully implement a DevOps transformation, managers need to foster a culture of discovery and fail-fast. This minimizes the impact of a shoddy transformation on customers and end-users by ensuring that errors and departures from requirements are caught in the initial stages of development.

Discovery

A DevOps transformation always begins with a discovery process that involves determining the objectives each stakeholder has for the transition to a DevOps culture. The stakeholders involved include development and operations teams, of course, but also the quality assurance (QA), business, and change management teams. You need to ensure that all the stakeholders affected by the transformation have clarity and consensus on the definitions of DevOps and of continuous integration and continuous delivery (CI/CD). They need to align their goals and priorities for the transition to successfully integrate their functions.

Discovery not only clarifies the goals teams will work toward as they integrate but also clarifies the current practices and processes in place. Team members may need to assess the structure and flow of existing applications that have been in place for a while. Determining this can be a time-consuming and error-prone process if done manually. It’s essential for developers to automate the process by which they determine the structure of existing applications to reduce the time taken and scope for errors. This early assessment also gives developers an idea of how they can go about automating the development and deployment of these applications as the DevOps culture kicks in and team members start adopting end-to-end responsibility for products.

In the long run, automating the discovery process for applications can help developers analyze the impact of updates on these applications more accurately and comprehensively, increasing the likelihood that the updates will succeed in production. Eventually, the goal is to also automate impact analysis so that potential issues or errors are identified before product deployment.
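One way to picture automated impact analysis is as a walk over a dependency map produced by the discovery step. The sketch below is a minimal illustration, not a description of any particular tool: the component names and the `dependencies` map are hypothetical, and a real map would come from whatever discovery tooling your teams use.

```python
from collections import defaultdict, deque


def build_reverse_graph(dependencies):
    """Invert 'component -> what it depends on' into 'component -> what depends on it'."""
    dependents = defaultdict(set)
    for component, deps in dependencies.items():
        for dep in deps:
            dependents[dep].add(component)
    return dependents


def impacted_components(dependencies, changed):
    """Return every component that directly or transitively depends on a changed one."""
    dependents = build_reverse_graph(dependencies)
    seen = set()
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            # Skip anything already recorded, and the changed components themselves.
            if dependent not in seen and dependent not in changed:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical application map, as a discovery step might produce it.
dependencies = {
    "web-frontend": ["orders-api", "auth-service"],
    "orders-api": ["billing-lib", "orders-db"],
    "auth-service": ["users-db"],
    "reporting-job": ["orders-db"],
}

# An update to billing-lib affects orders-api and, through it, web-frontend.
print(sorted(impacted_components(dependencies, {"billing-lib"})))
```

Keeping the map in a plain data structure means the same walk can run in a pipeline step before deployment, so the list of affected components is available to whoever reviews the update.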

Preparing teams to commit to the transition to DevOps can be tricky if the right culture is not established. Old incentives and goals need to be discarded in favor of collective goals that promote the integration of teams and encourage them to learn about each other’s processes and responsibilities: from the development process to the bug-testing process, to the management of customer feedback and requirements. Success may seem to be the most important goal to work toward, but a DevOps culture can only thrive where failure and experimentation are allowed and even encouraged.

Fail-fast

No one in management ever tells you to fail. It seems rather counterintuitive given that failure is often devastating to a business. When every other team is measuring their effectiveness in terms of their successes, the last thing you’d want to do is encourage your team to go the opposite way and fail. Interestingly, that’s exactly what a DevOps culture needs, and the faster you fail, the faster you find yourself with successful outcomes. This is where a fail-fast culture comes in.

The point of fail-fast in DevOps is not to maximize failure but rather to encourage development teams to experiment in a structured environment where the quicker they fail, the quicker they can discover ways to improve upon systems and products. This “fail-fast” mechanism, popularized in Agile environments, can be carried over to DevOps environments, provided that managers limit the costs of failure, especially its impact on customer service. If developers are allowed to fail early on in the development process, they are more likely to spot security defects and errors before a product goes into deployment. This minimizes the likelihood of finding a severe flaw in an application just before it is rolled out to the end-users.


Teams need to learn to perceive every failure as a means of fine-tuning processes and products and spotting security defects. The only way to ensure that your team is completely willing to fail is to move away from a culture of blame and toward an environment where failures can be openly dissected in front of the team by the developers responsible for them without the fear of ramifications. When developers are allowed to dissect their flaws around their colleagues, they are more likely to report potential flaws before the product is in its final stages of deployment. Team members also learn not to repeat the same mistakes, reducing the likelihood that the same error will recur in future iterations.

Unlike in an Agile environment, where developers work on their own and can fail often without major costs, failure in a DevOps environment can be more costly. It’s important for intentional failures to be planned and carried out in a controlled environment to minimize costs. Some strategies can help you roll out new software that may contain serious flaws without damaging your reputation with customers. One of them is to run static analysis on the altered components of an application and to test the updates in a sandbox before release. When developers check updates this way, they are more likely to spot security defects and serious issues before the updates are rolled out to the end-users.
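As a rough illustration of static analysis limited to altered components, the sketch below parses only the files that changed and flags calls to a short list of risky built-ins. It uses Python's standard `ast` module; the `RISKY_CALLS` list, the function names, and the sample file are assumptions made for the example, and a real team would more likely rely on an established analyzer with a far richer rule set.

```python
import ast

# Illustrative rule set only: real analyzers check far more than this.
RISKY_CALLS = {"eval", "exec"}


def find_risky_calls(source, filename="<changed file>"):
    """Return (filename, line, call name) for each risky built-in call in the source."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        # Only plain-name calls such as eval(...) are matched in this sketch.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in RISKY_CALLS:
                findings.append((filename, node.lineno, node.func.id))
    return sorted(findings)


def scan_changed_files(changed_sources):
    """Scan only the altered files, given as a dict of filename -> source text."""
    findings = []
    for filename, source in changed_sources.items():
        findings.extend(find_risky_calls(source, filename))
    return findings


# Hypothetical altered file from an update under review.
changed = {"pricing.py": "def total(expr):\n    return eval(expr)\n"}
for filename, line, name in scan_changed_files(changed):
    print(f"{filename}:{line}: call to {name}()")
```

Nothing in the scanned file is executed, which is what makes this kind of check cheap to run on every change before the update ever reaches a sandbox.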

Another way for updates and changes to be allowed to fail without wide-ranging consequences is to test them on a smaller user base before applying the updates to all users. In fact, this can be done in increments, starting with a tiny base of carefully selected users (often employees in your organization), before deploying the change to increasingly larger user bases, and finally rolling out the update to all the end-users. In case a vulnerability is discovered at any stage in this phased rollout process, the update can be rolled back and the previous version of the application can be deployed.
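The phased rollout described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the stage percentages, the `MAX_ERROR_RATE` threshold, and the function names are all hypothetical, and the error rate is supplied by a caller-provided function standing in for whatever monitoring you actually have.

```python
import hashlib

# Hypothetical rollout plan: share of users (in percent) at each stage.
STAGES = [1, 5, 25, 100]
# Hypothetical threshold: roll back if the observed error rate exceeds this.
MAX_ERROR_RATE = 0.02


def bucket(user_id):
    """Map a user ID to a stable bucket from 0 to 99 using a hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def sees_update(user_id, percent):
    """A user gets the update once the rollout percentage passes their bucket."""
    return bucket(user_id) < percent


def run_rollout(error_rate_for_stage):
    """Advance stage by stage; return (final percent, rolled_back).

    error_rate_for_stage is a callable taking the stage percentage and
    returning the error rate observed at that stage.
    """
    for percent in STAGES:
        if error_rate_for_stage(percent) > MAX_ERROR_RATE:
            # Problem found: return everyone to the previous version.
            return 0, True
    return 100, False


# Simulated monitoring: errors spike once a quarter of users have the update.
print(run_rollout(lambda percent: 0.10 if percent >= 25 else 0.0))
```

Because the bucket is derived from a hash of the user ID, a user who received the update at an early stage keeps it at every later stage, so each increment only adds users rather than reshuffling them.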

Cultivating a discovery and fail-fast culture in DevOps helps developers spot security defects early on in the process, but these developers also need to be equipped with the tools they need to fix the defects efficiently. The point of DevOps is to increase the speed of product deployment and increase efficiency, but if developers aren’t trained to fix issues promptly, this can cause delays in the pipeline. To ensure that the CI/CD pipeline runs smoothly, developers need to constantly update their skills and share their learnings. A collaborative environment in which employees involved at different stages of the process can step in at any stage of product development and deployment thanks to their shared knowledge and skillsets allows the CI/CD pipeline to run smoothly. There will always be kinks to iron out as you transition to a DevOps culture, but as long as all stakeholders are on the same page about their shared goals and feel safe to try new things out in a blame-free environment where fail-fast leads to fast success, the transformation will be much smoother for everyone involved.

Images: Pixabay

Twain Taylor

My interests lie in DevOps, IoT, and cloud applications. I began my career in tech B2B marketing at Google India, after which I headed marketing for multiple startups. Today, I consult with companies in The Valley on their content marketing initiatives, and write for tech journals.
