Open source software offers more flexibility than any licensed alternative. In other words, open source enables vendor independence, which means you’re never locked into a vendor’s costs, buying structures, or redistribution terms. Additionally, with millions of developers contributing their efforts for free, it’s really hard for proprietary software to keep up. That’s probably why so many organizations donate their favorite open source DevOps tools to the community. It’s kind of like sending your kids to college for free (and never having to worry about them again).
Now there really isn’t anything cooler than open source DevOps tools that a top organization has actually used in production. This is what you call a win-win situation where the community gets a tried-and-tested tool and the organization gets an army of free developers.
1. Airbnb — Synapse
Airbnb is a top contributor to open source DevOps tools, and the combination of Nerve and Synapse makes the difficult task of service discovery in the cloud a lot easier.
Synapse is Airbnb’s new system that uses “watchers” to connect internal services together in a scalable, fault-tolerant way, hence solving the problem of automated failover in the cloud. Synapse comes with a number of watchers, which are responsible for service discovery and reconfiguring the proxy so it always points to available servers. Default watchers include ones that query ZooKeeper and use the AWS API. It is also relatively easy to write your own watchers and users are encouraged to submit them back to the project.
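To make the watcher idea concrete, here is a minimal sketch in Python. It is not Synapse's actual code or API: the `ProxyConfig` and `PollingWatcher` classes, the in-memory `registry` list, and the addresses are all hypothetical stand-ins, with the list playing the role that ZooKeeper or the AWS API would play in a real deployment.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ProxyConfig:
    """Hypothetical stand-in for the proxy configuration a watcher rewrites."""
    backends: List[str] = field(default_factory=list)
    reloads: int = 0

    def update(self, backends: List[str]) -> None:
        new = sorted(backends)
        # Only reconfigure when the set of available servers actually changed.
        if new != self.backends:
            self.backends = new
            self.reloads += 1


class PollingWatcher:
    """Hypothetical watcher: asks a discovery source for live servers, then updates the proxy."""

    def __init__(self, discover: Callable[[], List[str]], proxy: ProxyConfig) -> None:
        self.discover = discover
        self.proxy = proxy

    def poll(self) -> None:
        self.proxy.update(self.discover())


# Simulated registry (assumption); a real watcher would query ZooKeeper or the AWS API.
registry = ["10.0.0.1:8080", "10.0.0.2:8080"]
proxy = ProxyConfig()
watcher = PollingWatcher(lambda: list(registry), proxy)

watcher.poll()                    # first discovery populates the proxy
registry.remove("10.0.0.1:8080")  # a server goes away
watcher.poll()                    # the proxy now points only at the remaining server
```

Writing a custom watcher in this sketch would amount to supplying a different `discover` callable, which is roughly the extension point the paragraph above describes.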
2. Airbnb — Nerve
Nerve is a utility for tracking the status of machines and services and is the other half of the Nerve-Synapse combination from Airbnb. It should be noted, however, that both can be used independently as well. Nerve keeps track of what’s going on by running locally on the systems that make up the distributed services and reporting state information to a distributed key-value store.
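The reporting loop can be pictured roughly as follows. This is a sketch under stated assumptions rather than Nerve's real implementation: `KeyValueStore` is an in-memory stand-in for the distributed store, the key layout `/services/<service>/<host>` is invented for illustration, and removing the key on a failed check is an assumed policy.

```python
from typing import Callable, Dict


class KeyValueStore:
    """Hypothetical in-memory stand-in for a distributed key-value store."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self.data[key] = value

    def delete(self, key: str) -> None:
        self.data.pop(key, None)


def report(store: KeyValueStore, service: str, host: str,
           check: Callable[[], bool]) -> bool:
    """Run a local health check and publish the result to the store."""
    key = f"/services/{service}/{host}"  # assumed key layout
    if check():
        store.put(key, "up")
        return True
    # Assumed policy: an unhealthy instance is removed so consumers stop routing to it.
    store.delete(key)
    return False


store = KeyValueStore()
report(store, "search", "10.0.0.1:8080", lambda: True)   # healthy: registered
report(store, "search", "10.0.0.2:8080", lambda: False)  # unhealthy: not registered
```

A consumer such as a service-discovery watcher would then read the keys under `/services/search` to learn which hosts are currently available.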
Monitoring is essential to running cloud-native applications. This is especially true in the case of mobile apps. Monitoring mobile app crashes can be a pain given the numerous device types, OS versions, carrier differences, and more. There are a million ways an app can crash on a mobile device. What you need when running mobile apps is a crash reporting app that can give you all the details behind a crash and let you get to the root cause faster.
3. Netflix — ChaosMonkey
Netflix is without a doubt the DevOps poster child of the enterprise and it’s released numerous open source DevOps tools that have been built in-house. Here’s a look at a few of the really interesting ones.
Chaos Monkey is by far the most “outside-the-box” tool Netflix has come up with, for the sole reason that it’s a service that basically just causes havoc. The principle behind it is that the best way to avoid major failures is to fail constantly, so once the “monkey” identifies a group of systems, it randomly terminates one in the group to simulate failure. Named for the way it causes chaos like a wild monkey in a datacenter, Chaos Monkey has a configurable schedule that allows simulated failures to occur at times when they can be closely monitored.
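The core selection logic is simple enough to sketch in a few lines of Python. This is an illustration of the principle, not Netflix's code: the function names, the weekday 09:00 to 15:00 window, and the instance IDs are assumptions made for the example.

```python
import random
from datetime import datetime, time
from typing import List, Optional


def in_window(now: datetime, start: time = time(9, 0), end: time = time(15, 0)) -> bool:
    """True on weekdays between start and end (an assumed monitoring window)."""
    return now.weekday() < 5 and start <= now.time() < end


def pick_victim(group: List[str], now: datetime, rng: random.Random) -> Optional[str]:
    """Choose one instance of the group to terminate, or None outside the window."""
    if not group or not in_window(now):
        return None
    return rng.choice(group)


group = ["i-aaa", "i-bbb", "i-ccc"]  # hypothetical instance IDs
rng = random.Random(42)              # seeded so a run is reproducible

weekday_victim = pick_victim(group, datetime(2024, 1, 3, 10, 0), rng)  # a Wednesday morning
weekend_victim = pick_victim(group, datetime(2024, 1, 6, 10, 0), rng)  # a Saturday: skipped
```

Restricting terminations to a window like this is what lets the failures happen while engineers are around to watch and respond.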
4. Netflix — SimianArmy
Inspired by the success of Chaos Monkey, Netflix created a whole army of monkeys called SimianArmy that induce different kinds of failures. Latency Monkey induces artificial delays that simulate service degradation to check if upstream services respond appropriately. Conformity Monkey finds instances that don’t adhere to best practices and shuts them down, and there’s even a Doctor Monkey that taps into health checks that run on each instance and monitors other external signs of health.
Additionally, there are also Janitor Monkey, Security Monkey, and a 10–18 Monkey. Chaos Gorilla, in turn, is named for its ability to simulate an outage of an entire Amazon availability zone.
5. Netflix — Hystrix
Hystrix evolved from the “resilience engineering” work that the Netflix API team began in 2011 and is a library designed to control the interactions between distributed services (and stop cascading failures across them). Hystrix improves the system’s overall uptime and resilience by isolating points of access between the services and providing fallback options. Today, Hystrix is responsible for tens of billions of thread-isolated and hundreds of billions of semaphore-isolated calls being executed every day at Netflix.
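Hystrix itself is a Java library, so the following Python snippet is only a sketch of the semaphore-isolation-with-fallback idea, not its API. The `SemaphoreCommand` class and its `execute` method are hypothetical names chosen for the example.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")


class SemaphoreCommand:
    """Hypothetical wrapper that caps concurrent calls to one dependency."""

    def __init__(self, max_concurrent: int) -> None:
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def execute(self, run: Callable[[], T], fallback: Callable[[], T]) -> T:
        # Reject immediately instead of queueing when the dependency is saturated.
        if not self._slots.acquire(blocking=False):
            return fallback()
        try:
            return run()
        except Exception:
            # A failing dependency degrades to the fallback instead of propagating.
            return fallback()
        finally:
            self._slots.release()


def flaky_call() -> str:
    raise RuntimeError("dependency is down")


command = SemaphoreCommand(max_concurrent=1)
healthy = command.execute(lambda: "live data", lambda: "cached data")
degraded = command.execute(flaky_call, lambda: "cached data")
```

Because a saturated or failing dependency returns the fallback right away, callers are not left holding threads while they wait, which is how this pattern keeps one slow service from dragging down the services that depend on it.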
6. Netflix — Curator
To understand what Curator does, we first need to take a look at ZooKeeper, which is a high-performance “coordination” service for distributed applications that comes bundled with a Java client. It coordinates by exposing common services in a simple interface. Using it effectively, however, is considered labor intensive, which brings us to Curator.
Curator is a collection of three related projects, namely curator-client, curator-framework, and curator-recipes. The client is a replacement for the bundled ZooKeeper class and the framework is a high-level API that makes using ZooKeeper a lot easier. It also adds a lot of features with regard to managing connections. Curator-recipes are basically implementations of ZooKeeper recipes built on the curator-framework.
7. Netflix — Spinnaker
Spinnaker is yet another of the open source DevOps tools from the Netflix stable and is used in production by hundreds of teams over millions of deployments around the world. The whole idea behind Spinnaker is cloud cross-compatibility and it basically decouples your release pipeline from your cloud provider, making it easier to move around. It does this by combining a powerful pipeline management system with integrations to all the major cloud providers. Spinnaker can be installed locally, on-premises, or in the cloud, running either on a virtual machine or Kubernetes. It also has built-in support for Google Compute Engine, Google Container Engine, Google App Engine, AWS EC2, Microsoft Azure, Kubernetes, and OpenStack.
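The decoupling can be pictured as a pipeline that talks to an interface rather than to a specific cloud. The Python sketch below illustrates that design idea only; it is not Spinnaker's pipeline format or API, and `CloudProvider`, `FakeEC2`, `FakeKubernetes`, and `run_pipeline` are hypothetical names.

```python
from typing import List, Protocol


class CloudProvider(Protocol):
    """Hypothetical interface the pipeline depends on instead of a concrete cloud."""

    def deploy(self, image: str) -> str: ...


class FakeEC2:
    def deploy(self, image: str) -> str:
        return f"ec2:{image}"


class FakeKubernetes:
    def deploy(self, image: str) -> str:
        return f"k8s:{image}"


def run_pipeline(provider: CloudProvider, image: str) -> List[str]:
    """Run the same two stages (bake, then deploy) against any provider."""
    return [f"bake:{image}", provider.deploy(image)]


on_ec2 = run_pipeline(FakeEC2(), "web-v2")
on_k8s = run_pipeline(FakeKubernetes(), "web-v2")
```

Only the provider object changes between the two runs; the pipeline definition stays the same, which is the property that makes moving between clouds easier.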
8. Facebook — Presto
We can’t have a DevOps list without a Facebook feature, especially since Facebook uses, maintains, and contributes to a significant number of major open source projects, and its GitHub account has over 172 repos and tens of thousands of commits.
Presto is a distributed SQL query engine that over 1,000 Facebook employees use daily to run more than 30,000 queries, which together scan over a petabyte of data per day. Yes, a petabyte. In fact, Presto can run interactive analytic queries against data sources of all sizes. Facebook uses Presto for interactive queries against several internal data stores, one of which is a 300PB data warehouse.
The next two open source DevOps tools are interesting because they come from fierce competitors Uber and Lyft, and are also the two newest projects to join the CNCF. Envoy is an edge and service proxy and Jaeger is a distributed tracing system.
9. Lyft — Envoy
Envoy was originally built at Lyft to help it move away from a monolith and is a high-performance edge and service proxy that makes the network transparent to applications. It does this through its API-driven platform, which supplies and manages external connectivity across different services and allows them to communicate with each other. Since its release, Envoy has also received heavy contributions from both Google and IBM.
10. Uber — Jaeger
Jaeger is a distributed tracing system inspired by Google’s Dapper paper and the OpenZipkin community. It is designed to be used with distributed applications and was first deployed by Uber internally in 2015. It is currently used by Uber to monitor over 1,200 individual microservices, each of which may have multiple instances operating at any given time. It is also being used in production by a number of companies like Base CRM, Circonus, GrafanaLabs, Nets, Stagemonitor, Symantec, Red Hat and Zenly. Jaeger can be used to track problems across different services.
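What lets a tracing system follow a request across services is a shared trace ID plus parent links between spans. The Python sketch below shows that idea in memory; it is not the Jaeger client API, and the `Span` and `Tracer` classes, service names, and operation names are hypothetical.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    trace_id: str             # shared by every span in one request
    span_id: str              # unique to this unit of work
    parent_id: Optional[str]  # links a span to the one that caused it
    service: str
    operation: str
    start: float = field(default_factory=time.monotonic)
    duration: Optional[float] = None


class Tracer:
    """Hypothetical in-memory tracer that collects finished spans."""

    def __init__(self) -> None:
        self.finished: List[Span] = []

    def start_span(self, service: str, operation: str,
                   parent: Optional[Span] = None) -> Span:
        # A child inherits the trace ID; a root span starts a new trace.
        trace_id = parent.trace_id if parent else uuid.uuid4().hex
        parent_id = parent.span_id if parent else None
        return Span(trace_id, uuid.uuid4().hex[:16], parent_id, service, operation)

    def finish(self, span: Span) -> None:
        span.duration = time.monotonic() - span.start
        self.finished.append(span)


tracer = Tracer()
root = tracer.start_span("api", "GET /ride")
child = tracer.start_span("pricing", "quote", parent=root)  # downstream call
tracer.finish(child)
tracer.finish(root)
```

Grouping the finished spans by `trace_id` and following `parent_id` reconstructs the call tree for a single request, which is what makes it possible to see which service in the chain was slow or failing.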
The way open source DevOps tools add value to both the community and the organizations that release them to the public just cannot be denied. The concept of working with each other rather than against each other may be age-old, but it is demonstrated today more than ever by the way open source has basically just taken over the enterprise. When you have battle-tested tools that have been used by giant corporations like Facebook and Netflix at your disposal, who in their right mind would go looking for licensed software?