Kubernetes adoption is at an all-time high as more organizations venture into the world of microservices and distributed workloads. Kubernetes offers quick failover, scalability, flexibility, and high availability, features that are enticing to organizations that have been running extremely expensive monolithic workloads. However, as with everything, Kubernetes’ benefits have limits. This becomes quite apparent when working with stateful applications or workloads, specifically, Kubernetes data storage.
The Kubernetes data storage problem
Kubernetes orchestrates containers: it spins them up at run time, provides them with the resources required to perform a particular process, and then destroys them when they are no longer needed. Because these containers are independent of each other and ephemeral, any data a container holds is destroyed along with it, leaving no way to recover it once the container is gone. With stateless workloads, this doesn’t matter. However, if you are trying to containerize stateful workloads, it becomes a problem: without a proper way to maintain the state of your workload, you won’t be able to run it at all, let alone enjoy the benefits K8s has to offer.
Organizations can find workarounds to maintain state. However, this is not advisable because of the steep learning curve involved and the sheer effort required to develop a proper workaround. An organization may be able to create a solution that works for its workloads, but onboarding new engineers onto such custom infrastructure becomes challenging unless the organization has unusually high retention. Another major issue with such workarounds is vendor lock-in: the tools in your infrastructure grow entirely interdependent as your workload grows, which undermines portability, an essential benefit of K8s. In the end, you might be stuck with just another monolith and no added benefit.
Leveraging external storage to hold the state of an application can be tedious, requiring constant monitoring and a lot of manual intervention. Ideally, storage should be available just like any other resource in K8s, so that data survives once a container is shut down. Fortunately, there are solutions built to do just that. These cloud-native tools, called storage orchestrators, gather cluster-wide storage into a shared pool. Containers can claim volumes from this pool as needed, and once they go offline or fail, the data remains available in the shared pool. With these tools, you can avoid the hassle of wiring up external storage to hold the state of your workloads.
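The Kubernetes-native way to claim storage from such a pool is a PersistentVolumeClaim, which storage orchestrators typically satisfy through a dynamic provisioner. A minimal sketch is shown below; the storage class name and all resource names are illustrative assumptions and depend on which orchestrator you run:

```yaml
# Hypothetical PVC: requests 10 GiB of storage from a storage class.
# "my-storage-class" is an assumed name your orchestrator would provide.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: my-storage-class
---
# A pod mounting the claim. If this pod fails or is destroyed, the
# data in the volume survives and a replacement pod can remount it.
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data
```

Because the claim is a first-class Kubernetes object, it has a lifecycle independent of any pod, which is exactly the decoupling stateful workloads need.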
Storage orchestrators like Rook, Portworx, and Longhorn, among others, help solve the Kubernetes storage problem and let you enjoy the benefits of Kubernetes even with stateful workloads. Since these tools are platform agnostic, you can easily migrate your workloads or host them on hybrid infrastructure. With storage orchestrators, you can enjoy the benefits of K8s without making compromises along the way. These tools have become very popular because most organizations hit the storage problem while migrating their tightly coupled monolithic workloads.
Let’s take a look at what’s new in the Kubernetes data storage space.
Portworx expands its K8s storage catalog with DBaaS and enhanced PX-Backup
Portworx is a leading end-to-end data management solution for applications running on Kubernetes. It protects your data with container-granular encryption and makes it available across platforms. Portworx combines the storage hardware across your clusters into a cluster-wide pool that is resilient and remains available even if individual nodes fail. Portworx also creates independent, application-consistent backups that can be ported across platforms, making it easy to migrate workloads via backups. With the help of Portworx, organizations can leverage hybrid infrastructure without fear of vendor lock-in.
Workloads running on Kubernetes may require various data services like databases, queues, streaming, and messaging, among others. Provisioning all of this manually is complicated; factor in the different environments and the constant scaling of these resources, and your DevOps teams are left with no time to work on new releases. To address this, Portworx has unveiled a new solution called Data Services. Portworx Data Services spins up production-ready data services with just one click: you choose from a wide variety of data services without worrying about manual intervention, and Data Services provisions highly scalable, resilient instances with fully automated day-2 operations. Applications created using Portworx Data Services support staple Portworx features like backup and restore, disaster recovery, data protection, migration, and capacity management. Data Services also provides the same DBaaS experience across clouds and on-prem, keeping with the platform-agnostic tradition of storage orchestration solutions.
New PX-Backup enhancements now allow organizations to create backups and restore them anywhere, on-prem or in the cloud. PX-Backup offers 3-2-1 backup policy support, meaning organizations can maintain three copies of an application (the production copy, a snapshot, and a backup copy) across disk and object storage. This way, organizations can offload backups to secondary locations of their choice for added flexibility. PX-Backup users can now also protect their workloads with the role-based access control and encryption provided by PX-Secure.
Rook v1.7 unveils Ceph Cluster Helm chart and file mirroring
Rook is an open-source project hosted by the CNCF. It graduated last year and was one of the very first Kubernetes storage projects. Rook leverages K8s to run storage the way you run applications. At its core is the Rook operator, which configures the data storage you want and automates tasks like deployment, bootstrapping, configuration, provisioning, scaling, upgrading, migration, disaster recovery, monitoring, and resource management. Rook takes your distributed storage and makes it self-managing, self-scaling, and self-healing so it works well with your cloud-native workloads.
With its latest release, Rook ships a Ceph Cluster Helm chart that allows users to deploy and configure Ceph resources like the CephCluster CR (Custom Resource), CephBlockPool, CephFileSystem, CephObjectStore, and the Toolbox. Users who don’t want the Helm chart can still create these resources from the example manifests.
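For readers who take the example-manifest route instead of the Helm chart, a block pool resource looks roughly like this. The pool name is illustrative, and the namespace assumes Rook's conventional `rook-ceph` default:

```yaml
# Sketch of a CephBlockPool custom resource managed by the Rook operator.
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool      # illustrative name
  namespace: rook-ceph   # assumes the default Rook namespace
spec:
  failureDomain: host    # spread replicas across hosts
  replicated:
    size: 3              # keep three copies of each object
```

Once applied, the operator reconciles the pool inside the Ceph cluster, so the desired storage layout lives in version-controllable YAML just like any other Kubernetes resource.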
Rook also adds file mirroring. Unlike block mirroring, file mirroring is entirely snapshot-based: you can schedule mirroring snapshots and configure their retention. The feature is not yet fully stable, but Rook promises you won’t lose mission-critical data. Another addition in this release is protection against accidental resource deletion. Rook will now block deletion of a CephCluster while other resources still depend on it; the cluster can only be deleted once those dependent resources have been removed.
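As a rough illustration of the snapshot-based approach, mirroring is enabled per filesystem in the CephFilesystem spec. The field names below follow the v1.7 documentation but may evolve, and the filesystem name, schedule, and retention values are assumptions for the sketch:

```yaml
# Sketch: a CephFilesystem with snapshot-based mirroring enabled.
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs             # illustrative name
  namespace: rook-ceph   # assumes the default Rook namespace
spec:
  metadataPool:
    replicated:
      size: 3
  dataPools:
    - replicated:
        size: 3
  metadataServer:
    activeCount: 1
  mirroring:
    enabled: true
    snapshotSchedules:
      - path: /
        interval: 24h    # assumption: take a mirroring snapshot daily
    snapshotRetention:
      - path: /
        duration: 7d     # assumption: keep a week of snapshots
```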
Kubernetes storage solutions: A work in progress
Organizations want to leverage Kubernetes for the benefits it has to offer, but getting wrapped up in the logistics of the migration can be daunting. In such cases, Kubernetes storage solutions are the best option. A plethora of tools address the K8s storage problem, and all of them are evolving rapidly to match the pace of Kubernetes adoption. Storage orchestrators have a lot to offer, but they aren’t all the same. The Kubernetes storage space is quite volatile at the moment, so organizations need to stay updated on what’s new in the market and which tool does what best.
Featured image: Designed by FullVector / Freepik