In the early days of computing, programmers shared software freely to learn from one another. Although software gradually became commercialized, free software still draws significant attention. Netscape was a pioneer in publishing the source code of its free software suite. The Open Source Initiative (OSI), founded in 1998, was driven in part by Netscape’s source release. OSI then inspired developers around the world to publish open source software, and the rest is history. The open source culture encouraged collaboration among developers, which resulted in higher quality software. Audits, quick fixes, updates, and license management are all easier when the software is open source. Here is a list of the top six cool new open source projects released over the past year.
Ludwig is a TensorFlow-based toolbox that allows you to train and test deep learning models without writing code. Incubated at Uber for the last two years, Ludwig was open sourced this February to incorporate contributions from the data science community. With Ludwig, a data scientist can train a deep learning model by simply providing a CSV file that contains the training data and a YAML file that declares the inputs and outputs of the model.
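As a rough illustration of those two files, the sketch below builds a tiny CSV and a matching YAML model definition using only the Python standard library. The column names `review` and `sentiment` and the sample rows are hypothetical. The `input_features` and `output_features` keys follow Ludwig's documented model definition format, but treat the exact feature types as an assumption to check against the Ludwig documentation for your version.

```python
import csv
import io

# Hypothetical toy dataset: one text input column and one category output column.
rows = [
    {"review": "great product, works well", "sentiment": "positive"},
    {"review": "stopped working after a day", "sentiment": "negative"},
]


def write_training_csv(rows, fieldnames):
    """Serialize training rows to CSV text, one column per model feature."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Model definition: each feature name must match a column in the CSV.
# Feature names are hypothetical; the key layout is assumed from Ludwig's docs.
MODEL_DEFINITION_YAML = """\
input_features:
  - name: review
    type: text
output_features:
  - name: sentiment
    type: category
"""

training_csv = write_training_csv(rows, ["review", "sentiment"])
print(training_csv)
print(MODEL_DEFINITION_YAML)
```

The point of the design is that the model is described declaratively: changing the task means editing the feature list, not writing training code.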
Ludwig was created with five fundamental principles:
To help diagnose latency problems in distributed microservice applications, Netflix offers an open source tool called FlameScope. FlameScope is a performance visualization utility that lets programmers and administrators analyze CPU activity. It renders a profile as a subsecond-offset heat map in which the user can select arbitrary spans of time. Selecting a portion of the heat map generates a flame graph for the corresponding block of time, which allows further analysis.
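The idea behind a subsecond-offset heat map can be sketched in a few lines: each sample's timestamp is split into its whole second (the column) and its offset within that second (the row), and the cell's count drives its color. The code below is a minimal illustration under those assumptions, not FlameScope's actual implementation; the function names and the `rows` parameter are hypothetical.

```python
import math
from collections import Counter


def subsecond_offset_heatmap(timestamps, rows=50):
    """Bucket sample timestamps (in seconds) into a {(column, row): count} grid.

    column = whole seconds elapsed since the first sample's second
    row    = which of `rows` equal slices of that second the sample falls in
    """
    if not timestamps:
        return Counter()
    start = math.floor(min(timestamps))
    grid = Counter()
    for t in timestamps:
        elapsed = t - start
        column = int(elapsed)
        # Clamp to the last row to guard against floating-point rounding.
        row = min(int((elapsed - column) * rows), rows - 1)
        grid[(column, row)] += 1
    return grid


def samples_in_range(timestamps, begin, end):
    """Return the samples inside a selected time span [begin, end).

    These are the samples a flame graph would be built from.
    """
    return [t for t in timestamps if begin <= t < end]


samples = [10.01, 10.02, 10.51, 11.99]
heatmap = subsecond_offset_heatmap(samples, rows=10)
selected = samples_in_range(samples, 10.0, 10.6)
```

Short bursts of activity that last only a fraction of a second show up as dense cells in such a grid, which is what makes it possible to pick out a narrow time span for closer inspection.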
Netflix has a long history of releasing utilities developed internally for performance analysis and debugging as open source software. The new visualization tool instantly generates flame graphs from the sections of system profiles. The tool is a boon to programmers who want to identify the origin of performance issues.
Uber has open sourced Kraken, its internal peer-to-peer Docker registry. Kraken was developed by the company’s cluster management team in early 2018 to solve the performance issues they faced with their legacy Docker registry stack. At Uber, Kraken is used internally for managing and distributing Docker images. According to Uber, it is also capable of distributing terabytes of data in seconds.
Kraken is designed for Docker image management, distribution, and replication in a hybrid cloud environment. With pluggable backend support, it can be plugged into existing Docker registry setups as the distribution layer. Kraken was designed with scalability in mind. According to Uber’s engineers, Docker containers are a building block of Uber’s infrastructure. As the size and number of compute clusters grow, a simple Docker registry with sharding and caching can’t keep up with the throughput required to distribute Docker images efficiently.
Elasticsearch is a distributed search and analytics engine. It lets you perform and combine many types of searches the way you want. Besides performing quick searches, it also scales well across a large number of servers. GitHub’s Vulcanizer is a library for operating Elasticsearch clusters.
Vulcanizer is good at listing the nodes of a cluster and updating cluster settings. It safely adds nodes to (and removes them from) the exclude settings to ensure that shards don’t unexpectedly allocate onto a node. This Golang library interacts with an Elasticsearch cluster. It is not a full-fledged API client but helps you with common tasks when operating a cluster. These tasks include querying health status, updating cluster settings, and migrating data off nodes. The idea of Vulcanizer was born out of the team’s frustration while trying to administer its clusters through a packaged chat app. Initially, the project was executed with the following simple goals in mind:
Features covering shard allocation, recovery, and more index-related use cases are proposed for the future.
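Migrating data off a node comes down to a cluster setting in Elasticsearch: adding a node's name to `cluster.routing.allocation.exclude._name` tells the cluster to move shards away from it. The sketch below is written in Python rather than Go and is not Vulcanizer's API; the helper names are hypothetical. It only shows how the exclude list could be edited safely and turned into a request body for the `_cluster/settings` endpoint.

```python
import json

# Real Elasticsearch setting used for allocation filtering by node name.
EXCLUDE_KEY = "cluster.routing.allocation.exclude._name"


def exclude_node(current_value, node_name):
    """Add a node to the comma-separated exclude list without duplicating it."""
    names = [n for n in current_value.split(",") if n]
    if node_name not in names:
        names.append(node_name)
    return ",".join(names)


def include_node(current_value, node_name):
    """Remove a node from the exclude list so shards may allocate onto it again."""
    return ",".join(n for n in current_value.split(",") if n and n != node_name)


def settings_body(exclude_value):
    """Build the JSON body for a transient cluster settings update.

    An empty list is sent as null, which resets the setting to its default.
    """
    return json.dumps({"transient": {EXCLUDE_KEY: exclude_value or None}})


# Hypothetical node names, for illustration only.
draining = exclude_node("es-node-1", "es-node-2")
body = settings_body(draining)
```

Reading the current value before editing it matters: overwriting the list blindly would silently re-include nodes that someone else was draining.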
Spectrum is an open source image processing library released by Facebook. Mobile cameras are becoming increasingly powerful, which has technical implications for social media sharing: larger images consume more network bandwidth when shared online. This is why Facebook automatically compresses images. But lossy compression reduces image quality.
With Facebook Spectrum, you don’t have to trade off quality for a good upload experience. Spectrum can perform operations such as cropping and rotating JPEG images losslessly. It comes with native image compression libraries like MozJPEG (Mozilla’s JPEG encoder). The consistent API enables developers to control advanced parameters, including chroma subsampling. Spectrum has helped the company improve the quality of images uploaded via its own suite of apps.
Dopamine is a TensorFlow-based framework released by Google. Dopamine aims to provide stability, flexibility, and reproducibility for reinforcement learning (RL) researchers. The release included a set of colabs that clarify how to use the framework. It fills the need for an easily grokked codebase in which users can independently experiment with speculative research. The framework is compact, reliable, and flexible, and it facilitates reproducibility in results.
Despite the advancements in AI technology, deep learning systems still can’t keep up when trying to mimic the human brain. They require hundreds of hours to master even simple video games. Frameworks like Dopamine enable machines to learn faster. The concept was built around a meta-learning approach where rules are derived from examples and concepts are learned from the results. It comes with a single-GPU Rainbow agent that implements three vital components: n-step Bellman updates, prioritized experience replay, and distributional reinforcement learning. The project was named after dopamine, a neurotransmitter responsible for sensations, emotions, and movements.
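Of the three components, the n-step Bellman update is the easiest to show in isolation: instead of bootstrapping from the value estimate after a single reward, the target sums n discounted rewards and then bootstraps. The function below is a minimal sketch of that target computation, not code from Dopamine; the name `n_step_target` and its parameters are hypothetical.

```python
def n_step_target(rewards, bootstrap_value, gamma=0.99):
    """Compute an n-step return target.

    target = r_0 + gamma * r_1 + ... + gamma^(n-1) * r_(n-1)
             + gamma^n * bootstrap_value

    `rewards` holds the n observed rewards and `bootstrap_value` is the
    value estimate for the state reached after those n steps.
    """
    target = bootstrap_value
    # Fold the rewards in from last to first, discounting at each step.
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target


# Two rewards of 1.0, then a bootstrap estimate of 10.0, with gamma = 0.5:
# 1.0 + 0.5 * (1.0 + 0.5 * 10.0) = 4.0
example = n_step_target([1.0, 1.0], 10.0, gamma=0.5)
```

Using several real rewards before bootstrapping lets reward information propagate faster than a one-step update, at the cost of some extra variance.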
The open collaboration approach has unlocked a whole new level of transparency. Last year, open source was a significant accelerator of innovation, especially in areas like machine learning, cloud computing, microservices, and blockchain. Last year also witnessed $53 billion of transactions involving open source projects. Experts predict that 2019 will double this figure. IBM’s $34 billion acquisition of Red Hat for its open source technology reinforces this hope. The future will see many more hyper-successful open source projects like the Linux operating system, the Firefox browser, WordPress, and the Apache web server.