Technolead: ELK Stack

What is the ELK Stack?

The ELK Stack began as a collection of three open-source products — Elasticsearch, Logstash, and Kibana — all developed, managed and maintained by Elastic. The introduction and subsequent addition of Beats turned the stack into a four legged project.

Elasticsearch is a full-text search and analysis engine, based on the Apache Lucene open source search engine.

Logstash is a log aggregator that collects data from various input sources, executes different transformations and enhancements and then ships the data to various supported output destinations. It’s important to know that many modern implementations of ELK do not include Logstash. To replace its log processing capabilities, most turn to lightweight alternatives like Fluentd, which can also collect logs from data sources and forward them to Elasticsearch.

Kibana is a visualization layer that works on top of Elasticsearch, providing users with the ability to analyze and visualize the data. And last but not least — Beats are lightweight agents that are installed on edge hosts to collect different types of data for forwarding into the stack.

Together, these different components are most commonly used for monitoring, troubleshooting and securing IT environments (though there are many more use cases for the ELK Stack such as business intelligence and web analytics). Beats and (formerly) Logstash take care of data collection and processing, Elasticsearch indexes and stores the data, and Kibana provides a user interface for querying the data and visualizing it.

Why is ELK So Popular? Will OpenSearch surpass ELK?

The ELK Stack is popular because it fulfills a need in the log management and analytics space. Monitoring modern applications and the IT infrastructure they are deployed on requires a log management and analytics solution that enables engineers to overcome the challenge of monitoring what are highly distributed, dynamic and noisy environments.

The ELK Stack helps by providing users with a powerful platform that collects and processes data from multiple data sources, stores that data in one centralized data store that can scale as data grows, and that provides a set of tools to analyze the data.

Of course, it won’t be surprising to see ELK lose popularity since the announcement that it would be closed-sourced. Using open source means organizations can avoid vendor lock-in and onboard new talent much more easily. Open source also means a vibrant community constantly driving new features and innovation and helping out in case of need.

For these reasons, at Logz.io, we expect OpenSearch and OpenSearch Dashboards to eventually take the place of ELK as the most popular logging solution out there.

Sure, Splunk has long been a market leader in the space. But its numerous functionalities are increasingly not worth the expensive price — especially for smaller companies such as SaaS products and tech startups. Splunk has about 15,000 customers while ELK and OpenSearch are downloaded more times in a single month than Splunk’s total customer count — and many times over at that. ELK and OpenSearch might not have all of the features of Splunk, but it does not need those analytical bells and whistles. They are simple but robust log management and analytics platforms that cost a fraction of the price.

Why is Log Analysis Becoming More Important?

In today’s competitive world, organizations cannot afford one second of downtime or slow performance of their applications. Performance issues can damage a brand and in some cases translate into a direct revenue loss. For the same reason, organizations cannot afford to be compromised as well, and not complying with regulatory standards can result in hefty fines and damage a business just as much as a performance issue.

To ensure apps are available, performant and secure at all times, engineers rely on the different types of telemetry data generated by their applications and the infrastructure supporting them. This data, whether event logs, traces, or metrics, or all three, enables monitoring of these systems and the identification and resolution of issues should they occur.

Logs have always existed and so have the different tools available for analyzing them. What has changed, though, is the underlying architecture of the environments generating these logs. Architecture has evolved into microservices, containers and orchestration infrastructure deployed on the cloud, across clouds or in hybrid environments. Not only that, the sheer volume of data generated by these environments is constantly growing and constitutes a challenge in itself. Long gone are the days when an engineer could simply SSH into a machine and grep a log file. This cannot be done in environments consisting of hundreds of containers generating TBs of log data a day.

This is where centralized log management and analytics solutions such as the ELK Stack come into the picture, allowing engineers, whether DevOps, IT Operations or SREs, to gain the visibility they need and ensure apps are available and performant at all times.

Modern log management and analysis solutions include the following key capabilities:

Aggregation – the ability to collect and ship logs from multiple data sources.
Processing – the ability to transform log messages into meaningful data for easier analysis.
Storage – the ability to store data for extended time periods to allow for monitoring, trend analysis, and security use cases.
Analysis – the ability to dissect the data by querying it and creating visualizations and dashboards on top of it.

How to Use the ELK Stack for Log Analysis

As I mentioned above, taken together, the different components of the ELK Stack provide a simple yet powerful solution for log management and analytics.

The various components in the ELK Stack were designed to interact and play nicely with each other without too much extra configuration. However, how you end up designing the stack greatly differs on your environment and use case.

For a small-sized development environment, the classic architecture will look as follows:

However, for handling more complex pipelines built for handling large amounts of data in production, additional components are likely to be added into your logging architecture, for resiliency (Kafka, RabbitMQ, Redis) and security (nginx):

This is of course a simplified diagram for the sake of illustration. A full production-grade architecture will consist of multiple Elasticsearch nodes, perhaps multiple Logstash instances, an archiving mechanism, an alerting plugin and a full replication across regions or segments of your data center for high availability. You can read a full description of what it takes to deploy ELK as a production-grade log management and analytics solution in the relevant section below.

For many teams, spending the time to configure, tune, scale, upgrade, manage, and secure these components is no problem. However, for those who need to focus their resources elsewhere, Logz.io provides a fully managed OpenSearch service – including a full logging pipeline out-of-the-box – so teams can focus their energy on other endeavors like building new features.

What’s new?

As one might expect from an extremely popular toolset, the ELK Stack is constantly and frequently updated with new features. Keeping abreast of these changes is challenging, so in this section we’ll provide a highlight of the new features introduced in major releases.

Elasticsearch

Elasticsearch 7.x is much easier to setup since it now ships with Java bundled. Performance improvements include a real memory circuit breaker, improved search performance and a 1-shard policy. In addition, a new cluster coordination layer makes Elasticsearch more scalable and resilient.

Elasticsearch 8.x versions – which are not open source – include enhancements like optimizing indices for time-series data, and enabling security features by default.

Logstash

Logstash’s Java execution engine (announced as experimental in version 6.3) is enabled by default in version 7.x. Replacing the old Ruby execution engine, it boasts better performance, reduced memory usage and overall — an entirely faster experience.

Kibana

Kibana is undergoing some major facelifting with new pages and usability improvements. The latest release includes a dark mode, improved querying and filtering and improvements to Canvas.

Kibana versions 8.x – which are (like Elasticsearch versions 8.x) not open source – also allow users to break down fields by value – making it easier to scan through large data volumes.

Beats

Beats 7.x conform with the new Elastic Common Schema (ECS) — a new standard for field formatting. Metricbeat supports a new AWS module for pulling data from Amazon CloudWatch, Kinesis and SQS. New modules were introduced in Filebeat and Auditbeat as well.

When Elastic closed sourced the ELK Stack, they also quietly prevented Beats from shipping data to:

Elasticsearch 7.10 or earlier open source distros
Non-Elastic distros of Elasticsearch

This breaking change blocks current Beats users from freely redirecting their log data to their desired destination. In other words, if you install the latest version of Beats, you won’t be able to switch back-ends to OpenSearch unless you rip out Beats and replace it with an open source log collection component.

We can deep dive here on various topics :

Creation of Kibana DashBoard

Tutorials