The ELK (Elastic) stack is a popular open-source solution for analyzing weblogs. In this tutorial, I describe how to set up Elasticsearch, Logstash and Kibana on a barebones VPS to analyze NGINX access logs. I don’t dwell on details but instead focus on what you need to get up and running with ELK-powered log analysis quickly.
Compared to other available tools, ELK gives you extreme flexibility in how you analyze and present your log data. Hosted solutions are a bit pricey, with monthly costs starting around $50 for a reasonable feature set. By following this tutorial you can set up your own log analysis machine for the cost of a simple VPS. You don’t need to be a DevOps pro to do it yourself.
The ELK stack will reside on a server separate from your application. NGINX logs will be sent to it via an SSL-protected connection using Filebeat. We will also set up GeoIP data and a Let’s Encrypt certificate for Kibana dashboard access.
This step-by-step tutorial covers version 7.7.0 of the ELK stack components, the newest at the time of writing, on Ubuntu 18.04. Check out the release notes for the current ELK version and potential breaking changes.
Here is a sneak peek of what we will be building:
Currently, I am using Kibana to analyze the traffic logs of this blog and the Abot for Slack landing page.
Let’s get started.
Setup VPS
You need to start by purchasing a barebones VPS and adding SSH access to it. I don’t elaborate on how to do it in this tutorial.
You will also need a domain or a subdomain that points to your VPS server IP via an A DNS entry. If you use Cloudflare for your DNS, remember not to use their CDN for this domain, because it changes the IP address the domain resolves to and can cause trouble with the setup.
For my ELK stack server, I use a 4GB Hetzner Cloud VPS with Ubuntu 18.04. It is running the Elasticsearch, Kibana and Logstash processes. With my current amount of traffic log data, 4GB of RAM has been enough so far.
Install ELK dependencies
Access your VPS and run the following commands as a sudo user to install required dependencies:
Java
Java is required for both Elasticsearch and Logstash.
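A minimal sketch of the Java install, assuming the default-jre package from the stock Ubuntu repositories is acceptable. The function name install_java is hypothetical; wrapping the commands in a function only keeps them from running until you call it.

```shell
# Assumption: Ubuntu's default-jre package provides a suitable Java runtime.
install_java() (
  set -e                               # stop at the first failing command
  sudo apt-get update
  sudo apt-get install -y default-jre
  java -version                        # prints the installed version (on stderr)
)
```

Paste the definition into your shell on the VPS and run install_java.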
You can verify that the installation was successful by running java -version, which should print the version of the installed Java runtime.
Elasticsearch
Elasticsearch is a database where logs are stored after Logstash processes them. It can be quite memory hungry so make sure to monitor your RAM usage when working with it on a low-end VPS.
Let’s install it by running:
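A minimal sketch of the install, assuming you use Elastic's APT repository for the 7.x series. The function name install_elasticsearch is hypothetical.

```shell
# Sketch based on Elastic's APT repository for the 7.x series.
install_elasticsearch() (
  set -e
  # Trust Elastic's package signing key.
  wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
  sudo apt-get install -y apt-transport-https
  # Register the 7.x repository (it also serves Kibana, Logstash and Filebeat).
  echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" \
    | sudo tee /etc/apt/sources.list.d/elastic-7.x.list
  sudo apt-get update
  sudo apt-get install -y elasticsearch
)
```

This installs the newest 7.x package. If you want to stay on the version covered here, you can pin it with apt-get install elasticsearch=7.7.0 instead.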
Now uncomment the following lines in /etc/elasticsearch/elasticsearch.yml
start the Elasticsearch process:
and verify that it is running by making a cURL request:
The JSON response should include the node name, the cluster name and the Elasticsearch version number.
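The steps above can be sketched as follows. I am assuming here that the settings to uncomment are network.host and http.port, that Elasticsearch ends up listening on localhost:9200, and that the service is managed by systemd. The function name is hypothetical.

```shell
# Assumptions: network.host and http.port are the settings in question,
# and Elasticsearch listens on localhost:9200 once they are active.
start_and_check_elasticsearch() (
  set -e
  # Show the active (uncommented) network settings.
  sudo grep -E '^(network\.host|http\.port):' /etc/elasticsearch/elasticsearch.yml
  sudo systemctl enable elasticsearch.service   # start on boot
  sudo systemctl start elasticsearch.service
  # Startup takes a while, so retry until the port accepts connections.
  curl --retry 10 --retry-delay 5 --retry-connrefused "http://localhost:9200"
)
```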
Kibana
Kibana is the visual layer of the ELK stack. It queries Elasticsearch for log data and offers a multitude of ways to analyze and present it.
First, let’s install it:
You can check out the contents of /etc/kibana/kibana.yml to see the default configuration, but you don’t need to edit anything.
Now you can start a Kibana process by typing:
Just like in the case of Elasticsearch you can verify that it is running by using a cURL command:
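A minimal sketch of the Kibana install, start and check, assuming the Elastic 7.x APT repository is already registered on the machine and Kibana keeps its default port 5601. The function name is hypothetical.

```shell
# Assumption: the Elastic 7.x APT repository is already registered here.
install_and_check_kibana() (
  set -e
  sudo apt-get install -y kibana
  sudo systemctl enable kibana.service   # start on boot
  sudo systemctl start kibana.service
  # Kibana listens on localhost:5601 by default; /api/status reports its health.
  curl --retry 10 --retry-delay 5 --retry-connrefused "http://localhost:5601/api/status"
)
```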
Now let’s expose our Kibana dashboard to the outside world using NGINX.
NGINX
This is not the NGINX we will be analyzing logs from. This one will provide password-protected access to the Kibana instance running on our ELK server. We will use a Let’s Encrypt SSL certificate for secure access. We can do it by typing:
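One way to do this, sketched under a few assumptions: certbot runs in standalone mode, port 80 is reachable from the internet, and my-elk-stack-vps.com stands in for your own domain. The function name is hypothetical.

```shell
# Assumptions: certbot's standalone mode, port 80 reachable from outside,
# my-elk-stack-vps.com is a placeholder for your domain.
issue_kibana_certificate() (
  set -e
  domain="${1:-my-elk-stack-vps.com}"
  sudo apt-get install -y nginx certbot
  sudo systemctl stop nginx            # standalone mode needs port 80 free
  sudo certbot certonly --standalone -d "$domain"
  sudo systemctl start nginx
)
```

With certbot's default layout, the resulting files land in /etc/letsencrypt/live/ under a folder named after your domain.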
To automatically renew your certificate, add this line to the /etc/crontab file:
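A sketch of such a line, assuming a daily run at 03:00 as root and that NGINX has to be stopped while certbot renews. The schedule and the variable name renew_line are my own choices.

```shell
# Assumption: renewal is attempted daily at 03:00 as root; certbot only
# renews certificates that are close to expiry.
renew_line='0 3 * * * root certbot renew --pre-hook "systemctl stop nginx" --post-hook "systemctl start nginx"'
echo "$renew_line"
# To install it: echo "$renew_line" | sudo tee -a /etc/crontab
```

Note the extra user field (root) that /etc/crontab requires compared to a per-user crontab.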
Now you need to set a password for your Kibana HTTP basic auth user:
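A minimal sketch using openssl to build the password file entry. The helper name htpasswd_line, the user name kibanaadmin and the path /etc/nginx/htpasswd.users are placeholders I picked for illustration.

```shell
# Prints one "user:hash" line in the APR1 (MD5) format that NGINX basic auth accepts.
# $1 = user name, $2 = password
htpasswd_line() {
  printf '%s:%s\n' "$1" "$(openssl passwd -apr1 "$2")"
}
# Example (placeholder user and path):
#   htpasswd_line kibanaadmin 'choose-a-password' | sudo tee /etc/nginx/htpasswd.users
```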
Next, configure NGINX to use generated certificates and proxy pass traffic from your VPS root path to Kibana:
/etc/nginx/sites-enabled/default
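A sketch of what this file could contain, assuming certbot's default certificate paths, Kibana on localhost:5601 and a basic auth file at /etc/nginx/htpasswd.users (a placeholder path). It is written to a local elk-staging folder first so you can review it before copying.

```shell
# Assumptions: certbot's default certificate layout, Kibana on localhost:5601,
# basic auth users stored in /etc/nginx/htpasswd.users (placeholder path).
mkdir -p elk-staging
cat > elk-staging/nginx-default <<'EOF'
server {
    listen 80;
    server_name my-elk-stack-vps.com;
    return 301 https://$host$request_uri;   # redirect plain HTTP to HTTPS
}

server {
    listen 443 ssl;
    server_name my-elk-stack-vps.com;

    ssl_certificate     /etc/letsencrypt/live/my-elk-stack-vps.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/my-elk-stack-vps.com/privkey.pem;

    auth_basic           "Restricted";
    auth_basic_user_file /etc/nginx/htpasswd.users;

    location / {
        proxy_pass http://localhost:5601;   # Kibana's default port
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
EOF
# Review the file, then:
#   sudo cp elk-staging/nginx-default /etc/nginx/sites-enabled/default
```

You can check the result with sudo nginx -t before restarting NGINX.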
Make sure that your nginx config is correct and restart the server:
This config assumes that there is an A DNS entry for the my-elk-stack-vps.com domain pointing to your VPS server IP.
You should now be able to see your Kibana dashboard by going to my-elk-stack-vps.com and entering your credentials.
You can add a sample dataset to play around with it or continue the tutorial to import your own log entries.
Logstash and SSL certificates
Logstash accepts the log data sent from your client application server by Filebeat, then transforms it and feeds it into the Elasticsearch database.
Install it by running:
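A minimal sketch, assuming the Elastic 7.x APT repository is already registered on the machine. The function name is hypothetical.

```shell
# Assumption: the Elastic 7.x APT repository is already registered here.
install_logstash() (
  set -e
  sudo apt-get install -y logstash
)
```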
Configure SSL certificates
Because you will be sending your logs from a separate server, you should do it via a secure connection. To do that, you need to generate a self-signed certificate:
Remember to substitute my-elk-stack-vps.com with your domain name in the command generating the self-signed certificate. Later we will have to copy the resulting files to the client server for the Filebeat configuration.
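A minimal sketch of generating such a certificate with openssl. The helper name generate_elk_cert, the file names elk-ssl.crt and elk-ssl.key, and the output folder are placeholders I chose for illustration.

```shell
# Placeholders: the output directory and the elk-ssl.* file names.
# $1 = domain name, $2 = output directory
generate_elk_cert() (
  set -e
  domain="$1"; out_dir="$2"
  mkdir -p "$out_dir"
  # -x509 creates a self-signed certificate; -nodes leaves the key unencrypted
  # so Logstash and Filebeat can read it without a passphrase.
  openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -subj "/CN=$domain" \
    -keyout "$out_dir/elk-ssl.key" -out "$out_dir/elk-ssl.crt"
)
# Example: generate_elk_cert my-elk-stack-vps.com ./elk-certs
```

Wherever you store the files on the ELK server, make sure the user running Logstash can read the key.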
Configure Logstash
Now you need to configure Logstash with the following files:
/etc/logstash/logstash.yml
/etc/logstash/conf.d/logstash-nginx-es.conf
This config specifies the input and output for our logs and how they will be formatted before being sent to Elasticsearch. GeoIP data is configured here as well. It also enforces a secure SSL connection, signed by the correct certificate, for logs sent by Filebeat.
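A sketch of a pipeline matching that description. My assumptions: Filebeat connects on port 5044, the certificate files live in /etc/elk-certs under the names elk-ssl.crt and elk-ssl.key (placeholders), NGINX uses its default combined log format, and the index name follows a weblogs- prefix with a daily suffix. The file is written to a local elk-staging folder for review.

```shell
# Assumptions: port 5044, certificate files in /etc/elk-certs (placeholder),
# NGINX "combined" log format, daily weblogs-* indices.
mkdir -p elk-staging
cat > elk-staging/logstash-nginx-es.conf <<'EOF'
input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate_authorities => ["/etc/elk-certs/elk-ssl.crt"]
    ssl_certificate => "/etc/elk-certs/elk-ssl.crt"
    ssl_key => "/etc/elk-certs/elk-ssl.key"
    ssl_verify_mode => "force_peer"   # reject clients without a trusted certificate
  }
}

filter {
  # NGINX's default "combined" format matches the Apache combined pattern.
  grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
  # Use the request time from the log line as the event timestamp.
  date { match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"] }
  # Add location data based on the client IP address.
  geoip { source => "clientip" }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "weblogs-%{+YYYY.MM.dd}"
  }
}
EOF
# Review the file, then:
#   sudo cp elk-staging/logstash-nginx-es.conf /etc/logstash/conf.d/
```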
Now let’s start the Logstash process and verify that it is listening on the correct port:
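A minimal sketch, assuming Logstash is managed by systemd and the beats input listens on port 5044 (adjust if your pipeline uses a different port). The function name is hypothetical.

```shell
# Assumption: the beats input is configured for port 5044.
start_and_check_logstash() (
  set -e
  sudo systemctl enable logstash.service   # start on boot
  sudo systemctl start logstash.service
  sleep 60                                 # Logstash can take a minute to open the port
  sudo ss -tlnp | grep ':5044'             # list listening TCP sockets with their process
)
```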
The output of the last command should show a Java process listening on the Logstash port.
If it does not work, you can check out the troubleshooting guide at the end of the post.
Filebeat
Filebeat is the only part of the infrastructure that needs to be installed on a client server. You should log in to the server of your NGINX application and copy the self-signed SSL certificate files to the correct folder:
You can use SCP to do it or just copy/paste the contents of files.
Now, install Java using the same commands as for the main ELK host server.
Then install the rest of the dependencies:
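A minimal sketch of the Filebeat install on the client server, assuming Elastic's APT repository for the 7.x series. The function name is hypothetical.

```shell
# Sketch based on Elastic's APT repository for the 7.x series.
install_filebeat() (
  set -e
  # Trust Elastic's package signing key.
  wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
  sudo apt-get install -y apt-transport-https
  echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" \
    | sudo tee /etc/apt/sources.list.d/elastic-7.x.list
  sudo apt-get update
  sudo apt-get install -y filebeat
)
```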
Now configure Filebeat by modifying this file: /etc/filebeat/filebeat.yml
This config tells Filebeat where to send our logs and which SSL certificates to use for authentication. The paths option points to the default NGINX logs folder.
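A sketch of a matching filebeat.yml. My assumptions: Logstash listens on port 5044 at my-elk-stack-vps.com, and the certificate files were copied to /etc/elk-certs under the names elk-ssl.crt and elk-ssl.key (placeholders). The file is written to a local elk-staging folder for review.

```shell
# Assumptions: Logstash on my-elk-stack-vps.com:5044, certificate files
# copied to /etc/elk-certs (placeholder path and names).
mkdir -p elk-staging
cat > elk-staging/filebeat.yml <<'EOF'
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/nginx/*.log

output.logstash:
  hosts: ["my-elk-stack-vps.com:5044"]
  # Trust the self-signed certificate and present it as the client certificate.
  ssl.certificate_authorities: ["/etc/elk-certs/elk-ssl.crt"]
  ssl.certificate: "/etc/elk-certs/elk-ssl.crt"
  ssl.key: "/etc/elk-certs/elk-ssl.key"
EOF
# Review the file, then:
#   sudo cp elk-staging/filebeat.yml /etc/filebeat/filebeat.yml
#   sudo systemctl restart filebeat
```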
If your NGINX has been running for a while, you probably have a bunch of gzipped logs in /var/log/nginx/. To send them to Kibana you should unzip them using gunzip and rename the resulting files to match the *.log wildcard expression.
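A minimal sketch of that unzip-and-rename step. The helper name unpack_rotated_logs is hypothetical, and appending .log to the unpacked file name is just one way to match the wildcard.

```shell
# Unpacks every .gz file in a directory and appends .log to the result.
# $1 = directory holding the rotated logs
unpack_rotated_logs() (
  set -e
  dir="$1"
  for gz in "$dir"/*.gz; do
    [ -e "$gz" ] || continue          # no archives: the glob stays literal
    gunzip "$gz"                      # access.log.2.gz -> access.log.2
    mv "${gz%.gz}" "${gz%.gz}.log"    # access.log.2    -> access.log.2.log
  done
)
# Example: sudo sh -c "$(type unpack_rotated_logs >/dev/null; echo :)" is not needed;
# simply run the loop body as root, or call: unpack_rotated_logs /path/to/a/copy
```

Working on a copy of the log folder first is a safe way to try it out.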
You might also consider disabling logging access to static resources to limit the noise. You can do it with the following NGINX config:
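A sketch of such a config, assuming a typical list of static file extensions (adjust it to your site). The location block belongs inside the server block of your application's NGINX config; here it is written to a local elk-staging folder for review.

```shell
# Assumption: the extension list below covers your static assets.
mkdir -p elk-staging
cat > elk-staging/nginx-static-nolog.conf <<'EOF'
# Place inside the server { ... } block of your application.
location ~* \.(?:css|js|png|jpg|jpeg|gif|ico|svg)$ {
    access_log off;   # do not write access log entries for static files
}
EOF
```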
Raw logs are here
If everything went fine, go to the Kibana dashboard and create an index pattern called weblogs-*. You can do it in the Management menu tab. Now you can go to Discover and see your raw log data there:
This is how a raw JSON entry stored in Elasticsearch for a single NGINX log event looks after being parsed by Logstash:
Troubleshooting
As you can see you need to make various components play together in order to get the ELK stack running. Here’s a list of commands which can help you debug when things go wrong:
Filebeat logs:
Logstash logs:
Start a Filebeat process in the foreground to see if it can connect to Logstash on the host ELK server:
Start a Logstash process in the foreground to check why it’s not listening on a port:
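The four checks above can be sketched as small helpers. I am assuming the default package install paths, systemd-managed services and a pipeline file named logstash-nginx-es.conf; the function names are hypothetical.

```shell
# Assumptions: default deb package paths and systemd-managed services.

# Last 100 Filebeat log lines from the systemd journal.
filebeat_logs() { sudo journalctl -u filebeat --no-pager -n 100; }

# Last 100 lines of the main Logstash log file.
logstash_logs() { sudo tail -n 100 /var/log/logstash/logstash-plain.log; }

# Run Filebeat in the foreground, logging to the terminal (-e).
filebeat_foreground() { sudo filebeat -e -c /etc/filebeat/filebeat.yml; }

# Run Logstash in the foreground with the pipeline from this tutorial.
logstash_foreground() {
  sudo /usr/share/logstash/bin/logstash --path.settings /etc/logstash \
    -f /etc/logstash/conf.d/logstash-nginx-es.conf
}
```

Stop the corresponding service first (for example sudo systemctl stop filebeat) before starting a foreground process, so the two do not compete for the same files and ports.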
Summary
I am just getting started with the ELK (Elastic) stack and discovering the options it has to offer. I hope that this tutorial will help you get up and running with it quickly, even if you don’t have much DevOps experience up your sleeve.