Logstash
Logstash is a data processing pipeline, able to take multiple sources of data as input. It allows you to format and modify data on the fly before forwarding it to the chosen destination.
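As an illustration, a Logstash pipeline has three stages: an input, optional filters, and an output. The sketch below uses standard Logstash plugins (`http` input, `mutate` filter, `elasticsearch` output); the actual pipeline shipped in the logstash-scalingo repository may differ:

```
input {
  http { port => 8080 }                              # receive events over HTTP
}
filter {
  mutate { add_field => { "source" => "http" } }     # enrich each event
}
output {
  elasticsearch { index => "example-%{+YYYY.MM.dd}" }  # one index per day
}
```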
Planning your Deployment
- We are currently stuck with an important constraint related to Elasticsearch®. Scalingo provides a version of Logstash that is compatible with this constraint in the `es7-compat` branch of the logstash-scalingo repository. We will have to use this branch.
- Sizing Logstash vastly depends on your use case and the amount of data processed. We usually recommend starting with an L container and adjusting later, depending on the metrics of your Logstash instance.
Deploying
Using the Command Line
We maintain a repository called logstash-scalingo on GitHub to help you deploy Logstash on Scalingo. Here are the few additional steps you will have to follow:
- Clone the `es7-compat` branch of our repository:

  ```shell
  git clone -b es7-compat --single-branch https://github.com/Scalingo/logstash-scalingo.git
  cd logstash-scalingo
  ```
- Update your branch name to `master` or `main`:

  ```shell
  git branch -m master
  ```
- Create the application on Scalingo:

  ```shell
  scalingo create my-logstash
  ```

  Notice that our Command Line automatically detects the git repository and adds a git remote pointing to Scalingo:

  ```
  git remote -v
  origin   https://github.com/Scalingo/logstash-scalingo (fetch)
  origin   https://github.com/Scalingo/logstash-scalingo (push)
  scalingo git@ssh.osc-fr1.scalingo.com:my-logstash.git (fetch)
  scalingo git@ssh.osc-fr1.scalingo.com:my-logstash.git (push)
  ```
- Scale the container to an L size:

  ```shell
  scalingo --app my-logstash scale web:1:L
  ```
- Provision a Scalingo for Elasticsearch® Sandbox addon:

  ```shell
  scalingo --app my-logstash addons-add elasticsearch elasticsearch-sandbox
  ```
- Edit the `logstash.conf` file to change the index name of the Elasticsearch® output. The goal is to make it semantically fit the data being ingested:

  ```
  output {
    elasticsearch {
      [...]
      # OLD
      index => "change-me-%{+YYYY.MM.dd}"
      # NEW
      index => "unicorns-%{+YYYY.MM.dd}"
    }
  }
  ```

  Don't forget to commit your changes:

  ```shell
  git add logstash.conf
  git commit -m "Update the index name"
  ```
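The `%{+YYYY.MM.dd}` part of the index name is a Logstash sprintf date pattern: each event is written to an index suffixed with its date, producing one index per day. As a rough local preview of the resulting name (a hypothetical sketch using `date(1)` rather than Logstash itself, with the `unicorns` prefix from the step above):

```shell
# Preview today's daily index name as Logstash would build it.
printf 'unicorns-%s\n' "$(date -u +%Y.%m.%d)"
```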
- Create a few environment variables to protect your Logstash instance via HTTP basic auth (not everyone should be able to send data to your instance!):

  ```shell
  scalingo --app my-logstash env-set USER=logstash-username
  scalingo --app my-logstash env-set PASSWORD=logstash-password
  ```
- Everything's ready, deploy to Scalingo:

  ```shell
  git push scalingo master
  ```
Using the Terraform Provider
- Start by forking our logstash-scalingo repository
- Place the following block in your Terraform file to create the app:

  ```hcl
  resource "scalingo_app" "my-logstash" {
    name        = "my-logstash"
    force_https = true

    environment = {
      USER     = "<logstash-username>"
      PASSWORD = "<logstash-password>"
    }
  }
  ```
- Link the app to your forked repository:

  ```hcl
  data "scalingo_scm_integration" "github" {
    scm_type = "github"
  }

  resource "scalingo_scm_repo_link" "default" {
    auth_integration_uuid = data.scalingo_scm_integration.github.id
    app                   = scalingo_app.my-logstash.id
    source                = "https://github.com/<username>/logstash-scalingo"
    branch                = "es7-compat"
  }
  ```
- Provision a Scalingo for Elasticsearch® addon and attach it to your app:

  ```hcl
  resource "scalingo_addon" "my-logstash-elasticsearch" {
    app         = scalingo_app.my-logstash.id
    provider_id = "elasticsearch"
    plan        = "sandbox"
  }
  ```
- (optional) Instruct the platform to run the `web` process type in a single L container:

  ```hcl
  resource "scalingo_container_type" "web" {
    app    = scalingo_app.my-logstash.id
    name   = "web"
    size   = "L"
    amount = 1
  }
  ```
- Edit the `logstash.conf` file to change the index name of the Elasticsearch® output. The goal is to make it semantically fit the data being ingested:

  ```
  output {
    elasticsearch {
      [...]
      # OLD
      index => "change-me-%{+YYYY.MM.dd}"
      # NEW
      index => "unicorns-%{+YYYY.MM.dd}"
    }
  }
  ```

  Don't forget to commit and push your changes:

  ```shell
  git add logstash.conf
  git commit -m "Update the index name"
  git push
  ```
- Run `terraform plan` and check that the result looks good
- If so, run `terraform apply`
Once Terraform is done, your Logstash instance is provisioned and ready to be deployed. This requires a few extra manual steps:
- Head to your dashboard
- Click on your Logstash application
- Click on the Deploy tab
- Click on Manual deployment in the left menu
- Click the Trigger deployment button
- After a few seconds, your Logstash instance is finally up and running!
Testing
- Once your Logstash is up and running, you can try to send some data to it:

  ```
  curl --request POST 'https://<logstash-username>:<logstash-password>@my-logstash.osc-fr1.scalingo.io?name=whatever' --data 'Hello World!'
  ok
  ```
- Check the indices that are stored in the Elasticsearch® database:

  ```
  scalingo --app my-logstash run bash
  > curl $SCALINGO_ELASTICSEARCH_URL/_cat/indices
  yellow open unicorns-2024.06.04 _0XNpJKzQc2kjhTyxf4DnQ 5 1 1 0 6.6kb 6.6kb
  ```
- Logstash has created the `unicorns` index, which can now be requested:

  ```
  > curl $SCALINGO_ELASTICSEARCH_URL/unicorns-2024.06.04/_search | json_pp
  {
    "_shards" : {
      // [...]
    },
    // [...]
    "hits" : {
      "total" : 1,
      "max_score" : 1,
      "hits" : [
        {
          "_type" : "logs",
          "_score" : 1,
          "_source" : {
            "name" : "whatever",
            "message" : "Hello World!",
            "url" : "?name=whatever",
            "@timestamp" : "2024-06-04T11:57:03.155Z"
            // [...]
          },
          // [...]
        }
      ]
    }
  }
  ```

  The result of the above search confirms that the index contains a document having a field `name` set to `whatever` and a field `message` set to `Hello World!`.
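To go further, you could filter the search on one of those fields. This is a hypothetical example using Elasticsearch's Lucene query-string syntax (`?q=field:value`); adjust the index date to match yours:

```
scalingo --app my-logstash run bash
> curl "$SCALINGO_ELASTICSEARCH_URL/unicorns-2024.06.04/_search?q=name:whatever" | json_pp
```

Only documents whose `name` field matches `whatever` are returned.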
Updating
Scalingo maintains compatibility with the Elasticsearch® instances we provide in the `es7-compat` branch of our logstash-scalingo repository. Consequently, updating Logstash consists of pulling the changes from the `es7-compat` branch:
- From your Logstash repository:

  ```shell
  git pull origin es7-compat
  ```
Using the Command Line
- Make sure you’ve successfully followed the first steps
- Send the updated version to Scalingo:

  ```shell
  git push scalingo master
  ```
Using the Terraform Provider
- Make sure you’ve successfully followed the first steps
- Push the changes to your repository:

  ```shell
  git push origin master
  ```
- Head to your dashboard
- Click on your Logstash application
- Click on the Deploy tab
- Click on Manual deployment in the left menu
- Click the Trigger deployment button
- After a few seconds, your updated Logstash instance is ready!
Customizing
Configuring
The repository we provide to help you deploy your Logstash instance comes with a directory named `config`. All the files contained in this directory are copied to the Logstash configuration directory at runtime, allowing you to precisely customize your instance.
For example, if you wanted to modify the logging behavior of Logstash, you could edit the `config/log4j2.yml` file.
Interfacing with Log Drains
Log drains are a Scalingo feature that, once activated, automatically sends every log line generated by an application to the configured destinations.
To make a log drain forward logs to a Logstash instance, use the `elk` type of log drain:

```shell
scalingo --app my-app log-drains-add --type elk --url <logstash_url>
```
With `<logstash_url>` being the URL of your Logstash app.
The `elk` log drain sends the log entries over HTTP, with a `content-type` header set to `plain/text`. The application name (`appname`) and the container name (`hostname`) are passed as query parameters, and the log entry itself is in the request body.
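To illustrate, this is roughly the request an `elk` log drain would issue for one log line. This is a hypothetical reproduction with `curl`; the credentials are placeholders and the `appname`/`hostname` values are made up:

```
curl --request POST \
  'https://<logstash-username>:<logstash-password>@my-logstash.osc-fr1.scalingo.io?appname=my-app&hostname=web-1' \
  --data 'Sample log line'
```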
Our logstash-scalingo repository contains a few basic examples of configuration, known to work with our log drain. Feel free to adjust to your needs.
Environment
The following environment variables can be leveraged to customize your deployment:
- `USER`: Logstash administrator account name. Mandatory.
- `PASSWORD`: Logstash administrator password. Mandatory.
- `LOGSTASH_PLUGINS`: Comma-separated list of plugins to install. Defaults to not being set.
- `LOGSTASH_VERSION`: Version of Logstash to deploy. Defaults to `6.8.21`.
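For example, to install extra plugins and pin the Logstash version before deploying (the plugin names below are illustrative, not a recommendation; any plugin published as `logstash-*` should work):

```
scalingo --app my-logstash env-set LOGSTASH_PLUGINS=logstash-filter-json,logstash-output-stdout
scalingo --app my-logstash env-set LOGSTASH_VERSION=6.8.21
```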