Debugging Logstash Pipelines with Plugin IDs

Vakhtang Matskeplishvili
Oct 2, 2023

Logstash is a powerful tool for collecting, parsing, and shipping data.
However, debugging Logstash pipelines can be challenging, especially when they are complex and involve multiple plugins.

This article explains what Logstash plugin IDs are and how they can be used to debug pipelines more efficiently.

Logstash Plugins

Logstash plugins are modular components that can be used to configure Logstash pipelines.
Plugins can collect data from a variety of sources, parse and filter data, and send data to a variety of destinations.

Logstash plugins are divided into three categories:

* Input plugins: Collect data from a variety of sources, such as files, logs, databases, and APIs.
* Filter plugins: Parse and transform data, e.g. converting data types, removing unwanted fields, and adding new fields.
* Output plugins: Send data to a variety of destinations, such as Elasticsearch, Kafka, and HDFS.

Logstash Plugin IDs

Logstash plugin IDs are unique identifiers that differentiate one Logstash plugin from another.
If plugin IDs are not defined in the pipeline, Logstash automatically generates them in GUID format.

Defining plugin IDs is not mandatory, but it is highly recommended by Elastic.
The ID definition syntax for each plugin can be found on that plugin's documentation page.

A standard plugin ID definition is given by:

PLUGIN_NAME {
  id => "my_plugin_id"
  ...
  plugin_parameters
  ...
}

The ID for an Elasticsearch input plugin can be defined, for example, as follows:

input {
  elasticsearch {
    id => "my-elasticsearch-input"
    hosts => ["localhost:9200"]
    index => "my-index"
  }
}

Here, the id parameter sets the plugin ID, which is my-elasticsearch-input in this example.

Why Plugin IDs Are Important for Debugging

Plugin IDs are important for debugging because they help identify the specific plugin that is causing an error.
This can save a significant amount of time, especially in complex pipelines with many plugins.

For example, consider a pipeline with four Ruby filters:

filter {
  # Logstash 5.0 and later access fields through event.get / event.set
  ruby {
    code => "event.set('message', event.get('message').split(' '))"
  }
  ruby {
    code => "event.set('date', DateTime.parse(event.get('date')))"
  }
  ruby {
    code => "event.set('user', event.get('user').split(' '))"
  }
  ruby {
    code => "event.set('location', event.get('location').split(',')[0])"
  }
}

In the logs, we find the following exception:

[2023-09-28T08:46:40,865][ERROR][logstash.filters.ruby ][kuku-test][1654651cvsadf5416541asf165] Ruby exception occurred: undefined method `split' for nil:NilClass {:class=>"NoMethodError", :backtrace=>["(ruby filter code):3:in `block in filter_method'", "/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-ruby-3.1.8/lib/logstash/filters/ruby.rb:96:in `inline_script'", "/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-ruby-3.1.8/lib/logstash/filters/ruby.rb:89:in `filter'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:159:in `do_filter'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:178:in `block in multi_filter'", "org/jruby/RubyArray.java:1865:in `each'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:175:in `multi_filter'", "org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:133:in `multi_filter'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:304:in `block in start_workers'"]}

This is a standard Ruby exception raised in the “kuku-test” pipeline, but which filter raised it?
The configuration has four Ruby filters, and three of them call split.
We could debug the pipeline step by step, removing the Ruby filters that work correctly until we find the one that fails, but that can waste hours of painstaking work.

Is there a better way?
Look at this part of the log:

[kuku-test][1654651cvsadf5416541asf165]

According to the Logstash documentation, the first part is the pipeline ID and the second part is the plugin ID. As mentioned above, Logstash generates the plugin ID unless you define it yourself.
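
The two bracketed fields can also be pulled out of a log line programmatically. The following is a minimal sketch, assuming the log layout seen in the error above (timestamp, level, logger, pipeline ID, plugin ID, then the message); `LOG_PATTERN` and `parse_log_line` are hypothetical names used only for illustration, and lines that lack a plugin ID are simply skipped.

```ruby
# Assumed layout, taken from the error line above:
# [timestamp][LEVEL][logger][pipeline_id][plugin_id] message
LOG_PATTERN = /\A\[(?<timestamp>[^\]]+)\]\[(?<level>[^\]]+)\]\[(?<logger>[^\]]+)\]\[(?<pipeline_id>[^\]]+)\]\[(?<plugin_id>[^\]]+)\]\s*(?<message>.*)\z/m

# Hypothetical helper: returns a hash of the parsed fields,
# or nil when the line does not match the assumed layout.
def parse_log_line(line)
  match = LOG_PATTERN.match(line)
  return nil unless match

  {
    level: match[:level].strip,
    logger: match[:logger].strip, # the logger name is padded with spaces
    pipeline_id: match[:pipeline_id],
    plugin_id: match[:plugin_id],
    message: match[:message]
  }
end

line = "[2023-09-28T08:46:40,865][ERROR][logstash.filters.ruby ][kuku-test][1654651cvsadf5416541asf165] Ruby exception occurred: undefined method `split' for nil:NilClass"
parsed = parse_log_line(line)
puts "pipeline: #{parsed[:pipeline_id]}, plugin: #{parsed[:plugin_id]}"
```

With an auto-generated ID, the extracted plugin ID tells us little on its own, which is the motivation for naming plugins explicitly.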

Let us change our pipeline by adding plugin IDs:

filter {
  ruby {
    id => "ruby-filter-split-message"
    code => "event.set('message', event.get('message').split(' '))"
  }
  ruby {
    id => "ruby-filter-parse-date"
    code => "event.set('date', DateTime.parse(event.get('date')))"
  }
  ruby {
    id => "ruby-filter-split-user"
    code => "event.set('user', event.get('user').split(' '))"
  }
  ruby {
    id => "ruby-filter-split-location"
    code => "event.set('location', event.get('location').split(',')[0])"
  }
}

Let us check the logs now:

[2023-09-28T08:46:40,865][ERROR][logstash.filters.ruby ][kuku-test][ruby-filter-split-user] Ruby exception occurred: undefined method `split' for nil:NilClass {:class=>"NoMethodError", :backtrace=>["(ruby filter code):3:in `block in filter_method'", "/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-ruby-3.1.8/lib/logstash/filters/ruby.rb:96:in `inline_script'", "/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-ruby-3.1.8/lib/logstash/filters/ruby.rb:89:in `filter'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:159:in `do_filter'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:178:in `block in multi_filter'", "org/jruby/RubyArray.java:1865:in `each'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:175:in `multi_filter'", "org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:133:in `multi_filter'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:304:in `block in start_workers'"]}

With this change, we can easily see that the error occurs in the “ruby-filter-split-user” filter.

Additional benefits of defining plugin IDs are:

* Improved readability and maintainability of pipelines
* Easier collaboration with other team members on debugging pipelines
* Easier creation of custom monitoring and alerting for pipelines
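
The monitoring point can be illustrated with the Logstash node stats API (`GET /_node/stats/pipelines` on the monitoring port, 9600 by default), which reports per-plugin event counters together with each plugin's ID. Below is a minimal sketch that works on a trimmed sample shaped like that response; the counter values are invented for illustration, and `find_plugin_stats` is a hypothetical helper, not part of Logstash.

```ruby
require 'json'

# Trimmed, illustrative sample shaped like the node stats response.
# The counters below are invented for this sketch.
SAMPLE_STATS = <<~JSON
  {
    "pipelines": {
      "kuku-test": {
        "plugins": {
          "inputs": [],
          "filters": [
            { "id": "ruby-filter-split-message", "name": "ruby",
              "events": { "in": 100, "out": 100, "duration_in_millis": 12 } },
            { "id": "ruby-filter-split-user", "name": "ruby",
              "events": { "in": 100, "out": 100, "duration_in_millis": 48 } }
          ],
          "outputs": []
        }
      }
    }
  }
JSON

# Hypothetical helper: look up one plugin's stats by pipeline ID and plugin ID.
# Returns nil when the pipeline or the plugin is not present.
def find_plugin_stats(stats, pipeline_id, plugin_id)
  plugins = stats.dig('pipelines', pipeline_id, 'plugins') || {}
  plugins.values.flatten.find { |plugin| plugin['id'] == plugin_id }
end

stats = JSON.parse(SAMPLE_STATS)
plugin = find_plugin_stats(stats, 'kuku-test', 'ruby-filter-split-user')
puts "#{plugin['id']}: #{plugin['events']['duration_in_millis']} ms spent in filter"
```

With explicit IDs, a lookup like this stays stable across configuration changes, whereas an auto-generated ID would have to be rediscovered each time.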

Conclusion

Defining plugin IDs in Logstash pipelines is a good practice that can save time and effort when debugging.
By identifying the specific plugin that is causing an error, you can quickly troubleshoot the problem and get your pipeline back on track.

I hope this article helped you to understand the importance of defining plugin IDs in Logstash pipelines.
