Implementation Considerations for the Elastic Common Schema

Here at Code42, we use Elasticsearch as one of our solutions for log aggregation, searching, and alerting. Out of the box, Elasticsearch is great for doing raw document searches, and doesn’t require much in the way of defining document schemas or parsers. You can basically just start reading data via Logstash with a fairly simple config, or you can use Beats to collect data from various devices and have the collector worry about parsing and formatting. However, to take advantage of visualizations, or to present data from dissimilar tools in a consistent format that makes it easy to enable alerting and targeted searches, it is vital to have a consistent data schema.

Many attempts at a common data format for security-type events have been created over the years, such as CEF (Common Event Format) from ArcSight, the similar LEEF (Log Event Extended Format) from IBM, and the XML-based Common Information Model (CIM). None of these formats has become the true industry standard, and while many security tools and appliances support export into one of these data formats, it is just as common to see security data being emitted by security tools in Syslog or CSV formats. At the same time, the rise of SaaS tools and APIs means that more and more data is being shared in JSON format, which often doesn’t translate well to older, less-extensible formats.

The Elastic Common Schema is a new schema that attempts to be more flexible, being updated frequently with community input. This is a worthy goal, and the frequency at which updates have been published is very encouraging, although as I discuss later, it represents a challenge in terms of implementation. It’s a hierarchical format that coexists naturally with JSON or XML, and it is easy to extend.

From the beginning of our own Elasticsearch usage, we’ve tried to be consistent when it comes to parsing data. Since Elasticsearch is just one of the SIEM-like tools we use, switching to an all-Elastic stack (including using Beats everywhere) just wasn’t an option. That meant a lot of Logstash grok parser writing, and without a pre-defined schema, even the best attempts at normalization led to small differences between data sets. So when ECS 1.0.0 was released last year, we made the decision to implement it.
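To give a sense of what that grok work looks like when targeting ECS field names, here is a minimal Logstash filter sketch. The log line format, the ECS version number, and the field choices are illustrative assumptions, not our production config:

```
filter {
  grok {
    # Hypothetical pattern for an SSH auth log line, mapping
    # captures directly into ECS field names
    match => {
      "message" => "Accepted %{WORD:[ssh][auth_method]} for %{USERNAME:[user][name]} from %{IP:[source][ip]}"
    }
  }
  mutate {
    # Tag the schema version and an ECS-allowed event category
    add_field => {
      "[ecs][version]"   => "1.4.0"
      "[event][category]" => "authentication"
    }
  }
}
```

Using the ECS names directly in the grok pattern keeps the parser and the schema in one place, which makes later schema upgrades easier to audit.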

Implementation was a very detailed process, and is still an ongoing one due to the aforementioned frequent release cycle. Below are the main lessons we’ve learned during the implementation:

  • Pay attention to field requirements – Many of the fields in ECS, particularly the Event fields, have current or planned requirements that restrict values to a defined set of allowed values. Whenever possible, follow those requirements as early as you can, and don’t deviate. This will minimize rework later as the schema gets updated.
  • Don’t be afraid to add your own fields – ECS is meant to be extensible, and there is little harm in defining your own fields, or even your own hierarchy. If there is data you want to report or alert on, create a field definition for it and use it. The good news is that ECS is rapidly expanding the schema to cover the most typical security event categories; the changes and field additions between 1.0 and 1.4 are quite significant. Nevertheless, there will always be cases where custom fields are necessary. In our case, metric data sent by our tools is a prime example of a case where we use custom fields.
  • Use index templates to your benefit – Index templates are a critical part of managing Elasticsearch, and the ECS project includes index templates that define its fields for easy consumption. Because multiple templates can be applied to a single index, we layer our own custom fields on top of the official ECS template by defining them in a separate template document. That way, we keep the official ECS template “clean” and can see at a glance which fields are custom to our environment.
  • Don’t feel the need to create fields for everything – As easy as it is to extend ECS, don’t feel the need to create a field definition for every bit of data. If you aren’t going to be alerting or aggregating on it, you probably don’t need to parse it out or create a field definition.
  • Use the message field – The event.original field is meant to have the original contents of the log entry, but it is not indexed, hence not searchable. If you want to be able to do full-text searches of the entirety of your logs, particularly if you don’t parse all data out per the advice above, copy the data to the message field so you can search on it. Note that this applies mainly to data ingested as raw text, not structured data like JSON.
  • Set the ecs.version field to track version info – There is an ecs.version field to document what version of ECS was followed when parsing the data, so make sure you use it. This simplifies setting up saved searches, visualizations, and dashboards and helps you find log sources that you may not have updated.
  • Use field aliases – After all this work renaming fields, you probably have a lot of searches, dashboards, or even just muscle memory for accessing data in a certain way. By using an alias field mapping, you can point your old field names to the new ones to smooth over the transition. I’ve found that aliases need to be defined in the same index template document as the main field definition, so you can either add them to the ECS template or duplicate field definitions in your custom template and add the aliases there.
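The template-layering approach above can be sketched with the legacy index template API. The index pattern and the custom field names here are made up for illustration; the idea is that this template merges on top of the ECS-provided one (applied at a lower order) for matching indices:

```
PUT _template/custom-ecs-extras
{
  "order": 1,
  "index_patterns": ["logs-*"],
  "mappings": {
    "properties": {
      "code42": {
        "properties": {
          "metric_name":  { "type": "keyword" },
          "metric_value": { "type": "float" }
        }
      }
    }
  }
}
```

Keeping custom fields under their own top-level object (here a hypothetical `code42` namespace) avoids collisions with future ECS field additions.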
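Copying raw text into the indexed `message` field is straightforward in Logstash. This sketch assumes `event.original` has already been populated upstream:

```
filter {
  mutate {
    # Keep the raw log line in the indexed message field so it
    # remains full-text searchable even when not fully parsed
    copy => { "[event][original]" => "message" }
  }
}
```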
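A field alias is declared in the mappings with the `alias` type and a `path` pointing at the concrete field. Consistent with the note above about aliases needing to live alongside the main field definition, this sketch duplicates the concrete `source.ip` definition in the custom template next to the alias (the old field name `src_ip` is an example):

```
PUT _template/custom-aliases
{
  "order": 1,
  "index_patterns": ["logs-*"],
  "mappings": {
    "properties": {
      "source": {
        "properties": {
          "ip": { "type": "ip" }
        }
      },
      "src_ip": { "type": "alias", "path": "source.ip" }
    }
  }
}
```

Searches and visualizations that reference `src_ip` then resolve transparently to `source.ip`, smoothing the migration.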

All of that info may look like a lot to consider, but don’t let it dissuade you from moving towards a common schema. The ability to aggregate security events regardless of source is well worth it in the end.