Why you do not want to have “all” data in your SIEM
A common task when working with a SIEM is onboarding new technologies: in order to discover and alert on hackers attacking your new cloud instance|shopping website|password vault, you need to see the systems’ event logs and work with them. You basically have two options for getting system event logs into a SIEM:
- Get all the data into your SIEM that a certain system is able to send and later on decide on the security use cases you can implement with these events
- Decide on the security use cases you want to implement and based on this decide which events you will need to get from the monitored system
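The second option can be made very concrete: start from a list of use cases and derive the minimal set of events the source must deliver. As a minimal sketch – the use cases and event names below are purely illustrative assumptions, not a recommendation – this could look like:

```python
# Hypothetical mapping from security use cases to the event types they require.
# Both the use-case names and the event names are illustrative assumptions.
USE_CASES = {
    "brute-force detection": ["auth_failure", "auth_success"],
    "privilege escalation": ["sudo_command", "group_change"],
    "lateral movement": ["auth_success", "remote_session_start"],
}

def events_to_onboard(selected_use_cases):
    """Return the minimal, de-duplicated set of event types the source must send."""
    needed = set()
    for use_case in selected_use_cases:
        needed.update(USE_CASES[use_case])
    return sorted(needed)

# Everything not in this list can be filtered out at the source.
print(events_to_onboard(["brute-force detection", "privilege escalation"]))
```

The point of such a mapping is that every event type in the SIEM can be traced back to at least one use case – anything that cannot is a candidate for filtering.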
In my opinion, the second option is often the wiser one, because you will only work with events that you understand and for which you know why you collect them.
However, in practice you often see the first approach, and there are several reasons for that. Often, the people responsible for the system that needs to send data to the SIEM are not particularly interested in specific security use cases. They get the requirement “send to SIEM” and understandably simply want to tick off this requirement as quickly as possible – as long as some data arrives, it’s fine.
Another reason is a general understanding that
“all events are relevant because it’s for security after all”
I do not agree with this statement.
The problem in practice often is not that you are missing data, but that you are missing the right data. The needed data might never have been sent because the source was not set up properly. Or it might be buried under tons of other data which “for security reasons and just in case” was sent to the SIEM as well. Or it is there, but nobody knows how to find it because it’s unclear what the required events look like.
The challenge with the second approach is that it takes longer until data arrives in your SIEM, and therefore you will have a hard time promoting it.
First you will have to discuss with the system owner which security use cases you would like to cover (and the system owner might not have an opinion on this), next you will need to identify relevant events and check if and how they can be sent by the source system (and the system owner might not know). The system owner will then also need to work on filtering out all the other events which are not needed – which means additional work. And only then will the system start sending data to your SIEM. In a fast-moving IT security environment, this kind of ideal setup can take too long – and people just go with the first approach.
That’s why in practice you often see a scenario where a system owner – let’s say a Linux admin – approaches you, tells you that he “got the requirement to connect all Linux systems to the SIEM” and consequently simply forwards the whole syslog, flooding your SIEM with events that neither you nor he properly knows how to read, analyze and correlate. In cases where the system is not as well understood as Linux – let’s say a rarely used piece of specialized software – understanding the logs can be even harder, leaving you with a lot of data that no one will ever know how to handle.
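If the use cases are known, the flood can be stopped at the source instead of in the SIEM. As a minimal sketch – assuming the Linux hosts use rsyslog, and with `siem.example.com` and the selected program names as purely illustrative placeholders – a source-side filter could look like this:

```
# /etc/rsyslog.d/60-siem.conf -- forward only selected events to the SIEM.
# siem.example.com, the port and the program names are illustrative assumptions;
# derive the real filter from your agreed security use cases.
if ($programname == "sshd" or $programname == "sudo") then {
    action(type="omfwd" target="siem.example.com" port="514" protocol="tcp")
}
# Everything else stays in the local syslog and never reaches the SIEM.
```

With a filter like this, the system owner still fulfills the “send to SIEM” requirement, but only with events that map to a known use case.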
Just to be clear: the vast majority of system owners I work with are very motivated, knowledgeable and competent people. But their job is to keep operations running, not to create security use cases or to dig through event logs. Additionally – as vividly described in the recommended book The Phoenix Project – the most highly skilled people are also likely to be the busiest and least available ones. Luckily, as organizations mature in their IT security efforts, the awareness that IT security is important grows across organizational hierarchies, and more time and budget becomes available to work on such topics.