Month: August 2018

What is SIEM and how it differs from other security tools

Now that we understand what log data is, let’s discuss the technology that will allow your organization to collect and use it.

Security Information and Event Management (SIEM) is a technology that processes log data from your various systems, analyzes it, makes it available for searching, and stores it. The term SIEM is itself a combination of two other abbreviations: Security Information Management (SIM) and Security Event Management (SEM). SIM is focused on the collection of log data for investigative and compliance purposes. SEM is focused on alerting and analytics: threat detection, pattern anomalies, and correlating different data sources.

SIEM tools can vary in architecture, but generally have two layers: A Processing Layer and an Analytics Layer. The Processing Layer is where data is structured, aggregated, and forwarded to the Analytics Layer. The Analytics Layer is where data is stored, made available for searching, and where security analytics is performed.

Using the above diagram as an example, the data sources are your various systems that produce log data. They either send (push) that data to your Processing Layer, or your Processing Layer reaches out and retrieves (pulls) it through a database query or API call. Depending on the SIEM product, the Processing Layer will structure the log data by normalizing it into a standard format, aggregate it by combining similar events into one, or simply add an index to it before forwarding it to the Analytics Layer. The Processing Layer is used strictly for processing; typically the only data it retains is a cache of events held while the Analytics Layer is unavailable.
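To make normalization and aggregation a little more concrete, here is a minimal Python sketch. It is not how any particular SIEM product works; the key=value line format, the field names, and the normalize and aggregate helpers are assumptions made purely for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw lines in a simple "key=value" style (an assumed format).
RAW_LINES = [
    "May 1 2018 1:00PM, IP=10.1.1.1, User=Bob, Message=login failure",
    "May 1 2018 1:01PM, IP=10.1.1.1, User=Bob, Message=login failure",
    "May 1 2018 1:02PM, IP=10.1.1.1, User=Bob, Message=login success",
]


def normalize(line):
    """Turn one raw line into a dict with a parsed timestamp and lowercase keys."""
    timestamp, *pairs = [part.strip() for part in line.split(",")]
    event = {"timestamp": datetime.strptime(timestamp, "%b %d %Y %I:%M%p")}
    for pair in pairs:
        key, _, value = pair.partition("=")
        event[key.strip().lower()] = value.strip()
    return event


def aggregate(events):
    """Combine events sharing IP, user, and message into one record with a count."""
    counts = Counter((e["ip"], e["user"], e["message"]) for e in events)
    return [
        {"ip": ip, "user": user, "message": message, "count": count}
        for (ip, user, message), count in counts.items()
    ]


events = [normalize(line) for line in RAW_LINES]
summary = aggregate(events)
for record in summary:
    print(record)
```

Aggregating this way trades detail for volume: the two failures collapse into one record with a count of 2, and the individual timestamps are dropped. Whether that trade is acceptable depends on what your analysts need to search for later.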

The Analytics Layer is where end users will search for data, create reports and use cases. Depending on the SIEM product, it may structure log data, and may act as a long-term retention repository.

While SIEM is defined as a security application, it differs significantly from your other security tools. Your SIEM will process log data, while your proxy, IDS/IPS, and some malware detection tools will typically process network traffic. Packets going over your network use the same protocols, so you don’t need to customize your firewall or IPS to detect TCP traffic. Your SIEM, on the other hand, will need to process log data in various formats, many of which may not be supported by your SIEM vendor.

Your IDS/IPS and malware tools provide you with a list of signatures that will be automatically updated on a regular basis. Some IDS/IPS tools allow you to implement custom signatures, but for the most part your analysts won’t have to write custom signatures for known vulnerabilities, exploits, and attacks. Your SIEM staff will need to create custom use cases and update them regularly, as you’re unlikely to use much of your SIEM vendor’s default content.

SIEM vendors support many log sources, but your engineers will need to ensure the right parsers are being used, update them regularly, and write any that are not supported by your SIEM vendor. This is in stark contrast to your network devices and other security tools that only have to work with limited protocols such as TCP/IP, HTTP, and HTTPS.

Staff will be logging into your proxy, firewall, IDS/IPS, and malware tools often, but it will mostly be for administrative purposes. Your SIEM will have many end users, ranging from admins to users searching for data. In large environments, it’s common to have several users searching for data simultaneously. For MSPs, your customers may be logging in to search for data as well.

While most of your other security tools can block a malicious host from egressing your network or block users going to an uncategorized site, SIEMs don’t have the capability to block.

The following table summarizes the differences between SIEM and your other security products.

As you can see from the above table, a SIEM differs significantly from your black-boxed IPS or malware tools. While it may seem that it’s simply a log aggregator, a SIEM is a complex tool that will need significant customization. The environment can have many stakeholders, from security analysts to compliance and access management teams. Ultimately, it will need to be implemented, operated, and maintained differently.

What’s the big deal with log data?

So what’s the big deal with all this log data, and why on earth should I spend a large chunk of my budget to collect it? Aren’t the other security tools I have good enough? What exactly is in all this log data, anyway?

Log data is a form of one of today’s most valuable assets: data. Google, Twitter and Facebook collect enough data on people to detect flu outbreaks faster than medical professionals can. GPS data gave a software company that doesn’t own a single taxi the opportunity to become the world’s largest taxi service. A computer algorithm can recommend a movie you’d like to watch and spare you from having to read the reviews of movie critics. Amazon can tell you what book you’d like to read next or what household products you may be running low on.

In the context of cyber security, log data contains records of activity from your various IT systems. These records can help you understand what goes on inside your network. They can show you which user accounts are being used. They can show you which users are consistently visiting blocked websites. They can show you the suspicious files being blocked by your endpoint protection application. They can highlight suspicious processes running on your servers. They can tell you which exploits your web servers are vulnerable to and if anyone is trying to attack them. Ultimately, they can uncover activity in your network that is adding risk to your organization.

Log data is typically output to a file or database, where it was traditionally used for troubleshooting purposes. If someone couldn’t log into a particular application, the system admins would check the log files to see if they could find out why. If a customer application was down, the support team would check the log files to see if they could find out the cause of the crash.

As the amount of log data grew, many saw that the files sitting on their servers contained invaluable data. Many applications were born to manage all of this data, helping organizations search through it and detect issues before they became outages. In the early 2000s, some programmers with a security mindset thought of creating an application that would act as a centralized repository of log data for security investigators, and that could alert in real time when particular values or suspicious patterns were detected in the log data. The result was the birth of SIEM, Security Information and Event Management.

Let’s take a quick peek at some log data. Here’s a small sample of authentication activity, which shows a user failing to log in twice and then successfully logging into their workstation.

-May 1 2018 1:00PM, IP=10.1.1.1, User=Bob, Message=login failure
-May 1 2018 1:01PM, IP=10.1.1.1, User=Bob, Message=login failure
-May 1 2018 1:02PM, IP=10.1.1.1, User=Bob, Message=login success

Most log files will at minimum answer who, what, when, where, why, and how. Since the advent of SIEMs, most vendors now provide detailed logging for their applications, and some even allow you to customize what is output.

Here you can see a couple of punctual users logging into their company network in the morning, generating VPN login data:

-May 1 2018 8:50AM, IP=23.91.128.44, User=John, message=VPN Login Success
-May 1 2018 8:54AM, IP=23.95.148.12, User=Bob, message=VPN Login Success

Log files can also be specific to an application. Here we have some startup activity on the billing server:

-May 1 2018 9:54AM, hostname=billingserver01, message:NOTICE: Application starting
-May 1 2018 9:55AM, hostname=billingserver01, message:NOTICE: Running startup scripts

That’s great, you may think, but why should you devote resources to collect and manage this data? Let’s expand the above entries and see what the big deal is.

Using the authentication activity again:

-May 1 2018 1:00PM, IP=10.1.1.1, User=asmith, Message=login failure
-May 1 2018 1:01PM, IP=10.1.1.1, User=bsmith, Message=login failure
-May 1 2018 1:02PM, IP=10.1.1.1, User=csmith, Message=login failure
-May 1 2018 1:03PM, IP=10.1.1.1, User=dsmith, Message=login failure
-May 1 2018 1:04PM, IP=10.1.1.1, User=esmith, Message=login failure
-May 1 2018 1:05PM, IP=10.1.1.1, User=fsmith, Message=login failure
-May 1 2018 1:06PM, IP=10.1.1.1, User=gsmith, Message=login failure
-May 1 2018 1:07PM, IP=10.1.1.1, User=hsmith, Message=login failure

These log entries become interesting now that someone is trying to log into the billing server with a different “smith” username every minute, working through the alphabet one letter at a time. This small story could be many things: a developer testing something, a script running in the background, or someone trying to guess a username in an attempt to gain unauthorized access to the server.
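A pattern like this is the kind of thing a SIEM use case can look for. The following Python sketch shows one possible rule: flag any source IP that fails to log in as more than a handful of different users within a short window. The 10-minute window, the threshold of 5 users, and the find_username_guessing helper are assumptions chosen for illustration, not values taken from any product.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed thresholds; tune them for your own environment.
WINDOW = timedelta(minutes=10)
MAX_DISTINCT_USERS = 5


def find_username_guessing(events):
    """Return source IPs with failed logins for more than
    MAX_DISTINCT_USERS different accounts inside one WINDOW."""
    failures = defaultdict(list)  # ip -> [(timestamp, user), ...]
    for event in events:
        if event["message"] == "login failure":
            failures[event["ip"]].append((event["timestamp"], event["user"]))

    flagged = []
    for ip, attempts in failures.items():
        attempts.sort()
        for window_start, _ in attempts:
            window_end = window_start + WINDOW
            users = {u for t, u in attempts if window_start <= t < window_end}
            if len(users) > MAX_DISTINCT_USERS:
                flagged.append(ip)
                break
    return flagged


# The eight failures from the sample above, already structured.
start = datetime(2018, 5, 1, 13, 0)
events = [
    {"timestamp": start + timedelta(minutes=i), "ip": "10.1.1.1",
     "user": f"{letter}smith", "message": "login failure"}
    for i, letter in enumerate("abcdefgh")
]
flagged = find_username_guessing(events)
print(flagged)
```

Counting distinct usernames rather than total failures is a deliberate choice here: one user mistyping a password several times stays quiet, while one host cycling through many accounts stands out.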

Let’s take a look at the VPN log again:

-May 1 2018 8:50AM, IP=23.91.128.44, User=Bob, message=VPN Login Success
-May 8 2018 8:55AM, IP=23.91.128.44, User=Bob, message=VPN Login Success
-May 15 2018 8:52AM, IP=23.91.128.44, User=Bob, message=VPN Login Success
-May 22 2018 8:59AM, IP=23.91.128.44, User=Bob, message=VPN Login Success
-May 29 2018 8:44AM, IP=23.91.128.44, User=Bob, message=VPN Login Success
-May 29 2018 9:30PM, IP=62.176.64.51, User=Bob, message=VPN Login Success

There’s nothing unusual about Bob being his punctual self and logging into work, except that “he” also logged in from Bulgaria at about 9:30PM on May 29. A scenario like this could be Bob on a business trip, or not Bob at all.
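One simple way to surface a login like that last one is to compare each login against the source IPs already seen for the same user. The Python sketch below does only that; it does not look up countries, which would need a GeoIP database that is not shown here. The find_new_source_ips helper and the min_history baseline of 3 logins are assumptions for illustration.

```python
from collections import defaultdict


def find_new_source_ips(logins, min_history=3):
    """Return logins whose source IP was never seen before for that user.

    A user is only evaluated once min_history earlier logins exist, so the
    first few logins build a baseline instead of raising alerts.
    """
    seen_ips = defaultdict(set)
    login_counts = defaultdict(int)
    unusual = []
    for timestamp, user, ip in logins:  # assumed to be in chronological order
        if login_counts[user] >= min_history and ip not in seen_ips[user]:
            unusual.append((timestamp, user, ip))
        seen_ips[user].add(ip)
        login_counts[user] += 1
    return unusual


# The VPN logins from the sample above as (timestamp, user, ip) tuples.
logins = [
    ("May 1 2018 8:50AM", "Bob", "23.91.128.44"),
    ("May 8 2018 8:55AM", "Bob", "23.91.128.44"),
    ("May 15 2018 8:52AM", "Bob", "23.91.128.44"),
    ("May 22 2018 8:59AM", "Bob", "23.91.128.44"),
    ("May 29 2018 8:44AM", "Bob", "23.91.128.44"),
    ("May 29 2018 9:30PM", "Bob", "62.176.64.51"),
]
unusual = find_new_source_ips(logins)
print(unusual)
```

A rule this simple will also fire when Bob logs in from a hotel or a new home connection, so in practice it would likely be combined with other signals such as location or time of day.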

Finally, let’s take a look at some file executions in a log file. Here is a sample of a system updating itself. For some reason the last file executed doesn’t look like a standard update file, which could be indicative of a malicious file being executed.

-May 4 2018 1:10AM, hostname=billingserver01, msg=file “update_01.exe” executed
-May 4 2018 1:13AM, hostname=billingserver01, msg=file “update_02.exe” executed
-May 4 2018 1:15AM, hostname=billingserver01, msg=file “update_03.exe” executed
-May 4 2018 1:50AM, hostname=billingserver01, msg=file “A2.exe” executed
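Spotting the odd file out can also be automated. The Python sketch below assumes that legitimate update files follow the update_NN.exe naming seen in the sample, which is only an inference from these four lines, and reports any executed file that does not match. The unexpected_executions helper and both regular expressions are hypothetical.

```python
import re

# Assumed naming convention for legitimate update files, inferred from the sample.
EXPECTED_UPDATE = re.compile(r"^update_\d+\.exe$")

# Pulls the quoted file name out of the msg field (straight or curly quotes).
EXECUTED = re.compile(r'file\s+[“"](?P<name>[^”"]+)[”"]\s+executed')


def unexpected_executions(lines):
    """Return executed file names that do not match the expected update pattern."""
    names = []
    for line in lines:
        match = EXECUTED.search(line)
        if match and not EXPECTED_UPDATE.match(match.group("name")):
            names.append(match.group("name"))
    return names


lines = [
    'May 4 2018 1:10AM, hostname=billingserver01, msg=file "update_01.exe" executed',
    'May 4 2018 1:13AM, hostname=billingserver01, msg=file "update_02.exe" executed',
    'May 4 2018 1:15AM, hostname=billingserver01, msg=file "update_03.exe" executed',
    'May 4 2018 1:50AM, hostname=billingserver01, msg=file "A2.exe" executed',
]
unexpected = unexpected_executions(lines)
print(unexpected)
```

An allowlist based on file names is easy to evade by renaming a file, so treat this as a way to illustrate the idea rather than a reliable detection on its own.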

As you can see, log data can contain invaluable information that can help your organization investigate suspicious activity and detect attacks in real time. Log data can indicate issues brewing in your systems, which can then be caught before an outage or breach occurs. SIEM is a technology that centralizes log data, makes it available for searching, allows staff to alert on suspicious activity, and ultimately enhances the efficiency and effectiveness of your organization’s security operations.