Month: November 2018

Alerting On Quiet Log Sources

Data sources that stop logging to your SIEM put your organization at risk. If one of your organization’s firewalls stops logging to the SIEM, your SOC will be blind to malicious traffic traversing it. If your endpoint protection application stops logging, your analysts won’t be able to see if malicious files are being executed on one of your billing servers.

In a perfect world, your SIEM should alert when any data source stops logging to your SIEM. While this is feasible in smaller organizations, it can become daunting in large organizations. It’s easier for your SOC to follow up with one system owner who sits a few cubicles over than with 100 system owners from different lines of businesses. The task of remediating several hundred systems not logging to a SIEM can easily consume an entire resource. In large organizations, network outages, system upgrades and maintenance windows can be a regular occurrence. Should you alert on any data source that stops logging to your SIEM in a predefined period, you could easily end up flooding your SOC, and in a worst case scenario, your analysts will develop a practice of ignoring these alerts.

As a best practice, especially in large organizations, a SIEM should be configured to alert when critical data sources stop logging. The data sources should at minimum include critical servers to the business (e.g. client-facing applications), firewalls, proxies, and security applications. A threshold of less than an hour in your organization may generate excessive alerts, as some sources that are file-based may be delayed by design, for example by copying the file to the SIEM every 30 minutes. However, data sources that haven’t logged in one hour may warrant an alert in your organization.

Another thing to consider when remediating systems not logging to the SIEM is that malware experts and threat intelligence specialists may not be the best resources to chase system owners down. While they may not mind the odd alert for this, they’re not likely going to have time to chase down and get 100 system owners to configure their systems properly, or have the patience to continuously follow up with them. Thus, in larger organizations, project management may be a good fit for this task.

Having all your systems log to your SIEM is a critical part of reducing your organization’s risk. Having a practical, manageable task for remediating systems that stop logging will ensure the process is followed and the risk is reduced.

Disable Unused Content

When building new SIEM environments or working with existing ones, one of the quickest ways you can improve the performance and stability of the environment is to remove unused content. While this may seem obvious to experienced SIEM resources, it’s common to find reports or rules running in the background that don’t serve a purpose. In some environments, unused content can be slowing the system down and contributing to application instability. Unused content is especially common in environments that don’t have enough staff to manage the SIEM.

Default rules provided by the vendor are often enabled but unused. The first indication that a rule is unused is if it doesn’t have an action or it isn’t used for informational purposes. If a rule isn’t alerted to the attention of an investigator or SIEM engineer, it may be that the rule is simply running in the background consuming system sources. A rule to trigger an alert when someone logs into the SIEM may be useful, but an ad-hoc report to obtain the same information may suffice. A significant amount of inefficient rules that match a large percentage of events can adversely affect the performance of the environment.

Reports can be another source of unused content. In many environments, I find reports that were originally setup to be used temporarily, but are no longer being used by the recipient. It’s often for the recipient to forget to follow up with the SIEM staff to note the reports are no longer required. Over a period of several years, this can easily amount to several dozen reports running on a regular basis, putting a significant strain on the system for no benefit.

All SIEM environments are different, and there’s no set of content that must be enabled or disabled. But there’s very likely content in your environment that can be disabled, and the system resources can instead be used to provide security analysts better search response times. So on a regular basis or whenever there’s a complaint about search response times or application instability, determine if there’s any content that can be safely disabled.

Effective Searching

There are two critical reasons end users should learn how to search their SIEM effectively. Ineffective searching is a risk to your organization, where end users can produce inaccurate data, and thus provide inaccurate investigation results. Ineffective searching can also degrade the SIEM’s performance, increasing the amount of time required for analysts to obtain data, while affecting the overall stability of the system.

If a security analyst is asked to perform an investigation and searches incorrectly, the results for a query on malicious traffic may return null when in fact there are matches. A compromised user account may be generating significant log data, but your analysts can’t obtain logs for it because they are searching for “jsmith” instead of a case-sensitive “JSmith.” End users can also match on incorrect fields, believing they are finding the correct data when they are not.

Ineffective queries can lengthen the amount of time required to complete them, and increase the system resources used by the SIEM. Many queries can improved to significantly increase their performance, making the end user happier with a faster response time, and a healthier system that has more CPU and RAM to work on other tasks. A simple rule of thumb is to match as early in the query as possible to limit the amount of data the system searches through. Searching for data in particular fields rather than searching all fields is also a way to reduce the amount of processing the system must do. Additionally, some SIEM tools allow you to easily check for poorly performing queries. For example, Splunk’s Search Job Inspector can not only show you which queries are taking the longest, but even which parts of the query are taking longer than others, allowing you to optimize accordingly.

It’s also common for security analysts to get requests for excessive data. In many cases requestors will ask for more information than is required in order to let them drill-down into the information they need, instead of having to submit multiple requests for data. For example, there may be a request to pull log data on a user for the past two months, when all that is required is some proxy traffic for a few days. These types of requests can be resource-intensive on SIEMs, especially if there are multiple queries running simultaneously. The impact can be more severe when the queries are scheduled reports. Scheduling multiple, large, inefficient queries on a regular basis can consume a significant amount of system resources. With a few inquiries to the requestor, the security analyst may be able to significantly reduce the amount of data searched for.

While ineffective searching is a risk, it’s a simple one to reduce. Training sessions, lunch-and-learns, or workshops can significantly reduce the risks of analysts searching incorrectly and consuming unnecessary system resources. I find a simple three-page deck can provide enough information to assist analysts with searching, highlighting the tool’s case sensitivity, common fields, and sample queries. Nearly all SIEM vendors offer complimentary documentation that will show you how to search best with their product. Thus, a few hours of effort can reduce searching risks while optimizing your SIEM environment.