One of the most useful features you can find in Splunk Enterprise Security is the Risk Analysis Framework. It lets you assign a risk score to your use cases, which you can then use for alerting, threat hunting, and other analytics. Larger organizations will find significant value in the Framework, as most find it difficult to monitor and triage the alert volume produced by their SIEM.
The Risk Analysis Framework is a simple index that stores each of your alerts (notables) with the entity that triggered the alert (called the “risk object,” an IP or username) and a numeric score. You can set a threshold suitable for your organization, and alert when a particular user or system reaches a particular risk score based on the use cases they’ve triggered. So instead of configuring each of your use cases to trigger an investigation, you can run them in the background and only trigger an investigation when the user or system reaches the threshold.
The Risk Analysis Framework can help with the following common issues organizations have with their SIEM environment:
High alert volume and false positive rate
This remains the top SIEM issue to this day. Organizations spend a significant amount of time and resources monitoring alerts from their SIEM use cases, many of which flag suspicious activity, outliers, and rarities, but end up being false positives. Because many of these use cases are recommended by SIEM vendors and incident response providers, organizations can be reluctant to turn them off.
Compliance or audit requirements
Some regulated organizations are mandated to implement and monitor particular use cases, which the security team may consider a low-value activity.
Excessive tuning to make a use case feasible
To make many use cases feasible, organizations often tune out service accounts, certain network segments, and more, which risks tuning out true positives and reducing the value of the use case.
Narrow, short-term view of the incident and related activity
Many use cases only look back an hour, leaving the analyst with a short-term view of the user or entity that generated the alert. Given their workload, analysts may put off looking further back as other incidents arise, rather than running the extended queries that would paint a better picture of the threat actor.
Can’t reach a consensus to disable use cases
It can be easier to implement a use case than to disable it. Many organizations require approvals from multiple teams and lines of business to turn off a use case. No stakeholder wants to be held liable if the use case could have detected an attack, so the default is to leave it enabled.
The Risk Analysis Framework can assist with all of the above.
Have a noisy use case with a high false positive rate? Let it run in the background and add to the user’s or system’s risk score rather than tie up an analyst. If the user or system keeps triggering it, or triggers other use cases as well, then start investigating.
Compliance team giving you a use case that you think isn’t a good use of the security team’s time? Work with them to implement it as a non-alerting rule that adds to the entity’s risk score. I’ve worked with many audit and compliance teams, and they were all open to using the risk scoring method for many of the use cases they proposed.
Unable to implement a use case without filtering service accounts? Create two rules: one that excludes service accounts and generates an alert, and another that includes service accounts and only adds to the user’s risk score.
Want the security team to take a longer-term view of the user or IP that generated the alert? The Risk Analysis Framework is a quick and easy way to provide that visibility. The analyst can quickly see what other use cases the user or IP has triggered in the past 6 months or more.
Can’t reach an agreement to disable a use case? Leave it enabled but convert it to a risk scoring use case.
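The service-account suggestion above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Splunk implements it: the `Detection` class, the `SERVICE_ACCOUNTS` set, the account names, and the score of 20 are all hypothetical. I am also assuming that the risk-scoring rule covers every account (service accounts included), while the alerting rule skips service accounts.

```python
from dataclasses import dataclass

# Hypothetical service account names; in practice this would likely be a lookup.
SERVICE_ACCOUNTS = {"svc_backup", "svc_sql"}


@dataclass
class Detection:
    use_case: str
    user: str


def alerting_rule(d: Detection):
    """Rule 1: excludes service accounts and raises an alert for everyone else."""
    if d.user.lower() in SERVICE_ACCOUNTS:
        return None
    return {"action": "alert", "user": d.user, "use_case": d.use_case}


def risk_rule(d: Detection, risk_score: int = 20):
    """Rule 2: includes service accounts and only adds to the risk score."""
    return {
        "action": "risk",
        "risk_object": d.user,
        "use_case": d.use_case,
        "risk_score": risk_score,  # assumed value for illustration
    }


def evaluate(d: Detection):
    """Run both rules and keep whichever actions fired."""
    return [a for a in (alerting_rule(d), risk_rule(d)) if a is not None]
```

With this split, a detection on `svc_backup` produces only a risk contribution, while the same detection on a regular user produces both an alert and a risk contribution.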
How the Risk Analysis Framework works
The Risk Analysis Framework is an index (data store) for alerts (notables). To assign a risk score to a correlation search, add a “Risk Analysis” action to it.
There are three options when configuring a Risk Analysis action:
1. Risk Object Field: (e.g. Username, source_IP_Address, destination_IP_Address, etc.)
2. Risk Object Type: User, System (e.g. hostname, IP), Other
3. Risk Score: (e.g. 20, 40, 60 etc.)
Most use cases would use “User” as the Risk Object Type, but “System” is more appropriate for some (e.g. a password spraying use case, where the source IP is the entity of interest).
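To make the three options concrete, here is a minimal Python sketch of what a Risk Analysis action conceptually does with a search result. The `RiskAction` class, the `to_risk_event` helper, and the output field names are my own hypothetical names, not Splunk's API; the field names in the sample result follow the examples in the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RiskAction:
    risk_object_field: str  # result field that holds the entity, e.g. "Username"
    risk_object_type: str   # "user", "system", or "other"
    risk_score: int         # e.g. 20, 40, 60


def to_risk_event(action: RiskAction, use_case: str, result: dict) -> Optional[dict]:
    """Turn one search result into a risk entry, or None if the field is absent."""
    entity = result.get(action.risk_object_field)
    if entity is None:
        return None
    return {
        "use_case": use_case,
        "risk_object": entity,
        "risk_object_type": action.risk_object_type,
        "risk_score": action.risk_score,
    }


# A password spraying use case scored against the source system, not a user.
spray = RiskAction("source_IP_Address", "system", 40)
event = to_risk_event(
    spray,
    "Password spraying",
    {"source_IP_Address": "10.0.0.5", "Username": "frank"},
)
```

Here `event` records `10.0.0.5` as a system risk object with a score of 40, even though the result also contains a username.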
Once your use cases are configured with a Risk Analysis action, you can create an alert that sums the risk scores for all risk objects over a particular time period. Using the below table as an example, we could create an alert when a risk object has a score of 250 or greater in the past 6 months. Once the user “Frank” triggered the last alert below, his score would exceed the threshold and an alert would be generated.
|Use Case|Score
|Rare domain visited|
|Account Locked Out|
|Rare File Executed|
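The threshold alert described above boils down to summing scores per risk object over a lookback window. The Python sketch below shows that logic only; it is not a Splunk search. The scores (100, 60, 100), the dates, and the user "alice" are made-up values for illustration, and I approximate 6 months as 180 days.

```python
from collections import defaultdict
from datetime import datetime, timedelta

THRESHOLD = 250
LOOKBACK = timedelta(days=180)  # assumption: roughly 6 months


def objects_over_threshold(events, now, threshold=THRESHOLD, lookback=LOOKBACK):
    """Sum risk scores per risk object within the window; keep those at or above threshold."""
    cutoff = now - lookback
    totals = defaultdict(int)
    for e in events:
        if e["time"] >= cutoff:
            totals[e["risk_object"]] += e["risk_score"]
    return {obj: total for obj, total in totals.items() if total >= threshold}


now = datetime(2024, 6, 30)
events = [
    # Hypothetical scores; the use case names come from the example table.
    {"time": datetime(2024, 2, 1), "risk_object": "frank",
     "use_case": "Rare domain visited", "risk_score": 100},
    {"time": datetime(2024, 4, 15), "risk_object": "frank",
     "use_case": "Account Locked Out", "risk_score": 60},
    {"time": datetime(2024, 6, 29), "risk_object": "frank",
     "use_case": "Rare File Executed", "risk_score": 100},
    # Old activity outside the lookback window is ignored.
    {"time": datetime(2023, 1, 1), "risk_object": "alice",
     "use_case": "Rare File Executed", "risk_score": 300},
]

alerts = objects_over_threshold(events, now)  # {"frank": 260}
```

Before the third event, Frank's total is 160 and nothing fires; the last event pushes him to 260 and over the threshold.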
You can also get creative with the Risk alert by including not only the total risk score, but also the number of unique use cases triggered by the entity.
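As a rough sketch of that idea, the Python below tracks both the total score and the set of distinct use cases per risk object. The function names and the thresholds (250 points, 3 use cases) are hypothetical, and combining the two conditions with "or" is just one possible design choice.

```python
from collections import defaultdict


def summarize(events):
    """Per risk object: total score and the set of distinct use cases triggered."""
    summary = defaultdict(lambda: {"total_score": 0, "use_cases": set()})
    for e in events:
        entry = summary[e["risk_object"]]
        entry["total_score"] += e["risk_score"]
        entry["use_cases"].add(e["use_case"])
    return dict(summary)


def should_alert(entry, min_score=250, min_use_cases=3):
    """Alert on a high total score OR on many distinct use cases (assumed criteria)."""
    return (
        entry["total_score"] >= min_score
        or len(entry["use_cases"]) >= min_use_cases
    )
```

An entity that trips three different low-scoring use cases would alert here even with a modest total, while an entity that repeats one noisy use case would not until its score reaches the threshold.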
What’s an appropriate risk score and alert criteria? That depends on many factors, including your organization’s size, line of business, use cases, and others. Generally speaking, you should look for high scores that are rare in your organization, as well as users that trigger multiple use cases. The average user should not have a risk score in the hundreds or be triggering multiple security use cases.
Does the Risk Analysis Framework need tuning just like your other use cases? The answer is likely yes for your organization. To maximize its value, you should consider omitting particular users (e.g. default account names), use cases that can make it noisy, and other factors that would reduce its value. You should avoid “dumping” use cases into the Risk Framework simply to obtain a check mark, and prevent it from becoming filled with excessively noisy and low-value use cases.
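One simple way to picture this tuning is as a filter applied to risk entries before they are summed. The Python sketch below assumes hypothetical exclusion lists; the account names and the choice of noisy use case are placeholders, not recommendations.

```python
# Hypothetical exclusions; replace with whatever is noisy in your environment.
EXCLUDED_OBJECTS = {"administrator", "guest"}  # e.g. default account names
EXCLUDED_USE_CASES = {"Account Locked Out"}    # e.g. a use case that drowns out others


def tune(events, excluded_objects=EXCLUDED_OBJECTS,
         excluded_use_cases=EXCLUDED_USE_CASES):
    """Drop risk entries for excluded entities or excluded use cases."""
    return [
        e for e in events
        if e["risk_object"].lower() not in excluded_objects
        and e["use_case"] not in excluded_use_cases
    ]
```

Keeping the exclusions in one place makes it easier to review them periodically, so that tuning does not quietly remove true positives.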
So if you’re running Enterprise Security, the Risk Analysis Framework is an easy way to help with many of the common problems in SIEM environments. It’s easy to use, there’s little to maintain, and it can provide significant security value.