You Still Need UEBA

After spending significant time and money implementing a SIEM, the last thing anyone wants to hear is that you need to spend even more time and money on something called a UEBA (User and Entity Behavior Analytics) to get value out of your SIEM. The good news is that many UEBAs are now add-ons to your SIEM that can be deployed quickly, instead of being another expensive multi-year project.

When UEBAs were first introduced, they were typically stand-alone applications that worked separately from your SIEM and could be just as challenging to implement and maintain. This meant additional skillsets, staff, and data parsers for what was essentially a parallel application. They also required a separate feed of security and application log data, often the same data you were already sending to your SIEM.

UEBAs differ by vendor, but they can provide value whether deployed stand-alone or as an add-on. As the name implies, the application focuses on the “user” or “entity” (e.g. a system) instead of the individual alerts that SIEMs were traditionally designed around. UEBAs can provide various analytics on what’s going on in your environment, such as rarities and outliers, and can consolidate all of it by the user or system performing the actions. Common use cases include:

  • Unknown file executed for the first time in your environment.
  • User visited a newly created domain.
  • User visited an uncommon domain.
  • User connected from a new country for the first time.
  • User connected with a new user agent.

While those are all interesting and potentially suspicious scenarios, it’s simply not possible for most companies to review all of this activity in their environment. An unknown file that hasn’t run before in your network could be ransomware, but it’s more likely just an update for one of the hundreds of applications you have. A user connecting from a different country for the first time with a new user agent could indicate a compromised account, but it’s more likely someone logging in from their phone while on vacation.

You could technically do all of the above in a SIEM, but you’re going to need a team of data scientists to create and maintain the queries. UEBAs typically do all of this out-of-the-box, lifting this burden from the security team.

Again, rarities and outliers do not necessarily warrant an investigation, but they can help you identify the riskier behaviour in your environment. SIEM alerts, on the other hand, are typically designed to look for specific suspicious activity, for example:

  • Single failed pass the hash attempt.
  • Multiple RDP login failures.
  • Running something from the /tmp folder.
  • User clicked on a blocked link.
  • User attempted to execute a file that was blocked by the EDR.

The scenarios above can be suspicious, but like UEBA analytics, they can be common in large environments. Sending each of these to the SOC for investigation may not be a good use of their time.

But by combining SIEM and UEBA analytics, you start to get a much better picture of who the riskiest users and entities in your environment are and how to prioritize investigations. It may be hard to justify an investigation for any single one of these alerts, especially in large environments, but it may be risky to ignore a user who has triggered many of them.

Since you’re not triggering an investigation for each use case, many “useless” use cases that would never get the SOC’s approval become feasible and useful: they don’t trigger an investigation, but they increase a user’s or entity’s risk score. Large organizations, for example, may not review each successfully blocked or quarantined antivirus or EDR event, since the volume makes individual alerts infeasible, but one of those events could be the start of a greater attack. That activity alone may not trigger a UEBA user investigation, but it can put the user or system on the security team’s radar.

Depending on your environment, you may want to weight certain analytics over others when scoring the top users and entities. You can weight by user severity (e.g. privileged users, executives), MITRE ATT&CK stage (e.g. scoring Execution use cases higher than Reconnaissance), and more, as in the sketch below.
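
To make the weighting concrete, here is a minimal Python sketch of the idea. The tactic weights, user tiers, and event scores are hypothetical illustrations, not values from any particular UEBA product.

from collections import defaultdict

# Hypothetical weights by MITRE ATT&CK tactic; Execution scores higher than Reconnaissance.
TACTIC_WEIGHTS = {"Reconnaissance": 1.0, "Execution": 3.0, "Exfiltration": 5.0}

# Hypothetical multipliers by user tier; privileged users and executives weigh more.
USER_MULTIPLIERS = {"standard": 1.0, "executive": 2.0, "privileged": 3.0}

def score_event(base_score, tactic, user_tier):
    # Combine the analytic's base score with tactic and user weightings.
    return base_score * TACTIC_WEIGHTS.get(tactic, 1.0) * USER_MULTIPLIERS.get(user_tier, 1.0)

# Aggregate scores per user across SIEM alerts and UEBA analytics.
risk = defaultdict(float)
events = [
    ("frank", 10, "Execution", "privileged"),      # rare file executed
    ("frank", 5, "Reconnaissance", "privileged"),  # uncommon domain visited
    ("mary", 5, "Reconnaissance", "standard"),     # uncommon domain visited
]
for user, base, tactic, tier in events:
    risk[user] += score_event(base, tactic, tier)

# Surface the riskiest users first.
for user, total in sorted(risk.items(), key=lambda kv: kv[1], reverse=True):
    print(user, total)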

As mentioned, many UEBA applications are no longer an exotic, risky security tool implementation. Some are simple add-ons to your SIEM that can be set up and configured quickly; Microsoft Sentinel UEBA, for example, is a simple add-on to the Sentinel SIEM.

Some UEBAs are stand-alone applications that require a separate copy of the log data that feeds your SIEM. Splunk UBA, for example, is a stand-alone application that works separately from the SIEM. While it may not be as simple as an add-on, stand-alone UEBAs can still provide valuable analytics.

There are also some simpler apps that act like a UEBA. The Splunk Risk Analysis Framework in Enterprise Security is a valuable tool that gives you UEBA-like analytics with your SIEM rules.

Whichever UEBA you choose, when combined with your SIEM and other security analytics it can further highlight high-risk users and systems in your network. Which UEBA is best for your organization depends on many factors, such as your environment, provider, systems, line of business, and more. But if you have a SIEM, you should definitely consider using a UEBA to add value to your environment. It can be a quick, inexpensive, and valuable addition to your cyber security team.

Top SOAR Use Cases

If you’re not familiar with SOARs (Security Orchestration, Automation, and Response), they’re one of the newer tools used by cyber security teams to consolidate your various alerts, automate common SOC tasks, improve incident response times, optimize data extraction, and reduce the amount of administrative work done by SOC analysts. The larger your organization, the more value and efficiency they can provide to your already stretched cyber security team.

Here are some of the top ways SOARs can help your organization.

Automatic Intel Lookups and Alert Enrichment

Security analysts spend a significant portion of their day performing lookups in various applications. When a SIEM alert fires, they’ll likely perform a lookup in VirusTotal, detonate a suspicious file in a malware sandbox, pull domain info from a Whois service, look up the user in an identity management application, and more. Doing this for a single alert may only take fifteen minutes, but doing it several times per day can consume most of an analyst’s time. While collecting this data is necessary for an investigation, pulling it repetitively becomes burdensome administrative work.

Most SOARs integrate with many of the popular intelligence services and can obtain the majority of the data required before the analyst even opens the alert. Instead of spending twenty minutes running reports, analysts can jump right into the investigation with the SIEM, firewall, Active Directory, and asset management reports right in front of them.
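
Most SOARs ship these integrations out-of-the-box, but to make the enrichment step concrete, here is a minimal Python sketch that pulls an IP reputation summary from VirusTotal’s v3 REST API. The endpoint and header follow VirusTotal’s public documentation at the time of writing; the API key is a placeholder.

import requests

VT_API_KEY = "YOUR_API_KEY"  # placeholder; a SOAR stores this in its credential vault

def enrich_ip(ip):
    # Pull a reputation summary for an IP so the analyst doesn't have to do it manually.
    resp = requests.get(
        "https://www.virustotal.com/api/v3/ip_addresses/" + ip,
        headers={"x-apikey": VT_API_KEY},
        timeout=10,
    )
    resp.raise_for_status()
    stats = resp.json()["data"]["attributes"]["last_analysis_stats"]
    return {"ip": ip, "malicious": stats["malicious"], "harmless": stats["harmless"]}

# A playbook would run this for each indicator and attach the results to the ticket.
print(enrich_ip("203.0.113.7"))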

Case Management Application

Not only can SOARs collect most of the data required for an investigation, they can consolidate it into a single pane of glass. Take a look at your analysts’ screens and you’ll likely see multiple monitors with several tabs open on each, and they’ll have to log into each application several times per day due to session timeouts.

SOARs can consolidate all of the data they collect into a single ticket that analysts can work off of. SIEM and firewall reports, lookup results, and more can be attached directly to the ticket to be used in the investigation and retained as evidence. Most SOARs can also create alert metrics quickly, allowing organizations to produce lists of SIEM and other alerts along with their true and false positive rates, giving security teams insight into their high-value content.
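
As a sketch of how those metrics fall out of a case management store, here is a minimal Python example that computes per-rule true positive rates from closed tickets. The ticket fields and verdict values are hypothetical.

from collections import Counter

# Closed tickets as a SOAR might store them; fields and verdicts are made up.
closed_tickets = [
    {"rule": "Multiple RDP login failures", "verdict": "false_positive"},
    {"rule": "Multiple RDP login failures", "verdict": "true_positive"},
    {"rule": "User clicked on a blocked link", "verdict": "false_positive"},
]

totals = Counter()
trues = Counter()
for ticket in closed_tickets:
    totals[ticket["rule"]] += 1
    if ticket["verdict"] == "true_positive":
        trues[ticket["rule"]] += 1

# Per-rule true positive rates highlight your high-value content.
for rule, total in totals.items():
    print(f"{rule}: {trues[rule] / total:.0%} true positive over {total} alerts")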

Escalations

One of the perennial challenges for security operations teams is determining which alerts to prioritize. Many vendors set priorities for their IPS signatures and SIEM rules, but these are often too generic to be used effectively. Running something from the /tmp folder may be significant at one organization, but not at another.

SOARs can perform a lookup in most identity management applications and change an alert’s severity based on the user’s role, line of business, and other factors. A generic alert suddenly becomes useful when it’s the CEO, an executive, or a privileged user triggering it. Login failures are happening right now at any large organization, but if they’re against the CEO’s account, that should take priority over any other brute force alert. Is a customer-facing bank branch staff member trying to RDP into servers on the network? Since that’s not something a branch employee would (or should) do, it should take higher priority over other alerts.
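
Here is a minimal Python sketch of that escalation logic. The directory structure, role names, and alert fields are hypothetical placeholders for whatever your SOAR’s identity integration returns.

# Roles that should escalate an alert; the names are hypothetical.
HIGH_RISK_ROLES = {"executive", "domain_admin", "privileged"}

def escalate(alert, user_directory):
    # Raise the alert severity when the triggering account is high-value.
    role = user_directory.get(alert["username"], {}).get("role", "standard")
    if role in HIGH_RISK_ROLES:
        alert["severity"] = "high"
        alert["notes"] = "Escalated: triggered by " + role + " account"
    return alert

directory = {"ceo.user": {"role": "executive"}}  # stand-in for an identity management lookup
alert = {"username": "ceo.user", "rule": "Multiple login failures", "severity": "low"}
print(escalate(alert, directory))  # severity is now "high"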

Email

SOARs can integrate with common email services, including Exchange and Gmail. Before a user opens his or her email after a meeting, your SOAR could have already uploaded the URL or suspicious attachment to VirusTotal or WildFire and deleted the email, before the user has a chance to click the link or open a file flagged as malicious.

Email is also an excellent way to “talk” to a SOAR and perform ad-hoc requests. Many SOARs can be configured to monitor an email inbox, with playbooks set up to run based on certain text values found in selected emails. Want a bunch of IP addresses or domains looked up but don’t know how to search for them in the SIEM, or don’t have access to the SIEM? Not a problem. The SOAR can parse IPs, domains, and other text out of the email, look up the values in the SIEM, firewall, or another security application, and return the results as a CSV attachment in a reply.
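
A minimal Python sketch of the parsing half of that flow is below. The regexes are simple illustrations, and the SIEM lookup itself is left out as a placeholder.

import csv
import io
import re

# Simple indicator patterns; real playbooks use hardened extraction libraries.
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
DOMAIN_RE = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)

body = "Please check 198.51.100.7 and suspicious-site.example for any hits."
ips = IP_RE.findall(body)
domains = DOMAIN_RE.findall(body)

# The SOAR would look each indicator up in the SIEM or firewall here (placeholder),
# then return the results to the requester as a CSV attachment.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["indicator", "type"])
writer.writerows([ip, "ip"] for ip in ips)
writer.writerows([d, "domain"] for d in domains)
print(out.getvalue())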

SOARs can also use email to run surveys, collect data from users, and take action based on the responses. They can take a vulnerability management report, look up the system owner in an asset management system, and send the owner an email asking whether they are aware of the vulnerability and whether it has been remediated. This may not seem like much, but depending on the size of your organization it can significantly reduce the number of notifications your SOC needs to send; chasing thirty system owners around is much easier than one hundred. It is also another example of administrative work that can be done by a SOAR instead of a security analyst.


SOARs of course need to be configured, maintained, and updated regularly. Automating a task performed a couple of times per year may not be worth the investment, but reducing the day-to-day administrative work performed by SOC analysts is, and it allows staff to focus on analyzing threats to your organization.

Microsoft Sentinel UEBA

If you’re an Azure customer using Sentinel, you’ll definitely want to check out Sentinel User and Entity Behavior Analytics (UEBA). It’s an add-on that works right on top of Sentinel and can be set up easily without any major integrations or customizations. It integrates well with the Sentinel SIEM and other Azure security products, allowing you to aggregate your various security use cases and create higher-fidelity alerts.

Sentinel User and Entity Behavior Analytics works by aggregating multiple Azure data sources and finding rarities and outliers within them. The data sources are:

  • Azure AD Audit Logs (User and group management activity)
  • Azure Activity (Common Azure operational event logs)
  • Security Events (Windows Security Event Log events)
  • Azure AD Sign-in Logs (Authentication activity)

The rarities and outliers discovered by Sentinel UEBA are known as “Insights”, which are uncommon actions, devices, peer activity, and other events of interest within your environment. Sample Insights produced are:

  • User performed an uncommon action among peers
  • First time user performed this particular action
  • User connected from a country their peer group rarely connects from
  • Unusual number of logon failures performed by user

Sentinel UEBA produces these Insights automatically based on the activity in your organization, flagging someone using a new browser, connecting from a new location, or performing an abnormal number of actions. This information not only helps alert on suspicious activity, it also helps an investigator identify events of interest performed by the user, as many of the Insights cover what an investigator would otherwise query manually.

While these can be notable activities, you can also see why they are “Insights” and not “Alerts.” Larger organizations especially will find much of this noisy, which is why Insights alone may not be sufficient to justify an investigation without additional context. Behavior Analytics will catch someone still using Internet Explorer, but you may have staff using it to access a legacy application. If you have staff who travel, it will flag their phones connecting to your network as soon as they walk off a plane.

To obtain more value from Behavior Analytics, you can aggregate it with other analytics such as your Sentinel use cases, Anomalies, Identity Protection, MCAS, and other Azure alerts. Since each Insight includes the user that performed the action, you can simply aggregate it with your other sources by username.

Most Azure security products produce an Entity automatically with each alert. For Sentinel use cases, you need to ensure an Entity is set when the rule is configured; for example, your username field (e.g. AccountName, UserPrincipalName) should be mapped to the Account Entity.

Challenges

Especially in large organizations, many Insights will simply be normal activity within your environment. Thus, like with other security products, you may want to filter out those that are common and benign.

Behavior Analytics also assigns each Insight a priority from zero to ten. The least abnormal activities produce a lower value (0-4), while rarer or more uncommon Insights produce a higher score (5-10).

If you are going to aggregate Sentinel UEBA with other Azure security products, you may want to explore a weighting or scoring system that ensures Insights don’t outweigh other security alerts. For example, a user who consistently triggers the same Insight can repeatedly generate alerts or sit at the top of your dashboards. Capping the score derived from Insights lets the other products contribute equally to a user’s risk score, so that other high-risk users can still generate alerts or appear at the top of dashboards, as in the sketch below.
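
A minimal Python sketch of the capping idea follows. The cap value and event structure are illustrative assumptions, not Sentinel settings.

from collections import defaultdict

INSIGHT_CAP = 30  # illustrative: maximum score Insights may contribute per user

def total_risk(events):
    # Sum scores per user, capping the UEBA Insight contribution.
    insight_score = defaultdict(float)
    other_score = defaultdict(float)
    for user, source, score in events:
        if source == "ueba_insight":
            insight_score[user] += score
        else:
            other_score[user] += score
    users = set(insight_score) | set(other_score)
    return {u: min(insight_score[u], INSIGHT_CAP) + other_score[u] for u in users}

# The same noisy Insight fires eight times for frank; mary has one strong alert.
events = [("frank", "ueba_insight", 10)] * 8 + [("mary", "identity_protection", 40)]
print(total_risk(events))  # frank is capped at 30, so mary still ranks highest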

Some Insights also don’t contain the numerical values that indicate what baseline and deviation were observed. For example, for an Insight reporting that an “uncommon number of actions” was observed, the common and uncommon counts may not be provided, leaving the analyst unclear whether there was a significant deviation from the norm.

Summary

Overall, Sentinel UEBA is a great way to automatically flag suspicious behavior in your environment; doing the equivalent manually would require a team of data scientists. Behavior Analytics provides additional criteria you can use to create higher-fidelity alerts, and automatically provides investigators with information pertinent to an investigation. If you’re an Azure customer using Sentinel, it’s definitely worth checking out.

Splunk Risk Analysis Framework

One of the most useful apps you can find in Splunk Enterprise Security is the Risk Analysis Framework. The app allows you to assign a risk score to your use cases that you can use for alerting, threat hunting, and other analytics. Larger organizations will find significant value in the Framework, as it is difficult for most to monitor and triage the alert volume produced by their SIEM.

The Risk Analysis Framework is a simple index that stores each of your alerts (notables) along with the entity that triggered the alert (the “risk object,” an IP or username) and a numeric score. You can set a threshold suitable for your organization and alert when a particular user or system reaches it based on the use cases they’ve triggered. So instead of configuring each use case to trigger an investigation, you can run them in the background and only trigger an investigation when the user or system crosses the threshold.

The Risk Analysis Framework can help with the following common issues organizations have with their SIEM environment:

High alert volume and false positive rate
The top SIEM issue that continues to this day. Organizations spend a significant amount of time and resources monitoring alerts from their SIEM use cases, many of which flag suspicious activity, outliers, and rarities, but end up being false positives. Given that these use cases can be recommended by many SIEM vendors and incident response providers, organizations can be reluctant to turn many of them off.

Compliance or Audit Requirements
Some regulated organizations are mandated to implement and monitor particular use cases, which the security team may consider a low-value activity.

Excessive tuning to make a use case feasible
To make many use cases feasible, organizations often tune out service accounts, certain network segments, and more, which risks tuning out true-positives and reducing the value of the use case.

Narrow, short-term view of the incident and related activity
Many use cases only look back an hour, leaving the analyst with a short-term view of the user or entity that generated the alert. Given their workload, analysts may put off the longer look-back as other incidents arise, rather than running the extended queries that would paint a better picture of the threat actor.

Can’t reach a consensus to disable use cases
It can be easier to implement a use case than to disable it. Many organizations require approvals from multiple teams and lines of business to turn off a use case, and since no stakeholder wants to be left liable in the event the use case could have detected an attack, they defer to leaving it enabled.


The Risk Analysis Framework can assist with all of the above.

Have a noisy use case with a high false positive rate? Let it run in the background and add to the user’s or system’s risk score rather than tie up an analyst. If the user or system keeps triggering it, or triggers other use cases as well, then start investigating.

Compliance team giving you a use case that you think isn’t a good use of the security team’s time? Work with them to implement it as a non-alerting rule that adds to the entity’s risk score. I’ve worked with many audit and compliance teams, and they were all open to using the risk scoring method for many of the use cases they proposed.

Unable to implement a use case without filtering service accounts? Create one alerting rule that excludes service accounts, and a second rule that includes them and only adds to the user’s risk score.

Want the security team to take a longer-term view of the user or IP that generated the alert? The Risk Analysis Framework is a quick and easy way to provide that visibility. The analyst can quickly see what other use cases the user or IP has triggered in the past 6 months or more.

Can’t reach an agreement to disable a use case? Leave it enabled but convert it to a risk scoring use case.

How the Risk Analysis Framework works

The Risk Analysis Framework is an index (data store) for alerts (notables). To create a risk score for a correlation search, add a “Risk Analysis” action.

There are three options when configuring a Risk Analysis action:

1. Risk Object Field: (e.g. Username, source_IP_Address, destination_IP_Address, etc.)

2. Risk Object: User, System (e.g. hostname, IP), Other

3. Risk Score: (e.g. 20, 40, 60 etc.)

Most use cases would use “User” as the Risk Object, but some would be more appropriate for “System” (e.g. a password spraying use case).

Once your use cases are configured with a Risk Analysis action, you can create an alert that sums the risk scores for each risk object over a particular time period. Using the table below as an example, we could alert when a risk object reaches a score of 250 or greater in the past 6 months. Once the user “Frank” triggers the last alert below, his score reaches the threshold and an alert is generated.

Date                | Use Case            | Risk Object | Use Case Score | Cumulative Score
09-23-2021 13:44:22 | Rare domain visited | Frank       | 100            | 100
09-25-2021 09:02:45 | Account Locked Out  | Frank       | 50             | 150
09-27-2021 21:33:12 | Rare File Executed  | Frank       | 100            | 250

You can also get creative with the Risk alert by including not only the total risk score, but also the number of unique use cases triggered by the entity, as in the sketch below.
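
In Splunk you would implement this as a scheduled search over the risk index; here is the equivalent logic as a Python sketch, mirroring the table above. The threshold values are illustrative.

from collections import defaultdict

THRESHOLD = 250            # alert when a risk object reaches this total
MIN_UNIQUE_USE_CASES = 2   # and has triggered at least this many distinct use cases

# Risk events as rows of (risk object, use case, score), matching the table above.
risk_events = [
    ("Frank", "Rare domain visited", 100),
    ("Frank", "Account Locked Out", 50),
    ("Frank", "Rare File Executed", 100),
]

scores = defaultdict(int)
use_cases = defaultdict(set)
for risk_object, use_case, score in risk_events:
    scores[risk_object] += score
    use_cases[risk_object].add(use_case)

for obj, total in scores.items():
    if total >= THRESHOLD and len(use_cases[obj]) >= MIN_UNIQUE_USE_CASES:
        print(f"ALERT: {obj} risk score {total} across {len(use_cases[obj])} use cases")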

What are appropriate risk scores and alert criteria? That depends on many factors, including your organization’s size, line of business, and use cases. Generally speaking, you should look for high scores that are rare in your organization, as well as users that trigger multiple use cases. The average user should not have a risk score in the hundreds or be triggering multiple security use cases.

Does the Risk Analysis Framework need tuning just like your other use cases? Likely yes. To maximize its value, consider omitting particular users (e.g. default account names), use cases that make it noisy, and other factors that reduce its value. Avoid “dumping” use cases into the Risk Framework simply to obtain a check mark, which fills it with excessively noisy, low-value use cases.

So if you’re running Enterprise Security, the Risk Analysis Framework is an easy way to help with many of the common problems in SIEM environments. It’s easy to use, there’s little to maintain, and it can provide significant security value.

Step into the ring with SIEM heavyweight Sumo Logic

While it has been around for over a decade, Sumo Logic is still unknown to many information security departments. Its absence from the Gartner SIEM Magic Quadrant has likely limited its visibility in the SIEM market, but changes in the industry may be in its favour as many organizations migrate to the cloud.

Sumo Logic has mainly been a log management solution with limited event management capabilities, but it now offers a full SIEM via the JASK acquisition. In addition to common SIEM capabilities, Sumo Logic also provides infrastructure monitoring and business analytics, giving organizations the opportunity to use it for multiple business functions. With a base of over 2,000 clients and many organizations looking to build in the cloud, this “Continuous Intelligence” platform is definitely worth consideration.

I was able to demo Sumo Logic and explore many of the features of its base log management product.

Here are some of the things I liked about it:

Cloud-focused SIEM

One of the things that stood out with Sumo Logic was its direct integrations with many common cloud vendors. The integrations with AWS, Netskope, and Cloudflare took just a few clicks, and data was ingesting within minutes.

Practical Searching

Sumo Logic has a Lucene-like search language that makes it easy to obtain common security search results. Aggregations and common security searches for IPs, hostnames, and usernames were easy to learn. If you’re familiar with Splunk, picking up Sumo Logic’s search syntax will be easy.

Option to structure data during ingestion or search-time

Structuring data is a critical function of a SIEM. Some SIEMs parse data during ingestion, while others parse at search time. It’s debatable which approach is better, but Sumo Logic gives you the best of both worlds: it is designed to parse at search time, but you can parse up to fifty fields during ingest. This lets you structure and quickly search your commonly used fields, such as IPs, hostnames, URLs, and usernames, giving you fast search response times while limiting the number of parsers you need to create and maintain. Any other fields can be parsed as needed at search time.

Infrastructure Monitoring and Business Intelligence Capabilities

While providing many SIEM capabilities, Sumo Logic also provides infrastructure monitoring and business analytics. You can monitor and alert on system resource utilization on your servers and applications, and its query language makes it easy to calculate sales, profits and other common business metrics, and turn them into charts and other visualizations.

So if you’re looking for infrastructure monitoring or business analytics in addition to a SIEM, or simply want to consolidate applications, Sumo Logic can be used for all three functions.

Good Documentation

Whatever I wanted to know about Sumo Logic, whether data source integrations, search operators, or lookup tables, I found the documentation helpful, accurate, and up to date.

Common SIEM functionality

  • Real-time and scheduled alerts
    You can create your use cases via a search, and then schedule it on a real-time basis or at a regular interval (hourly, daily). The results of a search can be emailed, sent via a Webhook, written to an index, or forwarded to a SOAR.
  • Ability to export a significant amount of data to a file
    During a security incident, you may need to export a significant amount of data to a CSV file for further analysis, or to share with other staff. With Sumo Logic you can export up to 100,000 search results from the web interface, and up to 200,000 results via the API.
  • Supports lookup tables
    Lookup tables are commonly used by security staff to compare large numbers of IPs, usernames, and hostnames against firewall, proxy, and other data. Security teams often receive large lists of suspicious IPs and domains that need to be correlated against network traffic. You can import lookup tables in Sumo Logic directly via the web interface and then use them in your searches.
  • Manually importing data from a file (csv, text)
    A common use case is performing an ad-hoc analysis of a log file from another security application. With Sumo Logic you can import a file directly via the web interface, then analyze it using common search functions and aggregations, and correlate the data against other data sources.
  • Useful search operators (parse, parse regex, JSON)
    Sumo Logic has practical search operators that allow you to extract data for matching, counting, and sorting. You can search via regex, use the parse operator (which lets you easily match on any characters), and use the json operator, which makes it easy to extract JSON values (illustrated in the sketch after this list).
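
To give a sense of what these operators do, here is the same style of extraction written in Python. The sample event and field names are made up; in Sumo Logic, the parse and json operators do this inline in a search.

import json
import re

raw = 'user=frank src=198.51.100.7 detail={"action": "login", "result": "failure"}'

# parse "src=*" as src: anchored wildcard extraction
src = re.search(r"src=(\S+)", raw).group(1)

# parse regex: named-group regular expression extraction
user = re.search(r"user=(?P<user>\w+)", raw).group("user")

# json: field extraction from an embedded JSON payload
detail = json.loads(raw.split("detail=", 1)[1])

print(src, user, detail["result"])  # 198.51.100.7 frank failure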

Concerns/Considerations:

Doesn’t support many legacy systems

This would only be a “concern” if you have legacy systems and a significant on-prem presence. However, on-prem clients don’t appear to be part of Sumo Logic’s strategy.

Maximum 99.9% Uptime

A 99.9% SLA allows for roughly nine hours of downtime per year. That may not seem like much, but it can be an eternity if those nine hours fall during an incident or other critical event.

Base product is more of a log management tool than SIEM

The base product comes close to being a full SIEM, but it lacks a basic incident management app and provides a limited use case library. So if you go with the base product only, you’ll have to use another app, Excel, or your SOAR for case management.

While there are dashboards for many of the supported data sources, there doesn’t appear to be a significant library of real-time alerts, so much of this would have to be developed internally by your security team or by Sumo Logic professional services.

Base product and SIEM product are separate apps

The base log management and SIEM applications are different products, so data has to be forwarded from the Collectors to both.


Summary

Overall, I found the product stable and intuitive, the integrations easy to set up, and the query language easy to learn. Search response times were fast in general, and even better on fields parsed during ingestion. Common security functions such as lookup tables, data exports to a file, and manually uploading a log file were all intuitive and can be done directly via the web interface.

So if your next SIEM is going to be in the cloud, be sure to check out Sumo Logic.

Azure Sentinel Lists and Rules

One of the first questions I had about Azure Sentinel was whether it supports “Lists.” Lists are available in most (if not all) SIEMs, though how they work differs in each. Lists can help end users create use cases, store selected data outside of retention policies, blacklist/whitelist, and more. You can read more about the utility of SIEM Lists in a previous post.

Regarding Sentinel, the answer is yes. It supports two main types of lists: temporary lists created and used within queries, and external lists (e.g. CSV files hosted in Azure Storage) that can be used for lookups. You can also create a custom log source via the CEF Connector and use it as a pseudo list.

In this post we’ll create a couple of lists, along with analytics rules that trigger off values in the lists. We’ll use the data generated from the custom CEF Connector created in a previous post.

The first use case will detect when the same user account has logged in from two or more different IP addresses within 24 hours, a common use case to flag potential account sharing or compromised accounts. The second use case will trigger when a login is detected from an IP found in a list of malicious IP addresses.

First, let’s create a query that puts the users logging in from 2 or more different IP addresses into a list called ‘suspiciousUsers’.

Next, let’s take the users from the list and then query the same log source to show the applicable events generated by those users. The results show us all the “Login Success” events generated by the users in the list. We could also use this list to query other data sources in Sentinel.

Query:

let suspiciousUsers =
CommonSecurityLog
| where TimeGenerated >= ago(1d)
| where DeviceProduct == "Streamlined Security Product"
| where Message == "Login_Success"
| summarize dcount(SourceIP) by DestinationUserName
| where dcount_SourceIP > 1
| summarize make_list(DestinationUserName);
CommonSecurityLog
| where TimeGenerated >= ago(1d)
| where DeviceProduct == "Streamlined Security Product"
| where Message == "Login_Success"
| where DestinationUserName in (suspiciousUsers)

So instead of adding the applicable events to a list as they occur and then having a rule query the list, we simply build the list in real time and use the results in another part of the query. Since the list is temporary, the main consideration is ensuring your retention policies line up with your use case. That’s not an issue here, as we’re only looking at the past 24 hours, but if you wanted to track, say, RDP authentication events over 6 months, you would need 6 months of online data.

For the next list, we’ll use our CEF Connector to ingest a list of malicious IPs from a threat intelligence feed. We’ll use a simple Python script to write the values in the file to the local Syslog file on a Linux server, which the CEF Connector will then forward to Sentinel. The IPs in the file were randomly generated by me.

The CSV file has three columns: Vendor, Product, and IP.

Using an FTP client (e.g. WinSCP), copy the CSV file to the server.
Next, let’s create a file, give it execute permissions, and then open it:

touch process_ti.py
chmod +x process_ti.py
vi process_ti.py

Paste the script from the “Script to read from CSV file and write to Syslog in CEF Format” section below into the file, save and close, then run it.

./process_ti.py

Let’s check that there are 300 entries from our CSV file:

CommonSecurityLog
| where DeviceVendor == "Open Threat Intel Feed"
| summarize count()

Now that we’ve confirmed the ingestion was successful, let’s make a list named ‘maliciousIPs’. We’ll use it to match IPs found in the Streamlined Security Product logs.

let maliciousIPs =
CommonSecurityLog
| where TimeGenerated >= ago(1d)
| where DeviceVendor == "Open Threat Intel Feed"
| summarize make_list(SourceIP);
CommonSecurityLog
| where TimeGenerated >= ago(1d)
| where DeviceProduct == "Streamlined Security Product"
| where SourceIP in (maliciousIPs)

The output should look as follows, showing the authentication events from IPs in the ‘maliciousIPs’ list.


Now that we can lookup the data with the queries, let’s create a couple of analytics rules that will detect these use cases in near real-time.

From the Analytics menu, select ‘Create’, then ‘Scheduled query rule’.


Enter a name and description, then select ‘Next: Set rule logic >’.


Enter the query used for the first list (suspiciousUsers), and map the DestinationUserName field to the ‘Account’ entity type and the SourceIP field to the ‘IP’ entity type. You need to click ‘Add’ for the mapping to be added to the rule query; once it’s added, the column value will say ‘Defined in query’.


For query scheduling, run the query every five minutes and look up data from the last hour. Set the alert threshold to greater than 0, as the threshold for this use case is already set in the query (2 or more IPs for the same user). We’ll leave suppression off.

One of the nice things about creating rules in Sentinel is that it shows you how many hits your rule would have triggered based on your parameters. The graph saves you from running that analysis yourself, which you would otherwise likely do when creating a use case.

We’ll leave the default settings for the ‘Incident settings (Preview)’ and ‘Automated response’ tabs, and then click ‘Create’ on the ‘Review and create’ tab.


Once the rule is created, we can go to the ‘Incidents’ tab to see triggered alerts. We can see that the rule was already triggered by three user accounts.


Next, let’s create a rule that triggers when a user logs in from an IP in the ‘maliciousIPs’ list we created.


We’ll add the query and map entities as we did in the prior rule.


We’ll schedule the query and set the threshold as follows.


We’ll leave the default settings for the ‘Incident settings (Preview)’ and ‘Automated response’ tabs, and then click ‘Create’ on the ‘Review and create’ tab.


Once the rule is created, we can go back to the ‘Incidents’ page and see that it has already been triggered by three users.


As you can see, lists are easy to create and can be useful when writing queries and developing use cases. You can also host an external file in Azure Storage and access it directly within a query. For further reading on this topic, there are some helpful posts available on the Microsoft site.

Script to read from CSV file and write to Syslog in CEF Format

Sample Python script that opens a CSV file and writes the values in CEF format to the local Syslog file on a Linux server. Designed to be used with the preceding Azure Sentinel Lists post.

#!/usr/bin/python
## Simple Python script designed to read a CSV file and write the values to the local Syslog file in CEF format.
## Frank Cardinale, April 2020

## Importing the libraries used in the script
import syslog
import csv
with open('sample_malicious_IPs.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:

        #Creating a value that will be used to write to the Syslog file. Rows added to applicable CEF fields.
        syslog_message = "CEF:0|" + row[0] + "|" + row[1] + "|1.0|1000|ThreatIntelFeed|10|src=" + row[2]

        #Writing the event to the Syslog file.
        syslog.openlog(facility=syslog.LOG_LOCAL7)
        syslog.syslog(syslog.LOG_NOTICE, syslog_message)

SIEM Lists and Design Considerations


Those familiar with creating use cases in SIEMs have likely worked with “Lists” at some point. These are known as “Active Lists” (ArcSight), “Reference Sets” (QRadar), “Lookups” (Splunk), “Lookup Tables” (Securonix, Devo), and similar names in other tools. Lists are essentially tables of data; think of them as an Excel-like table with multiple rows and columns. Lists differ across the SIEMs on the market: some are a single column you can use for, say, IP addresses, while others support up to 20 columns and a significant amount of data. Log retention policies typically don’t apply to Lists, so you can keep them for as long as needed.

SIEM Lists have three main drivers: limitations with real-time rule engines, limited retention policies, and external reference data.

Limitations with Real-Time Rule Engines

SIEMs with real-time rule engines have the advantage of triggering your use cases as data is ingested (versus running a query every 5 minutes). But the real-time advantage becomes a disadvantage when a use case spans a greater timeframe: the longer the timeframe, the more system resources the real-time engine uses, making some use cases impossible. For example, real-time rule engines can easily detect 10 or more failed logins in 24 hours, but not over three months; that would be far too much data to keep in memory. To compensate, Lists can store the data required by use cases that the real-time engine can’t handle alone. A List can hold, for example, RDP logins over a much longer period, say one year, including the login type, username, hostname, IP address, and possibly more depending on your SIEM. You can then trigger an alert when the number of entries for a particular user reaches the desired threshold, as in the sketch below.
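
Conceptually, a List is just a persistent table the SIEM appends to outside the rule engine’s memory. Here is a minimal Python sketch of the pattern, with SQLite standing in for the List; the schema and threshold are illustrative.

import sqlite3
import time

ONE_YEAR = 365 * 24 * 3600
THRESHOLD = 10

# SQLite stands in for the SIEM's List; an in-memory table keeps the example self-contained.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE rdp_logins (username TEXT, hostname TEXT, ip TEXT, ts REAL)")

def record_login(username, hostname, ip):
    # Append the event to the List, then check the long-window count for that user.
    now = time.time()
    db.execute("INSERT INTO rdp_logins VALUES (?, ?, ?, ?)", (username, hostname, ip, now))
    (count,) = db.execute(
        "SELECT COUNT(*) FROM rdp_logins WHERE username = ? AND ts > ?",
        (username, now - ONE_YEAR),
    ).fetchone()
    if count >= THRESHOLD:
        print(f"ALERT: {username} has {count} RDP logins in the past year")

for _ in range(10):
    record_login("frank", "server01", "10.1.2.3")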

Limited Retention Policies

Limited retention policies were also a large driver for Lists. Most SIEM environments keep only about 3 months of online data; anything older must be restored from an archive or backup, which is typically inconvenient enough that an analyst won’t even ask for it. With Lists, you can store selected data outside of your retention policies: if you want to keep RDP logins longer than your retention policy allows, simply add the values to a List.

External Reference Data

SIEMs are extremely effective at matching data in log files. The advent of threat intelligence brought security teams massive lists of malicious IP addresses, hostnames, email addresses, and file hashes that need to be correlated with firewall, proxy, and endpoint protection logs. These threat intel feeds can easily be loaded into a List and correlated against all applicable logs. Most (if not all) SIEM products support these types of Lists.

Other List Uses

Lists can often enhance your use case capabilities. If your SIEM product can’t meet all of the conditions of a use case with its real-time engine or query, you can sometimes use Lists to compensate. For example, you can put the result of the first use case into a List, and then use a second use case that uses both the real-time engine and values in the List.

Lists can be useful for whitelisting or suppressing duplicate alerts. For example, you can add the IP, username, or hostname of the event that triggered an alert to a List (e.g. users/domains that are already being investigated), and use the List to suppress subsequent alerts from the same entity, as in the sketch below.
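
A minimal Python sketch of List-based suppression; the field names and list contents are illustrative.

# Entities already under investigation; subsequent alerts from them are suppressed.
suppression_list = {"frank", "10.1.2.3"}

def should_alert(alert):
    # Drop the alert if any of its entities are on the suppression List.
    return not ({alert["username"], alert["src_ip"]} & suppression_list)

print(should_alert({"username": "frank", "src_ip": "10.9.9.9"}))  # False: suppressed
print(should_alert({"username": "mary", "src_ip": "10.9.9.9"}))   # True: alert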

Lists can also help simplify long and complicated queries. Instead of writing a single query, you can put the results of the first part of a query into a List, and then have the second query run against the values in the List.


As you can see, Lists can be very useful for SIEM end users, and overlooking List functionality during a SIEM design can have profound impacts. List functionality differs per SIEM, so it’s important to understand how your SIEM’s Lists work and ensure they meet your requirements.

Integrate Custom Data Sources with Azure Sentinel’s CEF Connector

Microsoft Azure Sentinel allows you to ingest custom data sources with its CEF Connector. For those not familiar with CEF (Common Event Format), it was created to standardize logging formats. Different applications can log in wildly different formats, leaving SIEM engineers to spend a large portion of their time writing parsers and mapping them to different schemas. Thus, CEF was introduced to help standardize the format in which applications log, encourage vendors to follow the standard, and ultimately reduce the amount of time SIEM resources spend writing and updating parsers. You can find more information on CEF on the Micro Focus website.

With Azure Sentinel, you can ingest custom logs by simply writing to the local Syslog file in CEF format. Many data sources already support Sentinel’s CEF Connector, and given how simple the format is, I’m sure your developers or vendors wouldn’t mind logging in it if asked. Once the data source is logging in CEF and integrated with Sentinel, you can use Sentinel’s searching, threat hunting, and other functionality.

To highlight this, we’re going to write to the Syslog file on a default Azure Ubuntu VM and then query the data in Sentinel. This exercise requires only basic Azure and Linux knowledge and can be completed with Azure’s free subscription, so I would encourage anyone to try it.

Requirements:
-Azure subscription (Free one suffices)
-Basic Azure knowledge
-Basic Linux knowledge
-Azure Sentinel instance (default)
-Azure Ubuntu VM (A0 Basic with 1 CPU and 0.75 GB RAM suffices)

Once you have an Azure subscription, the first step is to create an Azure Sentinel instance. If you don’t already have one, see the “Enable Azure Sentinel” section on the Microsoft Sentinel website.

Once you have an Azure Sentinel instance running, create an Ubuntu VM. Select ‘Virtual Machines’ from the Azure services menu.

Select ‘Add’.

Add in the required parameters:

At the bottom of the page, select ‘Next: Disks’.

Leave all default values for the Disks, Networking, Management, Advanced, and Tags sections, and then select ‘Create’ on the ‘Review and create’ tab.

Add a firewall rule that will allow you to SSH to the server. For example, you can add your IP to access the server on port 22.

Next, select ‘Data connectors’, the ‘Common Event Format (CEF)’ Connector from the list, then ‘Open connector page’.

Copy the command provided. This will be used on the Ubuntu server.

Connect to the Ubuntu server using an SSH client (e.g. PuTTY).

Once logged in, paste the command, then press Enter.

Wait for the install to complete. As noted on the CEF Connector page, it can take up to 20 minutes for data to be searchable in Sentinel.

You can check if the integration was successful by searching for ‘Heartbeat’ in the query window.

Next, we’re going to use the Linux logger command to generate a test CEF authentication message before automating the process with a script. We’ll use the standard CEF header fields and add the extensions ‘src’ (source address), ‘dst’ (destination address), and ‘msg’ (message). You can add additional fields as listed in the CEF guide referenced at the beginning of this post. Command:

logger "CEF:0|Seamless Security Solutions|Streamlined Security Product|1.0|1000|Authentication Event|10|src=10.1.2.3 dst=10.4.5.6 duser=Test msg=Test"

where the CEF header fields are:
CEF:Version|Device Vendor|Device Product|Device Version|Device Event Class ID|Name|Severity
and the key=value extensions are src (source address), dst (destination address), duser (destination user), and msg (message).

The event appears as expected when searching the ‘CommonSecurityLog’, where events ingested from the CEF Connector are stored:

Now we’re going to use the Python script at the end of this post to generate a sample authentication event every 5 minutes. Simply create a file, give it execute permissions, then open it with vi. Be sure to put the file in a more appropriate location than /tmp if you plan on using it longer-term.

mkdir /tmp/azure
touch /tmp/azure/CEF_generator.py
chmod +x /tmp/azure/CEF_generator.py
vi /tmp/azure/CEF_generator.py

Paste the script into the file by pressing ‘i’ to insert, and then paste. When finished, exit by pressing Esc, and then save and exit, ‘:wq’.

Run the file in the background by running the following command:

nohup python /tmp/azure/CEF_generator.py &

As expected, events generated on the Ubuntu server are now searchable in Sentinel:

In less than an hour, you have searchable data in a standard format from a custom application using Sentinel’s CEF Connector. You can also set up alerts based on the ingested events and work with the data in other Sentinel tools, such as the incident management and playbook apps.

The CEF generator Python script is in the next section.

CEF Event Generator

This is a sample CEF generator Python script that will log sample authentication events to the Syslog file. Created and tested on an Azure Ubuntu 18.04 VM. Please check indentation when copying/pasting.

#!/usr/bin/python
# Simple Python script designed to write to the local Syslog file in CEF format on an Azure Ubuntu 18.04 VM.
# Frank Cardinale, April 2020

# Importing the libraries used in the script
import random
import syslog
import time

# Simple list that contains usernames that will be randomly selected and then output to the "duser" CEF field.
usernames = ['Frank', 'John', 'Joe', 'Tony', 'Mario', 'James', 'Chris', 'Mary', 'Rose', 'Jennifer', 'Amanda', 'Andrea', 'Lina']

# Simple list that contains authentication event outcomes that will be randomly selected and then output to the CEF "msg" field.
message = ['Login_Success', 'Login_Failure']

# Endless loop that will run the below every five minutes.
while True:

    # Assigning a random value from the above lists to the two variables that will be used to write to the Syslog file.
    selected_user = random.choice(usernames)
    selected_message = random.choice(message)

    # Assigning random integer values from 1-255 that will be appended to the IP addresses written to the Syslog file.
    ip = str(random.randint(1,255))
    ip2 = str(random.randint(1,255))

    # The full Syslog message that will be written. Note dst uses ip2 so the source and destination addresses differ.
    syslog_message = "CEF:0|Seamless Security Solutions|Streamlined Security Product|1.0|1000|Authentication Event|10|src=167.0.0." + ip + " dst=10.0.0." + ip2 + " duser=" + selected_user + " msg=" + selected_message

    # Writing the event to the Syslog file.
    syslog.openlog(facility=syslog.LOG_LOCAL7)
    syslog.syslog(syslog.LOG_NOTICE, syslog_message)

    # Pausing the loop for five minutes.
    time.sleep(300)

# End of script