Month: April 2020

Azure Sentinel Lists and Rules

One of the first questions I had about Azure Sentinel was if it supports “Lists.” Lists are available in most (if not all) SIEMs, and how they work in each differs. Lists can help end users create use cases, store selected data outside of retention policies, blacklist/whitelist, and more. You can read more about the utility of SIEM Lists in a previous post here.

Regarding Sentinel, the answer is yes, it supports two main types of lists: temporary lists that are created and used in queries, and external lists (e.g. CSV files hosted in Azure Storage) that can be used for lookups. You can also create a custom log source via the CEF Connector and use that as a pseudo list.

In this post we’ll create a couple of lists and analytics rules that will trigger off values in the lists. We’ll use the data generated from the custom CEF Connector created in a previous post here.

The first use case will detect when the same user account has logged in from two or more different IP addresses within 24 hours, a common use case to flag potential account sharing or compromised accounts. The second use case will trigger when a login is detected from an IP found in a list of malicious IP addresses.

First, let’s create a query to put the users that are logging in from 2 or more different IP addresses into a list called ‘suspiciousUsers’.

Next, let’s take the users from the list and then query the same log source to show the applicable events generated by those users. The results show us all the “Login Success” events generated by the users in the list. We could also use this list to query other data sources in Sentinel.

Query:

let suspiciousUsers =
CommonSecurityLog
| where TimeGenerated >= ago(1d)
| where DeviceProduct == “Streamlined Security Product”
| where Message == “Login_Success”
| summarize dcount(SourceIP) by DestinationUserName
| where dcount_SourceIP > 1
| summarize make_list(DestinationUserName);
CommonSecurityLog
| where TimeGenerated >= ago(1d)
| where DeviceProduct == “Streamlined Security Product”
| where Message == “Login_Success”
| where DestinationUserName in (suspiciousUsers)

So instead of adding the applicable events to the list as they occur and then have a rule query the list, we are simply creating the list in real-time and then using the results in another part of the query. Since the list is temporary, the major thing to consider here is ensuring your retention policies are in line with your use case. This is not an issue with this use case as we are only looking at the past 24 hours, but if you would like to track e.g. RDP authentication events over 6 months, you would need 6 months of online data.

For the next list, we’ll use our CEF Connector to ingest a list of malicious IPs from a threat intelligence feed. We’ll use a simple Python script to write the values in the file to the local Syslog file on a Linux server, which will then be forwarded to Sentinel by the CEF Connector. The IPs in the file were randomly generated by me.

The CSV file has three columns: Vendor, Product, and IP. The values look as follows:

Using an FTP Client (e.g. WinSCP), copy the CSV file to the server.
Next, let’s create a file, give it execute permissions, and the open it.

touch process_ti.py
chmod +x process_ti.py
vi process_ti.py

Paste the script found here into the file, save and close, then run it.

./process_ti.py

Let’s check that there are 300 entries from our CSV file:

CommonSecurityLog
| where DeviceVendor == “Open Threat Intel Feed”
| summarize count()

Now that we can assume the ingestion was successful, let’s make a list named ‘maliciousIPs’. We’ll use this list to match IPs found in the Streamlined Security Product logs.

let maliciousIPs =
CommonSecurityLog
| where TimeGenerated >= ago(1d)
| where DeviceVendor == “Open Threat Intel Feed”
| summarize make_list(SourceIP);
CommonSecurityLog
| where TimeGenerated >= ago(1d)
| where DeviceProduct == “Streamlined Security Product”
| where SourceIP in (maliciousIPs)

Output should look as follows, showing the authentication events from IPs in the ‘maliciousIPs’ list.


Now that we can lookup the data with the queries, let’s create a couple of analytics rules that will detect these use cases in near real-time.

From the Analytics menu, select ‘Create’, then ‘Scheduled query rule’.


Enter a name and description, then select ‘Next: Set rule logic >’.


Enter the query used for the first list (suspiciousUsers), and then we’ll map the DestinationUserName field to the ‘Account’ Entity Type, and SourceIP field to the ‘IP’ Entity Type. You need to click ‘Add’ in order for it to be added to the rule query. Once it’s added the column value will say ‘Defined in query’.


For Query scheduling, run the query every five minutes, and lookup data from the last hour. Set the alert threshold to greater than 0, as the threshold for this use case is already set in the query (2 or more IPs for the same user). We’ll leave suppression off.

One of the nice things about creating rules in Sentinel is that it shows you how many hits your rule will trigger based on your parameters. The graph saves you from doing this yourself, which you would likely do when creating a use case.

We’ll leave the default settings for the ‘Incident settings (Preview)’ and ‘Automated response’ tabs, and then click ‘Create’ on the ‘Review and create’ tab.


Once the rule is created, we can go to the ‘Incidents’ tab to see triggered alerts. We can see that the rule was already triggered by three user accounts.


Next, let’s create a rule that triggers when a user logs in from an IP in the ‘maliciousIPs’ list we created.


We’ll add the query and Map entities as we did in the prior rule.


We’ll schedule the query and set the threshold as follows.


We’ll leave the default settings for the ‘Incident settings (Preview)’ and ‘Automated response’ tabs, and then click ‘Create’ on the ‘Review and create’ tab.


Once the rule is created, we can go back to the Investigations page and see that it has already been triggered by three users.


As you can see, lists are easy to create and can be useful when writing queries and developing use cases. You can also use an external file hosted in Azure Storage and access it directly within a query. For further reading on this topic, there are some helpful posts available on the Microsoft site here, and here.

Script to read from CSV file and write to Syslog in CEF Format

Sample Python script that opens a CSV file and writes the values in CEF format to the local Syslog file on a Linux server. Designed to be used with this post.

#!/usr/bin/python
## Simple Python script designed to read a CSV file and write the values to the local Syslog file in CEF format.
## Frank Cardinale, April 2020

## Importing the libraries used in the script
import syslog
import csv
with open('sample_malicious_IPs.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:

        #Creating a value that will be used to write to the Syslog file. Rows added to applicable CEF fields.
        syslog_message = "CEF:0|" + row[0] + "|" + row[1] + "|1.0|1000|ThreatIntelFeed|10|src=" + row[2]

        #Writing the event to the Syslog file.
        syslog.openlog(facility=syslog.LOG_LOCAL7)
        syslog.syslog(syslog.LOG_NOTICE, syslog_message)

SIEM Lists and Design Considerations


Those familiar with creating use cases in SIEMs have likely at some point worked with “Lists.” These are commonly known in SIEMs as “Active Lists” (ArcSight), “Reference Sets” (QRadar), “Lookups” (Splunk), “Lookup Tables” (Securonix, Devo), and similar in other tools. Lists are essentially tables of data, and you can think of them as an Excel-like table with multiple rows and columns. Lists are different in each of the SIEMs on the market. Some are simply a single column which you can use for e.g. IP Addresses, and others are up to 20 columns that can support a significant amount of data. Log retention policies typically don’t apply to Lists, so you can keep them for as long as needed.

SIEM Lists have three main drivers: limitations with real-time rule engines, limited retention policies, and external reference data.

Limitations with Real-Time Rule Engines

SIEMs with real-time rule engines have the advantage of triggering your use cases as data is ingested (versus running a query every 5 minutes). But the real-time advantage turns out to be a disadvantage when you have a use case that spans a greater timeframe. The longer the timeframe of the use case, the more system resources used by the real-time engine, thus making some use cases impossible. For example, real-time rule engines can easily detect 10 or more failed logins in 24 hours, but not over three months–that would be far too much data to keep in memory. To compensate for this, Lists can be used to store data required by use cases that can’t be done via the real-time rule engine. The List can store, for example, RDP logins over a much longer period, e.g. for one year, including the login type, username, hostname, IP address, and possibly more depending on your SIEM. You can then trigger an alert when the count for a particular user reaches the desired threshold based on the amount of entries in the List.

Limited Retention Policies

Limited retention policies were also a large driver for Lists. Most SIEM environments only have 3 months of online data. In order to access data older than that, it must be e.g. restored from an archive/backup, which is typically inconvenient enough that an analyst won’t even ask for it. With Lists, you can store selected data outside of your retention policies. If you want to store RDP logins for longer than your retention policy allows, you can simply add the values to a List.

External Reference Data

SIEMs are extremely effective at matching data in log files. The advent of threat intelligence data brought security teams massive lists of malicious IP addresses, hostnames, email addresses, and file hashes that needed to be correlated with firewall, proxy, and endpoint protection logs. These threat intel lists can be easily put into a List and then correlated with all applicable logs. Most (if not all) SIEM products support these types of Lists.

Other List Uses

Lists can often enhance your use case capabilities. If your SIEM product can’t meet all of the conditions of a use case with its real-time engine or query, you can sometimes use Lists to compensate. For example, you can put the result of the first use case into a List, and then use a second use case that uses both the real-time engine and values in the List.

Lists can be useful for whitelisting or suppressing duplicate alerts. For example, you can add the IP, username or hostname of the event that triggered an alert to a List (e.g. users/domains that are already being investigated), and use the List to suppress subsequent alerts from the same IP, username, or hostname.

Lists can also help simplify long and complicated queries. Instead of writing a single query, you can put the results of the first part of a query into a List, and then have the second query run against the values in the List.


As you can see, Lists can be very useful for SIEM end users. Overlooking List functionality during a SIEM design can have profound impacts. While List functionality differs per SIEM, it’s important to understand how your SIEM works and ensure it meets your requirements.

Integrate Custom Data Sources with Azure Sentinel’s CEF Connector

Microsoft Azure Sentinel allows you to ingest custom data sources with its CEF Connector. For those not familiar with CEF (Common Event Format), it was created to standardize logging formats. Different applications can log in wildly different formats, leaving SIEM engineers to spend a large portion of their time writing parsers and mapping them to different schemas. Thus, CEF was introduced to help standardize the format in which applications log, encourage vendors to follow the standard, and ultimately reduce the amount of time SIEM resources spend writing and updating parsers. You can find more information on CEF on the Micro Focus website.

With Azure Sentinel, you can ingest custom logs by simply writing in CEF format to the local Syslog file. Many data sources already support Sentinel’s CEF Connector, and given how simple it is, I’m sure your developers or vendors wouldn’t mind logging in this format if asked. Once the data source is logging in CEF and integrated with Sentinel, you can use the searching, threat hunting, and other functionality provided by Sentinel.

To highlight this, we’re going to write to the Syslog file on a default Azure Ubuntu VM, and then query the data in Sentinel. This activity is simple enough to be done with basic Azure and Linux knowledge, and can be done with Azure’s free subscription, so I would encourage anyone to try it.

Requirements:
-Azure subscription (Free one suffices)
-Basic Azure knowledge
-Basic Linux knowledge
-Azure Sentinel instance (default)
-Azure Ubuntu VM (A0 Basic with 1cpu .75GB RAM suffices)

Once you have an Azure subscription, the first step is to create an Azure Sentinel instance. If you don’t already have one, see the “Enable Azure Sentinel” section on the Microsoft Sentinel website.

Once you have an Azure Sentinel instance running, create an Ubuntu VM. Select ‘Virtual Machines’ from the Azure services menu.

Select ‘Add’.

Add in the required parameters:

At the bottom of the page, select ‘Next: Disks’.

Leave all default values for the Disk, Networking, Management, Advanced, and Tags sections, and then select ‘Create’ on the ‘Review and create tab’.

Add a firewall rule that will allow you to SSH to the server. For example, you can add your IP to access the server on port 22.

Next, select ‘Data connectors’, the ‘Common Event Format (CEF)’ Connector from the list, then ‘Open connector page’.

Copy the command provided. This will be used on the Ubuntu server.

Connect to the Ubuntu server using an SSH client (e.g. Putty).

Once logged in, paste the command, then press Enter.

Wait for the install to complete. As noted on the CEF Connector page, it can take up to 20 minutes for data to be searchable in Sentinel.

You can check if the integration was successful by searching for ‘Heartbeat’ in the query window.

Next, we’re going to use the Linux logger command to generate a test authentication CEF message before using a script to automate the process. We’re going to use the standard CEF fields, and as well add extensions ‘src’ (Source Address), ‘dst’ (destination address), and ‘msg’ (message) fields. You can add additional fields as listed in the CEF guide linked at the beginning of this post. Command:

logger “CEF:0|Seamless Security Solutions|Streamlined Security Product|1.0|1000|Authentication Event|10|src=10.1.2.3 dst=10.4.5.6 duser=Test msg=Test”

where:
CEF:Version|Device Vendor|Device Product|Device Version|Device Event Class ID|Name|Severity|Source Address|Destination Address|Message

The event appears as expected when searching the ‘CommonSecurityLog’, where events ingested from the CEF Connector are stored:

Now we’re going to use the Python script at the end of this post that will generate a sample authentication event every 5 minutes. Simply create a file, give it execute permissions, then open it with vi. Be sure to put the file in a more appropriate location if you plan on using it longer-term.

mkdir /tmp/azure
touch /tmp/azure/CEF_generator.py
chmod +x /tmp/azure/CEF_generator.py
vi /tmp/azure/CEF_generator.py

Paste the script into the file by pressing ‘i’ to insert, and then paste. When finished, exit by pressing Esc, and then save and exit, ‘:wq’.

Run the file in the background by running the following command:

nohup python /path/to/test.py &

As expected, events generated on the Ubuntu server are now searchable in Sentinel:

In less than an hour, you now have searchable data in a standard format from a custom application using Sentinel’s CEF Connector. Additionally, you can setup alerts based on the ingested events, and work with the data with other Sentinel tools such as the incident management and playbook apps.

CEF generator Python script link here.

CEF Event Generator

This is a sample CEF generator Python script that will log sample authentication events to the Syslog file. Created and tested on an Azure Ubuntu 18.04 VM. Please check indentation when copying/pasting.

#!/usr/bin/python
# Simple Python script designed to write to the local Syslog file in CEF format on an Azure Ubuntu 18.04 VM.
# Frank Cardinale, April 2020

# Importing the libraries used in the script
import random
import syslog
import time

# Simple list that contains usernames that will be randomly selected and then output to the "duser" CEF field.
usernames = ['Frank', 'John', 'Joe', 'Tony', 'Mario', 'James', 'Chris', 'Mary', 'Rose', 'Jennifer', 'Amanda', 'Andrea', 'Lina']

# Simple list that contains authentication event outcomes that will be randomly selected and then output to the CEF "msg" field.
message = ['Login_Success', 'Login_Failure']

# Endless loop that will run the below every five minutes.
while True:

    # Assigning a random value from the above lists to the two variables that will be used to write to the Syslog file.
    selected_user = random.choice(usernames)
    selected_message = random.choice(message)

# Assigning a random integer value from 1-255 that will be appended to the IP addresses written to the Syslog file.
    ip = str(random.randint(1,255))
    ip2 = str(random.randint(1,255))

# The full Syslog message that will be written.   
    syslog_message = "CEF:0|Seamless Security Solutions|Streamlined Security Product|1.0|1000|Authentication Event|10|src=167.0.0." + ip + " dst=10.0.0." + ip + " duser=" + selected_user + " msg=" + selected_message

# Writing the event to the Syslog file.
    syslog.openlog(facility=syslog.LOG_LOCAL7)
    syslog.syslog(syslog.LOG_NOTICE, syslog_message)

# Pausing the loop for five minutes.
    time.sleep(300)

# End of script