Selecting a SIEM Storage Medium
Because SIEMs can process tremendous amounts of log data, storage is a critical foundation of your SIEM environment. To provide analytical services and fast search response times, SIEMs need dedicated, high-performing storage. Inadequate storage results in poor query response times, application instability, and frustrated end users, and ultimately makes the SIEM a bad investment for your organization.
Before running out and buying the latest and greatest storage medium, first understand your retention requirements. Does your organization need the typical three months of online and nine months of offline data? Or do you have larger requirements, such as six months of online followed by two years of offline data? Do you want all of your data “hot”? Answering these questions first is critical to keeping costs as low as possible, especially if you have large storage requirements.
Once we understand the storage requirements, we can better determine which medium(s) to use for the environment. If we have a six-month online retention requirement at a processing rate of 50,000 events per second (EPS), we’re going to need dedicated, high-speed storage to handle the throughput. While we definitely need to meet the IOPS requirement the vendor specifies, we also need to ensure the storage medium is dedicated. Even if the storage can deliver the required IOPS, that figure is irrelevant if the application can’t access the storage when it needs to. Thus, if using a technology such as a SAN, ensure that the required storage is dedicated to the application and that the SAN is configured accordingly.
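To make the sizing discussion concrete, here is a rough back-of-the-envelope sketch of the raw online capacity implied by the example above. The average event size of 500 bytes and the overhead multiplier are assumptions for illustration only; real figures depend on your log sources and on how your SIEM compresses and indexes data, so use your vendor's sizing guidance for actual planning.

```python
SECONDS_PER_DAY = 86_400
TB = 10**12  # decimal terabyte

def online_storage_bytes(eps, avg_event_bytes, retention_days, overhead=1.0):
    """Estimate raw bytes needed to keep `retention_days` of events online.

    avg_event_bytes and overhead are assumptions: overhead > 1.0 models
    index growth, overhead < 1.0 models net compression.
    """
    if eps < 0 or avg_event_bytes < 0 or retention_days < 0 or overhead <= 0:
        raise ValueError("inputs must be non-negative and overhead positive")
    return eps * avg_event_bytes * retention_days * SECONDS_PER_DAY * overhead

# Example from the text: 50,000 EPS kept online for six months (~180 days),
# with an assumed average event size of 500 bytes.
estimate = online_storage_bytes(50_000, 500, 180)
print(f"Estimated online capacity: {estimate / TB:.1f} TB")
```

Under these assumptions the estimate comes to roughly 389 TB before any compression, which illustrates why retention requirements should be settled before a storage medium is chosen.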
Another factor to consider when designing the storage architecture for your SIEM environment is which storage will be used at each SIEM layer. The Processing Layer (Connectors/Collectors/Forwarders) typically doesn’t store data locally unless the Analytics Layer (where data is stored) is unavailable. When the Analytics Layer is unavailable, however, the Processing Layer must cache events locally, so it needs appropriate storage of its own. Use dedicated, high-speed storage to keep up with large EPS rates, and make sure it has the capacity to meet caching requirements.
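One way to reason about the caching requirement is to ask how long a given cache volume could absorb events while the Analytics Layer is down. The sketch below is a simplified estimate; the 2 TB cache size and 500-byte average event size are hypothetical values, and it ignores any compression or filtering the Processing Layer might apply.

```python
def buffer_hours(cache_bytes, eps, avg_event_bytes):
    """Hours of events a local cache can hold at a steady ingest rate.

    All inputs are planning assumptions, not measured values.
    """
    if eps <= 0 or avg_event_bytes <= 0:
        raise ValueError("eps and avg_event_bytes must be positive")
    bytes_per_second = eps * avg_event_bytes
    return cache_bytes / bytes_per_second / 3600

# Hypothetical: a 2 TB cache volume on a collector receiving 50,000 EPS.
hours = buffer_hours(2 * 10**12, 50_000, 500)
print(f"Cache absorbs about {hours:.1f} hours of events")
```

If the result is shorter than the longest Analytics Layer outage you want to survive without data loss, the cache volume needs to grow or the event rate per collector needs to shrink.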
To save on storage costs, slower, shared storage can be used to meet offline retention requirements. When access to historical data is needed, the data can be copied back to the Analytics Layer for searching.
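The online/offline split can be expressed as a simple age-based rule. The following is a minimal sketch, assuming the "three months online, nine months offline" policy mentioned earlier, approximated as 90 and 270 days; the function name and tier labels are illustrative and do not correspond to any particular SIEM product's configuration.

```python
from datetime import date

def storage_tier(event_date, today, online_days=90, offline_days=270):
    """Return which tier an event belongs to, based only on its age.

    The default windows approximate three months online followed by
    nine months offline; adjust them to your own retention policy.
    """
    age_days = (today - event_date).days
    if age_days < 0:
        raise ValueError("event_date is in the future")
    if age_days <= online_days:
        return "online"    # dedicated, high-speed storage
    if age_days <= online_days + offline_days:
        return "offline"   # slower, shared storage
    return "expired"       # past retention, eligible for deletion

today = date(2024, 2, 1)
for d in (date(2024, 1, 1), date(2023, 6, 1), date(2022, 1, 1)):
    print(d, storage_tier(d, today))
```

Data in the "offline" tier would need to be copied back to the Analytics Layer before it can be searched, as described above.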
Ensuring you have the right storage for your SIEM environment is a simple but fundamental task. Because SIEMs can take years to fully implement and equally long to change, selecting the correct storage up front is critical. For medium-to-large enterprises, dedicated, high-speed storage should be used to obtain fast read and write performance. Smaller organizations should ideally make the same investment, but slower, more cost-effective storage can suffice where processing rates are low and end users make minimal use of the SIEM.