Wednesday, July 22, 2015

SIEM Planning - Reference Architecture for Midsize Deployments

After going through several websites and documents, I sadly discovered, as many of you have before, that HP has not yet published any reference architectures or certified design documents for different needs.

I decided to write a series of blog articles that build reference architectures for SIEM deployments. They are written primarily for HP ArcSight, but because the solution components are more or less similar across vendors, I believe they will be applicable to all SIEM environments.

Gartner defines a small deployment as one with around 300 log sources and 1500 EPS (events per second). A midsize deployment is considered to have up to 1000 log sources and 7000 EPS. Finally, a large deployment generally covers more than 1000 log sources with approximately 15000 EPS. There can of course be larger deployments with over 15000 EPS, but architecture-wise they can be considered “very large” deployments.
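As a rough illustration, the sizing tiers above can be turned into a small helper. This is only a sketch: the function name and the way the gaps between the quoted figures are handled (for example, 500 sources at 9000 EPS) are my own assumptions, not part of the Gartner definition.

```python
# Thresholds come from the sizing figures quoted above. How values that
# fall between the quoted figures are treated is an assumption of this sketch.
SMALL_MAX_SOURCES, SMALL_MAX_EPS = 300, 1500
MIDSIZE_MAX_SOURCES, MIDSIZE_MAX_EPS = 1000, 7000
LARGE_MAX_EPS = 15000


def classify_deployment(log_sources: int, eps: int) -> str:
    """Return a rough size tier for a SIEM deployment (hypothetical helper)."""
    if log_sources < 0 or eps < 0:
        raise ValueError("log_sources and eps must be non-negative")
    # A deployment only fits a tier if both numbers fit within it.
    if log_sources <= SMALL_MAX_SOURCES and eps <= SMALL_MAX_EPS:
        return "small"
    if log_sources <= MIDSIZE_MAX_SOURCES and eps <= MIDSIZE_MAX_EPS:
        return "midsize"
    # Anything beyond midsize is large until it exceeds 15000 EPS.
    if eps <= LARGE_MAX_EPS:
        return "large"
    return "very large"


if __name__ == "__main__":
    for sources, eps in [(250, 1200), (800, 5000), (1500, 12000), (2000, 20000)]:
        print(sources, eps, classify_deployment(sources, eps))
```

Taking the larger of the two dimensions as the deciding one is a conservative choice: a site with few sources but a high event rate still needs the bigger design.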

In this article, I will give the details of a midsize deployment, covering components both for a primary datacenter and a disaster recovery center, working in an active-passive setup.

The reference architecture for a midsize deployment covers a scenario where the company needs both a long-term log storage solution (ArcSight Logger) and Security Event Management and SOC capabilities (ArcSight ESM).

The diagram below shows how the different components of the architecture are set up.

  • In this setup, software SmartConnectors are used to collect the logs. Up to 8 software connectors can be configured on a server, and 1 GB of memory should be allocated for each connector instance, in addition to what the server needs for its own operation.
  • If appliances are not used, make sure to use purpose-built hardware servers whose resources are not shared. Like other big data solutions, these systems are greedy in terms of resources (CPU, memory, IOPS) and do not perform well in virtual environments.
  • Each source sends logs to one SmartConnector only. SmartConnector-level redundancy is possible only for Syslog connectors, and only when the connectors are placed behind a load balancer. This also provides load sharing and scalability and is a best practice. DB and File connectors have no such option because they pull the logs from the sources.
  • When a DB or File connector is down, no logs are lost: logs continue to be written to local resources at the source and are collected once the connector comes back.
  • For log storage and searching, SmartConnectors in each datacenter send their logs to the Logger appliance hosted in the same location, which saves significant bandwidth. Each Logger appliance backs up the other through the failover destination option configured on the SmartConnectors. Thanks to the peering configuration between the Loggers, logs can be queried through either Logger appliance without having to connect to each device.
  • DC ESM is the primary ESM for both datacenters. DRC ESM is used only in case of DR.
  • Logs and alerts are archived daily on both ESM and Logger.
  • In a DR case, there is no RPO. Configurations for ESM and Logger are planned to be synchronized manually. ESM and Logger are expected to be operational instantly.
  • Configuration backups for SmartConnectors and Loggers are collected using ArcSight Management Center (ArcMC).
  • SmartConnector statistics and status can also be followed easily using ArcMC. It is also recommended to perform SmartConnector updates through the ArcMC GUI.
  • SmartConnector-level configuration options (aggregation, filtering out, batching, etc.) are easier to configure using ArcMC.
  • Finally, it is strongly recommended to use a test ESM system to test all filters, rules, active lists and other configuration objects before applying them to production systems, as a misconfiguration in these settings may crash your ESM and cost you very valuable data.
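
The connector sizing guideline from the first item in the list (up to 8 software connectors per server, 1 GB of memory per connector on top of the server's own needs) can be sketched as a small planning helper. The function name and the `os_overhead_gb` default are hypothetical; the memory the server itself needs depends on your operating system and should be replaced with your own figure.

```python
import math

MAX_CONNECTORS_PER_SERVER = 8  # limit stated in the list above
MEMORY_PER_CONNECTOR_GB = 1    # per-connector allocation stated in the list above


def plan_connector_hosts(connector_count: int, os_overhead_gb: int = 4) -> list:
    """Spread connectors over servers and estimate memory per server.

    os_overhead_gb is an assumed placeholder for what the server needs
    for its own operation; it is not a vendor figure.
    """
    if connector_count < 0 or os_overhead_gb < 0:
        raise ValueError("arguments must be non-negative")
    servers = math.ceil(connector_count / MAX_CONNECTORS_PER_SERVER)
    plan = []
    remaining = connector_count
    for index in range(servers):
        # Fill each server up to the limit before starting the next one.
        on_this_server = min(remaining, MAX_CONNECTORS_PER_SERVER)
        plan.append({
            "server": index + 1,
            "connectors": on_this_server,
            "memory_gb": on_this_server * MEMORY_PER_CONNECTOR_GB + os_overhead_gb,
        })
        remaining -= on_this_server
    return plan


if __name__ == "__main__":
    for host in plan_connector_hosts(11):
        print(host)
```

With 11 connectors and the assumed 4 GB overhead, this yields two servers: one with 8 connectors and 12 GB, and one with 3 connectors and 7 GB.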

9 comments:

  1. Whoa, this weblog is wonderful; I like studying your articles.

    Keep up the good work! You know, a lot of individuals are searching around
    for this info; you could aid them greatly.

  2. Well, thanks for the nice words. Having some feedback definitely motivates me to do more.

    By the way, if anyone has a special topic that they would like covered, feel free to contact me.

  3. ya, i like too bro. it is awesome. really appreciate your blog.

  4. Thank you very much for the excellent article.

  5. Thanks for doing what HP should be doing! It's really appreciated, this is really useful, I'm ever so grateful.

    Replies
    1. Thanks, it is nice to know that what I am sharing is actually followed.

  6. I have a question please... are the loggers able to send data to each other and maintain identical copies of both for DR reasons? I mean if connectors only communicate to 1 logger (for bandwidth reasons), can the other logger simply copy new files to itself? At the same time we could configure the failover on the connectors to reduce possibilities of loss...

    Replies
    1. Hello,
      Keeping identical copies at both sites can be done by synchronizing the databases rather than trying to configure Logger for it. Moreover, we are talking about long-term log storage, mainly for compliance with laws; nobody needs the logs from 6 months ago in a DR case.
      Connectors can be configured to send to 2 Loggers at the same time, but that is too much redundancy. Loggers synchronizing between themselves would create almost the same load on the DR link as sending the logs to 2 different Loggers.

    2. Thanks for your help... really appreciated... what I need is the ability to maintain access to the log files for 6 months online + 2 years long term... and if 1 logger goes down, I don't want to lose access to the files online...

      I see the Logger as storage; if the Logger fails, I could lose the files on it... is this correct? Or do I need a separate DB alongside the Logger?
