How to Pull Multiple SIEM Logs Into a Single AWS S3 Bucket

In a previous article, we explained how to configure AWS to store your Incapsula SIEM logs in an S3 bucket. In this article, we’ll explain how to build on that configuration to push SIEM logs from multiple Incapsula subaccounts, each stored in its own S3 bucket, into a single bucket.

Getting all your SIEM logs into a single, common location can help you establish a centralized environment where you can access and analyze security information and alerts from all of your sources.

Overview

Imperva Incapsula allows you to push your Incapsula account’s security information and event management (SIEM) logs directly to a designated bucket in Amazon Simple Storage Service (S3).

Note: The code sample shown in this article is provided as a simplified illustration to help you understand the concept of how to combine your SIEM logs. This sample is not intended to be used in a production environment.

We’ll write a small script to pull multiple SIEM logs into a single S3 bucket. For this example, we’ll use a Python script and run it as an AWS Lambda function. If you’re unfamiliar with Lambda, it’s a serverless compute service that runs your back-end code on Amazon-managed infrastructure, so you don’t provision or manage servers yourself. Lambda-based code consists of functions triggered by events, and Python is one of several languages that Lambda supports. For details, check out the AWS Lambda Developer Guide at the following URL:

http://docs.aws.amazon.com/lambda/latest/dg/welcome.html
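For orientation, a Python Lambda function is simply a handler that receives the triggering event and a context object. Here’s a minimal sketch (lambda_handler is the conventional default name):

def lambda_handler(event, context):
    # 'event' carries the trigger's payload; 'context' carries runtime metadata
    print('Received event:', event)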

To get objects from your existing buckets and put them into a new bucket, we’ll use boto3, the AWS SDK for Python. Specifically, we’ll use the get_object and put_object methods of the S3.Client class in the boto3 SDK.
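As a quick illustration of those two calls (the bucket and key names here are placeholders), copying an object from one bucket to another looks like this:

import boto3

s3_client = boto3.client('s3')

# Fetch the object from the source bucket
obj = s3_client.get_object(Bucket='source-bucket', Key='logs/example.log')
data = obj['Body'].read()

# Write the same bytes to the target bucket under the same key
s3_client.put_object(Bucket='target-bucket', Key='logs/example.log', Body=data)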

Requirements

We’ll assume that you already have an AWS account and multiple S3 buckets where you are accumulating your SIEM logs. Here are the other requirements:

  1. A new S3 “target” bucket to hold your consolidated logs. You can find step-by-step instructions for creating a bucket in this article.
  2. An installation of the AWS boto3 SDK. You can get installation instructions and complete documentation at the following URL:

http://boto3.readthedocs.io/en/latest/index.html

  3. Identity and Access Management (IAM) credentials for your AWS account. The example Python script runs in an IAM context, rather than using a specific user account. Using IAM credentials is a best practice when running machine-to-machine scripts. At minimum, the role your Lambda function assumes needs s3:GetObject permission on each source bucket and s3:PutObject permission on the target bucket. For more information about IAM, see the AWS Identity and Access Management User Guide at the following URL:

http://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html

  4. A “Put Object” trigger within the Lambda function for each S3 bucket from which you will be pulling your SIEM logs. This trigger fires whenever a log is written or uploaded to the S3 bucket; a sketch of the event it delivers appears after this list.
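For reference, the event that S3 hands to your Lambda function for each PUT looks roughly like the following (abridged to the fields the example script actually reads; bucket and key names are placeholders):

event = {
    'Records': [
        {
            's3': {
                'bucket': {'name': 'your-source-bucket'},
                'object': {'key': 'example_siem_log.log'}
            }
        }
    ]
}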

In addition to these requirements, we’ve designed the example script to run within a single AWS region. If your SIEM logs are stored across multiple AWS regions, you’ll need to add appropriate cross-region permissions and possibly other accommodations to your script.

Example Python script

This script runs each time one of your source buckets fires the Put Object trigger. It gets the S3 object as raw data, splits off the header information, decompresses the SIEM data, prints it to the console for debugging, and then writes it to your S3 consolidation bucket.

import zlib

import boto3


def lambda_handler(event, context):
    # Startup
    print('Loading function')
    print('Log stream name:', context.log_stream_name)

    s3_client = boto3.client('s3')

    # Process each record passed in by the S3 PUT trigger
    for record in event['Records']:
        key = record['s3']['object']['key']
        bucket = record['s3']['bucket']['name']

        # Now that we have the key and bucket, use the boto3 S3.Client
        # to get the object instead of downloading the file
        print('Try to get the file %s in bucket %s' % (key, bucket))
        obj = s3_client.get_object(Bucket=bucket, Key=key)

        # Read the Body element of the S3 response - this is the raw file data
        file_content = obj['Body'].read().decode('latin-1')

        # Split the header info from the SIEM data
        file_split_data = file_content.split('|==|\n')[1].encode('latin-1')

        # Decompress the SIEM data
        uncompressed_file_content = zlib.decompressobj().decompress(file_split_data)

        # Log the data for debugging
        print(uncompressed_file_content.decode())

        # Use the boto3 S3.Client to put the decompressed log into the
        # target bucket (replace the placeholder with your bucket's name)
        s3_client.put_object(Bucket='your-target-bucket',
                             Key=key,
                             Body=uncompressed_file_content)
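To sanity-check the handler outside of Lambda, you can invoke it with a synthetic event that mimics the structure shown earlier. The bucket and key names below are placeholders; actually running this requires valid AWS credentials and existing buckets:

# Hypothetical local smoke test for the handler above
class FakeContext:
    # Stand-in for the Lambda context object; only the attribute
    # the handler reads is provided
    log_stream_name = 'local-test'

test_event = {
    'Records': [
        {'s3': {'bucket': {'name': 'your-source-bucket'},
                'object': {'key': 'example_siem_log.log'}}}
    ]
}

lambda_handler(test_event, FakeContext())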

Use the concepts presented in this article to create your own code with appropriate error checking, security measures, and validation, following your organization’s best practices. Don’t hesitate to contact us if you need help consolidating your SIEM logs in AWS S3.