Intesa Splunk Pipeline

Dockerized stack:

  • Splunk 9.4.2 (HEC enabled)
  • Poller: reads from Splunk, chunks to JSONL in ./out
  • Analyzer: scans chunks, writes reports to ./reports

Quick start:

docker compose up -d --build
# In Splunk: create index "intesa_payments"
# Seed test events (see docs/prod-setup.md)

Docs:

Splunk + Poller + Analyzer — Deploy Guide (Local/Prod)

This document lists the exact steps and commands to bring up our current stack:

  • Splunk 9.4.2 (Docker) with HEC enabled
  • Poller (Python) that reads from Splunk, chunks logs to JSONL, and writes to ./out
  • Analyzer (Python) that scans chunks and generates reports in ./reports

Azure Blob/Service Bus support is already built into the poller but disabled by default. When credentials are available, flip a few env vars (see Appendix).
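
For orientation, here is a minimal sketch of the poller's read → chunk → write loop described above (hypothetical names and query; the real splunk_poller.py adds checkpointing, retries, and the Azure sinks):

# Hypothetical sketch of the poller loop (the real logic lives in splunk_poller.py).
import json, time, requests

SPLUNK = "https://splunk:8089"                 # management port inside the compose network
AUTH = ("admin", "changeme")                   # SPLUNK_PASSWORD from .env in practice
QUERY = "search index=intesa_payments earliest=-15m"

def fetch_events():
    # /services/search/jobs/export streams one JSON object per result line
    resp = requests.post(f"{SPLUNK}/services/search/jobs/export",
                         auth=AUTH, verify=False, stream=True,
                         data={"search": QUERY, "output_mode": "json"})
    resp.raise_for_status()
    for line in resp.iter_lines():
        if line:
            yield json.loads(line)

def write_chunk(events, out_dir="/app/out"):
    # one JSONL file per poll cycle, as seen in the "wrote /app/out/chunk_..." logs
    path = f"{out_dir}/chunk_{int(time.time())}.jsonl"
    with open(path, "w") as f:
        for ev in events:
            f.write(json.dumps(ev) + "\n")
    return path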


0) Prerequisites (Ubuntu)

# Docker Engine + Compose plugin
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg lsb-release

# Install Docker (if not already installed)
curl -fsSL https://get.docker.com | sudo sh

# Add your user to the docker group (re-login for this to take effect)
sudo usermod -aG docker "$USER"

1) Project files

Ensure the repo contains at minimum:

compose.yaml
.env.example
poller/
  Dockerfile
  requirements.txt
  splunk_poller.py
analyzer/
  Dockerfile
  requirements.txt
  analyzer_runner.py
  offline_analyzer.py
out/        # created at runtime; host bind mount for chunks
reports/    # created at runtime; host bind mount for analyzer reports

2) Prepare environment

Create .env (copy and edit from the example). Keep these secrets private.

cp .env.example .env
# Edit .env and set at least:
# SPLUNK_PASSWORD=Str0ngP@ss!9
# SPLUNK_HEC_TOKEN=dev-0123456789abcdef

Create local folders and set permissions for the poller to write chunks:

mkdir -p out reports
# Poller runs as UID 10001; make sure it can write to ./out
sudo chown -R 10001:10001 out

3) Launch the stack

# from the folder that contains compose.yaml
docker compose up -d --build

Watch logs:

docker compose logs -f splunk
docker compose logs -f poller
docker compose logs -f analyzer

4) Create the Splunk index (once)

Either via UI: Settings → Indexes → New Index → name it intesa_payments
Or via CLI:

docker compose exec splunk /opt/splunk/bin/splunk add index intesa_payments -auth admin:"$SPLUNK_PASSWORD"
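
If you prefer scripting it, the same index can be created through Splunk's REST API (a sketch; it assumes the management port 8089 is published to the host and uses this guide's example credentials):

# Create the index via the REST API (same effect as the CLI above).
import requests

r = requests.post("https://localhost:8089/services/data/indexes",
                  auth=("admin", "Str0ngP@ss!9"),   # your SPLUNK_PASSWORD
                  data={"name": "intesa_payments"},
                  verify=False)                     # dev container uses a self-signed cert
print(r.status_code)  # 201 created, 409 if it already exists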

5) Seed test events

Run this on your host to generate realistic events with rich fields (change the loop count as needed):

HEC_URL="https://localhost:8088/services/collector/event"
HEC_TOKEN="dev-0123456789abcdef"
INDEX="intesa_payments"
SOURCETYPE="intesa:bonifico"

gen_iban() { local d=""; for _ in $(seq 1 25); do d="${d}$((RANDOM%10))"; done; echo "IT${d}"; }
# Keep the first 6 and last 4 characters, star out the middle.
mask_iban(){ local i="$1"; local pre="${i:0:6}" suf="${i: -4}"; local n=$(( ${#i}-10 )); printf "%s" "$pre"; printf '*%.0s' $(seq 1 "$n"); printf "%s" "$suf"; }
# Seed awk from $RANDOM; bare srand() reuses the epoch second, so rapid calls would repeat.
rand_amount(){ awk -v s="$RANDOM" 'BEGIN{srand(s); printf "%.2f", 5+rand()*14995}'; }
rand_bool_str(){ if ((RANDOM%2)); then echo "true"; else echo "false"; fi; }
pick(){ local a=("$@"); echo "${a[$RANDOM%${#a[@]}]}"; }

spese=(SHA OUR BEN)
divise=(EUR EUR EUR EUR USD GBP)
steps=(compila conferma esito)
statuses=(accepted pending rejected)

for i in {1..50}; do
  t_iso=$(date -u +%FT%T.%6NZ)
  t_epoch=$(date -u +%s)
  src=$(gen_iban); dst=$(gen_iban)
  srcm=$(mask_iban "$src"); dstm=$(mask_iban "$dst")
  amt=$(rand_amount)
  inst=$(rand_bool_str)
  sp=$(pick "${spese[@]}")
  dv=$(pick "${divise[@]}")
  st=$(pick "${statuses[@]}")
  step=$(pick "${steps[@]}")

  curl -sk "$HEC_URL" \
    -H "Authorization: Splunk $HEC_TOKEN" \
    -H "Content-Type: application/json" \
    -d @- <<JSON
{
  "time": $t_epoch,
  "host": "seed.cli",
  "source": "cli_for_loop",
  "sourcetype": "$SOURCETYPE",
  "index": "$INDEX",
  "event": {
    "event_type": "bonifico",
    "step": "$step",
    "iban_origin_masked": "$srcm",
    "iban_dest_masked": "$dstm",
    "importo": "$amt",
    "divisa": "$dv",
    "istantaneo": "$inst",
    "data_pagamento": "$t_iso",
    "spese_commissioni": "$sp",
    "causale": "TEST SEED",
    "status": "$st"
  }
}
JSON
done
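
A rough Python equivalent of the seed loop, in case bash is inconvenient (same HEC endpoint and payload shape; token and host taken from the variables above):

# Python variant of the bash seed loop: posts synthetic bonifico events to HEC.
import json, random, time, requests, urllib3

urllib3.disable_warnings()   # self-signed HEC cert in dev
HEC_URL = "https://localhost:8088/services/collector/event"
HEC_TOKEN = "dev-0123456789abcdef"

def gen_iban():
    return "IT" + "".join(str(random.randint(0, 9)) for _ in range(25))

def mask_iban(iban):
    return iban[:6] + "*" * (len(iban) - 10) + iban[-4:]

for _ in range(50):
    payload = {
        "time": int(time.time()),
        "host": "seed.py",
        "source": "python_seed",
        "sourcetype": "intesa:bonifico",
        "index": "intesa_payments",
        "event": {
            "event_type": "bonifico",
            "step": random.choice(["compila", "conferma", "esito"]),
            "iban_origin_masked": mask_iban(gen_iban()),
            "iban_dest_masked": mask_iban(gen_iban()),
            "importo": f"{random.uniform(5, 15000):.2f}",
            "divisa": random.choice(["EUR", "EUR", "EUR", "EUR", "USD", "GBP"]),
            "istantaneo": random.choice(["true", "false"]),
            "data_pagamento": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            "spese_commissioni": random.choice(["SHA", "OUR", "BEN"]),
            "causale": "TEST SEED",
            "status": random.choice(["accepted", "pending", "rejected"]),
        },
    }
    requests.post(HEC_URL, data=json.dumps(payload), verify=False,
                  headers={"Authorization": f"Splunk {HEC_TOKEN}"}).raise_for_status()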

6) Verify

  • Splunk Web: http://localhost:8000 → login admin / $SPLUNK_PASSWORD
  • HEC Health:
    curl -k https://localhost:8088/services/collector/health \
      -H "Authorization: Splunk $SPLUNK_HEC_TOKEN"
    
  • Search in Splunk:
    index=intesa_payments sourcetype=intesa:bonifico earliest=-15m
    | stats count by step status
    
  • Poller logs should show file writes like:
    wrote /app/out/chunk_...jsonl (xxxxx bytes)
  • Analyzer should write: ./reports/report_<ts>.md and ./reports/anomalies_<ts>.json
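
To spot-check what the poller wrote, a throwaway script like this counts well-formed JSON lines per chunk (run from the project root):

# Count well-formed vs malformed JSON lines in each chunk under ./out.
import json, pathlib

for path in sorted(pathlib.Path("out").glob("chunk_*.jsonl")):
    ok = bad = 0
    for line in path.read_text().splitlines():
        try:
            json.loads(line)
            ok += 1
        except ValueError:
            bad += 1
    print(f"{path.name}: {ok} ok, {bad} malformed")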

7) Day-2 operations

Restart services:

docker compose restart splunk poller analyzer

Tail logs:

docker compose logs -f poller
docker compose logs -f analyzer

Reset the poller checkpoint (reprocess from lookback window):

docker compose exec poller rm -f /app/out/.ckpt
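
Deleting the file makes the next poll cycle fall back to the lookback window. Conceptually the checkpoint is just a persisted "last polled" timestamp; a hypothetical sketch (the actual format lives in splunk_poller.py):

# Hypothetical checkpoint handling: persist the last-polled timestamp.
import os, time

CKPT = "/app/out/.ckpt"
LOOKBACK_S = 900  # assumed 15-minute lookback when no checkpoint exists

def load_since():
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            return float(f.read().strip())
    return time.time() - LOOKBACK_S   # no checkpoint: reprocess the lookback window

def save_since(ts):
    with open(CKPT, "w") as f:
        f.write(str(ts))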

Fix permissions (if needed):

sudo chown -R 10001:10001 out

Clean everything (removes containers + volumes):

docker compose down --volumes --remove-orphans

Appendix: Flip to Azure later (optional)

When you obtain credentials, edit .env and set:

SINK=blob            # or blob+sb
AZURE_STORAGE_CONNECTION_STRING=...   # required for blob uploads
AZURE_STORAGE_CONTAINER=bank-logs
AZURE_COMPRESS=true

# Only if SINK=blob+sb
AZURE_SERVICEBUS_CONNECTION_STRING=...
AZURE_SERVICEBUS_QUEUE=log-chunks

Then redeploy the poller only:

docker compose up -d --build --no-deps poller
docker compose logs -f poller

You should see:

[poller] uploaded blob intesa/YYYY/MM/DD/HH/chunk_....jsonl.gz (... bytes, compressed=True)
[poller] sent Service Bus notification    # only with blob+sb
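
For reference, the upload behind that log line boils down to a call like the following (a sketch using azure-storage-blob; the path layout is inferred from the log format above):

# Sketch of the poller's blob sink: gzip a chunk and upload it under a dated prefix.
import gzip, os
from datetime import datetime, timezone
from azure.storage.blob import BlobServiceClient

def upload_chunk(path):
    conn = os.environ["AZURE_STORAGE_CONNECTION_STRING"]
    container = os.environ.get("AZURE_STORAGE_CONTAINER", "bank-logs")
    with open(path, "rb") as f:
        data = gzip.compress(f.read())                 # AZURE_COMPRESS=true behavior
    prefix = datetime.now(timezone.utc).strftime("intesa/%Y/%m/%d/%H")
    name = f"{prefix}/{os.path.basename(path)}.gz"
    svc = BlobServiceClient.from_connection_string(conn)
    svc.get_blob_client(container, name).upload_blob(data, overwrite=True)
    print(f"uploaded blob {name} ({len(data)} bytes, compressed=True)")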