Intesa Splunk Pipeline
Dockerized stack:
- Splunk 9.4.2 (HEC enabled)
- Poller: reads from Splunk, chunks to JSONL in ./out
- Analyzer: scans chunks, writes reports to ./reports
Quick start:
docker compose up -d --build
# In Splunk: create index "intesa_payments"
# Seed test events (see docs/prod-setup.md)
Docs:
Splunk + Poller + Analyzer — Deploy Guide (Local/Prod)
This document lists the exact steps and commands to bring up our current stack:
- Splunk 9.4.2 (Docker) with HEC enabled
- Poller (Python) that reads from Splunk, chunks logs to JSONL, and writes to ./out
- Analyzer (Python) that scans chunks and generates reports in ./reports
Azure Blob/Service Bus support is already built into the poller but disabled by default. When credentials are available, flip a few env vars (see Appendix).
0) Prerequisites (Ubuntu)
# Docker Engine + Compose plugin
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg lsb-release
# Install Docker (if not already installed)
curl -fsSL https://get.docker.com | sudo sh
# Add your user to the docker group (re-login for this to take effect)
sudo usermod -aG docker "$USER"
1) Project files
Ensure the repo contains at minimum:
compose.yaml
.env.example
poller/
Dockerfile
requirements.txt
splunk_poller.py
analyzer/
Dockerfile
requirements.txt
analyzer_runner.py
offline_analyzer.py
out/ # created at runtime; host bind mount for chunks
reports/ # created at runtime; host bind mount for analyzer reports
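A quick way to confirm the layout before building (file names exactly as listed above; prints nothing when everything is present):

```shell
# Verify the minimum set of files exists; each missing file is reported
for f in compose.yaml .env.example \
         poller/Dockerfile poller/requirements.txt poller/splunk_poller.py \
         analyzer/Dockerfile analyzer/requirements.txt \
         analyzer/analyzer_runner.py analyzer/offline_analyzer.py; do
  [ -e "$f" ] || echo "missing: $f"
done
```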
2) Prepare environment
Create .env (copy and edit from the example). Keep these secrets private.
cp .env.example .env
# Edit .env and set at least:
# SPLUNK_PASSWORD=Str0ngP@ss!9
# SPLUNK_HEC_TOKEN=dev-0123456789abcdef
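The values above are placeholders. One way to generate random credentials, assuming openssl is installed (Splunk requires the admin password to be at least 8 characters; both values below are well over that):

```shell
# Generate a random HEC token and admin password; paste the output into .env
echo "SPLUNK_HEC_TOKEN=$(openssl rand -hex 16)"     # 32 hex chars
echo "SPLUNK_PASSWORD=$(openssl rand -base64 18)"   # 24 base64 chars
```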
Create local folders and set permissions for the poller to write chunks:
mkdir -p out reports
# Poller runs as UID 10001; make sure it can write to ./out
sudo chown -R 10001:10001 out
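Before launching, it can help to confirm that .env actually defines the required keys (prints nothing when both are set):

```shell
# Report any required key missing from .env
for key in SPLUNK_PASSWORD SPLUNK_HEC_TOKEN; do
  grep -q "^${key}=" .env || echo "missing in .env: $key"
done
```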
3) Launch the stack
# from the folder that contains compose.yaml
docker compose up -d --build
Watch logs:
docker compose logs -f splunk
docker compose logs -f poller
docker compose logs -f analyzer
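Splunk takes a minute or two to initialize. A simple readiness loop against the HEC health endpoint (recent Splunk versions return a short JSON body containing "HEC is healthy" once ingestion is up):

```shell
# Poll HEC health until Splunk is ready (up to ~2 minutes)
for _ in $(seq 1 24); do
  if curl -sk https://localhost:8088/services/collector/health | grep -q 'HEC is healthy'; then
    echo "HEC ready"
    break
  fi
  sleep 5
done
```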
4) Create the Splunk index (once)
Either via UI: Settings → Indexes → New Index → intesa_payments
Or via CLI:
docker compose exec splunk /opt/splunk/bin/splunk add index intesa_payments -auth admin:"$SPLUNK_PASSWORD"
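To confirm the index was created, the Splunk management API on port 8089 can be queried (self-signed cert, hence -k; empty output means the index is missing):

```shell
# Look up the index via the REST API; prints its name if it exists
curl -sk -u admin:"$SPLUNK_PASSWORD" \
  "https://localhost:8089/services/data/indexes/intesa_payments?output_mode=json" \
  | grep -o '"name":"intesa_payments"'
```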
5) Seed test data (optional but recommended)
Run this on your host to generate realistic events with rich fields (change the loop count as needed):
HEC_URL="https://localhost:8088/services/collector/event"
HEC_TOKEN="dev-0123456789abcdef"
INDEX="intesa_payments"
SOURCETYPE="intesa:bonifico"
gen_iban() { local d=""; for _ in $(seq 1 25); do d="${d}$((RANDOM%10))"; done; echo "IT${d}"; }
mask_iban(){ local i="$1"; local pre="${i:0:6}"; local suf="${i: -4}"; local n=$(( ${#i}-10 )); printf "%s" "$pre"; printf '*%.0s' $(seq 1 $n); printf "%s" "$suf"; }
rand_amount(){ awk -v s="$RANDOM$RANDOM" 'BEGIN{srand(s); printf "%.2f", 5+rand()*14995}'; }  # seed from $RANDOM: srand() alone uses whole seconds, repeating values within a second
rand_bool_str(){ if ((RANDOM%2)); then echo "true"; else echo "false"; fi; }
pick(){ local a=("$@"); echo "${a[$RANDOM%${#a[@]}]}"; }
spese=(SHA OUR BEN)
divise=(EUR EUR EUR EUR USD GBP)
steps=(compila conferma esito)
statuses=(accepted pending rejected)
for i in {1..50}; do
t_iso=$(date -u +%FT%T.%6NZ)
t_epoch=$(date -u +%s)
src=$(gen_iban); dst=$(gen_iban)
srcm=$(mask_iban "$src"); dstm=$(mask_iban "$dst")
amt=$(rand_amount)
inst=$(rand_bool_str)
sp=$(pick "${spese[@]}")
dv=$(pick "${divise[@]}")
st=$(pick "${statuses[@]}")
step=$(pick "${steps[@]}")
curl -sk "$HEC_URL" -H "Authorization: Splunk $HEC_TOKEN" -H "Content-Type: application/json" -d @- <<JSON
{
"time": $t_epoch,
"host": "seed.cli",
"source": "cli_for_loop",
"sourcetype": "$SOURCETYPE",
"index": "$INDEX",
"event": {
"event_type": "bonifico",
"step": "$step",
"iban_origin_masked": "$srcm",
"iban_dest_masked": "$dstm",
"importo": "$amt",
"divisa": "$dv",
"istantaneo": "$inst",
"data_pagamento": "$t_iso",
"spese_commissioni": "$sp",
"causale": "TEST SEED",
"status": "$st"
}
}
JSON
done
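The IBAN helpers can be sanity-checked locally before seeding (self-contained, no Splunk needed):

```shell
# Standalone check of the IBAN helpers: a 27-char test IBAN should come back
# 27 chars masked, with the first 6 and last 4 characters preserved
gen_iban() { local d=""; for _ in $(seq 1 25); do d="${d}$((RANDOM%10))"; done; echo "IT${d}"; }
mask_iban(){ local i="$1"; printf "%s" "${i:0:6}"; printf '*%.0s' $(seq 1 $(( ${#i}-10 ))); printf "%s" "${i: -4}"; }
iban=$(gen_iban)
masked=$(mask_iban "$iban")
echo "$iban -> $masked"
```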
6) Verify
- Splunk Web: http://localhost:8000 → login admin / $SPLUNK_PASSWORD
- HEC health:
curl -k https://localhost:8088/services/collector/health -H "Authorization: Splunk $SPLUNK_HEC_TOKEN"
- Search in Splunk:
index=intesa_payments sourcetype=intesa:bonifico earliest=-15m | stats count by step status
- Poller logs should show file writes like:
wrote /app/out/chunk_...jsonl (xxxxx bytes)
- Analyzer should write:
./reports/report_<ts>.md and ./reports/anomalies_<ts>.json
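To see how many events the poller has chunked so far, count lines across the chunk files (assumes one JSON event per JSONL line, which is the format the poller writes):

```shell
# Total events written to ./out so far; 0 means no chunks yet
cat out/chunk_*.jsonl 2>/dev/null | wc -l
```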
7) Day‑2 operations
Restart services:
docker compose restart splunk poller analyzer
Tail logs:
docker compose logs -f poller
docker compose logs -f analyzer
Reset the poller checkpoint (reprocess from lookback window):
docker compose exec poller rm -f /app/out/.ckpt
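Since ./out is a host bind mount, the same checkpoint can also be handled from the host, and backing it up first makes the previous position restorable (prefix with sudo if ./out is owned by UID 10001):

```shell
# Back up then remove the poller checkpoint (host-side path of /app/out/.ckpt)
ckpt=out/.ckpt
[ -f "$ckpt" ] && cp "$ckpt" "${ckpt}.bak"
rm -f "$ckpt"
```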
Fix permissions (if needed):
sudo chown -R 10001:10001 out
Clean everything (removes containers + volumes):
docker compose down --volumes --remove-orphans
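Note that `down --volumes` removes the containers and the Splunk data volume but leaves the host bind mounts untouched; clear those separately for a fully clean slate (prefix with sudo if ./out is owned by UID 10001):

```shell
# Remove generated chunks, the checkpoint, and reports from the host bind mounts
rm -f out/chunk_*.jsonl out/.ckpt
rm -f reports/report_*.md reports/anomalies_*.json
```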
Appendix: Flip to Azure later (optional)
When you obtain credentials, edit .env and set:
SINK=blob # or blob+sb
AZURE_STORAGE_CONNECTION_STRING=... # required for blob uploads
AZURE_STORAGE_CONTAINER=bank-logs
AZURE_COMPRESS=true
# Only if SINK=blob+sb
AZURE_SERVICEBUS_CONNECTION_STRING=...
AZURE_SERVICEBUS_QUEUE=log-chunks
Then redeploy the poller only:
docker compose up -d --build --no-deps poller
docker compose logs -f poller
You should see:
[poller] uploaded blob intesa/YYYY/MM/DD/HH/chunk_....jsonl.gz (... bytes, compressed=True)
[poller] sent Service Bus notification # only with blob+sb