# Intesa Splunk Pipeline

Dockerized stack:
- **Splunk 9.4.2** (HEC enabled)
- **Poller**: reads from Splunk, chunks to JSONL in `./out`
- **Analyzer**: scans chunks, writes reports to `./reports`

Quick start:
```bash
docker compose up -d --build
# In Splunk: create index "intesa_payments"
# Seed test events (see docs/prod-setup.md)
```

Docs:

# Splunk + Poller + Analyzer: Deploy Guide (Local/Prod)

This document lists the exact steps and commands to bring up our current stack:
- **Splunk 9.4.2** (Docker) with HEC enabled
- **Poller** (Python) that reads from Splunk, chunks logs to JSONL, and writes to `./out`
- **Analyzer** (Python) that scans chunks and generates reports in `./reports`
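
To make the "chunks logs to JSONL" step concrete, here is a minimal sketch of size-based chunking. It is not the code in `splunk_poller.py`: the function name `write_chunks`, the `max_bytes` threshold, and the `chunk_000000.jsonl` naming scheme are assumptions made for illustration only.

```python
import json
from pathlib import Path
from typing import Iterable, Iterator


def write_chunks(events: Iterable[dict], out_dir: Path, max_bytes: int = 1024) -> Iterator[Path]:
    """Write events as JSON Lines, starting a new file before max_bytes is exceeded."""
    out_dir.mkdir(parents=True, exist_ok=True)
    buf = []      # encoded lines waiting to be written
    size = 0      # total bytes currently buffered
    index = 0     # running chunk counter

    def flush() -> Path:
        # Hypothetical naming scheme; the real poller may name chunks differently.
        path = out_dir / f"chunk_{index:06d}.jsonl"
        path.write_bytes(b"".join(buf))
        return path

    for event in events:
        line = (json.dumps(event, separators=(",", ":")) + "\n").encode("utf-8")
        # Close the current chunk if this line would push it past the limit.
        if buf and size + len(line) > max_bytes:
            yield flush()
            buf, size, index = [], 0, index + 1
        buf.append(line)
        size += len(line)

    if buf:
        yield flush()
```
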

> Azure Blob/Service Bus support is already built into the poller but disabled by default. When credentials are available, flip a few env vars (see Appendix).

---

## 0) Prerequisites (Ubuntu)
```bash
# Docker Engine + Compose plugin
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg lsb-release

# Install Docker (if not already installed)
curl -fsSL https://get.docker.com | sudo sh

# Add your user to the docker group (re-login for this to take effect)
sudo usermod -aG docker "$USER"
```

---

## 1) Project files
Ensure the repo contains at minimum:
```
compose.yaml
.env.example
poller/
  Dockerfile
  requirements.txt
  splunk_poller.py
analyzer/
  Dockerfile
  requirements.txt
  analyzer_runner.py
  offline_analyzer.py
out/        # created at runtime; host bind mount for chunks
reports/    # created at runtime; host bind mount for analyzer reports
```

---

## 2) Prepare environment
Create `.env` (copy and edit from the example). **Keep these secrets private.**

```bash
cp .env.example .env
# Edit .env and set at least:
# SPLUNK_PASSWORD=Str0ngP@ss!9
# SPLUNK_HEC_TOKEN=dev-0123456789abcdef
```
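
Before launching, it can help to confirm that both values are actually set. The sketch below is a hypothetical pre-flight check, not part of the repo: `parse_env` handles only plain `KEY=VALUE` lines (no inline comments, no multi-line values), and the `REQUIRED` tuple simply mirrors the two variables named above.

```python
# Variables the guide says must be set at minimum.
REQUIRED = ("SPLUNK_PASSWORD", "SPLUNK_HEC_TOKEN")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and '#' comment lines are skipped."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_settings(env: dict, required=REQUIRED) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    from pathlib import Path

    missing = missing_settings(parse_env(Path(".env").read_text()))
    print("missing:", missing if missing else "none")
```
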

Create local folders and set permissions for the poller to write chunks:
```bash
mkdir -p out reports
# Poller runs as UID 10001; make sure it can write to ./out
sudo chown -R 10001:10001 out
```

---

## 3) Launch the stack
```bash
# from the folder that contains compose.yaml
docker compose up -d --build
```

Watch logs:
```bash
docker compose logs -f splunk
docker compose logs -f poller
docker compose logs -f analyzer
```

---

## 4) Create the Splunk index (once)
**Either via UI**: Settings → Indexes → *New Index* → `intesa_payments`
**Or via CLI** (`$SPLUNK_PASSWORD` is expanded by your host shell, so export it there first):
```bash
docker compose exec splunk /opt/splunk/bin/splunk add index intesa_payments -auth admin:"$SPLUNK_PASSWORD"
```
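
A third option is Splunk's REST API, which exposes index creation at `services/data/indexes` on the management port. The sketch below only builds the request so it can be inspected before sending. It assumes the management port 8089 is published by `compose.yaml`, which this guide does not state; `build_create_index_request` is a hypothetical helper name.

```python
import base64
import urllib.parse
import urllib.request


def build_create_index_request(name, password, user="admin",
                               base_url="https://localhost:8089"):
    """Build (but do not send) a POST that asks Splunk to create an index.

    base_url is an assumption: 8089 is Splunk's default management port,
    but it must be published by the compose file to be reachable from the host.
    """
    data = urllib.parse.urlencode({"name": name}).encode("ascii")
    req = urllib.request.Request(f"{base_url}/services/data/indexes",
                                 data=data, method="POST")
    # HTTP Basic auth with the Splunk admin credentials.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    return req


if __name__ == "__main__":
    req = build_create_index_request("intesa_payments", "change-me")
    print(req.get_method(), req.full_url, req.data)
```
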

---

## 5) Seed test data (optional but recommended)

Run this on your host to generate realistic events **with rich fields** (change the loop count as needed):

```bash
HEC_URL="https://localhost:8088/services/collector/event"
HEC_TOKEN="dev-0123456789abcdef"
INDEX="intesa_payments"
SOURCETYPE="intesa:bonifico"

gen_iban() { local d=""; for _ in $(seq 1 25); do d="${d}$((RANDOM%10))"; done; echo "IT${d}"; }
# Keep the first 6 and last 4 characters; replace everything in between with '*'
mask_iban(){ local i="$1"; local pre="${i:0:6}"; local suf="${i: -4}"; local n=$(( ${#i}-10 )); printf "%s" "$pre"; printf '%0.s*' $(seq 1 "$n"); printf "%s" "$suf"; }
# Seed awk from $RANDOM; a bare srand() uses the clock and repeats within the same second
rand_amount(){ awk -v seed="$RANDOM" 'BEGIN{srand(seed); printf "%.2f", 5+rand()*14995}'; }
rand_bool_str(){ if ((RANDOM%2)); then echo "true"; else echo "false"; fi; }
pick(){ local a=("$@"); echo "${a[$RANDOM%${#a[@]}]}"; }

spese=(SHA OUR BEN)
divise=(EUR EUR EUR EUR USD GBP)
steps=(compila conferma esito)
statuses=(accepted pending rejected)

for i in {1..50}; do
  t_iso=$(date -u +%FT%T.%6NZ)
  t_epoch=$(date -u +%s)
  src=$(gen_iban); dst=$(gen_iban)
  srcm=$(mask_iban "$src"); dstm=$(mask_iban "$dst")
  amt=$(rand_amount)
  inst=$(rand_bool_str)
  sp=$(pick "${spese[@]}")
  dv=$(pick "${divise[@]}")
  st=$(pick "${statuses[@]}")
  step=$(pick "${steps[@]}")

  curl -sk "$HEC_URL" -H "Authorization: Splunk $HEC_TOKEN" -H "Content-Type: application/json" -d @- <<JSON
{
  "time": $t_epoch,
  "host": "seed.cli",
  "source": "cli_for_loop",
  "sourcetype": "$SOURCETYPE",
  "index": "$INDEX",
  "event": {
    "event_type": "bonifico",
    "step": "$step",
    "iban_origin_masked": "$srcm",
    "iban_dest_masked": "$dstm",
    "importo": "$amt",
    "divisa": "$dv",
    "istantaneo": "$inst",
    "data_pagamento": "$t_iso",
    "spese_commissioni": "$sp",
    "causale": "TEST SEED",
    "status": "$st"
  }
}
JSON
done
```
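
For anyone who prefers to seed from Python, the payload construction above can be sketched as follows. This mirrors the field names and value lists from the shell script, but it is an illustration rather than a tool shipped in the repo: `build_event` and its parameters are hypothetical, timestamps are truncated to whole seconds, and sending the JSON to HEC is left out.

```python
import json
import random
import time


def mask_iban(iban: str) -> str:
    """Keep the first 6 and last 4 characters and star out the rest.

    Assumes the input is longer than 10 characters (seed IBANs are 27).
    """
    hidden = max(len(iban) - 10, 0)
    return iban[:6] + "*" * hidden + iban[-4:]


def gen_iban(rng: random.Random) -> str:
    """Fake IBAN: 'IT' plus 25 random digits (not checksum-valid)."""
    return "IT" + "".join(str(rng.randrange(10)) for _ in range(25))


def build_event(rng, now=None, index="intesa_payments", sourcetype="intesa:bonifico") -> dict:
    """Build one HEC event envelope with the same fields as the shell loop."""
    now = time.time() if now is None else now
    return {
        "time": int(now),
        "host": "seed.cli",
        "source": "cli_for_loop",
        "sourcetype": sourcetype,
        "index": index,
        "event": {
            "event_type": "bonifico",
            "step": rng.choice(["compila", "conferma", "esito"]),
            "iban_origin_masked": mask_iban(gen_iban(rng)),
            "iban_dest_masked": mask_iban(gen_iban(rng)),
            "importo": f"{rng.uniform(5, 15000):.2f}",
            "divisa": rng.choice(["EUR", "EUR", "EUR", "EUR", "USD", "GBP"]),
            "istantaneo": rng.choice(["true", "false"]),
            "data_pagamento": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now)),
            "spese_commissioni": rng.choice(["SHA", "OUR", "BEN"]),
            "causale": "TEST SEED",
            "status": rng.choice(["accepted", "pending", "rejected"]),
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_event(random.Random()), indent=2))
```
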

---

## 6) Verify

- **Splunk Web**: http://localhost:8000 → login `admin` / `$SPLUNK_PASSWORD`
- **HEC Health**:
  ```bash
  curl -k https://localhost:8088/services/collector/health -H "Authorization: Splunk $SPLUNK_HEC_TOKEN"
  ```
- **Search in Splunk**:
  ```
  index=intesa_payments sourcetype=intesa:bonifico earliest=-15m
  | stats count by step status
  ```
- **Poller logs** should show file writes like:
  `wrote /app/out/chunk_...jsonl (xxxxx bytes)`
- **Analyzer** should write: `./reports/report_<ts>.md` and `./reports/anomalies_<ts>.json`
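
To cross-check the Splunk search against what landed in `./out`, the same `count by step status` aggregation can be run locally on a chunk. This is a sketch under an assumption: it expects each JSONL line to be an object with `step` and `status` either at the top level or nested under `event`. The actual chunk layout written by the poller is not documented here and may differ.

```python
import json
from collections import Counter


def count_by_step_status(lines):
    """Count records per (step, status) pair, like `stats count by step status`."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        # Assumption: fields are flat, or nested under "event" as in the HEC payload.
        event = record.get("event", record)
        counts[(event.get("step"), event.get("status"))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_chunks.py out/<some-chunk>.jsonl
    with open(sys.argv[1], encoding="utf-8") as fh:
        for (step, status), n in sorted(count_by_step_status(fh).items(), key=str):
            print(step, status, n)
```
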

---

## 7) Day-2 operations

Restart services:
```bash
docker compose restart splunk poller analyzer
```

Tail logs:
```bash
docker compose logs -f poller
docker compose logs -f analyzer
```

Reset the poller checkpoint (reprocess from lookback window):
```bash
docker compose exec poller rm -f /app/out/.ckpt
```
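
Why deleting `.ckpt` triggers a reprocess can be illustrated with a small sketch of checkpoint-or-lookback logic. The details are assumptions, not the poller's actual code: it supposes the file holds a single epoch timestamp, and both `resolve_start_time` and the 900-second default lookback are hypothetical.

```python
from pathlib import Path


def resolve_start_time(ckpt_path: Path, now: float, lookback_seconds: int = 900) -> float:
    """Return the saved checkpoint if present, else fall back to now - lookback.

    Assumption: the checkpoint file contains one epoch timestamp as text.
    """
    try:
        return float(ckpt_path.read_text().strip())
    except (FileNotFoundError, ValueError):
        # Missing or unreadable checkpoint: start from the lookback window.
        return now - lookback_seconds


if __name__ == "__main__":
    import time

    print(resolve_start_time(Path("out/.ckpt"), now=time.time()))
```
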

Fix permissions (if needed):
```bash
sudo chown -R 10001:10001 out
```

Clean everything (removes containers + volumes):
```bash
docker compose down --volumes --remove-orphans
```

---

## Appendix: Flip to Azure later (optional)

When you obtain credentials, edit `.env` and set:
```
SINK=blob # or blob+sb
AZURE_STORAGE_CONNECTION_STRING=... # required for blob uploads
AZURE_STORAGE_CONTAINER=bank-logs
AZURE_COMPRESS=true

# Only if SINK=blob+sb
AZURE_SERVICEBUS_CONNECTION_STRING=...
AZURE_SERVICEBUS_QUEUE=log-chunks
```
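
The dependency between `SINK` and the Azure variables can be expressed as a small check. This is a hypothetical sketch: the variable names come from the block above, but treating the container and queue names as mandatory is an assumption, since the poller may supply defaults for them.

```python
def required_azure_vars(sink: str) -> list:
    """List the variables a given SINK value depends on (e.g. 'blob', 'blob+sb')."""
    parts = set(sink.split("+"))
    needed = []
    if "blob" in parts:
        # Assumption: the container name is treated as required here.
        needed += ["AZURE_STORAGE_CONNECTION_STRING", "AZURE_STORAGE_CONTAINER"]
    if "sb" in parts:
        needed += ["AZURE_SERVICEBUS_CONNECTION_STRING", "AZURE_SERVICEBUS_QUEUE"]
    return needed


def missing_azure_vars(env: dict) -> list:
    """Return required Azure variables that are absent or empty in env."""
    return [key for key in required_azure_vars(env.get("SINK", "")) if not env.get(key)]


if __name__ == "__main__":
    import os

    print("missing:", missing_azure_vars(dict(os.environ)))
```
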

Then redeploy the poller only:
```bash
docker compose up -d --build --no-deps poller
docker compose logs -f poller
```

You should see:
```
[poller] uploaded blob intesa/YYYY/MM/DD/HH/chunk_....jsonl.gz (... bytes, compressed=True)
[poller] sent Service Bus notification # only with blob+sb
```
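
The `intesa/YYYY/MM/DD/HH/` layout and the `.gz` suffix in the log line suggest hour-partitioned, gzip-compressed uploads. A rough sketch of those two pieces follows. It assumes the partition is derived from a UTC timestamp, which the log line does not confirm; `blob_prefix` and `compress_chunk` are hypothetical names, and the chunk file name itself is deliberately left out.

```python
import gzip
from datetime import datetime, timezone


def blob_prefix(ts: datetime, root: str = "intesa") -> str:
    """Build the hour-partitioned prefix, e.g. intesa/2024/01/02/03/.

    Assumption: partitions use UTC. Pass a timezone-aware datetime.
    """
    ts = ts.astimezone(timezone.utc)
    return f"{root}/{ts:%Y/%m/%d/%H}/"


def compress_chunk(data: bytes) -> bytes:
    """Gzip a chunk's bytes, as AZURE_COMPRESS=true implies."""
    return gzip.compress(data)


if __name__ == "__main__":
    print(blob_prefix(datetime.now(timezone.utc)))
```

Listing blobs by such a prefix makes it straightforward to fetch one hour of chunks at a time.
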