Update README.md

This commit is contained in:
daniel.g 2025-09-24 16:42:38 +00:00
parent 5c9ee3f995
commit 0b6e62650f

```bash
docker compose up -d --build
# In Splunk: create index "intesa_payments"
# Seed test events (see docs/prod-setup.md)
```
Docs: see docs/prod-setup.md for the full step-by-step and operational commands.
# Splunk + Poller + Analyzer — Deploy Guide (Local/Prod)
This document lists the exact steps and commands to bring up our current stack:
- **Splunk 9.4.2** (Docker) with HEC enabled
- **Poller** (Python) that reads from Splunk, chunks logs to JSONL, and writes to `./out`
- **Analyzer** (Python) that scans chunks and generates reports in `./reports`
> Azure Blob/Service Bus support is already built into the poller but disabled by default. When credentials are available, flip a few env vars (see Appendix).
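For orientation, a minimal sketch of what `compose.yaml` might contain; the repo's actual file is authoritative. The service names come from the `docker compose logs` commands used below, the image tag from the version above, and the environment keys are standard `splunk/splunk` image variables; everything else here is an assumption:

```yaml
# Illustrative sketch only; defer to the repo's compose.yaml
services:
  splunk:
    image: splunk/splunk:9.4.2
    environment:
      SPLUNK_START_ARGS: --accept-license
      SPLUNK_PASSWORD: ${SPLUNK_PASSWORD}
      SPLUNK_HEC_TOKEN: ${SPLUNK_HEC_TOKEN}
    ports:
      - "8000:8000"   # Splunk Web
      - "8088:8088"   # HEC
  poller:
    build: ./poller
    volumes:
      - ./out:/app/out
  analyzer:
    build: ./analyzer
    volumes:
      - ./out:/app/out
      - ./reports:/app/reports
```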
---
## 0) Prerequisites (Ubuntu)
```bash
# Docker Engine + Compose plugin
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg lsb-release
# Install Docker (if not already installed)
curl -fsSL https://get.docker.com | sudo sh
# Add your user to the docker group (re-login for this to take effect)
sudo usermod -aG docker "$USER"
```
---
## 1) Project files
Ensure the repo contains at minimum:
```
compose.yaml
.env.example
poller/
Dockerfile
requirements.txt
splunk_poller.py
analyzer/
Dockerfile
requirements.txt
analyzer_runner.py
offline_analyzer.py
out/ # created at runtime; host bind mount for chunks
reports/ # created at runtime; host bind mount for analyzer reports
```
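A throwaway loop can confirm the layout before building. `check_layout` is purely illustrative, not part of the repo; adjust the file list to taste:

```shell
#!/usr/bin/env bash
# Illustrative pre-flight: report any expected file that is absent.
check_layout() {
  for f in "$@"; do
    [ -e "$f" ] || echo "missing: $f"
  done
}

check_layout compose.yaml .env.example \
  poller/Dockerfile poller/requirements.txt poller/splunk_poller.py \
  analyzer/Dockerfile analyzer/requirements.txt analyzer/analyzer_runner.py
```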
---
## 2) Prepare environment
Create `.env` (copy and edit from the example). **Keep these secrets private.**
```bash
cp .env.example .env
# Edit .env and set at least:
# SPLUNK_PASSWORD=Str0ngP@ss!9
# SPLUNK_HEC_TOKEN=dev-0123456789abcdef
```
Create local folders and set permissions for the poller to write chunks:
```bash
mkdir -p out reports
# Poller runs as UID 10001; make sure it can write to ./out
sudo chown -R 10001:10001 out
```
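Before launching, a quick pre-flight can catch missing secrets early. `check_env` below is a hypothetical helper, not part of the repo; it only checks that each key appears in the file, not that its value is sane:

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper (not in the repo): report keys absent from an env file.
check_env() {
  local file="$1"; shift
  local missing=""
  for v in "$@"; do
    grep -q "^${v}=" "$file" 2>/dev/null || missing="$missing $v"
  done
  if [ -z "$missing" ]; then echo "OK"; else echo "missing:$missing"; fi
}

# Usage (prints OK once both keys are set):
[ -f .env ] && check_env .env SPLUNK_PASSWORD SPLUNK_HEC_TOKEN || echo "no .env yet"
```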
---
## 3) Launch the stack
```bash
# from the folder that contains compose.yaml
docker compose up -d --build
```
Watch logs:
```bash
docker compose logs -f splunk
docker compose logs -f poller
docker compose logs -f analyzer
```
---
## 4) Create the Splunk index (once)
**Either via UI**: Settings → Indexes → *New Index* → name it `intesa_payments`
**Or via CLI**:
```bash
docker compose exec splunk /opt/splunk/bin/splunk add index intesa_payments -auth admin:"$SPLUNK_PASSWORD"
```
---
## 5) Seed test data (optional but recommended)
Run this on your host to generate realistic events **with rich fields** (change the loop count as needed):
```bash
HEC_URL="https://localhost:8088/services/collector/event"
HEC_TOKEN="dev-0123456789abcdef"
INDEX="intesa_payments"
SOURCETYPE="intesa:bonifico"
gen_iban() { local d=""; for _ in $(seq 1 25); do d="${d}$((RANDOM%10))"; done; echo "IT${d}"; }
# Keep the first 6 and last 4 characters, star out the rest ('*%.0s' prints one '*' per argument)
mask_iban(){ local i="$1"; local pre="${i:0:6}"; local suf="${i: -4}"; local n=$(( ${#i}-10 )); printf "%s" "$pre"; printf '*%.0s' $(seq 1 "$n"); printf "%s" "$suf"; }
# Seed from $RANDOM: bare srand() seeds from the clock, so all calls in the same second repeat
rand_amount(){ awk -v s="$RANDOM" 'BEGIN{srand(s); printf "%.2f", 5+rand()*14995}'; }
rand_bool_str(){ if ((RANDOM%2)); then echo "true"; else echo "false"; fi; }
pick(){ local a=("$@"); echo "${a[$RANDOM%${#a[@]}]}"; }
spese=(SHA OUR BEN)
divise=(EUR EUR EUR EUR USD GBP)
steps=(compila conferma esito)
statuses=(accepted pending rejected)
for i in {1..50}; do
t_iso=$(date -u +%FT%T.%6NZ)
t_epoch=$(date -u +%s)
src=$(gen_iban); dst=$(gen_iban)
srcm=$(mask_iban "$src"); dstm=$(mask_iban "$dst")
amt=$(rand_amount)
inst=$(rand_bool_str)
sp=$(pick "${spese[@]}")
dv=$(pick "${divise[@]}")
st=$(pick "${statuses[@]}")
step=$(pick "${steps[@]}")
curl -sk "$HEC_URL" -H "Authorization: Splunk $HEC_TOKEN" -H "Content-Type: application/json" -d @- <<JSON
{
"time": $t_epoch,
"host": "seed.cli",
"source": "cli_for_loop",
"sourcetype": "$SOURCETYPE",
"index": "$INDEX",
"event": {
"event_type": "bonifico",
"step": "$step",
"iban_origin_masked": "$srcm",
"iban_dest_masked": "$dstm",
"importo": "$amt",
"divisa": "$dv",
"istantaneo": "$inst",
"data_pagamento": "$t_iso",
"spese_commissioni": "$sp",
"causale": "TEST SEED",
"status": "$st"
}
}
JSON
done
```
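If the masking ever looks wrong, the helper can be exercised in isolation. This is a standalone copy of `mask_iban` with a synthetic 27-character IBAN; the real script above is unchanged:

```shell
#!/usr/bin/env bash
# Standalone copy of mask_iban for a quick sanity check:
# keep the first 6 and last 4 characters, star out the middle.
mask_iban() {
  local i="$1"
  local pre="${i:0:6}" suf="${i: -4}"
  local n=$(( ${#i} - 10 ))
  printf '%s' "$pre"
  printf '*%.0s' $(seq 1 "$n")   # one '*' per consumed argument
  printf '%s\n' "$suf"
}

mask_iban "IT$(printf '0%.0s' $(seq 1 25))"   # synthetic 27-char IBAN
```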
---
## 6) Verify
- **Splunk Web**: http://localhost:8000 → log in as `admin` with the `SPLUNK_PASSWORD` value from `.env`
- **HEC Health**:
```bash
curl -k https://localhost:8088/services/collector/health -H "Authorization: Splunk $SPLUNK_HEC_TOKEN"
```
- **Search in Splunk**:
```
index=intesa_payments sourcetype=intesa:bonifico earliest=-15m
| stats count by step status
```
- **Poller logs** should show file writes like:
`wrote /app/out/chunk_...jsonl (xxxxx bytes)`
- **Analyzer** should write: `./reports/report_<ts>.md` and `./reports/anomalies_<ts>.json`
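To spot-check throughput from the host, a tiny helper can total the JSONL records written so far. Illustrative only; it assumes the `chunk_*.jsonl` naming shown in the poller logs and that each event occupies one line:

```shell
#!/usr/bin/env bash
# Illustrative helper: total JSONL records across all chunk files in a directory.
count_records() {
  cat "$1"/chunk_*.jsonl 2>/dev/null | wc -l | tr -d ' '
}

count_records ./out   # one JSON event per line, so line count == event count
```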
---
## 7) Day-2 operations
Restart services:
```bash
docker compose restart splunk poller analyzer
```
Tail logs:
```bash
docker compose logs -f poller
docker compose logs -f analyzer
```
Reset the poller checkpoint (reprocess from lookback window):
```bash
docker compose exec poller rm -f /app/out/.ckpt
```
Fix permissions (if needed):
```bash
sudo chown -R 10001:10001 out
```
Clean everything (removes containers + volumes):
```bash
docker compose down --volumes --remove-orphans
```
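Disk usage in `./out` grows unbounded. If a retention policy is wanted (an assumption; nothing in the current stack prunes chunks), a cron-able helper could look like the sketch below. Only prune after the analyzer has consumed the chunks:

```shell
#!/usr/bin/env bash
# Assumed retention helper (not part of the stack): delete chunk files older than N days.
prune_chunks() {
  local dir="$1" days="$2"
  find "$dir" -name 'chunk_*.jsonl' -mtime +"$days" -print -delete
}

[ -d ./out ] && prune_chunks ./out 7 || true
```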
---
## Appendix: Flip to Azure later (optional)
When you obtain credentials, edit `.env` and set:
```
SINK=blob # or blob+sb
AZURE_STORAGE_CONNECTION_STRING=... # required for blob uploads
AZURE_STORAGE_CONTAINER=bank-logs
AZURE_COMPRESS=true
# Only if SINK=blob+sb
AZURE_SERVICEBUS_CONNECTION_STRING=...
AZURE_SERVICEBUS_QUEUE=log-chunks
```
Then redeploy the poller only:
```bash
docker compose up -d --build --no-deps poller
docker compose logs -f poller
```
You should see:
```
[poller] uploaded blob intesa/YYYY/MM/DD/HH/chunk_....jsonl.gz (... bytes, compressed=True)
[poller] sent Service Bus notification # only with blob+sb
```