# Intesa Logs – Project Documentation

This repo implements a small, production-style pipeline that inspects bank transfer (“**bonifico**”) logs, looks for anomalies (e.g., **rejected EUR ≥ 10,000**, **`vop_no_match`**, **invalid IBAN/BIC**), and produces a concise report (optionally emailed). It runs **locally via Docker** and is designed to be **deployable to Azure** using the same containers.

---

## High-level flow

**Splunk (HEC)** → **Poller** → *(Chunks: file or Azure Blob)* → *(Optional: Azure Queue message)* → **Analyzer API** → *(Optional: Email via Mailtrap)*

- **Local mode:** Poller writes chunk **files** to a shared volume. Analyzer reads those files directly.
- **Azure mode (final target):** Poller uploads **blobs** to Storage (`bank-logs`) and enqueues messages to Storage Queue (`log-chunks`). A **Queue Worker** consumes queue messages and calls the Analyzer API.

---

## Current state snapshot (what’s running now)

### ✅ Running in Azure

- **App Service (Agent API)**
  - Name: `tf-in-dev-chatapp-app`
  - Image: `tfindevacr.azurecr.io/agent-api:prod` (pulled from ACR via Managed Identity)
  - Public endpoint: `https://tf-in-dev-chatapp-app.azurewebsites.net`
  - Health: `GET /health` → `{"status":"ok"}`
  - API: `POST /analyze` (see examples below)
- **Azure Container Registry (ACR)**
  - Name: `tfindevacr`
  - Repos/tags present:
    - `agent-api:prod` ✅
    - `queue-worker:prod` ✅ *(built & pushed; not yet deployed)*
- **Azure Storage (data plane in use)**
  - Storage account: `tfindevst`
  - **Blob container:** `bank-logs` (holds `.jsonl` or `.jsonl.gz` chunks)
  - **Queue:** `log-chunks` (messages the worker consumes)

> The API is live in Azure. The **worker** and **Splunk** are still local right now.

### ✅ Running locally (Docker Compose)

- **Splunk** container (HEC exposed)
- **Poller** (`splunk_poller.py`)
  - You can run it in either:
    - `SINK=file` → write chunks to a local volume (simple local dev), or
    - `SINK=blob+queue` → upload to Azure Blob + enqueue to Azure Queue (production-like)
- **Queue Worker** (`worker/`)
  - Currently running **locally**, reading the Azure Storage Queue and calling the Analyzer (either the local API or the Azure API, based on `ANALYZER_URL`).
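As a quick smoke test of the live endpoint, the deployed API can be exercised directly. This is a minimal sketch that assumes the Azure App Service accepts the same `/analyze` payload as the local API; the request body is copied from the local quickstart below:

```bash
# Liveness check against the deployed App Service
curl -sS https://tf-in-dev-chatapp-app.azurewebsites.net/health

# Same /analyze request as the local quickstart, pointed at the Azure endpoint
# (email sending left disabled)
curl -sS -X POST https://tf-in-dev-chatapp-app.azurewebsites.net/analyze \
  -H 'Content-Type: application/json' \
  -d '{"question":"Scan latest chunks. Flag rejected EUR >= 10000, vop_no_match, invalid IBAN/BIC.","email":{"send":false}}' | jq .
```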
---

## Quickstart

```bash
# 1) Create a .env (see sample below)
# 2) Make sure compose.yaml has SINK=file (if local) or SINK=blob/blob+queue (if Azure) for the poller

# 3) Start the stack
docker compose up -d

# 4) Check health
curl -sS http://localhost:8080/health

# 5) Send test events to Splunk HEC
for i in {1..5}; do
  curl -k https://localhost:8088/services/collector/event \
    -H "Authorization: Splunk dev-0123456789abcdef" \
    -H "Content-Type: application/json" \
    -d '{"event":{"event_type":"bonifico","step":"esito","status":"accepted","importo": '"$((RANDOM%5000+50))"',"divisa":"EUR","transaction_id":"TX-'$RANDOM'"},"sourcetype":"intesa:bonifico","index":"intesa_payments"}' >/dev/null 2>&1
done

# 6) Add a couple of anomalies to exercise the analyzer
curl -k https://localhost:8088/services/collector/event \
  -H "Authorization: Splunk dev-0123456789abcdef" \
  -H "Content-Type: application/json" \
  -d '{"event":{"event_type":"bonifico","step":"esito","status":"rejected","importo":12500,"divisa":"EUR","vop_check":"no_match","iban_origin_masked":"IT1998*2*4*6*8*10*12*14*16*9375","iban_dest_masked":"IT1171*2*4*6*8*10*12*14*16*0000","bic_swift":"TESTBICX"},"sourcetype":"intesa:bonifico","index":"intesa_payments"}'

# 7) Ask the Agent API to analyze the latest local chunks
curl -sS -X POST http://localhost:8080/analyze \
  -H 'Content-Type: application/json' \
  -d '{"question":"Scan latest chunks. Flag rejected EUR >= 10000, vop_no_match, invalid IBAN/BIC.","email":{"send":false}}' | jq .
```
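When the poller runs with `SINK=blob+queue`, you can spot-check that chunks and queue messages are actually landing in Azure. A sketch using the Azure CLI, assuming you are logged in (`az login`) and your identity has data-plane RBAC (e.g., Storage Blob/Queue Data Reader) on `tfindevst`:

```bash
# List chunk blobs in the bank-logs container (the newest entries are what the poller just wrote)
az storage blob list \
  --account-name tfindevst \
  --container-name bank-logs \
  --auth-mode login \
  --query "[].{name:name, modified:properties.lastModified}" \
  -o table

# Peek at pending messages in the log-chunks queue without dequeuing them
az storage message peek \
  --account-name tfindevst \
  --queue-name log-chunks \
  --auth-mode login \
  --num-messages 5
```

If blobs appear but the queue peek comes back empty, the worker has most likely already consumed the messages.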