
# Intesa Logs Project Documentation
This repo implements a small, production-style pipeline that inspects bank transfer (“**bonifico**”) logs, looks for anomalies (e.g., **rejected EUR ≥ 10,000**, **`vop_no_match`**, **invalid IBAN/BIC**), and produces a concise report (optionally emailed).
It runs **locally via Docker** and is designed to be **deployable to Azure** using the same containers.
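
The detection rules are simple per-event predicates. A minimal sketch in Python (field names such as `status`, `importo`, `divisa`, `vop_check`, and `bic_swift` come from the sample events in the quickstart; the BIC regex is a simplified structural check, not the analyzer's actual validation):

```python
import re

def flag_anomalies(event: dict) -> list[str]:
    """Return the anomaly labels that apply to one bonifico event."""
    flags = []
    # Rule 1: rejected EUR transfer of 10,000 or more
    if (event.get("status") == "rejected"
            and event.get("divisa") == "EUR"
            and float(event.get("importo", 0)) >= 10_000):
        flags.append("rejected_eur_gte_10000")
    # Rule 2: Verification-of-Payee mismatch
    if event.get("vop_check") == "no_match":
        flags.append("vop_no_match")
    # Rule 3: structurally invalid BIC (8 or 11 chars). The masked IBAN
    # fields can't be checksum-validated, so a real check needs unmasked data.
    bic = event.get("bic_swift", "")
    if bic and not re.fullmatch(r"[A-Z]{4}[A-Z]{2}[A-Z0-9]{2}([A-Z0-9]{3})?", bic):
        flags.append("invalid_bic")
    return flags
```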
---
## High-level flow
**Splunk (HEC)** → **Poller** → *(Chunks: file or Azure Blob)* → *(Optional: Azure Queue message)* → **Analyzer API** → *(Optional: Email via Mailtrap)*
- **Local mode:** Poller writes chunk **files** to a shared volume. Analyzer reads those files directly.
- **Azure mode (final target):** Poller uploads **blobs** to Storage (`bank-logs`) and enqueues messages to Storage Queue (`log-chunks`). A **Queue Worker** consumes queue messages and calls the Analyzer API.
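
A minimal sketch of the Azure-mode hand-off (the connection-string env var name and the queue-message schema are assumptions; the container and queue names match the Storage section below):

```python
import gzip, json, os
from azure.storage.blob import BlobServiceClient
from azure.storage.queue import QueueClient

CONN = os.environ["AZURE_STORAGE_CONNECTION_STRING"]  # assumed env var name

def push_chunk(chunk_name: str, events: list[dict]) -> None:
    # 1) Upload the gzipped JSONL chunk to the bank-logs container
    payload = gzip.compress("\n".join(json.dumps(e) for e in events).encode())
    blob = (BlobServiceClient.from_connection_string(CONN)
            .get_blob_client(container="bank-logs", blob=chunk_name))
    blob.upload_blob(payload, overwrite=True)
    # 2) Enqueue a pointer message for the worker on log-chunks
    QueueClient.from_connection_string(CONN, "log-chunks").send_message(
        json.dumps({"blob": chunk_name}))  # assumed message schema
```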
---
## Current state snapshot (what's running now)
### ✅ Running in Azure
- **App Service (Agent API)**
  - Name: `tf-in-dev-chatapp-app`
  - Image: `tfindevacr.azurecr.io/agent-api:prod` (pulled from ACR via Managed Identity)
  - Public endpoint: `https://tf-in-dev-chatapp-app.azurewebsites.net`
  - Health: `GET /health` → `{"status":"ok"}`
  - API: `POST /analyze` (see examples below)
- **Azure Container Registry (ACR)**
  - Name: `tfindevacr`
  - Repos/tags present:
    - `agent-api:prod`
    - `queue-worker:prod` *(built & pushed; not yet deployed)*
- **Azure Storage (data plane in use)**
  - Storage account: `tfindevst`
  - **Blob container:** `bank-logs` (holds `.jsonl` or `.jsonl.gz` chunks)
  - **Queue:** `log-chunks` (messages the worker consumes)
> The API is live in Azure. The **worker** and **Splunk** are still local right now.
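
A hedged example of calling the live API from Python (the request body mirrors the local `/analyze` example at the end of the quickstart):

```python
import requests

resp = requests.post(
    "https://tf-in-dev-chatapp-app.azurewebsites.net/analyze",
    json={
        "question": ("Scan latest chunks. Flag rejected EUR >= 10000, "
                     "vop_no_match, invalid IBAN/BIC."),
        "email": {"send": False},
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```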
### ✅ Running locally (Docker Compose)
- **Splunk** container (HEC exposed)
- **Poller** (`splunk_poller.py`)
  - It can run in either mode:
    - `SINK=file` → write chunks to a local volume (simple local dev), or
    - `SINK=blob+queue` → upload to Azure Blob + enqueue to Azure Queue (production-like; see the sketch above)
- **Queue Worker** (`worker/`)
  - Currently running **locally**, reading the Azure Storage Queue and calling the Analyzer (either the local API or the Azure API, depending on `ANALYZER_URL`); a minimal sketch of the loop follows below.
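
A minimal sketch of that worker loop (the message schema, env var names, and the contract that the Analyzer resolves the blob reference itself are all assumptions):

```python
import json, os, time
import requests
from azure.storage.queue import QueueClient

ANALYZER_URL = os.environ["ANALYZER_URL"]  # local or Azure API base URL
queue = QueueClient.from_connection_string(
    os.environ["AZURE_STORAGE_CONNECTION_STRING"], "log-chunks")

while True:
    for msg in queue.receive_messages(max_messages=16, visibility_timeout=300):
        chunk_ref = json.loads(msg.content)  # assumed schema: {"blob": "..."}
        # Hand the chunk reference to the Analyzer; delete only on success
        r = requests.post(f"{ANALYZER_URL}/analyze", json=chunk_ref, timeout=120)
        if r.ok:
            queue.delete_message(msg)
    time.sleep(5)  # idle back-off between polls
```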
---
## Quickstart
```bash
# 1) Create a .env (see sample below)
# 2) Make sure compose.yaml sets SINK=file (if local) or SINK=blob+queue (if Azure) for the poller
# 3) Start the stack
docker compose up -d
# 4) Check health
curl -sS http://localhost:8080/health
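# expect: {"status":"ok"}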
# 5) Send test events to Splunk HEC
for i in {1..5}; do
  curl -k https://localhost:8088/services/collector/event \
    -H "Authorization: Splunk dev-0123456789abcdef" \
    -H "Content-Type: application/json" \
    -d '{"event":{"event_type":"bonifico","step":"esito","status":"accepted","importo": '"$((RANDOM%5000+50))"',"divisa":"EUR","transaction_id":"TX-'$RANDOM'"},"sourcetype":"intesa:bonifico","index":"intesa_payments"}' >/dev/null 2>&1
done
# 6) Add a couple of anomalies to exercise the analyzer
curl -k https://localhost:8088/services/collector/event \
  -H "Authorization: Splunk dev-0123456789abcdef" \
  -H "Content-Type: application/json" \
  -d '{"event":{"event_type":"bonifico","step":"esito","status":"rejected","importo":12500,"divisa":"EUR","vop_check":"no_match","iban_origin_masked":"IT1998*2*4*6*8*10*12*14*16*9375","iban_dest_masked":"IT1171*2*4*6*8*10*12*14*16*0000","bic_swift":"TESTBICX"},"sourcetype":"intesa:bonifico","index":"intesa_payments"}'
# 7) Ask the Agent API to analyze the latest local chunks
curl -sS -X POST http://localhost:8080/analyze \
  -H 'Content-Type: application/json' \
  -d '{"question":"Scan latest chunks. Flag rejected EUR >= 10000, vop_no_match, invalid IBAN/BIC.","email":{"send":false}}' | jq .