Intesa Logs Project Documentation

This repo implements a small, production-style pipeline that inspects bank transfer (“bonifico”) logs, looks for anomalies (e.g., rejected EUR transfers ≥ 10,000, vop_no_match results, invalid IBAN/BIC), and produces a concise report (optionally emailed).

It runs locally via Docker and is designed to be deployable to Azure using the same containers.


High-level flow

Splunk (HEC) → Poller → Chunks (file or Azure Blob) → (Optional: Azure Queue message) → Analyzer API → (Optional: Email via Mailtrap)

  • Local mode: Poller writes chunk files to a shared volume. Analyzer reads those files directly.
  • Azure mode (final target): Poller uploads blobs to Storage (bank-logs) and enqueues messages to Storage Queue (log-chunks). A Queue Worker consumes queue messages and calls the Analyzer API.
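
To make the two sink modes concrete, here is a minimal Python sketch of the poller's chunk-writing step. It assumes the azure-storage-blob and azure-storage-queue SDKs plus an AZURE_STORAGE_CONNECTION_STRING variable; the function name write_chunk and the queue message schema are illustrative, not necessarily what splunk_poller.py does verbatim.

import gzip, json, os, time, uuid
from azure.storage.blob import BlobServiceClient
from azure.storage.queue import QueueClient

SINK = os.environ.get("SINK", "file")                      # file | blob | blob+queue
CHUNK_DIR = os.environ.get("CHUNK_DIR", "/data/chunks")    # shared volume in local mode
CONN = os.environ.get("AZURE_STORAGE_CONNECTION_STRING")   # needed for blob modes

def write_chunk(events):
    """Persist a batch of events as one .jsonl.gz chunk (illustrative)."""
    name = f"chunk-{int(time.time())}-{uuid.uuid4().hex[:8]}.jsonl.gz"
    payload = gzip.compress("\n".join(json.dumps(e) for e in events).encode())
    if SINK == "file":
        with open(os.path.join(CHUNK_DIR, name), "wb") as f:
            f.write(payload)
    else:
        svc = BlobServiceClient.from_connection_string(CONN)
        svc.get_blob_client(container="bank-logs", blob=name).upload_blob(payload)
        if SINK == "blob+queue":
            # Assumed message shape: just enough for the worker to fetch the blob.
            QueueClient.from_connection_string(CONN, "log-chunks").send_message(
                json.dumps({"container": "bank-logs", "blob": name}))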

Current state snapshot (what's running now)

Running in Azure

  • App Service (Agent API)

    • Name: tf-in-dev-chatapp-app
    • Image: tfindevacr.azurecr.io/agent-api:prod (pulled from ACR via Managed Identity)
    • Public endpoint: https://tf-in-dev-chatapp-app.azurewebsites.net
    • Health: GET /health → {"status":"ok"}
    • API: POST /analyze (see examples below)
  • Azure Container Registry (ACR)

    • Name: tfindevacr
    • Repos/tags present:
      • agent-api:prod
      • queue-worker:prod (built & pushed; not yet deployed)
  • Azure Storage (data plane in use)

    • Storage account: tfindevst
    • Blob container: bank-logs (holds .jsonl or .jsonl.gz chunks)
    • Queue: log-chunks (messages the worker consumes)
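
For a quick sanity check of this data plane, a short Python snippet (assuming the azure-storage-blob / azure-storage-queue SDKs and a connection string in AZURE_STORAGE_CONNECTION_STRING) can list the chunk blobs and peek at pending queue messages without consuming them:

import os
from azure.storage.blob import ContainerClient
from azure.storage.queue import QueueClient

conn = os.environ["AZURE_STORAGE_CONNECTION_STRING"]

# Chunk blobs currently sitting in bank-logs.
for blob in ContainerClient.from_connection_string(conn, "bank-logs").list_blobs():
    print(blob.name, blob.size)

# Peek (not dequeue) up to five pending messages on log-chunks.
for msg in QueueClient.from_connection_string(conn, "log-chunks").peek_messages(max_messages=5):
    print(msg.content)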

The API is live in Azure. The worker and Splunk are still local right now.
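
Equivalent smoke tests against the live App Service, in Python with requests; the /analyze body mirrors the curl example in the quick start below:

import requests

BASE = "https://tf-in-dev-chatapp-app.azurewebsites.net"

# Health probe: expects {"status":"ok"}.
print(requests.get(f"{BASE}/health", timeout=10).json())

# Analysis request (same body as the local curl example further down).
resp = requests.post(
    f"{BASE}/analyze",
    json={
        "question": "Scan latest chunks. Flag rejected EUR >= 10000, vop_no_match, invalid IBAN/BIC.",
        "email": {"send": False},
    },
    timeout=60,
)
print(resp.json())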

Running locally (Docker Compose)

  • Splunk container (HEC exposed)
  • Poller (splunk_poller.py)
    • You can run it in either of two modes (sketched under “High-level flow” above):
      • SINK=file → writes chunks to a local volume (simple local dev), or
      • SINK=blob+queue → uploads to Azure Blob and enqueues an Azure Queue message (production-like)
  • Queue Worker (worker/)
    • Currently running locally, reading the Azure Storage Queue and calling the Analyzer (either the local API or the Azure API, depending on ANALYZER_URL); a minimal sketch of the loop follows.
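
The sketch below assumes the azure-storage-queue SDK, requests, and the queue message shape used in the poller sketch above; the /analyze request body here is illustrative, and the real worker/ code may pass the chunk reference differently.

import json, os, time
import requests
from azure.storage.queue import QueueClient

ANALYZER_URL = os.environ.get("ANALYZER_URL", "http://localhost:8080")
queue = QueueClient.from_connection_string(
    os.environ["AZURE_STORAGE_CONNECTION_STRING"], "log-chunks")

while True:
    for msg in queue.receive_messages(max_messages=10, visibility_timeout=60):
        ref = json.loads(msg.content)  # e.g. {"container": "bank-logs", "blob": "..."}
        # Delete only on success, so failed messages reappear after the
        # visibility timeout and get retried.
        r = requests.post(f"{ANALYZER_URL}/analyze", json={"chunk": ref}, timeout=120)
        if r.ok:
            queue.delete_message(msg)
    time.sleep(5)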

Quick start (local)

# 1) Create a .env (see sample below)
# 2) In compose.yaml, set the poller's SINK=file (local) or SINK=blob / SINK=blob+queue (Azure)
# 3) Start the stack
docker compose up -d

# 4) Check health
curl -sS http://localhost:8080/health

# 5) Send test events to Splunk HEC
for i in {1..5}; do
  curl -k https://localhost:8088/services/collector/event \
    -H "Authorization: Splunk dev-0123456789abcdef" \
    -H "Content-Type: application/json" \
    -d '{"event":{"event_type":"bonifico","step":"esito","status":"accepted","importo": '"$((RANDOM%5000+50))"',"divisa":"EUR","transaction_id":"TX-'$RANDOM'"},"sourcetype":"intesa:bonifico","index":"intesa_payments"}' >/dev/null 2>&1
done

# 6) Add a couple of anomalies to exercise the analyzer
curl -k https://localhost:8088/services/collector/event \
  -H "Authorization: Splunk dev-0123456789abcdef" \
  -H "Content-Type: application/json" \
  -d '{"event":{"event_type":"bonifico","step":"esito","status":"rejected","importo":12500,"divisa":"EUR","vop_check":"no_match","iban_origin_masked":"IT1998*2*4*6*8*10*12*14*16*9375","iban_dest_masked":"IT1171*2*4*6*8*10*12*14*16*0000","bic_swift":"TESTBICX"},"sourcetype":"intesa:bonifico","index":"intesa_payments"}'

# 7) Ask the Agent API to analyze the latest local chunks
curl -sS -X POST http://localhost:8080/analyze \
  -H 'Content-Type: application/json' \
  -d '{"question":"Scan latest chunks. Flag rejected EUR >= 10000, vop_no_match, invalid IBAN/BIC.","email":{"send":false}}' | jq .