Update README.md

Commit `f805a51f5d` (parent `fa0d76d216`), file: README.md, 66 lines (`@@ -1,27 +1,61 @@`)
# Intesa Logs – Project Documentation

This repo implements a small, production-style pipeline that inspects bank transfer (“**bonifico**”) logs, looks for anomalies (e.g., **rejected EUR ≥ 10,000**, **`vop_no_match`**, **invalid IBAN/BIC**), and produces a concise report (optionally emailed).

It runs **locally via Docker** and is designed to be **deployable to Azure** using the same containers.
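The anomaly rules can be sketched as a small predicate. This is an illustration only, not the actual logic in `flask_app.py`; the event field names (`status`, `currency`, `amount`, `vop_result`, `iban`) are assumptions about the chunk schema, and the IBAN check is shape-only (no checksum):

```python
import re

# Structural IBAN shape: 2 letters, 2 digits, 11-30 alphanumerics (no checksum).
IBAN_RE = re.compile(r"^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$")

def find_anomalies(event: dict) -> list[str]:
    """Return the names of the rules this bonifico event violates."""
    reasons = []
    # Rule 1: rejected transfers of EUR >= 10,000
    if (event.get("status") == "rejected"
            and event.get("currency") == "EUR"
            and float(event.get("amount", 0)) >= 10_000):
        reasons.append("rejected_eur_gte_10k")
    # Rule 2: Verification-of-Payee mismatch
    if event.get("vop_result") == "vop_no_match":
        reasons.append("vop_no_match")
    # Rule 3: structurally invalid IBAN
    iban = str(event.get("iban", "")).replace(" ", "").upper()
    if not IBAN_RE.match(iban):
        reasons.append("invalid_iban")
    return reasons
```

For example, a rejected EUR 12,000 transfer with a well-formed IBAN trips only the first rule.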
## High-level flow

**Splunk (HEC)** → **Poller** → *(chunks: file or Azure Blob)* → *(optional: Azure Queue message)* → **Analyzer API** → *(optional: email via Mailtrap)*

- **Local mode:** the Poller writes chunk **files** to a shared volume; the Analyzer reads those files directly.
- **Azure mode (final target):** the Poller uploads **blobs** to Storage (`bank-logs`) and enqueues messages to the Storage Queue (`log-chunks`). A **Queue Worker** consumes queue messages and calls the Analyzer API.

---
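The queue-driven handoff can be sketched end-to-end with in-memory stand-ins (a `deque` for the `log-chunks` queue, a `dict` for the `bank-logs` container). The message shape (`{"blob": ...}`) and the `analyze` callable are hypothetical; the real worker would use the Azure Storage SDK and an HTTP call to the Analyzer API:

```python
import json
from collections import deque

def run_worker(queue: deque, blobs: dict, analyze) -> list:
    """Drain queue messages, fetch each referenced chunk, call the analyzer.

    `queue` stands in for the Storage Queue and `blobs` for the Blob
    container; both are simplifications for illustration.
    """
    reports = []
    while queue:
        msg = json.loads(queue.popleft())   # queue message names a blob
        chunk = blobs[msg["blob"]]          # "download" the chunk
        events = [json.loads(line) for line in chunk.splitlines() if line]
        reports.append(analyze(events))     # in prod: POST /analyze
    return reports

# Producer side: the poller uploads a chunk, then enqueues its name.
blobs = {"chunk-0001.jsonl": '{"amount": 12000}\n{"amount": 5}\n'}
queue = deque([json.dumps({"blob": "chunk-0001.jsonl"})])
reports = run_worker(queue, blobs, analyze=lambda evs: {"events": len(evs)})
```

Decoupling producer and consumer through the queue is what lets the worker run locally today while the Analyzer runs in Azure.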
## Current state snapshot (what’s running now)

### ✅ Running in Azure

- **App Service (Agent API)**
  - Name: `tf-in-dev-chatapp-app`
  - Image: `tfindevacr.azurecr.io/agent-api:prod` (pulled from ACR via Managed Identity)
  - Public endpoint: `https://tf-in-dev-chatapp-app.azurewebsites.net`
  - Health: `GET /health` → `{"status":"ok"}`
  - API: `POST /analyze` (see examples below)
- **Azure Container Registry (ACR)**
  - Name: `tfindevacr`
  - Repos/tags present:
    - `agent-api:prod` ✅
    - `queue-worker:prod` ✅ *(built & pushed; not yet deployed)*

- **Azure Storage (data plane in use)**
  - Storage account: `tfindevst`
  - **Blob container:** `bank-logs` (holds `.jsonl` or `.jsonl.gz` chunks)
  - **Queue:** `log-chunks` (messages the worker consumes)

> The API is live in Azure. The **worker** and **Splunk** are still local right now.
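Calling the live API could look like the sketch below. The URL and routes come from the snapshot above, but the request body for `POST /analyze` is an assumption (check `flask_app.py` for the real contract); the request is only constructed here, not sent:

```python
import json
import urllib.request

# Endpoint from the App Service snapshot; the body shape is a guess.
ANALYZER_URL = "https://tf-in-dev-chatapp-app.azurewebsites.net/analyze"

body = json.dumps({
    "events": [  # hypothetical chunk contents
        {"status": "rejected", "currency": "EUR", "amount": 15000},
    ],
}).encode("utf-8")

req = urllib.request.Request(
    ANALYZER_URL,
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would actually send it; omitted here.
```

A quick `GET /health` with `curl` against the same host is the fastest liveness check before wiring the worker to it.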
### ✅ Running locally (Docker Compose)

- **Splunk** container (HEC exposed)
- **Poller** (`splunk_poller.py`), runnable in either mode:
  - `SINK=file` → writes chunks to a local volume (simple local dev), or
  - `SINK=blob+queue` → uploads to Azure Blob and enqueues to the Azure Queue (production-like)
- **Queue Worker** (`worker/`)
  - Currently running **locally**, reading the Azure Storage Queue and calling the Analyzer (the local API or the Azure API, depending on `ANALYZER_URL`).
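In `SINK=file` mode the poller's output contract is just "newline-delimited JSON files on the shared volume". A minimal sketch of that contract (the chunk naming scheme and directory layout are guesses, not necessarily what `splunk_poller.py` does):

```python
import json
import time
from pathlib import Path

def write_chunk(events: list[dict], out_dir: Path) -> Path:
    """Write one poll's worth of events as a .jsonl chunk (SINK=file mode)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # Timestamped name is an assumption; the real poller may differ.
    path = out_dir / f"chunk-{int(time.time() * 1000)}.jsonl"
    with path.open("w", encoding="utf-8") as fh:
        for event in events:
            fh.write(json.dumps(event) + "\n")
    return path
```

The Analyzer can then consume each file line by line, which is why the same chunks can later be shipped as `.jsonl` blobs without changing the format.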
---
## Quick start (TL;DR)

```bash
# 1) Create a .env (see sample below)
# 2) In compose.yaml, set SINK=file for the poller (local mode) or SINK=blob / SINK=blob+queue (Azure mode)
# 3) Start the stack
docker compose up -d
```

## Repo structure