From a Ukrainian anti-scam bot to a worldwide threat intelligence platform — our architecture decisions, scaling challenges, and lessons learned.

Building a Threat Intelligence API for Global Scale

NSAI started as a Telegram bot helping Ukrainians identify scam messages. Today it's a threat intelligence platform serving businesses across multiple countries. Here's how we built the architecture to scale.

The Origin

In early 2025, we noticed a pattern: Ukrainian users were drowning in scam SMS messages imitating popular services — Nova Poshta, PrivatBank, Monobank, Diia. We built a simple Telegram bot that matched messages against known patterns.

Within weeks, we had thousands of users. The bot was a single Python process running on a VPS. It was time to think bigger.

Architecture Evolution

Phase 1: Monolithic Bot

User → Telegram → Python Bot → Regex Patterns → Reply

It was simple and fast to develop, but it had hard limits:

No API for third parties
No persistence
No way to scale

Phase 2: API + Database

Telegram Bot ─→ FastAPI ─→ PostgreSQL
Web clients ──↗           ↘ Redis cache

We separated the classification logic into a FastAPI service with PostgreSQL for persistence and Redis for caching. The Telegram bot became just one client of the API.
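A minimal sketch of what "one client of the API" can look like in code: a single transport-agnostic `classify` function, with the bot and the web layer as thin adapters over it. The pattern, the `Verdict` type, and the adapter names are hypothetical and chosen only for illustration; the real service sits behind FastAPI and is not shown here.

```python
import json
import re
from dataclasses import asdict, dataclass

# Hypothetical stand-in for the real pattern store.
SCAM_PATTERN = re.compile(r"verify your (card|account)", re.IGNORECASE)


@dataclass(frozen=True)
class Verdict:
    is_scam: bool
    reason: str


def classify(message: str) -> Verdict:
    """Transport-agnostic core: every client goes through this function."""
    if SCAM_PATTERN.search(message):
        return Verdict(True, "matched known scam pattern")
    return Verdict(False, "no known pattern matched")


def telegram_reply(message: str) -> str:
    """Bot adapter: renders the verdict as chat text."""
    verdict = classify(message)
    return "Likely a scam." if verdict.is_scam else "No known scam pattern found."


def http_response(message: str) -> str:
    """Web adapter: renders the same verdict as a JSON body."""
    return json.dumps(asdict(classify(message)))
```

Because both adapters call the same function, a change to the classification logic reaches every client at once.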

Phase 3: Modular Platform

┌─────────────────────────────────────────┐
│              Load Balancer              │
└────────────┬───────────┬────────────────┘
             │           │
        ┌────▼───┐  ┌────▼───┐
        │ API #1 │  │ API #N │  (stateless)
        └────┬───┘  └────┬───┘
             │           │
    ┌────────▼───────────▼────────┐
    │        Redis Cluster        │
    └────────────┬────────────────┘
                 │
    ┌────────────▼────────────────┐
    │ PostgreSQL (primary + read) │
    └────────────┬────────────────┘
                 │
    ┌────────────▼────────────────┐
    │   Qdrant (vector search)    │
    └─────────────────────────────┘

Key decisions:

Stateless API nodes — horizontal scaling behind a load balancer
Redis as both the cache and the Celery broker
Qdrant for vector similarity (RAG-based threat matching)
Celery workers for background tasks (DomainWatch, batch analysis)

Scaling Challenges

Challenge 1: Pattern Matching at Scale

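Evaluating 5,000+ regex patterns sequentially on every request is too slow. We organized them into category trees with early termination: a small set of routing patterns picks the category, and only that category's patterns are then checked.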

# Instead of checking all 5,000 patterns on every request:
category = detect_category(message)  # stage 1: ~50 routing patterns
patterns = get_patterns(category)    # stage 2: ~200-400 patterns for that category
# Roughly 10x fewer regex evaluations per request
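A runnable sketch of the two-stage lookup, assuming a flat two-level layout rather than a full tree. `detect_category` and `get_patterns` mirror the names in the snippet above; the categories, the patterns, and the `match_scam` helper are illustrative and not the production rule set.

```python
import re

# Stage 1: a small set of cheap routing patterns picks a category.
CATEGORY_ROUTERS = {
    "delivery": re.compile(r"parcel|delivery|courier", re.IGNORECASE),
    "banking": re.compile(r"card|account|bank", re.IGNORECASE),
}

# Stage 2: the larger per-category pattern lists (tiny here for brevity).
CATEGORY_PATTERNS = {
    "delivery": [
        re.compile(r"pay .* customs fee", re.IGNORECASE),
        re.compile(r"parcel .* (held|blocked)", re.IGNORECASE),
    ],
    "banking": [
        re.compile(r"card .* (blocked|suspended)", re.IGNORECASE),
        re.compile(r"confirm your (pin|cvv)", re.IGNORECASE),
    ],
}


def detect_category(message):
    """Return the first category whose routing pattern matches, else None."""
    for category, router in CATEGORY_ROUTERS.items():
        if router.search(message):
            return category
    return None


def get_patterns(category):
    """Return the pattern list for a category (empty if unknown)."""
    return CATEGORY_PATTERNS.get(category, [])


def match_scam(message):
    """Return the source of the first matching pattern, or None."""
    category = detect_category(message)
    if category is None:
        return None  # no category matched, so stage 2 is skipped entirely
    for pattern in get_patterns(category):
        if pattern.search(message):
            return pattern.pattern  # early termination on the first hit
    return None
```

One trade-off of this layout: a message routed to the wrong category in stage 1 is never checked against the right patterns, so routing order and router quality matter. Whether to fall back to a wider scan on a stage 2 miss is a design choice this sketch leaves open.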

Challenge 2: LLM Cost Control

Cloud LLMs are expensive. We implemented a tiered approach:

Cache hit → free
DB lookup → negligible
Regex patterns → zero API cost, handles ~55%
LLM → only for the remaining ~10%

Result: about 90% of requests never reach an LLM.
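The tier order can be sketched as a fall-through chain in which each request stops at the cheapest tier that can answer it. In this sketch plain dictionaries stand in for Redis and PostgreSQL, `call_llm` is a placeholder rather than a real client, and the single pattern is illustrative; all of these names are assumptions.

```python
import hashlib
import re
from collections import Counter

cache = {}            # stand-in for Redis
known_verdicts = {}   # stand-in for a PostgreSQL lookup table
PATTERNS = [re.compile(r"card .* blocked", re.IGNORECASE)]  # illustrative only
tier_hits = Counter() # counts which tier answered each request


def message_key(message):
    """Stable key for a message, ignoring case and surrounding whitespace."""
    return hashlib.sha256(message.strip().lower().encode("utf-8")).hexdigest()


def call_llm(message):
    """Placeholder for the paid LLM call; always answers 'unknown' here."""
    return "unknown"


def classify(message):
    key = message_key(message)
    if key in cache:                      # tier 1: cache hit
        tier_hits["cache"] += 1
        return cache[key]
    if key in known_verdicts:             # tier 2: database lookup
        tier_hits["db"] += 1
        verdict = known_verdicts[key]
    elif any(p.search(message) for p in PATTERNS):  # tier 3: regex
        tier_hits["regex"] += 1
        verdict = "scam"
    else:                                 # tier 4: LLM, the only paid step
        tier_hits["llm"] += 1
        verdict = call_llm(message)
    cache[key] = verdict  # repeats of this message now stop at tier 1
    return verdict
```

Tracking `tier_hits` is what makes figures like the per-tier shares above measurable: dividing each counter by the total gives the fraction of traffic handled at each tier.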

Challenge 3: Multi-Region Latency

Users in Europe expect sub-200 ms responses. To get there, we deployed:

Edge caching for common scam hashes
Regional API nodes (EU-West, EU-East)
Anycast DNS routing
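Edge caching by scam hash only pays off if trivially different copies of the same message map to the same key. A minimal sketch of one way to derive such a key, assuming light normalization (Unicode form, case, whitespace) followed by SHA-256; the `scam_hash` name and the exact normalization steps are assumptions, not the production scheme.

```python
import hashlib
import re
import unicodedata

SPACE_RE = re.compile(r"\s+")


def scam_hash(message: str) -> str:
    """Hash a normalized form so near-identical copies share one cache key."""
    text = unicodedata.normalize("NFKC", message)  # fold compatibility forms
    text = text.casefold()                         # ignore letter case
    text = SPACE_RE.sub(" ", text).strip()         # collapse runs of whitespace
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

More aggressive normalization (for example masking URLs or numbers) would raise the hit rate but also the risk that a legitimate message shares a key with a scam, so this sketch deliberately stays conservative.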

Lessons Learned

Start with patterns, not ML — regex is unglamorous but handles the majority of known threats instantly
Cache everything — scammers reuse messages; a 35% cache hit rate saved us thousands in LLM costs
Build for the API first — the bot, web UI, and SDK are all clients of the same API
Invest in observability — structured logging with correlation IDs saved us during outages
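For the observability point, a minimal standard-library sketch of structured logging with a correlation ID: every log line emitted while handling one request carries the same ID, so a single request can be traced across log output. The logger name, the JSON field names, and `handle_request` are hypothetical.

```python
import contextvars
import json
import logging
import uuid

# Holds the ID of the request currently being handled ("-" outside a request).
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, including the correlation ID."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        })


def build_logger(stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("nsai.sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def handle_request(logger, message, request_id=None):
    """Tag all log lines for this request with one correlation ID."""
    token = correlation_id.set(request_id or uuid.uuid4().hex)
    try:
        logger.info("request received, %d chars", len(message))
        logger.info("classification finished")
    finally:
        correlation_id.reset(token)  # do not leak the ID into later requests
```

Using `contextvars` rather than a global keeps the ID correct when requests are handled concurrently in async code, since each task sees its own value.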

What's Next

Self-hosted option — for enterprises that can't send data externally
Plugin ecosystem — let third parties contribute detection modules
Federated learning — share threat patterns between instances without sharing raw data
