Achieving 97.8% Accuracy: Our ML Pipeline Explained
From Ukrainian regex patterns to cloud LLM ensemble — how our 4-stage classification pipeline works and where we're heading.
NSAI Team
How do you classify whether a message is a scam — reliably, quickly, and across multiple languages? This post breaks down NSAI's 4-stage threat classification pipeline, which achieves 97.8% accuracy on our benchmark dataset.
The Challenge
Scam messages are diverse:
- SMS phishing pretending to be Nova Poshta or PrivatBank
- Telegram pyramid schemes with crypto promises
- Fake military donation requests
- Social engineering targeting the elderly
No single model handles all these well. That's why we built a pipeline, not a single classifier.
The 4-Stage Pipeline
Stage 1: Cache Check (<1ms)
Every analyzed message is hashed (SHA-256) and cached in Redis. If we've seen this exact message before, we return the cached result instantly.
Hit rate: ~35% — scammers reuse messages heavily.
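As a minimal sketch of this stage — with a plain dict standing in for the Redis client, and hypothetical function names — the cache check amounts to:

```python
import hashlib
from typing import Optional

# A dict stands in for the production Redis client in this sketch.
_cache: dict = {}

def message_key(text: str) -> str:
    """SHA-256 of the raw message text serves as the cache key."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def cache_lookup(text: str) -> Optional[dict]:
    """Return the cached verdict for an exact-match message, if any."""
    return _cache.get(message_key(text))

def cache_store(text: str, verdict: dict) -> None:
    _cache[message_key(text)] = verdict
```

Hashing the message rather than storing it verbatim keeps keys fixed-length and avoids holding raw scam text in the cache.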
Stage 2: Database Lookup (2-5ms)
The hash is checked against our PostgreSQL database of known scam messages. This database is built from:
- User reports (verified)
- Honeypot data (Telegram channels, SMS gateways)
- Partner feeds (CERTs, banks, telecoms)
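A minimal sketch of the lookup, with sqlite3 standing in for PostgreSQL (the schema and names here are illustrative, not our production schema):

```python
import hashlib
import sqlite3

# sqlite3 stands in for the production PostgreSQL database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE known_scams (
        message_hash TEXT PRIMARY KEY,
        category     TEXT NOT NULL,
        source       TEXT NOT NULL  -- 'user_report', 'honeypot', or 'partner_feed'
    )"""
)

def db_lookup(text: str):
    """Return (category, source) for a known scam message, or None."""
    h = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return conn.execute(
        "SELECT category, source FROM known_scams WHERE message_hash = ?", (h,)
    ).fetchone()
```

Because the primary key is the same SHA-256 hash used by the cache, a miss in Stage 1 reuses the already-computed digest for this lookup.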
Stage 3: Pattern Matching (10-30ms)
This is where the magic happens. Our UkrainianScamPatterns module contains 5,000+ regex patterns organized into categories:
| Category | Patterns | Examples |
|---|---|---|
| NOVA_POSHTA | 340+ | "Ваша посилка затримана" ("Your parcel has been delayed"), tracking numbers |
| PRIVAT24 | 280+ | "Картку заблоковано" ("Your card has been blocked"), CVV requests |
| MONOBANK | 210+ | "Підтвердіть транзакцію" ("Confirm the transaction") |
| DIIA | 150+ | "Оновіть дані в Дії" ("Update your details in Diia") |
| MILITARY_DONATION | 190+ | Fake volunteer fund requests |
| CRYPTO_SCAM | 420+ | Fake airdrops, rug pulls |
| LOTTERY | 160+ | Prize notifications |
| ... | 3,250+ | Other categories |
Each pattern has a confidence weight. Multiple pattern matches are combined using a scoring algorithm:
score = sum(p.weight * p.confidence for p in matches)
final_confidence = min(score / normalization_factor, 1.0)
If the combined confidence exceeds the threshold (default: 0.85), we classify the message without calling the LLM — saving time and money.
Coverage: Stage 3 resolves ~55% of all threats.
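Putting the scoring formula into code, here is a minimal sketch of this stage. The two patterns, their weights, and the normalization factor are illustrative stand-ins, not values from the real UkrainianScamPatterns module:

```python
import re
from dataclasses import dataclass

@dataclass
class ScamPattern:
    regex: "re.Pattern"
    weight: float      # how strongly a match indicates a scam
    confidence: float  # how reliable the pattern itself is

# Two illustrative patterns; the real module holds 5,000+.
PATTERNS = [
    ScamPattern(re.compile(r"посилк[ау] затриман", re.I),
                weight=1.0, confidence=0.9),
    # Nova Poshta lookalike domains outside the .ua zone.
    ScamPattern(re.compile(r"https?://\S*nova-?poshta\S*\.(?!ua)\w+", re.I),
                weight=1.5, confidence=0.95),
]

NORMALIZATION_FACTOR = 2.0  # assumed value; tuned against the benchmark in practice
THRESHOLD = 0.85

def pattern_confidence(text: str) -> float:
    """Combine all matching patterns into a single capped confidence score."""
    matches = [p for p in PATTERNS if p.regex.search(text)]
    score = sum(p.weight * p.confidence for p in matches)
    return min(score / NORMALIZATION_FACTOR, 1.0)
```

In this sketch, a message that trips both sample patterns clears the 0.85 threshold on its own, so it never reaches Stage 4.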
Stage 4: LLM Ensemble (100-500ms)
For messages that pass stages 1-3 without a definitive verdict, we call cloud LLMs:
- Primary: OpenRouter model pool (Claude / GPT-4o / Llama)
- Fallback: Direct OpenAI API
- Safety net: Anthropic Claude API
The LLM receives a structured prompt with:
- The message text
- Extracted entities (URLs, phones, crypto addresses)
- Pattern match hints (if any)
- Ukrainian cultural context
We use majority voting across models when confidence is ambiguous — if 2 out of 3 models agree, that verdict wins.
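The voting rule itself is simple. A sketch, with illustrative verdict labels:

```python
from collections import Counter

def ensemble_verdict(model_verdicts):
    """Strict-majority vote across model verdicts.

    Returns the winning verdict if more than half the models agree,
    otherwise 'uncertain' (which routes the message to human review).
    """
    verdict, votes = Counter(model_verdicts).most_common(1)[0]
    if votes > len(model_verdicts) // 2:
        return verdict
    return "uncertain"
```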
Accuracy Breakdown
Tested on our golden dataset (5,200 messages, human-labeled):
| Category | Precision | Recall | F1 |
|---|---|---|---|
| Nova Poshta phishing | 99.1% | 98.7% | 98.9% |
| Banking fraud | 98.4% | 97.2% | 97.8% |
| Crypto scams | 96.8% | 95.4% | 96.1% |
| Military donation fraud | 97.6% | 96.9% | 97.2% |
| Legit messages (true negatives) | 97.2% | 98.8% | 98.0% |
| Weighted average | 97.9% | 97.6% | 97.8% |
False Positive Rate
Critical for us — blocking legitimate messages is worse than missing some scams. Our false positive rate is 1.2%, mostly on edge cases like:
- Legitimate fundraising links that resemble scam patterns
- Real Nova Poshta notifications with unusual formatting
What's Next
- Fine-tuned local model: We're training a 7B parameter model on our labeled dataset to reduce LLM dependency
- Multilingual expansion: Polish, Czech, and German patterns in development
- Real-time learning: New confirmed scams auto-generate patterns within hours
Open Benchmarks
We publish our accuracy metrics monthly. The full methodology and test dataset structure are available on our developers page.
Want to check a suspicious message?