Achieving 97.8% Accuracy: Our ML Pipeline Explained
From Ukrainian regex patterns to cloud LLM ensemble — how our 4-stage classification pipeline works and where we're heading.
NSAI Team
How do you classify whether a message is a scam — reliably, quickly, and across multiple languages? This post breaks down NSAI's 4-stage threat classification pipeline, which achieves 97.8% accuracy on our benchmark dataset.
The Challenge
Scam messages are diverse:
- SMS phishing pretending to be Nova Poshta or PrivatBank
- Telegram pyramid schemes with crypto promises
- Fake military donation requests
- Social engineering targeting the elderly
No single model handles all these well. That's why we built a pipeline, not a single classifier.
The 4-Stage Pipeline
Stage 1: Cache Check (<1ms)
Every analyzed message is hashed (SHA-256) and cached in Redis. If we've seen this exact message before, we return the cached result instantly.
Hit rate: ~35% — scammers reuse messages heavily.
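As a minimal sketch of this stage — with a plain dict standing in for the Redis client, and hypothetical function names — the cache check amounts to:

```python
import hashlib
from typing import Optional

# A dict stands in for the production Redis client in this sketch.
_cache: dict = {}

def message_key(text: str) -> str:
    """SHA-256 of the raw message text serves as the cache key."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def cache_lookup(text: str) -> Optional[dict]:
    """Return the cached verdict for an exact-match message, if any."""
    return _cache.get(message_key(text))

def cache_store(text: str, verdict: dict) -> None:
    _cache[message_key(text)] = verdict
```

Hashing the message rather than storing it verbatim keeps keys fixed-length and avoids holding raw scam text in the cache.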
Stage 2: Database Lookup (2-5ms)
The hash is checked against our PostgreSQL database of known scam messages. This database is built from:
- User reports (verified)
- Honeypot data (Telegram channels, SMS gateways)
- Partner feeds (CERTs, banks, telecoms)
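A minimal sketch of the lookup, with sqlite3 standing in for PostgreSQL (the schema and names here are illustrative, not our production schema):

```python
import hashlib
import sqlite3

# sqlite3 stands in for the production PostgreSQL database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE known_scams (
        message_hash TEXT PRIMARY KEY,
        category     TEXT NOT NULL,
        source       TEXT NOT NULL  -- 'user_report', 'honeypot', or 'partner_feed'
    )"""
)

def db_lookup(text: str):
    """Return (category, source) for a known scam message, or None."""
    h = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return conn.execute(
        "SELECT category, source FROM known_scams WHERE message_hash = ?", (h,)
    ).fetchone()
```

Because the primary key is the same SHA-256 hash used by the cache, a miss in Stage 1 reuses the already-computed digest for this lookup.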
Stage 3: Pattern Matching (10-30ms)
This is where the magic happens. Our UkrainianScamPatterns module contains 5,000+ regex patterns organized into categories:
| Category | Patterns | Examples |
|---|---|---|
| NOVA_POSHTA | 340+ | "Ваша посилка затримана" ("Your parcel has been delayed"), tracking numbers |
| PRIVAT24 | 280+ | "Картку заблоковано" ("Your card has been blocked"), CVV requests |
| MONOBANK | 210+ | "Підтвердіть транзакцію" ("Confirm the transaction") |
| DIIA | 150+ | "Оновіть дані в Дії" ("Update your details in Diia") |
| MILITARY_DONATION | 190+ | Fake volunteer fund requests |
| CRYPTO_SCAM | 420+ | Fake airdrops, rug pulls |
| LOTTERY | 160+ | Prize notifications |
| ... | 3,250+ | Other categories |
Each pattern has a confidence weight. Multiple pattern matches are combined using a scoring algorithm:
score = sum(p.weight * p.confidence for p in matches)
final_confidence = min(score / normalization_factor, 1.0)
If the combined confidence exceeds the threshold (default: 0.85), we classify the message without calling the LLM — saving time and money.
Coverage: Stage 3 resolves ~55% of all threats.
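Putting the scoring formula into code, here is a minimal sketch of this stage. The two patterns, their weights, and the normalization factor are illustrative stand-ins, not values from the real UkrainianScamPatterns module:

```python
import re
from dataclasses import dataclass

@dataclass
class ScamPattern:
    regex: "re.Pattern"
    weight: float      # how strongly a match indicates a scam
    confidence: float  # how reliable the pattern itself is

# Two illustrative patterns; the real module holds 5,000+.
PATTERNS = [
    ScamPattern(re.compile(r"посилк[ау] затриман", re.I),
                weight=1.0, confidence=0.9),
    # Nova Poshta lookalike domains outside the .ua zone.
    ScamPattern(re.compile(r"https?://\S*nova-?poshta\S*\.(?!ua)\w+", re.I),
                weight=1.5, confidence=0.95),
]

NORMALIZATION_FACTOR = 2.0  # assumed value; tuned against the benchmark in practice
THRESHOLD = 0.85

def pattern_confidence(text: str) -> float:
    """Combine all matching patterns into a single capped confidence score."""
    matches = [p for p in PATTERNS if p.regex.search(text)]
    score = sum(p.weight * p.confidence for p in matches)
    return min(score / NORMALIZATION_FACTOR, 1.0)
```

In this sketch, a message that trips both sample patterns clears the 0.85 threshold on its own, so it never reaches Stage 4.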
Stage 4: LLM Ensemble (100-500ms)
For messages that pass stages 1-3 without a definitive verdict, we call cloud LLMs:
- Primary: OpenRouter model pool (Claude / GPT-4o / Llama)
- Fallback: Direct OpenAI API
- Safety net: Anthropic Claude API
The LLM receives a structured prompt with:
- The message text
- Extracted entities (URLs, phones, crypto addresses)
- Pattern match hints (if any)
- Ukrainian cultural context
We use majority voting across models when confidence is ambiguous — if 2 out of 3 models agree, that verdict wins.
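The voting rule itself is simple. A sketch, with illustrative verdict labels:

```python
from collections import Counter

def ensemble_verdict(model_verdicts):
    """Strict-majority vote across model verdicts.

    Returns the winning verdict if more than half the models agree,
    otherwise 'uncertain' (which routes the message to human review).
    """
    verdict, votes = Counter(model_verdicts).most_common(1)[0]
    if votes > len(model_verdicts) // 2:
        return verdict
    return "uncertain"
```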
Accuracy Breakdown
Tested on our golden dataset (5,200 messages, human-labeled):
| Category | Precision | Recall | F1 |
|---|---|---|---|
| Nova Poshta phishing | 99.1% | 98.7% | 98.9% |
| Banking fraud | 98.4% | 97.2% | 97.8% |
| Crypto scams | 96.8% | 95.4% | 96.1% |
| Military donation fraud | 97.6% | 96.9% | 97.2% |
| Legit messages (true negatives) | 97.2% | 98.8% | 98.0% |
| Weighted average | 97.9% | 97.6% | 97.8% |
False Positive Rate
Critical for us — blocking legitimate messages is worse than missing some scams. Our false positive rate is 1.2%, mostly on edge cases like:
- Legitimate fundraising links that resemble scam patterns
- Real Nova Poshta notifications with unusual formatting
What's Next
- Fine-tuned local model: We're training a 7B parameter model on our labeled dataset to reduce LLM dependency
- Multilingual expansion: Polish, Czech, and German patterns in development
- Real-time learning: New confirmed scams auto-generate patterns within hours
Open Benchmarks
We publish our accuracy metrics monthly. The full methodology and test dataset structure are available on our developers page.
Want to check a suspicious message?