# Testing & Performance
Comprehensive testing strategy and performance benchmarks for the fraud detection platform.
## Test Coverage

### Unit Tests

45+ unit tests covering core components:
```bash
# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=src --cov-report=html
```
| Component | Tests | Coverage |
|---|---|---|
| Detection Engine | 15 | 94% |
| Policy Engine | 10 | 91% |
| Feature Store | 8 | 88% |
| Risk Scoring | 7 | 92% |
| API Endpoints | 5 | 85% |
### Integration Tests
End-to-end tests validating complete decision flow:
```bash
pytest tests/integration/ -v
```
Test scenarios:
- Normal transaction - ALLOW
- Card testing attack - BLOCK
- Velocity violation - REVIEW
- Geographic anomaly - REVIEW
- Bot detection - BLOCK
- Policy reload - Version change
### Load Tests

Performance benchmarks using Locust:
```bash
# Start load test
locust -f tests/load/locustfile.py --host=http://localhost:8000
```
## Performance Results

### Latency Benchmarks
Tested on: MacBook Pro M1, Docker containers
| Metric | Target | Achieved |
|---|---|---|
| P50 Latency | < 10ms | 4.2ms |
| P95 Latency | < 15ms | 7.8ms |
| P99 Latency | < 20ms | 9.1ms |
| Max Latency | < 50ms | 23ms |
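For reference, percentiles like those above can be computed from raw latency samples with the standard library alone. A sketch on synthetic data (the distribution parameters are made up for illustration, not measured):

```python
import random
import statistics

random.seed(7)
# Synthetic latency samples in ms (lognormal is a common latency shape)
samples_ms = [random.lognormvariate(1.5, 0.35) for _ in range(10_000)]

# statistics.quantiles with n=100 returns the 99 cut points P1..P99
cuts = statistics.quantiles(samples_ms, n=100)
p50, p95, p99 = cuts[49], cuts[94], cuts[98]
print(f"P50={p50:.1f}ms P95={p95:.1f}ms P99={p99:.1f}ms")
```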
### Throughput
| Concurrency | Requests/sec | Error Rate |
|---|---|---|
| 10 | 450 | 0% |
| 50 | 890 | 0% |
| 100 | 1,120 | 0% |
| 200 | 1,450 | 0.1% |
| 500 | 1,680 | 0.8% |
### Resource Usage
At 1,000 req/s sustained load:
| Resource | Usage |
|---|---|
| API CPU | 45% |
| API Memory | 180MB |
| Redis CPU | 12% |
| Redis Memory | 85MB |
| PostgreSQL CPU | 8% |
| PostgreSQL Memory | 120MB |
## Detection Accuracy

### Test Dataset
Evaluated against 10,000 synthetic transactions:
- 8,500 legitimate (85%)
- 1,500 fraudulent (15%)
### Results
| Metric | Value |
|---|---|
| True Positive Rate | 78% |
| False Positive Rate | 3.2% |
| Precision | 0.82 |
| Recall | 0.78 |
| F1 Score | 0.80 |
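These aggregate metrics are mutually consistent with the dataset split. A quick back-of-envelope check, deriving counts from the rates above (rounding explains the small precision gap):

```python
total_fraud, total_legit = 1_500, 8_500

tpr, fpr = 0.78, 0.032
tp = tpr * total_fraud      # 1170 frauds caught
fp = fpr * total_legit      # 272 legitimate transactions flagged

precision = tp / (tp + fp)  # ~0.81 (table reports 0.82)
recall = tpr                # 0.78 by definition
f1 = 2 * precision * recall / (precision + recall)  # ~0.80

print(round(precision, 2), round(recall, 2), round(f1, 2))
```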
### Detection Rates by Type
| Fraud Type | Detection Rate |
|---|---|
| Card Testing | 94% |
| Velocity Attacks | 86% |
| Geographic Anomaly | 72% |
| Bot/Automation | 91% |
| Friendly Fraud | 65% |
## Test Scenarios

### Scenario: Card Testing Attack
```python
def test_card_testing_detection():
    """
    Simulate card testing attack:
    - 10 small transactions
    - Same card, different amounts ($1-$5)
    - 30 second window
    - Should trigger BLOCK by 5th transaction
    """
    card_token = "test_card_001"
    for i in range(10):
        response = client.post("/decide", json={
            "transaction_id": f"txn_{i}",
            "amount": random.uniform(1, 5),
            "card_token": card_token,
            "ip_address": "45.33.32.156",  # Datacenter IP
            "ip_datacenter": True,
            ...
        })
        if i < 3:
            assert response["decision"] == "ALLOW"
        elif i < 5:
            assert response["decision"] in ["FRICTION", "REVIEW"]
        else:
            assert response["decision"] == "BLOCK"
            assert "card_testing" in response["signals"]
```

### Scenario: Geographic Anomaly
```python
def test_geographic_anomaly():
    """
    Card issued in US, transaction from Nigeria.
    Should trigger REVIEW.
    """
    response = client.post("/decide", json={
        "transaction_id": "geo_test_001",
        "amount": 500,
        "card_token": "us_card_001",
        "card_country": "US",
        "ip_address": "41.58.0.1",  # Nigeria IP
        "ip_country": "NG",
        ...
    })
    assert response["decision"] == "REVIEW"
    assert "geo_mismatch" in response["signals"]
```

### Scenario: Policy Hot-Reload
```python
def test_policy_hot_reload():
    """
    Verify policy can be updated without restart.
    """
    # Get current policy version
    v1 = client.get("/policy/version")

    # Modify policy file
    update_policy_file(new_threshold=75)

    # Reload policy
    reload_response = client.post("/policy/reload")
    assert reload_response["success"] is True

    # Verify new version
    v2 = client.get("/policy/version")
    assert v2["version"] != v1["version"]
    assert v2["thresholds"]["block"] == 75
```

## Chaos Testing
### Redis Failure

```bash
# Kill Redis
docker stop fraud_redis

# Send transaction
curl -X POST http://localhost:8000/decide -d '...'

# Expected: Decision still returned (degraded mode)
# Expected: Log warning about Redis unavailable
```
### PostgreSQL Failure

```bash
# Kill PostgreSQL
docker stop fraud_postgres

# Send transaction
curl -X POST http://localhost:8000/decide -d '...'

# Expected: Decision returned
# Expected: Evidence queued for later storage
```
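The degraded-mode behavior above amounts to a fail-open wrapper around the feature store. A minimal sketch, with a stub standing in for a downed Redis client (all names here are illustrative, not the platform's actual API):

```python
import logging

logger = logging.getLogger("fraud.features")

class DownRedis:
    """Stub that behaves like a Redis client whose server is gone."""
    def get(self, key):
        raise ConnectionError("redis unavailable")

def velocity_count(redis_client, card_token, default=0):
    """Fetch the recent-transaction count, falling back to a neutral default."""
    try:
        value = redis_client.get(f"velocity:{card_token}")
        return int(value or 0)
    except ConnectionError:
        # Degraded mode: warn and let the decision proceed on other signals
        logger.warning("Redis unavailable; default velocity for %s", card_token)
        return default

print(velocity_count(DownRedis(), "card_001"))  # 0
```

Returning a neutral default keeps the `/decide` path answering while the velocity detectors simply contribute nothing to the score.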
## Continuous Integration
GitHub Actions workflow:
```yaml
# .github/workflows/test.yml
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    services:
      redis:
        image: redis:7
        ports:
          - 6379:6379
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: test
        ports:
          - 5432:5432
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install -r requirements.txt
      - run: pytest tests/ -v --cov=src
      - run: pytest tests/integration/ -v
```

## Monitoring in Production
### Key Alerts
| Alert | Condition | Action |
|---|---|---|
| High Latency | P99 > 50ms for 5min | Scale API pods |
| High Block Rate | Block > 15% for 10min | Check for attack or policy issue |
| Redis Disconnect | Connection lost | Page on-call |
| Evidence Queue Full | Queue > 1000 | Check PostgreSQL |
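If alerting runs on Prometheus (an assumption; the monitoring stack is not specified here), the High Latency row could map to a rule like the following, with `decide_latency_seconds_bucket` as a hypothetical histogram metric name:

```yaml
# Hypothetical Prometheus rule for the "High Latency" alert above
groups:
  - name: fraud-api
    rules:
      - alert: HighDecisionLatency
        expr: >
          histogram_quantile(0.99,
            sum(rate(decide_latency_seconds_bucket[5m])) by (le)) > 0.05
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "P99 decision latency above 50ms for 5 minutes"
```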
### Dashboards
Grafana dashboards track:
- Decision distribution over time
- Latency percentiles
- Detector fire rates
- Resource utilization
- Error rates
## Running the Full Test Suite

```bash
# Unit tests
pytest tests/unit/ -v

# Integration tests (requires Docker)
docker-compose up -d
pytest tests/integration/ -v

# Load tests
locust -f tests/load/locustfile.py --headless -u 100 -r 10 -t 60s

# All tests with coverage report
pytest tests/ --cov=src --cov-report=html
open htmlcov/index.html
```