Testing & Performance

Comprehensive testing strategy and performance benchmarks for the fraud detection platform.

Test Coverage

Unit Tests

45+ unit tests covering core components:

# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=src --cov-report=html
ComponentTestsCoverage
Detection Engine1594%
Policy Engine1091%
Feature Store888%
Risk Scoring792%
API Endpoints585%

Integration Tests

End-to-end tests validating complete decision flow:

pytest tests/integration/ -v

Test scenarios:

  • Normal transaction - ALLOW
  • Card testing attack - BLOCK
  • Velocity violation - REVIEW
  • Geographic anomaly - REVIEW
  • Bot detection - BLOCK
  • Policy reload - Version change

Load Tests

Performance benchmarks using Locust:

# Start load test
locust -f tests/load/locustfile.py --host=http://localhost:8000

Performance Results

Latency Benchmarks

Tested on: MacBook Pro M1, Docker containers

MetricTargetAchieved
P50 Latency< 10ms4.2ms
P95 Latency< 15ms7.8ms
P99 Latency< 20ms9.1ms
Max Latency< 50ms23ms

Throughput

ConcurrencyRequests/secError Rate
104500%
508900%
1001,1200%
2001,4500.1%
5001,6800.8%

Resource Usage

At 1,000 req/s sustained load:

ResourceUsage
API CPU45%
API Memory180MB
Redis CPU12%
Redis Memory85MB
PostgreSQL CPU8%
PostgreSQL Memory120MB

Detection Accuracy

Test Dataset

Evaluated against 10,000 synthetic transactions:

  • 8,500 legitimate (85%)
  • 1,500 fraudulent (15%)

Results

MetricValue
True Positive Rate78%
False Positive Rate3.2%
Precision0.82
Recall0.78
F1 Score0.80

Detection Rates by Type

Fraud TypeDetection Rate
Card Testing94%
Velocity Attacks86%
Geographic Anomaly72%
Bot/Automation91%
Friendly Fraud65%

Test Scenarios

Scenario: Card Testing Attack

def test_card_testing_detection():
    """
    Simulate card testing attack:
    - 10 small transactions
    - Same card, different amounts ($1-$5)
    - 30 second window
    - Should trigger BLOCK by 5th transaction
    """
    card_token = "test_card_001"

    for i in range(10):
        response = client.post("/decide", json={
            "transaction_id": f"txn_{i}",
            "amount": random.uniform(1, 5),
            "card_token": card_token,
            "ip_address": "45.33.32.156",  # Datacenter IP
            "ip_datacenter": True,
            ...
        })

        if i < 3:
            assert response["decision"] == "ALLOW"
        elif i < 5:
            assert response["decision"] in ["FRICTION", "REVIEW"]
        else:
            assert response["decision"] == "BLOCK"
            assert "card_testing" in response["signals"]

Scenario: Geographic Anomaly

def test_geographic_anomaly():
    """
    Card issued in US, transaction from Nigeria.
    Should trigger REVIEW.
    """
    response = client.post("/decide", json={
        "transaction_id": "geo_test_001",
        "amount": 500,
        "card_token": "us_card_001",
        "card_country": "US",
        "ip_address": "41.58.0.1",  # Nigeria IP
        "ip_country": "NG",
        ...
    })

    assert response["decision"] == "REVIEW"
    assert "geo_mismatch" in response["signals"]

Scenario: Policy Hot-Reload

def test_policy_hot_reload():
    """
    Verify policy can be updated without restart.
    """
    # Get current policy version
    v1 = client.get("/policy/version")

    # Modify policy file
    update_policy_file(new_threshold=75)

    # Reload policy
    reload_response = client.post("/policy/reload")
    assert reload_response["success"] == True

    # Verify new version
    v2 = client.get("/policy/version")
    assert v2["version"] != v1["version"]
    assert v2["thresholds"]["block"] == 75

Chaos Testing

Redis Failure

# Kill Redis
docker stop fraud_redis

# Send transaction
curl -X POST http://localhost:8000/decide -d '...'

# Expected: Decision still returned (degraded mode)
# Expected: Log warning about Redis unavailable

PostgreSQL Failure

# Kill PostgreSQL
docker stop fraud_postgres

# Send transaction
curl -X POST http://localhost:8000/decide -d '...'

# Expected: Decision returned
# Expected: Evidence queued for later storage

Continuous Integration

GitHub Actions workflow:

# .github/workflows/test.yml
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    services:
      redis:
        image: redis:7
        ports:
          - 6379:6379
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: test
        ports:
          - 5432:5432

    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install -r requirements.txt
      - run: pytest tests/ -v --cov=src
      - run: pytest tests/integration/ -v

Monitoring in Production

Key Alerts

AlertConditionAction
High LatencyP99 > 50ms for 5minScale API pods
High Block RateBlock > 15% for 10minCheck for attack or policy issue
Redis DisconnectConnection lostPage on-call
Evidence Queue FullQueue > 1000Check PostgreSQL

Dashboards

Grafana dashboards track:

  • Decision distribution over time
  • Latency percentiles
  • Detector fire rates
  • Resource utilization
  • Error rates

Running the Full Test Suite

# Unit tests
pytest tests/unit/ -v

# Integration tests (requires Docker)
docker-compose up -d
pytest tests/integration/ -v

# Load tests
locust -f tests/load/locustfile.py --headless -u 100 -r 10 -t 60s

# All tests with coverage report
pytest tests/ --cov=src --cov-report=html
open htmlcov/index.html