Deployment

This guide covers deploying the AI Ingredient Scanner to production environments, including the FastAPI backend, mobile applications, and Streamlit interface.

Architecture Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                            PRODUCTION ARCHITECTURE                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   ┌─────────────────────┐                                                   │
│   │     Mobile Apps      │                                                   │
│   │  ┌───────┐ ┌───────┐│                                                   │
│   │  │  iOS  │ │Android││                                                   │
│   │  │ Store │ │ Play  ││                                                   │
│   │  └───┬───┘ └───┬───┘│                                                   │
│   └──────┼─────────┼────┘                                                   │
│          │         │                                                         │
│          └────┬────┘                                                         │
│               │                                                              │
│               ▼                                                              │
│   ┌─────────────────────────────────────────────────────────────────┐      │
│   │                      Google Cloud Run                             │      │
│   │  ┌─────────────────────────────────────────────────────────────┐│      │
│   │  │                    FastAPI Backend                           ││      │
│   │  │  • /ocr - Image processing                                  ││      │
│   │  │  • /analyze - Safety analysis                               ││      │
│   │  │  • Auto-scaling, Load balancing                             ││      │
│   │  └─────────────────────────────────────────────────────────────┘│      │
│   └──────────┬──────────────────┬──────────────────┬────────────────┘      │
│              │                  │                  │                        │
│              ▼                  ▼                  ▼                        │
│   ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐          │
│   │   Qdrant Cloud   │ │   Redis Cloud    │ │ Google Gemini API│          │
│   │  (Vector Store)  │ │    (Cache)       │ │      (AI)        │          │
│   └──────────────────┘ └──────────────────┘ └──────────────────┘          │
│                                                       │                     │
│                                                       ▼                     │
│                                              ┌──────────────────┐          │
│                                              │    LangSmith     │          │
│                                              │  (Observability) │          │
│                                              └──────────────────┘          │
└─────────────────────────────────────────────────────────────────────────────┘

Backend Deployment

Google Cloud Run (Recommended)

Cloud Run is the recommended deployment platform for the FastAPI backend due to its auto-scaling capabilities and managed infrastructure.

Step 1: Create Dockerfile

FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY . .

# Run with uvicorn
CMD ["uvicorn", "api:app", "--host", "0.0.0.0", "--port", "8080"]

Step 2: Build and Push Image

# Authenticate with GCP
gcloud auth configure-docker

# Build image
docker build -t gcr.io/YOUR_PROJECT/ingredient-scanner:latest .

# Push to Container Registry (deprecated by Google in favor of Artifact Registry; gcr.io paths may need migrating)
docker push gcr.io/YOUR_PROJECT/ingredient-scanner:latest

Step 3: Deploy to Cloud Run

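# The xxx values are placeholders. For real keys, prefer Secret Manager
# references via --set-secrets over plain-text --set-env-vars.
gcloud run deploy ingredient-scanner \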
  --image gcr.io/YOUR_PROJECT/ingredient-scanner:latest \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars "GOOGLE_API_KEY=xxx,QDRANT_URL=xxx,QDRANT_API_KEY=xxx"

Environment Variables

Variable	Required	Description
GOOGLE_API_KEY	Yes	Gemini API key for AI processing
QDRANT_URL	Yes	Qdrant Cloud cluster URL
QDRANT_API_KEY	Yes	Qdrant Cloud API key
REDIS_URL	No	Redis connection string for caching
LANGCHAIN_API_KEY	No	LangSmith API key for observability
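
The table above can be enforced at startup so that a misconfigured deployment fails immediately rather than on the first request. The following is a minimal sketch, not the project's actual configuration code: the `Settings` class and `load_settings` function are hypothetical names, and only the variable names come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Variables marked "Yes" in the Required column.
REQUIRED_VARS = ("GOOGLE_API_KEY", "QDRANT_URL", "QDRANT_API_KEY")


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; field names mirror the table."""
    google_api_key: str
    qdrant_url: str
    qdrant_api_key: str
    redis_url: Optional[str] = None          # optional: caching is skipped when unset
    langchain_api_key: Optional[str] = None  # optional: tracing is skipped when unset


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the environment, failing fast on missing required values."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return Settings(
        google_api_key=env["GOOGLE_API_KEY"],
        qdrant_url=env["QDRANT_URL"],
        qdrant_api_key=env["QDRANT_API_KEY"],
        redis_url=env.get("REDIS_URL") or None,
        langchain_api_key=env.get("LANGCHAIN_API_KEY") or None,
    )
```

Calling `load_settings()` once when the application starts, and passing a plain dictionary in tests, keeps the check independent of the real environment.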

Security Warning

Never commit API keys to version control. Use secret management services:

Cloud Run: Google Secret Manager
Kubernetes: K8s Secrets
Heroku: Config Vars

Alternative: Docker Compose

For self-hosted or on-premises deployments:

# docker-compose.yml
version: '3.8'

services:
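  api:
    build: .
    ports:
      - "8000:8080"  # host 8000 -> container 8080, the port uvicorn binds in the Dockerfile
    environment:
      - GOOGLE_API_KEY=${GOOGLE_API_KEY}
      - QDRANT_URL=${QDRANT_URL}
      - QDRANT_API_KEY=${QDRANT_API_KEY}
      - REDIS_URL=redis://redis:6379  # reach the redis service by its Compose service name
    depends_on:
      - redis
    restart: unless-stopped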

  redis:
    image: redis:alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

volumes:
  redis_data:

Mobile App Deployment

Expo EAS Build

Step 1: Install EAS CLI

npm install -g eas-cli
eas login

Step 2: Configure eas.json

{
  "cli": {
    "version": ">= 3.0.0"
  },
  "build": {
    "development": {
      "developmentClient": true,
      "distribution": "internal"
    },
    "preview": {
      "distribution": "internal",
      "ios": {
        "simulator": true
      }
    },
    "production": {
      "ios": {
        "resourceClass": "m1-medium"
      },
      "android": {
        "buildType": "app-bundle"
      }
    }
  },
  "submit": {
    "production": {}
  }
}

Step 3: Update API URL

Before building, update src/services/api.ts with your production API URL:

// Production API URL
const API_BASE_URL = 'https://your-cloud-run-url.run.app';

Step 4: Build for Production

# Android
eas build --platform android --profile production

# iOS (requires Apple Developer account)
eas build --platform ios --profile production

App Store Submission

iOS App Store

eas submit --platform ios \
  --profile production

Requirements:

Apple Developer Program ($99/year)
App icons and screenshots
Privacy policy URL
App review information

Google Play Store

eas submit --platform android \
  --profile production

Requirements:

Google Play Developer account ($25 one-time fee)
App icons and screenshots
Privacy policy
Content rating questionnaire

Streamlit Deployment

Streamlit Cloud

1. Push your code to a GitHub repository
2. Connect at share.streamlit.io
3. Configure secrets in the Streamlit Cloud dashboard

Google Cloud Run

# Dockerfile for Streamlit
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Streamlit listens on 8501; on Cloud Run, deploy with --port 8501 so traffic is routed to it
EXPOSE 8501
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]

Production Checklist

Security

API keys in secret management
CORS restricted to production domains
HTTPS enforced
Rate limiting configured
Input validation on all endpoints
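
As an illustration of the rate limiting item, here is a minimal token-bucket sketch. It is an assumption about how limiting could be done, not the project's implementation: the `TokenBucketLimiter` class is hypothetical, and because it keeps state in process memory, each Cloud Run instance would enforce its own limit. A shared store such as Redis would be needed for a global limit.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-client token bucket: bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so tests can control time
        self._buckets: Dict[str, Tuple[float, float]] = {}  # client -> (tokens, last_seen)

    def allow(self, client_id: str) -> bool:
        """Consume one token for `client_id`; return False when the bucket is empty."""
        now = self.clock()
        tokens, last = self._buckets.get(client_id, (float(self.capacity), now))
        # Refill for the elapsed time, never exceeding the burst capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[client_id] = (tokens - 1.0, now)
            return True
        self._buckets[client_id] = (tokens, now)
        return False
```

A request handler could call `limiter.allow(client_ip)` and respond with HTTP 429 when it returns False.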

Performance

Qdrant Cloud in same region as API
Redis caching enabled
Connection pooling configured
Appropriate instance sizing
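
For the caching item, cache hit rates depend on equivalent requests mapping to the same key. The sketch below shows one way to derive such a key. The function name `analysis_cache_key` and the `analysis:v1` prefix are hypothetical, and sorting assumes the analysis result does not depend on ingredient order; drop the sort if it does.

```python
import hashlib
import json
from typing import Iterable


def analysis_cache_key(ingredients: Iterable[str], prefix: str = "analysis:v1") -> str:
    """Derive a stable cache key so equivalent ingredient lists share one entry."""
    # Trim, lowercase, de-duplicate, and sort so formatting differences do not split the cache.
    normalized = sorted({item.strip().lower() for item in ingredients if item.strip()})
    digest = hashlib.sha256(json.dumps(normalized).encode("utf-8")).hexdigest()
    return f"{prefix}:{digest}"
```

Bumping the version in the prefix is a simple way to invalidate old entries after the analysis logic changes.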

Monitoring

LangSmith tracing enabled
Error alerting configured
Health check endpoints monitored
API latency tracked
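
Health checks and latency tracking can be combined in a small external probe. The following is a sketch using only the standard library; the `probe` function is a hypothetical helper, and the `/health` path mentioned below is an assumed endpoint, since this guide does not name one.

```python
import time
import urllib.error
import urllib.request
from typing import Tuple


def probe(url: str, timeout: float = 5.0) -> Tuple[bool, float]:
    """Return (healthy, latency_seconds) for a single GET request to `url`."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            healthy = 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses, DNS failures, refused connections, and timeouts.
        healthy = False
    return healthy, time.perf_counter() - start
```

For example, `probe("https://your-cloud-run-url.run.app/health")` could be run on a schedule, with the result forwarded to whatever alerting system is in use.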

Mobile

Production API URL configured
App icons and splash screens
Privacy policy implemented
Crash reporting (Sentry/Crashlytics)

Scaling Considerations

API Scaling

┌─────────────────────────────────────────────────────────────┐
│                     CLOUD RUN AUTO-SCALING                   │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│                    ┌─────────────────┐                      │
│                    │  Load Balancer  │                      │
│                    └────────┬────────┘                      │
│                             │                                │
│         ┌───────────────────┼───────────────────┐           │
│         │                   │                   │           │
│         ▼                   ▼                   ▼           │
│   ┌───────────┐       ┌───────────┐       ┌───────────┐   │
│   │ Instance 1│       │ Instance 2│       │ Instance N│   │
│   └─────┬─────┘       └─────┬─────┘       └─────┬─────┘   │
│         │                   │                   │           │
│         └───────────────────┼───────────────────┘           │
│                             │                                │
│                             ▼                                │
│                    ┌─────────────────┐                      │
│                    │  Qdrant Cloud   │                      │
│                    └─────────────────┘                      │
│                                                              │
│  Configuration:                                              │
│  • Min instances: 1 (avoid cold starts)                     │
│  • Max instances: Based on budget                           │
│  • Concurrency: 80 requests per instance                    │
└─────────────────────────────────────────────────────────────┘
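
To pick a sensible maximum instance count, the concurrency setting can be turned into a rough estimate with Little's law (requests in flight = arrival rate x average latency). This is a back-of-the-envelope sketch: the traffic figures in the example are made up, `estimate_instances` is a hypothetical helper, and actual autoscaling also responds to signals such as CPU utilization.

```python
import math


def estimate_instances(requests_per_second: float, avg_latency_seconds: float,
                       concurrency: int = 80, min_instances: int = 1) -> int:
    """Estimate the instances needed to serve a steady load at the given concurrency."""
    # Little's law: average number of requests in flight at any moment.
    in_flight = requests_per_second * avg_latency_seconds
    return max(min_instances, math.ceil(in_flight / concurrency))


# Hypothetical load: 200 req/s at 1.5 s each -> 300 in flight / 80 per instance -> 4
print(estimate_instances(200, 1.5))
```

Setting the maximum somewhat above this estimate leaves headroom for bursts while still capping spend.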

Qdrant Scaling

For high-volume production usage:

Use Qdrant Cloud distributed mode
Configure read replicas for query performance
Consider dedicated cluster for enterprise workloads

Cost Optimization

Service	Free Tier	Typical Paid Cost
Google Gemini	15 requests/minute	Pay-per-use
Qdrant Cloud	1 GB storage	$25+/month
Cloud Run	2M requests/month	Pay-per-use
Redis Cloud	30 MB storage	$5+/month
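
The free-tier limits in the table can be checked against projected usage before launch. The sketch below hard-codes the limits listed above, which may be out of date, so verify them against each provider's current pricing. The `free_tier_report` function and the usage figures in the example are hypothetical.

```python
from typing import Dict

# Free-tier limits copied from the table; confirm against current provider pricing.
GEMINI_FREE_RPM = 15
QDRANT_FREE_GB = 1.0
CLOUD_RUN_FREE_REQUESTS = 2_000_000
REDIS_FREE_MB = 30.0


def free_tier_report(monthly_requests: int, peak_gemini_rpm: float,
                     qdrant_gb: float, redis_mb: float) -> Dict[str, bool]:
    """Map each service to True when the projected usage fits within its free tier."""
    return {
        "Google Gemini": peak_gemini_rpm <= GEMINI_FREE_RPM,
        "Qdrant Cloud": qdrant_gb <= QDRANT_FREE_GB,
        "Cloud Run": monthly_requests <= CLOUD_RUN_FREE_REQUESTS,
        "Redis Cloud": redis_mb <= REDIS_FREE_MB,
    }


# Hypothetical projection: only the Gemini request rate exceeds its free tier.
report = free_tier_report(monthly_requests=500_000, peak_gemini_rpm=20,
                          qdrant_gb=0.4, redis_mb=10)
```

Any service mapped to False is where paid usage would start first.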