Deployment

This guide covers deploying the AI Ingredient Scanner to production environments, including the FastAPI backend, mobile applications, and Streamlit interface.


Architecture Overview

┌──────────────────────────────────────────────────────────────────────────────┐
│                            PRODUCTION ARCHITECTURE                           │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   ┌─────────────────────┐                                                    │
│   │     Mobile Apps     │                                                    │
│   │  ┌───────┐ ┌───────┐│                                                    │
│   │  │  iOS  │ │Android││                                                    │
│   │  │ Store │ │ Play  ││                                                    │
│   │  └───┬───┘ └───┬───┘│                                                    │
│   └──────┼─────────┼────┘                                                    │
│          │         │                                                         │
│          └────┬────┘                                                         │
│               │                                                              │
│               ▼                                                              │
│   ┌───────────────────────────────────────────────────────────────────┐      │
│   │                         Google Cloud Run                          │      │
│   │  ┌───────────────────────────────────────────────────────────────┐│      │
│   │  │                        FastAPI Backend                        ││      │
│   │  │  • /ocr - Image processing                                    ││      │
│   │  │  • /analyze - Safety analysis                                 ││      │
│   │  │  • Auto-scaling, Load balancing                               ││      │
│   │  └───────────────────────────────────────────────────────────────┘│      │
│   └──────────┬──────────────────┬──────────────────┬──────────────────┘      │
│              │                  │                  │                         │
│              ▼                  ▼                  ▼                         │
│   ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐             │
│   │   Qdrant Cloud   │ │   Redis Cloud    │ │ Google Gemini API│             │
│   │  (Vector Store)  │ │    (Cache)       │ │      (AI)        │             │
│   └──────────────────┘ └──────────────────┘ └──────────────────┘             │
│                                                       │                      │
│                                                       ▼                      │
│                                              ┌──────────────────┐            │
│                                              │    LangSmith     │            │
│                                              │  (Observability) │            │
│                                              └──────────────────┘            │
└──────────────────────────────────────────────────────────────────────────────┘

Backend Deployment

Google Cloud Run (Recommended)

Cloud Run is the recommended deployment platform for the FastAPI backend due to its auto-scaling capabilities and managed infrastructure.

Step 1: Create Dockerfile

FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY . .

# Run with uvicorn
CMD ["uvicorn", "api:app", "--host", "0.0.0.0", "--port", "8080"]

Step 2: Build and Push Image

# Authenticate with GCP
gcloud auth configure-docker

# Build image
docker build -t gcr.io/YOUR_PROJECT/ingredient-scanner:latest .

# Push to Container Registry
docker push gcr.io/YOUR_PROJECT/ingredient-scanner:latest

Step 3: Deploy to Cloud Run

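# The xxx values are placeholders. For real keys, prefer --set-secrets backed by
# Google Secret Manager over inline --set-env-vars.
gcloud run deploy ingredient-scanner \
  --image gcr.io/YOUR_PROJECT/ingredient-scanner:latest \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars "GOOGLE_API_KEY=xxx,QDRANT_URL=xxx,QDRANT_API_KEY=xxx"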

Environment Variables

Variable            Required   Description
GOOGLE_API_KEY      Yes        Gemini API key for AI processing
QDRANT_URL          Yes        Qdrant Cloud cluster URL
QDRANT_API_KEY      Yes        Qdrant Cloud API key
REDIS_URL           No         Redis connection string for caching
LANGCHAIN_API_KEY   No         LangSmith API key for observability

Security Warning

Never commit API keys to version control. Use secret management services:

  • Cloud Run: Google Secret Manager
  • Kubernetes: K8s Secrets
  • Heroku: Config Vars

Alternative: Docker Compose

For self-hosted or on-premise deployments:

# docker-compose.yml
version: '3.8'

services:
  api:
    build: .
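    ports:
      # The Dockerfile above starts uvicorn on port 8080 inside the container
      - "8000:8080"
    environment:
      - GOOGLE_API_KEY=${GOOGLE_API_KEY}
      - QDRANT_URL=${QDRANT_URL}
      - QDRANT_API_KEY=${QDRANT_API_KEY}
      # Point the API at the redis service defined below
      - REDIS_URL=redis://redis:6379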
    restart: unless-stopped

  redis:
    image: redis:alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

volumes:
  redis_data:

Mobile App Deployment

Expo EAS Build

Step 1: Install EAS CLI

npm install -g eas-cli
eas login

Step 2: Configure eas.json

{
  "cli": {
    "version": ">= 3.0.0"
  },
  "build": {
    "development": {
      "developmentClient": true,
      "distribution": "internal"
    },
    "preview": {
      "distribution": "internal",
      "ios": {
        "simulator": true
      }
    },
    "production": {
      "ios": {
        "resourceClass": "m1-medium"
      },
      "android": {
        "buildType": "apk"
      }
    }
  },
  "submit": {
    "production": {}
  }
}

Step 3: Update API URL

Before building, update src/services/api.ts with your production API URL:

// Production API URL
const API_BASE_URL = 'https://your-cloud-run-url.run.app';

Step 4: Build for Production

# Android APK
eas build --platform android --profile production

# iOS (requires Apple Developer account)
eas build --platform ios --profile production

App Store Submission

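iOS App Store

eas submit --platform ios \
  --profile production

Requirements:

  • Apple Developer Program ($99/year)
  • App icons and screenshots
  • Privacy policy URL
  • App review information

Google Play Store

eas submit --platform android \
  --profile production

Google Play accepts only Android App Bundles (AAB) for new apps, so set "buildType" to "app-bundle" in the production profile before building a release for submission. The APK build configured earlier is suited to direct installs and internal testing.

Requirements:

  • Google Play Developer account ($25 one-time fee)
  • App icons and screenshots
  • Privacy policy
  • Content rating questionnaire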

Streamlit Deployment

Streamlit Cloud

  1. Push your code to a GitHub repository
  2. Connect the repository at share.streamlit.io
  3. Configure secrets in the Streamlit Cloud dashboard

Google Cloud Run

# Dockerfile for Streamlit
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

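# Cloud Run routes traffic to port 8080 by default, so deploy this image
# with --port 8501 (or change the Streamlit port to 8080).
EXPOSE 8501
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]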

Production Checklist

Security

  • API keys in secret management
  • CORS restricted to production domains
  • HTTPS enforced
  • Rate limiting configured
  • Input validation on all endpoints

Performance

  • Qdrant Cloud in same region as API
  • Redis caching enabled
  • Connection pooling configured
  • Appropriate instance sizing

Monitoring

  • LangSmith tracing enabled
  • Error alerting configured
  • Health check endpoints monitored
  • API latency tracked

Mobile

  • Production API URL configured
  • App icons and splash screens
  • Privacy policy implemented
  • Crash reporting (Sentry/Crashlytics)
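
The "Health check endpoints monitored" item can be exercised with a small probe run from a scheduler or uptime job. The sketch below uses only the standard library. This guide does not name a health route, so the /health path in the usage note is hypothetical, as is the function name is_healthy.

```python
import urllib.request


def is_healthy(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if a GET to `url` answers with HTTP 200 within `timeout` seconds.

    `opener` is injectable so the probe can be tested without network access.
    (Hypothetical helper, not taken from the project.)
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False
```

For example, is_healthy("https://your-cloud-run-url.run.app/health") would return False for a non-200 status, a connection failure, or a timeout, which is the signal an alerting job needs.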

Scaling Considerations

API Scaling

┌──────────────────────────────────────────────────────────────┐
│                    CLOUD RUN AUTO-SCALING                    │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│                    ┌─────────────────┐                       │
│                    │  Load Balancer  │                       │
│                    └────────┬────────┘                       │
│                             │                                │
│         ┌───────────────────┼───────────────────┐            │
│         │                   │                   │            │
│         ▼                   ▼                   ▼            │
│   ┌───────────┐       ┌───────────┐       ┌───────────┐      │
│   │ Instance 1│       │ Instance 2│       │ Instance N│      │
│   └─────┬─────┘       └─────┬─────┘       └─────┬─────┘      │
│         │                   │                   │            │
│         └───────────────────┼───────────────────┘            │
│                             │                                │
│                             ▼                                │
│                    ┌─────────────────┐                       │
│                    │  Qdrant Cloud   │                       │
│                    └─────────────────┘                       │
│                                                              │
│  Configuration:                                              │
│  • Min instances: 1 (avoid cold starts)                      │
│  • Max instances: Based on budget                            │
│  • Concurrency: 80 requests per instance                     │
└──────────────────────────────────────────────────────────────┘
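
To turn the configuration above into a max-instances figure, a rough estimate follows from Little's law: requests in flight equal the arrival rate times the average latency, and each instance absorbs up to the configured concurrency. The sketch below is a back-of-the-envelope calculation, not a sizing guarantee. The traffic and latency numbers in the usage line are hypothetical, and only the concurrency of 80 and the minimum of 1 instance come from the diagram.

```python
import math


def instances_needed(requests_per_second, avg_latency_seconds,
                     concurrency=80, min_instances=1):
    """Estimate how many instances are needed to serve a steady load.

    in-flight requests = arrival rate * average latency (Little's law);
    each instance handles up to `concurrency` requests at once.
    """
    in_flight = requests_per_second * avg_latency_seconds
    return max(min_instances, math.ceil(in_flight / concurrency))


# Hypothetical peak: 200 requests/s at 2 s average latency
# gives 400 requests in flight, so 400 / 80 = 5 instances.
print(instances_needed(200, 2.0))  # 5
```

Because AI-backed endpoints such as /analyze tend to have multi-second latency, the latency term usually dominates the estimate, so it is worth measuring before fixing a budget-based maximum.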

Qdrant Scaling

For high-volume production usage:

  • Use Qdrant Cloud distributed mode
  • Configure read replicas for query performance
  • Consider dedicated cluster for enterprise workloads

Cost Optimization

Service         Free Tier      Typical Cost
Google Gemini   15 RPM         Pay-per-use
Qdrant Cloud    1GB free       $25+/month
Cloud Run       2M requests    Pay-per-use
Redis Cloud     30MB free      $5+/month
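
Staying inside the 15 RPM Gemini free tier listed above requires throttling on the client side, because several scans arriving together can exceed the quota. The sketch below shows one possible approach, a sliding-window limiter built on the standard library. The class name SlidingWindowLimiter and its parameters are assumptions for illustration. It limits calls within a single process only, so multiple Cloud Run instances would each get their own budget unless the counter is shared, for example in Redis.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window.

    (Hypothetical helper, not taken from the project.)
    """

    def __init__(self, max_calls=15, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._calls = deque()    # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Window is full: wait until the oldest call expires.
            self._sleep(self.period - (now - self._calls[0]))
```

Calling limiter.acquire() immediately before each Gemini request keeps a single worker under the quota. Cached results served from Redis do not need to pass through the limiter.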

Related Documentation