Deployment
This guide covers deploying the AI Ingredient Scanner to production environments, including the FastAPI backend, mobile applications, and Streamlit interface.
Architecture Overview
┌───────────────────────────────────────────────────────────────────┐
│                      PRODUCTION ARCHITECTURE                      │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│                      ┌─────────────────────┐                      │
│                      │     Mobile Apps     │                      │
│                      │ ┌───────┐ ┌───────┐ │                      │
│                      │ │  iOS  │ │Android│ │                      │
│                      │ │ Store │ │ Play  │ │                      │
│                      │ └───┬───┘ └───┬───┘ │                      │
│                      └─────┼─────────┼─────┘                      │
│                            └────┬────┘                            │
│                                 ▼                                 │
│  ┌─────────────────────────────────────────────────────────────┐  │
│  │                      Google Cloud Run                       │  │
│  │ ┌─────────────────────────────────────────────────────────┐ │  │
│  │ │                     FastAPI Backend                     │ │  │
│  │ │  • /ocr - Image processing                              │ │  │
│  │ │  • /analyze - Safety analysis                           │ │  │
│  │ │  • Auto-scaling, Load balancing                         │ │  │
│  │ └─────────────────────────────────────────────────────────┘ │  │
│  └────────┬─────────────────────┬─────────────────────┬────────┘  │
│           ▼                     ▼                     ▼           │
│  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐  │
│  │  Qdrant Cloud   │   │   Redis Cloud   │   │Google Gemini API│  │
│  │ (Vector Store)  │   │     (Cache)     │   │      (AI)       │  │
│  └─────────────────┘   └─────────────────┘   └─────────────────┘  │
│                                 │                                 │
│                                 ▼                                 │
│                        ┌─────────────────┐                        │
│                        │    LangSmith    │                        │
│                        │ (Observability) │                        │
│                        └─────────────────┘                        │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘
Backend Deployment
Google Cloud Run (Recommended)
Cloud Run is the recommended deployment platform for the FastAPI backend due to its auto-scaling capabilities and managed infrastructure.
Step 1: Create Dockerfile
FROM python:3.11-slim
WORKDIR /app
# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application
COPY . .
# Run with uvicorn
CMD ["uvicorn", "api:app", "--host", "0.0.0.0", "--port", "8080"]
Step 2: Build and Push Image
# Authenticate with GCP
gcloud auth configure-docker
# Build image
docker build -t gcr.io/YOUR_PROJECT/ingredient-scanner:latest .
# Push to Container Registry
docker push gcr.io/YOUR_PROJECT/ingredient-scanner:latest
Step 3: Deploy to Cloud Run
gcloud run deploy ingredient-scanner \
  --image gcr.io/YOUR_PROJECT/ingredient-scanner:latest \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars "GOOGLE_API_KEY=xxx,QDRANT_URL=xxx,QDRANT_API_KEY=xxx"
Environment Variables
| Variable | Required | Description |
|---|---|---|
| GOOGLE_API_KEY | Yes | Gemini API key for AI processing |
| QDRANT_URL | Yes | Qdrant Cloud cluster URL |
| QDRANT_API_KEY | Yes | Qdrant Cloud API key |
| REDIS_URL | No | Redis connection string for caching |
| LANGCHAIN_API_KEY | No | LangSmith API key for observability |
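A missing required variable is easier to diagnose at startup than on the first request. The following is a minimal sketch of a fail-fast check, assuming the variable names from the table above; the `load_settings` helper is hypothetical and not part of the project's code.

```python
import os

# Variable names come from the table above; the helper itself is hypothetical.
REQUIRED_VARS = ("GOOGLE_API_KEY", "QDRANT_URL", "QDRANT_API_KEY")
OPTIONAL_VARS = ("REDIS_URL", "LANGCHAIN_API_KEY")


def load_settings(env=None):
    """Return a dict of settings, raising if a required variable is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    settings = {name: env[name] for name in REQUIRED_VARS}
    # Optional variables default to None so callers can switch the feature off.
    settings.update({name: env.get(name) or None for name in OPTIONAL_VARS})
    return settings
```

Calling `load_settings()` once at import time of the API module would stop the container before it starts serving traffic with an incomplete configuration.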
Security Warning
Never commit API keys to version control. Use secret management services:
- Cloud Run: Google Secret Manager
- Kubernetes: K8s Secrets
- Heroku: Config Vars
Alternative: Docker Compose
For self-hosted or on-premise deployments:
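# docker-compose.yml
version: '3.8'
services:
  api:
    build: .
    ports:
      # The Dockerfile above starts uvicorn on port 8080 inside the container
      - "8000:8080"
    environment:
      - GOOGLE_API_KEY=${GOOGLE_API_KEY}
      - QDRANT_URL=${QDRANT_URL}
      - QDRANT_API_KEY=${QDRANT_API_KEY}
      # Point the optional cache at the redis service defined below
      - REDIS_URL=redis://redis:6379
    restart: unless-stopped
  redis:
    image: redis:alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
volumes:
  redis_data:
Mobile App Deployment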
Expo EAS Build
Step 1: Install EAS CLI
npm install -g eas-cli
eas login
Step 2: Configure eas.json
{
  "cli": {
    "version": ">= 3.0.0"
  },
  "build": {
    "development": {
      "developmentClient": true,
      "distribution": "internal"
    },
    "preview": {
      "distribution": "internal",
      "ios": {
        "simulator": true
      }
    },
    "production": {
      "ios": {
        "resourceClass": "m1-medium"
      },
      "android": {
        "buildType": "apk"
      }
    }
  },
  "submit": {
    "production": {}
  }
}
Step 3: Update API URL
Before building, update src/services/api.ts with your production API URL:
// Production API URL
const API_BASE_URL = 'https://your-cloud-run-url.run.app';
Step 4: Build for Production
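# Android APK (suitable for direct installs)
# Google Play requires an Android App Bundle for new apps, so set
# "buildType" to "app-bundle" in eas.json before submitting to the store
eas build --platform android --profile production
# iOS (requires Apple Developer account)
eas build --platform ios --profile production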
App Store Submission
iOS App Store
eas submit --platform ios --profile production
Requirements:
- Apple Developer Program ($99/year)
- App icons and screenshots
- Privacy policy URL
- App review information
Google Play Store
eas submit --platform android --profile production
Requirements:
- Google Play Developer ($25 one-time)
- App icons and screenshots
- Privacy policy
- Content rating questionnaire
Streamlit Deployment
Streamlit Cloud
1. Push your code to a GitHub repository
2. Connect at share.streamlit.io
3. Configure secrets in the Streamlit Cloud dashboard
Google Cloud Run
# Dockerfile for Streamlit
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Cloud Run sends traffic to port 8080 by default; deploy with --port 8501 to match
EXPOSE 8501
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
Production Checklist
Security
- API keys in secret management
- CORS restricted to production domains
- HTTPS enforced
- Rate limiting configured
- Input validation on all endpoints
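As an illustration of the input validation item, the sketch below checks an uploaded image before it reaches OCR. It assumes the /ocr endpoint accepts raw JPEG or PNG bytes; the 5 MiB cap and the `sniff_image_type` helper are hypothetical choices, not documented limits of the project.

```python
MAX_IMAGE_BYTES = 5 * 1024 * 1024  # assumed 5 MiB cap, not a documented limit

# Leading "magic" bytes for the formats this sketch accepts.
SIGNATURES = {
    "image/jpeg": b"\xff\xd8\xff",
    "image/png": b"\x89PNG\r\n\x1a\n",
}


def sniff_image_type(payload: bytes) -> str:
    """Return the detected MIME type, or raise ValueError for a bad upload."""
    if not payload:
        raise ValueError("empty upload")
    if len(payload) > MAX_IMAGE_BYTES:
        raise ValueError("upload exceeds size limit")
    for mime, signature in SIGNATURES.items():
        if payload.startswith(signature):
            return mime
    raise ValueError("unsupported image format")
```

Checking the leading bytes rather than the client-supplied content type means a mislabeled or renamed file is rejected before any image decoding work is done.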
Performance
- Qdrant Cloud in same region as API
- Redis caching enabled
- Connection pooling configured
- Appropriate instance sizing
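Caching only pays off if equivalent requests map to the same key. The sketch below shows one way to derive a deterministic key for an analysis result. It assumes that case, surrounding whitespace, duplicates, and ordering of ingredients do not change the analysis; if label order matters to the application, the sorting step should be dropped. The `cache_key` helper and the `analysis` prefix are hypothetical.

```python
import hashlib


def cache_key(ingredients, prefix="analysis"):
    """Build a deterministic key that ignores case, whitespace, order, and duplicates."""
    normalized = sorted(
        {item.strip().lower() for item in ingredients if item.strip()}
    )
    digest = hashlib.sha256("\n".join(normalized).encode("utf-8")).hexdigest()
    return f"{prefix}:{digest}"
```

The resulting string can be used directly as a Redis key, with the serialized analysis as the value and an expiry chosen to match how often the underlying safety data changes.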
Monitoring
- LangSmith tracing enabled
- Error alerting configured
- Health check endpoints monitored
- API latency tracked
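Health check monitoring and latency tracking can start with a small probe run from a scheduler or uptime service. The sketch below uses only the standard library; the `/health` path in the usage note is an assumption, since this guide only names the /ocr and /analyze endpoints.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return (healthy, status_code, latency_seconds) for one GET request."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        status = None  # DNS failure, refused connection, timeout, and similar
    latency = time.perf_counter() - start
    return status == 200, status, latency
```

For example, `probe("https://your-cloud-run-url.run.app/health")` could be called once a minute, alerting when `healthy` is false or when the latency exceeds a threshold you choose.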
Mobile
- Production API URL configured
- App icons and splash screens
- Privacy policy implemented
- Crash reporting (Sentry/Crashlytics)
Scaling Considerations
API Scaling
┌───────────────────────────────────────────────────────────┐
│                  CLOUD RUN AUTO-SCALING                   │
├───────────────────────────────────────────────────────────┤
│                                                           │
│                    ┌─────────────────┐                    │
│                    │  Load Balancer  │                    │
│                    └────────┬────────┘                    │
│                             │                             │
│         ┌───────────────────┼───────────────────┐         │
│         ▼                   ▼                   ▼         │
│   ┌───────────┐       ┌───────────┐       ┌───────────┐   │
│   │ Instance 1│       │ Instance 2│       │ Instance N│   │
│   └─────┬─────┘       └─────┬─────┘       └─────┬─────┘   │
│         └───────────────────┼───────────────────┘         │
│                             ▼                             │
│                    ┌─────────────────┐                    │
│                    │  Qdrant Cloud   │                    │
│                    └─────────────────┘                    │
│                                                           │
│  Configuration:                                           │
│  • Min instances: 1 (avoid cold starts)                   │
│  • Max instances: Based on budget                         │
│  • Concurrency: 80 requests per instance                  │
└───────────────────────────────────────────────────────────┘
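To pick a maximum instance count, a rough estimate can be derived from the concurrency setting of 80 requests per instance. The sketch below applies Little's law (requests in flight = arrival rate × average latency). The traffic figures in the usage note are illustrative assumptions, not measurements from this project, and the `instances_needed` helper is hypothetical.

```python
import math


def instances_needed(requests_per_second, avg_latency_seconds,
                     concurrency=80, min_instances=1):
    """Estimate instance count from Little's law: in-flight = rate x latency."""
    in_flight = requests_per_second * avg_latency_seconds
    return max(min_instances, math.ceil(in_flight / concurrency))
```

With an assumed 200 requests per second and a 2 second average response time, 400 requests are in flight at once, so `instances_needed(200, 2.0)` returns 5. Real workloads are bursty, so treat the result as a lower bound and leave headroom.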
Qdrant Scaling
For high-volume production usage:
- Use Qdrant Cloud distributed mode
- Configure read replicas for query performance
- Consider dedicated cluster for enterprise workloads
Cost Optimization
| Service | Free Tier | Typical Cost |
|---|---|---|
| Google Gemini | 15 requests/minute (RPM) | Pay-per-use |
| Qdrant Cloud | 1 GB | $25+/month |
| Cloud Run | 2M requests/month | Pay-per-use |
| Redis Cloud | 30 MB | $5+/month |
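Staying inside the Gemini free tier means keeping calls under the 15 RPM figure listed above. One possible approach is a client-side sliding-window limiter, sketched below with the standard library only. The `RpmLimiter` class is hypothetical, and the limit should be adjusted to whatever quota applies to your account.

```python
import time
from collections import deque


class RpmLimiter:
    """Sliding-window limiter: at most `limit` calls in any `window` seconds."""

    def __init__(self, limit=15, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.calls = deque()  # timestamps of calls still inside the window

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            return 0.0
        return self.window - (now - self.calls[0])

    def record(self):
        """Register that a call was just made."""
        self.calls.append(self.clock())
```

A caller would run `time.sleep(limiter.wait_time())`, then `limiter.record()`, then issue the Gemini request. The limiter is per process, so with several Cloud Run instances the combined rate can still exceed the quota unless the counter is shared, for example in Redis.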