OCR & Translation
The AI Ingredient Scanner supports multi-language ingredient labels, automatically detecting and translating non-English text using Gemini Vision.
Overview
┌───────────────────────────────────────────────────────────────────┐
│                      OCR & TRANSLATION FLOW                       │
│                                                                   │
│  ┌─────────────┐     ┌─────────────┐     ┌─────────────────┐      │
│  │    Image    │  →  │   Gemini    │  →  │    Language     │      │
│  │   Capture   │     │   Vision    │     │    Detection    │      │
│  └─────────────┘     └─────────────┘     └────────┬────────┘      │
│                                                   │               │
│                                        ┌──────────┴──────────┐    │
│                                        │      English?       │    │
│                                        └──────────┬──────────┘    │
│                                                   │               │
│                                 ┌─────────────────┴─────┐         │
│                                 │ YES                NO │         │
│                                 ▼                       ▼         │
│                          ┌─────────────┐         ┌─────────────┐  │
│                          │ Return Text │         │  Translate  │  │
│                          └──────┬──────┘         └──────┬──────┘  │
│                                 └───────────┬───────────┘         │
│                                             ▼                     │
│                                      ┌─────────────┐              │
│                                      │  Analysis   │              │
│                                      │  Pipeline   │              │
│                                      └─────────────┘              │
└───────────────────────────────────────────────────────────────────┘
Supported Languages
| Language | Detection Headers |
|---|---|
| English | Ingredients:, INGREDIENTS: |
| French | Ingrédients:, COMPOSITION: |
| Spanish | Ingredientes: |
| German | Inhaltsstoffe:, Zutaten: |
| Italian | Ingredienti: |
| Korean | 성분:, 전성분: |
| Japanese | 成分:, 全成分: |
| Chinese | 成分:, 配料: |
| Portuguese | Ingredientes: |
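The scanner relies on Gemini Vision for language detection, and the table shows one reason why: several languages share a header. The sketch below is purely illustrative and is not part of the scanner's backend. `HEADERS_BY_LANGUAGE` and `candidate_languages` are hypothetical names, and the two-letter codes are assumed to be ISO 639-1. It shows that header matching alone cannot separate Spanish from Portuguese, or Japanese from Chinese.

```python
from typing import Dict, List, Tuple

# Header strings copied from the table above; the language codes are an assumption.
HEADERS_BY_LANGUAGE: Dict[str, Tuple[str, ...]] = {
    "en": ("Ingredients:", "INGREDIENTS:"),
    "fr": ("Ingrédients:", "COMPOSITION:"),
    "es": ("Ingredientes:",),
    "de": ("Inhaltsstoffe:", "Zutaten:"),
    "it": ("Ingredienti:",),
    "ko": ("성분:", "전성분:"),
    "ja": ("成分:", "全成分:"),
    "zh": ("成分:", "配料:"),
    "pt": ("Ingredientes:",),
}


def candidate_languages(label_text: str) -> List[str]:
    """Return every language whose header appears in the text (case-insensitive)."""
    lowered = label_text.casefold()
    return [
        code
        for code, headers in HEADERS_BY_LANGUAGE.items()
        if any(header.casefold() in lowered for header in headers)
    ]


print(candidate_languages("Ingredientes: Agua, Glicerina"))  # ambiguous: es and pt
print(candidate_languages("Zutaten: Wasser, Glycerin"))      # unambiguous: de
```

When more than one candidate comes back, the ingredient text itself has to settle the question, which is the job the document assigns to the vision model.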
How It Works
Step 1: Image Capture
The mobile app captures the ingredient label image and encodes it as base64:
// Mobile App
const takePicture = async () => {
  const photo = await cameraRef.current.takePictureAsync({
    base64: true,
    quality: 0.8,
  });

  // Send to backend
  const response = await api.post('/ocr', {
    image: photo.base64,
  });

  setIngredients(response.data.text);
};
Step 2: OCR with Language Detection
The backend uses Gemini Vision to extract ingredients and detect the language:
# Backend OCR Prompt
OCR_PROMPT = """
Find and extract ONLY the ingredient list from this product label image.

INSTRUCTIONS:
1. Look for ingredient list headers in ANY language:
   - English: "Ingredients:", "INGREDIENTS:"
   - French: "Ingrédients:", "COMPOSITION:"
   - Korean: "성분:", "전성분:"
   - (and others...)
2. Extract the complete list of ingredients

OUTPUT FORMAT:
First line: LANGUAGE_DETECTED: <language code>
Second line onwards: The extracted ingredients
"""
Example Response (Korean Label)
LANGUAGE_DETECTED: ko
정제수, 글리세린, 나이아신아마이드, 부틸렌글라이콜
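The two-part reply format can be split with a small pure function, which makes the parsing step easy to unit-test without calling the model. This is a minimal sketch: `parse_ocr_response` and `LANGUAGE_PREFIX` are hypothetical names, and the fallback to `"en"` when the header is missing is an assumption modeled on the default used by the backend endpoint in this document.

```python
from typing import Tuple

LANGUAGE_PREFIX = "LANGUAGE_DETECTED:"


def parse_ocr_response(raw: str) -> Tuple[str, str]:
    """Split a model reply into (language_code, ingredients_text).

    Falls back to ("en", whole reply) when the first line carries no
    LANGUAGE_DETECTED header.
    """
    first_line, _, rest = raw.strip().partition("\n")
    if first_line.startswith(LANGUAGE_PREFIX):
        language = first_line[len(LANGUAGE_PREFIX):].strip().lower()
        return language, rest.strip()
    return "en", raw.strip()


language, ingredients = parse_ocr_response(
    "LANGUAGE_DETECTED: ko\n정제수, 글리세린, 나이아신아마이드, 부틸렌글라이콜"
)
print(language)     # ko
print(ingredients)  # 정제수, 글리세린, 나이아신아마이드, 부틸렌글라이콜
```

Lower-casing the code means a reply such as `LANGUAGE_DETECTED: KO` is still compared correctly against `"en"` later on.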
Step 3: Translation (if needed)
Non-English ingredients are translated while preserving scientific names:
def _translate_ingredients_to_english(client, ingredients_text):
    """Translate non-English ingredient text to English."""
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=f"""
You are an expert translator specializing in cosmetic and food ingredients.
TASK: Translate the following ingredient list to English.
INSTRUCTIONS:
1. Translate each ingredient name to its standard English equivalent
2. Keep scientific/INCI names unchanged:
   - "Aqua" stays "Aqua"
   - "Sodium Lauryl Sulfate" stays the same
3. Translate common ingredient names:
   - "Eau" → "Water"
   - "정제수" → "Purified Water"
4. Preserve the comma-separated format
5. Return ONLY the translated ingredient list
INGREDIENT LIST TO TRANSLATE:
{ingredients_text}
TRANSLATED INGREDIENTS:"""
    )
    return response.text.strip()
Translation Example
Input:
정제수, 글리세린, 나이아신아마이드
Output:
Purified Water, Glycerin, Niacinamide
Step 4: Analysis
The translated English ingredients proceed through the normal analysis pipeline.
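The analysis pipeline's input handling is not shown in this document. Assuming it works on individual ingredient names, the comma-separated text first needs splitting. The sketch below uses a hypothetical helper, `split_ingredients`, and treats commas inside parentheses as part of a single ingredient, since labels often group sub-components that way.

```python
from typing import List


def split_ingredients(text: str) -> List[str]:
    """Split on commas that are not nested inside parentheses."""
    items: List[str] = []
    current: List[str] = []
    depth = 0
    for char in text:
        if char == "(":
            depth += 1
        elif char == ")" and depth > 0:
            depth -= 1
        if char == "," and depth == 0:
            # A top-level comma ends the current ingredient.
            items.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    items.append("".join(current).strip())
    # Drop empty entries produced by doubled or trailing commas.
    return [item for item in items if item]


print(split_ingredients("Purified Water, Glycerin, Niacinamide"))
print(split_ingredients("Aqua, Fragrance (Linalool, Limonene), Glycerin"))
```

A plain `text.split(",")` would break the second example into four fragments, which is why the sketch tracks parenthesis depth.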
Backend Implementation
OCR Endpoint
import base64

from google import genai  # google-genai SDK

# app, settings, OCRRequest, OCRResponse and OCR_PROMPT are defined elsewhere in the backend.


@app.post("/ocr")
async def extract_text_from_image(request: OCRRequest):
    client = genai.Client(api_key=settings.google_api_key)

    # Decode image
    image_data = base64.b64decode(request.image)
    image_part = genai.types.Part.from_bytes(
        data=image_data,
        mime_type="image/jpeg"
    )

    # Extract ingredients with language detection
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=[image_part, OCR_PROMPT]
    )

    # Parse language; default to English when the header line is missing
    lines = response.text.strip().split('\n', 1)
    detected_language = "en"
    ingredients_text = response.text
    if lines[0].startswith("LANGUAGE_DETECTED:"):
        detected_language = lines[0].replace("LANGUAGE_DETECTED:", "").strip().lower()
        ingredients_text = lines[1].strip() if len(lines) > 1 else ""

    # Translate if non-English ("none" presumably means no ingredient list was found)
    if detected_language != "en" and detected_language != "none":
        ingredients_text = _translate_ingredients_to_english(client, ingredients_text)

    return OCRResponse(success=True, text=ingredients_text)
Mobile Integration
OCR Service
// services/ocr.ts
import { File } from 'expo-file-system';
import api from './api';

export async function extractIngredients(imageUri: string): Promise<string> {
  // Read image as base64
  const imageFile = new File(imageUri);
  const base64Image = await imageFile.base64();

  // Send to backend
  const response = await api.post('/ocr', {
    image: base64Image,
  });

  if (response.data?.text) {
    return cleanIngredientText(response.data.text);
  }
  return '';
}

function cleanIngredientText(text: string): string {
  return text
    .replace(/\s+/g, ' ') // Normalize whitespace
    .replace(/ingredients?\s*:?\s*/gi, '') // Remove headers
    .trim();
}
Usage in HomeScreen
// extractIngredients reads the file itself, so pass the image URI, not base64 data
const handleCapture = async (imageUri: string) => {
  setLoading(true);
  try {
    const extractedText = await extractIngredients(imageUri);
    setIngredients(extractedText);
    setShowCamera(false);
  } catch (error) {
    Alert.alert('OCR Failed', 'Could not extract ingredients from image');
  } finally {
    setLoading(false);
  }
};
Best Practices
Image Quality
Good
- Good lighting
- Steady camera
- Clear focus on ingredient list
- Minimal background noise
Avoid
- Blurry images
- Partial ingredient lists
- Glare or reflections
- Poor lighting
Error Handling
try {
  const ingredients = await extractIngredients(imageUri);
  if (!ingredients) {
    // Allow manual input
    Alert.alert(
      'No Ingredients Found',
      'Please enter ingredients manually or try a clearer photo'
    );
  }
} catch (error) {
  if (error.response?.status === 404) {
    // OCR endpoint not available
    console.log('Using manual input mode');
  } else {
    throw error;
  }
}
Troubleshooting
OCR Not Extracting Text
- Ensure good lighting conditions
- Hold camera steady
- Frame just the ingredient list
- Try selecting from gallery for clearer images
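When the tips above do not help, it can be worth checking whether the upload itself is a usable image before suspecting the photo. The OCR endpoint in this document always labels the payload as `image/jpeg`. The sketch below is a hypothetical pre-check, not part of that endpoint, and `decode_image_payload` is an assumed name. It validates the base64 text and identifies JPEG or PNG from the leading bytes.

```python
import base64
import binascii
from typing import Optional, Tuple

# Leading "magic" bytes for the two formats this sketch recognises.
_SIGNATURES = (
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
)


def decode_image_payload(b64_text: str) -> Optional[Tuple[bytes, str]]:
    """Return (raw_bytes, mime_type), or None if the payload is not a usable image."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        raw = base64.b64decode(b64_text, validate=True)
    except (binascii.Error, ValueError):
        return None
    for signature, mime_type in _SIGNATURES:
        if raw.startswith(signature):
            return raw, mime_type
    return None


sample = base64.b64encode(b"\xff\xd8\xff\xe0" + b"\x00" * 8).decode()
print(decode_image_payload(sample)[1])         # image/jpeg
print(decode_image_payload("not an image!"))   # None
```

A payload that still carries a `data:image/jpeg;base64,` prefix fails the strict decode and returns `None`, which points to a client-side encoding problem and not to a poor photo.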
Translation Errors
- Scientific/INCI names should remain unchanged
- Check if the language is supported
- Very rare ingredients may not translate correctly
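The first point can be checked mechanically. The sketch below flags protected names that appear in the OCR text but are missing from the translation. `missing_protected_names` and the `protected` list are hypothetical and for illustration only: the scanner's actual list of INCI names, if it keeps one, is not shown in this document.

```python
from typing import Iterable, List


def missing_protected_names(
    source_text: str, translated_text: str, protected_names: Iterable[str]
) -> List[str]:
    """Return protected names present in the source but absent after translation."""
    source = source_text.casefold()
    translated = translated_text.casefold()
    return [
        name
        for name in protected_names
        if name.casefold() in source and name.casefold() not in translated
    ]


protected = ["Aqua", "Sodium Lauryl Sulfate"]  # illustrative subset only
altered = missing_protected_names(
    "Aqua, Glycérine, Sodium Lauryl Sulfate",
    "Water, Glycerin, Sodium Lauryl Sulfate",
    protected,
)
print(altered)  # ['Aqua']
```

The comparison is a plain case-insensitive substring match, so it can miss or over-report names embedded in longer words. It is meant as a quick diagnostic, not a validator.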
Network Issues
const api = axios.create({
  baseURL: API_BASE_URL,
  timeout: 120000, // 2 minutes for OCR + translation
});