requirements.txt

# Parser HTML (mails Picnic)
beautifulsoup4==4.12.3
lxml==5.3.0

# Parser PDF (tickets Leclerc)
pdfplumber==0.11.4
pytesseract>=0.3.10    # binding Python pour Tesseract OCR
Pillow>=10.0           # manipulation d'images (extraction JPEG du PDF)

# LLM (appels API OpenAI-compatible)
requests>=2.31

# Tests
pytest==8.3.4

# Note : Tesseract OCR (binaire C++) doit être installé séparément :
#   Windows : https://github.com/UB-Mannheim/tesseract/wiki
#   Linux   : apt install tesseract-ocr tesseract-ocr-fra
# Le modèle français (fra.traineddata) est requis.
# Sans droits admin, créer un dossier tessdata/ à la racine du projet :
#   tessdata/fra.traineddata  (14 Mo, téléchargeable sur github.com/tesseract-ocr/tessdata)
#   tessdata/eng.traineddata  (copié depuis l'install Tesseract)
feat: migration Windows → Ubuntu, stabilisation suite de tests - Ajout venv Python (.venv) avec pip bootstrap (python3-venv absent) - Correction OCR Linux : marqueur TTC/TVA tolère la confusion T↔I (Tesseract 5.3.4 Linux lit parfois "TIc" au lieu de "TTC") - test_leclerc.py : skipif si Tesseract absent, xfail pour test de somme (précision OCR variable entre plateformes, solution LLM vision prévue) - Résultat : 77 passent, 1 xfail, 0 échec (vs 78 sur Windows) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> 2026-02-24 18:53:41 +01:00			`# Parser HTML (mails Picnic)`
			`beautifulsoup4==4.12.3`
			`lxml==5.3.0`

			`# Parser PDF (tickets Leclerc)`
			`pdfplumber==0.11.4`
			`pytesseract>=0.3.10 # binding Python pour Tesseract OCR`
			`Pillow>=10.0 # manipulation d'images (extraction JPEG du PDF)`

			`# LLM (appels API OpenAI-compatible)`
			`requests>=2.31`

			`# Tests`
			`pytest==8.3.4`

			`# Note : Tesseract OCR (binaire C++) doit être installé séparément :`
			`# Windows : https://github.com/UB-Mannheim/tesseract/wiki`
			`# Linux : apt install tesseract-ocr tesseract-ocr-fra`
			`# Le modèle français (fra.traineddata) est requis.`
			`# Sans droits admin, créer un dossier tessdata/ à la racine du projet :`
			`# tessdata/fra.traineddata (14 Mo, téléchargeable sur github.com/tesseract-ocr/tessdata)`
			`# tessdata/eng.traineddata (copié depuis l'install Tesseract)`