Data Ingestion of Guide Commericiale
* Run OCR with Pytesseract
* Extract text using PDFPlumber
* Compare the two text extraction approaches, choose the most suitable approach, and pre-process the text output
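The three tasks above (OCR with Pytesseract, text extraction with PDFPlumber, then comparison and pre-processing) could be sketched roughly as follows. This is only an illustration, not the project's actual pipeline: the function names, the 300 dpi rendering resolution, the default OCR language, and the use of a `difflib` similarity ratio as the comparison measure are all assumptions. The third-party imports sit inside the functions so that the helper functions can be used without `pdfplumber`, `pytesseract`, or the Tesseract binary installed.

```python
import difflib
import re


def extract_with_pdfplumber(pdf_path):
    """Read the embedded text layer of each page (hypothetical helper)."""
    import pdfplumber  # third-party; assumed to be installed

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() may return None for pages without a text layer
        return [page.extract_text() or "" for page in pdf.pages]


def ocr_with_pytesseract(pdf_path, lang="eng", resolution=300):
    """Render each page to an image and run OCR on it (hypothetical helper).

    `lang` and `resolution` are assumed defaults; the matching Tesseract
    language data must be installed for the chosen language.
    """
    import pdfplumber
    import pytesseract  # third-party; also needs the Tesseract binary

    texts = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            image = page.to_image(resolution=resolution).original  # PIL image
            texts.append(pytesseract.image_to_string(image, lang=lang))
    return texts


def normalize(text):
    """Basic pre-processing: rejoin hyphenated line breaks, collapse whitespace."""
    text = re.sub(r"-\s*\n\s*", "", text)
    return re.sub(r"\s+", " ", text).strip()


def similarity(text_a, text_b):
    """Similarity ratio in [0, 1] between two normalized page texts."""
    return difflib.SequenceMatcher(None, normalize(text_a), normalize(text_b)).ratio()
```

Calling `similarity` on the per-page outputs of the two extractors gives a quick indication of where they disagree; pages with a low ratio are candidates for manual inspection before choosing one approach.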
Revision as of 11:50, 28 November 2024
Introduction
Project Timeline & Milestones
Timeframe | Task | Completion |
---|---|---|
Week 4 | | ✓ |
Week 5 | | ✓ |
Week 6 | | ✓ |
Week 7 | | ✓ |
Week 8 | | ✓ |
Week 9 | | ✓ |
Week 10 | | ✓ |
Week 11 | | ✓ |
Week 12 | | |
Week 13 | | |
Week 14 | | |