Production-Ready AI Agent Pipeline for Mobile App Screen Analysis (Python + OCR + LLM) needs Mobile App Development
Contact person: Production-Ready AI Agent Pipeline for Mobile App Screen Analysis (Python + OCR + LLM)
Location: Ankara, Turkey
Budget: Recommended by industry experts
Time to start: As soon as possible
Project description:
"I’ve developed a modular, real-time AI pipeline that analyzes the screen of a mobile application running inside **BlueStacks 5**, extracting structured information with high accuracy. The system is at an **MVP+ level**, with most core components working, but it still needs several improvements and production-hardening.
---
### **Current Architecture**
* **scrcpy**: Android screen mirroring
* **OpenCV**: ROI detection and contour analysis
* **PaddleOCR 2.7 (GPU-ready)**: OCR engine with Turkish + numeric recognition
* **Gemini 2.0 Flash**: AI-based postprocessing, normalization, classification
* **KeyDB**: Deduplication via idempotent key hashing
* **SQLite** + **Google Sheets API**: Persistent + cloud data storage
* **Logging & Metrics**: Centralized log and runtime tracker
* **[login to view URL]**: Centralized dynamic configuration
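
The deduplication step listed above ("idempotent key hashing") could look roughly like the sketch below. It is only an illustration under stated assumptions: `idempotency_key` and `TTLDeduplicator` are hypothetical names, the record fields are invented, and an in-memory dictionary stands in for KeyDB. In the real pipeline the same check would presumably be a single Redis-compatible `SET key value NX EX ttl` call.

```python
import hashlib
import json
import time


def idempotency_key(record: dict) -> str:
    """Hash the canonical JSON form so equal content always yields the same key."""
    canonical = json.dumps(
        record, sort_keys=True, ensure_ascii=False, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TTLDeduplicator:
    """Hypothetical in-memory stand-in for a KeyDB 'set if absent, with expiry' check."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock          # injectable so the TTL can be tested
        self._expires_at = {}        # key -> expiry timestamp

    def is_new(self, record: dict) -> bool:
        """Return True the first time a record is seen within the TTL window."""
        now = self._clock()
        key = idempotency_key(record)
        expiry = self._expires_at.get(key)
        if expiry is not None and expiry > now:
            return False             # duplicate that has not expired yet
        self._expires_at[key] = now + self.ttl_seconds
        return True


# Usage: field order does not matter because keys are sorted before hashing.
dedup = TTLDeduplicator(ttl_seconds=30)
print(dedup.is_new({"team": "A", "odds": "1.85"}))
print(dedup.is_new({"odds": "1.85", "team": "A"}))
```

Hashing a canonical serialization rather than the raw OCR string keeps the key stable when field order or whitespace in the serialized form changes between frames.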
---
### **What’s Completed So Far (MVP+)**
* End-to-end frame-based pipeline (capture → OCR → validation → AI → storage)
* Dynamic ROI extraction + temporal majority voting
* Gemini-powered semantic normalization and header mapping
* KeyDB-based duplicate detection (TTL controlled)
* SQLite + Sheets integration with idempotent writes
* Logging system + metrics output (processed cells, errors, runtime)
* Test suite with golden test structure and static test image support
* Modular folder structure (`src/`, `tests/`, `logs/`) with clean interface
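
As an illustration of the temporal majority voting mentioned above, here is a minimal sketch. It assumes each ROI gets one OCR reading per frame and that a value is accepted only once it wins enough votes inside a sliding window. `TemporalVoter`, `window`, and `min_votes` are hypothetical names and defaults, not taken from the existing codebase.

```python
from collections import Counter, deque
from typing import Optional


class TemporalVoter:
    """Keep the last `window` OCR readings for one ROI and vote on them."""

    def __init__(self, window: int = 5, min_votes: int = 3):
        self.min_votes = min_votes
        self._readings = deque(maxlen=window)  # oldest reading drops out automatically

    def update(self, text: str) -> Optional[str]:
        """Add a reading; return the majority value, or None if no value has enough votes."""
        self._readings.append(text)
        value, count = Counter(self._readings).most_common(1)[0]
        return value if count >= self.min_votes else None


# Usage: a single misread ("1.35") does not change the accepted value.
voter = TemporalVoter(window=5, min_votes=3)
for reading in ["1.85", "1.85", "1.35", "1.85", "1.85"]:
    print(reading, "->", voter.update(reading))
```

Returning `None` until the threshold is met lets downstream stages skip unstable cells instead of storing a value that may flip on the next frame.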
---
### **Remaining Work / Needed Help**
* Finalize **Gemini API integration** and fallback handling
* Improve **error recovery & graceful degradation**
* Add **real-time monitoring dashboard** (web or CLI-based)
* Finalize **test automation & CI/CD hooks**
* Improve parallelism / frame throughput (batch OCR, multi-threading)
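
One possible shape for the fallback handling and graceful degradation items above is sketched here. It makes no assumptions about the Gemini client itself: `primary` is any callable that may raise (for example, a wrapper around the LLM call), and `rule_based_normalize` is a hypothetical local fallback. The retry count and delays are illustrative values only.

```python
import time


def rule_based_normalize(text: str) -> dict:
    """Hypothetical degraded path: collapse whitespace only, no semantic mapping."""
    return {"raw": text, "normalized": " ".join(text.split())}


def normalize_with_fallback(text, primary, fallback=rule_based_normalize,
                            retries=2, base_delay=0.2, sleep=time.sleep):
    """Try `primary` with exponential backoff, then degrade to `fallback`.

    Returns (result, source), where source is "primary" or "fallback", so that
    callers and metrics can tell which path produced the record.
    """
    for attempt in range(retries + 1):
        try:
            return primary(text), "primary"
        except Exception:
            # Broad catch is deliberate in this sketch; a real pipeline would
            # likely narrow it to timeout, quota, and parsing errors.
            if attempt < retries:
                sleep(base_delay * (2 ** attempt))
    return fallback(text), "fallback"


# Usage with a primary that always fails, to show the degraded path.
def always_down(text):
    raise TimeoutError("LLM backend unavailable")


result, source = normalize_with_fallback("  Team  A   1.85 ", always_down,
                                         sleep=lambda s: None)
print(source, result)
```

Tagging each result with its source makes it possible to count degraded records on the dashboard and to reprocess them once the LLM backend recovers.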
---
### **Expected Deliverables**
* Fully working, **production-grade Python pipeline**
* All modules integrated and test-covered
* Dashboard / metrics panel (live stats like FPS, error count, duration)
* Finalized logging system with detailed traces
* [login to view URL] validation schema
* Clean README + setup guide
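
The live stats named in the deliverables (FPS, error count, duration) could be collected by a small tracker along these lines. This is a sketch only: `PipelineMetrics` and its field names are hypothetical, and a dashboard or CLI view would read `snapshot()` periodically.

```python
import time


class PipelineMetrics:
    """Minimal runtime tracker for frames processed, errors, and throughput."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock            # injectable for testing
        self._started = clock()
        self.frames = 0
        self.errors = 0

    def record_frame(self, ok: bool = True) -> None:
        """Count one processed frame; failed frames also count as errors."""
        self.frames += 1
        if not ok:
            self.errors += 1

    def snapshot(self) -> dict:
        """Return current totals plus average FPS since the tracker was created."""
        duration = self._clock() - self._started
        fps = self.frames / duration if duration > 0 else 0.0
        return {
            "frames": self.frames,
            "errors": self.errors,
            "duration_s": duration,
            "fps": fps,
        }


# Usage: record each frame's outcome, then poll snapshot() from the dashboard.
metrics = PipelineMetrics()
metrics.record_frame(ok=True)
metrics.record_frame(ok=False)
print(metrics.snapshot())
```

Note that `fps` here is an average over the whole run; a rolling window would track short-term throughput dips better.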
---
### **Timeline**
* **Timeline**: ~3–5 days
* Potential for ongoing collaboration (multi-sport support, analytics, arbitrage detection, mobile UI, etc.)
---
### **Ideal Candidate**
* Python + OpenCV + PaddleOCR (GPU) experience
* Gemini or LLM integration background (JSON I/O, prompt handling)
* Strong in pipeline design, config handling, logging, fallback strategies
* (Bonus): KeyDB, React/Flet dashboard, CI setup experience
---
### **Note**
This system is built entirely on **visual screen analysis** (not scraping or API access). The goal is near real-time processing (400–900 ms per frame) with high accuracy, achieved through layered validation and deduplication." (client-provided description)
Matched companies (6)

Junkies Coder

SJ Solutions & Infotech

El Codamics

Chirag Solutions

eShop Genius
