Business Client needs Software Development
Contact person: Business Client
Location: Atlanta, United States
Budget: Recommended by industry experts
Time to start: As soon as possible
Project description:
"I have large-scale PDFs that can reach a full gigabyte and I need an application that can take each file, shrink it dramatically, and return a smaller, fully searchable document. The tool must work on both Windows and macOS, and I want the compression logic to strike a careful balance between aggressive size reduction and legible text-and-image quality.
Beyond mere compression, every page has to be run through English-language OCR so that the final file is 100 % searchable. While doing that, the script should normalise the document’s typography: one standard font family, a single font size, and consistent text alignment throughout. When existing embedded text already meets those rules, leave it intact; otherwise convert it before saving.
Because some source files are enormous, memory-efficient streaming or chunk processing is essential. I’m open to whichever stack you feel is best—Python with PyPDF2, Ghostscript, qpdf, and Tesseract, or a compiled alternative—so long as the final deliverable runs natively on both operating systems without requiring users to install a full development environment.
Deliverables
• A runnable desktop or CLI utility for Windows and macOS (.exe / .app or cross-platform binary)
• Source code with clear build instructions
• A quick-start guide that shows how to feed in a 1 GB sample, tweak compression/quality settings, and verify OCR
• A short test report proving the tool can process at least one 1 GB PDF down to a significantly smaller, searchable file while preserving acceptable visual quality
I’ll test against several large books and engineering manuals, so predictable performance and stability under heavy files matter as much as the final size savings. Feel free to suggest enhancements if they fit within the core goal of smaller, cleaner, fully searchable PDFs." (client-provided description)
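One way the chunk-processing requirement in the description could be approached is sketched below in Python, assuming Ghostscript is the compression engine. The helper names (chunk_page_ranges, default_gs_executable, build_gs_command), the chunk size, and the "/ebook" quality preset are illustrative assumptions, not part of the client's brief. The sketch only builds the command lines; it does not run Ghostscript or perform OCR.

```python
import platform
from typing import List, Optional, Tuple


def chunk_page_ranges(total_pages: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split a document into 1-based, inclusive (first, last) page ranges."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]


def default_gs_executable() -> str:
    """Ghostscript's console binary is usually 'gswin64c' on Windows, 'gs' elsewhere."""
    return "gswin64c" if platform.system() == "Windows" else "gs"


def build_gs_command(
    src: str,
    dst: str,
    first_page: int,
    last_page: int,
    preset: str = "/ebook",  # assumption: a mid-range size/quality trade-off
    executable: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) a Ghostscript call that recompresses one page range."""
    return [
        executable or default_gs_executable(),
        "-sDEVICE=pdfwrite",
        f"-dPDFSETTINGS={preset}",
        f"-dFirstPage={first_page}",
        f"-dLastPage={last_page}",
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={dst}",
        src,
    ]


if __name__ == "__main__":
    # Hypothetical 1,000-page manual processed 250 pages at a time.
    for first, last in chunk_page_ranges(1000, 250):
        print(" ".join(build_gs_command("manual.pdf", f"part_{first}.pdf", first, last)))
```

Each command could then be passed to subprocess.run(cmd, check=True) and the resulting parts merged afterwards, for example with qpdf. Note that Ghostscript still has to open the full source file for every chunk, so this limits the work done per invocation rather than guaranteeing a fixed memory ceiling; that would need to be measured against the 1 GB samples.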
Matched companies (6)

HJP Media

Versasia Infosoft

El Codamics

Junkies Coder

Mobiweb Global Solutions
