Business Client needs Mobile App Development
Contact person: Business Client
Location: Tiruvannamalai, India
Budget: Recommended by industry experts
Time to start: As soon as possible
Project description:
"We are developing a prototype of an Indian language speech recognition and translation system using open-source technologies only (no Google, AWS, or Azure APIs).
The system should be capable of:
Converting speech to text in multiple Indian languages
Translating text between selected Indian languages
Providing basic speaker identification (diarization)
Offering a RESTful API interface for integration
This is a proof-of-concept (MVP) project focusing on 5–10 major Indian languages such as Hindi, Tamil, Telugu, Bengali, and Marathi. The goal is to build a functional base system that can later be expanded to cover 150+ languages and dialects.
---
Scope of Work
1. Audio Processing
Handle input formats (MP3, WAV, FLAC, M4A)
Perform noise reduction, normalization, and resampling to 16 kHz
Prepare data for speech recognition models
2. Speech Recognition (ASR)
Implement speech-to-text using open-source models (Meta MMS, Whisper, or Coqui STT)
Support multiple Indian languages
Provide language detection and transcription confidence scores
3. Text Translation
Use open-source translation models such as IndicTrans2 or MarianNMT
Enable bidirectional text translation between selected Indian languages
4. Speaker Diarization
Integrate speaker detection using [login to view URL] or a similar open-source tool
5. API Development
Develop RESTful API endpoints for speech-to-text and translation
Include basic authentication and documentation
6. Deliverables
Complete source code (Python preferred)
Deployment and configuration scripts
Technical and API documentation
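
The audio-processing step in item 1 could be sketched roughly as follows. This is a minimal, standard-library-only illustration under stated assumptions, not a prescribed implementation: the function names `resample_linear` and `peak_normalize` and the 0.95 peak level are hypothetical, the input is assumed to be already-decoded mono samples as floats in [-1, 1], and plain linear interpolation is used only for brevity (it applies no low-pass filter, so a real pipeline would more likely rely on a dedicated resampler from an audio library).

```python
TARGET_RATE = 16_000  # 16 kHz target rate named in the scope of work


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Resample mono float samples by linear interpolation (illustrative only)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # distance between output samples, in input samples
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = int(pos)
        if lo >= last:
            # Past the final input sample: hold the last value.
            out.append(samples[last])
            continue
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[lo + 1] * frac)
    return out


def peak_normalize(samples, peak=0.95):
    """Scale samples so the loudest one has absolute value `peak`."""
    loudest = max((abs(s) for s in samples), default=0.0)
    if loudest == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = peak / loudest
    return [s * gain for s in samples]
```

A caller would typically decode the MP3, WAV, FLAC, or M4A input to samples first, then apply `peak_normalize(resample_linear(samples, src_rate))` before passing the result to the recognition model.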
---
Preferred Tech Stack
Python, FastAPI, PyTorch
Whisper, MMS, IndicTrans2, [login to view URL]
Hugging Face Transformers
Docker for deployment
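
As one possible reading of the API requirement, the speech-to-text response could carry the detected language, confidence scores, and speaker-labelled segments in a single JSON payload. The sketch below uses only the standard library; every class and field name (`Segment`, `TranscriptionResponse`, `language_confidence`, and so on) is a hypothetical choice for illustration, and in a FastAPI service these shapes would more likely be declared as Pydantic models.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """One stretch of speech, optionally attributed to a speaker."""
    start: float                   # seconds from the start of the audio
    end: float                     # seconds from the start of the audio
    text: str
    speaker: Optional[str] = None  # e.g. a diarization label; None if unknown


@dataclass
class TranscriptionResponse:
    """Hypothetical response body for a speech-to-text endpoint."""
    language: str                  # detected language code, e.g. "ta"
    language_confidence: float     # confidence of the language detection
    text: str                      # full transcript
    confidence: float              # overall transcription confidence
    segments: List[Segment] = field(default_factory=list)


def to_json(response: TranscriptionResponse) -> str:
    """Serialize the response; ensure_ascii=False keeps Indic scripts readable."""
    return json.dumps(asdict(response), ensure_ascii=False)
```

Keeping diarization output inside `segments` means a client integrating the API gets transcript, language, and speaker turns from one call; the numeric values passed in are whatever the underlying models report.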
---
Deliverables and Timeline
Functional MVP covering 5–10 Indian languages
Duration: 6–8 weeks
Budget: ₹40,000 (fixed)
---
Required Skills
Experience with speech recognition and translation using open-source models
Familiarity with Indian language datasets (Bhashini, AI4Bharat, Common Voice)
Strong Python and API development experience
Ability to deliver clean, documented, and reproducible code
---
How to Apply
Please include the following in your proposal:
1. Short summary of relevant experience with ASR or translation systems
2. Example projects or GitHub repositories
3. Proposed timeline and approach for the MVP" (client-provided description)
Matched companies (4)

Chirag Solutions

April Innovations

Junkies Coder
