Business Client needs Software Development

Contact person: Business Client

Location: Ramat Gan, Israel

Budget: Recommended by industry experts

Time to start: As soon as possible

Project description:
"We are seeking a seasoned Entity Resolution (ER) Specialist to collaborate on building a high-performance matching pipeline for our music metadata. As the lead Python and JS developer who built the initial system, I will provide full support and infrastructure context.

Project Scope & Core Challenge
The system must link records between two existing databases: our internal MongoDB Atlas cluster and an external source such as Genius.

Crucially, in both databases, the hierarchical relationships (Artist → Albums → Tracks) are already established, and every entity has a unique ID.

The task is to build a process that leverages this existing structure:

Phase 1 (Core ER): Accurately link Artist IDs between the two databases.

Phase 2 (Cascade): Use the confirmed Artist links to efficiently and accurately cascade the matching process to the corresponding Album IDs and Track IDs.

Scale: The pipeline must efficiently handle ≈1.5 million tracks and identify/create new internal entities for millions of external records not yet in our database.

Database: Read and write operations must be optimized for MongoDB Atlas.
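
The cascade in Phase 2 can be sketched as follows. This is a minimal illustration, not the client's implementation: the function names, the 0.85 threshold, and the toy IDs and titles are all assumptions, and the standard library's difflib stands in for a dedicated fuzzy matcher. The point it shows is that once a parent link is confirmed, children are compared only inside that parent, which keeps the candidate set small.

```python
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace (a deliberately simple normalizer)."""
    return " ".join(title.lower().split())


def similarity(a: str, b: str) -> float:
    """Stdlib stand-in for a fuzzy ratio; returns a value in [0, 1]."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def cascade_children(parent_links, internal_children, external_children,
                     threshold=0.85):
    """Match children (albums or tracks) only within already-linked parents.

    parent_links: {internal_parent_id: external_parent_id}
    *_children:   {parent_id: [(child_id, title), ...]}
    Returns {internal_child_id: (external_child_id, score)}.
    """
    links = {}
    for internal_parent, external_parent in parent_links.items():
        candidates = external_children.get(external_parent, [])
        taken = set()  # keep the mapping one-to-one within a parent
        for child_id, title in internal_children.get(internal_parent, []):
            best = None
            for ext_id, ext_title in candidates:
                if ext_id in taken:
                    continue
                score = similarity(title, ext_title)
                if score >= threshold and (best is None or score > best[1]):
                    best = (ext_id, score)
            if best is not None:
                taken.add(best[0])
                links[child_id] = best
    return links


# Toy data (hypothetical IDs and titles).
artist_links = {"a1": "g9"}
internal_albums = {"a1": [("al1", "Blue Skies"), ("al2", "Night Drive")]}
external_albums = {"g9": [("x1", "Night Drive"), ("x2", "blue  skies"),
                          ("x3", "Quiet Harbor")]}
album_links = cascade_children(artist_links, internal_albums, external_albums)
```

The same function could be reused for tracks by passing the confirmed album links as parent_links.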

Technical Methodology & Collaboration
The matching core must be robust and high-speed:

ER Algorithms: The core should leverage the Record Linkage framework, supplemented by fast string-similarity scoring (RapidFuzz or an equivalent library) for optimal precision.

Performance: I expect highly efficient blocking/indexing to ensure speed.

Modularity: The logic must be clean and modular, allowing us (as developers) to easily tune weights, thresholds, and candidate generators in the future.
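
One way the three points above could fit together is sketched below. It is an assumption-laden illustration only: MatchConfig, the prefix blocking key, and the "country" field are hypothetical, and difflib from the standard library is used as a stand-in where the brief names RapidFuzz. Blocking restricts comparisons to records that share a key, and all weights and thresholds live in one config object so they can be tuned without touching the matching logic.

```python
from collections import defaultdict
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class MatchConfig:
    """Hypothetical tuning knobs, kept in one place."""
    name_weight: float = 0.8
    country_weight: float = 0.2
    threshold: float = 0.85
    block_prefix_len: int = 3


def normalize(name: str) -> str:
    """Lowercase and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def block_key(name: str, cfg: MatchConfig) -> str:
    """Candidate generator: only records sharing this key are compared."""
    return normalize(name)[: cfg.block_prefix_len]


def score(a: dict, b: dict, cfg: MatchConfig) -> float:
    """Weighted sum of name similarity and exact agreement on country."""
    name_sim = SequenceMatcher(
        None, normalize(a["name"]), normalize(b["name"])).ratio()
    country_sim = 1.0 if a.get("country") == b.get("country") else 0.0
    return cfg.name_weight * name_sim + cfg.country_weight * country_sim


def link_artists(internal, external, cfg=None):
    """Return {internal_id: (external_id, score)} for best matches above threshold."""
    cfg = cfg or MatchConfig()
    blocks = defaultdict(list)
    for rec in external:
        blocks[block_key(rec["name"], cfg)].append(rec)
    links = {}
    for rec in internal:
        candidates = blocks.get(block_key(rec["name"], cfg), [])
        scored = [(score(rec, cand, cfg), cand["id"]) for cand in candidates]
        if scored:
            best_score, best_id = max(scored)
            if best_score >= cfg.threshold:
                links[rec["id"]] = (best_id, round(best_score, 3))
    return links


# Toy records (hypothetical).
internal = [{"id": "a1", "name": "The Night Owls", "country": "IL"},
            {"id": "a2", "name": "Solo Act", "country": "US"}]
external = [{"id": "g1", "name": "the night owls!", "country": "IL"},
            {"id": "g2", "name": "Then Again", "country": "IL"},
            {"id": "g3", "name": "Solar Flare", "country": "US"}]
links = link_artists(internal, external)
```

A prefix key is the simplest possible blocker and misses names that differ in their first characters; a production pipeline would likely combine several candidate generators.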

Infrastructure & Acceptance Criteria
The solution must be production-ready for our existing environment.

Deployment Stack: Deliverables must be encapsulated in a Docker container ready for deployment to our AWS environment and integrated neatly into our existing CI/CD flow.

Acceptance Criteria: A job is complete when the container processes a provided sample of 100k records in under 60 minutes on an [login to view URL], and returns ≥95% precision and ≥90% recall at the track level.
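
The precision and recall thresholds could be checked with a small helper like the one below. This is a sketch under assumptions: it treats each link as an (internal_id, external_id) pair and compares predicted pairs against a labelled gold set; the function names and toy IDs are hypothetical.

```python
def precision_recall(predicted, gold):
    """Precision and recall of predicted link pairs against gold link pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def meets_acceptance(predicted, gold, min_precision=0.95, min_recall=0.90):
    """True when both track-level thresholds from the brief are met."""
    precision, recall = precision_recall(predicted, gold)
    return precision >= min_precision and recall >= min_recall


# Toy example: 20 gold track links, 19 of them recovered, none wrong.
gold_links = {(f"t{i}", f"g{i}") for i in range(20)}
predicted_links = {(f"t{i}", f"g{i}") for i in range(19)}
precision, recall = precision_recall(predicted_links, gold_links)
passed = meets_acceptance(predicted_links, gold_links)
```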

Deliverables
Production-ready Python project (PEP 8 compliant).

Dockerfile and compose file that build and run locally and in AWS.

Link Map collection written back to MongoDB with confidence scores.

Brief validation report summarising precision/recall on a held-out set.
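
A possible shape for the Link Map documents is sketched below. Every field name here is an assumption rather than the client's schema. The deterministic _id is a design choice that would make re-runs idempotent if the documents are written as upserts (for example with pymongo's bulk_write and UpdateOne(..., upsert=True)); the database write itself is left out so the sketch stays self-contained.

```python
from datetime import datetime, timezone


def build_link_docs(links, entity_type, source="genius", run_at=None):
    """Turn {internal_id: (external_id, confidence)} into Link Map documents.

    Field names are hypothetical; adjust them to the real collection schema.
    """
    run_at = run_at or datetime.now(timezone.utc)
    return [
        {
            "_id": f"{entity_type}:{internal_id}",  # stable key for upserts
            "entity_type": entity_type,
            "internal_id": internal_id,
            "external_source": source,
            "external_id": external_id,
            "confidence": round(float(confidence), 4),
            "matched_at": run_at,
        }
        for internal_id, (external_id, confidence) in sorted(links.items())
    ]


# Toy links (hypothetical IDs).
docs = build_link_docs({"t2": ("g7", 0.91234), "t1": ("g5", 1.0)}, "track")
```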

Timeline: We are ready to move immediately. Please respond only if you can begin right away and meet the ASAP delivery timeline." (client-provided description)


Matched companies (4)

Kiantechwise Pvt. Ltd.

Kiantechwise is a creative tech company delivering innovative web design, software solutions, branding, and digital marketing. With expertise and vis…

TechGigs LLP

We deliver cutting-edge technology solutions to businesses of all sizes. From mobile and web development to AR/VR, AI, and enterprise software, our t…

Omninos Technologies International pvt ltd

Omninos Technologies offers full-stack mobile and web development services with a specialty in ready-made app clones to accelerate launch timelines a…

Knowforth Tech

Empowering Businesses with Tailored Software & AI Solutions.