Kamil Lee needs IT Projects

Contact person: Kamil Lee

Location: Remote Cooperation

Budget: Recommended by industry experts

Time to start: As soon as possible

Project description:
"I need help building realistic, terminal-based STEM research tasks used to evaluate frontier AI models (GPT, Gemini, etc.).

What you'll build: A self-contained coding task that looks like real research work (analyzing datasets, running simulations, validating hypotheses, comparing methods). Not a textbook problem.

Each submission must include:
- instruction.md (workflow, inputs, outputs, success criteria)
- Reproducible Docker environment with data
- Oracle solution (solve.sh) that fully solves the task
- Deterministic tests for verification
- task.toml metadata
All packaged into one zip.

Quality bar:
- Multi-step, research-grade workflow
- Hard enough that frontier models fail more than 80% of the time
- Oracle passes local tests 3 out of 3 times
- Objectively verifiable outputs
- No LLM-generated content allowed

Who's a fit: STEM background (biology, chemistry, physics, ML, data science, etc.) with strong Python and Docker skills.

Payout: $100 per accepted submission." (client-provided description)
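As an illustration of the packaging requirement, here is a minimal sketch of a pre-submission check that a zip contains the three files the description names explicitly. The function name `missing_files` is hypothetical. The listing does not say where the files sit inside the archive, so the sketch assumes they can be at any depth and compares base names only. It also does not say what the Docker and test files are called, so those are not checked.

```python
import zipfile
from pathlib import PurePosixPath

# File names taken from the project description. The archive layout
# (top-level vs. nested folders) is an assumption, so only base names are compared.
REQUIRED_FILES = ("instruction.md", "solve.sh", "task.toml")


def missing_files(zip_path, required=REQUIRED_FILES):
    """Return the required file names that do not appear anywhere in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Zip member names always use forward slashes, hence PurePosixPath.
        present = {PurePosixPath(name).name for name in archive.namelist()}
    return [name for name in required if name not in present]
```

An empty list from `missing_files("submission.zip")` means all three named files are present. It says nothing about whether the oracle or the tests actually pass.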

Additional information: "No description" (admin-provided information)


Matched companies (2)

Crystal Infoway

Crystal Infoway is a well-known IT Service Provider who works to Bring Ideas to Reality. We work to shape the dreams victoriously using Design, Techn…

Conchakra Technologies Pvt Ltd

At Conchakra, our mission is to empower organizations through innovative software solutions that leverage the transformative potential of artificial …