About this role
Role Overview
This position involves collaborating with a prominent AI research lab to enhance frontier AI coding models through structured technical assessments. Contributors will engage in realistic machine learning engineering workflows, focusing on model evaluation and improvement.
Key Responsibilities
- Utilize frontier AI coding agents to execute and assess complex machine learning and AI engineering tasks.
- Review model-generated implementations, including model training, inference systems, MLOps, and LLM applications.
- Identify bugs, edge cases, performance issues, and failure modes in AI models.
- Compare outputs from various frontier models, assessing their strengths and weaknesses.
- Apply professional engineering judgment to realistic machine learning engineering scenarios.
This is a sprint-based project requiring a time commitment of 12-24 hours based on client needs.
Compensation
Compensation is set at $400 per accepted task, with typical tasks taking approximately 2-3 hours after an initial ramp-up period.
Eligibility
- Minimum of 2 years of professional experience in machine learning engineering.
- Experience in building production ML systems, model deployment infrastructure, LLM applications, or AI-powered products.
- Regular use of AI coding agents such as Cursor, Claude Code, Codex, Windsurf, Gemini CLI, or similar tools.
- Ability to evaluate model-generated machine learning implementations and assess technical tradeoffs.
- Experience deploying ML systems to production is preferred.