About this role
Role Overview
Join a cutting-edge AI research initiative focused on enhancing frontier AI coding models. As a contributor, you will engage in structured technical assessments that evaluate and improve coding agents, working on realistic infrastructure engineering workflows.
Key Responsibilities
- Utilize frontier AI coding agents to execute and assess complex infrastructure engineering tasks.
- Review implementations generated by models, focusing on cloud platforms, Kubernetes, CI/CD systems, observability, and infrastructure automation.
- Identify bugs, edge cases, reliability issues, and potential failure modes.
- Compare outputs from various frontier models to evaluate their strengths and weaknesses.
- Apply professional engineering judgment to practical infrastructure engineering scenarios.
This is a sprint-based project; contributors commit to work in 12-24 hour stretches, scheduled according to client needs.
Compensation
Earn $400 for each accepted task. Typical tasks require approximately 2-3 hours of work after an initial ramp-up period, with compensation directly tied to accepted work.
Qualifications
- Minimum of 2 years of professional experience in DevOps, SRE, or Cloud Engineering.
- Proficiency with AWS, Azure, GCP, Kubernetes, Terraform, CI/CD pipelines, or observability tools.
- Regular experience using AI coding agents such as Cursor, Claude Code, Codex, Windsurf, Gemini CLI, or similar tools.
- Ability to evaluate model-generated infrastructure and reliability engineering solutions.
- Experience with production-scale systems is preferred.