Role Overview

This role supports a Frontier Code Agents initiative at a leading AI research lab. The work focuses on evaluating and improving advanced AI coding models through structured technical assessments of realistic infrastructure engineering workflows and model outputs.

Key Responsibilities

Use frontier AI coding agents to perform and evaluate complex infrastructure engineering tasks.
Review model-generated implementations that involve cloud platforms, Kubernetes, CI/CD systems, observability, and infrastructure automation.
Identify bugs, edge cases, reliability problems, and likely failure modes in model outputs.
Compare outputs from multiple frontier models, assessing relative strengths and weaknesses.
Apply professional engineering judgment to realistic infrastructure and reliability engineering scenarios.

Qualifications

Minimum 2 years of professional experience in DevOps, SRE, or Cloud Engineering.
Hands-on experience with one or more cloud platforms, such as AWS, Azure, or GCP.
Familiarity with Kubernetes, Terraform, CI/CD pipelines, and observability tooling.
Regular use of AI coding agents such as Cursor, Claude Code, Codex, Windsurf, Gemini CLI, or similar tools.
Ability to evaluate model-generated infrastructure and reliability engineering solutions.
Experience supporting production-scale systems is preferred.

Work Terms

Remote role.
Hourly, sprint-based engagement, with project sprints running in 12- to 24-hour stretches depending on client requirements.
Spots are limited and are filled on a first-come, first-served basis.

Compensation

$400 paid per accepted task.
Typical tasks take approximately 2 to 3 hours of work after ramp-up.
Compensation is tied to accepted work.
The listing metadata gives a reference rate of $85 per hour.

Eligibility

The role is open to remote applicants. The listing provides no specific work authorization or sponsorship information; applicants should confirm they are eligible to work in their location.

DevOps Engineer for AI Model Evaluation
