SaidGig

DevOps Engineer for AI Model Evaluation

$85/hr

Remote | Contract | Technology

About this role

Role Overview

Join a cutting-edge AI research initiative focused on enhancing frontier AI coding models. As a contributor, you will engage in structured technical assessments that evaluate and improve coding agents, working on realistic infrastructure engineering workflows.

Key Responsibilities
  • Utilize frontier AI coding agents to execute and assess complex infrastructure engineering tasks.
  • Review implementations generated by models, focusing on cloud platforms, Kubernetes, CI/CD systems, observability, and infrastructure automation.
  • Identify bugs, edge cases, reliability issues, and potential failure modes.
  • Compare outputs from various frontier models to evaluate their strengths and weaknesses.
  • Apply professional engineering judgment to practical infrastructure engineering scenarios.

Work Terms

This is a sprint-based project that requires commitment in stretches of 12 to 24 hours, depending on client needs.

Compensation

Earn $400 for each accepted task. After an initial ramp-up period, a typical task takes approximately 2-3 hours of work. Compensation is tied directly to accepted work.

Qualifications
  • Minimum of 2 years of professional experience in DevOps, SRE, or Cloud Engineering.
  • Proficiency with AWS, Azure, GCP, Kubernetes, Terraform, CI/CD pipelines, or observability tools.
  • Regular experience using AI coding agents such as Cursor, Claude Code, Codex, Windsurf, Gemini CLI, or similar tools.
  • Ability to evaluate model-generated infrastructure and reliability engineering solutions.
  • Experience with production-scale systems is preferred.
