About this role
Join an innovative initiative to build realistic enterprise environments for training and evaluating frontier AI agents. This role focuses on leveraging your expertise in cloud computing to recreate the digital workspaces of major hyperscalers and Fortune 500 enterprises, designing tasks that challenge state-of-the-art AI.
Key Responsibilities
- Build a realistic digital workspace centered on the Drive folders you use daily, including architecture docs, runbooks, RFCs, incident post-mortems, capacity plans, cost reports, SRE review decks, and email threads, while incorporating relevant platforms like HashiCorp Terraform and Datadog.
- Design multi-step tasks based on your real workflows that require navigating multiple applications, files, and stakeholders to challenge frontier AI agents meaningfully.
- Collaborate with other cloud-computing experts to design the environment, shape task scope, and review scenarios for realism and rigor.
- Work asynchronously with research teams to refine task designs and evaluation criteria for cloud-computing agent benchmarks.
- Contribute to frontier AI research and benchmarking, directly informing how leading labs train and evaluate the next generation of AI systems.
Qualifications
- 3+ years of full-time experience at a major hyperscaler (AWS, Azure, GCP, Oracle Cloud), a cloud-data platform (Snowflake, Databricks), or a Fortune 500 platform/infrastructure team.
- Background in one or more areas such as:
  - Cloud architecture/solutions engineering (multi-account, multi-region, hybrid).
  - Site reliability engineering or production engineering.
  - Platform/developer-experience engineering (IaC, internal developer platforms).
  - DevOps/DevSecOps, CI/CD, or container/Kubernetes operations.
  - Cloud security, compliance (SOC 2, ISO 27001, FedRAMP), or cloud FinOps.
- Certifications a plus: AWS Solutions Architect/SysOps/DevOps, Azure Solutions Architect, GCP Professional Cloud Architect, CKA/CKAD.
- Day-to-day use of HashiCorp Terraform/Pulumi, Splunk/Datadog, GitHub Actions/CircleCI, and Okta/Microsoft Entra ID.
- Strong analytical thinking and writing skills, with the ability to translate cloud-ops workflows into structured task specifications.
Compensation starts at an effective hourly rate and then transitions to a model based on the throughput of quality work rather than a flat hourly rate.
About Us
We are a talent marketplace connecting top experts with leading AI labs and research organizations, backed by prominent investors. Thousands of professionals contribute to projects shaping the next generation of AI systems.