About this role
Join an innovative initiative to build realistic enterprise environments for training and evaluating frontier AI agents. This role requires experienced civic and public-administration professionals from federal agencies, large state and city governments, and major federal contractors to recreate the digital workspaces they manage daily and design tasks that challenge state-of-the-art AI.
Your expertise in public-sector program management, policy implementation, procurement, or public-finance administration will be essential in constructing a high-fidelity environment that mirrors the tools, files, and workflows of a modern government enterprise, while authoring tasks grounded in the programs you currently oversee.
Key Responsibilities
- Build a realistic digital workspace centered on the Drive folders you use day-to-day, including policy memos, program plans, budget justifications, RFPs, constituent correspondence, council/board materials, performance reports, and email threads, along with representation of supporting platforms (e.g., Salesforce Government Cloud, Granicus, Socrata, ArcGIS Hub, DocuSign).
- Design multi-step tasks based on your real workflows that require navigating multiple apps, files, and stakeholders, effectively challenging frontier AI agents.
- Collaborate with other civic and public-administration experts to design the environment, shape task scope, and review scenarios for realism and rigor.
- Work asynchronously with research teams to refine task designs and evaluation criteria for public-sector agent benchmarks.
- Contribute to frontier AI research and benchmarking, with your work directly informing how leading labs train and evaluate the next generation of AI systems.
Qualifications
- 3+ years of full-time experience at a federal agency, large state/city government, or major federal contractor.
- Background in one or more areas such as:
  - Federal program management, policy implementation, or performance reporting (GPRAMA).
  - Government procurement/contracting (FAR, DFARS, state procurement).
  - Public finance, budgeting, or grants management.
  - Public safety, social services, or licensing/benefits administration.
  - Legislative affairs, intergovernmental relations, or municipal operations.
- Certifications a plus: PMP, CGFM, FAC-C/DAWIA.
- Day-to-day use of Salesforce Government Cloud, Granicus, Socrata, or Esri ArcGIS Hub, and DocuSign.
- Strong analytical thinking and writing skills, with the ability to translate public-sector workflows into structured task specifications.
This project will initially pay an effective hourly rate, then transition to a compensation model based on the throughput of quality work rather than a flat hourly rate.
About
The company is a talent marketplace connecting top experts with leading AI labs and research organizations, backed by notable investors. Thousands of professionals contribute to projects shaping the next generation of AI systems.