As a Bioinformatics & Single-Cell Genomics Expert, you will play a crucial role in developing a large-scale benchmark aimed at evaluating the capabilities of advanced AI systems in addressing complex scientific and engineering challenges. Your primary responsibility will be to design intricate computational problems that assess whether AI can effectively utilize real scientific software for research-level tasks, including running simulations, interpreting results, designing experiments, and extracting hidden insights from data.

This role goes beyond typical data-labeling tasks; you will create original, graduate-level problems rooted in authentic scientific workflows, rigorously testing them against cutting-edge AI models and refining them to achieve the appropriate level of difficulty.

Key Responsibilities

Design problems requiring the adept use of specialized scientific software, including tasks that involve computing exact answers from defined setups and planning queries or experiments to uncover non-visible information.
Engage in a testing loop with state-of-the-art AI models, refining problems until they meet targeted difficulty levels.

Domains & Tools We're Hiring For

We are particularly interested in candidates with extensive, hands-on experience in:

Bioinformatics & Single-Cell Genomics: Proficiency with tools such as scanpy, scvelo, squidpy, and gudhi for single-cell RNA-seq analysis, trajectory inference, spatial transcriptomics, and topological data analysis.

You should be adept at designing problems related to cell-type annotation, pseudotime ordering, multi-omic integration, spatially variable gene identification, and persistence-based analysis pipelines. This domain is our highest-throughput area and the initial focus of our scaling efforts.

What Makes a Strong Candidate

The ideal candidate will possess graduate-level expertise (MS or PhD preferred) in the relevant domain, with substantial hands-on experience using the specified tools. You should have a proven track record of writing code with these libraries to solve real research problems, along with a deep understanding of their limitations, edge cases, and the nuances that distinguish genuinely challenging problems from merely complicated ones.

Additionally, strong candidates will exhibit puzzle-design thinking, crafting problems where the challenge arises from intelligent reasoning rather than mere computation, where multiple plausible approaches exist, and where careful analysis is essential to uncover the correct solution.

Requirements

Graduate-level training in a relevant STEM field (MS, PhD, or equivalent research experience)
Proven proficiency with at least one of the specified scientific software libraries, demonstrated through research publications, open-source contributions, or professional experience
Strong Python programming skills for writing problem setups, oracle functions, and solution validators
Ability to work independently and iterate on problem designs based on feedback
Comfort operating in a Linux/terminal environment with remote compute sandboxes
Availability for at least 15-20 hours per week

Nice to Have

Experience across multiple listed domains or tools
Familiarity with benchmark or evaluation design
Background in scientific teaching or exam/problem-set design
Experience with computational reproducibility and containerized environments

Please note that this application process includes a coding assessment as part of the evaluation.

Bioinformatics & Single-Cell Genomics Task Designer
