About this role
As a Bioinformatics & Single-Cell Genomics Expert, you will play a crucial role in developing a large-scale benchmark aimed at evaluating the capabilities of advanced AI systems in addressing complex scientific and engineering challenges. Your primary responsibility will be to design intricate computational problems that assess whether AI can effectively utilize real scientific software for research-level tasks, including running simulations, interpreting results, designing experiments, and extracting hidden insights from data.
This role goes beyond typical data-labeling tasks; you will create original, graduate-level problems rooted in authentic scientific workflows, rigorously testing them against cutting-edge AI models and refining them to achieve the appropriate level of difficulty.
Key Responsibilities
- Design problems requiring the adept use of specialized scientific software, including tasks that involve computing exact answers from defined setups and planning queries or experiments to uncover non-visible information.
- Engage in a testing loop with state-of-the-art AI models, refining problems until they meet targeted difficulty levels.
We are particularly interested in candidates with extensive, hands-on experience in:
- Bioinformatics & Single-Cell Genomics: Proficiency with tools such as scanpy, scvelo, squidpy, and gudhi for single-cell RNA-seq analysis, trajectory inference, spatial transcriptomics, and topological data analysis.
You should be adept at designing problems related to cell-type annotation, pseudotime ordering, multi-omic integration, spatially variable gene identification, and persistence-based analysis pipelines. This domain is our highest-throughput area and the initial focus of our scaling efforts.
What Makes a Strong Candidate
The ideal candidate will possess graduate-level expertise (MS or PhD preferred) in the relevant domain, with substantial hands-on experience using the specified tools. You should have a proven track record of writing code with these libraries to solve real research problems, along with a deep understanding of their limitations, edge cases, and the nuances that distinguish genuinely challenging problems from merely complicated ones.
Additionally, strong candidates will exhibit puzzle-design thinking, crafting problems where the challenge arises from intelligent reasoning rather than mere computation, where multiple plausible approaches exist, and where careful analysis is essential to uncover the correct solution.
Requirements
- Graduate-level training in a relevant STEM field (MS, PhD, or equivalent research experience)
- Proven proficiency with at least one of the specified scientific software libraries, demonstrated through research publications, open-source contributions, or professional experience
- Strong Python programming skills for writing problem setups, oracle functions, and solution validators
- Ability to work independently and iterate on problem designs based on feedback
- Comfortable operating in a Linux/terminal environment with remote compute sandboxes
- Availability for at least 15-20 hours per week
Preferred Qualifications
- Experience across multiple listed domains or tools
- Familiarity with benchmark or evaluation design
- Background in scientific teaching or exam/problem-set design
- Experience with computational reproducibility and containerized environments
Please note that this application process includes a coding assessment as part of the evaluation.