Data Engineer for AI Benchmark Evaluation
from $50/hour
Role Overview
This role contributes to benchmark-driven evaluation projects focused on real-world data engineering and data science workflows. As a Software Engineer specializing in Data Engineering and Data Science, you will work hands-on with production-like datasets, data pipelines, and data science tasks to evaluate and improve the performance of advanced AI systems. The ideal candidate has a strong foundation in both data engineering and data science and can handle data preparation, analysis, and model-related workflows within real-world codebases.
Key Responsibilities
- Work with structured and unstructured datasets to support SWE-bench-style evaluation tasks.
- Design, build, and validate data pipelines used in benchmarking and evaluation workflows.
- Perform data processing, analysis, feature preparation, and validation for data science use cases.
- Write, run, and modify Python code to process data and support experiments locally.
- Evaluate data quality, transformations, and outputs for correctness and reproducibility.
- Create clean, well-documented, and reusable data workflows suitable for benchmarking.
- Participate in code reviews to ensure high standards of code quality and maintainability.
- Collaborate with researchers and engineers to design challenging, real-world data engineering and data science tasks for AI systems.
Qualifications
- At least 3 years of overall experience as a Data Engineer, Data Scientist, or data-focused Software Engineer.
- Strong proficiency in Python for data engineering and data science workflows.
- Demonstrable experience with data processing, analysis, and model-related workflows.
- Solid understanding of machine learning and data science fundamentals.
- Experience working with structured and unstructured data.
- Ability to understand, navigate, and modify complex, real-world codebases.
- Experience writing readable, reusable, maintainable, and well-documented code.
- Strong problem-solving skills, including experience with algorithmic or data-intensive problems.
- Excellent spoken and written English communication skills.
Work Terms
- Commitment Required: At least 4 hours per day and 20 hours per week, including 4 hours of overlap with PST working hours.
- Engagement Type: Contractor assignment (no medical/paid leave).
- Duration of Contract: 3 months (adjustable based on engagement).
Compensation
Compensation details will be discussed during the interview process.
Eligibility
- This position is fully remote.
- Opportunity to work on cutting-edge AI projects with leading LLM companies.