Senior Software Engineer, C++ for LLM Evaluation
from $40/hour
About this role
Role Overview: You will work on projects that build evaluation and training datasets for large language models (LLMs) in realistic software engineering contexts. You will develop verifiable software engineering tasks from public repository histories, using a synthetic, human-in-the-loop approach to broaden dataset coverage across programming languages and difficulty levels.
Key Responsibilities:
- Analyze and triage GitHub issues across trending open-source libraries.
- Set up and configure code repositories, including Dockerization and environment setup.
- Evaluate unit test coverage and quality.
- Modify and run codebases locally to assess LLM performance in bug-fixing scenarios.
- Collaborate with researchers to design and identify repositories and issues that present challenges for LLMs.
- Lead a team of junior engineers on collaborative projects.
Qualifications:
- Minimum of 3 years of overall experience.
- Strong experience with C++.
- Proficiency with Git, Docker, and basic software pipeline setup.
- Ability to understand and navigate complex codebases.
- Comfortable running, modifying, and testing real-world projects locally.
- Experience contributing to or evaluating open-source projects is a plus.
Nice to Have:
- Previous participation in LLM research or evaluation projects.
- Experience building or testing developer tools or automation agents.
Work Terms:
- Commitment Required: At least 4 hours per day and 20 hours per week, including 4 hours of overlap with PST. Time commitment options are 20, 30, or 40 hours per week.
- Employment Type: Contractor assignment (no medical/paid leave).
- Location: Open to candidates in India, Pakistan, Nigeria, Kenya, Egypt, Ghana, Bangladesh, Turkey, and Mexico.
Compensation: Competitive and commensurate with experience.
Eligibility: Candidates must be eligible to work in the specified locations without requiring sponsorship.
Evaluation Process: Two rounds of interviews: a 60-minute technical interview followed by a 30-minute technical and cultural discussion.