Join a high-impact AI research project as a Bilingual Spanish Generalist Evaluator Expert. You will use your writing skills to create and evaluate pairs of prompts and golden answers in Spanish and English. This role is designed for native Spanish speakers from the United States, Spain, Chile, or Mexico who have a deep understanding of local language nuances and cultural contexts.

This flexible, short-term opportunity is ideal for professionals who excel in language mastery, critical thinking, and instructional clarity. You will distill complex concepts into well-crafted, culturally grounded Spanish text while ensuring technical precision in English.

Key Responsibilities

Multilingual Prompt Design & Optimization: Create detailed prompts in Spanish and/or English, ensuring natural phrasing and real-world relevance for Spanish-speaking users in the United States, Spain, Chile, and Mexico.
Define and Document Evaluation Standards: Establish expectations for correct responses in consumer contexts and develop comprehensive rubrics that reflect linguistic nuances and cultural conventions.
Model Testing and Grading (Bilingual): Assess model outputs for accuracy, fluency, and cultural fit in Spanish, comparing results against English as necessary.
Benchmarking & Quality Assurance: Collaborate in QA review processes to ensure prompt tasks and rubrics maintain consistency and reliability across Spanish-language benchmarks.

Minimum Qualifications

Native-level written fluency in Spanish as used in the United States, Spain, Chile, or Mexico, with strong reading and writing ability in English.
Must be native to the United States, Spain, Chile, or Mexico and have lived in or spent significant time in-country.
BS or BA from a reputable institution (completed or in progress).
Strong writing and critical thinking skills.
Ability to work independently and meet deadlines.
Familiarity with ChatGPT or similar tools.
Based in the United States, Spain, Chile, or Mexico, or able to produce culturally accurate Spanish aligned with one of these regions.

Preferred Qualifications

Experience in teaching, research, editing, or academic writing.
Experience creating evaluation criteria, rubrics, or grading guidelines.
Familiarity with LLMs, prompting, or model evaluation (helpful but not required).

Work Terms

Commitment of at least 20 hours per week.
Engagement duration of approximately 2–4 months.
Structured project environment with clear goals and tools.

Compensation

Hourly rate ranging from $12 to $37.

Eligibility

All qualified applicants will be considered without regard to legally protected characteristics.
Reasonable accommodations will be provided upon request.

Bilingual Spanish Language Model Evaluator
