Role Overview

This position focuses on enhancing the safety of AI systems through proactive testing and vulnerability assessment. As part of a dedicated red team, you will engage in adversarial testing of AI models, identifying weaknesses and generating crucial data that contributes to the overall safety of AI for our customers.

Key Responsibilities

Red team conversational AI models and agents using jailbreaks, prompt injections, misuse cases, bias exploitation, and multi-turn manipulation.
Generate high-quality human data by annotating failures, classifying vulnerabilities, and flagging systemic risks.
Apply structured methodologies by following taxonomies, benchmarks, and playbooks to ensure consistent testing.
Document findings reproducibly, producing reports, datasets, and attack cases that customers can act on.

Qualifications

Prior experience in red teaming, including AI adversarial work, cybersecurity, or socio-technical probing.
A curious and adversarial mindset, with a tendency to push systems to their limits.
Structured approach to testing that relies on frameworks or benchmarks rather than random hacks.
Strong communication skills to clearly explain risks to both technical and non-technical stakeholders.
Adaptability to thrive in a dynamic environment with varying projects and customers.

Nice-to-Have Specialties

Experience with adversarial machine learning, including jailbreak datasets, prompt injection, RLHF/DPO attacks, and model extraction.
Background in cybersecurity, including penetration testing, exploit development, and reverse engineering.
Knowledge of socio-technical risk assessment, such as harassment/disinformation probing, abuse analysis, and conversational AI testing.
Creative probing skills, with interests in psychology, acting, or writing that foster unconventional adversarial thinking.

What Success Looks Like

Identifying vulnerabilities that automated tests overlook.
Delivering reproducible artifacts that enhance the security of customer AI systems.
Expanding evaluation coverage by testing more scenarios, so that fewer surprises reach production.
Building trust among customers regarding the safety of their AI systems through thorough adversarial probing.

Why Join Us

Gain valuable experience in human-data-driven AI red teaming at the forefront of safety, and play a direct role in making AI systems more robust, safe, and trustworthy.

AI Safety Expert for English and Punjabi
