Adversarial Prompt Engineer
Up to $65/hr (depending on the project)
About this role
Adversarial Prompt Engineers work on project-based initiatives that probe large language models for failure modes and harmful outputs. The role involves crafting prompts and scenarios that rigorously test model guardrails, exploring new methods of bypassing restrictions, and systematically documenting the results. By adopting an adversarial mindset, you will uncover weaknesses in the models and share your findings with engineers and safety researchers to strengthen system defenses.
Key Responsibilities
- Develop prompts and scenarios to test the robustness of large language models.
- Identify and explore creative methods to bypass model restrictions.
- Document findings and outcomes systematically for further analysis.
- Collaborate with engineering and safety teams to share insights and improve model defenses.
Requirements
- Experience with large language models and understanding of their limitations.
- Strong analytical skills and a creative approach to problem-solving.
- Ability to work independently and manage project-based tasks effectively.
Part-time, remote work arrangement.
Compensation
Up to $65/hr (depending on the project).
Eligibility
Open to candidates with relevant experience and skills.