SaidGig

Document Sourcing Specialist for AI Training

$20–$66/hr

Remote | Full-time | Technology | Updated Jun 11, 2026

About this role

Role Overview

As a Document Sourcing Specialist, you will help shape the training of next-generation AI systems by supplying high-quality, real-world documents. Your domain expertise will directly influence how models learn and perform. The position is open to individuals with relevant domain knowledge; no prior AI experience is required.

Key Responsibilities
  • Source publicly available documents from platforms such as government archives, academic repositories, open datasets, and licensed open-source documentation.
  • Verify and document the license type of each sourced document, confirming that it is covered by an accepted license such as CC0, CC-BY, MIT, or Apache 2.0 (or equivalent).
  • Log critical metadata for each submission, including source URLs and full license details, in designated tracking tools.
  • Flag and annotate any issues related to ownership, unclear licensing, paywalled access, or content with non-commercial usage restrictions.
  • Collaborate with data engineering and compliance teams to clarify requirements and resolve sourcing ambiguities.
  • Maintain up-to-date knowledge of open data best practices, licensing changes, and repository navigation strategies.
  • Communicate findings and unresolved issues clearly in both written and verbal form, supporting documentation integrity and compliance audits.
Qualifications
  • Exceptional attention to detail and ability to accurately review complex licensing and compliance information.
  • Experience sourcing documents from repositories such as SEC EDGAR, arXiv, Kaggle, and GitHub.
  • Proficiency in academic research, data collection, and public records searching.
  • Strong written and verbal communication skills, able to articulate findings and collaborate remotely.
  • Demonstrated ability to distinguish between open and restricted content, and to identify potential sourcing risks.
  • Comfort working independently in a fast-paced, remote environment with evolving priorities.
  • Highly organized, reliable, and adept at managing and documenting large volumes of information.
Preferred Qualifications
  • Prior experience supporting AI or machine learning projects with high-quality data sourcing.
  • Familiarity with open-source licensing and data compliance regulations.
  • Background in academic research, information science, or legal review.
Work Terms

This is a contract position, available on a full-time or part-time basis, with remote work flexibility.

Compensation

Hourly pay ranges from $20 to $66, based on experience and qualifications.

Eligibility

This position is open to candidates with relevant domain knowledge, and no specific prior experience in AI is required.
