Document Sourcing Specialist for AI Training
$20–$66/hr
Remote | Full-time | Technology | Updated Jun 11, 2026
About this role
Role Overview
As a Document Sourcing Specialist, you will play a crucial role in shaping the training of next-generation AI systems. Your expertise will directly influence how models learn and perform by providing high-quality, real-world input. This position is open to individuals with domain knowledge, and no prior AI experience is required.
Key Responsibilities
- Source publicly available documents from platforms such as government archives, academic repositories, open datasets, and licensed open-source documentation.
- Verify and document the license type of each sourced document, confirming that it is released under a permitted license such as CC0, CC-BY, MIT, or Apache 2.0 (or an equivalent).
- Log critical metadata for each submission, including source URLs and full license details, in designated tracking tools.
- Flag and annotate any issues related to ownership, unclear licensing, paywalled access, or content with non-commercial usage restrictions.
- Collaborate with data engineering and compliance teams to clarify requirements and resolve sourcing ambiguities.
- Maintain up-to-date knowledge of open data best practices, licensing changes, and repository navigation strategies.
- Communicate findings and unresolved issues clearly in both written and verbal form, supporting documentation integrity and compliance audits.
Required Qualifications
- Exceptional attention to detail and ability to accurately review complex licensing and compliance information.
- Experience sourcing documents from repositories such as SEC EDGAR, arXiv, Kaggle, and GitHub.
- Proficiency in academic research, data collection, and public records searching.
- Strong written and verbal communication skills, able to articulate findings and collaborate remotely.
- Demonstrated ability to distinguish between open and restricted content, and to identify potential sourcing risks.
- Comfort working independently in a fast-paced, remote environment with evolving priorities.
- Highly organized, reliable, and adept at managing and documenting large volumes of information.
Preferred Qualifications
- Prior experience supporting AI or machine learning projects with high-quality data sourcing.
- Familiarity with open-source licensing and data compliance regulations.
- Background in academic research, information science, or legal review.
Contract position available in full-time or part-time capacity, with remote work flexibility.
Compensation
Hourly pay ranges from $20 to $66, based on experience and qualifications.
Eligibility
This position is open to candidates with relevant domain knowledge, and no specific prior experience in AI is required.