About this role
Role Overview
As a Document Sourcing Specialist, you will play a crucial role in ensuring the quality of data used for AI training by meticulously identifying, verifying, and sourcing open-access documents from reputable repositories. Your attention to detail and commitment to compliance will directly impact the integrity of AI models.
Key Responsibilities
- Source publicly available documents from platforms such as government archives, academic repositories, open datasets, and licensed open-source documentation.
- Verify and document the license type of each sourced document, confirming that it is released under an accepted open license such as CC0, CC-BY, MIT, or Apache 2.0 (or equivalent).
- Log critical metadata for each submission, including source URLs and full license details, in designated tracking tools.
- Flag and annotate any issues related to ownership, unclear licensing, paywalled access, or content with non-commercial usage restrictions.
- Collaborate with data engineering and compliance teams to clarify requirements and resolve sourcing ambiguities.
- Maintain up-to-date knowledge of open data best practices, licensing changes, and repository navigation strategies.
- Communicate findings and unresolved issues clearly in both written and verbal form to support documentation integrity and compliance audits.
Qualifications
- Exceptional attention to detail and ability to accurately review complex licensing and compliance information.
- Experience sourcing documents from repositories such as SEC EDGAR, arXiv, Kaggle, and GitHub.
- Proficiency in academic research, data collection, and public records searching.
- Strong written and verbal communication skills, able to articulate findings and collaborate remotely.
- Demonstrated ability to distinguish between open and restricted content, and to identify potential sourcing risks.
- Comfort working independently in a fast-paced, remote environment with evolving priorities.
- Highly organized, reliable, and adept at managing and documenting large volumes of information.
Employment Type: Contract (full-time or part-time)
Compensation
Hourly rate ranges from $20 to $66.
Eligibility
This position is fully remote.