Senior Software Engineer - Data at Allen Institute for AI | Seattle, WA

On-site | Full-time | $126K/yr - $189K/yr

Experience Level

Senior

Qualifications

Essential Qualifications:

  • Bachelor's degree in Computer Science or a related field.
  • 8+ years of experience in data engineering or a similar role.
  • Proficiency in developing and optimizing data pipelines.
  • Strong background in machine learning techniques and their application to data quality enhancement.
  • Experience with API design and implementation.
  • Excellent communication skills and a collaborative mindset.

About the job

Join our team at the Allen Institute for AI in Seattle, where we are dedicated to pioneering advancements in artificial intelligence research. This position requires on-site collaboration, with specific arrangements varying by team. Inquire with your recruiter for details.
Our competitive salary range for this role is between $126,000 and $189,000, supplemented by an attractive bonus structure.

About You:

We are seeking a talented Senior Data Engineer to enhance the data infrastructure that powers our AI research initiatives. You will significantly contribute to the Semantic Scholar corpus by expanding its scope and elevating the quality of existing data. This role involves creating scalable APIs and tools that support our AI agents in their exploration of scholarly literature.

Your work will bridge data engineering and applied machine learning. You will manage data pipelines, design schemas, and deploy production services, and you will apply practical machine learning techniques, such as entity resolution and text classification, to refine data quality and enrich metadata.

About Us:

The Agentic Applications team at the Allen Institute for AI builds robust, open-source systems that facilitate scientific discovery and large-scale AI research. We focus on developing high-quality structured datasets, integrating diverse content types, and enabling applications for search, citation analysis, and model training. Our team emphasizes strong engineering practices and close collaboration with Ai2’s product and research organizations to deliver tools and infrastructure used by millions of researchers and developers around the globe.

Your Responsibilities:

  • Enhance the coverage and quality of the Semantic Scholar corpus, including academic papers, patents, and specialized datasets.
  • Develop and maintain scalable data pipelines for corpus integration, citation resolution, and metadata enhancement.
  • Implement and launch machine learning models for tasks such as entity disambiguation, author linking, and topic classification.
  • Design and improve APIs that provide structured scholarly data for academic researchers and AI workflows.
  • Contribute to the development of dashboards and tools that assess data quality and model performance.
  • Work collaboratively with engineering and research teams to ensure code maintainability, test coverage, and reliable deployment.

About Allen Institute for AI

The Allen Institute for AI (Ai2) is at the forefront of AI research, dedicated to advancing the understanding and application of artificial intelligence in scientific exploration. We foster innovation and collaboration, creating impactful tools and datasets that push the boundaries of knowledge.
