About the job
Speechify builds tools that turn text into speech, helping over 50 million people convert PDFs, books, Google Docs, news articles, and websites into audio. Our mission is to remove reading as a barrier to learning, so users can read faster, remember more, and enjoy deeper learning.
Our product lineup includes apps for iOS, Android, and Mac, plus a Chrome extension and a web app. Speechify has earned recognition from Google as 'Chrome Extension of the Year' and received Apple’s 2025 Design Award for Inclusivity.
The team is fully remote, with nearly 200 colleagues who bring experience from Amazon, Microsoft, Google, Stanford, and other leading organizations. We value engineers and researchers who want to shape the future of accessible learning.
Role Overview
Speechify is hiring a Software Engineer for the Data division of our AI team. This person will help lead data collection efforts for model training, focusing on building and maintaining data infrastructure at petabyte scale. The work blends infrastructure, engineering, and research to deliver high-quality datasets efficiently and affordably. The main focus is improving our data ingestion pipeline.
What You Will Do
- Find and connect new audio data sources to our ingestion pipeline.
- Manage and grow our cloud infrastructure on Google Cloud Platform using Terraform.
- Work with scientists to improve cost, throughput, and data quality, supporting large-scale model training.
- Partner with the AI team and company leadership to define the dataset roadmap for future consumer and enterprise products.
Location
This position is remote, based in Split, Croatia.
