About the job
Speechify helps over 50 million people turn text such as PDFs, books, Google Docs, news articles, and websites into audio. Our tools support faster reading, better comprehension, and stronger retention. Our products are available on iOS, Android, and Mac, and as a Chrome extension and web app. Recent recognition includes Chrome Extension of the Year from Google and the Apple Design Award for inclusivity in 2025.
The team is fully distributed, with nearly 200 colleagues whose backgrounds include Amazon, Microsoft, Google, and top PhD programs. There is no central office. Collaboration and new ideas drive the work.
Role Overview
Speechify is hiring a Software Engineer focused on data infrastructure and acquisition for the AI team. This position is central to collecting and managing large-scale datasets that support model training. The work enables efficient creation and handling of petabyte-scale data.
What You Will Do
- Find and connect new audio data sources to the ingestion pipeline.
- Maintain and expand cloud infrastructure on Google Cloud Platform (GCP) using Terraform.
- Work with scientists to improve data cost, throughput, and quality for future models.
- Partner with the AI team and leadership to plan the dataset roadmap for upcoming products.
Location
This role is based in Karachi, Pakistan, as part of Speechify’s distributed team.
