Software Engineer - Data Infrastructure & Acquisition

Speechify, Kathmandu, Nepal
Remote, Full-time


Experience Level

Mid to Senior

Qualifications

An ideal candidate should have:

  • A BS, MS, or PhD in Computer Science or a related field.
  • 5+ years of professional experience in software development.
  • Proficiency in bash or Python scripting in Linux environments.
  • Expertise in Docker and Infrastructure-as-Code practices, with professional experience in at least one major cloud provider (GCP preferred).
  • Experience with web crawlers and large-scale data processing workflows (a plus).
  • The ability to manage multiple tasks and adapt to changing priorities.
  • Excellent written and verbal communication skills.

About the job

Speechify’s mission is to make reading accessible for everyone, removing barriers to learning. More than 50 million people use Speechify’s text-to-speech tools to turn reading materials (PDFs, books, Google Docs, news stories, and websites) into audio. This helps users read faster, understand more, and remember what matters. Our platform spans iOS, Android, Mac, Chrome, and web apps. Google named Speechify Chrome Extension of the Year, and Apple awarded us the 2025 Design Award for Inclusivity.

The Speechify team is fully remote, with nearly 200 professionals worldwide. Our colleagues include frontend and backend engineers, AI researchers, and alumni from Amazon, Microsoft, Google, and Stanford, as well as founders from successful startups.

Role Overview

Speechify is hiring a Software Engineer for the AI Data division. This position centers on data collection to fuel model training. The team builds and maintains infrastructure to assemble high-quality, petabyte-scale datasets efficiently and cost-effectively. The work combines engineering, infrastructure, and research.

What You Will Do

  • Find new and creative sources of audio data, then connect them to our ingestion pipeline.
  • Maintain and improve cloud infrastructure for data ingestion, using GCP and Terraform.
  • Work with scientists to optimize for cost, throughput, and data quality, ensuring our models receive the best possible training data at scale.
  • Collaborate with AI team members and leadership to shape the dataset roadmap for future consumer and enterprise products.

Requirements

  • BS, MS, or PhD in Computer Science or a related field.
  • At least 5 years of professional software development experience.
  • Strong skills in bash or Python scripting within Linux environments.
  • Hands-on experience with Docker and Infrastructure-as-Code practices, plus professional experience with at least one major cloud provider (GCP preferred).
  • Background working with web crawlers and large-scale data processing is a plus.
  • Comfort managing multiple projects and shifting priorities.
  • Clear and effective written and verbal communication skills.

Location

This role is based in Kathmandu, Nepal, as part of Speechify’s distributed team.

About Speechify

Speechify is dedicated to breaking down barriers to learning through innovative text-to-speech technology. With a global team and a commitment to inclusivity, Speechify aims to transform how individuals interact with written content.
