About the job
As a member of the Data Infrastructure team, you will support vital big data platforms to ensure they are high-performing, reliable, available, and secure. This role encompasses data infrastructure engineering, often referred to as DataOps, Database Administration, or SRE.
The position combines tool development and operational support for our platforms, requiring a keen eye for detail and a curiosity about the underlying systems. You'll gain a broad skill set ranging from low-level system tuning to general coding.
We manage four primary storage platforms:
- Apache Solr (~2.2 PB)
- Apache HBase (~450 TB)
- PostgreSQL (~15 TB)
- Kafka (~60 TB)
All four platforms are open source and written in Java, Scala, or C. We maintain in-house builds and patches using a range of open-source and custom tools, written primarily in Rust and Python, and we operate across hundreds of servers in multiple data centers and in the cloud.
We strive to maintain a balance between project work and operational tasks for all team members, from seasoned professionals to recent graduates. Your daily responsibilities will encompass a mix of both.
Project work is tailored to your experience so that it is both achievable and worthwhile. Recent graduates have been tasked with projects such as:
- Writing MapReduce jobs to validate data between two multi-terabyte HBase clusters (reducing processing time from months to hours)
- Developing a Python stack to seamlessly migrate production clients between clusters (ensuring no data loss or downtime)
- Building and testing new HBase versions and deploying them live (without disrupting Brandwatch's operations)

