About the job
P-1348
At Databricks, we are dedicated to empowering data teams to tackle some of the most challenging problems in the world, ranging from security threat detection to the development of cancer drugs. Our mission is to create and manage the leading data and AI infrastructure platform, allowing our customers to concentrate on the critical challenges central to their missions. Our engineering teams are committed to developing innovative technical products that meet real and significant needs globally. We continuously push the limits of data and AI technology while ensuring resilience, security, and scalability to enhance our customers' success on our platform.
We operate one of the largest-scale software platforms, comprising millions of virtual machines that generate terabytes of logs and process exabytes of data daily. At this scale, we encounter cloud hardware, network, and operating system issues, and our software must effectively shield our customers from them.
As a Senior Software Engineer on the Data Platform team, you will contribute to building the Data Intelligence Platform for Databricks, which aims to automate decision-making across the organization. You will collaborate closely with Databricks Product Teams, Data Science, Applied AI, and more. Your role will involve developing a range of tools for logging, orchestration, data transformation, metric storage, governance platforms, and data consumption layers. You will leverage the latest and most advanced Databricks products and other tools in the data ecosystem. Our team also serves as a substantial in-house customer, using Databricks to inform the future direction of our product.
Your Impact:
- Design and manage the Databricks metrics store, enabling all business units and engineering teams to consolidate and share detailed metrics on a common platform with high quality, introspection capabilities, and query performance.
- Lead the development of the cross-company Data Intelligence Platform, which encapsulates all business and product metrics essential for running Databricks. You will play a pivotal role in balancing data protection with ease of sharing as we transition to a public company.
- Create tools and infrastructure to efficiently manage and operate Databricks on Databricks at scale across multiple clouds, geographies, and deployment types. This includes CI/CD processes, testing frameworks for pipelines and data quality, and infrastructure-as-code tooling.
- Establish the foundational ETL framework utilized by all pipelines developed within the company.
- Collaborate with our engineering teams to provide...
