About the job
Join c-the-signs as a Lead Data Engineer and take charge of architecting, developing, and scaling our healthcare data platform. In this pivotal role, you will lead the design of robust data pipelines, modernize our data architecture, and ensure high-quality ingestion and transformation of clinical and operational data. You will work closely with product management, analytics, clinical informatics, machine learning, and engineering teams to deliver reliable, timely, and compliant insights.
This hands-on leadership position suits those who enjoy setting technical direction while actively contributing code and helping stakeholders navigate complex healthcare data challenges.
Responsibilities
Architecture & Strategy
- Lead the design and evolution of our cloud-native data platform, built primarily on Google Cloud Platform, including BigQuery, Cloud Storage, Pub/Sub, Cloud Run, Airflow (Cloud Composer), and the Cloud Healthcare API.
- Guide strategic decisions regarding multi-cloud or AWS interoperability when necessary.
- Establish and uphold data engineering best practices, coding standards, and architectural patterns.
Pipeline Development
- Build scalable ETL/ELT pipelines using dbt for transformations and Airflow for orchestration.
- Create ingestion pipelines for clinical and administrative data in HL7, FHIR, DICOM, and custom formats.
- Develop ingestion and transformation pipelines that support AI/ML development and model training.
- Implement both streaming and batch dataflows using Pub/Sub, Dataflow, and serverless compute.
- Support or guide integrations with AWS-based partner systems or AWS-hosted data sources when necessary.
Data Modeling & Warehousing
- Design and maintain BigQuery datasets, semantic layers, and warehouse structures.
- Use industry standards, such as FHIR resources, as the basis for canonical healthcare data models.
- Advise on data modeling and warehouse best practices across both GCP and AWS ecosystems.
Data Quality, Observability & Governance
- Implement data quality frameworks, automated testing, and monitoring systems.
- Ensure HIPAA compliance and the appropriate handling of PHI/PII across all pipelines and cloud environments.
- Promote lineage, documentation, metadata governance, and adoption of dbt documentation standards.
Leadership & Collaboration
- Collaborate with analytics, product, clinical informatics, and security teams to deliver high-quality, reliable data products.
- Provide oversight and technical guidance for multi-cloud data integrations with AWS-based systems or partners.
- Help recruit and develop junior data engineers.
