About the job
We are seeking a highly skilled Data Engineer Team Lead to spearhead the design and validation of robust data pipelines, data structures, and cutting-edge big data solutions, including data lake platforms. This role is crucial for optimizing data search and retrieval processes within our organization. You will collaborate closely with data structure designers to establish a comprehensive enterprise data architecture.
Key Responsibilities
Enhancing Data Quality and Reliability:
* Develop and implement innovative strategies to enhance data reliability and quality.
* Integrate raw data from diverse sources to produce consistent, machine-readable formats.
Data Structure Development:
* Lead the design and testing of data structures that facilitate effective data extraction and transformation for predictive and prescriptive analytics.
Process Definition:
* Establish and manage development, testing, release, and support processes for data engineering operations.
* Identify and rectify code bugs as necessary.
Big Data Model Development:
* Oversee the creation of big data models and use cases that are aligned with data structures, preparing them for seamless data operations.
Query Execution:
* Formulate and execute queries on both structured and unstructured data sources to diagnose process issues or perform bulk updates.
Feature Layer Implementation:
* Direct the development and implementation of feature zones that provide the features and KPIs stakeholders request, in support of robust machine learning models.
Batch Scheduling and Reporting:
* Ensure accuracy and timeliness in batch production scheduling and report distribution.
Data Transformation Management:
* Design and oversee processes that support data transformation, data structures, metadata management, dependency tracking, and workload management.
Required Competencies
* Performance Excellence
* Leadership & Empowerment
* Collaboration & Synergy Creation
* Agility & Resilience
* Innovation & Digital Transformation
* Strategic Thinking
* People-Centric Approach
Technical Skills
* Strong proficiency in Java, C#, and Python for application development.
* Extensive experience with Cloudera or comparable big data platforms and their services, including Apache Hive, Impala, Apache Spark, Kafka, HBase, Sqoop, NiFi, and BizSpark; platform data security; and Scala and Python for programming and data analysis.
* Expertise in handling and processing large datasets using distributed computing frameworks.
* Advanced SQL skills for querying and managing relational databases.
* Comprehensive understanding of data warehousing principles and best practices.
* Proficiency in data analysis and visualization tools.
