Key Result Areas and Activities: 

  1. ETL Pipeline Development and Maintenance 
  • Design, develop, and maintain ETL pipelines using tools from the Cloudera ecosystem such as Apache NiFi, Apache Flume, and Apache Spark.
  • Create and maintain comprehensive documentation for data pipelines, configurations, and processes. 

  2. Data Integration and Processing
  • Integrate and process data from diverse sources, including relational databases, NoSQL databases, and external APIs.

  3. Performance Optimization
  • Optimize the performance and scalability of Hadoop components (HDFS, YARN, MapReduce, Hive, Spark) to ensure efficient data processing.
  • Identify and resolve issues related to data pipelines, system performance, and data integrity.

  4. Data Quality and Transformation
  • Implement data quality checks and manage data transformation processes to ensure accuracy and consistency. 

Must Have 

  • Proficiency in Cloudera Data Platform (CDP), particularly Cloudera Data Engineering
  • Knowledge of data lakehouse architectures and their implementation 
  • Hands-on experience with Apache Spark and Apache Airflow within the Cloudera ecosystem
  • Proficiency in languages such as Python, Java, Scala, and shell scripting
  • Exposure to containerization and related technologies (e.g., Docker, Kubernetes)
  • System-level understanding of data structures, algorithms, and distributed storage and compute

Good To Have 

  • Experience with other CDP services such as DataFlow and Stream Processing
  • Familiarity with cloud environments such as AWS, Azure, or Google Cloud Platform 
  • Understanding of data governance and data quality principles 
  • CCP Data Engineer certification

Qualifications: 

  • 5+ years of experience in Cloudera/Hadoop/Big Data engineering or related roles 
  • Proven track record of successful data lake implementations and pipeline development 
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field 

Qualities: 

  • Can influence and implement change; demonstrates confidence, strength of conviction, and sound decision-making.
  • Tackles problems head-on with a logical and systematic approach; is persistent and patient; can resolve problems independently; is practical rather than over-critical about the factors that led to a problem; follows up with developers on related issues.
  • Able to consult, write, and present persuasively. 
  • Able to work in a self-organized and cross-functional team. 
  • Able to iterate based on new information, peer reviews, and feedback. 
  • Able to work seamlessly with clients across multiple geographies. 
  • Research-focused mindset.
  • Proficiency in English (reading, writing, and speaking) and in email communication.
  • Excellent analytical, presentation, reporting, documentation, and interpersonal skills.