Roles and Responsibilities
1. Design and implement data engineering projects.
2. Integrate multiple data sources to create a data lake/data mart; perform data ingestion and ETL processes using SQL, Sqoop, Spark, or Hive (see the ETL sketch after this list)
3. Keep up with new components and emerging technologies, both on-premises and in the cloud (AWS/Azure/Google Cloud)
4. Collaborate with cross-functional teams: infrastructure, network, and database
5. Work with various teams to set up and manage users, secure and govern platforms and data, and maintain business continuity through contingency plans (data archiving, etc.)
6. Monitor job performance; manage file systems/disk space, cluster and database connectivity, log files, and backup/security; and troubleshoot user issues
7. Design, implement, test, and document a performance-benchmarking strategy for platforms as well as for different use cases
8. Set up, administer, monitor, tune, optimize, and govern large-scale implementations
9. Implement machine learning models on real-time input data streams
10. Drive customer communication during critical events and participate in or lead operational improvement initiatives
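
By way of illustration, item 2 above might translate into a flow like the following minimal PySpark sketch: ingest a raw extract, clean and type it, and land it in a curated zone of a data lake. This is only a sketch under assumed inputs; the paths, column names, and application name are hypothetical placeholders, not part of this role's actual stack.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders-etl").getOrCreate()

    # Ingest: read a raw extract (could equally be Sqoop output or a JDBC source).
    raw = spark.read.option("header", True).csv("s3://raw-zone/orders/")  # hypothetical path

    # Transform: type the columns, drop obviously bad rows, derive a partition key.
    clean = (
        raw.withColumn("order_ts", F.to_timestamp("order_ts"))
           .withColumn("amount", F.col("amount").cast("double"))
           .filter(F.col("order_id").isNotNull())
           .withColumn("order_date", F.to_date("order_ts"))
    )

    # Load: write to the curated zone of the data lake, partitioned by date.
    (clean.write.mode("overwrite")
          .partitionBy("order_date")
          .parquet("s3://curated-zone/orders/"))  # hypothetical path
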
Desired Candidate Profile:
1. 3-5 years of relevant experience in data engineering
2. Exposure to any or all of the latest data engineering ecosystem platforms, such as AWS, Azure, GCP, Cloudera, and Databricks
3. Sound knowledge of Python/Scala/Java
4. Good knowledge of SQL/NoSQL databases and data warehouse concepts
5. Hands-on experience working with databases such as SQL Server and PostgreSQL, as well as with cloud infrastructure
6. Excellent knowledge of data backup, recovery, security, and integrity
7. Sound knowledge of Spark, HDFS/Hive/HBase, shell scripting, and Spark Streaming
8. Excellent communication skills
9. Must be proficient with data ingestion tools such as Sqoop, Flume, Talend, and Kafka (a minimal streaming sketch follows this list)
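
To make profile items 7 and 9 (and responsibility 9 above) concrete, here is a minimal Spark Structured Streaming sketch that consumes JSON events from a Kafka topic and scores each record with a previously trained Spark ML pipeline. Again a sketch under stated assumptions: the broker address, topic name, event schema, and model/checkpoint paths are all hypothetical.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType
    from pyspark.ml import PipelineModel

    spark = SparkSession.builder.appName("stream-scoring").getOrCreate()

    # Schema of the incoming JSON events (hypothetical fields).
    schema = StructType([
        StructField("user_id", StringType()),
        StructField("feature_a", DoubleType()),
        StructField("feature_b", DoubleType()),
    ])

    # Ingest: subscribe to a Kafka topic and parse the JSON payload.
    events = (
        spark.readStream.format("kafka")
             .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
             .option("subscribe", "events")                     # hypothetical topic
             .load()
             .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
             .select("e.*")
    )

    # Score: apply a pre-trained Spark ML pipeline (hypothetical model path).
    model = PipelineModel.load("s3://models/churn-pipeline/")
    scored = model.transform(events)

    # Load: continuously append scored records to the data lake.
    query = (scored.writeStream.format("parquet")
                   .option("path", "s3://curated-zone/scored-events/")
                   .option("checkpointLocation", "s3://checkpoints/stream-scoring/")
                   .start())
    query.awaitTermination()

Note that running this requires the spark-sql-kafka connector on the classpath, and that the transformers in a fitted pipeline generally apply to streaming DataFrames the same way they apply to batch ones.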