Job Type: W2

Experience: 1-2 Years

Location: Mason, OH

Posted Date: 20-Dec-2018

Description:

Looking for a Hadoop Engineer for an 18+ month project in Mason, OH.

Responsibilities:

  • Build database prototypes to validate system requirements in discussions with project managers, business owners, and analyst teams. Document code and perform code reviews.

  • Design, develop, validate, and deploy Talend ETL processes for the DWH team using Hive on Hadoop.

  • Build data pipelines for ingestion and aggregation events, and load consumer response data into Hive external tables at HDFS locations that feed several dashboards and web APIs (see the Hive external-table sketch after this list). Develop Sqoop scripts to migrate data from Oracle to the big data environment.

  • Design experimental Spark code, using APIs such as SparkContext, Spark SQL, Spark UDFs, and Spark DataFrames, to optimize existing algorithms. Work with file formats such as CSV, JSON, Avro, text, and Parquet, and with compression codecs such as Snappy, according to client requirements (see the format-conversion sketch after this list).

  • Integrate Spark with MongoDB and create MongoDB collections consumed by API teams. Convert Hive/SQL queries into Spark transformations using Spark RDDs and Scala (see the RDD/MongoDB sketch after this list).

  • Work on a Kafka POC to publish messages to Kafka topics and test message frequency (see the Kafka producer sketch after this list). Work on cluster tuning and Spark's in-memory computing capabilities in Scala, based on the resources available on the cluster.

  • Develop shell scripts that automate jobs before moving to production, driven by configuration and passed parameters. Schedule automated jobs on a daily or weekly basis, as required, using Control-M as the scheduler.

  • Work on operational controls such as job failure notifications and email alerts for failure logs and exceptions.

  • Support the project team in successfully delivering the client's business requirements through all phases of the implementation.
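
Hive external-table sketch. The snippet below illustrates the data-pipeline responsibility above: loading aggregated consumer response data into an HDFS location that backs a Hive external table. It is a minimal sketch only; the database, table, columns, and paths are assumed names, not part of the client environment described here.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object ConsumerResponseLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("consumer-response-load")
      .enableHiveSupport()      // requires a Hive-enabled Spark deployment
      .getOrCreate()

    // Hypothetical staging data produced by upstream ingestion/aggregation.
    val responses = spark.read.parquet("/data/staging/consumer_responses")

    // Register the external table once; the dwh database, schema, and
    // LOCATION below are all assumptions for illustration.
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS dwh.consumer_responses (
        |  response_id STRING,
        |  campaign_id STRING,
        |  response_ts TIMESTAMP
        |) STORED AS PARQUET
        |LOCATION '/data/warehouse/consumer_responses'""".stripMargin)

    // Append the latest batch to the HDFS location backing the table,
    // where dashboards and web APIs can read it through Hive.
    responses.write.mode(SaveMode.Append)
      .parquet("/data/warehouse/consumer_responses")

    spark.stop()
  }
}
```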
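
Format-conversion sketch. This illustrates the file-format responsibility above: read a CSV feed, apply a Spark UDF, and write Parquet compressed with Snappy. Paths, column names, and the normalization rule are assumptions; JSON, Avro, and text sources follow the same read/write pattern.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object FormatConversion {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("format-conversion").getOrCreate()

    // Snappy is Spark's default Parquet codec; set it explicitly for clarity.
    spark.conf.set("spark.sql.parquet.compression.codec", "snappy")

    // Assumed input path and header layout; spark.read.json / .text and the
    // Avro reader follow the same pattern for other client file formats.
    val csvDf = spark.read.option("header", "true").csv("/data/in/responses.csv")

    // Example Spark UDF: trim and upper-case a campaign code (hypothetical rule).
    val normalize = udf((code: String) => Option(code).map(_.trim.toUpperCase).orNull)

    csvDf
      .withColumn("campaign_id", normalize(col("campaign_id")))
      .write.mode("overwrite")
      .parquet("/data/out/responses_parquet")

    spark.stop()
  }
}
```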
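
RDD/MongoDB sketch. This illustrates converting a Hive/SQL aggregation into Spark RDD transformations in Scala and writing the result out as a MongoDB collection for API teams. The table, field, database, and collection names are assumed, and the MongoDB Spark connector is assumed to be on the classpath with its connection URI configured; the data source name varies by connector version.

```scala
import org.apache.spark.sql.SparkSession

object HiveToRddMongo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-to-rdd-mongo")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Hive query being converted (assumed table/column names):
    //   SELECT campaign_id, COUNT(*) FROM dwh.consumer_responses GROUP BY campaign_id
    val counts = spark.table("dwh.consumer_responses").rdd
      .map(row => (row.getAs[String]("campaign_id"), 1L))
      .reduceByKey(_ + _)

    // Back to a DataFrame so it can be saved as a Mongo collection.
    val countsDf = counts.toDF("campaign_id", "response_count")

    countsDf.write
      .format("mongodb")                       // "mongo" on older connector versions
      .option("database", "analytics")         // assumed database
      .option("collection", "campaign_counts") // assumed collection
      .mode("overwrite")
      .save()

    spark.stop()
  }
}
```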
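
Kafka producer sketch. This illustrates the Kafka POC responsibility above: a small producer publishing test messages to a topic so message frequency can be observed. The broker address, topic name, and payload are assumptions.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object KafkaPocProducer {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker-1:9092")  // assumed broker address
    props.put("key.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    try {
      // Publish a batch of test messages to a hypothetical topic.
      (1 to 100).foreach { i =>
        val record = new ProducerRecord[String, String](
          "consumer-response-events", s"key-$i", s"""{"seq": $i}""")
        producer.send(record)
      }
      producer.flush()
    } finally {
      producer.close()
    }
  }
}
```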

Requirements:

  • The minimum education requirement to perform the above job duties is a Bachelor’s degree in Computer Science or a related technical field.

  • Should have good knowledge of the Hadoop ecosystem, including HDFS, Hive, Oozie, Sqoop, Kafka, Storm, Spark, and Scala.

  • Should be well versed in SDLC phases and in release and change management processes.

  • Should have good analytical and problem-solving skills.