Job Type : W2

Experience : 1-2 Years

Location : Irving, TX

Posted Date : 13-Jan-2020

Description :

We are looking for a candidate who will work closely with a global team of big data engineers and software developers on the Hadoop platform and play a key role on the team.

 

Responsibilities:

  • Develop microservices-based RESTful APIs using the Spring MVC framework by writing controller, service, and DAO classes and implementing business logic with Java APIs and data structures.
  • Design multiple MapReduce programs for data extraction, transformation, and aggregation across XML, JSON, CSV, and compressed file formats.
  • Develop Spark code using Scala and Spark SQL for data cleaning, pre-processing, and aggregation, and apply Spark RDD transformations on top of Hive external tables.
  • Develop complex Hive queries for different file sources, write Hive UDFs, and performance-tune Hive queries.
  • Coordinate the cluster and schedule workflows using ZooKeeper and Oozie. Create HBase-Hive integration tables and load large sets of semi-structured data from various sources.
  • Create shell scripts to simplify the execution of HBase, Hive, Oozie, Kafka, and MapReduce jobs and to move data into and out of HDFS.
  • Configure Sqoop and develop scripts to extract data from Teradata into HDFS.
  • Develop multiple Kafka producers and consumers to implement the data ingestion process and handle clusters in real time.
  • Design, build, test, deploy, and maintain the CI/CD process using tools such as Jenkins, Maven, Git, Docker, and Enterprise Cloud Platform (eCP).
  • Report daily progress in stand-ups, participate in code reviews, work effectively with QA to create test cases, and deploy new releases to production.

Requirements:

  • The minimum education requirement to perform the above job duties is a Bachelor’s degree in Computer Science, Applications, or a related technical field.
  • Experience in full-lifecycle application development using Java, databases (SQL and NoSQL), and Python technologies.
  • Proficiency in the Hadoop ecosystem (Hive, HBase, HDFS, Oozie, Sqoop, Pig, Spark, Kafka) and other big data technologies.
  • Excellent communication skills and the ability to work well in a team in a fast-paced environment.