Job Type: W2
Experience: 2 years
Location: Bentonville, AR
Posted Date: 03-Jan-2019
Job Description:
Position : Software Engineer - Hadoop
Location: Bentonville, AR
Duration: 1 year + Extensions
Responsibilities:
- Install and configure Hadoop platform distributions (Cloudera, Hortonworks, and MapR) and Hadoop component services; add edge nodes and gateway nodes and assign master and slave nodes in the cluster.
- Add and delete nodes, connect to servers through remote secure shell (SSH), and set up Rack Awareness.
- Set up the HDFS replication factor for data replicas, set up log4j properties, and integrate AWS CloudWatch. Upgrade and patch clusters from one version to another (CDH, HDP, MapR) and patch the Linux servers.
- Optimize and tune Hadoop environments to meet performance requirements. Install and configure monitoring tools for all critical Hadoop systems and services.
- Configure and maintain high availability of HDFS, the YARN (Yet Another Resource Negotiator) ResourceManager, MapReduce, Hive, HBase, Kafka, and Spark.
- Manage scalable Hadoop virtual and physical cluster environments. Manage backup and disaster recovery for Hadoop data. Work in tandem with big data developers and designers to build use-case-specific, scalable, supportable infrastructure.
- Provide responsive support for day-to-day requests from development, support, and business analyst teams.
- Analyze and debug slow-running development and production processes. Perform product/tool upgrades and apply patches for identified defects with root cause analysis (RCA). Perform ongoing capacity management forecasts, including timing and budget considerations.
- Design scripts that automate jobs in Hadoop environments to run validation checks and monitor cluster health.
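As a sketch of the replication and rack-awareness duties above, the following minimal configuration uses standard Hadoop property names; the topology script path is a placeholder for illustration, and values would be set per cluster:

```xml
<!-- hdfs-site.xml: replication factor (3 is the Hadoop default) -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>

<!-- core-site.xml: rack awareness via a topology script
     (the script path below is an assumed placeholder) -->
<property>
  <name>net.topology.script.file.name</name>
  <value>/etc/hadoop/conf/topology.sh</value>
</property>
```

The topology script maps DataNode addresses to rack IDs so the NameNode can place replicas across racks.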
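A cluster-health validation script like those described above might parse the output of `hdfs dfsadmin -report` (a real HDFS command) for live and dead DataNode counts. This is a hypothetical sketch: the function names and the zero-dead-node threshold are assumptions, not a specific employer tool.

```python
import re

def parse_dfsadmin_report(report: str) -> dict:
    """Extract live/dead DataNode counts from `hdfs dfsadmin -report` text."""
    live = re.search(r"Live datanodes \((\d+)\)", report)
    dead = re.search(r"Dead datanodes \((\d+)\)", report)
    return {
        "live": int(live.group(1)) if live else 0,
        "dead": int(dead.group(1)) if dead else 0,
    }

def cluster_is_healthy(report: str, max_dead: int = 0) -> bool:
    """Healthy means at least one live DataNode and no more than max_dead dead ones."""
    counts = parse_dfsadmin_report(report)
    return counts["live"] > 0 and counts["dead"] <= max_dead

if __name__ == "__main__":
    # In production the report would come from the cluster, e.g. via
    # subprocess.run(["hdfs", "dfsadmin", "-report"], ...).stdout
    sample = "Live datanodes (5):\nDead datanodes (1):\n"
    print(cluster_is_healthy(sample))  # a dead node fails the default check
```

A real monitoring job would run this on a schedule and alert (or feed CloudWatch) when the check fails.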
Requirements:
- The minimum education requirement for the above job duties is a Bachelor's degree in Computer Science, Information Technology, or a related technical field.
- Good knowledge of Cloudera, Hadoop, HDFS, Hive, Oozie, Spark, Python, Scala, and Splunk.
- Experience with performance tuning of Hadoop clusters.
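Performance tuning of the kind listed above often starts with YARN container sizing. A sketch of two commonly tuned properties follows; the property names are standard YARN, but the values are illustrative and depend on node hardware:

```xml
<!-- yarn-site.xml: total memory YARN may allocate on each NodeManager
     (illustrative value for a node with ~20 GB RAM) -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>16384</value>
</property>

<!-- yarn-site.xml: largest single container the scheduler will grant -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>8192</value>
</property>
```

Capping the per-container maximum below the node total prevents one application from monopolizing a NodeManager.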