Implementing, managing and administering the overall Hadoop infrastructure with various distributions (preferably Cloudera)
Deploying and maintaining a Hadoop cluster, adding and removing nodes using cluster monitoring tools like Ganglia, Nagios or Cloudera Manager, configuring NameNode high availability and keeping track of all running Hadoop jobs.
Managing the day-to-day running of Hadoop clusters
Experience with NoSQL databases, such as HBase, Cassandra, MongoDB or similar
Experience with Big Data tools such as Pig, Hive, Impala, Sqoop, Kafka, Flume, Jupyter
Experience with Hadoop, HDFS, Spark
Experience in Linux administration and Unix shell scripting
Perform Hadoop administration on production and DR Hadoop clusters.
Perform tuning and increase operational efficiency on a continuous basis
Monitor the health of the platforms, generate performance reports and provide continuous improvements
Work closely with development, engineering and operations teams on key deliverables, ensuring production scalability and stability
Develop and enhance platform best practices
Ensure the Hadoop platform can effectively meet performance & SLA requirements
Support the Hadoop production environment, which includes Hive, Ranger, Kerberos, YARN, Spark, SAS, Kafka, HBase, etc.
Perform optimization and capacity planning of a large multi-tenant cluster.
Identify, implement and continuously enhance the data automation process.
Participate in architectural discussions and perform system analysis, including a review of existing systems and operating methodologies. Participate in the analysis of the latest technologies and suggest the solutions best suited to current requirements and to simplifying future modifications.
Extensive experience working with SQL across a variety of databases
Experience working with both structured and unstructured data sources.