Last Date for Registration: June 05th, 2018

Eligibility: Faculty/PhD/Research Scholars of engineering/technical institutions and persons from Govt. departments/labs and industry.

Experts from Academia/Industry: Kovid Academy (Hyderabad), Guest Lecture (IITR)

Why Big Data with Hadoop?

Big data is a term that describes the large volume of data, both structured and unstructured, that inundates a business on a day-to-day basis. It is the combination of three factors, high volume, high velocity, and high variety, that qualifies data as Big Data. Big Data platforms and solutions provide the tools, methods, and technologies used to capture, curate, store, search, and analyze the data to find new correlations, relationships, and trends that were previously unavailable. This course provides participants with a comprehensive understanding of all the steps necessary to operate and maintain a Hadoop cluster using Cloudera Manager.

Objective of the Course

  • To apply traditional data analytics and business intelligence skills to big data tools like Apache Impala (incubating), Apache Hive, and Apache Pig.
  • Cloudera presents the tools data professionals need to access, manipulate, transform, and analyze complex data sets using SQL and familiar scripting languages.

Benefits and Outcomes of the Course

  • The course provides participants with a comprehensive understanding of all the steps necessary to operate and maintain a Hadoop cluster using Cloudera Manager.
  • With Spark, developers can write sophisticated parallel applications that support faster decisions, better decisions, and interactive actions across a wide variety of use cases, architectures, and industries. Apache Spark examples and hands-on exercises are presented in Scala and Python.

Course Program

  • The program is split into lectures and lab sessions.
  • Quizzes and project work for enhanced learning.
  • Hands-on experience with basic and advanced-level topics.
  • Interaction & learning with experts from academia & industry.
  • Certificates to the participants by E&ICT Academy IITR.

Course Content

  • Introduction to Hadoop.
  • Hadoop Distributed File System (HDFS).
  • MapReduce and Spark on YARN.
  • Getting data into HDFS.
  • Hadoop clients including Hue.
  • Processing complex data and Multi-Dataset operations with Pig.
  • Introduction to Hive and Impala.
  • Apache Spark Basics.
  • Aggregating data with pair RDDs.
  • Writing, Running, & Configuring Apache Spark Apps.
  • Parallel Processing in Apache Spark.
  • RDD Persistence.
  • DataFrames and Spark SQL.
  • Message processing with Apache Kafka.
  • Capturing data with Apache Flume.

Important Details

    Last Date for Registration: June 05th, 2018
    Seats: 40, on a first-come, first-served basis
    Duration: 6 days, 48 hours
    Registration Fee
    Faculty Members/Research scholars: Rs. 2,500/-
    Persons from Industry: Rs. 3,000/-
    Payment Details
    Demand draft drawn in favour of "Dean SRIC IIT Roorkee" payable at Roorkee
    How to Apply
    You can apply online by filling in the application form, or you can download the offline form and email a scanned copy to
    Contact Details
    Dr. Sanjeev Manhas (P.I., E&ICT Academy, ECE Dept, IITR)
    Dr. Pramod Kumar (KEC, Ghaziabad)
    Tel: +91-9410728057, +91-7078627392, +91-1332-286457

    A hard copy of the application form, along with the Demand Draft, must reach the following address: Mr. Prateek Sharma, EICT Academy, ECE Department, IIT Roorkee, Uttarakhand 247667.
