Advanced Certificate Program in Data Science & AI by E&ICT Academy, IIT Roorkee

Program Partner – CloudxLab

Certification in Data Science & AI

Faculty and Mentors

Prof. Raksha Sharma

CSE Dept., IIT Roorkee

Prof. Sanjeev Manhas

ECE Dept., IIT Roorkee

Mr. Sandeep Giri

Founder, CloudxLab

Mr. Abhinav Singh

Co-Founder, CloudxLab

About the Course

This Data Science and AI Certification Program is an online course covering some of the latest technologies in the field, such as TensorFlow 2.0 and Generative Adversarial Networks (GANs). Its content is designed to help you launch a career in Data Science.

Additionally, the course includes access to our cloud lab, where you can gain the hands-on experience needed to solve real-world problems.

Upon successfully completing the course, you will receive a certificate from E&ICT Academy, IIT Roorkee, which you can use to progress in your career and find better opportunities.

Course Highlights

11 Months of Blended Learning

Cloud Lab Access

Work on 29+ projects to get hands-on experience

Best In Class Curriculum

Timely Doubt Resolution

Certificate of Completion by E&ICT Academy, IIT Roorkee

Programming Languages and Tools

Sample Certificate

Curriculum

1. Linux for Data Science
2. Getting Started with Git
3. Python Foundations
4. Machine Learning Prerequisites (Including NumPy, Pandas and Linear Algebra)
5. Getting Started with SQL
6. Statistics Foundations

In this topic, we will cover the different types of Machine Learning algorithms (Supervised, Unsupervised, Reinforcement) and the main challenges in Machine Learning. We will see examples of solving problems using the traditional approach and why Machine Learning algorithms can give better accuracy on such problems. This topic gives a brief introduction to both Machine Learning and Deep Learning.

We will start the course by learning concepts in Machine Learning. In this topic, we will build a machine learning model to predict housing prices in California. By the end of this project, you will understand how to build machine learning pipelines to train a model. We will also cover data cleaning, preparing data for machine learning algorithms, exploring many different models, short-listing the best ones, and fine-tuning the selected model.

In this topic, we will train a model on the MNIST dataset to recognize handwritten digits. We will also learn various performance measures for classification, such as the confusion matrix, precision and recall, and the ROC curve.
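
As a rough illustration of these performance measures, here is a minimal sketch in plain Python. The labels are hypothetical (a made-up "is this digit a 5?" task, not real MNIST results), and the function names `confusion_counts` and `precision_recall` are our own for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn): the four cells of a binary confusion matrix."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1   # predicted positive, really positive
            else:
                fp += 1   # predicted positive, really negative
        else:
            if actual == positive:
                fn += 1   # predicted negative, really positive
            else:
                tn += 1   # predicted negative, really negative
    return tp, fp, fn, tn


def precision_recall(y_true, y_pred, positive=1):
    """Precision = tp / (tp + fp); recall = tp / (tp + fn)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: 1 = "this digit is a 5", 0 = "it is not".
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

print(confusion_counts(y_true, y_pred))   # (3, 1, 1, 3)
print(precision_recall(y_true, y_pred))   # (0.75, 0.75)
```

Precision asks how many of the predicted positives were correct, while recall asks how many of the actual positives were found.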

In this topic, we will learn various Machine Learning algorithms and concepts such as Unsupervised Learning, Ensemble Learning, and Dimensionality Reduction.

We will start the Deep Learning course with Artificial Neural Networks. We will learn about biological neurons, multilayer perceptrons, and back-propagation. We will implement a multilayer perceptron using Keras and visualize the runs and graphs using TensorBoard.

In this topic, we will learn about the challenges deep neural networks face during training, such as vanishing and exploding gradients. We will learn techniques to address these problems, such as reusing pre-trained layers, using faster optimizers, and avoiding overfitting through regularization.
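
To give a feel for why gradients vanish or explode, the sketch below multiplies the chain-rule factors through a stack of sigmoid layers. It is a deliberately simplified model, assuming one unit per layer, the same weight in every layer, and the same pre-activation everywhere; `gradient_scale` is a hypothetical helper for this illustration, not a TensorFlow or Keras function.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum value is 0.25, at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def gradient_scale(depth, pre_activation=0.0, weight=1.0):
    """Size of the chain-rule product through `depth` sigmoid layers.

    Simplifying assumption: each layer contributes the same factor
    |weight| * sigmoid'(pre_activation).
    """
    scale = 1.0
    for _ in range(depth):
        scale *= abs(weight) * sigmoid_grad(pre_activation)
    return scale


for depth in (1, 5, 10, 20):
    # With weight 1, each layer shrinks the gradient by a factor of 0.25.
    print(depth, gradient_scale(depth), gradient_scale(depth, weight=8.0))
```

With a weight of 1 the product shrinks geometrically (vanishing gradients); with a weight of 8 each factor becomes 2 and the product grows geometrically (exploding gradients).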

In this topic, we will dive deeper into TensorFlow and its lower-level Python API. This API is useful when we need extra control, for example when writing custom loss functions, custom layers, and more.

Deep Learning systems are usually trained on very large datasets that may not fit in RAM. In this topic, we will learn TensorFlow's Data API, which helps in ingesting a dataset and preprocessing it efficiently.

In this topic, we will learn how Convolutional Neural Networks (CNNs) achieve superhuman performance on some complex visual tasks. Today, CNNs power image search services, self-driving cars, automatic video classification systems, and more. We will learn the basic building blocks of CNNs and how to implement them using TensorFlow and Keras.
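
The core building block of a CNN is the convolution of a small kernel over an image. The sketch below shows that operation in plain Python, assuming no padding and a stride of 1. The tiny image and the `vertical_edge` kernel are made up for illustration; in the course this would be done with TensorFlow and Keras layers rather than by hand.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1).

    This is the cross-correlation that CNN layers compute: each output
    value is the sum of element-wise products over one image patch.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# A 4x4 image that is dark (0) on the left and bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# A hand-written kernel that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is non-zero only in the middle column, where the edge lies. A CNN learns kernels like this from data instead of having them written by hand.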

Predicting the future is something we do all the time, for example when predicting stock prices. In this topic, we will learn how Recurrent Neural Networks (RNNs) predict the future, the problems they face, such as limited short-term memory, and solutions to these problems: LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) cells.

Using Natural Language Processing, we build systems that can read and write natural language. In this topic, we will learn different NLP techniques and generate Shakespearean text using a Character RNN.

Autoencoders are artificial neural networks capable of learning dense representations of input data without any supervision. For example, we could train an autoencoder on pictures of faces and it can then generate new faces. In this topic, we will learn different types of autoencoders and generative models.

Reinforcement Learning is one of the most exciting fields of Machine Learning. Using Reinforcement Learning, the AlphaGo system defeated the world champion at the game of Go. Reinforcement Learning is an area of Machine Learning aimed at creating agents capable of taking actions in an environment in a way that maximizes rewards over time. In this topic, we will learn various concepts in Reinforcement Learning and experiment with OpenAI Gym.
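
As a small taste of an agent learning to maximize rewards over time, here is a sketch of tabular Q-learning on a made-up five-cell corridor. The environment, the `step` function, and all hyperparameters are assumptions chosen for this illustration. It does not use OpenAI Gym, and the agent explores purely at random to keep the code short.

```python
import random

N_STATES = 5          # states 0..4; reaching state 4 ends the episode
ACTIONS = (-1, +1)    # index 0 = move left, index 1 = move right


def step(state, action):
    """Hypothetical environment: reward 1.0 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0, max_steps=100):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            a = rng.randrange(len(ACTIONS))      # explore at random
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: reward now plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


q = q_learning()
# Greedy policy for the non-terminal states, read off the learned Q-table.
policy = ["right" if right > left else "left" for left, right in q[:-1]]
print(policy)
```

Even though the agent only ever acts at random while learning, the Q-table it builds favours moving right in every cell, which is the shortest route to the reward.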

Course on Big Data with Hadoop

1. Introduction
2. Distributed systems
3. Big Data Use Cases
4. Various Solutions
5. Overview of Hadoop Ecosystem
6. Spark Ecosystem Walkthrough

1. Understanding the CloudxLab
2. Getting Started - Hands on
3. Hadoop & Spark Hands-on
4. Understanding Regular Expressions
5. Setting up VM

1. ZooKeeper - Race Condition
2. ZooKeeper - Deadlock
3. How does election happen - Paxos Algorithm?
4. Use cases
5. When not to use

1. Why HDFS?
2. NameNode & DataNodes
3. Advanced HDFS Concepts (HA, Federation)
4. Hands-on with HDFS (Upload, Download, SetRep)
5. Data Locality (Rack Awareness)

1. Why YARN?
2. Evolution from MapReduce 1.0
3. Resource Management: YARN Architecture
4. Advanced Concepts - Speculative Execution

1. Understanding Sorting
2. MapReduce - Overview
3. Word Frequency Problem - Without MR
4. Only Mapper - Image Resizing
5. Temperature Problem
6. Multiple Reducer
7. Java MapReduce
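
To make the word frequency problem concrete, the sketch below imitates the map, shuffle, and reduce phases in plain single-process Python. It is only an illustration of the programming model, not Hadoop code; the function names and the sample lines are assumptions for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: add up the counts for one word."""
    return key, sum(values)


def word_count(lines):
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    # Keys reach the reducers in sorted order, so we sort here as well.
    return dict(reduce_phase(k, v) for k, v in sorted(grouped.items()))


lines = ["big data big ideas", "data beats opinion"]
counts = word_count(lines)
print(counts)  # {'beats': 1, 'big': 2, 'data': 2, 'ideas': 1, 'opinion': 1}
```

In a real cluster, the mappers and reducers run on different machines and the shuffle moves data across the network; the logic of each phase stays the same.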

1. Writing MapReduce Code Using Java
2. Apache Ant
3. Concept - Associative & Commutative
4. Combiner
5. Hadoop Streaming
6. Adv. Problem Solving - Anagrams
7. Adv. Problem Solving - Same DNA
8. Adv. Problem Solving - Similar DNA
9. Joins - Voting
10. Limitations of MapReduce
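
The link between combiners and the associative and commutative concept can be sketched in a few lines of plain Python. The temperature readings and helper names below are hypothetical. The point is that a combiner may pre-aggregate each mapper's output only when merging partial results gives the same answer as processing everything at once.

```python
def combine_sum(values):
    """Combiner for a sum: runs on one mapper's output."""
    return sum(values)


def reduce_sum(partials):
    """Reducer for a sum: merges the partial sums."""
    return sum(partials)


def naive_mean(values):
    return sum(values) / len(values)


# Hypothetical temperature readings split across two mappers.
mapper_a = [10, 20, 30]
mapper_b = [40]
all_values = mapper_a + mapper_b

# Sum is associative and commutative, so combining first is safe.
sum_with_combiner = reduce_sum([combine_sum(mapper_a), combine_sum(mapper_b)])

# Mean is not: the mean of per-mapper means differs from the overall mean.
mean_of_means = naive_mean([naive_mean(mapper_a), naive_mean(mapper_b)])
true_mean = naive_mean(all_values)


def combine_sum_count(values):
    """Safe combiner for a mean: emit a (sum, count) pair instead."""
    return sum(values), len(values)


def reduce_mean(partials):
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


mean_with_combiner = reduce_mean(
    [combine_sum_count(mapper_a), combine_sum_count(mapper_b)]
)

print(sum_with_combiner, sum(all_values))   # 100 100
print(mean_of_means, true_mean)             # 30.0 25.0
print(mean_with_combiner)                   # 25.0
```

Carrying a (sum, count) pair through the combiner turns the mean into two sums, which can be merged in any order and grouping.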

1. Pig - Introduction
2. Pig - Modes
3. Example - NYSE Stock Exchange
4. Concept - Lazy Evaluation
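
Lazy evaluation means that describing a pipeline does no work until a result is actually requested. The sketch below shows the idea with Python generators as an analogy; it is not Pig Latin, and the `load` and `keep_even` steps are made up for this illustration.

```python
log = []  # records when each step really runs


def load(rows):
    for row in rows:
        log.append(f"load {row}")
        yield row


def keep_even(rows):
    for row in rows:
        if row % 2 == 0:
            log.append(f"filter keeps {row}")
            yield row


# Describing the pipeline runs nothing yet.
pipeline = keep_even(load([1, 2, 3, 4]))
steps_before = len(log)

# Asking for one result pulls just enough data through the pipeline.
first = next(pipeline)

print(steps_before)  # 0
print(first)         # 2
print(log)           # ['load 1', 'load 2', 'filter keeps 2']
```

Because nothing runs until output is requested, a system that evaluates lazily can look at the whole pipeline first and skip or reorder work it does not need.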

1. Hive - Introduction
2. Hive - Data Types
3. Loading Data in Hive (Tables)
4. Movielens Data Processing
5. Connecting Tableau and HiveServer 2
6. Connecting Microsoft Excel and HiveServer 2
7. Project: Sentiment Analyses of Twitter Data
8. Advanced - Partition Tables
9. Understanding HCatalog & Impala

1. NoSQL - Scaling Out / Up
2. ACID Properties and RDBMS Story
3. CAP Theorem
4. HBase Architecture - Region Servers etc
5. HBase Data Model - Column Family Orientedness
6. Getting Started - Create table, Adding Data
7. Adv Example - Google Links Storage
8. Concept - Bloom Filter
9. Comparison of NoSQL Databases
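
A Bloom filter is a compact structure that answers "definitely not present" or "possibly present" for a key, which lets a store skip files that cannot contain it. The sketch below is a minimal plain-Python version; the bit-array size, the number of hashes, and the SHA-256-based hashing scheme are assumptions for illustration and do not reflect how HBase implements its filters.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, item):
        """Derive `num_hashes` bit positions by salting one hash function."""
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
for row_key in ("row-001", "row-002", "row-003"):
    bf.add(row_key)

print(bf.might_contain("row-002"))  # True: added keys are always reported
print(bf.might_contain("row-999"))  # almost certainly False, but not guaranteed
```

The filter never misses a key that was added, but it can occasionally report a key that was never added. Making the bit array larger lowers that false-positive rate.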

1. Sqoop - Introduction
2. Sqoop Import - MySQL to HDFS
3. Exporting to MySQL from HDFS
4. Concept - Unbounded Dataset Processing or Stream Processing
5. Flume Overview: Agents - Source, Sink, Channel
6. Data from Local network service into HDFS
7. Example - Extracting Twitter Data
8. Example - Creating workflow with Oozie

Course on Big Data with Spark

1. Apache Spark ecosystem walkthrough
2. Spark Introduction - Why Spark?

1. Introduction, Access Scala on CloudxLab
2. Variables and Methods
3. Interactive, Compilation, SBT
4. Types, Variables & Values
5. Functions
6. Collections
7. Classes
8. Parameters

1. Apache Spark ecosystem
2. Why Spark?
3. Using the Spark Shell on CloudxLab
4. Example 1 - Performing Word Count
5. Understanding Spark Cluster Modes on YARN
6. RDDs (Resilient Distributed Datasets)
7. General RDD Operations: Transformations & Actions
8. RDD lineage
9. RDD Persistence Overview
10. Distributed Persistence

1. Creating the SparkContext
2. Building a Spark Application (Scala, Java, Python)
3. The Spark Application Web UI
4. Configuring Spark Properties
5. Running Spark on Cluster
6. RDD Partitions
7. Executing Parallel Operations
8. Stages and Tasks

1. Common Spark Use Cases
2. Example 1 - Data Cleaning (Movielens)
3. Example 2 - Understanding Spark Streaming
4. Understanding Kafka
5. Example 3 - Spark Streaming from Kafka
6. Iterative Algorithms in Spark
7. Project: Real-time analytics of orders in an e-commerce company

1. XML
2. AVRO
3. How to store many small files - SequenceFile?
4. Parquet
5. Protocol Buffers
6. Comparing Compressions
7. Understanding Row Oriented and Column Oriented Formats - RCFile?

1. Spark SQL - Introduction
2. Spark SQL - DataFrame Introduction
3. Transforming and Querying DataFrames
4. Saving DataFrames
5. DataFrames and RDDs
6. Comparing Spark SQL, Impala, and Hive-on-Spark

1. Machine Learning Introduction
2. Applications Of Machine Learning
3. MLlib Example: k-means
4. SparkR Example
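
For readers new to k-means, the sketch below shows the algorithm itself in plain Python: assign each point to its nearest centroid, then move each centroid to the mean of its points, and repeat. It is not the MLlib API. The sample points are made up, and the starting centroids are simply the first k points, which is a naive choice made to keep the example deterministic.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    centroids = [points[i] for i in range(k)]   # naive init: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, clusters


# Two hypothetical groups of 2-D points, one near (1, 1) and one near (9, 9).
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```

On a cluster, the assignment step is the part that parallelizes well, since each point can be compared with the centroids independently of the others.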
