Big Data
The Certified Big Data Foundation Specialist (CBDFS) designation is a globally recognized certification for Big Data professionals. Holding the CBDFS certification showcases your experience in a cloud environment and demonstrates the relevant skills and knowledge. Organizations that employ CBDFS-certified professionals have experts on board who can help maximize the business opportunities that cloud technologies are creating. The certification is awarded to individuals who have successfully passed the CBDFS exam.
After completing our e-course, you will be equipped not only with fundamental Big Data knowledge but also with an introduction to the broader tools and techniques covered in the curriculum below. This practical knowledge can serve as a starting point for your organization's Big Data journey. The Big Data Foundation e-course consists of a study guide eBook and an online course, delivered via our eLearning portal, giving you the freedom to access it anytime, whether at home or in the office.
Curriculum
Understanding Big Data and Hadoop
- Introduction to Big Data & Big Data Challenges
- Limitations & Solutions of Big Data Architecture
- Hadoop & its Features
- Hadoop Ecosystem
- Hadoop 2.x Core Components
- Hadoop Storage: HDFS (Hadoop Distributed File System)
- Hadoop Processing: MapReduce Framework
- Different Hadoop Distributions
Hadoop Architecture and HDFS
- Hadoop 2.x Cluster Architecture
- Federation and High Availability Architecture
- Typical Production Hadoop Cluster
- Hadoop Cluster Modes
- Common Hadoop Shell Commands
- Hadoop 2.x Configuration Files
- Single Node Cluster & Multi-Node Cluster set up
- Basic Hadoop Administration
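For readers who want a concrete feel for working with HDFS beyond the shell commands listed above, here is a minimal sketch (not part of the official course material) that mirrors commands such as hdfs dfs -mkdir and hdfs dfs -put using the Java FileSystem API. The paths and cluster settings are illustrative assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsShellSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumes fs.defaultFS is configured in core-site.xml (e.g. hdfs://namenode:8020);
        // otherwise this falls back to the local file system.
        FileSystem fs = FileSystem.get(conf);

        Path dir = new Path("/user/demo/input");            // hypothetical path
        fs.mkdirs(dir);                                     // ~ hdfs dfs -mkdir -p
        fs.copyFromLocalFile(new Path("data.txt"), dir);    // ~ hdfs dfs -put

        for (FileStatus status : fs.listStatus(dir)) {      // ~ hdfs dfs -ls
            System.out.println(status.getPath() + "  " + status.getLen());
        }
        fs.close();
    }
}
```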
Hadoop MapReduce Framework
- Traditional way vs MapReduce way
- Why MapReduce
- YARN Components
- YARN Architecture
- YARN MapReduce Application Execution Flow
- YARN Workflow
- Anatomy of MapReduce Program
- Input Splits, Relation between Input Splits and HDFS Blocks
- MapReduce: Combiner & Partitioner
- Demo of Health Care Dataset
- Demo of Weather Dataset
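To make the anatomy of a MapReduce program concrete, here is the classic word count example as a minimal Java sketch; it is illustrative only, and also shows where a combiner (covered in this module) plugs into the job configuration.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in a line of input.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer (also usable as a combiner): sums the counts for each word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);   // combiner, as covered in this module
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```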
Advanced Hadoop MapReduce
- Counters
- Distributed Cache
- MRUnit
- Reduce Join
- Custom Input Format
- Sequence Input Format
- XML file Parsing using MapReduce
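As an illustration of the counters topic above, the following sketch shows a mapper that tracks valid and malformed records with a custom counter; the record format and field count are assumptions made purely for the example.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Mapper that counts malformed records with a custom counter while
// passing valid lines through unchanged.
public class RecordAuditMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

  // Custom counter group; the totals appear in the job's counter report.
  public enum RecordQuality { VALID, MALFORMED }

  private static final int EXPECTED_FIELDS = 5; // assumed record width

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String[] fields = value.toString().split(",");
    if (fields.length == EXPECTED_FIELDS) {
      context.getCounter(RecordQuality.VALID).increment(1);
      context.write(value, NullWritable.get());
    } else {
      context.getCounter(RecordQuality.MALFORMED).increment(1);
    }
  }
}
```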
Apache Pig
- Introduction to Apache Pig
- MapReduce vs Pig
- Pig Components & Pig Execution
- Pig Data Types & Data Models in Pig
- Pig Latin Programs
- Shell and Utility Commands
- Pig UDF & Pig Streaming
- Testing Pig scripts with PigUnit
- Aviation use case in Pig
- Pig Demo of Healthcare Dataset
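The Pig programs in this module are written in Pig Latin; purely as an illustration, the sketch below embeds a few Pig Latin statements in Java via the PigServer API and runs them in local mode. The patients.csv file and its schema are hypothetical.

```java
import java.util.Iterator;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.data.Tuple;

// Runs a small Pig Latin script in local mode through the embedded PigServer API.
public class PigLatinSketch {
  public static void main(String[] args) throws Exception {
    PigServer pig = new PigServer(ExecType.LOCAL);

    // Hypothetical comma-separated patient file: (id, age, diagnosis)
    pig.registerQuery("records = LOAD 'patients.csv' USING PigStorage(',') "
        + "AS (id:int, age:int, diagnosis:chararray);");
    pig.registerQuery("adults = FILTER records BY age >= 18;");
    pig.registerQuery("by_diag = GROUP adults BY diagnosis;");
    pig.registerQuery("counts = FOREACH by_diag GENERATE group, COUNT(adults);");

    // Pull the results back to the client and print them.
    Iterator<Tuple> it = pig.openIterator("counts");
    while (it.hasNext()) {
      System.out.println(it.next());
    }
    pig.shutdown();
  }
}
```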
Apache Hive
- Introduction to Apache Hive
- Hive vs Pig
- Hive Architecture and Components
- Hive Metastore
- Limitations of Hive
- Comparison with Traditional Database
- Hive Data Types and Data Models
- Hive Partition
- Hive Bucketing
- Hive Tables (Managed Tables and External Tables)
- Importing Data
- Querying Data & Managing Outputs
- Hive Script & Hive UDF
- Retail use case in Hive
- Hive Demo on Healthcare Dataset
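As a rough illustration of querying Hive (not taken from the course demos), the sketch below connects to HiveServer2 over JDBC, creates a partitioned table, and runs an aggregate query. The endpoint, table name, and schema are assumptions, and the Hive JDBC driver jar is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Connects to HiveServer2 over JDBC and runs a query against a partitioned table.
public class HiveQuerySketch {
  public static void main(String[] args) throws Exception {
    // Assumed HiveServer2 endpoint; adjust host, port, and database for your cluster.
    String url = "jdbc:hive2://localhost:10000/default";
    try (Connection conn = DriverManager.getConnection(url, "", "");
         Statement stmt = conn.createStatement()) {

      // DDL: a managed table partitioned by sale_date (illustrative schema).
      stmt.execute("CREATE TABLE IF NOT EXISTS sales ("
          + "item STRING, amount DOUBLE) "
          + "PARTITIONED BY (sale_date STRING)");

      // Aggregate query over the table.
      try (ResultSet rs = stmt.executeQuery(
          "SELECT item, SUM(amount) FROM sales GROUP BY item")) {
        while (rs.next()) {
          System.out.println(rs.getString(1) + "\t" + rs.getDouble(2));
        }
      }
    }
  }
}
```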
Advanced Apache Hive and HBase
- Hive QL: Joining Tables, Dynamic Partitioning
- Custom MapReduce Scripts
- Hive Indexes and views
- Hive Query Optimizers
- Hive Thrift Server
- Hive UDF
- HBase vs RDBMS
- HBase Components
- HBase Architecture
- HBase Run Modes
- HBase Configuration
- HBase Cluster Deployment
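To give a flavour of the Hive UDF topic in this module, here is a minimal sketch of a user-defined function built on the classic UDF class; the function name and the registration commands shown in the comments are illustrative.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// A simple Hive UDF that upper-cases a string column. After packaging it
// into a jar, it could be registered from the Hive shell with commands like:
//   ADD JAR to_upper_udf.jar;
//   CREATE TEMPORARY FUNCTION to_upper AS 'ToUpperUDF';
public class ToUpperUDF extends UDF {
  public Text evaluate(Text input) {
    if (input == null) {
      return null;          // preserve NULLs, as Hive expects
    }
    return new Text(input.toString().toUpperCase());
  }
}
```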
Advanced Apache HBase
- HBase Data Model
- HBase Shell
- HBase Client API
- HBase Data Loading Techniques
- Apache Zookeeper Introduction
- ZooKeeper Data Model
- Zookeeper Service
- HBase Bulk Loading
- Getting and Inserting Data
- HBase Filters
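The following sketch illustrates the HBase client API items listed above (getting and inserting data); the table name, column family, and row key are made up for the example, and the table is assumed to already exist.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Inserts one cell into an HBase table and reads it back with the client API.
public class HBasePutGetSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from the classpath
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Table table = connection.getTable(TableName.valueOf("patients"))) { // assumed table

      // Insert: row key "p001", column family "info", qualifier "name".
      Put put = new Put(Bytes.toBytes("p001"));
      put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Alice"));
      table.put(put);

      // Read the same cell back.
      Get get = new Get(Bytes.toBytes("p001"));
      Result result = table.get(get);
      byte[] name = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
      System.out.println("name = " + Bytes.toString(name));
    }
  }
}
```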
Processing Distributed Data with Apache Spark
- What is Spark
- Spark Ecosystem
- Spark Components
- What is Scala
- Why Scala
- SparkContext
- Spark RDD
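Although this module introduces Spark alongside Scala, the short sketch below uses Spark's Java API (to stay consistent with the other examples on this page) to show an RDD being created, transformed, and reduced in local mode.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Builds an RDD in local mode, filters it, and aggregates it.
public class SparkRddSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("rdd-sketch").setMaster("local[*]");
    JavaSparkContext sc = new JavaSparkContext(conf);

    List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6);
    JavaRDD<Integer> numbers = sc.parallelize(data);

    // Transformations are lazy; nothing runs until an action is called.
    JavaRDD<Integer> evens = numbers.filter(n -> n % 2 == 0);
    int sum = evens.reduce((a, b) -> a + b);   // action: triggers execution

    System.out.println("sum of evens = " + sum);
    sc.stop();
  }
}
```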
Oozie and Hadoop Project
- Oozie
- Oozie Components
- Oozie Workflow
- Scheduling Jobs with Oozie Scheduler
- Demo of Oozie Workflow
- Oozie Coordinator
- Oozie Commands
- Oozie Web Console
- Oozie for MapReduce
- Combining flow of MapReduce Jobs
- Hive in Oozie
- Hadoop Project Demo
- Hadoop Talend Integration
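As a rough illustration of scheduling and submitting work with Oozie, the sketch below uses the Oozie Java client; the server URL, HDFS application path, and cluster addresses are assumptions, and the workflow.xml is assumed to already exist in HDFS.

```java
import java.util.Properties;

import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.WorkflowJob;

// Submits a workflow (whose workflow.xml already sits in HDFS) through the
// Oozie Java client and checks its status.
public class OozieSubmitSketch {
  public static void main(String[] args) throws Exception {
    // Assumed Oozie server URL.
    OozieClient client = new OozieClient("http://localhost:11000/oozie");

    Properties conf = client.createConfiguration();
    conf.setProperty(OozieClient.APP_PATH, "hdfs://namenode:8020/user/demo/wordcount-wf");
    conf.setProperty("nameNode", "hdfs://namenode:8020");       // assumed addresses
    conf.setProperty("jobTracker", "resourcemanager:8032");

    String jobId = client.run(conf);                            // submit and start
    System.out.println("Submitted workflow " + jobId);

    WorkflowJob job = client.getJobInfo(jobId);
    System.out.println("Status: " + job.getStatus());           // e.g. RUNNING, SUCCEEDED
  }
}
```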
Syllabus
- Foundation
- Execution and Implementation
- Management
- Big Data Solutions
- Analytics and Big Data
- Cloud Technologies
Target Audience
The course is best suited to Information Technology professionals who possess intermediate to advanced programming, systems administration, or relational database skills and are looking to move into the area of Big Data. These include:
- Software Engineers
- Application Developers
- IT Architects
- System Administrators
The course can also benefit other professionals, e.g. business analysts and market/data researchers, who possess strong Information Technology skills and have a deep interest in Big Data analytics and the benefits it can bring to an organization.