Master the essentials of Apache Hadoop, a core framework of the Big Data ecosystem. This course is designed for developers and administrators who need to manage and process large datasets efficiently using distributed storage and computing power.
Implement Hadoop Distributed File System (HDFS) for secure storage.
Execute MapReduce operations for processing large data files.
Administer and configure Hadoop ecosystem components effectively.
Gain hands-on experience with Apache Pig, Hive, and HBase.
Apache Hadoop is an open-source software framework, written in Java, for distributed storage and distributed processing of very large data sets (Big Data) on computer clusters built from commodity hardware. All Hadoop modules are designed on the assumption that hardware failures, whether of individual machines or of whole racks, are commonplace and should therefore be handled automatically in software by the framework.
What is covered in this course:
Download and install the JDK & JRE
Download and Install Apache Hadoop
Add a Path to the profile
Configure SSH
Configure Common HDFS and MapReduce Configurations
Format NameNode and Launch Hadoop Daemons
Copy a File from local to HDFS
List Files and Directories in HDFS
Copy a File from HDFS to Local
Display a File's Contents with cat
Remove a File or Directory from HDFS
Using Administrative Tools
Access NameNode via Web User Interface
Sourcing data from various Locations
Using Hadoop Archives
Parallel Copying with distcp
HDFS Upgrade Process
Configuring Rack Awareness
Installing Eclipse
Creating a Mapper Class
Creating a Reducer Class
Creating a Driver Class
Packing jar and Running MapReduce
Accessing JobTracker via Web User Interface
Running a Default MapReduce Job
Default Mapper
Default Partitioner
Default Reducer
Running a Streaming MapReduce Job
Understanding Counters
Writing User Defined Counters
Finding Logs
Directory structures of HDFS Components
Commissioning & Decommissioning slave nodes
Optimizing configuration settings
Using Teragen to generate data sets
Using Terasort to Benchmark Hadoop Cluster
Downloading Apache Pig
Installing Apache Pig
Configuring Apache Pig
Starting Pig in Local Mode
Starting Pig in MapReduce Mode
Running a Pig Script
Loading & Storing
Filtering & Transforming
Grouping & Sorting
Combining & Splitting
Writing User Defined Functions
Using Diagnostic Operations
Downloading Apache Hive
Installing Apache Hive
Configuring Apache Hive
Creating a Table in Hive
Loading data into the Table
Running HiveQL Statements
Creating Tables (Managed & External)
Using Partitions
Creating Views
Creating Indexes
Writing a Hive UDF
Downloading Apache HBase
Installing Apache HBase
Configuring Apache HBase
Creating a Table in Apache HBase
Installing HBase in Fully Distributed Mode
Creating a Table in HBase using HBase Shell
Loading Data in HBase using Pig
Running Hive Queries on HBase Tables
Using REST Server
Downloading Apache ZooKeeper
Installing Apache ZooKeeper
Configuring Apache ZooKeeper
Using the ZooKeeper CLI to perform functions
Downloading Apache Sqoop
Installing Apache Sqoop
Configuring Apache Sqoop
Downloading MySQL Connector for Sqoop
Importing Data from RDBMS to HDFS and Hive
Exporting Data from HDFS to RDBMS
Downloading Apache Flume
Installing Apache Flume
Configuring Apache Flume
Setting up Twitter Developer Accounts for API Keys
Setting the .conf file to stream data to HDFS
Streaming Twitter data to HDFS
Download & Install VMware Player on Windows
Download Cloudera CDH VM
Load Cloudera CDH using VMware Player
Using Cloudera Manager
Using Cloudera HUE
Exploring Cloudera CDH VM
Download the HDP 2.1 Sandbox
Load HDP 2.1 Sandbox using VMware Player
Getting Started with HDP 2.1 Sandbox
Using Apache Ambari
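The Mapper, Partitioner, Reducer, and Driver topics in the outline above can be previewed without a cluster. What follows is a minimal sketch in plain Python, not Hadoop API code: the function names (`mapper`, `partition`, `reducer`, `run_job`) are hypothetical, and the hash-modulo partitioning uses `zlib.crc32` as a stand-in for the key hash that Hadoop's default hash partitioner computes. It only illustrates how a word-count job flows from map, through shuffle, to reduce.

```python
from collections import defaultdict
from zlib import crc32


def mapper(line):
    """Emit (word, 1) for every word in a line, as a word-count mapper would."""
    for word in line.lower().split():
        yield word, 1


def partition(key, num_reducers):
    """Choose a reducer index by hashing the key modulo the reducer count.

    crc32 is an illustrative stand-in for the key's hash code.
    """
    return crc32(key.encode("utf-8")) % num_reducers


def reducer(key, values):
    """Sum all counts collected for one key."""
    return key, sum(values)


def run_job(lines, num_reducers=2):
    """Driver: map every line, shuffle pairs into partitions, reduce each one."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:
        for key, value in mapper(line):
            partitions[partition(key, num_reducers)][key].append(value)

    counts = {}
    for part in partitions:
        for key in sorted(part):  # each reducer processes its keys in sorted order
            word, total = reducer(key, part[key])
            counts[word] = total
    return counts


if __name__ == "__main__":
    sample = ["big data on hadoop", "hadoop stores big files"]
    print(run_job(sample))
```

Because every occurrence of a key hashes to the same partition, each word is totaled by exactly one reducer. That is the property the partitioning step exists to guarantee.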
Your team deserves training as unique as they are.
Let us tailor the course to your needs at no extra cost.
Aaron Steele
Casey Pense
Chris Tsantiris
Javier Martin
Justin Gilley
Kathy Le
Kelson Smith
Oussama Azzam
Pascal Rodmacq
Randall Granier