Big Data & Analytics Bundle (93% discount)

Description

Hadoop is one of the most commonly used Big Data frameworks, supporting the processing of large data sets in a distributed computing environment. This tool is becoming more and more essential to big business as the world becomes more data-driven. In this introduction, you’ll cover the individual components of Hadoop in detail and get a higher-level picture of how they interact with one another. It’s an excellent first step towards mastering Big Data processes.

Access 30 lectures & 5 hours of content 24/7
Install Hadoop in Standalone, Pseudo-Distributed, & Fully Distributed mode
Set up a Hadoop cluster using Linux VMs
Build a cloud Hadoop cluster on AWS w/ Cloudera Manager
Understand HDFS, MapReduce, & YARN & their interactions

Description

You see recommendation algorithms all the time, whether you realize it or not. Whether it’s Amazon recommending a product, Facebook recommending a friend, or Netflix recommending a new TV show, recommendation systems are a big part of internet life. Many of them rely on collaborative filtering, something you can perform through MapReduce on data collected in Hadoop. In this course, you’ll learn how to do it.

Access 4 lectures & 1 hour of content 24/7
Master the art of “thinking parallel” to break tasks into MapReduce transformations
Use Hadoop & MapReduce to implement a recommendations algorithm
Recommend friends on a social networking site using a MapReduce collaborative filtering algorithm
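
To make the idea concrete, here is a minimal sketch in plain Python of friend recommendation framed as map and reduce steps. It is not the course's implementation and does not use Hadoop; the FRIENDS graph and the function names are hypothetical, chosen only for illustration. The map step emits every pair of people who share a friend, and the reduce step counts shared friends while dropping pairs who are already connected.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy friendship graph (undirected adjacency lists).
FRIENDS = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "cat", "dan"],
    "cat": ["ann", "bob", "dan"],
    "dan": ["bob", "cat"],
}

def map_phase(user, friends):
    """Emit (pair, None) for existing friendships and (pair, 1) for
    every two people who have `user` as a common friend."""
    for friend in friends:
        yield tuple(sorted((user, friend))), None
    for a, b in combinations(sorted(friends), 2):
        yield (a, b), 1

def shuffle(pairs):
    """Group values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(values):
    """Return the number of common friends, or None if already friends."""
    if None in values:
        return None
    return sum(values)

def recommend(graph):
    mapped = (kv for user, friends in graph.items()
              for kv in map_phase(user, friends))
    recs = defaultdict(list)
    for (a, b), values in shuffle(mapped).items():
        count = reduce_phase(values)
        if count:
            recs[a].append((b, count))
            recs[b].append((a, count))
    # Strongest recommendation (most common friends) first.
    return {u: sorted(v, key=lambda t: (-t[1], t[0])) for u, v in recs.items()}

print(recommend(FRIENDS))
```

In this toy graph, ann and dan are the only pair who are not yet friends, and they share two friends (bob and cat), so each is recommended to the other.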

Description

For Big Data engineers and data analysts, HBase is an extremely effective database tool for organizing and managing massive data sets. HBase allows an increased level of flexibility, providing column-oriented storage, no fixed schema, and low latency to accommodate the dynamically changing needs of applications. With the 25 examples contained in this course, you’ll get a complete grasp of HBase that you can leverage in interviews for Big Data positions.

Access 41 lectures & 4.5 hours of content 24/7
Set up a database for your application using HBase
Integrate HBase w/ MapReduce for data processing tasks
Create tables, insert, read & delete data from HBase
Get a complete understanding of HBase & its role in the Hadoop ecosystem
Explore CRUD operations in the shell, & with the Java API

Description

The best way to learn is by example, and in this course you’ll get the lowdown on Scala with 65 comprehensive, hands-on examples. Scala is a general-purpose programming language that is highly scalable, making it incredibly useful in building programs. Over this immersive course, you’ll explore just how Scala can help your programming skill set, and how you can set yourself apart from other programmers by knowing this efficient tool.

Access 67 lectures & 6.5 hours of content 24/7
Use Scala w/ an intermediate level of proficiency
Read & understand Scala programs, including those w/ highly functional forms
Identify the similarities & differences between Java & Scala to use each to their advantages

Description

The functional programming nature and the availability of a REPL environment make Scala particularly well suited for a distributed computing framework like Spark. Using these two technologies in tandem can allow you to effectively analyze and explore data in an interactive environment with extremely fast feedback. This course will teach you how to best combine Spark and Scala, making it perfect for aspiring data analysts and Big Data engineers.

Access 51 lectures & 8.5 hours of content 24/7
Use Spark for a variety of analytics & machine learning tasks
Understand functional programming constructs in Scala
Implement complex algorithms like PageRank & Music Recommendations
Work w/ a variety of datasets from airline delays to Twitter, web graphs, & Product Ratings
Use the different features & libraries of Spark, like RDDs, Dataframes, Spark SQL, MLlib, Spark Streaming, & GraphX
Write code in Scala REPL environments & build Scala applications w/ an IDE

Description

Linear Regression is a powerful method for quantifying the cause and effect relationships that affect different phenomena in the world around us. This course will teach you how to build robust linear models that will stand up to scrutiny when you apply them to real world situations. You’ll even put what you’ve learnt into practice by leveraging Excel, R, and Python to build a model for stock returns.

Access 40 lectures & 5 hours of content 24/7
Cover method of least squares, explaining variance, & forecasting an outcome
Explore residuals & assumptions about residuals
Implement simple & multiple regression in Excel, R, & Python
Interpret regression results & avoid common pitfalls
Introduce a categorical variable
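
As a rough illustration of the method of least squares mentioned above, the sketch below fits a simple regression line in plain Python and reports how much variance it explains. The function names and the sample numbers are hypothetical and are not taken from the course.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(xs, ys))
    return 1 - ss_residual / ss_total

# Made-up sample data lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, r_squared(xs, ys, slope, intercept))
```

The residuals, the differences between each observed y and the fitted value, are what the sum in r_squared measures; real data would leave them nonzero.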

Description

Factor analysis helps to cut through the clutter when you have a lot of correlated variables to explain a single effect. This course will help you understand factor analysis and its link to linear regression. You’ll explore how Principal Components Analysis (PCA) is a cookie-cutter technique for factor extraction, and how it relates to machine learning.

Access 19 lectures & 1.5 hours of content 24/7
Understand principal components
Discuss eigenvalues & eigenvectors
Perform eigenvalue decomposition
Use principal components for dimensionality reduction & exploratory factor analysis
Apply PCA to explain the returns of a technology stock like Apple
Find the principal components & use them to build a regression model
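
For readers who want to see what finding a principal component involves, here is a minimal sketch in plain Python for two variables only. It uses the closed-form eigenvalue decomposition of a 2x2 covariance matrix; the function names and sample data are hypothetical, and a real analysis would normally use a numerical library.

```python
import math

def covariance_2d(xs, ys):
    """Sample covariance matrix entries: var(x), cov(x, y), var(y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(xs, ys)) / (n - 1)
    return var_x, cov_xy, var_y

def first_component(xs, ys):
    """Return the unit direction of the first principal component and
    the share of total variance it explains (assumes non-constant data)."""
    a, b, c = covariance_2d(xs, ys)
    mid = (a + c) / 2
    spread = math.hypot((a - c) / 2, b)
    largest, smallest = mid + spread, mid - spread  # the two eigenvalues
    if b != 0:
        vx, vy = b, largest - a  # eigenvector for the largest eigenvalue
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm), largest / (largest + smallest)

# Made-up, perfectly correlated data: one component explains everything.
direction, explained = first_component([1, 2, 3, 4], [1, 2, 3, 4])
print(direction, explained)
```

When two variables move together this closely, a single component captures nearly all of the variance, which is the idea behind using PCA for dimensionality reduction.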

Description

Big data is hot, and data management and analytics skills are your ticket to a fast-growing, lucrative career. This course will quickly teach you two technologies fundamental to big data: MapReduce and Hadoop. Learn and master the art of framing data analysis problems as MapReduce problems with over 10 hands-on examples. Write, analyze, and run real code along with the instructor, both on your own system and in the cloud using Amazon’s Elastic MapReduce service. By course’s end, you’ll have a solid grasp of data management concepts.

Learn the concepts of MapReduce to analyze big sets of data w/ 56 lectures & 5.5 hours of content
Run MapReduce jobs quickly using Python & MRJob
Translate complex analysis problems into multi-stage MapReduce jobs
Scale up to larger data sets using Amazon’s Elastic MapReduce service
Understand how Hadoop distributes MapReduce across computing clusters
Complete projects to get hands-on experience: analyze social media data, movie ratings & more
Learn about other Hadoop technologies, like Hive, Pig & Spark

Description

Hadoop is perhaps the most important big data framework in existence, used by major data-driven companies around the globe. Hadoop and its associated technologies allow companies to manage huge amounts of data and make business decisions based on analytics surrounding that data. This course will take you from big data zero to hero, teaching you how to build Hadoop solutions that will solve real world problems – and qualify you for many high-paying jobs.

Access 43 lectures & 10 hours of content 24/7
Learn how technologies like MapReduce apply to clustering problems
Parse a Twitter stream w/ Python, extract keywords w/ Apache Pig, visualize data w/ Node.js, & more
Set up a Kafka stream w/ Java code for producers & consumers
Explore real-world applications by building a relational schema for a health care data dictionary used by the US Department of Veterans Affairs
Perform log collection & analytics w/ the Hadoop Distributed File System (HDFS) using Apache Flume & Apache HCatalog
