Scala has seen wide adoption over the past few years, especially in data science and analytics. Spark, which is built on Scala, has gained broad recognition and is widely used in production. If you want to leverage the power of Scala and Spark to make sense of big data, this book is for you. By the end of this book, you will have a thorough understanding of Spark and will be able to perform full-stack data analytics with the confidence that no amount of data is too big.
- Access 898 pages & 26 hours and 56 minutes of content 24/7
- Understand object-oriented & functional programming concepts of Scala
- Develop an in-depth understanding of Scala collection APIs
- Explore working with RDDs & DataFrames to learn Spark’s core abstractions
- Explore analyzing structured & unstructured data using Spark SQL and GraphX
- Dive into scalable & fault-tolerant streaming application development using Spark Structured Streaming
Frank Kane’s Taming Big Data with Apache Spark and Python is your companion to learning Apache Spark in a hands-on manner. Frank will start you off by teaching you how to set up Spark on a single system or on a cluster, and you’ll soon move on to analyzing large data sets using Spark RDDs, and to developing and running effective Spark jobs quickly using Python. Apache Spark has emerged as the next big thing in the Big Data domain, rising from an ascending technology to an established superstar in just a matter of years. Spark allows you to quickly extract actionable insights from large amounts of data in real time, making it an essential tool in many modern businesses.
- Access 296 pages & 8 hours and 52 minutes of content 24/7
- Find out how you can identify Big Data problems as Spark problems
- Learn how to install & run Apache Spark on your computer or on a cluster
- Dive into analyzing large data sets across many CPUs using Spark’s Resilient Distributed Datasets
- Explore implementing machine learning on Spark using the MLlib library
- Learn how to process continuous streams of data in real time using the Spark Streaming module
- Explore performing complex network analysis using Spark’s GraphX library
- Learn to use Amazon’s Elastic MapReduce service to run your Spark jobs on a cluster