Learn how to use Spark to process Big Data at speed and scale for sharper analytics. Put the principles into practice for a faster, slicker Big Data projects
This book is for developers with little to no knowledge of Spark, but with a background in Scala/Java programming. It's recommended that you have experience in dealing and working with big data and a strong interest in data science.
When people want a way to process Big Data at speed, Spark is invariably the solution. With its ease of development (in comparison to the relative complexity of Hadoop), it's unsurprising that it's becoming popular with data analysts and engineers everywhere.
Beginning with the fundamentals, we'll show you how to get set up with Spark with minimum fuss. You'll then get to grips with some simple APIs before investigating machine learning and graph processing - throughout we'll make sure you know exactly how to apply your knowledge.
You will also learn how to use the Spark shell, how to load data before finding out how to build and run your own Spark applications. Discover how to manipulate your RDD and get stuck into a range of DataFrame APIs. As if that's not enough, you'll also learn some useful Machine Learning algorithms with the help of Spark MLib and integrating Spark with R. We'll also make sure you're confident and prepared for graph processing, as you learn more about the GraphX API.
"synopsis" may belong to another edition of this title.
Krishna Sankar is a chief data scientist, where he focuses on optimizing user experiences via inference, intelligence, and interfaces. His earlier roles include principal architect, data scientist at Tata America Intl, director of a data science and bioinformatics start-up, and a distinguished engineer at Cisco. He has spoken at various conferences, such as Strata-Sparkcamp, OSCON, Pycon. He was a guest lecturer at Naval Postgraduate School, Monterey. His other passion is Lego Robotics. You can find him at the St. Louis FLL World Competition as the robots design judge.
Holden Karau is a software development engineer and is active in the open source. She has worked on a variety of search, classification, and distributed systems problems at IBM, Alpine, Databricks, Google, Foursquare, and Amazon. She graduated from the University of Waterloo with a bachelor's of mathematics degree in computer science. Other than software, she enjoys playing with fire and hula hoops, and welding.
"About this title" may belong to another edition of this title.
Seller: GreatBookPrices, Columbia, MD, U.S.A.
Condition: New. Seller Inventory # 29164511-n
Seller: BargainBookStores, Grand Rapids, MI, U.S.A.
Paperback or Softback. Condition: New. Fast Data Processing with Spark 2. Book. Seller Inventory # BBS-9781785889271
Seller: California Books, Miami, FL, U.S.A.
Condition: New. Seller Inventory # I-9781785889271
Seller: GreatBookPrices, Columbia, MD, U.S.A.
Condition: As New. Unread book in perfect condition. Seller Inventory # 29164511
Seller: PBShop.store US, Wood Dale, IL, U.S.A.
PAP. Condition: New. New Book. Shipped from UK. THIS BOOK IS PRINTED ON DEMAND. Established seller since 2000. Seller Inventory # L0-9781785889271
Seller: Majestic Books, Hounslow, United Kingdom
Condition: New. Print on Demand. Seller Inventory # 407288873
Quantity: 4 available
Seller: PBShop.store UK, Fairford, GLOS, United Kingdom
PAP. Condition: New. New Book. Delivered from our UK warehouse in 4 to 14 business days. THIS BOOK IS PRINTED ON DEMAND. Established seller since 2000. Seller Inventory # L0-9781785889271
Quantity: Over 20 available
Seller: Books Puddle, New York, NY, U.S.A.
Condition: New. Seller Inventory # 26405865462
Seller: Ria Christie Collections, Uxbridge, United Kingdom
Condition: New. In. Seller Inventory # ria9781785889271_new
Quantity: Over 20 available
Seller: Chiron Media, Wallingford, United Kingdom
Paperback. Condition: New. Seller Inventory # 6666-IUK-9781785889271
Quantity: Over 20 available