Two open source platforms for predictive analytics offer data scientists the power to work with virtually unlimited data: Apache Spark and H2O.
Apache Spark is a general-purpose in-memory cluster computing system with built-in libraries for SQL, machine learning, graph analytics and streaming analytics. First released in 2012, Spark entered the Apache Incubator in 2013, graduated to top-level Apache project status in 2014, and is now included in every major Hadoop distribution. Interest in Apache Spark has exploded, and Spark has effectively dethroned the MapReduce paradigm.
H2O is less widely known outside of those data scientists who work on the cutting edge. H2O is an open source project dedicated to machine learning, adopted by more than two thousand users worldwide, including companies such as Cisco, eBay, Nielsen and PayPal. H2O's rapidly growing user base speaks to the strengths and capabilities of the platform.
Individually, each of these platforms provides data scientists with powerful capabilities; working together, they provide "best-of-breed" tooling across a wide range of analytic use cases and applications. In a combined solution, users can leverage Spark SQL and Spark Streaming for data ingestion together with H2O for the most advanced ensemble modeling and model deployment tools.
Thomas W. Dinsmore, an analytics expert at The Boston Consulting Group, reviews each of these platforms in depth from a practical, hands-on perspective. You will learn:
What you'll learn
Who this book is for
This book is for data scientists seeking to leverage the most advanced machine learning platform available today.
Thomas W. Dinsmore currently serves as a Knowledge Expert in Customer Analytics at The Boston Consulting Group. Previously, Thomas served as Director of Product Management for Revolution Analytics; as an Analytics Solution Architect for IBM Big Data Solutions; and as a Principal Consultant for SAS Professional Services.
Thomas brings to his current role more than twenty-five years of experience in predictive analytics. He has led or contributed to analytic solutions for more than five hundred clients across vertical markets and around the world, including AT&T, Banco Santander, Citibank, Dell, J.C. Penney, Monsanto, Morgan Stanley, Office Depot, Sony, Staples, United Health Group, UBS and Vodafone. His international experience includes work for clients in the United States, Puerto Rico, Canada, Mexico, Venezuela, Brazil, Chile, the United Kingdom, Belgium, Spain, Italy, Turkey, Israel, Malaysia and Singapore.
Although his roots are in hands-on customer analytics, in the past fifteen years Thomas has expanded the scope of his experience to include analytic software applications and broader solutions, including database integration and web applications. As a project lead, he has worked with DB2, Oracle, Netezza, SQL Server and Teradata. Thomas is certified in SAS, and has working experience with the leading analytic tools available in the market today, including SAS, R, SPSS, and Oracle Data Mining.