Simplify data science infrastructure to give data scientists an efficient path from prototype to production.
In Effective Data Science Infrastructure you will learn how to:
Design data science infrastructure that boosts productivity
Handle compute and orchestration in the cloud
Deploy machine learning to production
Monitor and manage performance and results
Combine cloud-based tools into a cohesive data science environment
Develop reproducible data science projects using Metaflow, Conda, and Docker
Architect complex applications for multiple teams and large datasets
Customize and grow data science infrastructure
Effective Data Science Infrastructure: How to make data scientists more productive is a hands-on guide to assembling infrastructure for data science and machine learning applications. It reveals the processes used at Netflix and other data-driven companies to manage their cutting-edge data infrastructure. In it, you’ll master scalable techniques for data storage, computation, experiment tracking, and orchestration that are relevant to companies of all shapes and sizes. You’ll learn how you can make data scientists more productive with your existing cloud infrastructure, a stack of open source software, and idiomatic Python.
The author is donating proceeds from this book to charities that support women and underrepresented groups in data science.
Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
About the technology
Growing data science projects from prototype to production requires reliable infrastructure. Using the powerful new techniques and tooling in this book, you can stand up an infrastructure stack that will scale with any organization, from startups to the largest enterprises.
About the book
Effective Data Science Infrastructure teaches you to build data pipelines and project workflows that will supercharge data scientists and their projects. Based on state-of-the-art tools and concepts that power the data operations of Netflix, this book introduces a customizable cloud-based approach to model development and MLOps that you can easily adapt to your company’s specific needs. As you roll out these practical processes, your teams will produce better and faster results when applying data science and machine learning to a wide array of business problems.
What's inside
Handle compute and orchestration in the cloud
Combine cloud-based tools into a cohesive data science environment
Develop reproducible data science projects using Metaflow, AWS, and the Python data ecosystem
Architect complex applications that require large datasets and models, and a team of data scientists
About the reader
For infrastructure engineers and engineering-minded data scientists who are familiar with Python.
About the author
At Netflix, Ville Tuulos designed and built Metaflow, a full-stack framework for data science. Currently, he is the CEO of a startup focusing on data science infrastructure.
Table of Contents
1 Introducing data science infrastructure
2 The toolchain of data science
3 Introducing Metaflow
4 Scaling with the compute layer
5 Practicing scalability and performance
6 Going to production
7 Processing data
8 Using and operating models
9 Machine learning with the full stack
"synopsis" may belong to another edition of this title.
Ville Tuulos has been developing tools and infrastructure for data science and machine learning for over two decades. At Netflix, he designed and built Metaflow, a full-stack framework for data science. Currently, he is the CEO of a startup focusing on data science infrastructure.
Effective Data Science Infrastructure: How to make data scientists more productive is a guide to building infrastructure that will supercharge data science projects and data scientists. Based on state-of-the-art practices that power the massive data operations of Netflix, this book offers techniques and patterns relevant to companies of all shapes and sizes. You'll learn how you can make data scientists more productive with your existing cloud infrastructure, a stack of open source software, and idiomatic Python.
As you work through this easy-to-follow guide, you'll set up end-to-end infrastructure from the ground up, with a fully customizable process you can easily adapt to your company. You'll build a cloud-based development environment that covers local prototyping and deployment to production, set up infrastructure that supports a real-world machine learning application, and handle a large-scale application for processing hundreds of gigabytes of data. Throughout, you'll follow a human-centric approach focused on user experience and meeting the unique needs of data scientists.
"About this title" may belong to another edition of this title.
Seller: medimops, Berlin, Germany
Condition: good. Good: An average worn book or dust jacket that shows signs of use but has all the pages present. Seller Inventory # M01617299197-G
Quantity: 1 available
Seller: AwesomeBooks, Wallingford, United Kingdom
Paperback. Condition: Very Good. Effective Data Science Infrastructure: How to Make Data Scientists Productive. This book is in very good condition and will be shipped within 24 hours of ordering. The cover may have some limited signs of wear, but the pages are clean and intact and the spine remains undamaged. This book has clearly been well maintained and looked after thus far. Money back guarantee if you are not satisfied. Order more than 1 book and get discounted shipping. Seller Inventory # 7719-9781617299193
Quantity: 2 available
Seller: Bahamut Media, Reading, United Kingdom
Paperback. Condition: Very Good. Shipped within 24 hours from our UK warehouse. Clean, undamaged book with intact pages and minimal wear to the cover. Spine still tight, in very good condition. Remember, if you are not happy, you are covered by our 100% money back guarantee. Seller Inventory # 6545-9781617299193
Quantity: 1 available
Seller: GreatBookPrices, Columbia, MD, U.S.A.
Condition: New. Seller Inventory # 43994888-n
Seller: INDOO, Avenel, NJ, U.S.A.
Condition: New. Seller Inventory # 9781617299193
Seller: PBShop.store US, Wood Dale, IL, U.S.A.
PAP. Condition: New. New Book. Shipped from UK. Established seller since 2000. Seller Inventory # PB-9781617299193
Seller: GreatBookPrices, Columbia, MD, U.S.A.
Condition: As New. Unread book in perfect condition. Seller Inventory # 43994888
Seller: INDOO, Avenel, NJ, U.S.A.
Condition: As New. Unread copy in mint condition. Seller Inventory # SS9781617299193
Seller: PBShop.store UK, Fairford, GLOS, United Kingdom
PAP. Condition: New. New Book. Shipped from UK. Established seller since 2000. Seller Inventory # PB-9781617299193
Quantity: 15 available
Seller: Grand Eagle Retail, Bensenville, IL, U.S.A.
Paperback. Condition: new. Paperback. Effective Data Science Infrastructure is a hands-on guide to assembling infrastructure for data science and machine learning applications. It reveals the processes used at Netflix and other data-driven companies to manage their cutting-edge data infrastructure. As you work through this easy-to-follow guide, you'll set up end-to-end infrastructure from the ground up, with a fully customizable process you can easily adapt to your company. You'll learn how you can make data scientists more productive with your existing cloud infrastructure, a stack of open source software, and idiomatic Python. Throughout, you'll follow a human-centric approach focused on user experience and meeting the unique needs of data scientists. About the Technology: Turning data science projects from small prototypes to sustainable business processes requires scalable and reliable infrastructure. This book lays out the workflows, components, and methods of the full infrastructure stack for data science, from data warehousing and scalable compute to modeling frameworks. Shipping may be from multiple locations in the US or from the UK, depending on stock availability. Seller Inventory # 9781617299193