Title: Apache Flume: Distributed Log Collection for Hadoop (What You Need to Know)
Publisher: Packt Publishing Limited
Publication Date: 2013
Language: English
ISBN 10: 1782167919
ISBN 13: 9781782167914
Binding: Paperback
Condition: New
Weight: 270 grams

About this title

Synopsis

If your role includes moving datasets into Hadoop, this book will help you do it more efficiently using Apache Flume. From installation to customization, it's a complete step-by-step guide on making the service work for you.

Overview

Integrate Flume with your data sources
Transcode your data en-route in Flume
Route and separate your data using regular expression matching
Configure failover paths and load-balancing to remove single points of failure
Utilize Gzip Compression for files written to HDFS

In Detail

Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Its main goal is to deliver data from applications to Apache Hadoop's HDFS. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with many failover and recovery mechanisms.

Apache Flume: Distributed Log Collection for Hadoop covers the problems that arise when streaming data and logs into HDFS, and how Flume can resolve them. The book explains the generalized architecture of Flume, which includes moving data to and from databases and NoSQL-ish data stores, as well as optimizing performance. It also includes real-world scenarios of Flume implementations.

Apache Flume: Distributed Log Collection for Hadoop starts with an architectural overview of Flume and then discusses each component in detail. It guides you through the complete installation process and compilation of Flume.

It introduces channels and channel selectors. For each architectural component (Sources, Channels, Sinks, Channel Processors, Sink Groups, and so on), the various implementations are covered in detail along with their configuration options, so you can customize Flume to your specific needs. The book also gives pointers on writing custom implementations.

By the end, you should be able to construct a series of Flume agents to transport your streaming data and logs from your systems into Hadoop in near real time.

What you will learn from this book

Understand the Flume architecture
Download and install open source Flume from Apache
Discover when to use a memory or file-backed channel
Understand and configure the Hadoop File System (HDFS) sink
Learn how to use sink groups to create redundant data flows
Configure and use various sources for ingesting data
Inspect data records and route to different or multiple destinations based on payload content
Transform data en-route to Hadoop
Monitor your data flows

Approach

A starter guide that covers Apache Flume in detail.

Who this book is written for

Apache Flume: Distributed Log Collection for Hadoop is intended for people who are responsible for moving datasets into Hadoop in a timely and reliable manner, such as software engineers, database administrators, and data warehouse administrators.

From the Author

There is an updated and expanded second edition, so please be sure to purchase that one instead. Search for ISBN: 978-1784392178 until I can get this one marked as old. Thanks!

"About this title" may belong to another edition of this title.

Store Description

We first started out as "The Paperback Exchange," a chain of physical bookstores where we would part exchange your beloved books for new stories to transport you to faraway places. However, as shopping started to evolve to online shops and marketplaces, we bid our stores goodbye to become "PBShop." This transition has only allowed us to blossom as we now ship thousands of titles to book lovers across the globe. We pride ourselves in being a community of local book lovers which allows our passion and devotion to shine in everything we do. In 2020 we not only celebrated our 20th birthday but our 1st birthday as being completely employee owned after becoming an E.O.T in September 2019. We are proud to be different and embrace standing apart on a book mountain by working from a virtual inventory which allows us to provide thousands of books that may be difficult to get for your bookshelf or your studies. Working with a number of different suppliers allows us to explore other avenues such as puzzles, sheet music and even stationery so we really do have something for everyone. Life is about being versatile in all realms of existence. If this is your first visit or you are a returning customer, we would like to welcome you to the PBShop family, for there is no friend as loyal as a book. We are a company who put our customers at the centre of everything we do as we understand the importance of reading because once you learn to read, you will forever be free. There are a whole lot of things in this world of ours that we are yet to explore, which is why we will forever inspire curious minds.

Seller's business information

Pbshop.co.uk Ltd
Unit 22 Horcott Industrial Estate, Horcott Road, Fairford, GL7 4BX, United Kingdom

Sale & Shipping Terms

Terms of Sale

Returns Policy
We ask all customers to contact us for authorisation should they wish to return their order. Orders returned without authorisation may not be credited.
If you wish to return, please contact us within 14 days of receiving your order to obtain authorisation.

Returns requested beyond this time will not be authorised.

Our team will provide full instructions on how to return your order and, once it is received, our returns department will process your refund.
Please note the cost to return any...

Shipping Terms

Books are shipped from our UK warehouse. Delivery thereafter takes between 4 and 14 business days, dependent upon your location. Please do contact us with any queries you may have.

Shipping rates within U.S.A.

Order quantity	7 to 14 business days	7 to 14 business days
First item	US$ 0.00	US$ 0.00

Delivery times are set by sellers and vary by carrier and location. Orders passing through Customs may face delays and buyers are responsible for any associated duties or fees. Sellers may contact you regarding additional charges to cover any increased costs to ship your items.