For veteran Linux programmers, Spector explains how to tie a group of computers together with a network and get them all to work on a large problem that has been broken down into smaller pieces. NASA's 1994 Beowulf cluster achieved supercomputer performance this way with Linux, at a fraction of the cost of conventional approaches. The disk contains a version of Red Hat Linux 6.2 customized for clustering, ready to go. Annotation c. Book News, Inc., Portland, OR (booknews.com)
Given that computers are a creation/projection of the human spirit, it is no surprise that they work against each other more often than not. Clients compete for the attention of servers, networks drain each other's bandwidth, and firewalls repel packet storms. But, just as system design mirrors our own greed, so too can it capture our Utopian dream that the whole is greater than the sum of the parts. Enter clusters, computer networks whose interconnectivity and communication protocols are interwoven so closely that the network can be used to solve single problems.
Author David Spector and the editors at O'Reilly achieve rare hacker-text synergy in recounting the adventure and in teaching the methods of networked hardware/software clusters in Building Linux Clusters, an extended how-to on coupling Linux boxes of all flavors (Alphas, Suns, 486 Intels, Pentiums) to work synchronously to compete with a multimillion-dollar supercomputer. Currently, the 62nd-fastest computer in the world is CPlant, a Linux cluster at Sandia National Labs (www.top500.org). The CPlant cluster is the equivalent of 1890 Intel-based Linux boxes that are running an expanded version of Don Becker's freely redistributable Beowulf platform for cluster operation.
The review of cluster building begins on hands and knees with an overview of networking basics: IP addressing and routing. Bandwidth and CPU-to-CPU timing requirements can be limiting factors, and because the nodes depend on one another, proper design requires a weak-link analysis that establishes the compatibility of CPUs, buses, hard drives, Ethernet cards, hubs, switches, and routers. Strategies for cluster sizes from a few nodes to several hundred are discussed.
In the book's second half, Spector turns his attention to cluster programming and applications, describing tools, languages (where FORTRAN is still well regarded), libraries, and environments for parallel programming. He also gives examples of parallel virtual machines that serve MP3, persistence-of-vision graphics, and Web data to other devices or applications. Four brief appendices provide the essential technical details: an annotated Webography, a message-passing application programming interface, installation scripts for starting up the cluster of nodes at boot time, and a database to administer the activity of the nodes.
The fast pace and light pedantic touch in this book illuminate complexities and engender an excitement in the idea that new capabilities are yet to be found, if we all could just get along. --Peter Leopold