Cassandra High Performance Cookbook

08/05/2011

Apache Cassandra is a fault-tolerant, distributed data store that offers linear scalability, which makes it suitable as a storage platform for large, high-volume websites.

This book provides detailed recipes that describe how to use the features of Cassandra and improve its performance. Recipes cover topics ranging from setting up Cassandra for the first time to complex installations that span multiple data centers. The recipe format presents the information in a concise, actionable form.

The book describes in detail how features of Cassandra can be tuned and what the possible effects of that tuning are. Recipes show how to access data stored in Cassandra and how to use third-party tools alongside it. The book also describes how to monitor Cassandra and plan capacity so that a cluster keeps performing at a high level. Towards the end, it takes you through the use of libraries and third-party applications with Cassandra, and through integrating Cassandra with Hadoop.

What you will learn from this book:

Interact with Cassandra using the command line interface
Write programs that access data in Cassandra
Configure and tune Cassandra components to enhance performance
Model data to optimize storage and access
Use tunable consistency to optimize data access
Deploy Cassandra in single and multiple data center environments
Monitor the performance of Cassandra
Manage a cluster by joining and removing nodes
Use libraries and third-party applications with Cassandra
Integrate Cassandra with Hadoop

Approach
This is a cookbook and all tasks are approached as recipes. A recipe describes a task and outlines the steps necessary to complete this task.

Some recipes in the book are examples of writing code, such as a recipe that stores and accesses the entries of a phone book in Cassandra. Such a recipe describes the program, gives a full code example, runs it, and displays the output. Finally, the "how it works" section describes the process or code in greater detail.

Other recipes in the book describe a task, such as a recipe that takes a snapshot backup of data in Cassandra. This recipe describes the process, shows how to run the snapshot command and confirm that it worked, and then explains what the snapshot command does behind the scenes. Finally, the "see also" section references related recipes, such as the recipe to restore a snapshot.

Who this book is written for
This book is designed for administrators, developers, and data architects who are interested in Apache Cassandra for redundant, high-performance, and scalable data storage. Readers should typically have experience working with a database technology, multi-node computer clusters, and high-availability solutions.