www.it-ebooks.info
Cassandra: The Definitive Guide
Eben Hewitt
Beijing • Cambridge • Farnham • Köln • Sebastopol • Tokyo
Cassandra: The Definitive Guide
by Eben Hewitt
Copyright © 2011 Eben Hewitt. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions
are also available for most titles (http://my.safaribooksonline.com). For more information, contact our
corporate/institutional sales department: (800) 998-9938 or corporate@oreilly.com.
Editor: Mike Loukides
Production Editor: Holly Bauer
Copyeditor: Genevieve d’Entremont
Proofreader: Emily Quill
Indexer: Ellen Troutman Zaig
Cover Designer: Karen Montgomery
Interior Designer: David Futato
Illustrator: Robert Romano
Printing History:
November 2010: First Edition.
Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of
O’Reilly Media, Inc. Cassandra: The Definitive Guide, the image of a Paradise flycatcher, and related
trade dress are trademarks of O’Reilly Media, Inc.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as
trademarks. Where those designations appear in this book, and O’Reilly Media, Inc. was aware of a
trademark claim, the designations have been printed in caps or initial caps.
While every precaution has been taken in the preparation of this book, the publisher and author assume
no responsibility for errors or omissions, or for damages resulting from the use of the information
contained herein.
This book uses RepKover™, a durable and flexible lay-flat binding.
ISBN: 978-1-449-39041-9
[M]
1289577822
This book is dedicated to my sweetheart,
Alison Brown. I can hear the sound of violins,
long before it begins.
Table of Contents
Foreword
Preface

1. Introducing Cassandra
What’s Wrong with Relational Databases?
A Quick Review of Relational Databases
RDBMS: The Awesome and the Not-So-Much
Web Scale
The Cassandra Elevator Pitch
Cassandra in 50 Words or Less
Distributed and Decentralized
Elastic Scalability
High Availability and Fault Tolerance
Tuneable Consistency
Brewer’s CAP Theorem
Row-Oriented
Schema-Free
High Performance
Where Did Cassandra Come From?
Use Cases for Cassandra
Large Deployments
Lots of Writes, Statistics, and Analysis
Geographical Distribution
Evolving Applications
Who Is Using Cassandra?
Summary

2. Installing Cassandra
Installing the Binary
Extracting the Download
What’s In There?
Building from Source
Additional Build Targets
Building with Maven
Running Cassandra
On Windows
On Linux
Starting the Server
Running the Command-Line Client Interface
Basic CLI Commands
Help
Connecting to a Server
Describing the Environment
Creating a Keyspace and Column Family
Writing and Reading Data
Summary

3. The Cassandra Data Model
The Relational Data Model
A Simple Introduction
Clusters
Keyspaces
Column Families
Column Family Options
Columns
Wide Rows, Skinny Rows
Column Sorting
Super Columns
Composite Keys
Design Differences Between RDBMS and Cassandra
No Query Language
No Referential Integrity
Secondary Indexes
Sorting Is a Design Decision
Denormalization
Design Patterns
Materialized View
Valueless Column
Aggregate Key
Some Things to Keep in Mind
Summary
[...] enormous volumes of data; the fact that it does stands as a monument to the ingenious architecture of the Web. But some of this infrastructure is starting to bend under the weight. In 1966, a company like IBM was in a position to really make people listen to their innovations. They had the problems, and they had the brain power to solve them. As we enter the second decade of the 21st century, we’re starting...

for some length of time; that’s the very point of making updates: that they’re there for others to read. However, a more subtle examination might lead us to want to find a way to tune these properties a bit and control them slightly. There is, as they say, no free lunch on the Internet, and once we see how we’re paying for our transactions, we may start to wonder whether there’s an alternative. Transactions...

DB2 database gets its name as the successor to DB1, the product built around the hierarchical data model IMS. IMS was released in 1968, and subsequently enjoyed success in Customer Information Control System (CICS) and other applications. It is still used today. But in the years following the invention of IMS, the new model, the disruptive model, the threatening model, was the relational database. In his...

RDBMS, NoSQL. The horse, the car, the plane. They each build on prior art, they each attempt to solve certain problems, and so they’re each good at certain things, and less good at others. They each coexist, even now. So let’s examine for a moment why, at this point, we might consider an alternative to the relational database, just as Codd himself four decades ago looked at the Information Management...
through the use of transactions, which require locking some portion of the database so it’s not available to other clients. This can become untenable under very heavy loads, as the locks mean that competing users start queuing up, waiting for their turn to read or write the data. We typically address these problems in one or more of the following ways, sometimes in this order:
• Throw hardware at the problem...

and updates in the database, which is exacerbated over a cluster.
• We turn our attention to the database again and decide that, now that the application is built and we understand the primary query paths, we can duplicate some of the data to make it look more like the queries that access it. This process, called denormalization, is antithetical to the five normal forms that characterize the relational...

CHAPTER 1
Introducing Cassandra
If at first the idea is not absurd, then there is no hope for it.
-- Albert Einstein
Welcome to Cassandra: The Definitive Guide. The aim of this book is to help developers and database administrators understand this important new database, explore how it compares to the relational database management systems we’re used to, and help you put...

same time, then one of them will have to wait for the other to complete.
Durable: Once a transaction has succeeded, the changes will not be lost. This doesn’t imply another transaction won’t later modify the same data; it just means that writers can be confident that the changes are available for the next transaction to work with as necessary.
On the surface, these properties seem so obviously desirable as...
First, the new model was very different from the old model, which it pointedly controverted. It was threatening because it can be hard to understand something different and new. Ensuing debates can help entrench people stubbornly further in their views, views that might have been largely inherited from the climate in which they learned their craft and the circumstances in which they...

a day, and in other Web 2.0 applications. The idea here is that you split the data so that instead of hosting all of it on a single server or replicating all of the data on all of the servers in a cluster, you divide up portions of the data horizontally and host them each separately. For example, consider a large customer table in a relational database. The least disruptive thing (for the programming...
Date posted: 21/02/2014, 19:20