Cassandra is an open-source, distributed, wide-column NoSQL database management
system designed to handle large amounts of data across many commodity servers,
providing high availability with no single point of failure. It is written in
Java and developed under the Apache Software Foundation.
Avinash Lakshman and Prashant Malik initially developed Cassandra at Facebook
to power the Facebook inbox search feature. Facebook released Cassandra as an
open-source project on Google Code in July 2008. In March 2009 it became an
Apache Incubator project, and in February 2010 it became a top-level project.
Cassandra owes much of its popularity to its technical features.
Apache Cassandra is used to manage very large amounts of structured data spread
out across the world. It provides a highly available service with no single
point of failure. Listed below are some key points about Apache Cassandra:
- It is scalable and fault-tolerant, and it offers tunable consistency.
- It is a wide-column store (often loosely described as column-oriented).
- Its distributed design is based on Amazon’s Dynamo and its data model on
Google’s Bigtable.
- It was created at Facebook, and it differs sharply from relational database
management systems.
Cassandra implements a Dynamo-style replication model with no single point of
failure, but adds a more powerful “column family” data model. Cassandra is used
by some of the biggest companies, such as Facebook, Twitter, Cisco, Rackspace,
eBay, and Netflix.
The design goal of Cassandra is to handle big data workloads across multiple
nodes without any single point of failure. Cassandra uses a peer-to-peer
distributed architecture across its nodes, and data is distributed among all
the nodes of the cluster.
All the nodes in a Cassandra cluster play the same role. Each node is
independent and, at the same time, interconnected with the other nodes. Each
node in a cluster can accept read and write requests, regardless of where the
data is actually located in the cluster. When a node goes down, read and write
requests can be served from other nodes in the network.
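The peer-to-peer behaviour described above can be illustrated with a small toy model in Python. This is only a sketch, not Cassandra's actual implementation: the `Node` and `Cluster` classes, the byte-sum placement rule, and the replication factor of 3 are all assumptions made for illustration, and real features such as consistency levels and hinted handoff are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One peer in the toy cluster (hypothetical model, not a Cassandra API)."""
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)


class Cluster:
    """Toy cluster in which every node plays the same role."""

    def __init__(self, names, replication_factor=3):
        self.nodes = [Node(name) for name in names]
        self.replication_factor = replication_factor

    def replicas_for(self, key):
        # Assumed placement rule: pick a starting node from a stable
        # function of the key, then take the next nodes in order.
        start = sum(key.encode("utf-8")) % len(self.nodes)
        return [
            self.nodes[(start + i) % len(self.nodes)]
            for i in range(self.replication_factor)
        ]

    def write(self, coordinator, key, value):
        # Any live node may coordinate; it forwards to the live replicas.
        self._require_up(coordinator)
        live = [r for r in self.replicas_for(key) if r.up]
        for replica in live:
            replica.data[key] = value
        return [replica.name for replica in live]

    def read(self, coordinator, key):
        # The read is served by the first live replica that holds the key.
        self._require_up(coordinator)
        for replica in self.replicas_for(key):
            if replica.up and key in replica.data:
                return replica.data[key]
        raise LookupError(f"no live replica holds {key!r}")

    @staticmethod
    def _require_up(node):
        if not node.up:
            raise ConnectionError(f"{node.name} is down")


cluster = Cluster(["N1", "N2", "N3", "N4", "N5"])
cluster.write(cluster.nodes[0], "user:42", "Alice")

# Take one replica down, then read through any node that is still up.
cluster.replicas_for("user:42")[0].up = False
coordinator = next(n for n in cluster.nodes if n.up)
print(cluster.read(coordinator, "user:42"))
```

Because the value was copied to several replicas at write time, the read still succeeds after one of them goes down, and the client is free to contact any surviving node.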
Features of Cassandra:
Cassandra has become popular because of its technical features. Here are some
of them:
- Easy data distribution –
It provides the flexibility to distribute data where you need it by replicating data across multiple data centres.
For example:
Suppose there are 5 nodes, say N1, N2, N3, N4, and N5. A partitioning
algorithm determines the token ranges, and data is distributed accordingly.
Each node owns a specific token range, and the data whose tokens fall in that
range is stored on that node. Let’s look at a diagram for a better
understanding.
- Flexible data storage –
Cassandra accommodates structured, semi-structured, and unstructured data formats. It can dynamically accommodate changes to your data structures according to your needs.
- Elastic scalability –
Cassandra is highly scalable and allows you to add more hardware to accommodate more customers and more data as required.
- Fast writes –
Cassandra was designed to run on cheap commodity hardware. It performs very fast writes and can store hundreds of terabytes of data without sacrificing read efficiency.
- Always-on architecture –
Cassandra has no single point of failure and is continuously available for business-critical applications that can’t afford a failure.
- Fast linear-scale performance –
Cassandra is linearly scalable: throughput increases as you increase the number of nodes in the cluster, while response times stay quick.
- Transaction support –
Cassandra does not provide full relational-style ACID transactions. Writes are atomic, isolated, and durable at the partition level, and consistency is tunable per operation rather than guaranteed in the ACID sense.
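To make the token-range example from the "Easy data distribution" point more concrete, here is a minimal Python sketch of a five-node token ring. It is an illustration under stated assumptions, not Cassandra's own code: the 32-bit token space, the MD5-based `token_for` function, and the helper names `build_ring` and `owner_of` are all hypothetical. Cassandra's default partitioner uses a Murmur3 hash over a much larger token range.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # assumption: a small token space, chosen for readability


def token_for(key: str) -> int:
    """Map a partition key to a token in [0, RING_SIZE)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(node_names):
    """Give each node one evenly spaced token; a node owns the range
    ending at (and including) its own token."""
    step = RING_SIZE // len(node_names)
    return [((i + 1) * step - 1, name) for i, name in enumerate(node_names)]


def owner_of(ring, key: str) -> str:
    """Find the first node whose token is >= the key's token,
    wrapping around to the first node at the end of the ring."""
    tokens = [token for token, _ in ring]
    index = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    return ring[index][1]


ring = build_ring(["N1", "N2", "N3", "N4", "N5"])
for key in ["alice", "bob", "carol"]:
    print(key, "->", owner_of(ring, key))
```

Since the hash spreads keys roughly uniformly over the token space, each of the five nodes ends up owning about one fifth of the data, which is what lets the cluster grow by adding nodes and reassigning ranges.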