CAP Theorem


What is the CAP Theorem?

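  • The CAP theorem states that a distributed data store can guarantee at most two of the following three properties at the same time. So before choosing any database (including a distributed database), you have to decide, based on your requirements, which two of the three you need.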
    • Consistency 

      – Whenever you read a record (or data), consistency guarantees that you get the same data no matter how many times you read it or which node you read it from. Put simply, each server returns the right response to each request, so the system always appears consistent whenever you read data from it or write data to it.

    • Availability

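      – Availability simply means that each request to a working node eventually receives a response, even if the data it returns is not the latest or is not consistent across the system. In the formal statement of the theorem, a message saying only that the system isn’t working does not count as a response.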

    • Partition Tolerance 

      – Partition tolerance means that the cluster continues to function even if there is a “partition” (a communication break) between two nodes: both nodes are up, but they can’t communicate with each other.

  • One of the three properties has to be sacrificed, so you have to choose a combination of CA, CP, or AP.
    • Consistency – Partition Tolerance:

      • The system gives you consistent data, and the data is distributed across the cluster. But the system (or part of it) becomes unavailable when a node goes down or is cut off from the others.
    • Availability – Partition Tolerance:

      • The system responds even if nodes are not communicating with each other (the nodes are up and running but cannot reach one another). But there is no guarantee that all nodes have the same data.
    • Consistency – Availability:

      • The system gives you consistent data, and you can read from or write to any node, but the data will not stay in sync if a partition develops between nodes. (RDBMS systems such as MySQL are usually classified as CA systems.)
    • According to the CAP theorem, Cassandra falls into the AP category. That does not mean Cassandra cannot give you consistent data: it can be tuned, according to your requirements, to return consistent results.
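
The tuning mentioned in the last point is commonly explained with the rule of thumb that a read is guaranteed to see the latest write when the number of replicas contacted for the read (R) plus the number that acknowledged the write (W) is greater than the replication factor (N). The sketch below only illustrates that arithmetic. The function names are made up for this example and are not part of any Cassandra driver, and only three of Cassandra's consistency levels (ONE, QUORUM, ALL) are modelled.

```python
def quorum(replication_factor: int) -> int:
    """Majority of the replicas: floor(N / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must respond for a given consistency level.

    Hypothetical helper for illustration; covers only ONE, QUORUM and ALL.
    """
    required = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return required[level]


def reads_see_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when R + W > N, i.e. read and write replica sets must overlap."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    n = 3  # assumed replication factor for the example
    for write_level, read_level in [("ONE", "ONE"),
                                    ("QUORUM", "QUORUM"),
                                    ("ONE", "ALL")]:
        ok = reads_see_latest_write(write_level, read_level, n)
        print(f"write={write_level:<6} read={read_level:<6} "
              f"overlap guaranteed: {ok}")
```

With a replication factor of 3, writing and reading at ONE leaves room for a stale read, which is the AP-style trade-off. Writing and reading at QUORUM forces the two replica sets to overlap, at the cost of failing requests when too few replicas are reachable.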

About Siva

Senior Hadoop developer with 4 years of experience in designing and architecting solutions for the Big Data domain, who has been involved in several complex engagements. Technical strengths include Hadoop, YARN, MapReduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.
