Cassandra is a NoSQL database with a peer-to-peer, ring-type architecture. It offers fault tolerance and data replication for higher availability, and it has no single point of failure. As a NoSQL database, Cassandra has received less research attention than the older and more widely used SQL databases. Its growing popularity makes it necessary to address the security-related and recovery-related concerns associated with its use. This review paper discusses the existing deletion mechanism in Cassandra and presents some identified issues related to backup and recovery in the Cassandra database. It also explores failure detection and the handling of failures such as node failure or data center failure. In addition, it reviews several possible solutions for backup and recovery, including recovery in case of disasters.




