
Cassandra Solr Setup Review Example

COMPANY X

Cassandra/Solr Cloud Setup Review

Version 1.0

Revisions

Date        Version   Name                        Description
2016/03/07  1.0       Nenad Bozic, Matija Gobec   Initial Version

Purpose

{{ Explain in this section what will be under review and why. }}

This document gives a review of the current Company X Cassandra cluster setup, including modification recommendations. The review covers the following Cassandra setup aspects: data model, cluster configuration, JVM configuration, OS configuration, deployment and client application. Cluster operations such as repair and logging are also addressed. The document also reviews Solr Cloud, which is used in front of Cassandra to serve analytics queries.

Structure of the Document

{{ Explain structure of the document and give the reader a hint of what can be found in which section. }}

This system has two important parts, Cassandra and Solr Cloud. The document is structured in a way to address both parts separately as well as their interaction. The first part of the document is a Cassandra review and the second part is a Solr Cloud review.

We believe it is easier to reason about the current situation and the proposed changes layer by layer than across the whole system. We start our review bottom up and this document is structured the same way: first we start from the OS level, move on to the JVM configuration and the Cassandra configuration, and finish with the application level. Each section consists of the current state and the proposed modifications. The proposed modifications are graded by severity: the most important ones should be short-term goals, the mid-importance ones mid-term goals, and the nice-to-haves should be done if there is time.

Scope

{{ Explain scope of this document and also explain severity of different issues. Each section will give explanation of problem and action points along with severity level. Check example in text below with severity levels and explanation what each severity level means. }}

The scope of this document is to provide action points for Company X’s current setup based on the existing requirements and needs. It covers a short-term vision, where the infrastructure has to run ‘as is’ in production, and a mid-term vision, where certain parts of the system are reworked to better align with the chosen technologies; a long-term vision (how we think the system should be architected) is provided in a separate document.

All action points have a severity level ranging from normal to critical. Here is an explanation of the severity levels:

  • Normal - nice to have, it would be nice to fix it but you can operate without it
  • Important - you can operate without it but important issues influence performance and operability significantly (our advice would be to fix it as soon as there is time and after the critical issues are fixed)
  • Critical - issues which need to be addressed immediately

1. Review Cassandra

{{ This section should map items from the SOW (agreed and defined upfront by the client as problems to be solved) to the respective sections in this document. If there are specific requirements for this review, the client is expected to provide them upfront. This does not influence the structure of the review but just puts additional focus on the defined problems. If there are no specific requirements, this section can be skipped. }}

Items from SOW:

  • Recommendations to improve and/or re-work the data model (review item 1.6.3. Data Model)
  • Repair Strategy – divided keyspaces, incremental – repair cycle to complete within one week; not adversely impact system performance (review 1.5.4. Repair)
  • Appropriate compaction strategy, especially for the statistics keyspace (review 1.6.4. Compaction)
  • Changes to the cluster configuration (review 1.5. Cassandra configuration)
  • Changes to OS and block IO tunings (review 1.3. OS configuration)
  • Replication factors, Read and Write Consistency settings (review 1.5.3. Replication factor, review 1.6. Client Application)

1.1. General

{{ Explain why/where Cassandra is used in the system, give a brief explanation of data in each keyspace/table. }}

Cassandra is currently used to store a couple of things:

  • the events keyspace stores events happening in the UI
  • the notification keyspace stores notifications sent to users

1.2. Deployment

{{ Cover here the nodes in the cluster, types of machines, hardware selection and utilization of resources (CPU, Disk, Network, Memory). }}

CPU utilization looks relatively low, so there is a lot of headroom. Disk I/O and network are almost idle, so the nodes aren’t doing much. This setup looks like overkill, but for the sake of data distribution and fault tolerance it is always good to have a bit of headroom. Repair doesn’t seem to be running, so these numbers are not complete.

Action points:

  1. [normal] Check out XY type of instances from AWS instead of Z type of instances because that and that

1.3. OS configuration

{{ This section should cover operating system, and all system settings related to Cassandra such as system clocks, file system, swap settings etc. }}

1.3.1. Synchronize clocks

Cassandra synchronizes data between nodes and repairs inconsistency by comparing timestamps for each cell. For any system where consistency matters, clock synchronization is a must. This is usually done by using a centralized NTP daemon or having all nodes form a mesh for clock synchronization.

Action points:

  1. [critical] Synchronize clocks on all nodes using NTP (should be automatically done during deploy/configure)
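
A minimal sketch of this action point on a Debian/Ubuntu node (package name and commands are assumptions; adapt them to the actual OS image and to an internal time source if one exists, and bake them into the deploy/configure automation):

    # install and enable the NTP daemon
    sudo apt-get install -y ntp
    sudo systemctl enable ntp
    sudo systemctl start ntp

    # verify the node is actually syncing against its peers
    ntpq -p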

1.4. JVM Configuration

{{ This section covers JVM settings such as memory and garbage collection settings and various configuration properties mostly connected to the selected GC. }}

The heap size is set really high, 32G … The G1GC default settings are good enough. Any further GC tuning should be preceded by running the cluster and collecting GC metrics.

Action points:

  1. [normal] Set MAX_HEAP_SIZE to 16G, because 16G leverages better that and that
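
As a sketch, on Cassandra 2.x/3.0 the heap is capped in cassandra-env.sh (newer versions move this to jvm.options); the value below mirrors the action point above and should be validated against the collected GC metrics:

    # conf/cassandra-env.sh - explicit heap cap instead of the auto-calculated value
    MAX_HEAP_SIZE="16G"
    # leave HEAP_NEWSIZE unset; pinning the young generation size is generally not recommended with G1GC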

1.5. Cassandra configuration

{{ This section covers all Cassandra specific settings. The cluster topology is important so we will first look at this, then cover cassandra.yaml and all settings that can be tweaked in there. It also should address consistency and replication. Finally it should cover all the background processes which need to run, such as the repair and backup processes. }}

1.5.1. Cluster topology

Run nodetool status and check topology. Give comments on rack distribution, size distribution per node, snitch used.

1.5.2. Cassandra yaml

{{ Check cassandra.yaml and the properties that need to be changed for the use case we are solving. Also compare cassandra.yaml with the default configuration file for the version under review and ask why some settings were changed. }}

Action points:

  1. [important] compaction_throughput_mb_per_sec: 16 - set to a much higher number or even leave it unbounded since CPU and disks are mostly idle
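
For illustration, the relevant cassandra.yaml excerpt could look like this (0 removes the throttle entirely; a concrete bound such as 64-128 is an assumption and should be tuned against the observed disk and CPU headroom):

    # cassandra.yaml
    # 0 disables compaction throttling; watch pending compactions with nodetool compactionstats
    compaction_throughput_mb_per_sec: 0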

1.5.3. Replication factor

Cover here the replication factor per keyspace and concentrate on consistency: is consistency or performance the priority, which consistency levels and replication factor are best for this use case, and is the replication good enough for robustness.

1.5.4. Repair

Repair is an important background process in Apache Cassandra for a lot of reasons, but mainly for two: one is to fix data inconsistencies, which are normal in eventually consistent systems, and the other is to help with failures and outages. The repair process is intensive; it goes through several phases and each phase hits a different resource, so it is fairly hard to configure properly. First each node must calculate a Merkle tree, which is a CPU intensive process, then exchange those hashes, which is network intensive, then compare them to find differences, which is again CPU intensive, and lastly exchange the SSTables containing differences, which is again network intensive. However, if you want to make sure you have consistent data, it is a necessary operation.

Check whether repair is running, which keyspaces need repair, whether the window in which data is read and the repair window make running repair worthwhile, whether repair can be run without impacting latency, and whether there are alternatives for routine repairs and for repairs after a node has been down.

Action points:

  1. [important] Enable regular repairs for critical keyspaces which are read frequently or have data that is deleted and updated often
  2. [critical] Have a defined procedure for bringing a node back up after it has been down longer than the hinted handoff window, either with a full repair or by streaming data from another node
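
A minimal sketch of a routine repair schedule, assuming the events and notification keyspaces from section 1.1 are the ones that need it (tools such as cassandra-reaper or subrange repair scripts are alternatives worth evaluating):

    # run on each node in turn, spread across the week so the full cycle completes within 7 days
    # -pr repairs only the node's primary ranges, avoiding duplicate work across replicas
    nodetool repair -pr events
    nodetool repair -pr notification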

1.6. Client Application

{{ Review all the code reading from and writing to Cassandra; cover driver settings, error handling and the data model. Also review the chosen compaction, compression and table settings. }}

1.6.1. Application 1 writing to Cassandra 

Action points:

  1. [important] Change the consistency level to QUORUM
  2. [critical] Increase the number of connections using pooling options
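
As a sketch of both action points, assuming the application uses the DataStax Java driver 3.x (the contact point, pool sizes and keyspace below are placeholders):

    import com.datastax.driver.core.*;

    public class CassandraClientFactory {
        public static Session connect() {
            // more connections per host than the driver default, sized to the observed request rate
            PoolingOptions pooling = new PoolingOptions()
                    .setCoreConnectionsPerHost(HostDistance.LOCAL, 2)
                    .setMaxConnectionsPerHost(HostDistance.LOCAL, 8);

            // QUORUM as the default consistency level for every statement issued through this session
            QueryOptions queries = new QueryOptions()
                    .setConsistencyLevel(ConsistencyLevel.QUORUM);

            Cluster cluster = Cluster.builder()
                    .addContactPoint("10.0.0.1")   // placeholder contact point
                    .withPoolingOptions(pooling)
                    .withQueryOptions(queries)
                    .build();
            return cluster.connect("events");      // keyspace from section 1.1
        }
    }

Individual statements that need a different guarantee can still override the default with Statement.setConsistencyLevel(...). The same sketch applies to Application 2 in 1.6.2.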


1.6.2. Application 2 reading from Cassandra

Action points:

  1. [important] Change the consistency level to QUORUM
  2. [critical] Increase the number of connections using pooling options

1.6.3. Data Model

1.6.3.1. First keyspace

Run nodetool cfstats, review the data model, check compaction, and give suggestions on what to change and the steps for how best to change it.

Action points:

  1. [critical] Change the clustering key from uuid to timeuuid (data is sorted by time and read by time)
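
Since the type of a clustering column cannot be altered in place, this change means creating a new table and migrating the data into it. A hypothetical shape of the target table (names and non-key columns are placeholders; the real schema is in Appendix A):

    CREATE TABLE events.events_by_user (
        user_id  uuid,
        event_id timeuuid,
        payload  text,
        PRIMARY KEY (user_id, event_id)
    ) WITH CLUSTERING ORDER BY (event_id DESC);

    -- reads naturally return the newest events first
    SELECT * FROM events.events_by_user WHERE user_id = ? LIMIT 100;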

1.6.3.2. Second keyspace

The data model is not well suited for Cassandra; move this data to a relational database because of the volume, the amount of reads and writes, and the number of different ways the data needs to be read.

Action points:

  1. [important] Move this data to PostgreSQL

1.6.4. Compaction

Cover compaction per keyspace and give suggestions on which strategy to use where, based on the use case.

Action points:

  1. [important] Change to LCS for tables in the first keyspace which have a lot of deletes
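
For illustration, switching an existing table to LCS is a single schema change (the table name is a placeholder); note that the switch triggers a rewrite of existing SSTables into the leveled layout, so it is best done one table at a time:

    ALTER TABLE events.events_by_user
        WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 160};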

2. Review Solr Cloud

2.1. General

{{ Give a general overview of why Solr Cloud is used and mention the problems so far which need additional focus to investigate. }}

2.2. OS configuration

{{ This section should cover OS settings for Solr machines. }}

The current file descriptor limit is XXXX. There are currently YY collections in the cloud, and each month at least one new collection is added due to the partitioning strategy for XYZ data (one collection per month).

Action points:

  1. Set the file descriptor limit to XXXXXX
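
A sketch of how the limit is usually raised on Linux, assuming Solr runs as the solr user (65536 is only an example value; substitute the target count from the action point):

    # /etc/security/limits.conf
    solr  soft  nofile  65536
    solr  hard  nofile  65536

    # verify from a fresh login of the solr user / after restarting the service
    ulimit -n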

2.3. JVM Configuration

{{ Solr is a Java application so this section is expected to provide a review of JVM settings important for Solr to run properly. }}

Action points:

  1. [normal] Set MAX_HEAP_SIZE to 16G
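
As a sketch, the Solr heap is usually set through solr.in.sh rather than a MAX_HEAP_SIZE variable (the value mirrors the action point and should be validated against GC metrics):

    # solr.in.sh (e.g. /etc/default/solr.in.sh when installed via the service installer)
    SOLR_HEAP="16g"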

2.4. Solr configuration

{{ This section covers all important aspects of Solr configuration. }}

Action points:

  1. [important] Set up an external ZooKeeper ensemble of 3 nodes instead of the embedded ZK nodes
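
As a sketch, pointing Solr at an external ensemble is done through the ZooKeeper connection string, either on the command line or permanently in solr.in.sh (hostnames are placeholders):

    # one-off start against an external 3-node ensemble
    bin/solr start -cloud -z zk1:2181,zk2:2181,zk3:2181

    # or permanently, in solr.in.sh
    ZK_HOST="zk1:2181,zk2:2181,zk3:2181"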

2.5. Backup/failover strategy

{{ This section will cover failure scenarios when Solr is down. Solr uses indexes which are kept in memory, but when all machines (or some of the machines) fail, there needs to be a strategy for restoring indexes, either with replication or with some kind of recovery process. }}

Solr Cloud currently uses a replication factor of 1, which is problematic since the data is not replicated anywhere; if a single machine fails, its data is lost.

Action points:

  1. [important] Set the replication factor to at least 2
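
For illustration, new monthly collections can be created with a higher replication factor and existing ones can get an extra replica through the Collections API (collection, shard and config names are placeholders):

    # create next month's collection with two replicas per shard
    curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=xyz_2016_04&numShards=2&replicationFactor=2&collection.configName=xyz_conf"

    # add a replica to an existing, under-replicated shard
    curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=xyz_2016_03&shard=shard1"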

Summary

{{ Give brief overview of the whole system and general impression, how good is the current setup, are there many major issues or just few minor tweaks here and there. }}

Generally, the first impression is that Cassandra was chosen for good reasons: the amount and type of data fit it well. However, the data model is done completely relationally, with indexes all over the place, which prevents leveraging the good features of Cassandra. Solr Cloud is in good shape; most of the effort needs to go into making it robust with a higher replication factor and a recovery strategy.

Future steps

{{ List in this section the plan of action items; find big chunks of work and order them by priority. This section should give the client a brief overview of what needs to be done in the upcoming period. It can be seen as a backlog without small fine-grained tasks. The action items listed in each review section can be seen as fine-grained detailed tasks that are grouped here into logical and ordered steps. }}

  • Remodel table XYZ because the data model is not appropriate because of that and that...
  • Choose a repair option because of consistency issues…
  • Use monitoring of that and that to be in full control.
  • Change cassandra.yaml to fix the critical issues and get a solid base configuration.

Appendix A - table schema

{{ Appendixes should state important facts about the current system that the modification suggestions are based upon. }}

Add here the table schema so we can reference it inside the document.

Appendix B - Access Patterns

Add all the access patterns important for review and data modeling.

Appendix C - Sizing

Calculate wide partitions and data volumes

Appendix D - Nodetool status

Run nodetool status to display important information about the cluster.

Appendix E - Table histograms

Run the important commands for this review, such as table histograms.