Cassandra: maintenance


I am inexperienced with Cassandra, but I have some experience with SQL-based relational databases.

I have been unable to find best practices information about how to maintain Cassandra once deployed. Is it necessary to VACUUM the database? I should think that read/write loads cause fragmentation in the storage.

Or more generally: what are the best practices for maintaining a Cassandra production deployment? What has to be done at regular intervals to maintain the health of the system? The Operations manual really doesn't discuss this aspect.


Best Answer

In general, a well designed cluster can live for YEARS without being touched. I've had clusters that ran for years hands-off. However, here are some guidelines:

Monitoring is hugely important:

1) Monitor latencies. Use OpsCenter or your favorite metrics tool to keep track of them. Rising latencies can be a sign of problems coming, including GC pauses (more common in read workloads than write workloads), sstable problems, and the like.

2) Monitor sstable counts. SSTable counts will increase if writes outpace compaction (each sstable is written exactly once; deletes and overwrites are handled by combining old sstables into new sstables through compaction).

3) Monitor node state changes (up/down, etc.). If you see nodes flapping, investigate, as it's not normal.

4) Keep track of your disk usage - traditionally, you need to stay under 50% (especially if you use STCS), because compaction writes the new sstables before it can remove the old ones.
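
If you don't have a metrics stack yet, checks 2 and 4 can be sketched in a few lines of Python. This is only a rough illustration, not a replacement for proper monitoring: the function names and the alert limit are made up for the example, and the parser assumes `nodetool tablestats` (called `cfstats` in older releases) prints lines shaped like `Keyspace : ks`, `Table: t` and `SSTable count: N`, which may differ slightly between Cassandra versions.

```python
import re
import shutil

DISK_THRESHOLD = 0.50  # the "stay under 50%" rule of thumb from item 4


def disk_used_fraction(path):
    """Return the used fraction (0.0-1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def parse_sstable_counts(tablestats_text):
    """Map 'keyspace.table' -> SSTable count from tablestats-style text.

    Assumption: the text contains lines like 'Keyspace : ks', 'Table: t'
    and 'SSTable count: N'. Adjust the patterns for your version's output.
    """
    counts = {}
    keyspace = table = None
    for line in tablestats_text.splitlines():
        line = line.strip()
        m = re.match(r"Keyspace\s*:\s*(\S+)", line)
        if m:
            keyspace, table = m.group(1), None
            continue
        m = re.match(r"Table:\s*(\S+)", line)
        if m:
            table = m.group(1)
            continue
        m = re.match(r"SSTable count:\s*(\d+)", line)
        if m and keyspace and table:
            counts[f"{keyspace}.{table}"] = int(m.group(1))
    return counts


def sstable_alerts(counts, limit):
    """Return the tables whose SSTable count exceeds `limit`, sorted by name."""
    return sorted(name for name, n in counts.items() if n > limit)
```

In practice you would feed `parse_sstable_counts` the captured output of `nodetool tablestats`, compare `disk_used_fraction` of your data directory against `DISK_THRESHOLD`, and pick a `limit` that matches what is normal for your tables; there is no universal number.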

There are some basic things you should and shouldn't do regularly:

1) Don't explicitly run nodetool compact as routine maintenance. It's not fatal, but it does create very large sstables, which are then less likely to participate in compaction moving forward. If you've already run it, you don't necessarily need to keep running it, though it may sometimes help to get rid of deleted/overwritten data.

2) nodetool repair is typically recommended at least once every gc_grace_seconds (10 days by default). There are workloads where this is less important. The biggest reason you NEED repair is to make sure deletion markers (tombstones) reach every replica before they expire: tombstones live for gc_grace_seconds, and if a node was down when the delete happened and never receives the tombstone, that data may come back to life without the repair! If you don't issue deletes, and you query with a sufficient consistency level (reads and writes at QUORUM, for example), you can actually live a life without repair.

3) If you are going to repair, consider using incremental repair, and repair small ranges at a time.

4) Compaction strategies matter - a lot. STCS (size-tiered) is great for writes, LCS (leveled) is great for reads. DTCS (date-tiered) has some quirks.

5) Data models matter - just as RDBMS/SQL environments get into trouble when unindexed queries hit large tables, Cassandra can be problematic with very large rows/partitions.

6) Snapshots are cheap. Very cheap. They are nearly instant, just hard links, and they cost almost no disk space immediately. Take a snapshot before you upgrade versions, especially major versions.

7) Be careful with deletes. As hinted in #2, a delete writes a tombstone, so it creates more data on disk, and that space isn't freed for AT LEAST gc_grace_seconds.
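
The timing in #2 is easy to get wrong, so here is a small Python sketch of the arithmetic. The helper names and the `headroom` safety factor are my own assumptions for illustration (Cassandra has no such setting); the only values taken from above are the 10-day default for gc_grace_seconds and the idea that QUORUM reads plus QUORUM writes always overlap on at least one replica.

```python
from datetime import datetime, timedelta, timezone

GC_GRACE_SECONDS = 864_000  # default gc_grace_seconds (10 days)


def repair_deadline(last_repair, gc_grace_seconds=GC_GRACE_SECONDS, headroom=0.8):
    """Time by which the next repair should finish.

    `headroom` is a hypothetical safety factor: 0.8 means "aim to finish
    within 80% of gc_grace_seconds" so a slow repair still beats expiry.
    """
    return last_repair + timedelta(seconds=gc_grace_seconds * headroom)


def repair_overdue(last_repair, now, gc_grace_seconds=GC_GRACE_SECONDS, headroom=0.8):
    """True if `now` is past the deadline computed from the last completed repair."""
    return now > repair_deadline(last_repair, gc_grace_seconds, headroom)


def quorum(replication_factor):
    """Number of replicas that make up a QUORUM for the given replication factor."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas, write_replicas, replication_factor):
    """True if every read set overlaps every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    last = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(repair_deadline(last))
    print(repair_overdue(last, datetime(2024, 1, 10, tzinfo=timezone.utc)))
    print(quorum(3), reads_see_latest_write(quorum(3), quorum(3), 3))
```

If you lower gc_grace_seconds on a table to reclaim space from deletes sooner, the repair deadline shrinks with it, which is the trade-off behind #7.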

When all else fails:

I've seen articles that suggest Cassandra in prod requires a dedicated person to manage a cluster of any size - I don't know that it's necessarily true, but if you're concerned, you may want to hire a third-party consultant (The Last Pickle, Pythian) or get a support contract (DataStax) to give you some peace of mind.