Log-structured Memory for DRAM-based Storage

Log-structured Memory for DRAM-based Storage Best Paper on FAST 2014. Provided by RAMCloud Group.

Recent storage system stores all data in DRAM, such as memcached, redis, RAMCloud, Spark and Tachyon. Most of the applications use malloc interface to allocate memory. Memory in this case can not be moved once it is allocated increasing the fragmentation. On the other way, JVM use dynamic memory management, including fragment merging, compressing and incremental garbage collection. None of these memory management are efficient in the aspect of memory usage when running changing workloads. The benchmark methods is:

First,the workload allocates 50GB of memory using objects from a particular size distribution; it deletes existing objects at random in order to keep the amount of live data from exceeding 10GB. In the second phase the workload deletes a fraction of the existing objects at random. The third phase is identical to the first except that it uses a different size distribution (objects from the new distribution gradually displace those from the old distribution).

In doing this, memory consumption increased by an additional 30% and application experienced occasional pauses of one second or more.

An ideal memory allocator for a DRAM-based storage system such as RAMCloud should have two properties. First, it must be able to copy objects in order to eliminate fragmentation. Second, it must not require a global scan of memory: instead, it must be able to perform the copying incrementally, garbage collecting small regions of memory independently with cost proportional to the size of a region. Among other advantages, the incremental approach allows the garbage collector to focus on regions with the most free space.

Archtecture

RAMCloud uses DRAM as the main storage, this is the master module, and uses local disk or flash memory to store backup copies of data, this is the backup module. These two module runs co-located on a single node.

Data Structure

log-structured-storage

The RAMCloud use a simple key-value data structure to manage the memory. New object are allocated at the log head. To retrieve a object RAMCloud search the object key in the Hash Table. To delete the object RAMCloud marks the object as garbage in the Hash Table. As a result, allocation, read, write and deallocate are all very fast. This log-structure is commonly used in SSDs, but the write operation in RAMCloud do not need a prepended erase operation.

Garbage Collection

Upon a removal of an object, fragmentation will emerged. RAMCloud defined two-level garbage collection.

  • The first level of cleaning, called segment compaction, operates only on the in-memory segments on masters and consumes no network or disk I/O.
  • The second level of cleaning is combined cleaning, cleans both disk and memory together.

Parallel Cleaning

It is easy to perform cleaning on multiple segments concurrently because there is no conflicts. However, hash table updates need to be immutable.