Learning-based Memory Allocation for C++ Server Workloads

This work combines modern machine learning techniques with a novel memory manager, the Learned Lifetime-Aware Memory Allocator (LLAMA), which organizes the heap by object lifetime on top of huge pages (each divided into blocks and lines).

Huge Pages

2 MB huge pages reduce TLB misses and can increase performance by up to 53%. However, a C++ memory allocator can incur significant memory fragmentation with huge pages (up to 2x).

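Assume the average object size is 64 B and 99.99% of objects are short-lived. With 4 KB pages (64 objects per page), the probability that a page contains at least one long-lived object is 1 - 0.9999^64, or about 0.6%. With 2 MB huge pages (32,768 objects per page), the same calculation gives about 96%, so almost every huge page holds at least one long-lived object and cannot be released.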
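
The arithmetic behind these figures can be checked with a short sketch. It assumes each object is long-lived independently of the others, and that 99.99% of objects are short-lived, which is the fraction consistent with the 0.6% figure. The function name and its parameters are illustrative, not taken from the paper.

```python
def p_long_lived(page_bytes, object_bytes=64, short_lived_fraction=0.9999):
    """Probability that a page holds at least one long-lived object.

    Assumes objects are long-lived independently of each other.
    """
    objects_per_page = page_bytes // object_bytes
    # The page is "clean" only if every object on it is short-lived.
    return 1.0 - short_lived_fraction ** objects_per_page


small = p_long_lived(4 * 1024)          # 64 objects per 4 KB page
huge = p_long_lived(2 * 1024 * 1024)    # 32,768 objects per 2 MB page

print(f"4 KB page: {small:.1%}")  # roughly 0.6%
print(f"2 MB page: {huge:.1%}")   # roughly 96%
```

The jump comes entirely from the exponent: a huge page holds 512 times as many objects, so the chance that all of them are short-lived collapses.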

Profile-guided Optimization

Java runtimes already tune performance for short-lived and long-lived objects (for example, through generational garbage collection) and can compact memory by moving objects.

ML-based Lifetime Prediction

The training data is collected by sampling malloc/free calls and recording their stack traces. However, raw stack traces do not carry over well:

  1. Pointer-based stack traces are only valid within a single run.
  2. Symbol-based stack traces differ across versions and builds.

The solution is to treat symbolized stack traces (function names) as text and apply an LSTM language model to them.
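
To illustrate why treating function names as text helps, here is a minimal sketch of turning a symbolized stack trace into a token sequence that a language model could consume. The tokenization rule (split on anything that is not a letter, digit, or underscore), the `<frame>` separator, and the function names in the example traces are all assumptions made for illustration; the notes above do not specify the exact scheme.

```python
import re


def tokenize_frame(symbol):
    """Split one symbolized frame into name tokens, dropping punctuation."""
    return [t for t in re.split(r"[^A-Za-z0-9_]+", symbol) if t]


def tokenize_trace(frames):
    """Flatten a stack trace into tokens, marking the end of each frame."""
    tokens = []
    for frame in frames:
        tokens.extend(tokenize_frame(frame))
        tokens.append("<frame>")  # hypothetical frame separator token
    return tokens


# Hypothetical traces for the same allocation site in two builds:
# the second build moved the handler into a new namespace.
trace_v1 = ["kv::Store::Insert", "rpc::Server::HandleRequest"]
trace_v2 = ["kv::Store::Insert", "rpc::v2::Server::HandleRequest"]

tokens_v1 = tokenize_trace(trace_v1)
tokens_v2 = tokenize_trace(trace_v2)

# Exact string matching sees two different traces, but every token of
# the old trace still appears in the new one.
print(trace_v1 == trace_v2)
print(set(tokens_v1) <= set(tokens_v2))
```

Exact matching on whole symbols breaks as soon as a name changes, while a model that reads tokens still sees mostly familiar input.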

Allocator

The allocator manages memory in huge pages, and each huge page is assigned a lifetime class. It is developed as a drop-in replacement for TCMalloc.
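
The idea of grouping objects by lifetime class can be modeled with a toy sketch. This is not LLAMA's implementation: the class and method names are hypothetical, lifetime classes are plain integers supplied by the caller, space inside a page is handed out by a simple bump counter, and the block and line structure is omitted. It only shows that when short-lived and long-lived objects live on separate huge pages, a page can be released as soon as its own objects die.

```python
from dataclasses import dataclass

HUGE_PAGE_BYTES = 2 * 1024 * 1024


@dataclass(eq=False)  # compare pages by identity, not by field values
class HugePage:
    lifetime_class: int
    used: int = 0   # bytes handed out so far (never reused in this toy)
    live: int = 0   # number of objects still alive on this page

    def try_alloc(self, size):
        if self.used + size > HUGE_PAGE_BYTES:
            return False
        self.used += size
        self.live += 1
        return True


class ToyLifetimeAllocator:
    def __init__(self):
        self.pages = {}  # lifetime class -> list of HugePage

    def allocate(self, size, lifetime_class):
        """Place an object on a huge page of the given lifetime class."""
        if size > HUGE_PAGE_BYTES:
            raise ValueError("object larger than a huge page")
        pages = self.pages.setdefault(lifetime_class, [])
        for page in pages:
            if page.try_alloc(size):
                return page
        page = HugePage(lifetime_class)
        page.try_alloc(size)
        pages.append(page)
        return page

    def free(self, page):
        """Free one object; return True if the whole page was released."""
        page.live -= 1
        if page.live == 0:
            self.pages[page.lifetime_class].remove(page)
            return True
        return False


alloc = ToyLifetimeAllocator()
short_lived = [alloc.allocate(64, lifetime_class=0) for _ in range(100)]
long_lived = alloc.allocate(64, lifetime_class=1)

released = [alloc.free(page) for page in short_lived]
print(released[-1])          # the short-lived page is released
print(len(alloc.pages[1]))   # the long-lived page is still in use
```

If all 101 objects had shared one huge page, the single long-lived object would have kept the entire 2 MB resident.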