Improving Cache Performance by Exploiting Read-Write Disparity

20 May 2015

Notes on the paper "Improving Cache Performance by Exploiting Read-Write Disparity", published at HPCA 2014.

This paper aims to improve the cache hit ratio in CPUs. The authors observe that, for some applications, dirty cache lines are very likely never to be read again before they are evicted.

The dirty cache lines are divided into two sub-categories. The first sub-category consists of dirty cache lines that are written to with no subsequent reads before eviction. The second sub-category, denoted by the gray bar at the bottom, depicts the subset of dirty lines that are read at least once in addition to being written to.

[Figure: dirty_cachelines]

Both of these patterns are relatively stable as long as the workload does not change. As a result, we can study them to design the cache eviction strategy.

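The proposed policy, Read-Write Partitioning (RWP), splits each cache set into a dirty partition and a clean partition. There are three cases to consider when choosing a replacement victim, based on the current number of dirty lines in a cache set: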

– Current number of dirty lines is greater than the predicted best dirty partition size. This means that the set currently has more dirty lines than it should have. In this case, RWP picks the LRU line from the dirty partition as the replacement victim.

– Current number of dirty lines is smaller than the predicted best dirty partition size. This means that the set currently has more clean lines than it should have. In this case, RWP picks the LRU line from the clean partition as the replacement victim.

– Current number of dirty lines is equal to the predicted best dirty partition size. In this case, the replacement victim depends on the cache access type. If the access is a read, RWP picks the replacement victim from the clean partition. Similarly, a write access triggers a replacement from the dirty partition.
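The three cases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's hardware design: the `Line` record, the `last_use` timestamp used to approximate LRU order, the function name `pick_victim`, and the fallback for an empty partition are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Line:
    tag: int
    dirty: bool
    last_use: int  # hypothetical timestamp; larger means more recently used


def pick_victim(lines, best_dirty_size, is_write):
    """Pick the line to evict from one full cache set, following the three cases."""
    dirty = [l for l in lines if l.dirty]
    clean = [l for l in lines if not l.dirty]

    if len(dirty) > best_dirty_size:
        pool = dirty   # too many dirty lines: evict from the dirty partition
    elif len(dirty) < best_dirty_size:
        pool = clean   # too many clean lines: evict from the clean partition
    else:
        # Partition is at its predicted size: the access type decides.
        pool = dirty if is_write else clean

    # Assumption (not from the paper): if the chosen partition is empty,
    # fall back to the other one so a victim always exists.
    if not pool:
        pool = clean if pool is dirty else dirty

    # The LRU line is the one with the oldest last_use value.
    return min(pool, key=lambda l: l.last_use)
```

For example, in a 4-way set holding two dirty and two clean lines with a predicted dirty partition size of 2, a read miss evicts the LRU clean line and a write miss evicts the LRU dirty line.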

[Figure: read_write_partitioning]

The dirty partition size needs to be predicted online. A small number of sampled cache sets are monitored, and the values of the dirty and clean age hit counters (kept per LRU position) are compared to calculate the best partition size.
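One plausible way to compare the counters is sketched below. It assumes, and this is an assumption rather than a detail stated in this summary, that `dirty_hits[i]` and `clean_hits[i]` count the hits observed at LRU age `i` among dirty and clean lines in the sampled sets, so that a dirty partition of size `d` would capture the first `d` dirty counters and the first `assoc - d` clean counters. The function name is hypothetical.

```python
def best_dirty_partition_size(dirty_hits, clean_hits):
    """Return the dirty partition size that maximizes the estimated hit count.

    dirty_hits[i] / clean_hits[i]: hits seen at LRU age i (0 = most recent)
    for dirty / clean lines in the sampled sets (assumed counter layout).
    """
    assoc = len(dirty_hits)
    best_size, best_hits = 0, -1
    for d in range(assoc + 1):
        # d ways for dirty lines, the remaining assoc - d ways for clean lines.
        hits = sum(dirty_hits[:d]) + sum(clean_hits[:assoc - d])
        if hits > best_hits:  # ties keep the smaller dirty partition
            best_size, best_hits = d, hits
    return best_size
```

With `dirty_hits = [5, 0, 0, 0]` and `clean_hits = [10, 8, 6, 1]`, giving one way to dirty lines yields 5 + 24 = 29 estimated hits, more than any other split, so the sketch returns 1.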