Hybrid Data Reliability for Emerging Key-Value Storage Devices

Key-Value Multi-Device (KVMD) is a hybrid data reliability manager for key-value (KV) devices that employs a variety of reliability techniques with different trade-offs. Compared with the throughput degradation of Linux mdadm-based RAID for block devices, data reliability for KV devices can be achieved at comparable or lower throughput degradation. In addition, the KV API enables much quicker rebuild and recovery of failed devices. It also allows both hybrid reliability configurations that are set automatically (for example, based on value size) and custom per-object reliability configurations for user data.

KVMD is to KV devices as RAID is to block devices.

Reliability Mechanisms

KVMD provides four reliability mechanisms: hashing, replication, splitting, and packing.

Hashing

Hashing provides only load balancing and request distribution across the underlying KV devices; it adds no redundancy. When a device fails, recovery is not possible and the user data stored on that device is lost.
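
As a minimal sketch of hash-based request distribution, the Python below maps each user key to one device and shows that a device failure loses every object placed there. The names `device_for_key` and `HashedStore`, the choice of SHA-256, and the use of in-memory dicts as stand-ins for KV devices are all assumptions made for illustration; they are not KVMD's actual API or hash function.

```python
import hashlib


def device_for_key(key: bytes, num_devices: int) -> int:
    """Pick a device index from a stable hash of the key (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_devices


class HashedStore:
    """Hypothetical stand-in: each dict plays the role of one KV device."""

    def __init__(self, num_devices: int):
        self.devices = [dict() for _ in range(num_devices)]

    def put(self, key: bytes, value: bytes) -> None:
        # Exactly one copy is written, so there is no redundancy.
        self.devices[device_for_key(key, len(self.devices))][key] = value

    def get(self, key: bytes):
        # Returns None when the key is absent, e.g. after a device failure.
        return self.devices[device_for_key(key, len(self.devices))].get(key)

    def fail_device(self, index: int) -> None:
        # Simulate a device failure: everything stored on it is gone.
        self.devices[index] = {}
```

After `fail_device` is called on the device that holds a key, `get` for that key returns `None`, because no other device holds a copy or parity.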

Replication

Replication has high storage costs and write overhead, but low read and recovery costs.

Splitting

Splitting is a single-object erasure coding mechanism that splits the user object into k equal-sized objects, adds r parity objects using a systematic MDS code, and writes the k+r objects to k+r consecutive devices using the same user key.
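
The sketch below illustrates splitting for the simplest case, r = 1, where a single XOR parity object is a systematic MDS code. It is an illustration under stated assumptions, not KVMD's implementation: the function names, the zero padding used to make the k objects equal-sized, the separately tracked original length, and the wrap-around placement starting from a given device are all hypothetical. A general r would need a code such as Reed-Solomon.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(value: bytes, k: int) -> List[bytes]:
    """Split value into k equal-sized shards and append one XOR parity shard (r = 1)."""
    shard_len = -(-len(value) // k)  # ceiling division
    padded = value.ljust(shard_len * k, b"\0")  # zero padding is an assumption
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    return shards + [reduce(xor_bytes, shards)]


def placement(start: int, k: int, r: int, num_devices: int) -> List[int]:
    """k+r consecutive device indices; start device and wrap-around are assumptions."""
    return [(start + i) % num_devices for i in range(k + r)]


def recover(shards: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the user object when at most one of the k+1 shards is missing (None)."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("a single parity shard tolerates only one lost shard")
    shards = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        shards[missing[0]] = reduce(xor_bytes, present)
    # Data shards come first (systematic code); drop the parity and the padding.
    return b"".join(shards[:-1])[:original_len]
```

Because the code is systematic, a read with no failures only concatenates the k data shards; the parity shard is needed only when one shard is lost.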

Packing

Packing is a multi-object erasure coding mechanism that packs up to k independent objects from k different devices into a single reliability set. The packing is logical only: it exists purely for the sake of parity calculation.
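
To make the idea of logical packing concrete, here is a small sketch for a single XOR parity object. The objects stay on their own devices unchanged; only the parity is computed across them. The function names, the zero padding of shorter values to the longest value in the set, and the assumption that each object's length is kept in metadata are illustrative choices, not details taken from KVMD.

```python
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def pack_parity(values: List[bytes]) -> bytes:
    """XOR parity over up to k independent values, each zero-padded to the longest one."""
    width = max(len(v) for v in values)
    parity = bytes(width)  # all-zero starting point
    for v in values:
        parity = xor_bytes(parity, v.ljust(width, b"\0"))
    return parity


def recover_lost(parity: bytes, survivors: List[bytes], lost_len: int) -> bytes:
    """Rebuild the one lost value from the parity and the surviving values.

    lost_len is the lost value's original length, assumed to be kept in metadata.
    """
    out = parity
    for v in survivors:
        out = xor_bytes(out, v.ljust(len(parity), b"\0"))
    return out[:lost_len]
```

For example, with values `b"a"`, `b"bcdef"`, and `b"gh"` in one reliability set, the parity is five bytes long, and losing the device that holds `b"bcdef"` can be repaired from the parity plus the two surviving values.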