Log-Structured File System for Flash Storage

This is another file system designed for SSDs. F2FS: A New File System for Flash Storage, presented on FAST 2015 by Changman Lee. The File Systems specifically designed for SSDs are mainly focused on how to effectively write the data.

The problem in this paper is Log-structured file systems such as BTRFS and NILFS2 do not consider the characteristics of flash storage devices and are “inevitably suboptimal in terms of performance and device lifetime.”. The prior work on FAST 2012 utilize write skew in I/O workloads to optimize The Log-Structured File System. This paper optimize LFS from another angle.

Flash-friendly on-disk layout

f2fs_on-disk-layout

The disk layout of F2FS is very like the LFS disk layout. Why it is claimed to be Flash-friendly? Because:

It allocates storage blocks in the unit of segments from a number of individual zones. It performs “cleaning” in the unit of section. These units are introduced to align with the underlying FTL’s operational units to avoid unnecessary (yet costly) data copying.

Cost-effective index structure

wandering-tree

The “wandering tree” problem/algorithm is if any update in the tree requires updating parent nodes up to the root. The updating in Log-Structured File System is exactly a “wandering tree” problem.

If a leaf data block is updated (and written to somewhere), its direct index block should be updated, too. Once the direct index block is written, again its indirect index block should be updated.

This paper solves this problem by using node address table.

Multi-head logging

The internal parallelism in SSDs are hidden to the upper layer. This paper propose to use multiple logging head to exploit the internal parallelism. However, this may make FTL mix data from multiple sources and increasing the garbage collection (GC) overheads. In order to solve this, F2FS maps active logs to different zones to separate them in the FTL. Multi-head logging also separates hot, warm, and cold data.

Adaptive logging

F2FS implements both policies and switches between normal logging and threaded logging dynamically according to the file system status. This is because there is a chance that threaded logging incurs undesirable random writes when there are scattered holes.

acceleration with roll-forward recovery

fsync operation is usually supported by checkpointing and recover data with the roll-back model.

F2FS implements an efficient roll-forward recovery mechanism to enhance fsync performance. The key idea is to write data blocks and their direct node blocks only, excluding other node or F2FS metadata blocks. In order to find the data blocks selectively after rolling back to the stable checkpoint, F2FS remains a special flag inside direct node blocks.

Also check the review by Adrian Colyer: http://blog.acolyer.org/2015/02/26/f2fs-a-new-file-system-for-flash-storage/