Tags

3 Important DNS Records for Email

The Mail Exchange (MX) record defines the server we use to receive email, but it is not enough to secure your email service. As the mail server setup tutorial shows, it is entirely possible to configure a mail server to send emails using any "send from" address. This raises security concerns:

How to Setup Postfix & Dovecot

In this tutorial, we will set up a Postfix email/SMTP server and a Dovecot IMAP server. We then configure an SSL certificate to enforce security and privacy.

Use SBT and Git to create Your Own Maven Repository

By leveraging SBT and Git, it is very easy to deploy your own distributed Maven repository.

Git-bz: Integrate Bugzilla and Git

I am looking for a way to integrate Bugzilla and Git. One elegant solution is the server-side Git hook Gitzilla, which depends on pybugz, a Python interface to Bugzilla. However, Gitzilla has not been updated for 4 years and does not seem to be a dependable solution for me. The last commit of Bugzilla was on Aug 16, 2011, and the Gitzilla-compatible version of pybugz is still 0.9.3, while the latest release of pybugz is 0.11.1.

Spark Code Analysis - DAGScheduler

DAG, a directed acyclic graph, is a directed graph with no directed cycles.
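As a small illustration of the definition, the sketch below checks whether a directed graph is acyclic by attempting a topological sort (Kahn's algorithm). The function name `topological_order` and the adjacency-dict representation are assumptions made for this example; they are not taken from Spark's DAGScheduler.

```python
from collections import deque

def topological_order(graph):
    """Return a topological order of `graph`, or None if it has a directed cycle.

    `graph` maps each node to the list of nodes it points to.
    """
    # Count incoming edges for every node, including nodes that only appear as targets.
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for t in targets:
            indegree[t] = indegree.get(t, 0) + 1

    # Start from the nodes that nothing points to.
    ready = deque(n for n, d in indegree.items() if d == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for t in graph.get(node, []):
            indegree[t] -= 1
            if indegree[t] == 0:
                ready.append(t)

    # If some node was never emitted, the graph contains a directed cycle.
    return order if len(order) == len(indegree) else None
```

A graph is a DAG exactly when this function returns an order rather than `None`.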

How to compile a Hadoop Program

Before compiling your first Hadoop program, please see the instructions on how to run the WordCount example.

JSON-RPC over Golang Websocket

Basic ideas

Libaio Simple Example

libaio api

Jekyll

Using Makefile in Jekyll

Github Pages is a great “it just works” resource.

Except for what it does not do.

VIM

Removing ^M Characters In Vim

If you edit files in gedit or Notepad, ^M characters may be inserted. After that, you cannot simply remove ^M in Vim with the following command:

Spell checking in VIM

Use :set spell to turn on spell-checking.

:help spell will give you all the details.

Latex

Latex Multi-line Equation and Left/Right Align

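Long LaTeX equations can be broken across lines with the split or aligned environment, using \\ at each point where the line should break. Example: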
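A minimal sketch of the split environment, assuming the amsmath package is loaded. The polynomial is an arbitrary illustration, not the equation from the original post; each `\\` ends a line and each `&` marks the alignment point.

```latex
\begin{equation}
\begin{split}
(a + b)^3 &= (a + b)(a + b)^2 \\
          &= (a + b)(a^2 + 2ab + b^2) \\
          &= a^3 + 3a^2 b + 3ab^2 + b^3
\end{split}
\end{equation}
```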

Websocket

JSON-RPC over Golang Websocket

Basic ideas

Golang

JSON-RPC over Golang Websocket

Basic ideas

RPC

JSON-RPC over Golang Websocket

Basic ideas

JSON-RPC

JSON-RPC over Golang Websocket

Basic ideas

Bash

Bash (Shell) Script To Compare Date And Time

I come across the need to compare the time and date in shell.

Here is how I did.

Hadoop

How to run Hadoop WordCount.java Map-Reduce Program

Hadoop comes with a set of demonstration programs. They are located here.

How to compile a Hadoop Program

Before compiling your first Hadoop program, please see the instructions on how to run the WordCount example.

MapReduce

How to run Hadoop WordCount.java Map-Reduce Program

Hadoop comes with a set of demonstration programs. They are located here.

Spark

Spark Code Analysis – TaskScheduler

The DAGScheduler groups the tasks of a single stage into a TaskSet.

Spark Code Analysis - DAGScheduler

DAG, a directed acyclic graph, is a directed graph with no directed cycles.

Spark Code Analysis – SparkEnv

SparkContext is the main entry point of Spark. It contains the interfaces to HDFS, Tachyon, etc.

Spark Code Analysis – The Worker

Spark has three main components:

First look at Spark

Spark, started at UC Berkeley AMPLab in 2009, is a fast and general cluster computing system for Big Data.

FAST 2015

Log-Structured File System for Flash Storage

This is another file system designed for SSDs: "F2FS: A New File System for Flash Storage", presented at FAST 2015 by Changman Lee. File systems designed specifically for SSDs mainly focus on how to write data efficiently.

Advanced Virtualization for Modern Non-Volatile Memory Devices

ANViL is another interface-related paper. Unlike the RPC interface, which is more complicated and hard to implement, this design is simple and easy to implement.

SSDs

Scalable Parallel Flash Firmware for Many-core Architectures

DeepFlash extracts the maximum performance of the underlying flash memory complex by concurrently executing multiple firmware components across many cores within the device.

GraphSSD Graph Semantics Aware SSD

GraphSSD replaces the conventional logical to physical page mapping mechanism in an SSD with a novel vertex-to-page mapping scheme and exploits the detailed knowledge of the flash properties to minimize page accesses. GraphSSD also supports efficient graph updates (vertex and edge modifications) by minimizing unnecessary page movement overheads.

RFLUSH Rethink the Flush

RFLUSH implements a fine-grained flush command in a storage device using an open-source flash development platform. The command transfers a range of LBAs that need to be flushed and thus enables the storage device to force only a subset of the data in its buffer.

Design and Implementation of a Fast Persistent Key-Value Store

KVell [pdf] avoids storing indexes on disk to maximize the performance of the KV store.

Buffer-Controlled Writes to HDDs for SSD-HDD Hybrid Storage Server

BCW is based on the finding that HDDs are usually underutilized in an SSD-HDD hybrid storage server while SSDs are suffering from high write pressure.

Log-Structured File System for Flash Storage

This is another file system designed for SSDs: "F2FS: A New File System for Flash Storage", presented at FAST 2015 by Changman Lee. File systems designed specifically for SSDs mainly focus on how to write data efficiently.

Advanced Virtualization for Modern Non-Volatile Memory Devices

ANViL is another interface-related paper. Unlike the RPC interface, which is more complicated and hard to implement, this design is simple and easy to implement.

Layout Aware

NSDI 2015

OSDI 2014

NVMs

Single-Level Key-Value Store with Persistent Memory

SLM-DB exploits persistent memory to maintain a B+-tree index and adopts an LSM-tree approach to stage inserted KV pairs in a PM-resident memory buffer. SLM-DB has a single-level organization of KV pairs on disks and performs selective compaction for the KV pairs, collecting garbage and keeping the KV pairs sorted sufficiently for range query operations.

Minimally Ordered Durable Data Structures for Persistent Memory

MOD borrows ideas from functional programming to make one or more updates atomic. Its C++ prototype performs better than software transactional memory (STM).

Advanced Virtualization for Modern Non-Volatile Memory Devices

ANViL is another interface-related paper. Unlike the RPC interface, which is more complicated and hard to implement, this design is simple and easy to implement.

Eurosys 2015

Cache

HPCA 2014

Graph

GraphSSD Graph Semantics Aware SSD

GraphSSD replaces the conventional logical to physical page mapping mechanism in an SSD with a novel vertex-to-page mapping scheme and exploits the detailed knowledge of the flash properties to minimize page accesses. GraphSSD also supports efficient graph updates (vertex and edge modifications) by minimizing unnecessary page movement overheads.

Pregel: A System for Large-Scale Graph Processing

This week I will study papers related to graph processing. "Pregel: A System for Large-Scale Graph Processing" was published by Google at the ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing 2009. It is Google's system for large-scale graph computing.

Block Correlations

Storage Systems

FAST 2004

ISCA 2014

ASPLOS 2014

Tree

Single-Level Key-Value Store with Persistent Memory

SLM-DB exploits persistent memory to maintain a B+-tree index and adopts an LSM-tree approach to stage inserted KV pairs in a PM-resident memory buffer. SLM-DB has a single-level organization of KV pairs on disks and performs selective compaction for the KV pairs, collecting garbage and keeping the KV pairs sorted sufficiently for range query operations.

MICA A Holistic Approach to Fast In-Memory Key-Value Storage

MICA partitions data and mainly uses exclusive access to the partitions. MICA exploits CPU caches and packet burst I/O to disproportionately speed up more heavily loaded partitions, nearly eliminating the penalty from skewed workloads. MICA can fall back to concurrent reads if the load is extremely skewed, but avoids concurrent writes, which are always slower than exclusive writes.

Size-aware Sharding For Improving Tail Latencies in In-memory Key-value Stores

Minos distributes requests for keys to cores according to the size of the item associated with the key. In particular, requests for small and large items are sent to disjoint subsets of cores. Size-aware sharding improves tail latencies by avoiding that a request for a small item gets queued behind a request for a large item.
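A minimal sketch of the routing idea, assuming a fixed size threshold and a fixed split of cores. The names `SMALL_CORES`, `LARGE_CORES`, `SIZE_THRESHOLD`, and `route` are hypothetical and chosen for illustration; the excerpt does not say how Minos picks its threshold or assigns cores.

```python
import zlib

SMALL_CORES = [0, 1, 2, 3, 4, 5]   # hypothetical cores reserved for small items
LARGE_CORES = [6, 7]               # hypothetical cores reserved for large items
SIZE_THRESHOLD = 1024              # bytes; an assumed cutoff, not a value from the paper

def route(key: bytes, item_size: int) -> int:
    """Pick the core that serves a request, based on the size of the item."""
    # Small and large items go to disjoint pools of cores.
    pool = SMALL_CORES if item_size <= SIZE_THRESHOLD else LARGE_CORES
    # Hash the key so requests for the same key land on the same core within the pool.
    return pool[zlib.crc32(key) % len(pool)]
```

Because the two pools share no cores, a request for a small item never waits in the same queue as a request for a large one.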

Efficient Log-Structured Key-Value Storage Engine for Persistent Memory

FlatStore uses a log to save data into Persistent Memory. It can be applied to either a B-Tree index or Hash index.

Design and Implementation of a Fast Persistent Key-Value Store

KVell [pdf] avoids storing indexes on disk to maximize the performance of the KV store.

VLDB 2014

Sampling

Log-structure

Efficient Log-Structured Key-Value Storage Engine for Persistent Memory

FlatStore uses a log to save data into Persistent Memory. It can be applied to either a B-Tree index or Hash index.

Design and Implementation of a Fast Persistent Key-Value Store

KVell [pdf] avoids storing indexes on disk to maximize the performance of the KV store.

Log-Structured File System for Flash Storage

This is another file system designed for SSDs: "F2FS: A New File System for Flash Storage", presented at FAST 2015 by Changman Lee. File systems designed specifically for SSDs mainly focus on how to write data efficiently.

Best Paper

Strong and Efficient Consistency with Consistency-Aware Durability

ORCA provides strong consistency while closely matching the performance of weakly consistent ZooKeeper.

Rethinking Virtual Memory Translation for Parallelism

Elastic Cuckoo Page Table is a novel page table design that transforms the sequential pointer-chasing operation used by conventional multi-level radix page tables into fully-parallel look-ups.

FAST 2014

Consistency

Strong and Efficient Consistency with Consistency-Aware Durability

ORCA provides strong consistency while closely matching the performance of weakly consistent ZooKeeper.

Minimally Ordered Durable Data Structures for Persistent Memory

MOD borrows ideas from functional programming to make one or more updates atomic. Its C++ prototype performs better than software transactional memory (STM).

RFLUSH Rethink the Flush

RFLUSH implements a fine-grained flush command in a storage device using an open-source flash development platform. The command transfers a range of LBAs that need to be flushed and thus enables the storage device to force only a subset of the data in its buffer.

Journal

ATC 2013

Java

Java 8 Improvements

Oracle has not released public updates for Java 7 since April 2015, after Java's 20th birthday. It's time to move forward to Java 8.

Eurosys 2014

Sigmod 2015

Stream Processing

Sigmod 2014

FAST 2013

Error

Sigmetrics 2015

FAST 2012

Sigmod 2007

I/O Signature

Simulator

Preparing System Software for a World with Terabyte-scale Memories

0sim simulates the behavior of system software (e.g. kernels) on huge-memory systems (e.g. terabytes of RAM).

Simulators for Computer Architecture

These simulation tools (toys) are developed for computer-system architecture research.

BACI Ben-Ari Concurrency Interpreter

BACI is a concurrency simulator. jBACI is an integration of the original BACI compilers and Strite’s interpreter into an IDE that contains an editor, together with extensions to the GUI to simplify its use by novices. jBACI is only available for Windows.

BACI

Lock-free Read/Write

Mutex Flag Define

BACI Ben-Ari Concurrency Interpreter

BACI is a concurrency simulator. jBACI is an integration of the original BACI compilers and Strite’s interpreter into an IDE that contains an editor, together with extensions to the GUI to simplify its use by novices. jBACI is only available for Windows.

Scala

Use Scala Swing with SBT to write GUI Program

Swing is a GUI widget toolkit for Java. It is part of Oracle's Java Foundation Classes (JFC). Scala Swing wraps most of Java Swing's API for Scala.

SBT/Scala tutorial

You can start to learn Scala from the official tutorial. A good Chinese page to learn Scala is here.

Git

Using hooks in Gitolite – Push to Github

Mirroring in Gitolite is a bit too complicated to set up. What I want is to push to GitHub automatically whenever I push to my Gitolite repository.

Config Cgit & Gitolite

Install Gitolite & Cgit

Git-bz: Integrate Bugzilla and Git

I am looking for a way to integrate Bugzilla and Git. One elegant solution is the server-side Git hook Gitzilla, which depends on pybugz, a Python interface to Bugzilla. However, Gitzilla has not been updated for 4 years and does not seem to be a dependable solution for me. The last commit of Bugzilla was on Aug 16, 2011, and the Gitzilla-compatible version of pybugz is still 0.9.3, while the latest release of pybugz is 0.11.1.

Secure

Practical Byte-Granular Memory Blacklisting using Califorms

Califorms provides a low overhead security solution for practical, byte-granular memory safety.

Secure and Efficient Multitasking Inside a Single Enclave of Intel SGX

Occlum is a system that enables secure and efficient multitasking on Intel Software Guard Extensions (SGX).

3 Important DNS Records for Email

The Mail Exchange (MX) record defines the server we use to receive email, but it is not enough to secure your email service. As the mail server setup tutorial shows, it is entirely possible to configure a mail server to send emails using any "send from" address. This raises security concerns:

News

Paper Discussion 04/04/2019

Liangding Li

Paper Discussion 03/28/2019

Liangding Li

Paper Discussion 03/21/2019

Liangding Li

Paper Discussion 03/14/2019

Liangding Li

Paper Discussion 03/07/2019

Liangding Li

Paper Discussion 02/26/2019

Liangding Li

Huge Pages

Translation Ranger Operating System Support for Contiguity-Aware TLBs

Translation Ranger’s approach to generating translation contiguity is to rearrange the system’s physical memory mappings such that each VMA can be covered by as few contiguous regions as possible, with regions that are as large as possible.

Transparently Self-Replicating Page-Tables for Large-Memory Machine

Mitosis (OSDI 2018 Poster) mitigates NUMA effects on page-table walks by transparently replicating and migrating page-tables across sockets without application changes.

Learning-based Memory Allocation for C++ Server Workloads

Learning-based Memory Allocator combines modern machine learning techniques with a novel memory manager, Learned Lifetime-Aware Memory Allocator (LLAMA), that manages the heap based on object lifetimes and huge pages (divided into blocks and lines).

Efficient Fine-grained OS Support for Huge Pages

HawkEye demonstrates fine-grained OS support for huge pages, not fine-grained huge pages. The paper focuses on when, where, and how to promote huge pages.

ASPLOS 2019

Fast Fine-Grained Global Synchronization on GPUs

The key idea is to transform global synchronization into global communication so that conflicts are serialized at the thread block level.

Efficient Fine-grained OS Support for Huge Pages

HawkEye demonstrates fine-grained OS support for huge pages, not fine-grained huge pages. The paper focuses on when, where, and how to promote huge pages.

Virtual Memory

Practical Byte-Granular Memory Blacklisting using Califorms

Califorms provides a low overhead security solution for practical, byte-granular memory safety.

Translation Ranger Operating System Support for Contiguity-Aware TLBs

Translation Ranger’s approach to generating translation contiguity is to rearrange the system’s physical memory mappings such that each VMA can be covered by as few contiguous regions as possible, with regions that are as large as possible.

Transparently Self-Replicating Page-Tables for Large-Memory Machine

Mitosis (OSDI 2018 Poster) mitigates NUMA effects on page-table walks by transparently replicating and migrating page-tables across sockets without application changes.

Secure and Efficient Multitasking Inside a Single Enclave of Intel SGX

Occlum is a system that enables secure and efficient multitasking on Intel Software Guard Extensions (SGX).

Learning-based Memory Allocation for C++ Server Workloads

Learning-based Memory Allocator combines modern machine learning techniques with a novel memory manager, Learned Lifetime-Aware Memory Allocator (LLAMA), that manages the heap based on object lifetimes and huge pages (divided into blocks and lines).

Rethinking Virtual Memory Translation for Parallelism

Elastic Cuckoo Page Table is a novel page table design that transforms the sequential pointer-chasing operation used by conventional multi-level radix page tables into fully-parallel look-ups.

Efficient Fine-grained OS Support for Huge Pages

HawkEye demonstrates fine-grained OS support for huge pages, not fine-grained huge pages. The paper focuses on when, where, and how to promote huge pages.

FAST 2020

Scalable Parallel Flash Firmware for Many-core Architectures

DeepFlash extracts the maximum performance of the underlying flash memory complex by concurrently executing multiple firmware components across many cores within the device.

Hybrid Data Reliability for Emerging Key-Value Storage Devices

Key-Value Multi-Device (KVMD) is a hybrid data reliability manager for key-value devices that employs a variety of reliability techniques with different trade-offs. Compared to Linux mdadm-based RAID throughput degradation for block devices, data reliability for KV devices can be achieved at a comparable or lower throughput degradation. In addition, the KV API enables much quicker rebuild and recovery of failed devices, and also allows for both hybrid reliability configuration set automatically based on, say, value sizes, and custom per-object reliability configuration for user data.

Strong and Efficient Consistency with Consistency-Aware Durability

ORCA provides strong consistency while closely matching the performance of weakly consistent ZooKeeper.

An Erasure-coding-supported Version of Raft for Reducing Storage Cost and Network Cost

In CRaft, a leader has two methods to replicate log entries to its followers. If the leader can communicate with enough followers, it will replicate log entries as coded fragments for better performance. Otherwise, it will replicate complete log entries for liveness.

Buffer-Controlled Writes to HDDs for SSD-HDD Hybrid Storage Server

BCW is based on the finding that HDDs are usually underutilized in an SSD-HDD hybrid storage server while SSDs are suffering from high write pressure.

NVMe

Design and Implementation of a Fast Persistent Key-Value Store

KVell [pdf] avoids storing indexes on disk to maximize the performance of the KV store.

KV

Single-Level Key-Value Store with Persistent Memory

SLM-DB exploits persistent memory to maintain a B+-tree index and adopts an LSM-tree approach to stage inserted KV pairs in a PM-resident memory buffer. SLM-DB has a single-level organization of KV pairs on disks and performs selective compaction for the KV pairs, collecting garbage and keeping the KV pairs sorted sufficiently for range query operations.

Hybrid Data Reliability for Emerging Key-Value Storage Devices

Key-Value Multi-Device (KVMD) is a hybrid data reliability manager for key-value devices that employs a variety of reliability techniques with different trade-offs. Compared to Linux mdadm-based RAID throughput degradation for block devices, data reliability for KV devices can be achieved at a comparable or lower throughput degradation. In addition, the KV API enables much quicker rebuild and recovery of failed devices, and also allows for both hybrid reliability configuration set automatically based on, say, value sizes, and custom per-object reliability configuration for user data.

Strong and Efficient Consistency with Consistency-Aware Durability

ORCA provides strong consistency while closely matching the performance of weakly consistent ZooKeeper.

An Erasure-coding-supported Version of Raft for Reducing Storage Cost and Network Cost

In CRaft, a leader has two methods to replicate log entries to its followers. If the leader can communicate with enough followers, it will replicate log entries as coded fragments for better performance. Otherwise, it will replicate complete log entries for liveness.

MICA A Holistic Approach to Fast In-Memory Key-Value Storage

MICA partitions data and mainly uses exclusive access to the partitions. MICA exploits CPU caches and packet burst I/O to disproportionately speed up more heavily loaded partitions, nearly eliminating the penalty from skewed workloads. MICA can fall back to concurrent reads if the load is extremely skewed, but avoids concurrent writes, which are always slower than exclusive writes.

Size-aware Sharding For Improving Tail Latencies in In-memory Key-value Stores

Minos distributes requests for keys to cores according to the size of the item associated with the key. In particular, requests for small and large items are sent to disjoint subsets of cores. Size-aware sharding improves tail latencies by avoiding that a request for a small item gets queued behind a request for a large item.

Efficient Log-Structured Key-Value Storage Engine for Persistent Memory

FlatStore uses a log to save data into Persistent Memory. It can be applied to either a B-Tree index or Hash index.

Design and Implementation of a Fast Persistent Key-Value Store

KVell [pdf] avoids storing indexes on disk to maximize the performance of the KV store.

SOSP 2019

Design and Implementation of a Fast Persistent Key-Value Store

KVell [pdf] avoids storing indexes on disk to maximize the performance of the KV store.

Hash

Rethinking Virtual Memory Translation for Parallelism

Elastic Cuckoo Page Table is a novel page table design that transforms the sequential pointer-chasing operation used by conventional multi-level radix page tables into fully-parallel look-ups.

Efficient Log-Structured Key-Value Storage Engine for Persistent Memory

FlatStore uses a log to save data into Persistent Memory. It can be applied to either a B-Tree index or Hash index.

ASPLOS 2020

Transparently Self-Replicating Page-Tables for Large-Memory Machine

Mitosis (OSDI 2018 Poster) mitigates NUMA effects on page-table walks by transparently replicating and migrating page-tables across sockets without application changes.

Minimally Ordered Durable Data Structures for Persistent Memory

MOD borrows ideas from functional programming to make one or more updates atomic. Its C++ prototype performs better than software transactional memory (STM).

Secure and Efficient Multitasking Inside a Single Enclave of Intel SGX

Occlum is a system that enables secure and efficient multitasking on Intel Software Guard Extensions (SGX).

Learning-based Memory Allocation for C++ Server Workloads

Learning-based Memory Allocator combines modern machine learning techniques with a novel memory manager, Learned Lifetime-Aware Memory Allocator (LLAMA), that manages the heap based on object lifetimes and huge pages (divided into blocks and lines).

Classifying Memory Access Patterns for Prefetching

This work proposes a novel methodology to classify the memory access patterns of applications, enabling well-informed reasoning about the applicability of a certain prefetcher.

Rethinking Virtual Memory Translation for Parallelism

Elastic Cuckoo Page Table is a novel page table design that transforms the sequential pointer-chasing operation used by conventional multi-level radix page tables into fully-parallel look-ups.

Preparing System Software for a World with Terabyte-scale Memories

0sim simulates the behavior of system software (e.g. kernels) on huge-memory systems (e.g. terabytes of RAM).

Efficient Log-Structured Key-Value Storage Engine for Persistent Memory

FlatStore uses a log to save data into Persistent Memory. It can be applied to either a B-Tree index or Hash index.

Large Memory

Preparing System Software for a World with Terabyte-scale Memories

0sim simulates the behavior of system software (e.g. kernels) on huge-memory systems (e.g. terabytes of RAM).

Prefetch

Classifying Memory Access Patterns for Prefetching

This work proposes a novel methodology to classify the memory access patterns of applications, enabling well-informed reasoning about the applicability of a certain prefetcher.

Learning

Learning-based Memory Allocation for C++ Server Workloads

Learning-based Memory Allocator combines modern machine learning techniques with a novel memory manager, Learned Lifetime-Aware Memory Allocator (LLAMA), that manages the heap based on object lifetimes and huge pages (divided into blocks and lines).

Classifying Memory Access Patterns for Prefetching

This work proposes a novel methodology to classify the memory access patterns of applications, enabling well-informed reasoning about the applicability of a certain prefetcher.

Filesystems

RFLUSH Rethink the Flush

RFLUSH implements a fine-grained flush command in a storage device using an open-source flash development platform. The command transfers a range of LBAs that need to be flushed and thus enables the storage device to force only a subset of the data in its buffer.

FAST 2018

RFLUSH Rethink the Flush

RFLUSH implements a fine-grained flush command in a storage device using an open-source flash development platform. The command transfers a range of LBAs that need to be flushed and thus enables the storage device to force only a subset of the data in its buffer.

NUMA

Transparently Self-Replicating Page-Tables for Large-Memory Machine

Mitosis (OSDI 2018 Poster) mitigates NUMA effects on page-table walks by transparently replicating and migrating page-tables across sockets without application changes.

NSDI 2019

Size-aware Sharding For Improving Tail Latencies in In-memory Key-value Stores

Minos distributes requests for keys to cores according to the size of the item associated with the key. In particular, requests for small and large items are sent to disjoint subsets of cores. Size-aware sharding improves tail latencies by avoiding that a request for a small item gets queued behind a request for a large item.

NSDI 2014

MICA A Holistic Approach to Fast In-Memory Key-Value Storage

MICA partitions data and mainly uses exclusive access to the partitions. MICA exploits CPU caches and packet burst I/O to disproportionately speed up more heavily loaded partitions, nearly eliminating the penalty from skewed workloads. MICA can fall back to concurrent reads if the load is extremely skewed, but avoids concurrent writes, which are always slower than exclusive writes.

Erasure Coding

Hybrid Data Reliability for Emerging Key-Value Storage Devices

Key-Value Multi-Device (KVMD) is a hybrid data reliability manager for key-value devices that employs a variety of reliability techniques with different trade-offs. Compared to Linux mdadm-based RAID throughput degradation for block devices, data reliability for KV devices can be achieved at a comparable or lower throughput degradation. In addition, the KV API enables much quicker rebuild and recovery of failed devices, and also allows for both hybrid reliability configuration set automatically based on, say, value sizes, and custom per-object reliability configuration for user data.

An Erasure-coding-supported Version of Raft for Reducing Storage Cost and Network Cost

In CRaft, a leader has two methods to replicate log entries to its followers. If the leader can communicate with enough followers, it will replicate log entries as coded fragments for better performance. Otherwise, it will replicate complete log entries for liveness.

Consensus

An Erasure-coding-supported Version of Raft for Reducing Storage Cost and Network Cost

In CRaft, a leader has two methods to replicate log entries to its followers. If the leader can communicate with enough followers, it will replicate log entries as coded fragments for better performance. Otherwise, it will replicate complete log entries for liveness.

ISCA 2019

GraphSSD Graph Semantics Aware SSD

GraphSSD replaces the conventional logical to physical page mapping mechanism in an SSD with a novel vertex-to-page mapping scheme and exploits the detailed knowledge of the flash properties to minimize page accesses. GraphSSD also supports efficient graph updates (vertex and edge modifications) by minimizing unnecessary page movement overheads.

Translation Ranger Operating System Support for Contiguity-Aware TLBs

Translation Ranger’s approach to generating translation contiguity is to rearrange the system’s physical memory mappings such that each VMA can be covered by as few contiguous regions as possible, with regions that are as large as possible.

Log-Structured

Single-Level Key-Value Store with Persistent Memory

SLM-DB exploits persistent memory to maintain a B+-tree index and adopts an LSM-tree approach to stage inserted KV pairs in a PM-resident memory buffer. SLM-DB has a single-level organization of KV pairs on disks and performs selective compaction for the KV pairs, collecting garbage and keeping the KV pairs sorted sufficiently for range query operations.

FAST 2019

Single-Level Key-Value Store with Persistent Memory

SLM-DB exploits persistent memory to maintain a B+-tree index and adopts an LSM-tree approach to stage inserted KV pairs in a PM-resident memory buffer. SLM-DB has a single-level organization of KV pairs on disks and performs selective compaction for the KV pairs, collecting garbage and keeping the KV pairs sorted sufficiently for range query operations.

Parallel

Scalable Parallel Flash Firmware for Many-core Architectures

DeepFlash extracts the maximum performance of the underlying flash memory complex by concurrently executing multiple firmware components across many cores within the device.

GPU

Fast Fine-Grained Global Synchronization on GPUs

The key idea is to transform global synchronization into global communication so that conflicts are serialized at the thread block level.

Micro 2019

Practical Byte-Granular Memory Blacklisting using Califorms

Califorms provides a low overhead security solution for practical, byte-granular memory safety.