NNThroughputBenchmark

NNThroughputBenchmark is one of the earliest NameNode benchmarks. It was first described in "HDFS Scalability: The Limits to Growth":

In order to measure the name-node performance, I implemented a benchmark called NNThroughputBenchmark, which now is a standard part of the HDFS code base.

NNThroughputBenchmark is a single-node benchmark, which starts a name-node and runs a series of client threads on the same node. Each client repetitively performs the same name-node operation by directly calling the name-node method implementing this operation. Then the benchmark measures the number of operations performed by the name-node per second.

The reason for running clients locally rather than remotely from different nodes is to avoid any communication overhead caused by RPC connections and serialization, and thus reveal the upper bound of pure namenode performance.
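The measurement scheme described in the quote can be sketched in a few lines. This is only an illustration of the idea, not the benchmark's actual code: InMemoryNamespace is a hypothetical stand-in for the name-node, and run_benchmark and its parameters are names invented here. Several threads call the same method directly, in process, and throughput is the number of completed operations divided by the elapsed wall-clock time.

```python
import threading
import time


class InMemoryNamespace:
    """Hypothetical stand-in for a name-node: a locked set of directory paths."""

    def __init__(self):
        self._lock = threading.Lock()
        self._dirs = set()

    def mkdirs(self, path):
        with self._lock:
            self._dirs.add(path)
            return True

    def count(self):
        with self._lock:
            return len(self._dirs)


def run_benchmark(operation, num_threads, total_ops):
    """Call operation(i) total_ops times across num_threads threads."""
    # Spread the operations as evenly as possible over the threads.
    per_thread = [
        total_ops // num_threads + (1 if t < total_ops % num_threads else 0)
        for t in range(num_threads)
    ]
    counts = [0] * num_threads

    def worker(thread_id, first_index, num_ops):
        for i in range(first_index, first_index + num_ops):
            operation(i)  # direct in-process call, no RPC or serialization
            counts[thread_id] += 1

    threads = []
    first_index = 0
    for thread_id, num_ops in enumerate(per_thread):
        threads.append(
            threading.Thread(target=worker, args=(thread_id, first_index, num_ops))
        )
        first_index += num_ops

    started = time.perf_counter()
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    elapsed = time.perf_counter() - started

    ops = sum(counts)
    return {
        "operations": ops,
        "elapsed_sec": elapsed,
        "ops_per_sec": ops / elapsed if elapsed > 0 else 0.0,
    }


namespace = InMemoryNamespace()
result = run_benchmark(
    lambda i: namespace.mkdirs(f"/nnThroughputBenchmark/dir{i}"),
    num_threads=3,
    total_ops=10,
)
print(result)
```

Because the clients run in the same process as the object under test, the figure reflects only the cost of the operation itself, which is the upper bound the quote refers to.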

Run it using the hadoop command:

hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark

Run without arguments, it prints the available operations and options:

Usage: NNThroughputBenchmark
-op all <other ops options> |
-op create [-threads T] [-files N] [-filesPerDir P] [-close] |
-op mkdirs [-threads T] [-dirs N] [-dirsPerDir P] |
-op open [-threads T] [-files N] [-filesPerDir P] [-useExisting] |
-op delete [-threads T] [-files N] [-filesPerDir P] [-useExisting] |
-op fileStatus [-threads T] [-files N] [-filesPerDir P] [-useExisting] |
-op rename [-threads T] [-files N] [-filesPerDir P] [-useExisting] |
-op blockReport [-datanodes T] [-reports N] [-blocksPerReport B] [-blocksPerFile F] |
-op replication [-datanodes T] [-nodesToDecommission D] [-nodeReplicationLimit C] [-totalBlocks B] [-replication R] |
-op clean |
[-keepResults] | [-logLevel L] | [-UGCacheRefreshCount G]
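As an illustration of how the options combine, the sketch below assembles a command line for a single operation. The helper build_command is hypothetical and written only for this example; the class name and flags are taken from the usage text above. It only builds and prints the argument list. Actually running the command requires a Hadoop installation and may need further configuration depending on the version.

```python
BENCHMARK_CLASS = "org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark"


def build_command(op, **options):
    """Return the argv list for one benchmark operation (hypothetical helper)."""
    argv = ["hadoop", BENCHMARK_CLASS, "-op", op]
    for name, value in options.items():
        if value is True:
            # Flags without a value, such as -close or -useExisting.
            argv.append(f"-{name}")
        elif value is not False and value is not None:
            # Options that take a value, such as -threads T or -files N.
            argv += [f"-{name}", str(value)]
    return argv


cmd = build_command("create", threads=3, files=10, filesPerDir=4, close=True)
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run on a machine where the hadoop command is on the PATH.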

Running with the option -op all produces results like:

14/04/25 15:22:26 FATAL namenode.NNThroughputBenchmark: Log level = ERROR
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Starting 22 replication(s).
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Starting benchmark: clean
14/04/25 15:22:26 FATAL namenode.NNThroughputBenchmark: Log level = ERROR
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Starting 1 clean(s).
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- create inputs ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrFiles = 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrThreads = 3
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrFilesPerDir = 4
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- create stats  ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: # operations: 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Elapsed Time: 403
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:  Ops per sec: 24.81389578163772
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Average Time: 105
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- mkdirs inputs ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrDirs = 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrThreads = 3
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrDirsPerDir = 2
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- mkdirs stats  ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: # operations: 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Elapsed Time: 349
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:  Ops per sec: 28.653295128939828
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Average Time: 88
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- open inputs ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrFiles = 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrThreads = 3
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrFilesPerDir = 4
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- open stats  ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: # operations: 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Elapsed Time: 16
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:  Ops per sec: 625.0
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Average Time: 1
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- delete inputs ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrFiles = 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrThreads = 3
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrFilesPerDir = 4
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- delete stats  ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: # operations: 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Elapsed Time: 366
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:  Ops per sec: 27.3224043715847
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Average Time: 91
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- fileStatus inputs ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrFiles = 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrThreads = 3
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrFilesPerDir = 4
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- fileStatus stats  ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: # operations: 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Elapsed Time: 3
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:  Ops per sec: 3333.3333333333335
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Average Time: 0
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- rename inputs ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrFiles = 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrThreads = 3
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: nrFilesPerDir = 4
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- rename stats  ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: # operations: 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Elapsed Time: 349
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:  Ops per sec: 28.653295128939828
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Average Time: 89
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- blockReport inputs ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: reports = 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: datanodes = 3 (0, 0, 0)
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: blocksPerReport = 100
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: blocksPerFile = 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- blockReport stats  ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: # operations: 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Elapsed Time: 20
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:  Ops per sec: 500.0
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Average Time: 6
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- replication inputs ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: numOpsRequired = 22
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: datanodes = 3 (0, 0, 0)
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: decommissioned datanodes = 1
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: datanode replication limit = 100
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: total blocks = 100
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- replication stats  ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: # operations: 0
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Elapsed Time: 0
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:  Ops per sec: 0.0
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Average Time: 0
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: decommissioned blocks = 0
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: pending replications = 0
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: replications per sec: 0.0
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- clean inputs ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Remove directory /nnThroughputBenchmark
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- clean stats  ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: # operations: 1
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Elapsed Time: 38
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:  Ops per sec: 26.31578947368421
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Average Time: 35
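The stats in the output above are related in a simple way: in every section, "Ops per sec" equals "# operations" divided by "Elapsed Time", with the elapsed time apparently in milliseconds (an inference from the numbers, since the output does not state the unit). The sketch below parses a few of the lines shown above and recomputes the throughput. The helper names parse_stats and ops_per_sec are made up for this example, and returning 0.0 for a zero elapsed time is an assumption that matches the replication section.

```python
import re

# A few lines copied from the sample output above.
SAMPLE_LOG = """\
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- create stats  ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: # operations: 10
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Elapsed Time: 403
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:  Ops per sec: 24.81389578163772
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Average Time: 105
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: --- replication stats  ---
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: # operations: 0
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Elapsed Time: 0
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark:  Ops per sec: 0.0
14/04/25 15:22:26 INFO namenode.NNThroughputBenchmark: Average Time: 0
"""

HEADER = re.compile(r"--- (\w+) stats\s+---")
FIELD = re.compile(r"(# operations|Elapsed Time|Ops per sec|Average Time): ([\d.]+)")


def parse_stats(log_text):
    """Collect the numeric stats of each '--- <op> stats ---' section."""
    stats = {}
    current = None
    for line in log_text.splitlines():
        # Keep only the message after the logger name.
        _, _, message = line.partition("NNThroughputBenchmark:")
        message = message.strip()
        header = HEADER.fullmatch(message)
        if header:
            current = stats.setdefault(header.group(1), {})
            continue
        field = FIELD.fullmatch(message)
        if current is not None and field:
            current[field.group(1)] = float(field.group(2))
    return stats


def ops_per_sec(num_ops, elapsed_ms):
    """Throughput in operations per second; 0.0 when no time elapsed."""
    return 0.0 if elapsed_ms == 0 else 1000.0 * num_ops / elapsed_ms


for op, values in parse_stats(SAMPLE_LOG).items():
    recomputed = ops_per_sec(values["# operations"], values["Elapsed Time"])
    print(op, "reported:", values["Ops per sec"], "recomputed:", recomputed)
```

Note that these sample runs use only 10 operations per test, so the elapsed times are a few milliseconds to a few hundred milliseconds and the resulting rates are not meaningful as performance figures.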

Related issues:

  1. HDFS-5068
  2. HDFS-5675