How to Run the Hadoop WordCount.java MapReduce Program

17 Oct 2014

Hadoop comes with a set of demonstration programs. They ship in the hadoop-mapreduce-examples jar under share/hadoop/mapreduce/.

One of them is WordCount.java, which computes the frequency of each word across all text files found in the HDFS directory you ask it to process. To run the example, follow the Hadoop Tutorial; the steps are summarized here.
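For a quick sanity check on a small file, the same kind of count can be approximated locally without Hadoop. This is only a sketch: `count_words` is a hypothetical helper, and it assumes that words are whitespace-separated tokens and that the output is one `word<TAB>count` line per word, sorted by word.

```shell
# count_words: read text on stdin, print "word<TAB>count" sorted by word.
# Hypothetical local helper for small inputs; it does not use Hadoop.
count_words() {
  tr -s '[:space:]' '\n' |   # one token per line
    sed '/^$/d' |            # drop empty lines
    LC_ALL=C sort |          # group identical tokens, byte order
    uniq -c |                # prefix each token with its count
    awk '{ printf "%s\t%s\n", $2, $1 }'
}

printf 'hello world\nhello hadoop\n' | count_words
```

Comparing this against the job's output on a tiny input is an easy way to confirm the cluster run behaved as expected.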

Create a working directory for your data:

bin/hdfs dfs -mkdir /wordcount

Copy Data Files to HDFS:

bin/hdfs dfs -copyFromLocal /path/to/your/data /wordcount/input

Run WordCount:

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount /wordcount/input /wordcount/output
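The job refuses to start if the output directory already exists, so a second run needs the old results removed first. A minimal sketch of that cleanup, assuming the commands are run from the Hadoop install directory; `HDFS` and `clean_output` are hypothetical names introduced here, not part of Hadoop:

```shell
# HDFS is an assumed variable pointing at the hdfs launcher;
# it defaults to bin/hdfs, as used elsewhere in this post.
HDFS="${HDFS:-bin/hdfs}"

# clean_output DIR: delete DIR in HDFS only if it already exists.
clean_output() {
  if "$HDFS" dfs -test -d "$1"; then
    "$HDFS" dfs -rm -r "$1"
  fi
}

# Usage, before re-running the job:
#   clean_output /wordcount/output
```

Note that `-rm -r` deletes the previous results for good unless HDFS trash is enabled, so copy anything you want to keep first.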

View the Results:

bin/hdfs dfs -cat /wordcount/output/part-r-00000
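The output is sorted by word, not by count, and a job with more than one reducer writes several part-r-NNNNN files. To see the most frequent words, the output can be piped through a small filter. `top_words` is a hypothetical helper, sketched under the assumption that each line is `word<TAB>count`:

```shell
# top_words N: read "word<TAB>count" lines on stdin and print the N lines
# with the highest counts (ties broken by word). N defaults to 10.
top_words() {
  LC_ALL=C sort -t "$(printf '\t')" -k2,2nr -k1,1 | head -n "${1:-10}"
}

printf 'hadoop\t1\nhello\t2\nworld\t1\n' | top_words 2
```

Against real results this would look like `bin/hdfs dfs -cat '/wordcount/output/part-r-*' | top_words 20`.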

Download the Results:

bin/hdfs dfs -copyToLocal /wordcount/output/part-r-00000 .