Opass: Analysis and Optimization of Parallel Data Access on Distributed File Systems Posted on May 25, 2015
Jiangling Yin, Jun Wang, Jian Zhou, Tyler Lukasiewicz, Dan Huang and Junyao Zhang. IPDPS 2015.


In this paper, we study parallel data access on distributed file systems, e.g., the Hadoop file system. Our experiments show that parallel data read requests often access data remotely and in an imbalanced fashion. This results in serious disk access and data transfer contention on certain cluster/storage nodes. We conduct a complete analysis of how the remote and imbalanced read patterns occur and how they are affected by the size of the cluster.


An Efficient Page-level FTL to Optimize Address Translation in Flash Memory Posted on Jan 23, 2015
You Zhou, Fei Wu, Ping Huang, Xubin He, Changsheng Xie, Jian Zhou. EuroSys 2015


Flash-based solid state disks (SSDs) have been very popular in consumer and enterprise storage markets due to their high performance, low energy consumption, shock resistance, and small size. However, increasing SSD capacity puts great pressure on efficient logical-to-physical address translation in a page-level flash translation layer (FTL). Existing schemes usually employ an on-board RAM cache, called the mapping cache, to store mapping information and speed up address translation. Since only a fraction of the mapping table can be cached at a time due to limited cache space, a large number of extra flash memory operations are required for cache management and garbage collection, degrading performance and lifetime.


Machine Learning Resources Posted on Jan 9, 2015
A Collection of free Machine Learning Resources


How do I learn Machine Learning? from Quora
Machine Learning Resources from Sciencemag

Course
Practical Machine Learning from Coursera
Statistics DataMining from CMU
Machine Learning Lecture Notes from MIT
Machine Learning Course from UCF

Text Books
The Elements of Statistical Learning

Real World
Google Flu


Machine Learning Benchmarks Posted on Dec 19, 2014
A collection of Machine Learning benchmarks (datasets).


http://yann.lecun.com/exdb/mnist/
http://image-net.org/
http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
http://cervisia.org/machine_learning_data.php
http://kdd.ics.uci.edu/
http://archive.ics.uci.edu/ml/
http://cdb.ics.uci.edu/cgibin/LearningDatasetsWeb.py
http://www.cs.toronto.edu/~delve/data/datasets.html
http://www.dmoz.org/Computers/Artificial_Intelligence/Machine_Learning/Datasets/
http://www.cs.ox.ac.uk/activities/machinelearning/applications.html
http://isomap.stanford.edu/datasets.html
http://deeplearning.net/datasets/


PERP: Attacking the balance among energy, performance and recovery in storage systems Posted on Nov 1, 2014
Junyao Zhang, Qingdong Wang, Jiangling Yin, Jian Zhou, Jun Wang. Journal of Parallel and Distributed Computing.


Recently, an important metric called “energy proportional” was proposed as a guideline for energy-efficient systems (Barroso and Hölzle, 2007). It advocates that energy consumption should be in proportion to system performance/utilization. However, this tradeoff metric is defined only for normal mode, where the system is functioning without node failures. When a node failure occurs, the system enters degradation mode, during which node reconstruction is initiated. This process needs to wake/spin up a number of disks and takes a substantial amount of I/O bandwidth, which compromises not only energy efficiency but also performance.


How to compile a Hadoop Program Posted on Oct 8, 2014


Before compiling your first Hadoop program, please see the instructions on how to run the WordCount example.

You can get the WordCount example code from GitHub (make sure you get a version compatible with your Hadoop release):

    wget https://github.com/apache/hadoop-common/raw/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/WordCount.java

Optionally, you can change package org.apache.hadoop.examples; to package org.janzhou;.

Set the HADOOP_CLASSPATH:

    export HADOOP_CLASSPATH=$(bin/hadoop classpath)

Compile:

    javac -classpath ${HADOOP_CLASSPATH} -d WordCount/ WordCount.java

Create JAR:

    jar -cvf WordCount.jar -C WordCount/ .

Run (the class name is case-sensitive; use org.apache.hadoop.examples.WordCount if you did not change the package):

    bin/hadoop jar WordCount.jar org.janzhou.WordCount /wordcount/input /wordcount/output

Using sun.tools.javac.Main

You normally invoke javac.exe from the command line, but you can also invoke it from within a Java program.
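The command-line steps above (set the classpath, compile, create the JAR) can be collected into one small script. This is only a sketch under stated assumptions: it is meant to be run from the Hadoop installation directory so that bin/hadoop resolves, and the names build_wordcount, run_cmd, and DRY_RUN are hypothetical helpers, not part of Hadoop. The mkdir -p step is an addition, since some javac versions expect the -d directory to exist already.

```shell
# run_cmd: print the command when DRY_RUN=1, otherwise execute it.
# (Hypothetical helper, used to preview the steps without a Hadoop install.)
run_cmd() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# build_wordcount [SOURCE] [OUTDIR]: compile SOURCE against the Hadoop
# classpath into OUTDIR and package the classes as OUTDIR.jar.
build_wordcount() {
  local src="${1:-WordCount.java}"
  local out="${2:-WordCount}"
  local cp
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Shown literally in dry-run output; nothing is executed.
    cp='$(bin/hadoop classpath)'
  else
    # Assumes the current directory is the Hadoop installation directory.
    cp="$(bin/hadoop classpath)" || return 1
  fi
  run_cmd mkdir -p "$out" &&
  run_cmd javac -classpath "$cp" -d "$out" "$src" &&
  run_cmd jar -cvf "$out.jar" -C "$out" .
}
```

Running DRY_RUN=1 build_wordcount prints the three commands without executing them, which is a quick way to check paths before a real build.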


How to install and run Hadoop (the Troubleshooting Version) Posted on Oct 7, 2014


http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
http://hadoop.apache.org/docs/r2.5.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
http://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html
http://stackoverflow.com/questions/17975144/only-one-datanode-can-run-in-a-multinode-hadoop-setup
http://wiki.apache.org/hadoop/ConnectionRefused
http://blog.cloudera.com/blog/2009/08/hadoop-default-ports-quick-reference/
http://stackoverflow.com/questions/20171455/java-net-connectexception-connection-refused-error-when-running-hive


How to run Hadoop WordCount.java Map-Reduce Program Posted on Oct 7, 2014


Hadoop comes with a set of demonstration programs. They are located here. One of them is WordCount.java, which computes the word frequency of all text files found in the HDFS directory you ask it to process. Follow the Hadoop Tutorial to run the example.

Create a working directory for your data:

    bin/hdfs dfs -mkdir /wordcount

Copy data files to HDFS:

    bin/hdfs dfs -copyFromLocal /path/to/your/data /wordcount/input

Run WordCount:

    bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount /wordcount/input /wordcount/output

View the results:

    bin/hdfs dfs -cat /wordcount/output/part-r-00000

Download the results:

    bin/hdfs dfs -copyToLocal /wordcount/output/part-r-00000 .
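To sanity-check the job output on a small input, the same word-frequency computation can be approximated locally with standard Unix tools. The sketch below is not part of Hadoop; word_freq is a hypothetical helper. It assumes words are separated by whitespace, as in the stock WordCount mapper, and prints one word and its count per line, separated by a tab and sorted by word.

```shell
# word_freq: read text on stdin and print "word<TAB>count" lines,
# sorted by word in byte order. Hypothetical helper for local checks only.
word_freq() {
  tr -s '[:space:]' '\n' |   # one word per line, runs of whitespace squeezed
    grep -v '^$' |           # drop the empty line left by leading whitespace
    LC_ALL=C sort |          # group identical words together
    uniq -c |                # prefix each distinct word with its count
    awk '{ print $2 "\t" $1 }'
}
```

For example, cat /path/to/your/data/* | word_freq should produce lines comparable to those in part-r-00000 when the job runs with a single reducer. Treat any ordering differences as expected rather than as errors.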


TOP Conference Programs for Computer Architectures Posted on Oct 2, 2014
ISCA/HPCA/FAST/ASPLOS


ASPLOS 2014 Acceptance Rate: 22% (49/217)
ASPLOS 2013 Acceptance Rate: 23.0% (44/191)
ASPLOS 2012 Acceptance Rate: 21% (37/172)
ISCA 2014 Acceptance Rate: 18% (46/258)
ISCA 2013 Acceptance Rate: 19.4% (56/288)
ISCA 2012 Acceptance Rate: 18% (47/262)
HPCA 2014 Acceptance Rate: 25.6% (55/215)
HPCA 2013 Acceptance Rate: 20% (51/249)
HPCA 2012 Acceptance Rate: 17% (36/210)
FAST 2014
FAST 2013 Acceptance Rate: 18%
FAST 2012 Acceptance Rate: 19% (26/137)
SC 2014
SC
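Each rate above is the number of accepted papers divided by the number of submissions. A minimal sketch for recomputing a rate from the counts in parentheses, using a hypothetical helper named acceptance_rate:

```shell
# acceptance_rate ACCEPTED SUBMITTED: print the acceptance rate as a
# percentage with one decimal place. Hypothetical helper for illustration.
acceptance_rate() {
  awk -v a="$1" -v s="$2" 'BEGIN { printf "%.1f\n", 100 * a / s }'
}
```

For example, acceptance_rate 56 288 recomputes the ISCA 2013 figure. The listed percentages are rounded to varying precision, so small differences from the recomputed values are expected.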


How to Write a Research Paper Posted on Oct 1, 2014
Writing is easy. All you do is stare at a blank sheet of paper until drops of blood form on your forehead. --- Gene Fowler


Research is hard. When doing research, you should start by finding a good research topic that truly interests you. However, finding a good research topic is outside the scope of this paper. In this paper, I mainly focus on writing. Writing skills are essential to producing a good-quality paper. The writing skills used in a paper should depend on the specific topic and solution the paper presents.

