Understanding Systems and Architectures for Big Data
IBM Research, together with various organizations, has recently updated the paper "Understanding Systems and Architectures for Big Data", originally published on April 23, 2012. The update is based on new data collected on 10 PowerLinux 7R2 systems running IBM InfoSphere BigInsights v1.3.
Note that the new paper starts on page 2. The contact author is Jian Li (jianli at us.ibm.com) of IBM Research in Austin.
This paper provides tips for tuning TeraSort on PowerLinux systems at the Hadoop, Java, and system levels.