This is an excerpt from the word-count job implementation published in the Apache Hadoop tutorial:
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public static class TokenizerMapper
    extends Mapper<Object, Text, Text, IntWritable> {

  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  public void map(Object key, Text value, Context context
                  ) throws IOException, InterruptedException {
    StringTokenizer itr = new StringTokenizer(value.toString());
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      context.write(word, one);
    }
  }
}
What is the benefit of reusing the `Text word` field? I have seen this pattern in many Hadoop programs. Is instantiation of this class expensive enough that reusing one instance improves performance? If not, why do people do this rather than simply writing `context.write(new Text(itr.nextToken()), one);`?
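To make the question concrete, here is the pattern I am asking about, sketched outside of Hadoop. `MutableText` is a hypothetical stand-in for `org.apache.hadoop.io.Text` (not a real Hadoop class) that counts its own allocations, so the two styles can be compared directly:

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseDemo {
    // Hypothetical stand-in for org.apache.hadoop.io.Text:
    // a mutable string holder that counts how often it is constructed.
    static class MutableText {
        static int allocations = 0;
        private String value = "";
        MutableText() { allocations++; }
        void set(String v) { value = v; }   // mutate in place, like Text.set()
        @Override public String toString() { return value; }
    }

    // Reuse pattern: one object, mutated once per token (as in the mapper).
    static int emitWithReuse(String[] tokens) {
        MutableText.allocations = 0;
        List<String> emitted = new ArrayList<>();
        MutableText word = new MutableText();
        for (String t : tokens) {
            word.set(t);
            emitted.add(word.toString()); // stands in for context.write(word, one)
        }
        return MutableText.allocations;
    }

    // Allocation-per-token pattern: a fresh object for every record.
    static int emitWithNew(String[] tokens) {
        MutableText.allocations = 0;
        List<String> emitted = new ArrayList<>();
        for (String t : tokens) {
            MutableText w = new MutableText();
            w.set(t);
            emitted.add(w.toString());
        }
        return MutableText.allocations;
    }

    public static void main(String[] args) {
        String[] tokens = "the quick brown fox".split(" ");
        System.out.println("reuse: " + emitWithReuse(tokens));     // prints "reuse: 1"
        System.out.println("per-token: " + emitWithNew(tokens));   // prints "per-token: 4"
    }
}
```

Over millions of input records, the second style creates millions of short-lived objects, which is presumably the garbage-collection pressure that reuse is meant to avoid, assuming the framework copies or serializes the value during each `context.write` call so the mutation on the next iteration is safe.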