
Developing Hadoop 2.x Map/Reduce Projects in Eclipse

This article demonstrates how to develop a Map/Reduce project in Eclipse.

1. Environment

Hadoop 2.2.0, Eclipse Juno SR2, hadoop2.x-eclipse-plugin. For how to build, install, and configure the plugin, see: http://www.micmiu.com/bigdata/hadoop/hadoop2-x-eclipse-plugin-build-install/

2. Create an MR project

Click File → New → Other..., select "Map/Reduce Project", then enter the project name micmiu_mrdemo and create the new project.

3. Create the Mapper and Reducer

Click File → New → Other... and select Mapper; the generated class extends Mapper automatically. Creating a Reducer works the same way; fill in your own business logic as needed. This article uses the official WordCount example for testing:

package com.micmiu.mr;
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    //conf.set("fs.defaultFS", "hdfs://192.168.6.77:9000");
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
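The three phases the listing relies on (map emits (word, 1) pairs, the shuffle groups them by key, reduce sums each group) can be walked through locally without a Hadoop runtime. A minimal sketch; the class and method names are illustrative, not part of the Hadoop API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class MrPhases {
    // Map phase: tokenize one line and emit a (word, 1) pair per token,
    // mirroring what TokenizerMapper does via context.write(word, one).
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line); // same tokenizer as the listing
        while (itr.hasMoreTokens()) {
            pairs.add(Map.entry(itr.nextToken(), 1));
        }
        return pairs;
    }

    // Shuffle phase: group values by key; TreeMap sorts keys,
    // like the framework's sort before the reduce.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> groups = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            groups.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return groups;
    }

    // Reduce phase: sum each key's values, as IntSumReducer does.
    static Map<String, Integer> reduce(Map<String, List<Integer>> groups) {
        Map<String, Integer> result = new TreeMap<>();
        groups.forEach((word, ones) ->
            result.put(word, ones.stream().mapToInt(Integer::intValue).sum()));
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = reduce(shuffle(map("to be or not to be")));
        System.out.println(counts); // {be=2, not=1, or=1, to=2}
    }
}
```

The combiner configured in the job (job.setCombinerClass) simply runs this same reduce logic early, on each mapper's local output, to shrink the data moved during the shuffle.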
4. Prepare test data

micmiu-01.txt: hi michael welcome to hadoop more see micmiu.com
micmiu-02.txt: hi michael welcome to bigdata more see micmiu.com
micmiu-03.txt: hi michael welcome to spark more see micmiu.com
Upload the three micmiu-* files to HDFS:

micmiu-mbp:Downloads micmiu$ hdfs dfs -copyFromLocal micmiu-*.txt /user/micmiu/test/input
micmiu-mbp:Downloads micmiu$ hdfs dfs -ls /user/micmiu/test/input
Found 3 items
-rw-r--r--   1 micmiu supergroup         50 2014-04-15 14:53 /user/micmiu/test/input/micmiu-01.txt
-rw-r--r--   1 micmiu supergroup         50 2014-04-15 14:53 /user/micmiu/test/input/micmiu-02.txt
-rw-r--r--   1 micmiu supergroup         49 2014-04-15 14:53 /user/micmiu/test/input/micmiu-03.txt
micmiu-mbp:Downloads micmiu$
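With these three files as input, the counts the job should write to its output file can be pre-computed locally in plain Java (no Hadoop runtime needed). A sketch, assuming whitespace tokenization as in TokenizerMapper and assuming the second file reads "bigdata more" with a space (the space appears lost in the file listing above); the ascending key order matches Hadoop's default sort of Text keys:

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class ExpectedOutput {
    // Tallies words across the three test lines exactly as the
    // TokenizerMapper/IntSumReducer pair would, sorted by key.
    public static Map<String, Integer> expected() {
        String[] lines = {
            "hi michael welcome to hadoop more see micmiu.com",
            "hi michael welcome to bigdata more see micmiu.com",
            "hi michael welcome to spark more see micmiu.com",
        };
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints one "word<TAB>count" line per distinct word,
        // the same shape as the job's part-r-00000 output file.
        expected().forEach((w, c) -> System.out.println(w + "\t" + c));
    }
}
```

Every word shared by all three lines should show a count of 3; hadoop, bigdata, and spark should each show 1.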
5. Configure run parameters

Click Run As → Run Configurations..., and set the program's runtime arguments on the Arguments tab, for example the program's input and output paths.

6. Run

Click Run As → Run on Hadoop. When the run completes you can see the job's output. This completes the demo of running an MR job from Eclipse against Hadoop 2.x in local pseudo-distributed mode.

PS: Running the same job against a cluster environment kept failing; the cause has not been found yet.

—————– EOF @Michael Sun —————–

Original article: Developing Hadoop 2.x Map/Reduce Projects in Eclipse; thanks to the original author for sharing.