Running the WordCount Program on Hadoop 2.3 - jonlina - ITPUB Blog

1. If HDFS is not running yet, start it from the Hadoop home directory:

./sbin/start-dfs.sh

./sbin/start-yarn.sh

2. Check the status and make sure at least one DataNode is running

./bin/hdfs dfsadmin -report

Output like the following means everything is working:

Datanodes available: 1 (1 total, 0 dead)

You can also check this in a browser: http://localhost:50070
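The status check in step 2 can also be scripted. The following is a minimal sketch, assuming the report contains a line in the "Datanodes available: N (...)" format shown above (other Hadoop versions may word this line differently). The function name `live_datanodes` is hypothetical and is not part of Hadoop.

```shell
# Read `hdfs dfsadmin -report` output on stdin and print the number of
# available DataNodes. Assumes the "Datanodes available: N (...)" line
# format shown above; prints nothing if no such line is found.
live_datanodes() {
  sed -n 's/^Datanodes available: \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage (from the Hadoop home directory):
#   ./bin/hdfs dfsadmin -report | live_datanodes
```

A result of 0, or no output at all, suggests the DataNode did not start and its log is worth checking before going on.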

3. Create a few data files, such as file1.txt and file2.txt. I put them in the examples directory under the Hadoop home directory.

examples/file1.txt contains:

hello www.isosee.com

hello www.pmi.org.cn

hello www.pmpway.com

hello www.92pm.com

examples/file2.txt contains:

pmpbox ok

pmpbox v1.0

pmpbox online

I think pmpbox will help you!
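The two sample files can also be created from the command line. This sketch assumes it is run from the Hadoop home directory and creates the examples directory if it does not exist yet.

```shell
# Create the examples directory and the two sample input files.
mkdir -p examples

cat > examples/file1.txt <<'EOF'
hello www.isosee.com
hello www.pmi.org.cn
hello www.pmpway.com
hello www.92pm.com
EOF

cat > examples/file2.txt <<'EOF'
pmpbox ok
pmpbox v1.0
pmpbox online
I think pmpbox will help you!
EOF
```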

4. Copy the files to the Hadoop file system

./bin/hadoop fs -mkdir /input

./bin/hadoop fs -put -f examples/file1.txt examples/file2.txt /input
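If step 4 is repeated, a plain -mkdir fails because /input already exists. The sketch below wraps the upload so it can be rerun safely. The function name `upload_inputs` and the `HADOOP` variable are hypothetical helpers introduced only for this example; `HADOOP` defaults to ./bin/hadoop and can be overridden, for instance with `echo` for a dry run.

```shell
# Path to the hadoop launcher; override it (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-./bin/hadoop}"

# upload_inputs TARGET_DIR FILE...
# Create TARGET_DIR in HDFS if needed, upload the files (overwriting
# existing copies), then list the directory to confirm the upload.
upload_inputs() {
  target_dir="$1"
  shift
  "$HADOOP" fs -mkdir -p "$target_dir" &&
  "$HADOOP" fs -put -f "$@" "$target_dir" &&
  "$HADOOP" fs -ls "$target_dir"
}

# Usage (from the Hadoop home directory):
#   upload_inputs /input examples/file1.txt examples/file2.txt
```

The -p flag makes -mkdir succeed when the directory already exists, and -f lets -put overwrite files left over from an earlier run.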

5. Run WordCount

./bin/hadoop jar ./share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.3.0-sources.jar org.apache.hadoop.examples.WordCount /input /output

Progress is displayed while the job runs.
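One thing to watch for when rerunning the job: MapReduce refuses to start if the output directory already exists. The sketch below removes any stale output first. It is written under two assumptions that differ from the command above: it points at the compiled examples jar at ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0.jar (adjust the path to your installation) and uses that jar's `wordcount` program name. `run_wordcount`, `HADOOP` and `EXAMPLES_JAR` are hypothetical names used only here.

```shell
# Path to the hadoop launcher; override it (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-./bin/hadoop}"
# Assumed location of the compiled examples jar; adjust if yours differs.
EXAMPLES_JAR="${EXAMPLES_JAR:-./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0.jar}"

# run_wordcount INPUT_DIR OUTPUT_DIR
# Delete OUTPUT_DIR if it is left over from an earlier run, then start the job.
run_wordcount() {
  in_dir="$1"
  out_dir="$2"
  # -f suppresses the error when the directory does not exist yet.
  "$HADOOP" fs -rm -r -f "$out_dir" &&
  "$HADOOP" jar "$EXAMPLES_JAR" wordcount "$in_dir" "$out_dir"
}

# Usage (from the Hadoop home directory):
#   run_wordcount /input /output
```

Since this deletes the output directory without asking, it is best used only with scratch paths such as /output.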

Command to view the result:

./bin/hadoop fs -cat /output/part-r-00000

You can also copy the result from HDFS to the local file system to keep it:

./bin/hadoop fs -get /output/part-r-00000 .

Here is the output of the WordCount program:

I 1

hello 4

help 1

ok 1

online 1

pmpbox 4

think 1

v1.0 1

will 1

www.92pm.com 1

www.isosee.com 1

www.pmi.org.cn 1

www.pmpway.com 1

you! 1
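For small inputs, the result can be cross-checked locally without Hadoop. The following sketch approximates what WordCount does: it splits each line on whitespace, counts every token, and sorts the words in byte order. The function name `local_wordcount` is hypothetical, and awk's whitespace splitting is only an approximation of the Java tokenizer used by the example program.

```shell
# local_wordcount FILE...
# Print "word<TAB>count" for every whitespace-separated token in the
# given files, sorted in byte order (LC_ALL=C) like the job output.
local_wordcount() {
  LC_ALL=C awk '
    { for (i = 1; i <= NF; i++) count[$i]++ }
    END { for (w in count) printf "%s\t%d\n", w, count[w] }
  ' "$@" | LC_ALL=C sort
}

# Usage (from the Hadoop home directory):
#   local_wordcount examples/file1.txt examples/file2.txt
```

Punctuation stays attached to the word, which is why "you!" appears as its own entry in the result above.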