The WordCount example in Hadoop


Hadoop ships with a built-in word-count example; this walkthrough uses Hadoop 2.5.1.

1. Create the input directory

Create the input directory in HDFS. Do not pre-create the output directory: the job creates it itself and fails if it already exists.

hdfs dfs -mkdir [-p] <paths>

[root@linuxmain hadoop]# bin/hdfs dfs -mkdir /input
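
The same directory can also be created programmatically. Below is a minimal Java sketch using the HDFS FileSystem API (the class name MakeInputDir is just for illustration; it assumes the cluster's core-site.xml is on the classpath so fs.defaultFS points at HDFS):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MakeInputDir {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // picks up core-site.xml
        FileSystem fs = FileSystem.get(conf);
        // mkdirs behaves like -mkdir -p: it creates missing parents
        // and returns true if the directory exists afterwards
        if (fs.mkdirs(new Path("/input"))) {
            System.out.println("/input is ready");
        }
        fs.close();
    }
}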


2. Run the job

The general form of the command is shown below; the -files, -libjars and -archives generic options are optional, and a plain run needs only the input and output paths:

bin/hadoop jar hadoop-mapreduce-examples-<ver>.jar wordcount -files cachefile.txt -libjars mylib.jar -archives myarchive.zip input output

[root@linuxmain hadoop]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount /input/ /output
14/11/27 19:05:24 INFO client.RMProxy: Connecting to ResourceManager at LinuxMain/192.168.1.216:8032
14/11/27 19:05:24 INFO input.FileInputFormat: Total input paths to process : 1
14/11/27 19:05:24 INFO mapreduce.JobSubmitter: number of splits:1
14/11/27 19:05:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1417139642519_0006
14/11/27 19:05:25 INFO impl.YarnClientImpl: Submitted application application_1417139642519_0006
14/11/27 19:05:25 INFO mapreduce.Job: The url to track the job: http://LinuxMain:8088/proxy/application_1417139642519_0006/
14/11/27 19:05:25 INFO mapreduce.Job: Running job: job_1417139642519_0006
14/11/27 19:05:30 INFO mapreduce.Job: Job job_1417139642519_0006 running in uber mode : false
14/11/27 19:05:30 INFO mapreduce.Job: map 0% reduce 0%
14/11/27 19:05:39 INFO mapreduce.Job: map 100% reduce 0%
14/11/27 19:05:44 INFO mapreduce.Job: map 100% reduce 100%
14/11/27 19:05:44 INFO mapreduce.Job: Job job_1417139642519_0006 completed successfully
14/11/27 19:05:45 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=80
FILE: Number of bytes written=193839
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=144
HDFS: Number of bytes written=46
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Rack-local map tasks=1
Total time spent by all maps in occupied slots (ms)=6643
Total time spent by all reduces in occupied slots (ms)=2532
Total time spent by all map tasks (ms)=6643
Total time spent by all reduce tasks (ms)=2532
Total vcore-seconds taken by all map tasks=6643
Total vcore-seconds taken by all reduce tasks=2532
Total megabyte-seconds taken by all map tasks=6802432
Total megabyte-seconds taken by all reduce tasks=2592768
Map-Reduce Framework
Map input records=3
Map output records=9
Map output bytes=80
Map output materialized bytes=80
Input split bytes=100
Combine input records=9
Combine output records=7
Reduce input groups=7
Reduce shuffle bytes=80
Reduce input records=7
Reduce output records=7
Spilled Records=14
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=135
CPU time spent (ms)=1070
Physical memory (bytes) snapshot=224751616
Virtual memory (bytes) snapshot=781852672
Total committed heap usage (bytes)=137039872
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=44
File Output Format Counters
Bytes Written=46
[root@linuxmain hadoop]#

Take a look at the result:
[root@linuxmain hadoop]# bin/hdfs dfs -cat /output/*

I 1
hello 2
love 1
world 2
yang 1
you 1
yue 1

If you see this output, the job ran successfully and your build is working.
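
For reference, the wordcount shipped in the examples jar (org.apache.hadoop.examples.WordCount) is essentially the classic code from the MapReduce tutorial, sketched here (the bundled class may differ in minor details):

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Emits (word, 1) for every whitespace-separated token in a line
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Sums the counts per word; also used as the combiner, which is why
    // the job counters above show Combine input/output records
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}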
3. Upload your own file

To count your own text, copy it into the input directory in HDFS:
hdfs dfs -copyFromLocal <localsrc> URI

[root@linuxmain hadoop]# bin/hdfs dfs -copyFromLocal /root/Desktop/new.txt /input

The -f variant below overwrites the destination file if it already exists:
[root@linuxmain hadoop]# bin/hdfs dfs -copyFromLocal -f /root/Desktop/new.txt /input
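
The upload can also be done from Java. Here is a minimal sketch (the class name UploadInput is just for illustration); the overwrite argument of FileSystem.copyFromLocalFile plays the role of -f:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UploadInput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // picks up core-site.xml
        FileSystem fs = FileSystem.get(conf);
        // delSrc=false keeps the local copy; overwrite=true mirrors -copyFromLocal -f
        fs.copyFromLocalFile(false, true,
                new Path("/root/Desktop/new.txt"), new Path("/input"));
        fs.close();
    }
}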


I tested this myself and it works.

