MapReduce (3): Common Counters Explained

In this post I would like to explain the meaning of the Hadoop counters (the ones you generally see after a job completes). I have been analyzing the starvation of long-running jobs in our relatively small cluster, and the Hadoop counters were extremely important in this investigation. Unfortunately, I could not find any resource that explains their meaning in detail. In the table below, I try to describe clearly what each counter means in the Hadoop 2.6 release.

Counter Name	Counter Display Name	Detailed explanation
File System Counters
FILE_BYTES_READ	FILE: Number of bytes read	Amount of data read from local filesystem.
FILE_BYTES_WRITTEN	FILE: Number of bytes written	Amount of data written to local filesystem.
FILE_READ_OPS	FILE: Number of read operations	Number of read operations from local filesystem.
FILE_LARGE_READ_OPS	FILE: Number of large read operations	Number of large read operations on the local filesystem, such as listing the files under a large directory.
FILE_WRITE_OPS	FILE: Number of write operations	Number of write operations to local filesystem.
HDFS_BYTES_READ	HDFS: Number of bytes read	Amount of data read from HDFS.
HDFS_BYTES_WRITTEN	HDFS: Number of bytes written	Amount of data written to HDFS.
HDFS_READ_OPS	HDFS: Number of read operations	Number of read operations from HDFS.
HDFS_LARGE_READ_OPS	HDFS: Number of large read operations	Number of large read operations on HDFS, such as listing the files under a large directory.
HDFS_WRITE_OPS	HDFS: Number of write operations	Number of write operations to HDFS.
Job Counters
TOTAL_LAUNCHED_MAPS	Launched map tasks	Total number of launched map tasks.
TOTAL_LAUNCHED_REDUCES	Launched reduce tasks	Total number of launched reduce tasks.
DATA_LOCAL_MAPS	Data-local map tasks	Number of map tasks which were launched on the nodes containing required data.
SLOTS_MILLIS_MAPS	Total time spent by all maps in occupied slots (ms)	Total time map tasks were executing.
SLOTS_MILLIS_REDUCES	Total time spent by all reduces in occupied slots (ms)	Total time reduce tasks were executing.
MILLIS_MAPS	Total time spent by all map tasks (ms)	Wall-clock time during which resources were occupied by mappers.
MILLIS_REDUCES	Total time spent by all reduce tasks (ms)	Wall-clock time during which resources were occupied by reducers.
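VCORES_MILLIS_MAPS	Total vcore-seconds taken by all map tasks	Aggregated number of vcores allocated to the mappers multiplied by the time the mappers have been running. As the counter name indicates, the value is accumulated in milliseconds, even though the display name says seconds.
VCORES_MILLIS_REDUCES	Total vcore-seconds taken by all reduce tasks	Aggregated number of vcores allocated to the reducers multiplied by the time the reducers have been running. As the counter name indicates, the value is accumulated in milliseconds, even though the display name says seconds.
MB_MILLIS_MAPS	Total megabyte-seconds taken by all map tasks	Aggregated amount of memory (in megabytes) allocated to the mappers multiplied by the time the mappers have been running. As the counter name indicates, the value is accumulated in milliseconds, even though the display name says seconds.
MB_MILLIS_REDUCES	Total megabyte-seconds taken by all reduce tasks	Aggregated amount of memory (in megabytes) allocated to the reducers multiplied by the time the reducers have been running. As the counter name indicates, the value is accumulated in milliseconds, even though the display name says seconds.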
Map-Reduce Framework
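MAP_INPUT_RECORDS	Map input records	Total number of input records read by all mappers (counted by the RecordReader).
MAP_OUTPUT_RECORDS	Map output records	Total number of records emitted by all mappers (counted by the OutputCollector).
MAP_OUTPUT_BYTES	Map output bytes	Total size in bytes of the uncompressed records emitted by all mappers (counted by the OutputCollector).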
MAP_OUTPUT_MATERIALIZED_BYTES	Map output materialized bytes	Amount of map output actually written to disk. It differs from MAP_OUTPUT_BYTES when compression is enabled.
SPLIT_RAW_BYTES	Input split bytes	Amount of data consumed for metadata representation during splits.
COMBINE_INPUT_RECORDS	Combine input records	Total number of records processed by combiners (if implemented in the application). Updated every time a value is read from the combiner's iterator.
COMBINE_OUTPUT_RECORDS	Combine output records	Total number of records emitted by combiners (counted by the OutputCollector).
REDUCE_INPUT_GROUPS	Reduce input groups	Total number of unique keys (the number of distinct key groups processed by all reducers).
REDUCE_SHUFFLE_BYTES	Reduce shuffle bytes	Total number of bytes of map output copied to the reducers during the shuffle phase.
REDUCE_INPUT_RECORDS	Reduce input records	Total number of records processed by all reducers.
REDUCE_OUTPUT_RECORDS	Reduce output records	Total number of records produced by all reducers.
SPILLED_RECORDS	Spilled Records	Total number of records (by mappers and reducers) which were spilled to disk (happens when there is not enough memory).
SHUFFLED_MAPS	Shuffled Maps	Total number of mappers whose output went through the shuffle phase.
FAILED_SHUFFLE	Failed Shuffles	Total number of mappers whose output failed to go through the shuffle phase.
MERGED_MAP_OUTPUTS	Merged Map outputs	Total number of mapper output files that were merged after going through the shuffle phase.
GC_TIME_MILLIS	GC time elapsed (ms)	Wall-clock time spent on garbage collection.
CPU_MILLISECONDS	CPU time spent (ms)	Cumulative CPU time for all tasks.
PHYSICAL_MEMORY_BYTES	Physical memory (bytes) snapshot	Total physical memory used by all tasks including spilled data.
VIRTUAL_MEMORY_BYTES	Virtual memory (bytes) snapshot	Total virtual memory used by all tasks.
COMMITTED_HEAP_BYTES	Total committed heap usage (bytes)	Total amount of memory available for JVM.
Shuffle Errors
BAD_ID	BAD_ID	Total number of errors related to the interpretation of IDs from shuffle headers (the mapper ID, for example).
CONNECTION	CONNECTION	Source code does not reveal any usage for this counter.
IO_ERROR	IO_ERROR	Total number of errors related to reading and writing intermediate data.
WRONG_LENGTH	WRONG_LENGTH	Total number of errors related to misbehaving compression and decompression of intermediate data.
WRONG_MAP	WRONG_MAP	Total number of errors related to duplication of the mapper output data (when the framework tries to process mapper output that has already been processed).
WRONG_REDUCE	WRONG_REDUCE	Total number of errors related to attempts to shuffle data for the wrong reducer (when the shuffle for a given reducer tries to fetch data intended for a different reducer).
File Input Format Counters
BYTES_READ	Bytes Read	Amount of data read by all tasks across all filesystems.
File Output Format Counters
BYTES_WRITTEN	Bytes Written	Amount of data written by all tasks across all filesystems.
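
Several of the counters above are easier to interpret as ratios than as raw totals. The following is a minimal Python sketch of that idea, not part of Hadoop itself: the function name `derived_metrics`, the metric names, and all sample values are made up for illustration. It assumes the counters have already been collected into a plain dictionary and that the `*_MILLIS_*` counters are accumulated in milliseconds.

```python
def derived_metrics(counters):
    """Compute a few illustrative ratios from a dict of counter name -> value."""

    def ratio(numerator, denominator):
        # Return None when a counter is missing or the denominator is zero.
        num = counters.get(numerator)
        den = counters.get(denominator)
        if num is None or not den:
            return None
        return num / den

    return {
        # Fraction of map tasks launched on a node holding their input data.
        "data_local_map_fraction": ratio("DATA_LOCAL_MAPS", "TOTAL_LAUNCHED_MAPS"),
        # MB-milliseconds divided by milliseconds: average memory per running mapper.
        "avg_map_memory_mb": ratio("MB_MILLIS_MAPS", "MILLIS_MAPS"),
        # vcore-milliseconds divided by milliseconds: average vcores per running mapper.
        "avg_map_vcores": ratio("VCORES_MILLIS_MAPS", "MILLIS_MAPS"),
        # Bytes written to disk relative to bytes emitted by the mappers.
        "materialized_to_output_bytes": ratio(
            "MAP_OUTPUT_MATERIALIZED_BYTES", "MAP_OUTPUT_BYTES"
        ),
        # Spilled records per record emitted by the mappers.
        "spilled_per_map_output_record": ratio("SPILLED_RECORDS", "MAP_OUTPUT_RECORDS"),
    }


# Hypothetical counter values, chosen only to make the arithmetic easy to follow.
sample = {
    "TOTAL_LAUNCHED_MAPS": 40,
    "DATA_LOCAL_MAPS": 30,
    "MILLIS_MAPS": 200_000,
    "MB_MILLIS_MAPS": 204_800_000,
    "VCORES_MILLIS_MAPS": 200_000,
    "MAP_OUTPUT_BYTES": 1_000_000,
    "MAP_OUTPUT_MATERIALIZED_BYTES": 250_000,
    "MAP_OUTPUT_RECORDS": 50_000,
    "SPILLED_RECORDS": 100_000,
}

for name, value in derived_metrics(sample).items():
    print(name, value)
```

With these sample values, three quarters of the mappers were data-local, each mapper held 1024 MB and one vcore on average, and every map output record was spilled twice on average.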

The sources of the information:

https://www.mapr.com/blog/managing-monitoring-and-testing-mapreduce-jobs-how-work-counters
http://liveramp.com/engineering/tracking-mapreduce-job-performance-with-counters/
https://hadoop.apache.org/docs/current/api/org/apache/hadoop/yarn/api/records/ApplicationResourceUsageReport.html
http://hadoop.apache.org/docs/r1.0.4/releasenotes.html
Hadoop sources - Fetcher.java class