Hadoop Study (5): A Summary of Hadoop Logging

With the earlier WordCount and AdLog examples we can already write simple MapReduce programs, but the first attempts rarely go smoothly. What do you do when AdLog fails while parsing a JSON log record? Where do you look for a MapReduce job's runtime logs? That is the focus of this post: Hadoop's logs.

Understanding Hadoop's Logs

Hadoop's logs generally fall into two categories:

  • logs written by Hadoop's own system daemons
  • logs written by MapReduce programs

Logs Written by Hadoop's System Daemons

These are the logs written by the built-in daemons we start: NameNode, DataNode, NodeManager, ResourceManager, HistoryServer, and so on. By default they are stored under ${HADOOP_HOME}/logs.

The output directory of these daemon logs can be changed. Note that it is controlled by the HADOOP_LOG_DIR environment variable in ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh rather than by mapred-site.xml (the mapred.local.dir property sometimes shown in this context actually configures MapReduce's local scratch directory):

# hadoop-env.sh: output directory for Hadoop daemon logs
export HADOOP_LOG_DIR=/home/yunyu/birdben_logs
Daemon              Subsystem   Log file name
resourcemanager     YARN        yarn-${USER}-resourcemanager-${hostname}.log
nodemanager         YARN        yarn-${USER}-nodemanager-${hostname}.log
historyserver       MapReduce   mapred-${USER}-historyserver-${hostname}.log
secondarynamenode   HDFS        hadoop-${USER}-secondarynamenode-${hostname}.log
namenode            HDFS        hadoop-${USER}-namenode-${hostname}.log
datanode            HDFS        hadoop-${USER}-datanode-${hostname}.log
Take the resourcemanager log as an example: yarn-${USER}-resourcemanager-${hostname}.log.
${USER} is the user who started the resourcemanager process.
${hostname} is the hostname of the machine the resourcemanager runs on. When a log file reaches a certain size (configurable in ${HADOOP_HOME}/etc/hadoop/log4j.properties) it is rolled over to a new file. The rotated files are named like yarn-${USER}-resourcemanager-${hostname}.log.N, where a larger N means an older log. By default only the 20 most recent log files are kept.
-rw-rw-r-- 1 yunyu yunyu 8987088 Oct 27 11:24 yarn-yunyu-nodemanager-hadoop1.log
-rw-rw-r-- 1 yunyu yunyu     700 Oct 27 10:19 yarn-yunyu-nodemanager-hadoop1.out
-rw-rw-r-- 1 yunyu yunyu    2062 Oct 26 18:25 yarn-yunyu-nodemanager-hadoop1.out.1
-rw-rw-r-- 1 yunyu yunyu    2062 Oct 26 17:51 yarn-yunyu-nodemanager-hadoop1.out.2
-rw-rw-r-- 1 yunyu yunyu     700 Oct 25 16:18 yarn-yunyu-nodemanager-hadoop1.out.3
-rw-rw-r-- 1 yunyu yunyu    2062 Oct 23 17:54 yarn-yunyu-nodemanager-hadoop1.out.4
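The rotation threshold and the number of rotated files kept come from the RollingFileAppender settings in ${HADOOP_HOME}/etc/hadoop/log4j.properties. The stock Hadoop 2.x file defines them roughly like this (256MB and 20 are the shipped defaults):

```properties
# ${HADOOP_HOME}/etc/hadoop/log4j.properties (rolling-file appender settings)
hadoop.log.maxfilesize=256MB
hadoop.log.maxbackupindex=20
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize}
log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex}
```

Raising maxbackupindex keeps a longer history at the cost of disk space on each node.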

Logs Written by MapReduce Programs

Logs written by MapReduce programs break down further into two kinds:

  • job run logs (job history logs)
  • task run logs (container logs)

Job run logs (job history logs)

Job run logs are produced by the MRAppMaster (the ApplicationMaster of a MapReduce job). They record in detail when the job started and how long it ran, when each task started and how long it ran, counter values, and so on. This information is very useful for analysis: from the history records you can derive how many jobs succeed or fail each day, how many jobs ran in each queue, and similar statistics.

The MapReduce ApplicationMaster itself also runs in a container, always the one numbered 000001 (for example container_1385051297072_0001_01_000001). It can be seen as a special task, so it has a run log of its own (a container log), just like the Map and Reduce tasks; but that log is not the "job run log" described here.

Job run logs differ from the other log files in that the history records are stored on HDFS rather than on the local filesystem. The HDFS paths can be set in mapred-site.xml, separately for MapReduce jobs that are still running and for jobs that have finished.

mapred-site.xml
<!-- HDFS path for information about finished MapReduce jobs -->
<property>
    <name>mapreduce.jobhistory.done-dir</name>
    <value>${yarn.app.mapreduce.am.staging-dir}/history/done</value>
</property>
<!-- HDFS path for information about running MapReduce jobs -->
<property>
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</value>
</property>
<!-- default HDFS base path for MapReduce job information -->
<property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/tmp/hadoop-yarn/staging</value>
</property>
A job run log is produced as follows:
- Step 1: the ResourceManager launches the job's ApplicationMaster. While running, the ApplicationMaster writes its history under ${yarn.app.mapreduce.am.staging-dir}/${username}/.staging/job_XXXXX_XXX/, where yarn.app.mapreduce.am.staging-dir defaults to /tmp/hadoop-yarn/staging. Three files live in that directory, ending in ".jhist", ".summary", and ".xml": the job run log, the job summary, and the job configuration properties.
- Step 2: once all tasks have finished, the job is complete, and the ApplicationMaster copies the three files to ${mapreduce.jobhistory.intermediate-done-dir}/${username} with "_tmp" appended to each name. mapreduce.jobhistory.intermediate-done-dir defaults to ${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate.
- Step 3: when a copy completes, the ApplicationMaster renames the file back to its ".jhist", ".summary", or ".xml" name (dropping "_tmp").
- Step 4: a periodic scanner thread moves the files from done_intermediate to the done directory (configured by mapreduce.jobhistory.done-dir, default ${yarn.app.mapreduce.am.staging-dir}/history/done) and deletes the ".summary" file, since everything in it is also present in the ".jhist" file.
- Step 5: the ApplicationMaster removes the
${yarn.app.mapreduce.am.staging-dir}/${username}/.staging/job_XXXXX_XXX/ directory.
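Steps 2 and 3 rely on a common publish pattern: copy under a "_tmp" name first, then rename, so the scanner in step 4 never picks up a half-copied file. A local-filesystem sketch of the idea (paths and file contents below are made up for illustration):

```python
import os
import shutil
import tempfile

def publish(src, done_dir):
    """Copy src into done_dir safely: write under a _tmp name first,
    then rename once the copy is complete (mirrors steps 2 and 3)."""
    os.makedirs(done_dir, exist_ok=True)
    final = os.path.join(done_dir, os.path.basename(src))
    tmp = final + "_tmp"
    shutil.copyfile(src, tmp)   # readers skip *_tmp files while this runs
    os.rename(tmp, final)       # rename is atomic on POSIX filesystems
    return final

# demo: publish a fake .jhist file
work = tempfile.mkdtemp()
src = os.path.join(work, "job_0001.jhist")
with open(src, "w") as f:
    f.write("history events")
print(os.path.basename(publish(src, os.path.join(work, "done_intermediate"))))
# -> job_0001.jhist
```

The rename in the last step is what makes the handoff safe: a consumer either sees no file at all or the fully written one.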
Task run logs (container logs)

By default, task run logs are kept only on each NodeManager's local disk. You can turn on log aggregation to have tasks push their run logs to HDFS, which makes centralized management and analysis possible.
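Enabling log aggregation is a yarn-site.xml change; a minimal sketch (the remote directory shown is the default, included only for clarity):

```xml
<!-- yarn-site.xml: push finished containers' logs to HDFS -->
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<!-- HDFS directory where aggregated logs are stored (default shown) -->
<property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/tmp/logs</value>
</property>
```

Once enabled, you can pull all of a finished application's container logs from HDFS with `yarn logs -applicationId <application id>`.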
### Points to note
By default the NodeManager saves logs under yarn.nodemanager.log-dirs, whose default value is ${yarn.log.dir}/userlogs, i.e. the logs/userlogs directory under the Hadoop installation. To spread disk load, this parameter is usually set to several paths. Note also that the ApplicationMaster's own log lives in the same directory, because it too runs in a container and is a special task; its directory is the one numbered 000001.
That is, the ApplicationMaster's log directory is named container_XXX_000001, while the ordinary task directories are container_XXX_000002, container_XXX_000003, and so on.
Each container_XXX_00000X directory contains three log files: stdout, stderr, and syslog.
- stderr : error output
- stdout : console output; anything our MapReduce program prints with System.out.println ends up in this file
- syslog : logging-framework output; our MapReduce program's logger.info records end up in this file
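Task code in MapReduce is Java, but the split between print-style output and logging-framework output is language independent. A minimal Python sketch of the same three destinations (the file name syslog.txt is made up for the demo):

```python
import logging
import sys

# route framework logging to a "syslog"-like file, separate from stdout/stderr
log = logging.getLogger("mapper")
log.setLevel(logging.INFO)
handler = logging.FileHandler("syslog.txt", mode="w")
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
log.addHandler(handler)

print("counted one word")                      # -> stdout
print("bad record skipped", file=sys.stderr)   # -> stderr
log.info("processed split 0")                  # -> syslog.txt (the "syslog" role)
handler.flush()
```

This is why a stray print in a Mapper never shows up in syslog: the two streams are wired to different files from the start.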
#### HistoryServer
Hadoop ships with a HistoryServer for browsing MapReduce job records: how many map and reduce tasks a job used, when it was submitted, when it started, when it completed, and so on.
The HistoryServer's communication addresses must be configured in mapred-site.xml, and the service has to be started manually.
##### Starting the HistoryServer
$ mr-jobhistory-daemon.sh start historyserver
##### mapred-site.xml
<!-- IPC address of the MapReduce JobHistory Server; the default port is 10020 -->
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop1:10020</value>
</property>
<!-- web UI address of the MapReduce JobHistory Server; the default port is 19888 -->
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop1:19888</value>
</property>
Now let's look at the MapReduce job information stored on HDFS:
$ hdfs dfs -ls /data/history/done/2016/10/26/000000
Found 34 items
-rwxrwx--- 2 yunyu supergroup  33487 2016-10-26 01:46 /data/history/done/2016/10/26/000000/job_1477390779880_0003-1477471590330-yunyu-wordcount-1477471605995-1-1-SUCCEEDED-default-1477471593992.jhist
-rwxrwx--- 2 yunyu supergroup 113716 2016-10-26 01:46 /data/history/done/2016/10/26/000000/job_1477390779880_0003_conf.xml
-rwxrwx--- 2 yunyu supergroup  33478 2016-10-26 01:48 /data/history/done/2016/10/26/000000/job_1477390779880_0004-1477471680580-yunyu-wordcount-1477471695328-1-1-SUCCEEDED-default-1477471684537.jhist
-rwxrwx--- 2 yunyu supergroup 113716 2016-10-26 01:48 /data/history/done/2016/10/26/000000/job_1477390779880_0004_conf.xml
-rwxrwx--- 2 yunyu supergroup  33472 2016-10-26 01:49 /data/history/done/2016/10/26/000000/job_1477390779880_0005-1477471725086-yunyu-wordcount-1477471740048-1-1-SUCCEEDED-default-1477471729179.jhist
-rwxrwx--- 2 yunyu supergroup 113716 2016-10-26 01:49 /data/history/done/2016/10/26/000000/job_1477390779880_0005_conf.xml
-rwxrwx--- 2 yunyu supergroup  33483 2016-10-26 01:53 /data/history/done/2016/10/26/000000/job_1477390779880_0006-1477471983695-yunyu-wordcount-1477471999289-1-1-SUCCEEDED-default-1477471987996.jhist
-rwxrwx--- 2 yunyu supergroup 113716 2016-10-26 01:53 /data/history/done/2016/10/26/000000/job_1477390779880_0006_conf.xml
-rwxrwx--- 2 yunyu supergroup  33479 2016-10-26 01:55 /data/history/done/2016/10/26/000000/job_1477390779880_0007-1477472135663-yunyu-wordcount-1477472150775-1-1-SUCCEEDED-default-1477472139406.jhist
-rwxrwx--- 2 yunyu supergroup 113716 2016-10-26 01:55 /data/history/done/2016/10/26/000000/job_1477390779880_0007_conf.xml
-rwxrwx--- 2 yunyu supergroup  33477 2016-10-26 02:42 /data/history/done/2016/10/26/000000/job_1477390779880_0008-1477474917310-yunyu-wordcount-1477474934028-1-1-SUCCEEDED-default-1477474921803.jhist
-rwxrwx--- 2 yunyu supergroup 113716 2016-10-26 02:42 /data/history/done/2016/10/26/000000/job_1477390779880_0008_conf.xml
-rwxrwx--- 2 yunyu supergroup  33487 2016-10-26 03:25 /data/history/done/2016/10/26/000000/job_1477477415810_0001-1477477521120-yunyu-wordcount-1477477540308-1-1-SUCCEEDED-default-1477477527364.jhist
-rwxrwx--- 2 yunyu supergroup 113712 2016-10-26 03:25 /data/history/done/2016/10/26/000000/job_1477477415810_0001_conf.xml
-rwxrwx--- 2 yunyu supergroup  33483 2016-10-26 03:51 /data/history/done/2016/10/26/000000/job_1477477415810_0002-1477479057999-yunyu-wordcount-1477479076500-1-1-SUCCEEDED-default-1477479062997.jhist
-rwxrwx--- 2 yunyu supergroup 113712 2016-10-26 03:51 /data/history/done/2016/10/26/000000/job_1477477415810_0002_conf.xml
-rwxrwx--- 2 yunyu supergroup  33476 2016-10-26 04:01 /data/history/done/2016/10/26/000000/job_1477477415810_0003-1477479645080-yunyu-wordcount-1477479661516-1-1-SUCCEEDED-default-1477479650248.jhist
-rwxrwx--- 2 yunyu supergroup 113712 2016-10-26 04:01 /data/history/done/2016/10/26/000000/job_1477477415810_0003_conf.xml
-rwxrwx--- 2 yunyu supergroup  33470 2016-10-26 04:36 /data/history/done/2016/10/26/000000/job_1477477415810_0004-1477481804256-yunyu-wordcount-1477481820774-1-1-SUCCEEDED-default-1477481809917.jhist
-rwxrwx--- 2 yunyu supergroup 113638 2016-10-26 04:36 /data/history/done/2016/10/26/000000/job_1477477415810_0004_conf.xml
-rwxrwx--- 2 yunyu supergroup  33519 2016-10-26 04:46 /data/history/done/2016/10/26/000000/job_1477477415810_0005-1477482378129-yunyu-wordcount-1477482394056-1-1-SUCCEEDED-default-1477482382630.jhist
-rwxrwx--- 2 yunyu supergroup 113682 2016-10-26 04:46 /data/history/done/2016/10/26/000000/job_1477477415810_0005_conf.xml
-rwxrwx--- 2 yunyu supergroup  33518 2016-10-26 04:56 /data/history/done/2016/10/26/000000/job_1477477415810_0006-1477482967602-yunyu-wordcount-1477482983257-1-1-SUCCEEDED-default-1477482971751.jhist
-rwxrwx--- 2 yunyu supergroup 113682 2016-10-26 04:56 /data/history/done/2016/10/26/000000/job_1477477415810_0006_conf.xml
-rwxrwx--- 2 yunyu supergroup  33524 2016-10-26 05:05 /data/history/done/2016/10/26/000000/job_1477477415810_0007-1477483516841-yunyu-wordcount-1477483533885-1-1-SUCCEEDED-default-1477483521388.jhist
-rwxrwx--- 2 yunyu supergroup 113682 2016-10-26 05:05 /data/history/done/2016/10/26/000000/job_1477477415810_0007_conf.xml
-rwxrwx--- 2 yunyu supergroup  33519 2016-10-26 05:27 /data/history/done/2016/10/26/000000/job_1477477415810_0008-1477484838977-yunyu-wordcount-1477484854521-1-1-SUCCEEDED-default-1477484843086.jhist
-rwxrwx--- 2 yunyu supergroup 113682 2016-10-26 05:27 /data/history/done/2016/10/26/000000/job_1477477415810_0008_conf.xml
-rwxrwx--- 2 yunyu supergroup  33520 2016-10-26 19:21 /data/history/done/2016/10/26/000000/job_1477534790849_0001-1477534850971-yunyu-wordcount-1477534870748-1-1-SUCCEEDED-default-1477534857063.jhist
-rwxrwx--- 2 yunyu supergroup 113699 2016-10-26 19:21 /data/history/done/2016/10/26/000000/job_1477534790849_0001_conf.xml
-rwxrwx--- 2 yunyu supergroup  33521 2016-10-26 20:23 /data/history/done/2016/10/26/000000/job_1477534790849_0002-1477538573459-yunyu-wordcount-1477538590195-1-1-SUCCEEDED-default-1477538577756.jhist
-rwxrwx--- 2 yunyu supergroup 113699 2016-10-26 20:23 /data/history/done/2016/10/26/000000/job_1477534790849_0002_conf.xml
-rwxrwx--- 2 yunyu supergroup  33519 2016-10-26 20:24 /data/history/done/2016/10/26/000000/job_1477534790849_0003-1477538645546-yunyu-wordcount-1477538662701-1-1-SUCCEEDED-default-1477538650360.jhist
-rwxrwx--- 2 yunyu supergroup 113699 2016-10-26 20:24 /data/history/done/2016/10/26/000000/job_1477534790849_0003_conf.xml
From the listing above we can observe the following:
- (1) the job history records are stored in an HDFS directory;
- (2) because there can be a great many of them, the records are filed into year/month/day subdirectories, which keeps them manageable and easy to find;
- (3) each Hadoop job's history is stored in two files, with the suffixes *.jhist and *.xml; the *.jhist file holds the detailed information about the job, as shown below;
- (4) every job's history record is kept in its own separate file.
hdfs dfs -cat /data/history/done/2016/10/26/000000/job_1477534790849_0003-1477538645546-yunyu-wordcount-1477538662701-1-1-SUCCEEDED-default-1477538650360.jhist Avro-Json {"type":"record","name":"Event","namespace":"org.apache.hadoop.mapreduce.jobhistory","fields":[{"name":"type","type":{"type":"enum","name":"EventType","symbols":["JOB_SUBMITTED","JOB_INITED","JOB_FINISHED","JOB_PRIORITY_CHANGED","JOB_STATUS_CHANGED","JOB_QUEUE_CHANGED","JOB_FAILED","JOB_KILLED","JOB_ERROR","JOB_INFO_CHANGED","TASK_STARTED","TASK_FINISHED","TASK_FAILED","TASK_UPDATED","NORMALIZED_RESOURCE","MAP_ATTEMPT_STARTED","MAP_ATTEMPT_FINISHED","MAP_ATTEMPT_FAILED","MAP_ATTEMPT_KILLED","REDUCE_ATTEMPT_STARTED","REDUCE_ATTEMPT_FINISHED","REDUCE_ATTEMPT_FAILED","REDUCE_ATTEMPT_KILLED","SETUP_ATTEMPT_STARTED","SETUP_ATTEMPT_FINISHED","SETUP_ATTEMPT_FAILED","SETUP_ATTEMPT_KILLED","CLEANUP_ATTEMPT_STARTED","CLEANUP_ATTEMPT_FINISHED","CLEANUP_ATTEMPT_FAILED","CLEANUP_ATTEMPT_KILLED","AM_STARTED"]}},{"name":"event","type":[{"type":"record","name":"JobFinished","fields":[{"name":"jobid","type":"string"},{"name":"finishTime","type":"long"},{"name":"finishedMaps","type":"int"},{"name":"finishedReduces","type":"int"},{"name":"failedMaps","type":"int"},{"name":"failedReduces","type":"int"},{"name":"totalCounters","type":{"type":"record","name":"JhCounters","fields":[{"name":"name","type":"string"},{"name":"groups","type":{"type":"array","items":{"type":"record","name":"JhCounterGroup","fields":[{"name":"name","type":"string"},{"name":"displayName","type":"string"},{"name":"counts","type":{"type":"array","items":{"type":"record","name":"JhCounter","fields":[{"name":"name","type":"string"},{"name":"displayName","type":"string"},{"name":"value","type":"long"}]}}}]}}}]}},{"name":"mapCounters","type":"JhCounters"},{"name":"reduceCounters","type":"JhCounters"}]},{"type":"record","name":"JobInfoChange","fields":[{"name":"jobid","type":"string"},{"name":"submitTime","type":"long"},{"name":"launchTime","type":"long
"}]},{"type":"record","name":"JobInited","fields":[{"name":"jobid","type":"string"},{"name":"launchTime","type":"long"},{"name":"totalMaps","type":"int"},{"name":"totalReduces","type":"int"},{"name":"jobStatus","type":"string"},{"name":"uberized","type":"boolean"}]},{"type":"record","name":"AMStarted","fields":[{"name":"applicationAttemptId","type":"string"},{"name":"startTime","type":"long"},{"name":"containerId","type":"string"},{"name":"nodeManagerHost","type":"string"},{"name":"nodeManagerPort","type":"int"},{"name":"nodeManagerHttpPort","type":"int"}]},{"type":"record","name":"JobPriorityChange","fields":[{"name":"jobid","type":"string"},{"name":"priority","type":"string"}]},{"type":"record","name":"JobQueueChange","fields":[{"name":"jobid","type":"string"},{"name":"jobQueueName","type":"string"}]},{"type":"record","name":"JobStatusChanged","fields":[{"name":"jobid","type":"string"},{"name":"jobStatus","type":"string"}]},{"type":"record","name":"JobSubmitted","fields":[{"name":"jobid","type":"string"},{"name":"jobName","type":"string"},{"name":"userName","type":"string"},{"name":"submitTime","type":"long"},{"name":"jobConfPath","type":"string"},{"name":"acls","type":{"type":"map","values":"string"}},{"name":"jobQueueName","type":"string"},{"name":"workflowId","type":["null","string"],"default":null},{"name":"workflowName","type":["null","string"],"default":null},{"name":"workflowNodeName","type":["null","string"],"default":null},{"name":"workflowAdjacencies","type":["null","string"],"default":null},{"name":"workflowTags","type":["null","string"],"default":null}]},{"type":"record","name":"JobUnsuccessfulCompletion","fields":[{"name":"jobid","type":"string"},{"name":"finishTime","type":"long"},{"name":"finishedMaps","type":"int"},{"name":"finishedReduces","type":"int"},{"name":"jobStatus","type":"string"},{"name":"diagnostics","type":["null","string"],"default":null}]},{"type":"record","name":"MapAttemptFinished","fields":[{"name":"taskid","type":"string"},{"name
":"attemptId","type":"string"},{"name":"taskType","type":"string"},{"name":"taskStatus","type":"string"},{"name":"mapFinishTime","type":"long"},{"name":"finishTime","type":"long"},{"name":"hostname","type":"string"},{"name":"port","type":"int"},{"name":"rackname","type":"string"},{"name":"state","type":"string"},{"name":"counters","type":"JhCounters"},{"name":"clockSplits","type":{"type":"array","items":"int"}},{"name":"cpuUsages","type":{"type":"array","items":"int"}},{"name":"vMemKbytes","type":{"type":"array","items":"int"}},{"name":"physMemKbytes","type":{"type":"array","items":"int"}}]},{"type":"record","name":"ReduceAttemptFinished","fields":[{"name":"taskid","type":"string"},{"name":"attemptId","type":"string"},{"name":"taskType","type":"string"},{"name":"taskStatus","type":"string"},{"name":"shuffleFinishTime","type":"long"},{"name":"sortFinishTime","type":"long"},{"name":"finishTime","type":"long"},{"name":"hostname","type":"string"},{"name":"port","type":"int"},{"name":"rackname","type":"string"},{"name":"state","type":"string"},{"name":"counters","type":"JhCounters"},{"name":"clockSplits","type":{"type":"array","items":"int"}},{"name":"cpuUsages","type":{"type":"array","items":"int"}},{"name":"vMemKbytes","type":{"type":"array","items":"int"}},{"name":"physMemKbytes","type":{"type":"array","items":"int"}}]},{"type":"record","name":"TaskAttemptFinished","fields":[{"name":"taskid","type":"string"},{"name":"attemptId","type":"string"},{"name":"taskType","type":"string"},{"name":"taskStatus","type":"string"},{"name":"finishTime","type":"long"},{"name":"rackname","type":"string"},{"name":"hostname","type":"string"},{"name":"state","type":"string"},{"name":"counters","type":"JhCounters"}]},{"type":"record","name":"TaskAttemptStarted","fields":[{"name":"taskid","type":"string"},{"name":"taskType","type":"string"},{"name":"attemptId","type":"string"},{"name":"startTime","type":"long"},{"name":"trackerName","type":"string"},{"name":"httpPort","type":"int"},{"name"
:"shufflePort","type":"int"},{"name":"containerId","type":"string"},{"name":"locality","type":["null","string"],"default":null},{"name":"avataar","type":["null","string"],"default":null}]},{"type":"record","name":"TaskAttemptUnsuccessfulCompletion","fields":[{"name":"taskid","type":"string"},{"name":"taskType","type":"string"},{"name":"attemptId","type":"string"},{"name":"finishTime","type":"long"},{"name":"hostname","type":"string"},{"name":"port","type":"int"},{"name":"rackname","type":"string"},{"name":"status","type":"string"},{"name":"error","type":"string"},{"name":"counters","type":["null","JhCounters"],"default":null},{"name":"clockSplits","type":{"type":"array","items":"int"}},{"name":"cpuUsages","type":{"type":"array","items":"int"}},{"name":"vMemKbytes","type":{"type":"array","items":"int"}},{"name":"physMemKbytes","type":{"type":"array","items":"int"}}]},{"type":"record","name":"TaskFailed","fields":[{"name":"taskid","type":"string"},{"name":"taskType","type":"string"},{"name":"finishTime","type":"long"},{"name":"error","type":"string"},{"name":"failedDueToAttempt","type":["null","string"]},{"name":"status","type":"string"},{"name":"counters","type":["null","JhCounters"],"default":null}]},{"type":"record","name":"TaskFinished","fields":[{"name":"taskid","type":"string"},{"name":"taskType","type":"string"},{"name":"finishTime","type":"long"},{"name":"status","type":"string"},{"name":"counters","type":"JhCounters"},{"name":"successfulAttemptId","type":["null","string"],"default":null}]},{"type":"record","name":"TaskStarted","fields":[{"name":"taskid","type":"string"},{"name":"taskType","type":"string"},{"name":"startTime","type":"long"},{"name":"splitLocations","type":"string"}]},{"type":"record","name":"TaskUpdated","fields":[{"name":"taskid","type":"string"},{"name":"finishTime","type":"long"}]}]}]} 
{"type":"AM_STARTED","event":{"org.apache.hadoop.mapreduce.jobhistory.AMStarted":{"applicationAttemptId":"appattempt_1477534790849_0003_000001","startTime":1477538647394,"containerId":"container_1477534790849_0003_01_000001","nodeManagerHost":"hadoop1","nodeManagerPort":47596,"nodeManagerHttpPort":8042}}} {"type":"JOB_SUBMITTED","event":{"org.apache.hadoop.mapreduce.jobhistory.JobSubmitted":{"jobid":"job_1477534790849_0003","jobName":"wordcount","userName":"yunyu","submitTime":1477538645546,"jobConfPath":"hdfs://hadoop1/tmp/hadoop-yarn/staging/yunyu/.staging/job_1477534790849_0003/job.xml","acls":{},"jobQueueName":"default","workflowId":{"string":""},"workflowName":{"string":""},"workflowNodeName":{"string":""},"workflowAdjacencies":{"string":""},"workflowTags":{"string":""}}}} {"type":"JOB_QUEUE_CHANGED","event":{"org.apache.hadoop.mapreduce.jobhistory.JobQueueChange":{"jobid":"job_1477534790849_0003","jobQueueName":"default"}}} {"type":"JOB_INITED","event":{"org.apache.hadoop.mapreduce.jobhistory.JobInited":{"jobid":"job_1477534790849_0003","launchTime":1477538650360,"totalMaps":1,"totalReduces":1,"jobStatus":"INITED","uberized":false}}} {"type":"JOB_INFO_CHANGED","event":{"org.apache.hadoop.mapreduce.jobhistory.JobInfoChange":{"jobid":"job_1477534790849_0003","submitTime":1477538645546,"launchTime":1477538650360}}} {"type":"TASK_STARTED","event":{"org.apache.hadoop.mapreduce.jobhistory.TaskStarted":{"taskid":"task_1477534790849_0003_m_000000","taskType":"MAP","startTime":1477538650763,"splitLocations":"hadoop1,hadoop2"}}} {"type":"TASK_STARTED","event":{"org.apache.hadoop.mapreduce.jobhistory.TaskStarted":{"taskid":"task_1477534790849_0003_r_000000","taskType":"REDUCE","startTime":1477538650767,"splitLocations":""}}} 
{"type":"MAP_ATTEMPT_STARTED","event":{"org.apache.hadoop.mapreduce.jobhistory.TaskAttemptStarted":{"taskid":"task_1477534790849_0003_m_000000","taskType":"MAP","attemptId":"attempt_1477534790849_0003_m_000000_0","startTime":1477538652833,"trackerName":"hadoop2","httpPort":8042,"shufflePort":13562,"containerId":"container_1477534790849_0003_01_000002","locality":{"string":"NODE_LOCAL"},"avataar":{"string":"VIRGIN"}}}} {"type":"MAP_ATTEMPT_FINISHED","event":{"org.apache.hadoop.mapreduce.jobhistory.MapAttemptFinished":{"taskid":"task_1477534790849_0003_m_000000","attemptId":"attempt_1477534790849_0003_m_000000_0","taskType":"MAP","taskStatus":"SUCCEEDED","mapFinishTime":1477538656711,"finishTime":1477538656933,"hostname":"hadoop2","port":34358,"rackname":"/default-rack","state":"map","counters":{"name":"COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes read","value":0},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":115420},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes read","value":194},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":0},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read operations","value":3},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":0}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"MAP_INPUT_RECORDS","displayName":"Map input 
records","value":4},{"name":"MAP_OUTPUT_RECORDS","displayName":"Map output records","value":14},{"name":"MAP_OUTPUT_BYTES","displayName":"Map output bytes","value":141},{"name":"MAP_OUTPUT_MATERIALIZED_BYTES","displayName":"Map output materialized bytes","value":114},{"name":"SPLIT_RAW_BYTES","displayName":"Input split bytes","value":109},{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":14},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":9},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":0},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":83},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":490},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) snapshot","value":216502272},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":666292224},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":120721408}]},{"name":"org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter","displayName":"File Input Format Counters ","counts":[{"name":"BYTES_READ","displayName":"Bytes Read","value":85}]}]},"clockSplits":[3941,11,12,11,11,12,11,11,12,11,11,12],"cpuUsages":[40,41,41,41,41,41,40,41,41,41,41,41],"vMemKbytes":[27111,81334,135557,189780,244003,298226,352449,406672,460895,515118,569341,623564],"physMemKbytes":[8809,26428,44047,61666,79285,96904,114523,132142,149761,167380,184999,202618]}}} {"type":"TASK_FINISHED","event":{"org.apache.hadoop.mapreduce.jobhistory.TaskFinished":{"taskid":"task_1477534790849_0003_m_000000","taskType":"MAP","finishTime":1477538656933,"status":"SUCCEEDED","counters":{"name":"COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System 
Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes read","value":0},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":115420},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes read","value":194},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":0},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read operations","value":3},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":0}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"MAP_INPUT_RECORDS","displayName":"Map input records","value":4},{"name":"MAP_OUTPUT_RECORDS","displayName":"Map output records","value":14},{"name":"MAP_OUTPUT_BYTES","displayName":"Map output bytes","value":141},{"name":"MAP_OUTPUT_MATERIALIZED_BYTES","displayName":"Map output materialized bytes","value":114},{"name":"SPLIT_RAW_BYTES","displayName":"Input split bytes","value":109},{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":14},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":9},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":0},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":83},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":490},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) 
snapshot","value":216502272},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":666292224},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":120721408}]},{"name":"org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter","displayName":"File Input Format Counters ","counts":[{"name":"BYTES_READ","displayName":"Bytes Read","value":85}]}]},"successfulAttemptId":{"string":"attempt_1477534790849_0003_m_000000_0"}}}} {"type":"REDUCE_ATTEMPT_STARTED","event":{"org.apache.hadoop.mapreduce.jobhistory.TaskAttemptStarted":{"taskid":"task_1477534790849_0003_r_000000","taskType":"REDUCE","attemptId":"attempt_1477534790849_0003_r_000000_0","startTime":1477538659565,"trackerName":"hadoop2","httpPort":8042,"shufflePort":13562,"containerId":"container_1477534790849_0003_01_000003","locality":{"string":"OFF_SWITCH"},"avataar":{"string":"VIRGIN"}}}} {"type":"REDUCE_ATTEMPT_FINISHED","event":{"org.apache.hadoop.mapreduce.jobhistory.ReduceAttemptFinished":{"taskid":"task_1477534790849_0003_r_000000","attemptId":"attempt_1477534790849_0003_r_000000_0","taskType":"REDUCE","taskStatus":"SUCCEEDED","shuffleFinishTime":1477538662152,"sortFinishTime":1477538662173,"finishTime":1477538662652,"hostname":"hadoop2","port":34358,"rackname":"/default-rack","state":"reduce > reduce","counters":{"name":"COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes read","value":114},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":115365},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes 
read","value":0},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":72},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read operations","value":3},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":2}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":0},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":0},{"name":"REDUCE_INPUT_GROUPS","displayName":"Reduce input groups","value":9},{"name":"REDUCE_SHUFFLE_BYTES","displayName":"Reduce shuffle bytes","value":114},{"name":"REDUCE_INPUT_RECORDS","displayName":"Reduce input records","value":9},{"name":"REDUCE_OUTPUT_RECORDS","displayName":"Reduce output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":9},{"name":"SHUFFLED_MAPS","displayName":"Shuffled Maps ","value":1},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":1},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":60},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":820},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) snapshot","value":120524800},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":672628736},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":15728640}]},{"name":"Shuffle Errors","displayName":"Shuffle 
Errors","counts":[{"name":"BAD_ID","displayName":"BAD_ID","value":0},{"name":"CONNECTION","displayName":"CONNECTION","value":0},{"name":"IO_ERROR","displayName":"IO_ERROR","value":0},{"name":"WRONG_LENGTH","displayName":"WRONG_LENGTH","value":0},{"name":"WRONG_MAP","displayName":"WRONG_MAP","value":0},{"name":"WRONG_REDUCE","displayName":"WRONG_REDUCE","value":0}]},{"name":"org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter","displayName":"File Output Format Counters ","counts":[{"name":"BYTES_WRITTEN","displayName":"Bytes Written","value":72}]}]},"clockSplits":[2689,35,35,35,35,35,35,35,35,35,35,36],"cpuUsages":[68,68,69,68,68,69,68,68,69,68,68,69],"vMemKbytes":[27369,82107,136846,191584,246323,301062,355801,410539,465278,520017,574755,629494],"physMemKbytes":[4904,14712,24520,34328,44137,53945,63754,73561,83370,93179,102986,112795]}}} {"type":"TASK_FINISHED","event":{"org.apache.hadoop.mapreduce.jobhistory.TaskFinished":{"taskid":"task_1477534790849_0003_r_000000","taskType":"REDUCE","finishTime":1477538662652,"status":"SUCCEEDED","counters":{"name":"COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes read","value":114},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":115365},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes read","value":0},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":72},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read operations","value":3},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read 
operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":2}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":0},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":0},{"name":"REDUCE_INPUT_GROUPS","displayName":"Reduce input groups","value":9},{"name":"REDUCE_SHUFFLE_BYTES","displayName":"Reduce shuffle bytes","value":114},{"name":"REDUCE_INPUT_RECORDS","displayName":"Reduce input records","value":9},{"name":"REDUCE_OUTPUT_RECORDS","displayName":"Reduce output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":9},{"name":"SHUFFLED_MAPS","displayName":"Shuffled Maps ","value":1},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":1},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":60},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":820},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) snapshot","value":120524800},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":672628736},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":15728640}]},{"name":"Shuffle Errors","displayName":"Shuffle Errors","counts":[{"name":"BAD_ID","displayName":"BAD_ID","value":0},{"name":"CONNECTION","displayName":"CONNECTION","value":0},{"name":"IO_ERROR","displayName":"IO_ERROR","value":0},{"name":"WRONG_LENGTH","displayName":"WRONG_LENGTH","value":0},{"name":"WRONG_MAP","displayName":"WRONG_MAP","value":0},{"name":"WRONG_REDUCE","displayName":"WRONG_REDUCE","value":0}]},{"name":"org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter","displayName":"File Output Format Counters 
","counts":[{"name":"BYTES_WRITTEN","displayName":"Bytes Written","value":72}]}]},"successfulAttemptId":{"string":"attempt_1477534790849_0003_r_000000_0"}}}} {"type":"JOB_FINISHED","event":{"org.apache.hadoop.mapreduce.jobhistory.JobFinished":{"jobid":"job_1477534790849_0003","finishTime":1477538662701,"finishedMaps":1,"finishedReduces":1,"failedMaps":0,"failedReduces":0,"totalCounters":{"name":"TOTAL_COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes read","value":114},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":230785},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes read","value":194},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":72},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read operations","value":6},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":2}]},{"name":"org.apache.hadoop.mapreduce.JobCounter","displayName":"Job Counters ","counts":[{"name":"TOTAL_LAUNCHED_MAPS","displayName":"Launched map tasks","value":1},{"name":"TOTAL_LAUNCHED_REDUCES","displayName":"Launched reduce tasks","value":1},{"name":"DATA_LOCAL_MAPS","displayName":"Data-local map tasks","value":1},{"name":"SLOTS_MILLIS_MAPS","displayName":"Total time spent by all maps in occupied slots (ms)","value":4100},{"name":"SLOTS_MILLIS_REDUCES","displayName":"Total time spent by all reduces in occupied slots (ms)","value":3087},{"name":"MILLIS_MAPS","displayName":"Total time spent 
by all map tasks (ms)","value":4100},{"name":"MILLIS_REDUCES","displayName":"Total time spent by all reduce tasks (ms)","value":3087},{"name":"VCORES_MILLIS_MAPS","displayName":"Total vcore-seconds taken by all map tasks","value":4100},{"name":"VCORES_MILLIS_REDUCES","displayName":"Total vcore-seconds taken by all reduce tasks","value":3087},{"name":"MB_MILLIS_MAPS","displayName":"Total megabyte-seconds taken by all map tasks","value":4198400},{"name":"MB_MILLIS_REDUCES","displayName":"Total megabyte-seconds taken by all reduce tasks","value":3161088}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"MAP_INPUT_RECORDS","displayName":"Map input records","value":4},{"name":"MAP_OUTPUT_RECORDS","displayName":"Map output records","value":14},{"name":"MAP_OUTPUT_BYTES","displayName":"Map output bytes","value":141},{"name":"MAP_OUTPUT_MATERIALIZED_BYTES","displayName":"Map output materialized bytes","value":114},{"name":"SPLIT_RAW_BYTES","displayName":"Input split bytes","value":109},{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":14},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":9},{"name":"REDUCE_INPUT_GROUPS","displayName":"Reduce input groups","value":9},{"name":"REDUCE_SHUFFLE_BYTES","displayName":"Reduce shuffle bytes","value":114},{"name":"REDUCE_INPUT_RECORDS","displayName":"Reduce input records","value":9},{"name":"REDUCE_OUTPUT_RECORDS","displayName":"Reduce output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":18},{"name":"SHUFFLED_MAPS","displayName":"Shuffled Maps ","value":1},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":1},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":143},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent 
(ms)","value":1310},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) snapshot","value":337027072},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":1338920960},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":136450048}]},{"name":"Shuffle Errors","displayName":"Shuffle Errors","counts":[{"name":"BAD_ID","displayName":"BAD_ID","value":0},{"name":"CONNECTION","displayName":"CONNECTION","value":0},{"name":"IO_ERROR","displayName":"IO_ERROR","value":0},{"name":"WRONG_LENGTH","displayName":"WRONG_LENGTH","value":0},{"name":"WRONG_MAP","displayName":"WRONG_MAP","value":0},{"name":"WRONG_REDUCE","displayName":"WRONG_REDUCE","value":0}]},{"name":"org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter","displayName":"File Input Format Counters ","counts":[{"name":"BYTES_READ","displayName":"Bytes Read","value":85}]},{"name":"org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter","displayName":"File Output Format Counters ","counts":[{"name":"BYTES_WRITTEN","displayName":"Bytes Written","value":72}]}]},"mapCounters":{"name":"MAP_COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes read","value":0},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":115420},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes read","value":194},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":0},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read 
operations","value":3},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":0}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"MAP_INPUT_RECORDS","displayName":"Map input records","value":4},{"name":"MAP_OUTPUT_RECORDS","displayName":"Map output records","value":14},{"name":"MAP_OUTPUT_BYTES","displayName":"Map output bytes","value":141},{"name":"MAP_OUTPUT_MATERIALIZED_BYTES","displayName":"Map output materialized bytes","value":114},{"name":"SPLIT_RAW_BYTES","displayName":"Input split bytes","value":109},{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":14},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":9},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":0},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":83},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":490},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) snapshot","value":216502272},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":666292224},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":120721408}]},{"name":"org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter","displayName":"File Input Format Counters ","counts":[{"name":"BYTES_READ","displayName":"Bytes Read","value":85}]}]},"reduceCounters":{"name":"REDUCE_COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes 
read","value":114},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":115365},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes read","value":0},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":72},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read operations","value":3},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":2}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":0},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":0},{"name":"REDUCE_INPUT_GROUPS","displayName":"Reduce input groups","value":9},{"name":"REDUCE_SHUFFLE_BYTES","displayName":"Reduce shuffle bytes","value":114},{"name":"REDUCE_INPUT_RECORDS","displayName":"Reduce input records","value":9},{"name":"REDUCE_OUTPUT_RECORDS","displayName":"Reduce output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":9},{"name":"SHUFFLED_MAPS","displayName":"Shuffled Maps ","value":1},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":1},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":60},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":820},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) snapshot","value":120524800},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) 
snapshot","value":672628736},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":15728640}]},{"name":"Shuffle Errors","displayName":"Shuffle Errors","counts":[{"name":"BAD_ID","displayName":"BAD_ID","value":0},{"name":"CONNECTION","displayName":"CONNECTION","value":0},{"name":"IO_ERROR","displayName":"IO_ERROR","value":0},{"name":"WRONG_LENGTH","displayName":"WRONG_LENGTH","value":0},{"name":"WRONG_MAP","displayName":"WRONG_MAP","value":0},{"name":"WRONG_REDUCE","displayName":"WRONG_REDUCE","value":0}]},{"name":"org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter","displayName":"File Output Format Counters ","counts":[{"name":"BYTES_WRITTEN","displayName":"Bytes Written","value":72}]}]}}}}
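Each line of the `.jhist` history file shown above is a single JSON event. In real code you would normally let Hadoop's own `JobHistoryParser` read these files, but as a hedged illustration (field names are taken from the dump above; the helper name is made up), a few lines of Python are enough to pull a counter value out of one event:

```python
import json

def find_counter(event_json, group_name, counter_name):
    """Search a jhist event's counter groups for a named counter value."""
    event = json.loads(event_json)
    # The payload sits under a single Avro-record key, e.g.
    # "org.apache.hadoop.mapreduce.jobhistory.JobFinished".
    (payload,) = event["event"].values()
    for counters_key in ("totalCounters", "counters"):
        counters = payload.get(counters_key)
        if counters is None:
            continue
        for group in counters["groups"]:
            if group["name"] != group_name:
                continue
            for count in group["counts"]:
                if count["name"] == counter_name:
                    return count["value"]
    return None

# A trimmed-down JOB_FINISHED event modeled on the dump above:
sample = json.dumps({
    "type": "JOB_FINISHED",
    "event": {
        "org.apache.hadoop.mapreduce.jobhistory.JobFinished": {
            "jobid": "job_1477534790849_0003",
            "totalCounters": {
                "name": "TOTAL_COUNTERS",
                "groups": [{
                    "name": "org.apache.hadoop.mapreduce.TaskCounter",
                    "displayName": "Map-Reduce Framework",
                    "counts": [{
                        "name": "REDUCE_OUTPUT_RECORDS",
                        "displayName": "Reduce output records",
                        "value": 9,
                    }],
                }],
            },
        }
    },
})

print(find_counter(sample, "org.apache.hadoop.mapreduce.TaskCounter",
                   "REDUCE_OUTPUT_RECORDS"))  # -> 9
```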
#### Task logs (Container logs)
Task logs (i.e. Container logs) include the ApplicationMaster log and the logs of ordinary Tasks. By default they are stored under ${HADOOP_HOME}/logs/userlogs.
An ordinary Task is an instance of our MapReduce program, so these logs contain whatever our code emits through a logging framework or prints to the console, e.g. logger.info("test"); and System.out.println("test");.
##### yarn-site.xml
```xml
<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value>${yarn.log.dir}/userlogs</value>
</property>
```
As for Container logs: in Hadoop 2.x, Tasks are managed in an application -> Container hierarchy, so when you run a MapReduce program on the NameNode machine, everything you see in the console log can also be found under ${HADOOP_HOME}/logs/userlogs on the corresponding DataNode/NodeManager machines.
```bash
# List the default log directory (one MapReduce program corresponds to one
# Application; the console output tells you which Application ran your program)
$ ls ${HADOOP_HOME}/logs/userlogs
drwx--x---  5 yunyu yunyu 4096 Nov  2 20:13 application_1478088725123_0001/
drwx--x---  5 yunyu yunyu 4096 Nov  2 20:31 application_1478088725123_0002/
drwx--x---  5 yunyu yunyu 4096 Nov  2 21:05 application_1478088725123_0003/
drwx--x---  3 yunyu yunyu 4096 Nov  2 21:08 application_1478088725123_0004/
drwx--x---  3 yunyu yunyu 4096 Nov  2 21:34 application_1478088725123_0006/
drwx--x---  4 yunyu yunyu 4096 Nov  2 23:49 application_1478101603149_0001/
drwx--x---  3 yunyu yunyu 4096 Nov  3 00:04 application_1478101603149_0002/
drwx--x---  3 yunyu yunyu 4096 Nov  3 11:03 application_1478138258749_0001/

# Inspect the Container logs of one Application (in a distributed cluster the
# Container logs may be spread across several DataNode machines)
$ ll application_1478088725123_0003/
total 20
drwx--x---  5 yunyu yunyu 4096 Nov  2 21:05 ./
drwxr-xr-x 33 yunyu yunyu 4096 Nov  3 14:57 ../
drwx--x---  2 yunyu yunyu 4096 Nov  2 21:04 container_1478088725123_0003_01_000001/
drwx--x---  2 yunyu yunyu 4096 Nov  2 21:05 container_1478088725123_0003_01_000002/
drwx--x---  2 yunyu yunyu 4096 Nov  2 21:05 container_1478088725123_0003_01_000003/

# Each container directory holds the same three log files:
#   stderr : error output
#   stdout : console output; everything our MapReduce program prints via
#            System.out.println ends up in this file
#   syslog : logger output; everything our program records via logger.info
#            ends up in this file
$ ll application_1478088725123_0003/container_1478088725123_0003_01_000001/*
-rw-rw-r-- 1 yunyu yunyu   760 Nov  2 21:05 application_1478088725123_0003/container_1478088725123_0003_01_000001/stderr
-rw-rw-r-- 1 yunyu yunyu     0 Nov  2 21:04 application_1478088725123_0003/container_1478088725123_0003_01_000001/stdout
-rw-rw-r-- 1 yunyu yunyu 34718 Nov  2 21:05 application_1478088725123_0003/container_1478088725123_0003_01_000001/syslog
```
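The directory layout above can also be walked programmatically, which is handy when hunting for a specific container's output. A minimal Python sketch (the helper name and example paths are made up for illustration; it only assumes the default local layout described above):

```python
import os

def container_log_files(hadoop_home, application_id):
    """Yield (container_id, log_name, path) for every log file under
    ${HADOOP_HOME}/logs/userlogs/<application_id>/<container_id>/."""
    app_dir = os.path.join(hadoop_home, "logs", "userlogs", application_id)
    for container_id in sorted(os.listdir(app_dir)):
        container_dir = os.path.join(app_dir, container_id)
        for log_name in sorted(os.listdir(container_dir)):
            yield container_id, log_name, os.path.join(container_dir, log_name)

# Hypothetical usage:
# for cid, name, path in container_log_files("/usr/local/hadoop",
#                                            "application_1478088725123_0003"):
#     print(cid, name, path)
```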
In a distributed Hadoop cluster, the Container log files of a single Application may be spread across multiple machines:
```bash
# Machine hadoop1 ran part of the Application's tasks; their logs are in
# container_1478101603149_0002_01_000001
$ ll application_1478101603149_0002/
total 12
drwx--x---  3 yunyu yunyu 4096 Nov  3 00:04 ./
drwxr-xr-x 33 yunyu yunyu 4096 Nov  3 15:03 ../
drwx--x---  2 yunyu yunyu 4096 Nov  3 00:04 container_1478101603149_0002_01_000001/

# Machine hadoop2 ran the rest; their logs are in
# container_1478101603149_0002_01_000002 and container_1478101603149_0002_01_000003
$ ll application_1478101603149_0002/
total 16
drwx--x---  4 yunyu yunyu 4096 Nov  3 00:04 ./
drwxr-xr-x 36 yunyu yunyu 4096 Nov  3 15:03 ../
drwx--x---  2 yunyu yunyu 4096 Nov  3 00:04 container_1478101603149_0002_01_000002/
drwx--x---  2 yunyu yunyu 4096 Nov  3 00:04 container_1478101603149_0002_01_000003/
```
### Log Aggregation
Keeping job and task logs scattered across the nodes makes centralized management and analysis difficult, so we can enable log aggregation. With this feature on, each task's logs are pushed to a directory on HDFS once the task finishes (the local copies are not deleted immediately; on HDFS the three files each task produces, i.e. syslog, stderr and stdout, are merged into a single file, with an index recording the position of each). We can then browse all Hadoop logs in one place through the HistoryServer, or fetch them from the command line with `yarn logs -applicationId <appId>`.
This requires adding the following to the yarn-site.xml configuration file:
```xml
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
```
Without this setting, trying to view logs directly in the HistoryServer fails with:
```
Aggregation is not enabled. Try the nodemanager at hadoop1:39175
```
#### Log aggregation configuration parameters
Log aggregation is YARN's centralized log management feature: it uploads the logs of finished Containers/tasks to HDFS, which relieves the NodeManagers and provides a central place to store and analyze them. By default, Container/task logs stay on the individual NodeManagers; enabling aggregation requires the extra configuration below.

(1) yarn.log-aggregation-enable
    Whether log aggregation is enabled. Default: false
(2) yarn.log-aggregation.retain-seconds
    How long aggregated logs are kept on HDFS. Default: -1
(3) yarn.log-aggregation.retain-check-interval-seconds
    How often to check for logs eligible for deletion; 0 or a negative value means 1/10 of the retention time above. Default: -1
(4) yarn.nodemanager.remote-app-log-dir
    The HDFS directory logs are moved to when an application finishes (effective only when aggregation is enabled). Default: /tmp/logs
(5) yarn.nodemanager.remote-app-log-dir-suffix
    Suffix appended under the remote log directory (effective only when aggregation is enabled). Default: logs; logs end up under ${yarn.nodemanager.remote-app-log-dir}/${user}/${thisParam}
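Putting the parameters above together, a yarn-site.xml fragment might look like the following sketch. The retention value is illustrative, not a recommendation, and the same settings must be present on every NodeManager machine:

```xml
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <!-- keep aggregated logs on HDFS for 7 days (illustrative value) -->
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>604800</value>
</property>
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/tmp/logs</value>
</property>
```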
I made a fairly basic mistake here that kept my log output different from what I expected, and it puzzled me for quite a while. The symptom: I had enabled log aggregation (yarn.log-aggregation-enable set to true) only on the Hadoop1 machine, yet when I ran my MapReduce job I still could not find my System.out.println and logger.info output in the Container log files, and some Container log directories were missing. Normally there should be three of them, container_XXX_00001, container_XXX_00002 and container_XXX_00003, but sometimes only container_XXX_00001 and container_XXX_00003 were present, with container_XXX_00002 gone. After rerunning the job several times I noticed the container logs were in fact created normally, but were deleted very soon after the job finished. The cause turned out to be my own carelessness: log aggregation was configured only on Hadoop1, not on the other two machines, so only Hadoop1's Container logs were aggregated to HDFS; and once a log is aggregated, the local copy on Hadoop1 is deleted, which is why the container_XXX_00002 log directory disappeared.
```bash
# The aggregated stderr, stdout and syslog files can be found on HDFS
$ hdfs dfs -ls /tmp/logs/yunyu/logs/application_1478159146498_0001
Found 1 items
-rw-r-----   2 yunyu supergroup      52122 2016-11-03 00:47 /tmp/logs/yunyu/logs/application_1478159146498_0001/hadoop1_35650
```

### Summary

Here is a brief summary of what I found:

- Q: Where can I see the System.out.println output of my MapReduce program in Hadoop?
- A: In the stdout file under the Container log directory, by default ${HADOOP_HOME}/logs/userlogs/application_XXXX/container_XXX_0000X/stdout

- Q: Where can I see output from Log4j or other logging frameworks?

- A: In the syslog file under the Container log directory, by default ${HADOOP_HOME}/logs/userlogs/application_XXXX/container_XXX_0000X/syslog

- Q: Where are the log files stored?

- A: By default under ${HADOOP_HOME}/logs/userlogs/

- Q: How can I view Hadoop logs through the HistoryServer?

- A: Set the yarn.log-aggregation-enable property to true in the yarn-site.xml configuration file

References: