With the earlier WordCount and AdLog examples we can already write simple MapReduce programs, but the first attempts inevitably run into problems. For instance, what do we do when AdLog fails while parsing the JSON log structure? How do we look at the logs of a MapReduce run? That is exactly what this post focuses on: Hadoop's logs.
## Understanding Hadoop's logs
Hadoop logs generally fall into two categories:
- logs produced by the Hadoop system services
- logs produced by MapReduce programs
### Logs produced by Hadoop system services
These are the logs written by the built-in services we start, such as the NameNode, DataNode, NodeManager, ResourceManager, HistoryServer and so on. By default they are stored under ${HADOOP_HOME}/logs.
The output directory for these daemon logs can be changed through the HADOOP_LOG_DIR and YARN_LOG_DIR environment variables (set, for example, in hadoop-env.sh and yarn-env.sh).
| Service | Type | Log file name |
| --- | --- | --- |
| resourcemanager | YARN | yarn-${USER}-resourcemanager-${hostname}.log |
| nodemanager | YARN | yarn-${USER}-nodemanager-${hostname}.log |
| historyserver | MapReduce | mapred-${USER}-historyserver-${hostname}.log |
| secondarynamenode | HDFS | hadoop-${USER}-secondarynamenode-${hostname}.log |
| namenode | HDFS | hadoop-${USER}-namenode-${hostname}.log |
| datanode | HDFS | hadoop-${USER}-datanode-${hostname}.log |
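For quick troubleshooting it is usually enough to tail the relevant daemon's log file directly. A minimal sketch, assuming the default ${HADOOP_HOME}/logs location and the naming pattern from the table above:
# Follow the NameNode log on the NameNode host
$ tail -f ${HADOOP_HOME}/logs/hadoop-$(whoami)-namenode-$(hostname).log
# Pull the most recent errors out of a NodeManager log on a worker node
$ grep -iE "error|exception" ${HADOOP_HOME}/logs/yarn-$(whoami)-nodemanager-$(hostname).log | tail -n 20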
### Logs produced by MapReduce programs
The logs produced by MapReduce programs are further divided into two kinds:
- job history logs
- task logs (Container logs)
#### Job history logs
The job history log is produced by the MRAppMaster (the ApplicationMaster of a MapReduce job). It records in detail the job's submit and run times, the start and run times of every task, counter values, and so on. This information is very useful for analysing jobs: from the history records we can work out how many jobs succeeded or failed each day, how many jobs ran in each queue, and other statistics.
The ApplicationMaster of a MapReduce job also runs inside a Container, always the one numbered 000001 (for example container_1385051297072_0001_01_000001). It can be thought of as a special task, so it has its own Container log as well, just like the Map and Reduce tasks, but that log is not the "job history log" discussed here.
Unlike the other log files, job history records are stored on HDFS rather than on the local file system. The HDFS storage paths, both for MapReduce jobs that are still running and for jobs that have already finished, can be set in mapred-site.xml:
mapred-site.xml
<!-- HDFS path for the history of completed MapReduce jobs -->
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>${yarn.app.mapreduce.am.staging-dir}/history/done</value>
</property>
<!-- HDFS path for the history of MapReduce jobs that are still running -->
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</value>
</property>
<!-- Default base (staging) path for MapReduce job information on HDFS -->
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/tmp/hadoop-yarn/staging</value>
</property>
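With the defaults above you can watch a job's history files move through these directories while it runs and after it finishes; the steps below describe this lifecycle in detail. A quick sketch (paths assume the default staging directory; substitute your own user name):
# Per-job staging directories of jobs that are still running (each holds the .jhist, .summary and .xml files)
$ hdfs dfs -ls /tmp/hadoop-yarn/staging/$USER/.staging
# Files handed over by the ApplicationMaster but not yet moved by the scanner thread
$ hdfs dfs -ls /tmp/hadoop-yarn/staging/history/done_intermediate/$USER
# Final location, organised by date
$ hdfs dfs -ls /tmp/hadoop-yarn/staging/history/done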
The job history log is produced as follows:
- Step 1: the ResourceManager launches the job's ApplicationMaster. While the job runs, the ApplicationMaster writes the history to ${yarn.app.mapreduce.am.staging-dir}/${username}/.staging/job_XXXXX_XXX/ (yarn.app.mapreduce.am.staging-dir defaults to /tmp/hadoop-yarn/staging). Three files appear in that directory, ending in ".jhist", ".summary" and ".xml": the job history, the job summary and the job configuration, respectively.
- Step 2: once all tasks have finished, the job is complete, and the ApplicationMaster copies the three files to ${mapreduce.jobhistory.intermediate-done-dir}/${username}, appending "_tmp" to each file name (mapreduce.jobhistory.intermediate-done-dir defaults to ${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate).
- Step 3: the ApplicationMaster renames the copied files back to their ".jhist", ".summary" and ".xml" endings, dropping the "_tmp" suffix.
- Step 4: a periodic scanner thread moves the files from done_intermediate to the done directory (configured by mapreduce.jobhistory.done-dir, default ${yarn.app.mapreduce.am.staging-dir}/history/done) and deletes the ".summary" file, since everything in it is also present in the .jhist file.
- Step 5: the ApplicationMaster removes the ${yarn.app.mapreduce.am.staging-dir}/${username}/.staging/job_XXXXX_XXX/ directory.

By default, task logs are only kept on the local disks of the individual NodeManagers. You can enable log aggregation so that task logs are pushed to HDFS for centralised management and analysis.

### A point that needs special attention

By default the NodeManager stores logs under yarn.nodemanager.log-dirs, whose default value is ${yarn.log.dir}/userlogs, i.e. the logs/userlogs directory of the Hadoop installation. To spread disk load it is common to configure several paths for this parameter. Note also that the ApplicationMaster's own log is stored in the same place, because it too runs inside a Container and is a special kind of task. Its log directory is the one numbered 000001, i.e. container_XXX_000001, while the ordinary task log directories are container_XXX_000002, container_XXX_000003, ... container_XXX_00000X.

Each of these directories contains three log files: stdout, stderr and syslog.
- stderr : error output
- stdout : console output from System.out.println; anything our own MapReduce code prints with System.out.println is written to this file
- syslog : output from the logging framework; anything our own MapReduce code logs with logger.info is written to this file

#### HistoryServer

Hadoop ships with a HistoryServer for browsing MapReduce job records: how many map and reduce tasks a job used, when it was submitted, when it started and when it finished, and so on. The HistoryServer's communication address has to be configured in mapred-site.xml, and the service has to be started by hand.

##### Starting the HistoryServer
$ mr-jobhistory-daemon.sh start historyserver
##### mapred-site.xml
<!-- IPC address of the MapReduce JobHistory Server; the default port is 10020 -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop1:10020</value>
</property>
<!-- Web UI address of the MapReduce JobHistory Server; the default port is 19888 -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop1:19888</value>
</property>
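Once the daemon is up you can confirm that it is running and reachable. A minimal sketch (the host name and port follow the configuration above, and the REST path is the standard history server endpoint):
# The JobHistoryServer process should show up in jps
$ jps | grep JobHistoryServer
# The web UI listens on port 19888; the REST API lists recent jobs
$ curl http://hadoop1:19888/ws/v1/history/mapreduce/jobs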
We can take a look at the MapReduce job history information stored on HDFS:
$ hdfs dfs -ls /data/history/done/2016/10/26/000000
Found 34 items
-rwxrwx--- 2 yunyu supergroup 33487 2016-10-26 01:46 /data/history/done/2016/10/26/000000/job_1477390779880_0003-1477471590330-yunyu-wordcount-1477471605995-1-1-SUCCEEDED-default-1477471593992.jhist
-rwxrwx--- 2 yunyu supergroup 113716 2016-10-26 01:46 /data/history/done/2016/10/26/000000/job_1477390779880_0003_conf.xml
-rwxrwx--- 2 yunyu supergroup 33478 2016-10-26 01:48 /data/history/done/2016/10/26/000000/job_1477390779880_0004-1477471680580-yunyu-wordcount-1477471695328-1-1-SUCCEEDED-default-1477471684537.jhist
-rwxrwx--- 2 yunyu supergroup 113716 2016-10-26 01:48 /data/history/done/2016/10/26/000000/job_1477390779880_0004_conf.xml
-rwxrwx--- 2 yunyu supergroup 33472 2016-10-26 01:49 /data/history/done/2016/10/26/000000/job_1477390779880_0005-1477471725086-yunyu-wordcount-1477471740048-1-1-SUCCEEDED-default-1477471729179.jhist
-rwxrwx--- 2 yunyu supergroup 113716 2016-10-26 01:49 /data/history/done/2016/10/26/000000/job_1477390779880_0005_conf.xml
-rwxrwx--- 2 yunyu supergroup 33483 2016-10-26 01:53 /data/history/done/2016/10/26/000000/job_1477390779880_0006-1477471983695-yunyu-wordcount-1477471999289-1-1-SUCCEEDED-default-1477471987996.jhist
-rwxrwx--- 2 yunyu supergroup 113716 2016-10-26 01:53 /data/history/done/2016/10/26/000000/job_1477390779880_0006_conf.xml
-rwxrwx--- 2 yunyu supergroup 33479 2016-10-26 01:55 /data/history/done/2016/10/26/000000/job_1477390779880_0007-1477472135663-yunyu-wordcount-1477472150775-1-1-SUCCEEDED-default-1477472139406.jhist
-rwxrwx--- 2 yunyu supergroup 113716 2016-10-26 01:55 /data/history/done/2016/10/26/000000/job_1477390779880_0007_conf.xml
-rwxrwx--- 2 yunyu supergroup 33477 2016-10-26 02:42 /data/history/done/2016/10/26/000000/job_1477390779880_0008-1477474917310-yunyu-wordcount-1477474934028-1-1-SUCCEEDED-default-1477474921803.jhist
-rwxrwx--- 2 yunyu supergroup 113716 2016-10-26 02:42 /data/history/done/2016/10/26/000000/job_1477390779880_0008_conf.xml
-rwxrwx--- 2 yunyu supergroup 33487 2016-10-26 03:25 /data/history/done/2016/10/26/000000/job_1477477415810_0001-1477477521120-yunyu-wordcount-1477477540308-1-1-SUCCEEDED-default-1477477527364.jhist
-rwxrwx--- 2 yunyu supergroup 113712 2016-10-26 03:25 /data/history/done/2016/10/26/000000/job_1477477415810_0001_conf.xml
-rwxrwx--- 2 yunyu supergroup 33483 2016-10-26 03:51 /data/history/done/2016/10/26/000000/job_1477477415810_0002-1477479057999-yunyu-wordcount-1477479076500-1-1-SUCCEEDED-default-1477479062997.jhist
-rwxrwx--- 2 yunyu supergroup 113712 2016-10-26 03:51 /data/history/done/2016/10/26/000000/job_1477477415810_0002_conf.xml
-rwxrwx--- 2 yunyu supergroup 33476 2016-10-26 04:01 /data/history/done/2016/10/26/000000/job_1477477415810_0003-1477479645080-yunyu-wordcount-1477479661516-1-1-SUCCEEDED-default-1477479650248.jhist
-rwxrwx--- 2 yunyu supergroup 113712 2016-10-26 04:01 /data/history/done/2016/10/26/000000/job_1477477415810_0003_conf.xml
-rwxrwx--- 2 yunyu supergroup 33470 2016-10-26 04:36 /data/history/done/2016/10/26/000000/job_1477477415810_0004-1477481804256-yunyu-wordcount-1477481820774-1-1-SUCCEEDED-default-1477481809917.jhist
-rwxrwx--- 2 yunyu supergroup 113638 2016-10-26 04:36 /data/history/done/2016/10/26/000000/job_1477477415810_0004_conf.xml
-rwxrwx--- 2 yunyu supergroup 33519 2016-10-26 04:46 /data/history/done/2016/10/26/000000/job_1477477415810_0005-1477482378129-yunyu-wordcount-1477482394056-1-1-SUCCEEDED-default-1477482382630.jhist
-rwxrwx--- 2 yunyu supergroup 113682 2016-10-26 04:46 /data/history/done/2016/10/26/000000/job_1477477415810_0005_conf.xml
-rwxrwx--- 2 yunyu supergroup 33518 2016-10-26 04:56 /data/history/done/2016/10/26/000000/job_1477477415810_0006-1477482967602-yunyu-wordcount-1477482983257-1-1-SUCCEEDED-default-1477482971751.jhist
-rwxrwx--- 2 yunyu supergroup 113682 2016-10-26 04:56 /data/history/done/2016/10/26/000000/job_1477477415810_0006_conf.xml
-rwxrwx--- 2 yunyu supergroup 33524 2016-10-26 05:05 /data/history/done/2016/10/26/000000/job_1477477415810_0007-1477483516841-yunyu-wordcount-1477483533885-1-1-SUCCEEDED-default-1477483521388.jhist
-rwxrwx--- 2 yunyu supergroup 113682 2016-10-26 05:05 /data/history/done/2016/10/26/000000/job_1477477415810_0007_conf.xml
-rwxrwx--- 2 yunyu supergroup 33519 2016-10-26 05:27 /data/history/done/2016/10/26/000000/job_1477477415810_0008-1477484838977-yunyu-wordcount-1477484854521-1-1-SUCCEEDED-default-1477484843086.jhist
-rwxrwx--- 2 yunyu supergroup 113682 2016-10-26 05:27 /data/history/done/2016/10/26/000000/job_1477477415810_0008_conf.xml
-rwxrwx--- 2 yunyu supergroup 33520 2016-10-26 19:21 /data/history/done/2016/10/26/000000/job_1477534790849_0001-1477534850971-yunyu-wordcount-1477534870748-1-1-SUCCEEDED-default-1477534857063.jhist
-rwxrwx--- 2 yunyu supergroup 113699 2016-10-26 19:21 /data/history/done/2016/10/26/000000/job_1477534790849_0001_conf.xml
-rwxrwx--- 2 yunyu supergroup 33521 2016-10-26 20:23 /data/history/done/2016/10/26/000000/job_1477534790849_0002-1477538573459-yunyu-wordcount-1477538590195-1-1-SUCCEEDED-default-1477538577756.jhist
-rwxrwx--- 2 yunyu supergroup 113699 2016-10-26 20:23 /data/history/done/2016/10/26/000000/job_1477534790849_0002_conf.xml
-rwxrwx--- 2 yunyu supergroup 33519 2016-10-26 20:24 /data/history/done/2016/10/26/000000/job_1477534790849_0003-1477538645546-yunyu-wordcount-1477538662701-1-1-SUCCEEDED-default-1477538650360.jhist
-rwxrwx--- 2 yunyu supergroup 113699 2016-10-26 20:24 /data/history/done/2016/10/26/000000/job_1477534790849_0003_conf.xml
From the listing above we can observe the following:
- (1) the job history records are stored in an HDFS directory;
- (2) because there can be a very large number of them, the records are laid out in year/month/day subdirectories, which makes them easier to manage and find;
- (3) each job's history is kept in two files, with the suffixes *.jhist and *.xml; the *.jhist file holds the detailed information about the job, as shown below;
- (4) every job's history record is stored in its own separate files.
$ hdfs dfs -cat /data/history/done/2016/10/26/000000/job_1477534790849_0003-1477538645546-yunyu-wordcount-1477538662701-1-1-SUCCEEDED-default-1477538650360.jhist
Avro-Json
{"type":"record","name":"Event","namespace":"org.apache.hadoop.mapreduce.jobhistory","fields":[{"name":"type","type":{"type":"enum","name":"EventType","symbols":["JOB_SUBMITTED","JOB_INITED","JOB_FINISHED","JOB_PRIORITY_CHANGED","JOB_STATUS_CHANGED","JOB_QUEUE_CHANGED","JOB_FAILED","JOB_KILLED","JOB_ERROR","JOB_INFO_CHANGED","TASK_STARTED","TASK_FINISHED","TASK_FAILED","TASK_UPDATED","NORMALIZED_RESOURCE","MAP_ATTEMPT_STARTED","MAP_ATTEMPT_FINISHED","MAP_ATTEMPT_FAILED","MAP_ATTEMPT_KILLED","REDUCE_ATTEMPT_STARTED","REDUCE_ATTEMPT_FINISHED","REDUCE_ATTEMPT_FAILED","REDUCE_ATTEMPT_KILLED","SETUP_ATTEMPT_STARTED","SETUP_ATTEMPT_FINISHED","SETUP_ATTEMPT_FAILED","SETUP_ATTEMPT_KILLED","CLEANUP_ATTEMPT_STARTED","CLEANUP_ATTEMPT_FINISHED","CLEANUP_ATTEMPT_FAILED","CLEANUP_ATTEMPT_KILLED","AM_STARTED"]}},{"name":"event","type":[{"type":"record","name":"JobFinished","fields":[{"name":"jobid","type":"string"},{"name":"finishTime","type":"long"},{"name":"finishedMaps","type":"int"},{"name":"finishedReduces","type":"int"},{"name":"failedMaps","type":"int"},{"name":"failedReduces","type":"int"},{"name":"totalCounters","type":{"type":"record","name":"JhCounters","fields":[{"name":"name","type":"string"},{"name":"groups","type":{"type":"array","items":{"type":"record","name":"JhCounterGroup","fields":[{"name":"name","type":"string"},{"name":"displayName","type":"string"},{"name":"counts","type":{"type":"array","items":{"type":"record","name":"JhCounter","fields":[{"name":"name","type":"string"},{"name":"displayName","type":"string"},{"name":"value","type":"long"}]}}}]}}}]}},{"name":"mapCounters","type":"JhCounters"},{"name":"reduceCounters","type":"JhCounters"}]},{"type":"record","name":"JobInfoChange","fields":[{"name":"jobid","type":"string"},{"name":"submitTime","type":"long"},{"name":"launchTime","type":"long"}]},{"type":"record","name":"JobInited","fields":[{"name":"jobid","type":"string"},{"name":"launchTime","type":"long"},{"name":"totalMaps","type":"int"},{"name":"totalReduces","type":"int"},{"name":"jobStatus","type":"string"},{"name":"uberized","type":"boolean"}]},{"type":"record","name":"AMStarted","fields":[{"name":"applicationAttemptId","type":"string"},{"name":"startTime","type":"long"},{"name":"containerId","type":"string"},{"name":"nodeManagerHost","type":"string"},{"name":"nodeManagerPort","type":"int"},{"name":"nodeManagerHttpPort","type":"int"}]},{"type":"record","name":"JobPriorityChange","fields":[{"name":"jobid","type":"string"},{"name":"priority","type":"string"}]},{"type":"record","name":"JobQueueChange","fields":[{"name":"jobid","type":"string"},{"name":"jobQueueName","type":"string"}]},{"type":"record","name":"JobStatusChanged","fields":[{"name":"jobid","type":"string"},{"name":"jobStatus","type":"string"}]},{"type":"record","name":"JobSubmitted","fields":[{"name":"jobid","type":"string"},{"name":"jobName","type":"string"},{"name":"userName","type":"string"},{"name":"submitTime","type":"long"},{"name":"jobConfPath","type":"string"},{"name":"acls","type":{"type":"map","values":"string"}},{"name":"jobQueueName","type":"string"},{"name":"workflowId","type":["null","string"],"default":null},{"name":"workflowName","type":["null","string"],"default":null},{"name":"workflowNodeName","type":["null","string"],"default":null},{"name":"workflowAdjacencies","type":["null","string"],"default":null},{"name":"workflowTags","type":["null","string"],"default":null}]},{"type":"record","name":"JobUnsuccessfulCompletion","fields":[{"name":"jobid","type":"string"},{"name":"finishTime","type":"long"}
,{"name":"finishedMaps","type":"int"},{"name":"finishedReduces","type":"int"},{"name":"jobStatus","type":"string"},{"name":"diagnostics","type":["null","string"],"default":null}]},{"type":"record","name":"MapAttemptFinished","fields":[{"name":"taskid","type":"string"},{"name":"attemptId","type":"string"},{"name":"taskType","type":"string"},{"name":"taskStatus","type":"string"},{"name":"mapFinishTime","type":"long"},{"name":"finishTime","type":"long"},{"name":"hostname","type":"string"},{"name":"port","type":"int"},{"name":"rackname","type":"string"},{"name":"state","type":"string"},{"name":"counters","type":"JhCounters"},{"name":"clockSplits","type":{"type":"array","items":"int"}},{"name":"cpuUsages","type":{"type":"array","items":"int"}},{"name":"vMemKbytes","type":{"type":"array","items":"int"}},{"name":"physMemKbytes","type":{"type":"array","items":"int"}}]},{"type":"record","name":"ReduceAttemptFinished","fields":[{"name":"taskid","type":"string"},{"name":"attemptId","type":"string"},{"name":"taskType","type":"string"},{"name":"taskStatus","type":"string"},{"name":"shuffleFinishTime","type":"long"},{"name":"sortFinishTime","type":"long"},{"name":"finishTime","type":"long"},{"name":"hostname","type":"string"},{"name":"port","type":"int"},{"name":"rackname","type":"string"},{"name":"state","type":"string"},{"name":"counters","type":"JhCounters"},{"name":"clockSplits","type":{"type":"array","items":"int"}},{"name":"cpuUsages","type":{"type":"array","items":"int"}},{"name":"vMemKbytes","type":{"type":"array","items":"int"}},{"name":"physMemKbytes","type":{"type":"array","items":"int"}}]},{"type":"record","name":"TaskAttemptFinished","fields":[{"name":"taskid","type":"string"},{"name":"attemptId","type":"string"},{"name":"taskType","type":"string"},{"name":"taskStatus","type":"string"},{"name":"finishTime","type":"long"},{"name":"rackname","type":"string"},{"name":"hostname","type":"string"},{"name":"state","type":"string"},{"name":"counters","type":"JhCounters"}]},{"type":"record","name":"TaskAttemptStarted","fields":[{"name":"taskid","type":"string"},{"name":"taskType","type":"string"},{"name":"attemptId","type":"string"},{"name":"startTime","type":"long"},{"name":"trackerName","type":"string"},{"name":"httpPort","type":"int"},{"name":"shufflePort","type":"int"},{"name":"containerId","type":"string"},{"name":"locality","type":["null","string"],"default":null},{"name":"avataar","type":["null","string"],"default":null}]},{"type":"record","name":"TaskAttemptUnsuccessfulCompletion","fields":[{"name":"taskid","type":"string"},{"name":"taskType","type":"string"},{"name":"attemptId","type":"string"},{"name":"finishTime","type":"long"},{"name":"hostname","type":"string"},{"name":"port","type":"int"},{"name":"rackname","type":"string"},{"name":"status","type":"string"},{"name":"error","type":"string"},{"name":"counters","type":["null","JhCounters"],"default":null},{"name":"clockSplits","type":{"type":"array","items":"int"}},{"name":"cpuUsages","type":{"type":"array","items":"int"}},{"name":"vMemKbytes","type":{"type":"array","items":"int"}},{"name":"physMemKbytes","type":{"type":"array","items":"int"}}]},{"type":"record","name":"TaskFailed","fields":[{"name":"taskid","type":"string"},{"name":"taskType","type":"string"},{"name":"finishTime","type":"long"},{"name":"error","type":"string"},{"name":"failedDueToAttempt","type":["null","string"]},{"name":"status","type":"string"},{"name":"counters","type":["null","JhCounters"],"default":null}]},{"type":"record","name":"TaskFinished","fields":[{"name":"ta
skid","type":"string"},{"name":"taskType","type":"string"},{"name":"finishTime","type":"long"},{"name":"status","type":"string"},{"name":"counters","type":"JhCounters"},{"name":"successfulAttemptId","type":["null","string"],"default":null}]},{"type":"record","name":"TaskStarted","fields":[{"name":"taskid","type":"string"},{"name":"taskType","type":"string"},{"name":"startTime","type":"long"},{"name":"splitLocations","type":"string"}]},{"type":"record","name":"TaskUpdated","fields":[{"name":"taskid","type":"string"},{"name":"finishTime","type":"long"}]}]}]}
{"type":"AM_STARTED","event":{"org.apache.hadoop.mapreduce.jobhistory.AMStarted":{"applicationAttemptId":"appattempt_1477534790849_0003_000001","startTime":1477538647394,"containerId":"container_1477534790849_0003_01_000001","nodeManagerHost":"hadoop1","nodeManagerPort":47596,"nodeManagerHttpPort":8042}}}
{"type":"JOB_SUBMITTED","event":{"org.apache.hadoop.mapreduce.jobhistory.JobSubmitted":{"jobid":"job_1477534790849_0003","jobName":"wordcount","userName":"yunyu","submitTime":1477538645546,"jobConfPath":"hdfs://hadoop1/tmp/hadoop-yarn/staging/yunyu/.staging/job_1477534790849_0003/job.xml","acls":{},"jobQueueName":"default","workflowId":{"string":""},"workflowName":{"string":""},"workflowNodeName":{"string":""},"workflowAdjacencies":{"string":""},"workflowTags":{"string":""}}}}
{"type":"JOB_QUEUE_CHANGED","event":{"org.apache.hadoop.mapreduce.jobhistory.JobQueueChange":{"jobid":"job_1477534790849_0003","jobQueueName":"default"}}}
{"type":"JOB_INITED","event":{"org.apache.hadoop.mapreduce.jobhistory.JobInited":{"jobid":"job_1477534790849_0003","launchTime":1477538650360,"totalMaps":1,"totalReduces":1,"jobStatus":"INITED","uberized":false}}}
{"type":"JOB_INFO_CHANGED","event":{"org.apache.hadoop.mapreduce.jobhistory.JobInfoChange":{"jobid":"job_1477534790849_0003","submitTime":1477538645546,"launchTime":1477538650360}}}
{"type":"TASK_STARTED","event":{"org.apache.hadoop.mapreduce.jobhistory.TaskStarted":{"taskid":"task_1477534790849_0003_m_000000","taskType":"MAP","startTime":1477538650763,"splitLocations":"hadoop1,hadoop2"}}}
{"type":"TASK_STARTED","event":{"org.apache.hadoop.mapreduce.jobhistory.TaskStarted":{"taskid":"task_1477534790849_0003_r_000000","taskType":"REDUCE","startTime":1477538650767,"splitLocations":""}}}
{"type":"MAP_ATTEMPT_STARTED","event":{"org.apache.hadoop.mapreduce.jobhistory.TaskAttemptStarted":{"taskid":"task_1477534790849_0003_m_000000","taskType":"MAP","attemptId":"attempt_1477534790849_0003_m_000000_0","startTime":1477538652833,"trackerName":"hadoop2","httpPort":8042,"shufflePort":13562,"containerId":"container_1477534790849_0003_01_000002","locality":{"string":"NODE_LOCAL"},"avataar":{"string":"VIRGIN"}}}}
{"type":"MAP_ATTEMPT_FINISHED","event":{"org.apache.hadoop.mapreduce.jobhistory.MapAttemptFinished":{"taskid":"task_1477534790849_0003_m_000000","attemptId":"attempt_1477534790849_0003_m_000000_0","taskType":"MAP","taskStatus":"SUCCEEDED","mapFinishTime":1477538656711,"finishTime":1477538656933,"hostname":"hadoop2","port":34358,"rackname":"/default-rack","state":"map","counters":{"name":"COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes read","value":0},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":115420},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes read","value":194},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":0},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read operations","value":3},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":0}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"MAP_INPUT_RECORDS","displayName":"Map input records","value":4},{"name":"MAP_OUTPUT_RECORDS","displayName":"Map output records","value":14},{"name":"MAP_OUTPUT_BYTES","displayName":"Map output bytes","value":141},{"name":"MAP_OUTPUT_MATERIALIZED_BYTES","displayName":"Map output materialized bytes","value":114},{"name":"SPLIT_RAW_BYTES","displayName":"Input split bytes","value":109},{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":14},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":9},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":0},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":83},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":490},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) snapshot","value":216502272},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":666292224},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":120721408}]},{"name":"org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter","displayName":"File Input Format Counters ","counts":[{"name":"BYTES_READ","displayName":"Bytes Read","value":85}]}]},"clockSplits":[3941,11,12,11,11,12,11,11,12,11,11,12],"cpuUsages":[40,41,41,41,41,41,40,41,41,41,41,41],"vMemKbytes":[27111,81334,135557,189780,244003,298226,352449,406672,460895,515118,569341,623564],"physMemKbytes":[8809,26428,44047,61666,79285,96904,114523,132142,149761,167380,184999,202618]}}}
{"type":"TASK_FINISHED","event":{"org.apache.hadoop.mapreduce.jobhistory.TaskFinished":{"taskid":"task_1477534790849_0003_m_000000","taskType":"MAP","finishTime":1477538656933,"status":"SUCCEEDED","counters":{"name":"COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes read","value":0},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":115420},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes read","value":194},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":0},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read operations","value":3},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":0}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"MAP_INPUT_RECORDS","displayName":"Map input records","value":4},{"name":"MAP_OUTPUT_RECORDS","displayName":"Map output records","value":14},{"name":"MAP_OUTPUT_BYTES","displayName":"Map output bytes","value":141},{"name":"MAP_OUTPUT_MATERIALIZED_BYTES","displayName":"Map output materialized bytes","value":114},{"name":"SPLIT_RAW_BYTES","displayName":"Input split bytes","value":109},{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":14},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":9},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":0},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":83},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":490},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) snapshot","value":216502272},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":666292224},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":120721408}]},{"name":"org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter","displayName":"File Input Format Counters ","counts":[{"name":"BYTES_READ","displayName":"Bytes Read","value":85}]}]},"successfulAttemptId":{"string":"attempt_1477534790849_0003_m_000000_0"}}}}
{"type":"REDUCE_ATTEMPT_STARTED","event":{"org.apache.hadoop.mapreduce.jobhistory.TaskAttemptStarted":{"taskid":"task_1477534790849_0003_r_000000","taskType":"REDUCE","attemptId":"attempt_1477534790849_0003_r_000000_0","startTime":1477538659565,"trackerName":"hadoop2","httpPort":8042,"shufflePort":13562,"containerId":"container_1477534790849_0003_01_000003","locality":{"string":"OFF_SWITCH"},"avataar":{"string":"VIRGIN"}}}}
{"type":"REDUCE_ATTEMPT_FINISHED","event":{"org.apache.hadoop.mapreduce.jobhistory.ReduceAttemptFinished":{"taskid":"task_1477534790849_0003_r_000000","attemptId":"attempt_1477534790849_0003_r_000000_0","taskType":"REDUCE","taskStatus":"SUCCEEDED","shuffleFinishTime":1477538662152,"sortFinishTime":1477538662173,"finishTime":1477538662652,"hostname":"hadoop2","port":34358,"rackname":"/default-rack","state":"reduce > reduce","counters":{"name":"COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes read","value":114},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":115365},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes read","value":0},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":72},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read operations","value":3},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":2}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":0},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":0},{"name":"REDUCE_INPUT_GROUPS","displayName":"Reduce input groups","value":9},{"name":"REDUCE_SHUFFLE_BYTES","displayName":"Reduce shuffle bytes","value":114},{"name":"REDUCE_INPUT_RECORDS","displayName":"Reduce input records","value":9},{"name":"REDUCE_OUTPUT_RECORDS","displayName":"Reduce output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":9},{"name":"SHUFFLED_MAPS","displayName":"Shuffled Maps ","value":1},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":1},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":60},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":820},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) snapshot","value":120524800},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":672628736},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":15728640}]},{"name":"Shuffle Errors","displayName":"Shuffle Errors","counts":[{"name":"BAD_ID","displayName":"BAD_ID","value":0},{"name":"CONNECTION","displayName":"CONNECTION","value":0},{"name":"IO_ERROR","displayName":"IO_ERROR","value":0},{"name":"WRONG_LENGTH","displayName":"WRONG_LENGTH","value":0},{"name":"WRONG_MAP","displayName":"WRONG_MAP","value":0},{"name":"WRONG_REDUCE","displayName":"WRONG_REDUCE","value":0}]},{"name":"org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter","displayName":"File Output Format Counters ","counts":[{"name":"BYTES_WRITTEN","displayName":"Bytes 
Written","value":72}]}]},"clockSplits":[2689,35,35,35,35,35,35,35,35,35,35,36],"cpuUsages":[68,68,69,68,68,69,68,68,69,68,68,69],"vMemKbytes":[27369,82107,136846,191584,246323,301062,355801,410539,465278,520017,574755,629494],"physMemKbytes":[4904,14712,24520,34328,44137,53945,63754,73561,83370,93179,102986,112795]}}}
{"type":"TASK_FINISHED","event":{"org.apache.hadoop.mapreduce.jobhistory.TaskFinished":{"taskid":"task_1477534790849_0003_r_000000","taskType":"REDUCE","finishTime":1477538662652,"status":"SUCCEEDED","counters":{"name":"COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes read","value":114},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":115365},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes read","value":0},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":72},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read operations","value":3},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":2}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":0},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":0},{"name":"REDUCE_INPUT_GROUPS","displayName":"Reduce input groups","value":9},{"name":"REDUCE_SHUFFLE_BYTES","displayName":"Reduce shuffle bytes","value":114},{"name":"REDUCE_INPUT_RECORDS","displayName":"Reduce input records","value":9},{"name":"REDUCE_OUTPUT_RECORDS","displayName":"Reduce output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":9},{"name":"SHUFFLED_MAPS","displayName":"Shuffled Maps ","value":1},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":1},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":60},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":820},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) snapshot","value":120524800},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":672628736},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":15728640}]},{"name":"Shuffle Errors","displayName":"Shuffle Errors","counts":[{"name":"BAD_ID","displayName":"BAD_ID","value":0},{"name":"CONNECTION","displayName":"CONNECTION","value":0},{"name":"IO_ERROR","displayName":"IO_ERROR","value":0},{"name":"WRONG_LENGTH","displayName":"WRONG_LENGTH","value":0},{"name":"WRONG_MAP","displayName":"WRONG_MAP","value":0},{"name":"WRONG_REDUCE","displayName":"WRONG_REDUCE","value":0}]},{"name":"org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter","displayName":"File Output Format Counters ","counts":[{"name":"BYTES_WRITTEN","displayName":"Bytes Written","value":72}]}]},"successfulAttemptId":{"string":"attempt_1477534790849_0003_r_000000_0"}}}}
{"type":"JOB_FINISHED","event":{"org.apache.hadoop.mapreduce.jobhistory.JobFinished":{"jobid":"job_1477534790849_0003","finishTime":1477538662701,"finishedMaps":1,"finishedReduces":1,"failedMaps":0,"failedReduces":0,"totalCounters":{"name":"TOTAL_COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes read","value":114},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":230785},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes read","value":194},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":72},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read operations","value":6},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":2}]},{"name":"org.apache.hadoop.mapreduce.JobCounter","displayName":"Job Counters ","counts":[{"name":"TOTAL_LAUNCHED_MAPS","displayName":"Launched map tasks","value":1},{"name":"TOTAL_LAUNCHED_REDUCES","displayName":"Launched reduce tasks","value":1},{"name":"DATA_LOCAL_MAPS","displayName":"Data-local map tasks","value":1},{"name":"SLOTS_MILLIS_MAPS","displayName":"Total time spent by all maps in occupied slots (ms)","value":4100},{"name":"SLOTS_MILLIS_REDUCES","displayName":"Total time spent by all reduces in occupied slots (ms)","value":3087},{"name":"MILLIS_MAPS","displayName":"Total time spent by all map tasks (ms)","value":4100},{"name":"MILLIS_REDUCES","displayName":"Total time spent by all reduce tasks (ms)","value":3087},{"name":"VCORES_MILLIS_MAPS","displayName":"Total vcore-seconds taken by all map tasks","value":4100},{"name":"VCORES_MILLIS_REDUCES","displayName":"Total vcore-seconds taken by all reduce tasks","value":3087},{"name":"MB_MILLIS_MAPS","displayName":"Total megabyte-seconds taken by all map tasks","value":4198400},{"name":"MB_MILLIS_REDUCES","displayName":"Total megabyte-seconds taken by all reduce tasks","value":3161088}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"MAP_INPUT_RECORDS","displayName":"Map input records","value":4},{"name":"MAP_OUTPUT_RECORDS","displayName":"Map output records","value":14},{"name":"MAP_OUTPUT_BYTES","displayName":"Map output bytes","value":141},{"name":"MAP_OUTPUT_MATERIALIZED_BYTES","displayName":"Map output materialized bytes","value":114},{"name":"SPLIT_RAW_BYTES","displayName":"Input split bytes","value":109},{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":14},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":9},{"name":"REDUCE_INPUT_GROUPS","displayName":"Reduce input groups","value":9},{"name":"REDUCE_SHUFFLE_BYTES","displayName":"Reduce shuffle bytes","value":114},{"name":"REDUCE_INPUT_RECORDS","displayName":"Reduce input records","value":9},{"name":"REDUCE_OUTPUT_RECORDS","displayName":"Reduce output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":18},{"name":"SHUFFLED_MAPS","displayName":"Shuffled Maps 
","value":1},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":1},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":143},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":1310},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) snapshot","value":337027072},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":1338920960},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":136450048}]},{"name":"Shuffle Errors","displayName":"Shuffle Errors","counts":[{"name":"BAD_ID","displayName":"BAD_ID","value":0},{"name":"CONNECTION","displayName":"CONNECTION","value":0},{"name":"IO_ERROR","displayName":"IO_ERROR","value":0},{"name":"WRONG_LENGTH","displayName":"WRONG_LENGTH","value":0},{"name":"WRONG_MAP","displayName":"WRONG_MAP","value":0},{"name":"WRONG_REDUCE","displayName":"WRONG_REDUCE","value":0}]},{"name":"org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter","displayName":"File Input Format Counters ","counts":[{"name":"BYTES_READ","displayName":"Bytes Read","value":85}]},{"name":"org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter","displayName":"File Output Format Counters ","counts":[{"name":"BYTES_WRITTEN","displayName":"Bytes Written","value":72}]}]},"mapCounters":{"name":"MAP_COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes read","value":0},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":115420},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes read","value":194},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":0},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read operations","value":3},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":0}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"MAP_INPUT_RECORDS","displayName":"Map input records","value":4},{"name":"MAP_OUTPUT_RECORDS","displayName":"Map output records","value":14},{"name":"MAP_OUTPUT_BYTES","displayName":"Map output bytes","value":141},{"name":"MAP_OUTPUT_MATERIALIZED_BYTES","displayName":"Map output materialized bytes","value":114},{"name":"SPLIT_RAW_BYTES","displayName":"Input split bytes","value":109},{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":14},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":9},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":0},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":83},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":490},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) 
snapshot","value":216502272},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":666292224},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":120721408}]},{"name":"org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter","displayName":"File Input Format Counters ","counts":[{"name":"BYTES_READ","displayName":"Bytes Read","value":85}]}]},"reduceCounters":{"name":"REDUCE_COUNTERS","groups":[{"name":"org.apache.hadoop.mapreduce.FileSystemCounter","displayName":"File System Counters","counts":[{"name":"FILE_BYTES_READ","displayName":"FILE: Number of bytes read","value":114},{"name":"FILE_BYTES_WRITTEN","displayName":"FILE: Number of bytes written","value":115365},{"name":"FILE_READ_OPS","displayName":"FILE: Number of read operations","value":0},{"name":"FILE_LARGE_READ_OPS","displayName":"FILE: Number of large read operations","value":0},{"name":"FILE_WRITE_OPS","displayName":"FILE: Number of write operations","value":0},{"name":"HDFS_BYTES_READ","displayName":"HDFS: Number of bytes read","value":0},{"name":"HDFS_BYTES_WRITTEN","displayName":"HDFS: Number of bytes written","value":72},{"name":"HDFS_READ_OPS","displayName":"HDFS: Number of read operations","value":3},{"name":"HDFS_LARGE_READ_OPS","displayName":"HDFS: Number of large read operations","value":0},{"name":"HDFS_WRITE_OPS","displayName":"HDFS: Number of write operations","value":2}]},{"name":"org.apache.hadoop.mapreduce.TaskCounter","displayName":"Map-Reduce Framework","counts":[{"name":"COMBINE_INPUT_RECORDS","displayName":"Combine input records","value":0},{"name":"COMBINE_OUTPUT_RECORDS","displayName":"Combine output records","value":0},{"name":"REDUCE_INPUT_GROUPS","displayName":"Reduce input groups","value":9},{"name":"REDUCE_SHUFFLE_BYTES","displayName":"Reduce shuffle bytes","value":114},{"name":"REDUCE_INPUT_RECORDS","displayName":"Reduce input records","value":9},{"name":"REDUCE_OUTPUT_RECORDS","displayName":"Reduce output records","value":9},{"name":"SPILLED_RECORDS","displayName":"Spilled Records","value":9},{"name":"SHUFFLED_MAPS","displayName":"Shuffled Maps ","value":1},{"name":"FAILED_SHUFFLE","displayName":"Failed Shuffles","value":0},{"name":"MERGED_MAP_OUTPUTS","displayName":"Merged Map outputs","value":1},{"name":"GC_TIME_MILLIS","displayName":"GC time elapsed (ms)","value":60},{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":820},{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) snapshot","value":120524800},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":672628736},{"name":"COMMITTED_HEAP_BYTES","displayName":"Total committed heap usage (bytes)","value":15728640}]},{"name":"Shuffle Errors","displayName":"Shuffle Errors","counts":[{"name":"BAD_ID","displayName":"BAD_ID","value":0},{"name":"CONNECTION","displayName":"CONNECTION","value":0},{"name":"IO_ERROR","displayName":"IO_ERROR","value":0},{"name":"WRONG_LENGTH","displayName":"WRONG_LENGTH","value":0},{"name":"WRONG_MAP","displayName":"WRONG_MAP","value":0},{"name":"WRONG_REDUCE","displayName":"WRONG_REDUCE","value":0}]},{"name":"org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter","displayName":"File Output Format Counters ","counts":[{"name":"BYTES_WRITTEN","displayName":"Bytes Written","value":72}]}]}}}}
#### Task logs (Container logs)

The task logs (i.e. Container logs) include the ApplicationMaster log and the logs of the ordinary tasks. By default they are stored under ${HADOOP_HOME}/logs/userlogs. The ordinary tasks are our own MapReduce programs, so these logs contain whatever our MapReduce code writes through its logger and whatever it prints to the console with System.out, for example logger.info("test"); and System.out.println("test");.

##### yarn-site.xml
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>${yarn.log.dir}/userlogs</value>
</property>
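When a job does not behave as expected, the quickest check is often to grep your own output across all of that application's containers. A sketch, assuming the default log directory above and one of the application ids from the listings further below (replace the search string with something your own job logs):
# Search the logging-framework output of every container of one application
$ grep -r "AdLog" ${HADOOP_HOME}/logs/userlogs/application_1478088725123_0003/*/syslog
# System.out.println output lands in stdout instead
$ cat ${HADOOP_HOME}/logs/userlogs/application_1478088725123_0003/*/stdout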
As for Container logs: in Hadoop 2.x, tasks are managed in an application -> Container hierarchy, so when we run a MapReduce program (here from the NameNode machine), every log line we see in its console can also be found under ${HADOOP_HOME}/logs/userlogs on the corresponding DataNode/NodeManager machines.
# List the default log directory (each MapReduce program corresponds to one Application; the console output tells you which Application is running your MapReduce program)
$ ls ${HADOOP_HOME}/logs/userlogs
drwx--x--- 5 yunyu yunyu 4096 Nov 2 20:13 application_1478088725123_0001/
drwx--x--- 5 yunyu yunyu 4096 Nov 2 20:31 application_1478088725123_0002/
drwx--x--- 5 yunyu yunyu 4096 Nov 2 21:05 application_1478088725123_0003/
drwx--x--- 3 yunyu yunyu 4096 Nov 2 21:08 application_1478088725123_0004/
drwx--x--- 3 yunyu yunyu 4096 Nov 2 21:34 application_1478088725123_0006/
drwx--x--- 4 yunyu yunyu 4096 Nov 2 23:49 application_1478101603149_0001/
drwx--x--- 3 yunyu yunyu 4096 Nov 3 00:04 application_1478101603149_0002/
drwx--x--- 3 yunyu yunyu 4096 Nov 3 11:03 application_1478138258749_0001/
# Look at the Container logs under one Application (on a distributed Hadoop cluster the Container logs may be spread across several DataNode machines)
$ ll application_1478088725123_0003/
total 20
drwx--x--- 5 yunyu yunyu 4096 Nov 2 21:05 ./
drwxr-xr-x 33 yunyu yunyu 4096 Nov 3 14:57 ../
drwx--x--- 2 yunyu yunyu 4096 Nov 2 21:04 container_1478088725123_0003_01_000001/
drwx--x--- 2 yunyu yunyu 4096 Nov 2 21:05 container_1478088725123_0003_01_000002/
drwx--x--- 2 yunyu yunyu 4096 Nov 2 21:05 container_1478088725123_0003_01_000003/
# Look at the log files inside a container directory; there are always three of them
# stderr : error output
# stdout : System.out.println console output; anything our own MapReduce code prints with System.out.println is written here
# syslog : logging-framework output; anything our own MapReduce code logs with logger.info is written here
$ ll application_1478088725123_0003/container_1478088725123_0003_01_000001/*
-rw-rw-r-- 1 yunyu yunyu 760 Nov 2 21:05 application_1478088725123_0003/container_1478088725123_0003_01_000001/stderr
-rw-rw-r-- 1 yunyu yunyu 0 Nov 2 21:04 application_1478088725123_0003/container_1478088725123_0003_01_000001/stdout
-rw-rw-r-- 1 yunyu yunyu 34718 Nov 2 21:05 application_1478088725123_0003/container_1478088725123_0003_01_000001/syslog
On a distributed Hadoop cluster, the Container log files of one application may be spread across several machines:
# The hadoop1 machine ran part of the tasks assigned to this Application; their logs are in container_1478101603149_0002_01_000001
$ ll application_1478101603149_0002/
total 12
drwx--x--- 3 yunyu yunyu 4096 Nov 3 00:04 ./
drwxr-xr-x 33 yunyu yunyu 4096 Nov 3 15:03 ../
drwx--x--- 2 yunyu yunyu 4096 Nov 3 00:04 container_1478101603149_0002_01_000001/
# The hadoop2 machine ran the rest of the Application's tasks; their logs are in container_1478101603149_0002_01_000002 and container_1478101603149_0002_01_000003
$ ll application_1478101603149_0002/
total 16
drwx--x--- 4 yunyu yunyu 4096 Nov 3 00:04 ./
drwxr-xr-x 36 yunyu yunyu 4096 Nov 3 15:03 ../
drwx--x--- 2 yunyu yunyu 4096 Nov 3 00:04 container_1478101603149_0002_01_000002/
drwx--x--- 2 yunyu yunyu 4096 Nov 3 00:04 container_1478101603149_0002_01_000003/
### Log aggregation

Keeping job and task logs scattered across the individual nodes makes them hard to manage and analyse centrally, so we can enable log aggregation. With this feature on, once an application's tasks finish, their logs are pushed to a directory on HDFS (the local copies are not necessarily deleted straight away); on HDFS the three files each task produced (syslog, stderr and stdout) are merged into a single file, with an index recording where each one starts. This lets us browse all of the Hadoop logs in one place through the HistoryServer. To enable it, add the following to yarn-site.xml:
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
If this property is not set, clicking through to the logs from the HistoryServer fails with an error like the following:
Aggregation is not enabled. Try the nodemanager at hadoop1:39175
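The property has to be present in yarn-site.xml on every node, and YARN has to be restarted before it takes effect. A minimal sketch using the stock scripts, run from the master node:
$ ${HADOOP_HOME}/sbin/stop-yarn.sh
$ ${HADOOP_HOME}/sbin/start-yarn.sh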
#### Log aggregation parameters

Log aggregation is YARN's centralised log management feature: it uploads the logs of finished Containers/tasks to HDFS, which lightens the NodeManagers' load and provides a single place to store and analyse them. By default, Container/task logs stay on the individual NodeManagers; enabling aggregation requires the extra configuration below.
(1) yarn.log-aggregation-enable
Meaning: whether to enable log aggregation.
Default: false
(2) yarn.log-aggregation.retain-seconds
Meaning: how long aggregated logs are kept on HDFS.
Default: -1
(3) yarn.log-aggregation.retain-check-interval-seconds
Meaning: how often to check for logs whose retention has expired and delete them; 0 or a negative value means one tenth of the previous setting.
Default: -1
(4) yarn.nodemanager.remote-app-log-dir
Meaning: the HDFS directory that logs are moved to when an application finishes (only used when log aggregation is enabled).
Default: /tmp/logs
(5) yarn.nodemanager.remote-app-log-dir-suffix
Meaning: the name of the subdirectory under the remote log directory (only used when log aggregation is enabled).
Default: logs; the logs end up under ${yarn.nodemanager.remote-app-log-dir}/${user}/${thisParam}
I made a fairly basic mistake here that kept producing results different from what I expected and puzzled me for quite a while. The symptom: I had enabled log aggregation (yarn.log-aggregation-enable set to true) only on the hadoop1 machine, and when I ran my own MapReduce job I still could not find my System.out.println and logger.info output in the Container log files. On top of that, Container log directories went missing: normally there should be three of them, container_XXX_00001, container_XXX_00002 and container_XXX_00003, but sometimes only container_XXX_00001 and container_XXX_00003 were there and container_XXX_00002 was gone. After rerunning the MapReduce job several times I noticed that the container logs were in fact created, but were deleted almost as soon as the job finished. It took a while to realise the cause was my own carelessness: log aggregation was configured only on hadoop1 and not on the other two machines, so only hadoop1's Container logs were aggregated to HDFS, and once they were aggregated the local log files on hadoop1 were removed, which is why container_XXX_00002 kept disappearing.
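The lesson is that yarn-site.xml must be consistent on every NodeManager. A quick, hedged check, assuming passwordless ssh, that HADOOP_HOME is set in each remote shell, and that the third node is called hadoop3 (only hadoop1 and hadoop2 appear in the listings above):
# Confirm the aggregation switch is set the same way on every node
$ for h in hadoop1 hadoop2 hadoop3; do
    echo "== $h =="
    ssh $h 'grep -A1 yarn.log-aggregation-enable $HADOOP_HOME/etc/hadoop/yarn-site.xml'
  done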
# The aggregated stderr, stdout and syslog files can now be seen on HDFS
$ hdfs dfs -ls /tmp/logs/yunyu/logs/application_1478159146498_0001
Found 1 items
-rw-r----- 2 yunyu supergroup 52122 2016-11-03 00:47 /tmp/logs/yunyu/logs/application_1478159146498_0001/hadoop1_35650
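The aggregated file is stored in an indexed container format rather than as plain text, so instead of reading it off HDFS directly, the usual way to look at it is the yarn logs command:
# Dump all aggregated container logs of one application (works only when log aggregation is enabled)
$ yarn logs -applicationId application_1478159146498_0001
# Narrow it down to one container if you know its id and the node it ran on (the container id here is hypothetical)
$ yarn logs -applicationId application_1478159146498_0001 -containerId container_1478159146498_0001_01_000001 -nodeAddress hadoop1:35650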
### Summary

Here is a brief summary of what I found:
- Q: How do I see the System.out.println output of my MapReduce code in Hadoop?
  A: In the stdout file of the Container log directory, by default ${HADOOP_HOME}/logs/userlogs/application_XXXX/container_XXX_0000X/stdout
- Q: How do I see the output of Log4j or another logging framework?
  A: In the syslog file of the Container log directory, by default ${HADOOP_HOME}/logs/userlogs/application_XXXX/container_XXX_0000X/syslog
- Q: Where are the log files stored?
  A: By default under ${HADOOP_HOME}/logs/userlogs/
- Q: How do I view Hadoop logs through the HistoryServer?
  A: Set the yarn.log-aggregation-enable property to true in yarn-site.xml
References: