YARN empty-queue crash bug
Our production ResourceManager (RM) crashed recently. The logs showed the following errors:
2015-01-05 18:00:11,523 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1418702129929_2281200_000001
2015-01-05 18:00:11,524 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1418702129929_2281200_000001
2015-01-05 18:00:11,526 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1418702129929_2281200_000001
2015-01-05 18:00:11,527 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type APP_ADDED to the scheduler
org.apache.hadoop.metrics2.MetricsException: Metrics source QueueMetrics,q0=root already exists!
    at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:126)
    at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:107)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:217)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueueMetrics.forQueue(FSQueueMetrics.java:94)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.<init>(FSQueue.java:61)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.<init>(FSLeafQueue.java:59)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.createLeafQueue(QueueManager.java:199)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:144)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:581)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:998)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:108)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:378)
    at java.lang.Thread.run(Thread.java:662)
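The `q0=root` in the exception hints at why an empty queue name is fatal. The sketch below is a toy model, not Hadoop code. It assumes (the post does not state this) that a metrics source name is built from the non-empty, dot-separated parts of the queue path, and that an empty queue name ends up as the path `root.`. The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of a metrics registry that rejects duplicate source names. Not Hadoop code. */
public class MetricsNameCollision {
    private static final Set<String> REGISTERED = new HashSet<>();

    /**
     * Builds a name such as "QueueMetrics,q0=root,q1=default" from a dotted
     * queue path. Empty path components are skipped (assumption).
     */
    public static String sourceName(String queuePath) {
        StringBuilder sb = new StringBuilder("QueueMetrics");
        int level = 0;
        for (String part : queuePath.split("\\.")) {
            String trimmed = part.trim();
            if (trimmed.isEmpty()) {
                continue; // an empty queue name contributes nothing to the source name
            }
            sb.append(",q").append(level++).append('=').append(trimmed);
        }
        return sb.toString();
    }

    /** Registers the source for a queue path; throws if the name is already taken. */
    public static void register(String queuePath) {
        String name = sourceName(queuePath);
        if (!REGISTERED.add(name)) {
            throw new IllegalStateException("Metrics source " + name + " already exists!");
        }
    }

    public static void main(String[] args) {
        register("root");         // the root queue registers first
        register("root.default"); // a normally named leaf queue is fine
        try {
            register("root.");    // assumed path for a leaf queue with an empty name
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under those assumptions, `root.` collapses to the same source name as `root`, so the second registration fails with the same message that appears in the log.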
Bug ID: https://issues.apache.org/jira/browse/YARN-2863
The crash can be reproduced with the following command, which submits a job with an empty queue name:
yarn jar hadoop-mapreduce-examples-2.0.0-cdh4.6.0.jar pi -Dmapreduce.job.queuename= 20 10
https://issues.apache.org/jira/browse/YARN-1213
This issue adds a configuration option so that the default queue is used when the queue name is empty.
The bug is already fixed in CDH 5.2.0.
YARN in 5.2.0 adds the following two settings:
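yarn.scheduler.fair.allow-undeclared-pools   // whether jobs may be placed in queues that are not declared; default: true
yarn.scheduler.fair.user-as-default-queue    // when no queue is specified, whether to use the submitting user's name as the queue instead of default; default: true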
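How the two settings interact can be sketched as a small function. This is a simplified model of the behaviour described above, not the FairScheduler source: the class `QueueNameResolver`, its `resolve` method, and the set of declared queues are all hypothetical, and treating an empty or missing name as "no queue specified" is an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Simplified model of queue placement under the two settings. Not Hadoop code. */
public class QueueNameResolver {
    static final String DEFAULT_QUEUE = "default";

    public static String resolve(String requested, String user,
                                 boolean userAsDefaultQueue,
                                 boolean allowUndeclaredPools,
                                 Set<String> declaredQueues) {
        String queue = (requested == null) ? "" : requested.trim();
        if (queue.isEmpty()) {
            // No queue specified: fall back to the user name or to "default".
            queue = userAsDefaultQueue ? user : DEFAULT_QUEUE;
        }
        if (!allowUndeclaredPools && !declaredQueues.contains(queue)) {
            // Undeclared queues are not allowed: place the job in "default".
            queue = DEFAULT_QUEUE;
        }
        return queue;
    }

    public static void main(String[] args) {
        Set<String> declared = new HashSet<>(Arrays.asList("default", "etl"));
        // Empty queue name with both settings at their defaults (true, true).
        System.out.println(resolve("", "alice", true, true, declared));
        // Empty queue name with user-as-default-queue disabled.
        System.out.println(resolve("", "alice", false, true, declared));
        // Undeclared queue requested while undeclared pools are disallowed.
        System.out.println(resolve("adhoc", "alice", true, false, declared));
    }
}
```

In this model an empty queue name never reaches queue creation as an empty string: it resolves to the user's name or to `default` first.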