spark|Spark内核源码深度剖析(Master主备切换机制原理剖析与源码分析)

1.Master主备切换机制的原理(图解)
spark|Spark内核源码深度剖析(Master主备切换机制原理剖析与源码分析)
文章图片

2.部分源码分析
【spark|Spark内核源码深度剖析(Master主备切换机制原理剖析与源码分析)】master.scala中的completeRecovery方法:

/* * 完成Master的主备切换 */ def completeRecovery() { // Ensure "only-once" recovery semantics using a short synchronization period. synchronized { if (state != RecoveryState.RECOVERING) { return } state = RecoveryState.COMPLETING_RECOVERY } /* * 将Application和worker,过滤出来目前的状态还是UNKNOW的 * 然后遍历,分别调用removeWorker和finishApplication方法, * 对可能已经出故障,或者甚至已经死掉的Application和Worker,进行清理 */ // Kill off any workers and apps that didn't respond to us. workers.filter(_.state == WorkerState.UNKNOWN).foreach(removeWorker) apps.filter(_.state == ApplicationState.UNKNOWN).foreach(finishApplication)// Reschedule drivers which were not claimed by any workers drivers.filter(_.worker.isEmpty).foreach { d => logWarning(s"Driver ${d.id} was not found after master recovery") if (d.desc.supervise) { logWarning(s"Re-launching ${d.id}") relaunchDriver(d) } else { removeDriver(d.id, DriverState.ERROR, None) logWarning(s"Did not re-launch ${d.id} because it was not supervised") } }state = RecoveryState.ALIVE schedule() logInfo("Recovery complete - resuming operations!") }

    推荐阅读