Understanding %iowait and CPU Utilization Correctly

resources

  • Understanding %IOWAIT (%WIO)
  • CPU Utilization and Load on Linux Systems
  • Linux Performance Observability Tools
  • How Linux CPU Usage Time and Percentage is calculated
  • Linux Process States
man (on RHEL 7)
# man mpstat
%usr     Show the percentage of CPU utilization that occurred while executing at the user level (application).
%nice    Show the percentage of CPU utilization that occurred while executing at the user level with nice priority.
%sys     Show the percentage of CPU utilization that occurred while executing at the system level (kernel). Note that this does not include time spent servicing hardware and software interrupts.
%iowait  Show the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.
%irq     Show the percentage of time spent by the CPU or CPUs to service hardware interrupts.
%soft    Show the percentage of time spent by the CPU or CPUs to service software interrupts.
%steal   Show the percentage of time spent in involuntary wait by the virtual CPU or CPUs while the hypervisor was servicing another virtual processor.
%guest   Show the percentage of time spent by the CPU or CPUs to run a virtual processor.
%gnice   Show the percentage of time spent by the CPU or CPUs to run a niced guest.
%idle    Show the percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request.

# man top
us, user    : time running un-niced user processes
sy, system  : time running kernel processes
ni, nice    : time running niced user processes
id, idle    : time spent in the kernel idle handler
wa, IO-wait : time waiting for I/O completion
hi : time spent servicing hardware interrupts
si : time spent servicing software interrupts
st : time stolen from this vm by the hypervisor

TIPS
  • CPU Usage Time and Percentage
According to the mpstat man page, %usr + %nice + %sys + %iowait + %irq + %soft + %steal + %guest + %gnice + %idle = 100%.
%steal normally shows a nonzero value only inside a virtual machine, for example on a VPS with heavy CPU overcommitment, while %guest and %gnice are usually very low.
So the same relation can be read from /proc/stat or top: user + nice + system + idle + iowait + irq + softirq + steal = 100% of CPU time.
To calculate Linux CPU usage time, subtract the idle CPU time from the total CPU time as follows:
Total CPU time since boot = user + nice + system + idle + iowait + irq + softirq + steal
Total CPU idle time since boot = idle + iowait
Total CPU usage time since boot = (Total CPU time since boot) - (Total CPU idle time since boot)
Total CPU percentage = (Total CPU usage time since boot) / (Total CPU time since boot) × 100
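The formula above can be sketched in Python. This is a minimal illustration, not a replacement for top or mpstat: it applies the same arithmetic to the difference between two samples of the aggregate `cpu` line rather than to the totals since boot, and the two sample lines are made-up values, not real measurements. The function names `parse_cpu_line` and `cpu_usage_percent` are hypothetical helpers introduced here.

```python
# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a 'cpu ...' line from /proc/stat into a dict of tick counters."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Only the first eight counters are used; guest and guest_nice are ignored.
    return dict(zip(FIELDS, map(int, parts[1:1 + len(FIELDS)])))


def cpu_usage_percent(before, after):
    """CPU usage between two samples, counting idle + iowait as idle time."""
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        return 0.0
    idle = delta["idle"] + delta["iowait"]
    return (total - idle) / total * 100.0


# Hypothetical sample lines, invented for illustration only.
SAMPLE_BEFORE = "cpu  1000 50 300 8000 400 10 40 0 0 0"
SAMPLE_AFTER = "cpu  1300 50 400 8500 500 10 40 0 0 0"

if __name__ == "__main__":
    usage = cpu_usage_percent(parse_cpu_line(SAMPLE_BEFORE),
                              parse_cpu_line(SAMPLE_AFTER))
    print(f"CPU usage over the interval: {usage:.1f}%")
```

With these sample values the interval spans 1000 ticks, 600 of which are idle or iowait, so the script reports 40.0%. On a real system the two lines would come from reading /proc/stat twice, a second or so apart.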

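  • Linux process states
Running (TASK_RUNNING):
Combines the running and ready states: the process is either executing or ready to execute. Linux represents this state with the TASK_RUNNING macro.
Interruptible sleep (light sleep) (TASK_INTERRUPTIBLE):
The process is sleeping (blocked). It is woken when the resource it is waiting for becomes available, and it can also be woken by a signal from another process or by a timer interrupt, after which it returns to the run queue. Linux represents this state with the TASK_INTERRUPTIBLE macro.
Uninterruptible sleep (deep sleep) (TASK_UNINTERRUPTIBLE):
Much like light sleep, with one difference: it cannot be woken by signals from other processes or by timer interrupts. Linux represents this state with the TASK_UNINTERRUPTIBLE macro.
Stopped (TASK_STOPPED):
The process has suspended execution to undergo some kind of handling. A process being debugged, for example, is in this state. Linux represents this state with the TASK_STOPPED macro.
Zombie (TASK_ZOMBIE):
The process has terminated but its PCB (process control block) has not yet been released. Linux represents this state with the TASK_ZOMBIE macro (named EXIT_ZOMBIE in newer kernels).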
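As a rough sketch of how these states show up in practice, the one-letter state code that ps and /proc/<pid>/stat report can be mapped back to the macros above. The helper names `parse_state` and `describe_state` and the sample stat line are hypothetical, introduced only for illustration. Other letters exist on real systems and fall through to a generic label here.

```python
# One-letter state codes as shown by ps and /proc/<pid>/stat, mapped to the
# five states described in the text.
STATE_NAMES = {
    "R": "TASK_RUNNING (running or runnable)",
    "S": "TASK_INTERRUPTIBLE (interruptible sleep)",
    "D": "TASK_UNINTERRUPTIBLE (uninterruptible sleep)",
    "T": "TASK_STOPPED (stopped)",
    "Z": "TASK_ZOMBIE (zombie)",
}


def parse_state(stat_text):
    """Extract the state letter from the contents of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces or
    # parentheses, so split after the last ')'.
    after_comm = stat_text.rsplit(")", 1)[1]
    return after_comm.split()[0]


def describe_state(stat_text):
    """Return a readable description of the process state."""
    letter = parse_state(stat_text)
    return STATE_NAMES.get(letter, f"other state ({letter})")


# Hypothetical, shortened stat line invented for illustration.
SAMPLE_STAT = "1234 (my (odd) proc) D 1 1234 1234 0 -1 4194304"

if __name__ == "__main__":
    print(describe_state(SAMPLE_STAT))
```

Processes in the D state are the ones blocked on I/O, which is why they matter when reading %iowait and load figures.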

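  • Understanding %iowait correctly
%iowait is the percentage of a sampling interval during which both of the following hold: the CPU is idle, and there is at least one I/O request still outstanding.
There are two common misreadings of %iowait:
one is to take %iowait as time during which the CPU cannot do work;
the other is to take %iowait as a sign of an I/O bottleneck.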

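First, a rise in %iowait does not prove that more processes are waiting for I/O, nor that the total time spent waiting for I/O has grown.
For example, I/O that happens while the CPU is busy does not change %iowait at all, however much or little of it there is. When the CPU becomes less busy, part of that I/O falls into CPU idle time and %iowait goes up.
As another example, when I/O concurrency is low, %iowait is high; when I/O concurrency is high, %iowait may well be lower.
%iowait is therefore a very vague metric. If you see %iowait rise, also check whether the I/O volume has increased noticeably, whether metrics such as avserv/avwait/avque have grown noticeably, and whether applications feel slower. If none of these is the case, there is nothing to worry about.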
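The two examples above can be made concrete with a toy model. This is only a sketch under simplifying assumptions: one CPU, time divided into equal ticks, and made-up timelines rather than measured data. It is not how the kernel accounts iowait internally. The function `iowait_percent` and the timeline constants are hypothetical names chosen for this illustration.

```python
def iowait_percent(cpu_busy, io_pending):
    """Share of ticks where the CPU is idle while an I/O request is outstanding."""
    if len(cpu_busy) != len(io_pending):
        raise ValueError("timelines must have the same length")
    hits = sum(1 for busy, io in zip(cpu_busy, io_pending) if not busy and io)
    return 100.0 * hits / len(cpu_busy)


# Ten ticks. Four I/O requests of one tick each, issued one after another,
# keep I/O outstanding during ticks 0-3.
SERIAL_IO = [True] * 4 + [False] * 6
# The same four requests issued concurrently all finish within tick 0.
PARALLEL_IO = [True] * 1 + [False] * 9

BUSY_CPU = [True] * 8 + [False] * 2   # CPU busy during the whole I/O window
CALM_CPU = [True] * 2 + [False] * 8   # CPU busy only during ticks 0-1
IDLE_CPU = [False] * 10               # CPU never busy

if __name__ == "__main__":
    # Same I/O, different CPU load: %iowait changes although I/O did not.
    print("busy CPU, serial I/O:  ", iowait_percent(BUSY_CPU, SERIAL_IO))
    print("calm CPU, serial I/O:  ", iowait_percent(CALM_CPU, SERIAL_IO))
    # Same amount of I/O, different concurrency: %iowait changes again.
    print("idle CPU, serial I/O:  ", iowait_percent(IDLE_CPU, SERIAL_IO))
    print("idle CPU, parallel I/O:", iowait_percent(IDLE_CPU, PARALLEL_IO))
```

In this model the identical serial I/O yields 0% with the busy CPU and 20% with the calm one, and the same four requests yield 40% when issued serially but 10% when issued concurrently, which is the sense in which %iowait says little about I/O load on its own.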

  • To check CPU utilization, the following Linux commands are recommended:
# top
# sar -u 1 5
# vmstat -n 1 5
# mpstat -P ALL 1 5

  • To check the load values, the following Linux commands are recommended:
# top
# uptime
# sar -q 1 5
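The load averages these commands print are also exposed in /proc/loadavg, so they can be read from a script. The sketch below parses the contents of that file. The helper name `parse_loadavg` and the sample line are hypothetical, and the sample values are invented for illustration.

```python
def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict."""
    fields = text.split()
    load1, load5, load15 = (float(value) for value in fields[:3])
    # The fourth field is "<currently runnable entities>/<total entities>".
    runnable, total = (int(value) for value in fields[3].split("/"))
    return {
        "load1": load1,
        "load5": load5,
        "load15": load15,
        "runnable": runnable,
        "total": total,
        "last_pid": int(fields[4]),
    }


# Hypothetical sample contents, invented for illustration.
SAMPLE_LOADAVG = "0.52 0.41 0.30 2/345 6789\n"

if __name__ == "__main__":
    info = parse_loadavg(SAMPLE_LOADAVG)
    print(f"load averages: {info['load1']} {info['load5']} {info['load15']}")
```

On a live system the text would come from `open("/proc/loadavg").read()`. The three averages are the same figures that uptime and top report.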


[Understanding %iowait and CPU Utilization Correctly] Reposted from: https://www.cnblogs.com/echo1937/p/6240020.html
