排查记录(负载过高、系统变慢)

背景 系统忽然变慢,页面刷新加载时间长,同一台机器上部署了n多个java服务,不要问我为什么,问就是历史原因。
一、top查看系统负载 Linux:top命令 – 实时显示系统运行状态
排查记录(负载过高、系统变慢)
文章图片

4核16GB内存的服务器,一分钟负载飙到24,有点吓人,页面加载时长20秒向上了
二、pidstat命令查看各个进程的CPU使用情况 pidstat -u 5 显示各个进程的cpu使用统计,5秒刷新一次

#显示各个进程的cpu使用统计,5秒刷新一次 [root@srv-100-50-88-6 logs]# pidstat -u 5 Linux 3.10.0-327.el7.x86_64 (srv-100-50-88-6)08/14/2022_x86_64_(4 CPU)12:29:06 AMUIDPID%usr %system%guest%CPUCPUCommand 12:29:12 AM070.000.200.000.200migration/0 12:29:12 AM01690.000.600.000.600rcu_sched 12:29:12 AM01710.000.200.000.202rcuos/1 12:29:12 AM01720.000.200.000.203rcuos/2 12:29:12 AM01730.000.400.000.401rcuos/3 12:29:12 AM03430.005.750.005.753ksoftirqd/3 12:29:12 AM023410.200.000.000.200ds_agent 12:29:12 AM0147996.940.600.007.540java 12:29:12 AM0598660.000.200.000.202qemu-ga 12:29:12 AM0694190.002.380.002.381java 12:29:12 AM0721290.0010.120.0010.120java 12:29:12 AM0861184.965.360.0010.322java 12:29:12 AM0951360.007.340.007.341java 12:29:12 AM0975740.200.000.000.200dockerd 12:29:12 AM01018540.003.370.003.373java 12:29:12 AM01475970.601.190.001.792node_exporter 12:29:12 AM014821530.9512.300.0043.250java 12:29:12 AM01509730.000.200.000.202kworker/2:1 12:29:12 AM01524590.200.790.000.990pidstat 12:29:12 AM01529020.005.160.005.162java 12:29:12 AM01531660.006.350.006.353java

明显观察到148215进程占用系统cpu时长达到了43.25%。
三、定位java线程
# 定位java线程 [root@srv-100-50-88-6 logs]# ps H -eo pid,tid,%cpu --sort=%cpu |grep 148215 148215 1490060.0 148215 1490620.0 148215 1506940.0 148215 1510110.0 148215 1516150.0 148215 1525700.0 148215 1482170.1 148215 1482180.1 148215 1482190.1 148215 1482200.1 148215 1483130.1 148215 1483310.1 148215 1485310.1 148215 1488930.1 148215 1488950.1 148215 1482430.2 148215 1483100.2 148215 1483430.2 148215 1485390.2 148215 1485400.2 148215 1485410.2 148215 1488900.2 148215 1488910.2 148215 1488920.2 148215 1488940.2 148215 1488960.2 148215 1487340.4 148215 1482210.8 148215 1482270.8 148215 1483770.9 148215 1483780.9 148215 1504030.9 148215 1483291.2 148215 1483281.3 148215 1483761.3 148215 1483851.3 148215 1485671.6 148215 1483271.7 148215 1483331.7 148215 1482162.1 148215 1482253.1 148215 1482263.1 148215 1511913.6

# 将线程TID转换为16进制,为后面查找 jstack 日志做准备; 这里转换了倒排的七个线程。 [root@srv-100-50-88-6 logs]# printf "0x%x\n" 151191 0x24e97 [root@srv-100-50-88-6 logs]# printf "0x%x\n" 148226 0x24302 [root@srv-100-50-88-6 logs]# printf "0x%x\n" 148225 0x24301 [root@srv-100-50-88-6 logs]# printf "0x%x\n" 148216 0x242f8 [root@srv-100-50-88-6 logs]# printf "0x%x\n" 148333 0x2436d [root@srv-100-50-88-6 logs]# printf "0x%x\n" 148327 0x24367 [root@srv-100-50-88-6 logs]# printf "0x%x\n" 148567 0x24457

四、定位代码块
#使用jstack生成虚拟机当前时刻的线程快照 [root@srv-100-50-88-6 logs]# jstack -l 148215 >> jstacklog.out #将当前堆栈信息保存为文件,通过16进制的TID查找问题所在的代码块;搜索得到对应的线程信息,可直观的看到代码异常信息。 [root@srv-100-50-88-6 logs]# sz jstacklog.out

通过0x24e97这样的线程16进程,从jstacklog.out中找出具体的堆栈。
"Thread-166" #520 prio=10 os_prio=0 tid=0x00007f3290016800 nid=0x24e97 runnable [0x00007f32582db000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:171) at java.net.SocketInputStream.read(SocketInputStream.java:141) at sun.security.ssl.InputRecord.readFully(InputRecord.java:465) at sun.security.ssl.InputRecord.read(InputRecord.java:503) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975) - locked <0x00000006d212c5a0> (a java.lang.Object) at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933) at sun.security.ssl.AppInputStream.read(AppInputStream.java:105) - locked <0x00000006d2136e38> (a sun.security.ssl.AppInputStream) at java.io.FilterInputStream.read(FilterInputStream.java:133) at com.mysql.cj.protocol.FullReadInputStream.readFully(FullReadInputStream.java:64) at com.mysql.cj.protocol.a.SimplePacketReader.readHeader(SimplePacketReader.java:63) at com.mysql.cj.protocol.a.SimplePacketReader.readHeader(SimplePacketReader.java:45) at com.mysql.cj.protocol.a.TimeTrackingPacketReader.readHeader(TimeTrackingPacketReader.java:52) at com.mysql.cj.protocol.a.TimeTrackingPacketReader.readHeader(TimeTrackingPacketReader.java:41) at com.mysql.cj.protocol.a.MultiPacketReader.readHeader(MultiPacketReader.java:54) at com.mysql.cj.protocol.a.MultiPacketReader.readHeader(MultiPacketReader.java:44) at com.mysql.cj.protocol.a.NativeProtocol.readMessage(NativeProtocol.java:533) at com.mysql.cj.protocol.a.NativeProtocol.checkErrorMessage(NativeProtocol.java:703) at com.mysql.cj.protocol.a.NativeProtocol.sendCommand(NativeProtocol.java:642) at com.mysql.cj.protocol.a.NativeProtocol.sendCommand(NativeProtocol.java:128) at com.mysql.cj.NativeSession.sendCommand(NativeSession.java:311) at com.mysql.cj.NativeSession.queryServerVariable(NativeSession.java:1008) at com.mysql.cj.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:1406) at com.mysql.cj.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:1399) at com.mysql.cj.jdbc.ClientPreparedStatement.checkReadOnlySafeStatement(ClientPreparedStatement.java:307) - locked <0x00000006d21934e0> (a com.mysql.cj.jdbc.ConnectionImpl) at com.mysql.cj.jdbc.ClientPreparedStatement.execute(ClientPreparedStatement.java:317) - locked <0x00000006d21934e0> (a com.mysql.cj.jdbc.ConnectionImpl) at com.zaxxer.hikari.pool.ProxyPreparedStatement.execute(ProxyPreparedStatement.java:44) at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.execute(HikariProxyPreparedStatement.java) at sun.reflect.GeneratedMethodAccessor501.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.ibatis.logging.jdbc.PreparedStatementLogger.invoke(PreparedStatementLogger.java:59) at com.sun.proxy.$Proxy637.execute(Unknown Source) at org.apache.ibatis.executor.statement.PreparedStatementHandler.update(PreparedStatementHandler.java:47) at org.apache.ibatis.executor.statement.RoutingStatementHandler.update(RoutingStatementHandler.java:74) at sun.reflect.GeneratedMethodAccessor900.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:63) at com.sun.proxy.$Proxy635.update(Unknown Source) at com.baomidou.mybatisplus.core.executor.MybatisSimpleExecutor.doUpdate(MybatisSimpleExecutor.java:56) at org.apache.ibatis.executor.BaseExecutor.update(BaseExecutor.java:117) at sun.reflect.GeneratedMethodAccessor774.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.ibatis.plugin.Invocation.proceed(Invocation.java:49) at com.baomidou.mybatisplus.extension.plugins.MybatisPlusInterceptor.intercept(MybatisPlusInterceptor.java:83) at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:61) at com.sun.proxy.$Proxy634.update(Unknown Source) at org.apache.ibatis.session.defaults.DefaultSqlSession.update(DefaultSqlSession.java:197) at sun.reflect.GeneratedMethodAccessor898.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:426) at com.sun.proxy.$Proxy159.update(Unknown Source) at org.mybatis.spring.SqlSessionTemplate.update(SqlSessionTemplate.java:287) at com.baomidou.mybatisplus.core.override.MybatisMapperMethod.execute(MybatisMapperMethod.java:65) at com.baomidou.mybatisplus.core.override.MybatisMapperProxy$PlainMethodInvoker.invoke(MybatisMapperProxy.java:148) at com.baomidou.mybatisplus.core.override.MybatisMapperProxy.invoke(MybatisMapperProxy.java:89) at com.sun.proxy.$Proxy418.updateById(Unknown Source) at sun.reflect.GeneratedMethodAccessor905.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344) at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) at com.sun.proxy.$Proxy419.updateById(Unknown Source) at com.baomidou.mybatisplus.extension.service.IService.updateById(IService.java:148) at com.baomidou.mybatisplus.extension.service.IService$$FastClassBySpringCGLIB$$f8525d18.invoke() at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:687) at com.segmentfault.business.service.impl.stat.StatQualityAnalysisServiceImpl$$EnhancerBySpringCGLIB$$eb9ef323.updateById() at com.segmentfault.business.task.service.StatQualityAnalysisService.lambda$saveOrUpdateAnalysis$23(StatQualityAnalysisService.java:1260) at com.segmentfault.business.task.service.StatQualityAnalysisService$$Lambda$1908/872329305.accept(Unknown Source) at java.util.ArrayList.forEach(ArrayList.java:1257) at com.segmentfault.business.task.service.StatQualityAnalysisService.saveOrUpdateAnalysis(StatQualityAnalysisService.java:1246) at com.segmentfault.business.task.service.StatQualityAnalysisService.collectTodayQualityInfo(StatQualityAnalysisService.java:138) at com.segmentfault.business.task.StatQualityAnalysisJob.collectTodayQualityInfo(StatQualityAnalysisJob.java:33) at sun.reflect.GeneratedMethodAccessor2274.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.xxl.job.core.handler.impl.MethodJobHandler.execute(MethodJobHandler.java:29) at com.xxl.job.core.thread.JobThread.run(JobThread.java:152)Locked ownable synchronizers: - None"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f3310182800 nid=0x24302 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLELocked ownable synchronizers: - None"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f331017f800 nid=0x24301 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLELocked ownable synchronizers: - None"DestroyJavaVM" #180 prio=5 os_prio=0 tid=0x00007f3310009800 nid=0x242f8 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLELocked ownable synchronizers: - None"com.alibaba.nacos.naming.beat.sender" #55 daemon prio=5 os_prio=0 tid=0x00007f3311ad9000 nid=0x2436d runnable [0x00007f326faf9000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for<0x00000006c3640658> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)Locked ownable synchronizers: - None"com.alibaba.nacos.naming.beat.sender" #49 daemon prio=5 os_prio=0 tid=0x00007f3310f93000 nid=0x24367 waiting on condition [0x00007f32a81d8000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for<0x00000006c3640658> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)Locked ownable synchronizers: - None"Thread-74" #185 prio=10 os_prio=0 tid=0x00007f32dc00c800 nid=0x24457 waiting on condition [0x00007f325cd21000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for<0x00000007a4ac0b90> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at org.apache.dubbo.common.threadpool.ThreadlessExecutor.waitAndDrain(ThreadlessExecutor.java:91) at org.apache.dubbo.rpc.AsyncRpcResult.get(AsyncRpcResult.java:179) at org.apache.dubbo.rpc.protocol.AsyncToSyncInvoker.invoke(AsyncToSyncInvoker.java:61) at org.apache.dubbo.rpc.listener.ListenerInvokerWrapper.invoke(ListenerInvokerWrapper.java:78) at org.apache.dubbo.monitor.support.MonitorFilter.invoke(MonitorFilter.java:91) at org.apache.dubbo.rpc.protocol.FilterNode.invoke(FilterNode.java:61) at org.apache.dubbo.rpc.protocol.dubbo.filter.FutureFilter.invoke(FutureFilter.java:52) at org.apache.dubbo.rpc.protocol.FilterNode.invoke(FilterNode.java:61) at com.segmentfault.common.user.interceptor.filter.ConsumerContextFilter.invoke(ConsumerContextFilter.java:56) at org.apache.dubbo.rpc.protocol.FilterNode.invoke(FilterNode.java:61) at org.apache.dubbo.rpc.filter.ConsumerContextFilter.invoke(ConsumerContextFilter.java:69) at org.apache.dubbo.rpc.protocol.FilterNode.invoke(FilterNode.java:61) at org.apache.dubbo.rpc.protocol.InvokerWrapper.invoke(InvokerWrapper.java:56) at org.apache.dubbo.rpc.cluster.support.FailoverClusterInvoker.doInvoke(FailoverClusterInvoker.java:79) at org.apache.dubbo.rpc.cluster.support.AbstractClusterInvoker.invoke(AbstractClusterInvoker.java:265) at org.apache.dubbo.rpc.cluster.interceptor.ClusterInterceptor.intercept(ClusterInterceptor.java:47) at org.apache.dubbo.rpc.cluster.support.wrapper.AbstractCluster$InterceptorInvokerNode.invoke(AbstractCluster.java:92) at org.apache.dubbo.rpc.cluster.support.wrapper.MockClusterInvoker.invoke(MockClusterInvoker.java:93) at org.apache.dubbo.registry.client.migration.MigrationInvoker.invoke(MigrationInvoker.java:170) at org.apache.dubbo.rpc.proxy.InvokerInvocationHandler.invoke(InvokerInvocationHandler.java:96) at org.apache.dubbo.common.bytecode.proxy12.getAreaNameByCode(proxy12.java) at com.segmentfault.business.system.api.utils.AreaInfoUtil.getAreaNameByCode(AreaInfoUtil.java:39) at com.segmentfault.business.service.impl.WhiteItemRemindConfigServiceImpl.remind(WhiteItemRemindConfigServiceImpl.java:164) at com.segmentfault.business.service.impl.WhiteItemRemindConfigServiceImpl$$FastClassBySpringCGLIB$$bcecf521.invoke() at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:687) at com.segmentfault.business.service.impl.WhiteItemRemindConfigServiceImpl$$EnhancerBySpringCGLIB$$2071afb8.remind() at com.segmentfault.business.task.WhiteItemStateJob.lambda$whiteStateHandle$0(WhiteItemStateJob.java:63) at com.segmentfault.business.task.WhiteItemStateJob$$Lambda$1783/2031958241.accept(Unknown Source) at java.util.ArrayList.forEach(ArrayList.java:1257) at com.segmentfault.business.task.WhiteItemStateJob.whiteStateHandle(WhiteItemStateJob.java:51) at sun.reflect.GeneratedMethodAccessor1505.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.xxl.job.core.handler.impl.MethodJobHandler.execute(MethodJobHandler.java:29) at com.xxl.job.core.thread.JobThread.run(JobThread.java:152)Locked ownable synchronizers: - None

两个xxljob任务、两个C2 CompilerThread、两个nacos进程、一个DestroyJavaVM守护进程。
五、引用
Java应用CPU占用过高问题排查
【排查记录(负载过高、系统变慢)】pidstat 命令详解

    推荐阅读