WinDbg Debugging Tutorial Series (Introducing and Using the SOS Extension)

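What is SOS?
Put simply, SOS is a debugger extension DLL. It makes analyzing .NET processes in WinDbg faster and more convenient: with SOS we can inspect all kinds of CLR runtime information, which helps us understand the state and meaning of managed memory.
The DLL is installed together with the .NET Framework, so it normally does not need a separate installation. On my machine it was installed automatically to the following locations: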

C:\Windows\Microsoft.NET\Framework\v4.0.30319\SOS.dll
C:\Windows\Microsoft.NET\Framework64\v4.0.30319\SOS.dll


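Loading and using SOS
In most cases the built-in WinDbg command ".loadby sos clr" loads it automatically: it loads sos.dll from the same directory as the clr.dll module already loaded in the process. Use ".chain" to check that the load succeeded.
0:098> .loadby sos clr
0:098> .chain
Extension DLL search Path:
    ....omitted....
Extension DLL chain:
    ....omitted....
    C:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos.dll: image 4.8.9014.0, API 1.0.0, built Tue Oct 12 08:17:44 2021
        [path: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos.dll]
    ....omitted....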

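Every command provided by this extension has the form: ![command] [options]. The important, commonly used commands are covered below, grouped by purpose.
Commands for inspecting threads
  1. Threads command
    • Purpose: lists all managed threads.
    • Threads [-live] [-special]
      • By default, lists the regular managed threads
      • -live: lists only live threads
      • -special: lists the special threads created by the CLR, including GC threads, the debugger helper thread, the finalizer thread, the AppDomain unload thread, and thread pool timer threads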
    • Sample output and interpretation:
      0:015> !Threads
      ThreadCount:      4
      UnstartedThread:  0
      BackgroundThread: 3
      PendingThread:    0
      DeadThread:       0
      Hosted Runtime:   no
                                                                                                              Lock
             ID OSID ThreadOBJ           State GC Mode     GC Alloc Context                  Domain           Count Apt Exception
         0    1 5320 0000024cdf5efd80    26020 Preemptive  0000024CE14ECC50:0000024CE14EDC30 0000024cdf5c4cf0 0     STA
         5    2 7c74 0000024cdf618df0    2b220 Preemptive  0000000000000000:0000000000000000 0000024cdf5c4cf0 0     MTA (Finalizer)
        15    3 5844 0000024cfedca6a0  3029220 Preemptive  0000024CE14CA048:0000024CE14CBC30 0000024cdf5c4cf0 0     MTA (Threadpool Worker)
        16    4 7dc0 0000024cfed99a20  1029220 Preemptive  0000024CE14CBF38:0000024CE14CDC30 0000024cdf5c4cf0 0     MTA (Threadpool Worker)

      0:098> !Threads -special
             OSID Special thread type
          1 27d3c DbgHelper
          2 22954 GC
          3 2b168 GC
          4  c478 GC
          5 13778 GC
          6  62ec GC
          7 2dd1c GC
          8  ec60 GC
          9 138b8 GC
         10  e988 Finalizer
         11  9850 ProfilingAPIAttach
         16  180c Timer
         17 24960 ThreadpoolWorker
         18  e70c ThreadpoolWorker
         20 29e94 GC
         21 2abdc GC
         22 2d760 GC
         23 22ad8 GC
         24 2d288 GC
         25  9750 GC
         26  52e8 GC
         27 2b9a4 GC
         29 18340 Wait
      ....omitted....

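    • Unnamed column, ID, OSID: the first three columns are the thread index assigned by WinDbg, the CLR managed thread ID, and the OS thread ID. XXXX in the first column means the thread is dead.
    • ThreadOBJ is the address of the thread object. State is the value that the ThreadState command decodes.
    • Domain is the AppDomain the thread is currently running in; Lock Count is the number of locks the thread holds.
    • Apt is the thread's COM apartment model, with values such as STA, MTA, and Ukn. The annotation that follows it identifies the thread's role (Finalizer, Threadpool Worker, Threadpool Completion Port, and so on).
    • Exception shows the most recent exception object thrown on the thread.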
  2. ThreadState command
    • Purpose: decodes a thread's state value.
    • ThreadState <State value field>; the argument is the State value printed by the previous command.
    • Sample output and interpretation:
      0:015> !ThreadState 1029220
          Legal to Join
          Background
          CLR Owns
          In Multi Threaded Apartment
          Fully initialized
          Thread Pool Worker Thread

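    • This tells us the thread can be joined, is a background thread, is owned by the CLR, runs in the multithreaded apartment, is fully initialized, and is a thread pool worker thread.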
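
The State value is a bit mask, which is why one hex number expands into several lines. The sketch below illustrates the idea in Python. The bit values in ASSUMED_FLAGS are my assumption, not something the article or the command output states; they are chosen so that the six flags printed for 1029220 add up to exactly that value, and only those six flags are included. The helper name decode_state is hypothetical.

```python
# Assumed bit values (not taken from the article): one entry per flag that
# !ThreadState printed for the sample value 1029220.
ASSUMED_FLAGS = {
    0x00000020: "Legal to Join",
    0x00000200: "Background",
    0x00001000: "CLR Owns",
    0x00008000: "In Multi Threaded Apartment",
    0x00020000: "Fully initialized",
    0x01000000: "Thread Pool Worker Thread",
}


def decode_state(state_hex):
    """Return (names of the known flags that are set, bits not covered by the table)."""
    value = int(state_hex, 16)
    names = [name for bit, name in sorted(ASSUMED_FLAGS.items()) if value & bit]
    leftover = value
    for bit in ASSUMED_FLAGS:
        leftover &= ~bit  # clear every bit the table knows about
    return names, leftover


names, leftover = decode_state("1029220")
for name in names:
    print(name)
print("bits not in the table:", hex(leftover))
```

Any bits left over, as with the value 3029220 of thread 15 in the !Threads sample, are flags this small table does not cover; !ThreadState itself remains the authoritative decoder.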
  3. ThreadPool command
    • Purpose: shows thread pool information. This command takes no additional options.
    • Sample output and interpretation:
      0:001> !threadpool
      CPU utilization: 32%
      Worker Thread: Total: 183 Running: 154 Idle: 29 MaxLimit: 1000 MinLimit: 100
      Work Request in Queue: 0
      --------------------------------------
      Number of Timers: 1
      --------------------------------------
      Completion Port Thread:Total: 10 Free: 6 MaxFree: 16 CurrentLimit: 6 MaxLimit: 200 MinLimit: 20
    • The fields are self-explanatory; look at whichever ones you need.
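
When the same dump has to be checked repeatedly, it can help to pull the worker thread numbers out of the text. The following is a minimal Python sketch that parses the "Worker Thread" line from the sample above; the function name parse_worker_line and the two derived figures are my own illustration, not part of SOS.

```python
import re

# The "Worker Thread" line copied from the !threadpool sample output.
SAMPLE = "Worker Thread: Total: 183 Running: 154 Idle: 29 MaxLimit: 1000 MinLimit: 100"


def parse_worker_line(line):
    """Extract the 'Name: number' pairs that follow the 'Worker Thread:' label."""
    _, _, rest = line.partition("Worker Thread:")
    return {name: int(num) for name, num in re.findall(r"(\w+):\s*(\d+)", rest)}


stats = parse_worker_line(SAMPLE)
busy_ratio = stats["Running"] / stats["Total"]
headroom = stats["MaxLimit"] - stats["Total"]
print(stats)
print(f"running/total = {busy_ratio:.0%}, threads left before MaxLimit = {headroom}")
```

In the sample, Running plus Idle equals Total, and the pool is still far below MaxLimit, so the pool itself is not the bottleneck here.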
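Commands for inspecting stacks
  1. clrstack command
    • Purpose: shows the managed stack of the current thread.
    • CLRStack [-a] [-l] [-p] [-n]
      • -p: includes the arguments of each method on the stack
      • -l: includes the local variables of each method on the stack
      • -a: equivalent to using -p and -l together
      • -n: omits source file names and line numbers from the output
    • Sample output and interpretation: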
      0:015> !clrstack -n -a
      OS Thread Id: 0x759c (15)
              Child SP               IP Call Site
      0000005c65ffef48 00007ffc73b82f14 [HelperMethodFrame: 0000005c65ffef48] System.Threading.Thread.SleepInternal(Int32)
      0000005c65fff040 00007ffc5988b78b System.Threading.Thread.Sleep(Int32)
          PARAMETERS:
              millisecondsTimeout =

      0000005c65fff070 00007ffbfc3c946b Tccc.WindbgDemoAPP.MainWindow+<>c.b__22_0(System.Object)
          PARAMETERS:
              this (0x0000005c65fff0a0) = 0x0000023f431ffd68
              a (0x0000005c65fff0a8) = 0x0000000000000000

      0000005c65fff0a0 00007ffc59853480 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
          PARAMETERS:
              executionContext =
              callback =
              state =
              preserveSyncCtx =
          LOCALS:

      0000005c65fff170 00007ffc59853305 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
          PARAMETERS:
              executionContext =
              callback =
              state =
              preserveSyncCtx =

      0000005c65fff1a0 00007ffc5987a506 System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
          PARAMETERS:
              this =
          LOCALS:

      0000005c65fff1e0 00007ffc59879746 System.Threading.ThreadPoolWorkQueue.Dispatch()
          LOCALS:
              0x0000005c65fff230 = 0x0000023f431ffdc0
              0x0000005c65fff25c = 0x000000002019e530
              0x0000005c65fff258 = 0x0000000000000001
              0x0000005c65fff228 = 0x0000023f43201cb0
              0x0000005c65fff244 = 0x0000000000000000

      0000005c65fff678 00007ffc5b927873 [DebuggerU2MCatchHandlerFrame: 0000005c65fff678]

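    • The IP column is the instruction pointer of the frame, that is, an address inside the method's compiled code. Pass it to the !U command to see the method's disassembly. The Call Site column shows the method for each stack frame.
    • PARAMETERS and LOCALS list the arguments and local variables of the corresponding method. Arguments are shown with their names, but local variables have no names and are identified by address instead: <local address> = <value>.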
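  2. dumpstack command
    • Purpose: shows the complete stack of a thread (both managed and unmanaged frames).
    • DumpStack [-EE] [-n] [top stack [bottom stack]]
      • -EE: shows only the managed frames
      • -n: as above, omits source file names and line numbers
    • Sample output and interpretation: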
      0:015> !dumpstack -ee
      OS Thread Id: 0x759c (15)
      Current frame:
      Child-SP         RetAddr          Caller, Callee
      0000005c65fff030 00007ffc5988b78b (MethodDesc 00007ffc594b9090 +0xb System.Threading.Thread.Sleep(Int32))
      0000005c65fff060 00007ffbfc3c946b (MethodDesc 00007ffbfc45eb10 +0x2b Tccc.WindbgDemoAPP.MainWindow+<>c.b__22_0(System.Object))
      0000005c65fff090 00007ffc59853480 (MethodDesc 00007ffc592f8a18 +0x170 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean))
      0000005c65fff160 00007ffc59853305 (MethodDesc 00007ffc594c5248 +0x15 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean))
      0000005c65fff190 00007ffc5987a506 (MethodDesc 00007ffc594e3570 +0x76 System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem())
      0000005c65fff1d0 00007ffc59879746 (MethodDesc 00007ffc5965fbc0 +0x156 System.Threading.ThreadPoolWorkQueue.Dispatch())

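    • The RetAddr column is the return address of the frame, an address inside the method's compiled code; these are the same values that the IP column of clrstack shows. The Caller, Callee column contains the method's MethodDesc, the offset into the method, and the method signature.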
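  3. eestack command
    • Purpose: shows the stacks of all threads; it is equivalent to running dumpstack on every thread.
    • EEStack [-short] [-EE]
      • -EE: as above
      • -short: prints only the stacks of threads that meet one of these conditions: the thread holds a lock, the thread has been suspended for a GC, or the thread is currently executing managed code
    • Sample output and interpretation: same as above, omitted.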
  4. dso command
    • Purpose: lists all managed objects found on the current thread's stack.
    • DSO [-verify] [top stack [bottom stack]]
    • Sample output and interpretation:
      0:015> !dso
      OS Thread Id: 0x759c (15)
      RSP/REG          Object           Name
      0000005C65FFEF58 0000023f43200628 System.Threading.QueueUserWorkItemCallback
      0000005C65FFEF78 0000023f43200690 System.Threading.ExecutionContext
      0000005C65FFEF80 0000023f43200650 System.Threading.ContextCallback
      0000005C65FFEF98 0000023f43200650 System.Threading.ContextCallback
      0000005C65FFEFC0 0000023f43201fe8 System.Threading.Thread
      0000005C65FFF000 0000023f43200690 System.Threading.ExecutionContext
      0000005C65FFF020 0000023f43201fe8 System.Threading.Thread
      0000005C65FFF028 0000023f43200628 System.Threading.QueueUserWorkItemCallback
      0000005C65FFF048 0000023f43200650 System.Threading.ContextCallback
      0000005C65FFF050 0000023f43200690 System.Threading.ExecutionContext
      0000005C65FFF060 0000023f43200690 System.Threading.ExecutionContext
      0000005C65FFF088 0000023f43200628 System.Threading.QueueUserWorkItemCallback
      0000005C65FFF0A0 0000023f431ffd68 Tccc.WindbgDemoAPP.MainWindow+<>c
      0000005C65FFF0E0 0000023f43201fe8 System.Threading.Thread
      0000005C65FFF140 0000023f43200628 System.Threading.QueueUserWorkItemCallback
      0000005C65FFF148 0000023f43200690 System.Threading.ExecutionContext
      0000005C65FFF1E0 0000023f43200628 System.Threading.QueueUserWorkItemCallback
      0000005C65FFF228 0000023f43201cb0 System.Threading.ThreadPoolWorkQueueThreadLocals
      0000005C65FFF230 0000023f431ffdc0 System.Threading.ThreadPoolWorkQueue
      0000005C65FFF250 0000023f43200628 System.Threading.QueueUserWorkItemCallback

    • The Object column is the object's address, and the Name column is the object's type name.
Commands for inspecting locks
  1. syncblk command
    • Purpose: shows which sync block locks in the process are held and which are being waited on. Commonly used to analyze deadlocks.
    • SyncBlk [-all | <syncblk number>]
      • By default, lists only the locks that are held by a thread
      • -all: lists every SyncBlock entry
      • syncblk number: shows only the lock with the given index
    • Sample output and interpretation:
      0:098> !syncblk
      Index SyncBlock        MonitorHeld Recursion Owning Thread Info          SyncBlock Owner
        510 000002bb0513d4a8          79         1 000002bb073568c0 f9e8  177  000002be7520adb0 System.Object
        530 000002bb0513a1f8         219         1 000002bb06a17770 26bdc 112  000002bd74f07d30 System.Object
        567 000002bb0513e148           5         1 000002bb06ce6f40 24ce0 113  000002be74f320b8 System.Object
       1343 000002bb0513c2b8           7         1 000002bb06a17770 26bdc 112  000002bd7515b158 ServiceStack.Redis.RedisSentinelWorker
      -----------------------------
      Total           1369
      CCW             3
      RCW             2
      ComClassFactory 0
      Free            507

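    • Index is the sync block index, SyncBlock is the address of the sync block, MonitorHeld is a counter that encodes both the owner and the waiters, Recursion is how many times the owning thread has acquired the lock recursively, the three Owning Thread Info columns identify the thread that holds the lock (thread object address, OS thread ID, and debugger thread index), and SyncBlock Owner gives the address and type of the locked object.
    • How to read MonitorHeld: the value is increased by 1 when a thread holds the lock and by 2 for each thread waiting on it. For lock 510 in the sample, the number of waiting threads is therefore (79 - 1) / 2 = 39.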
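
The same arithmetic can be applied to every row of the sample. Here is a small Python sketch of the rule described above (owner counts 1, each waiter counts 2); the function name waiting_threads is my own, and the MonitorHeld values are read from the !syncblk sample.

```python
def waiting_threads(monitor_held):
    """Number of waiters, given that the owner adds 1 and every waiter adds 2."""
    owner = monitor_held % 2  # 1 when the lock is currently held, else 0
    return (monitor_held - owner) // 2


# (Index, MonitorHeld) pairs read from the !syncblk sample output.
sample = {510: 79, 530: 219, 567: 5, 1343: 7}
for index, held in sample.items():
    print(f"lock {index}: {waiting_threads(held)} waiting thread(s)")
```

Sorting the locks by this number is a quick way to find the most contended one; in the sample that is lock 530, owned by debugger thread 112.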
Commands for inspecting process information and memory layout
  1. EEVersion command
    • Purpose: shows the CLR version.
    • Sample output and interpretation:
      0:001> !eeversion
      4.8.4380.0 free
      Server mode with 8 gc heaps
      SOS Version: 4.8.4380.0 retail build

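  2. DumpDomain command
    • Purpose: shows application domain information.
    • DumpDomain [<domain address>]; the argument is the address of a specific application domain. Without an argument, all application domains are listed.
    • Sample output and interpretation: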
      0:015> !dumpdomain
      --------------------------------------
      System Domain:      00007ffc5c3006e0
      LowFrequencyHeap:   00007ffc5c300c58
      HighFrequencyHeap:  00007ffc5c300ce8
      StubHeap:           00007ffc5c300d78
      Stage:              OPEN
      Name:               None
      --------------------------------------
      Shared Domain:      00007ffc5c300110
      LowFrequencyHeap:   00007ffc5c300c58
      HighFrequencyHeap:  00007ffc5c300ce8
      StubHeap:           00007ffc5c300d78
      Stage:              OPEN
      Name:               None
      Assembly:           0000024cdf62db80 [C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll]
      ClassLoader:        0000024cdf62dca0
        Module Name
      00007ffc592f1000    C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll

      --------------------------------------
      Domain 1:           0000024cdf5c4cf0
      LowFrequencyHeap:   0000024cdf5c54e8
      HighFrequencyHeap:  0000024cdf5c5578
      StubHeap:           0000024cdf5c5608
      Stage:              OPEN
      SecurityDescriptor: 0000024cdf5c7300
      Name:               Tccc.WindbgDemoAPP.exe
      Assembly:           0000024cdf62db80 [C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll]
      ClassLoader:        0000024cdf62dca0
      SecurityDescriptor: 0000024cdf6317c0
        Module Name
      00007ffc592f1000    C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll

      Assembly:           0000024cdf640bb0 [D:\50LearningSln\Tccc\Tccc.WindbgDemoAPP\bin\x64\Debug\Tccc.WindbgDemoAPP.exe]
      ClassLoader:        0000024cdf640cd0
      SecurityDescriptor: 0000024cdf640ac0
        Module Name
      00007ffbfc2a4140    D:\50LearningSln\Tccc\Tccc.WindbgDemoAPP\bin\x64\Debug\Tccc.WindbgDemoAPP.exe

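    • The output covers the System Domain, the Shared Domain, Domain 1, and so on, and it also lists the assemblies loaded into each domain. The assembly paths tell you exactly which files the process loaded, which is very useful.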
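  3. eeheap command
    • Purpose: shows how the CLR's memory is laid out inside the process.
    • EEHeap [-gc] [-loader]; by default, all memory regions are shown.
      • -gc: shows only the GC heap, including the large object heap
      • -loader: shows only the loader-related heaps
    • Sample output and interpretation: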
      0:015> !eeheap
      Loader Heap:
      --------------------------------------
      System Domain:        00007ffc5c3006e0
      LowFrequencyHeap:     00007ffbfc290000(3000:3000) Size: 0x3000 (12288) bytes.
      HighFrequencyHeap:    00007ffbfc294000(9000:3000) Size: 0x3000 (12288) bytes.
      StubHeap:             00007ffbfc29d000(3000:3000) 00007ffbfc450000(10000:1000) Size: 0x4000 (16384) bytes.
      Virtual Call Stub Heap:
        IndcellHeap:        00007ffbfc340000(6000:1000) Size: 0x1000 (4096) bytes.
        LookupHeap:         00007ffbfc34c000(4000:1000) Size: 0x1000 (4096) bytes.
        ResolveHeap:        00007ffbfc376000(3a000:2000) Size: 0x2000 (8192) bytes.
        DispatchHeap:       00007ffbfc350000(26000:1000) Size: 0x1000 (4096) bytes.
        CacheEntryHeap:     00007ffbfc346000(6000:1000) Size: 0x1000 (4096) bytes.
      Total size:           Size: 0x10000 (65536) bytes.
      --------------------------------------
      Shared Domain:        00007ffc5c300110
      LowFrequencyHeap:     00007ffbfc290000(3000:3000) Size: 0x3000 (12288) bytes.
      HighFrequencyHeap:    00007ffbfc294000(9000:3000) Size: 0x3000 (12288) bytes.
      StubHeap:             00007ffbfc29d000(3000:3000) 00007ffbfc450000(10000:1000) Size: 0x4000 (16384) bytes.
      Virtual Call Stub Heap:
        IndcellHeap:        00007ffbfc340000(6000:1000) Size: 0x1000 (4096) bytes.
        LookupHeap:         00007ffbfc34c000(4000:1000) Size: 0x1000 (4096) bytes.
        ResolveHeap:        00007ffbfc376000(3a000:2000) Size: 0x2000 (8192) bytes.
        DispatchHeap:       00007ffbfc350000(26000:1000) Size: 0x1000 (4096) bytes.
        CacheEntryHeap:     00007ffbfc346000(6000:1000) Size: 0x1000 (4096) bytes.
      Total size:           Size: 0x10000 (65536) bytes.
      --------------------------------------
      Domain 1:             0000024cdf5c4cf0
      LowFrequencyHeap:     00007ffbfc2a0000(3000:3000) 00007ffbfc430000(10000:d000) Size: 0x10000 (65536) bytes.
      HighFrequencyHeap:    00007ffbfc2a3000(a000:9000) 00007ffbfc440000(10000:f000) Size: 0x18000 (98304) bytes total, 0x1000 (4096) bytes wasted.
      StubHeap:             00007ffbfc2ad000(3000:1000) Size: 0x1000 (4096) bytes.
      Virtual Call Stub Heap:
        IndcellHeap:        00007ffbfc2b0000(4000:1000) Size: 0x1000 (4096) bytes.
        LookupHeap:         00007ffbfc2bb000(2000:1000) Size: 0x1000 (4096) bytes.
        ResolveHeap:        00007ffbfc2ec000(54000:8000) Size: 0x8000 (32768) bytes.
        DispatchHeap:       00007ffbfc2bd000(2f000:3000) Size: 0x3000 (12288) bytes.
        CacheEntryHeap:     00007ffbfc2b4000(7000:1000) Size: 0x1000 (4096) bytes.
      Total size:           Size: 0x37000 (225280) bytes total, 0x1000 (4096) bytes wasted.
      --------------------------------------
      Jit code heap:
      LoaderCodeHeap:       0000000000000000(0:0) Size: 0x0 (0) bytes.
      Total size:           Size: 0x0 (0) bytes.
      --------------------------------------
      Module Thunk heaps:
      Module 00007ffc592f1000: Size: 0x0 (0) bytes.
      Module 00007ffbfc2a4140: Size: 0x0 (0) bytes.
      Module 00007ffc192a1000: Size: 0x0 (0) bytes.
      Module 00007ffc1aa31000: Size: 0x0 (0) bytes.
      Module 00007ffc54e51000: Size: 0x0 (0) bytes.
      Module 00007ffc18d91000: Size: 0x0 (0) bytes.
      Module 00007ffc4e2d1000: Size: 0x0 (0) bytes.
      Module 00007ffc36e51000: Size: 0x0 (0) bytes.
      Module 00007ffc4c0d1000: Size: 0x0 (0) bytes.
      Module 00007ffc4b821000: Size: 0x0 (0) bytes.
      Module 00007ffc17241000: Size: 0x0 (0) bytes.
      Module 00007ffbfc44bce8: Size: 0x0 (0) bytes.
      Module 00007ffbfc44c6a8: Size: 0x0 (0) bytes.
      Module 00007ffbfc44d2b8: Size: 0x0 (0) bytes.
      Module 00007ffc13541000: Size: 0x0 (0) bytes.
      Module 00007ffc22ef1000: Size: 0x0 (0) bytes.
      Total size:           Size: 0x0 (0) bytes.
      --------------------------------------
      Module Lookup Table heaps:
      Module 00007ffc592f1000: Size: 0x0 (0) bytes.
      Module 00007ffbfc2a4140: Size: 0x0 (0) bytes.
      Module 00007ffc192a1000: Size: 0x0 (0) bytes.
      Module 00007ffc1aa31000: Size: 0x0 (0) bytes.
      Module 00007ffc54e51000: Size: 0x0 (0) bytes.
      Module 00007ffc18d91000: Size: 0x0 (0) bytes.
      Module 00007ffc4e2d1000: Size: 0x0 (0) bytes.
      Module 00007ffc36e51000: Size: 0x0 (0) bytes.
      Module 00007ffc4c0d1000: Size: 0x0 (0) bytes.
      Module 00007ffc4b821000: Size: 0x0 (0) bytes.
      Module 00007ffc17241000: Size: 0x0 (0) bytes.
      Module 00007ffbfc44bce8: Size: 0x0 (0) bytes.
      Module 00007ffbfc44c6a8: Size: 0x0 (0) bytes.
      Module 00007ffbfc44d2b8: Size: 0x0 (0) bytes.
      Module 00007ffc13541000: Size: 0x0 (0) bytes.
      Module 00007ffc22ef1000: Size: 0x0 (0) bytes.
      Total size:           Size: 0x0 (0) bytes.
      --------------------------------------
      Total LoaderHeap size:   Size: 0x57000 (356352) bytes total, 0x1000 (4096) bytes wasted.
      =======================================
      Number of GC Heaps: 1
      generation 0 starts at 0x0000024ce1407c30
      generation 1 starts at 0x0000024ce12a1018
      generation 2 starts at 0x0000024ce12a1000
      ephemeral segment allocation context: none
               segment             begin         allocated              size
      0000024ce12a0000  0000024ce12a1000  0000024ce14edc48  0x24cc48(2411592)
      Large object heap starts at 0x0000024cf12a1000
               segment             begin         allocated              size
      0000024cf12a0000  0000024cf12a1000  0000024cf12c31f0  0x221f0(139760)
      Total Size:              Size: 0x26ee38 (2551352) bytes.
      ------------------------------
      GC Heap Size:    Size: 0x26ee38 (2551352) bytes.

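    • The output lists the addresses and sizes of the loader heaps of each application domain, the JIT code heap, the per-module heaps, and the GC heap.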
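
The totals in this output are plain sums of the parts, which is worth checking once to see how the sections relate. The Python sketch below redoes the addition with the sizes from the sample; the dictionary keys are labels I chose for illustration, not names printed by SOS.

```python
# Per-domain loader heap totals from the !eeheap sample (the JIT code heap and
# the per-module heaps are all 0 bytes in this dump, so they are left out).
loader_heaps = {
    "System Domain": 0x10000,
    "Shared Domain": 0x10000,
    "Domain 1": 0x37000,
}

# Segment sizes from the GC section of the same sample.
gc_segments = {
    "small object heap segment": 0x24CC48,
    "large object heap segment": 0x221F0,
}

loader_total = sum(loader_heaps.values())
gc_total = sum(gc_segments.values())

print(f"Total LoaderHeap size: {loader_total:#x} ({loader_total}) bytes")
print(f"GC Heap Size:          {gc_total:#x} ({gc_total}) bytes")
```

Both results match the "Total LoaderHeap size" (0x57000) and "GC Heap Size" (0x26ee38) lines of the sample, so each segment's size column can be read as allocated minus begin and simply added up.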
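  4. dumpheap command
    • Purpose: lists the objects on the managed heap. Very useful.
    • DumpHeap [-stat] [-mt <MethodTable address>] [-type <partial type name>]
      • By default, every object is listed individually. The output is very large, so the command is normally run with options that limit it.
      • -stat: prints a summary grouped by object type; this is the most commonly used option.
      • -mt: filters objects by MethodTable address.
      • -type: filters objects by type name. The match is case-sensitive, and a partial type name is enough.
    • Sample output and interpretation:
      0:001> !dumpheap -stat -mt 00007ffbee0abd78
      Statistics:
                    MT    Count    TotalSize Class Name
      00007ffbee0abd78    34150      1366000 System.Net.SocketAddress
      Total 34150 objects

      0:001> !dumpheap -stat -type Net.SocketAddress
      Statistics:
                    MT    Count    TotalSize Class Name
      00007ffbee0abd78    34150      1366000 System.Net.SocketAddress
      Total 34150 objects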
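
A -stat row is easy to post-process, for example to get the average size of one instance. The following Python sketch parses the System.Net.SocketAddress row from the sample, assuming the columns are separated by whitespace as WinDbg prints them; the function name parse_stat_row is hypothetical.

```python
def parse_stat_row(row):
    """Split one '!dumpheap -stat' row into (MT, Count, TotalSize, Class Name)."""
    # Split on whitespace at most 3 times so a class name with spaces stays whole.
    mt, count, total_size, class_name = row.split(None, 3)
    return mt, int(count), int(total_size), class_name


# The data row from the sample output, with its columns separated.
row = "00007ffbee0abd78    34150      1366000 System.Net.SocketAddress"
mt, count, total_size, class_name = parse_stat_row(row)
average = total_size / count
print(f"{class_name}: {count} objects, {average:.1f} bytes each on average")
```

Note that TotalSize counts only the objects themselves, not the objects they reference.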

  5. dumpobj command