A First Look at AddressSanitizer

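Introduction
AddressSanitizer is a tool developed by Google for detecting memory access errors such as use-after-free, and it also reports memory leaks. It has been built into GCC since version 4.8 and works with C and C++ code. AddressSanitizer tracks memory allocations through instrumentation at run time, which means the code must be built with AddressSanitizer to take advantage of it.
Memory leaks increase the total memory a program uses, so it is important to free memory once it is no longer needed. Losing a few bytes may not matter in a small program, but avoiding leaks becomes increasingly important in long-running programs that use a lot of memory. A program that fails to free memory it no longer needs may eventually run out of memory and terminate prematurely. AddressSanitizer can help detect these leaks.
AddressSanitizer also detects use-after-free errors, which occur when a program reads or writes memory that has already been freed. This is undefined behavior and can lead to data corruption, incorrect results, or a crash.
When an error is detected, the program prints an error message to stderr and exits with a non-zero exit code. AddressSanitizer stops at the first error it detects.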
Notes

  • ASan uses its own memory allocator (malloc, free, and so on)
  • ASan uses a large amount of virtual address space (20T on x86_64 Linux)
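Usage
The -fsanitize=address flag tells the compiler to add AddressSanitizer instrumentation. Compiling with debug symbols also helps: add the -g flag so that AddressSanitizer can print line numbers. If the stack traces do not look right, add -fno-omit-frame-pointer to get more accurate call stacks. A single command that compiles and links the executable: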
g++ main.cpp -o main -g -fsanitize=address

Alternatively, split the build into separate compile and link steps:
g++ -c main.cpp -fsanitize=address -g -fno-omit-frame-pointer
g++ main.o -o main -fsanitize=address

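Note that both the compile step and the link step need the -fsanitize=address flag. With a more complex build system, put these flags in the CXXFLAGS and LDFLAGS environment variables.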
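One way to confirm that a translation unit was really compiled with the flag is to check the compiler's predefined macros. The sketch below assumes GCC's `__SANITIZE_ADDRESS__` macro and Clang's `__has_feature(address_sanitizer)` check; the names `BUILT_WITH_ASAN`, `kBuiltWithAsan`, and `asan_status` are made up for illustration.

```cpp
#include <cassert>
#include <cstring>

// GCC defines __SANITIZE_ADDRESS__ when -fsanitize=address is in effect.
// Clang exposes the same information through __has_feature.
#if defined(__SANITIZE_ADDRESS__)
#define BUILT_WITH_ASAN 1
#elif defined(__has_feature)
#if __has_feature(address_sanitizer)
#define BUILT_WITH_ASAN 1
#endif
#endif

#ifndef BUILT_WITH_ASAN
#define BUILT_WITH_ASAN 0
#endif

// Hypothetical constant: true when this translation unit is instrumented.
constexpr bool kBuiltWithAsan = (BUILT_WITH_ASAN != 0);

// Hypothetical helper that turns the constant into a printable status string.
inline const char *asan_status() {
  return kBuiltWithAsan ? "ASan enabled" : "ASan disabled";
}
```

Printing `asan_status()` at startup, for example, gives a quick check that the flags from CXXFLAGS actually reached the compiler.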
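Performance
AddressSanitizer is much faster than tools that perform similar analysis, such as valgrind. For better performance, build with -O1 or a higher optimization level.
However, if AddressSanitizer is too slow for some of the code, a function attribute can disable it for specific functions. This makes it possible to keep AddressSanitizer on the colder parts of the code while auditing the hot paths by hand.
The attribute that excludes a function from analysis is __attribute__((no_sanitize_address)).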
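As a minimal sketch of how the attribute is applied, the function below is excluded from instrumentation. The function name `checksum` and its body are hypothetical stand-ins for a hot path; only the attribute itself comes from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical hot-path function. ASan does not instrument the memory
// accesses inside it, so an out-of-bounds access here would go unreported.
__attribute__((no_sanitize_address))
std::uint32_t checksum(const std::uint8_t *data, std::size_t size) {
  std::uint32_t sum = 0;
  for (std::size_t i = 0; i < size; ++i) {
    sum = sum * 31u + data[i];
  }
  return sum;
}
```

Because the function is no longer checked, it is worth keeping such functions small and reviewing them carefully.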
Error types
1. Use after free
Memory is used after it has been freed.
int main(int argc, char **argv) {
  int *array = new int[100];
  delete [] array;
  return array[argc];  // BOOM
}

=================================================================
==3262==ERROR: AddressSanitizer: heap-use-after-free on address 0x614000000044 at pc 0x55c005566d89 bp 0x7fffc64dc040 sp 0x7fffc64dc030
READ of size 4 at 0x614000000044 thread T0
    #0 0x55c005566d88 in main /root/study/cmakeutils/src/main.cpp:6
    #1 0x7fdb76b17082 in __libc_start_main ../csu/libc-start.c:308
    #2 0x55c005566c4d in _start (/root/study/cmakeutils/build/main+0xdc4d)
0x614000000044 is located 4 bytes inside of 400-byte region [0x614000000040,0x6140000001d0)
freed by thread T0 here:
    #0 0x7fdb77396b97 in operator delete[](void*) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:163
    #1 0x55c005566d3c in main /root/study/cmakeutils/src/main.cpp:5
    #2 0x7fdb76b17082 in __libc_start_main ../csu/libc-start.c:308
previously allocated by thread T0 here:
    #0 0x7fdb77396097 in operator new[](unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:102
    #1 0x55c005566d25 in main /root/study/cmakeutils/src/main.cpp:4
    #2 0x7fdb76b17082 in __libc_start_main ../csu/libc-start.c:308
...

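The trace under "READ of size 4" (main.cpp:6) shows where the memory was read: return array[argc];
The trace under "freed by thread T0 here" (main.cpp:5) shows where the memory was freed: delete [] array;
The trace under "previously allocated by thread T0 here" (main.cpp:4) shows where the memory was allocated: int *array = new int[100].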
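One possible way to avoid this class of bug, sketched here under the assumption that the array is only needed inside one function, is to let std::unique_ptr own the array and copy the value out before the owner goes away. The name `read_element` and the error value -1 are illustrative choices, not part of the original example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Reads one element while the array is still alive. The unique_ptr frees the
// array when the function returns, after the value has been copied out.
int read_element(std::size_t index) {
  auto array = std::make_unique<int[]>(100);  // elements are value-initialized to 0
  array[42] = 7;
  if (index >= 100) {
    return -1;  // hypothetical error value for an out-of-range index
  }
  return array[index];
}
```

There is no explicit delete [] in this version, so there is no point after which the pointer is dangling but still reachable.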
2. Heap buffer overflow
Memory is allocated on the heap and an array index goes beyond the allocated range.
int main(int argc, char **argv) {
  int *array = new int[100];
  array[0] = 0;
  int res = array[argc + 100];  // BOOM
  delete [] array;
  return res;
}

=================================================================
==3407==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6140000001d4 at pc 0x55753d9b4dbb bp 0x7ffe7d1e77e0 sp 0x7ffe7d1e77d0
READ of size 4 at 0x6140000001d4 thread T0
    #0 0x55753d9b4dba in main /root/study/cmakeutils/src/main.cpp:6
    #1 0x7f9f5683b082 in __libc_start_main ../csu/libc-start.c:308
    #2 0x55753d9b4c4d in _start (/root/study/cmakeutils/build/main+0xdc4d)
0x6140000001d4 is located 4 bytes to the right of 400-byte region [0x614000000040,0x6140000001d0)
allocated by thread T0 here:
    #0 0x7f9f570ba097 in operator new[](unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:102
    #1 0x55753d9b4d25 in main /root/study/cmakeutils/src/main.cpp:4
    #2 0x7f9f5683b082 in __libc_start_main ../csu/libc-start.c:308
...

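The trace under "READ of size 4" (main.cpp:6) shows where the memory was read: int res = array[argc + 100];
The trace under "allocated by thread T0 here" (main.cpp:4) shows where the memory was allocated: int *array = new int[100];
The valid indices of array are 0 to 99, and argc + 100 >= 100, so the access is out of bounds.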
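A bounds-checked container is one way to turn this silent overflow into a reportable error. The sketch below swaps the raw array for std::vector and uses its at() member, which throws std::out_of_range for an invalid index; the helper name `can_read` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if reading `index` from a 100-element vector is in range.
// std::vector::at throws std::out_of_range instead of reading past the end.
bool can_read(std::size_t index) {
  std::vector<int> array(100, 0);
  try {
    (void)array.at(index);
    return true;
  } catch (const std::out_of_range &) {
    return false;
  }
}
```

With this version, the index argc + 100 from the example above would raise an exception rather than read memory outside the allocation.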
3. Stack buffer overflow
An array index goes out of range on a local (stack) variable.
int main(int argc, char **argv) {
  int stack_array[100];
  stack_array[1] = 0;
  return stack_array[argc + 100];  // BOOM
}

=================================================================
==3529==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fff4c128d44 at pc 0x55ccafbf0e13 bp 0x7fff4c128b60 sp 0x7fff4c128b50
READ of size 4 at 0x7fff4c128d44 thread T0
    #0 0x55ccafbf0e12 in main /root/study/cmakeutils/src/main.cpp:6
    #1 0x7f624dc97082 in __libc_start_main ../csu/libc-start.c:308
    #2 0x55ccafbf0c0d in _start (/root/study/cmakeutils/build/main+0xdc0d)
Address 0x7fff4c128d44 is located in stack of thread T0 at offset 452 in frame
    #0 0x55ccafbf0cd8 in main /root/study/cmakeutils/src/main.cpp:3
  This frame has 1 object(s):
    [48, 448) 'stack_array' (line 4) <== Memory access at offset 452 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
...

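The trace under "READ of size 4" (main.cpp:6) shows where the memory was read: return stack_array[argc + 100];
The "This frame has 1 object(s)" section identifies the variable that was overflowed: int stack_array[100].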
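For a stack buffer, an explicit index check before the access is often enough. The following sketch assumes std::array in place of the raw array and a caller-supplied fallback value; `read_or` and `demo` are hypothetical names used only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Returns the element at `index`, or `fallback` when the index is outside
// the array, so the stack buffer is never read out of bounds.
int read_or(const std::array<int, 100> &stack_array, std::size_t index,
            int fallback) {
  return index < stack_array.size() ? stack_array[index] : fallback;
}

// Mirrors the example above: the index argc + 100 is always out of range,
// so the fallback value is returned instead of reading past the array.
int demo(int argc) {
  std::array<int, 100> stack_array{};  // zero-initialized stack buffer
  stack_array[1] = 5;
  return read_or(stack_array, static_cast<std::size_t>(argc) + 100, -1);
}
```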
4. Global buffer overflow
An array index goes out of range on a global variable.
int global_array[100] = {-1};
int main(int argc, char **argv) {
  return global_array[argc + 100];  // BOOM
}

=================================================================
==3653==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55b61f0391b4 at pc 0x55b61efd7d2b bp 0x7fff8bc1cbd0 sp 0x7fff8bc1cbc0
READ of size 4 at 0x55b61f0391b4 thread T0
    #0 0x55b61efd7d2a in main /root/study/cmakeutils/src/main.cpp:5
    #1 0x7f0637717082 in __libc_start_main ../csu/libc-start.c:308
    #2 0x55b61efd7c0d in _start (/root/study/cmakeutils/build/main+0xdc0d)
0x55b61f0391b4 is located 4 bytes to the right of global variable 'global_array' defined in '/root/study/cmakeutils/src/main.cpp:3:5' (0x55b61f039020) of size 400
...

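The trace under "READ of size 4" (main.cpp:5) shows where the memory was read: return global_array[argc + 100];
The "is located ... of global variable" line identifies the global variable and where it is defined: int global_array[100] = {-1}.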
5. Use after return
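A pointer refers to a local variable of a function and is used after the function has returned, when the local variable no longer exists.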
// Not detected by default; set ASAN_OPTIONS=detect_stack_use_after_return=1 to enable detection.
int *ptr;
__attribute__((noinline))
void FunctionThatEscapesLocalObject() {
  int local[100];
  ptr = &local[0];
}
int main(int argc, char **argv) {
  FunctionThatEscapesLocalObject();
  return ptr[argc];
}

=================================================================
==3811==ERROR: AddressSanitizer: stack-use-after-return on address 0x7fd77133e234 at pc 0x555fb157be71 bp 0x7fffdb165710 sp 0x7fffdb165700
READ of size 4 at 0x7fd77133e234 thread T0
    #0 0x555fb157be70 in main /root/study/cmakeutils/src/main.cpp:11
    #1 0x7fd7746db082 in __libc_start_main ../csu/libc-start.c:308
    #2 0x555fb157bc0d in _start (/root/study/cmakeutils/build/main+0xdc0d)
Address 0x7fd77133e234 is located in stack of thread T0 at offset 52 in frame
    #0 0x555fb157bcd8 in FunctionThatEscapesLocalObject() /root/study/cmakeutils/src/main.cpp:4
  This frame has 1 object(s):
    [48, 448) 'local' (line 5) <== Memory access at offset 52 is inside this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
...

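The trace under "READ of size 4" (main.cpp:11) shows where the memory was read: return ptr[argc];
The "This frame has 1 object(s)" section identifies the variable: int local[100];
Here ptr is set to the address of a local variable with ptr = &local[0], and that address is accessed after FunctionThatEscapesLocalObject has returned.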
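Instead of setting ASAN_OPTIONS in the environment for every run, the option can be compiled into the program. The sketch below relies on the `__asan_default_options` hook from the sanitizer runtime; marking it with no_sanitize_address is a precaution assumed here because the hook runs very early during startup.

```cpp
#include <cassert>
#include <cstring>

// The ASan runtime calls this hook at startup, if the program defines it, and
// parses the returned string the same way as the ASAN_OPTIONS variable.
extern "C" __attribute__((no_sanitize_address))
const char *__asan_default_options() {
  return "detect_stack_use_after_return=1";
}
```

In a build without AddressSanitizer the function is simply never called, so it is harmless to leave in the source.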
6. Use after scope
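A pointer refers to a variable declared in a nested scope and is used after that scope has exited, when the variable is no longer alive.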
volatile int *p = 0;
int main() {
  {
    int x = 0;
    p = &x;
  }
  *p = 5;
  return 0;
}

=================================================================
==3922==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffecd93f880 at pc 0x5616c0570de0 bp 0x7ffecd93f850 sp 0x7ffecd93f840
WRITE of size 4 at 0x7ffecd93f880 thread T0
    #0 0x5616c0570ddf in main /root/study/cmakeutils/src/main.cpp:10
    #1 0x7f2ccf8c3082 in __libc_start_main ../csu/libc-start.c:308
    #2 0x5616c0570c0d in _start (/root/study/cmakeutils/build/main+0xdc0d)
Address 0x7ffecd93f880 is located in stack of thread T0 at offset 32 in frame
    #0 0x5616c0570cd8 in main /root/study/cmakeutils/src/main.cpp:5
  This frame has 1 object(s):
    [32, 36) 'x' (line 7) <== Memory access at offset 32 is inside this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
...

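The trace under "WRITE of size 4" (main.cpp:10) shows where the memory was written: *p = 5;
The "This frame has 1 object(s)" section identifies the variable: int x = 0;
The scope of x is the inner {} block. Because *p = 5 runs outside that block, x is no longer alive when it is written.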
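One way to fix this example is to declare the variable in the scope where the pointer is used, so that its lifetime covers the write. This is a minimal sketch; the function name `write_through_pointer` is hypothetical.

```cpp
#include <cassert>

// `x` is declared in the enclosing scope, so it is still alive when the
// pointer is dereferenced after the inner block ends.
int write_through_pointer() {
  int x = 0;
  int *p = nullptr;
  {
    p = &x;  // p points to a variable that outlives this block
  }
  *p = 5;
  return x;
}
```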
7. Memory leaks
Memory is allocated but never freed.
#include <stdlib.h>
void *p;
int main() {
  p = malloc(7);
  p = 0;  // The memory is leaked here.
  return 0;
}

=================================================================
==4076==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 7 byte(s) in 1 object(s) allocated from:
    #0 0x7f799fcff527 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x55a10f15acfa in main /root/study/cmakeutils/src/main.cpp:6
    #2 0x7f799f482082 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: 7 byte(s) leaked in 1 allocation(s).

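The trace under "Direct leak" (main.cpp:6) shows where the leaked memory was allocated: p = malloc(7);
The SUMMARY line reports the total amount of leaked memory.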
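A common way to avoid this kind of leak in C++ is to tie the allocation to an owning object. The sketch below wraps the malloc'ed block in std::unique_ptr with a deleter that calls std::free; `FreeDeleter`, `MallocBuffer`, `make_buffer`, and `fill_and_measure` are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// Deleter that releases memory obtained from malloc.
struct FreeDeleter {
  void operator()(void *p) const { std::free(p); }
};

// Owns a malloc'ed buffer and frees it when the owner is destroyed.
using MallocBuffer = std::unique_ptr<char, FreeDeleter>;

MallocBuffer make_buffer(std::size_t size) {
  return MallocBuffer(static_cast<char *>(std::malloc(size)));
}

// Allocates 7 bytes, uses them, and releases them automatically on return.
std::size_t fill_and_measure() {
  MallocBuffer p = make_buffer(7);
  if (!p) {
    return 0;  // malloc failed
  }
  std::memcpy(p.get(), "asan!!", 7);  // 6 characters plus the terminator
  return std::strlen(p.get());
}
```

Overwriting or dropping `p` in this version frees the old block first, so the assignment that leaked memory in the example above no longer loses it.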
Summary
Writing memory-safe code has long been a difficult problem. C and C++ are not memory-safe languages, so these issues come up often. AddressSanitizer makes the related bugs easier to locate and fix, which helps projects move faster.
