内存泄漏介绍

什么是内存泄漏

内存泄漏是指程序分配了一块内存（通常是动态分配的堆内存），但在不再需要这块内存的情况下未将其释放。内存泄漏会导致程序浪费系统内存资源，持续的内存泄漏还导致系统内存的逐渐耗尽，最终导致程序或系统崩溃。

内存泄漏和常驻内存区别

常驻内存（Resident Set）是指进程在运行期间占用的内存大小，包括进程使用的代码、数据和其他资源。常驻内存是进程在运行期间一直驻留在内存中的部分，即使在进程不活动时也不会被释放。常驻内存通常不会带来显著的负面影响。

程序与进程里的内存布局

下图是源码与 ELF（可执行可链接）文件以及运行起来后内存布局的简易映射关系图。程序中的初始化全局变量和局部静态变量被编译到 .data，未初始化的全局变量和局部静态变量编译后放在 .bss 段，代码主体和函数主题存放在 .text 段，ELF 文件内实际有很多段（参考《程序员的自我修养-链接，装载与库》第三章）。当程序运行时，会将 ELF 文件加载到内存。不同的段会加载到内存布局中的不同位置，其中 heap 这部分就是程序员手动去动态申请和释放内存的部分。当程序员用 malloc 函数申请了一块内存，使用完之后却没有 free 它的时候，就会发生内存泄漏。内存泄漏得越多，进程中可以使用内存的空间就越少，时间长了就会导致系统响应慢，甚至程序崩溃。

如何“观察”内存泄漏是否发生？

在 Android 系统上通常可以用 dumpsys meminfo 命令查看进程的内存使用数据，重复 dump 后从数据的变化情况来大致判断是否有内存泄漏。

也可以借助python 或者其他一些工具将数据可视化方便查看数据变化趋势。但这只能大致的给你展示数据变化的趋势，而非直接明白的告诉你是否发生了内存泄漏。因此我们需要更精确的工具来检测是否也有内存泄漏。

常见的内存检测工具介绍

本节我们将依次介绍 Malloc Debug, libmemunreachable, Asan, HwASan, MTE, Heapprofd, Memcheck(Valigrind) 内存泄漏检测工具(https://source.android.com/docs/core/tests/debug/native-memory?hl=zh-cn)

Malloc Debug

简介

Malloc Debug 是一种调试本机内存问题的方法。它可以帮助检测内存损坏、内存泄漏和释放后使用问题。 Malloc Debug 通过对常规的 allocation 函数包装了一层来记录和分析内存的申请和释放。这些函数包括：malloc, free, calloc, realloc, posix_memalign, memalign, aligned_alloc, malloc_usable_size

使用方法

运行程序前的设置

adb shell setprop libc.debug.malloc.options "\"backtrace guard leak_track backtrace_dump_on_exit backtrace_dump_prefix=/sdcard/heap"\"
adb shell setprop libc.debug.malloc.program xxx(进程名)

参数介绍：

backtrace: 开启堆栈记录。
guard: 开启内存越界检测。
leak_track: 程序在退出时，如有内存泄漏，而不产生abort。
backtrace_dump_on_exit: 程序退出时dump堆栈和内存信息。
backtrace_dump_prefix: dump文件存放的路径和文件名的开头字符。如本处生成的文件放在/sdcard/目录下，文件名开头为heap字样，注意指定的路径要有写权限。
libc.debug.malloc.program: 用于设置检测的程序，不设置则检测所有的运行的程序。

执行待测试程序

离线程序
离线程序运行完成后会在 backtrace_dump_prefix 设定的路径下存储 dump 文件

在线程序
需要先停掉程序所在的进程，再重启该进程才会生效。
由于在线程序一般不会主动退出（如 camerahalserver），需要使用命令来主动触发 dump。
命令：kill -47 xxx(进程ID)，注意多次触发新文件覆盖之前的文件。当你的程序有内存泄漏问题的话，输出如下报告：

E malloc_debug: +++ memtest leaked block of size 48 at 0x7a4a6a42e0 (leak 1 of 2)
E malloc_debug: Backtrace at time of allocation:
E malloc_debug:           #00  pc 000000000004461c  /apex/com.android.runtime/lib64/bionic/libc.so (malloc+76)
E malloc_debug:           #01  pc 00000000000c83e8  /apex/com.android.runtime/lib64/bionic/libc.so (__register_atfork+40)
E malloc_debug:           #02  pc 000000000005460c  /apex/com.android.runtime/lib64/bionic/libc.so
E malloc_debug:           #03  pc 00000000000613a0  /apex/com.android.runtime/bin/linker64
E malloc_debug:           #04  pc 0000000000061144  /apex/com.android.runtime/bin/linker64
E malloc_debug:           #05  pc 0000000000061144  /apex/com.android.runtime/bin/linker64
E malloc_debug:           #06  pc 00000000000d5f14  /apex/com.android.runtime/bin/linker64
E malloc_debug:           #07  pc 00000000000d4e0c  /apex/com.android.runtime/bin/linker64
E malloc_debug:           #08  pc 0000000000064004  /apex/com.android.runtime/bin/linker64
E malloc_debug: +++ memtest leaked block of size 20 at 0x793a6ae9a0 (leak 2 of 2)
E malloc_debug: Backtrace at time of allocation:
E malloc_debug:           #00  pc 000000000004461c  /apex/com.android.runtime/lib64/bionic/libc.so (malloc+76)
E malloc_debug:           #01  pc 00000000000100b8  /data/local/tmp/memtest/memtest
E malloc_debug:           #02  pc 00000000000546e8  /apex/com.android.runtime/lib64/bionic/libc.so (__libc_init+104)
E malloc_debug: Dumping to file: /sdcard/heap.19748.exit.txt

注意报告中并不是所有的 leak 都是真正的内存泄漏，有些可能是常驻内存，开发者需要自己判断。还需注意dump 路径要有写权限很多时候在线运行环境下so 是无符号的程序，我们需要解析 dump 文件定位代码行号 python3 native_heapdump_viewer.py --symbols ./symboldir/ ./heap.4169.exit.txt --html > memtest4169.html

–symbols 指定的是符号库/程序的路径，子目录的路径必须要在手机上的路径一致。比如可执行程序在手机里的路径是/vendor/bin/memtest，那解释时它的带符号的程序路径上需要是 ./symboldir/vendor/bin/memtest 检测出来的并不是都是泄漏，一部分是属于常驻内存，尤其对于在线程序，我们需要将程序运行不同的次数，抓出不同的log来做对比，找出真正增长的部分。

libmemunreachable

简介

Android 的 libmemunreachable 是一个零开销的本地内存泄漏检测器。它会在触发内存检测的时候遍历进程内存，同时将任何不可访问的块报告为泄漏。

命令行方式使用

设置属性

adb root
adb shell setprop libc.debug.malloc.program app_process
adb shell setprop wrap.[process] "\$\@“
adb shell setprop libc.debug.malloc.options backtrace=4

参数

backtrace_size 只收集泄漏指定 size 大小的 backtrace
backtrace_min_size=192 backtrace_max_size=320 收集泄漏 size 介于两者之间的backtrec

重启应用，执行 dumpsys -t 600 meminfo –unreachable [process].（自测没有 dump 出预期结果）。下面是一个带有内存问题的输出结果。

 Unreachable memory
  24 bytes in 2 unreachable allocations
  ABI: 'arm64'

  24 bytes unreachable at 71d37787d0
   first 20 bytes of contents:
   71d37787d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
   71d37787e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

  24 bytes unreachable at 71d37797d0
   first 20 bytes of contents:
   71d37797d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
   71d37797e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

添加代码方式使用

官方提供4个接口来检测内存 C interface

bool LogUnreachableMemory(bool log_contents, size_t limit)
bool NoLeaks()

C++ interface

bool GetUnreachableMemory(UnreachableMemoryInfo& info, size_t limit = 100)
std::string GetUnreachableMemoryString(bool log_contents = false, size_t limit = 100)

核心函数是 GetUnreachableMemory() 其他三个函数内部都会调用此函数。在使用添加代码的方式打印时，需要在编译代码时需要将 libmemunreachable.so 添加到动态依赖，libmemunreachable.so 文件可以在手机 /system/lib64/libmemunreachable.so 获取。

例子：以下是一个包含内存泄漏的例子，在f 函数中申请了x, y 两块内存，在函数返回前x 被释放，y 赋值后没有被释放。

#include "./memunreachable.h"

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

using namespace android;

void f(void);
void f(void) {
  printf("[memtest] function f\n");
  int* x = (int*)malloc(10 * sizeof(int));
  x[0] = 0;
  int* y = (int*)malloc(5 * sizeof(int));
  y[0] = 0;
  y[1] = 1;
  y[2] = 2;
  y[3] = 3;
  y[4] = 4;
  free(x);
}

int main(void) {
  printf("[memtest] hello main\n");
  f();

  // C interface
  printf("LogUnreachableMemory()\n");
  LogUnreachableMemory(true, 100);

  return 0;
}

adb log 输出（考虑排版省去时间戳） log 里显示有一个 20 bytes 的内存泄漏，20 正是5 个 int 的大小，对应申请但没有释放的 y 地址的内存。

// 新建 Collection process
31231 I libmemunreachable: collecting thread info for process 31231...
31231 I libmemunreachable: collection thread done
// fork 进程运行 sweeping process 
31233 I libmemunreachable: searching process 31231 for allocations
31233 I libmemunreachable: searching done
31233 I libmemunreachable: sweeping process 31231 for unreachable memory
31233 I libmemunreachable: sweeping done
31233 I libmemunreachable: folding related leaks
31233 I libmemunreachable: folding done
// 回到 Original process 接收检测结果
31231 I libmemunreachable: unreachable memory detection done
31231 E libmemunreachable: 20 bytes in 1 allocation unreachable out of 1260 bytes in 7 allocations
31231 E libmemunreachable:   20 bytes unreachable at 7a03454400
31231 E libmemunreachable:    contents:
31231 E libmemunreachable:    7a03454400: 00 00 00 00 01 00 00 00 02 00 00 00 03 00 00 00 ................
31231 E libmemunreachable:    7a03454410: 04 00 00 00                                     ....
31231 E libmemunreachable:           #00  pc 000000000003e238  /apex/com.android.runtime/lib64/bionic/libc.so (malloc+84)
31231 E libmemunreachable:           #01  pc 00000000000100b8  /data/local/tmp/memtest/memtest_libmemunreachable
31231 E libmemunreachable:           #02  pc 000000000004aa48  /apex/com.android.runtime/lib64/bionic/libc.so (__libc_init+100)

调用 bool LogUnreachableMemory(bool log_contents, size_t limit) 时，log_contents 传 true 会打印泄露地址的内容，也就是 contents 对应的两行内容。之后可以用 address2line 解析行号。

其他内存检测工具简介

ASan

Asan(AddressSanitizer) 适用于检测内存越界访问、缓冲区溢出、内存泄漏等问题。它是一个在编译时插入的工具；
运行时有一定的性能开销（约增加两倍），代码大小和内存均有额外开销;
需要重新编译程序，编译时添加 address 相关选项；
可用于 linux 和 android，但在 android 上逐步被 HwASan 取代；
需要刷与 ASan 兼容的 ROM；
不再受支持，即使有bug 也不会修复；

HWASan

HWASan 利用硬件特性，适用于检测内存错误，类似于 ASan，但能够更高效地运行在一些支持硬件特性的平台上。
性能开销和 Asan 接近，但内存占用更小；
需要重新编译程序，编译时添加 hwaddress 相关选项
仅适用于 Android 10 及更高版本，AArch64 硬件；
需要刷与 HWASan 兼容的 ROM；

MTE

MTE(Memory Tagging Extension) 使用硬件标签来检测内存错误，主要专注于检测内存越界访问。
提供了较低的性能开销，首次具备了线上部署的可能。
无需重新构建代码来检测堆错误（但需要重新构建代码来检测堆栈错误）
Android 系统在 Arm v9 上开始支持，仅适用于64位应用/程序；

Heapprofd

Heapprofd 是一个跟踪给定时间段内 Android 进程的堆分配和释放的工具。
可以借助 Perfetto 抓取，开发人员可以使用该工具调查内存问题（调用栈和内存分配）。当开启连续 dump 后，开发者可以查看程序结束前内存占用是否合理，以检查是否有潜在内存泄漏问题。或者将待测试代码循环执行，比较每执行一次代码段后内存是否有增加，一次判断是否有内存泄漏。 Valgrind 中的Memcheck
Memcheck 是 Valgrind 工具套件中的一个工具，用于检测 C 和 C++ 程序中的内存错误。
内存问题检测比较全面，但对性能影响比较大，耗时增加10x~20x，不适用对时间敏感的程序。
在 Ubuntu 上安装：sudo apt-get install valgrind

使用 memcheck 的基本方法

编译程序时加上 –g 选项，编译优化选项建议选择 -O1;
使用 Valgrind 命令运行程序：valgrind –leak-check=yes myprog arg1 arg2valgrind 使用 –tools 来指定 debug 工具，而 Memcheck 是默认工具，可以省略 –tools=memcheck 选项；

程序运行后输出问题报告。

$ valgrind --leak-check=yes ./memtest_origin 
==19517== Memcheck, a memory error detector
==19517== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==19517== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==19517== Command: ./memtest_origin
==19517== 
[memtest] hello main
[memtest] function f
==19517== 
==19517== HEAP SUMMARY:
==19517==     in use at exit: 20 bytes in 1 blocks
==19517==   total heap usage: 3 allocs, 2 frees, 1,084 bytes allocated
==19517== 
==19517== 20 bytes in 1 blocks are definitely lost in loss record 1 of 1
==19517==    at 0x4C31B0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19517==    by 0x1086FF: f() (memtest_origin.cc:9)
==19517==    by 0x108731: main (memtest_origin.cc:16)
==19517== 
==19517== LEAK SUMMARY:
==19517==    definitely lost: 20 bytes in 1 blocks
==19517==    indirectly lost: 0 bytes in 0 blocks
==19517==      possibly lost: 0 bytes in 0 blocks
==19517==    still reachable: 0 bytes in 0 blocks
==19517==         suppressed: 0 bytes in 0 blocks
==19517== 
==19517== For counts of detected and suppressed errors, rerun with: -v
==19517== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

内存泄漏检测原理介绍

malloc debug 原理介绍

分为初始化和内存泄漏检测两个阶段介绍。

初始化阶段

整体流程如下图

libc 初始化时通过 __libc_init_malloc 函数调用 MallocInitImpl 来初始化 memory allocation framework。 ```C++ // malloc_common_dynamic.cpp static constexpr char kDebugSharedLib[] = “libc_malloc_debug.so”; static constexpr char kDebugPrefix[] = “debug”; static constexpr char kDebugPropertyOptions[] = “libc.debug.malloc.options”; static constexpr char kDebugPropertyProgram[] = “libc.debug.malloc.program”; static constexpr char kDebugEnvOptions[] = “LIBC_DEBUG_MALLOC_OPTIONS”; … // Initializes memory allocation framework once per process. static void MallocInitImpl(libc_globals* globals) { char prop[PROP_VALUE_MAX]; char* options = prop;

MaybeInitGwpAsanFromLibc(globals);

#if defined(USE_SCUDO) __libc_shared_globals()->scudo_stack_depot = __scudo_get_stack_depot_addr(); __libc_shared_globals()->scudo_region_info = __scudo_get_region_info_addr(); __libc_shared_globals()->scudo_ring_buffer = __scudo_get_ring_buffer_addr(); __libc_shared_globals()->scudo_ring_buffer_size = __scudo_get_ring_buffer_size(); #endif

// Prefer malloc debug since it existed first and is a more complete // malloc interceptor than the hooks. bool hook_installed = false; if (CheckLoadMallocDebug(&options)) { hook_installed = InstallHooks(globals, options, kDebugPrefix, kDebugSharedLib); } else if (CheckLoadMallocHooks(&options)) { hook_installed = InstallHooks(globals, options, kHooksPrefix, kHooksSharedLib); }

if (!hook_installed) { if (HeapprofdShouldLoad()) { HeapprofdInstallHooksAtInit(globals); } } else { // Record the fact that incompatible hooks are active, to skip any later // heapprofd signal handler invocations. HeapprofdRememberHookConflict(); } }

2. CheckLoadMallocDebug() 检查属性是否满足加载 lib_malloc_debug.so 的条件，检查的属性正是前面提到的两个 android prop 属性。
```C++
// malloc_common_dynamic.cpp
static bool CheckLoadMallocDebug(char** options) {
  // If kDebugMallocEnvOptions is set then it overrides the system properties.
  char* env = getenv(kDebugEnvOptions);
  if (env == nullptr || env[0] == '\0') {
    if (__system_property_get(kDebugPropertyOptions, *options) == 0 || *options[0] == '\0') {
      return false;
    }

    // Check to see if only a specific program should have debug malloc enabled.
    char program[PROP_VALUE_MAX];
    if (__system_property_get(kDebugPropertyProgram, program) != 0 &&
        strstr(getprogname(), program) == nullptr) {
      return false;
    }
  } else {
    *options = env;
  }
  return true;
}

InstallHooks() 调用LoadSharedLibrary() 加载 libc_malloc_debug.so；调用 FinishInstallHooks 初始化 malloc_debug 和更新一些全局变量。 ```C++ // malloc_common_dynamic.cpp static bool InstallHooks(libc_globals* globals, const char* options, const char* prefix, const char* shared_lib) { void* impl_handle = LoadSharedLibrary(shared_lib, prefix, &globals->malloc_dispatch_table); if (impl_handle == nullptr) { return false; }

if (!FinishInstallHooks(globals, options, prefix)) { dlclose(impl_handle); return false; } return true; }

4. LoadSharedLibrary()  函数内部 dlopen lib_malloc_debug.so 
5. 之后调用 InitSharedLibrary() 查找如下names 数组中的 symbol，将查找到的 symbol 保存在全局数组变量 gfunctions 中。注意查找的函数都会加上 debug_ 前缀。
```C++
// malloc_common_dynamic.cpp
bool InitSharedLibrary(void* impl_handle, const char* shared_lib, const char* prefix, MallocDispatch* dispatch_table) {
  static constexpr const char* names[] = {
    "initialize",
    "finalize",
    "get_malloc_leak_info",
    "free_malloc_leak_info",
    "malloc_backtrace",
    "write_malloc_leak_info",
  };
  for (size_t i = 0; i < FUNC_LAST; i++) {
    char symbol[128];
    snprintf(symbol, sizeof(symbol), "%s_%s", prefix, names[i]);
    gFunctions[i] = dlsym(impl_handle, symbol);
    if (gFunctions[i] == nullptr) {
      error_log("%s: %s routine not found in %s", getprogname(), symbol, shared_lib);
      ClearGlobalFunctions();
      return false;
    }
  }

  if (!InitMallocFunctions(impl_handle, dispatch_table, prefix)) {
    ClearGlobalFunctions();
    return false;
  }
  return true;
}

InitMallocFunctions() 用来初始化除了上一步names 中的函数以外其他的函数，包括 free，malloc 等。查找到的（加上debug_前缀的）symbol 都存放到 MallocDispatch 对应的函数指针。

// malloc_common_dynamic.cpp
static bool InitMallocFunctions(void* impl_handler, MallocDispatch* table, const char* prefix) {
  if (!InitMallocFunction<MallocFree>(impl_handler, &table->free, prefix, "free")) {
 return false;
  }
  if (!InitMallocFunction<MallocCalloc>(impl_handler, &table->calloc, prefix, "calloc")) {
 return false;
  }
  if (!InitMallocFunction<MallocMallinfo>(impl_handler, &table->mallinfo, prefix, "mallinfo")) {
 return false;
  }
  if (!InitMallocFunction<MallocMallopt>(impl_handler, &table->mallopt, prefix, "mallopt")) {
 return false;
  }
  if (!InitMallocFunction<MallocMalloc>(impl_handler, &table->malloc, prefix, "malloc")) {
 return false;
  }
  if (!InitMallocFunction<MallocMallocInfo>(impl_handler, &table->malloc_info, prefix,
                                             "malloc_info")) {
 return false;
  }
  if (!InitMallocFunction<MallocMallocUsableSize>(impl_handler, &table->malloc_usable_size, prefix,
                                               "malloc_usable_size")) {
 return false;
  }
  ...
}

InitMallocFunction() 内部作的还是去获取 malloc_debug 内的函数符号。

// malloc_common_dynamic.cpp
template<typename FunctionType>
static bool InitMallocFunction(void* malloc_impl_handler, FunctionType* func, const char* prefix, const char* suffix) {
  char symbol[128];
  snprintf(symbol, sizeof(symbol), "%s_%s", prefix, suffix);
  *func = reinterpret_cast<FunctionType>(dlsym(malloc_impl_handler, symbol));
  if (*func == nullptr) {
 error_log("%s: dlsym(\"%s\") failed", getprogname(), symbol);
 return false;
  }
  return true;
}

FinishInstallHooks() 7.1 调用 malloc_debug 的 debug_initialize() 函数初始化 malloc debug； 7.2 更新 libc_globals.default_dispatch_table 和 current_dispatch_table 为 malloc_dispatch_table； 7.3 通过 __cxa_atexit() 注册 MallocFiniImpl()，在进程退出时回调此函数检查内存问题并写入dump 文件。 ```C++ // malloc_common_dynamic.cpp bool FinishInstallHooks(libc_globals* globals, const char* options, const char* prefix) { init_func_t init_func = reinterpret_cast(gFunctions[FUNC_INITIALIZE]);

// If GWP-ASan was initialised, we should use it as the dispatch table for // heapprofd/malloc_debug/malloc_debug. const MallocDispatch* prev_dispatch = GetDefaultDispatchTable(); if (prev_dispatch == nullptr) { prev_dispatch = NativeAllocatorDispatch(); }

if (!init_func(prev_dispatch, &gZygoteChild, options)) { error_log(“%s: failed to enable malloc %s”, getprogname(), prefix); ClearGlobalFunctions(); return false; }

// Do a pointer swap so that all of the functions become valid at once to // avoid any initialization order problems. atomic_store(&globals->default_dispatch_table, &globals->malloc_dispatch_table); if (!MallocLimitInstalled()) { atomic_store(&globals->current_dispatch_table, &globals->malloc_dispatch_table); }

// Use atexit to trigger the cleanup function. This avoids a problem // where another atexit function is used to cleanup allocated memory, // but the finalize function was already called. This particular error // seems to be triggered by a zygote spawned process calling exit. int ret_value = __cxa_atexit(MallocFiniImpl, nullptr, nullptr); if (ret_value != 0) { // We don’t consider this a fatal error. warning_log(“failed to set atexit cleanup function: %d”, ret_value); } return true; }

### 申请/释放内存阶段
其内存泄漏的检测原理可以简单概括为：维护一个记录内存申请和释放的列表，每当申请内存时列表成员+1，内存释放时列表成员-1，程序退出时列表中还存在的成员即内存泄漏的成员。

在调用 malloc 函数时，内部判断如果 dispatch_table 不为空，调用 dispatch_table->malloc(bytes)，否则调用默认malloc 函数。dispatch_table 里面存储的是 “debug_”前缀的lib_malloc_debug.so 里的函数。
```C++
// malloc_common.cpp
extern "C" void* malloc(size_t bytes) {
  auto dispatch_table = GetDispatchTable();
  void *result;
  if (__predict_false(dispatch_table != nullptr)) {
    result = dispatch_table->malloc(bytes);
  } else {
    result = Malloc(malloc)(bytes);
  }
  if (__predict_false(result == nullptr)) {
    warning_log("malloc(%zu) failed: returning null pointer", bytes);
    return nullptr;
  }
  return MaybeTagPointer(result);
}

在 malloc debug 的 debug_malloc() 函数内，内存实际在 InternalMalloc 里申请，并且会根据初始化时配置的选项选择性开启功能。

// malloc_debug.cpp
void* debug_malloc(size_t size) {
  Unreachable::CheckIfRequested(g_debug->config());

  if (DebugCallsDisabled()) {
    return g_dispatch->malloc(size);
  }
  ScopedConcurrentLock lock;
  ScopedDisableDebugCalls disable;
  ScopedBacktraceSignalBlocker blocked;

  TimedResult result = InternalMalloc(size);

  if (g_debug->config().options() & RECORD_ALLOCS) {
    g_debug->record->AddEntry(new MallocEntry(result.getValue<void*>(), size,
                                              result.GetStartTimeNS(), result.GetEndTimeNS()));
  }

  return result.getValue<void*>();
}

InternalMalloc() 实现，可以看到下面代码中有多处根据 g_debug 的成员函数判断要执行的操作。

// malloc_debug.cpp 
static TimedResult InternalMalloc(size_t size) {
  if ((g_debug->config().options() & BACKTRACE) && g_debug->pointer->ShouldDumpAndReset()) {
    debug_dump_heap(android::base::StringPrintf(
                        "%s.%d.txt", g_debug->config().backtrace_dump_prefix().c_str(), getpid())
                        .c_str());
  }

  if (size == 0) {
    size = 1;
  }

  TimedResult result;

  size_t real_size = size + g_debug->extra_bytes();
  if (real_size < size) {
    // Overflow.
    errno = ENOMEM;
    result.setValue<void*>(nullptr);
    return result;
  }

  if (size > PointerInfoType::MaxSize()) {
    errno = ENOMEM;
    result.setValue<void*>(nullptr);
    return result;
  }

  if (g_debug->HeaderEnabled()) {
    result = TCALL(memalign, MINIMUM_ALIGNMENT_BYTES, real_size);
    Header* header = reinterpret_cast<Header*>(result.getValue<void*>());
    if (header == nullptr) {
      return result;
    }
    result.setValue<void*>(InitHeader(header, header, size));
  } else {
    result = TCALL(malloc, real_size);
  }

  void* pointer = result.getValue<void*>();

  if (pointer != nullptr) {
    if (g_debug->TrackPointers()) {
      PointerData::Add(pointer, size);
    }

    if (g_debug->config().options() & FILL_ON_ALLOC) {
      size_t bytes = InternalMallocUsableSize(pointer);
      size_t fill_bytes = g_debug->config().fill_on_alloc_bytes();
      bytes = (bytes < fill_bytes) ? bytes : fill_bytes;
      memset(pointer, g_debug->config().fill_alloc_value(), bytes);
    }
  }

  return result;
}

在 PointData 里维护了一个全局的 pointers_ map。每次申请内存时调用 Add 函数增加 pointers_ 成员，释放内存时调用 Remove 函数移除 pointers_ 成员。申请内存时调用的Add 函数见上面的代码段PointerData::Add(pointer, size);，释放内存时PointerData::Remove(pointer);。

// malloc_debug.cpp 
static TimedResult InternalFree(void* pointer) {
...
  if (g_debug->TrackPointers()) {
    PointerData::Remove(pointer);
  }
...
  return result;
}

退出时调用 debug_finalize() 打印内存泄漏并保存dump 文件调用 LogLeaks() 将内存泄漏信息在log 打印，将dump 文件写入手机存储。

// malloc_debug.cpp 
void debug_finalize() {
  if (g_debug == nullptr) {
    return;
  }

  // Make sure that there are no other threads doing debug allocations
  // before we kill everything.
  ScopedConcurrentLock::BlockAllOperations();

  // Turn off capturing allocations calls.
  DebugDisableSet(true);

  if (g_debug->config().options() & FREE_TRACK) {
    PointerData::VerifyAllFreed();
  }

  if (g_debug->config().options() & LEAK_TRACK) {
    PointerData::LogLeaks();
  }

  if ((g_debug->config().options() & BACKTRACE) && g_debug->config().backtrace_dump_on_exit()) {
    debug_dump_heap(android::base::StringPrintf("%s.%d.exit.txt",
                                                g_debug->config().backtrace_dump_prefix().c_str(),
                                                getpid()).c_str());
  }

  backtrace_shutdown();

  // In order to prevent any issues of threads freeing previous pointers
  // after the main thread calls this code, simply leak the g_debug pointer
  // and do not destroy the debug disable pthread key.
}

LogLeaks() 内部调用 GetList 函数获得 pointers_ 成员，按照 allocation size 排序后返回。

// PointerData.cpp
void PointerData::LogLeaks() {
  std::vector<ListInfoType> list;

  std::lock_guard<std::mutex> pointer_guard(pointer_mutex_);
  std::lock_guard<std::mutex> frame_guard(frame_mutex_);
  GetList(&list, false);

  size_t track_count = 0;
  for (const auto& list_info : list) {
    error_log("+++ %s leaked block of size %zu at 0x%" PRIxPTR " (leak %zu of %zu)", getprogname(),
              list_info.size, list_info.pointer, ++track_count, list.size());
    if (list_info.backtrace_info != nullptr) {
      error_log("Backtrace at time of allocation:");
      UnwindLog(*list_info.backtrace_info);
    } else if (list_info.frame_info != nullptr) {
      error_log("Backtrace at time of allocation:");
      backtrace_log(list_info.frame_info->frames.data(), list_info.frame_info->frames.size());
    }
    // Do not bother to free the pointers, we are about to exit any way.
  }
}

小结

libc 初始化时通过属性控制加载 lib_malloc_debug.so；
替换系统 malloc/free 函数指针，注册退出时的调用的检测函数；
维护一个列表记录每一次的内存申请和释放信息;
每次 malloc 内存时列表成员+1，内存free 时列表成员-1;
程序退出时列表中还存在的成员即是内存泄漏的成员。

libmemunreachable 原理介绍

概述

执行泄漏检测过程所需的步骤序列分为三个 process – original process、collection process 和 sweeper process；
Original process 调用 GetUnreachableMemory 接口；
Collection process 收集内存信息；
Sweeper process 遍历内存信息得到内存泄漏结果返回给Original process；整体流程图接下来我们深入看一下每个步骤做了什么工作。

CaptureThreads() 函数遍历 pid 下所有 tid，调用 ptrace 使得线程的寄存器和内存信息可以被读取；

// ThreadCapture.cpp
bool ThreadCaptureImpl::CaptureThreads() {
  TidList tids{allocator_};

  bool found_new_thread;
  do {
    if (!ListThreads(tids)) {
      ReleaseThreads();
      return false;
    }

    found_new_thread = false;

    for (auto it = tids.begin(); it != tids.end(); it++) {
      auto captured = captured_threads_.find(*it);
      if (captured == captured_threads_.end()) {
        if (CaptureThread(*it) < 0) {
          ReleaseThreads();
          return false;
        }
        found_new_thread = true;
      }
    }
  } while (found_new_thread);

  return true;
}

CapturedThreadInfo() 函数获取线每个线程的 regs 和 stack 内容；

// ThreadCapture.cpp
bool ThreadCaptureImpl::CapturedThreadInfo(ThreadInfoList& threads) {
  threads.clear();

  for (auto it = captured_threads_.begin(); it != captured_threads_.end(); it++) {
    ThreadInfo t{0, allocator::vector<uintptr_t>(allocator_), std::pair<uintptr_t, uintptr_t>(0, 0)};
    if (!PtraceThreadInfo(it->first, t)) {
      return false;
    }
    threads.push_back(t);
  }
  return true;
}

ProcessMappings() 函数读取 pid maps 内容；

// ProcessMappings.cpp
bool ProcessMappings(pid_t pid, allocator::vector<Mapping>& mappings) {
  char map_buffer[1024];
  snprintf(map_buffer, sizeof(map_buffer), "/proc/%d/maps", pid);
  android::base::unique_fd fd(open(map_buffer, O_RDONLY));
  if (fd == -1) {
    return false;
  }
  allocator::string content(mappings.get_allocator());
  ssize_t n;
  while ((n = TEMP_FAILURE_RETRY(read(fd, map_buffer, sizeof(map_buffer)))) > 0) {
    content.append(map_buffer, n);
  }
  ReadMapCallback callback(mappings);
  return android::procinfo::ReadMapFileContent(&content[0], callback);
}

解析后的mapping

// Example of how a parsed line look line:
// 00400000-00409000 r-xp 00000000 fc:00 426998  /usr/lib/gvfs/gvfsd-http

格式和用dumpsys meminfo 得到的内容类似，只是这里通过一个回调函数把他们组装成了 Mapping 的数据结构。

CollectAllocations() 函数

调用 ClassifyMappings() 函数将 mappings 信息按照包含的关键字分类存放到 globals_mappings，heap_mappings，stack_mappings，anon_mappings（没有真正使用）；
将 heap mapping allocation 记录插入到 allocations_ map 里，记录总的 allocation 的范围，以及总的 allocation bytes;
将每一条 globals mapping 和 stack mapping 的 range 插入到 roots_ vector； ```C++

bool MemUnreachable::CollectAllocations(const allocator::vector& threads, const allocator::vector& mappings, const allocator::vector& refs) { MEM_ALOGI("searching process %d for allocations", pid_);

for (auto it = mappings.begin(); it != mappings.end(); it++) { heap_walker_.Mapping(it->begin, it->end); }

allocator::vector heap_mappings{mappings}; allocator::vector anon_mappings{mappings}; allocator::vector globals_mappings{mappings}; allocator::vector stack_mappings{mappings}; if (!ClassifyMappings(mappings, heap_mappings, anon_mappings, globals_mappings, stack_mappings)) { return false; }

for (auto it = heap_mappings.begin(); it != heap_mappings.end(); it++) { MEM_ALOGV(“Heap mapping %” PRIxPTR “-%” PRIxPTR “ %s”, it->begin, it->end, it->name); HeapIterate(*it, & { heap_walker_.Allocation(base, base + size); }); }

for (auto it = anon_mappings.begin(); it != anon_mappings.end(); it++) { MEM_ALOGV(“Anon mapping %” PRIxPTR “-%” PRIxPTR “ %s”, it->begin, it->end, it->name); heap_walker_.Allocation(it->begin, it->end); }

for (auto it = globals_mappings.begin(); it != globals_mappings.end(); it++) { MEM_ALOGV(“Globals mapping %” PRIxPTR “-%” PRIxPTR “ %s”, it->begin, it->end, it->name); heap_walker_.Root(it->begin, it->end); }

for (auto thread_it = threads.begin(); thread_it != threads.end(); thread_it++) { for (auto it = stack_mappings.begin(); it != stack_mappings.end(); it++) { if (thread_it->stack.first >= it->begin && thread_it->stack.first <= it->end) { MEM_ALOGV(“Stack %” PRIxPTR “-%” PRIxPTR “ %s”, thread_it->stack.first, it->end, it->name); heap_walker_.Root(thread_it->stack.first, it->end); } } heap_walker_.Root(thread_it->regs); }

heap_walker_.Root(refs);

MEM_ALOGI(“searching done”);

return true; }

GetUnreachableMemory()
1. 调用 DetectLeaks() 检测泄漏，遍历 roots_ vector 里保存的 mapping ，给在 range 内的 allocator 地址加上可从 root 引用的标记；
2. 调用 Leaked() 遍历总的 allocations_ map 记录，没有被标记引用的记录被认为是泄漏的内存。记录泄漏的数量和泄漏的大小，将记录保存到 leaked vector；
```c++
// MemUnreachable.cpp
bool MemUnreachable::GetUnreachableMemory(allocator::vector<Leak>& leaks, size_t limit,
                                          size_t* num_leaks, size_t* leak_bytes) {
  MEM_ALOGI("sweeping process %d for unreachable memory", pid_);
  leaks.clear();

  if (!heap_walker_.DetectLeaks()) {
    return false;
  }

  allocator::vector<Range> leaked1{allocator_};
  heap_walker_.Leaked(leaked1, 0, num_leaks, leak_bytes);

  MEM_ALOGI("sweeping done");

  MEM_ALOGI("folding related leaks");

  // ...  这部分内容还没有细看，暂时跳过

  MEM_ALOGI("folding done");

  std::sort(leaks.begin(), leaks.end(),
            [](const Leak& a, const Leak& b) { return a.total_size > b.total_size; });

  if (leaks.size() > limit) {
    leaks.resize(limit);
  }

  return true;
}

检测泄漏遍历 roots_ vector 里保存的 mapping ，给在 range 内的 allocator 地址加上可从 root 引用的标记；

// HeapWalker.cpp
bool HeapWalker::DetectLeaks() {
  // Recursively walk pointers from roots to mark referenced allocations
  for (auto it = roots_.begin(); it != roots_.end(); it++) {
    RecurseRoot(*it);
  }

  Range vals;
  vals.begin = reinterpret_cast<uintptr_t>(root_vals_.data());
  vals.end = vals.begin + root_vals_.size() * sizeof(uintptr_t);

  RecurseRoot(vals);

  if (segv_page_count_ > 0) {
    MEM_ALOGE("%zu pages skipped due to segfaults", segv_page_count_);
  }

  return true;
}

// 遍历总的 allocations_ map 记录，没有被标记引用的记录被认为是泄漏的内存。记录泄漏的数量和泄漏的大小，将记录保存到 leaked vector；
bool HeapWalker::Leaked(allocator::vector<Range>& leaked, size_t limit, size_t* num_leaks_out,
                        size_t* leak_bytes_out) {
  leaked.clear();

  size_t num_leaks = 0;
  size_t leak_bytes = 0;
  for (auto it = allocations_.begin(); it != allocations_.end(); it++) {
    if (!it->second.referenced_from_root) {
      num_leaks++;  // 泄漏的数量
      leak_bytes += it->first.end - it->first.begin;  // 泄漏的总大小
    }
  }

  size_t n = 0;
  for (auto it = allocations_.begin(); it != allocations_.end(); it++) {
    if (!it->second.referenced_from_root) {
      if (n++ < limit) {
        leaked.push_back(it->first);  // 泄漏的记录
      }
    }
  }

  if (num_leaks_out) {
    *num_leaks_out = num_leaks;  // 更新输出
  }
  if (leak_bytes_out) {
    *leak_bytes_out = leak_bytes;  // 更新输出
  }

  return true;
}

小结

Original process 调用 GetUnreachableMemory 接口触发内存泄漏检测；
创建 Collection process （共享 Original process 内存空间）收集 regs，stack，heap 内存信息；
收集完毕后恢复 Original process 状态；
创建 Sweeper process 遍历内存信息得到内存泄漏结果；
Sweeper process 将收集到的泄漏信息传送给 Original process；

编码实现内存泄漏检测功能

使用脚本统计 meminfo 判断是否有内存泄漏

使用 bash 或 python 脚本循环抓取指定进程的 meminfo 保存到 txt 文件；
使用 python 脚本解析出txt 文件中的 PSS 信息，借助 pyecharts 或其他可视化三方库将数据以折线图可视化；优点：操作简单。缺点：没有检测结果返回。

C++ 编码统计 meminfo 判断是否有内存泄漏

需要在待测试程序中添加代码。整体分为三个部分：初始化，记录数据，统计数据；

初始化: 设置保存统计数据的路径，记录内存的次数以及保存折线图的间隔；记录数据： 2.1 调用 dumpsys meminfo 接口获得内存信息； 2.2 使用异步线程将记录到的信息按照记录的间隔绘图存储（防止程序奔溃没有保存出数据）； 2.3 每次保存数据时检查内存数据判断是否有泄漏；获取内存泄漏的检查结果：返回是否有内存泄漏的检查结果；

优点：可以控制记录内存泄漏的时机，避免记录大量重复的 meminfo；缺点：内存数据以kb 返回，可能会遗漏小 size 的泄漏。不能定位泄漏的代码行。

Malloc Hooks 自定义内存泄漏检测逻辑

Malloc Hooks 允许程序拦截执行期间发生的所有分配/释放调用。它仅适用于 Android P 及之后的系统。它的流程和Malloc Debug 可以说基本上一样的，只是设置的属性名不一样。有两种方法可以启用这些 hooks，设置系统属性或环境变量，并运行应用程序/程序。 adb shell setprop libc.debug.hooks.enable 1 或 export LIBC_HOOKS_ENABLE=1

初始化过程和 malloc debug 类似，只是判断的属性不同；在malloc hooks 的初始化函数中将从 libc_malloc_hooks.so 解析出来的函数symbol 都存放到 MallocDispatch；

// malloc_common_dynamic.cpp
static constexpr char kHooksSharedLib[] = "libc_malloc_hooks.so";
static constexpr char kHooksPrefix[] = "hooks";
static constexpr char kHooksPropertyEnable[] = "libc.debug.hooks.enable";
static constexpr char kHooksEnvEnable[] = "LIBC_HOOKS_ENABLE";
...
// Initializes memory allocation framework once per process.
static void MallocInitImpl(libc_globals* globals) {
...
  // Prefer malloc debug since it existed first and is a more complete
  // malloc interceptor than the hooks.
  bool hook_installed = false;
  if (CheckLoadMallocDebug(&options)) {
    hook_installed = InstallHooks(globals, options, kDebugPrefix, kDebugSharedLib);
  } else if (CheckLoadMallocHooks(&options)) {
    hook_installed = InstallHooks(globals, options, kHooksPrefix, kHooksSharedLib);
  }

  if (!hook_installed) {
    if (HeapprofdShouldLoad()) {
      HeapprofdInstallHooksAtInit(globals);
    }
  } else {
    // Record the fact that incompatible hooks are active, to skip any later
    // heapprofd signal handler invocations.
    HeapprofdRememberHookConflict();
  }
}

系统调用 malloc 函数时实际会调用到 hooks_malloc() 中开发者自行实现的逻辑。

void* hooks_malloc(size_t size) {
  if (__malloc_hook != nullptr && __malloc_hook != default_malloc_hook) {
    return __malloc_hook(size, __builtin_return_address(0));
  }
  return g_dispatch->malloc(size);
}

官方示例

    void* new_malloc_hook(size_t bytes, const void* arg) {
      return orig_malloc_hook(bytes, arg);
    }

    auto orig_malloc_hook = __malloc_hook;
    __malloc_hook = new_malloc_hook;

Malloc Hooks 自定义内存泄漏检测逻辑更新__malloc_hook 和__malloc_free 指向新增函数；

bool hooks_initialize(const MallocDispatch* malloc_dispatch, bool*, const char*) {
  g_dispatch = malloc_dispatch;
  // __malloc_hook = default_malloc_hook;
  __malloc_hook = cus_malloc_hook;
  __realloc_hook = default_realloc_hook;
  // __free_hook = default_free_hook;
  __free_hook = cus_free_hook;
  __memalign_hook = default_memalign_hook;
  return true;
}

在新增函数内实现检测逻辑；


// static int allocated_count = 0;
static void* cus_malloc_hook(size_t size, const void* ) {
  auto malloced_addr = g_dispatch->malloc(size);
  // printf 可能会产生循环调用！
  // printf("[malloc hooks] malloc %p size %zu at %p\n", malloced_addr, size, malloc_return_addr);
  error_log("[malloc hooks] malloc %p size %zu\n", malloced_addr, size);
  // allocated_count++;
  return malloced_addr;
}

static void cus_free_hook(void* pointer, const void* ) {
  // printf 可能会产生循环调用！
  // printf("[malloc hooks] free %p, at %p\n", pointer, free_addr);
  error_log("[malloc hooks] free %p\n", pointer);
  // allocated_count--;
  g_dispatch->free(pointer);
}

在 hooks_finalize 打印统计信息或者 dump 信息到文件。在这里直接打印的话可能计数和预期不同，因为 malloc hooks 记录了整个程序执行过程中的申请和释放，是多于测试程序里面申请和释放的次数的。

void hooks_finalize() {
  // error_log("allocated_count %d\n", allocated_count);
}

避坑

apex/com.android.runtime/lib64/libc_malloc_hooks.so没有权限替换，放到 /data/local/tmp/ 路径下也没有权限读取。临时调试建议指定路径 kHooksSharedLib[] = “/system/lib64/libc_malloc_hooks.so“；
不要使用 printf 打印信息，会出现malloc/free 的循环调用，程序崩溃；

通过 dlsym 库函数对 malloc/free 进行 hook

方案：

通过 dlsym 拿到系统 malloc/free 函数，起个别名;
使用 atexit 注册退出时要调用的函数；
用自定义的 malloc/free 函数把 libc 的 malloc/free 包装一层;
程序中调用 malloc/free 时实际先调用自定义的函数，之后调用实际的 malloc/free。在增加的函数中实现：
每次调用 malloc 和 free 时打印地址；
使用一个变量记录已经申请但没有释放的数量；
程序退出时打印没有 free 的数量；示例： ```C++ #include #include #include #include

typedef void* (malloc_func_type)(size_t size); typedef void (free_func_type)(void* p); malloc_func_type malloc_origin_ = NULL; free_func_type free_origin_ = NULL;

int enable_malloc_hook = 1; int enable_free_hook = 1; static size_t allocate_cnt = 0; void* malloc(size_t size) { void* malloced_addr = NULL; if (enable_malloc_hook) { // 避免 printf 循环调用 enable_malloc_hook = 0; allocate_cnt++; malloced_addr = malloc_origin_(size); printf(“malloc %p size %zu\n”, malloced_addr, size); enable_malloc_hook = 1; } return malloced_addr; }

void free(void* p) { if (enable_free_hook) { enable_free_hook = 0; printf(“free [%p]\n”, p); enable_free_hook = 1; } allocate_cnt–; free_origin_(p); }

void finish() { printf(“allocate_cnt %zu\n”, allocate_cnt); }

void f(void); void f(void) { // printf(“[memtest] function f\n”); int* x = (int)malloc(10 * sizeof(int)); x[0] = 0; int y = (int*)malloc(5 * sizeof(int)); y[0] = 0; free(x); }

int main(void) { // 获取系统默认的 malloc 和 free 函数 if (malloc_origin_ == NULL) { malloc_origin_ = reinterpret_cast(dlsym(RTLD_NEXT, "malloc")); }

if (free_origin_ == NULL) { free_origin_ = reinterpret_cast(dlsym(RTLD_NEXT, "free")); }

// printf(“[memtest] hello main\n”); f();

// 注册程序退出时调用的函数 atexit(finish); return 0; }

输出
```SHELL
$ ./memtest_dlsym
malloc 0x56127a995260 size 40
malloc 0x56127a995290 size 20
free [0x56127a995260]
allocate_cnt 1

参考链接

【内存】Android C/C++ 内存泄漏分析 unreachable

[调试本地内存使用

Android 开源项目

Android Open Source Project](https://source.android.google.cn/docs/core/tests/debug/native-memory?hl=zh-cn)

[调试和减少内存错误

Android NDK

Android Developers (google.cn)](https://developer.android.google.cn/ndk/guides/memory-debug?hl=zh-cn)

Malloc Debug (googlesource.com)
Malloc Hooks (googlesource.com)
libmemunreachable (googlesource.com)
Memcheck: a memory error detector
Heap profiler - Perfetto Tracing Docs