Android|Android Log系统介绍 (基于Android N)

原文使用有道云笔记创作, 看这个获取更好阅读体验: 有道云笔记原创连接

引言 > Android 的log,从操作系统分层上来讲,可以分为“Kernel Log”和“User Log”(这是我个人引入的术语)。> 所谓“Kernel Log”就是操作系统内核打印的log。内核里调用printk等接口请求输出kernel log。 kernel log最后会被打印到/dev/kmsg文件上。可以通过dmesg查看到> 所谓“User Log”分为2部分。 一类是Linux的标准输出设备中打印的log(stderr/stdout). 另一类是android特有的log流程。如通过android.util.Log类打印的log,eventslog, ALOG() native层log打印.他们都可以通过logcat看到.本文基于Android N源码, 对Android的log机制做介绍.

先给出一张Android Log系统的总图

Android|Android Log系统介绍 (基于Android N)
文章图片
Android Log系统的总图 1 Android特有Log流程
// java import android.util.Log; Log.d("cwj", "test log");

==android.util.Log== 是在做Android开发中最常用的log输出手段.这里输出的log, 我们通过"adb logcat"或"adb shell logcat"命令获取.
那么从"Log.d("cwj", "test log"); " 到"logcat"之间到底发生了什么呢?
// C/C++#define LOG_TAG "fingerprintd"ALOG(LOG_VERBOSE, LOG_TAG, "lockout\n"); ALOGE("Invalid callback object"); ALOGD("onAcquired(%d), duplicatedFingerId(%d)", 1, 2);

对于Native Code中的 ALOG 等的log打印 到logcat之间, 又发生了什么呢?
1.1 android.util.Log 和 android.util.writeEvent 1.1.1 android.util.Log
java类Log的源码在 "frameworks/base/core/java/android/util/Log.java"
Lod中的 Log.d() / Log.v() 等打印不级别的函数, 最终都走到 ==println_native()== native函数.
public static int v(String tag, String msg) { return println_native(LOG_ID_MAIN, VERBOSE, tag, msg); }public static int d(String tag, String msg) { return println_native(LOG_ID_MAIN, DEBUG, tag, msg); }/** @hide */ public static native int println_native(int bufID, int priority, String tag, String msg);

println_native 的实现在 "/frameworks/base/core/jni/android_util_Log.cpp"
/* * JNI registration. */ static const JNINativeMethod gMethods[] = { /* name, signature, funcPtr */ ... { "println_native","(IILjava/lang/String; Ljava/lang/String; )I", (void*) android_util_Log_println_native }, ... }; /* * In class android.util.Log: *public static native int println_native(int buffer, int priority, String tag, String msg) */ static jint android_util_Log_println_native(JNIEnv* env, jobject clazz, jint bufID, jint priority, jstring tagObj, jstring msgObj) { const char* tag = NULL; const char* msg = NULL; if (msgObj == NULL) { jniThrowNullPointerException(env, "println needs a message"); return -1; }if (bufID < 0 || bufID >= LOG_ID_MAX) { jniThrowNullPointerException(env, "bad bufID"); return -1; }if (tagObj != NULL) tag = env->GetStringUTFChars(tagObj, NULL); msg = env->GetStringUTFChars(msgObj, NULL); int res = __android_log_buf_write(bufID, (android_LogPriority)priority, tag, msg); if (tag != NULL) env->ReleaseStringUTFChars(tagObj, tag); env->ReleaseStringUTFChars(msgObj, msg); return res; }

println_native() 对应的jni方法是 ==android_util_Log_println_native()==.
android_util_Log_println_native()很简单,就是简单的Log级别检查后, 就调用 ==__android_log_buf_write()== 做进一步处理.
__android_log_buf_write() 的声明在 "/system/core/include/log/log.h":
int __android_log_buf_write(int bufID, int prio, const char *tag, const char *text);

__android_log_buf_write() 的定义(实现)在 "/system/core/liblog".
发现liblog中有多个 __android_log_buf_write() 的实现:
logd_write.c int __android_log_buf_write(int bufID, int prio, const char *tag, const char *msg) ...logd_write_kern.c int __android_log_buf_write(int bufID, int prio, const char *tag, const char *msg) ...logger_write.c LIBLOG_ABI_PUBLIC int __android_log_buf_write(int bufID, int prio, ...

那么我们用到的到底是哪个呢? 具体分析在[1.3 /system/core/liblog]介绍.
1.1.2 android.util.writeEvent
分析 writeEvent 到 liblog 的流程高度相似, 细节就不展开了, 这里直接贴下最终的结论:
/frameworks/base/core/java/android/util/EventLog.javawriteEvent() --> /frameworks/base/core/jni/android_util_EventLog.cpandroid_btWriteLog_xx() --> /system/core/include/log/logger.h --> log.handroid_btWriteLog() --> /system/core/liblog/logger_write.c__android_log_btwrite() -> write_to_log()

【Android|Android Log系统介绍 (基于Android N)】==__android_log_btwrite()== 后面调用到 ==write_to_log()==. 前面提到的 __android_log_buf_write() 也是直接调用到 write_to_log() .
即结论是: android.util.EvnentLog 跟 Android.util.Log 打印日志的流程相同, 都是转到 /system/core/liblog/logger_write.c 去处理.
android.util.writeEvent.writeEvent() --> android_btWriteLog_xx() (/frameworks/base/core/jni/android_util_EventLog.cpp)
--> android_btWriteLog() (/system/core/include/log/log.h) --> __android_log_btwrite() (/system/core/liblog/logger_write.c) --> write_to_log()
1.2 Native Code : ALOG / ALOGE / ALOGD .. 我们还会看到一些native code(主要是C/C++)也有打印log:
// xxx.c / xxx.cpp#define LOG_TAG "fingerprintd"ALOG(LOG_VERBOSE, LOG_TAG, "lockout\n"); ALOGE("Invalid callback object"); ALOGD("onAcquired(%d), duplicatedFingerId(%d)", 1, 2);

这些形如 "[ASR]LOG[VDIWE]" 的函数的声明在 "/system/core/include/log/log.h":
// ALOGD/ALOGE ---> ALOG---> LOG_PRI---> android_printLog ---> __android_log_print()/* * Basic log message macro. * * Example: *ALOG(LOG_WARN, NULL, "Failed with error %d", errno); * * The second argument may be NULL or "" to indicate the "global" tag. */ #ifndef ALOG #define ALOG(priority, tag, ...) \ LOG_PRI(ANDROID_##priority, tag, __VA_ARGS__) #endif/* * Simplified macro to send an error log message using the current LOG_TAG. */ #ifndef ALOGE #define ALOGE(...) ((void)ALOG(LOG_ERROR, LOG_TAG, __VA_ARGS__)) #endif/* * Simplified macro to send a debug log message using the current LOG_TAG. */ #ifndef ALOGD #define ALOGD(...) ((void)ALOG(LOG_DEBUG, LOG_TAG, __VA_ARGS__)) #endif//////////////////////////////////////////////* * Log macro that allows you to specify a number for the priority. */ #ifndef LOG_PRI #define LOG_PRI(priority, tag, ...) \ android_printLog(priority, tag, __VA_ARGS__) #endif#define android_printLog(prio, tag, fmt...) \ __android_log_print(prio, tag, fmt)

在宏里兜兜转转,最后走到 __android_log_print() 函数:
ALOGD/ALOGE ---> ALOG ---> LOG_PRI ---> android_printLog ---> __android_log_print()
而 ==__android_log_print()== 函数的实现在 "/system/core/liblog":
// 只列出其中一个实现.// /system/core/liblog/logger_write.cLIBLOG_ABI_PUBLIC int __android_log_write(int prio, const char *tag, const char *msg) { return __android_log_buf_write(LOG_ID_MAIN, prio, tag, msg); }

__android_log_print() 函数只是 __android_log_buf_write() 的简单包装.
所以, 形如 "[ASR]LOG[VDIWE]" 的的log输出, 最终跟 android.util.Log 一样, 也是走到 "/system/core/liblog"的 __android_log_buf_write() 处理.
1.3 /system/core/liblog 上面2节都提到, 关键函数 __android_log_buf_write() . 以及遗留了一个问题 "__android_log_buf_write() 的多个实现, 到底哪个才是最终的实现?
结论是 "有 "LIBLOG_ABI_PUBLIC" 宏修饰的, 是最终被使用的实现."
// /system/core/liblog/logger_write.cLIBLOG_ABI_PUBLIC int __android_log_buf_write(int bufID, int prio, const char *tag, const char *msg) { // ....return write_to_log(bufID, vec, 3); }

那么怎么得道的这个结论呢?
__android_log_buf_write() 分别在 "logd_write.c" / "logd_write_kern.c" / "logger_write.c" 三个文件都有实现.
logd_write.c int __android_log_buf_write(int bufID, int prio, const char *tag, const char *msg) ...logd_write_kern.c int __android_log_buf_write(int bufID, int prio, const char *tag, const char *msg) ...logger_write.c LIBLOG_ABI_PUBLIC int __android_log_buf_write(int bufID, int prio, ...

根据liblog模块的mk文件 "/system/core/liblog/Android.mk", 知道前2个都没有参与编译. 只有第三个文件 "logger_write.c" 参与了编译.
另外结合阅读 liblog 模块的代码, 也可以确认, logd_write.c 和 logd_write_kern.c 都是历史残留代码.
在 [1.6 Android 4.4 liblog] 对旧的log逻辑做一些分析.
回到 logger_write.c 的 __android_log_buf_write() 继续跟进.
// /system/core/liblog/logger_write.cLIBLOG_ABI_PUBLIC int __android_log_buf_write(int bufID, int prio, const char *tag, const char *msg) { struct iovec vec[3]; char tmp_tag[32]; if (!tag) tag = ""; /* XXX: This needs to go! */ if ((bufID != LOG_ID_RADIO) && (!strcmp(tag, "HTC_RIL") || !strncmp(tag, "RIL", 3) || /* Any log tag with "RIL" as the prefix */ !strncmp(tag, "IMS", 3) || /* Any log tag with "IMS" as the prefix */ !strcmp(tag, "AT") || !strcmp(tag, "GSM") || !strcmp(tag, "STK") || !strcmp(tag, "CDMA") || !strcmp(tag, "PHONE") || !strcmp(tag, "SMS"))) { bufID = LOG_ID_RADIO; /* Inform third party apps/ril/radio.. to use Rlog or RLOG */ snprintf(tmp_tag, sizeof(tmp_tag), "use-Rlog/RLOG-%s", tag); tag = tmp_tag; }#if __BIONIC__ if (prio == ANDROID_LOG_FATAL) { android_set_abort_message(msg); } #endifvec[0].iov_base = (unsigned char *)&prio; vec[0].iov_len= 1; vec[1].iov_base = (void *)tag; vec[1].iov_len= strlen(tag) + 1; vec[2].iov_base = (void *)msg; vec[2].iov_len= strlen(msg) + 1; return write_to_log(bufID, vec, 3); // [1.1.2 android.util.writeEvent] 提到, eventlog会走到这个函数. }

__android_log_buf_write() 调用一次就是一条log.
它先处理了log标签"tag"如果为NULL, 则将其置为空字符"", 然后处理了radio log的特殊情况.
都ok以后,将出bufID以外的所有参数都封装到一个 iovec struct的数组中. 熟悉Linux C/C编程的小伙伴应该对 iovec 不陌生.
#includestruct iovec { ptr_t iov_base; // Starting address / iov_base指向一个缓冲区 size_t iov_len; // Length in bytes/确定了接收的最大长度 or 实际写入的长度 };

指针 iov_base 指向一个缓冲区,这个缓冲区是存放的是 writev() 将要发送的数据, 或 readv() 所接收的数据.
成员 iov_len 在各种情况下分别确定了 实际写入的长度 or 接收的最大长度.
所以我们知道 write_to_log() 内部肯定要调用到 writev() 了.
以为后面很容易就可以看到调用 writev() ? 其实还是有一段曲折的.
// /media/moasm/Samsung_SSD_1T/M1871_NF7_base/system/core/liblog/logger_write.cstatic int __write_to_log_init(log_id_t, struct iovec *vec, size_t nr); static int (*write_to_log)(log_id_t, struct iovec *vec, size_t nr) = __write_to_log_init; static int __write_to_log_init(log_id_t log_id, struct iovec *vec, size_t nr) { __android_log_lock(); if (write_to_log == __write_to_log_init) {// 第一次调用到 write_to_log() , if条件肯定成立 int ret; ret = __write_to_log_initialize(); // 初始化,里面是重点代码 if (ret < 0) { __android_log_unlock(); if (!list_empty(&__android_log_persist_write)) { __write_to_log_daemon(log_id, vec, nr); } return ret; }write_to_log = __write_to_log_daemon; // write_to_log 函数指针,重新指向到 __write_to_log_daemon 函数 }__android_log_unlock(); return write_to_log(log_id, vec, nr); }

函数指针 write_to_log 先初始化指向 __write_to_log_init()函数.
当第一次调用 write_to_log 就等于调用 __write_to_log_init().
然后后续的调用 write_to_log :
要么指向 __write_to_log_daemon() 函数, 以后调用 write_to_log 就直接调到 __write_to_log_daemon();
要么继续指向 __write_to_log_init() 函数, 但是依然会走到 __write_to_log_daemon() 函数.
总之, 要到 __write_to_log_daemon() 函数继续跟进.
static int __write_to_log_daemon(log_id_t log_id, struct iovec *vec, size_t nr) { struct android_log_transport_write *node; int ret; struct timespec ts; size_t len, i; for (len = i = 0; i < nr; ++i) {// 检查 vec 参数是否合法 len += vec[i].iov_len; } if (!len) { return -EINVAL; }#if defined(__BIONIC__) if (log_id == LOG_ID_SECURITY) { // ... // check_log_uid_permissions() ... // __android_log_security() ... // ... } else if (log_id == LOG_ID_EVENTS) { // ... // tag = android_lookupEventTag(m, get4LE(vec[0].iov_base)); // ret = __android_log_is_loggable(ANDROID_LOG_INFO, //tag, //ANDROID_LOG_VERBOSE); // if (!ret) { //return -EPERM; // } // ... } else { // ... if (!__android_log_is_loggable(prio, tag, ANDROID_LOG_VERBOSE)) { return -EPERM; } }clock_gettime(android_log_clockid(), &ts); #else /* 下面小段代码同 clock_gettime(CLOCK_REALTIME, &ts); */ { struct timeval tv; gettimeofday(&tv, NULL); ts.tv_sec = tv.tv_sec; ts.tv_nsec = tv.tv_usec * 1000; } #endifret = 0; i = 1 << log_id; write_transport_for_each(node, &__android_log_transport_write) { if (node->logMask & i) { ssize_t retval; retval = (*node->write)(log_id, &ts, vec, nr); if (ret >= 0) { ret = retval; } } }write_transport_for_each(node, &__android_log_persist_write) { if (node->logMask & i) { (void)(*node->write)(log_id, &ts, vec, nr); } }return ret; }

总的来说, 这个函数就是从 __android_log_transport_write 和 __android_log_persist_write 这两个结构中取出节点(node) 执行 write 操作:
(*node->write)(log_id, &ts, vec, nr);

其定义在 "/system/core/liblog/config_write.c":
// /system/core/liblog/config_write.cLIBLOG_HIDDEN struct listnode __android_log_transport_write = { &__android_log_transport_write, &__android_log_transport_write }; LIBLOG_HIDDEN struct listnode __android_log_persist_write = { &__android_log_persist_write, &__android_log_persist_write}; static void __android_log_add_transport( struct listnode *list, struct android_log_transport_write *transport) { // ... }LIBLOG_HIDDEN void __android_log_config_write() { #if (FAKE_LOG_DEVICE == 0) extern struct android_log_transport_write logdLoggerWrite; extern struct android_log_transport_write pmsgLoggerWrite; __android_log_add_transport(&__android_log_transport_write, &logdLoggerWrite); __android_log_add_transport(&__android_log_persist_write, &pmsgLoggerWrite); #else extern struct android_log_transport_write fakeLoggerWrite; __android_log_add_transport(&__android_log_transport_write, &fakeLoggerWrite); #endif }

实际上是对 "struct android_log_transport_write logdLoggerWrite" 和 "struct android_log_transport_write pmsgLoggerWrite" 的包装. 他们分别在 "/system/core/liblog/logd_writer.c" 和 "/system/core/liblog/pmsg_writer.c" 中实现:
// /system/core/liblog/logd_writer.cstatic int logdAvailable(log_id_t LogId); static int logdOpen(); static void logdClose(); static int logdWrite(log_id_t logId, struct timespec *ts, struct iovec *vec, size_t nr); LIBLOG_HIDDEN struct android_log_transport_write logdLoggerWrite = { .node = { &logdLoggerWrite.node, &logdLoggerWrite.node }, .context.sock = -1, .name = "logd", .available = logdAvailable, .open = logdOpen, .close = logdClose, .write = logdWrite, // chenwenjun add for uplevel log >> kernel log .fd = -1, // }; // .write = logdWrite,--> 实际的 write 操作是由 logdWrite() 函数执行

// /system/core/liblog/pmsg_writer.cstatic int pmsgOpen(); static void pmsgClose(); static int pmsgAvailable(log_id_t logId); static int pmsgWrite(log_id_t logId, struct timespec *ts, struct iovec *vec, size_t nr); LIBLOG_HIDDEN struct android_log_transport_write pmsgLoggerWrite = { .node = { &pmsgLoggerWrite.node, &pmsgLoggerWrite.node }, .context.fd = -1, .name = "pmsg", .available = pmsgAvailable, .open = pmsgOpen, .close = pmsgClose, .write = pmsgWrite, }; // .write = pmsgWrite,--> 实际的 write 操作是由 pmsgWrite() 函数执行

总的来说, "/system/core/liblog/logd_writer.c"的实现, 就是把log写到名为 "/dev/socket/logdw" 的socket :
// /system/core/liblog/logd_writer.c// logdOpen() 连接"/dev/socket/logdw" socket,记录socket句柄到 logdLoggerWrite.context.sock static int logdOpen() { int i, ret = 0; if (logdLoggerWrite.context.sock < 0) { i = TEMP_FAILURE_RETRY(socket(PF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0)); if (i < 0) { ret = -errno; } else if (TEMP_FAILURE_RETRY(fcntl(i, F_SETFL, O_NONBLOCK)) < 0) { ret = -errno; close(i); } else { struct sockaddr_un un; memset(&un, 0, sizeof(struct sockaddr_un)); un.sun_family = AF_UNIX; strcpy(un.sun_path, "/dev/socket/logdw"); if (TEMP_FAILURE_RETRY(connect(i, (struct sockaddr *)&un, sizeof(struct sockaddr_un))) < 0) { ret = -errno; close(i); } else { logdLoggerWrite.context.sock = i; } } }return ret; }// 关闭socket static void logdClose() { if (logdLoggerWrite.context.sock >= 0) { close(logdLoggerWrite.context.sock); logdLoggerWrite.context.sock = -1; } }// logdWrite() 把log非阻塞地写出到socket . static int logdWrite(log_id_t logId, struct timespec *ts, struct iovec *vec, size_t nr) { ssize_t ret; static const unsigned headerLength = 1; struct iovec newVec[nr + headerLength]; android_log_header_t header; size_t i, payloadSize; static atomic_int_fast32_t dropped; static atomic_int_fast32_t droppedSecurity; if (logdLoggerWrite.context.sock < 0) { return -EBADF; }/* logd, after initialization and priv drop */ if (__android_log_uid() == AID_LOGD) { /* * ignore log messages we send to ourself (logd). * Such log messages are often generated by libraries we depend on * which use standard Android logging. */ return 0; }/* *struct { *// what we provide to socket *android_log_header_t header; *// caller provides *union { *struct { *charprio; *charpayload[]; *} string; *struct { *uint32_t tag *charpayload[]; *} binary; *}; *}; */header.tid = gettid(); header.realtime.tv_sec = ts->tv_sec; header.realtime.tv_nsec = ts->tv_nsec; newVec[0].iov_base = (unsigned char *)&header; newVec[0].iov_len= sizeof(header); if (logdLoggerWrite.context.sock > 0) { int32_t snapshot = atomic_exchange_explicit(&droppedSecurity, 0, memory_order_relaxed); if (snapshot) { android_log_event_int_t buffer; header.id = LOG_ID_SECURITY; buffer.header.tag = htole32(LIBLOG_LOG_TAG); buffer.payload.type = EVENT_TYPE_INT; buffer.payload.data = https://www.it610.com/article/htole32(snapshot); newVec[headerLength].iov_base = &buffer; newVec[headerLength].iov_len= sizeof(buffer); ret = TEMP_FAILURE_RETRY(writev(logdLoggerWrite.context.sock, newVec, 2)); if (ret != (ssize_t)(sizeof(header) + sizeof(buffer))) { atomic_fetch_add_explicit(&droppedSecurity, snapshot, memory_order_relaxed); } } snapshot = atomic_exchange_explicit(&dropped, 0, memory_order_relaxed); if (snapshot && __android_log_is_loggable(ANDROID_LOG_INFO,"liblog", ANDROID_LOG_VERBOSE)) { android_log_event_int_t buffer; header.id = LOG_ID_EVENTS; buffer.header.tag = htole32(LIBLOG_LOG_TAG); buffer.payload.type = EVENT_TYPE_INT; buffer.payload.data = https://www.it610.com/article/htole32(snapshot); newVec[headerLength].iov_base = &buffer; newVec[headerLength].iov_len= sizeof(buffer); ret = TEMP_FAILURE_RETRY(writev(logdLoggerWrite.context.sock, newVec, 2)); if (ret != (ssize_t)(sizeof(header) + sizeof(buffer))) { atomic_fetch_add_explicit(&dropped, snapshot, memory_order_relaxed); } } }header.id = logId; for (payloadSize = 0, i = headerLength; i < nr + headerLength; i++) { newVec[i].iov_base = vec[i - headerLength].iov_base; payloadSize += newVec[i].iov_len = vec[i - headerLength].iov_len; if (payloadSize> LOGGER_ENTRY_MAX_PAYLOAD) { newVec[i].iov_len -= payloadSize - LOGGER_ENTRY_MAX_PAYLOAD; if (newVec[i].iov_len) { ++i; } break; } }/* * The write below could be lost, but will never block. * * ENOTCONN occurs if logd dies. * EAGAIN occurs if logd is overloaded. */ ret = TEMP_FAILURE_RETRY(writev(logdLoggerWrite.context.sock, newVec, i)); if (ret < 0) { ret = -errno; if (ret == -ENOTCONN) { __android_log_lock(); logdClose(); ret = logdOpen(); __android_log_unlock(); if (ret < 0) { return ret; }ret = TEMP_FAILURE_RETRY(writev(logdLoggerWrite.context.sock, newVec, i)); if (ret < 0) { ret = -errno; } } }if (ret > (ssize_t)sizeof(header)) { ret -= sizeof(header); } else if (ret == -EAGAIN) { atomic_fetch_add_explicit(&dropped, 1, memory_order_relaxed); if (logId == LOG_ID_SECURITY) { atomic_fetch_add_explicit(&droppedSecurity, 1, memory_order_relaxed); } }return ret; }

socket "/dev/socket/logdw" 是由 logd (/system/core/logd) 创建的socket. 将在 [1.3 logd ] 中介绍.
而 "/system/core/liblog/pmsg_writer.c" 的实现, 总的来说是把日志写到 "/dev/pmsg0" 文件去:
// /system/core/liblog/pmsg_writer.cstatic int pmsgOpen() { if (pmsgLoggerWrite.context.fd < 0) { pmsgLoggerWrite.context.fd = TEMP_FAILURE_RETRY(open("/dev/pmsg0", O_WRONLY | O_CLOEXEC)); }return pmsgLoggerWrite.context.fd; }static int pmsgWrite(log_id_t logId, struct timespec *ts, struct iovec *vec, size_t nr) { static const unsigned headerLength = 2; struct iovec newVec[nr + headerLength]; android_log_header_t header; android_pmsg_log_header_t pmsgHeader; size_t i, payloadSize; ssize_t ret; if ((logId == LOG_ID_EVENTS) && !__android_log_is_debuggable()) { if (vec[0].iov_len < 4) { return -EINVAL; }if (SNET_EVENT_LOG_TAG != get4LE(vec[0].iov_base)) { return -EPERM; } }if (pmsgLoggerWrite.context.fd < 0) { return -EBADF; }/* *struct { *// what we provide to pstore *android_pmsg_log_header_t pmsgHeader; *// what we provide to file *android_log_header_t header; *// caller provides *union { *struct { *charprio; *charpayload[]; *} string; *struct { *uint32_t tag *charpayload[]; *} binary; *}; *}; */pmsgHeader.magic = LOGGER_MAGIC; pmsgHeader.len = sizeof(pmsgHeader) + sizeof(header); pmsgHeader.uid = __android_log_uid(); pmsgHeader.pid = getpid(); header.id = logId; header.tid = gettid(); header.realtime.tv_sec = ts->tv_sec; header.realtime.tv_nsec = ts->tv_nsec; newVec[0].iov_base= (unsigned char *)&pmsgHeader; newVec[0].iov_len= sizeof(pmsgHeader); newVec[1].iov_base= (unsigned char *)&header; newVec[1].iov_len= sizeof(header); for (payloadSize = 0, i = headerLength; i < nr + headerLength; i++) { newVec[i].iov_base = vec[i - headerLength].iov_base; payloadSize += newVec[i].iov_len = vec[i - headerLength].iov_len; if (payloadSize > LOGGER_ENTRY_MAX_PAYLOAD) { newVec[i].iov_len -= payloadSize - LOGGER_ENTRY_MAX_PAYLOAD; if (newVec[i].iov_len) { ++i; } payloadSize = LOGGER_ENTRY_MAX_PAYLOAD; break; } } pmsgHeader.len += payloadSize; ret = TEMP_FAILURE_RETRY(writev(pmsgLoggerWrite.context.fd, newVec, i)); if (ret < 0) { ret = errno ? -errno : -ENOTCONN; }if (ret > (ssize_t)(sizeof(header) + sizeof(pmsgHeader))) { ret -= sizeof(header) - sizeof(pmsgHeader); }return ret; }

ps: 目前我们手头的机型, "/dev/pmsg0" 是未被创建出来的. 所以这段代码是无效的.
我们在创建 "/dev/pmsg0" 并配置合适的权限 和selinux规则后, 是可以同步的bug日志也导到这个设备来的.
(因为liblog.so是运行在每个打印日志的进程中的, "/dev/pmsg0"要创建成能被所有进程读写)
本节有点长(主要是贴了2个较长的函数), 这里小结下:"/system/core/liblog" 模块编译生成 "/system/lib(64)/liblog.so". liblog.so 被所有需要打印日志的进程加载使用, 负责处理打印日志流程.目前(Android N)上, 它会做尝试将日志输出到2个地方: 1> 通过socket "/dev/socket/logdw" 将日志输送到 logd (/system/bin/logd)进程. 2> 将log直接写入 "/dev/pmsg0" 文件 (如果存在,且可访问的话)."/system/core/liblog"的内部有一些无用的旧版代码. 关于进程写出日志,有用的主要源码文件是:/system/core/include/log/log.h// liblog 模块给别的模块使用的头文件/system/core/liblog/logger_write.c// 入口 /system/core/liblog/config_write.h// config_write 可以理解为配置管理 /system/core/liblog/config_write.c // logd_writer.c 和 pmsg_writer.c 是日志如何写出的实际实现者 /system/core/liblog/logd_writer.c// !!! 注意有个 "logd_write.c" 文件容易搞混淆 !!! /system/core/liblog/pmsg_writer.cps: 一个用户进程要读取日志, 也要用到liblog.so库. 这里就不展开了.

1.3 logd 上几节提到, 无论用户是从Log.java类,还是在 native层调用形如 ALOG 这样的函数(宏函数)打印log, 最后都是走到 liblog 的 __android_log_buf_write() 函数, 即"/system/lib(64)/liblog.so".
liblog 总的来说就2件事, 其中一件事就是写日志到socket "/dev/socket/logdw".
而这个socket "/dev/socket/logdw"就是logd负责创建的:
// /system/core/logd/logd.rcservice logd /system/bin/logd socket logd stream 0666 logd logd socket logdr seqpacket 0666 logd logd socket logdw dgram 0222 logd logd group root system readproc writepid /dev/cpuset/system-background/tasks

另外, 我们也可以看到 "socket logdr" 的创建. logcat 连接 socket logdr 获取日志.
tips 1 : "socket logdr" 最后创建出来的文件是 "/dev/socket/logdr" tips 2 : "socket logd"最后创建出来的文件是 "/dev/socket/logd" tips 3 : 我们说的"文件"是linux世界里的文件. linux的设计是"一切皆是文件"

通过ls命令查看 logd / logdr / logdw三个文件
$ adb shell ls -alZ /dev/socket | grep logd srw-rw-rw-1 logdlogdu:object_r:logd_socket:s00 2018-05-29 15:49 logd srw-rw-rw-1 logdlogdu:object_r:logdr_socket:s00 2018-05-29 15:49 logdr s-w--w--w-1 logdlogdu:object_r:logdw_socket:s00 2018-05-29 15:49 logdw

注意到他们的权限字段都是 "srw-rw-rw-", 最前面的 "s" 即表示该文件是 socket.
logd 模块源码在 /system/core/logd/, 编译的目标是linux可执行文件 "/system/bin/logd".
为常住进程:
$ adb shell ps | grep logd logd4701180882096sigsuspend 0000000000 S /system/bin/logd

logd中有循环缓冲区 来存储接收来的log.
属性字段 "persist.logd.size" 控制log缓冲区的大小:
// 查看logd log缓冲区大小: $ adb shell getprop persist.logd.size 256KB// 修改logd log缓冲区大小: $ adb shell setprop persist.logd.size 512KB

logd中有30个左右C/C++源文件. 大部分是C++文件,可见整个模块是面向对象编程的.
这里大概画出一下UML图(伪).
1> 接收写到logdw socket的日志:
2> 响应通过logdr socket获取日志:
3> logd抓取 Kernel Log (简介见 [1.5 logd 捕获 kernel log] ):
1.4 logcat 上一节提到, logcat是通过连接 socket "/dev/socket/logdr" 来获取日志的.
logcat 就是负责从 logd 获取日志, 然后展示给使用者,或转存到文件.
logcat也可以清除 logd 中的日志缓存, 命令是"adb shell logcat -c" 或 "adb logcat -c".
logcat 可以按照进程, 模块, 标签 过滤查看log. 也可以定制log查看的格式.
具体用法 使用 "$ adb shell logcat -h" 查看.
默认情况下, 一条log的格式形如下:
05-29 15:30:58.32529432943 IStrategy: xxxxxxxxxx 日期时间pidtidlog级别log标签log正文

logcat模块源码在 "/system/core/logcat", 编译目标为 "/system/bin/logcat" .
以下简要介绍下, logcat 从 logd 获取日志的流程
// /system/core/logcat/logcat.cppstruct log_device_t { const char* device; bool binary; struct logger *logger; struct logger_list *logger_list; bool printed; log_device_t* next; log_device_t(const char* d, bool b) { device = d; binary = b; next = NULL; printed = false; logger = NULL; logger_list = NULL; } }; namespace android {// ...static int g_devCount; // >1 means multiple// ...int main(int argc, char **argv) { log_device_t* devices = NULL; log_device_t* dev; bool printDividers = false; struct logger_list *logger_list; // ... 参数解析 ...// 获取 "main" / "system" / "crash" 三个log缓冲区 if (!devices) { dev = devices = new log_device_t("main", false); g_devCount = 1; if (android_name_to_log_id("system") == LOG_ID_SYSTEM) {// android_name_to_log_id() 见 system/core/liblog/logger_name.c dev = dev->next = new log_device_t("system", false); g_devCount++; } if (android_name_to_log_id("crash") == LOG_ID_CRASH) { dev = dev->next = new log_device_t("crash", false); g_devCount++; } }setupOutput(); // 设置log输出格式 if (hasSetLogFormat == 0) { const char* logFormat = getenv("ANDROID_PRINTF_LOG"); if (logFormat != NULL) { err = setLogFormat(logFormat); if (err < 0) { fprintf(stderr, "invalid format in ANDROID_PRINTF_LOG '%s'\n", logFormat); } } else { setLogFormat("threadtime"); } }// ... 设置 log过滤. 调用 android_log_addFilterString() , 源码在 system/core/liblog/logprint.c// 一系列繁复代码, 打酱油, 初始化 /********************************************************************************* 其核心逻辑就是 走 "/system/core/liblog/logger_read.c", 获取到 struct android_log_transport_read. 目前, 实际上有2个这样的结构: // /system/core/liblog/config_read.c extern struct android_log_transport_read logdLoggerRead; extern struct android_log_transport_read pmsgLoggerRead; 这个结构记录了要去读日志所需的基本参数 和函数. 如果 logdLoggerRead 的 ".read = logdRead,"就表示记录 loadRead()函数为读取日志的方法函数.// /system/core/liblog/logd_reader.c LIBLOG_HIDDEN struct android_log_transport_read logdLoggerRead = { .node = { &logdLoggerRead.node, &logdLoggerRead.node }, .name = "logd", .available = logdAvailable, .version = logdVersion, .read = logdRead, .poll = logdPoll, .close = logdClose, .clear = logdClear, .getSize = logdGetSize, .setSize = logdSetSize, .getReadableSize = logdGetReadableSize, .getPrune = logdGetPrune, .setPrune = logdSetPrune, .getStats = logdGetStats, }; // /system/core/liblog/pmsg_reader.c LIBLOG_HIDDEN struct android_log_transport_read pmsgLoggerRead = { .node = { &pmsgLoggerRead.node, &pmsgLoggerRead.node }, .name = "pmsg", .available = pmsgAvailable, .version = pmsgVersion, .read = pmsgRead, .poll = NULL, .close = pmsgClose, .clear = pmsgClear, .setSize = NULL, .getSize = NULL, .getReadableSize = NULL, .getPrune = NULL, .setPrune = NULL, .getStats = NULL, }; 前者(logdLoggerRead) 是负责读取 logd中的日志(通过读socket "/dev/socket/logdw"); 后者(pmsgLoggerRead) 是负责读取 "/dev/pmsg0" 文件中的日志. 直接文件读取. (目前该文件没有被创建, 所有没有日志输出到这里)即, 想要了解 logcat 如何读取日志的, 关键看 "/system/core/liblog/logd_reader.c" 和 "/system/core/liblog/pmsg_reader.c" 两个源文件. 可读性强, 逻辑清晰. **********************************************************************************/} }

如上分析. 我们发现logcat 自身代码目录 "/system/core/logcat" 中基本都是打酱油的代码,
真正的操作 都交给 "/system/core/liblog" 中的方法了.
经过了若干 可读性很差的代码跟进后, 我们发现 :
logcat 如何读取 logd 缓存的日志的, 关键看 "/system/core/liblog/logd_reader.c" 源文件.
可读性强, 逻辑清晰.
"/system/core/liblog/logd_reader.c"显示, logcat 是通过 连接 socket "logr" 来去读 logd中缓存的日志的:
// /system/core/liblog/logd_reader.c// 初始化 socket "logdr" static int logdOpen(struct android_log_logger_list *logger_list, struct android_log_transport_context *transp) { // ...int sock = transp->context.sock; // 如果已经初始化过了socket, 则直接返回. if (sock > 0) { return sock; }// ...// 创建 socket_local_client 客户端 sock = socket_local_client("logdr", ANDROID_SOCKET_NAMESPACE_RESERVED, SOCK_SEQPACKET); if (sock == 0) { /* Guarantee not file descriptor zero */ int newsock = socket_local_client("logdr", ANDROID_SOCKET_NAMESPACE_RESERVED, SOCK_SEQPACKET); close(sock); sock = newsock; } if (sock <= 0) { if ((sock == -1) && errno) { return -errno; } return sock; }// 操作 socket_local_client 结构, 连接socketreturn transp->context.sock = sock; }// 从 socket "logdr" 读取日志: static int logdRead(struct android_log_logger_list *logger_list, struct android_log_transport_context *transp, struct log_msg *log_msg) { int ret, e; // ...ret = logdOpen(logger_list, transp); // 确保socket 打开 if (ret < 0) { return ret; }memset(log_msg, 0, sizeof(*log_msg)); // 初始化接收缓冲区// .../* NOTE: SOCK_SEQPACKET guarantees we read exactly one full entry */ ret = recv(ret, log_msg, LOGGER_ENTRY_MAX_LEN, 0); // 从socket中接收日志 e = errno; // ...if ((ret == -1) && e) { return -e; } return ret; }

其他关于 "logcat -c " 发生了什么的, 不再赘述.可自行阅读 "/system/core/liblog/logd_reader.c"
tips : logcat 调用了很多 liblog 模块的函数. 代码跟起来有点麻烦. 例如有一个有意思的细节:struct android_log_logger_list 和 struct logger_list 的转换."struct logger_list"的定义在 "/system/core/include/log/logger.h" :struct logger_list; // 定义了一个空结构体android_log_logger_list的定义在 "/system/core/liblog/logger.h" :struct android_log_logger_list { struct listnode logger; struct listnode transport; int mode; unsigned int tail; log_time start; pid_t pid; }; 发现使用中, struct android_log_logger_list 和 struct logger_list 会相互转换, 如:// /system/core/liblog/logger_read.cLIBLOG_ABI_PUBLIC struct logger_list *android_logger_list_alloc_time( // cwj int mode, log_time start, pid_t pid) { struct android_log_logger_list *logger_list; // 实际数据结构是 android_log_logger_list// ...return (struct logger_list *)logger_list; // 但是给外界使用 是强转为 logger_list }LIBLOG_ABI_PUBLIC struct logger *android_logger_open( struct logger_list *logger_list, log_id_t logId) { struct android_log_logger_list *logger_list_internal = (struct android_log_logger_list *)logger_list; // struct logger_list 强转 struct android_log_logger_list .// ...}为什么要"画蛇添足"多定义一 个"struct logger_list; " 空结构体呢?猜测 /system/core/include/log/logger.h" 是对外的. 而 "/system/core/liblog/logger.h" 是对 liblog模块内的. liblog设计者想对外隐藏内部数据结构细节.在能够隐藏内部细节的情况下, 开发者尽量选择了隐藏内部细节, 因为外部不需要关心 struct android_log_logger_list 的内部实现, 所以只需要给出 struct logger_list * 去存储 struct android_log_logger_list 的内存地址, 上层后续有什么操作就把这个结构的地址传入liblog模块操作. 即外界根本不需要知道 struct android_log_logger_list 的实现细节.就像调用 open() API 打开文件, 我们只需拿到文件句柄 FD. 而不需要关系文件系统内部的数据结构. 后续要操作此文件, FD 就是我们的 token(令牌, 资源的身份识别码).

1.5 logd 捕获 kernel log 通过设置" adb shell setprop logd.kernel true" 可以让logd去抓取 Kernel Log.(一直监听并抓取)
实现原理其实是把 "/proc/kmsg" 和 "/dev/kmsg" 文件当作socket 文件来使用.(详情追踪 LogKlog.h )
实测发现, 需要root权限才能才能设置 "setprop logd.kernel true".
代码如下:
// /system/core/logd/main.cppint main(int argc, char *argv[]) { int fdPmesg = -1; bool klogd = property_get_bool("logd.kernel",// 获取 "logd.kernel" property, 确定是否开启了 kernel log 重定向到 User Log . BOOL_DEFAULT_TRUE | BOOL_DEFAULT_FLAG_PERSIST | BOOL_DEFAULT_FLAG_ENG | BOOL_DEFAULT_FLAG_SVELTE); if (klogd) { fdPmesg = open("/proc/kmsg", O_RDONLY | O_NDELAY); // "/proc/kmsg" 用来读kernel log } fdDmesg = open("/dev/kmsg", O_WRONLY); // "/dev/kmsg" 用来写 kernel log//....bool auditd = property_get_bool("logd.auditd", BOOL_DEFAULT_TRUE | BOOL_DEFAULT_FLAG_PERSIST); LogAudit *al = NULL; if (auditd) { al = new LogAudit(logBuf, reader, property_get_bool("logd.auditd.dmesg", BOOL_DEFAULT_TRUE | BOOL_DEFAULT_FLAG_PERSIST) ? fdDmesg : -1); }LogKlog *kl = NULL; if (klogd) { kl = new LogKlog(logBuf, reader, fdDmesg, fdPmesg, al != NULL); // 核心逻辑在 LogKlog }readDmesg(al, kl); // failure is an option ... messages are in dmesg (required by standard)if (kl && kl->startListener()) {// startListener成功时返回 0, 失败返回非0(多数情况是-1) delete kl; }if (al && al->startListener()) { delete al; }TEMP_FAILURE_RETRY(pause()); exit(0); }

1.6 Android 4.4 liblog 在Android 4.4 及更早的系统中, liblog并像现在这样通过socket把log传到logd, 而是直接写到
"/dev/log/main",
"/dev/log/events",
"/dev/log/system",
"dev/log/radio"
四个文件文件去.
liblog中残留的 "/system/core/liblog/logd_write_kern.c"就是残留的旧代码.
他们现在不生效的原因, 此源文件没有被包好在模块的Android.mk文件中, 不会参与编译.
往上有介绍如何打补丁,让高版本Android会到这种旧版日志的方法:
https://blog.csdn.net/kc58236582/article/details/72276041
其核心就是2点:
1> 让 logd_write_kern.c 参与编译.
2> 修改内核, 使其创建"/dev/log/*",并配置selinux. (目前上文中提到的这一点的连接已经失效)
理论上是可行的. 我们可以在logd莫名其妙挂掉的时候(最近于到这样的案例, 才产生研究log流程的需求)
, 将日志输出到这里来.
但是这并没有意思, 前面我们提到, 现有的代码中预留了"/dev/pmsg0"设备接管可存储log.
但是目前此文件也是没被创建, 所以不会有日志输出到这里.
目前看, logd挂掉非常罕见. 且logd挂了要找去挂了的原因, kernel log已经足够详细, User Log不会对定为logd挂的原因有帮助.
故"liblog库直接存log到其他文件"的需求暂定没有开发的意义.
2.0 stdout / stderr 与 logwrapper 据"网传", Android把系统标准输出设备 stdout / stderr 重定向到了 "/dev/null" (黑洞文件,有进无出) 网传属实.
Android把所有Java虚拟机进程的"系统标准输出设备" stdout / stderr 重定向到了 "/dev/null" (黑洞文件,有进无出)
, 在java虚拟机初始化较早的时候就操作了:
http://androidxref.com/7.1.2_r36/xref/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java#343

Android|Android Log系统介绍 (基于Android N)
文章图片
close_Android_stdout1.jpg
Android|Android Log系统介绍 (基于Android N)
文章图片
close_Android_stdout2.jpg
Android|Android Log系统介绍 (基于Android N)
文章图片
close_Android_stdout3.jpg 而其他非java虚拟机进程是再哪里关闭的, 有兴趣可以自己跟下. (可以找到)
发现logwrapper 可以将 输出到 stdout / stderr 的信息重定向到"User Log".
这里只对 logwrapper 用法做简单介绍:
$ adb shell logwrapper Usage: logwrapper [-a] [-d] [-k] BINARY [ARGS ...]Forks and executes BINARY ARGS, redirecting stdout and stderr to the Android logging system. Tag is set to BINARY, priority is always LOG_INFO.-a: Causes logwrapper to do abbreviated logging. This logs up to the first 4K and last 4K of the command being run, and logs the output when the command exits -d: Causes logwrapper to SIGSEGV when BINARY terminates fault address is set to the status of wait() -k: Causes logwrapper to log to the kernel log instead of the Android system log参数说明:-a 仅拿最前面 和最后面 4K (一共最多8K)的日志-d 被操作进程如果发生内存错误而终结, 就给它发 SIGSEGV 信号.-k 重定向到 "Kernel log"(默认是重定向到 User Log)BINARY 是目标, 如"ps"命令.[ARGS ...] 是给目标可执行文件传的参数.用法示范:// 执行 "ps" 命令, 并将其的输出信息 重定向"User Log" $ adb shell logwrapper ps// 执行 "ps" 命令, 并将其的输出信息 重定向"Kernel Log" $ adb shell logwrapper -k ps以上两个分身执行后, 可以通过 logcat 或 dmesg 查看拿到 ps的输出信息了.

3.0 kernel log 在内核模块的代码, 通常通过 "printk" 这个"宏函数"输出 Kernel log.
用法简单. 不做介绍.
最终 Kernel log会输出到 "/proc/kmsg", "/dev/kmsg" 这两个文件.
"/dev/kmsg"是循环缓冲区, 存储所有 kernel log(超出缓冲区的旧log除外).
"adb shell dmesg" 命令是取的这个文件的内容.
"adb shell cat /proc/kmsg" 则可以只拿这条命令执行之后才输出的 kernel log, 且是持续监听更新.
这点比dmesg好,(dmesg是取一次 kernel log, 然后就退出了) , 在开发中, 方便我们实时的看最新的 kernel log .
ps:就是需要root权限. 限制了使用.
3.1 libcutils user space print kernel log "/system/core/init/log.cpp" 中的 init_klog_write() 函数 可以让应用也能写 Kernel log.
代码很简单, 就是在写 "/dev/kmsg" 文件.
也就是说, 应用得有访问 "/dev/kmsg" 的权限 (尤其是 selinux限制), 才能写 kernel log.
4.0 贝海拾遗 4.1 局部编译问题
试验发现, 对liblog的修改, 局部编译该模块 push到手机, 修改不能生效.
(使用 "adb sync system" 命令, 已经确保编译出争取的 liblog.so, 且成功push到手机, 且有重启手机)
全编才生效.
猜测:其他so库静态依赖这个so库.别人用的是其他二次包装的库?
4.2 System.out.println("test-system.out");
" System.out.println("xx") " 输出的信息, 也可以在 logcat 中看到.
其实际上是输出到系统标输出设备.
// /libcore/ojluni/src/main/java/java/lang/System.javapublic final static InputStream in; /** * The "standard" output stream. This stream is already * open and ready to accept output data. Typically this stream * corresponds to display output or another output destination * specified by the host environment or user. * * For simple stand-alone Java applications, a typical way to write * a line of output data is: *
*System.out.println(data) *

* * See the println methods in class PrintStream. * * @seejava.io.PrintStream#println() * @seejava.io.PrintStream#println(boolean) * @seejava.io.PrintStream#println(char) * @seejava.io.PrintStream#println(char[]) * @seejava.io.PrintStream#println(double) * @seejava.io.PrintStream#println(float) * @seejava.io.PrintStream#println(int) * @seejava.io.PrintStream#println(long) * @seejava.io.PrintStream#println(java.lang.Object) * @seejava.io.PrintStream#println(java.lang.String) */ public final static PrintStream out; public final static PrintStream err; static { // ...FileInputStream fdIn = new FileInputStream(FileDescriptor.in); FileOutputStream fdOut = new FileOutputStream(FileDescriptor.out); FileOutputStream fdErr = new FileOutputStream(FileDescriptor.err); in = new BufferedInputStream(fdIn); out = new PrintStream(fdOut); err = new PrintStream(fdErr); // ... }// /libcore/ojluni/src/main/java/java/io/FileDescriptor.java/** * A handle to the standard input stream. Usually, this file * descriptor is not used directly, but rather via the input stream * known as System.in. * * @seejava.lang.System#in */ public static final FileDescriptor in = dupFd(0); /** * A handle to the standard output stream. Usually, this file * descriptor is not used directly, but rather via the output stream * known as System.out. * @seejava.lang.System#out */ public static final FileDescriptor out = dupFd(1); /** * A handle to the standard error stream. Usually, this file * descriptor is not used directly, but rather via the output stream * known as System.err. * * @seejava.lang.System#err */ public static final FileDescriptor err = dupFd(2);

" System.out.println("xx") "实际上就是输出到FD=1的文件去. 而这个文件是系统事先打开, 任意进程可直接使用的. 据"网传", Android 输出到标准出设备的数据重定向到"/dev/null", 所以默认是看不到 System.out.println的输出.

    推荐阅读