AndroidANRlogtrace日志文件分析

合集下载
  1. 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
  2. 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
  3. 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。

AndroidANRlogtrace⽇志⽂件分析
版权声明:本⽂为博主原创⽂章,遵循版权协议,转载请附上原⽂出处链接和本声明。

本⽂链接:
什么是ANR?
ANR:Application Not Responding,即应⽤⽆响应
ANR⽇志Trace⽂件获取
系统⽣成的Trace⽂件保存在data/anr,可以⽤过命令adb pull data/anr/取出
traces.txt只保留最后⼀次ANR的信息,Android系统有个DropBox功能功能,它能记录系统出现的crash错误.因此保留有发⽣过的ANR的信息.(log路径:/data/system/dropbox)
获取系统crash log: adb shell dumpsys dropbox --print >>log.txt
1.
Trace⽂件怎么⽣成的?
2.
当APP(包括系统APP和⽤户APP)进程出现ANR、应⽤响应慢或WatchDog的监视没有得到回馈时,
3.
系统会dump此时的top进程,进程中Thread的运⾏状态就都dump到这个Trace⽂件中了.
导致ANR的常见⼏种情况:
1:Input dispatching timed out(5 seconds) 按键或触摸事件处理超时(⼀般是UI主线程做了耗时的操作,这类ANR最常见)
2:BroadcastTimeout(10 seconds) ⼴播的分发和处理超时(⼀般是onReceiver执⾏时间过长)
3:ServiceTimeout(20 seconds) Service的启动和执⾏超时
另外还有ProviderTimeout和WatchDog等导致的ANR.还有当系统内存或CPU资源不⾜时容易出现ANR,⼀般这种情况会有
lowmemorykill的log打印.
应⽤ANR产⽣的时候,ActivityManagerService的appNotResponding⽅法就会被调⽤,然后在/data/anr/traces.txt⽂件中写⼊ANR相关信息.
1.
final void appNotResponding(ProcessRecord app, ActivityRecord activity,
2.
ActivityRecord parent, boolean aboveSystem, final String annotation) {
3.
// ... ...
4.
if (MONITOR_CPU_USAGE) {
5.
updateCpuStatsNow(); // 更新CPU使⽤率
6.
}
7.
// ... ...
8.
final ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);
9.
// dumpStackTraces是输出traces⽂件的函数
10.
File tracesFile = dumpStackTraces(true, firstPids, processCpuTracker, lastPids,
11.
NATIVE_STACKS_OF_INTEREST);
12.
if (MONITOR_CPU_USAGE) {
15.
updateCpuStatsNow(); // 再次更新CPU信息
16.
synchronized (mProcessCpuTracker) {
17.
// 输出ANR发⽣前⼀段时间内的CPU使⽤率
18.
cpuInfo = mProcessCpuTracker.printCurrentState(anrTime);
19.
}
20.
info.append(processCpuTracker.printCurrentLoad());
21.
info.append(cpuInfo);
22.
}
23.
// 输出ANR发⽣后⼀段时间内的CPU使⽤率
24.
info.append(processCpuTracker.printCurrentState(anrTime));
25.
26.
Slog.e(TAG, info.toString());
27.
if (tracesFile == null) {
28.
// There is no trace file, so dump (only) the alleged culprit's threads to the log 29.
Process.sendSignal(app.pid, Process.SIGNAL_QUIT);
30.
}
31.
// 将ANR信息同时输出到DropBox中
32.
addErrorToDropBox("anr", app, app.processName, activity, parent, annotation, 33.
cpuInfo, tracesFile, null);
34.
// ... ...
35.
synchronized (this) {
36.
// 显⽰ANR提⽰对话框
37.
// Bring up the infamous App Not Responding dialog
38.
Message msg = Message.obtain();
39.
HashMap<String, Object> map = new HashMap<String, Object>();
40.
msg.what = SHOW_NOT_RESPONDING_MSG;
41.
msg.obj = map;
42.
msg.arg1 = aboveSystem ? 1 : 0;
43.
map.put("app", app);
44.
if (activity != null) {
45.
map.put("activity", activity);
46.
}
49.
}
避免ANR?
1. UI线程尽量只做跟UI相关的⼯作
2. 耗时的⼯作(⽐如数据库操作,I/O,连接⽹络或者别的有可能阻碍UI线程的操作)把它放⼊单独的线程处理
3. 尽量⽤Handler来处理UIthread和别的thread之间的交互
分析ANR的Log:
出现ANR的log如下:
1.
06-22 10:37:46.204 3547 3604 E ActivityManager: ANR in org.code:MessengerService // ANR出现的进程包名
2.
3.
E ActivityManager: PID: 17027 // ANR进程ID
4.
5.
E ActivityManager: Reason: executing service org.code/.ipc.MessengerService //导致ANR的原因
6.
7.
E ActivityManager: Load: 8.31 / 8.12 / 8.47
8.
9.
E ActivityManager: CPU usage from 0ms to 6462ms later: //CPU在ANR发⽣后的使⽤情况
10.
11.
E ActivityManager: 61% 3547/system_server: 21% user + 39% kernel / faults: 13302 minor 2 major
12.
13.
E ActivityManager: 0.2% 475/debuggerd: 0% user + 0.1% kernel / faults: 6086 minor 1 major
14.
15.
E ActivityManager: 10% 5742/com.android.phone: 5.1% user + 5.1% kernel / faults: 7597 minor
16.
17.
E ActivityManager: 6.9% 5342/com.android.systemui: 2.1% user + 4.8% kernel / faults: 4158 minor
18.
19.
E ActivityManager: 0.1% 477/debuggerd64: 0% user + 0.1% kernel / faults: 4013 minor
20.
21.
E ActivityManager: 5.7% 16313/org.code: 3.2% user + 2.4% kernel / faults: 2412 minor
22.
23.
E ActivityManager: 3.7% 17027/org.code:MessengerService: 1.7% user + 2% kernel / faults: 2571 minor 6 major
24.
25.
E ActivityManager: 2.6% 306/cfinteractive: 0% user + 2.6% kernel
E ActivityManager: +0% 17168/kworker/0:1: 0% user + 0% kernel
28.
29.
E ActivityManager: 0% TOTAL: 0% user + 0% kernel + 0% softirq // CUP占⽤情况
30.
31.
E ActivityManager: CPU usage from 5628ms to 6183ms later:
32.
33.
E ActivityManager: 42% 3547/system_server: 17% user + 24% kernel / faults: 11 minor
34.
35.
E ActivityManager: 12% 3604/ActivityManager: 1.7% user + 10% kernel
36.
37.
E ActivityManager: 12% 3609/android.display: 8.7% user + 3.5% kernel
38.
39.
E ActivityManager: 3.5% 5304/Binder_6: 1.7% user + 1.7% kernel
40.
41.
E ActivityManager: 3.5% 5721/Binder_A: 1.7% user + 1.7% kernel
42.
43.
E ActivityManager: 3.5% 5746/Binder_C: 3.5% user + 0% kernel
44.
45.
E ActivityManager: 1.7% 3599/Binder_1: 0% user + 1.7% kernel
46.
47.
E ActivityManager: 1.7% 3600/Binder_2: 0% user + 1.7% kernel
48.
49.
I ActivityManager: Killing 17027:org.code:MessengerService/u0a85 (adj 1): bg anr
50.
51.
I art : Wrote stack traces to '/data/anr/traces.txt' //art这个TAG打印对traces操作的信息
52.
53.
D GraphicsStats: Buffer count: 9
54.
55.
W ActivityManager: Scheduling restart of crashed service org.code/.ipc.MessengerService in 1000ms
56.
log打印了ANR的基本信息,我们可以分析CPU使⽤率推测ANR发⽣的时候设备在做什么⼯作;如果CPU使⽤率很⾼,接近100%,可能是在进⾏⼤规模的计算更可能是陷⼊死循环;如果CUP使⽤率很低,说明主线程被阻塞了,并且当IOwait很⾼,可能是主线程在等待I/O操作的完成.
对于ANR只是分析Log很难知道问题所在,我们还需要通过Trace⽂件分析stack调⽤情况.
----- pid 17027 at 2017-06-22 10:37:39 ----- // ANR出现的进程pid和时间(和上述log中pid⼀致)
3.
Cmd line: org.code:MessengerService // ANR出现的进程名
4.
Build fingerprint: 'Homecare/qucii8976v3_64:6.0.1/pansen06141150:eng/test-keys' // 下⾯记录系统版本,内存等状态信息
5.
ABI: 'arm64'
6.
Build type: optimized
7.
Zygote loaded classes=6576 post zygote classes=13
8.
Intern table: 13780 strong; 17 weak
9.
JNI: CheckJNI is on; globals=283 (plus 158 weak)
10.
Libraries: /system/lib64/libandroid.so /system/lib64/libcompiler_rt.so /system/lib64/libjavacrypto.so /system/lib64/libjnigraphics.so /system/lib64/libmedia_jni.so /system/lib64/libwebviewchromium_loader.so libjavacore.so (7)
11.
Heap: 29% free, 8MB/12MB; 75731 objects
12.
Dumping cumulative Gc timings
13.
Total number of allocations 75731
14.
Total bytes allocated 8MB
15.
Total bytes freed 0B
16.
Free memory 3MB
17.
Free memory until GC 3MB
18.
Free memory until OOME 183MB
19.
Total memory 12MB
20.
Max memory 192MB
21.
Zygote space size 3MB
22.
Total mutator paused time: 0
23.
Total time waiting for GC to complete: 0
24.
Total GC count: 0
25.
Total GC time: 0
26.
Total blocking GC count: 0
27.
Total blocking GC time: 0
28.
29.
suspend all histogram: Sum: 76us 99% C.I. 0.100us-28us Avg: 7.600us Max: 28us
30.
DALVIK THREADS (15):
31.
// Signal Catcher是记录traces信息的线程
32.
// Signal Catche(线程名)、(daemon)表⽰守护进程、prio(线程优先级,默认是5)、tid(线程唯⼀标识ID)、Runnable(线程当前状态) 33.
"Signal Catcher" daemon prio=5 tid=3 Runnable
36.
//sysTid是线程号(主线程的线程号和进程号相同)
37.
| sysTid=17033 nice=0 cgrp=default sched=0/0 handle=0x7fb2350450
38.
| state=R schedstat=( 4348125 172343 3 ) utm=0 stm=0 core=1 HZ=100
39.
| stack=0x7fb2254000-0x7fb2256000 stackSize=1013KB
40.
| held mutexes= "mutator lock"(shared held)
41.
native: #00 pc 0000000000489e28 /system/lib64/libart.so (art::DumpNativeStack(std::__1::basic_ostream<char,
std::__1::char_traits<char> >&, int, char const*, art::ArtMethod*, void*)+236)
42.
native: #01 pc 0000000000458fe8 /system/lib64/libart.so (art::Thread::Dump(std::__1::basic_ostream<char,
std::__1::char_traits<char> >&) const+220)
43.
native: #02 pc 0000000000465bc8 /system/lib64/libart.so (art::DumpCheckpoint::Run(art::Thread*)+688)
44.
native: #03 pc 0000000000466ae0 /system/lib64/libart.so (art::ThreadList::RunCheckpoint(art::Closure*)+276)
45.
native: #04 pc 000000000046719c /system/lib64/libart.so (art::ThreadList::Dump(std::__1::basic_ostream<char,
std::__1::char_traits<char> >&)+188)
46.
native: #05 pc 0000000000467a84 /system/lib64/libart.so (art::ThreadList::DumpForSigQuit(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)+492)
47.
native: #06 pc 0000000000431194 /system/lib64/libart.so (art::Runtime::DumpForSigQuit(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)+96)
48.
native: #07 pc 000000000043e604 /system/lib64/libart.so (art::SignalCatcher::HandleSigQuit()+1256)
49.
native: #08 pc 000000000043f214 /system/lib64/libart.so (art::SignalCatcher::Run(void*)+452)
50.
native: #09 pc 0000000000068714 /system/lib64/libc.so (__pthread_start(void*)+52)
51.
native: #10 pc 000000000001c604 /system/lib64/libc.so (__start_thread+16)
52.
(no managed stack frames)
53.
54.
//main(线程名)、prio(线程优先级,默认是5)、tid(线程唯⼀标识ID)、Sleeping(线程当前状态)
55.
"main" prio=5 tid=1 Sleeping
56.
| group="main" sCount=1 dsCount=0 obj=0x73132d10 self=0x5598a5f5e0
57.
//sysTid是线程号(主线程的线程号和进程号相同)
58.
| sysTid=17027 nice=0 cgrp=default sched=0/0 handle=0x7fb6db6fe8
59.
| state=S schedstat=( 420582038 5862546 143 ) utm=24 stm=18 core=6 HZ=100
60.
| stack=0x7fefba3000-0x7fefba5000 stackSize=8MB
61.
| held mutexes=
62.
// java 堆栈调⽤信息(这⾥可查看导致ANR的代码调⽤流程)(分析ANR最重要的信息)
63.
at ng.Thread.sleep!(Native method)
64.
- sleeping on <0x0c60f3c7> (a ng.Object)
65.
at ng.Thread.sleep(Thread.java:985)
68.
at android.os.SystemClock.sleep(SystemClock.java:120)
69.
at org.code.ipc.MessengerService.onCreate(MessengerService.java:63) //导致ANR的代码
70.
at android.app.ActivityThread.handleCreateService(ActivityThread.java:2877)
71.
at android.app.ActivityThread.access$1900(ActivityThread.java:150)
72.
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1427)
73.
at android.os.Handler.dispatchMessage(Handler.java:102)
74.
at android.os.Looper.loop(Looper.java:148)
75.
at android.app.ActivityThread.main(ActivityThread.java:5417)
76.
at ng.reflect.Method.invoke!(Native method)
77.
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:726)
78.
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:616)
在log中显⽰的pid在traces⽂件中与之对应,trace log中会打印当前所有线程的堆栈信息,⼀般我们主要关注main线程的堆栈(也有分析其他线程的情况,通过分析ANR发⽣时系统状态推测出设备正在进⾏操作)
⽽这⾥然后通过查看堆栈调⽤信息分析ANR的代码.上述ANR实际上在org.code.ipc.MessengerService.onCreate中63⾏调⽤SystemClock.sleep(10000)代码导致的;这是⽐较简单ANR⽰例.
以上仅为解决ANR问题提供⼀个思路,具体问题还需要根据实际情况具体分析
线程状态的分类: java中定义的线程状态:
1.
// libcore/libart/src/main/java/java/lang/Thread.java
2.
/**
3.
* A representation of a thread's state. A given thread may only be in one
4.
* state at a time.
5.
*/
6.
public enum State {
7.
/**
8.
* The thread has been created, but has never been started.
9.
*/
10.
NEW,
11.
/**
12.
* The thread may be run.
13.
*/
14.
RUNNABLE,
15.
/**
18.
BLOCKED,
19.
/**
20.
* The thread is waiting.
21.
*/
22.
WAITING,
23.
/**
24.
* The thread is waiting for a specified amount of time.
25.
*/
26.
TIMED_WAITING,
27.
/**
28.
* The thread has been terminated.
29.
*/
30.
TERMINATED
31.
}
C代码中定义的线程状态:
1.
// /art/runtime/thread_state.h
2.
enum ThreadState {
3.
// Thread.State JDWP state
4.
kTerminated = 66, // TERMINATED TS_ZOMBIE Thread.run has returned, but Thread* still around 5.
kRunnable, // RUNNABLE TS_RUNNING runnable
6.
kTimedWaiting, // TIMED_WAITING TS_WAIT in Object.wait() with a timeout
7.
kSleeping, // TIMED_WAITING TS_SLEEPING in Thread.sleep()
8.
kBlocked, // BLOCKED TS_MONITOR blocked on a monitor
9.
kWaiting, // WAITING TS_WAIT in Object.wait()
10.
kWaitingForGcToComplete, // WAITING TS_WAIT blocked waiting for GC
11.
kWaitingForCheckPointsToRun, // WAITING TS_WAIT GC waiting for checkpoints to run
12.
kWaitingPerformingGc, // WAITING TS_WAIT performing GC
13.
kWaitingForDebuggerSend, // WAITING TS_WAIT blocked waiting for events to be sent
14.
kWaitingForDebuggerToAttach, // WAITING TS_WAIT blocked waiting for debugger to attach
15.
kWaitingInMainDebuggerLoop, // WAITING TS_WAIT blocking/reading/processing debugger events 16.
kWaitingForDebuggerSuspension, // WAITING TS_WAIT waiting for debugger suspend all
17.
kWaitingInMainSignalCatcherLoop, // WAITING TS_WAIT blocking/reading/processing signals
20.
kWaitingForDeoptimization, // WAITING TS_WAIT waiting for deoptimization suspend all
21.
kWaitingForMethodTracingStart, // WAITING TS_WAIT waiting for method tracing to start
22.
kWaitingForVisitObjects, // WAITING TS_WAIT waiting for visiting objects
23.
kWaitingForGetObjectsAllocated, // WAITING TS_WAIT waiting for getting the number of allocated objects
24.
kStarting, // NEW TS_WAIT native thread started, not yet ready to run managed code
25.
kNative, // RUNNABLE TS_RUNNING running in a JNI native method
26.
kSuspended, // RUNNABLE TS_RUNNING suspended by GC or debugger
27.
};
两者可以看出在C代码中定义更为详细,Traces中显⽰的线程状态都是C代码定义的.我们可以通过查看线程状态对应的信息分析ANR问题.如: TimedWaiting对应的线程状态是TIMED_WAITING
kTimedWaiting, // TIMED_WAITING TS_WAIT in Object.wait() with a timeout 执⾏了⽆超时参数的wait函数
kSleeping, // TIMED_WAITING TS_SLEEPING in Thread.sleep() 执⾏了带有超时参数的sleep函数
1.
ZOMBIE 线程死亡,终⽌运⾏
2.
RUNNING/RUNNABLE 线程可运⾏或正在运⾏
3.
TIMED_WAIT 执⾏了带有超时参数的wait、sleep或join函数
4.
MONITOR 线程阻塞,等待获取对象锁
5.
WAIT 执⾏了⽆超时参数的wait函数
6.
INITIALIZING 新建,正在初始化,为其分配资源
7.
STARTING 新建,正在启动
8.
NATIVE 正在执⾏JNI本地函数
9.
VMWAIT 正在等待VM资源
10.
SUSPENDED 线程暂停,通常是由于GC或debug被暂停。

相关文档
最新文档