It looked as follows:
0012fd14: USER32!_except_handler3+0 (7e440457) CRT scope 0, func: USER32!UserCallWinProc+10a (7e44aa1c) 0012fd6c: USER32!_except_handler3+0 (7e440457) CRT scope 0, filter: USER32!DispatchMessageWorker+113 (7e440712) func: USER32!DispatchMessageWorker+126 (7e44072a) 0012ffb0: VanishExcept!ILT+375(__except_handler3)+0 (0041117c) 0012ffe0: kernel32!_except_handler3+0 (7c839af0) CRT scope 0, filter: kernel32!BaseProcessStart+29 (7c84377a) func: kernel32!BaseProcessStart+3a (7c843790) Invalid exception stack at ffffffff |
There were no user exception handlers at all, but instead was an exception handler I have not seen before:
0012fd14: USER32!_except_handler3+0 (7e440457) CRT scope 0, func: USER32!UserCallWinProc+10a (7e44aa1c) 0012fd6c: USER32!_except_handler3+0 (7e440457) CRT scope 0, filter: USER32!DispatchMessageWorker+113 (7e440712) func: USER32!DispatchMessageWorker+126 (7e44072a) 0012ffb0: VanishExcept!ILT+375(__except_handler3)+0 (0041117c) 0012ffe0: kernel32!_except_handler3+0 (7c839af0) CRT scope 0, filter: kernel32!BaseProcessStart+29 (7c84377a) func: kernel32!BaseProcessStart+3a (7c843790) Invalid exception stack at ffffffff |
So I decided to check if this is a handler which handles our exception. Surprisingly, it was the handler I was looking for. For me, it was absolutely not clear why the OS decided to handle the exception having no meaning for it instead of reporting about a serious problem and terminating a program as a result? This behavior, with no doubt, is strange and dangerous! The only way to go in this situation is just to leave an exception unhandled and let the system to execute Dr. Watson or whatever tool which was adjusted instead. In case of Dr. Watson, the user will have an opportunity to send a report to WER. This is why this service was created indeed.
The next thing I did was setting the breakpoint at the USER32!DispatchMessageWorker+113 instruction in order to figure out why DispatchMessageWorker decides to handle the exception. I discovered that the decision to handle an exception or not DispatchMessageWorker’s exception filter takes depending on the 17th bit of the _teb.Win32ClientInfo.dwCompatFlags2 flag. If the bit is not set filter returns EXCEPTION_EXECUTE_HANDLER. In our case no one bit in dwCompatFlags2 was set, thus filter reported that DispatchMessageWorker’s handler can handle exception no matter what the exception is.
USER32!GetAppCompatFlags2: 7e418ed6 8bff mov edi,edi 7e418ed8 55 push ebp 7e418ed9 8bec mov ebp,esp 7e418edb 64a118000000 mov eax,dword ptr fs:[00000018h] 7e418ee1 83784000 cmp dword ptr [eax+40h],0 7e418ee5 0f842b550100 je USER32!GetAppCompatFlags2+0x11 (7e42e416) 7e418eeb 64a118000000 mov eax,dword ptr fs:[00000018h] 7e418ef1 0fb74d08 movzx ecx,word ptr [ebp+8] 7e418ef5 3b88d4060000 cmp ecx,dword ptr [eax+6D4h] 7e418efb 0f8238ae0000 jb USER32!GetAppCompatFlags2+0x2e (7e423d39) // Get the pointer to TEB (ntdll!_TEB) 7e418f01 64a118000000 mov eax,dword ptr fs:[00000018h] // Get the value of _CLIENTINFO.dwCompatFlags2 flag 7e418f07 8b80dc060000 mov eax,dword ptr [eax+6DCh] 7e418f0d 5d pop ebp 7e418f0e c20400 ret 4 USER32!DispatchMessageWorker+0x113: 7e440712 6800040000 push 400h 7e440717 e8ba87fdff call USER32!GetAppCompatFlags2 (7e418ed6) // Check for the 17th bit 7e44071c c1e810 shr eax,10h 7e44071f f7d0 not eax 7e440721 83e001 and eax,1 // Return EXCEPTION_EXECUTE_HANDLER == 1 7e440724 c3 ret |
Sounds great, isn’t it?!
Now let’s look what the handler does:
USER32!DispatchMessageWorker+0x126: 7e44072a 8b65e8 mov esp,dword ptr [ebp-18h] 7e44072d 33c0 xor eax,eax 7e44072f e96090fdff jmp USER32!DispatchMessageWorker+0x12b (7e419794) USER32!DispatchMessageWorker+0x12b: 7e419794 834dfcff or dword ptr [ebp-4],0FFFFFFFFh 7e419798 e948f2ffff jmp USER32!DispatchMessageWorker+0x378 (7e4189e5) USER32!DispatchMessageWorker+0x378: 7e4189e5 e816fcffff call USER32!_SEH_epilog (7e418600) 7e4189ea c20800 ret 8 |
Nothing, if fact. It’s just returns from DispatchMessageWorker.
It’s absolutely obvious that if DispatchMessageWorker registered this exception handler any time it dispatches some message, this problem would be well known and most likely fixed already (because I consider this behavior as improper). So there should be some special circumstances to register the handler.
The exact point of where DispatchMessageWorker registers the handler is:
USER32!DispatchMessageWorker+0xbf: 7e419742 393d8c00477e cmp dword ptr [USER32!gfServerProcess (7e47008c)],edi 7e419748 0f8527f2ffff jne USER32!DispatchMessageWorker+0x134 (7e418975) 7e41974e 50 push eax 7e41974f ff7608 push dword ptr [esi+8] 7e419752 ff36 push dword ptr [esi] 7e419754 e849000000 call USER32!NtUserValidateTimerCallback (7e4197a2) 7e419759 85c0 test eax,eax 7e41975b 0f84e3420200 je USER32!DispatchMessageWorker+0x17c (7e43da44) // Set the block counter to 0 7e419761 897dfc mov dword ptr [ebp-4],edi // edi == 0 7e419764 3bdf cmp ebx,edi 7e419766 0f85ed590100 jne USER32!DispatchMessageWorker+0xe1 (7e42f159) |
And the conditions necessary to instruction execution are the following:
1. WM_TIMER message
USER32!DispatchMessageWorker: 7e4188f1 6a1c push 1Ch 7e4188f3 68f089417e push offset USER32!MessageTable+0x440 (7e4189f0) 7e4188f8 e8c3fcffff call USER32!_SEH_prolog (7e4185c0) 7e4188fd 33ff xor edi,edi 7e4188ff 897de4 mov dword ptr [ebp-1Ch],edi // Set MSG* param to esi 7e418902 8b7508 mov esi,dword ptr [ebp+8] USER32!DispatchMessageWorker+0x6e: 7e418954 0f85947d0200 jne USER32!DispatchMessageWorker+0x70 (7e4406ee) // Get MSG.message value 7e41895a 8b5604 mov edx,dword ptr [esi+4] // and check if it’s equal to WM_TIMER 7e41895d 81fa13010000 cmp edx,113h 7e418963 0f84f0000000 je USER32!DispatchMessageWorker+0xa1 (7e418a59) |
2. Valid timer callback function
USER32!DispatchMessageWorker+0xbf: 7e419742 393d8c00477e cmp dword ptr [USER32!gfServerProcess (7e47008c)],edi 7e419748 0f8527f2ffff jne USER32!DispatchMessageWorker+0x134 (7e418975) 7e41974e 50 push eax 7e41974f ff7608 push dword ptr [esi+8] 7e419752 ff36 push dword ptr [esi] 7e419754 e849000000 call USER32!NtUserValidateTimerCallback (7e4197a2) 7e419759 85c0 test eax,eax 7e41975b 0f84e3420200 je USER32!DispatchMessageWorker+0x17c (7e43da44) 7e419761 897dfc mov dword ptr [ebp-4],edi 7e419764 3bdf cmp ebx,edi 7e419766 0f85ed590100 jne USER32!DispatchMessageWorker+0xe1 (7e42f159) |
I think those conditions are not uncommon for the SetTimer function.
I was so discouraged, that I thought that there should be something which I don’t know and I posted the issue to MSDN forum (http://social.msdn.microsoft.com/Forums/en-US/windowsgeneraldevelopmentissues/thread/c1e34bd0-8ea4-4869-bb20-4e4d0ed586fe). I described the issue and asked the community to the following questions:
1. Why does the DispatchMessageWorker behaves this way (looks like a patch)?
2. What compatibility flag should I set to bypass this behavior?
And the silence was an answer :)
I sent the issue to the Microsoft technical support finally. After several months of correspondence they acknowledged that the behavior of the DispatchMessageWorker function is abnormal and suggested that I catch all exceptions in the timer function and call MiniDumpWriteDump function inside my exception handler. When I explained to the escalation engineer why the idea is not very good and sometimes even inapplicable, he promised me to try to find another solution.
Look forward to some results. And to be continued…
I would be really happy to see news about your problem? we have the same "anti-crash prevention" problem, which blocks us in certain conditions from sending our own crash reports.
ReplyDelete(sorry if it is there twice but first time i submitted nothing happened on the page)
Thanks for the great article! I was chasing the very same problem for months!
ReplyDeleteWhat made things even more difficult is the fact that in at least one routine triggered via SetTimer() I _did_ receive a crash (Why?), whereas in others the crash was eaten as you described - and the application continued working with really bad delayed problems. The call to SetTimer() is identical in all my cases ...
So, not knowing when and why some exceptions are handled correctly and some not I have now enclosed all WM_TIMER called routines in __try() blocks, which dumps and exits the program.
Thanks and that i have a keen proposal: How Much Renovate House whole home renovation cost
ReplyDelete