Several weeks ago we encountered with very interesting crash of one of our product’s processes. After analyzing the dump we found that exceptions were excited inside two threads simultaneously.
The first one roused up the debugger:
0355f0f0 7c3627e4 ntdll!RtlAllocateHeap+0x655 cmp edi,dword ptr [eax+4] ds:0023:8851e6cc=???????? 0355f130 7c36280c msvcr71!_heap_alloc+0xe0 0355f138 7c362829 msvcr71!_nh_malloc+0x10 0355f144 7c3eb633 msvcr71!malloc+0xf 0355f154 7c3c1f0e msvcp71!operator new+0x21 0355f9cc 7c3c4f9e msvcp71!std::basic_string 0355f9e0 7c3c55df msvcp71!std::basic_string 0355f9fc 7c3c6752 msvcp71!std::basic_string 0355fa10 00595239 msvcp71!std::basic_string 0355fc4c 00595387 SCServer_dll71!_ConvertBinsToChars+0xb9 0355fdac 007335ea SCServer_dll71!TReadResourceHandler::HandleEvent+0x127 0355ff74 10046453 SCLib71!TReactor::Thread+0xda 0355ff80 7c36b381 ETL!TThread_::ThreadThunkFunction+0x23 0355ffb4 7c80b50b msvcr71!_threadstartex+0x6f 0355ffec 00000000 kernel32!BaseThreadStart+0x37 |
And the second one was pending:
0012e45c 7c90e9ab ntdll!KiFastSystemCallRet 0012e460 7c8633d5 ntdll!ZwWaitForMultipleObjects+0xc 0012e7a0 7c36e289 kernel32!UnhandledExceptionFilter+0x82d 0012e7bc 0040c860 msvcr71!_XcptFilter+0x15f 0012e7c8 7c363943 SCServer71!WinMainCRTStartup+0x1d7 0012e7f0 7c9037bf msvcr71!_except_handler3+0x61 0012e814 7c90378b ntdll!ExecuteHandler2+0x26 0012e8c4 7c90eafa ntdll!ExecuteHandler+0x24 0012e8c4 00409c1c ntdll!KiUserExceptionDispatcher+0xe 0012ebc4 7c1adc5b SCServer71!CMonitoringDlg::OnTimer+0x1c call dword ptr [edx+8] ds:0023:00000008=???????? 0012ec54 7c1a9f01 mfc71!CWnd::OnWndMsg+0x46b 0012ec74 00422d16 mfc71!CWnd::WindowProc+0x22 […] 0012ef74 77d487eb user32!InternalCallWinProc+0x28 0012efdc 77d489a5 user32!UserCallWinProcCheckWow+0x150 0012f03c 77d4bccc user32!DispatchMessageWorker+0x306 0012f04c 7c1b1645 user32!DispatchMessageA+0xf 0012f05c 7c1ab833 mfc71!AfxInternalPumpMessage+0x3e 0012f080 7c1aeeed mfc71!CWnd::RunModalLoop+0xca 0012f0bc 00424726 mfc71!CDialog::DoModal+0xf3 0012f0f0 0040142b NSGuiCtl10!CNSGDialog::DoModal+0xc6 0012ff08 7c1ae5d0 SCServer71!CScServerApp::InitInstance+0x9b 0012ff18 0040c80e mfc71!AfxWinMain+0x47 0012ffc0 7c816d4f SCServer71!WinMainCRTStartup+0x185 0012fff0 00000000 kernel32!BaseProcessStart+0x23 |
In both cases the heap was corrupted. Analyzing content of the corrupted memory blocks we found that in second case the block was filled with 0, and in first case the picture was even more interesting - we found the string not far from corrupted address inside the heap structure.
It looked like the following:
0:000> du 033beea0 033beea0 "\Device\HarddiskVolume1\Program " 033beee0 "Files\VirCOM\VSPort.exe" |
This string could hardly be a part of our process. Obviously, it was the path to the VSPort.exe image in terms of kernel devices. Similarly it’s obvious that user mode application has no reason to generate a path, which can be used to refer to device in the device manager. Moreover, the path itself was well known. It was the path to the COM-port emulator application that was supplied by the vendor of the computer, where the crashes had occurred.
At the same time we found out that the size of the data we received from the serial port through the GetOverlappedResult function was much greater than the size of the buffer we sent to the ReadFile function.
All of these facts were clear pointed to the COM-port emulator driver. The only thing we had to do to prove this suggestion was caching the buffer overrun.
First thing we did was setting the user mode processor breakpoint on data access. We set it just after the last buffer byte… and caught nothing. The only scenario in which this could happen was overrunning the buffer from the kernel mode. We increased the buffer size and set the guard just behind the buffer to ensure that our suggestion was right and the buffer had been overran. After receiving wrong data size from GetOverlappedResult we checked the guard and found it completely destroyed.
In fact, this information was enough to realize that the source of the problem is the COM-port emulator driver (along with the VSPort.exe utility). We just changed the emulator to another one and made sure that the problem had gone. But I was wonder how could it happen that the driver destroys the data in our user memory space.
Let’s remember the relationships between user application, I/O manager and device driver.
When user application calls the ReadFile function, the I/O manager (which handles this request finally) does the following:
- Creates the IRP (I/O request packet)
- Creates the buffer from the kernel memory space and attaches it to the IRP
- Finds the read routine inside a device driver object and passes the pointer to the IRP to the read routine
When device driver processed the request in its turn it’s:
- Copies data to the buffer created by the I/O manager
- Calls the IoCompleteRequest routine, returning the IRP to the I/O manager
After the I/O manager receives the IRP back it has to switch a process context to the context of the process which requested read operation (we are interesting in asynchronous operations only, since our code requests asynchronous operation). It accomplishes this by sending APC to the thread, which requested for I/O. When the APC routine executes, it checks for the return code (returned by the driver) and if the code doesn’t indicate an error it copies data from kernel buffer to user buffer and sets I/O event.
That is, according to our suggestion, the I/O manager’s APC routine receives wrong data size from the driver and overruns our buffer. This time I set the data breakpoint just behind the buffer in the kernel mode.
It’s fired on the following instruction:
f677991c 804febb5 820de048 f6779968 f677995c nt!IopCompleteRequest+0x92 f677996c 80502b35 00000000 00000000 00000000 nt!KiDeliverApc+0xb3 |
804f48d0 8b4b1c mov ecx,dword ptr [ebx+1Ch] 804f48d3 8b730c mov esi,dword ptr [ebx+0Ch] 804f48d6 8b7b3c mov edi,dword ptr [ebx+3Ch] 804f48d9 8bc1 mov eax,ecx 804f48db c1e902 shr ecx,2 804f48de f3a5 rep movs dword ptr es:[edi],dword ptr [esi] |
0: kd> ? @eax Evaluate expression: 10051637 = 00996035 |
I found that the value of eax register is very strange, considering the fact that the buffer size which we sent to the I/O manager was 4000h (that is 16K) bytes long.
Let’s check the data inside the IRP:
0: kd> dt @ebx _IRP ntdll!_IRP +0x000 Type : 0n6 +0x002 Size : 0x190 +0x004 MdlAddress : (null) +0x008 Flags : 0x970 +0x00c AssociatedIrp : __unnamed +0x010 ThreadListEntry : _LIST_ENTRY [ 0x82303230 - 0x81522e80 ] +0x018 IoStatus : _IO_STATUS_BLOCK +0x020 RequestorMode : 1 '' +0x021 PendingReturned : 0x1 '' +0x022 StackCount : 2 '' +0x023 CurrentLocation : 4 '' +0x024 Cancel : 0 '' +0x025 CancelIrql : 0 '' +0x026 ApcEnvironment : 0 '' +0x027 AllocationFlags : 0xc '' +0x028 UserIosb : 0x00d291a8 _IO_STATUS_BLOCK +0x02c UserEvent : 0x81ec5328 _KEVENT +0x030 Overlay : __unnamed +0x038 CancelRoutine : (null) +0x03c UserBuffer : 0x02b7c4e0 Void +0x040 Tail : __unnamed 0: kd> dt @ebx+18 _IO_STATUS_BLOCK ntdll!_IO_STATUS_BLOCK +0x000 Status : 0n258 +0x000 Pointer : 0x00000102 Void +0x004 Information : 0x996035 |
There are two things to note:
- The status code is 102h (which is mean IO_TIMEOUT)
- The data size of 996035h bytes can’t be true
So, the task was to find from where this values came from. And it was the time to debug the driver. I had no symbols for it, so it promised to be a funny task.
One way to getting to the place where the data size is calculated is to find the IRPs which are not processed yet (!irpfind) and set the data breakpoint to the field where the data size should be placed.
Here we are:
F8ac5fa0 8050165f vserial+0x8bdf f8ac5fcc 80544e5f nt!KiTimerExpiration+0xb1 f8ac5ff4 805449cb nt!KiRetireDpcList+0x61 f8ac5ff8 f5b508e8 nt!KiDispatchInterrupt+0x2b |
The data size is calculates in the following instruction:
f6301bcf 2b90c0000000 sub edx,dword ptr [eax+0C0h] f6301bd5 8bb0bc000000 mov esi,dword ptr [eax+0BCh] f6301bdb 8d5432ff lea edx,[edx+esi-1] f6301bdf 89511c mov dword ptr [ecx+1Ch],edx |
0: kd> ? @edx Evaluate expression: 2125373441 = 7eaea001 0: kd> ? @esi Evaluate expression: -2125373440 = 81516000 |
The esi and edx registers values should contain the carriage address and inverted pointer to the IRP buffer respectively (since edx register in the first instruction contains the size of the IRP buffer (4000h bytes)).
Probably, the source of the problem could be related with these addresses. To check it out, I added the trace print to this instruction which prints the values of the data carriage pointer, base address of the buffer and resulting size of data inside the buffer, and stops the execution if the data size exceeds the user buffer size (that is 16K).
The breakpoint fired on the following:
[…] [811f5000,811f5035] - 35 [8182b000,8182b000] - 0 [822ea000,822ea000] - 0 [811ef000,811ef035] - 35 [8182b000,8182b000] - 0 [822ea000,822ea000] - 0 [811f5000,811f5035] - 35 [8182b000,8182b000] - 0 [822ea000,822ea000] - 0 [811ef000,811ef035] - 35 [8182b000,8182b000] - 0 [822ea000,822ea000] - 0 [811f5000,811f5035] - 35 [8182b000,8182b000] - 0 [822ea000,822ea000] - 0 [811ef000,811ef035] - 35 [8182b000,8182b000] - 0 [822ea000,822ea000] - 0 [811f5000,820b7035] - ec2035 |
Now I had to find the source of the pointers values and the reason why the carriage pointer doesn’t correspond to its buffer base address. But before we continue, let’s get some more information about the driver and its structure, since it will help us in the future.
Knowing the IRP we can get the name of device this IRP was sent to:
0: kd> !irp 820de048-40 Irp is active with 2 stacks 4 is current (= 00000000) No Mdl: System buffer=81510000: Thread 82303054: Irp is completed. Pending has been returned cmd flg cl Device File Completion-Context [ 0, 0] 0 0 00000000 00000000 00000000-00000000 Args: 00000000 00000000 00000000 00000000 [ 3, 0] 0 0 82211950 00000000 00000000-00000000 *** ERROR: Module load completed but symbols could not be loaded for vserial.sys \Driver\vserial Args: 00000000 00000000 00000000 000000000: kd> !devobj 82211950 Device object (82211950) is for: VSerial0 \Driver\vserial DriverObject 82248970 Current Irp 00000000 RefCount 1 Type 0000001b Flags 0000204c Dacl e1401cac DevExt 82211a08 DevObjExt 82211fd0 ExtensionFlags (0000000000) AttachedTo (Lower) 81eb1760*** ERROR: Module load completed but symbols could not be loaded for vsb.sys \Driver\vsbus Device queue is not busy. |
Knowing the devise name we can get the driver object. More precise, we are interesting in its dispatch routines table:
1: kd> !drvobj vserial 7 Driver object (820c27e0) is for: \Driver\vserial Driver Extension List: (id , addr) Device Object list: 820b5950 821ed040 82182950 8218f880 821c3950 821c8628 DriverEntry: f5e8e442 vserial DriverStartIo: 00000000 DriverUnload: f5e87270 vserial AddDevice: f5e87cf0 vserial Dispatch routines: [00] IRP_MJ_CREATE f5e87680 vserial+0x3680 [01] IRP_MJ_CREATE_NAMED_PIPE 804f4282 nt!IopInvalidDeviceRequest [02] IRP_MJ_CLOSE f5e8c740 vserial+0x8740 [03] IRP_MJ_READ f5e8d420 vserial+0x9420 [04] IRP_MJ_WRITE f5e8e310 vserial+0xa310 [05] IRP_MJ_QUERY_INFORMATION f5e89a40 vserial+0x5a40 [06] IRP_MJ_SET_INFORMATION f5e89b00 vserial+0x5b00 [07] IRP_MJ_QUERY_EA 804f4282 nt!IopInvalidDeviceRequest […] |
Now, when we know the key driver routines addresses, we are ready to further investigation.
Let’s return to the data size calculation and look closely to the marked data fields:
f6301bcf 2b90c0000000 sub edx,dword ptr [eax+0C0h] f6301bd5 8bb0bc000000 mov esi,dword ptr [eax+0BCh] f6301bdb 8d5432ff lea edx,[edx+esi-1] f6301bdf 89511c mov dword ptr [ecx+1Ch],edx |
eax register contains the address of the device extension, created for each instance of the serial port device (let’s call this address DevEx for short):
nt!_DEVICE_OBJECT +0x000 Type : 0n3 +0x002 Size : 0x680 +0x004 ReferenceCount : 0n1 +0x008 DriverObject : 0x82144f38 _DRIVER_OBJECT +0x00c NextDevice : 0x8218f490 _DEVICE_OBJECT +0x010 AttachedDevice : (null) +0x014 CurrentIrp : (null) +0x018 Timer : (null) +0x01c Flags : 0x204c +0x020 Characteristics : 0x100 +0x024 Vpb : (null) +0x028 DeviceExtension : 0x82028a08 Void +0x02c DeviceType : 0x1b +0x030 StackSize : 2 '' +0x034 Queue : __unnamed +0x05c AlignmentRequirement : 0 +0x060 DeviceQueue : _KDEVICE_QUEUE +0x074 Dpc : _KDPC +0x094 ActiveThreadCount : 0 +0x098 SecurityDescriptor : 0xe13f8168 Void +0x09c DeviceLock : _KEVENT +0x0ac SectorSize : 0 +0x0ae Spare1 : 0 +0x0b0 DeviceObjectExtension : 0x82028fd0 _DEVOBJ_EXTENSION +0x0b4 Reserved : (null) |
Setting the breakpoints to this memory addresses we can easily find out instructions, which fills in the fields.
Here is one of such instructions sets:
f5fed3ae 8b501c mov edx,dword ptr [eax+1Ch] f5fed3b1 03d1 add edx,ecx f5fed3b3 8996bc000000 mov dword ptr [esi+0BCh],edx ds:0023:81ff3354=8121c000 f5fed3b9 8b4860 mov ecx,dword ptr [eax+60h] f5fed3bc 8b5104 mov edx,dword ptr [ecx+4] f5fed3bf 8b8eb4000000 mov ecx,dword ptr [esi+0B4h] f5fed3c5 8d540aff lea edx,[edx+ecx-1] f5fed3c9 8996c0000000 mov dword ptr [esi+0C0h],edx |
This instructions set executes as a part of the read routine of the device object (this is where the knowledge about the driver routines addresses can be useful). The driver fills in the following fields, after the data from the port buffer was copied to the buffer, attached to the IRP.
DevEx+BCh now contains the address of the IRP buffer carriage (address of the buffer + number of bytes copied) and DevEx+C0h now points to the end of the IPR buffer (address of the buffer + size of the buffer (4000h bytes)).
Another instructions set is the following:
f5fec106 40 inc eax f5fec107 5f pop edi f5fec108 8986bc000000 mov dword ptr [esi+0BCh],eax ds:0023:82028ac4=81fe102e f5fec10e 5e pop esi f5fec10f c20800 ret 8 |
This instructions set executes as a part of the write completion routine, which executes on the DPC IRQL level. Since the read routine executes on the Passive IRQL, this write completion routine will interrupt execution of the read routine. Moreover, there is no synchronization to keep this data in consistent state.
The write completion routine copies data to the port buffer byte by byte and uses the DevEx+BCh data field to save the pointer to the carriage of the port buffer – that is completely another buffer than the buffer which DevEx+C0h points to. Thus, if the write completion routine will accidentally interrupt the read routine, it can change the value of the DevEx+BCh to the address inside another buffer, which will subsequently lead to the client’s process crash.
To be fair I have to say that this time I met with the vserial driver wasn’t first. The number of utilities uses this driver and there is no problem observed. But when I checked the work of the driver with several utilities, I found that they use the driver completely different way than the VSPort utility does, and the driver code which was discussed in the article is not executed.
Also, it seems that the bug is a consequence of the misprint. Logically, two pointers for read and write routines just should be saved inside different data fields.
No comments:
Post a Comment