IIS 5.0, Win2K SP4, Help with crash dump

Hi Guys/Girls,

This is my first proper post here, but I'm hoping some of the more experienced debuggers (or users of WinDBG) might be able to help with a major problem we're having.

Basically, our live web server (Windows 2000 SP4, IIS 5, all hotfixes applied) keeps crashing at random. To be more precise, IIS keeps crashing, and all of our sites go down with it.

Once the server has 'crashed', an IIS Reset is unable to restart IIS, and the only way to get the sites back online is to reboot the server.

This has been a problem for quite some time now, and it's very random: sometimes it happens twice in a week, sometimes twice in a month, and sometimes not for a couple of months, but the problem never goes away.

We host many e-commerce websites, so obviously this is causing a major headache for both us and our customers when the sites are offline.

So, what I've been doing recently is setting up some tools to try to capture what is causing IIS to fail. One tool I'm using is the IIS Crash/Hang Agent, which has created many logs and dumps from the times IIS has failed.

Unfortunately, I don't really understand the information that WinDBG gives me when I open a dump and run the '!analyze -v' command.

This is what is returned when the analyze command is given:
___________________________________________________

0:000> !analyze -v
*******************************************************************************
* *
* Exception Analysis *
* *
*******************************************************************************

*************************************************************************
*** ***
*** ***
*** Your debugger is not using the correct symbols ***
*** ***
*** In order for this command to work properly, your symbol path ***
*** must point to .pdb files that have full type information. ***
*** ***
*** Certain .pdb files (such as the public OS symbols) do not ***
*** contain the required information. Contact the group that ***
*** provided you with these symbols if you need this command to ***
*** work. ***
*** ***
*** Type referenced: ntdll!_PEB ***
*** ***
*************************************************************************

FAULTING_IP:
+0
00000000 ?? ???

EXCEPTION_RECORD: ffffffff -- (.exr ffffffffffffffff)
ExceptionAddress: 00000000
ExceptionCode: 80000007 (Wake debugger)
ExceptionFlags: 00000000
NumberParameters: 0

BUGCHECK_STR: 80000007

DEFAULT_BUCKET_ID: APPLICATION_FAULT

PROCESS_NAME: DLLHOST.EXE

ERROR_CODE: (NTSTATUS) 0x80000007 - {Kernel Debugger Awakened} the system debugger was awakened by an interrupt.

LAST_CONTROL_TRANSFER: from 7c59a072 to 77f88f13

STACK_TEXT:
0006fd28 7c59a072 0000004c 00000000 00000000 NTDLL!ZwWaitForSingleObject+0xb
0006fd50 7c57b3e9 0000004c ffffffff 00000000 KERNEL32!WaitForSingleObjectEx+0x71
0006fd60 7ce7b194 0000004c ffffffff 000745fc KERNEL32!WaitForSingleObject+0xf
0006fd80 7ce7a991 000745e8 ffffffff 0006fdbf OLE32!CSurrogateProcessActivator::WaitForSurrogateTimeout+0x4f
0006fd9c 01001230 0006ff10 00000000 00072d80 OLE32!CoRegisterSurrogateEx+0x169
0006ff24 010014c6 01000000 00000000 00072d80 DLLHOST!WinMain+0xb0
0006ffc0 7c5989a5 ffffffff 00caef70 7ffdf000 DLLHOST!WinMainCRTStartup+0x156
0006fff0 00000000 01001370 00000000 000000c8 KERNEL32!BaseProcessStart+0x3d


STACK_COMMAND: ~0s; .ecxr ; kb

FAULTING_THREAD: 00000a0c

FOLLOWUP_IP:
DLLHOST!WinMain+b0
01001230 ff1578100001 call dword ptr [DLLHOST!_imp__CoUninitialize (01001078)]

SYMBOL_STACK_INDEX: 5

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: DLLHOST

IMAGE_NAME: DLLHOST.EXE

DEBUG_FLR_IMAGE_TIMESTAMP: 3e7b8905

SYMBOL_NAME: DLLHOST!WinMain+b0

FAILURE_BUCKET_ID: 80000007_DLLHOST!WinMain+b0

BUCKET_ID: 80000007_DLLHOST!WinMain+b0

Followup: MachineOwner
___________________________________________________
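
Since the agent has produced many dumps, one way to compare them might be to pull the same few fields out of each '!analyze -v' output and count how often each failure bucket turns up. Below is a minimal Python sketch of that idea. It assumes each analysis has been saved as plain text; the function names (summarize_analysis, count_buckets) are just illustrative and not part of WinDBG or the IIS Crash/Hang Agent.

```python
import re
from collections import Counter

# Field names as they appear in the !analyze -v output above.
FIELDS = ("PROCESS_NAME", "DEFAULT_BUCKET_ID", "FAILURE_BUCKET_ID")


def summarize_analysis(text):
    """Return a dict of selected fields from one saved !analyze -v output.

    Hypothetical helper: assumes the output was saved as plain text.
    """
    summary = {}
    for name in FIELDS:
        # Each field is printed as "NAME: value" on its own line.
        match = re.search(
            rf"^[ \t]*{re.escape(name)}:[ \t]*(.+)$", text, re.MULTILINE
        )
        if match:
            summary[name] = match.group(1).strip()
    # The exception code is the hex number after "ExceptionCode:".
    match = re.search(
        r"^[ \t]*ExceptionCode:[ \t]*([0-9a-fA-F]+)", text, re.MULTILINE
    )
    if match:
        summary["ExceptionCode"] = match.group(1)
    return summary


def count_buckets(outputs):
    """Count how many analyses fall into each FAILURE_BUCKET_ID."""
    return Counter(
        summarize_analysis(text).get("FAILURE_BUCKET_ID", "unknown")
        for text in outputs
    )
```

Run over the text of every saved analysis, count_buckets would show whether the dumps all land in the same bucket as the one above or whether some of them point somewhere else.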


I would really appreciate it if anyone could suggest what the details above mean.

If you need any further details (i.e. the complete log file that was generated), I'll be more than happy to post them.

One thing I noticed that seemed to aggravate the problem was a script that one of our developers ran prior to a site launch. The script sent out 1400 plain-text e-mails containing usernames/passwords for a site, and around 2 hours after it was run, the server went down. The same script was run the next day at a different time and, funnily enough, the server went down 2 hours after that run as well.

Could it be something to do with e-mails? I believe the script was just using the PHP mail() function.

Anyway, I would really appreciate any help with this, so thanks in advance for any suggestions.

Kieran Welch
 