10 Days in Tech Hell

I’ve had the worst 10 days in technician hell. I had the opportunity to work on several computers that were either very broken, requiring enormous effort to fix, or seemed to be almost fixed and then gave me a nice surprise right at the last minute. Let me see if I can sum it up…

One computer came in with a massive malware and rootkit infection that took extreme effort to repair without a format. The hard drive had to be removed multiple times, and the system had to be booted into a WinPE environment multiple times to use a series of anti-virus scans and my patented “Most Recently Created Files = DELETE!” trick to remove the infection. This machine also had the MXZ virus, a nasty thing I would like to call a “system disabler”. It physically kills the taskmgr.exe file and drops hundreds of thousands of DLL files into the Desktop, Windows and Windows/System32 directories. Deleting those with a wildcard command line took, oh, about 4 HOURS! Final result: machine cleaned and working properly, and the accounts of the kids who infected the computer were seriously locked down using manual registry hacks to system policy (XP Home, no GPEDIT.MSC!), per parents’ request. They can’t download anything, install anything, run Java, Flash or ANY ActiveX control, create or change passwords for their user accounts, or create new user accounts to bypass the security measures. I couldn’t keep them from running IM clients, but since they couldn’t download them or install them, I figured I was okay on that front. I don’t think we’ll see that one in again for malware infection.
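For the curious, the “Most Recently Created Files” trick boils down to listing whatever landed on the disk around the time the trouble started and eyeballing the results. Here is a rough Python sketch of just the listing part, not the exact procedure I used on the bench. The function name `recently_created` and the 7-day window are made up for illustration. It assumes Windows, where `st_ctime` is the file creation time; on Unix-like systems that field is the last metadata change instead. It only lists files and deletes nothing, because the deleting should stay a human decision.

```python
import os
import time


def recently_created(root, max_age_days=7, now=None):
    """Return (timestamp, path) pairs for files under root whose
    st_ctime falls inside the last max_age_days, newest first.

    Assumption: on Windows st_ctime is the creation time. On
    Unix-like systems it is the last metadata change instead.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                stamp = os.stat(path).st_ctime
            except OSError:
                # Locked or vanished file: skip it and keep going.
                continue
            if stamp >= cutoff:
                hits.append((stamp, path))
    hits.sort(reverse=True)
    return hits


if __name__ == "__main__":
    # Review the list by hand before touching anything.
    for stamp, path in recently_created(r"C:\Windows\System32", 7):
        print(time.strftime("%Y-%m-%d %H:%M", time.localtime(stamp)), path)
```

Run it from a clean environment against the infected drive, then go through the output by hand before deleting anything.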

Another computer also had a rootkit infection. This one appended itself to ndis.sys (somehow bypassing Windows File Protection) and hijacked the network stack to do whatever the heck it wanted. That machine had weird DHCP and DNS issues, random BSODs, etc. Fixed that with another round of the “Most Recently Created Files = DELETE!” trick and the obligatory scans. After I got this system all clean and tried Windows Updates, Microsoft felt obliged to inform me that this customer’s machine was running Windows XP Pro VLK. I called the customer and got approval to replace their OS with Windows XP Pro OEM, via repair install/upgrade. Turns out that there was something really wacky with their registry, and the repair install resulted in no Primary IDE or Secondary IDE controllers being listed in the proper class key in the registry, giving me 0x0000007b, DIRECTLY AFTER FINISHING AN ENTIRE REPAIR INSTALLATION, AN INSTALLATION WHICH HAD TO BOOT OFF THE FRIGGIN’ HARD DRIVE TO COMPLETE THE SECOND HALF! I spent 1.5 hours trying to swiss-cheese the registry: doing a clean install on a new HDD, exporting its working IDE controller class entries, then loading the hive of the customer’s installation and importing the entries. All to no avail, so I gave up. Final Result: back up data, wipe HDD, install clean copy of Windows, return data. Basically, the 4 hours I spent removing the infection by hand were rendered completely unnecessary.

Third computer had a weird series of problems. Random BSODs, missing files, file system corruption, incorrect file permissions, broken ActiveX, broken Windows Updates service, broken Cryptographic Service, and the pièce de résistance, “invisible display tabs” (as referenced in another blog post). The cause of all this weirdness was a combination of 2 things. First, a hijacker, Smitfraud most likely, that killed much functionality via registry-hacked policy (no task manager, no display tabs, no changing certain IE settings, etc.). Second, a bad stick of RAM. Replaced the RAM, removed the infection, then bounced my head off the wall forever trying to figure out the “invisible display tabs” problem, wading through all the crap Google spewed at me regarding System Policy and finally settling on this page, which had the right fix. That fix definitely went into the Service Wiki.
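To give an idea of what “registry-hacked policy” looks like, here is a hedged Python sketch that reports which of the commonly abused per-user policy values are switched on. This is not the fix from the page I mentioned. It assumes the restrictions live under the current user’s `Policies\System` key and uses the value names I know of for Task Manager and the Display tabs (`DisableTaskMgr`, `NoDispCPL`, and the `NoDisp...Page` family). The helper names `active_restrictions` and `read_policy_values` are made up for illustration, and a hijacker may well set other values in other keys.

```python
# Per-user policy key where these restrictions are commonly stored (assumption).
POLICY_KEY = r"Software\Microsoft\Windows\CurrentVersion\Policies\System"

# Values that hide Task Manager or Display Properties tabs when set to 1.
RESTRICTIVE_VALUES = (
    "DisableTaskMgr",
    "NoDispCPL",
    "NoDispBackgroundPage",
    "NoDispScrSavPage",
    "NoDispAppearancePage",
    "NoDispSettingsPage",
)


def active_restrictions(values):
    """Given a {value name: data} mapping, return the sorted names of
    known restrictive values that are set to something non-zero."""
    return sorted(name for name in RESTRICTIVE_VALUES if values.get(name))


def read_policy_values():
    """Read every value under the current user's policy key.
    Windows only; returns an empty dict if the key does not exist."""
    import winreg  # imported here so the rest works on any platform

    found = {}
    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, POLICY_KEY) as key:
            index = 0
            while True:
                try:
                    name, data, _kind = winreg.EnumValue(key, index)
                except OSError:
                    break  # no more values under this key
                found[name] = data
                index += 1
    except FileNotFoundError:
        pass
    return found


if __name__ == "__main__":
    for name in active_restrictions(read_policy_values()):
        print("restriction active:", name)
```

It only reads and reports. Clearing the values is a separate, deliberate step once you know the infection is gone.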

Then I had a weird computer that would not boot into normal mode. While trying to boot into normal mode, it would proceed all the way past the splash screen to the point where it first initializes the video card, then freeze. Tried new video drivers, disabling ALL (when I say all, I mean ALL) unnecessary system services, disabling all unnecessary hardware, shutting off all startup programs, and even creating a new user account. None of that worked. I ran hardware scans, and those all passed. For giggles, I ghosted the customer’s drive to a new drive, and voila, it booted. Once. It stopped booting from the new drive after that. I tried a repair install of Windows on the customer’s drive: no joy. I tried a repair install on the new drive: worked great. I tried booting the customer’s drive again and this time I put my ear to it. It was very quiet, but it was exhibiting click-of-death even though the diagnostic passed. Replaced the drive, did the repair install, system fixed.

Fifth computer was just comic. Came in because the customer said that it rarely boots to Windows anymore, and most of the time it won’t turn on. When it does turn on, he says it “turns on, then shuts right back off.” We figured bad power supply, and I grabbed it thinking it would be a quick job for me. Ended up being bad motherboard. A Dell with a bad motherboard. 4 years old. No direct replacement motherboard available. Chassis is proprietary to the motherboard. Heatsink is proprietary to the motherboard. Only new motherboards I can get have 2 RAM slots, he has 4 sticks of RAM. New motherboard, RAM, system chassis and heatsink/fan combo, basically a complete system rebuild. Yeah, that was a $400 repair.

And the last computer. This one looked like it would be a real challenge to figure out, because according to the sign-in description, it wouldn’t complete POST. I figured “what the hell, I’m already on a roll, let’s pick the one that will take 3 days to figure out.” I benched it and, lo and behold, it would not finish POST. Got to the point just before detecting IDE drives and would freeze. Cleared CMOS, no joy. Disconnected all unnecessary parts, no joy. Swapped RAM, no joy. I figured motherboard, so I took the customer’s CPU, Power Supply, and RAM, attached them to a new motherboard: no POST/Video. The only things left were CPU and Power Supply. I swapped in a new Power Supply first, and the customer’s hardware on the new motherboard POSTed. AH HA! Then I tried all the customer’s old hardware, old motherboard included, with the new power supply. POST freezes. AH HA AGAIN! Customer had a failing power supply that most likely damaged the motherboard, or vice versa. Called customer, got approval to replace motherboard+power supply, installed them, Windows didn’t even require a repair install, installed the 2 missing drivers, and the system was done. Total diagnostic and repair time: a little over an hour.

Figures. The last one I take, the one I figured would be the hardest of all of these, ends up being the easiest. Just my technician’s luck, I guess.