Times when you wanted to cry

Soldato
Joined
26 Feb 2009
Posts
14,814
Location
Exeter
It looks like there are a lot of people here in similar jobs to mine, so I thought it would be amusing to share stories of times you've found mistakes, misconfigurations and just general idiocy on your network.

I'll start - I've been doing some reconfiguration and rebuilding of our AD, file and Exchange servers, and all was going smoothly until I finished the work on the file server. As it was previously a combined DC/file server and I moved to a dedicated DC and a standalone file server, I gave the old server's IP to the new DC and set up a DNS alias for the new file server. All was working fine until I noticed one of our Citrix servers wasn't loading the roaming profiles and I was getting DSAccess errors on the old Exchange server. After a bit of probing, flushing DNS caches and rebooting, I finally discovered what had been done..

Someone had thought it would be a good idea to fill the hosts file on each server with entries for EVERY other server, so my DNS changes were being silently overridden. An easy fix, but I did want to cry - I don't understand how someone can be so stupid. :confused:
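If anyone else inherits something like this, the quickest way to spot it is just to dump the hosts file on each box and eyeball it. A rough sketch in Python (the path is the standard Windows location, so adjust if yours differs):

    # list_hosts.py - print the non-comment entries from the local hosts file,
    # so stale hand-maintained records that shadow DNS stand out
    HOSTS = r"C:\Windows\System32\drivers\etc\hosts"

    with open(HOSTS) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#"):
                print(line)

Anything in there beyond the loopback entries is a candidate for deletion, followed by an ipconfig /flushdns.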

Can anyone beat that? :D
 
Associate
Joined
13 Aug 2004
Posts
1,485
Location
Hampshire
Could write a book on this!

A recent contract was for a large government AD redesign. I appeared onsite and got taken through the current topology (which they had paid tens of thousands to a previous external IT 'partner' to commission). The first warning sign was seeing INT and CORP as the internal domain names, straight out of the Microsoft Active Directory for Dummies book.

Suffice it to say there were countless misconfigurations. The best was finding 54 separate group policies applied to a single OU, as they didn't know that a single policy can encompass multiple settings, as well as 30 domain controllers with only 2 set as global catalogs (one onsite, the other remote - they did wonder why Exchange was timing out on GAL lookups).
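On the global catalog point, for anyone who hasn't been bitten by it: Exchange resolves GAL lookups against a GC, so if hardly any of your DCs are global catalogs (or the nearest one is across a slow link) you get exactly those timeouts. A crude check is simply whether each DC is listening on TCP 3268, the GC LDAP port - a sketch in Python below, with made-up DC names:

    # gc_check.py - report which DCs answer on TCP 3268, the global catalog
    # LDAP port; a DC that isn't a GC won't be listening on it
    import socket

    DCS = ["dc01.corp.example", "dc02.corp.example"]  # hypothetical DC names

    for dc in DCS:
        try:
            with socket.create_connection((dc, 3268), timeout=3):
                print(dc, "is a global catalog")
        except OSError:
            print(dc, "is NOT a global catalog (or is unreachable)")

dsquery server -isgc from the admin tools will tell you much the same thing, but the port check works from anywhere that can see the DCs.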

Still....... you've got to laugh.......
 
Associate
Joined
16 Jan 2006
Posts
655
Location
Surrey
Still to this day, EVERY customer site I've been sent to pre domain migration with my current employer of 7 years (20 or so sites) has had its servers' and workstations' DNS pointing at the ISP's servers. This is one basic bit of understanding that appears (in my experience) not to have fully sunk in yet - even with people claiming to be highly capable administrators of Windows and Directory Services.
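For anyone wondering why it matters: domain members find their DCs through records that only the internal DNS servers hold, so a machine pointed at the ISP's resolvers can't even locate the domain properly. The simplest sanity check is whether the AD domain name resolves at all from the client - a sketch in Python, with an obviously made-up domain name:

    # dns_sanity.py - if the internal AD domain name doesn't resolve, this
    # machine is almost certainly pointed at external (ISP) DNS servers
    import socket

    DOMAIN = "corp.example.local"  # hypothetical internal AD domain

    try:
        print(DOMAIN, "resolves to", socket.gethostbyname(DOMAIN))
    except socket.gaierror:
        print(DOMAIN, "does not resolve - check this client's DNS servers")

The fix is always the same: point clients at the internal DNS servers and let those forward out to the ISP, never the other way round.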
 
Associate
Joined
29 Nov 2007
Posts
513
An old mate of mine, about 4 years ago, saw a couple of monkeys drop a fully loaded HP drive array down a flight of stairs whilst they were relocating the server to a "better" server room to help with network congestion...

Needless to say, there was database corruption on the Exchange server and a couple of days' downtime.

The Exchange guy was in tears; my mate just laughed his ******* head off, as he was in the network team watching about 11 disks clattering down the stairs.
 
Soldato
OP
Joined
26 Feb 2009
Posts
14,814
Location
Exeter
This is generally where strict change control and documentation come into play - it could have saved you a lot of time and effort.

I completely agree, although this problem dates back a long way (the modified dates on the files were all 2006). It's something I've been pushing for, but we're a small team (4 of us), so there's a bit of resistance to it. I can't see any point moving towards a full ITIL change control procedure, but a bit of control and documentation would help!

Oddly, the network was built as a 2003 domain; there have been no upgrades or migrations...
 
Soldato
Joined
28 Dec 2003
Posts
16,081
Was once called out to a customer whose server (which they took to rebooting every time anyone had a problem of any description) wouldn't boot at all.

Went onsite to find BOOT.INI was missing. Recreated it and all was fine.
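For anyone who hasn't had the pleasure, recreating BOOT.INI by hand is only a few lines. For a typical single-disk, single-partition install it looks roughly like the below - the ARC path and OS description vary with the hardware and Windows version, so treat it as a template rather than something to paste in blind:

    [boot loader]
    timeout=30
    default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
    [operating systems]
    multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003" /fastdetect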

I then set about tracking down what exactly had happened, as I doubted even they were stupid enough to have deleted the file. After much searching, I discovered it was actually the McAfee AV they'd installed on the server. Apparently there was a known bug with the specific version they'd installed which, when the software went online to check for updates for the first time, caused it to delete BOOT.INI!!!

Quite the most spectacular software bug I've ever come across, especially considering it was software designed to protect them. The most annoying thing was that McAfee support denied all knowledge of it, and it took a lot of persistence and escalation before I actually spoke to someone who admitted to the bug. The minions were all obviously under strict instructions never to admit to it.
 
Soldato
Joined
18 Oct 2002
Posts
5,299
Came across a customer with approx 40 PCs and one domain controller, which was also running Exchange. No machines were joined to the domain and all were using workstation-based routes and hosts files. After moving them to a brand spanking new SAS-based server and joining them all to the domain, the 'network' slowed to a crawl. This is one of those *wtf* moments when you know the new solution should utterly own what the customer previously had.
After some (loads of) digging, I found the previous supplier had amended the %PATH% on EVERY machine to include a UNC path to an old W2K file server they were using, because none of them were joined to the domain.
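If you ever need to hunt that sort of bodge down, the quickest audit is to walk the PATH on each machine and flag anything pointing at a network share. A rough sketch in Python:

    # path_audit.py - flag PATH entries that are UNC paths, which usually
    # means someone has been 'fixing' things machine by machine
    import os

    for entry in os.environ.get("PATH", "").split(os.pathsep):
        if entry.strip().startswith("\\\\"):
            print("UNC entry in PATH:", entry)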

It seems rare to get a customer where you look over the infrastructure and think, "hmm, someone has done a half-tidy job here..." :p
 
Soldato
OP
Joined
26 Feb 2009
Posts
14,814
Location
Exeter
You have to wonder how some of these companies get away with it. A lot of these stupid mistakes seem to be made by consultancy companies who charge a lot of money for their services.
Another one I saw was an Exchange 2003 server that had been set up by a fairly large, well-known consultancy/support company. It was a 6-disk system with three mirror arrays configured, labelled "system", "database" and "transaction logs". Fair enough, it's an OK configuration, I thought. On closer inspection, though, the "system" volume contained the OS and the transaction logs, the "database" volume contained the EDB files, and the "transaction logs" volume had the STM files! That's a mistake so fundamental I had to double-check I wasn't imagining it..
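For anyone wondering what they were presumably aiming for, the usual split on a six-disk Exchange 2003 box is roughly the below (drive letters purely for illustration). The EDB and STM files belong together on the database spindles, and the transaction logs get their own mirror, away from both:

    Mirror 1 (C:) - OS and pagefile
    Mirror 2 (D:) - Exchange databases (EDB + STM together)
    Mirror 3 (E:) - transaction logs

Splitting the EDB from the STM and dumping the logs on the system volume defeats the point of having the three arrays at all.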
 
Associate
Joined
29 Nov 2007
Posts
513
When I was just starting out, I learnt real fast on customer sites: don't trust the KVM's labelling, and double-check computer names.

Yep, you know the one - you've finally got approval/sign-off to shut down/reboot the server, and you blindly trust what it says on the KVM screen and don't bother to check the hostname.

lol... so I logged in, hit shutdown and waited, watching the physical print server in front of me to see it power down. Much to my surprise, the large SAP database beast of a server (well, it was last decade) went quiet directly behind me and the little print server in front was still online... It didn't take long for the network manager to come running in saying "we really must work out how to rename the KVM names", or words to that effect.
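These days the easy insurance is a quick check that the box you're logged into really is the one on the sign-off - a minimal sketch in Python, where the expected name is whatever is on the change ticket (hypothetical here):

    # confirm_host.py - refuse to continue unless the machine you're logged
    # into really is the one you've got sign-off to shut down
    import socket
    import sys

    EXPECTED = "PRINTSRV01"  # hypothetical hostname from the change ticket

    actual = socket.gethostname()
    if actual.lower() != EXPECTED.lower():
        sys.exit("This is " + actual + ", not " + EXPECTED + " - aborting.")
    print("Confirmed", actual, "- safe to proceed.")

Or just run hostname at a prompt first - the point is to trust the machine, not the KVM label.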
 
Associate
Joined
9 Feb 2004
Posts
593
Location
derbyshire
I know of a council that gives all their users local admin rights on all their machines.
Not sure how common that is :)
They also have servers on their network that they don't know the purpose of, so they just keep them running lol
 
Soldato
Joined
5 May 2003
Posts
4,515
Location
UK
When I started my new job there were over 45 domain admins....

... and 2 real domain admins (myself and my boss).
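If anyone wants to put a number on theirs, membership of Domain Admins is easy enough to dump from any domain-joined box - a quick sketch in Python that just wraps the built-in net command:

    # list_domain_admins.py - dump Domain Admins membership so the
    # "who on earth are all these accounts?" conversation can start
    import subprocess

    result = subprocess.run(
        ["net", "group", "Domain Admins", "/domain"],
        capture_output=True, text=True,
    )
    print(result.stdout)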
 
Associate
Joined
26 Apr 2004
Posts
1,603
Location
Kent/London
I'm not a server bod like most people here as I work for a large IP carrier.

About 10 months ago I arranged for some of our DC power kit to have a faulty rectifier replaced by the manufacturer. I arranged it for 3am-6am as a precaution, as there are routers in our suite with multiple links to the US and Europe, as well as carrying the majority of UK traffic.
There was enough rectifier capacity to hold the load while the faulty one was replaced. Ten minutes into the change I was on the other side of the room when suddenly the room went deadly silent... no routers... no switches... no DWDM kit... nothing.
The engineer had pulled the wrong rectifier, meaning the remaining capacity couldn't take the entire load; the breakers blew and we lost the load. This was despite a label on the rectifier, a full MOP, looking at it together 15 minutes earlier and a previous inspection by the same guy!!!
Needless to say I wanted to cry. Within 30 seconds the NOC was ringing me, then my director, and then the CEO, who was at a dinner function in Washington.
Anyway, 3 minutes later the power was back, and after about 45 minutes of testing all the kit was back up. Oh, and he did change the faulty rectifier eventually.
At that moment we lost over 200Gb of Internet traffic :eek:

Moral of the story... do it yourself!
 