Times when you wanted to cry

I've not been in the gig long (coming up for 5 years) but I've seen a fair few of these moments :)

A couple for you:
Builders!
Builders dug through a 50 pair voice cable, not once but twice - the second time after a steel culvert had been laid and concreted over following the first. I still have no idea to this day why they decided to dig up a driveway less than a month old... This lost voice to an entire building of secretaries, and I'm sure you're all aware how they live on their phones and how ****y they get when things go wrong :p

BT!
a) BT quoted us for three 1Mbit leased lines and refused to do a site survey at two of our sites (despite us knowing the areas were poor for data services and specifically requesting one). Come install, one line could only manage 512k and another was borderline on attenuation. They also had the cheek to want to charge us the quoted 1Mbit annual rate for the 512k line.
b) The line with borderline attenuation kept dropping, so a BT engineer came out to run it over two local pairs, as this apparently would make it more stable. However, the NTEs natively clock at 2Mbit in 2-pair operation, so it continued to play up because one end was running at twice the speed of the other.
All of the above was made worse by an ancient DEC Alpha billing/booking system that closed all terminal sessions, to preserve the integrity of shared files, every time the link went down for more than a few seconds.

Foxes.
A fox managed to bite through an above-ground 4 core fibre cable strapped to a 410V external lighting main. It fractured all 4 cores, so the whole above-ground section had to be replaced with expensive armoured cable, and to top it all off the 410V run was intact and the bugger didn't fry for it :( This killed data to an outlying office building :D

Consultants.
AD consultants conducted a NetWare 4.x to AD migration. We'd done an NT4 to AD 2k3 migration at another site and they seemed good. Now whenever we create an Exchange mailbox we have to wait 15 minutes before Outlook can see it and it can be set up. They didn't tell us that DNS HAS to point to the DC or logon happens slooooow. Lucky I knew that already. One of the shares on the server was configured to contain one folder per department, but users could delete their own departmental folders..... Also, logon scripts were .bat, not .vbs. I'm still learning about AD and its inner workings, and the more I do the more dubious configuration decisions I see.

A company we let space out to also used a consultant to set up data and VoIP, after turning down our offer to provide a VLAN, routing and an extension/DDI range on our PBX. They plumbed in two ISDN2s and an ADSL line. Eight months after install they'd run out of capacity on the switch and the 2800 got hacked... We believe this was down to them using Telnet to support the router via either the ISDN or ADSL line. We had more issues because they'd provided a normal ADSL line, not a business one, and when they used a hosted system via RDP it all fell apart because the upload was pathetic.

ISPs.
Fluidata installed a temp link to facilitate an office move while a WES10 was in the pipeline. It consisted of two ADSL2+ lines with a Cisco VPN over them. Simple task: chuck anything that comes in one end over the VPN to the other. They got this wrong, it took them 4 days to get it right, and it still refused to work until their "guru" had a look. Despite him looking at it for an hour and it miraculously working again, they insisted there was no fault and we must have altered something at our end. Being the only Cisco certified person who deals with our WAN routing, I can say we changed nothing.
1200 users, 5 sites and no internet or external email for 4 of them, including BACS transfers :D

Makes you feel secure in your job knowing such incompetence surges through the industry :)
 
Local admin rights for users are very common in my experience. So many applications, browser plugins etc. fall over without them.

I'm not so sure - most of the problems can be removed by applying specific NTFS permissions or assigning the correct user rights assignment in group policy. A lot of admins are just too lazy to figure out what's required.
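
For the usual "this app only runs as admin" case, the fix is often just a Modify grant on the couple of folders the app insists on writing to. A minimal sketch of that, assuming Windows, an elevated prompt and a made-up app path (find the real paths with Process Monitor):

```python
# Hypothetical example: grant BUILTIN\Users Modify on the folders a fussy
# legacy app writes to, so it runs without local admin rights.
import subprocess

APP_DIRS = [
    r"C:\Program Files\LegacyApp\Config",  # made-up paths - substitute whatever
    r"C:\ProgramData\LegacyApp",           # the app actually touches
]

for path in APP_DIRS:
    # (OI)(CI)M = Modify, inherited by child files and subfolders
    subprocess.run(
        ["icacls", path, "/grant", r"BUILTIN\Users:(OI)(CI)M"],
        check=True,
    )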
 
I'm not so sure - most of the problems can be removed by applying specific NTFS permissions or assigning the correct user rights assignment in group policy. A lot of admins are just too lazy to figure out what's required.

Depends - sometimes it's just not viable to manually set the permissions on all the required folders. As long as the PCs are locked down and you have decent AV etc., there's not a whole lot of damage they can do provided the rights are only local.

Worst case scenario: format and ghost - an hour to 90 minutes' work. For a net admin that's worth about £300-400 to the company, whereas spending the time testing and tweaking every time a new version comes out or a new app needs to be rolled out would mount up to much more.
 
I'm not a server bod like most people here as I work for a large IP carrier.

About 10 months ago I arranged for some of our DC power kit to have a faulty rectifier replaced by the manufacturer. I arranged it for 3am-6am as a precaution as there are routers in our suite with multiple links to the US and Europe as well as the majority of UK traffic.
There was enough rectifier capacity to hold the load while the faulty one was replaced. Ten minutes into the change I was on the other side of the room when suddenly the room went deadly silent.......no routers......no switches......no DWDM kit........nothing.
The engineer had pulled the wrong rectifier, meaning the entire load was too much and blew the breakers, losing the load. This was despite a label on the rectifier, a full MOP, us looking at it together 15 minutes earlier and a previous inspection by the same guy!!!
Needless to say I wanted to cry; within 30 seconds the NOC was ringing me, then my Director, and then the CEO, who was at a dinner function in Washington.
Anyway, 3 minutes later the power was back and after about 45 minutes of testing all the kit was back up. Oh and he changed the faulty rectifier eventually.
That moment we lost over 200Gb of Internet traffic:eek:

Moral of the story............do it yourself!

Much the same - troubleshooting a failed dark fibre span, an engineer from a tier one fibre provider (they'll remain nameless), after having traced the right fibre and triple-checked it was the right one, decided to pull the other fibre in the same tray and downed an active 10Gb DWDM span. I just sat at my desk shaking my head for 5 minutes, then went and explained to the tech director what had just happened. Lovely.
 
Much the same - troubleshooting a failed dark fibre span, an engineer from a tier one fibre provider (they'll remain nameless), after having traced the right fibre and triple-checked it was the right one, decided to pull the other fibre in the same tray and downed an active 10Gb DWDM span. I just sat at my desk shaking my head for 5 minutes, then went and explained to the tech director what had just happened. Lovely.

Don't even start me on fibre providers..........3 times a certain one handed over a span and 3 times I threw it back due to missing patches or high loss connectors. Quite how they arrived at their test results I will never know, but they did not relate to my span.
Fair play though, we got 8 months' credit for the one month delay and countless issues :)
 
I'm not a server bod like most people here as I work for a large IP carrier.

About 10 months ago I arranged for some of our DC power kit to have a faulty rectifier replaced by the manufacturer. I arranged it for 3am-6am as a precaution as there are routers in our suite with multiple links to the US and Europe as well as the majority of UK traffic.
There was enough rectifier capacity to hold the load while the faulty one was replaced. Ten minutes into the change I was on the other side of the room when suddenly the room went deadly silent.......no routers......no switches......no DWDM kit........nothing.
The engineer had pulled the wrong rectifier, meaning the entire load was too much and blew the breakers, losing the load. This was despite a label on the rectifier, a full MOP, us looking at it together 15 minutes earlier and a previous inspection by the same guy!!!
Needless to say I wanted to cry; within 30 seconds the NOC was ringing me, then my Director, and then the CEO, who was at a dinner function in Washington.
Anyway, 3 minutes later the power was back and after about 45 minutes of testing all the kit was back up. Oh and he changed the faulty rectifier eventually.
That moment we lost over 200Gb of Internet traffic:eek:

Moral of the story............do it yourself!


Similar thing happened to me last Saturday. I'm in the office carrying out a major upgrade of our PBX and programming via the LAN at my desk.

Suddenly I lose connection to the PBX, then look at my phone and it's off. Get to the comms/server room to find the phone system off. 2 seconds later reality kicks in as the room is silent. Nothing is working at all. Even the full UPS we have is silent.

Transpires a UPS fault shut the UPS down and it didn't bypass and drop back to mains. To cap it all, when I restarted it, it surged, taking some network switches with it.

Lost all my data (4.5 hours' worth of reconfig) and then spent an hour replacing and repatching switches.

Nightmare
 
Depends - sometimes it's just not viable to manually set the permissions on all the required folders. As long as the PCs are locked down and you have decent AV etc., there's not a whole lot of damage they can do provided the rights are only local.

No one at my current or previous work had local admin rights, and we had some right weird apps.

All software is installed through SMS, everything, and if any permission changes (folder or registry) are needed, my installer sets them.

Users either get execute (by default) or write (with execute on existing files) perms on a folder, not both :)
 
Just had a complete network overhaul. Cisco engineer in working with us, ripping out switches and replacing. Complete core & edge replacement.

Suddenly, a few days ago, the whole network locks up.. all the switches appear 'locked'.. stable green lights..

Go to server room, same on core switch and a bunch of edge switches in there.. oh dear.. 1500 machines, ip phones and other bits and bobs currently not working.

Start pulling fibre links from other buildings out, suddenly - they all start flashing again... so we've narrowed the problem down to a building.

Work to cabinet in that building, work through the switches........ find this is daisy chained to another cabinet. Head to other cabinet - all appears ok with wiring but still locked out green lights.

Check the network ports around the edge of the particular room that cabinet links to - some numpty had plugged a cable directly from one wall port to another.

ARRRRRRRRRRRRRRRRRRGGGH.

Conclusion: don't let the Cisco engineer leave without implementing spanning tree protocol!
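
For anyone in the same boat, the fix is only a few lines of global config per switch. Here's a rough sketch of pushing it with Netmiko - the IPs, credentials and exact command variants (e.g. "portfast edge default" on newer IOS) are assumptions, not what the CCIE actually ran:

```python
# Hypothetical sketch: enable rapid PVST+ and BPDU guard on a list of
# Catalyst edge switches so a patch lead looped between two wall ports
# err-disables a port instead of freezing the whole network.
from netmiko import ConnectHandler

EDGE_SWITCHES = ["10.0.0.11", "10.0.0.12"]  # made-up management IPs

STP_CONFIG = [
    "spanning-tree mode rapid-pvst",             # rapid per-VLAN spanning tree
    "spanning-tree portfast default",            # access ports come up as edge ports
    "spanning-tree portfast bpduguard default",  # err-disable any edge port that sees a BPDU
    "errdisable recovery cause bpduguard",       # recover automatically after the timer
]

for host in EDGE_SWITCHES:
    conn = ConnectHandler(
        device_type="cisco_ios",
        host=host,
        username="admin",      # placeholder credentials
        password="changeme",
    )
    print(conn.send_config_set(STP_CONFIG))
    conn.save_config()
    conn.disconnect()
```

BPDU guard is the bit that actually catches the wall-port-to-wall-port trick: the looped port receives the switch's own BPDUs back and shuts itself down, rather than the whole network locking up.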
 
Just had a complete network overhaul. Cisco engineer in working with us, ripping out switches and replacing. Complete core & edge replacement.

Suddenly, a few days ago, the whole network locks up.. all the switches appear 'locked'.. stable green lights..

Go to server room, same on core switch and a bunch of edge switches in there.. oh dear.. 1500 machines, ip phones and other bits and bobs currently not working.

Start pulling fibre links from other buildings out, suddenly - they all start flashing again... so we've narrowed the problem down to a building.

Work to cabinet in that building, work through the switches........ find this is daisy chained to another cabinet. Head to other cabinet - all appears ok with wiring but still locked out green lights.

Check the network ports around the edge of the particular room that cabinet links to - some numpty had plugged a cable directly from one wall port to another.

ARRRRRRRRRRRRRRRRRRGGGH.

Conclusion: don't let the Cisco engineer leave without implementing spanning tree protocol!

Daaamn. This is why here we don't patch in wall points until they're needed - numpty proofing.
Surprised your VoIP went down too - normally you'd VLAN that off from the data? :confused:
 
Suffice to say there were countless misconfigurations (the best was finding 54 separate group policies applied to a single OU, as they didn't know that a single policy can encompass multiple settings).

You do realise that that is technically MS best practice ;)
 
I'd imagine it denial-of-serviced the switches, bringing all VLANs to a crawl

It shouldn't do - you can create a switching loop on one VLAN and it shouldn't affect the others at all. I've done this deliberately to test load-balanced trunks; it's an easy way to generate a load of traffic and watch the slave links come alive as the master link becomes congested (see the sketch at the end of this post). I did this on a live switch, with the ports I was working on VLAN'd off from the live stuff, and it didn't slow anything down.

It could congest the uplinks, but if these are QoS'd or separate for voice/data, as they should be, no issue should arise.
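
The "watch the slave links come alive" part is easy enough to eyeball from the CLI, but here's a rough sketch of polling it instead, again via Netmiko. The switch address and member interface names are invented; the loop itself is just two quarantine-VLAN ports patched together as described above:

```python
# Hypothetical sketch: sample the output rate on each member of a
# load-balanced trunk while the looped test VLAN floods traffic.
import re
import time
from netmiko import ConnectHandler

MEMBERS = ["Gi1/0/47", "Gi1/0/48"]  # made-up EtherChannel member ports

conn = ConnectHandler(
    device_type="cisco_ios",
    host="10.0.0.1",       # made-up core switch
    username="admin",      # placeholder credentials
    password="changeme",
)

for _ in range(12):  # roughly a minute of samples
    for port in MEMBERS:
        out = conn.send_command(f"show interfaces {port} | include output rate")
        m = re.search(r"output rate (\d+) bits/sec", out)
        rate = int(m.group(1)) if m else 0
        print(f"{port}: {rate / 1_000_000:.1f} Mbit/s out")
    time.sleep(5)

conn.disconnect()
```

Bear in mind the IOS rate counters are averaged over the load interval, so they lag the actual traffic by a little.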
 
Daaamn. This is why here we don't patch in wall points until they're needed - numpty proofing.
Surprised your VoIP went down too - normally you'd VLAN that off from the data? :confused:

Quite. Unfortunately the VLAN configuration has yet to be carried out - there are only 3 or 4 VLANs at the moment. We were in a rush to get the network up and working again. The CCIE didn't say VLANs were a priority at the moment, but today spanning tree was put on the required kit.
 