Patching 2 year old downed servers that have duplicates on the domain?

Bleek · 9 Jun 2009 at 10:58

We have a pair of DR Exchange servers. They are exact replicas of the two live Exchange servers and attach to the same databases on SAN storage.

These two DR boxes have been down for two years and need bringing up to date via patching. The question is how best to do it.

We can't just power on these servers because they share computer names with the live pair, and we don't want to disrupt the production Exchange environment.

To make matters more difficult, they are blades and we must patch via PatchLink software, but that requires network access and has no 'offline' facility - or so our PatchLink expert tells us.

Any ideas?

neil_g · 9 Jun 2009 at 11:01

take them off the network, change the names, put them on the network, patch them, take them off the network, change the names back, power down, connect them back up.

?
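
For what it's worth, the property this sequence is trying to preserve (a DR box is never running on the network while it carries a production name) can be checked mechanically. Below is a minimal Python sketch. The step names, both server names and the implied "power_up" step are hypothetical, and it only models the name clash, not whether Exchange tolerates a rename.

```python
from dataclasses import dataclass

PRODUCTION_NAME = "EXCH-LIVE"   # hypothetical: name shared with the live server
TEMPORARY_NAME = "EXCH-PATCH"   # hypothetical: unique name used only while patching


@dataclass
class ServerState:
    name: str = PRODUCTION_NAME
    on_network: bool = True     # cabled up, as the DR box sits today
    powered_on: bool = False    # but switched off


def apply_step(state: ServerState, step: str) -> None:
    """Apply one step of the proposed sequence to the modelled server."""
    if step == "disconnect":
        state.on_network = False
    elif step == "connect":
        state.on_network = True
    elif step == "power_up":
        state.powered_on = True
    elif step == "power_down":
        state.powered_on = False
    elif step == "rename_temporary":
        state.name = TEMPORARY_NAME
    elif step == "rename_production":
        state.name = PRODUCTION_NAME
    elif step == "patch":
        if not (state.powered_on and state.on_network):
            raise RuntimeError("patching needs a running server with network access")
    else:
        raise ValueError(f"unknown step: {step}")


def clashes(state: ServerState) -> bool:
    """True when a running server is reachable under the production name."""
    return state.powered_on and state.on_network and state.name == PRODUCTION_NAME


def run(steps: list[str]) -> bool:
    """Return True if no step leaves the server clashing with production."""
    state = ServerState()
    for step in steps:
        apply_step(state, step)
        if clashes(state):
            return False
    return True


# The sequence from the post, with the implied power-up made explicit.
PROPOSED = [
    "disconnect", "power_up", "rename_temporary", "connect", "patch",
    "disconnect", "rename_production", "power_down", "connect",
]
```

Here `run(PROPOSED)` reports no clash, whereas simply powering the box on (`run(["power_up"])`) does.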

Bleek · 9 Jun 2009 at 11:05

That does sound rather simple... isn't it always the case you miss the simplest of options?

However won't that change the SIDs and cause problems in the future if we need to failover to them?

mr.bond · 9 Jun 2009 at 12:27

Talk to whichever software vendor sold you the HA solution and ask them for a patching process, assuming you've purchased a third party app.

We use Neverfail and have a very specific process which must be followed for this exact scenario you've described.

mr.bond · 9 Jun 2009 at 12:28

neil_g said:
take them off the network, change the names, put them on the network, patch them, take them off the network, change the names back, power down, connect them back up.

?

That wouldn't be recommended, Neil. Can elaborate if you wish.

VeNT · 9 Jun 2009 at 13:17

Pity you can't rip the disks and pop them on a VM, then put it back on the disk.

Bleek · 9 Jun 2009 at 13:29

mr.bond said:
Talk to whichever software vendor sold you the HA solution and ask them for a patching process, assuming you've purchased a third party app.

We use Neverfail and have a very specific process which must be followed for this exact scenario you've described.

It was a contractor and a third party, they don't mention patching just how to fail over to the servers.

I agree though, there should have been a patching plan because having them sat there for years is very amateurish (before you say anything it's just been dumped on my plate).

Bleek · 9 Jun 2009 at 13:29

mr.bond said:
That wouldn't be recommended, Neil. Can elaborate if you wish.

Please do, seriously it would help weigh up the pros and cons.

Curiosityx · 9 Jun 2009 at 15:21

Can you not simply place the two DR servers into a separate VLAN and/or network, then update them?

Regards
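
One way to sanity-check such a setup before plugging anything in is to confirm that the patching network does not overlap production and that no temporary address duplicates a live one. Below is a minimal Python sketch using the standard `ipaddress` module. Every address is a made-up documentation-range value, not anything from this environment, and the check covers addressing only; it cannot tell you whether the VLAN is actually isolated on the switch.

```python
import ipaddress

# Hypothetical values (RFC 5737 documentation ranges), for illustration only.
PRODUCTION_NET = ipaddress.ip_network("192.0.2.0/24")
PATCHING_NET = ipaddress.ip_network("198.51.100.0/24")
LIVE_SERVER_IPS = {
    ipaddress.ip_address("192.0.2.10"),
    ipaddress.ip_address("192.0.2.11"),
}


def isolation_problems(temp_ips):
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    if PATCHING_NET.overlaps(PRODUCTION_NET):
        problems.append("patching network overlaps production")
    for raw in temp_ips:
        ip = ipaddress.ip_address(raw)
        if ip in LIVE_SERVER_IPS:
            problems.append(f"{ip} duplicates a live server address")
        if ip not in PATCHING_NET:
            problems.append(f"{ip} is outside the patching network")
    return problems
```

Calling `isolation_problems(["198.51.100.10", "198.51.100.11"])` returns an empty list, while passing a live address such as `"192.0.2.10"` reports both problems for it.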

mr.bond · 9 Jun 2009 at 20:53

Edleake said:
It was a contractor and a third party, they don't mention patching just how to fail over to the servers.

I agree though, there should have been a patching plan because having them sat there for years is very amateurish (before you say anything it's just been dumped on my plate).

If you have purchased a specific third party application to provide HA then talk with the company who supply the software, try to get some support or have them point you in the right direction for the relevant documentation. I'm guessing by your post and the questions you've asked you're not that familiar with Exchange or the HA application installed for it (can you tell us what it's called?). It may be wise to have your company pay for support with the software vendor if you're expected to support it and are unsure as to what to do regarding administration and support.

To answer your second question regarding removing Exchange from the network or changing the name (could differ on Exchange 2007 as I have not used it yet; will assume you're using 2k or 2k3):

I take it that when Neil said 'remove from the network' he meant from the domain. If this is the case your Exchange server would stop processing messages and none of the messaging databases (information stores) would mount, rendering them unusable. You can't change the name of the server either without it failing (not sure if it even allows you to, not tried).

Basically, leave it alone and get some help to support and administer the HA solution you have installed. When Exchange goes bad it tends to go very bad; you could end up having a long week if you follow Neil's advice.

**Sorry if this sounds harsh Neil, it's not meant to be. Giving no advice is better than bad advice.

EDIT*** Just re-read your post!!!

So you've got two Exchange servers installed for HA that have been offline for two years? Exactly what is your HA solution?
What I've typed above is still relevant to Exchange, but finding out what your HA is, how it's set up and how it's supposed to operate is your first step.

Bleek · 10 Jun 2009 at 11:55

Can I not disable network, boot up off domain, logon locally, enable network and just use Windows Update?

Exchange 2003, Windows 2003.

As requested the documented DR process is as follows (in reduced form from a 35 page word document):

The blade server chassis houses 2 blade servers; these have been produced by taking images of the live servers and deploying the images to the recovery servers. The only exceptions to the image on the recovery servers are their IP addresses, Default Gateway, Domain / Workgroup membership and the machine Security Identifier (SID).
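
As an aside, the "only exceptions" rule above amounts to a simple comparison of two configuration records. Below is a minimal Python sketch. The field names and all values except the server name are hypothetical placeholders (the real addresses are removed from this post, and the service pack levels are invented for illustration).

```python
# Fields the DR document says may differ between a live server and its clone.
ALLOWED_DIFFERENCES = {"ip_address", "default_gateway", "membership", "sid"}


def unexpected_differences(live: dict, recovery: dict) -> set:
    """Return the fields that differ but are not on the allowed list."""
    all_fields = live.keys() | recovery.keys()
    differing = {f for f in all_fields if live.get(f) != recovery.get(f)}
    return differing - ALLOWED_DIFFERENCES


# Placeholder records; field names and values are illustrative assumptions.
live = {
    "server_name": "CSLWINEXE01",
    "ip_address": "192.0.2.10",
    "default_gateway": "192.0.2.1",
    "membership": "domain",
    "sid": "S-1-5-21-live",
    "service_pack": "SP1",
}
recovery = dict(
    live,
    ip_address="198.51.100.10",
    default_gateway="198.51.100.1",
    membership="workgroup:EXE_RECOVERY",
    sid="S-1-5-21-recovery",
)

# A live server that has since been patched while the clone sat powered off.
live_patched = dict(live, service_pack="SP2")
```

Straight after imaging, `unexpected_differences(live, recovery)` is empty. Patches applied to the live pair alone show up as unexpected differences (`{"service_pack"}` for `live_patched`), which is the drift this thread is about.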

After imaging the disaster recovery servers will be reconfigured as follows;

1. Boot the server
2. Log on as a local Administrator

Disable the ‘Local Area Network Connection’
3. Go to ‘Start’, ‘Control Panel’, ‘Network Connections’, right click the ‘Team 1’ connection and select ‘Disable’

4. Reconfigure the servers as follows;

Server Name : CSLWINEXE01
Workgroup : EXE_RECOVERY
IP Address : ##removed
Subnet Mask : ##removed
Default Gateway : ##removed
DNS Server : ##removed

Server Name : CSLWINEXE02
Workgroup : EXE_RECOVERY
IP Address : ##removed
Subnet Mask : ##removed
Default Gateway : ##removed
DNS Server : ##removed

5. Reboot the server when prompted, log on as a ‘Local Administrator’

6. Change the server SID using the Microsoft ‘NEWSID’ utility, NEWSID can be downloaded from;
http://www.microsoft.com/technet/sysinternals/Security/NewSid.mspx

7. When prompted select ‘Random SID’

8. The server will reboot when NEWSID has completed

NOTE : The SID change operation is only required to be completed once during the initial configuration of the Disaster Recovery servers.

Active Directory

The recovery servers are members of a Workgroup ‘EXE_RECOVERY’ and therefore it is necessary to add the recovery servers into the Domain. The ‘Computer Account’ in Active Directory must be ‘reset’ prior to bringing the recovery servers on-line, to do this proceed as follows;

NOTE : Under NO circumstances must the Computer object be deleted from Active Directory, doing so will require the re-installation of the Server Operating System and Exchange.

1. Log onto the nearest Domain Controller as a ‘Domain Administrator’

2. Go to ‘Start’, ‘Administrative Tools’, ‘Active Directory Users and Computers’

3. Expand the Domain object ‘###, expand the ‘Secure’, select the ‘Secure Servers’ object.

4. From the list in the right hand window locate the required server name (CSLWINEXE01 or CSLWINEXE02), right click on it and select ‘Reset Account’ as shown below;

#SNIP

The recovery servers have the same name as the live servers but have different IP addresses, it is therefore necessary to change the DNS records for the servers so that the recovery servers can be located.

#SNIP

Enable networking on both Exchange servers.

That is, from what I can see, the process for getting the two servers up. Obviously there is loads of other info about SAN, databases, Exchange services etc., but we don't care about that here.

mr.bond · 10 Jun 2009 at 12:27

Edleake said:
Can I not disable network, boot up off domain, logon locally, enable network and just use Windows Update?

Very doubtful if you want to patch Exchange. The server needs to have Exchange services functional; none of the patches will register in Windows Update, as Exchange won't be detected as running. Not sure whether you can manually d/l the patches if you ascertain which ones are required and install them, but I very much doubt it.

The process you've supplied looks to be for how to recover when a failure occurs (DR), rather than a patching process.

Bleek · 10 Jun 2009 at 13:16

That is the process for failover from the head office datacentre to the DR datacentre.

There is no 'patching' method in this document.

Thanks for your help so far.

Bleek · 10 Jun 2009 at 15:41

Ok well we tried it the dirty way, boot the DR Exchange box with network disabled (it's not a member of the domain by the way), logged in locally, enabled network and gave it an IP (it noticed a duplicate name on the network but production was unaffected), gave it a route to the proxy and it can access MSoft updates... so I'm guessing this will work?

It found SP2 as its first patch, nice... and out-of-date.

Can you see any reasons why doing it this way would/could cause a problem?

isoToxin · 10 Jun 2009 at 15:56

Argh. Never nice when you're left to pick up the pieces on a poorly designed system.

If these 2 DR servers are just point-in-time snapshots of the live systems, then your quickest solution may just be to create new up-to-date clones. Whoever put them in probably just imaged the disks while the live system was offline, and then restored the data onto the DR boxes. This is only a quick fix though and doesn't really solve the longterm problem of maintenance.

Is there any reason why you couldn't just have these DR boxes on your network as active hosts with unique names/SIDs? In a DR scenario, you could just mount the information store onto one of them and re-home everyone's mailboxes to the new storage group.

Looking to the future, if you chose to stay with exchange, the 2007 version has some slick DR functionality. I deployed SCR (standby continuous replication) at our company earlier this year so that we have a live replica of the live DB at a standby site that can be activated at any point.

Maintaining cold-start systems is always tricky, and I'd probably push for an online DR solution instead. This also cuts the risk of hardware failing while it's not being used, and going un-noticed until the fateful day when it's actually required.

DustyMiller · 10 Jun 2009 at 23:03

Have I missed something? Why not fail over to your DR servers and patch them? Not the best method, patching on a live box. You also have the problem that any patches that affect the Exchange database structures or schema will already have been applied to the databases; not sure how they will take a second patching.

What is wrong with MS clustering with Exchange 2003, especially as you have a SAN infrastructure?

Bleek · 11 Jun 2009 at 09:23

I'm a Project Manager with a technical background but by no means an Exchange engineer and to my knowledge this company has never had one on site. Our current Windows team know basic admin but that aside, no one has come forward about improving the current situation.

We don't want to DR failover as we're not confident it'll be clean and comfortable, we can't afford that kind of disruption for now, a full DR test is later in the year.

isoToxin said:
Argh. Never nice when you're left to pick up the pieces on a poorly designed system.

Tell me about it!

Is there any reason why you couldn't just have these DR boxes on your network as active hosts with unique names/SIDs? In a DR scenario, you could just mount the information store onto one of them and re-home everyone's mailboxes to the new storage group.

That does sound like an option to look at.

Looking to the future, if you chose to stay with exchange, the 2007 version has some slick DR functionality. I deployed SCR (standby continuous replication) at our company earlier this year so that we have a live replica of the live DB at a standby site that can be activated at any point.

With purse strings so tight at present, work loads at bursting point and staff thinning not expanding, a move to 2007 in the near future is unlikely.

Maintaining cold-start systems is always tricky, and I'd probably push for an online DR solution instead. This also cuts the risk of hardware failing while it's not being used, and going un-noticed until the fateful day when it's actually required.

Our whole DR suite is online, about 20 odd cabinets worth of kit, it just so happens that the only two boxes offline are the Exchange ones - I agree about coldstart being a potential nightmare!

isoToxin · 11 Jun 2009 at 12:35

Edleake said:
That does sound like an option to look at.

Tbh, I had my Exchange 2007 head on when I wrote that, and thinking back, I'm sure 2003 had some really stupid pre-requisites when it came to database portability. Notably, I think it needed the target server to have the same hostname as the production one before it would allow the IS to be mounted onto it. Only hostname though, not SID etc, so you could still probably have the server live and just do a rename when the time came. This was probably why your company put this offline DR solution into place in the first place tho. Exchange 2007 has made it incredibly easy to mount an IS onto a different server without faffing with hostnames. With 2003, clustering was the only real way to have this kind of flexibility.

Good luck with it anyway, sounds like it will take quite a bit of legwork to get it all sorted mate.

If you have the time (which you probably don't), get the whole Exchange environment (except for the full mailbox db of course if it's massive) mirrored on a meaty VMware box and play around to find out what works best. Snapshots mean you can mess about trying stuff to your heart's content. Alternatively, get an Exchange consultant in and pay $$$ (and laugh as they fail to do a better job than you could have done yourself, but hey, at least they're liable when it falls over. lol).

the_jetsetwilly · 11 Jun 2009 at 18:01

If they are the same then why can't you do a full backup of a live server, excluding the Exchange stores/logs etc., and then do a restore to a failover server? You could do it by VLANning off the failover servers with a temporary backup server set up, or just use an external USB disk to hold the data. I would think using the built-in Windows backup should work.

Wonder how the backup servers were created originally.

n3vrmind · 24 Jun 2009 at 16:59

If these are full clones of the originals, are your originals using a mirrored volume with a RAID card? I would guess yes, because being blades they will likely only take 2 disks.

If you are using mirrored volumes then my suggestion would be to check that the array on the production box is in order with no errors, shut down the production box and remove one of the two mirrored disks. Turn the production server back on with only one disk, insert a spare and let it rebuild. Because the server was off when the disk was pulled, that will be a consistent copy, so keep it until the rebuild of the array completes in case you have the bad luck for it to fail at that moment.

Then for your DR server, pop both disks and insert the disk from production. Disable the switch port on the blade chassis to isolate the server from the network and power it on. Once it is booted up, add a second disk and let the RAID array rebuild again.

If it's an option for you (i.e. you have the skills), create a spare isolated VLAN, put the blade slot into that VLAN and mark the slot as such. Any time you need to work on a blade off the network you just reuse that slot, allowing you to do this regularly.

When you next do a technology refresh on your Exchange, hire a different contractor. The only benefit your current setup has is that your licensing cost is reduced. You would have been better off having both production and DR online in a clustered config. If you have a SAN I can't see any sanity in doing it the way they did.