VM Replication Traffic

Hi Chaps

Replicating a datastore which hosts VMs from a primary to a secondary SAN, I have noticed that the amount of data replicated is significantly higher than I had originally anticipated. In order to save on potentially wasteful replication traffic, I have already moved the VM swap file to a separate, non-replicated volume - which rules out the VM swap file as the cause. I'm now considering moving the OS page files to a separate, non-replicated volume as well, as I believe they could be the cause of the overly high rate of data change.

I have monitored the guest VMs' page file usage and have noticed that on the 3 Server 2003 Standard machines there is a continual page file utilisation of approximately 15%. These 2003 machines have 4GB of RAM allocated and probably run at around 50% virtual memory utilisation - so they're not being pushed to their limits. There are also 2 Server 2008 machines residing on the same host with the same characteristics (4GB of RAM, 50% utilisation) but 0% page file utilisation. The 2003 machines are Terminal Servers, so I'm guessing that the page file usage is down to idle / disconnected sessions, whereas the 2008 machines provide file and print services, which don't tend to tie up resources for prolonged periods of time.

So, to cut to the chase - what are people's thoughts on this? Is offloading the OS page file usual practice in a replicated environment? Has anyone tried this configuration before? Is there anything that I'm missing? Maybe the page file isn't to blame?

Just to fill you in on some statistics, there are 5 VMs in total, as mentioned above. The replication traffic for the VM host volume varies between 1 - 2GB per hour, depending on the time of day. The hypervisor is ESXi 4.1 Update 1, running on R710s with EqualLogic SANs.
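To put those figures in context, here is a rough conversion of the hourly replication volume into an average link rate. It is only a sketch: it assumes decimal units (1GB = 10^9 bytes) and an even spread of traffic across the hour, which replication bursts will not follow, and the function name is made up for illustration.

```python
def gb_per_hour_to_mbit_per_s(gb_per_hour: float) -> float:
    """Average rate in Mbit/s for a given volume in GB per hour.

    Assumes decimal units (1 GB = 1e9 bytes, 1 Mbit = 1e6 bits).
    """
    bits_per_hour = gb_per_hour * 1e9 * 8
    return bits_per_hour / 3600 / 1e6


# The 1 - 2GB per hour range quoted in the post.
for volume_gb in (1.0, 2.0):
    rate = gb_per_hour_to_mbit_per_s(volume_gb)
    print(f"{volume_gb:.0f} GB/hour is roughly {rate:.2f} Mbit/s on average")

# Spread evenly over the 5 VMs, that is 0.2 - 0.4 GB per VM per hour.
per_vm_gb = [volume_gb / 5 for volume_gb in (1.0, 2.0)]
```

So the volume in question averages out at a few Mbit/s, or a few hundred MB of changed data per VM per hour.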

Any comments / suggestions appreciated!

Thanks

Martin
 
Yes, definitely move the OS page files to a different, non-replicated disk. We do this for much the same reasons as yourself.

If you use Site Recovery Manager there are some additional things you should do to make your life easier which I can post here.

Another benefit is that if you are doing full image backups, you can just exclude the page file disk and reduce the size of your full and incremental backups.
 
Thanks for replying.

Well, that's sorted then, I'll get that set up this weekend if I get the time.

I'm not using SRM yet, but I'm planning to implement it over the next 2 months, so any info that you've got on that would be appreciated.

Thanks
 
I moved the page file from each machine, but it has had little effect on the overall replication size.

I've been using perfmon and task manager to monitor disk I/O and it doesn't really add up. I don't see a high volume of writes on any of the disks - which has left me slightly confused.

Strangely, the replication data size was approximately 1 - 2GB previously, whilst the VMs were replicating every hour. I've now adjusted the schedule and configured the volume to replicate every 6 hours - which produces 2 - 3GB, nowhere near the sixfold increase I had expected.
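One possible explanation for both observations is sketched below. It assumes that replication ships whole fixed-size chunks and sends each changed chunk once per cycle, however many times it was written. The 1 MiB chunk size and the write pattern are made up for illustration and are not taken from the EqualLogic documentation.

```python
def replicated_bytes(writes, chunk_size):
    """Bytes shipped if every chunk touched by a write is sent whole, once.

    writes: iterable of (offset, length) pairs in bytes.
    chunk_size: hypothetical change-tracking granularity in bytes.
    """
    dirty_chunks = set()
    for offset, length in writes:
        if length <= 0:
            continue
        first = offset // chunk_size
        last = (offset + length - 1) // chunk_size
        dirty_chunks.update(range(first, last + 1))
    return len(dirty_chunks) * chunk_size


KIB = 1024
MIB = 1024 * KIB
CHUNK = 1 * MIB  # assumed granularity, for illustration only

# 100 small 4 KiB writes, each landing in a different 1 MiB chunk.
one_hour = [(i * MIB, 4 * KIB) for i in range(100)]
written = sum(length for _, length in one_hour)

print(written // KIB, "KiB written by the guest")
print(replicated_bytes(one_hour, CHUNK) // MIB, "MiB replicated")

# The same locations rewritten every hour for 6 hours: the delta does not grow.
six_hours = one_hour * 6
print(replicated_bytes(six_hours, CHUNK) // MIB, "MiB replicated after 6 hours")
```

Under those assumptions, a low write volume in perfmon can still produce a large delta, and a longer interval adds little if the guests keep rewriting the same areas of the disk.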

Any suggestions, Chaps?

Thanks
 
What are you using to do the replication?

####
As an aside, and to cover my comment earlier.
We found issues with SRM when we followed the VMware guide for non-replicated page files.
They suggest creating a 'template' page file disk and just copying it, which works for a given value of 'works'.

What I recommend is to set up the page file LUN on your recovery site, shut down each server, and copy the folder containing the page file vmdk from your protected site to the recovery site using SCP.

Then, in the recovery group, you point the page file disk at the copied vmdk.
 
I'm using the EqualLogic VMware snapshot management tools (ASM/VE) to do the replication - it's effectively a Linux-based VM that uses the SDK to do snapshots and the like.

I was running this VM on the same VM datastore that I'm replicating, but I'm fairly certain that it wouldn't have any effect on the replication size. I have moved the management VMs over to a separate datastore this morning to see if it does actually have any impact - just for peace of mind.
 
I wonder why you are replicating terminal servers at all. Normally the approach with terminal servers is to make them application servers only and not keep data on them, by redirecting all the users' folders off the terminal server and using group policy to prevent users from ever storing files locally.

If you were using them in that way, you could potentially just create 3 terminal servers at your secondary site on a non-replicated store and only replicate your file servers.
 