Clustered service fails when failover occurs

I'm almost done, I think, but for some reason this does not work.

I created a Windows service which hosts a WCF application that binds to the clustered MSMQ.
The Windows service was first installed on both nodes of the cluster.

I then clustered this service through the High Availability Wizard (selecting Generic Service).

The client that sends data to the clustered MSMQ (outside the cluster, but on the same domain/network) sends to it fine.

One of the nodes in the cluster sees the data through the Windows service (I've got debugging trace output to verify this).

Then when, say, NodeA goes down, NodeB *should* kick in, but it fails to start the clustered Windows service with no explanation at all!


It retries several times and still fails. I have to start the Windows service manually on this node through services.msc; then it works just fine, and the service is up and running and continues to process the messages from the clustered MSMQ.



Something clearly isn't right here. Why does the clustered service fail to "start" when one of the nodes goes down/appears offline? What am I doing wrong? Is there a wrong setting somewhere?


Thank you.
 
After further investigation, it appears that I was running the service somewhat incorrectly. I am now... two steps backwards.

I was starting the service manually and reading from the clustered MSMQ, which works fine, but when the time comes for a proper failover, it fails.

So now I am not starting the service manually and am just letting cluster management do it, but it fails as soon as I create the high availability generic service.

It says that the version of MSMQ cannot be detected and that the MSMQ service is not available.

But it is available on the cluster. I can even look at it through Computer Management and see the queue and the messages coming in.

Any ideas?
 
On the clustered service, I opened its properties and unticked the already ticked "Use Network Name for computer name" option.

A few blogs I read said to be sure that this option IS ticked...

However, it does make me question whether it is reading the right queue and not reverting to the local queue.

I just made a node fail over... the other node picks up and continues processing the messages from the clustered MSMQ, which is what I want.
 
Actually, I am not sure this is working correctly: if I run the same service on other computers, all pointing to the clustered MSMQ, it does not read messages.

But if I run the service on the clustered nodes, it works fine.

Any ideas?
 
Generic-type applications have limited functionality; it will probably never work the way you want it to.

What if our application failed in a way that didn't result in the process terminating (for example, network failure, hanging or background thread termination)? Unfortunately, with the Generic Application resource type you only get generic failure detection. Most developers writing applications that will run in a clustered environment will prefer to produce custom resource DLLs, to handle application specific issues.
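To make the quoted point concrete, here is a rough sketch in Python, purely illustrative and not cluster API code. It contrasts a generic "is the process running?" check with an application-specific check that also looks at a heartbeat the service would have to record itself. `ServiceState`, `generic_check`, `app_specific_check` and the 30-second threshold are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    """Hypothetical snapshot of a service, for illustration only."""
    process_running: bool   # what a generic check can see
    last_heartbeat: float   # seconds; last time the service reported progress


def generic_check(state: ServiceState) -> bool:
    # Generic failure detection: only notices that the process has gone away.
    return state.process_running


def app_specific_check(state: ServiceState, now: float,
                       max_silence: float = 30.0) -> bool:
    # Application-specific detection: the process must be running AND must
    # have reported progress recently (max_silence is an assumed threshold).
    return state.process_running and (now - state.last_heartbeat) <= max_silence


# A hung service: the process is still there, but it stopped making progress.
hung = ServiceState(process_running=True, last_heartbeat=0.0)
print(generic_check(hung))                  # True: looks healthy
print(app_specific_check(hung, now=100.0))  # False: hang detected
```

A custom resource DLL is where the second kind of check would live; the generic resource types only give you the first.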
 
OK, that's fine, I think, from what you quoted...



The only issue I have is that if I keep that checkbox ticked to use the network name as the computer name, the generic service/app (which uses the clustered MSMQ) fails, saying "The version of MSMQ cannot be detected. All operations that are on the queued channel will fail. Ensure that MSMQ is installed and is available." If it is left unticked, it works fine.

I want to be able to read messages from the clustered MSMQ from 3 separate servers on the same network which are not clustered. Each of them will run the application independently, pointing to the clustered MSMQ; however, they fail to find any messages... so I'm wondering if my setup is correct?
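As a reference point for the setup question, here is a minimal sketch (Python, illustration only) of how a WCF-style `net.msmq://` address is usually understood to map to an MSMQ direct format name, and how to tell whether an address names the local machine or a remote/clustered host. It assumes the native MSMQ transfer protocol and a simple `host/[private/]queue` address shape. The host name `MSMQCLUSTER` and queue name `orders` are made up, and the helper names are hypothetical.

```python
from urllib.parse import urlsplit


def net_msmq_to_format_name(uri: str) -> str:
    """Map net.msmq://host/[private/]queue to a DIRECT=OS format name."""
    parts = urlsplit(uri)
    if parts.scheme != "net.msmq" or not parts.netloc:
        raise ValueError(f"expected net.msmq://host/[private/]queue, got {uri!r}")
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        raise ValueError("URI has no queue name")
    if segments[0].lower() == "private":
        if len(segments) < 2:
            raise ValueError("private queue URI has no queue name")
        queue = "private$\\" + "/".join(segments[1:])
    else:
        queue = "/".join(segments)
    return "DIRECT=OS:" + parts.netloc + "\\" + queue


def targets_local_machine(uri: str) -> bool:
    """True if the address names the local machine rather than a remote host."""
    return urlsplit(uri).netloc.lower() in {"localhost", ".", "127.0.0.1"}


# Hypothetical clustered network name and queue name.
address = "net.msmq://MSMQCLUSTER/private/orders"
print(net_msmq_to_format_name(address))  # DIRECT=OS:MSMQCLUSTER\private$\orders
print(targets_local_machine(address))    # False
```

If the service on the non-clustered servers is configured with a local-style address, it would be looking at its own (empty) local queue rather than the clustered one, which is one thing worth ruling out.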
 