Hi Everyone,
We've been having a really strange problem lately and I have no idea what's causing it.
We have the following equipment:
Dell PowerEdge M1000e blade chassis with 8 PowerEdge M640 blades
Blades all running Windows Server 2019 with the latest updates, BIOS and firmware
Intel X520 Mezz Network Cards (latest v20 firmware and drivers)
The 8 servers are all joined in a Hyper-V cluster using a Dell SC4020 SAN for storage.
The failover cluster logs show errors every second due to I/O timeouts on the clustered storage. While troubleshooting, I've found some strange connectivity issues:
Every server can ping every other server by hostname and IP (rules out DNS)
I can test various ports (RPC etc.) from PowerShell and everything's OK (rules out the firewall)
RDP connection works on every server, to every other server
nslookup reports everything correct
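For anyone wanting to repeat the port checks above from a script, here is a rough Python sketch. The port list (135 for the RPC endpoint mapper, 445 for SMB, 3389 for RDP) and the helper names `port_open` and `check_host` are my own assumptions for illustration, not the exact checks I ran. Note that a successful TCP connect only proves the port is reachable; it says nothing about whether an SMB or RPC session on top of it actually works.

```python
import socket

# Assumed port list for illustration: RPC endpoint mapper, SMB, RDP.
PORTS = {"RPC": 135, "SMB": 445, "RDP": 3389}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_host(host, ports=PORTS, timeout=2.0):
    """Map each service name to whether its TCP port accepted a connection."""
    return {name: port_open(host, port, timeout) for name, port in ports.items()}
```

Running `check_host("HV2")` from HV1 and again with HV2's IP address would show whether the name and the IP behave differently at the TCP level.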
Opening Server Manager on each server shows seemingly random issues where some servers can't see others (unable to connect to target):
HV1<>HV2 FAILS
HV3<>HV7 FAILS
HV6<>HV8 FAILS
On these servers, you can't browse to \\servername\c$ but you can "sometimes" browse via the IP address.
What is strange is that whilst HV1 and HV2 can't talk to each other, all the other servers can talk to both of them, and they can talk back. For example, HV1 can talk to HV3 and HV3 can talk to HV1.
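To make the pattern easier to see, here is a small Python sketch that tabulates the failing pairs listed above. The host names HV1 to HV8 are assumed from the eight blades (HV4 and HV5 are not named in the list above), and `unreachable_peers` is just a hypothetical helper name.

```python
from itertools import combinations

# Assumed naming for the eight cluster nodes.
HOSTS = [f"HV{i}" for i in range(1, 9)]

# The three failing pairs observed in Server Manager (failure is in both directions).
FAILING = {frozenset(p) for p in [("HV1", "HV2"), ("HV3", "HV7"), ("HV6", "HV8")]}


def unreachable_peers(hosts, failing):
    """Return, for each host, the list of peers it cannot connect to."""
    result = {h: [] for h in hosts}
    for a, b in combinations(hosts, 2):
        if frozenset((a, b)) in failing:
            result[a].append(b)
            result[b].append(a)
    return result


for host, peers in unreachable_peers(HOSTS, FAILING).items():
    print(host, "cannot reach:", ", ".join(peers) or "none")
```

Each affected host fails against exactly one partner, and on this data HV4 and HV5 have no failures at all, so 25 of the 28 possible pairs work fine.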
My gut tells me we have an issue somewhere on the backend networking side (switch, switch config, cabling, etc.).
Any ideas?