Draytek router dropping traffic frequently

We've been having issues with our Draytek router (2960) in our office dropping out recently and I've been trying to figure out what's causing it without much luck so far.

Each time it happens everything comes back within 30-60 seconds, but it occurs sometimes once an hour, sometimes only once or twice a day. I work remotely so my VPN session (usually just RDP) drops as well.

The WAN connection itself is fine: VoIP, which runs through the same connection (but not this router), stays up. The Draytek can't be pinged during these short outages either, so I'm sure the fault is there.

There is no activity on the logs during these drops, only a few of the below entries each time (just with different IPs of course)

[ip_route_input_slow:2426] set reply 31.13.92.14192.168.40.18 oif=13 and NO ROUTE

Manufacturer support hasn't been of much help so far, and I haven't been able to find much on Google either.

FWIW, CPU and Memory load aren't high (<50% mostly), only myself connected through VPN. Restarting the router doesn't help unfortunately.

Any ideas? I hope I'm at least on the right track and those log entries are related :o
 
On latest firmware? (August 2016)

My experience with Draytek has always been 100% uptime

Yup - on the latest 1.2.1 firmware.

I've been trying to use the logs to figure out the issue - at first I thought the logging to PC was stopping as the connection was being dropped, but logging to USB has the same gap during that period (so I guess logging is correct, and traffic just isn't being passed through at that time)
 
Can you ping the router during these times?

I've tried running 3 constant pings :

1. To other machine on network (works fine, connected to different switch)
2. To router (this gets no response during the downtime)
3. To Google DNS (no response during downtime)

So I'm sure it's the Draytek causing the issue, as our VOIP calls run through the first router (Cisco, managed by ISP)
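
The three-ping test above can be left running unattended so each drop gets a timestamp. This is only a rough sketch: it assumes a Linux-style `ping` (`-c` for count, `-W` for timeout in seconds), and the function names and the router address in the usage comment are made up for illustration.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=1):
    """Send one ICMP echo request; True if a reply came back.

    Assumes a Linux-style ping (-c count, -W timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_outages(samples):
    """Collapse (timestamp, ok) samples into (start, end) outage windows.

    start is the first failed sample, end is the first good sample after it
    (or the last sample seen if the target never came back).
    """
    outages = []
    start = None
    last = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            outages.append((start, ts))
            start = None
        last = ts
    if start is not None:
        outages.append((start, last))
    return outages


def monitor(targets, rounds, interval_s=1.0):
    """Ping every target once per round; return the samples per target name."""
    samples = {name: [] for name in targets}
    for _ in range(rounds):
        now = datetime.now()
        for name, host in targets.items():
            samples[name].append((now, ping_once(host)))
        time.sleep(interval_s)
    return samples


# Example use (addresses other than 8.8.8.8 are hypothetical):
# samples = monitor({"lan": "192.168.40.18", "router": "192.168.40.1",
#                    "google_dns": "8.8.8.8"}, rounds=3600)
# for name, rows in samples.items():
#     print(name, find_outages(rows))
```

If the router and Google DNS windows line up while the LAN host stays clean, that points at the router or something upstream of it rather than the switches.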
 
Aha, does sound like a similar issue - I haven't tried downgrading the firmware, I guess we're in the same boat that I can't just take the network down (other than a reboot maybe) and configuring all the VLANs etc from scratch would probably take a bit of time so I'm trying to find a solution without doing this.

The router just seems to stop / drop any traffic during the period so the Syslog has a gap for 30-60 seconds or however long the drop is and stops responding to pings during this time too. I guess I could try constantly pinging the WAN Gateway, but this passes through the Draytek first?

As the connection was dropping I assumed Syslog wasn't able to send updates to the Syslog server and that's why the log was empty, but logging to USB directly from the router doesn't seem to make a difference - still those same gaps in the logs.
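
Finding those gaps by eye gets tedious, so here is a small sketch that scans an exported log for silences longer than a threshold. The timestamp format (`YYYY-MM-DD HH:MM:SS` at the start of each line) is an assumption and would need adjusting to whatever the router actually writes; the 20-second threshold is arbitrary.

```python
import re
from datetime import datetime, timedelta

# Assumed line format: "2016-09-01 10:15:42 <message>". Adjust to the real export.
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def parse_timestamps(lines):
    """Return the timestamps of all lines that start with one; skip the rest."""
    stamps = []
    for line in lines:
        match = TIMESTAMP_RE.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return stamps


def find_gaps(stamps, threshold=timedelta(seconds=20)):
    """Return (previous, next, length) for every silence of at least threshold."""
    gaps = []
    for prev, cur in zip(stamps, stamps[1:]):
        if cur - prev >= threshold:
            gaps.append((prev, cur, cur - prev))
    return gaps
```

Running `find_gaps(parse_timestamps(open("syslog.txt")))` (file name hypothetical) on both the PC log and the USB log should show whether the silences start and end at the same moments.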
 
Anyone running a BitTorrent client? I had Vigors many years ago struggle to manage 'high' numbers of connections.

You may also want to do some packet sniffing on your network to ensure there isn't some kind of storm going on.

Well there's a filter to block BT traffic, and our session usage isn't high - we're normally at 1000-2000 sessions (router limit is meant to be 80,000)

The latest firmware does have packet capture built in but that's limited to 10k packets unfortunately so it can't keep running. I'm going to see if we can get 1 of the LAN ports freed up by connecting part of the network through a different switch, that way I'd be able to use LAN mirroring to see all the packets.
 
Might be the best solution - main reason we went with Draytek is familiarity with the interface / configuration as we'd had them for quite a while and not had any major issues.

If I don't manage to get this resolved - any recommendations? We're a small outfit without a dedicated IT dept so nothing too complicated to set up.
 
Before you bin it, check the power plug. Is it loose? Is the router placed where someone can accidentally move it / step on it / whatever while walking past?

Nope - that's all fine, it also happens at odd hours when there's no one in the office or near the unit, and it always resolves by itself.
 
If your IP range is not 192.168.254.* then this may be an issue. If it can't see it locally it may be trying to push traffic over your WAN. If it's UDP then there is a chance it'll be a constant flow saturating the bandwidth which could potentially cause drops and or lack of response to ping.

Ideally you'd have a rule already in your firewall stopping anything destined for RFC1918 (or 1980 I can never remember) addresses over your WAN interface to stop this but may be worth a try.
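
For reference, it is RFC 1918 that defines the private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). A quick sketch of the check such a firewall rule performs, useful for filtering a capture for private destinations that are not on any local subnet. The function names are made up, and the 192.168.40.0/24 LAN in the usage note is only a guess based on the address in the log line earlier in the thread.

```python
import ipaddress

# Private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(addr):
    """True if addr falls inside one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918_NETWORKS)


def leaks_to_wan(dst, local_subnets):
    """True if dst is private but not on any local subnet.

    Such traffic has no local route, so without a blocking rule the router
    would forward it out of the WAN interface.
    """
    ip = ipaddress.ip_address(dst)
    on_lan = any(ip in ipaddress.ip_network(subnet) for subnet in local_subnets)
    return is_rfc1918(dst) and not on_lan
```

With a LAN of 192.168.40.0/24, `leaks_to_wan("192.168.254.10", ["192.168.40.0/24"])` is True, while a normal LAN or public destination comes back False.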

I'll have a look at this. 192.168.254.*** is a range Hikvision NVRs use only by default for the cameras connected to the PoE interfaces. I did run Wireshark for a while yesterday but didn't spot anything related to this traffic, will look into it further today.

The Cisco Meraki range I was considering - they even give you the free equipment (MX64) for taking part in their webinars which might be worth a look, I guess initially we could run this separately to make sure everything is running as it should (we have a couple of VLANs using internet access only so could run these through a second router temporarily). MX64 looks sufficient as there isn't a need for wireless (have separate APs for Wireless already)

You're right - this was indeed happening. I've blocked it now and will see if anything changes.

By number of packets not much though - during a 30 minute period approximately 1k packets
 
Had some free time to look into this further - running packet capture didn't seem to bring out anything obvious, have sent Draytek some captures / logs to see if they can spot anything.

Pretty much ready to move to another vendor once we can afford a bit of downtime as we haven't come any closer to finding a solution
 
We've started deploying Dell Sonicwalls recently, they've been quite good.

I'll have a look at those too - we wouldn't be able to change for at least a couple of weeks until we can get someone in over the weekend to avoid any downtime during the week.

I did run simultaneous wireshark captures on both the WAN and LAN interfaces - about all that's telling me is that it stops routing between the LAN and WAN but not the cause of it. I'm not too hopeful in the support guys finding the reason but we'll see.
 
I've had a right headbanger with a customer who has had a Draytek 2860ac since December '15. Started to exhibit similar symptoms. I ended up reducing the MTU down from 1500. Sorted.
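
For anyone wanting to test the MTU theory before changing settings: the usual approach is to ping with the don't-fragment flag set and shrink the payload until replies come back (on Linux `ping -M do -s <payload> <host>`, on Windows `ping -f -l <payload> <host>`). The payload is the MTU minus 28 bytes (20 for an IPv4 header without options, 8 for the ICMP header). A small sketch of the arithmetic and of a binary search for the largest size that passes. The `passes` callback is a placeholder for whatever runs the actual ping, and the 576 lower bound is just a conservative starting point.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Ping payload size that produces an IPv4 packet of exactly mtu bytes."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def largest_passing_mtu(passes, low=576, high=1500):
    """Binary-search the largest MTU in [low, high] for which passes(mtu) is True.

    passes is a caller-supplied function (hypothetical here) that sends a
    don't-fragment ping sized for that MTU and reports whether it got a reply.
    Returns None if even the lowest size fails.
    """
    if not passes(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2
        if passes(mid):
            low = mid
        else:
            high = mid - 1
    return low
```

So a standard 1500-byte MTU corresponds to a 1472-byte ping payload; if that fails with don't-fragment set but smaller sizes pass, something on the path has a lower MTU.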

The MTU on the LAN or WAN interface? What did you change it to?

I've been looking at other parts of the network as I was hoping to find an issue somewhere there which is causing it to hang - I did come across someone mentioning Sonos devices causing this with Draytek, and there's one on our network, but I couldn't recreate it.

Funnily enough it happened yesterday (after nearly everyone left the office, so with not much load) twice within a few minutes. After this our Draytek wireless AP went down and hasn't come back up since.

Will have a look today to see what happened there, if it's related. We've got a new Meraki AP arriving today so quite good timing :p
 
The WAN is connected to another router (provided by ISP)

Anyway - it looks like the issue was possibly caused by a bad network cable :o:o (Draytek WAN > provider router) - switched out early this morning and not a single downtime event as of yet.
 

Still not 100% sure now that the cable was the only cause - we've had 1 day without a single disconnect but did have 1 yesterday, although it's far less than before.

We're moving part of the network to an Edgerouter now - some parts only require an internet connection, so we're going to separate those and see if the issue still continues.
 