Rate-limiting load balancer?

CGrieves · 18 May 2010 at 15:14

We've got a client API, which in a nutshell is a farm of servers responding to HTTP/XML product search requests from our B2B clients, all sat behind a Foundry ServerIronXL SLB.

All works fine, the only problem is occasionally a legit client will do a performance test, or release some dodgy code on their API connector which floods the farm and can cause slow responses or outages for all clients.

When this happens, it's a manual response to block the clients responsible. Annoying if it happens at 3am, as always seems to be the case.

My question is, is there a load balancer that can detect "problem" clients that exceed a specific transaction rate limit, and redirect their requests transparently to a separate server farm behind the scenes, thus protecting other well-behaved clients?

Our ServerIron will do transaction rate limiting, but it's based on a hold/deny scheme (or does an HTTP redirect, which won't work for us), so bursting above the limit would cause an outage for the client. We'd rather they got slower responses from the secondary farm than no responses at all.

I'm thinking F5 Big-IP, but it's hard to tell if it does exactly this- anyone got any ideas? We'd rather go for a hardware SLB than software if possible.

daz · 18 May 2010 at 15:32

The difficult thing here for software is to detect the difference between legitimate floods of requests (e.g. a site just featured on TV) versus illegitimate requests (someone, or a group of people being malicious or just being overzealous with testing).

Obviously, if there are too many requests coming from a particular IP, this is usually illegitimate, but many large companies will often sit big offices behind a single NATed proxy/VPN which can cause issues. Reverse proxy servers like Nginx acting as load balancers can restrict numbers of connections based on the remote IP pretty easily, but rate limiting based on requests per second I haven't actually seen... personally. I think this would be an interesting project.
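As a rough illustration of the per-IP connection limit mentioned above, here is a minimal nginx sketch, assuming a current nginx release. The zone name `perip`, the limit of 10 and the back-end address are made-up values for illustration, not anything from this thread.

```nginx
# Sketch only: zone name, limit and back-end address are illustrative assumptions.
events {}

http {
    # Track concurrent connections keyed by client address (10 MB of shared memory).
    limit_conn_zone $binary_remote_addr zone=perip:10m;

    server {
        listen 80;

        location / {
            # Allow at most 10 simultaneous connections per client IP;
            # connections beyond that are rejected (503 by default).
            limit_conn perip 10;
            proxy_pass http://127.0.0.1:8080;  # hypothetical back end
        }
    }
}
```

Note that this caps concurrent connections, not requests per second, so a client reusing a few keep-alive connections at a high request rate would not be caught by it.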

bigredshark · 18 May 2010 at 15:37

To be honest, what you're asking for is a firewall feature and is easily done on a firewall. I'd do it there rather than on your load balancer... it's a 30-second job on Juniper, I know...

CGrieves · 18 May 2010 at 16:35

Client identification isn't too difficult- all our client systems are manually authorised by ACLs/IP address. In almost all cases one "misbehaving" client's requests will originate from one IP. And yes, while it's not an easy thing to differentiate between legit and "flood" traffic, the purpose here is to just redirect them, not block them as is currently the case. For example I could easily define three server farms- a small one for clients producing light transaction loads (i.e. lightning fast), one for everyone else who's behaving, and a third for "misbehaving" clients. It'd be dead easy to monitor and swap physical resources around behind the scenes, as indeed we do already.
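For illustration only, the static half of that idea (picking one of three farms by source IP) could be sketched in a software proxy such as nginx, using the `geo` module and a variable in `proxy_pass`. All farm names and addresses below are hypothetical, and this shows only manual assignment, not automatic detection of a misbehaving client.

```nginx
# Sketch only: farm names and all IP addresses are illustrative assumptions.
events {}

http {
    upstream light_farm    { server 10.0.0.11:8080; }
    upstream main_farm     { server 10.0.1.11:8080; server 10.0.1.12:8080; }
    upstream overflow_farm { server 10.0.2.11:8080; }

    # Map the client address to the name of an upstream group.
    geo $farm {
        default        main_farm;      # everyone who is behaving
        192.0.2.10     light_farm;     # a client known to produce light loads
        198.51.100.25  overflow_farm;  # a client currently misbehaving
    }

    server {
        listen 80;

        location / {
            # The variable's value is looked up among the upstream groups above.
            proxy_pass http://$farm;
        }
    }
}
```

Moving a client between farms then means editing one line in the `geo` block and reloading the configuration, which is the part a monitoring system would have to automate.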

It's frustrating, I can easily do source-based routing OR transactional rate limiting on the ServerIron, it just doesn't seem capable of intelligently linking the two together.

bigredshark- Yes we used to use Juniper kit, and now Fortinet (who stole most of Juniper's development staff!) and rate limiting policies based on bandwidth are easy. But unless I'm mistaken ours (two Fortigate 200As as a redundant pair) don't have true content load balancing, let alone the automation to manage it by source, or transactional rate, which is more important. It's more than possible that a misbehaving client could be causing high load due to transactional rate, but using minimal bandwidth.

I'm starting to think this might involve a software load balancer sitting in front of the ServerIron to filter out problem clients- there seem to be lots of load balancer VMware appliances, problem is sorting through all the guff to find the gems. Either that or using our monitoring system to somehow reconfigure the ServerIron on the fly when a problem arises. It'd be really nice to do it all in the same box though....

CGrieves · 18 May 2010 at 17:12

Interesting- F5 now offer a virtual edition of the Big-IP, with a 90-day trial. I'll get a copy, plug it into the vSphere farm and see what it can do.

bigredshark · 18 May 2010 at 20:37

CGrieves said:
bigredshark- Yes we used to use Juniper kit, and now Fortinet (who stole most of Juniper's development staff!) and rate limiting policies based on bandwidth are easy. But unless I'm mistaken ours (two Fortigate 200As as a redundant pair) don't have true content load balancing, let alone the automation to manage it by source, or transactional rate, which is more important. It's more than possible that a misbehaving client could be causing high load due to transactional rate, but using minimal bandwidth.

What I meant was just set a max sessions per client IP, unless I'm misunderstanding your need. "src ip session limit" is the value in ScreenOS; god knows what it is on the Fortigates (I'm not a fan). That'd solve the issue of one client eating all your API sessions, short term at least...

But either way, unless I misunderstand, you're attempting to prevent what's essentially a DoS attack and that's a firewall task- I'm not clear why you're trying to tackle it with your load balancer.

daz · 19 May 2010 at 09:32

Just sessions per IP might not be enough though (which as you say is firewall) - you might want to limit based on HTTP requests per second to each back-end client, so if a single site/client receives a legitimate flood for whatever reason, their sessions are transparently pushed by the load balancer to a different set of application/web servers so that anyone else on the main web servers isn't affected by traffic. Could probably still be done in the firewall, but I know that a lot of software load balancers could also do this, if not out of the box with just a bit of hacking/modification.

CGrieves · 19 May 2010 at 10:12

Yep Daz, exactly that- I can easily limit sessions per IP, i.e. drop sessions that stray above an arbitrary limit, but as you say the point is I don't want to block anything, just redirect requests above that limit to a separate application farm to protect well-behaved clients. In other words I want to manage the DoS, not block it. The high transactional rate might be a legit spike in activity, and after all, every product search is a potential sale.

It's looking like an F5 box might be the answer in the long term- the iRules system allows you to script just about any behaviour. Just have to learn how to use the damn thing!

Beansprout · 19 May 2010 at 11:54

Nginx can do some pretty funky limiting:

Code:

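# limit_req_zone goes in the http block; limit_req in http, server or location.
# 5 req/sec per client IP, with bursts of up to 500 requests, then a 503 error
limit_req_zone $binary_remote_addr zone=zoneone:10m rate=5r/s;
limit_req zone=zoneone burst=500 nodelay;
# 200 req/sec per host, with bursts of up to 2000 requests, then a 503 error
limit_req_zone $host zone=zonetwo:10m rate=200r/s;
limit_req zone=zonetwo burst=2000 nodelay;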

http://wiki.nginx.org/NginxHttpLimitReqModule

Edit: As for customising the block action I'm not sure though.
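One possible way to customise the block action, offered as an untested sketch rather than a known-good recipe: since `limit_req` rejects with a 503 by default, an `error_page` rule can hand those rejected requests to a named location that proxies to a second farm instead of returning the error. The upstream names, addresses, zone name and limits below are all hypothetical.

```nginx
# Sketch only: upstream names, addresses, zone name and limits are assumptions.
events {}

http {
    limit_req_zone $binary_remote_addr zone=perclient:10m rate=5r/s;

    upstream main_farm     { server 10.0.1.11:8080; server 10.0.1.12:8080; }
    upstream overflow_farm { server 10.0.2.11:8080; }

    server {
        listen 80;

        location / {
            # Requests above 5 r/s (plus a burst of 20) are rejected with 503...
            limit_req zone=perclient burst=20 nodelay;
            # ...and the 503 is turned into an internal jump to @overflow,
            # so the client gets the overflow farm's response, not an error.
            error_page 503 = @overflow;
            proxy_pass http://main_farm;
        }

        location @overflow {
            # No limit_req here: excess traffic is served, just by other boxes.
            proxy_pass http://overflow_farm;
        }
    }
}
```

With this layout, well-behaved clients stay on `main_farm` while only the excess requests from a noisy IP land on `overflow_farm`. 503 responses coming back from the main farm itself are not caught unless `proxy_intercept_errors` is switched on, so the fallback should apply only to rate-limit rejections. Worth testing with your real request types before relying on it.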

ThorpedoUK · 21 May 2010 at 02:13

Check out NetScaler VPX Express (Citrix). It's not only a cool piece of software (a virtual load balancer) but it's also free!!