We're on the latest code (and to be fair, Juniper are trying hard to help), but we haven't found a fix yet. We're running eBGP on them, though between data centers rather than to the internet.
We ran active/active previously on our old (actually still in production) firewalls for this client. We were apparently the only people around silly enough to try active/active between data centers (and it worked with some tweaking), but the new solution calls for increased redundancy and they don't mind coughing up for duplicate firewalls at each site.
We should have bought 2000s, actually. With the HA ports taken out we only have 6 gig ports available, and we have about 20 security zones, so we're running loads of sub-interfaces.
I'm not wild about only being able to manage IDP blades through NSM; as you say, it's something of a pain (though it has finally got my boss to shell out for NSM, which we've needed for ages). The design on this project isn't my own, so I'm having to go with it for now (if it had been me, I'd have left the ISGs as firewalls, added some 6500s to do the routing, and put Cisco IDP blades in those...).
I've just taken delivery of a pair of SSG140 Advanced boxes, which we are going to use as a proof of concept for our new ISP design. This will be the first opportunity I've had to play with the new-generation kit and also ScreenOS 6, which I'm quite excited about.
Ha, we're in a bit of a daft situation with our ISGs: we have the 2000s for the high physical port count, but we're not even remotely stretching them. They sit at around 2% most of the time. We ideally need to consolidate individual security zones down into shared DMZs to make the configuration and management simpler.
I'm not a huge fan of NSM, to be honest. I can see its merits if you need to template an IPsec VPN rollout for lots of spoke sites, for example, but management of a small number of firewalls, particularly clusters, seems pretty horrible. It attempts to update both boxes, which in turn try to sync their configs, so one ends up winning, which causes the whole system to bog down. We're running NSM in VMware though, so I suppose we only have ourselves to blame! We invariably have to run several update operations to get everything back in sync. I suppose we could turn off config sync between the boxes to get around this, though.
Thanks for the ammunition against active/active. I would rather see individual boxes at each data centre and let the routing on the network take care of the high availability than go with an overly complex active/active design! Another one to forward up to the senior guys!