Global BSOD

I wondered whether it was actually a deployment failure rather than a testing failure. It seems so severe that it surely would have been picked up by even the most cursory of checks.

The first question our head of development asked when he saw this was 'Where are their canaries?'.

We upgrade our customers over the Internet too, but do it in stages. Each stage has customers who are randomly designated as Canaries - if their update fails, the whole update schedule is frozen until we figure out why.
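
For illustration, a minimal sketch of that kind of staged, canary-gated rollout - the function and callback names (push_update, update_succeeded, stage_size, canary_fraction) are made up for the example, not anyone's real deployment API:

Code:
import random

# Rough sketch of a staged rollout with randomly chosen canaries per stage.
# push_update / update_succeeded are hypothetical callbacks, not a real API.
def staged_rollout(customers, stage_size, canary_fraction, push_update, update_succeeded):
    random.shuffle(customers)
    stages = [customers[i:i + stage_size] for i in range(0, len(customers), stage_size)]

    for stage_num, stage in enumerate(stages, start=1):
        canaries = random.sample(stage, max(1, int(len(stage) * canary_fraction)))

        # Canaries get the update first; any failure freezes the whole schedule.
        for customer in canaries:
            push_update(customer)
        if not all(update_succeeded(c) for c in canaries):
            print(f"Stage {stage_num}: canary failure, schedule frozen pending investigation")
            return False

        # Only once the canaries look healthy does the rest of the stage get it.
        for customer in stage:
            if customer not in canaries:
                push_update(customer)
    return True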
 

Sensible approach - and even without that, wouldn't you space updates out over time? If nothing else so that customers in Sydney and London both get updated during their local night - in which case, why weren't they able to pull the update before it made it out to 8.5 million PCs?

I'd love to know the details of what went wrong inside the company, and I suspect we eventually will. Governments are likely to get involved if nothing else.
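
As a rough illustration of the time-zone staggering idea (the 01:00 local start time and the two example regions are arbitrary assumptions, not anything CrowdStrike actually does):

Code:
# Sketch: compute the next overnight deployment window per region,
# so Sydney and London each get the push during their own local night.
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

def next_night_window(tz_name, start_hour=1):
    """Next UTC time at which it is start_hour local time in tz_name."""
    now_utc = datetime.now(ZoneInfo("UTC"))
    local_now = now_utc.astimezone(ZoneInfo(tz_name))
    target = local_now.replace(hour=start_hour, minute=0, second=0, microsecond=0)
    if target <= local_now:  # today's 01:00 has passed; schedule tomorrow's
        target += timedelta(days=1)
    return target.astimezone(ZoneInfo("UTC"))

for tz in ("Australia/Sydney", "Europe/London"):
    print(tz, next_night_window(tz))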
 

I might be wrong, but I believe the update ignored policies and was pushed to everything at once - even where businesses had configured things so that some systems would be updated ahead of others...
 

Which a vendor should never be able to do, imho. We're in a different boat to CRWD - we supply an out-of-band network appliance, so if an update to it fails we don't break anything, but we still operate as if we could.

I used to work for a vendor who deployed agents to workstations. I remember one deployment we did for a PoC at a large telco - 10,000 devices pushed out over SCCM after rigorous testing by their resilience teams. 10% of the devices crashed with a BSOD and we were mortified.
Management demanded an immediate rollback and an explanation - our deal looked dead in the water. Then investigation showed that the machines impacted were running a beta version of IBM's version control software, and IBM had left .dlls compiled in debug mode in it - it was them crashing, not us, and our bacon was saved. Was brown pants time all round though, I can tell you!
 

I assume that Crowdstrike pushed this update because they believed it dealt with a very serious threat, although I don't think they've commented on that yet. The difference with their business area compared to any I've worked in is that failing to update can mean leaving their clients at risk during that time.
 
Remember, this was billed as a definition update to an AV product, and as I remember those are generally deployed ASAP. But from the posts I've seen it was deployed regardless of the policies set by companies, which is in itself a bit of a worry - they have that much control.
 

Difficult to take issues like that into account. As I mentioned in a recent post, at work we similarly had BSODs on about 10% of machines, caused by a device with a substitute chipset that was supposed to be identical but wasn't quite 100% so - it slipped through testing because head office only had devices with the normal chipset.
 
we need another brexit, none of this euro immigrant nonsense messing up our 'puters


Well, the UK would have been complicit in this **** up based on the timings in the article, but obviously not for any future issues... hence we are getting AI with the next iPhone etc.
 
You can see why Apple and other tech companies are starting to swerve the EU.

Personally I'd rather not see Windows go that way. Yes, there is a higher security risk, but I'd rather not see desktop OSes locked down, and third-party software locked out, to that extent - though there is also room for OSes like the Mac where it is. Although I'm sure that isn't why you had a pop at the EU.
 