IT outages reported across globe by airlines, airports, banks and others

https://www.theverge.com/2024/7/19/24201717/windows-bsod-crowdstrike-outage-issue

Thousands of Windows machines are experiencing a Blue Screen of Death (BSOD) issue at boot today, impacting banks, airlines, TV broadcasters, supermarkets, and many more businesses worldwide. A faulty update from cybersecurity provider CrowdStrike is knocking affected PCs and servers offline, forcing them into a recovery boot loop so machines can’t start properly. CrowdStrike is widely used by many businesses worldwide for managing the security of Windows PCs and servers.
 
Somebody screwed up big time. More than likely they were too eager to get an update out and did not vet it thoroughly.
The company is going to take a big hit, and competitors will be looking to capitalize on this.
 
The CrowdStrike CEO issued a statement on X today but forgot to apologize to customers for the screw-up. Considering the magnitude of the outage, I feel like this software was not tested very well.
 
I talked to an IT tech years ago who said the same thing: a lot of updates are pushed out without sufficient vetting.

Look at some of the Windows operating systems that were less than successful. They were often pushed out before they were ready.

And a lot of so-called security updates are nowhere near as important as claimed.
 
you wanted everything connected and in the cloud...
 
If it was a bad update, then why not just back it out? Windows has a recovery feature to roll back to before the last change was applied.
 
It is getting harder and harder to do that with Windows. You can do less and less without extensive IT knowledge.
Seems like the easy fix was not possible here.
 
I work in IT for my city. It affected us massively. Many in my group have been working since 11am last night. Things are coming back up and stations are fixed at this point. Absolute insanity.

Probably be some job openings at Crowdstrike after today! Lol
 
Hopefully starting at the top.
 
If it was a bad update, then why not just back it out? Windows has a recovery feature to roll back to before the last change was applied.
Even if you knew how to do it, most companies don't give normal workers the admin rights needed to do it.
 
If it was a bad update, then why not just back it out? Windows has a recovery feature to roll back to before the last change was applied.
The problem is that the update causes a BSOD crash so that's not possible.

From what I have read, the fix requires booting each Windows device into Safe Mode or the Windows Recovery Environment (WinRE) and then manually deleting files in a directory.

Because CrowdStrike's Falcon application is used by so many government agencies and large corporations, you can imagine how long that process will take nowhere-near-large-enough IT departments going from device to device performing those steps, even if the steps seem minor.
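To put a rough number on "how long", here is a back-of-the-envelope sketch. The device count, minutes per device, and number of technicians are made-up inputs for illustration, not figures from any real organization, and the function name is my own.

```python
import math


def estimate_fix_hours(devices: int, minutes_per_device: float, technicians: int) -> float:
    """Rough hands-on hours per technician, assuming all technicians work in parallel."""
    if technicians <= 0:
        raise ValueError("need at least one technician")
    # Each technician gets an equal share of the devices, rounded up.
    devices_per_tech = math.ceil(devices / technicians)
    return devices_per_tech * minutes_per_device / 60


# Hypothetical inputs: 5,000 devices, 15 minutes each, 20 technicians.
hours = estimate_fix_hours(devices=5000, minutes_per_device=15, technicians=20)
print(f"about {hours:.1f} hours of hands-on work per technician")
```

This ignores travel between sites, encrypted drives, and machines stuck in reboot loops, all of which would only push the number up.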

For a regular Windows computer, you could probably create a bootable USB drive that automates the fix with a simple insert-boot-remove-reboot process, but I am sure most larger organizations have stricter configurations in place that make this more difficult or require direct access by IT personnel to implement the changes.
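For what it's worth, the "delete files in a directory" step could be scripted roughly like this. It is only a sketch: the directory and the `C-00000291*.sys` pattern are what the publicly circulated workaround described, not something stated in this thread, so check the vendor's own guidance before relying on them. The function name is my own, and a recovery environment normally has no Python installed, so a real boot-USB fix would more likely be a batch or PowerShell script doing the same thing.

```python
from pathlib import Path


def remove_matching_files(directory: Path, pattern: str, dry_run: bool = True) -> list[Path]:
    """Find files in `directory` matching the glob `pattern`.

    Deletes them only when dry_run is False. Returns the matched paths.
    """
    matches = sorted(p for p in directory.glob(pattern) if p.is_file())
    for path in matches:
        if not dry_run:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Assumed values taken from the widely shared workaround, not from this thread.
    target_dir = Path(r"C:\Windows\System32\drivers\CrowdStrike")
    bad_pattern = "C-00000291*.sys"
    if target_dir.is_dir():
        # Dry run by default: list what would be removed without touching anything.
        for path in remove_matching_files(target_dir, bad_pattern, dry_run=True):
            print("would delete", path)
```

Defaulting to a dry run is deliberate: on a machine you are trying to rescue, you want to see what would be deleted before anything is actually removed.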

Decades ago, everything ran off of mainframes using "dumb" terminals so IT issues like this were easier to find, fix and resolve as you typically only had 1 to 3 network-wide failure points.

The transition to "smart" workstations created countless failure points, and the problem was compounded by the security systems needed to protect the workstations as well as all network devices and stored data.

In recent years, you are starting to see more virtually powered centralized servers again, but it's not the same as the mainframe days because most connected devices are still "smart" workstations.

What I am hoping to see is the separation of workstation and network integration methods in the near future.

Just like virtual servers/machines allow the separation of workstations and operating systems, it would be a lot more secure if applications ran remotely on spin-up virtual server/serverless platforms and the access methods used by remote devices were limited to enhanced terminal software/firmware, like it was back in the mainframe days.

That way, the security access would be on a security card (USB, SDCard, custom card, bio scan, etc.) and you would simply insert or scan to authorize any capable device and suddenly you are back live again.

A large corporation or government agency could then buy hundreds of cheap laptops, tablets or other corporate/government-issued mobile devices and hand them out as a temporary solution if something like this happens.
 
Sensible and logical, and exactly why it will not happen.
Well, at least in any government situation.
One would be able to spot the better-run companies if they do that.
Wanna bet that will only be a few?
 
Good explanation, thanks.
 
One thing I do know from my experience in software development is that if my team ever caused an outage like this by installing software changes, heads would roll.
 
The problem is that the update causes a BSOD crash so that's not possible.

From what I have read, the fix requires booting each Windows device into Safe Mode or the Windows Recovery Environment (WinRE) and then manually deleting files in a directory.

Because CrowdStrike's Falcon application is used by so many government agencies and large corporations, you can imagine how long that process will take nowhere-near-large-enough IT departments going from device to device performing those steps, even if the steps seem minor.
Yep, this was exactly what we had to do. We had an emergency all hands IT day. Everybody came in and it was boots on the ground, roll a vehicle, and everybody had to cover sites to clear them up. Some Windows machines were fairly quick at recovery, but there were a few that got stuck in that loop of restarting several times. So, fix times varied.

I realized I typo'd 11AM in my comment up the thread, I meant 11PM. My team had started getting trouble at around 11PM Thursday night. We deployed quickly and had most everything back up and operational by about 1PM Friday. Our team kicked butt, but dang, what a pain in the butt.
 
No idea if this works or helps, but...

New Recovery Tool to help with CrowdStrike issue impacting Windows endpoints
https://techcommunity.microsoft.com...with-crowdstrike-issue-impacting/ba-p/4196959
 
