The danger of overreaction

From the blog Surfing Complexity, 12 Jan 2025

The California-based blogger Kevin Drum has a good post up today titled “Why don’t we do more prescribed burning? An explainer.” There’s a lot of great detail in the post, but the bit that really jumped out at me was the history of the enormous forest fires that burned in Yellowstone National Park …


More from Surfing Complexity

Whither dashboard design?

22 Dec 2024 | original ↗

The sorry state of dashboards. It’s true: the dashboards we use today for doing operational diagnostic work are … let’s say suboptimal. Charity Majors is one of the founders of Honeycomb, one of the newer generation of observability tools. I’m not a Honeycomb user myself, so I can’t say much intelligently about the product. But …

The Canva outage: another tale of saturation and resilience

21 Dec 2024 | original ↗

Today’s public incident writeup comes courtesy of Brendan Humphries, the CTO of Canva. Like so many other incidents that came before, this is another tale of saturation, where the failure mode involves overload. There’s a lot of great detail in Humphries’s write-up, and I recommend you read it directly in addition to this post. What …

Quick takes on the recent OpenAI public incident write-up

15 Dec 2024 | original ↗

OpenAI recently published a public writeup for an incident they had on December 11, and there are lots of good details in here! Here are some of my off-the-cuff observations. Saturation: “With thousands of nodes performing these operations simultaneously, the Kubernetes API servers became overwhelmed, taking down the Kubernetes control plane in...

Your lying virtual eyes

7 Dec 2024 | original ↗

“Well, who you gonna believe, me or your own eyes?” – Chico Marx (dressed as Groucho), from Duck Soup. In the ACM Queue article Above the Line, Below the Line, the late safety researcher Richard Cook (of How Complex Systems Fail fame) notes that we software operators don’t interact directly with the system. Instead, …

MTTR: When sample means and power laws combine, trouble follows

2 Dec 2024 | original ↗

Think back on all of the availability-impacting incidents that have occurred in your organization over some decent-sized period, maybe a year or more. Is the majority of the overall availability impact due to: If you answered (2), then this suggests that the time-to-resolve (TTR) incident metric in your organization exhibits a power law...
