Surfing Complexity

Lorin Hochstein's ramblings about software, complex systems, and incidents.
https://surfingcomplexity.blog/ (RSS)
visit blog
Extending MVCC to be serializable, in TLA+
4 Nov 2024 | original ↗

In the previous blog post, we saw how a transaction isolation strategy built on multi-version concurrency control (MVCC) does not implement the serializable isolation level. Instead, it implements a weaker isolation level called snapshot isolation. In this post, I’ll discuss how that MVCC model can be extended in order to achieve serializability,...

Multi-version concurrency control in TLA+
1 Nov 2024 | original ↗

In a previous blog post, I talked about how we can use TLA+ to specify the serializability isolation level. In this post, we’ll see how we can use TLA+ to describe multi-version concurrency control (MVCC), which is a strategy for implementing transaction isolation. Postgres and MySQL both use MVCC to implement their repeatable read isolation …...

The carefulness knob
30 Oct 2024 | original ↗

A play in one act Dramatis personae Scene 1: A meeting room in an office. The walls are adorned with whiteboards with boxes and arrows. EM: So, do you think the team will be able to finish all of these features by end of the Q2? TL: Well, it might be a bit tight, but … Continue reading The carefulness knob →

Specifying serializability in TLA+
29 Oct 2024 | original ↗

Concurrency is really, really difficult for humans to reason about. TLA+ itself was borne out of Leslie Lamport’s frustration with the difficulty of write error-free concurrent algorithms: When I first learned about the mutual exclusion problem, it seemed easy and the published algorithms seemed needlessly complicated. So, I dashed off a simple...

If you don’t examine what worked, how will you know what works?
18 Oct 2024 | original ↗

This is one of my favorite bits from fellow anglophone Québécois Norm McDonald: One of the goals I believe that we all share for post-incident work is to improve the system. For example, when I wrote the post Why I don’t like discussing action items during incident reviews, I understood why people would want to … Continue reading If you don’t...

A liveness example in TLA+
17 Oct 2024 | original ↗

If you’ve ever sat at a stop light that was just stuck on red, where there was clearly a problem with the light where it wasn’t ever switching green, you’ve encountered a liveness problem with a system. A liveness property of a specification is an assertion that some good thing eventually happens. In the case … Continue reading A liveness example...

Futexes in TLA+
6 Oct 2024 | original ↗

Justine Tunney recently wrote a blog post titled The Fastest Mutexes where she describes how she implemented mutexes in Cosmopolitan Libc. The post discusses how her implementation uses futexes by way of Mike Burrows’s nsync library. From her post nsync enlists the help of the operating system by using futexes. This is a great abstraction …...

Why I don’t like discussing action items during incident reviews
28 Sept 2024 | original ↗

I’m not a fan of talking about action items during incident reviews. Judging from the incident review meetings I’ve attended throughout my career, this is a minority view, and I wanted to elaborate here on why I think this way. For more on this topic, I encourage readers to check out John Allspaw’s 2016 blog … Continue reading Why I don’t like...

Linearizability! Refinement! Prophecy!
22 Sept 2024 | original ↗

Back in August, Murat Derimbas published a blog post about the paper by Herlihy and Wing that first introduced the concept of linearizability. When we move from sequential programs to concurrent ones, we need to extend our concept of what “correct” means to account for the fact that operations from different threads can overlap in … Continue...

Safety first!
31 Aug 2024 | original ↗

I’m sure you’ve heard the slogan “safety first”. It is a statement of values for an organization, but let’s think about how to define what it should mean explicitly. Here’s how I propose to define safety first, in the context of a company. I’ll assume the company is in the tech (software) industry, since that’s … Continue reading Safety first! →

You can specify even when you can’t implement
30 Aug 2024 | original ↗

The other day, The Future of TLA+ (pdf) hit Hacker News. TLA+ is a specification language: it is intended for describing the desired behavior of a system. Because it’s a specification language, you don’t need to specify implementation details to describe desired behavior. This can be confusing to experienced programmers who are newcomers to TLA+,...

↑ these items are from RSS. Visit the blog itself at https://surfingcomplexity.blog/ to find other articles and to appreciate the author's digital home.