Surfing Complexity
https://surfingcomplexity.blog/ (RSS)
In the previous blog post, we saw how a transaction isolation strategy built on multi-version concurrency control (MVCC) does not implement the serializable isolation level. Instead, it implements a weaker isolation level called snapshot isolation. In this post, I’ll discuss how that MVCC model can be extended in order to achieve serializability,...
In a previous blog post, I talked about how we can use TLA+ to specify the serializability isolation level. In this post, we’ll see how we can use TLA+ to describe multi-version concurrency control (MVCC), which is a strategy for implementing transaction isolation. Postgres and MySQL both use MVCC to implement their repeatable read isolation …...
A play in one act Dramatis personae Scene 1: A meeting room in an office. The walls are adorned with whiteboards with boxes and arrows. EM: So, do you think the team will be able to finish all of these features by end of the Q2? TL: Well, it might be a bit tight, but … Continue reading The carefulness knob →
Concurrency is really, really difficult for humans to reason about. TLA+ itself was borne out of Leslie Lamport’s frustration with the difficulty of write error-free concurrent algorithms: When I first learned about the mutual exclusion problem, it seemed easy and the published algorithms seemed needlessly complicated. So, I dashed off a simple...
This is one of my favorite bits from fellow anglophone Québécois Norm McDonald: One of the goals I believe that we all share for post-incident work is to improve the system. For example, when I wrote the post Why I don’t like discussing action items during incident reviews, I understood why people would want to … Continue reading If you don’t...
If you’ve ever sat at a stop light that was just stuck on red, where there was clearly a problem with the light where it wasn’t ever switching green, you’ve encountered a liveness problem with a system. A liveness property of a specification is an assertion that some good thing eventually happens. In the case … Continue reading A liveness example...
Justine Tunney recently wrote a blog post titled The Fastest Mutexes where she describes how she implemented mutexes in Cosmopolitan Libc. The post discusses how her implementation uses futexes by way of Mike Burrows’s nsync library. From her post nsync enlists the help of the operating system by using futexes. This is a great abstraction …...
I’m not a fan of talking about action items during incident reviews. Judging from the incident review meetings I’ve attended throughout my career, this is a minority view, and I wanted to elaborate here on why I think this way. For more on this topic, I encourage readers to check out John Allspaw’s 2016 blog … Continue reading Why I don’t like...
Back in August, Murat Derimbas published a blog post about the paper by Herlihy and Wing that first introduced the concept of linearizability. When we move from sequential programs to concurrent ones, we need to extend our concept of what “correct” means to account for the fact that operations from different threads can overlap in … Continue...
I’m sure you’ve heard the slogan “safety first”. It is a statement of values for an organization, but let’s think about how to define what it should mean explicitly. Here’s how I propose to define safety first, in the context of a company. I’ll assume the company is in the tech (software) industry, since that’s … Continue reading Safety first! →
The other day, The Future of TLA+ (pdf) hit Hacker News. TLA+ is a specification language: it is intended for describing the desired behavior of a system. Because it’s a specification language, you don’t need to specify implementation details to describe desired behavior. This can be confusing to experienced programmers who are newcomers to TLA+,...