Scattered Thoughts

https://www.scattered-thoughts.net/ (RSS)
The missing tier for query compilers
12 Jan 2025 | original ↗

0050: smolderingly fast b-trees, serious fun, what is the point of an online conference, it's ok to be afraid, HYTRADBOI progress, no other progress, vancouver.systems, not the incentives, llm garbage, books
25 Nov 2024 | original ↗

Various new written things: Smolderingly fast b-trees Serious fun What is the point of an online conference?

It's ok to be afraid
16 Nov 2024 | original ↗

My climbing this season was crippled by fear. No matter what I did, it just kept growing from week to week. By the end I didn't even want to go out any more. I was secretly relieved whenever it rained. The most frustrating part was that I couldn't figure out what I was suddenly so afraid of. There's obviously fear of injury and fear of not being able to control your own trajectory. But I trust my partner to catch me softly, and I trust my own judgement of risk...

What is the point of an online conference?
30 Oct 2024 | original ↗

I've been thinking a lot about this in preparation for the next HYTRADBOI. My experience of online conferences has mostly been underwhelming. They typically borrow the form and structure of an in-person conference without considering whether those still make sense online, and whether the goals of an online conference should even be the same as an in-person conference. The most important function of...

Serious fun
20 Oct 2024 | original ↗

It's easy to think of being serious and having fun as opposite sides of a spectrum. The problem is that 'being serious' has many unrelated meanings, for example: Serious as in somber or solemn. "This is a serious event, stop playing around." Serious as in actually trying to attain your goals, as opposed to just going through the motions. "This is a serious effort." Only the first meaning is actually opposed to fun. Fun/playful vs...

0049: hytradboi 2025, consulting, zest progress, labeled continue, bet against sql, zero-cost costs in debug, packed memory arrays, papers, books
8 Oct 2024 | original ↗

hytradboi 2025 HYTRADBOI is coming back in 2025, this time with a programming languages track. I have 14 speakers confirmed so far, but there is still plenty of room. Let me know who you want to see!

Smolderingly fast b-trees
6 Oct 2024 | original ↗

(This is part of a series on the design of a language. See the list of posts here.) Many 'scripting' languages use a hashmap for their default associative data-structure (javascript objects, python dicts, etc). Hashtables have a lot of annoying properties: Vulnerable to hash flooding....

HYTRADBOI 2025
21 Sept 2024 | original ↗

2025 Feb 28. Put it in your calendar. It's been three years since HYTRADBOI. Long enough that I've mostly forgotten how stressful it was to run a conference and it's starting to seem like a good idea again. talks The format will stay the same. 10 minute, pre-recorded, heavily-edited talks. Asynchronous chat. Join from a different time-zone, watch talks on your lunch break, answer questions...

0048: zest progress, zest ordering, wasm alignment, umbra papers, future of fast code, new internet, books, other stuff
31 Aug 2024 | original ↗

zest progress I've started working on the runtime. Many of the features of zest are going to be implemented by the runtime rather than by the compiler, but the runtime is itself written in zest. I'm slowly unpicking the dependency graph of features to make that work, so the last month saw a lot of tiny changes: Added a != operator. I somehow forgot it earlier. Added support for strings and string...

0047: babys second wasm compiler, zig honggfuzz, values can be values, dont look UB, surely you can be serious, other links, books
11 Jul 2024 | original ↗

I finally got the zest compiler architecture and codegen more or less settled. Lots of details in baby's second wasm compiler. I'm gonna take a two week break and then move on to writing the self-hosted runtime. zig honggfuzz Zig doesn't have support for coverage-guided...

Baby's second wasm compiler
9 Jul 2024 | original ↗

(This is part of a series on the design of a language. See the list of posts here.) The zest compiler today is ~4500 loc with no dependencies except the zig standard library. It generates wasm of similar quality to llvm at -O0, although I have some ideas to try later that I hope will push the output closer towards -O1. There are still some low-hanging...

Ruminating about mutable value semantics
3 Jun 2024 | original ↗

(This is part of a series on the design of a language. See the list of posts here.) I have two goals for zest that are in tension: Be a reasonably efficient imperative language (eg go, julia). Treat values as data (eg erlang, clojure, see the shape of data). In...

0046: zest syntax, zest progress, sponsors-only repos, future compilers, error-handling implementations, suboperators, why we drive
2 May 2024 | original ↗

Minimal writing this month: Zest syntax But maximum coding! zest progress Last month I mentioned that I had a good chunk of the language working in a single-pass compiler in destination-passing style, but it was becoming increasingly...

Zest: syntax
16 Apr 2024 | original ↗

(This is part of a series on the design of a language. See the list of posts here.) Popular advice for designing a language is to focus on semantics and worry about syntax later. So it might seem ill-advised to write about syntax before writing about semantics. But a) I think syntax design is underrated - it has a huge impact on the subjective feel of working with a language, and b) it's hard...

0045: unexplanations, business things, zest progress, internal consistency repro, why murat blogs, compiler books + papers, compiling sql to wasm, other books
28 Mar 2024 | original ↗

I wrote a lot this month. More unexplanations: Relational algebra is math Sql is syntactic sugar for relational algebra Some musings about small software businesses:

Notes on compiler IRs
27 Mar 2024 | original ↗

(This is part of a series on the design of a language. See the list of posts here.) zig (Based on reading code.) Zig has a bunch of IRs: Ast. Basic parse tree. Potentially contains parse errors.

Miscellaneous ideas
23 Mar 2024 | original ↗

a better sql As much as I complain about sql, it's historically been too entrenched to be worth trying to compete. Building a database is really hard, and adopting a new database is really risky, so both vendors and customers have strong incentives to be risk-averse. ORMs, BI tools, database visualizers etc all add to the incredibly strong network effects. The

How to trade software for small money?
21 Mar 2024 | original ↗

Suppose that all your expertise is in making software for use by other programmers, and you want to run a small business. How do you make money? The weird thing about being a system engineer is that it's quite easy to make something that is valuable and widely used. But quite a lot harder to get paid for it. It's easy to wake up as this brick:

Unexplanations: sql is syntactic sugar for relational algebra
18 Mar 2024 | original ↗

This idea is particularly sticky because it was more or less true 50 years ago, and it's a passable mental model to use when learning sql. But it's an inadequate mental model for building new sql frontends, designing new query languages, or writing tools like ORMs that abstract over sql. Before we get into that, we first have to figure out what 'syntactic sugar' means. Wikipedia...

Unexplanations: relational algebra is math
11 Mar 2024 | original ↗

I see this claim appear in various forms: relational algebra is math, is based on math, comes from math, has strong mathematical foundations. It's a statement that I struggle to assign any precise meaning to, but typically the intent seems to be something like: Relational algebra contains some sort of fundamental truth - discovered rather than invented. Relational algebra is powerful because of its special math sauce. Relational databases...

Zest: dialects and metaprogramming
28 Feb 2024 | original ↗

(This is part of a series on the design of a language. See the list of posts here.) I now have much of the core language sketched out and (poorly) implemented, including mutable value semantics, control-flow-capturing closures and a tentative solution for error-handling, but I'll cover those another day. Today I want to talk about the three dialects I'm...

0044: zest progress, unexplanations, generic dilemma, bitc retrospective, adversarial memory safety, done list, tiny muffins, anti-anti-spam, happiness, daily rituals, other books
28 Feb 2024 | original ↗

zest progress I'm continuing to chip away at zest. Some new posts: Notation and representation Dialects and metaprogramming I'm really...

Unexplanations: query optimization works because sql is declarative
21 Feb 2024 | original ↗

Here is a simple sql query. select users.id, ( select sum(posts.likes) from posts where posts.user_id =...

Zest: notation and representation
4 Feb 2024 | original ↗

(This is part of a series on the design of a language. See the list of posts here.) I want to be able to describe all data with a consistent notation when writing code, printing values or in graphical inspectors (see the shape of data). On top of this notation, I have two goals that are in...

2023
15 Jan 2024 | original ↗

Kind of a fragmented year. tigerbeetle I left in May. I learned a lot and I'm proud of the work I did. But I don't like the culture that surrounds startups, and that discomfort turns to burnout over time. 7am meetings probably didn't help either, putting me well out of sync with my wife who works evening shifts. I spent very little of that time writing about what I learned, other than scattered

0043: 2023, debog, never sort, critique of sql, status game, more fuel you
15 Jan 2024 | original ↗

Here's my yearly log entry for 2023. So you wanna de-bog yourself A list of ways to get stuck. It's nice to have labels for them, so they're easier to recognize and talk about. Terrible situations, once exited, often become funny...

0042: consulting lessons, there are no strings on me, buttondown, focus goof, jsfuck, 1ml
1 Dec 2023 | original ↗

consulting lessons I ended my consulting gig not long after starting, when it became obvious that we had very different ideas about how to write high-performance software. I guess I can add this to my list of things learned about consulting. It's not enough to be clear on what I'm being hired to produce - we also need to make sure we're on the same page as to what producing that output will actually look like. I'll have to think about how...

There are no strings on me
22 Nov 2023 | original ↗

(This is part of a series on the design of a language. See the list of posts here.) Long ago Steve Yegge wrote about software that feels alive - emacs, smalltalk, lisp machines etc - and lamented that the industry prefers to create dead things. Puppets on strings, not real boys. I'm...

0041: columnar kernels in go, go grouches, column sketches, why user-mode threads, gross margins vs open source, celebrity worship
4 Nov 2023 | original ↗

I'm currently doing some consulting, adding a vectorized query engine to a project written in go. I wrote a thing - columnar kernels in go - about some prep work I did before starting, figuring out how to translate familiar idioms into this foreign (to me) language. go grouches So far I'm extremely unimpressed with go. It disallows using...

Columnar kernels in go?
20 Oct 2023 | original ↗

Over the winter I'm going to be adding a columnar query engine to an existing system written in go. I'm not at all familiar with go, and it's also not ideally suited to this kind of problem, so I started out with a little toy problem that has similar challenges to help puzzle out the best strategy for the kernels. The toy problem is a simple columnar compression library. (It was intended to be a reimplementation of

0040: olap survey, lobster, feldera, innovation, wizard papers, umbra papers, olap papers
29 Sept 2023 | original ↗

I published a shallow survey of OLAP and HTAP query engines. The last 2/3rds or so of this post contains all the supporting notes. Also a lot of papers on strategies for low-latency compilation. lobster A surprisingly neat little language, exploring a lot of ideas...

0040.1: meta in myanmar
29 Sept 2023 | original ↗

I had to put this on a different page. It felt too surreal for it to be mixed in with a bunch of technical notes. Genocide requires coordination. You can't whip people up into a frenzy if you can't reach them. Thanks to heavy data subsidies most people in Myanmar access the internet exclusively through facebook, which has replaced radio and printed news. So the mass genocide has been organized mostly through facebook. As the violence in Mandalay...

A shallow survey of OLAP and HTAP query engines
28 Sept 2023 | original ↗

Focused mostly on data layout and query execution. Query planning seems more or less the same as OLTP systems, and I'm ignoring distribution and transactions for now. Also see my full notes here. It was hard to figure out what systems are even worth studying. There is so much money in this space. Search results are polluted with barely concealed advertising (eg "How to choose between FooDB and...

0039: implementing interactive languages, baby's first wasm compiler, zig 0.11, attack of the killer features, zed, attention span, psychology's loss, privatizing sovereignty
29 Aug 2023 | original ↗

I wrote up some of the compiler stuff I've been noodling on for the last month or so: Implementing interactive languages Baby's first wasm compiler I also fleshed out my

Baby's first wasm compiler
28 Aug 2023 | original ↗

(This is part of a series on the design of a language. See the list of posts here.) I made a compiler from a toy language to wasm. The code quality is very much plan-to-throw-one-away. I'm just trying to get a feel for the amount of effort involved in a non-optimizing compiler. The toy language is mostly unsurprising. It has numbers, strings, maps (hashtables) and first-class functions,...

Implementing interactive languages
24 Aug 2023 | original ↗

(This is part of a series on the design of a language. See the list of posts here.) Suppose I want to implement an interactive language - one where code is often run immediately after writing. Think scientific computing, database queries, system shells etc. So we care about both compile-time and run-time performance because we'll usually experience their sum. A traditional switch-based...

0038.1: consulting
10 Aug 2023 | original ↗

I'm putting aside around four months this year for consulting. Likely late October to end of March, with a gap in December. Here are some ideas for projects that work well for short engagements: Isolated or research-heavy features. Eg adding decorrelation to materialize took ~6 weeks and didn't require changing much existing...

0038: cheap compilation, mvs-to-wasm, automatically isolating bugs, mastodone, other stuff
28 Jul 2023 | original ↗

cheap compilation? The most popular options for implementing programming languages and query languages are: Write a bytecode interpreter and try to amortize the runtime overhead through vectorized operations (eg python calling into c libraries or duckdb's vector operations). Use LLVM and suffer the slow compilation (and the...

0037: dynamic mutable value semantics, interior pointers, uninterning, functionless effects, papers, books
27 Jun 2023 | original ↗

dynamic mutable value semantics I worked through a simple implementation of mutable value semantics here (tree-walking interpreter, no optimizations). The main difference from swiftlet/val is that it's a dynamic language - demonstrating that nothing about MVS requires static typing. The swiftlet paper goes into some...

0036: typescript, papers, books
29 May 2023 | original ↗

Taking it easy this month. typescript I'm trying to clean up the mess of bash/python/clojure/julia/rust scripts/apps/glue that I've accumulated. Javascript seems like the most obvious candidate to consolidate on, with fast startup, big ecosystem and gradual typing. I went through most of the javascript and typescript courses on Execute Program. The interface is slick and the...

0035: back to the shack
4 May 2023 | original ↗

I left tigerbeetle this week. They didn't do anything wrong, far from it. I'm just not well suited to long-term employment. I should know that by now, but I convinced myself that this time would be different. At the beginning I was talking about a 3-month trial contract, knowing the risk, and then I just got overexcited and dove in head-first. Before taking this job I was approaching critical mass - between hytradboi, github sponsors, a grant from emergent ventures...

What is a database?
19 Apr 2023 | original ↗

0034: perf handover, compaction unchained, crash harder, sketching the query engine, focus catchup, android update policies, legopunk, a world without email, nobody cares, segcache, bloomRF, existential consistency, ssd parameters, fantastic ssd internals
31 Mar 2023 | original ↗

tigerbeetle perf handover compaction unchained crash harder sketching the query engine focus catchup android update policies legopunk reading a world without email nobody cares about our concurrency control research segcache bloomRF existential consistency parameter-aware io management for ssds fantastic ssd...

0033: table sizing, bench harder, wasm first steps, sycl vancouver, breathing for warriors, move your dna, the molecule of more, how to decide, slouching towards utopia
5 Mar 2023 | original ↗

February was really broken up by Systems Distributed so not as much coding or papers to talk about as I'd like. The talks were all recorded but I don't know when they'll be published yet. tigerbeetle table sizing bench harder wasm first steps sycl vancouver reading breathing for warriors move your dna the molecule of...

0032: undroppable tombstones, forest fuzzer, manifest race, hash_log, zig coercions, zig pointer hops, zig object notation, domain knowledge, built from broken, database internals, papers
31 Jan 2023 | original ↗

at tigerbeetle undroppable tombstones forest fuzzer manifest race hash_log musing zig coercions zig pointer hops zig object notation domain knowledge reading built from broken database internals papers... I restarted my habit of reading a paper every...

2022
5 Jan 2023 | original ↗

(See other years). Office hours From January to March I ran office hours every friday morning. I wrote at the time: I'm trying to virtually recreate the cafe/salon culture I loved before moving to a tech backwater during an epidemic. I ended up talking to 35 people. Most of the time it was awkward and exhausting, much more so than...

0031: 2022, systems distributed, random ids, deleting tombstones, disorderly compaction, juggling blocks, code review woes, holiday shutdown, searching for implementors, everything is copy, sharing the page cache after fsyncgate, 9/10 climbers, rise and fall of peer review, real-world concurrency
5 Jan 2023 | original ↗

I gave up on twitter, so this log is now only posted via atom or email. I wrote a retrospective for 2022. December things: systems distributed at

0030: lsm perf, colorblind concurrency, tracing, evacuating preimp, reading, fixing my shoulders
5 Dec 2022 | original ↗

Things I've been doing in November: handmade seattle! at tigerbeetle hunting performance bugs in the lsm tree adding tracing evacuating preimp musing colorblind concurrency reading database...

0029: san francisco, seattle, tigerbeetle, studying, links
4 Nov 2022 | original ↗

I'll be in San Francisco Nov 6-11, and in Seattle for Handmade Nov 15-19. tigerbeetle In October I joined the database team at TigerBeetle. Like most of my major life decisions, this was made somewhat impulsively. There wasn't one big reason, rather many nudges that all happened to peak at the same time: I'm...

0028: HYTRADBOI jam, sqllogictest in a week, how safe is zig again, rr on alder lake, google maps jank, links
5 Oct 2022 | original ↗

HYTRADBOI jam The jam results are now up at hytradboi.com/jam. I haven't had a chance to read through all of them yet but it looks like there were some neat projects. sqllogictest in a week For my own jam project, I tried to make a sql frontend that could pass

How (memory) safe is zig?
21 Sept 2022 | original ↗

I keep seeing discussions that equate zig's level of memory safety with c (or occasionally with rust!). Neither is particularly accurate. This is an attempt at a more detailed breakdown. This article is limited to memory safety. See Assorted thoughts on zig and rust for a more general comparison. I'm concerned mostly with security. In practice, it doesn't...

0027.1: hytradboi jam
31 Aug 2022 | original ↗

Published 2022-08-31 A minor update that can't wait until the end of the month: Have you tried rubbing a database on it: The Jam An upcoming coding jam with the same theme as the HYTRADBOI conference: Turning a data-centric lens onto familiar problems to yield...

0027: preimp, framework, dotfiles and backups, links
28 Aug 2022 | original ↗

I got back to work Aug 8. The rest of the month has been blessedly uneventful. preimp I got as far as I could with the clj version. The main obstacle is provenance. The main feedback from the essay was that it would feel much more natural with...

0026: break, preimp essay, focus + clojure, zed experiments, decorrelation and nested relations, bunny, sqlite mode, reading, links
26 Jul 2022 | original ↗

break Nobody could come to our covid-era wedding, so instead we have a wave of family and friends visiting for our first anniversary. Which means that between July 16 - Aug 7 I'm barely computering at all. Progress will resume Aug 8. preimp I wrote a draft of an essay explaining preimp: the program is the database is...

0025: preimp, focus + mach, emergent ventures, clockwork labs, success, hytradboi ideas, zig debugging tips, dev-setup.sh, clojurescript blues, analogies for end-user programming, half-arsed workflows, javascript vs serialization, links
24 Jun 2022 | original ↗

Preimp I've made a lot of progress on preimp. Persistence, server/client sync and collaborative editing are all working. Values are nicely rendered as tables. Functions are rendered as forms, which you can fill out to call the function. Functions can call edit! to change the value of data cells. Metadata can be used to tweak the rendering of values. Together these features...

0024: HYTRADBOI postmortem, HYTWACFI?, preimp, emergent ventures, data and reality, merkle search trees, readyset, julia compilation times
28 May 2022 | original ↗

All the HYTRADBOI videos are now up at hytradboi.com (click on the talk titles). I also wrote a post-mortem. I'm still not sure whether I want to do another HYTRADBOI, since it took a huge chunk of energy and attention out of my year. But one option I'm considering is to aim for dramatically less polish:...

HYTRADBOI 2022 postmortem
2 May 2022 | original ↗

Genesis One morning I drank too much coffee, got over-excited and on a whim proposed a conference - Have you tried rubbing a database on it?. Would you watch a "have you tried rubbing a database on it" conference? Thinking short demo videos of weird and non-traditional uses of database ideas eg writing a dvcs in sql (

0023: HYTRADBOI teaser, dida vs datalogui, preimp cruft, dbsp, links etc
24 Apr 2022 | original ↗

I made a HYTRADBOI teaser trailer. As I write we're at 293 attendees. All the infrastructure is wired up and my todo list is mercifully clear for these last few days. I sat down briefly with Marco Munizaga to look at using dida in

The shape of data
29 Mar 2022 | original ↗

(This is part of a series on the design of a language. See the list of posts here.) Some assorted musings on how data is represented in programming systems, with no clear thesis. Mostly focused on how application-level data is represented, manipulated, viewed, stored and transmitted, as opposed to how new types and data-structures are implemented within a single language. Also...

0022: preimp, shape of data, hytradboi progress, office hours, in nyc, riffle, cue, technical dimensions, js compound keys, hop
28 Mar 2022 | original ↗

Continuing with the 'the program is the database' theme from last month, I made a clojurescript dialect (called preimp) where the only source of mutable state is the source code itself. It works well enough for spreadsheet-sized datasets: I have much more interesting interactions planned. Hopefully for next month. Doing this in clojurescript...

0021: hytradboi schedule + tickets, imp v3 ideas, real world of technology, changing minds, essence of software, typed image-based programming with structure editing, fosdem 2022, introspecting async
21 Feb 2022 | original ↗

HYTRADBOI I just published the schedule and you can now buy tickets! Imp In imp v2, the database was accessible as a value. You could write code that described a transaction and then use a keyboard shortcut in the editor to apply...

0020: hytradboi, milestones, data soup, airtable, self-hosting
3 Feb 2022 | original ↗

HYTRADBOI Many new talks have been added to the HYTRADBOI schedule. There are another 15 or so in the pipeline. It's starting to look pretty exciting. Milestones I've been thinking lately about the psychological need for a sense of completion, or at least milestones. In industrial jobs my work has typically been organized around discrete...

0019: Refactor
26 Jan 2022 | original ↗

I'm going to make this newsletter public. Reasons: As best I can tell, most of my sponsors don't much care about the newsletter itself and just want to support writing/coding. Some don't even know it exists because github kinda buries the checkbox. I'm trying to orient my own consumption towards deliberate explorations of specific subjects rather than the froth of daily news. It seems hypocritical to then be offering infotainment as an...

0018: last reflections, why start a new database conference, 2021 retrospective, imp schemaless db + crdt, office hours, internal inconsistency in the wild, rss feeds, salsa needs finite collections, tiddlywiki vs unigraph, multidimensional indexes, arrow, just don't fsync, testing distributed systems, tigerbeetle perf demos, web3, explicit formal structure, zig doctests, rust arenas, semidirect products of crdts, single-program distributed systems, sqlite qpsg, valhalla, mundanity of excellence, to mmap or not to mmap, libgavran, relational e-matching
17 Jan 2022 | original ↗

Writing: I finished the reflections series: Coding Testing Writing Why...

Imp: heterogenous types
10 Jan 2022 | original ↗

This is a part of a series - start at the beginning. Unlike sql, the denotational semantics for imp are carefully written to allow a set to contain rows of different lengths. The reason for that is that we can use such sets to model structs, tagged unions, first-class modules etc. // defining a module my_module: | "inclinations",...

2021
30 Dec 2021 | original ↗

(See other years). I felt like I didn't do much this year, but now that I write it all down... Streaming/incremental streaming-consistency: 94 commits, 2952 insertions(+), 36 deletions(-) I wanted to understand this whole space better. The goal was to produce the high-level view that appears in

Coding
20 Dec 2021 | original ↗

This post is part of a series, starting at Reflections on a decade of coding. This is going to be much more vague than the other parts of the series because this is the actual work. Good judgement is learned from experience, not from blog posts. So I think the most useful thing I can convey is what kinds of things I think about when coding, rather than what...

Why start a new database conference?
17 Dec 2021 | original ↗

Somehow I am organizing a conference called Have you tried rubbing a database on it? This isn't what I planned to do this year. How did this happen? Inception The answer is that I drank too much coffee before breakfast and ended up on twitter, where I saw someone joking that the solution to thinking that everything is a compiler problem is to learn about databases - now you have two...

0017: hytradboi updates, imp stonks, misparaphrasing oracle, technical books, rum, creator economy, friend groups, ub, omg design principles, zig build, fossil and indexes, flatpak, skiplang, convex, fuzzing beyond testing, tigerbeetle dev videos, wafl
15 Dec 2021 | original ↗

HYTRADBOI now has 10 confirmed speakers and a couple more maybes. The submissions will stay open until 2022 Feb 28 - if there is someone you would like to see speak, send them to hytradboi.com to submit a talk. Also, if anyone can intro me to someone involved with

Testing
26 Nov 2021 | original ↗

This post is part of a series, starting at Reflections on a decade of coding. This will be a short entry because I agree with almost everything written here. Read that first. I think about tests in terms of

Writing
25 Nov 2021 | original ↗

This post is part of a series, starting at Reflections on a decade of coding. I have a file called 'ideas' where I write down potential projects or thoughts that might be worth writing about. Entries grow over time as I add more thoughts. The entry that eventually became Against...

0016: dida validator and debugger, focus selector perf and async children, emotional management, speed matters, moving faster, have you tried rubbing a database on it, handmade highlights, airtable scripts, bank python, napa, pollen, against markdown, zig-snapshots, exhaustive test inputs, gf, nixos debug symbols, duckdb blog
15 Nov 2021 | original ↗

New stuff: In dida: I added a global validation routine that runs on every debug event. So far it detects any unintended aliasing. I'll test other invariants as needed. I wrote a debugger (video,

Emotional management
1 Nov 2021 | original ↗

This post is part of a series, starting at Reflections on a decade of coding. If you can focus hard on something for at least a few hours per day for several years, you will get good at it. The main obstacle to doing so is being a poorly-designed meat...

Moving faster
23 Oct 2021 | original ↗

This post is part of a series, starting at Reflections on a decade of coding. I don't think I'm very fast in an absolute sense, but I'm much much faster than I was 5 years ago. These are the things that I think made the most impact. Care The main thing that helped is actually wanting to be faster. Early on I...

Speed matters
14 Oct 2021 | original ↗

This post is part of a series, starting at Reflections on a decade of coding. I think that one of the most important things to focus on improving is how fast you can work. I wrote strucjure in 2012 and

2021 Q3 roundup
12 Oct 2021 | original ↗

9 months of being paid to make things by strangers on the internet! Here is what I did in the last quarter: In imp: Added a live repl (demo video here) Redesigned the syntax to be much more concise (before and...

0015: imp internals, reflections, precedence, make mode, mutant, q3, error recovery, tonsky ui, subtext 10, factfulness, benchmarking advice, dependency hubs, independent research, zig wayland, retool, observable dependencies, ugly buildings, without scihub, wasm virtual memory, huawei breakdown, infrastructure languages, stencil vectors, chiX
12 Oct 2021 | original ↗

New stuff: I drastically changed the internal representations in imp. This was setup work for decorrelation of higher-order functions and fix/reduce, but as a side effect it also made closures much smaller and fixed some nasty edge cases in type inference...

Better operator precedence
9 Oct 2021 | original ↗

In every explanation of parsing that I've seen, operator precedence is handled by assigning a precedence number to each operator. Operators with higher precedence numbers bind tighter eg a + b * c would parse as a + (b * c) because * has a higher precedence than +. We want this because it reduces the number of parentheses needed, and we can get away with it because the precedence rules of arithmetic are familiar to most...
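A minimal sketch of the conventional number-based scheme described in this excerpt, written as a precedence-climbing parser in Python. The `PRECEDENCE` table, the `parse` function and the tuple-based tree format are hypothetical names chosen for illustration; they are not taken from the post.

```python
import re

# Hypothetical precedence table: operators with higher numbers bind tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def tokenize(text):
    """Split input into identifiers, integers, operators and parentheses."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", text)


def parse(text):
    """Parse an infix expression into nested (op, left, right) tuples."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_atom():
        token = peek()
        if token is None or token == ")" or token in PRECEDENCE:
            raise SyntaxError(f"expected an operand, found {token!r}")
        advance()
        if token == "(":
            inner = parse_expr(1)
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return inner
        return token

    def parse_expr(min_prec):
        left = parse_atom()
        # Keep consuming operators that bind at least as tightly as min_prec.
        while peek() in PRECEDENCE and PRECEDENCE[peek()] >= min_prec:
            op = advance()
            # Requiring strictly higher precedence on the right-hand side
            # makes every operator left-associative.
            right = parse_expr(PRECEDENCE[op] + 1)
            left = (op, left, right)
        return left

    tree = parse_expr(1)
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

With this table, `parse("a + b * c")` groups the multiplication first and returns `('+', 'a', ('*', 'b', 'c'))`, matching the `a + (b * c)` reading above.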

Setting goals
6 Oct 2021 | original ↗

This post is part of a series, starting at Reflections on a decade of coding. When I don't know where I want to go, I usually don't get there. Setting explicit goals for each project is essential for: Measuring progress

Things unlearned
29 Sept 2021 | original ↗

This post is part of a series, starting at Reflections on a decade of coding. This is one of my favorite questions to ask people: what are some things that you used to strongly believe but have now changed your mind about? I want to focus especially on ideas that I wasted a lot of time on, or that got in the way of success.

Reflections on a decade of coding
22 Sept 2021 | original ↗

I've been programming professionally for about 12 years. Here are some of the things I worked on in the last 2 years. In a small team (averaging ~3 people, I think?) in a little under a year, we built a SQL query planner that passed the >6 million tests in the sqlite logic test suite. I figured out how to

On bad advice
22 Sept 2021 | original ↗

This post is part of a series, starting at Reflections on a decade of coding. Like many programmers, I'm largely self-taught. I've rarely worked with anyone more experienced than myself, especially early in my career where I spent a lot of time working with other 20-something-year-olds who also had only a few years of experience. So we all learned about how...

Thoughts on benchmarking streaming systems
6 Sept 2021 | original ↗

I spent some time thinking about how to benchmark streaming systems. Though I didn't go through with the project, the notes might still be useful for others. What is the point of benchmarking? Decision making, capacity planning. I have X queries with Y data and Z slo - what software do I use and what hardware do I buy? Focus engineering effort on bottlenecks. Inform the design of future systems by understanding design...

Focus: text
6 Sept 2021 | original ↗

So far the editor has to do these things with text: iterate forwards/backwards, insert/remove strings, look up the point corresponding to a given line, and find out the line at a given point. The simplest thing we could possibly do is just store the entire text in a contiguous string. How would this perform for a 2mb text on my 2017 xeon laptop? A string is the most efficient choice for iterating - we get a simple loop...

Focus: rendering
6 Sept 2021 | original ↗

First order of business is to put things on the screen. I chose to render to opengl rather than make a terminal app because a) I rarely need to run an editor over ssh b) it's hard to do smooth scrolling in a terminal app c) terminal protocols are insane. SDL is a library that abstracts over the details of various different windowing and input systems and makes it easy to eg

Focus: intro
6 Sept 2021 | original ↗

Over the winter of 2020, during the time where I had planned to be working on a port of differential dataflow, I instead wrote a text editor which I now use for all my writing and coding. This was perhaps not the most productive use of my time, but it wasn't wasted either: It's a huge project to write a text editor that people will actually use, but it turns out to be fairly easy to write a text editor that only one person will use. Far from being a toy...

0014: imp live repl + syntax + errors, focus highlighting + squigglies, dida nop, web woes, undrafting, rel, oracle encore, chidb, pinebook touchpad, toplev, use of a life, imgui accessibility, wheel reinvention jam, chibicc, files vs web silos, handwritten parsers, perf ninja
6 Sept 2021 | original ↗

I wrote a live repl for imp and made a demo video. I think this is one of those things where a quantitative change in speed makes a qualitative difference - in theory this is just a slightly faster way to type things into a repl, but in practice it feels totally different. Pointing at a thing to query it feels very natural. To make the live repl experience work better I made a lot of...

Imp: live repl
4 Sept 2021 | original ↗

This is a part of a series - start at the beginning. I wrote a live repl for imp. It's hard to explain interaction in text, so I made a video. Turn...

0013: till death do us part, minimum wage, dida free, implicit ordering in relational languages, ultralearning, responses to against sql, oracle decorrelation, gede improvements, antisponsoring, convivial design heuristics, knowledge transfer, crafting databases, rust complexity, antitrust, gelly, shakti, lumosql, anti-marketing, NAAL, ledger of harms, tonsky icfp, debugging stories
7 Aug 2021 | original ↗

New stuff: I got married! Great for me, sucks for anyone hoping for more code / blog posts this month :) At some point this month my income from sponsorships passed minimum wage. I'm about 2/3rds of the way towards meeting all my expenses and being able to do this indefinitely. Six months ago that seemed like an impossibly unrealistic goal. Dida updates: The core now has actual memory management. All the tests are running with leak and...

Implicit ordering in relational languages
5 Aug 2021 | original ↗

I tried implementing a text crdt in sql and in imp. Doing this naively isn't particularly hard. The challenge here is to implement it as if writing a batch query, but in such a way that an incremental system like materialize or

Ultralearning
27 Jul 2021 | original ↗

https://www.goodreads.com/book/show/44770129-ultralearning Core idea is combining everything we know about effective (as opposed to traditional) education to learn skills dramatically faster than is generally believed to be possible. The better one gets, the more one recognizes how much better one could become. Don't need talent to be...

Against SQL
9 Jul 2021 | original ↗

TLDR The relational model is great: A shared universal data model allows cooperation between programs written in many different languages, running on different machines and with different lifespans. Normalization allows updating data without worrying about forgetting to update derived data. Physical data independence allows changing data-structures and query plans without having to change all of your queries. Declarative...

2021 Q2 roundup
9 Jul 2021 | original ↗

We're coming towards the end of my second quarter on github sponsors. Thanks to everyone who gave money to a stranger on the internet so that they could type more fun things and less boring things. Here is what I did: Released dida, a library for streaming, incremental, iterative, internally-consistent computation on time-varying collections, based on

0012: dida wasm api + indexes + reduce, food and carbon emissions, async rust, handmade seattle, ideas matter, tools for thought and dida animations, redpanda wasm, live 2021, opportunity costs of twitter, work vs jobs, sourcehut simplicity, writing tools faster, ec2 trends, the state of academia
9 Jul 2021 | original ↗

New stuff: I wrote a detailed breakdown of things that are wrong with sql and how to do better in a new query language. Dida updates: Added a js<->wasm api (example here, sugar...

0011: dida release, DD reading, antirez and small tech, reactive ui, how to test, doom vs memory safety, state of academia, wasm-bindgen, apl compilers, relational.ai and salsa.jl
14 Jun 2021 | original ↗

I made the dida repo public. I still have a lot of work to do but: It works! I came up with a slightly different way of handling progress tracking that I think simplifies the implementation. DD's algorithm encapsulates subgraphs so that they look like single nodes in the parent graph. This requires nodes to be able to have multiple outputs and report complex transformations on timestamps over each...

Making live repls behave
18 May 2021 | original ↗

I like to make live repls that evaluate as you type. Here is an example for a silly little stack language with singleton sets (0 and 1), set union (|), set product (*) and dup (^, makes an extra copy of the top value on the stack). [interactive repl with a 'Limit memory usage to 2^ bytes' input and an 'Am I broken?' indicator] Try changing the numbers and seeing how that affects the...

0010: dida, live repls, query planning for streaming, rust allocators, more zig goto, database resources, guix on mac, criticising people's work, pay what you want
18 May 2021 | original ↗

The main thing I've been working on is a port of differential dataflow. I have most of the basics sketched out with naive implementations and I'm currently working on implementing progress tracking correctly, at which point simple examples like graph reachability should start working. Github doesn't have any way of saying "give all sponsors read-only access to this private repo" short of making a brand new github organization and manually spamming everyone with invites. I...

Why query planning for streaming systems is hard
8 May 2021 | original ↗

Many groups are working on running sql queries in incremental/streaming systems. Query planning in this context is not a well understood problem. This post is a quick pointer towards the many open problems. It's not intended to be complete or authoritative, because I haven't engaged with the problem much beyond noticing that it's hard and deciding not...

0009: 2021 Q1 roundup, updates to internal consistency, garden of forking paths, push vs pull, beca, cambria
24 Apr 2021 | original ↗

Hello to everyone who joined recently! I made a short summary of what I've done so far in the first quarter of 2021 and some ideas for what I'll do next (blog version, twitter version). If any of them are exciting, let me know.

2021 Q1 roundup
22 Apr 2021 | original ↗

Today marks the end of my first quarter on github sponsors. Thanks to everyone who made this possible by giving money to some rando on the internet. Here is what I did so far (some already public, some sponsors-only):

An opinionated map of incremental and streaming systems
18 Apr 2021 | original ↗

Disclaimer: I used to work for Materialize which is one of the systems being compared in this post. This is a space of systems for which we don't have clear terminology yet. The core idea uniting them is this: We have some function that turns an input into an output. The input changes slightly over time. We want to calculate the new output in a way that performs better (lower latency, higher throughput, lower resource usage etc)...

Internal consistency in streaming systems
17 Apr 2021 | original ↗

Disclaimer: I used to work for Materialize which is one of the systems being compared in this post. This post covers the internally consistent/inconsistent branch in my map of streaming systems. The core message of this post is that intuitions from distributed key-value databases don't carry over well to systems which do complex...

0008: the last internal consistency, geoffrey litt's new newsletter, business structure vs quality, aws throttling, papoc, our machinery, on twitter, injuries
17 Apr 2021 | original ↗

It's alive! Internal consistency in streaming systems. Tell your friends. Major changes since the last update: A bigger dataset with better delay distribution. A second dataset which keeps the values in balance between -1 and 1. This removes some noise and makes the strange dynamics in flink much...

0007: yet more internal consistency, re: how safe is zig, async performance, local-first software, fuzzers and emulators, deterministic hardware counters, zig goto
3 Apr 2021 | original ↗

Yet more internal consistency! I settled on a simple example that stresses most of the failure modes I'm aware of. The final view, total, should always contain a single row with the number 0. This makes it super easy to spot consistency violations. I have

Utopia of rules
19 Mar 2021 | original ↗

https://www.goodreads.com/book/show/22245334-the-utopia-of-rules I found 'Utopia of Rules' less convincing than his other books, but it helps provide context to his other arguments. Iron Law of Liberalism - any attempt to make the market more free ends up creating more red tape. The rise of modern corporations was seen as a matter of applying modern bureaucratic techniques...

Debt
19 Mar 2021 | original ↗

https://www.goodreads.com/book/show/6617037-debt 'Debt' is about the origins of money and debt. I found it fascinating, but I'm not sure where to trust it, especially given the clear agenda around tying debt to violence. Barter economies have never been seen (see Caroline Humphrey for more detail) A communalist setting is one in which people share by default It...

Bullshit jobs
19 Mar 2021 | original ↗

https://www.goodreads.com/book/show/38217638-bullshit-jobs Interesting ideas. Probably overstated, but worth reading. Outline of Bullshit Jobs: medieval model - wage labour as a path to adulthood industrial revolution - wage labour forever 70s - scientific management stole productivity increases Medieval model of...

0006: more internal consistency, how safe is zig, bullshit jobs, debt, utopia of rules, kevin's zig adventure, pinebook pro, trio
19 Mar 2021 | original ↗

As expected, I didn't have much access to electricity in the first two weeks of march. Things are getting back on track now. I put a bunch more work into the internal consistency article. Rewrote the description of internal consistency and much of the conclusion. After a back and forth with one of the noria authors, I realised...

Memory-mapped IO registers in zig
8 Mar 2021 | original ↗

Kevin Lynagh and I spent some time playing around with zig on nrf52 boards. He's written about the experience here. I wanted to additionally highlight the api we used for memory-mapped IO...

0005: internal consistency in streaming systems, MMIO in zig, a small matter of programming, rxi, martin kleppmann's new patreon, redpanda benchmarks
27 Feb 2021 | original ↗

New stuff: The first draft of internal consistency in streaming systems. I still have a few more systems to study, repros to write and authors to badger for fact-checking. Memory-mapped IO registers in zig - a little self-contained case study of api design in...

A small matter of programming
25 Feb 2021 | original ↗

https://mitpress.mit.edu/books/small-matter-programming An excellent book on end-user programming. A strong counter-argument to the widespread assumption that all we need to do is turn existing language models into pretty point-and-click diagrams. End-user programming is needed because it's impossible to predict in advance what the user will need, and expensive to cover all the...

How Materialize and other databases optimize SQL subqueries
16 Feb 2021 | original ↗

Subqueries are a SQL feature that allow writing queries nested inside a scalar expression in an outer query. Using subqueries is often the most natural way to express a given problem, but their use is discouraged because most databases struggle to execute them efficiently. This post gives a rough map of existing approaches to optimizing...

The mature optimization handbook
16 Feb 2021 | original ↗

http://carlos.bueno.org/optimization/ 'The Mature Optimization Handbook' is about monitoring and profiling continuous systems. Reasonable ideas but not particularly dense. The performance problem definition must be falsifiable Use performance measurements to try to falsify theory, not just confirm it A measurement is a number obtained during some profiling event Metadata are...

0004: map of incremental/streaming systems, draft of thoughts on benchmarking streaming systems, the mature optimization handbook, various dataflow and database talks
16 Feb 2021 | original ↗

New: An opinionated map of incremental and streaming systems. I wanted to write about query planning for streaming systems, but it turned out to be hard to do that without first writing about streaming systems in general. First draft of

Digital minimalism
6 Feb 2021 | original ↗

https://www.goodreads.com/book/show/40672036-digital-minimalism I found Digital Minimalism compelling, and much better thought out than most similar texts given eg the focus on having a high-quality leisure life. Outline of Digital Minimalism: digital media is advertising-driven makes money by stealing your solitude which you need for...

0003: optimizing correlated subqueries, digital minimalism, data-oriented design
6 Feb 2021 | original ↗

New stuff: The dive into optimizing correlated subqueries is finished. Notes on Digital Minimalism. See also older notes on...

0002: correlated subqueries intro, text editor data-structures, working in public, thoughts on independent research
30 Jan 2021 | original ↗

Welcome to everyone who signed up since the last update. Thank you for giving me money. I will spend it on sensible things like beans and running water. New posts: The intro section for the post on correlated subqueries. I'm still working on the remaining sections....

Working in public
26 Jan 2021 | original ↗

https://nayafia.substack.com/p/22-working-in-public Working in Public is a book about how open source projects work. It reminds me strongly of Governing the Commons - rather than rely on ambient mythology, Eghbal actually went and looked at how open source projects work and found that...

0001: welcome, text editor intro + rendering
25 Jan 2021 | original ↗

So one weekend I drank too much coffee and impulsively decided to start a newsletter. My biggest worry at the time was that it would launch to the sound of crickets. I've now learned that what I should have been afraid of was waking up on Monday morning to a sea of expectant faces and realizing that maybe I should have written some of this stuff in advance... Luckily, I procrastinated all winter by writing a text editor from scratch so I have ready-made filler material....

Why isn't differential dataflow more popular?
21 Jan 2021 | original ↗

Differential dataflow is a library that lets you write simple dataflow programs and a) then runs them in parallel and b) efficiently updates the outputs when new inputs arrive. Compared to competition like spark and kafka streams, it can handle more complex...

2020 spending
15 Jan 2021 | original ↗

Continuing the tradition from last year. I moved to Canada in March this year, so these numbers are only for April-December for the sake of my accounting sanity. Numbers are actual/budgeted Canadian dollars per month. total = 2420/2397 basics = 1642/1571 physical bills = 1150/1150 digital bills = 82/121 phone = ?/17 fastmail =...

Looking for debugger
13 Dec 2020 | original ↗

I'm trying to find a better debugging experience for my current projects (zig/c, nixos). Usually I just use vanilla gdb. Here is a debugging session I ran recently on this linux x64 binary. 1. Open and run until crash. [nix-shell:~/focus]$ gdb ./test ... (gdb) run ... Test [1/1]...

Looking for more debugger
13 Dec 2020 | original ↗

Continuing my search for a better debugging experience for my current projects (zig/c, nixos). The first post resulted in a ton of suggestions and I tried them all. gdb -tui (again) taviso explained how to use the tui properly. C-l fixes the display and using...

Canada's Express Entry program
26 Oct 2020 | original ↗

Wesley Aptekar-Cassels' recent posts on getting a gold card and moving to Taiwan reminded me that I meant to write about my experience with Canada. I immigrated to Vancouver, BC in March 2020. I was given permanent residency before setting foot in Canada and without a job offer, under the

Assorted thoughts on zig (and rust)
19 Oct 2020 | original ↗

I've been using zig for ~4 months worth of side projects, including a toy text editor and an interpreter for a relational language. I've written ~10kloc. That's not nearly enough time to form a coherent informed opinion. So instead here is an incoherent assortment of thoughts and experiences, in...

Small tech
7 Sept 2020 | original ↗

I frequently see debates about whether it's better to be a cog at a giant semi-monopoly, or to take investment money in the hopes of one day growing to be head cog at a giant semi-monopoly. Role models matter. So I made a list of small companies that I admire. Neither giants nor startups - just people making a living writing software on their own terms. sqlite Sqlite is an...

Imp: iteration
17 Jun 2020 | original ↗

This is a part of a series - start at the beginning. It's hard to be Turing-complete without some form of iteration or recursion. There is a tricky design problem here - I'm trying to steer between two extremes: Programming languages usually allow arbitrary user-defined iteration via loops or recursion, but this introduces false data dependencies that inhibit parallelization and...

Imp: boxes
2 Jun 2020 | original ↗

This is a part of a series - start at the beginning. Previous posts had fancy interactive examples. Keeping these in sync with constant language changes was taking a lot of time, so I'm abandoning them for now. They'll return when the language has settled enough to write docs. This is the thorniest part of imp. So far we've been dealing with first-order relations - sets of...

Imp: solving functions
29 Apr 2020 | original ↗

This is a part of a series - start at the beginning. Here is some imp code that does a reverse lookup on a table: This is kind of verbose - worse even than the equivalent sql: select name from departments where department = "marketing" This is a problem, because one of the design goals of imp is to make it easy to...

Open multiple dispatch in zig
28 Apr 2020 | original ↗

The Zig stdlib often uses open single dispatch eg: // in stdlib pub fn serialize(self: *Serializer, value: var) !void { const T =...

Pinephone first steps
27 Apr 2020 | original ↗

After anki trashed my history, uploaded the trashed history to the sync server, and then repeatedly re-synced the trashed history every time I tried to restore from backup, I wrote my own spaced repetition app. It reads input from a markdown file, asks questions in the terminal and stores state in a json file. I wrote those few hundred lines of code in an afternoon and I've been using it ever since. I wanted to run it on my android phone too, but after several days of...

SELECT wat FROM sql
16 Apr 2020 | original ↗

Working on a postgres-compatible query compiler has taught me many things. Things I was better off not knowing. Let's begin: jamie=# create table nums(a int primary key, b text); CREATE TABLE jamie=# insert into nums(a,b) values (0, 'foo'), (1, 'foo'), (2, 'foo'), (3, 'bar'); INSERT 0 4 jamie=# select a+1 from nums group by a+1; ...

Imp: decorrelation
2 Feb 2020 | original ↗

This is a part of a series - start at the beginning. In SQL, when a subquery references a variable from the surrounding query, this is called a correlated subquery. If you were feeling a bit strange, you could use a correlated subquery to express a join. These two queries are identical (as long as bar.b is a primary key): select...

2019 spending
4 Jan 2020 | original ↗

In the last year I started to pay much more attention to how I spend money. I'll write much more about the reasons at some point, but the short version is that I care a lot about autonomy and agency, and spending less is a critical part of that. I set myself a target of £2000/month for 2019. The median take-home salary in London is £2250/month, so this is hardly a stretch, but it's a start. Today I went through my accounts for the year to see how that worked out on...

Imp: simple interpreter
18 Oct 2019 | original ↗

This is a part of a series - start at the beginning. The denotational semantics define what an imp expression means, but it defines the meaning in terms of infinite sets. How can we evaluate those semantics? The little interactive...

Imp: types
14 Oct 2019 | original ↗

This is a part of a series - start at the beginning. Before looking at the interpreter, I'm going to introduce a very simple type system. The type system has two jobs. It rules out sets that contain tuples of varied lengths. The denotational semantics allow sets like (1 x 2) | 3, but it's confusing in practice and banning it allows us to make...

Imp: denotational semantics
1 Oct 2019 | original ↗

This is a part of a series - start at the beginning. This is just going to be a big pile of \( \LaTeX \). If you're not into that, check out the previous post where I explain the same things with pretty interactive repls instead. \[ \gdef\D#1#2{\llbracket #1...

Imp: core language
30 Sept 2019 | original ↗

This is a part of a series - start at the beginning. What do I want out of the core language? Simple denotational semantics, no evaluation order, logical data independence, amenable to parallel evaluation, amenable to incremental maintenance. The datalog family of languages fits the bill, but they struggle to express and compress many...

Imp: intro
27 Sept 2019 | original ↗

Frugality is non-linear
9 Apr 2019 | original ↗

Most people have a mental model of budgeting which is roughly linear. If you spend half as much money, your money will last twice as long. As you approach zero spending, your runway goes up to infinity. In this model, the space of options looks like this: [interactive graph with a 'Return on investment' input] This model is wrong. It's wrong because your savings grow over time. If you change the...
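As a rough illustration of the contrast in this excerpt, here is a small Python sketch that compares the linear model with one where savings grow. The function names, the yearly compounding and the example figures are all assumptions made for illustration; they are not the model or the numbers from the post.

```python
def runway_linear(savings, annual_spending):
    """Linear model: savings never grow, so runway is savings / spending."""
    return savings / annual_spending


def runway_with_growth(savings, annual_spending, annual_return, max_years=1000):
    """Whole years of spending that can be funded when the remaining balance
    grows by `annual_return` each year (an assumed, simplified model).

    Returns float('inf') if the balance still covers spending after
    `max_years`, which is treated here as "never runs out".
    """
    years = 0
    while savings >= annual_spending:
        if years >= max_years:
            return float("inf")
        # Spend for the year, then let what is left grow.
        savings = (savings - annual_spending) * (1 + annual_return)
        years += 1
    return years
```

With zero return the two functions agree. With a positive return the runway is longer than the linear estimate, and once the yearly growth covers the yearly spending the balance never runs out, which is where the linear picture breaks down.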

Zero-copy deserialization in Julia
28 Aug 2018 | original ↗

While working with RelationalAI I wrote Blobs.jl, a library for zero-copy deserialization in Julia. Not super exciting in itself, but it nicely demonstrates the kinds of zero-overhead abstractions that are possible in Julia. The problem Folks at RelationalAI want to build various complex on-disk data-structures, with...

Julia as a platform for language development
16 Aug 2018 | original ↗

I gave a talk at JuliaCon 2018 on my experience using Julia to implement various declarative languages, including a datalog variant, a

Psychology vs the graphics pipeline
11 Dec 2017 | original ↗

(EDIT: Much more accurate measurements are available eg for native software and for web software. The former says that most native psych libraries can get to frame-level accuracy with the appropriate hardware. The latter has results for web software similar to what I found here and...

Staged interpreters in rust
22 Nov 2017 | original ↗

Last week I was writing an interpreter for a query language. On arithmetic-heavy queries the interpreter overhead was >10x compared to a compiled baseline. I tried staging the interpreter to move the overhead out of the inner loops. In the end the results weren't worth the complexity compared to just writing a compiler so I didn't end up finishing it. But I think it's a neat idea anyway so I wrote a much simpler example to demonstrate. (It's essentially a

Contrast codes are an implementation detail
21 Nov 2017 | original ↗

I found contrast codes really confusing on first contact. In hindsight, this is because they are typically presented as being part of the model, but it seems much more ergonomic to me to consider them part of the inference algorithm, as I'll explain here. If you haven't encountered contrast codes before - good. Stay there. You are not missing out. If you have encountered contrast codes...

A UI library for a relational language
28 Jul 2017 | original ↗

TLDR: I'm working on a relational programming language intended for rapid GUI dev. Having tried a couple of different approaches to describing GUIs, I've settled on a React-like library that binds relational data to HTML templates.

Decision points and utility
8 Jul 2017 | original ↗

I want to start getting up in the morning, rather than in the afternoon. It seems like a pretty clear choice: (1) sleep in all morning and be grumpy, or (2) get up early and be happy. So afternoon-me makes a plan to get up early every morning, sets an alarm and calls it a day. And that would be the end of it, except that when morning rolls around morning-me discovers a third option. Sleep in all morning and be grumpy Get up...

Monolog
16 Mar 2017 | original ↗

Suppose I want to track how I spend my day. There are plenty of existing time-tracking apps, but none of them were invented here. I could build an app from scratch. I would just need to write code for displaying entries, adding new entries, editing old entries, summarizing data, storing data, tracking changes... Or I could just dump it in a spreadsheet, add a few formulas and call it a day. This is the real magic of spreadsheets - there is a large class of...

Quick and dirty review of Psychology of Programming Interest Group 1989-2015
17 Nov 2016 | original ↗

Inspired by Ji Yi's homework assignment, I decided to skim all 360-odd papers in the Psychology of Programming Interest Group archives. Yi reports his students taking ~10 hours to skim 100 papers over a single weekend. His students are clearly way more disciplined than I am. I dragged it out for months. It...

Vive experiments
28 Oct 2016 | original ↗

I bought an HTC Vive during the summer and made some simple toys. These are all a few months old, but I didn't get around to writing anything down until today. This is the first time I've found a VR experience convincing. I'm very susceptible to motion sickness in general, and every previous VR system I've tried has made me unhappy, sometimes for hours afterwards. The Vive is mostly fine, with the exception of a few...

A practical relational query compiler in 500 lines of code
11 Oct 2016 | original ↗

Imp needed a relational database that is simple enough to experiment with but fast enough to power real applications. Relational databases are usually complicated beasts. Even SQLite, a relatively lightweight database, is 116,000 lines of code. Its btree...

Complexity budgets
25 Oct 2015 | original ↗

I notice a tendency to make individual engineering decisions by maximising 'goodness'. Patch X makes the code more complex, but it adds a new feature or increases performance or makes debugging easier. We add up the goodness points, subtract the badness points and if the result is more than zero it's a good patch. Unfortunately, complexity does not add up linearly. The total cost of a set of features is not just the sum of the cost of each feature. Complexity limits how...

Three months of rust
4 Jun 2015 | original ↗

I work on Eve, a functional-relational programming language and environment. Since the Eve editor has to run in a browser we built the first few versions entirely in javascript. This has been pretty painful, so a little over three months ago we started looking at other options. The only hard requirements for the runtime are a) we need control over memory layout and b) we need to safely execute untrusted Eve...

Scaling down
9 Feb 2015 | original ↗

The programming world is obsessed with scaling up. How many million lines of code can we maintain? How many petabytes of data can we process? How deeply can I customise this editor? More code, more data, more people, more machines. Nobody talks about scaling down. The vast majority of programs are never written. Ideas die stillborn because the startup cost is too high to bear. When we focus entirely on the asymptotic cost of developing large systems we neglect the...

Imperative thinking and the making of sandwiches
21 Jul 2014 | original ↗

People regularly tell me that imperative programming is the natural form of programming because 'people think imperatively'. I can see where they are coming from. Why, just the other day I found myself saying, "Hey Chris, I'm hungry. I need you to walk into the kitchen, open the cupboard, take out a bag of bread, open the bag, remove a slice of bread, place it on a plate..." Unfortunately, I hadn't specified where to find the plate so at this point Chris threw a null pointer...

Pain we forgot
17 May 2014 | original ↗

Much of the pain in programming is taken for granted. After years of repetition it fades into the background and is forgotten. The first step in making programming easier is to be conscious of what makes it hard. So let's put ourselves in the shoes of a smart but inexperienced end user trying to build, test and maintain a simple application. Anon the intern is charged with managing lunch orders and quickly realises that their job could be done by a computer: Every day at...

Local state is harmful
17 Feb 2014 | original ↗

Picture a traditional webapp. We have a bunch of stateless workers connected to a stateful, relational database. This is a setup with a number of excellent properties: All state can be queried using a uniform API, SQL. This enables flexible ad-hoc exploration of the application state as well as generic UIs like Django admin. Every item of state has a unique and...

Search trees and core.logic
19 Dec 2012 | original ↗

Last week David Nolen (the author of core.logic) was visiting Hacker School so I decided to poke around inside core.logic. I made a PR that adds fair conjunction, user-configurable search and a parallel solver. First, a little background. From a high-level point of view, a constraint solver does three things: specifies a search space in the form of a set of...

Strucjure: motivation
4 Dec 2012 | original ↗

I feel that the readme for strucjure does a reasonable job of explaining how to use the library but not of explaining why you would want to. I want to do that here. I'm going to focus on the motivation behind strucjure and the use cases for it rather than the internals, so try not to worry too much about how this all works and just focus on the ideas (the implementation itself is

Causal ordering
16 Aug 2012 | original ↗

Causal ordering is a vital tool for thinking about distributed systems. Once you understand it, many other concepts become much simpler. We'll start with the fundamental property of distributed systems: messages sent between machines may arrive zero or more times at any point after they are sent. This is the sole reason that building distributed systems is hard. For example, because of this property it is impossible for two...

Optimising texsearch
8 Dec 2010 | original ↗

Texsearch is a search engine for LaTeX formulae. It forms part of the backend for latexsearch.com which indexes the entire Springer corpus. Texsearch has only a minimal understanding of LaTeX and no understanding of the structure of the formulae it searches in, but unlike its competitors (eg

Design and analysis of a gossip algorithm
4 Sept 2010 | original ↗

My MSc dissertation, 'Design and Analysis of a Gossip Algorithm', in which I present an algorithm for forming a dynamic, unstructured overlay in which each node can generate a stream of independent, uniformly distributed samples of the overlay membership. Such peer sampling services form the basis for a number of gossip algorithms...

Examining scampy
19 May 2010 | original ↗

Scampy is a bot for engaging 419 scammers in pointless conversation and consuming time that could have been spent on real victims. It was originally intended to be a smart bot. I had visions of data mining conversations and inventing DSLs for chat scripts. This all takes time, however, so in order to get up and running quickly the prototype just selects responses at random from a prewritten list. This turns out to be...

↑ These items are from RSS. Visit the blog itself at https://www.scattered-thoughts.net/ to find everything else and to appreciate the author's digital home.