Especially in high-level languages, inlining is most useful when it causes: Optimizing the callee’s body based on the arguments passed. Optimizing the call site based on the callee’s return value. Let’s look at some examples. Example: avoiding redundant bounds checks Suppose we have a library for decoding some format of binary files with...
In the first post of this series we looked at a few different ways of parsing a simple JSON-like language. In the second post we implemented a few lexers, and looked at the performance when the parsers from the first post are combined with the lexers in the second post. One of the surprising results in these posts is that our recursive descent...
In the previous post we looked at three different parsing APIs, and compared them for runtime and the use cases they support. In this post we’ll add a lexer (or “tokenizer”), with two APIs, and for each lexer API see how the parsers from the previous post perform when combined with a lexer. What is a lexer? A lexer is very similar to the event...
Consider a simplified and enhanced version of JSON, with these changes: Numbers are 64-bit unsigned integers. Strings cannot have control and escape characters. Single-line comments are allowed, with the usual syntax: // ... . When parsing a language like this, a common first step is to define an “abstract syntax tree” (AST), with only the...
The main use case of resumable exceptions would be collecting a bunch of errors (instead of bailing out after the first one) to log or show to the user, or actually recovering and continuing from the point of error detection, rather than in a call site. Why not design the code to allow error recovery, instead of using a language feature? There...
Code is tree structured, but manipulated as a sequence of characters. Most language tools need to convert this sequence of characters to tree form before they can do anything. When the program is being edited, the tree structure is often broken, and often to the point where the tool cannot operate. For example: An opening...
Subtyping is a relation between two types. It often comes with a typing rule called “subsumption”, which says that if type B is a subtype of type A (usually shown as B <: A), then a value of type B can be assumed to have type A. The crucial part is that subsumption is implicit: the programmer doesn’t explicitly cast the value with type B to type A....
OOP is certainly not my favorite paradigm, but I think mainstream statically-typed OOP does a few things right that are very important for programming with many people, over long periods of time. In this post I want to explain what I think is the most important one of these things that the mainstream statically-typed OOP languages do well. I will...
Since 2013 I’ve had the chance to use OCaml a few times in different jobs, and I got frustrated and disappointed every time I had to use it. I just don’t enjoy writing OCaml. In this post I want to summarize some of the reasons why I don’t like OCaml and why I wouldn’t choose it for a new project today. No standard and easy way of implementing...
I like anonymous records and row polymorphism, but until recently I didn’t know how to generate efficient code for polymorphic record access. In this blog post I will summarize the different compilations of polymorphic record accesses that I’m aware of. All of the ideas shown in this post can be used to access a record field when the record’s...
I was recently thinking about why so many languages have tuples, which can be thought of as simple anonymous products (more on the definition of this below), but not something similar for sums. Both sum and product types are widely used, so it seems inconsistent to have anonymous products but not sums. I recently tweeted about this and got...
Suppose you have a no_std crate that you want to use in two ways: (1) as a self-contained static library, to link with other (non-Rust) code; (2) as a Rust library, to import from another crate to test it. (1) is the main use case for this library. (2) is because you want to test this library and you want to be able to use Rust’s std and other Rust...
21 Jun 2020 was my last day at Well-Typed and as a GHC maintainer/developer. On the 22nd I joined the programming language team at DFINITY to work on the Motoko programming language. Here’s the summary of my 8 years writing Haskell pretty much non-stop: In 2012 I wrote my first Haskell program, which was a chat server. I was reading “Real World...
Being able to specify conditions in gdb breakpoints is quite useful. For example, if I’m interested in mmap(NULL, ...) calls I can do break mmap if addr == 0 and gdb doesn’t break on mmap when the addr == 0 condition doesn’t hold. I’ve used this many times to great effect, but it’s not always sufficient: sometimes I need to break not when a...
I recently published a new post on Well-Typed’s blog: “The problem with adding functions to compact regions”. It’s also shared on Twitter and /r/haskell. If you have any questions/comments feel free to ping me in any of these places, or add a comment below!
In the previous post we looked at a representation of expressions in a programming language, what the representation makes easy and where we have to use knot-tying. In this post I’m going to give two more examples, using the same expression representation from the previous post, and then talk about how to implement our passes using a different...
Suppose I have this simple language: data Expr = IdE Id | IntE Int | Lam Id Expr | App Expr Expr | IfE Expr Expr Expr | Let Id Expr Expr When generating code, for an identifier that stands for a lambda, I want to know the arity of the lambda, so that I can generate more efficient code. While in this language a lambda takes only one...