Alex Clemmer

https://blog.nullspace.io/ (RSS)
How bad is the Windows command line really?
30 Mar 2016

Kevin Gallo just announced Bash support on Windows. If you have never had to interact with the Windows Batch language, this might not seem like such a big deal. Surely Batch could not be substantially worse than Bash, right? Bash: a language that was neither designed, nor evolved. An adequate solution to a problem that has since become orders of...

Porting Mesos to Windows
20 Aug 2015

This morning, TechCrunch broke the story that Mesos support is coming to Windows. This story is meant to coincide with Ben Hindman’s MesosCon keynote, in which there will be a real, end-to-end demo showing us scheduling work on a cluster with a mix of Linux and Windows nodes. For the vast majority of the project, I have been the only dedicated...

FoundationDB proves your primary datastore is the worst place in your stack to bet on new tech
24 Mar 2015

FoundationDB has been acquired by Apple. A notice on their community site explains that they have pulled download links, and their client libraries now return 404 on GitHub. To database customers, this is a good lesson: assuming FDB did not coordinate with customers ahead of time, this instantly cost at least some FDB customers millions of...

84% of a single-threaded 1KB write in Redis is spent in the kernel
9 Feb 2015

The performance of live site systems — everything from K/V stores to lock servers — is still measured principally in latency and throughput. Server I/O performance still matters here. It is impossible to do well on either of these metrics without a performant I/O subsystem. Oddly, while the last 10 years have seen remarkable improvements in the...

Half of IBM's profits come from mainframes
26 Jan 2015

IBM was at the top of the news aggregators a couple days ago as irresponsible rumors spread that it was laying off more than 100,000 people. IBM eventually posted a scathing denial, though not quickly enough to stem the flow of Internet pundits who showed up in droves (at, e.g., Hacker News) to explain just how irrelevant IBM really is. This is a...

MSFT open sources production serialization system written partially in Haskell
9 Jan 2015

Bond is a performant serialization system developed and deployed across dozens of mission-critical, high-scale infrastructure projects internally here at Microsoft. Today the technical lead, Adam Sapek, is open sourcing the project on GitHub under the very permissive MIT license. Since there is going to be no official MSFT announcement, I would...

The time I found a bug in the .NET framework and fixed it by hand-altering the DLL.
1 Aug 2014

Update: some colleagues and ex-colleagues from the .NET framework team showed up on HN to comment about this issue. It’s worth reading. Prelude: C# supports kind-of-macros via the very neat Expression Tree API. The gist is: You build a tree that represents some C# expression. When you want to execute that expression, C# basically treats it like a...

Beginner's guide to OCaml beginner's guides.
9 Jul 2014

[Translation available in Japanese] So you want to learn OCaml. Where do you start? What do you do? I’ve been an OCaml beginner probably a dozen times — picking it up, dropping it, and picking it up again so many times I’ve lost count. This time it’s stuck, and I think it’s because the community has fundamentally changed. Here’s what worked for...

Should I Machine Learn?
7 Jul 2014

As a machine learning acolyte, I spent probably as much time trying to understand things like how and when to use machine learning as I did understanding the technical details of machine learning itself. Unfortunately, most of the discussion around machine learning is about the latter. The former gets almost no press by people who are in the...

Recovering deleted files using only grep
24 Jun 2014

In my college systems class we were required to implement malloc. I spent a week or so on it. No version control — I was both youthful and arrogant. After ironing out all the little systems bugs, I began cleaning up the directory to package up and send off for grading. I went to remove something in the same directory that also started with the...

I needed to use a regex on an airplane without Internet. So I reverse engineered the API.
14 Jun 2014

I’m sitting here on the 5 hour 40 minute flight from SEA to PHL. I need to use a regular expression to do something. But, while normally I’d use the re module, I’ve forgotten the details of the API. And to top it off, I’m certainly not ponying up $14 for the crappy plane Internet! I guess that means I have three problems. Looks like I have to go...

How does Akamai's "secure heap" patch to OpenSSL work?
12 Apr 2014

For more than a decade, Akamai has guarded their users’ private RSA keys using a security-conscious variant of the malloc family. In effect, this allows their systems to maintain a second, more secure heap, which makes it significantly harder to execute a broad class of security vulnerabilities. Yesterday, Rich Salz disseminated a patch to...

Day 266: The unfamiliar world of OS X syscalls
25 Feb 2014

Hacker School: day 266. (My batch ended on August 22, 2013, but as they say, never graduate.) After learning a bunch about concurrency primitives yesterday, I decided it would be fun to have an operational understanding of their implementation. So I decided to boot up dtruss (which is like dtrace, but for OS X) and look at the syscall pattern...

Day 265: Semaphores and mutexes. Mutices. Muticii.
23 Feb 2014

Hacker School: day 265. (My batch ended on August 22, 2013, but as they say, never graduate.) I’m sort of embarrassed to admit that I never had a really good handle on how the basic concurrency primitives are implemented. I’m now pretty glad I started to dig into this, because most of the analogies people use to describe mutexes and semaphores...

Typo in Apple's SSL implementation causes uniform failure to validate key exchanges
22 Feb 2014

UPDATE: I guess Apple has released a statement explaining that they’re not going to explain this issue, including how big of a deal it is. Ok, then I will. UPDATE 2: Well, looks like Chad beat me to posting the file to Hacker News. Heh. Tonight my friend Chad Brubaker[1] pointed me at an interesting problem. Apple has been rolling out an iOS...

Swisher's AOL.com, 15 years later
31 Jan 2014

Kara Swisher wrote her book AOL.com in 1998. In those days, the industry faced an epistemological crisis. The consumer Internet was new and ill-understood. A company worth billions at that time might have been worth nothing a year later. There was simply a limitation to what could be known. Neither the critics nor the advocates really had a good...

What "viable search engine competition" really looks like.
4 Jan 2014

Hacker News is up in arms again today about the RapGenius fiasco. See RapGenius statement and HN comments. One response article argues that we need more “viable search engine competition” and the HN community largely seems to agree. In much of the discussion, there is a picaresque notion that the “search engine problem” is really just a product...

Ticket to ride: reflecting on the value of my CS degree
2 Jan 2014

Looking back, I think the main advantage of getting a CS degree was that it gave me a lot of time to develop an intuition for how computers behave, what tools are useful for what things, and which problems are amenable to which approaches. Developing this intuition in a semi-directed environment like school is actually really useful because...

Day 212: More about counting things with fancy math.
1 Jan 2014

Hacker School: day 212. (My batch ended on August 22, 2013, but as they say, never graduate.) In a previous post I talked about how the algebraic structure of a statistic you’re aggregating can give you hints about how to distribute it across a cluster. Such is the premise of Twitter’s neat little library Algebird, which I have continued to poke...

Day 208: telnet, turning you into a human server
28 Dec 2013

Hacker School: day 208. (My batch ended on August 22, 2013, but as they say, never graduate.) When I was reverse engineering the Snapchat API, I spent a fair amount of time wondering if there was a quick way to prototype HTTP requests. After complaining to a few friends, it turns out that there is: the telnet utility[1] on unix systems. Goal:...

Day 207: How to count things.
27 Dec 2013

Hacker School: day 207. (My batch ended on August 22, 2013, but as they say, never graduate.) I’ve been playing with Brushfire, which is a machine learning library that distributes the learning of decision trees across a cluster. Most of ML is basically aggregating counts of stuff, and Brushfire is no different. What’s sort of interesting is the...

Day 206: What is an IP address, anyway?
26 Dec 2013

Hacker School: day 206. (My batch ended on August 22, 2013, but as they say, never graduate.) Networking is one of the classes I never had time for. Before today, I didn’t even have good answers for basic questions about IP addresses: What are IP addresses for? How do I get an IP address? Who assigns IP addresses? Who can see my IP address? Is it...

General lessons from scaling large systems at Microsoft
16 Dec 2013

One of the serious disadvantages of working at a place like Microsoft is that everything is built for scale, even prototypes. When we roll something out, it gets used[*]. There is no slapping something up in Django “to see if it works” because it only works if it works for millions of clients immediately. When you build systems in this way, the...

Twitter's IPO in simple terms
6 Oct 2013

The short answer to the question of why Twitter is IPO'ing now is that the timing is excellent, but the particulars of why this is true are actually really interesting. Next year is a comparatively bad time to file. Right now, the Jumpstart Our Business Startups (JOBS) Act allows “emerging” companies like Twitter to file confidentially. Briefly...

Adding Haskell's `where` keyword to OCaml
19 Aug 2013

I’ve recently been learning OCaml in my free time at the Hacker School space. Normally, if you want to define multiple variables in OCaml, you chain a bunch of lets together. This works in roughly the same way lisp’s let and let* work: (* OCaml function, computes (x+2)*2 *) let foo x = let y = x+2 in let z = y*2 in z...

Turning the Apple //e into a lisp machine, part 1
9 Aug 2013

The Hacker School space has an old Apple //e sitting around. Due to the fact that we are hackers, my friend Martin Törnwall and I decided to turn it into a lisp machine. (Full source available here.) The main obstacle was not developing the lisp itself. It was that developing software on the Apple //e was astonishingly painful: The Apple //e has...

The obvious Python parser
26 Jul 2013

I spent my first few weeks at Hacker School writing a Python compiler from basically scratch. The task of merely parsing a complete language like Python can be quite intimidating at the outset. I’ve found that many people simply assume it’s nearly impossible. I began to wonder if it was possible to write a parser so clear that it would seem...

What exactly is the difference between machine learning and statistics?
8 Mar 2013

I often get asked about the difference between statistics and machine learning. It is a tricky distinction because some things that were invented for ML (e.g., PAC theory) also get a lot of play in statistics journals, and vice-versa. To say they’re completely equivalent (which is what I often hear) is probably a bit too strong. I tend to think...

Motivating the Bayesian prior with de Finetti's theorem
4 Mar 2013

The modern axiomatization of probability theory (proposed in 1933 by Andrey Kolmogorov) was designed to provide a measure-theoretic probability calculus, that is, a definition of the rules for constructing and manipulating mathematical statements involving probabilities. Unfortunately, this axiomatization only tells us how to manipulate...

Why pseudorandomness is important in frequency moment estimation
23 Feb 2013

Finding the \( p \)-th frequency moment, denoted \( F_p \), is one of the most well-studied problems in streaming algorithms, with a broad set of applications ranging from traffic monitoring on networks, to efficient entropy estimation, to database query optimization. In the streaming setting, this task amounts to computing \( F_p(\mathbf{x}) =...
