Jaz's Blog

Jetstream: Shrinking the AT Proto Firehose by >99%

24 Sept 2024

Bluesky recently saw a massive spike in activity in response to Brazil’s ban of Twitter. As a result, the volume of the AT Proto event firehose provided by Bluesky’s Relay at bsky.network has grown sharply: the average event rate during this surge increased by ~1,300%. Before this new surge in activity, the firehose would produce around...

How HLS Works

5 Jul 2024

Over the past few weeks, I’ve been building out server-side short video support for Bluesky. The major aim of this feature is to support short (90-second max) video streaming at a quality that doesn’t cost an arm and a leg for us to provide for free. To stay within these constraints, we’re considering making use of a video CDN that can...

An entire Social Network in 1.6GB (GraphD Part 2)

20 Apr 2024

In Part 1 of this series, we tried to answer the question “who do you follow who also follows user B” in Bluesky, a social network with millions of users and hundreds of millions of follow relationships. At the conclusion of the post, we’d developed an in-memory graph store for the network that uses HashMaps and HashSets to keep track of the...

Your Data Fits in Memory (GraphD Part 1)

15 Apr 2024

I recently shipped a new revision of Bluesky’s global AppView at the start of February and things have been going very well. The system scales and handles millions of users without breaking a sweat, and the ScyllaDB-backed Data Plane service sits at under 5% DB load in the most intense production workloads. You know what...

Scaling Go to 192 Cores with Heavy I/O

10 Jan 2024

For the past few months I’ve been working alongside Why, Jacob, Dan, and Divy on a new revision of Bluesky’s global AppView. The AppView is a piece of infrastructure that aggregates posts, likes, follows, etc. from all across ATProto and merges them into a consistent view of the network, allowing users to fetch their timelines, notifications,...

Solving Thundering Herds with Request Coalescing in Go

28 Sept 2023

Caches are a wonderful way to make your most frequent operations cheaper. If you’ve got a resource somewhere on disk (or a network hop away) that is accessed often, changes infrequently, and fits in memory, you’ve got an excellent candidate for a cache!

Caching Celebrity Posts

For example, consider a social media post from a famous celebrity....

Speeding Up Massive PostgreSQL Joins with Common Table Expressions

10 Aug 2023

I’ve been continuing to work on a growing series of services that archive, analyze, and represent data from a social network. This network creates text-based posts at a rate of around 400,000 posts per day, and I’ve been feeding the posts through different ML models to try and gauge the broad sentiment of the network and help find posters that...

Speeding up Postgres Queries by 200x with Analyze

20 May 2023

I’ve been working on a growing series of services that archive, analyze, and represent data from a social network. Part of this process involves archiving every post on a social network, running Computer Vision models on every image posted to the network, and running sentiment analysis on the text of every post on the network. Exposition: Custom...

How to use ChatGPT to Write Good Code Faster

19 Apr 2023

There’s been a lot of hype about Large Language Models (LLMs) lately with lots of cool examples of people using tools like ChatGPT to draft Python scripts, Go code, and all sorts of other useful things. Okay, so how do I use ChatGPT effectively?

How do I Use ChatGPT Effectively?

To use ChatGPT effectively for writing code, I’ve found it’s useful...

Workload Agnosticism in Large Language Models: The Foundation for the Next Generation of Computing

29 Mar 2023

As discussed in my previous post, LLMs such as OpenAI’s ChatGPT and GPT-4, Google’s Bard, and Meta’s LLaMA have risen seemingly out of nowhere, poised to disrupt the future of computing. Cloud Compute changed the landscape drastically when it was introduced by Amazon in the mid-2000s, lowering the barriers to entry for new software-based...

A Tale of Two Technologies: Why Large Language Models are the Future and the Metaverse Isn't

26 Mar 2023

In the digital landscape of recent years, two major technologies have vied for the spotlight: the Metaverse and Large Language Models (LLMs). Though the Metaverse, a virtual reality-based universe, initially garnered significant attention and expectations, it ultimately failed to revolutionize the digital world. Meanwhile, LLMs such as OpenAI’s...
