Although I’ve been using Python 3.12 in production for nearly a year, one neat feature in the typing module that escaped me was the @override decorator. Proposed in PEP 698, it’s been hanging out in typing_extensions for a while. This is one of those small features you either don’t care about or get totally psyched over. I’m definitely in the...
This morning, someone on Twitter pointed me to PEP 562, which introduces __getattr__ and __dir__ at the module level. While __dir__ helps control which attributes are listed when calling dir(module), __getattr__ is the more interesting addition. The __getattr__ function in a module works similarly to the method of the same name in a Python class. For example:...
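Here's a minimal sketch of the module-level hook. To keep it self-contained, it builds a throwaway module in memory instead of using a separate file; `legacy_mod`, `new_name`, and `old_name` are hypothetical names, not anything from the PEP.

```python
import types

# Build a throwaway module in memory (hypothetical name: legacy_mod).
legacy_mod = types.ModuleType("legacy_mod")
legacy_mod.new_name = 42


def _module_getattr(name: str):
    # Invoked only when normal attribute lookup on the module fails.
    if name == "old_name":
        return legacy_mod.new_name
    raise AttributeError(f"module 'legacy_mod' has no attribute {name!r}")


# In a real module file you'd write `def __getattr__(name): ...` at the
# top level; here the function is attached to the module object instead.
legacy_mod.__getattr__ = _module_getattr

print(legacy_mod.old_name)  # resolved through the module-level __getattr__
```

A typical use is keeping a renamed attribute reachable under its old name while you migrate callers.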
I always get tripped up by Docker’s different mount types and their syntax, whether I’m stringing together some CLI commands or writing a docker-compose file. Docker’s docs cover these, but for me, the confusion often comes from how “bind” is used in various contexts and how “volume” and “bind” sometimes get mixed up in the documentation. Here’s...
I was fiddling with graphlib in the Python stdlib and found it quite nifty. It processes a Directed Acyclic Graph (DAG), where tasks (nodes) are connected by directed edges (dependencies), and returns the correct execution order. The “acyclic” part ensures no circular dependencies. Topological sorting is useful for arranging tasks so that each...
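As a rough sketch of the idea, here's `graphlib.TopologicalSorter` ordering a tiny dependency graph. The task names are made up for illustration.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on (hypothetical names).
graph = {
    "deploy": {"test", "build"},
    "test": {"build"},
    "build": {"fetch"},
    "fetch": set(),
}

# static_order() yields tasks so that every dependency comes first.
# It raises graphlib.CycleError if the graph contains a cycle.
order = list(TopologicalSorter(graph).static_order())
print(order)
```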
Besides retries, circuit breakers are probably one of the most commonly employed resilience patterns in distributed systems. While writing a retry routine is pretty simple, implementing a circuit breaker needs a little bit of work. I realized that I usually just go for off-the-shelf libraries for circuit breaking and haven’t written one from...
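To give a feel for the moving parts, here's a bare-bones circuit breaker sketch in Python. It's an illustration under my own assumptions, not the implementation from the post: the `CircuitBreaker` and `CircuitOpenError` names, the thresholds, and the injectable `clock` are all hypothetical.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; call rejected")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

This version isn't thread-safe and treats every exception as a failure. A real one would likely want a lock and a way to pick which errors count.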
I’m not really a fan of shims—code that automatically performs actions as a side effect or intercepts actions when you use the shell or when a prompt runs. That’s mainly why I’ve stayed away from tools like asdf or pyenv, and instead stick to apt or brew for managing my binary installs, depending on the OS. Recently, though, I’ve started seeing...
I spent the evening watching this incredibly grokkable talk on event-driven services by James Eastham at NDC London 2024. Below is a cleaned-up version of my notes. I highly recommend watching the full talk if you’re interested before reading this distillation. The curse of tightly coupled microservices Microservices often...
While going through a script at work today, I came across Bash’s nameref feature. It uses declare -n ref="$1" to set up a variable that allows you to reference another variable by name, kind of like pass-by-reference in C. I’m pretty sure I’ve seen it before but probably just skimmed over it. But as I dug into the man page, I realized there’s a...
When I started writing here about five years ago, I made a promise to myself that I wouldn’t give in to the trend of starting a blog, adding one overly enthusiastic entry about the stack behind it, and then vanishing into the ether. I was somewhat successful at that and wanted to write something I can link to when people are curious about the...
I always struggle with the syntax for redirecting multiple streams to another command or a file. LLMs do help, but beyond the most obvious cases, it takes a few prompts to get the syntax right. When I know exactly what I’m after, scanning a quick post is much faster than wrestling with a non-deterministic kraken. So, here’s a list of the...
Here’s a Python snippet that makes an HTTP POST request: # script.py import httpx from typing import Any async def make_request(url: str) -> dict[str, Any]: headers = {"Content-Type": "application/json"} async with httpx.AsyncClient(headers=headers) as client: response = await client.post( url, json={"key_1": "value_1", "key_2": "value_2"}, )...
I love @pytest.mark.parametrize—so much so that I sometimes shoehorn my tests to fit into it. But the default style of writing tests with parametrize can quickly turn into an unreadable mess when the test complexity grows. For example: import pytest from math import atan2 def polarify(x: float, y: float) -> tuple[float, float]: r = (x**2 + y**2)...
I learned this neat Bash trick today where you can make a raw HTTP request using Bash’s special /dev/tcp pseudo-device path, without reaching for tools like curl or wget. This came in handy while writing a health check script that needed to make a TCP request to a service. The following script opens a TCP connection and makes a simple GET request to example.com:...
Let’s say you have a web app that emits log messages from different layers. Your log shipper collects and sends these messages to a destination like Datadog where you can query them. One common requirement is to tag the log messages with some common attributes, which you can use later to query them. In distributed tracing, this tagging is usually...
With the recent explosion of LLM tools, I often like to kill time fiddling with different LLM client libraries and SDKs in one-off scripts. Lately, I’ve noticed that some newer tools frequently mess up the logger settings, meddling with my application logs. While it’s less common in more seasoned libraries, I guess it’s worth rehashing why...
TIL about the install command on *nix systems. A quick GitHub search for the term brought up a ton of matches. I’m surprised I just found out about it now. Often, in shell scripts I need to: create a directory hierarchy, copy a config or binary file to the new directory, and set permissions on the file. It usually looks like this: # Create the...
I was working on the deployment pipeline for a service that launches an app in a dedicated VM using GitHub Actions. In the last step of the workflow, the CI SSHs into the VM and runs several commands using a here document in Bash. The simplified version looks like this: # SSH into the remote machine and run a bunch of commands to deploy the...
One of the reasons why I’m a big advocate of rebasing and cleaning up feature branches, even when the changes get squash-merged to the mainline, is that it makes the PR reviewer’s life a little easier. I’ve written about my rebase workflow before and learned a few new things from the Hacker News discussion around it. While there’s been no...
People tend to get pretty passionate about Git workflows on different online forums. Some like to rebase, while others prefer to keep the disorganized records. Some dislike the extra merge commit, while others love to preserve all the historical artifacts. There’s merit to both sides of the discussion. That being said, I kind of like rebasing...
People typically associate Google’s Protocol Buffers with gRPC services, and rightfully so. But things often get confusing when discussing protobufs because the term can mean different things: A binary protocol for efficiently serializing structured data. A language used to specify how this data should be structured. In gRPC services, you...
The handful of times I’ve reached for typing.TypeGuard in Python, I’ve always been confused by its behavior and ended up ditching it with a # type: ignore comment. For the uninitiated, TypeGuard allows you to apply custom type narrowing. For example, let’s say you have a function named pretty_print that accepts a few different types and prints...
One neat use case for the HTTP ETag header is client-side HTTP caching for GET requests. Along with the ETag header, the caching workflow requires you to fiddle with other conditional HTTP headers like If-Match or If-None-Match. However, their interaction can feel a bit confusing at times. Every time I need to tackle this, I end up spending some...
Every once in a while, I find myself skimming through the MDN docs to jog my memory on how CORS works and which HTTP headers are associated with it. This is particularly true when a frontend app can’t talk to a backend service I manage due to a CORS error. MDN’s CORS documentation is excellent but can be a bit verbose for someone just looking...
Ever since Rob Pike published the text on the functional options pattern, there’s been no shortage of blogs, talks, or comments on how it improves or obfuscates configuration ergonomics. While the necessity of such a pattern is quite evident in a language that lacks default arguments in functions, more often than not, it needlessly complicates...
In 9th grade, when I first learned about Lenz’s Law in Physics class, I was fascinated by its implications. It states: The direction of an induced current will always oppose the motion causing it. In simpler terms, imagine you have a hoop and a magnet. If you move the magnet close to the hoop, the hoop generates a magnetic field that pushes the...
These days, I don’t build hierarchical types through inheritance even when writing languages that support it. Type composition has replaced almost all of my use cases where I would’ve reached for inheritance before. I’ve written about how to escape the template pattern hellscape and replace that with strategy pattern in Python before. While by...
While I like Go’s approach of treating errors as values as much as the next person, it inevitably leads to a situation where there isn’t a one-size-fits-all strategy for error handling like in Python or JavaScript. The usual way of dealing with errors entails returning error values from the bottom of the call chain and then handling them at the...
I used to reach for reflection whenever I needed a Retry function in Go. It’s fun to write, but gets messy quite quickly. Here’s a rudimentary Retry function that does the following: It takes in another function that accepts arbitrary arguments. Then it tries to execute the wrapped function. If the wrapped function returns an error after execution,...
Despite moonlighting as a gopher for a while, the syntax for type assertion and type switches still trips me up every time I need to go for one of them. So, to avoid digging through the docs or crafting stodgy LLM prompts multiple times, I decided to jot this down in a gobyexample style for the next run. Type assertion Type assertion in Go...
I’ve been a happy user of pydantic settings to manage all my app configurations since the 1.0 era. When pydantic 2.0 was released, the settings portion became a separate package called pydantic_settings. It does two things that I love: it automatically reads the environment variables from the .env file and allows you to declaratively convert...
As of now, unlike Python or NodeJS, Go doesn’t allow you to specify your development dependencies separately from those of the application. However, I like to specify the dev dependencies explicitly for better reproducibility. While working on a new CLI tool for checking dead URLs in markdown files, I came across this neat convention: you can...
I love dynamically typed languages as much as the next person. They let us make ergonomic API calls like this: import httpx # Sync call for simplicity r = httpx.get("https://dummyjson.com/products/1").json() print(r["id"], r["title"], r["description"]) or this: fetch("https://dummyjson.com/products/1") .then((res) => res.json()) .then((json) =>...
While I tend to avoid *args and **kwargs in my function signatures, it’s not always possible to do so without hurting API ergonomics. Especially when you need to write functions that call other helper functions with the same signature. Typing *args and **kwargs has always been a pain since you couldn’t annotate them precisely before. For example,...
I needed to integrate rate limiting into a relatively small service that complements a monolith I was working on. My initial thought was to apply it at the application layer, as it seemed to be the simplest route. Plus, I didn’t want to muck around with load balancer configurations, and there’s no shortage of libraries that allow me to do this...
You can use @dataclass(frozen=True) to make instances of a data class immutable during runtime. However, there’s a small caveat—instantiating a frozen data class is slightly slower than a non-frozen one. This is because, when you enable frozen=True, Python has to generate __setattr__ and __delattr__ methods during class definition time and invoke...
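A quick sketch of what freezing buys you. The `Point` class is a made-up example.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Point:
    x: int
    y: int


p = Point(1, 2)

try:
    p.x = 10  # attribute assignment is blocked on frozen instances
except FrozenInstanceError:
    print("cannot assign to field 'x'")

# The generated guards live on the class itself.
print("__setattr__" in vars(Point), "__delattr__" in vars(Point))
```

To see the instantiation overhead on your own machine, you could time `Point(1, 2)` against a non-frozen twin with `timeit`.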
When I started my career in a tightly-knit team of six engineers at a small e-commerce startup, I was struck by the remarkable efficiency of having a centralized hub for all the documents used for planning. We used a single Trello board with four columns—To-do, Doing, Q/A, Done—where the tickets were grouped by feature tags. We’d leverage a dummy...
I’ve always had a thing for old-school web tech. By the time I joined the digital fray, CGI scripts were pretty much relics, but the term kept popping up in tech forums and discussions like ghosts from the past. So, I got curious, started reading about them, and wanted to see if I could reason about them from the first principles. Writing one...
Despite using VSCode as my primary editor, I never really bothered to set up the native debugger to step through application code running inside Docker containers. Configuring the debugger to work with individual files, libraries, or natively running servers is trivial. So, I use it in those cases and just fall back to my terminal for...
Data classes are containers for your data—not behavior. The delineation is right there in the name. Yet, I see state-mutating methods getting crammed into data classes and polluting their semantics all the time. While this text will primarily talk about data classes in Python, the message remains valid for any language that supports data classes...
Despite being an IC for the bulk of my career, finding my groove amidst the daily torrent of meetings from the early hours has always felt like balancing on a seesaw during a never-ending earthquake. Now, pair that with the onslaught of Slack inquiries and the incessant chiming of email notifications, and you have a front-row ticket to the...
Ever been in a situation where you landed a software engineering job with a particular tech stack, mastered it, switched to another company with a different stack, nailed that too, and then found yourself in a third company that used the original stack? Now, you suddenly sense that your hard-earned acumen in that initial stack has not only...
Adopting existing tools that work, applying them to the business problems at hand, and quickly iterating in the business domain rather than endlessly swirling in the vortex of technobabble is woefully underrated. I’ve worked at two kinds of companies before: One that only cares about measurable business outcomes, accruing technical debt and...
I like writing custom scripts to automate stuff or fix repetitive headaches. Most of them are shell scripts, and a few of them are written in Python. Over the years, I’ve accumulated quite a few of them. I use Git and GNU stow to manage them across different machines, and the workflow is quite effective. However, as the list of scripts grows...
There are a few ways you can add URLs to your markdown documents: Inline links [inline link](https://example.com) This will render as inline link. Reference links [reference link] Define the link destination elsewhere in the document like this: [reference link]: https://example.com This will render the same way as before, reference link. Footnote...
I’m one of those people who will sit in front of a computer for hours, fiddling with algorithms or debugging performance issues, yet won’t spend 10 minutes to improve their workflows. While I usually get away with this, every now and then, my inertia slithers back to bite me. The latest episode was me realizing how tedious it is to move config...
Every once in a while, I love browsing the Wayback Machine to catch a glimpse of the early internet. I enjoy the waves of nostalgic indie hacker vibes that wash over me as I type a URL into the search box and click to see an old snapshot of the site frozen in time. Being a kid of the early ’00s, I missed the spectacular cosmic genesis of the...
This site is built with Hugo and served via GitHub Pages. Recently, I decided to change the font here to make things more consistent across different devices. However, I didn’t want to go with Google Fonts for a few reasons: CDN is another dependency. Hosting static assets on GitHub Pages has served me well. Google Fonts tracks users and...
Suppose you have a function that takes an option struct and a message as input. Then it stylizes the message according to the option fields and prints it. What’s the most sensible API you can offer for users to configure your function? Observe: // app/src package src // Option struct type Style struct { Fg string // ANSI escape codes for...
I was curious to see if I could prototype a simple load balancer in a single Go script. Go’s standard library and goroutines make this trivial. Here’s what the script needs to do: Spin up two backend servers that’ll handle the incoming requests. Run a reverse proxy load balancer in the foreground. The load balancer will accept client connections...
I was cobbling together a long-running Go script to send webhook messages to a system when some events occur. The initial script would continuously poll a Kafka topic for events and spawn new goroutines to make HTTP requests to the destination. This had two problems: It could create unlimited goroutines if many events arrived quickly It might...
A TOTP-based 2FA system has two parts. One is a client that generates the TOTP code. The other is a server that verifies the code. If the client and the server-generated codes match, the server allows the inbound user to access the target system. The code usually expires after 30 seconds and then, you’ll have to regenerate it to be...
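Both sides of that handshake can be sketched with nothing but the standard library. This is a minimal take on the RFC 6238 algorithm with HMAC-SHA1 and a 30-second step. The `totp` and `verify` helper names are mine, and the secret is passed as raw bytes rather than the base32 string most authenticator apps use.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Generate a TOTP code for the given Unix time (defaults to now)."""
    now = time.time() if at is None else at
    counter = int(now // step)  # number of elapsed time steps
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)


def verify(secret: bytes, submitted: str, at: Optional[float] = None) -> bool:
    """Server side: regenerate the code and compare in constant time."""
    return hmac.compare_digest(totp(secret, at).encode(), submitted.encode())


shared_secret = b"12345678901234567890"  # demo secret, shared by both sides
client_code = totp(shared_secret)
print(client_code, verify(shared_secret, client_code))
```

A production verifier would usually also accept codes from an adjacent time step to tolerate clock drift. This sketch checks only the current one.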
I love Go’s implicit interfaces. While convenient, they can also introduce subtle bugs unless you’re careful. Types expected to conform to certain interfaces can fluidly add or remove methods. The compiler will only complain if an identifier anticipates an interface, but is passed a type that doesn’t implement that interface. This can be...
I enjoy writing about software: the things I learn, the tools I use, and the work I do. Owing to the constraints of the corporate software world, more often than not, you can’t showcase your work or talk about it. At least that’s how it always has been throughout my career. At the same time, as you grow older and start having a life outside of...
Before the release of version 1.21, you couldn’t set levels for your log messages in Go without either using third-party libraries or writing your own boilerplate. Coming from Python, I’ve always found this odd, considering that this capability has been in the Python standard library forever. However, it seems like the new log/slog subpackage in...
If you’re a manager, then there’s no shortage of information for you on how to conduct exit interviews. But there aren’t many resources that focus on how to handle them from an employee’s perspective. I’ve been meaning to write a quick piece that isn’t biased by anyone else’s experience and is short enough so that I can quickly jog my memory in...
The 100k context window of Claude 2 has been a huge boon for me since now I can paste a moderately complex problem to the chat window and ask questions about it. In that spirit, it recently refactored some pretty gnarly conditional logic for me in such an elegant manner that it absolutely blew me away. Now, I know how bitmasks work and am aware...
This morning, while browsing Hacker News, I came across a neat trick that allows you to share textual data by leveraging DNS TXT records. It can be useful for sharing a small amount of data in environments that restrict IP but allow DNS queries, or to bypass censorship. To test this out, I opened my domain registrar’s panel and created a new TXT...
Unless I’m hand rolling my own ORM-like feature or validation logic, I rarely need to write custom descriptors in Python. The built-in descriptor magics like @classmethod, @property, @staticmethod, and vanilla instance methods usually get the job done. However, every time I need to dig my teeth into descriptors, I reach for this fantastic how-to...
Python offers a ton of ways like os.system or os.spawn* to create new processes and run arbitrary commands in your system. However, the documentation usually encourages you to use the subprocess module for creating and managing child processes. The subprocess module exposes a high-level run() function that provides a simple interface for running...
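For reference, a minimal `subprocess.run()` call looks something like this. It spawns the current Python interpreter as the child process so the sketch doesn't depend on any particular system command being installed.

```python
import subprocess
import sys

# Run a child process, capture its output as text, and raise
# CalledProcessError if it exits with a non-zero status.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from the child')"],
    capture_output=True,
    text=True,
    check=True,
)

print(result.returncode)
print(result.stdout.strip())
```

Passing the command as a list, rather than a single string with `shell=True`, sidesteps shell quoting issues.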
The current title of this post is probably incorrect and may even be misleading. I had a hard time coming up with a suitable name for it. But the idea goes like this: sometimes you might find yourself in a situation where you need to iterate through a generator more than once. Sure, you can use an iterable like a tuple or list to allow multiple...
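One stdlib option for this is `itertools.tee`, which splits a single iterator into independent ones. It may or may not be the approach the post settles on. The `squares` generator here is just a stand-in.

```python
from itertools import tee


def squares(n):
    # A plain generator: it can be consumed only once.
    for i in range(n):
        yield i * i


# tee() hands back independent iterators over the same underlying stream.
first, second = tee(squares(4), 2)

first_pass = list(first)
second_pass = list(second)
print(first_pass, second_pass)
```

Keep in mind that `tee` buffers whatever one branch has consumed and the other hasn't, so when one branch is fully drained before the other starts, it saves no memory over a list.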
Around a year ago, I ditched my fancy Linux rig for a beefed-up 16" MacBook Pro and ever since, it’s been my primary machine for both personal and work stuff. I love how this machine strikes a decent balance between power and portability. However, I often joke that this chonky boy is just a pound shy of being an ENIAC. It’s a beast of a machine...
Over the years, I’ve used the template pattern across multiple OO languages with varying degrees of success. It was one of the first patterns I learned in the primordial hours of my software engineering career, and for some reason, it just feels like the natural way to tackle many real-world code-sharing problems. Yet, even before I jumped on...
One major drawback of Python’s huge ecosystem is the significant variances in workflows among people trying to accomplish different things. This holds true for dependency management as well. Depending on what you’re doing with Python—whether it’s building reusable libraries, writing web apps, or diving into data science and machine learning—your...
I was watching this amazing lightning talk by Karla Burnett and wanted to understand how traceroute works in Unix. Traceroute is a tool that shows the route of a network packet from your computer to another computer on the internet. It also tells you how long it takes for the packet to reach each stop along the way. It’s useful when you want to...
Recently, I purchased a domain for this blog and migrated the content from rednafi.github.io to rednafi.com. This turned out to be a much bigger hassle than I originally thought it’d be, mostly because, despite setting redirection for almost all the URLs from the previous domain to the new one and submitting the new sitemap.xml to the Search...
Cloudflare absolutely nailed the serverless function DX with Cloudflare Workers. However, I feel like it’s yet to receive widespread popularity like AWS Lambda since as of now, the service only offers a single runtime: JavaScript. But if you can look past that big folly, it’s a delightful piece of tech to work with. I’ve been building small tools...
This weekend, I was working on a fun project that required a fixed-time job scheduler to run a curl command at a future timestamp. I was aiming to find the simplest solution that could just get the job done. I’ve also been exploring Google Bard recently and wanted to see how it stacks up against other LLM tools like ChatGPT, BingChat, or...
I needed a way to sort a Django queryset based on a custom sequence of an attribute. Typically, Django allows sorting a queryset by any attribute on the model or related to it in either ascending or descending order. However, what if you need to sort the queryset following a custom sequence of attribute values? Suppose you’re working with a...
I recently gave my blog a fresh new look and decided it was time to spruce up my GitHub profile’s landing page as well. GitHub has a special way of treating the README.md file of your repo, displaying its content as the landing page for your profile. My goal was to showcase a brief introduction about myself and my work, along with a list of...
One of my favorite pastimes these days is to set BingChat to creative mode, ask it to teach me a trick about topic X, and then write a short blog post about it to reinforce my understanding. Some of the things it comes up with are absolutely delightful. In the spirit of that, I asked it to teach me a Shell trick that I can use to mimic maps or...
Whenever I need to deduplicate the items of an iterable in Python, my usual approach is to create a set from the iterable and then convert it back into a list or tuple. However, this approach doesn’t preserve the original order of the items, which can be a problem if you need to keep the order unscathed. Here’s a naive approach that works: from...
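One order-preserving option, which may differ from the snippet in the post, leans on the fact that dicts keep insertion order in Python 3.7+. The `dedupe` name is mine, and the items need to be hashable.

```python
def dedupe(items):
    # dict keys are unique and remember first-seen order, so this drops
    # later duplicates while keeping the original ordering.
    return list(dict.fromkeys(items))


print(dedupe([3, 1, 3, 2, 1, 5]))
print(dedupe("mississippi"))
```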
I needed to compare two large directories with thousands of similarly named PDF files and find the differing filenames between them. In the first pass, this is what I did: Listed out the content of the first directory and saved it in a file: ls dir1 > dir1.txt Did the same for the second directory: ls dir2 > dir2.txt Compared the difference...
Whenever I need to whip up a quick command line tool, my go-to is usually Python. Python’s CLI solutions tend to be more robust than their Shell counterparts. However, dealing with its portability can sometimes be a hassle, especially when all you want is to distribute a simple script. That’s why while toying around with argparse to create a...
When writing shell scripts, I’d often resort to using hardcoded ANSI escape codes to format text, such as: #!/usr/bin/env bash BOLD="\033[1m" UNBOLD="\033[22m" FG_RED="\033[31m" BG_YELLOW="\033[43m" BG_BLUE="\033[44m" RESET="\033[0m" # Print a message in bold red text on a yellow background. echo -e "${BOLD}${FG_RED}${BG_YELLOW}This is a warning...
Whenever I plan to build something, I spend 90% of my time researching and figuring out the idiosyncrasies of the tools that I decide to use for the project. LLM tools like ChatGPT have helped me immensely in that regard. I’m taking on more tangential side projects because they’re no longer as time-consuming as they used to be and provide me with...
In multi-page web applications, a common workflow is where a user: Loads a specific page or clicks on some button that triggers a long-running task. On the server side, a background worker picks up the task and starts processing it asynchronously. The page shouldn’t reload while the task is running. The backend then communicates the status of the...
I’ve always had a vague idea about what Unix domain sockets are from my experience working with Docker for the past couple of years. However, lately, I’m spending more time in embedded edge environments and had to explore Unix domain sockets in a bit more detail. This is a rough documentation of what I’ve explored to gain some insights. The dry...
While working on a multithreaded socket server in an embedded environment, I realized that the default behavior of Python’s socketserver.ThreadingTCPServer requires some extra work if you want to shut down the server gracefully in the presence of an interruption signal. The intended behavior here is that whenever any of SIGHUP, SIGINT, SIGTERM,...
I was working on a project where I needed to poll multiple data sources and consume the incoming data points in a single thread. In this particular case, the two data streams were coming from two different Redis lists. The correct way to consume them would be to write two separate consumers and spin them up as different processes. However, in...
Consider this iterable: it = (1, 2, 3, 0, 4, 5, 6, 7) Let’s say you want to build another iterable that includes only the numbers that appear starting from the element 0. Usually, I’d do this: # This returns (0, 4, 5, 6, 7). from_zero = tuple(elem for idx, elem in enumerate(it) if idx >= it.index(0)) While this is quite terse and does the job, it...
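One alternative worth sketching here is `itertools.dropwhile`, which skips items until a predicate first fails and then yields everything after that point. I'm assuming that's a reasonable fit for this problem, and it may not be exactly where the post lands.

```python
from itertools import dropwhile

it = (1, 2, 3, 0, 4, 5, 6, 7)

# Skip elements while they're non-zero, then keep the rest (0 included).
from_zero = tuple(dropwhile(lambda x: x != 0, it))
print(from_zero)  # (0, 4, 5, 6, 7)
```

Unlike the comprehension, this doesn't call `it.index(0)` for every element, and it works on any iterator, not just sequences.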
I needed to write a socket server in Python that would allow me to intermittently pause the server loop for a while, run something else, then get back to the previous request-handling phase; repeating this iteration until the heat death of the universe. Initially, I opted for the low-level socket module to write something quick and dirty....
Back in the days when I was working as a data analyst, I used to spend hours inside Jupyter notebooks exploring, wrangling, and plotting data to gain insights. However, as I shifted my career gear towards backend software development, my usage of interactive exploratory tools dwindled. Nowadays, I spend the majority of my time working on a fairly...
I was working with a table that had a similar (simplified) structure like this: | uuid | file_path | |----------------------------------|---------------------------| | b8658dfc3e80446c92f7303edf31dcbd | media/private/file_1.pdf | | 3d750874a9df47388569a23c559a4561 | media/private/file_2.csv | | d177b7f7d8b046768ab65857451a0354 |...
At my workplace, I was writing a script to download multiple files from different S3 buckets. The script relied on Django ORM, so I couldn’t use Python’s async paradigm to speed up the process. Instead, I opted for boto3 to download the files and concurrent.futures.ThreadPoolExecutor to spin up multiple threads and make the requests concurrently....
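The general shape of that setup can be sketched like this. To stay self-contained, it swaps the real S3 call for a stand-in function. `fake_download` and the bucket/key pairs are hypothetical, and no boto3 API is used.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_download(item):
    # Stand-in for an I/O-bound download; a real version would call
    # into the S3 client here.
    bucket, key = item
    return f"{bucket}/{key}"


items = [("bucket-a", "file_1.pdf"), ("bucket-b", "file_2.csv"), ("bucket-a", "file_3.txt")]

# Threads work well here because the work is I/O-bound; map() returns
# results in the same order as the inputs.
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(fake_download, items))

print(results)
```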
The colon : command is a shell builtin that represents a truthy value. It can be thought of as an alias for the built-in true command. You can test it by opening a shell and typing a colon on the command line, like this: : If you then inspect the exit code by running echo $? on the command line, you’ll see a 0 there, which is exactly what you’d...
Django has a Model.objects.bulk_update method that allows you to update multiple objects in a single pass. While this method is a great way to speed up the update process, oftentimes it’s not fast enough. Recently, at my workplace, I found myself writing a script to update half a million user records and it was taking quite a bit of time to...
I’ve just migrated from Ubuntu to macOS for work and am still in the process of setting up the machine. I’ve been a lifelong Linux user and this is the first time I’ve picked up an OS that’s not just another flavor of Debian. Primarily, I work with Python, NodeJS, and a tiny bit of Go. Previously, any time I had to install these language...
TIL that you can specify update_fields while saving a Django model to generate a leaner underlying SQL query. This yields better performance while updating multiple objects in a tight loop. To test that, I’m opening an IPython shell with python manage.py shell -i ipython command and creating a few user objects with the following lines: In [1]:...
At my workplace, while working on a Lambda function, I noticed that my Python logs weren’t appearing on the corresponding CloudWatch log dashboard. At first, I thought that the function wasn’t picking up the correct log level from the environment variables. We were using the serverless framework and GitLab CI to deploy the function, so my first...
Python makes it freakishly easy to load the whole content of any file into memory and process it afterward. This is one of the first things that’s taught to people who are new to the language. While the following snippet might be frowned upon by many, it’s definitely not uncommon: # src.py with open("foo.csv", "r") as f: # Load the whole content...
After reading Simon Willison’s amazing piece on how he adds new features to his open-source software, I wanted to adopt some of the good practices and incorporate them into my own workflow. One of the highlights of that post was how to kick off a feature work. The process roughly goes like this: Opening a new GitHub issue for the feature in the...
My grug brain can never remember the correct semantics of quoting commands and variables in a UNIX shell environment. Every time I work with a shell script or run some commands in a Docker compose file, I have to look up how to quote things properly to stop my ivory tower from crashing down. So, I thought I’d list out some of the most common rules...
TIL that returning a value from a function in bash doesn’t do what I thought it does. Whenever you call a function that’s returning some value, instead of giving you the value, Bash sets the return value of the callee as the status code of the calling command. Consider this example: #!/usr/bin/bash # script.sh return_42() { return 42 } # Call the...
While working with GitHub webhooks, I discovered a common pattern a webhook receiver can adopt to verify that the incoming webhooks are indeed arriving from GitHub, not from some miscreant trying to carry out a man-in-the-middle attack. After some amount of digging, I found that it’s quite a common practice that many other webhook services...
While going through the documentation of Python’s sqlite3 module, I noticed that it’s quite API-driven, where different parts of the module are explained in a prescriptive manner. I, however, learn better from examples, recipes, and narratives. Although a few good recipes already exist in the docs, I thought I’d also list some of the examples...
TIL from this video that Python’s urllib.parse.urlparse is quite slow at parsing URLs. I’ve always used urlparse to destructure URLs and didn’t know that there’s a faster alternative to this in the standard library. The official documentation also recommends the alternative function. The urlparse function splits a supplied URL into multiple...
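As a rough sketch of the difference, and assuming the alternative in question is urllib.parse.urlsplit (the function the standard library docs point to), here’s how the two results compare. The URL below is made up for illustration.

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL, used only for illustration.
url = "https://example.com/path;params?q=1#frag"

parsed = urlparse(url)  # 6 fields; splits ";params" off the last path segment
split = urlsplit(url)   # 5 fields; leaves ";params" inside the path

print(parsed.path, parsed.params)
print(split.path)
```

urlsplit skips the extra step of separating path parameters, which is one reason it tends to be the lighter call.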
Python has a random.choice routine in the standard library that allows you to pick a random value from an iterable. It works like this: # src.py import random # The seed ensures that you'll get the same random choice # every time you run the script. random.seed(90) # This builds a list: ["choice_0", "choice_1", ..., "choice_9"] lst =...
Over the years, I’ve used Python’s contextlib.ExitStack in a few interesting ways. The official documentation advertises it as a way to manage multiple context managers and has a couple of examples of how to leverage it. However, I couldn’t find examples in either the docs or GitHub code search of some of the maybe unusual ways I’ve used it...
While reading the second edition of Brian Okken’s pytest book, I came across this neat trick to compose multiple levels of fixtures. Suppose you want to create a fixture that returns some canned data from a database. Now, let’s say that invoking the fixture multiple times is expensive, and to avoid that you want to run it only once per test...
I was reading Ned Batchelder’s blog post “Why your mock doesn’t work” and it triggered an epiphany in me about a testing pattern that I’ve been using for a while without being aware that there might be an aphorism on the practice. Patch where the object is used; not where it’s defined. To understand it, consider the example below. Here, you have a...
I just found out that you can use Python’s unittest.mock.ANY to make assertions about certain arguments in a mock call, without caring about the other arguments. This can be handy if you want to test how a callable is called but only want to make assertions about some arguments. Consider the following example: # test_src.py import random import...
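A minimal sketch of the idea, independent of the example above. The send mock and its arguments are hypothetical names picked for illustration.

```python
from unittest.mock import ANY, Mock

# Hypothetical callable under test; it stands in for whatever you'd mock.
send = Mock()
send("user@example.com", subject="Hi", request_id=12345)

# Only the recipient and the subject matter here; request_id can be anything
# because ANY compares equal to every value.
send.assert_called_once_with("user@example.com", subject="Hi", request_id=ANY)
```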
Whenever your local branch diverges from the remote branch, you can’t directly pull from the remote branch and merge it into the local branch. This can happen when, for example: You check out a branch named alice from the main branch to work on a feature. When you’re done, you merge alice into main. After that, if you try to pull the main...
Whenever I need to apply some runtime constraints on a value while building an API, I usually compare the value to an expected range and raise a ValueError if it’s not within the range. For example, let’s define a function that throttles some fictitious operation. The throttle function limits the number of times an operation can be performed by...
Whether I’m trying out a new tool or just prototyping with a familiar stack, I usually create a new project on GitHub and run all the experiments there. Some examples of these are: rubric: linter config initializer for Python exert: declaratively apply converter functions to class attributes hook-slinger: generic service to send, retry, and...
A common bottleneck for processing large data files is memory. Downloading the file and loading the entire content is surely the easiest way to go. However, it’s likely that you’ll quickly hit OOM errors. Often, whenever I have to deal with large data files that need to be downloaded and processed, I prefer to stream the content line by line...
I’ve rarely been able to take advantage of Django’s bulk_create / bulk_update APIs in production applications, especially in the cases where I need to create or update multiple complex objects with a script. Often, these complex objects trigger a chain of signals or need non-trivial setups before any operations can be performed on each of...
I frequently have to write ad-hoc scripts that download a CSV file from S3, do some processing on it, and then create or update objects in the production database using the parsed information from the file. In Python, it’s trivial to download any file from S3 via boto3, and then the file can be read with the csv module from the standard...
I run git log --oneline to list out the commit logs all the time. It prints out a compact view of the git history. Running the command in this repo gives me this: d9fad76 Publish blog on safer operator.itemgetter, closes #130 0570997 Merge pull request #129 from rednafi/dependabot/... 6967f73 Bump actions/setup-python from 3 to 4 48c8634 Merge...
Python’s operator.itemgetter is quite versatile. It works on pretty much any iterables and map-like objects and allows you to fetch elements from them. The following snippet shows how you can use it to sort a list of tuples by the first element of the tuple: In [2]: from operator import itemgetter ...: ...: l = [(10, 9), (1, 3), (4, 8), (0, 55),...
Nested conditionals suck. They’re hard to write and even harder to read. I’ve rarely regretted the time I’ve spent optimizing for the flattest conditional structure in my code. The following piece mimics the actions of a traffic signal: // src.ts enum Signal { YELLOW = "Yellow", RED = "Red", GREEN = "Green", } function processSignal(signal:...
While working on a project with EdgeDB and FastAPI, I wanted to perform health checks against the FastAPI server in the GitHub CI. This would notify me about the working state of the application. The idea is to: Run the server in the background. Run the commands against the server that’ll denote that the app is in a working state. Perform...
At my workplace, we have a large Django monolith that powers the main website and works as the primary REST API server at the same time. We use Django Rest Framework (DRF) to build and serve the API endpoints. This means that whenever there’s an error, we have to return different formats of error responses, based on the incoming request header, to the...
Generators can help you decouple the production and consumption of iterables, making your code more readable and maintainable. I learned this trick a few years back from David Beazley’s slides on generators. Consider this example: # src.py from __future__ import annotations import time from typing import NoReturn def infinite_counter(start: int,...
In CPython, elements of a list are stored as pointers to the elements rather than the values of the elements themselves. This is evident from the struct that represents a list in C: // Fetched from CPython main branch. Removed comments for brevity. typedef struct { PyObject_VAR_HEAD PyObject **ob_item; /* Pointer reference to the element. */...
Up until now, I’ve always preferred Title Case to demarcate titles and section headers in my writings. However, lately I’ve realized that each time I start writing a sentence, I waste a few seconds deciding on the appropriate case of the special words like—technical terms, trademark names, proper nouns, etc—and how they’ll blend in with the...
I was working on a DRF POST API endpoint where the consumer is expected to add a URL containing a PDF file and the system would then download the file and save it to an S3 bucket. While this sounds quite straightforward, there’s one big issue. Before I started working on it, the core logic looked like this: # src.py from __future__ import...
While writing microservices in Python, I like to declaratively define the shape of the data coming in and out of JSON APIs or NoSQL databases in a separate module. Both TypedDict and dataclass are fantastic tools to communicate the shape of the data with the next person working on the codebase. Whenever I need to do some processing on the data...
This is the 4th time in a row that I’ve wasted time figuring out how to mock out a function during testing that calls the chained methods of a datetime.datetime object in the function body. So I thought I’d document it here. Consider this function: # src.py from __future__ import annotations import datetime def get_utcnow_isoformat() -> str:...
When I first started working with Python, nothing stumped me more than how bizarre Python’s import system seemed to be. Often, I wanted to run a module inside of a package with the python src/sub/module.py command, and it’d throw an ImportError that didn’t make any sense. Consider this package structure: src ├── __init__.py ├── a.py └── sub...
To avoid instantiating multiple DB connections in Python apps, a common approach is to initialize the connection objects in a module once and then import them everywhere. So, you’d do this: # src.py import boto3 # Pip install boto3 import redis # Pip install redis dynamo_client = boto3.client("dynamodb") redis_client = redis.Redis() However, this...
While working with microservices in Python, a common pattern that I see is the use of dynamically filled dictionaries as payloads of REST APIs or message queues. To understand what I mean by this, consider the following example: # src.py from __future__ import annotations import json from typing import Any import redis # Do a pip install. def...
While most of my pytest fixtures don’t react to the dynamically-passed values of function parameters, there have been situations where I’ve definitely felt the need for that. Consider this example: # test_src.py import pytest @pytest.fixture def create_file(tmp_path): """Fixture to create a file in the tmp_path/tmp directory.""" directory =...
If you try to mutate a sequence while traversing through it, Python usually doesn’t complain. For example: # src.py l = [3, 4, 56, 7, 10, 9, 6, 5] for i in l: if not i % 2 == 0: continue l.remove(i) print(l) The above snippet iterates through a list of numbers and modifies the list l in-place to remove any even number. However, running the script...
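Here’s a small sketch that reproduces the pitfall on a copy of the same list and contrasts it with one common workaround: building a new list instead of mutating in place. The variable names are mine.

```python
nums = [3, 4, 56, 7, 10, 9, 6, 5]

# Mutating while iterating: every removal shifts the remaining items left,
# so the iterator silently skips the element right after each removed one.
mutated = list(nums)
for n in mutated:
    if n % 2 == 0:
        mutated.remove(n)

# Building a new list sidesteps the problem entirely.
filtered = [n for n in nums if n % 2 != 0]

print(mutated)   # 56 survives because it was skipped
print(filtered)
```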
Five traits that almost all the GitHub Action workflows in my Python projects share are: If a new workflow is triggered while the previous one is running, the first one will get canceled. The CI is triggered every day at UTC 1. Tests and the lint-checkers are run on Ubuntu and macOS against multiple Python versions. Pip dependencies are cached....
PEP-673 introduces the Self type and it’s coming to Python 3.11. However, you can already use that now via the typing_extensions module. The Self type makes annotating methods that return the instances of the corresponding classes trivial. Before this, you’d have to do some mental gymnastics to statically type situations as follows: # src.py...
In Python, even though I adore writing tests in a functional manner via pytest, I still have a soft corner for the tools provided in the unittest.mock module. I like the fact it’s baked into the standard library and is quite flexible. Moreover, I’m yet to see another mock library in any other language or in the Python ecosystem that allows you to...
Static type checkers like Mypy follow your code flow and statically try to figure out the types of the variables without you having to explicitly annotate inline expressions. For example: # src.py from __future__ import annotations def check(x: int | float) -> str: if not isinstance(x, int): reveal_type(x) # Type is now 'float'. else:...
Technically, the type of None in Python is NoneType. However, you’ll rarely see types.NoneType being used in the wild as the community has pretty much adopted None to denote the type of the None singleton. This usage is also documented in PEP-484. Whenever a callable doesn’t return anything, you usually annotate it as follows: # src.py from...
While grokking the source code of the http.HTTPStatus enum, I came across this technique to add extra attributes to the values of enum members. Now, to understand what I mean by adding attributes, let’s consider the following example: # src.py from __future__ import annotations from enum import Enum class Color(str, Enum): RED = "Red" GREEN =...
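A hedged sketch of how such extra attributes can be attached by overriding __new__, which is the same general shape http.HTTPStatus uses. The hex_code attribute and its values are hypothetical and only serve the illustration.

```python
from enum import Enum


class Color(str, Enum):
    def __new__(cls, value: str, hex_code: str) -> "Color":
        # Build the underlying str, then pin the canonical enum value.
        obj = str.__new__(cls, value)
        obj._value_ = value
        # Extra attribute carried by each member (hypothetical).
        obj.hex_code = hex_code
        return obj

    RED = ("Red", "#ff0000")
    GREEN = ("Green", "#00ff00")


print(Color.RED.value, Color.RED.hex_code)
```

Lookups by value keep working because only the first tuple element becomes the member’s value.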
The functools.wraps decorator allows you to keep your function’s identity intact after it’s been wrapped by a decorator. Whenever a function is wrapped by a decorator, identity properties such as its name, docstring, and annotations get replaced by those of the wrapper function. Consider this example: from __future__ import annotations # In ...
I was working with a rate-limited API endpoint where I continuously needed to send short polling GET requests without hitting an HTTP 429 error. Perusing the API doc, I found out that the API endpoint only allows a maximum of 100 requests per second. So, my goal was to find a way to send the maximum number of requests without encountering the...
Whether you like it or not, the split world of sync and async functions in the Python ecosystem is something we’ll have to live with, at least for now. So, having to write things that work with both sync and async code is an inevitable part of the journey. Projects like Starlette and HTTPX can give you some clever pointers on how to craft APIs...
While grokking Black formatter’s codebase, I came across this interesting way of handling exceptions in Python. Exception handling in Python usually follows the EAFP paradigm where it’s easier to ask for forgiveness than permission. However, Rust has this recoverable error handling workflow that leverages generic Enums. I wanted to explore how...
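To make the difference concrete, here’s a minimal sketch with two throwaway decorators, one with and one without functools.wraps. All names here are made up for the illustration.

```python
from functools import wraps


def plain(func):
    # No wraps: the returned function keeps the wrapper's own metadata.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


def preserving(func):
    # wraps copies __name__, __doc__, etc. from func onto the wrapper.
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


@plain
def add(a, b):
    """Add two numbers."""
    return a + b


@preserving
def mul(a, b):
    """Multiply two numbers."""
    return a * b


print(add.__name__, add.__doc__)
print(mul.__name__, mul.__doc__)
```

As a bonus, wraps also sets __wrapped__ on the wrapper, so the original function stays reachable.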
I’ve always had a hard time explaining variance of generic types while working with type annotations in Python. This is an attempt to distill the things I’ve picked up on type variance while going through PEP-483. A pinch of type theory A generic type is a class or interface that is parameterized over types. Variance refers to how subtyping...
How’d you create a sub dictionary from a dictionary where the keys of the sub-dict are provided as a list? I was reading a tweet by Ned Batchelder on this today and that made me realize that I usually solve it with O(DK) complexity, where K is the length of the sub-dict keys and D is the length of the primary dict. Here’s how I usually do that...
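For reference, a sketch of what the two complexities can look like in practice. This is my own illustration with made-up data, not necessarily the code from the tweet or the post.

```python
d = {"a": 1, "b": 2, "c": 3, "d": 4}
keys = ["a", "c", "z"]  # "z" is deliberately missing from d

# O(D*K): walks the whole dict and does a linear list lookup for each key.
slow = {k: v for k, v in d.items() if k in keys}

# O(K): walks only the wanted keys and does a constant-time dict lookup.
fast = {k: d[k] for k in keys if k in d}

print(slow, fast)
```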
I was reading a tweet about it yesterday and that didn’t stop me from pushing a code change in production with the same rookie mistake today. Consider this function: # src.py from __future__ import annotations import logging import time from datetime import datetime def log( message: str, /, *, level: str, timestamp: str =...
I used to use Unittest’s self.assertTrue / self.assertFalse to check both literal booleans and truthy/falsy values in Unittest. Committed the same sin while writing tests in Django. I feel like assertTrue and assertFalse are misnomers. They don’t specifically check literal booleans, only truthy and falsy states respectively. Consider this...
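A short sketch of the distinction, assuming assertIs is what you reach for when you need a literal boolean. The test class name is hypothetical.

```python
import unittest


class TruthinessTest(unittest.TestCase):
    def test_truthy_values_pass(self):
        # assertTrue / assertFalse only check truthiness.
        self.assertTrue(1)
        self.assertTrue("non-empty")
        self.assertFalse([])

    def test_literal_booleans(self):
        # assertIs checks identity, so only the real True passes.
        self.assertIs(True, True)
        with self.assertRaises(AssertionError):
            self.assertIs(1, True)
```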
Accurately static typing decorators in Python is an icky business. The wrapper function obfuscates type information required to statically determine the types of the parameters and the return values of the wrapped function. Let’s write a decorator that registers the decorated functions in a global dictionary during function definition time....
How come I didn’t know about the python -m pydoc command before today! It lets you inspect the docstrings of any modules, classes, functions, or methods in Python. I’m running the commands from a Python 3.10 virtual environment but it’ll work on any Python version. Let’s print out the docstrings of the functools.lru_cache function. Run: python -m...
To check whether an integer is a power of two, I’ve deployed hacks like this: def is_power_of_two(x: int) -> bool: return x > 0 and hex(x)[-1] in ("0", "2", "4", "8") While this works1, I’ve never liked explaining the pattern matching hack that’s going on here. Today, I came across this tweet2 by Raymond Hettinger where he proposed an elegant...
Django Rest Framework exposes a neat hook to customize the response payload of your API when errors occur. I was going through Microsoft’s REST API guideline and wanted to make the error response of my APIs more uniform and somewhat similar to this. I’ll use a modified version of the quickstart example in the DRF docs to show how to achieve...
If you want to define a variable that can accept values of multiple possible types, using typing.Union is one way of doing that: from typing import Union U = Union[int, str] However, there’s another way you can express a similar concept via constrained TypeVar. You’d do so as follows: from typing import TypeVar T = TypeVar("T", int, str) So,...
Recently, I fell into this trap as I wanted to speed up a slow instance method by caching it. When you decorate an instance method with functools.lru_cache decorator, the instances of the class encapsulating that method never get garbage collected within the lifetime of the process holding them. Let’s consider this example: # src.py import...
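One way to observe this on CPython is with a weak reference, as in the sketch below. The Expensive class is hypothetical; the point is that the cache keeps a strong reference to self until it’s cleared.

```python
import gc
import weakref
from functools import lru_cache


class Expensive:
    @lru_cache(maxsize=None)
    def compute(self, x: int) -> int:
        return x * 2


obj = Expensive()
obj.compute(21)  # The cache key now holds a strong reference to obj.
ref = weakref.ref(obj)

del obj
gc.collect()
alive_after_del = ref() is not None  # Still alive: the cache pins it.

Expensive.compute.cache_clear()
gc.collect()
alive_after_clear = ref() is not None  # Released once the cache lets go.

print(alive_after_del, alive_after_clear)
```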
Problem A common interview question that I’ve seen goes as follows: Write a function to crop a text corpus without breaking any word. Take the length of the text up to which character you should trim. Make sure that the cropped text doesn’t have any trailing space. Try to maximize the number of words you can pack in your trimmed text. Your...
I was reading the source code of the reference implementation of “PEP-661: Sentinel Values” and discovered an optimization technique known as String interning. Modern programming languages like Java, Python, PHP, Ruby, Julia, etc, perform string interning to make their string operations more performant. String interning String interning makes...
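In Python you can also intern strings explicitly with sys.intern. A minimal sketch, with strings built at runtime so that the interpreter doesn’t fold them into a single constant on its own:

```python
import sys

# Built at runtime; equal in value, but not guaranteed to be one object.
a = "".join(["inter", "ning"])
b = "".join(["inter", "ning"])

# sys.intern returns the single canonical copy for a given string value.
ia = sys.intern(a)
ib = sys.intern(b)

print(a == b, ia is ib)
```

Once both names point at the canonical copy, an identity check is enough to compare them.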
I love using Go’s interface feature to declaratively define my public API structure. Consider this example: package main import ( "fmt" ) // Declare the interface. type Geometry interface { area() float64 perim() float64 } // Struct that represents a rectangle. type rect struct { width, height float64 } // Method to calculate the area of a...
While trying to avoid inheritance in an API that I was working on, I came across this neat trick to perform attribute delegation on composed classes. Let’s say there’s a class called Engine and you want to put an engine instance in a Car. In this case, the car has a classic ‘has a’ (inheritance usually refers to ‘is a’ relationships) relationship...
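A minimal sketch of delegation through __getattr__, which is one common way to do this and may differ from the exact trick in the post. The attributes on Engine and Car are made up.

```python
class Engine:
    def __init__(self, name: str) -> None:
        self.name = name

    def start(self) -> str:
        return f"{self.name} started"


class Car:
    wheels = 4

    def __init__(self, engine: Engine) -> None:
        self.engine = engine

    def __getattr__(self, attr: str):
        # Called only when normal lookup on Car fails, so Car's own
        # attributes always win; everything else goes to the engine.
        return getattr(self.engine, attr)


car = Car(Engine("v8"))
print(car.start(), car.wheels)
```

One caveat: if __getattr__ runs before self.engine is set (during unpickling, for instance), the lookup recurses, so real code may need a guard.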
I wanted to add a helper method to an Enum class. However, I didn’t want to make it a classmethod as a property made more sense in this particular case. Problem is, you aren’t supposed to instantiate an enum class, and property methods can only be accessed from the instances of a class, not from the class itself. While sifting through Django...
I was browsing through the source code of Tom Christie’s typesystem library and discovered that the shell scripts of the project don’t have any extensions attached to them. At first, I found it odd, and then it all started to make sense. Executable scripts can be written in any language and the users don’t need to care about that. Also, not...
At my workplace, we have a fairly large Celery config file where you’re expected to subclass from a base class and extend that if there’s a new domain. However, the subclass expects the configuration in a specific schema. So, having a way to enforce that schema in the subclasses and raising appropriate runtime exceptions is nice. I wrote a fancy...
Python’s daemon threads are cool. A Python script will stop when the main thread is done and only daemon threads are running. To test a simple hello function that runs indefinitely, you can do the following: # test_hello.py from __future__ import annotations import asyncio import threading from functools import partial from unittest.mock import...
Making tqdm play nice with multiprocessing requires some additional work. It’s not always obvious and I don’t want to add another third-party dependency just for this purpose. The following example attempts to make tqdm work with multiprocessing.imap_unordered. However, this should also work with similar mapping methods like—multiprocessing.map,...
One thing that came to me as news is that the which command, the de facto tool for finding the path of an executable, is not POSIX compliant. The recent Debian debacle around which brought it to my attention. The POSIX-compliant way of finding an executable program is command -v, which is usually built into most of the shells. So, instead of...
Writing consistent commit messages helps you to weave a coherent story with your git history. Recently, I’ve started paying attention to my commit messages. Before this, my commit messages in this repository used to look like this: git log --oneline -5 d058a23 (HEAD -> master) bash strict mode a62e59b Updating functool partials til. 532b21a Added...
Use unofficial bash strict mode while writing scripts. Bash has a few gotchas and this helps you avoid them. For example: #!/bin/bash set -euo pipefail echo "Hello" Where, -e Exit immediately if a command exits with a non-zero status. -u Treat unset variables as an error when substituting. -o pipefail The return value of a pipeline is the...
Pasting shell commands can be a pain when they include hidden return \n characters. In such a case, your shell will try to execute the command immediately. To prevent that, use curly braces { } while pasting the command. Your command should look like the following: { dig +short google.com } Here, the spaces after the braces are significant.
The constructor for functools.partial() detects nesting and automatically flattens itself to a more efficient form. For example: from functools import partial def f(*, a: int, b: int, c: int) -> None: print(f"Args are {a}-{b}-{c}") g = partial(partial(partial(f, a=1), b=2), c=3) # Three function calls are flattened into one; free efficiency....
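You can inspect the flattening through the func and keywords attributes of the resulting partial object. A small sketch against CPython’s implementation, where f returns a string instead of printing so that the result is easy to check:

```python
from functools import partial


def f(*, a: int, b: int, c: int) -> str:
    return f"{a}-{b}-{c}"


g = partial(partial(partial(f, a=1), b=2), c=3)

# The nested partials collapse into one: g wraps f directly and
# carries the merged keyword arguments.
print(g.func is f)
print(g.keywords)
print(g())
```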
Managing configurations in your Python applications isn’t something you think about very often, until complexity starts to seep in and forces you to re-architect your initial approach. Ideally, your config management flow shouldn’t change across different applications or as your application begins to grow in size and complexity. Even if you’re...
Imagine a custom set-like data structure that doesn’t perform hashing and trades performance for tighter memory footprint. Or imagine a dict-like data structure that automatically stores data in a PostgreSQL or Redis database the moment you initialize it; it also lets you get, set, and delete key-value pairs using the usual...
Updated on 2023-09-11: Fix broken URLs. In Python, metaclass is one of the few tools that enables you to inject metaprogramming capabilities into your code. The term metaprogramming refers to the potential for a program to manipulate itself in a self-referential manner. However, messing with metaclasses is often considered an arcane art that’s...
In Python, there’s a saying that “design patterns are anti-patterns”. Also, in the realm of dynamic languages, design patterns have the notoriety of injecting additional abstraction layers to the core logic and making the flow gratuitously obscure. Python’s dynamic nature and the treatment of functions as first-class objects often make Java-ish...
Updated on 2023-09-11: Fix broken URLs. Recently, I was working with MapBox’s Route optimization API. Basically, it tries to solve the traveling salesman problem where you provide the API with coordinates of multiple places and it returns a duration-optimized route between those locations. This is a perfect use case where Redis caching can...
Updated on 2022-02-13: Change functools import style. When I first learned about Python decorators, using them felt like doing voodoo magic. Decorators can give you the ability to add new functionalities to any callable without actually touching or changing the code inside it. This can typically yield better encapsulation and help you write...
Writing concurrent code in Python can be tricky. Before you even start, you have to worry about all this icky stuff, like whether the task at hand is I/O or CPU bound or whether putting the extra effort to achieve concurrency is even going to give you the boost you need. Also, the presence of the Global Interpreter Lock (GIL) foists further...
When I first encountered Python’s pathlib module for path manipulation, I brushed it aside assuming it to be just an OOP way of doing what os.path already does quite well. The official doc also dubs it as the Object-oriented filesystem paths. However, back in 2019 when a ticket confirmed that Django was replacing os.path with pathlib, I got...
Pre-commit hooks can be a neat way to run automated ad-hoc tasks before submitting a new git commit. These tasks may include linting, trimming trailing whitespaces, running code formatter before code reviews etc. Let’s see how multiple Python linters and formatters can be applied automatically before each commit to impose strict conformity on...
Updated on 2022-02-13: Change import style of functools.singledispatch. Recently, I was refactoring a portion of a Python function that somewhat looked like this: def process(data): if cond0 and cond1: # apply func01 on data that satisfies the cond0 & cond1 return func01(data) elif cond2 or cond3: # apply func23 on data that satisfies the cond2 &...
Python’s context managers are great for resource management and stopping the propagation of leaked abstractions. You’ve probably used them while opening a file or a database connection. Usually it starts with a with statement like this: with open("file.txt", "wt") as f: f.write("contents go here") In the above case, file.txt gets automatically...
Recently, my work needed me to create lots of custom data types and compare them with each other. So, my code was littered with many classes that somewhat looked like this: class CartesianPoint: def __init__(self, x, y, z): self.x = x self.y = y self.z = z def __repr__(self): return f"CartesianPoint(x = {self.x}, y = {self.y}, z = {self.z})"...
August 31 How to Be a Better Reader - NY Times To read more deeply, to do the kind of reading that stimulates your imagination, the single most important thing to do is take your time. You can’t read deeply if you’re skimming. As the writer Zadie Smith has said, “When you practice reading, and you work at a text, it can only give you what you put...
Anton Zhiyanov Brandon Rhodes Dan Luu Drew DeVault’s blog Fabien Sanglard’s website Harmful stuff Joel on software Julia Evans Preslav Rachev Simon Willison’s weblog
Self Ahoy, fellow daywalkers! I’m Redowan Delowar, also known as ‘rednafi’ on most platforms. Circa 2018, a glitch in the matrix slingshotted me from electrical engineering to data science, eventually catalyzing my osmosis into brick and mortar software work. I enjoy exploring system architecture, databases, data analysis, and API design. In my...