Evan Schwartz
https://emschwartz.me/rss/ (RSS)
If you're developing an application and find yourself running a benchmark whose results are measured in nanoseconds... you should probably stop and get back to more important tasks. But here we are. I'm using binary vector embeddings to build Scour, a service that scours noisy feeds for content related to your interests. Scour uses the Hamming...
I wrote another post about Understanding the BM25 full text search algorithm and had initially included comparisons with two other algorithms. However, that post was already quite long, so here are the brief comparisons between BM25, TF-IDF, and PostgreSQL's full text search. BM25 vs TF-IDF: TF-IDF was the main model that was used prior to the...
A delicious (and somewhat blasphemous) mashup of two very different traditional foods: Chicago Italian beef sandwiches and Chinese soup dumplings.
BM25, or Best Match 25, is a widely used algorithm for full text search. It is the default in Lucene/Elasticsearch and SQLite, among others. Recently, it has become common to combine full text search and vector similarity search into "hybrid search". I wanted to understand how full text search works, and specifically BM25, so here is my attempt...
Vector embeddings by themselves are pretty neat. Binary quantized vector embeddings are extra impressive. In short, they can retain 95+% retrieval accuracy with 32x compression 🤯.
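As a rough illustration of where the 32x figure can come from, here is a minimal sketch that assumes the original embeddings are stored as 32-bit floats and that quantization keeps only the sign of each dimension (one bit). The function names are hypothetical and are not taken from the post or from any library; comparing the packed vectors by Hamming distance is one common choice for binary embeddings.

```python
def binary_quantize(vec):
    """Pack one bit per dimension into an int: 1 if the value is positive, else 0.

    Assumption: sign-based quantization; other thresholds are possible.
    """
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions where two packed vectors differ."""
    return bin(a ^ b).count("1")


def compression_ratio(bits_per_float=32):
    """Each float (assumed 32 bits) is replaced by a single bit."""
    return bits_per_float // 1


# Two toy 4-dimensional embeddings (made-up values for illustration only).
a = binary_quantize([0.5, -0.1, 0.3, -0.9])
b = binary_quantize([0.4, -0.2, -0.3, 0.9])
distance = hamming_distance(a, b)
```

A smaller Hamming distance means the two binary vectors agree on more dimensions, which serves as a cheap proxy for similarity between the original embeddings.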
I am reading Mara Bos' Rust Atomics and Locks. On the first pass, I didn't really grok memory ordering. So here's my attempt at understanding by explaining.
Async Rust is powerful. And it can be a pain to work with (and learn). Async Rust can be a pleasure to work with, though, if we can do it without `Send + Sync + 'static`.
Originally published on the Fiberplane Blog. Prometheus and other Time Series Databases (TSDBs) don't work well when your data has too many different labels. However, there are certain small cases when adding additional labels is fine. This post goes through when adding labels does not increase cardinality. How Prometheus stores time series: To...
A deep dive on the PromQL queries generated by the Autometrics framework
Explaining some of the confusing inner-workings of PromQL
Supercharge your debugging by automatically producing metrics with exemplars
Autometrics now tracks your software's version and writes queries that correlate that with potential problems to help pinpoint when issues are introduced
Autometrics is an open source framework that makes it easy to track the most useful metrics and actually understand the data with automatically generated queries, alerts, and dashboards.