Simon Willison's Weblog

http://simonwillison.net/ (RSS)
visit blog
Voting opens for Oxford Word of the Year 2024
15 Nov 2024 | original ↗

Voting opens for Oxford Word of the Year 2024 One of the options is slop! slop (n.): Art, writing, or other content generated using artificial intelligence, shared and distributed online in an indiscriminate or intrusive way, and characterized as being of low quality, inauthentic, or inaccurate. Via @dloss Tags: slop, ethics,...

Recraft V3
15 Nov 2024 | original ↗

Recraft V3 Recraft are a generative AI design tool startup based out of London who released their v3 model a few weeks ago. It's currently sat at the top of the Artificial Analysis Image Arena Leaderboard, beating Midjourney and Flux 1.1 pro. The thing that impressed me is that it can generate both raster and vector graphics... and the vector...

OpenAI Public Bug Bounty
14 Nov 2024 | original ↗

OpenAI Public Bug Bounty Reading this investigation of the security boundaries of OpenAI's Code Interpreter environment helped me realize that the rules for OpenAI's public bug bounty inadvertently double as the missing details for a whole bunch of different aspects of their platform. This description of Code Interpreter is significantly more...

Quoting OpenAI, Google and Anthropic Are Struggling to Build More Advanced AI
14 Nov 2024 | original ↗

Anthropic declined to comment, but referred Bloomberg News to a five-hour podcast featuring Chief Executive Officer Dario Amodei that was released Monday. "People call them scaling laws. That's a misnomer," he said on the podcast. "They're not laws of the universe. They're empirical regularities. I am going to bet in favor of them continuing, but...

PyPI now supports digital attestations
14 Nov 2024 | original ↗

PyPI now supports digital attestations Dustin Ingram: PyPI package maintainers can now publish signed digital attestations when publishing, in order to further increase trust in the supply-chain security of their projects. Additionally, a new API is available for consumers and installers to verify published attestations. This has been in the work...

QuickTime video script to capture frames and bounding boxes
14 Nov 2024 | original ↗

QuickTime video script to capture frames and bounding boxes An update to an older TIL. I'm working on the write-up for my DjangoCon US talk on plugins and I found myself wanting to capture individual frames from the video in two formats: a full frame capture, and another that captured just the portion of the screen shared from my laptop. I have a...

Releasing the largest multilingual open pretraining dataset
14 Nov 2024 | original ↗

Releasing the largest multilingual open pretraining dataset Common Corpus is a new "open and permissible licensed text dataset, comprising over 2 trillion tokens (2,003,039,184,047 tokens)" released by French AI Lab PleIAs. This appears to be the largest available corpus of openly licensed training data: 926,541,096,243 tokens of public domain...

Quoting Steve Klabnik
13 Nov 2024 | original ↗

This tutorial exists because of a particular quirk of mine: I love to write tutorials about things as I learn them. This is the backstory of TRPL, of which an ancient draft was "Rust for Rubyists." You only get to look at a problem as a beginner once, and so I think writing this stuff down is interesting. It also helps me clarify what I'm...

Ollama: Llama 3.2 Vision
13 Nov 2024 | original ↗

Ollama: Llama 3.2 Vision Ollama released version 0.4 last week with support for Meta's first Llama vision model, Llama 3.2. If you have Ollama installed you can fetch the 11B model (7.9 GB) like this: ollama pull llama3.2-vision Or the larger 90B model (55GB) like this: ollama pull llama3.2-vision:90b I was delighted to learn that Sukhbinder...

django-plugin-django-debug-toolbar
13 Nov 2024 | original ↗

django-plugin-django-debug-toolbar Tom Viner built a plugin for my DJP Django plugin system that configures the excellent django-debug-toolbar debugging tool. You can see everything it sets up for you in this Python code: it configures installed apps, URL patterns and middleware and sets the INTERNAL_IPS and DEBUG settings. Here are Tom's running...

Ars Live: Our first encounter with manipulative AI
12 Nov 2024 | original ↗

Ars Live: Our first encounter with manipulative AI I'm participating in a live conversation with Benj Edwards on 19th November reminiscing over that incredible time back in February last year when Bing went feral. Via @benjedwards Tags: bing, generative-ai, arstechnica, ai, speaking, llms, benj-edwards

Qwen2.5-Coder-32B is an LLM that can code well that runs on my Mac
12 Nov 2024 | original ↗

There's a whole lot of buzz around the new Qwen2.5-Coder Series of open source (Apache 2.0 licensed) LLM releases from Alibaba's Qwen research team. On first impression it looks like the buzz is well deserved. Qwen claim: Qwen2.5-Coder-32B-Instruct has become the current SOTA open-source code model, matching the coding capabilities of GPT-4o....

How I ship projects at big tech companies
11 Nov 2024 | original ↗

How I ship projects at big tech companies This piece by Sean Goedecke on shipping features at larger tech companies is fantastic. Why do so many engineers think shipping is easy? I know it sounds extreme, but I think many engineers do not understand what shipping even is inside a large tech company. What does it mean to ship? It does not mean...

Binary vector embeddings are so cool
11 Nov 2024 | original ↗

Binary vector embeddings are so cool Evan Schwartz: Vector embeddings by themselves are pretty neat. Binary quantized vector embeddings are extra impressive. In short, they can retain 95+% retrieval accuracy with 32x compression and ~25x retrieval speedup. It's so unintuitive how well this trick works: take a vector of 1024x4 byte floating point...

Quoting Matt Webb
11 Nov 2024 | original ↗

That development time acceleration of 4 days down to 20 minutes… that’s equivalent to about 10 years of Moore’s Law cycles. That is, using generative AI like this is equivalent to computers getting 10 years better overnight. That was a real eye-opening framing for me. AI isn’t magical, it’s not sentient, it’s not the end of the world nor our...

Quoting Grant Slatton
11 Nov 2024 | original ↗

As a junior engineer, there's simply no substitute for getting the first 100K lines of code under your belt. The "start over each day" method will help get you to those 100K lines faster. You might think covering the same ground multiple times isn't as valuable as getting 100K diverse lines of code. I disagree. Solving the same problem repeatedly...

MDN Browser Support Timelines
11 Nov 2024 | original ↗

MDN Browser Support Timelines I complained on Hacker News today that I wished the MDN browser compatibility ables - like this one for the Web Locks API - included an indication as to when each browser was released rather than just the browser numbers. It turns out they do! If you click on each browser version in turn you can see an expanded area...

Everything I've learned so far about running local LLMs
10 Nov 2024 | original ↗

Everything I've learned so far about running local LLMs Chris Wellons shares detailed notes on his experience running local LLMs on Windows - though most of these tips apply to other operating systems as well. This is great, there's a ton of detail here and the root recommendations are very solid: Use llama-server from llama.cpp and try ~8B...

Visualizing local election results with Datasette, Observable and MapLibre GL
9 Nov 2024 | original ↗

Alex Garcia and myself hosted the first Datasette Open Office Hours on Friday - a live-streamed video session where we hacked on a project together and took questions and tips from community members on Discord. We didn't record this one (surprisingly not a feature that Discord offers) but we hope to do more of these and record them in the future....

Quoting fast.ai Discord Server
9 Nov 2024 | original ↗

This is a very friendly and supportive place where you are surrounded by peers - we all want to help each other succeed. The golden rule of this server is: Don't ever try to impress anyone here with your knowledge! Instead try to impress folks here with your desire to learn, and desire to help others learn. — fast.ai Discord Server Tags:...

uv 0.5.0
8 Nov 2024 | original ↗

uv 0.5.0 The first backwards-incompatible (in minor ways) release after 30 releases without a breaking change. I found out about this release this morning when I filed an issue about a fiddly usability problem I had encountered with the combo of uv and conda... and learned that the exact problem had been fixed in the brand new version! Tags:...

ChainForge
8 Nov 2024 | original ↗

ChainForge I'm still on the hunt for good options for running evaluations against prompts. ChainForge offers an interesting approach, calling itself "an open-source visual programming environment for prompt engineering". The interface is one of those boxes-and-lines visual programming tools, which reminds me of Yahoo Pipes. It's open source (from...

Datasette Public Office Hours, Friday Nov 8th at 2pm PT
7 Nov 2024 | original ↗

Datasette Public Office Hours, Friday Nov 8th at 2pm PT Tomorrow afternoon (Friday 8th November) at 2pm PT we'll be hosting the first Datasette Public Office Hours - a livestream video session on Discord where Alex Garcia and myself will live code on some Datasette projects and hang out to chat about the project. This is our first time trying...

Project: VERDAD - tracking misinformation in radio broadcasts using Gemini 1.5
7 Nov 2024 | original ↗

I'm starting a new interview series called Project. The idea is to interview people who are building interesting data projects and talk about what they've built, how they built it, and what they learned along the way. The first episode is a conversation with Rajiv Sinclair from Public Data Works about VERDAD, a brand new project in collaboration...

Quoting Jo Kristian Bergum
7 Nov 2024 | original ↗

If you have worked in search, you know how freaking hard even getting started with something close to this with traditional methods. Now, you can zero-shot it. System Instructions: As a query categorization expert, you try to break down the intent of a search query. First, provide your reasoning and then describe the intent using a single...

yet-another-applied-llm-benchmark
6 Nov 2024 | original ↗

yet-another-applied-llm-benchmark Nicholas Carlini introduced this personal LLM benchmark suite back in February as a collection of over 100 automated tests he runs against new LLM models to evaluate their performance against the kinds of tasks he uses them for. There are two defining features of this benchmark that make it interesting. Most...

Generating documentation from tests using files-to-prompt and LLM
5 Nov 2024 | original ↗

Generating documentation from tests using files-to-prompt and LLM I was experimenting with the wasmtime-py Python library today (for executing WebAssembly programs from inside CPython) and I found the existing API docs didn't quite show me what I wanted to know. The project has a comprehensive test suite so I tried seeing if I could generate...

Quoting NY Times Editorial Board
5 Nov 2024 | original ↗

You already know Donald Trump. He is unfit to lead. Watch him. Listen to those who know him best. He tried to subvert an election and remains a threat to democracy. He helped overturn Roe, with terrible consequences. Mr. Trump's corruption and lawlessness go beyond elections: It's his whole ethos. He lies without limit. If he's re-elected, the...

New OpenAI feature: Predicted Outputs
4 Nov 2024 | original ↗

New OpenAI feature: Predicted Outputs Interesting new ability of the OpenAI API - the first time I've seen this from any vendor. If you know your prompt is mostly going to return the same content - you're requesting an edit to some existing code, for example - you can now send that content as a "prediction" and have GPT-4o or GPT-4o mini use that...

Claude 3.5 Haiku
4 Nov 2024 | original ↗

Anthropic released Claude 3.5 Haiku today, a few days later than expected (they said it would be out by the end of October). I was expecting this to be a complete replacement for their existing Claude 3 Haiku model, in the same way that Claude 3.5 Sonnet eclipsed the existing Claude 3 Sonnet while maintaining the same pricing. Claude 3.5 Haiku is...

Nous Hermes 3
4 Nov 2024 | original ↗

Nous Hermes 3 The Nous Hermes family of fine-tuned models have a solid reputation. Their most recent release came out in August, based on Meta's Llama 3.1: Our training data aggressively encourages the model to follow the system and instruction prompts exactly and in an adaptive manner. Hermes 3 was created by fine-tuning Llama 3.1 8B, 70B and...

Quoting Tom MacWright
3 Nov 2024 | original ↗

Building technology in startups is all about having the right level of tech debt. If you have none, you’re probably going too slow and not prioritizing product-market fit and the important business stuff. If you get too much, everything grinds to a halt. Plus, tech debt is a “know it when you see it” kind of thing, and I know that my definition...

California Clock Change
3 Nov 2024 | original ↗

California Clock Change The clocks go back in California tonight and I finally built my dream application for helping me remember if I get an hour extra of sleep or not, using a Claude Artifact. Here's the transcript. This is one of my favorite examples yet of the kind of tiny low stakes utilities I'm building with Claude Artifacts because the...

Docling
3 Nov 2024 | original ↗

Docling MIT licensed document extraction Python library from the Deep Search team at IBM, who released Docling v2 on October 16th. Here's the Docling Technical Report paper from August, which provides details of two custom models: a layout analysis model for figuring out the structure of the document (sections, figures, text, tables etc) and a...

Claude Token Counter
2 Nov 2024 | original ↗

Claude Token Counter Anthropic released a token counting API for Claude a few days ago. I built this tool for running prompts, images and PDFs against that API to count the tokens in them. The API is free (albeit rate limited), but you'll still need to provide your own API key in order to use it. Here's the source code. I built this using two...

Please publish and share more
2 Nov 2024 | original ↗

Please publish and share more 💯 to all of this by Jeff Triplett: Friends, I encourage you to publish more, indirectly meaning you should write more and then share it. [...] You don’t have to change the world with every post. You might publish a quick thought or two that helps encourage someone else to try something new, listen to a new song, or...

SmolLM2
2 Nov 2024 | original ↗

SmolLM2 New from Loubna Ben Allal and her research team at Hugging Face: SmolLM2 is a family of compact language models available in three size: 135M, 360M, and 1.7B parameters. They are capable of solving a wide range of tasks while being lightweight enough to run on-device. [...] It was trained on 11 trillion tokens using a diverse dataset...

From Naptime to Big Sleep: Using Large Language Models To Catch Vulnerabilities In Real-World Code
1 Nov 2024 | original ↗

From Naptime to Big Sleep: Using Large Language Models To Catch Vulnerabilities In Real-World Code Google's Project Zero security team used a system based around Gemini 1.5 Pro to find a previously unreported security vulnerability in SQLite (a stack buffer underflow), in time for it to be fixed prior to making it into a release. A key insight...

Claude API: PDF support (beta)
1 Nov 2024 | original ↗

Claude API: PDF support (beta) Claude 3.5 Sonnet now accepts PDFs as attachments: The new Claude 3.5 Sonnet (claude-3-5-sonnet-20241022) model now supports PDF input and understands both text and visual content within documents. I just released llm-claude-3 0.7 with support for the new attachment type, so now you can do this: llm install...

Quoting Question for Department for Science, Innovation and Technology
1 Nov 2024 | original ↗

Lord Clement-Jones: To ask His Majesty's Government what assessment they have made of the cybersecurity risks posed by prompt injection attacks to the processing by generative artificial intelligence of material provided from outside government, and whether any such attacks have been detected thus far. Lord Vallance of Balham: Security is central...

Control your smart home devices with the Gemini mobile app on Android
1 Nov 2024 | original ↗

Control your smart home devices with the Gemini mobile app on Android Google are adding smart home integration to their Gemini chatbot - so far on Android only. Have they considered the risk of prompt injection? It looks like they have, at least a bit: Important: Home controls are for convenience only, not safety- or security-critical purposes....

Cerebras Coder
31 Oct 2024 | original ↗

Cerebras Coder Val Town founder Steve Krouse has been building demos on top of the Cerebras API that runs Llama3.1-70b at 2,000 tokens/second. Having a capable LLM with that kind of performance turns out to be really interesting. Cerebras Coder is a demo that implements Claude Artifact-style on-demand JavaScript apps, and having it run at that...

Australia/Lord_Howe is the weirdest timezone
31 Oct 2024 | original ↗

Australia/Lord_Howe is the weirdest timezone Lord Howe Island - part of Australia, population 382 - is unique in that the island's standard time zone is UTC+10:30 but is UTC+11 when daylight saving time applies. It's the only time zone where DST represents a 30 minute offset. Via lobste.rs Tags: timezones

Creating a LLM-as-a-Judge that drives business results
30 Oct 2024 | original ↗

Creating a LLM-as-a-Judge that drives business results Hamel Husain's sequel to Your AI product needs evals. This is packed with hard-won actionable advice. Hamel warns against using scores on a 1-5 scale, instead promoting an alternative he calls "Critique Shadowing". Find a domain expert (one is better than many, because you want to keep their...

docs.jina.ai - the Jina meta-prompt
30 Oct 2024 | original ↗

docs.jina.ai - the Jina meta-prompt From Jina AI on Twitter: curl docs.jina.ai - This is our Meta-Prompt. It allows LLMs to understand our Reader, Embeddings, Reranker, and Classifier APIs for improved codegen. Using the meta-prompt is straightforward. Just copy the prompt into your preferred LLM interface like ChatGPT, Claude, or whatever works...

W̶e̶e̶k̶n̶o̶t̶e̶s̶ Monthnotes for October
30 Oct 2024 | original ↗

I try to publish weeknotes at least once every two weeks. It's been four since the last entry, so I guess this one counts as monthnotes instead. In my defense, the reason I've fallen behind on weeknotes is that I've been publishing a lot of long-form blog entries this month. Plentiful LLM vendor news A lot of LLM stuff happened. OpenAI had their...

Bringing developer choice to Copilot with Anthropic’s Claude 3.5 Sonnet, Google’s Gemini 1.5 Pro, and OpenAI’s o1-preview
30 Oct 2024 | original ↗

Bringing developer choice to Copilot with Anthropic’s Claude 3.5 Sonnet, Google’s Gemini 1.5 Pro, and OpenAI’s o1-preview The big announcement from GitHub Universe: Copilot is growing support for alternative models. GitHub Copilot predated the release of ChatGPT by more than year, and was the first widely used LLM-powered tool. This announcement...

Generating Descriptive Weather Reports with LLMs
29 Oct 2024 | original ↗

Generating Descriptive Weather Reports with LLMs Drew Breunig produces the first example I've seen in the wild of the new LLM attachments Python API. Drew's Downtown San Francisco Weather Vibes project combines output from a JSON weather API with the latest image from a webcam pointed at downtown San Francisco to produce a weather report "with a...

You can now run prompts against images, audio and video in your terminal using LLM
29 Oct 2024 | original ↗

I released LLM 0.17 last night, the latest version of my combined CLI tool and Python library for interacting with hundreds of different Large Language Models such as GPT-4o, Llama, Claude and Gemini. The signature feature of 0.17 is that LLM can now be used to prompt multi-modal models - which means you can now use it to send images, audio and...

Matt Webb's Colophon
29 Oct 2024 | original ↗

Matt Webb's Colophon I love a good colophon (here's mine, I should really expand it). Matt Webb has been publishing his thoughts online for 24 years, so his colophon is a delightful accumulation of ideas and principles. So following the principles of web longevity, what matters is the data, i.e. the posts, and simplicity. I want to minimise...

Quoting Panda Smith
28 Oct 2024 | original ↗

If you want to make a good RAG tool that uses your documentation, you should start by making a search engine over those documents that would be good enough for a human to use themselves. — Panda Smith Tags: search, ai, rag, llms

Hugging Face Hub: Configure progress bars
28 Oct 2024 | original ↗

Hugging Face Hub: Configure progress bars This has been driving me a little bit spare. Every time I try and build anything against a library that uses huggingface_hub somewhere under the hood to access models (most recently trying out MLX-VLM) I inevitably get output like this every single time I execute the model: Fetching 11 files:...

python-imgcat
28 Oct 2024 | original ↗

python-imgcat I was investigating options for displaying images in a terminal window (for multi-modal logging output of LLM) and I found this neat Python library for displaying images using iTerm 2. It includes a CLI tool, which means you can run it without installation using uvx like this: uvx imgcat filename.png Via rich/discussions ...

Prompt GPT-4o audio
28 Oct 2024 | original ↗

Prompt GPT-4o audio A week and a half ago I built a tool for experimenting with OpenAI's new audio input. I just put together the other side of that, for experimenting with audio output. Once you've provided an API key (which is saved in localStorage) you can use this to prompt the gpt-4o-audio-preview model with a system and regular prompt and...

llm-whisper-api
27 Oct 2024 | original ↗

llm-whisper-api I wanted to run an experiment through the OpenAI Whisper API this morning so I knocked up a very quick plugin for LLM that provides the following interface: llm install llm-whisper-api llm whisper-api myfile.mp3 It uses the API key that you previously configured using the llm keys set openai command. If you haven't configured one...

Run a prompt to generate and execute jq programs using llm-jq
27 Oct 2024 | original ↗

llm-jq is a brand new plugin for LLM which lets you pipe JSON directly into the llm jq command along with a human-language description of how you'd like to manipulate that JSON and have a jq program generated and executed for you on the fly. Thomas Ptacek on Twitter: The JQ CLI should just BE a ChatGPT client, so there's no pretense of actually...

Quoting Molly White
26 Oct 2024 | original ↗

As an independent writer and publisher, I am the legal team. I am the fact-checking department. I am the editorial staff. I am the one responsible for triple-checking every single statement I make in the type of original reporting that I know carries a serious risk of baseless but ruinously expensive litigation regularly used to silence...

Mastodon discussion about sandboxing SVG data
26 Oct 2024 | original ↗

Mastodon discussion about sandboxing SVG data I asked this on Mastodon and got some really useful replies: How hard is it to process untrusted SVG data to strip out any potentially harmful tags or attributes (like stuff that might execute JavaScript)? The winner for me turned out to be the humble tag. SVG images that are rendered in an image...

LLM Pictionary
26 Oct 2024 | original ↗

LLM Pictionary Inspired by my SVG pelicans on a bicycle, Paul Calcraft built this brilliant system where different vision LLMs can play Pictionary with each other, taking it in turns to progressively draw SVGs while the other models see if they can guess what the image represents. Tags: vision-llms, svg, generative-ai, ai,...

ChatGPT advanced voice mode can attempt Spanish with a Russian accent
26 Oct 2024 | original ↗

ChatGPT advanced voice mode can attempt Spanish with a Russian accent ChatGPT advanced voice mode may refuse to sing (unless you jailbreak it) but it's quite happy to attempt different accents. I've been having a lot of fun with that: I need you to pretend to be a California brown pelican with a very thick Russian accent, but you talk to me...

Pelicans on a bicycle
25 Oct 2024 | original ↗

Pelicans on a bicycle I decided to roll out my own LLM benchmark: how well can different models render an SVG of a pelican riding a bicycle? I chose that because a) I like pelicans and b) I'm pretty sure there aren't any pelican on a bicycle SVG files floating around (yet) that might have already been sucked into the training data. My prompt:...

llm-cerebras
25 Oct 2024 | original ↗

llm-cerebras Cerebras (previously) provides Llama LLMs hosted on custom hardware at ferociously high speeds. GitHub user irthomasthomas built an LLM plugin that works against their API - which is currently free, albeit with a rate limit of 30 requests per minute for their two models. llm install llm-cerebras llm keys set cerebras # paste key here...

ZombAIs: From Prompt Injection to C2 with Claude Computer Use
25 Oct 2024 | original ↗

ZombAIs: From Prompt Injection to C2 with Claude Computer Use In news that should surprise nobody who has been paying attention, Johann Rehberger has demonstrated a prompt injection attack against the new Claude Computer Use demo - the system where you grant Claude the ability to semi-autonomously operate a desktop computer. Johann's attack is...

Introducing the analysis tool in Claude.ai
24 Oct 2024 | original ↗

Introducing the analysis tool in Claude.ai The Claude.ai consumer-facing interface just shipped a major new feature, which they're calling "the analysis tool". It's their answer to OpenAI's ChatGPT Code Interpreter mode: Claude can now chose to solve models by writing some code, executing that code and then continuing the conversation using the...

Quoting Matt Webb
24 Oct 2024 | original ↗

Grandma’s secret cake recipe, passed down generation to generation, could be literally passed down: a flat slab of beige ooze kept in a battered pan, DNA-spliced and perfected by guided evolution by her own deft and ancient hands, a roiling wet mass of engineered microbes that slowly scabs over with delicious sponge cake, a delectable crust to be...

Using uv to develop Python command-line applications
24 Oct 2024 | original ↗

Using uv to develop Python command-line applications I've been increasingly using uv to try out new software (via uvx) and experiment with new ideas, but I hadn't quite figured out the right way to use it for developing my own projects. It turns out I was missing a few things - in particular the fact that there's no need to use uv pip at all when...

Julia Evans: TIL
24 Oct 2024 | original ↗

Julia Evans: TIL I've always loved how Julia Evans emphasizes the joy of learning and how you should celebrate every new thing you learn and never be ashamed to admit that you haven't figured something out yet. That attitude was part of my inspiration when I started writing TILs a few years ago. Julia just started publishing TILs too, and I'm...

Quoting Alex Albert
23 Oct 2024 | original ↗

Go to data.gov, find an interesting recent dataset, and download it. Install sklearn with bash tool write a .py file to split the data into train and test and make a classifier for it. (you may need to inspect the data and/or iterate if this goes poorly at first, but don't get discouraged!). Come up with some way to visualize the results of your...

Running prompts against images and PDFs with Google Gemini
23 Oct 2024 | original ↗

Running prompts against images and PDFs with Google Gemini New TIL. I've been experimenting with the Google Gemini APIs for running prompts against images and PDFs (in preparation for finally adding multi-modal support to LLM) - here are my notes on how to send images or PDF files to their API using curl and the base64 -i macOS command. I figured...

Using Rust in non-Rust servers to improve performance
23 Oct 2024 | original ↗

Using Rust in non-Rust servers to improve performance Deep dive into different strategies for optimizing part of a web server application - in this case written in Node.js, but the same strategies should work for Python as well - by integrating with Rust in different ways. The example app renders QR codes, initially using the pure JavaScript...

Quoting Model Card Addendum: Claude 3.5 Haiku and Upgraded Sonnet
23 Oct 2024 | original ↗

We enhanced the ability of the upgraded Claude 3.5 Sonnet and Claude 3.5 Haiku to recognize and resist prompt injection attempts. Prompt injection is an attack where a malicious user feeds instructions to a model that attempt to change its originally intended behavior. Both models are now better able to recognize adversarial prompts from a user...

Claude Artifact Runner
23 Oct 2024 | original ↗

Claude Artifact Runner One of my least favourite things about Claude Artifacts is the way it defaults to writing code in React in a way that's difficult to reuse outside of Artifacts. I start most of my prompts with "no react" so that it will kick out regular HTML and JavaScript instead, which I can then copy out into my tools.simonwillison.net...

Quoting Deirdre Bosa
23 Oct 2024 | original ↗

According to a document that I viewed, Anthropic is telling investors that it is expecting a billion dollars in revenue this year. Third-party API is expected to make up the majority of sales, 60% to 75% of the total. That refers to the interfaces that allow external developers or third parties like Amazon's AWS to build and scale their own AI...

Quoting Mike Isaac and Erin Griffith
23 Oct 2024 | original ↗

OpenAI’s monthly revenue hit $300 million in August, up 1,700 percent since the beginning of 2023, and the company expects about $3.7 billion in annual sales this year, according to financial documents reviewed by The New York Times. [...] The company expects ChatGPT to bring in $2.7 billion in revenue this year, up from $700 million in 2023,...

Wayback Machine: Models - Anthropic (8th October 20240
22 Oct 2024 | original ↗

Wayback Machine: Models - Anthropic (8th October 20240 The Internet Archive is only intermittently available at the moment, but the Wayback Machine just came back long enough for me to confirm that the Anthropic Models documentation page listed Claude 3.5 Opus as coming “Later this year” at least as recently as the 8th of October, but today makes...

Quoting Anthropic
22 Oct 2024 | original ↗

For the same cost and similar speed to Claude 3 Haiku, Claude 3.5 Haiku improves across every skill set and surpasses even Claude 3 Opus, the largest model in our previous generation, on many intelligence benchmarks. Claude 3.5 Haiku is particularly strong on coding tasks. For example, it scores 40.6% on SWE-bench Verified, outperforming many...

Initial explorations of Anthropic's new Computer Use capability
22 Oct 2024 | original ↗

Two big announcements from Anthropic today: a new Claude 3.5 Sonnet model and a new API mode that they are calling computer use. (They also pre-announced Haiku 3.5, but that's not available yet so I'm ignoring it until I can try it out myself.) Computer use is really interesting. Here's what I've figured out about it so far. You provide the...

Apple's Knowledge Navigator concept video (1987)
22 Oct 2024 | original ↗

Apple's Knowledge Navigator concept video (1987) I learned about this video today while engaged in my irresistible bad habit of arguing about whether or not "agents" means anything useful. It turns out CEO John Sculley's Apple in 1987 promoted a concept called Knowledge Navigator (incorporating input from Alan Kay) which imagined a future where...

This prompt can make an AI chatbot identify and extract personal details from your chats
22 Oct 2024 | original ↗

This prompt can make an AI chatbot identify and extract personal details from your chats Matt Burgess in Wired magazine writes about a new prompt injection / Markdown exfiltration variant called Imprompter, described in the new paper Imprompter: Tricking LLM Agents into Improper Tool Use. The paper describes an exfiltration attack against...

sudoku-in-python-packaging
21 Oct 2024 | original ↗

sudoku-in-python-packaging Absurdly clever hack by konsti: solve a Sudoku puzzle entirely using the Python package resolver! First convert the puzzle into a requirements.in file representing the current state of the board: git clone https://github.com/konstin/sudoku-in-python-packaging cd sudoku-in-python-packaging echo...

Quoting Arvind Narayanan
21 Oct 2024 | original ↗

I've often been building single-use apps with Claude Artifacts when I'm helping my children learn. For example here's one on visualizing fractions. [...] What's more surprising is that it is far easier to create an app on-demand than searching for an app in the app store that will do what I'm looking for. Searching for kids' learning apps is...

Everything I built with Claude Artifacts this week
21 Oct 2024 | original ↗

I'm a huge fan of Claude's Artifacts feature, which lets you prompt Claude to create an interactive Single Page App (using HTML, CSS and JavaScript) and then view the result directly in the Claude interface, iterating on it further with the bot and then, if you like, copying out the resulting code. I was digging around in my Claude activity...

Dashboard: Tools
21 Oct 2024 | original ↗

Dashboard: Tools I used Django SQL Dashboard to spin up a dashboard that shows all of the URLs to my tools.simonwillison.net site that I've shared on my blog so far. It uses this (Claude assisted) regular expression in a PostgreSQL SQL query: select distinct on (tool_url) unnest(regexp_matches( body, ...

Knowledge Worker
20 Oct 2024 | original ↗

Knowledge Worker Forrest Brazeal: Last month, I performed a 30-minute show called "Knowledge Worker" for the incredible audience at Gene Kim's ETLS in Las Vegas. The show included 7 songs about the past, present, and future of "knowledge work" - or, more specifically, how it's affecting us, the humans between keyboard and chair. I poured...

Quoting John Gruber
20 Oct 2024 | original ↗

I really dislike the practice of replacing passwords with email “magic links”. Autofilling a password from my keychain happens instantly; getting a magic link from email can take minutes sometimes, and even in the fastest case, it’s nowhere near instantaneous. Replacing something very fast — password autofill — with something slower is just a...

The 3 AI Use Cases: Gods, Interns, and Cogs
20 Oct 2024 | original ↗

The 3 AI Use Cases: Gods, Interns, and Cogs Drew Breunig introduces an interesting new framework for categorizing use cases of modern AI: Gods refers to the autonomous, human replacement applications - I see that as AGI stuff that's still effectively science fiction. Interns are supervised copilots. This is how I get most of the value out of LLMs...

Quoting Jens Ohlig
20 Oct 2024 | original ↗

Who called it “intellectual property problems around the acquisition of training data for Large Language Models” and not Grand Theft Autocomplete? — Jens Ohlig, on March 8th 2024 Tags: training-data, llms, ai, generative-ai

Quoting Jacob Kaplan-Moss
20 Oct 2024 | original ↗

It feels like we’re at a bit of an inflection point for the Django community. [...] One of the places someone could have the most impact is by serving on the DSF Board. Like the community at large, the DSF is at a transition point: we’re outgrowing the “small nonprofit” status, and have the opportunity to really expand our ambition and reach. In...

You can use text-wrap: balance; on icons
20 Oct 2024 | original ↗

You can use text-wrap: balance; on icons Neat CSS experiment from Terence Eden: the new text-wrap: balance CSS property is intended to help make text like headlines display without ugly wrapped single orphan words, but Terence points out it can be used for icons too: This inspired me to investigate if the same technique could work for text based...

mistral.rs
19 Oct 2024 | original ↗

mistral.rs Here's an LLM inference library written in Rust. It's not just for that one family of models - like how llama.cpp has grown beyond Llama, mistral.rs has grown beyond Mistral. This is the first time I've been able to run the Llama 3.2 vision model on my own Mac M2 laptop: git clone https://github.com/EricLBuehler/mistral.rs.git cd...

Experimenting with audio input and output for the OpenAI Chat Completion API
18 Oct 2024 | original ↗

OpenAI promised this at DevDay a few weeks ago and now it's here: their Chat Completion API can now accept audio as input and return it as output. OpenAI still recommend their WebSocket-based Realtime API for audio tasks, but the Chat Completion API is a whole lot easier to write code against. Generating audio Audio input via a Bash script ...

Quoting D. Richard Hipp
18 Oct 2024 | original ↗

I'm of the opinion that you should never use mmap, because if you get an I/O error of some kind, the OS raises a signal, which SQLite is unable to catch, and so the process dies. When you are not using mmap, SQLite gets back an error code from an I/O error and is able to take remedial action, or at least compose an error message. — D. Richard...

Using static websites for tiny archives
17 Oct 2024 | original ↗

Using static websites for tiny archives Alex Chan: Over the last year or so, I’ve been creating static websites to browse my local archives. I’ve done this for a variety of collections, including: paperwork I’ve scanned documents I’ve created screenshots I’ve taken web pages I’ve bookmarked video and audio files I’ve saved This is such a neat...

New in NotebookLM: Customizing your Audio Overviews
17 Oct 2024 | original ↗

New in NotebookLM: Customizing your Audio Overviews The most requested feature for Google's NotebookLM "audio overviews" (aka automatically generated podcast conversations) has been the ability to provide direction to those artificial podcast hosts - setting their expertise level or asking them to focus on specific topics. Today's update adds...

Video scraping: extracting JSON data from a 35 second screen capture for less than 1/10th of a cent
17 Oct 2024 | original ↗

The other day I found myself needing to add up some numeric values that were scattered across twelve different emails. I didn't particularly feel like copying and pasting all of the numbers out one at a time, so I decided to try something different: could I record a screen capture while browsing around my Gmail account and then extract the...

Gemini API Additional Terms of Service
17 Oct 2024 | original ↗

Gemini API Additional Terms of Service I've been trying to figure out what Google's policy is on using data submitted to their Google Gemini LLM for further training. It turns out it's clearly spelled out in their terms of service, but it differs for the paid v.s. free tiers. The paid APIs do not train on your inputs: When you're using Paid...

files-to-prompt 0.4
16 Oct 2024 | original ↗

files-to-prompt 0.4 New release of my files-to-prompt tool adding an option for filtering just for files with a specific extension. The following command will output Claude XML-style markup for all Python and Markdown files in the current directory, and copy that to the macOS clipboard ready to be pasted into an LLM: files-to-prompt . -e py -e md...

2025 DSF Board Nominations
16 Oct 2024 | original ↗

2025 DSF Board Nominations The Django Software Foundation board elections are coming up. There are four positions open, seven directors total. Terms last two years, and the deadline for submitting a nomination is October 25th (the date of the election has not yet been decided). Several community members have shared "DSF initiatives I'd like to...

Supercharge the One Person Framework with SQLite: Rails World 2024
16 Oct 2024 | original ↗

Supercharge the One Person Framework with SQLite: Rails World 2024 Stephen Margheim shares an annotated transcript of the YouTube video of his recent talk at this year's Rails World conference in Toronto. The Rails community is leaning hard into SQLite right now. Stephen's talk is some of the most effective evangelism I've seen anywhere for...

Supercharge the One Person Framework with SQLite: Rails World 2024
16 Oct 2024 | original ↗

Supercharge the One Person Framework with SQLite: Rails World 2024 Stephen Margheim shares an annotated transcript of the YouTube video of his recent talk at this year's Rails World conference in Toronto. The Rails community is leaning hard into SQLite right now. Stephen's talk is some of the most effective evangelism I've seen anywhere for...

[red-knot] type inference/checking test framework
16 Oct 2024 | original ↗

[red-knot] type inference/checking test framework Ruff maintainer Carl Meyer recently landed an interesting new design for a testing framework. It's based on Markdown, and could be described as a form of "literate testing" - the testing equivalent of Donald Knuth's literate programming. A markdown test file is a suite of tests, each test can...

Un Ministral, des Ministraux
16 Oct 2024 | original ↗

Un Ministral, des Ministraux Two new models from Mistral: Ministral 3B and Ministral 8B (joining Mixtral, Pixtral, Codestral and Mathstral as weird naming variants on the Mistral theme. These models set a new frontier in knowledge, commonsense, reasoning, function-calling, and efficiency in the sub-10B category, and can be used or tuned to a...

Quoting François Chollet
16 Oct 2024 | original ↗

A common misconception about Transformers is to believe that they're a sequence-processing architecture. They're not. They're a set-processing architecture. Transformers are 100% order-agnostic (which was the big innovation compared to RNNs, back in late 2016 -- you compute the full matrix of pairwise token interactions instead of processing one...

The XOXO 2024 Talks
15 Oct 2024 | original ↗

The XOXO 2024 Talks I missed attending the last XOXO in person, but I've been catching up on the videos of the talks over the past few days and they have been absolutely worth spending time with. This year was a single day with ten speakers. Andy Baio explains the intended formula: I usually explain that the conference is about, more than...

Quoting David Heinemeier Hansson
15 Oct 2024 | original ↗

The problem with passkeys is that they're essentially a halfway house to a password manager, but tied to a specific platform in ways that aren't obvious to a user at all, and liable to easily leave them unable to access of their accounts. [...] Chrome on Windows stores your passkeys in Windows Hello, so if you sign up for a service on Windows,...

PATH tips on wizard zines
15 Oct 2024 | original ↗

PATH tips on wizard zines New Julia Evans comic, from which I learned that the which -a X command shows you all of the versions of that command that are available in the directories on your current PATH. This is so useful! I used it to explore my currently available Python versions: $ which -a python ...

ChatGPT will happily write you a thinly disguised horoscope
15 Oct 2024 | original ↗

There's a meme floating around at the moment where you ask ChatGPT the following, and it appears to offer deep insight into your personality: From all of our interactions what is one thing that you can tell me about myself that I may not know about myself Don't be fooled into thinking there's anything deep going on here. It's effectively acting...

My Jina Reader tool
14 Oct 2024 | original ↗

My Jina Reader tool I wanted to feed the Cloudflare Durable Objects SQLite documentation into Claude, but I was on my iPhone so copying and pasting was inconvenient. Jina offer a Reader API which can turn any URL into LLM-friendly Markdown and it turns out it supports CORS, so I got Claude to build me this tool (source code). Paste in a URL to...

Grant Negotiation and Authorization Protocol (GNAP)
14 Oct 2024 | original ↗

Grant Negotiation and Authorization Protocol (GNAP) RFC 9635 was published a few days ago. GNAP is effectively OAuth 3 - it's a newly standardized design for a protocol for delegating authorization so an application can access data on your behalf. The most interesting difference between GNAP and OAuth 2 is that GNAP no longer requires clients to...

I Was A Teenage Foot Clan Ninja
14 Oct 2024 | original ↗

I Was A Teenage Foot Clan Ninja My name is Danny Pennington, I am 48 years old, and between 1988 in 1995 I was a ninja in the Foot Clan. I enjoyed this TMNT parody a lot. Tags: youtube

Zero-latency SQLite storage in every Durable Object
13 Oct 2024 | original ↗

Zero-latency SQLite storage in every Durable Object Kenton Varda introduces the next iteration of Cloudflare's Durable Object platform, which recently upgraded from a key/value store to a full relational system based on SQLite. For useful background on the first version of Durable Objects take a look at Cloudflare's durable multiplayer moat by...

An LLM TDD loop
13 Oct 2024 | original ↗

An LLM TDD loop Super neat demo by David Winterbottom, who wrapped my LLM and files-to-prompt tools in a short Bash script that can be fed a file full of Python unit tests and an empty implementation file and will then iterate on that file in a loop until the tests pass. Via @codeinthehole Tags: llm, ai-assisted-programming, python,...

PostgreSQL 17: SQL/JSON is here!
13 Oct 2024 | original ↗

PostgreSQL 17: SQL/JSON is here! Hubert Lubaczewski dives into the new JSON features added in PostgreSQL 17, released a few weeks ago on the 26th of September. This is the latest in his long series of similar posts about new PostgreSQL features. The features are based on the new SQL:2023 standard from June 2023. If you want to actually read the...

jefftriplett/django-startproject
12 Oct 2024 | original ↗

jefftriplett/django-startproject Django's django-admin startproject and startapp commands include a --template option which can be used to specify an alternative template for generating the initial code. Jeff Triplett actively maintains his own template for new projects, which includes the pattern that I personally prefer of keeping settings and...

Perks of Being a Python Core Developer
12 Oct 2024 | original ↗

Perks of Being a Python Core Developer Mariatta Wijaya provides a detailed breakdown of the exact capabilities and privileges that are granted to Python core developers - including commit access to the Python main, the ability to write or sponsor PEPs, the ability to vote on new core developers and for the steering council election and financial...

Python 3.13's best new features
12 Oct 2024 | original ↗

Python 3.13's best new features Trey Hunner highlights some Python 3.13 usability improvements I had missed, mainly around the new REPL. Pasting a block of code like a class or function that includes blank lines no longer breaks in the REPL - particularly useful if you frequently have LLMs write code for you to try out. Hitting F2 in the REPL...

Quoting Michael Wooldridge
12 Oct 2024 | original ↗

Carl Hewitt recently remarked that the question what is an agent? is embarrassing for the agent-based computing community in just the same way that the question what is intelligence? is embarrassing for the mainstream AI community. The problem is that although the term is widely used, by many people working in closely related areas, it defies...

Quoting James Cham
12 Oct 2024 | original ↗

Frankenstein is a terrific book partly based on how concerned people were about electricity. It captures our fears about the nature of being human but didn’t help anyone really come up with better policies for dealing with electricity. I worry that a lot of AI critics are doing the same thing. — James Cham Tags: ai

Cabel Sasser at XOXO
12 Oct 2024 | original ↗

Cabel Sasser at XOXO I cannot recommend this talk highly enough for the way it ends. After watching the video dive into this new site that accompanies the talk - an online archive of the works of commercial artist Wes Cook. I too would very much love to see a full scan of The Lost McDonalds Satire Triptych. Via Andy Baio Tags: cabel-sasser

lm.rs: run inference on Language Models locally on the CPU with Rust
11 Oct 2024 | original ↗

lm.rs: run inference on Language Models locally on the CPU with Rust Impressive new LLM inference implementation in Rust by Samuel Vitorino. I tried it just now on an M2 Mac with 64GB of RAM and got very snappy performance for this Q8 Llama 3.2 1B, with Activity Monitor reporting 980% CPU usage over 13 threads. Here's how I compiled the library...

$2 H100s: How the GPU Bubble Burst
11 Oct 2024 | original ↗

$2 H100s: How the GPU Bubble Burst Fascinating analysis from Eugene Cheah, founder of LLM hosting provider Featherless, discussing GPU economics over the past 12 months. TLDR: Don’t buy H100s. The market has flipped from shortage ($8/hr) to oversupplied ($2/hr), because of reserved compute resales, open model finetuning, and decline in new...

Quoting Mike Caulfield
11 Oct 2024 | original ↗

The primary use of “misinformation” is not to change the beliefs of other people at all. Instead, the vast majority of misinformation is offered as a service for people to maintain their beliefs in face of overwhelming evidence to the contrary. — Mike Caulfield, via Charlie Warzel Tags: misinformation

HTML for People
11 Oct 2024 | original ↗

HTML for People Blake Watson's brand new HTML tutorial, presented as a free online book (CC BY-NC-SA 4.0, on GitHub). This seems very modern and well thought-out to me. It focuses exclusively on HTML, skipping JavaScript entirely and teaching with Simple.css to avoid needing to dig into CSS while still producing sites that are pleasing to look...

Quoting Ed Yong
11 Oct 2024 | original ↗

Providing validation, strength, and stability to people who feel gaslit and dismissed and forgotten can help them feel stronger and surer in their decisions. These pieces made me understand that journalism can be a caretaking profession, even if it is never really thought about in those terms. It is often framed in terms of antagonism. Speaking...

Bridging Language Gaps in Multilingual Embeddings via Contrastive Learning
10 Oct 2024 | original ↗

Bridging Language Gaps in Multilingual Embeddings via Contrastive Learning Most text embeddings models suffer from a "language gap", where phrases in different languages with the same semantic meaning end up with embedding vectors that aren't clustered together. Jina claim their new jina-embeddings-v3 (CC BY-NC 4.0, which means you need to...

Announcing Deno 2
10 Oct 2024 | original ↗

Announcing Deno 2 The big focus of Deno 2 is compatibility with the existing Node.js and npm ecosystem: Deno 2 takes all of the features developers love about Deno 1.x — zero-config, all-in-one toolchain for JavaScript and TypeScript development, web standard API support, secure by default — and makes it fully backwards compatible with Node and...

Forums are still alive, active, and a treasure trove of information
9 Oct 2024 | original ↗

Forums are still alive, active, and a treasure trove of information Chris Person: When I want information, like the real stuff, I go to forums. Over the years, forums did not really get smaller, so much as the rest of the internet just got bigger. Reddit, Discord and Facebook groups have filled a lot of that space, but there is just certain...

Free Threaded Python With Asyncio
9 Oct 2024 | original ↗

Free Threaded Python With Asyncio Jamie Chang expanded my free-threaded Python experiment from a few months ago to explore the interaction between Python's asyncio and the new GIL-free build of Python 3.13. The results look really promising. Jamie says: Generally when it comes to Asyncio, the discussion around it is always about the performance...

The Fair Source Definition
9 Oct 2024 | original ↗

The Fair Source Definition Fail Source (fair.io) is the new-ish initiative from Chad Whitacre and Sentry aimed at providing an alternative licensing philosophy that provides additional protection for the business models of companies that release their code. I like that they're establishing a new brand for this and making it clear that it's a...

otterwiki
9 Oct 2024 | original ↗

otterwiki It's been a while since I've seen a new-ish Wiki implementation, and this one by Ralph Thesen is really nice. It's written in Python (Flask + SQLAlchemy + mistune for Markdown + GitPython) and keeps all of the actual wiki content as Markdown files in a local Git repository. The installation instructions are a little in-depth as they...

openai/openai-realtime-console
9 Oct 2024 | original ↗

openai/openai-realtime-console I got this OpenAI demo repository working today - it's an extremely easy way to get started playing around with the new Realtime voice API they announced at DevDay last week: cd /tmp git clone https://github.com/openai/openai-realtime-console cd openai-realtime-console npm i npm start That starts a localhost:3000...

If we had $1,000,000…
8 Oct 2024 | original ↗

If we had $1,000,000… Jacob Kaplan-Moss gave my favorite talk at DjangoCon this year, imagining what the Django Software Foundation could do if it quadrupled its annual income to $1 million and laying out a realistic path for getting there. Jacob suggests leaning more into large donors than increasing our small donor base: It’s far easier for me...

Anthropic: Message Batches (beta)
8 Oct 2024 | original ↗

Anthropic: Message Batches (beta) Anthropic now have a batch mode, allowing you to send prompts to Claude in batches which will be processed within 24 hours (though probably much faster than that) and come at a 50% price discount. This matches the batch models offered by OpenAI and by Google Gemini, both of which also provide a 50% discount. ...

Django Commons
8 Oct 2024 | original ↗

Django Commons Django Commons is a really promising initiative started by Tim Schilling, aimed at the problem of keeping key Django community projects responsibly maintained on a long-term basis. Django Commons is an organization dedicated to supporting the community's efforts to maintain packages. It seeks to improve the maintenance experience...

Thoughts on the Treasurer Role at Tech NonProfits
7 Oct 2024 | original ↗

Thoughts on the Treasurer Role at Tech NonProfits Will Vincent, Django Software Foundation treasurer from 2020-2022, explains what’s involved in the non-profit role with the highest level of responsibility and trust. Tags: dsf, django

What's New In Python 3.13
7 Oct 2024 | original ↗

What's New In Python 3.13 It's Python 3.13 release day today. The big signature features are a better REPL with improved error messages, an option to run Python without the GIL and the beginnings of the new JIT. Here are some of the smaller highlights I spotted while perusing the release notes. iOS and Android are both now Tier 3 supported...

What's New in Ruby on Rails 8
7 Oct 2024 | original ↗

What's New in Ruby on Rails 8 Rails 8 takes SQLite from a lightweight development tool to a reliable choice for production use, thanks to extensive work on the SQLite adapter and Ruby driver. With the introduction of the solid adapters discussed above, SQLite now has the capability to power Action Cable, Rails.cache, and Active Job effectively,...

Datasette 0.65
7 Oct 2024 | original ↗

Datasette 0.65 Python 3.13 was released today, which broke compatibility with the Datasette 0.x series due to an issue with an underlying dependency. I've fixed that problem by vendoring and fixing the dependency and the new 0.65 release works on Python 3.13 (but drops support for Python 3.8, which is EOL this month). Datasette 1.0a16 added...

↑ these items are from RSS. Visit the blog itself at http://simonwillison.net/ to find other articles and to appreciate the author's digital home.