Can AI models reason like a human?

from blog John D. Cook, 8 Jan 2025 | ↗ original

We’re awaiting the release of OpenAI’s o3 model later this month. Its performance is impressive on very hard benchmarks like SWE-bench Verified, Frontier Math and the ARC AGI benchmark (discussed previously in this blog). And yet at the same time some behaviors of the frontier AI models are very concerning. Their performance on assorted math […]...


Can AI do maths yet? Thoughts from a mathematician.

Xena | original ↗

New OpenAI feature: Predicted Outputs

Simon Willison's Weblog | original ↗

I fixed the strawberry problem because OpenAI couldn't

Xe Iaso's blog | original ↗

Malleable software in the age of LLMs

Geoffrey Litt | original ↗

What’s Good for the Goose, AI Training Edition

Daring Fireball | original ↗

In Support of SB 1047

Shtetl-Optimized | original ↗

What I learned from looking at 900 most popular open source AI tools

Chip Huyen | original ↗

On AlphaTensor’s new matrix multiplication algorithms

The ryg blog | original ↗

All the things I learned while trending on Hacker News

Alex Strick van Linschoten | original ↗

Prompts.js

Simon Willison's Weblog | original ↗

More from John D. Cook

Quick change directory

7 Jan 2025 | original ↗

One difficulty when working at a command line is navigating between directories, particularly between locations with long paths. There are several ways to mitigate this. One of the simplest is using cd - to return to the previous directory. Another is to use pushd and popd. Still another is to set the CDPATH variable. qcd […] The post Quick...

Converse of RSA

6 Jan 2025 | original ↗

The security of RSA encryption depends on the difficulty of factoring the product of two large primes. If you can factor large numbers efficiently, you can break RSA. But if you can break RSA, can you factor large numbers? Sorta. It’s conceivable that there is a way to break RSA encryption without having to recover the […] The post Converse of RSA...

Unicode Steganography

2 Jan 2025 | original ↗

Steganography attempts to prevent messages from being read by unintended recipients by hiding the messages rather than (or in addition to) encrypting them. Steganography is used when you not only want to keep your communication private, you want to hide the fact that you’ve communicated at all. Fun fact: The words steganography and stegosaurus...

Carnival of Mathematics 235

2 Jan 2025 | original ↗

A blog carnival is a way to discover new blogs. Writers on a given topic, such as math, take turns hosting the carnival, featuring recent posts from various writers. Blog carnivals were once much more common, but most have faded away. The Carnival of Mathematics, however, is one of the oldest carnivals and still active, […] The post Carnival of...

Up to isomorphism

1 Jan 2025 | original ↗

The previous post showed that there are 10 Abelian groups that have 2025 elements. Implicitly that means there are 10 Abelian groups up to isomorphism, i.e. groups that are not in some sense “the same” even if they look different. Sometimes it is clear what we mean by “the same” and there’s no need to […] The post Up to isomorphism first appeared...
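
The count of 10 can be checked with a short sketch. It relies on a standard consequence of the fundamental theorem of finite Abelian groups: the number of Abelian groups of order n, up to isomorphism, is the product of p(e) over the exponents e in the prime factorization of n, where p(e) is the number of integer partitions of e. The function names below are illustrative, and this is not necessarily how the earlier post arrived at the number.

```python
from functools import lru_cache


def prime_factorization(n):
    """Return {prime: exponent} for n >= 1, using trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever is left over is a single prime factor.
        factors[n] = factors.get(n, 0) + 1
    return factors


@lru_cache(maxsize=None)
def partitions(n, largest=None):
    """Count partitions of n into parts no larger than `largest`."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    # Choose the largest part k, then partition the rest with parts <= k.
    return sum(partitions(n - k, k) for k in range(1, min(n, largest) + 1))


def count_abelian_groups(order):
    """Number of Abelian groups of the given order, up to isomorphism."""
    count = 1
    for exponent in prime_factorization(order).values():
        count *= partitions(exponent)
    return count


print(count_abelian_groups(2025))  # 2025 = 3^4 * 5^2, so p(4) * p(2) = 5 * 2 = 10
```

Each partition of an exponent corresponds to one way of splitting that prime-power part into cyclic factors, which is why the counts multiply across primes.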
