My finetuned models beat OpenAI’s GPT-4

from blog Alex Strick van Linschoten, 30 Jun 2024 | ↗ original

My last post outlined the kinds of evaluation I need and want in order to understand how well my finetuned LLM is performing in the task of structured data extraction from press releases. Let’s start with the core metric I’m interested in, accuracy, and then later we can dive into some of the other evaluation metrics as well. TL;DR The headline for this...

Evaluating the Effectiveness of LLM-Evaluators (aka LLM-as-Judge)

Eugene Yan | original ↗

A beginners guide to fine tuning LLM using LoRA

zabirauf || Zohaib | original ↗

Using Cuelang With Go for LLM Data Extraction

Cybernetist | original ↗

Task-Specific LLM Evals that Do & Don't Work

Eugene Yan | original ↗

Experimenting with LLMs to Research, Reflect, and Plan

Eugene Yan | original ↗

Building LLM applications for production

Chip Huyen | original ↗

Go talk to the LLM

meain/blog | original ↗

My Favorite Website to Markdown Tools For LLMs

Brentter.com/ | original ↗

How I'm using AI as a technical writer

passo.uno | original ↗

Everything I did in 2024

Vicki Boykis | original ↗

More from Alex Strick van Linschoten

Final notes on ‘Prompt Engineering for LLMs’

16 Jan 2025 | original ↗

Here are the final notes from ‘Prompt Engineering for LLMs’, a book I’ve been reading over the past few days (and enjoying!). Chapter 10: Evaluating LLM Applications The chapter begins with an interesting anecdote about GitHub Copilot - the first code written in their repository was the evaluation harness, highlighting the importance of testing...

Assembling the Prompt: Notes on ‘Prompt Engineering for LLMs’ ch 6

12 Jan 2025 | original ↗

Chapter 6 of “Prompt Engineering for LLMs” is devoted to how to structure the prompt and compose its various elements. We first learn about the different kinds of ‘documents’ that we can mimic with our prompts, then think about how to pick which pieces of context to include, and then think through how we might compose all of this together....

Prompt Content: Notes on ‘Prompt Engineering for LLMs’ ch 5

11 Jan 2025 | original ↗

Chapter 5 of ‘Prompt Engineering for LLMs’ tackles the kinds of things you might want to include in your prompt. (Chapter 6 thinks through the order, structuring and weighting of these different pieces of content, so this is purely about the ‘what’ and not the ‘how’). We split the kinds of content up into static and dynamic content. For static...

Starting to read Prompt Engineering for LLMs

8 Jan 2025 | original ↗

I’m posting some of my summary notes while reading through John Berryman and Albert Ziegler’s “Prompt Engineering for LLMs”. What follows are my notes from the first two chapters. It was a bit too long for a post to LinkedIn so I’m posting my notes in full here. Chapter 1: Introduction to Prompt Engineering The opening chapter frames prompt...

All the things I learned while trending on Hacker News

6 Jul 2024 | original ↗

My previous two blog posts — here and here — were trending / on the front page of Hacker News, driving over 20,000 new visitors to this blog. Welcome! I learned a few new tricks (and some mistakes I’d made) during the ensuing discussion so I thought I’d share some of these here. Some of them might trigger some mini side-investigations into...
