How to think about creating a dataset for LLM finetuning evaluation

From the blog of Alex Strick van Linschoten, 24 Jun 2024

I previously experimented with one-click LLM finetuning providers and now is a good time to return to the core of the matter: evaluating how well all these fine-tuned models and experiments are faring. I have a gut feeling that my fine-tuned models did pretty well, but we’re not in the business of gut feeling so I’m hoping to be able to put some...

Questionable Advice: “How can I drive change and influence teams…without power?”

charity.wtf | original ↗

Trying Kolmogorov-Arnold Networks in Practice

cprimozic.net Blog | original ↗

Llama 3.2: New Edge AI and Vision Models

Tao of Mac | original ↗

[notes] Implications of Plateauing LLMs

Sympolymathesy, by Chris Krycho | original ↗

If you don’t examine what worked, how will you know what works?

Surfing Complexity | original ↗

Free Online Classes From FEMA

Cogito, Ergo Sumana | original ↗

A poor man's guide to fine-tuning Llama 2

Duarte O.Carmo | original ↗

Being data driven

Home on Erik Bernhardsson | original ↗

Experimenting with AI voice

Cassidy Williams | original ↗

kmemcheck in mainline

tail -f /var/log/messages | grep vegard | original ↗

More from Alex Strick van Linschoten

All the things I learned while trending on Hacker News

6 Jul 2024 | original ↗

My previous two blog posts — here and here — were trending / on the front page of Hacker News, driving over 20,000 new visitors to this blog. Welcome! I learned a few new tricks (and some mistakes I’d made) during the ensuing discussion so I thought I’d share some of these here. Some of them might trigger some mini side-investigations into...

One-click LLM finetuning with Predibase, OpenPipe and OpenAI

16 Jun 2024 | original ↗

The last post in this series showed that finetuning an LLM needn’t be particularly difficult. I used axolotl to produce finetuned versions of Llama3, Mistral and TinyLlama models. During the course we were given a bunch of credits by various companies in the LLM and finetuning space. Among those were credits from some finetuning-as-a-service...

Introducing the Afghanwire Dataset: A Unique Collection of Translated Afghan Media Articles from 2006-2009

31 Mar 2024 | original ↗

I am excited to announce the release of a new dataset on the Hugging Face Hub: the Afghanwire Dataset. This dataset is a comprehensive collection of translated Afghan media articles from the period of May 2006 to September 2009, created by the Afghanwire media agency, which I co-founded together with Felix Kuehn. During the years that Afghanwire...

Writing a custom Terraform provider to deploy Huggingface Spaces

30 Mar 2024 | original ↗

If you’re reading this blog, you’ve probably visited the Huggingface website and you’ve almost certainly tried out one of their ‘Spaces’. These are deployed mini-applications hosted on Huggingface infrastructure. I’ve created spaces of my own, and at work I added a way for people to quickly deploy a ZenML server as a ‘Space’. I love browsing all...

Publishing the ISAF Press Releases dataset

23 Mar 2024 | original ↗

Yesterday I published two datasets to the Hugging Face Hub and I wanted to briefly add some context to them and what they might be useful for. TL;DR: I wrote a paper in 2011 that used international military forces’ press releases about Afghanistan military operations to gain an understanding of what was going on on the ground. The paper was...
