Eugene Yan
https://eugeneyan.com/ (RSS)
ML systems, production & scaling, execution & collaboration, building for users, conference etiquette.
Look at and label your data, build and evaluate your LLM-evaluator, and optimize it against your labels.
Being a human judge at the Weights & Biases LLM-as-a-Judge Hackathon
FastAPI, FastHTML, Next.js, SvelteKit, and thoughts on how coding assistants influence builders' choices.
Use cases, techniques, alignment, finetuning, and critiques against LLM-evaluators.
What to interview for, how to structure the phone screen, interview loop, and debrief, and a few tips.
Special double-feature closing keynote from the 6 authors of the hit O'Reilly article on Applied LLMs.
Challenges and lessons from deploying LLM experiences: evals, scalability, guardrails.
Structured input/output, prefilling, n-shot prompting, chain-of-thought, reducing hallucinations, etc.
From the tactical nuts & bolts to the operational day-to-day to the long-term business strategy.
Building an AI coach with speech-to-text, text-to-speech, an LLM, and a virtual number.
Evals for classification, summarization, translation, copyright regurgitation, and toxicity.
How unit testing machine learning code differs from typical software practices
Overcoming the bottleneck of human annotations in instruction-tuning, preference-tuning, and pretraining.
Some fundamental papers and a one-sentence summary for each; start your own paper club!
An expanded charter, lots of writing and speaking, and finally learning to snowboard.
Sending helpful & engaging pushes, filtering annoying pushes, and finding the frequency sweet spot.
How to use open-source, permissive-use data and collect fewer labeled samples for our tasks.
The biggest deployment challenges, backward compatibility, multi-modality, and SF work ethic.
Evals, retrieval-augmented generation, guardrails, and collecting feedback; all that good stuff.
Reference, context, and preference-based metrics, self-consistency, and catching hallucinations.
Distinguishing problems with external vs. internal LLMs, and data vs. non-data patterns
Evals, RAG, fine-tuning, caching, guardrails, defensive UX, and collecting user feedback.
Writing drafts via retrieval-augmented generation. Also reflecting on the week's journal entries.
What's the big deal, intuition on query-key-value vectors, multiple heads, multiple layers, and more.
It started with a question that had no clear answer, and led to eight PRs from the community.
Should chat be the main UX for LLMs? I don't think so and believe we can do better.
9 patterns including HITL, hard mining, reframing, cascade, data flywheel, business rules layer, and more.
Generating Dr. Seuss headlines, fake WSJ quotes, HackerNews troll comments, and more.
Also, shortcomings in document retrieval and how to overcome them with search & recsys techniques.
Asking LLMs to generate biographies to get a sense of how they memorize and regurgitate.
Writing good instructions to achieve high precision and throughput.
Collecting ground truth, data augmentation, cascading heuristics and models, and more.
End of week debrief, weekly business review, monthly learning sessions, and quarter review.
Pilot & copilot, literature review, methodology review, and timeboxing.
How to migrate and sync notes & images across devices
Seeking first to understand, earning trust, and preparing for away team work.
Travelled, wrote, and learned a lot, L5 -> L6, gave a keynote at RecSys, and started a meetup.
A quick overview of variational and denoising autoencoders and comparing them to diffusers.
The fundamentals of text-to-image generation, relevant papers, and experimenting with DDPM.
My three favorite papers, 17 paper summaries, and ML and non-ML lessons.
Invited keynote at the Workshop for Online Recommender Systems and User Modeling (ORSUM)
Or why I should write fewer integration tests.
Pushing back on the cult of complexity.
Some off-the-beaten-path uses of Python learned from reading libraries.
15 minutes a week to document your work, increase visibility, and earn trust.
Understanding and spotting patterns to use code and components as intended.
Mindset, 100-day plan, and balancing learning and taking action to earn trust.
Industry examples, exploration strategies, warm-starting, off-policy evaluation, and more.
Introducing randomness and/or learning from inherent randomness to mitigate position bias.
Thinking about recsys as interventional vs. observational, and inverse propensity scoring.
How they differ and why they work better in different situations.
Hard-won lessons on how to start data science projects effectively.
I'm heading into a team lead role and would like to define the vision and roadmap.
What to consider in terms of data, roadmap, role, manager, tooling, etc.
Beyond getting that starting role, how does one continue growing in the field?
Daliana and I had a 2hr chat on all things data science and machine learning.
Met most of my goals, adopted a puppy, and built ApplyingML.com.
More than two dozen interviews with ML Practitioners sharing their stories and advice
Susan shares 5 lessons she gained from writing online in public over the past year.
Write before you're ready, write for yourself, quantity over quality, and a few other lessons.
Simple baselines, ideas, tech stacks, and packages to try.
Why this is the first rule, some baseline heuristics, and when to move on to machine learning.
An overview of system design, candidate retrieval, and ranking, with industry examples.
Focusing on long-term rewards, exploration, and frequently updated items.
Why the Amazon applied scientist takes the time to break down his work for readers.
How to generate labels from scratch with semi, active, and weakly supervised learning.
Building semantic search; how to calculate recall when relevant documents are unknown.
Why real-time RecSys? What does the system design look like in industry? How to build an MVP?
Show them the data, the Socratic method, earning trust, and more.
Breaking it into offline vs. online environments, and candidate retrieval vs. ranking steps.
A whirlwind tour of bandits, embedding+MLP, sequences, graph, and user embeddings.
My favourite project, how I write weekly and how you can start, and content I would like to see more of.
How to go from knowing machine learning to applying it at work to drive impact.
An overview and comparison of the various approaches, with examples from industry search systems.
Even high achieving individuals experience impostor syndrome; here's how Susan learned to manage it.
More education, achievements, and awards don't shoo away imposter syndrome. Here's what might help.
What do you deeply care about? What do you excel at? Build a career out of that.
We discussed how to build and run data teams and engage better with business.
Mike and I take a philosophical detour on Talk Python and discuss life lessons from machine learning.
Short vs. long-term gain, incremental vs. disruptive innovation, and resume-driven development.
I wish I started sooner. All have improved my life and several have compounding effects.
Pointers to think through your methodology and implementation, and the review process.
Three documents I write (one-pager, design doc, after-action review) and how I structure them.
Access, serving, integrity, convenience, autopilot; use what you need.
What the top teams did to win the 36-hour data hackathon. No, not machine learning.
Design and architecture, tech stack, methodology, results, and lessons learned.
What I learned about hiring and training, and fostering innovation, discipline, and camaraderie.
Stop procrastinating, go off the happy path, learn just-in-time, and get your hands dirty.
Why did I start writing? What's my writing process? What's the writing culture at Amazon like?
How to increase the chances of getting called up by recruiters?
Why real-time? How have China & US companies built them? How to design & build an MVP?
A public roadmap to track and share my progress; nothing mission or work-related.
Wrapping up 2020 with writing and site statistics, graphs, and a word cloud.
A short story on flying daggers and life's challenges.
Time to clear the cache, evaluate existing processes, and start new threads.
How he switched from engineering to data science, what "senior" means, and how writing helps.
How did you set up your site and what's an easy way to replicate it?
Data cleaning, transfer learning, overfitting, ensembling, and more.
Interview questions you should ask and how to evolve your job scope.
A personal take on their deliverables and skills, and what it means for the industry and your team.
Setbacks she faced, overcoming them, and how writing changed her life.
What questions do they answer? How do they compare? What open-source solutions are available?
DNS server snafus led to email & security issues. Also, limited free build minutes monthly.
Not 'How to build a data science portfolio', but 'Whys' and 'Whats'.
Step-by-step walkthrough on the environment, compilers, and installation for ScaNN.
Building prototypes helped get buy-in when roadmaps & design docs failed.
As careers grow, how does the balance between writing & coding change? Hear from 4 tech leaders.
Emphasis on bias, more sequential models & bandits, robust offline evaluation, and recsys in the wild.
What's an average day like? What's great about the role? How's working at Amazon?
For years I've refined my routines and found tools to manage my time. Here I share them with readers.
My tools for organization and creation, autopilot routines, and Maker's schedule
A step-by-step guide on how to migrate from JSON comments to Utterances.
Checking for correct implementation, expected learned behaviour, and satisfactory performance.
Should I switch from a regex-based to ML-based solution on my application?
My chat with James Le about my experience, leadership, agile, ML in production, writing, and more.
Why read papers, what papers to read, and how to read them.
Becoming a senior after three years and dealing with imposter syndrome.
How not to become an expert beginner and to progress through beginner, intermediate, and so on.
Examining the broad strokes of NLP progress and comparing between models
Why (and why not) be more end-to-end, how to, and Stitch Fix and Netflix's experience
Updating our FastAPI app to let users select options and download results.
Surprising lessons I picked up from the best books, essays, and videos on writing non-fiction.
Why OMSCS? How can I get accepted? How much time is needed? Did it help your career? And more...
I couldn't find any guides on serving HTML with FastAPI, thus I wrote this to plug the hole on the internet.
Ever revisit a project & replicate the results the first time round? Me neither. Thus I adopted these habits.
It's not enough to have a good strategy and plan. Execution is just as important.
I wanted to add my recent writing to my GitHub Profile README but was too lazy to do manual updates.
I thought giving it my all led to maximum outcomes; then I learnt about the 85% rule.
Part II of the previous write-up, this time on applications and frameworks of Spark in production
Sharing my notes & practical knowledge from the conference for people who don't have the time.
Does DS have business requirements? When does it make sense to split DS and DE?
After this article, we'll have a workflow of tests and checks that run automatically with each git push.
A curious discussion made me realize my expert blind spot. And no, Airflow is not late.
Haste makes waste. Diving into a data science problem may not be the fastest route to getting it done.
Initially, I didn't like it. But over time, it grew on me. Here's why.
Crocker's Law, cognitive dissonance, and how to receive (uncomfortable) feedback better.
Can maintaining machine learning in production be easier? I go through some practical tips.
I thought deploying machine learning was hard. Then I had to maintain multiple systems in prod.
An expansion of my Twitter thread that went viral.
What I learnt about evaluating ideas from first-hand participation in a hackathon.
What I learned about measuring diversity, novelty, surprise, and serendipity from 10+ papers.
Why you should give a talk and some tips from five years of speaking and hosting meet-ups.
Should I join a start-up? Which offer should I accept? A simple metaphor to guide your decisions.
Using a Zettelkasten helps you make connections between notes, improving learning and memory.
Writing begins before actually writing; it's a cycle of reading -> note-taking -> writing.
Automate your experimentation workflow to minimize effort and iterate faster.
How hard work, many failures, and a bit of luck got me into the field and up the ladder.
Comparing baselines (matrix factorization) against novel approaches using graphs & NLP.
Beating the baseline using Graph & NLP techniques on PyTorch, AUC improvement of ~21% (Part 2 of 2).
Building a baseline recsys based on data scraped off Amazon. Warning - Lots of charts! (Part 1 of 2).
OMSCS CS6200 (Introduction to OS) - Moving data from one process to another, multi-threaded.
In-depth sharing on how to put machine learning systems into production.
Keynote on how Asia's tech giants scale and their SuperApp strategy.
OMSCS CS6750 (Human Computer Interaction) - You are not your user! Or how to build great products.
Moving off WordPress and hosting for free on GitHub. And gaining full customization!
OMSCS CS6440 (Intro to Health Informatics) - A primer on key tech and standards in healthtech.
OMSCS CS7646 (Machine Learning for Trading) - Don't sell your house to trade algorithmically.
No, you don't need a PhD or 10+ years of experience.
How we built an ML system to predict hospitalization costs at admission; sharing at DATAx Conference.
Taking the best from agile and modifying it to fit the data science process (Part 2 of 2).
A deeper look into the strengths and weaknesses of Agile in Data Science projects (Part 1 of 2).
What's the difference between a data scientist, data engineer, and ML engineer? A panel at Google.
OMSCS CS6601 (Artificial Intelligence) - First, start with the simplest solution, and then add intelligence.
Yes, Agile can be adopted by data science teams. Moderating a panel at GovTech STACK.
OMSCS CS6460 (Education Technology) - How to scale education widely through technology.
OMSCS CS7642 (Reinforcement Learning) - Landing rockets (fun!) via deep Q-Learning (and its variants).
Technical challenges are easy compared to business and people issues. Sharing at the BDA Summit.
Culture >> Hierarchy, Process, Bureaucracy.
And my idiosyncratic journey to VP of Data Science at Lazada (Alibaba). A Lunchtime chat at INSEAD.
OMSCS CS7641 (Machine Learning) - Revisiting the fundamentals and learning new techniques.
How being a Lead / Manager is different from being an individual contributor.
What is data science, how to pick it up, and how to enter the field? A discussion with SMU undergrads.
OMSCS CS6300 (Software Development Process) - Java and collaboratively developing an Android app.
Sharing about why data science, data science myths, a typical day, and more with TIA.
Tools and skills to pick up and how to practice them. An Invited Talk with Masters in IT candidates.
Tools and skills to pick up, and how to practice them.
OMSCS CS6476 (Computer Vision) - Performing computer vision tasks with ONLY NumPy.
If things are not failing, you're not innovating enough. - Elon Musk
Or how to put machine learning models into production.
A web app to find similar products based on image.
Cleaning up text and messing with ASCII (urgh!)
How Lazada ranks products to improve customer experience and conversion at Strata 2016.
A simple web app to classify fashion images into Amazon categories.
Got accepted into Georgia Tech's Computer Science Masters!
A card sorting game to discover your passion by identifying skills you like and dislike.
Parsing JSON and formatting product titles and categories.
Learning Scala from Martin Odersky, father of Scala.
Guest post on how DataKind SG worked with NGOs to frame their problems and suggest solutions
Sharing about my first data science competition at DataScience SG.