Eugene Yan

Eugene Yan works at the intersection of consumer data & tech to build machine learning products, and writes about effective data science, learning & career.
https://eugeneyan.com/ (RSS)
39 lessons from Industry ML Conferences in 2024
3 Nov 2024 | original ↗

ML systems, production & scaling, execution & collaboration, building for users, conference etiquette.

AlignEval: Building an App to Make Evals Easy, Fun, and Automated
27 Oct 2024 | original ↗

Look at and label your data, build and evaluate your LLM-evaluator, and optimize it against your labels.

Hackathon Judge - Weights & Biases LLM-Evaluator Hackathon
22 Sept 2024 | original ↗

Being a human judge at the Weights & Biases LLM-as-a-Judge Hackathon

Building the Same App Using Various Web Frameworks
8 Sept 2024 | original ↗

FastAPI, FastHTML, Next.js, SvelteKit, and thoughts on how coding assistants influence builders' choices.

Evaluating the Effectiveness of LLM-Evaluators (aka LLM-as-Judge)
18 Aug 2024 | original ↗

Use cases, techniques, alignment, finetuning, and critiques against LLM-evaluators.

How to Interview and Hire ML/AI Engineers
7 Jul 2024 | original ↗

What to interview for, how to structure the phone screen, interview loop, and debrief, and a few tips.

AIE World's Fair 2024 Keynote - What We Learned from a Year of LLMs
27 Jun 2024 | original ↗

Special double-feature closing keynote from the 6 authors of the hit O'Reilly article on Applied LLMs.

Netflix PRS 2024 - Applying LLMs to Recommendation Experiences
31 May 2024 | original ↗

Challenges and lessons from deploying LLM experiences: evals, scalability, guardrails.

Prompting Fundamentals and How to Apply them Effectively
26 May 2024 | original ↗

Structured input/output, prefilling, n-shots prompting, chain-of-thought, reducing hallucinations, etc.

What We've Learned From A Year of Building with LLMs
12 May 2024 | original ↗

From the tactical nuts & bolts to the operational day-to-day to the long-term business strategy.

Building an AI Coach to Help Tame My Monkey Mind
7 Apr 2024 | original ↗

Building an AI coach with speech-to-text, text-to-speech, an LLM, and a virtual number.

Task-Specific LLM Evals that Do & Don't Work
31 Mar 2024 | original ↗

Evals for classification, summarization, translation, copyright regurgitation, and toxicity.

Don't Mock Machine Learning Models In Unit Tests
25 Feb 2024 | original ↗

How unit testing machine learning code differs from typical software practices

How to Generate and Use Synthetic Data for Finetuning
11 Feb 2024 | original ↗

Overcoming the bottleneck of human annotations in instruction-tuning, preference-tuning, and pretraining.

Language Modeling Reading List (to Start Your Paper Club)
7 Jan 2024 | original ↗

Some fundamental papers and a one-sentence summary for each; start your own paper club!

2023 Year in Review
31 Dec 2023 | original ↗

An expanded charter, lots of writing and speaking, and finally learning to snowboard.

Push Notifications: What to Push, What Not to Push, and How Often
24 Dec 2023 | original ↗

Sending helpful & engaging pushes, filtering annoying pushes, and finding the frequency sweet spot.

Out-of-Domain Finetuning to Bootstrap Hallucination Detection
5 Nov 2023 | original ↗

How to use open-source, permissive-use data and collect fewer labeled samples for our tasks.

Reflections on AI Engineer Summit 2023
15 Oct 2023 | original ↗

The biggest deployment challenges, backward compatibility, multi-modality, and SF work ethic.

AI Engineer Summit 2023 Keynote - Building Blocks for LLM Systems
9 Oct 2023 | original ↗

Evals, retrieval-augmented generation, guardrails, and collecting feedback; all that good stuff.

Evaluation & Hallucination Detection for Abstractive Summaries
3 Sept 2023 | original ↗

Reference, context, and preference-based metrics, self-consistency, and catching hallucinations.

How to Match LLM Patterns to Problems
13 Aug 2023 | original ↗

Distinguishing problems with external vs. internal LLMs, and data vs non-data patterns

Patterns for Building LLM-based Systems & Products
30 Jul 2023 | original ↗

Evals, RAG, fine-tuning, caching, guardrails, defensive UX, and collecting user feedback.

Obsidian-Copilot: An Assistant for Writing & Reflecting
11 Jun 2023 | original ↗

Writing drafts via retrieval-augmented generation. Also reflecting on the week's journal entries.

Some Intuition on Attention and the Transformer
21 May 2023 | original ↗

What's the big deal, intuition on query-key-value vectors, multiple heads, multiple layers, and more.

Open-LLMs - A list of LLMs for Commercial Use
7 May 2023 | original ↗

It started with a question that had no clear answer, and led to eight PRs from the community.

Interacting with LLMs with Minimal Chat
30 Apr 2023 | original ↗

Should chat be the main UX for LLMs? I don't think so and believe we can do better.

More Design Patterns For Machine Learning Systems
23 Apr 2023 | original ↗

9 patterns including HITL, hard mining, reframing, cascade, data flywheel, business rules layer, and more.

Raspberry-LLM - Making My Raspberry Pico a Little Smarter
16 Apr 2023 | original ↗

Generating Dr. Seuss headlines, fake WSJ quotes, HackerNews troll comments, and more.

Experimenting with LLMs to Research, Reflect, and Plan
9 Apr 2023 | original ↗

Also, shortcomings in document retrieval and how to overcome them with search & recsys techniques.

LLM-powered Biographies
19 Mar 2023 | original ↗

Asking LLMs to generate biographies to get a sense of how they memorize and regurgitate.

How to Write Data Labeling/Annotation Guidelines
12 Mar 2023 | original ↗

Writing good instructions to achieve high precision and throughput.

Content Moderation & Fraud Detection - Patterns in Industry
26 Feb 2023 | original ↗

Collecting ground truth, data augmentation, cascading heuristics and models, and more.

Mechanisms for Effective Technical Teams
5 Feb 2023 | original ↗

End of week debrief, weekly business review, monthly learning sessions, and quarter review.

Mechanisms for Effective Machine Learning Projects
22 Jan 2023 | original ↗

Pilot & copilot, literature review, methodology review, and timeboxing.

Goodbye Roam Research, Hello Obsidian
15 Jan 2023 | original ↗

How to migrate and sync notes & images across devices

What To Do If Dependency Teams Can’t Help
8 Jan 2023 | original ↗

Seeking first to understand, earning trust, and preparing for away team work.

2022 in Review & 2023 Goals
24 Dec 2022 | original ↗

Travelled, wrote, and learned a lot, L5 -> L6, gave a keynote at RecSys, and started a meetup.

Autoencoders and Diffusers: A Brief Comparison
11 Dec 2022 | original ↗

A quick overview of variational and denoising autoencoders and comparing them to diffusers.

Text-to-Image: Diffusion, Text Conditioning, Guidance, Latent Space
27 Nov 2022 | original ↗

The fundamentals of text-to-image generation, relevant papers, and experimenting with DDPM.

RecSys 2022: Recap, Favorite Papers, and Lessons
2 Oct 2022 | original ↗

My three favorite papers, 17 paper summaries, and ML and non-ML lessons.

RecSys 2022 Keynote - Is the Juice Worth the Squeeze?
23 Sept 2022 | original ↗

Invited keynote at the Workshop for Online Recommender Systems and User Modeling (ORSUM)

Writing Robust Tests for Data & Machine Learning Pipelines
4 Sept 2022 | original ↗

Or why I should write fewer integration tests.

Simplicity is An Advantage but Sadly Complexity Sells Better
14 Aug 2022 | original ↗

Pushing back on the cult of complexity.

Uncommon Uses of Python in Commonly Used Libraries
31 Jul 2022 | original ↗

Some off-the-beaten-path uses of Python learned from reading libraries.

Why You Should Write Weekly 15-5s
26 Jun 2022 | original ↗

15 minutes a week to document your work, increase visibility, and earn trust.

Design Patterns in Machine Learning Code and Systems
12 Jun 2022 | original ↗

Understanding and spotting patterns to use code and components as intended.

What I Wish I Knew About Onboarding Effectively
22 May 2022 | original ↗

Mindset, 100-day plan, and balancing learning and taking action to earn trust.

Bandits for Recommender Systems
8 May 2022 | original ↗

Industry examples, exploration strategies, warm-starting, off-policy evaluation, and more.

How to Measure and Mitigate Position Bias
17 Apr 2022 | original ↗

Introducing randomness and/or learning from inherent randomness to mitigate position bias.

Counterfactual Evaluation for Recommendation Systems
10 Apr 2022 | original ↗

Thinking about recsys as interventional vs. observational, and inverse propensity scoring.

Traversing High-Level Intent and Low-Level Requirements
20 Mar 2022 | original ↗

How they differ and why they work better in different situations.

Data Science Project Quick-Start
6 Mar 2022 | original ↗

Hard-won lessons on how to start data science projects effectively.

Mailbag: How to Define a Data Team's Vision and Roadmap
18 Feb 2022 | original ↗

I'm heading into a team lead role and would like to define the vision and roadmap.

Red Flags to Look Out for When Joining a Data Team
13 Feb 2022 | original ↗

What to consider in terms of data, roadmap, role, manager, tooling, etc.

How to Keep Learning about Machine Learning
19 Jan 2022 | original ↗

Beyond getting that starting role, how does one continue growing in the field?

The Data Scientist Show - Building end-to-end ML systems
2 Dec 2021 | original ↗

Daliana and I had a 2hr chat on all things data science and machine learning.

2021 Year in Review
28 Nov 2021 | original ↗

Met most of my goals, adopted a puppy, and built ApplyingML.com.

Informal Mentors Grew into ApplyingML.com!
25 Nov 2021 | original ↗

More than two dozen interviews with ML Practitioners sharing their stories and advice

5 Lessons I Learned from Writing Online (Guest post by Susan Shu)
7 Nov 2021 | original ↗

Susan shares 5 lessons she gained from writing online in public over the past year.

What I Learned from Writing Online - For Fellow Non-Writers
17 Oct 2021 | original ↗

Write before you're ready, write for yourself, quantity over quality, and a few other lessons.

RecSys 2021 - Papers and Talks to Chew on
3 Oct 2021 | original ↗

Simple baselines, ideas, tech stacks, and packages to try.

The First Rule of Machine Learning: Start without Machine Learning
19 Sept 2021 | original ↗

Why this is the first rule, some baseline heuristics, and when to move on to machine learning.

MLOps Community - System Design for RecSys & Search
15 Sept 2021 | original ↗

An overview of system design, candidate retrieval, and ranking, with industry examples.

Reinforcement Learning for Recommendations and Search
5 Sept 2021 | original ↗

Focusing on long-term rewards, exploration, and frequently updated items.

Amazon Science - Eugene Yan and the Art of Writing about Science
2 Aug 2021 | original ↗

Why the Amazon applied scientist takes the time to break down his work for readers.

Bootstrapping Labels via ___ Supervision & Human-In-The-Loop
1 Aug 2021 | original ↗

How to generate labels from scratch with semi, active, and weakly supervised learning.

Mailbag: How to Bootstrap Labels for Relevant Docs in Search
20 Jul 2021 | original ↗

Building semantic search; how to calculate recall when relevant documents are unknown.

SF Big Analytics - System Design for RecSys & Search
13 Jul 2021 | original ↗

Why real-time RecSys? What does the system design look like in industry? How to build an MVP?

Influencing without Authority for Data Scientists
4 Jul 2021 | original ↗

Show them the data, the Socratic method, earning trust, and more.

System Design for Recommendations and Search
27 Jun 2021 | original ↗

Breaking it into offline vs. online environments, and candidate retrieval vs. ranking steps.

Patterns for Personalization in Recommendations and Search
13 Jun 2021 | original ↗

A whirlwind tour of bandits, embedding+MLP, sequences, graph, and user embeddings.

Towards Data Science - Author Spotlight with Eugene Yan
2 Jun 2021 | original ↗

My favourite project, how I write weekly and how you can start, and content I would like to see more of.

The Metagame of Applying Machine Learning
2 May 2021 | original ↗

How to go from knowing machine learning to applying it at work to drive impact.

Search: Query Matching via Lexical, Graph, and Embedding Methods
25 Apr 2021 | original ↗

An overview and comparison of the various approaches, with examples from industry search systems.

My Impostor Syndrome Stories (Guest Post by Susan Shu)
18 Apr 2021 | original ↗

Even high achieving individuals experience impostor syndrome; here's how Susan learned to manage it.

How to Live with Chronic Imposter Syndrome
11 Apr 2021 | original ↗

More education, achievements, and awards don't shoo away imposter syndrome. Here's what might help.

Planning Your Career: Values and Superpowers
4 Apr 2021 | original ↗

What do you deeply care about? What do you excel at? Build a career out of that.

Bukalapak - Fireside Chat with the Data Science team
28 Mar 2021 | original ↗

We discussed how to build and run data teams and engage better with business.

TalkPython - What ML can Teach Us About Life
26 Mar 2021 | original ↗

Mike and I take a philosophical detour on Talk Python and discuss life lessons from machine learning.

Choosing Problems in Data Science and Machine Learning
21 Mar 2021 | original ↗

Short vs. long-term gain, incremental vs. disruptive innovation, and resume-driven development.

Seven Habits that Shaped My Last Decade
14 Mar 2021 | original ↗

I wish I started sooner. All have improved my life and several have compounding effects.

How to Write Design Docs for Machine Learning Systems
7 Mar 2021 | original ↗

Pointers to think through your methodology and implementation, and the review process.

How to Write Better with The Why, What, How Framework
28 Feb 2021 | original ↗

Three documents I write (one-pager, design doc, after-action review) and how I structure them.

Feature Stores: A Hierarchy of Needs
21 Feb 2021 | original ↗

Access, serving, integrity, convenience, autopilot; use what you need.

How to Win a Data Hackathon (Hacklytics 2021)
14 Feb 2021 | original ↗

What the top teams did to win the 36-hour data hackathon. No, not machine learning.

DataTalksClub - Building an ML System; Behind the Scenes
7 Feb 2021 | original ↗

Design and architecture, tech stack, methodology, results, and lessons learned.

Growing and Running Your Data Science Team
31 Jan 2021 | original ↗

What I learned about hiring and training, and fostering innovation, discipline, and camaraderie.

You Don't Really Need Another MOOC
24 Jan 2021 | original ↗

Stop procrastinating, go off the happy path, learn just-in-time, and get your hands dirty.

DataTalksClub - The Importance Of Writing In A Tech Career
17 Jan 2021 | original ↗

Why did I start writing? What's my writing process? What's the writing culture at Amazon like?

Mailbag: How to Get Experienced DS Resume Noticed by Recruiters?
16 Jan 2021 | original ↗

How to increase the chances of getting called up by recruiters?

Real-time Machine Learning For Recommendations
10 Jan 2021 | original ↗

Why real-time? How have China & US companies built them? How to design & build an MVP?

2021 Roadmap: Sharing, Helping, and Living More
3 Jan 2021 | original ↗

A public roadmap to track and share my progress; nothing mission or work-related.

2020 Retrospective: New Country, New Role, New Habit
20 Dec 2020 | original ↗

Wrapping up 2020 with writing and site statistics, graphs, and a word cloud.

Catch the Flying Daggers
11 Dec 2020 | original ↗

A short story on flying daggers and life's challenges.

How I’m Reflecting on 2020 and Planning for 2021
6 Dec 2020 | original ↗

Time to clear the cache, evaluate existing processes, and start new threads.

Alexey Grigorev on His Career, Data Science, and Writing
29 Nov 2020 | original ↗

How he switched from engineering to data science, what "senior" means, and how writing helps.

Mailbag: What's the Architecture for your Blog?
24 Nov 2020 | original ↗

How did you set up your site and what's an easy way to replicate it?

What Machine Learning Can Teach Us About Life - 7 Lessons
22 Nov 2020 | original ↗

Data cleaning, transfer learning, overfitting, ensembling, and more.

How to Prevent or Deal with a Data Science Role or Title Mismatch
15 Nov 2020 | original ↗

Interview questions you should ask and how to evolve your job scope.

Applied / Research Scientist, ML Engineer: What’s the Difference?
8 Nov 2020 | original ↗

A personal take on their deliverables and skills, and what it means for the industry and your team.

Chip Huyen on Her Career, Writing, and Machine Learning
1 Nov 2020 | original ↗

Setbacks she faced, overcoming them, and how writing changed her life.

Data Discovery Platforms and Their Open Source Solutions
25 Oct 2020 | original ↗

What questions do they answer? How do they compare? What open-source solutions are available?

Why I switched from Netlify back to GitHub Pages
21 Oct 2020 | original ↗

DNS server snafus led to email & security issues. Also, limited free build minutes monthly.

Why Have a Data Science Portfolio and What It Shows
18 Oct 2020 | original ↗

Not 'How to build a data science portfolio', but 'Whys' and 'Whats'.

How to Install Google Scalable Nearest Neighbors (ScaNN) on Mac
14 Oct 2020 | original ↗

Step-by-step walkthrough on the environment, compilers, and installation for ScaNN.

How Prototyping Can Help You to Get Buy-In
11 Oct 2020 | original ↗

Building prototypes helped get buy-in when roadmaps & design docs failed.

Is Writing as Important as Coding?
4 Oct 2020 | original ↗

As careers grow, how does the balance between writing & coding change? Hear from 4 tech leaders.

RecSys 2020: Takeaways and Notable Papers
27 Sept 2020 | original ↗

Emphasis on bias, more sequential models & bandits, robust offline evaluation, and recsys in the wild.

Appreciating the Present
26 Sept 2020 | original ↗

What if the alternative was nothingness?

CareerFair - Day-to-day as an Applied Scientist at Amazon
21 Sept 2020 | original ↗

What's an average day like? What's great about the role? How's working at Amazon?

Routines and Tools to Optimize My Day (Guest Post by Susan Shu)
20 Sept 2020 | original ↗

For years I've refined my routines and found tools to manage my time. Here I share it with readers.

How to Accomplish More with Less - Useful Tools & Routines
13 Sept 2020 | original ↗

My tools for organization and creation, autopilot routines, and Maker's schedule

Migrating Site Comments to Utterances
7 Sept 2020 | original ↗

A step-by-step of how to migrate from json comments to Utterances.

How to Test Machine Learning Code and Systems
6 Sept 2020 | original ↗

Checking for correct implementation, expected learned behaviour, and satisfactory performance.

Mailbag: Parsing Fields from PDFs—When to Use Machine Learning?
4 Sept 2020 | original ↗

Should I switch from a regex-based to ML-based solution on my application?

Datacast Podcast - Effective Data Science with Eugene Yan
3 Sept 2020 | original ↗

My chat with James Le about my experience, leadership, agile, ML in production, writing, and more.

How Reading Papers Helps You Be a More Effective Data Scientist
30 Aug 2020 | original ↗

Why read papers, what papers to read, and how to read them.

Mailbag: I'm Now a Senior DS—How should I Approach this?
27 Aug 2020 | original ↗

Becoming a senior after three years and dealing with imposter syndrome.

Embrace Beginner's Mind; Avoid The Wrong Way To Be An Expert
23 Aug 2020 | original ↗

How not to become an expert beginner and to progress through beginner, intermediate, and so on.

NLP for Supervised Learning - A Brief Survey
16 Aug 2020 | original ↗

Examining the broad strokes of NLP progress and comparing between models

Unpopular Opinion: Data Scientists Should be More End-to-End
9 Aug 2020 | original ↗

Why (and why not) be more end-to-end, how to, and Stitch Fix and Netflix's experience

Adding a Checkbox & Download Button to a FastAPI-HTML app
5 Aug 2020 | original ↗

Updating our FastAPI app to let users select options and download results.

What I Did Not Learn About Writing In School
2 Aug 2020 | original ↗

Surprising lessons I picked up from the best books, essays, and videos on writing non-fiction.

Georgia Tech's OMSCS FAQ (based on my experience)
26 Jul 2020 | original ↗

Why OMSCS? How can I get accepted? How much time is needed? Did it help your career? And more...

How to Set Up a HTML App with FastAPI, Jinja, Forms & Templates
23 Jul 2020 | original ↗

I couldn't find any guides on serving HTML with FastAPI, thus I wrote this to plug the hole on the internet.

Why You Need to Follow Up After Your Data Science Project
19 Jul 2020 | original ↗

Ever revisit a project & replicate the results the first time round? Me neither. Thus I adopted these habits.

What I Do During A Data Science Project To Deliver Success
12 Jul 2020 | original ↗

It's not enough to have a good strategy and plan. Execution is just as important.

How to Update a GitHub Profile README Automatically
11 Jul 2020 | original ↗

I wanted to add my recent writing to my GitHub Profile README but was too lazy to do manual updates.

The 85% Rule: When Giving It Your 100% Gets You Less than 85%
9 Jul 2020 | original ↗

I thought giving it my all led to maximum outcomes; then I learnt about the 85% rule.

My Notes From Spark+AI Summit 2020 (Application-Specific Talks)
5 Jul 2020 | original ↗

Part II of the previous write-up, this time on applications and frameworks of Spark in production

My Notes From Spark+AI Summit 2020 (Application-Agnostic Talks)
28 Jun 2020 | original ↗

Sharing my notes & practical knowledge from the conference for people who don't have the time.

Mailbag: Qns on the Intersection of Data Science and Business
21 Jun 2020 | original ↗

Does DS have business requirements? When does it make sense to split DS and DE?

How to Set Up a Python Project For Automation and Collaboration
21 Jun 2020 | original ↗

After this article, we'll have a workflow of tests and checks that run automatically with each git push.

Why Are My Airflow Jobs Running “One Day Late”?
17 Jun 2020 | original ↗

A curious discussion made me realize my expert blind spot. And no, Airflow is not late.

What I Do Before a Data Science Project to Ensure Success
15 Jun 2020 | original ↗

Haste makes waste. Diving into a data science problem may not be the fastest route to getting it done.

What I Love about Scrum for Data Science
7 Jun 2020 | original ↗

Initially, I didn't like it. But over time, it grew on me. Here's why.

How to Apply Crocker's Law for Feedback and Growth
31 May 2020 | original ↗

Crocker's Law, cognitive dissonance, and how to receive (uncomfortable) feedback better.

A Practical Guide to Maintaining Machine Learning in Production
25 May 2020 | original ↗

Can maintaining machine learning in production be easier? I go through some practical tips.

6 Little-Known Challenges After Deploying Machine Learning
18 May 2020 | original ↗

I thought deploying machine learning was hard. Then I had to maintain multiple systems in prod.

How to Write: Advice from David Perell and Sahil Lavingia
9 May 2020 | original ↗

An expansion of my Twitter thread that went viral.

A Hackathon Where the Dinkiest Idea Won. Why?
3 May 2020 | original ↗

What I learnt about evaluating ideas from first-hand participation in a hackathon.

Serendipity: Accuracy’s Unpopular Best Friend in Recommenders
26 Apr 2020 | original ↗

What I learned about measuring diversity, novelty, surprise, and serendipity from 10+ papers.

How to Give a Kick-Ass Data Science Talk
18 Apr 2020 | original ↗

Why you should give a talk and some tips from five years of speaking and hosting meet-ups.

Commando, Soldier, Police and Your Career Choices
12 Apr 2020 | original ↗

Should I join a start-up? Which offer should I accept? A simple metaphor to guide your decisions.

Stop Taking Regular Notes; Use a Zettelkasten Instead
5 Apr 2020 | original ↗

Using a Zettelkasten helps you make connections between notes, improving learning and memory.

Writing is Learning: How I Learned an Easier Way to Write
28 Mar 2020 | original ↗

Writing begins before actually writing; it's a cycle of reading -> note-taking -> writing.

Simpler Experimentation with Jupyter, Papermill, and MLflow
15 Mar 2020 | original ↗

Automate your experimentation workflow to minimize effort and iterate faster.

My Journey from Psych Grad to Leading Data Science at Lazada
27 Feb 2020 | original ↗

How hard work, many failures, and a bit of luck got me into the field and up the ladder.

DataScience SG Meetup - RecSys, Beyond the Baseline
14 Jan 2020 | original ↗

Comparing baselines (matrix factorization) against novel approaches using graphs & NLP.

Beating the Baseline Recommender with Graph & NLP in Pytorch
13 Jan 2020 | original ↗

Beating the baseline using Graph & NLP techniques on PyTorch, AUC improvement of ~21% (Part 2 of 2).

Building a Strong Baseline Recommender in PyTorch, on a Laptop
6 Jan 2020 | original ↗

Building a baseline recsys based on data scraped off Amazon. Warning - Lots of charts! (Part 1 of 2).

OMSCS CS6200 (Introduction to OS) Review and Tips
15 Dec 2019 | original ↗

OMSCS CS6200 (Introduction to OS) - Moving data from one process to another, multi-threaded.

DataScience SG x ODSC Meetup - Applying ML to Healthcare
9 Oct 2019 | original ↗

In-depth sharing on how to put machine learning systems into production.

OLX Prod Tech 2019 Keynote - Asia's Tech Giants & SuperApps
3 Oct 2019 | original ↗

Keynote on how Asia's tech giants scale and their SuperApp strategy.

OMSCS CS6750 (Human Computer Interaction) Review and Tips
3 Sept 2019 | original ↗

OMSCS CS6750 (Human Computer Interaction) - You are not your user! Or how to build great products.

Goodbye Wordpress, Hello Jekyll!
25 Aug 2019 | original ↗

Moving off WordPress and hosting for free on GitHub. And gaining full customization!

OMSCS CS6440 (Intro to Health Informatics) Review and Tips
4 Aug 2019 | original ↗

OMSCS CS6440 (Intro to Health Informatics) - A primer on key tech and standards in healthtech.

OMSCS CS7646 (Machine Learning for Trading) Review and Tips
11 May 2019 | original ↗

OMSCS CS7646 (Machine Learning for Trading) - Don't sell your house to trade algorithmically.

What does a Data Scientist really do?
30 Apr 2019 | original ↗

No, you don't need a PhD or 10+ years of experience.

DATAx - A Production ML system for SEA's Biggest Hospital Group
6 Mar 2019 | original ↗

How we built an ML system to predict hospitalization costs at admission; sharing at DATAx Conference.

Data Science and Agile (Frameworks for Effectiveness)
2 Feb 2019 | original ↗

Taking the best from agile and modifying it to fit the data science process (Part 2 of 2).

Data Science and Agile (What Works, and What Doesn't)
26 Jan 2019 | original ↗

A deeper look into the strengths and weaknesses of Agile in Data Science projects (Part 1 of 2).

DataScience SG Meetup - Panel On the Different Roles in Data
17 Jan 2019 | original ↗

What's the difference between a data scientist, data engineer, and ML engineer? A panel at Google.

OMSCS CS6601 (Artificial Intelligence) Review and Tips
20 Dec 2018 | original ↗

OMSCS CS6601 (Artificial Intelligence) - First, start with the simplest solution, and then add intelligence.

GovTech Conference - Data Science and Agile—Can or Not?
28 Oct 2018 | original ↗

Yes, Agile can be adopted by data science teams. Moderating a panel at GovTech STACK.

OMSCS CS6460 (Education Technology) Review and Tips
25 Aug 2018 | original ↗

OMSCS CS6460 (Education Technology) - How to scale education widely through technology.

OMSCS CS7642 (Reinforcement Learning) Review and Tips
30 Jul 2018 | original ↗

OMSCS CS7642 (Reinforcement Learning) - Landing rockets (fun!) via deep Q-Learning (and its variants).

Big Data & Analytics Summit - Data Science Challenges @ Lazada
21 Jun 2018 | original ↗

Technical challenges are easy compared to business and people issues. Sharing at the BDA Summit.

Building a Strong Data Science Team Culture
12 May 2018 | original ↗

Culture >> Hierarchy, Process, Bureaucracy.

INSEAD Lunchtime Talks - How Lazada uses Data
25 Apr 2018 | original ↗

And my idiosyncratic journey to VP of Data Science at Lazada (Alibaba). A Lunchtime chat at INSEAD.

OMSCS CS7641 (Machine Learning) Review and Tips
27 Dec 2017 | original ↗

OMSCS CS7641 (Machine Learning) - Revisiting the fundamentals and learning new techniques.

My first 100 days as Data Science Lead
25 Sept 2017 | original ↗

How being a Lead / Manager is different from being an individual contributor.

SMU - What is Data Analytics and How do I get into it?
26 Aug 2017 | original ↗

What is data science, how to pick it up, and how to enter the field? A discussion with SMU undergrads.

OMSCS CS6300 (Software Development Process) Review and Tips
13 Aug 2017 | original ↗

OMSCS CS6300 (Software Development Process) - Java and collaboratively developing an Android app.

Tech in Asia - My Journey in Data Science and Advice for others
26 Jul 2017 | original ↗

Sharing about why data science, data science myths, a typical day, and more with TIA.

SMU Masters in IT - How to get started in Data Science
26 Jun 2017 | original ↗

Tools and skills to pick up and how to practice them. An Invited Talk with Masters in IT candidates.

How to get started in Data Science
25 Jun 2017 | original ↗

Tools and skills to pick up, and how to practice them.

OMSCS CS6476 (Computer Vision) Review and Tips
15 May 2017 | original ↗

OMSCS CS6476 Computer Vision - Performing computer vision tasks with ONLY numpy.

One way to help a data science team innovate successfully
19 Feb 2017 | original ↗

If things are not failing, you're not innovating enough. - Elon Musk

Product Categorization API Part 3: Creating an API
13 Feb 2017 | original ↗

Or how to put machine learning models into production.

Image search is now live!
14 Jan 2017 | original ↗

A web app to find similar products based on image.

Product Classification API Part 2: Data Preparation
11 Dec 2016 | original ↗

Cleaning up text and messing with ASCII (urgh!)

Strata x Hadoop 2016 - How Lazada Ranks Products
9 Dec 2016 | original ↗

How Lazada ranks products to improve customer experience and conversion at Strata 2016.

Image classification API is now live!
27 Nov 2016 | original ↗

A simple web app to classify fashion images into Amazon categories.

I'm going back to school
2 Nov 2016 | original ↗

Got accepted into Georgia Tech's Computer Science Masters!

SortMySkills is now live!
23 Oct 2016 | original ↗

A card sorting game to discover your passion by identifying skills you like and dislike.

Product Classification API Part 1: Data Acquisition
11 Oct 2016 | original ↗

Parsing JSON and formatting product titles and categories.

Thoughts on Functional Programming in Scala Course (Coursera)
31 Jul 2016 | original ↗

Learning Scala from Martin Odersky, father of Scala.

First post!
6 Jul 2016 | original ↗

Time to start writing.

DataKind Singapore’s Latest Project Accelerator
17 Sept 2015 | original ↗

Guest post on how DataKind SG worked with NGOs to frame their problems and suggest solutions.

DataScience SG Meetup - How we got top 3% in Kaggle
20 Jun 2015 | original ↗

Sharing about my first data science competition at DataScience SG.

↑ these items are from RSS. Visit the blog itself at https://eugeneyan.com/ to find other articles and to appreciate the author's digital home.