Apache Spark

from blog Tao of Mac, 1 Jun 2024 | ↗ original

Apache Spark is an advanced distributed execution engine for large-scale data processing that differs from Hadoop by privileging in-memory compute and further enforcing the decoupling between compute and storage. I’ve been an early adopter and spent far too long messing about with it on low-powered machines, and am rather partial to the Databricks...

A Data Engineering Perspective on Go vs. Python (Part 2 - Dataflow)

Christian Hollinger | original ↗

A Data Engineering Perspective on Go vs. Python (Part 1)

Christian Hollinger | original ↗

A look at Apache Hadoop in 2019

Christian Hollinger | original ↗

Data Lakes: Some thoughts on Hadoop, Hive, HBase, and Spark

Christian Hollinger | original ↗

Hot Keys, Scalability, and the Zipf Distribution

Marc Brooker's Blog | original ↗

Stuff that bothers me: “100x faster than Hadoop”

Home on Erik Bernhardsson | original ↗

zheap: a storage engine to provide better control over bloat

PostgreSQL and Databases in general | original ↗

Functional Programming and Big Data

ntietz.com blog | original ↗

DDIA: Chp 6. Partitioning

Metadata | original ↗

DDIA: Chp 3. Storage and Retrieval (Part 2)

Metadata | original ↗

More from Tao of Mac

Brainwash An Executive Today!

14 Jan 2025 | original ↗

Welcome to my life. Seriously, this is an amazing likeness to some of the stuff I go through every day at work.

16GB Raspberry Pi 5

13 Jan 2025 | original ↗

This didn’t fully register when it came out last Thursday (had other stuff on my mind, I guess), but I still think they should have done this for the Raspberry Pi 500 first–because regular desktop users would reap the most benefits, and it would greatly increase the usable lifetime of the device. As to having 16GB on a Model B, it’s certainly...

How I Use LLMs for Coding and Writing

12 Jan 2025 | original ↗

I’ve come across a couple of posts about how people use LLMs for coding, so I thought I would share how I currently use AI in general–spanning office work, writing, and, of course, coding and a bit of fun. Disclaimer: Since I know most people won’t read my site disclaimer, I encourage you to do so, and go through the rest of the post with the...

Notes for January 6-12

12 Jan 2025 | original ↗

Work was a bit slow this week (it felt more like a half-week as people started popping back in), so I was able to keep a clear head and ended up doing a fair bit of writing for a change–bits of it will be surfacing in the next few hours or weeks. Going Paperless: After a conversation with friends, I installed paperless-ngx on...

Rodney Brooks' Predictions Scorecard

10 Jan 2025 | original ↗

Rodney Brooks’ latest predictions scorecard is a refreshing dose of reality in a tech landscape often clouded by hype. His candid assessment of self-driving cars, AI, and space travel highlights the gap between expectation and reality, especially where it relates to the AI hype cycle and a need to both temper expectations and have a more rational...
