An overview of end-to-end entity resolution for big data
More from the morning paper
Bias in word embeddings, Papakyriakopoulos et al., FAT*’20. There are no (stochastic) parrots in this paper, but it does examine bias in word embeddings, and how that bias carries forward into models that are trained using them. There are definitely some dangers to be aware of here, but also some cause for hope, as we see that bias can be...
Seeing is believing: a client-centric specification of database isolation, Crooks et al., PODC’17. Last week we looked at Elle, which detects isolation anomalies by setting things up so that the inner workings of the database, in the form of the direct serialization graph (DSG), can be externally recovered. Today’s paper choice, ‘Seeing is...
Elle: inferring isolation anomalies from experimental observations, Kingsbury & Alvaro, VLDB’20. Is there anything more terrifying, and at the same time more useful, to a database vendor than Kyle Kingsbury’s Jepsen? As the abstract to today’s paper choice wryly puts it, “experience shows that many databases do not provide the isolation guarantees...
Achieving 100 Gbps intrusion prevention on a single server, Zhao et al., OSDI’20. Papers-we-love is hosting a mini-event this Wednesday (18th) where I’ll be leading a panel discussion including one of the authors of today’s paper choice: Justine Sherry. Please do join us if you can. We always want more! This stems from a combination of Jevons’...