Psychosomatic, Lobotomy, Saw

It's X, you'll need Y, I'll get Z
https://psy-lob-saw.blogspot.com/ (RSS)
How Inlined Code Makes For Confusing Profiles
11 Jul 2018 | original ↗

Inlining is a powerful and common optimization technique used by compilers. But inlining combines with other optimizations to transform your code to such an extent that other tooling becomes confused, namely profilers. Mommy, What is Inlining? Inlining is the mother of all optimizations (to quote C. Click), and mothers are awesome as we all...

What a difference a JVM makes?
5 Jan 2018 | original ↗

JDK 9 is out! But as a library writer, this means change, and change can go either way... Once we're satisfied that JCTools works with JDK9, what other observations can we make? Well, one of the main motivations for using JCTools is performance, and since the code has been predominantly tested and run with JDK8, is it even better with JDK9? is...

Java Flame Graphs Introduction: Fire For Everyone!
14 Feb 2017 | original ↗

FlameGraphs are superawesome. If you've never heard of FlameGraphs and want to dive straight in the deep end, you should run off and check out the many many good resources provided by Brendan Gregg in his one stop shop page here. This post will give a quick intro and some samples to get you started with collecting profiles for all JVMs everywhere....

What do Atomic*::lazySet/Atomic*FieldUpdater::lazySet/Unsafe::putOrdered* actually mean?
20 Dec 2016 | original ↗

Paved with well intended definitions it is. lazySet/putOrdered (or an ordered store) was added as a bit of a rushed/non-committal afterthought after the JMM was done, so its description is subject to many debates on the mailing lists, Stack Overflow and watercoolers the world over. This post merely tries to provide a clear definition with...

Linked Array Queues, part 2: SPSC Benchmarks
13 Dec 2016 | original ↗

JCTools has a bunch of benchmarks we use to stress test the queues and evaluate optimizations. These are of course not 'real' workloads, but serve to highlight imperfections and opportunities. While it is true that an optimization might work in a benchmark but not in the real world, a benchmark can work as a demonstration that there are at least...

Linked Array Queues, part 1: SPSC
24 Oct 2016 | original ↗

When considering concurrent queues people often go for either: An array backed queue (circular array/ring buffer etc.) A linked list queue The trade off in the Java world seems to be that array backed queues offer better throughput, but are always bounded and allocated upfront, and linked queues offer smaller footprint when empty, but worse...

4 Years Blog Anniversary
21 Oct 2016 | original ↗

This has been a very slow year blogging-wise... too much work, travel, and family. 4 years ago I aimed for 2 posts a month. I nearly hit it in 2013, failed by a little in 2014, and went on to not meeting it at all the last 2 years... Coincidentally I also have a delightfully busy 2.5 year old running around, I suspect she has something to do with how...

Fixing Coordinated Omission in Cassandra Stress
18 Jul 2016 | original ↗

I have it from reliable sources that incorrectly measuring latency can lead to losing one's job, loved ones, will to live and control of bowel movements. Out of my great love for fellow humans I have fixed the YCSB load generator to avoid the grave danger that is Coordinated Omission, this was met with...

The Pros and Cons of AsyncGetCallTrace Profilers
23 Jun 2016 | original ↗

So, going on from my whingy post on safepoint bias, where does one go to get their profiling kicks? One option would be to use an OpenJDK internal API call AsyncGetCallTrace to facilitate non-safepoint collection of stack traces. AsyncGetCallTrace is NOT official JVM API. It's not a comforting place to be for profiler writers, and was only...

GC 'Nepotism' And Linked Queues
21 Mar 2016 | original ↗

I've just run into this issue this week, and it's very cute, so this here is a summary. Akka has their own MPSC linked queue implementation, and this week it was suggested they'd swap to using JCTools. The context in which the suggestion was made was a recently closed bug with the mystery title: AbstractNodeQueue suffers from nepotism An Akka...

Why (Most) Sampling Java Profilers Are Fucking Terrible
24 Feb 2016 | original ↗

This post builds on the basis of a previous post on safepoints. If you've not read it you might feel lost and confused. If you have read it, and still feel lost and confused, and you are certain this feeling is related to the matter at hand (as opposed to an existential crisis), please ask away. So, now that we've established what safepoints are,...

Wait For It: Counted/Uncounted loops, Safepoints and OSR Compilation
1 Feb 2016 | original ↗

In this previous post about Safepoints I claimed that this here piece of code: Will get your JVM to hang and will not exit after 5 seconds, as one might expect from reading the code. The reason this happens, I claimed, is because the compiler considers for loops limited by an int to be counted, and will therefore not insert a safepoint poll into...

Safepoints: Meaning, Side Effects and Overheads
14 Dec 2015 | original ↗

I've been giving a couple of talks in the last year about profiling and about the JVM runtime/execution and in both I found myself coming up against the topic of Safepoints. Most people are blissfully ignorant of the existence of safepoints and of a room full of people I would typically find 1 or 2 developers who would have any familiarity with...

Expanding The Queue interface: Relaxed Queue Access
19 Oct 2015 | original ↗

Continuing from the previous post on the expansion of the Queue interface to support new ways of interacting with queues I have gone ahead and implemented relaxedOffer/Poll/Peek for the JCTools queues. This was pretty easy as the original algorithms all required a bit of mangling to support the strong semantic and relaxing it made life easier....

An extended Queue interface
19 Aug 2015 | original ↗

In my work on JCTools I have implemented a fair number of concurrent access queues. The Queue interface is part of the java.util package and offers a larger API surface area than I found core to concurrent message passing on the one hand, and still missing others. I'm hoping to solicit some discussion on some new methods, and see if I can be...

JMH perfasm explained: Looking at False Sharing on Conditional Inlining
27 Jul 2015 | original ↗

There is an edge that JMH (read the jmh resources page for other posts and related nuggets) has over other frameworks. That edge is so sharp you may well cut yourself using it, but given an infinite supply of bandages you should definitely use it :-) This edge is the ultimate profiler, the perfasm (pronounced PERF-AWESOME!, the exclamation mark...

Object.equals, primitive '==', and Arrays.equals ain't equal
22 May 2015 | original ↗

It is a fact well known to those who know it well that "==" != "equals()" the example usually going something like: String a = "Tom"; String b = new String(a); -> a != b but a.equals(b) It also seems reasonable therefore that: String[] arr1 = {a}; String[] arr2 = {b}; -> arr1 != arr2 but Arrays.equals(arr1, arr2) So far, so happy......
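The excerpt's opening claims can be checked with a few lines of Java. This is only a minimal sketch of the part quoted above, not the rest of the post; the class name `EqualsDemo` and the helper `sameReference` are made up for illustration.

```java
import java.util.Arrays;

public class EqualsDemo {
    // Hypothetical helper: true only when both arguments are the very same object.
    static boolean sameReference(Object x, Object y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "Tom";
        String b = new String(a); // a distinct object with the same characters

        // Reference comparison differs, value comparison agrees.
        System.out.println(sameReference(a, b)); // false
        System.out.println(a.equals(b));         // true

        String[] arr1 = {a};
        String[] arr2 = {b};

        // Two distinct arrays whose elements are equal by equals().
        System.out.println(sameReference(arr1, arr2)); // false
        System.out.println(Arrays.equals(arr1, arr2)); // true
    }
}
```

Arrays.equals on Object arrays compares element by element using equals(), which is why the two arrays compare equal even though neither the arrays nor their elements are the same objects.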

Degrees Of (Lock/Wait) Freedom
19 May 2015 | original ↗

Yo, Check the diagonal, three brothers gone... I've been throwing around the terms lock-free and wait-free in the context of the queues I've been writing, perhaps too casually. The definition I was using was the one from D. Vyukov's website (direct quote below): Wait-freedom: Wait-freedom means that each thread moves forward regardless of...

Porting Pitfalls: Turning D.Vyukov MPSC Wait-free queue into a j.u.Queue
22 Apr 2015 | original ↗

{This post is part of a long running series on lock free queues, check out the full index to get more context here} D. Vyukov is an awesome lock-free dude, and I often refer to his instructive and invaluable website 1024cores.net in my posts. On his site he covers lock free queue implementations and in particular a wait-free MPSC linked node...

On Arrays.fill, Intrinsics, SuperWord and SIMD instructions
13 Apr 2015 | original ↗

{This post turned rather long, if you get lazy feel free to skip to the summary} Let's start at the very beginning, a very good place to start... My very first post on this blog was a short rant on intrinsics, and how they ain't what they seem. In that post I made the following statement: "intrinsic functions show up as normal methods or native...

Correcting YCSB's Coordinated Omission problem
11 Mar 2015 | original ↗

YCSB is the Yahoo Cloud Serving Benchmark (also on wiki): a generic set of benchmarks setting out to compare different key-value store providers under a set of loads: The goal of the Yahoo Cloud Serving Benchmark (YCSB) project is to develop a framework and common set of workloads for evaluating the performance of...

HdrHistogram: A better latency capture method
16 Feb 2015 | original ↗

Some years back I was working on a latency sensitive application, and since latency was sensitive it was a requirement that we somehow capture latency both on a transaction/event level and in summary form. The event level latency was post processed from the audit logs we had to produce in any case, but the summary form was used for live system...

MPMC: The Multi Multi Queue vs. CLQ
19 Jan 2015 | original ↗

{This post is part of a long running series on lock free queues, check out the full index to get more context here} JCTools, which is my spandex side project for lock-free queues and other animals, contains a lovely gem of a queue called the MpmcArrayQueue. It is a port of an algorithm put forward by D. Vyukov (the lock free ninja) which I briefly...

The Escape of ArrayList.iterator()
1 Dec 2014 | original ↗

{This post assumes some familiarity with JMH. For more JMH related content start at the new and improved JMH Resources Page and branch out from there!} Escape Analysis was a much celebrated optimisation added to the JVM in Java 6u23: "Based on escape analysis, an object's escape state might be one of the following: GlobalEscape – An object...

The Mythical Modulo Mask
3 Nov 2014 | original ↗

It is an optimisation well known to those who know it well that % of power of 2 numbers can be replaced by a much cheaper AND operator to the same power of 2 - 1. E.g: x % 8 == x & (8 - 1) [4/11/2014 NOTE] This works because the binary representation for N which is a power of 2 will have a single bit set to 1 and (N-1) will have all the bits...
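The identity in the excerpt is easy to check in Java. What follows is a rough sketch rather than anything taken from the post: the class `ModuloMask` and the helper `maskMod` are invented names, and the restriction to non-negative `x` is an assumption stated here because Java's `%` keeps the sign of the dividend while the mask never returns a negative value.

```java
public class ModuloMask {
    // Hypothetical helper: remainder of x / n computed with a mask.
    // Assumes n is a positive power of 2; matches x % n only for x >= 0.
    static int maskMod(int x, int n) {
        if (n <= 0 || (n & (n - 1)) != 0) {
            throw new IllegalArgumentException("n must be a positive power of 2");
        }
        return x & (n - 1);
    }

    public static void main(String[] args) {
        // For non-negative x the two forms agree.
        for (int x = 0; x < 1000; x++) {
            if (x % 8 != maskMod(x, 8)) {
                throw new AssertionError("mismatch at " + x);
            }
        }
        System.out.println(13 % 8);         // 5
        System.out.println(maskMod(13, 8)); // 5

        // For negative x they differ: % keeps the sign, the mask does not.
        System.out.println(-3 % 8);         // -3
        System.out.println(maskMod(-3, 8)); // 5
    }
}
```

The power-of-2 check uses the same single-bit property the excerpt describes: `n & (n - 1)` is zero exactly when `n` has one bit set.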

↑ these items are from RSS. Visit the blog itself at https://psy-lob-saw.blogspot.com/ to find other articles and to appreciate the author's digital home.