> Fray is a concurrency testing tool for Java that can help you find and debug tricky race conditions that manifest as assertion violations, run-time exceptions, or deadlocks. It performs controlled concurrency testing using state-of-the-art techniques such as probabilistic concurrency testing or partial order sampling.
> Fray also provides deterministic replay capabilities for debugging specific thread interleavings. Fray is designed to be easy to use and can be integrated into existing testing frameworks.
I wish I had this 20 years ago.
Neat to see sleep calls artificially introduced to reliably recreate the deadlock. [0]
Looks like fixing the underlying bug is still in progress [1]; I wonder how many lines of code it will take.
[0] https://github.com/aoli-al/jdk/commit/625420ba82d2b0ebac24d9...
[1] https://bugs.openjdk.org/browse/JDK-8358601
Bugs like these are pervasive in languages like Java that give no protection against even the most basic causes of race conditions. It’s nearly impossible to write reliable concurrent code. Fray only helps if you actually use it to test everything, which is not realistic. I am convinced, after my year-long struggle last year to get a highly concurrent Java (actually Kotlin, but Kotlin does not add much to help) module working at my job, that where concurrency is required we should only use languages that provide safe concurrency models, like Erlang/Elixir and Rust, or actor-like ones like Dart and JavaScript.
What is a safe concurrency model? Actors can trivially deadlock/livelock; they are no panacea at all, and are trivial to recreate (there are a million Java implementations).
You make it sound like there is some modern development superseding what Java has, but that's absolutely not the case.
Even Rust is pretty much just a no-overhead `synchronized` on top of an object. It is necessary there, because data races are a fundamental memory safety issue, but Java is immune to that (it has "safe" data races). Logical bugs can trivially happen in either case. As an easy example, even if all your fields are atomically mutated, the whole object may not make sense in certain states, like a date of February the 31st. Rust does nothing against such bugs, and concurrent data structures offer ample grounds for realistic examples of the above.
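For what it's worth, the February 31st point can be sketched in a few lines of Java. Everything here is made up for illustration (`DateInvariantDemo`, `FieldwiseDate`, `MonthDay`), and leap years are ignored to keep it short. The first half shows two individually atomic fields ending up in a combined state that makes no sense; the second half shows one possible remedy, an immutable value that is validated once and swapped as a whole through an `AtomicReference`.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

class DateInvariantDemo {
    // Days per month, ignoring leap years (assumption made to keep the sketch short).
    static final int[] DAYS_IN_MONTH = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

    static boolean isValid(int month, int day) {
        return month >= 1 && month <= 12 && day >= 1 && day <= DAYS_IN_MONTH[month - 1];
    }

    // Each field is individually atomic, but nothing ties the two together.
    static final class FieldwiseDate {
        final AtomicInteger month = new AtomicInteger(1);
        final AtomicInteger day = new AtomicInteger(1);
    }

    // Immutable snapshot: validated at construction, replaced as a whole.
    static final class MonthDay {
        final int month;
        final int day;

        MonthDay(int month, int day) {
            if (!isValid(month, day)) {
                throw new IllegalArgumentException("invalid date " + month + "/" + day);
            }
            this.month = month;
            this.day = day;
        }
    }

    public static void main(String[] args) {
        FieldwiseDate d = new FieldwiseDate();
        d.day.set(31);  // January 31: fine
        d.month.set(2); // each write was atomic, yet the pair is now February 31
        System.out.println("field-wise date valid? " + isValid(d.month.get(), d.day.get()));

        AtomicReference<MonthDay> ref = new AtomicReference<>(new MonthDay(1, 31));
        try {
            ref.set(new MonthDay(2, 31)); // the constructor rejects it before it is published
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        MonthDay current = ref.get();
        System.out.println("reference still holds " + current.month + "/" + current.day);
    }
}
```

Note that no second thread is even needed to reach the bad state in the field-wise version; concurrency only makes it easier for a reader to observe the half-updated pair.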
> What is a safe concurrency model?
STM.
The terms 'atomic', 'thread-safe', and 'concurrent' collections are thrown around too loosely for application programmers IMO, for exactly your example above.
In other scenarios, 'atomics' refer to the ability to do one thing atomically. With STM, you can do two or more things atomically.
Likewise with 'thread-safe'. Thread-safe seems to indicate that the object won't break internally in the presence of multiple threads, which is too low of a bar to clear if your goal is to write an actually thread-safe application out of so-called 'thread-safe' parts.
STM has actual concurrent data structures, where you can write straight-line code like 'if this collection has at least 5 elements, then pop one'.
I don't think the Feb 31 example is that fair though, because if you want to construct a representation of Feb 31, who's going to stop you? And if you don't want to, plain old static types are the solution.
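To make the "at least 5 elements, then pop one" case concrete: Java has no built-in STM, so the sketch below uses a plain lock as a stand-in, and `GuardedStack` and `popIfAtLeast` are hypothetical names. The point is only that the size check and the pop must happen as one step. With a typical 'thread-safe' collection, calling `size()` and then `pop()` is two separate atomic operations, and another thread can slip in between them.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;

// Hypothetical wrapper: every compound operation is a single synchronized method.
class GuardedStack<T> {
    private final Deque<T> items = new ArrayDeque<>();

    synchronized void push(T item) {
        items.push(item); // ArrayDeque rejects null, so Optional.of below is safe
    }

    // Check and pop under one lock: no other thread can change the size in between.
    synchronized Optional<T> popIfAtLeast(int minSize) {
        if (items.size() >= minSize) {
            return Optional.of(items.pop());
        }
        return Optional.empty();
    }

    synchronized int size() {
        return items.size();
    }
}
```

Unlike an STM transaction, this does not compose: an operation that spans two such stacks would need its own locking scheme, which is part of what the comment above is getting at.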
I couldn't give a better reply than this author:
https://joeduffyblog.com/2010/01/03/a-brief-retrospective-on...
Also, a phenomenal piece of writing (as are his other posts) on the whole concurrency landscape:
> A wondrous property of concurrent programming is the sheer number and diversity of programming models developed over the years. Actors, message-passing, data parallel, auto-vectorization, …; the titles roll off the tongue, and yet none dominates and pervades. In fact, concurrent programming is a multi-dimensional space with a vast number of worthy points along its many axes.
Race conditions are generally solved with algorithms, not the language. For example, defining a total ordering on locks and only acquiring locks in that order to prevent deadlock.
I guess there are language features like coroutines/cooperative multitasking that make certain algorithms possible, but nothing about Java prevents implementing sound concurrency algorithms in general.
Without reworking the code, all these checks of the executor and queue state, plus the queue manipulations, have to be under a mutex, and that is just a few lines.
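The lock-ordering idea from the comment above can be sketched like this in Java. `Account`, `Transfers`, and the use of a numeric `id` as the ordering key are assumptions for illustration. Both locks are always taken in ascending `id` order, regardless of the direction of the transfer, so two threads can never each hold one lock while waiting for the other.

```java
import java.util.concurrent.locks.ReentrantLock;

class Account {
    final long id; // unique id, used as the global lock order (assumption)
    final ReentrantLock lock = new ReentrantLock();
    long balance;  // only read or written while holding `lock`

    Account(long id, long balance) {
        this.id = id;
        this.balance = balance;
    }
}

class Transfers {
    static void transfer(Account from, Account to, long amount) {
        if (from.id == to.id) {
            return; // same account: nothing to move
        }
        // Always lock the account with the smaller id first.
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                if (from.balance < amount) {
                    throw new IllegalStateException("insufficient funds");
                }
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    // Two threads transfer in opposite directions; returns the combined balance afterwards.
    static long totalAfterOpposingTransfers(int rounds) {
        Account a = new Account(1, 1_000_000);
        Account b = new Account(2, 1_000_000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(b, a, 1); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return a.balance + b.balance; // safe to read: join() establishes happens-before
    }

    public static void main(String[] args) {
        System.out.println("total = " + totalAfterOpposingTransfers(10_000));
    }
}
```

If `transfer` instead locked `from` first and `to` second, the two threads in `totalAfterOpposingTransfers` could deadlock, which is the kind of interleaving a tool like Fray is meant to surface.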
Maybe it is just me, but I can't read the text in the code because the font is nearly white on white.
The light mode is fine, but you're right, the dark mode is truly awful; the code blocks are unreadable.
edit: for some reason the author overrode the background color on code blocks via an inline style of
background-color:#f0f0f0
from
var(--code-background-color) = #f2f2f2
to make the background nigh imperceptibly darker, but then, while the stylesheet properly switches it to #01242e in dark mode, the inline override stays and blows it to bits.
Not that it's amazing if you remove the inline style, on account of operators and method names being styled pretty dark (#666 and #4070a0).
Thanks for pointing it out! Just did a quick fix using Claude :)
On mobile (Safari), the lines in the code blocks have different font sizes. They also have different fonts. Some are like 3-4x the size of other lines. No idea what could be going wrong, but it does unfortunately make the code blocks difficult to follow along.
should be fixed as well :)
any chance you can make light/dark mode switch a UI button?
On desktop I’d suggest installing an extension that adds a toggle (they exist for Firefox and Chrome at least): adding a toggle manually is a bit of a chore, especially if the CSS system you use does not build that in.
You appear to be one of the authors, so forgive me asking a technical question.
In Section 5.4 of the technical paper, you mention that Kotlin has non-determinism in the scheduler. Where does this non-determinism come from?
It seems unclear to me why Kotlin would inject randomness here, and I suspect that you may actually have identified a false positive in the Lincheck DSL.
The "randomness" comes from Kotlin coroutines and user-space scheduling. For example, Kotlin runs multiple user-space threads on the same physical thread. Fray only reschedules physical threads. So when the applications under test use coroutines/virtual threads, Fray cannot generate certain thread interleavings. Also, it cannot deterministically replay, because the thread execution is no longer controlled by Fray.
In our paper, we found that Fray suffers from false negatives because of this missing feature. Lincheck supports Kotlin coroutines so it finds one more bug than Fray in LC-Bench.
We didn't make any claims about false positives in Lincheck.
Impressive! Can't wait to try Fray out at work.