<headius[m]>
so ko1 did the same thing I suggested yesterday and moved the core of monitor.rb to native to avoid the interrupt handling
<headius[m]>
@dav
<headius[m]>
daveg_lookout: funny thing
<daveg_lookout[m]>
Doesn't that still have a race in #mon_synchronize since @mon_data.enter is outside the begin/rescue block?
<headius[m]>
that is ok... you want the monitor/lock entry to be outside the begu
<headius[m]>
oops
<headius[m]>
you want it outside because if there is an error locking it you don't want to try to unlock it
<daveg_lookout[m]>
ah, so if enter fails, you don't exit
<headius[m]>
right
<headius[m]>
new keyboard who dis
<headius[m]>
but CRuby, like JRuby, will not check thread interrupts at all in native code unless done explicitly, so this avoids the interrupt issue altogether
<headius[m]>
to me it is an admission that the handle_interrupt pattern was a bad way to fix that problem, but yolo
<daveg_lookout[m]>
Out of curiosity, how is the race I mentioned avoided? Is there no way to be interrupted before the begin block is invoked? I would've assumed you needed a entered flag, only set after mon_data.enter succeeded.
<daveg_lookout[m]>
One question on your change -- previously, SizedQueue.setup called defineClassUnder with runtime.getClass("Queue") and now it's passing in Object.class -- is that important? Not familiar with defineClassUnder...
<daveg_lookout[m]>
ok, thanks for the explanation, trying to make sure i understand
<headius[m]>
oops that is not right
<headius[m]>
I hope that fails CI!
<headius[m]>
yeah it should extend Queue not Object... copypasta bug
<daveg_lookout[m]>
other than that, lgtm, not that I'm a fully qualified reviewer
<headius[m]>
yeah thanks for that!
<headius[m]>
oh good, it failed spectacularly
<headius[m]>
fixed and repushed
subbu is now known as subbu|afk
subbu|afk is now known as subbu
subbu is now known as subbu|lunch
<headius[m]>
enebo: so thoughts on that native monitor PR?
<enebo[m]>
seems fine to me
<enebo[m]>
hahah what is the last line you see?
<headius[m]>
"seems fine to me"
<enebo[m]>
ok...I keep seeing my last line as "so it may just be there is no magic combo"
<headius[m]>
that's weird
<enebo[m]>
It keeps moving to the bottom so it must think it is in the future
<enebo[m]>
I wish it could have said "in bed" at least it would give mileage
<enebo[m]>
I think I will reload this page :)
<headius[m]>
there is the minor compat issue of Monitor no longer mixing in MonitorMixin but that relationship was weird before and this has already shipped in CRuby 2.7
<headius[m]>
ko1 basically moved the guts of MonitorMixin to Monitor as native code and flipped the relationship between the two
<enebo[m]>
is it gone?...hmm
<enebo[m]>
yep
<enebo[m]>
but that also means MRI will hit issues from that if there is an issue
<headius[m]>
right
<enebo[m]>
we will just see it in pre-2.7 release
<headius[m]>
I based this on the 3.0 codebase so clearly they shipped it again
<enebo[m]>
ok that gives some confidence that even if someone does notice the writing has been on the wall for more than a year
<headius[m]>
right
<headius[m]>
and nobody will notice anyway
<headius[m]>
would have to be doing something like monitor.kind_of? MonitorMixin anyway because the new Monitor still exports the same API
<headius[m]>
I wonder if there is a race in this spec
<headius[m]>
aha I think there is
<headius[m]>
enebo: look at this spec, io/close_spec.rb around line 57
<headius[m]>
basically it is trying to wait until the thread is blocking on IO before closing the related stream, so that the thread gets interrupted
<headius[m]>
but the race is determining that... we mark the thread as "stop" immediately before making the blocking read call
<headius[m]>
but if this logic closes it after we mark "stop" but before we try to read the thread itself will raise the normal error
<headius[m]>
not the "another thread" error
<headius[m]>
we have had to tag other tests like this because it is not possible to know that the thread is actually blocking on the IO yet
<enebo[m]>
so is the underlying real problem is stop means two things?
<enebo[m]>
should the thread have a state like init
<enebo[m]>
I realize I am not asking something we can just change
<headius[m]>
so when I have looked into this in the past I considered using the JDK thread state, but the race is still basically the same
<headius[m]>
essentially you can't mark the thread as stopped AND start the blocking call as a single atomic operation
<headius[m]>
so inspecting the thread state to know if it is blocking on IO will always be racy
<enebo[m]>
but what about my question
<enebo[m]>
it is stopped before the read and after it is blocking?
<headius[m]>
I guess the underlying problem is that stop means "I am going to do something that stops me"
<enebo[m]>
Oh wait
<headius[m]>
but you don't know whether it has started doing that thing
<enebo[m]>
it stops the thread to start the read but that is the race internally and we can hit between
<headius[m]>
right
<enebo[m]>
without starting another thread we cannot stop after we start the read :)
<headius[m]>
so depending on when this spec sees that the thread is stopped, it might close the IO early and not trigger the cross-thread close error
<headius[m]>
this might even be a race on CRuby but they can get closer to making stop + read look atomic to other threads
<enebo[m]>
yeah so we have two problems though. The first is the test but the second is how to do you even write this?
<headius[m]>
but still they have to release the gvl immediately before the read, so there is a potential for it to happen
<headius[m]>
my assessmen
<headius[m]>
my assessment years ago when I looked into this is that it is not possible to write this
<enebo[m]>
you could pass n times and make a very loose assumption ruby is slow enough that the read will actually start
<headius[m]>
not reliably
<headius[m]>
yeah that is the best you can do, maybe throw a delay in there
<enebo[m]>
So count != 3 && conditions
<headius[m]>
right
<headius[m]>
I think I will file an issue for this
<headius[m]>
TR has it tagged too fwiw but I believe they should have the same issue
<enebo[m]>
yeah it makes sense
<enebo[m]>
MRI might just end up working out but the local set and the read are not a transaction
<enebo[m]>
so I could see that theoretically an internal change could end up causing MRI to show this behavior in the future
<headius[m]>
yeah exactly
<enebo[m]>
A change they would quickly undo :)
<headius[m]>
even if release gvl was immediately before calling read there is a race
<headius[m]>
because in that moment another thread could start running and see "stop" state
<enebo[m]>
too bad there is not some read(start_marker: read_started, ...)
<enebo[m]>
Ultimately though atomically writing a value will have some gap for the read call itself if we consider the status of the system call to be what we are trying to see
<headius[m]>
regardless of what we do in the implementation there is still the fundamental problem that you eventually have to call read at the kernel level
<enebo[m]>
yeah
<headius[m]>
so any gap between when you set up state or call callbacks and that read will be a race
<enebo[m]>
that was what I meant by system call
<headius[m]>
this is why I figured there is actually no way to do this
<headius[m]>
right
<headius[m]>
ok
<enebo[m]>
we can make the race a lot smaller though
<headius[m]>
maybe this can be replaced with some spec that tries it in 100 threads and only requires that one of them have the message
<enebo[m]>
heh
<enebo[m]>
I mean it may work
<enebo[m]>
I know people hate delays and those are not perfect either
<enebo[m]>
another obvious fix to the test is to have the read run twice and work the first time
<enebo[m]>
ignore that
<headius[m]>
I would love to hear about a better way but I gave up looking for a solution last time I did this
<headius[m]>
I hate to say never but there may never be a way to test this simply
<enebo[m]>
I was going to say you record the first read and use that as part of the test which would also know going_to_read would have to have been set
<headius[m]>
it just makes that first read another race condition unfortunately