<jswenson[m]>
Looks like most of the nashorn stuff has a more complex class name (`jdk.nashorn.internal.scripts.Script$181$^eval_`) while the jruby stuff is typically simpler.
<jswenson[m]>
Top few without stripping the end off the class names
<headius[m]>
yeah they are probably encoding it more
<jswenson[m]>
for reference the OQL for the second one was : `SELECT s.member.clazz.getName() FROM java.lang.invoke.DirectMethodHandle s`
<jswenson[m]>
I'm using eclipse mat, not sure if that matters for these.
<headius[m]>
it does, the visualvm oql sucks
<headius[m]>
I am realizing that now
<jswenson[m]>
had to use bash to group and sort
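A hedged sketch of that kind of pipeline, assuming the class names from the OQL result were exported to a plain-text file first; `classnames.txt` is just a placeholder name:

```bash
# group identical class names and sort by frequency, most common first
sort classnames.txt | uniq -c | sort -rn | head -20
```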
<headius[m]>
that query is a syntax error in visualvm, lame
<headius[m]>
ok so basically we could filter by `member.clazz.getName =~ /jruby/`
<headius[m]>
however that is done in MAT OQL
<headius[m]>
there does seem to be a lot of duplication of DMH just in your results though
<jswenson[m]>
can do this: `SELECT s.member.clazz.getName() FROM java.lang.invoke.DirectMethodHandle s WHERE s.member.clazz.getName().startsWith("org.jruby")` which yields 39k results (of the 66k total)
<headius[m]>
ok that's pretty good
<headius[m]>
39k direct handles to JRuby classes seems like a lot to start
<jswenson[m]>
`SELECT s.member.clazz.getName(), s.member.name.toString() FROM java.lang.invoke.DirectMethodHandle s WHERE s.member.clazz.getName().startsWith("org.jruby")`
<headius[m]>
these are the kinds of sites I would expect to see even with indy off because these are literals and constants
<jswenson[m]>
Yeah I think we must not have any indy on
<jswenson[m]>
as we're not explicitly adding it anywhere and only disabling indy.yield
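For reference, a sketch of how such switches are typically passed to JRuby; the property names here are assumptions based on the conversation, so check `jruby --properties` on your JRuby version before relying on them:

```bash
# main indy switch (off by default on JRuby 9.2):
#   -Xcompile.invokedynamic=true
# the yield-site option mentioned above (assumed name), disabled:
export JRUBY_OPTS='-Xinvokedynamic.yield=false'
```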
<headius[m]>
some of these are being created by JVM and would be reduced in newer versions
<headius[m]>
like those references to fstring in Bootstrap should be almost entirely in static data for jitted Ruby code and could use the same entry, but probably don't in Java 8
<headius[m]>
ok so this is a good report on direct handles
<headius[m]>
it doesn't look like a symptom of a problem at the moment
<headius[m]>
maybe we can look at LF trees that are rooted by nashorn somehow
<jswenson[m]>
I can query the 113k lambda forms, but I'm not sure where I can go from there.
<headius[m]>
jswenson: well we are getting closer
<headius[m]>
basically we need a report of where all the lambda forms are and what is holding onto them to see which ones are JRuby and could be reduced
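One possible starting point in Eclipse MAT OQL (just a sketch; the "who is holding them" part then comes from MAT's views on the result, e.g. "Merge Shortest Paths to GC Roots" or "Immediate Dominators", rather than from the query itself):

```
SELECT * FROM INSTANCEOF java.lang.invoke.LambdaForm lf
```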
<jswenson[m]>
let me see if I can wrangle mat / OQL into doing what I want
<jswenson[m]>
not sure where on a lambda form (or around a lambda form) it is hiding the information about why it was created / what created it
<headius[m]>
yeah I am trying to figure that out... seems like it is not in the object but would have to be based on who is referencing it
<headius[m]>
that is exactly what we want though, who created it and why
<jswenson[m]>
Darn.
<headius[m]>
I may be able to pull in a method handle expert
<headius[m]>
I pinged Vladimir Ivanov from Oracle, who has done a large part of the work to reduce and reuse lambda forms
<headius[m]>
he will be able to give us some pointers on investigating and perhaps tuning this stuff
<headius[m]>
I have to drop off for the night soon so I can't help continue but you can keep poking around with MAT and we will see what Vladimir comes back with
<jswenson[m]>
👍️
<jswenson[m]>
thanks for the help!
<headius[m]>
I am looking to see if it is possible to turn more indy off, but that might only be on master
<headius[m]>
on master we have worked toward a completely indy-free mode to use for native compiles, but that is not available in the 9.2 line
<headius[m]>
from what I see, though, those are not unexpected DMH to see for normal mode, and those numbers don't seem to indicate a problem
<headius[m]>
jswenson: yeah thanks for your patience... I want to help you but also figure out if we can do more to reduce our metaspace load
<headius[m]>
seems like something is still corrupting the shared maven cache on travis, but removing it would add a lot of time and noise to the builds
<Freaky>
it's certainly harder to reproduce with this cut down environment
<Freaky>
it's like it's sensitive to just the amount of code that happens to be floating about
<Freaky>
meh
<Freaky>
let's try jdb again, now I can reproduce it fairly reliably
<headius[m]>
yeah need to look at the incoming string at the point it raises the error, checking whether it has a weird encoding or bad offsets or something
<jswenson[m]>
@kalenp I wonder if we're doing any massive json parsing here.
<kalenp[m]>
Some of our known instances are actually on fairly small json objects (2 keys and short values), but perhaps the large json just exercises it intensely enough to show up in a reasonable amount of time
<jswenson[m]>
From the looks of the repro, simply loading the large JSON makes it so a smaller load will fail later.
<jswenson[m]>
Loads the big bork.json (around 2.5MB) then loads a simple json payload in a loop.
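Roughly, the repro as described would look like this in Ruby (the file name, small payload, and loop are placeholders, not the actual script):

```ruby
require 'json'

# parse the large (~2.5MB) document once
JSON.parse(File.read('bork.json'))

# then parse a small payload in a loop; per the report it is these
# later small parses that eventually start failing
small = '{"id": 1, "name": "example"}'
loop do
  JSON.parse(small)
end
```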
<jswenson[m]>
Thus far I cannot repro, but I haven't tried on jdk15 yet
<kalenp[m]>
well, we certainly do serialize large json objects. will have to think about de-serializing large json
<kalenp[m]>
interesting, I'm able to get the repro on java 16 (no 15 available in my current environment), but on java 11 it's running just fine for several minutes. So either we have two separate issues, or the particular details of how to trigger the bug have changed
<headius[m]>
could be different heuristics for when and how it jits that aren't being hit with this example on 11
<Freaky>
yeah, I've not seen it <15
<kalenp[m]>
on further reflection, we definitely do parse some large json objects
<Freaky>
the smaller the initial JSON the less reliably I can reproduce it
<Freaky>
maybe it has different sensitivities on different versions
<headius[m]>
I'm wrapping up another investigation and then I will try to repro locally too
<Freaky>
kalenp[m]: any JVM tuning?
<headius[m]>
ah that could make a difference
<Freaky>
JRUBY_OPTS='-J-XX:+UnlockExperimentalVMOptions -J-XX:+UseShenandoahGC' seems to stop it as well
<Freaky>
also SerialGC and ParallelGC from the look of it
<headius[m]>
grr G1
<headius[m]>
I would not be surprised if this ends up being a G1 bug
<jswenson[m]>
We're definitely using G1
<headius[m]>
worth throwing another GC at it if you can afford to try it
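For reference, the equivalent flags for the other collectors mentioned above, in the same style as the Shenandoah line earlier:

```bash
JRUBY_OPTS='-J-XX:+UseSerialGC'    # serial collector
JRUBY_OPTS='-J-XX:+UseParallelGC'  # parallel collector
```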
<headius[m]>
jswenson: you probably didn't see this on Twitter but I got some assistance from Nashorn folks and there's a way to turn off its aggressive optimizations