<fzakaria1>
Want me to file a bug? A bit of a headscratcher for me right now. Clearly `java/sql/Date` exists, since in an earlier statement you can see I make one :)
<headius[m]>
enebo: I removed the permgen flag from the release doc
<headius[m]>
I assume you weren't doing that anymore
<enebo[m]>
headius: I saw that and yeah I stopped using it
<headius[m]>
I have this build rework almost done... the one failure made me check how CRuby does this (it does the installs during `make install`) and we'll match after I finish this
<headius[m]>
enebo: at some point we need to clean break and do a complete reformat and cleanup of the pom.rb files
<headius[m]>
I have been reluctant to do anything drastic since we want to be able to merge
<enebo[m]>
yeah
<enebo[m]>
I have wished there was "hinting" in merging for particular files
<enebo[m]>
like when you merge a pom.xml file it seems like git could have some extra info somehow
<enebo[m]>
but I guess if you PR a branch to change a pom then you would not want it
<headius[m]>
wiping out the poms on master would be nice too, since that will be one less thing to ignore when merging
<headius[m]>
all commits should be in for that dist gem thing now... I am doing another release deploy to get a final diff from 9.2.16.0 to 9.2.17.0
<enebo[m]>
cool
<headius[m]>
enebo: you already check out a clean repo before releasing but that is a requirement now... in order to have all bin files included without expanding this whitelist it just includes everything
<headius[m]>
your mvn deploy should run against a clean clone... installing rspec or whatever after that finishes (so you can run that rake task) is fine
<headius[m]>
if that makes you nervous maybe you can help me figure out a way to keep it clean without easily-outdated whitelists
<enebo[m]>
oh hmm after running deploy I do run rake post_process_artifacts
<enebo[m]>
Which should not require rspec
<enebo[m]>
but I have not run against a dirty workspace in like a decade
<enebo[m]>
I think when we had an ant build it did not matter at all
<headius[m]>
this is done but one change introduced by including all bin/* stuff is that the bin/ruby symlink to bin/jruby is now included
<headius[m]>
it was explicitly excluded before... and this should be ok on Windows since "ruby" as a filename does not look like an executable there, but I wanted to get your input
<headius[m]>
will request a review
<enebo[m]>
headius: I just read through it
<enebo[m]>
My only ask would be what is different from a ls/listing from 9.2.16.0 and also that the shebangs are agnostic to who is building
<enebo[m]>
I assume the latter is true so I am just saying it because it came to mind when I read that change
<headius[m]>
the change to always do env shebangs should avoid the latter from now on
<headius[m]>
I have also proposed to RG that they just make env shebang the default and so far they agree
<headius[m]>
but that is future work
<headius[m]>
I did provide a diff listing in my last comment
<headius[m]>
and explained the differences
<enebo[m]>
ah I missed that although I remember reading the top
<enebo[m]>
so out of that then bin/ruby is the only ? in my head really
<enebo[m]>
"This may need testing in the zip file, since the symlink will obviously not work on Windows."
<enebo[m]>
I believe this won't work anyways since it is not an executable or bat file
<headius[m]>
yeah that occurred to me later
<headius[m]>
I do not know why it was excluded
<enebo[m]>
ruby?
<headius[m]>
yeah
<headius[m]>
I do not know how much of a risk it is for it to be in the dist now
<enebo[m]>
I don't either but I wonder if it was someone who wanted ruby to be whatever c ruby they were using
<headius[m]>
yeah but we know that doesn't work anyway
<enebo[m]>
I guess for 9.2 that one is iffy to me but for 9.3 I am gung ho for it
<headius[m]>
so it may be an old requirement from back when we still believed you could have two Rubies in path
<headius[m]>
I could explicitly exclude it for 9.2
<enebo[m]>
I just think it would be removing a variable
<enebo[m]>
I am pretty curious to see if it causes problems but I want to be as done with 9.2 as we can be
<enebo[m]>
not that we won't still fix issues for a while
<headius[m]>
yeah better safe than sorry
<enebo[m]>
headius: do you know of a good writeup on how this works: RubyClass stringClass = runtime.defineClass("String", runtime.getObject(), RubyString::newAllocatedString);
<headius[m]>
lol writeup
<enebo[m]>
A static method which matches the interface methods definition
<headius[m]>
oh you mean the method reference?
<enebo[m]>
An explanation is fine but is this that obscure
<headius[m]>
yeah that is all it is
<headius[m]>
this is not really obscure and came along with lambdas in 8
<enebo[m]>
but how does it implement it
<headius[m]>
Method References
<enebo[m]>
It has to resolve to a type and then it makes MHs to hook it up?
<headius[m]>
it is mostly shorthand for (r, c) -> RubyString.newAllocatedString(r, c)
<headius[m]>
but it may route the resulting interface impl directly to the target method more efficiently than a little class, I am not sure
<enebo[m]>
ok so it will generate some handles which internally will replace the need for a type
<headius[m]>
yeah
<headius[m]>
it will be an invokedynamic to build a tiny interface impl that just calls that method
<headius[m]>
and from then on just use the generated code
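The three forms compared above (inner class, lambda, method reference) can be sketched side by side. This is a minimal illustration, not JRuby code: the `Allocator` interface and `newAllocated` method are hypothetical stand-ins for `ObjectAllocator` and `RubyString.newAllocatedString`. The anonymous inner class gets its own `.class` file, while javac compiles the lambda and the method reference to an `invokedynamic` call site that builds the interface implementation at first use.

```java
class MethodRefSketch {
    // Hypothetical stand-in for an allocator-style functional interface.
    @FunctionalInterface
    interface Allocator {
        Object allocate(String runtime, String klass);
    }

    // Static method whose signature matches Allocator.allocate.
    static Object newAllocated(String runtime, String klass) {
        return klass + "@" + runtime;
    }

    // Pre-Java 8 style: compiles to a separate MethodRefSketch$1.class.
    static Allocator viaInnerClass() {
        return new Allocator() {
            @Override
            public Object allocate(String runtime, String klass) {
                return newAllocated(runtime, klass);
            }
        };
    }

    // Lambda: the implementation class is generated at runtime via invokedynamic.
    static Allocator viaLambda() {
        return (r, c) -> newAllocated(r, c);
    }

    // Method reference: mostly shorthand for the lambda above.
    static Allocator viaMethodRef() {
        return MethodRefSketch::newAllocated;
    }

    public static void main(String[] args) {
        System.out.println(viaInnerClass().allocate("rt", "String"));
        System.out.println(viaLambda().allocate("rt", "String"));
        System.out.println(viaMethodRef().allocate("rt", "String"));
    }
}
```

All three calls return the same value; the difference is only in how and when the implementing class comes into existence.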
<enebo[m]>
I brought up a year ago that I should change the parser to use lambdas as an experiment
<enebo[m]>
So I have been thinking about that
<lopex>
it's also called eta reduction
<enebo[m]>
I am a bit worried about initial warmup but I think past experience tells me I need to just try it
<enebo[m]>
I think the neat idea of this is I can hoist some variables and not pass them (although that may not be the win I think it is)
<headius[m]>
warmup is a valid concern
<headius[m]>
well, startup
<headius[m]>
but consider this: it had to load the .class before anyway
<enebo[m]>
yeah startup but mostly how bad cold perf is
<headius[m]>
so unknown how much more or less overhead it is replacing an inner class with one of these
<enebo[m]>
yeah :)
sagax has joined #jruby
<enebo[m]>
It is actually pretty simple to try and it would be removing hundreds of types
<enebo[m]>
I remember when I made this megamorphic change it was only to reduce main method in LALR to compile to native and was really surprised how quick the PIC was
<enebo[m]>
So I never rule out what will make a positive difference to the parser
<enebo[m]>
Probably the main flaw of the parser now is the AST is about 600,000,000 objects
<enebo[m]>
I am exaggerating but some slab allocated semi protobuf-like allocation would be better
<headius[m]>
yeah so optimization wise this should be no worse and probably better
<headius[m]>
it still has to generate a type to implement the ObjectAllocator interface but it will be tiny and loaded anonymously
<headius[m]>
so loading that is probably less overhead than loading our inner class, but generating it may make the difference back
<enebo[m]>
hey since I brought it up I will just try it. I am a little too into pondering this kwargs rest fix and it should be reasonably easy to change this
<enebo[m]>
but if it is faster it may make it up again
<headius[m]>
FWIW I believe one thing the CDS stuff is supposed to do over time is eagerly generate and link in these statically-resolvable lambdas
<enebo[m]>
like gem list is zillions of evals (or it is in my workspace)
<headius[m]>
"supposed to do"
<headius[m]>
I am not sure if or when
<enebo[m]>
ah but if it is ok now then it potentially will be better later
<headius[m]>
I do think overall this is a win because we ship fewer classes and methods to verify
<headius[m]>
and we can't avoid verification on 9+ anymore, so...
<enebo[m]>
I have not recently timed later vs older for something really cold like gem list
<enebo[m]>
I also mostly just see default GC in the way until I remember to use parallel
<headius[m]>
yeah this would be minute
<headius[m]>
reduced total classes by around 100, so there will be 100 or fewer of these additional now
<enebo[m]>
yeah so if this works we will reduce types by 700
<headius[m]>
and if we AOT with GraalVM this compiles fine anyway
<headius[m]>
oh I see what you are saying
<headius[m]>
yeah that would be hot
<enebo[m]>
I am including your reduction in that but for 3.0 it will reduce probably by 1000
<headius[m]>
that would be a good test because those loads are pure overhead at startup
<enebo[m]>
I am sure we will add at least 300 types for just the pattern matching feature
<enebo[m]>
It replicates quite a bit of the grammar
<headius[m]>
oh hey there is another possible thing we might be able to improve: generating those giant byte arrays
<enebo[m]>
oh per source file?
<headius[m]>
if you can twiddle it to generate them as a string we can just getBytes rather than emitting all those instructions to load it
<headius[m]>
I mean the parser productions
<headius[m]>
if this were bytecode it would be trivial... just stuff the bytes into a char and put that in constant pool
<enebo[m]>
I still don't know what you mean
<headius[m]>
the stuff you have to split in your post-processing
<headius[m]>
the static inits that are too big
<headius[m]>
if the numeric sequence were in a string in constant pool there would be no need for that
<enebo[m]>
oh so you mean instead of a short[] which reassembles just make a string and then unpack
<headius[m]>
right
<headius[m]>
sorry I just started thinking about other things that could be simplified in there
<enebo[m]>
ok so I need to make getbytes[] push two bytes into a short value
<enebo[m]>
which is trivial if I can assume layout of a short and do something unsafe
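A rough sketch of the string-packing idea, under stated assumptions: rather than going through `getBytes` and reassembling two bytes per value, this variation stores each 16-bit table entry as one `char`, which sidesteps charset questions. The class and method names are hypothetical. In a real generator the packed string would be emitted as a literal with `\u` escapes, and since a single class-file string constant is limited to 65535 encoded bytes, large tables would probably still need to be split across several literals.

```java
import java.util.Arrays;

class PackedTableSketch {
    // Pack each 16-bit table entry into one char of a String.
    static String pack(short[] table) {
        char[] chars = new char[table.length];
        for (int i = 0; i < table.length; i++) {
            chars[i] = (char) table[i];
        }
        return new String(chars);
    }

    // Recover the short[] at class-init time from the packed String.
    static short[] unpack(String packed) {
        short[] table = new short[packed.length()];
        for (int i = 0; i < table.length; i++) {
            table[i] = (short) packed.charAt(i);
        }
        return table;
    }

    public static void main(String[] args) {
        short[] original = {0, 1, -1, 257, Short.MAX_VALUE, Short.MIN_VALUE};
        short[] restored = unpack(pack(original));
        System.out.println(Arrays.equals(original, restored));
    }
}
```

The static initializer then shrinks to one constant load plus a loop, instead of one store sequence per array element.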
<enebo[m]>
It is amusing as well I break it into 4
<headius[m]>
enebo: I will merge default gem PR now to 9.2 and master
<enebo[m]>
those four will fit into a single long[]
<headius[m]>
if I can get a head build snapshot to deploy then marcandre should be able to confirm it by restarting a rubocop-ast build
<enebo[m]>
yeah
<headius[m]>
enebo: oh dunno if you saw but I was able to wipe out bin/rake, ri, and rdoc too
<headius[m]>
so switching branches should not mess up rake bin anymore... it is ignored and not versioned on either branch now
<slonopotamus[m]>
@headius yay! I am a bit surprised that you decided to handle zlib issue I've reported so quickly so I thought it would be more convenient if you could poke me in more realtime fashion here)
<headius[m]>
yeah welcome
<slonopotamus[m]>
looks like my @ powers are too weak :D
<headius[m]>
I did not mean to imply the gz is definitely bad, we just have never seen this reported so I was curious
<headius[m]>
name<tab> should complete in Element client
<slonopotamus[m]>
I remember that I once was already here. There was a story about spawning subprocesses on JRuby + Windows...
<headius[m]>
the mobile client uses @... they need to decide on one way
<lopex>
and how much could we win on transcoding tables ?
<headius[m]>
lopex: do they have a lot of inner classes?
<lopex>
headius[m]: no, the tables themselves
<headius[m]>
oh like the constant pool trick I was talking about
<headius[m]>
yeah could be big and help startup too
<lopex>
the bigger offenders are probably not used
<headius[m]>
just yank the table out of a string in constant pool
<lopex>
is there such a thing like order in zip ?
<lopex>
for files
<slonopotamus[m]>
lopex: yes
<headius[m]>
slonopotamus: hah windows and processes, great stuff
<headius[m]>
and by great I mean super frustrating to support
<lopex>
headius[m]: you mean a binary ?
<lopex>
slonopotamus[m]: can we affect that ?
<slonopotamus[m]>
lopex: when you create a zip, you add entries to it one by one. and they end up exactly in that order in the file. you can even decide *per file* whether it will be compressed or not.
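As a minimal sketch of that point using the JDK's `java.util.zip` classes (entry names and contents here are made up for illustration): entries are written in call order, and each entry picks its own method. A `STORED` entry needs its size and CRC set before `putNextEntry`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipOrderSketch {
    // Build an archive with one stored and one deflated entry, in that order.
    static byte[] build() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            byte[] raw = "table data".getBytes(StandardCharsets.UTF_8);
            ZipEntry stored = new ZipEntry("b-stored.bin");
            stored.setMethod(ZipEntry.STORED);
            stored.setSize(raw.length);            // required up front for STORED
            stored.setCompressedSize(raw.length);
            CRC32 crc = new CRC32();
            crc.update(raw);
            stored.setCrc(crc.getValue());         // required up front for STORED
            zip.putNextEntry(stored);
            zip.write(raw);
            zip.closeEntry();

            ZipEntry deflated = new ZipEntry("a-deflated.txt");
            deflated.setMethod(ZipEntry.DEFLATED);
            zip.putNextEntry(deflated);
            zip.write("hello hello hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Read entries back sequentially, recording name and method.
    static List<String> entryNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry e;
            while ((e = in.getNextEntry()) != null) {
                String method = e.getMethod() == ZipEntry.STORED ? "stored" : "deflated";
                names.add(e.getName() + ":" + method);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(entryNames(build()));
    }
}
```

The names are deliberately out of alphabetical order to show that the archive keeps insertion order rather than sorting.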
<lopex>
ooh
<lopex>
so we might tell maven to do that in some order
<headius[m]>
there is data left on the input buffer and according to this we are supposed to tack them on
<headius[m]>
but clearly that is not the whole story
<slonopotamus[m]>
comment on that code says that it mimics what MRI does. "but clearly that is not the whole story" :D
<lopex>
tables/Transcoder_Big5_WordArray.bin is 400kb
<lopex>
compressed to 230
<lopex>
so not just empty air
<lopex>
and we ship the whole thing every time
<headius[m]>
try zopfli
<headius[m]>
it didn't have a big impact on JRuby but it might do better on that data
<lopex>
via maven ?
<headius[m]>
there is a maven plugin
<headius[m]>
but you could just try zopfli on the file directly, or recompress the jar
<lopex>
or force insertion order like slonopotamus[m] said
<headius[m]>
yeah there are options
<headius[m]>
bzip it and add a dependency to jcodings 🤪
<lopex>
yeah, already considered
<slonopotamus[m]>
lopex: what profits do you expect to have by reordering files? AFAIK, each file inside zip is compressed independently.
<lopex>
no idea
<slonopotamus[m]>
so, you can get better read patterns by placing files that are used together near each other within zip archive. but you won't win on total archive size that way.
<lopex[m]>
I thought reading from jar might have an impact regarding file order
<headius[m]>
lopex: given the ordering and dict issues perhaps we should compress all of them together as one gz blob with a header
<lopex[m]>
hmm big5 and gbk take like 700Kb uncompressed
<headius[m]>
slonopotamus: yeah I may have to step through the C code to see why they don't have the extra bytes at this point
<lopex[m]>
a header ?
<headius[m]>
to indicate how big each transcode chunk is
<headius[m]>
in the uncompressed aggregate
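One way the "single gz blob with a header" idea could look, as a hedged sketch only: the layout below (a count, then one length per chunk, then the concatenated chunks, all inside one gzip stream) is an assumption for illustration, not an existing jcodings format, and the class name is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class TableBlobSketch {
    // Header (count + per-chunk sizes) followed by all chunks, compressed as one stream.
    static byte[] pack(byte[][] tables) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new GZIPOutputStream(bytes))) {
            out.writeInt(tables.length);
            for (byte[] table : tables) {
                out.writeInt(table.length);
            }
            for (byte[] table : tables) {
                out.write(table);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Read the header first, then slice the uncompressed aggregate by the recorded sizes.
    static byte[][] unpack(byte[] blob) {
        try (DataInputStream in = new DataInputStream(
                new GZIPInputStream(new ByteArrayInputStream(blob)))) {
            int count = in.readInt();
            int[] sizes = new int[count];
            for (int i = 0; i < count; i++) {
                sizes[i] = in.readInt();
            }
            byte[][] tables = new byte[count][];
            for (int i = 0; i < count; i++) {
                tables[i] = new byte[sizes[i]];
                in.readFully(tables[i]);
            }
            return tables;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A trade-off worth noting: a single gzip stream is sequential, so reaching one table means decompressing everything before it, which works against loading tables lazily one at a time.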
<lopex[m]>
ah
<headius[m]>
are we at least lazily loading them?
<lopex[m]>
yeah
<lopex[m]>
everything in jcodings is lazy
<lopex[m]>
including impl classes
<slonopotamus[m]>
headius: have you tried minifying test data? like, a zero-sized data that was zlib'ed and got a one or two \0 bytes appended? I actually know almost nothing about zlib format and why it is OK to passthrough trailing bytes instead of breaking with "OMG, strange bytes after end-of-stream!" :D
<headius[m]>
yeah I don't understand this logic either but reading around it they may be resetting input to zero elsewhere
<headius[m]>
input data shouldn't matter, I would expect us to do this for any gz data with trailing junk
<headius[m]>
given this logic
<slonopotamus[m]>
(the worst thing that could currently happen is if it turns out that JRuby implementation is correct and all others are wrong :D I once stumbled upon a bug with multiple addr2line implementations [a program that translates program address into a function/file/line info within executable file] where 4 out of 6 implementations were failing to pass my testcase properly)
<headius[m]>
a workaround for you would be to just call the JDK classes
<headius[m]>
it will work fine
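A minimal sketch of what calling the JDK classes directly could look like, assuming the data is a zlib-format stream followed by trailing bytes (for gzip-format data, `GZIPInputStream` would be the usual choice instead). The class and method names are hypothetical. `java.util.zip.Inflater` stops at end-of-stream and reports any unconsumed input through `getRemaining()`, so trailing bytes never reach the output.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class TrailingBytesSketch {
    // Helper to produce zlib-format test data.
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Returns { inflated output, unconsumed trailing input }.
    static byte[][] inflateWithTrailing(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated stream or preset dictionary needed; stop rather than spin
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid zlib data", e);
        }
        int remaining = inflater.getRemaining(); // must be read before end()
        inflater.end();
        byte[] trailing = Arrays.copyOfRange(input, input.length - remaining, input.length);
        return new byte[][] { out.toByteArray(), trailing };
    }
}
```

The caller can then decide what to do with the trailing bytes: ignore them, or treat them as the start of the next stream.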
<headius[m]>
ok whoever ported this logic misinterpreted it
<headius[m]>
it appends the remaining bytes to the input string so at the end you have the input string containing only those extra bytes
<headius[m]>
it does not append to the output stream
<slonopotamus[m]>
ouch. you're right, though I also misread it
<slonopotamus[m]>
they append to next_in
<headius[m]>
I am not sure we are handling the input buffer correctly here but I can flip this to append there and see what we have
<headius[m]>
well they append to z->input from next_in
<headius[m]>
if it is coming from a stream they allocate a string for z->input to hold it
<headius[m]>
but definitely not going to output
<slonopotamus[m]>
no, wait. `zstream_append_input` appends data from second arg into first arg.
<slonopotamus[m]>
so... they append `z->stream.next_in` into `z`.
<headius[m]>
well into z->input
<slonopotamus[m]>
okay, you're right
<headius[m]>
constructing that if necessary
<headius[m]>
so someone assumed this was going to out stream
<headius[m]>
I will try a quick patch and then see when this was added
<headius[m]>
$ jruby blah.rb
<headius[m]>
60992
<slonopotamus[m]>
yay!
<slonopotamus[m]>
I wonder if there are any multi-stream tests. I mean, when you use zlib to uncompress things that go one after another in the same byte flow.
<slonopotamus[m]>
It looks like such a scenario is supposed to be supported (otherwise, why `next_in` at all)
<headius[m]>
we pass and fail the exact same number of tests from MRI test_zlib
<headius[m]>
oh well
<headius[m]>
at least this seems like the right direction... I will push a PR and do a little more poking around
<headius[m]>
they do have some chunked stream tests I think
<slonopotamus[m]>
oh, "chunked". didn't have a proper word for this in my vocabulary)
<headius[m]>
slonopotamus: to keep risk low I will probably not attempt any larger scale re-port for 9.2.17.0, but I will file an issue to align behavior in 9.3 (which may or may not happen then without help)
<enebo[m]>
ok it is also wrong in irc.fixing
<headius[m]>
this should at least solve your issue
<slonopotamus[m]>
headius: looking at your code... is the buffer you're appending to guaranteed to have enough capacity or might it need a reallocation?
<headius[m]>
hmm
<headius[m]>
input.append will expand
<slonopotamus[m]>
it obviously can't reallocate the way you wrote the code
<headius[m]>
input is a ByteList, our growable collection wrapping byte[]
<headius[m]>
I realize I am not checking it for nil though so I should allocate a new buffer in that case
<headius[m]>
ahh actually this always allocates input
<slonopotamus[m]>
oh, ok, I've misread this code **again** and thought it is appending to `next_in`. need some sleep)
<headius[m]>
the logic in CRuby is goofy... they allocate this buffer and copy the incoming bytes to it and then proceed from there
<headius[m]>
we mimic that but clearly didn't handle the extra part properly
<headius[m]>
slonopotamus: if you get a chance to test out that patch let me know
<headius[m]>
you can build from the branch on headius/jruby or wait until it merges
<slonopotamus[m]>
I will hopefully be able to test it tomorrow. It's almost 1am here, not the best time for figuring out how to build jruby from source))
<headius[m]>
yeah no worries... for future, just check out, "./mvnw", and run jruby from bin/
<headius[m]>
enebo: we may need to be a bit more exclusive in the packaging of the stdlib artifact
<headius[m]>
[INFO] Copying 43655 resources to /Users/headius/projects/jruby-9.2/lib/target/classes/META-INF/jruby.home/lib/ruby/gems/shared
<enebo[m]>
wot
<headius[m]>
because it needs to copy all the default and bundle gems, I made it copy everything... which is a bit of an issue when you have lots of other gems installed in a local repo
<enebo[m]>
oh you mean compiling it locally from your dev env though
<headius[m]>
yes
<enebo[m]>
ok
<headius[m]>
I wouldn't care but it is slow
<enebo[m]>
yeah the workaround of having to always build from a shallow clone is not very appealing
<enebo[m]>
Perhaps any exclusion somehow is mapped to -SNAPSHOT
<headius[m]>
this doesn't really affect anything visible in a local dev env except copying a lot of crap