carlosga_ has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
houhoulis has joined #rubinius
meh` has quit [Ping timeout: 272 seconds]
<|jemc|>
sheesh
<|jemc|>
I did what I thought was an optimization, but adding one goto_if_true instruction (that skips _ahead_) tripled my total time spent by the parser
amclain has joined #rubinius
<|jemc|>
ah, I see - the original behavior was actually losing capture data - that's why it was faster :P
carlosgaldino has joined #rubinius
houhoulis has quit [Remote host closed the connection]
flavio has joined #rubinius
flavio has quit [Client Quit]
johnmuhl has quit [Quit: Connection closed for inactivity]
<yorickpeterse>
Benny1992: Rbx currently prioritizes the bundled rubysl Gems over those installed/updated manually
<Benny1992>
ah ok thx
<yorickpeterse>
so even if you update the Gem, it will still load the version it initially shipped with
<Benny1992>
kk, can we push a new version of rubysl-pathname?
<yorickpeterse>
not sure if we ever fixed that in master
<yorickpeterse>
Benny1992: that still requires a new release of Rbx I believe, not sure
<yorickpeterse>
I can look into this tonight, bit too busy atm
<Benny1992>
hmm ok
<Benny1992>
no problem :)
benlovell has quit [Ping timeout: 250 seconds]
<brixen>
cpuguy83: no worries
<brixen>
cpuguy83: just slammed the past couple weeks so didn't see the email for 3 days :p
josh-k has joined #rubinius
|jemc| has joined #rubinius
<|jemc|>
does the check_interrupts bytecode instruction play a role in allowing the JIT to do its work?
<|jemc|>
that is, if I make a dynamic_method that runs for a while (say, one second), do I need to explicitly generate a check_interrupts instruction to avoid JIT-related errors like the ones I'm seeing?
<|jemc|>
or is check_interrupts unrelated to the JIT?
<|jemc|>
inserting a check_interrupts instruction before the to_s call seems to _avert_ the problem but I'm not sure if that's causal or just circumstantial
benlovell has joined #rubinius
benlovell has quit [Ping timeout: 244 seconds]
carlosgaldino has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
noop has quit [Ping timeout: 245 seconds]
noop has joined #rubinius
carlosgaldino has joined #rubinius
flavio has quit [Quit: WeeChat 0.4.1]
josh-k has quit [Remote host closed the connection]
josh-k has joined #rubinius
josh-k has quit [Ping timeout: 244 seconds]
benlovell has joined #rubinius
<brixen>
|jemc|: what JIT-related errors are you seeing?
<|jemc|>
methods that become incorrect when the JIT hits them - I was just now preparing a second part to my earlier gist
<|jemc|>
in this part I'm seeing problems in places where I'm not using dynamic_method
<|jemc|>
(or using any loops - as far as I can tell check_interrupts is intended to be generated in loops)
crowell has joined #rubinius
benlovell has quit [Ping timeout: 272 seconds]
<crowell>
quick question. is there currently a working disassembler for .rbc files?
<|jemc|>
brixen: for example, my Myco::Component#new method becomes incorrect right after I see the line:
<crowell>
brixen: perfect, that's exactly what I was looking for
<brixen>
crowell: ok, cool
<crowell>
the docs for the instructions leave a bit to be desired, but that's what I was looking for
<brixen>
crowell: what's missing?
<|jemc|>
crowell: you can see the implementation of each instruction alongside those same docs if you look in vm/instructions.def
<Benny1992>
ping: yorickpeterse
<Benny1992>
currently adding tests to rubysl-pathname, but the tests fail because the changes I made are not seen, is this because rbx prefers the bundled gem?
<crowell>
ok, cool. just that web page was a bit sparse
<Benny1992>
so the same problem as previous
<brixen>
crowell: if you can be more specific, I may be able to answer a question
<brixen>
"a bit sparse" is not actionable
<brixen>
also, we generate the docs from vm/instructions.def, so you can send a PR as well
<crowell>
brixen: I actually don't have the file I'm trying to disassemble in front of me, so I can't ask any specifics now. but I have enough to get started
<brixen>
ok
<brixen>
Benny1992: you need to set RUBYOPT=lib to have rbx pick up the gem's files
<brixen>
Benny1992: le'me push an updated .travis.yml for that repo
<Benny1992>
brixen: okay thx :)
<brixen>
Benny1992: er, sorry, RUBYLIB not RUBYOPT
<|jemc|>
brixen: this one is turning out to be a bit of a heisenbug (my code is not threaded, but I think it's JIT-related non-determinism) but I think I should be able to help you reproduce the to_s one if you feel like rake installing myco and running a basic script from my other repo.
diegoviola has quit [Remote host closed the connection]
elia has joined #rubinius
elia has quit [Client Quit]
diegoviola has joined #rubinius
<yorickpeterse>
ok Ruby trivia:
<yorickpeterse>
Given I have a Hash with find values as the keys, and replacements as the values
<yorickpeterse>
What's the most efficient way of running a find-replace using that table, using the least amount of String#gsub calls
<yorickpeterse>
the most basic form is find_replace.each { |find, replace| input = input.gsub(find, replace) }
<yorickpeterse>
That however is slow as sin
<yorickpeterse>
(we're talking about running this thousands/millions of times potentially)
elia has joined #rubinius
<yorickpeterse>
Hm, I think I _might_ be able to compile a clever regexp for this
<yorickpeterse>
hm no, that's impossible
elia has quit [Client Quit]
<headius>
is it?
<yorickpeterse>
Adding that to this parsing setup slows it down by ~5,5 times
<yorickpeterse>
It would have to run for every text node in the document
<yorickpeterse>
which can potentially be a lot of nodes
<yorickpeterse>
Time wise this 10MB XML file that I auto generated goes from 0.02 seconds to 480 ms
<headius>
you might speed that each loop up by iterating over keys
<yorickpeterse>
(parsing time)
<chrisseaton>
Why not move through the string once with a state machine?
<chrisseaton>
That would be optimal
<headius>
I was thinking a regexp that is all of the keys to replace and a block passed to gsub...then you look up the keys as they're found and return replacement
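A minimal sketch of the regexp-plus-lookup idea headius describes, assuming the three-entity table from this discussion. The names ENTITIES, ENTITY_REGEXP, decode_with_block and decode_with_hash are made up for illustration. Regexp.union builds one pattern from all keys, so the input is scanned once instead of once per key.

```ruby
# Hypothetical table: XML entity => replacement character.
ENTITIES = { '&lt;' => '<', '&gt;' => '>', '&amp;' => '&' }.freeze

# One regexp that matches any key; Regexp.union escapes the strings.
ENTITY_REGEXP = Regexp.union(ENTITIES.keys)

# Block form: the block runs once per match and returns the replacement.
def decode_with_block(input)
  input.gsub(ENTITY_REGEXP) { |match| ENTITIES[match] }
end

# Hash form: String#gsub looks each matched text up in the hash itself.
def decode_with_hash(input)
  input.gsub(ENTITY_REGEXP, ENTITIES)
end

puts decode_with_block('&lt;foo&gt; &amp; bar')
```

Both forms make a single gsub call per input string. Whether either is fast enough for the per-text-node case discussed here would have to be measured.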
<yorickpeterse>
so context: I need to replace certain XML entities (e.g. &lt;) with their equivalents (< in this case)
postmodern has joined #rubinius
<yorickpeterse>
The mapping is basically: { '&lt;' => '<', '&gt;' => '>', '&amp;' => '&' }
<yorickpeterse>
I _can_ do this lazily at the very end of the parsing chain (basically find/replace upon access), but I'm curious if I can do it earlier on
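A rough sketch of the "find/replace upon access" option, assuming a hypothetical LazyText class that is not taken from the parser under discussion. It keeps the raw text and only decodes entities the first time the text is read, so nodes that are never read cost nothing extra.

```ruby
# Hypothetical text node: stores raw text, decodes entities on first access.
class LazyText
  ENTITIES = { '&lt;' => '<', '&gt;' => '>', '&amp;' => '&' }.freeze
  ENTITY_REGEXP = Regexp.union(ENTITIES.keys)

  def initialize(raw)
    @raw = raw
    @decoded = nil
  end

  # Decodes once, then returns the cached string on later calls.
  def text
    @decoded ||= @raw.gsub(ENTITY_REGEXP, ENTITIES)
  end
end

node = LazyText.new('a &lt; b')
puts node.text
```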
diegoviola has quit [Quit: WeeChat 1.0]
<chrisseaton>
you already have a state machine in the lexer right? why not make it recognise and expand the entities?
<yorickpeterse>
That requires emitting separate tokens, which I then have to stitch back together (elements can only contain a single text node)
<yorickpeterse>
I tried that actually, it's way slower than the above loop
<yorickpeterse>
it results in more string allocations, only for those to be stitched back together
<yorickpeterse>
so for example, for the string "&lt;foo&amp;" you'd normally have 1 allocation
<yorickpeterse>
However, if you emit stuff separately you'd now have 3 allocations
<yorickpeterse>
("<", "foo" and "&")
<yorickpeterse>
then you smack them together for a 4th allocation
<yorickpeterse>
so basically O(N*4) vs O(N)
<chrisseaton>
I don't mean new tokens - I mean you're already going through a string and copying character by character to create the token string in the lexer aren't you? so why not copy and expand at the same time?
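A sketch of what "copy and expand at the same time" could look like, written in Ruby only for illustration (the lexer discussed here works on byte ranges in C/Java, so this is not its code). The helper name expand_entities and the ENTITIES table are assumptions. It walks the input once and appends to one output buffer, although the slices it copies are still temporary strings in Ruby.

```ruby
# Hypothetical table: XML entity => replacement character.
ENTITIES = { '&lt;' => '<', '&gt;' => '>', '&amp;' => '&' }.freeze

# Single pass over the input: copy plain text, expand known entities.
def expand_entities(input)
  output = ''.dup
  pos = 0
  while pos < input.length
    amp = input.index('&', pos)
    if amp.nil?
      output << input[pos..-1]       # no more '&': copy the rest
      break
    end
    output << input[pos...amp]       # copy plain text before the '&'
    semi = input.index(';', amp)
    entity = semi && input[amp..semi]
    if entity && ENTITIES.key?(entity)
      output << ENTITIES[entity]     # known entity: append its replacement
      pos = semi + 1
    else
      output << '&'                  # unknown sequence: keep the '&' as-is
      pos = amp + 1
    end
  end
  output
end

puts expand_entities('&lt;foo&amp;')
```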
* yorickpeterse
finally gets to use the big O
<chrisseaton>
O(N*4) is exactly the same thing as O(N)
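The point about constant factors can be spelled out with the usual definition of big O. Here f and g are just placeholder names for the two cost functions being compared.

```latex
f(N) \in O(g(N)) \iff \exists\, c > 0,\ N_0 \ \text{such that}\ f(N) \le c \cdot g(N) \ \text{for all}\ N \ge N_0 .
\quad \text{With } f(N) = 4N,\ g(N) = N,\ c = 4,\ N_0 = 1:\quad 4N \le 4 \cdot N ,
\ \text{so}\ 4N \in O(N) \ \text{and}\ O(4N) = O(N).
```

The factor of 4 still matters for wall-clock time. It just does not show up in the asymptotic class.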
<yorickpeterse>
Oh no, the lexer doesn't do that
<yorickpeterse>
it operates on byte ranges
<yorickpeterse>
plus it's shared between C/Java, so doing find/replacements there is a total pain
<chrisseaton>
ah ok
<yorickpeterse>
errr derp you're right
<yorickpeterse>
(regarding the big O stuff)
<yorickpeterse>
see, I never use it :P
elia has joined #rubinius
<yorickpeterse>
headius: does ByteList in JRuby allow me to find/replace bytes?
<yorickpeterse>
crap wait I can't use that, that means I'd have to check things for every token
<yorickpeterse>
darn
<headius>
yorickpeterse: sure
<yorickpeterse>
I guess having something like String#tr supporting multiple character replacements (as opposed to 1 char being replaced with another single char) would be nice here
<headius>
can't you turn the keys into a regexp and try what I suggested? gsub + block will hurt, but unless you expand as you lex I'm not sure how to avoid that
<yorickpeterse>
That would still result in multiple string allocations
<yorickpeterse>
In fact, I think the block form would result in one allocation for every match
<yorickpeterse>
unless it only evaluates the block once for all matches
_elia has joined #rubinius
Thijsc has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
elia has quit [Ping timeout: 245 seconds]
<headius>
yorickpeterse: it would
<headius>
but at least the underlying byte[] would be shared across those instances
_elia has quit [Ping timeout: 245 seconds]
<yorickpeterse>
brixen: in Rbx, do we allocate a new string when calling String#gsub! ?
<yorickpeterse>
That is, is it basically `new_str = dup.gsub(....); replace(new_str)`, or does it truly modify the current string in-place?
<yorickpeterse>
hm, I think I can actually test that!
<yorickpeterse>
oh, string literals seem to bypass String#initialize in rbx
<yorickpeterse>
wtf
<yorickpeterse>
bah, I just want to measure how many Strings are created, this is stupid difficult in both MRI and Rbx
<yorickpeterse>
MRI has TracePoint but lol of course that doesn't work when creating strings
<yorickpeterse>
and Rbx just bypasses String#new for literals :/
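One possible way to get a rough String count on MRI, offered as a sketch and not as the approach used in this conversation. It assumes MRI's ObjectSpace.count_objects, which is implementation-specific and not expected to work the same way on Rbx or JRuby. The helper name string_allocations is made up. GC is disabled during the block so that no strings are freed between the two counts.

```ruby
# MRI-only sketch: counts T_STRING slots created while the block runs.
def string_allocations
  GC.start                            # start from a freshly collected heap
  GC.disable                          # keep strings from being freed mid-count
  counts = {}
  ObjectSpace.count_objects(counts)   # reuse one hash to avoid extra objects
  before = counts[:T_STRING]
  yield
  ObjectSpace.count_objects(counts)
  counts[:T_STRING] - before
ensure
  GC.enable
end

input = 'a &lt; b &gt; c'
copy = string_allocations { input.gsub('&lt;', '<') }
# dup is needed to keep input intact, and is itself one extra String.
in_place = string_allocations { input.dup.gsub!('&lt;', '<') }
puts "gsub: #{copy} strings, dup + gsub!: #{in_place} strings"
```

The numbers include the string literals created inside the block, so they are only useful for comparing variants against each other.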
<yorickpeterse>
(╯°□°)╯︵ ┻━┻
<headius>
you should be able to get a count on JRuby by passing flag -J-Xrunhprof:depth=0
<headius>
that enables JVM-level object profiling... depth=0 makes it not accumulate allocation backtraces to speed up the data gathering
<headius>
it should be pretty similar across impls if we have mostly the same logic
<yorickpeterse>
headius: I also need this on MRI and Rbx though, I need to see if String#gsub! allocates more than I want there
<yorickpeterse>
hmpf, object allocation tracking doesn't appear to work either
<headius>
yeah, I don't know how different it will be between jruby and MRI since we largely have the same logic
<yorickpeterse>
meh, I need to dig in to this when I'm actually awake. Toodles
<yorickpeterse>
oh derp that's right, allocation tracking also requires a compile time flag
<yorickpeterse>
ugh
<yorickpeterse>
see, I need sleep, laters
elia has joined #rubinius
|jemc| has quit [Read error: Connection reset by peer]