ur5us has joined #jruby
subbu|away is now known as subbu
ur5us has quit [Quit: Leaving]
ur5us has joined #jruby
ur5us has quit [Ping timeout: 260 seconds]
Specialist has joined #jruby
ur5us has joined #jruby
ur5us has quit [Ping timeout: 264 seconds]
_whitelogger has joined #jruby
<enebo[m]> byteit101: by bind I mean which java ctor it will call since you could have multiple same-arity java ctors
<enebo[m]> In reading all of that log I got a better idea of how this all works but I am somewhat confused as to whether we can detect trying to use this/self before the super is invoked.
<enebo[m]> This is what you refer to as unitializedThis I think
<enebo[m]> byteit101: A few things we will land regardless of how things end up designwise. Can you break out NodeVisitor/AbstractNodeVisitor/OperatorCallNode as its own PR. It is just missing and is not related to this work.
<enebo[m]> Also can you undo all the import * changes and variable name changes in at least the AST part of the code. It is just making the PR longer than it needs to be.
<enebo[m]> I actually do not see a lot of renaming other than that except in JI proxy code itself but that I think is all fine as it is redoing a far amount of stuff in there
<enebo[m]> I believe for consideration of IR vs AST we need to think about the notion FlatExtractor is doing two separate concerns. 1) Figuring out if super is in a bad place 2) rewriting the arguments
<enebo[m]> At this point I think for IR #2 is marking the actual instr which represents the super call and the Variables/Operands that will be passed to that call
drbobbeaty has joined #jruby
<enebo[m]> Then in JIT #2 is just adding extra logic right before where the super call would be emitted with the arg conversion logic and then emitting your switch for the super call itself instead of what would be the call instr
<enebo[m]> We need to figure out where to put those pieces of info but this can be done at build time on methods call initialize
<enebo[m]> #1 is more challenging. It could be woven into IRBuild but then we are paying a small cost on all intiailize (which probably is very small costwise) but it is messy
Specialist has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]
<enebo[m]> We can actually make rules specific to IR instrs themselves like a super in the presence of a branch instr or after a jump is not valid? It requires examining the basic syntax rules and mapping how those syntax rules look as IR
M0xd[m] has quit [Quit: Idle for 30+ days]
h1p3r has joined #jruby
h1p3r has quit []
<byteit101[m]> > detect trying to use this/self before [...] super
<byteit101[m]> Currently, that's a non-goal. self should work before super, but self.to_java won't. uninitializedThis is the JVM type that prevents loading classes that use `this` before `super()`. We error now, but there is some other code to make the error nicer: https://github.com/jruby/jruby/issues/6140
<byteit101[m]> Will do the OperatorCallNode in seperate PR this evening
<byteit101[m]> > import * changes and variable name changes
<byteit101[m]> Ah, I haven't been paying attention to the former, but that's on my list of todos. Variable names can be reverted, I was just automating writing the FlatExtractor and it was easier to rename them at least temporarily than fix my generator code.
<byteit101[m]> > then emitting your switch
<byteit101[m]> Oh, that's the JIT proposal. I wasn't going to pursue that right now, as it locks down the dynamism allowed (however stupid it may be to change the initialize method).
<byteit101[m]> enebo: I take it, however, that the JIT (vs super-splitter I have right now) approach is your favorite?
<enebo[m]> byteit101: I will admit I did not completely follow that JIT only was not the solution :)
<byteit101[m]> (note the conversation spilled over to the next day UTC-wise, unsure if that was the original day, or the day after)
<enebo[m]> byteit101: but in thinking about this I realized all we need to do is mark the call instr which represents the super (which may be a super instr)
<enebo[m]> All the arguments to that call is already in the instr
<enebo[m]> so we just call the conversion on each one and then replace the call itself
<byteit101[m]> enebo: ah yes, 00:01 next UTC day: https://logs.jruby.org/jruby/2020-11-03
<enebo[m]> Actually I think we can universally solve #2 above with just adding an addition runtimehelper enum for java_super and add it to all super forms within initialize methods
<byteit101[m]> > the conversion on each one
<byteit101[m]> note: zsuper requires no conversion
<enebo[m]> doesn't it? You still need to make a java argument don't you>?
<enebo[m]> Ah in IR zsuper does in fact still pass in an explicit argument list
<byteit101[m]> ^ generated code for zsuper is case -1
<byteit101[m]> I mean you can, but it's not necessary :-)
<enebo[m]> heh I think I am pretty confused now
<enebo[m]> Ballons represents an extended type which need to call the base type
<byteit101[m]> yes
<enebo[m]> and this code is JIT generated form?
<byteit101[m]> no jit, this is in RealClassGenerator
<byteit101[m]> I could have made zsuper return var7, and convert them back to java types, but as it was unnecessary, just passed the original java args up
<enebo[m]> ah wait I think I am coming into focus :)
<enebo[m]> so splitInitialized is ruby code before the super
<byteit101[m]> correct
<byteit101[m]> var6.call at the end is the ruby code after the super
<enebo[m]> So largely we remove the super altogether (which is confusing part for me I think) but for explcit args to non-zsuper we need to capture arg values and return those from splitInitialized?
<byteit101[m]> var10001 is what I rewrote it to (right now), which is the super args (var8), and var6 (what to do after)
<byteit101[m]> correct
<enebo[m]> ok
<enebo[m]> It feels like this array creation maybe is technically not always needed
<byteit101[m]> Transformation from init to j_init:
<byteit101[m]> (as a reminder)
<byteit101[m]> ^ For the balloon class with (int, String) ctor
<enebo[m]> I think we should consider this split to eliminate the two levels of boxing happening
<enebo[m]> We should not need the RubyArray instance for the closure post call + args for a specific super
<enebo[m]> Part of my also thinks if we need to depend on an array it should not be another ruby array but should be a primitive array
<enebo[m]> This is obviously thinking about optimization a little bit but it is worth talking about a little bit now
<enebo[m]> lots of little bit in that last sentence
<byteit101[m]> Ah yes, I had not been focusing on efficiency, mostly getting a working POC to start with :-)
<enebo[m]> yeah that is fine but now that I am staring at that code
<enebo[m]> [closure, arg1, ... argn] = split
<enebo[m]> IRubyObject[]
<enebo[m]> I mean split itself could do the arg1...argn conversion to Java for the super call as part of a Java helper method to make this smaller
<enebo[m]> One question now comes to mind...arg1..argn needs to be captured by this closure for the post split
<enebo[m]> how does that currently work in this PR
<enebo[m]> or they may be needed
<byteit101[m]> magically (idk how, but it works today)
<enebo[m]> heh ok. I will read and see if I can figure that out
<byteit101[m]> flatextractor.deepen helps
<enebo[m]> oh yeah this is done with the AST so you probably more or less depend on the closure to act like a block which captures them naturally
<enebo[m]> we probably in truth do not want to use a block if we can help it because the execution speed of blocks is not stellar
<enebo[m]> Luckily I still remember stuff from the last time I worked on the inliner which as you can imagine is about splitting two blocks of code apart
<byteit101[m]> ah, nice
<enebo[m]> so I think we could cut off execution up to the super and then make a new scope out of the remainder as another IRMethod (with same name). This is not as simple as it sounds since both methods would share the same temporary variables
<enebo[m]> I am in the brainstorming part of this conversation since I think I know what I am talking about now
<enebo[m]> Although I don't want to derail what you have done. I think the ASt road is fine for figuring out problems until round two of IR
<byteit101[m]> Thats good! I know I am in a bit over my head which is why I knew you would be a good person to help me finish this part of it in a way better than the AST
<enebo[m]> I half feel like IR is missing a concept here which might make this simpler like 'java_yield op1, ..., opn'
<enebo[m]> then we would just replace the super with java_yield and when it finished it would just continue
<enebo[m]> but that would mean this Java code would need the super logic to purely be in its own method which I doubt works
<enebo[m]> since it is 'super'
<enebo[m]> I am trying to work through all the stuff you already worked through so apologies
<byteit101[m]> No problem, I know I wrestled with it myself for a few days
<enebo[m]> so execution has to start from this constructor. execute the front part. call super from this constructor. then continue executing the rest of the original initialize
<byteit101[m]> correct
<enebo[m]> the block trick does work and as a no-arg call it is not that expensive but something feels a little gross about it
<byteit101[m]> uninitializedThis can only be passed to the super <init> method, and nobody else can touch it until then
<enebo[m]> yeah
<enebo[m]> hmm I am trying to envision the complications of using IR here
<enebo[m]> like def initialize; ... ; super(a); ensure; a.close; end
<enebo[m]> in this case the front. the super into java, and the end must all be able to hit that ensure and then exit execution
<enebo[m]> I don't even remember how this is represented in AST...I will check that now
<enebo[m]> in AST this looks fine because ensure wraps the body of the method
<enebo[m]> just documenting the example
<enebo[m]> heh I forgot this would make two paths
<enebo[m]> one for a real exception and one for no exception
ur5us has joined #jruby
<enebo[m]> So we flatten out the tree into a set of instrs and we even duplicate code so different paths can execute the ensure
<enebo[m]> Some of this is due to limitation in JVM bytecode (which in some cases we replicate the same code section multiple times)
<enebo[m]> hmm but the AST method will not work in this case either will it?
<enebo[m]> I think we just execute the thing after the SuperNode as a no-arg block so we capture the variables but we are no longer executing under the ensure
<enebo[m]> unless you rewrite that logic
<byteit101[m]> yeah, that was one of those hard looking problems I've been avoiding :-)
<byteit101[m]> the AST code right now only works for block > super or a single super node, errors out otherwise
<enebo[m]> yeah I guess your existing Flat visitor would error in this case because ensure would make it deeper
<byteit101[m]> correct
<enebo[m]> what is the behavior when it fails?
<byteit101[m]> that's what the level variable is for
<byteit101[m]> explotions and stacktraces right now for easy debugging :-)
<enebo[m]> heh
<byteit101[m]> *explosions
<byteit101[m]> (the explosion changed a few letters there)
<enebo[m]> ok so let's consider what we think should happen perhaps
<byteit101[m]> nasal demons?
<enebo[m]> error reporting will be fairly important if we restrict the behavior of initialize for this feature
<byteit101[m]> definately. I have to do a pass at some point for that more generally for the rest of my Pr
<enebo[m]> I am mostly bringing this up because in AST we see the syntax which causes it fairly simply. In IR we can likely recalculate the syntax but it is a definite cost in implementing
<byteit101[m]> one thought I had a moment ago is, if it fails, just save the super in a variable, emit a warning, and do the super at the end
<enebo[m]> My only big reservation with just rolling with AST is we want to not depend on an AST being present
<enebo[m]> like loading from IR persistence
<byteit101[m]> "calling super in a java extension class is restricted by JVM considerations. Please move super/see wiki/etc"
<byteit101[m]> Yes, that was a big "oh dear" when I saw ast vanishing :-)
<enebo[m]> well that would be a lowest common denominator solution
<enebo[m]> and possibly how we initially roll this out
<byteit101[m]> yes, not ideal, but a backup
<enebo[m]> AST is something we want to be able to change without breaking features and it is also extra memory we want to throw away since we have an IR
<enebo[m]> So I guess as the feature exists the depth == 0 aspect of this means worrying about things like ensure is not a problem for IR
<enebo[m]> It would be nice to say what syntax broke the method but I think we could derive that. At least I think we probably can
<enebo[m]> The post block execution thing in the AST probably will not work in IR though
<byteit101[m]> I started with ast since A) it seemed easy enough for a poc, and b) i've dealt with AST's in the past. the same cannot be said for my IR experience
<enebo[m]> Since the post and pre parts probably both share the same transient temp variable pool
<enebo[m]> yeah I don't blame you for going this route. In some ways it really simplified the idea
<enebo[m]> but we do not want to depend on AST so we have a little more thinking to do on how to move it there
<byteit101[m]> yup:-)
<enebo[m]> For the JIT I had been working on trying to make synthetic methods
<enebo[m]> When you hit things that are generated Ruby code we hit JVM method size limits
<enebo[m]> but the reason I mention this is synthetic method idea was around the notion of executing multiple generated methods which accepted common variables for execution like the temporary variables
<byteit101[m]> ah, cool!
<enebo[m]> we want to pass the same dynamic scope (live variable values) and the temp var array (an array in interp) and java local variables in JIT (which would need to be made into an array)
<enebo[m]> same frame
<enebo[m]> So your case is another version of this I think
<enebo[m]> we want two synthetic methods front/back which have same stuff
<byteit101[m]> yes, indeed
<enebo[m]> let's recap
<enebo[m]> We have an AST solution but AST is not desirable as a finished product. It is good enough in WIP/PR form to debug and figure out what other problems we may encounter
<byteit101[m]> (I've been using it to fix the other side of the PR and get tests passing)
<enebo[m]> The basic problem is we want to execute a front and back portion of some initialize methods and in IR we need to demarcate and split those apart and be able to execute them separately
<enebo[m]> The basic outline of what your generated code does now would likely be very similar but we might need to return temp array and maybe a couple of other things
<enebo[m]> so we can pass them to however we execute the second half
<enebo[m]> We need to consider error reporting and if we can make that descriptive or generic
<enebo[m]> As for splitting I think we can use a special instr (or add to runtimehelperinstr) to mark the super during IRBuild if the method is named initialize
<enebo[m]> Then the split itself should not be difficult we use startup instrs and just cleave it into two
<enebo[m]> I think that is the current state from this conversation and most of it is just considering an IR version
<enebo[m]> I think also we can initially do this as interp-only to reduce the scope
<byteit101[m]> Things I just realized would be really hard to support: super in a loop
<byteit101[m]> But I will need to think through try/catch semantics
<enebo[m]> heh impossible in Java so there is that
<byteit101[m]> yea, if you try to compile the decompiled bytecode in javac, it doesn't like having code before the super() :-)
<byteit101[m]> ^ from the balloon example I pasted above
<enebo[m]> It may be that IR helps us a little bit that if we can detect a branch instr which jumps past the super instr or we see a jump instr which targets something above a super instr we can error
<byteit101[m]> But I think this is good. Want to copy that summary to the PR?
<enebo[m]> likewise IR could give us other obvious tools with a full CFG to detect this
<enebo[m]> sure
<enebo[m]> byteit101: going to take a bit of a break and think about this some more.
<byteit101[m]> sounds good. And thanks! this is one of those big problems that I keep writing stuff down about to not forget it amongst the other stuff
<enebo[m]> byteit101: this is a really great feature but it is also one with a thousand crannies
<byteit101[m]> Quite!
<enebo[m]> I predict we will be debugging issues on it for years but it will totally be worth it
<byteit101[m]> I've been worrying about that, I don't want to increase your workload :-O
<enebo[m]> The more problems we can identify up front the less we will find later
<enebo[m]> yeah don't worry about that. we make decisions based on how useful it will be. I think this is a good idea worth pursuing
<byteit101[m]> I can't wait to use it to clean up jrubyfx :-D
<enebo[m]> If it also starts us more down the road to making synthetic methods it will pay for itself in a second way internally
<enebo[m]> gems like ruby-parser which are huge generated code single method ends up much slower because we cannot JIT compile it
<enebo[m]> being able to continue execution with existing state is a primary part of doing that work
<enebo[m]> the second part is breaking a method (which tend to be switches) into a series of synthetic calls for each branch
<enebo[m]> (note: this still means not all large methods will compile but it does help a class of too large methods)
<enebo[m]> in fact all of them seem to just be huge switches
ur5us has quit [Remote host closed the connection]
ur5us has joined #jruby
_whitelogger has joined #jruby
MattPattersonGit has joined #jruby
joshuacronemeyer has joined #jruby
joshuacronemeyer has quit [Ping timeout: 240 seconds]
joshuacronemeyer has joined #jruby
ur5us_ has quit [Ping timeout: 264 seconds]