<rtyler>
HOLY LOL, I submitted a Spark job written in JRuby
<rtyler>
oh interesting, there's some object serialization that has to happen to distribute some things across the cluster
<rtyler>
which of course, means RubyBasicObject is going to be fun
<headius[m]>
Ahh yeah that gets trickier
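For context, driving Spark from JRuby through the Java API looks roughly like this; a minimal sketch, assuming the Spark jars are already on the classpath (the app name and master URL are illustrative):

```ruby
require 'java'

# Build a local SparkContext through Spark's Java API.
conf = org.apache.spark.SparkConf.new
         .setAppName('jruby-spark-demo')
         .setMaster('local[2]')
sc = org.apache.spark.api.java.JavaSparkContext.new(conf)

# JRuby converts the Ruby Array to a java.util.List for parallelize.
rdd = sc.parallelize([1, 2, 3, 4])
puts rdd.count  # => 4; shipping Ruby closures to workers is the hard part
```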
<rtyler>
I think colin had some work for redstorm which I might be able to re-use here. I don't recall exactly how many wacky object-serialization hacks we needed for Storm
<rtyler>
headius[m]: I doubt you're still awake, but if you have any pointers or examples of creating a native Java lambda object from JRuby, that might help me along
<headius[m]>
rtyler: well, lambdas are all just implementations of a single-method interface, which we do automatically for procs and blocks passed out to Java
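A quick sketch of that closure conversion, assuming a stock JRuby and the JDK's functional interfaces (the names below are just illustrative):

```ruby
require 'java'

list = java.util.ArrayList.new
%w[spark jruby].each { |s| list.add(s) }

# A Ruby block satisfies java.util.function.Consumer automatically:
list.forEach { |item| puts item }

# Likewise, a proc can stand in for java.util.function.Function:
upcase = ->(s) { s.upcase }
upcased = list.stream
              .map(&upcase)
              .collect(java.util.stream.Collectors.to_list)
puts upcased  # [SPARK, JRUBY]
```

An interface can also be implemented explicitly with JRuby's `impl`, e.g. `java.lang.Runnable.impl { puts 'hi' }`.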
<rtyler>
ah
<rtyler>
headius[m]: watching a presentation of yours, I've now seen you speak enough to be able to tell what you've practiced versus not :P
<rtyler>
the practiced ones are very good lectures :)
<headius[m]>
Ha yeah, sometimes I do a little too much editing the day before
<rtyler>
so with redstorm it's passing classes around and working with classes; with Spark it _seems_ like it's doing something similar
<rtyler>
digging in to figure out where objects are serialized and sent along to Spark workers in the cluster
<headius[m]>
It may be possible to set up some custom serialization that can negotiate Ruby runtime etc
<rtyler>
much of what Spark seems to rely on are interfaces that extend java.io.Serializable
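Concretely, Spark's Java API takes function objects through interfaces like org.apache.spark.api.java.function.Function, which extend Serializable. Declaring one from JRuby is easy; whether the resulting proxy survives Java serialization out to a worker is exactly the open question here. A sketch (the class name is illustrative):

```ruby
require 'java'

# org.apache.spark.api.java.function.Function extends java.io.Serializable.
class WordLength
  include org.apache.spark.api.java.function.Function

  # Function<T, R> has a single method: call(T) -> R
  def call(word)
    word.length
  end
end

# lengths = words_rdd.map(WordLength.new)
```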
<rtyler>
the key difference, as far as I can tell, between their cluster operations and Storm's is that Storm would send whole classes to nodes to do the work (spouts, bolts, etc.)
<rtyler>
whereas with Spark, it appears they're serializing some runtime state and sending it along
<headius[m]>
I just had a weird idea
<headius[m]>
If we just serialize Ruby objects as a Marshal-dumped byte[], then it gets across fine and we just unmarshal from Ruby on the other side
<headius[m]>
Depends how smart the serialized object is supposed to be
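A rough sketch of that marshal idea: the Ruby object becomes a plain byte[], which Java serialization carries natively, and the JRuby runtime on the other side reconstitutes it (the payload here is illustrative):

```ruby
payload = { 'op' => 'word_count', 'min_length' => 3 }

# Driver side: Marshal to a Ruby String, then to a Java byte[].
bytes = Marshal.dump(payload).to_java_bytes

# Worker side: back to a Ruby String, then Marshal.load inside JRuby.
restored = Marshal.load(String.from_java_bytes(bytes))
restored['op']  # => "word_count"
```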
<rtyler>
that does not seem unreasonable to me
<headius[m]>
If it's just a data object, it would hook up on the other side with something that can understand it
<headius[m]>
Or hell something like msgpack
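The msgpack variant would look much the same, but with a language-neutral payload (assumes the msgpack gem):

```ruby
require 'msgpack'

bytes = { 'op' => 'word_count' }.to_msgpack   # packed byte String
restored = MessagePack.unpack(bytes)          # => {"op"=>"word_count"}
```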
<rtyler>
now you're cookin' with gas
<headius[m]>
I don't know how Spark hooks up serialization though
<rtyler>
I sent a message off to the dev list because I'm not finding it right now either