Ping failed. Terminating

Ping failed. Terminating

Keith Kowalczykowski-3
Hi All,

We recently updated our Hudson install from 1.332 to 1.338 and are now not
able to start our Hudson slaves. We launch the slave agent via ssh (i.e. ssh
USER@HOST "java -jar bin/slave.jar"), which has worked flawlessly in the
past. Now, when the slave launches, it waits for 1 minute (the timeout in
PingThread.java) and then dies with "Ping failed. Terminating". Below is the
stack trace. Any ideas on what the issue is and/or how to debug it further?

  -Keith

<===[HUDSON REMOTING CAPACITY]===>channel started
Ping failed. Terminating
Unexpected error in launching a slave. This is probably a bug in Hudson
hudson.remoting.RequestAbortedException:
hudson.remoting.RequestAbortedException: java.io.EOFException
 at hudson.remoting.Request.call(Request.java:137)
 at hudson.remoting.Channel.call(Channel.java:547)
 at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:305)
 at hudson.slaves.CommandLauncher.launch(CommandLauncher.java:111)
 at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:180)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
[12/22/09 12:49:49] slave agent was terminated
Caused by: hudson.remoting.RequestAbortedException: java.io.EOFException
 at hudson.remoting.Request.abort(Request.java:257)
 at hudson.remoting.Channel.terminate(Channel.java:594)
 at hudson.remoting.Channel$ReaderThread.run(Channel.java:872)
java.io.EOFException
 at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
 at hudson.remoting.Channel$ReaderThread.run(Channel.java:852)
Caused by: java.io.EOFException
 at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
 at hudson.remoting.Channel$ReaderThread.run(Channel.java:852)
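A note for reading the trace: the innermost frames are ObjectInputStream.readObject reaching the end of its input. That is what a reader sees when the other end of the stream has gone away, which would be consistent with the slave process exiting after its ping timeout. The standalone sketch below reproduces only that symptom with in-memory streams. It is not Hudson code, and the class name EofDemo is made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical demo class; not part of Hudson.
class EofDemo {
    /** Returns true if reading past the last serialized object raises EOFException. */
    static boolean secondReadHitsEof() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(buf);
            out.writeObject("hello"); // the only message the "peer" ever sends
            out.close();              // the peer goes away; the stream ends here

            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()));
            in.readObject();          // the first message arrives normally
            try {
                in.readObject();      // nothing is left to read
                return false;
            } catch (EOFException e) {
                return true;          // same exception type as in the trace above
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("EOF on second read: " + secondReadHitsEof());
    }
}
```

If this reading is right, the EOFException is a consequence of the slave going away rather than the root cause, and the more useful question is why the ping never completed.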



---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: Ping failed. Terminating

Michael Donohue
Hi Keith,
This really belongs on the users list, in order to reach a wider audience, such as other users hitting the same problem.  The dev list is for people who are working on Hudson's code, or the code for a plugin.

-Michael
(646) 833-8884


On Tue, Dec 22, 2009 at 7:50 PM, Keith Kowalczykowski <[hidden email]> wrote:
> [...] Now, when the slave launches, it waits for 1 minute (the timeout in
> PingThread.java) and then dies with "Ping failed. Terminating". [...]

Re: Ping failed. Terminating

Keith Kowalczykowski-3
Hi Michael,

Thanks for the feedback. As I’m sure Kohsuke and others can attest, I almost always track down the issues that we’re running into and submit a patch. Therefore, I consider myself more a dev than a user.

In this case, I’m looking for advice from the dev community in two areas:

  1. What has changed in the stdin/stdout remoting capability between 1.332 and 1.338 that may have affected this?
  2. Which classes are involved in the stdin/stdout remoting, so I can check their log output and/or inspect their diffs?
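For the log output part, one possible approach is sketched below. It assumes the remoting classes log through java.util.logging under logger names that begin with "hudson.remoting"; that naming is an assumption and is not confirmed anywhere in this thread. The class name RemotingLogSetup is made up for illustration.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; the "hudson.remoting" logger name is an assumption.
class RemotingLogSetup {
    // Hold a strong reference: java.util.logging keeps loggers weakly reachable,
    // so an unreferenced logger can lose its configuration.
    private static final Logger REMOTING = Logger.getLogger("hudson.remoting");

    /** Lets every record logged under hudson.remoting.* through to stderr. */
    static Logger enableVerboseLogging() {
        REMOTING.setLevel(Level.ALL);
        Handler handler = new ConsoleHandler(); // ConsoleHandler writes to System.err
        handler.setLevel(Level.ALL);            // handlers default to INFO
        REMOTING.addHandler(handler);
        return REMOTING;
    }

    public static void main(String[] args) {
        enableVerboseLogging();
        // A child logger inherits the level set on its parent.
        Logger.getLogger("hudson.remoting.Channel").fine("sample FINE message");
    }
}
```

Writing to stderr matters for a slave launched over ssh as described earlier, because stdout carries the channel itself.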

I will also cross-post to the users list in case it's useful there.

    -Keith


On 12/22/09 9:07 PM, "Michael Donohue" <michael.donohue@...> wrote:

> Hi Keith,
> This really belongs on the users list, in order to reach a wider audience,
> such as other users hitting the same problem.  The dev list is for people who
> are working on Hudson's code, or the code for a plugin.

Re: Ping failed. Terminating/MultiClassLoader Issue

Keith Kowalczykowski-3
In reply to this post by Keith Kowalczykowski-3
Hi All,

I'm going to revive this issue, as I now have more info to go by. I did
a little more investigative work and tracked the regression down to
version 1.337. I suspect this has to do with the MultiClassLoader
support that Kohsuke relanded in that version
(http://issues.hudson-ci.org/browse/HUDSON-5048). Any advice on how to
debug this further would be much appreciated.

     -Keith


Re: Ping failed. Terminating/MultiClassLoader Issue

Keith Kowalczykowski-3
Kohsuke,

I investigated this further, and the primary difference I can see
between our broken Hudson install and another Hudson install that is
working is that we have a mix of architectures: our master is 64-bit,
while several of our slaves are 32-bit. I dug around in the remote
class loader code some more, and noticed that
RemoteClassLoader.ClassFile actually contains the binary stream of the
class file. Is this class file actually being serialized and sent from
the master to the slave? If so, what are the ramifications of this on a
multi-architecture environment? What if a class has JNI code? What about
system classes in rt.jar? Can you please provide some insight? This is
fairly convoluted stuff and not easy to debug.

     -Keith


Re: Ping failed. Terminating/MultiClassLoader Issue

Andrew Chandler
We use multi-architecture setups just fine: we have 64-bit and 32-bit slaves, plus Solaris, Linux and Windows slaves. All pure Java code should be fine, even with JNI. It really just comes down to the shared library for the JNI code, which has to be present and compiled for the appropriate slave. The bytecode is handled by the JVM regardless. Consider that you don't have a different jar for each platform you send to: the .class files are bytecode in file form, so the fact that it's being serialized and sent over the wire is perfectly fine. With JNI you have a single class file (compiled once on your favorite JDK) and it will work on all platforms, provided of course that you take the JNI headers and compile your shared library on each of the native platforms. Does that help?
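To illustrate the point about bytecode, here is a minimal sketch that reads a class's bytes, hands them to a separate class loader as a plain byte array, and runs the result. It shows the general technique only and is not how Hudson's RemoteClassLoader is implemented; ClassBytesDemo, Greeter and BytesLoader are names made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo; not Hudson's RemoteClassLoader.
class ClassBytesDemo {
    /** A tiny payload class whose bytes get "shipped". */
    public static class Greeter {
        public String greet() { return "hello from shipped bytecode"; }
    }

    /** Defines a class from raw bytes, as a receiving side might. */
    static class BytesLoader extends ClassLoader {
        BytesLoader() { super(null); } // bootstrap parent only, so this is a separate copy
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Reads the .class file of c from its own class loader. */
    static byte[] readClassBytes(Class<?> c) throws IOException {
        String resource = c.getName().replace('.', '/') + ".class";
        try (InputStream in = c.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) throw new IOException("class file not found: " + resource);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) out.write(chunk, 0, n);
            return out.toByteArray();
        }
    }

    /** Every class file starts with the same four bytes on every platform. */
    static boolean hasClassMagic(byte[] b) {
        return b.length >= 4 && (b[0] & 0xFF) == 0xCA && (b[1] & 0xFF) == 0xFE
            && (b[2] & 0xFF) == 0xBA && (b[3] & 0xFF) == 0xBE;
    }

    /** Loads Greeter from its bytes in a fresh loader and calls greet(). */
    static String shipAndInvoke() {
        try {
            byte[] bytes = readClassBytes(Greeter.class);
            Class<?> remote = new BytesLoader().define(Greeter.class.getName(), bytes);
            Object greeter = remote.getDeclaredConstructor().newInstance();
            return (String) remote.getMethod("greet").invoke(greeter);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(shipAndInvoke());
    }
}
```

Nothing in the byte array depends on the word size or operating system of the machine that produced it, which is why the same jar runs on 32-bit and 64-bit JVMs.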

On Fri, 2010-02-19 at 18:28 -0800, Keith Kowalczykowski wrote:
> [...] Is this class file actually being serialized and sent from
> the master to the slave? [...] What if a class has JNI code? [...]

Re: Ping failed. Terminating/MultiClassLoader Issue

Keith Kowalczykowski-3
Thanks Andrew. Is your multi-architecture system running on 1.337+?

I realize the bytecode should be fine to serialize and send; however, I'm trying to get a better grasp of how and when this serialization occurs. Maybe a plugin is being sent over that uses JNI code for which a native library doesn't exist? Any insight from Kohsuke, or others, on how to debug this further would be greatly appreciated, as right now I either have a non-working environment or am stuck at 1.336.

    -Keith
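One way to test the missing-native-library theory on a given slave is a small standalone check like the sketch below. It prints what the JVM reports about its platform and tries to load a library by name. The class name NativeLibCheck and the library name are placeholders, not anything taken from Hudson or a plugin.

```java
// Hypothetical diagnostic; run it with the same java binary the slave uses.
class NativeLibCheck {
    /** Describes the JVM this code runs in, for comparing master and slave. */
    static String describeJvm() {
        return System.getProperty("os.name") + " / "
            + System.getProperty("os.arch") + " / java "
            + System.getProperty("java.version");
    }

    /** Returns null if the library loaded, otherwise the linker's error message. */
    static String tryLoad(String libName) {
        try {
            System.loadLibrary(libName); // searches java.library.path
            return null;
        } catch (UnsatisfiedLinkError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeJvm());
        // Placeholder name; substitute the library a suspect plugin depends on.
        String error = tryLoad("hypothetical_native_lib");
        System.out.println(error == null ? "loaded" : "not loadable here: " + error);
    }
}
```

A class with native methods loads fine from shipped bytecode; the failure only shows up as an UnsatisfiedLinkError when the library itself cannot be found or does not match the platform.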
