Slaves dying when run on VMware


stephenconnolly
I have two slaves:

* both running the same version of Ubuntu 7.10 Server.

* both configured identically and connected to the same NTP server and
LAN subnet.

The only difference is that one is a real machine, while the other is
a virtual machine.

The real machine's slave agent never dies.

The virtual machine's slave agent dies without fail overnight, e.g.

[12/05/07 16:50:36] Launching slave agent
$ plink maven-mirror2 /var/hudson/bin/launch
channel started
Copied maven-agent.jar
Copied maven-interceptor.jar
This is a Unix slave
FATAL ERROR: Network error: Software caused connection abort
%s slave agent was terminated

java.io.EOFException
        at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2498)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1273)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
        at hudson.remoting.Channel$ReaderThread.run(Channel.java:426)

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: Slaves dying when run on VMware

stephenconnolly
More strangeness... The VM slave dies while running a job, whereas the
real slave does not die when running a clone of that job!

+rsync -v -t -l -r --exclude **/.svn/ --exclude .bak --exclude **/CVS/
_____________

receiving file list ... FATAL: command execution failed
hudson.util.IOException2: Failed to join the process
        at hudson.Proc$RemoteProc.join(Proc.java:196)
        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:61)
        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:35)
        at hudson.model.Build$RunnerImpl.build(Build.java:143)
        at hudson.model.Build$RunnerImpl.doRun(Build.java:123)
        at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:182)
        at hudson.model.Run.run(Run.java:579)
        at hudson.model.Build.run(Build.java:103)
        at hudson.model.ResourceController.execute(ResourceController.java:70)
        at hudson.model.Executor.run(Executor.java:62)
Caused by: java.util.concurrent.ExecutionException:
hudson.remoting.RequestAbortedException: java.io.EOFException
        at hudson.remoting.Request$1.get(Request.java:131)
        at hudson.remoting.Request$1.get(Request.java:109)
        at hudson.remoting.FutureAdapter.get(FutureAdapter.java:32)
        at hudson.Proc$RemoteProc.join(Proc.java:188)
        ... 9 more
Caused by: hudson.remoting.RequestAbortedException: java.io.EOFException
        at hudson.remoting.Request.abort(Request.java:166)
        at hudson.remoting.Channel.terminate(Channel.java:311)
        at hudson.remoting.Channel$ReaderThread.run(Channel.java:445)
Caused by: java.io.EOFException
        at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2498)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1273)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
        at hudson.remoting.Channel$ReaderThread.run(Channel.java:426)
FATAL: Unable to delete script file /tmp/hudson21278.sh
hudson.util.IOException2: remote file operation failed
        at hudson.FilePath.act(FilePath.java:276)
        at hudson.FilePath.delete(FilePath.java:455)
        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:71)
        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:35)
        at hudson.model.Build$RunnerImpl.build(Build.java:143)
        at hudson.model.Build$RunnerImpl.doRun(Build.java:123)
        at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:182)
        at hudson.model.Run.run(Run.java:579)
        at hudson.model.Build.run(Build.java:103)
        at hudson.model.ResourceController.execute(ResourceController.java:70)
        at hudson.model.Executor.run(Executor.java:62)
Caused by: java.io.IOException: already closed
        at hudson.remoting.Channel.send(Channel.java:189)
        at hudson.remoting.Request.call(Request.java:75)
        at hudson.remoting.Channel.call(Channel.java:264)
        at hudson.FilePath.act(FilePath.java:273)
        ... 10 more
FATAL: null
java.lang.NullPointerException
        at hudson.tasks.MailSender.createFailureMail(MailSender.java:191)
        at hudson.tasks.MailSender.getMail(MailSender.java:80)
        at hudson.tasks.MailSender.execute(MailSender.java:57)
        at hudson.tasks.Mailer._perform(Mailer.java:73)
        at hudson.tasks.Mailer.perform(Mailer.java:67)
        at hudson.model.Build$RunnerImpl.post2(Build.java:138)
        at hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:244)
        at hudson.model.Run.run(Run.java:597)
        at hudson.model.Build.run(Build.java:103)
        at hudson.model.ResourceController.execute(ResourceController.java:70)
        at hudson.model.Executor.run(Executor.java:62)





Re: Slaves dying when run on VMware

stephenconnolly
Actually, scratch that: the job completed once I restarted the slave.
It's not the job; I think the slave simply dies on me.



Re: Slaves dying when run on VMware

Kohsuke Kawaguchi
Administrator
In reply to this post by stephenconnolly

Could it be that the VMware network bridge has some kind of TCP
inactivity timeout?

Maybe we need an option to send keep-alives to slaves?
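
To make the keep-alive idea concrete, here is a minimal sketch of an
application-level keep-alive: a timer that writes a no-op byte to an idle
stream so that anything in the path with an inactivity timeout keeps seeing
traffic. Everything in it is an assumption for illustration: the class name
KeepAliveSketch, the PING marker and the period are made up, and this is not
Hudson code. Hudson's channel carries serialized Java objects, so a real
option would have to send a no-op message the channel protocol understands
rather than a raw byte. Java's own Socket.setKeepAlive(true) would not help
directly here either, since the connection in this thread belongs to plink
rather than to a Java socket.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of Hudson): periodically writes a no-op byte. */
final class KeepAliveSketch implements AutoCloseable {
    /** Assumed no-op marker; the receiving side would have to ignore it. */
    static final int PING = 0;

    // Single daemon thread so the timer never keeps the JVM alive on its own.
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "keep-alive");
                t.setDaemon(true);
                return t;
            });

    /** Starts sending a ping to {@code out} once per period. */
    KeepAliveSketch(OutputStream out, long period, TimeUnit unit) {
        timer.scheduleAtFixedRate(() -> sendPing(out), period, period, unit);
    }

    /** Writes one ping; returns false if the stream is no longer writable. */
    static boolean sendPing(OutputStream out) {
        try {
            // Lock the stream so a ping cannot interleave with another writer
            // that synchronizes on the same object.
            synchronized (out) {
                out.write(PING);
                out.flush();
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Stops the timer; the underlying stream is left open. */
    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

Usage would be something like new KeepAliveSketch(out, 5, TimeUnit.MINUTES),
with the period chosen to be shorter than whatever timeout is suspected.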



--
Kohsuke Kawaguchi
Sun Microsystems                   [hidden email]


Re: Slaves dying when run on VMware

stephenconnolly
On Dec 18, 2007 2:36 AM, Kohsuke Kawaguchi <[hidden email]> wrote:
> Maybe we need to have an option to do keep alive in slaves?

Auto-relaunch for the non-JNLP slaves would be good too.
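
Roughly what I have in mind, as a sketch only: a small supervisor that
re-runs the launch command whenever the agent exits abnormally, up to some
limit. The names here (RelaunchSketch, superviseLaunches, maxLaunches,
backoffMillis) are hypothetical and not part of Hudson, and treating exit
code 0 as "stopped on purpose, do not relaunch" is an assumption.

```java
import java.util.concurrent.Callable;

/** Hypothetical supervisor (not Hudson code): relaunches an agent that dies. */
final class RelaunchSketch {
    /**
     * Calls {@code launcher} until it returns exit code 0 (assumed to mean a
     * deliberate shutdown) or {@code maxLaunches} attempts have been made.
     * A launcher that throws is treated like a failed launch.
     *
     * @return the number of launches actually performed
     */
    static int superviseLaunches(Callable<Integer> launcher,
                                 int maxLaunches,
                                 long backoffMillis) {
        int launches = 0;
        while (launches < maxLaunches) {
            launches++;
            int exitCode;
            try {
                exitCode = launcher.call();   // blocks while the agent runs
            } catch (Exception e) {
                exitCode = -1;                // could not launch, or agent crashed
            }
            if (exitCode == 0) {
                break;                        // clean exit: stop supervising
            }
            if (launches < maxLaunches) {
                try {
                    Thread.sleep(backoffMillis); // wait before the next attempt
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return launches;          // supervisor itself was told to stop
                }
            }
        }
        return launches;
    }
}
```

For a slave started the way mine is, the launcher could be something like
() -> new ProcessBuilder("plink", "maven-mirror2", "/var/hudson/bin/launch").start().waitFor(),
although the master would still need to reattach its channel to the new
process's stdin/stdout, which this sketch does not attempt.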
