Single exception in SCM polling on slave prevents future polling


Single exception in SCM polling on slave prevents future polling

Kianwin Ong
Hi,

I am running Hudson 1.105 (actually, a snapshot taken before it was officially released). A few days ago, an exception occurred during SCM polling on a slave node, probably due to a temporary network outage. Since then, SCM polling on that slave has stopped working entirely: only one of the usual four lines shows up in the polling log, and there are no other log messages.

The fateful exception (recorded in the system log) was:
May 1, 2007 12:11:50 PM hudson.triggers.SCMTrigger$Runner runPolling
SEVERE: Failed to record SCM polling
hudson.remoting.RequestAbortedException: java.io.EOFException
        at hudson.remoting.Request.abort(Request.java:166)
        at hudson.remoting.Channel.terminate(Channel.java:280)
        at hudson.remoting.Channel$ReaderThread.run(Channel.java:391)
Caused by: java.io.EOFException
        at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2498)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1273)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
        at hudson.remoting.Channel$ReaderThread.run(Channel.java:377)

May 1, 2007 12:11:50 PM hudson.remoting.Channel$ReaderThread run
SEVERE: I/O error in channel om.ucsd.edu
java.io.EOFException
        at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2498)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1273)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
        at hudson.remoting.Channel$ReaderThread.run(Channel.java:377)

Restarting Hudson solved the problem for me.

So far, this has been a one-time occurrence, and I have not tried to reproduce the underlying network outage. I was wondering, though, whether this points to a general issue with exception handling in polling. Let me know if I should file an issue for this.
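
To illustrate the general concern (this is not a diagnosis of Hudson's code, which is not shown in this thread), here is a minimal Java sketch of one well-known way a single exception can silence a periodic task. With `ScheduledExecutorService.scheduleAtFixedRate`, an exception that escapes the task suppresses all later executions. The class `PollingSketch` and the method `pollOnce` are hypothetical names invented for this example, and the simulated failure stands in for the network outage.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PollingSketch {

    /** Hypothetical stand-in for one SCM poll; it fails once, on the second call. */
    static void pollOnce(AtomicInteger calls) {
        if (calls.incrementAndGet() == 2) {
            throw new IllegalStateException("simulated channel failure");
        }
    }

    /** Schedules a poll every 10 ms for about 200 ms and returns how many polls were attempted. */
    static int run(boolean guard) throws InterruptedException {
        AtomicInteger calls = new AtomicInteger();
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        Runnable task = () -> {
            if (!guard) {
                // The exception escapes the task, so the executor suppresses all later runs.
                pollOnce(calls);
                return;
            }
            try {
                pollOnce(calls);
            } catch (RuntimeException e) {
                // Report the failure and return normally so the next scheduled poll still runs.
                System.err.println("poll failed: " + e.getMessage());
            }
        };
        timer.scheduleAtFixedRate(task, 0, 10, TimeUnit.MILLISECONDS);
        Thread.sleep(200);
        timer.shutdownNow();
        return calls.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("unguarded polls attempted: " + run(false));
        System.out.println("guarded polls attempted:   " + run(true));
    }
}
```

In the unguarded run, polling stops after the failing call and nothing further is logged, which resembles the symptom described above. Whether Hudson's trigger behaves this way would have to be checked against its source.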

- Kian Win

Re: Single exception in SCM polling on slave prevents future polling

Kohsuke Kawaguchi
Administrator
It appears to me that you are using distributed builds, and I suspect
the connection between the master and the slave went bad. Normally that
should cause a disconnection, and then you'll see an "offline"
indication in the executor list. Have you seen that? Otherwise, it
must have gone bad in a way that didn't cause a disconnection.

Have you tried to build the project manually? Did that succeed?

I'll see if I can figure it out from the stack trace.

Can you also remind me what the "only one out of the usual four" messages are?

2007/5/7, Kianwin Ong <[hidden email]>:

> --
> View this message in context: http://www.nabble.com/Single-exception-in-SCM-polling-on-slave-prevents-future-polling-tf3706883.html#a10367362
> Sent from the Hudson users mailing list archive at Nabble.com.

--
Kohsuke Kawaguchi

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: Single exception in SCM polling on slave prevents future polling

Kianwin Ong
kohsuke wrote
> It appears to me that you are using distributed builds, and I suspect
> the connection between the master and the slave went bad. Normally that
> should cause a disconnection, and then you'll see an "offline"
> indication in the executor list. Have you seen that? Otherwise, it
> must have gone bad in a way that didn't cause a disconnection.

There were two jobs scheduled on the same slave, both triggered by polling. If I remember correctly, the exception caused the slave to disconnect, but after I manually reconnected it, one job polled correctly while the job that had hit the exception remained wedged.

kohsuke wrote
> Have you tried to build the project manually? Did that succeed?

I did not try to re-build the wedged job manually. Instead, I restarted Hudson, and everything has worked fine since.

kohsuke wrote
> I'll see if I can figure it out from the stack trace.
>
> Can you also remind me what the "only one out of the usual four" messages are?

The Subversion polling log displayed only a single line, e.g.
Started on May 1, 2007 12:11:50 PM

whereas it usually displays four lines, e.g.
Started on May 1, 2007 12:11:50 PM
Revision:3152
Done. Took 0 seconds
No changes
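
A small sketch of how the difference between the two logs above could be checked automatically. It assumes only what the samples show: a normal polling log has a "Started on" line followed later by a "Done." line. The class `PollingLogCheck` and the method `looksComplete` are hypothetical names for this example and are not part of Hudson, and the line prefixes are taken from the samples rather than from any documented log format.

```java
import java.util.Arrays;
import java.util.List;

public class PollingLogCheck {

    /**
     * Returns true if the log contains a "Started on" line that is later
     * followed by a "Done." line, i.e. the poll appears to have finished.
     */
    static boolean looksComplete(List<String> lines) {
        boolean started = false;
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("Started on")) {
                started = true;
            } else if (started && trimmed.startsWith("Done.")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> wedged = Arrays.asList("Started on May 1, 2007 12:11:50 PM");
        List<String> normal = Arrays.asList(
                "Started on May 1, 2007 12:11:50 PM",
                "Revision:3152",
                "Done. Took 0 seconds",
                "No changes");
        System.out.println("wedged log complete? " + looksComplete(wedged));
        System.out.println("normal log complete? " + looksComplete(normal));
    }
}
```

A check like this could flag a job whose polling log never reaches the "Done." line, which is how the wedged job showed up here.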

Thanks for checking on this, and hope this helps.

- Kian Win