Hung slave

Hayes, Peter
We are again seeing the hanging slave issue.  The slave agent Java process is gone, and restarting it has no effect.  The stack traces show that one of the monitoring threads is in the RUNNABLE state but is not making progress.  All of the other monitoring threads are waiting on this one because it owns the hudson.remoting.Channel lock.
 
Name: Monitoring thread for Clock Difference started on Mon Nov 23 01:46:21 EST 2009
State: RUNNABLE
Total blocked: 3  Total waited: 2
 
Stack trace:
java.io.FileOutputStream.writeBytes(Native Method)
java.io.FileOutputStream.write(FileOutputStream.java:260)
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
   - locked java.io.BufferedOutputStream@5042589b
java.io.ObjectOutputStream$BlockDataOutputStream.flush(ObjectOutputStream.java:1784)
java.io.ObjectOutputStream.flush(ObjectOutputStream.java:691)
hudson.remoting.Channel.send(Channel.java:416)
   - locked hudson.remoting.Channel@b583a80
hudson.remoting.Request.call(Request.java:104)
   - locked hudson.remoting.UserRequest@e1f6102
   - locked hudson.remoting.Channel@b583a80
hudson.remoting.Channel.call(Channel.java:549)
hudson.model.Slave.getClockDifference(Slave.java:230)
hudson.node_monitors.ClockMonitor$1.monitor(ClockMonitor.java:53)
hudson.node_monitors.ClockMonitor$1.monitor(ClockMonitor.java:49)
hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:200)
 
This blocks our jobs from running, because the executor threads are in the BLOCKED state waiting for the Channel lock.
 
Name: Executor #2 for ioappl01dev : executing iss-ace-ui-1.0 #42
State: BLOCKED on hudson.remoting.Channel@b583a80 owned by: Monitoring thread for Clock Difference started on Mon Nov 23 01:46:21 EST 2009
Total blocked: 32,537  Total waited: 32,385
 
Stack trace:
hudson.remoting.Request.call(Request.java:100)
hudson.remoting.Channel.call(Channel.java:549)
hudson.FilePath.act(FilePath.java:667)
hudson.FilePath.act(FilePath.java:660)
hudson.FilePath.mkdirs(FilePath.java:724)
hudson.model.AbstractProject.checkout(AbstractProject.java:977)
hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:421)
hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:370)
hudson.model.Run.run(Run.java:1120)
hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
hudson.model.ResourceController.execute(ResourceController.java:88)
hudson.model.Executor.run(Executor.java:123)
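
The lock-owner relationship reported in the dump above (a thread BLOCKED on a monitor "owned by" another thread) can be reproduced in isolation. The following is a minimal sketch, not Hudson code: `channelLock` is a hypothetical stand-in for the hudson.remoting.Channel monitor, and the thread names "holder" and "waiter" are illustrative. One difference from the dump: the holder here parks on a latch, whereas the real monitoring thread is stuck in a native write and is therefore reported as RUNNABLE.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOwnerDemo {

    /** Returns { state of the waiting thread, name of the thread that owns the lock }. */
    public static String[] observe() throws InterruptedException {
        final Object channelLock = new Object();          // hypothetical stand-in for the Channel monitor
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        // Takes the lock and keeps it, like the stuck monitoring thread.
        Thread holder = new Thread(() -> {
            synchronized (channelLock) {
                held.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");

        // Tries to take the same lock, like the executor thread.
        Thread waiter = new Thread(() -> {
            synchronized (channelLock) {
                // nothing to do; we only need the monitor entry
            }
        }, "waiter");

        holder.start();
        held.await();        // make sure the lock is held before the waiter starts
        waiter.start();

        // Poll the JVM's thread information until the waiter is blocked on the monitor.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo info;
        do {
            Thread.sleep(10);
            info = mx.getThreadInfo(waiter.getId());
        } while (info == null || info.getThreadState() != Thread.State.BLOCKED);

        String[] result = { info.getThreadState().name(), info.getLockOwnerName() };

        release.countDown(); // let both threads finish
        holder.join();
        waiter.join();
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        String[] r = observe();
        System.out.println("State: " + r[0] + " owned by: " + r[1]);
    }
}
```

The same `ThreadMXBean` query can be used to find which thread owns a contended lock without reading a full dump by hand.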
 
This also has the effect of hanging all of our SCM polling threads as they are waiting for that Channel lock.
 
Name: SCM polling for hudson.model.FreeStyleProject@196a1a66[iss-ace-application-data-1.0] / waiting for hudson.remoting.Channel@b583a80:ioappl01dev
State: TIMED_WAITING on hudson.remoting.UserRequest@2a6368d1
Total blocked: 1  Total waited: 1,225
 
Stack trace:
java.lang.Object.wait(Native Method)
hudson.remoting.Request.call(Request.java:120)
hudson.remoting.Channel.call(Channel.java:549)
hudson.FilePath.act(FilePath.java:667)
hudson.FilePath.act(FilePath.java:660)
hudson.FilePath.exists(FilePath.java:919)
hudson.model.AbstractProject.pollSCMChanges(AbstractProject.java:1011)
hudson.triggers.SCMTrigger$Runner.runPolling(SCMTrigger.java:317)
hudson.triggers.SCMTrigger$Runner.run(SCMTrigger.java:344)
hudson.util.SequentialExecutionQueue$QueueEntry.run(SequentialExecutionQueue.java:118)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
java.util.concurrent.FutureTask.run(FutureTask.java:138)
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
java.lang.Thread.run(Thread.java:619)
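
Note that the polling thread in this dump is reported as TIMED_WAITING on a UserRequest object inside Object.wait, which is the state a thread shows while it waits with a timeout for a reply to arrive. A minimal sketch of that state, assuming nothing about Hudson's actual implementation (`request` is a hypothetical stand-in for the UserRequest object, and the 30-second timeout is arbitrary):

```java
public class TimedWaitDemo {

    /** Starts a thread that waits with a timeout and returns the state observed for it. */
    public static Thread.State observe() throws InterruptedException {
        final Object request = new Object();   // hypothetical stand-in for the request object
        final boolean[] done = { false };      // guarded by the request monitor

        Thread poller = new Thread(() -> {
            synchronized (request) {
                while (!done[0]) {
                    try {
                        request.wait(30_000);  // timed wait: wakes on notify or after 30 s
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        }, "poller");
        poller.start();

        // Poll until the thread has entered the timed wait.
        Thread.State state;
        do {
            Thread.sleep(10);
            state = poller.getState();
        } while (state != Thread.State.TIMED_WAITING);

        // Deliver the "response" so the poller can finish.
        synchronized (request) {
            done[0] = true;
            request.notifyAll();
        }
        poller.join();
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("State: " + observe());
    }
}
```

If no response is ever delivered, a thread in this kind of loop keeps re-entering the timed wait, which would be consistent with a "Total waited" count that keeps growing.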
 
Any ideas?  Any thoughts on why nothing is left behind when the slave exits?
 
Peter Hayes
Director, Software Engineering/Development
Asset Allocation Technology
 
Fidelity Investments
245 Summer St.
Boston, MA 02111
tel: 617-392-1046
fax: 617-563-3295
 
Notice:  All e-mail sent to or from Fidelity Investments is subject to retention, monitoring and/or review by Fidelity personnel.
 
The information in this e-mail and in any attachments is intended solely for the attention and use of the named addressee(s) and may contain information that is considered privileged, proprietary, confidential, and/or exempt from disclosure under applicable law.  If you are not the intended recipient of this email or if you have otherwise received this email in error, please immediately notify me by replying to this message or by telephone.   Any use, dissemination, distribution or copying of this e-mail is strictly prohibited without authorization from Fidelity Investments.
 
 
 

RE: Hung slave

Hayes, Peter
This morning the slave hung again.  Here are the contents of the slave log:
 
channel started
This is a Unix slave
Copied maven-agent.jar
Copied maven-interceptor.jar
Copied maven2.1-interceptor.jar
Ping failed. Terminating
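
The last log line suggests a ping check that gives up when the other side stops answering. As a rough illustration only, a ping with a timeout could look like the sketch below. This is an assumption about the general technique, not the Hudson remoting implementation: the class `PingWatchdog`, the method `ping`, and the `remoteCall` parameter are all hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PingWatchdog {

    /**
     * Runs remoteCall on a separate daemon thread and reports whether it
     * completed normally within timeoutMillis.
     */
    public static boolean ping(Callable<Void> remoteCall, long timeoutMillis)
            throws InterruptedException {
        ExecutorService exec = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "ping");
            t.setDaemon(true);             // never keep the JVM alive for a stuck ping
            return t;
        });
        Future<Void> reply = exec.submit(remoteCall);
        try {
            reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;                   // the other side answered in time
        } catch (TimeoutException | ExecutionException e) {
            reply.cancel(true);            // interrupt the stuck or failed call
            return false;
        } finally {
            exec.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical "remote call" that never answers within the timeout.
        boolean alive = ping(() -> { Thread.sleep(5_000); return null; }, 100);
        if (!alive) {
            System.out.println("Ping failed. Terminating");
        }
    }
}
```

A process that exits immediately on a failed ping, without writing anything further, would leave a log ending exactly as the one above does.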
 
Has anyone had a similar experience?  We’ve found that Hudson has to be restarted to correct the issue.
 
Peter
 
_____________________________________________
From: Hayes, Peter
Sent: Monday, November 23, 2009 9:32 AM
To: '[hidden email]'
Subject: Hung slave
 
 