[JIRA] Commented: (HUDSON-7707) Multiple dead executors on slaves post 1.379 upgrade



Hudson issues mailing list

    [ http://issues.hudson-ci.org/browse/HUDSON-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=143397#action_143397 ]

bertrandgressier commented on HUDSON-7707:
------------------------------------------

Hello,

I have exactly the same problem on my forge.
I have 1 master with 3 slaves.
I upgraded my server from 1.374 to 1.384, and since then I have been getting a lot of exceptions. They are the same as in the original report.

I tried to restart the queues with a Groovy script, but without success. All I can do is list the crashed executors:

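import hudson.model.*

// Hudson singleton (available in the script console)
def hudson = Hudson.instance

// Print every dead executor and the exception that killed it
hudson.computers.each { computer ->
  computer.executors.each { executor ->
    if (executor.causeOfDeath != null) {
      println "${computer.caption} : ${executor.displayName}=============="
      println executor.causeOfDeath
    }
  }
}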

And my result:

Maître : Executor #0==============
java.lang.NullPointerException
Esclave Agent-1 : Executor #1==============
java.lang.AbstractMethodError
Esclave Agent-2 : Executor #1==============
java.lang.AbstractMethodError
Esclave Agent-2 : Executor #3==============
java.lang.AbstractMethodError
Esclave AgentC-1 : Executor #0==============
java.lang.AbstractMethodError
Esclave AgentC-1 : Executor #1==============
java.lang.AbstractMethodError
Esclave AgentC-1 : Executor #2==============
java.lang.AbstractMethodError


Do you have any idea how to restart the queue without restarting Hudson?

Can the fix be included in the next release?

In the meantime, can I downgrade Hudson to 1.378?



> Multiple dead executors on slaves post 1.379 upgrade
> ----------------------------------------------------
>
>                 Key: HUDSON-7707
>                 URL: http://issues.hudson-ci.org/browse/HUDSON-7707
>             Project: Hudson
>          Issue Type: Bug
>          Components: master-slave
>    Affects Versions: current
>         Environment: CentOS Linux 5.x kernel 2.6.18-194.3.1.el5
> hudson.war 1.379 under Tomcat 5.5.28
> Slave OSs: CentOS Linux 5.x, Windows XP 32bit, Windows Server 2008 64bit
>            Reporter: dru_n
>             Fix For: current
>
>
> Post upgrade to 1.379 we are experiencing increased occurrences of dead executors on our slave systems.  Prior to this release we had never encountered a dead executor on any system, master or slave.  Immediately after deploying the 1.379 WAR, 6 executors spread out among a variety of slave platforms (Linux, WinXP 32bit, Win2k8 64bit) died.  Today one more died on a Linux slave.  Restarting Hudson clears out the dead executors, but disconnecting and reconnecting the slaves does not.  I have not tried rebooting the slaves themselves yet.  The stack trace below has consistently been the output associated with the dead executors.
> java.lang.AbstractMethodError
> at hudson.model.Executor.getEstimatedRemainingTimeMillis(Executor.java:340)
> at hudson.model.queue.LoadPredictor$CurrentlyRunningTasks.predict(LoadPredictor.java:77)
> at hudson.model.queue.MappingWorksheet.<init>(MappingWorksheet.java:303)
> at hudson.model.Queue.pop(Queue.java:753)
> at hudson.model.Executor.grabJob(Executor.java:175)
> at hudson.model.Executor.run(Executor.java:113)



