Jenkins slave (EC2 AWS plugin) fails occasionally when build takes too long with error "Caused: java.io.IOException: Unexpected termination of the channel"


Lax Clarke
I have Jenkins configured to launch EC2 instances (via AWS plugin) to execute a build.
The actual build steps use the Execute Shell method and launch Ansible scripts.

I find that if the Ansible script runs for too long, the Jenkins slave on the EC2 instance goes down.
The Jenkins GUI shows this error:

Connection was broken

java.io.EOFException
	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2638)
	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3113)
	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:853)
	at java.io.ObjectInputStream.<init>(ObjectInputStream.java:349)
	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:59)
Caused: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:73)



Some googling led me to this page: https://wiki.jenkins.io/display/JENKINS/Remoting+issue
It suggests I check these log files:
1) connection logs on master - I checked these, and they contain the same information as the stack trace above.
2) slave logs - I do not know where to find these. The locations suggested do not exist on my slave.

I even inspected the files that the slave java process opens, and the only file that seems like a log file ends up being empty:

ubuntu@ip-172-31-93-175:~/support$ ps auxww | grep java | grep slave
ubuntu    25235  8.0  0.5 12211452 157616 ?     Ssl  22:23   0:05 java -jar /tmp/slave.jar
ubuntu@ip-172-31-93-175:~/support$ sudo lsof | grep 25235  | grep -i log | awk '{print $NF}' | sort | uniq
/home/ubuntu/support/all_2017-09-11_22.23.07.log
ubuntu@ip-172-31-93-175:~/support$ cat /home/ubuntu/support/all_2017-09-11_22.23.07.log 
ubuntu@ip-172-31-93-175:~/support$ 
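
For anyone repeating this check, here is a minimal sketch of the same inspection that reads /proc directly instead of grepping the full lsof output. It assumes a Linux agent with /proc mounted; `open_files_of` is a made-up helper name, not part of Jenkins or lsof. Pipes and sockets appear as entries such as `pipe:[...]` or `socket:[...]`, so they are not missed the way they are when filtering for "log".

```shell
#!/bin/sh
# Sketch: list what a process has open by reading /proc/<pid>/fd (Linux only).
# open_files_of is a hypothetical helper name, not part of Jenkins or lsof.
open_files_of() {
    pid="$1"
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to the open file, pipe, or socket.
        readlink "$fd"
    done 2>/dev/null | sort -u
}

# Example: inspect every process whose command line mentions slave.jar.
# for pid in $(pgrep -f slave.jar); do open_files_of "$pid"; done
```

Reading another user's process this way may need sudo, as in the lsof command above.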



Has anyone run into this issue, or does anyone know where to find the Java slave logs?

Thanks so much.


VERSION INFO:
Master:
Jenkins ver. 2.60.2
Java:  openjdk version "1.8.0_131"

Build node:
Launched via: AWS plugin: https://wiki.jenkins.io/display/JENKINS/Amazon+EC2+Plugin (latest version)

EC2 instance running Ubuntu 14.04 and Java 8:
ubuntu@ip-172-31-93-175:~$ cat /etc/*release* | grep VERSION
VERSION="14.04.5 LTS, Trusty Tahr"
VERSION_ID="14.04"
ubuntu@ip-172-31-93-175:~$ java -version
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)





--
You received this message because you are subscribed to the Google Groups "Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/2e551b74-cab8-4098-8352-1c37d833ef9b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Re: Jenkins slave (EC2 AWS plugin) fails occasionally when build takes too long with error "Caused: java.io.IOException: Unexpected termination of the channel"

Joshua Noble
I've had great success with the EC2 plugin. Might I suggest using HashiCorp Packer to build an AMI, and having the EC2 plugin launch that AMI? That lets you take Ansible (or any provisioning tool) out of the equation and saves on agent provisioning time. When our cluster scales up another node using the EC2 plugin, we generally need that node online ASAP.

On Monday, September 11, 2017 at 9:32:30 PM UTC-4, K S wrote:
[original post quoted in full; see above]

Re: Jenkins slave (EC2 AWS plugin) fails occasionally when build takes too long with error "Caused: java.io.IOException: Unexpected termination of the channel"

Lax Clarke
The Ansible script is minimal; it runs a Yocto build process.

Are you suggesting that Ansible's interaction with this plugin could be the problem?

On Tuesday, September 12, 2017 at 12:50:26 PM UTC-4, Joshua Noble wrote:
[earlier messages quoted in full; see above]

Re: Jenkins slave (EC2 AWS plugin) fails occasionally when build takes too long with error "Caused: java.io.IOException: Unexpected termination of the channel"

Mike-9
In reply to this post by Lax Clarke
I added ClientAliveInterval and ClientAliveCountMax parameters to the sshd configuration on our Jenkins agents to help prevent disconnects.  I also removed the monitoring plugin since we didn't use it.  I had noticed JavaMelody errors in the Jenkins log file at the same time the Jenkins agent disconnected.
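
As an illustration of the sshd change described above, here is a minimal sketch. The values 60 and 10 are assumptions chosen for the example, not the ones used on these agents, and `set_sshd_keepalive` is a hypothetical helper name. The new settings are written at the top of the file because sshd uses the first value it finds for each keyword, which also keeps them out of any `Match` block.

```shell
#!/bin/sh
# Sketch: put SSH keepalive settings at the top of an sshd_config-style file.
# set_sshd_keepalive is a hypothetical helper; 60 and 10 below are example values.
set_sshd_keepalive() {
    conf="$1"       # path to the config file, e.g. /etc/ssh/sshd_config
    interval="$2"   # seconds of silence before sshd sends a keepalive probe
    count="$3"      # unanswered probes tolerated before sshd disconnects
    tmp="$(mktemp)" || return 1
    # New settings go first: sshd takes the first value seen for each keyword.
    printf 'ClientAliveInterval %s\nClientAliveCountMax %s\n' "$interval" "$count" > "$tmp"
    # Copy the remaining lines, dropping old (or commented-out) copies of both keywords.
    grep -Eiv '^[[:space:]]*#?[[:space:]]*(ClientAliveInterval|ClientAliveCountMax)[[:space:]]' "$conf" >> "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Example (run as root on the agent):
# set_sshd_keepalive /etc/ssh/sshd_config 60 10
```

With these example values sshd probes an idle connection every 60 seconds and gives up after about 600 seconds without a reply. Running `sshd -t` afterwards checks the file for syntax errors before the SSH service is restarted.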

On Monday, September 11, 2017 at 6:32:30 PM UTC-7, K S wrote:
[original post quoted in full; see above]