[JIRA] Created: (HUDSON-6965) GT seems to lose its connection to Gerrit without re-connecting

Hudson issues mailing list
GT seems to lose its connection to Gerrit without re-connecting
---------------------------------------------------------------

                 Key: HUDSON-6965
                 URL: http://issues.hudson-ci.org/browse/HUDSON-6965
             Project: Hudson
          Issue Type: Bug
          Components: gerrit-trigger
            Reporter: antonystubbs
            Assignee: rsandell
            Priority: Critical


Haven't managed to catch it in the act yet, but it works really well for a while, then I notice it stops getting triggered. I go into hudson/manage/gerrittrigger and click restart and it seems to connect ok (watching gerrit logs) and it works fine again.

This has happened consistently since setting it up. Appears to work for just a few hours, before needing a kick up the pants...

Great work!

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.hudson-ci.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

[JIRA] Commented: (HUDSON-6965) GT seems to lose its connection to Gerrit without re-connecting

Hudson issues mailing list

    [ http://issues.hudson-ci.org/browse/HUDSON-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=140152#action_140152 ]

rsandell commented on HUDSON-6965:
----------------------------------

This is a bit strange; it would be really good if you could "catch it in the act" with some logs.
Are you using the 2.0 release or an earlier snapshot version?

For me it has been running smoothly for days (I restarted the main server for other reasons), but we have noticed some thread leakage that will be fixed in the 2.1 release.

[JIRA] Commented: (HUDSON-6965) GT seems to lose its connection to Gerrit without re-connecting

Hudson issues mailing list

    [ http://issues.hudson-ci.org/browse/HUDSON-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=140157#action_140157 ]

antonystubbs commented on HUDSON-6965:
--------------------------------------

I have a feeling it's being caused by another build triggering the infamous "too many open files" bug/issue/feature in Hudson, which is causing congestion. Restarting Tomcat fixed it this evening.
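
One way to test the "too many open files" theory is to watch the descriptor count of the JVM in question. The sketch below is only an illustration: `FdUsage` is a hypothetical helper, it reports on whichever JVM runs it (so it would have to run inside the Hudson/Tomcat process to be useful), and it assumes a HotSpot/OpenJDK JVM on a Unix-like system, where the `com.sun.management.UnixOperatingSystemMXBean` interface is available.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical helper, not part of Hudson or gerrit-trigger. */
final class FdUsage {

    /** Returns {open, max} file descriptor counts, or null if this JVM does not expose them. */
    static long[] read() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null; // non-Unix platform or a JVM without this bean
    }

    public static void main(String[] args) {
        long[] fd = read();
        if (fd == null) {
            System.out.println("File descriptor counts are not available on this JVM");
        } else {
            System.out.println("open=" + fd[0] + " max=" + fd[1]);
        }
    }
}
```

If the open count creeps towards the maximum in the hours before triggering stops, that would support the theory; a flat count would point elsewhere.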

I have disabled all other builds, and will report back soon.

I am using the latest release available through Hudson's plugin system.

[JIRA] Commented: (HUDSON-6965) GT seems to lose its connection to Gerrit without re-connecting

Hudson issues mailing list

    [ http://issues.hudson-ci.org/browse/HUDSON-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=140239#action_140239 ]

antonystubbs commented on HUDSON-6965:
--------------------------------------

Doesn't seem to "stay working" for me for longer than a few hours...

Noticed it hadn't been working today, so did a little test.

Went into hudson, manage gerrit-trigger. Tested connection, saw a successful connect and disconnect in gerrit logs.

Clicked stop. Nothing in gerrit logs from hudson, which implies it wasn't running? I think the manage page really needs some sort of "Status" field so we can see if gerrit-trigger thinks it's connected or not.
Clicked start. Got a connection in gerrit logs from hudson.

If gerrit-trigger is d/c, doesn't it try to reconnect? From looking at GerritHandler#run, it doesn't seem that it does?
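
For reference, the kind of reconnect behaviour being asked about can be sketched roughly as below. This is not the code in GerritHandler#run; `ReconnectLoop`, `Session` and `connectAndStream` are hypothetical names, and the session is assumed to block while the connection is healthy and to return or throw once it drops.

```java
/** Hypothetical sketch of a reconnect loop; not the gerrit-trigger plugin's actual code. */
final class ReconnectLoop {

    /** One connection attempt: blocks while streaming, returns or throws when the link ends. */
    interface Session {
        void connectAndStream() throws Exception;
    }

    private final Session session;
    private final long retryDelayMillis;
    private volatile boolean shutdown;

    ReconnectLoop(Session session, long retryDelayMillis) {
        this.session = session;
        this.retryDelayMillis = retryDelayMillis;
    }

    /** Asks the loop to stop after the current attempt. */
    void shutdown() {
        shutdown = true;
    }

    /** Keeps reconnecting until shutdown is requested; returns the number of attempts made. */
    int run() {
        int attempts = 0;
        while (!shutdown) {
            attempts++;
            try {
                session.connectAndStream();
            } catch (Exception e) {
                System.err.println("Connection lost, will retry: " + e);
            }
            if (shutdown) {
                break;
            }
            try {
                Thread.sleep(retryDelayMillis); // back off before the next attempt
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return attempts;
    }
}
```

Note that a loop like this only helps if the session call actually returns or throws when the link dies. A read that blocks forever on a dead connection would never reach the retry.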

I see there is logging in gerrit-trigger; how do I get the debug log? </lazy>

[JIRA] Commented: (HUDSON-6965) GT seems to lose its connection to Gerrit without re-connecting

Hudson issues mailing list

    [ http://issues.hudson-ci.org/browse/HUDSON-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=140240#action_140240 ]

antonystubbs commented on HUDSON-6965:
--------------------------------------

Ok, I'm seeing these appearing in the Gerrit logs:

==> gerrit/review_site/logs/error_log <==
[2010-07-18 04:10:25,980] WARN  org.apache.sshd.server.session.ServerSession : Exception caught
java.io.IOException: Connection timed out
        at sun.nio.ch.FileDispatcher.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
        at sun.nio.ch.IOUtil.read(IOUtil.java:206)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
        at org.apache.mina.transport.socket.nio.NioProcessor.read(NioProcessor.java:202)
        at org.apache.mina.transport.socket.nio.NioProcessor.read(NioProcessor.java:42)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.read(AbstractPollingIoProcessor.java:620)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:598)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:587)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$400(AbstractPollingIoProcessor.java:61)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:969)
        at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

==> gerrit/review_site/logs/sshd_log <==
[2010-07-18 04:10:26,024 +0000] eb5e7197 hudson a/1000081 LOGOUT

With no following "hudson LOGIN" events... The hudson user is the one gerrit-trigger connects as.
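
The same check (a LOGOUT with no later LOGIN for the hudson user) can be automated. The sketch below is an assumption-laden illustration: `SshdLogCheck` is a hypothetical helper, and the field layout (timestamp in brackets, then session id, user, account, event) is inferred only from the sshd_log lines quoted in this thread.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; the log layout is inferred from the sample lines in this thread. */
final class SshdLogCheck {

    /** Returns true if the most recent LOGIN/LOGOUT entry for the user is a LOGOUT. */
    static boolean loggedOutWithoutRelogin(List<String> lines, String user) {
        String lastEvent = null;
        for (String line : lines) {
            int close = line.indexOf(']'); // end of the timestamp
            if (close < 0) {
                continue;
            }
            // Assumed fields after the timestamp: sessionId, user, account, event...
            String[] fields = line.substring(close + 1).trim().split("\\s+");
            if (fields.length < 4 || !fields[1].equals(user)) {
                continue;
            }
            if (fields[3].equals("LOGIN") || fields[3].equals("LOGOUT")) {
                lastEvent = fields[3];
            }
        }
        return "LOGOUT".equals(lastEvent);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "[2010-07-18 04:10:26,024 +0000] eb5e7197 hudson a/1000081 LOGOUT");
        System.out.println(loggedOutWithoutRelogin(lines, "hudson"));
    }
}
```

Run periodically against the tail of sshd_log, a check like this could flag the dropped connection well before someone notices that builds have stopped triggering.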

[JIRA] Commented: (HUDSON-6965) GT seems to lose its connection to Gerrit without re-connecting

Hudson issues mailing list

    [ http://issues.hudson-ci.org/browse/HUDSON-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=140348#action_140348 ]

rsandell commented on HUDSON-6965:
----------------------------------

This is starting to bug me a lot; I cannot reproduce it. However I try to kill the connection, the reconnect loop reconnects once the connection is back.
It must be ending up in some semi-connected state in your environment that I can't reproduce (or don't know how to reproduce).

I will change logging api to java.util.logging, so it's more easily filtered in Hudson's UI for the 2.1 release and try to put some clever debug logs on the lower levels.
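
As a rough idea of what debug logging through java.util.logging involves, here is a minimal sketch. The logger name `com.example.gerrittrigger` is a placeholder chosen for illustration; the plugin's real logger names are not given in this thread and may differ.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch; the logger name is an assumption, not the plugin's real one. */
final class DebugLogging {

    // Placeholder logger name for illustration only.
    static final String LOGGER_NAME = "com.example.gerrittrigger";

    /** Lowers the named logger to FINE and attaches a console handler that passes FINE records. */
    static Logger enableDebug(String loggerName) {
        Logger logger = Logger.getLogger(loggerName);
        logger.setLevel(Level.FINE);
        logger.setUseParentHandlers(false); // avoid duplicate output through the root handlers
        Handler handler = new ConsoleHandler();
        handler.setLevel(Level.FINE); // ConsoleHandler defaults to INFO and would drop FINE
        logger.addHandler(handler);
        return logger;
    }

    public static void main(String[] args) {
        Logger log = enableDebug(LOGGER_NAME);
        log.fine("connection attempt starting");
    }
}
```

Both the logger and its handler need the lower level; setting only one of them is a common reason debug records never show up.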

[JIRA] Commented: (HUDSON-6965) GT seems to lose its connection to Gerrit without re-connecting

Hudson issues mailing list

    [ http://issues.hudson-ci.org/browse/HUDSON-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=140377#action_140377 ]

antonystubbs commented on HUDSON-6965:
--------------------------------------

Clicking "restart" because our builds aren't triggering, I'm getting:

{code}
==> gerrit/review_site/logs/sshd_log <==
[2010-07-22 06:42:16,334 +0000] 0e887335 hudson a/1000081 'gerrit stream-events' 0ms 850392ms killed

==> gerrit/review_site/logs/error_log <==
[2010-07-22 06:42:16,346] WARN  org.apache.sshd.server.session.ServerSession : Exception caught
org.apache.mina.core.write.WriteToClosedSessionException
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.clearWriteRequestQueue(AbstractPollingIoProcessor.java:573)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.removeNow(AbstractPollingIoProcessor.java:525)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.removeSessions(AbstractPollingIoProcessor.java:497)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$600(AbstractPollingIoProcessor.java:61)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:974)
        at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

==> gerrit/review_site/logs/sshd_log <==
[2010-07-22 06:42:16,518 +0000] 0e887335 hudson a/1000081 LOGOUT
[2010-07-22 06:42:17,718 +0000] aee847ea hudson a/1000081 LOGIN FROM x.x.x.x
{code}

