github quota limit when scanning with the addition of tags


github quota limit when scanning with the addition of tags

j.knurek
Now that we've added Discover tags[1] and a Build everything[2] strategy, we're running into GitHub quota limits quite frequently.

18:58:09 GitHub API Usage: Current quota has 1110 remaining (5 over budget). Next quota of 5000 in 13 min. Sleeping for 29 sec.

We've had to extend the "Scan Organization Triggers -> Periodically if not otherwise run" setting to 8 hours to limit the number of scans, but that hasn't completely solved the issue, nor is it the goal we want to achieve.

There's an open bug about the time setting and GitHub quota limits (JENKINS-47154[3]), but it's not relevant in this case.
So I'm wondering: is it a bug in the github-branch-source-plugin, or in the Build everything extension? Or is there simply an easy way to request a higher API quota from GitHub for Jenkins?


REF:
1. https://issues.jenkins-ci.org/browse/JENKINS-34395
2. https://github.com/jenkinsci/github-branch-source-plugin/pull/158#issuecomment-332842623
3. https://issues.jenkins-ci.org/browse/JENKINS-47154

--
You received this message because you are subscribed to the Google Groups "Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/4079f366-003e-4ac4-8aea-462ef4ed2090%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: github quota limit when scanning with the addition of tags

stephenconnolly
This is the limitation of 5000 requests per hour.

Ideally we would look into caching the GitHub responses so that duplicate requests could be eliminated, but my preliminary analysis shows that would save about 50% of the requests.
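
As a rough illustration of the caching idea, here is a minimal sketch of an in-memory cache that sends each distinct request once per scan. The `CachingClient` class, the `fake_fetch` stand-in and the URLs are all hypothetical; nothing here is taken from the plugin, and the workload is contrived so that half of the calls are duplicates.

```python
from typing import Callable, Dict


class CachingClient:
    """Hypothetical wrapper that remembers responses by URL for one scan."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, str] = {}
        self.requests_sent = 0  # counts only calls that reach the fetcher

    def get(self, url: str) -> str:
        if url not in self._cache:
            self.requests_sent += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def fake_fetch(url: str) -> str:
    # Stand-in for a real HTTP call; no network access is made.
    return "response for " + url


client = CachingClient(fake_fetch)
urls = [
    "/repos/acme/app/branches",
    "/repos/acme/app/tags",
    "/repos/acme/app/branches",  # duplicate
    "/repos/acme/app/tags",      # duplicate
]
for url in urls:
    client.get(url)

print(client.requests_sent, "requests sent for", len(urls), "calls")
```

A cache like this only helps with requests that are repeated verbatim; it does nothing for requests that are each needed once.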

The recommendation for "Scan Organization Triggers -> Periodically if not otherwise run" is at least 8 hours, and more likely somewhere between 24 hours and 7 days, depending on how long you are willing to wait to recover from an event that GitHub failed to deliver.

There are only two good reasons to scan periodically:

1. To recover from missed events (keep in mind that follow-up commits will typically recover anyway, so the only case here is a commit before bedtime not being built by morning because that event was not delivered by GitHub)
2. To run the orphaned item strategies (which is probably fine at once per week for most people)

The only other reason to scan periodically is a bad one, namely

* You cannot set up push notification from GitHub




Re: github quota limit when scanning with the addition of tags

stephenconnolly


On 3 January 2018 at 14:19, <[hidden email]> wrote:
So I'm wondering: is it a bug in the github-branch-source-plugin, or in the Build everything extension? Or is there simply an easy way to request a higher API quota from GitHub for Jenkins?

Good luck with that... They seem to follow the principle that 5000/hr is all anyone gets. If you must have more, I think they want you to go to GitHub Enterprise.

At some point we will probably need to move to the v4 API, which might let us fetch responses with more tuning, but that still has a limit of approximately 5000/hr: https://developer.github.com/v4/guides/resource-limitations/ The only difference is that we might be able to bulk fetch, in a single request, a lot of the things that currently take 3-4 requests to collect.

We will still hit issues when we then need to check for marker files, as I do not think that is something that can be done in a single v4 API call.
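
To make the bulk-fetch idea concrete, here is a minimal sketch that builds a v4 (GraphQL) request body listing up to 100 tags with their target object IDs per request. The helper names `build_tags_request` and `pages_needed` and the `acme/app` repository are hypothetical; the query assumes the `refs` connection of the public GitHub GraphQL schema, and no request is actually sent. It does not attempt the marker-file check.

```python
import json
from typing import Optional

# Lists tags (name plus target object id) one page at a time.
TAGS_QUERY = """
query($owner: String!, $name: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    refs(refPrefix: "refs/tags/", first: 100, after: $cursor) {
      nodes { name target { oid } }
      pageInfo { hasNextPage endCursor }
    }
  }
}
"""


def build_tags_request(owner: str, name: str, cursor: Optional[str] = None) -> str:
    """Return the JSON body for one page of the tag listing."""
    variables = {"owner": owner, "name": name, "cursor": cursor}
    return json.dumps({"query": TAGS_QUERY, "variables": variables})


def pages_needed(tag_count: int, page_size: int = 100) -> int:
    """Number of paged requests needed to list tag_count tags."""
    return max(1, -(-tag_count // page_size))  # ceiling division


body = json.loads(build_tags_request("acme", "app"))
print(body["variables"])
print(pages_needed(500), "requests to list 500 tags")
```

Whether this lowers the real request count depends on how the v4 limit is scored for such queries, which this sketch does not model.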


Re: github quota limit when scanning with the addition of tags

j.knurek
In reply to this post by stephenconnolly
There are only two good reasons to scan periodically:
1. To recover from missed events (keep in mind that follow-up commits will typically recover anyway, so the only case here is a commit before bedtime not being built by morning because that event was not delivered by GitHub)
From my experience working with developers, that isn't the only use case. The more common use case (when a missed event happens) is that they pushed a commit and are waiting for it to proceed through the pipeline and notify them. Fast notification is a key to good CI/CD. So while missed events are not a frequent occurrence, waiting 7 days isn't an option, and the only other solution for a developer is to have an in-depth knowledge of Jenkins and know that this issue exists. 
 
2. To run the orphaned item strategies (which is probably fine at once per week for most people)
Totally agree, that's fine


As we already have a few repos with over 500 tags (and mind you, these are still new repos), I expect that this issue will affect others as they start scanning tags, even with a 24-hour interval.

----

Also, the recommendation in the UI for the interval setting is:
Subsequent commits should trigger indexing anyway and result in the commit being picked up, so most people will pick between 4 hours and 1 day




Re: github quota limit when scanning with the addition of tags

j.knurek
@Stephen
You mention that caching the responses would "save about 50% of the requests." That seems like a significant savings to me.

I'm also seeing a lot of entries like this in the scan log:
Checking tag v0.28.1
  Jenkinsfile not found
Does not meet criteria
19:01:38 GitHub API Usage: Current quota has 901 remaining (4 under budget). Next quota of 5000 in 10 min

Checking tag v0.28.2
  Jenkinsfile not found
Does not meet criteria
19:01:38 GitHub API Usage: Current quota has 897 remaining (0 under budget). Next quota of 5000 in 10 min

Checking tag v0.28.3
  Jenkinsfile not found
Does not meet criteria
19:01:38 GitHub API Usage: Current quota has 894 remaining (3 over budget). Next quota of 5000 in 10 min. Sleeping for 27 sec.
19:02:06 GitHub API Usage: Current quota has 894 remaining (26 under budget). Next quota of 5000 in 9 min 53 sec


It looks to me like each tag triggers an API request. With 500+ tags, that seems like a lot of unneeded calls (especially when Jenkins doesn't even track or build the tag). Or am I reading the logs incorrectly? If I'm reading them correctly, a cache might save over 90% of the requests in this case.
Should I create a Jira ticket for this?



Re: github quota limit when scanning with the addition of tags

stephenconnolly


On 4 January 2018 at 13:27, <[hidden email]> wrote:
It looks to me like each tag triggers an API request. With 500+ tags, that seems like a lot of unneeded calls (especially when Jenkins doesn't even track or build the tag).

Why are you discovering tags if you don't want tags?

Every branch/tag/PR you discover needs at least one request to verify that the marker file is present.

If you don't want tags, don't discover them and you will save a lot of requests.
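
As a back-of-the-envelope check on that cost, here is a minimal sketch that treats "at least one request per discovered branch/tag/PR" as a lower bound and compares it with the 5000 requests per hour mentioned earlier in the thread. The repository sizes are hypothetical, and real scans need additional requests (listing refs, repository metadata) that are not counted here.

```python
from typing import List

HOURLY_QUOTA = 5000  # requests per hour, as stated in the thread


def min_marker_requests(refs_per_repo: List[int]) -> int:
    """Lower bound: one marker-file check per discovered ref."""
    return sum(refs_per_repo)


def share_of_quota(refs_per_repo: List[int], quota: int = HOURLY_QUOTA) -> float:
    """Fraction of the hourly quota consumed by one full scan."""
    return min_marker_requests(refs_per_repo) / quota


# Hypothetical organization: three repos, each with 500 tags and 20 branches.
org = [520, 520, 520]
print(min_marker_requests(org), "requests at minimum")
print(f"{share_of_quota(org):.1%} of the hourly quota")
```

Under these assumptions a single scan of three such repositories uses nearly a third of the hourly quota before any other request is counted.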
  
Or am I reading the logs incorrectly? If that is the case then a cache might save over 90% of the requests in this case. 
Should I create a Jira ticket for this? 


On Wednesday, 3 January 2018 16:52:03 UTC+1, [hidden email] wrote:
There are only two good reasons to scan periodically:
1. To recover from missed events (keep in mind that follow-up commits will typically recover anyway, so the only case here is a commit before bedtime not being built by morning because that event was not delivered by GitHub)
From my experience working with developers, that isn't the only use case. The more common use case (when a missed event happens) is that they pushed a commit and are waiting for it to proceed through the pipeline and notify them. Fast notification is a key to good CI/CD. So while missed events are not a frequent occurrence, waiting 7 days isn't an option, and the only other solution for a developer is to have an in-depth knowledge of Jenkins and know that this issue exists. 
 
2. To run the orphaned item strategies (which is probably fine at once per week for most people)
Totally agree, that's fine


As we already have a few repos with over 500 tags (and mind you these are still new repos), I expect that this issue will impact others as they begin to implement the ability to scan tags even with a 24 hour interval. 

----

Also, the recommendation in the UI for the interval setting is:
Subsequent commits should trigger indexing anyway and result in the commit being picked up, so most people will pick between 4 hours and 1 day




Re: github quota limit when scanning with the addition of tags

j.knurek
I do want tags. I want tags very much. I'm very happy this feature is finally available.
There just happen to be some tags in that repo that reference commits in which no Jenkinsfile exists, and I happened to copy those examples.

Here is a better example:

  Checking tag v1.1.0
    Jenkinsfile found
  Met criteria
  No changes detected: v1.1.0 (still at d11d5c94130db1b43dea147091c2cfc2d260b2c1)
  19:05:09 GitHub API Usage: Current quota has 677 remaining (0 under budget). Next quota of 5000 in 6 min 50 sec

  Checking tag v1.1.1
    Jenkinsfile found
  Met criteria
  No changes detected: v1.1.1 (still at 20b7a9ccd47f9e10165268ccc252bc4b793a61fc)
  19:05:09 GitHub API Usage: Current quota has 675 remaining (2 over budget). Next quota of 5000 in 6 min 50 sec. Sleeping for 26 sec.
  19:05:36 GitHub API Usage: Current quota has 675 remaining (26 under budget). Next quota of 5000 in 6 min 23 sec
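
For anyone trying to read the "under budget" / "over budget" figures in these log lines, here is a rough sketch of a throttle model that happens to reproduce them. The idea is that the remaining quota should shrink evenly until the reset, with some reserve held back. The function names and the buffer and burst values are assumptions picked to fit the log output above, not confirmed plugin settings, and the plugin's actual calculation may differ.

```python
# Hypothetical reconstruction of the "under/over budget" figure in the scan log.
# The buffer and burst values are assumptions chosen to match the log lines,
# not confirmed plugin values.

SECONDS_PER_HOUR = 3600


def ideal_remaining(limit: int, seconds_to_reset: int) -> int:
    """Quota that should still be unused if requests are spread evenly until the reset."""
    buffer = max(15, limit // 20)  # assumed safety reserve: 250 for a 5000 limit
    burst = limit // 5             # assumed burst allowance: 1000 for a 5000 limit
    progress = seconds_to_reset / SECONDS_PER_HOUR
    return int((limit - buffer - burst) * progress) + buffer


def budget_delta(limit: int, remaining: int, seconds_to_reset: int) -> int:
    """Positive or zero: under budget (scan continues). Negative: over budget (scan sleeps)."""
    return remaining - ideal_remaining(limit, seconds_to_reset)


# Log line: "677 remaining (0 under budget). Next quota of 5000 in 6 min 50 sec"
print(budget_delta(5000, 677, 6 * 60 + 50))  # 0
# Two requests later, same timestamp: "675 remaining (2 over budget)"
print(budget_delta(5000, 675, 6 * 60 + 50))  # -2
```

Under this reading, the scan is not out of quota when it sleeps; it is only consuming requests faster than an even spread over the hour would allow.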





Re: github quota limit when scanning with the addition of tags

stephenconnolly
If you know those tags will never match, you could add a filter to exclude them from discovery.

Part of the issue here is that Multibranch doesn't know whether the SCMCriteria has changed since the last time it saw that revision (the Jenkins config lives on the filesystem, so who knows what was restored, edited with vi, etc.). On top of that, this is a tag that was not discovered, so it doesn't actually have a place to store the revision.

Consequently, it will check for the Jenkinsfile every time you do a full scan.
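
As a rough illustration of the filter idea, the sketch below applies include/exclude wildcard patterns to a list of tag names before any marker-file check would be made. It is not the plugin's implementation: `filter_heads`, the space-separated pattern format, and the assumption that every `v0.*` tag lacks a Jenkinsfile are all hypothetical, with the tag names taken from the scan logs earlier in this thread.

```python
from fnmatch import fnmatchcase


def filter_heads(names, includes="*", excludes=""):
    """Keep names matching any include pattern and no exclude pattern.

    Patterns are space-separated shell-style wildcards (hypothetical format).
    """
    include_patterns = includes.split()
    exclude_patterns = excludes.split()
    kept = []
    for name in names:
        included = any(fnmatchcase(name, p) for p in include_patterns)
        excluded = any(fnmatchcase(name, p) for p in exclude_patterns)
        if included and not excluded:
            kept.append(name)
    return kept


tags = ["v0.28.1", "v0.28.2", "v0.28.3", "v1.1.0", "v1.1.1"]
to_check = filter_heads(tags, includes="*", excludes="v0.*")
print(to_check)                   # ['v1.1.0', 'v1.1.1']
print(len(tags) - len(to_check))  # 3 tags skipped, so at least 3 requests saved per full scan
```

Each excluded tag saves at least the one request needed to look for the marker file, on every full scan.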


Re: github quota limit when scanning with the addition of tags

j.knurek
OK, then I think I misunderstood what the scan is doing.
During the scan, Jenkins creates a list in memory of all branches, tags, and PRs. Does it do that with a single API call, or with one API call per type?
And then, while iterating over that list, Jenkins makes an API call for each entity to get the Jenkinsfile (or to find out it doesn't exist)?

If that's the case, it doesn't sound like there is much to be done in the current setup.

This is a problem though, because as more and more tags arrive, there is no sensible way to keep adding them to the filter when Jenkins is the only source of truth on whether a tag has already been built. For example, the few tags that don't reference a commit with a Jenkinsfile could just be deleted from GitHub, but that doesn't fix the problem; it only delays it by a couple of weeks.
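
To put rough numbers on that growth, here is a back-of-the-envelope sketch. It uses only the lower bound stated earlier in the thread (at least one request per discovered branch, tag, or PR) and the 5000 requests per hour quota. The repository sizes and function names are hypothetical, and the listing calls themselves are ignored, so real scans would cost more.

```python
HOURLY_QUOTA = 5000  # GitHub API limit quoted in this thread


def min_requests_per_scan(branches: int, tags: int, pull_requests: int) -> int:
    """Lower bound: one marker-file check per discovered branch, tag, and PR."""
    return branches + tags + pull_requests


def quota_share(requests: int, quota: int = HOURLY_QUOTA) -> float:
    """Fraction of one hour's quota consumed by a scan."""
    return requests / quota


# Hypothetical repository: 20 branches, 500 tags, 10 open PRs.
per_repo = min_requests_per_scan(branches=20, tags=500, pull_requests=10)
print(per_repo)                    # 530
print(quota_share(per_repo))       # 0.106
# Hypothetical organization with 10 such repositories scanned together.
print(quota_share(10 * per_repo))  # 1.06
```

With these made-up sizes, a single organization scan already needs more than one hour's quota, and the cost rises with every new tag regardless of the scan interval.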


On Thursday, 4 January 2018 15:27:37 UTC+1, Stephen Connolly wrote:
If you know those tags will never match, you could add a filter to exclude them from discovery.

Part of the issue here is that Multibranch doesn't know if the SCMCriteria has changed from the last time it saw that revision (because Jenkins config is a filesystem, who knows what was restored, edited with vi, etc)... on top of that, this is a tag that was not discovered, so it doesn't actually have a place to store the revision.

Consequently, it will check for the Jenkinsfile every time you do a full scan.

On 4 January 2018 at 14:13, <<a href="javascript:" target="_blank" gdf-obfuscated-mailto="73wXHi8TDgAJ" rel="nofollow" onmousedown="this.href=&#39;javascript:&#39;;return true;" onclick="this.href=&#39;javascript:&#39;;return true;">j.kn...@...> wrote:
I do want tags. I want tags very much. I'm very happy this feature is finally available. 
There just happens to be some tags in that repo that reference commits in which no Jenkinsfile exists, and I happened to copy those examples.

Here is a better example:
  Checking tag <a href="https://github.com/travelaudience/cmt-cmtf/tree/v1.1.0" style="word-wrap:break-word;color:rgb(92,53,102)" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Ftravelaudience%2Fcmt-cmtf%2Ftree%2Fv1.1.0\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNGFV50A-Os5k8e1zG-CzqUDm7z4Rw&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Ftravelaudience%2Fcmt-cmtf%2Ftree%2Fv1.1.0\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNGFV50A-Os5k8e1zG-CzqUDm7z4Rw&#39;;return true;">v1.1.0
     
Jenkinsfile found
   
Met criteria
No changes detected: v1.1.0 (still at d11d5c94130db1b43dea147091c2cfc2d260b2c1)
19:05:09 GitHub API Usage: Current quota has 677 remaining (0 under budget). Next quota of 5000 in 6 min 50 sec

   
Checking tag <a href="https://github.com/travelaudience/cmt-cmtf/tree/v1.1.1" style="word-wrap:break-word;color:rgb(92,53,102)" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Ftravelaudience%2Fcmt-cmtf%2Ftree%2Fv1.1.1\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNE59LJJi23V2x5zOyVGwGsVwJcqKA&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Ftravelaudience%2Fcmt-cmtf%2Ftree%2Fv1.1.1\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNE59LJJi23V2x5zOyVGwGsVwJcqKA&#39;;return true;">v1.1.1
     
Jenkinsfile found
   
Met criteria
No changes detected: v1.1.1 (still at 20b7a9ccd47f9e10165268ccc252bc4b793a61fc)
19:05:09 GitHub API Usage: Current quota has 675 remaining (2 over budget). Next quota of 5000 in 6 min 50 sec. Sleeping for 26 sec.
19:05:36 GitHub API Usage: Current quota has 675 remaining (26 under budget). Next quota of 5000 in 6 min 23 sec




On Thursday, 4 January 2018 15:02:11 UTC+1, Stephen Connolly wrote:


On 4 January 2018 at 13:27, <[hidden email]> wrote:
@Stephen
You mention that caching the responses would "save about 50% of the requests." That seems like a significant savings to me.

I'm also wondering, I'm seeing a lot of things like this in the scan log:
Checking tag <a href="https://github.com/travelaudience/cmt-cmtf/tree/v0.28.1" style="word-wrap:break-word;color:rgb(92,53,102)" rel="nofollow" target="_blank" onmousedown="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Ftravelaudience%2Fcmt-cmtf%2Ftree%2Fv0.28.1\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNGR5roxKieOHOuJvus745Z4hVxt_Q&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Ftravelaudience%2Fcmt-cmtf%2Ftree%2Fv0.28.1\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNGR5roxKieOHOuJvus745Z4hVxt_Q&#39;;return true;">v0.28.1
     
Jenkinsfile not found
   
Does not meet criteria
19:01:38 GitHub API Usage: Current quota has 901 remaining (4 under budget). Next quota of 5000 in 10 min

   
Checking tag <a href="https://github.com/travelaudience/cmt-cmtf/tree/v0.28.2" style="word-wrap:break-word;color:rgb(92,53,102)" rel="nofollow" target="_blank" onmousedown="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Ftravelaudience%2Fcmt-cmtf%2Ftree%2Fv0.28.2\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNGZSF9MiJgMlkByp-TSRwl_Yrg4iA&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Ftravelaudience%2Fcmt-cmtf%2Ftree%2Fv0.28.2\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNGZSF9MiJgMlkByp-TSRwl_Yrg4iA&#39;;return true;">v0.28.2
     
Jenkinsfile not found
   
Does not meet criteria
19:01:38 GitHub API Usage: Current quota has 897 remaining (0 under budget). Next quota of 5000 in 10 min

   
Checking tag <a href="https://github.com/travelaudience/cmt-cmtf/tree/v0.28.3" style="word-wrap:break-word;color:rgb(92,53,102)" rel="nofollow" target="_blank" onmousedown="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Ftravelaudience%2Fcmt-cmtf%2Ftree%2Fv0.28.3\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNGHuBSt649iz5QqWt7DzxdjFE9XlQ&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Ftravelaudience%2Fcmt-cmtf%2Ftree%2Fv0.28.3\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNGHuBSt649iz5QqWt7DzxdjFE9XlQ&#39;;return true;">v0.28.3
     
Jenkinsfile not found
   
Does not meet criteria
19:01:38 GitHub API Usage: Current quota has 894 remaining (3 over budget). Next quota of 5000 in 10 min. Sleeping for 27 sec.
19:02:06 GitHub API Usage: Current quota has 894 remaining (26 under budget). Next quota of 5000 in 9 min 53 sec


That seems to me like each tag invokes an api request? And with 500+ tags, that seems like a lot of unneeded calls (most especially when Jenkins doesn't even track/build the tag).

Why are you discovering tags if you don't want tags?

Every branch/tag/PR you discover needs at least one request to verify that the marker file is present.

If you don't want tags, don't discover them and you will save a lot of requests.
  
Or am I reading the logs incorrectly? If that is the case then a cache might save over 90% of the requests in this case. 
Should I create a Jira ticket for this? 


On Wednesday, 3 January 2018 16:52:03 UTC+1, [hidden email] wrote:
There are only two good reasons to scan periodically:
1. To recover from missed events (keep in mind that follow-up commits will typically recover anyway, so the only case here is a commit before bedtime not being built by morning because that event was not delivered by GitHub)
From my experience working with developers, that isn't the only use case. The more common use case (when a missed event happens) is that they pushed a commit and are waiting for it to proceed through the pipeline and notify them. Fast notification is a key to good CI/CD. So while missed events are not a frequent occurrence, waiting 7 days isn't an option, and the only other solution for a developer is to have an in-depth knowledge of Jenkins and know that this issue exists. 
 
2. To run the orphaned item strategies (which is probably fine at once per week for most people)
Totally agree, that's fine


As we already have a few repos with over 500 tags (and mind you these are still new repos), I expect that this issue will impact others as they begin to implement the ability to scan tags even with a 24 hour interval. 

----

Also, the recommendation in the UI for the interval setting is:
Subsequent commits should trigger indexing anyway and result in the commit being picked up, so most people will pick between 4 hours and 1 day



On Wednesday, 3 January 2018 15:42:15 UTC+1, Stephen Connolly wrote:
This is the limitation of 5000 requests per hour.

Ideally we would look into caching the github responses so that duplicate requests could be eliminated... but my preliminary analysis shows that would basically save about 50% of the requests.

The recommendation for "Scan Organization Triggers -> Periodically if not otherwise run" is at least 8 hours, more likely somewhere between 24 hours and 7 days, depending on how long you are willing to wait when GitHub fails to deliver an event.

There are only two good reasons to scan periodically:

1. To recover from missed events (keep in mind that follow-up commits will typically recover anyway, so the only case here is a commit before bedtime not being built by morning because that event was not delivered by GitHub)
2. To run the orphaned item strategies (which is probably fine at once per week for most people)

The only other reason to scan periodically is a bad one, namely

* You cannot set up push notification from GitHub



On 3 January 2018 at 14:19, <[hidden email]> wrote:
Now that we've added Discover tags[1] and a Build everything[2] strategy, we're running into GitHub quota limits quite frequently.

18:58:09 GitHub API Usage: Current quota has 1110 remaining (5 over budget). Next quota of 5000 in 13 min. Sleeping for 29 sec.

We've had to extend the Scan Organization Triggers -> Periodically if not otherwise run setting to 8 hours to limit the number of scans, but that hasn't completely solved the issue, nor is it the goal we want to achieve.

There's an open bug about the time setting and GitHub quota limits (JENKINS-47154[3]), but it's not relevant in this case.
So I'm wondering if it's a bug in the github-branch-source-plugin? or in the Build everything extension? or is there simply an easy way to request Jenkins to have a higher API quota from GitHub?


REF:
1. https://issues.jenkins-ci.org/browse/JENKINS-34395
2. https://github.com/jenkinsci/github-branch-source-plugin/pull/158#issuecomment-332842623
3. https://issues.jenkins-ci.org/browse/JENKINS-47154


Re: github quota limit when scanning with the addition of tags

stephenconnolly
On 4 January 2018 at 16:43, <[hidden email]> wrote:
OK, then I think I misunderstand what the scan is doing.
During the scan, Jenkins creates a list in memory of all branches, tags, and PRs. Does it do that with a single API call, or with one API call per type?

At least one API call for each requested (or implied requested) type.

e.g. If there are more than 100 branches then it will take more than one request to get all branches as the page size is 100
e.g. If you request to build branches that are not also filed as pull requests, then that implies we need the list of pull requests (even if you didn't select Discover Pull Requests) 
 
And then while iterating over that list, for each entity Jenkins makes an api call to get the Jenkinsfile (or to find out it doesn't exist)?

Correct.
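
As a rough illustration of the arithmetic described above, here is a Python sketch that estimates a lower bound on the requests one full scan of a single repository needs: the paged listings (page size 100, as stated in the reply) plus one marker-file check per discovered head. The function names and the example counts are made up for illustration; real scans can cost more, since "at least one" request per head is only a floor.

```python
import math

PAGE_SIZE = 100  # page size mentioned in the reply above


def list_requests(item_count: int, page_size: int = PAGE_SIZE) -> int:
    """Requests needed to list `item_count` items; an empty list still costs one."""
    return max(1, math.ceil(item_count / page_size))


def estimate_scan_requests(branches: int, tags: int, pull_requests: int) -> int:
    """Lower bound on API requests for one full scan of one repository."""
    listing = sum(list_requests(n) for n in (branches, tags, pull_requests))
    # At least one request per head to look for the Jenkinsfile.
    marker_checks = branches + tags + pull_requests
    return listing + marker_checks


# Hypothetical repository: 20 branches, 500 tags, 5 pull requests.
print(estimate_scan_requests(20, 500, 5))
```

With those made-up numbers the listing costs 7 requests and the marker checks cost 525, so the tags dominate the total.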
 

If that's the case, it doesn't sound like there is much to be done in the current setup. 

A quick win might be to maintain a secondary state file that tracks the hash of the XML config for the SCMSourceCriteria and the hash of the XML of each revision for each discovered "head". If the hashes are the same, then we can assume there is no need to recheck.
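
To make the idea concrete, here is a minimal Python sketch of such a secondary state file. It is not how the plugin works today; the class name, the JSON layout, and the choice of SHA-256 are all assumptions made for illustration. The only point is that a head is rechecked when either the criteria hash or that head's revision hash differs from what was recorded.

```python
import hashlib
import json
from pathlib import Path


def digest(text: str) -> str:
    """Stable hash of a piece of XML (SHA-256 is an arbitrary choice here)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class ScanState:
    """Hypothetical secondary state file: criteria hash plus one hash per head."""

    def __init__(self, path: Path):
        self.path = path
        if path.exists():
            self.state = json.loads(path.read_text())
        else:
            self.state = {"criteria": None, "heads": {}}

    def needs_check(self, criteria_xml: str, head: str, revision_xml: str) -> bool:
        """True when the marker file must be looked up again for this head."""
        if self.state["criteria"] != digest(criteria_xml):
            return True  # criteria changed: earlier answers may be wrong
        return self.state["heads"].get(head) != digest(revision_xml)

    def record(self, criteria_xml: str, head: str, revision_xml: str) -> None:
        """Remember that this head was checked at this revision."""
        criteria_hash = digest(criteria_xml)
        if self.state["criteria"] != criteria_hash:
            # New criteria invalidate everything recorded so far.
            self.state = {"criteria": criteria_hash, "heads": {}}
        self.state["heads"][head] = digest(revision_xml)
        self.path.write_text(json.dumps(self.state))
```

A scan would call `needs_check` before spending an API request on a head and `record` after the check completes, so unchanged tags cost nothing on later scans.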
  

This is a problem though: as more and more tags arrive, there is no logical way to keep adding them to the filter when Jenkins is the only source of truth on whether a tag has already been built. The few tags that don't reference a commit with a Jenkinsfile could simply be deleted from GitHub, but that doesn't fix the problem; it just delays it by a couple of weeks.


On Thursday, 4 January 2018 15:27:37 UTC+1, Stephen Connolly wrote:
If you know those tags will never match, you could add a filter to exclude them from discovery.

Part of the issue here is that Multibranch doesn't know if the SCMCriteria has changed from the last time it saw that revision (because Jenkins config is a filesystem, who knows what was restored, edited with vi, etc)... on top of that, this is a tag that was not discovered, so it doesn't actually have a place to store the revision.

Consequently, it will check for the Jenkinsfile every time you do a full scan.
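
For illustration only, the effect of an exclude filter like the one suggested above can be sketched in Python. The patterns are hypothetical, and `fnmatchcase` is a stand-in: its wildcard rules are not guaranteed to match those of the Jenkins filter. The sketch just shows that an excluded name is dropped before any marker-file request is spent on it.

```python
from fnmatch import fnmatchcase


def is_discovered(name: str, includes: str = "*", excludes: str = "") -> bool:
    """True when `name` matches an include pattern and no exclude pattern.

    Patterns are space-separated wildcards (an assumption for this sketch).
    """
    included = any(fnmatchcase(name, p) for p in includes.split())
    excluded = any(fnmatchcase(name, p) for p in excludes.split())
    return included and not excluded


tags = ["v0.28.1", "v0.28.2", "v0.28.3", "v1.1.0", "v1.1.1"]
# Hypothetical exclude pattern for the tags known to lack a Jenkinsfile.
kept = [t for t in tags if is_discovered(t, excludes="v0.28.*")]
print(kept)
```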

On 4 January 2018 at 14:13, <[hidden email]> wrote:
I do want tags. I want tags very much. I'm very happy this feature is finally available. 
There just happens to be some tags in that repo that reference commits in which no Jenkinsfile exists, and I happened to copy those examples.

Here is a better example:
Checking tag v1.1.0
  Jenkinsfile found
  Met criteria
No changes detected: v1.1.0 (still at d11d5c94130db1b43dea147091c2cfc2d260b2c1)
19:05:09 GitHub API Usage: Current quota has 677 remaining (0 under budget). Next quota of 5000 in 6 min 50 sec

Checking tag v1.1.1
  Jenkinsfile found
  Met criteria
No changes detected: v1.1.1 (still at 20b7a9ccd47f9e10165268ccc252bc4b793a61fc)
19:05:09 GitHub API Usage: Current quota has 675 remaining (2 over budget). Next quota of 5000 in 6 min 50 sec. Sleeping for 26 sec.
19:05:36 GitHub API Usage: Current quota has 675 remaining (26 under budget). Next quota of 5000 in 6 min 23 sec




On Thursday, 4 January 2018 15:02:11 UTC+1, Stephen Connolly wrote:


On 4 January 2018 at 13:27, <[hidden email]> wrote:
@Stephen
You mention that caching the responses would "save about 50% of the requests." That seems like a significant savings to me.

I'm also wondering, I'm seeing a lot of things like this in the scan log:
Checking tag v0.28.1
  Jenkinsfile not found
  Does not meet criteria
19:01:38 GitHub API Usage: Current quota has 901 remaining (4 under budget). Next quota of 5000 in 10 min

Checking tag v0.28.2
  Jenkinsfile not found
  Does not meet criteria
19:01:38 GitHub API Usage: Current quota has 897 remaining (0 under budget). Next quota of 5000 in 10 min

Checking tag v0.28.3
  Jenkinsfile not found
  Does not meet criteria
19:01:38 GitHub API Usage: Current quota has 894 remaining (3 over budget). Next quota of 5000 in 10 min. Sleeping for 27 sec.
19:02:06 GitHub API Usage: Current quota has 894 remaining (26 under budget). Next quota of 5000 in 9 min 53 sec


It looks to me as though each tag triggers its own API request. With 500+ tags, that is a lot of unnecessary calls (especially when Jenkins doesn't even track or build the tag).

Why are you discovering tags if you don't want tags?

Every branch/tag/PR you discover needs at least one request to verify that the marker file is present.

If you don't want tags, don't discover them and you will save a lot of requests.
  
Or am I reading the logs incorrectly? If each tag really does cost a request, then a cache might save over 90% of the requests in this case.
Should I create a Jira ticket for this?


On Wednesday, 3 January 2018 16:52:03 UTC+1, [hidden email] wrote:
There are only two good reasons to scan periodically:
1. To recover from missed events (keep in mind that follow-up commits will typically recover anyway, so the only case here is a commit before bedtime not being built by morning because that event was not delivered by GitHub)
From my experience working with developers, that isn't the only use case. The more common use case (when a missed event happens) is that they pushed a commit and are waiting for it to proceed through the pipeline and notify them. Fast notification is a key to good CI/CD. So while missed events are not a frequent occurrence, waiting 7 days isn't an option, and the only other solution for a developer is to have an in-depth knowledge of Jenkins and know that this issue exists. 
 
2. To run the orphaned item strategies (which is probably fine at once per week for most people)
Totally agree, that's fine


As we already have a few repos with over 500 tags (and mind you these are still new repos), I expect that this issue will impact others as they begin to implement the ability to scan tags even with a 24 hour interval. 

----

Also, the recommendation in the UI for the interval setting is:
Subsequent commits should trigger indexing anyway and result in the commit being picked up, so most people will pick between 4 hours and 1 day



On Wednesday, 3 January 2018 15:42:15 UTC+1, Stephen Connolly wrote:
This is the limitation of 5000 requests per hour.

Ideally we would look into caching the github responses so that duplicate requests could be eliminated... but my preliminary analysis shows that would basically save about 50% of the requests.

The recommendation for "Scan Organization Triggers -> Periodically if not otherwise run" is at least 8 hours, more likely somewhere between 24 hours and 7 days, depending on how long you are willing to wait when GitHub fails to deliver an event.

There are only two good reasons to scan periodically:

1. To recover from missed events (keep in mind that follow-up commits will typically recover anyway, so the only case here is a commit before bedtime not being built by morning because that event was not delivered by GitHub)
2. To run the orphaned item strategies (which is probably fine at once per week for most people)

The only other reason to scan periodically is a bad one, namely

* You cannot set up push notification from GitHub



On 3 January 2018 at 14:19, <[hidden email]> wrote:
Now that we've added Discover tags[1] and a Build everything[2] strategy, we're running into GitHub quota limits quite frequently.

18:58:09 GitHub API Usage: Current quota has 1110 remaining (5 over budget). Next quota of 5000 in 13 min. Sleeping for 29 sec.

We've had to extend the Scan Organization Triggers -> Periodically if not otherwise run setting to 8 hours to limit the number of scans, but that hasn't completely solved the issue, nor is it the goal we want to achieve.

There's an open bug about the time setting and GitHub quota limits (JENKINS-47154[3]), but it's not relevant in this case.
So I'm wondering if it's a bug in the github-branch-source-plugin? or in the Build everything extension? or is there simply an easy way to request Jenkins to have a higher API quota from GitHub?


REF:
1. https://issues.jenkins-ci.org/browse/JENKINS-34395
2. https://github.com/jenkinsci/github-branch-source-plugin/pull/158#issuecomment-332842623
3. https://issues.jenkins-ci.org/browse/JENKINS-47154


Re: github quota limit when scanning with the addition of tags

Alicia Doblas
Hi,

A couple of months ago we had the same problem. After improving the discovery filter for the branch/tag/PR calls, there wasn't much more we could do, so we decided to "bypass" the GitHub API limit by using different users. It looks like the API limit applies per user, so the "solution" for us was to split our jobs into different groups, each using a different user for the API calls.

By changing this configuration we can manage 5000 req/hour x N groups.


Re: github quota limit when scanning with the addition of tags

R. Tyler Croy
(replies inline)

On Fri, 05 Jan 2018, Alicia Doblas wrote:

> Hi,
>
> a couple of months ago we had the same problem. After improving the
> discover filter of the branch/tag/PR calls, there weren't too much to
> do...so we decided to "by-pass" the api of github by using different users.
> Looks like the api limit applies to a single user, so the "solution" for us
> was to split our jobs into different groups, each of them using a different
> user for the api call.
>
> By changing this configuration we can manage 5000 req/hour x N groups.
This is very much against the GitHub.com terms of service, which state
that one legal entity can have one free machine account.

https://help.github.com/articles/github-terms-of-service/#b-account-terms



- R. Tyler Croy

------------------------------------------------------
     Code: <https://github.com/rtyler>
  Chatter: <https://twitter.com/agentdero>
     xmpp: [hidden email]

  % gpg --keyserver keys.gnupg.net --recv-key 1426C7DC3F51E16F
------------------------------------------------------
