implementing correct build order for dependent projects via SCMTrigger


bwestrich
Hi Hudson developers,

Being a new Hudson developer, I don't know the overall architecture of Hudson interproject dependencies. But I have an idea that looks easy to implement and that might get rid of many of the dependency-related build issues I've seen when using other CI tools such as Cruise Control.  The idea is that when an SCM trigger is about to schedule a build, we first check that all of the upstream projects are current (i.e. SCM.pollChanges() == false), and if any of these upstream projects is not current we do not schedule the build. What then happens (assuming of course the upstream project is also configured to poll the SCM) is that when the upstream project's build completes, it triggers the current project to build. So the two interdependent projects will always build in the correct sequence.

Below is a suggested patch that would implement this idea. The only thing I'd need to still figure out is what to pass in to pollChanges() for a Launcher.  The Perforce plugin (that I'm currently helping with) doesn't make use of this, so if we decide the below patch would be good to do, I'm looking for advice as to how we'd get a launcher to pass in.

So, is this a good idea?  A hack?  (both :-))?   Is there some aspect of Hudson that already takes care of this? 

Looking forward to your comments on this.

Brian Westrich

P.S.
So far I've been using Hudson with only 1 executor to keep things simple. If there were multiple executors, I'm not convinced the above approach would be sufficient. To make things more general, we might need to add something such as "if the upstream project has SCM changes, or if the upstream project is in the process of building, do not build the current project".



#P hudson-core
Index: src/main/java/hudson/triggers/SCMTrigger.java
===================================================================
RCS file: /cvs/hudson/hudson/main/core/src/main/java/hudson/triggers/SCMTrigger.java,v
retrieving revision 1.24
diff -u -r1.24 SCMTrigger.java
--- src/main/java/hudson/triggers/SCMTrigger.java    20 Sep 2007 10:50:01 -0000    1.24
+++ src/main/java/hudson/triggers/SCMTrigger.java    7 Nov 2007 11:33:24 -0000
[snip]
@@ -297,6 +299,18 @@
                     }
                    
                     if(foundChanges) {
+                        StreamTaskListener listener = new StreamTaskListener(getLogFile());
+                        for (AbstractProject upstreamProject : job.asProject().getUpstreamProjects()) {
+                            if (upstreamProject.getScm() != null) {
+                                if ( upstreamProject.getScm().pollChanges(upstreamProject,
+                                        launcher, upstreamProject.getWorkspace(), listener)) {
+                                    LOGGER.info("SCM changes detected in upstream project "
+                                            + upstreamProject.getName() + " for current project "
+                                            + job.getName() + ". Not triggering " + job.getName ());
+                                    return;
+                                }
+                            }
+                        }
                         String name = " #"+job.asProject().getNextBuildNumber();
                         if(!job.scheduleBuild()) {
                             LOGGER.info("SCM changes detected in "+ job.getName()+". Triggering "+name);


Re: implementing correct build order for dependent projects via SCMTrigger

Kohsuke Kawaguchi
Administrator
Brian Westrich wrote:

> Hi Hudson developers,
>
> Being a new Hudson developer, I don't know the overall architecture of
> Hudson interproject dependencies. But I have an idea that looks easy to
> implement that might get rid of many of the dependency related build issues
> I've seen when using other CI tools such as Cruise Control.  The idea is
> that when an SCM trigger is about to schedule a build, we first check to
> make sure that all of the upstream projects are current (i.e.
> SCM.pollChanges() == false), and if any of these upstream projects are not
> current we do not schedule the build. What then happens (assuming of course
> the upstream project is also configured to poll for SCM builds) is that when
> the upstream project build completes it will trigger the current project to
> build. So the two interdependent projects will always build in the correct
> sequence.
Indeed there's an issue filed about this problem, and IIRC, there has
been some attempt in the past to tackle this by someone.

So thank you very much for hacking this :-)

> Below is a suggested patch that would implement this idea. The only thing
> I'd need to still figure out is what to pass in to pollChanges() for a
> Launcher.  The Perforce plugin (that I'm currently helping with) doesn't
> make use of this, so if we decide the below patch would be good to do, I'm
> looking for advice as to how we'd get a launcher to pass in.

I think the approach is interesting, but there are several issues.

First, pollChanges may touch the workspace, so it can only happen when a
build is not in progress. There exists some really ugly code between
SCMTrigger and Build class that makes sure that one waits for the other.
  You need to take this mutual exclusion into account.

Another issue is that your current implementation only looks at
immediate dependencies, and not all the transitive dependencies. This
also interacts with the starvation issue --- when there's a long chain
of dependencies, this scheme might delay the frequency the downstream
projects are built, and in the worst case it leads to a real starvation.



I wonder whether a different approach is necessary --- for example, if
this were a Subversion repository, maybe a per-repository polling
activity could track every change and which project it went to, and
issue builds accordingly? That doesn't solve the starvation problem, but
it might be more efficient.



--
Kohsuke Kawaguchi
Sun Microsystems                   [hidden email]


Re: implementing correct build order for dependent projects via SCMTrigger

bwestrich
On Nov 7, 2007 7:56 PM, Kohsuke Kawaguchi <[hidden email]> wrote:
> I think the approach is interesting, but there are several issues.
>
> First, pollChanges may touch the workspace, so it can only happen when a
> build is not in progress. There exists some really ugly code between
> SCMTrigger and Build class that makes sure that one waits for the other.
> You need to take this mutual exclusion into account.

Is the resolution simply to call isBuilding() on the (upstream) job, and if it returns true, abort the SCM build of the current project?



> Another issue is that your current implementation only looks at
> immediate dependencies, and not all the transitive dependencies. This
> also interacts with the starvation issue --- when there's a long chain
> of dependencies, this scheme might delay the frequency the downstream
> projects are built, and in the worst case it leads to a real starvation.

You are absolutely right about the transitive dependencies. I didn't think of this at first; the algorithm would have to recurse to find these types of dependencies as well.
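The recursion mentioned above could look roughly like the following. This is a minimal sketch, not Hudson code: the project graph is a plain map from a project name to its direct upstream project names, and `hasPendingChanges` is a hypothetical stand-in for whatever poll (for example SCM.pollChanges()) reports unbuilt changes. The project names are borrowed from the example later in this thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

public class TransitiveUpstream {

    /** Returns every direct and indirect upstream project of the given project. */
    public static Set<String> transitiveUpstream(String project,
                                                 Map<String, List<String>> directUpstream) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> toVisit =
                new ArrayDeque<>(directUpstream.getOrDefault(project, List.of()));
        while (!toVisit.isEmpty()) {
            String current = toVisit.pop();
            // seen.add() returns false for already-visited projects, which also
            // keeps the walk from looping forever on a cyclic configuration.
            if (seen.add(current)) {
                toVisit.addAll(directUpstream.getOrDefault(current, List.of()));
            }
        }
        return seen;
    }

    /** True if any upstream project, at any depth, still has unbuilt changes. */
    public static boolean shouldDefer(String project,
                                      Map<String, List<String>> directUpstream,
                                      Predicate<String> hasPendingChanges) {
        return transitiveUpstream(project, directUpstream).stream()
                .anyMatch(hasPendingChanges);
    }

    public static void main(String[] args) {
        Map<String, List<String>> upstream = Map.of(
                "fw2", List.of("fw1"),
                "app_a", List.of("fw2"));
        // Pretend that only fw1 has unbuilt SCM changes.
        System.out.println(shouldDefer("app_a", upstream, "fw1"::equals)); // true
        System.out.println(shouldDefer("fw1", upstream, "fw1"::equals));   // false
    }
}
```

Note that this polls every upstream project on every check, so it only makes sense where polling is cheap.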

We could leverage a strength of Perforce (and perhaps some of the other SCMs) here.  With Perforce (unlike, for example, CVS), polling for changes is a server-based operation. Perforce keeps track on the server of all pending adds/updates/deletes. Even with a very large source code base, polling for changes with Perforce is typically a subsecond operation. So efficiency concerns about polling (even when done repeatedly/recursively, for example to traverse transitive dependencies) may be moot. Also (not sure if this is significant anymore, but you mentioned this topic in your note), this type of polling does not touch the workspace.

What if we created an interface (e.g. SCMServerBasedPolling) that indicated that a particular implementation of SCM did server-based polling?  The one method on this would be a version of pollChanges() without the Launcher parameter. The SCMTrigger code I proposed in my previous note would only execute for those SCM plugins (e.g. PerforceSCM) that implemented this interface.
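To make the interface idea concrete, here is a rough sketch. Everything in it is hypothetical: `Scm` is only a stand-in for Hudson's SCM class, the project is identified by a plain string rather than an AbstractProject, and the real method would presumably also take a workspace and listener. It only illustrates the shape: an opt-in interface plus an instanceof check in the trigger.

```java
import java.io.IOException;

public class ServerPollingSketch {

    /** Stand-in for Hudson's SCM base type (assumption, not the real class). */
    public interface Scm {}

    /** Hypothetical opt-in interface for SCMs that can poll without a Launcher. */
    public interface SCMServerBasedPolling extends Scm {
        boolean pollChanges(String projectName) throws IOException;
    }

    /**
     * Upstream check as the trigger might perform it: only SCMs that opt in
     * are polled; for all others the upstream check is skipped.
     */
    public static boolean upstreamHasChanges(Scm scm, String projectName) throws IOException {
        if (scm instanceof SCMServerBasedPolling) {
            return ((SCMServerBasedPolling) scm).pollChanges(projectName);
        }
        return false; // workspace-based SCM: keep the existing behavior
    }

    public static void main(String[] args) throws IOException {
        SCMServerBasedPolling serverBased = name -> name.equals("fw1");
        Scm workspaceBased = new Scm() {};
        System.out.println(upstreamHasChanges(serverBased, "fw1"));    // true
        System.out.println(upstreamHasChanges(workspaceBased, "fw1")); // false
    }
}
```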

While we continue to explore other approaches to dependency management, this approach might immediately improve the accuracy of Hudson's dependency management for those SCMs that use server-based polling.
 


> I wonder whether a different approach is necessary --- for example, if
> this were a Subversion repository, maybe a per-repository polling
> activity could track every change and which project it went to, and
> issue builds accordingly? That doesn't solve the starvation problem, but
> it might be more efficient.

Would this involve the user setting up an additional job or jobs? Hopefully this is something that would need minimal additional user setup.


Re: implementing correct build order for dependent projects via SCMTrigger

Jean-Baptiste Quenot
Is this what you want:
http://caraldi.com/jbq/blog/2007/10/10/Better-SCM-polling-for-Hudson/

In that case, please test and provide some feedback.  If it works for
you, we can finally show the setting in the configuration user
interface.
--
Jean-Baptiste Quenot
http://caraldi.com/jbq/

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: implementing correct build order for dependent projects via SCMTrigger

bwestrich
This is what I'm looking for!  I'll try it out and let you know how it goes.


Re: implementing correct build order for dependent projects via SCMTrigger

bwestrich
Did not work for me. The builds were still done in a random order (not even alphabetical).

I refreshed my config files.  Each night I bring hudson completely down and back up again. I'll see if the behavior changes tomorrow...


Re: implementing correct build order for dependent projects via SCMTrigger

bwestrich
Good news!  After a hard restart (of the Tomcat appserver running hudson), the build order now respects the project dependencies!   Apparently doing "refresh from config files" wasn't enough.

Jean-Baptiste, you mentioned a possible plan to put these settings in the GUI. (Again, probably an ignorant question from a newbie Hudson user/developer), what is the benefit of doing this vs. just having this be the default Hudson behavior?



Re: implementing correct build order for dependent projects via SCMTrigger

Jean-Baptiste Quenot
Dear Brian,

Until now few users have expressed this need, so your feedback is much
appreciated.   I'm glad that it works for you, as it took me some time
to implement.

There is basically nothing preventing this from being the default polling
mechanism, except reaching consensus among the developers. The only
drawback I see with that approach is when you have many projects
without any dependencies between them.  In that case you won't be able
to have more than one polling thread.  When there are many projects,
and depending on the network performance to hit the repo, that may
introduce a significant delay to detect changes.
--
Jean-Baptiste Quenot
http://caraldi.com/jbq/


Re: implementing correct build order for dependent projects via SCMTrigger

bwestrich
In reply to this post by bwestrich
An update/correction on my previous note below...

Cruise Control does have a feature (the "veto" feature) that can be used to manage dependency related build issues. It is more complicated to configure than Hudson's, as it doesn't handle transitive dependencies.


On Nov 7, 2007 5:57 AM, Brian Westrich <[hidden email]> wrote:
... I have an idea that looks easy to implement that might get rid of many of the dependency related build issues I've seen when using other CI tools such as Cruise Control. 


Re: implementing correct build order for dependent projects via SCMTrigger

bwestrich
In reply to this post by bwestrich
Jean-Baptiste (or anyone else who knows this),

The build dependency feature continues to perform well for me.  Having said this, I'd like to inspire some confidence in this feature to some others who I'm discussing Hudson with.

Are you aware of a logged message or something else that I can use to verify that the build order is happening because Hudson is respecting project dependencies (vs. just a random event that actually happens to match what is desired)? If not, I'd be happy to add one if you have ideas of the best place to add it.

Brian


Re: implementing correct build order for dependent projects via SCMTrigger

bwestrich
In reply to this post by Jean-Baptiste Quenot
How about if Hudson were to use the current default settings unless at least one project dependency existed?  If one or more project dependencies existed, it would instead use the settings that respect project dependencies.


Re: implementing correct build order for dependent projects via SCMTrigger

Jean-Baptiste Quenot
In reply to this post by bwestrich
2007/11/9, Brian Westrich <[hidden email]>:
>
> Are you aware of a logged message or something else that I can use to verify
> that the build order is happening because Hudson is respecting project
> dependencies (vs. just a random event that actually happens to match what is
> desired)?

I did it once but never checked it in, and then changed computer :S

There are three relevant places where you can add useful log messages:

1) Trigger#checkTriggers()

2) DependencyRunner

3) SCMTrigger#run()

That's all!
--
Jean-Baptiste Quenot
http://caraldi.com/jbq/


Re: implementing correct build order for dependent projects via SCMTrigger

bwestrich
Thanks for the info

I've added some log statements to all three of these classes.

Brian


Re: implementing correct build order for dependent projects via SCMTrigger

bwestrich
Hello Hudson developers,

I've been researching Hudson build order when one checkin affects multiple projects. Here are my initial findings:

Setup:
* Single synchronous polling thread (per http://caraldi.com/jbq/blog/2007/10/10/Better-SCM-polling-for-Hudson/, note that this change requires a Hudson restart)
* Multiple projects that need to be built in a specific order. Inter-project relationships defined using Hudson's "(Trigger) Build after other projects are built / (Post-build) " feature.
* All projects have the same SCM cron settings (e.g. * * * * *).
* Only 1 executor defined.
* Hudson version 1.154-SNAPSHOT

Findings:
* Dependencies are always respected during SCM polling. For example, if project A is supposed to trigger the build of project B, A always does SCM polling before B does.
* Projects that have no relationship to other projects (dependee or dependent) are built BEFORE any projects that have dependencies.
* Dependencies are updated realtime, i.e. if I change the dependencies, the next SCM triggering uses the updated dependencies.

If I define multiple executors, dependencies are no longer respected during builds. For example, if project C depends on B which depends on A, with two executors project B starts building immediately without waiting for project A to complete.


Based on these findings, here's how we might, with no impact to the Hudson configuration UI, optionally respect build dependencies while still by default providing Hudson's current asynchronous multi-thread polling mechanism:

1. If Hudson is set to use only 1 executor, use 1 synchronous polling thread (as specified in the caraldi.com link above). Otherwise (if more than one executor), use the same polling approach Hudson currently uses (i.e. multiple asynchronous polling threads).

2. Add the following statement to the online help for # of executors:  Note: if you wish to guarantee that projects will always build in dependency order, specify only one executor.

3. Leave the default number of executors at 2. This means that the above changes would not modify the default behavior of Hudson. Only those users who set the number of executors to 1 would see these changes.
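The rule in step 1 is small enough to sketch directly. The enum and method names below are made up for illustration and do not exist in Hudson; the only input is the configured number of executors.

```java
public class PollingModeSketch {

    /** Hypothetical names for the two polling strategies discussed above. */
    public enum PollingMode { SYNCHRONOUS_SINGLE_THREAD, ASYNCHRONOUS_MULTI_THREAD }

    /** One executor selects dependency-ordered polling; anything else keeps today's behavior. */
    public static PollingMode choosePollingMode(int numExecutors) {
        if (numExecutors < 1) {
            throw new IllegalArgumentException("numExecutors must be at least 1");
        }
        return numExecutors == 1
                ? PollingMode.SYNCHRONOUS_SINGLE_THREAD
                : PollingMode.ASYNCHRONOUS_MULTI_THREAD;
    }

    public static void main(String[] args) {
        System.out.println(choosePollingMode(1)); // SYNCHRONOUS_SINGLE_THREAD
        System.out.println(choosePollingMode(2)); // ASYNCHRONOUS_MULTI_THREAD
    }
}
```

With the default of 2 executors left unchanged (step 3), existing installations would keep the asynchronous mode.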

Thoughts?

Brian

P.S. This recent note to the users list suggests there might be healthy interest in such a feature...


---------- Forwarded message ----------
From: DaleCooper82 <[hidden email]>
To: [hidden email]
Date: Mon, 12 Nov 2007 08:36:35 -0800 (PST)
Subject: Triggering chained builds and quiet period

Hi,

I have a question regarding triggering downstream builds. We have quite a
few projects (builds) that complete one "big" system - there is client part,
server part and a couple of libraries.

Every project is polling its respective location in SVN. Now as there are
dependencies, the project at the end of the build chain has longest quiet
period, the sum of approx build times of "longest route" to the build. That
is for the case that there are multiple commits at once to the repository.

That is fine however when the build is triggered by some upstream job (i.e.
the last project was not changed in repo, only its dependency) it still
waits for quiet period, making the whole build process longer.

I was wondering whether it is my bad config or whether there are some people
out there having the same "issue". If the latter, would that be a reasonable
request for enhancement to add checkbox "Ignore when triggered by upstream
build" next to quiet period setting?

Cheers,
dale




Re: implementing correct build order for dependent projects via SCMTrigger

Jean-Baptiste Quenot
+1, your proposal looks interesting to me.  As long as this is
properly documented in the configuration UI, it makes sense to have a
different behavior between using one thread and several.  Your
suggestion is good because it does not clutter the configuration
screen, and does not rely on the user deciding between complex
options.

And the advantage is that there are very few changes to make: one
would only need to change Trigger#checkTriggers() to check the number
of threads in SCMTrigger instead of relying on the synchronousPolling
field.

Again, this is a great change provided that it is properly documented
for the end-user.

Thanks for your feedback!
--
Jean-Baptiste Quenot
http://caraldi.com/jbq/


Re: implementing correct build order for dependent projects via SCMTrigger

bwestrich
Thanks to Jean-Baptiste Quenot for his initial implementation of
synchronous SCM polling, and for his feedback on my proposal of a few
days ago.

One limitation of the proposal is that it requires use of only one
executor. So we limit ourselves to serial (not parallel) builds. Here
is a possible way to overcome this limitation....

From walking through code in the debugger, it looks like builds are
added to the queue in the correct order (as computed by
DependencyRunner()), but they are then immediately dispatched to any
free executors. In particular, this dispatching does not wait for any
building projects that the current project depends on (i.e. dependee
projects) to finish building.

To illustrate the functionality we need, I set up the following projects
and dependencies (x <- y means "y depends on x"):
fw1 <- fw2 <- app_a
fw2 <- app_b
app_c (has no dependencies)
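For reference, one way to compute a dependency-respecting order for this graph is a standard topological sort (Kahn's algorithm). This is only a sketch of the general technique; it is not claimed to be what DependencyRunner does internally. The map goes from each project to its direct upstream projects, and every project must appear as a key.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BuildOrderSketch {

    /** Returns the projects ordered so that each one comes after all of its upstream projects. */
    public static List<String> buildOrder(Map<String, List<String>> upstreamOf) {
        Map<String, Integer> unbuiltUpstream = new LinkedHashMap<>();
        Map<String, List<String>> downstreamOf = new HashMap<>();
        for (Map.Entry<String, List<String>> e : upstreamOf.entrySet()) {
            unbuiltUpstream.put(e.getKey(), e.getValue().size());
            for (String up : e.getValue()) {
                downstreamOf.computeIfAbsent(up, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        // Start with the projects that have no upstream at all.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : unbuiltUpstream.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String project = ready.remove();
            order.add(project);
            // A downstream project becomes ready once its last upstream is placed.
            for (String down : downstreamOf.getOrDefault(project, List.of())) {
                if (unbuiltUpstream.merge(down, -1, Integer::sum) == 0) ready.add(down);
            }
        }
        if (order.size() != unbuiltUpstream.size()) {
            throw new IllegalStateException("dependency cycle or unknown upstream project");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> upstreamOf = new LinkedHashMap<>();
        upstreamOf.put("fw1", List.of());
        upstreamOf.put("fw2", List.of("fw1"));
        upstreamOf.put("app_a", List.of("fw2"));
        upstreamOf.put("app_b", List.of("fw2"));
        upstreamOf.put("app_c", List.of());
        System.out.println(buildOrder(upstreamOf));
    }
}
```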

Each project uses the same build script (where $1 is the project name):
echo $1 starting...
sleep 5
echo $1 finished.

I then did a single subversion commit that included a change to every one
of the projects. I had five executors defined.

Here's the results for the current default Hudson SCMTrigger.xml settings
(synchronousPolling=false, maximumThreads=0):

zz_app_b starting...
fw2 starting...
fw1 starting...
zz_app_a starting...
app_c starting...
zz_app_b finished.
fw1 finished.
fw2 finished.
zz_app_a finished.
app_c finished.
zz_app_b starting...
fw2 starting...
zz_app_a starting...
zz_app_b finished.
fw2 finished.
zz_app_b starting...
zz_app_a finished.
zz_app_a starting...
zz_app_b finished.
zz_app_a finished.

The key thing we see is that some projects build before the projects
they depend on. This can lead to transient build failures. Also we see
that any projects that build before their dependee projects are built
multiple times (since they are also triggered to be built when their
dependee projects build).

Here's the results for the current implementation of synchronous polling
(synchronousPolling=true, maximumThreads=1):

app_c starting...
fw1 starting...
zz_app_a starting...
zz_app_b starting...
fw2 starting...
app_c finished.
zz_app_a finished.
fw1 finished.
zz_app_b finished.
fw2 finished.
zz_app_b starting...
zz_app_a starting...
fw2 starting...
zz_app_b finished.
zz_app_a finished.
fw2 finished.
zz_app_b starting...
zz_app_a starting...
zz_app_b finished.
zz_app_a finished.

The most important thing to notice about these results is that some
projects start to build before the projects they depend on have finished
building (which can lead to transient build failures). Also note that any
such project will get built multiple times (since it is also triggered to
be built when its dependee finishes building). It's curious (but I think
not central to our discussion) that the builds do not start in dependency
order even though they were added to the queue in that order (i.e. fw1,
fw2, zz_app_a/b). Apparently the order in which builds are assigned to
executors is not necessarily the order in which the builds start.

To test the theory that we can use multiple executors if we
remove projects from the queue whose dependee projects are building,
I then changed my local copy of Hudson to do such a removal. As
expected, these removed projects were eventually built, their
build being triggered by the completion of the dependee project.

Here's the build order I got using the above synchronous polling settings
with this change in place.

app_c starting...
fw1 starting...
app_c finished.
fw1 finished.
fw2 starting...
fw2 finished.
zz_app_b starting...
zz_app_a starting...
zz_app_b finished.
zz_app_a finished.

We now see that the top level project (fw1) builds before any of the
projects that depend on it, but the project with no dependencies (app_c)
builds as soon as possible and in parallel with fw1. Also note that
although zz_app_a and zz_app_b have dependencies, because one does not
depend on the other they both build in parallel.
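
To make the rule concrete, here is a minimal standalone sketch in Java that replays it on the five test projects. It is not Hudson code: the class and method names are hypothetical, every build is assumed to take the same time, app_c is assumed to sit first in the queue, and the dependency graph is assumed to be acyclic. A queued project is skipped while any of its transitive upstream projects is building, and each finished build triggers its direct downstream projects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DependencyOrderSketch {
    // Direct upstream ("dependee") projects of each project, taken from the test setup.
    static final Map<String, List<String>> UPSTREAM = new LinkedHashMap<>();
    static {
        UPSTREAM.put("app_c", List.of());
        UPSTREAM.put("fw1", List.of());
        UPSTREAM.put("fw2", List.of("fw1"));
        UPSTREAM.put("zz_app_a", List.of("fw2"));
        UPSTREAM.put("zz_app_b", List.of("fw2"));
    }

    // Collects all direct and indirect upstream projects of the given project.
    static Set<String> transitiveUpstream(String project) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>(UPSTREAM.get(project));
        while (!todo.isEmpty()) {
            String p = todo.pop();
            if (seen.add(p)) {
                todo.addAll(UPSTREAM.get(p));
            }
        }
        return seen;
    }

    // Returns the successive groups of projects that build in parallel.
    static List<List<String>> simulate(List<String> initialQueue) {
        List<List<String>> waves = new ArrayList<>();
        List<String> queue = new ArrayList<>(initialQueue);
        while (!queue.isEmpty()) {
            Set<String> building = new LinkedHashSet<>();
            for (String task : queue) {
                // Rule under test: drop the task while any upstream project is building.
                if (Collections.disjoint(transitiveUpstream(task), building)) {
                    building.add(task);
                }
            }
            waves.add(new ArrayList<>(building));
            // Each finished build triggers its direct downstream projects.
            Set<String> triggered = new LinkedHashSet<>();
            for (Map.Entry<String, List<String>> e : UPSTREAM.entrySet()) {
                if (!Collections.disjoint(e.getValue(), building)) {
                    triggered.add(e.getKey());
                }
            }
            queue = new ArrayList<>(triggered);
        }
        return waves;
    }

    public static void main(String[] args) {
        // Prints [[app_c, fw1], [fw2], [zz_app_a, zz_app_b]]
        System.out.println(simulate(List.of("app_c", "fw1", "fw2", "zz_app_a", "zz_app_b")));
    }
}
```

The three groups match the log above: app_c and fw1 together, then fw2, then the two apps. The order inside a group is not meaningful in this sketch.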

So, with this change, we are able to achieve single-threaded polling (which guarantees
dependency ordered builds) as well as parallel builds (for projects that
don't depend on each other).  At the end of this note is a patch containing this change.

The patch appears to work correctly regardless
of the maximum polling threads setting in SCMTrigger.xml. So
we only have one attribute to change in order to get dependency
ordered builds to happen. This simplifies how we could expose
the feature in the GUI. For example, we could implement it as a
checkbox with a label saying something like "Build projects in
dependency order", or the less pejorative but harder to fathom "Use
single threaded SCM polling", with online help saying "Using
this option guarantees that projects will build in the order of their
dependencies, but may lengthen the time it takes builds to complete."

Please share your thoughts, in particular your opinions on whether I should
check this patch into HEAD, and whether we should implement a GUI option
related to synchronous polling.

Brian

P.S. The patch follows. Note that it has no effect unless
SCMTrigger.synchronousPolling is true, so I think the risks of
checking it into HEAD would be minimal.

### Eclipse Workspace Patch 1.0
#P hudson-core
Index: src/main/java/hudson/model/Executor.java
===================================================================
RCS file: /cvs/hudson/hudson/main/core/src/main/java/hudson/model/Executor.java,v
retrieving revision 1.11
diff -u -r1.11 Executor.java
--- src/main/java/hudson/model/Executor.java    17 Nov 2007 01:58:36 -0000    1.11
+++ src/main/java/hudson/model/Executor.java    17 Nov 2007 12:47:49 -0000
@@ -1,6 +1,7 @@
 package hudson.model;
 
 import hudson.Util;
+import hudson.triggers.SCMTrigger;
 
 import java.io.IOException;
 
@@ -58,6 +59,15 @@
                     continue;
                 }
 
+                // don't build task if a task it depends on is being built,
+                // as the building task will trigger the dependent task when
+                // it has finished building.
+                if (SCMTrigger.DESCRIPTOR.synchronousPolling == true) {
+                    if (isDependentOnAnyBuildingProject((AbstractProject<?,?>)task)) {
+                        continue;
+                    }
+                }
+
                 try {
                     startTime = System.currentTimeMillis();
                     executable = task.createExecutable();
@@ -79,6 +89,21 @@
         }
     }
 
+    /** find whether a project is dependent on any building projects.
+     *
+     * @param project the dependent project.
+     * @return true if at least one dependee is building.
+     */
+    @SuppressWarnings("unchecked") // due to getTransitiveUpstreamProjects()
+    private boolean isDependentOnAnyBuildingProject(AbstractProject<?,?> project) {
+        for (AbstractProject upstream : project.getTransitiveUpstreamProjects()) {
+            if (upstream.isBuilding()) {
+                return true;
+            }
+        }
+        return false;
+    }
+
     /**
      * Returns the current {@link Queue.Task} this executor is running.
      *


Re: implementing correct build order for dependent projects via SCMTrigger

Jean-Baptiste Quenot
Dear Brian,

I read your post, and have a few questions:

1) Don't you think Executor should *always* wait for parent projects'
builds to complete before building the project, regardless of
synchronous polling? One can use project dependencies without even
an SCM system. However, I can't comment much on the executor problem
because I haven't run into it.

2) I would remove the SCMTrigger.DESCRIPTOR.synchronousPolling field
and create a new method SCMTrigger.isSynchronousPolling() returning
true when the number of threads is 1 as we discussed previously.
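
A minimal sketch of point 2, assuming a hypothetical settings class rather than Hudson's actual descriptor. The name PollingConfigSketch and the reading of 0 as "no fixed limit" are assumptions made for illustration only. The point is that the flag becomes a derived property of the thread count instead of a separately stored field:

```java
class PollingConfigSketch {
    // Hypothetical stand-in for the polling settings; not Hudson's real class.
    // In this sketch 0 is taken to mean "no fixed limit" (an assumption).
    private final int maximumThreads;

    PollingConfigSketch(int maximumThreads) {
        if (maximumThreads < 0) {
            throw new IllegalArgumentException("maximumThreads must be >= 0");
        }
        this.maximumThreads = maximumThreads;
    }

    // Polling counts as synchronous exactly when a single thread does all of it.
    boolean isSynchronousPolling() {
        return maximumThreads == 1;
    }

    public static void main(String[] args) {
        System.out.println(new PollingConfigSketch(1).isSynchronousPolling()); // true
        System.out.println(new PollingConfigSketch(0).isSynchronousPolling()); // false
    }
}
```

With this shape there is only one setting to persist, so the two values cannot disagree with each other.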

Just curious, how do you manage to test the various options?  You seem
to have reliable and reproducible test cases :-)
--
Jean-Baptiste Quenot
http://caraldi.com/jbq/blog/

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: implementing correct build order for dependent projects via SCMTrigger

bwestrich
In reply to this post by bwestrich
> On Sun, 18 Nov 2007 11:28:08 +0100, Jean-Baptiste Quenot <[hidden email]> wrote:

> 1) Don't you think Executor should *always* wait for parent projects'
> builds to complete before building the project, regardless of
> synchronous polling? One can use project dependencies without even
> an SCM system. However, I can't comment much on the executor problem
> because I haven't run into it.

+1
Yes, I think it makes sense for the Executor to always behave this
way. Another benefit of removing the
(SCMTrigger.DESCRIPTOR.synchronousPolling == true) check from Executor
(and instead always looking for building dependee projects) is that
Executor would no longer have a source code dependency on SCMTrigger.

Ideally a few others on the list will weigh in. I suspect we could
commit this change (i.e. always having the executor wait for parent
project builds to complete) to HEAD independently of the discussion on
synchronous/asynchronous SCM polling.

> 2) I would remove the SCMTrigger.DESCRIPTOR.synchronousPolling field
> and create a new method SCMTrigger.isSynchronousPolling() returning
> true when the number of threads is 1 as we discussed previously.

I need to research this a little more before replying ......


Re: implementing correct build order for dependent projects via SCMTrigger

Kohsuke Kawaguchi
Administrator
In reply to this post by bwestrich
I agree with Jean-Baptiste that executors shouldn't always wait for
the upstream projects to complete, as that would lead to a starvation
issue.

I think a little formalism is helpful. Given a project X and its
transitive dependencies D={P1,P2,...}, what we are trying to guarantee
here is that when we build X at a timestamp T(X) (where the last
change in SCM before T(X) is LC(X)), no new changes were made in SCM
in Pi between the time it was last built, T(Pi), and LC(X).

A new build of X can remain in the queue if this condition does not hold.

LC(X) needs to be computed before a build is started, so for example
on CVS, you can't really cheaply compute LC(X). In those cases this
can be approximated by T(X).

I think I need to be more awake to make sure this is really right, but
I think the essence of the problem is something like this. This
formalism avoids the executor starvation issue even if the upstream is
building frequently.
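
One way to read the condition above, as a minimal standalone Java sketch. The Upstream class, the mayBuild method and the sample timestamps are all hypothetical, and treating the interval as (T(Pi), LC(X)] is an assumption about the boundaries, which the description leaves open:

```java
import java.time.Instant;
import java.util.List;

class UpstreamFreshnessSketch {
    // Hypothetical view of one upstream project Pi.
    static final class Upstream {
        final String name;
        final Instant lastBuilt;        // T(Pi)
        final List<Instant> scmChanges; // times of SCM changes made in Pi

        Upstream(String name, Instant lastBuilt, List<Instant> scmChanges) {
            this.name = name;
            this.lastBuilt = lastBuilt;
            this.scmChanges = scmChanges;
        }
    }

    // Returns true if X may build: no Pi has a change c with T(Pi) < c <= LC(X).
    static boolean mayBuild(Instant lastChangeOfX, List<Upstream> dependencies) {
        for (Upstream p : dependencies) {
            for (Instant change : p.scmChanges) {
                boolean afterLastBuild = change.isAfter(p.lastBuilt);
                boolean notAfterLcX = !change.isAfter(lastChangeOfX);
                if (afterLastBuild && notAfterLcX) {
                    return false; // Pi has an unbuilt change that X could see
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Upstream fw1 = new Upstream("fw1", Instant.ofEpochSecond(100),
                List.of(Instant.ofEpochSecond(50), Instant.ofEpochSecond(120)));
        // LC(X) = 150: the change at 120 is newer than fw1's last build, so X stays queued.
        System.out.println(mayBuild(Instant.ofEpochSecond(150), List.of(fw1))); // false
        // LC(X) = 110: the change at 120 is later than LC(X), so X may build.
        System.out.println(mayBuild(Instant.ofEpochSecond(110), List.of(fw1))); // true
    }
}
```

Unlike a check on whether an upstream project is currently building, this test depends only on timestamps, so a busy upstream does not by itself keep X in the queue.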

--
Kohsuke Kawaguchi


Re: implementing correct build order for dependent projects via SCMTrigger

bwestrich
In reply to this post by bwestrich
Kohsuke,

Thanks for your formal statement of the problem.

I'm still curious as to the results of the tests that I reported a
couple of days ago. I am hoping to soon migrate my tests to Hudson
tester where all can run and analyze them. I'll post another note when
this is done...  It would be interesting if someone wrote a test in
Hudson tester that demonstrates executor starvation -- I'm probably
not the best person to do this as I don't understand this problem very
well (though I guess this hasn't stopped me in the past :-)  ).
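
As a starting point for such a test, here is a minimal standalone Java simulation of one possible reading of the starvation concern. It is not Hudson tester code, and every name and number in it is hypothetical. It assumes one upstream and one downstream project, an upstream that is always queued ahead of the downstream, a downstream build that takes no time, and the rule that a downstream build is dropped whenever the upstream is building:

```java
class StarvationSketch {
    // Counts how often the downstream project gets to build within the given
    // number of ticks, when it is dropped whenever the upstream is building.
    static int downstreamBuilds(int ticks, int upstreamDuration, int commitInterval) {
        int remaining = 0;                // ticks left in the current upstream build
        boolean upstreamPending = false;  // an upstream commit is waiting to be built
        boolean downstreamQueued = false;
        int builds = 0;
        for (int tick = 0; tick < ticks; tick++) {
            if (tick % commitInterval == 0) {
                upstreamPending = true;   // polling sees a new upstream commit
            }
            if (remaining > 0) {
                remaining--;
                if (remaining == 0) {
                    downstreamQueued = true; // finished upstream triggers downstream
                }
            }
            if (remaining == 0 && upstreamPending) {
                remaining = upstreamDuration; // upstream is first in the queue
                upstreamPending = false;
            }
            if (downstreamQueued) {
                downstreamQueued = false;
                if (remaining == 0) {
                    builds++;             // upstream idle: downstream builds
                }                         // otherwise the task is dropped
            }
        }
        return builds;
    }

    public static void main(String[] args) {
        // Commits arrive faster than upstream builds finish: downstream never builds.
        System.out.println(downstreamBuilds(100, 5, 3));  // 0
        // Commits are rare: downstream builds after each upstream build.
        System.out.println(downstreamBuilds(100, 5, 20)); // 5
    }
}
```

In the first run the upstream restarts in the same tick in which it finishes, so the downstream is dropped every time it is triggered. Whether this is the situation Kohsuke has in mind is an assumption on my part.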

Brian


>> Date: Mon, 19 Nov 2007 08:53:46 -0800
>> From: Kohsuke Kawaguchi <[hidden email]>
>> Content-Type: text/plain; charset=ISO-8859-1
>> Subject: implementing correct build order for dependent projects via SCMTrigger
>>
>>
>> I agree with Jean-Baptiste that executors shouldn't always wait for
>> the upstream projects to complete, as that would lead to a starvation
>> issue.
>>
>> I think a little formalism is helpful. Given a project X and its
>> transitive dependencies D={P1,P2,...}, what we are trying to guarantee
>> here is that when we build X at a timestamp T(X) (where the last
>> change in SCM before T(X) is LC(X)), no new changes were made in SCM
>> in Pi between the time it was last built, T(Pi), and LC(X).
>>
>> A new build of X can remain in the queue if this condition does not hold.
>>
>> LC(X) needs to be computed before a build is started, so for example
>> on CVS, you can't really cheaply compute LC(X). In those cases this
>> can be approximated by T(X).
>>
>> I think I need to be more awake to make sure this is really right, but
>> I think the essence of the problem is something like this. This
>> formalism avoids the executor starvation issue even if the upstream is
>> building frequently.
>>
>> 2007/11/17, Brian Westrich <[hidden email]>:
>> > Thanks to Jean-Baptiste Quenot for his initial implementation of
>> > synchronous SCM polling, and for his feedback on my proposal of a few
>> > days ago.
