I'd like to set up chained builds. I understand chained builds as "multiple projects which depend on each other and where changes have been pushed to the same branch in each."
The typical case here is the master branch. People supply features. Eventually, they are merged to the master branch. Now, all downstream builds should run if they
1. have the same branch
2. depend on the same Maven version as the project just built
If I have a logical chain of projects A -> B -> C ("->" == "depends on") and "C" is modified, I want to build B and then A.
- A should wait for the build of B since there is a chance that it might fail otherwise
- A should only build when it has a SNAPSHOT dependency on C. If I have a release 1.0 of C and 2.0-SNAPSHOT in the master branch but A depends on C 1.0, no build is necessary (but wouldn't hurt)
- Builds should not rely on some global Maven repo.
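To make the ordering and SNAPSHOT rules above concrete, here is a minimal sketch in Python. The `DEPENDS_ON` table and the function name are made up for illustration; a real setup would have to read this data from the POMs. It assumes Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency data: project -> {dependency: version it asks for}.
DEPENDS_ON = {
    "A": {"B": "2.0-SNAPSHOT"},
    "B": {"C": "2.0-SNAPSHOT"},
    "C": {},
}

def downstream_build_order(changed, depends_on):
    """Return the projects to rebuild after `changed`, upstream first."""
    # Only SNAPSHOT edges can pick up a fresh build of the dependency.
    snapshot_deps = {
        project: {dep for dep, version in deps.items()
                  if version.endswith("-SNAPSHOT")}
        for project, deps in depends_on.items()
    }
    # Collect everything that reaches `changed` through SNAPSHOT edges.
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for project, deps in snapshot_deps.items():
            if project not in affected and deps & affected:
                affected.add(project)
                grew = True
    # Sort so that each project comes after the dependencies it waits for.
    graph = {p: snapshot_deps[p] & affected for p in affected}
    order = list(TopologicalSorter(graph).static_order())
    order.remove(changed)
    return order

print(downstream_build_order("C", DEPENDS_ON))  # ['B', 'A']
```

If B depended on the release `1.0` of C instead, the function would return an empty list, which matches the "no build is necessary" case.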
Reasons for not relying on a global Maven repo:
A global Maven repo is very much like a global variable. Changes there always have side effects.
If a SNAPSHOT is pushed to a global Maven repo, everyone in the company will get this new version first thing in the morning (first Maven build of the day, when it updates SNAPSHOT dependencies). That can cause all kinds of weird problems. So I'm very reluctant to publish SNAPSHOT dependencies globally.
This becomes worse when feature branches are introduced. When several feature branches are built concurrently, no one can tell which version ends up in the global repo. When downstream builds start, they will randomly fail.
I haven't found a good solution for this last point.
I could create a new Maven repo when the build starts for C and then pass the path to B. If B gets a path to a Maven repo, it uses it; otherwise, it creates a new Maven repo. etc.
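Roughly, that "use the passed repo or create one" idea could look like the sketch below. `UPSTREAM_MAVEN_REPO` is a hypothetical parameter name that the upstream job would have to set; `-Dmaven.repo.local` is the standard Maven property for pointing a build at a specific local repository.

```python
import os
import tempfile

def resolve_repo(env):
    """Reuse the upstream repo if one was passed in, else create a fresh one."""
    # UPSTREAM_MAVEN_REPO is a made-up name for the parameter handed down by C.
    repo = env.get("UPSTREAM_MAVEN_REPO")
    if not repo:
        repo = tempfile.mkdtemp(prefix="maven-repo-")
    return repo

def maven_command(repo, goals=("clean", "install")):
    """Build the Maven command line that uses `repo` as its local repository."""
    return ["mvn", f"-Dmaven.repo.local={repo}", *goals]

print(maven_command(resolve_repo({"UPSTREAM_MAVEN_REPO": "/builds/C/repo"})))
# ['mvn', '-Dmaven.repo.local=/builds/C/repo', 'clean', 'install']
```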
Problem: When the projects build on different nodes, this fails unless I use a network filesystem. That adds brittleness, and I'm not 100% sure how Maven handles concurrent access to a local repo.
I could use Jenkins to pass the artifacts around but that means archiving them on Master and then downloading them on the client. Archiving is somewhat slow and a burden on the master node (especially when hundreds of projects build). But the main problem is to know what to download and how to get it into the local Maven repo. I guess I could look at the current job and find the upstream job and then just download all archived artifacts and try to install them. Not sure whether that would work.
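The "download and try to install" step might be sketched like this. It assumes (this is only an assumption about how the artifacts would be archived) that every jar sits next to a `.pom` file with the same base name; the function only builds the `mvn install:install-file` command lines and does not run them.

```python
from pathlib import Path

def install_commands(artifact_dir, repo):
    """Build one `mvn install:install-file` call per downloaded jar."""
    commands = []
    for jar in sorted(Path(artifact_dir).glob("*.jar")):
        pom = jar.with_suffix(".pom")
        if not pom.exists():
            # Without a POM the Maven coordinates are unknown, so skip the jar.
            continue
        commands.append([
            "mvn", "install:install-file",
            f"-Dmaven.repo.local={repo}",
            f"-Dfile={jar}",
            f"-DpomFile={pom}",
        ])
    return commands
```

Each returned list could then be handed to `subprocess.run(cmd, check=True)` on the build node.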
A more serious problem is when I find a problem in B and push a new commit to the feature branch. Now the build starts with B. If I rely on C creating the repo for me, the build will fail because the new code from C is not visible anymore. B will download the last master branch from the global Maven repo and fail. If I use the "copy archived artifacts" approach, I have the same problem because there is no upstream "C" job anymore.
So I could create a local Maven repo using the branch name. That would help with feature branches but raise new issues: When can I safely delete those? If I delete too early (say, every night), starting a build with B will randomly fail again.
It would also mean that a lot of projects would eventually build into the repo with the name "master". There would be almost no way to clean that up. Maybe I can use the global Maven repo for "master" + "release" builds and local repos for everything else. Then people would have to remember that when debugging build problems.
One option would be to deny commits to master which contain SNAPSHOT dependencies (so the project itself could use a SNAPSHOT version but all the dependencies would have to be releases). That sounds like a good solution at first, but in our case, "A" is a client-specific project. For some products ("B"), we have 20+ clients. Most of them stay at the latest release build but a few are part of the next release. It would be a big overhead to force a release of the product every time a new feature is integrated into a client. Imagine having to do release builds 2-3 times a day. We would prefer to have one release build per release cycle of a product and keep the product and all involved clients at SNAPSHOT for the whole cycle, because that would allow us to notice early when some feature for client X breaks client Y.
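For reference, the "deny SNAPSHOT dependencies" check itself would be cheap. A minimal sketch, assuming the versions are written literally in the POM's top-level `<dependencies>` section (it does not resolve properties, parents or `dependencyManagement`):

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://maven.apache.org/POM/4.0.0"}

def snapshot_dependencies(pom_xml):
    """List the direct dependencies whose version ends in -SNAPSHOT."""
    root = ET.fromstring(pom_xml)
    found = []
    for dep in root.findall("m:dependencies/m:dependency", NS):
        version = dep.findtext("m:version", default="", namespaces=NS).strip()
        if version.endswith("-SNAPSHOT"):
            group = dep.findtext("m:groupId", default="", namespaces=NS)
            artifact = dep.findtext("m:artifactId", default="", namespaces=NS)
            found.append(f"{group}:{artifact}:{version}")
    return found

# Made-up POM: the project itself is a SNAPSHOT, which is allowed.
POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <artifactId>A</artifactId>
  <version>2.0-SNAPSHOT</version>
  <dependencies>
    <dependency><groupId>com.example</groupId><artifactId>B</artifactId><version>2.0-SNAPSHOT</version></dependency>
    <dependency><groupId>com.example</groupId><artifactId>C</artifactId><version>1.0</version></dependency>
  </dependencies>
</project>"""

print(snapshot_dependencies(POM))  # ['com.example:B:2.0-SNAPSHOT']
```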
So my final design looks like this:
- Every project has one local Maven repo per branch (somehow shared between nodes)
- When a build of C succeeds, it triggers B. A waits. B copies the whole repo of C into its own.
- When B has been built, all the A's copy B's repo into their own and build.
That would allow starting a chained build at any point.
If the Maven repos get corrupted, we can delete them all and trigger a build of C to recreate them.
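The copy step of this design could be sketched as follows. The directory layout `<root>/<project>/<branch>` is just an assumption for illustration, and `dirs_exist_ok` needs Python 3.8+.

```python
import shutil
from pathlib import Path

def branch_repo(root, project, branch):
    """Path of the per-branch repo (hypothetical layout: <root>/<project>/<branch>)."""
    # Slashes in branch names like "feature/x" would create nested directories.
    return Path(root) / project / branch.replace("/", "_")

def seed_from_upstream(root, upstream, downstream, branch):
    """Copy the upstream project's repo for `branch` into the downstream one."""
    src = branch_repo(root, upstream, branch)
    dst = branch_repo(root, downstream, branch)
    if src.is_dir():
        # Merge into an existing repo instead of failing when dst already exists.
        shutil.copytree(src, dst, dirs_exist_ok=True)
    else:
        # No upstream repo for this branch: start with an empty one.
        dst.mkdir(parents=True, exist_ok=True)
    return dst
```

B would call `seed_from_upstream(root, "C", "B", branch)` before its Maven run, and every A would do the same with B as upstream.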
What I don't like here is the disk usage. Even for simple projects, Maven downloads 100-200 MB of code for its plugins. So 100 repos (10 projects with 10 feature branches each) would need 10-20 GB of disk space, and that grows with every additional project and branch. It's also somewhat slow, but in my tests, copying 200 MB of Maven repo took < 10s, so it's bearable.
Also, I'm not sure how to solve "fragmented" chains when there is a feature branch for A and C but not for B. In this case, the build of C needs to trigger just A, skipping B. A may depend on the SNAPSHOT version of B from the master branch.
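One way to picture the fragmented case is to split the downstream order into "build on this branch" and "reuse from another branch". This is only a sketch with made-up data; where the branch lists come from and whether falling back to master is the right choice are exactly the open questions.

```python
def plan_branch_build(order, branches, branch, fallback="master"):
    """Split a downstream order into projects to build and projects to reuse."""
    # Projects that have the branch get built, in the given order.
    to_build = [p for p in order if branch in branches.get(p, set())]
    # Everything else is skipped and its artifacts come from the fallback branch.
    reuse = {p: fallback for p in order if p not in to_build}
    return to_build, reuse

# Hypothetical branch lists: B has no "feature-x" branch.
BRANCHES = {
    "A": {"master", "feature-x"},
    "B": {"master"},
    "C": {"master", "feature-x"},
}

print(plan_branch_build(["B", "A"], BRANCHES, "feature-x"))
# (['A'], {'B': 'master'})
```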
Any comments? Has someone already set up something like this? Does this work reliably? How can I solve the disk space issue?