January 24, 2008
SCM: Continuous vs. Controlled Integration
Non-stop Integration
Agile methods clearly enforce frequent build and release cycles, but many development groups have ended up implementing what has been called non-stop integration. What does that mean? Instead of running integrations frequently, developers integrate all the time. A developer makes a change and checks the code in, and the build system runs all the available test suites. If the build breaks (the code doesn't compile, or some of the tests fail), developers receive a warning telling them to fix the problem. Integrations are now continuous in the literal sense: they occur all the time.
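The check-in cycle just described can be sketched in a few lines of Python. This is only an illustrative model, not the behavior of any particular build tool: the names `Commit`, `BuildSystem`, and `check_in`, and the `compiles` flag standing in for a real compile step, are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Commit:
    author: str
    description: str
    compiles: bool = True  # hypothetical stand-in for a real compile step


@dataclass
class BuildSystem:
    tests: List[Callable[[Commit], bool]]
    mainline: List[Commit] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)

    def check_in(self, commit: Commit) -> bool:
        # The change reaches the mainline first; the build runs afterwards.
        self.mainline.append(commit)
        ok = commit.compiles and all(test(commit) for test in self.tests)
        if not ok:
            # A broken build only produces a warning; the commit stays in place.
            self.warnings.append(
                f"Build broken by {commit.author}: {commit.description}"
            )
        return ok


# Hypothetical test suite with a single, deliberately naive check.
build = BuildSystem(tests=[lambda c: "TODO" not in c.description])
build.check_in(Commit("alice", "add login form"))
build.check_in(Commit("bob", "refactor parser", compiles=False))
print(build.warnings)
```

Note that in this sketch the failing commit is still on the mainline after the warning is raised, which is the property the rest of this section is concerned with.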
The key difference between continuous integration and the evil code-and-fix cycle seems to be the presence of a well-defined test suite, plus a firm commitment from developers to run it all the time (or build software that enforces it).
But is continuous integration the solution to all version-control headaches, or does it introduce problems of its own?
In a perfect world the test suite would be almost perfect, so if it ran correctly no problem would ever occur. But in reality test suites are far from complete, and it is easy to see how a problem introduced by one developer reaches the main code line immediately without being properly checked. Once detected, it will be fixed, but in the meantime many developers will have been affected. Figure 4 illustrates a bug-spreading scenario.
Figure 4: Bug spreading and mainline instability as consequences of continuous integration.
Imagine a developer who finishes a task and wants somebody from testing to check whether it is correct. To deliver the code, he checks it in to the version-control system, triggers the build scripts, and asks his colleague to get the code and verify it. The only reason to submit the code at that point was to make it available in a managed way. If the code has a problem or doesn't implement the feature correctly, the mainline is already infected by the mistake. Because all the team members are basically doing the same thing, within a short period a lot of code will have been built on top of the faulty change.
Figure 5 shows a set of tasks being integrated directly into the mainline, as happens with the continuous integration working pattern. There is only one way for developers to deliver code: merging it into the mainline. In Figure 5, after tasks 1098, 1099, 1100, and 1104 have been delivered, what happens if task 1098 turns out to be defective? The obvious answer is that it has to be fixed. But what if you need to release the code to a client, or just to the testing group, and you already know the changes introduced by 1098 are wrong but you don't have time to fix them? Most likely the features introduced by tasks 1099, 1100, and 1104 are totally independent of 1098, and they could have been delivered properly if another working pattern had been used. Task independence becomes more common after the initial phase of a project; during that phase, tasks tend to be extremely dependent on each other because the project is still in its infancy.
Figure 5: A task introducing a problem and all the rest building on top of it.
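The situation in Figure 5 can be made concrete with a small sketch. It assumes a deliberately simplified model in which the mainline is just the ordered list of delivered tasks and each delivery produces a snapshot containing everything delivered so far; the function names `mainline_snapshots` and `clean_snapshots` are hypothetical and exist only for this illustration.

```python
from typing import List, Set


def mainline_snapshots(delivery_order: List[int]) -> List[Set[int]]:
    """Return the set of tasks contained in the mainline after each delivery."""
    snapshots: List[Set[int]] = []
    included: Set[int] = set()
    for task in delivery_order:
        # Each delivery builds on everything merged before it.
        included = included | {task}
        snapshots.append(included)
    return snapshots


def clean_snapshots(delivery_order: List[int], defective: int) -> List[Set[int]]:
    """Return the mainline snapshots that do not contain the defective task."""
    return [s for s in mainline_snapshots(delivery_order) if defective not in s]


delivered = [1098, 1099, 1100, 1104]
snapshots = mainline_snapshots(delivered)
tainted = [s for s in snapshots if 1098 in s]
print(len(tainted), "of", len(snapshots), "mainline snapshots contain task 1098")
# prints "4 of 4 mainline snapshots contain task 1098"
```

Because task 1098 was delivered first, `clean_snapshots(delivered, 1098)` returns an empty list: under this model there is no point on the mainline from which tasks 1099, 1100, and 1104 can be released without also shipping 1098.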
Table 1: Continuous integration drawbacks