We had an interesting debate at work today about whether it is OK to check in on a red build. We needed a release candidate for the QA team to test, but the build status in our CI environment was red, which was preventing a developer with fixes for the release candidate from checking in. This had been the state for a couple of hours, with the situation not being resolved. So I hear you say, “Why not just revert to a green state so the developer can check in on a green build?” Well, now throw into the mix the idea of flakey tests (a flakey test being one which fails sometimes but goes green when you rerun it). Now how far do you revert back? How would you know where a known good state was without removing all of the flakey tests? How do we know whether our current red build status is just a manifestation of a number of flakey tests?
So the decision was made for the developer to check in his fixes on a red build and, lo and behold, it eventually went green. This sparked the debate: should we really have a blanket rule saying no check-ins on red?
Pros Of Allowing Developers To Check In On A Red Build
- Developers do not feel frustrated whilst they wait for hours for a green build
- Developers can practice small commits often to the CI environment
- Their checkin may not cause additional failures
- Stories not related to the release candidate can still proceed
- The build status may be holding up developers arbitrarily if the failure is not genuine
Cons Of Allowing Developers To Check In On A Red Build
- The release candidate must be green before it can be released to live, so check-ins on red may hinder progress
- Relies on developers' confidence that their additional check-in on red will not break anything else
- Not starting from a known good state from a testing perspective
- The problem of the red build may become compounded if several developers all check in at the same time
- The build may end up staying red for longer than if additional check-ins had been held off
My personal opinion is that if you have the following situation within your organization, then it would probably be acceptable (not great) to check in on a red build:
- A stable environment with no flakey tests
- A close-knit team not spread among multiple sites
- No need for frequent release candidates
- A short build time to allow for quick reverts
- The ability to diagnose test failures quickly
I would be genuinely interested to see how many firms have all of this. I also realize that the elephant in the room in our organization is the need to fix the flakey tests, so that a red build is guaranteed to indicate a genuine failure.
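As a rough illustration of how we might start telling flakey failures apart from genuine ones, here is a minimal sketch in Python. It assumes each test can be modelled as a function that returns True on a pass, and it simply reruns anything that fails. The names classify_failures, build_is_genuinely_red and make_flakey_test are hypothetical helpers made up for this post, not part of our CI tooling or any real product.

```python
from typing import Callable, Dict


def classify_failures(
    tests: Dict[str, Callable[[], bool]], reruns: int = 3
) -> Dict[str, str]:
    """Run each test once; rerun failures to separate flakey from genuine.

    Hypothetical helper: a test is any callable returning True on a pass.
    """
    results: Dict[str, str] = {}
    for name, test in tests.items():
        if test():
            results[name] = "pass"
        # First run failed: a pass on any rerun marks the test as flakey.
        elif any(test() for _ in range(reruns)):
            results[name] = "flakey"
        else:
            results[name] = "genuine failure"
    return results


def build_is_genuinely_red(results: Dict[str, str]) -> bool:
    """The build only counts as red if at least one failure is genuine."""
    return any(status == "genuine failure" for status in results.values())


def make_flakey_test() -> Callable[[], bool]:
    """Simulate a flakey test: fails on its first call, passes afterwards."""
    calls = {"count": 0}

    def test() -> bool:
        calls["count"] += 1
        return calls["count"] > 1

    return test


if __name__ == "__main__":
    outcome = classify_failures(
        {
            "always_passes": lambda: True,
            "always_fails": lambda: False,
            "fails_once": make_flakey_test(),
        }
    )
    print(outcome)
    print("genuinely red:", build_is_genuinely_red(outcome))
```

Rerunning only hides the symptom, of course: a test that comes out as flakey here still needs fixing, and a genuinely broken test that happens to pass on a rerun would be misclassified. Still, a report like this would at least tell us whether a red build is worth blocking check-ins for.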