tips on fixing a broken build

When builds break on the build server, it can be hard to figure out what the problem is. I’ve been doing this a lot of times, and so here’s a collection of tips that can help you in fixing that weird error that keeps breaking the build. Build systems are pretty similar, so whether you’re using TeamCity, Azure Devops, Gitlab, Jenkins, Circle CI or whatever, these should pretty much work everywhere.

Make sure it builds locally.

This has to be the “Did you try turning it on and off again?” of build fixing. Sometimes that build really is broken. A file might not have been checked in, or a merge might have gone wrong, but what you have on your local machine, might not be what’s in source control.
Get a clean copy of your repo in a separate folder, and build that to check if it builds OK. At least you’ll be 100% sure it’s nothing trivial before you start your investigation.

Try a clean build.

Another “try turning it on and off again” tip, I guess. On most build servers, you can opt to run a clean build. This means the agent will set up a fresh checkout folder for the new build. This way you make sure no leftovers from a previous build are causing the problem. Sometimes things go wrong with source control and getting the latest version, messing with your build. This also fixes that issue.

Does it fail everywhere?

Sometimes build agents are not installed in exactly the same way, and builds might break on a single or a few build agents only. This is good. Now you know you can always get a build on a good agent, so you aren’t stuck when you want to produce some build artifacts, and you know there’s something on that machine that’s not quite right where it’s failing.
The error will usually give you a clue on what’s missing, but most of the time you’ll only understand why after you resolved the issue. Ah, the joys of software development.
My tip here is to compare the machines and see what build tools are missing or different from the one where it does work. Make sure your SDK and build tool versions are the same, and can be found. Check your shell versions, like PowerShell and if they are running in x32 or x64 mode, and if all PS modules are installed in the mode you’re using.
If you have an automated way to quickly produce a build agent, you might want to ditch the failing one, in favor of a brand new one if your build works there.

Check the logs.

It sounds trivial, but the log files of your build server contain a lot of details you’re not looking at when things are working. Now that things are broken, it’s time to dig in and see if you can find a few clues. Most build systems also have settings to change how detailed those logs are. Setting this to verbose can be what you need to find that final clue to solve the puzzle.

Get into the machine.

When things should work, but appear broken on the build server, you’ll sometimes have to dial into the build agent and see what things look like in there. Paths might not be what is expected, files might actually be missing, and tools might be different. There you can open a shell, check if anything is missing, check if build tools are available, or try and run the same command that’s failing yourself. This doesn’t always work as build systems sometimes wrap your SDK calls, and you’re not always sure you’re running the same statement, but it might give you a hint of what’s going wrong anyway.
Look around on the machine, and you might find the problem. Can’t access the machine? Try the next tip.

Get into the machine, without getting into the machine.

You might not be able to access the build agent itself. Pesky admins, or it might even be a cloud machine you don’t own. Luckily you can edit the build itself right? In most cases, you can run some arbitrary shell commands as a build step for whatever fancy stuff that isn’t covered by default.
Well, nothing is stopping you from running an ls or type nuget.config command to find out what is really happening during your build. The output of those commands will normally appear in your build logs. You might have to check the more detailed full log files, but it should be there. This can be a tedious process, but it saved my butt a few times when things didn’t work for an at the time unknown reason.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.