Thursday, March 18, 2010

The Evolution of a Build System Part 1

A couple of weeks ago I began setting up a new set of build servers for our west coast office.  I’ve pretty much got it set up so that it is almost a push-button operation.  We're not quite there with all the DCC tools yet, but almost.  We typically stand up one machine with the OS and run a single installation script, which installs Visual Studio 2005, Visual Studio 2008 and their service packs.  It also installs Perforce, CruiseControl.Net and the latest console SDKs automatically, along with a few other minor tools.  I began to think about how far we had come. It wasn’t always like that…

This is a far cry from where we were when I started working at Emergent about a year and a half ago. Back then there was a TWiki page that listed all of the various items required for each build machine. Your fingers ached after scrolling through the instructions.  Add this set of environment variables, make sure this path is added to the system path, install this tool at this path.  Setting up a build machine was a very manual, labor-intensive process.  Since the first thing I was challenged with was streamlining the build process, I had to have a reproducible build environment.
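To make the idea concrete, here is a minimal sketch (not the script we actually used) of how a checklist like that wiki page could be turned into something a machine can verify.  The function name `check_build_machine` and every variable name, path and tool location in it are hypothetical, chosen only for illustration.

```python
import os
from pathlib import Path


def check_build_machine(required_env, required_path_dirs, required_files, environ=None):
    """Compare a machine against a checklist; return a list of problems (empty = OK).

    All arguments are hypothetical examples of what a setup checklist might hold:
      required_env       -- dict of environment variable name -> expected value
      required_path_dirs -- directories that must appear on PATH
      required_files     -- tool executables that must exist on disk
    """
    environ = os.environ if environ is None else environ
    problems = []

    # 1. Environment variables must exist and hold the expected value.
    for name, expected in required_env.items():
        actual = environ.get(name)
        if actual != expected:
            problems.append(f"env {name}: expected {expected!r}, found {actual!r}")

    # 2. Required directories must be on PATH (compared after normalization).
    def norm(p):
        return os.path.normcase(os.path.normpath(p))

    on_path = {norm(p) for p in environ.get("PATH", "").split(os.pathsep) if p}
    for directory in required_path_dirs:
        if norm(directory) not in on_path:
            problems.append(f"PATH is missing {directory}")

    # 3. Required tools must be installed at the expected location.
    for tool in required_files:
        if not Path(tool).exists():
            problems.append(f"missing tool: {tool}")

    return problems
```

Running a check like this at the end of an installation script gives a quick yes/no answer on whether a new machine matches the others, instead of someone re-reading the wiki page line by line.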

So, a combination of organically grown build scripts and manually set up machines made for a very unstable and, in my eyes, unreliable build system.  I first had to address the reproducibility of a build machine.  But where to start?  I knew we would be finishing a release soon; I think it was Gamebryo 2.5.  So I would start preparing for the next release cycle.

The build was taking close to 24 hours.  That was just too long.  Don’t get me wrong.  It was doing a lot of stuff.  It was building every platform version of the product (Win32, PS3, Xbox360 and Wii) and every solution configuration for each.  Then it was executing a rigorous automated testing suite.  So I chose to let that be for now.

Continuous Integration was where I started.  I had implemented CruiseControl.Net before, so it was natural that I gravitated toward it.  My budget was not large, so I had to make choices.  I reviewed Electric Cloud and AnthillPro. Both would have done what I wanted, but where did I want to spend the little money I had?  I decided to forgo the expensive build systems and spend my money elsewhere.

I started small. I took a piece of the technology, the base framework, and created a CCNet project on my own box just to prove that I could do it with the C++ code. Then I incorporated each compiler: VS2005 using devenv, VS2008 using MSBuild, VSI from Sony for the PS3, and GCC and make for the Wii builds.  It took a while, but the proof of concept was working within a week.  I had some issues with MSBuild and C++, which I outlined in a previous post, so I decided to move to devenv for all the Win32 builds.  I will spare you further details.

As my team was piecing the CI process and hardware together, we slowly started putting together a build machine setup script that evolved along the way. I made sure my team knew I wanted as little manual intervention in a new build setup as possible.  We refactored the existing scripts and combined them into a series of consistently named configuration scripts in a well-known location.  The team slowly began to change their way of thinking.  My mantra was: as soon as you have to do something twice, evaluate and automate.  I tried to lead by example.  I engineered a lot of the initial setup and brought the team up to speed.
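As a rough illustration of what "consistently named scripts in a well-known location" can buy you, here is a small sketch.  It assumes a convention I am inventing for the example: each script name starts with a two-digit order prefix (such as `10_visual_studio.cmd`), and a `<name>.done` marker file records that a step has already run.  Neither convention nor the function names come from our real setup.

```python
from pathlib import Path


def ordered_setup_scripts(script_dir):
    """Return the setup scripts in script_dir, sorted by file name.

    Assumed (hypothetical) convention: only files whose names begin with
    two digits count as setup scripts, so the prefix fixes the run order.
    """
    scripts = [
        p for p in Path(script_dir).iterdir()
        if p.is_file() and p.name[:2].isdigit()
    ]
    return sorted(scripts, key=lambda p: p.name)


def pending_scripts(script_dir, marker_dir):
    """Return the scripts that still need to run, in order.

    A script counts as finished when marker_dir contains '<script name>.done'
    (again, an assumption made for this sketch).
    """
    marker_dir = Path(marker_dir)
    return [
        script for script in ordered_setup_scripts(script_dir)
        if not (marker_dir / (script.name + ".done")).exists()
    ]
```

With something like this, re-running the setup on a half-configured machine only picks up the steps that have not completed, which is one way to keep manual intervention to a minimum.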

Continuous Integration as a culture was fairly new to Emergent's product engineers.  Yes, they understood the concept and knew it would help them be more efficient.  They understood that bringing feedback closer to the time of development was crucial to limiting the time it took to fix an issue.  They understood all of the reasons. But it took a while for them to actually catch on to the whole “Red means dead” culture.  Finally, after a few months of CI in full operation, the engineers were on board and even worked with each other to promote the whole methodology. After all, I believe these engineers to be the smartest people on the planet.

I was, indeed, pleased to see the fruits of my team’s labors playing out and helping the engineering team as a whole be more efficient.

In Part 2 I’ll talk about the transformation of a nightly build system “pieced together with duct tape” into a streamlined, reliable, reproducible build system and process.

Till next time…