Wednesday, March 24, 2010

Interesting CCNet behavior

We have a utility that generates our build scripts when it is passed a directory path as an argument.  The utility finds all the .sln files in the directory and its subdirectories, interrogates them to determine the project dependencies, and then generates a build script.  We use this utility in all of our CCNet projects: CCNet calls the utility, and the next CCNet task calls the resultant batch script.

The engineers wanted to use this utility for their daily local builds, so I created a single batch file that calls the utility against a known build location containing all our .sln files and then calls the batch files that were just generated.  Unfortunately, if one of the generated batch files failed, its EXIT statement terminated the entire wrapper script with that exit code, and the engineer’s build could not continue.  Now this is expected behavior, given that none of the generated scripts’ EXIT statements have the /B argument, which returns control to the calling script instead of ending the whole CMD session.  So I added it.
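Here is a minimal sketch of the wrapper idea; the utility name, path, and script names are all hypothetical:

    @echo off
    rem Generate the per-solution build scripts from the known build location.
    rem (GenBuildScripts.exe and the path are made-up names for illustration.)
    GenBuildScripts.exe C:\Builds\Mainline

    rem Call each generated script in turn.  If a generated script ends with a
    rem plain EXIT, the first failure terminates this wrapper too.  If it ends
    rem with EXIT /B instead, control returns here and the next build can run.
    call BuildSolutionA.bat
    if errorlevel 1 echo BuildSolutionA failed, continuing with the next build...

    call BuildSolutionB.bat
    if errorlevel 1 echo BuildSolutionB failed, continuing with the next build...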

Everything seemed to be fine when I ran my first tests.  These were positive tests to make sure existing green CCNet projects would stay green.  But then on my negative tests something unexpected came to light: my red build came back green.  To validate this, I removed the /B from the batch script and ran the test again.  The build turned red.

I am not sure why this is the case.  I looked through the CCNet documentation but could not find the answer.  I am not sure whether this is a CCNet issue or a CMD.exe issue (CCNet uses CMD.exe when shelling out to batch files).

My solution was to add an argument to our script-building utility.  With the new argument present, the utility generates the scripts with the /B on the EXIT command; if the argument is not present, it does not include the /B.  I then added the new argument to the batch script for the engineers, and now we are all happy.
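In other words, the tail of each generated script now looks like one of the following, depending on whether the new argument was passed (a sketch of the two modes):

    rem Generated without the new argument (CCNet mode): a failure terminates
    rem the whole CMD session, and the exit code fails the CCNet task.
    EXIT %ERRORLEVEL%

    rem Generated with the new argument (local-build mode): a failure returns
    rem its exit code to the calling wrapper script, which keeps going.
    EXIT /B %ERRORLEVEL%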

Well, sort of.  I still want to know why CCNet behaves this way, so I created a thread on the ccnet-user Google group discussion board.  I’ll update you on any responses I get.

Till next time…

Thursday, March 18, 2010

New Feature in CCNet 1.5

So I was looking at the CCNet documentation this morning because I needed to refresh my mind regarding some syntax.  As I was searching I found the Parallel Task feature.  This feature allows you to run several tasks at the same time, as the name indicates.  Well, this is all well and good, and I will use the Parallel Task feature.  But what was even more exciting to me (I get excited easily) was the Sequential Task feature.  I know, I know.  You say CCNet has always had sequential tasks; in fact, tasks run sequentially automatically.  Yes, this is true.
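The parallel syntax itself is simple.  Here is a minimal sketch based on my reading of the 1.5 docs; the batch file names are made up:

    <tasks>
      <parallel>
        <tasks>
          <!-- These two builds start at the same time. -->
          <exec>
            <executable>BuildWin32.bat</executable>
          </exec>
          <exec>
            <executable>BuildXbox360.bat</executable>
          </exec>
        </tasks>
      </parallel>
    </tasks>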

But what I could never do is have CCNet move on to the next task if the first task failed.  Now I can.  See, the Sequential Task has an attribute associated with it called continueOnFailure.  This is a boolean attribute that, if set to “true”, allows the tasks inside the sequence to keep running even if a previous one failed.  I love it!

I love it because I am using a CCNet applet tool, which I have written about in a previous post, that allows me to forcebuild a CruiseControl.Net project from the command line.  This comes in very handy.  Well, I use this in all of our builds here at Emergent.  But sometimes, because of the nature of our re-usable config files, I want to force a build that may not exist yet on some build machines, say, if I were just testing it on one machine.  So I put these forcebuild steps inside this sequential task with continueOnFailure set to true, and I am golden.
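Something like this sketch is what I have in mind; the applet name, server address, and project names here are all hypothetical:

    <sequential continueOnFailure="true">
      <tasks>
        <!-- Each step forcebuilds a project that may not exist on this
             machine; with continueOnFailure, a failed step no longer
             stops the steps after it. -->
        <exec>
          <executable>CCNetApplet.exe</executable>
          <buildArgs>forcebuild tcp://buildserver:21234 ProjectA</buildArgs>
        </exec>
        <exec>
          <executable>CCNetApplet.exe</executable>
          <buildArgs>forcebuild tcp://buildserver:21234 ProjectB</buildArgs>
        </exec>
      </tasks>
    </sequential>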

I haven’t put this into practice yet because I just found out about it.  But, I’ll let you know if it doesn’t work.  I’m thinking it will.

Till next time…

The Evolution of a Build System Part 2

In my last post I talked about how Continuous Integration was where I started when faced with rebuilding an entire build process from scratch.  I talked about CruiseControl.Net and some of the challenges I faced at a high level.  Today I am going to talk about taking the Continuous Integration system and moving to a nightly Full Build system.

I began here:

Nightly Build and Test

  • Only built a portion of the code base
  • Built from the tip
  • VC80 and VC90 were built on alternate nights
  • All platforms were built and tested serially
  • Packaging was done as a separate process, only when deemed necessary
  • Close to a 24-hour turnaround

This is where we are now:

  • Builds all solutions and all configurations of the entire mainline
  • Builds from the latest green CI changelist
  • Builds all platforms and compilers in the same night
  • Takes advantage of multi-core hardware and RAID drives by building concurrently where possible
  • Build and integration test time is down to 14 hours total
  • Deployable ISOs are ready for all platforms before 10am daily

I built a couple of tools along the way and made use of my staff (thanks Scott and David) and our IT infrastructure and staff.  Being the IT Manager helped considerably.  The IT staff (thanks Jean and Luke) was extremely helpful in automating the process and continues to play a part. 

Here are some next steps:

  • Automate the deployment to a QA environment
  • Automate tests against the QA deployment
  • Continue to bring down the build time

All of this was made possible because we were able to reproduce the build environment and standardize on a process.

Till next time…

The Evolution of a Build System Part 1

A couple of weeks ago I began setting up a new set of build servers for our west coast office.  I’ve pretty much got it set up so it is almost a push-button operation.  Not quite there with all the DCC tools yet, but almost.  We typically stand up one machine with the OS and run a single installation script, which installs Visual Studio 2005 and 2008 along with their service packs.  It also installs Perforce, CruiseControl.Net and the latest console SDKs automatically.  There are other minor tools the script installs as well.  I began to think about how far we had come.  It wasn’t always like that…
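Conceptually, the script is just a sequence of unattended installs from a network share.  This sketch is illustrative only; the share, installer paths, and silent-install switches are invented, since every real installer has its own:

    @echo off
    rem Build-machine bootstrap (sketch).  Run once on a fresh OS install.
    set INSTALLERS=\\fileserver\buildtools

    rem Hypothetical unattended-install switches, for illustration only.
    "%INSTALLERS%\vs2005\setup.exe" /unattendfile vs2005.ini
    "%INSTALLERS%\vs2008\setup.exe" /unattendfile vs2008.ini
    "%INSTALLERS%\perforce\p4vinst.exe" /s
    msiexec /i "%INSTALLERS%\ccnet\CruiseControl.NET.msi" /quiet
    rem ...service packs, console SDKs, and the minor tools follow the same pattern.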

This is a far cry from where we were when I started working at Emergent about a year and a half ago.  Then there was a Twiki page that listed all of the various items required for each build machine.  Your fingers ached after scrolling through the instructions: add this set of environment variables, make sure this path is added to the system path, install this tool at this path.  It was a very labor-intensive, manual process to set up a build machine.  Since the first thing I was challenged with was to streamline the build process, I had to have a reproducible build environment.

So, a combination of organic build scripts and manually configured machines made for a very unstable, and in my eyes, unreliable build system.  I had to first address the reproducibility of a build machine.  But where to start?  I knew we would be finishing a release soon; at the time I think it was Gamebryo 2.5.  So I would start preparing for the next release cycle.

The build was taking close to 24 hours.  That was just too long.  Don’t get me wrong.  It was doing a lot of stuff.  It was building all console versions (Win32, PS3, Xbox360 and Wii) of the product and every solution configuration for each.  Then it was executing a rigorous automated testing suite.  So I chose to let that be for now.

Continuous Integration was where I started.  I had experience implementing CruiseControl.Net before, so it seemed natural that I gravitated to it.  My budget was not large, so I had to make choices.  I reviewed ElectricCloud and AntHillPro.  Both would have done what I wanted, but where did I want to spend the little money I had?  I decided to forgo the expensive build systems and spend my money elsewhere.

I started small.  I took a piece of the technology, the base framework, and created a CCNet project on my own box just to prove that I could do it with the C++ code.  Then I incorporated each compiler: VS2005 using devenv, VS2008 using MSBuild, VSI from Sony for the PS3, and GCC and make for the Wii builds.  It took a while, but the POC was working in a week.  I had some issues with MSBuild and C++, which I outlined in a previous post, so I decided to move to devenv for all the Win32 builds.  I will spare you further details.
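For reference, the two Win32 invocation styles look roughly like this; the solution name and configuration are placeholders:

    rem VS2005 via devenv: builds one configuration of a solution.
    devenv MySolution.sln /build "Release|Win32"

    rem VS2008 via MSBuild: the style I later abandoned for the C++ builds.
    msbuild MySolution.sln /t:Build /p:Configuration=Release /p:Platform=Win32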

As my team was piecing the CI process and hardware together, we slowly started putting together a script for build machine setup that evolved throughout the process.  I made sure my team knew I wanted as little manual intervention in a new build setup as possible.  We re-factored existing scripts and combined them into a series of standard-named configuration scripts in a well-known location.  The team slowly began to change their way of thinking.  My mantra was: as soon as you have to do something twice, evaluate and automate.  I tried to lead by example.  I engineered a lot of the initial setup and brought the team up to speed.

Continuous Integration as a culture was fairly new to Emergent product engineers.  Yes, they understood the concept and knew it would help them be more efficient.  They understood that bringing feedback closer to the time of development was crucial to limiting the time it took to fix an issue.  They understood all of the reasons.  But it took a while for them to actually catch on to the whole “Red means dead” culture.  Finally, after a few months of CI in full operation, the engineers were on board and even worked with each other to promote the whole methodology.  After all, I believe these engineers to be the smartest people on the planet.

I was, indeed, pleased to see the fruits of my team’s labors playing out and helping the engineering team as a whole be more efficient.

In Part 2 I’ll talk about the transfiguration of a nightly build system “pieced together with duct tape” into a streamlined, reliable, reproducible build system and process.

Till next time…