Mark Gilbert's Blog

Science and technology, served light and fluffy.

Web Service Testing – Part 3 of 4

In part 1 of this series, I described how I structured the test suite for a web service that I was building.  In part 2, I described an interesting performance issue that cropped up, and how I worked around it.  In this post, I’ll talk about the value that the test suite brought to the project.

When I started writing the test suite, I intentionally designed it so that it could be run against the development, staging, pre-production, and production web services.  I wanted some level of confidence that everything that should have been deployed was actually getting deployed, and deployed correctly.  The test fixtures were all driven by a single setting in the app.config that contained the URL of the web service to run against (this was actually a built-in option for web references in Visual Studio: making the URL “dynamic” instead of static).

All of this meant that regression testing was ridiculously easy: push the code to a new server (or push an update to an existing one), change the value in the app.config, and run the test suite.  I actually maintained all five URLs in the app.config (the four above, plus my local development copy) in comments; when it came time to change which copy of the service the test suite ran against, I just copied the correct URL out of the comments and pasted it over the “real” line in the config file.  Over the course of the project, I probably ran the full test suite 50 times.  Because it was automated, it ran the same way every time, and it ran unattended.
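For illustration, here is a rough sketch of what that app.config section might have looked like.  The setting name and URLs are hypothetical stand-ins, not the project’s actual values (Visual Studio generates the appSettings key for a “dynamic” web reference from the project and web reference names):

<configuration>
  <appSettings>
    <!-- Active copy of the service.  To retarget the suite, paste one of the
         commented URLs below over this line.  (All names/URLs hypothetical.) -->
    <add key="TestSuite.ProductService" value="http://localhost/ProductService/Service.asmx" />
    <!-- <add key="TestSuite.ProductService" value="http://dev.example.com/ProductService/Service.asmx" /> -->
    <!-- <add key="TestSuite.ProductService" value="http://staging.example.com/ProductService/Service.asmx" /> -->
    <!-- <add key="TestSuite.ProductService" value="http://preprod.example.com/ProductService/Service.asmx" /> -->
    <!-- <add key="TestSuite.ProductService" value="http://www.example.com/ProductService/Service.asmx" /> -->
  </appSettings>
</configuration>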

Whatever time I spent building the test suite was paid back many times over.  When it came time to move the code to pre-production for the first time, the suite proved itself once again, and in a very interesting way.

Development and staging were environments that my company managed, while pre-production and production were managed by the client (we had no direct access to the web or database servers there).  To prevent problems with deployment, we had tried to mirror our development/staging environments as closely as possible to the client’s, including operating system, database, and application patch levels.

As you may recall from the first post in this series, each test in the suite ran two searches against the web service: “include by X” and “exclude by X”.  While the first might legitimately return no records, the second should always return at least one (there wasn’t a single filter option that would exclude every possible item).  When I ran the test suite against the pre-production web service for the first time, a specific set of tests failed.  In particular, the “exclude” tests were returning 0 records.  That should never happen, not with how the tests and the web service were constructed.  At the very least, it was a problem that had never occurred on our servers.

After a day or so of tinkering (mostly to make sure that the problem wasn’t in my test suite, or in the web service code itself), I contacted the client and asked them to run a piece of SQL against the pre-production database.  The SQL snippet was the core of the “exclude by” search, and sure enough, my contact said the query returned 0 results.  I asked him to strip out the entire WHERE clause, and the query dutifully returned the entire data set.

At this point, we organized a conference call (he brought in another developer on his side; I roped in my project manager, who happened to have been an Oracle DBA in a past life, as well as another of our developers who had more Oracle experience than I did), and we started picking the SQL apart to narrow down the problem.  As it turned out, there was one particular primary key field that, when we included it in a sub-select and ran a “not in” comparison against it, caused the query to return no records.  There didn’t seem to be any good reason why it would fail, but it was failing nonetheless.  We were all stumped.

In the end we found a way to reorganize the query to work around the problem.  This was a rather obscure problem with Oracle, and even though our respective environments (my company’s and the client’s) were close, they apparently weren’t identical.  I doubt that we would have found this problem without the aid of the test suite; we definitely wouldn’t have found it as quickly.  Chalk up another win for automation.
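To make the shape of the problem concrete, here is a rough sketch of the two query forms.  The table and column names are hypothetical, and the NOT IN/NOT EXISTS rework shown is a general pattern, not the project’s actual SQL:

-- The failing shape: a "not in" against a sub-select on the primary key
-- field returned no rows on the client's server (hypothetical names).
SELECT *
  FROM PRODUCTS P
 WHERE P.PRODUCT_ID NOT IN (SELECT E.PRODUCT_ID
                              FROM EXCLUDED_PRODUCTS E);

-- One common way to reorganize such a query is with NOT EXISTS, which
-- avoids the NOT IN sub-select entirely.
SELECT *
  FROM PRODUCTS P
 WHERE NOT EXISTS (SELECT 1
                     FROM EXCLUDED_PRODUCTS E
                    WHERE E.PRODUCT_ID = P.PRODUCT_ID);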

In the final post I’ll address an inherent problem with the test suite, and how we decided to shore it up.

July 27, 2008 Posted by | Agile, Visual Studio/.NET | Comments Off on Web Service Testing – Part 3 of 4

Web Service Testing – Part 2 of 4

In Part 1 of this series, I mentioned that each test hits the web service twice – once to do an “include by X” search, and a second time to do an “exclude by X” search, where X was some search criterion.  This uncovered an interesting performance issue, which I’ll explain in this post.

Some days the tests seemed to run slowly, each requiring 5-10 seconds to complete.  Other days, the tests ran downright glacially, requiring 25-30 seconds each.  When I started running the tests against the pre-production and production servers, they started timing out, and I began to get concerned about performance.  If my unit tests were causing the service to time out, what would a production web site consuming the service do?  I had to look into it.

Fortunately, much of the structure that I had built so far lent itself rather well to caching.  It was also fortunate that I knew exactly where the problem was, so it would be a relatively simple matter to implement caching there.  I happily plunked away for a couple of hours implementing my idea.  I re-ran my full suite.

No improvement.

I ran it again.  Perhaps the DLLs were being compiled, so the first hits weren’t representative.

No improvement.  I was still seeing the tests require 5-10 seconds each to complete.

Ok, so perhaps I DIDN’T really know where the problem was.

So, the next step was to do what I should have done as step 1: collect some hard data.  I had written a class a couple of years ago to dump out messages to a log file, and record the time it took to get there.  The class was called “MyStopWatch”, and I originally wrote it to wrap the System.Diagnostics.Stopwatch class. However, that class was introduced in .NET 2.0, and the project I was working on was targeting the 1.1 framework, so I had to modify MyStopWatch to simply count clock ticks:

Imports System.IO

Public Class MyStopWatch
    Private _LogFileWriter As StreamWriter
    Private _TimeOfLastCheckPoint As Long
    Private _IsEnabled As Boolean

    Public Sub New(ByVal FunctionBeingMeasured As String, ByVal IsEnabled As Boolean)
        Dim LogFileName, LogFilePath As String

        Me._IsEnabled = IsEnabled
        If (Not Me._IsEnabled) Then Return

        ' Name the log file after the function being measured, plus a
        ' date/time stamp so repeated runs don't overwrite each other.
        LogFileName = FunctionBeingMeasured & "_" & Now.ToShortDateString.Replace("/", "") & "_" & Now.Hour & Now.Minute & Now.Second & ".log"
        LogFilePath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), LogFileName)

        Me._LogFileWriter = New StreamWriter(LogFilePath, False)
        Me._LogFileWriter.AutoFlush = True

        Me._TimeOfLastCheckPoint = DateTime.Now.Ticks
        Me._LogFileWriter.WriteLine("L = Time since last checkpoint (milliseconds)")
        Me._LogFileWriter.WriteLine("---------------------------------------------")
    End Sub

    Public Sub RecordCheckpoint(ByVal CheckPointDescription As String)
        Dim CurrentTickMark, ElapsedTicks As Long

        If (Not Me._IsEnabled) Then Return

        ' Compute the time elapsed since the previous checkpoint, in ticks,
        ' and convert it to milliseconds for the log entry.
        CurrentTickMark = DateTime.Now.Ticks
        ElapsedTicks = CurrentTickMark - Me._TimeOfLastCheckPoint
        Dim ElapsedSpan As New TimeSpan(ElapsedTicks)

        Me._LogFileWriter.WriteLine("(L: " & ElapsedSpan.TotalMilliseconds & ") " & CheckPointDescription)

        Me._TimeOfLastCheckPoint = CurrentTickMark
    End Sub

    Protected Overrides Sub Finalize()
        Try
            If (Me._IsEnabled AndAlso Not IsNothing(Me._LogFileWriter)) Then Me._LogFileWriter.Close()
        Catch ex As Exception
            ' Ignore these; it means that the log file was already closed
        Finally
            MyBase.Finalize()
        End Try
    End Sub
End Class

The constructor sets up the log files to dump out to the desktop so I could find them easily.  Once I included the class in my project, I could use it as follows:

{ some code here }

Dim X As MyStopWatch

X = New MyStopWatch("MySlowFunction", True)

{ some code here }

X.RecordCheckpoint("First loop complete")

{ some code here }

X.RecordCheckpoint("Business object instantiated; data loaded")

{ some code here }

X.RecordCheckpoint("Web service call complete")

{ some code here }

X = Nothing

After a few runs, and some tweaking to the placement of the RecordCheckpoint() calls, I was able to narrow down the real problem.  The timing tests showed that the slowness was with the stored procedure call itself.

On a hunch, I decided to try varying which fields the database returned with each call.  The stored procedure accepted a delimited list of fields to include and would build the field list dynamically, so it was relatively easy to change what was being returned.  Through trial and error I found a small handful of fields that, when included in the result set, would cause the stored procedure to slow down by one or more orders of magnitude.  Those fields all happened to be Oracle CLOB (character large object) fields.  As long as none of them was included in the result set, the stored procedure ran very quickly (a second or less).  Each CLOB field included would add time, even when I included multiple CLOBs from the same table.

Asking the client to modify their table structure to not use CLOBs for these fields was out of the question.  The best we would be able to do for the production web service would be to urge people to filter the list heavily, which would mean fewer CLOBs for the server to pull together, and quicker overall response times.  That advice, however, wouldn’t work for my test suites, which relied on bringing back most or all of the records in each test.

I pondered the problem for a while and it finally dawned on me: I was only counting records to make sure that the “include by X” and “exclude by X” counts added up to the total number of records – I wasn’t really interested in the data being brought back.  So, why couldn’t I modify the calls to bring back only an ID field, and none of the CLOB fields?  I quickly implemented that and reran the suite.  The tests now ran in a second or less each.  I could run through the hundred or so tests that hit the actual web service in a couple of minutes instead of an hour or more.  Not bad for a morning’s work.
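A rough sketch of the change (the proxy class, method, and filter syntax here are hypothetical; the real service and its field-list format were project-specific):

' Hypothetical names throughout; the real proxy class, method signature,
' and field-list format were specific to the project.
Dim Service As New ProductSearchService
Dim IncludeResults, ExcludeResults As DataSet

' Request only the ID field.  Leaving the CLOB fields out of the delimited
' field list keeps the stored procedure fast, and the tests only ever
' needed record counts, never the field data itself.
IncludeResults = Service.Search("ID", "include:ProductName=cheese")
ExcludeResults = Service.Search("ID", "exclude:ProductName=cheese")

Dim IncludeCount As Integer = IncludeResults.Tables(0).Rows.Count
Dim ExcludeCount As Integer = ExcludeResults.Tables(0).Rows.Count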

While learning about the performance hit with the CLOBs was interesting, the real lesson for me was not to assume I knew where the performance issue was.  Always get some hard numbers on the parts of the code that seem to be performing poorly; otherwise, you’ll waste time optimizing where it isn’t needed.

In the next installment of this series, I’ll be examining the value that the test suite brought to the project.

July 20, 2008 Posted by | Agile, Visual Studio/.NET | 1 Comment

Web Service Testing – Part 1 of 4

One of my projects currently involves writing web services that expose a key database for our client.  One of my goals as the lead developer was to create a test suite that could be run against the development, staging, pre-production, and production versions of these services.  I wanted this suite to exercise a large portion of the web service code, but I was constrained by the fact that I wouldn’t have access to the pre-production or production databases to write test data into them (so that my tests could expect that data to come back from the service).  Even if I had that level of access, I would be very leery about using it, especially with the production database.  I had to find a different way to test the web services.

In this blog post and the three subsequent ones, I’ll explain the testing solution that I eventually settled on, an interesting performance issue that we stumbled upon using this test suite, and the value that the test suite has brought to the project.  I’ve used VB.NET and NUnit for all of this testing and my nomenclature in these posts reflects those implementation decisions.

Each of the two web services has a main search feature that allows the client code to specify one or more filters to include or exclude records by.  This allows the client code to define quite specifically what data the service should return.  Because this is the most complex of the web service methods, and because it is the one expected to get the most use in production, the lion’s share of the tests focus on this search and its components.

As I mentioned in the introduction, I couldn’t write tests that would insert a record and then check that that record was returned by the service.  After several iterations of tests, I eventually settled on counting the number of records returned by a given “include by” search and counting the number returned by the equivalent “exclude by” search.  Adding those two numbers together should yield the total number of records that could possibly be returned by the web service.  For example, “include all records where product name contains ‘cheese’” and “exclude all records where product name contains ‘cheese’” are perfect inverses of each other, so their counts should sum to the total number of records.

This approach has two main advantages.  First, I don’t need to know ahead of time how many records the include by or exclude by searches should each return.  I also don’t need to know how many records are in the database ahead of time (or worse, have to assume some number of records to be there).  The latter is particularly important when you consider that the four databases these services expose (development, staging, pre-production, and production) will likely contain different data, so a generic solution that didn’t have to be modified for each environment was ideal.

Second, doing both include by and exclude by searches in each test exercises a very large percentage of the code base (one of my original design goals for the test suite).

This approach does have the disadvantage of making the include by searches and the exclude by searches dependent on one another.  It is entirely possible that there is a bug in the logic somewhere that affects both, but whose effects cancel out when I add the record counts together.  We’ve tried to address this possibility outside of the NUnit tests, a solution that will be described in the final post in this series.

To implement this testing solution, then:

  1. The TestFixtureSetUp method makes a web service request that asks for every record – no filters are applied at all.  The test fixture saves this record count off to a Protected variable.
  2. Each test hits the web service twice: once to run the “include by X” search and a second time to run the “exclude by X” search.  In both cases, the record counts are saved to Protected variables.
  3. At the end of each test, a standard pair of assertions runs.  First, either the include by count or the exclude by count has to be greater than 0.  It is assumed that at least one record in the database is returnable by the service, so one of the two web service requests should return at least 1 record; if neither does, something is assumed to be wrong (such as the web service being unreachable, timing out, etc.).  Second, the two individual record counts are added together and compared to the total found by the TestFixtureSetUp method.  If the sum and the total don’t match, the assertion fails.

After writing several of these, I was able to factor a lot of the basic logic out into a “test suite base class” that the other test fixture classes inherit from.  That is why several of the variables are declared as Protected (rather than Private, for instance).  It also means that the test fixtures themselves contain very little code, making it quick and easy to write new tests.
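To give a feel for the pattern, here is a stripped-down sketch of what such a base class and a derived fixture might look like under NUnit 2.x.  The service proxy class, its Search method, and the filter syntax are hypothetical stand-ins, not the project’s actual code:

Imports NUnit.Framework

Public MustInherit Class SearchTestBase
    ' Shared by every derived fixture; Protected so the fixtures can use them.
    Protected _TotalRecordCount As Integer
    Protected _IncludeCount As Integer
    Protected _ExcludeCount As Integer
    Protected _Service As ProductSearchService   ' Hypothetical proxy class

    <TestFixtureSetUp()> _
    Public Sub FixtureSetUp()
        ' Ask for every record - no filters at all - and save off the total.
        Me._Service = New ProductSearchService
        Me._TotalRecordCount = Me._Service.Search("ID", "").Tables(0).Rows.Count
    End Sub

    ' Each test calls this after running its include/exclude pair of searches.
    Protected Sub AssertCountsAreConsistent()
        ' At least one of the two searches must return records.
        Assert.IsTrue(Me._IncludeCount > 0 OrElse Me._ExcludeCount > 0, _
                      "Neither search returned records; is the service reachable?")
        ' The two searches are inverses, so their counts must sum to the total.
        Assert.AreEqual(Me._TotalRecordCount, Me._IncludeCount + Me._ExcludeCount)
    End Sub
End Class

<TestFixture()> _
Public Class ProductNameSearchTests
    Inherits SearchTestBase

    <Test()> _
    Public Sub FilterByProductName()
        Me._IncludeCount = Me._Service.Search("ID", "include:ProductName=cheese").Tables(0).Rows.Count
        Me._ExcludeCount = Me._Service.Search("ID", "exclude:ProductName=cheese").Tables(0).Rows.Count
        Me.AssertCountsAreConsistent()
    End Sub
End Class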

In the next post, I’ll discuss an interesting performance issue that the test suites uncovered, and how I got around it.

July 7, 2008 Posted by | Agile | 3 Comments

NAntRunner 0.2 Released

I published an update to the days-old NAntRunner project on CodePlex.  I had barely begun to use version 0.1 myself, and had just shown someone else how to use it, when we discovered how much it was really lacking.  Version 0.2 should address a lot of those shortcomings:

[Screenshot: NAntRunner 0.2]

Natively, version 0.2 supports selecting a framework to target (pulled from the NAnt.exe.config file) as well as the build target (pulled from the script you select).  In addition, you can pass any other NAnt arguments you like using the textbox at the bottom.  The complete command executed is shown at the top of the results box, which has been moved to the right side of the screen.
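For a sense of how that target discovery can work (this is a sketch of the general approach, not necessarily NAntRunner’s actual implementation), a NAnt script is just XML, and its build targets are the name attributes of its <target> elements:

Imports System.Xml
Imports System.Collections.Generic

' Sketch: list the build targets defined in a NAnt script by reading the
' name attribute of each <target> element in the build file.
Public Function GetBuildTargets(ByVal BuildFilePath As String) As List(Of String)
    Dim Targets As New List(Of String)
    Dim Doc As New XmlDocument

    Doc.Load(BuildFilePath)
    For Each TargetNode As XmlNode In Doc.SelectNodes("//target[@name]")
        Targets.Add(TargetNode.Attributes("name").Value)
    Next

    Return Targets
End Function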

Along with the updated utility, I’ve released the NUnit test suite that I’ve been compiling for it.  I think it is a good start, but it is not complete.  In fact, if you scan through the NAntRunner source code you’ll see many TODO notes describing tests or additional validation checks that are still needed.

Let me know what you think of the update, and what else you’d like to see in future versions.

May 26, 2008 Posted by | Agile, Tools and Toys | Comments Off on NAntRunner 0.2 Released

NAntRunner

I’ve added a new tool to my growing collection: NAntRunner, a lightweight, standalone, GUI interface for NAnt:

[Screenshot: NAntRunner]

You can browse to multiple NAnt scripts, and they will be saved (so they will appear the next time you start NAntRunner).  Selecting one from the list and clicking Run will do just that, displaying the output in the text box on the lower half of the screen.

Why did I want a GUI interface for NAnt?  Frankly, I got tired of creating Windows Explorer shortcuts that would run a script via the NAnt executable (though even that practice saved me from having to type the command “nant.exe -buildfile:mybuildscript.build” over and over).  Additionally, I currently have half a dozen build scripts that I use, and I’m just getting started.  Having a tool that could roll up and manage many scripts was an attractive feature.
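At its core, a GUI wrapper like this just shells out to NAnt and captures the console output.  A minimal sketch of that piece (my illustration of the general approach, not NAntRunner’s actual source):

Imports System.Diagnostics

' Sketch: run NAnt against a build script and return its console output.
Public Function RunNAnt(ByVal NAntExePath As String, ByVal BuildFilePath As String) As String
    Dim Info As New ProcessStartInfo(NAntExePath, "-buildfile:""" & BuildFilePath & """")

    Info.UseShellExecute = False          ' Required to redirect output
    Info.RedirectStandardOutput = True    ' Capture the output for the UI
    Info.CreateNoWindow = True

    Using NAntProcess As Process = Process.Start(Info)
        Dim Output As String = NAntProcess.StandardOutput.ReadToEnd()
        NAntProcess.WaitForExit()
        Return Output
    End Using
End Function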

Before I built this, though, I looked around to see if someone had beaten me to the punch.  I found a couple of plug-ins for Visual Studio (Jay Flowers’ Studio hack and the Sharp Builder toolset) which would do basically what NAntRunner does, except from within the IDE.  Why did I want a standalone interface, separate from Visual Studio?  I am in the process of moving my company to automated builds with CruiseControl, and NAnt is going to be a key piece of the overall solution.  Several of our projects are not .NET, and therefore not in Visual Studio, so an IDE plug-in just wouldn’t be useful or appropriate across the board.

Additionally, I wanted something that I could use to quickly test a new build script, to see if it was working the way I intended it to.  Having to fire up VS just to do this would be a hassle.  For those developers in the office that don’t have VS, it wouldn’t be possible.

The first official release, 0.1, allows you to select the NAnt executable on your machine, as well as one or more build scripts.  The only command line option for NAnt that NAntRunner supports at this point is “buildfile” – just enough to specify the script itself.  I may eventually get around to adding support for the other command line options.

The NAntRunner page on my blog has both the executable (single .exe file) and full source code for the application.  To run NAntRunner you’ll need the .NET 3.5 Framework and NAnt (of course).

Try it out and let me know what you think.  I’d also be very interested to hear what features you’d like to see in it next.

May 20, 2008 Posted by | Agile, Tools and Toys | Comments Off on NAntRunner

"Agile Software Development and UI Design: a Match made in Heaven?"

This past week the XP West Michigan users group held their monthly meeting, and brought in Professor Robert Biddle of Carleton University in Ottawa, Canada.  Professor Biddle spoke about the research that his group has been doing on agile teams around the world.  The talk was fascinating on many levels, and Professor Biddle made a number of very interesting points.  Here is a smattering (paraphrased):

A tremendous amount of energy has been poured into developing agile teams, and describing how to be a better or “more agile” team.  However, what can be said about how to be a good customer interfacing with an agile team?

Professor Biddle made the observation that agile teams frequently find themselves hating the process after about three months.  Why?  The customer nods and goes along with the development team for that amount of time, making light comments about the direction of the project and hoping that the team eventually gets on the right course.  The three-month mark typically is the breaking (snapping?) point for the customer, and they finally let their frustrations out.  To avoid this, a good customer needs to have the courage to say things like “no, this isn’t what we need” or “no, we’re not quite there yet – here’s where we’re still falling short of our goals”, and they need to say these things as soon as they realize them – not three months later.  For most human beings, this is a very hard thing to do.

Tell your stories.  Let everyone know where you’re coming from.

“User stories” aren’t supposed to simply be a list of requirements for the new system.  They are meant to be stories – tales that involve real people facing real problems in the real world.  One of my favorite books is “The Soul of a New Machine” by Tracy Kidder in which he tells the story of a company (Data General) working to bring a new computer to market.  The story he tells is riveting because Kidder gets you involved in the people.  He explores their past, their motivations, the circumstances that brought them together as a team, and their challenges to get a completely new machine out the door at a breakneck pace.

Granted, we aren’t all Pulitzer prize-winning authors, but the stories that we collect for a new piece of software have to be more than simply “screen 42 will collect the order shipping information”.  They need to involve the people that will be using it, their motivations, their goals, etc.  Otherwise, we’ll miss valuable information about who the system is being built for, and the other team members won’t understand where we’re coming from.

Prototype the system using paper instead of digital tools.

Professor Biddle described teams that would mock up screens using sticky notes and pens, and ask people to “interact” with the system.  The system is low-tech, inexpensive, very easy to change, and only requires a few seconds for the users to suspend their disbelief.  What’s more – and this was a point that struck many in the room, including myself, as profound – no one will mistake sticky notes for a nearly finished system.  The same can’t be said for a prototype built with a RAD suite.

Computer Science (CS) students rarely learn UI design.  Human-computer interaction (HCI) students rarely learn how to program.  This is bad.

Professor Biddle mentioned that at Carleton University the HCI curriculum is part of the Psychology department – not the Computer Science department.  His position at the university happens to be co-listed between the two departments, so he is able to pull in researchers from both.  He made the point that not everyone needs to be an expert in both disciplines, but everyone does need to be aware of what is going on around them.

The User is not always king.

I think this was my favorite example of the evening.  He alluded to the attitude on several agile teams that the user is the ultimate driving force for the direction of the development effort, and that their word is final.  After all, the software is being built for them, right?  His point was that the user is necessary and critical, but not sufficient, for a good piece of software, and he illustrated it with the example of an ATM designed purely by a user:

  1. The user walks up to the machine.
  2. The machine dispenses money.
  3. Preferably not the user’s money.

Perhaps we should consider other points of view?

All in all it was a terrific presentation.

March 29, 2008 Posted by | Agile | Comments Off on "Agile Software Development and UI Design: a Match made in Heaven?"

"Testing after Unit Tests"

In one of Scott Hanselman’s more recent installments of Hanselminutes, he interviews Quetzal Bradley of Microsoft.  The interview, titled “Testing after Unit Tests“, focuses on the epistemology of unit testing and code coverage – what knowledge do we really gain by these practices and metrics?

Bradley makes several very interesting observations during the session, but one of the most enlightening was about code coverage.  He says that people like to think of “code coverage” as a positive metric.  For example, “my unit tests cover 60% of my code”.  The real value in that number is not the 60% of the code base that is covered, but rather the 40% that isn’t.  In other words, code coverage as a metric does a good job of telling you what you have not yet tested.

Bradley also makes the point that code coverage doesn’t guarantee that the code that is covered by tests is bug-free (he even gives the example of one of his past projects where the client found a problem in a section of code that was covered by one or more tests).  You can have several tests covering the same function, and therefore may achieve 100% code coverage for that function, and still not have all of the possible scenarios for that function covered.  Bradley describes a metric called “state coverage” that measures the number of scenarios, or states, that the tests cover.  Ideally, he says, we would aim for 100% state coverage in our tests, but usually that’s not realistic given time and budget constraints, so we need to prioritize our testing.
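A contrived example of the distinction (my illustration, not one from the interview): the single test below executes every line of the function, yielding 100% code coverage, yet it never exercises the Quantity = 0 state that makes the function throw:

' Every line of this function is covered by the test below...
Public Function AverageCost(ByVal TotalCost As Decimal, ByVal Quantity As Integer) As Decimal
    Return TotalCost / Quantity
End Function

<Test()> _
Public Sub AverageCost_SplitsTotalEvenly()
    ' 100% code coverage for AverageCost, but the Quantity = 0 state
    ' (which throws a DivideByZeroException) is never tested.
    Assert.AreEqual(5D, AverageCost(10D, 2))
End Sub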

All in all, it was a very valuable half hour.

March 23, 2008 Posted by | Agile | Comments Off on "Testing after Unit Tests"

Most Difficult

One of the companies that I applied to asked prospective employees to answer one of three questions, and if the team thought your answer was particularly insightful, you’d move on to the next step.  What attracted me to this company, Atomic Object in Grand Rapids, Michigan, was their fervent devotion to agile methodologies.  I have studied, read about, and played with agile techniques such as test-driven development and pair programming for several years, but I haven’t had many opportunities to put those techniques into full practice on a large (multi-month) development project.  Atomic Object does it daily, and that really interested me.

However, in all of my reading, I thought there was still an aspect of developing software that was really hard and didn’t seem to be addressed well.  As it turns out, one of the three questions posed by Atomic was this: “What is the most difficult aspect of writing software?  What is the best way to address this reality of software development?”  That is the one I chose to answer.

The vast majority of my software development experience has been in the role of a consultant.  Over the 12 years of filling that role, I’ve lost track of the number of times I’ve had my clients ask me some variation of The Question: “Can you solve my problem for X dollars and in Y time?”  Answering this question has become the single most difficult aspect of writing software.

This is an important question for the client because they don’t want to spend X dollars and Y time, and then not get their problem solved.  A development team can always answer The Question very precisely, but only after they’ve tried to solve that problem for X dollars and Y time.  However, how can they answer it before any real work has been done towards solving the problem?  Since divination was not an approved elective in my Computer Science program at WMU, I’ve been forced to find other means to answer The Question.

For the first few years of my professional career, I always thought about the problem for a few minutes, and then answered “Yes”.  After getting burned on several projects – either as the result of an unhappy client, a project budget firmly lodged in the red, or both – it began to dawn on me that perhaps I was missing an important, unspoken clause in The Question:  “and not go out of business doing it.”  Losing clients and losing money are two sure-fire ways to go out of business.

So, for the few years after that, I started doing pre-project consultations that would try to gather all of the requirements for the new system.  My thought was that if I could list up front everything that the client wanted, I could provide a better (albeit still imperfect) answer to The Question.  Perhaps then, the client would be less unhappy and my project budget less red.  While there was some benefit for both me and the client in this process, there were two main difficulties with it.

First, since the client usually had a fixed budget and/or timeframe in which to solve their problem, the dollars and days for the pre-project were inevitably included in the overall budget.  This meant that any effort we invested in figuring out the details of the problem only took away from the resources available to actually solve it.

Second, even if we got through to the end of the “requirements” project with flying colors, and thought we could do the real project in the remaining time, those requirements would end up changing mid-stream.  It’s not that the client intentionally set out to rock the boat, or throw a wrench in the works, but stuff happens:

  • A key employee was left out of the initial discussions, and when they got wind of the project, they unearthed critical problems in the proposed approach.
  • Something we thought would be easy from a technology standpoint turned out to be very difficult, forcing us to change our approach.
  • The client put development on hold while they went through an employee restructuring, ending with the project on the plate of someone who had completely new ideas about where the software should go.

The list of possible scenarios that can cause change in a project goes on and on.

Over the last few years, then, I’ve looked for ways to address these “change” issues.  Agile methodologies seem better suited than most to handle change, and in fact expect change to occur during a project.  I especially love the process of working with the client to prioritize the tasks, working on them in priority order to get the highest-value ones done first, and working in fixed-length time boxes to show regular progress.

Unfortunately, The Question can be brought to bear on our priority lists and our regular releases with only a slight modification: “Can you accomplish these first W tasks for X dollars and Y iterations, and not go out of business doing it?”  When change occurs, we can re-estimate, but there’s always a risk that the change will push a “critical” task out of reach of the previously approved dollars and timeframe, thus leading to an unhappy client, or a reddening budget.  Agile methodologies are great for providing visibility and feedback on progress, but they still don’t appear to be any better suited to answering The Question up front, before any real work has been done.

So, what’s the best way to address this?  Learn how to divine the future, obviously.  Short of that, I’ve begun to wonder if I’m letting the client ask a question that simply cannot be answered.  Perhaps what I need to do as their consultant, when they ask The Question, is educate them about the futility of asking it, and then give them a better, more meaningful question to ask.  Whether this approach will work, and what The More Meaningful Question looks like, remains to be seen.

November 15, 2007 Posted by | Agile | 1 Comment

One-Three Back Solitaire Republished

I finally got around to republishing the source code and instructions for a simple VB.NET game that I wrote years ago.  This application implements a version of solitaire that I call “1-3 Back”.  I’ve set up a page for this application here, but it is also available as one of the tabs above.  Enjoy!

September 23, 2007 Posted by | Agile, Visual Studio/.NET | Comments Off on One-Three Back Solitaire Republished

The Agile Path: Agile as a Journey (Part 2 of 2)

“The Agile Path” is a semi-regular update on the “agile” state at BlueGranite – what we’ve done, lessons we’ve learned, and where we’re going.

In the first installment of “The Agile Path: Agile as a Journey”, I alluded to several reasons why there was a 3-year gap between my first exposure to agile methodologies and actually doing something with them on a “real” project.  The short answer was developer inertia (only some of which belonged to those around me).

At the time (2002), BlueGranite was very much a waterfall development shop.  I knew I needed the commitment of the full team (other developers, project managers, and the executive team) to really make this work.  Looking back at those 2½ years, I now realize that I wanted “team commitment” to mean more than simply “doing it because Mark-the-tech-lead mandated it on this project”.  I wanted people to see the value in it, and to be active participants in helping to incorporate the techniques into our daily development lives – not just adhere to yet another standard.  I still believe that.

While the search for willing adopters was probably the single biggest thing keeping this process moving at less than ludicrous speed, there were other factors:

  1. I was learning this stuff from scratch.
  2. Researching these new techniques WASN’T my full-time job (or even my part-time job), so time spent reading and tinkering had to be worked in around my day-to-day work.
  3. I was having problems finding a good way to introduce the techniques into the company.

Of these three, the last one was really the primary driver.  If I had recognized earlier how to apply some of the techniques I was reading about on a real-life project, I would have carved out more time for the other two, or even baked it into the project itself as “startup costs”.

As a result, the Phase 1 project that I started in early 2005 (please see my first installment of “The Agile Path: Agile as a Journey”) seemed like as good a place to start as I was ever going to get.  It was a relatively small project (originally planned for 3 months), it had a small development team (two full-time developers) of which I was a major part, and it was a greenfield development project.  The project was small enough that I could realistically touch every part of it, but large enough to give agile a good workout.  Even given all of those advantages, I still opted to start very conservatively, and only suggested changing when we posted builds for QA to review (instead of one or two builds at the end of a milestone, we would post nightly builds to QA).

Sometime during Phase 1, it occurred to me that I could introduce a new technique (or a small number of techniques) into every new development project I was assigned to.  That realization led to some actionable items:

  1. I could look at the techniques that I hadn’t tried in a new light – what would the next most logical step be, now that I have techniques 1, 2, and 3 under my belt?
  2. If done well, the perceived risks in trying technique 4 would be mitigated by the fact that techniques 1, 2, and 3 had delivered a lot of value to the project.  Past performance might not be an indicator of future results, but I believe it does lend some credibility to at least trying the new technique out.  If a new technique doesn’t work, we can always change course – that’s agile, right?

So with those basic thoughts in mind, I planned Phase 2 to have a more agile structure: 2-week iterations, each consisting of talking about functionality at the beginning, building and testing it in the middle, and deploying it at the end.  Internally, I also made use of a project tack board where the functionality planned for a given iteration was shown on user story cards.  The board itself was segregated into several areas (“This Week”, “This Release”, “DONE”, etc.), and we would move items around the board as things got done or priorities changed.  It was really nice to be able to get the team together every so often, look at the entire iteration at a glance, and evaluate what needed to be done next.

I say “cards”, but they were really pieces of paper.  I didn’t like the idea of keeping the plan for the current (or future) iterations ONLY on pieces of paper (hardcopy = not backed up at night), so I created an Excel spreadsheet to store the information, and a Word document that I could merge the Excel data into to create the small user stories.  Then I would just print out the Word file, cut the stories apart, and mount them.  I really only needed to do this once per iteration (at the beginning, once we had settled on what we were doing), so the time involved was minimal.  The warm fuzzies I got from knowing that the Excel spreadsheet with all of our plans and past performance was being safely backed up every night more than made up for the time spent.

Phase 2 officially came to a close at the end of the first quarter in 2006, and since then I haven’t been involved in any major new development projects.  I am on the verge of starting one right now, so I will be able to pick up some speed in my journey, and try some new things.  I expect to have an edition of “The Agile Path” that describes this new project of mine sometime towards the middle of 2007.

Activities, not necessarily methodologies

Something else that has occurred to me in the last few months is that I’m not necessarily going after a specific methodology like XP, Scrum, or Crystal.  I’m looking at the specific practices that agile methodologies pull together, seeing how they work, how they bolster each other, and which ones would be appropriate to include in our own efforts.  Along the way I fully expect to find some that just don’t work here for one reason or another.  For example, I have read a lot about the benefits and downfalls of pair programming.  Of all of the agile techniques that I’ve read about, I don’t think any gets more attention inside and outside of development circles than this one, and for good reason.  There is a shift in approach needed to make pair programming work, and I’m not convinced that every developer is willing to make it.

I read a blog posting by Mike Arace recently (“I Am not a Robot”, from http://mikeomatic.net/?p=133) that described some problems he encountered when pair programming was introduced at his place of employment.  If you take away only one thing from his post, it should be the point that an organization’s culture can make or break any initiative, whether it’s trying to introduce pair programming, trying to get the team to shift from one source control system to another, or trying to change the brand of caffeine to stock.  If the people on the ground aren’t interested in trying X, there’s nothing you can do to make it work.

Sure, you can try to sell activities like this to the team (“but CocaDew tastes better”), you can try to mandate it (“you WILL switch to CocaDew”), or you can even tie rewards to it (“everyone who switched to CocaDew in the last quarter is up for a raise”).  I think approaches like these will, in general, fail spectacularly, cause the majority of the development team to leave, or both.

No silver bullets

I think it’s also important to realize that agile techniques are not silver bullets.  They’re probably not even shiny.  This is true in two different, closely related respects:

First, everyone rags on the waterfall method of development, usually because it’s “not agile”.  But something isn’t good or bad because it’s blue and not red; it’s good or bad because of the job it was intended for, and how well it performs that job.  I think the waterfall method can work well for projects where the requirements don’t change every week, or where you can mandate that they don’t change after a certain point.  As it turns out, my company doesn’t have this luxury on the majority of the projects it takes on.  We tried the waterfall approach, and we had problems with our projects.  So, we needed to do a better job of matching the tool to the task.  We’ve realized that everything is not, in fact, a nail.

Second, the ability of the team to work together and actually wield the tool selected is a huge contributor to success or failure.  I studied judo for a few years, and one of my most memorable lessons was that “there are no superior martial arts, only superior martial artists”.  A black belt in karate will be much more likely to beat a beginner at kung fu, not because kung fu is inferior to karate in some fundamental way, but because the black belt will tend to be faster, more flexible, and more adept at reading their opponent, defending against their attacks, and responding with well-placed and well-timed attacks of their own.  These skills don’t magically manifest themselves overnight – they can take years to develop.  Likewise, it’s not enough for us as developers to read about agile, do it on a couple of projects, and then claim that we can deliver EVERY project on time, on budget, and to the client’s complete and utter satisfaction – purely because we’re “doing agile”.  Development will ALWAYS be hard work; agile just gives us a new toolset to better address the problems that the majority of us face the majority of the time.  We need to be honest with ourselves that mastering that toolset may take a while.

The journey

So, if we can’t just “be agile” 100% out of the gate, and there’s no guarantee that it will solve all of our problems by lunchtime, where do we really stand?  We stand at the beginning of a journey – one that doesn’t have a destination, but instead a lot of cool sights and mile markers.  Along the way, we will be able to look back at our progress and say things like “we are more agile now than we were 6 months ago, because we can better respond to problem X”.  As long as we can continually improve the process, delight customers, and deliver high-quality solutions, we can say we’re on the right path.  If we can’t say these things, then we need to change direction – that’s agile, right?

January 15, 2007 Posted by | Agile | Comments Off on The Agile Path: Agile as a Journey (Part 2 of 2)