Monday, August 08, 2005

When Bad Software Happens to Good People

My goal for the day was to take a flight with my wife and daughter from Newark to San Francisco. We booked a nice, early 7am United flight so that I could get to California by 10:30am, local time, and put in a reasonable day's work. We showed up at the airport at 6am, and breezed through the line to the self-service checkin.

Now, my first sign that United's software systems are broken should have come weeks earlier -- we were booking one ticket with miles, and the others would be paid for. When the agent booking the mileage ticket was physically unable to book the other two tickets, I should have realized that United was using a booking system that had long outlived its usefulness.

Sadly, I'm just not smart enough to read an obvious sign when I see it.

So we check in my wife (the mileage ticket) and then attempt to check in me and my daughter. The checkin kiosk doesn't know who I am. We ask the ticketing agent, and she tells us that, although there's a record of the reservation, the tickets were never issued.

The next 45-or-so minutes were enlightening, from the perspective of software anthropology. The key insight for me came from the following exchange:

Ticketing counter agent: The original booking agent put an f-dash in [some field], and the automatic queueing system never picked that up and issued the ticket, because it looks for [some other code].

Ticketing counter manager: They should know not to put an f-dash in that field, we haven't used that code for years.

I can only deduce from this exchange that there are two separate mechanisms, and one of them (the reservation UI) allows input that isn't recognized by the other (the queueing system that issues the tickets from a reservation); as a result, some number of tickets are never actually issued, and this morning's episode gets repeated from time to time at United ticket counters around the country.
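To make that deduction concrete, here is a minimal Python sketch of the kind of mismatch I'm guessing at. Everything in it is an assumption for illustration: the function names, the "F-" spelling of the f-dash, and the "T-" stand-in for whatever code the queueing system actually looks for. I have no idea what United's real systems look like.

```python
# Hypothetical sketch only: names and code values are invented for illustration
# and do not reflect United's actual reservation or ticketing systems.

DEPRECATED_CODE = "F-"      # assumed spelling of the discontinued "f-dash"
RECOGNIZED_CODES = {"T-"}   # made-up stand-in for the code the queue looks for


def reservation_ui_accepts(code):
    """The booking screen as described: it still accepts the old code."""
    return code in RECOGNIZED_CODES or code == DEPRECATED_CODE


def queue_issues_ticket(code):
    """The queueing system as described: it only acts on codes it recognizes."""
    return code in RECOGNIZED_CODES


def find_stranded(reservations):
    """Return reservations the UI accepted but the queue will never ticket."""
    return [
        r for r in reservations
        if reservation_ui_accepts(r["code"]) and not queue_issues_ticket(r["code"])
    ]


def validated_booking(code):
    """One possible fix: reject at entry anything the queue would ignore."""
    if not queue_issues_ticket(code):
        raise ValueError(f"code {code!r} is not recognized by the ticketing queue")
    return {"code": code}


if __name__ == "__main__":
    bookings = [{"id": 1, "code": "T-"}, {"id": 2, "code": "F-"}]
    for r in find_stranded(bookings):
        print("reservation", r["id"], "has a record but will never get a ticket")
```

The point of the sketch is that both halves consult the same list of recognized codes, so the entry screen can't accept something the ticketing side will silently skip. A periodic check like find_stranded would at least catch the stragglers before they show up at the airport.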

Don't feel bad for me. I made it to San Francisco only two hours later than originally intended, I got to see the Denver airport, and I had plenty of time to finish reading Cinderella Man (I recommend that you read it). But think about this from United's perspective:

From the time we realized there was a problem with our tickets, until the time we got on our first flight, we took the time of no fewer than seven United employees. At checkin, there were two ticket agents, their manager, and a person they called at their ticketing "help desk". The two ticketing agents were completely consumed with our case for about an hour. At the gate, another three gate agents were eventually involved in helping re-route us. And, if we're lucky, United will eventually offer us a few free tickets to compensate us for the inconvenience. All of this is going to cost them a non-trivial amount of money. And there's nothing in the entire chain of events that was unique to our reservation -- this is a systemic bug that can be triggered by any reservation agent inputting the discontinued "f-dash" into the mysterious forbidden field. This magic code has supposedly been discontinued "for years", but the reservation system still accepts it, and the booking agents still use it, so the problem isn't going away any time soon.

I have some theories about possible root causes of United's software problems. I'll try to get to these in a later post.
