Control Freak
Ebay's technology was nearly its biggest roadblock. Lynn Reedy made it the company's greatest asset.
forbes.com Victoria Murphy, 03.29.04
____________________________________________________________
By the Numbers: Another Day at Ebay
Searches: 175 million
Outbound e-mails: 25 million
Bids: 10 million
New listings: 2 million
Auction sales: $85 million
Source: Ebay.
____________________________________________________________
Most people tend to manage their life and business on a healthy diet of trial and error. Not Lynn Reedy, 48, Ebay's senior vice president for software development. She tends to see everyday tasks as sets of discrete variables, each in need of study, quantification and improvement. Two years ago, in preparation for a ski trip to Squaw Valley, Calif., Reedy's husband, Kevin, bought a watch that measures altitude changes. To her spouse's chagrin, Reedy became obsessed with the data the watch was providing. If they didn't drop at least 30,000 feet per day, she would write the day off as a failure. To Kevin's glee, the watch broke near the end of the season.
That mania for measurement suits Reedy well for her job. People think of Ebay as an auction site, and it is; but at heart it is nothing more than a honking piece of software, one of the most complicated applications ever created. Ebay's dozens of databases and thousands of servers supply 720 million page views daily, and visitor traffic can spike 180% without warning. Last year $24 billion of goods changed hands, a number that should rise to $33 billion this year with growth coming from France, the U.K., China and India. Every day adds 36,000 members to the 95 million already registered. Each minute the site is down equals $60,000 in deals that don't get done. "Technology can't get in the way," she says.
The 640 people in Reedy's department recently put the finishing touches on a strip-to-the-walls rehab of Ebay's data processing--this without any of the millions of visitors noticing. The result is a site that is more responsive and cheaper to run. One big, powerful server has been replaced by hundreds of small, cheaper computers. The software that glues everything together--tracking listings, auctions and payments--can now be continually rewritten without hiccups noticeable to the users. Reedy maintains a rolling list of 40 software ideas under development. Recent ones include automated auction-data feeds to sellers, multiple-item check-outs for buyers and more targeted promotions.
Reedy's fear is a repeat of August 1999, when the site went dark for 22 hours, causing $4 million in lost fees and a $5 billion drop in Ebay's market value. Features like Free Listing Day, a day when sellers can list items free of charge, were canceled because any increase in traffic put the entire site at risk. "Ebay was broken. If the server died, everything failed," recalls Chief Operating Officer Maynard Webb, who was brought in to clean things up after the blackout.
Today Ebay's availability is 99.94%, up from 97% four years ago. This translates into six minutes of downtime weekly, in minute-long increments that typically affect only a small portion of the site. Reedy has managed to do this without blowing the budget. Although her division now spends more on technology than it did in previous years, as a percentage of revenue Ebay last year spent half of what it did in 2000: $159 million, or 7% of revenue.
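The downtime figure follows directly from the availability percentage. As a minimal sketch of that arithmetic (the function and constant names are illustrative, not anything Ebay uses), the conversion can be written as:

```python
# Illustrative arithmetic only; names are assumptions, not Ebay's tooling.
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 minutes in a week


def weekly_downtime_minutes(availability_pct: float) -> float:
    """Expected minutes of downtime per week at a given availability (in percent)."""
    return MINUTES_PER_WEEK * (100.0 - availability_pct) / 100.0


# Availability figures quoted in the article.
today = weekly_downtime_minutes(99.94)          # roughly 6 minutes a week
four_years_ago = weekly_downtime_minutes(97.0)  # roughly 302 minutes, about 5 hours

print(f"Today: {today:.1f} min/week; four years ago: {four_years_ago:.1f} min/week")
```

At 97% availability the same formula gives about five hours of downtime a week, which puts the six-minute figure in perspective.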
Reedy was recruited by Webb in September 1999 from trade show producer Miller Freeman, where she oversaw the creation of 200 new Web sites. Before that she had spent seven years with Andersen Consulting. Webb gave Reedy a daunting task: Increase software output fourfold while just doubling staff.
At the time Ebay was running on just one Oracle database, which was near capacity, and a server from Sun Microsystems. Small software flaws were creating a big backlog of queries that had nowhere to go and quickly brought down large parts of the system. Most engineers were troubleshooting around the clock, resting one or two hours a day in sleeping bags underneath desks. "The code was really mucky," says Reedy. "We would throw everyone at these problems, when only a handful were decent at solving them."
Over a two-year stretch, hardware engineers methodically moved data from the two hefty machines to thousands of smaller servers from a mix of vendors. If one machine drops out, others pick up the slack. It costs less, too. Sun's big multimicroprocessor system cost about $2 million; its workload can be handled by as few as seven smaller $120,000 machines.
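The cost comparison can be checked with a few lines of arithmetic. This sketch uses only the prices quoted above; the variable names are illustrative, and the figures ignore operating costs, which the article does not break out.

```python
# Prices as quoted in the article; names are illustrative assumptions.
BIG_SERVER_COST = 2_000_000   # Sun's big system, approximate
SMALL_SERVER_COST = 120_000   # one smaller machine
SMALL_SERVERS_NEEDED = 7      # "as few as seven" to carry the same workload

cluster_cost = SMALL_SERVERS_NEEDED * SMALL_SERVER_COST
savings = BIG_SERVER_COST - cluster_cost
cost_ratio = cluster_cost / BIG_SERVER_COST

print(f"Seven small machines: ${cluster_cost:,}")
print(f"Savings versus the big system: ${savings:,}")
print(f"Share of the big system's price: {cost_ratio:.0%}")
```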
But this called for a new way to write code. Information within Ebay's system had always run along a two-lane road, shuttling from traders' Web browsers to a server and a single database and back. All 40 developers custom-coded the commands that instructed a server to retrieve specific information from the database, like the price of a used Honda or a buyer's shipping address.
Given Ebay's blistering growth--peak utilization on the site has increased 29-fold since June 1999 to 7.8 gigabytes per day--that two-lane blacktop got jammed. Now it is like a complex highway system. A population of 5,000 servers (half are backup), growing by 1,000 every three months, is connected to 26 databases (6 backup) in three different locations. Because it would be impossible to know precisely which machines are shooting particular data back and forth, Reedy's staff came up with a time-saving piece of management software that knows what kind of commands belong where. For example, an adjustment in how shipping costs are calculated is automatically routed to machines responsible for user profiles.
A homegrown tool, akin to a spell-checker, was created to hunt down bad code that would result in bidders getting those annoying blank Web pages. It also alerts managers to slowdowns in the time it takes to compose and send out automated e-mails to users. Now, when a new feature is rolled out, it's deployed on only one machine at first. If it passes muster, it is then moved onto the next 25% of servers and eventually onto all. "We always want to be able to go back to normal as fast as possible," says Reedy.
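The rollout pattern described here (one machine, then the next 25% of servers, then everything) can be sketched in a few lines. This is a hypothetical illustration, not Ebay's deployment code: the function names, the `deploy`, `healthy` and `rollback` callbacks, and the choice to roll back every touched server on the first failed health check are all assumptions.

```python
import math
from typing import Callable, List, Sequence


def rollout_stages(servers: Sequence[str], fraction: float = 0.25) -> List[List[str]]:
    """Split servers into waves: one canary, the next `fraction` of the fleet, then the rest."""
    servers = list(servers)
    if not servers:
        return []
    canary = servers[:1]
    second_size = math.ceil(len(servers) * fraction)
    second = servers[1:1 + second_size]
    rest = servers[1 + second_size:]
    # Drop empty waves so small fleets still work.
    return [wave for wave in (canary, second, rest) if wave]


def staged_deploy(
    servers: Sequence[str],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
    fraction: float = 0.25,
) -> bool:
    """Deploy wave by wave; if any server in a wave is unhealthy, undo everything and stop."""
    done: List[str] = []
    for wave in rollout_stages(servers, fraction):
        for server in wave:
            deploy(server)
            done.append(server)
        if not all(healthy(server) for server in wave):
            # Go back to normal as fast as possible: roll back in reverse order.
            for server in reversed(done):
                rollback(server)
            return False
    return True


if __name__ == "__main__":
    fleet = [f"web{i}" for i in range(9)]
    for number, wave in enumerate(rollout_stages(fleet), start=1):
        print(f"Wave {number}: {wave}")
```

Keeping the first wave to a single machine limits the damage a bad release can do, and keeping the list of touched servers makes the rollback step a simple loop.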
Perfection is impossible. At last count there were 325 known bugs (in 5 million lines of code), down from 3,000 bugs two years ago. Fixes that previously took up to two months now take at most five hours because problems are spotted early. The system runs so smoothly that only one engineer is on call during off-hours.
Her one complaint: "I can't change how many lines of code my developers write in a given hour," she says. But Reedy can't help herself, forgetting momentarily that she is already trying to change that, too. As of January this year she has had her 300 developers filling out forms detailing how they spend their time, hoping to espy pockets of inefficiency. "The code we're building today will be obsolete in three years. We can't dwell on it," she says.