November 12, 2002
Say Cheese

Developing Storage

Snapfish's storage needs in particular, Parthasarathy says, have presented the company with an uncommon challenge for an online business. Snapfish manages more than 35 terabytes of primary data in a network-storage environment, and all of it must be quickly accessible in order to fulfill orders. "Storage is our number one challenge, both in terms of our business because it's a huge variable cost item, as well as technology: 35 terabytes is a lot to manage and it keeps on growing," Parthasarathy says.

According to Arnold Jones, technical director for the Storage Network Industry Association, Snapfish's storage capacity ranks among the largest on the Web. By comparison, Jones says, the data Amazon stores about each customer probably totals no more than about 50KB. A single digital image, on the other hand, starts at about 400KB and many exceed 3MB.

Unlike data-intensive financial or health industries, Snapfish doesn't have the luxury of archiving its data on tape. Thumbnail images have to be on fast, expensive disk storage to be available on the site. The heavier, high-resolution images aren't available to customers after they've been uploaded, but they are the revenue generators, Parthasarathy says. Fast access to them means quicker turnaround on orders for reprints, calendars, mousepads, and other products.

"It's an interesting dynamic that we have here," Parthasarathy says. "You have these heavy images which are the revenue generators and the light images which need to be served fast on more expensive storage. Our business is optimized toward this, in terms of what kind of storage we buy and how we manage it. Everything is tuned to this particular characteristic, which is unique to our business."

Parthasarathy says Snapfish considered using tape for storing high-resolution images, but it didn't meet his criteria for access speed.
And when Parthasarathy's staff factored in the costs of additional personnel and maintenance time for tapes, they decided that it was cheaper to leave the data on disks. "We needed to get access to it within a minute, maximum," Parthasarathy says. "Tape could take half an hour, so that was a problem."
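To put the figures quoted above in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: it assumes decimal units (1KB = 1,000 bytes, 1TB = 10^12 bytes) and pretends the entire 35 terabytes holds images of a single size, which the article does not claim. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope storage arithmetic using the figures quoted in the article.
# Assumption: decimal units (1 KB = 1,000 bytes, 1 TB = 10**12 bytes).
KB = 1_000
MB = 1_000 * KB
TB = 1_000_000 * MB

CUSTOMER_RECORD_BYTES = 50 * KB  # Jones's estimate for one retail customer record
SMALL_IMAGE_BYTES = 400 * KB     # low end for a single digital image
LARGE_IMAGE_BYTES = 3 * MB       # size that many images exceed
PRIMARY_DATA_BYTES = 35 * TB     # Snapfish's primary data


def images_that_fit(total_bytes: int, image_bytes: int) -> int:
    """Whole number of images of one size that fit in total_bytes."""
    return total_bytes // image_bytes


def size_ratio(image_bytes: int, record_bytes: int) -> float:
    """How many times larger one image is than one customer record."""
    return image_bytes / record_bytes


if __name__ == "__main__":
    print("Small image vs. customer record:",
          size_ratio(SMALL_IMAGE_BYTES, CUSTOMER_RECORD_BYTES))
    print("Large image vs. customer record:",
          size_ratio(LARGE_IMAGE_BYTES, CUSTOMER_RECORD_BYTES))
    print("400KB images in 35TB:",
          images_that_fit(PRIMARY_DATA_BYTES, SMALL_IMAGE_BYTES))
    print("3MB images in 35TB:",
          images_that_fit(PRIMARY_DATA_BYTES, LARGE_IMAGE_BYTES))
```

Under these assumptions a single image is 8 to 60 times the size of a typical customer record, and 35 terabytes corresponds to somewhere between roughly 11 million and 87 million images.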
Big Fish in a Small Pipe

Another area of the site's architecture that Parthasarathy says his staff couldn't anticipate until the service was fully operational was the amount of bandwidth needed for transferring high-resolution images between the upload servers on the West Coast and the processing lab's servers on the East Coast. Parthasarathy says they started with four T-1 lines to link the photo processing plant over the public Internet to the co-location facility in San Francisco. The problem they faced, he says, was variability in bandwidth and the inability to fine-tune it.

"On paper, four T-1 lines looked like enough. In reality, sometimes the bandwidth was only half a T-1 line. Sometimes it would take days to get the scans over," Parthasarathy says. "One solution would have been to just over-capacitate, buy a lot of bandwidth. You can always solve problems like that, but that's not a smart business thing to do. It's very expensive."

Instead, Snapfish leases a dedicated line from AT&T and uses a custom protocol for transferring most of its data. (The protocol is based on ATM, a low-level protocol that telecommunications carriers and Internet backbones use to transfer data over their lines.) The Snapfish protocol uses checksum validation to ensure that 100 percent of the data gets from point A to point B.

Fine-tuning the load in the face of limited bandwidth was an issue on front-end operations as well, where users with slow connections could quickly bog down part of the site. When one part of the site is slow, it degenerates very quickly, Parthasarathy says, because people keep hitting the Refresh button. When planning resources for the coming quarter, he says, "we try not to optimize it too much. We try to have a little cushion left because of this effect." Disabling the Refresh button on certain pages with JavaScript has helped alleviate the problem.
Other front-end coding tricks were implemented as part of a sitewide redesign that Snapfish undertook in mid-2001 to address shortcomings in the site's original design and to reengineer stop-gap fixes that had been implemented in the site's first year.
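The article does not describe how Snapfish's transfer protocol validates its checksums, so the following Python sketch illustrates only the general idea: the sender computes a digest for each chunk, the receiver recomputes it on arrival, and any chunk that fails the comparison is sent again. The chunk size, the choice of SHA-256, the retry limit, and every function name here are assumptions made for illustration, not details of the actual protocol.

```python
import hashlib
from typing import Callable, List

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size; the real protocol's is unknown


def digest(data: bytes) -> str:
    """Checksum of a byte string (SHA-256 is an assumption, not Snapfish's choice)."""
    return hashlib.sha256(data).hexdigest()


def split_chunks(payload: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Cut the payload into fixed-size chunks; the last one may be shorter."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def transfer(payload: bytes,
             channel: Callable[[bytes], bytes],
             max_attempts: int = 5) -> bytes:
    """Send payload through channel chunk by chunk, resending any chunk
    whose received checksum does not match the sender's."""
    received = []
    for chunk in split_chunks(payload):
        expected = digest(chunk)
        for _ in range(max_attempts):
            delivered = channel(chunk)
            if digest(delivered) == expected:
                received.append(delivered)
                break
        else:
            # No attempt produced a matching checksum.
            raise RuntimeError("chunk failed checksum validation after retries")
    return b"".join(received)


def make_flaky_channel(corrupt_every: int) -> Callable[[bytes], bytes]:
    """Simulated link that flips the first byte of every Nth delivery."""
    calls = {"n": 0}

    def channel(chunk: bytes) -> bytes:
        calls["n"] += 1
        if chunk and calls["n"] % corrupt_every == 0:
            return bytes([chunk[0] ^ 0xFF]) + chunk[1:]
        return chunk

    return channel


if __name__ == "__main__":
    image = bytes(range(256)) * 1024  # stand-in for a 256KB scan
    result = transfer(image, make_flaky_channel(corrupt_every=2))
    print("Transfer intact:", result == image)
```

Validating per chunk rather than per file means a corrupted delivery costs one chunk's worth of retransmission instead of the whole image, which matters when bandwidth is the scarce resource.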