People used to talk about how Google was going to build a “web-based operating system.” Personally, however, I’m more interested in what Amazon is doing. We all know them as a big retail e-commerce site, but running a site that big requires an incredible amount of web-based infrastructure. For the past few years, they have been commercializing their infrastructure to become one of the most influential companies in the cloud computing arena.
A few months ago they introduced Amazon CloudFront, which provides a content delivery service for making static web content available to users worldwide.
Let’s say your corporate website has a lot of multimedia content—photos, background images, Flash animation, video, Ajax—like, say, Amazon. When people visit your site, their browser first downloads the main HTML for the page, and then it downloads every single multimedia component on the page.
Modern browsers try to be smart about this—starting to load images before the HTML is done, and fetching several images at the same time—but it all takes time. In addition, the round-trip times tend to be longer as your users get further from the server. If you’re in Silicon Valley, people in San Francisco will see short load times, people in New York will have more of a wait, and people in Berlin could have an agonizing wait as every graphical element of your page is fetched across the Atlantic.
Arguably, the best solution is to simplify your web site and remove all the extra content items. A site like Facebook does a pretty good job of this, but the folks who run Craigslist take it to an extreme: the home page is entirely HTML without any graphic elements.
If you can’t simplify your web site, you can still take advantage of the fact that most of your graphic and multimedia content is static. That is, a query from the browser will return the same image or file every single time. Even when the media content changes every time you hit the site (as with the background images at the home page for Terminator: The Sarah Connor Chronicles), the changes are done by changing the HTML to refer to a different image URL. The image served from each URL is always the same.
(That T:SCC site is an excellent example of the kind of site I’m talking about. Every time you refresh the page, it downloads about 190 files, none larger than 60,000 bytes, and some as small as 100 bytes.)
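To get a feel for why distance hurts on a page like that, here is a rough back-of-the-envelope sketch in Python. It assumes every file costs one round trip and that the browser fetches six files at a time. The round-trip times are made-up illustrative values, not measurements, and the model ignores bandwidth, caching, and connection setup.

```python
import math

def estimated_fetch_seconds(file_count, round_trip_ms, parallel_connections=6):
    """Crude estimate: one round trip per file, fetched in parallel batches."""
    batches = math.ceil(file_count / parallel_connections)
    return batches * round_trip_ms / 1000.0

# Hypothetical round-trip times to a Silicon Valley server (illustrative only).
ROUND_TRIP_MS = {"San Francisco": 10, "New York": 80, "Berlin": 170}

for city, rtt in ROUND_TRIP_MS.items():
    seconds = estimated_fetch_seconds(190, rtt)
    print(f"{city}: about {seconds:.1f} s spent on round trips")
```

Even in this simplified model, the waiting time grows in direct proportion to the round-trip time, which is the cost a nearby server reduces.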
If all of this data were generated dynamically (the result of a database query, or the personalized content of a social networking page), it would be very difficult to make it faster. But with static content, you can put copies of the data on multiple servers around the world and then arrange for browser queries to hit the closest server: Users in Las Vegas get files from Los Angeles, and users in Atlantic City get files from Newark.
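As a toy illustration of "closest server," the sketch below picks the server with the shortest great-circle distance to the user. This is only an assumption for illustration: production systems generally pick a server through DNS and network measurements rather than raw geography. The coordinates are approximate and the server list is a hypothetical sample.

```python
import math

# Approximate (latitude, longitude) of a hypothetical sample of server sites.
EDGE_LOCATIONS = {
    "Los Angeles": (34.05, -118.24),
    "Newark": (40.74, -74.17),
    "Dallas/Fort Worth": (32.90, -97.04),
    "Seattle": (47.61, -122.33),
    "Miami": (25.76, -80.19),
}

def great_circle_km(a, b):
    """Haversine distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearest_edge(user_location):
    """Return the name of the geographically closest server site."""
    return min(EDGE_LOCATIONS,
               key=lambda name: great_circle_km(user_location, EDGE_LOCATIONS[name]))

las_vegas = (36.17, -115.14)
atlantic_city = (39.36, -74.42)
print(nearest_edge(las_vegas), nearest_edge(atlantic_city))
```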
This is called a Content Distribution Network (CDN), and although the concept is simple, the implementation is very expensive. You need servers all over the place, preferably colocated at the Internet peering points for maximum speed, and you have to work with the Internet backbone operators to arrange for proper routing.
Only the largest companies can afford to operate their own CDN, but smaller companies can buy these services from specialized CDN providers (Akamai is probably the best known). The plans I’ve seen start at around $1000 per month for basic service, so you’d only use a CDN if you already had a huge amount of web traffic.
Amazon CloudFront changes the math on all that. As with all of its web services, Amazon charges for CloudFront only when you use it. There’s no minimum charge. Use a penny’s worth of service, and they bill you a penny.
This has allowed me to host all of the static design elements of Windypundit on Amazon CloudFront. (Images in my posts are still served from my hosting service, except for my photography, which is all hosted by SmugMug.) Pull up the properties of any image that’s part of the page design, and you’ll see it’s coming from the cf.cdn.windypundit.com domain, which is mapped to an Amazon CloudFront server.
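On the page side, the change amounts to pointing design-element URLs at the CDN domain instead of the main host. A minimal sketch of that rewrite in Python follows. The helper name `cdn_url`, the `http` scheme, and the example path are assumptions for illustration. Only the domain comes from the description above.

```python
from urllib.parse import urljoin

# Domain taken from the post; the scheme is an assumption.
CDN_BASE = "http://cf.cdn.windypundit.com/"

def cdn_url(asset_path):
    """Build the CDN URL for a static asset given its site-relative path."""
    return urljoin(CDN_BASE, asset_path.lstrip("/"))

# Hypothetical asset path, for illustration only.
print(cdn_url("/images/header.png"))
```

A template would then call this helper wherever it emits an image or stylesheet link, so moving assets on or off the CDN is a one-line change.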
Here’s how it prices out: I load all the images to the Amazon Simple Storage Service (S3), paying 10 cents per gigabyte transferred plus 1 cent per 1000 items uploaded. Then I pay 15 cents per gigabyte per month to store it.
When someone loads my page, their browser sends requests for static content to the nearest CloudFront server in Ashburn VA, Dallas/Fort Worth, Los Angeles, Miami, Newark, Palo Alto, Seattle, St. Louis, Amsterdam, Dublin, Frankfurt, London, Asia, Hong Kong, or Tokyo.
The first time any server gets a request, it fetches the file I uploaded to S3, which costs me 1 cent per 10,000 requests and 17 cents per gigabyte transferred between S3 and the CloudFront server. CloudFront then serves the file to the end user, which costs between 17 and 23 cents per gigabyte transferred, and between 1 and 1.3 cents per 10,000 requests.
(CloudFront pricing varies depending on the location of the server, and the costs of operating a data center in that location. The United States is cheapest, and Japan is the most expensive.)
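The fetch-on-first-request behavior described above is a pull-through cache. Here is a minimal sketch of the idea in Python. The `EdgeCache` class and the dictionary standing in for the S3 bucket are hypothetical stand-ins, not Amazon's implementation, and a real server would also expire cached copies, which this sketch leaves out.

```python
class EdgeCache:
    """Toy pull-through cache: fetch from the origin only on a cache miss."""

    def __init__(self, origin):
        self.origin = origin        # dict standing in for the S3 bucket
        self.cache = {}             # files this server has already fetched
        self.origin_fetches = 0     # how many times we had to go back to S3

    def get(self, key):
        if key not in self.cache:
            # First request for this file: pull it from the origin and keep it.
            self.cache[key] = self.origin[key]
            self.origin_fetches += 1
        return self.cache[key]

# Hypothetical file name and contents, for illustration only.
bucket = {"logo.png": b"fake image bytes"}
edge = EdgeCache(bucket)
for _ in range(3):
    edge.get("logo.png")
print(edge.origin_fetches)
```

Three requests produce a single origin fetch, which is why the S3-to-server charges stay small compared with the charges for serving end users.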
Those numbers sound small, but this service is not cheap. In December when I started doing this, CloudFront fulfilled 775,910 requests totalling 2.593 GB in the United States, 179,103 requests totalling 0.629 GB in Europe, 5,957 requests totalling 0.020 GB in Japan, and 32,305 requests totalling 0.108 GB in Hong Kong, costing me (including the 7 cent charge for S3 activity) a grand total of $1.69.
Since my hosting service only charges me about $10 per month, this is a 17% increase in my hosting costs. I’ve incurred another 73 cents so far in January, so it looks like that’s a reasonable estimate of my usage.
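As a sanity check on the December bill, the sketch below brackets it in Python. The per-region rates for Europe and Hong Kong aren't given above, so instead of guessing them it prices all traffic at the cheapest quoted rates and then at the most expensive ones. The function and variable names are my own for illustration.

```python
# December usage from the post: region -> (requests, gigabytes served).
USAGE = {
    "United States": (775910, 2.593),
    "Europe": (179103, 0.629),
    "Japan": (5957, 0.020),
    "Hong Kong": (32305, 0.108),
}
S3_CHARGE = 0.07    # dollars, the S3 activity charge
BILLED = 1.69       # dollars, the actual total
HOSTING = 10.00     # dollars per month, approximate hosting fee

def cloudfront_cost(requests, gigabytes, per_gb, per_10k_requests):
    """Return (transfer dollars, request dollars) at the given rates."""
    return gigabytes * per_gb, requests / 10_000 * per_10k_requests

total_requests = sum(r for r, _ in USAGE.values())
total_gb = sum(g for _, g in USAGE.values())

# Price everything at the cheapest quoted rates, then at the most expensive.
low_transfer, low_requests = cloudfront_cost(total_requests, total_gb, 0.17, 0.010)
high_transfer, high_requests = cloudfront_cost(total_requests, total_gb, 0.23, 0.013)
low = low_transfer + low_requests + S3_CHARGE
high = high_transfer + high_requests + S3_CHARGE

print(f"{total_requests} requests, {total_gb:.3f} GB")
print(f"bill should fall between ${low:.2f} and ${high:.2f}; actual ${BILLED:.2f}")
print(f"increase over hosting: {BILLED / HOSTING:.0%}")
```

The actual $1.69 lands inside the bracket, close to the low end, as you would expect with most traffic in the cheapest region. At both ends of the range the per-request charges come out larger than the transfer charges, which fits a site built from many small files.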
My biggest concern is that I’ll take a bath in service charges if my usage spikes for some reason, such as getting an Instalanche, being Slashdotted, or earning a high rating on Digg. In theory, the additional traffic will eventually convert to more regular readers and therefore more ad revenue, but that takes a while, whereas my CloudFront bill is due immediately.
The real questions are, does this make my site any faster? And does that higher speed encourage more people to visit? And do the extra visitors translate to more revenue?
The truth is, although the site feels like it loads faster, I haven’t actually benchmarked it, and I don’t track my conversion numbers, so I have no way of knowing how much money this is making me. I just did it because I thought it was cool.