Category Archives: Web Technology

This Can’t Be Good, Part 1

I get my internet service from Speakeasy. They have a good reputation and terrific technical support.

Today I got an email from their CEO, Bruce Chatterley, announcing that they’ve just been bought by Best Buy. He’s real excited about all this, of course:

This agreement is a major step forward for our company. While our business remains strong, our relationship with Best Buy provides us with additional resources and brand recognition, while opening new sales channels which will dramatically accelerate our growth.

I can’t imagine why I should care about any of that.

Best Buy, like Speakeasy, is known for its high level of customer service. Our reputation as a trusted provider of voice and data services with stellar customer service will not change. Our values are similar too — Best Buy shares our customer passion, respect for individuals, and drive to do the right thing while achieving results.

Uh, that’s not the Best Buy I know. Maybe it’s one of those things that varies from store to store.

I wonder if Speakeasy will start trying to sell me magazine subscriptions every time I call support…

Amazon Elastic Compute Cloud (Amazon EC2)

Building on top of their S3 service, Amazon Web Services has announced the beta phase of a new service: The Amazon Elastic Compute Cloud (Amazon EC2).

Simply put, EC2 provides web-accessible virtual servers that you can provision as a service in just a few minutes.

Let’s say you have an application that requires a database server, an application server, and a web server. You start by telling EC2 to start three machines for you. In a few minutes, they’ll be up and running. Each one is basically a Linux box with an Apache web server on it. You can then install your application and get it running.

These are virtual machines. Amazon’s servers are running virtualization software that allows each server to run one or more “virtual” machines inside it. To the operating system on the virtual machine, it appears to be running on a 1.7GHz Xeon CPU with 1.75GB of RAM and 160GB of local disk space. In reality, however, whenever it interacts with the virtual hardware, the virtualization software takes over. What the virtual machine thinks is a disk is really one or more files containing all the data that’s supposed to be on the disk.

Or maybe it’s a real disk. That’s one of the advantages of virtualization technology. All the virtual machines can be identical, even if the underlying physical machines aren’t. That leads to the big idea behind virtual data centers: You can clone virtual machines just by copying their data files to another physical machine.
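To make the disk-is-really-a-file idea concrete, here is a toy sketch in Python. It is not how Amazon's virtualization software works internally; the class name, the 512-byte block size, and the file names are all made up for illustration. It only shows that a "disk" can be an ordinary file, and that cloning it is just a file copy.

```python
import os
import shutil
import tempfile

BLOCK_SIZE = 512  # bytes per block; an arbitrary choice for this toy


class FileBackedDisk:
    """Toy 'virtual disk': fixed-size blocks stored in an ordinary file."""

    def __init__(self, path, num_blocks):
        self.path = path
        self.num_blocks = num_blocks
        if not os.path.exists(path):
            # Create a zero-filled backing file of the full disk size.
            with open(path, "wb") as f:
                f.write(b"\x00" * (BLOCK_SIZE * num_blocks))

    def _check(self, index):
        if not 0 <= index < self.num_blocks:
            raise IndexError("block out of range")

    def write_block(self, index, data):
        """Write up to one block of data, zero-padded to the block size."""
        self._check(index)
        if len(data) > BLOCK_SIZE:
            raise ValueError("data larger than one block")
        with open(self.path, "r+b") as f:
            f.seek(index * BLOCK_SIZE)
            f.write(data.ljust(BLOCK_SIZE, b"\x00"))

    def read_block(self, index):
        """Return the full block, including any zero padding."""
        self._check(index)
        with open(self.path, "rb") as f:
            f.seek(index * BLOCK_SIZE)
            return f.read(BLOCK_SIZE)


def clone_disk(disk, new_path):
    """Clone the 'machine' by copying its backing file."""
    shutil.copyfile(disk.path, new_path)
    return FileBackedDisk(new_path, disk.num_blocks)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    original = FileBackedDisk(os.path.join(workdir, "web1.img"), num_blocks=8)
    original.write_block(0, b"web server config")
    copy = clone_disk(original, os.path.join(workdir, "web2.img"))
    print(copy.read_block(0).rstrip(b"\x00"))
```

The code reading and writing blocks never needs to know where the backing file lives, which is the property that lets identical virtual machines run on non-identical physical ones.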

Here’s how that works on EC2: You tell the service to make images of all three of your servers. This makes copies of the virtual machine’s data files and stores them in Amazon’s Simple Storage Service.

Then, as usage of your web application increases, you can tell the EC2 service to launch additional copies of your web and application servers to handle the heavier load. Obviously, it takes a bit of care on your part to make sure your application is built and configured so new servers can just drop in, but that’s pretty common these days.

Suppose you have a site that runs a weekly contest. Each contest opens on Sunday and closes on the following Saturday. You get a burst of people checking it out when it starts, so you need 6 web servers on Sunday, and 10 servers on Monday as people check it out from work. The rest of the week you only need 3 servers, until the last day of the contest when everybody gets in one last entry and you need 25 servers to handle the load.

If you have a dedicated data center, you’d have to have 25 servers running all the time, even on days you don’t need them. Well, you could turn them off, I suppose, but they’d still have to be there. Either you or your hosting company would have to own them, and you’d be paying for all 25 of them, since most hosting companies won’t let you change the number of servers you rent on a daily basis.

Amazon EC2 lets you do exactly that. Each day, you can allocate however many servers you need. In fact, EC2 rents time by the hour, so you can idle with 1 server overnight, bring another server online as people get to work, and another server to handle the lunch break.
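For the weekly contest example, the difference is easy to put in numbers. This little Python sketch just adds up server-days; the per-day counts come from the example above, and the assumption that Tuesday through Friday are the "rest of the week" days is mine.

```python
# Servers needed each day of the contest week, taken from the example above.
# Treating Tuesday through Friday as the quiet days is an assumption.
DAILY_SERVERS = {
    "Sun": 6,
    "Mon": 10,
    "Tue": 3,
    "Wed": 3,
    "Thu": 3,
    "Fri": 3,
    "Sat": 25,
}


def on_demand_server_days(schedule):
    """Server-days used when you rent only what each day needs."""
    return sum(schedule.values())


def dedicated_server_days(schedule):
    """Server-days paid for when you must keep the peak count all week."""
    return max(schedule.values()) * len(schedule)


if __name__ == "__main__":
    used = on_demand_server_days(DAILY_SERVERS)
    owned = dedicated_server_days(DAILY_SERVERS)
    print(f"on demand: {used} server-days, dedicated: {owned} server-days")
```

That comes to 53 server-days rented on demand against 175 for a fixed fleet of 25, or roughly 30 percent of the capacity.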

All of this is available as a service that responds in just a few minutes—they say 10 minutes from request to server boot—which means you can program your application to monitor its own load and call on EC2 to provision extra servers for itself as it needs them.

Right now, during the beta testing period, you can’t provision more than 20 servers for your application. But when EC2 goes into production, you’ll have access to a vast pool of hardware. So if, say, Stephen Colbert mentions your contest on his show and 10 million people decide to check it out, you’ll be ready, as your website quickly grabs an extra 300 servers to handle the load. Call it traffic-spike protection.
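The self-monitoring part boils down to a small calculation the application could run every few minutes. Here is a rough Python sketch of that decision. It does not call the real EC2 API; the function names and the idea of measuring load in requests per second against a per-server capacity are my own assumptions. The default cap of 20 is the beta limit mentioned above.

```python
import math


def servers_needed(requests_per_second, capacity_per_server,
                   min_servers=1, max_servers=20):
    """How many servers the current load calls for, within the allowed range.

    capacity_per_server is a hypothetical figure you would have to measure
    for your own application.
    """
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    wanted = math.ceil(requests_per_second / capacity_per_server)
    return max(min_servers, min(max_servers, wanted))


def scaling_delta(current_servers, target_servers):
    """Positive: launch that many servers. Negative: shut that many down."""
    return target_servers - current_servers


if __name__ == "__main__":
    target = servers_needed(requests_per_second=950, capacity_per_server=100)
    print("target:", target, "change:", scaling_delta(3, target))
```

A real version would hand the delta to whatever call actually launches or terminates instances, and it would need to allow for the few minutes a new server takes to boot.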

What’s the cost? 10 cents per server per hour. That works out to about 70 dollars per month of continuous uptime. I’m not planning to move Windypundit to EC2 any time soon, because I only pay a fraction of that price for a fraction of a server, but it sounds like a reasonable price for commerce hosting.

You also have to pay 20 cents per gigabyte transferred in or out, and 15 cents per gigabyte per month to store the configurations on S3.
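Putting those three prices together, a back-of-the-envelope bill looks like this in Python. The rates are the ones quoted above; the function names and the sample usage figures (50 GB transferred, 2 GB of stored images) are invented for illustration. Amounts are kept in whole cents to avoid floating-point rounding surprises.

```python
# Prices quoted in the post, in cents.
HOURLY_CENTS = 10     # per server per hour
TRANSFER_CENTS = 20   # per gigabyte transferred in or out
STORAGE_CENTS = 15    # per gigabyte per month stored on S3


def monthly_cost_cents(server_hours, gb_transferred=0, gb_stored=0):
    """Estimated monthly bill in cents for the given usage."""
    return (server_hours * HOURLY_CENTS
            + gb_transferred * TRANSFER_CENTS
            + gb_stored * STORAGE_CENTS)


def as_dollars(cents):
    """Format a whole number of cents as a dollar amount."""
    return f"${cents / 100:.2f}"


if __name__ == "__main__":
    one_server_all_month = 24 * 30  # 720 server-hours in a 30-day month
    print(as_dollars(monthly_cost_cents(one_server_all_month)))
    print(as_dollars(monthly_cost_cents(one_server_all_month, 50, 2)))
```

One server running for a 30-day month is 720 server-hours, or $72.00, which is where the "about 70 dollars" figure comes from. With the sample transfer and storage added, the total is $82.30.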

As with S3, anyone planning to launch a large web business is going to have to include evaluation of this service in their planning.


Amazon Simple Storage Service (Amazon S3)

Everybody says Google is going to build a web-based operating system, but Amazon seems to be doing it too, with their Amazon Web Services line of products.

For example, there’s Amazon Simple Storage Service (Amazon S3), which implements a very simple store-a-file/get-a-file service. Amazon says it’s fast and designed for 99.99% availability, with data centers in two locations. Storage costs you 15 cents per gigabyte per month, and bandwidth costs 20 cents for every gigabyte transferred in or out. There’s no minimum charge.

Some companies have already decided that buying bandwidth and space from Amazon is cheaper than expanding their data centers. The SmugMug photo sharing service uses S3 for storing redundant copies of all images, and Altexa Software used S3 to implement an over-the-internet data backup service. Both of them added S3 to their services in just a few days.

Calls to the S3 API can be made using REST or SOAP interfaces, and are authenticated with a cryptographic signature over the HTTP headers, computed with your secret key. The signature covers the timestamp field, which prevents replay attacks.
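Here is a minimal Python sketch of how timestamped request signing of this kind works in general. It is a simplified illustration, not the exact S3 wire format: I am assuming an HMAC-SHA1 keyed hash, a string-to-sign made of just the method, date, and resource, and a 15-minute window for stale requests. The function names and the sample key are hypothetical.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def sign_request(secret_key, method, resource, date_header):
    """Return a base64 HMAC-SHA1 signature over a simplified string-to-sign."""
    string_to_sign = "\n".join([method, date_header, resource])
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_request(secret_key, method, resource, date_header, signature,
                   now, max_skew=timedelta(minutes=15)):
    """Server side: reject stale timestamps, then check the signature.

    `now` must be a timezone-aware datetime, and date_header must carry a
    time zone (for example "GMT"), so the two can be compared.
    """
    sent_at = parsedate_to_datetime(date_header)
    if abs(now - sent_at) > max_skew:
        return False  # too old or too far in the future: possible replay
    expected = sign_request(secret_key, method, resource, date_header)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    key = "not-a-real-secret"
    sent = datetime(2006, 8, 25, 12, 0, tzinfo=timezone.utc)
    date = format_datetime(sent, usegmt=True)
    sig = sign_request(key, "GET", "/example-bucket/lakefront.jpg", date)
    print(verify_request(key, "GET", "/example-bucket/lakefront.jpg",
                         date, sig, now=sent + timedelta(minutes=1)))
```

Because the date is part of what gets signed, an attacker who captures a request can't change the timestamp without invalidating the signature, and can't reuse the request unchanged once the window has passed.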

Each file stored in S3 can have its own security policy. If a file is set to be readable by everyone, then signed headers are unnecessary and S3 will serve up the file in response to an ordinary HTTP request, qualified by the “Host” header, allowing S3 to be used for virtual hosting of static content.
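For a publicly readable file, no signing code is needed at all; you only have to get the address right. This Python sketch builds the two URL forms and the raw unsigned request, without touching the network. The bucket and file names are hypothetical, and I am assuming the standard s3.amazonaws.com endpoint.

```python
from urllib.parse import quote

S3_ENDPOINT = "s3.amazonaws.com"  # assumed endpoint


def path_style_url(bucket, key):
    """URL with the bucket name as the first path segment."""
    return f"http://{S3_ENDPOINT}/{bucket}/{quote(key)}"


def virtual_host_url(bucket, key):
    """URL with the bucket name in the host name."""
    return f"http://{bucket}.{S3_ENDPOINT}/{quote(key)}"


def plain_get_request(bucket, key):
    """Raw text of an unsigned HTTP/1.1 GET; the Host header picks the bucket."""
    return (
        f"GET /{quote(key)} HTTP/1.1\r\n"
        f"Host: {bucket}.{S3_ENDPOINT}\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    print(path_style_url("example-bucket", "images/lakefront.jpg"))
    print(virtual_host_url("example-bucket", "images/lakefront.jpg"))
    print(plain_get_request("example-bucket", "images/lakefront.jpg"))
```

The second form is the one that matters for virtual hosting: the path contains only the file name, and the server works out which bucket you mean from the Host header.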

The image of the Chicago lakefront at the top of this page is currently hosted at Amazon. It took me a couple of hours of mucking around with Perl scripts to write a program to put the file out there, but somebody who actually knows Perl could probably have done it in a few minutes. (I could have done it in a few minutes in .NET, but I wanted Perl scripts so they’ll run on the Windypundit host system.)

If you need to serve large files to lots of people, the S3 servers will also act as a seeder site for BitTorrent.

I think anybody implementing a web business that stores large amounts of static data is going to have to include S3 (and its eventual competitors) in their planning.