World Wide Web

If you’ve been trying to read this site (or probably hundreds of thousands of others) in Internet Explorer 6 or 7, you’ve likely seen the entire site load just fine, and then, when it finishes, an error box pops up that says “Internet Explorer cannot open the internet site.”

It took me a while to figure out what was happening, but I eventually traced it to the script from Sitemeter that I use to track usage statistics. I had pretty much saved it for last, since I assumed they’d test their code to avoid, you know, breaking the Internet.

Lots of people are seeing this problem. I hope they fix it soon.

Note: If you’re having this problem on your web pages, you have to track down and remove the Sitemeter code from your pages until they get around to fixing it.

I’ve been reading about the Wikileaks issue on some legal blawgs, and corresponding with Scott Greenfield about it, and I think there’s a bit of confusion over a technical issue. I don’t think anyone within Judge White’s jurisdiction is disobeying his order. I’m going to have to delve into some history here, but I think I can explain it without getting too technical. The key is that the network service of hosting web content is different from the network service of associating that content with a domain name.

The networking protocol for the Internet is called IP, which stands for internet protocol. Every computer on the Internet is assigned an IP address within the network where it can be reached. When a human is going to see or type an IP address, it’s usually broken into four numbers ranging from 0 to 255, which are written down separated by dots. E.g. “88.30.13.160”.
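Here’s a minimal sketch in Python of what those four numbers mean: each one is a single byte of a 32-bit address. The function name `dotted_quad_to_int` is my own invention for illustration; the standard library’s `ipaddress` module does the same job and is used here only as a cross-check.

```python
import ipaddress


def dotted_quad_to_int(text: str) -> int:
    """Convert a dotted-quad IPv4 string into its 32-bit integer value."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {text!r}")
    value = 0
    for part in parts:
        octet = int(part)  # raises ValueError if the part is not a number
        if not 0 <= octet <= 255:
            raise ValueError(f"octet out of range: {octet}")
        # Shift what we have so far one byte to the left and append this octet.
        value = (value << 8) | octet
    return value


if __name__ == "__main__":
    address = "88.30.13.160"
    print(dotted_quad_to_int(address))
    # The standard library should agree with the hand-rolled version.
    print(int(ipaddress.IPv4Address(address)))
```

The dots are purely for human convenience. On the wire, the address is just those 32 bits.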

That’s an ugly thing to be typing, so since the earliest days of the old ARPAnet, there has been a mechanism to allow us humans to use names for computers instead of numbers. Initially, every computer simply had a list of the names and addresses of every other computer. Each computer’s administrators would occasionally download a new list from a central location.
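To give a feel for how simple that scheme was, here’s a rough sketch of a name lookup against a local list. It assumes the modern hosts-file layout (an address followed by one or more names), not the historical ARPAnet file format, and the host names and addresses are made up.

```python
# A made-up host list in the modern hosts-file layout: address, then names.
HOSTS_TEXT = """\
# address      names
192.0.2.10     alpha
192.0.2.20     bravo  bravo-alias
192.0.2.30     charlie
"""


def parse_hosts(text: str) -> dict[str, str]:
    """Build a name -> address table from hosts-style text."""
    table: dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table[name.lower()] = address
    return table


def look_up(table: dict[str, str], name: str):
    """Return the address for a name, or None if the list doesn't have it."""
    return table.get(name.lower())


if __name__ == "__main__":
    hosts = parse_hosts(HOSTS_TEXT)
    print(look_up(hosts, "bravo"))
    print(look_up(hosts, "delta"))
```

Every machine needs its own up-to-date copy of the whole list, which is exactly what stopped scaling.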

By the mid-1980s, that system was no longer workable because the number of computers had grown into the thousands and it was hard to keep track of the changes across all the organizations on the network. To solve this problem, the architects of the internet invented the Domain Name System (DNS), in which the lists of names and addresses are stored on computers called name servers that the other computers query when they need to look up a name.

For any given domain name (yahoo.com, windypundit.com, wikileaks.org) there’s a name server somewhere that’s responsible for providing the associated IP addresses. It’s called the authoritative name server, and whoever controls this server controls the meaning of all the domain names for which it has authority.

How does the network of name servers find the authoritative name server? They send a query about the domain name to a group of top-level name servers, called registries, which respond by referring the query to the name server that has authority for that domain. When you buy a domain name from a registrar, you’re buying the right to tell one of the top-level registries which name server has the authority for that domain.

(I have simplified the domain resolution process quite a bit. There are a lot more options, and there is a lot of localized caching to improve performance.)
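The two-step referral described above can be sketched as a pair of table lookups. This is a toy model under the same simplifications, not real DNS code: the tables, the `resolve` function, and every name and address in it are hypothetical.

```python
from typing import Optional

# Hypothetical registry data: top-level domain -> {domain: authoritative name server}.
REGISTRIES = {
    "org": {"example.org": "ns1.dns-host.example"},
}

# Hypothetical name server data: name server -> {domain: IP address}.
NAME_SERVERS = {
    "ns1.dns-host.example": {"example.org": "192.0.2.80"},
}


def resolve(domain: str) -> Optional[str]:
    """Follow registry -> authoritative name server -> IP address."""
    top_level = domain.rsplit(".", 1)[-1]
    registry = REGISTRIES.get(top_level, {})
    # Step 1: the registry refers us to the authoritative name server.
    name_server = registry.get(domain)
    if name_server is None:
        return None
    # Step 2: the authoritative name server supplies the address.
    return NAME_SERVERS.get(name_server, {}).get(domain)


if __name__ == "__main__":
    print(resolve("example.org"))
    print(resolve("missing.org"))
```

Notice that the registry never stores the web server’s address. It only knows which name server to ask.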

When it comes to accessing a web site, there are five roles we need to be concerned about:

  • The web server that actually serves the pages.
  • The authoritative name server, which points to the web server.
  • The registry, which points to the authoritative name server.
  • The registrar, which is the business entity that sold the domain name.
  • The registrant, which is the actual person or other entity that owns the domain name.

When you buy a domain name, you’re the registrant, and the registries are operated by the top internet authorities, but the other three roles are up for grabs. Many companies will sell you a package deal for registering a domain, operating the name server, and hosting your content, but it doesn’t have to be that way.

For example, the windypundit.com web server is operated by a company called DowntownHost, and the registrar is Tierranet. Currently, the name server is also operated by Tierranet, but I could switch it to DowntownHost or a third party if I wanted to.

Wikileaks used a similar arrangement. Their server is at IP address 88.30.13.160 and is run by a web hosting company called PRQ in Sweden. Their registrar is a company called Dynadot, located in California.

According to the court documents John Katz has posted, Judge White ordered the registrar, Dynadot, to remove all information about the wikileaks.org domain from the registry. As far as I can tell, they’ve done so. That’s why we get an error if we try to browse to wikileaks.org. (That’s also why I can’t tell you anything about the nameserver or the registrant—Dynadot has deleted that information.)

None of that, however, affects the actual web server that hosts Wikileaks. If you happen to know that its address is 88.30.13.160, you just have to type this into your browser:

http://88.30.13.160

The Internet will just route traffic between your computer and the Wikileaks server, without ever having to do a query on the wikileaks.org domain. It’s just between your browser and the Swedish server.
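A toy model makes the point concrete. In this sketch, deleting the name from the lookup table breaks access by name but leaves access by address untouched. Everything here is hypothetical: the `fetch` function, the tables, and the example name and address stand in for a real browser, real DNS, and a real server.

```python
import ipaddress


def is_ip_literal(host: str) -> bool:
    """Return True if the host is already an IP address rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def fetch(host: str, dns: dict, servers: dict) -> str:
    """Return the page for a host, doing a name lookup only when needed."""
    address = host if is_ip_literal(host) else dns.get(host)
    if address is None:
        raise LookupError(f"cannot resolve {host}")
    return servers[address]


if __name__ == "__main__":
    dns = {"example.org": "192.0.2.80"}        # name -> address
    servers = {"192.0.2.80": "the home page"}  # address -> content

    print(fetch("example.org", dns, servers))

    del dns["example.org"]  # the name is removed; the server is not
    print(fetch("192.0.2.80", dns, servers))
    try:
        fetch("example.org", dns, servers)
    except LookupError as error:
        print(error)
```

The `servers` table is never modified, which is the whole story: removing a name is not the same as removing a site.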

Maybe all these technical details are too low-level for the courts to take notice, but as far as I can tell, no one is disobeying the judge’s order. He ordered the name deleted, and it’s gone.

Yesterday, I mentioned that Federal Judge Jeffrey S. White issued an order shutting down the Wikileaks site. He did this by ordering the domain registrar to disable the wikileaks.org domain. This only disables the name lookup feature, not the underlying website, which is still available via its IP address:

http://88.80.13.160

In a comment to my last post, Scott Greenfield asks,

[D]o you think it’s critical that the Judge White’s order was ineffective because of a technology error? If they figure out how to do it effectively next time, then what?

I’ve been giving this a little thought. I’m not an expert at Internet security, but I think I may have been unfair to Judge White. The IP address above traces to a server in Stockholm, Sweden, so he may very well have done all that was in his power to do by ordering the American registrar to disable the name.

I suppose the aggrieved party could ask him to order the big American internet backbones to stop carrying traffic from that address. I think it would be analogous to ordering a phone company not to put through certain calls, or ordering the post office not to deliver certain mail. It would probably be a serious performance and administrative burden, and I wouldn’t be surprised if it’s not legally possible.

Besides, the Wikileaks site could get a new IP address in a few minutes. Within a day or two, all the usual web sites would be linked to it again.

In addition, Wikileaks has many other domain names, some of which are obvious—wikileaks.cx, wikileaks.cn, wikileaks.in, wikileaks.org.uk, wikileaks.org.nz—and some of which aren’t, e.g. sunshinepress.org. There are also independent mirror sites that serve all the same content to the web from locations in several different countries.

The folks who built Wikileaks make some pretty grandiose claims about it being “uncensorable.” Technically speaking, there’s no such thing, but as a practical matter, they can probably put up a pretty good fight. Wikileaks was originally designed to support dissident activities by people in repressive countries, and it makes use of some advanced security technologies.

It’s not as farfetched as it sounds. Consider that the Chinese government has been trying to censor Wikileaks without success. Here in the United States, our government has only been able to stop online poker sites by attacking the flow of money, not the web sites themselves.

Maybe some intelligence agencies have the resources to stop Wikileaks—especially if they’re willing to commit illegal and/or violent acts—but I don’t think a lawsuit or an overzealous judge is much of a threat.

Check out this clause in the User Agreement for Business Week magazine online:

In addition, User may not:

2. use or attempt to use any “deep-link,” “scraper,” “robot,” “bot,” “spider,” “data mining,” “computer code” or any other automated device, program, tool, algorithm, process or methodology or manual process having similar processes or functionality, to access, acquire, copy, or monitor any portion of BW.com…

It’s 2008 and they don’t allow deep linking?

Aside from their lousy web etiquette (and questionable business model), I’ve always felt that legal attempts to prohibit deep linking are crazy talk.

It would be one thing if people were hacking into the site to steal data, or if people were publishing secret passwords. But Business Week isn’t using any of the web security protocols to protect their articles. Everything is wide open. How can it be wrong to publish deep links to their site when their server is programmed to honor deep links?

They claim their User Agreement is a contract that is binding on everyone who visits the site. For all I know, that may even be actual law, but it makes no sense for them to claim one thing in their User Agreement and then implement another thing in their web server. It’s like posting a “No Trespassing” sign outside your door while the people inside are yelling “Come on in!” to everyone passing by. Who are visitors supposed to believe?

(Hat tip: Don MacAskill)

I’m sort of debating with Tom McKenna over further testing of evidence in the murder of Wanda McCoy. Roger Coleman was executed for the crime 13 years ago, and Virginia Governor Mark Warner has just ordered DNA testing which might show Coleman was innocent.

When I dropped “Roger Keith Coleman” into Google, the results page kicked out an ad:

Roger Keith Coleman

I suspect some automated keyword software is buying ad space for popular searches. The search didn’t actually return any results.

Philipp Lenssen is reporting at Google Blogoscoped that Google has launched Google Publication Ads, a service that allows advertisers to place ads in print publications.

I wonder how long it will be before Google has those giant floating blimps with ads on the side like in Blade Runner? Google BlimpAds. You’ll see.

Update: Really, I just posted this so I could see myself on Lenssen’s new Forty Faces site.