The cost of Page Rank

The subject is covered frequently in the blogosphere. It’s nothing new for many of you. I’ve been bitten by Stopdesign’s Google page rank for specific search queries several times. I just noticed the most recent instance. When John Gruber published “Writing for Google” earlier this month, he provided advice for getting a good rank for one page or article so those seeking its content are likely to find it. Follow his advice for content you want people to find.

There’s a flip side to this issue though. What happens when Google gives a particular page too high a page rank?

Matt Haughey recently found two good examples of sites (written about here and here) that have suffered from the adverse effects of a high page rank for one particular page. To be fair, both of the blog posts Matt mentions do actually discuss the subject people are searching for when they arrive via a search engine. But other times, titles of blog posts coincidentally make use of search terms that are popular for completely unrelated reasons.

Matt has a theory about why so many people blindly believe they’ve found the right place to leave comments, or get in touch with some person or entity. And I agree with him. In fact, I’d take his theory a step further. To many people, the Google search box is synonymous with the browser’s location field. It’s the same thing to them. With the Google Toolbar installed in IE (or because a search field is already built into other browsers), the search field is almost indistinguishable from the browser’s location field. “I type something into that field, and Google takes me where I want to go.”

Case in Point #1

The most recent post I wrote on Tuesday, 24 May, uses a title consisting of a common, generic phrase meaning “new beginning”. I could have used any phrase I wanted, but chose something short and simple. Over the past few days, a couple of comments posted to that entry came out of left field. Both comments mentioned female names with which I have no association. I can’t even recall knowing anyone by those names in the past few years. Both comments expressed frustration and disgust, and did so in a not-so-pleasant manner. Needless to say, they were immediately deleted for being irrelevant and offensive.

Today, I noticed a fair number of people were coming to Stopdesign by way of searching Google for a specific phrase. A phrase which happened to match the post title from Tuesday of this week. After searching Google for the phrase, I saw the problem. Stopdesign was just starting to get hit with the same issues as the sites Matt pointed out. Only my visitors were coming expecting to find some means of contacting (or spewing vitriol toward) the women of a current daytime reality television show. Notice I’m not linking to the show’s site, nor am I mentioning the show’s name. This is intentional, as I really don’t want Stopdesign to be ranked any higher for this query than it already is.

Glancing through the results, I also saw a post from Paul Scrivens which happens to use the exact same title. His post is about learning web design over again. I’m sure Scrivs is no more interested in hearing about the participants in this reality television show than I am.

Case in Point #2

Last year, I switched hosts for Stopdesign to pair Networks, a FreeBSD-based, rock-solid hosting service. One of the services made available to me after the switch was a certain filtering mechanism installed on the server to combat unwanted electronic messages. (Again, I purposely omit the name of such messages or the filtering tool used to combat them.) Once I learned how to enable and configure this tool, I was struck by how well it worked. From ~500 messages per day down to 50. I thought it was something worth writing about. The title of an entry I wrote last year to express my satisfaction with the results used the name of the tool, and nothing else.

(Note that I am not at all an expert on such tools, and my entry contained no exhaustive review of the service. After all, I barely knew what I was doing with it. I wrote about it, but not in any way that would be useful for someone wanting to learn more about the tool.)

Within two weeks, the post on Stopdesign about this tool had become the #3 result on Google for a query containing the service’s exact name. I thought nothing of it until I started getting inundated with messages one morning. This time, 500 per day was a drop in the bucket. Before I could shut down a certain means of contacting Stopdesign via this site, someone managed to write a script that slipped over 3,500 messages to me within the span of about 12 minutes. After I yanked the contact form from Stopdesign, and viewed server logs the next day, I saw that another 12,000 POST attempts had been made to the same URL, before the attack finally ceased an hour later. As it was, I got an inbox full of several thousand messages that morning. Had I woken up any later, many more would have made it through. And who knows if the script would have actually stopped after sending off 15,000 messages.

After thinking about the attack the next day, I began to wonder if it had anything to do with the high page rank of the post about the filtering tool, which I had written two weeks prior. I knew that those who proclaimed defiance against attacks were more likely to get attacked by others determined to take them down. Whether or not it was related, I took action. I deleted the post from MT, removed all traces of it from files on my server, then requested that Google remove that result from their cache. After another week, Stopdesign was completely eliminated from results for that query.

More Responsible Title-Writing?

Don’t get me wrong. I’m completely happy with the page rank for terms relevant to the associated names and typical content of this site. However, these odd coincidences keep occurring with blogs, because automated ranking algorithms can’t always accurately calculate the relevance of a phrase or a name when it’s used in different contexts.

Sometimes, I’m glad to find a weblog with a review of a product I’m thinking of buying, or with advice about visiting a certain city. These posts provide personal or local views of something I won’t find in mainstream media outlets. But I’m positive that shoppers looking to furnish a dining room get sick of seeing all these pages about eliminating tables and using CSS. What does CSS have to do with their dining room? In Case #1 above, I was unaware that the title I used for my entry precisely matched a temporary blip in pop culture. Am I at fault for writing a short and simple title that accurately described the content of my post?

I understand the value of responsible title-writing for blog posts. Perhaps even more so now. But a single writer can’t always predict (or be responsible for) existing uses of every noun, verb, adjective, and adverb combination. If someone kept a popular personal blog, and wanted to write about the victories of a family member previously afflicted with a terminal illness, does this mean using the word “Survivor” in the title of a post needs to be avoided, lest that someone be inundated with insensitive folk trying to contact Rupert or Jeff Probst? Or worse yet, begging for the opportunity to be the next Survivor?

Update: I decided to open up comments on this entry after getting several positive messages from readers. I’m interested in what others have to say about this.