The URL Shortener-SEO Cage Match

Making the case for a Hungarian Notation variation for URL hierarchies

One of the top discussions out in the ether these days revolves around URL shortening. One of the reasons folks flock to URL shortening services like bit.ly and TinyURL is because web sites and applications use exceedingly long URLs. Many times this is because of exposed file system hierarchies (a potential security risk, by the way) and a desire to take advantage of descriptive file names for SEO and informational reasons. Recently Delicious founder Joshua Schachter expressed his opinion that URL Shorteners are bad for the web, while Ben Parr at Mashable refutes with a few reasons he believes the services are useful.

The use of URL Shorteners are somewhat out of the control of developers. They can’t know whether they’ll be used or not, and it’s really out of scope of their responsibilities. But the question then becomes, why don’t developers simply generate shorter URLs in the first place? Why don’t they tighten up the file system hierarchy and present something more reasonable to the user? Wouldn’t this, then, eliminate the need for URL shorteners (and the debate)? After all, very long file names and deeply nested hierarchies waste bandwidth and money.

It turns out that there’s a lot more to consider before shortening URLs or condensing file system hierarchies. Last week’s discussion on the potential waste of bandwidth (and money) from excessively long URLs elicited several responses that were valid and insightful.

LONG URLs AREN’T THE ONLY BANDWIDTH HOG

The first reminded me that cookies are far more piggish in their consumption of bandwidth, with 4-5KB cookies stuffed in HTTP headers like so much stuffing at Thanksgiving being not unusual these days. That reminder is a good one, because it’s a trickier problem to solve than simply shortening a URL. In other words, that’s a discussion for another day – but we will have it eventually. Suffice to say there is a technical solution (a cookie jar) to the problem and we’ll dive into later because, well, it’s not really germane to this discussion.

The second response was more directly related to the negative impact of automatically shortening URLs on business related functions and goals.

Having targeted keywords in your url can provide a (slight) SEO boost to a pages relevance if it matches the query.

http://www.seomoz.org/blog/11-best-practices-for-urls

http://www.seomoz.org/article/search-ranking-factors

Using a meaningful url structure can help with usability as it can provide context (or what Jakob Nielson calls information scent) for the site’s architecture.

http://www.useit.com/alertbox/20030630.html

This raises an important point to consider before deciding to reduce the length of file names.

If the importance of “search” to your business is such that the loss of search traffic due to the removal of keywords from URLs and subsequent SED (Search Engine De-optimization) would be significant then perhaps URL shortening is not a good idea. If you’d notice it, in page ranking or traffic flow or however you’re measuring the effectiveness of SEO in your organization, then you’ll definitely want to consider this against the potential costs of long URLs.

TRIM THE FAT ELSEWHERE

If it’s the case that the business – and they really must be the final arbiter of such a decision because, well, business is the reason you’re serving up web sites and applications in the first place – is willing to absorb the cost of lengthy URLs in the name of SEO then you may need to look elsewhere for a reduction in bandwidth costs. As mentioned earlier, cookies are a common bandwidth hog. Unfortunately the implementation of a cookie-jar functionality is not as straightforward as it seems, though it is likely one of the places you can immediately trim fat that will have a significant impact on bandwidth use. If you can trim them down in the application, do so from the start – it’s always easier to develop an application with leaner cookies in mind than to rework the application later or implement a cookie jar on an intermediary.

For example, perhaps it’s the case that your URLs need to contain certain keywords that are going to necessarily make them long. But do the supporting assets need to contain lengthy, descriptive file names? Do you really need to use /assets/monthly/images/this/is/ridiculous/myimage.jpg when something much shorter, and simpler, would suffice? There are plenty of supporting object types that make up a web page that are not as likely to affect SEO efforts one way or another. Consider the file system hierarchy you have in place for:

Scripts
CSS
Images
Commonly reused content elements (headers, footers, menus, etc…)

And examine whether they are using unnecessarily lengthy URLs. The impact of reducing the length of supporting objects’ file names while maintaining a lengthy page URL in the name of SEO is likely more significant, because the ratio of objects to page URL is, on average, 50:1. A reduction even in the hierarchy naming structure can have a significant impact on bandwidth.

These types of reductions are the reasons folks turn to compression. The trade off between performance and compression is lessened with the use of an intermediary capable of intelligent compression, but just turning on compression in general can certainly buy you quite a bit of bandwidth, which means you can leave the long URLs and file system hierarchies alone.

TRUST ME, YOU’LL LOVE THIS SITE

As is the case with URL shortening services, there are trust issues with shortened URLs. Rick-rolls are the least of your concerns when presented with a shortened URL for which the domain and page URL are obfuscated. Using an intermediary to shorten the URL alleviates some of the concern because the domain is clearly visible, but many folks want more information in the URL from which they can extrapolate information and make a decision whether to follow the link or not.

The difference between shortening a URL yourself using an intermediary or network-side scripting (or just doing it as a matter of design during development) is that you aren’t obfuscating the URL, you’re just shortening it. The concerns with URL shortening services is partially due to the fact that they are misnamed. Oh, they shorten the URL, but in order to do so they completely obliterate any reference to the original and essentially present the user with gibberish.

There is a big difference between obfuscation and shortening. While the latter can lead to the former, it doesn’t have to. Shortening schemes can be developed that adhere – at least in spirit – to usability guidelines such as those presented by Jakob Nielson and that don’t erode the user’s trust. It takes time to design such a hierarchy or scheme, but it can be done.

SHORTEN, BUT DON’T OBFUSCATE

Consider that programmers have long used such techniques that allow other developers to identify variable names “at a glance”. Hungarian notation is one such technique. By prefixing variable names using a codified system of identifiers, developers can easily distinguish a short integer (sNum) from a long integer (lNum) and so on.

So perhaps what we need is to develop a “Hungarian Notation” for web sites; something that’s simple, easy to learn, and can be used as identifying information in a URL so that we can simultaneously shorten URLs (at least by reducing the hierarchy) while maintaining the business’ need for SEO.

Published Apr 06, 2009

Version 1.0

application delivery

dev

internet

network-side scripting

performance

url

web