Friday, September 12, 2008

How I Assess a Website for SEO

Very often website owners wonder what exactly is wrong with their websites in terms of SEO. Many of the errors could easily be fixed by the owners themselves, but a lack of specific SEO knowledge makes that impossible. Reading SEO forums and blogs full of conflicting advice doesn’t help; all people get as a result is a headache and a list of 100 or more factors supposedly important for SEO. Too often, things like Google PageRank or Alexa rank are among the first on the list, when in fact they shouldn’t be there at all.

This article is meant to help both webmasters who would like to assess their own websites for SEO factors and SEO practitioners who are considering adding an SEO review to their list of services. Of course, the following is just my personal point of view, but it is derived from my SEO experience and the knowledge accumulated over the years.

The on-site factors

It’s essential to check the following on-site factors (in no particular order).

Content. Does the site have a lot of textual content for visitors to read and for the search engines to index? Is it delivered as plain text, graphics or Flash (the last two options make it invisible to the engines)? Is it unique?

Use Copyscape along with Google: search for a long sentence copied at random from the site (with and without quotation marks); this will help you identify any copies on the web of the content you are reviewing. There are three possibilities: the site you are reviewing has stolen the content from another site, the other site has stolen the content from the site you are reviewing, or a legitimate republishing (with credit and a link) has taken place. In the third case there is no reason to worry, but if content theft is going on in either direction, the owner should be informed so that the necessary measures can be taken.
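Here is a rough sketch (not part of my usual procedure, just an illustration in Python) of how you could pull a few long sentences out of a page's visible text to paste into Google or Copyscape by hand. The URL is a placeholder and the sentence splitting is deliberately crude.

    # Pull long sentences from a page's visible text for manual duplicate checks.
    import re
    import urllib.request
    from html.parser import HTMLParser

    class TextExtractor(HTMLParser):
        """Collects text outside of <script> and <style> blocks."""
        def __init__(self):
            super().__init__()
            self.skip = False
            self.chunks = []
        def handle_starttag(self, tag, attrs):
            if tag in ("script", "style"):
                self.skip = True
        def handle_endtag(self, tag):
            if tag in ("script", "style"):
                self.skip = False
        def handle_data(self, data):
            if not self.skip:
                self.chunks.append(data)

    url = "http://www.example.com/"          # placeholder domain
    html = urllib.request.urlopen(url).read().decode("utf-8", "ignore")
    parser = TextExtractor()
    parser.feed(html)
    text = " ".join(" ".join(parser.chunks).split())

    # Crude sentence split; the longest few sentences make good search candidates.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    for s in sorted(sentences, key=len, reverse=True)[:3]:
        print(s)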

If the site consists mostly of graphic images (like an online photo album), alt and title attributes along with page titles can help a little, but not in a competitive niche. In that case, if the owner is still interested in search engine traffic, the only way to go is to describe the pictures, at least briefly (human visitors will be grateful too).

On the other hand, if the site is built as a single Flash movie, it has no chance in the search engines unless an alternative HTML version is provided in one way or another. You can use the cache:http://www.domainname.com/ command in Google to see what Google sees on the home page of the website.

I sometimes take the keywords from the title tag, enter them into the Google toolbar and click the highlighter to see how often they are used in the visible content of the page, but I never calculate keyword density. Nobody knows the optimal figure that will work in every case, so calculating it is a waste of time.
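If you prefer a script to the toolbar, something along these lines would do a similar job: it counts how often each word from the title tag appears in the page (plain substring counts, no density maths). Again, the URL is a placeholder and the tag stripping is deliberately rough.

    # Count occurrences of title-tag words in a page's visible text.
    import re
    import urllib.request

    url = "http://www.example.com/"          # placeholder domain
    html = urllib.request.urlopen(url).read().decode("utf-8", "ignore")

    title = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
    title_words = re.findall(r"[a-z0-9']+", title.group(1).lower()) if title else []

    # Strip scripts, styles and tags very crudely for this quick check.
    visible = re.sub(r"(?s)<(script|style).*?</\1>", " ", html)
    visible = re.sub(r"(?s)<[^>]+>", " ", visible).lower()

    for word in dict.fromkeys(title_words):   # preserve order, drop duplicates
        print(word, visible.count(word))      # substring counts, nothing fancier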

Title tags and other meta information. The importance of unique title tags on every page of a website is widely known to everyone who has ever studied SEO basics. Apparently, not that many people have, because I constantly see websites with the same title tag throughout (typical of template-based sites, as some templates let website creators add only a single so-called “website title” when configuring the site). The same applies (even more often) to the “keywords” and “description” meta tags. Granted, these tags are less important for rankings than title tags, but a properly optimised website should still have unique meta tags on each page.

I would also like to mention a couple of other meta tags I keep seeing all the time. I always recommend eliminating them, as they are nothing but useless code bloat; only a handful of meta tags make sense at all, and only in some special cases.
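For the uniqueness check itself, a short script can flag duplicate titles and descriptions across a handful of pages. This is only a sketch in Python: the URL list is hypothetical (in practice it could come from the sitemap) and the regular expressions are crude.

    # Flag duplicate <title> tags and meta descriptions across a list of pages.
    import re
    import urllib.request
    from collections import defaultdict

    pages = [
        "http://www.example.com/",
        "http://www.example.com/services.html",
        "http://www.example.com/contact.html",
    ]

    titles = defaultdict(list)
    descriptions = defaultdict(list)

    for url in pages:
        html = urllib.request.urlopen(url).read().decode("utf-8", "ignore")
        t = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
        d = re.search(r'<meta\s+name=["\']description["\']\s+content=["\'](.*?)["\']',
                      html, re.I | re.S)   # crude: assumes name comes before content
        titles[t.group(1).strip() if t else ""].append(url)
        descriptions[d.group(1).strip() if d else ""].append(url)

    for text, urls in titles.items():
        if len(urls) > 1:
            print("Duplicate title:", text, urls)
    for text, urls in descriptions.items():
        if len(urls) > 1:
            print("Duplicate description:", text, urls)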

Navigation means. Usually, websites have a navigation menu bar on the left or at the top and an alternative “mini sitemap” at the bottom. Some websites use only one or two of these navigation means, some use all three, and sometimes the right column is used for navigation as well. Individual pages can also be interlinked from within the content (a good thing, as long as the linked page doesn’t open in a new window), and last but not least, a good site always has a sitemap.

When I review websites, I always pay attention to the spiderability of the links in the main navigation menus. Top and left-side menus (especially those with dynamic drop-down parts) are often powered by JavaScript. If the code still contains plain href HTML links with URLs in them, the menu will be spiderable; but if you see JavaScript variables instead of the URLs, or if the location.href command is used, the menu is useless in terms of search engine friendliness and SEO.
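A quick way to approximate this check in Python is to list every href in the raw HTML and flag the ones a spider cannot follow. It will not see links written entirely by JavaScript at runtime, which is exactly the problem described above; the URL is a placeholder.

    # List hrefs in the raw HTML and flag non-spiderable ones.
    import re
    import urllib.request

    url = "http://www.example.com/"          # placeholder domain
    html = urllib.request.urlopen(url).read().decode("utf-8", "ignore")

    hrefs = re.findall(r'href\s*=\s*["\']([^"\']+)["\']', html, re.I)
    for h in hrefs:
        if h.startswith("javascript:") or h in ("#", ""):
            print("NOT spiderable:", h)
        else:
            print("ok:", h)

    if "location.href" in html:
        print("Warning: location.href found - navigation may depend on JavaScript.")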

Actually, both a simple menu with a rollover effect and a dynamic menu with drop-down submenus can be implemented using CSS rather than JavaScript. This change of technology will certainly make the website better for the search engines; it will also reduce download time (JavaScript-powered rollover menus are usually built upon graphic images, while the CSS-powered alternatives use text) and improve usability. I always add this recommendation to my SEO reviews.

If the website has breadcrumbs, it should be pointed out as an additional SEO/usability benefit.

Redirects and other HTTP responses. I always check whether the website has a 301 redirect from the non-www version of the URL to the www one. If there is no such redirect, I always recommend adding one.
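The check itself can be scripted. The sketch below (Python, with a placeholder domain) requests the bare domain without following redirects and reports whether the answer is a 301 pointing at the www version.

    # Check whether the non-www domain answers with a 301 to the www version.
    import urllib.error
    import urllib.request

    class NoRedirect(urllib.request.HTTPRedirectHandler):
        def redirect_request(self, req, fp, code, msg, headers, newurl):
            return None       # stop urllib from following the redirect

    opener = urllib.request.build_opener(NoRedirect)
    try:
        resp = opener.open("http://example.com/")    # placeholder domain
        print("No redirect - status", resp.status)
    except urllib.error.HTTPError as e:
        print("Status:", e.code, "Location:", e.headers.get("Location"))
        if e.code == 301 and "://www." in (e.headers.get("Location") or ""):
            print("Good: non-www 301-redirects to the www version.")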

It’s also important to check what HTTP response non-existent pages return. It should always be a 404 (with a custom 404 page shown in the browser; it’s fine to use the site map or the home page as the custom 404 page). This server header checker is very good for checking HTTP responses: user-friendly and reliable.
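A matching sketch for the 404 check: request a page that certainly does not exist and make sure the server answers 404 rather than 200 (a “soft 404”). The URL is, again, a placeholder.

    # Verify that a non-existent page returns a 404 status.
    import urllib.error
    import urllib.request

    try:
        urllib.request.urlopen("http://www.example.com/this-page-should-not-exist-12345")
        print("Problem: a non-existent page returned 200.")
    except urllib.error.HTTPError as e:
        print("Status:", e.code)   # should be 404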

The code. I always check the code for validity using the W3C validator (I don’t trust browser plugins). If the code is not valid, I recommend fixing all the errors. Other code-related issues: table-free layouts are better than table-based ones; for table-based layouts the number of nested tables should be as small as possible; JavaScript functions (where they can’t be eliminated) and CSS styles should be moved to separate .js and .css files to avoid bloating the HTML page itself. XHTML 1.0 Transitional is better than HTML 4.01 Transitional, but XHTML 1.0 Strict and XHTML 1.1 are better still.

And of course, no frames!
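For a quick first pass over these code-level issues, a rough script can count frames, inline script and style blocks that belong in external files, and the depth of nested tables. The counting is deliberately crude and the URL is a placeholder; the W3C validator remains the real authority on validity.

    # Rough counts of code-level issues: frames, inline JS/CSS, nested tables.
    import re
    import urllib.request

    url = "http://www.example.com/"          # placeholder domain
    html = urllib.request.urlopen(url).read().decode("utf-8", "ignore").lower()

    print("frame/iframe tags     :", len(re.findall(r"<frameset|<frame\b|<iframe", html)))
    print("inline <script> blocks:", len(re.findall(r"<script\b(?![^>]*\bsrc=)", html)))
    print("inline <style> blocks :", len(re.findall(r"<style\b", html)))

    # Rough nesting depth: walk the table tags and track how deep we get.
    depth = max_depth = 0
    for tag in re.findall(r"</?table\b", html):
        depth += -1 if tag.startswith("</") else 1
        max_depth = max(max_depth, depth)
    print("max nested tables     :", max_depth)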

More on internal linking. All internal links pointing to the home page should point to the root (http://www.domainname.com/ rather than http://www.domainname.com/index.html). The same rule applies to subfolders (link to http://www.domainname.com/folder/ rather than http://www.domainname.com/folder/index.php). Unfortunately, I have yet to review a site that follows this rule; even some well-known SEO practitioners are guilty of breaking it.
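This rule is easy to check mechanically. A minimal sketch (placeholder URL, crude regex) that lists links ending in an index file instead of the folder root:

    # List internal links that point at an index file rather than the folder root.
    import re
    import urllib.request

    url = "http://www.example.com/"          # placeholder domain
    html = urllib.request.urlopen(url).read().decode("utf-8", "ignore")

    for href in re.findall(r'href\s*=\s*["\']([^"\']+)["\']', html, re.I):
        if re.search(r"(^|/)(index|default)\.(html?|php|aspx?)$", href, re.I):
            print("Should link to the folder root instead:", href)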

URLs. If the site is built upon dynamic URLs, I always recommend at least considering a switch to static URLs, or reducing the number of parameters if possible. Granted, dynamic URLs can get crawled, but they end up in Google’s Supplemental Results more often than static URLs do, are more difficult to control via the robots.txt file, and blur the hierarchy of the website. Static URLs with subfolders (or pseudo-subfolders) allow us to make the hierarchy of our websites more obvious; besides, we can from time to time name a page after an important keyword or two (separated by a hyphen). All these factors help with SEO to some extent.
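To get a feel for how dynamic a site’s URLs are, a few lines of Python can count the internal links that carry query strings and how many parameters each of them uses (placeholder URL again):

    # Count links with query strings and report how many parameters each carries.
    import re
    import urllib.request
    from urllib.parse import parse_qs, urlparse

    url = "http://www.example.com/"          # placeholder domain
    html = urllib.request.urlopen(url).read().decode("utf-8", "ignore")

    for href in re.findall(r'href\s*=\s*["\']([^"\']+)["\']', html, re.I):
        query = urlparse(href).query
        if query:
            print(len(parse_qs(query)), "parameter(s):", href)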

Last but not least, I look at the overall number of indexed pages in Google, Yahoo! and MSN (Live), which the site:www.domainname.com query shows, and compare these numbers with the actual size of the site.

The off-site factors

This usually means links. I do look briefly into links, and I use the Yahoo! search engine for this purpose, because Google shows only a tiny part of all links and MSN has disabled this functionality altogether. The linkdomain:http://www.domainname.com/ command gives us some idea of the website’s backlinks. The overall number is usually highly exaggerated, but comparing it with the age of the site gives me an idea of the site’s history. The quality of the backlinks is important too. If I see that all the links come from a bunch of forum threads, I know at once that they don’t add much to the site’s authority. One link from the Yahoo! Directory can do more for this purpose than two hundred links from forum signatures.

I look at the age of the site (not in the whois records, because the date of the domain registration and the date the website was first uploaded can be far apart). I look at the copyright notice, and if there is none, I just ask the website owner when the site was uploaded.

We know that Google looks at historical data (and has even registered a patent for it). Should we assume that they use this data to calculate the average number of links acquired by a site every year? Or every month? It does make sense. My own experience tells me that old websites that have accumulated a decent number of links (mostly natural links) do quite well, but old domains with too few links perform even worse in Google than relatively new websites. The same applies to websites that haven’t added new content pages in years. Even a quick and aggressive link building campaign won’t improve the situation at once; it will take months before Google realises that the site in question has been revived.

Anyway, quick and aggressive link campaigns are a thing of the past. The engines now look for naturally developed great resources. That’s why the main purpose of a good SEO review is to help the website owner to turn the website into a resource as close to “great” as possible.

It’s important to look into the backlink patterns of the website. If most backlinks come from websites belonging to the same owner and heavily cross-linked with each other, the site is asking for a severe ranking penalty. Such things should be pointed out to the owner - urgently.

What else?

Of course, I briefly review websites for the most obvious kinds of SEO spam. The procedure is described in detail in this article; the only thing that has changed since 2005 is the Supplemental Results. These days, a website showing a lot of Supplemental Results in Google no longer signals an approaching penalty for spam; it often happens to innocent sites.

Often, in order to discover doorway pages on a site you are reviewing, you will need to ask for FTP access to it.

Things I don’t include

There are certain things I don’t include in the SEO review. One of them is Google PageRank, which has become so weird and deceptive that there is almost no point in looking at it, except in some special cases. It is still somewhat correlated with the overall link authority of the site, but the correlation is very weak, and the PR itself has no direct effect on rankings.

Alexa Rank. It is supposed to show which sites have more visitors. Actually, it shows which sites have more visitors with the Alexa Toolbar installed on their computers; Alexa can only collect data from visitors who already have the toolbar. How many people even know this toolbar exists, let alone have it installed? Not many. This alone makes Alexa Rank a worthless parameter.

Current rankings. No, I’m not saying that rankings are unimportant. I never include them in the SEO report because they can change tomorrow, and also because the client may be hitting a different datacentre and seeing a completely different picture. Rankings are affected by many other factors, such as the TLD of the search engine (compare what you see on Google.com with what you find on Google.co.uk) or the language of the interface. Besides, the choice of keywords used at the time of the initial SEO review can be faulty, because thorough, in-depth keyword research is a separate task and is often done after the review. And finally, most of the tools that check rankings violate the TOS of the search engines; those that use legitimate APIs often supply inaccurate information if you compare their data with the results you actually see in the SERPs.

The review is done to discover (and, hopefully, fix) the fundamental problems of a particular website, and for that purpose a ranking report does nothing.

--------
by Irina Ponomareva, Magic Web Solutions Ltd.
