Submit to Search Engines
Traffic from Google has increased at an astonishing rate over
the past year: Jakob Nielsen's search engine referrals
to his Useit site confirm this, as do the unpublished
reports from retail sites like Stylata. Google, once
considered a niche site for nerds, is the Wall Street
Journal's pick for best search engine on the Net, and
the traffic numbers seem to agree.
Inktomi, the number two traffic generator, doesn't run
its own search site. Instead, the company provides the
technology behind MSN Search and AOL Search, two top
referrers, as well as Hotbot and over a dozen more.
Portal sites like Excite, Lycos, and AltaVista still
draw lots of traffic, but together Google and Inktomi
outweigh the entire rest of the field. Add it up and
it's pretty clear how to maximize your traffic for the
least effort:
::Make sure your site is thoroughly crawled by Google
and Inktomi.
::Get lots of links to your site from domains that a lot
of other sites link to -- that's how Google and Inktomi
determine relevance when ranking search results. Links
Manager can set up a reciprocal links page for your site
and facilitate all of the tasks associated with
maintaining it.
::For all other search engines, implement a blanket
strategy that gets you reasonable results. By not
chasing each one of them separately, you can put your
company's time and money to more important uses.
All of this can be accomplished with one three-step
process. And it really is as easy as 1-2-3.
Step 1: Get Crawled
There are quite a few things you can do to grab the
attention of search engines and directories:
Clean Up Your URLs
Frames used to be the biggest roadblock to getting
crawled, but no more: Both Google and Inktomi now crawl
them (the section of Inktomi's support FAQ that claims
this isn't so is out of date, according to the company).
Instead, the problem with most e-commerce sites today is
that their product pages are dynamically generated.
While Google will crawl any URL that a browser can read,
most of the other search engines balk at links with
"?" and "&" characters that
separate CGI variables (such as "artloop.com/store?sku=123&uid=456").
As a result, many individual product pages don't show up
outside of Google.
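To see which of your own links fall into this category, a quick check is enough. The following is a minimal sketch, not anything the search engines publish: it flags any URL that carries CGI variables after a "?". The function names are made up for illustration, and the sample URL is the one quoted above.

```python
from urllib.parse import urlsplit, parse_qsl


def _with_scheme(url):
    # urlsplit only separates host from path when a scheme is present,
    # so add one for bare URLs like "artloop.com/store?sku=123".
    return url if "://" in url else "http://" + url


def is_crawler_unfriendly(url):
    """Return True if the URL has a non-empty query string after "?"."""
    return bool(urlsplit(_with_scheme(url)).query)


def cgi_variables(url):
    """Return the CGI variables in the URL as (name, value) pairs."""
    return parse_qsl(urlsplit(_with_scheme(url)).query)


if __name__ == "__main__":
    url = "artloop.com/store?sku=123&uid=456"
    print(is_crawler_unfriendly(url))
    print(cgi_variables(url))
```

Run it over the links on your product pages: anything it flags is a page that may not show up outside of Google.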
One way to circumvent this difficulty is to create
static versions of your site's dynamic pages for search
engines to crawl. Unfortunately, duplicating your pages
is a huge amount of extra work and a constant
maintenance chore, plus the resulting pages are never
quite up-to-date — all the headaches dynamic pages
were designed to eliminate.
Many readers have written in to ask if the search
engines will begin crawling and indexing Flash content
soon. The answer, as you might guess, is no. Unlike PDF
files, Flash files rarely contain information in text
format. Search developers don't want to clutter up their
indexes with a million "Skip Intro" pages.
Submit Your Site
There are a lot of automated search engine submission
services and software that you can use to submit your
site to as many search engines as possible. The one most
recommended by people I talked to is WebPosition Gold.
Don't Forget the Directories
Yahoo still offers free submissions, except for business
categories, which cost $199. But even the fee doesn't
guarantee they'll accept your site, just that they'll
decide on it within a week — with free submissions,
you don't even get the promise that they'll ever get
around to evaluating it, given the incredible volume of
submissions.
Once you've submitted your pages, be ready to wait a
month, two, or three before they're crawled and indexed.
It's frustrating, but processing a billion Web pages
takes time — at a nonstop rate of one hundred per
second, it would still take almost four months.
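The arithmetic behind that estimate is easy to check. This sketch uses the two figures from the text; the 30-day month is an assumption made only to convert days to months.

```python
PAGES = 1_000_000_000        # "a billion Web pages"
PAGES_PER_SECOND = 100       # "one hundred per second"

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30          # assumption: a round 30-day month


def crawl_days(pages, pages_per_second):
    """Days needed to process `pages` at a nonstop fixed rate."""
    return pages / pages_per_second / SECONDS_PER_DAY


days = crawl_days(PAGES, PAGES_PER_SECOND)
months = days / DAYS_PER_MONTH
print(f"{days:.1f} days, about {months:.1f} months")
# prints "115.7 days, about 3.9 months"
```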
Make a Crawler Page
It isn't necessary to submit every page on your site to
the search engines. Just make sure they can find all the
pages that matter by hopping links from your front door.
To do that, make a "crawler page" that
contains nothing but a link to every page you want
search engines to crawl. Use the page's TITLE info as
the link text — this helps improve your site score.
For an example, check out Artloop's crawler page.
Basically, the crawler page is a site map that lists all
the pages on your site — it may be a bit too big for
humans to read through, but it will be no problem for a
search engine. Add an obscure link to the crawler page
on one of your site's top-level pages, using a small
amount of text. MSN used to use 1x1 images for this
trick, but the Google geeks warned us to avoid such
obviously invisible tags. "Why not just label it
'site map?'" one asked. Search engine spiders will
find it as soon as they get to your site, and suck down
all the pages it finds on it.
Don't worry, the crawler page won't show up in search
results. It does get pulled into the search engine's
index, but because it has no text or tags to match a
query, it isn't listed as a result. The pages it links
to, however, will appear because the search engine's
spider found them right after it visited the crawler
page. Wired News, for example, uses hierarchical sets of
crawler pages to make sure every story ever published is
crawlable from the top of the site.
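If your pages already live in a database, generating the crawler page takes only a few lines. Here is a rough sketch under some assumptions: the function name, the "Site Map" page title, and the sample URLs are all invented for illustration, and you would feed it whatever (URL, TITLE) pairs your own site produces.

```python
from html import escape


def build_crawler_page(pages):
    """Build a bare HTML page of links from (url, title) pairs.

    Each page's TITLE text is used as the link text, as suggested above.
    """
    links = "\n".join(
        f'<a href="{escape(url, quote=True)}">{escape(title)}</a><br>'
        for url, title in pages
    )
    return (
        "<html>\n<head><title>Site Map</title></head>\n<body>\n"
        f"{links}\n"
        "</body>\n</html>\n"
    )


if __name__ == "__main__":
    sample = [
        ("/artists/picasso.html", "Pablo Picasso"),
        ("/artists/matisse.html", "Henri Matisse"),
    ]
    print(build_crawler_page(sample))
```

Regenerate the page whenever you add or remove products so the spiders always find the current list.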
Pay to Play?
Not too long ago, in response to years of complaints
from commercial site owners who demanded their pages be
indexed and up to date, Inktomi announced a new service
that lets site owners pay to have individual URLs
crawled and indexed quickly. If you're wondering whether
paid listings are worth it, I suggest trying just a
couple of your URLs first — pick the ones you feel are
poised to make the most money — to see if the return
on investment meets your needs.
Remember that Inktomi will rank search results largely
on the links to your page from other domains. And if no
one is linking to you, expect to see your page appear at
the end of the results list, not at the top. Links
Manager can provide you with resources to locate
webmasters willing to exchange links with you.
There are ways, however, to get your site moving up
through the ranks.
Step 2: Get Ranked
Most people who are concerned with search engine
optimization focus obsessively on keywords and HTML
tags. But when it comes to getting ranked by search
engines, the only tags that matter are TITLE, and the
META tags KEYWORDS and DESCRIPTION. And you have to be
very careful about how you handle each one.
TITLE tag
TITLE makes a big difference, especially with Google. It
should be short (less than 40 characters seems to work
best) and, most importantly, should match the search
queries people will be using to find your site. This
could lead to a struggle with the marketing managers:
They'll want your site's page titles to contain the
company name and/or a positioning statement. Ask them
what good that will do if no one ever sees the pages.
This is a good TITLE tag that will generate traffic from
people searching for "picasso":
<Title>Pablo Picasso</Title>
This is a mediocre one:
<Title>Artstuff: Pablo Picasso</Title>
This one will put you out of business:
<Title>Artstuff: Your Number One Online Resource
for Fine Art Solutions!!!</Title>
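The two rules above (keep it under 40 characters, match the query) are simple enough to check automatically before a page goes live. This is only a sketch: the function name is hypothetical, and it cannot tell a good title from a mediocre one that still passes both rules.

```python
MAX_TITLE_LENGTH = 40  # "less than 40 characters seems to work best"


def check_title(title, query):
    """Return a list of problems with a TITLE for a given search query."""
    problems = []
    if len(title) >= MAX_TITLE_LENGTH:
        problems.append(
            f"title is {len(title)} characters; aim for under {MAX_TITLE_LENGTH}"
        )
    # Case-insensitive substring match against the expected query.
    if query.lower() not in title.lower():
        problems.append(f'title does not contain the query "{query}"')
    return problems


if __name__ == "__main__":
    for title in (
        "Pablo Picasso",
        "Artstuff: Your Number One Online Resource for Fine Art Solutions!!!",
    ):
        print(title, "->", check_title(title, "picasso") or "ok")
```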
META NAME="Keywords"
Keyword spamming is the number one favorite trick for
search engine optimization. But many of the sites that
stuff a zillion keywords into their pages are hoping to
get clicks to their pages just to show ads -- they don't
care if they get any repeat business. But if you want to
draw real customers, focus on the keywords you think
your users will be searching for.
For our Picasso page, something like this would work
(note that uppercase letters don't matter):
<META NAME="keywords" content="Pablo
Picasso, Pablo, Picasso, painting, cubist, painting,
ceramics, collage, Spain, Guernica, Paris, 20th century,
Girl Before a Mirror">
Repeating the most important keyword twice seems to work
with some search engines, but repeating more than that
will cause some of them to ignore the whole page.
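If your keyword lists are generated from product data, it's worth capping the repeats in code. The sketch below assumes "twice" means a keyword may appear at most two times in the tag, ignoring case; both the cap and the function name are assumptions for illustration.

```python
from collections import Counter
from html import escape

MAX_REPEATS = 2  # assumption: no keyword appears more than twice


def build_keywords_meta(keywords):
    """Build a keywords META tag, dropping repeats beyond MAX_REPEATS."""
    seen = Counter()
    kept = []
    for keyword in keywords:
        cleaned = keyword.strip()
        if not cleaned:
            continue
        key = cleaned.lower()  # uppercase letters don't matter
        seen[key] += 1
        if seen[key] <= MAX_REPEATS:
            kept.append(cleaned)
    content = escape(", ".join(kept), quote=True)
    return f'<META NAME="keywords" content="{content}">'


if __name__ == "__main__":
    print(build_keywords_meta(["Picasso", "picasso", "PICASSO", "painting"]))
    # prints <META NAME="keywords" content="Picasso, picasso, painting">
```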
What keywords are people searching for? It's important
to focus on the right ones. Zipf's Law predicts that
traffic for any particular keyword on a search engine
will be inversely proportional to its popularity rank. That is,
the number of queries (and hence potential clickthroughs
to your site) for the most popular keyword will be ten
times greater than that for the tenth most popular term.
And traffic to term #10 will be 1,000 times higher than
traffic to term number 10,000. Search engine logs don't
quite match Zipf's curve, and they vary from one engine
to the next. But the lesson remains: If you're not
matching the top keywords, forget it.
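The numbers in that paragraph follow directly from the idealized form of Zipf's Law, where traffic is proportional to 1/rank. A minimal sketch, assuming that pure 1/rank curve (which, as noted, real search logs only approximate):

```python
def relative_traffic(rank):
    """Idealized Zipf traffic for a keyword at a given popularity rank."""
    if rank < 1:
        raise ValueError("rank starts at 1")
    return 1.0 / rank


def traffic_ratio(better_rank, worse_rank):
    """How many times more traffic the better-ranked keyword gets.

    (1 / better_rank) / (1 / worse_rank) simplifies to
    worse_rank / better_rank.
    """
    return worse_rank / better_rank


print(traffic_ratio(1, 10))        # prints 10.0
print(traffic_ratio(10, 10_000))   # prints 1000.0
```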
Where to find the top keywords? Two free resources are
searchterms.com and a weekly emailing from Wordtracker.
Keyword popularity varies from search engine to search
engine, but across the Web (and according to a few
well-placed contacts at search engines) these listings
are close enough.
META NAME="Description"
This field gets used for the page summary on Inktomi and
some other engines, so don't cram it with keywords: A
scary-looking description on a search engine's results
page could discourage people from clicking through to
your page, even if it scores high. (We'll cover more on
descriptions in Step 3.)
Page Text
It never hurts to have the search terms you want to
match near the top of the page. But cramming in a list
of spam-style keywords can also backfire -- Google will
display them under the page title on its results page,
and Inktomi will show them (as will many others) if there
is no DESCRIPTION tag.
Stuffing long strings of repeated keywords into pages
used to magically get them to the top of search engine
results, but that was before the search engineers
realized what was going on and learned how to prevent
this from happening. Once in a while you'll see a "spamdexed"
page near the top of your results, but this trick works
less and less frequently these days.
Links from Other Domains
Look at the top results for the terms you most want to
match. Will those sites link to you from their domain?
If they do, some of their relevance will rub off on your
pages. There are ways to use this dishonestly but
usually sites only link to other sites they're
comfortable being associated with.
Even if your site does manage to claw its way to a plum
position in the search results, that doesn't guarantee
that users will follow the link -- that still takes some
convincing.
Step 3: Get Clicked
All of the work you've done to get your site crawled and
ranked near the top is meaningless if you neglect the
final step: Getting the searcher to click through to
your site. These days, few users will click on a page
described as "Pablo Picasso Pablo Picasso Pablo
Picasso art art art art" in search engine results.
But if you use TITLE to specify the most likely search
term that matches the page, and DESCRIPTION to provide a
quick (50 words max) synopsis of the info on the page,
your site will attract a lot more clicks.
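A quick sanity check on your DESCRIPTION text can catch both problems before a searcher does. This sketch enforces the 50-word limit from the text; the repetition test (fewer than half the words being unique) is a made-up heuristic, not a rule any search engine has published.

```python
MAX_WORDS = 50           # "50 words max"
MIN_UNIQUE_RATIO = 0.5   # hypothetical threshold for "looks like spam"


def check_description(text):
    """Return a list of problems with a DESCRIPTION synopsis."""
    words = text.lower().split()
    if not words:
        return ["description is empty"]
    problems = []
    if len(words) > MAX_WORDS:
        problems.append(f"{len(words)} words; keep it to {MAX_WORDS} or fewer")
    # Share of distinct words: low values mean heavy repetition.
    unique_ratio = len(set(words)) / len(words)
    if unique_ratio < MIN_UNIQUE_RATIO:
        problems.append("too many repeated words; reads like keyword spam")
    return problems


if __name__ == "__main__":
    spam = "Pablo Picasso Pablo Picasso Pablo Picasso art art art art"
    print(check_description(spam))
```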
Don't Scare Them Away
This is where gateway pages, redirects, shadow domains,
and other trickery often fail: The would-be customer
gets to your site only to discover it contains confusing
pages, poor navigation, gratuitous redirects, or exactly
the same content as the last site they looked at —
huh? When users find pages of such a dubious nature, do
you think they're going to trust the site with their
credit card number on, say, a $1400 order for two DJ
turntables? I sure didn't: When I landed at a site like
that recently, I immediately clicked Back and wound up
dropping my money on a pair of pricey Technics decks at
a site that looked like a real, honest company, rather
than a network of sites designed to capture me.
Another mistake new Web marketers make is trying to stop
search engines from sending users directly to individual
pages on the site — something they huffily call
"deep linking." They'll force their Webmaster
to redirect anyone who hasn't come through the site's
front door back to the home page, as if the site were a
brick-and-mortar store. This is usually justified as
"customer experience" and
"branding," but all it really says is the site
doesn't trust its customers to know what they want.
I'm guessing most sites abandon this practice once they
look at their log files and see their would-be customers
abandoning the site after being pulled away from a
product they were ready to buy.
All that said, there are ways to beat the system, as
long as you don't mind getting your hands a little
dirty.
How to Cheat Honestly
As much as I talk up Google, their ranking system isn't
foolproof. In short, it ranks individual URLs based on
which other URLs link to them, which URLs link to those,
and so on. That's the simplified explanation — you can
read about eigenvectors and normal link matrices in this
paper written by Google's creators.
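For the curious, the "and so on" can be sketched in a few lines. This is a toy version of link-based ranking, not Google's production system: the damping value of 0.85, the fixed iteration count, and the handling of pages with no outbound links are all assumptions of this sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy link-based ranking by repeated score passing.

    links maps each URL to the list of URLs it links to.
    Returns a dict of URL -> score; scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small base score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    web = {"a": ["c"], "b": ["c"], "c": ["a"]}
    for url, score in sorted(pagerank(web).items()):
        print(url, round(score, 3))
```

In this tiny three-page web, page "c" scores highest because two pages link to it, and "a" beats "b" because its one inbound link comes from the highly ranked "c".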
While the system works better than old search engine
rankings based on keywords and page content, it's not
perfect. Links from popular sites can count more than
they should, or not enough if the link comes from an
obscure page.
But when Google's engineers read the original version of
this article, they bristled at some of our suggestions
— even though we'd tested them. Emails led to phone
calls, and eventually we spent a caffeinated afternoon
at the Googleplex in Mountain View, CA, using
whiteboards and napkins to sketch out what actually
raises your rankings, and what doesn't. We came away
with some solid suggestions for where to invest your
time wisely:
::Make sure your dynamic pages are crawlable (see
above), and make sure the URLs remain constant. If you
use one URL on the site map, another for the dynamically
generated page, and yet another after giving the user a
cookie, the URLs other sites use to link to your pages
may not be the same as the one Google indexes. URL
inconsistency keeps your pages from being ranked as high
as they should be.
::Google crawls the Web in descending order of PageRank,
meaning the highest ranked pages are crawled first and
most often. So while a crawler page will make your pages
findable, getting other sites to link to the individual
pages will get them crawled more completely, and thus
raise their scores.
::Focus on getting pages that are considered the
authority on the topic that you cover to link to your
pages. Notice we said pages, not sites. For example, I
have a page that's listed by Yahoo, but it's on an
obscure part of the directory that no one else links to,
so it doesn't help me as much as that link from Dave
Winer's blog.
::Ranking trickles down through popular domains with
lots of interpage links, raising the value of all pages
on a popular site — and hence any page it links to.
This is something all bloggers have realized. For
example, let's say a post on my blog gets Slashdotted.
Not many Web pages will link to the actual Slashdot
post, so you'd think it wouldn't do much for my site's
scores. But the value of the many links to Slashdot's
home page trickles down through to the navigable links
inside the site, and eventually to the posting about my
page.
::Creating fake domains is a popular trick people use to
try to raise their Google scores, hoping to make it
appear that other domains are linking to them. The
Google guys giggle at this obvious scam: If you
understand how vectors work, spreading your pages across
multiple domains, or building duplicate sites, does no
better than if you'd simply added those pages to your
original domain. That's because it's the number of
inbound links from elsewhere on the Web that raises your
overall score, and it's unlikely that fake domains will
make that number go up. Google does make some score
adjustments concerning URLs within the same domain to
improve the overall results quality, but spreading your
pages across ten domains won't do much. And according to
Google's anti-spam cop, duplicate domains are the
easiest scam to spot.
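The first suggestion in that list, keeping URLs constant, is the one most easily handled in code. Here is a minimal sketch of the idea: reduce every variant of a product URL to one canonical form before you print it in a link or site map. Treating "uid" as a per-user parameter is an assumption based on the sample URL earlier in this article; substitute whatever session variables your own site uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical: query variables that identify the visitor, not the page.
SESSION_PARAMS = {"uid"}


def canonical_url(url):
    """Return one stable form of a URL for links and site maps.

    Drops session variables, sorts the remaining ones, lowercases
    the host name, and removes any fragment.
    """
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query)
        if name not in SESSION_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


if __name__ == "__main__":
    print(canonical_url("http://artloop.com/store?sku=123&uid=456"))
    print(canonical_url("http://Artloop.com/store?uid=789&sku=123"))
```

Both sample URLs come out as the same address, so links from other sites and the copy in the index point at one page instead of two.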
See? There are a lot of ways to improve your site
ranking, and they're all relatively easy. So why on
earth would you ever pay someone else to do it?