One of the major search engines, Excite, ignores any information in
META tags, and does so on purpose. What could be the rationale
for such a decision?
The reason stated on Excite's "Getting
Listed" page is that META tags can be used by spammers
to improve their rankings in an unfair way. For Excite,
attempting to make a page appear any different to search spiders
than to human users is an unfair practice. Indeed, nobody can
guarantee that the keywords you enter are those describing your
content, and in principle, you can easily use popular keywords to
inflate your hits without any improvement of the page content.
At first glance, this position might seem logical. But is it?
Remember that I can easily put any number of "hot" keywords onto the
page itself, and if I don't want to distract readers with this
promotion machinery, I can make them invisible by painting them with
the background color (as many spammers do already, simply because
META tags don't allow them to enter too many keywords). After
all, spiders will always index what I want them to, and banning
one of the weapons can only escalate the arms race.
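To make the background-color trick concrete, here is a minimal Python sketch of what such a page looks like and how naively a spider might try to catch it. Everything in it is an assumption made for illustration: the sample page, the `HiddenTextFinder` class, and the rule itself (flag text inside a FONT tag whose color equals the BODY background). No search engine is known to work this way.

```python
from html.parser import HTMLParser


class HiddenTextFinder(HTMLParser):
    """Hypothetical helper: flag text whose FONT color equals the BODY bgcolor."""

    def __init__(self):
        super().__init__()
        self.bgcolor = None     # background color declared on BODY, if any
        self.color_stack = []   # colors of the currently open FONT tags
        self.hidden = []        # text runs that match the background color

    def handle_starttag(self, tag, attrs):
        fields = dict(attrs)
        if tag == "body":
            self.bgcolor = (fields.get("bgcolor") or "").lower() or None
        elif tag == "font":
            self.color_stack.append((fields.get("color") or "").lower() or None)

    def handle_endtag(self, tag):
        if tag == "font" and self.color_stack:
            self.color_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self.color_stack or self.bgcolor is None:
            return
        # Only the innermost FONT tag is checked; a simplification.
        if self.color_stack[-1] == self.bgcolor:
            self.hidden.append(text)


# Made-up page: one run of white-on-white "hot" keywords, one normal sentence.
SAMPLE_PAGE = """<html><body bgcolor="#FFFFFF">
<p>Welcome to my orchid page.</p>
<font color="#ffffff">free cheap best hot download</font>
<font color="#000000">Orchids need bright light.</font>
</body></html>"""

finder = HiddenTextFinder()
finder.feed(SAMPLE_PAGE)
finder.close()
print(finder.hidden)
```

A check this simple is trivially evaded, for example with a color one shade away from the background, which is the arms-race point again.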
Excite's policy is based on the assumption that each page has its
intrinsic "value," and that this value is evident from reading the
text on the page. If this is true, then it's natural to require that
spiders, in order to assign a fair "relevance" value, get
exactly the same text as human readers. But it is also silently
assumed here that a spider can read, understand, and evaluate the text just
as humans do. This is where the main fallacy of this approach lies.
The main purpose of a META tag is to provide some
information about the document, and the tag does it mostly for
computers that cannot deduce this information from the document
itself. Keywords and descriptions, for example, are supposed to present
the main concepts and subjects of the text, and no computer program
can yet compile a meaningful summary or list of keywords for a given
document. (In this context, it's interesting to note that Excite is
the only search engine to employ an artificial intelligence algorithm
for compiling summaries based on the document text.)
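As a rough illustration of how little work a spider has to do when the author supplies this information, the following Python sketch reads the keywords and description from a page's META tags. The sample page, the `MetaTagReader` class, and the `read_meta` function are all invented for this example; it is a sketch using the standard `html.parser` module, not a description of any real search engine.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Hypothetical helper: collect NAME/CONTENT pairs from META tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names in lowercase.
        if tag != "meta":
            return
        fields = dict(attrs)
        name = fields.get("name")
        content = fields.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def read_meta(html_text):
    """Return a dict of META name -> content for the given HTML text."""
    reader = MetaTagReader()
    reader.feed(html_text)
    reader.close()
    return reader.meta


# Made-up page used only to exercise the sketch.
SAMPLE_PAGE = """<html><head>
<title>Growing Orchids Indoors</title>
<META NAME="keywords" CONTENT="orchids, houseplants, indoor gardening">
<META NAME="description" CONTENT="A beginner's guide to growing orchids at home.">
</head><body><p>Orchids need bright, indirect light.</p></body></html>"""

meta = read_meta(SAMPLE_PAGE)
keywords = [k.strip() for k in meta.get("keywords", "").split(",") if k.strip()]
print(keywords)
print(meta.get("description"))
```

Note that the program takes the author's word for it. Nothing here checks that the declared keywords actually describe the body text.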
True, the META mechanism is open to abuse, but
so far it's the only technique capable of helping computers better
understand human-produced documents. We will have no choice but to
rely on some sort of META information until computers achieve
a level of intelligence comparable to that of human beings.
In view of this, it is interesting to discuss the latest
development in the field of meta-information, the Meta Content
Framework (MCF). This language is used for describing meta-information:
the properties, connections, and interrelations of documents, sites,
channels, subject
categories, and other information objects. MCF was developed by
Netscape and has been submitted as a draft standard to the World Wide Web
Consortium (W3C).
MCF may be useful for maintainers of closed information systems,
such as intranets, corporate and scientific databases, and so on. Its
main promise, however, is the capability to build a meta-information
network for the entire Web. Unfortunately, given the controversial
position of the rather primitive META tags of today, it is
not very likely that the sophisticated tools of MCF, even if approved
by W3C, will gain any widespread recognition.