Thursday, June 19, 2008

Spamming Techniques Overview

Invisible Text: Hiding keywords by using the same color font and background is one of the oldest tricks in the spammers' book. These days, it's also one of the most easily detected by search engines.
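
To illustrate why this trick is easy to detect, here is a minimal sketch of a check that compares the text color and background color declared in an inline style. The function names are hypothetical, and the sketch assumes colors are written as hex values or identical keywords; a real crawler would also have to resolve stylesheets and inherited colors.

```python
import re


def parse_inline_style(style: str) -> dict:
    """Turn 'color: #fff; background-color: #FFF' into a lowercase dict."""
    props = {}
    for declaration in style.split(";"):
        if ":" in declaration:
            name, value = declaration.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props


def normalize_hex(color: str) -> str:
    """Expand 3-digit hex colors (#fff) to 6 digits (#ffffff)."""
    if re.fullmatch(r"#[0-9a-f]{3}", color):
        return "#" + "".join(ch * 2 for ch in color[1:])
    return color


def is_invisible(style: str) -> bool:
    """True when text color and background color are both set and equal."""
    props = parse_inline_style(style)
    fg = props.get("color")
    bg = props.get("background-color")
    if fg is None or bg is None:
        return False
    return normalize_hex(fg) == normalize_hex(bg)
```

For example, `is_invisible("color:#fff; background-color:#FFFFFF")` is true, while a style that sets only `color` is not flagged.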

Keyword Stuffing: Keyword stuffing means repeating keywords over and over again, usually at the bottom of the page (tailing) in a tiny font, or within meta tags or other hidden tags.
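
As a rough illustration, repetition of this kind shows up as an unusually high keyword density. The sketch below computes the share of the page's words taken by the most frequent word. The function names and the 20% threshold are assumptions made for this example, not values published by any search engine.

```python
import re
from collections import Counter


def keyword_density(text: str, top_n: int = 3):
    """Return the top_n most frequent words with their share of all words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return []
    total = len(words)
    counts = Counter(words)
    return [(word, count / total) for word, count in counts.most_common(top_n)]


def looks_stuffed(text: str, threshold: float = 0.2) -> bool:
    """Flag text whose single most frequent word exceeds the threshold.

    The threshold is an illustrative assumption, not a known cutoff.
    """
    density = keyword_density(text, top_n=1)
    return bool(density) and density[0][1] > threshold
```

A real filter would ignore common stop words such as "the" and look at phrases as well as single words.
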
Unrelated Keywords: Never use popular keywords that do not apply to your site's content. You might trick a few people searching for those words into clicking on your link, but they will leave quickly when they see that you have no information on the topic they were searching for. If you have a site about medical science and your keywords include "Shahrukh Khan" and "Britney Spears", those are unrelated keywords.

Hidden Tags: The use of keywords in hidden HTML tags like comment tags, style tags, http-equiv tags, hidden value tags, alt tags, font tags, author tags, option tags, noframes tags (on sites not using frames).

Duplicate Sites: Duplicating content is also considered search engine spamming. Some people copy a site's content and publish it under a different name, but search engines find such copies easily and mark them as spam. Don't duplicate a web page or doorway page, give the copies different names, and submit them all. Mirror pages are regarded as spam by all search engines and directories.
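
One common way to spot near-duplicate pages is to compare overlapping word sequences (shingles). The sketch below is only an illustration of that idea, with hypothetical function names; it is not a description of how any particular search engine detects mirrors.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word sequences that occur in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 0))}


def similarity(page_a: str, page_b: str, k: int = 3) -> float:
    """Jaccard similarity of the two pages' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(page_a, k), shingles(page_b, k)
    if not a and not b:
        return 0.0  # pages too short to compare
    return len(a & b) / len(a | b)
```

Two copies of the same page score 1.0 even if they sit on differently named sites, which is why renaming a mirror does not hide it.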


Link Farms: A link farm is a network of pages on one or more Web sites, heavily cross-linked with each other, with the sole intention of improving the search engine ranking of those pages and sites.

Many search engines consider the use of link farms or reciprocal link generators as spam. Several search engines are known to kick out sites that participate in any link exchange program that artificially boosts link popularity.
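
Heavy cross-linking can be measured on a link graph. The sketch below computes the share of links that are reciprocated, under the assumption that the graph is given as a dictionary mapping each page to the set of pages it links to. The function name is hypothetical, and a high ratio is only a hint, not proof of a link farm.

```python
from typing import Dict, Set


def reciprocal_ratio(links: Dict[str, Set[str]]) -> float:
    """Fraction of links A -> B for which the reverse link B -> A also exists."""
    edges = {
        (src, dst)
        for src, targets in links.items()
        for dst in targets
        if src != dst  # ignore self-links
    }
    if not edges:
        return 0.0
    reciprocated = sum(1 for (src, dst) in edges if (dst, src) in edges)
    return reciprocated / len(edges)
```

A group of pages that all link to each other gives a ratio of 1.0, while a chain of ordinary one-way references gives 0.0.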

Links can be used to deliver both types of search engine spam, i.e. both content spam and meta spam.

Link content spam

When a link exists on a page A to page B only to affect the hub component of page A or the authority component of page B, that is an example of content spam on page A. Page B is not spamming at all. Page A should receive a spam penalty. Without further evidence, page B should not receive a penalty.

Link meta spam
When the anchor text or title text of a link either mis-describes the link target, or describes the link target using incoherent language, that is an example of link meta spam.

Repetitive Submitting: Each search engine has its own limits on how many pages can be submitted and how often. Do not submit the same page to the same search engine more than once a month, and don't submit too many pages each day. Never submit doorways to directories.

Redirects: Do not list sites using URL redirects. These include welcome.to, i.am, go.to, and others. The complete site should be hosted on the same domain as the entry page. An exception may be made for sites that include a remotely hosted chat or message board, as long as the bulk of the site is hosted on its own domain. Page redirection was not developed for spam, but it is becoming a popular spamming technique.

There are many means of redirecting from one Web page to another. Examples of redirection methods are HTTP 300 series redirect response codes, HTTP 400 series error vectors, META REFRESH tags and JavaScript redirects. As studied earlier, these are used to move a visitor from one page to another instantly, before the visitor has time to see the first page. In this case the page made for the search engine is spam. Everything on it is an example of either content spam or meta spam.
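
Of the methods listed, the META REFRESH tag is the simplest to spot in a page's source. The following sketch uses Python's standard `html.parser` module to list the refresh delay and target URL found in a page. The class and function names are hypothetical, and the sketch assumes the usual `delay; url=target` form of the `content` attribute.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collects (delay_seconds, target_url) for each META REFRESH tag."""

    def __init__(self):
        super().__init__()
        self.refreshes = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        delay_part, _, rest = attr.get("content", "").partition(";")
        try:
            delay = float(delay_part.strip())
        except ValueError:
            return  # malformed content attribute, skip it
        rest = rest.strip()
        url = rest[4:].strip() if rest.lower().startswith("url=") else None
        self.refreshes.append((delay, url))


def find_meta_refresh(html: str):
    finder = MetaRefreshFinder()
    finder.feed(html)
    finder.close()
    return finder.refreshes
```

A delay of zero seconds with a target on another page is the pattern described above: the visitor is moved on before the first page can be read.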

Alt Text Spamming: Alt text spamming is stuffing the alt text of images with unrelated keywords or phrases. A related trick, tiny text, consists of placing keywords and phrases in the tiniest text imaginable all over your site. Most people can't see them, but spiders can.

Doorway Pages: Doorways are pages optimized only for search engine spiders in order to attract more spiders, and thus more users. They are usually optimized for just one word or phrase and are meant only for spiders, not users.

Content Spam: Content spam is possible because different URLs can deliver the same content (content duplication) and the same URL can deliver different content. Both HTML and HTTP support this, and hence spamming is possible. For example, IMG support and ALT text within HTML mean that image-enabled visitors to a URL will see different content from visitors who, for various reasons, cannot view images. Whether the ability to deliver spam results in the delivery of spam is largely a matter of knowledge and ethics.

Agent based Spam: Agent based delivery is not spam in itself. It becomes spam when it is used to identify search engine robots by user agent and deliver unique content to those robots. Since that content is created only for search engines and is not visible to users, it is always spam.
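
The idea can be sketched as a comparison of what one URL returns for two different user agents. In the example below the user agent strings, the function names and the two stand-in "servers" are all hypothetical; `fetch` is any callable that takes a URL and a user agent and returns the page body, so the sketch runs without network access.

```python
from typing import Callable

# Hypothetical user agent strings, used only for this illustration.
BROWSER_UA = "Mozilla/5.0 (hypothetical browser)"
ROBOT_UA = "ExampleBot/1.0 (hypothetical crawler)"


def serves_different_content(url: str, fetch: Callable[[str, str], str]) -> bool:
    """True when the robot and the browser receive different page bodies."""
    return fetch(url, BROWSER_UA) != fetch(url, ROBOT_UA)


def cloaking_server(url: str, user_agent: str) -> str:
    """Stand-in for a site that detects robots by user agent."""
    if "bot" in user_agent.lower():
        return "keyword keyword keyword"
    return "Welcome to our shop"


def honest_server(url: str, user_agent: str) -> str:
    """Stand-in for a site that serves everyone the same page."""
    return "Welcome to our shop"
```

Here `serves_different_content("http://example.com/", cloaking_server)` is true and the same call with `honest_server` is false. A difference alone does not prove spam, since agent based delivery has legitimate uses and dynamic pages can change between requests.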

IP Spam: Identifying search engine robots by IP name or address and delivering unique content to those robots is considered spamming. As with agent based spam, the technique is spam when the unique content is delivered only to search engines and not to users or visitors.

No Content: If a site does not offer visitors any unique and relevant content, search engines can consider it spam. On that note, illegal content, duplicate content and sites consisting largely of affiliate links are also considered to be of low value to search engine relevancy.

Meta Spam: Meta data is data that describes a resource. Meta spam is data that mis-describes a resource or describes a resource incoherently in order to manipulate a search engine's relevancy calculations.

Think again about the ALT tag. Not only does it provide content for an HTML resource, it also provides a description of an image resource. In this descriptive capacity, to mis-describe an image or to describe it incoherently is meta spam. Perhaps the best examples of meta spam at present can be found in the <head> section of HTML pages. Remember, though, it’s only spam if it is done purely for search engine relevancy gain.

Meta spam is more abstract than content spam. Rather than discuss it in abstract terms, we will take some examples from HTML and XML/RDF in order to illustrate meta spam and where it differs from and crosses with content spam.

Generally, anything within the <head> section of an HTML document, or anything within the <body> section that describes another resource, can be subverted to deliver meta spam.


To make sure that you are not spamming, you need to check a few things. First and foremost, you should know whether your content is really valuable to your customers and visitors. Tricks to attract more visitors will not help you even in the short term. Build websites according to users’ tastes and preferences. Always remember that Internet users are information seekers and they want the latest content all the time. So think and build a site as if there were no search engines. Avoid automated pages. Google and many other search engines do not index auto-generated pages.

Inktomi does accept information pages into their free index and into their paid inclusion programs. For example, if a site contains PDF documents, and you create an information page in HTML with an abstract of each PDF document, that HTML page is acceptable to Inktomi.
