Cloaking:
As search engine optimization evolved and search engines became more sophisticated, webmasters came up with many techniques to rank their sites. Cloaking is one of them. Making a web site both user friendly and search engine friendly is difficult and time consuming, so webmasters came up with the idea of cloaking: delivering one page to search engines for indexing while serving an entirely different page to everyone else. More generally, cloaking is the process of serving different versions of a page based on identifiable information about the user. Most often the choice is based on the user agent name and/or the IP address (ISP host).
There is no clear consensus on whether cloaking is ethical or unethical. Either way, it tricks spiders, and any attempt to trick a search engine is considered spam. Hence cloaking is not regularly practiced. A simple way to check whether a web page uses cloaking is to look at the cache. Google has a link called Cached next to almost every search result. The cache shows the web page as it was indexed by the search engine. If the page you reach from the SERPs differs from the cached version, there is a possibility that the website is using cloaking (or that the page has simply changed since it was indexed).
As we all know, people want to make web sites user centric. They want their site to be beautiful, attractive and interactive enough to engage visitors. This certainly enhances the user experience, but it does not serve the optimization purpose, so webmasters turn to cloaking to optimize such sites. A few of the factors that lead a webmaster to think of cloaking are explained below.
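The cache comparison described above can be roughed out in a few lines. This is only a sketch: it assumes you have already saved the cached copy and the live copy as HTML strings, and the helper names (visible_text, similarity) are made up for illustration. A low score hints at cloaking but does not prove it.

```python
import difflib
import re


def visible_text(html: str) -> str:
    """Crudely strip tags and collapse whitespace (not a full HTML parser)."""
    text = re.sub(r"<[^>]+>", " ", html)
    return " ".join(text.split()).lower()


def similarity(cached_html: str, live_html: str) -> float:
    """Return a 0..1 ratio of how alike the visible text of two pages is."""
    matcher = difflib.SequenceMatcher(
        None, visible_text(cached_html), visible_text(live_html)
    )
    return matcher.ratio()


# Illustrative pages: the cached copy is keyword text, the live copy is a Flash shell.
cached = "<html><body><h1>Cheap widgets</h1><p>Buy cheap widgets online.</p></body></html>"
live_same = cached
live_other = "<html><body><object data='intro.swf'></object></body></html>"

print(similarity(cached, live_same))
print(similarity(cached, live_other))
```

Where to draw the line between "changed a little" and "entirely different page" is a judgment call, so the sketch returns the raw ratio instead of a yes/no answer.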
Use of Flash/Splash/Videos:
The days of plain HTML are gone and the days of Flash are in! Many sites are built using Flash, which is a total no-no for search engines. So what can be done when a site has no plain text and spiders cannot read its Flash? The solution is to create a simple HTML text document for search engines and Flash pages for visitors. Google has only recently started to index Flash pages, and the rest of the search engines (SEs) do not.
Websites containing Images:
Many sites are full of pictures and images, with image galleries and the like. On these image-oriented sites the proportion of images is higher than that of text, so they have little chance of ranking high on the SERPs. Hence cloaking is the first thing that comes to mind for optimizing these pages.
HTML Coding:
Often a page contains more HTML code than text. This, again, does not suit search engine optimization, which calls for a substantial amount of text and less HTML markup. In this case, rather than recoding the entire website, webmasters found cloaking to be the best option.
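To get a feel for the "more code than text" problem, you can measure how much of a page is visible text. The sketch below uses Python's standard html.parser module. The text_ratio function and the sample pages are illustrative assumptions, and no search engine publishes a threshold for this number.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text nodes, skipping the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def text_ratio(html: str) -> float:
    """Return visible text length divided by total page length (0.0 for empty input)."""
    if not html:
        return 0.0
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    text = " ".join(" ".join(collector.chunks).split())
    return len(text) / len(html)


heavy = '<table><tr><td><font face="Arial" size="2"><b>Hi</b></font></td></tr></table>'
light = "<p>Search engines read this sentence of plain text.</p>"
print(round(text_ratio(heavy), 2), round(text_ratio(light), 2))
```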
Now that you know why, it's time to find out how. On an Apache server, cloaking is done by modifying a file called .htaccess. Apache has a module called "mod_rewrite", and with its help the .htaccess file can apply cloaking to your web pages.
Webmasters gather search engines' IP addresses (for example 192.0.2.13) or User-Agents (for example Googlebot). If mod_rewrite detects that an IP address or User-Agent belongs to a search engine, it delivers a web page designed especially for SEs. If the request does not match any spider, it is treated as a regular visitor and gets the normal web page.
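The decision described above boils down to a lookup. Here it is sketched in Python rather than in mod_rewrite rules, purely to show the logic. The token list, the IP address (a reserved documentation address, not a real spider) and the file names spider.html and visitor.html are all placeholders.

```python
# Hypothetical lists; a real deployment would have to collect and maintain these.
SPIDER_AGENT_TOKENS = ("googlebot",)  # matched case-insensitively
SPIDER_IPS = {"192.0.2.13"}           # documentation address, not a real spider


def is_spider(user_agent: str, ip: str) -> bool:
    """True if either the IP or the User-Agent matches a known spider."""
    ua = user_agent.lower()
    return ip in SPIDER_IPS or any(token in ua for token in SPIDER_AGENT_TOKENS)


def choose_page(user_agent: str, ip: str) -> str:
    """Return the (hypothetical) file that would be served for this request."""
    return "spider.html" if is_spider(user_agent, ip) else "visitor.html"


print(choose_page("Mozilla/4.0 (compatible; MSIE 6.0)", "203.0.113.5"))
```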
There are 5 types of cloaking:
User Agent Cloaking (UA Cloaking)
IP Address Cloaking (IP Cloaking)
IP and User Agent Cloaking (IPUA Cloaking)
Referrer Based Cloaking
Session Based Cloaking
All five have unique applications and purposes, yet all five can fit nicely within one program.
User Agent cloaking is good for taking care of specific agents:
WAP/WML pages for the cell phone crowd.
ActiveX for the IE crowd.
Quality CSS for the Moz and Opera crowd.
A nice black screen for the WebTV'ers.
Specialty content for agents (e.g. NoSmartTags, Googlebot noarchive).
There is no sense in sending out JS, Java, or Flash that a user can't actually run.
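Serving by agent, as in the list above, can be pictured as an ordered lookup table. This is a minimal sketch: the tokens and variant names are illustrative assumptions, not a complete or current list of agents.

```python
# Ordered (token, variant) pairs; the first token found in the User-Agent wins.
# Tokens and variant names are illustrative assumptions.
VARIANTS = [
    ("googlebot", "noarchive"),  # specialty content for a spider
    ("webtv", "webtv"),          # dark-background layout for WebTV
    ("opera", "css"),            # richer CSS for Opera
    ("msie", "activex"),         # ActiveX-enabled page for Internet Explorer
]
DEFAULT_VARIANT = "plain"        # nothing the agent might be unable to run


def pick_variant(user_agent: str) -> str:
    """Return the name of the page variant for this User-Agent string."""
    ua = user_agent.lower()
    for token, variant in VARIANTS:
        if token in ua:
            return variant
    return DEFAULT_VARIANT


print(pick_variant("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) Opera 7.54"))
```

The order of the table matters: older Opera and WebTV agent strings also contain "MSIE", so the more specific tokens are checked first.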
IP Address cloaking is good for taking care of demographic groups:
Language file generation for various countries.
Advertising delivery based on geo data.
Pages built for broadband users.
Low-impact pages for overseas users.
User time-of-day determination and custom content based on time-of-day and geo data (news, sports, weather, etc.).
Specifically targeting demographic groups such as AOL, Mindspring, et al.
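Delivery by IP address, as in the list above, needs a table that maps address ranges to groups. The sketch below uses Python's standard ipaddress module. The network-to-language table is made up from reserved documentation ranges, since real geo data would have to come from a maintained database.

```python
import ipaddress

# Hypothetical network-to-language table built from documentation ranges;
# real geo data would come from a maintained database.
LANGUAGE_BY_NETWORK = [
    (ipaddress.ip_network("198.51.100.0/24"), "de"),
    (ipaddress.ip_network("203.0.113.0/24"), "fr"),
]
DEFAULT_LANGUAGE = "en"


def language_for(ip: str) -> str:
    """Return the language code for a visitor IP; raises ValueError on a malformed IP."""
    address = ipaddress.ip_address(ip)
    for network, language in LANGUAGE_BY_NETWORK:
        if address in network:
            return language
    return DEFAULT_LANGUAGE


print(language_for("198.51.100.7"))
```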
IP and Agent cloaking is good for a combination of the above:
Custom content for AOL'ers using WAP phones.
Ads based upon geo data and user agent support.
The possibilities for targeting are almost endless. You'll run out of ways to reroll it before you run out of IPs and agents to serve.
Indexability. Just getting your site fully indexed can be a challenge in some environments (Flash, Shockwave).
Referrer based cloaking bases delivery on specific referral strings. It is good for:
Content generation such as overriding frames (about.com, Ask Jeeves, and the Google image cache).
Preventing unwanted hotlinking to your graphics.
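Hotlink prevention is the easiest referrer check to picture. This is a minimal sketch, assuming the site lives on example.com and that requests with no Referer header should be let through. The OWN_HOSTS set and the allow_image function are hypothetical names.

```python
from typing import Optional
from urllib.parse import urlsplit

OWN_HOSTS = {"example.com", "www.example.com"}  # hypothetical site hosts


def allow_image(referer: Optional[str]) -> bool:
    """Allow requests with no Referer, or with a Referer on our own hosts."""
    if not referer:
        return True  # the header is often missing, so do not punish its absence
    host = urlsplit(referer).hostname
    return host is not None and host.lower() in OWN_HOSTS


print(allow_image("http://www.example.com/gallery.html"))
print(allow_image("http://other.example.net/page.html"))
```

Letting empty referrers through is a design choice: blocking them would also block visitors whose browser or proxy strips the header.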
Session based cloaking. Sites that use session tracking (either from IP or cookies) can do incredible things with content. We've all seen session cloaking in action on dynamic sites where custom content was generated for us.
The internet has just scratched the surface here.
Cloaking is the gatekeeper that serves your site in its best light and protects your custom code from prying eyes.
Search engine cloaking is just one aspect of a much bigger picture. This is why search engines can't even consider banning cloaking. It is so widespread and pervasive that they'd have to delete a quarter of the domains in their indexes, and those would be the best sites they have listed.
Any time you hear a search engine talking about banning cloaking, listen to them very closely, and remember: if they'd tell a bold-faced lie about something so pervasive, what are they doing with the really important stuff? They can't be trusted, nor can those who are out here carrying their water.
With the assault of rogue spiders that most sites are under, the growing trend of framing, and agents that threaten your hrefs (Smart Tags), I think cloaking has a very bright future. The majority of the top 2000 sites on the net use some form of the above styles of cloaking (including ALL major search engines).
Doorway Pages:
Just like cloaked pages, doorway pages are created specially for search engines; the difference is that they are ‘gateway’ or ‘bridge’ pages. They are created to do well for particular phrases, and they are programmed to be visible only to specific search engine spiders. They are also known as portal pages, jump pages, gateway pages and entry pages. Doorway pages are built specifically to draw search engine visitors to your web site. They are standalone pages designed only to act as doorways to your site. Doorway pages are a very bad idea for several reasons, though many SEO firms use them routinely.
Doorway pages have acquired something of a bad reputation due to the frequent use (and abuse) of doorways in spamming the search engines. The most flagrant abuses include mass production of machine-generated pages with only minor variations, sometimes using re-direction or cloaking so the visitor does not see the actual page requested. Doorways used in this manner add to the clutter that search engines and Web searchers must contend with.
The purpose behind building doorway pages is just to trick search engines into higher rankings, so doorway pages are considered an unethical SEO practice. The fact is that doorway pages don't do a very good job of generating traffic, even when they are done by "experts." Many users simply hit their back buttons when presented with a doorway page. Still, many SEO firms count those first visits and report them to their clients as successes, even though very few of these visitors go on to visit the product pages.
There are various ways to deliver doorway pages. Let's check them one by one.
Low Tech Delivery:
When webmasters create and submit a page targeted toward a particular phrase, it is called Low Tech Delivery. Sometimes webmasters also create pages for specific search engines. The problem is that the user doesn't arrive at the desired page, and a visitor who lands on a non-informative page is unlikely to navigate any further.
In such a case the meta refresh tag plays a vital role. It is an HTML tag that automatically refreshes the page, or loads another one, after a defined time. The meta refresh tag used here has a zero-second delay, so the user most likely won't be able to see the optimized content before being sent elsewhere. These META tags are also a red flag to search engines that something may be wrong with the page. Because jump pages manipulate results and clutter indexes with redundant text, they are banned by search engines.
Nowadays search engines don't accept meta refresh tags. To get around that, some webmasters submit a page, then swap it on the server with the "real" page once a position has been achieved.
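Since the zero-second meta refresh is the red flag mentioned above, here is roughly how a checker could spot one. This is a sketch using Python's standard html.parser. The class and function names are made up, and real search engines do not publish how they actually detect these tags.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collect (delay_seconds, target_url) pairs from meta refresh tags."""

    def __init__(self):
        super().__init__()
        self.refreshes = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("http-equiv", "").lower() != "refresh":
            return
        delay_part, _, rest = attributes.get("content", "").partition(";")
        try:
            delay = float(delay_part.strip())
        except ValueError:
            return  # malformed content attribute; ignore it
        url = rest.strip()
        if url.lower().startswith("url="):
            url = url[4:].strip()
        self.refreshes.append((delay, url))


def has_instant_redirect(html: str) -> bool:
    """True if the page has a zero-delay meta refresh pointing at another URL."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    finder.close()
    return any(delay == 0 and url for delay, url in finder.refreshes)


doorway = (
    '<html><head><meta http-equiv="refresh" '
    'content="0; url=http://www.example.com/real.html"></head>'
    "<body>keyword keyword keyword</body></html>"
)
print(has_instant_redirect(doorway))
```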
This is "code-swapping," which is also sometimes done to keep others from learning exactly how the page ranked well. It's also called "bait-and-switch." The downside is that a search engine may revisit at any time, and if it indexes the "real" page, the position may drop.
But there is another problem with these pages. As they are targeted to key phrases, they can be very generic in nature, so they can easily be copied and used on other sites. And once they are copied, the fear of being banned is always there.
Agent Delivery:
The next step up is to deliver a doorway page that only the search engine sees. Each search engine reports an "agent" name, just as each browser reports a name. An agent is a browser or any other piece of software that can approach web servers and browse their content, for example Microsoft Internet Explorer, Netscape, or search engine spiders.
The advantage of agent name delivery is that you can send the search engine to a tailored page yet direct users to the actual content you want them to see. This eliminates the "bridge" problem altogether. It also has the added benefit of "cloaking" your code from prying eyes.
But a problem remains. Someone can telnet to your web server and report their agent name as being from a particular search engine. Then they see exactly what you are delivering. Additionally, some search engines may not always report the exact same agent name, specifically to help keep people honest.
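Reporting a spider's agent name does not even need telnet. The sketch below builds such a request with Python's standard urllib.request but deliberately stops short of sending it. The agent string is only an example of what a spider might report, and example.com is a placeholder.

```python
import urllib.request

# Example agent string only; spiders may report other names or change them.
SPIDER_UA = "Googlebot/2.1 (+http://www.google.com/bot.html)"


def build_spoofed_request(url: str, user_agent: str = SPIDER_UA) -> urllib.request.Request:
    """Build (but do not send) a GET request that reports the given agent name."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_spoofed_request("http://www.example.com/")
print(req.get_method(), req.full_url)
print(req.get_header("User-agent"))  # urllib stores header names capitalized this way
# Passing req to urllib.request.urlopen would send it; omitted to avoid network access.
```

This is exactly why agent name delivery alone protects nothing: the agent name is whatever the client says it is.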
IP Delivery / Page Cloaking:
Time for one more step up. Instead of delivering by agent name, you can also deliver pages to the search engines by IP address, assuming you've compiled a list of them and maintain it. IP delivery is a technique for presenting different content depending on the IP address of the client.
Everyone and everything that accesses a site reports an IP address, which is often resolved into a host name. For example, I might come into a site while connected to AOL, which in turn reports an IP of 199.204.222.123. The web server may resolve the IP address into a host name: ww-tb03.proxy.aol.com, for example.
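Once an IP has been resolved to a host name, grouping visitors is a matter of matching the end of that name. This is a sketch only: the suffix table is an assumption (the AOL suffix mirrors the host name above), and the reverse lookup itself is left out because it needs a live DNS query. Python's socket.gethostbyaddr could supply the host name.

```python
# Hypothetical suffix-to-group table; the AOL suffix mirrors the host name in the text.
GROUP_BY_SUFFIX = {
    ".proxy.aol.com": "aol",
    ".googlebot.com": "spider",
}


def group_for_host(hostname: str) -> str:
    """Classify a resolved host name by its suffix; unknown hosts are plain visitors."""
    host = hostname.lower().rstrip(".")
    for suffix, group in GROUP_BY_SUFFIX.items():
        if host.endswith(suffix):
            return group
    return "visitor"


print(group_for_host("ww-tb03.proxy.aol.com"))
```

The leading dot in each suffix is deliberate, so that a look-alike domain such as fakegooglebot.com does not match.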