What’s In A Name: The State of Typo-Squatting 2007

Introduction

By the end of 2007, at least 8,000 URLs using the word iphone will be registered, according to a well-known domain expert. The most valuable, iphone.com, is owned by Apple itself, but when Steve Jobs announced the product in early 2007, Apple did not yet own the iphone domain. One expert estimates that Apple paid at least $1 million to buy that piece of valuable Web real estate.

Among the 8,000 registered URLs incorporating iphone are community fan sites, rumor and hack sites and, of course, scam sites. Freeappleiphonesnow dot com claims to offer free iPhones and variants that don’t even exist (like the iPhone “shuffle” and “nano”). The URL is nothing more than a redirect to royalsweeps dot com. When we tested the site, we received debt consolidation offers, get-rich-quick solicitations, “free” cell phone prizes and other questionable e-mail.

Caption: Example of a site which uses non-existent iPhone models to lure users into providing an e-mail address

Many of the iphone-related domains are misspellings, or typos. Iohone dot com, for example, was registered on January 9, 2007, the day Apple officially announced the iPhone. In August 2007, the site consisted of pay-per-click ads for iPhone-related Web sites.

Caption: Example of a typical typo-squatter page using an “iPhone” misspelling

Typo- and Cyber-squatting on the rise

Apple is not alone in enduring an explosion of third-party domain registrations related to a trademarked product. Typo-squatting, the practice of registering domains using common misspellings of popular brands, products and people in order to profit from consumer typing errors, is increasing dramatically.

Cybersquatting cases filed with the World Intellectual Property Organization’s (WIPO) arbitration system increased 20% in 2005 and another 25% in 2006.

Microsoft says that “on an average day more than 2,000 domain names are registered that contain Microsoft trademark terms.”

According to the U.S. Government Accountability Office, at least 8.65% of all domain names are registered with false or incomplete Whois information, a practice that makes domain squatting easier.

More recently, in September 2007, the managers of the .eu top level domain suspended 10,000 domains registered by a Chinese woman who was accused of being a cyber-squatter.

Key Findings

In an effort to further quantify and understand this phenomenon, McAfee studied 1.9 million typographical variations of 2,771 of the most popular and well known Web sites. Of these, we found 127,381 suspected typo-squatters.

Among McAfee’s key findings are the following:

  • Typo-squatting is vast and common, affecting every segment of the Web. 7.2% of the possible typographical errors we studied were actively squatted. In other words, a typical consumer who misspells a popular Web site URL has a 1 in 14 chance of landing at a likely typo-squatter site.
  • The five most highly squatted categories are game sites (14.0%), airlines (11.4%), mainstream media company sites (10.8%), adult sites (10.2%) and technology and Web 2.0 related sites (9.6%).
  • Children’s sites are highly targeted by typo-squatters. The average for the category is 8.4% and 24 of the top most squatted sites are children’s properties for kids 12 and under. Add in sites like MySpace and Miniclip and more than 60 of the top most squatted sites are properties that appeal to the 18 and under demographic.
  • Squatters follow consumer crowds. Popular, consumer-focused Web sites typically attract more squatters than business-to-business sites or niche content sites.
  • The incidence of pornographic content on non-adult typo-squatted sites is just 2.4%, suggesting improvement since previous studies by other researchers.
  • Automated ad syndication services like Google’s AdSense enable a significant minority of typo-squatter sites to generate revenue. Google-enabled advertising shows up on 19.3% of all suspected typo-squatter sites in this study. Yahoo-enabled advertising shows up on 4.4% of all suspected typo-squatter sites.
  • The increasing use of automation to buy and sell vast numbers of domains and the 5-day free trial (known as “tasting”) for new registrations in top level domains like dot-com appear to be two significant factors in the rapid growth of typo-squatting.
  • At 3.4%, sites popular outside the U.S. are less than half as likely to be typo-squatted as overall sites.
  • The five non-U.S. countries most likely to have popular sites squatted are the United Kingdom (7.7%), Portugal (6.5%), Spain (5.9%), France (5.4%), and Italy (4.1%).
  • The five non-U.S. countries least likely to have popular sites squatted are the Netherlands (1.5%), Israel (1.1%), Denmark (1.0%), Brazil (0.9%) and Finland (0.1%).
  • The top five parking companies, ranked by the percentage of squatters parked by them, are Information (28.5%), Hitfarm (11.3%), Domainsponsor (2.9%), Sedo (2.5%) and GoDaddy (2.3%). Together, the top five park 47.5% of the squatters we discovered.
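
Two of the figures in the findings above can be checked with simple arithmetic. The short sketch below only restates numbers already quoted in the list; the variable names are illustrative and not part of the study.

```python
# Overall squatting rate quoted in the key findings (7.2%).
overall_rate = 0.072

# Expected number of mistyped URLs per squatter hit: roughly "1 in 14".
one_in_n = 1 / overall_rate

# Share of suspected squatters parked by each of the top five parking companies.
parking_shares = {
    "Information": 28.5,
    "Hitfarm": 11.3,
    "Domainsponsor": 2.9,
    "Sedo": 2.5,
    "GoDaddy": 2.3,
}

# Combined share of the top five, rounded to one decimal place.
top_five_total = round(sum(parking_shares.values()), 1)

print(f"1 in {one_in_n:.1f}")
print(f"Top five combined: {top_five_total}%")
```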

Methodology

First, McAfee collected a list of sites based on the most popular and common sites visited by typical consumers. A total of 2,771 target sites were collected from a variety of different sources.


Then, McAfee generated permutations (different misspellings) of each of the 2,771 target domains. Among the eight methods we used to generate permutations were:

  • Swapped Characters – Swap characters one at a time. Example: yuotube.com.
  • Replaced Characters – Replace characters one at a time. Example: wschovia.com.
  • Inserted Characters – Insert one character. Example: Newgroounds.com.
  • Deleted Characters – Remove one character at a time. Example: cartonnetwork.com.
  • Missing dot – Remove the dot between the “www” and the domain. Example: wwwmicrosoft.com.

We typically generated 500+ permutations for a 5-letter domain and 800+ permutations for a 10-letter domain.
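
To make the permutation step concrete, here is a minimal sketch of the five methods named above. It is an illustration only, not McAfee's actual tooling: the function name `typo_variants`, the choice of alphabet (lowercase letters and digits), and the decision to edit only the part of the domain before the first dot are all assumptions. Because the other three methods are not described in the text, the counts this sketch produces will not match the 500+ and 800+ figures quoted above.

```python
import string


def typo_variants(domain: str) -> set:
    """Return single-edit misspellings of the name before the first dot.

    Hypothetical helper covering only the five methods listed in the text.
    """
    name, dot, tld = domain.lower().partition(".")
    alphabet = string.ascii_lowercase + string.digits  # assumed character set
    variants = set()

    # Swapped characters: exchange each adjacent pair (youtube -> yuotube).
    for i in range(len(name) - 1):
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])

    # Replaced characters: substitute each position (wachovia -> wschovia).
    for i in range(len(name)):
        for c in alphabet:
            variants.add(name[:i] + c + name[i + 1:])

    # Inserted characters: add one character anywhere (newgrounds -> newgroounds).
    for i in range(len(name) + 1):
        for c in alphabet:
            variants.add(name[:i] + c + name[i:])

    # Deleted characters: drop each position (cartoonnetwork -> cartonnetwork).
    for i in range(len(name)):
        variants.add(name[:i] + name[i + 1:])

    # Missing dot: the "www." prefix typed without its dot (wwwmicrosoft).
    variants.add("www" + name)

    # Drop the correct spelling and any empty result.
    variants.discard(name)
    variants.discard("")
    return {v + dot + tld for v in variants}
```

Called on the original spellings, this reproduces the examples given in the list, such as yuotube.com from youtube.com and wwwmicrosoft.com from microsoft.com.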

Next, we surfed to each of these 1,920,256 permutations. If the permutation resolved to a live Web site within a certain amount of time, we marked the site as “live” and then tested the site’s content for the presence of a parking company signature – short pieces of text (often, URLs) that indicate a site is hosted by a well known parking company that serves pay per click advertising.
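
The check described above can be sketched roughly as follows. Everything specific here is an assumption made for illustration: the study does not publish its signature list or its timeout, so the entries in `PARKING_SIGNATURES` are made-up placeholders, the 10-second default is arbitrary, and the helper names `fetch_page` and `detect_parking_company` are hypothetical.

```python
import http.client
import urllib.request

# Placeholder signatures for illustration only; the real list is not given.
PARKING_SIGNATURES = {
    "ExampleParking": ["parking.example.com", "this domain is parked"],
    "SampleAds": ["ads.example.net/ppc"],
}


def fetch_page(url: str, timeout: float = 10.0):
    """Return the page body as text, or None if the site is not "live".

    The timeout stands in for the unspecified "certain amount of time".
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        # DNS failures, timeouts, HTTP errors and malformed URLs all count
        # as "not live" in this sketch.
        return None


def detect_parking_company(html: str):
    """Return the first company whose signature appears in the page, else None."""
    text = html.lower()
    for company, signatures in PARKING_SIGNATURES.items():
        if any(signature in text for signature in signatures):
            return company
    return None
```

A permutation would count as a suspected squatter in this sketch when `fetch_page` returns content and `detect_parking_company` finds a match in it.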

For some categories, we used our judgment to select the target domain. For example, in the celebrity category, we used firstnamelastname.com as a proxy for the celebrity’s official Web site. In some cases (e.g. parishilton.com) the proxy and official site are one and the same. In other cases (e.g. tomcruise.com) the actual site does not currently serve content. We used this method to simulate what we believe to be a typical consumer's effort to directly navigate to the celebrity’s home page.

In a related issue, we occasionally substituted re-directed domains for the final domain. For example, we tested http://playhousedisney.com rather than http://atv.disney.go.com/playhouse/index.html or http://disney.go.com/playhouse. Likewise, we tested http://bigbrother.com.au rather than http://bigbrother.3mobile.com.au to which it resolves.

Rankings by Category

McAfee divided the 2,771 target domains into categories like Children and Shopping. This categorization was based both on classifications by third parties like Hitwise and on the judgment of McAfee staff. McAfee staff also broke out more than 500 of these domains for their popularity in a variety of non-U.S. countries.

We then ranked these categorized domains by the “percentage of suspected squatters detected.” That figure is calculated by dividing the base domain’s number of suspected squatted sites by the base domain’s total number of sites checked. This ratio represents the likelihood of a typical consumer landing at a squatted site after mistyping the base domain.
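
The ratio described above is a simple division. A minimal sketch follows; the function name and the sample figures are hypothetical and chosen only to show the calculation.

```python
def squatter_percentage(suspected_squatters: int, sites_checked: int) -> float:
    """Percentage of a base domain's checked permutations that look squatted."""
    if sites_checked <= 0:
        raise ValueError("sites_checked must be positive")
    return 100.0 * suspected_squatters / sites_checked


# Hypothetical base domain: 36 suspected squatters among 500 permutations checked.
example = squatter_percentage(36, 500)
print(f"{example:.1f}%")
```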

Most Frequently Squatted Categories

Rank | Category | Average % Squatters | # of Suspected Squatters | # of Sites Checked

Top 100 sites: 22.4%

1 | Games | 14.0% | 5,641 | 40,929
2 | Airlines | 11.4% | 1,563 | 13,243
3 | Mainstream Media | 10.8% | 5,232 | 54,511
4 | Adult | 10.2% | 4,931 | 52,134
5 | Tech | 9.6% | 25,769 | 288,431
6 | Auto | 9.2% | 2,333 | 28,283
7 | Security | 8.5% | 620 | 7,845
8 | Children | 8.4% | 7,804 | 95,795
9 | Music | 8.1% | 2,127 | 28,671
10 | Shopping | 7.9% | 16,166 | 220,679
11 | News | 7.5% | 6,629 | 105,971
12 | Entertainment | 7.3% | 7,460 | 92,456

Overall Average: 7.2%

13 | Financial | 6.5% | 11,598 | 184,506
14 | Fortune 500 | 6.3% | 10,524 | 204,700
15 | Gadget | 5.6% | 2,042 | 41,057
16 | Popular | 5.4% | 2,333 | 40,518
17 | Travel | 4.8% | 3,691 | 80,018
18 | Dating | 4.2% | 2,152 | 54,809
19 | Celebrity | 3.9% | 4,256 | 125,291
20 | Advertising | 3.6% | 1,000 | 30,433
20 | Global 2000 | 2.7% | 760 | 32,812
21 | Sports | 2.3% | 845 | 39,414
22 | Fashion | 1.6% | 392 | 30,301

For more information, click the link below:
http://www.siteadvisor.com/studies/typo_squatters_nov2007.html

Shane Keats
Research Analyst, McAfee, Inc.