dnlocal  

05-14-2008, 07:07 PM
Rubber Duck (Administrator)
Algorithm Helps ICANN Manage Top-level Domains

Dr. Dobb's | Algorithm Helps ICANN Manage Top-level Domains | May 13, 2008

At the request of ICANN, Paul E. Black, a computer scientist at the National Institute of Standards and Technology (NIST), has developed an algorithm that may guide applicants in proposing new top-level domains. As new top-level domains are added to the familiar .com, .info, and .net, the algorithm checks whether a newly proposed name is confusingly similar to existing ones by looking for visual likenesses. Having visually distinct top-level domain names may help avoid confusion in navigating the ever-expanding Internet and combat fraud by reducing the potential to create malicious look-alikes: .C0M with a zero instead of .COM, for instance.

Later this year, the Internet Corporation for Assigned Names and Numbers (ICANN) plans to launch the process for proposing a new round of "generic" top-level domains (gTLDs), strings such as .net, .gov, and .org meant to indicate organizations or interests. In preparing for newly proposed gTLDs, ICANN engaged several algorithm developers, including Paul Black, to "provide an open, objective, and predictable mechanism for assessing the degree of visual confusion" in gTLDs.

Black's algorithm compares a proposed gTLD with other TLDs and generates a score based on their visual similarities. For example, the domain .C0M scores an 88 percent visual similarity with the familiar .COM. The resulting scores may help indicate whether the newly proposed domain name looks too much like existing ones.

To make its assessments, the algorithm rates the degree of similarity between pairs of alphanumeric characters. Some pairs, such as the numeral "1" and its dead ringer, the lowercase letter "l," are assigned the highest scores for visual similarity, while other pairs, such as "h" and "n," are given lower scores. The algorithm also takes other considerations into account, for example how certain pairs of letters, like "c" and "l," can join to look like a third letter ("d"), as in the case of "close" and "dose." Employing these scores and considerations, the algorithm computes the "cost" of transforming one string of characters into another, such as "opel" into "apple." Lower cost means higher visual similarity. The algorithm then adjusts for the relative lengths of the two strings (different lengths increase their distinctiveness) and converts the final cost into a percent similarity.
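
For anyone who wants to play with the idea, here is a rough Python sketch of what the paragraph above describes: a weighted edit distance where substitutions between look-alike characters are cheap, plus a two-into-one "merge" step for cases like "cl" vs "d". This is not Black's actual code. The similarity tables, the function names, and the way the cost is divided by the longer string's length are all assumptions made up for illustration, so it will not reproduce the 88 percent figure quoted in the article.

```python
from typing import Optional

# Hypothetical visual-similarity scores in [0, 1]; 1.0 would mean identical.
# Illustrative values only, NOT the weights used by the NIST algorithm.
PAIR_SIMILARITY = {
    frozenset("1l"): 0.9,
    frozenset("0o"): 0.9,
    frozenset("hn"): 0.4,
}

# Hypothetical two-characters-look-like-one cases, e.g. "cl" read as "d".
MERGE_SIMILARITY = {("cl", "d"): 0.7}


def sub_cost(a: str, b: str) -> float:
    """Cost of replacing a with b: 0 if equal, cheaper if they look alike."""
    if a == b:
        return 0.0
    return 1.0 - PAIR_SIMILARITY.get(frozenset((a, b)), 0.0)


def merge_cost(pair: str, single: str) -> Optional[float]:
    """Cost of reading a two-character pair as one character, if listed."""
    sim = MERGE_SIMILARITY.get((pair, single))
    return None if sim is None else 1.0 - sim


def visual_cost(s: str, t: str) -> float:
    """Weighted edit distance between s and t (lower = more alike)."""
    s, t = s.lower(), t.lower()
    m, n = len(s), len(t)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = float(i)          # delete every character of s
    for j in range(1, n + 1):
        d[0][j] = float(j)          # insert every character of t
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            best = min(
                d[i - 1][j] + 1.0,                             # deletion
                d[i][j - 1] + 1.0,                             # insertion
                d[i - 1][j - 1] + sub_cost(s[i - 1], t[j - 1]),  # substitution
            )
            if i >= 2:              # two characters of s look like one of t
                c = merge_cost(s[i - 2:i], t[j - 1])
                if c is not None:
                    best = min(best, d[i - 2][j - 1] + c)
            if j >= 2:              # two characters of t look like one of s
                c = merge_cost(t[j - 2:j], s[i - 1])
                if c is not None:
                    best = min(best, d[i - 1][j - 2] + c)
            d[i][j] = best
    return d[m][n]


def percent_similarity(s: str, t: str) -> float:
    """Turn the cost into a percentage, normalising by the longer string.

    Dividing by the longer length is an assumed stand-in for the article's
    unspecified length adjustment.
    """
    longest = max(len(s), len(t))
    if longest == 0:
        return 100.0
    return 100.0 * (1.0 - visual_cost(s, t) / longest)
```

Calling percent_similarity("c0m", "com") or percent_similarity("close", "dose") gives high scores with these made-up weights, and the "opel" / "apple" pair comes out much lower. To get anything meaningful you would have to swap in real confusability data for the two tables.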

ICANN is considering future enhancements to the algorithm, such as having it check for visual confusion between existing domains and future planned Internet top-level domain names in scripts such as Cyrillic.
__________________
Yours RD.

05-14-2008, 08:39 PM
Member

More gTLDs? Wow, there seem to be quite a few out there already...

As long as the algorithm doesn't do cutting and pasting of Arabic words, I think we will be okay.

And some of those Hindi and Thai words... look pretty darn similar to my untrained eye!