PC Pro - Computing in the Real World (printed from www.pcpro.co.uk)

Analysis

10. Google translation

Posted on 15 Apr 2009 at 11:46

One important characteristic of Google is that the company doesn't let projects linger in obscurity. Some - such as the odd Dodgeball service for meeting friends - the company will simply cancel. Others slowly germinate over time until they're much more powerful and compelling. Google Translate (www.google.com/translate) is one such project.

The company recently added several more languages and has plans to include just about every language still spoken. Most importantly, the Google algorithms for translation keep improving such that the engine can now translate whole passages of text in only a second or two.

The translation engine draws on a vast library of language pairs, matching words in a way that takes into account the subtle variations between them. For example, a single word in one language may correspond to several in another (the English word cousin covers both sexes, whereas French uses cousin for a male and cousine for a female). Some languages, such as Chinese, don't put spaces between words, which adds to the complexity of the engine.

Jeff Chin, the product manager on the project, says that translation poses one of the most interesting computer science challenges, because of all of the subtle variations in word meanings and the processing power required to provide fast results. "To solve machine translation problems, we're using the Minimum Bayes-Risk (MBR) criterion," he explained.

"Essentially, we look at a sample of the best candidate translations - the so-called n-best list - and choose the safest one, the one most likely to provide the best translation quality. You might want to view this as choosing a translation that's a lot like the other good translations instead of choosing that strange one that had the good model score. We build a lattice of translations during the search and then we do our MBR search over the lattice. Instead of the hundred or thousand best translations that we'd use for the n-best approach, lattices give us access to a number that rivals the number of particles in the visible universe."
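
The n-best version of the idea Chin describes can be sketched in a few lines of Python. This is a minimal illustration only, not Google's implementation. The function names (overlap_f1, mbr_select), the sample sentences and their scores are all invented for the example, and the word-overlap measure is a simple stand-in for whatever similarity metric the real system uses. The lattice search mentioned in the quote is not attempted here.

```python
import math
from collections import Counter


def overlap_f1(hyp: str, ref: str) -> float:
    """Unigram F1 between two sentences (an assumed stand-in similarity)."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    common = sum((h & r).values())  # words shared by both sentences
    if common == 0:
        return 0.0
    precision = common / sum(h.values())
    recall = common / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_select(nbest):
    """Return the candidate most similar, on average, to the whole list.

    nbest: list of (translation, log_score) pairs (hypothetical format).
    """
    # Convert log scores into probabilities with a softmax.
    top = max(score for _, score in nbest)
    weights = [math.exp(score - top) for _, score in nbest]
    total = sum(weights)
    probs = [w / total for w in weights]

    def expected_gain(candidate):
        # Probability-weighted similarity to every candidate in the list.
        return sum(p * overlap_f1(candidate, other)
                   for (other, _), p in zip(nbest, probs))

    return max((text for text, _ in nbest), key=expected_gain)


# Invented n-best list: the first entry has the best model score
# but looks nothing like the other candidates.
nbest = [
    ("colorless green ideas sleep", -1.0),
    ("the house is very big", -1.2),
    ("the house is big", -1.3),
    ("this house is very big", -1.4),
]
print(mbr_select(nbest))  # prints "the house is very big"
```

The top-scoring candidate loses because it shares no words with the rest of the list, while the winner resembles every other plausible translation. That is the "safest one" behaviour described in the quote.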

Each year, the quality of translation improves. Next up in the field: a speech-recognition system that uses the same language-pair database but lets you speak the word you want to translate and hear the result, most likely on a mobile phone.

Back to "10 amazing research projects"

Author: John Brandon
