10. Google translation
Posted on 15 Apr 2009 at 11:46
One important characteristic of Google is that the company doesn't let projects linger in obscurity. Some, such as the odd Dodgeball service for meeting friends, it simply cancels. Others germinate slowly until they become far more powerful and compelling. Google Translate (www.google.com/translate) is one such project.
The company recently added several more languages and plans to include just about every language still spoken. Most importantly, Google's translation algorithms keep improving, to the point where the engine can now translate whole passages of text in only a second or two.
The translation engine draws on a vast library of language pairs, with word matching that takes into account the subtle variations between words. A word with a single form in one language may map to several in another: the English word cousin covers both sexes, for example, whereas French uses cousin or cousine depending on whether the person is male or female. Some languages, such as Chinese, don't put spaces between words, which adds to the complexity of the engine.
Jeff Chin, the product manager on the project, says that translation poses one of the most interesting computer science challenges, because of all of the subtle variations in word meanings and the processing power required to provide fast results. "To solve machine translation problems, we're using the Minimum Bayes-Risk (MBR) criterion," he explained.
"Essentially, we look at a sample of the best candidate translations - the so-called n-best list - and choose the safest one, the one most likely to provide the best translation quality. You might want to view this as choosing a translation that's a lot like the other good translations instead of choosing that strange one that had the good model score. We build a lattice of translations during the search and then we do our MBR search over the lattice. Instead of a hundred or thousand best translations that we'd use for the n-best approach, lattices give us access to a number that rivals the number of particles in the visible universe."
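To make the n-best idea concrete, here is a minimal Python sketch of Minimum Bayes-Risk selection over a small candidate list. It is an illustration only, not Google's implementation: the function names (ngram_counts, similarity, mbr_select), the candidate sentences and their scores are all invented for the example, and the similarity measure is a crude n-gram overlap standing in for whatever quality metric a real system would use. It also works on a plain list rather than the lattice Chin describes.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def similarity(hyp, ref, max_n=2):
    """Average clipped n-gram precision of hyp against ref.

    A rough, hypothetical stand-in for a BLEU-style gain function.
    """
    h, r = hyp.split(), ref.split()
    scores = []
    for n in range(1, max_n + 1):
        hc, rc = ngram_counts(h, n), ngram_counts(r, n)
        total = sum(hc.values())
        if total == 0:
            scores.append(0.0)
            continue
        overlap = sum((hc & rc).values())  # clipped matches
        scores.append(overlap / total)
    return sum(scores) / len(scores)


def mbr_select(nbest):
    """Pick the candidate with the highest expected similarity to the rest.

    nbest is a list of (translation, model_probability) pairs.
    """
    z = sum(p for _, p in nbest)  # normalise the scores
    best, best_gain = None, float("-inf")
    for hyp, _ in nbest:
        gain = sum((p / z) * similarity(hyp, other) for other, p in nbest)
        if gain > best_gain:
            best, best_gain = hyp, gain
    return best


# Made-up candidates: the first has the best model score but is the odd one out.
nbest = [
    ("bank of the river is closed", 0.40),
    ("the bank is closed today", 0.32),
    ("the bank is closed now", 0.28),
]

print(max(nbest, key=lambda pair: pair[1])[0])  # highest model score
print(mbr_select(nbest))                        # MBR choice
```

With these invented scores, the highest-scoring candidate is "bank of the river is closed", but the MBR choice is "the bank is closed today", because it agrees most with the other plausible translations. That is the "safest one" behaviour described in the quote.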
Each year, the quality of translation improves. Next up in the field: a speech-recognition system that uses the same language pair database, but lets you speak the word you want to translate and hear the result, most likely on a mobile phone.
Author: John Brandon
Printed from www.pcpro.co.uk

