Verdict:
Simple to use, accurate in its results, with good output options - but not as fast as OmniPage Pro 8.
There are few bigger names in office image processing than Xerox, and its OCR package, TextBridge, has long been a market leader. TextBridge Pro 98 represents a major step up from its predecessor. The main claim for the new version is an 82 per cent improvement in accuracy over earlier versions. Xerox also claims that it can recognise tables without rules, line-art, white text on black and dropped capitals. New outputs include Adobe Acrobat PDF format and HTML. Usability has been improved too, with batch processing and an OCR Wizard that guides you through the process.
However, a rivalry seems to be brewing with Caere's OmniPage Pro, so the question is, what has Xerox done to stay ahead of the game? Well, it has made a good start with its extensive word processor support, which allows TextBridge to run from within your choice of word processor. In fact, any program that outputs RTF can be used.
Running TextBridge from within your word processor often gives the best performance. On fully automatic settings, it took just under three minutes to scan, auto-zone, recognise and display an A4 page which included graphics and columns of varying sizes. This was achieved on a Pentium/90 with a low-end parallel port scanner. The smallest text in my test document was 8pt, but this didn't stop TextBridge; it made no mistakes except on italic text, where it seemed to get a little confused. Having said that, it was over a minute slower than OmniPage Pro 8 (reviewed p210) and it didn't preserve font information or layouts quite as well as its rival, running over to an extra page when recognising a single-page document.
Complicated documents like this demand the most of OCR packages, but in general TextBridge meets these demands, with excellent recognition of white text on black and preservation of column layouts. It's not too good at picking out tables without rules, relying on the user to indicate these manually before recognition takes place. Another weakness is that it will only accept one-bit images, which means that pictures, in particular, suffer when converted from a colour or grey source.
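To see why pictures suffer, here is a minimal Python sketch of reducing a greyscale image to one bit per pixel. It assumes a simple fixed threshold; the review does not establish how TextBridge or the scanner driver actually performs the conversion, and the function name and threshold value are hypothetical.

```python
def to_one_bit(grey_rows, threshold=128):
    """Map each 0-255 grey value to 0 (black) or 1 (white).

    Hypothetical helper: a fixed threshold is an assumption,
    not a description of TextBridge's own method.
    """
    return [[1 if value >= threshold else 0 for value in row]
            for row in grey_rows]


# A smooth left-to-right gradient, as might appear in a photograph.
gradient = [[0, 64, 127, 128, 192, 255]]

# Six distinct grey levels collapse to just two values.
print(to_one_bit(gradient))  # prints [[0, 0, 0, 1, 1, 1]]
```

Crisp black text survives this kind of reduction almost untouched, whereas the intermediate tones that give a photograph its detail are discarded.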
On the plus side, TextBridge is extremely simple to use. If you're not within a word processor, upon loading you're faced with a simple interface containing three buttons: Auto Process, Get Page and Recognise. Menu selections allow you to fiddle with the settings, but you'll rarely have to do this. Starting the OCR Wizard presents you with a choice of which type of document you're scanning. This gives TextBridge an indication as to whether the document is entirely single column text or whether it should look out for pictures, tables or multiple columns.
Once the scanner has done its job, TextBridge auto-zones your page. This is probably the least reliable part of the process. For example, if your source contains a company logo that's made up of text and a picture, TextBridge will see the text part of the logo and assume that the whole logo needs converting to text. In this case, manual zoning is the only option.
Once you have an OCR image, you don't necessarily have to convert it to a word processor document. If you wish, you can save to Adobe Acrobat PDF format instead. The output to PDF can be controlled so that suspect words are saved as pictures rather than possibly incorrect words. This allows you to create reliable PDF files even if you don't own Acrobat Exchange, since the fact that a particular word is presented as a picture will be invisible to anyone viewing the document. The final results are excellent, opening up the possibility of quick conversion of paper-based documents, such as manuals or tutorials, to fully formatted PDF.
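The idea of saving suspect words as pictures can be sketched in a few lines of Python. This is an illustration only: the confidence score, the 0.9 cut-off, the `image_ref` handle and every name below are assumptions, not TextBridge's actual data model or settings.

```python
from dataclasses import dataclass


@dataclass
class RecognisedWord:
    text: str          # what the OCR engine thinks the word is
    confidence: float  # hypothetical score from 0.0 to 1.0
    image_ref: str     # hypothetical handle to the scanned pixels


def build_page(words, min_confidence=0.9):
    """Emit text for trusted words and the original pixels for suspect ones."""
    page = []
    for word in words:
        if word.confidence >= min_confidence:
            page.append(("text", word.text))
        else:
            # Keep the scanned picture rather than risk a wrong word.
            page.append(("image", word.image_ref))
    return page


words = [
    RecognisedWord("quick", 0.98, "crop-1"),
    RecognisedWord("br0wn", 0.41, "crop-2"),
]
print(build_page(words))  # prints [('text', 'quick'), ('image', 'crop-2')]
```

The reader of the finished page sees the right word either way; only the suspect word is stored as a picture instead of as characters.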
The same can't be said of TextBridge's HTML output. While it may save some time in retyping, you'll generally have to edit the results in an HTML editor for all but the most basic of documents. What's more unfortunate is that TextBridge is no longer bundled with HoTMetaL.
If you intend to convert to a paperless office, one final feature comes in handy. Batch processing allows you to convert a number of documents automatically; if you have a sheet-fed scanner, you could feed in reports, the day's faxes or even the morning post.
TextBridge is highly accurate, great value and easy to use. The electronic PDF output facility is useful, too. However, OmniPage Pro 8 was slightly faster and had better features. For now, TextBridge Pro remains in second place.
By Kevin Partner
SPECIFICATIONS:
Windows 95 or above, 16MB of RAM, 25MB of disk space.