Re: ocr

becka@rz.uni-duesseldorf.de
Mon, 20 Jul 1998 03:06:46 +0200 (MEST)

Hi !

> Anyone working on any OCR software? I've the occasional need and have
> tried the program which came with my microtek scanner under win95. I
> just spent most of the morning playing with it and the best I got
> was a nice little windows message box with the useful message "twain
> error".

*grin*.

Hmm - depends on whether you are willing to pay for it, and on the quality you need.
Vividata has its OCRshop available for Linux.
You can even download a 30-day trial version.
Please use a search engine to find the URL - I don't have it handy ...

I tried it recently, but I was not impressed. It correctly read the numbers
on one of the TIFFs that came with it, but it choked on a number of other
TIFFs I gave it. Might be a format problem, though.

> I would have been much better off just typing the text in from the
> start.

That has been my impression with _every_ OCR software I have found so far.

I can type roughly 180 chars/minute, so I can hack in the average
page with around 2000 chars in about 11 minutes. It usually takes
about the same time to acquire a scan, OCR it, and then (and this is
the most time-consuming step) proofread it. Most OCR tends to make
mistakes that are "auto-corrected" by the eye when you just glance
over the text, like mixing up i/l or rn/m.
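
To make the numbers and the proofreading problem concrete, here is a rough
Python sketch. It is only an illustration, not how any particular OCR
package works: the function names, the tiny vocabulary, and the idea of
checking unknown words against a word list are all assumptions of mine.
It computes the typing time for a page and suggests corrections for words
that are one i/l or rn/m swap away from a known word.

```python
import re

# Confusable pairs mentioned in the text; each is tried in both directions.
# (Hypothetical, minimal list - real OCR errors are far more varied.)
CONFUSABLE = [("rn", "m"), ("i", "l")]


def typing_minutes(chars, chars_per_minute):
    """Minutes needed to type a page of `chars` characters."""
    return chars / chars_per_minute


def variants(word):
    """Return every string reachable from `word` by one confusable swap."""
    out = set()
    for a, b in CONFUSABLE:
        for src, dst in ((a, b), (b, a)):
            start = word.find(src)
            while start != -1:
                out.add(word[:start] + dst + word[start + len(src):])
                start = word.find(src, start + 1)
    return out


def suggest(text, vocabulary):
    """Map each word missing from `vocabulary` to known words one swap away."""
    suggestions = {}
    for word in re.findall(r"[a-z]+", text.lower()):
        if word not in vocabulary:
            hits = sorted(variants(word) & vocabulary)
            if hits:
                suggestions[word] = hits
    return suggestions


if __name__ == "__main__":
    print(round(typing_minutes(2000, 180), 1))
    vocab = {"the", "modern", "scanner", "will", "fail"}
    print(suggest("The modem scanner wiil fail", vocab))
```

Note that this only catches mistakes that produce a non-word. If the
misread result is itself a valid word (say "modem" for "modern" with a
bigger vocabulary), a check like this stays silent, which is exactly why
the proofreading step eats so much time.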

> Is OCR all that difficult to do?

Yes. It is one of the most complex challenges in programming. Even voice
recognition can be considered easier in some cases.

CU, Andy

-- 
= Andreas Beck                    |  Email :  <andreas.beck@ggi-project.org> =

--
Source code, list archive, and docs: http://www.mostang.com/sane/
To unsubscribe: echo unsubscribe sane-devel | mail majordomo@mostang.com