Re: Problem compiling umax-driver with glibc-2.03

Andrew Kuchling (amk@magnet.com)
Sat, 21 Jun 1997 16:09:26 -0400 (EDT)

Michael K. Johnson wrote:
> David Mosberger-Tang writes:
> >I think it would be neat if there
> >was good OCR support in SANE, but don't have experience with this
> >myself (I tried xocr once, but it failed miserably on the Alpha, which
> >I assumed was due to some 64 bit problems).
> There's public-domain handwriting OCR software available from NIST.

My summer project is to write some sort of OCR software so I
can help Project Gutenberg. I run only Linux/AXP and I'm not about
to buy NT just for this one thing, and it's a good chance to learn
image processing and neural networks. Image processing textbooks and
C.M. Bishop's _Neural Networks For Pattern Recognition_ are in hand,
so all I need is to buy a scanner that works with SANE and start
working on code.

The 60-page document that comes with the NIST software is a
really good overview of how the software works:

1) Take the scanned data, and find the form's reference
points. Rotate the image so it's aligned and scaled properly, and XOR
out the form image leaving only the handwritten data.

2) Use various heuristics to group regions into letters and
lines. This probably involved lots of trial and error with real data.

3) Scale each letter down to a 32x32 array of pixels. Use
Karhunen-Loeve feature extraction to derive a 32-element vector from
that array; it's faster to run a neural network over a 32-element
vector than over 1024 pixel values.

4) Use a neural network to classify the vector, getting a
letter and a confidence value.

5) For text, use a dictionary to post-process the output and
fix misclassifications.
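
To make step 5 concrete, here is a minimal sketch of dictionary
post-processing. It is not the NIST code: the word list, the
correct_word() helper, and both thresholds are made up for
illustration, and the matching simply uses Python's standard difflib
module. The idea is to leave a word alone when the classifier was
confident about every letter, and otherwise swap it for the closest
dictionary entry.

```python
import difflib

# Hypothetical word list; a real system would load a full dictionary.
WORDS = ["scanner", "network", "letter", "image", "vector"]

def correct_word(word, confidences, words=WORDS, min_conf=0.8, cutoff=0.6):
    """Replace `word` with the closest dictionary entry when needed.

    `confidences` holds one classifier confidence per letter.
    `min_conf` and `cutoff` are assumed thresholds, not tuned values.
    """
    # Trust the classifier if the word is known or every letter was confident.
    if word in words or min(confidences) >= min_conf:
        return word
    # Otherwise look for the most similar dictionary word.
    matches = difflib.get_close_matches(word, words, n=1, cutoff=cutoff)
    return matches[0] if matches else word
```

Calling correct_word("scanmer", [0.9, 0.9, 0.9, 0.9, 0.3, 0.9, 0.9])
would pick "scanner", because one letter has low confidence and
"scanner" is the only close match in the list.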

Several of these steps are probably easier for printed, as
opposed to hand-written, text: lines don't go all over the place, the
density of the printing is constant, and letters aren't run together
in as many ways.
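
Going back to step 3, the Karhunen-Loeve transform amounts to
projecting each mean-centered pixel vector onto the leading
eigenvectors of the training set's covariance matrix. The sketch below
shows that idea in plain Python, finding the eigenvectors by power
iteration with deflation. All function names are hypothetical and this
is not how the NIST package is written. For real 1024-pixel vectors a
numerical library would be the practical choice, since the pure-Python
covariance loop is far too slow at that size.

```python
import math
import random

def mean_vector(samples):
    """Per-pixel mean over all training samples."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def covariance(samples, mean):
    """Covariance matrix of the mean-centered samples."""
    d, n = len(mean), len(samples)
    cov = [[0.0] * d for _ in range(d)]
    for s in samples:
        c = [x - m for x, m in zip(s, mean)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += c[i] * c[j] / n
    return cov

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def top_eigenvectors(matrix, k, iterations=200, seed=0):
    """Leading k eigenvectors by power iteration with deflation."""
    rng = random.Random(seed)
    d = len(matrix)
    m = [row[:] for row in matrix]  # work on a copy
    vectors = []
    for _ in range(k):
        v = [rng.random() for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = mat_vec(m, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break  # nothing left to extract
            v = [x / norm for x in w]
        eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))
        vectors.append(v)
        # Deflate: remove the component just found.
        for i in range(d):
            for j in range(d):
                m[i][j] -= eigenvalue * v[i] * v[j]
    return vectors

def kl_features(sample, mean, basis):
    """Project one sample onto the basis to get its feature vector."""
    centered = [x - m for x, m in zip(sample, mean)]
    return [sum(b * c for b, c in zip(vec, centered)) for vec in basis]
```

With 32x32 glyphs flattened to 1024 values and k=32, kl_features()
would return the 32-element vector that gets fed to the classifier.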

What could SANE do to support OCR applications in particular?
From my reading, I can't really think of anything special that the
backend or frontend should support, but that may just indicate that I
haven't done enough research.

If you're looking for training data, you could also generate
properly-sized bitmaps from TeX fonts. Wasn't an OCR program once on
the FSF's wish list? ISTR the FSF offered test data.

Andrew Kuchling
amk@magnet.com
http://people.magnet.com/%7Eamk/

--
Source code, list archive, and docs: http://www.azstarnet.com/~axplinux/sane/
To unsubscribe: mail -s unsubscribe sane-devel-request@listserv.azstarnet.com