Re: SANE_FRAME Formats (was Re: xsane-0.31 available)

Tom Martone (tom@martoneconsulting.com)
Wed, 04 Aug 1999 06:49:56 -0400

Greetings,

Nick Lamb wrote:
>
> On Tue, 3 Aug 1999, Tom Martone wrote:
>
> > Document scanners such as the Bell+Howell scanners can produce CCITT-G3,
> > CCITT-G3-2D, and CCITT-G4 compressed image streams. I'm not sure if it
> > would be appropriate to classify these formats as proprietary although
> > they are certainly not specifically supported by SANE.
>
> Unless I'm sorely mistaken, this algorithm is so cheap that you can
> trivially do it inline. Is there a problem with doing this in your
> backend? If it turns out to be too expensive, we should probably add
> one or more frame types as necessary to SANE.
For document processing applications, you usually store the data in a
compressed format (e.g. G4-TIFF), so if you can pass pre-compressed
G4 from the scanner all the way through to the front-end, you get the
benefit of the compression from start to finish. You're right that
decompressing in the backend is cheap, but it seems to me to serve no
useful purpose: the scanner can send uncompressed data if asked to, so
inline decompression in the backend is unnecessary, and the data
delivered to the front-end is the same in both cases. (Perhaps there
are scanners that deliver only compressed data, but I'm not aware of
any.)

> The barcode stuff is a separate issue, and I suspect there's more to it
> than meets the eye -- can you really just scan barcode string data off
> A4 pages as you're going along? Or is it meant for identifying the
> documents somehow?
Yes, you can scan barcode data off the page during the scan process.
In my case, we have barcoded employee numbers on the documents, and
we decode them so that we can associate each document with the employee
for later retrieval. You can decode several types of barcodes, either in
separately defined "windows" or on the entire page (front or back).
Of course, the smaller you can make the window, the quicker the
recognition can be.

Barcode recognition can also be used to identify the documents. For example,
page one of a multi-page document can have a barcode and then you
can "paginate" based on the presence/absence of the barcode.
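
As a rough illustration of the pagination idea above, here is a minimal
sketch in Python. It assumes the scan delivers, for each page, either the
decoded barcode string or nothing; the function name `paginate` and the
dictionary layout are hypothetical and not part of SANE or any scanner API.

```python
def paginate(page_barcodes):
    """Group page indices into documents.

    page_barcodes: one entry per scanned page, in scan order. Each entry is
    the decoded barcode string if the page carries one, otherwise None.
    A page with a barcode is treated as page one of a new document.
    """
    documents = []
    for index, barcode in enumerate(page_barcodes):
        if barcode is not None or not documents:
            # Barcode present (or the very first page): start a new document.
            documents.append({"barcode": barcode, "pages": [index]})
        else:
            # No barcode: the page belongs to the current document.
            documents[-1]["pages"].append(index)
    return documents


# Hypothetical batch: two documents, the first with three pages.
batch = ["E1001", None, None, "E2002", None]
documents = paginate(batch)
```

A first page without a barcode still opens a document (with `barcode` set
to None) so that no scanned page is dropped; whether that is the right
policy depends on the application.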

> If the barcodes aren't really "image data" as such you can expose them
> as an option text string which is changed by the backend only and
> updates for each new page. Then a custom frontend can read the codes
> before/during/after each page is scanned. No magic needed :)
That's a neat idea. I'll pursue this further. Thanks.
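
The suggestion above might look roughly like the following toy model. It
is written in Python for brevity and is not the SANE C API: the class
`BarcodeOption` and the function `scan_pages` are hypothetical names, and
the only property being modeled is that the backend alone updates the
string while the frontend only reads it.

```python
class BarcodeOption:
    """Toy model of a string option that only the backend may change."""

    def __init__(self):
        self._value = ""

    def get(self):
        # Frontend-side read; there is deliberately no public setter.
        return self._value

    def _backend_update(self, decoded):
        # Called by the backend only, once for each new page.
        # A page without a barcode resets the option to the empty string.
        self._value = decoded or ""


def scan_pages(decoded_per_page, option):
    """Simulate a scan: the backend updates the option for each page and
    a custom frontend reads it back after the page is scanned."""
    seen = []
    for decoded in decoded_per_page:
        option._backend_update(decoded)
        seen.append(option.get())
    return seen
```

Resetting the value on pages without a barcode is an assumption made
here so that a stale code from the previous page is never reported.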

> NB There is NO NEED for each and every frontend to support each and every
> possible FRAME format. A bulk-scanning app needn't support stuff used
> only in desktop scanners, and a photo-oriented app needn't do CCITT.
I agree, and think that _G3, _G32D, and _G4 frame types sound better
today than they did last night.
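
To make the point about partial frontend support concrete, here is a
small sketch, again in Python rather than C. The `G3`, `G32D`, and `G4`
members stand in for the proposed frame types and do not exist in SANE;
the preference-ordered list and the function `choose_frame` are
assumptions, since this thread does not say how a format would be chosen.

```python
from enum import Enum


class Frame(Enum):
    GRAY = "gray"
    RGB = "rgb"
    G3 = "ccitt-g3"        # hypothetical new frame type
    G32D = "ccitt-g3-2d"   # hypothetical new frame type
    G4 = "ccitt-g4"        # hypothetical new frame type


# Each frontend declares only the frame types it cares about.
BULK_FRONTEND = {Frame.GRAY, Frame.G3, Frame.G32D, Frame.G4}
PHOTO_FRONTEND = {Frame.GRAY, Frame.RGB}


def choose_frame(scanner_offers, frontend_supports):
    """Return the first frame type, in the scanner's preference order,
    that the frontend supports, or None if there is no common type."""
    for frame in scanner_offers:
        if frame in frontend_supports:
            return frame
    return None
```

With a scanner offering `[Frame.G4, Frame.GRAY]`, the bulk frontend
would get `Frame.G4` and the photo frontend would fall back to
`Frame.GRAY`.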

Tom Martone
