I'd rather be able to tell the truth, so to speak, and give a proper
hint as to the format of the frame. Then each frontend could choose
to do with it whatever it thought best, with the baseline behavior
being to pass it onwards uninterpreted, like what I get with the
--raw command-line option.
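To make that concrete, the pass-through case is just the frontend
running in raw mode and letting the bytes land on disk untouched.
A sketch, using the scanadf frontend described below; the device
string and output pattern are placeholders, and the exact option
spellings may vary:

    # dump each frame to its own file, byte-for-byte, uninterpreted
    scanadf -d bh:/dev/scanner --raw -o page%04d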
The frontend, scanadf, allows you to specify a scan script which gets
forked off for each image acquired; this gives the user/integrator
great flexibility in doing stuff with each captured file.
a nice separation between the basic frontend and the specifics of
a particular application's requirements. What I typically do with
the g4 data is to convert it to a full-fledged tiff file using a
simple utility called g42tiff, which is a slightly modified
version of fax2tiff from the tools in Sam Leffler's libtiff
distribution.
Our imaging archive system uses tiff as its file format of choice.
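The scan script itself is only a few lines. Roughly (a sketch of
the idea: scanadf hands the script the captured file's name as its
first argument; the g42tiff options shown follow fax2tiff's
conventions and may differ in your build, and the archive command
is a placeholder):

    #!/bin/sh
    # invoked by scanadf once per captured image, filename in $1
    file="$1"

    # wrap the raw G4 data in a proper tiff container
    g42tiff -o "$file.tif" "$file"

    # hand the result to the imaging archive (placeholder command)
    archive-import "$file.tif" && rm "$file"

It gets hooked in with something like
scanadf -d ... --raw -S ./toarchive.sh -o image%04d.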
You could also scan without compression, getting true _GRAY data,
and have the scan script use pnmtotiff to get the same result. It
just seems nice to have the data compressed in the firmware of the
scanner and have a much smaller amount of data flow across the SCSI
bus and through the software. Any savings here would be more noticeable
if you were going through saned/net as well.
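If you went that route, the conversion step in the scan script
would become a one-liner (a sketch; pnmtotiff is netpbm's, and its
-g4 option, assuming your build has it, applies on the host the
same Group 4 compression the scanner would have done in firmware):

    # bilevel pnm in, G4-compressed tiff out
    pnmtotiff -g4 < "$file" > "$file.tif"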
Now getting at the barcode data is a different matter. Basically the
encoded data is to be associated with the image in the "document
database" which provides the infrastructure to support flexible
searches for retrieval. In one case, the encoded data is an employee
identifier - the employee who signed and returned the document. This
allows the document (image) to be associated with that person in a
relational database. Then it is a trivial matter to collect all the
documents for a person, etc. The barcoding technique helps to eliminate
a manual data entry process and is quite desirable in terms of labor
savings.
So for barcodes, the scan script pulls out the decoded data and
stores it in an index file, which is used to update the database.
All this happens during the scan process, which streamlines things
and allows for good throughput.
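In outline, that part of the script looks something like this (a
sketch only: the companion-file naming and the index format are
illustrations of the general idea, not exactly what we run):

    #!/bin/sh
    # invoked per captured image; the decoded barcode text is
    # assumed to arrive in a companion file next to the image
    image="$1"
    barcode=`cat "$image.barcode"`

    # one record per page: image filename and employee identifier;
    # a separate loader folds scan.index into the relational database
    printf '%s\t%s\n' "$image" "$barcode" >> scan.index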
Now there might be more sophisticated ways of associating a series
of data streams (frames) together as being from the same page, but
I don't really see a dire need for this. As long as the frames
arrive in a well-defined, predictable manner (defined by the
backend), a custom scan script should be able to make the
association simply by the sequence. The frontend really doesn't
know about this at all, and that's all right; the job gets done.
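For example (purely a sketch, assuming the frontend writes each
frame to its own numbered file and the backend always sends the
image frame first, then the barcode frame):

    # with two frames per page in a fixed order, consecutive
    # output files pair up: image0001 is page 1's image,
    # image0002 is page 1's barcode text, and so on
    n=1
    while [ -f "`printf 'image%04d' $n`" ]; do
        img=`printf 'image%04d' $n`
        bar=`printf 'image%04d' $((n + 1))`
        printf '%s\t%s\n' "$img" "`cat $bar`" >> scan.index
        n=$((n + 2))
    done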
I hope this is the information you were seeking. I'm not quite sure
I understood you completely regarding the multiple data frames, so I
may have missed something here.
Tom Martone