Subject: snapscan / SCSISource_Get(..) / Prisa640S
Dear snapscan folks,
Opening apology:
* I've been working on the snapscan backend for the last two
* weeks or so, and wrote this e-mail as I went along.
* Halfway through, it turned out that I had some serious
* misunderstandings, which shed a new light on the earlier
* parts. I didn't revise the beginning, since it nonetheless
* contains some hopefully interesting observations.
Sorry, but I think we've got to begin with some
generalities.
As a backend, we're sandwiched between two channels of limited
capacity: the scanner (with a limited transfer speed) and the
frontend (with a limited capability to accept data). What
should we do then? I think we
- should respect the scanner's limitations (so that our reader
process doesn't waste processor time), but we
- must ignore the frontend's limitations (as long as possible)
(sounds cruel, but the alternative is to run the scanner
in "stop-and-go" mode at a lower speed, which (at best) makes
a strange, unpleasant sound, and in practice means that
the scanner has to stop from time to time)
(here the term "frontend" includes any post-processing the backend
wants to make, like inversion, chroma correction, etc..)
Currently, the reader is (at least partially) frontend-driven:
- in "blocking mode" it only tries to get as many bytes as the
frontend requested. So it actually tries to simulate nonblocking
mode.
- in nonblocking mode the child passes its data through a pipe
to the parent, where it can be requested by the frontend.
I assume that writing to a pipe can be quite slow if too much
data is already waiting to be read. This is also confirmed by
the results of my debugging sessions.
(I timed these write instructions and found that towards the end
some of them take 4 or 5 times longer than usual.
Also, setting READER_WRITE_SIZE to 26128980 (= size of the entire
scan area) allowed the scanner to run without stopping, the
reason presumably being that SCSISource_get(..) then runs
uninterrupted, since it is called only once.)
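For reference, per-write timings of the kind quoted further below can be taken with a small wrapper of roughly this shape (the helper name is mine, not from the snapscan source): bracket write(2) with gettimeofday and report milliseconds.

```c
#include <assert.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: time one write(2) call, return elapsed
   milliseconds, or -1.0 if the write itself failed. */
static double timed_write_msecs(int fd, const void *buf, size_t n)
{
    struct timeval t0, t1;
    gettimeofday(&t0, 0);
    ssize_t w = write(fd, buf, n);
    gettimeofday(&t1, 0);
    if (w < 0)
        return -1.0;
    return (t1.tv_sec - t0.tv_sec) * 1000.0
         + (t1.tv_usec - t0.tv_usec) / 1000.0;
}
```

A write to a pipe whose buffer is nearly full will show up here as an outlier, which is exactly the effect described above.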
An interesting test:
"scanimage <device> > /dev/null"
made my scanner stop from time to time, but
"scanimage <device> -x 10 > /dev/null"
ran uninterrupted. I assume that the scanner's intended speed
(4.2 ms/line) should not depend on the width of the scan region,
so this problem should be related to the amount of data that
is generated. I found it helpful to determine exactly
the width at which the behaviour changes, and to inspect
the resulting debugging messages. I once found that
- x=55 made scanner stop 3 times (towards the end)
- x=54 made scanimage lock (had to abort it)
- x=53 ran uninterrupted, but with an ugly sound indicating
that it was constantly about to stop
(** I've just added a discussion of blocking/non-blocking mode
at the end of this mail **)
By the way: the estimates done in SCSISource_get(..) have a
systematic flaw: if the expected number of bytes is larger
than ps->absolute_max (which is approximately SCANNER_BUF_SZ),
the bytes that are available but don't fit into the buffer
are forgotten when the routine is called the next time.
The same applies to bytes that are available but don't
form a full scan line. I have a modified version which
corrects this. (I'm always referring to this version,
which still makes the scanner stop from time to time, but
no longer lets the expectation drop to 0 bytes.)
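The correction can be sketched as follows (all names here are mine, hypothetical, not the actual snapscan code): split the expected byte count into the whole scan lines that fit into the buffer now, and a remainder that must be carried over to the next call instead of being forgotten.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for SCSISource_get-style estimates. */
typedef struct {
    size_t request_now;   /* whole scan lines that fit into the buffer */
    size_t carry_over;    /* leftover bytes, remembered for the next call */
} ByteBudget;

static ByteBudget split_expected_bytes(size_t expected,
                                       size_t buf_capacity,
                                       size_t bytes_per_line)
{
    ByteBudget b;
    size_t fit = expected < buf_capacity ? expected : buf_capacity;
    /* only whole scan lines can be requested from the scanner */
    size_t whole = (fit / bytes_per_line) * bytes_per_line;
    b.request_now = whole;
    b.carry_over  = expected - whole;   /* nothing gets lost */
    return b;
}
```

The invariant is simply request_now + carry_over == expected, so the estimate can never drift down to 0 bytes across calls.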
Isn't a SCANNER_BUF_SZ of 31744 *much* too small, and/or tuned
to the SCSI restrictions of old kernel versions? One should
be more flexible, let the frontend suggest an appropriate
value, and choose SCSIBUFFERMAX by default. I also suspect a
buffer overrun somewhere: after changing SCANNER_BUF_SZ to a
larger value I got segfaults, which disappeared after I declared
snapscan_scanner.buf to be u_char[SCANNER_BUF_SZ+4096]. This
really should be investigated.
And I wonder what measure_transferrate is doing: for some reason
it seems to be simulating the postprocessing overhead, presumably
to be included in the transferrate estimate. As I've said
above, I think that is "dead wrong", an absolute no-no !
But maybe I don't understand what the READ_TRANSTIME instruction
is supposed to do: are *we* reading the scanner's transfer speed
from the inquiry result, or is *the scanner* estimating our
capability to accept data by timing the two SCSI read instructions
that we're sending? That would also explain why we have to send
a request for 0 bytes to the scanner. It would be helpful to put
some explanations into snapscan's source code.
**** YES! Tests show that the scanner seems to be timing our
capability to accept data, and to adjust the ms/line value
accordingly! This puts everything in a different perspective:
- I can get uninterrupted scans by simulating a heavy system
load with usleep(1000) in the measure_transferrate routine
- I cannot do it by opening a pipe and simulating a write/read
cycle. The following extract from a debugging session (of a
real scan) indicates that writing a small amount to a newly
opened file is probably not representative of the average
transfer time:
...
write_msecs=0.68
write_msecs=0.06
write_msecs=1.51
write_msecs=0.03
write_msecs=0.70
write_msecs=0.05
write_msecs=0.69
write_msecs=0.68
write_msecs=0.07
write_msecs=1.54
write_msecs=0.04
write_msecs=0.73
write_msecs=0.07
write_msecs=1.16
write_msecs=0.05
write_msecs=0.69
write_msecs=0.06
write_msecs=0.69
write_msecs=0.03
write_msecs=11.83
write_msecs=103.50
write_msecs=0.08
write_msecs=0.72
write_msecs=0.07
write_msecs=0.70
write_msecs=0.03
write_msecs=1.57
write_msecs=0.07
write_msecs=0.68
write_msecs=0.05
write_msecs=0.69
...
- It seems what we should do is this:
If we want fast scans, the reader child should be asked
to get all of the data as fast as it can and either
- write the data to a pipe, or
- write the data to a shared memory region.
In both cases we increase the memory requirements of our backend.
If we use memory for the data transfer, one buffer certainly won't
suffice, since reading from it would block data acquisition.
What about a ring of three shared memory regions?
Ummh, postprocessing (like chroma correction) may need to
access two buffers for reading, which then leaves
only one for writing, which could be a bit delicate...
maybe we should allow N buffers and determine N later...
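The bookkeeping for such a ring of N buffers could look like this (a sketch with made-up names; real code would place head/tail in shared memory and synchronise access between the processes):

```c
#include <assert.h>

enum { NBUF = 3 };                   /* the "ring of three" from above */

typedef struct {
    unsigned head;                   /* next slot the child fills     */
    unsigned tail;                   /* next slot postprocessing reads */
} Ring;

static int ring_full(const Ring *r)   { return r->head - r->tail == NBUF; }
static int ring_empty(const Ring *r)  { return r->head == r->tail; }
static unsigned ring_slot(unsigned i) { return i % NBUF; }

/* Child finished filling one buffer; refuse if it would overwrite
   data the reader hasn't consumed yet. */
static int ring_push(Ring *r)
{
    if (ring_full(r))
        return 0;
    r->head++;
    return 1;
}

/* Reader finished draining one buffer. */
static int ring_pop(Ring *r)
{
    if (ring_empty(r))
        return 0;
    r->tail++;
    return 1;
}
```

With N = 3 the child can stay at most two buffers ahead of the reader; whether that is enough slack for chroma correction's two read buffers is exactly the open question above.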
Let me stress this again: the child's execution *must not*
depend on the sane_read requests coming from the frontend
(or, essentially equivalently, on the calls to SCSISource_get).
The child should get all the data at a speed that is negotiated
in the measure_transferrate routine! If we use memory (rather
than pipes), one buffer does *not* suffice!
Using shared memory would be the best solution, since the data
could be postprocessed in place, without having to copy it
through a source chain. One could use the mm-library for the
implementation of shared memory resources, but I don't know
how good and/or portable that is.
I have actually begun to work on a modified snapscan-clone
that uses shared memory. Unfortunately, this invalidates the
entire SourceChain architecture of the current version,
and inversion and chroma-correction have to be reimplemented.
I'll make this code available when (and if) it works.
Christian
PS:
*** Remarks on blocking/non-blocking mode
There seems to be some confusion, or at least a lack of clarity,
about what one should mean by these terms. Note that the SANE
specs don't say anything about it!
I would like to actually distinguish *three* different modes:
-- blocking mode (aka *strongly* blocking mode):
all of the data is fetched from the scanner (in the
first call to sane_read), then handed over to the frontend
in digestible pieces
-- simulated non-blocking mode:
the program runs in strongly blocking mode but asks
only for a convenient amount of data at a time;
the data is then handed out to the frontend
-- non-blocking mode:
the program runs in strongly blocking mode (for
the full image), but as a background process
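The third variant can be sketched with the classic fork/pipe pattern (a toy demo with invented names, not the snapscan code): the child streams the whole "image" at its own pace, while the parent drains the pipe with O_NONBLOCK, the way a frontend polling sane_read would.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

enum { IMAGE_BYTES = 4096, CHUNK = 512 };   /* stand-in for a full scan */

/* Returns the total number of image bytes the parent received. */
static long run_pipe_demo(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                    /* reader child: "scanner speed" */
        close(fds[0]);
        char chunk[CHUNK];
        memset(chunk, 0xAB, sizeof chunk);
        for (int sent = 0; sent < IMAGE_BYTES; sent += CHUNK)
            write(fds[1], chunk, sizeof chunk);
        close(fds[1]);
        _exit(0);
    }

    close(fds[1]);
    fcntl(fds[0], F_SETFL, O_NONBLOCK); /* parent never blocks on read */
    long total = 0;
    char buf[256];
    for (;;) {
        ssize_t n = read(fds[0], buf, sizeof buf);
        if (n > 0)
            total += n;
        else if (n == 0)
            break;                      /* writer closed: image complete */
        /* n < 0: EAGAIN means "no data yet"; a frontend would retry later */
    }
    waitpid(pid, 0, 0);
    return total;
}
```

Note how the child's write loop contains no reference to the parent's read pattern; that is precisely the "child must not depend on sane_read" property argued for above.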
--
Source code, list archive, and docs: http://www.mostang.com/sane/
To unsubscribe: echo unsubscribe sane-devel | mail majordomo@mostang.com
This archive was generated by hypermail 2b29 : Wed Jan 10 2001 - 01:57:25 PST