Re: scsi command queuing

From: abel deuring (a.deuring@satzbau-gmbh.de)
Date: Thu Jun 29 2000 - 05:14:25 PDT


    Oliver Rauch wrote:
    >
    > Hi,
    >
    > has someone experience with a sane backend and scsi command queueing?
    >
    > I am just working on it for the umax backend.
    >
    > At first I created some routines that replace the pipe used to transfer
    > the data from the reader_process to the main process; they use shared
    > memory instead (on systems where shared memory is available, otherwise
    > the pipe is used).
    >
    > Unfortunately it does not speed up scanning large images. It really
    > looks like the communication via the SCSI bus is not fast enough.

    In my experience, the scan speed (mainly the number of scan head stops)
    depends on quite a number of factors:

    - more or less broken scanner firmware
    - too little memory in the scanner's controller
    - slow responses by the host machine (backend; sanei_scsi layer; speed
    of the low level drivers)

    As Douglas stated in his response to your mail, I had quite some success
    with speeding up the Sharp JX250 with command queueing. Command queueing
    combined with a buffer size of 128 kB (or, for 400 dpi scans, 256 kB)
    avoids all scan head stops, at least if the JX250 is connected to an
    Adaptec 2940 or some NCR controller (sorry, I can't remember which one;
    it needs the ncr53c8xx driver). If the JX250 is connected to an Adaptec
    1542, 5 or 10 scan head stops remain.
    http://www.satzbau-gmbh.de/staff/abel/jx250perf.html shows the results of
    some speed tests (not for the 1542).

    On the other hand, I also tried forking and command queueing with the
    Microtek backend in order to speed up an old Microtek Scanmaker II,
    without any success. My conclusion was that the Microtek's firmware
    probably stops the scan head after each "read data" command, instead of
    scanning a little bit further "just in case" the next read command comes
    soon. The Scanmaker III does not show this behaviour -- at least gray
    scale scans are free of scan head stops most of the time.

    > So I added scsi command queueing into the umax backend. But I am not
    > sure how I can see
    > 1) how/if it works (sanei_scsi debug output is not good enough)

    "If": Well, the JX250 shows that command queueing works and can have
    some influence :) Regarding "how": sanei_scsi_open checks, how many
    commands can be queued by the Linux SCSI subsystem (there is a DBG
    statement showing this number). sanei_scsi_req_enter checks, if this
    queue is full; if it isn't, it sends the command to the SG driver, else
    it queues the command internally. sanei_scsi_req_wait wait for the
    oldest queueing command to finish; if there are any commands in the
    "sanei_scsi-internel" queue which not yet sent to the SG driver, they
    are sent, until the low level queue is again full, or the internal queue
    is completely sent.
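
    To make the intended usage a bit more concrete, a backend's reader loop
    could drive this queue roughly as in the sketch below. This is not the
    actual umax code; build_read_cdb, process_data and scan_finished stand
    for backend specific routines, and error checking is omitted:

    #include <stddef.h>
    #include <sane/sane.h>
    #include <sane/sanei_scsi.h>

    #define QUEUE_DEPTH 4           /* requests kept in flight */
    #define BUF_SIZE (128 * 1024)   /* cf. the 128 kB mentioned above */

    /* placeholders for backend specific code */
    extern void build_read_cdb (unsigned char *cdb, size_t count);
    extern void process_data (const unsigned char *data, size_t len);
    extern int scan_finished (void);

    static void
    reader_loop (int fd)
    {
      static unsigned char cdb[QUEUE_DEPTH][10];
      static unsigned char buf[QUEUE_DEPTH][BUF_SIZE];
      size_t len[QUEUE_DEPTH];
      void *id[QUEUE_DEPTH];
      int i;

      /* fill the queue: sanei_scsi_req_enter returns at once; the command
         goes either to the SG driver or, if the low level queue is full,
         into the sanei_scsi-internal queue */
      for (i = 0; i < QUEUE_DEPTH; i++)
        {
          build_read_cdb (cdb[i], BUF_SIZE);
          len[i] = BUF_SIZE;
          sanei_scsi_req_enter (fd, cdb[i], sizeof (cdb[i]),
                                buf[i], &len[i], &id[i]);
        }

      /* wait for the oldest request; the other queued reads keep the
         scanner busy meanwhile; then reuse the finished slot */
      for (i = 0; !scan_finished (); i = (i + 1) % QUEUE_DEPTH)
        {
          sanei_scsi_req_wait (id[i]);
          process_data (buf[i], len[i]);
          len[i] = BUF_SIZE;
          sanei_scsi_req_enter (fd, cdb[i], sizeof (cdb[i]),
                                buf[i], &len[i], &id[i]);
        }
    }

    The point is simply that several read commands are in flight while the
    backend is still busy with the data of an older one.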

    Henning Meier-Geinitz wrote:
    >
    > Hi,
    >
    > On Wed, Jun 28, 2000 at 05:26:23PM +0200, Oliver Rauch wrote:
    > > has someone experience with a sane backend and scsi command queueing?
    >
    > I have tried this some weeks ago without much success. To be exact: it is
    > possible to send more than one scsi_req_enter before scsi_req_wait and there
    > is no problem with this. But it isn't faster than waiting for each request
    > and then sending the next. I haven't looked deeply into the code (and I
    > don't understand much of SCSI) but the following lines looked suspicious:
    > (in scsi_req_wait())
    > /* Now issue next command asap, if any. We can't do this
    > earlier since the Linux kernel has space for just one big
    > buffer. */
    > issue (req->next);
    >
    > So if I understand this correctly the scsi_req_enter only schedules the
    > request and it will be sent to the driver whenever any pending request is
    > finished. Maybe the "Linux can only use one big buffer" is history with the
    > newer sg drivers?

    Right. I forgot to remove the comment you quoted above when I worked on
    sanei_scsi.c.

    > Same here. I looked at the scanning times when the backend does nothing but
    > getting data from the scanner and ignoring it. There was no big change in
    > scanning time (about 5 %). With the original SCSI adapter the Mustek
    > scanners are about twice as slow as with Windows despite large (4 MB) SCSI
    > buffers and tweaking the Linux SCSI driver.

    Which adapter is shipped with the Mustek? And do other adapters work
    better?

    Wolfgang Rapp wrote:

    > But I think the biggest bottleneck is the interaction between backend and
    > kernel after every SCSI command. The time this interaction takes (system
    > calls, kernel scheduling etc.) is currently too long to keep the scanner
    > running; the next scan command block should be sent by the driver as soon
    > as it receives the completion interrupt from the last one.

    Agreed.

    > If we talk about scan speed, we should think about extending the sg driver
    > with double buffering of data pages in kernel memory space and a command
    > block repeat count.

    Well, I think that the sanei_scsi_req_enter / sanei_scsi_req_wait
    mechanism should give similar results as command repeating, but the
    former is more flexible, because you can also send some status inquiries
    or whatever between two "read data" commands.
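
    For example (only a sketch, with error checking omitted; 0x34 should be
    the GET DATA BUFFER STATUS command of the SCSI-2 scanner command set),
    a backend could ask the scanner how much data is available while reads
    entered with sanei_scsi_req_enter are still queued:

    static SANE_Status
    query_buffer_status (int fd, unsigned char *status_data, size_t *status_len)
    {
      /* 12 bytes: status header plus one window descriptor */
      static unsigned char get_status[10] =
        { 0x34, 0, 0, 0, 0, 0, 0, 0, 12, 0 };

      /* sanei_scsi_cmd blocks until this command finishes; the reads
         already entered with sanei_scsi_req_enter stay queued meanwhile */
      return sanei_scsi_cmd (fd, get_status, sizeof (get_status),
                             status_data, status_len);
    }

    A kernel side "repeat count" could not interleave such a command.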

    > Filling one buffer by DMA from the SCSI hardware and copying the other one
    > out to user space in parallel, instead of waiting for interrupts. Maybe
    > somebody has looked more at the Linux sg driver source code than I have
    > and knows more about how it works. But then all backends would have to be
    > changed, because not all of it could be done in sanei_scsi.

    From the sanei_scsi viewpoint, Linux queueing is not that difficult:
    simply send as many commands as possible, and wait for the results :)
    Whether the kernel needs to store the read data in an internal buffer
    is not an important question for sanei_scsi. For the 2.0 and 2.2
    kernels this happens, but AFAIK the 2.4 kernels will support user space
    DMA. The write call to the SG driver, which starts a SCSI command,
    contains a pointer to the memory location where the backend wants the
    data to be written to. How many buffers are involved inside the kernel
    before the data is written to the user space buffer doesn't matter...
    well, it can of course be a performance issue.
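
    Just to illustrate: with the new (version 3) SG interface, a single
    command could look roughly like this (field names as in scsi/sg.h; only
    a sketch, not the actual sanei_scsi code, and error handling is
    omitted):

    #include <unistd.h>
    #include <string.h>
    #include <scsi/sg.h>

    /* read one block of scan data; cdb and the buffer come from the caller */
    static int
    read_block (int fd, unsigned char *cdb, int cdb_len,
                unsigned char *data, int data_len)
    {
      sg_io_hdr_t hdr;
      unsigned char sense[16];

      memset (&hdr, 0, sizeof (hdr));
      hdr.interface_id = 'S';
      hdr.dxfer_direction = SG_DXFER_FROM_DEV;
      hdr.cmd_len = cdb_len;
      hdr.cmdp = cdb;
      hdr.dxfer_len = data_len;
      hdr.dxferp = data;            /* where the backend wants the data */
      hdr.mx_sb_len = sizeof (sense);
      hdr.sbp = sense;
      hdr.timeout = 60 * 1000;      /* milliseconds */

      /* starts the SCSI command and returns as soon as it is queued */
      write (fd, &hdr, sizeof (hdr));

      /* blocks until the command has finished; the data has then arrived
         at hdr.dxferp, however many kernel buffers were involved on the
         way */
      read (fd, &hdr, sizeof (hdr));

      return hdr.status;
    }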

    Douglas Gilbert wrote:

    > Abel found in a few situations there is a benefit. My theory
    > is that queuing commands up against your adapter driver
    > (3 layers down within the kernel) gives better latencies
    > than queuing commands in the app (or just waiting for the
    > previous one to finish).

    Agreed. But to some - quite small, but perhaps important - extent, the
    latency is also a matter of the drivers and/or hardware involved: my
    tests with the Sharp JX250 showed that my NCR adapter gives slightly
    better performance on a Pentium 100 MHz machine than an Adaptec 2940.

    Now for a different idea. (If I'm going to talk nonsense, let me know.)
    If my memory is right, it is for example possible even with the ISA card
    aha1542 to issue SCSI commands with data blocks larger than 64 kB. Since
    the DMA block size for ISA cards is limited to 64 kB, this means that
    the kernel must organize more than one DMA transfer for one SCSI
    command. At present, these data are collected in a large buffer (or,
    with scatter-gather, in several buffers), and when a SCSI command
    finishes, all bytes are transferred at once to the user memory. In other
    words, the machine must have enough memory to buffer the entire data
    block (or even a second copy, if DMA to user space is not possible).
    This sets some limit on the reasonable data block size of a SCSI
    command: the size should of course not be larger than the physical
    memory installed; and since Unix is a multitasking OS, one should leave
    enough memory for other processes. Using data block sizes of more than a
    few hundred kB for SCSI commands is in my opinion a bad idea even on a
    workstation with 128 MB RAM or more.

    On the other hand, it might help to speed up a scan if only one or two
    read commands are issued for an entire scan. For higher resolutions and
    large scan windows, this means reading several dozen megabytes with just
    one command (an A4 page at 300 dpi in 24 bit colour already amounts to
    roughly 25 MB).

    OK, now, is there (or could there be) a way to set up something similar
    to piping, so that the data sent from the scanner for one SCSI command
    can be read in smaller chunks by the backend?

    The problem for Sane is that an implementation of this idea is a matter
    of the kernel, so we cannot hope to have it available for all the Unixes
    supported by Sane, but it could be implemented as an optional function.

    Abel

    --
    Source code, list archive, and docs: http://www.mostang.com/sane/
    To unsubscribe: echo unsubscribe sane-devel | mail majordomo@mostang.com
    


