Re: image buffering

David Mosberger-Tang (davidm@azstarnet.com)
Fri, 30 May 1997 08:38:21 -0700

>>>>> On Fri, 30 May 1997 09:05:06 -0400, "Michael K. Johnson" <johnsonm@redhat.com> said:

Michael> Since we *do* have a reliable, portable method that works
Michael> for finding the currently available disk space for the
Michael> partition pertaining to any particular directory (see
Michael> df...), and don't have a portable way to know how much
Michael> memory is available, it seems to me that this is saner
Michael> <groan>: if there's enough file space wherever the output
Michael> file is:

Michael> mmap output file
Michael> if mmap failed,
Michael> if there's enough space in (say) /var/tmp
Michael> create a temp file there
Michael> try to mmap the temp file (the previous mmap may have
Michael> failed because the output file wasn't seekable)
Michael> if mmap failed,
Michael> write directly to temporary file
Michael> if we still haven't figured out what to do
Michael> malloc and hope for the best
Michael> if malloc fails (unlikely) give an error message and exit.

Michael> Since mmap is only going to fail on systems that don't
Michael> support it (rare), or on pipes (which this deals with),
Michael> this really resolves to mmap by choice, then malloc.

I think I like a "malloc-or-temp-file" approach better, for the
following reasons:

- in some cases, you won't know how much to mmap(); since there
is no realloc() equivalent for mmap(), this could get ugly pretty
fast (it's certainly doable, but is it worth the trouble?)

- mmap() is, in my experience, the wrong thing when memory is short;
while I can't quote any numbers, it seems that (at least on Linux)
the VM system is quite aggressive about keeping mmap'ed pages in
RAM, which is bad for our purposes because we access the data in
a strictly sequential fashion (the worst-case scenario for
an LRU scheme when the working-set size exceeds the available RAM)

- for scanning purposes, the performance difference between
read()/write() and mmap() will be negligible since it's no
problem to read/write large chunks of data per system call

How about something like this:

  if output file is lseek'able
    tmp = output file
  else if we don't know the image size or if /var/tmp has enough space
    tmp = tmpfile ()
  else
    tmp = malloc ()
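
For illustration only, here is a rough C sketch of that decision. It is not
code from SANE: the names choose_tmp(), is_seekable(), has_space() and the
tmp_kind enum are made up for this example, an image_size of 0 is assumed to
mean "size unknown", and the free-space check uses POSIX statvfs(), which may
not be available everywhere. Note also that tmpfile() does not necessarily
create its file in /var/tmp, so the space check is only an approximation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/statvfs.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical tags for the three kinds of backing store. */
enum tmp_kind { TMP_OUTPUT_FILE, TMP_TEMP_FILE, TMP_MALLOC };

/* Nonzero if fd supports lseek(); pipes and FIFOs fail with ESPIPE. */
static int is_seekable(int fd)
{
    return lseek(fd, 0, SEEK_CUR) != (off_t)-1;
}

/* Nonzero if the file system holding dir has at least `need` bytes
 * available to unprivileged users.  Treats any error as "no space". */
static int has_space(const char *dir, size_t need)
{
    struct statvfs vfs;

    assert(dir != NULL);
    if (statvfs(dir, &vfs) != 0)
        return 0;
    return (unsigned long long)vfs.f_bavail * vfs.f_frsize >= need;
}

/* Pick a backing store.  Assumption: image_size == 0 means the
 * image size is not known in advance. */
static enum tmp_kind choose_tmp(int out_fd, size_t image_size)
{
    if (is_seekable(out_fd))
        return TMP_OUTPUT_FILE;   /* write into the output file itself */
    if (image_size == 0 || has_space("/var/tmp", image_size))
        return TMP_TEMP_FILE;     /* caller would then call tmpfile() */
    return TMP_MALLOC;            /* caller would then call malloc() */
}
```

The caller would switch on the returned tag and set up "tmp" accordingly;
keeping the decision separate from the allocation makes it easy to test with
a pipe as the output descriptor.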

Now, it would obviously be advantageous to make the accesses to "tmp"
abstract so the same code can be used regardless of whether "tmp"
refers to a FILE* or to a malloc'ed area. This could be done through
something like open_memstream(). But I think the way we use "tmp" is
so simple that open_memstream() would be overkill---it might be both
easier and more portable to simply cook our own abstract interface.
All that's really needed is a seek and a write-byte operation.
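
A minimal sketch of what such a home-cooked interface might look like is
below. All names here (struct tmp_store, tmp_seek(), tmp_write_byte()) are
hypothetical, and the sketch assumes a fixed-size malloc'ed area and file
offsets that fit in a long; it only shows the seek and write-byte operations
mentioned above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical store: exactly one of fp or mem is expected to be set. */
struct tmp_store {
    FILE *fp;            /* non-NULL: file-backed */
    unsigned char *mem;  /* non-NULL: memory-backed */
    size_t size;         /* capacity of mem in bytes */
    size_t pos;          /* current offset into mem */
};

/* Move to an absolute offset.  Returns 0 on success, -1 on failure. */
static int tmp_seek(struct tmp_store *t, size_t offset)
{
    assert(t != NULL);
    if (t->fp)
        return fseek(t->fp, (long)offset, SEEK_SET) == 0 ? 0 : -1;
    if (offset > t->size)
        return -1;       /* cannot seek past the malloc'ed area */
    t->pos = offset;
    return 0;
}

/* Write one byte at the current position and advance.
 * Returns 0 on success, -1 on failure. */
static int tmp_write_byte(struct tmp_store *t, unsigned char b)
{
    assert(t != NULL);
    if (t->fp)
        return putc(b, t->fp) == EOF ? -1 : 0;
    if (t->pos >= t->size)
        return -1;       /* memory area is full */
    t->mem[t->pos++] = b;
    return 0;
}
```

With this, the scanning code only ever calls tmp_seek() and tmp_write_byte()
and does not need to know which kind of store it was handed.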

--david
