Re: SANE V2

Nick Lamb (njl98r@ecs.soton.ac.uk)
Thu, 19 Aug 1999 02:45:53 +0100 (GMT)

On Mon, 16 Aug 1999, Andreas Beck wrote:

> Could you explain a little, what you mean ? I am afraid we are just
> misunderstanding each other. Note, that my mail forward was exactly this:
> A forward of vague thoughts, so I can understand, that it might cause
> confusion. I will try to clean that up a little now.

Well, in part the mail was as hostile as it was because your "vague
thoughts" gave the impression that SANE 1.x/2.0 has been discussed
behind the scenes, when it should really have been a big open debate
on SANE-devel as it is now. I don't like that very much :(

But I wouldn't have written it at all if the recommendations weren't
IMHO technically and politically unsound.

> > MIME is not a solution to your problems, even the MIME types system is
> > probably overkill, but MIME itself is definitely *not* what SANE needs to
> > support the scanners we've been talking about on the list.
>
> I do not get you on that. The scanners we are talking about can produce
> compressed image formats. If I get you right (please correct me), you
> propose to handle this by adding a few well defined new frametypes. Right ?

Yes. The contents of these frametypes will be exactly defined, preferably
by an independent standards body. If someone asks on the list "What is
SANE_FRAME_xxxx and how do I decode it", I want to be able to point at
a public standards document.
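
A minimal sketch of what "exactly defined" could look like from a
frontend's side, assuming an enumeration where every frame type maps to
one public document. The first five values mirror the SANE 1.0 frame
formats; FRAME_JFIF and FRAME_ASCII, their numeric values, and the
helper frame_spec() are hypothetical and only illustrate the idea.

```c
#include <stddef.h>

/* The first five values mirror the SANE 1.0 frame formats. The last two
 * are hypothetical additions for illustration; their names and numeric
 * values are assumptions, not part of any released SANE standard. */
typedef enum {
    FRAME_GRAY  = 0,
    FRAME_RGB   = 1,
    FRAME_RED   = 2,
    FRAME_GREEN = 3,
    FRAME_BLUE  = 4,
    FRAME_JFIF  = 100,  /* hypothetical */
    FRAME_ASCII = 101   /* hypothetical */
} frame_t;

/* Return the name of the public document that defines a frame type,
 * or NULL if this frontend does not know the frame type at all. */
const char *frame_spec(frame_t frame)
{
    switch (frame) {
    case FRAME_GRAY:
    case FRAME_RGB:
    case FRAME_RED:
    case FRAME_GREEN:
    case FRAME_BLUE:
        return "SANE Standard 1.0";
    case FRAME_JFIF:
        return "JPEG File Interchange Format specification";
    case FRAME_ASCII:
        return "US-ASCII";
    }
    return NULL;
}
```

The closed switch is the point: a frontend either knows which document
to decode a frame against, or it knows that it does not.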

> Now I propose to only add a single frametype, which redirects the exact
> description of what the datastream contains to some place, where there is
> "more space" than in a single integer.

Whilst this gives you infinite flexibility, I think this is a poor trade
for the uncertainty it foists on frontend writers, and worse -- users.

> 1. It is widely used.

SANE is unlikely to be interoperable with any of the MIME-based systems
which are currently deployed, mostly NNTP, HTTP, POP etc. So this is
not IMHO a big advantage. Also, we only want to re-use the minimal
subset (MIME types) if I understand you correctly, not all of MIME.

> 2. It handles virtually all fileformats out there, and it can and will be
> extended as need arises.

Handles here means nothing more than "has one or more names for". MIME
doesn't _know_ anything about these file formats, although the more
commonly used databases often recommend one or more file extensions for
identifying data of that type.

> 3. It allows to use an existing external database for handling unknown
> formats.

If you can give me a concrete and portable example of how SANE will
"handle unknown formats" using MIME, I will accept this point. Until
then I see no sign of this working for any of the other MIME aware
applications on Unix. Why would SANE be any different?

> If we allow any type to be transmitted, we as well to some extent encourage
> "bad" backends to be written, that will only send "application/something_
> weird" type data.

It's worse than this, backends will send image/tiff and you will accept
that, because after all it has a MIME type. Unfortunately TIFF is so
poorly defined (along with many other types in the MIME database) that
a frontend can only Save it to disk, and hope the user's got a copy of
Gimp or some other highly tolerant image processing software.

If you insist that backends stick to "accepted" MIME types, then you
lose any advantages of MIME, but if you permit them to start lunacy
like image/gif (or application/ms-word) then SANE becomes a joke --
might as well just tell people to access scanners over HTTP.

Note also that MIME itself has only VERY FEW formats, most of them
live their lives as image/x-foo-bar-baz or more recently in the
vendor controlled space. However, even those which are destined for
standardisation start off as image/x-blah-blah, so you must continue
to support this naming convention long after it expires. In MIME's
intended applications this wasn't a problem, but it leaves a nasty
taste in my mouth for SANE.
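
To make the naming problem concrete, here is a small sketch of the
check a frontend would end up carrying around: is this subtype in the
experimental "x-" space or the vendor "vnd." tree? The function names
are hypothetical, and the sketch assumes a plain "type/subtype" string
with no parameters.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive test that `s` starts with `prefix`.
 * `prefix` must be given in lowercase ASCII. */
static int has_prefix_ci(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (tolower((unsigned char)*s) != *prefix)
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Return 1 if the subtype is experimental ("x-") or in the vendor tree
 * ("vnd."), 0 if it is neither, and -1 if the string has no '/'. */
int mime_subtype_is_private(const char *media_type)
{
    const char *slash = strchr(media_type, '/');
    const char *subtype;

    if (slash == NULL)
        return -1;
    subtype = slash + 1;
    return has_prefix_ci(subtype, "x-") || has_prefix_ci(subtype, "vnd.");
}
```

Knowing that a type is private tells the frontend nothing about how to
decode it, which is the objection above.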

> This could be countered by requiring for a backend to be SANE compliant to
> support at least one of the RAW datatypes. This as well ensures continuing
> support for frontends, where saving to disk is not a good option, like
> when operating xscanimage in GIMP mode.

I think most people are agreed that backends should send SANE 1.0 frame
types (which I guess is what you mean by RAW) unless asked for something
"better" by the frontend or user. You can do this regardless of whether
SANE uses FRAME_JFIF or image/jpeg.
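
A rough sketch of that default rule, assuming a backend keeps a list of
the frame types it can produce. FRAME_JFIF, its value, and the function
select_frame() are hypothetical. Falling back to a SANE 1.0 type when
the requested type is unsupported is a design choice made for this
sketch, not something the standard says.

```c
#include <stddef.h>

/* SANE 1.0 frame formats plus one hypothetical compressed type. */
typedef enum {
    FRAME_GRAY  = 0,
    FRAME_RGB   = 1,
    FRAME_RED   = 2,
    FRAME_GREEN = 3,
    FRAME_BLUE  = 4,
    FRAME_JFIF  = 100   /* hypothetical */
} frame_t;

/* True for the five frame formats defined by SANE 1.0. */
static int is_sane_1_0(frame_t f)
{
    return (int)f >= (int)FRAME_GRAY && (int)f <= (int)FRAME_BLUE;
}

/* Pick the frame type a backend should send.
 * `requested` points at the type the frontend or user asked for, or is
 * NULL when nothing was requested. Returns 0 and stores the choice in
 * *out, or returns -1 if there is no SANE 1.0 type to fall back on. */
int select_frame(const frame_t *supported, size_t n,
                 const frame_t *requested, frame_t *out)
{
    size_t i;

    if (requested != NULL) {
        for (i = 0; i < n; i++) {
            if (supported[i] == *requested) {
                *out = *requested;
                return 0;
            }
        }
    }
    /* Nothing requested, or the request cannot be met: use the first
     * SANE 1.0 type the backend offers. */
    for (i = 0; i < n; i++) {
        if (is_sane_1_0(supported[i])) {
            *out = supported[i];
            return 0;
        }
    }
    return -1;
}
```

The same logic applies whether the "better" format is named by an
enumerator or by a MIME string; only the comparison changes.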

> > I'm not interested in having fairly bad support for digital cameras in SANE
> > when there's really good support for digital cameras in gPhoto. Since
> > we are now working toward TWAIN/Unix, it makes sense for gPhoto and SANE
> > to each implement TWAIN sources on top of their respective APIs.
>
> I actually do not see, why a digital camera needs to be handled differently.
> O.K. gPhoto exists, and it may well be good, but I do not see a principal
> problem to support digital cameras in SANE.

DO NOT RE-INVENT THE WHEEL

Developers won't work on SANE (no existing support, no frontend, no
support in the API) when they can work on gPhoto (excellent support,
good GUI and cmd-line tools, API designed for the job).

Would you expect scanner developers to desert SANE and work on gPhoto?

> And - as nice as the interest of the TWAIN group in Unix is ... TWAIN is
> IMHO not the greatest thing since sliced bread. Have a look at their own
> standard. It is 526 pages. If the SANE standard were that big, I'd never
> even consider writing an application for it.

Almost all of SANE is mandatory, almost all of TWAIN is optional -- not
a fair comparison. Standards documents for stuff you use every single
day are just as long, but you only read the essentials and everyone
muddles along just fine ;)

> I think the main benefits we will get from connecting SANE to TWAIN is not
> getting a nicer and unified interface, but commercial (or other) Software
> being ported from Windows more easily and SANE being accessible by Windows
> software.

Cool, whatever -- I think as a Gimp developer I'd like to be able to add
a single Import feature and pick gPhoto or XSane or SaneSlidePro or
whatever I had installed with TWAIN support. Nice to dream.

> O.K. - what is the problem with it ? I propose to _allow_ to transfer any
> file type. TWAIN supports audio. How do we make a real bridge layer, if we
> don't ?

This is really horrible reasoning. If you want to support audio just
because TWAIN can support audio, you should read those 500+ pages and
come back when you're sober.

> I am fully with you, that SANE is intended for scanners, and should target
> images, but if we can add other file types at no cost, Why not ?

The world is full of trade offs, the only protocol that comes to mind
which transports "other file types at no cost" is HTTP. Unfortunately
that's also a protocol so broken that it is on its third revision and
still "considered harmful". Oh, and it uses MIME, what a co-incidence.

> Why ? Basically every device has "options" that can be read out and changed.
> Regarding device control, SANE is very universal. It could be used to
> tune your TV or drive a toaster. And actually I am using a variant of the
> SANE option control system in another project to handle IO-devices.

It *could* be used to tune your TV, or drive a toaster. I *could* use
a copy of Stroustrup's "The C++ Programming Language" to learn C, but I
am not surprised that Stroustrup doesn't recommend or support this.
I'm happy that you re-used code from SANE in another project, if you
think it's particularly useful you could split that code out and offer
it to other free software projects.

If, OTOH, you propose that this means SANE really is EVERYTHING WITH
KNOBS ON NOW EASY, then I cannot agree. It's an imaging API, which
concentrates on scanners and their ilk. Nothing more, nothing less.

> The only image-centric stuff in SANE is the start/getparm/read/stop stuff.
>
> That's o.k., as for the scope of SANE, this kind of interface makes sense.

YES!

> Now we see demand for transferring data that is not accurately described by
> getparm. Thus I propose to do the simple thing and basically ignore getparm
> stuff and just transfer some streamed data.

NO! You already said "for the scope of SANE, this kind of interface makes
sense", but now you contradict yourself. You were right the first time.

> Now this alone would be not too good, as it leaves no idea what the data
> and its format actually is.
> Thus I propose to add a generic content description system for such data.
> Mime seems like a good choice, as said above.

Even if MIME was a good choice, unless you take fairly strict control
over the contents of the stream (as you would if accepting most of
the recommended new FRAME_TYPEs) you will quickly have a bunch of
proprietary apps communicating over a _read/_write API layer, not much
more than standard I/O really -- and therefore a waste of time.

> This now allows to transfer any datatype that fits into a stream of bytes
> without need for more sophisticated stuff like bandwidth negotiation.

Eh? Do not go down this road. sane_read(...) is defined over memory
and therefore "bandwidth" is an irrelevant notion.

> However it has the drawback, that the frontend will usually not be able to
> interpret the data. This is acceptable when the frontend would save to file
> anyway, but not very good when it wants to display the result or hand it
> over to a client application like the GIMP.
>
> Therefore, I propose, that every backend should have an option to select the
> output format. Every fully SANE compliant backend should support the option
> to transfer in RAW SANE format, to avoid this situation (where the frontend
> cannot make sense of the data, and cannot save it either for processing
> with tools that can) on user (or frontend) request.

Yes, good -- this should be the default, but stop calling it "RAW SANE"
please, because I guess you just mean uncompressed like in SANE 1.0.
We agree about this, though perhaps not quite about how it is achieved.

> To further simplify the process of writing front- or backends in a way that
> allows to finally generate RAW data (in the sense, that the frontend can
> make sense of the data) from common formats, I propose to have "middleends",
> that can attach between a front and a backend (like the net stuff or the
> dll backend does) and can convert from common filetypes to SANE RAW.

Write middleends that do whatever you like. This has nothing much to do
with the SANE standards process, because they're already permitted and
even encouraged for some situations.

> This allows to keep a backend driver simple, as if it is sending a data
> type, for which a converting middleend exists, it doesn't need to worry
> about internal conversion (which would mean code duplication in all backends
> that happen to get a given kind of native input format from their HW).

This can't work -- we've already agreed that backends must support RGB
or GREY or something "natively" by default. So all backends must convert
to RGB or GREY if the hardware can't do that itself.
Converting to JFIF, G3/4/n or most other compressed formats inline is
less silly, but I'll remain a sceptic until I see a good application.

> The other option to allow for full usage with compressed datatypes would be
> implementing conversion in the frontends. This is also not good, as it means
> code-duplication as well.

I don't like this either. But it's OK because the backends will always
convert to something a frontend can be expected to understand (if
you write a frontend that only reads 4bit GREY then you're on your own).

> So from the position of least code duplication, I think the middleends are a
> good solution. It also gives tuneability for the parameters (jpeg-quality)
> in a simple way, as a middleend can extend the optionlist.

Tunability may or may not exist in different hardware configurations,
and if JPEG compression isn't being done by the hardware, I'd prefer to
see it in the frontend, where I can get a usable progress bar, do it
offline, or whatever.

[Stuff about text support removed, I conceded this point after all that
XML arrived in my mail, and so SANE_FRAME_ASCII it is]

> > application/sane?
>
> That's the "proposed MIME-type" of the current data stream. It is using extra
> outband data (width and height) from the getparms operation, which all
> other formats we are talking of do not need, as they contain that inband.

This doesn't fill me with confidence about your use of MIME. The objective
here is not to clog up the world with a zillion mime types. It was also
already explained that many new types (even compressed ones) would need
the out-of-band data fields for decompression.
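
For reference, a sketch of how a frontend leans on those out-of-band
fields today for an uncompressed frame. The field names follow the
SANE 1.0 SANE_Parameters structure, redeclared here with plain int
fields so the example compiles without the SANE headers; the type
params_t and the helper expected_frame_bytes() are hypothetical.

```c
/* Field names follow the SANE 1.0 SANE_Parameters structure; the type is
 * redeclared here so the sketch compiles without the SANE headers. */
typedef struct {
    int format;           /* frame type */
    int last_frame;       /* non-zero for the final frame of an image */
    int bytes_per_line;
    int pixels_per_line;
    int lines;            /* -1 when the height is not known in advance */
    int depth;            /* bits per sample */
} params_t;

/* Return the number of bytes an uncompressed frame occupies, or -1 if
 * the parameters do not allow it to be computed up front. */
long expected_frame_bytes(const params_t *p)
{
    if (p->lines < 0 || p->bytes_per_line < 0)
        return -1;
    return (long)p->bytes_per_line * (long)p->lines;
}
```

None of this information travels in the data stream itself, so a format
description that ignores the parameters is incomplete for these frames.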

> I think the latter is more extensible, though it opens up more possibilities
> to "misuse" SANE for weird stuff, but I think that is a political rather
> than a technical question.

Hmm, like this?

This is FREE SOFTWARE, and as such it is not in our interests to promote
closed protocols, undocumented formats etc. which undermine future work.

Imagine an entity "BG" which is hostile to free software. They want to
write a proprietary scanning system, but they'd like to re-use our
work and preferably use SANE's good name at first (until they can get
50% market share) then kill SANE off later.

Currently they can't write a SANE compliant backend or frontend without
inviting competition from SANE developers or third parties who would
support free software alternatives. They could deliberately break
compatibility, but that's not going to help them get market share.

With my proposals this situation is preserved in SANE 1.1/2.0 because
the new _FRAME types are all locked up by standards bodies who are
committed to public standards and/or free software.

With your proposal, "BG" need merely add the compliant, but proprietary
image/x-bg-format to their backend and lock all advanced features in
the frontend to the new format -- tada! Proprietary SANE :(

> And I definitely think, that except for special applications, that is
> customized frontends, the frontend should never have to care. It should
> either be able to force RAW transmission (via converting middleends if
> needed) to be able to do stuff to the resulting image or to revert to saving
> to file.

No need for middleends here, backends MUST offer a SANE 1.0 FRAME format
by default, not least because this is polite. If you insist, this can be
a SHOULD, but no-one writing drivers has objected to MUST.

----

> To repeat again the general outline of my approach:
>
> 1. Add a well-known option that allows to select the transmission format.

Yes, fine. Exactly how to do this needs more thought.

> 2. Add a single new frametype that announces the transmission of arbitrarily
> formatted data. Add two identification fields somewhere that give the
> mimetype and optionally a proposed filename for the data, so one can
> correctly detect and handle the type of data. For "somewhere" I propose
> either the file stream itself, like it is done in metamail, which is a
> little awkward to handle in the frontend, or an extension to the parms
> struct which is easier to handle, but a bit less compatible.

No. As far as I can tell, the only people who support this option (you
included) won't have to live with the consequences. Are you working on
any of the stuff for which the extended SANE_FRAME list was proposed?

> 3. To simplify backend generation, middleends should be allowed that can
> extend the available options for 1. by on-the-fly converting data from
> backend format to others.

Today's SANE standard doesn't object to your writing on-the-fly middleend
software, and I won't stop you either. As proposed though, the
"from COMPRESSED to SANE 1.0" conversion is done in the backend anyway
and the "SANE 1.0 to COMPRESSED" stuff seems better placed in the
frontend, not least for performance reasons.

Basically our differences could be summed up as:

SANE_FRAME_BLAH vs image/x-wibble-foo

Where SANE_FRAME_BLAH is defined by us, the SANE community and
image/x-wibble-foo is defined by the next person who thinks they've
uniquely discovered run length encoding and/or binary trees.

Nick.

--
Source code, list archive, and docs: http://www.mostang.com/sane/
To unsubscribe: echo unsubscribe sane-devel | mail majordomo@mostang.com
