> My summer project is to write some sort of OCR software so I
> can help the Gutenberg Project; running Linux/AXP only, I'm not about
> to buy NT just for this one thing, and it's a good chance to learn
> image processing and neural networks. Image processing textbooks and
> C.M. Bishop's _Neural Networks For Pattern Recognition_ are in hand,
> so all I need is to buy a scanner that works with SANE and start
> working on code.
Yeah, I would like to see that, too.
> 3) Scale each letter down to a 32x32 array of pixels. Use
> Kohonen-Loeve feature extraction to derive a 32-element vector from
> that array; it's faster to run a neural network over a 32-element
> vector than 1024 pixel values.
I don't have references on this feature extractor handy, but judging from
the people involved it should be good.
Pay close attention to which feature extractor(s) you use. This is the
most important part of pattern recognition.
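For what it's worth, here is a minimal sketch of that kind of feature
extraction in Python with NumPy. I am assuming the transform meant above is
the Karhunen-Loeve transform (better known as PCA). The names fit_kl_basis
and extract_features are made up for illustration and are not taken from any
existing OCR package.

```python
import numpy as np

def fit_kl_basis(samples, n_components=32):
    """Learn a Karhunen-Loeve (PCA) basis from training glyphs.

    samples: array of shape (n_samples, n_pixels), one flattened glyph per row.
    Returns (mean, basis) where basis has shape (n_components, n_pixels).
    """
    X = np.asarray(samples, dtype=float)
    mean = X.mean(axis=0)
    # The rows of vt are the principal directions of the centered data,
    # ordered from largest to smallest variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def extract_features(glyph, mean, basis):
    """Project one glyph (e.g. a 32x32 array) onto the learned basis."""
    flat = np.asarray(glyph, dtype=float).ravel()
    return basis @ (flat - mean)
```

The basis only needs to be computed once from a training set of scaled
letters; after that each new 32x32 glyph costs a single matrix-vector product
to turn into a 32-element vector.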
> 4) Use a neural network to classify the vector, getting a
> letter and a confidence value.
A really well-chosen feature extractor should allow the use of simpler
methods like vector matching.
The general guideline is: come up with a good feature extractor.
While neural networks are a nice concept, you often cannot know in advance
whether they will train well for a given set of patterns.
I have seen a lot of software that relied too much on the NN and thus
didn't work very well on the low-quality data you often get from scanners.
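To make the vector-matching idea a bit more concrete, here is a rough sketch
in plain Python. It averages the feature vectors of each letter into one
template and picks the nearest template by Euclidean distance. The function
names and the confidence formula (one minus the ratio of best to second-best
distance) are my own assumptions for illustration, not an established method.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_templates(labelled_vectors):
    """Average the feature vectors of each letter into one template.

    labelled_vectors: iterable of (letter, vector) pairs.
    """
    sums, counts = {}, {}
    for letter, vec in labelled_vectors:
        if letter not in sums:
            sums[letter] = [0.0] * len(vec)
            counts[letter] = 0
        sums[letter] = [s + v for s, v in zip(sums[letter], vec)]
        counts[letter] += 1
    return {l: [s / counts[l] for s in sums[l]] for l in sums}

def classify(vec, templates):
    """Return (letter, confidence) for the template nearest to vec.

    Confidence is a hypothetical measure in [0, 1]: close to 1 when the
    best match is much nearer than the runner-up, 0 when they are tied.
    """
    ranked = sorted((euclidean(vec, t), letter) for letter, t in templates.items())
    best_dist, best_letter = ranked[0]
    if len(ranked) == 1:
        return best_letter, 1.0
    second_dist = ranked[1][0]
    if second_dist == 0.0:
        return best_letter, 0.0
    return best_letter, 1.0 - best_dist / second_dist
```

A low confidence value is a useful hint that the letter should be flagged
for a human proofreader rather than accepted blindly.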
If you encounter problems with low-contrast or noisy images, I could
contact one of our professors who works on image enhancement.
> What could SANE do to support OCR applications in particular?
> From my reading, I can't really think of anything special that the
> backend or frontend should support, but that may just indicate that I
> haven't done enough research.
The most important thing is to be able to control the scanner
(especially sheet feeders etc.) from the OCR app. SANE has support for this.
CU,Andy
-- Andreas Beck | Email : <becka@sunserver1.rz.uni-duesseldorf.de>
-- Source code, list archive, and docs: http://www.azstarnet.com/~axplinux/sane/ To unsubscribe: mail -s unsubscribe sane-devel-request@listserv.azstarnet.com