Re: [Bug-ocrad] Technical documentation summary readme.txt, page skew, H
From: Antonio Diaz Diaz
Subject: Re: [Bug-ocrad] Technical documentation summary readme.txt, page skew, Hough transform.
Date: Sat, 04 Mar 2006 15:21:04 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.12) Gecko/20050923
Chris K. Skinner wrote:
As you are probably aware, there are software patents in some countries.
Yes, in almost every country with a corrupt and/or fascist government.
If you had some kind of outline of the algorithms that were applied per
version of the software that would greatly help someone new coming in
fresh off the street to gain a quicker understanding of stuff in
general, and probably demonstrate to the world at large that you have
invented something new that could not be patented / stolen / claimed by
some greedy corporate dudes.
I sympathize with your idea, but I lack the time and the ability to
explain the algorithms I use or invent. On the other hand, a patent is
valid even if I invented it independently, so it won't be an effective
defense.
Do you have any design notes, bibliographic citations, web links to
information that you've made use of, or release notes on which algorithms
are being used / abandoned?
The short answer is no. I have looked into the source of gocr and
claraocr, but I haven't got anything from them. I use the Otsu algorithm
for binarization (as gocr does). Apart from this, I work mostly in a vacuum.
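For reference, Otsu's method picks the gray-level threshold that maximizes
the variance between the resulting dark and light pixel classes. The
following is only a minimal Python sketch of that idea, not the code used in
ocrad or gocr; the names `otsu_threshold` and `binarize` and the
histogram-based interface are assumptions made for illustration.

```python
def otsu_threshold(histogram):
    """Return the gray level t that maximizes between-class variance.

    histogram[i] is the number of pixels with gray level i.
    Pixels with level <= t form the dark class, the rest the light class.
    (Illustrative helper, not taken from any OCR program.)
    """
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of gray levels in the dark class so far
    for t, count in enumerate(histogram):
        w0 += count
        sum0 += t * count
        if w0 == 0:
            continue            # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break               # light class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor 1 / total**2
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map each gray level to 0 (dark, <= threshold) or 1 (light)."""
    return [1 if p > threshold else 0 for p in pixels]
```

The threshold is computed from the histogram alone, so the cost is one pass
over the image to build the histogram plus one pass over the gray levels.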
In the J. R. Parker book w/CD ROM "Algorithms For Image Processing And
Computer Vision" that I have read, the author provides a couple of
algorithm suggestions for combating the page skew angle issue. A
Hough-transform when applied to the dots of the bottoms of the bounding
boxes of glyphs results in a page skew angle in degrees (with his source
code, that is).
This has a number of problems:
- Hough-transform, and in general any transformation on the whole image,
is slow as hell.
- What if the line is not skewed but curved? (frequent in scanned books).
- "The bottoms of the bounding boxes of glyphs" are usually not aligned.
- etc...
This is why I expect working code, not suggestions, from possible
collaborators. (Show me the code, you know?) ;-)
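To make the suggestion under discussion concrete, here is a rough Python
sketch of a Hough-style skew estimate applied to a set of points such as the
bottoms of glyph bounding boxes. It is neither Parker's code nor anything
from ocrad; the function name `estimate_skew_degrees`, its parameters, and
the sum-of-squares scoring are all assumptions chosen for illustration. It
assumes straight text lines, so it does not address the curved-line or
misaligned-bottoms objections listed above.

```python
import math


def estimate_skew_degrees(points, max_angle=10.0, step=0.1, rho_bin=2.0):
    """Estimate the skew angle of roughly collinear point sets.

    points: (x, y) pixel coordinates, e.g. bottom-centers of glyph boxes.
    For each candidate angle, every point votes for the offset (rho) of the
    line through it at that angle. The angle whose votes are most
    concentrated (largest sum of squared bin counts) is returned.
    With y growing downward, a positive result means the baselines descend
    toward the right. (Hypothetical helper for illustration only.)
    """
    points = list(points)
    if not points:
        return 0.0

    best_angle, best_score = 0.0, -1
    n_steps = int(round(2 * max_angle / step))
    for k in range(n_steps + 1):
        angle = -max_angle + k * step
        theta = math.radians(angle)
        c, s = math.cos(theta), math.sin(theta)

        votes = {}
        for x, y in points:
            # Perpendicular offset of the line through (x, y) at this angle
            rho = y * c - x * s
            b = math.floor(rho / rho_bin)
            votes[b] = votes.get(b, 0) + 1

        score = sum(v * v for v in votes.values())
        if score > best_score:
            best_score, best_angle = score, angle
    return best_angle
```

The running time is proportional to the number of points times the number of
candidate angles, which is why restricting the input to a few points per
glyph, rather than every pixel of the image, matters for speed.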
By the way, ocrad's algorithms are designed to be resistant to page skew.
Another approach is to use angle-independent Complex-Number-Coefficient
Neural Networks to use as feature recognizers. The Japanese promoter of
these neural networks says that they are Affine-Transform insensitive,
and thereby can recognize a pattern that has been so transformed.
I would like to see this recognizing a page in less than five minutes
with good accuracy.
This too is just a theory. I don't have a copy of any books on
Complex-Number-Coefficient Neural Networks, or any source code from a
competent mathematician who has converted the advanced mathematics into
working C++ code examples.
Don't worry. "Advanced mathematics" are the software of the future... and
they will always be. ;-)
Regards,
Antonio.