Re: [Bug-ocrad] Not recognizing obvious text

bug-ocrad

From:	Antonio Diaz Diaz
Subject:	Re: [Bug-ocrad] Not recognizing obvious text
Date:	Tue, 24 Jan 2006 16:18:02 +0100
User-agent:	Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.12) Gecko/20050923

Tony Maro wrote:

There's not a way to limit the area of
the page you're doing OCR on is there?  Like a zone ocr?

Ocrad is able to do layout analysis and process only one of the resulting blocks. But this is likely to be slower and less reliable than cropping a part of the page as you are doing now.

I have a "crop" option on the to-do list for ocrad, and I am in fact so impressed with what you are doing that I think I am going to implement it in the next version (0.15).


The syntax could be like this:
`ocrad file.pbm --crop left,top,right,bottom'

and the meaning of left, top, right, bottom could be:
  - between 0.0 and 1.0: a fraction of the whole page
  - greater than 1: a pixel coordinate.

So `ocrad file.pbm --crop 0.0,0.0,0.5,0.5' would process the upper left quadrant, and `ocrad file.pbm --crop 0,0,500,500' would process the upper left 500x500 pixel square.
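
To make the proposed meaning of the four values concrete, here is a minimal Python sketch of how such an argument might be turned into a pixel rectangle. It is only an illustration of the rule described above, not ocrad's code: the names `to_pixels` and `parse_crop`, the clamping to the page size, and the 2480x3508 example page are all assumptions.

```python
def to_pixels(value, extent):
    """Convert one crop value to a pixel offset along an axis of length `extent`."""
    if value < 0:
        raise ValueError("crop values must not be negative")
    if value <= 1.0:
        # Between 0.0 and 1.0: a fraction of the whole page.
        return round(value * extent)
    # Greater than 1: already a coordinate. Clamping to the page is an assumption.
    return min(int(value), extent)


def parse_crop(spec, page_width, page_height):
    """Turn 'left,top,right,bottom' into integer pixel bounds (hypothetical helper)."""
    parts = spec.split(",")
    if len(parts) != 4:
        raise ValueError("expected left,top,right,bottom")
    left, top, right, bottom = (float(p) for p in parts)
    box = (
        to_pixels(left, page_width),
        to_pixels(top, page_height),
        to_pixels(right, page_width),
        to_pixels(bottom, page_height),
    )
    if box[0] >= box[2] or box[1] >= box[3]:
        raise ValueError("crop rectangle is empty")
    return box


# Hypothetical page of 2480x3508 pixels.
print(parse_crop("0.0,0.0,0.5,0.5", 2480, 3508))  # (0, 0, 1240, 1754)
print(parse_crop("0,0,500,500", 2480, 3508))      # (0, 0, 500, 500)
```

Note that a value of exactly 1 is treated here as the fraction 1.0 (the whole extent), since the rule only calls values greater than 1 coordinates; a real implementation might decide that case differently.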

Bet you guys never thought ocrad would be used for that, eh? ;-)


Really no. :-)

So, anyone have an idea that might speed up the process?

I think the proposed "crop" option would speed up the process a lot, because ocrad could do the cropping first, then the rotation. All without creating intermediate files.
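
The ordering argument can be sketched in a few lines of Python. This is only an illustration on an in-memory bitmap held as a list of rows, not how ocrad works internally; the helpers `crop` and `rotate90` are hypothetical, and the sketch assumes the crop rectangle is given in the coordinates of the unrotated page.

```python
def crop(bitmap, left, top, right, bottom):
    """Return the sub-rectangle [left, right) x [top, bottom) of a list-of-rows bitmap."""
    return [row[left:right] for row in bitmap[top:bottom]]


def rotate90(bitmap):
    """Rotate a list-of-rows bitmap 90 degrees clockwise."""
    return [list(column) for column in zip(*bitmap[::-1])]


# A small stand-in "page": 6 rows by 4 columns, each pixel labelled row*10 + column.
page = [[r * 10 + c for c in range(4)] for r in range(6)]

# Crop first, so the rotation only has to move the pixels that will be kept.
region = rotate90(crop(page, 0, 0, 2, 3))
print(region)  # [[20, 10, 0], [21, 11, 1]]

rotated_pixels = len(region) * len(region[0])
page_pixels = len(page) * len(page[0])
print(rotated_pixels, "of", page_pixels, "pixels rotated")  # 6 of 24 pixels rotated
```

Because both steps operate on the bitmap in memory, nothing is written to disk between them, and the cost of the rotation scales with the cropped area instead of the whole page.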

Yes, you read that right.  I'll be processing as much as
150,000 pages per day on one server, and am designing this process so it
could be clustered to handle more.


Some day you have to tell me who you are working for. ;-)


Regards,
Antonio.

Current Thread

[Bug-ocrad] Not recognizing obvious text, Tony Maro, 2006/01/20
- Re: [Bug-ocrad] Not recognizing obvious text, Antonio Diaz Diaz, 2006/01/21
  - Re: [Bug-ocrad] Not recognizing obvious text, Tony Maro, 2006/01/22
    - Re: [Bug-ocrad] Not recognizing obvious text, Antonio Diaz Diaz <=
    - Re: [Bug-ocrad] Not recognizing obvious text, Tony Maro, 2006/01/24
    - Re: [Bug-ocrad] Not recognizing obvious text, Antonio Diaz Diaz, 2006/01/24
