Re: [Gnumed-devel] Re: Scanning Xsane, gscan2pdf, Simple Scan, Tesseract
From: Karsten Hilbert
Subject: Re: [Gnumed-devel] Re: Scanning Xsane, gscan2pdf, Simple Scan, Tesseract OCR
Date: Tue, 26 Jan 2010 16:20:20 +0100
User-agent: Mutt/1.5.20 (2009-06-14)
On Mon, Jan 25, 2010 at 11:41:03PM +0100, Karsten Hilbert wrote:
> > For GNUmed to be able to access such a layer in within-patient searches,
> > would it be necessary for such PDFs to have been imported twice, and/or to
> > use some additional tool to "split" the document into two parts (one an
> > image part, and one the text part)?
>
> It would be possible to implement the access to the text part inside
> GNUmed. Actually using that in a search would, however, presently
> require exporting each and every document and trying to search it.
>
> That could, indeed, only be mitigated by splitting the text part
> into a separate for-search table upon import.
>
> Except that GNUmed already has that table: blobs.doc_desc, of which
> there can be any number per document. In fact, we should probably
> extend the per-patient and across-patients search to look at those!
Which we apparently already do, of course :-)
One concept of the GNUmed document archive is that it tries
hard *not* to concern itself with the particulars of the
document part file types. It delegates that as much as
possible. Hence splitting / appropriately importing PDF
parts is up to the environment.
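
As a rough illustration of what "up to the environment" could look
like, here is a minimal sketch that pulls the text layer out of a
scanned PDF before import. It assumes poppler's pdftotext is installed;
the function names are hypothetical and not part of GNUmed.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftotext_cmd(pdf_path, txt_path):
    """Return the argv that extracts a PDF's text layer with pdftotext."""
    # -layout asks pdftotext to preserve the physical page layout
    return ["pdftotext", "-layout", str(pdf_path), str(txt_path)]


def split_text_layer(pdf_path):
    """Write the text layer next to the PDF and return the .txt path.

    Returns None when pdftotext is not available on this machine.
    """
    if shutil.which("pdftotext") is None:
        return None
    pdf_path = Path(pdf_path)
    txt_path = pdf_path.with_suffix(".txt")
    # check=True raises CalledProcessError if the extraction fails
    subprocess.run(build_pdftotext_cmd(pdf_path, txt_path), check=True)
    return txt_path
```

The extracted text could then be attached to the document as a
description (blobs.doc_desc) at import time; how that step is done
inside GNUmed is not shown here.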
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346