gnumed-devel

Re: [Gnumed-devel] doc_med , txt blob


From: Karsten Hilbert
Subject: Re: [Gnumed-devel] doc_med , txt blob
Date: Sat, 16 Sep 2006 14:35:04 +0200
User-agent: Mutt/1.5.13 (2006-08-11)

On Sat, Sep 16, 2006 at 05:26:05PM +1000, Ian Haywood wrote:

> That's a good question. PostgreSQL can do text-searching well, but we
> would waste a lot of time searching through big radiology JPEGs that
> will never have the text we want.
I think so, too. I think Ian has covered this well. GNUmed
should not attempt searching inside BLOBs. That is exactly
why they are stored as BLOBs.

> I had thought clin.result.val_alpha was the proper place for PIT-type
> results. (i.e. a single unparseable blob of plain ASCII text)
Yep, that's the other logical place for PIT text. In this
case it's either/or, depending on what the importer wants.
Syan likely chose the BLOB approach because the currently
released GUI already handles it, unlike
clin.result.val_alpha.

> No, wait, we've got doc_obj.fk_intended_reviewer now, cool.
Well, courtesy of your suggestion :-)

> This means clin.result and friends will be empty on AU systems
> (we simply never get any atomic results data to put in there, and so don't
> need its other meta-data)
Yes.

> Is blobs.doc_obj.fk_intended_reviewer set to NULL when someone has
> reviewed the document?
No. It is used to document who is *responsible* for this
document. Reviewing can be done by anyone but only one
person is responsible for it. Of course, responsibility can
be transferred.

> (otherwise the query may get slow checking against
> blobs.reviewed_doc_objs for every new document.)
I shouldn't think so :-) If there's no row in
blobs.reviewed_doc_objs then there's no review.
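
To illustrate the "no row means no review" rule, here is a minimal sketch of
such a check. It runs against an in-memory SQLite database as a stand-in for
PostgreSQL, with schema prefixes dropped. Only doc_obj, reviewed_doc_objs and
fk_intended_reviewer come from this thread; the columns pk, fk_reviewed_row
and fk_reviewer, and both function names, are assumptions made up for the
example and need not match the real GNUmed schema.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory stand-in for the blobs tables (hypothetical columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE doc_obj (
            pk INTEGER PRIMARY KEY,
            fk_intended_reviewer INTEGER
        );
        CREATE TABLE reviewed_doc_objs (
            fk_reviewed_row INTEGER REFERENCES doc_obj(pk),
            fk_reviewer INTEGER
        );
        INSERT INTO doc_obj VALUES (1, 10), (2, 10), (3, 20);
        -- object 1 was reviewed by someone other than the responsible person
        INSERT INTO reviewed_doc_objs VALUES (1, 20);
        """
    )
    return conn


def unreviewed_objects(conn, reviewer):
    """Return pks of objects `reviewer` is responsible for that have no review row."""
    rows = conn.execute(
        """
        SELECT o.pk
        FROM doc_obj AS o
        WHERE o.fk_intended_reviewer = ?
          AND NOT EXISTS (
              SELECT 1 FROM reviewed_doc_objs AS r
              WHERE r.fk_reviewed_row = o.pk
          )
        ORDER BY o.pk
        """,
        (reviewer,),
    ).fetchall()
    return [pk for (pk,) in rows]
```

In this sketch fk_intended_reviewer is never touched by a review, so
responsibility and review status stay independent. Whether the anti-join is
fast on a real database depends on indexing of the review table, which the
sketch does not address.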

> IMHO we could use doc_desc for PIT documents (which was originally
> for OCR of scans)
Not only originally. It is still very much intentionally
open for such use.

> so searchable text data and non-text-searchable binary
> data are separate.
>
> This means PIT files would have a doc_med and a doc_desc
> entry, but no doc_obj.
In fact, I was just about to suggest that:

Convert the PIT file to proper UTF-8. Store that in
blobs.doc_obj.data. Also store it as properly encoded text
in blobs.doc_desc, and link that doc_desc row to the
appropriate doc_med. The doc_obj content is authoritative
(because doc_desc can always be regenerated from it).

So we get the best of both worlds: proper BLOB handling and
searchable text.
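
The scheme above could be sketched roughly as follows. This is only an
illustration under stated assumptions, not the GNUmed importer: it uses an
in-memory SQLite database instead of PostgreSQL, drops the blobs. schema
prefix, and invents the columns pk, fk_doc and description as well as the
function names and the source_encoding parameter. Only doc_med, doc_obj,
doc_desc and doc_obj.data are taken from this thread.

```python
import sqlite3

# Hypothetical, heavily simplified stand-in for the blobs tables.
SCHEMA = """
CREATE TABLE doc_med  (pk INTEGER PRIMARY KEY);
CREATE TABLE doc_obj  (pk INTEGER PRIMARY KEY,
                       fk_doc INTEGER REFERENCES doc_med(pk),
                       data BLOB);
CREATE TABLE doc_desc (pk INTEGER PRIMARY KEY,
                       fk_doc INTEGER REFERENCES doc_med(pk),
                       description TEXT);
"""


def store_pit(conn, raw, source_encoding="ascii"):
    """Store one PIT file as an authoritative BLOB plus a searchable text copy."""
    text = raw.decode(source_encoding)   # raises if the declared encoding is wrong
    utf8_bytes = text.encode("utf-8")    # the "proper UTF-8 file"
    doc_pk = conn.execute("INSERT INTO doc_med DEFAULT VALUES").lastrowid
    conn.execute(
        "INSERT INTO doc_obj (fk_doc, data) VALUES (?, ?)", (doc_pk, utf8_bytes)
    )
    conn.execute(
        "INSERT INTO doc_desc (fk_doc, description) VALUES (?, ?)", (doc_pk, text)
    )
    return doc_pk


def regenerate_desc(conn, doc_pk):
    """Rebuild the doc_desc text from the authoritative doc_obj content."""
    (data,) = conn.execute(
        "SELECT data FROM doc_obj WHERE fk_doc = ?", (doc_pk,)
    ).fetchone()
    conn.execute(
        "UPDATE doc_desc SET description = ? WHERE fk_doc = ?",
        (bytes(data).decode("utf-8"), doc_pk),
    )
```

Text searches then only ever touch doc_desc, never the BLOB column, and a
damaged or outdated doc_desc row can be rebuilt with regenerate_desc().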

> (I understand why you may want blobs.reviewed_doc_objs so
> you can track reviewers of individual pages, but surely you
> would not direct different pages of one logical document to
> different reviewers?)
I would not *direct* them towards different reviewers, no.
We can eventually rethink that a bit. For now it works fine
as is.

Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346



