Re: [Gnumed-devel] doc_med , txt blob
From: Karsten Hilbert
Subject: Re: [Gnumed-devel] doc_med , txt blob
Date: Fri, 15 Sep 2006 16:00:24 +0200
User-agent: Mutt/1.5.13 (2006-08-11)
On Thu, Sep 14, 2006 at 08:22:33PM +0800, Syan Tan wrote:
> how to store a doc_obj that is just a text file ?
My knee-jerk reaction would be, why, of course, dump it
into doc_obj.data.
While this would certainly work and not lose any data, I'll
venture a rephrasing of the question:
How to store a text blob as a document and not lose
*information* ?
Bytea will not lose data but it will lose information unless
the data is self-descriptive to some degree. PDF is
self-descriptive, "text" is not. The latter needs to be
accompanied by at least one piece of metadata to make it
safely transferable by purely technical means: the
encoding.
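
To make the data-versus-information point concrete, here is a small
Python sketch. The sample word and the two candidate encodings are
picked purely for illustration; nothing here comes from the GNUmed
code itself.

```python
# The same stored bytes turn into different text depending on which
# encoding the reader assumes. The bytes survive; the meaning does not.
raw = "Di\u00e4t".encode("latin-1")   # b'Di\xe4t', as written by a Latin-1 editor

as_latin1 = raw.decode("latin-1")     # what the author meant
as_cp1251 = raw.decode("cp1251")      # what a Cyrillic-locale reader would see

# Read as UTF-8 the bytes are not even valid.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(repr(as_latin1), repr(as_cp1251), valid_utf8)
```

Without knowing the encoding there is no technical way to tell which
of the readings is the right one.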
So, there are a number of solutions:
- Convert the text into UTF-x, create a Unicode file with the
proper start-of-file marker (the byte order mark) and store
that in doc_obj.data. Probably the cleanest solution and the
one to recommend.
- Store the text in doc_obj.data and keep the encoding
information elsewhere, such as in doc_desc, comments, etc.
- Store the text in doc_desc where it is properly encoded
and keep a special value in doc_obj.data pointing to doc_desc.
- Store an enriched version (custom format) of the text in
doc_obj.data which contains the encoding in a
computationally extractable way (such as XML).
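
The first option could be sketched in Python roughly as follows. The
function names text_to_blob and blob_to_text are made up for this
example, and UTF-8 is assumed as the target encoding; how the bytes
then get into doc_obj.data is left out.

```python
import codecs


def text_to_blob(text: str) -> bytes:
    """Encode text as UTF-8 with a leading byte order mark (BOM)."""
    # The "utf-8-sig" codec prepends the BOM when encoding.
    return text.encode("utf-8-sig")


def blob_to_text(blob: bytes) -> str:
    """Decode a blob written by text_to_blob, refusing unmarked data."""
    if not blob.startswith(codecs.BOM_UTF8):
        raise ValueError("no UTF-8 BOM found, encoding unknown")
    # "utf-8-sig" strips the BOM again when decoding.
    return blob.decode("utf-8-sig")


blob = text_to_blob("Di\u00e4t")
print(blob)                 # the bytes to store in the bytea field
print(blob_to_text(blob))
```

Refusing blobs without the marker is a choice made for this sketch;
a real reader might rather fall back to asking the user.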
I'd suggest either the first or the last approach. The first
is preferable, I suppose.
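
For the last approach, one possible shape of such an enriched format
is a tiny XML document whose XML declaration names the encoding. This
is only a sketch: the element name "text" and the helpers wrap_text
and unwrap_text are invented here, not an existing GNUmed format.

```python
import xml.etree.ElementTree as ET


def wrap_text(text: str, encoding: str = "iso-8859-1") -> bytes:
    """Wrap text in a minimal XML document stored in the given encoding."""
    root = ET.Element("text")   # hypothetical element name
    root.text = text
    # For encodings other than UTF-8 and US-ASCII, tostring() emits an
    # XML declaration naming the encoding, so the blob describes itself.
    return ET.tostring(root, encoding=encoding)


def unwrap_text(blob: bytes) -> str:
    """Recover the text; the parser reads the encoding from the blob."""
    return ET.fromstring(blob).text or ""


blob = wrap_text("Di\u00e4t")
print(blob)
print(unwrap_text(blob))
```

The reader needs no outside hint about the encoding, which is the
whole point, at the cost of the blob no longer being plain text.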
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346