Re: [libextractor] clarification on libextractor functionality and ev
From: Christian Grothoff
Subject: Re: [libextractor] clarification on libextractor functionality and ev
Date: Wed, 30 Mar 2011 07:24:23 +0200
User-agent: KMail/1.13.5 (Linux/2.6.35-28-generic; KDE/4.5.1; i686; ; )
On Tuesday, March 29, 2011 11:15:03 pm nijil yes wrote:
> Hi,
>
> I am a student planning to work on a project for the Xapian search and
> indexing library. My work would be to replace the current content and
> metadata extraction mechanism, which uses external filter programs. A
> filter is run for every file format, which increases the CPU footprint.
> I would like to replace these external filter programs with shared
> libraries like the one provided by libextractor. If it's not too much
> trouble, please clarify the following:
>
> 1: Does it also support extracting the content of a file, rather than
> just metadata?
In principle this would be possible, but none of the plugins that have been
implemented do this, and it is not the intent of the library.
> 2: Do file format identification and extraction happen implicitly, or
> does libextractor implement a mechanism where an external filter program
> is run for each file format? If that is the case, then there is no need
> to replace the current Xapian system with this.
File format identification happens internally; LE runs each plugin, and the
plugin then decides whether it can handle the given format. Naturally, most
plugins usually terminate quickly after a brief look at the file header.
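To make the dispatch idea above concrete, here is a minimal sketch in C. It
is not libextractor's actual API: plugin_fn, the three *_plugin functions and
run_plugins are hypothetical names made up for illustration. Every plugin is
offered the same buffer, peeks at the header, and returns immediately if the
format is not its own.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical plugin signature (an assumption, not the LE API): return 1
 * and set *format if the plugin recognizes the data, 0 otherwise. */
typedef int (*plugin_fn)(const unsigned char *data, size_t size,
                         const char **format);

static int png_plugin(const unsigned char *data, size_t size,
                      const char **format)
{
    static const unsigned char magic[8] =
        { 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n' };

    /* Brief look at the header; bail out early on a mismatch. */
    if (size < sizeof magic || memcmp(data, magic, sizeof magic) != 0)
        return 0;
    *format = "image/png";
    return 1;
}

static int pdf_plugin(const unsigned char *data, size_t size,
                      const char **format)
{
    if (size < 5 || memcmp(data, "%PDF-", 5) != 0)
        return 0;
    *format = "application/pdf";
    return 1;
}

static int gif_plugin(const unsigned char *data, size_t size,
                      const char **format)
{
    if (size < 6 || (memcmp(data, "GIF87a", 6) != 0 &&
                     memcmp(data, "GIF89a", 6) != 0))
        return 0;
    *format = "image/gif";
    return 1;
}

static const plugin_fn plugins[] = { png_plugin, pdf_plugin, gif_plugin };

/* Offer the buffer to every plugin in turn. Returns how many plugins
 * accepted it; *format is set by the first one that did, or NULL. */
int run_plugins(const unsigned char *data, size_t size, const char **format)
{
    size_t i;
    int hits = 0;

    *format = NULL;
    for (i = 0; i < sizeof plugins / sizeof plugins[0]; i++) {
        const char *f = NULL;
        if (plugins[i](data, size, &f)) {
            if (hits == 0)
                *format = f;
            hits++;
        }
    }
    return hits;
}
```

Calling run_plugins() on a buffer that starts with "%PDF-" would report one
accepting plugin and the format "application/pdf"; the other two plugins
return after a single memcmp(). No external process is started at any point,
which is the difference from a filter-program design.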
> 3: Are there any file formats that you would like to see supported by
> libextractor but that currently are not?
Always ;-). There is a TODO list in the distribution.
> 4: Could you suggest any alternatives for the requirements I mentioned
> above? I specifically require C/C++ libraries.
libmime is somewhat related, other than that, I don't know any C/C++ libraries
doing something similar.
Happy hacking,
Christian