[GNUnet-SVN] r9956 - Extractor-docs/WWW


From: gnunet
Subject: [GNUnet-SVN] r9956 - Extractor-docs/WWW
Date: Fri, 1 Jan 2010 20:54:36 +0100

Author: grothoff
Date: 2010-01-01 20:54:36 +0100 (Fri, 01 Jan 2010)
New Revision: 9956

Added:
   Extractor-docs/WWW/extractor.html
Removed:
   Extractor-docs/WWW/demo.php3
   Extractor-docs/WWW/do_upload.php
   Extractor-docs/WWW/news
   Extractor-docs/WWW/oldnews.html
Modified:
   Extractor-docs/WWW/documentation.html
   Extractor-docs/WWW/download.html
   Extractor-docs/WWW/index.html
Log:
docu

Deleted: Extractor-docs/WWW/demo.php3
===================================================================
--- Extractor-docs/WWW/demo.php3        2010-01-01 19:54:34 UTC (rev 9955)
+++ Extractor-docs/WWW/demo.php3        2010-01-01 19:54:36 UTC (rev 9956)
@@ -1,37 +0,0 @@
-<?php
-$title="libExtractor - Demo";
-$email="address@hidden";
-$keywords="keyword, extraction, mp3, html, pdf, images, jpeg, gif, ps, mime";
-$author="Vids Samanta";
-$page="demo";
-include("html_header.php3");
-W("You can see how the tool works by uploading a file you want to extract 
keywords from here.");
-echo "<form enctype=\"multipart/form-data\" method=\"post\" 
action=\"do_upload.php\">";
-P();
-W("File to Upload:");
-BR();
-echo "<input type=\"file\" name=\"img1\" size=\"30\">";
-P();
-W("Run with binary extractor (choose language, or none):");
-$vals=array("Danish"=>"1",
-           "English"=>"2", 
-           "Finnish"=>"3",
-           "Frensh"=>"4",
-           "Gaelic"=>"5",
-           "German"=>"6",
-           "Italian"=>"7",
-           "Norwegian"=>"8",
-           "Portuguese"=>"9",
-           "Swedish"=>"10");
-foreach($vals as $tlang=>$number) {
-  W($tlang);
-  echo "<input type=\"radio\" NAME=\"binary\" VALUE=\"$number\">";
-}
-W("none:");
-echo "<input type=\"radio\" NAME=\"binary\" VALUE=\"0\" Checked>";
-P();
-echo "<input type=\"submit\" name=\"submit\" value=\"Run Demo\">";
-W("This demo is limited to files smaller than 16 MB.");
-echo "</form>";
-include("html_footer.php3"); 
-?>

Deleted: Extractor-docs/WWW/do_upload.php
===================================================================
--- Extractor-docs/WWW/do_upload.php    2010-01-01 19:54:34 UTC (rev 9955)
+++ Extractor-docs/WWW/do_upload.php    2010-01-01 19:54:36 UTC (rev 9956)
@@ -1,61 +0,0 @@
-<?php
-$title="libextractor Demo Results!";
-$email="address@hidden";
-$keywords="keyword, extraction, mp3, html, pdf, images, jpeg, gif, ps, mime";
-$author="Vids Samanta";
-$page="demo";
-include("html_header.php3");
-
-if ($_FILES['img1'] != "") {
-  if ( ($_FILES['img1']['tmp_name'] == 'none') || 
-       ($_FILES['img1']['tmp_name'] == '') ) {
-    die(W_("File empty or too big."));
-  }
-  $dest="/tmp/".$_FILES['img1']['name'] ;  
-  copy($_FILES['img1']['tmp_name'], $dest)
-    or die(W_("Couldn't copy the file!")); 
-  
-} else{
-  die(W_("File upload failed!"));
-}
-H1("libextractor Demo Results");
-$binary = $_POST['binary'];
-switch ( $binary ) {
- case 1:
-   $flag = "-B da";
-   break;
- case 2:
-   $flag = "-B en";
-   break;
- case 3:
-   $flag = "-B fi";
-   break;
- case 4:
-   $flag = "-B fr";
-   break;
- case 5:
-   $flag = "-B ga";
-   break; 
- case 6:
-   $flag = "-B ge";
-   break;
- case 7:
-   $flag = "-B it";
-   break;
- case 8:
-   $flag = "-B no";
-   break;
- case 9:
-   $flag = "-B pt";
-   break;
- case 10:
-   $flag = "-B sv";
-   break;
- default:
-   $flag = "";
- }
-exec("/home/libextractor/bin/extract $flag \"$dest\"", $arr);
-foreach ($arr as $val)
-  print "$val <br>\n";
-include("html_footer.php3");
-?>

Modified: Extractor-docs/WWW/documentation.html
===================================================================
--- Extractor-docs/WWW/documentation.html       2010-01-01 19:54:34 UTC (rev 
9955)
+++ Extractor-docs/WWW/documentation.html       2010-01-01 19:54:36 UTC (rev 
9956)
@@ -27,7 +27,7 @@
 <tr><th nowrap="nowrap" bgcolor="99BBFF"><a 
href="documentation.html">Documentation</a></th></tr>
 <tr><td bgcolor="efefef"><a href="#copyright">Copyright</a></td></tr><tr><td 
bgcolor="efefef"><a href="#install">Installation</a></td></tr>
 <tr><td bgcolor="efefef"><a href="#usage">Usage</a></td></tr><tr><td 
bgcolor="efefef"><a href="#plugins">Plugins</a></td></tr>
-<tr><th nowrap="nowrap" bgcolor="99BBFF"><a href="oldnews.html">Old 
News</a></th></tr>
+<tr><th nowrap="nowrap" bgcolor="99BBFF"><a href="extractor.html">Reference 
Manual</a></th></tr>
 <tr><th nowrap="nowrap" bgcolor="99BBFF"><a href="http://freshmeat.net/projects/libextractor/">Freshmeat Page</a></th></tr>
 </tbody>
 </table>
@@ -209,6 +209,7 @@
                      NULL, 0, 
                      &EXTRACTOR_meta_data_print, stdout);
   EXTRACTOR_plugin_remove_all (plugins);
+  return 0;
 }
 </pre>
 </p>

Modified: Extractor-docs/WWW/download.html
===================================================================
--- Extractor-docs/WWW/download.html    2010-01-01 19:54:34 UTC (rev 9955)
+++ Extractor-docs/WWW/download.html    2010-01-01 19:54:36 UTC (rev 9956)
@@ -29,6 +29,7 @@
   <tr><td bgcolor="efefef"><a href="#rpm">RPM</a></td></tr>
   <tr><td bgcolor="efefef"><a href="#srpm">SRC RPM</a></td></tr>
 <tr><th nowrap="nowrap" bgcolor="99BBFF"><a 
href="documentation.html">Documentation</a></th></tr>
+<tr><th nowrap="nowrap" bgcolor="99BBFF"><a href="extractor.html">Reference 
Manual</a></th></tr>
 <tr><th nowrap="nowrap" bgcolor="99BBFF"><a href="http://freshmeat.net/projects/libextractor/">Freshmeat Page</a></th></tr>
 </tbody>
 </table>

Added: Extractor-docs/WWW/extractor.html
===================================================================
--- Extractor-docs/WWW/extractor.html                           (rev 0)
+++ Extractor-docs/WWW/extractor.html   2010-01-01 19:54:36 UTC (rev 9956)
@@ -0,0 +1,2125 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html401/loose.dtd">
+<html>
+<!-- This manual is for GNU libextractor
+(version 0.6.0, 1 January 2010),
+which is GNU's library for meta data extraction.
+
+Copyright C 2007, 2010 Christian Grothoff
+
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3
+or any later version published by the Free Software Foundation;
+with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
+Texts.  A copy of the license is included in the section entitled "GNU
+Free Documentation License".
+
+ -->
+<!-- Created on January, 1 2010 by texi2html 1.78 -->
+<!--
+Written by: Lionel Cons <address@hidden> (original author)
+            Karl Berry  <address@hidden>
+            Olaf Bachmann <address@hidden>
+            and many others.
+Maintained by: Many creative people.
+Send bugs and suggestions to <address@hidden>
+
+-->
+<head>
+<title>The GNU libextractor Reference Manual</title>
+
+<meta name="description" content="The GNU libextractor Reference Manual">
+<meta name="keywords" content="The GNU libextractor Reference Manual">
+<meta name="resource-type" content="document">
+<meta name="distribution" content="global">
+<meta name="Generator" content="texi2html 1.78">
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<style type="text/css">
+<!--
+a.summary-letter {text-decoration: none}
+pre.display {font-family: serif}
+pre.format {font-family: serif}
+pre.menu-comment {font-family: serif}
+pre.menu-preformatted {font-family: serif}
+pre.smalldisplay {font-family: serif; font-size: smaller}
+pre.smallexample {font-size: smaller}
+pre.smallformat {font-family: serif; font-size: smaller}
+pre.smalllisp {font-size: smaller}
+span.roman {font-family:serif; font-weight:normal;}
+span.sansserif {font-family:sans-serif; font-weight:normal;}
+ul.toc {list-style: none}
+-->
+</style>
+
+
+</head>
+
+<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" 
vlink="#800080" alink="#FF0000">
+
+<a name="Top"></a>
+<a name="SEC_Top"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="settitle">The GNU libextractor Reference Manual</h1>
+<p>This manual is for GNU libextractor
+(version 0.6.0, 1 January 2010),
+which is GNU's library for meta data extraction.
+</p>
+<p>Copyright &copy; 2007, 2010 Christian Grothoff
+</p>
+<blockquote><p>Permission is granted to copy, distribute and/or modify this 
document
+under the terms of the GNU Free Documentation License, Version 1.3
+or any later version published by the Free Software Foundation;
+with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
+Texts.  A copy of the license is included in the section entitled &quot;GNU
+Free Documentation License&quot;.
+</p></blockquote>
+
+<p>GNU libextractor is a GNU package.
+</p>
+<table class="menu" border="0" cellspacing="0">
+<tr><td align="left" valign="top"><a href="#SEC1">1. 
Introduction</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">        
         What is <acronym>GNU libextractor</acronym>.
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC2">2. 
Preparation</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">         
         What you should do before using the library.
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC4">3. 
Generalities</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">        
         General library functions and data types.
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC5">4. Extracting meta 
data</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">         How to 
use <acronym>GNU libextractor</acronym> to obtain meta data.
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC10">5. Language 
bindings</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">            
How to use <acronym>GNU libextractor</acronym> from languages other than C.
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC17">6. Utility 
functions</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">           
 Utility functions of <acronym>GNU libextractor</acronym>.
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC20">7. Existing 
Plugins</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">             
What plugins are available.
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC21">8. Writing new 
Plugins</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">          
How to write new plugins for <acronym>GNU libextractor</acronym>.
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC22">9. Internal utility 
functions</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">   Utility 
functions of <acronym>GNU libextractor</acronym> for writing plugins.
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC23">10. Reporting 
bugs</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">               
How to report bugs or request new features.
+</td></tr>
+<tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
+Appendices
+
+</pre></th></tr><tr><td align="left" valign="top"><a href="#SEC24">A. GNU 
GENERAL PUBLIC LICENSE</a></td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">                     The GNU General Public License says how you
+                                can copy and share some parts of <acronym>GNU 
libextractor</acronym>.
+</td></tr>
+<tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
+Indices
+
+</pre></th></tr><tr><td align="left" valign="top"><a href="#SEC27">Concept 
Index</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">               
Index of concepts and programs.
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC28">Function and Data 
Index</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">     Index of 
functions, variables and data types.
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC29">Type 
Index</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">               
   Index of data types.
+</td></tr>
+<tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
+</pre></th></tr></table>
+
+
+
+<hr size="1">
+<a name="Introduction"></a>
+<a name="SEC1"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC_Top" title="Previous 
section in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC2" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[ &lt;&lt; ]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC2" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="chapter"> 1. Introduction </h1>
+
+<p><acronym>GNU libextractor</acronym> is GNU's library for extracting meta 
data from
+files.  Meta data includes format information (such as mime type,
+image dimensions, color depth, recording frequency), content
+descriptions (such as document title or document description) and
+copyright information (such as license, author and contributors).
+Meta data extraction is an inherently uncertain business &mdash; a parse
+error can be a corrupt file, an incompatibility in the file format
+version, an entirely different file format or a bug in the parser.  As
+a result of this uncertainty, <acronym>GNU libextractor</acronym> deliberately
+avoids reporting any errors.  Unexpected file contents simply
+result in less, or possibly no, meta data being extracted.
+</p>
+<a name="IDX1"></a>
+<p><acronym>GNU libextractor</acronym> uses plugins to handle various file 
formats.
+Technically a plugin can support multiple file formats; however, most
+plugins only support one particular format.  By default,
+<acronym>GNU libextractor</acronym> will use all plugins that are available 
and found
+in the plugin installation directory.  Applications can
+request the use of only specific plugins or the exclusion of
+certain plugins.
+</p>
+<p><acronym>GNU libextractor</acronym> is distributed with the 
<code>extract</code> 
+command<a name="DOCF1" href="#FOOT1">(1)</a> which is a command-line tool for 
extracting
+meta data.  <code>extract</code> is given a list of filenames and 
+prints the resulting meta data to the console.  The <code>extract</code>
+source code also serves as an advanced example for how to use
+<acronym>GNU libextractor</acronym>.  
+</p>
+<p>This manual focuses on providing documentation for writing software
+with <acronym>GNU libextractor</acronym>.  The only part relevant to end-users
+is the chapter on compiling and installing <acronym>GNU libextractor</acronym>
+(see section <a href="#SEC2">Preparation</a>).  The chapter on existing plugins may also be of
+interest (see section <a href="#SEC20">Existing Plugins</a>).  Additional documentation for
+end-users can be found in the man page for <code>extract</code> (using
+<tt>man extract</tt>).
+</p>
+<a name="IDX2"></a>
+<p><acronym>GNU libextractor</acronym> is licensed under the GNU General 
Public License.  The
+developers have frequently received requests to license GNU
+libextractor under alternative terms.  However, <acronym>GNU 
libextractor</acronym>
+borrows plenty of GPL-licensed code from various other projects.
+Hence we cannot change the license (even if we wanted to).<a name="DOCF2" 
href="#FOOT2">(2)</a>
+</p>
+<hr size="6">
+<a name="Preparation"></a>
+<a name="SEC2"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC1" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC3" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC1" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC4" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="chapter"> 2. Preparation </h1>
+
+<p>Compiling <acronym>GNU libextractor</acronym> follows the standard GNU 
autotools
+build process using <code>configure</code> and <code>make</code>.  For
+details, read the &lsquo;<tt>INSTALL</tt>&rsquo; file and query 
+<tt>./configure --help</tt> for additional options.
+</p>
+<p><acronym>GNU libextractor</acronym> has various dependencies, some of which 
are optional. 
+Instead of specifying the names of the software packages, we
+will give the list in terms of the names of the respective
+Debian (unstable) packages that should be installed.
+</p>
+<p>You absolutely need:
+</p>
+<ul>
+<li>
+libtool
+</li><li>
+gcc
+</li><li>
+make
+</li><li>
+g++ 
+</li><li>
+libltdl7-dev
+</li><li>
+zlib1g-dev
+</li><li>
+libbz2-dev
+</li></ul>
+
+<p>Recommended dependencies are:
+</p><ul>
+<li>
+libgtk2.0-dev
+</li><li>
+libvorbis-dev
+</li><li>
+libflac-dev
+</li><li>
+libgsf-1-dev
+</li><li>
+libmpeg2-4-dev
+</li><li>
+libqt4-dev
+</li><li>
+librpm-dev
+</li><li>
+libpoppler-dev
+</li><li>
+libexiv2-dev
+</li></ul>
+
+<p>Optional dependencies (to make use of these, you also need to specify
+the configure option <code>--enable-ffmpeg</code>)
+are:
+</p><ul>
+<li>
+libavformat-dev
+</li><li>
+libswscale-dev
+</li></ul>
+
+<p>For Subversion access and compilation one also needs:
+</p><ul>
+<li>
+subversion
+</li><li>
+autoconf
+</li><li>
+automake
+</li></ul>
+
+<p>Please notify us if we missed some dependencies (note that the list is
+supposed to only list direct dependencies, not transitive
+dependencies).
+</p>
+<p>Once you have compiled and installed <acronym>GNU libextractor</acronym>, 
you should have a file
+&lsquo;<tt>extractor.h</tt>&rsquo; installed in your 
&lsquo;<tt>include/</tt>&rsquo; directory.  This
+file should be the starting point for your C and C++ development with
+<acronym>GNU libextractor</acronym>.  The build process also installs the 
&lsquo;<tt>extract</tt>&rsquo; binary and
+man pages for &lsquo;<tt>extract</tt>&rsquo; and <acronym>GNU 
libextractor</acronym>.  The &lsquo;<tt>extract</tt>&rsquo; man page
+documents the &lsquo;<tt>extract</tt>&rsquo; tool.  The <acronym>GNU 
libextractor</acronym> man page gives a brief
+summary of the C API for <acronym>GNU libextractor</acronym>.
+</p>
+<a name="IDX3"></a>
+<a name="IDX4"></a>
+<a name="IDX5"></a>
+<a name="IDX6"></a>
+<a name="IDX7"></a>
+<p>When you install <acronym>GNU libextractor</acronym>, various plugins will 
be
+installed in the &lsquo;<tt>lib/libextractor/</tt>&rsquo; directory.  The main 
library
+will be installed as &lsquo;<tt>lib/libextractor.so</tt>&rsquo;.  Note that
+<acronym>GNU libextractor</acronym> will attempt to find the plugins relative 
to the
+path of the main library.  Consequently, a package manager can move
+the library and its plugins to a different location later &mdash; as long
+as the relative path between the main library and the plugins is
+preserved.  As a last resort, the user can set the
+environment variable <tt>LIBEXTRACTOR_PREFIX</tt>.  If
+<acronym>GNU libextractor</acronym> cannot locate a plugin, it will look in
+<tt>LIBEXTRACTOR_PREFIX/lib/libextractor/</tt>.
+</p>
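The last-resort lookup described above can be sketched in C as follows. This is only an illustration of the path layout the text describes, not the library's actual lookup code; the helper names `plugin_dir_from_prefix` and `fallback_plugin_dir` are hypothetical and not part of the GNU libextractor API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the libextractor API): writes
   "PREFIX/lib/libextractor/" into buf.  Returns 0 on success and
   -1 if the prefix is missing or empty, or if buf is too small. */
int
plugin_dir_from_prefix (const char *prefix, char *buf, size_t size)
{
  static const char suffix[] = "/lib/libextractor/";
  size_t plen;

  assert (buf != NULL);
  if ((prefix == NULL) || (prefix[0] == '\0'))
    return -1;
  plen = strlen (prefix);
  /* sizeof (suffix) already counts the terminating NUL byte. */
  if (plen + sizeof (suffix) > size)
    return -1;
  memcpy (buf, prefix, plen);
  memcpy (buf + plen, suffix, sizeof (suffix));
  return 0;
}

/* Hypothetical wrapper: takes the prefix from the environment
   variable LIBEXTRACTOR_PREFIX named in the text above. */
int
fallback_plugin_dir (char *buf, size_t size)
{
  return plugin_dir_from_prefix (getenv ("LIBEXTRACTOR_PREFIX"), buf, size);
}
```

A caller would try the location relative to the main library first and only fall back to `fallback_plugin_dir` when no plugin was found there.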
+<hr size="6">
+<a name="SEC3"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC2" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC4" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC2" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC2" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC4" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="section"> 2.1 Note to package maintainers </h2>
+
+<p>The suggested way to package GNU libextractor is to split it into
+roughly the following binary packages:<a name="DOCF3" href="#FOOT3">(3)</a>
+</p>
+<ul>
+<li>
+libextractor (main library only, only hard dependency for other packages 
depending on GNU libextractor)
+</li><li>
+extract (command-line tool and man page)
+</li><li>
+libextractor-dev (extractor.h header and man page)
+</li><li>
+libextractor-doc (this manual)
+</li><li>
+libextractor-plugins (plugins without external dependencies; recommended but 
not required by extract and libextractor package)
+</li><li>
+libextractor-plugin-XXX (plugin with dependency on libXXX, for example for 
XXX=mpeg this would be &lsquo;<tt>libextractor_mpeg.so</tt>&rsquo;)
+</li><li>
+libextractor-plugins-all (meta package that requires all plugins)
+</li></ul>
+
+<p>This would enable minimal installations (e.g. for embedded systems) to
+not include any plugins, as well as moderate-size installations (that
+do not pull in GTK, Qt and X11) for systems that have limited
+resources.
+</p>
+
+<hr size="6">
+<a name="Generalities"></a>
+<a name="SEC4"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC3" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC5" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC2" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC5" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="chapter"> 3. Generalities </h1>
+
+<p>Each public symbol exported by <acronym>GNU libextractor</acronym> has the 
prefix
+<tt>EXTRACTOR_</tt>.  All-caps names are used for constants.  For the
+impatient, the minimal C code for using <acronym>GNU libextractor</acronym> 
(on the
+executing binary itself) looks like this:
+</p>
+<pre class="verbatim">#include &lt;extractor.h&gt;
+int main(int argc, char ** argv) {
+  struct EXTRACTOR_PluginList *plugins
+    = EXTRACTOR_plugin_add_defaults (EXTRACTOR_OPTION_DEFAULT_POLICY);
+  EXTRACTOR_extract (plugins, argv[1],
+                     NULL, 0, 
+                     &amp;EXTRACTOR_meta_data_print, stdout);
+  EXTRACTOR_plugin_remove_all (plugins);
+  return 0;
+}
+</pre>
+
+<hr size="6">
+<a name="Extracting-meta-data"></a>
+<a name="SEC5"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC4" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC6" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC4" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="chapter"> 4. Extracting meta data </h1>
+
+<table class="menu" border="0" cellspacing="0">
+<tr><td align="left" valign="top"><a href="#SEC6">4.1 Plugin 
management</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">   How to 
load and unload plugins
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC7">4.2 Meta 
types</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">          
About meta types
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC8">4.3 Meta 
formats</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">        
About meta formats
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC9">4.4 
Extracting</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">          
How to use the extraction API
+</td></tr>
+</table>
+
+
+<hr size="6">
+<a name="Plugin-management"></a>
+<a name="SEC6"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC5" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC7" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC5" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC5" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="section"> 4.1 Plugin management </h2>
+
+
+<p>All of the functions for loading and unloading plugins, including
+<tt>EXTRACTOR_plugin_add_defaults</tt> and 
<tt>EXTRACTOR_plugin_remove_all</tt>,
+are thread-safe and reentrant.  However, using the same plugin list
+from multiple threads at the same time is not safe.  Creating multiple
+plugin lists and using them concurrently is supported as long as
+the <code>EXTRACTOR_OPTION_IN_PROCESS</code> option is not used. 
+</p>
+<p>Generally, <acronym>GNU libextractor</acronym> is fully thread-safe and 
mostly reentrant.
+All plugin code is expected to be reentrant and state-less,
+but due to the extensive use of 3rd party libraries this cannot
+be guaranteed.  Hence plugins are executed (by default) out of
+process.  This also ensures that plugins that crash do not cause
+the main application to fail as well.  
+</p>
+<p>Plugins can be executed in-process by giving the option
+<code>EXTRACTOR_OPTION_IN_PROCESS</code> when loading the plugin.  This
+option is only recommended when debugging plugins and not for
+production use.  Due to the use of shared-memory IPC the
+out-of-process execution of plugins should not be a concern for
+performance.
+</p>
+
+<dl>
+<dt><u>C Struct:</u> <b>EXTRACTOR_PluginList</b>
+<a name="IDX8"></a>
+</dt>
+<dd><a name="IDX9"></a>
+
+<p>A plugin list represents a set of GNU libextractor plugins.  Most of
+the GNU libextractor API is concerned with either constructing a
+plugin list or using it to extract meta data.  The internal representation
+of the plugin list is of no concern to users or plugin developers.
+</p></dd></dl>
+
+
+<dl>
+<dt><u>Function:</u> void <b>EXTRACTOR_plugin_remove_all</b><i> (struct 
EXTRACTOR_PluginList *plugins)</i>
+<a name="IDX10"></a>
+</dt>
+<dd><a name="IDX11"></a>
+
+<p>Unload all of the plugins in the given list.
+</p></dd></dl>
+
+<dl>
+<dt><u>Function:</u> struct EXTRACTOR_PluginList * 
<b>EXTRACTOR_plugin_remove</b><i> (struct EXTRACTOR_PluginList *plugins, const 
char*name)</i>
+<a name="IDX12"></a>
+</dt>
+<dd><a name="IDX13"></a>
+
+<p>Unloads a particular plugin.  The given name should be the short name of 
the plugin, for example &ldquo;mime&rdquo; for the mime-type extractor or 
&ldquo;mpeg&rdquo; for the MPEG extractor.
+</p></dd></dl>
+
+<dl>
+<dt><u>Function:</u> struct EXTRACTOR_PluginList * 
<b>EXTRACTOR_plugin_add</b><i> (struct EXTRACTOR_PluginList *plugins, const 
char* name,const char* options, enum EXTRACTOR_Options flags)</i>
+<a name="IDX14"></a>
+</dt>
+<dd><a name="IDX15"></a>
+
+<p>Loads a particular plugin.  The plugin is added to the existing list, which can be NULL.  The second argument specifies the name of the plugin (e.g. &ldquo;ogg&rdquo;).  The third argument can be NULL and specifies plugin-specific options.  Finally, the last argument specifies whether the plugin should be executed out-of-process (<code>EXTRACTOR_OPTION_DEFAULT_POLICY</code>) or in-process (<code>EXTRACTOR_OPTION_IN_PROCESS</code>).
+</p></dd></dl>
+
+<dl>
+<dt><u>Function:</u> struct EXTRACTOR_PluginList * 
<b>EXTRACTOR_plugin_add_config</b><i> (struct EXTRACTOR_PluginList *plugins, 
const char* config, enum EXTRACTOR_Options flags)</i>
+<a name="IDX16"></a>
+</dt>
+<dd><a name="IDX17"></a>
+
+<p>Loads and unloads plugins based on a configuration string, modifying the 
existing list, which can be NULL.  The string has the format 
&ldquo;[-]NAME(OPTIONS){:[-]NAME(OPTIONS)}*&rdquo;.  Prefixing the plugin name 
with a &ldquo;-&rdquo; means that the plugin should be unloaded.
+</p></dd></dl>
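+<p>As an illustration of the configuration string format described above, the 
following self-contained C sketch splits such a string into its entries.  The 
type <tt>struct plugin_entry</tt> and the function 
<tt>parse_plugin_config</tt> are hypothetical helpers written for this example 
and are not part of the GNU libextractor API.  The sketch also assumes that 
the parenthesized options may be omitted and contain neither &ldquo;:&rdquo; 
nor &ldquo;)&rdquo;.
+</p>

```c
#include <stddef.h>
#include <string.h>

/* One parsed entry of a configuration string (hypothetical helper type). */
struct plugin_entry
{
  int unload;          /* 1 if the name was prefixed with "-" */
  char name[64];       /* plugin short name, for example "mime" */
  char options[128];   /* text between "(" and ")", or "" if absent */
};

/* Split CONFIG at ':' and fill at most MAX entries of OUT.
   Returns the number of entries parsed, or -1 on a malformed entry. */
static int
parse_plugin_config (const char *config, struct plugin_entry *out, int max)
{
  const char *p = config;
  int n = 0;

  while ('\0' != *p && n < max)
    {
      struct plugin_entry *e = &out[n];
      const char *end = p + strcspn (p, ":");   /* end of this entry */
      const char *open;
      size_t name_len;
      size_t opt_len = 0;

      e->unload = ('-' == *p);
      if (e->unload)
        p++;
      open = memchr (p, '(', (size_t) (end - p));
      name_len = (size_t) ((NULL != open ? open : end) - p);
      if (0 == name_len || name_len >= sizeof (e->name))
        return -1;
      if (NULL != open)
        {
          if (')' != end[-1])
            return -1;                          /* missing closing ")" */
          opt_len = (size_t) (end - open) - 2;  /* text between "(" and ")" */
          if (opt_len >= sizeof (e->options))
            return -1;
          memcpy (e->options, open + 1, opt_len);
        }
      e->options[opt_len] = '\0';
      memcpy (e->name, p, name_len);
      e->name[name_len] = '\0';
      n++;
      p = ('\0' == *end) ? end : end + 1;       /* skip the ":" separator */
    }
  return n;
}
```

+<p>With this sketch, the string &ldquo;mime:-html&rdquo; yields two entries: 
&ldquo;mime&rdquo; to be loaded and &ldquo;html&rdquo; to be unloaded.  
Applications would normally pass the string directly to 
<code>EXTRACTOR_plugin_add_config</code> instead of parsing it themselves.
+</p>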
+
+<dl>
+<dt><u>Function:</u> struct EXTRACTOR_PluginList * 
<b>EXTRACTOR_plugin_add_defaults</b><i> (enum EXTRACTOR_Options flags)</i>
+<a name="IDX18"></a>
+</dt>
+<dd><a name="IDX19"></a>
+
+<p>Loads all of the plugins in the plugin directory.  This function is what 
most <acronym>GNU libextractor</acronym> applications should use to set up the 
plugins.
+</p></dd></dl>
+
+
+
+<hr size="6">
+<a name="Meta-types"></a>
+<a name="SEC7"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC6" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC8" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC5" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC5" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="section"> 4.2 Meta types </h2>
+
+
+
+<p><tt>enum EXTRACTOR_MetaType</tt> is a C enum which defines a list of over 
100 different types of meta data.  The total number can differ between 
different <acronym>GNU libextractor</acronym> releases; the maximum value for 
the current release can be obtained using the 
<tt>EXTRACTOR_metatype_get_max</tt> function.  All values in this enumeration 
are of the form <tt>EXTRACTOR_METATYPE_XXX</tt>.
+</p>
+<dl>
+<dt><u>Function:</u> const char * <b>EXTRACTOR_metatype_to_string</b><i> (enum 
EXTRACTOR_MetaType type)</i>
+<a name="IDX20"></a>
+</dt>
+<dd><a name="IDX21"></a>
+<a name="IDX22"></a>
+<a name="IDX23"></a>
+
+<p>The function <tt>EXTRACTOR_metatype_to_string</tt> can be used to obtain a 
short English string &lsquo;<samp>s</samp>&rsquo; describing the meta data 
type.  The string can be translated into other languages using GNU gettext with 
the domain set to <acronym>GNU libextractor</acronym> 
(<tt>dgettext(&quot;libextractor&quot;, s)</tt>).  
+</p></dd></dl>
+
+<dl>
+<dt><u>Function:</u> const char * <b>EXTRACTOR_metatype_to_description</b><i> 
(enum EXTRACTOR_MetaType type)</i>
+<a name="IDX24"></a>
+</dt>
+<dd><a name="IDX25"></a>
+<a name="IDX26"></a>
+<a name="IDX27"></a>
+
+<p>The function <tt>EXTRACTOR_metatype_to_description</tt> can be used to 
obtain a longer English string &lsquo;<samp>s</samp>&rsquo; describing the meta 
data type.  The description may be empty if the short description returned by 
<code>EXTRACTOR_metatype_to_string</code> is already comprehensive.  The string 
can be translated into other languages using GNU gettext with the domain set to 
<acronym>GNU libextractor</acronym> (<tt>dgettext(&quot;libextractor&quot;, 
s)</tt>).  
+</p></dd></dl>
+
+
+
+<hr size="6">
+<a name="Meta-formats"></a>
+<a name="SEC8"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC7" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC9" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC5" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC5" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="section"> 4.3 Meta formats </h2>
+
+
+<p><tt>enum EXTRACTOR_MetaFormat</tt> is a C enum which defines on a high 
level how the extracted meta data is represented.  Currently, the library uses 
three formats: UTF-8 strings, C strings and binary data.  A fourth value, 
<code>EXTRACTOR_METAFORMAT_UNKNOWN</code>, is defined but not used.  UTF-8 
strings are 0-terminated strings that have been converted to UTF-8.  The format 
code is <code>EXTRACTOR_METAFORMAT_UTF8</code>. Ideally, most text meta data 
will be of this format.  Some file formats fail to specify the encoding used 
for the text.  In this case, the text cannot be converted to UTF-8.  However, 
the meta data is still known to be 0-terminated and presumably human-readable.  
In this case, the format code used is 
<code>EXTRACTOR_METAFORMAT_C_STRING</code>; however, this should not be 
understood to mean that the encoding is the same as that used by the C 
compiler.  Finally, for binary data (mostly images), the format 
<code>EXTRACTOR_METAFORMAT_BINARY</code> is used.
+</p>
+<p>Naturally this is not a precise description of the meta format. Plugins can 
provide a more precise description (if known) by providing the respective mime 
type of the meta data.  For example, binary image meta data could also be 
tagged as &ldquo;image/png&rdquo; and normal text would typically be tagged as 
&ldquo;text/plain&rdquo;.  
+</p>
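+<p>The following C sketch shows one way an application might treat the three 
formats differently when displaying meta data.  The enum below is a local 
stand-in for the declaration in <tt>extractor.h</tt> so that the example 
compiles on its own (its numeric values are arbitrary), and 
<tt>render_item</tt> is a hypothetical helper.  Masking non-ASCII bytes of C 
strings is merely one possible policy for text of unknown encoding.
+</p>

```c
#include <stdio.h>
#include <string.h>

/* Local stand-in for the enum declared in <extractor.h>; the numeric
   values are arbitrary and only serve this sketch. */
enum EXTRACTOR_MetaFormat
{
  EXTRACTOR_METAFORMAT_UNKNOWN,
  EXTRACTOR_METAFORMAT_UTF8,
  EXTRACTOR_METAFORMAT_C_STRING,
  EXTRACTOR_METAFORMAT_BINARY
};

/* Write a printable rendering of one meta data item into OUT
   (hypothetical helper, not part of the library). */
static void
render_item (enum EXTRACTOR_MetaFormat format,
             const char *data, size_t data_len,
             char *out, size_t out_size)
{
  size_t i;
  size_t n;

  if (0 == out_size)
    return;
  switch (format)
    {
    case EXTRACTOR_METAFORMAT_UTF8:
      /* 0-terminated UTF-8: can be shown as is */
      snprintf (out, out_size, "%s", data);
      break;
    case EXTRACTOR_METAFORMAT_C_STRING:
      /* 0-terminated, encoding unknown: keep printable ASCII only */
      n = strlen (data);
      if (n >= out_size)
        n = out_size - 1;
      for (i = 0; i < n; i++)
        {
          unsigned char c = (unsigned char) data[i];
          out[i] = (c >= 0x20 && c < 0x7f) ? (char) c : '?';
        }
      out[n] = '\0';
      break;
    case EXTRACTOR_METAFORMAT_BINARY:
      /* binary data (mostly images): only report the size */
      snprintf (out, out_size, "(binary, %lu bytes)",
                (unsigned long) data_len);
      break;
    default:
      snprintf (out, out_size, "(unknown format)");
      break;
    }
}
```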
+
+
+<hr size="6">
+<a name="Extracting"></a>
+<a name="SEC9"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC8" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC5" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC5" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="section"> 4.4 Extracting </h2>
+
+<dl>
+<dt><u>Function Pointer:</u> int <b>(*EXTRACTOR_MetaDataProcessor)</b><i> (void 
*cls, const char *plugin_name, enum EXTRACTOR_MetaType type, enum 
EXTRACTOR_MetaFormat format, const char *data_mime_type, const char *data, 
size_t data_len)</i>
+<a name="IDX28"></a>
+</dt>
+<dd>
+<p>Type of a function that libextractor calls for each meta data item found.
+</p>
+<dl compact="compact">
+<dt> <var>cls</var> </dt>
+<dd><p>closure (user-defined)
+</p>
+</dd>
+<dt> <var>plugin_name</var> </dt>
+<dd><p>name of the plugin that produced this value; special values can be used 
(e.g. '&lt;zlib&gt;' for zlib being used in the main libextractor library and 
yielding meta data);
+</p>
+</dd>
+<dt> <var>type</var> </dt>
+<dd><p>libextractor-type describing the meta data;
+</p>
+</dd>
+<dt> <var>format</var> </dt>
+<dd><p>basic format information about data
+</p>
+</dd>
+<dt> <var>data_mime_type</var> </dt>
+<dd><p>mime-type of data (not of the original file); can be NULL (if mime-type 
is not known);
+</p>
+</dd>
+<dt> <var>data</var> </dt>
+<dd><p>actual meta-data found
+</p>
+</dd>
+<dt> <var>data_len</var> </dt>
+<dd><p>number of bytes in data
+</p>
+</dd>
+</dl>
+
+<p>Return 0 to continue extracting, 1 to abort.
+</p></dd></dl>
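+<p>A minimal sketch of a function with this signature is shown below.  It 
counts the items it is given and asks the library to abort once a limit is 
reached.  The two enums are local stand-ins (with hypothetical placeholder 
members) so that the example compiles without <tt>extractor.h</tt>; 
<tt>struct item_counter</tt> and <tt>count_items</tt> are likewise names 
invented for this example.
+</p>

```c
#include <stddef.h>

/* Local stand-ins for declarations from <extractor.h>; real code would
   include the header instead.  The members are placeholders. */
enum EXTRACTOR_MetaType { EXTRACTOR_METATYPE_PLACEHOLDER };
enum EXTRACTOR_MetaFormat { EXTRACTOR_METAFORMAT_PLACEHOLDER };

/* Closure passed as 'cls' (hypothetical): counts items up to a limit. */
struct item_counter
{
  unsigned int seen;    /* number of items received so far */
  unsigned int limit;   /* abort once this many items were seen */
};

/* Matches the EXTRACTOR_MetaDataProcessor signature.
   Returns 0 to continue extracting, 1 to abort. */
static int
count_items (void *cls,
             const char *plugin_name,
             enum EXTRACTOR_MetaType type,
             enum EXTRACTOR_MetaFormat format,
             const char *data_mime_type,
             const char *data,
             size_t data_len)
{
  struct item_counter *ctr = cls;

  /* this sketch ignores the item itself */
  (void) plugin_name;
  (void) type;
  (void) format;
  (void) data_mime_type;
  (void) data;
  (void) data_len;
  ctr->seen++;
  return (ctr->seen >= ctr->limit) ? 1 : 0;
}
```

+<p>The closure pointer is how per-call state reaches the function without 
global variables, which keeps the processor reentrant.
+</p>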
+
+
+
+<dl>
+<dt><u>Function:</u> void <b>EXTRACTOR_extract</b><i> (struct 
EXTRACTOR_PluginList *plugins, const char *filename, const void *data, size_t 
size, EXTRACTOR_MetaDataProcessor proc, void *proc_cls)</i>
+<a name="IDX29"></a>
+</dt>
+<dd><a name="IDX30"></a>
+<a name="IDX31"></a>
+<a name="IDX32"></a>
+<a name="IDX33"></a>
+<a name="IDX34"></a>
+
+<p>This is the main function for extracting keywords with <acronym>GNU 
libextractor</acronym>.  The first argument is a plugin list which specifies 
the set of plugins that should be used for extracting meta data.  The 
&lsquo;<samp>filename</samp>&rsquo; argument is optional and can be used to 
specify the name of a file to process.  If &lsquo;<samp>filename</samp>&rsquo; 
is NULL, then the &lsquo;<samp>data</samp>&rsquo; argument must point to the 
in-memory data to extract meta data from.  If 
&lsquo;<samp>filename</samp>&rsquo; is non-NULL, 
&lsquo;<samp>data</samp>&rsquo; can be NULL.  If 
&lsquo;<samp>data</samp>&rsquo; is non-NULL, then 
&lsquo;<samp>size</samp>&rsquo; is the size of &lsquo;<samp>data</samp>&rsquo; 
in bytes.  Otherwise &lsquo;<samp>size</samp>&rsquo; should be zero.  For each 
meta data item found, GNU libextractor will call the 
&lsquo;<samp>proc</samp>&rsquo; function, passing 
&lsquo;<samp>proc_cls</samp>&rsquo; as the first argument to 
&lsquo;<samp>proc</samp>&rsquo;.  The other arguments to 
&lsquo;<samp>proc</samp>&rsquo; depend on the specific meta data found.  
+</p>
+<a name="IDX35"></a>
+<a name="IDX36"></a>
+<p>Meta data extraction should never really fail &mdash; at worst, 
<acronym>GNU libextractor</acronym> should not call 
&lsquo;<samp>proc</samp>&rsquo; with any meta data. By design, <acronym>GNU 
libextractor</acronym> should never crash or leak memory, even given corrupt 
files as input.  Note however, that running <acronym>GNU libextractor</acronym> 
on a corrupt file system (or incorrectly <tt>mmap</tt>ed files) can result in 
the operating system sending a SIGBUS (bus error) to the process.  While 
<acronym>GNU libextractor</acronym> runs plugins out-of-process, it first maps 
the file into memory and then attempts to decompress it.  During decompression 
it is possible to encounter a SIGBUS.   <acronym>GNU libextractor</acronym> 
will <em>not</em> attempt to catch this signal and your application is likely 
to crash.  Note again that this should only happen if the file <em>system</em> 
is corrupt (not if individual files are corrupt).  If this is not acceptable, 
you might want to consider running <acronym>GNU libextractor</acronym> itself 
also out-of-process (as done, for example, by <a 
href="http://grothoff.org/christian/doodle/">doodle</a>).
+</p>
+</dd></dl>
+
+
+<hr size="6">
+<a name="Language-bindings"></a>
+<a name="SEC10"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC9" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC11" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC5" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC17" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="chapter"> 5. Language bindings </h1>
+
+
+<p><acronym>GNU libextractor</acronym> works immediately with C and C++ code. 
Bindings for Java, Mono, Ruby, Perl, PHP and Python are available for download 
from the main <acronym>GNU libextractor</acronym> website.  Documentation for 
these bindings (if available) is part of the downloads for the respective 
binding.  In all cases, a full installation of the C library is required before 
the binding can be installed.
+</p>
+<hr size="6">
+<a name="SEC11"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC10" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC12" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC17" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="section"> 5.1 Java </h2>
+
+<p>Compiling the GNU libextractor Java binding follows the usual process of
+running <code>configure</code> and <code>make</code>.  The result will be a
+shared C library &lsquo;<tt>libextractor_java.so</tt>&rsquo; with the native 
code and
+a JAR file (installed to 
&lsquo;<tt>$PREFIX/share/java/libextractor.jar</tt>&rsquo;).
+</p>
+<p>A minimal example for using GNU libextractor's Java binding would look
+like this:
+</p><pre class="verbatim">import org.gnu.libextractor.*;
+import java.util.ArrayList;
+
+// The class name is arbitrary; main must be declared inside a class.
+public class ExtractDemo {
+  public static void main(String[] args) {
+    Extractor ex = Extractor.getDefault();
+    for (int i=0;i&lt;args.length;i++) {
+      // Raw ArrayList: the binding does not use Java 5 generics.
+      ArrayList keywords = ex.extract(args[i]);
+      System.out.println(&quot;Keywords for &quot; + args[i] + &quot;:&quot;);
+      for (int j=0;j&lt;keywords.size();j++)
+        System.out.println(keywords.get(j));
+    }
+  }
+}
+</pre>
+<p>The GNU libextractor library and the 
&lsquo;<tt>libextractor_java.so</tt>&rsquo; JNI binding
+have to be in the library search path for this to work.  Furthermore, the
+&lsquo;<tt>libextractor.jar</tt>&rsquo; file should be on the classpath.  
+</p>
+<p>Note that the API does not use Java 5 style generics in order to work
+with older versions of Java.
+</p>
+<hr size="6">
+<a name="SEC12"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC11" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC13" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC17" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="section"> 5.2 Mono </h2>
+
+<p>This binding is undocumented at this point.
+</p>
+<hr size="6">
+<a name="SEC13"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC12" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC14" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC17" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="section"> 5.3 Perl </h2>
+
+<p>This binding is undocumented at this point.
+</p>
+<hr size="6">
+<a name="SEC14"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC13" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC15" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC17" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="section"> 5.4 Python </h2>
+
+<p>This binding is undocumented at this point.
+</p>
+<hr size="6">
+<a name="SEC15"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC14" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC16" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC17" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="section"> 5.5 PHP </h2>
+
+<p>This binding is undocumented at this point.
+</p>
+<hr size="6">
+<a name="SEC16"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC15" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC17" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC17" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="section"> 5.6 Ruby </h2>
+
+<p>This binding is undocumented at this point.
+</p>
+
+
+<hr size="6">
+<a name="Utility-functions"></a>
+<a name="SEC17"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC16" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC18" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC10" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC20" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="chapter"> 6. Utility functions </h1>
+
+<p>This chapter describes various utility functions for <acronym>GNU 
libextractor</acronym> usage. All of the functions are reentrant.
+</p>
+<table class="menu" border="0" cellspacing="0">
+<tr><td align="left" valign="top"><a href="#SEC18">6.1 Utility 
Constants</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
+<tr><td align="left" valign="top"><a href="#SEC19">6.2 Meta data 
printing</a></td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
+</table>
+
+<hr size="6">
+<a name="Utility-Constants"></a>
+<a name="SEC18"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC17" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC19" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC17" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC17" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC20" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="section"> 6.1 Utility Constants </h2>
+
+<p>The constant <tt>EXTRACTOR_VERSION</tt> is a hexadecimal
+representation of the version number of the installed libextractor
+header.  The hexadecimal format is 0xAABBCCDD where AA is the major
+version (so far always 0), BB is the minor version, CC is the revision
+and DD the patch number.  For example, for version 0.5.18, we would
+have AA=0, BB=5, CC=18 and DD=0.  Minor releases such as 0.5.18a or
+significant changes in unreleased versions would be marked with DD=1
+or higher.
+</p>
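+<p>The 0xAABBCCDD layout described above can be decoded with a few shifts and 
masks.  The macros and the function in the following C sketch are hypothetical 
helpers for illustration and are not taken from <tt>extractor.h</tt>; 
<tt>DEMO_VERSION</tt> encodes the 0.5.18 example from the text (18 is 0x12 in 
hexadecimal).
+</p>

```c
/* Hypothetical helper macros for this sketch; they split a version
   number of the form 0xAABBCCDD into its four fields. */
#define VERSION_MAJOR(v)    ((unsigned int) (((unsigned long) (v) >> 24) & 0xffUL))
#define VERSION_MINOR(v)    ((unsigned int) (((unsigned long) (v) >> 16) & 0xffUL))
#define VERSION_REVISION(v) ((unsigned int) (((unsigned long) (v) >> 8) & 0xffUL))
#define VERSION_PATCH(v)    ((unsigned int) ((unsigned long) (v) & 0xffUL))

/* Version 0.5.18 from the text: AA=0, BB=5, CC=18 (0x12), DD=0. */
#define DEMO_VERSION 0x00051200UL

/* Because the most significant field comes first, two version numbers
   can be compared as plain integers. */
static int
version_at_least (unsigned long have, unsigned long want)
{
  return have >= want;
}
```

+<p>In an application, <tt>EXTRACTOR_VERSION</tt> would take the place of 
<tt>DEMO_VERSION</tt>, for example in a preprocessor check that rejects 
headers that are too old.
+</p>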
+
+<hr size="6">
+<a name="Meta-data-printing"></a>
+<a name="SEC19"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC18" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC20" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC17" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC17" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC20" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="section"> 6.2 Meta data printing </h2>
+
+
+<p>The function <tt>EXTRACTOR_meta_data_print</tt> prints the meta data found 
by libextractor to a file.  It is mostly useful for debugging and as an example 
of how to process meta data items.  It can be passed as the 
&lsquo;<samp>proc</samp>&rsquo; argument to <code>EXTRACTOR_extract</code>.  
The file to print to should be passed as &lsquo;<samp>proc_cls</samp>&rsquo; 
(which must be of type <code>FILE *</code>), for example <code>stdout</code>.
+</p>
+
+
+<hr size="6">
+<a name="Existing-Plugins"></a>
+<a name="SEC20"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC19" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC21" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC17" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC21" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="chapter"> 7. Existing Plugins </h1>
+
+<ul>
+<li>
+APPLEFILE
+</li><li>
+ASF
+</li><li>
+DEB
+</li><li>
+DVI
+</li><li>
+ELF
+</li><li>
+EXIV2
+</li><li>
+FLAC
+</li><li>
+FLV
+</li><li>
+GIF
+</li><li>
+HTML
+</li><li>
+ID3 (v2.0, v2.3, v2.4)
+</li><li>
+IT
+</li><li>
+JPEG
+</li><li>
+OLE2
+</li><li>
+thumbnail (GTK, QT or FFMPEG-based)
+</li><li>
+MAN
+</li><li>
+MIME
+</li><li>
+MP3 (ID3v1)
+</li><li>
+MPEG
+</li><li>
+NSF and NSFE
+</li><li>
+ODF
+</li><li>
+PNG
+</li><li>
+PS (PostScript)
+</li><li>
+QT (QuickTime)
+</li><li>
+REAL
+</li><li>
+RIFF
+</li><li>
+RPM
+</li><li> 
+S3M
+</li><li>
+SID
+</li><li>
+TAR
+</li><li>
+TIFF
+</li><li>
+WAV
+</li><li>
+XM
+</li><li>
+ZIP
+</li></ul>
+
+<p>&lsquo;<tt>gzip</tt>&rsquo; and &lsquo;<tt>bzip2</tt>&rsquo; compressed 
versions of these formats are 
+also supported (as well as meta data embedded by &lsquo;<tt>gzip</tt>&rsquo; 
itself).
+</p>
+<hr size="6">
+<a name="Writing-new-Plugins"></a>
+<a name="SEC21"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC20" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC22" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC20" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC22" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="chapter"> 8. Writing new Plugins </h1>
+
+<p>Writing a new plugin for libextractor usually requires writing or
+interfacing with an actual parser for a specific format.  How this
+can be accomplished depends on the format and cannot be specified in
+general.  However, care should be taken for the code to be reentrant
+and highly fault-tolerant, especially with respect to malformed
+inputs.
+</p>
+<p>Plugins should start by verifying that the header of the data matches
+the specific format and immediately return if that is not the case.
+Even if the header matches the expected file format, plugins must not
+assume that the remainder of the file is well formed.
+</p>
+<p>The plugin library must be called libextractor_XXX.so, where XXX 
+denotes the file format of the plugin. The library must export a 
+function <tt>EXTRACTOR_XXX_extract</tt> with the following 
+signature:
+</p><pre class="verbatim">int
+EXTRACTOR_XXX_extract
+   (const char *data,
+    size_t data_size,
+    EXTRACTOR_MetaDataProcessor proc,
+    void *proc_cls,
+    const char * options);
+</pre>
+<p>&lsquo;<samp>data</samp>&rsquo; is a pointer to the (typically memory-mapped)
+contents of the file.  Note that plugins must not ignore the <tt>const</tt>
+annotation since the memory mapping may have been done read-only (and
+thus writes to these pages will result in an error).  The
+&lsquo;<samp>data_size</samp>&rsquo; argument specifies the size of the
+&lsquo;<samp>data</samp>&rsquo; buffer in bytes.
+</p>
+<p>&lsquo;<samp>proc</samp>&rsquo; should be called on each meta data item 
found.  If &lsquo;<samp>proc</samp>&rsquo; 
+returns non-zero, processing should be aborted and the <code>extract</code>
+function must return 1.  Otherwise <code>extract</code> should always return 
zero.
+</p>
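+<p>The rules above (check the header first, never trust the rest of the
+input, stop as soon as the callback returns non-zero) might be sketched as
+follows.  This is only an illustration under stated assumptions: the
+&ldquo;XXX&rdquo; file layout, the magic bytes and all <tt>demo_</tt> names are
+made up, and <tt>demo_processor</tt> is a simplified stand-in for
+<tt>EXTRACTOR_MetaDataProcessor</tt>, whose real parameter list must be taken
+from <tt>extractor.h</tt>.
+</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-in for EXTRACTOR_MetaDataProcessor.
   The real callback type is declared in extractor.h. */
typedef int (*demo_processor) (void *cls,
                               const char *value,
                               size_t value_len);

/* Made-up 4-byte magic for the imaginary "XXX" format. */
static const char demo_magic[4] = { 'X', 'X', 'X', '1' };

/* Sketch of an extract function: returns 1 if the callback asked
   to abort, 0 in every other case (including malformed input). */
int
demo_xxx_extract (const char *data,
                  size_t data_size,
                  demo_processor proc,
                  void *proc_cls,
                  const char *options)
{
  size_t len;

  (void) options; /* unused in this sketch */
  /* Verify the header first and return immediately on mismatch. */
  if ( (data_size < sizeof (demo_magic) + 1) ||
       (0 != memcmp (data, demo_magic, sizeof (demo_magic))) )
    return 0;
  /* Assumed layout: one length byte followed by a title string. */
  len = (unsigned char) data[4];
  if ( (0 == len) || (len > data_size - 5) )
    return 0; /* malformed: never read past the end of the buffer */
  /* 'data' is only read, never written (it may be mapped read-only). */
  if (0 != proc (proc_cls, data + 5, len))
    return 1; /* the callback asked us to stop */
  return 0;
}

/* Example callback: counts items; 'cls' points to an int counter. */
int
demo_count (void *cls, const char *value, size_t value_len)
{
  int *counter = cls;

  (void) value;
  (void) value_len;
  assert (NULL != counter);
  (*counter)++;
  return 0; /* zero: continue extracting */
}

/* Example callback that asks the plugin to stop right away. */
int
demo_stop (void *cls, const char *value, size_t value_len)
{
  (void) cls;
  (void) value;
  (void) value_len;
  return 1;
}
```

+<p>Every length read from the file is checked against
+&lsquo;<samp>data_size</samp>&rsquo; before it is used, so a truncated or
+hostile file simply yields no meta data instead of an out-of-bounds read.
+</p>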
+
+<p>In order to test new plugins, the &lsquo;<tt>extract</tt>&rsquo; command
+can be run with the options &ldquo;-ni&rdquo; and &ldquo;-l XXX&rdquo;.  This will
+run the plugin in-process (making it easier to debug) and without any of the
+other plugins.
+</p>
+
+<hr size="6">
+<a name="Internal-utility-functions"></a>
+<a name="SEC22"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC21" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC23" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC21" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC23" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="chapter"> 9. Internal utility functions </h1>
+
+<p>Some plugins link against the <code>libextractor_common</code> library which
+provides common abstractions needed by many plugins.  This section
+documents this internal API for plugin developers.  Note that the headers
+for this library are (intentionally) not installed: we do not consider
+this API stable and it should hence only be used by plugins that are 
+built and shipped with GNU libextractor.  Third-party plugins should
+not use it.
+</p>
+<p>&lsquo;<tt>convert_numeric.h</tt>&rsquo; defines various conversion 
functions for
+numbers (in particular, byte-order conversion for floating point
+numbers).  
+</p>
+<p>&lsquo;<tt>unzip.h</tt>&rsquo; defines an API for accessing compressed 
files.
+</p>
+<p>&lsquo;<tt>pack.h</tt>&rsquo; provides an interpreter for unpacking structs 
of integer
+numbers from streams and converting from big or little endian to host
+byte order at the same time.
+</p>
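+<p>The kind of decoding that such an unpacking helper automates can be
+sketched by hand as follows.  This is not the <tt>pack.h</tt> API: the
+<tt>demo_</tt> functions and the six-byte header layout are hypothetical and
+only illustrate reading big- and little-endian integers from a byte stream
+independently of the host byte order.
+</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit big-endian integer. */
static uint32_t
demo_read_be32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Hypothetical helper: read a 16-bit little-endian integer. */
static uint16_t
demo_read_le16 (const unsigned char *p)
{
  return (uint16_t) (p[0] | (p[1] << 8));
}

/* Made-up header: 4 bytes big-endian width, 2 bytes little-endian depth. */
struct demo_header
{
  uint32_t width;
  uint16_t depth;
};

/* Returns 0 on success, -1 if the buffer is too short. */
int
demo_unpack_header (const unsigned char *buf,
                    size_t size,
                    struct demo_header *out)
{
  assert (NULL != out);
  if (size < 6)
    return -1;
  out->width = demo_read_be32 (buf);
  out->depth = demo_read_le16 (buf + 4);
  return 0;
}
```

+<p>Assembling the value from individual bytes with shifts gives the same
+result on big- and little-endian hosts and avoids unaligned reads.
+</p>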
+<p>&lsquo;<tt>convert.h</tt>&rsquo; provides a function for character set 
conversion described
+below.
+</p>
+<dl>
+<dt><u>Function:</u> char * <b>EXTRACTOR_common_convert_to_utf8</b><i> (const char *input, size_t len, const char *charset)</i>
+<a name="IDX37"></a>
+</dt>
+<dd><a name="IDX38"></a>
+<a name="IDX39"></a>
+<a name="IDX40"></a>
+<p>Various <acronym>GNU libextractor</acronym> plugins make use of the internal
+&lsquo;<tt>convert.h</tt>&rsquo; header, which defines the function
+<tt>EXTRACTOR_common_convert_to_utf8</tt>.  This function can be used to easily
+convert text from
+any character set to UTF-8.  This conversion is important since the
+linked list of keywords that is returned by <acronym>GNU 
libextractor</acronym> is
+expected to contain only UTF-8 strings.  Naturally, proper conversion
+may not always be possible since some file formats fail to specify the
+character set.  In that case, it is often better to not convert at
+all.
+</p>
+<p>The arguments to <tt>EXTRACTOR_common_convert_to_utf8</tt> are the input 
string (which
+does <em>not</em> have to be zero-terminated), the length of the input
+string, and the character set (which <em>must</em> be zero-terminated).
+Which character sets are supported depends on the platform; a list can
+generally be obtained using the <code>iconv -l</code> command.  The
+return value from <tt>EXTRACTOR_common_convert_to_utf8</tt> is a 
zero-terminated string
+in UTF-8 format.  The responsibility to free the string is with the
+caller, so storing the string in the keyword list is acceptable.
+</p></dd></dl>
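+<p>A function with this calling convention could be built on top of the
+standard <code>iconv</code> interface roughly as shown below.  This is a
+sketch, not the implementation shipped with <acronym>GNU
+libextractor</acronym>: the name <tt>demo_convert_to_utf8</tt> is
+hypothetical, and returning <code>NULL</code> on failure is a choice made for
+this example only.
+</p>

```c
#include <assert.h>
#include <iconv.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Convert 'len' bytes of 'input' (not necessarily zero-terminated)
   from 'charset' to a freshly allocated, zero-terminated UTF-8 string.
   The caller must free() the result.  Returns NULL on any error. */
char *
demo_convert_to_utf8 (const char *input, size_t len, const char *charset)
{
  iconv_t cd;
  size_t out_size;
  size_t in_left = len;
  size_t out_left;
  char *in = (char *) input; /* iconv() wants char **; data is not modified */
  char *result;
  char *out;

  assert (NULL != charset);
  if (len > (SIZE_MAX - 1) / 4)
    return NULL; /* size computation below would overflow */
  out_size = 4 * len + 1; /* assumed generous bound for the UTF-8 output */
  cd = iconv_open ("UTF-8", charset);
  if ((iconv_t) -1 == cd)
    return NULL; /* unknown or unsupported character set */
  result = malloc (out_size);
  if (NULL == result)
    {
      iconv_close (cd);
      return NULL;
    }
  out = result;
  out_left = out_size - 1; /* keep one byte for the terminator */
  if ( ((size_t) -1 == iconv (cd, &in, &in_left, &out, &out_left)) ||
       /* flush any pending shift state of stateful encodings */
       ((size_t) -1 == iconv (cd, NULL, NULL, &out, &out_left)) )
    {
      free (result);
      iconv_close (cd);
      return NULL;
    }
  *out = '\0';
  iconv_close (cd);
  return result;
}
```

+<p>Because the length is passed explicitly, a plugin can convert a field
+directly from the mapped file without first copying it into a
+zero-terminated buffer.
+</p>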
+
+
+
+
+
+<hr size="6">
+<a name="Reporting-bugs"></a>
+<a name="SEC23"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC22" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC24" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC22" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC24" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="chapter"> 10. Reporting bugs </h1>
+
+<p><acronym>GNU libextractor</acronym> uses the
+<a href="http://gnunet.org/bugs/">Mantis bug tracking system</a>.  If possible,
+please report bugs there.  You can also e-mail
+the <acronym>GNU libextractor</acronym> mailing list at
+<a href="address@hidden">address@hidden</a>.
+</p>
+
+
+
+<hr size="6">
+<a name="Copying"></a>
+<a name="SEC24"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC23" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC25" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC23" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="appendix"> A. GNU GENERAL PUBLIC LICENSE </h1>
+
+<p align="center"> Version 2, June 1991
+</p>
+<table><tr><td>&nbsp;</td><td><pre class="display">Copyright &copy; 1989, 1991 
Free Software Foundation, Inc.
+59 Temple Place -- Suite 330, Boston, MA 02111-1307, USA
+
+Everyone is permitted to copy and distribute verbatim copies
+of this license document, but changing it is not allowed.
+</pre></td></tr></table>
+
+<hr size="6">
+<a name="SEC25"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC24" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC26" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC24" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC24" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h3 class="appendixsubsec"> A.0.1 Preamble </h3>
+
+<p>  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software&mdash;to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Library General Public License instead.)  You can apply it to
+your programs, too.
+</p>
+<p>  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+</p>
+<p>  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+</p>
+<p>  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+</p>
+<p>  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+</p>
+<p>  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+</p>
+<p>  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+</p>
+<p>  The precise terms and conditions for copying, distribution and
+modification follow.
+</p>
+
+<ol>
+<li>
+This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The &ldquo;Program&rdquo;, 
below,
+refers to any such program or work, and a &ldquo;work based on the 
Program&rdquo;
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term &ldquo;modification&rdquo;.)  Each licensee is addressed as 
&ldquo;you&rdquo;.
+
+<p>Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+</p>
+</li><li>
+You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+<p>You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+</p>
+</li><li>
+You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+<ol>
+<li>
+You must cause the modified files to carry prominent notices
+stating that you changed the files and the date of any change.
+
+</li><li>
+You must cause any work that you distribute or publish, that in
+whole or in part contains or is derived from the Program or any
+part thereof, to be licensed as a whole at no charge to all third
+parties under the terms of this License.
+
+</li><li>
+If the modified program normally reads commands interactively
+when run, you must cause it, when started running for such
+interactive use in the most ordinary way, to print or display an
+announcement including an appropriate copyright notice and a
+notice that there is no warranty (or else, saying that you provide
+a warranty) and that users may redistribute the program under
+these conditions, and telling the user how to view a copy of this
+License.  (Exception: if the Program itself is interactive but
+does not normally print such an announcement, your work based on
+the Program is not required to print an announcement.)
+</li></ol>
+
+<p>These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+</p>
+<p>Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+</p>
+<p>In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+</p>
+</li><li>
+You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+<ol>
+<li>
+Accompany it with the complete corresponding machine-readable
+source code, which must be distributed under the terms of Sections
+1 and 2 above on a medium customarily used for software interchange; or,
+
+</li><li>
+Accompany it with a written offer, valid for at least three
+years, to give any third party, for a charge no more than your
+cost of physically performing source distribution, a complete
+machine-readable copy of the corresponding source code, to be
+distributed under the terms of Sections 1 and 2 above on a medium
+customarily used for software interchange; or,
+
+</li><li>
+Accompany it with the information you received as to the offer
+to distribute corresponding source code.  (This alternative is
+allowed only for noncommercial distribution and only if you
+received the program in object code or executable form with such
+an offer, in accord with Subsection b above.)
+</li></ol>
+
+<p>The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+</p>
+<p>If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+</p>
+</li><li>
+You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+</li><li>
+You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+</li><li>
+Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+</li><li>
+If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+<p>If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+</p>
+<p>It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+</p>
+<p>This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+</p>
+</li><li>
+If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+</li><li>
+The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+<p>Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and &ldquo;any
+later version&rdquo;, you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+</p>
+</li><li>
+If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+
+</li><li>
+BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM &ldquo;AS IS&rdquo; WITHOUT WARRANTY OF ANY KIND, EITHER 
EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+</li><li>
+IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+</li></ol>
+
+
+
+<hr size="6">
+<a name="SEC26"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC25" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC24" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC24" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h2 class="unnumberedsec"> How to Apply These Terms to Your New Programs </h2>
+
+<p>  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+</p>
+<p>  To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the &ldquo;copyright&rdquo; line and a pointer to where the full notice is 
found.
+</p>
+<table><tr><td>&nbsp;</td><td><pre class="smallexample"><var>one line to give 
the program's name and an idea of what it does.</var>
+Copyright (C) 19<var>yy</var>  <var>name of author</var>
+
+This program is free software; you can redistribute it and/or
+modify it under the terms of the GNU General Public License
+as published by the Free Software Foundation; either version 2
+of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License along
+with this program; if not, write to the Free Software Foundation, Inc.,
+59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
+</pre></td></tr></table>
+
+<p>Also add information on how to contact you by electronic and paper mail.
+</p>
+<p>If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+</p>
+<table><tr><td>&nbsp;</td><td><pre class="smallexample">Gnomovision version 
69, Copyright (C) 19<var>yy</var> <var>name of author</var>
+Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
+type `show w'.  This is free software, and you are welcome
+to redistribute it under certain conditions; type `show c' 
+for details.
+</pre></td></tr></table>
+
+<p>The hypothetical commands &lsquo;<samp>show w</samp>&rsquo; and 
&lsquo;<samp>show c</samp>&rsquo; should show
+the appropriate parts of the General Public License.  Of course, the
+commands you use may be called something other than &lsquo;<samp>show 
w</samp>&rsquo; and
+&lsquo;<samp>show c</samp>&rsquo;; they could even be mouse-clicks or menu 
items&mdash;whatever
+suits your program.
+</p>
+<p>You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a &ldquo;copyright disclaimer&rdquo; for the program, 
if
+necessary.  Here is a sample; alter the names:
+</p>
+<table><tr><td>&nbsp;</td><td><pre class="smallexample">Yoyodyne, Inc., hereby 
disclaims all copyright
+interest in the program `Gnomovision'
+(which makes passes at compilers) written 
+by James Hacker.
+
+<var>signature of Ty Coon</var>, 1 April 1989
+Ty Coon, President of Vice
+</pre></td></tr></table>
+
+<p>This General Public License does not permit incorporating your program into
+proprietary programs.  If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library.  If this is what you want to do, use the GNU Library General
+Public License instead of this License.
+</p>
+<hr size="6">
+<a name="Concept-Index"></a>
+<a name="SEC27"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC26" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC28" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC24" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC28" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="unnumbered"> Concept Index </h1>
+
+<table><tr><th valign="top">Jump to: &nbsp; </th><td><a href="#SEC27_0" 
class="summary-letter"><b>B</b></a>
+ &nbsp; 
+<a href="#SEC27_1" class="summary-letter"><b>C</b></a>
+ &nbsp; 
+<a href="#SEC27_2" class="summary-letter"><b>D</b></a>
+ &nbsp; 
+<a href="#SEC27_3" class="summary-letter"><b>E</b></a>
+ &nbsp; 
+<a href="#SEC27_4" class="summary-letter"><b>G</b></a>
+ &nbsp; 
+<a href="#SEC27_5" class="summary-letter"><b>I</b></a>
+ &nbsp; 
+<a href="#SEC27_6" class="summary-letter"><b>J</b></a>
+ &nbsp; 
+<a href="#SEC27_7" class="summary-letter"><b>L</b></a>
+ &nbsp; 
+<a href="#SEC27_8" class="summary-letter"><b>M</b></a>
+ &nbsp; 
+<a href="#SEC27_9" class="summary-letter"><b>P</b></a>
+ &nbsp; 
+<a href="#SEC27_10" class="summary-letter"><b>R</b></a>
+ &nbsp; 
+<a href="#SEC27_11" class="summary-letter"><b>S</b></a>
+ &nbsp; 
+<a href="#SEC27_12" class="summary-letter"><b>T</b></a>
+ &nbsp; 
+<a href="#SEC27_13" class="summary-letter"><b>U</b></a>
+ &nbsp; 
+</td></tr></table>
+<table border="0" class="index-cp">
+<tr><td></td><th align="left">Index Entry</th><th align="left"> 
Section</th></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_0">B</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC23">bug</a></td><td valign="top"><a 
href="#SEC23">10. Reporting bugs</a></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX36">bus error</a></td><td 
valign="top"><a href="#SEC9">4.4 Extracting</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_1">C</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX39">character set</a></td><td 
valign="top"><a href="#SEC22">9. Internal utility functions</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC6">concurrency</a></td><td 
valign="top"><a href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX32">concurrency</a></td><td 
valign="top"><a href="#SEC9">4.4 Extracting</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC17">concurrency</a></td><td 
valign="top"><a href="#SEC17">6. Utility functions</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_2">D</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX4">directory structure</a></td><td 
valign="top"><a href="#SEC2">2. Preparation</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_3">E</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX6">environment 
variables</a></td><td valign="top"><a href="#SEC2">2. Preparation</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC1">error handling</a></td><td 
valign="top"><a href="#SEC1">1. Introduction</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_4">G</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX22">gettext</a></td><td 
valign="top"><a href="#SEC7">4.2 Meta types</a></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX26">gettext</a></td><td 
valign="top"><a href="#SEC7">4.2 Meta types</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC24">GPL, GNU General Public 
License</a></td><td valign="top"><a href="#SEC24">A. GNU GENERAL PUBLIC 
LICENSE</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_5">I</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX23">internationalization</a></td><td valign="top"><a href="#SEC7">4.2 
Meta types</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX27">internationalization</a></td><td valign="top"><a href="#SEC7">4.2 
Meta types</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_6">J</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC10">Java</a></td><td 
valign="top"><a href="#SEC10">5. Language bindings</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_7">L</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX2">license</a></td><td 
valign="top"><a href="#SEC1">1. Introduction</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_8">M</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC10">Mono</a></td><td 
valign="top"><a href="#SEC10">5. Language bindings</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_9">P</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX3">packaging</a></td><td 
valign="top"><a href="#SEC2">2. Preparation</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC10">Perl</a></td><td 
valign="top"><a href="#SEC10">5. Language bindings</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC10">PHP</a></td><td valign="top"><a 
href="#SEC10">5. Language bindings</a></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX1">plugin</a></td><td 
valign="top"><a href="#SEC1">1. Introduction</a></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX5">plugin</a></td><td 
valign="top"><a href="#SEC2">2. Preparation</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC10">Python</a></td><td 
valign="top"><a href="#SEC10">5. Language bindings</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_10">R</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC6">reentrant</a></td><td 
valign="top"><a href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX31">reentrant</a></td><td 
valign="top"><a href="#SEC9">4.4 Extracting</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC17">reentrant</a></td><td 
valign="top"><a href="#SEC17">6. Utility functions</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC10">Ruby</a></td><td 
valign="top"><a href="#SEC10">5. Language bindings</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_11">S</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX35">SIGBUS</a></td><td 
valign="top"><a href="#SEC9">4.4 Extracting</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_12">T</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC6">thread-safety</a></td><td 
valign="top"><a href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX34">thread-safety</a></td><td 
valign="top"><a href="#SEC9">4.4 Extracting</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC17">thread-safety</a></td><td 
valign="top"><a href="#SEC17">6. Utility functions</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC6">threads</a></td><td 
valign="top"><a href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX33">threads</a></td><td 
valign="top"><a href="#SEC9">4.4 Extracting</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC17">threads</a></td><td 
valign="top"><a href="#SEC17">6. Utility functions</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC27_13">U</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX38">UTF-8</a></td><td 
valign="top"><a href="#SEC22">9. Internal utility functions</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+</table>
+<table><tr><th valign="top">Jump to: &nbsp; </th><td><a href="#SEC27_0" 
class="summary-letter"><b>B</b></a>
+ &nbsp; 
+<a href="#SEC27_1" class="summary-letter"><b>C</b></a>
+ &nbsp; 
+<a href="#SEC27_2" class="summary-letter"><b>D</b></a>
+ &nbsp; 
+<a href="#SEC27_3" class="summary-letter"><b>E</b></a>
+ &nbsp; 
+<a href="#SEC27_4" class="summary-letter"><b>G</b></a>
+ &nbsp; 
+<a href="#SEC27_5" class="summary-letter"><b>I</b></a>
+ &nbsp; 
+<a href="#SEC27_6" class="summary-letter"><b>J</b></a>
+ &nbsp; 
+<a href="#SEC27_7" class="summary-letter"><b>L</b></a>
+ &nbsp; 
+<a href="#SEC27_8" class="summary-letter"><b>M</b></a>
+ &nbsp; 
+<a href="#SEC27_9" class="summary-letter"><b>P</b></a>
+ &nbsp; 
+<a href="#SEC27_10" class="summary-letter"><b>R</b></a>
+ &nbsp; 
+<a href="#SEC27_11" class="summary-letter"><b>S</b></a>
+ &nbsp; 
+<a href="#SEC27_12" class="summary-letter"><b>T</b></a>
+ &nbsp; 
+<a href="#SEC27_13" class="summary-letter"><b>U</b></a>
+ &nbsp; 
+</td></tr></table>
+
+<hr size="6">
+<a name="Function-and-Data-Index"></a>
+<a name="SEC28"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC27" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC29" title="Next section in 
reading order"> &gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC27" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC29" title="Next chapter"> 
&gt;&gt; </a>]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="unnumbered"> Function and Data Index </h1>
+
+<table><tr><th valign="top">Jump to: &nbsp; </th><td><a href="#SEC28_0" 
class="summary-letter"><b>(</b></a>
+ &nbsp; 
+<br>
+<a href="#SEC28_1" class="summary-letter"><b>E</b></a>
+ &nbsp; 
+</td></tr></table>
+<table border="0" class="index-fn">
+<tr><td></td><th align="left">Index Entry</th><th align="left"> 
Section</th></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC28_0">(</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX28"><code>(*EXTRACTOR_MetaDataProcessor)(void</code></a></td><td 
valign="top"><a href="#SEC9">4.4 Extracting</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC28_1">E</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX40"><code>EXTRACTOR_common_convert_to_utf8</code></a></td><td 
valign="top"><a href="#SEC22">9. Internal utility functions</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX37"><code>EXTRACTOR_common_convert_to_utf8(const</code></a></td><td 
valign="top"><a href="#SEC22">9. Internal utility functions</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX30"><code>EXTRACTOR_extract</code></a></td><td valign="top"><a 
href="#SEC9">4.4 Extracting</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX29"><code>EXTRACTOR_extract(struct</code></a></td><td valign="top"><a 
href="#SEC9">4.4 Extracting</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#SEC19"><code>EXTRACTOR_meta_data_print</code></a></td><td 
valign="top"><a href="#SEC19">6.2 Meta data printing</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#SEC7"><code>EXTRACTOR_metatype_get_max</code></a></td><td 
valign="top"><a href="#SEC7">4.2 Meta types</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX24"><code>EXTRACTOR_metatype_to_description</code></a></td><td 
valign="top"><a href="#SEC7">4.2 Meta types</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX25"><code>EXTRACTOR_metatype_to_description</code></a></td><td 
valign="top"><a href="#SEC7">4.2 Meta types</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX20"><code>EXTRACTOR_metatype_to_string</code></a></td><td 
valign="top"><a href="#SEC7">4.2 Meta types</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX21"><code>EXTRACTOR_metatype_to_string</code></a></td><td 
valign="top"><a href="#SEC7">4.2 Meta types</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX14"><code>EXTRACTOR_plugin_add</code></a></td><td valign="top"><a 
href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX15"><code>EXTRACTOR_plugin_add</code></a></td><td valign="top"><a 
href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX16"><code>EXTRACTOR_plugin_add_config</code></a></td><td 
valign="top"><a href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX17"><code>EXTRACTOR_plugin_add_config</code></a></td><td 
valign="top"><a href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX18"><code>EXTRACTOR_plugin_add_defaults</code></a></td><td 
valign="top"><a href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX19"><code>EXTRACTOR_plugin_add_defaults</code></a></td><td 
valign="top"><a href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX12"><code>EXTRACTOR_plugin_remove</code></a></td><td valign="top"><a 
href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX13"><code>EXTRACTOR_plugin_remove</code></a></td><td valign="top"><a 
href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX10"><code>EXTRACTOR_plugin_remove_all</code></a></td><td 
valign="top"><a href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX11"><code>EXTRACTOR_plugin_remove_all</code></a></td><td 
valign="top"><a href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#SEC18"><code>EXTRACTOR_VERSION</code></a></td><td valign="top"><a 
href="#SEC18">6.1 Utility Constants</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+</table>
+<table><tr><th valign="top">Jump to: &nbsp; </th><td><a href="#SEC28_0" 
class="summary-letter"><b>(</b></a>
+ &nbsp; 
+<br>
+<a href="#SEC28_1" class="summary-letter"><b>E</b></a>
+ &nbsp; 
+</td></tr></table>
+
+<hr size="6">
+<a name="Type-Index"></a>
+<a name="SEC29"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC28" title="Previous section 
in reading order"> &lt; </a>]</td>
+<td valign="middle" align="left">[ &gt; ]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC28" title="Beginning of this 
chapter or previous chapter"> &lt;&lt; </a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Up section"> Up 
</a>]</td>
+<td valign="middle" align="left">[ &gt;&gt; ]</td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left"> &nbsp; </td>
+<td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1 class="unnumbered"> Type Index </h1>
+
+<table><tr><th valign="top">Jump to: &nbsp; </th><td><a href="#SEC29_0" 
class="summary-letter"><b>E</b></a>
+ &nbsp; 
+<a href="#SEC29_1" class="summary-letter"><b>S</b></a>
+ &nbsp; 
+</td></tr></table>
+<table border="0" class="index-tp">
+<tr><td></td><th align="left">Index Entry</th><th align="left"> 
Section</th></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC29_0">E</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC8"><code>enum 
EXTRACTOR_MetaFormat</code></a></td><td valign="top"><a href="#SEC8">4.3 Meta 
formats</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC7"><code>enum 
EXTRACTOR_MetaType</code></a></td><td valign="top"><a href="#SEC7">4.2 Meta 
types</a></td></tr>
+<tr><td></td><td valign="top"><a href="#SEC6"><code>enum 
EXTRACTOR_Options</code></a></td><td valign="top"><a href="#SEC6">4.1 Plugin 
management</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#SEC9"><code>EXTRACTOR_MetaDataProcessor</code></a></td><td 
valign="top"><a href="#SEC9">4.4 Extracting</a></td></tr>
+<tr><td></td><td valign="top"><a 
href="#IDX8"><code>EXTRACTOR_PluginList</code></a></td><td valign="top"><a 
href="#SEC6">4.1 Plugin management</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+<tr><th><a name="SEC29_1">S</a></th><td></td><td></td></tr>
+<tr><td></td><td valign="top"><a href="#IDX9"><code>struct 
EXTRACTOR_PluginList</code></a></td><td valign="top"><a href="#SEC6">4.1 Plugin 
management</a></td></tr>
+<tr><td colspan="3"> <hr></td></tr>
+</table>
+<table><tr><th valign="top">Jump to: &nbsp; </th><td><a href="#SEC29_0" 
class="summary-letter"><b>E</b></a>
+ &nbsp; 
+<a href="#SEC29_1" class="summary-letter"><b>S</b></a>
+ &nbsp; 
+</td></tr></table>
+
+<hr size="6">
+<a name="SEC_Foot"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1>Footnotes</h1>
+<h3><a name="FOOT1" href="#DOCF1">(1)</a></h3>
+<p>Some distributions ship <code>extract</code> in a
+separate package.
+</p><h3><a name="FOOT2" href="#DOCF2">(2)</a></h3>
+<p>It
+may be possible to switch to GPLv3 in the future.  For this, an audit
+of the license status of our dependencies would be required.  The new
+code that was developed specifically for <acronym>GNU libextractor</acronym> 
has
+always been licensed under GPLv2 <em>or any later version</em>.
+</p><h3><a name="FOOT3" href="#DOCF3">(3)</a></h3>
+<p>Debian policy
+furthermore requires a &lsquo;<tt>-dev</tt>&rsquo; (meta) package that would 
depend on
+all of the above packages.
+</p><hr size="1">
+<a name="SEC_Contents"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1>Table of Contents</h1>
+<div class="contents">
+
+<ul class="toc">
+  <li><a name="TOC1" href="#SEC1">1. Introduction</a></li>
+  <li><a name="TOC2" href="#SEC2">2. Preparation</a>
+  <ul class="toc">
+    <li><a name="TOC3" href="#SEC3">2.1 Note to package maintainers</a></li>
+  </ul></li>
+  <li><a name="TOC4" href="#SEC4">3. Generalities</a></li>
+  <li><a name="TOC5" href="#SEC5">4. Extracting meta data</a>
+  <ul class="toc">
+    <li><a name="TOC6" href="#SEC6">4.1 Plugin management</a></li>
+    <li><a name="TOC7" href="#SEC7">4.2 Meta types</a></li>
+    <li><a name="TOC8" href="#SEC8">4.3 Meta formats</a></li>
+    <li><a name="TOC9" href="#SEC9">4.4 Extracting</a></li>
+  </ul></li>
+  <li><a name="TOC10" href="#SEC10">5. Language bindings</a>
+  <ul class="toc">
+    <li><a name="TOC11" href="#SEC11">5.1 Java</a></li>
+    <li><a name="TOC12" href="#SEC12">5.2 Mono</a></li>
+    <li><a name="TOC13" href="#SEC13">5.3 Perl</a></li>
+    <li><a name="TOC14" href="#SEC14">5.4 Python</a></li>
+    <li><a name="TOC15" href="#SEC15">5.5 PHP</a></li>
+    <li><a name="TOC16" href="#SEC16">5.6 Ruby</a></li>
+  </ul></li>
+  <li><a name="TOC17" href="#SEC17">6. Utility functions</a>
+  <ul class="toc">
+    <li><a name="TOC18" href="#SEC18">6.1 Utility Constants</a></li>
+    <li><a name="TOC19" href="#SEC19">6.2 Meta data printing</a></li>
+  </ul></li>
+  <li><a name="TOC20" href="#SEC20">7. Existing Plugins</a></li>
+  <li><a name="TOC21" href="#SEC21">8. Writing new Plugins</a></li>
+  <li><a name="TOC22" href="#SEC22">9. Internal utility functions</a></li>
+  <li><a name="TOC23" href="#SEC23">10. Reporting bugs</a></li>
+  <li><a name="TOC24" href="#SEC24">A. GNU GENERAL PUBLIC LICENSE</a>
+  <ul class="toc">
+
+    <ul class="toc">
+      <li><a name="TOC25" href="#SEC25">A.0.1 Preamble</a></li>
+    </ul></li>
+    <li><a name="TOC26" href="#SEC26">How to Apply These Terms to Your New 
Programs</a></li>
+  </ul></li>
+  <li><a name="TOC27" href="#SEC27">Concept Index</a></li>
+  <li><a name="TOC28" href="#SEC28">Function and Data Index</a></li>
+  <li><a name="TOC29" href="#SEC29">Type Index</a></li>
+</ul>
+</div>
+<hr size="1">
+<a name="SEC_Overview"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1>Short Table of Contents</h1>
+<div class="shortcontents">
+<ul class="toc">
+<li><a name="TOC1" href="#SEC1">1. Introduction</a></li>
+<li><a name="TOC2" href="#SEC2">2. Preparation</a></li>
+<li><a name="TOC4" href="#SEC4">3. Generalities</a></li>
+<li><a name="TOC5" href="#SEC5">4. Extracting meta data</a></li>
+<li><a name="TOC10" href="#SEC10">5. Language bindings</a></li>
+<li><a name="TOC17" href="#SEC17">6. Utility functions</a></li>
+<li><a name="TOC20" href="#SEC20">7. Existing Plugins</a></li>
+<li><a name="TOC21" href="#SEC21">8. Writing new Plugins</a></li>
+<li><a name="TOC22" href="#SEC22">9. Internal utility functions</a></li>
+<li><a name="TOC23" href="#SEC23">10. Reporting bugs</a></li>
+<li><a name="TOC24" href="#SEC24">A. GNU GENERAL PUBLIC LICENSE</a></li>
+<li><a name="TOC27" href="#SEC27">Concept Index</a></li>
+<li><a name="TOC28" href="#SEC28">Function and Data Index</a></li>
+<li><a name="TOC29" href="#SEC29">Type Index</a></li>
+</ul>
+</div>
+<hr size="1">
+<a name="SEC_About"></a>
+<table cellpadding="1" cellspacing="1" border="0">
+<tr><td valign="middle" align="left">[<a href="#SEC_Top" title="Cover (top) of 
document">Top</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_Contents" title="Table of 
contents">Contents</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC27" 
title="Index">Index</a>]</td>
+<td valign="middle" align="left">[<a href="#SEC_About" title="About (help)"> ? 
</a>]</td>
+</tr></table>
+<h1>About This Document</h1>
+<p>
+  This document was generated by <em>Christian Grothoff</em> on <em>January 1, 2010</em> using <a href="http://www.nongnu.org/texi2html/"><em>texi2html 1.78</em></a>.
+</p>
+<p>
+  The buttons in the navigation panels have the following meaning:
+</p>
+<table border="1">
+  <tr>
+    <th> Button </th>
+    <th> Name </th>
+    <th> Go to </th>
+    <th> From 1.2.3 go to</th>
+  </tr>
+  <tr>
+    <td align="center"> [ &lt; ] </td>
+    <td align="center">Back</td>
+    <td>Previous section in reading order</td>
+    <td>1.2.2</td>
+  </tr>
+  <tr>
+    <td align="center"> [ &gt; ] </td>
+    <td align="center">Forward</td>
+    <td>Next section in reading order</td>
+    <td>1.2.4</td>
+  </tr>
+  <tr>
+    <td align="center"> [ &lt;&lt; ] </td>
+    <td align="center">FastBack</td>
+    <td>Beginning of this chapter or previous chapter</td>
+    <td>1</td>
+  </tr>
+  <tr>
+    <td align="center"> [ Up ] </td>
+    <td align="center">Up</td>
+    <td>Up section</td>
+    <td>1.2</td>
+  </tr>
+  <tr>
+    <td align="center"> [ &gt;&gt; ] </td>
+    <td align="center">FastForward</td>
+    <td>Next chapter</td>
+    <td>2</td>
+  </tr>
+  <tr>
+    <td align="center"> [Top] </td>
+    <td align="center">Top</td>
+    <td>Cover (top) of document</td>
+    <td> &nbsp; </td>
+  </tr>
+  <tr>
+    <td align="center"> [Contents] </td>
+    <td align="center">Contents</td>
+    <td>Table of contents</td>
+    <td> &nbsp; </td>
+  </tr>
+  <tr>
+    <td align="center"> [Index] </td>
+    <td align="center">Index</td>
+    <td>Index</td>
+    <td> &nbsp; </td>
+  </tr>
+  <tr>
+    <td align="center"> [ ? ] </td>
+    <td align="center">About</td>
+    <td>About (help)</td>
+    <td> &nbsp; </td>
+  </tr>
+</table>
+
+<p>
+  where the <strong> Example </strong> assumes that the current position is at 
<strong> Subsubsection One-Two-Three </strong> of a document of the following 
structure:
+</p>
+
+<ul>
+  <li> 1. Section One
+    <ul>
+      <li>1.1 Subsection One-One
+        <ul>
+          <li>...</li>
+        </ul>
+      </li>
+      <li>1.2 Subsection One-Two
+        <ul>
+          <li>1.2.1 Subsubsection One-Two-One</li>
+          <li>1.2.2 Subsubsection One-Two-Two</li>
+          <li>1.2.3 Subsubsection One-Two-Three &nbsp; &nbsp;
+            <strong>&lt;== Current Position </strong></li>
+          <li>1.2.4 Subsubsection One-Two-Four</li>
+        </ul>
+      </li>
+      <li>1.3 Subsection One-Three
+        <ul>
+          <li>...</li>
+        </ul>
+      </li>
+      <li>1.4 Subsection One-Four</li>
+    </ul>
+  </li>
+</ul>
+
+<hr size="1">
+<p>
+ <font size="-1">
+  This document was generated by <em>Christian Grothoff</em> on <em>January 1, 2010</em> using <a href="http://www.nongnu.org/texi2html/"><em>texi2html 1.78</em></a>.
+ </font>
+ <br>
+
+</p>
+</body>
+</html>

Modified: Extractor-docs/WWW/index.html
===================================================================
--- Extractor-docs/WWW/index.html       2010-01-01 19:54:34 UTC (rev 9955)
+++ Extractor-docs/WWW/index.html       2010-01-01 19:54:36 UTC (rev 9956)
@@ -29,6 +29,7 @@
 <tr><td bgcolor="efefef"><a href="#contact">Contact</a></td></tr>
 <tr><th nowrap="nowrap" bgcolor="99BBFF"><a 
href="download.html">Download</a></th></tr>
 <tr><th nowrap="nowrap" bgcolor="99BBFF"><a 
href="documentation.html">Documentation</a></th></tr>
+<tr><th nowrap="nowrap" bgcolor="99BBFF"><a href="extractor.html">Reference 
Manual</a></th></tr>
 <tr><th nowrap="nowrap" bgcolor="99BBFF"><a href="http://freshmeat.net/projects/libextractor/">Freshmeat Page</a></th></tr>
 </tbody>
 </table>

Deleted: Extractor-docs/WWW/news
===================================================================
--- Extractor-docs/WWW/news     2010-01-01 19:54:34 UTC (rev 9955)
+++ Extractor-docs/WWW/news     2010-01-01 19:54:36 UTC (rev 9956)
@@ -1,90 +0,0 @@
-    Mon Feb 10 12:00:00 EST 2002
-    <p>
-    There is a bug in rpm-devel-4.0.4-7x.18 which might cause the rpmextractor 
to not install properly. You need to patch /usr/include/rpm/header.h to fix it. 
Here is the <a href=header.patch>patch</a>.
-    
-    Thu Feb 6 17:34:14 EST 2003
-    <p>
-    libextractor v0.2.1 fixes some issues with dynamic libraries that arose in 
0.2.0.
-    
-    Tue Feb 4 05:49:44 EST 2003
-    <p>
-    There is a bug in the dynamic loading in v0.2.0, it does not find 
libraries in /usr/lib, until the fix is released you can get around it by 
adding /usr/lib, /usr/local/lib to LD_LIBRARY_PATH.
-    Sat Feb 1 23:59:59 EST 2003
-    <p>
-    v0.2.0 released with RPM extractor, and libltdl support.
-    
-    Thu Jan 9 18:51:47 EST 2003
-    <p>
-    v0.1.4 released with PDF and PS extractor.
-    Thu Jan 9 18:51:47 EST 2003
-    <p>
-    v0.1.3 released with MIME-extractor.
-    
-    Fri Nov 22 21:54:10 EST 2002
-    <p>
-    Fixed portability problems with the gifextractor, in particular the code now ensures that C compilers that do not pack the structs are still going to result in working code.
-    Sat Oct 19 02:00 EST 2002
-    <p>
-    Compiled on Solaris with some tweaking:
-    <pre>
- For solaris you need:
- 
-        * latest CVS or libextractor 0.1.3
-        * gnu make (!), gcc, gnu auto-tools (2.1x is ok)
-        * with autoconf 2.13, you must run:
-
-# ./autogen
-# ./automake -a -i
-# ./configure --disable-dependency-tracking
-</pre>
-Note that just --disable-dependency-tracking without the '-i' option to automake does not seem to work and results in a compile-error from the preprocessor. The -ldl issue (dlsym) is resolved in the latest configure.in.
-<p>
-For compilation on Purdue CS machines, you need to do PATH=/usr/ccs/bin:/p/gnu:$PATH before you start to get the GNU linker (/usr/ccs/bin/ld) and GNU make (/p/gnu/make) and not Solaris make (/usr/ccs/bin/make). <br>
-
-
-    Tue Oct 1 17:00:20 EST 2002
-    <p>Fixed a bug in ogg. <br>
-     
-    Wed Jun 12 23:42:55 EST 2002
-    <p>
-    Added a dozen options to extract. Added man pages. Released v0.1.0.<br>
-    
-    Fri Jun 7 01:48:34 EST 2002
-    <p>
-    Added support for real (real.com). Released v0.0.3. Also released rpms! <br>
-
-    Fri Jun 7 00:21:40 EST 2002
-    <p>
-    Added support for GIF (what a crazy format). <br>
-
-    Tue Jun 4 23:21:38 EST 2002
-    <p>
-    Added support for PNG, no longer reading the file again and again for each 
extractor (slight interface change, mmapping). <br>
-
-    Sun Jun 2 22:49:17 EST 2002
-    <p>
-    Added support for JPEG and HTML. HTML does not support concurrent use, though (inherent problem with libhtmlparse). Released v0.0.2. <br>
-
-    Sat May 25 16:56:59 EST 2002
-    <p>
-    Added building of a description from artist, title and album, fixed bugs. 
<br>
-
-    Tue May 21 22:24:07 EST 2002
-    <p>
-    Added removing of duplicates, splitting keywords, extraction of keywords from filenames. <br>
-
-    Sat May 18 16:33:28 EST 2002
-    <p>
-    more convenience methods ('configuration', default set of libraries, remove all libraries) <br>
-
-    Sat May 18 02:33:28 EST 2002
-    <p>
-    ogg extractor works, mp3 extractor now always works <br>
-
-    Thu May 16 00:04:03 EST 2002
-    <p>
-    MP3 extractor mostly works. <br>
-
-    Wed May 15 23:38:31 EST 2002
-    <p>
-    The basics are there, let's write extractors!

Deleted: Extractor-docs/WWW/oldnews.html
===================================================================
--- Extractor-docs/WWW/oldnews.html     2010-01-01 19:54:34 UTC (rev 9955)
+++ Extractor-docs/WWW/oldnews.html     2010-01-01 19:54:36 UTC (rev 9956)
@@ -1,195 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
-<html>
-<head>
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
-<title>libExtractor - News Archive</title>
-<meta name="content-language" content="en">
-<meta name="language" content="en"><meta name="author" content="Vids Samanta">
-<meta name="rights" content="(C) 2002,2003,2004,2005,2006,2007 by Vids 
Samanta">
-<meta name="keywords" content="keyword, extraction, mp3, html, pdf, images, 
jpeg, gif, ps, mime">
-<meta name="robots" content="index,follow">
-<meta name="revisit-after" content="28 days">
-<meta name="content-language" content="en">
-<meta name="language" content="en">
-<meta http-equiv="expires" content="43200">
-<meta http-equiv="content-type" content="text/html; charset=UTF-8">
-<link rel="SHORTCUT ICON" href="http://gnunet.org/libextractor/favicon.ico">
-</head>
-<body>
-
-<table>
-<tbody>
-<tr><td colspan="2" width="99%" bgcolor="#99bbff" align="center">libExtractor 
- News Archive</td></tr>
-<tr><td valign="top">
-<table width="15%" border="0" cellpadding="2" cellspacing="3"><tbody>
-<tr><th nowrap="nowrap" bgcolor="99BBFF"><a 
href="libextractor.html">Home</a></th></tr>
-<tr><th nowrap="nowrap" bgcolor="99BBFF"><a 
href="download.html">Download</a></th></tr>
-<tr><th nowrap="nowrap" bgcolor="99BBFF"><a 
href="documentation.html">Documentation</a></th></tr>
-<tr><th nowrap="nowrap" bgcolor="99BBFF"><a href="oldnews.html">Old 
News</a></th></tr>
-<tr><th nowrap="nowrap" bgcolor="99BBFF"><a href="http://freshmeat.net/projects/libextractor/">Freshmeat Page</a></th></tr>
-</tbody>
-</table>
-</td>
-
-<td valign="top">
-<dl>
-<dt>Sun Nov  2 20:19:02 MST 2008 | libextractor v0.5.21 released.</dt>
-<dd>This release adds support for the S3M, XM and IT file formats. RPM support 
now requires librpm. Crashes in the OpenOffice and tiff plugins were fixed.</dd>
-<dt>Sun Jul 13 19:35:07 MDT 2008 | libextractor v0.5.20c released.</dt>
-<dd>This release fixes build issues (locale paths and OpenBSD link errors) and 
fixes some issues with thread-safety of module handling. A new experimental 
plugin supporting thumbnail extraction using ffmpeg was added (not used by 
default).</dd>
-<dt>Fri Apr 25 08:46:10 MDT 2008 | libextractor v0.5.20b released.</dt>
-<dd>This release fixes security issues in the XPDF-based PDF plugin (which is 
not the one used by default).</dd>
-<dt>Mon Apr 14 14:23:33 MDT 2008 | libextractor v0.5.20a released.</dt>
-<dd>This release updates the Swedish, Vietnamese, German and Gaelic 
translations and adds translations for Dutch.</dd>
-<dt>Thu Mar 20 23:38:47 MDT 2008 | libextractor v0.5.20 released.</dt>
-<dd>This release adds support for AppleSingle and AppleDouble files and 
improves extraction of track numbers and ISRC codes.</dd>
-<dt>Sat Jan 12 14:10:59 MST 2008 | libextractor v0.5.19a released.</dt>
-<dd>This release fixes security issues in the XPDF-based PDF plugin (which is 
not the one used by default).</dd>
-<dt>Mon Jan  7 08:51:58 MST 2008 | libextractor v0.5.19 released.</dt>
-<dd>This release adds support for Adobe Flash (FLV) and Free Lossless Audio 
Codec (FLAC) files. The quicktime, ole2 and split extractors were also 
improved.</dd>
-<dt>Wed Jul  4 17:53:45 MDT 2007 | libextractor v0.5.18a released.</dt>
-<dd>This release fixes various build problems (Qt, automake 1.10), a crash 
with recent versions of libgsf and adds an (incomplete) manual.</dd>
-<dt>Sat Apr 21 17:11:56 MDT 2007 | libextractor-java v0.5.18 released.</dt>
-<dd>This release adds support for in-memory metadata extraction.  The API was 
changed to return ArrayLists instead of Vectors.</dd>
-<dt>Sun Mar 11 17:55:10 MDT 2007 | libextractor v0.5.18 released.</dt>
-<dd>This release adds support for NSFE files. Removal of duplicate keywords is 
now biased against keywords obtained from splitting. The build process should 
now work properly if no C++ compiler is found. The thumbnail-extractors should 
now load properly in all cases (resolved a symbol naming problem).</dd>
-<dt>Mon Jan  1 19:52:50 MST 2007 | libextractor v0.5.17 released.</dt>
-<dd>This release adds support for SID files (C64 music files) and pkg-config 
detection of libextractor.  A new option makes <tt>extract</tt> easier to use 
with <tt>grep</tt>.  Splitextractor split tokens can now be configured.</dd>
-<dt>Sat Nov 11 16:15:22 MST 2006 | libextractor v0.5.16 released.</dt>
-<dd>This release enhances support for ID3 tags and adds support for the NES 
Sound Format (NSF). It also resolves an issue with libextractor truncating 
libltdl search paths.</dd>
-<dt>Wed Sep  6 14:24:56 PDT 2006 | libextractor v0.5.15 released.</dt>
-<dd>This release fixes minor problems in the PDF extractors and improves the 
PNG extractor. It also makes libextractor relocatable (by no longer building 
the installation path into the binary). Various translations have been 
updated.</dd>
-<dt>Wed May 17 02:17:21 PDT 2006 | libextractor v0.5.14 released.</dt>
-<dd>This release fixes a recently reported security problem in the ASF plugin. 
It also fixes a security problem in the qt extractor (by re-writing it from 
scratch). The re-write also improves support for various Quicktime attributes. 
The mpeg extractor was changed to use <tt>libmpeg2</tt> fixing a problem with 
occasionally wrong image dimensions.</dd>
-<dt>Sat Apr 29 21:30:13 PDT 2006 | libextractor v0.5.13 released.</dt>
-<dd>This release adds Vietnamese and Swedish translations. Some minor changes 
to improve portability and internationalization were made. The wordleaker 
plugin was integrated into the OLE2 plugin, which is now based on libgsf (new 
dependency!) and extracts more and more accurate metadata.</dd>
-<dt>Sat Apr 22 11:55:42 PDT 2006 | libextractor v0.5.12 released.</dt>
-<dd>This release adds an alternative implementation of the PDF extractor. 
Finnish, French, Gaelic and Swedish are now supported by the printable 
extractor. Memory utilization for compiling the printable extractors should no 
longer be an issue.</dd>
-<dt>Thu Mar 10 17:46:39 PST 2006 | libextractor v0.5.11 released.</dt>
-<dd>This release adds support for extracting additional metadata from MS Word 
(OLE2) streams, including language, document statistics and editing 
history.</dd>
-<dt>Sat Feb 18 17:42:24 PST 2006 | libextractor v0.5.10 released.</dt>
-<dd>This release fixes some minor security problems in the PDF extractor. The 
OLE2 extractor supports additional mime types. The TAR extractor is now 
extracting date, format long filenames and supports more checksum variants.</dd>
-<dt>Fri Dec 23 12:58:18 PST 2005 | libextractor v0.5.9 released.</dt>
-<dd>This release fixes a rare crash in the MIME-extractor. The TAR extractor 
is now more robust and supports additional TAR variants. The split extractor 
now uses SPLIT for the keyword type, instead of UNKNOWN.</dd>
-<dt>Tue Dec  6 13:25:56 PST 2005 | libextractor v0.5.8 released.</dt>
-<dd>This release fixes a <a 
href="http://www.idefense.com/application/poi/display?id=344&amp;type=vulnerabilities">security 
problem</a> in the PDF extractor.</dd>
-<dt>Sat Nov 12 10:50:46 PST 2005 | libextractor v0.5.7 released.</dt>
-<dd>This release features an updated German translation and improves support 
for the TAR and PDF formats. Mime-type detection for OLE2 streams was improved. 
The extract tool now returns an error code if files passed as arguments could 
not be accessed. A double-free problem under BSD was fixed.</dd>
-<dt>Sun Sep 18 19:39:42 PDT 2005 | libextractor v0.5.6 released.</dt>
-<dd>This release fixes warnings with gcc 4.0 and various bugs in the 
decompression code (including making it backwards compatible with zlib 1.1). 
Files are now mmaped read-only (possibly helping the VM perform better for very 
large files). The exiv2 extractor no longer copies the file in memory. The HTML 
extractor was completely rewritten and made simpler and more robust.</dd>
-<dt>Wed Sep  7 21:41:35 PDT 2005 | libextractor v0.5.5 released.</dt>
-<dd>This release fixes a problem with linkers that caused segmentation faults 
for Debian unstable users. The deb extractor no longer uses pthreads. Dead code 
was eliminated in the OLE2 and OO extractors. Minor bugfixes were ported from 
libgsf to the OLE2 extractor. Mime-types are now detected for various Microsoft 
Office formats. libextractor now automatically decompresses GZ and BZ2 files 
before extracting keywords, adding support for compressed files to all formats. 
Individual extractors no longer perform full-file decompression, avoiding 
some redundant computation.</dd>
-<dt>Fri Aug 26 22:47:07 PDT 2005 | libextractor v0.5.4 released.</dt>
-<dd>This release fixes a memory leak in the thumbnail extractor, character set 
conversion in the OLE2 extractors and the build on OS X. Quotations now follow 
GNU standards. A workaround for a bug in libstdc++ that could cause 
segmentation faults was added. A new version of the Python binding has also been 
released; this revision fixes various problems with the build process.</dd>
-<dt>Sat Aug 13 19:08:46 PDT 2005 | libextractor v0.5.3 released.</dt>
-<dd>This release fixes various bugs in the EXIV2, OO and OLE2 plugins.  A 
static, relocatable version of glib is no longer required.</dd> 
-<dt>Thu Jul 14 22:31:28 CEST 2005 | libextractor v0.5.2 released.</dt>
-<dd>This release adds support for exiv2. The API was extended to support 
in-memory metadata extraction (no file required). Also new are functions to 
encode and decode the binary metadata of a thumbnail. Various plugins were 
changed to allow for the in-memory metadata extraction. A minor compile error 
was fixed.</dd>
-<dt>Mon Jul  4 18:24:18 CEST 2005 | libextractor v0.5.1 released.</dt>
-<dd>This release moves the Java and Python bindings into separate packages. 
The new version improves the build system and contains some code cleanups.</dd>
-<dt>Sun May 21 13:58:52 CET 2005 | libextractor v0.5.0 released.</dt>
-<dd>This release adds support for Python. Also, plugins can now be supplied 
with user-provided options.</dd>
-<dt>Thu Feb 24 01:23:31 EST 2005 | libextractor v0.4.2 released.</dt>
-<dd>This release fixes some bugs in the ID3, PDF, PNG and REAL extractors. The 
REAL extractor now also handles the new Helix formats. libextractor can now 
also be used to extract thumbnails from images (using ImageMagick).</dd>
-<dt>Wed Jan 26 19:51:44 EST 2005 | libextractor v0.4.1 released.</dt>
-<dd>This release fixes a security issue (inherited from xpdf). It also 
extracts more meta-data from files of TAR or QuickTime format.</dd>
-<dt>Sat Dec 25 21:42:26 CET 2004 | libextractor v0.4.0 released.</dt>
-<dd>This release improves support for character sets (plugins are now expected 
to convert to UTF-8). It also improves support for mp3 (adding genres) and png 
(handling of compressed comments).</dd>
-<dt>Sat Nov 13 13:23:23 EST 2004 | libextractor v0.3.11 released.</dt>
-<dd>This release fixes bugs in the dvi, man, ID3v2.3, ole2 and pdf 
extractors.</dd>
-<dt>Sun Oct 18 13:23:35 EST 2004 | libextractor v0.3.10 released.</dt>
-<dd>This release adds support for ID3v2.3 and ID3v2.4.  It fixes bugs in the 
tar, man, deb, mp3 and ole2 extractors.</dd>
-<dt>Sat Oct 17 18:12:11 EST 2004 | libextractor v0.3.9 released.</dt>
-<dd>This release adds support for the man, tar (including tar.gz) and deb 
formats. It fixes bugs in the id3v2 and jpeg extractors. The size of jpeg 
images is now also extracted. This version adds support for 64-bit file 
sizes.</dd>
-<dt>Sat Oct 02 20:00:04 EST 2004 | libextractor v0.3.8 released.</dt>
-<dd>This release adds support for dvi (from TeX). The plugins are now 
installed in a separate plugin directory. libextractor now works under OS X 
(10.3).</dd>
-<dt>Fri Sep 23 23:30:33 EST 2004 | libextractor v0.3.7 released.</dt>
-<dd>This release adds support for StarOffice formats, ID3v2 tags and the 
RIPEMD-160 hash function. It also improves the performance of the HTML and ZIP 
extractors.</dd>
-<dt>Fri Sep 10 20:10:38 EST 2004 | libextractor v0.3.6 released.</dt>
-<dd>This release adds support for OpenOffice formats, hash functions (md5, 
sha-1) and fixes some build problems.</dd>
-<dt>Mon Aug 30 23:18:49 IST 2004 | libextractor v0.3.5 released.</dt>
-<dd>This release adds support for OLE2 (WinWord, PowerPoint, Excel formats) 
and fixes various minor bugs. For OLE2 support you will have to have glib 2.0 
installed (yes, that is glib from GTK/Gnome, not glibc!).</dd>
-<dt>Thu Aug 26 20:27:24 IST 2004 | Bugtracking using Mantis enabled.</dt>
-<dd>You can now report and view bug-reports about libextractor on <a 
href="https://gnunet.org/mantis/">Mantis</a>.</dd>
-<dt>Wed Aug 25 19:02:07 IST 2004 | libextractor v0.3.4 released.</dt>
-<dd>This release fixes a minor linking error (<tt>-lm</tt> for 
<tt>floor</tt>), improves performance and adds support for GNU gettext 
(internationalization).</dd>
-<dt>Wed May 31 19:22:07 EST 2004 | libextractor v0.3.3 released.</dt>
-<dd>This release fixes various minor bugs (segmentation faults and 
non-termination of mpeg and riff extractors for malformed files) and adds 
support for WAV files.</dd>
-<dt>Wed May 31 19:22:07 EST 2004 | libextractor v0.3.2 released.</dt>
-<dd>This release fixes various minor bugs (plugins misbehaving for malformed 
files) and improves portability to Cygwin/MinGW.</dd>
-<dt>Wed Apr 28 19:29:24 EST 2004 | libextractor v0.3.1 released.</dt>
-<dd>This release adds support for ELF and fixes various minor bugs, including 
a memory leak in the PDF code and a problem with possible non-termination of 
the RIFF parser. Also the missing Java header file is now packaged 
properly.</dd>
-<dt>Mon Apr 12 08:28:55 EST 2004 | libextractor v0.3.0 released.</dt>
-<dd>This release adds extractors for MPEG and RIFF (also known as AVI). 
libextractor and all of its plugins should now be reentrant. Various bugs 
including memory leaks and possible segfaults were fixed. The man-pages were 
updated and automated testcases were added. The library initialization 
sequence was streamlined and where possible a <tt>const</tt> modifier was added 
to function prototypes. A Java interface to libextractor is now included and 
the corresponding JNI functions are built into libextractor (if <tt>jni.h</tt> 
is found by configure).</dd>
-<dt>Sun Apr  4 20:24:39 EST 2004 | Libextractor v0.2.7 released.</dt>
-<dd>This release improves portability to Win32 (mingw), fixes various minor 
problems and adds support for TIFF. For PNG and GIF the dimensions of the image 
are now also extracted.</dd>
-<dt>Thu Oct 16 23:11:42 EST 2003 | Libextractor v0.2.6 released.</dt>
-<dd>This release fixes various portability issues. 0.2.6 works under OSX and 
the binary extraction plugins were fixed to work properly on all supported 
platforms. Memory requirements for compiling libextractor have been reduced 
dramatically (now needs only about 100 MB). Various minor bugs were fixed.</dd>
-<dt>Wed Jul 16 21:36:55 EST 2003 | Libextractor v0.2.5 released.</dt>
-<dd>libextractor 0.2.5 fixes some bugs in the binary extraction plugin. Also 
it no longer needs aspell or pspell since it now does spell checking itself 
using a <a href="http://gnunet.org/bloomfilter">bloomfilter</a>.</dd>
-<dt>Mon Jun  30 23:16:34 EST 2003 | Libextractor v0.2.4 released.</dt>
-<dd>libextractor 0.2.4 contains a new generic plugin for the extraction of 
ascii text from binaries. The plugin can use spell checkers to validate 
keywords. Performance of extraction was greatly improved by limiting the pdf 
extractor to process only well formed pdf files.</dd>
-<dt>Fri Apr 11 18:54:47 EST 2003 | Bugfix in dynamic libs and rpm plugin, 
added ASF and QT plugins</dt>
-<dd>v0.2.3 released with new plugins for ASF and QT. Also removed dependency on 
rpmdevel, rpm code is now integrated in the rpm plugin. Also fixed some issues 
with dynamic libraries.</dd>
-<dt>Wed Feb 26 04:29:32 EST 2003</dt>
-<dd>Added zip plugin by Julia Wolf, and fixed a bug in linking.</dd>
-<dt>Mon Feb 10 12:00:00 EST 2002</dt>
-<dd>There is a bug in rpm-devel-4.0.4-7x.18 which might cause the rpmextractor 
to not install properly. You need to patch /usr/include/rpm/header.h to fix it. 
Here is the <a 
href="http://gnunet.org/libextractor/header.patch">patch</a>.</dd>
-<dt>Thu Feb 6 17:34:14 EST 2003</dt>
-<dd>libextractor v0.2.1 fixes some issues with dynamic libraries that arose in 
0.2.0.</dd>
-<dt>Tue Feb 4 05:49:44 EST 2003</dt>
-<dd>There is a bug in the dynamic loading in v0.2.0: it does not find 
libraries in /usr/lib. Until the fix is released, you can work around it by 
adding /usr/lib and /usr/local/lib to LD_LIBRARY_PATH.</dd>
-<dt>Sat Feb 1 23:59:59 EST 2003</dt>
-<dd>v0.2.0 released with RPM extractor, and libltdl support.</dd>
-<dt>Thu Jan 9 18:51:47 EST 2003</dt>
-<dd>v0.1.4 released with PDF and PS extractor.</dd>
-<dt>Thu Jan 9 18:51:47 EST 2003</dt>
-<dd>v0.1.3 released with MIME-extractor.</dd>
-<dt>Fri Nov 22 21:54:10 EST 2002</dt>
-<dd>Fixed portability problems with the gifextractor, in particular the code 
now ensures that C compilers that do not pack the structs are still going to 
result in working code.</dd>
-<dt>Sat Oct 19 02:00 EST 2002</dt>
-<dd>Compiled on Solaris with some tweaking:
-For Solaris you need:
-<ul>
-<li>latest CVS or libextractor 0.1.3
-<li>gnu make (!), gcc, gnu auto-tools (2.1x is ok)
-<li>with autoconf 2.13, you must run:
-<pre>
-# ./autogen
-# ./automake -a -i
-# ./configure --disable-dependency-tracking
-</pre>
-</ul>
-Note that just --disable-dependency-tracking without the '-i' option to 
automake does not seem to work and results in a compile-error from the 
preprocessor. The -ldl issue (dlsym) is resolved in the latest configure.in. 
For compilation on Purdue CS machines, you need to do 
<tt>PATH=/usr/ccs/bin:/p/gnu:$PATH</tt> before you start to get the GNU linker 
(/usr/ccs/bin/ld) and GNU make (/p/gnu/make) and not Solaris make 
(/usr/ccs/bin/make).</dd>
-<dt>Tue Oct 1 17:00:20 EST 2002</dt>
-<dd>Fixed a bug in ogg.</dd>
-<dt>Wed Jun 12 23:42:55 EST 2002</dt>
-<dd>Added a dozen options to extract. Added man pages. Released v0.1.0.</dd>
-<dt>Fri Jun 7 01:48:34 EST 2002</dt>
-<dd>Added support for real (real.com). Released v0.0.3. Also released 
rpms!</dd>
-<dt>Fri Jun 7 00:21:40 EST 2002</dt>
-<dd>Added support for GIF (what a crazy format).</dd>
-<dt>Tue Jun 4 23:21:38 EST 2002</dt>
-<dd>Added support for PNG, no longer reading the file again and again for each 
extractor (slight interface change, mmapping).</dd>
-<dt>Sun Jun 2 22:49:17 EST 2002</dt>
-<dd>Added support for JPEG and HTML. HTML does not support concurrent use, 
though (inherent problem with libhtmlparse). Released v0.0.2.</dd>
-<dt>Sat May 25 16:56:59 EST 2002</dt>
-<dd>Added building of a description from artist, title and album, fixed 
bugs.</dd>
-<dt>Tue May 21 22:24:07 EST 2002</dt>
-<dd>Added removing of duplicates, splitting keywords, extraction of keywords 
from filenames.</dd>
-<dt>Sat May 18 16:33:28 EST 2002</dt>
-<dd>more convenience methods ('configuration', default set of libraries, 
remove all libraries)</dd>
-<dt>Sat May 18 02:33:28 EST 2002</dt>
-<dd>ogg extractor works, mp3 extractor now always works</dd>
-<dt>Thu May 16 00:04:03 EST 2002</dt>
-<dd>MP3 extractor mostly works.</dd>
-<dt>Wed May 15 23:38:31 EST 2002</dt>
-<dd>The basics are there, let's write extractors!</dd>
-</dl>
-</td>
-</tr>
-</tbody>
-</table>
-<hr>
-<a href="mailto:address@hidden">address@hidden</a>
-</body></html>




