koha-devel

[Koha-devel] Zebra Searching


From: Joshua Ferraro
Subject: [Koha-devel] Zebra Searching
Date: Mon Jun 13 13:59:21 2005
User-agent: Mutt/1.4.1i

Hi everyone,

In case you haven't been following the IRC logs, we've been discussing
Zebra as a potential search engine. From Index Data's website:

Zebra is a high-performance, general-purpose structured text indexing and 
retrieval engine. It reads structured records in a variety of input formats 
(eg. email, XML, MARC) and allows access to them through exact boolean search 
expressions and relevance-ranked free-text queries.

Zebra supports large databases (more than ten gigabytes of data, tens of 
millions of records). It supports incremental, safe database updates on live 
systems. You can access data stored in Zebra using a variety of Index Data 
tools (eg. YAZ and PHP/YAZ) as well as commercial and freeware Z39.50 clients 
and toolkits. 

http://indexdata.dk/zebra

I've set up a Zebra test site running on LibLime's server. It currently
has access to three Zebra datasets: Nelsonville's 150K records, LibLime's
5 million records (recently donated by sanspach), and Paul Poulain's 13K
records. (Paul is still working out some issues with indexing UNIMARC
records, so stay tuned for that one to work.)

http://liblime.com/zap/advanced.html

Note that search and retrieval are done via the Z39.50 protocol, using
the server that ships with Zebra. Both the index and the server can be
customized based on the kinds of searches you want to perform (the
above site is just a proof of concept), so we'd have support for relevance
ranking, stemming, the whole gamut of searching technologies.
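
To give a feel for what a Z39.50 search looks like on the wire, here is a
rough sketch that builds queries in Prefix Query Format (PQF), the notation
YAZ-based clients accept. The helper names (pqf_term, pqf_and) and the
choice of access points are my own assumptions for illustration; which
Bib-1 attributes a given Zebra server actually answers depends on how its
indexes are configured.

```python
# Hypothetical helpers (not part of Koha or Zebra) that build Z39.50
# Prefix Query Format (PQF) strings.

# Bib-1 "use" attribute numbers (attribute type 1) for common access points.
BIB1_USE = {
    "any": 1016,
    "title": 4,
    "author": 1003,
    "isbn": 7,
    "subject": 21,
}


def pqf_term(index: str, term: str) -> str:
    """Return one PQF clause, e.g. '@attr 1=4 "zebra"'."""
    if '"' in term:
        # Kept simple on purpose: this sketch does not attempt quote escaping.
        raise ValueError("double quotes are not supported in this sketch")
    return f'@attr 1={BIB1_USE[index]} "{term}"'


def pqf_and(*clauses: str) -> str:
    """Join clauses with @and. The operator is binary and prefix, so we nest."""
    if not clauses:
        raise ValueError("need at least one clause")
    query = clauses[0]
    for clause in clauses[1:]:
        query = f"@and {query} {clause}"
    return query


if __name__ == "__main__":
    print(pqf_and(pqf_term("title", "zebra"), pqf_term("author", "ferraro")))
```

A title-plus-author search built this way is a single string that any
Z39.50 client can send to the server unchanged.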

In all my tests, searches are returned in under a second.

If we decide to work with Zebra, we will need to decide what to do with
non-MARC libraries. Should we develop an export utility that will allow
Zebra to index the records (in, say, XML format)? Should we use the Koha
tables to create a basic MARC record for use with Zebra? Should we leave
the Koha 1.x searching methods unchanged and only use Zebra for
MARC libraries? Also, what should we do with the existing marc_*_table
tables?
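
As a starting point for discussion, the "basic MARC record" option could
look something like the sketch below, which turns one non-MARC row into a
minimal MARCXML record. Everything here is an assumption for illustration:
the function name, the column names (title, author, isbn), and the
field mapping (245$a, 100$a, 020$a). It also leaves out the leader and
control fields that a real record would need before indexing.

```python
import xml.etree.ElementTree as ET

# MARCXML ("MARC21 slim") namespace published by the Library of Congress.
MARC_NS = "http://www.loc.gov/MARC21/slim"


def basic_marcxml(row: dict) -> str:
    """Build a minimal MARCXML record from a non-MARC bibliographic row.

    `row` uses hypothetical column names: 'title', 'author', 'isbn'.
    """
    ET.register_namespace("", MARC_NS)
    record = ET.Element(f"{{{MARC_NS}}}record")

    # 245 first indicator: "1" when a main entry (100) exists, else "0".
    title_ind1 = "1" if row.get("author") else "0"

    # (MARC tag, ind1, ind2, subfield code, source column)
    mapping = [
        ("020", " ", " ", "a", "isbn"),
        ("100", "1", " ", "a", "author"),
        ("245", title_ind1, "0", "a", "title"),
    ]
    for tag, ind1, ind2, code, column in mapping:
        value = row.get(column)
        if not value:
            continue  # skip columns that are empty or missing
        field = ET.SubElement(
            record,
            f"{{{MARC_NS}}}datafield",
            {"tag": tag, "ind1": ind1, "ind2": ind2},
        )
        subfield = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        subfield.text = value
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    print(basic_marcxml({"title": "Example title", "author": "Doe, Jane"}))
```

The same mapping table could drive either a one-off export utility or an
on-the-fly conversion, so the sketch does not commit us to one approach.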

So ... it's clearly time to schedule a "Koha 2.4 Searching Group Meeting" on
IRC. I'd like to pick a time when everyone can be represented. How
is Thursday, June 23 at 9:00 GMT? Here's the time in your area:
http://tinyurl.com/925c8

Please let me know on-list if you will not be able to attend, and what
time would work for you instead.

Comments, suggestions, concerns?

-- 
Joshua Ferraro               VENDOR SERVICES FOR OPEN-SOURCE SOFTWARE
President, Technology       migration, training, maintenance, support
LibLime                  Koha ILS, Mambo Intranet, DiscrimiNet Filter
address@hidden |  Full Demos at http://liblime.com  |  1(888)KohaILS


