Re: [Koha-devel] marc_word and searching
From: Joshua Ferraro
Subject: Re: [Koha-devel] marc_word and searching
Date: Sun May 30 05:43:12 2004
User-agent: Mutt/1.4.1i

On Fri, May 28, 2004 at 09:27:05AM +0200, paul POULAIN wrote:
> Joshua Ferraro wrote:
>
> >Paul wrote:
> >
> >NPL had a tech meeting today focusing on OPAC searching, and we have
> >reached some tentative conclusions about how to proceed. Running some
> >test searches using marc_subfield_table, we realized that a search model
> >based on that table is inadequate for our needs. For example, a search
> >on 'patrick o'brian' using the 'like' syntax produces no results if the
> >database entry is stored as 'o'brian, patrick' (which is the format when
> >the author is stored in the 100a). On the other hand, a search using the
> >current marc_word model fails for reasons we have already talked about
> >(marc_word does not keep track of single characters, etc.). But if the
> >marc_word table did index single characters, a search model based on
> >marc_word would work very well. For example, searches on 'o'brian, patrick'
> >and 'patrick o'brian' would both return the correct records. So our idea
> >is to re-create our marc_word table so that it indexes all characters from
> >the tags and subfields that we want to use for searches (we don't need all
> >of them, as you pointed out; for instance, we will never use 300 for a
> >search). So we have three basic tasks:
> >
> >
> Another idea, which may be better:
> replace ' with _.
> Thus a search for o'brian looks up o_brian, which is what will be stored
> in the DB.
> The only limitation is that a search on brian alone won't succeed. Tell me
> if that's a problem.
>
> Otherwise, we could add an 'index one-letter words too' option, but, imho,
> ONLY together with the 'do not index this subfield' feature.
It seems to me that indexing single-character words will lead to a more
accurate search, though the marc_word table will be a bit bloated. I suppose
we should think about other punctuation marks too: do we change them
all to _ or do we leave them in the database?
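
To make the trade-off concrete, here is a minimal sketch of the two options. It is written in Python rather than against Biblio.pm, the in-memory dictionary only stands in for marc_word, and every function name and record id is made up for illustration.

```python
import re
from collections import defaultdict

def words_split(text):
    """Option 1: split on every punctuation mark and keep one-letter words."""
    return [w for w in re.split(r"[\W_]+", text.lower()) if w]

def words_underscore(text):
    """Option 2: turn ' into _ first, then split on the remaining punctuation."""
    return [w for w in re.split(r"\W+", text.lower().replace("'", "_")) if w]

def build_index(records, tokenize):
    """Map each word to the set of record ids containing it (stand-in for marc_word)."""
    index = defaultdict(set)
    for record_id, value in records.items():
        for word in tokenize(value):
            index[word].add(record_id)
    return index

def search(index, query, tokenize):
    """Return the ids of records that contain every word of the query, in any order."""
    words = tokenize(query)
    if not words:
        return set()
    return set.intersection(*(index.get(w, set()) for w in words))

# Hypothetical 100a values keyed by made-up record ids.
records = {1: "o'brian, patrick", 2: "tolkien, j. r. r."}

split_index = build_index(records, words_split)
under_index = build_index(records, words_underscore)
```

With the first tokenizer, 'patrick o'brian', 'o'brian, patrick' and plain 'brian' all find record 1. With the underscore tokenizer the first two still match, but 'brian' alone finds nothing, which is the limitation Paul mentions.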
> Everybody can give their opinion here. Both solutions are easy to code.
>
> >1.) write a script to re-create marc_word using the parameters we choose
> >for searching and including all characters.
> >
> >2.) fix Biblio.pm so that it includes all characters when it adds records
> >to marc_word (currently we add to our holdings using a modified version of
> >bulkmarcimport.pl that relies on Biblio.pm)
> >
> >3.) write a clean-up script to delete from marc_word all the tags and
> >subfields that we will never use (like 300)
> >
> >Does that sound like a sound plan to you, Paul? Do you have any scripts
> >that will speed up the process of rebuilding our marc_word table? If not,
> >we will write one ourselves. Can you make the changes to Biblio.pm that
> >will force it to index single characters?
> >
> >
> Yep, if we decide to do it.
> I have no fast script to rebuild the marc_word table :-(
>
> >One final point about search results. Currently the MARC search does
> >not pass all the variables to the template, so we cannot choose which
> >values to display (for example, Lord of the Rings: The Two Towers currently
> >displays as 'Lord of the Rings:' without the subtitle). I suggest that
> >we set up a way to easily make MARC fields available to the template,
> >so that each library can decide exactly which MARC fields it wants to
> >display in the initial search results.
> >
> >
> Already planned. I'll try to commit some code to CVS ASAP.
> "MARC view" is ready (in the OPAC).
> We plan to add a system preference called 'ISBD' where the library can
> define its own biblio presentation.
> Something like:
> [200a;][200b/][(100c)]
>
> The ; means a ; is added AFTER the 200a; the ( means a ( is added BEFORE
> the 100c.
> Not exactly an ISBD view, but not too far off either.
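
As a rough illustration of how such a template might be expanded, here is a small Python sketch. The grammar is only guessed from the single example above: each bracketed block holds optional leading text, a three-digit tag plus a subfield letter, and optional trailing text. Skipping blocks whose subfield is missing and joining the pieces with a space are assumptions too, and the record values are placeholders.

```python
import re

# One bracketed block: optional leading text, a tag (3 digits) plus one
# subfield character, optional trailing text. Assumed grammar, not Koha's.
BLOCK = re.compile(r"\[([^\[\]0-9]*)(\d{3})([a-z0-9])([^\[\]]*)\]")

def render(template, record):
    """Expand `template` using `record`, a dict keyed by (tag, subfield)."""
    parts = []
    for before, tag, subfield, after in BLOCK.findall(template):
        value = record.get((tag, subfield))
        if value is None:
            continue  # assumption: a block with no data is dropped entirely
        parts.append(before + value + after)
    return " ".join(parts)  # assumption: blocks are separated by one space

# Placeholder values; only the tag/subfield keys come from the example.
record = {("200", "a"): "Alpha", ("200", "b"): "Beta", ("100", "c"): "Gamma"}
```

Rendering `[200a;][200b/][(100c)]` against that record gives `Alpha; Beta/ (Gamma)`, and a record without 200b would give `Alpha; (Gamma)`.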
Thanks Paul.
Joshua