
Re: [Mifluz-dev] Question on API


From: Geoff Hutchison
Subject: Re: [Mifluz-dev] Question on API
Date: Sun, 17 Feb 2002 22:43:54 -0600

On Sunday, February 17, 2002, at 04:34  PM, Brian Aker wrote:

1) Has an API that allows me to create an initial index from the api.
2) Allows me to insert a blob of text with a unique keyword
4) Allows me to delete entries (and replace).

Of course.

5) Needs to be thread safe and not leak memory.

Loic may know better than I how rigorously this has been tested, but yes, this should be fine.

7) Needs to be able to search an index that was created with 1 gig of
text in under 3 seconds.

No offense, but this is a bit nonsensical. Granted, I'll assume that you're going to back things up with reliable, fast hardware. But there's a great deal of difference between, say, 1 billion keys with a few bytes of record attached and 1 million keys with a few K of record. Things generally scale by the number of keys more than anything else. Even so, unless you're returning a lot of query hits and need to do significant work before presentation, 3 seconds is a lot of CPU time.

6) It would be great if I could restrict a search to a certain set of
unique keywords (aka the keys representing the text blobs).

Not a problem. Consider for example the substring or prefix "fuzzy match" algorithms used by ht://Dig.
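One way to picture the prefix approach is the sketch below. It assumes the document keys are kept in sorted order and that the set of keys to search is described by a shared key prefix. The names `keys_with_prefix` and `restrict_hits` and the sample keys are hypothetical; this is not mifluz or ht://Dig code, only an illustration of the idea.

```python
from bisect import bisect_left


def keys_with_prefix(sorted_keys, prefix):
    """Return every key in sorted_keys that starts with prefix.

    Keys sharing a prefix are contiguous in a sorted list, so we jump to
    the first candidate with a binary search and stop at the first miss.
    """
    start = bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches


def restrict_hits(hits, sorted_keys, prefix):
    """Keep only the query hits whose key falls under the given prefix."""
    allowed = set(keys_with_prefix(sorted_keys, prefix))
    return [key for key in hits if key in allowed]


# Hypothetical document keys and query hits, for illustration only.
doc_keys = sorted(["faq/install", "faq/search", "news/2002-02", "news/2002-03"])
hits = ["news/2002-02", "faq/search", "news/2002-03"]
print(restrict_hits(hits, doc_keys, "news/"))
```

Filtering after the query is the simplest form. A real index could apply the same prefix range while walking its postings instead.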

3) Allows me to pass it a query string and have it return the unique
keyword that the text was identified with when it was inserted (and it
would be really nice if it gave back some sort of number representing
the matched value).

I'm not quite sure I follow. This sounds like you want the query to match the blob (i.e. the record) and return the key? Normally you'd use the keywords to retrieve the blob. Or am I misunderstanding you?
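For reference, the reading in which a query returns the document key plus a match score can be sketched as a toy inverted index. Everything here is an assumption made for illustration: the class `ToyIndex`, its methods, whitespace tokenization, and scoring by raw occurrence count are hypothetical and are not the mifluz API.

```python
from collections import Counter, defaultdict


class ToyIndex:
    """In-memory inverted index: word -> {document key: occurrences}."""

    def __init__(self):
        self.postings = defaultdict(Counter)
        self.words_by_doc = {}  # document key -> set of words it contains

    def insert(self, doc_key, text):
        """Index text under doc_key, replacing any earlier entry."""
        self.delete(doc_key)
        counts = Counter(text.lower().split())
        for word, n in counts.items():
            self.postings[word][doc_key] = n
        self.words_by_doc[doc_key] = set(counts)

    def delete(self, doc_key):
        """Remove every posting for doc_key; a missing key is a no-op."""
        for word in self.words_by_doc.pop(doc_key, set()):
            del self.postings[word][doc_key]
            if not self.postings[word]:
                del self.postings[word]

    def search(self, query):
        """Return (doc_key, score) pairs, highest score first."""
        scores = Counter()
        for word in query.lower().split():
            for doc_key, n in self.postings.get(word, {}).items():
                scores[doc_key] += n
        return scores.most_common()


index = ToyIndex()
index.insert("doc-1", "fast full text search")
index.insert("doc-2", "text text text index")
print(index.search("text search"))  # [('doc-2', 3), ('doc-1', 2)]
```

The words act as the lookup keys and the caller's unique identifier is what comes back, along with a number that ranks the match.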

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
