Re: [Gnumed-devel] phrase usage scoring

From:	richard terry
Subject:	Re: [Gnumed-devel] phrase usage scoring
Date:	Fri, 19 Sep 2003 08:31:42 +1000
User-agent:	KMail/1.5

On Fri, 19 Sep 2003 12:03 am, Karsten Hilbert wrote:
> > Incremented the counter. Discuss the pitfalls however.
>
> Well, the obvious pitfalls are that
>
> a) a simple integer field will overflow eventually

Yes, at what actual number, just out of interest?

I just checked my database. The highest-weighted field I found was about 700
(that's after nearly 6 years of daily use); most were pretty low. I'd hazard a
guess I will be well and truly dead and gnuMEd will be supplanted by
artificial intelligence before weighting on an incremental counter fucks your
database!
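
For what it's worth, the ceiling can be estimated with a little arithmetic.
The sketch below assumes the counter is stored in a 4-byte signed integer
column (as PostgreSQL's "integer" type is) and that the busiest phrase keeps
growing at the rate observed above (about 700 uses in about 6 years). The
function and variable names are made up for illustration.

```python
# Assumption: the counter is a 4-byte signed integer (e.g. PostgreSQL "integer").
INT4_MAX = 2**31 - 1  # 2147483647

# Figures from the message above: top score of about 700 after about 6 years.
observed_score = 700
observed_years = 6


def years_until_overflow(score, years, limit=INT4_MAX):
    """Years until `limit` is reached if usage continues at the same rate."""
    rate_per_year = score / years
    return (limit - score) / rate_per_year


years_to_overflow = years_until_overflow(observed_score, observed_years)
print(f"counter limit: {INT4_MAX}")
print(f"years until overflow at this rate: {years_to_overflow:,.0f}")
```

At that rate the limit is millions of years away, so overflow is a
theoretical rather than a practical concern for a per-phrase counter.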

> b) terms that are used often will have astronomically high scores
>
This discrepancy in the gaps between scores seems not to matter in practice,
for the following reasons.

Let's say you are prescribing a drug, amoxycillin, and use it for many
conditions. (I just checked my pop-up list and I've used it for a total of
only 19 different conditions since 1997. Of these, when one takes synonyms
out, e.g. (middle ear infection, acute otitis media, otitis media), (gum
infection, gingivitis) etc., there are very few.)

Even at the maximum number there is very little scrolling down the list, as
the terms we use commonly will always be in the top several on the list. With
other drugs with narrower indications, such as beta-blockers, the list is even
shorter; e.g. my beta-blocker list contains just three items.

So even if I used amoxycillin for, say, acute otitis media and its cumulative
score was thousands and thousands, and the next one on the list had a score
of 200, they are still in the same relative order. It is only if you
change your prescribing habits or phrases that you get into trouble, and
there one needs, as I've mentioned in the docs, a mechanism to manually
re-weight or shuffle the lists. I've only ever had to do this once in the
last 6 years (I just edited my database), so you can see how rare this event
is.
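
The argument above can be sketched in a few lines: the pop-up list is ordered
by score, so only the order of the counters matters, not the size of the gaps
between them. This is a minimal illustration, not the actual gnuMEd code;
PhraseList, use, ranked and set_weight are hypothetical names, and the
phrases and scores are example values.

```python
class PhraseList:
    """Hypothetical per-drug list of phrases with simple usage counters."""

    def __init__(self, scores=None):
        self.scores = dict(scores or {})

    def use(self, phrase):
        # Increment the counter each time the phrase is selected.
        self.scores[phrase] = self.scores.get(phrase, 0) + 1

    def ranked(self):
        # Highest score first; ties broken alphabetically for a stable order.
        return sorted(self.scores, key=lambda p: (-self.scores[p], p))

    def set_weight(self, phrase, score):
        # Manual re-weighting for when prescribing habits change.
        self.scores[phrase] = score


amoxycillin = PhraseList({
    "acute otitis media": 5000,  # "thousands and thousands"
    "gingivitis": 200,           # the next one on the list
    "sinusitis": 40,             # invented for illustration
})
order_before = amoxycillin.ranked()

# Shrinking the top score changes the gap but not the order.
amoxycillin.set_weight("acute otitis media", 201)
order_after = amoxycillin.ranked()

# Re-weighting below the next entry is what actually reshuffles the list.
amoxycillin.set_weight("acute otitis media", 100)
order_reweighted = amoxycillin.ranked()

print(order_before, order_after, order_reweighted, sep="\n")
```

The first two orderings are identical even though the gap shrank from 4800 to
1; only the explicit re-weight moves a phrase down the list.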

> Simple percented score increases will not work as they make
> all terms asymptotically reach the same weighting unless some
> sort of percentage of sum of all scores is taken into account
> which is prohibitive in terms of speed.
>
> I am not sure I see a Good solution currently.
>
> Karsten
