[aspell-devel] Affix compression
From: Gregory Maxwell
Subject: [aspell-devel] Affix compression
Date: Sat, 23 Jul 2005 17:35:57 -0400

Has anyone looked at using libjudy (http://judy.sourceforge.net/) to
store words in aspell? Libjudy provides a number of sparse array data
structures that offer very fast lookups, because they are cache-aware,
along with reasonable memory efficiency. The package includes a
string-indexed array (JudySL) that is quite space efficient because it
is prefix compressed.
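
To illustrate why sharing prefixes saves space, here is a minimal sketch in Python. It is not libjudy's API or its internal layout: the class name PrefixTrie and the sample word list are made up for the example, and a plain dict-of-dicts trie stands in for the real, far more compact structure. It only shows that words with a common prefix store that prefix once.

```python
class PrefixTrie:
    """Toy string-keyed set in which keys with a common prefix share nodes."""

    def __init__(self):
        self.root = {}
        self.node_count = 0  # nodes allocated, not counting the root

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node:
                node[ch] = {}
                self.node_count += 1
            node = node[ch]
        # End-of-word marker; "" can never collide with a one-character key.
        node[""] = True

    def __contains__(self, word):
        node = self.root
        for ch in word:
            node = node.get(ch)
            if node is None:
                return False
        return "" in node


# Hypothetical sample data, standing in for a real word list.
words = ["car", "card", "care", "cat"]
trie = PrefixTrie()
for w in words:
    trie.insert(w)

total_chars = sum(len(w) for w in words)  # characters stored without sharing
shared_nodes = trie.node_count            # characters stored with sharing
```

With these four words, total_chars is 14 while shared_nodes is 6, because "ca" and "car" are stored only once.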

I don't have a standalone metaphone encoder handy, but just loading
/usr/dict/words into JudySL on my Pentium M laptop gives lookups of
0.304 µs/word using only 10 MB of core, which is only 2x the size of
the file.

It would be easy to write code on top of this data structure that
quickly finds the longest match and all other entries with the same
match length, which might be useful in aspell as well...
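
One way to read the longest-match idea, sketched in Python under stated assumptions: "match length" is taken to mean the number of leading characters a stored word shares with the query. The functions build_trie and longest_match are hypothetical names, and the dict-based trie is a stand-in, not libjudy's interface.

```python
def build_trie(words):
    """Build a dict-of-dicts trie; the key "" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = True
    return root


def longest_match(root, query):
    """Return (length, entries): the longest prefix of `query` found in the
    trie, and every stored word sharing exactly that many leading characters."""
    node = root
    depth = 0
    for ch in query:
        nxt = node.get(ch)
        if nxt is None:
            break  # no stored word continues along the query past this point
        node = nxt
        depth += 1

    # Every word below `node` shares exactly `depth` leading characters
    # with the query, so collect the whole subtree.
    matches = []
    stack = [(node, query[:depth])]
    while stack:
        cur, prefix = stack.pop()
        for key, child in cur.items():
            if key == "":
                matches.append(prefix)
            else:
                stack.append((child, prefix + key))
    return depth, sorted(matches)


# Hypothetical sample data.
root = build_trie(["car", "card", "care", "cat", "dog"])
exact = longest_match(root, "cards")     # (4, ["card"])
partial = longest_match(root, "carton")  # (3, ["car", "card", "care"])
```

The walk costs time proportional to the query length, and the tied entries come from a single subtree traversal rather than a scan of the whole word list.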