
[Aspell-user] space, colon, digits in words


From: Lars Aronsson
Subject: [Aspell-user] space, colon, digits in words
Date: Sun, 12 Jan 2003 04:57:40 +0100 (CET)

SPACE:
Some words are only correct in certain phrases, e.g. "da" and "capo"
in many languages are OK only in the phrase "da capo", but nowhere
else (two exceptions being Russian and Italian).  It would be nice to
be able to specify that "da capo" is a valid phrase without having to
list "da" and "capo" as words.  This is useful not only for loanwords
but also for archaic grammatical forms that survive in fixed phrases,
which is not uncommon in Swedish ("ur huse", "av daga", "till väders",
"i faggorna", "åt fanders").  I personally wouldn't mind using the
underscore character to mark this space (e.g. "da_capo") in my
dictionary files.
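
The phrase idea above can be sketched outside Aspell in a few lines of
Python.  This is only an illustration of the requested behaviour, not
how Aspell works: the function names `load_entries` and `misspelled`
are hypothetical, and the underscore convention is the one proposed in
this message, not an existing Aspell feature.

```python
import re

def load_entries(lines):
    """Split a word list into single words and underscore-joined phrases."""
    words, phrases = set(), set()
    for line in lines:
        entry = line.strip().lower()
        if not entry:
            continue
        if "_" in entry:
            # "da_capo" becomes the phrase ("da", "capo")
            phrases.add(tuple(entry.split("_")))
        else:
            words.add(entry)
    return words, phrases

def misspelled(text, words, phrases):
    """Return tokens that are neither listed words nor part of a listed phrase."""
    tokens = re.findall(r"\w+", text.lower())
    max_len = max((len(p) for p in phrases), default=1)
    errors = []
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest phrase first so longer entries win over shorter ones.
        for n in range(max_len, 1, -1):
            if tuple(tokens[i:i + n]) in phrases:
                matched = n
                break
        if matched:
            i += matched
            continue
        if tokens[i] not in words:
            errors.append(tokens[i])
        i += 1
    return errors
```

With the entries "da_capo", "play" and "it", the text "Play it da capo"
passes, while a lone "da" elsewhere is still reported.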

COLON:
Swedish sometimes uses a colon (:) as an apostrophe ("HSB:s", "k:a",
"S:t Petersburg").  Aspell doesn't complain when I specify
"special : -*-" in sv.dat, but when I try to list "HSB:s" in my
dictionary, I get:

  aspell --lang=sv create master ./mylist <mylist.txt

  aspell: posib_err.cpp:39: class acommon::PosibErrBase &
  acommon::PosibErrBase::set(const acommon::ErrorInfo *,
  acommon::ParmString, acommon::ParmString, acommon::ParmString,
  acommon::ParmString): Assertion `i == inf->num_parms || i ==
  inf->num_parms + 1' failed.
  Aborted

This is with GNU Aspell 0.50.3.

I understand that colon has a special meaning in dictionaries, as
described in section 7.4 of the manual, but I would suggest that the
.dat file could specify that another character (e.g. slash "/") can be
used instead, allowing colon to be used as an apostrophe.  Then sv.dat
might say:

  name sv
  charset     iso8859-1
  # specify / as the flag char, freeing up : for use as an apostrophe
  flag-char   /
  space-char  _
  special     ' -** - -** . -** : -*-
  soundslike  sv

I think ispell uses / as the flag char for affix patterns.
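
To illustrate the proposal, here is a minimal Python sketch of how a
word-list reader could split an entry once the flag character is
configurable.  It is not Aspell's code: `split_entry` and its
`flag_char` parameter are hypothetical names, the `flag-char` key is
only the suggestion made above, and whatever follows the flag
character is treated as an opaque string.

```python
def split_entry(line, flag_char="/"):
    """Split a word-list line into (word, flags) at the first flag_char.

    Assumption: everything after the flag character is an opaque flag
    string; a line without the flag character has no flags.
    """
    word, _, flags = line.strip().partition(flag_char)
    return word, flags
```

With "/" as the flag character, "HSB:s/AB" yields the word "HSB:s", so
the colon stays part of the word; with ":" as the flag character the
same line would be cut after "HSB".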

DIGITS:
In Swedish, 3-våningshus (three-story building) is a fine word, but
våningshus without the prefix is not (what is a "story building"?).

When I tried to add the word "3-våningshus" to my dictionary, this is
what I got:

  aspell --lang=sv create master ./mylist <mylist.txt

  Unhandled Error: The word "3-våningshus" is invalid. The character
  '3' may not appear at the beginning of a word.
  Aborted

Then I added "3 *--" to the "special" key in "sv.dat", and got:

  aspell --lang=sv create master ./mylist <mylist.txt

  Unhandled Error: The word "3-våningshus" is invalid. Does not
  contain any letters.

First, this error message is misleading, and should read "Does not
begin with a letter".  Second, if special characters are allowed
at the beginning of words, this error message just shouldn't happen.

Then I got the idea to rewrite iso8859-1.dat, so that positions 48-57
(ASCII digits 0-9) are actually "letter" instead of "other", and then
this particular word works fine, but I get a lot of spelling errors
for plain numbers.

A nice way would be to specify "special - ***", allowing dash as the
first character of the word.  Aspell seems to accept this, and then
allows "-våningshus" as a word when creating a master, but when I use
this dictionary, it still reports "3-våningshus" as a spelling error.

Is there a recommended way?
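
The behaviour asked for here can be sketched outside Aspell.  The
Python below is only an illustration under stated assumptions, not
Aspell's tokenizer: the names `TOKEN` and `check` are hypothetical, a
digit prefix is accepted only when a dash and letters follow, and
plain numbers are skipped rather than looked up.

```python
import re

# Letters only: \w minus digits and underscore.
LETTERS = r"[^\W\d_]+"

# Order matters: try "digits-letters" first, then plain words, then numbers.
TOKEN = re.compile(
    rf"\d+-{LETTERS}(?:-{LETTERS})*"   # e.g. 3-våningshus
    rf"|{LETTERS}(?:-{LETTERS})*"      # ordinary (possibly hyphenated) word
    rf"|\d+"                           # plain number
)

def check(text, words):
    """Return tokens not found in `words`, ignoring plain numbers."""
    errors = []
    for tok in TOKEN.findall(text):
        if tok.isdigit():
            continue  # plain numbers are never spelling errors
        if tok.lower() not in words:
            errors.append(tok)
    return errors
```

With "3-våningshus" in the word list, "Ett 3-våningshus med 12 rum"
passes, while a bare "våningshus" is still flagged.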


I did try to download the dictionaries for all the languages, to
see how they handle this, but ftp.gnu.org seems to be unavailable.


Finally, I noticed some typos in the manual, e.g. section 5.5,
precious -> previous, striped -> stripped.


-- 
  Lars Aronsson (address@hidden)
  Aronsson Datateknik
  Teknikringen 1e, SE-583 30 Linuxköping, Sweden
  tel +46-70-7891609
  http://aronsson.se/ http://elektrosmog.nu/ http://susning.nu/




