Re: [Aspell-user] Unicode

From:	Kevin Atkinson
Subject:	Re: [Aspell-user] Unicode
Date:	Mon, 27 Nov 2006 20:38:31 -0700 (MST)

On Mon, 27 Nov 2006, Lars Aronsson wrote:

I'm running Ubuntu Linux 6.06 that comes with Aspell 0.60.4.  I
see two problems related to Unicode.  This system uses UTF-8 by
default, and I'm trying to leave ISO 8859-1 behind altogether.

1. I'm trying to create my own master dictionary.  Is it
impossible to have the word list in utf-8? Section 7.1 of the web
documentation seems to say so,
http://aspell.sourceforge.net/man-html/The-Language-Data-File.html

You can set the "data-encoding" to utf-8 in the language data file. But that also affects the default encoding used in files like the personal dictionary for all users of the dictionary.


2. The output from "aspell -l sv dump master" is in broken utf-8.
If the command is prefixed with LC_CTYPE=iso8859-1 and the output
is piped through "recode l1..u8", all is fine.  But without this,
aspell's dump command converts to UTF-8 but truncates the words.
For example, in the 5 letter word "själv" the middle letter
a-umlaut is coded in UTF-8 as two bytes (octal 0303 0244), but the
output string is truncated to 5 bytes: "s", "j", "\0303", "\0244",
"l" and the last "v" is missing.

This will be fixed in the next version. You can use the CVS branch "rel_0_60-branch" or search for the bug report, which should include the patch.
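
For anyone trying to recognise the symptom, here is a minimal Python sketch of the byte pattern described in the report. It is not Aspell's code: the assumption that the output was cut at the character count rather than the byte count is only a guess that happens to match the reported bytes, and recode_l1_to_u8 is a hypothetical helper that imitates what "recode l1..u8" does in the workaround.

```python
# Illustration only: reproduces the reported byte pattern, not Aspell's code.
word = "själv"
encoded = word.encode("utf-8")   # 6 bytes: 'ä' is two bytes (octal 0303 0244)
char_count = len(word)           # 5 characters

# Assumed failure mode: the output is cut at the character count
# instead of the byte count, so the last byte ('v') is lost.
truncated = encoded[:char_count]

print([oct(b) for b in encoded[2:4]])  # the two bytes that encode 'ä'
print(truncated)                       # the 5 bytes that survive
print(truncated.decode("utf-8"))       # the word without its final 'v'


def recode_l1_to_u8(data: bytes) -> bytes:
    """Hypothetical helper mimicking 'recode l1..u8': Latin-1 in, UTF-8 out."""
    return data.decode("latin-1").encode("utf-8")


# Workaround from the report: dump in ISO 8859-1, then convert afterwards.
latin1_dump = word.encode("latin-1")   # 5 bytes, one per character
print(recode_l1_to_u8(latin1_dump) == encoded)
```

In Latin-1 every character of this word is a single byte, so a length measured in characters is also the right length in bytes, which would explain why the LC_CTYPE workaround produces intact output.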



--
 Lars Aronsson (address@hidden)
 Aronsson Datateknik - http://aronsson.se


