Re: [Aspell-user] Problems with Arabic.


From: Mohammed Sameer
Subject: Re: [Aspell-user] Problems with Arabic.
Date: Sun, 5 Mar 2006 11:03:32 +0200
User-agent: Mutt/1.5.11+cvs20060126

On Sat, Mar 04, 2006 at 09:59:18PM -0700, Kevin Atkinson wrote:
> 
> 
> On Sun, 5 Mar 2006, Mohammed Sameer wrote:
> 
> >Hi,
> >
> >I've created a simple wordlist for Arabic.
> >It contains 40,000+ words, just to test aspell with Arabic.
> >
> >Everything looks fine with aspell from the command line, but
> >when using AbiWord or any other graphical tool to generate
> >suggestions, I find that the suggestions are Latin letters, not Arabic
> >words.
> >
> >I had to encode the files in ISO 8859-6 as aspell didn't accept UTF-8 
> >for the data files. I think this might be the source of the problem but 
> >I can't be sure.
> >
> >Now my question is: how can I force the output from libaspell to be
> >UTF-8? I tried setting "data-encoding utf-8" in the ar.dat file but it
> >didn't work.
> 
> You can't really "force" Aspell to output UTF-8.  Aspell will output
> whatever encoding the application asks it to.  If it asks for "utf-8" it
> will get it.
> 

AbiWord uses enchant, so I had a look at the enchant source code.
enchant asks libaspell to output UTF-8, but it's not working.

The point is that it works with other languages (otherwise people would
have complained) but not with Arabic. That's why I'm a bit lost.
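
One way to narrow this down from the shell is to ask the aspell command itself for UTF-8, since, as far as I understand, the `encoding` option is the same setting an application passes through libaspell. What follows is a minimal sketch, not a confirmed diagnosis: the function name `suggest_utf8` and the `ASPELL` override variable are made up for illustration, and it assumes the `ar` dictionary from this thread is installed where aspell can find it.

```shell
#!/bin/sh
# Hypothetical helper: feed one UTF-8 word to aspell in pipe mode (-a)
# and request UTF-8 input/output via --encoding.
# ASPELL is an illustrative override so a different binary can be used.
suggest_utf8() {
    printf '%s\n' "$1" | "${ASPELL:-aspell}" -a --lang=ar --encoding=utf-8
}

# Example call (left commented out; needs the ar dictionary installed):
#   suggest_utf8 'some_misspelled_arabic_word'
```

If the suggestion lines (those starting with `&`) come back as readable Arabic in a UTF-8 terminal, the conversion inside Aspell is probably fine and the problem would more likely sit between enchant and the application.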

> What encoding is the word list you used to generate the dictionary in?
ISO 8859-6

> It needs to be in the same encoding the "data-encoding" is in.  It can be 
> ISO 8859-6 and Aspell will still output UTF-8 when an application asks for 
> it as Aspell will convert the output to UTF-8.

I've uploaded all the files here: http://www.foolab.org/aspell.tgz

Could you please have a look?

Here's how I generated the ar.rws file:
aspell --lang ar create master ./ar.rws < wordlist
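
For anyone reproducing this from a UTF-8 source, the word list has to be converted to ISO 8859-6 before the create step, since that is the encoding the data files use here. A minimal sketch, assuming `iconv` is available; the function name `to_iso8859_6` and the file name `wordlist.utf8` are hypothetical placeholders:

```shell
#!/bin/sh
# Convert a UTF-8 word list to ISO 8859-6 and write it to stdout.
# iconv exits non-zero if a character has no ISO 8859-6 equivalent,
# which is a useful early warning about stray non-Arabic characters.
to_iso8859_6() {
    iconv -f UTF-8 -t ISO-8859-6 "$1"
}

# Usage (commented out so the sketch has no side effects):
#   to_iso8859_6 wordlist.utf8 > wordlist
#   aspell --lang ar create master ./ar.rws < wordlist
```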

Many thanks,

-- 
GNU/Linux registered user #224950
Proud Egyptian GNU/Linux User Group <www.eglug.org> Admin.
Life powered by Debian, Homepage: www.foolab.org
--
Don't send me any attachment in Micro$oft (.DOC, .PPT) format please
Read http://www.gnu.org/philosophy/no-word-attachments.html
Preferable attachments: .PDF, .HTML, .TXT
Thanx for adding this text to Your signature


