aramorph-users

Re: [Aramorph-users] A contribution for AraMorph


From: Pierrick Brihaye
Subject: Re: [Aramorph-users] A contribution for AraMorph
Date: Mon, 13 Jun 2005 11:53:58 +0200
User-agent: Mozilla/5.0 (Windows; U; Win98; fr-FR; rv:1.7.8) Gecko/20050511

Ahmed,

Ahmed El-dawy wrote:

This is the patch for the Arabic tokenizer.
It uses a range set (found as a new class at the end of the file) to
check for Arabic letters.

Why don't you use Java's native capabilities to determine whether a character is Arabic?

See, for example:
http://www.fileformat.info/info/unicode/char/0645/index.htm
http://javaalmanac.com/egs/java.lang/FindUnicodeBlock.html
and
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/Character.UnicodeBlock.html
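
To make the suggestion concrete, here is a minimal sketch of such a check built on Character.UnicodeBlock. The class and method names (ArabicChars, isArabic, isArabicLetter) are only illustrative and are not taken from AraMorph or from the patch. Which blocks should count as "Arabic" is also an assumption: the sketch accepts the basic Arabic block and both presentation-form blocks.

```java
// Hypothetical helper, not part of AraMorph: names are illustrative only.
final class ArabicChars {

    private ArabicChars() {
        // static utility class, no instances
    }

    // True if c lies in one of the Unicode blocks this sketch treats as Arabic.
    // Assumption: basic Arabic plus Presentation Forms A and B.
    static boolean isArabic(char c) {
        Character.UnicodeBlock block = Character.UnicodeBlock.of(c);
        return block == Character.UnicodeBlock.ARABIC
            || block == Character.UnicodeBlock.ARABIC_PRESENTATION_FORMS_A
            || block == Character.UnicodeBlock.ARABIC_PRESENTATION_FORMS_B;
    }

    // The Arabic block also holds digits and punctuation, so a tokenizer
    // that wants letters only can combine the block test with isLetter.
    static boolean isArabicLetter(char c) {
        return isArabic(c) && Character.isLetter(c);
    }
}
```

With this sketch, ArabicChars.isArabic('\u0645') (ARABIC LETTER MEEM, the character in the first link above) is true and ArabicChars.isArabic('a') is false, while the Arabic comma U+060C passes isArabic but not isArabicLetter.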

I will review your patch tonight in any case.

As for changing the format of the dictionaries, I think it is better to
change it to XML.

I *definitely* agree.

Then we can translate it into Arabic
UTF-8. Have you decided on a structure for the XML files?

No.

If you haven't, I
can write a DTD file and send it to you for checking.

Please do.

I lack the time right now to go into the details.

Cheers,

--
Pierrick Brihaye, informaticien
Service régional de l'Inventaire
DRAC Bretagne
mailto:address@hidden
+33 (0)2 99 29 67 78



