bug-gnu-emacs

bug#50951: Fwd: bug#50951: 28.0.50; Urdu text is not displayed correctly


From: Rah Guzar
Subject: bug#50951: Fwd: bug#50951: 28.0.50; Urdu text is not displayed correctly
Date: Sat, 2 Oct 2021 13:43:47 +0200

I forgot to use reply-all, so my reply didn't reach the mailing list. Sorry about that; I am forwarding it
to the mailing list now.

---------- Forwarded message ---------
From: Rah Guzar <aikrahguzar@gmail.com>
Date: Sat, Oct 2, 2021 at 1:40 PM
Subject: Re: bug#50951: 28.0.50; Urdu text is not displayed correctly
To: Eli Zaretskii <eliz@gnu.org>


Hi,
  Thanks a lot for the reply.

On Sat, Oct 2, 2021 at 8:07 AM Eli Zaretskii <eliz@gnu.org> wrote:

Can you give a few specific examples of characters that should be
joined, but aren't?  Please name the characters and also give their
positions relative to the beginning of this text, as I don't read
Urdu, so the images are useless for me without some additional data
and explanations.

Let us consider the word نہیں

It is composed of four letters. I will use the character field from `describe-char` for each of them below:
1) ن‎ (displayed as ن‎) (codepoint 1606, #o3106, #x646)
2)  ہ‎ (displayed as ہ‎) (codepoint 1729, #o3301, #x6c1)
3)  ی‎ (displayed as ی‎) (codepoint 1740, #o3314, #x6cc)
4) ں‎ (displayed as ں‎) (codepoint 1722, #o3272, #x6ba)
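
The numbers above can also be checked outside Emacs. The following is a minimal sketch, assuming only Python's standard `unicodedata` module; the helper name `describe` is made up for illustration and simply mimics the decimal, octal and hex layout of the `describe-char` character field.

```python
import unicodedata

# The four letters of the word, written as escapes to avoid bidi display issues.
WORD = "\u0646\u06c1\u06cc\u06ba"

def describe(ch):
    """Return the code point in the decimal/octal/hex layout used above."""
    cp = ord(ch)
    return f"codepoint {cp}, #o{cp:o}, #x{cp:x}"

for index, ch in enumerate(WORD, start=1):
    # unicodedata.name gives the official Unicode character name.
    print(f"{index}) {unicodedata.name(ch)} ({describe(ch)})")
```

This only confirms which code points are in the buffer; it says nothing about how a given font or shaper joins them.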

It should be displayed with all four characters joined together; instead, they are all displayed individually.
If I change the font to `NotoNastaliqUrdu`, this word is displayed correctly, but there is a problem with حرف

It consists of three letters:
1) ح‎ (displayed as ح‎) (codepoint 1581, #o3055, #x62d)
2) ر‎ (displayed as ر‎) (codepoint 1585, #o3061, #x631)
3) ف‎ (displayed as ف‎) (codepoint 1601, #o3101, #x641)

The first two characters should be joined and the last one should be on its own. This seems to be the case,
but the two groups are rendered on top of each other, making the word illegible.
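
The expected grouping for both words follows from the Unicode joining types of the letters. Here is a rough sketch, assuming a hand-written table for just the seven letters discussed here (D = dual-joining, R = right-joining, as I understand them from the Unicode Arabic shaping data) and a simplified rule that a connected group ends after a right-joining letter. The function name `joined_groups` is hypothetical, and this is not how Emacs or HarfBuzz actually implement shaping.

```python
# Joining types for the letters in this report only (assumed from Unicode data).
JOINING_TYPE = {
    "\u0646": "D",  # NOON
    "\u06c1": "D",  # HEH GOAL
    "\u06cc": "D",  # FARSI YEH
    "\u06ba": "D",  # NOON GHUNNA
    "\u062d": "D",  # HAH
    "\u0631": "R",  # REH: joins to the previous letter, never to the next
    "\u0641": "D",  # FEH
}

def joined_groups(word):
    """Split a word into runs of letters that should be drawn connected."""
    groups, current = [], ""
    for ch in word:
        current += ch
        if JOINING_TYPE[ch] == "R":  # a right-joining letter closes the run
            groups.append(current)
            current = ""
    if current:
        groups.append(current)
    return groups

print(len(joined_groups("\u0646\u06c1\u06cc\u06ba")))  # first word: one run
print(len(joined_groups("\u062d\u0631\u0641")))        # second word: two runs
```

Under these assumptions the first word forms a single connected run and the second splits into a two-letter run followed by a lone letter, which matches what I described.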

So isn't this a matter of finding a proper font, in particular given
the "Nastaliq vs Naskh" issues?  NotoNastaliqUrdu is not the only font
supporting Nastaliq, so perhaps other fonts fare better?
 
My knowledge here is very limited, but my impression is that Nastaliq and Naskh are styles and shouldn't affect composition.
NotoNastaliqUrdu was the only Urdu font available from my distro. LibreOffice, which also uses HarfBuzz, renders it
correctly, so I didn't try another font at first. Like Emacs, LibreOffice also uses a Naskh font by default, but all the characters
are joined properly.

I did try some fonts from https://urdufonts.net/ after your suggestion, and they render correctly. Specifically, the fonts I tried
were:
Jameel Noori Nastaleeq Regular
Alvi Nastaleeq 
Zohra Unicode
Manzor Unicode

I didn't notice a problem with any of them, except a very minor one with the last two, which show visible boundaries where glyphs
are joined.

Since Urdu uses the Arabic characters, Emacs uses character
composition rules for Arabic when displaying this text.  Do you know
if the composition rules for Urdu are different?

I think using the Arabic composition rules might be part of the problem. The Urdu alphabet is a superset of the Arabic alphabet, and if I
don't set a font specifically designed for Urdu, the words whose characters should be joined but aren't always seem to
include a character like ہ, which is in the Urdu alphabet but not in the Arabic one.
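
To make this observation easier to check on other words, here is a small sketch that lists the characters of a word falling outside the basic Arabic letter range. It assumes, as a rough approximation only, that U+0621 through U+064A stands in for the Arabic alphabet (the range also holds a few non-alphabet code points), and the helper name `outside_basic_arabic` is made up for illustration.

```python
# Rough stand-in for the Arabic alphabet: U+0621..U+064A (an assumption).
BASIC_ARABIC = range(0x0621, 0x064B)

def outside_basic_arabic(word):
    """Return the code points of characters outside the assumed basic range."""
    return [f"U+{ord(ch):04X}" for ch in word if ord(ch) not in BASIC_ARABIC]

# First word from above: three of its four letters are outside the range.
print(outside_basic_arabic("\u0646\u06c1\u06cc\u06ba"))
# Second word from above: every letter is inside the range.
print(outside_basic_arabic("\u062d\u0631\u0641"))
```

With this approximation, the word that fails to join with the default font contains letters outside the basic range, while the other word does not. That is consistent with my guess but does not prove it.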

Also, which version of HarfBuzz do you have installed?
It is 2.9.1

Please let me know if you need any more information.

Thanks a lot again.
