[bug-gnu-libiconv] Fwd: Supporting Combining Diacritical Marks


From: Christian PERNOT
Subject: [bug-gnu-libiconv] Fwd: Supporting Combining Diacritical Marks
Date: Wed, 29 Jun 2022 15:40:28 +0200

Hello,

We are using GNU libiconv in our Alpine-based container, and we are having difficulties with some special characters.

These characters use Unicode Combining Diacritical Marks, such as this one: https://www.fileformat.info/info/unicode/char/301/index.htm

I was not aware of this Unicode behavior: the diacritical mark is placed after the base character, and the two are meant to be combined on display.

For example, "é" exists in UTF-8 as a single character (0xC3 0xA9): https://www.fileformat.info/info/unicode/char/e9/index.htm
But it may be displayed the same way as an "e" without an accent (0x65) followed by the combining accent character (0xCC 0x81).
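
The two encodings described above can be checked with a short Python sketch. This is only an illustration and is not part of the iconv test; the variable names are hypothetical, and it relies on the standard `unicodedata` module (Python 3.8 or later for `bytes.hex` with a separator).

```python
import unicodedata

# "é" as a single precomposed code point (U+00E9)
precomposed = "\u00e9"
# "e" (U+0065) followed by the combining acute accent (U+0301)
decomposed = "e\u0301"

# UTF-8 bytes of each form
print(precomposed.encode("utf-8").hex(" "))  # c3 a9
print(decomposed.encode("utf-8").hex(" "))   # 65 cc 81

# The strings are not equal code point by code point...
print(precomposed == decomposed)  # False

# ...but Unicode normalization maps one form onto the other:
# NFC composes the sequence, NFD decomposes the single character.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
print(unicodedata.normalize("NFD", precomposed) == decomposed)  # True
```

So both forms render identically, while the byte sequences handed to iconv are different.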

There is no difference on display, but iconv refuses to convert it to ASCII, with or without transliteration.

Here is my attempt:

~/local/bin/iconv -f UTF-8 -t ASCII//TRANSLIT ~/src/iconv.txt                    
Capture d'e
/home/cpernot/local/bin/iconv: /home/cpernot/src/iconv.txt:1:11: cannot convert

Here is the version:

~/local/bin/iconv --version                    
iconv (GNU libiconv 1.17)
Copyright (C) 2000-2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Bruno Haible.

Attached is the test file.

Do you know whether I made a mistake, or whether this is an unsupported feature?

Thanks

Christian PERNOT

Attachment: iconv.txt
Description: Text document

