bug-coreutils

bug#69488: tr (question)


From: Pádraig Brady
Subject: bug#69488: tr (question)
Date: Fri, 1 Mar 2024 19:30:33 +0000
User-agent: Mozilla Thunderbird

On 01/03/2024 15:33, lacsaP Patatetom wrote:
hi,

I did a few tests with tr and I'm surprised by the results...

$ echo éèçà
éèçà

these characters are each encoded in UTF-8 as 2 bytes:

$ echo éèçà | xxd
00000000: c3a9 c3a8 c3a7 c3a0 0a                   .........

now I use tr to remove non-printable characters:

$ echo éèçà | tr -cd '[:print:]'
$ echo éèçà | tr -cd '[:print:]' | wc
       0       0       0

all characters are deleted by tr.
now I want to keep the "é" character:

$ echo éèçà | tr -cd '[:print:]é'
��

why do the "�" characters appear?

regards, lacsaP.


It's a known issue that tr is currently not multi-byte aware:
it processes its input one byte at a time.

thanks,
Pádraig
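
The byte-wise behaviour described in the reply can be made visible with a
small sketch. It is not part of the original thread and makes a few
assumptions: a POSIX shell, `od` being available, and the C locale forced
via LC_ALL so the result does not depend on the local setup. The bytes are
written as octal escapes (é is 0xC3 0xA9, i.e. \303 \251) to sidestep any
terminal encoding issues, and the variable name `out` is only illustrative.

```shell
# Input is "éèçà\n" as raw UTF-8 bytes: c3a9 c3a8 c3a7 c3a0 0a.
# The set given to tr is [:print:] plus the two bytes of "é",
# which tr treats as two independent single-byte characters.
out=$(printf '\303\251\303\250\303\247\303\240\n' \
        | LC_ALL=C tr -cd '[:print:]\303\251' \
        | od -An -tx1 \
        | tr -d ' \n')

# Show the surviving bytes as hex.
echo "$out"
```

Under these assumptions the surviving bytes are c3 a9 c3 c3 c3: the complete
"é", followed by the lead byte 0xC3 of each of the other three characters,
whose second bytes (a8, a7, a0) were deleted. A lone 0xC3 is not valid UTF-8,
which is why a terminal renders it as "�".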




