From: | Pádraig Brady |
Subject: | bug#69488: tr (question) |
Date: | Fri, 1 Mar 2024 19:30:33 +0000 |
User-agent: | Mozilla Thunderbird |
On 01/03/2024 15:33, lacsaP Patatetom wrote:
hi,

I did a few tests with tr and I'm surprised by the results...

$ echo éèçà
éèçà

these characters are encoded in utf-8 on 2 bytes :

$ echo éèçà | xxd
00000000: c3a9 c3a8 c3a7 c3a0 0a  .........

now I use tr to remove non-printable characters :

$ echo éèçà | tr -cd '[:print:]'
$ echo éèçà | tr -cd '[:print:]' | wc
0 0 0

all characters are deleted by tr

now I want to keep the "é" character :

$ echo éèçà | tr -cd '[:print:]é'
é���

why do the "�" characters appear ?

regards,
lacsaP.
It's a known issue that tr is currently not multi-byte aware: it processes its input and its sets one byte at a time.

thanks,
Pádraig
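
To illustrate where the "é���" output comes from, here is a minimal sketch that repeats the last command with the bytes spelled out as octal escapes and dumps the result in hex. It assumes a byte-oriented tr (as described above) and uses LC_ALL=C to make that explicit; the variable names are only illustrative.

```shell
# "éèçà\n" as raw UTF-8 bytes; octal escapes keep the example
# independent of the encoding of this script.
input='\303\251\303\250\303\247\303\240\n'

# The set given to tr is [:print:] plus the two bytes of "é" (0xC3 and 0xA9).
# -c complements the set, -d deletes every byte outside it.
kept=$(printf "$input" | LC_ALL=C tr -cd '[:print:]\303\251' | od -An -tx1 | tr -d ' \n')

echo "$kept"
# prints: c3a9c3c3c3
```

The lead byte 0xC3 is shared by all four characters, so it survives four times, while the second bytes 0xA8, 0xA7 and 0xA0 are deleted along with the newline. The first pair c3 a9 is still a valid "é"; the three remaining lone 0xC3 bytes are incomplete UTF-8 sequences, which a terminal typically displays as "�". In the earlier command without "é", no byte of these characters is in the set, so everything is deleted.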