[bug #65778] Strings are not considered UTF-8 encoded in Python 3
From: Sergey Tereschenko
Subject: [bug #65778] Strings are not considered UTF-8 encoded in Python 3
Date: Thu, 23 May 2024 04:05:24 -0400 (EDT)
URL:
<https://savannah.gnu.org/bugs/?65778>
Summary: Strings are not considered UTF-8 encoded in Python 3
Group: GNU gettext
Submitter: partizan
Submitted: Thu 23 May 2024 08:05:24 AM UTC
Category: Python
Severity: 3 - Normal
Item Group: None
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
_______________________________________________________
Follow-up Comments:
-------------------------------------------------------
Date: Thu 23 May 2024 08:05:24 AM UTC By: Sergey Tereschenko <partizan>
Hello.
To reproduce this bug, create a file "test.py" with the following content:
gettext("Hello\xa0World")
Then, run xgettext:
> xgettext test.py
xgettext: String at test.py:1 is not UTF-8 encoded.
Please specify the source encoding through --from-code.
Adding a 'u' prefix to the string fixes this issue. But that should not be
necessary for Python 3, which treats string literals as Unicode by default.
(Adding --from-code does not help, because Python is an exception to this
option.)
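
To illustrate the Python 3 side of this, here is a minimal sketch. It only shows how Python 3 itself interprets the literal; the assumption that xgettext reads the "\xa0" escape as a raw byte (as a Python 2 byte string would) is my reading of the error message, not something confirmed from the xgettext source.

```python
# In Python 3 a plain string literal is a Unicode str, so "\xa0" is the
# code point U+00A0 (NO-BREAK SPACE), not a raw byte.
s = "Hello\xa0World"
assert isinstance(s, str)
assert ord(s[5]) == 0xA0

# The 'u' prefix is accepted in Python 3 but changes nothing.
assert u"Hello\xa0World" == s

# Encoded as UTF-8, U+00A0 becomes the two bytes C2 A0.
utf8 = s.encode("utf-8")
assert utf8 == b"Hello\xc2\xa0World"

# Assumed Python 2 style reading: "\xa0" as the single byte 0xA0.
# A lone 0xA0 is a continuation byte, so the result is not valid UTF-8.
raw = b"Hello\xa0World"
try:
    raw.decode("utf-8")
    raw_is_utf8 = True
except UnicodeDecodeError:
    raw_is_utf8 = False
```

If that assumption holds, the message "is not UTF-8 encoded" describes the byte-string reading, which does not apply to Python 3 source.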
There is a related old bug in Django, concerning templates:
https://code.djangoproject.com/ticket/26093
They add this prefix when converting {% trans ... %} tags into code that
gettext can understand. But this is a workaround, not a real solution.
The real solution should be on the gettext side: treating Python strings as
UTF-8 by default (Python 2 reached EOL a long time ago), or at least adding an
option for this.
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?65778>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/