Re: [gpsd-dev] Update on Python Version Compatibility


From: Fred Wright
Subject: Re: [gpsd-dev] Update on Python Version Compatibility
Date: Sun, 3 Apr 2016 14:56:18 -0700 (PDT)

On Sun, 3 Apr 2016, Eric S. Raymond wrote:
> Fred Wright <address@hidden>:

> > I've already tried this out in a separate test module; now I just need to
> > apply it to the GPSD code.  Since it's needed by multiple modules, misc.py
> > seems like the appropriate place for it (unlike the present
> > polystr/polybytes, which are in client.py).
>
> Your method would probably work.  However, I do *not* want the GPSD code
> cluttered with artifacts from two subtly different sets of compatibility
> shims.  That way lies madness.  It is unacceptable.

I'd planned on retrofitting it to the existing cases (not that many,
actually) after fixing the stuff that actually needs it, but...

> I am laying down ironclad policy now: we do polyglot code according to
> the *documented methods* in Practical Python Porting.  If you want to
> change GPSD's methods, you need to engage Peter Donis and me wearing
> our HOWTO-author hats, persuade us to *change our recommendations*,
> and then implement them over the whole GPSD corpus.

In the interests of expediency, I'll do it your way now, and perhaps
revisit this in the future.

FYI, my version looks like this:

# We need to be able to handle data which may be a mixture of text and binary
# data.  The text in this context is known to be limited to US-ASCII, so
# there aren't any issues regarding character sets, but we need to ensure
# that binary data is preserved.  In Python 2, this happens naturally with
# "strings" and the 'str' and 'bytes' types are synonyms.  But in Python 3,
# these are distinct types (with 'str' being based on Unicode), and conversions
# are encoding-sensitive.  The most straightforward encoding to use in this
# context is 'latin-1' (a.k.a. 'iso-8859-1'), which directly maps all 256
# 8-bit character values to Unicode page 0.  Thus, if we can enforce the use
# of 'latin-1' encoding, we can preserve arbitrary binary data while correctly
# mapping any actual text to the proper characters.
#
# Here we define a new 'strbytes' type with these properties.  In Python 2
# it's simply another synonym for 'str' and 'bytes', but in Python 3 it's a
# distinct type with forced 'latin-1' encoding on conversions.
#
if bytes is str:

    strbytes = bytes

else:

    BINARY_ENCODING = 'latin-1'

    class strbytes(bytes):
        "Subclass of bytes forcing latin-1 encoding for 8-bit transparency."
        __slots__ = []  # Avoid gratuitous __dict__

        def __new__(cls, arg):
            "Create a new strbytes instance, using latin-1 if from a string."
            if isinstance(arg, str):
                self = super().__new__(cls, arg, encoding=BINARY_ENCODING)
            else:
                self = super().__new__(cls, arg)
            return self

        def __str__(self):
            "Return string representation with forced latin-1 encoding."
            return self.decode()

        def decode(self, encoding=BINARY_ENCODING, errors='strict'):
            "Decode bytes with forced latin-1 encoding (ignoring arg)."
            return super().decode(encoding=BINARY_ENCODING, errors=errors)
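
The latin-1 claim in the comment above is easy to check in isolation.  The
following is only a sketch for Python 3, separate from the strbytes code
itself; the names ALL_BYTES, text, code_points, round_trip and utf8_ok are
made up for the illustration.

```python
# Python 3 sketch; illustrative only, not part of the strbytes code.

# Every possible 8-bit value, standing in for arbitrary binary data.
ALL_BYTES = bytes(range(256))

# latin-1 maps byte value N to Unicode code point N, so decoding never fails.
text = ALL_BYTES.decode('latin-1')
code_points = [ord(ch) for ch in text]

# Encoding the result with latin-1 again restores the original bytes exactly.
round_trip = text.encode('latin-1')

# By contrast, arbitrary binary data is generally not valid UTF-8.
try:
    ALL_BYTES.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

If that holds, code_points is the sequence 0 through 255, round_trip equals
ALL_BYTES, and the UTF-8 attempt fails, which is why a UTF-8 default is not
safe for mixed text and binary data.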

Fred Wright


