octave-bug-tracker

[Octave-bug-tracker] [bug #65963] pkg install "fopen: encoding must be '


From: Markus Mützel
Subject: [Octave-bug-tracker] [bug #65963] pkg install "fopen: encoding must be 'UTF-8'' warnings on macOS 14 AS
Date: Mon, 8 Jul 2024 03:17:35 -0400 (EDT)

Follow-up Comment #2, bug #65963 (group octave):

This might be a couple of different bugs or misunderstandings.

* When starting the Octave CLI, the GUI settings are *not* read. So, any
changes to the encoding or language that were made in the GUI settings don't
apply. If you'd like to change the default .m file encoding for the CLI,
consider setting it in one of the startup files.

* The value of `OCTAVE_HAVE_STRICT_ENCODING_FACET` is set in
`oct-conf-post-private.h` with an explanatory comment:

#if defined (HAVE_LLVM_LIBCXX)
/* The stream encoding facet from libc++ is stricter than libstdc++ when
   it comes to reverting the stream.  Disable encoding conversion for file
   streams with libc++.
   FIXME: Maybe use a more specific test.  */
#  define OCTAVE_HAVE_STRICT_ENCODING_FACET 1
#endif


Patches for implementing an encoding conversion for file streams with libc++
are welcome. (I probably won't work on that because the only platform that I
know that uses libc++ is macOS. And I don't have hardware to test anything
with that OS. I'd be happy to review proposals though.)

* Immediately before the code snippet you quoted, there is the following:

  // Valid names for encodings consist of ASCII characters only.
  std::transform (encoding.begin (), encoding.end (), encoding.begin (),
                  ::tolower);

  if (encoding == "system")
    encoding = octave_locale_charset_wrapper ();

#if defined (OCTAVE_HAVE_STRICT_ENCODING_FACET)
  if (encoding != "utf-8")
    {
      warning_with_id ("Octave:fopen:encoding-unsupported",
                       "fopen: encoding must be 'UTF-8' for this version");
      encoding = "utf-8";
    }
#endif


The original string `encoding` is converted to lower-case. We probably need
to keep that there to be able to reliably recognize "system"
(case-insensitively). But it might well be that
`octave_locale_charset_wrapper` returns a value that is not all lower-case
(e.g., "UTF-8"). In that case, the comparison with "utf-8" fails and the
warning is emitted. We'll probably need to convert that value to lower-case
again for that warning to disappear for you.
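
As a rough, untested-against-Octave sketch of that idea (not a patch): the
snippet below lower-cases the value a second time after resolving "system".
`fake_locale_charset`, `to_lower_ascii`, `normalize_encoding` and `example`
are hypothetical names made up for illustration, and the mixed-case return
value "UTF-8" is only an assumption about what the locale charset query might
return on macOS.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for octave_locale_charset_wrapper ().  The real
// function queries the system locale; the mixed-case value returned here
// is an assumption used to reproduce the suspected problem.
std::string fake_locale_charset ()
{
  return "UTF-8";
}

// Lower-case a string in place.  Encoding names are ASCII only.
void to_lower_ascii (std::string& s)
{
  std::transform (s.begin (), s.end (), s.begin (),
                  [] (unsigned char c)
                  { return static_cast<char> (std::tolower (c)); });
}

// Return the normalized encoding name.  WARNED is set to true if the
// strict-facet check would have emitted the fopen warning.
std::string normalize_encoding (std::string encoding, bool& warned)
{
  warned = false;

  to_lower_ascii (encoding);

  if (encoding == "system")
    {
      encoding = fake_locale_charset ();
      // Proposed change: lower-case again, because the value from the
      // locale is not guaranteed to be all lower-case.
      to_lower_ascii (encoding);
    }

  if (encoding != "utf-8")
    {
      warned = true;
      encoding = "utf-8";
    }

  return encoding;
}

// Usage example: "SYSTEM" resolves via the locale charset without a warning.
void example ()
{
  bool warned = false;
  std::string enc = normalize_encoding ("SYSTEM", warned);
  assert (enc == "utf-8" && ! warned);
}
```

With the second `to_lower_ascii` call removed, the same input would set
`warned`, which matches the behavior you are describing.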

* The warning is probably correct in the sense that "on-the-fly" encoding
conversions don't work on that platform (libc++, see above). You can still
convert between encodings on "static" strings (i.e., `char` vectors in Octave)
with `unicode2native` or `native2unicode`. I.e., open the file (e.g., in byte
mode), read its content as `uint8`, then convert to a UTF-8 encoded `char`
vector with `native2unicode`.

* The default mfile_encoding should be "utf-8" on all platforms. See, e.g.,
the following for the default value of the GUI settings:
https://hg.savannah.gnu.org/hgweb/octave/file/d539aff8b327/libgui/src/gui-preferences-ed.cc#l234

gui_pref
ed_default_enc ("editor/default_encoding", QVariant ("UTF-8"));


Did you run your tests with the GUI or CLI?

PS: Imho, it would be best to avoid double-posting the same issue in multiple
places (for whatever reason). Even if you cross-link to the other place
(that's better than nothing), the discussion will inevitably be split and
trying to follow it in the future might become more difficult than
necessary...



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?65963>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
