bug#26058: utf16->string and utf32->string don't conform to R6RS
From: Taylan Ulrich Bayırlı/Kammer
Subject: bug#26058: utf16->string and utf32->string don't conform to R6RS
Date: Thu, 16 Mar 2017 20:34:14 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
Andy Wingo <address@hidden> writes:
> Adopting the behavior is more or less fine. If it can be done while
> relying on the existing behavior, that is better than something ad-hoc
> in a module.
You mean somehow leveraging the existing BOM handling code of Guile
(found in the ports code) would be preferable to reimplementing it like
in this patch, correct?
In that light, I had this attempt:
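  ;; Needed for utf16->string and the R6RS port procedures used below.
  (use-modules (rnrs bytevectors) (rnrs io ports))

  (define r6rs-utf16->string
    (case-lambda
      ((bv default-endianness)
       (let* ((binary-port (open-bytevector-input-port bv))
              (transcoder (make-transcoder (utf-16-codec)))
              (utf16-port (transcoded-port binary-port transcoder)))
         ;; XXX how to set default-endianness for a port?
         (get-string-all utf16-port)))
      ((bv endianness endianness-mandatory?)
       (if endianness-mandatory?
           ;; Endianness is mandatory: decode as given, no BOM detection.
           (utf16->string bv endianness)
           ;; Otherwise fall back to the BOM-aware two-argument case.
           (r6rs-utf16->string bv endianness)))))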
As commented in the first branch of the case-lambda, this does not yet
make use of the 'default-endianness' parameter to tell the port
transcoder (or whoever) what to do in case no BOM is found in the
stream.
From what I can tell, Guile is currently hardcoded to *transparently*
default to big-endian in ports.c, port_clear_stream_start_for_bom_read.
Is there a way to detect when Guile was unable to find a BOM? (In that
case one could set the endianness explicitly to the desired default.)
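To make the fallback idea concrete, here is a minimal sketch that does the
BOM check on the bytevector itself instead of in the port layer. This is
only an illustration under assumptions, not what ports.c does: the names
bom-endianness and utf16->string/bom are hypothetical, and only
procedures from (rnrs bytevectors) are used.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper: return the endianness announced by a leading
;; UTF-16 BOM, or #f when the bytevector does not start with one.
(define (bom-endianness bv)
  (and (>= (bytevector-length bv) 2)
       (let ((b0 (bytevector-u8-ref bv 0))
             (b1 (bytevector-u8-ref bv 1)))
         (cond ((and (= b0 #xFE) (= b1 #xFF)) (endianness big))
               ((and (= b0 #xFF) (= b1 #xFE)) (endianness little))
               (else #f)))))

;; Hypothetical wrapper: honor a BOM if present (and strip it),
;; otherwise decode with the caller's default endianness.
(define (utf16->string/bom bv default-endianness)
  (let ((detected (bom-endianness bv)))
    (if detected
        (let* ((len (- (bytevector-length bv) 2))
               (rest (make-bytevector len)))
          ;; Copy everything after the two BOM bytes.
          (bytevector-copy! bv 2 rest 0 len)
          (utf16->string rest detected))
        (utf16->string bv default-endianness))))
```

With this, (utf16->string/bom bv (endianness little)) decodes BOM-less
input as little-endian, while input that carries a BOM is decoded
according to the BOM regardless of the default.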
Or do you see another way to implement this?
Thanks for the feedback!
Taylan
P.S.: Huge congrats on the big release. :-)