Re: How to get buffer byte length (not number of characters)?

emacs-devel


From:	Adam Porter
Subject:	Re: How to get buffer byte length (not number of characters)?
Date:	Thu, 22 Aug 2024 07:26:58 -0500
User-agent:	Mozilla Thunderbird

Hi Joseph, et al,

On 8/22/24 02:24, Joseph Turner wrote:

>>> plz.el does not manually encode buffer text *within Emacs* when sending
>>> requests to curl, but by default, plz.el sends data to curl with --data,
>>> which tells curl to strip CR and newlines.  With the :body-type 'binary
>>> argument, plz.el instead uses --data-binary, which does no conversion.
>>
>> Newlines are a relatively minor issue (although they, too, need to be
>> considered).  My main concern is with the text encoding.  How can it
>> be TRT to use 'binary when sending buffer text to curl?  That would
>> mean we are more-or-less always sending the internal representation of
>> characters, which is a superset of UTF-8.  If the data was originally
>> encoded in anything but UTF-8, reading it into Emacs and then sending
>> it back will change the byte sequences from that other encoding to
>> UTF-8.  Moreover, 'binary does not guarantee that the result is valid
>> UTF-8.
>>
>> So maybe I misunderstand how these plz.el facilities are used, but up
>> front this sounds like a mistake.
>
> It could be.  Eli, Adam, what do you think about the default coding
> systems for encoding the request body in the attached patch?

From an API perspective, I'm not sure. My idea for plz.el is to provide
a simple, somewhat idiomatic Elisp API for making HTTP requests (and, of
course, to make "correct" requests, in compliance with specifications
and expectations). Given the relatively few clients of plz thus far,
some issues are yet to be fully explored and developed, and
encoding/decoding may be one of those rougher edges. For the use cases
I'm aware of, it seems to work well and correctly, but there are
undoubtedly improvements to be made.

Encoding/decoding is not exactly a simple matter, especially with regard
to API design. Ultimately, no library can abstract away the user's need
to understand it. And I want plz's API to change no more than necessary
over time, so I'd want to be very deliberate with any changes to it. So
it's appealing to do as little as possible in this regard, leaving as
much as possible to the upstream user to handle outside of plz.

One way to do that is to do what hyperdrive.el is basically doing now,
to tell plz to tell curl to handle the data as binary, i.e. to pass it
through unchanged. But it seems that we haven't covered all of the
bases with regard to these issues; rather, we have tested a subset of
them that seem to work as expected.

Also, where it's possible to make plz DTRT automatically, integrating
naturally with Elisp APIs and data structures, I'm certainly in favor of
that. So, e.g. automatically using a buffer's expected encoding when
passing its data to curl seems like the right thing to do, which plz
doesn't do yet (and perhaps we could do the same thing when returning a
buffer of data).
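
To make the buffer case concrete, here is a minimal sketch of encoding a
buffer's text with its own coding system before handing it to curl. It
is an illustration only, not plz's code: the function name
`my/buffer-encoded-body' is hypothetical, and falling back to `utf-8'
when the buffer has no coding system is an assumption. The result is a
unibyte string, so its `length' is the byte length that would go over
the wire, which can differ from the buffer's character count.

```elisp
;;; -*- lexical-binding: t; -*-

;; Hypothetical helper, not part of plz.el.
(defun my/buffer-encoded-body (&optional buffer coding-system)
  "Return the text of BUFFER as a unibyte string of encoded bytes.
BUFFER defaults to the current buffer.  CODING-SYSTEM defaults to
the buffer's `buffer-file-coding-system', falling back to `utf-8'
when that is nil (the fallback is an assumption of this sketch)."
  (with-current-buffer (or buffer (current-buffer))
    (let ((coding (or coding-system
                      buffer-file-coding-system
                      'utf-8)))
      (encode-coding-string
       ;; Take the whole buffer, ignoring narrowing and text properties.
       (save-restriction
         (widen)
         (buffer-substring-no-properties (point-min) (point-max)))
       coding))))

;; Example: the same five characters yield different byte counts.
(with-temp-buffer
  (insert "na\u00efve")
  (list (buffer-size)                                        ; characters
        (length (my/buffer-encoded-body nil 'utf-8))         ; bytes, UTF-8
        (length (my/buffer-encoded-body nil 'iso-8859-1))))  ; bytes, Latin-1
```

Passing the coding system explicitly, as the example does, avoids
depending on whatever default a fresh buffer happens to have.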

Of course, AFAIK we can't do such a thing when passing a string, so I
guess the most we can do there is document recommended patterns for the
user; IOW I'm tempted to leave encoding of strings to the user rather
than add another argument for that, but we can talk about it.
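
One pattern that could be documented for the string case, sketched under
stated assumptions: the caller encodes the string explicitly and
declares the charset in a header, then passes the bytes through as
binary. The helper name `my/encode-request-body', the "text/plain"
media type, and the URL in the commented call are all hypothetical; only
the :body-type 'binary argument is taken from the discussion above.

```elisp
;;; -*- lexical-binding: t; -*-

;; Hypothetical helper, not part of plz.el.
(defun my/encode-request-body (string &optional coding-system)
  "Encode STRING with CODING-SYSTEM (default `utf-8') for a request.
Return a cons cell (BODY . HEADERS), where BODY is a unibyte
string and HEADERS is an alist naming the charset that was used."
  (let* ((coding (or coding-system 'utf-8))
         (body (encode-coding-string string coding))
         ;; Prefer the MIME name of the coding system when it has one.
         (charset (or (coding-system-get coding :mime-charset) coding)))
    (cons body
          (list (cons "Content-Type"
                      (format "text/plain; charset=%s" charset))))))

;; Possible use with plz (commented out; needs plz.el and a server,
;; and the URL is a placeholder):
;;
;; (let ((req (my/encode-request-body "na\u00efve")))
;;   (plz 'post "https://example.invalid/upload"
;;     :headers (cdr req)
;;     :body (car req)
;;     :body-type 'binary))
```

Since the body is already a sequence of bytes by the time it reaches
plz, no further conversion is wanted, which is why this sketch pairs the
explicit encoding with 'binary.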


Thanks,
Adam
