help-gnu-emacs

Re: Decoding URLs input


From: Yuri Khan
Subject: Re: Decoding URLs input
Date: Sat, 3 Jul 2021 18:10:47 +0700

On Sat, 3 Jul 2021 at 16:41, Jean Louis <bugs@gnu.support> wrote:

> As I am developing a Double Opt-In CGI script served by Emacs, I am
> unsure whether this function is the correct one to use on the encoded
> strings that come from URL GET requests, like
> http://www.example.com/?message=Hello%20There
>
> (rfc2231-decode-encoded-string "Hello%20there") ⇒ "Hello there"
>
> If anybody knows or has clues, let me know. In other programming
> languages I never had to think about RFCs, so I don't know which RFC
> applies here.

Why not look at the RFC referenced in order to see whether it is or is
not relevant to your task?

https://datatracker.ietf.org/doc/html/rfc2231

It talks about encoding MIME header parameter values, which is not what
you’re dealing with; and its encoded strings look like
<charset>'<language>'<percent-encoded-string>, which is not what you
have.
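
For comparison, here is a rough sketch of the kind of input that
function is meant for. The sample string is an illustrative assumption,
not something taken from your CGI script:

```elisp
(require 'rfc2231)

;; An RFC 2231 extended parameter value carries its own charset and
;; language tag in front of the percent-encoded text:
;;   <charset>'<language>'<percent-encoded-string>
(rfc2231-decode-encoded-string
 "us-ascii'en-us'This%20is%20%2A%2A%2Afun%2A%2A%2A")
```

A URL query string has no such charset/language prefix, so the function
is only accidentally producing the right answer for plain ASCII input.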

What you are dealing with is a URL, specifically, its query string
part. URLs are described in RFC 3986; the percent-encoding scheme is
covered in its sections 2.1 and 2.5.

(url-unhex-string …) will do half the work for you: It will decode
percent-encoded sequences into bytes. By convention, in URLs,
characters are UTF-8-encoded before percent-encoding (see RFC 3986 §
2.5), so you’ll need to use:

    (decode-coding-string (url-unhex-string s) 'utf-8)

to get a fully decoded text string.
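
As a minimal sketch of the two steps, assuming the input really is
UTF-8 (the helper name my-url-decode-component and the sample strings
are made up for illustration):

```elisp
(require 'url-util)   ; provides `url-unhex-string'

(defun my-url-decode-component (s)
  "Return percent-encoded string S as decoded text, assuming UTF-8.
`my-url-decode-component' is a hypothetical helper name."
  ;; Step 1: turn %XX sequences into raw bytes.
  ;; Step 2: interpret those bytes as UTF-8 to get characters.
  (decode-coding-string (url-unhex-string s) 'utf-8))

;; Step 1 alone leaves non-ASCII text as raw bytes:
(url-unhex-string "caf%C3%A9")              ; ⇒ "caf\303\251"

;; Both steps together give readable text:
(my-url-decode-component "Hello%20There")   ; ⇒ "Hello There"
(my-url-decode-component "caf%C3%A9")       ; ⇒ "café"
```

For pure ASCII input such as Hello%20There the second step changes
nothing, which is why the problem only shows up once someone submits
non-ASCII text.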


