[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v5 1/2] json: Add function to unescape JSON-encoded strings
From: |
Daniel Kiper |
Subject: |
Re: [PATCH v5 1/2] json: Add function to unescape JSON-encoded strings |
Date: |
Tue, 12 Jul 2022 15:39:13 +0200 |
User-agent: |
NeoMutt/20170113 (1.7.2) |
On Mon, Jul 11, 2022 at 09:08:09AM -0400, Nicholas Vinson wrote:
> On 7/11/22 06:44, Patrick Steinhardt wrote:
> > JSON strings require certain characters to be encoded, either by using a
> > single reverse solidus character "\" for a set of popular characters, or
> > by using a Unicode representation of "\uXXXXX". The jsmn library doesn't
> > handle unescaping for us, so we must implement this functionality for
> > ourselves.
> >
> > Add a new function `grub_json_unescape ()` that takes a potentially
> > escaped JSON string as input and returns a new unescaped string.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> > grub-core/lib/json/json.c | 118 ++++++++++++++++++++++++++++++++++++++
> > grub-core/lib/json/json.h | 12 ++++
> > 2 files changed, 130 insertions(+)
> >
> > diff --git a/grub-core/lib/json/json.c b/grub-core/lib/json/json.c
> > index 1c20c75ea..1eadd1ce9 100644
> > --- a/grub-core/lib/json/json.c
> > +++ b/grub-core/lib/json/json.c
> > @@ -262,3 +262,121 @@ grub_json_getint64 (grub_int64_t *out, const
> > grub_json_t *parent, const char *ke
> > return GRUB_ERR_NONE;
> > }
> > +
> > +grub_err_t
> > +grub_json_unescape (char **out, grub_size_t *outlen, const char *in,
> > grub_size_t inlen)
> > +{
> > + grub_err_t ret = GRUB_ERR_NONE;
> > + grub_size_t inpos, resultpos;
> > + char *result;
> > +
> > + if (out == NULL || outlen == NULL)
> > + return grub_error (GRUB_ERR_BAD_ARGUMENT, N_("output parameters are
> > not set"));
> > +
> > + result = grub_calloc (1, inlen + 1);
> > + if (result == NULL)
> > + return GRUB_ERR_OUT_OF_MEMORY;
> > +
> > + for (inpos = resultpos = 0; inpos < inlen; inpos++)
> > + {
> > + if (in[inpos] == '\\')
> > + {
> > + inpos++;
> > + if (inpos >= inlen)
> > + {
> > + ret = grub_error (GRUB_ERR_BAD_ARGUMENT, N_("expected escaped
> > character"));
> > + goto err;
> > + }
> > +
> > + switch (in[inpos])
> > + {
> > + case '"':
> > + result[resultpos++] = '"';
> > + break;
> > +
> > + case '/':
> > + result[resultpos++] = '/';
> > + break;
> > +
> > + case '\\':
> > + result[resultpos++] = '\\';
> > + break;
> > +
> > + case 'b':
> > + result[resultpos++] = '\b';
> > + break;
> > +
> > + case 'f':
> > + result[resultpos++] = '\f';
> > + break;
> > +
> > + case 'r':
> > + result[resultpos++] = '\r';
> > + break;
> > +
> > + case 'n':
> > + result[resultpos++] = '\n';
> > + break;
> > +
> > + case 't':
> > + result[resultpos++] = '\t';
> > + break;
> > +
> > + case 'u':
> > + {
> > + char values[4] = {0};
> > + unsigned i;
> > +
> > + inpos++;
> > + if (inpos + ARRAY_SIZE(values) > inlen)
> > + {
> > + ret = grub_error (GRUB_ERR_BAD_ARGUMENT, N_("unicode
> > sequence too short"));
> > + goto err;
> > + }
>
> I recommend relaxing these errors a little bit. While technically
> correct, I'm thinking it might serve GRUB better to be a bit more
> forgiving.
>
> Perhaps something like:
>
> if the input is '\u????' where ? is not a hex digit, set the value to
> 'u', and process ???? separately.
>
> if ???? is less than ARRAY_SIZE chars, left-pad ???? with 0s* until it's
> the correct length then convert the value.
>
> (* I don't mean to literally pad. I mean to treat the short number as if
> it was left-padded with 0s)
>
> > +
> > + for (i = 0; i < ARRAY_SIZE(values); i++)
> > + {
> > + char c = in[inpos++];
> > +
> > + if (c >= '0' && c <= '9')
> > + values[i] = c - '0';
> > + else if (c >= 'A' && c <= 'F')
> > + values[i] = c - 'A' + 10;
> > + else if (c >= 'a' && c <= 'f')
> > + values[i] = c - 'a' + 10;
> > + else
> > + {
>
> With a similar argument here. Treat the short ???? as suggested above
> and resume normal processing at the non-hex digit position.
>
> > + ret = grub_error (GRUB_ERR_BAD_ARGUMENT,
> > + N_("unicode sequence with invalid
> > character '%c'"), c);
> > + goto err;
> > + }
> > + }
> > +
> > + if (values[0] != 0 || values[1] != 0)
> > + result[resultpos++] = values[0] << 4 | values[1];
> > + result[resultpos++] = values[2] << 4 | values[3];
> > +
> > + /* Offset the increment that's coming in via the loop
> > increment. */
> > + inpos--;
> > +
> > + break;
> > + }
> > +
> > + default:
> > + ret = grub_error (GRUB_ERR_BAD_ARGUMENT, N_("unrecognized
> > escaped character '%c'"), in[inpos]);
> > + goto err;
>
> And the final prong of the suggestion would be here. Treat the improper
> escape as if it had not been escaped at all or as '\\?' where '?' is the
> escaped character.
Nicholas comments make sense for me.
Patrick?
Daniel