bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: base64 tool?


From: Paul Eggert
Subject: Re: base64 tool?
Date: Sun, 26 Jun 2005 00:14:09 -0700
User-agent: Gnus/5.1006 (Gnus v5.10.6) Emacs/21.4 (gnu/linux)

Thanks for writing that.  Here's a quick review.

Simon Josefsson <address@hidden> writes:

> +The base64 encoding expand data to roughly 133% of the original.

expand -> expands

> address@hidden address@hidden
> address@hidden address@hidden
> address@hidden -w
> address@hidden --wrap
> address@hidden wrap data
> address@hidden column to wrap data after
> +During encoding, wrap lines after @var{COLS} characters.  This must be
> +a positive number.
> +
> +If this option is not given at all, no line wrapping will be
> +performed.  If @var{COLS} is omitted, the default is 76.

The POSIX utility syntax guidelines do not allow options with optional
arguments, and I tend to agree, at least for single-letter options.
How about if we simply require COLS?

> address@hidden -i
> address@hidden --ignore-garbage
> address@hidden -i
> address@hidden --ignore-garbage
> address@hidden Ignore garbage in base64 stream
> +During decoding, ignore unrecognized characters (including newline),
> +to permit distorted data to be decoded.

Is this option needed if the encoder used the -w option?  It's
not clear from the documentation.

> +                  size_t * outlen)

* outlen -> *outlen

> +  const char b64[64] =
> +    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

Surely this should be static, not auto.

Also, it's a bit tricky that there's a local var "b64" and a global
one also named "b64".  I'd rename one of them.

> +      *out++ = b64[((to_uchar (in[0]) << 4)
> +                 + (--inlen ? to_uchar (in[1]) >> 4 : 0)) & 0x3f];

That "& 0x3f" is hard to follow.  Here's a better and more-consistent
indenting style:

      *out++ = b64[((to_uchar (in[0]) << 4)
                    + (--inlen ? to_uchar (in[1]) >> 4 : 0))
                   & 0x3f];
+      *out++ =
+       (inlen
+        ? b64[((to_uchar (in[1]) << 2)
+               + (--inlen ? to_uchar (in[2]) >> 6 : 0)) & 0x3f] : '=');

Similarly, the "& 0x3f" and ": '='" are indented confusingly.  How
about this instead?

      *out++ =
        (inlen
         ? b64[((to_uchar (in[1]) << 2)
                + (--inlen ? to_uchar (in[2]) >> 6 : 0))
               & 0x3f]
         : '=');

> +      if (!--outlen)
> +     break;

This is in a loop that will test outlen anyway.  Why is
the if-test needed here?  Can't you simply decrement outlen?

> +      in += 3;

This assignment might have undefined behavior if IN points at or near
the end of a buffer.  The C standard says you can't point more than
one byte past the end of a buffer.


> +The data is encoded as described for the base64 alphabet in RFC 3548.\n\

data is -> data are
("Data" is a plural in English.)

> +       n = fread (inbuf + sum, 1, BLOCKSIZE - sum, in);
> +
> +       if (n > BLOCKSIZE - sum)
> +         error (EXIT_FAILURE, 0, _("read too much"));
> ...
> +       if (n > B64BLOCKSIZE - sum)
> +         error (EXIT_FAILURE, 0, _("read too much"));

That's not possible.  If you want to test for this, you should simply
abort () if you read too much.  But I wouldn't bother testing for this
(the rest of coreutils doesn't).

> +  /* When wrapping, terminate last line. */
> +  if (wrap_column && fputs ("\n", out) < 0)
> +    error (EXIT_FAILURE, errno, _("write error"));

Might this output a newline that is adjacent to a previous newline?
Is the double-newline intended?

> +  if (ferror (in))
> +    error (EXIT_FAILURE, errno, _("read error"));

It's not correct to use errno here, since it may have been set by
a previous system call to something other than the errno that
corresponds to the read error.  (There are two instances
of this problem.)

> +       if (ignore_garbage)
> +         {
> +           size_t i;
> +           for (i = 0; n > 0 && i < n;)
> +             if (isbase64 (inbuf[sum + i]) || inbuf[sum + i] == '=')
> +               i++;
> +             else
> +               memmove (inbuf + sum + i, inbuf + sum + i + 1, --n - i);
> +         }

That's an O(N**2) algorithm.  Why not use an O(N) algorithm instead?
You can do that by copying each valid byte of the buffer as you go.
You can do this in-place, so you don't need to allocate memory.

> +       if (ferror (in))
> +         error (EXIT_FAILURE, errno, _("read error"));

This is another case where errno might be garbage.

> +  while ((opt = getopt_long (argc, argv, "dqiw", long_options, NULL)) != -1)

Change "dqiw" to "dqiw:" (as mentioned above; w's operand should
probably be required).

> +      char *out = NULL;
> +      size_t outlen = 0;

Don't initialize these variables, since the initial values aren't used.

> +      if (decode)
> +     {
> ...
> +       free (out);
> +     }
> +      else
> +     {
> ...
> +       free (out);
> +     }

The "free (out);" can be hoisted out of the if-then-else.

> +  if (fclose (stdin) == EOF)
> +    error (EXIT_FAILURE, errno, _("standard input"));

This code shouldn't be executed unless you actually read stdin.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]