Re: [PATCH] Add arithmetic builtin functions

bug-make
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] Add arithmetic builtin functions

From:	Paul Smith
Subject:	Re: [PATCH] Add arithmetic builtin functions
Date:	Sun, 08 Dec 2024 18:11:03 -0500
User-agent:	Evolution 3.54.2 (by Flathub.org)
Thanks for this work Pete!

I reviewed the previous (lengthy) email thread before looking at this
patch.

This is a good first crack.  There are a few confusing bits and we will
need tests added for the various corner cases, and documentation.  I
can do the docs if you prefer; I usually edit them anyway to keep the
manual in a single "voice" as much as possible.

For a contribution this size we'll need copyright paperwork to be
signed; let me know if you need my help with getting that started.


On Fri, 2024-11-29 at 04:14 -0800, Pete Dietl wrote:
> From 76db129404896c15440b405949b8c830e7ebb98f Mon Sep 17 00:00:00
> 2001
> From: Pete Dietl <petedietl@gmail.com>
> Date: Fri, 29 Nov 2024 03:57:06 -0800
> Subject: [PATCH] Add arithmetic builtin functions
> 
> ---
>  src/function.c | 401
> ++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 400 insertions(+), 1 deletion(-)

> +    default:
> +      /* Handle invalid operation */
> +      return 0.0;

I prefer to not add a default: to switch statements which are based on
enum values.  This way the compiler will warn us if we forget to handle
any of the enumerations.

Also, returning 0 here is not right anyway and would just lead to
confusing behavior.  If you really want to check at runtime that the
operator is valid you can add a call to fatal() about an internal error
just before the end of the function (after the switch) since every case
invokes return anyway.

> +/* Generic macro to select the correct function based on type and
> +   assert that the types of a and b are the same */

> +       char[__builtin_types_compatible_p (typeof (a), typeof (b)) ?
> 1 : -1]), \
> +   _Generic
> ((a),                                                             \

As noted by Gisle, we shouldn't use either GCC-specific features
(unless they are just hints) or features introduced after C99.

Also in GNU Make code we just write "long long" rather than the full
"long long int".

> +       double:
> generic_math_op_double,                                        \
> +       long long int: generic_math_op_ll) (a, b, op))
> +
> +static struct number
> +parse_number (const char *s, const char *op_name)
> +{
> +  const char *beg = s;
> +  const char *end = s + strlen (s) - 1;
> +  char *endp;
> +  struct number ret = { .integer = 0, .type = t_integer };
> +
> +  while (isspace(*s))
> +    s++;

Please use the NEXT_TOKEN (s) macro for this (see makint.h)

> +
> +  // We need to check this because even though we can command
> `strtoll` to
> +  // parse only base-ten numbers, we can't command `strtod` to only
> parse
> +  // base-10 numbers. Therefore, without this check `0xB` would get
> rejected by
> +  // `strtoll`, but accepted by `strtod` and treated as `11.0`.
> +  if (*s != '\0' && s[1] == 'x')

Please check for both upper and lowercase "x".

> +    OSS (fatal, *expanding_var,
> +         _ ("Invalid argument to %s function: '%s' not a number"),
> op_name, s);

The _() macro has special dispensation to not have a space after the
macro name; all calls should just be _("foo")

Also I think you should omit the word "function" from all messages and
just use "Invalid argument to %s: ..." etc.

> +  errno = 0;
> +  ret.integer = strtoll (beg, &endp, 10);
> +  if (errno == ERANGE)
> +    OSS (fatal, *expanding_var,
> +         _ ("Invalid argument to %s function: '%s' out of range"),
> op_name, s);
> +  if (endp == beg || endp <= end)
> +    {
> +      /* Empty or not an integer */
> +      errno = 0;
> +      ret.type = t_floating;
> +      ret.floating = strtod (beg, &endp);
> +      if (errno == ERANGE)
> +        OSS (fatal, *expanding_var,
> +             _ ("Invalid argument to %s function: '%s' out of
> range"), op_name,
> +             s);
> +      if (endp == beg || endp <= end)
> +        OSS (fatal, *expanding_var,
> +             _ ("Invalid argument to %s function: '%s'"), op_name,
> s);
> +    }
> +
> +  return ret;
> +}
> +
> +static char *
> +number_to_string (char *o, struct number n)

This name leads to concerns about where the memory is handled.  Maybe
"number_to_variable_buffer()" would be more accurate?

> +{
> +  char buffer[64];
> +  if (n.type == t_integer)
> +    snprintf (buffer, sizeof buffer, "%lld", n.integer);

We have a function make_lltoa() in misc.c, which is more portable on
some systems.  There's also a make_toui() but that's not suitable for
what you want: I have no opinion on whether you should try moving some
of these other methods to misc.c to follow that convention, or leave
them here.

> +  else
> +    {
> +      char *end;
> +      snprintf (buffer, sizeof buffer, "%.14f", n.floating);
> +      end = buffer + strlen (buffer) - 1;
> +      while (end > buffer && *end == '0')
> +        *end-- = '\0'; // Remove trailing zero
> +
> +      // If there's a decimal point left, append a '0' to keep a
> single zero
> +      if (*end == '.')
> +        {
> +          end++;
> +          *end++ = '0';
> +          *end = '\0';
> +        }

I think the above can be much simpler since you know for sure that the
string will contain a ".".  You can just walk backwards until either
you find a non-0, or else the char before the pointer is '.' (to
preserve a single "0" after the decimal).

> +    }
> +  return variable_buffer_output (o, buffer, strlen (buffer));
> +}
> +
> +struct number
> +perform_math_op (enum math_operation op, const char *op_name,
> +                 struct math_operation_init init, char **argv)
> +{
> +  struct number parsed;
> +  struct number total;
> +
> +  if (init.init_type == t_first_value)
> +    {
> +      assert (*argv != NULL);

Please don't assert here.  We know it can't happen.  If this is null
we'll crash in parse_number anyway.

> +
> +      total = parse_number (*argv, op_name);
> +      argv++;
> +    }
> +  else
> +    {
> +      total.type = t_integer;
> +      total.integer = init.constant;
> +    }
> +
> +  for (; *argv != NULL; argv++)
> +    {
> +      parsed = parse_number (*argv, op_name);
> +      if (total.type == t_integer)
> +        {
> +          if (parsed.type == t_integer)
> +            total.integer
> +                = generic_math_op (total.integer, parsed.integer,
> op);
> +          else // `parsed` is floating, `total`` is integer. So,
> convert total
> +               // to floating
> +            {
> +              total.type = t_floating;
> +              total.floating = generic_math_op
> ((double)total.integer,
> +                                                parsed.floating,
> op);
> +            }
> +        }
> +      else // `total` is floating
> +        {
> +          if (parsed.type == t_integer)
> +            total.floating
> +                = generic_math_op (total.floating,
> (double)parsed.integer, op);
> +          else // `parsed` is floating, `total`` is floating
> +            total.floating
> +                = generic_math_op (total.floating, parsed.floating,
> op);
> +        }
> +    }
> +
> +  return total;
> +}
> +
> +static char *
> +func_add (char *o, char **argv, const char *funcname)
> +{
> +  struct number n = perform_math_op (
> +      op_add, funcname,
> +      (struct math_operation_init){ .init_type = t_constant,
> .constant = 0 },
> +      argv);
> +
> +  return number_to_string (o, n);
> +}
> +
> +static char *
> +func_subtract (char *o, char **argv, const char *funcname)
> +{
> +  struct math_operation_init init;
> +  struct number n;
> +
> +  // If we received a single argument, negate it.

I don't see the advantage in creating this special case.  If someone
wants the negation of a value, wouldn't they just write "-$V" instead
of "$(sub $V)"?

> +  if (argv[0] != NULL && argv[1] == NULL)
> +    {
> +      init.init_type = t_constant;
> +      init.constant = 0;
> +    }
> +  else
> +    init.init_type = t_first_value;
> +
> +  n = perform_math_op (op_subtract, funcname, init, argv);
> +
> +  return number_to_string (o, n);
> +}
> +
> +static char *
> +func_multiply (char *o, char **argv, const char *funcname)
> +{
> +  struct number n = perform_math_op (
> +      op_multiply, funcname,
> +      (struct math_operation_init){ .init_type = t_constant,
> .constant = 1 },
> +      argv);
> +
> +  return number_to_string (o, n);
> +}
> +
> +static char *
> +func_divide (char *o, char **argv, const char *funcname)
> +{
> +  struct number parsed;
> +  struct number total;
> +
> +  // We either received one argument and we will return its inverse
> or we will use it as
> +  // our initial value for `total` and carry on dividing.

I don't really understand the advantage of this special case for div. 
If someone wants the inverse, why wouldn't they just use $(div 1 X)? 
What does this (to me kind of obscure) special case of $(div X) buy us?

> +  total = parse_number (*argv, funcname);
> +
> +  if (total.type == t_integer ? total.integer == 0 : total.floating
> == 0.0)
> +    OS (fatal, *expanding_var,
> +      _ ("Invalid argument to %s function: argument cannot be
> zero"),
> funcname);
> +
> +  // If we received a single argument, compute its inverse.
> +  if (argv[0] != NULL && argv[1] == NULL)
> +    {
> +      if (total.type == t_integer)
> +        total.integer = 1 / total.integer;
> +      else
> +        total.floating = 1.0 / total.floating;
> +
> +      return number_to_string (o, total);
> +    }
> +
> +  // We are using argv[0] as our initial value for total, so skip
> past it.
> +  argv++;
> +
> +  for (; *argv != NULL; argv++)
> +    {
> +      parsed = parse_number (*argv, funcname);
> +
> +      if (parsed.type == t_integer ? parsed.integer == 0 :
> parsed.floating == 0.0)
> +        OS (fatal, *expanding_var,
> +          _ ("Invalid argument to %s function: argument cannot be
> zero"), funcname);
> +      if (total.type == t_integer)

It seems like this testing of two "number" values to align them based
on type happens in a few different places.  Maybe it would make sense
to create a little function just for that operation, to check the types
of the two values and align them by changing both to double if either
one is double.

The function could even return the resulting type so the caller can
check the return value to see whether to update the integer or floating
values in the number.

> +        {
> +          if (parsed.type == t_integer)
> +            total.integer /= parsed.integer;
> +          else // `parsed` is floating, `total`` is integer. So,
> convert total
> +               // to floating
> +            {
> +              total.type = t_floating;
> +              total.floating = (double)total.integer /
> parsed.floating;
> +            }
> +        }
> +      else // `total` is floating
> +        {
> +          if (parsed.type == t_integer)
> +            total.floating /= (double)parsed.integer;
> +          else // `parsed` is floating, `total`` is floating
> +            total.floating /= parsed.floating;
> +        }
> +    }
> +
> +  return number_to_string (o, total);
> +}
> +
> +static char *
> +func_modulus (char *o, char **argv, const char *funcname)
> +{
> +  struct number parsed;
> +  struct number total;
> +
> +  // We require exactly two arguments
> +  assert(argv[0] != NULL && argv[1] != NULL && argv[2] == NULL);

No asserts please.

> +
> +  // We either received one argument and we will return its inverse
> or we will use it as
> +  // our initial value for `total` and carry on dividing.

I think this comment was copy-pasted from somewhere else...?

> +  total = parse_number (argv[0], funcname);

I find "total" and "parsed" to be confusing names.  Names related to
mod would be better.

> +
> +  if (total.type == t_floating)
> +    OS (fatal, *expanding_var,
> +      _ ("Invalid argument to %s function: argument must be an
> integer"), funcname);
> +
> +  if (total.integer == 0)
> +    OS (fatal, *expanding_var,
> +      _ ("Invalid argument to %s function: argument cannot be
> zero"),
> funcname);

Why can't this be 0?  I think this should be allowed.

> +
> +  parsed = parse_number (argv[1], funcname);
> +
> +  if (parsed.type == t_floating)
> +    OS (fatal, *expanding_var,
> +      _ ("Invalid argument to %s function: argument must be an
> integer"), funcname);
> +
> +  if (parsed.integer == 0)
> +    OS (fatal, *expanding_var,
> +      _ ("Invalid argument to %s function: argument cannot be
> zero"),
> funcname);
> +
> +  total.integer %= parsed.integer;
> +
> +  return number_to_string (o, total);
> +}
> +
> +static char *
> +func_maximum (char *o, char **argv, const char *funcname)
> +{
> +  struct number n = perform_math_op (
> +      op_max, funcname,
> +      (struct math_operation_init){ .init_type = t_first_value},
> +      argv);
> +
> +  return number_to_string (o, n);
> +}
> +
> +static char *
> +func_minimum (char *o, char **argv, const char *funcname)
> +{
> +  struct number n = perform_math_op (
> +      op_min, funcname,
> +      (struct math_operation_init){ .init_type = t_first_value},
> +      argv);
> +
> +  return number_to_string (o, n);
> +}
> +
> +static char *
> +func_absolute_value (char *o, char **argv, const char *funcname)
> +{
> +  struct number n;
> +
> +  // We accept exactly one argumnent
> +  assert(argv[0] != NULL && argv[1] == NULL);

Please don't assert.  If the issue could happen then check and throw an
error.  If not, the program will dump core soon enough for null
pointers.  In this case it can't happen if you have the function table
set properly.

> +
> +  n = parse_number (argv[0], funcname);
> +
> +  if (n.type == t_integer)
> +    n.integer = llabs(n.integer);
> +  else
> +    n.floating = fabs(n.floating);
> +
> +  return number_to_string (o, n);
> +}
> +
>  struct a_word
>  {
>    struct a_word *chain;
> @@ -2394,6 +2785,8 @@ static char *func_call (char *o, char **argv,
> const char *funcname);
>  static const struct function_table_entry function_table_init[] =
>  {
>   /*         Name            MIN MAX EXP? Function */
> +  FT_ENTRY ("abs",           1,  1,  1,  func_absolute_value),

In general it's best if the function name matches the name of the
function, so "func_abs", "func_max", "func_sub", etc.

> +  FT_ENTRY ("add",           1,  0,  1,  func_add),
>    FT_ENTRY ("abspath",       0,  1,  1,  func_abspath),
>    FT_ENTRY ("addprefix",     2,  2,  1,  func_addsuffix_addprefix),
>    FT_ENTRY ("addsuffix",     2,  2,  1,  func_addsuffix_addprefix),
> @@ -2401,6 +2794,7 @@ static const struct function_table_entry
> function_table_init[] =
>    FT_ENTRY ("basename",      0,  1,  1,  func_basename_dir),
>    FT_ENTRY ("call",          1,  0,  1,  func_call),
>    FT_ENTRY ("dir",           0,  1,  1,  func_basename_dir),
> +  FT_ENTRY ("div",           1,  0,  1,  func_divide),
>    FT_ENTRY ("error",         0,  1,  1,  func_error),
>    FT_ENTRY ("eval",          0,  1,  1,  func_eval),
>    FT_ENTRY ("file",          1,  2,  1,  func_file),
> @@ -2416,6 +2810,10 @@ static const struct function_table_entry
> function_table_init[] =
>    FT_ENTRY ("join",          2,  2,  1,  func_join),
>    FT_ENTRY ("lastword",      0,  1,  1,  func_lastword),
>    FT_ENTRY ("let",           3,  3,  0,  func_let),
> +  FT_ENTRY ("max",           1,  0,  1,  func_maximum),
> +  FT_ENTRY ("min",           1,  0,  1,  func_minimum),
> +  FT_ENTRY ("mod",           2,  2,  1,  func_modulus),
> +  FT_ENTRY ("mul",           1,  0,  1,  func_multiply),
>    FT_ENTRY ("notdir",        0,  1,  1,  func_notdir_suffix),
>    FT_ENTRY ("or",            1,  0,  0,  func_or),
>    FT_ENTRY ("origin",        0,  1,  1,  func_origin),
> @@ -2424,6 +2822,7 @@ static const struct function_table_entry
> function_table_init[] =
>    FT_ENTRY ("shell",         0,  1,  1,  func_shell),
>    FT_ENTRY ("sort",          0,  1,  1,  func_sort),
>    FT_ENTRY ("strip",         0,  1,  1,  func_strip),
> +  FT_ENTRY ("sub",           1,  0,  1,  func_subtract),
>    FT_ENTRY ("subst",         3,  3,  1,  func_subst),
>    FT_ENTRY ("suffix",        0,  1,  1,  func_notdir_suffix),
>    FT_ENTRY ("value",         0,  1,  1,  func_value),

-- 
Paul D. Smith <psmith@gnu.org>            Find some GNU Make tips at:
https://www.gnu.org                       http://make.mad-scientist.net
"Please remain calm...I may be mad, but I am a professional." --Mad
Scientist
[Prev in Thread]
Current Thread
[Next in Thread]
Re: [PATCH] Add arithmetic builtin functions, Paul Smith, 2024/12/08
- Re: [PATCH] Add arithmetic builtin functions, Paul Smith <=
  - Re: [PATCH] Add arithmetic builtin functions, Pete Dietl, 2024/12/09
  - Re: [PATCH] Add arithmetic builtin functions, Jouke Witteveen, 2024/12/09
    - Re: [PATCH] Add arithmetic builtin functions, Pete Dietl, 2024/12/09
    - Re: [PATCH] Add arithmetic builtin functions, Paul Smith, 2024/12/09
    - Re: [PATCH] Add arithmetic builtin functions, Pete Dietl, 2024/12/09
    - Re: [PATCH] Add arithmetic builtin functions, Pete Dietl, 2024/12/19
    - Re: [PATCH] Add arithmetic builtin functions, Paul Smith, 2024/12/26
Prev by Date: Re: [PATCH] Add arithmetic builtin functions
Next by Date: [bug #66499] Parameter for option -j seems not to be validated and can cause hangs
Previous by thread: Re: [PATCH] Add arithmetic builtin functions
Next by thread: Re: [PATCH] Add arithmetic builtin functions
Index(es):
- Date
- Thread