Re: indsize: New module

bug-gnulib

From:	Bruno Haible
Subject:	Re: indsize: New module
Date:	Fri, 04 Dec 2020 03:29:06 +0100
User-agent:	KMail/5.1.3 (Linux/4.4.0-193-generic; KDE/5.18.0; x86_64; ; )

Paul Eggert wrote:
> > For this reason I see the choice of a signed type as an _implementation_
> > detail.
> 
> It's a crucial detail and it belongs in the API. I do not want to waste time
> worrying whether idx_t is signed in any code that uses idx_t, as 
> signed-vs-unsigned problems are endemic to C

Indeed, some comparison results between signed and unsigned values, specified
by ISO C, are surprising:

int foo () {
/* ISO C99 § 6.3.1.8 */
/* If both operands have the same type, then no further conversion is needed. */
  return (int) -3 < (int) 7; // true
/* Otherwise, if both operands have signed integer types or both have unsigned
   integer types, the operand with the type of lesser integer conversion rank is
   converted to the type of the operand with greater rank. */
  return (int) -3 < (long) 7; // true
  return (long) -3 < (int) 7; // true
/* Otherwise, if the operand that has unsigned integer type has rank greater or
   equal to the rank of the type of the other operand, then the operand with
   signed integer type is converted to the type of the operand with unsigned
   integer type. */
  return (int) -3 < (unsigned int) 7; // false!
  return (int) -3 < (unsigned long) 7; // false!
  return (unsigned int) -3 < (int) 7; // false
  return (unsigned long) -3 < (int) 7; // false
/* Otherwise, if the type of the operand with signed integer type can represent
   all of the values of the type of the operand with unsigned integer type, then
   the operand with unsigned integer type is converted to the type of the
   operand with signed integer type. */
  return (long long) -3 < (unsigned int) 7; // true
  return (unsigned int) -3 < (long long) 7; // false
  /* On 64-bit machines: */
  return (long) -3 < (unsigned int) 7; // true
/* Otherwise, both operands are converted to the unsigned integer type
   corresponding to the type of the operand with signed integer type. */
  return (unsigned int) -3 < (long) 7; // false
  /* On 32-bit machines: */
  return (long) -3 < (unsigned int) 7; // false!
}
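
Since every 'return' after the first one in foo() is unreachable, it can help
to put each comparison into its own function when trying this out. A minimal
sketch; the function names are made up for illustration and are not part of
gnulib:

```c
/* Hypothetical helpers: each one evaluates a single comparison from the
   list above, so that the results can be checked one by one.  */

/* int vs. unsigned int: -3 is converted to unsigned int and becomes a
   huge value, so the comparison yields 0.  */
static int
lt_int_uint (void)
{
  int a = -3;
  unsigned int b = 7;
  return a < b;
}

/* long long vs. unsigned int: assuming long long can represent every
   unsigned int value (true on the usual 32-bit and 64-bit platforms),
   7 is converted to long long and the comparison yields 1.  */
static int
lt_llong_uint (void)
{
  long long a = -3;
  unsigned int b = 7;
  return a < b;
}

/* long vs. unsigned int: the result depends on the platform.  It is 1
   where long is wider than unsigned int, and 0 where both have the
   same width.  */
static int
lt_long_uint (void)
{
  long a = -3;
  unsigned int b = 7;
  return a < b;
}
```

Compiling this with a warning option such as -Wsign-compare should flag the
mixed comparisons, which is another way to locate them in existing code.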

> and part of the point of using ptrdiff_t is so that I *don't* have to worry
> about this stuff. For example, if I have an int value I and an idx_t value N
> and I happens to be negative, I don't want to have to worry about the
> possibility that the expression I < N might fail to be true. Expressions like
> that happen quite a bit in generic code and this is a real problem.

Indeed, in languages that have only signed integer types (e.g. Java), or more
generally where number comparison always uses the value (e.g. Common Lisp),
this entire class of comparison problems is non-existent.
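
In C, a comparison by value between a signed and an unsigned operand has to
be spelled out by hand. A minimal sketch of what that could look like; the
helper names are hypothetical and not part of gnulib:

```c
#include <stdbool.h>

/* Hypothetical helper: return true if S < U when both are taken as
   mathematical integers, regardless of their C types.  */
static bool
lt_signed_unsigned (long long s, unsigned long long u)
{
  /* A negative value is smaller than every unsigned value.  */
  if (s < 0)
    return true;
  /* S is non-negative here, so the conversion preserves its value.  */
  return (unsigned long long) s < u;
}

/* Hypothetical helper: return true if U < S, compared by value.  */
static bool
lt_unsigned_signed (unsigned long long u, long long s)
{
  /* No unsigned value is smaller than a negative value.  */
  if (s < 0)
    return false;
  return u < (unsigned long long) s;
}
```

With these helpers, lt_signed_unsigned (-3, 7) is true, unlike the built-in
comparison (int) -3 < (unsigned int) 7.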

> If we're going to leave open the possibility that this type might be unsigned,
> I'd rather not use the type in canonicalize.c now.

OK, I've revised the file as follows:


2020-12-03  Bruno Haible  <bruno@clisp.org>

        idx: Clarify that idx_t always behaves like a signed type.
        Suggested by Paul Eggert in
        <https://lists.gnu.org/archive/html/bug-gnulib/2020-12/msg00034.html>.
        * lib/idx.h: Clarify that idx_t always behaves like a signed type.
        Don't test UNSIGNED_IDX_T.

diff --git a/lib/idx.h b/lib/idx.h
index 6574b1b..0095467 100644
--- a/lib/idx.h
+++ b/lib/idx.h
@@ -25,16 +25,35 @@
 #include <stdint.h>
 
 /* The type 'idx_t' holds an (array) index or an (object) size.
-   Its implementation is a signed or unsigned integer type, capable of holding
-   the values
+   Its implementation is a signed integer type, capable of holding the values
      0..2^63-1 (on 64-bit platforms) or
      0..2^31-1 (on 32-bit platforms).
 
-   Why not use 'size_t' directly?
+   Why a signed integer type?
 
      * Security: Signed types can be checked for overflow via
        '-fsanitize=undefined', but unsigned types cannot.
 
+     * Comparisons without surprises: ISO C99 § 6.3.1.8 specifies a few
+       surprising results for comparisons, such as
+
+           (int) -3 < (unsigned long) 7  =>  false
+           (int) -3 < (unsigned int) 7   =>  false
+       and on 32-bit machines:
+           (long) -3 < (unsigned int) 7  =>  false
+
+       This is surprising because the natural comparison order is by
+       value in the realm of infinite-precision signed integers (ℤ).
+
+       The best way to get rid of such surprises is to use signed types
+       for numerical integer values, and use unsigned types only for
+       bit masks and enums.
+
+   Why not use 'size_t' directly?
+
+     * Because 'size_t' is an unsigned type, and a signed type is better.
+       See above.
+
    Why not use 'ptrdiff_t' directly?
 
      * Maintainability: When reading and modifying code, it helps to know that
@@ -67,27 +86,22 @@
        constrained to a certain range of values) may be added to C compilers
        or to the C standard.  Several programming languages (Ada, Haskell,
        Common Lisp, Pascal) already have range types.  Such range types may
-       help producing good code and good warnings.  The type 'idx_t'
-       could then be typedef'ed to a range type.  */
-
-/* The user can define UNSIGNED_IDX_T, to get a different set of compiler
-   warnings.  */
-#if defined UNSIGNED_IDX_T
-# if __clang_version__ >= 11 && 0
-/* It would be tempting to use the clang "extended integer types", which are a
-   special case of range types.
-   <https://clang.llvm.org/docs/LanguageExtensions.html#extended-integer-types>
-   However, these types don't support binary operators with plain integer
-   types (e.g. expressions such as x > 1).  */
-typedef unsigned _ExtInt (PTRDIFF_WIDTH - 1) idx_t;
-# else
-/* Use an unsigned type as wide as 'ptrdiff_t'.
-   ISO C does not mandate that 'size_t' and 'ptrdiff_t' have the same size, but
-   it is so an all platforms we have seen since 1990.  */
-typedef size_t idx_t;
-# endif
+       help producing good code and good warnings.  The type 'idx_t' could
+       then be typedef'ed to a (signed!) range type.  */
+
+#if 0
+/* In the future, idx_t could be typedef'ed to a signed range type.  */
+/* Note: The clang "extended integer types", supported in Clang 11 or newer
+   <https://clang.llvm.org/docs/LanguageExtensions.html#extended-integer-types>,
+   are a special case of range types.  However, these types don't support binary
+   operators with plain integer types (e.g. expressions such as x > 1).
+   Therefore, they don't behave like signed types (and not like unsigned types
+   either).  So, we cannot use them here.  */
+typedef <some_range_type> idx_t;
 #else
 /* Use the signed type 'ptrdiff_t' by default.  */
+/* Note: ISO C does not mandate that 'size_t' and 'ptrdiff_t' have the same
+   size, but it is so an all platforms we have seen since 1990.  */
 typedef ptrdiff_t idx_t;
 #endif
 
@@ -98,7 +112,7 @@ typedef ptrdiff_t idx_t;
 # define IDX_MAX PTRDIFF_MAX
 #endif
 
-/* IDX_WIDTH is the number of bits in an idx_t.  */
+/* IDX_WIDTH is the number of bits in an idx_t (31 or 63).  */
 #define IDX_WIDTH (PTRDIFF_WIDTH - 1)
 
 #endif /* _IDX_H */
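
For illustration, here is how code using the signed idx_t might look. This is
only a sketch: the typedef and IDX_MAX below are local stand-ins that mirror
the patch (real code would include "idx.h"), and both helper functions are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring the definitions in the patch.  */
typedef ptrdiff_t idx_t;
#define IDX_MAX PTRDIFF_MAX

/* Hypothetical helper: compare an int with an idx_t.  Both types are
   signed, so the comparison is by value, also when I is negative.  */
static bool
int_lt_idx (int i, idx_t n)
{
  return i < n;
}

/* Hypothetical helper: store A + B in *SUM and return true, or return
   false if an argument is negative or the sum would exceed IDX_MAX.
   The range test is done before the addition, so no signed overflow
   can occur.  */
static bool
idx_add_checked (idx_t a, idx_t b, idx_t *sum)
{
  if (a < 0 || b < 0 || a > IDX_MAX - b)
    return false;
  *sum = a + b;
  return true;
}
```

An unchecked a + b that overflows would be undefined behavior, which is the
case that '-fsanitize=undefined' is meant to report at run time.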
