bug#56347: Optimize/simplify STRING_SET_MULTIBYTE


From: Stefan Monnier
Subject: bug#56347: Optimize/simplify STRING_SET_MULTIBYTE
Date: Sat, 02 Jul 2022 12:12:06 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)

>> The patch below simplifies code around STRING_SET_MULTIBYTE.
>> Any objection?
> Rationale?

STRING_SET_MULTIBYTE is fundamentally evil because it changes the nature
of an object.  Its current definition (like that of STRING_SET_UNIBYTE)
is rather scary (it sometimes changes the nature of the arg passed to
it, and sometimes replaces the arg with something else).

>> --- a/src/composite.c
>> +++ b/src/composite.c
>> @@ -1879,11 +1879,7 @@ Otherwise (for terminal display), FONT-OBJECT must be a terminal ID, a
>>        for (i = SBYTES (string) - 1; i >= 0; i--)
>>          if (!ASCII_CHAR_P (SREF (string, i)))
>>            error ("Attempt to shape unibyte text");
>> -      /* STRING is a pure-ASCII string, so we can convert it (or,
>> -         rather, its copy) to multibyte and use that thereafter.  */
>> -      Lisp_Object string_copy = Fconcat (1, &string);
>> -      STRING_SET_MULTIBYTE (string_copy);
>> -      string = string_copy;
>> +      /* STRING is a pure-ASCII string, so we can treat it as multibyte.  */
>
> Did you actually try your change in the situations where this problem
> pops up?

I don't even know how to go about doing that, no.

> AFAIR, the code makes a copy of the string for good reasons:
> the rest of handling of the string down the line barfs if we keep a
> multibyte string here.

[ I assume you meant "barfs if we keep a *uni*byte string here".  ]

Where?  AFAICT `string` is only used in the subsequent code by passing
it to `fill_gstring_header` and that function only passes that arg to
`fetch_string_char_advance_no_check` and that function only looks at the
string's SDATA, so as long as the sequence of bytes is consistent with
a multibyte string (which we just checked with the ASCII_CHAR_P loop),
I don't see any problem.
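
To make the argument above concrete, here is a rough standalone sketch
(not Emacs code; `bytes_all_ascii` and `count_chars_as_multibyte` are
hypothetical names invented for this illustration).  It assumes a
UTF-8-style multibyte encoding in which continuation bytes have the form
10xxxxxx, and shows that once every byte has passed the ASCII check,
reading the same bytes as multibyte yields one character per byte:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if every byte is ASCII (< 0x80), in the
   spirit of the ASCII_CHAR_P loop quoted above.  */
static bool
bytes_all_ascii (const unsigned char *data, size_t nbytes)
{
  assert (data != NULL);
  for (size_t i = 0; i < nbytes; i++)
    if (data[i] >= 0x80)
      return false;
  return true;
}

/* Hypothetical helper: count characters when the bytes are read as a
   UTF-8-style multibyte sequence, i.e. count every byte that is not
   a continuation byte (10xxxxxx).  */
static size_t
count_chars_as_multibyte (const unsigned char *data, size_t nbytes)
{
  assert (data != NULL);
  size_t nchars = 0;
  for (size_t i = 0; i < nbytes; i++)
    if ((data[i] & 0xC0) != 0x80)
      nchars++;
  return nchars;
}
```

For a buffer that passes `bytes_all_ascii`, the character count equals
the byte count, so the unibyte and multibyte readings of the data
coincide.  A buffer containing a non-ASCII sequence fails the check, and
there the two readings would differ.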

>> --- a/src/lisp.h
>> +++ b/src/lisp.h
>> @@ -1637,12 +1637,10 @@ #define STRING_SET_UNIBYTE(STR) \
>>  
>>  /* Mark STR as a multibyte string.  Assure that STR contains only
>>     ASCII characters in advance.  */
>> -#define STRING_SET_MULTIBYTE(STR)                   \
>> -  do {                                                      \
>> -    if (XSTRING (STR)->u.s.size == 0)                       \
>> -      (STR) = empty_multibyte_string;                       \
>> -    else                                            \
>> -      XSTRING (STR)->u.s.size_byte = XSTRING (STR)->u.s.size; \
>> +#define STRING_SET_MULTIBYTE(STR)                       \
>> +  do {                                                          \
>> +    eassert (XSTRING (STR)->u.s.size > 0);              \
>> +    XSTRING (STR)->u.s.size_byte = XSTRING (STR)->u.s.size; \
>>    } while (false)
>>  
>>  /* Convenience functions for dealing with Lisp strings.  */
>
> You want to disallow uses of empty_multibyte_string? why?

No, I want to narrow the macro's semantics, e.g. so that it can be
implemented as a function rather than a macro, and so that it doesn't
magically substitute empty_multibyte_string into a variable that held
something else.
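
The difference can be sketched with a toy model (this is not the real
`struct Lisp_String`; `toy_string`, `toy_empty_multibyte`,
`TOY_SET_MULTIBYTE_OLD` and `toy_string_set_multibyte` are made-up names,
and the sketch assumes that a negative `size_byte` marks a unibyte
string).  The old-style macro can rebind the caller's variable, which a
plain function taking the string cannot do:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a Lisp string; not the real struct Lisp_String.
   Assumption for this sketch: size_byte < 0 means "unibyte".  */
struct toy_string
{
  ptrdiff_t size;
  ptrdiff_t size_byte;
  const char *data;
};

/* Toy stand-in for the shared empty multibyte string.  */
static struct toy_string toy_empty_multibyte = { 0, 0, "" };

/* Old-style semantics: for an empty string, the macro silently
   reassigns the caller's variable instead of modifying the object.  */
#define TOY_SET_MULTIBYTE_OLD(STR)              \
  do {                                          \
    if ((STR)->size == 0)                       \
      (STR) = &toy_empty_multibyte;             \
    else                                        \
      (STR)->size_byte = (STR)->size;           \
  } while (0)

/* Narrowed semantics: an ordinary function that only ever modifies
   the object it is given, and rejects the empty case up front.  */
static void
toy_string_set_multibyte (struct toy_string *s)
{
  assert (s->size > 0);
  s->size_byte = s->size;
}
```

With the macro, passing a variable that points to an empty string leaves
the original object untouched and makes the variable point somewhere
else.  With the function, the caller's variable always refers to the
same object afterwards.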


        Stefan





