qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 08/28] qapi: Support flat unions tag values with leading digi


From: Markus Armbruster
Subject: Re: [PATCH 08/28] qapi: Support flat unions tag values with leading digit
Date: Tue, 23 Mar 2021 17:18:11 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)

John Snow <jsnow@redhat.com> writes:

> On 3/23/21 5:40 AM, Markus Armbruster wrote:
>> Flat union tag values get checked twice: as enum member name, and as
>> union branch name.  The former accepts leading digits, the latter
>> doesn't.  The restriction feels arbitrary.  Skip the latter check.
>> This can expose c_name() to input it can't handle: a name starting
>> with a digit.  Improve it to return a valid C identifier for any
>> input.
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>
> Anything in particular inspire this?

Just a desire for keeping things simple.  "Any enum type works as
discriminator" is simpler than "any enum works, but branches
corresponding to enum values starting with a digit cannot have members".
Let me elaborate.

This works:

    {'enum': 'Enu', 'data': ['0', 'eins', '2']}
    {'struct': 'St', 'data': {'s': 'str'}}
    {'union': 'Uni',
     'base': {'type': 'Enu'},
     'discriminator': 'type',
     'data': {
       'eins': 'St'}}

But if you change the last line to

       '0': 'St'}}

you get told off:

    scripts/qapi-gen.py: /dev/stdin: In union 'Uni':
    /dev/stdin:3: 'data' member '0' has an invalid name

>
>> ---
>>   scripts/qapi/common.py | 8 ++++----
>>   scripts/qapi/expr.py   | 4 +++-
>>   2 files changed, 7 insertions(+), 5 deletions(-)
>> diff --git a/scripts/qapi/common.py b/scripts/qapi/common.py
>> index 11b86beeab..cbd3fd81d3 100644
>> --- a/scripts/qapi/common.py
>> +++ b/scripts/qapi/common.py
>> @@ -18,7 +18,6 @@
>>   #: Magic string that gets removed along with all space to its right.
>>   EATSPACE = '\033EATSPACE.'
>>   POINTER_SUFFIX = ' *' + EATSPACE
>> -_C_NAME_TRANS = str.maketrans('.-', '__')
>>     
>>   def camel_to_upper(value: str) -> str:
>> @@ -109,9 +108,10 @@ def c_name(name: str, protect: bool = True) -> str:
>>                        'not_eq', 'or', 'or_eq', 'xor', 'xor_eq'])
>>       # namespace pollution:
>>       polluted_words = set(['unix', 'errno', 'mips', 'sparc', 'i386'])
>> -    name = name.translate(_C_NAME_TRANS)
>> -    if protect and (name in c89_words | c99_words | c11_words | gcc_words
>> -                    | cpp_words | polluted_words):
>> +    name = re.sub(r'[^A-Za-z0-9_]', '_', name)
>
> The regex gets a little more powerful now. Instead of just . and - we
> now translate *everything* that's not an alphanumeric to _.

Yes.

> Does this have a visible impact anywhere, or are we constrained
> somewhere else?

If it had, we'd generate C that doesn't compile.  Except when we're
unlucky.  Then it compiles, but means something different than we want.

I need to catch "C identifiers can't start with a digit" to make the
remainder of the patch work (right below).  Instead of doing just that,
I chose to do a complete job, and ensure the function always returns a
valid C identifier.

>> +    if protect and (name in (c89_words | c99_words | c11_words | gcc_words
>> +                             | cpp_words | polluted_words)
>> +                    or name[0].isdigit()):
>
> Worth touching the docstring?
>
> :param protect: If true, avoid returning certain ticklish identifiers 
>                 (like C keywords) by prepending ``q_``. 

It doesn't become wrong.  Care to suggest an improvement?

> I know the formatting is a hot mess, but I still intend to get to that
> "all at once" with an "enable sphinx" pass later, so just touching the 
> content so it gets included in that pass would be fine (to me, at least.)
>
>>           return 'q_' + name
>>       return name
>>   diff --git a/scripts/qapi/expr.py b/scripts/qapi/expr.py
>> index cf09fa9fd3..507550c340 100644
>> --- a/scripts/qapi/expr.py
>> +++ b/scripts/qapi/expr.py
>> @@ -246,7 +246,9 @@ def check_union(expr, info):
>>         for (key, value) in members.items():
>>           source = "'data' member '%s'" % key
>> -        check_name_str(key, info, source)
>> +        if discriminator is None:
>> +            check_name_str(key, info, source)
>> +        # else: name is in discriminator enum, which gets checked
>
> I suppose the alternative would be to have allowed check_name_str to
> take an 'allow_leading_digits' parameter (instead of 'enum_member')
> and set that to true here and just deal with the mild inefficiency.
>
> I might have a slight preference to just accept the inefficiency so
> that it's obvious that it's checked regardless of the discriminator 
> condition, buuuuut not enough to be pushy about it, I think.

Sounds like a defensible idea, but this series is heading in the
opposite direction; see the next few patches.

>>           check_keys(value, info, source, ['type'], ['if'])
>>           check_if(value, info, source)
>>           check_type(value['type'], info, source, allow_array=not base)
>> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]