Re: Additional "in" operator for fields being lists of strings
From: Marcin Szewczyk
Subject: Re: Additional "in" operator for fields being lists of strings
Date: Fri, 31 Jul 2020 11:35:25 +0200
User-agent: Mutt/1.10.1 (2018-07-13)

On Thu, Jul 30, 2020 at 10:43:59PM +0200, Jose E. Marchesi wrote:
> Marcin Szewczyk <marcin.szewczyk@wodny.org> wrote:
> > One question comes to mind. Should the != operator mean:
> > 1. at least one enum value is different from the specified value, or
> > 2. none of the enum tokens may be equal to the specified value.
> > [...]
> > Should a normalization step be taken?
> > Like:
> >
> > Device: plumbus
> > Tag: plubus
> > Tag: dinglebop fleeb
> > Tag: grumbo
> >
> > to (only for enum fields):
> >
> > Device: plumbus
> > Tag: plubus dinglebop fleeb grumbo
>
> I would say we clearly want 2. for the semantics of != when applied to
> enumerated fields. Normalizing is indeed necessary.
>
> > Currently, I cannot see any exclusion operator. For multi-field strings
> > neither 'Y!="y3"' nor '!(Y="y3")' will exclude a record if there is any
> > Y field that matches these conditions. So the second semantic variant
> > would give something new and interesting but also incompatible with the
> > current string semantics. [...]
>
> Hmm, I don't think we need to keep the existing string semantics for
> enums, because in properly conformed data each Tag is restricted to
> have only one of the valid values, i.e.:
>
> --- foo.rec ---
> %rec: Device
> %type: Tag enum dinglebop fleeb plubus grumbo
>
> Device: plumbus
> Tag: dinglebop fleeb plubus grumbo
> --- end of foo.rec ---
>
> $ recfix foo.rec
> foo.rec:5: error: invalid enum value.

But if the user has always used the properly structured variant
(accepted by recfix), i.e.:

--- foo.rec ---
%rec: Device
%type: Tag enum dinglebop fleeb plubus grumbo

Device: plumbus
Tag: dinglebop
Tag: fleeb
Tag: plubus
Tag: grumbo
--- end of foo.rec ---

executing `recsel -e 'Tag != "fleeb"' foo.rec` would change its output
from returning the record to returning nothing (assuming that both the
expanded and non-expanded forms should mean the same thing).
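
To make the difference concrete, here is a minimal sketch of the two
candidate meanings of `!=` over a multi-valued field. It is only a Python
model, not recutils code: the record representation (a list of name/value
pairs) and the names `tokens`, `ne_any` and `ne_none` are made up for
illustration.

```python
# Hypothetical in-memory record: a list of (field name, value) pairs.
expanded = [
    ("Device", "plumbus"),
    ("Tag", "dinglebop"),
    ("Tag", "fleeb"),
    ("Tag", "plubus"),
    ("Tag", "grumbo"),
]


def tokens(record, name):
    """Collect every whitespace-separated token of all fields called `name`."""
    out = []
    for field_name, value in record:
        if field_name == name:
            out.extend(value.split())
    return out


def ne_any(record, name, value):
    """Semantics 1: at least one token differs from `value`."""
    return any(tok != value for tok in tokens(record, name))


def ne_none(record, name, value):
    """Semantics 2: no token equals `value` (a true exclusion test)."""
    return all(tok != value for tok in tokens(record, name))


print(ne_any(expanded, "Tag", "fleeb"))   # record selected under semantics 1
print(ne_none(expanded, "Tag", "fleeb"))  # record rejected under semantics 2
```

Note that the two variants also disagree on a record with no Tag field at
all: `ne_any` rejects it and `ne_none` accepts it, so that corner case would
need a decision too.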

Which form of using multiple enum values should be canonical:
- SFMV: single field with multiple values (non-expanded) or
- MFSV: multiple fields with single values (expanded)?

Implementing SFMV would probably be quite easy and based on strtok() in
the ops switch.

The MFSV form would probably require a serious change in the
`rec_sex_eval` implementation[1] to give semantics 2. of the `!=`
operator.

Do you think that normalization should be:
- explicit and applied permanently, e.g. by recfix, or
- implicit and calculated just for SEX evaluation?

Or maybe a trick should be implemented:
- the official representation of serialized records should be MFSV
  (explicit normalization) but
- for ease of implementation, the internal representation used for SEX
  evaluation should be SFMV (implicit normalization)?
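
The two normalization directions mentioned above could look roughly like
this. Again a Python model under assumptions, not a proposal for the actual
C code: `to_mfsv`, `to_sfmv` and the `enum_fields` argument (the set of
field names declared as enums) are hypothetical names.

```python
def to_mfsv(record, enum_fields):
    """Expand: emit one field per token for every enum field."""
    out = []
    for name, value in record:
        if name in enum_fields:
            out.extend((name, tok) for tok in value.split())
        else:
            out.append((name, value))
    return out


def to_sfmv(record, enum_fields):
    """Collapse: merge all tokens of an enum field into one field,
    placed where that field first appears in the record."""
    collected = {}
    for name, value in record:
        if name in enum_fields:
            collected.setdefault(name, []).extend(value.split())
    out, emitted = [], set()
    for name, value in record:
        if name not in enum_fields:
            out.append((name, value))
        elif name not in emitted:
            out.append((name, " ".join(collected[name])))
            emitted.add(name)
    return out


# Mixed input: some Tag fields hold one token, one holds two.
mixed = [
    ("Device", "plumbus"),
    ("Tag", "plubus"),
    ("Tag", "dinglebop fleeb"),
    ("Tag", "grumbo"),
]

for name, value in to_mfsv(mixed, {"Tag"}):
    print(f"{name}: {value}")
print("--")
for name, value in to_sfmv(mixed, {"Tag"}):
    print(f"{name}: {value}")
```

In the "trick" variant, something like `to_mfsv` would be what recfix writes
out, while something like `to_sfmv` would be applied only in memory before
evaluating the expression.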

One more thing that comes to mind: should access to enum tokens
(multiple values per field) be implemented (if allowed) on top of
`rec_record_get_field_by_name`, or should `rec_record_get_field_by_name`
itself be capable of indexing in the following implicit normalization
manner:

Tag: plubus
Tag: dinglebop fleeb
Tag: grumbo

Tag[0] = plubus
Tag[1] = dinglebop
Tag[2] = fleeb
Tag[3] = grumbo
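
A minimal sketch of that indexing rule, as a Python model only. The helper
name `get_token_by_name` and the list-of-pairs record are hypothetical and
do not reflect the signature of the real `rec_record_get_field_by_name`.

```python
def get_token_by_name(record, name, index):
    """Return the index-th token of `name`, counting tokens across all
    fields with that name in record order, or None if out of range."""
    position = 0
    for field_name, value in record:
        if field_name != name:
            continue
        for tok in value.split():
            if position == index:
                return tok
            position += 1
    return None


record = [
    ("Tag", "plubus"),
    ("Tag", "dinglebop fleeb"),
    ("Tag", "grumbo"),
]

for i in range(4):
    print(f"Tag[{i}] = {get_token_by_name(record, 'Tag', i)}")
```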

[1]: Not all combinations checked when more than one multiple-value field
     is present
     https://lists.gnu.org/archive/html/bug-recutils/2020-07/msg00004.html

--
Marcin Szewczyk
http://wodny.org