bug-gnulib

Re: Question about some fields in regex's re_pattern_buffer


From: Reuben Thomas
Subject: Re: Question about some fields in regex's re_pattern_buffer
Date: Tue, 17 Aug 2010 18:49:33 +0100

On 14 August 2010 23:15, Bruno Haible <address@hidden> wrote:
> Reuben Thomas wrote:
>> New patch attached.
>
> I've applied it for you.

Thanks again.

Reading further, I came up with the following:

1. I believe the @ignore'd section about RE_NO_EMPTY_ALTS can be
removed, as that flag no longer exists.

2. Equivalence classes and collating symbol operators: as far as I can
see, these are indeed implemented now, and there is no API syntax flag
to turn them off. Hence, the documentation for them can be
uncommented, with the mentions of syntax bits deleted.

3. "@comment xx something about leftmost-longest": I found this
comment in the section "Alternation Operator". Since there is already
a paragraph about leftmost-longest matching in general ("What Gets
Matched?"), what was intended here? The obvious thing to say about
leftmost-longest matching and alternation is that the longest-matching
alternative is returned, not the first alternative in the pattern
that matches.
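
To make the point about alternation concrete, here is a minimal sketch
using the POSIX interface (regcomp/regexec) rather than the GNU one. The
helper name match_length is hypothetical and only for illustration; the
sketch assumes a matcher that follows the POSIX leftmost-longest rule, as
glibc's does.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return the length of the match of PATTERN
   (POSIX extended syntax) in TEXT, or -1 if the pattern does not
   compile or does not match.  */
static int
match_length (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t m[1];
  int len = -1;

  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;

  /* m[0] receives the span of the overall match.  */
  if (regexec (&re, text, 1, m, 0) == 0)
    len = (int) (m[0].rm_eo - m[0].rm_so);

  regfree (&re);
  return len;
}
```

With leftmost-longest matching, match_length ("a|ab", "ab") should give
2, the same as match_length ("ab|a", "ab"): the order of the
alternatives in the pattern does not decide which one wins.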

4. The @ignore section starting

@ignore
The function sets @address@hidden@var{regs}->}start[0]} and

seems to be deletable, in that its contents are already included above.

5. The comment "@c xx what else?" refers to a description of which
fields are set by re_compile_pattern. The fields mentioned are buffer,
used, syntax, re_nsub and fastmap_accurate. regex.h (as recently
amended) has the following to say:

/* This data structure represents a compiled pattern.  Before calling
   the pattern compiler, the fields `buffer', `allocated', `fastmap',
   `translate', and `no_sub' can be set.  After the pattern has been
   compiled, the `re_nsub' field is available.  All other fields are
   private to the regex routines.  */

This suggests that `allocated' and `no_sub' should be added to this
list, while `used', `syntax', and `fastmap_accurate' should not be
documented (the existing documentation should be deleted), as they are
"private to the regex routines". However, the documentation of
`fastmap_accurate' makes an important point, that re_compile_pattern
stops any previous fastmap being used, and this should be made clear
in the discussion of fastmaps.
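
As a rough sketch of the division that the regex.h comment describes,
the following sets only the fields named as settable before compiling
and reads only `re_nsub' afterwards. It assumes glibc with _GNU_SOURCE
defined before any header is included, so that the GNU entry points and
unprefixed field names are visible; the helper name count_groups is
hypothetical.

```c
#define _GNU_SOURCE 1   /* assumption: glibc; exposes the GNU regex API */
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN with re_compile_pattern and
   return the number of parenthesized subexpressions, or -1 if the
   pattern does not compile.  */
static long
count_groups (const char *pattern)
{
  struct re_pattern_buffer pb;
  const char *err;
  long n;

  memset (&pb, 0, sizeof pb);

  /* The fields regex.h says may be set before compiling.  */
  pb.buffer = NULL;          /* let the compiler allocate the buffer */
  pb.allocated = 0;
  pb.fastmap = malloc (256); /* one byte per character; NULL means none */
  pb.translate = NULL;       /* no translation table */
  pb.no_sub = 0;             /* an implementation may reset this itself */

  re_set_syntax (RE_SYNTAX_POSIX_EXTENDED);
  err = re_compile_pattern (pattern, strlen (pattern), &pb);
  if (err != NULL)
    {
      free (pb.fastmap);
      return -1;
    }

  /* The one field documented as available after compiling.  */
  n = (long) pb.re_nsub;

  regfree (&pb);             /* also frees the malloc'd fastmap */
  return n;
}
```

Nothing here touches `used', `syntax' or `fastmap_accurate', which is
consistent with treating them as private to the regex routines.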

6. It is not clear whether "@c xx i'm not sure this is all true
anymore." refers to what precedes or follows it. Assuming the latter
is more likely, then "not contained within another group" can be
deleted: all subexpression matches are reported in regs. The rest, as
far as I can see, is correct.
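
A small sketch of the claim that nested groups are reported in regs as
well, again assuming glibc with _GNU_SOURCE defined before any header
is included. The helper name group_span is hypothetical.

```c
#define _GNU_SOURCE 1   /* assumption: glibc; exposes the GNU regex API */
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: search TEXT for PATTERN and store the span of
   subexpression GROUP in *START and *END.  Return 0 on success, -1 if
   the pattern fails to compile, does not match, or the group did not
   take part in the match.  */
static int
group_span (const char *pattern, const char *text, unsigned group,
            int *start, int *end)
{
  struct re_pattern_buffer pb;
  struct re_registers regs;
  int ok = -1;

  memset (&pb, 0, sizeof pb);
  memset (&regs, 0, sizeof regs);

  re_set_syntax (RE_SYNTAX_POSIX_EXTENDED);
  if (re_compile_pattern (pattern, strlen (pattern), &pb) != NULL)
    return -1;

  /* With the default (unallocated) register mode, the search
     allocates regs.start and regs.end for us.  */
  if (re_search (&pb, text, strlen (text), 0, strlen (text), &regs) >= 0
      && group < regs.num_regs
      && regs.start[group] >= 0)
    {
      *start = (int) regs.start[group];
      *end = (int) regs.end[group];
      ok = 0;
    }

  free (regs.start);
  free (regs.end);
  regfree (&pb);
  return ok;
}
```

For the pattern "(a(b))c" searched in "xabc", group 2 is contained
within group 1, and group_span should still report it, as offsets 2
to 3.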

All the POSIX documentation should be removed, as there is better GNU
POSIX documentation in the glibc manual (not to mention the POSIX
spec and other sources). It is best to refer the reader there.

-- 
http://rrt.sc3d.org


