bug-m4

\{n\} is not implemented (it's ok) but this is not documented


From: Van de Bugger
Subject: \{n\} is not implemented (it's ok) but this is not documented
Date: Thu, 24 Jan 2019 01:08:21 +0300
User-agent: Evolution 3.30.3 (3.30.3-1.fc29)

Software versions
=================

    $ rpm -q m4
    m4-1.4.18-9.fc29.x86_64

    $ m4 --version
    m4 (GNU M4) 1.4.18
    Copyright (C) 2016 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.

    Written by Rene' Seindal.

Problem
=======

Lack of documentation on regular expressions. 

The m4 manual (<https://www.gnu.org/software/m4/manual/m4.html>) is very
laconic about regular expressions:

>   Builtin: regexp (string, regexp, [replacement])
>   
>       Searches for regexp in string. The syntax for regular expressions is 
>       the same as in GNU Emacs, which is similar to BRE, Basic Regular 
>       Expressions in POSIX. See Syntax of Regular Expressions in the GNU 
>       Emacs Manual. Support for ERE, Extended Regular Expressions is not 
>       available, but will be added in GNU M4 2.0. 

OK, let us look at the GNU Emacs manual 
(<https://www.gnu.org/software/emacs/manual/html_mono/emacs.html>). It 
documents the \{n\} construct:

>   15.6 Syntax of Regular Expressions
>       <...>
>       \{n\}   
>           is a postfix operator specifying n repetitions—that is, the 
>           preceding regular expression must match exactly n times in a row. 
>           For example, ‘x\{4\}’ matches the string ‘xxxx’ and nothing else.   
>   

This construct is also documented in chapter "9.3.6 BREs Matching Multiple 
Characters" of "The Open Group Base Specifications Issue 7, 2018 edition" 
(<https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html>):

>   5.  When a BRE matching a single character, a subexpression, or a 
>       back-reference is followed by an interval expression of the format 
>       "\{m\}", "\{m,\}", or "\{m,n\}", together with that interval expression 
>       it shall match what repeated consecutive occurrences of the BRE would 
>       match.
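
For comparison, here is a small sketch of how a POSIX BRE tool handles the
same interval expression. It assumes a POSIX-conforming grep is on the PATH
(BRE is grep's default syntax); the variable name bre_hits is only
illustrative.

```shell
# Feed two lines to grep: one with seven a's, one with the literal text a{3}.
# In a POSIX BRE, a\{3\} means "three consecutive a's".
bre_hits=$(printf '%s\n' 'aaaaaaa' 'a{3}' | grep 'a\{3\}')
printf '%s\n' "$bre_hits"   # prints: aaaaaaa
```

Only the line containing three consecutive a's is printed; the literal text
a{3} is not matched.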

However, m4 does not recognize the \{n\} construct. It treats the braces 
literally:

    $ cat input.txt 
    regexp(`aaaaaaa', `a\{3\}') # Not matched.
    regexp(`a{3}   ', `a\{3\}') # Matched.

    $ m4 input.txt 
    -1 # Not matched.
    0 # Matched.
    
This was very confusing and surprising. The m4 manual should state explicitly 
that \{n\} is not supported (nor is \{n,m\}).
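
In the meantime, one possible workaround is to spell out the repetition. The
sketch below assumes m4 is installed and uses a hypothetical helper macro
`repeat' (not an m4 builtin) that expands its second argument the given
number of times, so repeat(3, `a') stands in for a\{3\}.

```shell
# Write the m4 input to a temporary file, then run m4 on it.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
dnl repeat(n, text): hypothetical helper, not a builtin; expands to text n times.
define(`repeat', `ifelse(eval($1 > 0), 1, `$2`'repeat(decr($1), `$2')')')dnl
regexp(`aaaaaaa', repeat(3, `a'))
regexp(`a{3}   ', repeat(3, `a'))
EOF
m4_out=$(m4 "$tmp")
rm -f "$tmp"
printf '%s\n' "$m4_out"   # prints 0, then -1
```

With this, the pattern becomes plain `aaa': the first call matches at offset 0
and the second returns -1, which is what a\{3\} would be expected to do. This
only covers a fixed count, not the \{n,m\} range form.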
