Re: How to match regex in bash? (any character)
From: Stephane CHAZELAS
Subject: Re: How to match regex in bash? (any character)
Date: Mon, 3 Oct 2011 11:03:12 +0000 (UTC)
User-agent: slrn/pre1.0.0-18 (Linux)
2011-10-02, 21:51(-04), Chet Ramey:
> On 10/2/11 3:43 PM, Stephane CHAZELAS wrote:
>
>> [*] actually, bash does some (undocumented) preprocessing on the
>> regexps, so even the regex(3) reference is misleading here.
>
> Not really. The words are documented to undergo quote removal, so
> they undergo quote removal. That turns \1 into 1, for instance.
[...]
The problem and the confusion here come from the fact that "\" is
overloaded: it is used by two different pieces of software (bash
and the system regex library).
It is used:
- by bash for quoting
- by regex(3) to escape regexp characters in some
circumstances (for instance when not inside [...], though that
may vary between implementations; think of the (?{...} type of
extensions)
- by some regex(3) implementations to introduce new regexp
operators (\w, \b, \<...)
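
To make the two layers concrete, here is a small sketch. It assumes
bash 3.2 or later, where quoting any part of the pattern makes that
part match literally. "try" is a helper name made up for this example;
it just runs a snippet in a fresh bash and prints yes or no.

```shell
# try SCRIPT: run SCRIPT with "bash -c" and report its exit status.
# (Hypothetical helper, only for this illustration.)
try() { if bash -c "$1"; then echo yes; else echo no; fi; }

# Unquoted ".": passed to regex(3) as the "any character" operator.
try '[[ "axb" =~ a.b ]]'                 # yes

# "." quoted for bash: bash makes it match literally.
try '[[ "axb" =~ a"."b ]]'               # no

# Backslash typed in the pattern: bash takes it as quoting the ".".
try '[[ "a.b" =~ a\.b ]]'                # yes
try '[[ "axb" =~ a\.b ]]'                # no

# Backslash inside an unquoted expansion: it reaches regex(3) as is
# and escapes the "." there.
try 're="a\.b"; [[ "a.b" =~ $re ]]'      # yes
try 're="a\.b"; [[ "axb" =~ $re ]]'      # no
```

The last two cases give the same answers as the two before them, but
the backslash is interpreted by a different piece of software in each
pair.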
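BTW, another bug (the string is a single backslash, which a bracket
expression containing only "." should not match):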
$ bash -c '[[ "\\" =~ ["."] ]]' && echo yes
yes
And what one could consider a bug:
$ bash -c 'chars="a]"; [[ "a" =~ ["$chars"] ]]' && echo yes
$ bash -c 'chars="a]"; [[ "a]" =~ ["$chars"] ]]' && echo yes
yes
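
A possible way around that, sketched under the assumption that the
intent is a bracket expression matching either "a" or "]": keep the
whole bracket expression in a variable, expand it unquoted, and put
the "]" first as POSIX bracket expressions require. "try" and "re"
are names made up for this example.

```shell
# try SCRIPT: run SCRIPT with "bash -c" and report its exit status.
# (Hypothetical helper, only for this illustration.)
try() { if bash -c "$1"; then echo yes; else echo no; fi; }

# "]" placed first inside [...] is an ordinary member of the set.
# The expansion is unquoted, so the text goes to regex(3) unchanged.
try 're="[]a]"; [[ "a" =~ $re ]]'    # yes
try 're="[]a]"; [[ "]" =~ $re ]]'    # yes
try 're="[]a]"; [[ "b" =~ $re ]]'    # no
```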
I was wrong to say that the bash documentation should refer to
POSIX regexps because bash disables extensions. It only disables
the extensions introduced by "\", not the ones introduced by
sequences that would otherwise be invalid in POSIX EREs, like
"(?", {{, **...
It should still refer to POSIX regexps, as they are the only ones
guaranteed to work. Any extension provided by the system's
regex(3) API may not work with bash.
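
As an illustration of staying within POSIX EREs, the sketch below
uses a bracket expression with a POSIX character class where one
might otherwise reach for the \w extension. The pattern and the
names "try" and "re" are made up for this example.

```shell
# try SCRIPT: run SCRIPT with "bash -c" and report its exit status.
# (Hypothetical helper, only for this illustration.)
try() { if bash -c "$1"; then echo yes; else echo no; fi; }

# [[:alnum:]_] is plain POSIX ERE syntax: alphanumerics plus "_".
# Keeping the pattern in a variable avoids bash's own quoting rules.
try 're="^[[:alnum:]_]+$"; [[ "foo_42" =~ $re ]]'    # yes
try 're="^[[:alnum:]_]+$"; [[ "foo-42" =~ $re ]]'    # no
```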
--
Stephane