Re: regex capturing the left-hand side of =
From: Koichi Murase
Subject: Re: regex capturing the left-hand side of =
Date: Mon, 12 Jul 2021 07:41:07 +0900
> Since bash variables can be at the left-hand side of =, it is easy to
> be captured this case by a regex like ^[A-Za-z_][A-Za-z0-9_]$.
I guess this is a typo for '^[A-Za-z_][A-Za-z0-9_]*' (with * in place
of $). I then guess that you want a regular expression that captures
the left-hand side of arbitrary assignments of the form 'LHS=RHS', and
in particular of array-element assignments of the form 'arr[XXX]=RHS',
with something like '^[A-Za-z_][A-Za-z0-9_]*\[[^][]+\]'. Of course,
this regular expression doesn't work in general, because special
characters such as '[', ']', and '=' can appear inside array
subscripts.
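To make the failure concrete, here is a minimal sketch that applies
the regular expression above with Bash's [[ =~ ]] operator. The helper
name naive_lhs and the sample strings are assumptions made up for this
illustration; they are not from the original question.

```shell
#!/usr/bin/env bash
# The regular expression quoted above, kept in a variable so that Bash
# passes it to the regex engine unmodified.
re='^[A-Za-z_][A-Za-z0-9_]*\[[^][]+\]'

# naive_lhs (hypothetical helper): print whatever the regex matches at
# the start of the given string, or nothing if it does not match.
naive_lhs() {
  if [[ $1 =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[0]}"
  fi
}

# A simple subscript is captured as intended.
naive_lhs 'arr[10]=x'          # prints: arr[10]

# A subscript containing ']' is cut off at the first ']'.
naive_lhs 'arr[$(echo ])]=x'   # prints: arr[$(echo ]
```

The second call stops at the ']' inside the command substitution, so
the captured text is not the whole left-hand side.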
> Is there a regex that can capture and only capture all bash code
> snippets for the left-hand side of the assignment operator =?
If so, the answer to the above question is "No". Consider an array
element of the form arr[$($($($($($(... $(echo : ])
...))))))]. In principle, command substitutions can be nested to an
arbitrary depth, so a matcher would have to track an unbounded number
of states to pair every $( with its closing ). A regular language, on
the other hand, must be accepted by a finite-state machine, so no
regular language can capture chains of $(...$()...) of arbitrary
depth. We could extend regular expressions with "non-regular" features
that support paired brackets and the like, but the POSIX <regex.h>
that Bash uses doesn't have such extensions.
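
For comparison, a hand-written scanner can keep the nesting depth in
integer counters, which is exactly the unbounded state that a regular
expression lacks. The following is only a rough sketch under strong
assumptions: the function name scan_lhs is hypothetical, and it ignores
quoting, backslash escapes, ${...}, and everything else that Bash's
real parser handles, so it is not how Bash itself finds the left-hand
side.

```shell
#!/usr/bin/env bash
# scan_lhs (hypothetical helper): print the text before the first '='
# that lies outside every [...] and (...) pair.
# Assumptions: no quotes, no backslashes, no ${...} in the input.
scan_lhs() {
  local s=$1 i c paren=0 bracket=0
  for ((i = 0; i < ${#s}; i++)); do
    c=${s:i:1}
    if ((paren > 0)); then
      # Inside (...): only parentheses change the depth, so a stray
      # ']' or '=' in a command substitution is skipped.
      case $c in
        '(') paren=$((paren + 1)) ;;
        ')') paren=$((paren - 1)) ;;
      esac
      continue
    fi
    case $c in
      '(') paren=$((paren + 1)) ;;
      '[') bracket=$((bracket + 1)) ;;
      ']') if ((bracket > 0)); then bracket=$((bracket - 1)); fi ;;
      '=')
        if ((bracket == 0)); then
          printf '%s\n' "${s:0:i}"
          return 0
        fi
        ;;
    esac
  done
  return 1   # no top-level '=' found
}

scan_lhs 'arr[$(echo $(echo ]))]=x'   # prints: arr[$(echo $(echo ]))]
scan_lhs 'arr[a=b]=x'                 # prints: arr[a=b]
```

The counters paren and bracket can grow without bound, which is why
this works for any nesting depth where a finite-state matcher cannot.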
--
Koichi