Re: [Pan-users] Select Headers with RE in the Subject/Author Entryfield?
From: Heinz Mezera
Subject: Re: [Pan-users] Select Headers with RE in the Subject/Author Entryfield?
Date: Tue, 09 Jun 2015 17:16:24 +0200
Hello Duncan,
your answer is a helpful short explanation of regex, but the expression
does not work as expected.
On Monday, 08.06.2015, at 11:31 +0000, Duncan wrote:
> Heinz Mezera posted on Mon, 08 Jun 2015 09:16:22 +0200 as excerpted:
>
> > I'd like to select headers in the header pane with a regular expression
> > in the Subject/Author field and need your help. Is this possible, and how
> > do I do it?
> >
> > I want to select all headers
> > - starting with three alphabetic characters
> > - followed by an underscore
> > - two digits after the underscore
> > - and any number of characters afterwards.
> >
> > PAN Info:
> > Pan 0.139 Sexual Chocolate (GIT bf56508 git://git.gnome.org/pan2;
> > i686-pc-linux-gnu)
>
> ** Note that after changing the search expression, you may have to toggle
> to something else (say subject), then back to regex, in order to get it
> to "take". I noticed it would dynamically refilter part of the time, but
> would appear to stall out and not update without the toggle, sometimes.
> Given that hint, and the caveat that I tested the components separately
> but not together, as I didn't have posts handy that matched that specific
> pattern...
>
> One way to do it:
>
> ^[[:alpha:]]{3}_[[:digit:]]{2}.*$
No matter what expression I enter into the search field, the header list
is completely empty. To make sure it's not an error in the regex, I
tried
^.*$
I think the above expression should show all headers, but the header
list is still completely empty.
Enclosing the expression in single or double quotes or brackets has no
effect.
What am I doing wrong?
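
As a sanity check outside Pan, the catch-all expression can be tried in any general-purpose regex engine. The sketch below uses Python's `re` module purely as a stand-in (Pan does not use it), and the subject strings are invented for illustration. If `^.*$` matches every subject here, the empty header list more likely points at how the filter is being applied in Pan than at the expression itself.

```python
import re

# Hypothetical subject lines, invented for illustration.
subjects = ["abc_12 some post", "Re: hello", ""]

catch_all = re.compile(r"^.*$")

# ^.*$ should match any single-line string, including the empty one.
matches = [bool(catch_all.search(s)) for s in subjects]
print(matches)
```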
>
> ^ = zero-width match at the beginning/left
> $ = same at the end/right
>
> Non-special characters match themselves. Letters, digits, _, etc, are
> non-special.
>
> . matches exactly one occurrence of any character (and *, mentioned again
> below, is any number including zero, so .* is a full wildcard, including
> matching nothing).
>
> [] encloses a "character class". Such character classes can include
> ranges of characters [a-z], individual lists [123], and/or category
> classes (I seem to have forgotten the proper term ATM) like the above,
> enclosed in further [:xxx:] marks, thus the nesting.
>
> So [[:alpha:][:digit:]] and [a-zA-Z0-9] would both match alphanumeric
> characters in ASCII, tho pan's regex is case insensitive so both a-z and
> A-Z wouldn't be needed for pan, only one or the other. You can also do
> things like [[:digit:]abc._], to match digits, abc, and the individual
> characters . and _. The significance of the [:xxx:] matches, however, is
> that they work across character sets, so [:alpha:] matches letters that
> would be skipped in character-sets where a-z doesn't include all letters
> due to strange ordering or something.
>
> To match a - in a character-class, put it at the beginning so it can't
> specify a range. The \ char is the escape char, both inside and outside
> a character-class, so you can use \] to match a literal ] for instance,
> and of course \\ to match a literal \.
>
> Additionally, you can specify a /negative/ character-class with ^ as the
> first character (outside a character-class, it means match the beginning,
> inside, as the first character of the class, it negates the class, inside
> as anything other than the first char, it matches itself normally). So
> [^abc] means any character /but/ abc.
>
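
The character-class rules described so far can be checked with a small sketch. It uses Python's `re` module as an assumed stand-in for Pan's engine. Python does not support the POSIX `[:alpha:]` style classes, so only explicit lists, ranges, negation, and escaping are shown. The helper name `one_char` is hypothetical.

```python
import re

def one_char(char_class, ch):
    """Return True if the single character ch matches the character class."""
    return re.fullmatch(char_class, ch) is not None

checks = {
    "negated_hit": one_char(r"[^abc]", "d"),      # anything but a, b, c
    "negated_miss": one_char(r"[^abc]", "a"),
    "leading_hyphen": one_char(r"[-a-c]", "-"),   # - first, so it is literal
    "escaped_bracket": one_char(r"[\]x]", "]"),   # \] is a literal ]
    "caret_not_first": one_char(r"[a^]", "^"),    # ^ later in the class is literal
}

for name, result in checks.items():
    print(name, result)
```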
> Significantly, character classes normally only match *ONE* character. To
> match more than one you can repeat, [a-z][a-z] will match TWO letters, or
> use frequency specifiers inside of {} as I did, above. {1,3} would be
> one, two, or three matches, {1,} would be at least one match.
>
> In addition to the {}-delimited frequency range specifiers, there's:
>
> * = zero or more (*NOT* one or more, it doesn't have to be there!)
> ? = zero or one (may or may not be there, but matches only once)
> + = 1 or more
>
> Again in case it didn't sink in above, \ is the escape char, so to match
> a literal *, you'd use \*
>
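
A minimal sketch of the frequency specifiers, again assuming Python's `re` module rather than Pan itself. `re.fullmatch` is used so that each pattern has to account for the whole test string. The helper name `whole` is hypothetical.

```python
import re

def whole(pattern, text):
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

results = {
    "star_zero": whole(r"ab*", "a"),            # * allows zero b's
    "star_many": whole(r"ab*", "abbb"),         # ... or many
    "optional_twice": whole(r"ab?", "abb"),     # ? allows at most one b
    "plus_zero": whole(r"ab+", "a"),            # + needs at least one b
    "range_too_long": whole(r"[a-z]{1,3}", "abcd"),
    "at_least_one": whole(r"[a-z]{1,}", "abcd"),
    "literal_star": whole(r"a\*", "a*"),        # \* is a literal asterisk
}

for name, result in results.items():
    print(name, result)
```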
> () are the grouping characters, and | indicates alternatives (or). So
> ((cat)|(horse)) will match "cat" or "horse" but will NOT match "cah", for
> instance. Note that the alternatives do NOT need to be the same length,
> and that the inner grouping helps clarify the scope of the match but
> aren't absolutely required, so (cat|horse) should have the same effect.
> So there are two ways to match a "cat" that may or may not be there:
>
> (cat)?
> (cat|)
>
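
Grouping and alternation can be illustrated the same way. This is a sketch under the assumption that Python's `re` module behaves like Pan's engine for these basic constructs. The test strings are made up.

```python
import re

animal = re.compile(r"(cat|horse)")

# Alternatives of different lengths are fine; a mix of the two is not.
alt_results = [animal.fullmatch(s) is not None for s in ("cat", "horse", "cah")]

# Two ways to write an optional "cat".
optional_a = re.compile(r"my (cat)?")
optional_b = re.compile(r"my (cat|)")
same_behaviour = all(
    (optional_a.fullmatch(s) is not None) == (optional_b.fullmatch(s) is not None)
    for s in ("my ", "my cat", "my dog")
)

print(alt_results, same_behaviour)
```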
> That's the basics. FWIW for non-pan usage, some regex uses make things
> like {} special characters, so {3} is a frequency and \{3\} are the
> literal characters, while others don't unless they're escaped, so {3}
> would be the literal characters and the backslash-escaped version would
> be frequency. And of course the shell has its own special chars and \
> escape char, so sometimes you need to play with the number of \\\ a bit
> in order to get it to work like you want, but once you understand the
> basics, even /just/ the basics, regex can really be quite powerful.
>
> Of course there's far FAR more. Just a couple quick examples. First, ()
> not only groups, but stores for later use. So if for instance you are
> trying to match quotes but don't know if it's single-quotes or double-
> quotes, you can use (['"]) for the first match (possibly as (['"])? or
> ('|"|) if you don't know if it'll be quoted or not), and \1 or possibly
> $1 to automatically match the same thing at the other end of the quote.
> Second, there's what's called look-ahead and look-behind matching, which
> can be positive or negative. So for instance if you want to match "pro"
> but not "gopro", there's a way to say "look behind (to the left of) the
> pro and don't match if the preceding letters are 'go'". I don't use them
> enough to be sure of my memory, however, so generally have to look that
> sort of advanced stuff up, if I need it. And for this advanced stuff,
> you usually have to either look up or test whether whatever you're trying
> to work with actually supports it or not. I'm not sure whether pan does,
> for instance, tho it wouldn't surprise me if it did.
>
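
For reference, this is roughly what the two advanced features look like in an engine that supports them. The sketch assumes Python's `re` syntax, where a backreference inside the pattern is written `\1` and a negative look-behind is written `(?<!...)`. Whether Pan accepts either construct is not confirmed here.

```python
import re

# Backreference: \1 must repeat whatever quote character group 1 captured.
quoted = re.compile(r"""(['"])(.*?)\1""")
inner = quoted.search('say "hi" now').group(2)
mismatched = quoted.search("say 'hi\" now")   # opening ' and closing " differ

# Negative look-behind: match "pro" only when it is not preceded by "go".
not_gopro = re.compile(r"(?<!go)pro")
in_gopro = not_gopro.search("gopro")
in_plain = not_gopro.search("a pro tip")

print(inner, mismatched, in_gopro, in_plain is not None)
```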
> So back to the specific case in point:
>
> ^[[:alpha:]]{3}_[[:digit:]]{2}.*$
>
> Given the above, we can parse that as:
>
> ^ Left anchor (begin the line with what follows):
>
> [[:alpha:]] one alphabetic character
>
> {3} match the previous exactly three times
>
> _ (matches itself)
>
> [[:digit:]] one digit
>
> {2} match the previous exactly twice
>
> . any character
>
> * match the previous any number (including none) of times
>
> $ right anchor (end of line)
>
>
> Of course the .*$ aren't actually needed, since without them the match is
> simply left-anchored only, but I like the explicit "the rest of the line
> doesn't matter for the match" that .*$ provides. And in non-pan usages
> where you're matching to delete or replace the match, it COULD matter, as
> failing to include the .*$ would leave any other junk on the line still
> there, while including it would match and thus delete/replace the entire
> line.
>
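
The full expression can be exercised outside Pan as well. The sketch below is an approximation, not Pan's behaviour: Python's `re` module has no POSIX `[:alpha:]` or `[:digit:]` classes, so the ASCII ranges `[a-z]` and `[0-9]` are substituted, and `re.IGNORECASE` stands in for the case-insensitive matching described above. The subject lines are hypothetical.

```python
import re

# ASCII approximation of ^[[:alpha:]]{3}_[[:digit:]]{2}.*$
pattern = re.compile(r"^[a-z]{3}_[0-9]{2}.*$", re.IGNORECASE)

# Hypothetical subject lines, invented for illustration.
subjects = [
    "abc_12 holiday pictures",
    "ABC_99",
    "ab_12 only two letters",
    "abcd_12 four letters",
    "abc_1x one digit",
]
selected = [s for s in subjects if pattern.search(s)]
print(selected)

# The trailing .*$ only matters when the match is replaced or deleted.
prefix_only = re.sub(r"^[a-z]{3}_[0-9]{2}", "", "abc_12 rest", flags=re.IGNORECASE)
whole_line = re.sub(r"^[a-z]{3}_[0-9]{2}.*$", "", "abc_12 rest", flags=re.IGNORECASE)
print(repr(prefix_only), repr(whole_line))
```

For pure selection the two forms pick the same lines; the difference shows up only in the substitution at the end, where the shorter pattern leaves the rest of the line in place.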
kr Heinz