regexp strangeness
From: Kay Nick
Subject: regexp strangeness
Date: Sat, 8 Feb 2020 12:47:20 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.2
Hey all,
the documentation for regexp says:
'\w'
Match any word character
What exactly is a word character (and, maybe even more important, what isn't)?
Am I right in assuming it's
[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ], i.e. letters? What
about non-English characters like öäßłńŚ?
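
A small probe can answer this empirically instead of by guessing. The loop below is only a sketch: the candidate list and the variable names are made up for illustration, and the pattern is written in single quotes so that the backslash reaches regexp unchanged. Non-ASCII candidates such as 'ö' can be appended to the list; what they report may depend on the Octave build and the string encoding, so no result is claimed for them here.

```octave
% Hypothetical probe: which single characters does \w accept?
candidates = {'a', 'Z', '5', '_', '.', '#', ' ', '-'};
results = false(1, numel(candidates));
for k = 1:numel(candidates)
  c = candidates{k};
  % ^ and $ anchor the pattern so the whole one-character string must be a word character.
  results(k) = ~isempty(regexp(c, '^\w$', 'once'));
  printf("'%s' -> %d\n", c, results(k));
end
```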
And here is some other behavior that seems strange to me:
>> regexp("#w#","#\w#")
ans = 1          <- seems to work in general, as expected...
>> regexp("#d#","#\w#")
ans = [](1x0)    <- why does this happen? I've provided a word character (a letter)
>> regexp("#d#","#\\w#")
ans = 1          <- Ahhh, so we need to double-escape these special characters... no mention of that in the help...
>> regexp("#j#","#\\w#")
ans = 1          <- ok, seems to work fine...
>> regexp("#E#","#\\w#")
ans = 1          <- ok
>> regexp("#E#","#\\w*#")
ans = 1          <- ok
>> regexp("##","#\\w*#")
ans = 1          <- ok
>> regexp("#.#","#\\w*#")
ans = [](1x0)    <- why? The asterisk (*) is supposed to match zero or more times. Here a letter occurs zero times, so it should match...
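
The double-escaping seen in the session above comes from how Octave reads double-quoted strings: backslash escapes are processed before regexp ever sees the pattern, whereas single-quoted strings keep the backslash literally. A minimal sketch of the comparison follows; the variable names are only illustrative.

```octave
% Double quotes: "\\" is an escaped backslash, so the pattern text is #\w#.
pat_double = "#\\w#";
% Single quotes: no escape processing, so this is the same four characters.
pat_single = '#\w#';

disp(strcmp(pat_double, pat_single));  % are the two pattern strings identical?
disp(regexp("#d#", pat_single));       % start index of the match, if any
```

Writing patterns in single quotes avoids the doubled backslashes altogether.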
The last one in particular, regexp("#.#","#\\w*#") returning [](1x0), looks
like a bug to me. Or am I getting something wrong here?
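
One way to narrow this down is to test the pieces of the pattern separately. The sketch below is a diagnostic under my own choice of variable names, not a statement about regexp internals. It runs the full pattern, the pattern without its trailing '#', and a variant that uses \W (non-word characters) between the two '#'.

```octave
str = '#.#';

% Full pattern from the session above (single quotes, so \w reaches regexp as-is).
full_start = regexp(str, '#\w*#');

% Same pattern without the trailing '#': does '#\w*' match on its own?
[part_start, part_end] = regexp(str, '#\w*');

% Same shape with \W (non-word characters) in the middle instead of \w.
nonword_start = regexp(str, '#\W*#');

disp(full_start);
disp(part_start);
disp(part_end);
disp(nonword_start);
```

If the second and third calls find matches while the first does not, that would suggest the '.' sitting between the two '#' is what blocks the match, since \w* cannot consume it, and not the '*' quantifier itself.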
Thanks
Kay