bug#22059: grep -E: unexpected behaviour
From: Charles
Subject: bug#22059: grep -E: unexpected behaviour
Date: Mon, 30 Nov 2015 10:27:55 +0530

As expected:
# grep -E 'udisksd\[[[:digit:]]+\]: The string .* ' /var/log/syslog.1
Nov 30 07:16:38 CW8 udisksd[2650]: The string `TSSTcorp CDDVDW SHQeò?
±?¾MUæíE³èBãÄL' is not valid UTF-8. Invalid characters begins at `eò?
±?¾MUæíE³èBãÄL'
Nov 30 07:16:38 CW8 udisksd[2650]: The string `TSSTcorp CDDVDW SHQeò?
±?¾MUæíE³èBãÄL' is not valid UTF-8. Invalid characters begins at `eò?
±?¾MUæíE³èBãÄL'
But append an i to the pattern and the behaviour is unexpected:
# grep -E 'udisksd\[[[:digit:]]+\]: The string .* i' /var/log/syslog.1
[no output]
Apparently grep silently stops processing when it encounters the invalid UTF-8:
# grep -E --only-matching 'udisksd\[[[:digit:]]+\]: The string .* ' /var/log/syslog.1 | tail -1
udisksd[2650]: The string `TSSTcorp CDDVDW
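A possible cross-check, offered as a sketch rather than a diagnosis: if the UTF-8 locale's handling of the invalid bytes is what stops `.*`, the same pattern should match in the C locale, where every byte counts as a single character. The sample line and the function names below are made up for illustration; the octal escapes \202 and \362 stand for the bytes 0x82 and 0xf2.

```shell
# Hypothetical sample line: ASCII text with two invalid UTF-8 bytes
# (\202 = 0x82, \362 = 0xf2) in the middle of the quoted string.
sample() {
    printf 'udisksd[2650]: The string `SHQ\202e\362 is not valid UTF-8.\n'
}

# In the C locale each byte is one character, so `.*` can span the odd
# bytes; -a asks grep to treat the input as text rather than binary.
count_c_locale() {
    sample | LC_ALL=C grep -a -c -E 'udisksd\[[[:digit:]]+\]: The string .* i'
}

# Prints the number of matching lines.
count_c_locale
```

Running the real command as `LC_ALL=C grep -a -E ... /var/log/syslog.1` would be the equivalent test on the log itself; whether that is an acceptable workaround depends on whether the pattern needs multibyte-aware matching.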
In case the specific unusual characters are relevant, here they are in hex:
# grep -E 'udisksd\[[[:digit:]]+\]: The string .* ' /var/log/syslog.1 | head -1 | cut --delimiter=' ' --fields=10-11 | od -x
0000000 4853 8251 f265 88d0 b120 b8d3 4dbe e655
0000020 45ed e8b3 e342 4cc4 0a27
0000032
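A note on reading this dump, with a small sketch: `od -x` prints 16-bit words in host byte order, so on a little-endian machine (an assumption here, though the first word 4853 decoding to "SH" supports it) the two bytes of each word appear swapped. The helper name `swap_words` is made up for illustration.

```shell
# Swap the two bytes of every 4-digit hex word, turning little-endian
# `od -x` words back into the byte order found in the file.
swap_words() {
    sed -E 's/([0-9a-f]{2})([0-9a-f]{2})/\2 \1/g'
}

# First four words of the dump; 53 48 51 are the ASCII codes for "SHQ".
echo '4853 8251 f265 88d0' | swap_words
# 53 48 51 82 65 f2 d0 88

# Alternatively, one-byte units avoid the swap entirely.
printf 'SHQ' | od -An -tx1
```

Piping the original `cut` output through `od -An -tx1` instead of `od -x` would show the bytes directly in file order.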
When the input contains invalid characters that grep cannot process, a message
could be expected, perhaps suppressible with the -s/--no-messages option,
because the input is (sort of) unreadable.
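To illustrate the kind of check meant here, a rough sketch done outside grep: it assumes the iconv utility is installed and that it exits with a non-zero status on an invalid UTF-8 sequence. The function name `is_valid_utf8` is made up for illustration and is not part of grep.

```shell
# Succeeds (exit status 0) when standard input is valid UTF-8 and fails
# otherwise. Converting UTF-8 to UTF-8 makes iconv validate every
# sequence; the converted output itself is discarded.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Possible use before grepping a log file:
#   is_valid_utf8 < /var/log/syslog.1 || echo 'input is not valid UTF-8' >&2
```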
Version: 2.20 from the Debian Jessie package 2.20-4.1
Charles