RS re processing bug
From: Ron Rechenmacher
Subject: RS re processing bug
Date: Thu, 17 Jan 2008 12:26:05 -0600
Hi,
BACKGROUND:
I wanted to filter some of the ' [0-9]* files...\r' "records" out of
"rsync --progress ..." output. This worked, except that the output
seemed to be one line behind (see the simple example below).
A Google search for: gawk rs problem
showed that others would be interested in a fix for this bug, or in such a feature.
SIMPLEST BUG RECREATION ENV.:
(sleep 3;echo `date +%s`;\
sleep 3;echo `date +%s`;\
sleep 3) | gawk \
'BEGIN { RS="[\n]" } {printf "%d %s%c",systime(),$0,RT; fflush();}'
Compare the output with RS="[\n]" against the output with RS="\n".
It would be nice if the times (numbers) on each line were the same in both
cases.
A slightly more sophisticated example:
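For reference, the two RS settings split the same complete input into the same records; only the moment at which each record is delivered differs. A minimal sketch of that check, assuming gawk is on the PATH (split_records is a name made up for this illustration, not part of gawk):

```shell
# split_records is a hypothetical helper for this illustration.
# $1 is the RS value; gawk processes escapes such as \n in -v assignments,
# so '\n' becomes a single newline and '[\n]' becomes a one-character
# bracket expression (treated as a regular expression).
split_records() {
    printf 'one\ntwo\nthree\n' | gawk -v RS="$1" '{ printf "<%s>", $0 }'
}

split_records '\n';   echo    # single-character RS
split_records '[\n]'; echo    # regular-expression RS
```

Both calls should print the same three records, which suggests the delay seen in the timed example comes from how the regular-expression path waits for input, not from a different split.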
(sleep 3;echo `date +%s`;\
sleep 3;echo -n -e "`date +%s`\r";\
sleep 3;echo -n -e "`date +%s`\r";\
sleep 3;echo -e "\n`date +%s`";\
sleep 3) | gawk \
'BEGIN { RS="[\n\r]" } {printf "%d %s%c",systime(),$0,RT; fflush();}'
ENVIRONMENT:
built gawk-3.1.6 on SLF5 (which is like RHEL5 I think).
ANALYSIS
REs are tricky. A standard rule is to return the longest match.
So, in general, if you have a match at the end of an input buffer,
you do not know whether the match would grow if you were to receive
more input. BUT, it is often the case that the
RE IS SIMPLE and you do know that the match would not grow!
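To illustrate the difference between an RE whose match can grow and one whose match cannot, here is a small sketch, assuming gawk is on the PATH (show_rt is a name invented for this example). It prints each record together with the length of RT, the text that terminated it:

```shell
# show_rt is a hypothetical helper for this illustration.
# It feeds a fixed input containing a run of three newlines to gawk and
# prints every record as [record|length of its terminator RT].
show_rt() {
    printf 'a\n\n\nb\n' | gawk -v RS="$1" '{ printf "[%s|%d]", $0, length(RT) }'
}

show_rt '\n+';  echo    # terminator may span several newlines
show_rt '[\n]'; echo    # terminator is always exactly one character
```

With RS="\n+" the first terminator is three characters long, so a match that ends exactly at a buffer boundary might still be extended by later input. With RS="[\n]" every terminator has length one, which is the "simple" case where waiting for more input cannot change the match.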
SUGGESTED PATCH: (sorry if this is not in exactly the right format; I'm new at this
and would welcome any reply explaining a better way)
See attached patch against gawk-3.1.6 source.
I created the patch via:
cd gawk-3.1.6
for ff in awk.h io.c re.c;do diff -u $ff{.~1~,};done >../re_len.patch
Note, as the definition of "SIMPLE" is vague, the patched code behaves
differently when RS="\n|\r" and when RS="[\n\r]" but this makes at least
a little sense :)
Thanks,
Ron
Attachment: re_len.patch