bug-gawk

Re: FPAT is not working as expected


From: Arthur Schwarz
Subject: Re: FPAT is not working as expected
Date: Mon, 14 Dec 2020 08:40:52 -0800
User-agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.5.1

It does a lot better than the previous version but there are still issues.

1:    "line 1",,,"http://file.a/A%20Guide%20";
        <4: http>       http is wrong

2:    "line 2",,,"https://www.whitgt.pdf";
        same issue as 1:

3:    "line 3, and xyz",,,"http://www.c/main.pdf";
        same issue as 1: but note that the embedded ',' is treated correctly

4:    "line 4 "" and abc",,,http://file.a/A%20Guide%20
        embedded "" treated correctly and http: recognized correctly

5:    line 5,,,https://www.whitgt.pdf
        all recognized correctly

6:    line 6,,,http://file.a/A%20Guide%20
        all recognized correctly

errata:
1:    All lines with a quoted string recognize an extra field
2:    The last output of all lines is incorrectly formatted:
        ">  <5:" instead of "<5: >"  this may be a programming
        error but I can't seem to locate it.
5:    split($0, array) is uniformly incorrect.
        The GNU Awk manual says that FPAT is used as the regular expression,
        and there are words to the effect that the resulting split will be
        the same as in normal input processing. This does not seem to be
        the case.


On 12/14/2020 7:11 AM, Jannick wrote:
> On Sun, 13 Dec 2020 17:11:50 -0800, Arthur Schwarz wrote:
>> 1:    FPAT = /<pattern>/ does not seem to work.
>
> Valid syntax: FPAT = @/<pattern>/
>
>> --------------------------------- CODE ---------------------------------
>>          FPAT         = "([^,]*)|(\"([^\"]|\"\")\")" # CSV field separator
>
> This should do the job:
>           FPAT         = "(\"([^\"]|\"\")+\"|[^,\"]*)"
>
> HTH,
> J.



