Re: FPAT is not working as expected
From: Arthur Schwarz
Subject: Re: FPAT is not working as expected
Date: Mon, 14 Dec 2020 08:40:52 -0800
User-agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.5.1

It does a lot better than the previous version but there are still issues.

1: "line 1",,,"http://file.a/A%20Guide%20"
   <4: http> http is wrong
2: "line 2",,,"https://www.whitgt.pdf"
   same issue as 1:
3: "line 3, and xyz",,,"http://www.c/main.pdf"
   same issue as 1:, but note that the embedded ',' is treated
   correctly
4: "line 4 "" and abc",,,http://file.a/A%20Guide%20
   embedded "" treated correctly and http: recognized correctly
5: line 5,,,https://www.whitgt.pdf
   all recognized correctly
6: line 6,,,http://file.a/A%20Guide%20
   all recognized correctly

errata:
1: Every line that contains a quoted string produces an extra field.
2: The last output of every line is incorrectly formatted:
   "> <5:" instead of "<5: >". This may be a programming
   error on my side, but I can't seem to locate it.
5: split($0, array) is uniformly incorrect.
   According to the GNU Awk manual, FPAT is used as the regular
   expression, and there is wording to the effect that the resulting
   split will be the same as in normal input processing. This does
   not seem to be the case.

On 12/14/2020 7:11 AM, Jannick wrote:
> On Sun, 13 Dec 2020 17:11:50 -0800, Arthur Schwarz wrote:
>> 1: FPAT = /<pattern>/ does not seem to work.
>
> Valid syntax: FPAT = @/<pattern>/
>
>> --------------------------------- CODE ---------------------------------
>> FPAT = "([^,]*)|(\"([^\"]|\"\")\")" # CSV field separator
>
> This should do the job:
>
> FPAT = "(\"([^\"]|\"\")+\"|[^,\"]*)"
>
> HTH,
> J.