Re: puzzled by the result of this multi-dimensional array code
From: Neil R. Ormos
Subject: Re: puzzled by the result of this multi-dimensional array code
Date: Sun, 21 Apr 2024 11:04:20 -0500 (CDT)
User-agent: Alpine 2.20 (DEB 67 2015-01-07)
Peter Lindgren wrote:
> Neil R. Ormos wrote:
>
>> (b) checking the program and the input data for
>> characters other than printable ASCII and
>> newlines; and
>
> I found that the original input data had
> Windows/dos line endings (CRLF - 0d0a). My
> subset of the test data still had such endings,
> since I had simply grepped out those lines and
> redirected them to my test file. Removing the
> extraneous CR / 0d / ^M character JUST FROM THE
> TITLE LINE fixed the problem.
> This seems to imply that the split function is
> more sensitive to such line endings than the
> regular automatic record parsing??
I don't think split() is *more* sensitive to the carriage-return character $0D
(called \r in the Gawk manual).
In general, Gawk does not treat CR as a special character in input data unless
you are running Gawk in certain environments on Microsoft Windows or the
like[*]. Otherwise, both the automatic field parsing of records (whether read
from standard input, from files, or via getline) and explicit splitting with
split() treat a CR like any other ordinary character, and leave it in $0 and in
whichever of the automatically-split fields $1 .. $NF it falls in.
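Here is a minimal sketch that shows both paths behaving the same way. The
sample line and the comma separator are made up for illustration (I am assuming
nothing about your real data), and it uses plain awk on a Unix-like system:

```shell
# Hypothetical comma-separated line with a CRLF ending (not the original data).

# Automatic field splitting: the CR stays attached to the last field.
auto_len=$(printf 'id,item6\r\n' | awk -F, '{ print length($NF) }')

# Explicit split(): same result, the CR stays in the last array element.
split_len=$(printf 'id,item6\r\n' |
    awk '{ n = split($0, parts, ","); print length(parts[n]) }')

# Both lengths are 6: the five characters of "item6" plus the CR.
echo "$auto_len $split_len"
```

Checking length() like this is a cheap way to spot an invisible trailing
character in a field.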
It appears from your explanation that the first input line ended in a CRLF
($0D$0A, \r\n), but gawk doesn't treat the CR as special[*], so in your
program, titles[8] contained the field name followed by CR.
In show(), when the for loop reached the 8th field of the second input line,
and executed this:
    print x, titles[x], data[c][titles[x]]
Gawk printed:
"8 item6^ 19" (using ^ here to represent a CR character).
The CR moved the cursor back to column 1, so the following " 19" overwrote the
first three characters, "8 i", and the terminal displayed " 19tem6".
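To reproduce the effect in isolation, here is a small sketch. The literal
values are stand-ins for x, titles[x], and data[c][titles[x]]; stripping the CR
with sub() is one possible fix, not necessarily the one you used:

```shell
# Hypothetical stand-ins for x, titles[x], and data[c][titles[x]].
raw=$(awk 'BEGIN { print 8, "item6\r", 19 }')

# Same print after stripping a trailing CR from the title.
fixed=$(awk 'BEGIN { t = "item6\r"; sub(/\r$/, "", t); print 8, t, 19 }')

printf '%s\n' "$raw"     # on a terminal, the CR makes this show as " 19tem6"
printf '%s\n' "$fixed"   # shows "8 item6 19"
```

In a real program, one option would be to call sub(/\r$/, "") at the top of the
main rule, which strips a trailing CR from $0 before any field is used.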
> In any case, all's well.
I'm glad you were able to get it working.
[*] The Gawk manual explains Gawk's special handling of CRLF line endings when
running on PC operating systems, such as Microsoft Windows:
<https://www.gnu.org/software/gawk/manual/gawk.html#PC-Using>