help-gawk

Re: puzzled by the result of this multi-dimensional array code


From: Neil R. Ormos
Subject: Re: puzzled by the result of this multi-dimensional array code
Date: Sun, 21 Apr 2024 11:04:20 -0500 (CDT)
User-agent: Alpine 2.20 (DEB 67 2015-01-07)

Peter Lindgren wrote:

> Neil R. Ormos wrote:
>
> (b) checking the program and the input data for
>     characters other than printable ASCII and
>     newlines; and

> I found that the original input data had
> Windows/dos line endings (CRLF - 0d0a). My
> subset of the test data still had such endings,
> since I had simply grepped out those lines and
> redirected them to my test file. Removing the
> extraneous CR / 0d / ^M character JUST FROM THE
> TITLE LINE fixed the problem.

> This seems to imply that the split function is
> more sensitive to such line endings than the
> regular automatic record parsing??

I don't think split() is *more* sensitive to the carriage-return character $0D 
(called \r in the Gawk manual).

Overall, Gawk does not treat CR as a special character in input data unless you 
are running Gawk in certain environments on Microsoft Windows or the like[*].  
Otherwise, both the automatic field splitting of records (whether read from 
standard input, from files, or via getline) and explicit splitting with split() 
treat a CR like any other ordinary character, so it stays in $0 and in 
whichever of the fields $1 .. $NF it happens to fall in.
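
To illustrate, here is a minimal sketch, assuming a Unix-like system where awk 
does no CRLF translation. The sample record and the helper name 
last_field_lengths are made up for this example and are not from your program. 
It reports the length of the last field as seen by automatic field splitting 
and by split(); both should count the trailing CR.

```shell
# Hypothetical helper: feed one CRLF-terminated, comma-separated record to awk
# and print the length of the last field twice, first from automatic field
# splitting ($NF) and then from an explicit split() call.
last_field_lengths() {
  printf 'id,name,item6\r\n' |
    awk -F, '{
      n = split($0, parts, ",")
      print length($NF), length(parts[n])
    }'
}

last_field_lengths
```

"item6" is five characters long, so a result of 6 from both methods would mean 
that each one kept the CR as part of the last field.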

It appears from your explanation that the first input line ended in a CRLF 
($0D$0A, \r\n).  Gawk doesn't treat the CR as special[*], so in your program, 
titles[8] contained the field name followed by a CR.

In show(), when the for loop reached the 8th field of the second input line, 
and executed this:

  print x, titles[x], data[c][titles[x]]

Gawk printed:

  "8 item6^ 19"   (using ^ here to represent a CR character).

The CR moved the cursor to position 1.  Then " 19" overwrote the first three 
characters "8 i", resulting in " 19tem6".
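
One way to guard against this is to strip a trailing CR from each record 
before the fields are used. The following is only a sketch with made-up sample 
data and a hypothetical helper name (strip_cr_last_field), not your actual 
program. It prints the last field in brackets so that any leftover CR would be 
easy to spot.

```shell
# Hypothetical helper: remove one trailing CR from each record, then print
# the last field in brackets.
strip_cr_last_field() {
  printf 'id,name,item6\r\n7,foo,19\r\n' |
    awk -F, '{
      sub(/\r$/, "")        # changing $0 makes awk split the fields again
      print "[" $NF "]"
    }'
}

strip_cr_last_field
```

In Gawk specifically, setting RS to a regular expression such as "\r?\n" 
should have a similar effect, but regex RS is a Gawk extension, so the sub() 
form is the more portable choice.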

> In any case, all's well.

I'm glad you were able to get it working.


[*] The Gawk manual explains Gawk's special handling of CRLF line endings when 
using Gawk in PC Operating Systems, such as Microsoft Windows:

  <https://www.gnu.org/software/gawk/manual/gawk.html#PC-Using>


