
Re: [Help-bash] files


From: John McKown
Subject: Re: [Help-bash] files
Date: Thu, 21 Apr 2016 08:18:50 -0500

On Wed, Apr 20, 2016 at 7:09 PM, Val Krem <address@hidden> wrote:

> Hi John and  all,
>
> I have two files the first file is pipe delimited  and the other file is
> space delimited. I want to combine the two files by the first column and
> the final result should be pipe delimited file

> file 1
>
> A123|24|315
> A125|63|450
>
> file 2
> A123 009 163
> A125 091 112
>
> i want the result
> A123|24|315|009|163
> A125|63|450|091|112
>
> I tried  join and awk but failed to work for me.
>

Well, not a Bash solution, per se, but I have a gawk (not generic awk)
solution:

$ cat f1
A123|24|315
A125|63|450
$ cat f2
A123 009 163
A125 091 112
$ cat f1-f2.awk
#!/usr/bin/gawk -f
# The field separator is supplied on the command line: -F '[| ]'
{ b = $1; a[b] = (a[b] gensub(/^[^| ]+/, "", 1, $0)) } # magic line
END {
    PROCINFO["sorted_in"] = "@ind_str_asc"; # iterate in ascending index (string) order
    for (b in a) {
        print b gensub(/ /, "|", "g", a[b]);
    }
}
$ gawk  -F '[| ]' -f f1-f2.awk f1 f2
A123|24|315|009|163
A125|63|450|091|112

The "magic line" does the join work. The "b" variable just holds the array
index value. The "a" variable is an associative array that contains the
"built up" result for the value in "b". This works in AWK because an index
that is not already in the array (1) has the value "" when referenced and
(2) is added to the array dynamically when assigned to. The gensub() call
removes the index value (the equivalent of $1) from the line before the
remainder is concatenated onto the a[b] array element. The print command
later "re-adds" this value.
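
To watch the accumulation happen, here is a minimal sketch that feeds two
sample lines sharing the key A123 through a one-rule awk program and prints
the array element after each line. It is an illustration only, not the
script above: it uses the portable sub() on a copy of the line (the "rest"
variable is my own name) instead of gawk's gensub(), so any POSIX awk should
do.

```shell
# Two sample lines with the same key, one pipe delimited and one space delimited.
out=$(printf '%s\n' 'A123|24|315' 'A123 009 163' |
  awk -F '[| ]' '
    {
      rest = $0
      sub(/^[^| ]+/, "", rest)   # strip the key, keep the leading delimiter
      a[$1] = a[$1] rest         # a[$1] is "" the first time the key is seen
      print NR ": a[" $1 "] = \"" a[$1] "\""
    }')
printf '%s\n' "$out"
# 1: a[A123] = "|24|315"
# 2: a[A123] = "|24|315 009 163"
```

The second line still carries spaces at this point; they are only turned
into pipes when the element is printed.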

In the for(...) command in the END{...} portion, the index values are
iterated over in ascending string order of the index (without the PROCINFO
setting, the order is unspecified, effectively "random"). The print then
prints the index value followed by the array value accumulated previously.
The gensub(...) there changes every " " (space) delimiter to a "|" (pipe)
delimiter. The -F '[| ]' sets the AWK field separator to either a single
pipe (|) or a single space character.
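
If gawk is not available, the same two END steps can probably be
approximated in plain awk. The sketch below is a substitute technique, not
the script above: gsub() on a copy stands in for gensub(), and an external
sort stands in for PROCINFO["sorted_in"]. The temporary files and the
variable names (tmp, joined, rest, k, v) are assumptions for the example;
the input lines are deliberately given out of key order.

```shell
# Hypothetical sample files mirroring f1 and f2, written in reverse key order.
tmp=$(mktemp -d)
printf '%s\n' 'A125|63|450' 'A123|24|315' > "$tmp/f1"
printf '%s\n' 'A125 091 112' 'A123 009 163' > "$tmp/f2"

joined=$(awk -F '[| ]' '
  {
    rest = $0
    sub(/^[^| ]+/, "", rest)     # drop the key, keep the leading delimiter
    a[$1] = a[$1] rest           # append the remainder to the entry for this key
  }
  END {
    for (k in a) {               # iteration order is unspecified here
      v = a[k]
      gsub(/ /, "|", v)          # every space becomes a pipe
      print k v
    }
  }' "$tmp/f1" "$tmp/f2" | sort) # sort puts the lines back in key order
rm -rf "$tmp"
printf '%s\n' "$joined"
# A123|24|315|009|163
# A125|63|450|091|112
```

Sorting whole output lines works here only because each line begins with
its key; the gawk setting sorts on the index itself.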


-- 
"He must have a Teflon brain -- nothing sticks to it"
Phyllis Diller

Maranatha! <><
John McKown

