Re: [Help-bash] unique
From: Davide Brini
Subject: Re: [Help-bash] unique
Date: Sun, 24 Apr 2016 15:57:24 +0200
On Sat, 23 Apr 2016 19:20:46 +0000 (UTC), Val Krem <address@hidden>
wrote:
> Hi all,
>
>
> I have a file with several variables. Sample of data is below.
> I want to count the unique occurrence of Name (column 1) in column 2(V1)
> and columns 3(V2).
>
> Name v1 v2
> ABX123 12 125
> ABX123 12 135
>
> ABX123 13 113
> AcX222 12 225
> AcX222 12 235
> AcX222 13 213
> AcX222 13 313
>
> AcX222 14 413
>
> AdX222 14 512
>
> The output should like
>
> ABX123 2 3
>
> AcX123 2 5
> AdX222 1 1
Assuming no empty lines in the input (otherwise, explain why they are
where they are), no header line, that all lines for a given name are
adjacent, and that the "AcX123 2 5" line in the output should really be
"AcX222 3 5", one way is with awk:
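awk '
# New name: print the counts for the previous one, then reset.
!($1 in k) {
  if (key != "") print key, cv1, cv2
  cv1 = cv2 = 0
  k[$1]; key = $1
  split("", v1, ""); split("", v2, "")   # portable way to empty an array
}
# Count each v1 and each v2 value only once per name.
!($2 in v1) {
  cv1++; v1[$2]
}
!($3 in v2) {
  cv2++; v2[$3]
}
END {
  if (key != "") print key, cv1, cv2
}' file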
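
If the header and the empty lines really are in the file, one possible
workaround is to skip them before the counting rules run. The following is
only a sketch: the function name count_unique_grouped is made up for this
example, and it assumes the header line has the literal word "Name" in its
first field and that lines for the same name are still adjacent.

```shell
# count_unique_grouped: hypothetical wrapper name, not from the original post.
# Assumption: the header (if present) has "Name" as its first field.
# Assumption: all lines for one name are adjacent in the input.
count_unique_grouped() {
    awk '
    # Skip empty lines and the header before any counting happens.
    NF == 0 || $1 == "Name" { next }
    # New name: print the counts for the previous one, then reset.
    !($1 in k) {
        if (key != "") print key, cv1, cv2
        cv1 = cv2 = 0
        k[$1]; key = $1
        split("", v1, ""); split("", v2, "")
    }
    !($2 in v1) { cv1++; v1[$2] }
    !($3 in v2) { cv2++; v2[$3] }
    END {
        if (key != "") print key, cv1, cv2
    }' "$1"
}
```

Called as "count_unique_grouped file", it should print one line per name
with the two counts, in the order the names appear in the input.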
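With GNU awk's true multidimensional arrays (arrays of arrays, gawk 4.0 or
later) it's simpler, though the for-in loop in END prints the names in
unspecified order: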
awk '
{
  v1[$1][$2]++; v2[$1][$3]++
}
END {
  for (k in v1) print k, length(v1[k]), length(v2[k])
}' file
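
Where gawk is not available, roughly the same result can probably be had
in any POSIX awk by using composite (name, value) subscripts instead of
nested arrays. This is a sketch rather than a tested replacement: the
function name count_unique_portable is invented here, the input is assumed
to have no header and no empty lines, and the output is piped through sort
only to get a stable order.

```shell
# count_unique_portable: hypothetical helper name, not from the original post.
# Assumption: no header line and no empty lines in the input.
count_unique_portable() {
    awk '
    # First time this (name, v1) pair is seen: one more distinct v1 for name.
    !seen1[$1, $2]++ { c1[$1]++ }
    # Same idea for the (name, v2) pair.
    !seen2[$1, $3]++ { c2[$1]++ }
    END {
        for (k in c1) print k, c1[k], c2[k]
    }' "$1" | LC_ALL=C sort
}
```

Unlike the first script, this one does not need the lines for a name to be
adjacent, at the cost of keeping every distinct pair in memory until END.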
--
D.