
Re: [Help-bash] unique


From: Davide Brini
Subject: Re: [Help-bash] unique
Date: Sun, 24 Apr 2016 15:57:24 +0200

On Sat, 23 Apr 2016 19:20:46 +0000 (UTC), Val Krem <address@hidden>
wrote:

> Hi all,
> 
> 
> I have a file with  several variables. Sample of data is below.
> I want to count the unique occurrence of Name  (column 1) in column 2(V1)
> and columns 3(V2).
> 
>   Name v1    v2 
> ABX123  12  125 
> ABX123  12  135
> 
> ABX123  13  113
> AcX222  12  225
> AcX222  12  235
> AcX222  13  213
> AcX222  13  313
> 
> AcX222  14  413
> 
> AdX222 14  512
> 
> The output should like 
> 
> ABX123  2   3
> 
> AcX123  2   5
> AdX222 1    1

Assuming no empty lines in the input (otherwise, explain why they are
there), no header line, that all lines for a given name are adjacent, and
that the "AcX123 2 5" line in the output should really be "AcX222 3 5",
one way is with awk:

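awk '
  # A name not seen before starts a new group.
  !($1 in k){
    # Print the totals of the previous group, if there was one.
    if (key != "") print key, cv1, cv2
    cv1 = cv2 = 0
    k[$1]; key = $1
    # Portable way to empty both per-group arrays.
    split("", v1, ""); split("", v2, "")
  }

  # Count each column 2 value once per group.
  !($2 in v1){
    cv1++; v1[$2]
  }
  # Count each column 3 value once per group.
  !($3 in v2){
    cv2++; v2[$3]
  }
  END{
    # Print the totals of the last group.
    if(key != "") print key, cv1, cv2
  }' file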

With GNU awk's arrays of arrays (gawk 4.0 or later) it's simpler, and the
names need not be grouped, though the output order is unspecified:

awk '
  {
    v1[$1][$2]++; v2[$1][$3]++
  }
  END {
    for (k in v1) print k, length(v1[k]), length(v2[k])
  }' file
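
If gawk is not available, or the names are not grouped together, something
along these lines might also do with any POSIX awk. It is only a sketch: it
keys the "seen" arrays on (name, value) pairs instead of using arrays of
arrays, and prints names in first-seen order. The helper name count_unique
is made up for illustration, and skipping the first line and blank lines is
an assumption based on the sample in the question.

```shell
# count_unique is a hypothetical helper name, not something from the thread.
count_unique() {
  awk '
    # Assumption: the first line of each file is a header, blank lines are noise.
    FNR == 1 || NF == 0 { next }
    {
      # Remember each name in first-seen order and start its counters at 0.
      if (!($1 in c1)) { names[++n] = $1; c1[$1] = 0; c2[$1] = 0 }
      # Count a (name, value) pair only the first time it appears.
      if (!(($1, $2) in seen1)) { seen1[$1, $2]; c1[$1]++ }
      if (!(($1, $3) in seen2)) { seen2[$1, $3]; c2[$1]++ }
    }
    END {
      for (i = 1; i <= n; i++) print names[i], c1[names[i]], c2[names[i]]
    }' "$@"
}

# Sample data from the question, header and blank lines included.
result=$(count_unique <<'EOF'
  Name v1    v2
ABX123  12  125
ABX123  12  135

ABX123  13  113
AcX222  12  225
AcX222  12  235
AcX222  13  213
AcX222  13  313

AcX222  14  413

AdX222 14  512
EOF
)
printf '%s\n' "$result"
```

With the sample data this should print "ABX123 2 3", "AcX222 3 5" and
"AdX222 1 1", one per line. For a real file, call it as: count_unique file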


-- 
D.


