Re: joining records depending on special conditions - a job for gawk?
From: David Resnick
Subject: Re: joining records depending on special conditions - a job for gawk?
Date: 19 Aug 2005 07:11:28 -0700
User-agent: G2/0.2
olli wrote:
> Hello
>
> for special purposes I "constructed" a TAB-delimited text file - here is a
> little excerpt with field names:
>
> mainno subno asciino(a) asciino(b) ...
> 000027 00 66 69
> 000027 00 66 69
> 000027 00 66 69
> 000035 00 154 100
> 000035 00 154 100
> 000035 00 154 114
> ...
>
> For every distinct combination of mainno & subno I need a count of the
> occurrences of asciino(a..NF) in a special range (e.g.: count only if $X
> is between 128 and 255 and not in (196,225))
>
> the result should look like
>
> mainno subno COUNT
> 000027 00 5
> 000035 00 19
> 000426 01 0
> 000701 03 77
> ...
>
> I don't want to use a database for this!
> I know how to solve this problem if I had mainno & subno (eg. 000027
> 00) only on one line as $1 & $2 followed by the contents of all
> asciino-fields belonging to this combination of mainno & subno.
>
> Any ideas how to get rid of this problem?
>
> Thanks in advance
>
> oliver
How about something like this? Or did you want gawk instead
of Perl for some reason?
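#!/usr/local/bin/perl
use strict;
use warnings;

my %combos;
while (<>) {
    chomp;
    # Fields: mainno, subno, then asciino(a) .. asciino(NF).
    my ($mainno, $subno, @asciinos) = split ' ', $_;
    # Skip blank lines and a header row (assumes mainno is all digits).
    next unless defined $subno && $mainno =~ /^\d+$/;
    my $key = "$mainno $subno";
    $combos{$key} += 0;    # list the combination even if its count stays 0
    foreach my $n (@asciinos) {
        next unless $n =~ /^\d+$/;
        # Count 128..255 inclusive, reading "not in (196,225)" as
        # excluding the two values 196 and 225.
        $combos{$key}++ if $n >= 128 && $n <= 255 && $n != 196 && $n != 225;
    }
}
foreach (sort keys %combos) { print "$_ $combos{$_}\n"; }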
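If gawk is preferred after all, the count olli describes could be sketched
roughly as below. This is only a sketch under assumptions: the function name
count_in_range is made up for illustration, data rows are assumed to start
with an all-digit mainno, the range is taken as 128 to 255 inclusive, and
"not in (196,225)" is read as excluding the two values 196 and 225.

```shell
#!/bin/sh
# count_in_range: hypothetical helper name, not from the original post.
# Reads records of the form "mainno subno asciino..." from the named files
# (or stdin) and prints one "mainno subno COUNT" line per combination.
count_in_range() {
    awk '
        $1 ~ /^[0-9]+$/ {            # skip a header row or other non-data lines
            key = $1 " " $2
            count[key] += 0          # keep combinations whose count stays 0
            for (i = 3; i <= NF; i++) {
                v = $i + 0           # force a numeric comparison
                if (v >= 128 && v <= 255 && v != 196 && v != 225)
                    count[key]++
            }
        }
        END {
            # Array traversal order is unspecified in awk, hence the sort below.
            for (key in count)
                print key, count[key]
        }
    ' "$@" | sort
}
```

It would be called as "count_in_range data.txt", where data.txt stands in for
the TAB-delimited file. Nothing here is gawk-specific, so other awk
implementations should behave the same way.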
-David