From: Erik Auerswald
Subject: Re: Suggestion: add the possibility to apply multiple operations to a single column (or multiple columns)
Date: Fri, 4 Nov 2022 14:09:15 +0100
Hi,
On Fri, Nov 04, 2022 at 12:24:42PM +0000, Tim Rice wrote:
> >Ranges like 1-2,4 could be less straightforward, especially when
> >combined with the former idea of providing multiple operations
> >simultaneously. When preparing a test for "mean,max,count 1-2,4",
> >should the test output columns like "mean_1, max_1, count_1, mean_2,
> >max_2, count_2, mean_4, max_4, count_4", or "mean_1, max_1, count_1,
> >mean_2, max_2, count_2, mean_4, max_4, count_4", or something else?
>
> Ahem, those two suggestions should be different to each other:
>
> * mean_1, max_1, count_1, mean_2, max_2, count_2, mean_4, max_4, count_4
>
> vs
>
> * mean_1, mean_2, mean_4, max_1, max_2, max_4, count_1, count_2, count_4
My first intuition was that "datamash sum 1,2,3 mean 1,2,3" and
"datamash sum,mean 1,2,3" should produce the same result, thus I would
have expected the second ordering:

$ printf '%s\t%s\t%s\n' 1 10 100 2 20 200 3 30 300 \
    | datamash --header-out sum 1,2,3 mean 1,2,3 | column -t
sum(field-1)  sum(field-2)  sum(field-3)  mean(field-1)  mean(field-2)  mean(field-3)
6             60            600           2              20             200

When starting from the combined expression "datamash sum,mean 1,2,3"
the first ordering might seem more natural.

I do not have a strong opinion on this.
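
To make the two candidate orderings concrete, here is a minimal shell
sketch that only generates the column labels for a combined expression.
The function name "labels" and the ordering names "field-major" and
"op-major" are made up for this illustration and are not part of
datamash; the sketch takes an explicit field list and does not expand
ranges such as 1-2.

```shell
# Hypothetical helper (not part of datamash): print one label per
# (operation, field) pair in the requested order.
#   $1 = ordering: "field-major" (first ordering) or "op-major" (second)
#   $2 = comma-separated operations, $3 = comma-separated fields
labels() {
    order=$1
    ops=$(printf '%s' "$2" | tr ',' ' ')
    fields=$(printf '%s' "$3" | tr ',' ' ')
    out=
    if [ "$order" = op-major ]; then
        # all fields for the first operation, then the next operation
        for op in $ops; do
            for f in $fields; do out="$out ${op}_$f"; done
        done
    else
        # all operations for the first field, then the next field
        for f in $fields; do
            for op in $ops; do out="$out ${op}_$f"; done
        done
    fi
    printf '%s\n' "${out# }"
}

labels field-major mean,max,count 1,2,4
labels op-major mean,max,count 1,2,4
```

The op-major variant is the one that matches what separate
"sum 1,2,3 mean 1,2,3" arguments already produce today.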
Cheers,
Erik