Re: Basic calculation mistakes (e.g. mean/median)

bug-datamash

From:	Andreas Sommer
Subject:	Re: Basic calculation mistakes (e.g. mean/median)
Date:	Wed, 11 Nov 2020 08:07:44 +0100
User-agent:	Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.4.0


On 2020-11-10 14:48, Brandon Invergo wrote:

> Andreas Sommer writes:
>
>> $ seq 1 3 | datamash -H mean 1 median 1
>> mean(1) median(1)
>> 2.5     2.5
>>
>> $ seq 1 3 | datamash -R 5 -H mean 1 median 1
>> mean(1) median(1)
>> 2.50000 2.50000
>>
>> ---
>>
>> $ seq 1 4 | datamash -H -R 2 mean 1 median 1
>> mean(1) median(1)
>> 3.00    3.00
>>
>> Until that gets fixed, it means I can't trust the tool :(
>
> All of those results are correct.  The -H option is synonymous with
> --header-in and --header-out, so the first row (containing the value 1)
> is being treated as a header row not a data row.


Well, that explains a lot. I had strongly expected that `-H` would only print
headers, without side effects. Hiding `--header-out` in a long option seems
strange. Also, other Unix-y tools often use uppercase as negation, e.g. `zfs
list -H` = without printing column headers.

Anyway, I have the solution now, and the developers can take this as a wish to
disambiguate the short options. I can guess that you don't want to change this
parameter, but the documentation should clearly hint at it. The website (e.g.
https://www.gnu.org/software/datamash/examples/) typically first shows an
example `seq ... | datamash [without -H]` and then, in the next paragraph,
`<somefile datamash -H ...`. Looking at the documented example output, a reader
like me might think that `-H` means "print headers".
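
For anyone else who trips over this, here is a minimal sketch that reproduces the numbers from my report without datamash. The `mean` helper is a hypothetical awk stand-in for `datamash mean 1`, and `tail -n +2` is only my approximation of what the `--header-in` half of `-H` does to the first row. It is not how datamash itself is implemented.

```shell
# Hypothetical helper: mean of column 1, standing in for `datamash mean 1`.
mean() { awk '{ s += $1 } END { print s / NR }'; }

# All three rows count as data: (1 + 2 + 3) / 3
seq 1 3 | mean

# First row dropped, roughly what the --header-in half of -H does: (2 + 3) / 2
seq 1 3 | tail -n +2 | mean

# Same for the second report: (2 + 3 + 4) / 3
seq 1 4 | tail -n +2 | mean
```

So what I actually wanted on headerless input was `--header-out` on its own, not `-H`.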

Thanks a lot!
-Andreas
