Re: [Help-bash] simmerize
From: Assaf Gordon
Subject: Re: [Help-bash] simmerize
Date: Sat, 20 Aug 2016 22:26:12 -0400
Hello,
> On Aug 20, 2016, at 21:54, Val Krem <address@hidden> wrote:
>
> I am trying to summarize a huge data set by ID and year.
> Here is the sample of data.
<sorry for the shameless plug, but>
GNU Datamash can do exactly that:
========
$ cat in.txt
ID Year
Ak1 2001
Ak1 2001
Ak1 2001
Ak1 2002
Ak1 2007
Ak1 2003
Ak1 2003
Ak1 2004
Ak2 2001
$ datamash --header-in -W crosstab 1,2 < in.txt
     2001  2002  2003  2004  2007
Ak1  3     1     2     1     1
Ak2  1     N/A   N/A   N/A   N/A
========
The parameters are:
--header-in => treat the first input line as column headers and skip it
-W => use whitespace (one or more spaces and/or tabs) as the field delimiter, instead of TAB
crosstab => the operation to perform (cross-tabulation, a.k.a. "pivot table")
1,2 => the input fields to operate on (here, field 1 is ID and field 2 is Year)
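If datamash is not at hand, the same counting can be roughly sketched in plain awk. This is only an illustration of what the cross-tabulation computes, not how datamash implements it: the helper function sorted_keys and the output file name crosstab.txt are made up for this example, and the output is assumed to be tab-separated with N/A for missing combinations, as in the run above.

```shell
# Recreate the sample input shown above.
cat > in.txt <<'EOF'
ID Year
Ak1 2001
Ak1 2001
Ak1 2001
Ak1 2002
Ak1 2007
Ak1 2003
Ak1 2003
Ak1 2004
Ak2 2001
EOF

# Hypothetical awk stand-in for "datamash --header-in -W crosstab 1,2".
awk '
# Copy the keys of arr into out[1..n] in ascending string order; return n.
function sorted_keys(arr, out,    k, n, i, j, tmp) {
    n = 0
    for (k in arr) out[++n] = k
    for (i = 2; i <= n; i++) {          # plain insertion sort
        tmp = out[i]
        for (j = i - 1; j >= 1 && out[j] > tmp; j--) out[j + 1] = out[j]
        out[j + 1] = tmp
    }
    return n
}
# NR > 1 skips the header line; count each (ID, Year) pair.
NR > 1 { count[$1, $2]++; ids[$1] = 1; years[$2] = 1 }
END {
    ni = sorted_keys(ids, id)
    ny = sorted_keys(years, yr)
    # Header row: empty first field, then one column per year.
    line = ""
    for (c = 1; c <= ny; c++) line = line "\t" yr[c]
    print line
    # One row per ID; N/A marks (ID, Year) pairs that never occur.
    for (r = 1; r <= ni; r++) {
        line = id[r]
        for (c = 1; c <= ny; c++)
            line = line "\t" (((id[r], yr[c]) in count) ? count[id[r], yr[c]] : "N/A")
        print line
    }
}' in.txt > crosstab.txt

cat crosstab.txt
```

The sketch holds all counts in memory and sorts the keys as strings, which is fine for four-digit years; for a huge data set datamash is still the more convenient tool.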
This requires datamash version 1.1.0 or later, available here:
https://www.gnu.org/software/datamash/
http://ftp.gnu.org/gnu/datamash/
regards,
- assaf