GNU coreutils' `join` behaves differently on macOS versus Linux?


From: Robert Leach
Subject: GNU coreutils' `join` behaves differently on macOS versus Linux?
Date: Mon, 10 Jul 2023 17:26:31 -0400

Hi,

I wanted to ask about the `join` utility in `coreutils` 9.3.  I'm building a 
Snakemake workflow and am debugging an error that only occurs when the 
workflow is run on a Linux system.  I have narrowed the difference down to the 
`join` utility provided by the `coreutils` conda package.  `join` reports an 
error on both systems, but because my script did not use `set -euxo pipefail`, 
the error went unnoticed.  On Linux, the failure surfaced only in the workflow 
rule that runs after the one that uses `join`, because its input file was 
empty.
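
To illustrate how a failing `join` can go unnoticed, here is a minimal sketch. The temporary files and their contents are hypothetical, and `--check-order` is a GNU `join` option used here only to make the failure deterministic on deliberately unsorted input. Without `set -e`, the script keeps running after the failure unless the exit status is checked explicitly:

```shell
#!/usr/bin/env bash
# Sketch only: the files and their contents below are hypothetical.
demo_dir=$(mktemp -d)
tab=$(printf '\t')

# Join field deliberately out of order (b before a) in both inputs.
printf 'b\t1\na\t2\n' > "$demo_dir/left.tsv"
printf 'b\t3\na\t4\n' > "$demo_dir/right.tsv"

# --check-order (GNU join) makes join fail on incorrectly sorted input.
# stderr is captured in join_err; stdout goes to the output file.
join_err=$(join --check-order -t "$tab" \
    "$demo_dir/left.tsv" "$demo_dir/right.tsv" 2>&1 > "$demo_dir/joined.tsv")
join_status=$?

# Without `set -e`, execution reaches this point even though join failed.
# With `set -e` in effect, the script would have stopped at the join line.
if [ "$join_status" -ne 0 ]; then
    echo "join failed (status $join_status): $join_err" >&2
fi

rm -rf "$demo_dir"
```

Checking `$?` (or enabling `set -e`) at the point of failure avoids discovering the problem only in a later step that reads the incomplete output.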

So I manually ran the join command and noticed the difference in behavior on:

macOS:
```
(coreutils) gen-rl-imac[2023-07-10 17:01:59]:...CT-LOCAL/YURI/ATACC/REPOS/ATACCompendium$ join -1 1 -2 1 -o 1.1,1.7,2.7 -t '    ' .tests/test_1/results/counts/peaks/raw/individual/SRR17656980_19_60m_end_counts.tsv .tests/test_1/results/counts/peaks/raw/individual/SRR13509617_19_60m_end_counts.tsv
Geneid  results/sorted_atac_alignments/SRR17656980_19_60m_end.bam       results/sorted_atac_alignments/SRR13509617_19_60m_end.bam
peak1   22      28
peak2   1       12
peak3   1072    1637
peak4   457     942
peak5   1086    1507
peak6   169     67
peak7   36      85
peak8   212     198
join: .tests/test_1/results/counts/peaks/raw/individual/SRR17656980_19_60m_end_counts.tsv:12: is not sorted: peak10     19      39038   39248   .       211     194
join: .tests/test_1/results/counts/peaks/raw/individual/SRR13509617_19_60m_end_counts.tsv:12: is not sorted: peak10     19      39038   39248   .       211     228
peak9   39      34
peak10  194     228
peak11  2178    2778

...

join: input is not in sorted order
```

and Linux:
```
(coreutils) [rleach@argo-comp2 ATACCompendium]$ join -1 1 -2 1 -o 1.1,1.7,2.7 -t '      ' .tests/test_1/results/counts/peaks/raw/individual/SRR17656980_19_60m_end_counts.tsv .tests/test_1/results/counts/peaks/raw/individual/SRR13509617_19_60m_end_counts.tsv
join: .tests/test_1/results/counts/peaks/raw/individual/SRR17656980_19_60m_end_counts.tsv:12: is not sorted: peak10     19      39038   39248   .       211     194
join: .tests/test_1/results/counts/peaks/raw/individual/SRR13509617_19_60m_end_counts.tsv:2: is not sorted: Geneid      Chr     Start   End     Strand  Length  results/sorted_atac_alignments/SRR13509617_19_60m_end.bam
join: input is not in sorted order
```
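
For reference, the "is not sorted: peak10" message can be examined in isolation with `sort -c`. The sketch below uses a hypothetical three-line file built from the peak IDs and counts shown above, and assumes the C locale for the comparison. It only shows that `peak10` following `peak9` is out of order under byte-wise comparison; it does not by itself explain the difference between the two systems:

```shell
#!/usr/bin/env bash
# Sketch only: a small hypothetical file using peak IDs from the output above.
sort_demo_dir=$(mktemp -d)
tab=$(printf '\t')

printf 'peak8\t212\npeak9\t39\npeak10\t194\n' > "$sort_demo_dir/counts.tsv"

# sort -c checks order without sorting; it exits non-zero if the input is
# not sorted on the given key. In the C locale, "peak10" sorts before
# "peak9" because '1' compares lower than '9'.
LC_ALL=C sort -c -t "$tab" -k1,1 "$sort_demo_dir/counts.tsv" 2>/dev/null
sort_status=$?

# Sorting on the join field in the same locale puts peak10 first.
first_key=$(LC_ALL=C sort -t "$tab" -k1,1 "$sort_demo_dir/counts.tsv" \
    | head -n 1 | cut -f1)

echo "sort -c status: $sort_status, first key after sorting: $first_key"

rm -rf "$sort_demo_dir"
```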

Is this a bug in either the macOS or Linux version of the coreutils `join` 
utility, a known issue, or something else?

Thanks,
Rob

Robert William Leach
133 Carl C. Icahn Lab
Lewis-Sigler Institute for Integrative Genomics
Princeton University
Princeton, NJ 08544


