GNU coreutils' `join` behaves differently on macOS versus Linux?
From: Robert Leach
Subject: GNU coreutils' `join` behaves differently on macOS versus Linux?
Date: Mon, 10 Jul 2023 17:26:31 -0400
Hi,
I wanted to ask about the `join` utility in `coreutils` 9.3. I'm building a
snakemake workflow and am debugging an error that only occurs when the
workflow is run on a Linux system. I have narrowed the difference down to the
`join` utility provided by the `coreutils` conda package. `join` reports an
error on both systems, but since my script did not use `set -euxo pipefail`,
the error was silent. On Linux, this caused a failure in the workflow rule that
executes after the one that uses `join`, because its input file was empty.
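For reference, here is a minimal sketch of why the failure went unnoticed. It is not my actual workflow script: the file names and contents are hypothetical, and it assumes GNU `join`, whose `--check-order` option makes it exit with an error on unsorted input. The same command is run once without strict mode and once with `set -euo pipefail`:

```shell
#!/usr/bin/env bash
# Sketch: compare a lenient script with a strict one when `join` fails.
# The file names and contents below are hypothetical.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf 'b 1\na 2\n' > "$tmpdir/left.txt"    # deliberately unsorted
printf 'a 3\nb 4\n' > "$tmpdir/right.txt"

# Lenient: join's non-zero exit status is ignored and the next command runs.
lenient=$(bash -c 'join --check-order "$1" "$2" >/dev/null 2>&1; echo reached' \
    _ "$tmpdir/left.txt" "$tmpdir/right.txt")

# Strict: `set -e` aborts the inner script as soon as join fails,
# so the `echo` is never executed.
strict=$(bash -c 'set -euo pipefail; join --check-order "$1" "$2" >/dev/null 2>&1; echo reached' \
    _ "$tmpdir/left.txt" "$tmpdir/right.txt") || true

echo "lenient: ${lenient:-aborted}"
echo "strict: ${strict:-aborted}"
```

In the lenient case the script keeps going after `join` fails, which is how an empty or truncated output file can reach the next workflow rule. Adding `-x`, as in `set -euxo pipefail`, would additionally trace each command as it runs.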
So I ran the `join` command manually and noticed a difference in behavior
between the two systems. On macOS:
```
(coreutils) gen-rl-imac[2023-07-10
17:01:59]:...CT-LOCAL/YURI/ATACC/REPOS/ATACCompendium$ join -1 1 -2 1 -o
1.1,1.7,2.7 -t ' '
.tests/test_1/results/counts/peaks/raw/individual/SRR17656980_19_60m_end_counts.tsv
.tests/test_1/results/counts/peaks/raw/individual/SRR13509617_19_60m_end_counts.tsv
Geneid results/sorted_atac_alignments/SRR17656980_19_60m_end.bam
results/sorted_atac_alignments/SRR13509617_19_60m_end.bam
peak1 22 28
peak2 1 12
peak3 1072 1637
peak4 457 942
peak5 1086 1507
peak6 169 67
peak7 36 85
peak8 212 198
join:
.tests/test_1/results/counts/peaks/raw/individual/SRR17656980_19_60m_end_counts.tsv:12:
is not sorted: peak10 19 39038 39248 . 211 194
join:
.tests/test_1/results/counts/peaks/raw/individual/SRR13509617_19_60m_end_counts.tsv:12:
is not sorted: peak10 19 39038 39248 . 211 228
peak9 39 34
peak10 194 228
peak11 2178 2778
...
join: input is not in sorted order
```
and on Linux:
```
(coreutils) [rleach@argo-comp2 ATACCompendium]$ join -1 1 -2 1 -o 1.1,1.7,2.7
-t ' '
.tests/test_1/results/counts/peaks/raw/individual/SRR17656980_19_60m_end_counts.tsv
.tests/test_1/results/counts/peaks/raw/individual/SRR13509617_19_60m_end_counts.tsv
join:
.tests/test_1/results/counts/peaks/raw/individual/SRR17656980_19_60m_end_counts.tsv:12:
is not sorted: peak10 19 39038 39248 . 211 194
join:
.tests/test_1/results/counts/peaks/raw/individual/SRR13509617_19_60m_end_counts.tsv:2:
is not sorted: Geneid Chr Start End Strand Length
results/sorted_atac_alignments/SRR13509617_19_60m_end.bam
join: input is not in sorted order
```
Is this a bug in either the macOS or the Linux build of the coreutils `join`
utility, a known issue, or something else?
Thanks,
Rob
Robert William Leach
133 Carl C. Icahn Lab
Lewis-Sigler Institute for Integrative Genomics
Princeton University
Princeton, NJ 08544