bug#16468: join
From: Pádraig Brady
Subject: bug#16468: join
Date: Thu, 01 May 2014 00:53:16 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2
On 01/17/2014 12:00 AM, Bernhard Voelker wrote:
> On 01/16/2014 07:10 PM, Eric Blake wrote:
>> On 01/16/2014 10:46 AM, barry kesner wrote:
>>> How do you tell join this without re-sorting? The files are huge!
>>
>> Unfortunately, there isn't any really good way, short of re-processing
>> the files to make the data appear sorted in the order join expects.
>> That said, it certainly appears that for your given data, you can write
>> a sed filter that can reprocess on a line-by-line basis, and feed that
>> into join, without the penalty of having to re-sort the entire file and
>> without having to have the processed file stored in your file system all
>> at once. It also seems possible to write a post filter to get back to
>> the style of the line in the original file. Here, extensions such as bash's
>> join <(infilter file1) <(infilter file2) | outfilter
>> make it easier to type (where the trick is to now write the correct sed
>> scripts to serve as infilter and outfilter) than the alternative of
>> having to use named fifos when limiting yourself to just POSIX semantics.
>
> Hum, isn't such number conversion filtering exactly what numfmt
> was designed for? But wait ...
>
> $ numfmt --field 1 --format='%020f' < f2
> 99980081 1
> 100002129 1
> 100002136 2
> 100002162 3
>
> ... it doesn't support leading zeros, unfortunately. ;-/
> Wouldn't this be a nice enhancement?
I've needed this a few times so I added it in the attached.
thanks,
Pádraig.
numfmt-leading-zeros.patch
Description: Text Data
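
For readers following the thread, the infilter/outfilter idea quoted above can be sketched roughly as follows. This is only an illustration, not the attached patch: the function names pad_key, unpad_key and join_numeric are hypothetical, awk stands in for the sed filters Eric mentions, and it assumes the join key is a non-negative integer in field 1 of files that are already in numeric order. A named fifo is used so that it also works without bash's <(...) extension.

```shell
# Sketch only: pad_key, unpad_key and join_numeric are made-up names.
# Assumes field 1 is a non-negative integer small enough for awk's %d,
# and that both inputs are already sorted numerically on that field.

# infilter: zero-pad field 1 to 20 digits, so that byte-wise order
# (what join sees under LC_ALL=C) matches numeric order.
pad_key() {
    awk '{ $1 = sprintf("%020d", $1); print }' "$1"
}

# outfilter: strip the leading zeros again, keeping a lone "0" intact.
unpad_key() {
    sed 's/^0*\([0-9]\)/\1/'
}

# Join two numerically ordered files without re-sorting them.
# With bash this could instead be written as:
#   LC_ALL=C join <(pad_key "$1") <(pad_key "$2") | unpad_key
join_numeric() {
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/right" || return 1
    pad_key "$2" > "$dir/right" &          # second input via a named fifo
    pad_key "$1" | LC_ALL=C join - "$dir/right" | unpad_key
    wait                                   # reap the background filter
    rm -rf "$dir"
}
```

Calling join_numeric file1 file2 streams both files through the filters line by line, so nothing is re-sorted and no padded copy is stored on disk. Note that awk rebuilds each line with single spaces between fields, so the original spacing is not preserved.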