Re: incremental read of gzipped matrix
From: Andreas Weber
Subject: Re: incremental read of gzipped matrix
Date: Sun, 8 Dec 2019 18:55:07 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2
On 07.12.19 at 06:20, Andreas Weber wrote:
> On 07.12.19 at 05:44, Andreas Weber wrote:
>> Is there currently a function in GNU Octave core or in a Forge package to
>> open a gzipped file, read as many rows as are available, and return them
>> as a matrix?
>
> I forgot the important part: keep the file open and store the position so
> that it's possible to read the newly appended part later:
>
> pseudo code:
> fid = gzopen ("foo.gz", "r");
> m = fget_matrix (fid); # returns a 50x5 matrix;
> .... sleep....
> # meanwhile, another process appends data to the .gz file
> m = fget_matrix (fid); # now returns a 20x5 matrix with the new data
In the meantime I've started my own implementation:
https://github.com/Andy1978/load_gz/tree/master
So far I get roughly a factor of 10 improvement in runtime when reading
large gzipped numeric CSVs.
For example, for a matrix generated with rand (1e6, 8):
Octave 5.1.1: Elapsed time is 16.9351 seconds.
load_gz.oct:  Elapsed time is 1.49143 seconds.
If someone wants to try:
git clone https://github.com/Andy1978/load_gz.git
cd load_gz
make check
The incremental part is still missing.
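To make the intended incremental behaviour concrete, here is a rough sketch in Python (not Octave, and not how load_gz works internally). The class IncrementalGzReader and its method read_new_rows are hypothetical names made up for this illustration. It assumes that the writer always appends complete gzip members containing whole CSV rows (as `gzip -c new.csv >> foo.gz` would), and it remembers the compressed byte offset between calls instead of keeping the file handle open.

```python
import zlib


class IncrementalGzReader:
    """Hypothetical helper: return the rows appended to a .gz file since the last call.

    Assumption: the writer appends whole gzip members that contain whole rows.
    """

    def __init__(self, path, delimiter=","):
        self.path = path
        self.delimiter = delimiter
        self.offset = 0  # number of compressed bytes already consumed

    def read_new_rows(self):
        # Read only the compressed bytes that arrived since the last call.
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()

        text = b""
        consumed = 0
        while consumed < len(data):
            # wbits=31 tells zlib to expect a gzip header and trailer.
            d = zlib.decompressobj(wbits=31)
            chunk = d.decompress(data[consumed:])
            if not d.eof:
                # The last member is still incomplete; retry on the next call.
                break
            text += chunk
            # unused_data holds the bytes that follow the finished member.
            consumed = len(data) - len(d.unused_data)

        self.offset += consumed
        return [
            [float(x) for x in line.split(self.delimiter)]
            for line in text.decode().splitlines()
            if line.strip()
        ]
```

Each call to read_new_rows() returns a list of rows (empty if nothing new has been appended), which corresponds to the repeated fget_matrix (fid) calls in the pseudo code above. A member that is only partially written is left untouched and picked up on a later call.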
-- Andy