Import large field-delimited file with strings and numbers
From: João Rodrigues
Subject: Import large field-delimited file with strings and numbers
Date: Sat, 06 Sep 2014 15:19:23 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0
I need to import a large CSV file whose columns hold a mix of string
and numeric entries, such as:
field1, field2, field3, field4
A, a, 1, 1.0,
B, b, 2, 2.0,
C, c, 3, 3.0,
and I want to turn this into something like
cell1 = {"A"; "B"; "C"};
cell2 = {"a"; "b"; "c"};
arr3 = [1 2 3]';
arr4 = [1.0 2.0 3.0]';
Furthermore, some columns can be ignored, the total number of entries is
known, and there is a header.
How can I perform the import in reasonable time and with little memory
overhead? Below are a few of my attempts.
Octave offers a wide range of functions to import files (csvread,
dlmread, textscan, textread, fscanf, fgetl) but as far as I can tell
none of them gets the job done.
csvread and dlmread don't work because they only handle numerical data.
textscan works but eats up all the memory (the file is 200 MB, and
textscan's memory usage ran into the GBs). It also does not let me
specify the size of the result in advance.
fid = fopen(fstr,"r");
% the fourth column holds floating-point values, hence %f rather than %d
tmp = textscan(fid,'%s %s %d %f','delimiter', ',', 'headerlines', 1);
fclose(fid);
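One way to keep memory bounded while still preallocating the outputs would be to read the file in fixed-size blocks with fread and only parse complete lines from each block. The following is a minimal sketch under stated assumptions, not a solution measured on the 200 MB file: the sample file, blocksize, and all variable names are made up for illustration, and it assumes Unix line endings, exactly four non-empty fields per row, and a known row count.

```octave
% Hypothetical sample file so the sketch is self-contained.
fstr = tempname();
fid = fopen(fstr, "w");
fputs(fid, "field1, field2, field3, field4\n");
fputs(fid, "A, a, 1, 1.0,\nB, b, 2, 2.0,\nC, c, 3, 3.0,\n");
fclose(fid);

nrow = 3;          % number of data rows, known in advance
ncol = 4;          % fields per row
blocksize = 16;    % bytes per read; tiny here, something like 2^20 for a real file

% Preallocate the outputs.
cell1 = cell(nrow, 1);  cell2 = cell(nrow, 1);
arr3  = zeros(nrow, 1); arr4  = zeros(nrow, 1);

fid = fopen(fstr, "r");
fgetl(fid);        % skip the header line
rest = "";         % incomplete line carried over from the previous block
done = 0;          % rows stored so far
while done < nrow
  buf = fread(fid, blocksize, "char=>char")';
  if isempty(buf) && isempty(rest)
    break;         % nothing left to parse
  endif
  buf = [rest, buf];
  last = find(buf == "\n", 1, "last");
  if isempty(last)
    if feof(fid)
      last = numel(buf);   % final line without a newline
    else
      rest = buf;          % no complete line yet, read more
      continue;
    endif
  endif
  block = buf(1:last);
  rest  = buf(last+1:end);
  % Split on commas and newlines; drop the empty tokens that the
  % trailing comma and newline leave behind (assumes no empty fields).
  tok = strsplit(block, {",", "\n"});
  tok = tok(!cellfun(@isempty, tok));
  n   = numel(tok) / ncol;
  tok = reshape(tok, ncol, n)';      % one row of tok per line of input
  idx = done+1 : done+n;
  cell1(idx) = strtrim(tok(:,1));
  cell2(idx) = strtrim(tok(:,2));
  arr3(idx)  = str2double(tok(:,3));
  arr4(idx)  = str2double(tok(:,4));
  done += n;
endwhile
fclose(fid);
delete(fstr);
```

Columns that can be ignored are simply never copied out of tok, and peak memory is governed by blocksize rather than by the file size.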
fgetl lets me preallocate the result but requires a
loop:
v = cell(nrow,4);
fid = fopen(fstr,"r");
tmp = fgetl(fid);                 % skip the header line
for irow = 1 : nrow
tmp = fgetl(fid);
fields = strsplit(tmp,",");
v(irow,:) = fields(1:4);          % drop the empty field after the trailing comma
endfor
fclose(fid);
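A loop-free variant of the same idea would read the whole file as one string, split it once, and reshape the tokens into rows. This is only a sketch under assumptions: the sample file and variable names are made up for illustration, every row is assumed to have exactly four non-empty fields, and the whole file has to fit in memory as text, so it trades memory for the per-line loop.

```octave
% Hypothetical sample file so the sketch is self-contained.
fstr = tempname();
fid = fopen(fstr, "w");
fputs(fid, "field1, field2, field3, field4\n");
fputs(fid, "A, a, 1, 1.0,\nB, b, 2, 2.0,\nC, c, 3, 3.0,\n");
fclose(fid);

ncol = 4;                               % fields per row

txt = fileread(fstr);                   % whole file as one char row vector
tok = strsplit(txt, {",", "\n"});       % split on commas and newlines at once
tok = tok(!cellfun(@isempty, tok));     % drop empties left by trailing commas
tok = reshape(tok, ncol, [])';          % one row of tok per line of input
tok = tok(2:end, :);                    % drop the header row

cell1 = strtrim(tok(:,1));              % string columns, leading blanks removed
cell2 = strtrim(tok(:,2));
arr3  = str2double(tok(:,3));           % numeric columns
arr4  = str2double(tok(:,4));

delete(fstr);
```

str2double accepts a cell array of strings and ignores the leading blank in each field, so the numeric columns are converted in one call each.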
Any suggestions? (I searched Google and the only suggestion I found was
to use fgetl, but this is too slow: it takes 30 s to read 1% of the full
dataset.)
Thanks