How to import string and numerical data with arbitrary number of columns
From: João Rodrigues
Subject: How to import string and numerical data with arbitrary number of columns
Date: Sun, 01 Sep 2013 12:30:01 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130803 Thunderbird/17.0.8
Hi
I need to import a series of data files with several string columns and
one final column with floats. E.g.,
str1 str2 1.234
str3 str4 5.678
...
[a,b,c] = textread(filename,"%s %s %f") looks like a good idea, but the
number of string columns differs from file to file and there are many
such files. Writing a separate textread statement for each file (with
the matching number of output variables and format) is therefore out
of the question.
I can do this in several ways, but they are all slow. (There may be
some typos in the code below; I just want to give the general idea.)
Let nstr be the number of string columns of the current file:
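If nstr is not known in advance, it can probably be inferred from the first line of each file. The following is only a sketch: it assumes whitespace-separated columns, a non-empty first line, and a demo file whose contents are made up to mirror the example above.

```octave
% Demo file (hypothetical contents mirroring the example data)
filename = tempname ();
fid = fopen (filename, "w");
fputs (fid, "str1 str2 1.234\nstr3 str4 5.678\n");
fclose (fid);

% Read only the first line and count its whitespace-separated fields
fid = fopen (filename, "r");
firstline = fgetl (fid);
fclose (fid);
fields = strsplit (strtrim (firstline));  % default delimiter is whitespace
nstr = numel (fields) - 1;                % the last field is the float column

delete (filename);
```

This reads a single line per file, so its cost should be negligible next to the actual import.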
%%%%%%%%%%%
1) Using C-style file input:
fid = fopen(filename);
k = 0;
while ~feof(fid)
  k = k + 1;
  for i = 1 : nstr
    tmp = fscanf(fid,"%s","C");
    strres{k,i} = tmp;
  endfor
  tmp = fscanf(fid,"%f","C");
  numres(k,1) = tmp;
endwhile
fclose(fid);
(This uses explicit loops, which are slow and not recommended.)
%%%%%%%
2) Using eval:
tmpstr1 = "[";
tmpstr2 = "\"";
for i = 1 : nstr
  % [a, b] is used instead of strcat, which drops trailing spaces
  tmpstr1 = [tmpstr1, "tmp", num2str(i), ","];
  tmpstr2 = [tmpstr2, "%s "];
endfor
% one extra output variable for the final float column
tmpstr1 = [tmpstr1, "tmp", num2str(nstr+1), "]"];
tmpstr2 = [tmpstr2, "%f\""];
eval([tmpstr1, " = textread(filename, ", tmpstr2, ");"]);
Then use eval again to collect the tmp variables into strres and numres.
(The point is to first use a loop to generate the format string and the
list of output variables, then call textread inside eval.)
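The eval step might be avoidable altogether by capturing the outputs in a comma-separated list from a preallocated cell array. This is a sketch under assumptions: the demo file contents are made up, the variable names fmt and cols are introduced here for illustration, and on newer Octave versions textread may need to be loaded from the io package.

```octave
% Demo file (hypothetical contents mirroring the example data)
filename = tempname ();
fid = fopen (filename, "w");
fputs (fid, "str1 str2 1.234\nstr3 str4 5.678\n");
fclose (fid);
nstr = 2;

% Build the format without a loop: nstr copies of "%s " followed by "%f"
fmt = [repmat("%s ", 1, nstr), "%f"];

% A cell with nstr+1 slots on the left-hand side requests nstr+1 outputs
cols = cell (1, nstr + 1);
[cols{:}] = textread (filename, fmt);

strres = [cols{1:nstr}];   % N-by-nstr cell array of strings
numres = cols{end};        % N-by-1 numeric vector

delete (filename);
```

The format and the number of outputs both follow from nstr, so the same lines should work for every file without generating code as strings.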
%%%%%%%%
3) Using reshape and cellfun:
tmp = textread(filename,"%s");
tmp = reshape(tmp,[nstr+1,length(tmp)/(nstr+1)])';
numres = cellfun(@str2num,tmp(:,nstr+1));
strres = tmp(:,1:nstr);
(Read all data as a string cell vector, then reshape and cut out
the numerical vector. The cellfun operation takes a lot of time.)
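Much of the time in the cellfun step is likely spent in str2num, which evaluates each string with eval. A possible alternative, sketched here on a small made-up cell array standing in for the reshaped tmp, is str2double, which accepts a whole cell array of strings in one call.

```octave
% Hypothetical stand-in for the reshaped cell array tmp
tmp = {"str1", "str2", "1.234"; "str3", "str4", "5.678"};
nstr = 2;

numres = str2double (tmp(:, nstr+1));  % converts the whole column at once
strres = tmp(:, 1:nstr);
```

Entries that cannot be parsed as numbers come back as NaN rather than raising an error, which may or may not be what is wanted.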
%%%%%%%%%%
What I really wanted was an alternative to textread that would do
something like:
res = textread(filename,"%s %s %f")
and would create res as a cell whose columns were the different objects
returned by textread (in this case two string cell vectors and one
numerical vector).
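textscan may come close to this: it returns a single cell array with one entry per format specifier. The sketch below assumes whitespace-separated columns and a known nstr; the demo file contents and the variable names fmt and res are made up for illustration.

```octave
% Demo file (hypothetical contents mirroring the example data)
filename = tempname ();
fid = fopen (filename, "w");
fputs (fid, "str1 str2 1.234\nstr3 str4 5.678\n");
fclose (fid);
nstr = 2;

% Format with nstr string columns followed by one float column
fmt = [repmat("%s ", 1, nstr), "%f"];

fid = fopen (filename, "r");
res = textscan (fid, fmt);   % 1-by-(nstr+1) cell, one entry per column
fclose (fid);

strres = [res{1:nstr}];      % N-by-nstr cell array of strings
numres = res{end};           % N-by-1 numeric vector

delete (filename);
```

Here res{1} and res{2} are the string cell vectors and res{3} is the numerical vector, so the number of columns only enters through fmt.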
Can anyone suggest a faster and cleaner method?
Thanks
Joao