octave-maintainers

Re: character matrix inputs to string functions


From: Rik
Subject: Re: character matrix inputs to string functions
Date: Thu, 20 Feb 2020 10:22:43 -0800

On 02/20/2020 09:00 AM, address@hidden wrote:
Subject: Re: line continuations
From: José Abílio Matos <address@hidden>
Date: 02/20/2020 08:49 AM

On Thursday, 20 February 2020 14.35.19 WET Nicholas Jankowski wrote:
can you give a code example of what produces an error in matlab but not
octave? i may be misunderstanding your earlier comments.
c = ['1 2 3 '; '4 5 6 ']

strrep(c, '2', '0')

The call to strrep fails in Matlab because c is a character matrix rather than
a row vector, whereas it succeeds in Octave (with a counterintuitive result,
IMHO). My proposal is to make Octave behave the same way as Matlab.

[m,n] = size(c);

If m != 1 and n != 1, then throw an error. I hope that this now makes sense. :-)

Regards,
-- José Matos

There is a much larger issue that should be resolved: how character matrix inputs should be handled by all string functions.

Consider this example,

octave:1> cstr = { 'Hello World' ; 'Goodbye World '}
cstr =
{
  [1,1] = Hello World
  [2,1] = Goodbye World
}

octave:2> strrep (cstr, 'World', 'Jane')
ans =
{
  [1,1] = Hello Jane
  [2,1] = Goodbye Jane
}

This does just what you would think.  Now try the same thing with a character matrix.

octave:3> chmat = char ('Hello World', 'Goodbye World')
chmat =

Hello World 
Goodbye World

octave:4> strrep (chmat, 'World', 'Jane')
ans =

Hello World 
Goodbye World

There is no substitution because Octave stores the matrix in column-major order, so the internal algorithm sees a string that begins "HGeolo...".  In any case, the average user is going to be quite surprised by the apparent failure of the strrep function.  Restricting the character input to a row vector (1xN) restores the correct behavior of the function.

octave:5> strrep (chmat(1,:), 'World', 'Jane')
ans = Hello Jane 
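
The interleaving can be checked directly by flattening the matrix column by column, which is how linear indexing walks a character matrix. This is only a small sketch for illustration; the variable name flat is not part of any proposal, and it says nothing about how strrep is implemented beyond what is described above.

```octave
% Build the same 2x13 character matrix used in the example above.
chmat = char ('Hello World', 'Goodbye World');

% chmat(:) walks the matrix column by column; .' turns it into a row.
flat = chmat(:).';

% The first six characters alternate between the two rows.
disp (flat(1:6))

% The pattern never appears contiguously in the flattened data.
found = strfind (flat, 'World');
disp (isempty (found))
```

Because the rows are interleaved, a pattern that is plainly visible in each row does not occur anywhere in the flattened sequence.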

So, I think we (the Octave community) need to make a decision about how we want to handle character matrix inputs and then propagate this change to all of the m-files in scripts/strings.

One obvious possibility is simply to follow Matlab and increase the level of input validation to reject character matrices.  The validation code is pretty simple:

if (ischar (input))
  if (! isrow (input))
    error ("fcn_name: input must be a character string or cell array of strings");
  endif
elseif (iscellstr (input))
  ...
else
  error ("fcn_name: input must be a character string or cell array of strings");
endif
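
As a rough sketch of how that check behaves, it can be pulled into a standalone helper and exercised on the three kinds of input. The helper name validate_str_input and its fcn_name argument are hypothetical, chosen only for this illustration; in practice the check would sit at the top of each string function.

```octave
1;  % script-file marker so the function below can be defined inline

% Hypothetical helper: accept a row-vector string or a cellstr, reject the rest.
function validate_str_input (fcn_name, str)
  if (! ((ischar (str) && isrow (str)) || iscellstr (str)))
    error ("%s: input must be a character string or cell array of strings", fcn_name);
  endif
endfunction

chmat = char ('Hello World', 'Goodbye World');

validate_str_input ("strrep", chmat(1,:));       % 1xN row vector: accepted
validate_str_input ("strrep", cellstr (chmat));  % cell array of strings: accepted

rejected = false;
try
  validate_str_input ("strrep", chmat);          % 2xN character matrix: rejected
catch err
  rejected = true;
  disp (err.message);
end_try_catch
```

One thing to keep in mind with this form of the check: isrow ('') is false for the 0x0 empty string, so as written it would reject that input too. Whether that is desirable is a separate question.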

But Octave does try to see itself as a superset of Matlab.  We don't have to follow them slavishly.  In this case we could change the input validation to detect the character matrix and call the function recursively.  For example, this code converts the character matrix to a cell array of strings, executes strrep on it, and converts the output back to a character matrix.

if (ischar (input))
  if (! isrow (input))
    retval = char (strrep (cellstr (input), pattern, replacement));
    return;
  endif
endif

And it works,

octave:7> char (strrep (cellstr (chmat), 'World', 'Jane'))
ans =

Hello Jane 
Goodbye Jane
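
Packaged as a function, the same idea might look like the sketch below. The wrapper name strrep_rows is hypothetical and used only so the example can run on its own; in an actual change the branch would live inside strrep itself.

```octave
1;  % script-file marker so the function below can be defined inline

% Hypothetical wrapper: apply strrep row by row to a character matrix.
function retval = strrep_rows (str, pattern, replacement)
  if (ischar (str) && ! isrow (str) && ! isempty (str))
    % Character matrix: split into rows, replace in each, re-pad into a matrix.
    retval = char (strrep (cellstr (str), pattern, replacement));
  else
    % Row-vector strings and cellstr inputs go straight to strrep.
    retval = strrep (str, pattern, replacement);
  endif
endfunction

chmat = char ('Hello World', 'Goodbye World');
out = strrep_rows (chmat, 'World', 'Jane');
disp (out)
disp (size (out))
```

One side effect of the round trip: cellstr drops trailing whitespace from each row and char pads the results to the longest row, so the width of the output is set by the longest replaced string rather than by the input.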

Anyone want to comment on which approach they like and why?

--Rik

