octave-maintainers
Re: Block comments


From: William Poetra Yoga Hadisoeseno
Subject: Re: Block comments
Date: Tue, 6 Jun 2006 13:28:27 +0800

On 6/6/06, John W. Eaton <address@hidden> wrote:
On  5-Jun-2006, William Poetra Yoga Hadisoeseno wrote:

| I think we should only match for "^%{$" and "^%}$", because in Matlab,
| a block comment starter and ender is only valid if the two characters
| %{ or %} are the only characters on the line. So for example this one
| doesn't start a block comment:
|
|   myvar = 1; %{ this is not a block comment
|   %} and this line should generate a warning

What about whitespace (TAB or SPC)?


The Matlab documentation at
http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_env/edit_d14.html
and
http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_prog/f0-44984.html
says:

"The lines that contain %{ and %} cannot contain any other text."
"The %{ and %} operators must appear alone on the lines that
immediately precede and follow the block of help text. Do not include
any other text on these lines."

It's not clear whether tabs or spaces count as "other text" here,
but I think they do. Judging from the examples given on the page (the
ones for block comment nesting), it seems they don't encourage
indentation in block comments.
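
To make the two readings concrete, here is a minimal sketch in C++ that contrasts a strict test (the marker is the only text on the line) with a lenient one (surrounding spaces or tabs are allowed). Both function names are hypothetical and are not part of Octave's lexer. The sketch assumes the line terminator has already been stripped, and it uses std::regex rather than flex patterns purely for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper, not Octave code.
// Strict reading: the line must be exactly "%{" or "%}".
bool is_block_marker_strict (const std::string& line)
{
  static const std::regex re ("%[{}]");
  return std::regex_match (line, re);
}

// Hypothetical helper, not Octave code.
// Lenient reading: spaces or tabs may surround the marker.
bool is_block_marker_lenient (const std::string& line)
{
  static const std::regex re ("[ \t]*%[{}][ \t]*");
  return std::regex_match (line, re);
}
```

Under either reading, a line such as "myvar = 1; %{ this is not a block comment" is rejected, because std::regex_match only succeeds when the whole line matches.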

| Since I don't have access to Matlab here, I can't verify it
| personally. So should we "extend" the syntax of block comments, so
| that it doesn't need to appear by itself on a line?

No, I don't think that is necessary.


OK.

| If we don't extend
| the syntax, we don't have to worry about column numbers (since it
| matches only when we begin a line and ends with a newline --> %}\n).

OK.

| > The only catch here is that there are a few places where comments are
| > already matched with regular expressions like "%.*\n", so those would
| > have to be modified as well.
|
| OK, I'll take a look. Some functions also deal with '\r'. What is it
| for? Is it for compatibility with Matlab scripts with the DOS/Windows
| encoding?

In a perfect world, we would only have to look for LF ("\n") and all
CR or CRLF pairs would be converted by the system.  But unfortunately,
there are problems with people transferring files with CRLF pairs to
systems that don't do translation, and I think that telling users to
translate their files is not a popular solution.  So yes, we generally
have to match both.


In src/lex.l, in the function scan_for_comments, there's a piece of code:

       case '\r':
         if (in_comment)
           comment_buf += static_cast<char> (c);
         if (i < len)
           {
             c = text[i++];

             if (c == '\n')
               {
                 if (in_comment)
                   {
                     comment_buf += static_cast<char> (c);
                     octave_comment_buffer::append (comment_buf);
                     in_comment = false;
                     beginning_of_comment = false;
                   }
               }
           }

I don't understand why we check it twice. Wouldn't it be better to
discard \r when we see it? (I don't know of any other use for \r
besides \r\n on DOS/Windows systems.)
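
For comparison, here is a minimal sketch of a line splitter that accepts LF, CRLF, and a lone CR (the line ending used by classic Mac OS) as terminators. It is not the Octave code, and split_lines is a hypothetical name. It only shows why a scanner might peek at the character after '\r': the peek is what tells a CRLF pair apart from a bare CR, so the pair counts as one line ending rather than two.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not Octave code.
// Split text into lines; LF, CRLF and a lone CR each end one line.
std::vector<std::string> split_lines (const std::string& text)
{
  std::vector<std::string> lines;
  std::string cur;

  for (std::size_t i = 0; i < text.size (); i++)
    {
      char c = text[i];

      if (c == '\r')
        {
          // Swallow the LF of a CRLF pair so it is not counted twice.
          if (i + 1 < text.size () && text[i + 1] == '\n')
            i++;

          lines.push_back (cur);
          cur.clear ();
        }
      else if (c == '\n')
        {
          lines.push_back (cur);
          cur.clear ();
        }
      else
        cur += c;
    }

  // Keep a final line that has no terminator.
  if (! cur.empty ())
    lines.push_back (cur);

  return lines;
}
```

Simply discarding every '\r' would handle CRLF files, but it would merge the lines of a file that uses CR alone.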

| Uh, could you please briefly explain the start states

Sorry, I don't have time to explain how to use the tools.


Well, I should've looked at flex.info more carefully...

| and the column
| numbers? I took a look at the flex info manual, and I haven't really
| understood it. About the column numbers (the global variable
| current_input_column, right?), yes, I saw those, but some functions
| don't update it (and input_line_number, too). Why?

I think the line and column number variables are entirely part of
Octave and we don't use any flex-specific features to keep track of
them.  Precisely what functions don't update these variables when you
think they should?


The function gobble_leading_white_space in src/parse.y has a parameter
update_pos which, when false, prevents it from updating
current_input_column and input_line_number. It is currently called
this way only from inside itself and from is_function_file in the same
file. Can we remove this parameter and always update input_line_number
and current_input_column?
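
As an illustration of what such a flag does, here is a minimal sketch of a whitespace skipper with an optional position update. The struct and function are hypothetical stand-ins and do not reproduce gobble_leading_white_space or Octave's globals. One plausible use for the non-updating mode is look-ahead that should not disturb the reported position, but that is an assumption rather than something established in this thread.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for input_line_number / current_input_column.
struct position
{
  int line = 1;
  int column = 1;
};

// Hypothetical helper, not Octave code.
// Skip spaces, tabs and newlines starting at index i and return the
// index of the first other character.  When update_pos is false,
// pos is left untouched.
std::size_t skip_white_space (const std::string& text, std::size_t i,
                              position& pos, bool update_pos)
{
  while (i < text.size ())
    {
      char c = text[i];

      if (c == ' ' || c == '\t')
        {
          if (update_pos)
            pos.column++;
        }
      else if (c == '\n')
        {
          if (update_pos)
            {
              pos.line++;
              pos.column = 1;
            }
        }
      else
        break;

      i++;
    }

  return i;
}
```

If every caller passed true, the parameter could be dropped. A caller that only peeks ahead would then need to save and restore the position itself.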

jwe


--
William Poetra Yoga Hadisoeseno

