From: Søren Hauberg
Subject: help as m-files
Date: Sun, 28 Sep 2008 12:48:54 +0200
Hi All,

I'd like to make the help system more "hackable" because I'd like to take advantage of it to produce code for building web sites containing function help texts (such as the function reference on the octave-forge site). So, what I'm proposing is to rewrite the help system as m-files.

The approach I've taken is to have one C++ function that returns the raw help text of a function (I think David wrote this function, but I don't remember). Functions in m-files then pass this through makeinfo to get the requested help text. Most of this is pretty straightforward. The only tricky thing is the 'lookfor' command, so let me describe my implementation in some detail.

The current implementation simply traverses all functions and searches their help text for the given keyword. Such an approach is very slow when written as an m-file (it's also quite slow when written in C++, but it's worse with an m-file). So, what I've done instead is to generate a cache containing the help texts of all functions in all directories in the path. I have one cache per directory. 'lookfor' then traverses all directories in the path. If a directory contains a cache, then this is loaded and searched. If it doesn't contain a cache, then all functions in the directory are searched one by one. I imagine that caches should be generated when Octave is installed, and when packages are installed. This approach is very fast (it takes less than a second on my laptop, whereas the current implementation takes ~1 minute).

So, what are the ups and downs of this approach?

Ups:

* faster 'lookfor'.
* IMHO simpler and more hackable code.
* provides a framework for generating web pages like the function reference on the octave-forge site. See [1] for some code that does this using this framework.

Downs:

* The current help system works just fine. By rewriting it, we will most likely introduce regressions. Some of the algorithms (get_first_help_sentence) have changed, so it is likely that they produce different results in some weird cases.
* Probably something else that I don't see since I wrote the change :-)

I'm attaching some code that's usable, and almost done (see PROBLEMS.txt for some of the missing things).

Søren

[1] http://hauberg.org/wiki/doku.php?id=generate_html
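To make the cache idea concrete, here is a minimal sketch of the per-directory lookup described above. It is not the attached mylookfor.m. The function name lookfor_dir_sketch, the cache file name passed in, and the two-column layout of the 'cache' variable are all assumptions made for illustration. The uncached branch simply reads whole m-files with fileread as a stand-in for fetching each function's help text.

```octave
1;  # script-file marker, so the function below can be defined inline

## Hypothetical sketch: return the names of functions in DIRECTORY whose
## text mentions KEYWORD, using a cache file when one is present.
function names = lookfor_dir_sketch (keyword, directory, cache_file)
  names = {};
  cache_path = fullfile (directory, cache_file);

  if (exist (cache_path, "file"))
    ## Fast path: one load, then an in-memory search.
    ## Assumed layout: a cell array 'cache' with rows {name, text}.
    s = load (cache_path);
    cache = s.cache;
  else
    ## Slow path: no cache, so read the m-files one by one.
    files = dir (fullfile (directory, "*.m"));
    cache = cell (numel (files), 2);
    for k = 1:numel (files)
      [~, name] = fileparts (files(k).name);
      cache{k, 1} = name;
      cache{k, 2} = fileread (fullfile (directory, files(k).name));
    endfor
  endif

  ## Case-insensitive substring search over the collected texts.
  for k = 1:rows (cache)
    if (! isempty (strfind (lower (cache{k, 2}), lower (keyword))))
      names{end+1} = cache{k, 1};
    endif
  endfor
endfunction
```

A full 'lookfor' would call something like this once for every directory in the path and concatenate the results. The speed-up comes from the first branch, which replaces many small file reads with a single load.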
gen_all_caches_in_path.m
Description: Text Data
gen_doc_cache.m
Description: Text Data
list_functions.m
Description: Text Data
## Here comes a paragraph with many endlines first...
##
##
##
##Here comes another paragraph with many endlines first...
##
##
function myhelp (arg1, arg2)
  if (nargin == 0)
    disp ("XXX: display that long list of functions/operators that come with Octave");
  elseif (nargin == 1 && ischar (arg1))
    ## Is 'arg1' an operator?
    [text, is_operator, format] = operator_help_text (arg1);

    ## Get help text if it wasn't an operator
    if (! is_operator)
      [text, format] = get_help_text (arg1);
    endif

    ## Take action depending on help text format
    switch (lower (format))
      case "plain text"
        status = 0;
      case "texinfo"
        [text, status] = makeinfo (text, "plaintext");
      case "html"
        [text, status] = strip_html_tags (text);
      case "not found"
        error ("help: `%s' not found\n", arg1);
      otherwise
        error ("help: internal error: unsupported help text format: '%s'\n", format);
    endswitch

    ## Print text
    if (status == 0)
      disp (text);
    else
      warning ("makeinfo: Texinfo formatting filter exited abnormally");
      warning ("makeinfo: raw Texinfo source of help text follows...\n");
      disp (text);
    endif
  elseif (nargin == 2 && ischar (arg1) && ischar (arg2) && strcmp (arg1, "-i"))
    warning ("help: use 'doc' instead of 'help -i'");
    doc (arg2);
  else
    error ("help: invalid input\n");
  endif
endfunction

function [text, is_operator, format] = operator_help_text (arg)
  ## Define operators and their help text
  ## XXX: We could easily use texinfo here. Should we?
  ## Each row is {operator, help text}; rows are separated by ";".
  operators = {
    "!", "Logical not operator.\n See also `~'.";
    "!=", "Logical not equals operator. See also `~' and `<>'.";
    "\"", "String delimiter.";
    "#", "Begin comment character. See also `%'.";
    "%", "Begin comment character. See also `#'.";
    "&", "Logical and operator. See also `&&'.";
    "&&", "Logical and operator. See also `&'.";
    "'", ["Matrix transpose operator. For complex matrices, computes the\n" ...
          "complex conjugate (Hermitian) transpose. See also `.''\n" ...
          "\n" ...
          "The single quote character may also be used to delimit strings, but\n" ...
          "it is better to use the double quote character, since that is never\n" ...
          "ambiguous"];
    "(", "Array index or function argument delimiter.";
    ")", "Array index or function argument delimiter.";
    "*", "Multiplication operator. See also `.*'.";
    "**", "Power operator. See also `^', `.**', and `.^'.";
    "+", "Addition operator.";
    "++", ["Increment operator. As in C, may be applied as a prefix or postfix\n" ...
           "operator."];
    ",", "Array index, function argument, or command separator.";
    "-", "Subtraction or unary negation operator.";
    "--", ["Decrement operator. As in C, may be applied as a prefix or postfix\n" ...
           "operator."];
    ".'", ["Matrix transpose operator. For complex matrices, computes the\n" ...
           "transpose, *not* the complex conjugate transpose. See also `''."];
    ".*", "Element by element multiplication operator. See also `*'.";
    ".**", "Element by element power operator. See also `**', `^', and `.^'.";
    "./", "Element by element division operator. See also `/' and `\\'.";
    ".^", "Element by element power operator. See also `**', `^', and `.^'.";
    "/", "Right division. See also `\\' and `./'.";
    ":", "Select entire rows or columns of matrices.";
    ";", "Array row or command separator. See also `,'.";
    "<", "Less than operator.";
    "<=", "Less than or equals operator.";
    "=", "Assignment operator.";
    "==", "Equality test operator.";
    ">", "Greater than operator.";
    ">=", "Greater than or equals operator.";
    "[", "Return list delimiter. See also `]'.";
    "\\", "Left division operator. See also `/' and `./'.";
    "]", "Return list delimiter. See also `['.";
    "^", "Power operator. See also `**', `.^', and `.**'.";
    "|", "Logical or operator. See also `||'.";
    "||", "Logical or operator. See also `|'.";
    "~", "Logical not operator. See also `!' and `~'.";
    "~=", "Logical not equals operator. See also `<>' and `!='."
  };

  ## Search for operators
  format = "plain text";
  idx = find (strcmp (arg, operators(:,1)));
  if (isempty (idx))
    text = "";
    is_operator = false;
  else
    text = operators{idx, 2};
    is_operator = true;
  endif
endfunction

## This function removes html tags from a text. This is used as a simple
## html-to-text function.
function [text, status] = strip_html_tags (html_text)
  start = find (html_text == "<");
  stop = find (html_text == ">");
  if (length (start) == length (stop))
    text = html_text;
    ## Delete from the last tag to the first so earlier indices stay valid
    for n = length (start):-1:1
      text(start(n):stop(n)) = [];
    endfor
    text = strip_superfluous_endlines (text);
    status = 0;
  else
    warning ("help: invalid HTML data");
    warning ("Raw HTML source follows...");
    disp (html_text);
    text = "";
    status = 1;
  endif
endfunction

## This function removes end-lines (\n) that make printing look bad
function text = strip_superfluous_endlines (text)
  ## Find groups of end-lines
  els = find (text == "\n");
  if (isempty (els))
    ## Nothing to strip; this also avoids indexing els(1) on an empty vector
    return;
  endif
  dels = diff (els);
  groups = [els(1), 1]; # list containing [start, length] of each group
  for k = 1:length (dels)
    if (dels(k) == 1)
      groups(end, 2) += 1;
    else
      groups(end+1, 1:2) = [els(k+1), 1];
    endif
  endfor

  keep = true (size (text));

  ## Remove end-lines in the beginning
  if (groups(1, 1) == 1)
    keep(1:groups(1, 2)) = false;
  endif

  ## Remove end-lines from the end
  if (sum (groups(end, :)) - 1 == length (text))
    keep(groups(end, 1):end) = false;
  endif

  ## Shorten groups of 3 or more consecutive end-lines to 2 end-lines
  idx = find (groups(:, 2) >= 3);
  for k = 1:length (idx)
    start = groups(idx(k), 1);
    stop = start + groups(idx(k), 2) - 1;
    keep(start+2:stop) = false;
  endfor

  ## Actually remove the elements
  text = text(keep);
endfunction
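As an alternative to the index-based loops in strip_html_tags and strip_superfluous_endlines, the same clean-up could probably be written with regexprep. This is only a sketch and not part of the attached code. strip_html_sketch is a hypothetical name, and unlike the original it does not check that "<" and ">" are balanced.

```octave
1;  # script-file marker, so the function below can be defined inline

## Hypothetical regexprep-based variant of the HTML clean-up.
function text = strip_html_sketch (html_text)
  ## Drop everything between "<" and the next ">".
  text = regexprep (html_text, "<[^>]*>", "");
  ## Shorten runs of 3 or more end-lines to 2 end-lines.
  text = regexprep (text, "\n{3,}", "\n\n");
  ## Remove end-lines at the very beginning and the very end.
  text = regexprep (text, "^\n+|\n+$", "");
endfunction
```

The explicit loops make the balance check and the grouping logic easy to step through, while the regular expressions are shorter. Which one is more "hackable" is a matter of taste.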
mylookfor.m
Description: Text Data
PROBLEMS.txt
Description: Text document
get_help_text.cc
Description: Text Data
get_first_help_sentence.m
Description: Text Data
makeinfo.m
Description: Text Data