[5984] refill lines uniformly (to column 72); @xref instead of @uref
From: karl
Subject: [5984] refill lines uniformly (to column 72); @xref instead of @uref
Date: Wed, 24 Dec 2014 16:36:10 +0000
Revision: 5984
http://svn.sv.gnu.org/viewvc/?view=rev&root=texinfo&revision=5984
Author: karl
Date: 2014-12-24 16:36:09 +0000 (Wed, 24 Dec 2014)
Log Message:
-----------
refill lines uniformly (to column 72); @xref instead of @uref
Modified Paths:
--------------
trunk/texindex/GNUmakefile
trunk/texindex/ti.twjr
Property Changed:
----------------
trunk/texindex/
Property changes on: trunk/texindex
___________________________________________________________________
Modified: svn:ignore
- texindex.awk
texindex.awk.in
ti.pdf
ti.t2p
ti.texi
+ texindex.awk
texindex.awk.in
ti.pdf
ti.t2p
ti.texi
texindex.html
texindex.info
Modified: trunk/texindex/GNUmakefile
===================================================================
--- trunk/texindex/GNUmakefile 2014-12-24 15:58:55 UTC (rev 5983)
+++ trunk/texindex/GNUmakefile 2014-12-24 16:36:09 UTC (rev 5984)
@@ -2,19 +2,22 @@
TEXI = ti.texi
AWK = texindex.awk
-all: $(AWK) html ti.pdf
+all: $(AWK) html ti.pdf check
$(TEXI): $(SOURCE)
- rm -f $@
- $(GAWK) ./jrweave $(SOURCE) >$(TEXI) || rm -f $@; chmod a-w $@
+ rm -f $@; $(GAWK) ./jrweave $(SOURCE) >$(TEXI) || rm -f $@; chmod a-w $@
$(AWK): $(SOURCE)
- ./jrtangle $(SOURCE) || rm -f $@
+ rm -f $@; ./jrtangle $(SOURCE) || rm -f $@
ti.pdf: $(TEXI)
texi2dvi --pdf --build-dir=ti.t2p -o ti.pdf $(TEXI)
-html: ti.html
+html: texindex.html
-ti.html: $(TEXI)
+texindex.html: $(TEXI)
makeinfo --no-split --html $(TEXI)
+
+check: $(AWK)
+ texindex $(ttests)/idxmarkup.cp
+ cat $(ttests)/idxmarkup.cps
Modified: trunk/texindex/ti.twjr
===================================================================
--- trunk/texindex/ti.twjr 2014-12-24 15:58:55 UTC (rev 5983)
+++ trunk/texindex/ti.twjr 2014-12-24 16:36:09 UTC (rev 5984)
@@ -91,9 +91,8 @@
@uref{https://github.com/arnoldrobbins/texiwebjr, @sc{TexiWeb Jr.@:}
literate programming system}. The underlying documentation system is
@uref{http://www.gnu.org/software/texinfo, Texinfo}, the GNU
-documentation formatting language. A single source file
-produces the runnable program, a printable document, and an online
-document.
+documentation formatting language. A single source file produces the
+runnable program, a printable document, and an online document.
@menu
* Intended audience::
@@ -119,10 +118,10 @@
@node Requirements
@chapter Requirements
-The input to this program is the list of unsorted index entries
-produced by @file{texinfo.tex} when a Texinfo document is processed.
-For example, two lines resulting from the @command{gawk} manual might
-look like this:
+The input to this program is the list of unsorted index entries produced
+by @file{texinfo.tex} when a Texinfo document is processed. For
+example, two lines resulting from the @command{gawk} manual might look
+like this:
@example
address@hidden address@hidden@address@hidden@{POSIX \command @address@hidden@}
@@ -146,8 +145,8 @@
The braces are balanced in all cases, although for use by this program,
braces can be included in the sort key by escaping them with the
address@hidden character}. This is the first character on the line. It is
-either the backslash used by @TeX{} (@samp{\}) or the at sign used by
address@hidden character}. This is the first character on the line. It
+is either the backslash used by @TeX{} (@samp{\}) or the at sign used by
Texinfo (@samp{@@}).
The job is to sort the entries, and merge those which are identical
@@ -158,17 +157,17 @@
\entry @{POSIX \command @address@hidden@address@hidden, address@hidden
@end example
-The sorting should be in the order of: all symbols first, then all digits, then
-all letters, with uppercase letters following lowercase ones, so we will need
-some smarts.
+The sorting should be in the order of: all symbols first, then all
+digits, then all letters, with uppercase letters following lowercase
+ones, so we will need some smarts.
-Input lines might be duplicated (same entry, same page, more than once), so
-we will have to deal with that.
+Input lines might be duplicated (same entry, same page, more than once),
+so we will have to deal with that.
In addition, @command{texindex} must output special lines indicating the
first character (the @dfn{initial}) of keys grouped together, but only
-if there is more than one initial used throughout the input file.
-This output looks like:
+if there is more than one initial used throughout the input file. This
+output looks like:
@example
\initial @address@hidden
@@ -178,11 +177,10 @@
@enumerate 1
@item
-The mapping of sort key to display text should be unique, with
-only the line number changing each time.
-If the same sort key has two different display texts, it means that
-different markup was used, probably inadvertently. For example, suppose
-you have the following input:
+The mapping of sort key to display text should be unique, with only the
+line number changing each time. If the same sort key has two different
+display texts, it means that different markup was used, probably
+inadvertently. For example, suppose you have the following input:
@example
@@cindex @@address@hidden()@} function
@@ -199,27 +197,27 @@
address@hidden() address@hidden@address@hidden@{\code @{field_split()@}
address@hidden
@end example
address@hidden
-This is ok, and the entries will be processed separately; the results
-will be visible in the final index as two identical appearing entries
-(most likely with different page numbers). This should cause the
-document author to search for entries that are identical with
address@hidden This is ok, and the entries will be processed separately; the
+results will be visible in the final index as two identical appearing
+entries (most likely with different page numbers). This should cause
+the document author to search for entries that are identical with
respect to text but that differ in their use of Texinfo markup.
@item
@cindex roman numerals
For the same sort key and text, page numbers will be monotonically
-increasing. This means we can just use a new page number when it
-comes in, and not have to sort entries based on both sort key and page
-number. (Which, in turn, means that we don't need to worry about page
-numbers that are roman numerals.)
+increasing. This means we can just use a new page number when it comes
+in, and not have to sort entries based on both sort key and page number.
+(Which, in turn, means that we don't need to worry about page numbers
+that are roman numerals.)
+
@end enumerate
An additional requirement, for ease of deployment, is that the program
be written in portable @command{awk}, and not use features that are
found only in GNU @command{awk} (@command{gawk}). For our purposes,
-``portable'' means ``new'' @command{awk} as defined in the 1988 book
-by Aho, Weinberger and Kernighan. This gives us functions,
+``portable'' means ``new'' @command{awk} as defined in the 1988 book by
+Aho, Weinberger and Kernighan. This gives us functions,
multidimensional arrays and a number of other important features over
the original @command{awk} shipped with V7 Unix.
@@ -284,7 +282,7 @@
Per GNU standards, we sometimes hardwire the string @samp{texindex} as
the name of the program, and sometimes use the name by which the program
-was invoked. We'll call the latter @code{Prgname}.
+was invoked. We'll call the latter @code{Prgname}.
The last line below sets up @code{Can_split_null}, which tells us if the
built-in @code{split()} function will split apart a string into its
@@ -354,9 +352,9 @@
@node Processing records
@chapter Processing records
-Processing records includes setting things up for each
-input file, pulling apart each record, sorting the data
-at the end, and writing out the data properly.
+Processing records includes setting things up for each input file,
+pulling apart each record, sorting the data at the end, and writing out
+the data properly.
@menu
* Setup for each input file:: What happens at the start of each file.
@@ -373,16 +371,14 @@
mentioned above, the @option{-o} option in the C implementation of
@command{texindex} has been omitted here.)
-When @code{beginfile()} is called, the first record
-has already been read, so it's possible to perform the
-checks for a Texinfo index file: The first character
-must be either @samp{\} or @samp{@@}, and the next five
-characters must be the word @samp{entry}.
+When @code{beginfile()} is called, the first record has already been
+read, so it's possible to perform the checks for a Texinfo index file:
+The first character must be either @samp{\} or @samp{@@}, and the next
+five characters must be the word @samp{entry}.
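The check described in this paragraph can be sketched as follows. This is a minimal Python illustration, not the awk code in ti.twjr; the function name looks_like_index_file and the sample records in the usage note are made up for the example.

```python
def looks_like_index_file(first_record):
    """Return True if the record starts like a Texinfo index entry.

    The first character must be a backslash or an at sign (the command
    character), and the next five characters must spell "entry".
    """
    return (
        len(first_record) >= 6
        and first_record[0] in ("\\", "@")
        and first_record[1:6] == "entry"
    )
```

With this sketch, a record such as "\entry{awk}{1}{awk}" passes, while a line that begins with any other character is rejected.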
@cindex @code{Special_chars} variable
address@hidden are the three characters that
-must be preceded by the command character inside the
-first key.
address@hidden are the three characters that must be preceded by
+the command character inside the first key.
@cindex @code{Output_file} variable
@cindex @code{Command_char} variable
@@ -413,9 +409,8 @@
@node Processing each record
@section Processing each record
-Record processing consists of building the data structures
-for use in sorting and printing once the whole file has been
-processed.
+Record processing consists of building the data structures for use in
+sorting and printing once the whole file has been processed.
@<Process a record@> =
{
@@ -446,10 +441,10 @@
@cindex duplicates, removing
@cindex @code{Seen} array
-Duplicates are going to be exact. Removing them is thus easy;
-store each incoming line as the index of an array named @code{Seen}.
-If a line is not there, it has not been seen. Otherwise it
-has, and we move on to the next record.
+Duplicates are going to be exact. Removing them is thus easy; store
+each incoming line as the index of an array named @code{Seen}. If a
+line is not there, it has not been seen. Otherwise it has, and we move
+on to the next record.
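As a rough illustration of the approach in this paragraph, the following Python sketch uses a set in the role of the Seen array. The function name remove_duplicates and the sample lines are hypothetical; the real program does this per record in awk.

```python
def remove_duplicates(lines):
    """Drop exact repeats, keeping the first occurrence of each line in order."""
    seen = set()          # plays the role of the Seen array
    unique = []
    for line in lines:
        if line in seen:
            continue      # already seen: move on to the next record
        seen.add(line)
        unique.append(line)
    return unique


# Hypothetical input: the same entry on the same page twice, then a new page.
sample = [
    "\\entry{awk}{1}{awk}",
    "\\entry{awk}{1}{awk}",
    "\\entry{awk}{2}{awk}",
]
print(remove_duplicates(sample))
```

Because duplicates are exact, no parsing is needed before the membership test; the whole line serves as the lookup key.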
@<Remove duplicates@>=
# Remove duplicates, which can happen
@@ -476,23 +471,22 @@
initial = extract_initial($0)
@
-Extracting the initial is mildly complicated.
-Braces can be nested, and in particular the very first field of
-the sort key can be an open brace. So it is necessary to skip leading
-open braces until we encounter the first real character. This in turn
-could be @address@hidden or @address@hidden preceded by the command character, or
-another character.
+Extracting the initial is mildly complicated. Braces can be nested, and
+in particular the very first field of the sort key can be an open brace.
+So it is necessary to skip leading open braces until we encounter the
+first real character. This in turn could be @address@hidden or @address@hidden
+preceded by the command character, or another character.
An example can be seen in what older versions of @file{texinfo.tex}
-generated if you needed to index a real backslash, namely an input
-line something like the following:
+generated if you needed to index a real backslash, namely an input line
+something like the following:
@example
address@hidden@{\tt \indexbackslash @} (backslash)@address@hidden@address@hidden @address@hidden @dots{}
@end example
-Fortunately, the first non-brace character is a backslash, and that
-is also the correct initial.
+Fortunately, the first non-brace character is a backslash, and that is
+also the correct initial.
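The brace-skipping rule described above can be sketched like this. It is a Python approximation under stated assumptions: the function receives only the sort-key field (starting at its opening brace) plus the command character, and it does no further normalization of the result. The actual extract_initial() in ti.twjr may differ on both points.

```python
def extract_initial(key_field, command_char):
    """Return the first real character of a sort key.

    Leading open braces are skipped.  If the first remaining character is
    the command character followed by a brace, the brace is the initial;
    otherwise the character itself is the initial.
    """
    i = 0
    while i < len(key_field) and key_field[i] == "{":
        i += 1
    if i >= len(key_field):
        return ""         # nothing but open braces (assumed not to occur)
    ch = key_field[i]
    if ch == command_char and i + 1 < len(key_field) and key_field[i + 1] in "{}":
        return key_field[i + 1]
    return ch
```

For a key field beginning with two open braces followed by "\tt \indexbackslash", this sketch returns the backslash, which matches the case discussed above.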
@cindex @code{extract_initial()} function
@cindex @code{char_split()} function
@@ -548,15 +542,15 @@
@cindex @code{Keys} array
@cindex @code{Entries} variable
@cindex @code{Data} array
-We use a traditional @command{awk} multidimensional array to store
-the various bits and pieces. The subscripts are based on the sort key,
-and the parts are the @code{"linenum"}, the output @code{"text"},
-and the @code{"initial"}. In addition, the key is stored as data
-in the @code{Entries} array. This array is sorted later on.
+We use a traditional @command{awk} multidimensional array to store the
+various bits and pieces. The subscripts are based on the sort key, and
+the parts are the @code{"linenum"}, the output @code{"text"}, and the
address@hidden"initial"}. In addition, the key is stored as data in the
address@hidden array. This array is sorted later on.
-The key and the text are invariant across entries; only the line
-number changes, so we use the key and text as the unique index
-into @code{Data}.
+The key and the text are invariant across entries; only the line number
+changes, so we use the key and text as the unique index into
address@hidden
@<Store the data for this line in the @code{Data} array@>=
if (! ((key, "text") in Data)) {
@@ -592,23 +586,22 @@
@node Splitting the record
@subsection Splitting the record: @code{field_split}
-Let's take a look at the function that breaks apart the record.
-Upon entry to the function, the value of @code{record}
-looks something like:
+Let's take a look at the function that breaks apart the record. Upon
+entry to the function, the value of @code{record} looks something like:
@example
@{POSIX address@hidden@address@hidden@{POSIX \command @address@hidden@}
@end example
-The first field may have instances of @samp{@@@{} and/or @samp{@@@}}
-(or @address@hidden and/or @address@hidden), so
-the braces aren't necessarily exactly balanced.
+The first field may have instances of @samp{@@@{} and/or @samp{@@@}} (or
address@hidden@{} and/or @address@hidden), so the braces aren't necessarily exactly
+balanced.
The @code{field_split()} function uses fairly straightforward ``count
the delimiters'' code. The loop starts at 2, since we know the first
-character is an open brace. The main things to handle are the
-command character and the final closing brace. The third field is
-taken as a whole; this is described shortly.
+character is an open brace. The main things to handle are the command
+character and the final closing brace. The third field is taken as a
+whole; this is described shortly.
@cindex @code{field_split()} function
@cindex @code{char_split()} function
@@ -646,10 +639,9 @@
}
@
-If the character following the command character is an
-open brace, close brace, or the command character itself,
-we pull it in. Otherwise, the command character is left
-alone as part of the field.
+If the character following the command character is an open brace, close
+brace, or the command character itself, we pull it in. Otherwise, the
+command character is left alone as part of the field.
@<Handle the character after the command character@>=
if (index(Special_chars, chars[i+1]) != 0) {
@@ -659,12 +651,11 @@
out[k++] = chars[i]
@
-Upon seeing the final closing brace, we put all the characters
-back together into a string using @code{join()}.
-We then reset the @code{out} array for the next time through.
-If the next character isn't an open brace, then the line is bad
-and we print a fatal error. Otherwise, we reset @code{delim_count}
-to one.
+Upon seeing the final closing brace, we put all the characters back
+together into a string using @code{join()}. We then reset the
address@hidden array for the next time through. If the next character isn't
+an open brace, then the line is bad and we print a fatal error.
+Otherwise, we reset @code{delim_count} to one.
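The "count the delimiters" approach described in this section might look roughly like the following Python sketch. It is not a transcription of the awk field_split(): it assumes that an escaped brace or command character is stored in the field without its escape, and it raises ValueError where the real program prints a fatal error.

```python
def field_split(record, command_char):
    """Split "{key}{page}{text}" into [key, page, text]."""
    special = "{}" + command_char
    if not record.startswith("{"):
        raise ValueError("bad line: record must start with an open brace")
    fields = []
    out = []              # characters collected for the current field
    delim_count = 1       # we are already inside the first open brace
    i = 1                 # position 2 in awk's 1-based terms
    n = len(record)
    while i < n and len(fields) < 2:
        ch = record[i]
        if ch == command_char and i + 1 < n and record[i + 1] in special:
            out.append(record[i + 1])   # escaped character: pull it in
            i += 2
            continue
        if ch == "{":
            delim_count += 1
        elif ch == "}":
            delim_count -= 1
            if delim_count == 0:
                # Final closing brace of this field: join and reset.
                fields.append("".join(out))
                out = []
                if i + 1 >= n or record[i + 1] != "{":
                    raise ValueError("bad line: expected an open brace")
                delim_count = 1
                i += 2                  # skip "}" and the following "{"
                continue
        out.append(ch)
        i += 1
    # The third field is taken as a whole, minus its final closing brace.
    rest = record[i:]
    if len(fields) < 2 or not rest.endswith("}"):
        raise ValueError("bad line: missing third field")
    fields.append(rest[:-1])
    return fields
```

Taking the third field whole avoids counting braces in the display text, where Texinfo markup can nest arbitrarily.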
@cindex @code{join()} function
@<Finish off the field, set up for next field@>=
@@ -729,8 +720,8 @@
}
@
-Printing the initial is not complicated. The main thing
-is to precede special characters with the command character.
+Printing the initial is not complicated. The main thing is to precede
+special characters with the command character.
@cindex @code{Command_char} variable
@cindex @code{Special_chars} variable
@@ -759,8 +750,8 @@
@cindex quicksort
@cindex Hoare, C.A.R.
-Sorting uses a standard Quick Sort, with the @code{less_than()}
-function supplying the comparison.
+Sorting uses a standard Quick Sort, with the @code{less_than()} function
+supplying the comparison.
@cindex @code{less_than()} function
@cindex @code{quicksort()} function
@@ -798,8 +789,8 @@
@node Comparing index entries
@subsection Comparing index entries
-The comparison function is the heart of the sorting algorithm.
-The comparison is based on the indexing rules, which are:
+The comparison function is the heart of the sorting algorithm. The
+comparison is based on the indexing rules, which are:
@itemize @bullet
@item
@@ -809,15 +800,14 @@
Followed by digits.
@item
-Followed by letters. Lowercase precedes uppercase and both ``a''
-and ``A'' precede anything starting with ``b'' or ``B'' (etc.).
+Followed by letters. Lowercase precedes uppercase and both ``a'' and
+``A'' precede anything starting with ``b'' or ``B'' (etc.).
@end itemize
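The ordering rules in this list can be approximated with a per-character sort key, sketched below in Python. This is a simplification and not the less_than() algorithm from ti.twjr: it compares strictly character by character and does not reproduce any look-ahead behavior. The names char_rank and sort_key and the sample entries are made up for the example.

```python
def char_rank(ch):
    """Rank one character: symbols (0), then digits (1), then letters (2).

    Letters compare by their lowercase form first, so "a" and "A" both
    precede "b" and "B"; lowercase then precedes uppercase.
    """
    if ch.isascii() and ch.isalpha():
        return (2, ch.lower(), 1 if ch.isupper() else 0)
    if ch.isascii() and ch.isdigit():
        return (1, ch, 0)
    return (0, ch, 0)


def sort_key(text):
    """Build a list of per-character ranks; shorter prefixes sort first."""
    return [char_rank(ch) for ch in text]


entries = ["Banana", "apple", "Apple", "9lives", "_init", "banana"]
print(sorted(entries, key=sort_key))
# prints ['_init', '9lives', 'apple', 'Apple', 'banana', 'Banana']
```

Because Python compares lists element by element and treats a proper prefix as smaller, a string that is an initial substring of another sorts before it.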
-Implementing these rules is a little complicated.
-The first thing we need is a table
-that maps characters to comparison values.
-The following code is based on the original C @command{texindex},
-although the actual comparison algorithm is more sophisticated.
+Implementing these rules is a little complicated. The first thing we
+need is a table that maps characters to comparison values. The
+following code is based on the original C @command{texindex}, although
+the actual comparison algorithm is more sophisticated.
We set up an @code{Ordval} array to map characters to numeric values.
Most characters map to their ASCII code. We add 512 to the value of
@@ -827,10 +817,9 @@
although @TeX{} does everything in ASCII, so it's not likely to make a
difference.}
-The table must be built completely before changing the
-mapping of the letters, because all of the uppercase and
-lowercase letters must be in the table before we can
-change their values.
+The table must be built completely before changing the mapping of the
+letters, because all of the uppercase and lowercase letters must be in
+the table before we can change their values.
@cindex @code{Ordval} array
@<Work functions@>=
@@ -862,8 +851,8 @@
}
@
-Here is the @code{less_than()} function. It returns true if the @code{left}
-string is ``less than'' the @code{right} string.
+Here is the @code{less_than()} function. It returns true if the
address@hidden string is ``less than'' the @code{right} string.
The comparison algorithm is not too complicated, once we define how
things should work. We loop over each pair of characters in the
@@ -877,17 +866,16 @@
@c nested table
@table @i
@item Same letter, but different case
-This is the complicated case. First, we want lowercase letters
-to be ordered before uppercase ones, even though this is the
-opposite of the natural ASCII ordering. To make this happen,
-we use a @samp{>} comparison instead of a @samp{<} comparison.
+This is the complicated case. First, we want lowercase letters to be
+ordered before uppercase ones, even though this is the opposite of the
+natural ASCII ordering. To make this happen, we use a @samp{>}
+comparison instead of a @samp{<} comparison.
-Second, when two characters are equal, we have to look ahead
-at the next characters to decide whether to continue the
-loop or quit. As long as we are not at the end of the string,
-and at least one of the following characters in either string is a letter,
-we continue the loop. Otherwise we do the character comparison
-and return.
+Second, when two characters are equal, we have to look ahead at the next
+characters to decide whether to continue the loop or quit. As long as
+we are not at the end of the string, and at least one of the following
+characters in either string is a letter, we continue the loop.
+Otherwise we do the character comparison and return.
@item Two different letters, but same case
@itemx Two different letters, different case
@@ -901,11 +889,9 @@
@end table
address@hidden
-When the values are equal, continue around the loop.
-And, as usual, if one string is an initial substring
-of the other, that one is considered to be ``less than''
-the other one.
address@hidden When the values are equal, continue around the loop. And, as
+usual, if one string is an initial substring of the other, that one is
+considered to be ``less than'' the other one.
The rules just described produce @emph{better} results than did the C
@command{texindex}. For example, @samp{beginfile()} sorts
@@ -975,7 +961,7 @@
@node Necessary stuff
@chapter Necessary stuff that isn't thrilling
-This chapter provides some elements that are necessary but not exciting.
+This chapter provides some necessary but unexciting elements.
@menu
* Copyright statement:: Copyright info.
@@ -1019,7 +1005,7 @@
The program uses several library routines discussed in detail
in the @command{gawk} documentation. The first sets up the
infrastructure for the @code{beginfile()} and @code{endfile()} functions.
-See @uref{http://www.gnu.org/software/gawk/manual/html_node/Filetrans-Function.html}
address@hidden Function,,, gawk, GNU Awk User's Guide},
for an explanation of how this function works.
@cindex @file{ftrans.awk} library file
@@ -1041,9 +1027,8 @@
END { endfile(_filename_) }
@
-The next function is @code{join()}, which joins
-an array of characters back into a string.
-See @uref{http://www.gnu.org/software/gawk/manual/html_node/Join-Function.html}
+The next function is @code{join()}, which joins an array of characters
+back into a string. @xref{Join Function,,, gawk, GNU Awk User's Guide},
for an explanation of how this function works.
@cindex @file{join.awk} library file
@@ -1199,22 +1184,21 @@
For @command{gawk}, we can arrange for the various messages, e.g., in
the @code{usage()} and @code{version()} functions, to be translated. We
do this by setting the text domain at startup. For more information on
-internationalization in @command{gawk}, see
address@hidden://www.gnu.org/software/gawk/manual/html_node/Internationalization.html}.
+internationalization in @command{gawk},
address@hidden,,, gawk, GNU Awk User's Guide}.
@<Initial setup@>=
TEXTDOMAIN = "texinfo"
@
address@hidden
-On non-GNU versions of @command{awk}, this is a harmless assignment, and
-the @code{_"..."} construct below is a harmless concatenation of an
-unassigned variable @code{_}, i.e., the empty string.
address@hidden On non-GNU versions of @command{awk}, this is a harmless
+assignment, and the @code{_"..."} construct below is a harmless
+concatenation of an unassigned variable @code{_}, i.e., the empty
+string.
-The @code{usage()} and @code{version()} functions print the
-necessary information and then exit. The strings that
-can and should be translated are prefixed with an
-underscore.
+The @code{usage()} and @code{version()} functions print the necessary
+information and then exit. The strings that can and should be
+translated are prefixed with an underscore.
@cindex @code{Texindex_Version} variable
@cindex @code{usage()} function