[gawk-diffs] [SCM] gawk branch, gawk-4.1-stable, updated. gawk-4.1.0-559
From: Arnold Robbins
Subject: [gawk-diffs] [SCM] gawk branch, gawk-4.1-stable, updated. gawk-4.1.0-559-g6f22075
Date: Fri, 23 Jan 2015 11:04:37 +0000
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "gawk".
The branch, gawk-4.1-stable has been updated
via 6f220759af1c8e37f56acd334a295daa8c4a2651 (commit)
from 8e0e08c84626633e1d4b7b431576d4ec7d8f10c4 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=6f220759af1c8e37f56acd334a295daa8c4a2651
commit 6f220759af1c8e37f56acd334a295daa8c4a2651
Author: Arnold D. Robbins <address@hidden>
Date: Fri Jan 23 13:04:09 2015 +0200
More O'Reilly fixes.
diff --git a/doc/ChangeLog b/doc/ChangeLog
index b78fcb6..d16c7c7 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,7 @@
+2015-01-23 Arnold D. Robbins <address@hidden>
+
+ * gawktexi.in: O'Reilly fixes.
+
2015-01-21 Arnold D. Robbins <address@hidden>
* gawktexi.in: O'Reilly fixes.
diff --git a/doc/gawk.info b/doc/gawk.info
index de00422..2a17cbc 100644
--- a/doc/gawk.info
+++ b/doc/gawk.info
@@ -3802,8 +3802,9 @@ Collating symbols
Equivalence classes
Locale-specific names for a list of characters that are equal. The
name is enclosed between `[=' and `=]'. For example, the name `e'
- might be used to represent all of "e," "e`," and "e'." In this
- case, `[[=e=]]' is a regexp that matches any of `e', `e'', or `e`'.
+ might be used to represent all of "e," "e^," "e`," and "e'." In
+ this case, `[[=e=]]' is a regexp that matches any of `e', `e^',
+ `e'', or `e`'.
These features are very valuable in non-English-speaking locales.
@@ -3825,7 +3826,7 @@ Consider the following:
This example uses the `sub()' function to make a change to the input
record. (`sub()' replaces the first instance of any text matched by
the first argument with the string provided as the second argument;
-*note String Functions::). Here, the regexp `/a+/' indicates "one or
+*note String Functions::.) Here, the regexp `/a+/' indicates "one or
more `a' characters," and the replacement text is `<A>'.
The input contains four `a' characters. `awk' (and POSIX) regular
@@ -3862,15 +3863,16 @@ regexp":
This sets `digits_regexp' to a regexp that describes one or more digits,
and tests whether the input record matches this regexp.
- NOTE: When using the `~' and `!~' operators, there is a difference
- between a regexp constant enclosed in slashes and a string
- constant enclosed in double quotes. If you are going to use a
- string constant, you have to understand that the string is, in
- essence, scanned _twice_: the first time when `awk' reads your
+ NOTE: When using the `~' and `!~' operators, be aware that there
+ is a difference between a regexp constant enclosed in slashes and
+ a string constant enclosed in double quotes. If you are going to
+ use a string constant, you have to understand that the string is,
+ in essence, scanned _twice_: the first time when `awk' reads your
program, and the second time when it goes to match the string on
the lefthand side of the operator with the pattern on the right.
This is true of any string-valued expression (such as
- `digits_regexp', shown previously), not just string constants.
+ `digits_regexp', shown in the previous example), not just string
+ constants.
What difference does it make if the string is scanned twice? The
answer has to do with escape sequences, and particularly with
@@ -3967,7 +3969,7 @@ letters, digits, or underscores (`_'):
`\B'
Matches the empty string that occurs between two word-constituent
- characters. For example, `/\Brat\B/' matches `crate' but it does
+ characters. For example, `/\Brat\B/' matches `crate', but it does
not match `dirty rat'. `\B' is essentially the opposite of `\y'.
There are two other operators that work on buffers. In Emacs, a
@@ -3976,10 +3978,10 @@ letters, digits, or underscores (`_'):
operators are:
`\`'
- Matches the empty string at the beginning of a buffer (string).
+ Matches the empty string at the beginning of a buffer (string)
`\''
- Matches the empty string at the end of a buffer (string).
+ Matches the empty string at the end of a buffer (string)
Because `^' and `$' always work in terms of the beginning and end of
strings, these operators don't add any new capabilities for `awk'.
@@ -4150,7 +4152,7 @@ one line. Each record is automatically split into chunks called
parts of a record.
On rare occasions, you may need to use the `getline' command. The
-`getline' command is valuable, both because it can do explicit input
+`getline' command is valuable both because it can do explicit input
from any number of files, and because the files used with it do not
have to be named on the `awk' command line (*note Getline::).
@@ -4199,8 +4201,8 @@ File: gawk.info, Node: awk split records, Next: gawk split records, Up: Recor
Records are separated by a character called the "record separator". By
default, the record separator is the newline character. This is why
-records are, by default, single lines. A different character can be
-used for the record separator by assigning the character to the
+records are, by default, single lines. To use a different character
+for the record separator, simply assign that character to the
predefined variable `RS'.
Like any other variable, the value of `RS' can be changed in the
@@ -4215,14 +4217,14 @@ BEGIN/END::). For example:
awk 'BEGIN { RS = "u" }
{ print $0 }' mail-list
-changes the value of `RS' to `u', before reading any input. This is a
-string whose first character is the letter "u"; as a result, records
-are separated by the letter "u." Then the input file is read, and the
-second rule in the `awk' program (the action with no pattern) prints
-each record. Because each `print' statement adds a newline at the end
-of its output, this `awk' program copies the input with each `u'
-changed to a newline. Here are the results of running the program on
-`mail-list':
+changes the value of `RS' to `u', before reading any input. The new
+value is a string whose first character is the letter "u"; as a result,
+records are separated by the letter "u". Then the input file is read,
+and the second rule in the `awk' program (the action with no pattern)
+prints each record. Because each `print' statement adds a newline at
+the end of its output, this `awk' program copies the input with each
+`u' changed to a newline. Here are the results of running the program
+on `mail-list':
$ awk 'BEGIN { RS = "u" }
> { print $0 }' mail-list
@@ -4270,11 +4272,11 @@ data file (*note Sample Data Files::), the line looks like this:
Bill 555-1675 address@hidden A
-It contains no `u' so there is no reason to split the record, unlike
-the others which have one or more occurrences of the `u'. In fact,
-this record is treated as part of the previous record; the newline
-separating them in the output is the original newline in the data file,
-not the one added by `awk' when it printed the record!
+It contains no `u', so there is no reason to split the record, unlike
+the others, which each have one or more occurrences of the `u'. In
+fact, this record is treated as part of the previous record; the
+newline separating them in the output is the original newline in the
+data file, not the one added by `awk' when it printed the record!
Another way to change the record separator is on the command line,
using the variable-assignment feature (*note Other Arguments::):
@@ -4340,8 +4342,8 @@ part of either record.
character. However, when `RS' is a regular expression, `RT' contains
the actual input text that matched the regular expression.
- If the input file ended without any text that matches `RS', `gawk'
-sets `RT' to the null string.
+ If the input file ends without any text matching `RS', `gawk' sets
+`RT' to the null string.
The following example illustrates both of these features. It sets
`RS' equal to a regular expression that matches either a newline or a
@@ -4439,12 +4441,12 @@ to these pieces of the record. You don't have to use them--you can
operate on the whole record if you want--but fields are what make
simple `awk' programs so powerful.
- You use a dollar-sign (`$') to refer to a field in an `awk' program,
+ You use a dollar sign (`$') to refer to a field in an `awk' program,
followed by the number of the field you want. Thus, `$1' refers to the
-first field, `$2' to the second, and so on. (Unlike the Unix shells,
-the field numbers are not limited to single digits. `$127' is the
-127th field in the record.) For example, suppose the following is a
-line of input:
+first field, `$2' to the second, and so on. (Unlike in the Unix
+shells, the field numbers are not limited to single digits. `$127' is
+the 127th field in the record.) For example, suppose the following is
+a line of input:
This seems like a pretty nice example.
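As a quick sketch of these field references (feeding that sample line on
standard input; the `-|' lines show what `awk' would print):

     echo "This seems like a pretty nice example." |
     awk '{ print $1; print $7; print $8 }'
     -| This
     -| example.
     -|

`$8' is past the last field, so the final `print' produces an empty line.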
@@ -4461,10 +4463,9 @@ as `$7', which is `example.'. If you try to reference a field beyond
the last one (such as `$8' when the record has only seven fields), you
get the empty string. (If used in a numeric operation, you get zero.)
- The use of `$0', which looks like a reference to the "zero-th"
-field, is a special case: it represents the whole input record. Use it
-when you are not interested in specific fields. Here are some more
-examples:
+ The use of `$0', which looks like a reference to the "zeroth" field,
+is a special case: it represents the whole input record. Use it when
+you are not interested in specific fields. Here are some more examples:
$ awk '$1 ~ /li/ { print $0 }' mail-list
-| Amelia 555-5553 address@hidden F
@@ -4512,8 +4513,8 @@ is another example of using expressions as field numbers:
awk '{ print $(2*2) }' mail-list
`awk' evaluates the expression `(2*2)' and uses its value as the
-number of the field to print. The `*' sign represents multiplication,
-so the expression `2*2' evaluates to four. The parentheses are used so
+number of the field to print. The `*' represents multiplication, so
+the expression `2*2' evaluates to four. The parentheses are used so
that the multiplication is done before the `$' operation; they are
necessary whenever there is a binary operator(1) in the field-number
expression. This example, then, prints the type of relationship (the
@@ -4537,7 +4538,7 @@ field number.
---------- Footnotes ----------
(1) A "binary operator", such as `*' for multiplication, is one that
-takes two operands. The distinction is required, because `awk' also has
+takes two operands. The distinction is required because `awk' also has
unary (one-operand) and ternary (three-operand) operators.
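A minimal sketch of a computed field number (assuming the `mail-list'
sample file from the manual; `n' is just an illustrative variable):

     awk '{ n = 2; print $(n + 2) }' mail-list

Without the parentheses, `$n + 2' is `($n) + 2', the value of the second
field plus two, rather than a reference to the fourth field.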
@@ -4659,7 +4660,7 @@ value of `NF' and recomputes `$0'. (d.c.) Here is an example:
decremented.
Finally, there are times when it is convenient to force `awk' to
-rebuild the entire record, using the current value of the fields and
+rebuild the entire record, using the current values of the fields and
`OFS'. To do this, use the seemingly innocuous assignment:
$1 = $1 # force record to be reconstituted
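A one-line sketch of the effect (the `OFS' value here is just for
illustration):

     echo "a b c" | awk 'BEGIN { OFS = "-" } { $1 = $1; print $0 }'
     -| a-b-c

Without the `$1 = $1' assignment, `$0' would be printed unchanged, because
nothing has forced the record to be rebuilt with the new `OFS'.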
@@ -4679,7 +4680,7 @@ built-in function that updates `$0', such as `sub()' and `gsub()'
It is important to remember that `$0' is the _full_ record, exactly
as it was read from the input. This includes any leading or trailing
whitespace, and the exact whitespace (or other characters) that
-separate the fields.
+separates the fields.
It is a common error to try to change the field separators in a
record simply by setting `FS' and `OFS', and then expecting a plain
@@ -4747,7 +4748,7 @@ attached, such as:
John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139
-The same program would extract `*LXIX', instead of `*29*Oak*St.'. If
+The same program would extract `*LXIX' instead of `*29*Oak*St.'. If
you were expecting the program to print the address, you would be
surprised. The moral is to choose your data layout and separator
characters carefully to prevent such problems. (If the data is not in
@@ -4946,11 +4947,11 @@ your field and record separators.
Perhaps the most common use of a single character as the field
separator occurs when processing the Unix system password file. On
many Unix systems, each user has a separate entry in the system
-password file, one line per user. The information in these lines is
-separated by colons. The first field is the user's login name and the
-second is the user's encrypted or shadow password. (A shadow password
-is indicated by the presence of a single `x' in the second field.) A
-password file entry might look like this:
+password file, with one line per user. The information in these lines
+is separated by colons. The first field is the user's login name and
+the second is the user's encrypted or shadow password. (A shadow
+password is indicated by the presence of a single `x' in the second
+field.) A password file entry might look like this:
arnold:x:2076:10:Arnold Robbins:/home/arnold:/bin/bash
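For instance, a minimal sketch that prints each login name from such a
file:

     awk -F: '{ print $1 }' /etc/passwd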
@@ -4978,15 +4979,14 @@ When you do this, `$1' is the same as `$0'.
According to the POSIX standard, `awk' is supposed to behave as if
each record is split into fields at the time it is read. In
particular, this means that if you change the value of `FS' after a
-record is read, the value of the fields (i.e., how they were split)
+record is read, the values of the fields (i.e., how they were split)
should reflect the old value of `FS', not the new one.
However, many older implementations of `awk' do not work this way.
Instead, they defer splitting the fields until a field is actually
referenced. The fields are split using the _current_ value of `FS'!
(d.c.) This behavior can be difficult to diagnose. The following
-example illustrates the difference between the two methods. (The
-`sed'(2) command prints just the first line of `/etc/passwd'.)
+example illustrates the difference between the two methods:
sed 1q /etc/passwd | awk '{ FS = ":" ; print $1 }'
@@ -4999,6 +4999,8 @@ first line of the file, something like:
root:x:0:0:Root:/:
+ (The `sed'(2) command prints just the first line of `/etc/passwd'.)
+
---------- Footnotes ----------
(1) Thanks to Andrew Schorr for this tip.
@@ -5152,7 +5154,7 @@ run on a system with card readers is another story!)
splitting again. Use `FS = FS' to make this happen, without having to
know the current value of `FS'. In order to tell which kind of field
splitting is in effect, use `PROCINFO["FS"]' (*note Auto-set::). The
-value is `"FS"' if regular field splitting is being used, or it is
+value is `"FS"' if regular field splitting is being used, or
`"FIELDWIDTHS"' if fixed-width field splitting is being used:
if (PROCINFO["FS"] == "FS")
@@ -5185,10 +5187,10 @@ what they are, and not by what they are not.
The most notorious such case is so-called "comma-separated values"
(CSV) data. Many spreadsheet programs, for example, can export their
data into text files, where each record is terminated with a newline,
-and fields are separated by commas. If only commas separated the data,
+and fields are separated by commas. If commas only separated the data,
there wouldn't be an issue. The problem comes when one of the fields
contains an _embedded_ comma. In such cases, most programs embed the
-field in double quotes.(1) So we might have data like this:
+field in double quotes.(1) So, we might have data like this:
Robbins,Arnold,"1234 A Pretty Street, NE",MyTown,MyState,12345-6789,USA
@@ -5255,9 +5257,9 @@ being used.
provides an elegant solution for the majority of cases, and the
`gawk' developers are satisfied with that.
- As written, the regexp used for `FPAT' requires that each field have
-a least one character. A straightforward modification (changing
-changed the first `+' to `*') allows fields to be empty:
+ As written, the regexp used for `FPAT' requires that each field
+contain at least one character. A straightforward modification
+(changing the first `+' to `*') allows fields to be empty:
FPAT = "([^,]*)|(\"[^\"]+\")"
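A short sketch of this version in action, on the sample CSV record shown
earlier (`gawk' is required, as `FPAT' is a `gawk' extension):

     echo 'Robbins,Arnold,"1234 A Pretty Street, NE",MyTown,MyState,12345-6789,USA' |
     gawk 'BEGIN { FPAT = "([^,]*)|(\"[^\"]+\")" } { print NF; print $3 }'

This should report seven fields and print the third one with its
surrounding double quotes and embedded comma intact.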
@@ -5265,9 +5267,8 @@ changed the first `+' to `*') allows fields to be empty:
available for splitting regular strings (*note String Functions::).
To recap, `gawk' provides three independent methods to split input
-records into fields. `gawk' uses whichever mechanism was last chosen
-based on which of the three variables--`FS', `FIELDWIDTHS', and
-`FPAT'--was last assigned to.
+records into fields. The mechanism used is based on which of the three
+variables--`FS', `FIELDWIDTHS', or `FPAT'--was last assigned to.
---------- Footnotes ----------
@@ -5305,7 +5306,7 @@ empty; lines that contain only whitespace do not count.)
`"\n\n+"' to `RS'. This regexp matches the newline at the end of the
record and one or more blank lines after the record. In addition, a
regular expression always matches the longest possible sequence when
-there is a choice (*note Leftmost Longest::). So the next record
+there is a choice (*note Leftmost Longest::). So, the next record
doesn't start until the first nonblank line that follows--no matter how
many blank lines appear in a row, they are considered one record
separator.
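A small sketch of the behavior (the two-record input here is made up on
the spot):

     printf 'Jane Doe\n123 Main Street\n\nJohn Smith\n456 Oak Avenue\n' |
     awk 'BEGIN { RS = "" } { print NR ": " $1, $2 }'
     -| 1: Jane Doe
     -| 2: John Smith

The blank line ends the first record; the newlines inside each record
merely separate fields.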
@@ -5317,12 +5318,12 @@ last record, the final newline is removed from the record. In the
second case, this special processing is not done. (d.c.)
Now that the input is separated into records, the second step is to
-separate the fields in the record. One way to do this is to divide each
-of the lines into fields in the normal manner. This happens by default
-as the result of a special feature. When `RS' is set to the empty
-string, _and_ `FS' is set to a single character, the newline character
-_always_ acts as a field separator. This is in addition to whatever
-field separations result from `FS'.(1)
+separate the fields in the records. One way to do this is to divide
+each of the lines into fields in the normal manner. This happens by
+default as the result of a special feature. When `RS' is set to the
+empty string _and_ `FS' is set to a single character, the newline
+character _always_ acts as a field separator. This is in addition to
+whatever field separations result from `FS'.(1)
The original motivation for this special exception was probably to
provide useful behavior in the default case (i.e., `FS' is equal to
@@ -5330,17 +5331,17 @@ provide useful behavior in the default case (i.e., `FS' is equal to
newline character to separate fields, because there is no way to
prevent it. However, you can work around this by using the `split()'
function to break up the record manually (*note String Functions::).
-If you have a single character field separator, you can work around the
+If you have a single-character field separator, you can work around the
special feature in a different way, by making `FS' into a regexp for
that single character. For example, if the field separator is a
percent character, instead of `FS = "%"', use `FS = "[%]"'.
Another way to separate fields is to put each field on a separate
line: to do this, just set the variable `FS' to the string `"\n"'.
-(This single character separator matches a single newline.) A
+(This single-character separator matches a single newline.) A
practical example of a data file organized this way might be a mailing
-list, where each entry is separated by blank lines. Consider a mailing
-list in a file named `addresses', which looks like this:
+list, where blank lines separate the entries. Consider a mailing list
+in a file named `addresses', which looks like this:
Jane Doe
123 Main Street
@@ -5423,7 +5424,7 @@ File: gawk.info, Node: Getline, Next: Read Timeout, Prev: Multiple Line, Up:
So far we have been getting our input data from `awk''s main input
stream--either the standard input (usually your keyboard, sometimes the
-output from another program) or from the files specified on the command
+output from another program) or the files specified on the command
line. The `awk' language has a special built-in command called
`getline' that can be used to read input under your explicit control.
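A small sketch of the plain form, joining continuation lines that end
with a backslash (the file name `data.txt' is a placeholder):

     awk '{
         while (/\\$/) {
             sub(/\\$/, "")
             saved = $0
             if ((getline) <= 0) {   # end of input: keep what we have
                 $0 = saved
                 break
             }
             $0 = saved $0
         }
         print
     }' data.txt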
@@ -5561,7 +5562,7 @@ and produces these results:
free
The `getline' command used in this way sets only the variables `NR',
-`FNR', and `RT' (and of course, VAR). The record is not split into
+`FNR', and `RT' (and, of course, VAR). The record is not split into
fields, so the values of the fields (including `$0') and the value of
`NF' do not change.
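A one-rule sketch of the difference (each pass consumes two input lines;
`data.txt' is a placeholder):

     awk '{ getline nextline
            print "fields of this line:", NF, "| next line:", nextline }' data.txt

Because the second line is read into `nextline', `NF' and `$0' still
describe the line read by the main input loop.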
@@ -5571,8 +5572,8 @@ File: gawk.info, Node: Getline/File, Next: Getline/Variable/File, Prev: Getli
4.9.3 Using `getline' from a File
---------------------------------
-Use `getline < FILE' to read the next record from FILE. Here FILE is a
-string-valued expression that specifies the file name. `< FILE' is
+Use `getline < FILE' to read the next record from FILE. Here, FILE is
+a string-valued expression that specifies the file name. `< FILE' is
called a "redirection" because it directs input to come from a
different place. For example, the following program reads its input
record from the file `secondary.input' when it encounters a first field
@@ -5708,8 +5709,8 @@ all `awk' implementations.
treatment of a construct like `"echo " "date" | getline'. Most
versions, including the current version, treat it at as `("echo "
"date") | getline'. (This is also how BWK `awk' behaves.) Some
- versions changed and treated it as `"echo " ("date" | getline)'.
- (This is how `mawk' behaves.) In short, _always_ use explicit
+ versions instead treat it as `"echo " ("date" | getline)'. (This
+ is how `mawk' behaves.) In short, _always_ use explicit
parentheses, and then you won't have to worry.
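For instance, a minimal, fully parenthesized sketch (using the system
`date' command):

     awk 'BEGIN {
         cmd = "date"
         if ((cmd | getline now) > 0)
             print "it is now", now
         close(cmd)
     }'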
@@ -5745,15 +5746,16 @@ File: gawk.info, Node: Getline/Coprocess, Next: Getline/Variable/Coprocess, P
4.9.7 Using `getline' from a Coprocess
--------------------------------------
-Input into `getline' from a pipe is a one-way operation. The command
-that is started with `COMMAND | getline' only sends data _to_ your
-`awk' program.
+Reading input into `getline' from a pipe is a one-way operation. The
+command that is started with `COMMAND | getline' only sends data _to_
+your `awk' program.
On occasion, you might want to send data to another program for
processing and then read the results back. `gawk' allows you to start
a "coprocess", with which two-way communications are possible. This is
done with the `|&' operator. Typically, you write data to the
-coprocess first and then read results back, as shown in the following:
+coprocess first and then read the results back, as shown in the
+following:
print "SOME QUERY" |& "db_server"
"db_server" |& getline
@@ -5815,7 +5817,7 @@ in mind:
files. (d.c.) (See *note BEGIN/END::; also *note Auto-set::.)
* Using `FILENAME' with `getline' (`getline < FILENAME') is likely
- to be a source for confusion. `awk' opens a separate input stream
+ to be a source of confusion. `awk' opens a separate input stream
from the current input file. However, by not using a variable,
`$0' and `NF' are still updated. If you're doing this, it's
probably by accident, and you should reconsider what it is you're
@@ -5823,15 +5825,15 @@ in mind:
* *note Getline Summary::, presents a table summarizing the
`getline' variants and which variables they can affect. It is
- worth noting that those variants which do not use redirection can
+ worth noting that those variants that do not use redirection can
cause `FILENAME' to be updated if they cause `awk' to start
reading a new input file.
* If the variable being assigned is an expression with side effects,
different versions of `awk' behave differently upon encountering
end-of-file. Some versions don't evaluate the expression; many
- versions (including `gawk') do. Here is an example, due to Duncan
- Moore:
+ versions (including `gawk') do. Here is an example, courtesy of
+ Duncan Moore:
BEGIN {
system("echo 1 > f")
@@ -5839,8 +5841,8 @@ in mind:
print c
}
- Here, the side effect is the `++c'. Is `c' incremented if end of
- file is encountered, before the element in `a' is assigned?
+ Here, the side effect is the `++c'. Is `c' incremented if
+ end-of-file is encountered before the element in `a' is assigned?
`gawk' treats `getline' like a function call, and evaluates the
expression `a[++c]' before attempting to read from `f'. However,
@@ -5884,8 +5886,8 @@ This minor node describes a feature that is specific to `gawk'.
You may specify a timeout in milliseconds for reading input from the
keyboard, a pipe, or two-way communication, including TCP/IP sockets.
-This can be done on a per input, command, or connection basis, by
-setting a special element in the `PROCINFO' array (*note Auto-set::):
+This can be done on a per-input, per-command, or per-connection basis,
+by setting a special element in the `PROCINFO' array (*note Auto-set::):
PROCINFO["input_name", "READ_TIMEOUT"] = TIMEOUT IN MILLISECONDS
@@ -5909,7 +5911,7 @@ for more than five seconds:
print $0
`gawk' terminates the read operation if input does not arrive after
-waiting for the timeout period, returns failure and sets `ERRNO' to an
+waiting for the timeout period, returns failure, and sets `ERRNO' to an
appropriate string value. A negative or zero value for the timeout is
the same as specifying no timeout at all.
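For instance, a minimal sketch that gives up on standard input after five
seconds and reports the reason:

     gawk 'BEGIN {
         PROCINFO["/dev/stdin", "READ_TIMEOUT"] = 5000
         while ((getline line < "/dev/stdin") > 0)
             print line
         if (ERRNO != "")
             print "read failed:", ERRNO > "/dev/stderr"
     }'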
@@ -5949,7 +5951,7 @@ input to arrive:
environment variable exists, `gawk' uses its value to initialize the
timeout value. The exclusive use of the environment variable to
specify timeout has the disadvantage of not being able to control it on
-a per command or connection basis.
+a per-command or per-connection basis.
`gawk' considers a timeout event to be an error even though the
attempt to read from the underlying device may succeed in a later
@@ -6017,7 +6019,7 @@ File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-li
* `gawk' sets `RT' to the text matched by `RS'.
* After splitting the input into records, `awk' further splits the
- record into individual fields, named `$1', `$2', and so on. `$0'
+ records into individual fields, named `$1', `$2', and so on. `$0'
is the whole record, and `NF' indicates how many fields there are.
The default way to split fields is between whitespace characters.
@@ -6031,19 +6033,21 @@ File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-li
* Field splitting is more complicated than record splitting:
- Field separator value Fields are split ... `awk' /
- `gawk'
+ Field separator value Fields are split ... `awk' /
+ `gawk'
----------------------------------------------------------------------
- `FS == " "' On runs of whitespace `awk'
- `FS == ANY SINGLE On that character `awk'
- CHARACTER'
- `FS == REGEXP' On text matching the regexp `awk'
- `FS == ""' Each individual character is `gawk'
- a separate field
- `FIELDWIDTHS == LIST OF Based on character position `gawk'
- COLUMNS'
- `FPAT == REGEXP' On the text surrounding text `gawk'
- matching the regexp
+ `FS == " "' On runs of whitespace `awk'
+ `FS == ANY SINGLE On that character `awk'
+ CHARACTER'
+ `FS == REGEXP' On text matching the `awk'
+ regexp
+ `FS == ""' Such that each individual `gawk'
+ character is a separate
+ field
+ `FIELDWIDTHS == LIST OF Based on character `gawk'
+ COLUMNS' position
+ `FPAT == REGEXP' On the text surrounding `gawk'
+ text matching the regexp
* Using `FS = "\n"' causes the entire record to be a single field
(assuming that newlines separate records).
@@ -6053,12 +6057,11 @@ File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-li
* Use `PROCINFO["FS"]' to see how fields are being split.
- * Use `getline' in its various forms to read additional records,
- from the default input stream, from a file, or from a pipe or
- coprocess.
+ * Use `getline' in its various forms to read additional records from
+ the default input stream, from a file, or from a pipe or coprocess.
- * Use `PROCINFO[FILE, "READ_TIMEOUT"]' to cause reads to timeout for
- FILE.
+ * Use `PROCINFO[FILE, "READ_TIMEOUT"]' to cause reads to time out
+ for FILE.
* Directories on the command line are fatal for standard `awk';
`gawk' ignores them if not in POSIX mode.
@@ -6152,7 +6155,7 @@ you will probably get an error. Keep in mind that a space is printed
between any two items.
Note that the `print' statement is a statement and not an
-expression--you can't use it in the pattern part of a PATTERN-ACTION
+expression--you can't use it in the pattern part of a pattern-action
statement, for example.
@@ -6300,7 +6303,7 @@ File: gawk.info, Node: OFMT, Next: Printf, Prev: Output Separators, Up: Prin
===========================================
When printing numeric values with the `print' statement, `awk'
-internally converts the number to a string of characters and prints
+internally converts each number to a string of characters and prints
that string. `awk' uses the `sprintf()' function to do this conversion
(*note String Functions::). For now, it suffices to say that the
`sprintf()' function accepts a "format specification" that tells it how
@@ -6355,7 +6358,7 @@ A simple `printf' statement looks like this:
As for `print', the entire list of arguments may optionally be enclosed
in parentheses. Here too, the parentheses are necessary if any of the
-item expressions use the `>' relational operator; otherwise, it can be
+item expressions uses the `>' relational operator; otherwise, it can be
confused with an output redirection (*note Redirection::).
The difference between `printf' and `print' is the FORMAT argument.
@@ -6382,7 +6385,7 @@ statements. For example:
> }'
-| Don't Panic!
-Here, neither the `+' nor the `OUCH!' appear in the output message.
+Here, neither the `+' nor the `OUCH!' appears in the output message.
File: gawk.info, Node: Control Letters, Next: Format Modifiers, Prev: Basic Printf, Up: Printf
@@ -6421,7 +6424,7 @@ width. Here is a list of the format-control letters:
(The `%i' specification is for compatibility with ISO C.)
`%e', `%E'
- Print a number in scientific (exponential) notation; for example:
+ Print a number in scientific (exponential) notation. For example:
printf "%4.3e\n", 1950
@@ -6446,7 +6449,7 @@ width. Here is a list of the format-control letters:
Math Definitions::).
`%F'
- Like `%f' but the infinity and "not a number" values are spelled
+ Like `%f', but the infinity and "not a number" values are spelled
using uppercase letters.
The `%F' format is a POSIX extension to ISO C; not all systems
@@ -6640,7 +6643,7 @@ string, like so:
s = "abcdefg"
printf "%" w "." p "s\n", s
-This is not particularly easy to read but it does work.
+This is not particularly easy to read, but it does work.
C programmers may be used to supplying additional modifiers (`h',
`j', `l', `L', `t', and `z') in `printf' format strings. These are not
@@ -6679,7 +6682,7 @@ an aligned two-column table of names and phone numbers, as shown here:
-| Jean-Paul 555-2127
In this case, the phone numbers had to be printed as strings because
-the numbers are separated by a dash. Printing the phone numbers as
+the numbers are separated by dashes. Printing the phone numbers as
numbers would have produced just the first three digits: `555'. This
would have been pretty confusing.
@@ -6727,7 +6730,7 @@ output, usually the screen. Both `print' and `printf' can also send
their output to other places. This is called "redirection".
NOTE: When `--sandbox' is specified (*note Options::), redirecting
- output to files, pipes and coprocesses is disabled.
+ output to files, pipes, and coprocesses is disabled.
A redirection appears after the `print' or `printf' statement.
Redirections in `awk' are written just like redirections in shell
@@ -6767,7 +6770,7 @@ work identically for `printf':
Each output file contains one name or number per line.
`print ITEMS >> OUTPUT-FILE'
- This redirection prints the items into the pre-existing output file
+ This redirection prints the items into the preexisting output file
named OUTPUT-FILE. The difference between this and the single-`>'
redirection is that the old contents (if any) of OUTPUT-FILE are
not erased. Instead, the `awk' output is appended to the file.
@@ -6815,8 +6818,8 @@ work identically for `printf':
`print ITEMS |& COMMAND'
This redirection prints the items to the input of COMMAND. The
difference between this and the single-`|' redirection is that the
- output from COMMAND can be read with `getline'. Thus COMMAND is a
- "coprocess", which works together with, but subsidiary to, the
+ output from COMMAND can be read with `getline'. Thus, COMMAND is
+ a "coprocess", which works together with but is subsidiary to the
`awk' program.
This feature is a `gawk' extension, and is not available in POSIX
@@ -6840,7 +6843,7 @@ a file, and then to use `>>' for subsequent output:
This is indeed how redirections must be used from the shell. But in
`awk', it isn't necessary. In this kind of case, a program should use
`>' for all the `print' statements, because the output file is only
-opened once. (It happens that if you mix `>' and `>>' that output is
+opened once. (It happens that if you mix `>' and `>>' output is
produced in the expected order. However, mixing the operators for the
same file is definitely poor style, and is confusing to readers of your
program.)
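A one-line sketch of the point (`names.txt' is a placeholder output file;
`mail-list' is the sample data file used throughout):

     awk '{ print $1 > "names.txt" }' mail-list

Even though the `>' redirection appears on every record, `names.txt' is
opened (and truncated) only once, and every first field ends up in it.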
@@ -6873,14 +6876,14 @@ command lines to be fed to the shell.
File: gawk.info, Node: Special FD, Next: Special Files, Prev: Redirection, Up: Printing
-5.7 Special Files for Standard Pre-Opened Data Streams
-======================================================
+5.7 Special Files for Standard Preopened Data Streams
+=====================================================
Running programs conventionally have three input and output streams
already available to them for reading and writing. These are known as
the "standard input", "standard output", and "standard error output".
-These open streams (and any other open file or pipe) are often referred
-to by the technical term "file descriptors".
+These open streams (and any other open files or pipes) are often
+referred to by the technical term "file descriptors".
These streams are, by default, connected to your keyboard and
screen, but they are often redirected with the shell, via the `<', `<<',
@@ -6905,7 +6908,7 @@ error messages to the screen, like this:
(`/dev/tty' is a special file supplied by the operating system that is
connected to your keyboard and screen. It represents the "terminal,"(1)
which on modern systems is a keyboard and screen, not a serial console.)
-This generally has the same effect but not always: although the
+This generally has the same effect, but not always: although the
standard error stream is usually the screen, it can be redirected; when
that happens, writing to the screen is not correct. In fact, if `awk'
is run from a background job, it may not have a terminal at all. Then
@@ -6932,7 +6935,7 @@ becomes:
print "Serious error detected!" > "/dev/stderr"
- Note the use of quotes around the file name. Like any other
+ Note the use of quotes around the file name. Like with any other
redirection, the value must be a string. It is a common error to omit
the quotes, which leads to confusing results.
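For example (the second line shows the common mistake just described):

     print "Serious error detected!" > "/dev/stderr"   # file name given as a string
     print "Serious error detected!" > /dev/stderr     # common mistake: this is
                                                       #   an expression, not a
                                                       #   quoted file name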
@@ -6965,7 +6968,7 @@ there are special file names reserved for TCP/IP networking.
File: gawk.info, Node: Other Inherited Files, Next: Special Network, Up: Special Files
-5.8.1 Accessing Other Open Files With `gawk'
+5.8.1 Accessing Other Open Files with `gawk'
--------------------------------------------
Besides the `/dev/stdin', `/dev/stdout', and `/dev/stderr' special file
@@ -7015,7 +7018,7 @@ File: gawk.info, Node: Special Caveats, Prev: Special Network, Up: Special Fi
Here are some things to bear in mind when using the special file names
that `gawk' provides:
- * Recognition of the file names for the three standard pre-opened
+ * Recognition of the file names for the three standard preopened
files is disabled only in POSIX mode.
* Recognition of the other special file names is disabled if `gawk'
@@ -7024,7 +7027,7 @@ that `gawk' provides:
* `gawk' _always_ interprets these special file names. For example,
using `/dev/fd/4' for output actually writes on file descriptor 4,
- and not on a new file descriptor that is `dup()''ed from file
+ and not on a new file descriptor that is `dup()'ed from file
descriptor 4. Most of the time this does not matter; however, it
is important to _not_ close any of the files related to file
descriptors 0, 1, and 2. Doing so results in unpredictable
@@ -7184,8 +7187,8 @@ closing input or output files, respectively. This value is zero if the
close succeeds, or -1 if it fails.
The POSIX standard is very vague; it says that `close()' returns
-zero on success and nonzero otherwise. In general, different
-implementations vary in what they report when closing pipes; thus the
+zero on success and a nonzero value otherwise. In general, different
+implementations vary in what they report when closing pipes; thus, the
return value cannot be used portably. (d.c.) In POSIX mode (*note
Options::), `gawk' just returns zero when closing a pipe.
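A short sketch of checking the return value (the output path is a
placeholder):

     gawk 'BEGIN {
         cmd = "cat > /tmp/close-demo.txt"
         print "hello, world" | cmd
         if ((ret = close(cmd)) != 0)
             print "close of", cmd, "failed, return value", ret > "/dev/stderr"
     }'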
@@ -7211,8 +7214,8 @@ File: gawk.info, Node: Output Summary, Next: Output Exercises, Prev: Close Fi
numeric values for the `print' statement.
* The `printf' statement provides finer-grained control over output,
- with format control letters for different data types and various
- flags that modify the behavior of the format control letters.
+ with format-control letters for different data types and various
+ flags that modify the behavior of the format-control letters.
* Output from both `print' and `printf' may be redirected to files,
pipes, and coprocesses.
@@ -28318,7 +28321,7 @@ Unix `awk'
To get `awka', go to `http://sourceforge.net/projects/awka'.
The project seems to be frozen; no new code changes have been made
- since approximately 2003.
+ since approximately 2001.
`pawk'
Nelson H.F. Beebe at the University of Utah has modified BWK `awk'
@@ -28558,7 +28561,7 @@ possible to include them:
document describes how GNU software should be written. If you
haven't read it, please do so, preferably _before_ starting to
modify `gawk'. (The `GNU Coding Standards' are available from the
- GNU Project's website (http://www.gnu.org/prep/standards_toc.html).
+ GNU Project's website (http://www.gnu.org/prep/standards/).
Texinfo, Info, and DVI versions are also available.)
5. Use the `gawk' coding style. The C code for `gawk' follows the
@@ -31263,7 +31266,7 @@ Index
* ! (exclamation point), !~ operator <5>: Case-sensitivity. (line 26)
* ! (exclamation point), !~ operator <6>: Computed Regexps. (line 6)
* ! (exclamation point), !~ operator: Regexp Usage. (line 19)
-* " (double quote), in regexp constants: Computed Regexps. (line 29)
+* " (double quote), in regexp constants: Computed Regexps. (line 30)
* " (double quote), in shell commands: Quoting. (line 54)
* # (number sign), #! (executable scripts): Executable Scripts.
(line 6)
@@ -31498,7 +31501,7 @@ Index
* \ (backslash), in escape sequences: Escape Sequences. (line 6)
* \ (backslash), in escape sequences, POSIX and: Escape Sequences.
(line 105)
-* \ (backslash), in regexp constants: Computed Regexps. (line 29)
+* \ (backslash), in regexp constants: Computed Regexps. (line 30)
* \ (backslash), in shell commands: Quoting. (line 48)
* \ (backslash), regexp operator: Regexp Operators. (line 18)
* ^ (caret), ^ operator: Precedence. (line 49)
@@ -31767,7 +31770,7 @@ Index
* backslash (\), in escape sequences: Escape Sequences. (line 6)
* backslash (\), in escape sequences, POSIX and: Escape Sequences.
(line 105)
-* backslash (\), in regexp constants: Computed Regexps. (line 29)
+* backslash (\), in regexp constants: Computed Regexps. (line 30)
* backslash (\), in shell commands: Quoting. (line 48)
* backslash (\), regexp operator: Regexp Operators. (line 18)
* backtrace debugger command: Execution Stack. (line 13)
@@ -32364,7 +32367,7 @@ Index
* dollar sign ($), incrementing fields and arrays: Increment Ops.
(line 30)
* dollar sign ($), regexp operator: Regexp Operators. (line 35)
-* double quote ("), in regexp constants: Computed Regexps. (line 29)
+* double quote ("), in regexp constants: Computed Regexps. (line 30)
* double quote ("), in shell commands: Quoting. (line 54)
* down debugger command: Execution Stack. (line 23)
* Drepper, Ulrich: Acknowledgments. (line 52)
@@ -32750,7 +32753,7 @@ Index
* gawk, awk and: Preface. (line 21)
* gawk, bitwise operations in: Bitwise Functions. (line 40)
* gawk, break statement in: Break Statement. (line 51)
-* gawk, character classes and: Bracket Expressions. (line 100)
+* gawk, character classes and: Bracket Expressions. (line 101)
* gawk, coding style in: Adding Code. (line 38)
* gawk, command-line options, and regular expressions: GNU Regexp Operators.
(line 70)
@@ -33027,7 +33030,7 @@ Index
(line 13)
* internationalization, localization: User-modified. (line 151)
* internationalization, localization, character classes: Bracket Expressions.
- (line 100)
+ (line 101)
* internationalization, localization, gawk and: Internationalization.
(line 13)
* internationalization, localization, locale categories: Explaining gettext.
@@ -33245,8 +33248,8 @@ Index
* newlines, as field separators: Default Field Splitting.
(line 6)
* newlines, as record separators: awk split records. (line 12)
-* newlines, in dynamic regexps: Computed Regexps. (line 59)
-* newlines, in regexp constants: Computed Regexps. (line 69)
+* newlines, in dynamic regexps: Computed Regexps. (line 60)
+* newlines, in regexp constants: Computed Regexps. (line 70)
* newlines, printing: Print Examples. (line 12)
* newlines, separating statements in actions <1>: Statements. (line 10)
* newlines, separating statements in actions: Action Overview.
@@ -33672,8 +33675,8 @@ Index
* regexp constants, as patterns: Expression Patterns. (line 34)
* regexp constants, in gawk: Using Constant Regexps.
(line 28)
-* regexp constants, slashes vs. quotes: Computed Regexps. (line 29)
-* regexp constants, vs. string constants: Computed Regexps. (line 39)
+* regexp constants, slashes vs. quotes: Computed Regexps. (line 30)
+* regexp constants, vs. string constants: Computed Regexps. (line 40)
* register extension: Registration Functions.
(line 6)
* regular expressions: Regexp. (line 6)
@@ -33692,7 +33695,7 @@ Index
(line 57)
* regular expressions, dynamic: Computed Regexps. (line 6)
* regular expressions, dynamic, with embedded newlines: Computed Regexps.
- (line 59)
+ (line 60)
* regular expressions, gawk, command-line options: GNU Regexp Operators.
(line 70)
* regular expressions, interval expressions and: Options. (line 281)
@@ -33889,7 +33892,7 @@ Index
* sidebar, Understanding #!: Executable Scripts. (line 31)
* sidebar, Understanding $0: Changing Fields. (line 134)
* sidebar, Using \n in Bracket Expressions of Dynamic Regexps: Computed
Regexps.
- (line 57)
+ (line 58)
* sidebar, Using close()'s Return Value: Close Files And Pipes.
(line 131)
* SIGHUP signal, for dynamic profiling: Profiling. (line 211)
@@ -33983,7 +33986,7 @@ Index
* stream editors: Full Line Fields. (line 22)
* strftime: Time Functions. (line 48)
* string constants: Scalar Constants. (line 15)
-* string constants, vs. regexp constants: Computed Regexps. (line 39)
+* string constants, vs. regexp constants: Computed Regexps. (line 40)
* string extraction (internationalization): String Extraction.
(line 6)
* string length: String Functions. (line 171)
@@ -34118,7 +34121,7 @@ Index
* troubleshooting, quotes with file names: Special FD. (line 62)
* troubleshooting, readable data files: File Checking. (line 6)
* troubleshooting, regexp constants vs. string constants: Computed Regexps.
- (line 39)
+ (line 40)
* troubleshooting, string concatenation: Concatenation. (line 26)
* troubleshooting, substr() function: String Functions. (line 499)
* troubleshooting, system() function: I/O Functions. (line 128)
@@ -34364,495 +34367,495 @@ Ref: Regexp Operators-Footnote-1170485
Ref: Regexp Operators-Footnote-2170632
Node: Bracket Expressions170730
Ref: table-char-classes172745
-Node: Leftmost Longest175670
-Node: Computed Regexps176972
-Node: GNU Regexp Operators180369
-Node: Case-sensitivity184042
-Ref: Case-sensitivity-Footnote-1186927
-Ref: Case-sensitivity-Footnote-2187162
-Node: Regexp Summary187270
-Node: Reading Files188737
-Node: Records190831
-Node: awk split records191564
-Node: gawk split records196479
-Ref: gawk split records-Footnote-1201023
-Node: Fields201060
-Ref: Fields-Footnote-1203836
-Node: Nonconstant Fields203922
-Ref: Nonconstant Fields-Footnote-1206165
-Node: Changing Fields206369
-Node: Field Separators212298
-Node: Default Field Splitting215003
-Node: Regexp Field Splitting216120
-Node: Single Character Fields219470
-Node: Command Line Field Separator220529
-Node: Full Line Fields223741
-Ref: Full Line Fields-Footnote-1225258
-Ref: Full Line Fields-Footnote-2225304
-Node: Field Splitting Summary225405
-Node: Constant Size227479
-Node: Splitting By Content232068
-Ref: Splitting By Content-Footnote-1236062
-Node: Multiple Line236225
-Ref: Multiple Line-Footnote-1242111
-Node: Getline242290
-Node: Plain Getline244502
-Node: Getline/Variable247142
-Node: Getline/File248290
-Node: Getline/Variable/File249674
-Ref: Getline/Variable/File-Footnote-1251277
-Node: Getline/Pipe251364
-Node: Getline/Variable/Pipe254047
-Node: Getline/Coprocess255178
-Node: Getline/Variable/Coprocess256430
-Node: Getline Notes257169
-Node: Getline Summary259961
-Ref: table-getline-variants260373
-Node: Read Timeout261202
-Ref: Read Timeout-Footnote-1265026
-Node: Command-line directories265084
-Node: Input Summary265989
-Node: Input Exercises269290
-Node: Printing270018
-Node: Print271795
-Node: Print Examples273252
-Node: Output Separators276031
-Node: OFMT278049
-Node: Printf279403
-Node: Basic Printf280188
-Node: Control Letters281758
-Node: Format Modifiers285741
-Node: Printf Examples291750
-Node: Redirection294236
-Node: Special FD301077
-Ref: Special FD-Footnote-1304237
-Node: Special Files304311
-Node: Other Inherited Files304928
-Node: Special Network305928
-Node: Special Caveats306790
-Node: Close Files And Pipes307741
-Ref: Close Files And Pipes-Footnote-1314923
-Ref: Close Files And Pipes-Footnote-2315071
-Node: Output Summary315221
-Node: Output Exercises316219
-Node: Expressions316899
-Node: Values318084
-Node: Constants318762
-Node: Scalar Constants319453
-Ref: Scalar Constants-Footnote-1320312
-Node: Nondecimal-numbers320562
-Node: Regexp Constants323580
-Node: Using Constant Regexps324105
-Node: Variables327248
-Node: Using Variables327903
-Node: Assignment Options329814
-Node: Conversion331689
-Node: Strings And Numbers332213
-Ref: Strings And Numbers-Footnote-1335278
-Node: Locale influences conversions335387
-Ref: table-locale-affects338134
-Node: All Operators338722
-Node: Arithmetic Ops339352
-Node: Concatenation341857
-Ref: Concatenation-Footnote-1344676
-Node: Assignment Ops344782
-Ref: table-assign-ops349761
-Node: Increment Ops351033
-Node: Truth Values and Conditions354471
-Node: Truth Values355556
-Node: Typing and Comparison356605
-Node: Variable Typing357415
-Node: Comparison Operators361068
-Ref: table-relational-ops361478
-Node: POSIX String Comparison364973
-Ref: POSIX String Comparison-Footnote-1366045
-Node: Boolean Ops366183
-Ref: Boolean Ops-Footnote-1370662
-Node: Conditional Exp370753
-Node: Function Calls372480
-Node: Precedence376360
-Node: Locales380021
-Node: Expressions Summary381653
-Node: Patterns and Actions384213
-Node: Pattern Overview385333
-Node: Regexp Patterns387012
-Node: Expression Patterns387555
-Node: Ranges391265
-Node: BEGIN/END394371
-Node: Using BEGIN/END395132
-Ref: Using BEGIN/END-Footnote-1397866
-Node: I/O And BEGIN/END397972
-Node: BEGINFILE/ENDFILE400286
-Node: Empty403187
-Node: Using Shell Variables403504
-Node: Action Overview405777
-Node: Statements408103
-Node: If Statement409951
-Node: While Statement411446
-Node: Do Statement413475
-Node: For Statement414619
-Node: Switch Statement417776
-Node: Break Statement420158
-Node: Continue Statement422199
-Node: Next Statement424026
-Node: Nextfile Statement426407
-Node: Exit Statement429037
-Node: Built-in Variables431440
-Node: User-modified432573
-Ref: User-modified-Footnote-1440254
-Node: Auto-set440316
-Ref: Auto-set-Footnote-1453351
-Ref: Auto-set-Footnote-2453556
-Node: ARGC and ARGV453612
-Node: Pattern Action Summary457830
-Node: Arrays460257
-Node: Array Basics461586
-Node: Array Intro462430
-Ref: figure-array-elements464394
-Ref: Array Intro-Footnote-1466920
-Node: Reference to Elements467048
-Node: Assigning Elements469500
-Node: Array Example469991
-Node: Scanning an Array471749
-Node: Controlling Scanning474765
-Ref: Controlling Scanning-Footnote-1479961
-Node: Numeric Array Subscripts480277
-Node: Uninitialized Subscripts482462
-Node: Delete484079
-Ref: Delete-Footnote-1486822
-Node: Multidimensional486879
-Node: Multiscanning489976
-Node: Arrays of Arrays491565
-Node: Arrays Summary496324
-Node: Functions498416
-Node: Built-in499315
-Node: Calling Built-in500393
-Node: Numeric Functions502384
-Ref: Numeric Functions-Footnote-1506401
-Ref: Numeric Functions-Footnote-2506758
-Ref: Numeric Functions-Footnote-3506806
-Node: String Functions507078
-Ref: String Functions-Footnote-1530553
-Ref: String Functions-Footnote-2530682
-Ref: String Functions-Footnote-3530930
-Node: Gory Details531017
-Ref: table-sub-escapes532798
-Ref: table-sub-proposed534318
-Ref: table-posix-sub535682
-Ref: table-gensub-escapes537218
-Ref: Gory Details-Footnote-1538050
-Node: I/O Functions538201
-Ref: I/O Functions-Footnote-1545419
-Node: Time Functions545566
-Ref: Time Functions-Footnote-1556054
-Ref: Time Functions-Footnote-2556122
-Ref: Time Functions-Footnote-3556280
-Ref: Time Functions-Footnote-4556391
-Ref: Time Functions-Footnote-5556503
-Ref: Time Functions-Footnote-6556730
-Node: Bitwise Functions556996
-Ref: table-bitwise-ops557558
-Ref: Bitwise Functions-Footnote-1561867
-Node: Type Functions562036
-Node: I18N Functions563187
-Node: User-defined564832
-Node: Definition Syntax565637
-Ref: Definition Syntax-Footnote-1571044
-Node: Function Example571115
-Ref: Function Example-Footnote-1574034
-Node: Function Caveats574056
-Node: Calling A Function574574
-Node: Variable Scope575532
-Node: Pass By Value/Reference578520
-Node: Return Statement582015
-Node: Dynamic Typing584996
-Node: Indirect Calls585925
-Ref: Indirect Calls-Footnote-1597227
-Node: Functions Summary597355
-Node: Library Functions600057
-Ref: Library Functions-Footnote-1603666
-Ref: Library Functions-Footnote-2603809
-Node: Library Names603980
-Ref: Library Names-Footnote-1607434
-Ref: Library Names-Footnote-2607657
-Node: General Functions607743
-Node: Strtonum Function608846
-Node: Assert Function611868
-Node: Round Function615192
-Node: Cliff Random Function616733
-Node: Ordinal Functions617749
-Ref: Ordinal Functions-Footnote-1620812
-Ref: Ordinal Functions-Footnote-2621064
-Node: Join Function621275
-Ref: Join Function-Footnote-1623044
-Node: Getlocaltime Function623244
-Node: Readfile Function626988
-Node: Shell Quoting628958
-Node: Data File Management630359
-Node: Filetrans Function630991
-Node: Rewind Function635047
-Node: File Checking636434
-Ref: File Checking-Footnote-1637766
-Node: Empty Files637967
-Node: Ignoring Assigns639946
-Node: Getopt Function641497
-Ref: Getopt Function-Footnote-1652959
-Node: Passwd Functions653159
-Ref: Passwd Functions-Footnote-1661996
-Node: Group Functions662084
-Ref: Group Functions-Footnote-1669978
-Node: Walking Arrays670191
-Node: Library Functions Summary671794
-Node: Library Exercises673195
-Node: Sample Programs674475
-Node: Running Examples675245
-Node: Clones675973
-Node: Cut Program677197
-Node: Egrep Program686916
-Ref: Egrep Program-Footnote-1694414
-Node: Id Program694524
-Node: Split Program698169
-Ref: Split Program-Footnote-1701617
-Node: Tee Program701745
-Node: Uniq Program704534
-Node: Wc Program711953
-Ref: Wc Program-Footnote-1716203
-Node: Miscellaneous Programs716297
-Node: Dupword Program717510
-Node: Alarm Program719541
-Node: Translate Program724345
-Ref: Translate Program-Footnote-1728910
-Node: Labels Program729180
-Ref: Labels Program-Footnote-1732531
-Node: Word Sorting732615
-Node: History Sorting736686
-Node: Extract Program738522
-Node: Simple Sed746047
-Node: Igawk Program749115
-Ref: Igawk Program-Footnote-1763439
-Ref: Igawk Program-Footnote-2763640
-Ref: Igawk Program-Footnote-3763762
-Node: Anagram Program763877
-Node: Signature Program766934
-Node: Programs Summary768181
-Node: Programs Exercises769374
-Ref: Programs Exercises-Footnote-1773505
-Node: Advanced Features773596
-Node: Nondecimal Data775544
-Node: Array Sorting777134
-Node: Controlling Array Traversal777831
-Ref: Controlling Array Traversal-Footnote-1786164
-Node: Array Sorting Functions786282
-Ref: Array Sorting Functions-Footnote-1790171
-Node: Two-way I/O790367
-Ref: Two-way I/O-Footnote-1795312
-Ref: Two-way I/O-Footnote-2795498
-Node: TCP/IP Networking795580
-Node: Profiling798453
-Node: Advanced Features Summary806000
-Node: Internationalization807933
-Node: I18N and L10N809413
-Node: Explaining gettext810099
-Ref: Explaining gettext-Footnote-1815124
-Ref: Explaining gettext-Footnote-2815308
-Node: Programmer i18n815473
-Ref: Programmer i18n-Footnote-1820339
-Node: Translator i18n820388
-Node: String Extraction821182
-Ref: String Extraction-Footnote-1822313
-Node: Printf Ordering822399
-Ref: Printf Ordering-Footnote-1825185
-Node: I18N Portability825249
-Ref: I18N Portability-Footnote-1827704
-Node: I18N Example827767
-Ref: I18N Example-Footnote-1830570
-Node: Gawk I18N830642
-Node: I18N Summary831280
-Node: Debugger832619
-Node: Debugging833641
-Node: Debugging Concepts834082
-Node: Debugging Terms835935
-Node: Awk Debugging838507
-Node: Sample Debugging Session839401
-Node: Debugger Invocation839921
-Node: Finding The Bug841305
-Node: List of Debugger Commands847780
-Node: Breakpoint Control849113
-Node: Debugger Execution Control852809
-Node: Viewing And Changing Data856173
-Node: Execution Stack859551
-Node: Debugger Info861188
-Node: Miscellaneous Debugger Commands865205
-Node: Readline Support870234
-Node: Limitations871126
-Node: Debugging Summary873240
-Node: Arbitrary Precision Arithmetic874408
-Node: Computer Arithmetic875824
-Ref: table-numeric-ranges879422
-Ref: Computer Arithmetic-Footnote-1880281
-Node: Math Definitions880338
-Ref: table-ieee-formats883626
-Ref: Math Definitions-Footnote-1884230
-Node: MPFR features884335
-Node: FP Math Caution886006
-Ref: FP Math Caution-Footnote-1887056
-Node: Inexactness of computations887425
-Node: Inexact representation888384
-Node: Comparing FP Values889741
-Node: Errors accumulate890823
-Node: Getting Accuracy892256
-Node: Try To Round894918
-Node: Setting precision895817
-Ref: table-predefined-precision-strings896501
-Node: Setting the rounding mode898290
-Ref: table-gawk-rounding-modes898654
-Ref: Setting the rounding mode-Footnote-1902109
-Node: Arbitrary Precision Integers902288
-Ref: Arbitrary Precision Integers-Footnote-1905274
-Node: POSIX Floating Point Problems905423
-Ref: POSIX Floating Point Problems-Footnote-1909296
-Node: Floating point summary909334
-Node: Dynamic Extensions911528
-Node: Extension Intro913080
-Node: Plugin License914346
-Node: Extension Mechanism Outline915143
-Ref: figure-load-extension915571
-Ref: figure-register-new-function917051
-Ref: figure-call-new-function918055
-Node: Extension API Description920041
-Node: Extension API Functions Introduction921491
-Node: General Data Types926315
-Ref: General Data Types-Footnote-1932054
-Node: Memory Allocation Functions932353
-Ref: Memory Allocation Functions-Footnote-1935192
-Node: Constructor Functions935288
-Node: Registration Functions937022
-Node: Extension Functions937707
-Node: Exit Callback Functions940004
-Node: Extension Version String941252
-Node: Input Parsers941917
-Node: Output Wrappers951796
-Node: Two-way processors956311
-Node: Printing Messages958515
-Ref: Printing Messages-Footnote-1959591
-Node: Updating `ERRNO'959743
-Node: Requesting Values960483
-Ref: table-value-types-returned961211
-Node: Accessing Parameters962168
-Node: Symbol Table Access963399
-Node: Symbol table by name963913
-Node: Symbol table by cookie965894
-Ref: Symbol table by cookie-Footnote-1970038
-Node: Cached values970101
-Ref: Cached values-Footnote-1973600
-Node: Array Manipulation973691
-Ref: Array Manipulation-Footnote-1974789
-Node: Array Data Types974826
-Ref: Array Data Types-Footnote-1977481
-Node: Array Functions977573
-Node: Flattening Arrays981427
-Node: Creating Arrays988319
-Node: Extension API Variables993090
-Node: Extension Versioning993726
-Node: Extension API Informational Variables995627
-Node: Extension API Boilerplate996692
-Node: Finding Extensions1000501
-Node: Extension Example1001061
-Node: Internal File Description1001833
-Node: Internal File Ops1005900
-Ref: Internal File Ops-Footnote-11017570
-Node: Using Internal File Ops1017710
-Ref: Using Internal File Ops-Footnote-11020093
-Node: Extension Samples1020366
-Node: Extension Sample File Functions1021892
-Node: Extension Sample Fnmatch1029530
-Node: Extension Sample Fork1031021
-Node: Extension Sample Inplace1032236
-Node: Extension Sample Ord1033911
-Node: Extension Sample Readdir1034747
-Ref: table-readdir-file-types1035623
-Node: Extension Sample Revout1036434
-Node: Extension Sample Rev2way1037024
-Node: Extension Sample Read write array1037764
-Node: Extension Sample Readfile1039704
-Node: Extension Sample Time1040799
-Node: Extension Sample API Tests1042148
-Node: gawkextlib1042639
-Node: Extension summary1045297
-Node: Extension Exercises1048986
-Node: Language History1049708
-Node: V7/SVR3.11051364
-Node: SVR41053545
-Node: POSIX1054990
-Node: BTL1056379
-Node: POSIX/GNU1057113
-Node: Feature History1062677
-Node: Common Extensions1075775
-Node: Ranges and Locales1077099
-Ref: Ranges and Locales-Footnote-11081717
-Ref: Ranges and Locales-Footnote-21081744
-Ref: Ranges and Locales-Footnote-31081978
-Node: Contributors1082199
-Node: History summary1087740
-Node: Installation1089110
-Node: Gawk Distribution1090056
-Node: Getting1090540
-Node: Extracting1091363
-Node: Distribution contents1092998
-Node: Unix Installation1098715
-Node: Quick Installation1099332
-Node: Additional Configuration Options1101756
-Node: Configuration Philosophy1103494
-Node: Non-Unix Installation1105863
-Node: PC Installation1106321
-Node: PC Binary Installation1107640
-Node: PC Compiling1109488
-Ref: PC Compiling-Footnote-11112509
-Node: PC Testing1112618
-Node: PC Using1113794
-Node: Cygwin1117909
-Node: MSYS1118732
-Node: VMS Installation1119232
-Node: VMS Compilation1120024
-Ref: VMS Compilation-Footnote-11121246
-Node: VMS Dynamic Extensions1121304
-Node: VMS Installation Details1122988
-Node: VMS Running1125240
-Node: VMS GNV1128076
-Node: VMS Old Gawk1128810
-Node: Bugs1129280
-Node: Other Versions1133163
-Node: Installation summary1139587
-Node: Notes1140643
-Node: Compatibility Mode1141508
-Node: Additions1142290
-Node: Accessing The Source1143215
-Node: Adding Code1144650
-Node: New Ports1150815
-Node: Derived Files1155297
-Ref: Derived Files-Footnote-11160772
-Ref: Derived Files-Footnote-21160806
-Ref: Derived Files-Footnote-31161402
-Node: Future Extensions1161516
-Node: Implementation Limitations1162122
-Node: Extension Design1163370
-Node: Old Extension Problems1164524
-Ref: Old Extension Problems-Footnote-11166041
-Node: Extension New Mechanism Goals1166098
-Ref: Extension New Mechanism Goals-Footnote-11169458
-Node: Extension Other Design Decisions1169647
-Node: Extension Future Growth1171755
-Node: Old Extension Mechanism1172591
-Node: Notes summary1174353
-Node: Basic Concepts1175539
-Node: Basic High Level1176220
-Ref: figure-general-flow1176492
-Ref: figure-process-flow1177091
-Ref: Basic High Level-Footnote-11180320
-Node: Basic Data Typing1180505
-Node: Glossary1183833
-Node: Copying1208991
-Node: GNU Free Documentation License1246547
-Node: Index1271683
+Node: Leftmost Longest175687
+Node: Computed Regexps176989
+Node: GNU Regexp Operators180418
+Node: Case-sensitivity184090
+Ref: Case-sensitivity-Footnote-1186975
+Ref: Case-sensitivity-Footnote-2187210
+Node: Regexp Summary187318
+Node: Reading Files188785
+Node: Records190878
+Node: awk split records191611
+Node: gawk split records196540
+Ref: gawk split records-Footnote-1201079
+Node: Fields201116
+Ref: Fields-Footnote-1203894
+Node: Nonconstant Fields203980
+Ref: Nonconstant Fields-Footnote-1206218
+Node: Changing Fields206421
+Node: Field Separators212352
+Node: Default Field Splitting215056
+Node: Regexp Field Splitting216173
+Node: Single Character Fields219523
+Node: Command Line Field Separator220582
+Node: Full Line Fields223799
+Ref: Full Line Fields-Footnote-1225320
+Ref: Full Line Fields-Footnote-2225366
+Node: Field Splitting Summary225467
+Node: Constant Size227541
+Node: Splitting By Content232124
+Ref: Splitting By Content-Footnote-1236089
+Node: Multiple Line236252
+Ref: Multiple Line-Footnote-1242133
+Node: Getline242312
+Node: Plain Getline244519
+Node: Getline/Variable247159
+Node: Getline/File248308
+Node: Getline/Variable/File249693
+Ref: Getline/Variable/File-Footnote-1251296
+Node: Getline/Pipe251383
+Node: Getline/Variable/Pipe254061
+Node: Getline/Coprocess255192
+Node: Getline/Variable/Coprocess256456
+Node: Getline Notes257195
+Node: Getline Summary259989
+Ref: table-getline-variants260401
+Node: Read Timeout261230
+Ref: Read Timeout-Footnote-1265067
+Node: Command-line directories265125
+Node: Input Summary266030
+Node: Input Exercises269415
+Node: Printing270143
+Node: Print271920
+Node: Print Examples273377
+Node: Output Separators276156
+Node: OFMT278174
+Node: Printf279529
+Node: Basic Printf280314
+Node: Control Letters281886
+Node: Format Modifiers285871
+Node: Printf Examples291881
+Node: Redirection294367
+Node: Special FD301205
+Ref: Special FD-Footnote-1304371
+Node: Special Files304445
+Node: Other Inherited Files305062
+Node: Special Network306062
+Node: Special Caveats306924
+Node: Close Files And Pipes307873
+Ref: Close Files And Pipes-Footnote-1315064
+Ref: Close Files And Pipes-Footnote-2315212
+Node: Output Summary315362
+Node: Output Exercises316360
+Node: Expressions317040
+Node: Values318225
+Node: Constants318903
+Node: Scalar Constants319594
+Ref: Scalar Constants-Footnote-1320453
+Node: Nondecimal-numbers320703
+Node: Regexp Constants323721
+Node: Using Constant Regexps324246
+Node: Variables327389
+Node: Using Variables328044
+Node: Assignment Options329955
+Node: Conversion331830
+Node: Strings And Numbers332354
+Ref: Strings And Numbers-Footnote-1335419
+Node: Locale influences conversions335528
+Ref: table-locale-affects338275
+Node: All Operators338863
+Node: Arithmetic Ops339493
+Node: Concatenation341998
+Ref: Concatenation-Footnote-1344817
+Node: Assignment Ops344923
+Ref: table-assign-ops349902
+Node: Increment Ops351174
+Node: Truth Values and Conditions354612
+Node: Truth Values355697
+Node: Typing and Comparison356746
+Node: Variable Typing357556
+Node: Comparison Operators361209
+Ref: table-relational-ops361619
+Node: POSIX String Comparison365114
+Ref: POSIX String Comparison-Footnote-1366186
+Node: Boolean Ops366324
+Ref: Boolean Ops-Footnote-1370803
+Node: Conditional Exp370894
+Node: Function Calls372621
+Node: Precedence376501
+Node: Locales380162
+Node: Expressions Summary381794
+Node: Patterns and Actions384354
+Node: Pattern Overview385474
+Node: Regexp Patterns387153
+Node: Expression Patterns387696
+Node: Ranges391406
+Node: BEGIN/END394512
+Node: Using BEGIN/END395273
+Ref: Using BEGIN/END-Footnote-1398007
+Node: I/O And BEGIN/END398113
+Node: BEGINFILE/ENDFILE400427
+Node: Empty403328
+Node: Using Shell Variables403645
+Node: Action Overview405918
+Node: Statements408244
+Node: If Statement410092
+Node: While Statement411587
+Node: Do Statement413616
+Node: For Statement414760
+Node: Switch Statement417917
+Node: Break Statement420299
+Node: Continue Statement422340
+Node: Next Statement424167
+Node: Nextfile Statement426548
+Node: Exit Statement429178
+Node: Built-in Variables431581
+Node: User-modified432714
+Ref: User-modified-Footnote-1440395
+Node: Auto-set440457
+Ref: Auto-set-Footnote-1453492
+Ref: Auto-set-Footnote-2453697
+Node: ARGC and ARGV453753
+Node: Pattern Action Summary457971
+Node: Arrays460398
+Node: Array Basics461727
+Node: Array Intro462571
+Ref: figure-array-elements464535
+Ref: Array Intro-Footnote-1467061
+Node: Reference to Elements467189
+Node: Assigning Elements469641
+Node: Array Example470132
+Node: Scanning an Array471890
+Node: Controlling Scanning474906
+Ref: Controlling Scanning-Footnote-1480102
+Node: Numeric Array Subscripts480418
+Node: Uninitialized Subscripts482603
+Node: Delete484220
+Ref: Delete-Footnote-1486963
+Node: Multidimensional487020
+Node: Multiscanning490117
+Node: Arrays of Arrays491706
+Node: Arrays Summary496465
+Node: Functions498557
+Node: Built-in499456
+Node: Calling Built-in500534
+Node: Numeric Functions502525
+Ref: Numeric Functions-Footnote-1506542
+Ref: Numeric Functions-Footnote-2506899
+Ref: Numeric Functions-Footnote-3506947
+Node: String Functions507219
+Ref: String Functions-Footnote-1530694
+Ref: String Functions-Footnote-2530823
+Ref: String Functions-Footnote-3531071
+Node: Gory Details531158
+Ref: table-sub-escapes532939
+Ref: table-sub-proposed534459
+Ref: table-posix-sub535823
+Ref: table-gensub-escapes537359
+Ref: Gory Details-Footnote-1538191
+Node: I/O Functions538342
+Ref: I/O Functions-Footnote-1545560
+Node: Time Functions545707
+Ref: Time Functions-Footnote-1556195
+Ref: Time Functions-Footnote-2556263
+Ref: Time Functions-Footnote-3556421
+Ref: Time Functions-Footnote-4556532
+Ref: Time Functions-Footnote-5556644
+Ref: Time Functions-Footnote-6556871
+Node: Bitwise Functions557137
+Ref: table-bitwise-ops557699
+Ref: Bitwise Functions-Footnote-1562008
+Node: Type Functions562177
+Node: I18N Functions563328
+Node: User-defined564973
+Node: Definition Syntax565778
+Ref: Definition Syntax-Footnote-1571185
+Node: Function Example571256
+Ref: Function Example-Footnote-1574175
+Node: Function Caveats574197
+Node: Calling A Function574715
+Node: Variable Scope575673
+Node: Pass By Value/Reference578661
+Node: Return Statement582156
+Node: Dynamic Typing585137
+Node: Indirect Calls586066
+Ref: Indirect Calls-Footnote-1597368
+Node: Functions Summary597496
+Node: Library Functions600198
+Ref: Library Functions-Footnote-1603807
+Ref: Library Functions-Footnote-2603950
+Node: Library Names604121
+Ref: Library Names-Footnote-1607575
+Ref: Library Names-Footnote-2607798
+Node: General Functions607884
+Node: Strtonum Function608987
+Node: Assert Function612009
+Node: Round Function615333
+Node: Cliff Random Function616874
+Node: Ordinal Functions617890
+Ref: Ordinal Functions-Footnote-1620953
+Ref: Ordinal Functions-Footnote-2621205
+Node: Join Function621416
+Ref: Join Function-Footnote-1623185
+Node: Getlocaltime Function623385
+Node: Readfile Function627129
+Node: Shell Quoting629099
+Node: Data File Management630500
+Node: Filetrans Function631132
+Node: Rewind Function635188
+Node: File Checking636575
+Ref: File Checking-Footnote-1637907
+Node: Empty Files638108
+Node: Ignoring Assigns640087
+Node: Getopt Function641638
+Ref: Getopt Function-Footnote-1653100
+Node: Passwd Functions653300
+Ref: Passwd Functions-Footnote-1662137
+Node: Group Functions662225
+Ref: Group Functions-Footnote-1670119
+Node: Walking Arrays670332
+Node: Library Functions Summary671935
+Node: Library Exercises673336
+Node: Sample Programs674616
+Node: Running Examples675386
+Node: Clones676114
+Node: Cut Program677338
+Node: Egrep Program687057
+Ref: Egrep Program-Footnote-1694555
+Node: Id Program694665
+Node: Split Program698310
+Ref: Split Program-Footnote-1701758
+Node: Tee Program701886
+Node: Uniq Program704675
+Node: Wc Program712094
+Ref: Wc Program-Footnote-1716344
+Node: Miscellaneous Programs716438
+Node: Dupword Program717651
+Node: Alarm Program719682
+Node: Translate Program724486
+Ref: Translate Program-Footnote-1729051
+Node: Labels Program729321
+Ref: Labels Program-Footnote-1732672
+Node: Word Sorting732756
+Node: History Sorting736827
+Node: Extract Program738663
+Node: Simple Sed746188
+Node: Igawk Program749256
+Ref: Igawk Program-Footnote-1763580
+Ref: Igawk Program-Footnote-2763781
+Ref: Igawk Program-Footnote-3763903
+Node: Anagram Program764018
+Node: Signature Program767075
+Node: Programs Summary768322
+Node: Programs Exercises769515
+Ref: Programs Exercises-Footnote-1773646
+Node: Advanced Features773737
+Node: Nondecimal Data775685
+Node: Array Sorting777275
+Node: Controlling Array Traversal777972
+Ref: Controlling Array Traversal-Footnote-1786305
+Node: Array Sorting Functions786423
+Ref: Array Sorting Functions-Footnote-1790312
+Node: Two-way I/O790508
+Ref: Two-way I/O-Footnote-1795453
+Ref: Two-way I/O-Footnote-2795639
+Node: TCP/IP Networking795721
+Node: Profiling798594
+Node: Advanced Features Summary806141
+Node: Internationalization808074
+Node: I18N and L10N809554
+Node: Explaining gettext810240
+Ref: Explaining gettext-Footnote-1815265
+Ref: Explaining gettext-Footnote-2815449
+Node: Programmer i18n815614
+Ref: Programmer i18n-Footnote-1820480
+Node: Translator i18n820529
+Node: String Extraction821323
+Ref: String Extraction-Footnote-1822454
+Node: Printf Ordering822540
+Ref: Printf Ordering-Footnote-1825326
+Node: I18N Portability825390
+Ref: I18N Portability-Footnote-1827845
+Node: I18N Example827908
+Ref: I18N Example-Footnote-1830711
+Node: Gawk I18N830783
+Node: I18N Summary831421
+Node: Debugger832760
+Node: Debugging833782
+Node: Debugging Concepts834223
+Node: Debugging Terms836076
+Node: Awk Debugging838648
+Node: Sample Debugging Session839542
+Node: Debugger Invocation840062
+Node: Finding The Bug841446
+Node: List of Debugger Commands847921
+Node: Breakpoint Control849254
+Node: Debugger Execution Control852950
+Node: Viewing And Changing Data856314
+Node: Execution Stack859692
+Node: Debugger Info861329
+Node: Miscellaneous Debugger Commands865346
+Node: Readline Support870375
+Node: Limitations871267
+Node: Debugging Summary873381
+Node: Arbitrary Precision Arithmetic874549
+Node: Computer Arithmetic875965
+Ref: table-numeric-ranges879563
+Ref: Computer Arithmetic-Footnote-1880422
+Node: Math Definitions880479
+Ref: table-ieee-formats883767
+Ref: Math Definitions-Footnote-1884371
+Node: MPFR features884476
+Node: FP Math Caution886147
+Ref: FP Math Caution-Footnote-1887197
+Node: Inexactness of computations887566
+Node: Inexact representation888525
+Node: Comparing FP Values889882
+Node: Errors accumulate890964
+Node: Getting Accuracy892397
+Node: Try To Round895059
+Node: Setting precision895958
+Ref: table-predefined-precision-strings896642
+Node: Setting the rounding mode898431
+Ref: table-gawk-rounding-modes898795
+Ref: Setting the rounding mode-Footnote-1902250
+Node: Arbitrary Precision Integers902429
+Ref: Arbitrary Precision Integers-Footnote-1905415
+Node: POSIX Floating Point Problems905564
+Ref: POSIX Floating Point Problems-Footnote-1909437
+Node: Floating point summary909475
+Node: Dynamic Extensions911669
+Node: Extension Intro913221
+Node: Plugin License914487
+Node: Extension Mechanism Outline915284
+Ref: figure-load-extension915712
+Ref: figure-register-new-function917192
+Ref: figure-call-new-function918196
+Node: Extension API Description920182
+Node: Extension API Functions Introduction921632
+Node: General Data Types926456
+Ref: General Data Types-Footnote-1932195
+Node: Memory Allocation Functions932494
+Ref: Memory Allocation Functions-Footnote-1935333
+Node: Constructor Functions935429
+Node: Registration Functions937163
+Node: Extension Functions937848
+Node: Exit Callback Functions940145
+Node: Extension Version String941393
+Node: Input Parsers942058
+Node: Output Wrappers951937
+Node: Two-way processors956452
+Node: Printing Messages958656
+Ref: Printing Messages-Footnote-1959732
+Node: Updating `ERRNO'959884
+Node: Requesting Values960624
+Ref: table-value-types-returned961352
+Node: Accessing Parameters962309
+Node: Symbol Table Access963540
+Node: Symbol table by name964054
+Node: Symbol table by cookie966035
+Ref: Symbol table by cookie-Footnote-1970179
+Node: Cached values970242
+Ref: Cached values-Footnote-1973741
+Node: Array Manipulation973832
+Ref: Array Manipulation-Footnote-1974930
+Node: Array Data Types974967
+Ref: Array Data Types-Footnote-1977622
+Node: Array Functions977714
+Node: Flattening Arrays981568
+Node: Creating Arrays988460
+Node: Extension API Variables993231
+Node: Extension Versioning993867
+Node: Extension API Informational Variables995768
+Node: Extension API Boilerplate996833
+Node: Finding Extensions1000642
+Node: Extension Example1001202
+Node: Internal File Description1001974
+Node: Internal File Ops1006041
+Ref: Internal File Ops-Footnote-11017711
+Node: Using Internal File Ops1017851
+Ref: Using Internal File Ops-Footnote-11020234
+Node: Extension Samples1020507
+Node: Extension Sample File Functions1022033
+Node: Extension Sample Fnmatch1029671
+Node: Extension Sample Fork1031162
+Node: Extension Sample Inplace1032377
+Node: Extension Sample Ord1034052
+Node: Extension Sample Readdir1034888
+Ref: table-readdir-file-types1035764
+Node: Extension Sample Revout1036575
+Node: Extension Sample Rev2way1037165
+Node: Extension Sample Read write array1037905
+Node: Extension Sample Readfile1039845
+Node: Extension Sample Time1040940
+Node: Extension Sample API Tests1042289
+Node: gawkextlib1042780
+Node: Extension summary1045438
+Node: Extension Exercises1049127
+Node: Language History1049849
+Node: V7/SVR3.11051505
+Node: SVR41053686
+Node: POSIX1055131
+Node: BTL1056520
+Node: POSIX/GNU1057254
+Node: Feature History1062818
+Node: Common Extensions1075916
+Node: Ranges and Locales1077240
+Ref: Ranges and Locales-Footnote-11081858
+Ref: Ranges and Locales-Footnote-21081885
+Ref: Ranges and Locales-Footnote-31082119
+Node: Contributors1082340
+Node: History summary1087881
+Node: Installation1089251
+Node: Gawk Distribution1090197
+Node: Getting1090681
+Node: Extracting1091504
+Node: Distribution contents1093139
+Node: Unix Installation1098856
+Node: Quick Installation1099473
+Node: Additional Configuration Options1101897
+Node: Configuration Philosophy1103635
+Node: Non-Unix Installation1106004
+Node: PC Installation1106462
+Node: PC Binary Installation1107781
+Node: PC Compiling1109629
+Ref: PC Compiling-Footnote-11112650
+Node: PC Testing1112759
+Node: PC Using1113935
+Node: Cygwin1118050
+Node: MSYS1118873
+Node: VMS Installation1119373
+Node: VMS Compilation1120165
+Ref: VMS Compilation-Footnote-11121387
+Node: VMS Dynamic Extensions1121445
+Node: VMS Installation Details1123129
+Node: VMS Running1125381
+Node: VMS GNV1128217
+Node: VMS Old Gawk1128951
+Node: Bugs1129421
+Node: Other Versions1133304
+Node: Installation summary1139728
+Node: Notes1140784
+Node: Compatibility Mode1141649
+Node: Additions1142431
+Node: Accessing The Source1143356
+Node: Adding Code1144791
+Node: New Ports1150948
+Node: Derived Files1155430
+Ref: Derived Files-Footnote-11160905
+Ref: Derived Files-Footnote-21160939
+Ref: Derived Files-Footnote-31161535
+Node: Future Extensions1161649
+Node: Implementation Limitations1162255
+Node: Extension Design1163503
+Node: Old Extension Problems1164657
+Ref: Old Extension Problems-Footnote-11166174
+Node: Extension New Mechanism Goals1166231
+Ref: Extension New Mechanism Goals-Footnote-11169591
+Node: Extension Other Design Decisions1169780
+Node: Extension Future Growth1171888
+Node: Old Extension Mechanism1172724
+Node: Notes summary1174486
+Node: Basic Concepts1175672
+Node: Basic High Level1176353
+Ref: figure-general-flow1176625
+Ref: figure-process-flow1177224
+Ref: Basic High Level-Footnote-11180453
+Node: Basic Data Typing1180638
+Node: Glossary1183966
+Node: Copying1209124
+Node: GNU Free Documentation License1246680
+Node: Index1271816
End Tag Table
diff --git a/doc/gawk.texi b/doc/gawk.texi
index ad4bae1..175c7af 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -5716,11 +5716,11 @@ and numeric characters in your character set.
@c Date: Tue, 01 Jul 2014 07:39:51 +0200
@c From: Hermann Peifer <address@hidden>
Some utilities that match regular expressions provide a nonstandard
-@code{[:ascii:]} character class; @command{awk} does not. However, you
-can simulate such a construct using @code{[\x00-\x7F]}. This matches
+@samp{[:ascii:]} character class; @command{awk} does not. However, you
+can simulate such a construct using @samp{[\x00-\x7F]}. This matches
all values numerically between zero and 127, which is the defined
range of the ASCII character set. Use a complemented character list
-(@code{[^\x00-\x7F]}) to match any single-byte characters that are not
+(@samp{[^\x00-\x7F]}) to match any single-byte characters that are not
in the ASCII range.
@cindex bracket expressions, collating elements
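As a quick aside (not part of the patch), the workaround described in this hunk can be exercised directly, assuming gawk's \x hex-escape extension and a UTF-8 or single-byte locale:

    $ echo 'cafe naïve Ångström' | gawk '{ gsub(/[^\x00-\x7F]/, "?"); print }'
    cafe na?ve ?ngstr?m

Every character outside the ASCII range is replaced, which is the complemented-list case mentioned above.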
@@ -5749,8 +5749,8 @@ Locale-specific names for a list of
characters that are equal. The name is enclosed between
@samp{[=} and @samp{=]}.
For example, the name @samp{e} might be used to represent all of
-``e,'' ``@`e,'' and ``@'e.'' In this case, @samp{[[=e=]]} is a regexp
-that matches any of @samp{e}, @samp{@'e}, or @samp{@`e}.
+``e,'' ``@^e,'' ``@`e,'' and ``@'e.'' In this case, @samp{[[=e=]]} is a regexp
+that matches any of @samp{e}, @samp{@^e}, @samp{@'e}, or @samp{@`e}.
@end table
These features are very valuable in non-English-speaking locales.
@@ -5779,7 +5779,7 @@ echo aaaabcd | awk '@{ sub(/a+/, "<A>"); print @}'
This example uses the @code{sub()} function to make a change to the input
record. (@code{sub()} replaces the first instance of any text matched
by the first argument with the string provided as the second argument;
-@pxref{String Functions}). Here, the regexp @code{/a+/} indicates ``one
+@pxref{String Functions}.) Here, the regexp @code{/a+/} indicates ``one
or more @samp{a} characters,'' and the replacement text is @samp{<A>}.
The input contains four @samp{a} characters.
@@ -5833,14 +5833,14 @@ and tests whether the input record matches this regexp.
@quotation NOTE
When using the @samp{~} and @samp{!~}
-operators, there is a difference between a regexp constant
+operators, be aware that there is a difference between a regexp constant
enclosed in slashes and a string constant enclosed in double quotes.
If you are going to use a string constant, you have to understand that
the string is, in essence, scanned @emph{twice}: the first time when
@command{awk} reads your program, and the second time when it goes to
match the string on the lefthand side of the operator with the pattern
on the right. This is true of any string-valued expression (such as
-@code{digits_regexp}, shown previously), not just string constants.
+@code{digits_regexp}, shown in the previous example), not just string constants.
@end quotation
@cindex regexp constants, slashes vs.@: quotes
@@ -6040,7 +6040,7 @@ matches either @samp{ball} or @samp{balls}, as a separate
word.
@item \B
Matches the empty string that occurs between two
word-constituent characters. For example,
-@samp{/\Brat\B/} matches @samp{crate} but it does not match @samp{dirty rat}.
+@samp{/\Brat\B/} matches @samp{crate}, but it does not match @samp{dirty rat}.
@samp{\B} is essentially the opposite of @samp{\y}.
@end table
@@ -6059,14 +6059,14 @@ The operators are:
@cindex backslash (@code{\}), @code{\`} operator (@command{gawk})
@cindex @code{\} (backslash), @code{\`} operator (@command{gawk})
Matches the empty string at the
-beginning of a buffer (string).
+beginning of a buffer (string)
@c @cindex operators, @code{\'} (@command{gawk})
@cindex backslash (@code{\}), @code{\'} operator (@command{gawk})
@cindex @code{\} (backslash), @code{\'} operator (@command{gawk})
@item \'
Matches the empty string at the
-end of a buffer (string).
+end of a buffer (string)
@end table
@cindex @code{^} (caret), regexp operator
@@ -6299,7 +6299,7 @@ This makes it more convenient for programs to work on the
parts of a record.
@cindex @code{getline} command
On rare occasions, you may need to use the @code{getline} command.
-The @code{getline} command is valuable, both because it
+The @code{getline} command is valuable both because it
can do explicit input from any number of files, and because the files
used with it do not have to be named on the @command{awk} command line
(@pxref{Getline}).
@@ -6350,8 +6350,8 @@ never automatically reset to zero.
Records are separated by a character called the @dfn{record separator}.
By default, the record separator is the newline character.
This is why records are, by default, single lines.
-A different character can be used for the record separator by
-assigning the character to the predefined variable @code{RS}.
+To use a different character for the record separator,
+simply assign that character to the predefined variable @code{RS}.
@cindex newlines, as record separators
@cindex @code{RS} variable
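A minimal session illustrating the assignment described above (an editor's sketch, not from the patch; assumes gawk is on the PATH):

    $ printf 'recordAurecordBurecordC' | gawk 'BEGIN { RS = "u" } { print NR, $0 }'
    1 recordA
    2 recordB
    3 recordC

Setting RS in a BEGIN rule changes the record separator before any input is read.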
@@ -6374,8 +6374,8 @@ awk 'BEGIN @{ RS = "u" @}
@noindent
changes the value of @code{RS} to @samp{u}, before reading any input.
-This is a string whose first character is the letter ``u''; as a result, records
-are separated by the letter ``u.'' Then the input file is read, and the second
+The new value is a string whose first character is the letter ``u''; as a result, records
+are separated by the letter ``u''. Then the input file is read, and the second
rule in the @command{awk} program (the action with no pattern) prints each
record. Because each @code{print} statement adds a newline at the end of
its output, this @command{awk} program copies the input
@@ -6436,8 +6436,8 @@ Bill 555-1675 bill.drowning@@hotmail.com
A
@end example
@noindent
-It contains no @samp{u} so there is no reason to split the record,
-unlike the others which have one or more occurrences of the @samp{u}.
+It contains no @samp{u}, so there is no reason to split the record,
+unlike the others, which each have one or more occurrences of the @samp{u}.
In fact, this record is treated as part of the previous record;
the newline separating them in the output
is the original newline in the @value{DF}, not the one added by
@@ -6532,7 +6532,7 @@ contains the same single character. However, when
@code{RS} is a
regular expression, @code{RT} contains
the actual input text that matched the regular expression.
-If the input file ended without any text that matches @code{RS},
+If the input file ends without any text matching @code{RS},
@command{gawk} sets @code{RT} to the null string.
The following example illustrates both of these features.
@@ -6713,11 +6713,11 @@ simple @command{awk} programs so powerful.
@cindex @code{$} (dollar sign), @code{$} field operator
@cindex dollar sign (@code{$}), @code{$} field operator
@cindex field address@hidden dollar sign as
-You use a dollar-sign (@samp{$})
+You use a dollar sign (@samp{$})
to refer to a field in an @command{awk} program,
followed by the number of the field you want. Thus, @code{$1}
refers to the first field, @code{$2} to the second, and so on.
-(Unlike the Unix shells, the field numbers are not limited to single digits.
+(Unlike in the Unix shells, the field numbers are not limited to single digits.
@code{$127} is the 127th field in the record.)
For example, suppose the following is a line of input:
@@ -6743,7 +6743,7 @@ If you try to reference a field beyond the last
one (such as @code{$8} when the record has only seven fields), you get
the empty string. (If used in a numeric operation, you get zero.)
-The use of @code{$0}, which looks like a reference to the ``zero-th'' field, is
+The use of @code{$0}, which looks like a reference to the ``zeroth'' field, is
a special case: it represents the whole input record. Use it
when you are not interested in specific fields.
Here are some more examples:
@@ -6798,13 +6798,13 @@ awk '@{ print $(2*2) @}' mail-list
@end example
@command{awk} evaluates the expression @samp{(2*2)} and uses
-its value as the number of the field to print. The @samp{*} sign
+its value as the number of the field to print. The @samp{*}
represents multiplication, so the expression @samp{2*2} evaluates to four.
The parentheses are used so that the multiplication is done before the
@samp{$} operation; they are necessary whenever there is a binary
operator@footnote{A @dfn{binary operator}, such as @samp{*} for
multiplication, is one that takes two operands. The distinction
-is required, because @command{awk} also has unary (one-operand)
+is required because @command{awk} also has unary (one-operand)
and ternary (three-operand) operators.}
in the field-number expression. This example, then, prints the
type of relationship (the fourth field) for every line of the file
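A hypothetical one-liner showing the field-number expression this hunk discusses (not part of the patch):

    $ echo 'Amelia 555-5553 F R' | gawk '{ print $(2*2) }'
    R

The parentheses force the multiplication to happen first, so the fourth field is printed.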
@@ -6984,7 +6984,7 @@ rebuild @code{$0} when @code{NF} is decremented.
Finally, there are times when it is convenient to force
@command{awk} to rebuild the entire record, using the current
-value of the fields and @code{OFS}. To do this, use the
+values of the fields and @code{OFS}. To do this, use the
seemingly innocuous assignment:
@example
@@ -7013,7 +7013,7 @@ such as @code{sub()} and @code{gsub()}
It is important to remember that @code{$0} is the @emph{full}
record, exactly as it was read from the input. This includes
any leading or trailing whitespace, and the exact whitespace (or other
-characters) that separate the fields.
+characters) that separates the fields.
It is a common error to try to change the field separators
in a record simply by setting @code{FS} and @code{OFS}, and then
@@ -7038,7 +7038,7 @@ with a statement such as @samp{$1 = $1}, as described
earlier.
It is important to remember that @code{$0} is the @emph{full}
record, exactly as it was read from the input. This includes
any leading or trailing whitespace, and the exact whitespace (or other
-characters) that separate the fields.
+characters) that separates the fields.
It is a common error to try to change the field separators
in a record simply by setting @code{FS} and @code{OFS}, and then
@@ -7132,7 +7132,7 @@ John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139
@end example
@noindent
-The same program would extract @address@hidden, instead of
+The same program would extract @address@hidden instead of
@address@hidden@address@hidden
If you were expecting the program to print the
address, you would be surprised. The moral is to choose your data layout and
@@ -7393,7 +7393,7 @@ choosing your field and record separators.
@cindex Unix @command{awk}, password address@hidden field separators and
Perhaps the most common use of a single character as the field separator
occurs when processing the Unix system password file. On many Unix
-systems, each user has a separate entry in the system password file, one
+systems, each user has a separate entry in the system password file, with one
line per user. The information in these lines is separated by colons.
The first field is the user's login name and the second is the user's
encrypted or shadow password. (A shadow password is indicated by the
@@ -7439,7 +7439,7 @@ When you do this, @code{$1} is the same as @code{$0}.
According to the POSIX standard, @command{awk} is supposed to behave
as if each record is split into fields at the time it is read.
In particular, this means that if you change the value of @code{FS}
-after a record is read, the value of the fields (i.e., how they were split)
+after a record is read, the values of the fields (i.e., how they were split)
should reflect the old value of @code{FS}, not the new one.
@cindex dark corner, field separators
@@ -7452,10 +7452,7 @@ using the @emph{current} value of @code{FS}!
@value{DARKCORNER}
This behavior can be difficult
to diagnose. The following example illustrates the difference
-between the two methods.
-(The @command{sed}@footnote{The @command{sed} utility is a ``stream editor.''
-Its behavior is also defined by the POSIX standard.}
-command prints just the first line of @file{/etc/passwd}.)
+between the two methods:
@example
sed 1q /etc/passwd | awk '@{ FS = ":" ; print $1 @}'
@@ -7476,6 +7473,10 @@ prints the full first line of the file, something like:
root:x:0:0:Root:/:
@end example
+(The @command{sed}@footnote{The @command{sed} utility is a ``stream editor.''
+Its behavior is also defined by the POSIX standard.}
+command prints just the first line of @file{/etc/passwd}.)
+
@docbook
</sidebar>
@end docbook
@@ -7492,7 +7493,7 @@ root:x:0:0:Root:/:
According to the POSIX standard, @command{awk} is supposed to behave
as if each record is split into fields at the time it is read.
In particular, this means that if you change the value of @code{FS}
-after a record is read, the value of the fields (i.e., how they were split)
+after a record is read, the values of the fields (i.e., how they were split)
should reflect the old value of @code{FS}, not the new one.
@cindex dark corner, field separators
@@ -7505,10 +7506,7 @@ using the @emph{current} value of @code{FS}!
@value{DARKCORNER}
This behavior can be difficult
to diagnose. The following example illustrates the difference
-between the two methods.
-(The @command{sed}@footnote{The @command{sed} utility is a ``stream editor.''
-Its behavior is also defined by the POSIX standard.}
-command prints just the first line of @file{/etc/passwd}.)
+between the two methods:
@example
sed 1q /etc/passwd | awk '@{ FS = ":" ; print $1 @}'
@@ -7528,6 +7526,10 @@ prints the full first line of the file, something like:
@example
root:x:0:0:Root:/:
@end example
+
+(The @command{sed}@footnote{The @command{sed} utility is a ``stream editor.''
+Its behavior is also defined by the POSIX standard.}
+command prints just the first line of @file{/etc/passwd}.)
@end cartouche
@end ifnotdocbook
@@ -7739,7 +7741,7 @@ In order to tell which kind of field splitting is in
effect,
use @code{PROCINFO["FS"]}
(@pxref{Auto-set}).
The value is @code{"FS"} if regular field splitting is being used,
-or it is @code{"FIELDWIDTHS"} if fixed-width field splitting is being used:
+or @code{"FIELDWIDTHS"} if fixed-width field splitting is being used:
@example
if (PROCINFO["FS"] == "FS")
@@ -7775,14 +7777,14 @@ what they are, and not by what they are not.
The most notorious such case
is so-called @dfn{comma-separated values} (CSV) data. Many spreadsheet
programs,
for example, can export their data into text files, where each record is
-terminated with a newline, and fields are separated by commas. If only
-commas separated the data, there wouldn't be an issue. The problem comes when
+terminated with a newline, and fields are separated by commas. If
+commas only separated the data, there wouldn't be an issue. The problem comes when
one of the fields contains an @emph{embedded} comma.
In such cases, most programs embed the field in double quotes.@footnote{The
CSV format lacked a formal standard definition for many years.
@uref{http://www.ietf.org/rfc/rfc4180.txt, RFC 4180}
standardizes the most common practices.}
-So we might have data like this:
+So, we might have data like this:
@example
@c file eg/misc/addresses.csv
@@ -7868,8 +7870,8 @@ of cases, and the @command{gawk} developers are satisfied
with that.
@end quotation
As written, the regexp used for @code{FPAT} requires that each field
-have a least one character. A straightforward modification
-(changing changed the first @samp{+} to @samp{*}) allows fields to be empty:
+contain at least one character. A straightforward modification
+(changing the first @samp{+} to @samp{*}) allows fields to be empty:
@example
FPAT = "([^,]*)|(\"[^\"]+\")"
@@ -7879,9 +7881,9 @@ Finally, the @code{patsplit()} function makes the same
functionality
available for splitting regular strings (@pxref{String Functions}).
To recap, @command{gawk} provides three independent methods
-to split input records into fields. @command{gawk} uses whichever
-mechanism was last chosen based on which of the three
-variables---@code{FS}, @code{FIELDWIDTHS}, and @code{FPAT}---was
+to split input records into fields.
+The mechanism used is based on which of the three
+variables---@code{FS}, @code{FIELDWIDTHS}, or @code{FPAT}---was
last assigned to.
@node Multiple Line
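As an illustration of the FPAT mechanism recapped above (not from the patch; the sample record mimics the CSV form discussed earlier):

    $ printf '%s\n' 'Robbins,Arnold,"1234 A Pretty Street, NE",MyTown,MyState,12345-6789,USA' |
      gawk 'BEGIN { FPAT = "([^,]+)|(\"[^\"]+\")" } { print NF; print $3 }'
    7
    "1234 A Pretty Street, NE"

Because FPAT was the variable assigned last, content-based splitting is what gawk uses here.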
@@ -7924,7 +7926,7 @@ at the end of the record and one or more blank lines
after the record.
In addition, a regular expression always matches the longest possible
sequence when there is a choice
(@pxref{Leftmost Longest}).
-So the next record doesn't start until
+So, the next record doesn't start until
the first nonblank line that follows---no matter how many blank lines
appear in a row, they are considered one record separator.
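A small sketch of blank-line-separated records (not part of the patch):

    $ printf 'Jane Doe\n123 Main St\n\nJohn Smith\n456 Oak Ave\n' |
      gawk 'BEGIN { RS = ""; FS = "\n" } { print NR ":", $1, "/", $2 }'
    1: Jane Doe / 123 Main St
    2: John Smith / 456 Oak Ave

With RS set to the empty string, each block of nonblank lines is one record; with FS set to "\n", each line within the block becomes a field.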
@@ -7939,10 +7941,10 @@ In the second case, this special processing is not done.
@cindex field separator, in multiline records
@cindex @code{FS}, in multiline records
Now that the input is separated into records, the second step is to
-separate the fields in the record. One way to do this is to divide each
+separate the fields in the records. One way to do this is to divide each
of the lines into fields in the normal manner. This happens by default
as the result of a special feature. When @code{RS} is set to the empty
-string, @emph{and} @code{FS} is set to a single character,
+string @emph{and} @code{FS} is set to a single character,
the newline character @emph{always} acts as a field separator.
This is in addition to whatever field separations result from
@code{FS}.@footnote{When @code{FS} is the null string (@code{""})
@@ -7957,7 +7959,7 @@ want the newline character to separate fields, because
there is no way to
prevent it. However, you can work around this by using the @code{split()}
function to break up the record manually
(@pxref{String Functions}).
-If you have a single character field separator, you can work around
+If you have a single-character field separator, you can work around
the special feature in a different way, by making @code{FS} into a
regexp for that single character. For example, if the field
separator is a percent character, instead of
@@ -7965,10 +7967,10 @@ separator is a percent character, instead of
Another way to separate fields is to
put each field on a separate line: to do this, just set the
-variable @code{FS} to the string @code{"\n"}. (This single
-character separator matches a single newline.)
+variable @code{FS} to the string @code{"\n"}.
+(This single-character separator matches a single newline.)
A practical example of a @value{DF} organized this way might be a mailing
-list, where each entry is separated by blank lines. Consider a mailing
+list, where blank lines separate the entries. Consider a mailing
list in a file named @file{addresses}, which looks like this:
@example
@@ -8064,7 +8066,7 @@ then @command{gawk} sets @code{RT} to the null string.
@cindex input, explicit
So far we have been getting our input data from @command{awk}'s main
input stream---either the standard input (usually your keyboard, sometimes
-the output from another program) or from the
+the output from another program) or the
files specified on the command line. The @command{awk} language has a
special built-in command called @code{getline} that
can be used to read input under your explicit control.
@@ -8248,7 +8250,7 @@ free
@end example
The @code{getline} command used in this way sets only the variables
-@code{NR}, @code{FNR}, and @code{RT} (and of course, @var{var}).
+@code{NR}, @code{FNR}, and @code{RT} (and, of course, @var{var}).
The record is not
split into fields, so the values of the fields (including @code{$0}) and
the value of @code{NF} do not change.
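A minimal, self-contained use of this variant (an illustration, not from the patch; it assumes /etc/hosts exists and is readable):

    $ gawk 'BEGIN {
          while ((getline line < "/etc/hosts") > 0)
              n++
          close("/etc/hosts")
          print n, "lines read; NF is still", NF
      }'

Because the text is read into the variable line, the current record, its fields, and NF are left untouched.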
@@ -8263,7 +8265,7 @@ the value of @code{NF} do not change.
@cindex left angle bracket (@code{<}), @code{<} operator (I/O)
@cindex operators, input/output
Use @samp{getline < @var{file}} to read the next record from @var{file}.
-Here @var{file} is a string-valued expression that
+Here, @var{file} is a string-valued expression that
specifies the @value{FN}. @samp{< @var{file}} is called a @dfn{redirection}
because it directs input to come from a different place.
For example, the following
@@ -8441,7 +8443,7 @@ of a construct like @address@hidden"echo "} "date" |
getline}.
Most versions, including the current version, treat it at as
@address@hidden("echo "} "date") | getline}.
(This is also how BWK @command{awk} behaves.)
-Some versions changed and treated it as
+Some versions instead treat it as
@address@hidden"echo "} ("date" | getline)}.
(This is how @command{mawk} behaves.)
In short, @emph{always} use explicit parentheses, and then you won't
@@ -8489,7 +8491,7 @@ program to be portable to other @command{awk}
implementations.
@cindex operators, input/output
@cindex differences in @command{awk} and @command{gawk}, input/output operators
-Input into @code{getline} from a pipe is a one-way operation.
+Reading input into @code{getline} from a pipe is a one-way operation.
The command that is started with @address@hidden | getline} only
sends data @emph{to} your @command{awk} program.
@@ -8499,7 +8501,7 @@ for processing and then read the results back.
communications are possible. This is done with the @samp{|&}
operator.
Typically, you write data to the coprocess first and then
-read results back, as shown in the following:
+read the results back, as shown in the following:
@example
print "@var{some query}" |& "db_server"
@@ -8582,7 +8584,7 @@ also @pxref{Auto-set}.)
@item
Using @code{FILENAME} with @code{getline}
(@samp{getline < FILENAME})
-is likely to be a source for
+is likely to be a source of
confusion. @command{awk} opens a separate input stream from the
current input file. However, by not using a variable, @code{$0}
and @code{NF} are still updated. If you're doing this, it's
@@ -8590,9 +8592,15 @@ probably by accident, and you should reconsider what it
is you're
trying to accomplish.
@item
-@ref{Getline Summary} presents a table summarizing the
+@ifdocbook
+The next section
+@end ifdocbook
+@ifnotdocbook
+@ref{Getline Summary},
+@end ifnotdocbook
+presents a table summarizing the
@code{getline} variants and which variables they can affect.
-It is worth noting that those variants which do not use redirection
+It is worth noting that those variants that do not use redirection
can cause @code{FILENAME} to be updated if they cause
@command{awk} to start reading a new input file.
@@ -8601,7 +8609,7 @@ can cause @code{FILENAME} to be updated if they cause
If the variable being assigned is an expression with side effects,
different versions of @command{awk} behave differently upon encountering
end-of-file. Some versions don't evaluate the expression; many versions
-(including @command{gawk}) do. Here is an example, due to Duncan Moore:
+(including @command{gawk}) do. Here is an example, courtesy of Duncan Moore:
@ignore
Date: Sun, 01 Apr 2012 11:49:33 +0100
@@ -8618,7 +8626,7 @@ BEGIN @{
@noindent
Here, the side effect is the @samp{++c}. Is @code{c} incremented if
-end of file is encountered, before the element in @code{a} is assigned?
+end-of-file is encountered before the element in @code{a} is assigned?
@command{gawk} treats @code{getline} like a function call, and evaluates
the expression @samp{a[++c]} before attempting to read from @file{f}.
@@ -8660,8 +8668,8 @@ This @value{SECTION} describes a feature that is specific
to @command{gawk}.
You may specify a timeout in milliseconds for reading input from the keyboard,
a pipe, or two-way communication, including TCP/IP sockets. This can be done
-on a per input, command, or connection basis, by setting a special element
-in the @code{PROCINFO} array (@pxref{Auto-set}):
+on a per-input, per-command, or per-connection basis, by setting a special
+element in the @code{PROCINFO} array (@pxref{Auto-set}):
@example
PROCINFO["input_name", "READ_TIMEOUT"] = @var{timeout in milliseconds}
@@ -8692,7 +8700,7 @@ while ((getline < "/dev/stdin") > 0)
@end example
@command{gawk} terminates the read operation if input does not
-arrive after waiting for the timeout period, returns failure
+arrive after waiting for the timeout period, returns failure,
and sets @code{ERRNO} to an appropriate string value.
A negative or zero value for the timeout is the same as specifying
no timeout at all.
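A hypothetical interactive test of the timeout element (not part of the patch; assumes a terminal on standard input):

    gawk 'BEGIN {
        PROCINFO["/dev/stdin", "READ_TIMEOUT"] = 3000    # give up after three seconds
        while ((getline line < "/dev/stdin") > 0)
            print "you typed:", line
        if (ERRNO != "")
            print "read failed:", ERRNO > "/dev/stderr"
    }'

If nothing arrives within the timeout, the read fails and ERRNO describes why.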
@@ -8742,7 +8750,7 @@ If the @code{PROCINFO} element is not present and the
@command{gawk} uses its value to initialize the timeout value.
The exclusive use of the environment variable to specify timeout
has the disadvantage of not being able to control it
-on a per command or connection basis.
+on a per-command or per-connection basis.
@command{gawk} considers a timeout event to be an error even though
the attempt to read from the underlying device may
@@ -8808,7 +8816,7 @@ The possibilities are as follows:
@item
After splitting the input into records, @command{awk} further splits
-the record into individual fields, named @code{$1}, @code{$2}, and so
+the records into individual fields, named @code{$1}, @code{$2}, and so
on. @code{$0} is the whole record, and @code{NF} indicates how many
fields there are. The default way to split fields is between whitespace
characters.
@@ -8824,12 +8832,12 @@ thing. Decrementing @code{NF} throws away fields and
rebuilds the record.
@item
Field splitting is more complicated than record splitting:
-@multitable @columnfractions .40 .45 .15
+@multitable @columnfractions .40 .40 .20
@headitem Field separator value @tab Fields are split @dots{} @tab @command{awk} / @command{gawk}
@item @code{FS == " "} @tab On runs of whitespace @tab @command{awk}
@item @code{FS == @var{any single character}} @tab On that character @tab @command{awk}
@item @code{FS == @var{regexp}} @tab On text matching the regexp @tab @command{awk}
-@item @code{FS == ""} @tab Each individual character is a separate field @tab @command{gawk}
+@item @code{FS == ""} @tab Such that each individual character is a separate field @tab @command{gawk}
@item @code{FIELDWIDTHS == @var{list of columns}} @tab Based on character position @tab @command{gawk}
@item @code{FPAT == @var{regexp}} @tab On the text surrounding text matching the regexp @tab @command{gawk}
@end multitable
@@ -8846,11 +8854,11 @@ This can also be done using command-line variable
assignment.
Use @code{PROCINFO["FS"]} to see how fields are being split.
@item
-Use @code{getline} in its various forms to read additional records,
+Use @code{getline} in its various forms to read additional records
from the default input stream, from a file, or from a pipe or coprocess.
@item
-Use @code{PROCINFO[@var{file}, "READ_TIMEOUT"]} to cause reads to timeout
+Use @code{PROCINFO[@var{file}, "READ_TIMEOUT"]} to cause reads to time out
for @var{file}.
@item
@@ -8959,7 +8967,7 @@ space is printed between any two items.
Note that the @code{print} statement is a statement and not an
expression---you can't use it in the pattern part of a
address@hidden@var{action} statement, for example.
+pattern--action statement, for example.
@node Print Examples
@section @code{print} Statement Examples
@@ -9150,7 +9158,7 @@ runs together on a single line.
@cindex numeric, output format
@cindex address@hidden numeric output
When printing numeric values with the @code{print} statement,
-@command{awk} internally converts the number to a string of characters
+@command{awk} internally converts each number to a string of characters
and prints that string. @command{awk} uses the @code{sprintf()} function
to do this conversion
(@pxref{String Functions}).
@@ -9221,7 +9229,7 @@ printf @var{format}, @var{item1}, @var{item2}, @dots{}
@noindent
As for @code{print}, the entire list of arguments may optionally be
enclosed in parentheses. Here too, the parentheses are necessary if any
-of the item expressions use the @samp{>} relational operator; otherwise,
+of the item expressions uses the @samp{>} relational operator; otherwise,
it can be confused with an output redirection (@pxref{Redirection}).
@cindex format specifiers
@@ -9252,7 +9260,7 @@ $ @kbd{awk 'BEGIN @{}
@end example
@noindent
-Here, neither the @samp{+} nor the @samp{OUCH!} appear in
+Here, neither the @samp{+} nor the @samp{OUCH!} appears in
the output message.
@node Control Letters
@@ -9299,8 +9307,8 @@ The two control letters are equivalent.
(The @samp{%i} specification is for compatibility with ISO C.)
@item @code{%e}, @code{%E}
-Print a number in scientific (exponential) notation;
-for example:
+Print a number in scientific (exponential) notation.
+For example:
@example
printf "%4.3e\n", 1950
@@ -9337,7 +9345,7 @@ The special ``not a number'' value formats as @samp{-nan}
or @samp{nan}
(@pxref{Math Definitions}).
@item @code{%F}
-Like @samp{%f} but the infinity and ``not a number'' values are spelled
+Like @samp{%f}, but the infinity and ``not a number'' values are spelled
using uppercase letters.
The @samp{%F} format is a POSIX extension to ISO C; not all systems
@@ -9581,7 +9589,7 @@ printf "%" w "." p "s\n", s
@end example
@noindent
-This is not particularly easy to read but it does work.
+This is not particularly easy to read, but it does work.
@c @cindex lint checks
@cindex troubleshooting, fatal errors, @code{printf} format strings
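For what it's worth, gawk also accepts the C-style * modifier, which takes the width and precision from the argument list and avoids the concatenation shown in this hunk (not part of the patch):

    $ gawk 'BEGIN { w = 10; p = 4; s = "abcdefg"; printf "%*.*s\n", w, p, s }'
          abcd

The string is truncated to four characters and right-justified in a ten-character field.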
@@ -9627,7 +9635,7 @@ $ @kbd{awk '@{ printf "%-10s %s\n", $1, $2 @}' mail-list}
@end example
In this case, the phone numbers had to be printed as strings because
-the numbers are separated by a dash. Printing the phone numbers as
+the numbers are separated by dashes. Printing the phone numbers as
numbers would have produced just the first three digits: @samp{555}.
This would have been pretty confusing.
@@ -9687,7 +9695,7 @@ This is called @dfn{redirection}.
@quotation NOTE
When @option{--sandbox} is specified (@pxref{Options}),
-redirecting output to files, pipes and coprocesses is disabled.
+redirecting output to files, pipes, and coprocesses is disabled.
@end quotation
A redirection appears after the @code{print} or @code{printf} statement.
@@ -9740,7 +9748,7 @@ Each output file contains one name or number per line.
@cindex @code{>} (right angle bracket), @code{>>} operator (I/O)
@cindex right angle bracket (@code{>}), @code{>>} operator (I/O)
@item print @var{items} >> @var{output-file}
-This redirection prints the items into the pre-existing output file
+This redirection prints the items into the preexisting output file
named @var{output-file}. The difference between this and the
address@hidden>} redirection is that the old contents (if any) of
@var{output-file} are not erased. Instead, the @command{awk} output is
@@ -9779,7 +9787,7 @@ The unsorted list is written with an ordinary
redirection, while
the sorted list is written by piping through the @command{sort} utility.
The next example uses redirection to mail a message to the mailing
-list @samp{bug-system}. This might be useful when trouble is encountered
+list @code{bug-system}. This might be useful when trouble is encountered
in an @command{awk} script run periodically for system maintenance:
@example
@@ -9810,15 +9818,23 @@ This redirection prints the items to the input of @var{command}.
The difference between this and the
@samp{|} redirection is that the output from @var{command}
can be read with @code{getline}.
-Thus @var{command} is a @dfn{coprocess}, which works together with,
-but subsidiary to, the @command{awk} program.
+Thus, @var{command} is a @dfn{coprocess}, which works together with
+but is subsidiary to the @command{awk} program.
This feature is a @command{gawk} extension, and is not available in
POSIX @command{awk}.
-@xref{Getline/Coprocess}
+@ifnotdocbook
+@xref{Getline/Coprocess},
for a brief discussion.
-@xref{Two-way I/O}
+@xref{Two-way I/O},
+for a more complete discussion.
+@end ifnotdocbook
+@ifdocbook
+@xref{Getline/Coprocess}
+for a brief discussion and
+@xref{Two-way I/O}
for a more complete discussion.
+@end ifdocbook
@end table
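A working sketch of the coprocess idea (not part of the patch; it assumes the rev utility is installed):

    gawk 'BEGIN {
        cmd = "rev"
        print "hello, world" |& cmd    # write to the coprocess
        close(cmd, "to")               # close its input so it can respond
        cmd |& getline line            # read its output back
        print line                     # prints "dlrow ,olleh"
        close(cmd)
    }'

Closing just the "to" end is what lets a filter such as rev see end-of-file and produce its result.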
Redirecting output using @samp{>}, @samp{>>}, @samp{|}, or @samp{|&}
@@ -9843,7 +9859,7 @@ This is indeed how redirections must be used from the
shell. But in
@command{awk}, it isn't necessary. In this kind of case, a program should
use @samp{>} for all the @code{print} statements, because the output file
is only opened once. (It happens that if you mix @samp{>} and @samp{>>}
-that output is produced in the expected order. However, mixing the operators
+output is produced in the expected order. However, mixing the operators
for the same file is definitely poor style, and is confusing to readers
of your program.)
@@ -9936,7 +9952,7 @@ command lines to be fed to the shell.
@end ifnotdocbook
@node Special FD
-@section Special Files for Standard Pre-Opened Data Streams
+@section Special Files for Standard Preopened Data Streams
@cindex standard input
@cindex input, standard
@cindex standard output
@@ -9949,7 +9965,7 @@ command lines to be fed to the shell.
Running programs conventionally have three input and output streams
already available to them for reading and writing. These are known
as the @dfn{standard input}, @dfn{standard output}, and @dfn{standard
-error output}. These open streams (and any other open file or pipe)
+error output}. These open streams (and any other open files or pipes)
are often referred to by the technical term @dfn{file descriptors}.
These streams are, by default, connected to your keyboard and screen, but
@@ -9987,7 +10003,7 @@ that is connected to your keyboard and screen. It
represents the
``terminal,''@footnote{The ``tty'' in @file{/dev/tty} stands for
``Teletype,'' a serial terminal.} which on modern systems is a keyboard
and screen, not a serial console.)
-This generally has the same effect but not always: although the
+This generally has the same effect, but not always: although the
standard error stream is usually the screen, it can be redirected; when
that happens, writing to the screen is not correct. In fact, if
@command{awk} is run from a background job, it may not have a
@@ -10032,7 +10048,7 @@ print "Serious error detected!" > "/dev/stderr"
@cindex troubleshooting, quotes with file names
Note the use of quotes around the @value{FN}.
-Like any other redirection, the value must be a string.
+Like with any other redirection, the value must be a string.
It is a common error to omit the quotes, which leads
to confusing results.
@@ -10058,7 +10074,7 @@ TCP/IP networking.
@end menu
@node Other Inherited Files
-@subsection Accessing Other Open Files With @command{gawk}
+@subsection Accessing Other Open Files with @command{gawk}
Besides the @code{/dev/stdin}, @code{/dev/stdout}, and @code{/dev/stderr}
special @value{FN}s mentioned earlier, @command{gawk} provides syntax
@@ -10115,7 +10131,7 @@ special @value{FN}s that @command{gawk} provides:
@cindex compatibility mode (@command{gawk}), file names
@cindex file names, in compatibility mode
@item
-Recognition of the @value{FN}s for the three standard pre-opened
+Recognition of the @value{FN}s for the three standard preopened
files is disabled only in POSIX mode.
@item
@@ -10128,7 +10144,7 @@ compatibility mode (either @option{--traditional} or
@option{--posix};
interprets these special @value{FN}s.
For example, using @samp{/dev/fd/4}
for output actually writes on file descriptor 4, and not on a new
-file descriptor that is @code{dup()}'ed from file descriptor 4. Most of
+file descriptor that is @code{dup()}ed from file descriptor 4. Most of
the time this does not matter; however, it is important to @emph{not}
close any of the files related to file descriptors 0, 1, and 2.
Doing so results in unpredictable behavior.
@@ -10350,9 +10366,9 @@ This value is zero if the close succeeds, or @minus{}1 if
it fails.
The POSIX standard is very vague; it says that @code{close()}
-returns zero on success and nonzero otherwise. In general,
+returns zero on success and a nonzero value otherwise. In general,
different implementations vary in what they report when closing
-pipes; thus the return value cannot be used portably.
+pipes; thus, the return value cannot be used portably.
@value{DARKCORNER}
In POSIX mode (@pxref{Options}), @command{gawk} just returns zero
when closing a pipe.
@@ -10407,9 +10423,9 @@ This value is zero if the close succeeds, or @minus{}1 if
it fails.
The POSIX standard is very vague; it says that @code{close()}
-returns zero on success and nonzero otherwise. In general,
+returns zero on success and a nonzero value otherwise. In general,
different implementations vary in what they report when closing
-pipes; thus the return value cannot be used portably.
+pipes; thus, the return value cannot be used portably.
@value{DARKCORNER}
In POSIX mode (@pxref{Options}), @command{gawk} just returns zero
when closing a pipe.
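For example, a portability-conscious script might check the value like this (a sketch, not part of the patch; assumes the date utility):

    gawk 'BEGIN {
        cmd = "date"
        cmd | getline now
        print "it is now:", now
        if (close(cmd) != 0)
            print "warning: close of", cmd, "failed" > "/dev/stderr"
    }'

On most systems date exits successfully and the warning branch is never taken.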
@@ -10429,8 +10445,8 @@ for numeric values for the @code{print} statement.
@item
The @code{printf} statement provides finer-grained control over output,
-with format control letters for different data types and various flags
-that modify the behavior of the format control letters.
+with format-control letters for different data types and various flags
+that modify the behavior of the format-control letters.
@item
Output from both @code{print} and @code{printf} may be redirected to
@@ -38192,7 +38208,7 @@ To get @command{awka}, go to
@url{http://sourceforge.net/projects/awka}.
@c andrewsumner@@yahoo.net
The project seems to be frozen; no new code changes have been made
-since approximately 2003.
+since approximately 2001.
@cindex Beebe, Nelson H.F.@:
@cindex @command{pawk} (profiling version of Brian Kernighan's @command{awk})
@@ -38470,7 +38486,7 @@ for information on getting the latest version of
@command{gawk}.)
@item
@ifnotinfo
-Follow the @uref{http://www.gnu.org/prep/standards/, @cite{GNU Coding Standards}}.
+Follow the @cite{GNU Coding Standards}.
@end ifnotinfo
@ifinfo
See @inforef{Top, , Version, standards, GNU Coding Standards}.
@@ -38479,7 +38495,7 @@ This document describes how GNU software should be
written. If you haven't
read it, please do so, preferably @emph{before} starting to modify
@command{gawk}.
(The @cite{GNU Coding Standards} are available from
the GNU Project's
-@uref{http://www.gnu.org/prep/standards_toc.html, website}.
+@uref{http://www.gnu.org/prep/standards/, website}.
Texinfo, Info, and DVI versions are also available.)
@cindex @command{gawk}, coding style in
diff --git a/doc/gawktexi.in b/doc/gawktexi.in
index 7379a9c..f112b35 100644
--- a/doc/gawktexi.in
+++ b/doc/gawktexi.in
@@ -5544,11 +5544,11 @@ and numeric characters in your character set.
@c Date: Tue, 01 Jul 2014 07:39:51 +0200
@c From: Hermann Peifer <address@hidden>
Some utilities that match regular expressions provide a nonstandard
-@code{[:ascii:]} character class; @command{awk} does not. However, you
-can simulate such a construct using @code{[\x00-\x7F]}. This matches
+@samp{[:ascii:]} character class; @command{awk} does not. However, you
+can simulate such a construct using @samp{[\x00-\x7F]}. This matches
all values numerically between zero and 127, which is the defined
range of the ASCII character set. Use a complemented character list
-(@code{[^\x00-\x7F]}) to match any single-byte characters that are not
+(@samp{[^\x00-\x7F]}) to match any single-byte characters that are not
in the ASCII range.
@cindex bracket expressions, collating elements
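
As a quick illustration of the simulated [:ascii:] class described in the hunk
above (a sketch only; the file name is hypothetical, and the \x escapes are
the gawk forms shown in the text):

    # Report lines containing any byte outside the ASCII range
    gawk '/[^\x00-\x7F]/ { print FNR ": non-ASCII data: " $0 }' somefile
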
@@ -5577,8 +5577,8 @@ Locale-specific names for a list of
characters that are equal. The name is enclosed between
@samp{[=} and @samp{=]}.
For example, the name @samp{e} might be used to represent all of
-``e,'' ``@`e,'' and ``@'e.'' In this case, @samp{[[=e=]]} is a regexp
-that matches any of @samp{e}, @samp{@'e}, or @samp{@`e}.
+``e,'' ``@^e,'' ``@`e,'' and ``@'e.'' In this case, @samp{[[=e=]]} is a regexp
+that matches any of @samp{e}, @samp{@^e}, @samp{@'e}, or @samp{@`e}.
@end table
These features are very valuable in non-English-speaking locales.
@@ -5607,7 +5607,7 @@ echo aaaabcd | awk '@{ sub(/a+/, "<A>"); print @}'
This example uses the @code{sub()} function to make a change to the input
record. (@code{sub()} replaces the first instance of any text matched
by the first argument with the string provided as the second argument;
-@pxref{String Functions}). Here, the regexp @code{/a+/} indicates ``one
+@pxref{String Functions}.) Here, the regexp @code{/a+/} indicates ``one
or more @samp{a} characters,'' and the replacement text is @samp{<A>}.
The input contains four @samp{a} characters.
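
For reference, the leftmost-longest matching described here means that the
example in this hunk replaces all four a's in one go:

    $ echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
    <A>bcd
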
@@ -5661,14 +5661,14 @@ and tests whether the input record matches this regexp.
@quotation NOTE
When using the @samp{~} and @samp{!~}
-operators, there is a difference between a regexp constant
+operators, be aware that there is a difference between a regexp constant
enclosed in slashes and a string constant enclosed in double quotes.
If you are going to use a string constant, you have to understand that
the string is, in essence, scanned @emph{twice}: the first time when
@command{awk} reads your program, and the second time when it goes to
match the string on the lefthand side of the operator with the pattern
on the right. This is true of any string-valued expression (such as
-@code{digits_regexp}, shown previously), not just string constants.
+@code{digits_regexp}, shown in the previous example), not just string constants.
@end quotation
@cindex regexp constants, slashes vs.@: quotes
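
A small sketch of the "scanned twice" point in the note above: because a
string constant is processed once as a string and again as a regexp,
backslashes in dynamic regexps must be doubled (the input file name is
hypothetical):

    awk '{
        digits_regexp = "[[:digit:]]+"   # dynamic regexp held in a string
        if ($0 ~ digits_regexp)
            print "contains digits:", $0
        if ($0 ~ "\\.")                  # same test as $0 ~ /\./
            print "contains a literal period:", $0
    }' somefile
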
@@ -5824,7 +5824,7 @@ matches either @samp{ball} or @samp{balls}, as a separate
word.
@item \B
Matches the empty string that occurs between two
word-constituent characters. For example,
address@hidden/\Brat\B/} matches @samp{crate} but it does not match @samp{dirty
rat}.
address@hidden/\Brat\B/} matches @samp{crate}, but it does not match
@samp{dirty rat}.
@samp{\B} is essentially the opposite of @samp{\y}.
@end table
@@ -5843,14 +5843,14 @@ The operators are:
@cindex backslash (@code{\}), @code{\`} operator (@command{gawk})
@cindex @code{\} (backslash), @code{\`} operator (@command{gawk})
Matches the empty string at the
-beginning of a buffer (string).
+beginning of a buffer (string)
@c @cindex operators, @code{\'} (@command{gawk})
@cindex backslash (@code{\}), @code{\'} operator (@command{gawk})
@cindex @code{\} (backslash), @code{\'} operator (@command{gawk})
@item \'
Matches the empty string at the
-end of a buffer (string).
+end of a buffer (string)
@end table
@cindex @code{^} (caret), regexp operator
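
A minimal illustration of the gawk-specific word-boundary operators discussed
in the hunks above (a sketch, not from the patch):

    $ echo "the dirty rat hid in a crate" | gawk '{ gsub(/\yrat\y/, "RAT"); print }'
    the dirty RAT hid in a crate
    $ echo "crate" | gawk '/\Brat\B/ { print "rat is embedded in", $0 }'
    rat is embedded in crate
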
@@ -6083,7 +6083,7 @@ This makes it more convenient for programs to work on the
parts of a record.
@cindex @code{getline} command
On rare occasions, you may need to use the @code{getline} command.
-The @code{getline} command is valuable, both because it
+The @code{getline} command is valuable both because it
can do explicit input from any number of files, and because the files
used with it do not have to be named on the @command{awk} command line
(@pxref{Getline}).
@@ -6134,8 +6134,8 @@ never automatically reset to zero.
Records are separated by a character called the @dfn{record separator}.
By default, the record separator is the newline character.
This is why records are, by default, single lines.
-A different character can be used for the record separator by
-assigning the character to the predefined variable @code{RS}.
+To use a different character for the record separator,
+simply assign that character to the predefined variable @code{RS}.
@cindex newlines, as record separators
@cindex @code{RS} variable
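
A short sketch of assigning a different record separator, as described above,
using a slash as RS:

    $ printf 'one/two/three' | awk 'BEGIN { RS = "/" } { print NR, $0 }'
    1 one
    2 two
    3 three
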
@@ -6158,8 +6158,8 @@ awk 'BEGIN @{ RS = "u" @}
@noindent
changes the value of @code{RS} to @samp{u}, before reading any input.
-This is a string whose first character is the letter ``u''; as a result,
records
-are separated by the letter ``u.'' Then the input file is read, and the second
+The new value is a string whose first character is the letter ``u''; as a
result, records
+are separated by the letter ``u''. Then the input file is read, and the second
rule in the @command{awk} program (the action with no pattern) prints each
record. Because each @code{print} statement adds a newline at the end of
its output, this @command{awk} program copies the input
@@ -6220,8 +6220,8 @@ Bill 555-1675 bill.drowning@@hotmail.com
A
@end example
@noindent
-It contains no @samp{u} so there is no reason to split the record,
-unlike the others which have one or more occurrences of the @samp{u}.
+It contains no @samp{u}, so there is no reason to split the record,
+unlike the others, which each have one or more occurrences of the @samp{u}.
In fact, this record is treated as part of the previous record;
the newline separating them in the output
is the original newline in the @value{DF}, not the one added by
@@ -6316,7 +6316,7 @@ contains the same single character. However, when
@code{RS} is a
regular expression, @code{RT} contains
the actual input text that matched the regular expression.
-If the input file ended without any text that matches @code{RS},
+If the input file ends without any text matching @code{RS},
@command{gawk} sets @code{RT} to the null string.
The following example illustrates both of these features.
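
A minimal sketch of the RT behavior described above when RS is a regular
expression (gawk-specific); the final record shows RT set to the null string
because the input ends without matching RS:

    $ printf 'a1b22c' | gawk 'BEGIN { RS = "[0-9]+" } { print NR, $0, "RT=[" RT "]" }'
    1 a RT=[1]
    2 b RT=[22]
    3 c RT=[]
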
@@ -6440,11 +6440,11 @@ simple @command{awk} programs so powerful.
@cindex @code{$} (dollar sign), @code{$} field operator
@cindex dollar sign (@code{$}), @code{$} field operator
@cindex field address@hidden dollar sign as
-You use a dollar-sign (@samp{$})
+You use a dollar sign (@samp{$})
to refer to a field in an @command{awk} program,
followed by the number of the field you want. Thus, @code{$1}
refers to the first field, @code{$2} to the second, and so on.
-(Unlike the Unix shells, the field numbers are not limited to single digits.
+(Unlike in the Unix shells, the field numbers are not limited to single digits.
@code{$127} is the 127th field in the record.)
For example, suppose the following is a line of input:
@@ -6470,7 +6470,7 @@ If you try to reference a field beyond the last
one (such as @code{$8} when the record has only seven fields), you get
the empty string. (If used in a numeric operation, you get zero.)
-The use of @code{$0}, which looks like a reference to the ``zero-th'' field, is
+The use of @code{$0}, which looks like a reference to the ``zeroth'' field, is
a special case: it represents the whole input record. Use it
when you are not interested in specific fields.
Here are some more examples:
@@ -6525,13 +6525,13 @@ awk '@{ print $(2*2) @}' mail-list
@end example
@command{awk} evaluates the expression @samp{(2*2)} and uses
-its value as the number of the field to print. The @samp{*} sign
+its value as the number of the field to print. The @samp{*}
represents multiplication, so the expression @samp{2*2} evaluates to four.
The parentheses are used so that the multiplication is done before the
@samp{$} operation; they are necessary whenever there is a binary
operator@footnote{A @dfn{binary operator}, such as @samp{*} for
multiplication, is one that takes two operands. The distinction
-is required, because @command{awk} also has unary (one-operand)
+is required because @command{awk} also has unary (one-operand)
and ternary (three-operand) operators.}
in the field-number expression. This example, then, prints the
type of relationship (the fourth field) for every line of the file
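
As a quick illustration of field-number expressions such as $(2*2) discussed
in this hunk:

    $ echo 'a b c d e f g h' | awk '{ n = 3; print $(n + 1), $NF }'
    d h
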
@@ -6711,7 +6711,7 @@ rebuild @code{$0} when @code{NF} is decremented.
Finally, there are times when it is convenient to force
@command{awk} to rebuild the entire record, using the current
-value of the fields and @code{OFS}. To do this, use the
+values of the fields and @code{OFS}. To do this, use the
seemingly innocuous assignment:
@example
@@ -6735,7 +6735,7 @@ such as @code{sub()} and @code{gsub()}
It is important to remember that @code{$0} is the @emph{full}
record, exactly as it was read from the input. This includes
any leading or trailing whitespace, and the exact whitespace (or other
-characters) that separate the fields.
+characters) that separates the fields.
It is a common error to try to change the field separators
in a record simply by setting @code{FS} and @code{OFS}, and then
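
The common error described above (setting FS and OFS alone does not rewrite
$0) is fixed by forcing the rebuild mentioned earlier; a small sketch:

    $ echo 'a:b:c' | awk 'BEGIN { FS = ":"; OFS = "-" } { $1 = $1; print }'
    a-b-c

Without the $1 = $1 assignment, print would emit the record unchanged as a:b:c.
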
@@ -6828,7 +6828,7 @@ John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139
@end example
@noindent
-The same program would extract @address@hidden, instead of
+The same program would extract @address@hidden instead of
@address@hidden@address@hidden
If you were expecting the program to print the
address, you would be surprised. The moral is to choose your data layout and
@@ -7089,7 +7089,7 @@ choosing your field and record separators.
@cindex Unix @command{awk}, password address@hidden field separators and
Perhaps the most common use of a single character as the field separator
occurs when processing the Unix system password file. On many Unix
-systems, each user has a separate entry in the system password file, one
+systems, each user has a separate entry in the system password file, with one
line per user. The information in these lines is separated by colons.
The first field is the user's login name and the second is the user's
encrypted or shadow password. (A shadow password is indicated by the
@@ -7130,7 +7130,7 @@ When you do this, @code{$1} is the same as @code{$0}.
According to the POSIX standard, @command{awk} is supposed to behave
as if each record is split into fields at the time it is read.
In particular, this means that if you change the value of @code{FS}
-after a record is read, the value of the fields (i.e., how they were split)
+after a record is read, the values of the fields (i.e., how they were split)
should reflect the old value of @code{FS}, not the new one.
@cindex dark corner, field separators
@@ -7143,10 +7143,7 @@ using the @emph{current} value of @code{FS}!
@value{DARKCORNER}
This behavior can be difficult
to diagnose. The following example illustrates the difference
-between the two methods.
-(The @command{sed}@footnote{The @command{sed} utility is a ``stream editor.''
-Its behavior is also defined by the POSIX standard.}
-command prints just the first line of @file{/etc/passwd}.)
+between the two methods:
@example
sed 1q /etc/passwd | awk '@{ FS = ":" ; print $1 @}'
@@ -7166,6 +7163,10 @@ prints the full first line of the file, something like:
@example
root:x:0:0:Root:/:
@end example
+
+(The @command{sed}@footnote{The @command{sed} utility is a ``stream editor.''
+Its behavior is also defined by the POSIX standard.}
+command prints just the first line of @file{/etc/passwd}.)
@end sidebar
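
To get the behavior the sidebar above warns about right, FS has to be set
before the record is read, for example in a BEGIN rule (the output shown
assumes root is the first entry in /etc/passwd):

    $ sed 1q /etc/passwd | awk 'BEGIN { FS = ":" } { print $1 }'
    root
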
@node Field Splitting Summary
@@ -7340,7 +7341,7 @@ In order to tell which kind of field splitting is in
effect,
use @code{PROCINFO["FS"]}
(@pxref{Auto-set}).
The value is @code{"FS"} if regular field splitting is being used,
-or it is @code{"FIELDWIDTHS"} if fixed-width field splitting is being used:
+or @code{"FIELDWIDTHS"} if fixed-width field splitting is being used:
@example
if (PROCINFO["FS"] == "FS")
@@ -7376,14 +7377,14 @@ what they are, and not by what they are not.
The most notorious such case
is so-called @dfn{comma-separated values} (CSV) data. Many spreadsheet
programs,
for example, can export their data into text files, where each record is
-terminated with a newline, and fields are separated by commas. If only
-commas separated the data, there wouldn't be an issue. The problem comes when
+terminated with a newline, and fields are separated by commas. If
+commas only separated the data, there wouldn't be an issue. The problem comes
when
one of the fields contains an @emph{embedded} comma.
In such cases, most programs embed the field in double quotes.@footnote{The
CSV format lacked a formal standard definition for many years.
@uref{http://www.ietf.org/rfc/rfc4180.txt, RFC 4180}
standardizes the most common practices.}
-So we might have data like this:
+So, we might have data like this:
@example
@c file eg/misc/addresses.csv
@@ -7469,8 +7470,8 @@ of cases, and the @command{gawk} developers are satisfied
with that.
@end quotation
As written, the regexp used for @code{FPAT} requires that each field
-have a least one character. A straightforward modification
-(changing changed the first @samp{+} to @samp{*}) allows fields to be empty:
+contain at least one character. A straightforward modification
+(changing the first @samp{+} to @samp{*}) allows fields to be empty:
@example
FPAT = "([^,]*)|(\"[^\"]+\")"
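
A sketch of using the FPAT value shown above on the sample CSV data mentioned
earlier in this series of hunks (which field to print is arbitrary here):

    gawk 'BEGIN { FPAT = "([^,]*)|(\"[^\"]+\")" }
          { print "third field:", $3 }' addresses.csv
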
@@ -7480,9 +7481,9 @@ Finally, the @code{patsplit()} function makes the same
functionality
available for splitting regular strings (@pxref{String Functions}).
To recap, @command{gawk} provides three independent methods
-to split input records into fields. @command{gawk} uses whichever
-mechanism was last chosen based on which of the three
-variables---@code{FS}, @code{FIELDWIDTHS}, and @code{FPAT}---was
+to split input records into fields.
+The mechanism used is based on which of the three
+variables---@code{FS}, @code{FIELDWIDTHS}, or @code{FPAT}---was
last assigned to.
@node Multiple Line
@@ -7525,7 +7526,7 @@ at the end of the record and one or more blank lines
after the record.
In addition, a regular expression always matches the longest possible
sequence when there is a choice
(@pxref{Leftmost Longest}).
-So the next record doesn't start until
+So, the next record doesn't start until
the first nonblank line that follows---no matter how many blank lines
appear in a row, they are considered one record separator.
@@ -7540,10 +7541,10 @@ In the second case, this special processing is not done.
@cindex field separator, in multiline records
@cindex @code{FS}, in multiline records
Now that the input is separated into records, the second step is to
-separate the fields in the record. One way to do this is to divide each
+separate the fields in the records. One way to do this is to divide each
of the lines into fields in the normal manner. This happens by default
as the result of a special feature. When @code{RS} is set to the empty
-string, @emph{and} @code{FS} is set to a single character,
+string @emph{and} @code{FS} is set to a single character,
the newline character @emph{always} acts as a field separator.
This is in addition to whatever field separations result from
@code{FS}.@footnote{When @code{FS} is the null string (@code{""})
@@ -7558,7 +7559,7 @@ want the newline character to separate fields, because
there is no way to
prevent it. However, you can work around this by using the @code{split()}
function to break up the record manually
(@pxref{String Functions}).
-If you have a single character field separator, you can work around
+If you have a single-character field separator, you can work around
the special feature in a different way, by making @code{FS} into a
regexp for that single character. For example, if the field
separator is a percent character, instead of
@@ -7566,10 +7567,10 @@ separator is a percent character, instead of
Another way to separate fields is to
put each field on a separate line: to do this, just set the
-variable @code{FS} to the string @code{"\n"}. (This single
-character separator matches a single newline.)
+variable @code{FS} to the string @code{"\n"}.
+(This single-character separator matches a single newline.)
A practical example of a @value{DF} organized this way might be a mailing
-list, where each entry is separated by blank lines. Consider a mailing
+list, where blank lines separate the entries. Consider a mailing
list in a file named @file{addresses}, which looks like this:
@example
@@ -7665,7 +7666,7 @@ then @command{gawk} sets @code{RT} to the null string.
@cindex input, explicit
So far we have been getting our input data from @command{awk}'s main
input stream---either the standard input (usually your keyboard, sometimes
-the output from another program) or from the
+the output from another program) or the
files specified on the command line. The @command{awk} language has a
special built-in command called @code{getline} that
can be used to read input under your explicit control.
@@ -7849,7 +7850,7 @@ free
@end example
The @code{getline} command used in this way sets only the variables
-@code{NR}, @code{FNR}, and @code{RT} (and of course, @var{var}).
+@code{NR}, @code{FNR}, and @code{RT} (and, of course, @var{var}).
The record is not
split into fields, so the values of the fields (including @code{$0}) and
the value of @code{NF} do not change.
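
The related form getline var < file reads from a named file and likewise
leaves $0 and NF untouched; a minimal sketch (the file name is hypothetical):

    BEGIN {
        while ((getline line < "config.txt") > 0)
            print "read:", line      # $0 and NF are not changed
        close("config.txt")
    }
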
@@ -7864,7 +7865,7 @@ the value of @code{NF} do not change.
@cindex left angle bracket (@code{<}), @code{<} operator (I/O)
@cindex operators, input/output
Use @samp{getline < @var{file}} to read the next record from @var{file}.
-Here @var{file} is a string-valued expression that
+Here, @var{file} is a string-valued expression that
specifies the @value{FN}. @samp{< @var{file}} is called a @dfn{redirection}
because it directs input to come from a different place.
For example, the following
@@ -8042,7 +8043,7 @@ of a construct like @address@hidden"echo "} "date" |
getline}.
Most versions, including the current version, treat it at as
@address@hidden("echo "} "date") | getline}.
(This is also how BWK @command{awk} behaves.)
-Some versions changed and treated it as
+Some versions instead treat it as
@address@hidden"echo "} ("date" | getline)}.
(This is how @command{mawk} behaves.)
In short, @emph{always} use explicit parentheses, and then you won't
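
A sketch of the explicit-parentheses advice above when the command string is
built by concatenation:

    BEGIN {
        ("echo " "date") | getline result   # unambiguous: runs the command "echo date"
        print result                        # prints the word "date"
        close("echo " "date")
    }
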
@@ -8090,7 +8091,7 @@ program to be portable to other @command{awk}
implementations.
@cindex operators, input/output
@cindex differences in @command{awk} and @command{gawk}, input/output operators
-Input into @code{getline} from a pipe is a one-way operation.
+Reading input into @code{getline} from a pipe is a one-way operation.
The command that is started with @address@hidden | getline} only
sends data @emph{to} your @command{awk} program.
@@ -8100,7 +8101,7 @@ for processing and then read the results back.
communications are possible. This is done with the @samp{|&}
operator.
Typically, you write data to the coprocess first and then
-read results back, as shown in the following:
+read the results back, as shown in the following:
@example
print "@var{some query}" |& "db_server"
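
A minimal two-way-pipe sketch to go with the query/response description
above, using sort as the coprocess (close(cmd, "to") ends the write side so
the coprocess can produce its output):

    BEGIN {
        cmd = "sort -n"
        print "300" |& cmd
        print "100" |& cmd
        print "200" |& cmd
        close(cmd, "to")
        while ((cmd |& getline line) > 0)
            print "got back:", line
        close(cmd)
    }
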
@@ -8183,7 +8184,7 @@ also @pxref{Auto-set}.)
@item
Using @code{FILENAME} with @code{getline}
(@samp{getline < FILENAME})
-is likely to be a source for
+is likely to be a source of
confusion. @command{awk} opens a separate input stream from the
current input file. However, by not using a variable, @code{$0}
and @code{NF} are still updated. If you're doing this, it's
@@ -8191,9 +8192,15 @@ probably by accident, and you should reconsider what it
is you're
trying to accomplish.
@item
address@hidden Summary} presents a table summarizing the
+@ifdocbook
+The next section
+@end ifdocbook
+@ifnotdocbook
+@ref{Getline Summary},
+@end ifnotdocbook
+presents a table summarizing the
@code{getline} variants and which variables they can affect.
-It is worth noting that those variants which do not use redirection
+It is worth noting that those variants that do not use redirection
can cause @code{FILENAME} to be updated if they cause
@command{awk} to start reading a new input file.
@@ -8202,7 +8209,7 @@ can cause @code{FILENAME} to be updated if they cause
If the variable being assigned is an expression with side effects,
different versions of @command{awk} behave differently upon encountering
end-of-file. Some versions don't evaluate the expression; many versions
-(including @command{gawk}) do. Here is an example, due to Duncan Moore:
+(including @command{gawk}) do. Here is an example, courtesy of Duncan Moore:
@ignore
Date: Sun, 01 Apr 2012 11:49:33 +0100
@@ -8219,7 +8226,7 @@ BEGIN @{
@noindent
Here, the side effect is the @samp{++c}. Is @code{c} incremented if
-end of file is encountered, before the element in @code{a} is assigned?
+end-of-file is encountered before the element in @code{a} is assigned?
@command{gawk} treats @code{getline} like a function call, and evaluates
the expression @samp{a[++c]} before attempting to read from @file{f}.
@@ -8261,8 +8268,8 @@ This @value{SECTION} describes a feature that is specific
to @command{gawk}.
You may specify a timeout in milliseconds for reading input from the keyboard,
a pipe, or two-way communication, including TCP/IP sockets. This can be done
-on a per input, command, or connection basis, by setting a special element
-in the @code{PROCINFO} array (@pxref{Auto-set}):
+on a per-input, per-command, or per-connection basis, by setting a special
+element in the @code{PROCINFO} array (@pxref{Auto-set}):
@example
PROCINFO["input_name", "READ_TIMEOUT"] = @var{timeout in milliseconds}
@@ -8293,7 +8300,7 @@ while ((getline < "/dev/stdin") > 0)
@end example
@command{gawk} terminates the read operation if input does not
-arrive after waiting for the timeout period, returns failure
+arrive after waiting for the timeout period, returns failure,
and sets @code{ERRNO} to an appropriate string value.
A negative or zero value for the timeout is the same as specifying
no timeout at all.
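
Tying together the PROCINFO element and the ERRNO behavior described above,
a small sketch (the three-second value is arbitrary):

    BEGIN {
        PROCINFO["/dev/stdin", "READ_TIMEOUT"] = 3000    # milliseconds
        while ((getline line < "/dev/stdin") > 0)
            print "read:", line
        if (ERRNO != "")
            print "read failed or timed out:", ERRNO > "/dev/stderr"
    }
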
@@ -8343,7 +8350,7 @@ If the @code{PROCINFO} element is not present and the
@command{gawk} uses its value to initialize the timeout value.
The exclusive use of the environment variable to specify timeout
has the disadvantage of not being able to control it
-on a per command or connection basis.
+on a per-command or per-connection basis.
@command{gawk} considers a timeout event to be an error even though
the attempt to read from the underlying device may
@@ -8409,7 +8416,7 @@ The possibilities are as follows:
@item
After splitting the input into records, @command{awk} further splits
-the record into individual fields, named @code{$1}, @code{$2}, and so
+the records into individual fields, named @code{$1}, @code{$2}, and so
on. @code{$0} is the whole record, and @code{NF} indicates how many
fields there are. The default way to split fields is between whitespace
characters.
@@ -8425,12 +8432,12 @@ thing. Decrementing @code{NF} throws away fields and
rebuilds the record.
@item
Field splitting is more complicated than record splitting:
-@multitable @columnfractions .40 .45 .15
+@multitable @columnfractions .40 .40 .20
@headitem Field separator value @tab Fields are split @dots{} @tab @command{awk} / @command{gawk}
@item @code{FS == " "} @tab On runs of whitespace @tab @command{awk}
@item @code{FS == @var{any single character}} @tab On that character @tab @command{awk}
@item @code{FS == @var{regexp}} @tab On text matching the regexp @tab @command{awk}
-@item @code{FS == ""} @tab Each individual character is a separate field @tab @command{gawk}
+@item @code{FS == ""} @tab Such that each individual character is a separate field @tab @command{gawk}
@item @code{FIELDWIDTHS == @var{list of columns}} @tab Based on character position @tab @command{gawk}
@item @code{FPAT == @var{regexp}} @tab On the text surrounding text matching the regexp @tab @command{gawk}
@end multitable
@@ -8447,11 +8454,11 @@ This can also be done using command-line variable
assignment.
Use @code{PROCINFO["FS"]} to see how fields are being split.
@item
-Use @code{getline} in its various forms to read additional records,
+Use @code{getline} in its various forms to read additional records
from the default input stream, from a file, or from a pipe or coprocess.
@item
-Use @code{PROCINFO[@var{file}, "READ_TIMEOUT"]} to cause reads to timeout
+Use @code{PROCINFO[@var{file}, "READ_TIMEOUT"]} to cause reads to time out
for @var{file}.
@item
@@ -8560,7 +8567,7 @@ space is printed between any two items.
Note that the @code{print} statement is a statement and not an
expression---you can't use it in the pattern part of a
address@hidden@var{action} statement, for example.
+pattern--action statement, for example.
@node Print Examples
@section @code{print} Statement Examples
@@ -8751,7 +8758,7 @@ runs together on a single line.
@cindex numeric, output format
@cindex address@hidden numeric output
When printing numeric values with the @code{print} statement,
-@command{awk} internally converts the number to a string of characters
+@command{awk} internally converts each number to a string of characters
and prints that string. @command{awk} uses the @code{sprintf()} function
to do this conversion
(@pxref{String Functions}).
@@ -8822,7 +8829,7 @@ printf @var{format}, @var{item1}, @var{item2}, @dots{}
@noindent
As for @code{print}, the entire list of arguments may optionally be
enclosed in parentheses. Here too, the parentheses are necessary if any
-of the item expressions use the @samp{>} relational operator; otherwise,
+of the item expressions uses the @samp{>} relational operator; otherwise,
it can be confused with an output redirection (@pxref{Redirection}).
@cindex format specifiers
@@ -8853,7 +8860,7 @@ $ @kbd{awk 'BEGIN @{}
@end example
@noindent
-Here, neither the @samp{+} nor the @samp{OUCH!} appear in
+Here, neither the @samp{+} nor the @samp{OUCH!} appears in
the output message.
@node Control Letters
@@ -8900,8 +8907,8 @@ The two control letters are equivalent.
(The @samp{%i} specification is for compatibility with ISO C.)
@item @code{%e}, @code{%E}
-Print a number in scientific (exponential) notation;
-for example:
+Print a number in scientific (exponential) notation.
+For example:
@example
printf "%4.3e\n", 1950
@@ -8938,7 +8945,7 @@ The special ``not a number'' value formats as @samp{-nan}
or @samp{nan}
(@pxref{Math Definitions}).
@item @code{%F}
-Like @samp{%f} but the infinity and ``not a number'' values are spelled
+Like @samp{%f}, but the infinity and ``not a number'' values are spelled
using uppercase letters.
The @samp{%F} format is a POSIX extension to ISO C; not all systems
@@ -9182,7 +9189,7 @@ printf "%" w "." p "s\n", s
@end example
@noindent
-This is not particularly easy to read but it does work.
+This is not particularly easy to read, but it does work.
@c @cindex lint checks
@cindex troubleshooting, fatal errors, @code{printf} format strings
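
For comparison with the concatenated format string above, gawk also accepts
C-style dynamic width and precision via "*"; both statements below print
"  abc":

    BEGIN {
        w = 5; p = 3; s = "abcdefg"
        printf "%" w "." p "s\n", s    # builds the format "%5.3s\n"
        printf "%*.*s\n", w, p, s      # dynamic width and precision
    }
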
@@ -9228,7 +9235,7 @@ $ @kbd{awk '@{ printf "%-10s %s\n", $1, $2 @}' mail-list}
@end example
In this case, the phone numbers had to be printed as strings because
-the numbers are separated by a dash. Printing the phone numbers as
+the numbers are separated by dashes. Printing the phone numbers as
numbers would have produced just the first three digits: @samp{555}.
This would have been pretty confusing.
@@ -9288,7 +9295,7 @@ This is called @dfn{redirection}.
@quotation NOTE
When @option{--sandbox} is specified (@pxref{Options}),
-redirecting output to files, pipes and coprocesses is disabled.
+redirecting output to files, pipes, and coprocesses is disabled.
@end quotation
A redirection appears after the @code{print} or @code{printf} statement.
@@ -9341,7 +9348,7 @@ Each output file contains one name or number per line.
@cindex @code{>} (right angle bracket), @code{>>} operator (I/O)
@cindex right angle bracket (@code{>}), @code{>>} operator (I/O)
@item print @var{items} >> @var{output-file}
-This redirection prints the items into the pre-existing output file
+This redirection prints the items into the preexisting output file
named @var{output-file}. The difference between this and the
address@hidden>} redirection is that the old contents (if any) of
@var{output-file} are not erased. Instead, the @command{awk} output is
@@ -9380,7 +9387,7 @@ The unsorted list is written with an ordinary
redirection, while
the sorted list is written by piping through the @command{sort} utility.
The next example uses redirection to mail a message to the mailing
-list @samp{bug-system}. This might be useful when trouble is encountered
+list @code{bug-system}. This might be useful when trouble is encountered
in an @command{awk} script run periodically for system maintenance:
@example
@@ -9411,15 +9418,23 @@ This redirection prints the items to the input of
@var{command}.
The difference between this and the
address@hidden|} redirection is that the output from @var{command}
can be read with @code{getline}.
-Thus @var{command} is a @dfn{coprocess}, which works together with,
-but subsidiary to, the @command{awk} program.
+Thus, @var{command} is a @dfn{coprocess}, which works together with
+but is subsidiary to the @command{awk} program.
This feature is a @command{gawk} extension, and is not available in
POSIX @command{awk}.
address@hidden/Coprocess}
address@hidden
address@hidden/Coprocess},
for a brief discussion.
address@hidden I/O}
address@hidden I/O},
for a more complete discussion.
address@hidden ifnotdocbook
address@hidden
address@hidden/Coprocess}
+for a brief discussion and
address@hidden I/O}
+for a more complete discussion.
address@hidden ifdocbook
@end table
Redirecting output using @samp{>}, @samp{>>}, @samp{|}, or @samp{|&}
@@ -9444,7 +9459,7 @@ This is indeed how redirections must be used from the
shell. But in
@command{awk}, it isn't necessary. In this kind of case, a program should
use @samp{>} for all the @code{print} statements, because the output file
is only opened once. (It happens that if you mix @samp{>} and @samp{>>}
-that output is produced in the expected order. However, mixing the operators
+output is produced in the expected order. However, mixing the operators
for the same file is definitely poor style, and is confusing to readers
of your program.)
@@ -9496,7 +9511,7 @@ command lines to be fed to the shell.
@end sidebar
@node Special FD
-@section Special Files for Standard Pre-Opened Data Streams
+@section Special Files for Standard Preopened Data Streams
@cindex standard input
@cindex input, standard
@cindex standard output
@@ -9509,7 +9524,7 @@ command lines to be fed to the shell.
Running programs conventionally have three input and output streams
already available to them for reading and writing. These are known
as the @dfn{standard input}, @dfn{standard output}, and @dfn{standard
-error output}. These open streams (and any other open file or pipe)
+error output}. These open streams (and any other open files or pipes)
are often referred to by the technical term @dfn{file descriptors}.
These streams are, by default, connected to your keyboard and screen, but
@@ -9547,7 +9562,7 @@ that is connected to your keyboard and screen. It
represents the
``terminal,''@footnote{The ``tty'' in @file{/dev/tty} stands for
``Teletype,'' a serial terminal.} which on modern systems is a keyboard
and screen, not a serial console.)
-This generally has the same effect but not always: although the
+This generally has the same effect, but not always: although the
standard error stream is usually the screen, it can be redirected; when
that happens, writing to the screen is not correct. In fact, if
@command{awk} is run from a background job, it may not have a
@@ -9592,7 +9607,7 @@ print "Serious error detected!" > "/dev/stderr"
@cindex troubleshooting, quotes with file names
Note the use of quotes around the @value{FN}.
-Like any other redirection, the value must be a string.
+Like with any other redirection, the value must be a string.
It is a common error to omit the quotes, which leads
to confusing results.
@@ -9618,7 +9633,7 @@ TCP/IP networking.
@end menu
@node Other Inherited Files
-@subsection Accessing Other Open Files With @command{gawk}
+@subsection Accessing Other Open Files with @command{gawk}
Besides the @code{/dev/stdin}, @code{/dev/stdout}, and @code{/dev/stderr}
special @value{FN}s mentioned earlier, @command{gawk} provides syntax
@@ -9675,7 +9690,7 @@ special @value{FN}s that @command{gawk} provides:
@cindex compatibility mode (@command{gawk}), file names
@cindex file names, in compatibility mode
@item
-Recognition of the @value{FN}s for the three standard pre-opened
+Recognition of the @value{FN}s for the three standard preopened
files is disabled only in POSIX mode.
@item
@@ -9688,7 +9703,7 @@ compatibility mode (either @option{--traditional} or
@option{--posix};
interprets these special @value{FN}s.
For example, using @samp{/dev/fd/4}
for output actually writes on file descriptor 4, and not on a new
-file descriptor that is @code{dup()}'ed from file descriptor 4. Most of
+file descriptor that is @code{dup()}ed from file descriptor 4. Most of
the time this does not matter; however, it is important to @emph{not}
close any of the files related to file descriptors 0, 1, and 2.
Doing so results in unpredictable behavior.
@@ -9905,9 +9920,9 @@ This value is zero if the close succeeds, or @minus{}1 if
it fails.
The POSIX standard is very vague; it says that @code{close()}
-returns zero on success and nonzero otherwise. In general,
+returns zero on success and a nonzero value otherwise. In general,
different implementations vary in what they report when closing
-pipes; thus the return value cannot be used portably.
+pipes; thus, the return value cannot be used portably.
@value{DARKCORNER}
In POSIX mode (@pxref{Options}), @command{gawk} just returns zero
when closing a pipe.
@@ -9926,8 +9941,8 @@ for numeric values for the @code{print} statement.
@item
The @code{printf} statement provides finer-grained control over output,
-with format control letters for different data types and various flags
-that modify the behavior of the format control letters.
+with format-control letters for different data types and various flags
+that modify the behavior of the format-control letters.
@item
Output from both @code{print} and @code{printf} may be redirected to
@@ -37285,7 +37300,7 @@ To get @command{awka}, go to
@url{http://sourceforge.net/projects/awka}.
@c andrewsumner@@yahoo.net
The project seems to be frozen; no new code changes have been made
-since approximately 2003.
+since approximately 2001.
@cindex Beebe, Nelson H.F.@:
@cindex @command{pawk} (profiling version of Brian Kernighan's @command{awk})
@@ -37563,7 +37578,7 @@ for information on getting the latest version of
@command{gawk}.)
@item
@ifnotinfo
-Follow the @uref{http://www.gnu.org/prep/standards/, @cite{GNU Coding Standards}}.
+Follow the @cite{GNU Coding Standards}.
@end ifnotinfo
@ifinfo
See @inforef{Top, , Version, standards, GNU Coding Standards}.
@@ -37572,7 +37587,7 @@ This document describes how GNU software should be
written. If you haven't
read it, please do so, preferably @emph{before} starting to modify
@command{gawk}.
(The @cite{GNU Coding Standards} are available from
the GNU Project's
-@uref{http://www.gnu.org/prep/standards_toc.html, website}.
+@uref{http://www.gnu.org/prep/standards/, website}.
Texinfo, Info, and DVI versions are also available.)
@cindex @command{gawk}, coding style in
-----------------------------------------------------------------------
Summary of changes:
doc/ChangeLog | 4 +
doc/gawk.info | 1289 ++++++++++++++++++++++++++++---------------------------
doc/gawk.texi | 232 ++++++-----
doc/gawktexi.in | 215 +++++-----
4 files changed, 889 insertions(+), 851 deletions(-)
hooks/post-receive
--
gawk