[gawk-diffs] [SCM] gawk branch, master, updated. 6ffa69e5703cd9453a8adfb
From: Arnold Robbins
Subject: [gawk-diffs] [SCM] gawk branch, master, updated. 6ffa69e5703cd9453a8adfb8ad61f3171f615f46
Date: Tue, 16 Apr 2013 09:06:07 +0000
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "gawk".
The branch, master has been updated
via 6ffa69e5703cd9453a8adfb8ad61f3171f615f46 (commit)
via f9ff7dc0b9dd7de3a1f46de3b3aed8583c9ed474 (commit)
via a7da113d7a5918bee47504ed6564988a9212eb9b (commit)
via eb6c4b9c94f0c537e1eeb96356bb59361f578c5c (commit)
via 9efe2646f669379e0a2484ea7e7fa3ae2911e06e (commit)
via 12064c638d18f30bd8fdb9d3261a49684ec7bdc8 (commit)
via a750e1f81cb2b153d5e9de5fef03737ab84fdee1 (commit)
via 07ec66899460f3a0439dfc6a3c0fd1e12afdb46a (commit)
via a679c239ef762a2e4ecfd977b803face0c987e57 (commit)
via 34b9e9e666c79e4c42a59d0b7b7584a0620295f0 (commit)
from abbe62c9521a1ab5c17dd118e521d06c899a1720 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=6ffa69e5703cd9453a8adfb8ad61f3171f615f46
commit 6ffa69e5703cd9453a8adfb8ad61f3171f615f46
Author: Arnold D. Robbins <address@hidden>
Date: Tue Apr 16 12:05:30 2013 +0300
Update copyrights in all relevant source files.
diff --git a/ChangeLog b/ChangeLog
index 9ebb04a..7c93185 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -4,7 +4,11 @@
* command.c: Ditto.
* dfa.h, dfa.c: Minor edits to sync with GNU grep.
* gettext.h: Sync with gettext 0.18.2.1.
- * random.h: Remove obsolete __P macro and use. Update copyright.
+ * random.h: Remove obsolete __P macro and use. Update copyright year.
+ * Makefile.am, array.c, builtin.c, cint_array.c, cmd.h, debug.c,
+ eval.c, ext.c, field.c, gawkapi.c, gawkapi.h, gettext.h, int_array.c,
+ interpret.h, msg.c, node.c, profile.c, re.c, replace.c, str_array.c,
+ symbol.c: Update copyright year.
2013-04-14 Arnold D. Robbins <address@hidden>
diff --git a/Makefile.am b/Makefile.am
index 1f1929a..8d977d7 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -1,7 +1,7 @@
#
# Makefile.am --- automake input file for gawk
#
-# Copyright (C) 2000-2011 the Free Software Foundation, Inc.
+# Copyright (C) 2000-2013 the Free Software Foundation, Inc.
#
# This file is part of GAWK, the GNU implementation of the
# AWK Programming Language.
diff --git a/Makefile.in b/Makefile.in
index 3b9963d..8a4cea9 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -17,7 +17,7 @@
#
# Makefile.am --- automake input file for gawk
#
-# Copyright (C) 2000-2011 the Free Software Foundation, Inc.
+# Copyright (C) 2000-2013 the Free Software Foundation, Inc.
#
# This file is part of GAWK, the GNU implementation of the
# AWK Programming Language.
diff --git a/array.c b/array.c
index 1953bfe..5dac7a4 100644
--- a/array.c
+++ b/array.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1986, 1988, 1989, 1991-2011 the Free Software Foundation, Inc.
+ * Copyright (C) 1986, 1988, 1989, 1991-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/builtin.c b/builtin.c
index 7327212..ba1d8dc 100644
--- a/builtin.c
+++ b/builtin.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1986, 1988, 1989, 1991-2012 the Free Software Foundation, Inc.
+ * Copyright (C) 1986, 1988, 1989, 1991-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/cint_array.c b/cint_array.c
index 29b6fdf..1d34c2f 100644
--- a/cint_array.c
+++ b/cint_array.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1986, 1988, 1989, 1991-2011 the Free Software Foundation, Inc.
+ * Copyright (C) 1986, 1988, 1989, 1991-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/cmd.h b/cmd.h
index b321863..5a5fd29 100644
--- a/cmd.h
+++ b/cmd.h
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 2004, 2010, 2011 the Free Software Foundation, Inc.
+ * Copyright (C) 2004, 2010, 2011, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/debug.c b/debug.c
index a69b7e3..d60164a 100644
--- a/debug.c
+++ b/debug.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 2004, 2010, 2011 the Free Software Foundation, Inc.
+ * Copyright (C) 2004, 2010-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/eval.c b/eval.c
index afb6a45..4965988 100644
--- a/eval.c
+++ b/eval.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1986, 1988, 1989, 1991-2011 the Free Software Foundation, Inc.
+ * Copyright (C) 1986, 1988, 1989, 1991-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/ext.c b/ext.c
index 98b7381..9e17761 100644
--- a/ext.c
+++ b/ext.c
@@ -7,7 +7,7 @@
*/
/*
- * Copyright (C) 1995 - 2001, 2003-2012 the Free Software Foundation, Inc.
+ * Copyright (C) 1995 - 2001, 2003-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
@@ -149,7 +149,7 @@ do_ext(int nargs)
return ret;
}
-/* load_ext --- load an external library */
+/* load_old_ext --- load an external library */
NODE *
load_old_ext(SRCFILE *s, const char *init_func, const char *fini_func, NODE *obj)
diff --git a/extension/ChangeLog b/extension/ChangeLog
index ef1aa4d..be5b531 100644
--- a/extension/ChangeLog
+++ b/extension/ChangeLog
@@ -1,3 +1,9 @@
+2013-04-16 Arnold D. Robbins <address@hidden>
+
+ * filefuncs.c, fnmatch.c, fork.c, ordchr.c, readdir.c, readfile.c,
+ revoutput.c, revtwoway.c, rwarray.c, rwarray0.c, stack.c, stack.h,
+ testext.c, time.c: Update copyright year.
+
2013-03-24 Arnold D. Robbins <address@hidden>
* gawkdirfd.h: Improve test for doing own dirfd function. Needed
diff --git a/extension/filefuncs.c b/extension/filefuncs.c
index 579b408..1e8fc8d 100644
--- a/extension/filefuncs.c
+++ b/extension/filefuncs.c
@@ -9,7 +9,7 @@
*/
/*
- * Copyright (C) 2001, 2004, 2005, 2010, 2011, 2012
+ * Copyright (C) 2001, 2004, 2005, 2010, 2011, 2012, 2013
* the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
diff --git a/extension/fnmatch.c b/extension/fnmatch.c
index 7f8ab8d..a67bc25 100644
--- a/extension/fnmatch.c
+++ b/extension/fnmatch.c
@@ -7,7 +7,7 @@
*/
/*
- * Copyright (C) 2012 the Free Software Foundation, Inc.
+ * Copyright (C) 2012, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/extension/fork.c b/extension/fork.c
index 6f96e4b..0ca0a0e 100644
--- a/extension/fork.c
+++ b/extension/fork.c
@@ -6,7 +6,7 @@
*/
/*
- * Copyright (C) 2001, 2004, 2011, 2012 the Free Software Foundation, Inc.
+ * Copyright (C) 2001, 2004, 2011, 2012, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/extension/ordchr.c b/extension/ordchr.c
index 7e3eda5..8ec9de3 100644
--- a/extension/ordchr.c
+++ b/extension/ordchr.c
@@ -9,7 +9,7 @@
*/
/*
- * Copyright (C) 2001, 2004, 2011, 2012 the Free Software Foundation, Inc.
+ * Copyright (C) 2001, 2004, 2011, 2012, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/extension/readdir.c b/extension/readdir.c
index d9a9b36..5ca4dc6 100644
--- a/extension/readdir.c
+++ b/extension/readdir.c
@@ -10,7 +10,7 @@
*/
/*
- * Copyright (C) 2012 the Free Software Foundation, Inc.
+ * Copyright (C) 2012, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/extension/readfile.c b/extension/readfile.c
index 3e26a7e..06889c3 100644
--- a/extension/readfile.c
+++ b/extension/readfile.c
@@ -11,7 +11,8 @@
*/
/*
- * Copyright (C) 2002, 2003, 2004, 2011, 2012 the Free Software Foundation, Inc.
+ * Copyright (C) 2002, 2003, 2004, 2011, 2012, 2013
+ * the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/extension/revoutput.c b/extension/revoutput.c
index 0536627..ae4b444 100644
--- a/extension/revoutput.c
+++ b/extension/revoutput.c
@@ -7,7 +7,7 @@
*/
/*
- * Copyright (C) 2012 the Free Software Foundation, Inc.
+ * Copyright (C) 2012, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/extension/revtwoway.c b/extension/revtwoway.c
index 062a178..6e5bb71 100644
--- a/extension/revtwoway.c
+++ b/extension/revtwoway.c
@@ -7,7 +7,7 @@
*/
/*
- * Copyright (C) 2012 the Free Software Foundation, Inc.
+ * Copyright (C) 2012, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/extension/rwarray.c b/extension/rwarray.c
index cf76f3f..d7b26c4 100644
--- a/extension/rwarray.c
+++ b/extension/rwarray.c
@@ -7,7 +7,7 @@
*/
/*
- * Copyright (C) 2009, 2010, 2011, 2012 the Free Software Foundation, Inc.
+ * Copyright (C) 2009, 2010, 2011, 2012, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/extension/rwarray0.c b/extension/rwarray0.c
index 353fb15..e2de3cf 100644
--- a/extension/rwarray0.c
+++ b/extension/rwarray0.c
@@ -7,7 +7,7 @@
*/
/*
- * Copyright (C) 2009, 2010, 2011, 2012 the Free Software Foundation, Inc.
+ * Copyright (C) 2009, 2010, 2011, 2012, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/extension/stack.c b/extension/stack.c
index ec994c6..6150442 100644
--- a/extension/stack.c
+++ b/extension/stack.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 2012 the Free Software Foundation, Inc.
+ * Copyright (C) 2012, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/extension/stack.h b/extension/stack.h
index 8fc06e7..9643fb3 100644
--- a/extension/stack.h
+++ b/extension/stack.h
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 2012 the Free Software Foundation, Inc.
+ * Copyright (C) 2012, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/extension/testext.c b/extension/testext.c
index f7bf08a..df15957 100644
--- a/extension/testext.c
+++ b/extension/testext.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 2012
+ * Copyright (C) 2012, 2013
* the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
diff --git a/extension/time.c b/extension/time.c
index dcafb8f..cf39ccc 100644
--- a/extension/time.c
+++ b/extension/time.c
@@ -4,7 +4,7 @@
*/
/*
- * Copyright (C) 2012
+ * Copyright (C) 2012, 2013
* the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
diff --git a/field.c b/field.c
index 3edd5d8..3cd6606 100644
--- a/field.c
+++ b/field.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1986, 1988, 1989, 1991-2011 the Free Software Foundation, Inc.
+ * Copyright (C) 1986, 1988, 1989, 1991-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/gawkapi.c b/gawkapi.c
index d17ee42..61f91e8 100644
--- a/gawkapi.c
+++ b/gawkapi.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 2012, the Free Software Foundation, Inc.
+ * Copyright (C) 2012, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/gawkapi.h b/gawkapi.h
index b2e5ed5..7d37cee 100644
--- a/gawkapi.h
+++ b/gawkapi.h
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 2012, the Free Software Foundation, Inc.
+ * Copyright (C) 2012, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/gettext.h b/gettext.h
index 2976a51..38b94c4 100644
--- a/gettext.h
+++ b/gettext.h
@@ -1,5 +1,5 @@
/* Convenience header for conditional use of GNU <libintl.h>.
- Copyright (C) 1995-1998, 2000-2002, 2004-2006, 2009-2011 Free Software Foundation, Inc.
+ Copyright (C) 1995-1998, 2000-2002, 2004-2006, 2009-2013 Free Software Foundation, Inc.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
diff --git a/int_array.c b/int_array.c
index 769ac9b..c2bf37b 100644
--- a/int_array.c
+++ b/int_array.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1986, 1988, 1989, 1991-2011 the Free Software Foundation, Inc.
+ * Copyright (C) 1986, 1988, 1989, 1991-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/interpret.h b/interpret.h
index 8e0fdee..ba70cf0 100644
--- a/interpret.h
+++ b/interpret.h
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1986, 1988, 1989, 1991-2012 the Free Software Foundation, Inc.
+ * Copyright (C) 1986, 1988, 1989, 1991-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/msg.c b/msg.c
index c0bf38a..edacdd1 100644
--- a/msg.c
+++ b/msg.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1986, 1988, 1989, 1991-2001, 2003, 2010
+ * Copyright (C) 1986, 1988, 1989, 1991-2001, 2003, 2010-2013
* the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
diff --git a/node.c b/node.c
index 02c78ae..1c89634 100644
--- a/node.c
+++ b/node.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1986, 1988, 1989, 1991-2001, 2003-2011,
+ * Copyright (C) 1986, 1988, 1989, 1991-2001, 2003-2013,
* the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
diff --git a/profile.c b/profile.c
index 8c5f3b7..435cad1 100644
--- a/profile.c
+++ b/profile.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1999-2011 the Free Software Foundation, Inc.
+ * Copyright (C) 1999-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/re.c b/re.c
index e549291..a4891cb 100644
--- a/re.c
+++ b/re.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1991-2012 the Free Software Foundation, Inc.
+ * Copyright (C) 1991-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/replace.c b/replace.c
index 4259aaf..559de01 100644
--- a/replace.c
+++ b/replace.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1989, 1991-2011 the Free Software Foundation, Inc.
+ * Copyright (C) 1989, 1991-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/str_array.c b/str_array.c
index e5b3b40..aa82d71 100644
--- a/str_array.c
+++ b/str_array.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1986, 1988, 1989, 1991-2011 the Free Software Foundation, Inc.
+ * Copyright (C) 1986, 1988, 1989, 1991-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
diff --git a/symbol.c b/symbol.c
index 354bfca..2b5e2bb 100644
--- a/symbol.c
+++ b/symbol.c
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1986, 1988, 1989, 1991-2011 the Free Software Foundation, Inc.
+ * Copyright (C) 1986, 1988, 1989, 1991-2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=f9ff7dc0b9dd7de3a1f46de3b3aed8583c9ed474
commit f9ff7dc0b9dd7de3a1f46de3b3aed8583c9ed474
Author: Arnold D. Robbins <address@hidden>
Date: Tue Apr 16 11:51:18 2013 +0300
Rebuild command.c also.
diff --git a/ChangeLog b/ChangeLog
index fd94905..9ebb04a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,6 +1,7 @@
2013-04-16 Arnold D. Robbins <address@hidden>
* awkgram.c: Regenerated from bison 2.7.1.
+ * command.c: Ditto.
* dfa.h, dfa.c: Minor edits to sync with GNU grep.
* gettext.h: Sync with gettext 0.18.2.1.
* random.h: Remove obsolete __P macro and use. Update copyright.
diff --git a/command.c b/command.c
index 9b07fd3..d170e4c 100644
--- a/command.c
+++ b/command.c
@@ -1,8 +1,8 @@
-/* A Bison parser, made by GNU Bison 2.7. */
+/* A Bison parser, made by GNU Bison 2.7.12-4996. */
/* Bison implementation for Yacc-like parsers in C
- Copyright (C) 1984, 1989-1990, 2000-2012 Free Software Foundation, Inc.
+ Copyright (C) 1984, 1989-1990, 2000-2013 Free Software Foundation, Inc.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@@ -44,7 +44,7 @@
#define YYBISON 1
/* Bison version. */
-#define YYBISON_VERSION "2.7"
+#define YYBISON_VERSION "2.7.12-4996"
/* Skeleton name. */
#define YYSKELETON_NAME "yacc.c"
@@ -358,6 +358,14 @@ typedef short int yytype_int16;
# endif
#endif
+#ifndef __attribute__
+/* This feature is available in gcc versions 2.5 and later. */
+# if (! defined __GNUC__ || __GNUC__ < 2 \
+ || (__GNUC__ == 2 && __GNUC_MINOR__ < 5))
+# define __attribute__(Spec) /* empty */
+# endif
+#endif
+
/* Suppress unused-variable warnings by "using" E. */
#if ! defined lint || defined __GNUC__
# define YYUSE(E) ((void) (E))
@@ -365,6 +373,7 @@ typedef short int yytype_int16;
# define YYUSE(E) /* empty */
#endif
+
/* Identity function, used to suppress warnings about constant conditions. */
#ifndef lint
# define YYID(N) (N)
@@ -1024,11 +1033,7 @@ yy_symbol_value_print (yyoutput, yytype, yyvaluep)
# else
YYUSE (yyoutput);
# endif
- switch (yytype)
- {
- default:
- break;
- }
+ YYUSE (yytype);
}
@@ -1418,12 +1423,7 @@ yydestruct (yymsg, yytype, yyvaluep)
yymsg = "Deleting";
YY_SYMBOL_PRINT (yymsg, yytype, yyvaluep, yylocationp);
- switch (yytype)
- {
-
- default:
- break;
- }
+ YYUSE (yytype);
}
@@ -1707,7 +1707,7 @@ yyreduce:
switch (yyn)
{
case 3:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 109 "command.y"
{
cmd_idx = -1;
@@ -1726,7 +1726,7 @@ yyreduce:
break;
case 5:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 128 "command.y"
{
if (errcount == 0 && cmd_idx >= 0) {
@@ -1780,7 +1780,7 @@ yyreduce:
break;
case 6:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 178 "command.y"
{
yyerrok;
@@ -1788,13 +1788,13 @@ yyreduce:
break;
case 22:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 212 "command.y"
{ want_nodeval = true; }
break;
case 23:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 217 "command.y"
{
if (errcount == 0) {
@@ -1814,7 +1814,7 @@ yyreduce:
break;
case 24:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 236 "command.y"
{
(yyval) = append_statement(arg_list, (char *) start_EVAL);
@@ -1826,13 +1826,13 @@ yyreduce:
break;
case 25:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 243 "command.y"
{ (yyval) = append_statement((yyvsp[(1) - (2)]), lexptr_begin); }
break;
case 26:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 244 "command.y"
{
(yyval) = (yyvsp[(3) - (4)]);
@@ -1840,7 +1840,7 @@ yyreduce:
break;
case 27:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 251 "command.y"
{
arg_list = append_statement((yyvsp[(2) - (3)]), (char *) end_EVAL);
@@ -1860,7 +1860,7 @@ yyreduce:
break;
case 28:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 267 "command.y"
{
NODE *n;
@@ -1875,7 +1875,7 @@ yyreduce:
break;
case 34:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 286 "command.y"
{
if (cmdtab[cmd_idx].class == D_FRAME
@@ -1885,7 +1885,7 @@ yyreduce:
break;
case 35:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 292 "command.y"
{
int idx = find_argument((yyvsp[(2) - (2)]));
@@ -1901,43 +1901,43 @@ yyreduce:
break;
case 38:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 305 "command.y"
{ want_nodeval = true; }
break;
case 40:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 306 "command.y"
{ want_nodeval = true; }
break;
case 46:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 311 "command.y"
{ want_nodeval = true; }
break;
case 49:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 313 "command.y"
{ want_nodeval = true; }
break;
case 51:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 314 "command.y"
{ want_nodeval = true; }
break;
case 53:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 315 "command.y"
{ want_nodeval = true; }
break;
case 57:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 319 "command.y"
{
if (in_cmd_src((yyvsp[(2) - (2)])->a_string))
@@ -1946,7 +1946,7 @@ yyreduce:
break;
case 58:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 324 "command.y"
{
if (! input_from_tty)
@@ -1955,7 +1955,7 @@ yyreduce:
break;
case 59:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 329 "command.y"
{
int type = 0;
@@ -1985,7 +1985,7 @@ yyreduce:
break;
case 60:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 355 "command.y"
{
if (! in_commands)
@@ -1999,7 +1999,7 @@ yyreduce:
break;
case 61:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 365 "command.y"
{
if (! in_commands)
@@ -2008,7 +2008,7 @@ yyreduce:
break;
case 62:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 370 "command.y"
{
int idx = find_argument((yyvsp[(2) - (2)]));
@@ -2024,13 +2024,13 @@ yyreduce:
break;
case 63:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 381 "command.y"
{ want_nodeval = true; }
break;
case 64:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 382 "command.y"
{
int type;
@@ -2042,7 +2042,7 @@ yyreduce:
break;
case 65:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 390 "command.y"
{
if (in_commands) {
@@ -2057,7 +2057,7 @@ yyreduce:
break;
case 66:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 404 "command.y"
{
if ((yyvsp[(1) - (1)]) != NULL) {
@@ -2071,37 +2071,37 @@ yyreduce:
break;
case 68:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 418 "command.y"
{ (yyval) = NULL; }
break;
case 69:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 423 "command.y"
{ (yyval) = NULL; }
break;
case 74:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 432 "command.y"
{ (yyval) = NULL; }
break;
case 75:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 437 "command.y"
{ (yyval) = NULL; }
break;
case 77:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 440 "command.y"
{ (yyval) = NULL; }
break;
case 78:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 445 "command.y"
{
NODE *n;
@@ -2112,13 +2112,13 @@ yyreduce:
break;
case 79:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 455 "command.y"
{ (yyval) = NULL; }
break;
case 80:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 457 "command.y"
{
if (find_option((yyvsp[(1) - (1)])->a_string) < 0)
@@ -2127,7 +2127,7 @@ yyreduce:
break;
case 81:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 462 "command.y"
{
if (find_option((yyvsp[(1) - (3)])->a_string) < 0)
@@ -2136,7 +2136,7 @@ yyreduce:
break;
case 82:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 470 "command.y"
{
NODE *n;
@@ -2153,49 +2153,49 @@ yyreduce:
break;
case 83:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 486 "command.y"
{ (yyval) = NULL; }
break;
case 88:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 495 "command.y"
{ (yyval) = NULL; }
break;
case 89:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 496 "command.y"
{ want_nodeval = true; }
break;
case 92:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 498 "command.y"
{ want_nodeval = true; }
break;
case 95:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 504 "command.y"
{ (yyval) = NULL; }
break;
case 97:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 510 "command.y"
{ (yyval) = NULL; }
break;
case 99:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 516 "command.y"
{ (yyval) = NULL; }
break;
case 104:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 528 "command.y"
{
int idx = find_argument((yyvsp[(1) - (2)]));
@@ -2211,7 +2211,7 @@ yyreduce:
break;
case 106:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 544 "command.y"
{
(yyvsp[(2) - (2)])->type = D_array; /* dump all items */
@@ -2220,7 +2220,7 @@ yyreduce:
break;
case 107:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 549 "command.y"
{
(yyvsp[(2) - (3)])->type = D_array;
@@ -2229,19 +2229,19 @@ yyreduce:
break;
case 117:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 575 "command.y"
{ (yyval) = NULL; }
break;
case 118:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 577 "command.y"
{ (yyval) = NULL; }
break;
case 119:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 579 "command.y"
{
CMDARG *a;
@@ -2252,7 +2252,7 @@ yyreduce:
break;
case 126:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 595 "command.y"
{
if ((yyvsp[(1) - (3)])->a_int > (yyvsp[(3) - (3)])->a_int)
@@ -2265,25 +2265,25 @@ yyreduce:
break;
case 127:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 607 "command.y"
{ (yyval) = NULL; }
break;
case 134:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 621 "command.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 135:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 623 "command.y"
{ (yyval) = (yyvsp[(1) - (3)]); }
break;
case 137:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 629 "command.y"
{
CMDARG *a;
@@ -2302,19 +2302,19 @@ yyreduce:
break;
case 139:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 648 "command.y"
{ (yyval) = (yyvsp[(1) - (1)]); num_dim = 1; }
break;
case 140:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 650 "command.y"
{ (yyval) = (yyvsp[(1) - (2)]); num_dim++; }
break;
case 142:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 656 "command.y"
{
NODE *n = (yyvsp[(2) - (2)])->a_node;
@@ -2327,7 +2327,7 @@ yyreduce:
break;
case 143:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 665 "command.y"
{
/* a_string is array name, a_count is dimension count */
@@ -2338,13 +2338,13 @@ yyreduce:
break;
case 144:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 675 "command.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 145:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 677 "command.y"
{
NODE *n = (yyvsp[(2) - (2)])->a_node;
@@ -2355,7 +2355,7 @@ yyreduce:
break;
case 146:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 684 "command.y"
{
NODE *n = (yyvsp[(2) - (2)])->a_node;
@@ -2368,31 +2368,31 @@ yyreduce:
break;
case 147:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 696 "command.y"
{ (yyval) = NULL; }
break;
case 148:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 698 "command.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 149:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 703 "command.y"
{ (yyval) = NULL; }
break;
case 150:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 705 "command.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 151:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 710 "command.y"
{
if ((yyvsp[(1) - (1)])->a_int == 0)
@@ -2402,7 +2402,7 @@ yyreduce:
break;
case 152:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 716 "command.y"
{
if ((yyvsp[(2) - (2)])->a_int == 0)
@@ -2412,19 +2412,19 @@ yyreduce:
break;
case 153:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 725 "command.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 154:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 727 "command.y"
{ (yyval) = (yyvsp[(2) - (2)]); }
break;
case 155:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 729 "command.y"
{
(yyvsp[(2) - (2)])->a_int = - (yyvsp[(2) - (2)])->a_int;
@@ -2433,7 +2433,7 @@ yyreduce:
break;
case 156:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 737 "command.y"
{
if (lexptr_begin != NULL) {
@@ -2446,7 +2446,7 @@ yyreduce:
break;
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 2451 "command.c"
default: break;
}
@@ -2678,7 +2678,7 @@ yyreturn:
}
-/* Line 2055 of yacc.c */
+/* Line 2050 of yacc.c */
#line 747 "command.y"
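Note on the regenerated parser above: the empty `switch (yytype)` blocks are replaced by `YYUSE (yytype)`, which marks a parameter as used so the compiler does not warn about it. The following is only a minimal sketch of that idiom. The names `DEMO_USE`, `sum_ignoring_third` and `demo_use_check` are hypothetical and appear in neither Bison nor gawk.

```c
#include <assert.h>

/*
 * Evaluate E and discard the result, so the compiler sees a use of E.
 * DEMO_USE is an illustrative stand-in for the YYUSE macro shown in
 * the diff; it is not taken from Bison.
 */
#define DEMO_USE(E) ((void) (E))

/* Hypothetical function whose third parameter is deliberately ignored. */
int
sum_ignoring_third (int a, int b, int unused)
{
	DEMO_USE (unused);	/* no unused-parameter warning */
	return a + b;
}

/* Small self-check; the third argument has no effect on the result. */
void
demo_use_check (void)
{
	assert (sum_ignoring_third (2, 3, 99) == 5);
}
```

The cast to `void` generates no code, so it probably costs nothing at run time compared with the old empty `switch`.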
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=a7da113d7a5918bee47504ed6564988a9212eb9b
commit a7da113d7a5918bee47504ed6564988a9212eb9b
Author: Arnold D. Robbins <address@hidden>
Date: Tue Apr 16 11:49:35 2013 +0300
Modernize random.h some.
diff --git a/ChangeLog b/ChangeLog
index fb31b05..fd94905 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -3,6 +3,7 @@
* awkgram.c: Regenerated from bison 2.7.1.
* dfa.h, dfa.c: Minor edits to sync with GNU grep.
* gettext.h: Sync with gettext 0.18.2.1.
+ * random.h: Remove obsolete __P macro and use. Update copyright.
2013-04-14 Arnold D. Robbins <address@hidden>
@@ -66,7 +67,7 @@
2013-02-26 Arnold D. Robbins <address@hidden>
- * parse.y (expression_list): In case of error return the list
+ * awkgram.y (expression_list): In case of error return the list
instead of NULL so that snode gets something it can count.
2013-02-12 Arnold D. Robbins <address@hidden>
diff --git a/random.h b/random.h
index 626cbcd..d4c6ef1 100644
--- a/random.h
+++ b/random.h
@@ -3,7 +3,7 @@
*/
/*
- * Copyright (C) 1996, 2001, 2004, 2005 the Free Software Foundation, Inc.
+ * Copyright (C) 1996, 2001, 2004, 2005, 2013 the Free Software Foundation, Inc.
*
* This file is part of GAWK, the GNU implementation of the
* AWK Programming Language.
@@ -40,11 +40,4 @@ typedef long gawk_int32_t;
#define uint32_t gawk_uint32_t
#define int32_t gawk_int32_t
-#ifdef __STDC__
-#undef __P
-#define __P(s) s
-#else
-#define __P(s) ()
-#endif
-
-extern long random __P((void));
+extern long random (void);
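Note on the `__P` removal above: `__P` was a compatibility idiom that let one header serve both ANSI C compilers, which accept prototypes, and pre-ANSI compilers, which do not. The sketch below reproduces the idiom under stated assumptions. `DEMO_P`, `demo_random` and `demo_p_check` are hypothetical names chosen for illustration (names starting with a double underscore are reserved for the implementation), and the fixed return value is for demonstration only.

```c
#include <assert.h>

/*
 * Sketch of the pre-ANSI prototype idiom. DEMO_P and demo_random are
 * illustrative names; they are not part of gawk.
 */
#ifdef __STDC__
#define DEMO_P(s) s	/* ANSI compiler: keep the parameter list */
#else
#define DEMO_P(s) ()	/* pre-ANSI compiler: drop the parameter list */
#endif

/* Old style: expands to "extern long demo_random (void);" under ANSI C. */
extern long demo_random DEMO_P((void));

/* New style, as in the patch: write the prototype directly. */
extern long demo_random (void);

long
demo_random (void)
{
	return 4L;	/* fixed value, for demonstration only */
}

/* Both declarations above refer to the same function. */
void
demo_p_check (void)
{
	assert (demo_random () == 4L);
}
```

With an ANSI C compiler the two declarations are identical after preprocessing, which is why the macro can be dropped without changing behavior.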
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=eb6c4b9c94f0c537e1eeb96356bb59361f578c5c
commit eb6c4b9c94f0c537e1eeb96356bb59361f578c5c
Author: Arnold D. Robbins <address@hidden>
Date: Tue Apr 16 11:28:19 2013 +0300
Prettify lists of tests updating in Makefile.am.
diff --git a/test/ChangeLog b/test/ChangeLog
index 0be3826..8f4caa0 100644
--- a/test/ChangeLog
+++ b/test/ChangeLog
@@ -1,3 +1,8 @@
+2013-04-16 Arnold D. Robbins <address@hidden>
+
+ * Makefile.am: Prettify the lists of tests.
+ (GENTESTS_UNUSED): Bring the list up to date.
+
2013-03-24 Arnold D. Robbins <address@hidden>
* Makefile.am (readdir): Add a check for GNU/Linux and NFS directory
diff --git a/test/Makefile.am b/test/Makefile.am
index cfd4632..d7988fa 100644
--- a/test/Makefile.am
+++ b/test/Makefile.am
@@ -938,9 +938,8 @@ BASIC_TESTS = \
paramdup paramres paramtyp paramuninitglobal parse1 parsefld parseme \
pcntplus posix2008sub prdupval prec printf0 printf1 prmarscl prmreuse \
prt1eval prtoeval \
- rand range1 rebt8b1 redfilnm regeq regexprange regrange \
- reindops reparse \
- resplit rri1 rs rsnul1nl rsnulbig rsnulbig2 rstest1 rstest2 \
+ rand range1 rebt8b1 redfilnm regeq regexprange regrange reindops \
+ reparse resplit rri1 rs rsnul1nl rsnulbig rsnulbig2 rstest1 rstest2 \
rstest3 rstest4 rstest5 rswhite \
scalar sclforin sclifin sortempty splitargv splitarr splitdef \
splitvar splitwht strcat1 strnum1 strtod subamp subi18n \
@@ -958,22 +957,20 @@ GAWK_EXT_TESTS = \
backw badargs beginfile1 beginfile2 binmode1 charasbytes \
colonwarn clos1way delsub devfd devfd1 devfd2 dumpvars exit \
fieldwdth fpat1 fpat2 fpat3 fpatnull fsfwfs funlen \
- functab1 functab2 functab3 \
- fwtest fwtest2 fwtest3 \
+ functab1 functab2 functab3 fwtest fwtest2 fwtest3 \
gensub gensub2 getlndir gnuops2 gnuops3 gnureops \
icasefs icasers id igncdym igncfs ignrcas2 ignrcase \
incdupe incdupe2 incdupe3 incdupe4 incdupe5 incdupe6 incdupe7 \
include include2 indirectcall \
- lint lintold lintwarn \
+ lint lintold lintwarn \
manyfiles match1 match2 match3 mbstr1 \
- nastyparm next nondec nondec2 \
+ nastyparm next nondec nondec2 \
patsplit posix printfbad1 printfbad2 printfbad3 procinfs \
profile1 profile2 profile3 pty1 \
rebuf regx8bit reginttrad reint reint2 rsstart1 \
rsstart2 rsstart3 rstest6 shadow sortfor sortu splitarg4 strftime \
- strtonum switch2 \
- symtab1 symtab2 symtab3 symtab4 symtab5 symtab6 symtab7 \
- symtab8 symtab9
+ strtonum switch2 symtab1 symtab2 symtab3 symtab4 symtab5 symtab6 \
+ symtab7 symtab8 symtab9
EXTRA_TESTS = inftest regtest
@@ -1008,7 +1005,9 @@ CHECK_MPFR = \
rand fnarydel fnparydl
# List of the files that appear in manual tests or are for reserve testing:
-GENTESTS_UNUSED = Makefile.in gtlnbufv.awk printfloat.awk inclib.awk hello.awk
+GENTESTS_UNUSED = Makefile.in dtdgport.awk gtlnbufv.awk hello.awk \
+ inchello.awk inclib.awk inplace.1.in inplace.2.in inplace.in \
+ longdbl.awk longdbl.in printfloat.awk readdir0.awk xref.awk
CMP = cmp
AWKPROG = ../gawk$(EXEEXT)
diff --git a/test/Makefile.in b/test/Makefile.in
index 2751ae3..3e9724a 100644
--- a/test/Makefile.in
+++ b/test/Makefile.in
@@ -1149,9 +1149,8 @@ BASIC_TESTS = \
paramdup paramres paramtyp paramuninitglobal parse1 parsefld parseme \
pcntplus posix2008sub prdupval prec printf0 printf1 prmarscl prmreuse \
prt1eval prtoeval \
- rand range1 rebt8b1 redfilnm regeq regexprange regrange \
- reindops reparse \
- resplit rri1 rs rsnul1nl rsnulbig rsnulbig2 rstest1 rstest2 \
+ rand range1 rebt8b1 redfilnm regeq regexprange regrange reindops \
+ reparse resplit rri1 rs rsnul1nl rsnulbig rsnulbig2 rstest1 rstest2 \
rstest3 rstest4 rstest5 rswhite \
scalar sclforin sclifin sortempty splitargv splitarr splitdef \
splitvar splitwht strcat1 strnum1 strtod subamp subi18n \
@@ -1169,22 +1168,20 @@ GAWK_EXT_TESTS = \
backw badargs beginfile1 beginfile2 binmode1 charasbytes \
colonwarn clos1way delsub devfd devfd1 devfd2 dumpvars exit \
fieldwdth fpat1 fpat2 fpat3 fpatnull fsfwfs funlen \
- functab1 functab2 functab3 \
- fwtest fwtest2 fwtest3 \
+ functab1 functab2 functab3 fwtest fwtest2 fwtest3 \
gensub gensub2 getlndir gnuops2 gnuops3 gnureops \
icasefs icasers id igncdym igncfs ignrcas2 ignrcase \
incdupe incdupe2 incdupe3 incdupe4 incdupe5 incdupe6 incdupe7 \
include include2 indirectcall \
- lint lintold lintwarn \
+ lint lintold lintwarn \
manyfiles match1 match2 match3 mbstr1 \
- nastyparm next nondec nondec2 \
+ nastyparm next nondec nondec2 \
patsplit posix printfbad1 printfbad2 printfbad3 procinfs \
profile1 profile2 profile3 pty1 \
rebuf regx8bit reginttrad reint reint2 rsstart1 \
rsstart2 rsstart3 rstest6 shadow sortfor sortu splitarg4 strftime \
- strtonum switch2 \
- symtab1 symtab2 symtab3 symtab4 symtab5 symtab6 symtab7 \
- symtab8 symtab9
+ strtonum switch2 symtab1 symtab2 symtab3 symtab4 symtab5 symtab6 \
+ symtab7 symtab8 symtab9
EXTRA_TESTS = inftest regtest
INET_TESTS = inetdayu inetdayt inetechu inetecht
@@ -1219,7 +1216,10 @@ CHECK_MPFR = \
# List of the files that appear in manual tests or are for reserve testing:
-GENTESTS_UNUSED = Makefile.in gtlnbufv.awk printfloat.awk inclib.awk hello.awk
+GENTESTS_UNUSED = Makefile.in dtdgport.awk gtlnbufv.awk hello.awk \
+ inchello.awk inclib.awk inplace.1.in inplace.2.in inplace.in \
+ longdbl.awk longdbl.in printfloat.awk readdir0.awk xref.awk
+
CMP = cmp
AWKPROG = ../gawk$(EXEEXT)
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=9efe2646f669379e0a2484ea7e7fa3ae2911e06e
commit 9efe2646f669379e0a2484ea7e7fa3ae2911e06e
Author: Arnold D. Robbins <address@hidden>
Date: Tue Apr 16 11:15:22 2013 +0300
Sync gettext.h with gettext 0.18.2.1.
diff --git a/ChangeLog b/ChangeLog
index d02fd74..fb31b05 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -2,6 +2,7 @@
* awkgram.c: Regenerated from bison 2.7.1.
* dfa.h, dfa.c: Minor edits to sync with GNU grep.
+ * gettext.h: Sync with gettext 0.18.2.1.
2013-04-14 Arnold D. Robbins <address@hidden>
diff --git a/gettext.h b/gettext.h
index 0c1a50e..2976a51 100644
--- a/gettext.h
+++ b/gettext.h
@@ -1,20 +1,18 @@
/* Convenience header for conditional use of GNU <libintl.h>.
- Copyright (C) 1995-1998, 2000-2002, 2004-2006, 2009 Free Software Foundation, Inc.
+ Copyright (C) 1995-1998, 2000-2002, 2004-2006, 2009-2011 Free Software Foundation, Inc.
- This program is free software; you can redistribute it and/or modify it
- under the terms of the GNU General Public License as published
- by the Free Software Foundation; either version 3, or (at your option)
- any later version.
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Library General Public License for more details.
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
- You should have received a copy of the GNU General Public
- License along with this program; if not, write to the Free Software
- Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
- USA. */
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
#ifndef _LIBGETTEXT_H
#define _LIBGETTEXT_H 1
@@ -62,7 +60,7 @@
it now, to make later inclusions of <libintl.h> a NOP. */
#if defined(__cplusplus) && defined(__GNUG__) && (__GNUC__ >= 3)
# include <cstdlib>
-# if (__GLIBC__ >= 2) || _GLIBCXX_HAVE_LIBINTL_H
+# if (__GLIBC__ >= 2 && !defined __UCLIBC__) || _GLIBCXX_HAVE_LIBINTL_H
# include <libintl.h>
# endif
#endif
@@ -90,7 +88,7 @@
((void) (Domainname), ngettext (Msgid1, Msgid2, N))
# undef dcngettext
# define dcngettext(Domainname, Msgid1, Msgid2, N, Category) \
- ((void) (Category), dngettext(Domainname, Msgid1, Msgid2, N))
+ ((void) (Category), dngettext (Domainname, Msgid1, Msgid2, N))
# undef textdomain
# define textdomain(Domainname) ((const char *) (Domainname))
# undef bindtextdomain
@@ -102,6 +100,12 @@
#endif
+/* Prefer gnulib's setlocale override over libintl's setlocale override. */
+#ifdef GNULIB_defined_setlocale
+# undef setlocale
+# define setlocale rpl_setlocale
+#endif
+
/* A pseudo function call that serves as a marker for the automated
extraction of messages, but does not call gettext(). The run-time
translation is done at a different place in the code.
@@ -187,9 +191,12 @@ npgettext_aux (const char *domain,
#include <string.h>
-#define _LIBGETTEXT_HAVE_VARIABLE_SIZE_ARRAYS \
- (((__GNUC__ >= 3 || __GNUG__ >= 2) && !__STRICT_ANSI__) \
- /* || __STDC_VERSION__ >= 199901L */ )
+#if (((__GNUC__ >= 3 || __GNUG__ >= 2) && !defined __STRICT_ANSI__) \
+ /* || __STDC_VERSION__ >= 199901L */ )
+# define _LIBGETTEXT_HAVE_VARIABLE_SIZE_ARRAYS 1
+#else
+# define _LIBGETTEXT_HAVE_VARIABLE_SIZE_ARRAYS 0
+#endif
#if !_LIBGETTEXT_HAVE_VARIABLE_SIZE_ARRAYS
#include <stdlib.h>
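The last hunk above replaces an expression-valued macro with one that is always defined as 0 or 1, so it can be tested with a plain `#if`. Below is a minimal sketch of that pattern. `DEMO_HAVE_VARIABLE_SIZE_ARRAYS`, `demo_key_length`, and the `'|'` separator are assumptions made for illustration, not names or values taken from gettext.h.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as the new gettext.h test: the macro is always 0 or 1. */
#if ((__GNUC__ >= 3 || __GNUG__ >= 2) && !defined __STRICT_ANSI__)
# define DEMO_HAVE_VARIABLE_SIZE_ARRAYS 1
#else
# define DEMO_HAVE_VARIABLE_SIZE_ARRAYS 0
#endif

/* Build "ctxt|id" in a temporary buffer and return its length.
   Returns 0 if the fallback allocation fails. */
size_t demo_key_length (const char *ctxt, const char *id)
{
    size_t ctxt_len = strlen (ctxt);
    size_t id_len = strlen (id);
    size_t total = ctxt_len + 1 + id_len + 1;   /* separator + NUL */
    size_t result;
#if DEMO_HAVE_VARIABLE_SIZE_ARRAYS
    char key[total];                            /* variable-size array */
#else
    char *key = (char *) malloc (total);        /* portable fallback */
    if (key == NULL)
        return 0;
#endif
    memcpy (key, ctxt, ctxt_len);
    key[ctxt_len] = '|';
    memcpy (key + ctxt_len + 1, id, id_len + 1);  /* copies the NUL too */
    result = strlen (key);
#if !DEMO_HAVE_VARIABLE_SIZE_ARRAYS
    free (key);
#endif
    return result;
}
```

For example, `demo_key_length ("menu", "Open")` returns 9 on either path: four characters, the separator, and four more.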
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=12064c638d18f30bd8fdb9d3261a49684ec7bdc8
commit 12064c638d18f30bd8fdb9d3261a49684ec7bdc8
Author: Arnold D. Robbins <address@hidden>
Date: Tue Apr 16 11:08:35 2013 +0300
Update TODO.
diff --git a/TODO b/TODO
index 3423388..c76bf8c 100644
--- a/TODO
+++ b/TODO
@@ -1,4 +1,4 @@
-Tue Feb 12 19:51:04 IST 2013
+Tue Apr 16 11:08:26 IDT 2013
============================
There were too many files tracking different thoughts and ideas for
@@ -24,16 +24,8 @@ Minor Cleanups and Code Improvements
regex.h - remove underscores in param names
- Add tests for patches in emails (?? - not sure now what this
- referred to)
-
Consider removing use of and/or need for the protos.h file.
- Consider moving var_value info into Node_var itself
- to reduce memory usage.
-
- Add macros for working with flags instead of using & and | directly.
-
Review the bash source script for working with shared libraries in
order to nuke the use of libtool.
@@ -53,6 +45,7 @@ Minor New Features
Major New Features
------------------
+
Think about how to generalize indirect access. Manuel Collado
suggests things like
@@ -82,18 +75,19 @@ Major New Features
Things To Think About That May Never Happen
-------------------------------------------
- ?? Scope IDs for IPv6 addresses ??
-
- ??? Gnulib
Consider making shadowed variables a warning and not
a fatal warning when --lint=fatal.
Similar for extra parameters in a function call.
+ ?? Scope IDs for IPv6 addresses ??
+
+ ??? Gnulib
+
Look at code coverage tools, like S2E: https://s2e.epfl.ch/
- Try running with diehard: http://www.diehard-software.org,
+ Try running with diehard. See http://www.diehard-software.org,
https://github.com/emeryberger/DieHard
Change from dlopen to using the libltdl library (i.e. lt_dlopen).
@@ -211,12 +205,16 @@ Done in 4.1:
Consider really implementing BWK awk SYMTAB for seeing what
global variables are defined.
-Things To Think About That May Never Happen
--------------------------------------------
-
Things That We Decided We Will Never Do
---------------------------------------
+ Consider moving var_value info into Node_var itself to reduce
+ memory usage. This would break all uses of get_lhs in the
+ code. It's too sweeping a change.
+
+ Add macros for working with flags instead of using & and |
+ directly.
+
Code Review
-----------
array.c
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=a750e1f81cb2b153d5e9de5fef03737ab84fdee1
commit a750e1f81cb2b153d5e9de5fef03737ab84fdee1
Author: Arnold D. Robbins <address@hidden>
Date: Tue Apr 16 11:03:05 2013 +0300
Sync dfa with GNU grep.
diff --git a/ChangeLog b/ChangeLog
index b766a0e..d02fd74 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,6 +1,7 @@
2013-04-16 Arnold D. Robbins <address@hidden>
* awkgram.c: Regenerated from bison 2.7.1.
+ * dfa.h, dfa.c: Minor edits to sync with GNU grep.
2013-04-14 Arnold D. Robbins <address@hidden>
diff --git a/dfa.c b/dfa.c
index df0dc4a..54e0ae9 100644
--- a/dfa.c
+++ b/dfa.c
@@ -1,5 +1,5 @@
/* dfa.c - deterministic extended regexp routines for GNU
- Copyright (C) 1988, 1998, 2000, 2002, 2004-2005, 2007-2012 Free Software
+ Copyright (C) 1988, 1998, 2000, 2002, 2004-2005, 2007-2013 Free Software
Foundation, Inc.
This program is free software; you can redistribute it and/or modify
@@ -65,8 +65,8 @@
#include "mbsupport.h" /* defines MBS_SUPPORT to 1 or 0, as appropriate */
#if MBS_SUPPORT
/* We can handle multibyte strings. */
-#include <wchar.h>
-#include <wctype.h>
+# include <wchar.h>
+# include <wctype.h>
#endif
#ifdef GAWK
diff --git a/dfa.h b/dfa.h
index 7d29ce2..c58485a 100644
--- a/dfa.h
+++ b/dfa.h
@@ -1,5 +1,5 @@
/* dfa.h - declarations for GNU deterministic regexp compiler
- Copyright (C) 1988, 1998, 2007, 2009-2012 Free Software Foundation, Inc.
+ Copyright (C) 1988, 1998, 2007, 2009-2013 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=07ec66899460f3a0439dfc6a3c0fd1e12afdb46a
commit 07ec66899460f3a0439dfc6a3c0fd1e12afdb46a
Author: Arnold D. Robbins <address@hidden>
Date: Tue Apr 16 10:57:20 2013 +0300
Regenerate awkgram.c with latest bison.
diff --git a/ChangeLog b/ChangeLog
index 0ca9258..b766a0e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2013-04-16 Arnold D. Robbins <address@hidden>
+
+ * awkgram.c: Regenerated from bison 2.7.1.
+
2013-04-14 Arnold D. Robbins <address@hidden>
* awkgram.y (check_funcs): Fix logic of test for called but
diff --git a/awkgram.c b/awkgram.c
index 03a39e7..f6cc6de 100644
--- a/awkgram.c
+++ b/awkgram.c
@@ -1,8 +1,8 @@
-/* A Bison parser, made by GNU Bison 2.7. */
+/* A Bison parser, made by GNU Bison 2.7.12-4996. */
/* Bison implementation for Yacc-like parsers in C
- Copyright (C) 1984, 1989-1990, 2000-2012 Free Software Foundation, Inc.
+ Copyright (C) 1984, 1989-1990, 2000-2013 Free Software Foundation, Inc.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@@ -44,7 +44,7 @@
#define YYBISON 1
/* Bison version. */
-#define YYBISON_VERSION "2.7"
+#define YYBISON_VERSION "2.7.12-4996"
/* Skeleton name. */
#define YYSKELETON_NAME "yacc.c"
@@ -429,6 +429,14 @@ typedef short int yytype_int16;
# endif
#endif
+#ifndef __attribute__
+/* This feature is available in gcc versions 2.5 and later. */
+# if (! defined __GNUC__ || __GNUC__ < 2 \
+ || (__GNUC__ == 2 && __GNUC_MINOR__ < 5))
+# define __attribute__(Spec) /* empty */
+# endif
+#endif
+
/* Suppress unused-variable warnings by "using" E. */
#if ! defined lint || defined __GNUC__
# define YYUSE(E) ((void) (E))
@@ -436,6 +444,7 @@ typedef short int yytype_int16;
# define YYUSE(E) /* empty */
#endif
+
/* Identity function, used to suppress warnings about constant conditions. */
#ifndef lint
# define YYID(N) (N)
@@ -1365,11 +1374,7 @@ yy_symbol_value_print (yyoutput, yytype, yyvaluep)
# else
YYUSE (yyoutput);
# endif
- switch (yytype)
- {
- default:
- break;
- }
+ YYUSE (yytype);
}
@@ -1759,12 +1764,7 @@ yydestruct (yymsg, yytype, yyvaluep)
yymsg = "Deleting";
YY_SYMBOL_PRINT (yymsg, yytype, yyvaluep, yylocationp);
- switch (yytype)
- {
-
- default:
- break;
- }
+ YYUSE (yytype);
}
@@ -2048,7 +2048,7 @@ yyreduce:
switch (yyn)
{
case 3:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 203 "awkgram.y"
{
rule = 0;
@@ -2057,7 +2057,7 @@ yyreduce:
break;
case 5:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 209 "awkgram.y"
{
next_sourcefile();
@@ -2067,7 +2067,7 @@ yyreduce:
break;
case 6:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 215 "awkgram.y"
{
rule = 0;
@@ -2080,7 +2080,7 @@ yyreduce:
break;
case 7:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 227 "awkgram.y"
{
(void) append_rule((yyvsp[(1) - (2)]), (yyvsp[(2) - (2)]));
@@ -2088,7 +2088,7 @@ yyreduce:
break;
case 8:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 231 "awkgram.y"
{
if (rule != Rule) {
@@ -2103,7 +2103,7 @@ yyreduce:
break;
case 9:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 242 "awkgram.y"
{
in_function = NULL;
@@ -2113,7 +2113,7 @@ yyreduce:
break;
case 10:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 248 "awkgram.y"
{
want_source = false;
@@ -2122,7 +2122,7 @@ yyreduce:
break;
case 11:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 253 "awkgram.y"
{
want_source = false;
@@ -2131,7 +2131,7 @@ yyreduce:
break;
case 12:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 261 "awkgram.y"
{
if (include_source((yyvsp[(1) - (1)])) < 0)
@@ -2143,19 +2143,19 @@ yyreduce:
break;
case 13:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 269 "awkgram.y"
{ (yyval) = NULL; }
break;
case 14:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 271 "awkgram.y"
{ (yyval) = NULL; }
break;
case 15:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 276 "awkgram.y"
{
if (load_library((yyvsp[(1) - (1)])) < 0)
@@ -2167,31 +2167,31 @@ yyreduce:
break;
case 16:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 284 "awkgram.y"
{ (yyval) = NULL; }
break;
case 17:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 286 "awkgram.y"
{ (yyval) = NULL; }
break;
case 18:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 291 "awkgram.y"
{ (yyval) = NULL; rule = Rule; }
break;
case 19:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 293 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); rule = Rule; }
break;
case 20:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 295 "awkgram.y"
{
INSTRUCTION *tp;
@@ -2221,7 +2221,7 @@ yyreduce:
break;
case 21:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 321 "awkgram.y"
{
static int begin_seen = 0;
@@ -2236,7 +2236,7 @@ yyreduce:
break;
case 22:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 332 "awkgram.y"
{
static int end_seen = 0;
@@ -2251,7 +2251,7 @@ yyreduce:
break;
case 23:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 343 "awkgram.y"
{
(yyvsp[(1) - (1)])->in_rule = rule = BEGINFILE;
@@ -2261,7 +2261,7 @@ yyreduce:
break;
case 24:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 349 "awkgram.y"
{
(yyvsp[(1) - (1)])->in_rule = rule = ENDFILE;
@@ -2271,7 +2271,7 @@ yyreduce:
break;
case 25:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 358 "awkgram.y"
{
if ((yyvsp[(2) - (5)]) == NULL)
@@ -2282,19 +2282,19 @@ yyreduce:
break;
case 26:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 368 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 27:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 370 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 28:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 372 "awkgram.y"
{
yyerror(_("`%s' is a built-in function, it cannot be redefined"),
@@ -2304,13 +2304,13 @@ yyreduce:
break;
case 29:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 378 "awkgram.y"
{ (yyval) = (yyvsp[(2) - (2)]); }
break;
case 32:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 388 "awkgram.y"
{
(yyvsp[(1) - (6)])->source_file = source;
@@ -2325,13 +2325,13 @@ yyreduce:
break;
case 33:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 406 "awkgram.y"
{ want_regexp = true; }
break;
case 34:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 408 "awkgram.y"
{
NODE *n, *exp;
@@ -2364,19 +2364,19 @@ yyreduce:
break;
case 35:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 440 "awkgram.y"
{ bcfree((yyvsp[(1) - (1)])); }
break;
case 37:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 446 "awkgram.y"
{ (yyval) = NULL; }
break;
case 38:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 448 "awkgram.y"
{
if ((yyvsp[(2) - (2)]) == NULL)
@@ -2393,25 +2393,25 @@ yyreduce:
break;
case 39:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 461 "awkgram.y"
{ (yyval) = NULL; }
break;
case 42:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 471 "awkgram.y"
{ (yyval) = NULL; }
break;
case 43:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 473 "awkgram.y"
{ (yyval) = (yyvsp[(2) - (3)]); }
break;
case 44:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 475 "awkgram.y"
{
if (do_pretty_print)
@@ -2422,7 +2422,7 @@ yyreduce:
break;
case 45:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 482 "awkgram.y"
{
INSTRUCTION *dflt, *curr = NULL, *cexp, *cstmt;
@@ -2516,7 +2516,7 @@ yyreduce:
break;
case 46:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 572 "awkgram.y"
{
/*
@@ -2562,7 +2562,7 @@ yyreduce:
break;
case 47:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 614 "awkgram.y"
{
/*
@@ -2608,7 +2608,7 @@ yyreduce:
break;
case 48:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 656 "awkgram.y"
{
INSTRUCTION *ip;
@@ -2725,7 +2725,7 @@ regular_loop:
break;
case 49:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 769 "awkgram.y"
{
(yyval) = mk_for_loop((yyvsp[(1) - (12)]), (yyvsp[(3) - (12)]), (yyvsp[(6) - (12)]), (yyvsp[(9) - (12)]), (yyvsp[(12) - (12)]));
@@ -2736,7 +2736,7 @@ regular_loop:
break;
case 50:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 776 "awkgram.y"
{
(yyval) = mk_for_loop((yyvsp[(1) - (11)]), (yyvsp[(3) - (11)]), (INSTRUCTION *) NULL, (yyvsp[(8) - (11)]), (yyvsp[(11) - (11)]));
@@ -2747,7 +2747,7 @@ regular_loop:
break;
case 51:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 783 "awkgram.y"
{
if (do_pretty_print)
@@ -2758,7 +2758,7 @@ regular_loop:
break;
case 52:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 793 "awkgram.y"
{
if (! break_allowed)
@@ -2771,7 +2771,7 @@ regular_loop:
break;
case 53:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 802 "awkgram.y"
{
if (! continue_allowed)
@@ -2784,7 +2784,7 @@ regular_loop:
break;
case 54:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 811 "awkgram.y"
{
/* if inside function (rule = 0), resolve context at run-time */
@@ -2797,7 +2797,7 @@ regular_loop:
break;
case 55:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 820 "awkgram.y"
{
/* if inside function (rule = 0), resolve context at run-time */
@@ -2812,7 +2812,7 @@ regular_loop:
break;
case 56:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 831 "awkgram.y"
{
/* Initialize the two possible jump targets, the actual target
@@ -2831,7 +2831,7 @@ regular_loop:
break;
case 57:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 846 "awkgram.y"
{
if (! in_function)
@@ -2840,7 +2840,7 @@ regular_loop:
break;
case 58:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 849 "awkgram.y"
{
if ((yyvsp[(3) - (4)]) == NULL) {
@@ -2865,13 +2865,13 @@ regular_loop:
break;
case 60:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 881 "awkgram.y"
{ in_print = true; in_parens = 0; }
break;
case 61:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 882 "awkgram.y"
{
/*
@@ -2972,13 +2972,13 @@ regular_print:
break;
case 62:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 979 "awkgram.y"
{ sub_counter = 0; }
break;
case 63:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 980 "awkgram.y"
{
char *arr = (yyvsp[(2) - (4)])->lextok;
@@ -3015,7 +3015,7 @@ regular_print:
break;
case 64:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1017 "awkgram.y"
{
static bool warned = false;
@@ -3045,31 +3045,31 @@ regular_print:
break;
case 65:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1043 "awkgram.y"
{ (yyval) = optimize_assignment((yyvsp[(1) - (1)])); }
break;
case 66:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1048 "awkgram.y"
{ (yyval) = NULL; }
break;
case 67:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1050 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 68:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1055 "awkgram.y"
{ (yyval) = NULL; }
break;
case 69:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1057 "awkgram.y"
{
if ((yyvsp[(1) - (2)]) == NULL)
@@ -3080,13 +3080,13 @@ regular_print:
break;
case 70:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1064 "awkgram.y"
{ (yyval) = NULL; }
break;
case 71:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1069 "awkgram.y"
{
INSTRUCTION *casestmt = (yyvsp[(5) - (5)]);
@@ -3102,7 +3102,7 @@ regular_print:
break;
case 72:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1081 "awkgram.y"
{
INSTRUCTION *casestmt = (yyvsp[(4) - (4)]);
@@ -3117,13 +3117,13 @@ regular_print:
break;
case 73:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1095 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 74:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1097 "awkgram.y"
{
NODE *n = (yyvsp[(2) - (2)])->memory;
@@ -3135,7 +3135,7 @@ regular_print:
break;
case 75:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1105 "awkgram.y"
{
bcfree((yyvsp[(1) - (2)]));
@@ -3144,13 +3144,13 @@ regular_print:
break;
case 76:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1110 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 77:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1112 "awkgram.y"
{
(yyvsp[(1) - (1)])->opcode = Op_push_re;
@@ -3159,19 +3159,19 @@ regular_print:
break;
case 78:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1120 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 79:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1122 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 81:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1132 "awkgram.y"
{
(yyval) = (yyvsp[(2) - (3)]);
@@ -3179,7 +3179,7 @@ regular_print:
break;
case 82:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1139 "awkgram.y"
{
in_print = false;
@@ -3189,13 +3189,13 @@ regular_print:
break;
case 83:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1144 "awkgram.y"
{ in_print = false; in_parens = 0; }
break;
case 84:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1145 "awkgram.y"
{
if ((yyvsp[(1) - (3)])->redir_type == redirect_twoway
@@ -3207,7 +3207,7 @@ regular_print:
break;
case 85:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1156 "awkgram.y"
{
(yyval) = mk_condition((yyvsp[(3) - (6)]), (yyvsp[(1) - (6)]), (yyvsp[(6) - (6)]), NULL, NULL);
@@ -3215,7 +3215,7 @@ regular_print:
break;
case 86:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1161 "awkgram.y"
{
(yyval) = mk_condition((yyvsp[(3) - (9)]), (yyvsp[(1) - (9)]), (yyvsp[(6) - (9)]), (yyvsp[(7) - (9)]), (yyvsp[(9) - (9)]));
@@ -3223,13 +3223,13 @@ regular_print:
break;
case 91:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1178 "awkgram.y"
{ (yyval) = NULL; }
break;
case 92:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1180 "awkgram.y"
{
bcfree((yyvsp[(1) - (2)]));
@@ -3238,19 +3238,19 @@ regular_print:
break;
case 93:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1188 "awkgram.y"
{ (yyval) = NULL; }
break;
case 94:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1190 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]) ; }
break;
case 95:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1195 "awkgram.y"
{
(yyvsp[(1) - (1)])->param_count = 0;
@@ -3259,7 +3259,7 @@ regular_print:
break;
case 96:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1200 "awkgram.y"
{
(yyvsp[(3) - (3)])->param_count = (yyvsp[(1) - (3)])->lasti->param_count + 1;
@@ -3269,55 +3269,55 @@ regular_print:
break;
case 97:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1206 "awkgram.y"
{ (yyval) = NULL; }
break;
case 98:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1208 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (2)]); }
break;
case 99:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1210 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (3)]); }
break;
case 100:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1216 "awkgram.y"
{ (yyval) = NULL; }
break;
case 101:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1218 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 102:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1223 "awkgram.y"
{ (yyval) = NULL; }
break;
case 103:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1225 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 104:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1230 "awkgram.y"
{ (yyval) = mk_expression_list(NULL, (yyvsp[(1) - (1)])); }
break;
case 105:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1232 "awkgram.y"
{
(yyval) = mk_expression_list((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]));
@@ -3326,13 +3326,13 @@ regular_print:
break;
case 106:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1237 "awkgram.y"
{ (yyval) = NULL; }
break;
case 107:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1239 "awkgram.y"
{
/*
@@ -3344,7 +3344,7 @@ regular_print:
break;
case 108:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1247 "awkgram.y"
{
/* Ditto */
@@ -3353,7 +3353,7 @@ regular_print:
break;
case 109:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1252 "awkgram.y"
{
/* Ditto */
@@ -3362,7 +3362,7 @@ regular_print:
break;
case 110:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1261 "awkgram.y"
{
if (do_lint && (yyvsp[(3) - (3)])->lasti->opcode == Op_match_rec)
@@ -3373,19 +3373,19 @@ regular_print:
break;
case 111:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1268 "awkgram.y"
{ (yyval) = mk_boolean((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 112:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1270 "awkgram.y"
{ (yyval) = mk_boolean((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 113:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1272 "awkgram.y"
{
if ((yyvsp[(1) - (3)])->lasti->opcode == Op_match_rec)
@@ -3405,7 +3405,7 @@ regular_print:
break;
case 114:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1288 "awkgram.y"
{
if (do_lint_old)
@@ -3419,7 +3419,7 @@ regular_print:
break;
case 115:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1298 "awkgram.y"
{
if (do_lint && (yyvsp[(3) - (3)])->lasti->opcode == Op_match_rec)
@@ -3430,31 +3430,31 @@ regular_print:
break;
case 116:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1305 "awkgram.y"
{ (yyval) = mk_condition((yyvsp[(1) - (5)]), (yyvsp[(2) - (5)]), (yyvsp[(3) - (5)]), (yyvsp[(4) - (5)]), (yyvsp[(5) - (5)])); }
break;
case 117:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1307 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 118:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1312 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 119:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1314 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 120:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1316 "awkgram.y"
{
(yyvsp[(2) - (2)])->opcode = Op_assign_quotient;
@@ -3463,43 +3463,43 @@ regular_print:
break;
case 121:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1324 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 122:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1326 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 123:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1331 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 124:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1333 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 125:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1338 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 126:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1340 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 127:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1342 "awkgram.y"
{
int count = 2;
@@ -3550,43 +3550,43 @@ regular_print:
break;
case 129:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1394 "awkgram.y"
{ (yyval) = mk_binary((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 130:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1396 "awkgram.y"
{ (yyval) = mk_binary((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 131:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1398 "awkgram.y"
{ (yyval) = mk_binary((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 132:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1400 "awkgram.y"
{ (yyval) = mk_binary((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 133:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1402 "awkgram.y"
{ (yyval) = mk_binary((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 134:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1404 "awkgram.y"
{ (yyval) = mk_binary((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 135:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1406 "awkgram.y"
{
/*
@@ -3613,7 +3613,7 @@ regular_print:
break;
case 136:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1429 "awkgram.y"
{
(yyvsp[(2) - (2)])->opcode = Op_postincrement;
@@ -3622,7 +3622,7 @@ regular_print:
break;
case 137:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1434 "awkgram.y"
{
(yyvsp[(2) - (2)])->opcode = Op_postdecrement;
@@ -3631,7 +3631,7 @@ regular_print:
break;
case 138:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1439 "awkgram.y"
{
if (do_lint_old) {
@@ -3655,7 +3655,7 @@ regular_print:
break;
case 139:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1464 "awkgram.y"
{
(yyval) = mk_getline((yyvsp[(3) - (4)]), (yyvsp[(4) - (4)]),
(yyvsp[(1) - (4)]), (yyvsp[(2) - (4)])->redir_type);
@@ -3664,43 +3664,43 @@ regular_print:
break;
case 140:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1470 "awkgram.y"
{ (yyval) = mk_binary((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 141:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1472 "awkgram.y"
{ (yyval) = mk_binary((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 142:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1474 "awkgram.y"
{ (yyval) = mk_binary((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 143:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1476 "awkgram.y"
{ (yyval) = mk_binary((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 144:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1478 "awkgram.y"
{ (yyval) = mk_binary((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 145:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1480 "awkgram.y"
{ (yyval) = mk_binary((yyvsp[(1) - (3)]), (yyvsp[(3) - (3)]), (yyvsp[(2) - (3)])); }
break;
case 146:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1485 "awkgram.y"
{
(yyval) = list_create((yyvsp[(1) - (1)]));
@@ -3708,7 +3708,7 @@ regular_print:
break;
case 147:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1489 "awkgram.y"
{
if ((yyvsp[(2) - (2)])->opcode == Op_match_rec) {
@@ -3744,13 +3744,13 @@ regular_print:
break;
case 148:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1521 "awkgram.y"
{ (yyval) = (yyvsp[(2) - (3)]); }
break;
case 149:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1523 "awkgram.y"
{
(yyval) = snode((yyvsp[(3) - (4)]), (yyvsp[(1) - (4)]));
@@ -3760,7 +3760,7 @@ regular_print:
break;
case 150:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1529 "awkgram.y"
{
(yyval) = snode((yyvsp[(3) - (4)]), (yyvsp[(1) - (4)]));
@@ -3770,7 +3770,7 @@ regular_print:
break;
case 151:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1535 "awkgram.y"
{
static bool warned = false;
@@ -3787,7 +3787,7 @@ regular_print:
break;
case 154:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1550 "awkgram.y"
{
(yyvsp[(1) - (2)])->opcode = Op_preincrement;
@@ -3796,7 +3796,7 @@ regular_print:
break;
case 155:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1555 "awkgram.y"
{
(yyvsp[(1) - (2)])->opcode = Op_predecrement;
@@ -3805,7 +3805,7 @@ regular_print:
break;
case 156:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1560 "awkgram.y"
{
(yyval) = list_create((yyvsp[(1) - (1)]));
@@ -3813,7 +3813,7 @@ regular_print:
break;
case 157:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1564 "awkgram.y"
{
(yyval) = list_create((yyvsp[(1) - (1)]));
@@ -3821,7 +3821,7 @@ regular_print:
break;
case 158:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1568 "awkgram.y"
{
if ((yyvsp[(2) - (2)])->lasti->opcode == Op_push_i
@@ -3840,7 +3840,7 @@ regular_print:
break;
case 159:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1583 "awkgram.y"
{
/*
@@ -3854,7 +3854,7 @@ regular_print:
break;
case 160:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1596 "awkgram.y"
{
func_use((yyvsp[(1) - (1)])->lasti->func_name, FUNC_USE);
@@ -3863,7 +3863,7 @@ regular_print:
break;
case 161:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1601 "awkgram.y"
{
/* indirect function call */
@@ -3900,7 +3900,7 @@ regular_print:
break;
case 162:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1637 "awkgram.y"
{
param_sanity((yyvsp[(3) - (4)]));
@@ -3918,37 +3918,37 @@ regular_print:
break;
case 163:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1654 "awkgram.y"
{ (yyval) = NULL; }
break;
case 164:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1656 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 165:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1661 "awkgram.y"
{ (yyval) = NULL; }
break;
case 166:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1663 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (2)]); }
break;
case 167:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1668 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 168:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1670 "awkgram.y"
{
(yyval) = list_merge((yyvsp[(1) - (2)]), (yyvsp[(2) - (2)]));
@@ -3956,7 +3956,7 @@ regular_print:
break;
case 169:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1677 "awkgram.y"
{
INSTRUCTION *ip = (yyvsp[(1) - (1)])->lasti;
@@ -3974,7 +3974,7 @@ regular_print:
break;
case 170:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1694 "awkgram.y"
{
INSTRUCTION *t = (yyvsp[(2) - (3)]);
@@ -3992,13 +3992,13 @@ regular_print:
break;
case 171:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1711 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); }
break;
case 172:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1713 "awkgram.y"
{
(yyval) = list_merge((yyvsp[(1) - (2)]), (yyvsp[(2) - (2)]));
@@ -4006,13 +4006,13 @@ regular_print:
break;
case 173:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1720 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (2)]); }
break;
case 174:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1725 "awkgram.y"
{
char *var_name = (yyvsp[(1) - (1)])->lextok;
@@ -4024,7 +4024,7 @@ regular_print:
break;
case 175:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1733 "awkgram.y"
{
char *arr = (yyvsp[(1) - (2)])->lextok;
@@ -4035,7 +4035,7 @@ regular_print:
break;
case 176:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1743 "awkgram.y"
{
INSTRUCTION *ip = (yyvsp[(1) - (1)])->nexti;
@@ -4051,7 +4051,7 @@ regular_print:
break;
case 177:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1755 "awkgram.y"
{
(yyval) = list_append((yyvsp[(2) - (3)]), (yyvsp[(1) - (3)]));
@@ -4061,7 +4061,7 @@ regular_print:
break;
case 178:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1764 "awkgram.y"
{
(yyvsp[(1) - (1)])->opcode = Op_postincrement;
@@ -4069,7 +4069,7 @@ regular_print:
break;
case 179:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1768 "awkgram.y"
{
(yyvsp[(1) - (1)])->opcode = Op_postdecrement;
@@ -4077,43 +4077,43 @@ regular_print:
break;
case 180:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1771 "awkgram.y"
{ (yyval) = NULL; }
break;
case 182:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1779 "awkgram.y"
{ yyerrok; }
break;
case 183:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1783 "awkgram.y"
{ yyerrok; }
break;
case 186:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1792 "awkgram.y"
{ yyerrok; }
break;
case 187:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1796 "awkgram.y"
{ (yyval) = (yyvsp[(1) - (1)]); yyerrok; }
break;
case 188:
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 1800 "awkgram.y"
{ yyerrok; }
break;
-/* Line 1792 of yacc.c */
+/* Line 1787 of yacc.c */
#line 4118 "awkgram.c"
default: break;
}
@@ -4345,7 +4345,7 @@ yyreturn:
}
-/* Line 2055 of yacc.c */
+/* Line 2050 of yacc.c */
#line 1802 "awkgram.y"
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=a679c239ef762a2e4ecfd977b803face0c987e57
commit a679c239ef762a2e4ecfd977b803face0c987e57
Author: Arnold D. Robbins <address@hidden>
Date: Tue Apr 16 10:51:20 2013 +0300
Move to latest texinfo.tex.
diff --git a/doc/ChangeLog b/doc/ChangeLog
index df02b74..37b3156 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -2,6 +2,7 @@
* gawk.texi: Pretty much finish cleanup. Move i18n chapter to
after advanced features chapter.
+ * texinfo.tex: Updated to current in texinfo SVN.
2013-04-15 Arnold D. Robbins <address@hidden>
diff --git a/doc/texinfo.tex b/doc/texinfo.tex
index b5f3141..a0c9d08 100644
--- a/doc/texinfo.tex
+++ b/doc/texinfo.tex
@@ -3,11 +3,11 @@
% Load plain if necessary, i.e., if running under initex.
\expandafter\ifx\csname fmtname\endcsname\relax\input plain\fi
%
-\def\texinfoversion{2012-11-08.11}
+\def\texinfoversion{2013-03-19.11}
%
% Copyright 1985, 1986, 1988, 1990, 1991, 1992, 1993, 1994, 1995,
% 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006,
-% 2007, 2008, 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
+% 2007, 2008, 2009, 2010, 2011, 2012, 2013 Free Software Foundation, Inc.
%
% This texinfo.tex file is free software: you can redistribute it and/or
% modify it under the terms of the GNU General Public License as
@@ -24,7 +24,8 @@
%
% As a special exception, when this file is read by TeX when processing
% a Texinfo source document, you may use the result without
-% restriction. (This has been our intent since Texinfo was invented.)
+% restriction. This Exception is an additional permission under section 7
+% of the GNU General Public License, version 3 ("GPLv3").
%
% Please try the latest version of texinfo.tex before submitting bug
% reports; you can get the latest version from:
@@ -2495,7 +2496,7 @@ end
\let-\codedash
\let_\codeunder
\else
- \let-\realdash
+ \let-\normaldash
\let_\realunder
\fi
\codex
@@ -2504,7 +2505,7 @@ end
\def\codex #1{\tclose{#1}\endgroup}
-\def\realdash{-}
+\def\normaldash{-}
\def\codedash{-\discretionary{}{}{}}
\def\codeunder{%
% this is all so @address@hidden can work. In math mode, _
@@ -2519,9 +2520,9 @@ end
}
% An additional complication: the above will allow breaks after, e.g.,
-% each of the four underscores in __typeof__. This is undesirable in
-% some manuals, especially if they don't have long identifiers in
-% general. @allowcodebreaks provides a way to control this.
+% each of the four underscores in __typeof__. This is bad.
+% @allowcodebreaks provides a document-level way to turn breaking at -
+% and _ on and off.
%
\newif\ifallowcodebreaks \allowcodebreakstrue
@@ -4187,7 +4188,7 @@ end
% ..., but we might end up with active ones in the argument if
% we're called from @code, as @address@hidden, though.
% So \let them to their normal equivalents.
- \let-\realdash \let_\normalunderscore
+ \let-\normaldash \let_\normalunderscore
}
}
@@ -4210,8 +4211,9 @@ end
% @ifset VAR ... @end ifset reads the `...' iff VAR has been defined
% with @set.
-%
-% To get special treatment of address@hidden ifset,' call \makeond and the redefine.
+%
+% To get the special treatment we need for address@hidden ifset,' we call
+% \makecond and then redefine.
%
\makecond{ifset}
\def\ifset{\parsearg{\doifset{\let\next=\ifsetfail}}}
@@ -6401,7 +6403,7 @@ end
\newdimen\nonfillparindent
\def\nonfillstart{%
\aboveenvbreak
- \hfuzz = 12pt % Don't be fussy
+ \ifdim\hfuzz < 12pt \hfuzz = 12pt \fi % Don't be fussy
\sepspaces % Make spaces be word-separators rather than space tokens.
\let\par = \lisppar % don't ignore blank lines
\obeylines % each line of input is a line of output
@@ -9992,22 +9994,26 @@ directory should work if nowhere else does.}
@address@hidden@address@hidden
% Same as @turnoffactive except outputs \ as {\tt\char`\\} instead of
-% the literal character `\'.
-%
address@hidden@normalturnoffactive{%
- @let"address@hidden
- @address@hidden %$ font-lock fix
- @address@hidden
- @let<address@hidden
- @let>address@hidden
- @address@hidden
- @address@hidden
- @address@hidden
- @let|address@hidden
- @address@hidden
- @markupsetuplqdefault
- @markupsetuprqdefault
- @unsepspaces
+% the literal character `\'. Also revert - to its normal character, in
+% case the active - from code has slipped in.
+%
address@hidden = @active
+ @address@hidden
+ @address@hidden
+ @let"address@hidden
+ @address@hidden %$ font-lock fix
+ @address@hidden
+ @let<address@hidden
+ @let>address@hidden
+ @address@hidden
+ @address@hidden
+ @address@hidden
+ @let|address@hidden
+ @address@hidden
+ @markupsetuplqdefault
+ @markupsetuprqdefault
+ @unsepspaces
+ }
}
% Make _ and + \other characters, temporarily.
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=34b9e9e666c79e4c42a59d0b7b7584a0620295f0
commit 34b9e9e666c79e4c42a59d0b7b7584a0620295f0
Author: Arnold D. Robbins <address@hidden>
Date: Tue Apr 16 10:50:46 2013 +0300
Largely done with doc cleanup.
diff --git a/doc/ChangeLog b/doc/ChangeLog
index affa082..df02b74 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,8 @@
+2013-04-16 Arnold D. Robbins <address@hidden>
+
+ * gawk.texi: Pretty much finish cleanup. Move i18n chapter to
+ after advanced features chapter.
+
2013-04-15 Arnold D. Robbins <address@hidden>
* gawk.texi: Continue cleanup.
diff --git a/doc/gawk.info b/doc/gawk.info
index 9271620..2d1ee6c 100644
--- a/doc/gawk.info
+++ b/doc/gawk.info
@@ -90,10 +90,10 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Library Functions:: A Library of `awk' Functions.
* Sample Programs:: Many `awk' programs with complete
explanations.
-* Internationalization:: Getting `gawk' to speak your
- language.
* Advanced Features:: Stuff for advanced users, specific to
`gawk'.
+* Internationalization:: Getting `gawk' to speak your
+ language.
* Debugger:: The `gawk' debugger.
* Arbitrary Precision Arithmetic:: Arbitrary precision arithmetic with
`gawk'.
@@ -997,14 +997,14 @@ problems.
Part III focuses on features specific to `gawk'. It contains the
following chapters:
- *note Internationalization::, describes special features in `gawk'
-for translating program messages into different languages at runtime.
-
*note Advanced Features::, describes a number of `gawk'-specific
advanced features. Of particular note are the abilities to have
two-way communications with another process, perform TCP/IP networking,
and profile your `awk' programs.
+ *note Internationalization::, describes special features in `gawk'
+for translating program messages into different languages at runtime.
+
*note Debugger::, describes the `awk' debugger.
*note Arbitrary Precision Arithmetic::, describes advanced
@@ -15242,7 +15242,7 @@ user-defined function that expects to receive and index and a value,
and then processes the element.
-File: gawk.info, Node: Sample Programs, Next: Internationalization, Prev: Library Functions, Up: Top
+File: gawk.info, Node: Sample Programs, Next: Advanced Features, Prev: Library Functions, Up: Top
11 Practical `awk' Programs
***************************
@@ -17861,1449 +17861,1449 @@ supplies the following copyright terms:
We leave it to you to determine what the program does.
-File: gawk.info, Node: Internationalization, Next: Advanced Features, Prev: Sample Programs, Up: Top
+File: gawk.info, Node: Advanced Features, Next: Internationalization, Prev: Sample Programs, Up: Top
-12 Internationalization with `gawk'
-***********************************
+12 Advanced Features of `gawk'
+******************************
-Once upon a time, computer makers wrote software that worked only in
-English. Eventually, hardware and software vendors noticed that if
-their systems worked in the native languages of non-English-speaking
-countries, they were able to sell more systems. As a result,
-internationalization and localization of programs and software systems
-became a common practice.
+ Write documentation as if whoever reads it is a violent psychopath
+ who knows where you live.
+ Steve English, as quoted by Peter Langston
- For many years, the ability to provide internationalization was
-largely restricted to programs written in C and C++. This major node
-describes the underlying library `gawk' uses for internationalization,
-as well as how `gawk' makes internationalization features available at
-the `awk' program level. Having internationalization available at the
-`awk' level gives software developers additional flexibility--they are
-no longer forced to write in C or C++ when internationalization is a
-requirement.
+ This major node discusses advanced features in `gawk'. It's a bit
+of a "grab bag" of items that are otherwise unrelated to each other.
+First, a command-line option allows `gawk' to recognize nondecimal
+numbers in input data, not just in `awk' programs. Then, `gawk''s
+special features for sorting arrays are presented. Next, two-way I/O,
+discussed briefly in earlier parts of this Info file, is described in
+full detail, along with the basics of TCP/IP networking. Finally,
+`gawk' can "profile" an `awk' program, making it possible to tune it
+for performance.
-* Menu:
+ A number of advanced features require separate major nodes of their
+own:
-* I18N and L10N:: Internationalization and Localization.
-* Explaining gettext:: How GNU `gettext' works.
-* Programmer i18n:: Features for the programmer.
-* Translator i18n:: Features for the translator.
-* I18N Example:: A simple i18n example.
-* Gawk I18N:: `gawk' is also internationalized.
+ * *note Internationalization::, discusses how to internationalize
+ your `awk' programs, so that they can speak multiple national
+ languages.
-
-File: gawk.info, Node: I18N and L10N, Next: Explaining gettext, Up: Internationalization
+ * *note Debugger::, describes `gawk''s built-in command-line
+ debugger for debugging `awk' programs.
-12.1 Internationalization and Localization
-==========================================
+ * *note Arbitrary Precision Arithmetic::, describes how you can use
+ `gawk' to perform arbitrary-precision arithmetic.
-"Internationalization" means writing (or modifying) a program once, in
-such a way that it can use multiple languages without requiring further
-source-code changes. "Localization" means providing the data necessary
-for an internationalized program to work in a particular language.
-Most typically, these terms refer to features such as the language used
-for printing error messages, the language used to read responses, and
-information related to how numerical and monetary values are printed
-and read.
+ * *note Dynamic Extensions::, discusses the ability to dynamically
+ add new built-in functions to `gawk'.
-
-File: gawk.info, Node: Explaining gettext, Next: Programmer i18n, Prev: I18N and L10N, Up: Internationalization
+* Menu:
-12.2 GNU `gettext'
-==================
+* Nondecimal Data:: Allowing nondecimal input data.
+* Array Sorting:: Facilities for controlling array traversal and
+ sorting arrays.
+* Two-way I/O:: Two-way communications with another process.
+* TCP/IP Networking:: Using `gawk' for network programming.
+* Profiling:: Profiling your `awk' programs.
-The facilities in GNU `gettext' focus on messages; strings printed by a
-program, either directly or via formatting with `printf' or
-`sprintf()'.(1)
+
+File: gawk.info, Node: Nondecimal Data, Next: Array Sorting, Up: Advanced Features
- When using GNU `gettext', each application has its own "text
-domain". This is a unique name, such as `kpilot' or `gawk', that
-identifies the application. A complete application may have multiple
-components--programs written in C or C++, as well as scripts written in
-`sh' or `awk'. All of the components use the same text domain.
+12.1 Allowing Nondecimal Input Data
+===================================
- To make the discussion concrete, assume we're writing an application
-named `guide'. Internationalization consists of the following steps,
-in this order:
+If you run `gawk' with the `--non-decimal-data' option, you can have
+nondecimal constants in your input data:
- 1. The programmer goes through the source for all of `guide''s
- components and marks each string that is a candidate for
- translation. For example, `"`-F': option required"' is a good
- candidate for translation. A table with strings of option names
- is not (e.g., `gawk''s `--profile' option should remain the same,
- no matter what the local language).
+ $ echo 0123 123 0x123 |
+ > gawk --non-decimal-data '{ printf "%d, %d, %d\n",
+ > $1, $2, $3 }'
+ -| 83, 123, 291
- 2. The programmer indicates the application's text domain (`"guide"')
- to the `gettext' library, by calling the `textdomain()' function.
+ For this feature to work, write your program so that `gawk' treats
+your data as numeric:
- 3. Messages from the application are extracted from the source code
- and collected into a portable object template file (`guide.pot'),
- which lists the strings and their translations. The translations
- are initially empty. The original (usually English) messages
- serve as the key for lookup of the translations.
+ $ echo 0123 123 0x123 | gawk '{ print $1, $2, $3 }'
+ -| 0123 123 0x123
- 4. For each language with a translator, `guide.pot' is copied to a
- portable object file (`.po') and translations are created and
- shipped with the application. For example, there might be a
- `fr.po' for a French translation.
+The `print' statement treats its expressions as strings. Although the
+fields can act as numbers when necessary, they are still strings, so
+`print' does not try to treat them numerically. You may need to add
+zero to a field to force it to be treated as a number. For example:
- 5. Each language's `.po' file is converted into a binary message
- object (`.mo') file. A message object file contains the original
- messages and their translations in a binary format that allows
- fast lookup of translations at runtime.
+ $ echo 0123 123 0x123 | gawk --non-decimal-data '
+ > { print $1, $2, $3
+ > print $1 + 0, $2 + 0, $3 + 0 }'
+ -| 0123 123 0x123
+ -| 83 123 291
- 6. When `guide' is built and installed, the binary translation files
- are installed in a standard place.
+ Because it is common to have decimal data with leading zeros, and
+because using this facility could lead to surprising results, the
+default is to leave it disabled. If you want it, you must explicitly
+request it.
- 7. For testing and development, it is possible to tell `gettext' to
- use `.mo' files in a different directory than the standard one by
- using the `bindtextdomain()' function.
+ CAUTION: _Use of this option is not recommended._ It can break old
+ programs very badly. Instead, use the `strtonum()' function to
+ convert your data (*note Nondecimal-numbers::). This makes your
+ programs easier to write and easier to read, and leads to less
+ surprising results.
- 8. At runtime, `guide' looks up each string via a call to
- `gettext()'. The returned string is the translated string if
- available, or the original string if not.
+
+File: gawk.info, Node: Array Sorting, Next: Two-way I/O, Prev: Nondecimal Data, Up: Advanced Features
- 9. If necessary, it is possible to access messages from a different
- text domain than the one belonging to the application, without
- having to switch the application's default text domain back and
- forth.
+12.2 Controlling Array Traversal and Array Sorting
+==================================================
- In C (or C++), the string marking and dynamic translation lookup are
-accomplished by wrapping each string in a call to `gettext()':
+`gawk' lets you control the order in which a `for (i in array)' loop
+traverses an array.
- printf("%s", gettext("Don't Panic!\n"));
+ In addition, two built-in functions, `asort()' and `asorti()', let
+you sort arrays based on the array values and indices, respectively.
+These two functions also provide control over the sorting criteria used
+to order the elements during sorting.
- The tools that extract messages from source code pull out all
-strings enclosed in calls to `gettext()'.
+* Menu:
- The GNU `gettext' developers, recognizing that typing `gettext(...)'
-over and over again is both painful and ugly to look at, use the macro
-`_' (an underscore) to make things easier:
+* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
+* Array Sorting Functions:: How to use `asort()' and `asorti()'.
- /* In the standard header file: */
- #define _(str) gettext(str)
+
+File: gawk.info, Node: Controlling Array Traversal, Next: Array Sorting Functions, Up: Array Sorting
- /* In the program text: */
- printf("%s", _("Don't Panic!\n"));
+12.2.1 Controlling Array Traversal
+----------------------------------
-This reduces the typing overhead to just three extra characters per
-string and is considerably easier to read as well.
+By default, the order in which a `for (i in array)' loop scans an array
+is not defined; it is generally based upon the internal implementation
+of arrays inside `awk'.
- There are locale "categories" for different types of locale-related
-information. The defined locale categories that `gettext' knows about
-are:
+ Often, though, it is desirable to be able to loop over the elements
+in a particular order that you, the programmer, choose. `gawk' lets
+you do this.
-`LC_MESSAGES'
- Text messages. This is the default category for `gettext'
- operations, but it is possible to supply a different one
- explicitly, if necessary. (It is almost never necessary to supply
- a different category.)
+ *note Controlling Scanning::, describes how you can assign special,
+pre-defined values to `PROCINFO["sorted_in"]' in order to control the
+order in which `gawk' will traverse an array during a `for' loop.
-`LC_COLLATE'
- Text-collation information; i.e., how different characters and/or
- groups of characters sort in a given language.
+ In addition, the value of `PROCINFO["sorted_in"]' can be a function
+name. This lets you traverse an array based on any custom criterion.
+The array elements are ordered according to the return value of this
+function. The comparison function should be defined with at least four
+arguments:
-`LC_CTYPE'
- Character-type information (alphabetic, digit, upper- or
- lowercase, and so on). This information is accessed via the POSIX
- character classes in regular expressions, such as `/[[:alnum:]]/'
- (*note Regexp Operators::).
+ function comp_func(i1, v1, i2, v2)
+ {
+ COMPARE ELEMENTS 1 AND 2 IN SOME FASHION
+ RETURN < 0; 0; OR > 0
+ }
-`LC_MONETARY'
- Monetary information, such as the currency symbol, and whether the
- symbol goes before or after a number.
+ Here, I1 and I2 are the indices, and V1 and V2 are the corresponding
+values of the two elements being compared. Either V1 or V2, or both,
+can be arrays if the array being traversed contains subarrays as values.
+(*Note Arrays of Arrays::, for more information about subarrays.) The
+three possible return values are interpreted as follows:
-`LC_NUMERIC'
- Numeric information, such as which characters to use for the
- decimal point and the thousands separator.(2)
+`comp_func(i1, v1, i2, v2) < 0'
+ Index I1 comes before index I2 during loop traversal.
-`LC_RESPONSE'
- Response information, such as how "yes" and "no" appear in the
- local language, and possibly other information as well.
+`comp_func(i1, v1, i2, v2) == 0'
+ Indices I1 and I2 come together but the relative order with
+ respect to each other is undefined.
-`LC_TIME'
- Time- and date-related information, such as 12- or 24-hour clock,
- month printed before or after the day in a date, local month
- abbreviations, and so on.
+`comp_func(i1, v1, i2, v2) > 0'
+ Index I1 comes after index I2 during loop traversal.
-`LC_ALL'
- All of the above. (Not too useful in the context of `gettext'.)
+ Our first comparison function can be used to scan an array in
+numerical order of the indices:
- ---------- Footnotes ----------
+ function cmp_num_idx(i1, v1, i2, v2)
+ {
+ # numerical index comparison, ascending order
+ return (i1 - i2)
+ }
- (1) For some operating systems, the `gawk' port doesn't support GNU
-`gettext'. Therefore, these features are not available if you are
-using one of those operating systems. Sorry.
+ Our second function traverses an array based on the string order of
+the element values rather than by indices:
- (2) Americans use a comma every three decimal places and a period
-for the decimal point, while many Europeans do exactly the opposite:
-1,234.56 versus 1.234,56.
+ function cmp_str_val(i1, v1, i2, v2)
+ {
+ # string value comparison, ascending order
+ v1 = v1 ""
+ v2 = v2 ""
+ if (v1 < v2)
+ return -1
+ return (v1 != v2)
+ }
-
-File: gawk.info, Node: Programmer i18n, Next: Translator i18n, Prev: Explaining gettext, Up: Internationalization
+ The third comparison function makes all numbers, and numeric strings
+without any leading or trailing spaces, come out first during loop
+traversal:
-12.3 Internationalizing `awk' Programs
-======================================
+ function cmp_num_str_val(i1, v1, i2, v2, n1, n2)
+ {
+ # numbers before string value comparison, ascending order
+ n1 = v1 + 0
+ n2 = v2 + 0
+ if (n1 == v1)
+ return (n2 == v2) ? (n1 - n2) : -1
+ else if (n2 == v2)
+ return 1
+ return (v1 < v2) ? -1 : (v1 != v2)
+ }
-`gawk' provides the following variables and functions for
-internationalization:
+ Here is a main program to demonstrate how `gawk' behaves using each
+of the previous functions:
-`TEXTDOMAIN'
- This variable indicates the application's text domain. For
- compatibility with GNU `gettext', the default value is
- `"messages"'.
+ BEGIN {
+ data["one"] = 10
+ data["two"] = 20
+ data[10] = "one"
+ data[100] = 100
+ data[20] = "two"
-`_"your message here"'
- String constants marked with a leading underscore are candidates
- for translation at runtime. String constants without a leading
- underscore are not translated.
+ f[1] = "cmp_num_idx"
+ f[2] = "cmp_str_val"
+ f[3] = "cmp_num_str_val"
+ for (i = 1; i <= 3; i++) {
+ printf("Sort function: %s\n", f[i])
+ PROCINFO["sorted_in"] = f[i]
+ for (j in data)
+ printf("\tdata[%s] = %s\n", j, data[j])
+ print ""
+ }
+ }
-`dcgettext(STRING [, DOMAIN [, CATEGORY]])'
- Return the translation of STRING in text domain DOMAIN for locale
- category CATEGORY. The default value for DOMAIN is the current
- value of `TEXTDOMAIN'. The default value for CATEGORY is
- `"LC_MESSAGES"'.
+ Here are the results when the program is run:
- If you supply a value for CATEGORY, it must be a string equal to
- one of the known locale categories described in *note Explaining
- gettext::. You must also supply a text domain. Use `TEXTDOMAIN'
- if you want to use the current domain.
+ $ gawk -f compdemo.awk
+ -| Sort function: cmp_num_idx Sort by numeric index
+ -| data[two] = 20
+ -| data[one] = 10 Both strings are numerically zero
+ -| data[10] = one
+ -| data[20] = two
+ -| data[100] = 100
+ -|
+ -| Sort function: cmp_str_val Sort by element values as strings
+ -| data[one] = 10
+ -| data[100] = 100 String 100 is less than string 20
+ -| data[two] = 20
+ -| data[10] = one
+ -| data[20] = two
+ -|
+ -| Sort function: cmp_num_str_val Sort all numeric values before all strings
+ -| data[one] = 10
+ -| data[two] = 20
+ -| data[100] = 100
+ -| data[10] = one
+ -| data[20] = two
- CAUTION: The order of arguments to the `awk' version of the
- `dcgettext()' function is purposely different from the order
- for the C version. The `awk' version's order was chosen to
- be simple and to allow for reasonable `awk'-style default
- arguments.
+ Consider sorting the entries of a GNU/Linux system password file
+according to login name. The following program sorts records by a
+specific field position and can be used for this purpose:
-`dcngettext(STRING1, STRING2, NUMBER [, DOMAIN [, CATEGORY]])'
- Return the plural form used for NUMBER of the translation of
- STRING1 and STRING2 in text domain DOMAIN for locale category
- CATEGORY. STRING1 is the English singular variant of a message,
- and STRING2 the English plural variant of the same message. The
- default value for DOMAIN is the current value of `TEXTDOMAIN'.
- The default value for CATEGORY is `"LC_MESSAGES"'.
+ # sort.awk --- simple program to sort by field position
+ # field position is specified by the global variable POS
- The same remarks about argument order as for the `dcgettext()'
- function apply.
+ function cmp_field(i1, v1, i2, v2)
+ {
+ # comparison by value, as string, and ascending order
+ return v1[POS] < v2[POS] ? -1 : (v1[POS] != v2[POS])
+ }
-`bindtextdomain(DIRECTORY [, DOMAIN])'
- Change the directory in which `gettext' looks for `.mo' files, in
- case they will not or cannot be placed in the standard locations
- (e.g., during testing). Return the directory in which DOMAIN is
- "bound."
+ {
+ for (i = 1; i <= NF; i++)
+ a[NR][i] = $i
+ }
- The default DOMAIN is the value of `TEXTDOMAIN'. If DIRECTORY is
- the null string (`""'), then `bindtextdomain()' returns the
- current binding for the given DOMAIN.
+ END {
+ PROCINFO["sorted_in"] = "cmp_field"
+ if (POS < 1 || POS > NF)
+ POS = 1
+ for (i in a) {
+ for (j = 1; j <= NF; j++)
+ printf("%s%c", a[i][j], j < NF ? ":" : "")
+ print ""
+ }
+ }
- To use these facilities in your `awk' program, follow the steps
-outlined in *note Explaining gettext::, like so:
+ The first field in each entry of the password file is the user's
+login name, and the fields are separated by colons. Each record
+defines a subarray, with each field as an element in the subarray.
+Running the program produces the following output:
- 1. Set the variable `TEXTDOMAIN' to the text domain of your program.
- This is best done in a `BEGIN' rule (*note BEGIN/END::), or it can
- also be done via the `-v' command-line option (*note Options::):
+ $ gawk -v POS=1 -F: -f sort.awk /etc/passwd
+ -| adm:x:3:4:adm:/var/adm:/sbin/nologin
+ -| apache:x:48:48:Apache:/var/www:/sbin/nologin
+ -| avahi:x:70:70:Avahi daemon:/:/sbin/nologin
+ ...
- BEGIN {
- TEXTDOMAIN = "guide"
- ...
- }
+ The comparison function should always return the same value when
+given a specific pair of array elements as its arguments. If it
+returns inconsistent results, the order is undefined. This
+behavior can be exploited to introduce random order into otherwise
+seemingly ordered data:
- 2. Mark all translatable strings with a leading underscore (`_')
- character. It _must_ be adjacent to the opening quote of the
- string. For example:
+ function cmp_randomize(i1, v1, i2, v2)
+ {
+ # random order
+ return (2 - 4 * rand())
+ }
- print _"hello, world"
- x = _"you goofed"
- printf(_"Number of users is %d\n", nusers)
+ As mentioned above, the order of the indices is arbitrary if two
+elements compare equal. This is usually not a problem, but letting the
+tied elements come out in arbitrary order can be an issue, especially
+when comparing item values. The partial ordering of the equal elements
+may change during the next loop traversal, if other elements are added
+to or removed from the array. One way to resolve ties when comparing
+elements with otherwise equal values is to include the indices in the
+comparison rules. Note that doing this may make the loop traversal
+less efficient, so consider it only if necessary. The following
+comparison functions force a deterministic order, and are based on the
+fact that the indices of two elements are never equal:
- 3. If you are creating strings dynamically, you can still translate
- them, using the `dcgettext()' built-in function:
+ function cmp_numeric(i1, v1, i2, v2)
+ {
+ # numerical value (and index) comparison, descending order
+ return (v1 != v2) ? (v2 - v1) : (i2 - i1)
+ }
- message = nusers " users logged in"
- message = dcgettext(message, "adminprog")
- print message
+ function cmp_string(i1, v1, i2, v2)
+ {
+ # string value (and index) comparison, descending order
+ v1 = v1 i1
+ v2 = v2 i2
+ return (v1 > v2) ? -1 : (v1 != v2)
+ }
- Here, the call to `dcgettext()' supplies a different text domain
- (`"adminprog"') in which to find the message, but it uses the
- default `"LC_MESSAGES"' category.
+ A custom comparison function can often simplify ordered loop
+traversal, and the sky is really the limit when it comes to designing
+such a function.
- 4. During development, you might want to put the `.mo' file in a
- private directory for testing. This is done with the
- `bindtextdomain()' built-in function:
+ When string comparisons are made during a sort, either for element
+values where one or both aren't numbers, or for element indices handled
+as strings, the value of `IGNORECASE' (*note Built-in Variables::)
+controls whether the comparisons treat corresponding uppercase and
+lowercase letters as equivalent or distinct.
- BEGIN {
- TEXTDOMAIN = "guide" # our text domain
- if (Testing) {
- # where to find our files
- bindtextdomain("testdir")
- # joe is in charge of adminprog
- bindtextdomain("../joe/testdir", "adminprog")
- }
- ...
- }
+ Another point to keep in mind is that in the case of subarrays the
+element values can themselves be arrays; a production comparison
+function should use the `isarray()' function (*note Type Functions::)
+to check for this, and choose a defined sorting order for subarrays.
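A minimal sketch of that advice, assuming `gawk' 4.0 or later. The function name `cmp_scalars_first' and the sample data are hypothetical, and placing scalars before subarrays is only one possible choice of ordering:

```awk
# Hypothetical comparison function: every scalar value sorts before
# every subarray; ties are broken by comparing the indices as strings.
function cmp_scalars_first(i1, v1, i2, v2,    a1, a2)
{
    a1 = isarray(v1)
    a2 = isarray(v2)
    if (a1 != a2)
        return a1 - a2              # scalar (0) before subarray (1)
    if (a1 == 0 && v1 != v2)        # two scalars with different values
        return (v1 < v2) ? -1 : 1
    # two subarrays, or two equal scalars: fall back to the indices
    return ((i1 "") < (i2 "")) ? -1 : ((i1 "") != (i2 ""))
}

BEGIN {
    data["x"][1] = 1                # a subarray
    data["a"] = 5
    data["b"] = 2
    PROCINFO["sorted_in"] = "cmp_scalars_first"
    for (k in data)
        order = order k " "
    print order                     # indices in traversal order: b a x
}
```

Because the subarray values are never touched in a scalar context, the function is safe to call on mixed arrays; a real program would likely pick a more meaningful rule for ordering two subarrays.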
+ All sorting based on `PROCINFO["sorted_in"]' is disabled in POSIX
+mode, since the `PROCINFO' array is not special in that case.
- *Note I18N Example::, for an example program showing the steps to
-create and use translations from `awk'.
+ As a side note, sorting the array indices before traversing the
+array has been reported to add 15% to 20% overhead to the execution
+time of `awk' programs. For this reason, sorted array traversal is not
+the default.
-File: gawk.info, Node: Translator i18n, Next: I18N Example, Prev: Programmer i18n, Up: Internationalization
-
-12.4 Translating `awk' Programs
-===============================
+File: gawk.info, Node: Array Sorting Functions, Prev: Controlling Array Traversal, Up: Array Sorting
-Once a program's translatable strings have been marked, they must be
-extracted to create the initial `.po' file. As part of translation, it
-is often helpful to rearrange the order in which arguments to `printf'
-are output.
+12.2.2 Sorting Array Values and Indices with `gawk'
+---------------------------------------------------
- `gawk''s `--gen-pot' command-line option extracts the messages and
-is discussed next. After that, `printf''s ability to rearrange the
-order for `printf' arguments at runtime is covered.
+In most `awk' implementations, sorting an array requires writing a
+`sort()' function. While this can be educational for exploring
+different sorting algorithms, usually that's not the point of the
+program. `gawk' provides the built-in `asort()' and `asorti()'
+functions (*note String Functions::) for sorting arrays. For example:
-* Menu:
+ POPULATE THE ARRAY data
+ n = asort(data)
+ for (i = 1; i <= n; i++)
+ DO SOMETHING WITH data[i]
-* String Extraction:: Extracting marked strings.
-* Printf Ordering:: Rearranging `printf' arguments.
-* I18N Portability:: `awk'-level portability issues.
+ After the call to `asort()', the array `data' is indexed from 1 to
+some number N, the total number of elements in `data'. (This count is
+`asort()''s return value.) `data[1]' <= `data[2]' <= `data[3]', and so
+on. The comparison is based on the type of the elements (*note Typing
+and Comparison::). All numeric values come before all string values,
+which in turn come before all subarrays.
-
-File: gawk.info, Node: String Extraction, Next: Printf Ordering, Up: Translator i18n
+ An important side effect of calling `asort()' is that _the array's
+original indices are irrevocably lost_. As this isn't always
+desirable, `asort()' accepts a second argument:
-12.4.1 Extracting Marked Strings
---------------------------------
+ POPULATE THE ARRAY source
+ n = asort(source, dest)
+ for (i = 1; i <= n; i++)
+ DO SOMETHING WITH dest[i]
-Once your `awk' program is working, and all the strings have been
-marked and you've set (and perhaps bound) the text domain, it is time
-to produce translations. First, use the `--gen-pot' command-line
-option to create the initial `.pot' file:
+ In this case, `gawk' copies the `source' array into the `dest' array
+and then sorts `dest', destroying its indices. However, the `source'
+array is not affected.
- $ gawk --gen-pot -f guide.awk > guide.pot
+ `asort()' accepts a third string argument to control comparison of
+array elements. As with `PROCINFO["sorted_in"]', this argument may be
+one of the predefined names that `gawk' provides (*note Controlling
+Scanning::), or the name of a user-defined function (*note Controlling
+Array Traversal::).
- When run with `--gen-pot', `gawk' does not execute your program.
-Instead, it parses it as usual and prints all marked strings to
-standard output in the format of a GNU `gettext' Portable Object file.
-Also included in the output are any constant strings that appear as the
-first argument to `dcgettext()' or as the first and second argument to
-`dcngettext()'.(1) *Note I18N Example::, for the full list of steps to
-go through to create and test translations for `guide'.
+ NOTE: In all cases, the sorted element values consist of the
+ original array's element values. The ability to control
+ comparison merely affects the way in which they are sorted.
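As a rough illustration of the third argument, the following sketch sorts a copy of a small array by numeric value in descending order. The array names and values are made up for the example; `"@val_num_desc"' is one of the predefined names:

```awk
BEGIN {
    source["a"] = 10
    source["b"] = 2
    source["c"] = 33

    # "@val_num_desc" is one of gawk's predefined orderings:
    # element values compared as numbers, largest first.
    n = asort(source, dest, "@val_num_desc")

    for (i = 1; i <= n; i++)
        printf("dest[%d] = %s\n", i, dest[i])    # 33, then 10, then 2
}
```

The `source' array keeps its original indices; only `dest' is renumbered from 1 to N.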
- ---------- Footnotes ----------
+ Often, what's needed is to sort on the values of the _indices_
+instead of the values of the elements. To do that, use the `asorti()'
+function. The interface is identical to that of `asort()', except that
+the index values are used for sorting, and become the values of the
+result array:
- (1) The `xgettext' utility that comes with GNU `gettext' can handle
-`.awk' files.
+ { source[$0] = some_func($0) }
-
-File: gawk.info, Node: Printf Ordering, Next: I18N Portability, Prev: String Extraction, Up: Translator i18n
+ END {
+ n = asorti(source, dest)
+ for (i = 1; i <= n; i++) {
+ Work with sorted indices directly:
+ DO SOMETHING WITH dest[i]
+ ...
+ Access original array via sorted indices:
+ DO SOMETHING WITH source[dest[i]]
+ }
+ }
-12.4.2 Rearranging `printf' Arguments
--------------------------------------
+ Similar to `asort()', in all cases, the sorted element values
+consist of the original array's indices. The ability to control
+comparison merely affects the way in which they are sorted.
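The skeleton above can be made concrete along these lines. This is only a sketch with invented data (`stock' and `names' are hypothetical names); it shows the sorted indices being used to reach back into the original array:

```awk
BEGIN {
    stock["pear"]  = 3
    stock["apple"] = 7
    stock["fig"]   = 1

    # names[1..n] receives the indices of stock, sorted as strings
    n = asorti(stock, names)

    for (i = 1; i <= n; i++)
        printf("%s -> %d\n", names[i], stock[names[i]])
}
```

With this data the loop visits `apple', `fig', and `pear', in that order.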
-Format strings for `printf' and `sprintf()' (*note Printf::) present a
-special problem for translation. Consider the following:(1)
+ Sorting the array by replacing the indices provides maximal
+flexibility. To traverse the elements in decreasing order, use a loop
+that goes from N down to 1, either over the elements or over the
+indices.(1)
- printf(_"String `%s' has %d characters\n",
- string, length(string)))
+ Copying array indices and elements isn't expensive in terms of
+memory. Internally, `gawk' maintains "reference counts" to data. For
+example, when `asort()' copies the first array to the second one, there
+is only one copy of the original array elements' data, even though both
+arrays use the values.
- A possible German translation for this might be:
+ Because `IGNORECASE' affects string comparisons, the value of
+`IGNORECASE' also affects sorting for both `asort()' and `asorti()'.
+Note also that the locale's sorting order does _not_ come into play;
+comparisons are based on character values only.(2) Caveat Emptor.
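A small sketch of the `IGNORECASE' effect, using made-up data and assuming an ASCII-compatible character set, where uppercase letters have lower character values than lowercase ones:

```awk
BEGIN {
    split("apple Banana cherry", words, " ")

    IGNORECASE = 0
    asort(words, plain)     # character values: "B" sorts before "a"

    IGNORECASE = 1
    asort(words, folded)    # case ignored: apple, Banana, cherry

    IGNORECASE = 0          # restore the default
    printf("%s %s %s\n", plain[1], plain[2], plain[3])
    printf("%s %s %s\n", folded[1], folded[2], folded[3])
}
```

The first line of output starts with `Banana', the second with `apple'.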
- "%d Zeichen lang ist die Zeichenkette `%s'\n"
+ ---------- Footnotes ----------
- The problem should be obvious: the order of the format
-specifications is different from the original! Even though `gettext()'
-can return the translated string at runtime, it cannot change the
-argument order in the call to `printf'.
+ (1) You may also use one of the predefined sorting names that sorts
+in decreasing order.
- To solve this problem, `printf' format specifiers may have an
-additional optional element, which we call a "positional specifier".
-For example:
+ (2) This is true because locale-based comparison occurs only when in
+POSIX compatibility mode, and since `asort()' and `asorti()' are `gawk'
+extensions, they are not available in that case.
- "%2$d Zeichen lang ist die Zeichenkette `%1$s'\n"
+
+File: gawk.info, Node: Two-way I/O, Next: TCP/IP Networking, Prev: Array Sorting, Up: Advanced Features
- Here, the positional specifier consists of an integer count, which
-indicates which argument to use, and a `$'. Counts are one-based, and
-the format string itself is _not_ included. Thus, in the following
-example, `string' is the first argument and `length(string)' is the
-second:
+12.3 Two-Way Communications with Another Process
+================================================
- $ gawk 'BEGIN {
- > string = "Dont Panic"
- > printf _"%2$d characters live in \"%1$s\"\n",
- > string, length(string)
- > }'
- -| 10 characters live in "Dont Panic"
+ From: address@hidden (Mike Brennan)
+ Newsgroups: comp.lang.awk
+ Subject: Re: Learn the SECRET to Attract Women Easily
+ Date: 4 Aug 1997 17:34:46 GMT
+ Message-ID: <address@hidden>
- If present, positional specifiers come first in the format
-specification, before the flags, the field width, and/or the precision.
+ On 3 Aug 1997 13:17:43 GMT, Want More Dates???
+ <address@hidden> wrote:
+ >Learn the SECRET to Attract Women Easily
+ >
+ >The SCENT(tm) Pheromone Sex Attractant For Men to Attract Women
- Positional specifiers can be used with the dynamic field width and
-precision capability:
+ The scent of awk programmers is a lot more attractive to women than
+ the scent of perl programmers.
+ --
+ Mike Brennan
- $ gawk 'BEGIN {
- > printf("%*.*s\n", 10, 20, "hello")
- > printf("%3$*2$.*1$s\n", 20, 10, "hello")
- > }'
- -| hello
- -| hello
+ It is often useful to be able to send data to a separate program for
+processing and then read the result. This can always be done with
+temporary files:
- NOTE: When using `*' with a positional specifier, the `*' comes
- first, then the integer position, and then the `$'. This is
- somewhat counterintuitive.
+ # Write the data for processing
+ tempfile = ("mydata." PROCINFO["pid"])
+ while (NOT DONE WITH DATA)
+ print DATA | ("subprogram > " tempfile)
+ close("subprogram > " tempfile)
- `gawk' does not allow you to mix regular format specifiers and those
-with positional specifiers in the same string:
+ # Read the results, remove tempfile when done
+ while ((getline newdata < tempfile) > 0)
+ PROCESS newdata APPROPRIATELY
+ close(tempfile)
+ system("rm " tempfile)
- $ gawk 'BEGIN { printf _"%d %3$s\n", 1, 2, "hi" }'
- error--> gawk: cmd. line:1: fatal: must use `count$' on all formats or none
+This works, but not elegantly. Among other things, it requires that
+the program be run in a directory that cannot be shared among users;
+for example, `/tmp' will not do, as another user might happen to be
+using a temporary file with the same name.
- NOTE: There are some pathological cases that `gawk' may fail to
- diagnose. In such cases, the output may not be what you expect.
- It's still a bad idea to try mixing them, even if `gawk' doesn't
- detect it.
+ However, with `gawk', it is possible to open a _two-way_ pipe to
+another process. The second process is termed a "coprocess", since it
+runs in parallel with `gawk'. The two-way connection is created using
+the `|&' operator (borrowed from the Korn shell, `ksh'):(1)
- Although positional specifiers can be used directly in `awk'
-programs, their primary purpose is to help in producing correct
-translations of format strings into languages different from the one in
-which the program is first written.
+ do {
+ print DATA |& "subprogram"
+ "subprogram" |& getline results
+ } while (DATA LEFT TO PROCESS)
+ close("subprogram")
- ---------- Footnotes ----------
+ The first time an I/O operation is executed using the `|&' operator,
+`gawk' creates a two-way pipeline to a child process that runs the
+other program. Output created with `print' or `printf' is written to
+the program's standard input, and output from the program's standard
+output can be read by the `gawk' program using `getline'. As is the
+case with processes started by `|', the subprogram can be any program,
+or pipeline of programs, that can be started by the shell.
- (1) This example is borrowed from the GNU `gettext' manual.
+ There are some cautionary items to be aware of:
-
-File: gawk.info, Node: I18N Portability, Prev: Printf Ordering, Up: Translator i18n
+ * As the code inside `gawk' currently stands, the coprocess's
+ standard error goes to the same place that the parent `gawk''s
+ standard error goes. It is not possible to read the child's
+ standard error separately.
-12.4.3 `awk' Portability Issues
--------------------------------
+ * I/O buffering may be a problem. `gawk' automatically flushes all
+ output down the pipe to the coprocess. However, if the coprocess
+ does not flush its output, `gawk' may hang when doing a `getline'
+ in order to read the coprocess's results. This could lead to a
+ situation known as "deadlock", where each process is waiting for
+ the other one to do something.
-`gawk''s internationalization features were purposely chosen to have as
-little impact as possible on the portability of `awk' programs that use
-them to other versions of `awk'. Consider this program:
+ It is possible to close just one end of the two-way pipe to a
+coprocess, by supplying a second argument to the `close()' function of
+either `"to"' or `"from"' (*note Close Files And Pipes::). These
+strings tell `gawk' to close the end of the pipe that sends data to the
+coprocess or the end that reads from it, respectively.
- BEGIN {
- TEXTDOMAIN = "guide"
- if (Test_Guide) # set with -v
- bindtextdomain("/test/guide/messages")
- print _"don't panic!"
- }
+ This is particularly necessary in order to use the system `sort'
+utility as part of a coprocess; `sort' must read _all_ of its input
+data before it can produce any output. The `sort' program does not
+receive an end-of-file indication until `gawk' closes the write end of
+the pipe.
-As written, it won't work on other versions of `awk'. However, it is
-actually almost portable, requiring very little change:
+ When you have finished writing data to the `sort' utility, you can
+close the `"to"' end of the pipe, and then start reading sorted data
+via `getline'. For example:
- * Assignments to `TEXTDOMAIN' won't have any effect, since
- `TEXTDOMAIN' is not special in other `awk' implementations.
+ BEGIN {
+ command = "LC_ALL=C sort"
+ n = split("abcdefghijklmnopqrstuvwxyz", a, "")
- * Non-GNU versions of `awk' treat marked strings as the
- concatenation of a variable named `_' with the string following
- it.(1) Typically, the variable `_' has the null string (`""') as
- its value, leaving the original string constant as the result.
+ for (i = n; i > 0; i--)
+ print a[i] |& command
+ close(command, "to")
- * By defining "dummy" functions to replace `dcgettext()',
- `dcngettext()' and `bindtextdomain()', the `awk' program can be
- made to run, but all the messages are output in the original
- language. For example:
+ while ((command |& getline line) > 0)
+ print "got", line
+ close(command)
+ }
- function bindtextdomain(dir, domain)
- {
- return dir
- }
+ This program writes the letters of the alphabet in reverse order, one
+per line, down the two-way pipe to `sort'. It then closes the write
+end of the pipe, so that `sort' receives an end-of-file indication.
+This causes `sort' to sort the data and write the sorted data back to
+the `gawk' program. Once all of the data has been read, `gawk'
+terminates the coprocess and exits.
- function dcgettext(string, domain, category)
- {
- return string
- }
+ As a side note, the assignment `LC_ALL=C' in the `sort' command
+ensures traditional Unix (ASCII) sorting from `sort'.
- function dcngettext(string1, string2, number, domain, category)
- {
- return (number == 1 ? string1 : string2)
- }
+ You may also use pseudo-ttys (ptys) for two-way communication
+instead of pipes, if your system supports them. This is done on a
+per-command basis, by setting a special element in the `PROCINFO' array
+(*note Auto-set::), like so:
- * The use of positional specifications in `printf' or `sprintf()' is
- _not_ portable. To support `gettext()' at the C level, many
- systems' C versions of `sprintf()' do support positional
- specifiers. But it works only if enough arguments are supplied in
- the function call. Many versions of `awk' pass `printf' formats
- and arguments unchanged to the underlying C library version of
- `sprintf()', but only one format and argument at a time. What
- happens if a positional specification is used is anybody's guess.
- However, since the positional specifications are primarily for use
- in _translated_ format strings, and since non-GNU `awk's never
- retrieve the translated string, this should not be a problem in
- practice.
+ command = "sort -nr" # command, save in convenience variable
+ PROCINFO[command, "pty"] = 1 # update PROCINFO
+ print ... |& command # start two-way pipe
+ ...
+
+Using ptys avoids the buffer deadlock issues described earlier, at some
+loss in performance. If your system does not have ptys, or if all the
+system's ptys are in use, `gawk' automatically falls back to using
+regular pipes.
---------- Footnotes ----------
- (1) This is good fodder for an "Obfuscated `awk'" contest.
+ (1) This is very different from the same operator in the C shell.
-File: gawk.info, Node: I18N Example, Next: Gawk I18N, Prev: Translator i18n, Up: Internationalization
+File: gawk.info, Node: TCP/IP Networking, Next: Profiling, Prev: Two-way I/O, Up: Advanced Features
-12.5 A Simple Internationalization Example
-==========================================
+12.4 Using `gawk' for Network Programming
+=========================================
-Now let's look at a step-by-step example of how to internationalize and
-localize a simple `awk' program, using `guide.awk' as our original
-source:
+ `EMISTERED':
+ A host is a host from coast to coast,
+ and no-one can talk to host that's close,
+ unless the host that isn't close
+ is busy hung or dead.
- BEGIN {
- TEXTDOMAIN = "guide"
- bindtextdomain(".") # for testing
- print _"Don't Panic"
- print _"The Answer Is", 42
- print "Pardon me, Zaphod who?"
- }
+ In addition to being able to open a two-way pipeline to a coprocess
+on the same system (*note Two-way I/O::), it is possible to make a
+two-way connection to another process on another system across an IP
+network connection.
-Run `gawk --gen-pot' to create the `.pot' file:
+ You can think of this as just a _very long_ two-way pipeline to a
+coprocess. The way `gawk' decides that you want to use TCP/IP
+networking is by recognizing special file names that begin with one of
+`/inet/', `/inet4/' or `/inet6/'.
- $ gawk --gen-pot -f guide.awk > guide.pot
+ The full syntax of the special file name is
+`/NET-TYPE/PROTOCOL/LOCAL-PORT/REMOTE-HOST/REMOTE-PORT'. The
+components are:
-This produces:
+NET-TYPE
+ Specifies the kind of Internet connection to make. Use `/inet4/'
+ to force IPv4, and `/inet6/' to force IPv6. Plain `/inet/' (which
+ used to be the only option) uses the system default, most likely
+ IPv4.
- #: guide.awk:4
- msgid "Don't Panic"
- msgstr ""
+PROTOCOL
+ The protocol to use over IP. This must be either `tcp' or `udp',
+ for a TCP or UDP IP connection, respectively. The use of TCP is
+ recommended for most applications.
- #: guide.awk:5
- msgid "The Answer Is"
- msgstr ""
+LOCAL-PORT
+ The local TCP or UDP port number to use. Use a port number of `0'
+ when you want the system to pick a port. This is what you should do
+ when writing a TCP or UDP client. You may also use a well-known
+ service name, such as `smtp' or `http', in which case `gawk'
+ attempts to determine the predefined port number using the C
+ `getaddrinfo()' function.
- This original portable object template file is saved and reused for
-each language into which the application is translated. The `msgid' is
-the original string and the `msgstr' is the translation.
+REMOTE-HOST
+ The IP address or fully-qualified domain name of the Internet host
+ to which you want to connect.
- NOTE: Strings not marked with a leading underscore do not appear
- in the `guide.pot' file.
+REMOTE-PORT
+ The TCP or UDP port number to use on the given REMOTE-HOST.
+ Again, use `0' if you don't care, or else a well-known service
+ name.
- Next, the messages must be translated. Here is a translation to a
-hypothetical dialect of English, called "Mellow":(1)
+ NOTE: Failure in opening a two-way socket will result in a
+ non-fatal error being returned to the calling code. The value of
+ `ERRNO' indicates the error (*note Auto-set::).
- $ cp guide.pot guide-mellow.po
- ADD TRANSLATIONS TO guide-mellow.po ...
+ Consider the following very simple example:
-Following are the translations:
+ BEGIN {
+ Service = "/inet/tcp/0/localhost/daytime"
+ Service |& getline
+ print $0
+ close(Service)
+ }
- #: guide.awk:4
- msgid "Don't Panic"
- msgstr "Hey man, relax!"
+ This program reads the current date and time from the local system's
+TCP `daytime' server. It then prints the results and closes the
+connection.
- #: guide.awk:5
- msgid "The Answer Is"
- msgstr "Like, the scoop is"
+ Because this topic is extensive, the use of `gawk' for TCP/IP
+programming is documented separately. See *note (General
+Introduction)Top:: gawkinet, TCP/IP Internetworking with `gawk', for a
+much more complete introduction and discussion, as well as extensive
+examples.
- The next step is to make the directory to hold the binary message
-object file and then to create the `guide.mo' file. The directory
-layout shown here is standard for GNU `gettext' on GNU/Linux systems.
-Other versions of `gettext' may use a different layout:
+
+File: gawk.info, Node: Profiling, Prev: TCP/IP Networking, Up: Advanced Features
- $ mkdir en_US en_US/LC_MESSAGES
+12.5 Profiling Your `awk' Programs
+==================================
- The `msgfmt' utility does the conversion from human-readable `.po'
-file to machine-readable `.mo' file. By default, `msgfmt' creates a
-file named `messages'. This file must be renamed and placed in the
-proper directory so that `gawk' can find it:
+You may produce execution traces of your `awk' programs. This is done
+by passing the option `--profile' to `gawk'. When `gawk' has finished
+running, it creates a profile of your program in a file named
+`awkprof.out'. Because it is profiling, it also executes up to 45%
+slower than `gawk' normally does.
- $ msgfmt guide-mellow.po
- $ mv messages en_US/LC_MESSAGES/guide.mo
+ As shown in the following example, the `--profile' option can be
+used to change the name of the file where `gawk' will write the profile:
- Finally, we run the program to test it:
+ gawk --profile=myprog.prof -f myprog.awk data1 data2
- $ gawk -f guide.awk
- -| Hey man, relax!
- -| Like, the scoop is 42
- -| Pardon me, Zaphod who?
+In the above example, `gawk' places the profile in `myprog.prof'
+instead of in `awkprof.out'.
- If the three replacement functions for `dcgettext()', `dcngettext()'
-and `bindtextdomain()' (*note I18N Portability::) are in a file named
-`libintl.awk', then we can run `guide.awk' unchanged as follows:
+ Here is a sample session showing a simple `awk' program, its input
+data, and the results from running `gawk' with the `--profile' option.
+First, the `awk' program:
- $ gawk --posix -f guide.awk -f libintl.awk
- -| Don't Panic
- -| The Answer Is 42
- -| Pardon me, Zaphod who?
+ BEGIN { print "First BEGIN rule" }
- ---------- Footnotes ----------
+ END { print "First END rule" }
- (1) Perhaps it would be better if it were called "Hippy." Ah, well.
+ /foo/ {
+ print "matched /foo/, gosh"
+ for (i = 1; i <= 3; i++)
+ sing()
+ }
-
-File: gawk.info, Node: Gawk I18N, Prev: I18N Example, Up: Internationalization
+ {
+ if (/foo/)
+ print "if is true"
+ else
+ print "else is true"
+ }
-12.6 `gawk' Can Speak Your Language
-===================================
+ BEGIN { print "Second BEGIN rule" }
-`gawk' itself has been internationalized using the GNU `gettext'
-package. (GNU `gettext' is described in complete detail in *note (GNU
-`gettext' utilities)Top:: gettext, GNU gettext tools.) As of this
-writing, the latest version of GNU `gettext' is version 0.18.2.1
-(ftp://ftp.gnu.org/gnu/gettext/gettext-0.18.2.1.tar.gz).
+ END { print "Second END rule" }
- If a translation of `gawk''s messages exists, then `gawk' produces
-usage messages, warnings, and fatal errors in the local language.
+ function sing( dummy)
+ {
+ print "I gotta be me!"
+ }
-
-File: gawk.info, Node: Advanced Features, Next: Debugger, Prev: Internationalization, Up: Top
+ Following is the input data:
-13 Advanced Features of `gawk'
-******************************
+ foo
+ bar
+ baz
+ foo
+ junk
- Write documentation as if whoever reads it is a violent psychopath
- who knows where you live.
- Steve English, as quoted by Peter Langston
+ Here is the `awkprof.out' that results from running the `gawk'
+profiler on this program and data (this example also illustrates that
+`awk' programmers sometimes have to work late):
- This major node discusses advanced features in `gawk'. It's a bit
-of a "grab bag" of items that are otherwise unrelated to each other.
-First, a command-line option allows `gawk' to recognize nondecimal
-numbers in input data, not just in `awk' programs. Then, `gawk''s
-special features for sorting arrays are presented. Next, two-way I/O,
-discussed briefly in earlier parts of this Info file, is described in
-full detail, along with the basics of TCP/IP networking. Finally,
-`gawk' can "profile" an `awk' program, making it possible to tune it
-for performance.
+ # gawk profile, created Sun Aug 13 00:00:15 2000
- A number of advanced features require separate major nodes of their
-own:
+ # BEGIN block(s)
- * *note Internationalization::, discusses how to internationalize
- your `awk' programs, so that they can speak multiple national
- languages.
+ BEGIN {
+ 1 print "First BEGIN rule"
+ 1 print "Second BEGIN rule"
+ }
- * *note Debugger::, describes `gawk''s built-in command-line
- debugger for debugging `awk' programs.
+ # Rule(s)
- * *note Arbitrary Precision Arithmetic::, describes how you can use
- `gawk' to perform arbitrary-precision arithmetic.
+ 5 /foo/ { # 2
+ 2 print "matched /foo/, gosh"
+ 6 for (i = 1; i <= 3; i++) {
+ 6 sing()
+ }
+ }
- * *note Dynamic Extensions::, discusses the ability to dynamically
- add new built-in functions to `gawk'.
+ 5 {
+ 5 if (/foo/) { # 2
+ 2 print "if is true"
+ 3 } else {
+ 3 print "else is true"
+ }
+ }
-* Menu:
-
-* Nondecimal Data:: Allowing nondecimal input data.
-* Array Sorting:: Facilities for controlling array traversal and
- sorting arrays.
-* Two-way I/O:: Two-way communications with another process.
-* TCP/IP Networking:: Using `gawk' for network programming.
-* Profiling:: Profiling your `awk' programs.
-
-
-File: gawk.info, Node: Nondecimal Data, Next: Array Sorting, Up: Advanced Features
-
-13.1 Allowing Nondecimal Input Data
-===================================
+ # END block(s)
-If you run `gawk' with the `--non-decimal-data' option, you can have
-nondecimal constants in your input data:
+ END {
+ 1 print "First END rule"
+ 1 print "Second END rule"
+ }
- $ echo 0123 123 0x123 |
- > gawk --non-decimal-data '{ printf "%d, %d, %d\n",
- > $1, $2, $3 }'
- -| 83, 123, 291
+ # Functions, listed alphabetically
- For this feature to work, write your program so that `gawk' treats
-your data as numeric:
+ 6 function sing(dummy)
+ {
+ 6 print "I gotta be me!"
+ }
- $ echo 0123 123 0x123 | gawk '{ print $1, $2, $3 }'
- -| 0123 123 0x123
+ This example illustrates many of the basic features of profiling
+output. They are as follows:
-The `print' statement treats its expressions as strings. Although the
-fields can act as numbers when necessary, they are still strings, so
-`print' does not try to treat them numerically. You may need to add
-zero to a field to force it to be treated as a number. For example:
+ * The program is printed in the order `BEGIN' rule, `BEGINFILE' rule,
+ pattern/action rules, `ENDFILE' rule, `END' rule and functions,
+ listed alphabetically. Multiple `BEGIN' and `END' rules are
+ merged together, as are multiple `BEGINFILE' and `ENDFILE' rules.
- $ echo 0123 123 0x123 | gawk --non-decimal-data '
- > { print $1, $2, $3
- > print $1 + 0, $2 + 0, $3 + 0 }'
- -| 0123 123 0x123
- -| 83 123 291
+ * Pattern-action rules have two counts. The first count, to the
+ left of the rule, shows how many times the rule's pattern was
+ _tested_. The second count, to the right of the rule's opening
+ left brace in a comment, shows how many times the rule's action
+ was _executed_. The difference between the two indicates how many
+ times the rule's pattern evaluated to false.
- Because it is common to have decimal data with leading zeros, and
-because using this facility could lead to surprising results, the
-default is to leave it disabled. If you want it, you must explicitly
-request it.
+ * Similarly, the count for an `if'-`else' statement shows how many
+ times the condition was tested. To the right of the opening left
+ brace for the `if''s body is a count showing how many times the
+ condition was true. The count for the `else' indicates how many
+ times the test failed.
- CAUTION: _Use of this option is not recommended._ It can break old
- programs very badly. Instead, use the `strtonum()' function to
- convert your data (*note Nondecimal-numbers::). This makes your
- programs easier to write and easier to read, and leads to less
- surprising results.
+ * The count for a loop header (such as `for' or `while') shows how
+ many times the loop test was executed. (Because of this, you
+ can't just look at the count on the first statement in a rule to
+ determine how many times the rule was executed. If the first
+ statement is a loop, the count is misleading.)
-
-File: gawk.info, Node: Array Sorting, Next: Two-way I/O, Prev: Nondecimal Data, Up: Advanced Features
+ * For user-defined functions, the count next to the `function'
+ keyword indicates how many times the function was called. The
+ counts next to the statements in the body show how many times
+ those statements were executed.
-13.2 Controlling Array Traversal and Array Sorting
-==================================================
+ * The layout uses "K&R" style with TABs. Braces are used
+ everywhere, even when the body of an `if', `else', or loop is only
+ a single statement.
-`gawk' lets you control the order in which a `for (i in array)' loop
-traverses an array.
+ * Parentheses are used only where needed, as indicated by the
+ structure of the program and the precedence rules. For example,
+ `(3 + 5) * 4' means add three and five, then multiply the total
+ by four. However, `3 + 5 * 4' has no parentheses, and means `3 +
+ (5 * 4)'.
- In addition, two built-in functions, `asort()' and `asorti()', let
-you sort arrays based on the array values and indices, respectively.
-These two functions also provide control over the sorting criteria used
-to order the elements during sorting.
+ * Parentheses are used around the arguments to `print' and `printf'
+ only when the `print' or `printf' statement is followed by a
+ redirection. Similarly, if the target of a redirection isn't a
+ scalar, it gets parenthesized.
-* Menu:
+ * `gawk' supplies leading comments in front of the `BEGIN' and `END'
+ rules, the pattern/action rules, and the functions.
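The two counts shown for a pattern-action rule can be reproduced by hand in plain `awk'. The sketch below is an illustration only, not `gawk''s profiler: the shell function name `count_rule' and the variables `tested' and `executed' are hypothetical, and the five input records are assumed so that the totals mirror the sample profile (five records, two of which contain `foo').

```shell
# count_rule: hand-rolled version of the two profile counts for /foo/.
# "tested" counts every record the pattern is checked against;
# "executed" counts only the records for which the action would run.
count_rule() {
    awk '
        { tested++ }
        /foo/ { executed++ }
        END {
            printf "tested=%d executed=%d false=%d\n",
                   tested, executed, tested - executed
        }
    '
}

# Five records, two of which match /foo/.
printf 'foo\nbar\nbaz\nfoo\njunk\n' | count_rule
# prints: tested=5 executed=2 false=3
```

The difference between the two counts, 3 here, is the number of records for which the pattern evaluated to false.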
-* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
-* Array Sorting Functions:: How to use `asort()' and `asorti()'.
-
-File: gawk.info, Node: Controlling Array Traversal, Next: Array Sorting Functions, Up: Array Sorting
+ The profiled version of your program may not look exactly like what
+you typed when you wrote it. This is because `gawk' creates the
+profiled version by "pretty printing" its internal representation of
+the program. The advantage to this is that `gawk' can produce a
+standard representation. The disadvantage is that all source-code
+comments are lost, as are the distinctions among multiple `BEGIN',
+`END', `BEGINFILE', and `ENDFILE' rules. Also, things such as:
-13.2.1 Controlling Array Traversal
-----------------------------------
+ /foo/
-By default, the order in which a `for (i in array)' loop scans an array
-is not defined; it is generally based upon the internal implementation
-of arrays inside `awk'.
+come out as:
- Often, though, it is desirable to be able to loop over the elements
-in a particular order that you, the programmer, choose. `gawk' lets
-you do this.
+ /foo/ {
+ print $0
+ }
- *note Controlling Scanning::, describes how you can assign special,
-pre-defined values to `PROCINFO["sorted_in"]' in order to control the
-order in which `gawk' will traverse an array during a `for' loop.
+which is correct, but possibly surprising.
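That a bare pattern and its expanded form behave identically can be checked with any POSIX `awk'. This is a minimal sketch: the variable names `input', `bare', and `explicit' and the sample records are assumptions chosen for illustration.

```shell
# A pattern with no action prints each matching record, exactly as if
# the action { print $0 } had been written out.
input='foo one
bar two
foo three'

bare=$(printf '%s\n' "$input" | awk '/foo/')
explicit=$(printf '%s\n' "$input" | awk '/foo/ { print $0 }')

if [ "$bare" = "$explicit" ]; then
    echo "same output"
fi
```

Because both forms compile to the same internal representation, the pretty printer has no way to tell which one was originally typed.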
- In addition, the value of `PROCINFO["sorted_in"]' can be a function
-name. This lets you traverse an array based on any custom criterion.
-The array elements are ordered according to the return value of this
-function. The comparison function should be defined with at least four
-arguments:
+ Besides creating profiles when a program has completed, `gawk' can
+produce a profile while it is running. This is useful if your `awk'
+program goes into an infinite loop and you want to see what has been
+executed. To use this feature, run `gawk' with the `--profile' option
+in the background:
- function comp_func(i1, v1, i2, v2)
- {
- COMPARE ELEMENTS 1 AND 2 IN SOME FASHION
- RETURN < 0; 0; OR > 0
- }
+ $ gawk --profile -f myprog &
+ [1] 13992
- Here, I1 and I2 are the indices, and V1 and V2 are the corresponding
-values of the two elements being compared. Either V1 or V2, or both,
-can be arrays if the array being traversed contains subarrays as values.
-(*Note Arrays of Arrays::, for more information about subarrays.) The
-three possible return values are interpreted as follows:
+The shell prints a job number and process ID number; in this case,
+13992. Use the `kill' command to send the `USR1' signal to `gawk':
-`comp_func(i1, v1, i2, v2) < 0'
- Index I1 comes before index I2 during loop traversal.
+ $ kill -USR1 13992
-`comp_func(i1, v1, i2, v2) == 0'
- Indices I1 and I2 come together but the relative order with
- respect to each other is undefined.
+As usual, the profiled version of the program is written to
+`awkprof.out', or to a different file if one is specified with the
+`--profile' option.
-`comp_func(i1, v1, i2, v2) > 0'
- Index I1 comes after index I2 during loop traversal.
+ Along with the regular profile, as shown earlier, the profile
+includes a trace of any active functions:
- Our first comparison function can be used to scan an array in
-numerical order of the indices:
+ # Function Call Stack:
- function cmp_num_idx(i1, v1, i2, v2)
- {
- # numerical index comparison, ascending order
- return (i1 - i2)
- }
+ # 3. baz
+ # 2. bar
+ # 1. foo
+ # -- main --
- Our second function traverses an array based on the string order of
-the element values rather than by indices:
+ You may send `gawk' the `USR1' signal as many times as you like.
+Each time, the profile and function call trace are appended to the
+output profile file.
- function cmp_str_val(i1, v1, i2, v2)
- {
- # string value comparison, ascending order
- v1 = v1 ""
- v2 = v2 ""
- if (v1 < v2)
- return -1
- return (v1 != v2)
- }
+ If you use the `HUP' signal instead of the `USR1' signal, `gawk'
+produces the profile and the function call trace and then exits.
- The third comparison function makes all numbers, and numeric strings
-without any leading or trailing spaces, come out first during loop
-traversal:
+ When `gawk' runs on MS-Windows systems, it uses the `INT' and `QUIT'
+signals for producing the profile and, in the case of the `INT' signal,
+`gawk' exits. This is because these systems don't support the `kill'
+command, so the only signals you can deliver to a program are those
+generated by the keyboard. The `INT' signal is generated by the
+`Ctrl-<C>' or `Ctrl-<BREAK>' key, while the `QUIT' signal is generated
+by the `Ctrl-<\>' key.
- function cmp_num_str_val(i1, v1, i2, v2, n1, n2)
- {
- # numbers before string value comparison, ascending order
- n1 = v1 + 0
- n2 = v2 + 0
- if (n1 == v1)
- return (n2 == v2) ? (n1 - n2) : -1
- else if (n2 == v2)
- return 1
- return (v1 < v2) ? -1 : (v1 != v2)
- }
+ Finally, `gawk' also accepts another option, `--pretty-print'. When
+called this way, `gawk' "pretty prints" the program into `awkprof.out',
+without any execution counts.
- Here is a main program to demonstrate how `gawk' behaves using each
-of the previous functions:
+
+File: gawk.info, Node: Internationalization, Next: Debugger, Prev: Advanced Features, Up: Top
- BEGIN {
- data["one"] = 10
- data["two"] = 20
- data[10] = "one"
- data[100] = 100
- data[20] = "two"
+13 Internationalization with `gawk'
+***********************************
- f[1] = "cmp_num_idx"
- f[2] = "cmp_str_val"
- f[3] = "cmp_num_str_val"
- for (i = 1; i <= 3; i++) {
- printf("Sort function: %s\n", f[i])
- PROCINFO["sorted_in"] = f[i]
- for (j in data)
- printf("\tdata[%s] = %s\n", j, data[j])
- print ""
- }
- }
+Once upon a time, computer makers wrote software that worked only in
+English. Eventually, hardware and software vendors noticed that if
+their systems worked in the native languages of non-English-speaking
+countries, they were able to sell more systems. As a result,
+internationalization and localization of programs and software systems
+became a common practice.
- Here are the results when the program is run:
+ For many years, the ability to provide internationalization was
+largely restricted to programs written in C and C++. This major node
+describes the underlying library `gawk' uses for internationalization,
+as well as how `gawk' makes internationalization features available at
+the `awk' program level. Having internationalization available at the
+`awk' level gives software developers additional flexibility--they are
+no longer forced to write in C or C++ when internationalization is a
+requirement.
- $ gawk -f compdemo.awk
- -| Sort function: cmp_num_idx Sort by numeric index
- -| data[two] = 20
- -| data[one] = 10 Both strings are numerically zero
- -| data[10] = one
- -| data[20] = two
- -| data[100] = 100
- -|
- -| Sort function: cmp_str_val Sort by element values as strings
- -| data[one] = 10
- -| data[100] = 100 String 100 is less than string 20
- -| data[two] = 20
- -| data[10] = one
- -| data[20] = two
- -|
- -| Sort function: cmp_num_str_val Sort all numeric values before all strings
- -| data[one] = 10
- -| data[two] = 20
- -| data[100] = 100
- -| data[10] = one
- -| data[20] = two
+* Menu:
- Consider sorting the entries of a GNU/Linux system password file
-according to login name. The following program sorts records by a
-specific field position and can be used for this purpose:
+* I18N and L10N:: Internationalization and Localization.
+* Explaining gettext:: How GNU `gettext' works.
+* Programmer i18n:: Features for the programmer.
+* Translator i18n:: Features for the translator.
+* I18N Example:: A simple i18n example.
+* Gawk I18N:: `gawk' is also internationalized.
- # sort.awk --- simple program to sort by field position
- # field position is specified by the global variable POS
+
+File: gawk.info, Node: I18N and L10N, Next: Explaining gettext, Up: Internationalization
- function cmp_field(i1, v1, i2, v2)
- {
- # comparison by value, as string, and ascending order
- return v1[POS] < v2[POS] ? -1 : (v1[POS] != v2[POS])
- }
+13.1 Internationalization and Localization
+==========================================
- {
- for (i = 1; i <= NF; i++)
- a[NR][i] = $i
- }
+"Internationalization" means writing (or modifying) a program once, in
+such a way that it can use multiple languages without requiring further
+source-code changes. "Localization" means providing the data necessary
+for an internationalized program to work in a particular language.
+Most typically, these terms refer to features such as the language used
+for printing error messages, the language used to read responses, and
+information related to how numerical and monetary values are printed
+and read.
- END {
- PROCINFO["sorted_in"] = "cmp_field"
- if (POS < 1 || POS > NF)
- POS = 1
- for (i in a) {
- for (j = 1; j <= NF; j++)
- printf("%s%c", a[i][j], j < NF ? ":" : "")
- print ""
- }
- }
+
+File: gawk.info, Node: Explaining gettext, Next: Programmer i18n, Prev: I18N and L10N, Up: Internationalization
- The first field in each entry of the password file is the user's
-login name, and the fields are separated by colons. Each record
-defines a subarray, with each field as an element in the subarray.
-Running the program produces the following output:
+13.2 GNU `gettext'
+==================
- $ gawk -v POS=1 -F: -f sort.awk /etc/passwd
- -| adm:x:3:4:adm:/var/adm:/sbin/nologin
- -| apache:x:48:48:Apache:/var/www:/sbin/nologin
- -| avahi:x:70:70:Avahi daemon:/:/sbin/nologin
- ...
+The facilities in GNU `gettext' focus on messages: strings printed by a
+program, either directly or via formatting with `printf' or
+`sprintf()'.(1)
- The comparison should normally always return the same value when
-given a specific pair of array elements as its arguments. If
-inconsistent results are returned then the order is undefined. This
-behavior can be exploited to introduce random order into otherwise
-seemingly ordered data:
+ When using GNU `gettext', each application has its own "text
+domain". This is a unique name, such as `kpilot' or `gawk', that
+identifies the application. A complete application may have multiple
+components--programs written in C or C++, as well as scripts written in
+`sh' or `awk'. All of the components use the same text domain.
- function cmp_randomize(i1, v1, i2, v2)
- {
- # random order
- return (2 - 4 * rand())
- }
+ To make the discussion concrete, assume we're writing an application
+named `guide'. Internationalization consists of the following steps,
+in this order:
- As mentioned above, the order of the indices is arbitrary if two
-elements compare equal. This is usually not a problem, but letting the
-tied elements come out in arbitrary order can be an issue, especially
-when comparing item values. The partial ordering of the equal elements
-may change during the next loop traversal, if other elements are added
-or removed from the array. One way to resolve ties when comparing
-elements with otherwise equal values is to include the indices in the
-comparison rules. Note that doing this may make the loop traversal
-less efficient, so consider it only if necessary. The following
-comparison functions force a deterministic order, and are based on the
-fact that the indices of two elements are never equal:
+ 1. The programmer goes through the source for all of `guide''s
+ components and marks each string that is a candidate for
+ translation. For example, `"`-F': option required"' is a good
+ candidate for translation. A table with strings of option names
+ is not (e.g., `gawk''s `--profile' option should remain the same,
+ no matter what the local language).
- function cmp_numeric(i1, v1, i2, v2)
- {
- # numerical value (and index) comparison, descending order
- return (v1 != v2) ? (v2 - v1) : (i2 - i1)
- }
+ 2. The programmer indicates the application's text domain (`"guide"')
+ to the `gettext' library, by calling the `textdomain()' function.
- function cmp_string(i1, v1, i2, v2)
- {
- # string value (and index) comparison, descending order
- v1 = v1 i1
- v2 = v2 i2
- return (v1 > v2) ? -1 : (v1 != v2)
- }
+ 3. Messages from the application are extracted from the source code
+ and collected into a portable object template file (`guide.pot'),
+ which lists the strings and their translations. The translations
+ are initially empty. The original (usually English) messages
+ serve as the key for lookup of the translations.
- A custom comparison function can often simplify ordered loop
-traversal, and the sky is really the limit when it comes to designing
-such a function.
+ 4. For each language with a translator, `guide.pot' is copied to a
+ portable object file (`.po') and translations are created and
+ shipped with the application. For example, there might be a
+ `fr.po' for a French translation.
- When string comparisons are made during a sort, either for element
-values where one or both aren't numbers, or for element indices handled
-as strings, the value of `IGNORECASE' (*note Built-in Variables::)
-controls whether the comparisons treat corresponding uppercase and
-lowercase letters as equivalent or distinct.
+ 5. Each language's `.po' file is converted into a binary message
+ object (`.mo') file. A message object file contains the original
+ messages and their translations in a binary format that allows
+ fast lookup of translations at runtime.
- Another point to keep in mind is that in the case of subarrays the
-element values can themselves be arrays; a production comparison
-function should use the `isarray()' function (*note Type Functions::),
-to check for this, and choose a defined sorting order for subarrays.
+ 6. When `guide' is built and installed, the binary translation files
+ are installed in a standard place.
- All sorting based on `PROCINFO["sorted_in"]' is disabled in POSIX
-mode, since the `PROCINFO' array is not special in that case.
+ 7. For testing and development, it is possible to tell `gettext' to
+ use `.mo' files in a different directory than the standard one by
+ using the `bindtextdomain()' function.
- As a side note, sorting the array indices before traversing the
-array has been reported to add 15% to 20% overhead to the execution
-time of `awk' programs. For this reason, sorted array traversal is not
-the default.
+ 8. At runtime, `guide' looks up each string via a call to
+ `gettext()'. The returned string is the translated string if
+ available, or the original string if not.
-
-File: gawk.info, Node: Array Sorting Functions, Prev: Controlling Array Traversal, Up: Array Sorting
+ 9. If necessary, it is possible to access messages from a different
+ text domain than the one belonging to the application, without
+ having to switch the application's default text domain back and
+ forth.
-13.2.2 Sorting Array Values and Indices with `gawk'
----------------------------------------------------
+ In C (or C++), the string marking and dynamic translation lookup are
+accomplished by wrapping each string in a call to `gettext()':
-In most `awk' implementations, sorting an array requires writing a
-`sort()' function. While this can be educational for exploring
-different sorting algorithms, usually that's not the point of the
-program. `gawk' provides the built-in `asort()' and `asorti()'
-functions (*note String Functions::) for sorting arrays. For example:
+ printf("%s", gettext("Don't Panic!\n"));
- POPULATE THE ARRAY data
- n = asort(data)
- for (i = 1; i <= n; i++)
- DO SOMETHING WITH data[i]
+ The tools that extract messages from source code pull out all
+strings enclosed in calls to `gettext()'.
- After the call to `asort()', the array `data' is indexed from 1 to
-some number N, the total number of elements in `data'. (This count is
-`asort()''s return value.) `data[1]' <= `data[2]' <= `data[3]', and so
-on. The comparison is based on the type of the elements (*note Typing
-and Comparison::). All numeric values come before all string values,
-which in turn come before all subarrays.
+ The GNU `gettext' developers, recognizing that typing `gettext(...)'
+over and over again is both painful and ugly to look at, use the macro
+`_' (an underscore) to make things easier:
- An important side effect of calling `asort()' is that _the array's
-original indices are irrevocably lost_. As this isn't always
-desirable, `asort()' accepts a second argument:
+ /* In the standard header file: */
+ #define _(str) gettext(str)
- POPULATE THE ARRAY source
- n = asort(source, dest)
- for (i = 1; i <= n; i++)
- DO SOMETHING WITH dest[i]
+ /* In the program text: */
+ printf("%s", _("Don't Panic!\n"));
- In this case, `gawk' copies the `source' array into the `dest' array
-and then sorts `dest', destroying its indices. However, the `source'
-array is not affected.
+This reduces the typing overhead to just three extra characters per
+string and is considerably easier to read as well.
- `asort()' accepts a third string argument to control comparison of
-array elements. As with `PROCINFO["sorted_in"]', this argument may be
-one of the predefined names that `gawk' provides (*note Controlling
-Scanning::), or the name of a user-defined function (*note Controlling
-Array Traversal::).
+ There are locale "categories" for different types of locale-related
+information. The defined locale categories that `gettext' knows about
+are:
- NOTE: In all cases, the sorted element values consist of the
- original array's element values. The ability to control
- comparison merely affects the way in which they are sorted.
+`LC_MESSAGES'
+ Text messages. This is the default category for `gettext'
+ operations, but it is possible to supply a different one
+ explicitly, if necessary. (It is almost never necessary to supply
+ a different category.)
- Often, what's needed is to sort on the values of the _indices_
-instead of the values of the elements. To do that, use the `asorti()'
-function. The interface is identical to that of `asort()', except that
-the index values are used for sorting, and become the values of the
-result array:
+`LC_COLLATE'
+ Text-collation information; i.e., how different characters and/or
+ groups of characters sort in a given language.
- { source[$0] = some_func($0) }
+`LC_CTYPE'
+ Character-type information (alphabetic, digit, upper- or
+ lowercase, and so on). This information is accessed via the POSIX
+ character classes in regular expressions, such as `/[[:alnum:]]/'
+ (*note Regexp Operators::).
- END {
- n = asorti(source, dest)
- for (i = 1; i <= n; i++) {
- Work with sorted indices directly:
- DO SOMETHING WITH dest[i]
- ...
- Access original array via sorted indices:
- DO SOMETHING WITH source[dest[i]]
- }
- }
+`LC_MONETARY'
+ Monetary information, such as the currency symbol, and whether the
+ symbol goes before or after a number.
- Similar to `asort()', in all cases, the sorted element values
-consist of the original array's indices. The ability to control
-comparison merely affects the way in which they are sorted.
+`LC_NUMERIC'
+ Numeric information, such as which characters to use for the
+ decimal point and the thousands separator.(2)
- Sorting the array by replacing the indices provides maximal
-flexibility. To traverse the elements in decreasing order, use a loop
-that goes from N down to 1, either over the elements or over the
-indices.(1)
+`LC_RESPONSE'
+ Response information, such as how "yes" and "no" appear in the
+ local language, and possibly other information as well.
- Copying array indices and elements isn't expensive in terms of
-memory. Internally, `gawk' maintains "reference counts" to data. For
-example, when `asort()' copies the first array to the second one, there
-is only one copy of the original array elements' data, even though both
-arrays use the values.
+`LC_TIME'
+ Time- and date-related information, such as 12- or 24-hour clock,
+ month printed before or after the day in a date, local month
+ abbreviations, and so on.
- Because `IGNORECASE' affects string comparisons, the value of
-`IGNORECASE' also affects sorting for both `asort()' and `asorti()'.
-Note also that the locale's sorting order does _not_ come into play;
-comparisons are based on character values only.(2) Caveat Emptor.
+`LC_ALL'
+ All of the above. (Not too useful in the context of `gettext'.)
---------- Footnotes ----------
- (1) You may also use one of the predefined sorting names that sorts
-in decreasing order.
+ (1) For some operating systems, the `gawk' port doesn't support GNU
+`gettext'. Therefore, these features are not available if you are
+using one of those operating systems. Sorry.
- (2) This is true because locale-based comparison occurs only when in
-POSIX compatibility mode, and since `asort()' and `asorti()' are `gawk'
-extensions, they are not available in that case.
+ (2) Americans use a comma every three decimal places and a period
+for the decimal point, while many Europeans do exactly the opposite:
+1,234.56 versus 1.234,56.
-File: gawk.info, Node: Two-way I/O, Next: TCP/IP Networking, Prev: Array Sorting, Up: Advanced Features
+File: gawk.info, Node: Programmer i18n, Next: Translator i18n, Prev: Explaining gettext, Up: Internationalization
-13.3 Two-Way Communications with Another Process
-================================================
+13.3 Internationalizing `awk' Programs
+======================================
- From: address@hidden (Mike Brennan)
- Newsgroups: comp.lang.awk
- Subject: Re: Learn the SECRET to Attract Women Easily
- Date: 4 Aug 1997 17:34:46 GMT
- Message-ID: <address@hidden>
+`gawk' provides the following variables and functions for
+internationalization:
- On 3 Aug 1997 13:17:43 GMT, Want More Dates???
- <address@hidden> wrote:
- >Learn the SECRET to Attract Women Easily
- >
- >The SCENT(tm) Pheromone Sex Attractant For Men to Attract Women
+`TEXTDOMAIN'
+ This variable indicates the application's text domain. For
+ compatibility with GNU `gettext', the default value is
+ `"messages"'.
- The scent of awk programmers is a lot more attractive to women than
- the scent of perl programmers.
- --
- Mike Brennan
+`_"your message here"'
+ String constants marked with a leading underscore are candidates
+ for translation at runtime. String constants without a leading
+ underscore are not translated.
- It is often useful to be able to send data to a separate program for
-processing and then read the result. This can always be done with
-temporary files:
+`dcgettext(STRING [, DOMAIN [, CATEGORY]])'
+ Return the translation of STRING in text domain DOMAIN for locale
+ category CATEGORY. The default value for DOMAIN is the current
+ value of `TEXTDOMAIN'. The default value for CATEGORY is
+ `"LC_MESSAGES"'.
- # Write the data for processing
- tempfile = ("mydata." PROCINFO["pid"])
- while (NOT DONE WITH DATA)
- print DATA | ("subprogram > " tempfile)
- close("subprogram > " tempfile)
+ If you supply a value for CATEGORY, it must be a string equal to
+ one of the known locale categories described in *note Explaining
+ gettext::. You must also supply a text domain. Use `TEXTDOMAIN'
+ if you want to use the current domain.
- # Read the results, remove tempfile when done
- while ((getline newdata < tempfile) > 0)
- PROCESS newdata APPROPRIATELY
- close(tempfile)
- system("rm " tempfile)
+ CAUTION: The order of arguments to the `awk' version of the
+ `dcgettext()' function is purposely different from the order
+ for the C version. The `awk' version's order was chosen to
+ be simple and to allow for reasonable `awk'-style default
+ arguments.
-This works, but not elegantly. Among other things, it requires that
-the program be run in a directory that cannot be shared among users;
-for example, `/tmp' will not do, as another user might happen to be
-using a temporary file with the same name.
+`dcngettext(STRING1, STRING2, NUMBER [, DOMAIN [, CATEGORY]])'
+ Return the plural form used for NUMBER of the translation of
+ STRING1 and STRING2 in text domain DOMAIN for locale category
+ CATEGORY. STRING1 is the English singular variant of a message,
+ and STRING2 the English plural variant of the same message. The
+ default value for DOMAIN is the current value of `TEXTDOMAIN'.
+ The default value for CATEGORY is `"LC_MESSAGES"'.
- However, with `gawk', it is possible to open a _two-way_ pipe to
-another process. The second process is termed a "coprocess", since it
-runs in parallel with `gawk'. The two-way connection is created using
-the `|&' operator (borrowed from the Korn shell, `ksh'):(1)
+ The same remarks about argument order as for the `dcgettext()'
+ function apply.
- do {
- print DATA |& "subprogram"
- "subprogram" |& getline results
- } while (DATA LEFT TO PROCESS)
- close("subprogram")
+`bindtextdomain(DIRECTORY [, DOMAIN])'
+ Change the directory in which `gettext' looks for `.mo' files, in
+ case they will not or cannot be placed in the standard locations
+ (e.g., during testing). Return the directory in which DOMAIN is
+ "bound."
- The first time an I/O operation is executed using the `|&' operator,
-`gawk' creates a two-way pipeline to a child process that runs the
-other program. Output created with `print' or `printf' is written to
-the program's standard input, and output from the program's standard
-output can be read by the `gawk' program using `getline'. As is the
-case with processes started by `|', the subprogram can be any program,
-or pipeline of programs, that can be started by the shell.
+ The default DOMAIN is the value of `TEXTDOMAIN'. If DIRECTORY is
+ the null string (`""'), then `bindtextdomain()' returns the
+ current binding for the given DOMAIN.
- There are some cautionary items to be aware of:
+ To use these facilities in your `awk' program, follow the steps
+outlined in *note Explaining gettext::, like so:
- * As the code inside `gawk' currently stands, the coprocess's
- standard error goes to the same place that the parent `gawk''s
- standard error goes. It is not possible to read the child's
- standard error separately.
+ 1. Set the variable `TEXTDOMAIN' to the text domain of your program.
+ This is best done in a `BEGIN' rule (*note BEGIN/END::), or it can
+ also be done via the `-v' command-line option (*note Options::):
+
+ BEGIN {
+ TEXTDOMAIN = "guide"
+ ...
+ }
- * I/O buffering may be a problem. `gawk' automatically flushes all
- output down the pipe to the coprocess. However, if the coprocess
- does not flush its output, `gawk' may hang when doing a `getline'
- in order to read the coprocess's results. This could lead to a
- situation known as "deadlock", where each process is waiting for
- the other one to do something.
+ 2. Mark all translatable strings with a leading underscore (`_')
+ character. It _must_ be adjacent to the opening quote of the
+ string. For example:
- It is possible to close just one end of the two-way pipe to a
-coprocess, by supplying a second argument to the `close()' function of
-either `"to"' or `"from"' (*note Close Files And Pipes::). These
-strings tell `gawk' to close the end of the pipe that sends data to the
-coprocess or the end that reads from it, respectively.
+ print _"hello, world"
+ x = _"you goofed"
+ printf(_"Number of users is %d\n", nusers)
- This is particularly necessary in order to use the system `sort'
-utility as part of a coprocess; `sort' must read _all_ of its input
-data before it can produce any output. The `sort' program does not
-receive an end-of-file indication until `gawk' closes the write end of
-the pipe.
+ 3. If you are creating strings dynamically, you can still translate
+ them, using the `dcgettext()' built-in function:
- When you have finished writing data to the `sort' utility, you can
-close the `"to"' end of the pipe, and then start reading sorted data
-via `getline'. For example:
+ message = nusers " users logged in"
+ message = dcgettext(message, "adminprog")
+ print message
- BEGIN {
- command = "LC_ALL=C sort"
- n = split("abcdefghijklmnopqrstuvwxyz", a, "")
+ Here, the call to `dcgettext()' supplies a different text domain
+ (`"adminprog"') in which to find the message, but it uses the
+ default `"LC_MESSAGES"' category.
- for (i = n; i > 0; i--)
- print a[i] |& command
- close(command, "to")
+ 4. During development, you might want to put the `.mo' file in a
+ private directory for testing. This is done with the
+ `bindtextdomain()' built-in function:
- while ((command |& getline line) > 0)
- print "got", line
- close(command)
- }
+ BEGIN {
+ TEXTDOMAIN = "guide" # our text domain
+ if (Testing) {
+ # where to find our files
+ bindtextdomain("testdir")
+ # joe is in charge of adminprog
+ bindtextdomain("../joe/testdir", "adminprog")
+ }
+ ...
+ }
- This program writes the letters of the alphabet in reverse order, one
-per line, down the two-way pipe to `sort'. It then closes the write
-end of the pipe, so that `sort' receives an end-of-file indication.
-This causes `sort' to sort the data and write the sorted data back to
-the `gawk' program. Once all of the data has been read, `gawk'
-terminates the coprocess and exits.
- As a side note, the assignment `LC_ALL=C' in the `sort' command
-ensures traditional Unix (ASCII) sorting from `sort'.
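The side note about `LC_ALL=C' can be checked without a coprocess. The following is a minimal sketch, assuming only a POSIX shell and the system `sort'; the four sample letters are made up for illustration. In the C locale, `sort' compares bytes, so uppercase letters come before lowercase ones:

```shell
# Sample input is illustrative only.
# LC_ALL=C selects byte-order collation: "A" and "B" sort before "a" and "b".
printf '%s\n' b A a B | LC_ALL=C sort
```

Without the `LC_ALL=C' assignment, the order depends on the locale in effect, which is why the example above pins it.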
+ *Note I18N Example::, for an example program showing the steps to
+create and use translations from `awk'.
- You may also use pseudo-ttys (ptys) for two-way communication
-instead of pipes, if your system supports them. This is done on a
-per-command basis, by setting a special element in the `PROCINFO' array
-(*note Auto-set::), like so:
+
+File: gawk.info, Node: Translator i18n, Next: I18N Example, Prev: Programmer i18n, Up: Internationalization
- command = "sort -nr" # command, save in convenience variable
- PROCINFO[command, "pty"] = 1 # update PROCINFO
- print ... |& command # start two-way pipe
- ...
+13.4 Translating `awk' Programs
+===============================
-Using ptys avoids the buffer deadlock issues described earlier, at some
-loss in performance. If your system does not have ptys, or if all the
-system's ptys are in use, `gawk' automatically falls back to using
-regular pipes.
+Once a program's translatable strings have been marked, they must be
+extracted to create the initial `.po' file. As part of translation, it
+is often helpful to rearrange the order in which arguments to `printf'
+are output.
- ---------- Footnotes ----------
+ `gawk''s `--gen-pot' command-line option extracts the messages and
+is discussed next. After that, `printf''s ability to rearrange the
+order for `printf' arguments at runtime is covered.
- (1) This is very different from the same operator in the C shell.
+* Menu:
+
+* String Extraction:: Extracting marked strings.
+* Printf Ordering:: Rearranging `printf' arguments.
+* I18N Portability:: `awk'-level portability issues.
-File: gawk.info, Node: TCP/IP Networking, Next: Profiling, Prev: Two-way I/O, Up: Advanced Features
+File: gawk.info, Node: String Extraction, Next: Printf Ordering, Up: Translator i18n
-13.4 Using `gawk' for Network Programming
-=========================================
+13.4.1 Extracting Marked Strings
+--------------------------------
- `EMISTERED':
- A host is a host from coast to coast,
- and no-one can talk to host that's close,
- unless the host that isn't close
- is busy hung or dead.
+Once your `awk' program is working, and all the strings have been
+marked and you've set (and perhaps bound) the text domain, it is time
+to produce translations. First, use the `--gen-pot' command-line
+option to create the initial `.pot' file:
- In addition to being able to open a two-way pipeline to a coprocess
-on the same system (*note Two-way I/O::), it is possible to make a
-two-way connection to another process on another system across an IP
-network connection.
+ $ gawk --gen-pot -f guide.awk > guide.pot
- You can think of this as just a _very long_ two-way pipeline to a
-coprocess. The way `gawk' decides that you want to use TCP/IP
-networking is by recognizing special file names that begin with one of
-`/inet/', `/inet4/' or `/inet6/'.
+ When run with `--gen-pot', `gawk' does not execute your program.
+Instead, it parses it as usual and prints all marked strings to
+standard output in the format of a GNU `gettext' Portable Object file.
+Also included in the output are any constant strings that appear as the
+first argument to `dcgettext()' or as the first and second argument to
+`dcngettext()'.(1) *Note I18N Example::, for the full list of steps to
+go through to create and test translations for `guide'.
- The full syntax of the special file name is
-`/NET-TYPE/PROTOCOL/LOCAL-PORT/REMOTE-HOST/REMOTE-PORT'. The
-components are:
+ ---------- Footnotes ----------
-NET-TYPE
- Specifies the kind of Internet connection to make. Use `/inet4/'
- to force IPv4, and `/inet6/' to force IPv6. Plain `/inet/' (which
- used to be the only option) uses the system default, most likely
- IPv4.
+ (1) The `xgettext' utility that comes with GNU `gettext' can handle
+`.awk' files.
-PROTOCOL
- The protocol to use over IP. This must be either `tcp', or `udp',
- for a TCP or UDP IP connection, respectively. The use of TCP is
- recommended for most applications.
+
+File: gawk.info, Node: Printf Ordering, Next: I18N Portability, Prev: String Extraction, Up: Translator i18n
-LOCAL-PORT
- The local TCP or UDP port number to use. Use a port number of `0'
- when you want the system to pick a port. This is what you should do
- when writing a TCP or UDP client. You may also use a well-known
- service name, such as `smtp' or `http', in which case `gawk'
- attempts to determine the predefined port number using the C
- `getaddrinfo()' function.
+13.4.2 Rearranging `printf' Arguments
+-------------------------------------
-REMOTE-HOST
- The IP address or fully-qualified domain name of the Internet host
- to which you want to connect.
+Format strings for `printf' and `sprintf()' (*note Printf::) present a
+special problem for translation. Consider the following:(1)
-REMOTE-PORT
- The TCP or UDP port number to use on the given REMOTE-HOST.
- Again, use `0' if you don't care, or else a well-known service
- name.
+ printf(_"String `%s' has %d characters\n",
+ string, length(string))
- NOTE: Failure in opening a two-way socket will result in a
- non-fatal error being returned to the calling code. The value of
- `ERRNO' indicates the error (*note Auto-set::).
+ A possible German translation for this might be:
- Consider the following very simple example:
+ "%d Zeichen lang ist die Zeichenkette `%s'\n"
- BEGIN {
- Service = "/inet/tcp/0/localhost/daytime"
- Service |& getline
- print $0
- close(Service)
- }
+ The problem should be obvious: the order of the format
+specifications is different from the original! Even though `gettext()'
+can return the translated string at runtime, it cannot change the
+argument order in the call to `printf'.
- This program reads the current date and time from the local system's
-TCP `daytime' server. It then prints the results and closes the
-connection.
+ To solve this problem, `printf' format specifiers may have an
+additional optional element, which we call a "positional specifier".
+For example:
- Because this topic is extensive, the use of `gawk' for TCP/IP
-programming is documented separately. See *note (General
-Introduction)Top:: gawkinet, TCP/IP Internetworking with `gawk', for a
-much more complete introduction and discussion, as well as extensive
-examples.
+ "%2$d Zeichen lang ist die Zeichenkette `%1$s'\n"
-
-File: gawk.info, Node: Profiling, Prev: TCP/IP Networking, Up: Advanced Features
+ Here, the positional specifier consists of an integer count, which
+indicates which argument to use, and a `$'. Counts are one-based, and
+the format string itself is _not_ included. Thus, in the following
+example, `string' is the first argument and `length(string)' is the
+second:
-13.5 Profiling Your `awk' Programs
-==================================
+ $ gawk 'BEGIN {
+ > string = "Dont Panic"
+ > printf _"%2$d characters live in \"%1$s\"\n",
+ > string, length(string)
+ > }'
+ -| 10 characters live in "Dont Panic"
-You may produce execution traces of your `awk' programs. This is done
-by passing the option `--profile' to `gawk'. When `gawk' has finished
-running, it creates a profile of your program in a file named
-`awkprof.out'. Because it is profiling, it also executes up to 45%
-slower than `gawk' normally does.
+ If present, positional specifiers come first in the format
+specification, before the flags, the field width, and/or the precision.
- As shown in the following example, the `--profile' option can be
-used to change the name of the file where `gawk' will write the profile:
+ Positional specifiers can be used with the dynamic field width and
+precision capability:
- gawk --profile=myprog.prof -f myprog.awk data1 data2
+ $ gawk 'BEGIN {
+ > printf("%*.*s\n", 10, 20, "hello")
+ > printf("%3$*2$.*1$s\n", 20, 10, "hello")
+ > }'
+ -| hello
+ -| hello
-In the above example, `gawk' places the profile in `myprog.prof'
-instead of in `awkprof.out'.
+ NOTE: When using `*' with a positional specifier, the `*' comes
+ first, then the integer position, and then the `$'. This is
+ somewhat counterintuitive.
- Here is a sample session showing a simple `awk' program, its input
-data, and the results from running `gawk' with the `--profile' option.
-First, the `awk' program:
+ `gawk' does not allow you to mix regular format specifiers and those
+with positional specifiers in the same string:
- BEGIN { print "First BEGIN rule" }
+ $ gawk 'BEGIN { printf _"%d %3$s\n", 1, 2, "hi" }'
+ error--> gawk: cmd. line:1: fatal: must use `count$' on all formats or none
- END { print "First END rule" }
+ NOTE: There are some pathological cases that `gawk' may fail to
+ diagnose. In such cases, the output may not be what you expect.
+ It's still a bad idea to try mixing them, even if `gawk' doesn't
+ detect it.
- /foo/ {
- print "matched /foo/, gosh"
- for (i = 1; i <= 3; i++)
- sing()
- }
+ Although positional specifiers can be used directly in `awk'
+programs, their primary purpose is to help in producing correct
+translations of format strings into languages different from the one in
+which the program is first written.
- {
- if (/foo/)
- print "if is true"
- else
- print "else is true"
- }
+ ---------- Footnotes ----------
+
+ (1) This example is borrowed from the GNU `gettext' manual.
- BEGIN { print "Second BEGIN rule" }
+
+File: gawk.info, Node: I18N Portability, Prev: Printf Ordering, Up: Translator i18n
- END { print "Second END rule" }
+13.4.3 `awk' Portability Issues
+-------------------------------
- function sing( dummy)
- {
- print "I gotta be me!"
- }
+`gawk''s internationalization features were purposely chosen to have as
+little impact as possible on the portability of `awk' programs that use
+them to other versions of `awk'. Consider this program:
- Following is the input data:
+ BEGIN {
+ TEXTDOMAIN = "guide"
+ if (Test_Guide) # set with -v
+ bindtextdomain("/test/guide/messages")
+ print _"don't panic!"
+ }
- foo
- bar
- baz
- foo
- junk
+As written, it won't work on other versions of `awk'. However, it is
+actually almost portable, requiring very little change:
- Here is the `awkprof.out' that results from running the `gawk'
-profiler on this program and data (this example also illustrates that
-`awk' programmers sometimes have to work late):
+ * Assignments to `TEXTDOMAIN' won't have any effect, since
+ `TEXTDOMAIN' is not special in other `awk' implementations.
- # gawk profile, created Sun Aug 13 00:00:15 2000
+ * Non-GNU versions of `awk' treat marked strings as the
+ concatenation of a variable named `_' with the string following
+ it.(1) Typically, the variable `_' has the null string (`""') as
+ its value, leaving the original string constant as the result.
- # BEGIN block(s)
+ * By defining "dummy" functions to replace `dcgettext()',
+ `dcngettext()' and `bindtextdomain()', the `awk' program can be
+ made to run, but all the messages are output in the original
+ language. For example:
- BEGIN {
- 1 print "First BEGIN rule"
- 1 print "Second BEGIN rule"
- }
+ function bindtextdomain(dir, domain)
+ {
+ return dir
+ }
- # Rule(s)
+ function dcgettext(string, domain, category)
+ {
+ return string
+ }
- 5 /foo/ { # 2
- 2 print "matched /foo/, gosh"
- 6 for (i = 1; i <= 3; i++) {
- 6 sing()
- }
- }
+ function dcngettext(string1, string2, number, domain, category)
+ {
+ return (number == 1 ? string1 : string2)
+ }
- 5 {
- 5 if (/foo/) { # 2
- 2 print "if is true"
- 3 } else {
- 3 print "else is true"
- }
- }
+ * The use of positional specifications in `printf' or `sprintf()' is
+ _not_ portable. To support `gettext()' at the C level, many
+ systems' C versions of `sprintf()' do support positional
+ specifiers. But it works only if enough arguments are supplied in
+ the function call. Many versions of `awk' pass `printf' formats
+ and arguments unchanged to the underlying C library version of
+ `sprintf()', but only one format and argument at a time. What
+ happens if a positional specification is used is anybody's guess.
+ However, since the positional specifications are primarily for use
+ in _translated_ format strings, and since non-GNU `awk's never
+ retrieve the translated string, this should not be a problem in
+ practice.
- # END block(s)
+ ---------- Footnotes ----------
- END {
- 1 print "First END rule"
- 1 print "Second END rule"
- }
+ (1) This is good fodder for an "Obfuscated `awk'" contest.
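The portability points above can be tried with whatever `awk' is installed. The following is a minimal sketch, assuming a POSIX shell and any `awk' on the path. The function name `pick_plural' is hypothetical, chosen so that it does not clash with `gawk''s built-in `dcngettext()'; it only mirrors the singular/plural choice of the dummy replacement shown earlier.

```shell
# In non-GNU awk, _"..." is the (empty) variable _ concatenated with the
# string; in gawk it is a marked string with no translation installed.
# Either way the original text is printed.
awk 'BEGIN { print _"hello, world" }'

# pick_plural is a hypothetical stand-in for the dummy dcngettext():
# it returns the singular form only when number is exactly 1.
awk '
function pick_plural(string1, string2, number) {
    return (number == 1 ? string1 : string2)
}
BEGIN {
    for (n = 0; n <= 2; n++)
        print n, pick_plural("user", "users", n)
}'
```

The first command prints the string unchanged, and the second prints the plural form for 0 and 2 and the singular form for 1. No message catalog is consulted in either case.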
- # Functions, listed alphabetically
+
+File: gawk.info, Node: I18N Example, Next: Gawk I18N, Prev: Translator i18n, Up: Internationalization
- 6 function sing(dummy)
- {
- 6 print "I gotta be me!"
- }
+13.5 A Simple Internationalization Example
+==========================================
- This example illustrates many of the basic features of profiling
-output. They are as follows:
+Now let's look at a step-by-step example of how to internationalize and
+localize a simple `awk' program, using `guide.awk' as our original
+source:
- * The program is printed in the order `BEGIN' rule, `BEGINFILE' rule,
- pattern/action rules, `ENDFILE' rule, `END' rule and functions,
- listed alphabetically. Multiple `BEGIN' and `END' rules are
- merged together, as are multiple `BEGINFILE' and `ENDFILE' rules.
+ BEGIN {
+ TEXTDOMAIN = "guide"
+ bindtextdomain(".") # for testing
+ print _"Don't Panic"
+ print _"The Answer Is", 42
+ print "Pardon me, Zaphod who?"
+ }
- * Pattern-action rules have two counts. The first count, to the
- left of the rule, shows how many times the rule's pattern was
- _tested_. The second count, to the right of the rule's opening
- left brace in a comment, shows how many times the rule's action
- was _executed_. The difference between the two indicates how many
- times the rule's pattern evaluated to false.
+Run `gawk --gen-pot' to create the `.pot' file:
- * Similarly, the count for an `if'-`else' statement shows how many
- times the condition was tested. To the right of the opening left
- brace for the `if''s body is a count showing how many times the
- condition was true. The count for the `else' indicates how many
- times the test failed.
+ $ gawk --gen-pot -f guide.awk > guide.pot
- * The count for a loop header (such as `for' or `while') shows how
- many times the loop test was executed. (Because of this, you
- can't just look at the count on the first statement in a rule to
- determine how many times the rule was executed. If the first
- statement is a loop, the count is misleading.)
+This produces:
- * For user-defined functions, the count next to the `function'
- keyword indicates how many times the function was called. The
- counts next to the statements in the body show how many times
- those statements were executed.
+ #: guide.awk:4
+ msgid "Don't Panic"
+ msgstr ""
- * The layout uses "K&R" style with TABs. Braces are used
- everywhere, even when the body of an `if', `else', or loop is only
- a single statement.
+ #: guide.awk:5
+ msgid "The Answer Is"
+ msgstr ""
- * Parentheses are used only where needed, as indicated by the
- structure of the program and the precedence rules. For example,
- `(3 + 5) * 4' means add three plus five, then multiply the total
- by four. However, `3 + 5 * 4' has no parentheses, and means `3 +
- (5 * 4)'.
+ This original portable object template file is saved and reused for
+each language into which the application is translated. The `msgid' is
+the original string and the `msgstr' is the translation.
- * Parentheses are used around the arguments to `print' and `printf'
- only when the `print' or `printf' statement is followed by a
- redirection. Similarly, if the target of a redirection isn't a
- scalar, it gets parenthesized.
+ NOTE: Strings not marked with a leading underscore do not appear
+ in the `guide.pot' file.
- * `gawk' supplies leading comments in front of the `BEGIN' and `END'
- rules, the pattern/action rules, and the functions.
+ Next, the messages must be translated. Here is a translation to a
+hypothetical dialect of English, called "Mellow":(1)
+ $ cp guide.pot guide-mellow.po
+ ADD TRANSLATIONS TO guide-mellow.po ...
- The profiled version of your program may not look exactly like what
-you typed when you wrote it. This is because `gawk' creates the
-profiled version by "pretty printing" its internal representation of
-the program. The advantage to this is that `gawk' can produce a
-standard representation. The disadvantage is that all source-code
-comments are lost, as are the distinctions among multiple `BEGIN',
-`END', `BEGINFILE', and `ENDFILE' rules. Also, things such as:
+Following are the translations:
- /foo/
+ #: guide.awk:4
+ msgid "Don't Panic"
+ msgstr "Hey man, relax!"
-come out as:
+ #: guide.awk:5
+ msgid "The Answer Is"
+ msgstr "Like, the scoop is"
- /foo/ {
- print $0
- }
+ The next step is to make the directory to hold the binary message
+object file and then to create the `guide.mo' file. The directory
+layout shown here is standard for GNU `gettext' on GNU/Linux systems.
+Other versions of `gettext' may use a different layout:
-which is correct, but possibly surprising.
+ $ mkdir en_US en_US/LC_MESSAGES
- Besides creating profiles when a program has completed, `gawk' can
-produce a profile while it is running. This is useful if your `awk'
-program goes into an infinite loop and you want to see what has been
-executed. To use this feature, run `gawk' with the `--profile' option
-in the background:
+ The `msgfmt' utility does the conversion from human-readable `.po'
+file to machine-readable `.mo' file. By default, `msgfmt' creates a
+file named `messages'. This file must be renamed and placed in the
+proper directory so that `gawk' can find it:
- $ gawk --profile -f myprog &
- [1] 13992
+ $ msgfmt guide-mellow.po
+ $ mv messages en_US/LC_MESSAGES/guide.mo
-The shell prints a job number and process ID number; in this case,
-13992. Use the `kill' command to send the `USR1' signal to `gawk':
+ Finally, we run the program to test it:
- $ kill -USR1 13992
+ $ gawk -f guide.awk
+ -| Hey man, relax!
+ -| Like, the scoop is 42
+ -| Pardon me, Zaphod who?
-As usual, the profiled version of the program is written to
-`awkprof.out', or to a different file if one is specified with the
-`--profile' option.
+ If the three replacement functions for `dcgettext()', `dcngettext()'
+and `bindtextdomain()' (*note I18N Portability::) are in a file named
+`libintl.awk', then we can run `guide.awk' unchanged as follows:
- Along with the regular profile, as shown earlier, the profile
-includes a trace of any active functions:
+ $ gawk --posix -f guide.awk -f libintl.awk
+ -| Don't Panic
+ -| The Answer Is 42
+ -| Pardon me, Zaphod who?
- # Function Call Stack:
+ ---------- Footnotes ----------
- # 3. baz
- # 2. bar
- # 1. foo
- # -- main --
+ (1) Perhaps it would be better if it were called "Hippy." Ah, well.
- You may send `gawk' the `USR1' signal as many times as you like.
-Each time, the profile and function call trace are appended to the
-output profile file.
+
+File: gawk.info, Node: Gawk I18N, Prev: I18N Example, Up: Internationalization
- If you use the `HUP' signal instead of the `USR1' signal, `gawk'
-produces the profile and the function call trace and then exits.
+13.6 `gawk' Can Speak Your Language
+===================================
- When `gawk' runs on MS-Windows systems, it uses the `INT' and `QUIT'
-signals for producing the profile and, in the case of the `INT' signal,
-`gawk' exits. This is because these systems don't support the `kill'
-command, so the only signals you can deliver to a program are those
-generated by the keyboard. The `INT' signal is generated by the
-`Ctrl-<C>' or `Ctrl-<BREAK>' key, while the `QUIT' signal is generated
-by the `Ctrl-<\>' key.
+`gawk' itself has been internationalized using the GNU `gettext'
+package. (GNU `gettext' is described in complete detail in *note (GNU
+`gettext' utilities)Top:: gettext, GNU gettext tools.) As of this
+writing, the latest version of GNU `gettext' is version 0.18.2.1
+(ftp://ftp.gnu.org/gnu/gettext/gettext-0.18.2.1.tar.gz).
- Finally, `gawk' also accepts another option, `--pretty-print'. When
-called this way, `gawk' "pretty prints" the program into `awkprof.out',
-without any execution counts.
+ If a translation of `gawk''s messages exists, then `gawk' produces
+usage messages, warnings, and fatal errors in the local language.
-File: gawk.info, Node: Debugger, Next: Arbitrary Precision Arithmetic, Prev: Advanced Features, Up: Top
+File: gawk.info, Node: Debugger, Next: Arbitrary Precision Arithmetic, Prev: Internationalization, Up: Top
14 Debugging `awk' Programs
***************************
@@ -32254,65 +32254,65 @@ Ref: Passwd Functions-Footnote-1619614
Node: Group Functions619702
Node: Walking Arrays627786
Node: Sample Programs629923
-Node: Running Examples630600
-Node: Clones631328
-Node: Cut Program632552
-Node: Egrep Program642397
-Ref: Egrep Program-Footnote-1650170
-Node: Id Program650280
-Node: Split Program653896
-Ref: Split Program-Footnote-1657415
-Node: Tee Program657543
-Node: Uniq Program660346
-Node: Wc Program667775
-Ref: Wc Program-Footnote-1672041
-Ref: Wc Program-Footnote-2672241
-Node: Miscellaneous Programs672333
-Node: Dupword Program673521
-Node: Alarm Program675552
-Node: Translate Program680301
-Ref: Translate Program-Footnote-1684688
-Ref: Translate Program-Footnote-2684916
-Node: Labels Program685050
-Ref: Labels Program-Footnote-1688421
-Node: Word Sorting688505
-Node: History Sorting692389
-Node: Extract Program694228
-Ref: Extract Program-Footnote-1701729
-Node: Simple Sed701857
-Node: Igawk Program704919
-Ref: Igawk Program-Footnote-1720076
-Ref: Igawk Program-Footnote-2720277
-Node: Anagram Program720415
-Node: Signature Program723483
-Node: Internationalization724583
-Node: I18N and L10N726015
-Node: Explaining gettext726701
-Ref: Explaining gettext-Footnote-1731767
-Ref: Explaining gettext-Footnote-2731951
-Node: Programmer i18n732116
-Node: Translator i18n736316
-Node: String Extraction737109
-Ref: String Extraction-Footnote-1738070
-Node: Printf Ordering738156
-Ref: Printf Ordering-Footnote-1740940
-Node: I18N Portability741004
-Ref: I18N Portability-Footnote-1743453
-Node: I18N Example743516
-Ref: I18N Example-Footnote-1746151
-Node: Gawk I18N746223
-Node: Advanced Features746844
-Node: Nondecimal Data748719
-Node: Array Sorting750302
-Node: Controlling Array Traversal750999
-Node: Array Sorting Functions759237
-Ref: Array Sorting Functions-Footnote-1762911
-Ref: Array Sorting Functions-Footnote-2763004
-Node: Two-way I/O763198
-Ref: Two-way I/O-Footnote-1768630
-Node: TCP/IP Networking768700
-Node: Profiling771544
-Node: Debugger778999
+Node: Running Examples630597
+Node: Clones631325
+Node: Cut Program632549
+Node: Egrep Program642394
+Ref: Egrep Program-Footnote-1650167
+Node: Id Program650277
+Node: Split Program653893
+Ref: Split Program-Footnote-1657412
+Node: Tee Program657540
+Node: Uniq Program660343
+Node: Wc Program667772
+Ref: Wc Program-Footnote-1672038
+Ref: Wc Program-Footnote-2672238
+Node: Miscellaneous Programs672330
+Node: Dupword Program673518
+Node: Alarm Program675549
+Node: Translate Program680298
+Ref: Translate Program-Footnote-1684685
+Ref: Translate Program-Footnote-2684913
+Node: Labels Program685047
+Ref: Labels Program-Footnote-1688418
+Node: Word Sorting688502
+Node: History Sorting692386
+Node: Extract Program694225
+Ref: Extract Program-Footnote-1701726
+Node: Simple Sed701854
+Node: Igawk Program704916
+Ref: Igawk Program-Footnote-1720073
+Ref: Igawk Program-Footnote-2720274
+Node: Anagram Program720412
+Node: Signature Program723480
+Node: Advanced Features724580
+Node: Nondecimal Data726462
+Node: Array Sorting728045
+Node: Controlling Array Traversal728742
+Node: Array Sorting Functions736980
+Ref: Array Sorting Functions-Footnote-1740654
+Ref: Array Sorting Functions-Footnote-2740747
+Node: Two-way I/O740941
+Ref: Two-way I/O-Footnote-1746373
+Node: TCP/IP Networking746443
+Node: Profiling749287
+Node: Internationalization756742
+Node: I18N and L10N758167
+Node: Explaining gettext758853
+Ref: Explaining gettext-Footnote-1763919
+Ref: Explaining gettext-Footnote-2764103
+Node: Programmer i18n764268
+Node: Translator i18n768468
+Node: String Extraction769261
+Ref: String Extraction-Footnote-1770222
+Node: Printf Ordering770308
+Ref: Printf Ordering-Footnote-1773092
+Node: I18N Portability773156
+Ref: I18N Portability-Footnote-1775605
+Node: I18N Example775668
+Ref: I18N Example-Footnote-1778303
+Node: Gawk I18N778375
+Node: Debugger778996
Node: Debugging779967
Node: Debugging Concepts780400
Node: Debugging Terms782256
diff --git a/doc/gawk.texi b/doc/gawk.texi
index dee577a..94de0af 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -297,10 +297,10 @@ particular records in a file and perform operations upon them.
* Library Functions:: A Library of @command{awk} Functions.
* Sample Programs:: Many @command{awk} programs with complete
explanations.
-* Internationalization:: Getting @command{gawk} to speak your
- language.
* Advanced Features:: Stuff for advanced users, specific to
@command{gawk}.
+* Internationalization:: Getting @command{gawk} to speak your
+ language.
* Debugger:: The @code{gawk} debugger.
* Arbitrary Precision Arithmetic:: Arbitrary precision arithmetic with
@command{gawk}.
@@ -1321,10 +1321,6 @@ solving real problems.
Part III focuses on features specific to @command{gawk}.
It contains the following chapters:
-@ref{Internationalization},
-describes special features in @command{gawk} for translating program
-messages into different languages at runtime.
-
@ref{Advanced Features},
describes a number of @command{gawk}-specific advanced features.
Of particular note
@@ -1332,6 +1328,10 @@ are the abilities to have two-way communications with another process,
perform TCP/IP networking, and
profile your @command{awk} programs.
+@ref{Internationalization},
+describes special features in @command{gawk} for translating program
+messages into different languages at runtime.
+
@ref{Debugger}, describes the @command{awk} debugger.
@ref{Arbitrary Precision Arithmetic},
@@ -23946,10 +23946,10 @@ It contains the following chapters:
@itemize @bullet
@item
-@ref{Internationalization}.
+@ref{Advanced Features}.
@item
-@ref{Advanced Features}.
+@ref{Internationalization}.
@item
@ref{Debugger}.
@@ -23962,1907 +23962,1907 @@ It contains the following chapters:
@end ifdocbook
@end ignore
-@node Internationalization
-@chapter Internationalization with @command{gawk}
-
-Once upon a time, computer makers
-wrote software that worked only in English.
-Eventually, hardware and software vendors noticed that if their
-systems worked in the native languages of non-English-speaking
-countries, they were able to sell more systems.
-As a result, internationalization and localization
-of programs and software systems became a common practice.
-
address@hidden STARTOFRANGE inloc
address@hidden internationalization, localization
address@hidden @command{gawk}, internationalization and, See internationalization
address@hidden internationalization, localization, @command{gawk} and
-For many years, the ability to provide internationalization
-was largely restricted to programs written in C and C++.
-This @value{CHAPTER} describes the underlying library @command{gawk}
-uses for internationalization, as well as how
-@command{gawk} makes internationalization
-features available at the @command{awk} program level.
-Having internationalization available at the @command{awk} level
-gives software developers additional flexibility---they are no
-longer forced to write in C or C++ when internationalization is
-a requirement.
-
-@menu
-* I18N and L10N:: Internationalization and Localization.
-* Explaining gettext:: How GNU @code{gettext} works.
-* Programmer i18n:: Features for the programmer.
-* Translator i18n:: Features for the translator.
-* I18N Example:: A simple i18n example.
-* Gawk I18N:: @command{gawk} is also internationalized.
-@end menu
-
-@node I18N and L10N
-@section Internationalization and Localization
-
address@hidden internationalization
address@hidden localization, See address@hidden localization
address@hidden localization
-@dfn{Internationalization} means writing (or modifying) a program once,
-in such a way that it can use multiple languages without requiring
-further source-code changes.
-@dfn{Localization} means providing the data necessary for an
-internationalized program to work in a particular language.
-Most typically, these terms refer to features such as the language
-used for printing error messages, the language used to read
-responses, and information related to how numerical and
-monetary values are printed and read.
-
-@node Explaining gettext
-@section GNU @code{gettext}
-
address@hidden internationalizing a program
address@hidden STARTOFRANGE gettex
address@hidden @code{gettext} library
-The facilities in GNU @code{gettext} focus on messages; strings printed
-by a program, either directly or via formatting with @code{printf} or
-@code{sprintf()}.@footnote{For some operating systems, the @command{gawk}
-port doesn't support GNU @code{gettext}.
-Therefore, these features are not available
-if you are using one of those operating systems. Sorry.}
-
address@hidden portability, @code{gettext} library and
-When using GNU @code{gettext}, each application has its own
-@dfn{text domain}. This is a unique name, such as @samp{kpilot} or @samp{gawk},
-that identifies the application.
-A complete application may have multiple components---programs written
-in C or C++, as well as scripts written in @command{sh} or @command{awk}.
-All of the components use the same text domain.
address@hidden Advanced Features
address@hidden Advanced Features of @command{gawk}
address@hidden advanced features, network connections, See Also networks, connections
address@hidden STARTOFRANGE gawadv
address@hidden @command{gawk}, features, advanced
address@hidden STARTOFRANGE advgaw
address@hidden advanced features, @command{gawk}
address@hidden
+Contributed by: Peter Langston <address@hidden>
-To make the discussion concrete, assume we're writing an application
-named @command{guide}. Internationalization consists of the
-following steps, in this order:
+ Found in Steve English's "signature" line:
address@hidden
address@hidden
-The programmer goes
-through the source for all of @command{guide}'s components
-and marks each string that is a candidate for translation.
-For example, @code{"`-F': option required"} is a good candidate for translation.
-A table with strings of option names is not (e.g., @command{gawk}'s
address@hidden option should remain the same, no matter what the local
-language).
+"Write documentation as if whoever reads it is a violent psychopath
+who knows where you live."
address@hidden ignore
address@hidden
address@hidden documentation as if whoever reads it is
+a violent psychopath who knows where you address@hidden
+Steve English, as quoted by Peter Langston
address@hidden quotation
address@hidden @code{textdomain()} function (C library)
address@hidden
-The programmer indicates the application's text domain
-(@code{"guide"}) to the @code{gettext} library,
-by calling the @code{textdomain()} function.
+This @value{CHAPTER} discusses advanced features in @command{gawk}.
+It's a bit of a ``grab bag'' of items that are otherwise unrelated
+to each other.
+First, a command-line option allows @command{gawk} to recognize
+nondecimal numbers in input data, not just in @command{awk}
+programs.
+Then, @command{gawk}'s special features for sorting arrays are presented.
+Next, two-way I/O, discussed briefly in earlier parts of this
address@hidden, is described in full detail, along with the basics
+of TCP/IP networking. Finally, @command{gawk}
+can @dfn{profile} an @command{awk} program, making it possible to tune
+it for performance.
address@hidden @code{.pot} files
address@hidden files, @code{.pot}
address@hidden portable object template files
address@hidden files, portable object template
address@hidden
-Messages from the application are extracted from the source code and
-collected into a portable object template file (@file{guide.pot}),
-which lists the strings and their translations.
-The translations are initially empty.
-The original (usually English) messages serve as the key for
-lookup of the translations.
+A number of advanced features require separate @value{CHAPTER}s of their
+own:
address@hidden @code{.po} files
address@hidden files, @code{.po}
address@hidden portable object files
address@hidden files, portable object
address@hidden @bullet
@item
-For each language with a translator, @file{guide.pot}
-is copied to a portable object file (@code{.po})
-and translations are created and shipped with the application.
-For example, there might be a @file{fr.po} for a French translation.
address@hidden, discusses how to internationalize
+your @command{awk} programs, so that they can speak multiple
+national languages.
address@hidden @code{.mo} files
address@hidden files, @code{.mo}
address@hidden message object files
address@hidden files, message object
@item
-Each language's @file{.po} file is converted into a binary
-message object (@file{.mo}) file.
-A message object file contains the original messages and their
-translations in a binary format that allows fast lookup of translations
-at runtime.
address@hidden, describes @command{gawk}'s built-in command-line
+debugger for debugging @command{awk} programs.
@item
-When @command{guide} is built and installed, the binary translation files
-are installed in a standard place.
address@hidden Precision Arithmetic}, describes how you can use
address@hidden to perform arbitrary-precision arithmetic.
address@hidden @code{bindtextdomain()} function (C library)
@item
-For testing and development, it is possible to tell @code{gettext}
-to use @file{.mo} files in a different directory than the standard
-one by using the @code{bindtextdomain()} function.
address@hidden Extensions},
+discusses the ability to dynamically add new built-in functions to
address@hidden
address@hidden itemize
address@hidden @code{.mo} files, specifying directory of
address@hidden files, @code{.mo}, specifying directory of
address@hidden message object files, specifying directory of
address@hidden files, message object, specifying directory of
address@hidden
-At runtime, @command{guide} looks up each string via a call
-to @code{gettext()}. The returned string is the translated string
-if available, or the original string if not.
address@hidden
+* Nondecimal Data:: Allowing nondecimal input data.
+* Array Sorting:: Facilities for controlling array traversal and
+ sorting arrays.
+* Two-way I/O:: Two-way communications with another process.
+* TCP/IP Networking:: Using @command{gawk} for network programming.
+* Profiling:: Profiling your @command{awk} programs.
address@hidden menu
address@hidden
-If necessary, it is possible to access messages from a different
-text domain than the one belonging to the application, without
-having to switch the application's default text domain back
-and forth.
address@hidden enumerate
address@hidden Nondecimal Data
address@hidden Allowing Nondecimal Input Data
address@hidden @code{--non-decimal-data} option
address@hidden advanced features, @command{gawk}, nondecimal input data
address@hidden input, address@hidden nondecimal
address@hidden constants, nondecimal
address@hidden @code{gettext()} function (C library)
-In C (or C++), the string marking and dynamic translation lookup
-are accomplished by wrapping each string in a call to @code{gettext()}:
+If you run @command{gawk} with the @option{--non-decimal-data} option,
+you can have nondecimal constants in your input data:
address@hidden line break here for small book format
@example
-printf("%s", gettext("Don't Panic!\n"));
+$ @kbd{echo 0123 123 0x123 |}
+> @kbd{gawk --non-decimal-data '@{ printf "%d, %d, %d\n",}
+> @kbd{$1, $2, $3 @}'}
address@hidden 83, 123, 291
@end example
-The tools that extract messages from source code pull out all
-strings enclosed in calls to @code{gettext()}.
-
address@hidden @code{_} (underscore), @code{_} C macro
address@hidden underscore (@code{_}), @code{_} C macro
-The GNU @code{gettext} developers, recognizing that typing
address@hidden(@dots{})} over and over again is both painful and ugly to look
-at, use the macro @samp{_} (an underscore) to make things easier:
+For this feature to work, write your program so that
address@hidden treats your data as numeric:
@example
-/* In the standard header file: */
-#define _(str) gettext(str)
-
-/* In the program text: */
-printf("%s", _("Don't Panic!\n"));
+$ @kbd{echo 0123 123 0x123 | gawk '@{ print $1, $2, $3 @}'}
address@hidden 0123 123 0x123
@end example
address@hidden internationalization, localization, locale categories
address@hidden @code{gettext} library, locale categories
address@hidden locale categories
@noindent
-This reduces the typing overhead to just three extra characters per string
-and is considerably easier to read as well.
-
-There are locale @dfn{categories}
-for different types of locale-related information.
-The defined locale categories that @code{gettext} knows about are:
+The @code{print} statement treats its expressions as strings.
+Although the fields can act as numbers when necessary,
+they are still strings, so @code{print} does not try to treat them
+numerically. You may need to add zero to a field to force it to
+be treated as a number. For example:
address@hidden @code
address@hidden @code{LC_MESSAGES} locale category
address@hidden LC_MESSAGES
-Text messages. This is the default category for @code{gettext}
-operations, but it is possible to supply a different one explicitly,
-if necessary. (It is almost never necessary to supply a different category.)
-
address@hidden sorting characters in different languages
address@hidden @code{LC_COLLATE} locale category
address@hidden LC_COLLATE
-Text-collation information; i.e., how different characters
-and/or groups of characters sort in a given language.
-
address@hidden @code{LC_CTYPE} locale category
address@hidden LC_CTYPE
-Character-type information (alphabetic, digit, upper- or lowercase, and
-so on).
-This information is accessed via the
-POSIX character classes in regular expressions,
-such as @code{/[[:alnum:]]/}
-(@pxref{Regexp Operators}).
address@hidden
+$ @kbd{echo 0123 123 0x123 | gawk --non-decimal-data '}
+> @address@hidden print $1, $2, $3}
+> @kbd{print $1 + 0, $2 + 0, $3 + 0 @}'}
address@hidden 0123 123 0x123
address@hidden 83 123 291
address@hidden example
address@hidden monetary information, localization
address@hidden currency symbols, localization
address@hidden @code{LC_MONETARY} locale category
address@hidden LC_MONETARY
-Monetary information, such as the currency symbol, and whether the
-symbol goes before or after a number.
+Because it is common to have decimal data with leading zeros, and because
+using this facility could lead to surprising results, the default is to leave it
+disabled. If you want it, you must explicitly request it.
address@hidden @code{LC_NUMERIC} locale category
address@hidden LC_NUMERIC
-Numeric information, such as which characters to use for the decimal
-point and the thousands address@hidden
-use a comma every three decimal places and a period for the decimal
-point, while many Europeans do exactly the opposite:
-1,234.56 versus 1.234,56.}
address@hidden programming conventions, @code{--non-decimal-data} option
address@hidden @code{--non-decimal-data} option, @code{strtonum()} function and
address@hidden @code{strtonum()} function (@command{gawk}), @code{--non-decimal-data} option and
address@hidden CAUTION
address@hidden of this option is not recommended.}
+It can break old programs very badly.
+Instead, use the @code{strtonum()} function to convert your data
+(@pxref{Nondecimal-numbers}).
+This makes your programs easier to write and easier to read, and
+leads to less surprising results.
address@hidden quotation
address@hidden @code{LC_RESPONSE} locale category
address@hidden LC_RESPONSE
-Response information, such as how ``yes'' and ``no'' appear in the
-local language, and possibly other information as well.
address@hidden Array Sorting
address@hidden Controlling Array Traversal and Array Sorting
address@hidden time, localization and
address@hidden dates, information related address@hidden localization
address@hidden @code{LC_TIME} locale category
address@hidden LC_TIME
-Time- and date-related information, such as 12- or 24-hour clock, month printed
-before or after the day in a date, local month abbreviations, and so on.
address@hidden lets you control the order in which a @samp{for (i in array)}
+loop traverses an array.
address@hidden @code{LC_ALL} locale category
address@hidden LC_ALL
-All of the above. (Not too useful in the context of @code{gettext}.)
address@hidden table
address@hidden ENDOFRANGE gettex
+In addition, two built-in functions, @code{asort()} and @code{asorti()},
+let you sort arrays based on the array values and indices, respectively.
+These two functions also provide control over the sorting criteria used
+to order the elements during sorting.
address@hidden Programmer i18n
address@hidden Internationalizing @command{awk} Programs
address@hidden STARTOFRANGE inap
address@hidden @command{awk} programs, internationalizing
address@hidden
+* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
+* Array Sorting Functions:: How to use @code{asort()} and @code{asorti()}.
address@hidden menu
address@hidden provides the following variables and functions for
-internationalization:
address@hidden Controlling Array Traversal
address@hidden Controlling Array Traversal
address@hidden @code
address@hidden @code{TEXTDOMAIN} variable
address@hidden TEXTDOMAIN
-This variable indicates the application's text domain.
-For compatibility with GNU @code{gettext}, the default
-value is @code{"messages"}.
+By default, the order in which a @samp{for (i in array)} loop
+scans an array is not defined; it is generally based upon
+the internal implementation of arrays inside @command{awk}.
address@hidden internationalization, localization, marked strings
address@hidden strings, for localization
address@hidden _"your message here"
-String constants marked with a leading underscore
-are candidates for translation at runtime.
-String constants without a leading underscore are not translated.
+Often, though, it is desirable to be able to loop over the elements
+in a particular order that you, the programmer, choose. @command{gawk}
+lets you do this.
address@hidden @code{dcgettext()} function (@command{gawk})
address@hidden dcgettext(@var{string} @r{[}, @var{domain} @r{[}, @address@hidden)
-Return the translation of @var{string} in
-text domain @var{domain} for locale category @var{category}.
-The default value for @var{domain} is the current value of @code{TEXTDOMAIN}.
-The default value for @var{category} is @code{"LC_MESSAGES"}.
address@hidden Scanning}, describes how you can assign special,
+pre-defined values to @code{PROCINFO["sorted_in"]} in order to
+control the order in which @command{gawk} will traverse an array
+during a @code{for} loop.
-If you supply a value for @var{category}, it must be a string equal to
-one of the known locale categories described in
address@hidden
-the previous @value{SECTION}.
address@hidden ifnotinfo
address@hidden
address@hidden gettext}.
address@hidden ifinfo
-You must also supply a text domain. Use @code{TEXTDOMAIN} if
-you want to use the current domain.
+In addition, the value of @code{PROCINFO["sorted_in"]} can be a function name.
+This lets you traverse an array based on any custom criterion.
+The array elements are ordered according to the return value of this
+function. The comparison function should be defined with at least
+four arguments:
address@hidden CAUTION
-The order of arguments to the @command{awk} version
-of the @code{dcgettext()} function is purposely different from the order for
-the C version. The @command{awk} version's order was
-chosen to be simple and to allow for reasonable @command{awk}-style
-default arguments.
address@hidden quotation
address@hidden
+function comp_func(i1, v1, i2, v2)
address@hidden
+ @var{compare elements 1 and 2 in some fashion}
+ @var{return < 0; 0; or > 0}
address@hidden
address@hidden example
address@hidden @code{dcngettext()} function (@command{gawk})
address@hidden dcngettext(@var{string1}, @var{string2}, @var{number} @r{[}, @var{domain} @r{[}, @address@hidden)
-Return the plural form used for @var{number} of the
-translation of @var{string1} and @var{string2} in text domain
address@hidden for locale category @var{category}. @var{string1} is the
-English singular variant of a message, and @var{string2} the English plural
-variant of the same message.
-The default value for @var{domain} is the current value of @code{TEXTDOMAIN}.
-The default value for @var{category} is @code{"LC_MESSAGES"}.
+Here, @var{i1} and @var{i2} are the indices, and @var{v1} and @var{v2}
+are the corresponding values of the two elements being compared.
+Either @var{v1} or @var{v2}, or both, can be arrays if the array being
+traversed contains subarrays as values.
+(@xref{Arrays of Arrays}, for more information about subarrays.)
+The three possible return values are interpreted as follows:
-The same remarks about argument order as for the @code{dcgettext()} function apply.
address@hidden @code
address@hidden comp_func(i1, v1, i2, v2) < 0
+Index @var{i1} comes before index @var{i2} during loop traversal.
address@hidden @code{.mo} files, specifying directory of
address@hidden files, @code{.mo}, specifying directory of
address@hidden message object files, specifying directory of
address@hidden files, message object, specifying directory of
address@hidden @code{bindtextdomain()} function (@command{gawk})
address@hidden bindtextdomain(@var{directory} @r{[}, @address@hidden)
-Change the directory in which
address@hidden looks for @file{.mo} files, in case they
-will not or cannot be placed in the standard locations
-(e.g., during testing).
-Return the directory in which @var{domain} is ``bound.''
address@hidden comp_func(i1, v1, i2, v2) == 0
+Indices @var{i1} and @var{i2}
+come together but the relative order with respect to each other is undefined.
-The default @var{domain} is the value of @code{TEXTDOMAIN}.
-If @var{directory} is the null string (@code{""}), then
address@hidden()} returns the current binding for the
-given @var{domain}.
address@hidden comp_func(i1, v1, i2, v2) > 0
+Index @var{i1} comes after index @var{i2} during loop traversal.
@end table
-To use these facilities in your @command{awk} program, follow the steps
-outlined in
address@hidden
-the previous @value{SECTION},
address@hidden ifnotinfo
address@hidden
address@hidden gettext},
address@hidden ifinfo
-like so:
-
address@hidden
address@hidden @code{BEGIN} pattern, @code{TEXTDOMAIN} variable and
address@hidden @code{TEXTDOMAIN} variable, @code{BEGIN} pattern and
address@hidden
-Set the variable @code{TEXTDOMAIN} to the text domain of
-your program. This is best done in a @code{BEGIN} rule
-(@pxref{BEGIN/END}),
-or it can also be done via the @option{-v} command-line
-option (@pxref{Options}):
+Our first comparison function can be used to scan an array in
+numerical order of the indices:
@example
-BEGIN @{
- TEXTDOMAIN = "guide"
- @dots{}
+function cmp_num_idx(i1, v1, i2, v2)
address@hidden
+ # numerical index comparison, ascending order
+ return (i1 - i2)
@}
@end example
address@hidden @code{_} (underscore), translatable string
address@hidden underscore (@code{_}), translatable string
address@hidden
-Mark all translatable strings with a leading underscore (@samp{_})
-character. It @emph{must} be adjacent to the opening
-quote of the string. For example:
+Our second function traverses an array based on the string order of
+the element values rather than by indices:
@example
-print _"hello, world"
-x = _"you goofed"
-printf(_"Number of users is %d\n", nusers)
+function cmp_str_val(i1, v1, i2, v2)
address@hidden
+ # string value comparison, ascending order
+ v1 = v1 ""
+ v2 = v2 ""
+ if (v1 < v2)
+ return -1
+ return (v1 != v2)
address@hidden
@end example
address@hidden
-If you are creating strings dynamically, you can
-still translate them, using the @code{dcgettext()}
-built-in function:
+The third
+comparison function makes all numbers, and numeric strings without
+any leading or trailing spaces, come out first during loop traversal:
@example
-message = nusers " users logged in"
-message = dcgettext(message, "adminprog")
-print message
+function cmp_num_str_val(i1, v1, i2, v2, n1, n2)
address@hidden
+ # numbers before string value comparison, ascending order
+ n1 = v1 + 0
+ n2 = v2 + 0
+ if (n1 == v1)
+ return (n2 == v2) ? (n1 - n2) : -1
+ else if (n2 == v2)
+ return 1
+ return (v1 < v2) ? -1 : (v1 != v2)
address@hidden
@end example
-Here, the call to @code{dcgettext()} supplies a different
-text domain (@code{"adminprog"}) in which to find the
-message, but it uses the default @code{"LC_MESSAGES"} category.
-
address@hidden @code{LC_MESSAGES} locale category, @code{bindtextdomain()} function (@command{gawk})
address@hidden
-During development, you might want to put the @file{.mo}
-file in a private directory for testing. This is done
-with the @code{bindtextdomain()} built-in function:
+Here is a main program to demonstrate how @command{gawk}
+behaves using each of the previous functions:
@example
BEGIN @{
- TEXTDOMAIN = "guide" # our text domain
- if (Testing) @{
- # where to find our files
- bindtextdomain("testdir")
- # joe is in charge of adminprog
- bindtextdomain("../joe/testdir", "adminprog")
- @}
- @dots{}
+ data["one"] = 10
+ data["two"] = 20
+ data[10] = "one"
+ data[100] = 100
+ data[20] = "two"
+
+ f[1] = "cmp_num_idx"
+ f[2] = "cmp_str_val"
+ f[3] = "cmp_num_str_val"
+ for (i = 1; i <= 3; i++) @{
+ printf("Sort function: %s\n", f[i])
+ PROCINFO["sorted_in"] = f[i]
+ for (j in data)
+ printf("\tdata[%s] = %s\n", j, data[j])
+ print ""
+ @}
@}
@end example
address@hidden enumerate
+Here are the results when the program is run:
address@hidden Example},
-for an example program showing the steps to create
-and use translations from @command{awk}.
address@hidden
+$ @kbd{gawk -f compdemo.awk}
address@hidden Sort function: cmp_num_idx @ii{Sort by numeric index}
address@hidden data[two] = 20
address@hidden data[one] = 10 @ii{Both strings are numerically zero}
address@hidden data[10] = one
address@hidden data[20] = two
address@hidden data[100] = 100
address@hidden
address@hidden Sort function: cmp_str_val @ii{Sort by element values as strings}
address@hidden data[one] = 10
address@hidden data[100] = 100 @ii{String 100 is less than string 20}
address@hidden data[two] = 20
address@hidden data[10] = one
address@hidden data[20] = two
address@hidden
address@hidden Sort function: cmp_num_str_val @ii{Sort all numeric values before all strings}
address@hidden data[one] = 10
address@hidden data[two] = 20
address@hidden data[100] = 100
address@hidden data[10] = one
address@hidden data[20] = two
address@hidden example
address@hidden Translator i18n
address@hidden Translating @command{awk} Programs
+Consider sorting the entries of a GNU/Linux system password file
+according to login name. The following program sorts records
+by a specific field position and can be used for this purpose:
address@hidden @code{.po} files
address@hidden files, @code{.po}
address@hidden portable object files
address@hidden files, portable object
-Once a program's translatable strings have been marked, they must
-be extracted to create the initial @file{.po} file.
-As part of translation, it is often helpful to rearrange the order
-in which arguments to @code{printf} are output.
address@hidden
+# sort.awk --- simple program to sort by field position
+# field position is specified by the global variable POS
address@hidden's @option{--gen-pot} command-line option extracts
-the messages and is discussed next.
-After that, @code{printf}'s ability to
-rearrange the order for @code{printf} arguments at runtime
-is covered.
+function cmp_field(i1, v1, i2, v2)
address@hidden
+ # comparison by value, as string, and ascending order
+ return v1[POS] < v2[POS] ? -1 : (v1[POS] != v2[POS])
address@hidden
address@hidden
-* String Extraction:: Extracting marked strings.
-* Printf Ordering:: Rearranging @code{printf} arguments.
-* I18N Portability:: @command{awk}-level portability issues.
address@hidden menu
address@hidden
+ for (i = 1; i <= NF; i++)
+ a[NR][i] = $i
address@hidden
address@hidden String Extraction
address@hidden Extracting Marked Strings
address@hidden strings, extracting
address@hidden marked address@hidden extracting
address@hidden @code{--gen-pot} option
address@hidden command-line options, string extraction
address@hidden string extraction (internationalization)
address@hidden marked string extraction (internationalization)
address@hidden extraction, of marked strings (internationalization)
+END @{
+ PROCINFO["sorted_in"] = "cmp_field"
+ if (POS < 1 || POS > NF)
+ POS = 1
+ for (i in a) @{
+ for (j = 1; j <= NF; j++)
+ printf("%s%c", a[i][j], j < NF ? ":" : "")
+ print ""
+ @}
address@hidden
address@hidden example
address@hidden @code{--gen-pot} option
-Once your @command{awk} program is working, and all the strings have
-been marked and you've set (and perhaps bound) the text domain,
-it is time to produce translations.
-First, use the @option{--gen-pot} command-line option to create
-the initial @file{.pot} file:
+The first field in each entry of the password file is the user's login name,
+and the fields are separated by colons.
+Each record defines a subarray,
+with each field as an element in the subarray.
+Running the program produces the
+following output:
@example
-$ @kbd{gawk --gen-pot -f guide.awk > guide.pot}
+$ @kbd{gawk -v POS=1 -F: -f sort.awk /etc/passwd}
address@hidden adm:x:3:4:adm:/var/adm:/sbin/nologin
address@hidden apache:x:48:48:Apache:/var/www:/sbin/nologin
address@hidden avahi:x:70:70:Avahi daemon:/:/sbin/nologin
address@hidden
@end example
address@hidden @code{xgettext} utility
-When run with @option{--gen-pot}, @command{gawk} does not execute your
-program. Instead, it parses it as usual and prints all marked strings
-to standard output in the format of a GNU @code{gettext} Portable Object
-file. Also included in the output are any constant strings that
-appear as the first argument to @code{dcgettext()} or as the first and
-second argument to @code{dcngettext()address@hidden
address@hidden utility that comes with GNU
address@hidden can handle @file{.awk} files.}
address@hidden Example},
-for the full list of steps to go through to create and test
-translations for @command{guide}.
-
address@hidden Printf Ordering
address@hidden Rearranging @code{printf} Arguments
-
address@hidden @code{printf} statement, positional specifiers
address@hidden positional specifiers, @code{printf} statement
-Format strings for @code{printf} and @code{sprintf()}
-(@pxref{Printf})
-present a special problem for translation.
-Consider the following:@footnote{This example is borrowed
-from the GNU @code{gettext} manual.}
+The comparison should normally always return the same value when given a
+specific pair of array elements as its arguments. If inconsistent
+results are returned then the order is undefined. This behavior can be
+exploited to introduce random order into otherwise seemingly
+ordered data:
address@hidden line broken here only for smallbook format
@example
-printf(_"String `%s' has %d characters\n",
- string, length(string)))
+function cmp_randomize(i1, v1, i2, v2)
address@hidden
+ # random order
+ return (2 - 4 * rand())
address@hidden
@end example
-A possible German translation for this might be:
+As mentioned above, the order of the indices is arbitrary if two
+elements compare equal. This is usually not a problem, but letting
+the tied elements come out in arbitrary order can be an issue, especially
+when comparing item values. The partial ordering of the equal elements
+may change during the next loop traversal, if other elements are added or
+removed from the array. One way to resolve ties when comparing elements
+with otherwise equal values is to include the indices in the comparison
+rules. Note that doing this may make the loop traversal less efficient,
+so consider it only if necessary. The following comparison functions
+force a deterministic order, and are based on the fact that the
+indices of two elements are never equal:
@example
-"%d Zeichen lang ist die Zeichenkette `%s'\n"
+function cmp_numeric(i1, v1, i2, v2)
address@hidden
+ # numerical value (and index) comparison, descending order
+ return (v1 != v2) ? (v2 - v1) : (i2 - i1)
address@hidden
+
+function cmp_string(i1, v1, i2, v2)
address@hidden
+ # string value (and index) comparison, descending order
+ v1 = v1 i1
+ v2 = v2 i2
+ return (v1 > v2) ? -1 : (v1 != v2)
address@hidden
@end example
-The problem should be obvious: the order of the format
-specifications is different from the original!
-Even though @code{gettext()} can return the translated string
-at runtime,
-it cannot change the argument order in the call to @code{printf}.
address@hidden Avoid using the term ``stable'' when describing the unpredictable behavior
address@hidden if two items compare equal. Usually, the goal of a "stable algorithm"
address@hidden is to maintain the original order of the items, which is a meaningless
address@hidden concept for a list constructed from a hash.
-To solve this problem, @code{printf} format specifiers may have
-an additional optional element, which we call a @dfn{positional specifier}.
-For example:
+A custom comparison function can often simplify ordered loop
+traversal, and the sky is really the limit when it comes to
+designing such a function.
address@hidden
-"%2$d Zeichen lang ist die Zeichenkette `%1$s'\n"
address@hidden example
+When string comparisons are made during a sort, either for element
+values where one or both aren't numbers, or for element indices
+handled as strings, the value of @code{IGNORECASE}
+(@pxref{Built-in Variables}) controls whether
+the comparisons treat corresponding uppercase and lowercase letters as
+equivalent or distinct.
-Here, the positional specifier consists of an integer count, which indicates
which
-argument to use, and a @samp{$}. Counts are one-based, and the
-format string itself is @emph{not} included. Thus, in the following
-example, @samp{string} is the first argument and @samp{length(string)} is the
second:
+Another point to keep in mind is that in the case of subarrays
+the element values can themselves be arrays; a production comparison
+function should use the @code{isarray()} function
+(@pxref{Type Functions}),
+to check for this, and choose a defined sorting order for subarrays.
address@hidden
-$ @kbd{gawk 'BEGIN @{}
-> @kbd{string = "Dont Panic"}
-> @kbd{printf _"%2$d characters live in \"%1$s\"\n",}
-> @kbd{string, length(string)}
-> @address@hidden'}
address@hidden 10 characters live in "Dont Panic"
address@hidden example
+All sorting based on @code{PROCINFO["sorted_in"]}
+is disabled in POSIX mode,
+since the @code{PROCINFO} array is not special in that case.
-If present, positional specifiers come first in the format specification,
-before the flags, the field width, and/or the precision.
+As a side note, sorting the array indices before traversing
+the array has been reported to add 15% to 20% overhead to the
+execution time of @command{awk} programs. For this reason,
+sorted array traversal is not the default.
-Positional specifiers can be used with the dynamic field width and
-precision capability:
address@hidden The @command{gawk}
address@hidden maintainers believe that only the people who wish to use a
address@hidden feature should have to pay for it.
address@hidden
-$ @kbd{gawk 'BEGIN @{}
-> @kbd{printf("%*.*s\n", 10, 20, "hello")}
-> @kbd{printf("%3$*2$.*1$s\n", 20, 10, "hello")}
-> @address@hidden'}
address@hidden hello
address@hidden hello
address@hidden example
-
address@hidden NOTE
-When using @samp{*} with a positional specifier, the @samp{*}
-comes first, then the integer position, and then the @samp{$}.
-This is somewhat counterintuitive.
address@hidden quotation
address@hidden Array Sorting Functions
address@hidden Sorting Array Values and Indices with @command{gawk}
address@hidden @code{printf} statement, positional specifiers, mixing with regular formats
address@hidden positional specifiers, @code{printf} statement, mixing with regular formats
address@hidden format specifiers, mixing regular with positional specifiers
address@hidden does not allow you to mix regular format specifiers
-and those with positional specifiers in the same string:
address@hidden arrays, sorting
address@hidden @code{asort()} function (@command{gawk})
address@hidden @code{asort()} function (@command{gawk}), address@hidden sorting
address@hidden sort function, arrays, sorting
+In most @command{awk} implementations, sorting an array requires
+writing a @code{sort()} function.
+While this can be educational for exploring different sorting algorithms,
+usually that's not the point of the program.
address@hidden provides the built-in @code{asort()}
+and @code{asorti()} functions
+(@pxref{String Functions})
+for sorting arrays. For example:
@example
-$ @kbd{gawk 'BEGIN @{ printf _"%d %3$s\n", 1, 2, "hi" @}'}
address@hidden gawk: cmd. line:1: fatal: must use `count$' on all formats or none
address@hidden the array} data
+n = asort(data)
+for (i = 1; i <= n; i++)
+ @var{do something with} data[i]
@end example
address@hidden NOTE
-There are some pathological cases that @command{gawk} may fail to
-diagnose. In such cases, the output may not be what you expect.
-It's still a bad idea to try mixing them, even if @command{gawk}
-doesn't detect it.
address@hidden quotation
-
-Although positional specifiers can be used directly in @command{awk} programs,
-their primary purpose is to help in producing correct translations of
-format strings into languages different from the one in which the program
-is first written.
-
address@hidden I18N Portability
address@hidden @command{awk} Portability Issues
+After the call to @code{asort()}, the array @code{data} is indexed from 1
+to some number @var{n}, the total number of elements in @code{data}.
+(This count is @code{asort()}'s return value.)
address@hidden @value{LEQ} @code{data[2]} @value{LEQ} @code{data[3]}, and so on.
+The comparison is based on the type of the elements
+(@pxref{Typing and Comparison}).
+All numeric values come before all string values,
+which in turn come before all subarrays.
address@hidden portability, internationalization and
address@hidden internationalization, localization, portability and
address@hidden's internationalization features were purposely chosen to
-have as little impact as possible on the portability of @command{awk}
-programs that use them to other versions of @command{awk}.
-Consider this program:
address@hidden side effects, @code{asort()} function
+An important side effect of calling @code{asort()} is that
address@hidden array's original indices are irrevocably lost}.
+As this isn't always desirable, @code{asort()} accepts a
+second argument:
@example
-BEGIN @{
- TEXTDOMAIN = "guide"
- if (Test_Guide) # set with -v
- bindtextdomain("/test/guide/messages")
- print _"don't panic!"
address@hidden
+@var{populate the array} source
+n = asort(source, dest)
+for (i = 1; i <= n; i++)
+ @var{do something with} dest[i]
@end example
address@hidden
-As written, it won't work on other versions of @command{awk}.
-However, it is actually almost portable, requiring very little
-change:
+In this case, @command{gawk} copies the @code{source} array into the
+@code{dest} array and then sorts @code{dest}, destroying its indices.
+However, the @code{source} array is not affected.
address@hidden @bullet
address@hidden @code{TEXTDOMAIN} variable, portability and
address@hidden
-Assignments to @code{TEXTDOMAIN} won't have any effect,
-since @code{TEXTDOMAIN} is not special in other @command{awk} implementations.
+@code{asort()} accepts a third string argument to control comparison of
+array elements. As with @code{PROCINFO["sorted_in"]}, this argument
+may be one of the predefined names that @command{gawk} provides
+(@pxref{Controlling Scanning}), or the name of a user-defined function
+(@pxref{Controlling Array Traversal}).
address@hidden
-Non-GNU versions of @command{awk} treat marked strings
-as the concatenation of a variable named @code{_} with the string
-following address@hidden is good fodder for an ``Obfuscated
address@hidden'' contest.} Typically, the variable @code{_} has
-the null string (@code{""}) as its value, leaving the original string constant as
-the result.
address@hidden NOTE
+In all cases, the sorted element values consist of the original
+array's element values. The ability to control comparison merely
+affects the way in which they are sorted.
address@hidden quotation
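+
+As a minimal sketch of the third argument (the array contents and the
+name @code{sorted} are assumptions made for illustration), the following
+sorts a copy of the values in descending numeric order by using the
+predefined name @code{"@@val_num_desc"}:
+
+@example
+BEGIN @{
+    data["x"] = 10; data["y"] = 30; data["z"] = 20
+    # sort the values into a second array, largest number first
+    n = asort(data, sorted, "@@val_num_desc")
+    for (i = 1; i <= n; i++)
+        print i, sorted[i]
+@}
+@end example
+
+The @code{data} array keeps its original indices, since the sorted
+values are placed in @code{sorted}.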
address@hidden
-By defining ``dummy'' functions to replace @code{dcgettext()}, @code{dcngettext()}
-and @code{bindtextdomain()}, the @command{awk} program can be made to run, but
-all the messages are output in the original language.
-For example:
+Often, what's needed is to sort on the values of the @emph{indices}
+instead of the values of the elements.
+To do that, use the
+@code{asorti()} function. The interface is identical to that of
+@code{asort()}, except that the index values are used for sorting, and
+become the values of the result array:
address@hidden @code{bindtextdomain()} function (@command{gawk}), portability and
address@hidden @code{dcgettext()} function (@command{gawk}), portability and
address@hidden @code{dcngettext()} function (@command{gawk}), portability and
@example
address@hidden file eg/lib/libintl.awk
-function bindtextdomain(dir, domain)
address@hidden
- return dir
address@hidden
-
-function dcgettext(string, domain, category)
address@hidden
- return string
address@hidden
address@hidden source[$0] = some_func($0) @}
-function dcngettext(string1, string2, number, domain, category)
address@hidden
- return (number == 1 ? string1 : string2)
+END @{
+ n = asorti(source, dest)
+ for (i = 1; i <= n; i++) @{
+ @ii{Work with sorted indices directly:}
+ @var{do something with} dest[i]
+ @dots{}
+ @ii{Access original array via sorted indices:}
+ @var{do something with} source[dest[i]]
+ @}
@}
address@hidden endfile
@end example
address@hidden
-The use of positional specifications in @code{printf} or
address@hidden()} is @emph{not} portable.
-To support @code{gettext()} at the C level, many systems' C versions of
address@hidden()} do support positional specifiers. But it works only if
-enough arguments are supplied in the function call. Many versions of
address@hidden pass @code{printf} formats and arguments unchanged to the
-underlying C library version of @code{sprintf()}, but only one format and
-argument at a time. What happens if a positional specification is
-used is anybody's guess.
-However, since the positional specifications are primarily for use in
address@hidden format strings, and since non-GNU @command{awk}s never
-retrieve the translated string, this should not be a problem in practice.
address@hidden itemize
address@hidden ENDOFRANGE inap
+Similar to @code{asort()},
+in all cases, the sorted element values consist of the original
+array's indices. The ability to control comparison merely
+affects the way in which they are sorted.
address@hidden I18N Example
address@hidden A Simple Internationalization Example
+Sorting the array by replacing the indices provides maximal flexibility.
+To traverse the elements in decreasing order, use a loop that goes from
+@var{n} down to 1, either over the elements or over the indices.@footnote{You
+may also use one of the predefined sorting names that sorts in
+decreasing order.}
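+
+For instance, a sketch of the decreasing-order loop over sorted indices
+might look like this (@code{source} and @code{dest} are the same
+illustrative names used in the @code{asorti()} example; what is done
+with each element is an assumption):
+
+@example
+END @{
+    n = asorti(source, dest)
+    # walk the sorted indices from last to first
+    for (i = n; i >= 1; i--)
+        print dest[i], source[dest[i]]
+@}
+@end example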
-Now let's look at a step-by-step example of how to internationalize and
-localize a simple @command{awk} program, using @file{guide.awk} as our
-original source:
address@hidden reference counting, sorting arrays
+Copying array indices and elements isn't expensive in terms of memory.
+Internally, @command{gawk} maintains @dfn{reference counts} to data.
+For example, when @code{asort()} copies the first array to the second one,
+there is only one copy of the original array elements' data, even though
+both arrays use the values.
address@hidden
address@hidden file eg/prog/guide.awk
-BEGIN @{
- TEXTDOMAIN = "guide"
- bindtextdomain(".") # for testing
- print _"Don't Panic"
- print _"The Answer Is", 42
- print "Pardon me, Zaphod who?"
address@hidden
address@hidden endfile
address@hidden example
address@hidden Document It And Call It A Feature. Sigh.
address@hidden @command{gawk}, @code{IGNORECASE} variable in
address@hidden @code{IGNORECASE} variable
address@hidden arrays, sorting, @code{IGNORECASE} variable and
address@hidden @code{IGNORECASE} variable, array sorting and
+Because @code{IGNORECASE} affects string comparisons, the value
+of @code{IGNORECASE} also affects sorting for both @code{asort()} and @code{asorti()}.
+Note also that the locale's sorting order does @emph{not}
+come into play; comparisons are based on character values only.@footnote{This
+is true because locale-based comparison occurs only when in POSIX
+compatibility mode, and since @code{asort()} and @code{asorti()} are
+@command{gawk} extensions, they are not available in that case.}
+Caveat Emptor.
address@hidden
-Run @samp{gawk --gen-pot} to create the @file{.pot} file:
address@hidden Two-way I/O
address@hidden Two-Way Communications with Another Process
address@hidden Brennan, Michael
address@hidden programmers, attractiveness of
address@hidden
address@hidden Path: cssun.mathcs.emory.edu!gatech!newsxfer3.itd.umich.edu!news-peer.sprintlink.net!news-sea-19.sprintlink.net!news-in-west.sprintlink.net!news.sprintlink.net!Sprint!204.94.52.5!news.whidbey.com!brennan
+From: brennan@@whidbey.com (Mike Brennan)
+Newsgroups: comp.lang.awk
+Subject: Re: Learn the SECRET to Attract Women Easily
+Date: 4 Aug 1997 17:34:46 GMT
address@hidden Organization: WhidbeyNet
address@hidden Lines: 12
+Message-ID: <5s53rm$eca@@news.whidbey.com>
address@hidden References: <address@hidden>
address@hidden Reply-To: address@hidden
address@hidden NNTP-Posting-Host: asn202.whidbey.com
address@hidden X-Newsreader: slrn (0.9.4.1 UNIX)
address@hidden Xref: cssun.mathcs.emory.edu comp.lang.awk:5403
address@hidden
-$ @kbd{gawk --gen-pot -f guide.awk > guide.pot}
address@hidden example
+On 3 Aug 1997 13:17:43 GMT, Want More Dates???
+<tracy78@@kilgrona.com> wrote:
+>Learn the SECRET to Attract Women Easily
+>
+>The SCENT(tm) Pheromone Sex Attractant For Men to Attract Women
address@hidden
-This produces:
+The scent of awk programmers is a lot more attractive to women than
+the scent of perl programmers.
+--
+Mike Brennan
address@hidden brennan@@whidbey.com
address@hidden smallexample
address@hidden
address@hidden file eg/data/guide.po
-#: guide.awk:4
-msgid "Don't Panic"
-msgstr ""
address@hidden advanced features, @command{gawk}, address@hidden communicating with
address@hidden processes, two-way communications with
+It is often useful to be able to
+send data to a separate program for
+processing and then read the result. This can always be
+done with temporary files:
-#: guide.awk:5
-msgid "The Answer Is"
-msgstr ""
address@hidden
+# Write the data for processing
+tempfile = ("mydata." PROCINFO["pid"])
+while (@var{not done with data})
+ print @var{data} | ("subprogram > " tempfile)
+close("subprogram > " tempfile)
address@hidden endfile
+# Read the results, remove tempfile when done
+while ((getline newdata < tempfile) > 0)
+ @var{process} newdata @var{appropriately}
+close(tempfile)
+system("rm " tempfile)
@end example
-This original portable object template file is saved and reused for each language
-into which the application is translated. The @code{msgid}
-is the original string and the @code{msgstr} is the translation.
-
address@hidden NOTE
-Strings not marked with a leading underscore do not
-appear in the @file{guide.pot} file.
address@hidden quotation
address@hidden
+This works, but not elegantly. Among other things, it requires that
+the program be run in a directory that cannot be shared among users;
+for example, @file{/tmp} will not do, as another user might happen
+to be using a temporary file with the same name.
-Next, the messages must be translated.
-Here is a translation to a hypothetical dialect of English,
-called ``Mellow'':@footnote{Perhaps it would be better if it were
-called ``Hippy.'' Ah, well.}
address@hidden coprocesses
address@hidden input/output, two-way
address@hidden @code{|} (vertical bar), @code{|&} operator (I/O)
address@hidden vertical bar (@code{|}), @code{|&} operator (I/O)
address@hidden @command{csh} utility, @code{|&} operator, comparison with
+However, with @command{gawk}, it is possible to
+open a @emph{two-way} pipe to another process. The second process is
+termed a @dfn{coprocess}, since it runs in parallel with @command{gawk}.
+The two-way connection is created using the @samp{|&} operator
+(borrowed from the Korn shell, @command{ksh}):@footnote{This is very
+different from the same operator in the C shell.}
@example
address@hidden
-$ cp guide.pot guide-mellow.po
address@hidden translations to} guide-mellow.po @dots{}
address@hidden group
+do @{
+ print @var{data} |& "subprogram"
+ "subprogram" |& getline results
+@} while (@var{data left to process})
+close("subprogram")
@end example
address@hidden
-Following are the translations:
-
address@hidden
address@hidden file eg/data/guide-mellow.po
-#: guide.awk:4
-msgid "Don't Panic"
-msgstr "Hey man, relax!"
-
-#: guide.awk:5
-msgid "The Answer Is"
-msgstr "Like, the scoop is"
+The first time an I/O operation is executed using the @samp{|&}
+operator, @command{gawk} creates a two-way pipeline to a child process
+that runs the other program. Output created with @code{print}
+or @code{printf} is written to the program's standard input, and
+output from the program's standard output can be read by the @command{gawk}
+program using @code{getline}.
+As is the case with processes started by @samp{|}, the subprogram
+can be any program, or pipeline of programs, that can be started by
+the shell.
address@hidden endfile
address@hidden example
+There are some cautionary items to be aware of:
address@hidden Linux
address@hidden GNU/Linux
-The next step is to make the directory to hold the binary message object
-file and then to create the @file{guide.mo} file.
-The directory layout shown here is standard for GNU @code{gettext} on
-GNU/Linux systems. Other versions of @code{gettext} may use a different
-layout:
address@hidden @bullet
address@hidden
+As the code inside @command{gawk} currently stands, the coprocess's
+standard error goes to the same place that the parent @command{gawk}'s
+standard error goes. It is not possible to read the child's
+standard error separately.
address@hidden
-$ @kbd{mkdir en_US en_US/LC_MESSAGES}
address@hidden example
address@hidden deadlocks
address@hidden buffering, input/output
address@hidden @code{getline} command, deadlock and
address@hidden
+I/O buffering may be a problem. @command{gawk} automatically
+flushes all output down the pipe to the coprocess.
+However, if the coprocess does not flush its output,
+@command{gawk} may hang when doing a @code{getline} in order to read
+the coprocess's results. This could lead to a situation
+known as @dfn{deadlock}, where each process is waiting for the
+other one to do something.
address@hidden itemize
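+
+One way to sidestep the buffering problem, sketched here under the
+assumption that the coprocess is itself a @command{gawk} program that
+you control (the one-line uppercasing filter is purely illustrative),
+is to have the coprocess flush its output after every record:
+
+@example
+BEGIN @{
+    # the coprocess flushes after each line, so getline need not
+    # wait for the coprocess's output buffer to fill
+    command = "gawk '@{ print toupper($0); fflush() @}'"
+    print "hello" |& command
+    if ((command |& getline line) > 0)
+        print "got", line
+    close(command)
+@}
+@end example
+
+This does not help with programs whose buffering you cannot change.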
address@hidden @code{.po} files, converting to @code{.mo}
address@hidden files, @code{.po}, converting to @code{.mo}
address@hidden @code{.mo} files, converting from @code{.po}
address@hidden files, @code{.mo}, converting from @code{.po}
address@hidden portable object files, converting to message object files
address@hidden files, portable object, converting to message object files
address@hidden message object files, converting from portable object files
address@hidden files, message object, converting from portable object files
address@hidden @command{msgfmt} utility
-The @command{msgfmt} utility does the conversion from human-readable
address@hidden file to machine-readable @file{.mo} file.
-By default, @command{msgfmt} creates a file named @file{messages}.
-This file must be renamed and placed in the proper directory so that
address@hidden can find it:
address@hidden @code{close()} function, two-way pipes and
+It is possible to close just one end of the two-way pipe to
+a coprocess, by supplying a second argument to the @code{close()}
+function of either @code{"to"} or @code{"from"}
+(@pxref{Close Files And Pipes}).
+These strings tell @command{gawk} to close the end of the pipe
+that sends data to the coprocess or the end that reads from it,
+respectively.
address@hidden
-$ @kbd{msgfmt guide-mellow.po}
-$ @kbd{mv messages en_US/LC_MESSAGES/guide.mo}
address@hidden example
address@hidden @command{sort} utility, coprocesses and
+This is particularly necessary in order to use
+the system @command{sort} utility as part of a coprocess;
+@command{sort} must read @emph{all} of its input
+data before it can produce any output.
+The @command{sort} program does not receive an end-of-file indication
+until @command{gawk} closes the write end of the pipe.
-Finally, we run the program to test it:
+When you have finished writing data to the @command{sort}
+utility, you can close the @code{"to"} end of the pipe, and
+then start reading sorted data via @code{getline}.
+For example:
@example
-$ @kbd{gawk -f guide.awk}
address@hidden Hey man, relax!
address@hidden Like, the scoop is 42
address@hidden Pardon me, Zaphod who?
address@hidden example
+BEGIN @{
+ command = "LC_ALL=C sort"
+ n = split("abcdefghijklmnopqrstuvwxyz", a, "")
-If the three replacement functions for @code{dcgettext()}, @code{dcngettext()}
-and @code{bindtextdomain()}
-(@pxref{I18N Portability})
-are in a file named @file{libintl.awk},
-then we can run @file{guide.awk} unchanged as follows:
+ for (i = n; i > 0; i--)
+ print a[i] |& command
+ close(command, "to")
address@hidden
-$ @kbd{gawk --posix -f guide.awk -f libintl.awk}
address@hidden Don't Panic
address@hidden The Answer Is 42
address@hidden Pardon me, Zaphod who?
+ while ((command |& getline line) > 0)
+ print "got", line
+ close(command)
+@}
@end example
address@hidden Gawk I18N
address@hidden @command{gawk} Can Speak Your Language
+This program writes the letters of the alphabet in reverse order, one
+per line, down the two-way pipe to @command{sort}. It then closes the
+write end of the pipe, so that @command{sort} receives an end-of-file
+indication. This causes @command{sort} to sort the data and write the
+sorted data back to the @command{gawk} program. Once all of the data
+has been read, @command{gawk} terminates the coprocess and exits.
address@hidden itself has been internationalized
-using the GNU @code{gettext} package.
-(GNU @code{gettext} is described in
-complete detail in
address@hidden
address@hidden, , GNU @code{gettext} utilities, gettext, GNU gettext tools}.)
address@hidden ifinfo
address@hidden
address@hidden gettext tools}.)
address@hidden ifnotinfo
-As of this writing, the latest version of GNU @code{gettext} is
address@hidden://ftp.gnu.org/gnu/gettext/gettext-0.18.2.1.tar.gz, @value{PVERSION} 0.18.2.1}.
+As a side note, the assignment @samp{LC_ALL=C} in the @command{sort}
+command ensures traditional Unix (ASCII) sorting from @command{sort}.
-If a translation of @command{gawk}'s messages exists,
-then @command{gawk} produces usage messages, warnings,
-and fatal errors in the local language.
address@hidden ENDOFRANGE inloc
address@hidden @command{gawk}, @code{PROCINFO} array in
address@hidden @code{PROCINFO} array
+You may also use pseudo-ttys (ptys) for
+two-way communication instead of pipes, if your system supports them.
+This is done on a per-command basis, by setting a special element
+in the @code{PROCINFO} array
+(@pxref{Auto-set}),
+like so:
address@hidden Advanced Features
address@hidden Advanced Features of @command{gawk}
address@hidden advanced features, network connections, See Also networks, connections
address@hidden STARTOFRANGE gawadv
address@hidden @command{gawk}, features, advanced
address@hidden STARTOFRANGE advgaw
address@hidden advanced features, @command{gawk}
address@hidden
-Contributed by: Peter Langston <address@hidden>
address@hidden
+command = "sort -nr" # command, save in convenience variable
+PROCINFO[command, "pty"] = 1 # update PROCINFO
+print @dots{} |& command # start two-way pipe
address@hidden
address@hidden example
- Found in Steve English's "signature" line:
address@hidden
+Using ptys avoids the buffer deadlock issues described earlier, at some
+loss in performance. If your system does not have ptys, or if all the
+system's ptys are in use, @command{gawk} automatically falls back to
+using regular pipes.
-"Write documentation as if whoever reads it is a violent psychopath
-who knows where you live."
address@hidden ignore
address@hidden TCP/IP Networking
address@hidden Using @command{gawk} for Network Programming
address@hidden advanced features, @command{gawk}, network programming
address@hidden networks, programming
address@hidden STARTOFRANGE tcpip
address@hidden TCP/IP
address@hidden @code{/inet/@dots{}} special files (@command{gawk})
address@hidden files, @code{/inet/@dots{}} (@command{gawk})
address@hidden @code{/inet4/@dots{}} special files (@command{gawk})
address@hidden files, @code{/inet4/@dots{}} (@command{gawk})
address@hidden @code{/inet6/@dots{}} special files (@command{gawk})
address@hidden files, @code{/inet6/@dots{}} (@command{gawk})
address@hidden @code{EMISTERED}
@quotation
address@hidden documentation as if whoever reads it is
-a violent psychopath who knows where you address@hidden
-Steve English, as quoted by Peter Langston
address@hidden:@*
+@ @ @ @ @i{A host is a host from coast to coast,@*
+@ @ @ @ and no-one can talk to host that's close,@*
+@ @ @ @ unless the host that isn't close@*
+@ @ @ @ is busy hung or dead.}
@end quotation
-This @value{CHAPTER} discusses advanced features in @command{gawk}.
-It's a bit of a ``grab bag'' of items that are otherwise unrelated
-to each other.
-First, a command-line option allows @command{gawk} to recognize
-nondecimal numbers in input data, not just in @command{awk}
-programs.
-Then, @command{gawk}'s special features for sorting arrays are presented.
-Next, two-way I/O, discussed briefly in earlier parts of this
address@hidden, is described in full detail, along with the basics
-of TCP/IP networking. Finally, @command{gawk}
-can @dfn{profile} an @command{awk} program, making it possible to tune
-it for performance.
+In addition to being able to open a two-way pipeline to a coprocess
+on the same system
+(@pxref{Two-way I/O}),
+it is possible to make a two-way connection to
+another process on another system across an IP network connection.
-A number of advanced features require separate @value{CHAPTER}s of their
-own:
-
address@hidden @bullet
address@hidden
address@hidden, discusses how to internationalize
-your @command{awk} programs, so that they can speak multiple
-national languages.
+You can think of this as just a @emph{very long} two-way pipeline to
+a coprocess.
+The way @command{gawk} decides that you want to use TCP/IP networking is
+by recognizing special @value{FN}s that begin with one of @samp{/inet/},
+@samp{/inet4/} or @samp{/inet6/}.
address@hidden
address@hidden, describes @command{gawk}'s built-in command-line
-debugger for debugging @command{awk} programs.
+The full syntax of the special @value{FN} is
+@file{/@var{net-type}/@var{protocol}/@var{local-port}/@var{remote-host}/@var{remote-port}}.
+The components are:
address@hidden
address@hidden Precision Arithmetic}, describes how you can use
address@hidden to perform arbitrary-precision arithmetic.
address@hidden @var
address@hidden net-type
+Specifies the kind of Internet connection to make.
+Use @samp{/inet4/} to force IPv4, and
+@samp{/inet6/} to force IPv6.
+Plain @samp{/inet/} (which used to be the only option) uses
+the system default, most likely IPv4.
address@hidden
address@hidden Extensions},
-discusses the ability to dynamically add new built-in functions to
address@hidden
address@hidden itemize
address@hidden protocol
+The protocol to use over IP. This must be either @samp{tcp}, or
+@samp{udp}, for a TCP or UDP IP connection,
+respectively. The use of TCP is recommended for most applications.
address@hidden
-* Nondecimal Data:: Allowing nondecimal input data.
-* Array Sorting:: Facilities for controlling array traversal and
- sorting arrays.
-* Two-way I/O:: Two-way communications with another process.
-* TCP/IP Networking:: Using @command{gawk} for network programming.
-* Profiling:: Profiling your @command{awk} programs.
address@hidden menu
address@hidden local-port
address@hidden @code{getaddrinfo()} function (C library)
+The local TCP or UDP port number to use. Use a port number of @samp{0}
+when you want the system to pick a port. This is what you should do
+when writing a TCP or UDP client.
+You may also use a well-known service name, such as @samp{smtp}
+or @samp{http}, in which case @command{gawk} attempts to determine
+the predefined port number using the C @code{getaddrinfo()} function.
address@hidden Nondecimal Data
address@hidden Allowing Nondecimal Input Data
address@hidden @code{--non-decimal-data} option
address@hidden advanced features, @command{gawk}, nondecimal input data
address@hidden input, address@hidden nondecimal
address@hidden constants, nondecimal
address@hidden remote-host
+The IP address or fully-qualified domain name of the Internet
+host to which you want to connect.
-If you run @command{gawk} with the @option{--non-decimal-data} option,
-you can have nondecimal constants in your input data:
address@hidden remote-port
+The TCP or UDP port number to use on the given @var{remote-host}.
+Again, use @samp{0} if you don't care, or else a well-known
+service name.
address@hidden table
address@hidden line break here for small book format
address@hidden
-$ @kbd{echo 0123 123 0x123 |}
-> @kbd{gawk --non-decimal-data '@{ printf "%d, %d, %d\n",}
-> @kbd{$1, $2, $3 @}'}
address@hidden 83, 123, 291
address@hidden example
address@hidden @command{gawk}, @code{ERRNO} variable in
address@hidden @code{ERRNO} variable
address@hidden NOTE
+Failure in opening a two-way socket will result in a non-fatal error
+being returned to the calling code. The value of @code{ERRNO} indicates
+the error (@pxref{Auto-set}).
address@hidden quotation
-For this feature to work, write your program so that
address@hidden treats your data as numeric:
+Consider the following very simple example:
@example
-$ @kbd{echo 0123 123 0x123 | gawk '@{ print $1, $2, $3 @}'}
address@hidden 0123 123 0x123
+BEGIN @{
+ Service = "/inet/tcp/0/localhost/daytime"
+ Service |& getline
+ print $0
+ close(Service)
+@}
@end example
address@hidden
-The @code{print} statement treats its expressions as strings.
-Although the fields can act as numbers when necessary,
-they are still strings, so @code{print} does not try to treat them
-numerically. You may need to add zero to a field to force it to
-be treated as a number. For example:
-
address@hidden
-$ @kbd{echo 0123 123 0x123 | gawk --non-decimal-data '}
-> @address@hidden print $1, $2, $3}
-> @kbd{print $1 + 0, $2 + 0, $3 + 0 @}'}
address@hidden 0123 123 0x123
address@hidden 83 123 291
address@hidden example
+This program reads the current date and time from the local system's
+TCP @samp{daytime} server.
+It then prints the results and closes the connection.
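+
+Because a failed open is not fatal, a more defensive variant can test
+the return value of @code{getline} and report @code{ERRNO}.  The
+following is only a sketch; it assumes, as the example above does, that
+a @samp{daytime} server is running on the local host, and the wording
+of the error message is illustrative:
+
+@example
+BEGIN @{
+    Service = "/inet/tcp/0/localhost/daytime"
+    if ((Service |& getline line) > 0)
+        print line
+    else    # the connection failed, or no data arrived
+        print "cannot read from " Service ": " ERRNO > "/dev/stderr"
+    close(Service)
+@}
+@end example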
-Because it is common to have decimal data with leading zeros, and because
-using this facility could lead to surprising results, the default is to leave it
-disabled. If you want it, you must explicitly request it.
+Because this topic is extensive, the use of @command{gawk} for
+TCP/IP programming is documented separately.
address@hidden
+See
address@hidden, , General Introduction, gawkinet, TCP/IP Internetworking with @command{gawk}},
address@hidden ifinfo
address@hidden
+See @cite{TCP/IP Internetworking with @command{gawk}},
+which comes as part of the @command{gawk} distribution,
address@hidden ifnotinfo
+for a much more complete introduction and discussion, as well as
+extensive examples.
address@hidden programming conventions, @code{--non-decimal-data} option
address@hidden @code{--non-decimal-data} option, @code{strtonum()} function and
address@hidden @code{strtonum()} function (@command{gawk}), @code{--non-decimal-data} option and
address@hidden CAUTION
address@hidden of this option is not recommended.}
-It can break old programs very badly.
-Instead, use the @code{strtonum()} function to convert your data
-(@pxref{Nondecimal-numbers}).
-This makes your programs easier to write and easier to read, and
-leads to less surprising results.
address@hidden quotation
address@hidden ENDOFRANGE tcpip
address@hidden Array Sorting
address@hidden Controlling Array Traversal and Array Sorting
address@hidden Profiling
address@hidden Profiling Your @command{awk} Programs
address@hidden STARTOFRANGE awkp
address@hidden @command{awk} programs, profiling
address@hidden STARTOFRANGE proawk
address@hidden profiling @command{awk} programs
address@hidden profiling @command{gawk}
address@hidden @code{awkprof.out} file
address@hidden files, @code{awkprof.out}
address@hidden lets you control the order in which a @samp{for (i in array)}
-loop traverses an array.
+You may produce execution traces of your @command{awk} programs.
+This is done by passing the option @option{--profile} to @command{gawk}.
+When @command{gawk} has finished running, it creates a profile of your program in a file
+named @file{awkprof.out}. Because it is profiling, it also executes up to 45% slower than
+@command{gawk} normally does.
-In addition, two built-in functions, @code{asort()} and @code{asorti()},
-let you sort arrays based on the array values and indices, respectively.
-These two functions also provide control over the sorting criteria used
-to order the elements during sorting.
address@hidden @code{--profile} option
+As shown in the following example,
+the @option{--profile} option can be used to change the name of the file
+where @command{gawk} will write the profile:
address@hidden
-* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
-* Array Sorting Functions:: How to use @code{asort()} and @code{asorti()}.
address@hidden menu
address@hidden
+gawk --profile=myprog.prof -f myprog.awk data1 data2
address@hidden example
address@hidden Controlling Array Traversal
address@hidden Controlling Array Traversal
address@hidden
+In the above example, @command{gawk} places the profile in
+@file{myprog.prof} instead of in @file{awkprof.out}.
-By default, the order in which a @samp{for (i in array)} loop
-scans an array is not defined; it is generally based upon
-the internal implementation of arrays inside @command{awk}.
+Here is a sample session showing a simple @command{awk} program, its input data, and the
+results from running @command{gawk} with the @option{--profile} option.
+First, the @command{awk} program:
-Often, though, it is desirable to be able to loop over the elements
-in a particular order that you, the programmer, choose. @command{gawk}
-lets you do this.
address@hidden
+BEGIN @{ print "First BEGIN rule" @}
address@hidden Scanning}, describes how you can assign special,
-pre-defined values to @code{PROCINFO["sorted_in"]} in order to
-control the order in which @command{gawk} will traverse an array
-during a @code{for} loop.
+END @{ print "First END rule" @}
-In addition, the value of @code{PROCINFO["sorted_in"]} can be a function name.
-This lets you traverse an array based on any custom criterion.
-The array elements are ordered according to the return value of this
-function. The comparison function should be defined with at least
-four arguments:
+/foo/ @{
+ print "matched /foo/, gosh"
+ for (i = 1; i <= 3; i++)
+ sing()
address@hidden
address@hidden
-function comp_func(i1, v1, i2, v2)
@{
- @var{compare elements 1 and 2 in some fashion}
- @var{return < 0; 0; or > 0}
+ if (/foo/)
+ print "if is true"
+ else
+ print "else is true"
@}
address@hidden example
-
-Here, @var{i1} and @var{i2} are the indices, and @var{v1} and @var{v2}
-are the corresponding values of the two elements being compared.
-Either @var{v1} or @var{v2}, or both, can be arrays if the array being
-traversed contains subarrays as values.
-(@xref{Arrays of Arrays}, for more information about subarrays.)
-The three possible return values are interpreted as follows:
-
address@hidden @code
address@hidden comp_func(i1, v1, i2, v2) < 0
-Index @var{i1} comes before index @var{i2} during loop traversal.
-
address@hidden comp_func(i1, v1, i2, v2) == 0
-Indices @var{i1} and @var{i2}
-come together but the relative order with respect to each other is undefined.
address@hidden comp_func(i1, v1, i2, v2) > 0
-Index @var{i1} comes after index @var{i2} during loop traversal.
address@hidden table
+BEGIN @{ print "Second BEGIN rule" @}
-Our first comparison function can be used to scan an array in
-numerical order of the indices:
+END @{ print "Second END rule" @}
address@hidden
-function cmp_num_idx(i1, v1, i2, v2)
+function sing( dummy)
@{
- # numerical index comparison, ascending order
- return (i1 - i2)
+ print "I gotta be me!"
@}
@end example
-Our second function traverses an array based on the string order of
-the element values rather than by indices:
+Following is the input data:
@example
-function cmp_str_val(i1, v1, i2, v2)
address@hidden
- # string value comparison, ascending order
- v1 = v1 ""
- v2 = v2 ""
- if (v1 < v2)
- return -1
- return (v1 != v2)
address@hidden
+foo
+bar
+baz
+foo
+junk
@end example
-The third
-comparison function makes all numbers, and numeric strings without
-any leading or trailing spaces, come out first during loop traversal:
+Here is the @file{awkprof.out} that results from running the @command{gawk}
+profiler on this program and data (this example also illustrates that @command{awk}
+programmers sometimes have to work late):
address@hidden @code{BEGIN} pattern
address@hidden @code{END} pattern
@example
-function cmp_num_str_val(i1, v1, i2, v2, n1, n2)
address@hidden
- # numbers before string value comparison, ascending order
- n1 = v1 + 0
- n2 = v2 + 0
- if (n1 == v1)
- return (n2 == v2) ? (n1 - n2) : -1
- else if (n2 == v2)
- return 1
- return (v1 < v2) ? -1 : (v1 != v2)
address@hidden
address@hidden example
+ # gawk profile, created Sun Aug 13 00:00:15 2000
-Here is a main program to demonstrate how @command{gawk}
-behaves using each of the previous functions:
+ # BEGIN block(s)
address@hidden
-BEGIN @{
- data["one"] = 10
- data["two"] = 20
- data[10] = "one"
- data[100] = 100
- data[20] = "two"
-
- f[1] = "cmp_num_idx"
- f[2] = "cmp_str_val"
- f[3] = "cmp_num_str_val"
- for (i = 1; i <= 3; i++) @{
- printf("Sort function: %s\n", f[i])
- PROCINFO["sorted_in"] = f[i]
- for (j in data)
- printf("\tdata[%s] = %s\n", j, data[j])
- print ""
- @}
address@hidden
address@hidden example
+ BEGIN @{
+ 1 print "First BEGIN rule"
+ 1 print "Second BEGIN rule"
+ @}
-Here are the results when the program is run:
+ # Rule(s)
address@hidden
-$ @kbd{gawk -f compdemo.awk}
address@hidden Sort function: cmp_num_idx @ii{Sort by numeric index}
address@hidden data[two] = 20
address@hidden data[one] = 10 @ii{Both strings are numerically
zero}
address@hidden data[10] = one
address@hidden data[20] = two
address@hidden data[100] = 100
address@hidden
address@hidden Sort function: cmp_str_val @ii{Sort by element values as
strings}
address@hidden data[one] = 10
address@hidden data[100] = 100 @ii{String 100 is less than
string 20}
address@hidden data[two] = 20
address@hidden data[10] = one
address@hidden data[20] = two
address@hidden
address@hidden Sort function: cmp_num_str_val @ii{Sort all numeric values
before all strings}
address@hidden data[one] = 10
address@hidden data[two] = 20
address@hidden data[100] = 100
address@hidden data[10] = one
address@hidden data[20] = two
address@hidden example
+ 5 /foo/ @{ # 2
+ 2 print "matched /foo/, gosh"
+ 6 for (i = 1; i <= 3; i++) @{
+ 6 sing()
+ @}
+ @}
-Consider sorting the entries of a GNU/Linux system password file
-according to login name. The following program sorts records
-by a specific field position and can be used for this purpose:
+ 5 @{
+ 5 if (/foo/) @{ # 2
+ 2 print "if is true"
+ 3 @} else @{
+ 3 print "else is true"
+ @}
+ @}
address@hidden
-# sort.awk --- simple program to sort by field position
-# field position is specified by the global variable POS
+ # END block(s)
-function cmp_field(i1, v1, i2, v2)
address@hidden
- # comparison by value, as string, and ascending order
- return v1[POS] < v2[POS] ? -1 : (v1[POS] != v2[POS])
address@hidden
+ END @{
+ 1 print "First END rule"
+ 1 print "Second END rule"
+ @}
address@hidden
- for (i = 1; i <= NF; i++)
- a[NR][i] = $i
address@hidden
+ # Functions, listed alphabetically
-END @{
- PROCINFO["sorted_in"] = "cmp_field"
- if (POS < 1 || POS > NF)
- POS = 1
- for (i in a) @{
- for (j = 1; j <= NF; j++)
- printf("%s%c", a[i][j], j < NF ? ":" : "")
- print ""
- @}
address@hidden
+ 6 function sing(dummy)
+ @{
+ 6 print "I gotta be me!"
+ @}
@end example
-The first field in each entry of the password file is the user's login name,
-and the fields are separated by colons.
-Each record defines a subarray,
-with each field as an element in the subarray.
-Running the program produces the
-following output:
+This example illustrates many of the basic features of profiling output.
+They are as follows:
address@hidden
-$ @kbd{gawk -v POS=1 -F: -f sort.awk /etc/passwd}
address@hidden adm:x:3:4:adm:/var/adm:/sbin/nologin
address@hidden apache:x:48:48:Apache:/var/www:/sbin/nologin
address@hidden avahi:x:70:70:Avahi daemon:/:/sbin/nologin
address@hidden
address@hidden example
+@itemize @bullet
+@item
+The program is printed in the order @code{BEGIN} rule,
+@code{BEGINFILE} rule,
+pattern/action rules,
+@code{ENDFILE} rule, @code{END} rule and functions, listed
+alphabetically.
+Multiple @code{BEGIN} and @code{END} rules are merged together,
+as are multiple @code{BEGINFILE} and @code{ENDFILE} rules.
-The comparison should normally always return the same value when given a
-specific pair of array elements as its arguments. If inconsistent
-results are returned then the order is undefined. This behavior can be
-exploited to introduce random order into otherwise seemingly
-ordered data:
address@hidden patterns, counts
address@hidden
+Pattern-action rules have two counts.
+The first count, to the left of the rule, shows how many times
+the rule's pattern was @emph{tested}.
+The second count, to the right of the rule's opening left brace
+in a comment,
+shows how many times the rule's action was @emph{executed}.
+The difference between the two indicates how many times the rule's
+pattern evaluated to false.
address@hidden
-function cmp_randomize(i1, v1, i2, v2)
address@hidden
- # random order
- return (2 - 4 * rand())
address@hidden
address@hidden example
address@hidden
+Similarly,
+the count for an @code{if}-@code{else} statement shows how many times
+the condition was tested.
+To the right of the opening left brace for the @code{if}'s body
+is a count showing how many times the condition was true.
+The count for the @code{else}
+indicates how many times the test failed.
-As mentioned above, the order of the indices is arbitrary if two
-elements compare equal. This is usually not a problem, but letting
-the tied elements come out in arbitrary order can be an issue, especially
-when comparing item values. The partial ordering of the equal elements
-may change during the next loop traversal, if other elements are added or
-removed from the array. One way to resolve ties when comparing elements
-with otherwise equal values is to include the indices in the comparison
-rules. Note that doing this may make the loop traversal less efficient,
-so consider it only if necessary. The following comparison functions
-force a deterministic order, and are based on the fact that the
-indices of two elements are never equal:
address@hidden loops, count for header
address@hidden
+The count for a loop header (such as @code{for}
+or @code{while}) shows how many times the loop test was executed.
+(Because of this, you can't just look at the count on the first
+statement in a rule to determine how many times the rule was executed.
+If the first statement is a loop, the count is misleading.)
address@hidden
-function cmp_numeric(i1, v1, i2, v2)
address@hidden
- # numerical value (and index) comparison, descending order
- return (v1 != v2) ? (v2 - v1) : (i2 - i1)
address@hidden
address@hidden functions, user-defined, counts
address@hidden user-defined, functions, counts
address@hidden
+For user-defined functions, the count next to the @code{function}
+keyword indicates how many times the function was called.
+The counts next to the statements in the body show how many times
+those statements were executed.
-function cmp_string(i1, v1, i2, v2)
address@hidden
- # string value (and index) comparison, descending order
- v1 = v1 i1
- v2 = v2 i2
- return (v1 > v2) ? -1 : (v1 != v2)
address@hidden
address@hidden example
address@hidden @address@hidden@}} (braces)
address@hidden braces (@address@hidden@}})
address@hidden
+The layout uses ``K&R'' style with TABs.
+Braces are used everywhere, even when
+the body of an @code{if}, @code{else}, or loop is only a single statement.
-@c Avoid using the term ``stable'' when describing the unpredictable behavior
-@c if two items compare equal. Usually, the goal of a "stable algorithm"
-@c is to maintain the original order of the items, which is a meaningless
-@c concept for a list constructed from a hash.
address@hidden @code{()} (parentheses)
address@hidden parentheses @code{()}
address@hidden
+Parentheses are used only where needed, as indicated by the structure
+of the program and the precedence rules.
address@hidden extra verbiage here satisfies the copyeditor. ugh.
+For example, @samp{(3 + 5) * 4} means add three plus five, then multiply
+the total by four. However, @samp{3 + 5 * 4} has no parentheses, and
+means @samp{3 + (5 * 4)}.
-A custom comparison function can often simplify ordered loop
-traversal, and the sky is really the limit when it comes to
-designing such a function.
address@hidden
address@hidden
+All string concatenations are parenthesized too.
+(This could be made a bit smarter.)
address@hidden ignore
-When string comparisons are made during a sort, either for element
-values where one or both aren't numbers, or for element indices
-handled as strings, the value of @code{IGNORECASE}
-(@pxref{Built-in Variables}) controls whether
-the comparisons treat corresponding uppercase and lowercase letters as
-equivalent or distinct.
address@hidden
+Parentheses are used around the arguments to @code{print}
+and @code{printf} only when
+the @code{print} or @code{printf} statement is followed by a redirection.
+Similarly, if
+the target of a redirection isn't a scalar, it gets parenthesized.
-Another point to keep in mind is that in the case of subarrays
-the element values can themselves be arrays; a production comparison
-function should use the @code{isarray()} function
-(@pxref{Type Functions}),
-to check for this, and choose a defined sorting order for subarrays.
-
-All sorting based on @code{PROCINFO["sorted_in"]}
-is disabled in POSIX mode,
-since the @code{PROCINFO} array is not special in that case.
+@item
+@command{gawk} supplies leading comments in
+front of the @code{BEGIN} and @code{END} rules,
+the pattern/action rules, and the functions.
-As a side note, sorting the array indices before traversing
-the array has been reported to add 15% to 20% overhead to the
-execution time of @command{awk} programs. For this reason,
-sorted array traversal is not the default.
address@hidden itemize
address@hidden The @command{gawk}
address@hidden maintainers believe that only the people who wish to use a
address@hidden feature should have to pay for it.
+The profiled version of your program may not look exactly like what you
+typed when you wrote it. This is because @command{gawk} creates the
+profiled version by ``pretty printing'' its internal representation of
+the program. The advantage to this is that @command{gawk} can produce
+a standard representation. The disadvantage is that all source-code
+comments are lost, as are the distinctions among multiple @code{BEGIN},
+@code{END}, @code{BEGINFILE}, and @code{ENDFILE} rules. Also, things such as:
address@hidden Array Sorting Functions
address@hidden Sorting Array Values and Indices with @command{gawk}
address@hidden
+/foo/
address@hidden example
address@hidden arrays, sorting
address@hidden @code{asort()} function (@command{gawk})
address@hidden @code{asort()} function (@command{gawk}), address@hidden sorting
address@hidden sort function, arrays, sorting
-In most @command{awk} implementations, sorting an array requires
-writing a @code{sort()} function.
-While this can be educational for exploring different sorting algorithms,
-usually that's not the point of the program.
address@hidden provides the built-in @code{asort()}
-and @code{asorti()} functions
-(@pxref{String Functions})
-for sorting arrays. For example:
address@hidden
+come out as:
@example
address@hidden the array} data
-n = asort(data)
-for (i = 1; i <= n; i++)
- @var{do something with} data[i]
+/foo/ @{
+ print $0
address@hidden
@end example
-After the call to @code{asort()}, the array @code{data} is indexed from 1
-to some number @var{n}, the total number of elements in @code{data}.
-(This count is @code{asort()}'s return value.)
address@hidden @value{LEQ} @code{data[2]} @value{LEQ} @code{data[3]}, and so on.
-The comparison is based on the type of the elements
-(@pxref{Typing and Comparison}).
-All numeric values come before all string values,
-which in turn come before all subarrays.
address@hidden
+which is correct, but possibly surprising.
address@hidden side effects, @code{asort()} function
-An important side effect of calling @code{asort()} is that
address@hidden array's original indices are irrevocably lost}.
-As this isn't always desirable, @code{asort()} accepts a
-second argument:
address@hidden profiling @command{awk} programs, dynamically
address@hidden @command{gawk} program, dynamic profiling
+Besides creating profiles when a program has completed,
+@command{gawk} can produce a profile while it is running.
+This is useful if your @command{awk} program goes into an
+infinite loop and you want to see what has been executed.
+To use this feature, run @command{gawk} with the @option{--profile}
+option in the background:
@example
address@hidden the array} source
-n = asort(source, dest)
-for (i = 1; i <= n; i++)
- @var{do something with} dest[i]
+$ @kbd{gawk --profile -f myprog &}
+[1] 13992
@end example
-In this case, @command{gawk} copies the @code{source} array into the
address@hidden array and then sorts @code{dest}, destroying its indices.
-However, the @code{source} array is not affected.
address@hidden @command{kill} address@hidden dynamic profiling
address@hidden @code{USR1} signal
address@hidden @code{SIGUSR1} signal
address@hidden signals, @code{USR1}/@code{SIGUSR1}
address@hidden
+The shell prints a job number and process ID number; in this case, 13992.
+Use the @command{kill} command to send the @code{USR1} signal
+to @command{gawk}:
address@hidden()} accepts a third string argument to control comparison of
-array elements. As with @code{PROCINFO["sorted_in"]}, this argument
-may be one of the predefined names that @command{gawk} provides
-(@pxref{Controlling Scanning}), or the name of a user-defined function
-(@pxref{Controlling Array Traversal}).
address@hidden
+$ @kbd{kill -USR1 13992}
address@hidden example
address@hidden NOTE
-In all cases, the sorted element values consist of the original
-array's element values. The ability to control comparison merely
-affects the way in which they are sorted.
address@hidden quotation
address@hidden
+As usual, the profiled version of the program is written to
+@file{awkprof.out}, or to a different file if one is specified with
+the @option{--profile} option.
-Often, what's needed is to sort on the values of the @emph{indices}
-instead of the values of the elements.
-To do that, use the
address@hidden()} function. The interface is identical to that of
address@hidden()}, except that the index values are used for sorting, and
-become the values of the result array:
+Along with the regular profile, as shown earlier, the profile
+includes a trace of any active functions:
@example
address@hidden source[$0] = some_func($0) @}
+# Function Call Stack:
-END @{
- n = asorti(source, dest)
- for (i = 1; i <= n; i++) @{
- @ii{Work with sorted indices directly:}
- @var{do something with} dest[i]
- @dots{}
- @ii{Access original array via sorted indices:}
- @var{do something with} source[dest[i]]
- @}
address@hidden
+# 3. baz
+# 2. bar
+# 1. foo
+# -- main --
@end example
-Similar to @code{asort()},
-in all cases, the sorted element values consist of the original
-array's indices. The ability to control comparison merely
-affects the way in which they are sorted.
+You may send @command{gawk} the @code{USR1} signal as many times as you like.
+Each time, the profile and function call trace are appended to the output
+profile file.
-Sorting the array by replacing the indices provides maximal flexibility.
-To traverse the elements in decreasing order, use a loop that goes from
address@hidden down to 1, either over the elements or over the address@hidden
-may also use one of the predefined sorting names that sorts in
-decreasing order.}
address@hidden @code{HUP} signal
address@hidden @code{SIGHUP} signal
address@hidden signals, @code{HUP}/@code{SIGHUP}
+If you use the @code{HUP} signal instead of the @code{USR1} signal,
+@command{gawk} produces the profile and the function call trace and then exits.
address@hidden reference counting, sorting arrays
-Copying array indices and elements isn't expensive in terms of memory.
-Internally, @command{gawk} maintains @dfn{reference counts} to data.
-For example, when @code{asort()} copies the first array to the second one,
-there is only one copy of the original array elements' data, even though
-both arrays use the values.
address@hidden @code{INT} signal (MS-Windows)
address@hidden @code{SIGINT} signal (MS-Windows)
address@hidden signals, @code{INT}/@code{SIGINT} (MS-Windows)
address@hidden @code{QUIT} signal (MS-Windows)
address@hidden @code{SIGQUIT} signal (MS-Windows)
address@hidden signals, @code{QUIT}/@code{SIGQUIT} (MS-Windows)
+When @command{gawk} runs on MS-Windows systems, it uses the
+@code{INT} and @code{QUIT} signals for producing the profile and, in
+the case of the @code{INT} signal, @command{gawk} exits. This is
+because these systems don't support the @command{kill} command, so the
+only signals you can deliver to a program are those generated by the
+keyboard. The @code{INT} signal is generated by the
+@kbd{@key{Ctrl}-@key{C}} or @kbd{@key{Ctrl}-@key{BREAK}} key, while the
+@code{QUIT} signal is generated by the @kbd{@key{Ctrl}-@key{\}} key.
address@hidden Document It And Call It A Feature. Sigh.
address@hidden @command{gawk}, @code{IGNORECASE} variable in
address@hidden @code{IGNORECASE} variable
address@hidden arrays, sorting, @code{IGNORECASE} variable and
address@hidden @code{IGNORECASE} variable, array sorting and
-Because @code{IGNORECASE} affects string comparisons, the value
-of @code{IGNORECASE} also affects sorting for both @code{asort()} and
@code{asorti()}.
-Note also that the locale's sorting order does @emph{not}
-come into play; comparisons are based on character values address@hidden
-is true because locale-based comparison occurs only when in POSIX
-compatibility mode, and since @code{asort()} and @code{asorti()} are
address@hidden extensions, they are not available in that case.}
-Caveat Emptor.
+Finally, @command{gawk} also accepts another option, @option{--pretty-print}.
+When called this way, @command{gawk} ``pretty prints'' the program into
+@file{awkprof.out}, without any execution counts.
address@hidden ENDOFRANGE advgaw
address@hidden ENDOFRANGE gawadv
address@hidden ENDOFRANGE awkp
address@hidden ENDOFRANGE proawk
address@hidden Two-way I/O
address@hidden Two-Way Communications with Another Process
address@hidden Brennan, Michael
address@hidden programmers, attractiveness of
address@hidden
address@hidden Path:
cssun.mathcs.emory.edu!gatech!newsxfer3.itd.umich.edu!news-peer.sprintlink.net!news-sea-19.sprintlink.net!news-in-west.sprintlink.net!news.sprintlink.net!Sprint!204.94.52.5!news.whidbey.com!brennan
-From: brennan@@whidbey.com (Mike Brennan)
-Newsgroups: comp.lang.awk
-Subject: Re: Learn the SECRET to Attract Women Easily
-Date: 4 Aug 1997 17:34:46 GMT
address@hidden Organization: WhidbeyNet
address@hidden Lines: 12
-Message-ID: <5s53rm$eca@@news.whidbey.com>
address@hidden References: <address@hidden>
address@hidden Reply-To: address@hidden
address@hidden NNTP-Posting-Host: asn202.whidbey.com
address@hidden X-Newsreader: slrn (0.9.4.1 UNIX)
address@hidden Xref: cssun.mathcs.emory.edu comp.lang.awk:5403
+@node Internationalization
+@chapter Internationalization with @command{gawk}
+
+Once upon a time, computer makers
+wrote software that worked only in English.
+Eventually, hardware and software vendors noticed that if their
+systems worked in the native languages of non-English-speaking
+countries, they were able to sell more systems.
+As a result, internationalization and localization
+of programs and software systems became a common practice.
+
+@c STARTOFRANGE inloc
+@cindex internationalization, localization
+@cindex @command{gawk}, internationalization and, See internationalization
+@cindex internationalization, localization, @command{gawk} and
+For many years, the ability to provide internationalization
+was largely restricted to programs written in C and C++.
+This @value{CHAPTER} describes the underlying library @command{gawk}
+uses for internationalization, as well as how
+@command{gawk} makes internationalization
+features available at the @command{awk} program level.
+Having internationalization available at the @command{awk} level
+gives software developers additional flexibility---they are no
+longer forced to write in C or C++ when internationalization is
+a requirement.
+
+@menu
+* I18N and L10N:: Internationalization and Localization.
+* Explaining gettext:: How GNU @code{gettext} works.
+* Programmer i18n:: Features for the programmer.
+* Translator i18n:: Features for the translator.
+* I18N Example:: A simple i18n example.
+* Gawk I18N:: @command{gawk} is also internationalized.
+@end menu
+
+@node I18N and L10N
+@section Internationalization and Localization
+
+@cindex internationalization
+@cindex localization, See internationalization@comma{} localization
+@cindex localization
+@dfn{Internationalization} means writing (or modifying) a program once,
+in such a way that it can use multiple languages without requiring
+further source-code changes.
+@dfn{Localization} means providing the data necessary for an
+internationalized program to work in a particular language.
+Most typically, these terms refer to features such as the language
+used for printing error messages, the language used to read
+responses, and information related to how numerical and
+monetary values are printed and read.
+
+@node Explaining gettext
+@section GNU @code{gettext}
+
+@cindex internationalizing a program
+@c STARTOFRANGE gettex
+@cindex @code{gettext} library
+The facilities in GNU @code{gettext} focus on messages: strings printed
+by a program, either directly or via formatting with @code{printf} or
+@code{sprintf()}.@footnote{For some operating systems, the @command{gawk}
+port doesn't support GNU @code{gettext}.
+Therefore, these features are not available
+if you are using one of those operating systems. Sorry.}
+
address@hidden portability, @code{gettext} library and
+When using GNU @code{gettext}, each application has its own
+@dfn{text domain}. This is a unique name, such as @samp{kpilot} or @samp{gawk},
+that identifies the application.
+A complete application may have multiple components---programs written
+in C or C++, as well as scripts written in @command{sh} or @command{awk}.
+All of the components use the same text domain.
+
+To make the discussion concrete, assume we're writing an application
+named @command{guide}. Internationalization consists of the
+following steps, in this order:
+
+@enumerate
+@item
+The programmer goes
+through the source for all of @command{guide}'s components
+and marks each string that is a candidate for translation.
+For example, @code{"`-F': option required"} is a good candidate for translation.
+A table with strings of option names is not (e.g., @command{gawk}'s
address@hidden option should remain the same, no matter what the local
+language).
+
+@cindex @code{textdomain()} function (C library)
+@item
+The programmer indicates the application's text domain
+(@code{"guide"}) to the @code{gettext} library,
+by calling the @code{textdomain()} function.
+
+@cindex @code{.pot} files
+@cindex files, @code{.pot}
+@cindex portable object template files
+@cindex files, portable object template
+@item
+Messages from the application are extracted from the source code and
+collected into a portable object template file (@file{guide.pot}),
+which lists the strings and their translations.
+The translations are initially empty.
+The original (usually English) messages serve as the key for
+lookup of the translations.
-On 3 Aug 1997 13:17:43 GMT, Want More Dates???
-<tracy78@@kilgrona.com> wrote:
->Learn the SECRET to Attract Women Easily
->
->The SCENT(tm) Pheromone Sex Attractant For Men to Attract Women
+@cindex @code{.po} files
+@cindex files, @code{.po}
+@cindex portable object files
+@cindex files, portable object
+@item
+For each language with a translator, @file{guide.pot}
+is copied to a portable object file (@code{.po})
+and translations are created and shipped with the application.
+For example, there might be a @file{fr.po} for a French translation.
-The scent of awk programmers is a lot more attractive to women than
-the scent of perl programmers.
---
-Mike Brennan
address@hidden brennan@@whidbey.com
address@hidden smallexample
+@cindex @code{.mo} files
+@cindex files, @code{.mo}
+@cindex message object files
+@cindex files, message object
+@item
+Each language's @file{.po} file is converted into a binary
+message object (@file{.mo}) file.
+A message object file contains the original messages and their
+translations in a binary format that allows fast lookup of translations
+at runtime.
address@hidden advanced features, @command{gawk}, address@hidden communicating
with
address@hidden processes, two-way communications with
-It is often useful to be able to
-send data to a separate program for
-processing and then read the result. This can always be
-done with temporary files:
+@item
+When @command{guide} is built and installed, the binary translation files
+are installed in a standard place.
address@hidden
-# Write the data for processing
-tempfile = ("mydata." PROCINFO["pid"])
-while (@var{not done with data})
- print @var{data} | ("subprogram > " tempfile)
-close("subprogram > " tempfile)
+@cindex @code{bindtextdomain()} function (C library)
+@item
+For testing and development, it is possible to tell @code{gettext}
+to use @file{.mo} files in a different directory than the standard
+one by using the @code{bindtextdomain()} function.
-# Read the results, remove tempfile when done
-while ((getline newdata < tempfile) > 0)
- @var{process} newdata @var{appropriately}
-close(tempfile)
-system("rm " tempfile)
address@hidden example
+@cindex @code{.mo} files, specifying directory of
+@cindex files, @code{.mo}, specifying directory of
+@cindex message object files, specifying directory of
+@cindex files, message object, specifying directory of
+@item
+At runtime, @command{guide} looks up each string via a call
+to @code{gettext()}. The returned string is the translated string
+if available, or the original string if not.
address@hidden
-This works, but not elegantly. Among other things, it requires that
-the program be run in a directory that cannot be shared among users;
-for example, @file{/tmp} will not do, as another user might happen
-to be using a temporary file with the same name.
+@item
+If necessary, it is possible to access messages from a different
+text domain than the one belonging to the application, without
+having to switch the application's default text domain back
+and forth.
+@end enumerate
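The runtime side of the steps above (choosing the text domain, pointing @code{gettext} at a directory of @file{.mo} files, looking up strings, and consulting a second text domain) can be sketched in C roughly as follows. This is only an illustrative sketch and is not part of the gawk sources: the helper names (@code{guide_i18n_init}, @code{guide_message}, @code{other_domain_message}, @code{has_translation}) and the directory argument are hypothetical, and it assumes a C library that provides @code{<libintl.h>}, as GNU libc does.

```c
#include <libintl.h>
#include <locale.h>
#include <string.h>

#define GUIDE_DOMAIN "guide"   /* text domain used in the example above */

/* Select the user's locale, tell gettext where the .mo files for the
   "guide" domain live, and make "guide" the default text domain.
   localedir is a caller-supplied (hypothetical) directory. */
void guide_i18n_init(const char *localedir)
{
    setlocale(LC_ALL, "");
    bindtextdomain(GUIDE_DOMAIN, localedir);
    textdomain(GUIDE_DOMAIN);
}

/* Look up a message in the default text domain.  gettext() returns
   the original string when no translation is available. */
const char *guide_message(void)
{
    return gettext("Don't Panic!\n");
}

/* Look up a message in a different text domain without switching
   the application's default domain. */
const char *other_domain_message(const char *domain, const char *msgid)
{
    return dgettext(domain, msgid);
}

/* Report whether the lookup produced text different from msgid.
   A translation identical to the original counts as "none" here. */
int has_translation(const char *msgid)
{
    return strcmp(gettext(msgid), msgid) != 0;
}
```

When no message catalog for the domain is installed in the given directory, every lookup simply returns the original string, so a program written this way keeps working untranslated.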
address@hidden coprocesses
address@hidden input/output, two-way
address@hidden @code{|} (vertical bar), @code{|&} operator (I/O)
address@hidden vertical bar (@code{|}), @code{|&} operator (I/O)
address@hidden @command{csh} utility, @code{|&} operator, comparison with
-However, with @command{gawk}, it is possible to
-open a @emph{two-way} pipe to another process. The second process is
-termed a @dfn{coprocess}, since it runs in parallel with @command{gawk}.
-The two-way connection is created using the @samp{|&} operator
-(borrowed from the Korn shell, @command{ksh}):@footnote{This is very
-different from the same operator in the C shell.}
address@hidden @code{gettext()} function (C library)
+In C (or C++), the string marking and dynamic translation lookup
+are accomplished by wrapping each string in a call to @code{gettext()}:
@example
-do @{
- print @var{data} |& "subprogram"
- "subprogram" |& getline results
address@hidden while (@var{data left to process})
-close("subprogram")
+printf("%s", gettext("Don't Panic!\n"));
@end example
-The first time an I/O operation is executed using the @samp{|&}
-operator, @command{gawk} creates a two-way pipeline to a child process
-that runs the other program. Output created with @code{print}
-or @code{printf} is written to the program's standard input, and
-output from the program's standard output can be read by the @command{gawk}
-program using @code{getline}.
-As is the case with processes started by @samp{|}, the subprogram
-can be any program, or pipeline of programs, that can be started by
-the shell.
+The tools that extract messages from source code pull out all
+strings enclosed in calls to @code{gettext()}.
-There are some cautionary items to be aware of:
address@hidden @code{_} (underscore), @code{_} C macro
address@hidden underscore (@code{_}), @code{_} C macro
+The GNU @code{gettext} developers, recognizing that typing
+@samp{gettext(@dots{})} over and over again is both painful and ugly to look
+at, use the macro @samp{_} (an underscore) to make things easier:
address@hidden @bullet
address@hidden
-As the code inside @command{gawk} currently stands, the coprocess's
-standard error goes to the same place that the parent @command{gawk}'s
-standard error goes. It is not possible to read the child's
-standard error separately.
address@hidden
+/* In the standard header file: */
+#define _(str) gettext(str)
address@hidden deadlocks
address@hidden buffering, input/output
address@hidden @code{getline} command, deadlock and
address@hidden
-I/O buffering may be a problem. @command{gawk} automatically
-flushes all output down the pipe to the coprocess.
-However, if the coprocess does not flush its output,
address@hidden may hang when doing a @code{getline} in order to read
-the coprocess's results. This could lead to a situation
-known as @dfn{deadlock}, where each process is waiting for the
-other one to do something.
address@hidden itemize
+/* In the program text: */
+printf("%s", _("Don't Panic!\n"));
address@hidden example
address@hidden @code{close()} function, two-way pipes and
-It is possible to close just one end of the two-way pipe to
-a coprocess, by supplying a second argument to the @code{close()}
-function of either @code{"to"} or @code{"from"}
-(@pxref{Close Files And Pipes}).
-These strings tell @command{gawk} to close the end of the pipe
-that sends data to the coprocess or the end that reads from it,
-respectively.
address@hidden internationalization, localization, locale categories
address@hidden @code{gettext} library, locale categories
address@hidden locale categories
address@hidden
+This reduces the typing overhead to just three extra characters per string
+and is considerably easier to read as well.
address@hidden @command{sort} utility, coprocesses and
-This is particularly necessary in order to use
-the system @command{sort} utility as part of a coprocess;
address@hidden must read @emph{all} of its input
-data before it can produce any output.
-The @command{sort} program does not receive an end-of-file indication
-until @command{gawk} closes the write end of the pipe.
+There are locale @dfn{categories}
+for different types of locale-related information.
+The defined locale categories that @code{gettext} knows about are:
-When you have finished writing data to the @command{sort}
-utility, you can close the @code{"to"} end of the pipe, and
-then start reading sorted data via @code{getline}.
-For example:
+@table @code
+@cindex @code{LC_MESSAGES} locale category
+@item LC_MESSAGES
+Text messages. This is the default category for @code{gettext}
+operations, but it is possible to supply a different one explicitly,
+if necessary. (It is almost never necessary to supply a different category.)
address@hidden
-BEGIN @{
- command = "LC_ALL=C sort"
- n = split("abcdefghijklmnopqrstuvwxyz", a, "")
+@cindex sorting characters in different languages
+@cindex @code{LC_COLLATE} locale category
+@item LC_COLLATE
+Text-collation information; i.e., how different characters
+and/or groups of characters sort in a given language.
- for (i = n; i > 0; i--)
- print a[i] |& command
- close(command, "to")
+@cindex @code{LC_CTYPE} locale category
+@item LC_CTYPE
+Character-type information (alphabetic, digit, upper- or lowercase, and
+so on).
+This information is accessed via the
+POSIX character classes in regular expressions,
+such as @code{/[[:alnum:]]/}
+(@pxref{Regexp Operators}).
- while ((command |& getline line) > 0)
- print "got", line
- close(command)
address@hidden
address@hidden example
+@cindex monetary information, localization
+@cindex currency symbols, localization
+@cindex @code{LC_MONETARY} locale category
+@item LC_MONETARY
+Monetary information, such as the currency symbol, and whether the
+symbol goes before or after a number.
-This program writes the letters of the alphabet in reverse order, one
-per line, down the two-way pipe to @command{sort}. It then closes the
-write end of the pipe, so that @command{sort} receives an end-of-file
-indication. This causes @command{sort} to sort the data and write the
-sorted data back to the @command{gawk} program. Once all of the data
-has been read, @command{gawk} terminates the coprocess and exits.
address@hidden @code{LC_NUMERIC} locale category
address@hidden LC_NUMERIC
+Numeric information, such as which characters to use for the decimal
+point and the thousands address@hidden
+use a comma every three decimal places and a period for the decimal
+point, while many Europeans do exactly the opposite:
+1,234.56 versus 1.234,56.}
-As a side note, the assignment @samp{LC_ALL=C} in the @command{sort}
-command ensures traditional Unix (ASCII) sorting from @command{sort}.
address@hidden @code{LC_RESPONSE} locale category
address@hidden LC_RESPONSE
+Response information, such as how ``yes'' and ``no'' appear in the
+local language, and possibly other information as well.
address@hidden @command{gawk}, @code{PROCINFO} array in
address@hidden @code{PROCINFO} array
-You may also use pseudo-ttys (ptys) for
-two-way communication instead of pipes, if your system supports them.
-This is done on a per-command basis, by setting a special element
-in the @code{PROCINFO} array
-(@pxref{Auto-set}),
-like so:
address@hidden time, localization and
address@hidden dates, information related address@hidden localization
address@hidden @code{LC_TIME} locale category
address@hidden LC_TIME
+Time- and date-related information, such as 12- or 24-hour clock, month printed
+before or after the day in a date, local month abbreviations, and so on.
address@hidden
-command = "sort -nr" # command, save in convenience variable
-PROCINFO[command, "pty"] = 1 # update PROCINFO
-print @dots{} |& command # start two-way pipe
address@hidden
address@hidden example
address@hidden @code{LC_ALL} locale category
address@hidden LC_ALL
+All of the above. (Not too useful in the context of @code{gettext}.)
address@hidden table
address@hidden ENDOFRANGE gettex
+
address@hidden Programmer i18n
address@hidden Internationalizing @command{awk} Programs
address@hidden STARTOFRANGE inap
address@hidden @command{awk} programs, internationalizing
address@hidden
-Using ptys avoids the buffer deadlock issues described earlier, at some
-loss in performance. If your system does not have ptys, or if all the
-system's ptys are in use, @command{gawk} automatically falls back to
-using regular pipes.
address@hidden provides the following variables and functions for
+internationalization:
address@hidden TCP/IP Networking
address@hidden Using @command{gawk} for Network Programming
address@hidden advanced features, @command{gawk}, network programming
address@hidden networks, programming
address@hidden STARTOFRANGE tcpip
address@hidden TCP/IP
address@hidden @code{/inet/@dots{}} special files (@command{gawk})
address@hidden files, @code{/inet/@dots{}} (@command{gawk})
address@hidden @code{/inet4/@dots{}} special files (@command{gawk})
address@hidden files, @code{/inet4/@dots{}} (@command{gawk})
address@hidden @code{/inet6/@dots{}} special files (@command{gawk})
address@hidden files, @code{/inet6/@dots{}} (@command{gawk})
address@hidden @code{EMISTERED}
address@hidden
address@hidden:@*
-@ @ @ @ @i{A host is a host from coast to coast,@*
-@ @ @ @ and no-one can talk to host that's close,@*
-@ @ @ @ unless the host that isn't address@hidden
-@ @ @ @ is busy hung or dead.}
address@hidden quotation
address@hidden @code
address@hidden @code{TEXTDOMAIN} variable
address@hidden TEXTDOMAIN
+This variable indicates the application's text domain.
+For compatibility with GNU @code{gettext}, the default
+value is @code{"messages"}.
-In addition to being able to open a two-way pipeline to a coprocess
-on the same system
-(@pxref{Two-way I/O}),
-it is possible to make a two-way connection to
-another process on another system across an IP network connection.
address@hidden internationalization, localization, marked strings
address@hidden strings, for localization
address@hidden _"your message here"
+String constants marked with a leading underscore
+are candidates for translation at runtime.
+String constants without a leading underscore are not translated.
-You can think of this as just a @emph{very long} two-way pipeline to
-a coprocess.
-The way @command{gawk} decides that you want to use TCP/IP networking is
-by recognizing special @value{FN}s that begin with one of @samp{/inet/},
address@hidden/inet4/} or @samp{/inet6}.
address@hidden @code{dcgettext()} function (@command{gawk})
address@hidden dcgettext(@var{string} @r{[}, @var{domain} @r{[}, @address@hidden)
+Return the translation of @var{string} in
+text domain @var{domain} for locale category @var{category}.
+The default value for @var{domain} is the current value of @code{TEXTDOMAIN}.
+The default value for @var{category} is @code{"LC_MESSAGES"}.
-The full syntax of the special @value{FN} is
address@hidden/@var{net-type}/@var{protocol}/@var{local-port}/@var{remote-host}/@var{remote-port}}.
-The components are:
+If you supply a value for @var{category}, it must be a string equal to
+one of the known locale categories described in
address@hidden
+the previous @value{SECTION}.
address@hidden ifnotinfo
address@hidden
address@hidden gettext}.
address@hidden ifinfo
+You must also supply a text domain. Use @code{TEXTDOMAIN} if
+you want to use the current domain.
address@hidden @var
address@hidden net-type
-Specifies the kind of Internet connection to make.
-Use @samp{/inet4/} to force IPv4, and
address@hidden/inet6/} to force IPv6.
-Plain @samp{/inet/} (which used to be the only option) uses
-the system default, most likely IPv4.
address@hidden CAUTION
+The order of arguments to the @command{awk} version
+of the @code{dcgettext()} function is purposely different from the order for
+the C version. The @command{awk} version's order was
+chosen to be simple and to allow for reasonable @command{awk}-style
+default arguments.
address@hidden quotation
address@hidden protocol
-The protocol to use over IP. This must be either @samp{tcp}, or
address@hidden, for a TCP or UDP IP connection,
-respectively. The use of TCP is recommended for most applications.
address@hidden @code{dcngettext()} function (@command{gawk})
address@hidden dcngettext(@var{string1}, @var{string2}, @var{number} @r{[}, @var{domain} @r{[}, @address@hidden)
+Return the plural form used for @var{number} of the
+translation of @var{string1} and @var{string2} in text domain
address@hidden for locale category @var{category}. @var{string1} is the
+English singular variant of a message, and @var{string2} the English plural
+variant of the same message.
+The default value for @var{domain} is the current value of @code{TEXTDOMAIN}.
+The default value for @var{category} is @code{"LC_MESSAGES"}.
address@hidden local-port
address@hidden @code{getaddrinfo()} function (C library)
-The local TCP or UDP port number to use. Use a port number of @samp{0}
-when you want the system to pick a port. This is what you should do
-when writing a TCP or UDP client.
-You may also use a well-known service name, such as @samp{smtp}
-or @samp{http}, in which case @command{gawk} attempts to determine
-the predefined port number using the C @code{getaddrinfo()} function.
+The same remarks about argument order as for the @code{dcgettext()} function apply.
address@hidden remote-host
-The IP address or fully-qualified domain name of the Internet
-host to which you want to connect.
address@hidden @code{.mo} files, specifying directory of
address@hidden files, @code{.mo}, specifying directory of
address@hidden message object files, specifying directory of
address@hidden files, message object, specifying directory of
address@hidden @code{bindtextdomain()} function (@command{gawk})
address@hidden bindtextdomain(@var{directory} @r{[}, @address@hidden)
+Change the directory in which
address@hidden looks for @file{.mo} files, in case they
+will not or cannot be placed in the standard locations
+(e.g., during testing).
+Return the directory in which @var{domain} is ``bound.''
address@hidden remote-port
-The TCP or UDP port number to use on the given @var{remote-host}.
-Again, use @samp{0} if you don't care, or else a well-known
-service name.
+The default @var{domain} is the value of @code{TEXTDOMAIN}.
+If @var{directory} is the null string (@code{""}), then
address@hidden()} returns the current binding for the
+given @var{domain}.
@end table
address@hidden @command{gawk}, @code{ERRNO} variable in
address@hidden @code{ERRNO} variable
address@hidden NOTE
-Failure in opening a two-way socket will result in a non-fatal error
-being returned to the calling code. The value of @code{ERRNO} indicates
-the error (@pxref{Auto-set}).
address@hidden quotation
+To use these facilities in your @command{awk} program, follow the steps
+outlined in
address@hidden
+the previous @value{SECTION},
address@hidden ifnotinfo
address@hidden
address@hidden gettext},
address@hidden ifinfo
+like so:
-Consider the following very simple example:
address@hidden
address@hidden @code{BEGIN} pattern, @code{TEXTDOMAIN} variable and
address@hidden @code{TEXTDOMAIN} variable, @code{BEGIN} pattern and
address@hidden
+Set the variable @code{TEXTDOMAIN} to the text domain of
+your program. This is best done in a @code{BEGIN} rule
+(@pxref{BEGIN/END}),
+or it can also be done via the @option{-v} command-line
+option (@pxref{Options}):
@example
BEGIN @{
- Service = "/inet/tcp/0/localhost/daytime"
- Service |& getline
- print $0
- close(Service)
+ TEXTDOMAIN = "guide"
+ @dots{}
@}
@end example
-This program reads the current date and time from the local system's
-TCP @samp{daytime} server.
-It then prints the results and closes the connection.
address@hidden @code{_} (underscore), translatable string
address@hidden underscore (@code{_}), translatable string
address@hidden
+Mark all translatable strings with a leading underscore (@samp{_})
+character. It @emph{must} be adjacent to the opening
+quote of the string. For example:
-Because this topic is extensive, the use of @command{gawk} for
-TCP/IP programming is documented separately.
address@hidden
-See
address@hidden, , General Introduction, gawkinet, TCP/IP Internetworking with @command{gawk}},
address@hidden ifinfo
address@hidden
-See @cite{TCP/IP Internetworking with @command{gawk}},
-which comes as part of the @command{gawk} distribution,
address@hidden ifnotinfo
-for a much more complete introduction and discussion, as well as
-extensive examples.
+@example
+print _"hello, world"
+x = _"you goofed"
+printf(_"Number of users is %d\n", nusers)
+@end example
address@hidden ENDOFRANGE tcpip
address@hidden
+If you are creating strings dynamically, you can
+still translate them, using the @code{dcgettext()}
+built-in function:
address@hidden Profiling
address@hidden Profiling Your @command{awk} Programs
address@hidden STARTOFRANGE awkp
address@hidden @command{awk} programs, profiling
address@hidden STARTOFRANGE proawk
address@hidden profiling @command{awk} programs
address@hidden profiling @command{gawk}
address@hidden @code{awkprof.out} file
address@hidden files, @code{awkprof.out}
+@example
+message = nusers " users logged in"
+message = dcgettext(message, "adminprog")
+print message
+@end example
-You may produce execution traces of your @command{awk} programs.
-This is done by passing the option @option{--profile} to @command{gawk}.
-When @command{gawk} has finished running, it creates a profile of your program in a file
-named @file{awkprof.out}. Because it is profiling, it also executes up to 45% slower than
address@hidden normally does.
+Here, the call to @code{dcgettext()} supplies a different
+text domain (@code{"adminprog"}) in which to find the
+message, but it uses the default @code{"LC_MESSAGES"} category.
address@hidden @code{--profile} option
-As shown in the following example,
-the @option{--profile} option can be used to change the name of the file
-where @command{gawk} will write the profile:
address@hidden @code{LC_MESSAGES} locale category, @code{bindtextdomain()} function (@command{gawk})
address@hidden
+During development, you might want to put the @file{.mo}
+file in a private directory for testing. This is done
+with the @code{bindtextdomain()} built-in function:
@example
-gawk --profile=myprog.prof -f myprog.awk data1 data2
+BEGIN @{
+ TEXTDOMAIN = "guide" # our text domain
+ if (Testing) @{
+ # where to find our files
+ bindtextdomain("testdir")
+ # joe is in charge of adminprog
+ bindtextdomain("../joe/testdir", "adminprog")
+ @}
+ @dots{}
+@}
@end example
address@hidden
-In the above example, @command{gawk} places the profile in
address@hidden instead of in @file{awkprof.out}.
address@hidden enumerate
-Here is a sample session showing a simple @command{awk} program, its input data, and the
-results from running @command{gawk} with the @option{--profile} option.
-First, the @command{awk} program:
address@hidden Example},
+for an example program showing the steps to create
+and use translations from @command{awk}.
+
address@hidden Translator i18n
address@hidden Translating @command{awk} Programs
+
address@hidden @code{.po} files
address@hidden files, @code{.po}
address@hidden portable object files
address@hidden files, portable object
+Once a program's translatable strings have been marked, they must
+be extracted to create the initial @file{.po} file.
+As part of translation, it is often helpful to rearrange the order
+in which arguments to @code{printf} are output.
address@hidden
-BEGIN @{ print "First BEGIN rule" @}
address@hidden's @option{--gen-pot} command-line option extracts
+the messages and is discussed next.
+After that, @code{printf}'s ability to
+rearrange the order for @code{printf} arguments at runtime
+is covered.
-END @{ print "First END rule" @}
address@hidden
+* String Extraction:: Extracting marked strings.
+* Printf Ordering:: Rearranging @code{printf} arguments.
+* I18N Portability:: @command{awk}-level portability issues.
address@hidden menu
-/foo/ @{
- print "matched /foo/, gosh"
- for (i = 1; i <= 3; i++)
- sing()
address@hidden
address@hidden String Extraction
address@hidden Extracting Marked Strings
address@hidden strings, extracting
address@hidden marked address@hidden extracting
address@hidden @code{--gen-pot} option
address@hidden command-line options, string extraction
address@hidden string extraction (internationalization)
address@hidden marked string extraction (internationalization)
address@hidden extraction, of marked strings (internationalization)
address@hidden
- if (/foo/)
- print "if is true"
- else
- print "else is true"
address@hidden
address@hidden @code{--gen-pot} option
+Once your @command{awk} program is working, and all the strings have
+been marked and you've set (and perhaps bound) the text domain,
+it is time to produce translations.
+First, use the @option{--gen-pot} command-line option to create
+the initial @file{.pot} file:
-BEGIN @{ print "Second BEGIN rule" @}
+@example
+$ @kbd{gawk --gen-pot -f guide.awk > guide.pot}
+@end example
-END @{ print "Second END rule" @}
address@hidden @code{xgettext} utility
+When run with @option{--gen-pot}, @command{gawk} does not execute your
+program. Instead, it parses it as usual and prints all marked strings
+to standard output in the format of a GNU @code{gettext} Portable Object
+file. Also included in the output are any constant strings that
+appear as the first argument to @code{dcgettext()} or as the first and
+second argument to @code{dcngettext()address@hidden
address@hidden utility that comes with GNU
address@hidden can handle @file{.awk} files.}
address@hidden Example},
+for the full list of steps to go through to create and test
+translations for @command{guide}.
-function sing( dummy)
address@hidden
- print "I gotta be me!"
address@hidden
address@hidden example
address@hidden Printf Ordering
address@hidden Rearranging @code{printf} Arguments
-Following is the input data:
address@hidden @code{printf} statement, positional specifiers
address@hidden positional specifiers, @code{printf} statement
+Format strings for @code{printf} and @code{sprintf()}
+(@pxref{Printf})
+present a special problem for translation.
+Consider the following:@footnote{This example is borrowed
+from the GNU @code{gettext} manual.}
address@hidden line broken here only for smallbook format
@example
-foo
-bar
-baz
-foo
-junk
+printf(_"String `%s' has %d characters\n",
+ string, length(string)))
@end example
-Here is the @file{awkprof.out} that results from running the @command{gawk}
-profiler on this program and data (this example also illustrates that @command{awk}
-programmers sometimes have to work late):
+A possible German translation for this might be:
address@hidden @code{BEGIN} pattern
address@hidden @code{END} pattern
@example
- # gawk profile, created Sun Aug 13 00:00:15 2000
-
- # BEGIN block(s)
+"%d Zeichen lang ist die Zeichenkette `%s'\n"
+@end example
- BEGIN @{
- 1 print "First BEGIN rule"
- 1 print "Second BEGIN rule"
- @}
+The problem should be obvious: the order of the format
+specifications is different from the original!
+Even though @code{gettext()} can return the translated string
+at runtime,
+it cannot change the argument order in the call to @code{printf}.
- # Rule(s)
+To solve this problem, @code{printf} format specifiers may have
+an additional optional element, which we call a @dfn{positional specifier}.
+For example:
- 5 /foo/ @{ # 2
- 2 print "matched /foo/, gosh"
- 6 for (i = 1; i <= 3; i++) @{
- 6 sing()
- @}
- @}
+@example
+"%2$d Zeichen lang ist die Zeichenkette `%1$s'\n"
+@end example
- 5 @{
- 5 if (/foo/) @{ # 2
- 2 print "if is true"
- 3 @} else @{
- 3 print "else is true"
- @}
- @}
+Here, the positional specifier consists of an integer count, which indicates which
+argument to use, and a @samp{$}. Counts are one-based, and the
+format string itself is @emph{not} included. Thus, in the following
+example, @samp{string} is the first argument and @samp{length(string)} is the second:
- # END block(s)
+@example
+$ @kbd{gawk 'BEGIN @{}
+> @kbd{string = "Dont Panic"}
+> @kbd{printf _"%2$d characters live in \"%1$s\"\n",}
+> @kbd{string, length(string)}
+> @kbd{@}'}
+@print{} 10 characters live in "Dont Panic"
+@end example
- END @{
- 1 print "First END rule"
- 1 print "Second END rule"
- @}
+If present, positional specifiers come first in the format specification,
+before the flags, the field width, and/or the precision.
- # Functions, listed alphabetically
+Positional specifiers can be used with the dynamic field width and
+precision capability:
- 6 function sing(dummy)
- @{
- 6 print "I gotta be me!"
- @}
+@example
+$ @kbd{gawk 'BEGIN @{}
+> @kbd{printf("%*.*s\n", 10, 20, "hello")}
+> @kbd{printf("%3$*2$.*1$s\n", 20, 10, "hello")}
+> @kbd{@}'}
+@print{} hello
+@print{} hello
@end example
-This example illustrates many of the basic features of profiling output.
-They are as follows:
address@hidden NOTE
+When using @samp{*} with a positional specifier, the @samp{*}
+comes first, then the integer position, and then the @samp{$}.
+This is somewhat counterintuitive.
address@hidden quotation
address@hidden @bullet
address@hidden
-The program is printed in the order @code{BEGIN} rule,
address@hidden rule,
-pattern/action rules,
address@hidden rule, @code{END} rule and functions, listed
-alphabetically.
-Multiple @code{BEGIN} and @code{END} rules are merged together,
-as are multiple @code{BEGINFILE} and @code{ENDFILE} rules.
address@hidden @code{printf} statement, positional specifiers, mixing with regular formats
address@hidden positional specifiers, @code{printf} statement, mixing with regular formats
address@hidden format specifiers, mixing regular with positional specifiers
address@hidden does not allow you to mix regular format specifiers
+and those with positional specifiers in the same string:
address@hidden patterns, counts
address@hidden
-Pattern-action rules have two counts.
-The first count, to the left of the rule, shows how many times
-the rule's pattern was @emph{tested}.
-The second count, to the right of the rule's opening left brace
-in a comment,
-shows how many times the rule's action was @emph{executed}.
-The difference between the two indicates how many times the rule's
-pattern evaluated to false.
+@example
+$ @kbd{gawk 'BEGIN @{ printf _"%d %3$s\n", 1, 2, "hi" @}'}
+@error{} gawk: cmd. line:1: fatal: must use `count$' on all formats or none
+@end example
address@hidden
-Similarly,
-the count for an @address@hidden statement shows how many times
-the condition was tested.
-To the right of the opening left brace for the @code{if}'s body
-is a count showing how many times the condition was true.
-The count for the @code{else}
-indicates how many times the test failed.
address@hidden NOTE
+There are some pathological cases that @command{gawk} may fail to
+diagnose. In such cases, the output may not be what you expect.
+It's still a bad idea to try mixing them, even if @command{gawk}
+doesn't detect it.
address@hidden quotation
address@hidden loops, count for header
address@hidden
-The count for a loop header (such as @code{for}
-or @code{while}) shows how many times the loop test was executed.
-(Because of this, you can't just look at the count on the first
-statement in a rule to determine how many times the rule was executed.
-If the first statement is a loop, the count is misleading.)
+Although positional specifiers can be used directly in @command{awk} programs,
+their primary purpose is to help in producing correct translations of
+format strings into languages different from the one in which the program
+is first written.
address@hidden functions, user-defined, counts
address@hidden user-defined, functions, counts
address@hidden
-For user-defined functions, the count next to the @code{function}
-keyword indicates how many times the function was called.
-The counts next to the statements in the body show how many times
-those statements were executed.
address@hidden I18N Portability
address@hidden @command{awk} Portability Issues
address@hidden @address@hidden@}} (braces)
address@hidden braces (@address@hidden@}})
address@hidden
-The layout uses ``K&R'' style with TABs.
-Braces are used everywhere, even when
-the body of an @code{if}, @code{else}, or loop is only a single statement.
address@hidden portability, internationalization and
address@hidden internationalization, localization, portability and
address@hidden's internationalization features were purposely chosen to
+have as little impact as possible on the portability of @command{awk}
+programs that use them to other versions of @command{awk}.
+Consider this program:
address@hidden @code{()} (parentheses)
address@hidden parentheses @code{()}
address@hidden
-Parentheses are used only where needed, as indicated by the structure
-of the program and the precedence rules.
address@hidden extra verbiage here satisfies the copyeditor. ugh.
-For example, @samp{(3 + 5) * 4} means add three plus five, then multiply
-the total by four. However, @samp{3 + 5 * 4} has no parentheses, and
-means @samp{3 + (5 * 4)}.
+@example
+BEGIN @{
+ TEXTDOMAIN = "guide"
+ if (Test_Guide) # set with -v
+ bindtextdomain("/test/guide/messages")
+ print _"don't panic!"
+@}
+@end example
address@hidden
address@hidden
+As written, it won't work on other versions of @command{awk}.
+However, it is actually almost portable, requiring very little
+change:
+
address@hidden @bullet
address@hidden @code{TEXTDOMAIN} variable, portability and
@item
-All string concatenations are parenthesized too.
-(This could be made a bit smarter.)
address@hidden ignore
+Assignments to @code{TEXTDOMAIN} won't have any effect,
+since @code{TEXTDOMAIN} is not special in other @command{awk} implementations.
@item
-Parentheses are used around the arguments to @code{print}
-and @code{printf} only when
-the @code{print} or @code{printf} statement is followed by a redirection.
-Similarly, if
-the target of a redirection isn't a scalar, it gets parenthesized.
+Non-GNU versions of @command{awk} treat marked strings
+as the concatenation of a variable named @code{_} with the string
+following address@hidden is good fodder for an ``Obfuscated
address@hidden'' contest.} Typically, the variable @code{_} has
+the null string (@code{""}) as its value, leaving the original string constant as
+the result.
@item
address@hidden supplies leading comments in
-front of the @code{BEGIN} and @code{END} rules,
-the pattern/action rules, and the functions.
+By defining ``dummy'' functions to replace @code{dcgettext()}, @code{dcngettext()}
+and @code{bindtextdomain()}, the @command{awk} program can be made to run, but
+all the messages are output in the original language.
+For example:
+
address@hidden @code{bindtextdomain()} function (@command{gawk}), portability and
address@hidden @code{dcgettext()} function (@command{gawk}), portability and
address@hidden @code{dcngettext()} function (@command{gawk}), portability and
+@example
+@c file eg/lib/libintl.awk
+function bindtextdomain(dir, domain)
+@{
+ return dir
+@}
+
+function dcgettext(string, domain, category)
+@{
+ return string
+@}
+function dcngettext(string1, string2, number, domain, category)
+@{
+ return (number == 1 ? string1 : string2)
+@}
+@c endfile
+@end example
+
address@hidden
+The use of positional specifications in @code{printf} or
address@hidden()} is @emph{not} portable.
+To support @code{gettext()} at the C level, many systems' C versions of
address@hidden()} do support positional specifiers. But it works only if
+enough arguments are supplied in the function call. Many versions of
address@hidden pass @code{printf} formats and arguments unchanged to the
+underlying C library version of @code{sprintf()}, but only one format and
+argument at a time. What happens if a positional specification is
+used is anybody's guess.
+However, since the positional specifications are primarily for use in
address@hidden format strings, and since non-GNU @command{awk}s never
+retrieve the translated string, this should not be a problem in practice.
@end itemize
address@hidden ENDOFRANGE inap
-The profiled version of your program may not look exactly like what you
-typed when you wrote it. This is because @command{gawk} creates the
-profiled version by ``pretty printing'' its internal representation of
-the program. The advantage to this is that @command{gawk} can produce
-a standard representation. The disadvantage is that all source-code
-comments are lost, as are the distinctions among multiple @code{BEGIN},
address@hidden, @code{BEGINFILE}, and @code{ENDFILE} rules. Also, things such as:
address@hidden I18N Example
address@hidden A Simple Internationalization Example
+
+Now let's look at a step-by-step example of how to internationalize and
+localize a simple @command{awk} program, using @file{guide.awk} as our
+original source:
@example
-/foo/
+@c file eg/prog/guide.awk
+BEGIN @{
+ TEXTDOMAIN = "guide"
+ bindtextdomain(".") # for testing
+ print _"Don't Panic"
+ print _"The Answer Is", 42
+ print "Pardon me, Zaphod who?"
+@}
+@c endfile
@end example
@noindent
-come out as:
+Run @samp{gawk --gen-pot} to create the @file{.pot} file:
@example
-/foo/ @{
- print $0
address@hidden
+$ @kbd{gawk --gen-pot -f guide.awk > guide.pot}
@end example
@noindent
-which is correct, but possibly surprising.
+This produces:
address@hidden profiling @command{awk} programs, dynamically
address@hidden @command{gawk} program, dynamic profiling
-Besides creating profiles when a program has completed,
address@hidden can produce a profile while it is running.
-This is useful if your @command{awk} program goes into an
-infinite loop and you want to see what has been executed.
-To use this feature, run @command{gawk} with the @option{--profile}
-option in the background:
+@example
+@c file eg/data/guide.po
+#: guide.awk:4
+msgid "Don't Panic"
+msgstr ""
+
+#: guide.awk:5
+msgid "The Answer Is"
+msgstr ""
+
+@c endfile
+@end example
+
+This original portable object template file is saved and reused for each language
+into which the application is translated. The @code{msgid}
+is the original string and the @code{msgstr} is the translation.
+
address@hidden NOTE
+Strings not marked with a leading underscore do not
+appear in the @file{guide.pot} file.
address@hidden quotation
+
+Next, the messages must be translated.
+Here is a translation to a hypothetical dialect of English,
+called ``Mellow'':@footnote{Perhaps it would be better if it were
+called ``Hippy.'' Ah, well.}
@example
-$ @kbd{gawk --profile -f myprog &}
-[1] 13992
address@hidden
+$ cp guide.pot guide-mellow.po
address@hidden translations to} guide-mellow.po @dots{}
address@hidden group
@end example
address@hidden @command{kill} address@hidden dynamic profiling
address@hidden @code{USR1} signal
address@hidden @code{SIGUSR1} signal
address@hidden signals, @code{USR1}/@code{SIGUSR1}
@noindent
-The shell prints a job number and process ID number; in this case, 13992.
-Use the @command{kill} command to send the @code{USR1} signal
-to @command{gawk}:
+Following are the translations:
@example
-$ @kbd{kill -USR1 13992}
+@c file eg/data/guide-mellow.po
+#: guide.awk:4
+msgid "Don't Panic"
+msgstr "Hey man, relax!"
+
+#: guide.awk:5
+msgid "The Answer Is"
+msgstr "Like, the scoop is"
+
+@c endfile
@end example
address@hidden
-As usual, the profiled version of the program is written to
address@hidden, or to a different file if one specified with
-the @option{--profile} option.
address@hidden Linux
address@hidden GNU/Linux
+The next step is to make the directory to hold the binary message object
+file and then to create the @file{guide.mo} file.
+The directory layout shown here is standard for GNU @code{gettext} on
+GNU/Linux systems. Other versions of @code{gettext} may use a different
+layout:
-Along with the regular profile, as shown earlier, the profile
-includes a trace of any active functions:
+@example
+$ @kbd{mkdir en_US en_US/LC_MESSAGES}
+@end example
+
address@hidden @code{.po} files, converting to @code{.mo}
address@hidden files, @code{.po}, converting to @code{.mo}
address@hidden @code{.mo} files, converting from @code{.po}
address@hidden files, @code{.mo}, converting from @code{.po}
address@hidden portable object files, converting to message object files
address@hidden files, portable object, converting to message object files
address@hidden message object files, converting from portable object files
address@hidden files, message object, converting from portable object files
address@hidden @command{msgfmt} utility
+The @command{msgfmt} utility does the conversion from human-readable
address@hidden file to machine-readable @file{.mo} file.
+By default, @command{msgfmt} creates a file named @file{messages}.
+This file must be renamed and placed in the proper directory so that
address@hidden can find it:
@example
-# Function Call Stack:
+$ @kbd{msgfmt guide-mellow.po}
+$ @kbd{mv messages en_US/LC_MESSAGES/guide.mo}
+@end example
-# 3. baz
-# 2. bar
-# 1. foo
-# -- main --
+Finally, we run the program to test it:
+
+@example
+$ @kbd{gawk -f guide.awk}
+@print{} Hey man, relax!
+@print{} Like, the scoop is 42
+@print{} Pardon me, Zaphod who?
@end example
-You may send @command{gawk} the @code{USR1} signal as many times as you like.
-Each time, the profile and function call trace are appended to the output
-profile file.
+If the three replacement functions for @code{dcgettext()}, @code{dcngettext()}
+and @code{bindtextdomain()}
+(@pxref{I18N Portability})
+are in a file named @file{libintl.awk},
+then we can run @file{guide.awk} unchanged as follows:
address@hidden @code{HUP} signal
address@hidden @code{SIGHUP} signal
address@hidden signals, @code{HUP}/@code{SIGHUP}
-If you use the @code{HUP} signal instead of the @code{USR1} signal,
-@command{gawk} produces the profile and the function call trace and then exits.
+@example
+$ @kbd{gawk --posix -f guide.awk -f libintl.awk}
+@print{} Don't Panic
+@print{} The Answer Is 42
+@print{} Pardon me, Zaphod who?
+@end example
-@cindex @code{INT} signal (MS-Windows)
-@cindex @code{SIGINT} signal (MS-Windows)
-@cindex signals, @code{INT}/@code{SIGINT} (MS-Windows)
-@cindex @code{QUIT} signal (MS-Windows)
-@cindex @code{SIGQUIT} signal (MS-Windows)
-@cindex signals, @code{QUIT}/@code{SIGQUIT} (MS-Windows)
-When @command{gawk} runs on MS-Windows systems, it uses the
-@code{INT} and @code{QUIT} signals for producing the profile and, in
-the case of the @code{INT} signal, @command{gawk} exits. This is
-because these systems don't support the @command{kill} command, so the
-only signals you can deliver to a program are those generated by the
-keyboard. The @code{INT} signal is generated by the
address@hidden@address@hidden or @address@hidden@key{BREAK}} key, while the
address@hidden signal is generated by the @address@hidden@key{\}} key.
address@hidden Gawk I18N
address@hidden @command{gawk} Can Speak Your Language
-Finally, @command{gawk} also accepts another option, @option{--pretty-print}.
-When called this way, @command{gawk} ``pretty prints'' the program into
address@hidden, without any execution counts.
address@hidden ENDOFRANGE advgaw
address@hidden ENDOFRANGE gawadv
address@hidden ENDOFRANGE awkp
address@hidden ENDOFRANGE proawk
+@command{gawk} itself has been internationalized
+using the GNU @code{gettext} package.
+(GNU @code{gettext} is described in
+complete detail in
address@hidden
address@hidden, , GNU @code{gettext} utilities, gettext, GNU gettext tools}.)
address@hidden ifinfo
address@hidden
address@hidden gettext tools}.)
address@hidden ifnotinfo
+As of this writing, the latest version of GNU @code{gettext} is
address@hidden://ftp.gnu.org/gnu/gettext/gettext-0.18.2.1.tar.gz,
@value{PVERSION} 0.18.2.1}.
+
+If a translation of @command{gawk}'s messages exists,
+then @command{gawk} produces usage messages, warnings,
+and fatal errors in the local language.
address@hidden ENDOFRANGE inloc
@c The original text for this chapter was contributed by Efraim Yawitz.
@c FIXME: Add more indexing.
-----------------------------------------------------------------------
Summary of changes:
ChangeLog | 14 +-
Makefile.am | 2 +-
Makefile.in | 2 +-
TODO | 30 +-
array.c | 2 +-
awkgram.c | 368 +++---
builtin.c | 2 +-
cint_array.c | 2 +-
cmd.h | 2 +-
command.c | 178 ++--
debug.c | 2 +-
dfa.c | 6 +-
dfa.h | 2 +-
doc/ChangeLog | 6 +
doc/gawk.info | 2352 +++++++++++++++++++-------------------
doc/gawk.texi | 3116 ++++++++++++++++++++++++------------------------
doc/texinfo.tex | 62 +-
eval.c | 2 +-
ext.c | 4 +-
extension/ChangeLog | 6 +
extension/filefuncs.c | 2 +-
extension/fnmatch.c | 2 +-
extension/fork.c | 2 +-
extension/ordchr.c | 2 +-
extension/readdir.c | 2 +-
extension/readfile.c | 3 +-
extension/revoutput.c | 2 +-
extension/revtwoway.c | 2 +-
extension/rwarray.c | 2 +-
extension/rwarray0.c | 2 +-
extension/stack.c | 2 +-
extension/stack.h | 2 +-
extension/testext.c | 2 +-
extension/time.c | 2 +-
field.c | 2 +-
gawkapi.c | 2 +-
gawkapi.h | 2 +-
gettext.h | 39 +-
int_array.c | 2 +-
interpret.h | 2 +-
msg.c | 2 +-
node.c | 2 +-
profile.c | 2 +-
random.h | 11 +-
re.c | 2 +-
replace.c | 2 +-
str_array.c | 2 +-
symbol.c | 2 +-
test/ChangeLog | 5 +
test/Makefile.am | 21 +-
test/Makefile.in | 22 +-
51 files changed, 3172 insertions(+), 3139 deletions(-)
hooks/post-receive
--
gawk