bug-coreutils

Re: printf unicode documentation


From: Paul Eggert
Subject: Re: printf unicode documentation
Date: Fri, 15 Oct 2004 12:34:09 -0700
User-agent: Gnus/5.1006 (Gnus v5.10.6) Emacs/21.3 (gnu/linux)

Dan Jacobson <address@hidden> writes:

> LC_CTYPE=zh_TW.Big5 proper and LC_CTYPE=zh_CN.big5 improper. All I
> know is that the former works and the latter doesn't here on Debian.

That sounds like a locale problem, not a coreutils problem.  I'm
afraid I don't know Chinese, so I can't help you much here.  (But are
you sure about that "Big5" versus "big5"?)
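One way to check is to list the locale names the system actually has installed. The following is only a sketch: `list_locales` is a hypothetical helper name, and it assumes a `locale` command that supports `-a`, as glibc's does.

```shell
# Hypothetical helper: list installed locale names that contain the given
# text, ignoring case.  Prints nothing if `locale` is missing or no name matches.
list_locales() {
    locale -a 2>/dev/null | grep -i -F -- "$1"
}

# Show whichever Big5 spellings are installed on this system, if any.
list_locales big5 || echo "no installed locale name contains 'big5'"
```

Whatever names this prints are the ones worth trying in LC_CTYPE.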

> But does it mention the word "Unicode" even once? I think it is worth
> mentioning that word.

OK, thanks, I installed this:

2004-10-15  Paul Eggert  <address@hidden>

        * src/printf.c (usage): Mention Unicode, and use H for hex digits.
        * doc/coreutils.texi (printf invocation): Mention ISO/IEC 10646 as
        well as Unicode.  Various minor formatting cleanups.

Index: src/printf.c
===================================================================
RCS file: /fetish/cu/src/printf.c,v
retrieving revision 1.94
retrieving revision 1.95
diff -p -u -r1.94 -r1.95
--- src/printf.c        3 Aug 2004 19:07:11 -0000       1.94
+++ src/printf.c        15 Oct 2004 19:31:47 -0000      1.95
@@ -128,10 +128,9 @@ FORMAT controls the output as in C print
   \\v      vertical tab\n\
 "), stdout);
       fputs (_("\
-  \\xNN    byte with hexadecimal value NN (1 to 2 digits)\n\
-\n\
-  \\uNNNN  character with hexadecimal value NNNN (4 digits)\n\
-  \\UNNNNNNNN  character with hexadecimal value NNNNNNNN (8 digits)\n\
+  \\xHH    byte with hexadecimal value HH (1 to 2 digits)\n\
+  \\uHHHH  Unicode (ISO/IEC 10646) character with hex value HHHH (4 digits)\n\
+  \\UHHHHHHHH  Unicode character with hex value HHHHHHHH (8 digits)\n\
 "), stdout);
       fputs (_("\
   %%      a single %\n\
Index: doc/coreutils.texi
===================================================================
RCS file: /fetish/cu/doc/coreutils.texi,v
retrieving revision 1.219
retrieving revision 1.220
diff -p -u -r1.219 -r1.220
--- doc/coreutils.texi  13 Oct 2004 21:47:58 -0000      1.219
+++ doc/coreutils.texi  15 Oct 2004 19:31:08 -0000      1.220
@@ -1441,7 +1441,7 @@ expression @var{bre}.
 @opindex --section-delimiter
 @cindex section delimiters of pages
 Set the section delimiter characters to @var{cd}; default is
-@samp{\:}. If only @var{c} is given, the second remains @samp{:}.
+@samp{\:}.  If only @var{c} is given, the second remains @samp{:}.
 (Remember to protect @samp{\} or other metacharacters from shell
 expansion with quotes or extra backslashes.)
 
@@ -3182,7 +3182,7 @@ has two problems.  First, it is ineffect
 Second, it has undefined behavior if @env{LC_CTYPE} (or @env{LANG}, if
 @env{LC_CTYPE} is unset) is set to an incompatible value.  For example,
 you get undefined behavior if @env{LC_CTYPE} is @code{ja_JP.PCK} but
-@env{LC_COLLATE} is @code{en_US.UTF-8}. }
+@env{LC_COLLATE} is @code{en_US.UTF-8}.}
 
 @sc{gnu} @command{sort} (as specified for all @sc{gnu} utilities) has no
 limit on input line length or restrictions on bytes allowed within lines.
@@ -6938,7 +6938,7 @@ gives directories that it creates the de
 @opindex --group
 @cindex group ownership of installed files, setting
 Set the group ownership of installed files or directories to
-@var{group}. The default is the process' current group.  @var{group}
+@var{group}.  The default is the process's current group.  @var{group}
 may be either a group name or a numeric group id.
 
 @item -m @var{mode}
@@ -6960,7 +6960,7 @@ and execute for the owner, and read and 
 @cindex appropriate privileges
 @vindex root @r{as default owner}
 If @command{install} has appropriate privileges (is run as root), set the
-ownership of installed files or directories to @var{owner}. The default
+ownership of installed files or directories to @var{owner}.  The default
 is @code{root}.  @var{owner} may be either a user name or a numeric user
 ID.
 
@@ -9270,18 +9270,20 @@ digits) specifying a character to print.
 
 @kindex \uhhhh
 @kindex \Uhhhhhhhh
+@cindex Unicode
+@cindex ISO/IEC 10646
+@vindex LC_CTYPE
 @command{printf} interprets two character syntaxes introduced in ISO C 99:
-@samp{\u} for 16-bit Unicode characters, specified as 4 hex digits
-@var{hhhh}, and @samp{\U} for 32-bit Unicode characters, specified as 8 hex
-digits @var{hhhhhhhh}. @command{printf} outputs the Unicode characters
-according to the LC_CTYPE part of the current locale, i.e., depending
-on the values of the environment variables @env{LC_ALL}, @env{LC_CTYPE},
-@env{LANG}.
+@samp{\u} for 16-bit Unicode (ISO/IEC 10646) characters, specified as
+four hexadecimal digits @var{hhhh}, and @samp{\U} for 32-bit Unicode
+characters, specified as eight hexadecimal digits @var{hhhhhhhh}.
+@command{printf} outputs the Unicode characters
+according to the @env{LC_CTYPE} locale.
 
 The processing of @samp{\u} and @samp{\U} requires a full-featured
 @code{iconv} facility.  It is activated on systems with glibc 2.2 (or newer),
-or when @code{libiconv} is installed prior to this package.  Otherwise the
-use of @samp{\u} and @samp{\U} will give an error message.
+or when @code{libiconv} is installed prior to this package.  Otherwise
+@samp{\u} and @samp{\U} will print as-is.
 
 The only options are a lone @option{--help} or
 @option{--version}.  @xref{Common options}.
@@ -9296,7 +9298,7 @@ $ /usr/local/bin/printf '\u20AC 14.95'
 
 @noindent
 will be output correctly in all locales supporting the Euro symbol
-(ISO-8859-15, UTF-8, and others). Similarly, a Chinese string
+(ISO-8859-15, UTF-8, and others).  Similarly, a Chinese string
 
 @example
 $ /usr/local/bin/printf '\u4e2d\u6587'
@@ -11451,9 +11453,9 @@ week number of year with Sunday as first
 Days in a new year preceding the first Sunday are in week zero.
 @item %V
 week number of year with Monday as first day of the week as a decimal
-(address@hidden). If the week containing January 1 has four or more days in
+(address@hidden).  If the week containing January 1 has four or more days in
 the new year, then it is considered week 1; otherwise, it is week 53 of
-the previous year, and the next week is week 1. (See the @acronym{ISO} 8601
+the previous year, and the next week is week 1.  (See the @acronym{ISO} 8601
 standard.)
 @item %w
 day of week (address@hidden) with 0 corresponding to Sunday
@@ -12353,7 +12355,7 @@ the exit status of @var{command} otherwi
 
 @command{su} allows one user to temporarily become another user.  It runs a
 command (often an interactive shell) with the real and effective user
-id, group id, and supplemental groups of a given @var{user}. Synopsis:
+id, group id, and supplemental groups of a given @var{user}.  Synopsis:
 
 @example
 su address@hidden@dots{} address@hidden address@hidden@dots{}]
@@ -13019,7 +13021,7 @@ water pipeline.
 With the Unix shell, it's very easy to set up data pipelines:
 
 @smallexample
-program_to_create_data | filter1 | .... | filterN > final.pretty.data
+program_to_create_data | filter1 | ... | filterN > final.pretty.data
 @end smallexample
 
 We start out by creating the raw data; each filter applies some successive
@@ -13467,7 +13469,7 @@ This book showed how to write and use so
 1976, using a preprocessor for FORTRAN named @command{ratfor} (RATional
 FORtran).  At the time, C was not as ubiquitous as it is now; FORTRAN
 was.  The last chapter presented a @command{ratfor} to FORTRAN
-processor, written in @command{ratfor}. @command{ratfor} looks an awful
+processor, written in @command{ratfor}.  @command{ratfor} looks an awful
 lot like C; if you know C, you won't have any problem following the
 code.
 
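To see what "according to the LC_CTYPE locale" means in practice, one can dump the bytes that printf emits for the same escape under different locales. This is a rough sketch, not part of the patch: `show_bytes` is a hypothetical helper, and the locale names in the loop are assumptions that may not be installed on a given system (if one is missing, printf will presumably behave as in the C locale).

```shell
# Hypothetical helper: hex-dump stdin one byte at a time, on a single line,
# so the encoding that printf chose is visible.
show_bytes() {
    od -An -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# `env printf` runs the external printf program rather than a shell builtin.
# The locale names below are assumptions; substitute ones listed by `locale -a`.
for loc in en_US.UTF-8 en_US.ISO-8859-15 C; do
    printf '%s: ' "$loc"
    LC_ALL=$loc env printf '\u20AC 14.95' | show_bytes
    echo
done
```

Where the locale's character set has a Euro sign, the leading bytes should differ by encoding while the ASCII tail stays the same.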



