bug#34524: wc: word count incorrect when words separated only by no-brea
From: Pádraig Brady
Subject: bug#34524: wc: word count incorrect when words separated only by no-break space
Date: Sun, 24 Feb 2019 19:55:39 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
On 24/02/19 17:07, Pádraig Brady wrote:
> So no-break space is generally considered a word delimiter,
> though there are complications from Unicode, which you detail.
>
> As for options to enable the various behaviors in wc(1),
> I'm thinking we might keep the strict POSIX isspace() behavior
> with LC_CTYPE=C and/or POSIXLY_CORRECT=1, and use iswnbspace()
> by default, since that's the most common operation one would want,
> and is consistent with LibreOffice, for example.
> I'll adjust the patch along those lines.
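To illustrate the behavior proposed in the quoted text, here is a minimal sketch of word counting with and without no-break spaces as delimiters. It is not the attached patch: `count_words`, `strict_posix`, `POSIX_SPACE`, and `NBSP` are hypothetical names, and the set of no-break code points is an assumption chosen for illustration. A real caller would presumably derive `strict_posix` from LC_CTYPE=C and/or POSIXLY_CORRECT, which is left out here.

```python
# Hypothetical sketch; names are illustrative, not taken from wc-nbsp.patch.

# ASCII whitespace recognised by isspace() in the C locale.
POSIX_SPACE = frozenset(" \t\n\v\f\r")

# No-break spaces treated as extra word delimiters.
# Assumed set: NO-BREAK SPACE, FIGURE SPACE, NARROW NO-BREAK SPACE, WORD JOINER.
NBSP = frozenset("\u00a0\u2007\u202f\u2060")


def count_words(text, strict_posix=False):
    """Count maximal runs of non-delimiter characters in text.

    With strict_posix=True only POSIX_SPACE delimits words; otherwise
    the characters in NBSP delimit words as well.
    """
    delimiters = POSIX_SPACE if strict_posix else POSIX_SPACE | NBSP
    words = 0
    in_word = False
    for ch in text:
        if ch in delimiters:
            in_word = False
        elif not in_word:
            # First character of a new word.
            in_word = True
            words += 1
    return words
```

Under these assumptions, `count_words("a\u00a0b")` gives 2, while `count_words("a\u00a0b", strict_posix=True)` gives 1, which is the difference between the proposed default and the strict POSIX mode.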
Full patch attached.
cheers,
Pádraig
wc-nbsp.patch
Description: Text Data