bug-coreutils

bug#30814: Please increase the value of MAX_MON_WIDTH in ls.c


From: Ruediger Meier
Subject: bug#30814: Please increase the value of MAX_MON_WIDTH in ls.c
Date: Fri, 16 Mar 2018 13:30:53 +0100
User-agent: KMail/1.9.10

On Wednesday 14 March 2018, Pádraig Brady wrote:
> On 13/03/18 17:06, Rafal Luzynski wrote:
> > As we have introduced the support of nominative and genitive
> > month names in glibc [1] and we are going to provide the updated
> > locale data for Catalan language [2] it has been discovered [3]
> > that the current limit of the maximum length of the abbreviated
> > month name as displayed by "ls -l" will not work with the new
> > data for Catalan.  It is obligatory to precede the month name
> > with "de " (note: the space) so the abbreviated month names limited
> > to 5 characters will be ambiguous and therefore unreadable:
>
> It's a bit surprising that _abbreviations_ all need the "de " prefix,
> but fair enough.

Most "abbreviations" used in our locales do not follow the language
rules anyway. Even in English we would need to add dots, and some month
abbreviations simply do not exist.

Below are the correct abbreviations for English, Spanish, and
German:

Jan.    enero   Jan.
Feb.    feb.    Feb.
Mar.    marzo   März
Apr.    abr.    Apr.
May     mayo    Mai
June    jun.    Jun.
July    jul.    Jul.
Aug.    agosto  Aug.
Sept.   set.    Sept.
Oct.    oct.    Okt.
Nov.    nov.    Nov.
Dec.    dic.    Dez.

Thankfully, all 3 locales just use the first three letters. Note that in
Spanish you would also need to add such a genitive "de", but of course
nobody wants to see it when printing short dates to a terminal.

While I see a benefit in having the correct abbreviations *somewhere* in
the locale, I don't think they should be used in tools like ls by
default.  The output should IMHO not be longer than --time-style=long-iso
or --full-time.
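
For comparison, a minimal sketch, assuming GNU ls (both options are GNU
extensions): the two styles named above print purely numeric timestamps,
so their width does not depend on the month names of the locale. The
path "/" is only an arbitrary example.

```shell
# Assumes GNU ls; "/" is just an arbitrary path that always exists.
# Numeric date plus time to the minute, in the form YYYY-MM-DD HH:MM
ls -ld --time-style=long-iso /
# Equivalent to -l --time-style=full-iso: adds seconds, nanoseconds
# and the numeric time zone offset
ls -ld --full-time /
```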

> > de ma  (should be "de mar" at least)
> > d’abr  (correct)
> > de ma  (should be "de mai" at least)
> > de ju  (should be "de jun" at least)
> > de ju  (should be "de jul" at least)

I don't speak Catalan, but I can't believe that "de jun" is a correct 
abbreviation following the language rules.


> > Increasing the value of MAX_MON_WIDTH to 6 characters will fix
> > the problem. The location of the constant is here: [4]
> >
> > Although it has been also suggested in the same bug report that
> > there should be no additional limit for the month length.
> >
> > This bug may be related to the coreutils bug #29377. [5]
> >
> > Regards,
> >
> > Rafal Luzynski
> >
> >
> > [1] https://sourceware.org/bugzilla/show_bug.cgi?id=10871
> > [2] https://sourceware.org/bugzilla/show_bug.cgi?id=22848
> > [3] https://sourceware.org/bugzilla/show_bug.cgi?id=22848#c6
> > [4]
> > http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/ls.c#n1099
> > [5] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=29377
>
> Thanks for the careful analysis.
>
> 5 was chosen as a max width for abmon
> as that was seen to be unambiguous and
> also truncate overly long abbreviations.
>
> One can browse the abbreviations by length using:
>
>   locale -a | grep utf8 |
>   while read l; do LC_ALL=$l locale abmon; done |
>   tr ';' '\n' | sort -u | grep '.\{5,\}' |
>   while read mon; do
>     printf '%02d %s\n' "$(echo "$mon" | wc -L)" "$mon"
>   done |
>   sort -n | less
>
> That shows a couple of existing issues with the limit of 5.
> ln_CD.utf8 (Democratic Republic of the Congo) needs a length of 7 to
> be unambiguous, while Arabic needs 12!
> I don't remember Arabic being so long at the time I implemented
> the alignment/truncation in ls (9 years ago), but we should probably
> expand to account for that.
>
> $ LC_ALL=ln_CD.utf8 locale abmon
> sánzá1.;sánzá2.;sánzá3.;sánzá4.;sánzá5.;sánzá6.;sánzá7.;sánzá8.;sánzá9.;sánz10.;sánzá11.;sánzá12.
>
> $ LC_ALL=ar_SY.utf8 locale abmon | tr ';' '\n'
> كانون الثاني
> شباط
> آذار
> نيسان
> نوار
> حزيران
> تموز
> آب
> أيلول
> تشرين الأول
> تشرين الثاني
> كانون الأول
>
> Given the increase in supported size should only impact relatively
> few languages it probably makes sense to increase to 12. The attached
> does that and also augments the test to find ambiguous cases.
>
> cheers,
> Pádraig
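
The ambiguity that a fixed truncation width introduces can be checked
mechanically. Below is a minimal sketch, not the test attached to the
patch: ambiguous_at_width is a hypothetical helper, and the sample names
are the Catalan forms quoted earlier in this thread, with the apostrophe
written in ASCII. It assumes ASCII input, so that bytes and characters
coincide; real locale data would need multibyte-aware truncation.

```shell
#!/bin/sh
# Hypothetical helper: read one name per line on stdin, cut each name
# to the given width, and print every truncated form that occurs more
# than once (that is, every ambiguous abbreviation).
# Assumption: ASCII input, so "cut -c" positions equal characters.
ambiguous_at_width() {
    width=$1
    cut -c "1-$width" | sort | uniq -d
}

# Sample data: the Catalan forms quoted in this thread (ASCII apostrophe).
months="de mar
d'abr
de mai
de jun
de jul"

# Width 5 collapses "de mar"/"de mai" and "de jun"/"de jul".
printf '%s\n' "$months" | ambiguous_at_width 5   # prints "de ju" and "de ma"

# Width 6 keeps all five names distinct.
printf '%s\n' "$months" | ambiguous_at_width 6   # prints nothing
```

With this sample, 6 is the smallest width that leaves no collisions,
which matches the minimum requested in the original report.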






