bug-gnulib

Re: [PATCH 1/2] maint.mk: Split long argument lists


From: Roman Bolshakov
Subject: Re: [PATCH 1/2] maint.mk: Split long argument lists
Date: Wed, 28 Nov 2018 11:00:50 +0300
User-agent: NeoMutt/20180716

On Tue, Nov 27, 2018 at 07:19:43PM +0100, Bruno Haible wrote:
> Hi,
> 
> > The workaround is to split argument list into chunks that operating
> > system can process. "getconf ARG_MAX" is used to determine size of the
> > chunk.
> 
> Two questions on this:
> 
> 1) People say that 'getconf ARG_MAX' returns the approximate number
>    of bytes in a command line. [1]
>    But you use it with 'xargs -n', which gives a limit on the number of
>    arguments. Shouldn't the patch use 'xargs -s' instead?
> 

Hi Bruno,

You're right about "-s", the patch should probably use it. I started
with "-n" and a reasonably low number of arguments (IIRC it was 8000),
then tried to look up whether there is a system limit on the number of
arguments. I didn't find one, so I used ARG_MAX instead.
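
To illustrate why a byte limit is a poor fit for "-n", here is a minimal
sketch (the helper name count_invocations is made up for illustration).
It shows that "-n" only counts arguments and pays no attention to how
many bytes they occupy:

```shell
# Hypothetical helper: count how many times xargs runs the utility.
# Each invocation of "echo" prints exactly one line, so counting
# output lines counts invocations.
count_invocations() {
    xargs -n "$1" echo | wc -l | tr -d ' '
}

# Five one-byte arguments, at most two per invocation.
short=$(printf '%s\n' a b c d e | count_invocations 2)

# Five 100-byte arguments (zero-padded numbers), same "-n" value.
long=$(printf '%0100d\n' 1 2 3 4 5 | count_invocations 2)

# Both inputs are split the same way: 2 + 2 + 1 arguments.
echo "short: $short, long: $long"
```

Both counts come out as 3, although the second input is a hundred times
larger in bytes.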

> 2) The really available values are slightly smaller.
> 
>    On Linux:
>    $ getconf ARG_MAX
>    2097152
>    $ LC_ALL=C xargs --show-limits
>    Your environment variables take up 4744 bytes
>    POSIX upper limit on argument length (this system): 2090360
>    POSIX smallest allowable upper limit on argument length (all systems): 4096
>    Maximum length of command we could actually use: 2085616
>    Size of command buffer we are actually using: 131072
> 
>    On FreeBSD/x86_64:
>    $ getconf ARG_MAX
>    262144
>    $ LC_ALL=C xargs --show-limits
>    Your environment variables take up 353 bytes
>    POSIX upper limit on argument length (this system): 259743
>    POSIX smallest allowable upper limit on argument length (all systems): 4096
>    Maximum length of command we could actually use: 259390
>    Size of command buffer we are actually using: 131072
> 
>    On macOS:
>    $ getconf ARG_MAX
>    262144
>    $ LC_ALL=C xargs --show-limits
>    Your environment variables take up 1262 bytes
>    POSIX upper limit on argument length (this system): 258834
>    POSIX smallest allowable upper limit on argument length (all systems): 4096
>    Maximum length of command we could actually use: 257572
>    Size of command buffer we are actually using: 131072
> 
>    How about being conservative and dividing the limit by 2, to avoid
>    this margin error?
> 
> Could it be that your patch works only because xargs uses a command buffer
> of length 131072, regardless of the value you pass to '-n'?
> 
> Bruno
> 
> [1] 
> https://www.cyberciti.biz/faq/linux-unix-arg_max-maximum-length-of-arguments/
> 

This is an interesting coincidence. If we pick a value for "-n" that is
big enough to exhaust the command buffer, the number of arguments ends
up being limited by the default value of the "-s" flag. And ARG_MAX
arguments will always exhaust the command buffer, because each argument
takes at least two bytes including its terminator.
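
A rough way to observe this, assuming "seq" is available (it is not
required by POSIX) and that the default "-s" is well below the total
input size, as it is in the outputs quoted above:

```shell
# Ask for up to 1000000 arguments per invocation and leave "-s" at its
# default. The 200000 numeric arguments add up to more than 1 MB, which
# is larger than the default command buffers quoted above, so xargs has
# to split the input into several invocations anyway.
total=200000
runs=$(seq 1 "$total" | xargs -n 1000000 echo | wc -l | tr -d ' ')

# Each "echo" invocation prints one line, so $runs is the number of
# invocations; the exact value depends on the local xargs.
echo "$total arguments were passed in $runs invocations"
```

If "-n" were the only limit, this would report a single invocation.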

Here are related excerpts from macOS xargs man page:
  -n _number_
               Set the maximum number of arguments taken from standard
               input for each invocation of utility.  An invocation of
               utility will use less than _number_ standard input arguments
               if the number of bytes accumulated (see the -s option)
               exceeds the specified _size_ or there are fewer than _number_
               arguments remaining for the last invocation of utility.
               The current default value for _number_ is 5000.
  -s _size_
               Set the maximum number of bytes for the command line length
               provided to utility.  The sum of the length of the utility
               name, the arguments passed to utility (including NULL
               terminators) and the current environment will be less than
               or equal to this number.  The current default value for
               _size_ is ARG_MAX - 4096.

And from GNU xargs:
  -n _max-args_, --max-args=_max-args_
                Use  at most _max-args_ arguments per command line.
                Fewer than _max-args_ arguments will be used if the size
                (see the -s option) is exceeded, unless the -x option is
                given, in which case xargs will exit.
  -s _max-chars_, --max-chars=_max-chars_
                Use  at  most _max-chars_ characters per command line,
                including the command and initial-arguments and the
                terminating nulls at the ends of the argument strings.
                The largest allowed value is system-dependent, and is
                calculated as the argument length limit for exec, less
                the size of your environment, less 2048 bytes of
                headroom.  If this value is more than 128KiB, 128KiB is
                used as the default value; otherwise, the default value
                is the maximum.  1KiB is 1024 bytes.  xargs
                automatically adapts to tighter constraints.
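
As a cross-check, the "2048 bytes of headroom" from the GNU excerpt
matches the --show-limits figures quoted earlier. The sketch below only
redoes that arithmetic; the function name "check" is made up for
illustration:

```shell
# Arguments: system name, ARG_MAX, environment size, reported
# "POSIX upper limit", reported "maximum length we could actually use".
# All numbers are copied from the quoted --show-limits output.
check() {
    upper=$(( $2 - $3 - 2048 ))   # ARG_MAX - environment - headroom
    usable=$(( upper - $3 ))      # upper limit - environment
    [ "$upper" -eq "$4" ] && [ "$usable" -eq "$5" ] && echo "$1: consistent"
}

check Linux   2097152 4744 2090360 2085616
check FreeBSD 262144   353  259743  259390
check macOS   262144  1262  258834  257572
```

All three lines report "consistent".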


Given that, even if the patch is kept as is, it should work properly on
macOS, FreeBSD (whose docs are very close to those of macOS xargs) and
all systems with GNU xargs. I can amend the commit message to note this
observation.

Alternatively, we can replace "-n" with "-s", as you pointed out in 1).
But then we will need to correct the calculation of VC_ARG_MAX. We can
take the formula from [2]:
expr `getconf ARG_MAX` - `env|wc -c` - `env|egrep '^[^ ]+='|wc -l` \* 4 - 2048

But that is a bit higher than the effective limit of the command line
buffer on macOS/FreeBSD. To fix that we need to replace 2048 with 4096
in the formula (according to the man page above), so I think the final
VC_ARG_MAX should be:
expr `getconf ARG_MAX` - `env|wc -c` - `env|egrep '^[^ ]+='|wc -l` \* 4 - 4096
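
The same computation can be sketched with POSIX $(...) and shell
arithmetic instead of backticks and expr. This is only an illustration;
the lower-case variable names are made up, and "grep -E -c" stands in
for "egrep ... | wc -l":

```shell
# Total size limit reported by the system, in bytes.
arg_max=$(getconf ARG_MAX)

# Bytes taken by the environment, as printed by env.
env_bytes=$(env | wc -c)

# Number of environment variables; each one is charged 4 extra bytes,
# following the formula from [2].
env_vars=$(env | grep -E -c '^[^ ]+=')

# 4096 bytes of headroom, as in the macOS man page excerpt.
vc_arg_max=$(( arg_max - env_bytes - env_vars * 4 - 4096 ))

echo "VC_ARG_MAX=$vc_arg_max"
```

The result would then be the value passed to "xargs -s".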

[2] https://www.in-ulm.de/~mascheck/various/argmax/

Thank you,
Roman


