bug-wget
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and pote


From: Tim Rühsen
Subject: Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings
Date: Fri, 24 Oct 2014 20:39:38 +0200
User-agent: KMail/4.14.1 (Linux/3.16-2-amd64; KDE/4.14.1; x86_64; ; )

Pär, thanks for your work.

*BUT* speed does not really matter here. My intention was to show code that is 
much more readable than the current implementation which is unnecessary 
complicated and not even understandable by the LLVM/clang analyzer.
That piece of code should be replaced.

FYI, if you want to work on CPU cycle optimization, use valgrind --
tool=callgrind for your wget command to be polished. The resulting file can be 
viewed with e.g. kcachegrind.

Tim


Am Freitag, 24. Oktober 2014, 19:27:43 schrieb Pär Karlsson:
> Well, I wrote a little benchmark and implemented them all in a hastily
> thrown together little program.
> 
> Basically, it tests three different strings against all three
> implementations of the functions (but it uses malloc instead of xmalloc,
> because I didn't want to throw in the whole of gnulib there too):
> 
> The results from 1 million iterations on each string (on an AMD Athlon(tm)
> Dual Core Processor 4850e) are as follows:
> 
> address@hidden ~/code $ make strtest
> cc     strtest.c   -o strtest
> address@hidden ~/code $ ./strtest
> Result concat_strings    : avg 0.00000018 s, total 0.18253400 s
> Result concat_strings_new: avg 0.00000025 s, total 0.25143400 s
> Result concat_strings_tim: avg 0.00000036 s, total 0.36166900 s
> Result concat_strings:     avg 0.00000015 s, total 0.15230500 s
> Result concat_strings_new: avg 0.00000022 s, total 0.22280100 s
> Result concat_strings_tim: avg 0.00000027 s, total 0.27337900 s
> Result concat_strings:     avg 0.00000073 s, total 0.72672500 s
> Result concat_strings_new: avg 0.00000066 s, total 0.66113200 s
> Result concat_strings_tim: avg 0.00000148 s, total 1.47995200 s
> address@hidden ~/code $ gcc -O3 -march=native -o strtest strtest.c -Wall
> address@hidden ~/code $ ./strtest
> Result concat_strings    : avg 0.00000013 s, total 0.12796300 s
> Result concat_strings_new: avg 0.00000020 s, total 0.20472700 s
> Result concat_strings_tim: avg 0.00000016 s, total 0.16481100 s
> Result concat_strings:     avg 0.00000011 s, total 0.10742400 s
> Result concat_strings_new: avg 0.00000018 s, total 0.18417900 s
> Result concat_strings_tim: avg 0.00000013 s, total 0.12504500 s
> Result concat_strings:     avg 0.00000059 s, total 0.58840200 s
> Result concat_strings_new: avg 0.00000054 s, total 0.53843800 s
> Result concat_strings_tim: avg 0.00000068 s, total 0.68416400 s
> address@hidden ~/code $ gcc -g -march=native -o strtest strtest.c -Wall
> address@hidden ~/code $ valgrind ./strtest
> ==3802== Memcheck, a memory error detector
> ==3802== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
> ==3802== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
> ==3802== Command: ./strtest
> ==3802==
> Result concat_strings    : avg 0.00000754 s, total 7.54378500 s
> Result concat_strings_new: avg 0.00001080 s, total 10.80041900 s
> Result concat_strings_tim: avg 0.00000900 s, total 8.99982200 s
> Result concat_strings:     avg 0.00000652 s, total 6.52148200 s
> Result concat_strings_new: avg 0.00000985 s, total 9.85281200 s
> Result concat_strings_tim: avg 0.00000770 s, total 7.69649600 s
> Result concat_strings:     avg 0.00002109 s, total 21.08910100 s
> Result concat_strings_new: avg 0.00002570 s, total 25.69685100 s
> Result concat_strings_tim: avg 0.00002541 s, total 25.40761200 s
> ==3802==
> ==3802== HEAP SUMMARY:
> ==3802==     in use at exit: 0 bytes in 0 blocks
> ==3802==   total heap usage: 13,000,000 allocs, 13,000,000 frees,
> 743,000,000 bytes allocated
> ==3802==
> ==3802== All heap blocks were freed -- no leaks are possible
> ==3802==
> ==3802== For counts of detected and suppressed errors, rerun with: -v
> ==3802== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)
> 
> All three implementations perform correctly as far as I can tell (they give
> identical results), and behave well regarding memory management (as long as
> the allocated space is free():d afterwards).
> 
> I tested it with gperf too, and it kind of indicated the same results, the
> original implementation is the fastest.
> 
> I'm almost ashamed to attach the benchmark program, since it's so clumsily
> (and copy-pastily) put together, but for completeness, I attach it anyway
> (strtest.c.gz).
> 
> My conclusion is: the current implementation is the best and most
> efficient; it does everything it's supposed to do, and it does it quicker
> (at least on my machine) than any of the given alternatives. For 5
> arguments or less, it's the most efficient implementation, and nowhere in
> the current codebase is there a place where more than 5 strings are used...
> to my knowledge :-) There's really nothing wrong with it, IMO :-)
> 
> Best regards,
> 
> /Pär
> 
> 2014-10-23 22:01 GMT+02:00 Tim Rühsen <address@hidden>:
> > Am Dienstag, 21. Oktober 2014, 16:49:12 schrieb Yousong Zhou:
> > > On 21 October 2014 16:17, Pär Karlsson <address@hidden> wrote:
> > > > Yes, you are right, of course. Looking through the original
> > 
> > implementation
> > 
> > > > again, it seems water tight. clang probably complains about the
> > > > uninitialized values above argcount in saved_lengths[], that are never
> > > > reached.
> > > > 
> > > > The precalculated strlen:s saved is likely only an optimization(?)
> > > > attempt,
> > > > I suppose.
> > > 
> > > Yes. Grepping through the code shows that currently there is no
> > > invocation of concat_strings() having more than 5 arguments.
> > > 
> > > > Still, it seems wasteful to set up two complete loops with va_arg, and
> > > > considering what this function actually does, I wonder if not
> > 
> > s(n)printf
> > 
> > > > should be used instead of this function? :-)
> > > 
> > > I think concat_strings() is more tight and readable than multiple
> > > strlen() + malloc() + snprintf().
> > > 
> > > Regards.
> > > 
> > >                yousong
> > 
> > Reading the answers here tells me, something is wrong with this function.
> > There is misunderstanding / misinterpretation regarding the code.
> > Not only the developers here are affected, but also clang (version 3.3
> > upto
> > 3.6).
> > 
> > Here is my approach - introducing a helper function (strlcpy, which exists
> > in
> > BSD for a long time). It seems very straight forward to me.
> > 
> > How you like it ?
> > 
> > 
> > char *
> > concat_strings (const char *str0, ...)
> > {
> > 
> >   va_list args;
> >   const char *arg;
> >   size_t length = 0, pos = 0;
> >   char *s;
> >   
> >   if (!str0)
> >   
> >     return NULL;
> >   
> >   va_start (args, str0);
> >   for (arg = str0; arg; arg = va_arg (args, const char *))
> >   
> >     length += strlen(arg);
> >   
> >   va_end (args);
> >   
> >   s = xmalloc (length + 1);
> >   
> >   va_start (args, str0);
> >   for (arg = str0; arg; arg = va_arg (args, const char *))
> >   
> >     pos += strlcpy(s + pos, arg, length - pos + 1);
> >   
> >   va_end (args);
> >   
> >   return s;
> > 
> > }
> > 
> > 
> > strlcpy() can often be used as replacement for
> > 
> >   len = snprintf(buf,"%s",string);
> > 
> > but is ways faster.
> > 
> > FYI, my strlcpy() code is
> > 
> > 
> > #ifndef HAVE_STRLCPY
> > size_t
> > strlcpy (char *dst, const char *src, size_t size)
> > {
> > 
> >   const char *old = src;
> >   
> >   /* Copy as many bytes as will fit */
> >   if (size)
> >   
> >     {
> >     
> >       while (--size)
> >       
> >         {
> >         
> >           if (!(*dst++ = *src++))
> >           
> >             return src - old - 1;
> >         
> >         }
> >       
> >       *dst = 0;
> >     
> >     }
> >   
> >   while (*src++);
> >   return src - old - 1;
> > 
> > }
> > #endif
> > 
> > 
> > BTW, I already tested this in my local copy of Wget. Passes all tests (as
> > expected ;-)
> > 
> > Tim

Attachment: signature.asc
Description: This is a digitally signed message part.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]