Re: Help-gawk Digest, Vol 1, Issue 5
From: J Naman
Subject: Re: Help-gawk Digest, Vol 1, Issue 5
Date: Tue, 20 Jul 2021 13:15:58 -0400
My Benchmark long string functions: cnt=300,000;
            str='a'   'abcde'   Prev result
srep_rec    1.374s    1.544s     1.436s
srep_dbl    0.671s    0.546s     2.322s
srep_rpt    3.120s    7.239s    13.543s   my cnt2=30,000 vs cnt above=10.0%
srep_sub    2.465s    2.574s    27.290s   my cnt2=30,000 vs cnt above=10.0%
I cannot explain why dbl takes 1/2 the time of rec on my computer;
I cannot explain why rpt is 3x sub here vs 1/2 of sub in the prev result;
* NOTE: I got tired of waiting, so the rpt & sub counts are 1/10 of the rec & dbl counts
I cannot explain why rpt is 70x-ish rec & dbl here vs 6x-ish in the prev result;
* Windows 7 64-bit; Gawk 5.1.0; Intel i7-4930K 3.40 GHz, 6 cores;
On Tue, Jul 20, 2021 at 12:04 PM <help-gawk-request@gnu.org> wrote:
> Today's Topics:
>
> 1. Re: How to Generate a Long String of the Same Character
> (Wolfgang Laun)
> 2. Re: How to Generate a Long String of the Same Character
> (Andrew J. Schorr)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 20 Jul 2021 11:49:05 +0200
> From: Wolfgang Laun <wolfgang.laun@gmail.com>
> To: "Neil R. Ormos" <ormos-gnulists17@ormos.org>
> Cc: Help Gawk List <help-gawk@gnu.org>
> Subject: Re: How to Generate a Long String of the Same Character
> Message-ID:
> <CANaj1Lch8=dpwEccdyNQQO5A=
> gQLG7x3Ci7kd3F6RQ+zA4L55Q@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> On Mon, 19 Jul 2021 at 19:34, Neil R. Ormos <ormos-gnulists17@ormos.org>
> wrote:
>
> > Wolfgang Laun wrote:
> >
> > > The results for the four versions:
> > > *rec* 0m1.436s
> > > *dbl* 0m2.322s
> > > *rpt* 0m13.543s
> > > *sub* 0m27.290s
> >
> > I was a little surprised that the recursive
> > algorithm was so much faster in Wolfgang's tests.
> >
> Why? gawk programs execute on a virtual machine with a CIS and a couple of
> stacks. A function call isn't much worse than a goto.
> It is mainly the number of interpreter instructions that counts (e.g., h =
> h h x; is better than h = h h; h = h x;)
>
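To illustrate the point about doing the work in one statement (h h x rather than h = h h; h = h x;), here is a minimal sketch of a recursive repeat function. It is not the *rec* code that was timed in this thread; the name srep_sketch and the test values are only assumptions for illustration, and it is written for plain awk so it should also run under gawk.

```shell
result=$(awk '
# Hypothetical sketch, not the srep_rec that was benchmarked in the thread.
# Build n copies of s by recursing on n/2; for odd n the extra copy is
# appended in the same expression (h h s) rather than a second assignment.
function srep_sketch(n, s,    h) {
    if (n <= 0) return ""
    if (n == 1) return s
    h = srep_sketch(int(n / 2), s)
    return (n % 2) ? (h h s) : (h h)
}
BEGIN { print srep_sketch(5, "ab") "|" length(srep_sketch(1000, "x")) }
')
echo "$result"
```

The recursion depth grows only with log2(n), so even very large counts need just a few dozen calls.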
> > function neil(n, s, l, s0l){
> >   l=1;
> >   s0l=length(s);
> >   while (l*2<=n) {
> >     l=l+l;
> >     s=s s;
> >   };
> >   if (l<n) s=s substr(s, 1, (n-l)*s0l);
> >   return s;
> > };
> >
> You need to add
>   if( n == 0 ) return "";
> as the first instruction in neil. You can try to optimize away one 2*l from the
> loop. But I still get results where your non-recursive function is somewhat
> slower than my recursive one.
>
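For reference, here is neil as quoted above with the suggested n == 0 guard placed first. Only the guard is new; the BEGIN line and its test values are my own assumptions to show the edge cases, not something from the original posts.

```shell
result=$(awk '
# neil() as quoted in the thread, with the suggested guard as first statement.
function neil(n, s,    l, s0l) {
    if (n == 0) return ""          # without this, neil(0, s) returns s itself
    l = 1
    s0l = length(s)
    while (l * 2 <= n) {           # double the string while it still fits
        l = l + l
        s = s s
    }
    if (l < n) s = s substr(s, 1, (n - l) * s0l)   # top up the remainder
    return s
}
BEGIN { print "[" neil(0, "ab") "][" neil(1, "ab") "][" neil(5, "ab") "]" }
')
echo "$result"
```

The three calls cover the empty case, the case where the loop never runs, and a count that needs the substr() top-up.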
> I have noticed that gawk 5.1.0 appears to execute both versions a little
> faster than 5.0.1, but the difference remains. I can send you all the
> details about my environment but I don't think that this would tell you
> anything noteworthy.
>
> (I have never looked at gawk internals before, so all of the
> statements below are quite unreliable.)
>
> A somewhat enlightening procedure is to read a dump of the interpreter
> code. *rec* results in 32 VM instructions whereas *neil* results in 49. The
> salient numbers are the number of instructions executed for each
> iteration. *rec* loops over 22 or 24 VM instructions, the while in *neil*
> loops over 14 instructions; out of the remainder of 35 instructions 26 are
> executed with each call. Iteration count in *rec* is one less for
> 2^(n-1)+1 to 2^n-1.
>
> Some instructions are remarkably "heavy", length() for a string is one.
>
> Timing the single call
> x = srep( 300000000, "abc" );
> also shows that *rec* is faster.
>
> /usr/bin/time is very unreliable. I have done all runs on a machine where a
> browser but no other program (except daemons) is running. I don't know what
> emacs or eclipse do when they are just sitting there, enjoying their idle
> time. But if they affect the results of /usr/bin/time, it should not be
> partisan.
>
> All of this doesn't really explain why the results differ in this
> irrational way.
>
> Cheers
> Wolfgang
>
>
> ------------------------------
>
> Message: 2
> Date: Tue, 20 Jul 2021 09:42:53 -0400
> From: "Andrew J. Schorr" <aschorr@telemetry-investments.com>
> To: Wolfgang Laun <wolfgang.laun@gmail.com>
> Cc: "Neil R. Ormos" <ormos-gnulists17@ormos.org>, Help Gawk List
> <help-gawk@gnu.org>
> Subject: Re: How to Generate a Long String of the Same Character
> Message-ID: <20210720134253.GA7400@ti129.telemetry-investments.com>
> Content-Type: text/plain; charset=us-ascii
>
> On Tue, Jul 20, 2021 at 11:49:05AM +0200, Wolfgang Laun wrote:
> > Some instructions are remarkably "heavy", length() for a string is one.
>
> Are you in a multi-byte locale? Because if not, I'd expect length()
> to be very quick.
>
> Regards,
> Andy
>
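One rough way to check the locale question is to feed awk a two-byte UTF-8 character and see what length() reports. This is only a sketch: the byte sequence is my own test input, and the result of the second command depends on whichever locale and awk implementation happen to be in effect.

```shell
# The two bytes 0xC3 0xA9 encode a single UTF-8 character (e with acute accent).
# In the C locale, length() counts bytes.
bytes=$(printf '\303\251\n' | LC_ALL=C awk '{ print length($0) }')
echo "C locale: $bytes"

# Same input under the current locale. With a UTF-8 locale, gawk would be
# expected to count characters instead, so the number may differ from above.
printf '\303\251\n' | awk '{ print length($0) }'
```

If the two numbers differ, the timing runs were made in a multi-byte locale, and rerunning the benchmark with LC_ALL=C would show how much of the cost comes from character counting.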