How to Generate a Long String of the Same Character
From: Neil R. Ormos
Subject: How to Generate a Long String of the Same Character
Date: Wed, 14 Jul 2021 17:31:46 -0500 (CDT)
In a message on the bug-gawk list, Ed Mortin wrote:
> On an online forum someone asked how to generate a
> string of 100,000,000 "x"s. They had tried this in
> a BEGIN section:
>
> for(i=1;i<=100000000;i++) s = s "x"
>
> and wanted to know if there was a better
> approach. Someone suggested:
>
> s=sprintf("%*s",1000000000,""); gsub(/ /,"x",s)}
>
> which is also what I'd have also suggested, but
> upon testing that they found that the sprintf+gsub
> approach was slower than the loop in gawk 5.1.0
> and while I couldn't reproduce that exactly on
> cygwin, I can confirm that the sprintf+gsub
> solution is much slower than I expected. [...]
I am posting here to reply to the original
question because my comment does not relate to the
apparent speed-of-gsub() bug Ed was reporting.
Building a big string by iterating in tiny chunks
would seem to invite poor performance.
Instead, why not append the string to itself,
doubling its size with each iteration? For
example:
    time ~/.local/bin/gawk-5.1.0 'BEGIN {
        sizelim = 100000000; a = "x"
        while (length(a) < sizelim) a = a a   # double each pass
        a = substr(a, 1, sizelim)             # trim any overshoot
        print length(a)
    }'
On my not-very-fast machine, according to the time
built-in, that takes 0.17 seconds of elapsed time.
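To put a number on why the doubling is cheap, here is a small sketch that counts the concatenations without building the string. The function name count_doublings is made up for illustration, and plain "awk" is assumed to be on the PATH; any POSIX awk should do.

```sh
# count_doublings N: print how many doublings are needed to reach
# length N starting from one character, and the length reached.
# (Hypothetical helper; it only does the arithmetic, no big string.)
count_doublings() {
    awk -v n="$1" 'BEGIN {
        len = 1; steps = 0
        while (len < n) { len *= 2; steps++ }   # same loop shape as a = a a
        print steps, len
    }'
}
```

For the 100,000,000-character target, "count_doublings 100000000" prints "27 134217728": 27 concatenations instead of 100 million, at the cost of overshooting the target before the substr() trim.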
Yes, in the worst case, if the intended string has
length (2^N)+1, you wastefully build a string of
size 2^(N+1) and trim off almost half. So maybe on
a machine with limited memory, building the string
in single-character units would work but the
doubling would not.
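If the overshoot is a concern, one possible variation (not something I timed, and not part of the command above) is to follow the binary digits of the target length: keep doubling a piece, and append it to the result only when the corresponding bit is set. The function name repeat_char is hypothetical, and plain "awk" is again assumed to be on the PATH.

```sh
# repeat_char CH N: print N copies of the single character CH.
# Sketch of doubling guided by the bits of N, so that neither the
# piece nor the result ever grows longer than N characters.
repeat_char() {
    awk -v ch="$1" -v n="$2" 'BEGIN {
        out = ""; piece = ch
        while (n > 0) {
            if (n % 2 == 1) out = out piece   # this bit of N is set
            n = int(n / 2)
            if (n > 0) piece = piece piece    # double only if still needed
        }
        print out
    }'
}
```

For example, "repeat_char x 5" prints "xxxxx". The number of concatenations still grows only with the logarithm of N, and no substr() trim is needed, though whether it is faster or lighter on memory than the simple doubling loop in a given awk would have to be measured.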