Re: Counting words, fast!

help-bash

From:	Jesse Hathaway
Subject:	Re: Counting words, fast!
Date:	Wed, 17 Mar 2021 10:34:47 -0500

On Tue, Mar 16, 2021 at 10:30 PM Dennis Williamson
<dennistwilliamson@gmail.com> wrote:
> I've been playing with your optimized code changing the read to grab data in 
> chunks like some of the other optimized code does - thus extending your move 
> from by-word to by-line reading to reading a specified larger number of 
> characters.
>
> IFS= read -r -N 4096 var
>
> And appending the result of a regular read to end at a newline. This seemed 
> to cut about 20% off the time. But I get different counts than your code. 
> I've tried using read without specifying a variable and using the resulting 
> $REPLY to preserve whitespace but the counts still didn't match.
>
> In any case this points to larger chunks being more efficient.

Oh, that is a clever idea! I wanted to try reading in larger chunks, but
I wasn't sure how to make sure each chunk ended on a whole word until you
suggested following it with a regular read. Using 64K chunks, I was able
to shave about 7s off the run time in my testing:

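# Requires bash 5.1 or later for the ${var@L} lowercase transformation.
declare -iA words_to_freq
eof='false'
# Disable globbing so words such as * or ? are not expanded as patterns.
set -o noglob
while [[ "${eof}" == 'false' ]]; do
  # Read up to 65536 characters; read fails at end of input but still
  # stores whatever it managed to read in block.
  if ! LANG='C' IFS='' read -N 65536 -r block; then
    eof='true'
  fi
  # Read the rest of the current line so the chunk ends on a word boundary.
  if ! IFS='' read -r line; then
    eof='true'
  fi
  # The unquoted expansion splits the lowercased text into words.
  for word in ${block@L}${line@L}; do
    words_to_freq["${word}"]+=1
  done
done
set +o noglob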
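
Since the chunked version quoted above produced different counts, it may be
useful to check that the result does not depend on the chunk size. The
following is a minimal sketch of the same loop wrapped in a function for that
purpose. The function name count_words_chunked and its chunk-size argument
are hypothetical, not from the thread, and the sketch uses ${var,,} instead
of ${var@L} on the assumption that it should also run on bash 4.x.

```shell
#!/usr/bin/env bash
# Sketch only: count_words_chunked is a hypothetical helper name.
# Reads stdin in chunks of $1 characters (default 65536) and prints
# "word count" lines sorted by word.
# The body is a subshell ( ... ), so noglob and the variables used here
# do not leak into the caller.
count_words_chunked() (
  chunk_size="${1:-65536}"
  declare -iA freq=()
  eof='false'
  set -o noglob   # do not expand *, ? or [ found in the input
  while [[ "${eof}" == 'false' ]]; do
    # read -N fails at end of input but still stores the partial chunk.
    if ! IFS='' read -r -N "${chunk_size}" block; then
      eof='true'
    fi
    # Finish the line the chunk stopped in, so no word is cut in two.
    if ! IFS='' read -r line; then
      eof='true'
    fi
    # Assumption: ${var,,} (bash 4+) stands in for ${var@L} (bash 5.1+).
    for word in ${block,,}${line,,}; do
      freq["${word}"]+=1
    done
  done
  for word in "${!freq[@]}"; do
    printf '%s %d\n' "${word}" "${freq[${word}]}"
  done | LC_ALL=C sort
)
```

Running count_words_chunked 1 and count_words_chunked 65536 over the same
input (for example, redirected from a file) and comparing the two outputs
should show no differences. A mismatch would suggest that words are being
split or dropped at chunk boundaries.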
