Re: grep is horriby slow in UTF-8 locales
From: Danilo Segan
Subject: Re: grep is horriby slow in UTF-8 locales
Date: Fri, 07 Nov 2003 16:49:58 +0100
User-agent: Gnus/5.1002 (Gnus v5.10.2) Emacs/21.3.50 (gnu/linux)
Markus Kuhn <address@hidden> writes:
> $ grep --version
> grep (GNU grep) 2.5.1
This doesn't happen with:
$ grep --version
grep (GNU grep) 2.4.2
$ LC_ALL=POSIX time grep XYZ test.txt
Command exited with non-zero status 1
0.03user 0.07system 0:00.36elapsed 27%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (118major+25minor)pagefaults 0swaps
$ LC_ALL=sr_CS.UTF-8 time grep XYZ test.txt
Command exited with non-zero status 1
0.06user 0.05system 0:00.10elapsed 105%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (143major+50minor)pagefaults 0swaps
$ LC_ALL=en_GB.UTF-8 time grep XYZ test.txt
Command exited with non-zero status 1
0.06user 0.04system 0:00.15elapsed 64%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (128major+48minor)pagefaults 0swaps
$ LC_ALL=POSIX time grep XYZ test.txt
Command exited with non-zero status 1
0.04user 0.06system 0:00.10elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (118major+25minor)pagefaults 0swaps
The last example shows that the CPU usage figure is not a reliable
basis for conclusions: the same POSIX run reported 27% CPU the first
time and 100% the second. (sr_CS.UTF-8 is my everyday locale, and I
would really notice if grep had any problems with it.)
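Since single runs vary this much, a rough way to repeat the comparison
is to do one untimed warm-up run and then time several runs per locale.
This is only a sketch, assuming bash (it relies on the `time` keyword
and TIMEFORMAT); `bench_grep` is a hypothetical helper name, not
something shipped with grep.

```shell
# bench_grep LOCALE PATTERN FILE [RUNS]
# Hypothetical helper, assumes bash: times grep RUNS times under LOCALE.
bench_grep() {
    local loc=$1 pattern=$2 file=$3 runs=${4:-5} i=1
    local TIMEFORMAT='%R real  %U user  %S sys'
    # Untimed warm-up run, so the file is likely cached before measuring.
    LC_ALL=$loc grep -- "$pattern" "$file" >/dev/null || true
    while [ "$i" -le "$runs" ]; do
        printf '%s run %d: ' "$loc" "$i"
        # grep exits with status 1 when nothing matches; that is expected.
        { time LC_ALL=$loc grep -- "$pattern" "$file" >/dev/null || true; } 2>&1
        i=$((i + 1))
    done
    return 0
}
```

For example, `bench_grep POSIX XYZ test.txt` followed by
`bench_grep sr_CS.UTF-8 XYZ test.txt` prints one timing line per run,
which makes it easier to compare the user and sys times rather than the
CPU percentage.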
test.txt was produced with:
for i in 1 2 3 4 5 6 7 8 9 0; do cat UnicodeData.txt >>test.txt; done
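Note that the loop above appends, so running it a second time doubles
test.txt. A minimal sketch of a repeatable variant follows; the function
name `build_test_file` and its arguments are made up for illustration.

```shell
# build_test_file SRC DEST [COPIES]
# Hypothetical helper: writes COPIES (default 10) copies of SRC to DEST,
# truncating DEST first so repeated runs give the same file.
build_test_file() {
    btf_src=$1
    btf_dest=$2
    btf_copies=${3:-10}
    : > "$btf_dest"          # start from an empty file
    btf_i=0
    while [ "$btf_i" -lt "$btf_copies" ]; do
        cat "$btf_src" >> "$btf_dest"
        btf_i=$((btf_i + 1))
    done
}
```

With this, `build_test_file UnicodeData.txt test.txt` should produce the
same input as the one-liner on a fresh directory.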
I can get a newer grep today, if you think I may experience different
results with it.
Cheers,
Danilo