[PATCH] [RFC] add support for input/output count of lines
From: Roberto Nibali
Subject: [PATCH] [RFC] add support for input/output count of lines
Date: Fri, 01 Oct 2004 09:48:47 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040913
Hello,
[I'm not subscribed, so please cc to all addresses if you would like to get
feedback]
The attached patch adds support for printing input/output line counters on
stderr through two new parameters, '-j' and '-g'. The rationale behind this is
as follows:
As dumbfounded as we seem to be, we simply couldn't find a way to reduce the 3
fork()s needed to achieve the following:
1. get the number of lines in the file being grep'd.
2. get the number of lines where the pattern matched.
3. display the matching lines.
So, here is a 'theoretical trace' of what we do [3 fork()'s]:
$ wc -l < /var/log/really_big_file
$ grep -c foobar /var/log/really_big_file
$ grep foobar /var/log/really_big_file
With the new patch you can do it with one fork:
$ grep -j -g foobar /var/log/really_big_file
or the long opt version:
$ grep --input-lines --output-lines foobar /var/log/really_big_file
Et voila, it gives you the matching lines and prints two lines to stderr,
input_lines=... and output_lines=...
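For comparison, a similar single-pass result can be approximated with stock
tools. The following is only a sketch, assuming a POSIX shell and an awk that
accepts "/dev/stderr" as an output target (gawk and mawk do). The function name
count_grep is made up for illustration and is not part of grep or of the patch,
and the pattern is interpreted as an awk extended regular expression rather
than as a grep pattern.

```shell
#!/bin/sh
# Hypothetical helper (not part of grep or of the patch): read the file once,
# print matching lines on stdout and both counters on stderr.
count_grep() {
    pattern=$1
    file=$2
    awk -v pat="$pattern" '
        $0 ~ pat { print; matched++ }   # emit each matching line and count it
        END {
            # NR holds the number of input lines read; matched defaults to 0
            printf "input_lines=%d\noutput_lines=%d\n", NR, matched > "/dev/stderr"
        }
    ' "$file"
}

# Usage (matches go to stdout, counters to a separate file):
#   count_grep foobar /var/log/really_big_file 2>counts.txt
```

This still costs one fork()/exec(), but the file is read only once, which is
the saving the patch is after.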
Normally I wouldn't really care about 3 processes and the resulting fork()s,
but if you're processing a really big file (>2GB) there is significant
overhead, not so much in the fork() and exec() as in read()'ing and lseek()'ing
the file three times. Doing it in one go would be preferable in many situations.
We did of course maintain 100% backwards compatibility with the existing feature
set and we also tested the added functionality. So if you guys are not against
including such a thing for the next drop of grep, I'd be grateful if you could
merge this little non-intrusive patch with your CVS tree. For us it's one patch
less to maintain :).
Thanks and best regards,
Roberto Nibali, ratz
--
-------------------------------------------------------------
addr://Rathausgasse 31, CH-5001 Aarau tel://++41 62 823 9355
http://www.terreactive.com fax://++41 62 823 9356
-------------------------------------------------------------
terreActive AG We secure your success
-------------------------------------------------------------
Visit us on 29 and 30 September at security-zone.info
Further information is available at www.security-zone.info
We look forward to welcoming you at the trade fair.
-------------------------------------------------------------
diff -ur grep-2.5-uclibc/src/grep.c grep-2.5-uclibc-fixed_DESTDIR-countpatch/src/grep.c
--- grep-2.5-uclibc/src/grep.c 2002-03-13 07:49:52.000000000 -0700
+++ grep-2.5-uclibc-fixed_DESTDIR-countpatch/src/grep.c 2004-09-29 06:32:43.000000000 -0600
@@ -80,7 +80,7 @@
static struct exclude *included_patterns;
/* Short options. */
static char const short_options[] =
-"0123456789A:B:C:D:EFGHIPUVX:abcd:e:f:hiKLlm:noqRrsuvwxyZz";
+"0123456789A:B:C:D:EFGHIPUVX:abcd:e:f:ghijKLlm:noqRrsuvwxyZz";
/* Non-boolean long options that have no corresponding short equivalents. */
enum
@@ -143,6 +143,8 @@
{"version", no_argument, NULL, 'V'},
{"with-filename", no_argument, NULL, 'H'},
{"word-regexp", no_argument, NULL, 'w'},
+ {"input-lines", no_argument, NULL, 'j'},
+ {"output-lines", no_argument, NULL, 'g'},
{0, 0, 0, 0}
};
@@ -439,6 +441,8 @@
static int out_invert; /* Print nonmatching stuff. */
static int out_file; /* Print filenames. */
static int out_line; /* Print line numbers. */
+static int stderr_input; /* Print input line count on stderr. */
+static int stderr_output; /* Print output line count on stderr. */
static int out_byte; /* Print byte offsets. */
static int out_before; /* Lines of leading context. */
static int out_after; /* Lines of trailing context. */
@@ -514,11 +518,13 @@
{
if (out_file)
printf ("%s%c", filename, sep & filename_mask);
+
if (out_line)
{
nlscan (beg);
totalnl = add_count (totalnl, 1);
- print_offset_sep (totalnl, sep);
+ if (!stderr_input && !stderr_output)
+ print_offset_sep (totalnl, sep);
lastnl = lim;
}
if (out_byte)
@@ -945,6 +951,12 @@
status = count + 2;
else
{
+ if (stderr_input)
+ fprintf (stderr, _("input_lines=%lu\n"), (unsigned long) totalnl);
+
+ if (stderr_output)
+ fprintf (stderr, _("output_lines=%d\n"), count);
+
if (count_matches)
{
if (out_file)
@@ -1100,6 +1112,8 @@
-L, --files-without-match only print FILE names containing no match\n\
-l, --files-with-matches only print FILE names containing matches\n\
-c, --count only print a count of matching lines per FILE\n\
+ -j, --input-lines print total number of lines on STDERR\n\
+ -g, --output-lines print count of matching lines on STDERR\n\
-Z, --null print 0 byte after FILE name\n"));
printf (_("\
\n\
@@ -1470,10 +1484,20 @@
keys[keycc++] = '\n';
break;
+ case 'g':
+ out_line = 1;
+ stderr_output = 1;
+ break;
+
case 'h':
no_filenames = 1;
break;
+ case 'j':
+ out_line = 1;
+ stderr_input = 1;
+ break;
+
case 'i':
case 'y': /* For old-timers . . . */
match_icase = 1;
@@ -1622,7 +1646,6 @@
default:
usage (2);
break;
-
}
/* POSIX.2 says that -q overrides -l, which in turn overrides the