bug#70511: Option to grep into compressed files
From: David G. Pickett
Subject: bug#70511: Option to grep into compressed files
Date: Tue, 23 Apr 2024 22:51:45 +0000 (UTC)
A shell script can take file names from find or ls with 'while read', or by
globbing with 'for f in pattern', and examine them one by one: run 'grep -q'
to find out whether the file, or the uncompressed stream from that file, has a
match, and if so 'echo' the file name. If you want the matching lines,
'while read l' can take the stream out of grep and prefix each line with the
file name in an 'echo'. It helps to juggle streams rather than file names, and
to create streams rather than temp files, which have to be cleaned up and add
delay. In bash, 'while read' sometimes gets tricky because a loop fed by a
pipe runs in a subshell, so its variables are local to the loop; a parenthesis
wrapper around the loop and whatever uses those variables sometimes helps.
Both ksh and bash also have the nice '<(command)' feature to turn a command's
stdout into an input file name, and '>(command)' to turn an output file name
into a stream feeding a command. Bash has so many nice tricks that I often
google for them, such as how to recognize a pattern in an 'if'. If you do not
trust extensions, you can use '$(file filename)' to find out what you have in
hand:
$ echo $(file .profile)
.profile: ASCII text
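
The loop described above might be sketched roughly as follows. This is only
an illustration under stated assumptions: the function names zmatch_files and
zmatch_lines are made up for this example, only gzip is handled, and file
names are assumed not to contain newlines. It relies on 'gzip -cdf' copying
input that is not in gzip format to stdout unchanged, so plain files are
searched too.

```shell
# Hypothetical helpers, not part of grep or gzip.
# Assumption: file names contain no newline characters.

# Print the name of every file under DIR whose (decompressed) content
# matches PATTERN.  The body is a parenthesis wrapper, so the variables
# stay local to the function.
zmatch_files() (
    pattern=$1
    dir=$2
    find "$dir" -type f | while IFS= read -r f; do
        # gzip -cdfq decompresses gzip data and passes anything else
        # through unchanged; nothing is written to a temp file.
        if gzip -cdfq < "$f" | grep -q -e "$pattern"; then
            printf '%s\n' "$f"
        fi
    done
)

# Print every matching line, prefixed with the name of its file.
zmatch_lines() (
    pattern=$1
    dir=$2
    find "$dir" -type f | while IFS= read -r f; do
        gzip -cdfq < "$f" | grep -e "$pattern" |
        while IFS= read -r l; do
            printf '%s:%s\n' "$f" "$l"
        done
    done
)
```

For example, 'zmatch_files pattern123 /var/log' would list the matching files
under /var/log, compressed or not. Other formats would need their own
decompressor in place of gzip, chosen by extension or by what 'file' reports.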
On Tuesday, April 23, 2024 at 11:21:26 AM EDT, Mary <marycada@proton.me>
wrote:
> Thanks for the suggestion. You're right, this would be better than zgrep
> etc.
>
> I have some qualms though, as the new option would increase the attack
> surface for 'grep', in that you could then execute arbitrary code by
> passing certain options to 'grep'. Is there some safer way to get what
> you want?
There is still the possibility of including the respective compression
libraries directly in grep and using `-Z` and `-J` as proposed, but this
would not allow the use of less popular compression algorithms.
One possibility, but I'm not sure what it's worth, would be to give grep a
special arg0 to enable shell commands, like `jgrep zcat pattern123 file.gz`.
But I'm not sure if it's worth the trouble.
> One supposes that if the file extension is not trustworthy, one can inspect
> the file's content, as the file command does, and use libraries such as the
> gzip libraries to handle gzipped files as a stream. There are so many
> others: zip files could be treated like directories, with all the files in
> them that match the glob being searched, and then there is bzip2, 7zip, ....
> It becomes a popularity contest! One can do all this with shell scripting,
> and leave poor old grep out of it!
The reason why I wanted to do this in grep directly is because it's difficult
to implement this with shell scripting. I noticed that neither zgrep, bzgrep
nor xzgrep support the `-r` option, among others, presumably because it's too
difficult to implement in a portable way.
I made my patch use a shell command specifically to provide maximum flexibility
with minimum maintenance cost. But it does open the door to security risks, so
I understand if it's not worth adding to grep.