bug-gnu-utils

Re: grep: 'binary files' where matches are text


From: eavis
Subject: Re: grep: 'binary files' where matches are text
Date: Fri, 7 Feb 2003 10:33:44 +0000

Stepan Kasal <address@hidden> wrote:

>>what really matters is 
>>whether the _matches_ are binary, not the input files.
>
>I cannot fully agree.  If an executable contains string
>
>                "a few\nlines about penguins\nare here."
>
>and grep finds the word ``penguins'' and happily prints
>
>                lines about penguins
>
>the user could then be surprised when he opens the file in his pico
>editor.

If the user's editor is jumping straight to the line number found by grep, 
there shouldn't be a problem.  If the user is just loading the file, then 
yes he could be surprised to see binary data.  But why?  Only because of 
an assumption that grep's matches will be only in text files.  There is no 
particular reason, IMHO, why that assumption should be correct or why grep 
should hold to it.  After all, traditional Unix grep would happily scan 
both text and binary files.  So any users surprised by loading binary 
files are surprised only because they have gotten used to a new GNU 
behaviour, one which is not necessarily the best way.

My feeling is that grep's job is to look for and print matches in all 
files specified, and it should try to stick to that.  To avoid 
unpleasantness on the user's terminal grep can think twice before printing 
binary garbage, but this necessary evil should interfere as little as 
possible with the job of printing matches.  So better to print as much as 
possible, and only give the 'binary match found' message when it is really 
necessary.

>OTOH, you are right that it's unpleasant when a file is treated as binary
>even though it in fact isn't.
>
>So I see no nice solution.  Perhaps the
>``--binary-files=print_text_matches''
>is the best alternative.

Perhaps.  We could think of more complex schemes like 'if all the matches 
in a file are text, treat it as text, but if some contain binary 
characters, treat it as binary'.  But I don't think that would be 
particularly helpful or worth the extra complexity.  Printing all text 
matches has the benefit of being simple to explain.
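
To make the 'print all text matches' idea concrete, here is a rough 
sketch in Python rather than a patch against grep itself.  Everything in 
it is an assumption for illustration: the names is_binary and 
grep_text_matches are hypothetical, the test for 'binary' (a NUL byte or 
invalid UTF-8) is only one possible heuristic, and the wording of the 
final message is merely modelled on the one grep prints today.

```python
import re


def is_binary(line):
    # Assumed heuristic: a line is "binary" if it contains a NUL byte
    # or cannot be decoded as UTF-8.  Real grep may decide differently.
    if b"\0" in line:
        return True
    try:
        line.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def grep_text_matches(data, pattern, name="(input)"):
    """Return matching text lines; report binary matches only once."""
    regex = re.compile(pattern)
    out = []
    suppressed = False
    for line in data.split(b"\n"):
        if not regex.search(line):
            continue
        if is_binary(line):
            # Do not print garbage, but remember that something was hidden.
            suppressed = True
        else:
            out.append(line.decode("utf-8"))
    if suppressed:
        out.append("Binary file %s matches" % name)
    return out


if __name__ == "__main__":
    sample = b"a few\nlines about penguins\nare here.\n\x00\x01penguins\x00\n"
    for result in grep_text_matches(sample, rb"penguins", "prog"):
        print(result)
```

The decision is made per matching line, so a file with some stray binary 
bytes still yields its text matches, and the 'binary' message appears 
only when a match was actually withheld.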

>But I don't know when I get to it.  Are you willing to donate a patch?

Yes, I will make a patch, but I cannot promise any particular timescale.  
Perhaps I will get around to it this weekend.  My patch will not make the 
new behaviour the default; I can only suggest that and leave the decision 
to you.

-- 
Ed Avis <address@hidden>



