bug#44983: Truncate long lines of grep output

bug-gnu-emacs

From:	Juri Linkov
Subject:	bug#44983: Truncate long lines of grep output
Date:	Wed, 09 Dec 2020 21:17:28 +0200
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (x86_64-pc-linux-gnu)

>>> Alternatively, xref--collect-matches-1 could apply the limit itself, no
>>> matter whether grep or rg is used. And it could make sure to only do that
>>> after the last match. This might be the slower option, but hard to say in
>>> advance, some comparison benchmark could help here.
>> I think until a long string is inserted to the buffer, truncating the
>> string in the variable in xref--collect-matches-1 should be much faster.
>
> It would surely be faster, but how would that overhead compare to the
> whole operation?
>
> Could be negligible, except in the most extreme cases. After all, the main
> slowdown factor with long strings is the display engine, and it won't be in
> play there.
>
> The upside is we'd be able to support column limiting with Grep too. Which
> is the default configuration. And we'd extract the cutoff column into
> a more visible user option.

This is exactly what we need.  After that this bug report/feature request
can be closed.
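
The idea quoted above (limit the string itself before it ever reaches a
buffer, and cut only after the last match) can be sketched roughly as
follows.  This is an illustration in Python, not the code in
xref--collect-matches-1; the function name, its arguments and the
representation of matches as (start, end) offsets are all assumptions
made for the example.

```python
def truncate_after_last_match(line, match_spans, limit):
    """Cut `line` at `limit` characters, but never inside or before a match.

    `match_spans` is a hypothetical list of (start, end) offsets of the
    matches found on this line; `limit` is the hypothetical cutoff column.
    """
    # End offset of the last match, or 0 when the line has no matches.
    last_match_end = max((end for _start, end in match_spans), default=0)
    # Keep at least everything up to the end of the last match.
    cutoff = max(limit, last_match_end)
    return line[:cutoff]
```

Because this is plain string slicing, its cost is proportional to the
kept prefix at most, and the display engine never sees the long tail.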

BTW, for sorting currently xref-search-program-alist uses:

    "| sort -t: -k1,1 -k2n,2"

but fortunately ripgrep has a special option to do the same with:

    "--sort path"
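
For readers unfamiliar with the sort invocation: it orders the output by
the first colon-separated field (the file name) and then numerically by
the second (the line number).  A rough Python equivalent, assuming
"FILE:LINE:TEXT" lines and file names that contain no colon, might look
like this.  The function name is made up for the example, and locale
collation, which sort(1) applies to the file name, is ignored here.

```python
def sort_grep_lines(lines):
    """Order "FILE:LINE:TEXT" lines by file name, then by line number."""
    def key(line):
        # Split off only the first two fields; TEXT may itself contain colons.
        path, lineno, _text = line.split(":", 2)
        return (path, int(lineno))
    return sorted(lines, key=key)
```

The numeric comparison matters: with a plain string sort, line 10 of a
file would be listed before line 9.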

>>> That aside, could you explain the difference between the regexps? Do grep
>>> and rg use different colors or something like that? Ideally, of course,
>>> that would be just 1 regexp (if that's possible without loss in
>>> performance, or significant loss in clarity).
>> They should be merged into one regexp indeed.  Because after customizing it
>> to the rg regexp, grep output doesn't highlight matches anymore (I use both
>> grep and rg interchangeably by different commands).
>> Currently their separate regexps are:
>> grep:
>> "\033\\[0?1;31m
>>   \\(.*?\\)
>>   \033\\[[0-9]*m"
>> rg:
>> "\033\\[[0-9]*m
>>   \033\\[[0-9]*1m
>>   \033\\[[0-9]*1m
>>   \\(.*?\\)
>>   \033\\[[0-9]*0m"
>> That could be combined into one regexp:
>> "\033\\[[0-9?;]*m
>>   \\(?:\033\\[[0-9]*1m\\)\\{0,2\\}
>>   \\(.*?\\)
>>   \033\\[[0-9]*0?m"
>
> Makes sense. Is the parsing performance the same?

Performance is not a problem.  The problem is that a more lax regexp
causes more false positives.  For example, the combined regexp above
highlighted even the separator colons (':') between file names and
column numbers.
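
The effect can be reproduced outside Emacs.  The sketch below translates
the grep regexp and the combined regexp from the quoted text into Python
syntax, assuming the pieces shown on separate lines are meant to be
concatenated.  The sample line is synthetic: it colors the file name,
separators and line number as well as the match, and was not captured
from grep or rg.

```python
import re

# Python translations of the Emacs regexps quoted above
# (\033 becomes \x1b, \\( \\) become ( ), \\{0,2\\} becomes {0,2}).
GREP_RE = re.compile(r"\x1b\[0?1;31m(.*?)\x1b\[[0-9]*m")
COMBINED_RE = re.compile(
    r"\x1b\[[0-9?;]*m(?:\x1b\[[0-9]*1m){0,2}(.*?)\x1b\[[0-9]*0?m"
)

ESC = "\x1b"
# Synthetic sample line (an assumption, not real tool output).
SAMPLE = (
    f"{ESC}[35mfile.c{ESC}[m{ESC}[36m:{ESC}[m"
    f"{ESC}[32m12{ESC}[m{ESC}[36m:{ESC}[m"
    f" int {ESC}[01;31mfoo{ESC}[m = 1;"
)

def highlighted(regexp, line):
    """Return the text each regexp match would mark as highlighted."""
    return regexp.findall(line)

print(highlighted(GREP_RE, SAMPLE))      # ['foo']
print(highlighted(COMBINED_RE, SAMPLE))  # ['file.c', ':', '12', ':', 'foo']
```

The strict regexp picks out only the match, while the combined one
accepts any SGR sequence as an opener and so also captures the file
name, the line number and both colons.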

BTW, it's possible to see all highlighted parts of the output
by changing the argument 'MODE' of 'compilation-start' in 'grep'
from #'grep-mode to t (so it uses comint-mode in grep buffers).

Anyway, I found the shortest change needed to support ripgrep,
and pushed to master.

> Also, with the increased complexity, I'd rather we added a couple of tests,
> or a comment with output examples. Or maybe both.

Fortunately, we have all possible cases listed in etc/grep.txt,
so it was easy to check if everything is highlighted correctly now.
Also I added ripgrep samples to etc/grep.txt.
