emacs-bug-tracker
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#66806: closed (30.0.50; [PATCH] 'project-find-regexp' passes Git sub


From: GNU bug Tracking System
Subject: bug#66806: closed (30.0.50; [PATCH] 'project-find-regexp' passes Git submodules to the search program)
Date: Mon, 30 Oct 2023 02:02:02 +0000

Your message dated Mon, 30 Oct 2023 04:00:51 +0200
with message-id <a1430ace-413f-1a95-ccc4-6b2dd715c2dc@gutov.dev>
and subject line Re: bug#66806: 30.0.50; [PATCH] 'project-find-regexp' passes 
Git submodules to the search program
has caused the debbugs.gnu.org bug report #66806,
regarding 30.0.50; [PATCH] 'project-find-regexp' passes Git submodules to the 
search program
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs@gnu.org.)


-- 
66806: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=66806
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems
--- Begin Message --- Subject: 30.0.50; [PATCH] 'project-find-regexp' passes Git submodules to the search program Date: Sat, 28 Oct 2023 22:36:07 -0700
X-Debbugs-Cc: dmitry@gutov.dev

To see this in action, you can do the following, starting from "emacs -Q" inside of a Git repo that contains submodules:

  M-x trace-function RET project-files RET
  C-x p g some-string RET

If you look at the trace, you'll see that the files in your submodules are returned from 'project-files', but so is the submodule directory. That's not really correct, since 'project-files' is supposed to return *files*, not directories. (There are already workarounds for this in 'project-search' and 'project-query-replace-regexp'.)

By default, this is just a theoretical problem, but if you customize 'xref-search-program-alist' and 'xref-search-program' to include some other program, this can cause real issues. For example, I tried to add "ag" to this list[1], and unfortunately, it just doesn't work in this case. The results for submodules get duplicated, and there's no way I can see with ag to search only the specified *files*, ignoring any directories. (Looking at the definition for ripgrep, I'm guessing the "-g '!*/'" is the trick for that program, but nothing similar works for ag.)

Attached are two possible patches for this: a minimal version that just fixes 'project-find-regexp', and a maximal version that fixes this in general, and should theoretically speed up 'project-search' and 'project-query-replace-regexp' since they no longer need to call 'file-regular-p' on every file.

Do either of these patches look ok?

[1] This is a long story, simplified for this message since the gory details aren't especially relevant.

Attachment: minimal_0001-Exclude-Git-submodules-when-getting-list-of-files-fo.patch
Description: Text document

Attachment: maximal_0001-Exclude-Git-submodules-from-project-files.patch
Description: Text document


--- End Message ---
--- Begin Message --- Subject: Re: bug#66806: 30.0.50; [PATCH] 'project-find-regexp' passes Git submodules to the search program Date: Mon, 30 Oct 2023 04:00:51 +0200 User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0
Version: 30.1

On 30/10/2023 02:58, Jim Porter wrote:
On 10/29/2023 2:41 PM, Dmitry Gutov wrote:
And that's not to mention usage over Tramp (which would be affected by the +1 process call that you mentioned as well, but that seems unavoidable).

Yeah, I don't see a way around that, unless we constructed a complex command to run via Tramp that does it all in one go.

(I looked into using the "--stage" argument for "git ls-files", which gets most of the way to fixing this, but you could break that logic with evil file names not in the tree...)

Filtering out gitlinks based on the mode bits? Clever.

Would probably slow down the parsing of the output, though. Not sure by how much.

Anyway, after recent experience micro-optimizing list operations, I came up with this version where the impact seems minimal.

WDYT?

Thanks, that helped form the basis for the attached patch. It moves the 'member' call into the lambda for the original 'mapcar', and then we can just 'delq' all the nil elements. Benchmarking against the Emacs repo, this still seems to have a minimal perf impact.

Ah, nice. delq is pretty handy to abbreviate the relinking.

We could combine both steps to eliminate an extra list altogether, but that only yields a further 2-3% improvement. Maybe later.

There might also be a benefit to using "git ls-files --recurse-submodules" when we can (i.e. when not using "-o"), but maybe we can leave that optimization for another day.

As long as it doesn't result in too much code duplication.

Anyway, I've pushed your latest patch. Thanks! And closing.


--- End Message ---

reply via email to

[Prev in Thread] Current Thread [Next in Thread]