* Enhancing --show-function and --function-context in default configurations @ 2021-08-01 21:40 heapcrash heapcrash 2021-08-02 2:21 ` Junio C Hamano 2021-08-02 8:45 ` Ævar Arnfjörð Bjarmason 0 siblings, 2 replies; 7+ messages in thread From: heapcrash heapcrash @ 2021-08-01 21:40 UTC (permalink / raw) To: git Hello all! Thanks to all of the Git community for making such amazing tools. Some of my favorite features of Git are the --show-function and --function-context features of git {grep,log,diff}. However, the default configuration leaves a bit to be desired -- setting some simple flags in ~/.gitattributes for e.g. *.cpp diff=cpp *.py diff=python Makes these features MUCH more accurate and usable. However, one has to know about gitattributes, diff filters, xfuncname, etc. in order to turn these settings on. I'd like to contribute changes to Git that makes the "obvious" correlations be the default setting. Before I start, I wanted to gauge whether these changes would be accepted or not. As far as I can tell, these would not change the default behavior of plain git {grep,log,diff} unless the --show-function or --function-context flags are specified -- and if they are, the results would be improved. Should I work on a patch that does this? Thanks in advance, Heapcrash ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Enhancing --show-function and --function-context in default configurations 2021-08-01 21:40 Enhancing --show-function and --function-context in default configurations heapcrash heapcrash @ 2021-08-02 2:21 ` Junio C Hamano 2021-08-02 8:45 ` Ævar Arnfjörð Bjarmason 1 sibling, 0 replies; 7+ messages in thread From: Junio C Hamano @ 2021-08-02 2:21 UTC (permalink / raw) To: heapcrash heapcrash; +Cc: git heapcrash heapcrash <heapcrash@gmail.com> writes: > gauge whether these changes would be accepted or not. As far as I can > tell, these would not change the default behavior of plain git > {grep,log,diff} unless the --show-function or --function-context flags > are specified Two other places may be diff hunk headers and --diff-words output, I think. > Should I work on a patch that does this? That depends. If you are going to introduce a mechanism to introduce hardcoded configurations, depending on how it is done, it will become a huge security headache. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Enhancing --show-function and --function-context in default configurations 2021-08-01 21:40 Enhancing --show-function and --function-context in default configurations heapcrash heapcrash 2021-08-02 2:21 ` Junio C Hamano @ 2021-08-02 8:45 ` Ævar Arnfjörð Bjarmason 2021-08-02 16:17 ` Jeff King 1 sibling, 1 reply; 7+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-08-02 8:45 UTC (permalink / raw) To: heapcrash heapcrash; +Cc: git On Sun, Aug 01 2021, heapcrash heapcrash wrote: > Hello all! > > Thanks to all of the Git community for making such amazing tools. > > Some of my favorite features of Git are the --show-function and > --function-context features of git {grep,log,diff}. > > However, the default configuration leaves a bit to be desired -- > setting some simple flags in ~/.gitattributes for e.g. > > *.cpp diff=cpp > *.py diff=python > > Makes these features MUCH more accurate and usable. However, one has > to know about gitattributes, diff filters, xfuncname, etc. in order to > turn these settings on. > > I'd like to contribute changes to Git that makes the "obvious" > correlations be the default setting. Before I start, I wanted to > gauge whether these changes would be accepted or not. As far as I can > tell, these would not change the default behavior of plain git > {grep,log,diff} unless the --show-function or --function-context flags > are specified -- and if they are, the results would be improved. > > Should I work on a patch that does this? As Junio hinted at in his reply this doesn't just change diff behavior with the -W or --function-context flag, but changes the work we do (even if the output is the same) on all unified diff output. I.e. to generate the "@@" context line we by default use a C function in xdiff/, this is replaced with a regex-in-a-loop invoking the userdiff.[ch] code. I would like to see us have a setting to turn these on by default, but think it would be better to make that a diff.* config setting to put into ~/.gitconfig, i.e. we'd extend git itself to know about a list of extensions for the given userdiff drivers, and use them when rendering diffs. It makes emitting the diffs take more CPU, but the same is true of other options like colorMoved etc, so that in itself is not a dealbreaker. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Enhancing --show-function and --function-context in default configurations 2021-08-02 8:45 ` Ævar Arnfjörð Bjarmason @ 2021-08-02 16:17 ` Jeff King 2021-08-02 17:06 ` Junio C Hamano 0 siblings, 1 reply; 7+ messages in thread From: Jeff King @ 2021-08-02 16:17 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: heapcrash heapcrash, git On Mon, Aug 02, 2021 at 10:45:25AM +0200, Ævar Arnfjörð Bjarmason wrote: > I would like to see us have a setting to turn these on by default, but > think it would be better to make that a diff.* config setting to put > into ~/.gitconfig, i.e. we'd extend git itself to know about a list of > extensions for the given userdiff drivers, and use them when rendering > diffs. A long time ago we discussed doing this. The relevant thread is: https://lore.kernel.org/git/20111216110000.GA15676@sigill.intra.peff.net/ which references a few others: https://lore.kernel.org/git/4E569F10.8060808@panasas.com/ https://lore.kernel.org/git/4E6E928A.6080003@sunshineco.com/ From my re-read of the threads, it seemed like people were mostly on board with the idea, but: - there was some question about a few of the more obscure extensions (e.g., '.m' as objective-c versus matlab). Obviously we could drop any contentious ones, though I do wonder if this is simply opening up a can of worms where people will fight about what ought to go into the "official" list. - there was some question over whether some of our builtin funcname regexps were actually worse than the default behavior (especially the "cpp" one). In response we started using .gitattributes more heavily within git.git itself, and we've seen quite a few improvements in the intervening decade. So hopefully there is no reason anybody would _not_ want the content-specific driver to kick in, assuming that the extension mapping is accurate. There should be no security risk or anything like that, since we'd already respect such a mapping from an untrusted repository via its .gitattributes file. It looks like Junio picked up the patch at one point, but it got bumped to the next release cycle. It was marked as "expecting a re-roll" until finally getting discarded in Oct 2011; you can see the progression by searching for "jk/default-attr": https://lore.kernel.org/git/?q=jk%2Fdefault-attr Then in Dec 2011 I posted that re-roll (the top link I gave above), and it looks like it never got picked up. I don't see any conclusions either way in the thread, so I think it just kind of died out due to lack of anybody pushing it forward (and I don't remember making a decision either way on that; it may have been the "can of worms" thing above scaring me off). -Peff ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Enhancing --show-function and --function-context in default configurations 2021-08-02 16:17 ` Jeff King @ 2021-08-02 17:06 ` Junio C Hamano 2021-08-03 2:34 ` heapcrash heapcrash 0 siblings, 1 reply; 7+ messages in thread From: Junio C Hamano @ 2021-08-02 17:06 UTC (permalink / raw) To: Jeff King Cc: Ævar Arnfjörð Bjarmason, heapcrash heapcrash, git Jeff King <peff@peff.net> writes: > On Mon, Aug 02, 2021 at 10:45:25AM +0200, Ævar Arnfjörð Bjarmason wrote: > >> I would like to see us have a setting to turn these on by default, but >> think it would be better to make that a diff.* config setting to put >> into ~/.gitconfig, i.e. we'd extend git itself to know about a list of >> extensions for the given userdiff drivers, and use them when rendering >> diffs. > > A long time ago we discussed doing this. The relevant thread is: > > https://lore.kernel.org/git/20111216110000.GA15676@sigill.intra.peff.net/ > > which references a few others: > > https://lore.kernel.org/git/4E569F10.8060808@panasas.com/ > > https://lore.kernel.org/git/4E6E928A.6080003@sunshineco.com/ > ... Thanks for pointers. One good suggestion given there was to use diff=c and diff=perl in our own .gitattributes to use the patterns ourselves, which we seem to have been doing just fine ;-) As long as the default built-in ones are (1) at least 90% of the time improvement over, or at least is not broken compared to, the unconfigured case, and (2) at the lowest priority that users can easily countermand for the rest 10% cases I do not think it is too bad to resurrect the old patches from these threads. Thanks. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Enhancing --show-function and --function-context in default configurations 2021-08-02 17:06 ` Junio C Hamano @ 2021-08-03 2:34 ` heapcrash heapcrash 2021-08-03 14:55 ` Jeff King 0 siblings, 1 reply; 7+ messages in thread From: heapcrash heapcrash @ 2021-08-03 2:34 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Ævar Arnfjörð Bjarmason, git Thanks all for the feedback! I'll try to address it below: > Two other places may be diff hunk headers and --diff-words output, I think. I didn't think of those. Does this mean that diffs generated with a given e.g. diff=cpp configuration might not apply cleanly if run on some other user's system without that setting? > That depends. If you are going to introduce a mechanism to > introduce hardcoded configurations, depending on how it is done, it > will become a huge security headache. I don't intend to add hard-coded configurations, just hard-coded defaults. User configurations (~/.gitattributes and gitconfig diff.funcname/diff.xcfuncname) will still take precedence. > It makes emitting the diffs take more CPU, but the same is true of other > options like colorMoved etc, so that in itself is not a dealbreaker. I didn't think of this scenario, that it would add CPU time even without -W/--function-context/--show-function. I'd definitely be fine with preserving the current behavior in these cases (more below). > As long as the default built-in ones are > > (1) at least 90% of the time improvement over, or at least is not > broken compared to, the unconfigured case, and > > (2) at the lowest priority that users can easily countermand for > the rest 10% cases > > I do not think it is too bad to resurrect the old patches from these The currently-hard-coded regexps for e.g. cpp, python, objc, etc. (in userdiff.c) are all VAST improvements over the default, which is effectively '^[A-Za-z0-9_:]+' (i.e., any word which starts on the zeroth column, IIRC). This has lots of issues for e.g. Python where the function name is often indented (e.g. inside a class definition) and cpp is compatible with, and an improvement over, the default "C" setting (which is particularly broken as it marks any "goto" label as a function start / end). All of those points covered, I think we can make this work in a way that preserves backward compatibility (no defaults at all, user must manually configure ~/.gitattributes) by only setting "smart" defaults (e.g. "*.cpp diff=cpp", "*.cc diff=cpp" etc) when the command is EXPLICITLY invoked with -W/--function-context/--show-function. Another case I didn't think of in the original post is with the "git log -L:funcname:path/to/file.cpp" option, which tracks changes within a function over time. Having "better" default function boundary detection here would also be useful. Ultimately, I agree we should not have any extra overhead for any commands (log, diff, grep, etc) unless the user EXPLICITLY uses flags that indicate function boundary detection are key to functionality of those flags. On Mon, Aug 2, 2021 at 12:06 PM Junio C Hamano <gitster@pobox.com> wrote: > > Jeff King <peff@peff.net> writes: > > > On Mon, Aug 02, 2021 at 10:45:25AM +0200, Ævar Arnfjörð Bjarmason wrote: > > > >> I would like to see us have a setting to turn these on by default, but > >> think it would be better to make that a diff.* config setting to put > >> into ~/.gitconfig, i.e. we'd extend git itself to know about a list of > >> extensions for the given userdiff drivers, and use them when rendering > >> diffs. > > > > A long time ago we discussed doing this. The relevant thread is: > > > > https://lore.kernel.org/git/20111216110000.GA15676@sigill.intra.peff.net/ > > > > which references a few others: > > > > https://lore.kernel.org/git/4E569F10.8060808@panasas.com/ > > > > https://lore.kernel.org/git/4E6E928A.6080003@sunshineco.com/ > > ... > > Thanks for pointers. > > One good suggestion given there was to use diff=c and diff=perl in > our own .gitattributes to use the patterns ourselves, which we seem > to have been doing just fine ;-) > > As long as the default built-in ones are > > (1) at least 90% of the time improvement over, or at least is not > broken compared to, the unconfigured case, and > > (2) at the lowest priority that users can easily countermand for > the rest 10% cases > > I do not think it is too bad to resurrect the old patches from these > threads. > > Thanks. > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Enhancing --show-function and --function-context in default configurations 2021-08-03 2:34 ` heapcrash heapcrash @ 2021-08-03 14:55 ` Jeff King 0 siblings, 0 replies; 7+ messages in thread From: Jeff King @ 2021-08-03 14:55 UTC (permalink / raw) To: heapcrash heapcrash Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, git On Mon, Aug 02, 2021 at 09:34:32PM -0500, heapcrash heapcrash wrote: > Thanks all for the feedback! I'll try to address it below: > > > Two other places may be diff hunk headers and --diff-words output, I think. > > I didn't think of those. Does this mean that diffs generated with a > given e.g. diff=cpp configuration might not apply cleanly if run on > some other user's system without that setting? No, because the text in the hunk headers is purely cosmetic. They are just for humans, and applying the patch ignores them. The word-diffs are likewise just meant for human consumption, and can't be applied at all. > > It makes emitting the diffs take more CPU, but the same is true of other > > options like colorMoved etc, so that in itself is not a dealbreaker. > > I didn't think of this scenario, that it would add CPU time even > without -W/--function-context/--show-function. I'd definitely be fine > with preserving the current behavior in these cases (more below). I think the CPU increase is negligible. Either way, the funcname header involves walking backwards through the file applying a pattern to see if a line is a good candidate. The type-specific patterns are a little more complicated, but I doubt the difference is even measurable in practice. If we are going to have defaults for extension-to-diff mapping, I would prefer apply them consistently (so for -W, but also for funcnames, word-diffs, etc). That is both easier to implement and easier to explain to users (and assuming our mapping is accurate, more likely to be what they want). > Another case I didn't think of in the original post is with the "git > log -L:funcname:path/to/file.cpp" option, which tracks changes within > a function over time. Having "better" default function boundary > detection here would also be useful. Yes, I think this uses the same diff funcname patterns, so once we have default mappings, it would start to use them. -Peff ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2021-08-03 14:55 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-08-01 21:40 Enhancing --show-function and --function-context in default configurations heapcrash heapcrash 2021-08-02 2:21 ` Junio C Hamano 2021-08-02 8:45 ` Ævar Arnfjörð Bjarmason 2021-08-02 16:17 ` Jeff King 2021-08-02 17:06 ` Junio C Hamano 2021-08-03 2:34 ` heapcrash heapcrash 2021-08-03 14:55 ` Jeff King
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.