* Memory leak with sparse-checkout @ 2021-09-20 12:15 Calbabreaker 2021-09-20 15:52 ` Taylor Blau 0 siblings, 1 reply; 15+ messages in thread From: Calbabreaker @ 2021-09-20 12:15 UTC (permalink / raw) To: git What did you do before the bug happened? (Steps to reproduce your issue) This was ran: git clone https://github.com/Calbabreaker/piano --sparse cd piano git sparse-checkout add any_text git checkout deploy-frontend git sparse-checkout init --cone git sparse-checkout add any_text Basically clone any repository with --sparse, add anything to sparse-checkout list then checkout to another branch, init the sparse-checkout with --cone and add anything to it again. What did you expect to happen? (Expected behavior) Nothing at all except for the sparse-checkout list to be updated. What happened instead? (Actual behavior) A massive memory leak using all the RAM with one of the cpu cores being at 100% usage. [System Info] git version: git version 2.33.0 cpu: x86_64 no commit associated with this build sizeof-long: 8 sizeof-size_t: 8 shell-path: /bin/sh uname: Linux 5.10.63-1-MANJARO #1 SMP PREEMPT Wed Sep 8 14:13:59 UTC 2021 x86_64 compiler info: gnuc: 11.1 libc info: glibc: 2.33 $SHELL (typically, interactive shell): /usr/bin/zsh ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-20 12:15 Memory leak with sparse-checkout Calbabreaker @ 2021-09-20 15:52 ` Taylor Blau 2021-09-20 16:29 ` Derrick Stolee 0 siblings, 1 reply; 15+ messages in thread From: Taylor Blau @ 2021-09-20 15:52 UTC (permalink / raw) To: Calbabreaker, Derrick Stolee; +Cc: git On Mon, Sep 20, 2021 at 09:45:14PM +0930, Calbabreaker wrote: > What did you do before the bug happened? (Steps to reproduce your issue) > > This was ran: > > git clone https://github.com/Calbabreaker/piano --sparse > cd piano > git sparse-checkout add any_text > git checkout deploy-frontend > git sparse-checkout init --cone > git sparse-checkout add any_text Thanks for the reproduction. An even simpler one may be (inside of any repository): git sparse-checkout init git sparse-checkout add dir git sparse-checkout init --cone git sparse-checkout add dir The problem occurs because we keep existing entries when adding to the sparse-checkout list, and cone-mode patterns do not mix with non cone-mode patterns. So after the first init and "add dir", your sparse-checkout file looks like: /* !/*/ dir but then when we convert to cone-mode and try and add "dir" (which in cone-mode we'll convert to "/dir/"), we run into trouble when adding the existing "dir" entry. That's because add_patterns_cone_mode() calls insert_recursive_pattern() on every entry in the existing list, including "dir". So when we call insert_recursive_pattern() with any pattern list and path containing "dir", we first insert "dir" into the list, and then: char *slash = strrchr(e->pattern, '/'); char *oldpattern = e->pattern; if (slash == e->pattern) break; // trim off a slash, repeat except slash is NULL because "dir" doesn't contain a slash. And that explains the problem you're seeing, because (a) we'll stay in that while loop forever, and (b) because each iteration allocates memory to accommodate the new pattern, so we'll eventually run out of memory. The wrong thing to do would be to handle this case by changing the conditional to "if (!slash || slash == e->pattern)", because we can't blindly carry forward some patterns which look like cone-mode patterns, since together the list of sparse-checkout entries may not represent a cone. (An example here is if we added /foo/bar/baz/* without the corresponding /foo/, !/foo/*, and so on). So I think the problem really is that we need to drop existing patterns when re-initializing the sparse-checkout in cone mode. We could try to recognize that existing patterns may already constitute a cone (and/or create a cone that covers the existing patterns). But I think the easiest thing (if a little unfriendly) would be to just drop them and start afresh when re-initializing the sparse-checkout in cone mode. I'm adding Stolee to the CC to see if he thinks that would be sensible behavior or not. Thanks, Taylor ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-20 15:52 ` Taylor Blau @ 2021-09-20 16:29 ` Derrick Stolee 2021-09-20 16:42 ` Taylor Blau 0 siblings, 1 reply; 15+ messages in thread From: Derrick Stolee @ 2021-09-20 16:29 UTC (permalink / raw) To: Taylor Blau, Calbabreaker; +Cc: git On 9/20/2021 11:52 AM, Taylor Blau wrote: > On Mon, Sep 20, 2021 at 09:45:14PM +0930, Calbabreaker wrote: >> What did you do before the bug happened? (Steps to reproduce your issue) >> >> This was ran: >> >> git clone https://github.com/Calbabreaker/piano --sparse >> cd piano >> git sparse-checkout add any_text >> git checkout deploy-frontend >> git sparse-checkout init --cone >> git sparse-checkout add any_text Thank you for pointing this out. I'll point out that this was likely found because "--sparse" does not initialize cone mode patterns, but you might have expected it to. This will increase the priority of adding something like "--sparse=cone" to the 'git clone' options. > Thanks for the reproduction. An even simpler one may be (inside of any > repository): > > git sparse-checkout init > git sparse-checkout add dir > git sparse-checkout init --cone > git sparse-checkout add dir > > The problem occurs because we keep existing entries when adding to the > sparse-checkout list, and cone-mode patterns do not mix with > non cone-mode patterns. > > So after the first init and "add dir", your sparse-checkout file looks > like: > > /* > !/*/ > dir > > but then when we convert to cone-mode and try and add "dir" (which in > cone-mode we'll convert to "/dir/"), we run into trouble when adding the > existing "dir" entry. That's because add_patterns_cone_mode() calls > insert_recursive_pattern() on every entry in the existing list, > including "dir". > > So when we call insert_recursive_pattern() with any pattern list and > path containing "dir", we first insert "dir" into the list, and then: > > char *slash = strrchr(e->pattern, '/'); > char *oldpattern = e->pattern; > > if (slash == e->pattern) > break; > // trim off a slash, repeat > > except slash is NULL because "dir" doesn't contain a slash. And that > explains the problem you're seeing, because (a) we'll stay in that while > loop forever, and (b) because each iteration allocates memory to > accommodate the new pattern, so we'll eventually run out of memory. Yikes! Thanks for digging into the details. > The wrong thing to do would be to handle this case by changing the > conditional to "if (!slash || slash == e->pattern)", because we can't > blindly carry forward some patterns which look like cone-mode patterns, > since together the list of sparse-checkout entries may not represent a > cone. > > (An example here is if we added /foo/bar/baz/* without the corresponding > /foo/, !/foo/*, and so on). > > So I think the problem really is that we need to drop existing patterns > when re-initializing the sparse-checkout in cone mode. We could try to > recognize that existing patterns may already constitute a cone (and/or > create a cone that covers the existing patterns). > > But I think the easiest thing (if a little unfriendly) would be to just > drop them and start afresh when re-initializing the sparse-checkout in > cone mode. This isn't sufficient, as a user can modify their .git/info/sparse-checkout file whenever they want, so we should fix this bug regardless. We could add a "Your existing patterns are not in cone mode" error. It might still be a good idea to let "git sparse-checkout init --cone" overwrite the sparse-checkout file _if the file is not already in cone mode_. Thanks, -Stolee ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-20 16:29 ` Derrick Stolee @ 2021-09-20 16:42 ` Taylor Blau 2021-09-20 17:25 ` Derrick Stolee 0 siblings, 1 reply; 15+ messages in thread From: Taylor Blau @ 2021-09-20 16:42 UTC (permalink / raw) To: Derrick Stolee; +Cc: Taylor Blau, Calbabreaker, git On Mon, Sep 20, 2021 at 12:29:36PM -0400, Derrick Stolee wrote: > > So I think the problem really is that we need to drop existing patterns > > when re-initializing the sparse-checkout in cone mode. We could try to > > recognize that existing patterns may already constitute a cone (and/or > > create a cone that covers the existing patterns). > > > > But I think the easiest thing (if a little unfriendly) would be to just > > drop them and start afresh when re-initializing the sparse-checkout in > > cone mode. > > This isn't sufficient, as a user can modify their .git/info/sparse-checkout > file whenever they want, so we should fix this bug regardless. We could add > a "Your existing patterns are not in cone mode" error. > > It might still be a good idea to let "git sparse-checkout init --cone" > overwrite the sparse-checkout file _if the file is not already in cone > mode_. I'm not sure how helpful such an error message might be to a user in this scenario without extra information. After seeing just "this isn't a cone", it's not clear what they should do other than drop their sparse-checkout configuration and start over. It would be nice to have an intermediate step between seeing realizing that the existing patterns don't form a cone and dropping them. Perhaps we could include an error message and say something like: warning: your sparse-checkout patterns do not from a cone hint: to reinitialize your sparse-checkout configuration hint: try running: hint: hint: git sparse-checkout init --cone --reinitialize Where `--reinitialize` means to drop existing patterns. I suppose it could be the default when transitioning from non-cone to cone mode, but that would defeat the purpose of the warning. We would probably want to perform this check both during initialization, and when adding patterns in cone mode. It may also be worthwhile to check the validity of the cone before running 'list' or 'reapply', too. Thanks, Taylor ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-20 16:42 ` Taylor Blau @ 2021-09-20 17:25 ` Derrick Stolee 2021-09-20 17:27 ` Derrick Stolee 0 siblings, 1 reply; 15+ messages in thread From: Derrick Stolee @ 2021-09-20 17:25 UTC (permalink / raw) To: Taylor Blau; +Cc: Calbabreaker, git On 9/20/2021 12:42 PM, Taylor Blau wrote: > On Mon, Sep 20, 2021 at 12:29:36PM -0400, Derrick Stolee wrote: >>> So I think the problem really is that we need to drop existing patterns >>> when re-initializing the sparse-checkout in cone mode. We could try to >>> recognize that existing patterns may already constitute a cone (and/or >>> create a cone that covers the existing patterns). >>> >>> But I think the easiest thing (if a little unfriendly) would be to just >>> drop them and start afresh when re-initializing the sparse-checkout in >>> cone mode. >> >> This isn't sufficient, as a user can modify their .git/info/sparse-checkout >> file whenever they want, so we should fix this bug regardless. We could add >> a "Your existing patterns are not in cone mode" error. >> >> It might still be a good idea to let "git sparse-checkout init --cone" >> overwrite the sparse-checkout file _if the file is not already in cone >> mode_. > > I'm not sure how helpful such an error message might be to a user in > this scenario without extra information. After seeing just "this isn't a > cone", it's not clear what they should do other than drop their > sparse-checkout configuration and start over. > > It would be nice to have an intermediate step between seeing realizing > that the existing patterns don't form a cone and dropping them. > > Perhaps we could include an error message and say something like: > > warning: your sparse-checkout patterns do not from a cone > hint: to reinitialize your sparse-checkout configuration > hint: try running: > hint: > hint: git sparse-checkout init --cone --reinitialize > > Where `--reinitialize` means to drop existing patterns. I suppose it > could be the default when transitioning from non-cone to cone mode, but > that would defeat the purpose of the warning. I've got an initial set of commits in a GGG pull request [1], and I'm waiting build validation as a double-check before sending them to the list. [1] https://github.com/gitgitgadget/git/pull/1043 There, I think the best thing to do is have 'init --cone' overwrite the old, non-cone-mode patterns. This essentially makes your --reinitialize suggestion on permanently in that case. Note that it doesn't reinitialize when the patterns are already in cone mode. Note that 'git sparse-checkout set' (with no other arguments) initializes the sparse-checkout file with the default set of cone mode patterns, so we have a way to do that now. > We would probably want to perform this check both during initialization, > and when adding patterns in cone mode. It may also be worthwhile to > check the validity of the cone before running 'list' or 'reapply', too. 'list' definitely seems like a good idea, since it is expecting different output than the literal patterns when cone mode is enabled. 'reapply' should work in both cases, it will just use the old pattern matching algorithm when in cone mode. It will present a warning when the patterns don't match as expected. One thing that is very interesting about the example here is that a pattern "dir" is not recognized as a non-cone-mode pattern, but it should be. The fix is to look for a '/' at the start of the pattern and reject it otherwise. This is fixed in my series. Thanks, -Stolee ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-20 17:25 ` Derrick Stolee @ 2021-09-20 17:27 ` Derrick Stolee 2021-09-20 19:08 ` Taylor Blau 0 siblings, 1 reply; 15+ messages in thread From: Derrick Stolee @ 2021-09-20 17:27 UTC (permalink / raw) To: Taylor Blau; +Cc: Calbabreaker, git On 9/20/2021 1:25 PM, Derrick Stolee wrote: > On 9/20/2021 12:42 PM, Taylor Blau wrote: >> We would probably want to perform this check both during initialization, >> and when adding patterns in cone mode. It may also be worthwhile to >> check the validity of the cone before running 'list' or 'reapply', too. > > 'list' definitely seems like a good idea, since it is expecting different > output than the literal patterns when cone mode is enabled. I double-checked this to see how to fix this, and the 'list' subcommand already notices that the patterns are not in cone mode and reverts its behavior to writing all of the sparse-checkout file to stdout. It also writes warnings over stderr before that. There might not be anything pressing to do here. Thanks, -Stolee ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-20 17:27 ` Derrick Stolee @ 2021-09-20 19:08 ` Taylor Blau 2021-09-20 20:56 ` Derrick Stolee 0 siblings, 1 reply; 15+ messages in thread From: Taylor Blau @ 2021-09-20 19:08 UTC (permalink / raw) To: Derrick Stolee; +Cc: Taylor Blau, Calbabreaker, git On Mon, Sep 20, 2021 at 01:27:25PM -0400, Derrick Stolee wrote: > On 9/20/2021 1:25 PM, Derrick Stolee wrote: > > On 9/20/2021 12:42 PM, Taylor Blau wrote: > >> We would probably want to perform this check both during initialization, > >> and when adding patterns in cone mode. It may also be worthwhile to > >> check the validity of the cone before running 'list' or 'reapply', too. > > > > 'list' definitely seems like a good idea, since it is expecting different > > output than the literal patterns when cone mode is enabled. > > I double-checked this to see how to fix this, and the 'list' subcommand > already notices that the patterns are not in cone mode and reverts its > behavior to writing all of the sparse-checkout file to stdout. It also > writes warnings over stderr before that. > > There might not be anything pressing to do here. Hmm. I think we'd probably want the same behavior for init and any other commands which could potentially overwrite the contents of the sparse-checkout file. Those may already call list routines internally, in which case I agree that this is already taken care of. But if not, then I think we should match list's behavior in the new locations, too. Thanks, Taylor ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-20 19:08 ` Taylor Blau @ 2021-09-20 20:56 ` Derrick Stolee 2021-09-20 21:20 ` Taylor Blau 0 siblings, 1 reply; 15+ messages in thread From: Derrick Stolee @ 2021-09-20 20:56 UTC (permalink / raw) To: Taylor Blau; +Cc: Calbabreaker, git On 9/20/2021 3:08 PM, Taylor Blau wrote: > On Mon, Sep 20, 2021 at 01:27:25PM -0400, Derrick Stolee wrote: >> On 9/20/2021 1:25 PM, Derrick Stolee wrote: >>> On 9/20/2021 12:42 PM, Taylor Blau wrote: >>>> We would probably want to perform this check both during initialization, >>>> and when adding patterns in cone mode. It may also be worthwhile to >>>> check the validity of the cone before running 'list' or 'reapply', too. >>> >>> 'list' definitely seems like a good idea, since it is expecting different >>> output than the literal patterns when cone mode is enabled. >> >> I double-checked this to see how to fix this, and the 'list' subcommand >> already notices that the patterns are not in cone mode and reverts its >> behavior to writing all of the sparse-checkout file to stdout. It also >> writes warnings over stderr before that. >> >> There might not be anything pressing to do here. > > Hmm. I think we'd probably want the same behavior for init and any other > commands which could potentially overwrite the contents of the > sparse-checkout file. Could you elaborate on what you mean by "the same behavior"? Do you mean that "git sparse-checkout add X" should act as if cone mode is not enabled if the existing patterns are not cone-mode patterns? What exactly do you mean about "init" changing behavior here? > Those may already call list routines internally, in which case I agree > that this is already taken care of. But if not, then I think we should > match list's behavior in the new locations, too. "list" interprets the 'struct pattern_list' in two different ways, depending on the use_cone_patterns member. They are static methods in the builtin code, not used by anything else. Thanks, -Stolee ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-20 20:56 ` Derrick Stolee @ 2021-09-20 21:20 ` Taylor Blau 2021-09-21 12:55 ` Derrick Stolee 0 siblings, 1 reply; 15+ messages in thread From: Taylor Blau @ 2021-09-20 21:20 UTC (permalink / raw) To: Derrick Stolee; +Cc: Taylor Blau, Calbabreaker, git On Mon, Sep 20, 2021 at 04:56:47PM -0400, Derrick Stolee wrote: > >> I double-checked this to see how to fix this, and the 'list' subcommand > >> already notices that the patterns are not in cone mode and reverts its > >> behavior to writing all of the sparse-checkout file to stdout. It also > >> writes warnings over stderr before that. > >> > >> There might not be anything pressing to do here. > > > > Hmm. I think we'd probably want the same behavior for init and any other > > commands which could potentially overwrite the contents of the > > sparse-checkout file. > > Could you elaborate on what you mean by "the same behavior"? > > Do you mean that "git sparse-checkout add X" should act as if cone mode > is not enabled if the existing patterns are not cone-mode patterns? > > What exactly do you mean about "init" changing behavior here? No, I was referring to your suggestion from [1] to add a warning from "git sparse-checkout init --cone" when there are existing patterns which are not in cone-mode. > > Those may already call list routines internally, in which case I agree > > that this is already taken care of. But if not, then I think we should > > match list's behavior in the new locations, too. > > "list" interprets the 'struct pattern_list' in two different ways, > depending on the use_cone_patterns member. They are static methods in > the builtin code, not used by anything else. Ah, bummer. I was hoping that they'd be used internally by init so that it would automatically emit a warning in the case where a user's existing patterns are not in cone mode. Apologies for any confusion. Thanks, Taylor [1]: https://lore.kernel.org/git/8a0ddd8e-b585-8f40-c4b1-0a51f11e6b84@gmail.com/ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-20 21:20 ` Taylor Blau @ 2021-09-21 12:55 ` Derrick Stolee 2021-09-21 16:32 ` Taylor Blau 0 siblings, 1 reply; 15+ messages in thread From: Derrick Stolee @ 2021-09-21 12:55 UTC (permalink / raw) To: Taylor Blau; +Cc: Calbabreaker, git On 9/20/2021 5:20 PM, Taylor Blau wrote: > On Mon, Sep 20, 2021 at 04:56:47PM -0400, Derrick Stolee wrote: >>>> I double-checked this to see how to fix this, and the 'list' subcommand >>>> already notices that the patterns are not in cone mode and reverts its >>>> behavior to writing all of the sparse-checkout file to stdout. It also >>>> writes warnings over stderr before that. >>>> >>>> There might not be anything pressing to do here. >>> >>> Hmm. I think we'd probably want the same behavior for init and any other >>> commands which could potentially overwrite the contents of the >>> sparse-checkout file. >> >> Could you elaborate on what you mean by "the same behavior"? >> >> Do you mean that "git sparse-checkout add X" should act as if cone mode >> is not enabled if the existing patterns are not cone-mode patterns? >> >> What exactly do you mean about "init" changing behavior here? > > No, I was referring to your suggestion from [1] to add a warning from > "git sparse-checkout init --cone" when there are existing patterns which > are not in cone-mode. This warning is part of the sparse-checkout pattern parsing logic, so it happens whenever the patterns are loaded, including the "list" subcommand (among other commands, not just the sparse-checkout builtin). >>> Those may already call list routines internally, in which case I agree >>> that this is already taken care of. But if not, then I think we should >>> match list's behavior in the new locations, too. >> >> "list" interprets the 'struct pattern_list' in two different ways, >> depending on the use_cone_patterns member. They are static methods in >> the builtin code, not used by anything else. > > Ah, bummer. I was hoping that they'd be used internally by init so that > it would automatically emit a warning in the case where a user's > existing patterns are not in cone mode. > > Apologies for any confusion. Thanks for clearing it up! -Stolee ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-21 12:55 ` Derrick Stolee @ 2021-09-21 16:32 ` Taylor Blau 2021-09-21 18:56 ` Derrick Stolee 0 siblings, 1 reply; 15+ messages in thread From: Taylor Blau @ 2021-09-21 16:32 UTC (permalink / raw) To: Derrick Stolee; +Cc: Taylor Blau, Calbabreaker, git On Tue, Sep 21, 2021 at 08:55:01AM -0400, Derrick Stolee wrote: > On 9/20/2021 5:20 PM, Taylor Blau wrote: > > On Mon, Sep 20, 2021 at 04:56:47PM -0400, Derrick Stolee wrote: > >>>> I double-checked this to see how to fix this, and the 'list' subcommand > >>>> already notices that the patterns are not in cone mode and reverts its > >>>> behavior to writing all of the sparse-checkout file to stdout. It also > >>>> writes warnings over stderr before that. > >>>> > >>>> There might not be anything pressing to do here. > >>> > >>> Hmm. I think we'd probably want the same behavior for init and any other > >>> commands which could potentially overwrite the contents of the > >>> sparse-checkout file. > >> > >> Could you elaborate on what you mean by "the same behavior"? > >> > >> Do you mean that "git sparse-checkout add X" should act as if cone mode > >> is not enabled if the existing patterns are not cone-mode patterns? > >> > >> What exactly do you mean about "init" changing behavior here? > > > > No, I was referring to your suggestion from [1] to add a warning from > > "git sparse-checkout init --cone" when there are existing patterns which > > are not in cone-mode. > > This warning is part of the sparse-checkout pattern parsing logic, so > it happens whenever the patterns are loaded, including the "list" > subcommand (among other commands, not just the sparse-checkout builtin). I might be misunderstanding what you're saying. But what I'm wondering is: if we detect when existing patterns aren't in cone-mode, why didn't we catch that case originally when the memory leak was discovered? I thought that it might have been related to your third patch to change how bad patterns are detected. But I ran the following script after applying each of your three patches individually: rm -fr repo git init repo cd repo git sparse-checkout init git sparse-checkout add foo git sparse-checkout init --cone git sparse-checkout add foo and the only difference is that we started silently dropping the bad "foo" pattern after re-adding foo in cone-mode starting with the second patch. I guess my question is: it seems like there may be a friendlier way to tell the user that we're about to drop their sparse-checkout definition instead of just doing it silently. And it seems like you're saying that we already have something that detects that and is used everywhere. But I wonder why it wasn't kicking in in the original report. Thanks, Taylor ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-21 16:32 ` Taylor Blau @ 2021-09-21 18:56 ` Derrick Stolee 2021-09-21 20:45 ` Taylor Blau 0 siblings, 1 reply; 15+ messages in thread From: Derrick Stolee @ 2021-09-21 18:56 UTC (permalink / raw) To: Taylor Blau; +Cc: Calbabreaker, git On 9/21/2021 12:32 PM, Taylor Blau wrote: > On Tue, Sep 21, 2021 at 08:55:01AM -0400, Derrick Stolee wrote: >> On 9/20/2021 5:20 PM, Taylor Blau wrote: >>> On Mon, Sep 20, 2021 at 04:56:47PM -0400, Derrick Stolee wrote: >>>>>> I double-checked this to see how to fix this, and the 'list' subcommand >>>>>> already notices that the patterns are not in cone mode and reverts its >>>>>> behavior to writing all of the sparse-checkout file to stdout. It also >>>>>> writes warnings over stderr before that. >>>>>> >>>>>> There might not be anything pressing to do here. >>>>> >>>>> Hmm. I think we'd probably want the same behavior for init and any other >>>>> commands which could potentially overwrite the contents of the >>>>> sparse-checkout file. >>>> >>>> Could you elaborate on what you mean by "the same behavior"? >>>> >>>> Do you mean that "git sparse-checkout add X" should act as if cone mode >>>> is not enabled if the existing patterns are not cone-mode patterns? >>>> >>>> What exactly do you mean about "init" changing behavior here? >>> >>> No, I was referring to your suggestion from [1] to add a warning from >>> "git sparse-checkout init --cone" when there are existing patterns which >>> are not in cone-mode. >> >> This warning is part of the sparse-checkout pattern parsing logic, so >> it happens whenever the patterns are loaded, including the "list" >> subcommand (among other commands, not just the sparse-checkout builtin). > > I might be misunderstanding what you're saying. But what I'm wondering > is: if we detect when existing patterns aren't in cone-mode, why didn't > we catch that case originally when the memory leak was discovered? Easy answer: there was a bug. The pattern in the original report evaded the checks that were implemented to identify non-cone-mode patterns. My patch 3 [1] fixes that problem, with this hunk: diff --git a/dir.c b/dir.c index 03c4d212672..93136442103 100644 --- a/dir.c +++ b/dir.c @@ -727,7 +727,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern } if (given->patternlen < 2 || - *given->pattern == '*' || + *given->pattern != '/' || strstr(given->pattern, "**")) { /* Not a cone pattern. */ warning(_("unrecognized pattern: '%s'"), given->pattern); and a test case change is required to avoid having the warning message appear in the wrong places: @@ -182,9 +185,9 @@ test_expect_success 'set sparse-checkout using --stdin' ' test_expect_success 'add to sparse-checkout' ' cat repo/.git/info/sparse-checkout >expect && cat >add <<-\EOF && - pattern1 + /pattern1 /folder1/ - pattern2 + /pattern2 EOF cat add >>expect && git -C repo sparse-checkout add --stdin <add && [1] https://lore.kernel.org/git/d513b28b75189d066f9c66de44a1a578cbc38139.1632160658.git.gitgitgadget@gmail.com/ > I thought that it might have been related to your third patch to change > how bad patterns are detected. But I ran the following script after > applying each of your three patches individually: > > rm -fr repo > git init repo > cd repo > > git sparse-checkout init > git sparse-checkout add foo > git sparse-checkout init --cone > git sparse-checkout add foo > > and the only difference is that we started silently dropping the bad > "foo" pattern after re-adding foo in cone-mode starting with the second > patch. In patch 2, we "detect" that the old patterns were not in cone mode because the core.sparseCheckoutCone config is false when parsing the patterns, so use_cone_patterns is 0. > I guess my question is: it seems like there may be a friendlier way to > tell the user that we're about to drop their sparse-checkout definition > instead of just doing it silently. And it seems like you're saying that > we already have something that detects that and is used everywhere. But > I wonder why it wasn't kicking in in the original report. You are correct that we should make better documentation around the re-initialization of patterns. And this _is_ a change of behavior and I will cave to concerns around that. I think this is more a case of matching what users expect from the interface, or at minimum helping them fall into the "pit of success". Let's move the discussion to that thread so we can interleave the patches themselves. Thanks, -Stolee ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-21 18:56 ` Derrick Stolee @ 2021-09-21 20:45 ` Taylor Blau 2021-09-22 19:16 ` Derrick Stolee 0 siblings, 1 reply; 15+ messages in thread From: Taylor Blau @ 2021-09-21 20:45 UTC (permalink / raw) To: Derrick Stolee; +Cc: Taylor Blau, Calbabreaker, git On Tue, Sep 21, 2021 at 02:56:01PM -0400, Derrick Stolee wrote: > > I thought that it might have been related to your third patch to change > > how bad patterns are detected. But I ran the following script after > > applying each of your three patches individually: > > > > rm -fr repo > > git init repo > > cd repo > > > > git sparse-checkout init > > git sparse-checkout add foo > > git sparse-checkout init --cone > > git sparse-checkout add foo > > > > and the only difference is that we started silently dropping the bad > > "foo" pattern after re-adding foo in cone-mode starting with the second > > patch. > > In patch 2, we "detect" that the old patterns were not in cone mode > because the core.sparseCheckoutCone config is false when parsing the > patterns, so use_cone_patterns is 0. I fear that we're talking about different things. With your patches, if I munge my .git/info/sparse-checkout file, I can easily get something like: $ git.compile sparse-checkout list warning: unrecognized pattern: 'foo' warning: disabling cone pattern matching to appear. But I'm wondering why the same doesn't happen when running `git sparse-checkout init --cone` while the existing sparse-checkout definition contains non-cone mode entries. I could be holding it wrong, but I was unsuccessful in getting a warning out with the quoted script. > Let's move the discussion to that thread so we can interleave the > patches themselves. Sure, if you want. I just found it easier to reply here since I can more readily quote the things I want to respond to. Thanks, Taylor ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-21 20:45 ` Taylor Blau @ 2021-09-22 19:16 ` Derrick Stolee 2021-09-22 19:37 ` Taylor Blau 0 siblings, 1 reply; 15+ messages in thread From: Derrick Stolee @ 2021-09-22 19:16 UTC (permalink / raw) To: Taylor Blau; +Cc: Calbabreaker, git On 9/21/2021 4:45 PM, Taylor Blau wrote: > On Tue, Sep 21, 2021 at 02:56:01PM -0400, Derrick Stolee wrote: >>> I thought that it might have been related to your third patch to change >>> how bad patterns are detected. But I ran the following script after >>> applying each of your three patches individually: >>> >>> rm -fr repo >>> git init repo >>> cd repo >>> >>> git sparse-checkout init >>> git sparse-checkout add foo >>> git sparse-checkout init --cone >>> git sparse-checkout add foo >>> >>> and the only difference is that we started silently dropping the bad >>> "foo" pattern after re-adding foo in cone-mode starting with the second >>> patch. >> >> In patch 2, we "detect" that the old patterns were not in cone mode >> because the core.sparseCheckoutCone config is false when parsing the >> patterns, so use_cone_patterns is 0. > > I fear that we're talking about different things. With your patches, if > I munge my .git/info/sparse-checkout file, I can easily get something > like: > > $ git.compile sparse-checkout list > warning: unrecognized pattern: 'foo' > warning: disabling cone pattern matching > > to appear. But I'm wondering why the same doesn't happen when running > `git sparse-checkout init --cone` while the existing sparse-checkout > definition contains non-cone mode entries. You don't get that warning because it's not trying to parse the previous patterns using cone mode. Before my series, you would get the warning in a _second_ run of "git sparse-checkout init --cone". We should add a different warning for overwriting the existing patterns. Thanks, -Stolee ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Memory leak with sparse-checkout 2021-09-22 19:16 ` Derrick Stolee @ 2021-09-22 19:37 ` Taylor Blau 0 siblings, 0 replies; 15+ messages in thread From: Taylor Blau @ 2021-09-22 19:37 UTC (permalink / raw) To: Derrick Stolee; +Cc: Calbabreaker, git On Wed, Sep 22, 2021 at 03:16:22PM -0400, Derrick Stolee wrote: > On 9/21/2021 4:45 PM, Taylor Blau wrote: > [...] > > I fear that we're talking about different things. With your patches, if > > I munge my .git/info/sparse-checkout file, I can easily get something > > like: > > > > $ git.compile sparse-checkout list > > warning: unrecognized pattern: 'foo' > > warning: disabling cone pattern matching > > > > to appear. But I'm wondering why the same doesn't happen when running > > `git sparse-checkout init --cone` while the existing sparse-checkout > > definition contains non-cone mode entries. > > You don't get that warning because it's not trying to parse the > previous patterns using cone mode. Before my series, you would get > the warning in a _second_ run of "git sparse-checkout init --cone". OK. I was confused because in [1] you said that we produce this error any time we try to load the existing definitions and parse them in cone-mode. I would have assumed that we would try to do that during `init --cone` to read the existing sparse-checkout definition. Thanks, Taylor [1]: https://lore.kernel.org/git/734ecf93-e563-20d5-7cf1-74048aa74d56@gmail.com/ ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2021-09-22 19:37 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-09-20 12:15 Memory leak with sparse-checkout Calbabreaker 2021-09-20 15:52 ` Taylor Blau 2021-09-20 16:29 ` Derrick Stolee 2021-09-20 16:42 ` Taylor Blau 2021-09-20 17:25 ` Derrick Stolee 2021-09-20 17:27 ` Derrick Stolee 2021-09-20 19:08 ` Taylor Blau 2021-09-20 20:56 ` Derrick Stolee 2021-09-20 21:20 ` Taylor Blau 2021-09-21 12:55 ` Derrick Stolee 2021-09-21 16:32 ` Taylor Blau 2021-09-21 18:56 ` Derrick Stolee 2021-09-21 20:45 ` Taylor Blau 2021-09-22 19:16 ` Derrick Stolee 2021-09-22 19:37 ` Taylor Blau
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.