From: Taylor Blau <me@ttaylorr.com>
To: Calbabreaker <calbabreaker@gmail.com>, Derrick Stolee <stolee@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Memory leak with sparse-checkout
Date: Mon, 20 Sep 2021 11:52:57 -0400
Message-ID: <YUiuWSXO1P3JwerH@nand.local>
In-Reply-To: <CAKRwm5a9PyqffEC5N__urSpNcZ-d5vz9GBM2Ei16eGS25B=-FQ@mail.gmail.com>

On Mon, Sep 20, 2021 at 09:45:14PM +0930, Calbabreaker wrote:
> What did you do before the bug happened? (Steps to reproduce your issue)
>
> This was ran:
>
> git clone https://github.com/Calbabreaker/piano --sparse
> cd piano
> git sparse-checkout add any_text
> git checkout deploy-frontend
> git sparse-checkout init --cone
> git sparse-checkout add any_text

Thanks for the reproduction. An even simpler one may be (inside of any
repository):

    git sparse-checkout init
    git sparse-checkout add dir
    git sparse-checkout init --cone
    git sparse-checkout add dir

The problem occurs because we keep existing entries when adding to the
sparse-checkout list, and cone-mode patterns do not mix with
non-cone-mode patterns.

So after the first init and "add dir", your sparse-checkout file looks
like:

  /*
  !/*/
  dir

but then when we convert to cone-mode and try to add "dir" (which in
cone-mode we'll convert to "/dir/"), we run into trouble when re-adding
the existing "dir" entry. That's because add_patterns_cone_mode() calls
insert_recursive_pattern() on every entry in the existing list,
including "dir".
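
To make the "dir" versus "/dir/" distinction concrete, here is a minimal
sketch of that conversion. to_cone_dir() is a hypothetical helper written
for illustration, not the function git uses; it only assumes that a
cone-mode directory entry has one leading and one trailing slash:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not git's implementation): write the cone-mode
 * directory form of `in` into `out`, e.g. "dir" -> "/dir/".
 * Returns 0 on success, -1 if `in` names no directory or `out` is
 * too small.
 */
static int to_cone_dir(const char *in, char *out, size_t outlen)
{
	size_t len = strlen(in);
	int n;

	/* Drop trailing and leading slashes so they are not doubled. */
	while (len > 0 && in[len - 1] == '/')
		len--;
	while (len > 0 && in[0] == '/') {
		in++;
		len--;
	}
	if (len == 0)
		return -1;

	n = snprintf(out, outlen, "/%.*s/", (int)len, in);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

The leftover "dir" line from the non-cone-mode file never goes through a
step like this, which is why it reaches insert_recursive_pattern()
without any slash in it.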

So when we call insert_recursive_pattern() with any pattern list and a
path of "dir", we first insert "dir" into the list, and then:

  char *slash = strrchr(e->pattern, '/');
  char *oldpattern = e->pattern;

  if (slash == e->pattern)
    break;
  // trim off a slash, repeat

except slash is NULL because "dir" doesn't contain a slash, so the
break is never taken. That explains the problem you're seeing: (a) we
stay in that while loop forever, and (b) each iteration allocates memory
to accommodate the new pattern, so we eventually run out of memory.
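
As a rough model of that loop (a sketch, not git's code),
count_parent_prefixes() below is a hypothetical function that walks the
same "trim at the last slash" steps without allocating anything. It
returns how many parent prefixes would be inserted, or -1 for a pattern
where the loop can never reach the `slash == pattern` exit:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the trimming loop (not git's implementation).
 * Returns the number of parent prefixes the loop would insert for
 * `pattern`, or -1 if trimming reaches a prefix with no '/' in it.
 */
static int count_parent_prefixes(const char *pattern)
{
	size_t len = strlen(pattern);
	int parents = 0;

	while (len > 0) {
		/* Find the last '/' within the first `len` bytes. */
		size_t i = len;

		while (i > 0 && pattern[i - 1] != '/')
			i--;
		if (i == 0)
			return -1; /* no slash: strrchr() would return NULL */
		if (i == 1)
			break; /* slash == pattern: reached the root */
		len = i - 1; /* trim at the slash, then repeat */
		parents++;
	}
	return parents;
}
```

With this model, "/foo/bar/baz" yields 2 (for "/foo/bar" and "/foo"),
"/dir" yields 0, and both "dir" and "dir/sub" yield -1, since neither
ever produces a prefix that starts with a slash.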

The wrong thing to do would be to handle this case by changing the
conditional to "if (!slash || slash == e->pattern)", because we can't
blindly carry forward some patterns which look like cone-mode patterns,
since together the list of sparse-checkout entries may not represent a
cone.

(An example here is if we added /foo/bar/baz/* without the corresponding
/foo/, !/foo/*, and so on).

So I think the problem really is that we need to drop existing patterns
when re-initializing the sparse-checkout in cone mode. We could try to
recognize that existing patterns may already constitute a cone (and/or
create a cone that covers the existing patterns).

But I think the easiest thing (if a little unfriendly) would be to just
drop them and start afresh when re-initializing the sparse-checkout in
cone mode.

I'm adding Stolee to the CC to see if he thinks that would be sensible
behavior or not.

Thanks,
Taylor
