All of lore.kernel.org
 help / color / mirror / Atom feed
From: Samuel Lijin <sxlijin@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: [PATCH v4 6/6] clean: teach clean -d to skip dirs containing ignored files
Date: Mon, 22 May 2017 01:58:53 -0400	[thread overview]
Message-ID: <CAJZjrdX9BnuxY3tmpswG+yEdDm1+AR8rc5wKGZyVCMp-jP218A@mail.gmail.com> (raw)
In-Reply-To: <xmqqtw4do5tf.fsf@gitster.mtv.corp.google.com>

On Mon, May 22, 2017 at 12:48 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Samuel Lijin <sxlijin@gmail.com> writes:
>
>> +     for (j = i = 0; i < dir.nr;) {
>> +             for (;
>> +                  j < dir.ignored_nr &&
>> +                    0 <= cmp_dir_entry(&dir.entries[i], &dir.ignored[j]);
>> +                  j++);
>> +
>> +             if ((j < dir.ignored_nr) &&
>> +                             check_dir_entry_contains(dir.entries[i], dir.ignored[j])) {
>> +                     /* skip any dir.entries which contains a dir.ignored */
>> +                     free(dir.entries[i]);
>> +                     dir.entries[i++] = NULL;
>> +             } else {
>> +                     /* prune the contents of a dir.entries which will be removed */
>> +                     struct dir_entry *ent = dir.entries[i++];
>> +                     for (;
>> +                          i < dir.nr &&
>> +                            check_dir_entry_contains(ent, dir.entries[i]);
>> +                          i++) {
>> +                             free(dir.entries[i]);
>> +                             dir.entries[i] = NULL;
>> +                     }
>> +             }
>> +     }
>
> The second loop in the else clause is a bit tricky, and the comment
> "which will be removed" is not all that helpful to explain why the
> loop is there.
>
> But I think the code is correct.  Here is how I understood it.
>
>     While looking at dir.entries[i], the code noticed that nothing
>     in that directory is ignored.  But entries in dir.entries[] that
>     come later may be contained in dir.entries[i] and we just want
>     to show the top-level untracked one (e.g. "a/" and "a/b/" were
>     in entries[], there is nothing in "a/", so naturally there is
>     nothing in "a/b/", but we do not want to bother showing
>     both---showing "a/" alone saying "the entire a/ is untracked" is
>     what we want).

Yep, that's the gist of it.

More specifically: the contents of untracked dirs have to be picked up
so that clean -d can distinguish between purely untracked dirs and
untracked dirs which also contain ignored files. In the case of a
purely untracked dir, clean -d can remove the dir itself, and should
just skip over all the dir's contents; in the case of a mixed
untracked dir, clean -d should not remove the dir itself, just the
untracked contents.

> We may want to have a test to ensure "a/b/" is indeed omitted in
> such a situation from the output, though.

Is there a reason to ensure specifically a/b/ is tested, or are the
current tests, which do cover the a/b (i.e. where a/b is a file, not
where a/b/ is a dir) case, insufficient for some reason?

> By the way, instead of putting NULL, it may be easier to follow if
> you used two pointers, src and dst, into dir.entries[], just like
> you did in your latest version of [PATCH 4/6].  That way, you do not
> have to change anything in the later loop that walks over elements
> in the dir.entries[] array.  It would also help the logic easier to
> follow if the above loop were its own helper function.

Agreed on the helper function. On the src-dst thing: I considered it,
but I figured another O(n) set of array moves was unnecessary. I guess
this is one of those cases where premature optimization doesn't make
sense?

> Putting them all together, here is what I came up with that can be
> squashed into your patch.  I am undecided myself if this is easier
> to follow than your version, but it seems to pass your test ;-)
>
> Thanks.
>
>  builtin/clean.c | 70 ++++++++++++++++++++++++++++++++++-----------------------
>  1 file changed, 42 insertions(+), 28 deletions(-)
>
> diff --git a/builtin/clean.c b/builtin/clean.c
> index dd3308a447..c8712e7ac8 100644
> --- a/builtin/clean.c
> +++ b/builtin/clean.c
> @@ -851,9 +851,49 @@ static void interactive_main_loop(void)
>         }
>  }
>
> +static void simplify_untracked(struct dir_struct *dir)
> +{
> +       int src, dst, ign;
> +
> +       for (src = dst = ign = 0; src < dir->nr; src++) {
> +               /*
> +                * Skip entries in ignored[] that cannot be inside
> +                * entries[src]
> +                */
> +               while (ign < dir->ignored_nr &&
> +                      0 <= cmp_dir_entry(&dir->entries[src], &dir->ignored[ign]))
> +                       ign++;
> +
> +               if (dir->ignored_nr <= ign ||
> +                   !check_dir_entry_contains(dir->entries[src], dir->ignored[ign])) {
> +                       /*
> +                        * entries[src] does not contain an ignored
> +                        * path -- we need to keep it.  But we do not
> +                        * want to show entries[] that are contained
> +                        * in entries[src].
> +                        */
> +                       struct dir_entry *ent = dir->entries[src++];
> +                       dir->entries[dst++] = ent;
> +                       while (src < dir->nr &&
> +                              check_dir_entry_contains(ent, dir->entries[src])) {
> +                               free(dir->entries[src++]);
> +                       }
> +                       /* compensate for the outer loop's loop control */
> +                       src--;
> +               } else {
> +                       /*
> +                        * entries[src] contains an ignored path --
> +                        * drop it.
> +                        */
> +                       free(dir->entries[src]);
> +               }
> +       }
> +       dir->nr = dst;
> +}
> +
>  int cmd_clean(int argc, const char **argv, const char *prefix)
>  {
> -       int i, j, res;
> +       int i, res;
>         int dry_run = 0, remove_directories = 0, quiet = 0, ignored = 0;
>         int ignored_only = 0, config_set = 0, errors = 0, gone = 1;
>         int rm_flags = REMOVE_DIR_KEEP_NESTED_GIT;
> @@ -928,30 +968,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
>                        prefix, argv);
>
>         fill_directory(&dir, &pathspec);
> -
> -       for (j = i = 0; i < dir.nr;) {
> -               for (;
> -                    j < dir.ignored_nr &&
> -                      0 <= cmp_dir_entry(&dir.entries[i], &dir.ignored[j]);
> -                    j++);
> -
> -               if ((j < dir.ignored_nr) &&
> -                               check_dir_entry_contains(dir.entries[i], dir.ignored[j])) {
> -                       /* skip any dir.entries which contains a dir.ignored */
> -                       free(dir.entries[i]);
> -                       dir.entries[i++] = NULL;
> -               } else {
> -                       /* prune the contents of a dir.entries which will be removed */
> -                       struct dir_entry *ent = dir.entries[i++];
> -                       for (;
> -                            i < dir.nr &&
> -                              check_dir_entry_contains(ent, dir.entries[i]);
> -                            i++) {
> -                               free(dir.entries[i]);
> -                               dir.entries[i] = NULL;
> -                       }
> -               }
> -       }
> +       simplify_untracked(&dir);
>
>         for (i = 0; i < dir.nr; i++) {
>                 struct dir_entry *ent = dir.entries[i];
> @@ -959,9 +976,6 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
>                 struct stat st;
>                 const char *rel;
>
> -               if (!ent)
> -                       continue;
> -
>                 if (!cache_name_is_other(ent->name, ent->len))
>                         continue;
>

  reply	other threads:[~2017-05-22  5:59 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-28 22:56 Bug Report: .gitignore behavior is not matching in git clean and git status Chris Johnson
2017-05-01  1:41 ` Junio C Hamano
2017-05-01  1:56   ` Chris Johnson
2017-05-01  3:15     ` Junio C Hamano
2017-05-01 13:51     ` Samuel Lijin
2017-05-01 15:34       ` Samuel Lijin
2017-05-02  0:26         ` Junio C Hamano
2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
2017-05-03  3:29             ` [PATCH 1/7] t7300: skip untracked dirs containing " Samuel Lijin
2017-05-03 18:19               ` Stefan Beller
2017-05-03 18:26                 ` Samuel Lijin
2017-05-03 18:34                   ` Stefan Beller
2017-05-03  3:29             ` [PATCH 2/7] dir: recurse into untracked dirs for " Samuel Lijin
2017-05-03  3:29             ` [PATCH 3/7] dir: add method to check if a dir_entry lexically contains another Samuel Lijin
2017-05-03 18:09               ` Stefan Beller
2017-05-03  3:29             ` [PATCH 4/7] dir: hide untracked contents of untracked dirs Samuel Lijin
2017-05-03  3:29             ` [PATCH 5/7] dir: change linkage of cmp_name() and check_contains() Samuel Lijin
2017-05-03  3:29             ` [PATCH 6/7] builtin/clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
2017-05-03  3:29             ` [PATCH 7/7] t7061: check for ignored file in untracked dir Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
2017-05-08  4:26               ` Junio C Hamano
2017-05-08 21:37                 ` Samuel Lijin
2017-05-09  0:31                   ` Junio C Hamano
2017-05-16  7:34               ` [PATCH v3 0/8] Fix clean -d and status --ignored Samuel Lijin
2017-05-18  8:21                 ` [PATCH v4 0/6] " Samuel Lijin
2017-05-23  9:18                   ` [PATCH v5 " Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 " Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 1/6] t7300: clean -d should skip dirs with ignored files Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 2/6] t7061: status --ignored should search untracked dirs Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 3/6] dir: recurse into untracked dirs for ignored files Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 4/6] dir: hide untracked contents of untracked dirs Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 5/6] dir: expose cmp_name() and check_contains() Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 6/6] clean: teach clean -d to preserve ignored paths Samuel Lijin
2017-05-23 22:35                       ` Junio C Hamano
2017-05-24  4:14                       ` Torsten Bögershausen
2017-05-25 15:36                         ` Samuel Lijin
2017-05-26  1:05                           ` Junio C Hamano
2017-05-23  9:18                   ` [PATCH v5 1/6] t7300: clean -d should skip dirs with ignored files Samuel Lijin
2017-05-23  9:18                   ` [PATCH v5 2/6] t7061: status --ignored should search untracked dirs Samuel Lijin
2017-05-23  9:18                   ` [PATCH v5 3/6] dir: recurse into untracked dirs for ignored files Samuel Lijin
2017-05-23  9:18                   ` [PATCH v5 4/6] dir: hide untracked contents of untracked dirs Samuel Lijin
2017-05-23  9:18                   ` [PATCH v5 5/6] dir: expose cmp_name() and check_contains() Samuel Lijin
2017-05-23  9:18                   ` [PATCH v5 6/6] clean: teach clean -d to preserve ignored paths Samuel Lijin
2017-05-23 12:52                     ` Junio C Hamano
2017-05-23 19:16                       ` Samuel Lijin
2017-05-18  8:21                 ` [PATCH v4 1/6] t7300: clean -d should skip dirs with ignored files Samuel Lijin
2017-05-18  8:21                 ` [PATCH v4 2/6] t7061: status --ignored should search untracked dirs Samuel Lijin
2017-05-18  8:21                 ` [PATCH v4 3/6] dir: recurse into untracked dirs for ignored files Samuel Lijin
2017-05-18  8:21                 ` [PATCH v4 4/6] dir: hide untracked contents of untracked dirs Samuel Lijin
2017-05-22  3:13                   ` Junio C Hamano
2017-05-18  8:21                 ` [PATCH v4 5/6] dir: expose cmp_name() and check_contains() Samuel Lijin
2017-05-22  0:38                   ` Junio C Hamano
2017-05-18  8:21                 ` [PATCH v4 6/6] clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
2017-05-22  4:48                   ` Junio C Hamano
2017-05-22  5:58                     ` Samuel Lijin [this message]
2017-05-22  6:17                       ` Junio C Hamano
2017-05-23  2:43                         ` Samuel Lijin
2017-05-23  7:50                           ` Junio C Hamano
2017-05-16  7:34               ` [PATCH v3 1/8] t7300: clean -d should skip dirs with " Samuel Lijin
2017-05-16  7:34               ` [PATCH v3 2/8] t7061: status --ignored should search untracked dirs Samuel Lijin
2017-05-16  7:34               ` [PATCH v3 3/8] dir: recurse into untracked dirs for ignored files Samuel Lijin
2017-05-17  6:23                 ` Junio C Hamano
2017-05-17  7:02                   ` Samuel Lijin
2017-05-16  7:34               ` [PATCH v3 4/8] dir: hide untracked contents of untracked dirs Samuel Lijin
2017-05-17  6:47                 ` Junio C Hamano
2017-05-17  7:32                   ` Samuel Lijin
2017-05-17 10:14                     ` Junio C Hamano
2017-05-16  7:34               ` [PATCH v3 5/8] dir: expose cmp_name() and check_contains() Samuel Lijin
2017-05-16  7:34               ` [PATCH v3 6/8] clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
2017-05-18  2:34                 ` Junio C Hamano
2017-05-18  6:32                   ` Samuel Lijin
2017-05-16  7:34               ` [PATCH v3 7/8] t7300: clean -d now skips untracked " Samuel Lijin
2017-05-18  2:38                 ` Junio C Hamano
2017-05-16  7:34               ` [PATCH v3 8/8] t7061: status --ignored now searches untracked dirs Samuel Lijin
2017-05-18  2:46                 ` Junio C Hamano
2017-05-05 10:46             ` [PATCH v2 1/9] t7300: skip untracked dirs containing ignored files Samuel Lijin
2017-05-07 19:12               ` Torsten Bögershausen
2017-05-08 21:24                 ` Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 2/9] t7061: expect failure where expected behavior will change Samuel Lijin
2017-05-08  4:34               ` Junio C Hamano
2017-05-05 10:46             ` [PATCH v2 3/9] dir: recurse into untracked dirs for ignored files Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 4/9] dir: add method to check if a dir_entry lexically contains another Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 5/9] dir: hide untracked contents of untracked dirs Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 6/9] dir: change linkage of cmp_name() and check_contains() Samuel Lijin
2017-05-08  4:37               ` Junio C Hamano
2017-05-05 10:46             ` [PATCH v2 7/9] builtin/clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 8/9] t7300: clean -d now skips untracked " Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 9/9] t7061: expect ignored files in untracked dirs Samuel Lijin
2017-05-08  6:35               ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJZjrdX9BnuxY3tmpswG+yEdDm1+AR8rc5wKGZyVCMp-jP218A@mail.gmail.com \
    --to=sxlijin@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.