All of lore.kernel.org
 help / color / mirror / Atom feed
From: Duy Nguyen <pclouds@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH 3/4] dir.c: support marking some patterns already matched
Date: Tue, 16 Feb 2016 08:36:32 +0700	[thread overview]
Message-ID: <CACsJy8AurhYgweE99uWs5rf7PsL9_8nuCo-DtQFO+gXCRPkPMA@mail.gmail.com> (raw)
In-Reply-To: <xmqqio1po7s9.fsf@gitster.mtv.corp.google.com>

On Tue, Feb 16, 2016 at 6:47 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>
>> Given path "a" and the pattern "a", it's matched. But if we throw path
>> "a/b" to pattern "a", the code fails to realize that if "a" matches
>> "a" then "a/b" should also be matched.
>>
>> When the pattern is matched the first time, we can mark it "sticky", so
>> that all files and dirs inside the matched path also matches. This is a
>> simpler solution than modify all match scenarios to fix that.
>
> I am not quite sure what this one tries to achieve.  Is this a
> performance thing, or is it a correctness thing?
>
> "This is a simpler solution than" is skimpy on the description of
> what the solution is.

As an example, let's look at pattern "a/". For this pattern to match,
"a" must be a directory. When we examine "a", we can stat() and see if
it is a directory. But when we descend inside, say "a/b", current code
is not prepared to deal with it and will try to stat("a/b") to see if
it's a directory instead of stat("a").

> When you see 'a' and path 'a/', you would throw it in the sticky
> list.  when you descend into 'a/' and see things under it,
> e.g. 'a/b', you would say "we have a match" because 'a' is sticky.
> Do you throw 'a/b' also into the sticky list so that you would catch
> 'a/b/c' later?

No because "a/" can still do the job. I know it's a bit hard to see
because add_sticky() is not used yet in this patch.

> Do you rely on the order of tree walking to cull
> entries from the sticky list that are no longer relevant?
> e.g. after you enumerate everything in 'a/b', you do not need 'a/b'
> anymore.

No explicit culling. But notice that these sticky lists are indirectly
contained in exclude_list. Suppose "a/b" is sticky because of
"a/.gitignore". As soon as we move out of "a", I would expect
prep_exclude() to remove the exclude_list of "a/.gitignore" and the
related sticky lists.

> Or do you notice that 'a/' matched at the top-level and stop
> bothering the sticky list when you descend into 'a/b' and others?
>
> How does this interact with negative patterns?

Negative or not is irrelevant. If "a" is matched negatively and we see
"a/b", we return the same response, "negative match".

> Thanks.  The data structure used in 3/4 smells iffy from performance
> point of view--have you tried it on a large trees with deep nesting?

No I haven't. Though I don't expect it to degrade performance much. We
basically add one new exclude rule when add_sticky() is called and pay
the price of pathname matching of that rule. If there are a lot of
sticky rules (especially ones at top directory), then performance can
go to hell. 4/4 should not add many of them though.
-- 
Duy

  reply	other threads:[~2016-02-16  1:37 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-15  9:03 [PATCH 0/4] .gitignore, reinclude rules, take 2 Nguyễn Thái Ngọc Duy
2016-02-15  9:03 ` [PATCH 1/4] dir.c: fix match_pathname() Nguyễn Thái Ngọc Duy
2016-02-15 23:29   ` Junio C Hamano
2016-02-16  1:17     ` Duy Nguyen
2016-02-15  9:03 ` [PATCH 2/4] dir.c: support tracing exclude Nguyễn Thái Ngọc Duy
2016-02-15  9:03 ` [PATCH 3/4] dir.c: support marking some patterns already matched Nguyễn Thái Ngọc Duy
2016-02-15 23:47   ` Junio C Hamano
2016-02-16  1:36     ` Duy Nguyen [this message]
2016-02-15  9:03 ` [PATCH 4/4] dir.c: don't exclude whole dir prematurely Nguyễn Thái Ngọc Duy
2016-02-15 23:49 ` [PATCH 0/4] .gitignore, reinclude rules, take 2 Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACsJy8AurhYgweE99uWs5rf7PsL9_8nuCo-DtQFO+gXCRPkPMA@mail.gmail.com \
    --to=pclouds@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.