From: Duy Nguyen <pclouds@gmail.com>
To: Jeff King <peff@peff.net>
Cc: "Stefan Beller" <sbeller@google.com>,
	"Sérgio Peixoto" <sergio.peixoto@gmail.com>,
	"Brandon Williams" <bwilliams.eng@gmail.com>,
	git <git@vger.kernel.org>
Subject: Re: [PATCH] attr: do not mark queried macros as unset
Date: Tue, 22 Jan 2019 16:50:30 +0700
Message-ID: <CACsJy8ALL5_gHro9jZcSBnfnV01UEJLReCrqi+w727bkqnjUAA@mail.gmail.com>
In-Reply-To: <20190122071921.GC28555@sigill.intra.peff.net>

On Tue, Jan 22, 2019 at 2:19 PM Jeff King <peff@peff.net> wrote:
> Yes, that's the interesting part. I think I've convinced myself, too,
> that it doesn't do the _wrong_ thing ever. But I think it misses the
> point of the original, which is that you want common ones like "diff"
> not to trigger in_stack if nobody has actually used them.

Yes. I don't think it matters much when you don't have a lot of
attributes, but if you do, the cost of every lookup will be
proportional to the stack's depth, even when the attribute you look
up is one that nobody actually uses. This makes code that uses
attributes just a tiny bit slower over time, because I think we still
add more and more attributes.

> And doing that
> really does mean marking in_stack not just when a macro mentions it
> (because clearly "binary" is going to mention it for every repo), but
> waiting to see if anybody mentions that macro.
>
> Which means we must call determine_macros(), and then propagate the
> macro's in_stack to its expansion (if it's indeed called at all).
>
> I don't think that would be _too_ hard to do. But I also wonder if
> there's much point. We are trying to avoid fill(), but I think that
> determine_macros() is of roughly the same complexity (look at all
> matches of all stacks). I guess it does avoid path_matches(), which is a
> bit more expensive. And in theory it could be cached for a particular
> stack top, so the work is amortized across many path lookups (though I
> think that gets even more tricky).

There is a comment that was eventually removed in bw/attr; note
especially the second-to-last sentence.

-/*
- * NEEDSWORK: maybe-real, maybe-macro are not property of
- * an attribute, as it depends on what .gitattributes are
- * read.  Once we introduce per git_attr_check attr_stack
- * and check_all_attr, the optimization based on them will
- * become unnecessary and can go away.  So is this variable.
- */
-static int cannot_trust_maybe_real;

The promise here is that, once we have moved away from the global
attribute stack, we can build custom stacks containing only the
queried attributes. This makes attribute stacks short (in the best
case empty, which is what my optimization is for), which means fill
time (I think path_matches() would dominate it) becomes shorter in the
_general_ case, so this optimization "will become unnecessary". More
importantly, the total number of attributes will not matter, since we
only look at what we are interested in. This makes attribute lookup
scale much better in the long run.
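
To make the idea concrete, here is a rough sketch of what "a stack
containing only queried attributes" could look like. It is only an
illustration under loud assumptions: struct rule, is_queried() and
build_custom_stack() are made-up names, not anything in attr.c, and
macro expansion (the hard part discussed above) is ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified rule; not the structures used in git's attr.c. */
struct rule {
	const char *pattern;	/* path pattern, e.g. "*.c" */
	const char *attr;	/* attribute this rule mentions, e.g. "diff" */
};

/* Return 1 if `name` is one of the `nr` queried attribute names. */
static int is_queried(const char *name, const char **queried, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (!strcmp(name, queried[i]))
			return 1;
	return 0;
}

/*
 * Copy into `out` only the rules that mention a queried attribute and
 * return how many were kept. `out` must have room for `nr_rules` entries.
 * Macros are NOT expanded here; that is an assumption of this sketch.
 */
static size_t build_custom_stack(const struct rule *rules, size_t nr_rules,
				 const char **queried, size_t nr_queried,
				 struct rule *out)
{
	size_t kept = 0;

	assert(rules && queried && out);
	for (size_t i = 0; i < nr_rules; i++)
		if (is_queried(rules[i].attr, queried, nr_queried))
			out[kept++] = rules[i];
	return kept;
}
```

With a filtered stack like this, the per-path matching loop only visits
rules that can affect the answer, and when nothing mentions the queried
attributes the stack is empty and there is nothing to match at all.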

This part, building custom stacks, has not come true yet. But if we
optimize this code again, I think this is the way forward. Perhaps
this could be one of the mini projects for Matthey's students. The
scope is relatively small, and optimization is always fun.
-- 
Duy
