All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: "Sérgio Peixoto" <sergio.peixoto@gmail.com>
Cc: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Brandon Williams" <bwilliams.eng@gmail.com>,
	git@vger.kernel.org
Subject: Re: [PATCH] attr: do not mark queried macros as unset
Date: Fri, 18 Jan 2019 16:46:27 -0500	[thread overview]
Message-ID: <20190118214626.GC28808@sigill.intra.peff.net> (raw)
In-Reply-To: <20190118213458.GB28808@sigill.intra.peff.net>

On Fri, Jan 18, 2019 at 04:34:58PM -0500, Jeff King wrote:

> When 06a604e670 later refactored the macro code, it dropped maybe_real
> entirely. This missed the fact that "maybe_real" could be unset for two
> reasons: because of a macro, or because it was never found during
> parsing. This had two results:
> 
>   - the optimization in collect_some_attrs() ceased doing anything
>     meaningful, since it no longer kept track of "was it found during
>     parsing"
> 
>   - worse, it actually kicked in when the caller _did_ ask about a macro
>     by name, causing us to mark it as unspecified
> 
> It should be possible to salvage this optimization, but let's start with
> just removing the remnants. It hasn't been doing anything (except
> creating bugs) since 60a12722ac, and nobody seems to have noticed the
> performance regression. It's more important to fix the correctness
> problem clearly first.

And here's a resurrection of the optimization that _seems_ to work, but
I'm not 100% confident in.

In particular, it does not care about macros at all. It simply asks: is
this queried attribute a thing which was ever mentioned in the
attributes files (either as a path match or as a possible macro
expansion). If not, then we know we do not need to look further for it.

But that leaves me unsure why the original optimization needed to care
about macros at all. Has something changed since then with respect to
the way we expand macros since then? Or am I totally missing some case
that will cause problems?

I guess maybe what I'm missing is that asking for "diff" means that we
need to care about:

  - whether "diff" was mentioned in the stack

  - whether "binary" was mentioned in the stack

But just "binary" mentioning "diff" is not interesting without somebody
actually mentioning "binary". I.e., I don't think the patch here will
produce wrong results, but it will not kick in as often as we might
like.

I'm not sure how to do it robustly without being able to reverse-map all
of the macros after we've resolved them (i.e., to know that "diff" gets
mentioned by "binary", and then check if "binary" is actually
mentioned). I think that would be possible now, as we should know that
after determine_macros(). But I also wonder if we are hitting
diminishing returns (after all, determine_macros() is already walking
the attr stack).

I dunno. This is why I submitted the initial patch as the simplest fix. ;)

---
diff --git a/attr.c b/attr.c
index 57ced792f8..c3cbfa6501 100644
--- a/attr.c
+++ b/attr.c
@@ -31,6 +31,7 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 
 struct git_attr {
 	int attr_nr; /* unique attribute number */
+	int in_stack; /* actually found in some attribute stack */
 	char name[FLEX_ARRAY]; /* attribute name */
 };
 
@@ -220,7 +221,8 @@ static void report_invalid_attr(const char *name, size_t len,
  * dictionary.  If no entry is found, create a new attribute and store it in
  * the dictionary.
  */
-static const struct git_attr *git_attr_internal(const char *name, int namelen)
+static const struct git_attr *git_attr_internal(const char *name, int namelen,
+						int in_stack)
 {
 	struct git_attr *a;
 
@@ -240,6 +242,8 @@ static const struct git_attr *git_attr_internal(const char *name, int namelen)
 		       (hashmap_get_size(&g_attr_hashmap.map) - 1));
 	}
 
+	a->in_stack |= in_stack;
+
 	hashmap_unlock(&g_attr_hashmap);
 
 	return a;
@@ -247,7 +251,7 @@ static const struct git_attr *git_attr_internal(const char *name, int namelen)
 
 const struct git_attr *git_attr(const char *name)
 {
-	return git_attr_internal(name, strlen(name));
+	return git_attr_internal(name, strlen(name), 0);
 }
 
 /* What does a matched pattern decide? */
@@ -335,7 +339,7 @@ static const char *parse_attr(const char *src, int lineno, const char *cp,
 		else {
 			e->setto = xmemdupz(equals + 1, ep - equals - 1);
 		}
-		e->attr = git_attr_internal(cp, len);
+		e->attr = git_attr_internal(cp, len, 1);
 	}
 	return ep + strspn(ep, blank);
 }
@@ -396,7 +400,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		      sizeof(struct attr_state) * num_attr +
 		      (is_macro ? 0 : namelen + 1));
 	if (is_macro) {
-		res->u.attr = git_attr_internal(name, namelen);
+		res->u.attr = git_attr_internal(name, namelen, 1);
 	} else {
 		char *p = (char *)&(res->state[num_attr]);
 		memcpy(p, name, namelen);
@@ -1093,6 +1097,7 @@ static void collect_some_attrs(const struct index_state *istate,
 			       struct attr_check *check)
 {
 	int pathlen, rem, dirlen;
+	int i;
 	const char *cp, *last_slash = NULL;
 	int basename_offset;
 
@@ -1111,6 +1116,21 @@ static void collect_some_attrs(const struct index_state *istate,
 
 	prepare_attr_stack(istate, path, dirlen, &check->stack);
 	all_attrs_init(&g_attr_hashmap, check);
+
+	if (check->nr) {
+		rem = 0;
+		for (i = 0; i < check->nr; i++) {
+			int n = check->items[i].attr->attr_nr;
+			struct all_attrs_item *item = &check->all_attrs[n];
+			if (!item->attr->in_stack) {
+				item->value = ATTR__UNSET;
+				rem++;
+			}
+		}
+		if (rem == check->nr)
+			return;
+	}
+
 	determine_macros(check->all_attrs, check->stack);
 
 	rem = check->all_attrs_nr;

  reply	other threads:[~2019-01-18 21:46 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-17 15:47 Change on check-attr behavior Sérgio Peixoto
2019-01-17 16:07 ` Jeff King
2019-01-18  9:41   ` Sérgio Peixoto
2019-01-18 16:58     ` Jeff King
2019-01-18 21:34       ` [PATCH] attr: do not mark queried macros as unset Jeff King
2019-01-18 21:46         ` Jeff King [this message]
2019-01-18 22:19           ` Stefan Beller
2019-01-22  7:19             ` Jeff King
2019-01-22  9:50               ` Duy Nguyen
2019-01-22 22:00           ` Junio C Hamano
2019-01-21 10:05         ` Duy Nguyen
2019-01-22  7:21           ` Jeff King
2019-01-22  9:34         ` Duy Nguyen
2019-01-22 21:48         ` Junio C Hamano
2019-01-23  5:40           ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190118214626.GC28808@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=bwilliams.eng@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=sergio.peixoto@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.