git.vger.kernel.org archive mirror
* [PATCH 00/27] Revamp the attribute system; another round
@ 2017-01-12 23:53 Brandon Williams
  2017-01-12 23:53 ` [PATCH 01/27] commit.c: use strchrnul() to scan for one line Brandon Williams
                   ` (28 more replies)
  0 siblings, 29 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, pclouds, sbeller

This series has been bounced around a bit (from Junio to Stefan) and finally
landed in my lap.  The end result of Stefan's attempt at the series still had a
couple of things that needed more tweaking.  It also has a few patches on top
which added functionality to pathspecs to be able to query into the attribute
system, which I've dropped from this series due to this series' length.

As a reminder, the intent of this series is to revamp the attribute system so
that it can be thread-safe, along with a couple of other quality-of-life
changes.  This entailed removing dependencies on writing to global data
structures during the attribute collection process.  Major changes are as follows:

 * The global array used to collect attributes needed to be made local and as a
   result was pushed out to the attr_check structure the caller prepares before
   querying the attribute system.

 * As it turns out the attribute stack ends up being used as a read-only
   structure during the collection process and as such parts of the attribute
   stack can be shared between different threads calling into the system.  To
   enable this sharing the attribute stack frames are stored in a hashmap and
   can be read out (or created and stored in the hashmap) based on the
   directory name of the path being queried.  This is possible because if a
   particular stack frame is included in the overall stack for a particular
   query, all of the frames underneath it will be the same for all queries that
   use this frame (the only exception is the info frame, which is handled as a
   special case; see the patch for details).
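
The sharing idea in the second bullet can be sketched roughly as follows.
This is only an illustration under stated assumptions, not the code from the
series: struct frame, get_frame() and copy_prefix() are hypothetical names, a
linked list stands in for the hashmap, the info frame is ignored, and the
locking a threaded caller would need is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct frame {                          /* hypothetical, reduced stack frame */
	char *dir;                      /* directory this frame belongs to */
	const struct frame *prev;       /* frame of the parent directory */
	struct frame *next_cached;      /* cache chain; stands in for a hashmap */
};

static struct frame *cache;

/* Duplicate the first len bytes of s as a NUL-terminated string. */
static char *copy_prefix(const char *s, size_t len)
{
	char *p = malloc(len + 1);

	if (!p)
		abort();
	memcpy(p, s, len);
	p[len] = '\0';
	return p;
}

/* Return the shared frame for dir, building it and its parents only once. */
static const struct frame *get_frame(const char *dir)
{
	struct frame *f;

	assert(dir != NULL);
	for (f = cache; f; f = f->next_cached)
		if (!strcmp(f->dir, dir))
			return f;       /* already built: reuse it read-only */

	f = calloc(1, sizeof(*f));
	if (!f)
		abort();
	f->dir = copy_prefix(dir, strlen(dir));
	if (*dir) {
		/* The parent of "a/b" is "a"; the parent of "a" is "" (root). */
		const char *slash = strrchr(dir, '/');
		char *parent = copy_prefix(dir, slash ? (size_t)(slash - dir) : 0);

		f->prev = get_frame(parent);
		free(parent);
	}
	f->next_cached = cache;
	cache = f;
	return f;
}
```

Two queries for paths under "a/b" and "a/c" get distinct top frames, but
both chains continue through the single cached frame for "a", which is the
property that lets the lower part of the stack be shared.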

I took many of the first patches of this series as is from the series Stefan
prepared, and as such they may only need a cursory glance.  I did modify some
of the later patches authored by Junio to address a couple of naming changes
and to redistribute some code between patches, so those patches need a closer
look.

Thanks again to Junio and Stefan for all the work they put into this before I
got a hold of it.

Any comments are appreciated!

Thanks,
Brandon Williams

Brandon Williams (8):
  attr: pass struct attr_check to collect_some_attrs
  attr: use hashmap for attribute dictionary
  attr: eliminate global check_all_attr array
  attr: remove maybe-real, maybe-macro from git_attr
  attr: tighten const correctness with git_attr and match_attr
  attr: store attribute stacks in hashmap
  attr: push the bare repo check into read_attr()
  attr: reformat git_attr_set_direction() function

Junio C Hamano (17):
  commit.c: use strchrnul() to scan for one line
  attr.c: use strchrnul() to scan for one line
  attr.c: update a stale comment on "struct match_attr"
  attr.c: explain the lack of attr-name syntax check in parse_attr()
  attr.c: complete a sentence in a comment
  attr.c: mark where #if DEBUG ends more clearly
  attr.c: simplify macroexpand_one()
  attr.c: tighten constness around "git_attr" structure
  attr.c: plug small leak in parse_attr_line()
  attr.c: add push_stack() helper
  attr.c: outline the future plans by heavily commenting
  attr: rename function and struct related to checking attributes
  attr: (re)introduce git_check_attr() and struct attr_check
  attr: convert git_all_attrs() to use "struct attr_check"
  attr: convert git_check_attrs() callers to use the new API
  attr: retire git_check_attrs() API
  attr: change validity check for attribute names to use positive logic

Nguyễn Thái Ngọc Duy (1):
  attr: support quoting pathname patterns in C style

Stefan Beller (1):
  Documentation: fix a typo

 Documentation/gitattributes.txt               |  10 +-
 Documentation/technical/api-gitattributes.txt |  86 ++-
 archive.c                                     |  24 +-
 attr.c                                        | 932 +++++++++++++++++---------
 attr.h                                        |  50 +-
 builtin/check-attr.c                          |  66 +-
 builtin/pack-objects.c                        |  19 +-
 commit.c                                      |   3 +-
 common-main.c                                 |   3 +
 convert.c                                     |  25 +-
 ll-merge.c                                    |  33 +-
 t/t0003-attributes.sh                         |  26 +
 userdiff.c                                    |  19 +-
 ws.c                                          |  19 +-
 14 files changed, 834 insertions(+), 481 deletions(-)

-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 01/27] commit.c: use strchrnul() to scan for one line
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 02/27] attr.c: " Brandon Williams
                   ` (27 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 commit.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
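
For readers unfamiliar with strchrnul(): it behaves like strchr(), except
that it returns a pointer to the terminating NUL instead of NULL when the
character is not found, which is exactly what the hand-written scan loop
computed.  A minimal sketch, using a portable stand-in so it does not depend
on the C library providing the function (strchrnul_sketch and first_line_len
are hypothetical names, not part of this patch):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Portable stand-in for strchrnul(): scan until c or the end of the
 * string, and return a pointer to whichever comes first.
 */
static const char *strchrnul_sketch(const char *s, int c)
{
	assert(s != NULL);
	while (*s && *s != (char)c)
		s++;
	return s;
}

/* Length of the first line of buf, not counting the newline. */
static size_t first_line_len(const char *buf)
{
	return (size_t)(strchrnul_sketch(buf, '\n') - buf);
}
```

Because the result is never NULL, the caller can subtract pointers or test
the pointed-to character without an extra check.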

diff --git a/commit.c b/commit.c
index 2cf85158b..0c4ee3de4 100644
--- a/commit.c
+++ b/commit.c
@@ -415,8 +415,7 @@ int find_commit_subject(const char *commit_buffer, const char **subject)
 		p++;
 	if (*p) {
 		p = skip_blank_lines(p + 2);
-		for (eol = p; *eol && *eol != '\n'; eol++)
-			; /* do nothing */
+		eol = strchrnul(p, '\n');
 	} else
 		eol = p;
 
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 02/27] attr.c: use strchrnul() to scan for one line
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
  2017-01-12 23:53 ` [PATCH 01/27] commit.c: use strchrnul() to scan for one line Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 03/27] attr.c: update a stale comment on "struct match_attr" Brandon Williams
                   ` (26 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/attr.c b/attr.c
index 1fcf042b8..04d24334e 100644
--- a/attr.c
+++ b/attr.c
@@ -402,8 +402,8 @@ static struct attr_stack *read_attr_from_index(const char *path, int macro_ok)
 	for (sp = buf; *sp; ) {
 		char *ep;
 		int more;
-		for (ep = sp; *ep && *ep != '\n'; ep++)
-			;
+
+		ep = strchrnul(sp, '\n');
 		more = (*ep == '\n');
 		*ep = '\0';
 		handle_attr_line(res, sp, path, ++lineno, macro_ok);
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 03/27] attr.c: update a stale comment on "struct match_attr"
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
  2017-01-12 23:53 ` [PATCH 01/27] commit.c: use strchrnul() to scan for one line Brandon Williams
  2017-01-12 23:53 ` [PATCH 02/27] attr.c: " Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr() Brandon Williams
                   ` (25 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

When 82dce998 (attr: more matching optimizations from .gitignore,
2012-10-15) changed a pointer to a string "*pattern" into an
embedded "struct pattern" in struct match_attr, it forgot to update
the comment that describes the structure.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index 04d24334e..007f1a299 100644
--- a/attr.c
+++ b/attr.c
@@ -131,9 +131,8 @@ struct pattern {
  * If is_macro is true, then u.attr is a pointer to the git_attr being
  * defined.
  *
- * If is_macro is false, then u.pattern points at the filename pattern
- * to which the rule applies.  (The memory pointed to is part of the
- * memory block allocated for the match_attr instance.)
+ * If is_macro is false, then u.pat is the filename pattern to which the
+ * rule applies.
  *
  * In either case, num_attr is the number of attributes affected by
  * this rule, and state is an array listing them.  The attributes are
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr()
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (2 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 03/27] attr.c: update a stale comment on "struct match_attr" Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 05/27] attr.c: complete a sentence in a comment Brandon Williams
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 6 ++++++
 1 file changed, 6 insertions(+)
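
The comment added below relies on a two-pass pattern: the same parser is
first run with a NULL output to validate and count, then run again to fill
the allocated slots, so the second pass may skip validation.  A rough sketch
of that pattern on a simpler input (a list of unsigned integers), with
hypothetical names parse_token() and parse_list() that do not exist in
attr.c:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Parse one unsigned integer at cp.  With out == NULL (first pass) the
 * token is validated and skipped; with out != NULL (second pass) the
 * value is stored and no validation is repeated.  Returns the position
 * after the token and any following spaces, or NULL on a bad token.
 */
static const char *parse_token(const char *cp, int *out)
{
	int v = 0;

	if (!out && !isdigit((unsigned char)*cp))
		return NULL;
	while (isdigit((unsigned char)*cp))
		v = v * 10 + (*cp++ - '0');
	while (*cp == ' ')
		cp++;
	if (out)
		*out = v;
	return cp;
}

/* Returns a malloc'd array of *num values, or NULL on a parse error. */
static int *parse_list(const char *line, size_t *num)
{
	const char *cp;
	size_t n, i;
	int *res;

	assert(line && num);
	/* First pass: count the tokens and reject bad input. */
	for (cp = line, n = 0; *cp; n++) {
		cp = parse_token(cp, NULL);
		if (!cp)
			return NULL;
	}
	res = malloc((n ? n : 1) * sizeof(*res));
	if (!res)
		return NULL;
	/* Second pass: the input is known to be good, so just fill. */
	for (cp = line, i = 0; *cp; i++)
		cp = parse_token(cp, &res[i]);
	*num = n;
	return res;
}
```

The sketch ignores integer overflow and leading blanks; it is only meant to
show why a check made in the first pass need not be repeated in the second.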

diff --git a/attr.c b/attr.c
index 007f1a299..6b55a57ef 100644
--- a/attr.c
+++ b/attr.c
@@ -183,6 +183,12 @@ static const char *parse_attr(const char *src, int lineno, const char *cp,
 			return NULL;
 		}
 	} else {
+		/*
+		 * As this function is always called twice, once with
+		 * e == NULL in the first pass and then e != NULL in
+		 * the second pass, no need for invalid_attr_name()
+		 * check here.
+		 */
 		if (*cp == '-' || *cp == '!') {
 			e->setto = (*cp == '-') ? ATTR__FALSE : ATTR__UNSET;
 			cp++;
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 05/27] attr.c: complete a sentence in a comment
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (3 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr() Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 06/27] attr.c: mark where #if DEBUG ends more clearly Brandon Williams
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 6b55a57ef..9bdf87a6f 100644
--- a/attr.c
+++ b/attr.c
@@ -300,7 +300,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
  * directory (again, reading the file from top to bottom) down to the
  * current directory, and then scan the list backwards to find the first match.
  * This is exactly the same as what is_excluded() does in dir.c to deal with
- * .gitignore
+ * .gitignore file and info/excludes file as a fallback.
  */
 
 static struct attr_stack {
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 06/27] attr.c: mark where #if DEBUG ends more clearly
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (4 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 05/27] attr.c: complete a sentence in a comment Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 07/27] attr.c: simplify macroexpand_one() Brandon Williams
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 9bdf87a6f..17297fffe 100644
--- a/attr.c
+++ b/attr.c
@@ -469,7 +469,7 @@ static void debug_set(const char *what, const char *match, struct git_attr *attr
 #define debug_push(a) do { ; } while (0)
 #define debug_pop(a) do { ; } while (0)
 #define debug_set(a,b,c,d) do { ; } while (0)
-#endif
+#endif /* DEBUG_ATTR */
 
 static void drop_attr_stack(void)
 {
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 07/27] attr.c: simplify macroexpand_one()
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (5 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 06/27] attr.c: mark where #if DEBUG ends more clearly Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 08/27] attr.c: tighten constness around "git_attr" structure Brandon Williams
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

The double-loop wants to do an early return immediately when one
matching macro is found.  Eliminate the extra variable 'a' used for
that purpose and rewrite the "assign the found item to 'a' to make
it non-NULL and force the loop(s) to terminate" with a direct return
from there.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)
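
The shape of the change, returning directly from the inner loop instead of
setting a flag variable that both loop conditions test, can be shown on a
reduced example.  This is a sketch only; struct frame and find_newest() are
hypothetical stand-ins for the attr_stack walk and are not taken from attr.c:

```c
#include <assert.h>
#include <stddef.h>

struct frame {                  /* hypothetical stand-in for struct attr_stack */
	const int *ids;
	int num;
	const struct frame *prev;
};

/*
 * Walk the stack from the top and each frame from its last entry, and
 * return the index of id in the first frame that contains it, or -1.
 */
static int find_newest(const struct frame *top, int id)
{
	const struct frame *f;
	int i;

	for (f = top; f; f = f->prev) {
		assert(f->num >= 0);
		for (i = f->num - 1; 0 <= i; i--) {
			if (f->ids[i] == id)
				return i;       /* leaves both loops at once */
		}
	}
	return -1;
}
```

With the early return, neither loop condition has to carry a "found"
test, and the not-found case is the only one that reaches the end.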

diff --git a/attr.c b/attr.c
index 17297fffe..e42f931b3 100644
--- a/attr.c
+++ b/attr.c
@@ -705,24 +705,21 @@ static int fill(const char *path, int pathlen, int basename_offset,
 static int macroexpand_one(int nr, int rem)
 {
 	struct attr_stack *stk;
-	struct match_attr *a = NULL;
 	int i;
 
 	if (check_all_attr[nr].value != ATTR__TRUE ||
 	    !check_all_attr[nr].attr->maybe_macro)
 		return rem;
 
-	for (stk = attr_stack; !a && stk; stk = stk->prev)
-		for (i = stk->num_matches - 1; !a && 0 <= i; i--) {
+	for (stk = attr_stack; stk; stk = stk->prev) {
+		for (i = stk->num_matches - 1; 0 <= i; i--) {
 			struct match_attr *ma = stk->attrs[i];
 			if (!ma->is_macro)
 				continue;
 			if (ma->u.attr->attr_nr == nr)
-				a = ma;
+				return fill_one("expand", ma, rem);
 		}
-
-	if (a)
-		rem = fill_one("expand", a, rem);
+	}
 
 	return rem;
 }
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 08/27] attr.c: tighten constness around "git_attr" structure
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (6 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 07/27] attr.c: simplify macroexpand_one() Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 09/27] attr.c: plug small leak in parse_attr_line() Brandon Williams
                   ` (20 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

It holds an interned string, and git_attr_name() is a way to peek
into it.  Make sure the involved pointer types are pointer-to-const.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 2 +-
 attr.h | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index e42f931b3..f7cf7ae30 100644
--- a/attr.c
+++ b/attr.c
@@ -43,7 +43,7 @@ static int cannot_trust_maybe_real;
 static struct git_attr_check *check_all_attr;
 static struct git_attr *(git_attr_hash[HASHSIZE]);
 
-char *git_attr_name(struct git_attr *attr)
+const char *git_attr_name(const struct git_attr *attr)
 {
 	return attr->name;
 }
diff --git a/attr.h b/attr.h
index 8b08d33af..00d7a662c 100644
--- a/attr.h
+++ b/attr.h
@@ -25,7 +25,7 @@ extern const char git_attr__false[];
  * Unset one is returned as NULL.
  */
 struct git_attr_check {
-	struct git_attr *attr;
+	const struct git_attr *attr;
 	const char *value;
 };
 
@@ -34,7 +34,7 @@ struct git_attr_check {
  * return value is a pointer to a null-delimited string that is part
  * of the internal data structure; it should not be modified or freed.
  */
-char *git_attr_name(struct git_attr *);
+extern const char *git_attr_name(const struct git_attr *);
 
 int git_check_attr(const char *path, int, struct git_attr_check *);
 
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 09/27] attr.c: plug small leak in parse_attr_line()
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (7 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 08/27] attr.c: tighten constness around "git_attr" structure Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 10/27] attr: support quoting pathname patterns in C style Brandon Williams
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

If any error is noticed after the match_attr structure is allocated,
we shouldn't just return NULL from this function.

Add a fail_return label that frees the allocated structure and
returns NULL, and consistently jump there when we want to return
NULL after cleaning up.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)
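
The single-exit cleanup pattern this patch introduces can be sketched on a
smaller parser.  The code is illustrative and makes its own assumptions:
struct rule, parse_rule() and free_rule() are hypothetical and are not the
functions touched by the diff below.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct rule {                   /* hypothetical stand-in for struct match_attr */
	char *name;
	int value;
};

/*
 * Parse "name=<digit>" into a newly allocated rule.  Every failure,
 * before or after the allocation, leaves through fail_return so that
 * nothing allocated so far is leaked.
 */
static struct rule *parse_rule(const char *line)
{
	struct rule *res = NULL;
	const char *eq;
	size_t namelen;

	assert(line != NULL);
	eq = strchr(line, '=');
	if (!eq || eq == line)
		goto fail_return;       /* res is still NULL here */
	namelen = (size_t)(eq - line);

	res = calloc(1, sizeof(*res));
	if (!res)
		goto fail_return;
	res->name = malloc(namelen + 1);
	if (!res->name)
		goto fail_return;
	memcpy(res->name, line, namelen);
	res->name[namelen] = '\0';

	if (eq[1] < '0' || '9' < eq[1] || eq[2] != '\0')
		goto fail_return;       /* error noticed after allocating */
	res->value = eq[1] - '0';
	return res;

fail_return:
	if (res)
		free(res->name);
	free(res);                      /* free(NULL) is a no-op */
	return NULL;
}

static void free_rule(struct rule *r)
{
	if (r) {
		free(r->name);
		free(r);
	}
}
```

Initializing the result pointer to NULL is what allows the early failure
paths to share the same label as the late ones.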

diff --git a/attr.c b/attr.c
index f7cf7ae30..d180c7833 100644
--- a/attr.c
+++ b/attr.c
@@ -223,7 +223,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		if (!macro_ok) {
 			fprintf(stderr, "%s not allowed: %s:%d\n",
 				name, src, lineno);
-			return NULL;
+			goto fail_return;
 		}
 		is_macro = 1;
 		name += strlen(ATTRIBUTE_MACRO_PREFIX);
@@ -233,7 +233,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 			fprintf(stderr,
 				"%.*s is not a valid attribute name: %s:%d\n",
 				namelen, name, src, lineno);
-			return NULL;
+			goto fail_return;
 		}
 	}
 	else
@@ -246,7 +246,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	for (cp = states, num_attr = 0; *cp; num_attr++) {
 		cp = parse_attr(src, lineno, cp, NULL);
 		if (!cp)
-			return NULL;
+			goto fail_return;
 	}
 
 	res = xcalloc(1,
@@ -267,7 +267,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		if (res->u.pat.flags & EXC_FLAG_NEGATIVE) {
 			warning(_("Negative patterns are ignored in git attributes\n"
 				  "Use '\\!' for literal leading exclamation."));
-			return NULL;
+			goto fail_return;
 		}
 	}
 	res->is_macro = is_macro;
@@ -283,6 +283,10 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	}
 
 	return res;
+
+fail_return:
+	free(res);
+	return NULL;
 }
 
 /*
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 10/27] attr: support quoting pathname patterns in C style
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (8 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 09/27] attr.c: plug small leak in parse_attr_line() Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 11/27] attr.c: add push_stack() helper Brandon Williams
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, gitster, sbeller,
	Brandon Williams

From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

Full pattern must be quoted. So 'pat"t"ern attr' will give exactly
'pat"t"ern', not 'pattern'. Also clarify that leading whitespaces are
not part of the pattern and document comment syntax.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/gitattributes.txt |  8 +++++---
 attr.c                          | 15 +++++++++++++--
 t/t0003-attributes.sh           | 26 ++++++++++++++++++++++++++
 3 files changed, 44 insertions(+), 5 deletions(-)
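
To illustrate the rule in the commit message (quoting applies only when the
pattern begins with a double quote, and an unterminated quote is not
accepted), here is a deliberately simplified unquoter.  It is a sketch, not
git's unquote_c_style(): unquote_sketch is a hypothetical name, it writes
into a caller-supplied buffer rather than a strbuf, and it understands only
the \" and \\ escapes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * On success, store the unquoted text in out, set *end to the position
 * just past the closing quote, and return 0.  Return -1 if the input
 * does not start with a quote, has no closing quote, uses an escape
 * this sketch does not know, or does not fit in out.
 */
static int unquote_sketch(char *out, size_t outsz, const char *in,
			  const char **end)
{
	size_t n = 0;

	assert(out && in && end && outsz > 0);
	if (*in++ != '"')
		return -1;
	while (*in && *in != '"') {
		if (*in == '\\') {
			in++;
			if (*in != '"' && *in != '\\')
				return -1;
		}
		if (n + 1 >= outsz)
			return -1;
		out[n++] = *in++;
	}
	if (*in != '"')
		return -1;              /* open-quoted: no closing quote */
	out[n] = '\0';
	*end = in + 1;
	return 0;
}
```

A caller following the patch's approach would fall back to splitting on
whitespace whenever this returns -1, which is why a quote in the middle of
an unquoted pattern is kept literally.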

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index e0b66c122..3173dee7e 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -21,9 +21,11 @@ Each line in `gitattributes` file is of form:
 	pattern	attr1 attr2 ...
 
 That is, a pattern followed by an attributes list,
-separated by whitespaces.  When the pattern matches the
-path in question, the attributes listed on the line are given to
-the path.
+separated by whitespaces. Leading and trailing whitespaces are
+ignored. Lines that begin with '#' are ignored. Patterns
+that begin with a double quote are quoted in C style.
+When the pattern matches the path in question, the attributes
+listed on the line are given to the path.
 
 Each attribute can be in one of these states for a given path:
 
diff --git a/attr.c b/attr.c
index d180c7833..e1c630f79 100644
--- a/attr.c
+++ b/attr.c
@@ -13,6 +13,7 @@
 #include "attr.h"
 #include "dir.h"
 #include "utf8.h"
+#include "quote.h"
 
 const char git_attr__true[] = "(builtin)true";
 const char git_attr__false[] = "\0(builtin)false";
@@ -212,12 +213,21 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	const char *cp, *name, *states;
 	struct match_attr *res = NULL;
 	int is_macro;
+	struct strbuf pattern = STRBUF_INIT;
 
 	cp = line + strspn(line, blank);
 	if (!*cp || *cp == '#')
 		return NULL;
 	name = cp;
-	namelen = strcspn(name, blank);
+
+	if (*cp == '"' && !unquote_c_style(&pattern, name, &states)) {
+		name = pattern.buf;
+		namelen = pattern.len;
+	} else {
+		namelen = strcspn(name, blank);
+		states = name + namelen;
+	}
+
 	if (strlen(ATTRIBUTE_MACRO_PREFIX) < namelen &&
 	    starts_with(name, ATTRIBUTE_MACRO_PREFIX)) {
 		if (!macro_ok) {
@@ -239,7 +249,6 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	else
 		is_macro = 0;
 
-	states = name + namelen;
 	states += strspn(states, blank);
 
 	/* First pass to count the attr_states */
@@ -282,9 +291,11 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 			cannot_trust_maybe_real = 1;
 	}
 
+	strbuf_release(&pattern);
 	return res;
 
 fail_return:
+	strbuf_release(&pattern);
 	free(res);
 	return NULL;
 }
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index f0fbb4255..f19ae4f8c 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -13,10 +13,31 @@ attr_check () {
 	test_line_count = 0 err
 }
 
+attr_check_quote () {
+
+	path="$1"
+	quoted_path="$2"
+	expect="$3"
+
+	git check-attr test -- "$path" >actual &&
+	echo "\"$quoted_path\": test: $expect" >expect &&
+	test_cmp expect actual
+
+}
+
+test_expect_success 'open-quoted pathname' '
+	echo "\"a test=a" >.gitattributes &&
+	test_must_fail attr_check a a
+'
+
+
 test_expect_success 'setup' '
 	mkdir -p a/b/d a/c b &&
 	(
 		echo "[attr]notest !test"
+		echo "\" d \"	test=d"
+		echo " e	test=e"
+		echo " e\"	test=e"
 		echo "f	test=f"
 		echo "a/i test=a/i"
 		echo "onoff test -test"
@@ -69,6 +90,11 @@ test_expect_success 'command line checks' '
 '
 
 test_expect_success 'attribute test' '
+
+	attr_check " d " d &&
+	attr_check e e &&
+	attr_check_quote e\" e\\\" e &&
+
 	attr_check f f &&
 	attr_check a/f f &&
 	attr_check a/c/f f &&
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 11/27] attr.c: add push_stack() helper
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (9 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 10/27] attr: support quoting pathname patterns in C style Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 12/27] Documentation: fix a typo Brandon Williams
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

There are too many repetitious "I have this new attr_stack element;
push it at the top of the stack" sequences.  The new helper function
push_stack() gives us a way to express what is going on at these
places, and as a side effect, halves the number of times we mention
the attr_stack global variable.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 71 +++++++++++++++++++++++++++++++-----------------------------------
 1 file changed, 33 insertions(+), 38 deletions(-)
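
The helper's contract (a NULL element is skipped, and originlen is only
recorded when an origin is given) can be exercised in isolation.  The sketch
below mirrors the helper in the diff on a reduced structure; struct frame and
push_frame() are hypothetical names chosen so the example compiles on its
own.

```c
#include <assert.h>
#include <stddef.h>

struct frame {                  /* hypothetical, reduced struct attr_stack */
	struct frame *prev;
	const char *origin;
	size_t originlen;
};

/* Push elem on top of *stack_p; a NULL elem is silently skipped. */
static void push_frame(struct frame **stack_p, struct frame *elem,
		       const char *origin, size_t originlen)
{
	assert(stack_p != NULL);
	if (!elem)
		return;
	elem->origin = origin;
	if (origin)
		elem->originlen = originlen;
	elem->prev = *stack_p;
	*stack_p = elem;
}
```

Accepting NULL lets callers pass the result of a reader that may find no
file straight to the helper without a separate check at each call site.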

diff --git a/attr.c b/attr.c
index e1c630f79..8026d68bd 100644
--- a/attr.c
+++ b/attr.c
@@ -510,6 +510,18 @@ static int git_attr_system(void)
 
 static GIT_PATH_FUNC(git_path_info_attributes, INFOATTRIBUTES_FILE)
 
+static void push_stack(struct attr_stack **attr_stack_p,
+		       struct attr_stack *elem, char *origin, size_t originlen)
+{
+	if (elem) {
+		elem->origin = origin;
+		if (origin)
+			elem->originlen = originlen;
+		elem->prev = *attr_stack_p;
+		*attr_stack_p = elem;
+	}
+}
+
 static void bootstrap_attr_stack(void)
 {
 	struct attr_stack *elem;
@@ -517,37 +529,23 @@ static void bootstrap_attr_stack(void)
 	if (attr_stack)
 		return;
 
-	elem = read_attr_from_array(builtin_attr);
-	elem->origin = NULL;
-	elem->prev = attr_stack;
-	attr_stack = elem;
-
-	if (git_attr_system()) {
-		elem = read_attr_from_file(git_etc_gitattributes(), 1);
-		if (elem) {
-			elem->origin = NULL;
-			elem->prev = attr_stack;
-			attr_stack = elem;
-		}
-	}
+	push_stack(&attr_stack, read_attr_from_array(builtin_attr), NULL, 0);
+
+	if (git_attr_system())
+		push_stack(&attr_stack,
+			   read_attr_from_file(git_etc_gitattributes(), 1),
+			   NULL, 0);
 
 	if (!git_attributes_file)
 		git_attributes_file = xdg_config_home("attributes");
-	if (git_attributes_file) {
-		elem = read_attr_from_file(git_attributes_file, 1);
-		if (elem) {
-			elem->origin = NULL;
-			elem->prev = attr_stack;
-			attr_stack = elem;
-		}
-	}
+	if (git_attributes_file)
+		push_stack(&attr_stack,
+			   read_attr_from_file(git_attributes_file, 1),
+			   NULL, 0);
 
 	if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
 		elem = read_attr(GITATTRIBUTES_FILE, 1);
-		elem->origin = xstrdup("");
-		elem->originlen = 0;
-		elem->prev = attr_stack;
-		attr_stack = elem;
+		push_stack(&attr_stack, elem, xstrdup(""), 0);
 		debug_push(elem);
 	}
 
@@ -558,15 +556,12 @@ static void bootstrap_attr_stack(void)
 
 	if (!elem)
 		elem = xcalloc(1, sizeof(*elem));
-	elem->origin = NULL;
-	elem->prev = attr_stack;
-	attr_stack = elem;
+	push_stack(&attr_stack, elem, NULL, 0);
 }
 
 static void prepare_attr_stack(const char *path, int dirlen)
 {
 	struct attr_stack *elem, *info;
-	int len;
 	const char *cp;
 
 	/*
@@ -626,20 +621,21 @@ static void prepare_attr_stack(const char *path, int dirlen)
 
 		assert(attr_stack->origin);
 		while (1) {
-			len = strlen(attr_stack->origin);
+			size_t len = strlen(attr_stack->origin);
+			char *origin;
+
 			if (dirlen <= len)
 				break;
 			cp = memchr(path + len + 1, '/', dirlen - len - 1);
 			if (!cp)
 				cp = path + dirlen;
-			strbuf_add(&pathbuf, path, cp - path);
-			strbuf_addch(&pathbuf, '/');
-			strbuf_addstr(&pathbuf, GITATTRIBUTES_FILE);
+			strbuf_addf(&pathbuf,
+				    "%.*s/%s", (int)(cp - path), path,
+				    GITATTRIBUTES_FILE);
 			elem = read_attr(pathbuf.buf, 0);
 			strbuf_setlen(&pathbuf, cp - path);
-			elem->origin = strbuf_detach(&pathbuf, &elem->originlen);
-			elem->prev = attr_stack;
-			attr_stack = elem;
+			origin = strbuf_detach(&pathbuf, &len);
+			push_stack(&attr_stack, elem, origin, len);
 			debug_push(elem);
 		}
 
@@ -649,8 +645,7 @@ static void prepare_attr_stack(const char *path, int dirlen)
 	/*
 	 * Finally push the "info" one at the top of the stack.
 	 */
-	info->prev = attr_stack;
-	attr_stack = info;
+	push_stack(&attr_stack, info, NULL, 0);
 }
 
 static int path_matches(const char *pathname, int pathlen,
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 12/27] Documentation: fix a typo
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (10 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 11/27] attr.c: add push_stack() helper Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 13/27] attr.c: outline the future plans by heavily commenting Brandon Williams
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller, gitster, pclouds, Brandon Williams

From: Stefan Beller <sbeller@google.com>

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/gitattributes.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 3173dee7e..a53d093ca 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -88,7 +88,7 @@ is either not set or empty, $HOME/.config/git/attributes is used instead.
 Attributes for all users on a system should be placed in the
 `$(prefix)/etc/gitattributes` file.
 
-Sometimes you would need to override an setting of an attribute
+Sometimes you would need to override a setting of an attribute
 for a path to `Unspecified` state.  This can be done by listing
 the name of the attribute prefixed with an exclamation point `!`.
 
-- 
2.11.0.390.gc69c2f50cf-goog


* [PATCH 13/27] attr.c: outline the future plans by heavily commenting
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 8026d68bd..50e5ee393 100644
--- a/attr.c
+++ b/attr.c
@@ -30,6 +30,11 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 #define DEBUG_ATTR 0
 #endif
 
+/*
+ * NEEDSWORK: the global dictionary of the interned attributes
+ * must stay a singleton even after we become thread-ready.
+ * Access to these must be surrounded with mutex when it happens.
+ */
 struct git_attr {
 	struct git_attr *next;
 	unsigned h;
@@ -39,10 +44,19 @@ struct git_attr {
 	char name[FLEX_ARRAY];
 };
 static int attr_nr;
+static struct git_attr *(git_attr_hash[HASHSIZE]);
+
+/*
+ * NEEDSWORK: maybe-real, maybe-macro are not properties of
+ * an attribute, as they depend on which .gitattributes files
+ * are read.  Once we introduce per git_attr_check attr_stack
+ * and check_all_attr, the optimization based on them will
+ * become unnecessary and can go away.  So is this variable.
+ */
 static int cannot_trust_maybe_real;
 
+/* NEEDSWORK: This will become per git_attr_check */
 static struct git_attr_check *check_all_attr;
-static struct git_attr *(git_attr_hash[HASHSIZE]);
 
 const char *git_attr_name(const struct git_attr *attr)
 {
@@ -102,6 +116,11 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 	a->maybe_real = 0;
 	git_attr_hash[pos] = a;
 
+	/*
+	 * NEEDSWORK: per git_attr_check check_all_attr
+	 * will be initialized a lot more lazily, not
+	 * like this, and not here.
+	 */
 	REALLOC_ARRAY(check_all_attr, attr_nr);
 	check_all_attr[a->attr_nr].attr = a;
 	check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
@@ -318,6 +337,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
  * .gitignore file and info/excludes file as a fallback.
  */
 
+/* NEEDSWORK: This will become per git_attr_check */
 static struct attr_stack {
 	struct attr_stack *prev;
 	char *origin;
@@ -382,6 +402,24 @@ static struct attr_stack *read_attr_from_array(const char **list)
 	return res;
 }
 
+/*
+ * NEEDSWORK: these two are tricky.  The callers assume there is a
+ * single, system-wide global state "where do we read attributes
+ * from?" and when the state is flipped by calling
+ * git_attr_set_direction(), attr_stack is discarded so that
+ * subsequent attr_check will lazily read from the right place.  They
+ * do not know or care which of the functions they call use the
+ * attribute subsystem, hence have no knowledge of existing
+ * git_attr_check instances (or future ones that will be created).
+ *
+ * Probably we need a thread_local that holds these two variables,
+ * and a list of git_attr_check instances (which needs to be maintained
+ * by hooking into git_attr_check_alloc(), git_attr_check_initl(), and
+ * git_attr_check_clear()).  Then git_attr_set_direction() updates the
+ * fields in that thread_local for these two variables, iterates over
+ * all the active git_attr_check instances and discards the attr_stack
+ * they hold.  Yuck, but it sounds doable.
+ */
 static enum git_attr_direction direction;
 static struct index_state *use_index;
 
-- 
2.11.0.390.gc69c2f50cf-goog


* [PATCH 14/27] attr: rename function and struct related to checking attributes
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

The traditional API to check attributes is to prepare an N-element
array of "struct git_attr_check" and pass N and the array to the
function "git_check_attr()" as arguments.

In preparation to revamp the API to pass a single structure, in
which these N elements are held, rename the type used for these
individual array elements to "struct attr_check_item" and rename
the function to "git_check_attrs()".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 archive.c              |  6 +++---
 attr.c                 | 12 ++++++------
 attr.h                 |  8 ++++----
 builtin/check-attr.c   | 19 ++++++++++---------
 builtin/pack-objects.c |  6 +++---
 convert.c              | 12 ++++++------
 ll-merge.c             | 10 +++++-----
 userdiff.c             |  4 ++--
 ws.c                   |  6 +++---
 9 files changed, 42 insertions(+), 41 deletions(-)

diff --git a/archive.c b/archive.c
index 01751e574..b76bd4691 100644
--- a/archive.c
+++ b/archive.c
@@ -87,7 +87,7 @@ void *sha1_file_to_archive(const struct archiver_args *args,
 	return buffer;
 }
 
-static void setup_archive_check(struct git_attr_check *check)
+static void setup_archive_check(struct attr_check_item *check)
 {
 	static struct git_attr *attr_export_ignore;
 	static struct git_attr *attr_export_subst;
@@ -123,7 +123,7 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 	struct archiver_context *c = context;
 	struct archiver_args *args = c->args;
 	write_archive_entry_fn_t write_entry = c->write_entry;
-	struct git_attr_check check[2];
+	struct attr_check_item check[2];
 	const char *path_without_prefix;
 	int err;
 
@@ -138,7 +138,7 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 	path_without_prefix = path.buf + args->baselen;
 
 	setup_archive_check(check);
-	if (!git_check_attr(path_without_prefix, ARRAY_SIZE(check), check)) {
+	if (!git_check_attrs(path_without_prefix, ARRAY_SIZE(check), check)) {
 		if (ATTR_TRUE(check[0].value))
 			return 0;
 		args->convert = ATTR_TRUE(check[1].value);
diff --git a/attr.c b/attr.c
index 50e5ee393..2f180d609 100644
--- a/attr.c
+++ b/attr.c
@@ -56,7 +56,7 @@ static struct git_attr *(git_attr_hash[HASHSIZE]);
 static int cannot_trust_maybe_real;
 
 /* NEEDSWORK: This will become per git_attr_check */
-static struct git_attr_check *check_all_attr;
+static struct attr_check_item *check_all_attr;
 
 const char *git_attr_name(const struct git_attr *attr)
 {
@@ -713,7 +713,7 @@ static int macroexpand_one(int attr_nr, int rem);
 
 static int fill_one(const char *what, struct match_attr *a, int rem)
 {
-	struct git_attr_check *check = check_all_attr;
+	struct attr_check_item *check = check_all_attr;
 	int i;
 
 	for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) {
@@ -778,7 +778,7 @@ static int macroexpand_one(int nr, int rem)
  * collected. Otherwise all attributes are collected.
  */
 static void collect_some_attrs(const char *path, int num,
-			       struct git_attr_check *check)
+			       struct attr_check_item *check)
 
 {
 	struct attr_stack *stk;
@@ -806,7 +806,7 @@ static void collect_some_attrs(const char *path, int num,
 		rem = 0;
 		for (i = 0; i < num; i++) {
 			if (!check[i].attr->maybe_real) {
-				struct git_attr_check *c;
+				struct attr_check_item *c;
 				c = check_all_attr + check[i].attr->attr_nr;
 				c->value = ATTR__UNSET;
 				rem++;
@@ -821,7 +821,7 @@ static void collect_some_attrs(const char *path, int num,
 		rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-int git_check_attr(const char *path, int num, struct git_attr_check *check)
+int git_check_attrs(const char *path, int num, struct attr_check_item *check)
 {
 	int i;
 
@@ -837,7 +837,7 @@ int git_check_attr(const char *path, int num, struct git_attr_check *check)
 	return 0;
 }
 
-int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
+int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
 {
 	int i, count, j;
 
diff --git a/attr.h b/attr.h
index 00d7a662c..efc7bb3b3 100644
--- a/attr.h
+++ b/attr.h
@@ -20,11 +20,11 @@ extern const char git_attr__false[];
 #define ATTR_UNSET(v) ((v) == NULL)
 
 /*
- * Send one or more git_attr_check to git_check_attr(), and
+ * Send one or more attr_check_item to git_check_attrs(), and
  * each 'value' member tells what its value is.
  * Unset one is returned as NULL.
  */
-struct git_attr_check {
+struct attr_check_item {
 	const struct git_attr *attr;
 	const char *value;
 };
@@ -36,7 +36,7 @@ struct git_attr_check {
  */
 extern const char *git_attr_name(const struct git_attr *);
 
-int git_check_attr(const char *path, int, struct git_attr_check *);
+int git_check_attrs(const char *path, int, struct attr_check_item *);
 
 /*
  * Retrieve all attributes that apply to the specified path.  *num
@@ -45,7 +45,7 @@ int git_check_attr(const char *path, int, struct git_attr_check *);
  * objects describing the attributes and their values.  *check must be
  * free()ed by the caller.
  */
-int git_all_attrs(const char *path, int *num, struct git_attr_check **check);
+int git_all_attrs(const char *path, int *num, struct attr_check_item **check);
 
 enum git_attr_direction {
 	GIT_ATTR_CHECKIN,
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index 53a5a18c1..889264a5b 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -24,8 +24,8 @@ static const struct option check_attr_options[] = {
 	OPT_END()
 };
 
-static void output_attr(int cnt, struct git_attr_check *check,
-	const char *file)
+static void output_attr(int cnt, struct attr_check_item *check,
+			const char *file)
 {
 	int j;
 	for (j = 0; j < cnt; j++) {
@@ -51,14 +51,15 @@ static void output_attr(int cnt, struct git_attr_check *check,
 	}
 }
 
-static void check_attr(const char *prefix, int cnt,
-	struct git_attr_check *check, const char *file)
+static void check_attr(const char *prefix,
+		       int cnt, struct attr_check_item *check,
+		       const char *file)
 {
 	char *full_path =
 		prefix_path(prefix, prefix ? strlen(prefix) : 0, file);
 	if (check != NULL) {
-		if (git_check_attr(full_path, cnt, check))
-			die("git_check_attr died");
+		if (git_check_attrs(full_path, cnt, check))
+			die("git_check_attrs died");
 		output_attr(cnt, check, file);
 	} else {
 		if (git_all_attrs(full_path, &cnt, &check))
@@ -69,8 +70,8 @@ static void check_attr(const char *prefix, int cnt,
 	free(full_path);
 }
 
-static void check_attr_stdin_paths(const char *prefix, int cnt,
-	struct git_attr_check *check)
+static void check_attr_stdin_paths(const char *prefix,
+				   int cnt, struct attr_check_item *check)
 {
 	struct strbuf buf = STRBUF_INIT;
 	struct strbuf unquoted = STRBUF_INIT;
@@ -99,7 +100,7 @@ static NORETURN void error_with_usage(const char *msg)
 
 int cmd_check_attr(int argc, const char **argv, const char *prefix)
 {
-	struct git_attr_check *check;
+	struct attr_check_item *check;
 	int cnt, i, doubledash, filei;
 
 	if (!is_bare_repository())
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 8841f8b36..8b8fbd814 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -894,7 +894,7 @@ static void write_pack_file(void)
 			written, nr_result);
 }
 
-static void setup_delta_attr_check(struct git_attr_check *check)
+static void setup_delta_attr_check(struct attr_check_item *check)
 {
 	static struct git_attr *attr_delta;
 
@@ -906,10 +906,10 @@ static void setup_delta_attr_check(struct git_attr_check *check)
 
 static int no_try_delta(const char *path)
 {
-	struct git_attr_check check[1];
+	struct attr_check_item check[1];
 
 	setup_delta_attr_check(check);
-	if (git_check_attr(path, ARRAY_SIZE(check), check))
+	if (git_check_attrs(path, ARRAY_SIZE(check), check))
 		return 0;
 	if (ATTR_FALSE(check->value))
 		return 1;
diff --git a/convert.c b/convert.c
index 4e17e45ed..1b9829279 100644
--- a/convert.c
+++ b/convert.c
@@ -1028,7 +1028,7 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
 	return 1;
 }
 
-static enum crlf_action git_path_check_crlf(struct git_attr_check *check)
+static enum crlf_action git_path_check_crlf(struct attr_check_item *check)
 {
 	const char *value = check->value;
 
@@ -1045,7 +1045,7 @@ static enum crlf_action git_path_check_crlf(struct git_attr_check *check)
 	return CRLF_UNDEFINED;
 }
 
-static enum eol git_path_check_eol(struct git_attr_check *check)
+static enum eol git_path_check_eol(struct attr_check_item *check)
 {
 	const char *value = check->value;
 
@@ -1058,7 +1058,7 @@ static enum eol git_path_check_eol(struct git_attr_check *check)
 	return EOL_UNSET;
 }
 
-static struct convert_driver *git_path_check_convert(struct git_attr_check *check)
+static struct convert_driver *git_path_check_convert(struct attr_check_item *check)
 {
 	const char *value = check->value;
 	struct convert_driver *drv;
@@ -1071,7 +1071,7 @@ static struct convert_driver *git_path_check_convert(struct git_attr_check *chec
 	return NULL;
 }
 
-static int git_path_check_ident(struct git_attr_check *check)
+static int git_path_check_ident(struct attr_check_item *check)
 {
 	const char *value = check->value;
 
@@ -1093,7 +1093,7 @@ static const char *conv_attr_name[] = {
 static void convert_attrs(struct conv_attrs *ca, const char *path)
 {
 	int i;
-	static struct git_attr_check ccheck[NUM_CONV_ATTRS];
+	static struct attr_check_item ccheck[NUM_CONV_ATTRS];
 
 	if (!ccheck[0].attr) {
 		for (i = 0; i < NUM_CONV_ATTRS; i++)
@@ -1102,7 +1102,7 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
 		git_config(read_convert_config, NULL);
 	}
 
-	if (!git_check_attr(path, NUM_CONV_ATTRS, ccheck)) {
+	if (!git_check_attrs(path, NUM_CONV_ATTRS, ccheck)) {
 		ca->crlf_action = git_path_check_crlf(ccheck + 4);
 		if (ca->crlf_action == CRLF_UNDEFINED)
 			ca->crlf_action = git_path_check_crlf(ccheck + 0);
diff --git a/ll-merge.c b/ll-merge.c
index ad8be42f9..198f07aca 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -336,13 +336,13 @@ static const struct ll_merge_driver *find_ll_merge_driver(const char *merge_attr
 	return &ll_merge_drv[LL_TEXT_MERGE];
 }
 
-static int git_path_check_merge(const char *path, struct git_attr_check check[2])
+static int git_path_check_merge(const char *path, struct attr_check_item check[2])
 {
 	if (!check[0].attr) {
 		check[0].attr = git_attr("merge");
 		check[1].attr = git_attr("conflict-marker-size");
 	}
-	return git_check_attr(path, 2, check);
+	return git_check_attrs(path, 2, check);
 }
 
 static void normalize_file(mmfile_t *mm, const char *path)
@@ -362,7 +362,7 @@ int ll_merge(mmbuffer_t *result_buf,
 	     mmfile_t *theirs, const char *their_label,
 	     const struct ll_merge_options *opts)
 {
-	static struct git_attr_check check[2];
+	static struct attr_check_item check[2];
 	static const struct ll_merge_options default_opts;
 	const char *ll_driver_name = NULL;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
@@ -398,12 +398,12 @@ int ll_merge(mmbuffer_t *result_buf,
 
 int ll_merge_marker_size(const char *path)
 {
-	static struct git_attr_check check;
+	static struct attr_check_item check;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 
 	if (!check.attr)
 		check.attr = git_attr("conflict-marker-size");
-	if (!git_check_attr(path, 1, &check) && check.value) {
+	if (!git_check_attrs(path, 1, &check) && check.value) {
 		marker_size = atoi(check.value);
 		if (marker_size <= 0)
 			marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
diff --git a/userdiff.c b/userdiff.c
index 2125d6da2..b0b44467a 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -263,7 +263,7 @@ struct userdiff_driver *userdiff_find_by_name(const char *name) {
 struct userdiff_driver *userdiff_find_by_path(const char *path)
 {
 	static struct git_attr *attr;
-	struct git_attr_check check;
+	struct attr_check_item check;
 
 	if (!attr)
 		attr = git_attr("diff");
@@ -271,7 +271,7 @@ struct userdiff_driver *userdiff_find_by_path(const char *path)
 
 	if (!path)
 		return NULL;
-	if (git_check_attr(path, 1, &check))
+	if (git_check_attrs(path, 1, &check))
 		return NULL;
 
 	if (ATTR_TRUE(check.value))
diff --git a/ws.c b/ws.c
index ea4b2b1df..fbd876e84 100644
--- a/ws.c
+++ b/ws.c
@@ -71,7 +71,7 @@ unsigned parse_whitespace_rule(const char *string)
 	return rule;
 }
 
-static void setup_whitespace_attr_check(struct git_attr_check *check)
+static void setup_whitespace_attr_check(struct attr_check_item *check)
 {
 	static struct git_attr *attr_whitespace;
 
@@ -82,10 +82,10 @@ static void setup_whitespace_attr_check(struct git_attr_check *check)
 
 unsigned whitespace_rule(const char *pathname)
 {
-	struct git_attr_check attr_whitespace_rule;
+	struct attr_check_item attr_whitespace_rule;
 
 	setup_whitespace_attr_check(&attr_whitespace_rule);
-	if (!git_check_attr(pathname, 1, &attr_whitespace_rule)) {
+	if (!git_check_attrs(pathname, 1, &attr_whitespace_rule)) {
 		const char *value;
 
 		value = attr_whitespace_rule.value;
-- 
2.11.0.390.gc69c2f50cf-goog


* [PATCH 15/27] attr: (re)introduce git_check_attr() and struct attr_check
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

A common pattern to check N attributes for many paths is to

 (1) prepare an array A of N attr_check_item items;
 (2) call git_attr() to intern the N attribute names and fill A;
 (3) repeatedly call git_check_attrs() for each path with N and A.

A look-up for these N attributes for a single path P scans the
entire attr_stack, starting from the .git/info/attributes file and
then the .gitattributes file in the directory the path P is in, going
upwards through the .gitattributes files found in parent directories.
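
The scan described above can be sketched roughly as follows.  This is
a simplified illustration, not the code in attr.c: the sketch_* names
are hypothetical, a frame applies when its origin is a prefix of the
path (no pattern matching), and the first rule found in a frame wins.
It only shows how the walk goes from the top of the stack downwards and
stops early once nothing remains to be found.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the structures in attr.c. */
struct sketch_rule {
	const char *attr;
	const char *value;
};

struct sketch_frame {
	struct sketch_frame *prev;	/* frame further away from the path */
	const char *origin;		/* directory prefix; "" applies to all */
	const struct sketch_rule *rules;
	int nr_rules;
};

struct sketch_query {
	const char *attr;
	const char *value;		/* NULL means "not found" */
};

/*
 * Walk the stack from the top frame downwards.  The first applicable
 * frame that mentions an attribute decides its value.  Returns the
 * number of queried attributes that were not found anywhere.
 */
int sketch_lookup(const struct sketch_frame *top, const char *path,
		  struct sketch_query *q, int nr)
{
	const struct sketch_frame *f;
	int rem = nr, i, j;

	assert(path != NULL);
	for (i = 0; i < nr; i++)
		q[i].value = NULL;

	for (f = top; f && rem > 0; f = f->prev) {
		if (strncmp(path, f->origin, strlen(f->origin)))
			continue;	/* frame does not cover this path */
		for (i = 0; i < nr; i++) {
			if (q[i].value)
				continue;	/* already decided higher up */
			for (j = 0; j < f->nr_rules; j++) {
				if (!strcmp(f->rules[j].attr, q[i].attr)) {
					q[i].value = f->rules[j].value;
					rem--;
					break;
				}
			}
		}
	}
	return rem;
}
```

Every query pays for the whole walk unless all N attributes are
resolved early, which is why per-query data for skipping irrelevant
frames is attractive.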

An earlier commit 06a604e6 (attr: avoid heavy work when we know the
specified attr is not defined, 2014-12-28) tried to optimize out
this scanning for one trivial special case: when the attribute being
sought is known not to exist, we do not have to scan for it.  While
this may be a cheap and effective heuristic, it would not work well
when N is (much) more than 1.

What we would want is a more customized way to skip irrelevant
entries in the attribute stack, and the definition of irrelevance
is tied to the set of attributes passed to git_check_attrs() call,
i.e. the set of attributes being sought.  The data necessary for
this optimization needs to live alongside the set of attributes, but
a simple array of attr_check_item does not have any place
for that.

Introduce "struct attr_check" that contains N, the number of
attributes being sought, and A, the array that holds N
attr_check_item items, and a function git_check_attr() that
takes a path P and this structure as its parameters.  This structure
can later be extended to hold extra data necessary for optimization.

Also, to make it easier to write the first two steps in common
cases, introduce the attr_check_initl() helper function, which takes
a NULL-terminated list of attribute names and initializes this
structure.
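
As a rough, self-contained sketch of the idea (not the patch itself):
the sketch_* types and functions below are hypothetical stand-ins that
store plain strings where git interns a struct git_attr, and use
calloc() where git uses xcalloc().  The helper makes two passes over
the variable arguments, first to count them and then to store them.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

/* Hypothetical stand-ins for attr_check_item and attr_check. */
struct sketch_item {
	const char *name;	/* git stores an interned struct git_attr */
	const char *value;	/* filled in by a later query */
};

struct sketch_check {
	int nr;
	int alloc;
	struct sketch_item *items;
};

/* Build a check from a NULL-terminated list of attribute names. */
struct sketch_check *sketch_check_initl(const char *one, ...)
{
	struct sketch_check *check;
	va_list params;
	int cnt, i;

	assert(one != NULL);

	/* Pass 1: count the names, including the first one. */
	va_start(params, one);
	for (cnt = 1; va_arg(params, const char *) != NULL; cnt++)
		;
	va_end(params);

	check = calloc(1, sizeof(*check));
	if (!check)
		abort();
	check->items = calloc(cnt, sizeof(*check->items));
	if (!check->items)
		abort();
	check->nr = check->alloc = cnt;

	/* Pass 2: store the names in order. */
	check->items[0].name = one;
	va_start(params, one);
	for (i = 1; i < cnt; i++)
		check->items[i].name = va_arg(params, const char *);
	va_end(params);
	return check;
}

void sketch_check_free(struct sketch_check *check)
{
	if (!check)
		return;
	free(check->items);
	free(check);
}
```

A caller would write something like
sketch_check_initl("merge", "conflict-marker-size", (const char *)NULL);
the cast on the sentinel keeps the call portable on platforms where a
bare NULL is not pointer-sized.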

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 attr.h | 17 +++++++++++++++
 2 files changed, 91 insertions(+)

diff --git a/attr.c b/attr.c
index 2f180d609..be9e398e9 100644
--- a/attr.c
+++ b/attr.c
@@ -865,6 +865,80 @@ int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
 	return 0;
 }
 
+struct attr_check *attr_check_alloc(void)
+{
+	return xcalloc(1, sizeof(struct attr_check));
+}
+
+int git_check_attr(const char *path, struct attr_check *check)
+{
+	return git_check_attrs(path, check->check_nr, check->check);
+}
+
+struct attr_check *attr_check_initl(const char *one, ...)
+{
+	struct attr_check *check;
+	int cnt;
+	va_list params;
+	const char *param;
+
+	va_start(params, one);
+	for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++)
+		;
+	va_end(params);
+
+	check = attr_check_alloc();
+	check->check_nr = cnt;
+	check->check_alloc = cnt;
+	check->check = xcalloc(cnt, sizeof(struct attr_check_item));
+
+	check->check[0].attr = git_attr(one);
+	va_start(params, one);
+	for (cnt = 1; cnt < check->check_nr; cnt++) {
+		struct git_attr *attr;
+		param = va_arg(params, const char *);
+		if (!param)
+			die("BUG: counted %d != ended at %d",
+			    check->check_nr, cnt);
+		attr = git_attr(param);
+		if (!attr)
+			die("BUG: %s: not a valid attribute name", param);
+		check->check[cnt].attr = attr;
+	}
+	va_end(params);
+	return check;
+}
+
+struct attr_check_item *attr_check_append(struct attr_check *check,
+					  const struct git_attr *attr)
+{
+	struct attr_check_item *item;
+
+	ALLOC_GROW(check->check, check->check_nr + 1, check->check_alloc);
+	item = &check->check[check->check_nr++];
+	item->attr = attr;
+	return item;
+}
+
+void attr_check_reset(struct attr_check *check)
+{
+	check->check_nr = 0;
+}
+
+void attr_check_clear(struct attr_check *check)
+{
+	free(check->check);
+	check->check = NULL;
+	check->check_alloc = 0;
+	check->check_nr = 0;
+}
+
+void attr_check_free(struct attr_check *check)
+{
+	attr_check_clear(check);
+	free(check);
+}
+
 void git_attr_set_direction(enum git_attr_direction new, struct index_state *istate)
 {
 	enum git_attr_direction old = direction;
diff --git a/attr.h b/attr.h
index efc7bb3b3..459347f4b 100644
--- a/attr.h
+++ b/attr.h
@@ -29,6 +29,22 @@ struct attr_check_item {
 	const char *value;
 };
 
+struct attr_check {
+	int check_nr;
+	int check_alloc;
+	struct attr_check_item *check;
+};
+
+extern struct attr_check *attr_check_alloc(void);
+extern struct attr_check *attr_check_initl(const char *, ...);
+
+extern struct attr_check_item *attr_check_append(struct attr_check *check,
+						 const struct git_attr *attr);
+
+extern void attr_check_reset(struct attr_check *check);
+extern void attr_check_clear(struct attr_check *check);
+extern void attr_check_free(struct attr_check *check);
+
 /*
  * Return the name of the attribute represented by the argument.  The
  * return value is a pointer to a null-delimited string that is part
@@ -37,6 +53,7 @@ struct attr_check_item {
 extern const char *git_attr_name(const struct git_attr *);
 
 int git_check_attrs(const char *path, int, struct attr_check_item *);
+extern int git_check_attr(const char *path, struct attr_check *check);
 
 /*
  * Retrieve all attributes that apply to the specified path.  *num
-- 
2.11.0.390.gc69c2f50cf-goog


* [PATCH 16/27] attr: convert git_all_attrs() to use "struct attr_check"
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

This updates the other two ways the attribute check is done via an
array of "struct attr_check_item" elements.  These two niches
appear only in "git check-attr".

 * The caller does not know offhand what attributes it wants to ask
   about and cannot use attr_check_initl() to prepare the
   attr_check structure.

 * The caller may not know what attributes it wants to ask about at
   all, and instead wants to learn everything that the given path has.

Such a caller can call attr_check_alloc() to allocate an empty
attr_check, and then call attr_check_append() to add attribute names
one by one.
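
The allocate-then-append usage can be sketched as below.  Again this is
only an illustration under assumptions: the sketch_* names are
hypothetical, names are plain strings, and the doubling growth policy
stands in for git's ALLOC_GROW() macro.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for attr_check_item and attr_check. */
struct sketch_item {
	const char *name;
	const char *value;
};

struct sketch_check {
	int nr;
	int alloc;
	struct sketch_item *items;
};

/* Start with an empty check; items are added later. */
struct sketch_check *sketch_check_alloc(void)
{
	struct sketch_check *check = calloc(1, sizeof(*check));
	if (!check)
		abort();
	return check;
}

/* Append one attribute, growing the array when it is full. */
struct sketch_item *sketch_check_append(struct sketch_check *check,
					const char *name)
{
	struct sketch_item *item;

	assert(check != NULL);
	if (check->nr + 1 > check->alloc) {
		int new_alloc = check->alloc ? check->alloc * 2 : 4;
		struct sketch_item *p =
			realloc(check->items, (size_t)new_alloc * sizeof(*p));
		if (!p)
			abort();
		check->items = p;
		check->alloc = new_alloc;
	}
	item = &check->items[check->nr++];
	item->name = name;
	item->value = NULL;
	return item;
}

/* Forget the items but keep the allocation for reuse on the next path. */
void sketch_check_reset(struct sketch_check *check)
{
	check->nr = 0;
}

void sketch_check_free(struct sketch_check *check)
{
	if (!check)
		return;
	free(check->items);
	free(check);
}
```

Resetting instead of freeing lets a "collect everything" caller reuse
one allocation across many paths, which mirrors how git_all_attrs()
calls attr_check_reset() before refilling the structure.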

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c               | 38 ++++++++++++---------------------
 attr.h               |  9 +++-----
 builtin/check-attr.c | 60 ++++++++++++++++++++++++++--------------------------
 3 files changed, 47 insertions(+), 60 deletions(-)

diff --git a/attr.c b/attr.c
index be9e398e9..d2eaa0410 100644
--- a/attr.c
+++ b/attr.c
@@ -837,42 +837,32 @@ int git_check_attrs(const char *path, int num, struct attr_check_item *check)
 	return 0;
 }
 
-int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
+void git_all_attrs(const char *path, struct attr_check *check)
 {
-	int i, count, j;
+	int i;
 
-	collect_some_attrs(path, 0, NULL);
+	attr_check_reset(check);
+	collect_some_attrs(path, check->check_nr, check->check);
 
-	/* Count the number of attributes that are set. */
-	count = 0;
-	for (i = 0; i < attr_nr; i++) {
-		const char *value = check_all_attr[i].value;
-		if (value != ATTR__UNSET && value != ATTR__UNKNOWN)
-			++count;
-	}
-	*num = count;
-	ALLOC_ARRAY(*check, count);
-	j = 0;
 	for (i = 0; i < attr_nr; i++) {
+		const char *name = check_all_attr[i].attr->name;
 		const char *value = check_all_attr[i].value;
-		if (value != ATTR__UNSET && value != ATTR__UNKNOWN) {
-			(*check)[j].attr = check_all_attr[i].attr;
-			(*check)[j].value = value;
-			++j;
-		}
+		struct attr_check_item *item;
+		if (value == ATTR__UNSET || value == ATTR__UNKNOWN)
+			continue;
+		item = attr_check_append(check, git_attr(name));
+		item->value = value;
 	}
-
-	return 0;
 }
 
-struct attr_check *attr_check_alloc(void)
+int git_check_attr(const char *path, struct attr_check *check)
 {
-	return xcalloc(1, sizeof(struct attr_check));
+	return git_check_attrs(path, check->check_nr, check->check);
 }
 
-int git_check_attr(const char *path, struct attr_check *check)
+struct attr_check *attr_check_alloc(void)
 {
-	return git_check_attrs(path, check->check_nr, check->check);
+	return xcalloc(1, sizeof(struct attr_check));
 }
 
 struct attr_check *attr_check_initl(const char *one, ...)
diff --git a/attr.h b/attr.h
index 459347f4b..971bb9a38 100644
--- a/attr.h
+++ b/attr.h
@@ -56,13 +56,10 @@ int git_check_attrs(const char *path, int, struct attr_check_item *);
 extern int git_check_attr(const char *path, struct attr_check *check);
 
 /*
- * Retrieve all attributes that apply to the specified path.  *num
- * will be set to the number of attributes on the path; **check will
- * be set to point at a newly-allocated array of git_attr_check
- * objects describing the attributes and their values.  *check must be
- * free()ed by the caller.
+ * Retrieve all attributes that apply to the specified path.
+ * check holds the attributes and their values.
  */
-int git_all_attrs(const char *path, int *num, struct attr_check_item **check);
+void git_all_attrs(const char *path, struct attr_check *check);
 
 enum git_attr_direction {
 	GIT_ATTR_CHECKIN,
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index 889264a5b..3d4704be5 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -24,12 +24,13 @@ static const struct option check_attr_options[] = {
 	OPT_END()
 };
 
-static void output_attr(int cnt, struct attr_check_item *check,
-			const char *file)
+static void output_attr(struct attr_check *check, const char *file)
 {
 	int j;
+	int cnt = check->check_nr;
+
 	for (j = 0; j < cnt; j++) {
-		const char *value = check[j].value;
+		const char *value = check->check[j].value;
 
 		if (ATTR_TRUE(value))
 			value = "set";
@@ -42,36 +43,38 @@ static void output_attr(int cnt, struct attr_check_item *check,
 			printf("%s%c" /* path */
 			       "%s%c" /* attrname */
 			       "%s%c" /* attrvalue */,
-			       file, 0, git_attr_name(check[j].attr), 0, value, 0);
+			       file, 0,
+			       git_attr_name(check->check[j].attr), 0, value, 0);
 		} else {
 			quote_c_style(file, NULL, stdout, 0);
-			printf(": %s: %s\n", git_attr_name(check[j].attr), value);
+			printf(": %s: %s\n",
+			       git_attr_name(check->check[j].attr), value);
 		}
-
 	}
 }
 
 static void check_attr(const char *prefix,
-		       int cnt, struct attr_check_item *check,
+		       struct attr_check *check,
+		       int collect_all,
 		       const char *file)
 {
 	char *full_path =
 		prefix_path(prefix, prefix ? strlen(prefix) : 0, file);
-	if (check != NULL) {
-		if (git_check_attrs(full_path, cnt, check))
-			die("git_check_attrs died");
-		output_attr(cnt, check, file);
+
+	if (collect_all) {
+		git_all_attrs(full_path, check);
 	} else {
-		if (git_all_attrs(full_path, &cnt, &check))
-			die("git_all_attrs died");
-		output_attr(cnt, check, file);
-		free(check);
+		if (git_check_attr(full_path, check))
+			die("git_check_attr died");
 	}
+	output_attr(check, file);
+
 	free(full_path);
 }
 
 static void check_attr_stdin_paths(const char *prefix,
-				   int cnt, struct attr_check_item *check)
+				   struct attr_check *check,
+				   int collect_all)
 {
 	struct strbuf buf = STRBUF_INIT;
 	struct strbuf unquoted = STRBUF_INIT;
@@ -85,7 +88,7 @@ static void check_attr_stdin_paths(const char *prefix,
 				die("line is badly quoted");
 			strbuf_swap(&buf, &unquoted);
 		}
-		check_attr(prefix, cnt, check, buf.buf);
+		check_attr(prefix, check, collect_all, buf.buf);
 		maybe_flush_or_die(stdout, "attribute to stdout");
 	}
 	strbuf_release(&buf);
@@ -100,7 +103,7 @@ static NORETURN void error_with_usage(const char *msg)
 
 int cmd_check_attr(int argc, const char **argv, const char *prefix)
 {
-	struct attr_check_item *check;
+	struct attr_check *check;
 	int cnt, i, doubledash, filei;
 
 	if (!is_bare_repository())
@@ -160,28 +163,25 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix)
 			error_with_usage("No file specified");
 	}
 
-	if (all_attrs) {
-		check = NULL;
-	} else {
-		check = xcalloc(cnt, sizeof(*check));
+	check = attr_check_alloc();
+	if (!all_attrs) {
 		for (i = 0; i < cnt; i++) {
-			const char *name;
-			struct git_attr *a;
-			name = argv[i];
-			a = git_attr(name);
+			struct git_attr *a = git_attr(argv[i]);
 			if (!a)
 				return error("%s: not a valid attribute name",
-					name);
-			check[i].attr = a;
+					     argv[i]);
+			attr_check_append(check, a);
 		}
 	}
 
 	if (stdin_paths)
-		check_attr_stdin_paths(prefix, cnt, check);
+		check_attr_stdin_paths(prefix, check, all_attrs);
 	else {
 		for (i = filei; i < argc; i++)
-			check_attr(prefix, cnt, check, argv[i]);
+			check_attr(prefix, check, all_attrs, argv[i]);
 		maybe_flush_or_die(stdout, "attribute to stdout");
 	}
+
+	attr_check_free(check);
 	return 0;
 }
-- 
2.11.0.390.gc69c2f50cf-goog
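
For readers following the builtin/check-attr.c conversion above, the life cycle of the growable container (allocate, append, query, free) can be sketched in isolation. This is a minimal, self-contained sketch and not Git's implementation: `demo_check`, `demo_check_alloc()`, `demo_check_append()` and `demo_check_free()` are hypothetical stand-ins for `struct attr_check`, `attr_check_alloc()`, `attr_check_append()` and `attr_check_free()`, and attributes are plain names here rather than interned `struct git_attr` pointers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for "struct attr_check_item". */
struct demo_item {
	const char *attr;  /* attribute name (Git stores a struct git_attr *) */
	const char *value; /* filled in by a query; NULL until then */
};

/* Hypothetical stand-in for "struct attr_check": a growable array. */
struct demo_check {
	int nr, alloc;
	struct demo_item *items;
};

static struct demo_check *demo_check_alloc(void)
{
	struct demo_check *c = calloc(1, sizeof(*c));
	assert(c);
	return c;
}

/* Append one attribute to ask about and return the new item. */
static struct demo_item *demo_check_append(struct demo_check *c,
					   const char *attr)
{
	if (c->nr == c->alloc) {
		int n = c->alloc ? 2 * c->alloc : 4;
		struct demo_item *p = realloc(c->items, n * sizeof(*p));
		assert(p);
		c->items = p;
		c->alloc = n;
	}
	c->items[c->nr].attr = attr;
	c->items[c->nr].value = NULL;
	return &c->items[c->nr++];
}

/* Release the array and the container itself. */
static void demo_check_free(struct demo_check *c)
{
	free(c->items);
	free(c);
}
```

The point of the design is that the caller no longer carries a separate count next to the array; the container owns both, so one pointer is enough to pass around.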



* [PATCH 17/27] attr: convert git_check_attrs() callers to use the new API
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (15 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 16/27] attr: convert git_all_attrs() to use "struct attr_check" Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 18/27] attr: retire git_check_attrs() API Brandon Williams
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

The remaining callers are all simple "I have N attributes I am
interested in.  I'll ask about them with various paths one by one".

After this step, no callers of git_check_attrs() remain.  Once it is
removed, we can extend "struct attr_check" with data that can be
used to optimize the query for the specific N attributes it
contains.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 archive.c              | 24 ++++++------------------
 builtin/pack-objects.c | 19 +++++--------------
 convert.c              | 17 ++++++-----------
 ll-merge.c             | 33 ++++++++++++++-------------------
 userdiff.c             | 19 ++++++++-----------
 ws.c                   | 19 ++++++-------------
 6 files changed, 45 insertions(+), 86 deletions(-)

diff --git a/archive.c b/archive.c
index b76bd4691..3591f7d55 100644
--- a/archive.c
+++ b/archive.c
@@ -87,19 +87,6 @@ void *sha1_file_to_archive(const struct archiver_args *args,
 	return buffer;
 }
 
-static void setup_archive_check(struct attr_check_item *check)
-{
-	static struct git_attr *attr_export_ignore;
-	static struct git_attr *attr_export_subst;
-
-	if (!attr_export_ignore) {
-		attr_export_ignore = git_attr("export-ignore");
-		attr_export_subst = git_attr("export-subst");
-	}
-	check[0].attr = attr_export_ignore;
-	check[1].attr = attr_export_subst;
-}
-
 struct directory {
 	struct directory *up;
 	struct object_id oid;
@@ -120,10 +107,10 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 		void *context)
 {
 	static struct strbuf path = STRBUF_INIT;
+	static struct attr_check *check;
 	struct archiver_context *c = context;
 	struct archiver_args *args = c->args;
 	write_archive_entry_fn_t write_entry = c->write_entry;
-	struct attr_check_item check[2];
 	const char *path_without_prefix;
 	int err;
 
@@ -137,11 +124,12 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 		strbuf_addch(&path, '/');
 	path_without_prefix = path.buf + args->baselen;
 
-	setup_archive_check(check);
-	if (!git_check_attrs(path_without_prefix, ARRAY_SIZE(check), check)) {
-		if (ATTR_TRUE(check[0].value))
+	if (!check)
+		check = attr_check_initl("export-ignore", "export-subst", NULL);
+	if (!git_check_attr(path_without_prefix, check)) {
+		if (ATTR_TRUE(check->check[0].value))
 			return 0;
-		args->convert = ATTR_TRUE(check[1].value);
+		args->convert = ATTR_TRUE(check->check[1].value);
 	}
 
 	if (S_ISDIR(mode) || S_ISGITLINK(mode)) {
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 8b8fbd814..ff8b3c12d 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -894,24 +894,15 @@ static void write_pack_file(void)
 			written, nr_result);
 }
 
-static void setup_delta_attr_check(struct attr_check_item *check)
-{
-	static struct git_attr *attr_delta;
-
-	if (!attr_delta)
-		attr_delta = git_attr("delta");
-
-	check[0].attr = attr_delta;
-}
-
 static int no_try_delta(const char *path)
 {
-	struct attr_check_item check[1];
+	static struct attr_check *check;
 
-	setup_delta_attr_check(check);
-	if (git_check_attrs(path, ARRAY_SIZE(check), check))
+	if (!check)
+		check = attr_check_initl("delta", NULL);
+	if (git_check_attr(path, check))
 		return 0;
-	if (ATTR_FALSE(check->value))
+	if (ATTR_FALSE(check->check[0].value))
 		return 1;
 	return 0;
 }
diff --git a/convert.c b/convert.c
index 1b9829279..affd8ce9b 100644
--- a/convert.c
+++ b/convert.c
@@ -1085,24 +1085,19 @@ struct conv_attrs {
 	int ident;
 };
 
-static const char *conv_attr_name[] = {
-	"crlf", "ident", "filter", "eol", "text",
-};
-#define NUM_CONV_ATTRS ARRAY_SIZE(conv_attr_name)
-
 static void convert_attrs(struct conv_attrs *ca, const char *path)
 {
-	int i;
-	static struct attr_check_item ccheck[NUM_CONV_ATTRS];
+	static struct attr_check *check;
 
-	if (!ccheck[0].attr) {
-		for (i = 0; i < NUM_CONV_ATTRS; i++)
-			ccheck[i].attr = git_attr(conv_attr_name[i]);
+	if (!check) {
+		check = attr_check_initl("crlf", "ident", "filter",
+					 "eol", "text", NULL);
 		user_convert_tail = &user_convert;
 		git_config(read_convert_config, NULL);
 	}
 
-	if (!git_check_attrs(path, NUM_CONV_ATTRS, ccheck)) {
+	if (!git_check_attr(path, check)) {
+		struct attr_check_item *ccheck = check->check;
 		ca->crlf_action = git_path_check_crlf(ccheck + 4);
 		if (ca->crlf_action == CRLF_UNDEFINED)
 			ca->crlf_action = git_path_check_crlf(ccheck + 0);
diff --git a/ll-merge.c b/ll-merge.c
index 198f07aca..3a4227a1c 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -336,15 +336,6 @@ static const struct ll_merge_driver *find_ll_merge_driver(const char *merge_attr
 	return &ll_merge_drv[LL_TEXT_MERGE];
 }
 
-static int git_path_check_merge(const char *path, struct attr_check_item check[2])
-{
-	if (!check[0].attr) {
-		check[0].attr = git_attr("merge");
-		check[1].attr = git_attr("conflict-marker-size");
-	}
-	return git_check_attrs(path, 2, check);
-}
-
 static void normalize_file(mmfile_t *mm, const char *path)
 {
 	struct strbuf strbuf = STRBUF_INIT;
@@ -362,7 +353,7 @@ int ll_merge(mmbuffer_t *result_buf,
 	     mmfile_t *theirs, const char *their_label,
 	     const struct ll_merge_options *opts)
 {
-	static struct attr_check_item check[2];
+	static struct attr_check *check;
 	static const struct ll_merge_options default_opts;
 	const char *ll_driver_name = NULL;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
@@ -376,10 +367,14 @@ int ll_merge(mmbuffer_t *result_buf,
 		normalize_file(ours, path);
 		normalize_file(theirs, path);
 	}
-	if (!git_path_check_merge(path, check)) {
-		ll_driver_name = check[0].value;
-		if (check[1].value) {
-			marker_size = atoi(check[1].value);
+
+	if (!check)
+		check = attr_check_initl("merge", "conflict-marker-size", NULL);
+
+	if (!git_check_attr(path, check)) {
+		ll_driver_name = check->check[0].value;
+		if (check->check[1].value) {
+			marker_size = atoi(check->check[1].value);
 			if (marker_size <= 0)
 				marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 		}
@@ -398,13 +393,13 @@ int ll_merge(mmbuffer_t *result_buf,
 
 int ll_merge_marker_size(const char *path)
 {
-	static struct attr_check_item check;
+	static struct attr_check *check;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 
-	if (!check.attr)
-		check.attr = git_attr("conflict-marker-size");
-	if (!git_check_attrs(path, 1, &check) && check.value) {
-		marker_size = atoi(check.value);
+	if (!check)
+		check = attr_check_initl("conflict-marker-size", NULL);
+	if (!git_check_attr(path, check) && check->check[0].value) {
+		marker_size = atoi(check->check[0].value);
 		if (marker_size <= 0)
 			marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 	}
diff --git a/userdiff.c b/userdiff.c
index b0b44467a..109d4b9fc 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -262,25 +262,22 @@ struct userdiff_driver *userdiff_find_by_name(const char *name) {
 
 struct userdiff_driver *userdiff_find_by_path(const char *path)
 {
-	static struct git_attr *attr;
-	struct attr_check_item check;
-
-	if (!attr)
-		attr = git_attr("diff");
-	check.attr = attr;
+	static struct attr_check *check;
 
+	if (!check)
+		check = attr_check_initl("diff", NULL);
 	if (!path)
 		return NULL;
-	if (git_check_attrs(path, 1, &check))
+	if (git_check_attr(path, check))
 		return NULL;
 
-	if (ATTR_TRUE(check.value))
+	if (ATTR_TRUE(check->check[0].value))
 		return &driver_true;
-	if (ATTR_FALSE(check.value))
+	if (ATTR_FALSE(check->check[0].value))
 		return &driver_false;
-	if (ATTR_UNSET(check.value))
+	if (ATTR_UNSET(check->check[0].value))
 		return NULL;
-	return userdiff_find_by_name(check.value);
+	return userdiff_find_by_name(check->check[0].value);
 }
 
 struct userdiff_driver *userdiff_get_textconv(struct userdiff_driver *driver)
diff --git a/ws.c b/ws.c
index fbd876e84..7556adbd0 100644
--- a/ws.c
+++ b/ws.c
@@ -71,24 +71,17 @@ unsigned parse_whitespace_rule(const char *string)
 	return rule;
 }
 
-static void setup_whitespace_attr_check(struct attr_check_item *check)
-{
-	static struct git_attr *attr_whitespace;
-
-	if (!attr_whitespace)
-		attr_whitespace = git_attr("whitespace");
-	check[0].attr = attr_whitespace;
-}
-
 unsigned whitespace_rule(const char *pathname)
 {
-	struct attr_check_item attr_whitespace_rule;
+	static struct attr_check *attr_whitespace_rule;
+
+	if (!attr_whitespace_rule)
+		attr_whitespace_rule = attr_check_initl("whitespace", NULL);
 
-	setup_whitespace_attr_check(&attr_whitespace_rule);
-	if (!git_check_attrs(pathname, 1, &attr_whitespace_rule)) {
+	if (!git_check_attr(pathname, attr_whitespace_rule)) {
 		const char *value;
 
-		value = attr_whitespace_rule.value;
+		value = attr_whitespace_rule->check[0].value;
 		if (ATTR_TRUE(value)) {
 			/* true (whitespace) */
 			unsigned all_rule = ws_tab_width(whitespace_rule_cfg);
-- 
2.11.0.390.gc69c2f50cf-goog
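
The pattern every converted caller in this patch follows (a function-local static container, built once from a NULL-terminated list of names and then reused for each path) can be sketched as below. This is only an illustration under stated assumptions: `demo_check_initl()` is a hypothetical analogue of `attr_check_initl()`, it stores plain names instead of interned attributes, and `merge_check()` is an invented helper that mirrors the shape of the `ll_merge()` hunk.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

struct demo_item { const char *attr; const char *value; };
struct demo_check { int nr; struct demo_item *items; };

/* Hypothetical analogue of attr_check_initl(): NULL-terminated name list. */
static struct demo_check *demo_check_initl(const char *first, ...)
{
	struct demo_check *c = calloc(1, sizeof(*c));
	const char *name;
	va_list ap;
	int n = 0, i;

	assert(c);

	/* First pass: count the names. */
	va_start(ap, first);
	for (name = first; name; name = va_arg(ap, const char *))
		n++;
	va_end(ap);

	c->items = calloc(n ? n : 1, sizeof(*c->items));
	assert(c->items);

	/* Second pass: record them in the order given. */
	va_start(ap, first);
	for (i = 0, name = first; name; name = va_arg(ap, const char *))
		c->items[i++].attr = name;
	va_end(ap);

	c->nr = n;
	return c;
}

/* Built lazily on the first call, then shared by every later call. */
static struct demo_check *merge_check(void)
{
	static struct demo_check *check;

	if (!check)
		check = demo_check_initl("merge", "conflict-marker-size", NULL);
	return check;
}
```

Results are then read positionally, in the order the names were listed, which is why the hunks above index `check->check[0]` and `check->check[1]`.  Note that lazy initialization of a function-local static is not by itself safe against two threads making the first call concurrently; the sketch ignores that.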



* [PATCH 18/27] attr: retire git_check_attrs() API
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (16 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 17/27] attr: convert git_check_attrs() callers to use the new API Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 19/27] attr: pass struct attr_check to collect_some_attrs Brandon Williams
                   ` (10 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Since nobody uses the old API, make it file-scope static, and update
the documentation to describe the new API.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/technical/api-gitattributes.txt | 86 +++++++++++++++++----------
 attr.c                                        |  3 +-
 attr.h                                        |  1 -
 3 files changed, 58 insertions(+), 32 deletions(-)

diff --git a/Documentation/technical/api-gitattributes.txt b/Documentation/technical/api-gitattributes.txt
index 260266867..82f5130e7 100644
--- a/Documentation/technical/api-gitattributes.txt
+++ b/Documentation/technical/api-gitattributes.txt
@@ -16,10 +16,15 @@ Data Structure
 	of no interest to the calling programs.  The name of the
 	attribute can be retrieved by calling `git_attr_name()`.
 
-`struct git_attr_check`::
+`struct attr_check_item`::
 
-	This structure represents a set of attributes to check in a call
-	to `git_check_attr()` function, and receives the results.
+	This structure represents one attribute and its value.
+
+`struct attr_check`::
+
+	This structure represents a collection of `attr_check_item`.
+	It is passed to `git_check_attr()` function, specifying the
+	attributes to check, and receives their values.
 
 
 Attribute Values
@@ -27,7 +32,7 @@ Attribute Values
 
 An attribute for a path can be in one of four states: Set, Unset,
 Unspecified or set to a string, and `.value` member of `struct
-git_attr_check` records it.  There are three macros to check these:
+attr_check_item` records it.  There are three macros to check these:
 
 `ATTR_TRUE()`::
 
@@ -48,49 +53,51 @@ value of the attribute for the path.
 Querying Specific Attributes
 ----------------------------
 
-* Prepare an array of `struct git_attr_check` to define the list of
-  attributes you would want to check.  To populate this array, you would
-  need to define necessary attributes by calling `git_attr()` function.
+* Prepare `struct attr_check` using `attr_check_initl()`
+  function, enumerating the names of attributes whose values you are
+  interested in, terminated with a NULL pointer.  Alternatively, an
+  empty `struct attr_check` can be prepared by calling
+  `attr_check_alloc()` function and then attributes you want to
+  ask about can be added to it with `attr_check_append()`
+  function.
 
 * Call `git_check_attr()` to check the attributes for the path.
 
-* Inspect `git_attr_check` structure to see how each of the attribute in
-  the array is defined for the path.
+* Inspect the `attr_check` structure to see how each of the
+  attributes in the array is defined for the path.
 
 
 Example
 -------
 
-To see how attributes "crlf" and "indent" are set for different paths.
+To see how attributes "crlf" and "ident" are set for different paths.
 
-. Prepare an array of `struct git_attr_check` with two elements (because
-  we are checking two attributes).  Initialize their `attr` member with
-  pointers to `struct git_attr` obtained by calling `git_attr()`:
+. Prepare a `struct attr_check` with two elements (because
+  we are checking two attributes):
 
 ------------
-static struct git_attr_check check[2];
+static struct attr_check *check;
 static void setup_check(void)
 {
-	if (check[0].attr)
+	if (check)
 		return; /* already done */
-	check[0].attr = git_attr("crlf");
-	check[1].attr = git_attr("ident");
+	check = attr_check_initl("crlf", "ident", NULL);
 }
 ------------
 
-. Call `git_check_attr()` with the prepared array of `struct git_attr_check`:
+. Call `git_check_attr()` with the prepared `struct attr_check`:
 
 ------------
 	const char *path;
 
 	setup_check();
-	git_check_attr(path, ARRAY_SIZE(check), check);
+	git_check_attr(path, check);
 ------------
 
-. Act on `.value` member of the result, left in `check[]`:
+. Act on `.value` member of the result, left in `check->check[]`:
 
 ------------
-	const char *value = check[0].value;
+	const char *value = check->check[0].value;
 
 	if (ATTR_TRUE(value)) {
 		The attribute is Set, by listing only the name of the
@@ -109,20 +116,39 @@ static void setup_check(void)
 	}
 ------------
 
+To see how attributes in argv[] are set for different paths, only
+the first step in the above would be different.
+
+------------
+static struct attr_check *check;
+static void setup_check(const char **argv)
+{
+	check = attr_check_alloc();
+	while (*argv) {
+		struct git_attr *attr = git_attr(*argv);
+		attr_check_append(check, attr);
+		argv++;
+	}
+}
+------------
+
 
 Querying All Attributes
 -----------------------
 
 To get the values of all attributes associated with a file:
 
-* Call `git_all_attrs()`, which returns an array of `git_attr_check`
-  structures.
+* Prepare an empty `attr_check` structure by calling
+  `attr_check_alloc()`.
+
+* Call `git_all_attrs()`, which populates the `attr_check`
+  with the attributes attached to the path.
 
-* Iterate over the `git_attr_check` array to examine the attribute
-  names and values.  The name of the attribute described by a
-  `git_attr_check` object can be retrieved via
-  `git_attr_name(check[i].attr)`.  (Please note that no items will be
-  returned for unset attributes, so `ATTR_UNSET()` will return false
-  for all returned `git_array_check` objects.)
+* Iterate over the `attr_check.check[]` array to examine
+  the attribute names and values.  The name of the attribute
+  described by an `attr_check.check[]` object can be retrieved via
+  `git_attr_name(check->check[i].attr)`.  (Please note that no items
+  will be returned for unset attributes, so `ATTR_UNSET()` will return
+  false for all returned `attr_check.check[]` objects.)
 
-* Free the `git_array_check` array.
+* Free the `attr_check` struct by calling `attr_check_free()`.
diff --git a/attr.c b/attr.c
index d2eaa0410..da727e3fd 100644
--- a/attr.c
+++ b/attr.c
@@ -821,7 +821,8 @@ static void collect_some_attrs(const char *path, int num,
 		rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-int git_check_attrs(const char *path, int num, struct attr_check_item *check)
+static int git_check_attrs(const char *path, int num,
+			   struct attr_check_item *check)
 {
 	int i;
 
diff --git a/attr.h b/attr.h
index 971bb9a38..3db9893ef 100644
--- a/attr.h
+++ b/attr.h
@@ -52,7 +52,6 @@ extern void attr_check_free(struct attr_check *check);
  */
 extern const char *git_attr_name(const struct git_attr *);
 
-int git_check_attrs(const char *path, int, struct attr_check_item *);
 extern int git_check_attr(const char *path, struct attr_check *check);
 
 /*
-- 
2.11.0.390.gc69c2f50cf-goog
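
The updated documentation describes four states for an attribute value and three macros to tell them apart.  The idea of distinguishing states by pointer identity can be sketched as follows.  This is a hedged illustration, not the contents of attr.h: the `demo_*` sentinels, the `DEMO_*` macros and `classify()` are hypothetical names that stand in for `ATTR_TRUE()` (Set), `ATTR_FALSE()` (Unset) and `ATTR_UNSET()` (Unspecified).

```c
#include <assert.h>
#include <stddef.h>

/* Sentinel objects: only their addresses matter, never their contents. */
static const char demo_true[] = "(true)";
static const char demo_false[] = "(false)";

#define DEMO_TRUE(v)  ((v) == demo_true)
#define DEMO_FALSE(v) ((v) == demo_false)
#define DEMO_UNSET(v) ((v) == NULL)

enum demo_state {
	STATE_SET,         /* "attr"       */
	STATE_UNSET,       /* "-attr"      */
	STATE_UNSPECIFIED, /* not mentioned */
	STATE_STRING       /* "attr=value" */
};

/* Map a ".value" pointer onto one of the four documented states. */
static enum demo_state classify(const char *value)
{
	if (DEMO_TRUE(value))
		return STATE_SET;
	if (DEMO_FALSE(value))
		return STATE_UNSET;
	if (DEMO_UNSET(value))
		return STATE_UNSPECIFIED;
	return STATE_STRING;
}

/* Usage: each state exercised once. */
static void demo_states_example(void)
{
	assert(classify(demo_true) == STATE_SET);
	assert(classify(demo_false) == STATE_UNSET);
	assert(classify(NULL) == STATE_UNSPECIFIED);
	assert(classify("my-driver") == STATE_STRING);
}
```

Because the test is on the address and not the text, a user-supplied string that happens to spell the same characters as a sentinel is still treated as an ordinary string value.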



* [PATCH 19/27] attr: pass struct attr_check to collect_some_attrs
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (17 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 18/27] attr: retire git_check_attrs() API Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 20/27] attr: change validity check for attribute names to use positive logic Brandon Williams
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, pclouds, sbeller

The old call chain took an array of 'attr_check_item' items.
Instead, pass the 'attr_check' container object to 'collect_some_attrs()'
and access the fields in the data structure directly.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 33 +++++++++++++--------------------
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/attr.c b/attr.c
index da727e3fd..e58fa340c 100644
--- a/attr.c
+++ b/attr.c
@@ -777,9 +777,7 @@ static int macroexpand_one(int nr, int rem)
  * check_all_attr. If num is non-zero, only attributes in check[] are
  * collected. Otherwise all attributes are collected.
  */
-static void collect_some_attrs(const char *path, int num,
-			       struct attr_check_item *check)
-
+static void collect_some_attrs(const char *path, struct attr_check *check)
 {
 	struct attr_stack *stk;
 	int i, pathlen, rem, dirlen;
@@ -802,17 +800,18 @@ static void collect_some_attrs(const char *path, int num,
 	prepare_attr_stack(path, dirlen);
 	for (i = 0; i < attr_nr; i++)
 		check_all_attr[i].value = ATTR__UNKNOWN;
-	if (num && !cannot_trust_maybe_real) {
+	if (check->check_nr && !cannot_trust_maybe_real) {
 		rem = 0;
-		for (i = 0; i < num; i++) {
-			if (!check[i].attr->maybe_real) {
+		for (i = 0; i < check->check_nr; i++) {
+			const struct git_attr *a = check->check[i].attr;
+			if (!a->maybe_real) {
 				struct attr_check_item *c;
-				c = check_all_attr + check[i].attr->attr_nr;
+				c = check_all_attr + a->attr_nr;
 				c->value = ATTR__UNSET;
 				rem++;
 			}
 		}
-		if (rem == num)
+		if (rem == check->check_nr)
 			return;
 	}
 
@@ -821,18 +820,17 @@ static void collect_some_attrs(const char *path, int num,
 		rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-static int git_check_attrs(const char *path, int num,
-			   struct attr_check_item *check)
+int git_check_attr(const char *path, struct attr_check *check)
 {
 	int i;
 
-	collect_some_attrs(path, num, check);
+	collect_some_attrs(path, check);
 
-	for (i = 0; i < num; i++) {
-		const char *value = check_all_attr[check[i].attr->attr_nr].value;
+	for (i = 0; i < check->check_nr; i++) {
+		const char *value = check_all_attr[check->check[i].attr->attr_nr].value;
 		if (value == ATTR__UNKNOWN)
 			value = ATTR__UNSET;
-		check[i].value = value;
+		check->check[i].value = value;
 	}
 
 	return 0;
@@ -843,7 +841,7 @@ void git_all_attrs(const char *path, struct attr_check *check)
 	int i;
 
 	attr_check_reset(check);
-	collect_some_attrs(path, check->check_nr, check->check);
+	collect_some_attrs(path, check);
 
 	for (i = 0; i < attr_nr; i++) {
 		const char *name = check_all_attr[i].attr->name;
@@ -856,11 +854,6 @@ void git_all_attrs(const char *path, struct attr_check *check)
 	}
 }
 
-int git_check_attr(const char *path, struct attr_check *check)
-{
-	return git_check_attrs(path, check->check_nr, check->check);
-}
-
 struct attr_check *attr_check_alloc(void)
 {
 	return xcalloc(1, sizeof(struct attr_check));
-- 
2.11.0.390.gc69c2f50cf-goog
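
The result-copy step in git_check_attr() above (look up each requested attribute in a scratch array indexed by attribute number, and report anything still marked "unknown" as unset) can be sketched on its own.  The names below (`demo_unknown`, `demo_attr`, `demo_copy_results()`) are hypothetical stand-ins chosen for this illustration; the scratch array plays the role of `check_all_attr`.

```c
#include <assert.h>
#include <stddef.h>

/* Marker for "nothing decided yet"; compared by address only. */
static const char demo_unknown[] = "(unknown)";

struct demo_attr { int attr_nr; const char *name; };
struct demo_item { const struct demo_attr *attr; const char *value; };
struct demo_check { int nr; struct demo_item *items; };

/*
 * Copy results from a scratch array indexed by attribute number into
 * the caller's container.  Entries that are still "unknown" are
 * reported as unset (NULL).
 */
static void demo_copy_results(const char *const *scratch, int scratch_nr,
			      struct demo_check *check)
{
	int i;

	for (i = 0; i < check->nr; i++) {
		int nr = check->items[i].attr->attr_nr;
		const char *value;

		assert(0 <= nr && nr < scratch_nr);
		value = scratch[nr];
		if (value == demo_unknown)
			value = NULL;
		check->items[i].value = value;
	}
}
```

Passing the container instead of a (count, array) pair means the loop bound and the array can no longer get out of step with each other.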



* [PATCH 20/27] attr: change validity check for attribute names to use positive logic
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (18 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 19/27] attr: pass struct attr_check to collect_some_attrs Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 21/27] attr: use hashmap for attribute dictionary Brandon Williams
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Convert 'invalid_attr_name()' to 'attr_name_valid()' and use positive
logic for the return value.  In addition create a helper function that
prints out an error message when an invalid attribute name is used.

We could later update the message to exactly spell out what the
rules for a good attribute name are, etc.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/attr.c b/attr.c
index e58fa340c..5399e1cb3 100644
--- a/attr.c
+++ b/attr.c
@@ -74,23 +74,33 @@ static unsigned hash_name(const char *name, int namelen)
 	return val;
 }
 
-static int invalid_attr_name(const char *name, int namelen)
+static int attr_name_valid(const char *name, size_t namelen)
 {
 	/*
 	 * Attribute name cannot begin with '-' and must consist of
 	 * characters from [-A-Za-z0-9_.].
 	 */
 	if (namelen <= 0 || *name == '-')
-		return -1;
+		return 0;
 	while (namelen--) {
 		char ch = *name++;
 		if (! (ch == '-' || ch == '.' || ch == '_' ||
 		       ('0' <= ch && ch <= '9') ||
 		       ('a' <= ch && ch <= 'z') ||
 		       ('A' <= ch && ch <= 'Z')) )
-			return -1;
+			return 0;
 	}
-	return 0;
+	return 1;
+}
+
+static void report_invalid_attr(const char *name, size_t len,
+				const char *src, int lineno)
+{
+	struct strbuf err = STRBUF_INIT;
+	strbuf_addf(&err, _("%.*s is not a valid attribute name"),
+		    (int) len, name);
+	fprintf(stderr, "%s: %s:%d\n", err.buf, src, lineno);
+	strbuf_release(&err);
 }
 
 static struct git_attr *git_attr_internal(const char *name, int len)
@@ -105,7 +115,7 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 			return a;
 	}
 
-	if (invalid_attr_name(name, len))
+	if (!attr_name_valid(name, len))
 		return NULL;
 
 	FLEX_ALLOC_MEM(a, name, name, len);
@@ -196,17 +206,15 @@ static const char *parse_attr(const char *src, int lineno, const char *cp,
 			cp++;
 			len--;
 		}
-		if (invalid_attr_name(cp, len)) {
-			fprintf(stderr,
-				"%.*s is not a valid attribute name: %s:%d\n",
-				len, cp, src, lineno);
+		if (!attr_name_valid(cp, len)) {
+			report_invalid_attr(cp, len, src, lineno);
 			return NULL;
 		}
 	} else {
 		/*
 		 * As this function is always called twice, once with
 		 * e == NULL in the first pass and then e != NULL in
-		 * the second pass, no need for invalid_attr_name()
+		 * the second pass, no need for attr_name_valid()
 		 * check here.
 		 */
 		if (*cp == '-' || *cp == '!') {
@@ -258,10 +266,8 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		name += strlen(ATTRIBUTE_MACRO_PREFIX);
 		name += strspn(name, blank);
 		namelen = strcspn(name, blank);
-		if (invalid_attr_name(name, namelen)) {
-			fprintf(stderr,
-				"%.*s is not a valid attribute name: %s:%d\n",
-				namelen, name, src, lineno);
+		if (!attr_name_valid(name, namelen)) {
+			report_invalid_attr(name, namelen, src, lineno);
 			goto fail_return;
 		}
 	}
-- 
2.11.0.390.gc69c2f50cf-goog
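
The effect of switching to positive logic is easiest to see at a call site.  Below is a standalone sketch of the same rule (a name must be non-empty, must not begin with '-', and may only use characters from [-A-Za-z0-9_.]).  `demo_attr_name_valid()` and `demo_check_name()` are hypothetical names for this illustration; the index-based loop is a stylistic choice of the sketch, not what the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if name[0..namelen) is an acceptable attribute name, else 0. */
static int demo_attr_name_valid(const char *name, size_t namelen)
{
	size_t i;

	if (!namelen || name[0] == '-')
		return 0;
	for (i = 0; i < namelen; i++) {
		char ch = name[i];
		if (!(ch == '-' || ch == '.' || ch == '_' ||
		      ('0' <= ch && ch <= '9') ||
		      ('a' <= ch && ch <= 'z') ||
		      ('A' <= ch && ch <= 'Z')))
			return 0;
	}
	return 1;
}

/* The call site now reads naturally: "if the name is not valid, fail". */
static int demo_check_name(const char *name)
{
	assert(name); /* precondition: callers pass a C string */
	if (!demo_attr_name_valid(name, strlen(name)))
		return -1;
	return 0;
}
```

Taking an explicit length lets a caller validate a prefix of a longer buffer, for example the "text" part of "text=auto", without copying it first.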



* [PATCH 21/27] attr: use hashmap for attribute dictionary
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (19 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 20/27] attr: change validity check for attribute names to use positive logic Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-18 20:20   ` Stefan Beller
  2017-01-12 23:53 ` [PATCH 22/27] attr: eliminate global check_all_attr array Brandon Williams
                   ` (7 subsequent siblings)
  28 siblings, 1 reply; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, pclouds, sbeller

The current implementation of the attribute dictionary uses a custom
hashtable.  Modernize the dictionary by converting it to the builtin
'hashmap' structure.

Also, in order to enable a threaded API in the future, add an
accompanying mutex, which must be acquired before accessing the
dictionary of interned attributes.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c        | 171 ++++++++++++++++++++++++++++++++++++++++++----------------
 attr.h        |   2 +
 common-main.c |   3 ++
 3 files changed, 131 insertions(+), 45 deletions(-)

diff --git a/attr.c b/attr.c
index 5399e1cb3..8cf2ea901 100644
--- a/attr.c
+++ b/attr.c
@@ -14,6 +14,7 @@
 #include "dir.h"
 #include "utf8.h"
 #include "quote.h"
+#include "thread-utils.h"
 
 const char git_attr__true[] = "(builtin)true";
 const char git_attr__false[] = "\0(builtin)false";
@@ -23,28 +24,17 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 #define ATTR__UNSET NULL
 #define ATTR__UNKNOWN git_attr__unknown
 
-/* This is a randomly chosen prime. */
-#define HASHSIZE 257
-
 #ifndef DEBUG_ATTR
 #define DEBUG_ATTR 0
 #endif
 
-/*
- * NEEDSWORK: the global dictionary of the interned attributes
- * must stay a singleton even after we become thread-ready.
- * Access to these must be surrounded with mutex when it happens.
- */
 struct git_attr {
-	struct git_attr *next;
-	unsigned h;
-	int attr_nr;
+	int attr_nr; /* unique attribute number */
 	int maybe_macro;
 	int maybe_real;
-	char name[FLEX_ARRAY];
+	char name[FLEX_ARRAY]; /* attribute name */
 };
 static int attr_nr;
-static struct git_attr *(git_attr_hash[HASHSIZE]);
 
 /*
  * NEEDSWORK: maybe-real, maybe-macro are not property of
@@ -63,15 +53,94 @@ const char *git_attr_name(const struct git_attr *attr)
 	return attr->name;
 }
 
-static unsigned hash_name(const char *name, int namelen)
+struct attr_hashmap {
+	struct hashmap map;
+#ifndef NO_PTHREADS
+	pthread_mutex_t mutex;
+#endif
+};
+
+static inline void hashmap_lock(struct attr_hashmap *map)
 {
-	unsigned val = 0, c;
+#ifndef NO_PTHREADS
+	pthread_mutex_lock(&map->mutex);
+#endif
+}
 
-	while (namelen--) {
-		c = *name++;
-		val = ((val << 7) | (val >> 22)) ^ c;
-	}
-	return val;
+static inline void hashmap_unlock(struct attr_hashmap *map)
+{
+#ifndef NO_PTHREADS
+	pthread_mutex_unlock(&map->mutex);
+#endif
+}
+
+/*
+ * The global dictionary of all interned attributes.  This
+ * is a singleton object which is shared between threads.
+ * Access to this dictionary must be surrounded with a mutex.
+ */
+static struct attr_hashmap g_attr_hashmap;
+
+/* The container for objects stored in "struct attr_hashmap" */
+struct attr_hash_entry {
+	struct hashmap_entry ent; /* must be the first member! */
+	const char *key; /* the key; memory should be owned by value */
+	size_t keylen; /* length of the key */
+	void *value; /* the stored value */
+};
+
+/* attr_hashmap comparison function */
+static int attr_hash_entry_cmp(const struct attr_hash_entry *a,
+			       const struct attr_hash_entry *b,
+			       void *unused)
+{
+	return (a->keylen != b->keylen) || strncmp(a->key, b->key, a->keylen);
+}
+
+/* Initialize an 'attr_hashmap' object */
+void attr_hashmap_init(struct attr_hashmap *map)
+{
+	hashmap_init(&map->map, (hashmap_cmp_fn) attr_hash_entry_cmp, 0);
+}
+
+/*
+ * Retrieve the 'value' stored in a hashmap given the provided 'key'.
+ * If there is no matching entry, return NULL.
+ */
+static void *attr_hashmap_get(struct attr_hashmap *map,
+			      const char *key, size_t keylen)
+{
+	struct attr_hash_entry k;
+	struct attr_hash_entry *e;
+
+	if (!map->map.tablesize)
+		attr_hashmap_init(map);
+
+	hashmap_entry_init(&k, memhash(key, keylen));
+	k.key = key;
+	k.keylen = keylen;
+	e = hashmap_get(&map->map, &k, NULL);
+
+	return e ? e->value : NULL;
+}
+
+/* Add 'value' to a hashmap based on the provided 'key'. */
+static void attr_hashmap_add(struct attr_hashmap *map,
+			     const char *key, size_t keylen,
+			     void *value)
+{
+	struct attr_hash_entry *e;
+
+	if (!map->map.tablesize)
+		attr_hashmap_init(map);
+
+	e = xmalloc(sizeof(struct attr_hash_entry));
+	hashmap_entry_init(e, memhash(key, keylen));
+	e->key = key;
+	e->keylen = keylen;
+	e->value = value;
+
+	hashmap_add(&map->map, e);
 }
 
 static int attr_name_valid(const char *name, size_t namelen)
@@ -103,37 +172,44 @@ static void report_invalid_attr(const char *name, size_t len,
 	strbuf_release(&err);
 }
 
-static struct git_attr *git_attr_internal(const char *name, int len)
+/*
+ * Given a 'name', lookup and return the corresponding attribute in the global
+ * dictionary.  If no entry is found, create a new attribute and store it in
+ * the dictionary.
+ */
+static struct git_attr *git_attr_internal(const char *name, int namelen)
 {
-	unsigned hval = hash_name(name, len);
-	unsigned pos = hval % HASHSIZE;
 	struct git_attr *a;
 
-	for (a = git_attr_hash[pos]; a; a = a->next) {
-		if (a->h == hval &&
-		    !memcmp(a->name, name, len) && !a->name[len])
-			return a;
-	}
-
-	if (!attr_name_valid(name, len))
+	if (!attr_name_valid(name, namelen))
 		return NULL;
 
-	FLEX_ALLOC_MEM(a, name, name, len);
-	a->h = hval;
-	a->next = git_attr_hash[pos];
-	a->attr_nr = attr_nr++;
-	a->maybe_macro = 0;
-	a->maybe_real = 0;
-	git_attr_hash[pos] = a;
+	hashmap_lock(&g_attr_hashmap);
+
+	a = attr_hashmap_get(&g_attr_hashmap, name, namelen);
+
+	if (!a) {
+		FLEX_ALLOC_MEM(a, name, name, namelen);
+		a->attr_nr = g_attr_hashmap.map.size;
+		a->maybe_real = 0;
+		a->maybe_macro = 0;
+
+		attr_hashmap_add(&g_attr_hashmap, a->name, namelen, a);
+		assert(a->attr_nr == (g_attr_hashmap.map.size - 1));
+
+		/*
+		 * NEEDSWORK: per git_attr_check check_all_attr
+		 * will be initialized a lot more lazily, not
+		 * like this, and not here.
+		 */
+		REALLOC_ARRAY(check_all_attr, ++attr_nr);
+		check_all_attr[a->attr_nr].attr = a;
+		check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
+		assert(a->attr_nr == (attr_nr - 1));
+	}
+
+	hashmap_unlock(&g_attr_hashmap);
 
-	/*
-	 * NEEDSWORK: per git_attr_check check_all_attr
-	 * will be initialized a lot more lazily, not
-	 * like this, and not here.
-	 */
-	REALLOC_ARRAY(check_all_attr, attr_nr);
-	check_all_attr[a->attr_nr].attr = a;
-	check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
 	return a;
 }
 
@@ -941,3 +1017,8 @@ void git_attr_set_direction(enum git_attr_direction new, struct index_state *ist
 		drop_attr_stack();
 	use_index = istate;
 }
+
+void attr_start(void)
+{
+	pthread_mutex_init(&g_attr_hashmap.mutex, NULL);
+}
diff --git a/attr.h b/attr.h
index 3db9893ef..8505bca79 100644
--- a/attr.h
+++ b/attr.h
@@ -67,4 +67,6 @@ enum git_attr_direction {
 };
 void git_attr_set_direction(enum git_attr_direction, struct index_state *);
 
+extern void attr_start(void);
+
 #endif /* ATTR_H */
diff --git a/common-main.c b/common-main.c
index c654f9555..6a689007e 100644
--- a/common-main.c
+++ b/common-main.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "exec_cmd.h"
+#include "attr.h"
 
 /*
  * Many parts of Git have subprograms communicate via pipe, expect the
@@ -33,6 +34,8 @@ int main(int argc, const char **argv)
 
 	git_setup_gettext();
 
+	attr_start();
+
 	git_extract_argv0_path(argv[0]);
 
 	restore_sigpipe_to_default();
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 22/27] attr: eliminate global check_all_attr array
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (20 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 21/27] attr: use hashmap for attribute dictionary Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 23/27] attr: remove maybe-real, maybe-macro from git_attr Brandon Williams
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, pclouds, sbeller

The attribute system currently relies on 'check_all_attr', a global
array of 'attr_check_item' entries that stores the value of each
attribute during the collection process.

This patch eliminates this global and instead creates an array per
'attr_check' instance which is then used in the attribute collection
process.  This brings the attribute system one step closer to being
thread-safe.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
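A minimal sketch of the per-instance array idea, not taken from the
patch: the toy_* names and the flat 'dict' array are hypothetical
stand-ins for git's attr_check, attr_check_item and the locked
attribute hashmap.  It only illustrates the "grow when the dictionary
grew, then reset every value" pattern that all_attrs_init() follows.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for ATTR__UNKNOWN. */
static const char unknown_value[] = "(builtin)unknown";

struct toy_attr { int attr_nr; const char *name; };
struct toy_item { const struct toy_attr *attr; const char *value; };
struct toy_check { int all_attrs_nr; struct toy_item *all_attrs; };

/*
 * Resize check->all_attrs when the dictionary has grown, then mark
 * every value as unknown.  The real code iterates a hashmap while
 * holding its mutex; this sketch uses a plain array and no locking.
 */
static void toy_all_attrs_init(const struct toy_attr *dict, int dict_nr,
			       struct toy_check *check)
{
	int i;

	if (dict_nr < check->all_attrs_nr)
		abort(); /* interned attributes are never deleted */

	if (dict_nr != check->all_attrs_nr) {
		struct toy_item *p;

		p = realloc(check->all_attrs, dict_nr * sizeof(*p));
		if (!p)
			abort();
		check->all_attrs = p;
		check->all_attrs_nr = dict_nr;

		/* Slot N always holds the attribute numbered N. */
		for (i = 0; i < dict_nr; i++)
			check->all_attrs[dict[i].attr_nr].attr = &dict[i];
	}

	for (i = 0; i < check->all_attrs_nr; i++)
		check->all_attrs[i].value = unknown_value;
}
```

Because each caller owns its own array, two callers can collect
attributes at the same time without overwriting each other's values;
only the dictionary lookup itself needs the lock.
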
 attr.c | 114 +++++++++++++++++++++++++++++++++++++++++++----------------------
 attr.h |   2 ++
 2 files changed, 78 insertions(+), 38 deletions(-)

diff --git a/attr.c b/attr.c
index 8cf2ea901..38b0d4347 100644
--- a/attr.c
+++ b/attr.c
@@ -34,7 +34,6 @@ struct git_attr {
 	int maybe_real;
 	char name[FLEX_ARRAY]; /* attribute name */
 };
-static int attr_nr;
 
 /*
  * NEEDSWORK: maybe-real, maybe-macro are not property of
@@ -45,9 +44,6 @@ static int attr_nr;
  */
 static int cannot_trust_maybe_real;
 
-/* NEEDSWORK: This will become per git_attr_check */
-static struct attr_check_item *check_all_attr;
-
 const char *git_attr_name(const struct git_attr *attr)
 {
 	return attr->name;
@@ -143,6 +139,52 @@ static void attr_hashmap_add(struct attr_hashmap *map,
 	hashmap_add(&map->map, e);
 }
 
+/*
+ * Reallocate and reinitialize the array of all attributes (which is used in
+ * the attribute collection process) in 'check' based on the global dictionary
+ * of attributes.
+ */
+static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check)
+{
+	int i;
+
+	hashmap_lock(map);
+
+	if (map->map.size < check->all_attrs_nr)
+		die("BUG: interned attributes shouldn't be deleted");
+
+	/*
+	 * If the number of attributes in the global dictionary has increased
+	 * (or this attr_check instance doesn't have an initialized all_attrs
+	 * field), reallocate the provided attr_check instance's all_attrs
+	 * field and fill each entry with its corresponding git_attr.
+	 */
+	if (map->map.size != check->all_attrs_nr) {
+		struct attr_hash_entry *e;
+		struct hashmap_iter iter;
+		hashmap_iter_init(&map->map, &iter);
+
+		REALLOC_ARRAY(check->all_attrs, map->map.size);
+		check->all_attrs_nr = map->map.size;
+
+		while ((e = hashmap_iter_next(&iter))) {
+			const struct git_attr *a = e->value;
+			check->all_attrs[a->attr_nr].attr = a;
+		}
+	}
+
+	hashmap_unlock(map);
+
+	/*
+	 * Re-initialize every entry in check->all_attrs.
+	 * This re-initialization can live outside of the locked region since
+	 * the attribute dictionary is no longer being accessed.
+	 */
+	for (i = 0; i < check->all_attrs_nr; i++) {
+		check->all_attrs[i].value = ATTR__UNKNOWN;
+	}
+}
+
 static int attr_name_valid(const char *name, size_t namelen)
 {
 	/*
@@ -196,16 +238,6 @@ static struct git_attr *git_attr_internal(const char *name, int namelen)
 
 		attr_hashmap_add(&g_attr_hashmap, a->name, namelen, a);
 		assert(a->attr_nr == (g_attr_hashmap.map.size - 1));
-
-		/*
-		 * NEEDSWORK: per git_attr_check check_all_attr
-		 * will be initialized a lot more lazily, not
-		 * like this, and not here.
-		 */
-		REALLOC_ARRAY(check_all_attr, ++attr_nr);
-		check_all_attr[a->attr_nr].attr = a;
-		check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
-		assert(a->attr_nr == (attr_nr - 1));
 	}
 
 	hashmap_unlock(&g_attr_hashmap);
@@ -791,16 +823,16 @@ static int path_matches(const char *pathname, int pathlen,
 			      pattern, prefix, pat->patternlen, pat->flags);
 }
 
-static int macroexpand_one(int attr_nr, int rem);
+static int macroexpand_one(struct attr_check_item *all_attrs, int nr, int rem);
 
-static int fill_one(const char *what, struct match_attr *a, int rem)
+static int fill_one(const char *what, struct attr_check_item *all_attrs,
+		    struct match_attr *a, int rem)
 {
-	struct attr_check_item *check = check_all_attr;
 	int i;
 
-	for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) {
+	for (i = a->num_attr - 1; rem > 0 && i >= 0; i--) {
 		struct git_attr *attr = a->state[i].attr;
-		const char **n = &(check[attr->attr_nr].value);
+		const char **n = &(all_attrs[attr->attr_nr].value);
 		const char *v = a->state[i].setto;
 
 		if (*n == ATTR__UNKNOWN) {
@@ -809,14 +841,15 @@ static int fill_one(const char *what, struct match_attr *a, int rem)
 				  attr, v);
 			*n = v;
 			rem--;
-			rem = macroexpand_one(attr->attr_nr, rem);
+			rem = macroexpand_one(all_attrs, attr->attr_nr, rem);
 		}
 	}
 	return rem;
 }
 
 static int fill(const char *path, int pathlen, int basename_offset,
-		struct attr_stack *stk, int rem)
+		struct attr_stack *stk, struct attr_check_item *all_attrs,
+		int rem)
 {
 	int i;
 	const char *base = stk->origin ? stk->origin : "";
@@ -827,18 +860,18 @@ static int fill(const char *path, int pathlen, int basename_offset,
 			continue;
 		if (path_matches(path, pathlen, basename_offset,
 				 &a->u.pat, base, stk->originlen))
-			rem = fill_one("fill", a, rem);
+			rem = fill_one("fill", all_attrs, a, rem);
 	}
 	return rem;
 }
 
-static int macroexpand_one(int nr, int rem)
+static int macroexpand_one(struct attr_check_item *all_attrs, int nr, int rem)
 {
 	struct attr_stack *stk;
 	int i;
 
-	if (check_all_attr[nr].value != ATTR__TRUE ||
-	    !check_all_attr[nr].attr->maybe_macro)
+	if (all_attrs[nr].value != ATTR__TRUE ||
+	    !all_attrs[nr].attr->maybe_macro)
 		return rem;
 
 	for (stk = attr_stack; stk; stk = stk->prev) {
@@ -847,7 +880,7 @@ static int macroexpand_one(int nr, int rem)
 			if (!ma->is_macro)
 				continue;
 			if (ma->u.attr->attr_nr == nr)
-				return fill_one("expand", ma, rem);
+				return fill_one("expand", all_attrs, ma, rem);
 		}
 	}
 
@@ -855,9 +888,9 @@ static int macroexpand_one(int nr, int rem)
 }
 
 /*
- * Collect attributes for path into the array pointed to by
- * check_all_attr. If num is non-zero, only attributes in check[] are
- * collected. Otherwise all attributes are collected.
+ * Collect attributes for path into the array pointed to by check->all_attrs.
+ * If check->check_nr is non-zero, only attributes in check[] are collected.
+ * Otherwise all attributes are collected.
  */
 static void collect_some_attrs(const char *path, struct attr_check *check)
 {
@@ -880,15 +913,15 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 	}
 
 	prepare_attr_stack(path, dirlen);
-	for (i = 0; i < attr_nr; i++)
-		check_all_attr[i].value = ATTR__UNKNOWN;
+	all_attrs_init(&g_attr_hashmap, check);
+
 	if (check->check_nr && !cannot_trust_maybe_real) {
 		rem = 0;
 		for (i = 0; i < check->check_nr; i++) {
 			const struct git_attr *a = check->check[i].attr;
 			if (!a->maybe_real) {
 				struct attr_check_item *c;
-				c = check_all_attr + a->attr_nr;
+				c = check->all_attrs + a->attr_nr;
 				c->value = ATTR__UNSET;
 				rem++;
 			}
@@ -897,9 +930,9 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 			return;
 	}
 
-	rem = attr_nr;
+	rem = check->all_attrs_nr;
 	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
-		rem = fill(path, pathlen, basename_offset, stk, rem);
+		rem = fill(path, pathlen, basename_offset, stk, check->all_attrs, rem);
 }
 
 int git_check_attr(const char *path, struct attr_check *check)
@@ -909,7 +942,8 @@ int git_check_attr(const char *path, struct attr_check *check)
 	collect_some_attrs(path, check);
 
 	for (i = 0; i < check->check_nr; i++) {
-		const char *value = check_all_attr[check->check[i].attr->attr_nr].value;
+		size_t index = check->check[i].attr->attr_nr;
+		const char *value = check->all_attrs[index].value;
 		if (value == ATTR__UNKNOWN)
 			value = ATTR__UNSET;
 		check->check[i].value = value;
@@ -925,9 +959,9 @@ void git_all_attrs(const char *path, struct attr_check *check)
 	attr_check_reset(check);
 	collect_some_attrs(path, check);
 
-	for (i = 0; i < attr_nr; i++) {
-		const char *name = check_all_attr[i].attr->name;
-		const char *value = check_all_attr[i].value;
+	for (i = 0; i < check->all_attrs_nr; i++) {
+		const char *name = check->all_attrs[i].attr->name;
+		const char *value = check->all_attrs[i].value;
 		struct attr_check_item *item;
 		if (value == ATTR__UNSET || value == ATTR__UNKNOWN)
 			continue;
@@ -997,6 +1031,10 @@ void attr_check_clear(struct attr_check *check)
 	check->check = NULL;
 	check->check_alloc = 0;
 	check->check_nr = 0;
+
+	free(check->all_attrs);
+	check->all_attrs = NULL;
+	check->all_attrs_nr = 0;
 }
 
 void attr_check_free(struct attr_check *check)
diff --git a/attr.h b/attr.h
index 8505bca79..44b21d82c 100644
--- a/attr.h
+++ b/attr.h
@@ -33,6 +33,8 @@ struct attr_check {
 	int check_nr;
 	int check_alloc;
 	struct attr_check_item *check;
+	int all_attrs_nr;
+	struct attr_check_item *all_attrs;
 };
 
 extern struct attr_check *attr_check_alloc(void);
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 23/27] attr: remove maybe-real, maybe-macro from git_attr
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (21 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 22/27] attr: eliminate global check_all_attr array Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 24/27] attr: tighten const correctness with git_attr and match_attr Brandon Williams
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, pclouds, sbeller

Whether a git attribute is real or a macro isn't a property of the
attribute itself; it depends on the attribute stack (that is, on which
.gitattributes files were read).

This patch removes the 'maybe_real' and 'maybe_macro' fields from
git_attr and instead adds a 'macro' field to attr_check_item.  When
non-NULL, 'macro' indicates that a particular attribute is a macro
for the given attribute stack.  It's populated, through a quick scan of
the attribute stack, with the match_attr that corresponds to the macro's
definition.  This way the attribute stack only needs to be scanned a
single time prior to attribute collection instead of each time a macro
needs to be expanded.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
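A minimal sketch of the single-scan idea, not taken from the patch:
the toy_* types are hypothetical and much simpler than git's
match_attr and attr_stack (for instance, 'attrs' here is a flat array
of structs rather than an array of pointers).  It shows how one walk
from the top frame down records, per attribute number, the first macro
definition seen, so that later expansion is a direct lookup.

```c
#include <assert.h>
#include <stddef.h>

struct toy_match { int is_macro; int attr_nr; };

struct toy_frame {
	const struct toy_frame *prev;	/* frame below this one */
	int num_matches;
	const struct toy_match *attrs;
};

struct toy_item { const struct toy_match *macro; };

/*
 * Walk the stack from top to bottom and, within a frame, from the
 * last line to the first.  The first definition found for an
 * attribute wins, so frames nearer the top take precedence.
 */
static void toy_determine_macros(struct toy_item *all_attrs,
				 const struct toy_frame *stack)
{
	for (; stack; stack = stack->prev) {
		int i;

		for (i = stack->num_matches - 1; i >= 0; i--) {
			const struct toy_match *ma = &stack->attrs[i];

			if (ma->is_macro && !all_attrs[ma->attr_nr].macro)
				all_attrs[ma->attr_nr].macro = ma;
		}
	}
}
```

With the pointer cached, the expansion step only has to test whether
the item's 'macro' is non-NULL instead of rescanning every frame.
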
 attr.c | 69 ++++++++++++++++++++++++++++++------------------------------------
 attr.h |  6 ++++++
 2 files changed, 37 insertions(+), 38 deletions(-)

diff --git a/attr.c b/attr.c
index 38b0d4347..633a12cc3 100644
--- a/attr.c
+++ b/attr.c
@@ -30,20 +30,9 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 
 struct git_attr {
 	int attr_nr; /* unique attribute number */
-	int maybe_macro;
-	int maybe_real;
 	char name[FLEX_ARRAY]; /* attribute name */
 };
 
-/*
- * NEEDSWORK: maybe-real, maybe-macro are not property of
- * an attribute, as it depends on what .gitattributes are
- * read.  Once we introduce per git_attr_check attr_stack
- * and check_all_attr, the optimization based on them will
- * become unnecessary and can go away.  So is this variable.
- */
-static int cannot_trust_maybe_real;
-
 const char *git_attr_name(const struct git_attr *attr)
 {
 	return attr->name;
@@ -182,6 +171,7 @@ static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check)
 	 */
 	for (i = 0; i < check->all_attrs_nr; i++) {
 		check->all_attrs[i].value = ATTR__UNKNOWN;
+		check->all_attrs[i].macro = NULL;
 	}
 }
 
@@ -233,8 +223,6 @@ static struct git_attr *git_attr_internal(const char *name, int namelen)
 	if (!a) {
 		FLEX_ALLOC_MEM(a, name, name, namelen);
 		a->attr_nr = g_attr_hashmap.map.size;
-		a->maybe_real = 0;
-		a->maybe_macro = 0;
 
 		attr_hashmap_add(&g_attr_hashmap, a->name, namelen, a);
 		assert(a->attr_nr == (g_attr_hashmap.map.size - 1));
@@ -397,7 +385,6 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		      (is_macro ? 0 : namelen + 1));
 	if (is_macro) {
 		res->u.attr = git_attr_internal(name, namelen);
-		res->u.attr->maybe_macro = 1;
 	} else {
 		char *p = (char *)&(res->state[num_attr]);
 		memcpy(p, name, namelen);
@@ -418,10 +405,6 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	/* Second pass to fill the attr_states */
 	for (cp = states, i = 0; *cp; i++) {
 		cp = parse_attr(src, lineno, cp, &(res->state[i]));
-		if (!is_macro)
-			res->state[i].attr->maybe_real = 1;
-		if (res->state[i].attr->maybe_macro)
-			cannot_trust_maybe_real = 1;
 	}
 
 	strbuf_release(&pattern);
@@ -826,7 +809,7 @@ static int path_matches(const char *pathname, int pathlen,
 static int macroexpand_one(struct attr_check_item *all_attrs, int nr, int rem);
 
 static int fill_one(const char *what, struct attr_check_item *all_attrs,
-		    struct match_attr *a, int rem)
+		    const struct match_attr *a, int rem)
 {
 	int i;
 
@@ -867,24 +850,34 @@ static int fill(const char *path, int pathlen, int basename_offset,
 
 static int macroexpand_one(struct attr_check_item *all_attrs, int nr, int rem)
 {
-	struct attr_stack *stk;
-	int i;
+	const struct attr_check_item *item = &all_attrs[nr];
 
-	if (all_attrs[nr].value != ATTR__TRUE ||
-	    !all_attrs[nr].attr->maybe_macro)
+	if (item->macro && item->value == ATTR__TRUE)
+		return fill_one("expand", all_attrs, item->macro, rem);
+	else
 		return rem;
+}
 
-	for (stk = attr_stack; stk; stk = stk->prev) {
-		for (i = stk->num_matches - 1; 0 <= i; i--) {
-			struct match_attr *ma = stk->attrs[i];
-			if (!ma->is_macro)
-				continue;
-			if (ma->u.attr->attr_nr == nr)
-				return fill_one("expand", all_attrs, ma, rem);
+/*
+ * Marks the attributes which are macros based on the attribute stack.
+ * This prevents having to search through the attribute stack each time
+ * a macro needs to be expanded during the fill stage.
+ */
+static void determine_macros(struct attr_check_item *all_attrs,
+			     const struct attr_stack *stack)
+{
+	for (; stack; stack = stack->prev) {
+		int i;
+		for (i = stack->num_matches - 1; i >= 0; i--) {
+			const struct match_attr *ma = stack->attrs[i];
+			if (ma->is_macro) {
+				int n = ma->u.attr->attr_nr;
+				if (!all_attrs[n].macro) {
+					all_attrs[n].macro = ma;
+				}
+			}
 		}
 	}
-
-	return rem;
 }
 
 /*
@@ -914,15 +907,15 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 
 	prepare_attr_stack(path, dirlen);
 	all_attrs_init(&g_attr_hashmap, check);
+	determine_macros(check->all_attrs, attr_stack);
 
-	if (check->check_nr && !cannot_trust_maybe_real) {
+	if (check->check_nr) {
 		rem = 0;
 		for (i = 0; i < check->check_nr; i++) {
-			const struct git_attr *a = check->check[i].attr;
-			if (!a->maybe_real) {
-				struct attr_check_item *c;
-				c = check->all_attrs + a->attr_nr;
-				c->value = ATTR__UNSET;
+			int n = check->check[i].attr->attr_nr;
+			struct attr_check_item *item = &check->all_attrs[n];
+			if (item->macro) {
+				item->value = ATTR__UNSET;
 				rem++;
 			}
 		}
diff --git a/attr.h b/attr.h
index 44b21d82c..f40524875 100644
--- a/attr.h
+++ b/attr.h
@@ -27,6 +27,12 @@ extern const char git_attr__false[];
 struct attr_check_item {
 	const struct git_attr *attr;
 	const char *value;
+	/*
+	 * If 'macro' is non-NULL, indicates that 'attr' is a macro based on
+	 * the current attribute stack and contains a pointer to the match_attr
+	 * definition of the macro
+	 */
+	const struct match_attr *macro;
 };
 
 struct attr_check {
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 24/27] attr: tighten const correctness with git_attr and match_attr
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (22 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 23/27] attr: remove maybe-real, maybe-macro from git_attr Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 25/27] attr: store attribute stacks in hashmap Brandon Williams
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, pclouds, sbeller

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c               | 14 +++++++-------
 attr.h               |  2 +-
 builtin/check-attr.c |  3 ++-
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/attr.c b/attr.c
index 633a12cc3..90f576044 100644
--- a/attr.c
+++ b/attr.c
@@ -209,7 +209,7 @@ static void report_invalid_attr(const char *name, size_t len,
  * dictionary.  If no entry is found, create a new attribute and store it in
  * the dictionary.
  */
-static struct git_attr *git_attr_internal(const char *name, int namelen)
+static const struct git_attr *git_attr_internal(const char *name, int namelen)
 {
 	struct git_attr *a;
 
@@ -233,14 +233,14 @@ static struct git_attr *git_attr_internal(const char *name, int namelen)
 	return a;
 }
 
-struct git_attr *git_attr(const char *name)
+const struct git_attr *git_attr(const char *name)
 {
 	return git_attr_internal(name, strlen(name));
 }
 
 /* What does a matched pattern decide? */
 struct attr_state {
-	struct git_attr *attr;
+	const struct git_attr *attr;
 	const char *setto;
 };
 
@@ -267,7 +267,7 @@ struct pattern {
 struct match_attr {
 	union {
 		struct pattern pat;
-		struct git_attr *attr;
+		const struct git_attr *attr;
 	} u;
 	char is_macro;
 	unsigned num_attr;
@@ -814,7 +814,7 @@ static int fill_one(const char *what, struct attr_check_item *all_attrs,
 	int i;
 
 	for (i = a->num_attr - 1; rem > 0 && i >= 0; i--) {
-		struct git_attr *attr = a->state[i].attr;
+		const struct git_attr *attr = a->state[i].attr;
 		const char **n = &(all_attrs[attr->attr_nr].value);
 		const char *v = a->state[i].setto;
 
@@ -838,7 +838,7 @@ static int fill(const char *path, int pathlen, int basename_offset,
 	const char *base = stk->origin ? stk->origin : "";
 
 	for (i = stk->num_matches - 1; 0 < rem && 0 <= i; i--) {
-		struct match_attr *a = stk->attrs[i];
+		const struct match_attr *a = stk->attrs[i];
 		if (a->is_macro)
 			continue;
 		if (path_matches(path, pathlen, basename_offset,
@@ -988,7 +988,7 @@ struct attr_check *attr_check_initl(const char *one, ...)
 	check->check[0].attr = git_attr(one);
 	va_start(params, one);
 	for (cnt = 1; cnt < check->check_nr; cnt++) {
-		struct git_attr *attr;
+		const struct git_attr *attr;
 		param = va_arg(params, const char *);
 		if (!param)
 			die("BUG: counted %d != ended at %d",
diff --git a/attr.h b/attr.h
index f40524875..9b4dc07d8 100644
--- a/attr.h
+++ b/attr.h
@@ -8,7 +8,7 @@ struct git_attr;
  * Given a string, return the gitattribute object that
  * corresponds to it.
  */
-struct git_attr *git_attr(const char *);
+const struct git_attr *git_attr(const char *);
 
 /* Internal use */
 extern const char git_attr__true[];
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index 3d4704be5..cc6caf7ac 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -166,7 +166,8 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix)
 	check = attr_check_alloc();
 	if (!all_attrs) {
 		for (i = 0; i < cnt; i++) {
-			struct git_attr *a = git_attr(argv[i]);
+			const struct git_attr *a = git_attr(argv[i]);
+
 			if (!a)
 				return error("%s: not a valid attribute name",
 					     argv[i]);
-- 
2.11.0.390.gc69c2f50cf-goog



* [PATCH 25/27] attr: store attribute stacks in hashmap
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (23 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 24/27] attr: tighten const correctness with git_attr and match_attr Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-13 21:20   ` Junio C Hamano
  2017-01-18 20:39   ` Stefan Beller
  2017-01-12 23:53 ` [PATCH 26/27] attr: push the bare repo check into read_attr() Brandon Williams
                   ` (3 subsequent siblings)
  28 siblings, 2 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, pclouds, sbeller

The last big hurdle towards a thread-safe API for the attribute system
is the reliance on a global attribute stack that is modified during each
call into the attribute system.

This patch removes this global stack and instead a stack is retrieved or
constructed locally.  Since each of these stacks is only used as a
read-only structure once constructed, they can be stored in a hashmap
and shared between threads.  The key into the hashmap of attribute
stacks is, in the general case, the directory that corresponds to the
attribute stack frame.  For the core stack frames (builtin, system,
home, and info) a key of ".git/<name>-attr" is used to prevent potential
collisions since a directory or file named ".git" is disallowed.

One caveat with storing and sharing the stack frames like this is that
the info stack needs to be treated separately from the rest of the
attribute stack.  This is because each stack frame holds a pointer to
the stack that comes before it; if the info frame were placed on top of
the rest of the attribute stack, this pointer would be different for
each attribute stack and the frame couldn't be shared between threads.
To allow the info stack frame to be shared, it needs to be its own
isolated frame, which can simply be processed first to have the same
effect as being at the top of the stack.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
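A minimal sketch of the key convention described above, not taken
from the patch: toy_get_base() is a hypothetical, standalone
counterpart to attr_stack_get_base().  Frames keyed as ".git/<name>"
(builtin, system, home, info) match paths relative to the top level,
so their pattern-matching base is the empty string; every other frame
uses its directory name as the base.

```c
#include <assert.h>
#include <string.h>

/*
 * Map a frame's hashmap key ('origin') to the base used for pattern
 * matching.  Keys under ".git/" cannot collide with a tracked
 * directory, so they are safe to reserve for the core frames.
 */
static const char *toy_get_base(const char *origin, size_t originlen,
				size_t *baselen)
{
	if (!strncmp(origin, ".git/", 5)) {
		*baselen = 0;
		return "";
	}

	*baselen = originlen;
	return origin;
}
```

The root directory frame keeps the empty string as its key, so it
already yields an empty base without any conversion.
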
 attr.c | 375 +++++++++++++++++++++++++++++++++++++++++------------------------
 1 file changed, 235 insertions(+), 140 deletions(-)

diff --git a/attr.c b/attr.c
index 90f576044..78562592b 100644
--- a/attr.c
+++ b/attr.c
@@ -434,17 +434,19 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
  * .gitignore file and info/excludes file as a fallback.
  */
 
-/* NEEDSWORK: This will become per git_attr_check */
-static struct attr_stack {
-	struct attr_stack *prev;
+struct attr_stack {
+	const struct attr_stack *prev;
 	char *origin;
 	size_t originlen;
 	unsigned num_matches;
 	unsigned alloc;
 	struct match_attr **attrs;
-} *attr_stack;
+};
+
+/* Dictionary of stack frames; access should be surrounded by mutex */
+static struct attr_hashmap g_stack_hashmap;
 
-static void free_attr_elem(struct attr_stack *e)
+static void attr_stack_free(struct attr_stack *e)
 {
 	int i;
 	free(e->origin);
@@ -467,6 +469,25 @@ static void free_attr_elem(struct attr_stack *e)
 	free(e);
 }
 
+static void drop_attr_stack(void)
+{
+	struct hashmap_iter iter;
+	struct attr_hash_entry *e;
+
+	hashmap_lock(&g_stack_hashmap);
+
+	hashmap_iter_init(&g_stack_hashmap.map, &iter);
+	while ((e = hashmap_iter_next(&iter))) {
+		struct attr_stack *stack = e->value;
+		attr_stack_free(stack);
+		free(e);
+	}
+
+	hashmap_free(&g_stack_hashmap.map, 0);
+
+	hashmap_unlock(&g_stack_hashmap);
+}
+
 static const char *builtin_attr[] = {
 	"[attr]binary -diff -merge -text",
 	NULL,
@@ -621,15 +642,6 @@ static void debug_set(const char *what, const char *match, struct git_attr *attr
 #define debug_set(a,b,c,d) do { ; } while (0)
 #endif /* DEBUG_ATTR */
 
-static void drop_attr_stack(void)
-{
-	while (attr_stack) {
-		struct attr_stack *elem = attr_stack;
-		attr_stack = elem->prev;
-		free_attr_elem(elem);
-	}
-}
-
 static const char *git_etc_gitattributes(void)
 {
 	static const char *system_wide;
@@ -638,6 +650,14 @@ static const char *git_etc_gitattributes(void)
 	return system_wide;
 }
 
+static const char *get_home_gitattributes(void)
+{
+	if (!git_attributes_file)
+		git_attributes_file = xdg_config_home("attributes");
+
+	return git_attributes_file;
+}
+
 static int git_attr_system(void)
 {
 	return !git_env_bool("GIT_ATTR_NOSYSTEM", 0);
@@ -645,142 +665,208 @@ static int git_attr_system(void)
 
 static GIT_PATH_FUNC(git_path_info_attributes, INFOATTRIBUTES_FILE)
 
-static void push_stack(struct attr_stack **attr_stack_p,
-		       struct attr_stack *elem, char *origin, size_t originlen)
+/*
+ * This function should only be called from 'get_attr_stack()' or
+ * 'get_info_stack()', which already needs to acquire the lock to the stack
+ * hashmap, so there is no need to also acquire the lock in this function.
+ */
+static void push_stack(const struct attr_stack **attr_stack_p,
+		       struct attr_stack *elem,
+		       const char *origin, size_t originlen)
 {
 	if (elem) {
-		elem->origin = origin;
-		if (origin)
-			elem->originlen = originlen;
+		elem->origin = xmemdupz(origin, originlen);
+		elem->originlen = originlen;
 		elem->prev = *attr_stack_p;
 		*attr_stack_p = elem;
+		attr_hashmap_add(&g_stack_hashmap, elem->origin,
+				 elem->originlen, elem);
 	}
 }
 
-static void bootstrap_attr_stack(void)
+/*
+ * Return the path base that can be used in the pattern matching operation.  In
+ * order to enable storing the core and info stack frames in the stack hashmap
+ * an origin string other than NULL needed to be used.  Since git disallows
+ * tracking a ".git" file or directory the core and info stack frames have an
+ * origin string of ".git/<frame>" and must be converted to the empty string
+ * when being used to pattern match.
+ */
+static const char *attr_stack_get_base(const struct attr_stack *stack,
+				       size_t *baselen)
 {
-	struct attr_stack *elem;
+	const char *base;
 
-	if (attr_stack)
-		return;
+	if (starts_with(stack->origin, ".git/")) {
+		base = "";
+		*baselen = 0;
+	} else {
+		base = stack->origin;
+		*baselen = stack->originlen;
+	}
 
-	push_stack(&attr_stack, read_attr_from_array(builtin_attr), NULL, 0);
+	return base;
+}
 
-	if (git_attr_system())
-		push_stack(&attr_stack,
-			   read_attr_from_file(git_etc_gitattributes(), 1),
-			   NULL, 0);
+/*
+ * At the bottom of the attribute stack is the built-in
+ * set of attribute definitions, followed by the contents
+ * of $(prefix)/etc/gitattributes and a file specified by
+ * core.attributesfile.  Then, contents from
+ * .gitattributes files from directories closer to the
+ * root to the ones in deeper directories are pushed
+ * to the stack.  Finally, at the very top of the stack
+ * we always keep the contents of $GIT_DIR/info/attributes.
+ *
+ * When checking, we use entries from near the top of the
+ * stack, preferring $GIT_DIR/info/attributes, then
+ * .gitattributes in deeper directories to shallower ones,
+ * and finally use the built-in set as the default.
+ *
+ * The info stack needs to be treated separately from the rest of the attribute
+ * stack.  This is because each stack frame holds a pointer to the stack that
+ * comes before it and if it was placed on top of the rest of the attribute
+ * stack then this pointer would be different for each attribute stack and
+ * wouldn't be able to be shared between threads.  If the info stack is to be
+ * shared then it needs to be its own isolated frame and can simply be
+ * processed first to have the same effect as being at the top of the stack.
+ */
+static const struct attr_stack *get_info_stack(void)
+{
+	const struct attr_stack *info;
+	const char *key = ".git/info-attr";
+	size_t keylen = strlen(key);
 
-	if (!git_attributes_file)
-		git_attributes_file = xdg_config_home("attributes");
-	if (git_attributes_file)
-		push_stack(&attr_stack,
-			   read_attr_from_file(git_attributes_file, 1),
-			   NULL, 0);
-
-	if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
-		elem = read_attr(GITATTRIBUTES_FILE, 1);
-		push_stack(&attr_stack, elem, xstrdup(""), 0);
-		debug_push(elem);
+	hashmap_lock(&g_stack_hashmap);
+
+	info = attr_hashmap_get(&g_stack_hashmap, key, keylen);
+
+	if (!info) {
+		struct attr_stack *e = NULL;
+
+		if (startup_info->have_repository)
+			e = read_attr_from_file(git_path_info_attributes(), 1);
+
+		if (!e)
+			e = xcalloc(1, sizeof(struct attr_stack));
+		e->origin = xstrdup(key);
+		e->originlen = keylen;
+
+		attr_hashmap_add(&g_stack_hashmap, e->origin, e->originlen, e);
+		info = e;
 	}
 
-	if (startup_info->have_repository)
-		elem = read_attr_from_file(git_path_info_attributes(), 1);
-	else
-		elem = NULL;
+	hashmap_unlock(&g_stack_hashmap);
 
-	if (!elem)
-		elem = xcalloc(1, sizeof(*elem));
-	push_stack(&attr_stack, elem, NULL, 0);
+	return info;
 }
 
-static void prepare_attr_stack(const char *path, int dirlen)
+/*
+ * This function should only be called from 'get_attr_stack()', which already
+ * needs to acquire the lock to the stack hashmap, so there is no need to also
+ * acquire the lock in this function.
+ */
+static const struct attr_stack *core_attr_stack(void)
 {
-	struct attr_stack *elem, *info;
-	const char *cp;
+	const struct attr_stack *core;
 
-	/*
-	 * At the bottom of the attribute stack is the built-in
-	 * set of attribute definitions, followed by the contents
-	 * of $(prefix)/etc/gitattributes and a file specified by
-	 * core.attributesfile.  Then, contents from
-	 * .gitattribute files from directories closer to the
-	 * root to the ones in deeper directories are pushed
-	 * to the stack.  Finally, at the very top of the stack
-	 * we always keep the contents of $GIT_DIR/info/attributes.
-	 *
-	 * When checking, we use entries from near the top of the
-	 * stack, preferring $GIT_DIR/info/attributes, then
-	 * .gitattributes in deeper directories to shallower ones,
-	 * and finally use the built-in set as the default.
-	 */
-	bootstrap_attr_stack();
+	core = attr_hashmap_get(&g_stack_hashmap, "", 0);
 
-	/*
-	 * Pop the "info" one that is always at the top of the stack.
-	 */
-	info = attr_stack;
-	attr_stack = info->prev;
+	if (!core) {
+		struct attr_stack *e;
+		const char *key;
 
-	/*
-	 * Pop the ones from directories that are not the prefix of
-	 * the path we are checking. Break out of the loop when we see
-	 * the root one (whose origin is an empty string "") or the builtin
-	 * one (whose origin is NULL) without popping it.
-	 */
-	while (attr_stack->origin) {
-		int namelen = strlen(attr_stack->origin);
-
-		elem = attr_stack;
-		if (namelen <= dirlen &&
-		    !strncmp(elem->origin, path, namelen) &&
-		    (!namelen || path[namelen] == '/'))
-			break;
-
-		debug_pop(elem);
-		attr_stack = elem->prev;
-		free_attr_elem(elem);
-	}
+		/* builtin frame */
+		e = read_attr_from_array(builtin_attr);
+		key = ".git/builtin-attr";
+		push_stack(&core, e, key, strlen(key));
 
-	/*
-	 * Read from parent directories and push them down
-	 */
-	if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
-		/*
-		 * bootstrap_attr_stack() should have added, and the
-		 * above loop should have stopped before popping, the
-		 * root element whose attr_stack->origin is set to an
-		 * empty string.
-		 */
-		struct strbuf pathbuf = STRBUF_INIT;
-
-		assert(attr_stack->origin);
-		while (1) {
-			size_t len = strlen(attr_stack->origin);
-			char *origin;
-
-			if (dirlen <= len)
-				break;
-			cp = memchr(path + len + 1, '/', dirlen - len - 1);
-			if (!cp)
-				cp = path + dirlen;
-			strbuf_addf(&pathbuf,
-				    "%.*s/%s", (int)(cp - path), path,
-				    GITATTRIBUTES_FILE);
-			elem = read_attr(pathbuf.buf, 0);
-			strbuf_setlen(&pathbuf, cp - path);
-			origin = strbuf_detach(&pathbuf, &len);
-			push_stack(&attr_stack, elem, origin, len);
-			debug_push(elem);
+		/* system-wide frame */
+		if (git_attr_system()) {
+			e = read_attr_from_file(git_etc_gitattributes(), 1);
+			key = ".git/system-attr";
+			push_stack(&core, e, key, strlen(key));
 		}
 
-		strbuf_release(&pathbuf);
+		/* home directory */
+		if (get_home_gitattributes()) {
+			e = read_attr_from_file(get_home_gitattributes(), 1);
+			key = ".git/home-attr";
+			push_stack(&core, e, key, strlen(key));
+		}
+
+		/* root directory */
+		if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
+			e = read_attr(GITATTRIBUTES_FILE, 1);
+		} else {
+			e = xcalloc(1, sizeof(struct attr_stack));
+		}
+		key = "";
+		push_stack(&core, e, key, strlen(key));
 	}
 
-	/*
-	 * Finally push the "info" one at the top of the stack.
-	 */
-	push_stack(&attr_stack, info, NULL, 0);
+	assert(core);
+	return core;
+}
+
+static const struct attr_stack *get_attr_stack(const char *path, int dirlen)
+{
+	const struct attr_stack *stack = NULL;
+	struct strbuf key = STRBUF_INIT;
+
+	strbuf_addstr(&key, path);
+
+	hashmap_lock(&g_stack_hashmap);
+
+	/* Search for the deepest, pre-constructed stack frame */
+	while (key.len && !stack) {
+		size_t len = key.len;
+
+		/* Find start of the last component */
+		while (len > 0 && !is_dir_sep(key.buf[len - 1]))
+			len--;
+		/* Skip path-separator */
+		if (len > 0 && is_dir_sep(key.buf[len - 1]))
+			len--;
+		strbuf_setlen(&key, len);
+
+		stack = attr_hashmap_get(&g_stack_hashmap, key.buf, key.len);
+	}
+
+	/* At least start with the core stack */
+	if (!stack) {
+		stack = core_attr_stack();
+	}
+
+	/* Build up to the directory 'path' is in */
+	while (key.len < dirlen) {
+		size_t len = key.len;
+		struct attr_stack *next;
+
+		/* Skip path-separator */
+		if (len < dirlen && is_dir_sep(path[len]))
+			len++;
+		/* Find the end of the next component */
+		while (len < dirlen && !is_dir_sep(path[len]))
+			len++;
+
+		if (key.len > 0)
+			strbuf_addch(&key, '/');
+		strbuf_add(&key, path + key.len, (len - key.len));
+		strbuf_addf(&key, "/%s", GITATTRIBUTES_FILE);
+
+		next = read_attr(key.buf, 0);
+
+		/* reset the keybuffer to not include "/.gitattributes" */
+		strbuf_setlen(&key, len);
+
+		push_stack(&stack, next, key.buf, key.len);
+	}
+
+	hashmap_unlock(&g_stack_hashmap);
+
+	strbuf_release(&key);
+	return stack;
 }
 
 static int path_matches(const char *pathname, int pathlen,
@@ -831,20 +917,24 @@ static int fill_one(const char *what, struct attr_check_item *all_attrs,
 }
 
 static int fill(const char *path, int pathlen, int basename_offset,
-		struct attr_stack *stk, struct attr_check_item *all_attrs,
+		const struct attr_stack *stack, struct attr_check_item *all_attrs,
 		int rem)
 {
-	int i;
-	const char *base = stk->origin ? stk->origin : "";
-
-	for (i = stk->num_matches - 1; 0 < rem && 0 <= i; i--) {
-		const struct match_attr *a = stk->attrs[i];
-		if (a->is_macro)
-			continue;
-		if (path_matches(path, pathlen, basename_offset,
-				 &a->u.pat, base, stk->originlen))
-			rem = fill_one("fill", all_attrs, a, rem);
+	for (; rem > 0 && stack; stack = stack->prev) {
+		int i;
+		size_t baselen;
+		const char *base = attr_stack_get_base(stack, &baselen);
+
+		for (i = stack->num_matches - 1; rem > 0 && i >= 0; i--) {
+			const struct match_attr *a = stack->attrs[i];
+			if (a->is_macro)
+				continue;
+			if (path_matches(path, pathlen, basename_offset,
+					 &a->u.pat, base, baselen))
+				rem = fill_one("fill", all_attrs, a, rem);
+		}
 	}
+
 	return rem;
 }
 
@@ -887,10 +977,11 @@ static void determine_macros(struct attr_check_item *all_attrs,
  */
 static void collect_some_attrs(const char *path, struct attr_check *check)
 {
-	struct attr_stack *stk;
 	int i, pathlen, rem, dirlen;
 	const char *cp, *last_slash = NULL;
 	int basename_offset;
+	const struct attr_stack *stack;
+	const struct attr_stack *info;
 
 	for (cp = path; *cp; cp++) {
 		if (*cp == '/' && cp[1])
@@ -905,9 +996,12 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 		dirlen = 0;
 	}
 
-	prepare_attr_stack(path, dirlen);
+	info = get_info_stack();
+	stack = get_attr_stack(path, dirlen);
+
 	all_attrs_init(&g_attr_hashmap, check);
-	determine_macros(check->all_attrs, attr_stack);
+	determine_macros(check->all_attrs, info);
+	determine_macros(check->all_attrs, stack);
 
 	if (check->check_nr) {
 		rem = 0;
@@ -924,8 +1018,8 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 	}
 
 	rem = check->all_attrs_nr;
-	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
-		rem = fill(path, pathlen, basename_offset, stk, check->all_attrs, rem);
+	rem = fill(path, pathlen, basename_offset, info, check->all_attrs, rem);
+	fill(path, pathlen, basename_offset, stack, check->all_attrs, rem);
 }
 
 int git_check_attr(const char *path, struct attr_check *check)
@@ -1052,4 +1146,5 @@ void git_attr_set_direction(enum git_attr_direction new, struct index_state *ist
 void attr_start(void)
 {
 	pthread_mutex_init(&g_attr_hashmap.mutex, NULL);
+	pthread_mutex_init(&g_stack_hashmap.mutex, NULL);
 }
-- 
2.11.0.390.gc69c2f50cf-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH 26/27] attr: push the bare repo check into read_attr()
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (24 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 25/27] attr: store attribute stacks in hashmap Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-12 23:53 ` [PATCH 27/27] attr: reformat git_attr_set_direction() function Brandon Williams
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, pclouds, sbeller

Push the bare repository check into the 'read_attr()' function.  This
avoids needing to have extra logic which creates an empty stack frame
when inside a bare repo as a similar bit of logic already exists in the
'read_attr()' function.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/attr.c b/attr.c
index 78562592b..cbb07d25d 100644
--- a/attr.c
+++ b/attr.c
@@ -591,25 +591,29 @@ static struct attr_stack *read_attr_from_index(const char *path, int macro_ok)
 
 static struct attr_stack *read_attr(const char *path, int macro_ok)
 {
-	struct attr_stack *res;
+	struct attr_stack *res = NULL;
 
-	if (direction == GIT_ATTR_CHECKOUT) {
+	if (direction == GIT_ATTR_INDEX) {
 		res = read_attr_from_index(path, macro_ok);
-		if (!res)
-			res = read_attr_from_file(path, macro_ok);
-	}
-	else if (direction == GIT_ATTR_CHECKIN) {
-		res = read_attr_from_file(path, macro_ok);
-		if (!res)
-			/*
-			 * There is no checked out .gitattributes file there, but
-			 * we might have it in the index.  We allow operation in a
-			 * sparsely checked out work tree, so read from it.
-			 */
+	} else if (!is_bare_repository()) {
+		if (direction == GIT_ATTR_CHECKOUT) {
 			res = read_attr_from_index(path, macro_ok);
+			if (!res)
+				res = read_attr_from_file(path, macro_ok);
+		}
+		else if (direction == GIT_ATTR_CHECKIN) {
+			res = read_attr_from_file(path, macro_ok);
+			if (!res)
+				/*
+				 * There is no checked out .gitattributes file
+				 * there, but we might have it in the index.
+				 * We allow operation in a sparsely checked out
+				 * work tree, so read from it.
+				 */
+				res = read_attr_from_index(path, macro_ok);
+		}
 	}
-	else
-		res = read_attr_from_index(path, macro_ok);
+
 	if (!res)
 		res = xcalloc(1, sizeof(*res));
 	return res;
@@ -796,11 +800,7 @@ static const struct attr_stack *core_attr_stack(void)
 		}
 
 		/* root directory */
-		if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
-			e = read_attr(GITATTRIBUTES_FILE, 1);
-		} else {
-			e = xcalloc(1, sizeof(struct attr_stack));
-		}
+		e = read_attr(GITATTRIBUTES_FILE, 1);
 		key = "";
 		push_stack(&core, e, key, strlen(key));
 	}
-- 
2.11.0.390.gc69c2f50cf-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH 27/27] attr: reformat git_attr_set_direction() function
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (25 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 26/27] attr: push the bare repo check into read_attr() Brandon Williams
@ 2017-01-12 23:53 ` Brandon Williams
  2017-01-15 23:47 ` [PATCH 00/27] Revamp the attribute system; another round Junio C Hamano
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
  28 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, pclouds, sbeller

Move 'git_attr_set_direction()' up to be closer to the variables
that it modifies, and rename the parameter 'new' to 'new_direction'
so that it is more descriptive.

Update the comment about how 'direction' is used to read the state of
the world.  It should be noted that callers of
'git_attr_set_direction()' should ensure that other threads are not
making calls into the attribute system until after the call to
'git_attr_set_direction()' completes.  This function essentially acts as
a reset button for the attribute system and should be handled with care.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 49 ++++++++++++++++++++-----------------------------
 attr.h |  3 ++-
 2 files changed, 22 insertions(+), 30 deletions(-)

diff --git a/attr.c b/attr.c
index cbb07d25d..f5cc68b67 100644
--- a/attr.c
+++ b/attr.c
@@ -521,26 +521,30 @@ static struct attr_stack *read_attr_from_array(const char **list)
 }
 
 /*
- * NEEDSWORK: these two are tricky.  The callers assume there is a
- * single, system-wide global state "where we read attributes from?"
- * and when the state is flipped by calling git_attr_set_direction(),
- * attr_stack is discarded so that subsequent attr_check will lazily
- * read from the right place.  And they do not know or care who called
- * by them uses the attribute subsystem, hence have no knowledge of
- * existing git_attr_check instances or future ones that will be
- * created).
- *
- * Probably we need a thread_local that holds these two variables,
- * and a list of git_attr_check instances (which need to be maintained
- * by hooking into git_attr_check_alloc(), git_attr_check_initl(), and
- * git_attr_check_clear().  Then git_attr_set_direction() updates the
- * fields in that thread_local for these two variables, iterate over
- * all the active git_attr_check instances and discard the attr_stack
- * they hold.  Yuck, but it sounds doable.
+ * Callers into the attribute system assume there is a single, system-wide
+ * global state where attributes are read from and when the state is flipped by
+ * calling git_attr_set_direction(), the stack frames that have been
+ * constructed need to be discarded so that subsequent calls into the
+ * attribute system will lazily read from the right place.  Since changing
+ * direction causes a global paradigm shift, it should not ever be called while
+ * another thread could potentially be calling into the attribute system.
  */
 static enum git_attr_direction direction;
 static struct index_state *use_index;
 
+void git_attr_set_direction(enum git_attr_direction new_direction,
+			    struct index_state *istate)
+{
+	if (is_bare_repository() && new_direction != GIT_ATTR_INDEX)
+		die("BUG: non-INDEX attr direction in a bare repo");
+
+	if (new_direction != direction)
+		drop_attr_stack();
+
+	direction = new_direction;
+	use_index = istate;
+}
+
 static struct attr_stack *read_attr_from_file(const char *path, int macro_ok)
 {
 	FILE *fp = fopen(path, "r");
@@ -1130,19 +1134,6 @@ void attr_check_free(struct attr_check *check)
 	free(check);
 }
 
-void git_attr_set_direction(enum git_attr_direction new, struct index_state *istate)
-{
-	enum git_attr_direction old = direction;
-
-	if (is_bare_repository() && new != GIT_ATTR_INDEX)
-		die("BUG: non-INDEX attr direction in a bare repo");
-
-	direction = new;
-	if (new != old)
-		drop_attr_stack();
-	use_index = istate;
-}
-
 void attr_start(void)
 {
 	pthread_mutex_init(&g_attr_hashmap.mutex, NULL);
diff --git a/attr.h b/attr.h
index 9b4dc07d8..b8be37c91 100644
--- a/attr.h
+++ b/attr.h
@@ -73,7 +73,8 @@ enum git_attr_direction {
 	GIT_ATTR_CHECKOUT,
 	GIT_ATTR_INDEX
 };
-void git_attr_set_direction(enum git_attr_direction, struct index_state *);
+void git_attr_set_direction(enum git_attr_direction new_direction,
+			    struct index_state *istate);
 
 extern void attr_start(void);
 
-- 
2.11.0.390.gc69c2f50cf-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* Re: [PATCH 25/27] attr: store attribute stacks in hashmap
  2017-01-12 23:53 ` [PATCH 25/27] attr: store attribute stacks in hashmap Brandon Williams
@ 2017-01-13 21:20   ` Junio C Hamano
  2017-01-18 20:34     ` Brandon Williams
  2017-01-18 20:39   ` Stefan Beller
  1 sibling, 1 reply; 111+ messages in thread
From: Junio C Hamano @ 2017-01-13 21:20 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, pclouds, sbeller

Brandon Williams <bmwill@google.com> writes:

> The last big hurdle towards a thread-safe API for the attribute system
> is the reliance on a global attribute stack that is modified during each
> call into the attribute system.
>
> This patch removes this global stack and instead a stack is retrieved or
> constructed locally.  Since each of these stacks is only used as a
> read-only structure once constructed, they can be stored in a hashmap
> and shared between threads.

Very good.

The reason why the original code used a stack was because it wanted
to keep only the info read from relevant files in-core, discarding
ones from files no longer relevant (because the traversal switched
to another subdirectory of the same parent directory), to keep
memory consumption from growing unbounded.  It probably was a premature
"optimization" that we can do without, so keeping everything we have
read so far in a hashmap (which is my understanding of what is going
on in this patch) is probably OK.

I suspect that this hashmap may eventually need to become per
attr_check if we want to follow through the optimization envisioned
by patch 15/27.

Inside fill(), path_matches() is called once for every match_attr
in the entire attribute stack, but it is wasteful to check if the
path matches a.u.pat if none of the a.state[] entries talk
about attributes and macros that eventually get used by the
caller of check_attr().  By introducing a wrapping structure, 15/27
wanted to make sure that we have a place to store a "reduced"
attribute stack that is kept per attr_check that has only entries
from the files that talk about the attributes the particular
attr_check wants to learn about.
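
As a rough illustration of the pre-check described above (a sketch
only, with hypothetical 'sketch_*' names and simplified structs that
are not the ones in attr.c, and ignoring macro expansion entirely),
the relevance test that would guard the pattern match could look like:

```c
#include <assert.h>	/* used only by callers that want to test the helper */
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are not the structs in attr.c. */
struct sketch_state {
	size_t attr_nr;	/* number of the attribute this entry sets */
};

struct sketch_match {
	size_t num_attr;
	const struct sketch_state *state;
};

/*
 * Return 1 if any state entry of 'a' names an attribute the caller asked
 * for; 'wanted' is a flag array indexed by attribute number.  A caller
 * would run the (more expensive) pattern match only when this returns 1.
 */
static int sketch_match_is_relevant(const struct sketch_match *a,
				    const char *wanted, size_t wanted_nr)
{
	size_t i;

	for (i = 0; i < a->num_attr; i++) {
		size_t n = a->state[i].attr_nr;

		if (n < wanted_nr && wanted[n])
			return 1;
	}
	return 0;
}
```

A "reduced" per-check stack would go one step further and drop the
irrelevant entries up front instead of testing them on every query.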

I need to think about this a bit more, but I do not offhand think
that moving from a global stack to a global hashmap makes such a
future per-check enhancement any harder, i.e. the above is not
an objection to this step.

> One caveat with storing and sharing the stack frames like this is that
> the info stack needs to be treated separately from the rest of the
> attribute stack.  This is because each stack frame holds a pointer to
> the stack that comes before it and if it was placed on top of the rest
> of the attribute stack then this pointer would be different for each
> attribute stack and wouldn't be able to be shared between threads.  In
> order to allow for sharing the info stack frame it needs to be its own
> isolated frame and can simply be processed first to have the same effect
> of being at the top of the stack.

Good.
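
To make the point about the info frame concrete, here is a minimal
sketch (hypothetical 'sketch_*' names, one value per frame instead of
real attribute lists) of why consulting an isolated info frame first
is equivalent to having it on top, while leaving the 'prev' chain of
the shared frames untouched:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical frame: a single optional value stands in for the attrs. */
struct sketch_frame {
	const struct sketch_frame *prev;	/* frame underneath this one */
	const char *value;			/* NULL if this frame says nothing */
};

/*
 * Look at the isolated info frame first, then walk the shared stack from
 * the deepest directory down to the root.  The first frame that has a
 * value wins, exactly as if 'info' had been pushed on top of 'stack'.
 */
static const char *sketch_lookup(const struct sketch_frame *info,
				 const struct sketch_frame *stack)
{
	const struct sketch_frame *f;

	if (info && info->value)
		return info->value;
	for (f = stack; f; f = f->prev)
		if (f->value)
			return f->value;
	return NULL;
}
```

Because 'info' never points into a directory stack, two stacks that
share a common root frame can both be queried with the same info frame.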

Thanks.

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH 00/27] Revamp the attribute system; another round
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (26 preceding siblings ...)
  2017-01-12 23:53 ` [PATCH 27/27] attr: reformat git_attr_set_direction() function Brandon Williams
@ 2017-01-15 23:47 ` Junio C Hamano
  2017-01-16  8:10   ` Jeff King
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
  28 siblings, 1 reply; 111+ messages in thread
From: Junio C Hamano @ 2017-01-15 23:47 UTC (permalink / raw)
  To: Brandon Williams, Jeff King; +Cc: git, pclouds, sbeller

This one unfortunately clashes with jk/nofollow-attr-ignore where
Peff adds sanity to refuse following symbolic links when reading
.gitignore and .gitattributes; I'll eject jk/nofollow-attr-ignore
topic for now and see how well this topic fits together with the
remainder of the topics in flight.

Thanks.

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH 00/27] Revamp the attribute system; another round
  2017-01-15 23:47 ` [PATCH 00/27] Revamp the attribute system; another round Junio C Hamano
@ 2017-01-16  8:10   ` Jeff King
  0 siblings, 0 replies; 111+ messages in thread
From: Jeff King @ 2017-01-16  8:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brandon Williams, git, pclouds, sbeller

On Sun, Jan 15, 2017 at 03:47:16PM -0800, Junio C Hamano wrote:

> This one unfortunately clashes with jk/nofollow-attr-ignore where
> Peff adds sanity to refuse following symbolic links when reading
> .gitignore and .gitattributes; I'll eject jk/nofollow-attr-ignore
> topic for now and see how well this topic fits together with the
> remainder of the topics in flight.

Yeah, that's a good plan. I think my re-roll of the nofollow stuff will
be pretty major, and may not end up touching the attribute code directly
at all.

-Peff

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH 21/27] attr: use hashmap for attribute dictionary
  2017-01-12 23:53 ` [PATCH 21/27] attr: use hashmap for attribute dictionary Brandon Williams
@ 2017-01-18 20:20   ` Stefan Beller
  2017-01-18 20:23     ` Brandon Williams
  0 siblings, 1 reply; 111+ messages in thread
From: Stefan Beller @ 2017-01-18 20:20 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Junio C Hamano, Duy Nguyen

On Thu, Jan 12, 2017 at 3:53 PM, Brandon Williams <bmwill@google.com> wrote:

> +/* Initialize an 'attr_hashmap' object */
> +void attr_hashmap_init(struct attr_hashmap *map)

In case a reroll is needed, mark this static please.

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH 21/27] attr: use hashmap for attribute dictionary
  2017-01-18 20:20   ` Stefan Beller
@ 2017-01-18 20:23     ` Brandon Williams
  0 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-18 20:23 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Junio C Hamano, Duy Nguyen

On 01/18, Stefan Beller wrote:
> On Thu, Jan 12, 2017 at 3:53 PM, Brandon Williams <bmwill@google.com> wrote:
> 
> > +/* Initialize an 'attr_hashmap' object */
> > +void attr_hashmap_init(struct attr_hashmap *map)
> 
> In case a reroll is needed, mark this static please.

Will do.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH 25/27] attr: store attribute stacks in hashmap
  2017-01-13 21:20   ` Junio C Hamano
@ 2017-01-18 20:34     ` Brandon Williams
  2017-01-23 18:08       ` Brandon Williams
  0 siblings, 1 reply; 111+ messages in thread
From: Brandon Williams @ 2017-01-18 20:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, pclouds, sbeller

On 01/13, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > The last big hurdle towards a thread-safe API for the attribute system
> > is the reliance on a global attribute stack that is modified during each
> > call into the attribute system.
> >
> > This patch removes this global stack and instead a stack is retrieved or
> > constructed locally.  Since each of these stacks is only used as a
> > read-only structure once constructed, they can be stored in a hashmap
> > and shared between threads.
> 
> Very good.
> 
> The reason why the original code used a stack was because it wanted
> to keep only the info read from relevant files in-core, discarding
> ones from files no longer relevant (because the traversal switched
> to another subdirectory of the same parent directory), to keep
> memory consumption from growing unbounded.  It probably was a premature
> "optimization" that we can do without, so keeping everything we have
> read so far in a hashmap (which is my understanding of what is going
> on in this patch) is probably OK.
> 
> I suspect that this hashmap may eventually need to become per
> attr_check if we want to follow through the optimization envisioned
> by patch 15/27.
> 
> Inside fill(), path_matches() is called once for every match_attr
> in the entire attribute stack, but it is wasteful to check if the
> path matches a.u.pat if none of the a.state[] entries talk
> about attributes and macros that eventually get used by the
> caller of check_attr().  By introducing a wrapping structure, 15/27
> wanted to make sure that we have a place to store a "reduced"
> attribute stack that is kept per attr_check that has only entries
> from the files that talk about the attributes the particular
> attr_check wants to learn about.
> 
> I need to think about this a bit more, but I do not offhand think
> that moving from a global stack to a global hashmap makes such a
> future per-check enhancement any harder, i.e. the above is not
> an objection to this step.

If we want to continue through and do the optimization you originally
envisioned then I may need to rethink this patch.  One thing we did talk
about offline was doing another check prior to the path_matches() function
call which looks through the list of state structs to see if one of
those states would actually have an effect on the array being used to
collect attributes.  Though that may be an optimization which can be
done in addition to creating a reduced stack.

The one difficulty (which you pointed out in comment form) is if we have
a reduced attribute stack that is stored per attr_check then handling
the cleanup when the direction is changed may be messy.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH 25/27] attr: store attribute stacks in hashmap
  2017-01-12 23:53 ` [PATCH 25/27] attr: store attribute stacks in hashmap Brandon Williams
  2017-01-13 21:20   ` Junio C Hamano
@ 2017-01-18 20:39   ` Stefan Beller
  2017-01-18 20:45     ` Stefan Beller
  2017-01-18 20:50     ` Brandon Williams
  1 sibling, 2 replies; 111+ messages in thread
From: Stefan Beller @ 2017-01-18 20:39 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Junio C Hamano, Duy Nguyen

On Thu, Jan 12, 2017 at 3:53 PM, Brandon Williams <bmwill@google.com> wrote:
> -static void prepare_attr_stack(const char *path, int dirlen)
> +/*
> + * This funciton should only be called from 'get_attr_stack()', which already

"function"

> +               /* system-wide frame */
> +               if (git_attr_system()) {
> +                       e = read_attr_from_file(git_etc_gitattributes(), 1);

read_attr_from_file may return NULL, so we'd have to treat this similar
to below "root directory", i.e. xcalloc for an empty frame?

> +
> +               /* root directory */
> +               if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
> +                       e = read_attr(GITATTRIBUTES_FILE, 1);
> +               } else {
> +                       e = xcalloc(1, sizeof(struct attr_stack));
> +               }
> +               key = "";
> +               push_stack(&core, e, key, strlen(key));

If this is a bare repo, could we just omit this frame instead of pushing
an empty xcalloc'd frame? (Same for the stack frames of system wide
and home dir) ?

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH 25/27] attr: store attribute stacks in hashmap
  2017-01-18 20:39   ` Stefan Beller
@ 2017-01-18 20:45     ` Stefan Beller
  2017-01-18 20:50     ` Brandon Williams
  1 sibling, 0 replies; 111+ messages in thread
From: Stefan Beller @ 2017-01-18 20:45 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Junio C Hamano, Duy Nguyen

On Wed, Jan 18, 2017 at 12:39 PM, Stefan Beller <sbeller@google.com> wrote:
> On Thu, Jan 12, 2017 at 3:53 PM, Brandon Williams <bmwill@google.com> wrote:
>> -static void prepare_attr_stack(const char *path, int dirlen)
>> +/*
>> + * This funciton should only be called from 'get_attr_stack()', which already
>
> "function"
>
>> +               /* system-wide frame */
>> +               if (git_attr_system()) {
>> +                       e = read_attr_from_file(git_etc_gitattributes(), 1);
>
> read_attr_from_file may return NULL, so we'd have to treat this similar
> to below "root directory", i.e. xcalloc for an empty frame?
>
>> +
>> +               /* root directory */
>> +               if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
>> +                       e = read_attr(GITATTRIBUTES_FILE, 1);
>> +               } else {
>> +                       e = xcalloc(1, sizeof(struct attr_stack));
>> +               }
>> +               key = "";
>> +               push_stack(&core, e, key, strlen(key));
>
> If this is a bare repo, could we just omit this frame instead of pushing
> an empty xcalloc'd frame? (Same for the stack frames of system wide
> and home dir) ?

The next patch moves this issue into the read_attr function.

So in the end we'd either need to fix read_attr_from_file to return
    res = xcalloc(1, sizeof(*res));
if (!fp), or we need to handle NULLs appropriately in 'core_attr_stack' ?

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH 25/27] attr: store attribute stacks in hashmap
  2017-01-18 20:39   ` Stefan Beller
  2017-01-18 20:45     ` Stefan Beller
@ 2017-01-18 20:50     ` Brandon Williams
  1 sibling, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-18 20:50 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Junio C Hamano, Duy Nguyen

On 01/18, Stefan Beller wrote:
> On Thu, Jan 12, 2017 at 3:53 PM, Brandon Williams <bmwill@google.com> wrote:
> > -static void prepare_attr_stack(const char *path, int dirlen)
> > +/*
> > + * This funciton should only be called from 'get_attr_stack()', which already
> 
> "function"
> 
> > +               /* system-wide frame */
> > +               if (git_attr_system()) {
> > +                       e = read_attr_from_file(git_etc_gitattributes(), 1);
> 
> read_attr_from_file may return NULL, so we'd have to treat this similar
> to below "root directory", i.e. xcalloc for an empty frame?

The push_stack function doesn't do anything if 'e' is NULL, so we should
be fine here.
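
For illustration, a push helper with that NULL-tolerant behavior could
be sketched as below.  The names and the struct are hypothetical
stand-ins, not the actual push_stack() from attr.c:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct attr_stack. */
struct sketch_frame {
	struct sketch_frame *prev;
	const char *origin;
	size_t originlen;
};

/*
 * Push 'elem' on top of '*stack_p'.  A NULL 'elem' (e.g. a file that
 * could not be read) is silently ignored, so callers need not check.
 */
static void sketch_push(struct sketch_frame **stack_p,
			struct sketch_frame *elem,
			const char *origin, size_t originlen)
{
	if (!elem)
		return;
	elem->origin = origin;
	elem->originlen = originlen;
	elem->prev = *stack_p;
	*stack_p = elem;
}
```

With this shape, a missing system-wide or home file simply contributes
no frame, and the frame below it stays on top.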

> 
> > +
> > +               /* root directory */
> > +               if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
> > +                       e = read_attr(GITATTRIBUTES_FILE, 1);
> > +               } else {
> > +                       e = xcalloc(1, sizeof(struct attr_stack));
> > +               }
> > +               key = "";
> > +               push_stack(&core, e, key, strlen(key));
> 
> If this is a bare repo, could we just omit this frame instead of pushing
> an empty xcalloc'd frame? (Same for the stack frames of system wide
> and home dir) ?

The reasoning behind creating the object even if it's a bare repo
is so that later we can easily see that a frame has already been read
and included, and don't need to attempt to reread it from disk.
It also made things simpler when storing the object in a hashmap
since storing a NULL ptr was awkward.

Though looking at Junio's discussion we may want to rethink how the
stacks are handled.  I still need to think about it some more.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH 25/27] attr: store attribute stacks in hashmap
  2017-01-18 20:34     ` Brandon Williams
@ 2017-01-23 18:08       ` Brandon Williams
  0 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 18:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, pclouds, sbeller

On 01/18, Brandon Williams wrote:
> On 01/13, Junio C Hamano wrote:
> > Brandon Williams <bmwill@google.com> writes:
> > 
> > > The last big hurdle towards a thread-safe API for the attribute system
> > > is the reliance on a global attribute stack that is modified during each
> > > call into the attribute system.
> > >
> > > This patch removes this global stack and instead a stack is retrieved or
> > > constructed locally.  Since each of these stacks is only used as a
> > > read-only structure once constructed, they can be stored in a hashmap
> > > and shared between threads.
> > 
> > Very good.
> > 
> > > The reason why the original code used a stack was because it wanted
> > > to keep only the info read from relevant files in-core, discarding
> > > ones from files no longer relevant (because the traversal switched
> > > to another subdirectory of the same parent directory), to keep
> > > memory consumption from growing unbounded.  It probably was a premature
> > "optimization" that we can do without, so keeping everything we have
> > read so far in a hashmap (which is my understanding of what is going
> > on in this patch) is probably OK.
> > 
> > I suspect that this hashmap may eventually need to become per
> > attr_check if we want to follow through the optimization envisioned
> > by patch 15/27.
> > 
> > > Inside fill(), path_matches() is called once for every match_attr
> > > in the entire attribute stack, but it is wasteful to check if the
> > > path matches a.u.pat if none of the a.state[] entries talk
> > > about attributes and macros that eventually get used by the
> > > caller of check_attr().  By introducing a wrapping structure, 15/27
> > wanted to make sure that we have a place to store a "reduced"
> > attribute stack that is kept per attr_check that has only entries
> > from the files that talk about the attributes the particular
> > attr_check wants to learn about.
> > 
> > > I need to think about this a bit more, but I do not offhand think
> > > that moving from a global stack to a global hashmap makes such a
> > > future per-check enhancement any harder, i.e. the above is not
> > > an objection to this step.
> 
> If we want to continue through and do the optimization you originally
> envisioned then I may need to rethink this patch.  One thing we did talk
> about offline was doing another check prior to the path_match() function
> call which looks through the list of state structs to see if one of
> those states would actually have an effect on the array being used to
> collect attributes.  Though that may be an optimization which can be
> done in addition to creating a reduced stack.
> 
> The one difficulty (which you pointed out in comment form) is if we have
> a reduced attribute stack that is stored per attr_check then handling
> the cleanup when the direction is changed may be messy.

After thinking about this some more I'm going to redo this patch in the
series and instead of storing all of the frames in a shared hashmap,
we'll have an attribute stack stored per attr_check instance like you
originally envisioned.  I think that having a hashmap of all the stack
frames may make it more difficult to do optimizations in the future.  At
least this way (simply pushing the stack into the attr_check) makes it
more straightforward to do optimizations and doesn't have the potential
for memory to grow unbounded.

I'll try to get out a v2 later today.
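
For illustration only, the pre-check before path_matches() discussed
above might look roughly like the sketch below.  The struct and function
names are hypothetical simplifications, not the actual attr.c code, and
the sketch ignores macro expansion, which a real check would have to
account for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the structures in attr.c. */
struct attr_state_sketch {
	int attr_nr;	/* number of the attribute this state sets */
};

struct match_attr_sketch {
	size_t num_attr;
	const struct attr_state_sketch *state;
};

/*
 * Return 1 if at least one state in the rule names an attribute the
 * caller asked for; wanted[] is indexed by attribute number.  Only
 * when this returns 1 is the (more expensive) pattern match needed.
 */
static int rule_is_relevant(const struct match_attr_sketch *rule,
			    const unsigned char *wanted, size_t nr_wanted)
{
	size_t i;

	assert(rule && wanted);
	for (i = 0; i < rule->num_attr; i++) {
		int nr = rule->state[i].attr_nr;

		if (0 <= nr && (size_t)nr < nr_wanted && wanted[nr])
			return 1;
	}
	return 0;
}
```

A caller would consult rule_is_relevant() first and skip the pattern
match entirely for rules that cannot affect the result.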

-- 
Brandon Williams

* [PATCH v2 00/27] Revamp the attribute system; another round
  2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
                   ` (27 preceding siblings ...)
  2017-01-15 23:47 ` [PATCH 00/27] Revamp the attribute system; another round Junio C Hamano
@ 2017-01-23 20:34 ` Brandon Williams
  2017-01-23 20:34   ` [PATCH v2 01/27] commit.c: use strchrnul() to scan for one line Brandon Williams
                     ` (27 more replies)
  28 siblings, 28 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:34 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

Changes in v2:
* surround the mutex initializer calls with #ifdef
* mark file-local symbol static
* rework the handling of attribute stacks.  Instead of storing each stack
  frame in a hashmap, there is a stack per attr_check instance.  This will
  allow for easier optimizing of the stack in future patches and eliminates
  the potential for memory to grow unbounded.  This is also more in line
  with the original vision of the attribute system refactor.

Brandon Williams (8):
  attr: pass struct attr_check to collect_some_attrs
  attr: use hashmap for attribute dictionary
  attr: eliminate global check_all_attr array
  attr: remove maybe-real, maybe-macro from git_attr
  attr: tighten const correctness with git_attr and match_attr
  attr: store attribute stack in attr_check structure
  attr: push the bare repo check into read_attr()
  attr: reformat git_attr_set_direction() function

Junio C Hamano (17):
  commit.c: use strchrnul() to scan for one line
  attr.c: use strchrnul() to scan for one line
  attr.c: update a stale comment on "struct match_attr"
  attr.c: explain the lack of attr-name syntax check in parse_attr()
  attr.c: complete a sentence in a comment
  attr.c: mark where #if DEBUG ends more clearly
  attr.c: simplify macroexpand_one()
  attr.c: tighten constness around "git_attr" structure
  attr.c: plug small leak in parse_attr_line()
  attr.c: add push_stack() helper
  attr.c: outline the future plans by heavily commenting
  attr: rename function and struct related to checking attributes
  attr: (re)introduce git_check_attr() and struct attr_check
  attr: convert git_all_attrs() to use "struct attr_check"
  attr: convert git_check_attrs() callers to use the new API
  attr: retire git_check_attrs() API
  attr: change validity check for attribute names to use positive logic

Nguyễn Thái Ngọc Duy (1):
  attr: support quoting pathname patterns in C style

Stefan Beller (1):
  Documentation: fix a typo

 Documentation/gitattributes.txt               |  10 +-
 Documentation/technical/api-gitattributes.txt |  86 ++-
 archive.c                                     |  24 +-
 attr.c                                        | 854 ++++++++++++++++++--------
 attr.h                                        |  53 +-
 builtin/check-attr.c                          |  66 +-
 builtin/pack-objects.c                        |  19 +-
 commit.c                                      |   3 +-
 common-main.c                                 |   3 +
 convert.c                                     |  25 +-
 ll-merge.c                                    |  33 +-
 t/t0003-attributes.sh                         |  26 +
 userdiff.c                                    |  19 +-
 ws.c                                          |  19 +-
 14 files changed, 800 insertions(+), 440 deletions(-)

-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v2 01/27] commit.c: use strchrnul() to scan for one line
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
@ 2017-01-23 20:34   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 02/27] attr.c: " Brandon Williams
                     ` (26 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:34 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 commit.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/commit.c b/commit.c
index 2cf85158b..0c4ee3de4 100644
--- a/commit.c
+++ b/commit.c
@@ -415,8 +415,7 @@ int find_commit_subject(const char *commit_buffer, const char **subject)
 		p++;
 	if (*p) {
 		p = skip_blank_lines(p + 2);
-		for (eol = p; *eol && *eol != '\n'; eol++)
-			; /* do nothing */
+		eol = strchrnul(p, '\n');
 	} else
 		eol = p;
 
-- 
2.11.0.483.g087da7b7c-goog
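
As a rough illustration of why the replacement is behavior-preserving,
the sketch below compares the hand-rolled scan with a function that has
strchrnul() semantics.  strchrnul() itself is a GNU extension, so the
sketch uses a portable stand-in; the names eol_by_loop() and
sketch_strchrnul() are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hand-rolled scan, shaped like the loop the patch removes. */
static const char *eol_by_loop(const char *p)
{
	const char *eol;

	for (eol = p; *eol && *eol != '\n'; eol++)
		; /* do nothing */
	return eol;
}

/*
 * Portable stand-in with strchrnul() semantics: return a pointer to
 * the first occurrence of c in s, or to the terminating NUL if c
 * does not occur.
 */
static const char *sketch_strchrnul(const char *s, int c)
{
	const char *found;

	assert(s);
	found = strchr(s, c);
	return found ? found : s + strlen(s);
}
```

Both functions stop at the first newline when there is one and at the
terminating NUL otherwise, so callers that go on to test *eol see the
same value either way.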


* [PATCH v2 02/27] attr.c: use strchrnul() to scan for one line
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
  2017-01-23 20:34   ` [PATCH v2 01/27] commit.c: use strchrnul() to scan for one line Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 03/27] attr.c: update a stale comment on "struct match_attr" Brandon Williams
                     ` (25 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/attr.c b/attr.c
index 1fcf042b8..04d24334e 100644
--- a/attr.c
+++ b/attr.c
@@ -402,8 +402,8 @@ static struct attr_stack *read_attr_from_index(const char *path, int macro_ok)
 	for (sp = buf; *sp; ) {
 		char *ep;
 		int more;
-		for (ep = sp; *ep && *ep != '\n'; ep++)
-			;
+
+		ep = strchrnul(sp, '\n');
 		more = (*ep == '\n');
 		*ep = '\0';
 		handle_attr_line(res, sp, path, ++lineno, macro_ok);
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v2 03/27] attr.c: update a stale comment on "struct match_attr"
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
  2017-01-23 20:34   ` [PATCH v2 01/27] commit.c: use strchrnul() to scan for one line Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 02/27] attr.c: " Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr() Brandon Williams
                     ` (24 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

When 82dce998 (attr: more matching optimizations from .gitignore,
2012-10-15) changed a pointer to a string "*pattern" into an
embedded "struct pattern" in struct match_attr, it forgot to update
the comment that describes the structure.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index 04d24334e..007f1a299 100644
--- a/attr.c
+++ b/attr.c
@@ -131,9 +131,8 @@ struct pattern {
  * If is_macro is true, then u.attr is a pointer to the git_attr being
  * defined.
  *
- * If is_macro is false, then u.pattern points at the filename pattern
- * to which the rule applies.  (The memory pointed to is part of the
- * memory block allocated for the match_attr instance.)
+ * If is_macro is false, then u.pat is the filename pattern to which the
+ * rule applies.
  *
  * In either case, num_attr is the number of attributes affected by
  * this rule, and state is an array listing them.  The attributes are
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v2 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr()
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (2 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 03/27] attr.c: update a stale comment on "struct match_attr" Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 05/27] attr.c: complete a sentence in a comment Brandon Williams
                     ` (23 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/attr.c b/attr.c
index 007f1a299..6b55a57ef 100644
--- a/attr.c
+++ b/attr.c
@@ -183,6 +183,12 @@ static const char *parse_attr(const char *src, int lineno, const char *cp,
 			return NULL;
 		}
 	} else {
+		/*
+		 * As this function is always called twice, once with
+		 * e == NULL in the first pass and then e != NULL in
+		 * the second pass, no need for invalid_attr_name()
+		 * check here.
+		 */
 		if (*cp == '-' || *cp == '!') {
 			e->setto = (*cp == '-') ? ATTR__FALSE : ATTR__UNSET;
 			cp++;
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v2 05/27] attr.c: complete a sentence in a comment
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (3 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr() Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 06/27] attr.c: mark where #if DEBUG ends more clearly Brandon Williams
                     ` (22 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 6b55a57ef..9bdf87a6f 100644
--- a/attr.c
+++ b/attr.c
@@ -300,7 +300,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
  * directory (again, reading the file from top to bottom) down to the
  * current directory, and then scan the list backwards to find the first match.
  * This is exactly the same as what is_excluded() does in dir.c to deal with
- * .gitignore
+ * .gitignore file and info/excludes file as a fallback.
  */
 
 static struct attr_stack {
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v2 06/27] attr.c: mark where #if DEBUG ends more clearly
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (4 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 05/27] attr.c: complete a sentence in a comment Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 07/27] attr.c: simplify macroexpand_one() Brandon Williams
                     ` (21 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 9bdf87a6f..17297fffe 100644
--- a/attr.c
+++ b/attr.c
@@ -469,7 +469,7 @@ static void debug_set(const char *what, const char *match, struct git_attr *attr
 #define debug_push(a) do { ; } while (0)
 #define debug_pop(a) do { ; } while (0)
 #define debug_set(a,b,c,d) do { ; } while (0)
-#endif
+#endif /* DEBUG_ATTR */
 
 static void drop_attr_stack(void)
 {
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v2 07/27] attr.c: simplify macroexpand_one()
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (5 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 06/27] attr.c: mark where #if DEBUG ends more clearly Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 08/27] attr.c: tighten constness around "git_attr" structure Brandon Williams
                     ` (20 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

The double-loop wants to do an early return immediately when one
matching macro is found.  Eliminate the extra variable 'a' used for
that purpose and rewrite the "assign the found item to 'a' to make
it non-NULL and force the loop(s) to terminate" with a direct return
from there.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/attr.c b/attr.c
index 17297fffe..e42f931b3 100644
--- a/attr.c
+++ b/attr.c
@@ -705,24 +705,21 @@ static int fill(const char *path, int pathlen, int basename_offset,
 static int macroexpand_one(int nr, int rem)
 {
 	struct attr_stack *stk;
-	struct match_attr *a = NULL;
 	int i;
 
 	if (check_all_attr[nr].value != ATTR__TRUE ||
 	    !check_all_attr[nr].attr->maybe_macro)
 		return rem;
 
-	for (stk = attr_stack; !a && stk; stk = stk->prev)
-		for (i = stk->num_matches - 1; !a && 0 <= i; i--) {
+	for (stk = attr_stack; stk; stk = stk->prev) {
+		for (i = stk->num_matches - 1; 0 <= i; i--) {
 			struct match_attr *ma = stk->attrs[i];
 			if (!ma->is_macro)
 				continue;
 			if (ma->u.attr->attr_nr == nr)
-				a = ma;
+				return fill_one("expand", ma, rem);
 		}
-
-	if (a)
-		rem = fill_one("expand", a, rem);
+	}
 
 	return rem;
 }
-- 
2.11.0.483.g087da7b7c-goog
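
The pattern described in the log message, returning directly from the
inner loop instead of setting a flag variable that both loop conditions
have to test, can be sketched on a self-contained example.  The
frame_sketch structure and find_id() are hypothetical and only mimic the
shape of the attribute stack walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stack frame: a list of ids plus a link to the frame below. */
struct frame_sketch {
	const struct frame_sketch *prev;
	size_t num;
	const int *ids;
};

/*
 * Walk the frames from the top of the stack, scanning each frame from
 * its last entry to its first, and return the first match found.
 */
static const int *find_id(const struct frame_sketch *top, int wanted)
{
	const struct frame_sketch *f;
	size_t i;

	for (f = top; f; f = f->prev) {
		assert(f->num == 0 || f->ids);
		for (i = f->num; i-- > 0; ) {
			if (f->ids[i] == wanted)
				return &f->ids[i];	/* leaves both loops at once */
		}
	}
	return NULL;	/* nothing matched in any frame */
}
```

Because the return exits both loops, neither loop condition needs an
extra "found" test.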


* [PATCH v2 08/27] attr.c: tighten constness around "git_attr" structure
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (6 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 07/27] attr.c: simplify macroexpand_one() Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 09/27] attr.c: plug small leak in parse_attr_line() Brandon Williams
                     ` (19 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

It holds an interned string, and git_attr_name() is a way to peek
into it.  Make sure the involved pointer types are pointer-to-const.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 2 +-
 attr.h | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index e42f931b3..f7cf7ae30 100644
--- a/attr.c
+++ b/attr.c
@@ -43,7 +43,7 @@ static int cannot_trust_maybe_real;
 static struct git_attr_check *check_all_attr;
 static struct git_attr *(git_attr_hash[HASHSIZE]);
 
-char *git_attr_name(struct git_attr *attr)
+const char *git_attr_name(const struct git_attr *attr)
 {
 	return attr->name;
 }
diff --git a/attr.h b/attr.h
index 8b08d33af..00d7a662c 100644
--- a/attr.h
+++ b/attr.h
@@ -25,7 +25,7 @@ extern const char git_attr__false[];
  * Unset one is returned as NULL.
  */
 struct git_attr_check {
-	struct git_attr *attr;
+	const struct git_attr *attr;
 	const char *value;
 };
 
@@ -34,7 +34,7 @@ struct git_attr_check {
  * return value is a pointer to a null-delimited string that is part
  * of the internal data structure; it should not be modified or freed.
  */
-char *git_attr_name(struct git_attr *);
+extern const char *git_attr_name(const struct git_attr *);
 
 int git_check_attr(const char *path, int, struct git_attr_check *);
 
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v2 09/27] attr.c: plug small leak in parse_attr_line()
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (7 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 08/27] attr.c: tighten constness around "git_attr" structure Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 10/27] attr: support quoting pathname patterns in C style Brandon Williams
                     ` (18 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

If any error is noticed after the match_attr structure is allocated,
we shouldn't just return NULL from this function.

Add a fail_return label that frees the allocated structure and
returns NULL, and consistently jump there when we want to return
NULL after cleaning up.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/attr.c b/attr.c
index f7cf7ae30..d180c7833 100644
--- a/attr.c
+++ b/attr.c
@@ -223,7 +223,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		if (!macro_ok) {
 			fprintf(stderr, "%s not allowed: %s:%d\n",
 				name, src, lineno);
-			return NULL;
+			goto fail_return;
 		}
 		is_macro = 1;
 		name += strlen(ATTRIBUTE_MACRO_PREFIX);
@@ -233,7 +233,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 			fprintf(stderr,
 				"%.*s is not a valid attribute name: %s:%d\n",
 				namelen, name, src, lineno);
-			return NULL;
+			goto fail_return;
 		}
 	}
 	else
@@ -246,7 +246,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	for (cp = states, num_attr = 0; *cp; num_attr++) {
 		cp = parse_attr(src, lineno, cp, NULL);
 		if (!cp)
-			return NULL;
+			goto fail_return;
 	}
 
 	res = xcalloc(1,
@@ -267,7 +267,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		if (res->u.pat.flags & EXC_FLAG_NEGATIVE) {
 			warning(_("Negative patterns are ignored in git attributes\n"
 				  "Use '\\!' for literal leading exclamation."));
-			return NULL;
+			goto fail_return;
 		}
 	}
 	res->is_macro = is_macro;
@@ -283,6 +283,10 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	}
 
 	return res;
+
+fail_return:
+	free(res);
+	return NULL;
 }
 
 /*
-- 
2.11.0.483.g087da7b7c-goog
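
A minimal, self-contained sketch of the single-exit cleanup idiom used
here follows.  The rule_sketch structure and parse_rule() are invented
for the example; the point is that the result pointer starts out NULL,
so jumping to the label is safe both before and after the allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed rule, standing in for struct match_attr. */
struct rule_sketch {
	size_t namelen;
	char name[64];
};

/*
 * Parse "name" into a freshly allocated rule; return NULL on error.
 * Every error path goes through fail_return, so whatever has been
 * allocated so far is released in exactly one place.
 */
static struct rule_sketch *parse_rule(const char *name)
{
	struct rule_sketch *res = NULL;	/* makes free(res) safe early on */
	size_t len;

	if (!name || !*name)
		goto fail_return;	/* nothing allocated yet */

	len = strlen(name);
	res = calloc(1, sizeof(*res));
	if (!res)
		goto fail_return;

	if (len >= sizeof(res->name))
		goto fail_return;	/* allocated: must not leak */

	memcpy(res->name, name, len + 1);
	res->namelen = len;
	return res;

fail_return:
	free(res);	/* free(NULL) is a no-op */
	return NULL;
}
```

The caller owns a non-NULL result and releases it with free().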


* [PATCH v2 10/27] attr: support quoting pathname patterns in C style
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (8 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 09/27] attr.c: plug small leak in parse_attr_line() Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 11/27] attr.c: add push_stack() helper Brandon Williams
                     ` (17 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, sbeller, gitster,
	Brandon Williams

From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

The full pattern must be quoted. So 'pat"t"ern attr' will give exactly
'pat"t"ern', not 'pattern'. Also clarify that leading whitespace is
not part of the pattern and document comment syntax.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/gitattributes.txt |  8 +++++---
 attr.c                          | 15 +++++++++++++--
 t/t0003-attributes.sh           | 26 ++++++++++++++++++++++++++
 3 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index e0b66c122..3173dee7e 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -21,9 +21,11 @@ Each line in `gitattributes` file is of form:
 	pattern	attr1 attr2 ...
 
 That is, a pattern followed by an attributes list,
-separated by whitespaces.  When the pattern matches the
-path in question, the attributes listed on the line are given to
-the path.
+separated by whitespaces. Leading and trailing whitespaces are
+ignored. Lines that begin with '#' are ignored. Patterns
+that begin with a double quote are quoted in C style.
+When the pattern matches the path in question, the attributes
+listed on the line are given to the path.
 
 Each attribute can be in one of these states for a given path:
 
diff --git a/attr.c b/attr.c
index d180c7833..e1c630f79 100644
--- a/attr.c
+++ b/attr.c
@@ -13,6 +13,7 @@
 #include "attr.h"
 #include "dir.h"
 #include "utf8.h"
+#include "quote.h"
 
 const char git_attr__true[] = "(builtin)true";
 const char git_attr__false[] = "\0(builtin)false";
@@ -212,12 +213,21 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	const char *cp, *name, *states;
 	struct match_attr *res = NULL;
 	int is_macro;
+	struct strbuf pattern = STRBUF_INIT;
 
 	cp = line + strspn(line, blank);
 	if (!*cp || *cp == '#')
 		return NULL;
 	name = cp;
-	namelen = strcspn(name, blank);
+
+	if (*cp == '"' && !unquote_c_style(&pattern, name, &states)) {
+		name = pattern.buf;
+		namelen = pattern.len;
+	} else {
+		namelen = strcspn(name, blank);
+		states = name + namelen;
+	}
+
 	if (strlen(ATTRIBUTE_MACRO_PREFIX) < namelen &&
 	    starts_with(name, ATTRIBUTE_MACRO_PREFIX)) {
 		if (!macro_ok) {
@@ -239,7 +249,6 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	else
 		is_macro = 0;
 
-	states = name + namelen;
 	states += strspn(states, blank);
 
 	/* First pass to count the attr_states */
@@ -282,9 +291,11 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 			cannot_trust_maybe_real = 1;
 	}
 
+	strbuf_release(&pattern);
 	return res;
 
 fail_return:
+	strbuf_release(&pattern);
 	free(res);
 	return NULL;
 }
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index f0fbb4255..f19ae4f8c 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -13,10 +13,31 @@ attr_check () {
 	test_line_count = 0 err
 }
 
+attr_check_quote () {
+
+	path="$1"
+	quoted_path="$2"
+	expect="$3"
+
+	git check-attr test -- "$path" >actual &&
+	echo "\"$quoted_path\": test: $expect" >expect &&
+	test_cmp expect actual
+
+}
+
+test_expect_success 'open-quoted pathname' '
+	echo "\"a test=a" >.gitattributes &&
+	test_must_fail attr_check a a
+'
+
+
 test_expect_success 'setup' '
 	mkdir -p a/b/d a/c b &&
 	(
 		echo "[attr]notest !test"
+		echo "\" d \"	test=d"
+		echo " e	test=e"
+		echo " e\"	test=e"
 		echo "f	test=f"
 		echo "a/i test=a/i"
 		echo "onoff test -test"
@@ -69,6 +90,11 @@ test_expect_success 'command line checks' '
 '
 
 test_expect_success 'attribute test' '
+
+	attr_check " d " d &&
+	attr_check e e &&
+	attr_check_quote e\" e\\\" e &&
+
 	attr_check f f &&
 	attr_check a/f f &&
 	attr_check a/c/f f &&
-- 
2.11.0.483.g087da7b7c-goog
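
The parsing rule can be sketched as follows.  This is a deliberately
simplified stand-in, not the code in the patch: read_pattern() is a
hypothetical name, and it understands only the \" and \\ escapes,
whereas unquote_c_style() handles more.  It does mirror the fallback
behavior: a pattern that opens a quote but never closes it is treated
as an ordinary whitespace-delimited token.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the pattern at the start of cp into out and return a pointer to
 * the text that follows it, or NULL if out is too small.  A pattern
 * that begins with a double quote and is properly closed is unquoted;
 * anything else runs up to the first blank character.
 */
static const char *read_pattern(const char *cp, char *out, size_t outsz)
{
	size_t n = 0;
	const char *p;

	assert(cp && out && outsz);
	if (*cp == '"') {
		for (p = cp + 1; *p && *p != '"'; p++) {
			if (*p == '\\' && (p[1] == '"' || p[1] == '\\'))
				p++;	/* keep only the escaped character */
			if (n + 1 >= outsz)
				return NULL;
			out[n++] = *p;
		}
		if (*p == '"') {	/* properly closed quote */
			out[n] = '\0';
			return p + 1;
		}
		n = 0;	/* unterminated quote: fall back to the plain rule */
	}
	for (p = cp; *p && !strchr(" \t\r\n", *p); p++) {
		if (n + 1 >= outsz)
			return NULL;
		out[n++] = *p;
	}
	out[n] = '\0';
	return p;
}
```

The returned pointer plays the role of "states" in parse_attr_line():
it marks where the attribute list begins once leading blanks are
skipped.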


* [PATCH v2 11/27] attr.c: add push_stack() helper
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (9 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 10/27] attr: support quoting pathname patterns in C style Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 12/27] Documentation: fix a typo Brandon Williams
                     ` (16 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

There are too many repetitious "I have this new attr_stack element;
push it at the top of the stack" sequences.  The new helper function
push_stack() gives us a way to express what is going on at these
places, and as a side effect, halves the number of times we mention
the attr_stack global variable.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 71 +++++++++++++++++++++++++++++++-----------------------------------
 1 file changed, 33 insertions(+), 38 deletions(-)

diff --git a/attr.c b/attr.c
index e1c630f79..8026d68bd 100644
--- a/attr.c
+++ b/attr.c
@@ -510,6 +510,18 @@ static int git_attr_system(void)
 
 static GIT_PATH_FUNC(git_path_info_attributes, INFOATTRIBUTES_FILE)
 
+static void push_stack(struct attr_stack **attr_stack_p,
+		       struct attr_stack *elem, char *origin, size_t originlen)
+{
+	if (elem) {
+		elem->origin = origin;
+		if (origin)
+			elem->originlen = originlen;
+		elem->prev = *attr_stack_p;
+		*attr_stack_p = elem;
+	}
+}
+
 static void bootstrap_attr_stack(void)
 {
 	struct attr_stack *elem;
@@ -517,37 +529,23 @@ static void bootstrap_attr_stack(void)
 	if (attr_stack)
 		return;
 
-	elem = read_attr_from_array(builtin_attr);
-	elem->origin = NULL;
-	elem->prev = attr_stack;
-	attr_stack = elem;
-
-	if (git_attr_system()) {
-		elem = read_attr_from_file(git_etc_gitattributes(), 1);
-		if (elem) {
-			elem->origin = NULL;
-			elem->prev = attr_stack;
-			attr_stack = elem;
-		}
-	}
+	push_stack(&attr_stack, read_attr_from_array(builtin_attr), NULL, 0);
+
+	if (git_attr_system())
+		push_stack(&attr_stack,
+			   read_attr_from_file(git_etc_gitattributes(), 1),
+			   NULL, 0);
 
 	if (!git_attributes_file)
 		git_attributes_file = xdg_config_home("attributes");
-	if (git_attributes_file) {
-		elem = read_attr_from_file(git_attributes_file, 1);
-		if (elem) {
-			elem->origin = NULL;
-			elem->prev = attr_stack;
-			attr_stack = elem;
-		}
-	}
+	if (git_attributes_file)
+		push_stack(&attr_stack,
+			   read_attr_from_file(git_attributes_file, 1),
+			   NULL, 0);
 
 	if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
 		elem = read_attr(GITATTRIBUTES_FILE, 1);
-		elem->origin = xstrdup("");
-		elem->originlen = 0;
-		elem->prev = attr_stack;
-		attr_stack = elem;
+		push_stack(&attr_stack, elem, xstrdup(""), 0);
 		debug_push(elem);
 	}
 
@@ -558,15 +556,12 @@ static void bootstrap_attr_stack(void)
 
 	if (!elem)
 		elem = xcalloc(1, sizeof(*elem));
-	elem->origin = NULL;
-	elem->prev = attr_stack;
-	attr_stack = elem;
+	push_stack(&attr_stack, elem, NULL, 0);
 }
 
 static void prepare_attr_stack(const char *path, int dirlen)
 {
 	struct attr_stack *elem, *info;
-	int len;
 	const char *cp;
 
 	/*
@@ -626,20 +621,21 @@ static void prepare_attr_stack(const char *path, int dirlen)
 
 		assert(attr_stack->origin);
 		while (1) {
-			len = strlen(attr_stack->origin);
+			size_t len = strlen(attr_stack->origin);
+			char *origin;
+
 			if (dirlen <= len)
 				break;
 			cp = memchr(path + len + 1, '/', dirlen - len - 1);
 			if (!cp)
 				cp = path + dirlen;
-			strbuf_add(&pathbuf, path, cp - path);
-			strbuf_addch(&pathbuf, '/');
-			strbuf_addstr(&pathbuf, GITATTRIBUTES_FILE);
+			strbuf_addf(&pathbuf,
+				    "%.*s/%s", (int)(cp - path), path,
+				    GITATTRIBUTES_FILE);
 			elem = read_attr(pathbuf.buf, 0);
 			strbuf_setlen(&pathbuf, cp - path);
-			elem->origin = strbuf_detach(&pathbuf, &elem->originlen);
-			elem->prev = attr_stack;
-			attr_stack = elem;
+			origin = strbuf_detach(&pathbuf, &len);
+			push_stack(&attr_stack, elem, origin, len);
 			debug_push(elem);
 		}
 
@@ -649,8 +645,7 @@ static void prepare_attr_stack(const char *path, int dirlen)
 	/*
 	 * Finally push the "info" one at the top of the stack.
 	 */
-	info->prev = attr_stack;
-	attr_stack = info;
+	push_stack(&attr_stack, info, NULL, 0);
 }
 
 static int path_matches(const char *pathname, int pathlen,
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v2 12/27] Documentation: fix a typo
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (10 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 11/27] attr.c: add push_stack() helper Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 13/27] attr.c: outline the future plans by heavily commenting Brandon Williams
                     ` (15 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller, gitster, pclouds, Brandon Williams

From: Stefan Beller <sbeller@google.com>

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/gitattributes.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 3173dee7e..a53d093ca 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -88,7 +88,7 @@ is either not set or empty, $HOME/.config/git/attributes is used instead.
 Attributes for all users on a system should be placed in the
 `$(prefix)/etc/gitattributes` file.
 
-Sometimes you would need to override an setting of an attribute
+Sometimes you would need to override a setting of an attribute
 for a path to `Unspecified` state.  This can be done by listing
 the name of the attribute prefixed with an exclamation point `!`.
 
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v2 13/27] attr.c: outline the future plans by heavily commenting
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (11 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 12/27] Documentation: fix a typo Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 14/27] attr: rename function and struct related to checking attributes Brandon Williams
                     ` (14 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 8026d68bd..50e5ee393 100644
--- a/attr.c
+++ b/attr.c
@@ -30,6 +30,11 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 #define DEBUG_ATTR 0
 #endif
 
+/*
+ * NEEDSWORK: the global dictionary of the interned attributes
+ * must stay a singleton even after we become thread-ready.
+ * Access to these must be surrounded with mutex when it happens.
+ */
 struct git_attr {
 	struct git_attr *next;
 	unsigned h;
@@ -39,10 +44,19 @@ struct git_attr {
 	char name[FLEX_ARRAY];
 };
 static int attr_nr;
+static struct git_attr *(git_attr_hash[HASHSIZE]);
+
+/*
+ * NEEDSWORK: maybe-real, maybe-macro are not property of
+ * an attribute, as it depends on what .gitattributes are
+ * read.  Once we introduce per git_attr_check attr_stack
+ * and check_all_attr, the optimization based on them will
+ * become unnecessary and can go away.  So is this variable.
+ */
 static int cannot_trust_maybe_real;
 
+/* NEEDSWORK: This will become per git_attr_check */
 static struct git_attr_check *check_all_attr;
-static struct git_attr *(git_attr_hash[HASHSIZE]);
 
 const char *git_attr_name(const struct git_attr *attr)
 {
@@ -102,6 +116,11 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 	a->maybe_real = 0;
 	git_attr_hash[pos] = a;
 
+	/*
+	 * NEEDSWORK: per git_attr_check check_all_attr
+	 * will be initialized a lot more lazily, not
+	 * like this, and not here.
+	 */
 	REALLOC_ARRAY(check_all_attr, attr_nr);
 	check_all_attr[a->attr_nr].attr = a;
 	check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
@@ -318,6 +337,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
  * .gitignore file and info/excludes file as a fallback.
  */
 
+/* NEEDSWORK: This will become per git_attr_check */
 static struct attr_stack {
 	struct attr_stack *prev;
 	char *origin;
@@ -382,6 +402,24 @@ static struct attr_stack *read_attr_from_array(const char **list)
 	return res;
 }
 
+/*
+ * NEEDSWORK: these two are tricky.  The callers assume there is a
+ * single, system-wide global state "where we read attributes from?"
+ * and when the state is flipped by calling git_attr_set_direction(),
+ * attr_stack is discarded so that subsequent attr_check will lazily
+ * read from the right place.  And they do not know or care who called
+ * by them uses the attribute subsystem, hence have no knowledge of
+ * existing git_attr_check instances or future ones that will be
+ * created).
+ *
+ * Probably we need a thread_local that holds these two variables,
+ * and a list of git_attr_check instances (which need to be maintained
+ * by hooking into git_attr_check_alloc(), git_attr_check_initl(), and
+ * git_attr_check_clear().  Then git_attr_set_direction() updates the
+ * fields in that thread_local for these two variables, iterate over
+ * all the active git_attr_check instances and discard the attr_stack
+ * they hold.  Yuck, but it sounds doable.
+ */
 static enum git_attr_direction direction;
 static struct index_state *use_index;
 
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v2 14/27] attr: rename function and struct related to checking attributes
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (12 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 13/27] attr.c: outline the future plans by heavily commenting Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 15/27] attr: (re)introduce git_check_attr() and struct attr_check Brandon Williams
                     ` (13 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

The traditional API to check attributes is to prepare an N-element
array of "struct git_attr_check" and pass N and the array to the
function "git_check_attr()" as arguments.

In preparation to revamp the API to pass a single structure, in
which these N elements are held, rename the type used for these
individual array elements to "struct attr_check_item" and rename
the function to "git_check_attrs()".
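
For readers unfamiliar with the traditional calling convention, here
is a minimal, self-contained sketch of its shape.  Every toy_* name is
a hypothetical stand-in invented for illustration, and the lookup rule
(paths ending in ".c" get diff=cpp) is made up; only the "pass N and
the array" signature mirrors what this patch renames.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not git's real definitions. */
struct toy_attr {
	const char *name;
};

struct toy_check_item {
	const struct toy_attr *attr;
	const char *value;	/* NULL means the attribute is unset */
};

/*
 * Toy lookup with the array-based signature: the caller passes the
 * element count and the array separately.  It pretends that paths
 * ending in ".c" carry diff=cpp and that nothing else is set.
 */
static int toy_check_attrs(const char *path, int num,
			   struct toy_check_item *check)
{
	size_t len = strlen(path);
	int is_c = len >= 2 && !strcmp(path + len - 2, ".c");
	int i;

	assert(num >= 0);
	for (i = 0; i < num; i++) {
		if (is_c && !strcmp(check[i].attr->name, "diff"))
			check[i].value = "cpp";
		else
			check[i].value = NULL;
	}
	return 0;
}

/* A typical caller: fill the array, then pass N and the array. */
static const char *toy_diff_driver(const char *path)
{
	static const struct toy_attr attr_diff = { "diff" };
	static const struct toy_attr attr_merge = { "merge" };
	struct toy_check_item check[2];

	check[0].attr = &attr_diff;
	check[1].attr = &attr_merge;
	if (toy_check_attrs(path, 2, check))
		return NULL;
	return check[0].value;
}
```

Because the count and the array travel as two separate arguments,
there is no single object to which per-query state could later be
attached, which is the motivation for the rename.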

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 archive.c              |  6 +++---
 attr.c                 | 12 ++++++------
 attr.h                 |  8 ++++----
 builtin/check-attr.c   | 19 ++++++++++---------
 builtin/pack-objects.c |  6 +++---
 convert.c              | 12 ++++++------
 ll-merge.c             | 10 +++++-----
 userdiff.c             |  4 ++--
 ws.c                   |  6 +++---
 9 files changed, 42 insertions(+), 41 deletions(-)

diff --git a/archive.c b/archive.c
index 01751e574..b76bd4691 100644
--- a/archive.c
+++ b/archive.c
@@ -87,7 +87,7 @@ void *sha1_file_to_archive(const struct archiver_args *args,
 	return buffer;
 }
 
-static void setup_archive_check(struct git_attr_check *check)
+static void setup_archive_check(struct attr_check_item *check)
 {
 	static struct git_attr *attr_export_ignore;
 	static struct git_attr *attr_export_subst;
@@ -123,7 +123,7 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 	struct archiver_context *c = context;
 	struct archiver_args *args = c->args;
 	write_archive_entry_fn_t write_entry = c->write_entry;
-	struct git_attr_check check[2];
+	struct attr_check_item check[2];
 	const char *path_without_prefix;
 	int err;
 
@@ -138,7 +138,7 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 	path_without_prefix = path.buf + args->baselen;
 
 	setup_archive_check(check);
-	if (!git_check_attr(path_without_prefix, ARRAY_SIZE(check), check)) {
+	if (!git_check_attrs(path_without_prefix, ARRAY_SIZE(check), check)) {
 		if (ATTR_TRUE(check[0].value))
 			return 0;
 		args->convert = ATTR_TRUE(check[1].value);
diff --git a/attr.c b/attr.c
index 50e5ee393..2f180d609 100644
--- a/attr.c
+++ b/attr.c
@@ -56,7 +56,7 @@ static struct git_attr *(git_attr_hash[HASHSIZE]);
 static int cannot_trust_maybe_real;
 
 /* NEEDSWORK: This will become per git_attr_check */
-static struct git_attr_check *check_all_attr;
+static struct attr_check_item *check_all_attr;
 
 const char *git_attr_name(const struct git_attr *attr)
 {
@@ -713,7 +713,7 @@ static int macroexpand_one(int attr_nr, int rem);
 
 static int fill_one(const char *what, struct match_attr *a, int rem)
 {
-	struct git_attr_check *check = check_all_attr;
+	struct attr_check_item *check = check_all_attr;
 	int i;
 
 	for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) {
@@ -778,7 +778,7 @@ static int macroexpand_one(int nr, int rem)
  * collected. Otherwise all attributes are collected.
  */
 static void collect_some_attrs(const char *path, int num,
-			       struct git_attr_check *check)
+			       struct attr_check_item *check)
 
 {
 	struct attr_stack *stk;
@@ -806,7 +806,7 @@ static void collect_some_attrs(const char *path, int num,
 		rem = 0;
 		for (i = 0; i < num; i++) {
 			if (!check[i].attr->maybe_real) {
-				struct git_attr_check *c;
+				struct attr_check_item *c;
 				c = check_all_attr + check[i].attr->attr_nr;
 				c->value = ATTR__UNSET;
 				rem++;
@@ -821,7 +821,7 @@ static void collect_some_attrs(const char *path, int num,
 		rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-int git_check_attr(const char *path, int num, struct git_attr_check *check)
+int git_check_attrs(const char *path, int num, struct attr_check_item *check)
 {
 	int i;
 
@@ -837,7 +837,7 @@ int git_check_attr(const char *path, int num, struct git_attr_check *check)
 	return 0;
 }
 
-int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
+int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
 {
 	int i, count, j;
 
diff --git a/attr.h b/attr.h
index 00d7a662c..efc7bb3b3 100644
--- a/attr.h
+++ b/attr.h
@@ -20,11 +20,11 @@ extern const char git_attr__false[];
 #define ATTR_UNSET(v) ((v) == NULL)
 
 /*
- * Send one or more git_attr_check to git_check_attr(), and
+ * Send one or more git_attr_check to git_check_attrs(), and
  * each 'value' member tells what its value is.
  * Unset one is returned as NULL.
  */
-struct git_attr_check {
+struct attr_check_item {
 	const struct git_attr *attr;
 	const char *value;
 };
@@ -36,7 +36,7 @@ struct git_attr_check {
  */
 extern const char *git_attr_name(const struct git_attr *);
 
-int git_check_attr(const char *path, int, struct git_attr_check *);
+int git_check_attrs(const char *path, int, struct attr_check_item *);
 
 /*
  * Retrieve all attributes that apply to the specified path.  *num
@@ -45,7 +45,7 @@ int git_check_attr(const char *path, int, struct git_attr_check *);
  * objects describing the attributes and their values.  *check must be
  * free()ed by the caller.
  */
-int git_all_attrs(const char *path, int *num, struct git_attr_check **check);
+int git_all_attrs(const char *path, int *num, struct attr_check_item **check);
 
 enum git_attr_direction {
 	GIT_ATTR_CHECKIN,
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index 53a5a18c1..889264a5b 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -24,8 +24,8 @@ static const struct option check_attr_options[] = {
 	OPT_END()
 };
 
-static void output_attr(int cnt, struct git_attr_check *check,
-	const char *file)
+static void output_attr(int cnt, struct attr_check_item *check,
+			const char *file)
 {
 	int j;
 	for (j = 0; j < cnt; j++) {
@@ -51,14 +51,15 @@ static void output_attr(int cnt, struct git_attr_check *check,
 	}
 }
 
-static void check_attr(const char *prefix, int cnt,
-	struct git_attr_check *check, const char *file)
+static void check_attr(const char *prefix,
+		       int cnt, struct attr_check_item *check,
+		       const char *file)
 {
 	char *full_path =
 		prefix_path(prefix, prefix ? strlen(prefix) : 0, file);
 	if (check != NULL) {
-		if (git_check_attr(full_path, cnt, check))
-			die("git_check_attr died");
+		if (git_check_attrs(full_path, cnt, check))
+			die("git_check_attrs died");
 		output_attr(cnt, check, file);
 	} else {
 		if (git_all_attrs(full_path, &cnt, &check))
@@ -69,8 +70,8 @@ static void check_attr(const char *prefix, int cnt,
 	free(full_path);
 }
 
-static void check_attr_stdin_paths(const char *prefix, int cnt,
-	struct git_attr_check *check)
+static void check_attr_stdin_paths(const char *prefix,
+				   int cnt, struct attr_check_item *check)
 {
 	struct strbuf buf = STRBUF_INIT;
 	struct strbuf unquoted = STRBUF_INIT;
@@ -99,7 +100,7 @@ static NORETURN void error_with_usage(const char *msg)
 
 int cmd_check_attr(int argc, const char **argv, const char *prefix)
 {
-	struct git_attr_check *check;
+	struct attr_check_item *check;
 	int cnt, i, doubledash, filei;
 
 	if (!is_bare_repository())
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 8841f8b36..8b8fbd814 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -894,7 +894,7 @@ static void write_pack_file(void)
 			written, nr_result);
 }
 
-static void setup_delta_attr_check(struct git_attr_check *check)
+static void setup_delta_attr_check(struct attr_check_item *check)
 {
 	static struct git_attr *attr_delta;
 
@@ -906,10 +906,10 @@ static void setup_delta_attr_check(struct git_attr_check *check)
 
 static int no_try_delta(const char *path)
 {
-	struct git_attr_check check[1];
+	struct attr_check_item check[1];
 
 	setup_delta_attr_check(check);
-	if (git_check_attr(path, ARRAY_SIZE(check), check))
+	if (git_check_attrs(path, ARRAY_SIZE(check), check))
 		return 0;
 	if (ATTR_FALSE(check->value))
 		return 1;
diff --git a/convert.c b/convert.c
index 4e17e45ed..1b9829279 100644
--- a/convert.c
+++ b/convert.c
@@ -1028,7 +1028,7 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
 	return 1;
 }
 
-static enum crlf_action git_path_check_crlf(struct git_attr_check *check)
+static enum crlf_action git_path_check_crlf(struct attr_check_item *check)
 {
 	const char *value = check->value;
 
@@ -1045,7 +1045,7 @@ static enum crlf_action git_path_check_crlf(struct git_attr_check *check)
 	return CRLF_UNDEFINED;
 }
 
-static enum eol git_path_check_eol(struct git_attr_check *check)
+static enum eol git_path_check_eol(struct attr_check_item *check)
 {
 	const char *value = check->value;
 
@@ -1058,7 +1058,7 @@ static enum eol git_path_check_eol(struct git_attr_check *check)
 	return EOL_UNSET;
 }
 
-static struct convert_driver *git_path_check_convert(struct git_attr_check *check)
+static struct convert_driver *git_path_check_convert(struct attr_check_item *check)
 {
 	const char *value = check->value;
 	struct convert_driver *drv;
@@ -1071,7 +1071,7 @@ static struct convert_driver *git_path_check_convert(struct git_attr_check *chec
 	return NULL;
 }
 
-static int git_path_check_ident(struct git_attr_check *check)
+static int git_path_check_ident(struct attr_check_item *check)
 {
 	const char *value = check->value;
 
@@ -1093,7 +1093,7 @@ static const char *conv_attr_name[] = {
 static void convert_attrs(struct conv_attrs *ca, const char *path)
 {
 	int i;
-	static struct git_attr_check ccheck[NUM_CONV_ATTRS];
+	static struct attr_check_item ccheck[NUM_CONV_ATTRS];
 
 	if (!ccheck[0].attr) {
 		for (i = 0; i < NUM_CONV_ATTRS; i++)
@@ -1102,7 +1102,7 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
 		git_config(read_convert_config, NULL);
 	}
 
-	if (!git_check_attr(path, NUM_CONV_ATTRS, ccheck)) {
+	if (!git_check_attrs(path, NUM_CONV_ATTRS, ccheck)) {
 		ca->crlf_action = git_path_check_crlf(ccheck + 4);
 		if (ca->crlf_action == CRLF_UNDEFINED)
 			ca->crlf_action = git_path_check_crlf(ccheck + 0);
diff --git a/ll-merge.c b/ll-merge.c
index ad8be42f9..198f07aca 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -336,13 +336,13 @@ static const struct ll_merge_driver *find_ll_merge_driver(const char *merge_attr
 	return &ll_merge_drv[LL_TEXT_MERGE];
 }
 
-static int git_path_check_merge(const char *path, struct git_attr_check check[2])
+static int git_path_check_merge(const char *path, struct attr_check_item check[2])
 {
 	if (!check[0].attr) {
 		check[0].attr = git_attr("merge");
 		check[1].attr = git_attr("conflict-marker-size");
 	}
-	return git_check_attr(path, 2, check);
+	return git_check_attrs(path, 2, check);
 }
 
 static void normalize_file(mmfile_t *mm, const char *path)
@@ -362,7 +362,7 @@ int ll_merge(mmbuffer_t *result_buf,
 	     mmfile_t *theirs, const char *their_label,
 	     const struct ll_merge_options *opts)
 {
-	static struct git_attr_check check[2];
+	static struct attr_check_item check[2];
 	static const struct ll_merge_options default_opts;
 	const char *ll_driver_name = NULL;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
@@ -398,12 +398,12 @@ int ll_merge(mmbuffer_t *result_buf,
 
 int ll_merge_marker_size(const char *path)
 {
-	static struct git_attr_check check;
+	static struct attr_check_item check;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 
 	if (!check.attr)
 		check.attr = git_attr("conflict-marker-size");
-	if (!git_check_attr(path, 1, &check) && check.value) {
+	if (!git_check_attrs(path, 1, &check) && check.value) {
 		marker_size = atoi(check.value);
 		if (marker_size <= 0)
 			marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
diff --git a/userdiff.c b/userdiff.c
index 2125d6da2..b0b44467a 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -263,7 +263,7 @@ struct userdiff_driver *userdiff_find_by_name(const char *name) {
 struct userdiff_driver *userdiff_find_by_path(const char *path)
 {
 	static struct git_attr *attr;
-	struct git_attr_check check;
+	struct attr_check_item check;
 
 	if (!attr)
 		attr = git_attr("diff");
@@ -271,7 +271,7 @@ struct userdiff_driver *userdiff_find_by_path(const char *path)
 
 	if (!path)
 		return NULL;
-	if (git_check_attr(path, 1, &check))
+	if (git_check_attrs(path, 1, &check))
 		return NULL;
 
 	if (ATTR_TRUE(check.value))
diff --git a/ws.c b/ws.c
index ea4b2b1df..fbd876e84 100644
--- a/ws.c
+++ b/ws.c
@@ -71,7 +71,7 @@ unsigned parse_whitespace_rule(const char *string)
 	return rule;
 }
 
-static void setup_whitespace_attr_check(struct git_attr_check *check)
+static void setup_whitespace_attr_check(struct attr_check_item *check)
 {
 	static struct git_attr *attr_whitespace;
 
@@ -82,10 +82,10 @@ static void setup_whitespace_attr_check(struct git_attr_check *check)
 
 unsigned whitespace_rule(const char *pathname)
 {
-	struct git_attr_check attr_whitespace_rule;
+	struct attr_check_item attr_whitespace_rule;
 
 	setup_whitespace_attr_check(&attr_whitespace_rule);
-	if (!git_check_attr(pathname, 1, &attr_whitespace_rule)) {
+	if (!git_check_attrs(pathname, 1, &attr_whitespace_rule)) {
 		const char *value;
 
 		value = attr_whitespace_rule.value;
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v2 15/27] attr: (re)introduce git_check_attr() and struct attr_check
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (13 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 14/27] attr: rename function and struct related to checking attributes Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 16/27] attr: convert git_all_attrs() to use "struct attr_check" Brandon Williams
                     ` (12 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

A common pattern to check N attributes for many paths is to

 (1) prepare an array A of N attr_check_item items;
 (2) call git_attr() to intern the N attribute names and fill A;
 (3) repeatedly call git_check_attrs() for path with N and A;

A look-up for these N attributes for a single path P scans the
entire attr_stack, starting from the .git/info/attributes file,
then the .gitattributes file in the directory the path P is in,
and then going upwards through the .gitattributes files found in
the parent directories.

An earlier commit 06a604e6 (attr: avoid heavy work when we know the
specified attr is not defined, 2014-12-28) tried to optimize out
this scanning for one trivial special case: when the attribute being
sought is known not to exist, we do not have to scan for it.  While
this may be a cheap and effective heuristic, it would not work well
when N is (much) more than 1.
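
A rough sketch of that heuristic, as it can be read from the
maybe_real check visible in collect_some_attrs(), might look like the
following.  The toy_* types and toy_can_skip_scan() are hypothetical
simplifications: the real code records values in check_all_attr rather
than in the caller's array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the maybe_real idea mirrors attr.c. */
struct toy_attr {
	const char *name;
	int maybe_real;	/* set once some attributes file mentions the name */
};

struct toy_check_item {
	const struct toy_attr *attr;
	const char *value;	/* NULL means the attribute is unset */
};

/*
 * Mark every attribute that cannot possibly be defined as unset.
 * Returns 1 when all requested attributes were settled that way, so
 * the scan of the attribute stack can be skipped, and 0 otherwise.
 */
static int toy_can_skip_scan(int num, struct toy_check_item *check)
{
	int i, settled = 0;

	assert(num >= 0);
	for (i = 0; i < num; i++) {
		if (!check[i].attr->maybe_real) {
			check[i].value = NULL;
			settled++;
		}
	}
	return settled == num;
}
```

The shortcut only fires when every requested attribute is known to be
absent; a single attribute that may be defined forces the full scan,
which is why the benefit shrinks as N grows.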

What we would want is a more customized way to skip irrelevant
entries in the attribute stack, and the definition of irrelevance
is tied to the set of attributes passed to git_check_attrs() call,
i.e. the set of attributes being sought.  The data necessary for
this optimization needs to live alongside the set of attributes, but
a simple array of attr_check_item does not have any place
for that.

Introduce "struct attr_check" that contains N, the number of
attributes being sought, and A, the array that holds N
attr_check_item items, and a function git_check_attr() that
takes a path P and this structure as its parameters.  This structure
can later be extended to hold extra data necessary for optimization.

Also, to make it easier to write the first two steps in common
cases, introduce the attr_check_initl() helper function, which takes
a NULL-terminated list of attribute names and initializes this
structure.
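
As an illustration of the two-pass variadic technique such a helper
relies on, here is a standalone sketch.  The toy_* names are
hypothetical, attribute names are stored as plain strings instead of
being interned with git_attr(), and plain calloc() with explicit error
returns stands in for xcalloc().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not git's real definitions. */
struct toy_check_item {
	const char *name;
	const char *value;
};

struct toy_check {
	int nr;
	int alloc;
	struct toy_check_item *items;
};

/* Build a toy_check from a NULL-terminated list of attribute names. */
static struct toy_check *toy_check_initl(const char *one, ...)
{
	struct toy_check *check;
	va_list params;
	int cnt, i;

	assert(one != NULL);

	/* First pass: count the names up to the terminating NULL. */
	va_start(params, one);
	for (cnt = 1; va_arg(params, const char *) != NULL; cnt++)
		;
	va_end(params);

	check = calloc(1, sizeof(*check));
	if (!check)
		return NULL;
	check->items = calloc((size_t)cnt, sizeof(*check->items));
	if (!check->items) {
		free(check);
		return NULL;
	}
	check->nr = check->alloc = cnt;

	/* Second pass: walk the arguments again and record each name. */
	check->items[0].name = one;
	va_start(params, one);
	for (i = 1; i < cnt; i++)
		check->items[i].name = va_arg(params, const char *);
	va_end(params);
	return check;
}

static void toy_check_free(struct toy_check *check)
{
	if (!check)
		return;
	free(check->items);
	free(check);
}
```

A caller would write something like
toy_check_initl("crlf", "eol", "text", (const char *)NULL); the cast on
the sentinel keeps the variadic call portable on platforms where a
bare NULL is not pointer-sized.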

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 attr.h | 17 +++++++++++++++
 2 files changed, 91 insertions(+)

diff --git a/attr.c b/attr.c
index 2f180d609..be9e398e9 100644
--- a/attr.c
+++ b/attr.c
@@ -865,6 +865,80 @@ int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
 	return 0;
 }
 
+struct attr_check *attr_check_alloc(void)
+{
+	return xcalloc(1, sizeof(struct attr_check));
+}
+
+int git_check_attr(const char *path, struct attr_check *check)
+{
+	return git_check_attrs(path, check->check_nr, check->check);
+}
+
+struct attr_check *attr_check_initl(const char *one, ...)
+{
+	struct attr_check *check;
+	int cnt;
+	va_list params;
+	const char *param;
+
+	va_start(params, one);
+	for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++)
+		;
+	va_end(params);
+
+	check = attr_check_alloc();
+	check->check_nr = cnt;
+	check->check_alloc = cnt;
+	check->check = xcalloc(cnt, sizeof(struct attr_check_item));
+
+	check->check[0].attr = git_attr(one);
+	va_start(params, one);
+	for (cnt = 1; cnt < check->check_nr; cnt++) {
+		struct git_attr *attr;
+		param = va_arg(params, const char *);
+		if (!param)
+			die("BUG: counted %d != ended at %d",
+			    check->check_nr, cnt);
+		attr = git_attr(param);
+		if (!attr)
+			die("BUG: %s: not a valid attribute name", param);
+		check->check[cnt].attr = attr;
+	}
+	va_end(params);
+	return check;
+}
+
+struct attr_check_item *attr_check_append(struct attr_check *check,
+					  const struct git_attr *attr)
+{
+	struct attr_check_item *item;
+
+	ALLOC_GROW(check->check, check->check_nr + 1, check->check_alloc);
+	item = &check->check[check->check_nr++];
+	item->attr = attr;
+	return item;
+}
+
+void attr_check_reset(struct attr_check *check)
+{
+	check->check_nr = 0;
+}
+
+void attr_check_clear(struct attr_check *check)
+{
+	free(check->check);
+	check->check = NULL;
+	check->check_alloc = 0;
+	check->check_nr = 0;
+}
+
+void attr_check_free(struct attr_check *check)
+{
+	attr_check_clear(check);
+	free(check);
+}
+
 void git_attr_set_direction(enum git_attr_direction new, struct index_state *istate)
 {
 	enum git_attr_direction old = direction;
diff --git a/attr.h b/attr.h
index efc7bb3b3..459347f4b 100644
--- a/attr.h
+++ b/attr.h
@@ -29,6 +29,22 @@ struct attr_check_item {
 	const char *value;
 };
 
+struct attr_check {
+	int check_nr;
+	int check_alloc;
+	struct attr_check_item *check;
+};
+
+extern struct attr_check *attr_check_alloc(void);
+extern struct attr_check *attr_check_initl(const char *, ...);
+
+extern struct attr_check_item *attr_check_append(struct attr_check *check,
+						 const struct git_attr *attr);
+
+extern void attr_check_reset(struct attr_check *check);
+extern void attr_check_clear(struct attr_check *check);
+extern void attr_check_free(struct attr_check *check);
+
 /*
  * Return the name of the attribute represented by the argument.  The
  * return value is a pointer to a null-delimited string that is part
@@ -37,6 +53,7 @@ struct attr_check_item {
 extern const char *git_attr_name(const struct git_attr *);
 
 int git_check_attrs(const char *path, int, struct attr_check_item *);
+extern int git_check_attr(const char *path, struct attr_check *check);
 
 /*
  * Retrieve all attributes that apply to the specified path.  *num
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v2 16/27] attr: convert git_all_attrs() to use "struct attr_check"
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (14 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 15/27] attr: (re)introduce git_check_attr() and struct attr_check Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 17/27] attr: convert git_check_attrs() callers to use the new API Brandon Williams
                     ` (11 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

This updates the other two ways the attribute check is done via an
array of "struct attr_check_item" elements.  These two niches
appear only in "git check-attr".

 * The caller does not know offhand what attributes it wants to ask
   about and cannot use attr_check_initl() to prepare the
   attr_check structure.

 * The caller may not know what attributes it wants to ask about at
   all, and instead wants to learn everything that the given path has.

Such a caller can call attr_check_alloc() to allocate an empty
attr_check, and then call attr_check_append() to add attribute names
one by one.
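
The allocate-empty-then-append pattern can be sketched in isolation as
below.  The toy_* names are hypothetical, and the doubling growth
policy is an assumption chosen for brevity; it is not claimed to match
what ALLOC_GROW() does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not git's real definitions. */
struct toy_check_item {
	const char *name;
	const char *value;
};

struct toy_check {
	size_t nr;
	size_t alloc;
	struct toy_check_item *items;
};

/* Append one attribute name, growing the array when it is full. */
static struct toy_check_item *toy_check_append(struct toy_check *check,
					       const char *name)
{
	struct toy_check_item *item;

	assert(check->nr <= check->alloc);
	if (check->nr == check->alloc) {
		/* Assumed policy: start at 4 slots, then double. */
		size_t new_alloc = check->alloc ? check->alloc * 2 : 4;
		struct toy_check_item *grown =
			realloc(check->items, new_alloc * sizeof(*grown));
		if (!grown)
			return NULL;
		check->items = grown;
		check->alloc = new_alloc;
	}
	item = &check->items[check->nr++];
	item->name = name;
	item->value = NULL;
	return item;
}

/* Forget the recorded items but keep the allocation for reuse. */
static void toy_check_reset(struct toy_check *check)
{
	check->nr = 0;
}
```

Resetting without freeing lets a caller that asks about many paths in
a row, as the "learn everything" case does, reuse one buffer instead
of allocating a fresh array per path.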

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c               | 38 ++++++++++++---------------------
 attr.h               |  9 +++-----
 builtin/check-attr.c | 60 ++++++++++++++++++++++++++--------------------------
 3 files changed, 47 insertions(+), 60 deletions(-)

diff --git a/attr.c b/attr.c
index be9e398e9..d2eaa0410 100644
--- a/attr.c
+++ b/attr.c
@@ -837,42 +837,32 @@ int git_check_attrs(const char *path, int num, struct attr_check_item *check)
 	return 0;
 }
 
-int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
+void git_all_attrs(const char *path, struct attr_check *check)
 {
-	int i, count, j;
+	int i;
 
-	collect_some_attrs(path, 0, NULL);
+	attr_check_reset(check);
+	collect_some_attrs(path, check->check_nr, check->check);
 
-	/* Count the number of attributes that are set. */
-	count = 0;
-	for (i = 0; i < attr_nr; i++) {
-		const char *value = check_all_attr[i].value;
-		if (value != ATTR__UNSET && value != ATTR__UNKNOWN)
-			++count;
-	}
-	*num = count;
-	ALLOC_ARRAY(*check, count);
-	j = 0;
 	for (i = 0; i < attr_nr; i++) {
+		const char *name = check_all_attr[i].attr->name;
 		const char *value = check_all_attr[i].value;
-		if (value != ATTR__UNSET && value != ATTR__UNKNOWN) {
-			(*check)[j].attr = check_all_attr[i].attr;
-			(*check)[j].value = value;
-			++j;
-		}
+		struct attr_check_item *item;
+		if (value == ATTR__UNSET || value == ATTR__UNKNOWN)
+			continue;
+		item = attr_check_append(check, git_attr(name));
+		item->value = value;
 	}
-
-	return 0;
 }
 
-struct attr_check *attr_check_alloc(void)
+int git_check_attr(const char *path, struct attr_check *check)
 {
-	return xcalloc(1, sizeof(struct attr_check));
+	return git_check_attrs(path, check->check_nr, check->check);
 }
 
-int git_check_attr(const char *path, struct attr_check *check)
+struct attr_check *attr_check_alloc(void)
 {
-	return git_check_attrs(path, check->check_nr, check->check);
+	return xcalloc(1, sizeof(struct attr_check));
 }
 
 struct attr_check *attr_check_initl(const char *one, ...)
diff --git a/attr.h b/attr.h
index 459347f4b..971bb9a38 100644
--- a/attr.h
+++ b/attr.h
@@ -56,13 +56,10 @@ int git_check_attrs(const char *path, int, struct attr_check_item *);
 extern int git_check_attr(const char *path, struct attr_check *check);
 
 /*
- * Retrieve all attributes that apply to the specified path.  *num
- * will be set to the number of attributes on the path; **check will
- * be set to point at a newly-allocated array of git_attr_check
- * objects describing the attributes and their values.  *check must be
- * free()ed by the caller.
+ * Retrieve all attributes that apply to the specified path.
+ * check holds the attributes and their values.
  */
-int git_all_attrs(const char *path, int *num, struct attr_check_item **check);
+void git_all_attrs(const char *path, struct attr_check *check);
 
 enum git_attr_direction {
 	GIT_ATTR_CHECKIN,
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index 889264a5b..3d4704be5 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -24,12 +24,13 @@ static const struct option check_attr_options[] = {
 	OPT_END()
 };
 
-static void output_attr(int cnt, struct attr_check_item *check,
-			const char *file)
+static void output_attr(struct attr_check *check, const char *file)
 {
 	int j;
+	int cnt = check->check_nr;
+
 	for (j = 0; j < cnt; j++) {
-		const char *value = check[j].value;
+		const char *value = check->check[j].value;
 
 		if (ATTR_TRUE(value))
 			value = "set";
@@ -42,36 +43,38 @@ static void output_attr(int cnt, struct attr_check_item *check,
 			printf("%s%c" /* path */
 			       "%s%c" /* attrname */
 			       "%s%c" /* attrvalue */,
-			       file, 0, git_attr_name(check[j].attr), 0, value, 0);
+			       file, 0,
+			       git_attr_name(check->check[j].attr), 0, value, 0);
 		} else {
 			quote_c_style(file, NULL, stdout, 0);
-			printf(": %s: %s\n", git_attr_name(check[j].attr), value);
+			printf(": %s: %s\n",
+			       git_attr_name(check->check[j].attr), value);
 		}
-
 	}
 }
 
 static void check_attr(const char *prefix,
-		       int cnt, struct attr_check_item *check,
+		       struct attr_check *check,
+		       int collect_all,
 		       const char *file)
 {
 	char *full_path =
 		prefix_path(prefix, prefix ? strlen(prefix) : 0, file);
-	if (check != NULL) {
-		if (git_check_attrs(full_path, cnt, check))
-			die("git_check_attrs died");
-		output_attr(cnt, check, file);
+
+	if (collect_all) {
+		git_all_attrs(full_path, check);
 	} else {
-		if (git_all_attrs(full_path, &cnt, &check))
-			die("git_all_attrs died");
-		output_attr(cnt, check, file);
-		free(check);
+		if (git_check_attr(full_path, check))
+			die("git_check_attr died");
 	}
+	output_attr(check, file);
+
 	free(full_path);
 }
 
 static void check_attr_stdin_paths(const char *prefix,
-				   int cnt, struct attr_check_item *check)
+				   struct attr_check *check,
+				   int collect_all)
 {
 	struct strbuf buf = STRBUF_INIT;
 	struct strbuf unquoted = STRBUF_INIT;
@@ -85,7 +88,7 @@ static void check_attr_stdin_paths(const char *prefix,
 				die("line is badly quoted");
 			strbuf_swap(&buf, &unquoted);
 		}
-		check_attr(prefix, cnt, check, buf.buf);
+		check_attr(prefix, check, collect_all, buf.buf);
 		maybe_flush_or_die(stdout, "attribute to stdout");
 	}
 	strbuf_release(&buf);
@@ -100,7 +103,7 @@ static NORETURN void error_with_usage(const char *msg)
 
 int cmd_check_attr(int argc, const char **argv, const char *prefix)
 {
-	struct attr_check_item *check;
+	struct attr_check *check;
 	int cnt, i, doubledash, filei;
 
 	if (!is_bare_repository())
@@ -160,28 +163,25 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix)
 			error_with_usage("No file specified");
 	}
 
-	if (all_attrs) {
-		check = NULL;
-	} else {
-		check = xcalloc(cnt, sizeof(*check));
+	check = attr_check_alloc();
+	if (!all_attrs) {
 		for (i = 0; i < cnt; i++) {
-			const char *name;
-			struct git_attr *a;
-			name = argv[i];
-			a = git_attr(name);
+			struct git_attr *a = git_attr(argv[i]);
 			if (!a)
 				return error("%s: not a valid attribute name",
-					name);
-			check[i].attr = a;
+					     argv[i]);
+			attr_check_append(check, a);
 		}
 	}
 
 	if (stdin_paths)
-		check_attr_stdin_paths(prefix, cnt, check);
+		check_attr_stdin_paths(prefix, check, all_attrs);
 	else {
 		for (i = filei; i < argc; i++)
-			check_attr(prefix, cnt, check, argv[i]);
+			check_attr(prefix, check, all_attrs, argv[i]);
 		maybe_flush_or_die(stdout, "attribute to stdout");
 	}
+
+	attr_check_free(check);
 	return 0;
 }
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v2 17/27] attr: convert git_check_attrs() callers to use the new API
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (15 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 16/27] attr: convert git_all_attrs() to use "struct attr_check" Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 18/27] attr: retire git_check_attrs() API Brandon Williams
                     ` (10 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

The remaining callers are all simple "I have N attributes I am
interested in.  I'll ask about them with various paths one by one".

After this step, no caller to git_check_attrs() remains.  After
removing it, we can extend "struct attr_check" with data that can
be used to optimize the query for the specific N attributes it
contains.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
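The conversion below repeats one idiom at every call site: a function-local
static pointer that is filled in on first use from a NULL-terminated list of
attribute names.  A minimal, self-contained sketch of that idiom follows.  The
sketch_* types and functions are hypothetical stand-ins written for this note,
not the real attr.h definitions (Git stores a "struct git_attr *" where the
sketch stores a name), and merge_check() is an illustrative wrapper only.

```c
#include <stdarg.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the structures declared in attr.h. */
struct sketch_check_item {
	const char *attr;	/* attribute name (Git keeps a struct git_attr *) */
	const char *value;	/* filled in by a query */
};

struct sketch_check {
	int nr;
	struct sketch_check_item *items;
};

/* Build a check from a NULL-terminated list of attribute names. */
static struct sketch_check *sketch_check_initl(const char *first, ...)
{
	struct sketch_check *check = calloc(1, sizeof(*check));
	const char *name;
	va_list ap;
	int i;

	if (!check)
		abort();

	/* First pass: count the names. */
	va_start(ap, first);
	for (name = first; name; name = va_arg(ap, const char *))
		check->nr++;
	va_end(ap);

	check->items = calloc(check->nr ? check->nr : 1, sizeof(*check->items));
	if (!check->items)
		abort();

	/* Second pass: record the names in argument order. */
	va_start(ap, first);
	for (i = 0, name = first; name; name = va_arg(ap, const char *), i++)
		check->items[i].attr = name;
	va_end(ap);

	return check;
}

/* Caller-side idiom: build the check once, reuse it for every path. */
static struct sketch_check *merge_check(void)
{
	static struct sketch_check *check;

	if (!check)
		check = sketch_check_initl("merge", "conflict-marker-size", NULL);
	return check;
}
```

Results are then read by position, so the order of names in the initializer
is the contract between setup and use.  Note that the unguarded static
pointer in this sketch is itself not safe to initialize from several threads
at once.
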
 archive.c              | 24 ++++++------------------
 builtin/pack-objects.c | 19 +++++--------------
 convert.c              | 17 ++++++-----------
 ll-merge.c             | 33 ++++++++++++++-------------------
 userdiff.c             | 19 ++++++++-----------
 ws.c                   | 19 ++++++-------------
 6 files changed, 45 insertions(+), 86 deletions(-)

diff --git a/archive.c b/archive.c
index b76bd4691..3591f7d55 100644
--- a/archive.c
+++ b/archive.c
@@ -87,19 +87,6 @@ void *sha1_file_to_archive(const struct archiver_args *args,
 	return buffer;
 }
 
-static void setup_archive_check(struct attr_check_item *check)
-{
-	static struct git_attr *attr_export_ignore;
-	static struct git_attr *attr_export_subst;
-
-	if (!attr_export_ignore) {
-		attr_export_ignore = git_attr("export-ignore");
-		attr_export_subst = git_attr("export-subst");
-	}
-	check[0].attr = attr_export_ignore;
-	check[1].attr = attr_export_subst;
-}
-
 struct directory {
 	struct directory *up;
 	struct object_id oid;
@@ -120,10 +107,10 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 		void *context)
 {
 	static struct strbuf path = STRBUF_INIT;
+	static struct attr_check *check;
 	struct archiver_context *c = context;
 	struct archiver_args *args = c->args;
 	write_archive_entry_fn_t write_entry = c->write_entry;
-	struct attr_check_item check[2];
 	const char *path_without_prefix;
 	int err;
 
@@ -137,11 +124,12 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 		strbuf_addch(&path, '/');
 	path_without_prefix = path.buf + args->baselen;
 
-	setup_archive_check(check);
-	if (!git_check_attrs(path_without_prefix, ARRAY_SIZE(check), check)) {
-		if (ATTR_TRUE(check[0].value))
+	if (!check)
+		check = attr_check_initl("export-ignore", "export-subst", NULL);
+	if (!git_check_attr(path_without_prefix, check)) {
+		if (ATTR_TRUE(check->check[0].value))
 			return 0;
-		args->convert = ATTR_TRUE(check[1].value);
+		args->convert = ATTR_TRUE(check->check[1].value);
 	}
 
 	if (S_ISDIR(mode) || S_ISGITLINK(mode)) {
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 8b8fbd814..ff8b3c12d 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -894,24 +894,15 @@ static void write_pack_file(void)
 			written, nr_result);
 }
 
-static void setup_delta_attr_check(struct attr_check_item *check)
-{
-	static struct git_attr *attr_delta;
-
-	if (!attr_delta)
-		attr_delta = git_attr("delta");
-
-	check[0].attr = attr_delta;
-}
-
 static int no_try_delta(const char *path)
 {
-	struct attr_check_item check[1];
+	static struct attr_check *check;
 
-	setup_delta_attr_check(check);
-	if (git_check_attrs(path, ARRAY_SIZE(check), check))
+	if (!check)
+		check = attr_check_initl("delta", NULL);
+	if (git_check_attr(path, check))
 		return 0;
-	if (ATTR_FALSE(check->value))
+	if (ATTR_FALSE(check->check[0].value))
 		return 1;
 	return 0;
 }
diff --git a/convert.c b/convert.c
index 1b9829279..affd8ce9b 100644
--- a/convert.c
+++ b/convert.c
@@ -1085,24 +1085,19 @@ struct conv_attrs {
 	int ident;
 };
 
-static const char *conv_attr_name[] = {
-	"crlf", "ident", "filter", "eol", "text",
-};
-#define NUM_CONV_ATTRS ARRAY_SIZE(conv_attr_name)
-
 static void convert_attrs(struct conv_attrs *ca, const char *path)
 {
-	int i;
-	static struct attr_check_item ccheck[NUM_CONV_ATTRS];
+	static struct attr_check *check;
 
-	if (!ccheck[0].attr) {
-		for (i = 0; i < NUM_CONV_ATTRS; i++)
-			ccheck[i].attr = git_attr(conv_attr_name[i]);
+	if (!check) {
+		check = attr_check_initl("crlf", "ident", "filter",
+					 "eol", "text", NULL);
 		user_convert_tail = &user_convert;
 		git_config(read_convert_config, NULL);
 	}
 
-	if (!git_check_attrs(path, NUM_CONV_ATTRS, ccheck)) {
+	if (!git_check_attr(path, check)) {
+		struct attr_check_item *ccheck = check->check;
 		ca->crlf_action = git_path_check_crlf(ccheck + 4);
 		if (ca->crlf_action == CRLF_UNDEFINED)
 			ca->crlf_action = git_path_check_crlf(ccheck + 0);
diff --git a/ll-merge.c b/ll-merge.c
index 198f07aca..3a4227a1c 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -336,15 +336,6 @@ static const struct ll_merge_driver *find_ll_merge_driver(const char *merge_attr
 	return &ll_merge_drv[LL_TEXT_MERGE];
 }
 
-static int git_path_check_merge(const char *path, struct attr_check_item check[2])
-{
-	if (!check[0].attr) {
-		check[0].attr = git_attr("merge");
-		check[1].attr = git_attr("conflict-marker-size");
-	}
-	return git_check_attrs(path, 2, check);
-}
-
 static void normalize_file(mmfile_t *mm, const char *path)
 {
 	struct strbuf strbuf = STRBUF_INIT;
@@ -362,7 +353,7 @@ int ll_merge(mmbuffer_t *result_buf,
 	     mmfile_t *theirs, const char *their_label,
 	     const struct ll_merge_options *opts)
 {
-	static struct attr_check_item check[2];
+	static struct attr_check *check;
 	static const struct ll_merge_options default_opts;
 	const char *ll_driver_name = NULL;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
@@ -376,10 +367,14 @@ int ll_merge(mmbuffer_t *result_buf,
 		normalize_file(ours, path);
 		normalize_file(theirs, path);
 	}
-	if (!git_path_check_merge(path, check)) {
-		ll_driver_name = check[0].value;
-		if (check[1].value) {
-			marker_size = atoi(check[1].value);
+
+	if (!check)
+		check = attr_check_initl("merge", "conflict-marker-size", NULL);
+
+	if (!git_check_attr(path, check)) {
+		ll_driver_name = check->check[0].value;
+		if (check->check[1].value) {
+			marker_size = atoi(check->check[1].value);
 			if (marker_size <= 0)
 				marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 		}
@@ -398,13 +393,13 @@ int ll_merge(mmbuffer_t *result_buf,
 
 int ll_merge_marker_size(const char *path)
 {
-	static struct attr_check_item check;
+	static struct attr_check *check;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 
-	if (!check.attr)
-		check.attr = git_attr("conflict-marker-size");
-	if (!git_check_attrs(path, 1, &check) && check.value) {
-		marker_size = atoi(check.value);
+	if (!check)
+		check = attr_check_initl("conflict-marker-size", NULL);
+	if (!git_check_attr(path, check) && check->check[0].value) {
+		marker_size = atoi(check->check[0].value);
 		if (marker_size <= 0)
 			marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 	}
diff --git a/userdiff.c b/userdiff.c
index b0b44467a..109d4b9fc 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -262,25 +262,22 @@ struct userdiff_driver *userdiff_find_by_name(const char *name) {
 
 struct userdiff_driver *userdiff_find_by_path(const char *path)
 {
-	static struct git_attr *attr;
-	struct attr_check_item check;
-
-	if (!attr)
-		attr = git_attr("diff");
-	check.attr = attr;
+	static struct attr_check *check;
 
+	if (!check)
+		check = attr_check_initl("diff", NULL);
 	if (!path)
 		return NULL;
-	if (git_check_attrs(path, 1, &check))
+	if (git_check_attr(path, check))
 		return NULL;
 
-	if (ATTR_TRUE(check.value))
+	if (ATTR_TRUE(check->check[0].value))
 		return &driver_true;
-	if (ATTR_FALSE(check.value))
+	if (ATTR_FALSE(check->check[0].value))
 		return &driver_false;
-	if (ATTR_UNSET(check.value))
+	if (ATTR_UNSET(check->check[0].value))
 		return NULL;
-	return userdiff_find_by_name(check.value);
+	return userdiff_find_by_name(check->check[0].value);
 }
 
 struct userdiff_driver *userdiff_get_textconv(struct userdiff_driver *driver)
diff --git a/ws.c b/ws.c
index fbd876e84..7556adbd0 100644
--- a/ws.c
+++ b/ws.c
@@ -71,24 +71,17 @@ unsigned parse_whitespace_rule(const char *string)
 	return rule;
 }
 
-static void setup_whitespace_attr_check(struct attr_check_item *check)
-{
-	static struct git_attr *attr_whitespace;
-
-	if (!attr_whitespace)
-		attr_whitespace = git_attr("whitespace");
-	check[0].attr = attr_whitespace;
-}
-
 unsigned whitespace_rule(const char *pathname)
 {
-	struct attr_check_item attr_whitespace_rule;
+	static struct attr_check *attr_whitespace_rule;
+
+	if (!attr_whitespace_rule)
+		attr_whitespace_rule = attr_check_initl("whitespace", NULL);
 
-	setup_whitespace_attr_check(&attr_whitespace_rule);
-	if (!git_check_attrs(pathname, 1, &attr_whitespace_rule)) {
+	if (!git_check_attr(pathname, attr_whitespace_rule)) {
 		const char *value;
 
-		value = attr_whitespace_rule.value;
+		value = attr_whitespace_rule->check[0].value;
 		if (ATTR_TRUE(value)) {
 			/* true (whitespace) */
 			unsigned all_rule = ws_tab_width(whitespace_rule_cfg);
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v2 18/27] attr: retire git_check_attrs() API
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (16 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 17/27] attr: convert git_check_attrs() callers to use the new API Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 19/27] attr: pass struct attr_check to collect_some_attrs Brandon Williams
                     ` (9 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Since nobody uses the old API, make it file-scope static, and update
the documentation to describe the new API.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
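The documentation touched below describes four states for an attribute value
and three macros to tell them apart.  As a rough illustration, assuming the
macros are plain pointer comparisons against sentinel strings (the sentinel
contents mirror git_attr__true and git_attr__false in attr.c), the
classification could be sketched as follows.  All sketch_* and SKETCH_* names
and the enum are hypothetical and exist only in this note.

```c
#include <stddef.h>

/* Sentinels modelled on git_attr__true and git_attr__false in attr.c. */
static const char sketch_true[] = "(builtin)true";
static const char sketch_false[] = "\0(builtin)false";

/* Identity comparisons: the pointer matters, not the string contents. */
#define SKETCH_TRUE(v)  ((v) == sketch_true)
#define SKETCH_FALSE(v) ((v) == sketch_false)
#define SKETCH_UNSET(v) ((v) == NULL)

enum sketch_state {
	STATE_SET,		/* "attr" */
	STATE_UNSET,		/* "-attr" */
	STATE_UNSPECIFIED,	/* no rule mentions the attribute */
	STATE_STRING		/* "attr=value" */
};

static enum sketch_state sketch_classify(const char *value)
{
	if (SKETCH_TRUE(value))
		return STATE_SET;
	if (SKETCH_FALSE(value))
		return STATE_UNSET;
	if (SKETCH_UNSET(value))
		return STATE_UNSPECIFIED;
	return STATE_STRING;	/* only now is it safe to read as a string */
}
```

The three sentinel tests have to come first: only a value that fails all of
them may be treated as a user-supplied string.
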
 Documentation/technical/api-gitattributes.txt | 86 +++++++++++++++++----------
 attr.c                                        |  3 +-
 attr.h                                        |  1 -
 3 files changed, 58 insertions(+), 32 deletions(-)

diff --git a/Documentation/technical/api-gitattributes.txt b/Documentation/technical/api-gitattributes.txt
index 260266867..82f5130e7 100644
--- a/Documentation/technical/api-gitattributes.txt
+++ b/Documentation/technical/api-gitattributes.txt
@@ -16,10 +16,15 @@ Data Structure
 	of no interest to the calling programs.  The name of the
 	attribute can be retrieved by calling `git_attr_name()`.
 
-`struct git_attr_check`::
+`struct attr_check_item`::
 
-	This structure represents a set of attributes to check in a call
-	to `git_check_attr()` function, and receives the results.
+	This structure represents one attribute and its value.
+
+`struct attr_check`::
+
+	This structure represents a collection of `attr_check_item`.
+	It is passed to `git_check_attr()` function, specifying the
+	attributes to check, and receives their values.
 
 
 Attribute Values
@@ -27,7 +32,7 @@ Attribute Values
 
 An attribute for a path can be in one of four states: Set, Unset,
 Unspecified or set to a string, and `.value` member of `struct
-git_attr_check` records it.  There are three macros to check these:
+attr_check_item` records it.  There are three macros to check these:
 
 `ATTR_TRUE()`::
 
@@ -48,49 +53,51 @@ value of the attribute for the path.
 Querying Specific Attributes
 ----------------------------
 
-* Prepare an array of `struct git_attr_check` to define the list of
-  attributes you would want to check.  To populate this array, you would
-  need to define necessary attributes by calling `git_attr()` function.
+* Prepare `struct attr_check` using attr_check_initl()
+  function, enumerating the names of attributes whose values you are
+  interested in, terminated with a NULL pointer.  Alternatively, an
+  empty `struct attr_check` can be prepared by calling
+  `attr_check_alloc()` function and then attributes you want to
+  ask about can be added to it with `attr_check_append()`
+  function.
 
 * Call `git_check_attr()` to check the attributes for the path.
 
-* Inspect `git_attr_check` structure to see how each of the attribute in
-  the array is defined for the path.
+* Inspect `attr_check` structure to see how each of the
+  attributes in the array is defined for the path.
 
 
 Example
 -------
 
-To see how attributes "crlf" and "indent" are set for different paths.
+To see how attributes "crlf" and "ident" are set for different paths.
 
-. Prepare an array of `struct git_attr_check` with two elements (because
-  we are checking two attributes).  Initialize their `attr` member with
-  pointers to `struct git_attr` obtained by calling `git_attr()`:
+. Prepare a `struct attr_check` with two elements (because
+  we are checking two attributes):
 
 ------------
-static struct git_attr_check check[2];
+static struct attr_check *check;
 static void setup_check(void)
 {
-	if (check[0].attr)
+	if (check)
 		return; /* already done */
-	check[0].attr = git_attr("crlf");
-	check[1].attr = git_attr("ident");
+	check = attr_check_initl("crlf", "ident", NULL);
 }
 ------------
 
-. Call `git_check_attr()` with the prepared array of `struct git_attr_check`:
+. Call `git_check_attr()` with the prepared `struct attr_check`:
 
 ------------
 	const char *path;
 
 	setup_check();
-	git_check_attr(path, ARRAY_SIZE(check), check);
+	git_check_attr(path, check);
 ------------
 
-. Act on `.value` member of the result, left in `check[]`:
+. Act on `.value` member of the result, left in `check->check[]`:
 
 ------------
-	const char *value = check[0].value;
+	const char *value = check->check[0].value;
 
 	if (ATTR_TRUE(value)) {
 		The attribute is Set, by listing only the name of the
@@ -109,20 +116,39 @@ static void setup_check(void)
 	}
 ------------
 
+To see how attributes in argv[] are set for different paths, only
+the first step in the above would be different.
+
+------------
+static struct attr_check *check;
+static void setup_check(const char **argv)
+{
+	check = attr_check_alloc();
+	while (*argv) {
+		struct git_attr *attr = git_attr(*argv);
+		attr_check_append(check, attr);
+		argv++;
+	}
+}
+------------
+
 
 Querying All Attributes
 -----------------------
 
 To get the values of all attributes associated with a file:
 
-* Call `git_all_attrs()`, which returns an array of `git_attr_check`
-  structures.
+* Prepare an empty `attr_check` structure by calling
+  `attr_check_alloc()`.
+
+* Call `git_all_attrs()`, which populates the `attr_check`
+  with the attributes attached to the path.
 
-* Iterate over the `git_attr_check` array to examine the attribute
-  names and values.  The name of the attribute described by a
-  `git_attr_check` object can be retrieved via
-  `git_attr_name(check[i].attr)`.  (Please note that no items will be
-  returned for unset attributes, so `ATTR_UNSET()` will return false
-  for all returned `git_array_check` objects.)
+* Iterate over the `attr_check.check[]` array to examine
+  the attribute names and values.  The name of the attribute
+  described by an `attr_check.check[]` object can be retrieved via
+  `git_attr_name(check->check[i].attr)`.  (Please note that no items
+  will be returned for unset attributes, so `ATTR_UNSET()` will return
+  false for all returned `attr_check.check[]` objects.)
 
-* Free the `git_array_check` array.
+* Free the `attr_check` struct by calling `attr_check_free()`.
diff --git a/attr.c b/attr.c
index d2eaa0410..da727e3fd 100644
--- a/attr.c
+++ b/attr.c
@@ -821,7 +821,8 @@ static void collect_some_attrs(const char *path, int num,
 		rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-int git_check_attrs(const char *path, int num, struct attr_check_item *check)
+static int git_check_attrs(const char *path, int num,
+			   struct attr_check_item *check)
 {
 	int i;
 
diff --git a/attr.h b/attr.h
index 971bb9a38..3db9893ef 100644
--- a/attr.h
+++ b/attr.h
@@ -52,7 +52,6 @@ extern void attr_check_free(struct attr_check *check);
  */
 extern const char *git_attr_name(const struct git_attr *);
 
-int git_check_attrs(const char *path, int, struct attr_check_item *);
 extern int git_check_attr(const char *path, struct attr_check *check);
 
 /*
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v2 19/27] attr: pass struct attr_check to collect_some_attrs
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (17 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 18/27] attr: retire git_check_attrs() API Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 20/27] attr: change validity check for attribute names to use positive logic Brandon Williams
                     ` (8 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

The old call chain took an array of 'attr_check_item' items.  Instead,
pass the 'attr_check' container object to 'collect_some_attrs()' and
access the fields in the data structure directly.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
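For readers following the data flow: after collection, git_check_attr() copies
each requested value out of a scratch array indexed by attribute number, and
reports slots that no rule touched as unspecified.  A simplified, standalone
sketch of that copy-out step is shown here.  The sketch_* names are
hypothetical, and the scratch array is passed in as a parameter purely to keep
the example self-contained (in this patch it is still the file-scope
check_all_attr array).

```c
#include <stddef.h>

/* Marks a scratch slot that no attribute rule has written yet. */
static const char sketch_unknown[] = "(builtin)unknown";

struct sketch_item {
	int attr_nr;		/* index of the attribute in the scratch array */
	const char *value;	/* result handed back to the caller */
};

struct sketch_check {
	int check_nr;
	struct sketch_item *check;
};

/*
 * Copy results from the scratch array into the caller's container.
 * Slots still marked "unknown" are reported as unspecified (NULL).
 */
static void sketch_copy_results(struct sketch_check *check,
				const char *const *scratch)
{
	int i;

	for (i = 0; i < check->check_nr; i++) {
		const char *value = scratch[check->check[i].attr_nr];
		if (value == sketch_unknown)
			value = NULL;
		check->check[i].value = value;
	}
}
```

Passing the container rather than a (count, array) pair keeps the count and
the items together, so the two can no longer disagree at a call site.
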
 attr.c | 33 +++++++++++++--------------------
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/attr.c b/attr.c
index da727e3fd..e58fa340c 100644
--- a/attr.c
+++ b/attr.c
@@ -777,9 +777,7 @@ static int macroexpand_one(int nr, int rem)
  * check_all_attr. If num is non-zero, only attributes in check[] are
  * collected. Otherwise all attributes are collected.
  */
-static void collect_some_attrs(const char *path, int num,
-			       struct attr_check_item *check)
-
+static void collect_some_attrs(const char *path, struct attr_check *check)
 {
 	struct attr_stack *stk;
 	int i, pathlen, rem, dirlen;
@@ -802,17 +800,18 @@ static void collect_some_attrs(const char *path, int num,
 	prepare_attr_stack(path, dirlen);
 	for (i = 0; i < attr_nr; i++)
 		check_all_attr[i].value = ATTR__UNKNOWN;
-	if (num && !cannot_trust_maybe_real) {
+	if (check->check_nr && !cannot_trust_maybe_real) {
 		rem = 0;
-		for (i = 0; i < num; i++) {
-			if (!check[i].attr->maybe_real) {
+		for (i = 0; i < check->check_nr; i++) {
+			const struct git_attr *a = check->check[i].attr;
+			if (!a->maybe_real) {
 				struct attr_check_item *c;
-				c = check_all_attr + check[i].attr->attr_nr;
+				c = check_all_attr + a->attr_nr;
 				c->value = ATTR__UNSET;
 				rem++;
 			}
 		}
-		if (rem == num)
+		if (rem == check->check_nr)
 			return;
 	}
 
@@ -821,18 +820,17 @@ static void collect_some_attrs(const char *path, int num,
 		rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-static int git_check_attrs(const char *path, int num,
-			   struct attr_check_item *check)
+int git_check_attr(const char *path, struct attr_check *check)
 {
 	int i;
 
-	collect_some_attrs(path, num, check);
+	collect_some_attrs(path, check);
 
-	for (i = 0; i < num; i++) {
-		const char *value = check_all_attr[check[i].attr->attr_nr].value;
+	for (i = 0; i < check->check_nr; i++) {
+		const char *value = check_all_attr[check->check[i].attr->attr_nr].value;
 		if (value == ATTR__UNKNOWN)
 			value = ATTR__UNSET;
-		check[i].value = value;
+		check->check[i].value = value;
 	}
 
 	return 0;
@@ -843,7 +841,7 @@ void git_all_attrs(const char *path, struct attr_check *check)
 	int i;
 
 	attr_check_reset(check);
-	collect_some_attrs(path, check->check_nr, check->check);
+	collect_some_attrs(path, check);
 
 	for (i = 0; i < attr_nr; i++) {
 		const char *name = check_all_attr[i].attr->name;
@@ -856,11 +854,6 @@ void git_all_attrs(const char *path, struct attr_check *check)
 	}
 }
 
-int git_check_attr(const char *path, struct attr_check *check)
-{
-	return git_check_attrs(path, check->check_nr, check->check);
-}
-
 struct attr_check *attr_check_alloc(void)
 {
 	return xcalloc(1, sizeof(struct attr_check));
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v2 20/27] attr: change validity check for attribute names to use positive logic
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (18 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 19/27] attr: pass struct attr_check to collect_some_attrs Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 21/27] attr: use hashmap for attribute dictionary Brandon Williams
                     ` (7 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Convert 'invalid_attr_name()' to 'attr_name_valid()' and use positive
logic for the return value.  In addition create a helper function that
prints out an error message when an invalid attribute name is used.

We could later update the message to exactly spell out what the
rules for a good attribute name are, etc.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
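To make the new return convention concrete, here is a standalone adaptation
of the validity check that can be compiled outside of Git.  It returns 1 for
a valid name and 0 otherwise, as attr_name_valid() does after this patch.
The sketch_* names and the strlen() convenience wrapper are additions made
for this note only, and the sketch tests for an empty name with "!namelen"
because the length is unsigned.

```c
#include <stddef.h>
#include <string.h>

/*
 * An attribute name cannot begin with '-' and must consist of
 * characters from [-A-Za-z0-9_.].  Returns 1 if valid, 0 if not.
 */
static int sketch_attr_name_valid(const char *name, size_t namelen)
{
	if (!namelen || *name == '-')
		return 0;
	while (namelen--) {
		char ch = *name++;
		if (!(ch == '-' || ch == '.' || ch == '_' ||
		      ('0' <= ch && ch <= '9') ||
		      ('a' <= ch && ch <= 'z') ||
		      ('A' <= ch && ch <= 'Z')))
			return 0;
	}
	return 1;
}

/* Convenience wrapper for NUL-terminated names (not part of the patch). */
static int sketch_valid_str(const char *name)
{
	return sketch_attr_name_valid(name, strlen(name));
}
```

With positive logic a call site reads "if (!valid) complain", which is the
shape every caller in the diff below takes.
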
 attr.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/attr.c b/attr.c
index e58fa340c..5399e1cb3 100644
--- a/attr.c
+++ b/attr.c
@@ -74,23 +74,33 @@ static unsigned hash_name(const char *name, int namelen)
 	return val;
 }
 
-static int invalid_attr_name(const char *name, int namelen)
+static int attr_name_valid(const char *name, size_t namelen)
 {
 	/*
 	 * Attribute name cannot begin with '-' and must consist of
 	 * characters from [-A-Za-z0-9_.].
 	 */
 	if (namelen <= 0 || *name == '-')
-		return -1;
+		return 0;
 	while (namelen--) {
 		char ch = *name++;
 		if (! (ch == '-' || ch == '.' || ch == '_' ||
 		       ('0' <= ch && ch <= '9') ||
 		       ('a' <= ch && ch <= 'z') ||
 		       ('A' <= ch && ch <= 'Z')) )
-			return -1;
+			return 0;
 	}
-	return 0;
+	return 1;
+}
+
+static void report_invalid_attr(const char *name, size_t len,
+				const char *src, int lineno)
+{
+	struct strbuf err = STRBUF_INIT;
+	strbuf_addf(&err, _("%.*s is not a valid attribute name"),
+		    (int) len, name);
+	fprintf(stderr, "%s: %s:%d\n", err.buf, src, lineno);
+	strbuf_release(&err);
 }
 
 static struct git_attr *git_attr_internal(const char *name, int len)
@@ -105,7 +115,7 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 			return a;
 	}
 
-	if (invalid_attr_name(name, len))
+	if (!attr_name_valid(name, len))
 		return NULL;
 
 	FLEX_ALLOC_MEM(a, name, name, len);
@@ -196,17 +206,15 @@ static const char *parse_attr(const char *src, int lineno, const char *cp,
 			cp++;
 			len--;
 		}
-		if (invalid_attr_name(cp, len)) {
-			fprintf(stderr,
-				"%.*s is not a valid attribute name: %s:%d\n",
-				len, cp, src, lineno);
+		if (!attr_name_valid(cp, len)) {
+			report_invalid_attr(cp, len, src, lineno);
 			return NULL;
 		}
 	} else {
 		/*
 		 * As this function is always called twice, once with
 		 * e == NULL in the first pass and then e != NULL in
-		 * the second pass, no need for invalid_attr_name()
+		 * the second pass, no need for attr_name_valid()
 		 * check here.
 		 */
 		if (*cp == '-' || *cp == '!') {
@@ -258,10 +266,8 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		name += strlen(ATTRIBUTE_MACRO_PREFIX);
 		name += strspn(name, blank);
 		namelen = strcspn(name, blank);
-		if (invalid_attr_name(name, namelen)) {
-			fprintf(stderr,
-				"%.*s is not a valid attribute name: %s:%d\n",
-				namelen, name, src, lineno);
+		if (!attr_name_valid(name, namelen)) {
+			report_invalid_attr(name, namelen, src, lineno);
 			goto fail_return;
 		}
 	}
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v2 21/27] attr: use hashmap for attribute dictionary
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (19 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 20/27] attr: change validity check for attribute names to use positive logic Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 22/27] attr: eliminate global check_all_attr array Brandon Williams
                     ` (6 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

The current implementation of the attribute dictionary uses a custom
hashtable.  Modernize the dictionary by converting it to the builtin
'hashmap' structure.

Also, to enable a threaded API in the future, add an accompanying
mutex which must be acquired prior to accessing the dictionary of
interned attributes.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
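The locking discipline used by git_attr_internal() below is "lookup and
insert inside one critical section".  A much reduced sketch of that pattern
follows.  It deliberately swaps the hashmap for a linked list and uses a
static mutex initializer instead of an attr_start() call, so that it stands
alone; every sketch_* name is hypothetical and none of this is the code in
the patch.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct sketch_attr {
	struct sketch_attr *next;
	int attr_nr;		/* unique, assigned in order of first use */
	char *name;
};

static struct sketch_attr *sketch_dict;	/* singly linked list of entries */
static int sketch_dict_nr;
static pthread_mutex_t sketch_dict_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Return the unique entry for 'name', creating it on first use. */
static struct sketch_attr *sketch_intern(const char *name)
{
	struct sketch_attr *a;

	pthread_mutex_lock(&sketch_dict_mutex);

	for (a = sketch_dict; a; a = a->next)
		if (!strcmp(a->name, name))
			break;

	if (!a) {
		size_t len = strlen(name) + 1;

		a = malloc(sizeof(*a));
		if (!a)
			abort();
		a->name = malloc(len);
		if (!a->name)
			abort();
		memcpy(a->name, name, len);
		a->attr_nr = sketch_dict_nr++;
		a->next = sketch_dict;
		sketch_dict = a;
	}

	pthread_mutex_unlock(&sketch_dict_mutex);
	return a;
}
```

Because the lookup and the insertion share one lock acquisition, two threads
interning the same name at the same time cannot each create an entry, and
the attribute numbers stay dense and unique.
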
 attr.c        | 173 +++++++++++++++++++++++++++++++++++++++++++---------------
 attr.h        |   2 +
 common-main.c |   3 +
 3 files changed, 133 insertions(+), 45 deletions(-)

diff --git a/attr.c b/attr.c
index 5399e1cb3..d2ece4eba 100644
--- a/attr.c
+++ b/attr.c
@@ -14,6 +14,7 @@
 #include "dir.h"
 #include "utf8.h"
 #include "quote.h"
+#include "thread-utils.h"
 
 const char git_attr__true[] = "(builtin)true";
 const char git_attr__false[] = "\0(builtin)false";
@@ -23,28 +24,17 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 #define ATTR__UNSET NULL
 #define ATTR__UNKNOWN git_attr__unknown
 
-/* This is a randomly chosen prime. */
-#define HASHSIZE 257
-
 #ifndef DEBUG_ATTR
 #define DEBUG_ATTR 0
 #endif
 
-/*
- * NEEDSWORK: the global dictionary of the interned attributes
- * must stay a singleton even after we become thread-ready.
- * Access to these must be surrounded with mutex when it happens.
- */
 struct git_attr {
-	struct git_attr *next;
-	unsigned h;
-	int attr_nr;
+	int attr_nr; /* unique attribute number */
 	int maybe_macro;
 	int maybe_real;
-	char name[FLEX_ARRAY];
+	char name[FLEX_ARRAY]; /* attribute name */
 };
 static int attr_nr;
-static struct git_attr *(git_attr_hash[HASHSIZE]);
 
 /*
  * NEEDSWORK: maybe-real, maybe-macro are not property of
@@ -63,15 +53,94 @@ const char *git_attr_name(const struct git_attr *attr)
 	return attr->name;
 }
 
-static unsigned hash_name(const char *name, int namelen)
+struct attr_hashmap {
+	struct hashmap map;
+#ifndef NO_PTHREADS
+	pthread_mutex_t mutex;
+#endif
+};
+
+static inline void hashmap_lock(struct attr_hashmap *map)
+{
+#ifndef NO_PTHREADS
+	pthread_mutex_lock(&map->mutex);
+#endif
+}
+
+static inline void hashmap_unlock(struct attr_hashmap *map)
 {
-	unsigned val = 0, c;
+#ifndef NO_PTHREADS
+	pthread_mutex_unlock(&map->mutex);
+#endif
+}
 
-	while (namelen--) {
-		c = *name++;
-		val = ((val << 7) | (val >> 22)) ^ c;
-	}
-	return val;
+/*
+ * The global dictionary of all interned attributes.  This
+ * is a singleton object which is shared between threads.
+ * Access to this dictionary must be surrounded with a mutex.
+ */
+static struct attr_hashmap g_attr_hashmap;
+
+/* The container for objects stored in "struct attr_hashmap" */
+struct attr_hash_entry {
+	struct hashmap_entry ent; /* must be the first member! */
+	const char *key; /* the key; memory should be owned by value */
+	size_t keylen; /* length of the key */
+	void *value; /* the stored value */
+};
+
+/* attr_hashmap comparison function */
+static int attr_hash_entry_cmp(const struct attr_hash_entry *a,
+			       const struct attr_hash_entry *b,
+			       void *unused)
+{
+	return (a->keylen != b->keylen) || strncmp(a->key, b->key, a->keylen);
+}
+
+/* Initialize an 'attr_hashmap' object */
+static void attr_hashmap_init(struct attr_hashmap *map)
+{
+	hashmap_init(&map->map, (hashmap_cmp_fn) attr_hash_entry_cmp, 0);
+}
+
+/*
+ * Retrieve the 'value' stored in a hashmap given the provided 'key'.
+ * If there is no matching entry, return NULL.
+ */
+static void *attr_hashmap_get(struct attr_hashmap *map,
+			      const char *key, size_t keylen)
+{
+	struct attr_hash_entry k;
+	struct attr_hash_entry *e;
+
+	if (!map->map.tablesize)
+		attr_hashmap_init(map);
+
+	hashmap_entry_init(&k, memhash(key, keylen));
+	k.key = key;
+	k.keylen = keylen;
+	e = hashmap_get(&map->map, &k, NULL);
+
+	return e ? e->value : NULL;
+}
+
+/* Add 'value' to a hashmap based on the provided 'key'. */
+static void attr_hashmap_add(struct attr_hashmap *map,
+			     const char *key, size_t keylen,
+			     void *value)
+{
+	struct attr_hash_entry *e;
+
+	if (!map->map.tablesize)
+		attr_hashmap_init(map);
+
+	e = xmalloc(sizeof(struct attr_hash_entry));
+	hashmap_entry_init(e, memhash(key, keylen));
+	e->key = key;
+	e->keylen = keylen;
+	e->value = value;
+
+	hashmap_add(&map->map, e);
 }
 
 static int attr_name_valid(const char *name, size_t namelen)
@@ -103,37 +172,44 @@ static void report_invalid_attr(const char *name, size_t len,
 	strbuf_release(&err);
 }
 
-static struct git_attr *git_attr_internal(const char *name, int len)
+/*
+ * Given a 'name', lookup and return the corresponding attribute in the global
+ * dictionary.  If no entry is found, create a new attribute and store it in
+ * the dictionary.
+ */
+static struct git_attr *git_attr_internal(const char *name, int namelen)
 {
-	unsigned hval = hash_name(name, len);
-	unsigned pos = hval % HASHSIZE;
 	struct git_attr *a;
 
-	for (a = git_attr_hash[pos]; a; a = a->next) {
-		if (a->h == hval &&
-		    !memcmp(a->name, name, len) && !a->name[len])
-			return a;
-	}
-
-	if (!attr_name_valid(name, len))
+	if (!attr_name_valid(name, namelen))
 		return NULL;
 
-	FLEX_ALLOC_MEM(a, name, name, len);
-	a->h = hval;
-	a->next = git_attr_hash[pos];
-	a->attr_nr = attr_nr++;
-	a->maybe_macro = 0;
-	a->maybe_real = 0;
-	git_attr_hash[pos] = a;
+	hashmap_lock(&g_attr_hashmap);
+
+	a = attr_hashmap_get(&g_attr_hashmap, name, namelen);
+
+	if (!a) {
+		FLEX_ALLOC_MEM(a, name, name, namelen);
+		a->attr_nr = g_attr_hashmap.map.size;
+		a->maybe_real = 0;
+		a->maybe_macro = 0;
+
+		attr_hashmap_add(&g_attr_hashmap, a->name, namelen, a);
+		assert(a->attr_nr == (g_attr_hashmap.map.size - 1));
+
+		/*
+		 * NEEDSWORK: per git_attr_check check_all_attr
+		 * will be initialized a lot more lazily, not
+		 * like this, and not here.
+		 */
+		REALLOC_ARRAY(check_all_attr, ++attr_nr);
+		check_all_attr[a->attr_nr].attr = a;
+		check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
+		assert(a->attr_nr == (attr_nr - 1));
+	}
+
+	hashmap_unlock(&g_attr_hashmap);
 
-	/*
-	 * NEEDSWORK: per git_attr_check check_all_attr
-	 * will be initialized a lot more lazily, not
-	 * like this, and not here.
-	 */
-	REALLOC_ARRAY(check_all_attr, attr_nr);
-	check_all_attr[a->attr_nr].attr = a;
-	check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
 	return a;
 }
 
@@ -941,3 +1017,10 @@ void git_attr_set_direction(enum git_attr_direction new, struct index_state *ist
 		drop_attr_stack();
 	use_index = istate;
 }
+
+void attr_start(void)
+{
+#ifndef NO_PTHREADS
+	pthread_mutex_init(&g_attr_hashmap.mutex, NULL);
+#endif
+}
diff --git a/attr.h b/attr.h
index 3db9893ef..8505bca79 100644
--- a/attr.h
+++ b/attr.h
@@ -67,4 +67,6 @@ enum git_attr_direction {
 };
 void git_attr_set_direction(enum git_attr_direction, struct index_state *);
 
+extern void attr_start(void);
+
 #endif /* ATTR_H */
diff --git a/common-main.c b/common-main.c
index c654f9555..6a689007e 100644
--- a/common-main.c
+++ b/common-main.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "exec_cmd.h"
+#include "attr.h"
 
 /*
  * Many parts of Git have subprograms communicate via pipe, expect the
@@ -33,6 +34,8 @@ int main(int argc, const char **argv)
 
 	git_setup_gettext();
 
+	attr_start();
+
 	git_extract_argv0_path(argv[0]);
 
 	restore_sigpipe_to_default();
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v2 22/27] attr: eliminate global check_all_attr array
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (20 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 21/27] attr: use hashmap for attribute dictionary Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 21:11     ` Junio C Hamano
  2017-01-23 20:35   ` [PATCH v2 23/27] attr: remove maybe-real, maybe-macro from git_attr Brandon Williams
                     ` (5 subsequent siblings)
  27 siblings, 1 reply; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

The attribute system currently relies on 'check_all_attr', a global
array of 'attr_check_item' entries that stores the value of each
attribute during the collection process.

This patch eliminates this global and instead creates an array per
'attr_check' instance which is then used in the attribute collection
process.  This brings the attribute system one step closer to being
thread-safe.
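
As a rough illustration of the idea (a sketch only, not code from this
patch; every demo_* name below is hypothetical), each check instance can
own its array, grow it lazily to the current dictionary size, and reset
the values before each collection:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for attr_check_item / attr_check. */
struct demo_item {
	int attr_nr;
	const char *value;
};

struct demo_check {
	int all_nr;
	struct demo_item *all;
};

static const char demo_unknown[] = "(unknown)";

/*
 * Grow check->all so it covers 'dict_size' attributes, then reset
 * every value.  'dict_size' stands in for the size of the global
 * attribute dictionary, which only ever grows.
 */
static void demo_all_init(struct demo_check *check, int dict_size)
{
	int i;

	assert(dict_size >= check->all_nr);

	if (dict_size != check->all_nr) {
		struct demo_item *p;

		p = realloc(check->all, dict_size * sizeof(*p));
		if (!p)
			abort();
		check->all = p;
		/* number only the newly added slots */
		for (i = check->all_nr; i < dict_size; i++)
			check->all[i].attr_nr = i;
		check->all_nr = dict_size;
	}

	/* per-instance state, so no lock is needed for the reset */
	for (i = 0; i < check->all_nr; i++)
		check->all[i].value = demo_unknown;
}
```

Two demo_check instances never share the 'all' array, so a collection
running on one cannot clobber values being read from the other.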

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 114 +++++++++++++++++++++++++++++++++++++++++++----------------------
 attr.h |   2 ++
 2 files changed, 78 insertions(+), 38 deletions(-)

diff --git a/attr.c b/attr.c
index d2ece4eba..6d2468790 100644
--- a/attr.c
+++ b/attr.c
@@ -34,7 +34,6 @@ struct git_attr {
 	int maybe_real;
 	char name[FLEX_ARRAY]; /* attribute name */
 };
-static int attr_nr;
 
 /*
  * NEEDSWORK: maybe-real, maybe-macro are not property of
@@ -45,9 +44,6 @@ static int attr_nr;
  */
 static int cannot_trust_maybe_real;
 
-/* NEEDSWORK: This will become per git_attr_check */
-static struct attr_check_item *check_all_attr;
-
 const char *git_attr_name(const struct git_attr *attr)
 {
 	return attr->name;
@@ -143,6 +139,52 @@ static void attr_hashmap_add(struct attr_hashmap *map,
 	hashmap_add(&map->map, e);
 }
 
+/*
+ * Reallocate and reinitialize the array of all attributes (which is used in
+ * the attribute collection process) in 'check' based on the global dictionary
+ * of attributes.
+ */
+static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check)
+{
+	int i;
+
+	hashmap_lock(map);
+
+	if (map->map.size < check->all_attrs_nr)
+		die("BUG: interned attributes shouldn't be deleted");
+
+	/*
+	 * If the number of attributes in the global dictionary has increased
+	 * (or this attr_check instance doesn't have an initialized all_attrs
+	 * field), reallocate the provided attr_check instance's all_attrs
+	 * field and fill each entry with its corresponding git_attr.
+	 */
+	if (map->map.size != check->all_attrs_nr) {
+		struct attr_hash_entry *e;
+		struct hashmap_iter iter;
+		hashmap_iter_init(&map->map, &iter);
+
+		REALLOC_ARRAY(check->all_attrs, map->map.size);
+		check->all_attrs_nr = map->map.size;
+
+		while ((e = hashmap_iter_next(&iter))) {
+			const struct git_attr *a = e->value;
+			check->all_attrs[a->attr_nr].attr = a;
+		}
+	}
+
+	hashmap_unlock(map);
+
+	/*
+	 * Re-initialize every entry in check->all_attrs.
+	 * This re-initialization can live outside of the locked region since
+	 * the attribute dictionary is no longer being accessed.
+	 */
+	for (i = 0; i < check->all_attrs_nr; i++) {
+		check->all_attrs[i].value = ATTR__UNKNOWN;
+	}
+}
+
 static int attr_name_valid(const char *name, size_t namelen)
 {
 	/*
@@ -196,16 +238,6 @@ static struct git_attr *git_attr_internal(const char *name, int namelen)
 
 		attr_hashmap_add(&g_attr_hashmap, a->name, namelen, a);
 		assert(a->attr_nr == (g_attr_hashmap.map.size - 1));
-
-		/*
-		 * NEEDSWORK: per git_attr_check check_all_attr
-		 * will be initialized a lot more lazily, not
-		 * like this, and not here.
-		 */
-		REALLOC_ARRAY(check_all_attr, ++attr_nr);
-		check_all_attr[a->attr_nr].attr = a;
-		check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
-		assert(a->attr_nr == (attr_nr - 1));
 	}
 
 	hashmap_unlock(&g_attr_hashmap);
@@ -791,16 +823,16 @@ static int path_matches(const char *pathname, int pathlen,
 			      pattern, prefix, pat->patternlen, pat->flags);
 }
 
-static int macroexpand_one(int attr_nr, int rem);
+static int macroexpand_one(struct attr_check_item *all_attrs, int nr, int rem);
 
-static int fill_one(const char *what, struct match_attr *a, int rem)
+static int fill_one(const char *what, struct attr_check_item *all_attrs,
+		    struct match_attr *a, int rem)
 {
-	struct attr_check_item *check = check_all_attr;
 	int i;
 
-	for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) {
+	for (i = a->num_attr - 1; rem > 0 && i >= 0; i--) {
 		struct git_attr *attr = a->state[i].attr;
-		const char **n = &(check[attr->attr_nr].value);
+		const char **n = &(all_attrs[attr->attr_nr].value);
 		const char *v = a->state[i].setto;
 
 		if (*n == ATTR__UNKNOWN) {
@@ -809,14 +841,15 @@ static int fill_one(const char *what, struct match_attr *a, int rem)
 				  attr, v);
 			*n = v;
 			rem--;
-			rem = macroexpand_one(attr->attr_nr, rem);
+			rem = macroexpand_one(all_attrs, attr->attr_nr, rem);
 		}
 	}
 	return rem;
 }
 
 static int fill(const char *path, int pathlen, int basename_offset,
-		struct attr_stack *stk, int rem)
+		struct attr_stack *stk, struct attr_check_item *all_attrs,
+		int rem)
 {
 	int i;
 	const char *base = stk->origin ? stk->origin : "";
@@ -827,18 +860,18 @@ static int fill(const char *path, int pathlen, int basename_offset,
 			continue;
 		if (path_matches(path, pathlen, basename_offset,
 				 &a->u.pat, base, stk->originlen))
-			rem = fill_one("fill", a, rem);
+			rem = fill_one("fill", all_attrs, a, rem);
 	}
 	return rem;
 }
 
-static int macroexpand_one(int nr, int rem)
+static int macroexpand_one(struct attr_check_item *all_attrs, int nr, int rem)
 {
 	struct attr_stack *stk;
 	int i;
 
-	if (check_all_attr[nr].value != ATTR__TRUE ||
-	    !check_all_attr[nr].attr->maybe_macro)
+	if (all_attrs[nr].value != ATTR__TRUE ||
+	    !all_attrs[nr].attr->maybe_macro)
 		return rem;
 
 	for (stk = attr_stack; stk; stk = stk->prev) {
@@ -847,7 +880,7 @@ static int macroexpand_one(int nr, int rem)
 			if (!ma->is_macro)
 				continue;
 			if (ma->u.attr->attr_nr == nr)
-				return fill_one("expand", ma, rem);
+				return fill_one("expand", all_attrs, ma, rem);
 		}
 	}
 
@@ -855,9 +888,9 @@ static int macroexpand_one(int nr, int rem)
 }
 
 /*
- * Collect attributes for path into the array pointed to by
- * check_all_attr. If num is non-zero, only attributes in check[] are
- * collected. Otherwise all attributes are collected.
+ * Collect attributes for path into the array pointed to by check->all_attrs.
+ * If check->check_nr is non-zero, only attributes in check[] are collected.
+ * Otherwise all attributes are collected.
  */
 static void collect_some_attrs(const char *path, struct attr_check *check)
 {
@@ -880,15 +913,15 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 	}
 
 	prepare_attr_stack(path, dirlen);
-	for (i = 0; i < attr_nr; i++)
-		check_all_attr[i].value = ATTR__UNKNOWN;
+	all_attrs_init(&g_attr_hashmap, check);
+
 	if (check->check_nr && !cannot_trust_maybe_real) {
 		rem = 0;
 		for (i = 0; i < check->check_nr; i++) {
 			const struct git_attr *a = check->check[i].attr;
 			if (!a->maybe_real) {
 				struct attr_check_item *c;
-				c = check_all_attr + a->attr_nr;
+				c = check->all_attrs + a->attr_nr;
 				c->value = ATTR__UNSET;
 				rem++;
 			}
@@ -897,9 +930,9 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 			return;
 	}
 
-	rem = attr_nr;
+	rem = check->all_attrs_nr;
 	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
-		rem = fill(path, pathlen, basename_offset, stk, rem);
+		rem = fill(path, pathlen, basename_offset, stk, check->all_attrs, rem);
 }
 
 int git_check_attr(const char *path, struct attr_check *check)
@@ -909,7 +942,8 @@ int git_check_attr(const char *path, struct attr_check *check)
 	collect_some_attrs(path, check);
 
 	for (i = 0; i < check->check_nr; i++) {
-		const char *value = check_all_attr[check->check[i].attr->attr_nr].value;
+		size_t index = check->check[i].attr->attr_nr;
+		const char *value = check->all_attrs[index].value;
 		if (value == ATTR__UNKNOWN)
 			value = ATTR__UNSET;
 		check->check[i].value = value;
@@ -925,9 +959,9 @@ void git_all_attrs(const char *path, struct attr_check *check)
 	attr_check_reset(check);
 	collect_some_attrs(path, check);
 
-	for (i = 0; i < attr_nr; i++) {
-		const char *name = check_all_attr[i].attr->name;
-		const char *value = check_all_attr[i].value;
+	for (i = 0; i < check->all_attrs_nr; i++) {
+		const char *name = check->all_attrs[i].attr->name;
+		const char *value = check->all_attrs[i].value;
 		struct attr_check_item *item;
 		if (value == ATTR__UNSET || value == ATTR__UNKNOWN)
 			continue;
@@ -997,6 +1031,10 @@ void attr_check_clear(struct attr_check *check)
 	check->check = NULL;
 	check->check_alloc = 0;
 	check->check_nr = 0;
+
+	free(check->all_attrs);
+	check->all_attrs = NULL;
+	check->all_attrs_nr = 0;
 }
 
 void attr_check_free(struct attr_check *check)
diff --git a/attr.h b/attr.h
index 8505bca79..44b21d82c 100644
--- a/attr.h
+++ b/attr.h
@@ -33,6 +33,8 @@ struct attr_check {
 	int check_nr;
 	int check_alloc;
 	struct attr_check_item *check;
+	int all_attrs_nr;
+	struct attr_check_item *all_attrs;
 };
 
 extern struct attr_check *attr_check_alloc(void);
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v2 23/27] attr: remove maybe-real, maybe-macro from git_attr
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (21 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 22/27] attr: eliminate global check_all_attr array Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 24/27] attr: tighten const correctness with git_attr and match_attr Brandon Williams
                     ` (4 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

Whether a git attribute is real or a macro isn't a property of the
attribute itself; it depends on the attribute stack (that is, on which
.gitattributes files were read).

This patch removes the 'maybe_real' and 'maybe_macro' fields in a
git_attr and instead adds the 'macro' field to an attr_check_item.  The
'macro' field indicates (if non-NULL) that a particular attribute is a
macro for the given attribute stack.  It's populated, through a quick
scan of the attribute stack, with the match_attr that corresponds to the
macro's definition.  This way the attribute stack only needs to be
scanned a single time prior to attribute collection instead of each time
a macro needs to be expanded.
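
The single pre-scan can be sketched roughly as follows.  This is an
illustration under simplified assumptions, not the code in this patch:
the demo_* types are hypothetical, and the result is kept in a plain
array indexed by attribute number rather than in attr_check_item.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down match entry and stack frame. */
struct demo_match {
	int is_macro;
	int attr_nr;
};

struct demo_frame {
	const struct demo_frame *prev;
	int nr;
	const struct demo_match *matches;
};

/*
 * Walk the stack once, top frame first and last entry first, and
 * remember the first macro definition seen for each attribute number.
 * 'macros' must have one slot per attribute, all initially NULL.
 */
static void demo_determine_macros(const struct demo_match **macros,
				  const struct demo_frame *stack)
{
	assert(macros != NULL);

	for (; stack; stack = stack->prev) {
		int i;

		for (i = stack->nr - 1; i >= 0; i--) {
			const struct demo_match *m = &stack->matches[i];

			if (m->is_macro && !macros[m->attr_nr])
				macros[m->attr_nr] = m;
		}
	}
}
```

Expanding a macro later is then a single array lookup; entries that are
still NULL were not defined as macros anywhere in the stack.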

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 69 ++++++++++++++++++++++++++++++------------------------------------
 attr.h |  6 ++++++
 2 files changed, 37 insertions(+), 38 deletions(-)

diff --git a/attr.c b/attr.c
index 6d2468790..ed9ba3756 100644
--- a/attr.c
+++ b/attr.c
@@ -30,20 +30,9 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 
 struct git_attr {
 	int attr_nr; /* unique attribute number */
-	int maybe_macro;
-	int maybe_real;
 	char name[FLEX_ARRAY]; /* attribute name */
 };
 
-/*
- * NEEDSWORK: maybe-real, maybe-macro are not property of
- * an attribute, as it depends on what .gitattributes are
- * read.  Once we introduce per git_attr_check attr_stack
- * and check_all_attr, the optimization based on them will
- * become unnecessary and can go away.  So is this variable.
- */
-static int cannot_trust_maybe_real;
-
 const char *git_attr_name(const struct git_attr *attr)
 {
 	return attr->name;
@@ -182,6 +171,7 @@ static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check)
 	 */
 	for (i = 0; i < check->all_attrs_nr; i++) {
 		check->all_attrs[i].value = ATTR__UNKNOWN;
+		check->all_attrs[i].macro = NULL;
 	}
 }
 
@@ -233,8 +223,6 @@ static struct git_attr *git_attr_internal(const char *name, int namelen)
 	if (!a) {
 		FLEX_ALLOC_MEM(a, name, name, namelen);
 		a->attr_nr = g_attr_hashmap.map.size;
-		a->maybe_real = 0;
-		a->maybe_macro = 0;
 
 		attr_hashmap_add(&g_attr_hashmap, a->name, namelen, a);
 		assert(a->attr_nr == (g_attr_hashmap.map.size - 1));
@@ -397,7 +385,6 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		      (is_macro ? 0 : namelen + 1));
 	if (is_macro) {
 		res->u.attr = git_attr_internal(name, namelen);
-		res->u.attr->maybe_macro = 1;
 	} else {
 		char *p = (char *)&(res->state[num_attr]);
 		memcpy(p, name, namelen);
@@ -418,10 +405,6 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	/* Second pass to fill the attr_states */
 	for (cp = states, i = 0; *cp; i++) {
 		cp = parse_attr(src, lineno, cp, &(res->state[i]));
-		if (!is_macro)
-			res->state[i].attr->maybe_real = 1;
-		if (res->state[i].attr->maybe_macro)
-			cannot_trust_maybe_real = 1;
 	}
 
 	strbuf_release(&pattern);
@@ -826,7 +809,7 @@ static int path_matches(const char *pathname, int pathlen,
 static int macroexpand_one(struct attr_check_item *all_attrs, int nr, int rem);
 
 static int fill_one(const char *what, struct attr_check_item *all_attrs,
-		    struct match_attr *a, int rem)
+		    const struct match_attr *a, int rem)
 {
 	int i;
 
@@ -867,24 +850,34 @@ static int fill(const char *path, int pathlen, int basename_offset,
 
 static int macroexpand_one(struct attr_check_item *all_attrs, int nr, int rem)
 {
-	struct attr_stack *stk;
-	int i;
+	const struct attr_check_item *item = &all_attrs[nr];
 
-	if (all_attrs[nr].value != ATTR__TRUE ||
-	    !all_attrs[nr].attr->maybe_macro)
+	if (item->macro && item->value == ATTR__TRUE)
+		return fill_one("expand", all_attrs, item->macro, rem);
+	else
 		return rem;
+}
 
-	for (stk = attr_stack; stk; stk = stk->prev) {
-		for (i = stk->num_matches - 1; 0 <= i; i--) {
-			struct match_attr *ma = stk->attrs[i];
-			if (!ma->is_macro)
-				continue;
-			if (ma->u.attr->attr_nr == nr)
-				return fill_one("expand", all_attrs, ma, rem);
+/*
+ * Marks the attributes which are macros based on the attribute stack.
+ * This prevents having to search through the attribute stack each time
+ * a macro needs to be expanded during the fill stage.
+ */
+static void determine_macros(struct attr_check_item *all_attrs,
+			     const struct attr_stack *stack)
+{
+	for (; stack; stack = stack->prev) {
+		int i;
+		for (i = stack->num_matches - 1; i >= 0; i--) {
+			const struct match_attr *ma = stack->attrs[i];
+			if (ma->is_macro) {
+				int n = ma->u.attr->attr_nr;
+				if (!all_attrs[n].macro) {
+					all_attrs[n].macro = ma;
+				}
+			}
 		}
 	}
-
-	return rem;
 }
 
 /*
@@ -914,15 +907,15 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 
 	prepare_attr_stack(path, dirlen);
 	all_attrs_init(&g_attr_hashmap, check);
+	determine_macros(check->all_attrs, attr_stack);
 
-	if (check->check_nr && !cannot_trust_maybe_real) {
+	if (check->check_nr) {
 		rem = 0;
 		for (i = 0; i < check->check_nr; i++) {
-			const struct git_attr *a = check->check[i].attr;
-			if (!a->maybe_real) {
-				struct attr_check_item *c;
-				c = check->all_attrs + a->attr_nr;
-				c->value = ATTR__UNSET;
+			int n = check->check[i].attr->attr_nr;
+			struct attr_check_item *item = &check->all_attrs[n];
+			if (item->macro) {
+				item->value = ATTR__UNSET;
 				rem++;
 			}
 		}
diff --git a/attr.h b/attr.h
index 44b21d82c..f40524875 100644
--- a/attr.h
+++ b/attr.h
@@ -27,6 +27,12 @@ extern const char git_attr__false[];
 struct attr_check_item {
 	const struct git_attr *attr;
 	const char *value;
+	/*
+	 * If 'macro' is non-NULL, indicates that 'attr' is a macro based on
+	 * the current attribute stack and contains a pointer to the match_attr
+	 * definition of the macro
+	 */
+	const struct match_attr *macro;
 };
 
 struct attr_check {
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v2 24/27] attr: tighten const correctness with git_attr and match_attr
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (22 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 23/27] attr: remove maybe-real, maybe-macro from git_attr Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 25/27] attr: store attribute stack in attr_check structure Brandon Williams
                     ` (3 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c               | 14 +++++++-------
 attr.h               |  2 +-
 builtin/check-attr.c |  3 ++-
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/attr.c b/attr.c
index ed9ba3756..95456503e 100644
--- a/attr.c
+++ b/attr.c
@@ -209,7 +209,7 @@ static void report_invalid_attr(const char *name, size_t len,
  * dictionary.  If no entry is found, create a new attribute and store it in
  * the dictionary.
  */
-static struct git_attr *git_attr_internal(const char *name, int namelen)
+static const struct git_attr *git_attr_internal(const char *name, int namelen)
 {
 	struct git_attr *a;
 
@@ -233,14 +233,14 @@ static struct git_attr *git_attr_internal(const char *name, int namelen)
 	return a;
 }
 
-struct git_attr *git_attr(const char *name)
+const struct git_attr *git_attr(const char *name)
 {
 	return git_attr_internal(name, strlen(name));
 }
 
 /* What does a matched pattern decide? */
 struct attr_state {
-	struct git_attr *attr;
+	const struct git_attr *attr;
 	const char *setto;
 };
 
@@ -267,7 +267,7 @@ struct pattern {
 struct match_attr {
 	union {
 		struct pattern pat;
-		struct git_attr *attr;
+		const struct git_attr *attr;
 	} u;
 	char is_macro;
 	unsigned num_attr;
@@ -814,7 +814,7 @@ static int fill_one(const char *what, struct attr_check_item *all_attrs,
 	int i;
 
 	for (i = a->num_attr - 1; rem > 0 && i >= 0; i--) {
-		struct git_attr *attr = a->state[i].attr;
+		const struct git_attr *attr = a->state[i].attr;
 		const char **n = &(all_attrs[attr->attr_nr].value);
 		const char *v = a->state[i].setto;
 
@@ -838,7 +838,7 @@ static int fill(const char *path, int pathlen, int basename_offset,
 	const char *base = stk->origin ? stk->origin : "";
 
 	for (i = stk->num_matches - 1; 0 < rem && 0 <= i; i--) {
-		struct match_attr *a = stk->attrs[i];
+		const struct match_attr *a = stk->attrs[i];
 		if (a->is_macro)
 			continue;
 		if (path_matches(path, pathlen, basename_offset,
@@ -988,7 +988,7 @@ struct attr_check *attr_check_initl(const char *one, ...)
 	check->check[0].attr = git_attr(one);
 	va_start(params, one);
 	for (cnt = 1; cnt < check->check_nr; cnt++) {
-		struct git_attr *attr;
+		const struct git_attr *attr;
 		param = va_arg(params, const char *);
 		if (!param)
 			die("BUG: counted %d != ended at %d",
diff --git a/attr.h b/attr.h
index f40524875..9b4dc07d8 100644
--- a/attr.h
+++ b/attr.h
@@ -8,7 +8,7 @@ struct git_attr;
  * Given a string, return the gitattribute object that
  * corresponds to it.
  */
-struct git_attr *git_attr(const char *);
+const struct git_attr *git_attr(const char *);
 
 /* Internal use */
 extern const char git_attr__true[];
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index 3d4704be5..cc6caf7ac 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -166,7 +166,8 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix)
 	check = attr_check_alloc();
 	if (!all_attrs) {
 		for (i = 0; i < cnt; i++) {
-			struct git_attr *a = git_attr(argv[i]);
+			const struct git_attr *a = git_attr(argv[i]);
+
 			if (!a)
 				return error("%s: not a valid attribute name",
 					     argv[i]);
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v2 25/27] attr: store attribute stack in attr_check structure
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (23 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 24/27] attr: tighten const correctness with git_attr and match_attr Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 21:42     ` Junio C Hamano
  2017-01-23 20:35   ` [PATCH v2 26/27] attr: push the bare repo check into read_attr() Brandon Williams
                     ` (2 subsequent siblings)
  27 siblings, 1 reply; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

The last big hurdle towards a thread-safe API for the attribute system
is the reliance on a global attribute stack that is modified during each
call into the attribute system.

This patch removes the global stack and instead stores a stack locally
in each attr_check instance.  This opens up the opportunity for future
optimizations to customize the attribute stack for the attributes that a
particular attr_check struct is interested in.

One caveat with pushing the attribute stack into the attr_check
structure is that the attribute system now needs to keep track of all
active attr_check instances.  Because of the direction mechanism, every
stack needs to be dropped when the direction is switched.  To ensure
correctness when the direction changes, the attribute system iterates
through all active attr_check instances and drops each of their stacks.
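
The bookkeeping described above can be sketched as a mutex-protected
registry.  The following is only an illustration with hypothetical
demo_* names; it assumes pthreads are available and uses a static mutex
initializer where the patch calls pthread_mutex_init() from attr_start().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, minimal stand-ins for attr_stack / attr_check. */
struct demo_frame {
	struct demo_frame *prev;
};

struct demo_check {
	struct demo_frame *stack;
};

/* Registry of every live demo_check; guarded by 'mutex'. */
static struct {
	size_t nr, alloc;
	struct demo_check **checks;
	pthread_mutex_t mutex;
} demo_registry = { 0, 0, NULL, PTHREAD_MUTEX_INITIALIZER };

static void demo_registry_add(struct demo_check *c)
{
	pthread_mutex_lock(&demo_registry.mutex);
	if (demo_registry.nr == demo_registry.alloc) {
		size_t n = demo_registry.alloc ? 2 * demo_registry.alloc : 4;
		struct demo_check **p;

		p = realloc(demo_registry.checks, n * sizeof(*p));
		if (!p)
			abort();
		demo_registry.checks = p;
		demo_registry.alloc = n;
	}
	demo_registry.checks[demo_registry.nr++] = c;
	pthread_mutex_unlock(&demo_registry.mutex);
}

static void demo_registry_remove(struct demo_check *c)
{
	size_t i;

	pthread_mutex_lock(&demo_registry.mutex);
	for (i = 0; i < demo_registry.nr; i++)
		if (demo_registry.checks[i] == c)
			break;
	assert(i < demo_registry.nr); /* must have been registered */

	/* shift the remaining entries down by one */
	for (; i + 1 < demo_registry.nr; i++)
		demo_registry.checks[i] = demo_registry.checks[i + 1];
	demo_registry.nr--;
	pthread_mutex_unlock(&demo_registry.mutex);
}

/* Free the stack of every registered instance, e.g. on a direction change. */
static void demo_drop_all_stacks(void)
{
	size_t i;

	pthread_mutex_lock(&demo_registry.mutex);
	for (i = 0; i < demo_registry.nr; i++) {
		struct demo_check *c = demo_registry.checks[i];

		while (c->stack) {
			struct demo_frame *f = c->stack;

			c->stack = f->prev;
			free(f);
		}
	}
	pthread_mutex_unlock(&demo_registry.mutex);
}
```

Note that the mutex in this sketch only protects the registry itself; a
thread that is in the middle of using its own stack while another thread
drops all stacks would still need separate coordination.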

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 277 ++++++++++++++++++++++++++++++++++++++++++++---------------------
 attr.h |   3 +
 2 files changed, 193 insertions(+), 87 deletions(-)

diff --git a/attr.c b/attr.c
index 95456503e..d64d1959e 100644
--- a/attr.c
+++ b/attr.c
@@ -434,17 +434,16 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
  * .gitignore file and info/excludes file as a fallback.
  */
 
-/* NEEDSWORK: This will become per git_attr_check */
-static struct attr_stack {
+struct attr_stack {
 	struct attr_stack *prev;
 	char *origin;
 	size_t originlen;
 	unsigned num_matches;
 	unsigned alloc;
 	struct match_attr **attrs;
-} *attr_stack;
+};
 
-static void free_attr_elem(struct attr_stack *e)
+static void attr_stack_free(struct attr_stack *e)
 {
 	int i;
 	free(e->origin);
@@ -467,6 +466,85 @@ static void free_attr_elem(struct attr_stack *e)
 	free(e);
 }
 
+/* List of all attr_check structs; access should be surrounded by mutex */
+static struct check_vector {
+	size_t nr;
+	size_t alloc;
+	struct attr_check **checks;
+#ifndef NO_PTHREADS
+	pthread_mutex_t mutex;
+#endif
+} check_vector;
+
+static inline void vector_lock(void)
+{
+#ifndef NO_PTHREADS
+	pthread_mutex_lock(&check_vector.mutex);
+#endif
+}
+
+static inline void vector_unlock(void)
+{
+#ifndef NO_PTHREADS
+	pthread_mutex_unlock(&check_vector.mutex);
+#endif
+}
+
+static void check_vector_add(struct attr_check *c)
+{
+	vector_lock();
+
+	ALLOC_GROW(check_vector.checks,
+		   check_vector.nr + 1,
+		   check_vector.alloc);
+	check_vector.checks[check_vector.nr++] = c;
+
+	vector_unlock();
+}
+
+static void check_vector_remove(struct attr_check *check)
+{
+	int i;
+
+	vector_lock();
+
+	/* Find entry */
+	for (i = 0; i < check_vector.nr; i++)
+		if (check_vector.checks[i] == check)
+			break;
+
+	if (i >= check_vector.nr)
+		die("BUG: no entry found");
+
+	/* shift entries over */
+	for (; i < check_vector.nr - 1; i++)
+		check_vector.checks[i] = check_vector.checks[i + 1];
+
+	check_vector.nr--;
+
+	vector_unlock();
+}
+
+/* Iterate through all attr_check instances and drop their stacks */
+static void drop_attr_stack(void)
+{
+	int i;
+
+	vector_lock();
+
+	for (i = 0; i < check_vector.nr; i++) {
+		struct attr_stack **stack = &check_vector.checks[i]->stack;
+
+		while (*stack) {
+			struct attr_stack *elem = *stack;
+			*stack = elem->prev;
+			attr_stack_free(elem);
+		}
+	}
+
+	vector_unlock();
+}
+
 static const char *builtin_attr[] = {
 	"[attr]binary -diff -merge -text",
 	NULL,
@@ -621,15 +699,6 @@ static void debug_set(const char *what, const char *match, struct git_attr *attr
 #define debug_set(a,b,c,d) do { ; } while (0)
 #endif /* DEBUG_ATTR */
 
-static void drop_attr_stack(void)
-{
-	while (attr_stack) {
-		struct attr_stack *elem = attr_stack;
-		attr_stack = elem->prev;
-		free_attr_elem(elem);
-	}
-}
-
 static const char *git_etc_gitattributes(void)
 {
 	static const char *system_wide;
@@ -638,6 +707,14 @@ static const char *git_etc_gitattributes(void)
 	return system_wide;
 }
 
+static const char *get_home_gitattributes(void)
+{
+	if (!git_attributes_file)
+		git_attributes_file = xdg_config_home("attributes");
+
+	return git_attributes_file;
+}
+
 static int git_attr_system(void)
 {
 	return !git_env_bool("GIT_ATTR_NOSYSTEM", 0);
@@ -657,47 +734,50 @@ static void push_stack(struct attr_stack **attr_stack_p,
 	}
 }
 
-static void bootstrap_attr_stack(void)
+static void bootstrap_attr_stack(struct attr_stack **stack)
 {
-	struct attr_stack *elem;
+	struct attr_stack *e;
 
-	if (attr_stack)
+	if (*stack)
 		return;
 
-	push_stack(&attr_stack, read_attr_from_array(builtin_attr), NULL, 0);
-
-	if (git_attr_system())
-		push_stack(&attr_stack,
-			   read_attr_from_file(git_etc_gitattributes(), 1),
-			   NULL, 0);
+	/* builtin frame */
+	e = read_attr_from_array(builtin_attr);
+	push_stack(stack, e, NULL, 0);
 
-	if (!git_attributes_file)
-		git_attributes_file = xdg_config_home("attributes");
-	if (git_attributes_file)
-		push_stack(&attr_stack,
-			   read_attr_from_file(git_attributes_file, 1),
-			   NULL, 0);
+	/* system-wide frame */
+	if (git_attr_system()) {
+		e = read_attr_from_file(git_etc_gitattributes(), 1);
+		push_stack(stack, e, NULL, 0);
+	}
 
-	if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
-		elem = read_attr(GITATTRIBUTES_FILE, 1);
-		push_stack(&attr_stack, elem, xstrdup(""), 0);
-		debug_push(elem);
+	/* home directory */
+	if (get_home_gitattributes()) {
+		e = read_attr_from_file(get_home_gitattributes(), 1);
+		push_stack(stack, e, NULL, 0);
 	}
 
-	if (startup_info->have_repository)
-		elem = read_attr_from_file(git_path_info_attributes(), 1);
+	/* root directory */
+	if (!is_bare_repository() || direction == GIT_ATTR_INDEX)
+		e = read_attr(GITATTRIBUTES_FILE, 1);
 	else
-		elem = NULL;
+		e = xcalloc(1, sizeof(struct attr_stack));
+	push_stack(stack, e, xstrdup(""), 0);
 
-	if (!elem)
-		elem = xcalloc(1, sizeof(*elem));
-	push_stack(&attr_stack, elem, NULL, 0);
+	/* info frame */
+	if (startup_info->have_repository)
+		e = read_attr_from_file(git_path_info_attributes(), 1);
+	else
+		e = NULL;
+	if (!e)
+		e = xcalloc(1, sizeof(struct attr_stack));
+	push_stack(stack, e, NULL, 0);
 }
 
-static void prepare_attr_stack(const char *path, int dirlen)
+static void prepare_attr_stack(const char *path, int dirlen,
+			       struct attr_stack **stack)
 {
-	struct attr_stack *elem, *info;
-	const char *cp;
+	struct attr_stack *info;
 
 	/*
 	 * At the bottom of the attribute stack is the built-in
@@ -714,13 +794,13 @@ static void prepare_attr_stack(const char *path, int dirlen)
 	 * .gitattributes in deeper directories to shallower ones,
 	 * and finally use the built-in set as the default.
 	 */
-	bootstrap_attr_stack();
+	bootstrap_attr_stack(stack);
 
 	/*
 	 * Pop the "info" one that is always at the top of the stack.
 	 */
-	info = attr_stack;
-	attr_stack = info->prev;
+	info = *stack;
+	*stack = info->prev;
 
 	/*
 	 * Pop the ones from directories that are not the prefix of
@@ -728,18 +808,19 @@ static void prepare_attr_stack(const char *path, int dirlen)
 	 * the root one (whose origin is an empty string "") or the builtin
 	 * one (whose origin is NULL) without popping it.
 	 */
-	while (attr_stack->origin) {
-		int namelen = strlen(attr_stack->origin);
+	while ((*stack)->origin) {
+		int namelen = (*stack)->originlen;
+		struct attr_stack *elem;
 
-		elem = attr_stack;
+		elem = *stack;
 		if (namelen <= dirlen &&
 		    !strncmp(elem->origin, path, namelen) &&
 		    (!namelen || path[namelen] == '/'))
 			break;
 
 		debug_pop(elem);
-		attr_stack = elem->prev;
-		free_attr_elem(elem);
+		*stack = elem->prev;
+		attr_stack_free(elem);
 	}
 
 	/*
@@ -754,33 +835,43 @@ static void prepare_attr_stack(const char *path, int dirlen)
 		 */
 		struct strbuf pathbuf = STRBUF_INIT;
 
-		assert(attr_stack->origin);
-		while (1) {
-			size_t len = strlen(attr_stack->origin);
+		assert((*stack)->origin);
+		strbuf_addstr(&pathbuf, (*stack)->origin);
+		/* Build up to the directory 'path' is in */
+		while (pathbuf.len < dirlen) {
+			size_t len = pathbuf.len;
+			struct attr_stack *next;
 			char *origin;
 
-			if (dirlen <= len)
-				break;
-			cp = memchr(path + len + 1, '/', dirlen - len - 1);
-			if (!cp)
-				cp = path + dirlen;
-			strbuf_addf(&pathbuf,
-				    "%.*s/%s", (int)(cp - path), path,
-				    GITATTRIBUTES_FILE);
-			elem = read_attr(pathbuf.buf, 0);
-			strbuf_setlen(&pathbuf, cp - path);
-			origin = strbuf_detach(&pathbuf, &len);
-			push_stack(&attr_stack, elem, origin, len);
-			debug_push(elem);
-		}
+			/* Skip path-separator */
+			if (len < dirlen && is_dir_sep(path[len]))
+				len++;
+			/* Find the end of the next component */
+			while (len < dirlen && !is_dir_sep(path[len]))
+				len++;
+
+			if (pathbuf.len > 0)
+				strbuf_addch(&pathbuf, '/');
+			strbuf_add(&pathbuf, path + pathbuf.len,
+				   (len - pathbuf.len));
+			strbuf_addf(&pathbuf, "/%s", GITATTRIBUTES_FILE);
 
+			next = read_attr(pathbuf.buf, 0);
+
+			/* reset the pathbuf to not include "/.gitattributes" */
+			strbuf_setlen(&pathbuf, len);
+
+			origin = xstrdup(pathbuf.buf);
+			push_stack(stack, next, origin, len);
+
+		}
 		strbuf_release(&pathbuf);
 	}
 
 	/*
 	 * Finally push the "info" one at the top of the stack.
 	 */
-	push_stack(&attr_stack, info, NULL, 0);
+	push_stack(stack, info, NULL, 0);
 }
 
 static int path_matches(const char *pathname, int pathlen,
@@ -831,20 +922,23 @@ static int fill_one(const char *what, struct attr_check_item *all_attrs,
 }
 
 static int fill(const char *path, int pathlen, int basename_offset,
-		struct attr_stack *stk, struct attr_check_item *all_attrs,
-		int rem)
+		const struct attr_stack *stack,
+		struct attr_check_item *all_attrs, int rem)
 {
-	int i;
-	const char *base = stk->origin ? stk->origin : "";
-
-	for (i = stk->num_matches - 1; 0 < rem && 0 <= i; i--) {
-		const struct match_attr *a = stk->attrs[i];
-		if (a->is_macro)
-			continue;
-		if (path_matches(path, pathlen, basename_offset,
-				 &a->u.pat, base, stk->originlen))
-			rem = fill_one("fill", all_attrs, a, rem);
+	for (; rem > 0 && stack; stack = stack->prev) {
+		int i;
+		const char *base = stack->origin ? stack->origin : "";
+
+		for (i = stack->num_matches - 1; 0 < rem && 0 <= i; i--) {
+			const struct match_attr *a = stack->attrs[i];
+			if (a->is_macro)
+				continue;
+			if (path_matches(path, pathlen, basename_offset,
+					 &a->u.pat, base, stack->originlen))
+				rem = fill_one("fill", all_attrs, a, rem);
+		}
 	}
+
 	return rem;
 }
 
@@ -887,7 +981,6 @@ static void determine_macros(struct attr_check_item *all_attrs,
  */
 static void collect_some_attrs(const char *path, struct attr_check *check)
 {
-	struct attr_stack *stk;
 	int i, pathlen, rem, dirlen;
 	const char *cp, *last_slash = NULL;
 	int basename_offset;
@@ -905,9 +998,9 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 		dirlen = 0;
 	}
 
-	prepare_attr_stack(path, dirlen);
+	prepare_attr_stack(path, dirlen, &check->stack);
 	all_attrs_init(&g_attr_hashmap, check);
-	determine_macros(check->all_attrs, attr_stack);
+	determine_macros(check->all_attrs, check->stack);
 
 	if (check->check_nr) {
 		rem = 0;
@@ -924,8 +1017,7 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 	}
 
 	rem = check->all_attrs_nr;
-	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
-		rem = fill(path, pathlen, basename_offset, stk, check->all_attrs, rem);
+	fill(path, pathlen, basename_offset, check->stack, check->all_attrs, rem);
 }
 
 int git_check_attr(const char *path, struct attr_check *check)
@@ -965,7 +1057,12 @@ void git_all_attrs(const char *path, struct attr_check *check)
 
 struct attr_check *attr_check_alloc(void)
 {
-	return xcalloc(1, sizeof(struct attr_check));
+	struct attr_check *c = xcalloc(1, sizeof(struct attr_check));
+
+	/* save pointer to the check struct */
+	check_vector_add(c);
+
+	return c;
 }
 
 struct attr_check *attr_check_initl(const char *one, ...)
@@ -1032,8 +1129,13 @@ void attr_check_clear(struct attr_check *check)
 
 void attr_check_free(struct attr_check *check)
 {
-	attr_check_clear(check);
-	free(check);
+	if (check) {
+		/* Remove check from the check vector */
+		check_vector_remove(check);
+
+		attr_check_clear(check);
+		free(check);
+	}
 }
 
 void git_attr_set_direction(enum git_attr_direction new, struct index_state *istate)
@@ -1053,5 +1155,6 @@ void attr_start(void)
 {
 #ifndef NO_PTHREADS
 	pthread_mutex_init(&g_attr_hashmap.mutex, NULL);
+	pthread_mutex_init(&check_vector.mutex, NULL);
 #endif
 }
diff --git a/attr.h b/attr.h
index 9b4dc07d8..da7c3a229 100644
--- a/attr.h
+++ b/attr.h
@@ -4,6 +4,8 @@
 /* An attribute is a pointer to this opaque structure */
 struct git_attr;
 
+struct attr_stack;
+
 /*
  * Given a string, return the gitattribute object that
  * corresponds to it.
@@ -41,6 +43,7 @@ struct attr_check {
 	struct attr_check_item *check;
 	int all_attrs_nr;
 	struct attr_check_item *all_attrs;
+	struct attr_stack *stack;
 };
 
 extern struct attr_check *attr_check_alloc(void);
-- 
2.11.0.483.g087da7b7c-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v2 26/27] attr: push the bare repo check into read_attr()
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (24 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 25/27] attr: store attribute stack in attr_check structure Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-23 20:35   ` [PATCH v2 27/27] attr: reformat git_attr_set_direction() function Brandon Williams
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

Push the bare repository check into the 'read_attr()' function.  This
avoids needing to have extra logic which creates an empty stack frame
when inside a bare repo as a similar bit of logic already exists in the
'read_attr()' function.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 114 +++++++++++++++++++++++++++++++----------------------------------
 1 file changed, 54 insertions(+), 60 deletions(-)

diff --git a/attr.c b/attr.c
index d64d1959e..c2ea5cb29 100644
--- a/attr.c
+++ b/attr.c
@@ -648,25 +648,28 @@ static struct attr_stack *read_attr_from_index(const char *path, int macro_ok)
 
 static struct attr_stack *read_attr(const char *path, int macro_ok)
 {
-	struct attr_stack *res;
+	struct attr_stack *res = NULL;
 
-	if (direction == GIT_ATTR_CHECKOUT) {
+	if (direction == GIT_ATTR_INDEX) {
 		res = read_attr_from_index(path, macro_ok);
-		if (!res)
-			res = read_attr_from_file(path, macro_ok);
-	}
-	else if (direction == GIT_ATTR_CHECKIN) {
-		res = read_attr_from_file(path, macro_ok);
-		if (!res)
-			/*
-			 * There is no checked out .gitattributes file there, but
-			 * we might have it in the index.  We allow operation in a
-			 * sparsely checked out work tree, so read from it.
-			 */
+	} else if (!is_bare_repository()) {
+		if (direction == GIT_ATTR_CHECKOUT) {
 			res = read_attr_from_index(path, macro_ok);
+			if (!res)
+				res = read_attr_from_file(path, macro_ok);
+		} else if (direction == GIT_ATTR_CHECKIN) {
+			res = read_attr_from_file(path, macro_ok);
+			if (!res)
+				/*
+				 * There is no checked out .gitattributes file
+				 * there, but we might have it in the index.
+				 * We allow operation in a sparsely checked out
+				 * work tree, so read from it.
+				 */
+				res = read_attr_from_index(path, macro_ok);
+		}
 	}
-	else
-		res = read_attr_from_index(path, macro_ok);
+
 	if (!res)
 		res = xcalloc(1, sizeof(*res));
 	return res;
@@ -758,10 +761,7 @@ static void bootstrap_attr_stack(struct attr_stack **stack)
 	}
 
 	/* root directory */
-	if (!is_bare_repository() || direction == GIT_ATTR_INDEX)
-		e = read_attr(GITATTRIBUTES_FILE, 1);
-	else
-		e = xcalloc(1, sizeof(struct attr_stack));
+	e = read_attr(GITATTRIBUTES_FILE, 1);
 	push_stack(stack, e, xstrdup(""), 0);
 
 	/* info frame */
@@ -778,6 +778,7 @@ static void prepare_attr_stack(const char *path, int dirlen,
 			       struct attr_stack **stack)
 {
 	struct attr_stack *info;
+	struct strbuf pathbuf = STRBUF_INIT;
 
 	/*
 	 * At the bottom of the attribute stack is the built-in
@@ -824,54 +825,47 @@ static void prepare_attr_stack(const char *path, int dirlen,
 	}
 
 	/*
-	 * Read from parent directories and push them down
+	 * bootstrap_attr_stack() should have added, and the
+	 * above loop should have stopped before popping, the
+	 * root element whose attr_stack->origin is set to an
+	 * empty string.
 	 */
-	if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
-		/*
-		 * bootstrap_attr_stack() should have added, and the
-		 * above loop should have stopped before popping, the
-		 * root element whose attr_stack->origin is set to an
-		 * empty string.
-		 */
-		struct strbuf pathbuf = STRBUF_INIT;
-
-		assert((*stack)->origin);
-		strbuf_addstr(&pathbuf, (*stack)->origin);
-		/* Build up to the directory 'path' is in */
-		while (pathbuf.len < dirlen) {
-			size_t len = pathbuf.len;
-			struct attr_stack *next;
-			char *origin;
-
-			/* Skip path-separator */
-			if (len < dirlen && is_dir_sep(path[len]))
-				len++;
-			/* Find the end of the next component */
-			while (len < dirlen && !is_dir_sep(path[len]))
-				len++;
-
-			if (pathbuf.len > 0)
-				strbuf_addch(&pathbuf, '/');
-			strbuf_add(&pathbuf, path + pathbuf.len,
-				   (len - pathbuf.len));
-			strbuf_addf(&pathbuf, "/%s", GITATTRIBUTES_FILE);
-
-			next = read_attr(pathbuf.buf, 0);
-
-			/* reset the pathbuf to not include "/.gitattributes" */
-			strbuf_setlen(&pathbuf, len);
-
-			origin = xstrdup(pathbuf.buf);
-			push_stack(stack, next, origin, len);
-
-		}
-		strbuf_release(&pathbuf);
+	assert((*stack)->origin);
+
+	strbuf_addstr(&pathbuf, (*stack)->origin);
+	/* Build up to the directory 'path' is in */
+	while (pathbuf.len < dirlen) {
+		size_t len = pathbuf.len;
+		struct attr_stack *next;
+		char *origin;
+
+		/* Skip path-separator */
+		if (len < dirlen && is_dir_sep(path[len]))
+			len++;
+		/* Find the end of the next component */
+		while (len < dirlen && !is_dir_sep(path[len]))
+			len++;
+
+		if (pathbuf.len > 0)
+			strbuf_addch(&pathbuf, '/');
+		strbuf_add(&pathbuf, path + pathbuf.len, (len - pathbuf.len));
+		strbuf_addf(&pathbuf, "/%s", GITATTRIBUTES_FILE);
+
+		next = read_attr(pathbuf.buf, 0);
+
+		/* reset the pathbuf to not include "/.gitattributes" */
+		strbuf_setlen(&pathbuf, len);
+
+		origin = xstrdup(pathbuf.buf);
+		push_stack(stack, next, origin, len);
 	}
 
 	/*
 	 * Finally push the "info" one at the top of the stack.
 	 */
 	push_stack(stack, info, NULL, 0);
+
+	strbuf_release(&pathbuf);
 }
 
 static int path_matches(const char *pathname, int pathlen,
-- 
2.11.0.483.g087da7b7c-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v2 27/27] attr: reformat git_attr_set_direction() function
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (25 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 26/27] attr: push the bare repo check into read_attr() Brandon Williams
@ 2017-01-23 20:35   ` Brandon Williams
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 20:35 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

Move 'git_attr_set_direction()' up to be closer to the variables
that it modifies, and rename the variable 'new' to 'new_direction'
so that it is more descriptive.

Update the comment about how 'direction' is used to read the state of
the world.  It should be noted that callers of
'git_attr_set_direction()' should ensure that other threads are not
making calls into the attribute system until after the call to
'git_attr_set_direction()' completes.  This function essentially acts as
a reset button for the attribute system and should be handled with care.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 49 ++++++++++++++++++++-----------------------------
 attr.h |  3 ++-
 2 files changed, 22 insertions(+), 30 deletions(-)

diff --git a/attr.c b/attr.c
index c2ea5cb29..f35c1107f 100644
--- a/attr.c
+++ b/attr.c
@@ -578,26 +578,30 @@ static struct attr_stack *read_attr_from_array(const char **list)
 }
 
 /*
- * NEEDSWORK: these two are tricky.  The callers assume there is a
- * single, system-wide global state "where we read attributes from?"
- * and when the state is flipped by calling git_attr_set_direction(),
- * attr_stack is discarded so that subsequent attr_check will lazily
- * read from the right place.  And they do not know or care who called
- * by them uses the attribute subsystem, hence have no knowledge of
- * existing git_attr_check instances or future ones that will be
- * created).
- *
- * Probably we need a thread_local that holds these two variables,
- * and a list of git_attr_check instances (which need to be maintained
- * by hooking into git_attr_check_alloc(), git_attr_check_initl(), and
- * git_attr_check_clear().  Then git_attr_set_direction() updates the
- * fields in that thread_local for these two variables, iterate over
- * all the active git_attr_check instances and discard the attr_stack
- * they hold.  Yuck, but it sounds doable.
+ * Callers into the attribute system assume there is a single, system-wide
+ * global state where attributes are read from and when the state is flipped by
+ * calling git_attr_set_direction(), the stack frames that have been
+ * constructed need to be discarded so that subsequent calls into the
+ * attribute system will lazily read from the right place.  Since changing
+ * direction causes a global paradigm shift, it should not ever be called while
+ * another thread could potentially be calling into the attribute system.
  */
 static enum git_attr_direction direction;
 static struct index_state *use_index;
 
+void git_attr_set_direction(enum git_attr_direction new_direction,
+			    struct index_state *istate)
+{
+	if (is_bare_repository() && new_direction != GIT_ATTR_INDEX)
+		die("BUG: non-INDEX attr direction in a bare repo");
+
+	if (new_direction != direction)
+		drop_attr_stack();
+
+	direction = new_direction;
+	use_index = istate;
+}
+
 static struct attr_stack *read_attr_from_file(const char *path, int macro_ok)
 {
 	FILE *fp = fopen(path, "r");
@@ -1132,19 +1136,6 @@ void attr_check_free(struct attr_check *check)
 	}
 }
 
-void git_attr_set_direction(enum git_attr_direction new, struct index_state *istate)
-{
-	enum git_attr_direction old = direction;
-
-	if (is_bare_repository() && new != GIT_ATTR_INDEX)
-		die("BUG: non-INDEX attr direction in a bare repo");
-
-	direction = new;
-	if (new != old)
-		drop_attr_stack();
-	use_index = istate;
-}
-
 void attr_start(void)
 {
 #ifndef NO_PTHREADS
diff --git a/attr.h b/attr.h
index da7c3a229..62dbcb6b8 100644
--- a/attr.h
+++ b/attr.h
@@ -76,7 +76,8 @@ enum git_attr_direction {
 	GIT_ATTR_CHECKOUT,
 	GIT_ATTR_INDEX
 };
-void git_attr_set_direction(enum git_attr_direction, struct index_state *);
+void git_attr_set_direction(enum git_attr_direction new_direction,
+			    struct index_state *istate);
 
 extern void attr_start(void);
 
-- 
2.11.0.483.g087da7b7c-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* Re: [PATCH v2 22/27] attr: eliminate global check_all_attr array
  2017-01-23 20:35   ` [PATCH v2 22/27] attr: eliminate global check_all_attr array Brandon Williams
@ 2017-01-23 21:11     ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2017-01-23 21:11 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, pclouds

Brandon Williams <bmwill@google.com> writes:

> Currently there is a reliance on 'check_all_attr' which is a global
> array of 'attr_check_item' items which is used to store the value of
> each attribute during the collection process.
>
> This patch eliminates this global and instead creates an array per
> 'attr_check' instance which is then used in the attribute collection
> process.  This brings the attribute system one step closer to being
> thread-safe.

Hmph, how close is "closer"?  

My understanding of this is that a codepath that has a single
"attr_check" can be executed simultaneously by multiple threads,
and "attr_check" is meant to contain read-only stuff sharable by
them.  Unless this check_all_attr is tied to the attr_result (which
in turn is tied to each invocation and typically is on stack), the
resulting code would not be safe, right?


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v2 25/27] attr: store attribute stack in attr_check structure
  2017-01-23 20:35   ` [PATCH v2 25/27] attr: store attribute stack in attr_check structure Brandon Williams
@ 2017-01-23 21:42     ` Junio C Hamano
  2017-01-23 22:06       ` Brandon Williams
  0 siblings, 1 reply; 111+ messages in thread
From: Junio C Hamano @ 2017-01-23 21:42 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, pclouds

Brandon Williams <bmwill@google.com> writes:

> The last big hurdle towards a thread-safe API for the attribute system
> is the reliance on a global attribute stack that is modified during each
> call into the attribute system.

The same comment as 22/27 applies here.  

It is not an immediate problem we need to solve in the scope of this
series, in the sense that a Big Subsystem Lock for the attribute
subsystem around git_check_attr() function can make it thread-safe.

But if we want to make it truly threadable without a Big Subsystem
Lock, this and the other one would need to become per-thread at
least.  I think the check_all_attrs scoreboard, which is the topic
of 22/27, should become per git_check_attr() invocation (immediately
before making a call to collect_some_attrs(), prepare an array with
map.size elements and use that as a scoreboard, for example).  I do
not think we can be sure that the "slimmed down attr stack" 15/27
envisions would help performance without benchmarking, but if it
does, then the "attr stack that holds entries that are relevant to
the current query" would have to become per <thread, check> pair, as
two threads may be executing the same codepath looking for the same
set of attributes (i.e. sharing a single attr_check instance), but
working on two different parts of a tree structure.

> This patch removes this global stack and instead a stack is stored
> locally in each attr_check instance.  This opens up the opportunity for
> future optimizations to customize the attribute stack for the attributes
> that a particular attr_check struct is interested in.

This is still true.  But two threads hitting the same attr_check
would make the stack thrash between the paths they are working on to
hurt performance once we go multi-threaded.

Perhaps, provided that the "slimmed down attr stack" is indeed a good
idea, we should keep the global hashmap that holds everything we
read from .gitattributes tree-wide (i.e. as in your v1), _and_
introduce a mechanism to keep the slimmed down version that is
relevant to check[] for each thread somehow.

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v2 25/27] attr: store attribute stack in attr_check structure
  2017-01-23 21:42     ` Junio C Hamano
@ 2017-01-23 22:06       ` Brandon Williams
  2017-01-24  1:11         ` Brandon Williams
  0 siblings, 1 reply; 111+ messages in thread
From: Brandon Williams @ 2017-01-23 22:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, sbeller, pclouds

On 01/23, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > The last big hurdle towards a thread-safe API for the attribute system
> > is the reliance on a global attribute stack that is modified during each
> > call into the attribute system.
> 
> The same comment as 22/27 applies here.  
> 
> It is not an immediate problem we need to solve in the scope of this
> series, in the sense that a Big Subsystem Lock for the attribute
> subsystem around git_check_attr() function can make it thread-safe.
> 
> But if we want to make it truly threadable without a Big Subsystem
> Lock, this and the other one would need to become per-thread at
> least.  I think the check_all_attrs scoreboard, which is the topic
> of 22/27, should become per git_check_attr() invocation (immediately
> before making a call to collect_some_attrs(), prepare an array with
> map.size elements and use that as a scoreboard, for example).  I do
> not think we can be sure that the "slimmed down attr stack" 15/27
> envisions would help performance without benchmarking, but if it
> does, then the "attr stack that holds entries that are relevant to
> the current query" would have to become per <thread, check> pair, as
> two threads may be executing the same codepath looking for the same
> set of attributes (i.e. sharing a single attr_check instance), but
> working on two different parts of a tree structure.
> 
> > This patch removes this global stack and instead a stack is stored
> > locally in each attr_check instance.  This opens up the opportunity for
> > future optimizations to customize the attribute stack for the attributes
> > that a particular attr_check struct is interested in.
> 
> This is still true.  But two threads hitting the same attr_check
> would make the stack thrash between the paths they are working on to
> hurt performance once we go multi-threaded.
> 
> Perhaps, provided if the "slimmed down attr stack" is indeed a good
> idea, we should keep the global hashmap that holds everything we
> read from .gitattributes tree-wide (i.e. as in your v1), _and_
> introduce a mechanism to keep the slimmed down version that is
> relevant to check[] for each thread somehow.

Sounds good, I'll reintroduce the hashmap of stacks that I had in v1
and instead make the all_attrs array that is used in the collection
process allocated at invocation time.  That will cause a bit of
allocation churn but in reality shouldn't make that much of an impact.

As we discussed off-line I'll also do the rework to break up the
question and result.  That way two threads can be executing using the
same attr_check structure.
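
For illustration, a minimal sketch of what "allocated at invocation
time" could look like.  Every name here (demo_rule, demo_collect) is
hypothetical and not part of attr.c; the scoreboard is a plain array
indexed by attribute number, created and freed inside a single call so
that no state is shared between threads:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical rule: "attribute number attr_nr gets this value". */
struct demo_rule {
	int attr_nr;
	const char *value;
};

/*
 * Collect values for the wanted attributes.  The scoreboard lives only
 * for the duration of this call, so concurrent callers never share it.
 * In this sketch the first rule that mentions an attribute wins.
 */
void demo_collect(const struct demo_rule *rules, int rules_nr, int map_size,
		  const int *wanted, int wanted_nr, const char **out)
{
	const char **scoreboard;
	int i;

	assert(map_size > 0);
	scoreboard = malloc((size_t)map_size * sizeof(*scoreboard));
	if (!scoreboard)
		abort();
	for (i = 0; i < map_size; i++)
		scoreboard[i] = NULL;

	for (i = 0; i < rules_nr; i++) {
		int n = rules[i].attr_nr;
		if (0 <= n && n < map_size && !scoreboard[n])
			scoreboard[n] = rules[i].value;
	}

	/* Copy out only what the caller asked about. */
	for (i = 0; i < wanted_nr; i++) {
		int n = wanted[i];
		out[i] = (0 <= n && n < map_size) ? scoreboard[n] : NULL;
	}

	free(scoreboard);
}
```

The cost is one allocation and one free per query, which is the
"allocation churn" mentioned above.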

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v2 25/27] attr: store attribute stack in attr_check structure
  2017-01-23 22:06       ` Brandon Williams
@ 2017-01-24  1:11         ` Brandon Williams
  2017-01-24  2:28           ` Junio C Hamano
  0 siblings, 1 reply; 111+ messages in thread
From: Brandon Williams @ 2017-01-24  1:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, sbeller, pclouds

On 01/23, Brandon Williams wrote:
> As we discussed off-line I'll also do the rework to break up the
> question and result.  That way two threads can be executing using the
> same attr_check structure.

Thinking about this I don't really see what we would gain by breaking
them up.

Right now most callers have a static attr_check struct which holds the
question and answer (and in my series a buffer of all_attrs used during
the collection process).  If this struct is broken up into question and
answer then the only part of it that can be shared with multiple threads
is the question, which ends up being an array with 2 maybe 3 entries on
average.  The result and the array of all_attrs would then need to be
allocated each time calling into the attribute system since they can't
be shared.  Since this allocation is already going to happen wouldn't it
just make sense to drop the static modifier on the check structure (or
have a per-thread check structure) if you really wanted a particular
function thread safe?  It seems like breaking the question and answer up
doesn't buy you much in terms of reducing allocation churn and instead
complicates the API with needing to keep track of two structures instead
of one.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v2 25/27] attr: store attribute stack in attr_check structure
  2017-01-24  1:11         ` Brandon Williams
@ 2017-01-24  2:28           ` Junio C Hamano
  2017-01-25 19:57             ` Brandon Williams
  0 siblings, 1 reply; 111+ messages in thread
From: Junio C Hamano @ 2017-01-24  2:28 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, pclouds

Brandon Williams <bmwill@google.com> writes:

> ... It seems like breaking the question and answer up
> doesn't buy you much in terms of reducing allocation churn and instead
> complicates the API with needing to keep track of two structures instead
> of a one.

In my mind, the value of having a constant check_attr is primarily
that it gives us a stable pointer to serve as a hashmap key,
i.e. the identifier for each call site, in a later iteration.

Of course, in order to populate the "question" array, we'd need the
interning of attribute names to attr objects, which need to be
protected by mutex, and you would probably not want to do that every
time the control hits the codepath.

But all of the above comes from my intuition, and I'll very much
welcome to be proven wrong with an alternative design, or better
yet, a working code based on an alternative design ;-).

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v2 25/27] attr: store attribute stack in attr_check structure
  2017-01-24  2:28           ` Junio C Hamano
@ 2017-01-25 19:57             ` Brandon Williams
  2017-01-25 20:10               ` Stefan Beller
  2017-01-25 20:14               ` Junio C Hamano
  0 siblings, 2 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-25 19:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, sbeller, pclouds

On 01/23, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > ... It seems like breaking the question and answer up
> > doesn't buy you much in terms of reducing allocation churn and instead
> > complicates the API with needing to keep track of two structures instead
> > of a one.
> 
> In my mind, the value of having a constant check_attr is primarily
> that it gives us a stable pointer to serve as a hashmap key,
> i.e. the identifier for each call site, in a later iteration.

We didn't really discuss this notion of having the pointer be a key into
a hashmap.  What sort of information are you envisioning being stored in
this sort of hashmap?  One issue I can see with this is that the
functions which have a static attr_check struct would still not be thread
safe if the initialization of the structure isn't surrounded by a mutex
itself, i.e.:

static struct attr_check *check;

if (!check)
  init(check);

would need to be:

lock()
if (!check)
  init(check);
unlock();

in order to prevent a race to initialize the structure, which is
something that the attr system itself can't be refactored to fix (at
least I can't see how at the moment).

> Of course, in order to populate the "question" array, we'd need the
> interning of attribute names to attr objects, which need to be
> protected by mutex, and you would probably not want to do that every
> time the control hits the codepath.

While true, that doesn't remove the need for a mutex to create/check
that the all_attr array that is used to collect attributes is the
correct size and initialized properly.

> But all of the above comes from my intuition, and I'll very much
> welcome to be proven wrong with an alternative design, or better
> yet, a working code based on an alternative design ;-).

Yeah, after working through the problem the two simple solutions I can
think of are either my v1 or v2 of the series, neither of which allows
for the attr_check structure to be shared.  If we truly want the
"question" array to be const then that can be done, it would just
require a bit more boilerplate and making the all_attr array
local to the check_attrs() function itself.  Such an API would look
like:

static const struct attr_check *check;
struct attr_result result;

if (!check)
  init_check(check);

// Result struct needs to be initialized based on the size of check
init_result(&result, check);

check_attrs(path, check, &result);

// use result

attr_result_clear(&result);

>

It still doesn't handle an initialization race on the check structure
but the check pointer would be const and could be used for some future
optimization.  It also will have a bit more allocation churn than either
v1 or v2 of the series.  If this is the route you want to take I'll get
working on it, I just want to make sure we're on the same page before
doing a larger refactor like this.

Thanks for the guidance on this, someday we'll get this right :)

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v2 25/27] attr: store attribute stack in attr_check structure
  2017-01-25 19:57             ` Brandon Williams
@ 2017-01-25 20:10               ` Stefan Beller
  2017-01-25 20:14               ` Junio C Hamano
  1 sibling, 0 replies; 111+ messages in thread
From: Stefan Beller @ 2017-01-25 20:10 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Junio C Hamano, git, Duy Nguyen

On Wed, Jan 25, 2017 at 11:57 AM, Brandon Williams <bmwill@google.com> wrote:
> On 01/23, Junio C Hamano wrote:
>> Brandon Williams <bmwill@google.com> writes:
>>
>> > ... It seems like breaking the question and answer up
>> > doesn't buy you much in terms of reducing allocation churn and instead
>> > complicates the API with needing to keep track of two structures instead
>> > of a one.
>>
>> In my mind, the value of having a constant check_attr is primarily
>> that it gives us a stable pointer to serve as a hashmap key,
>> i.e. the identifier for each call site, in a later iteration.
>
> We didn't really discuss this notion of having the pointer be a key into
> a hashmap, what sort of information are you envisioning being stored in
> this sort of hashmap?  One issue I can see with this is that the
> functions which have a static attr_check struct would still not be thread
> safe if the initialization of the structure isn't surrounded by a mutex
> itself. ie
>
> static struct attr_check *check;
>
> if (!check)
>   init(check);
>
> would need to be:
>
> lock()
> if (!check)
>   init(check);
> unlock();
>
> inorder to prevent a race to initialize the structure.  Which is
> something that the attr system itself can't be refactored to fix (at
> least I can't see how at the moment).

By passing the check pointer into the attr system (using a double pointer):

    void git_attr_check_initl(struct git_attr_check **out,
                              const char *one, ...)
    {
        // get the global lock, as construction of new check structs
        // is not expected to produce contention

        // parse the list of things & construct the thing

        *out = /* I made a thing */;
        // unlock globally
    }

>
>> Of course, in order to populate the "question" array, we'd need the
>> interning of attribute names to attr objects, which need to be
>> protected by mutex, and you would probably not want to do that every
>> time the control hits the codepath.
>
> While true that doesn't prevent the mutex needed to create/check that
> the all_attr array that is used to collect attributes is the correct
> size/initialized properly.
>
>> But all of the above comes from my intuition, and I'll very much
>> welcome to be proven wrong with an alternative design, or better
>> yet, a working code based on an alternative design ;-).
>
> Yeah, after working through the problem the two simple solutions I can
> think of are either my v1 or v2 of the series, neither of which allows
> for the attr_check structure to be shared.  If we truly want the
> "question" array to be const then that can be done, it would just
> require a bit more boilerplate and making the all_attr array to be
> local to the check_attrs() function itself.  An API like this would look
> like:
>
> static const struct attr_check *check;
> struct attr_result result;
>
> if (!check)
>   init_check(check);
>
> // Result struct needs to be initialized based on the size of check
> init_result(&result, check);

Behind the scenes we may have a pool that caches result allocations,
so that we avoid memory allocation churn here.


>
> check_attrs(path, check, &result);
>
> // use result
>
> attr_result_clear(&result);

Instead of clearing here, we'd give it back to the pool, which then can keep
parts of the result intact.
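
As a rough sketch of the pooling idea, assuming a simple mutex-protected
free list: the names demo_result, demo_result_get() and
demo_result_put() are made up for this example and do not exist in git.
A buffer handed back to the pool keeps its allocation, so the next
caller can reuse it without calling malloc again:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical result buffer; not git's actual structure. */
struct demo_result {
	size_t alloc;			/* capacity of values[] */
	const char **values;
	struct demo_result *next;	/* free-list link */
};

static pthread_mutex_t demo_pool_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct demo_result *demo_pool;

/* Take a buffer with room for at least nr values, reusing one if possible. */
struct demo_result *demo_result_get(size_t nr)
{
	struct demo_result *r;
	size_t i;

	pthread_mutex_lock(&demo_pool_mutex);
	r = demo_pool;
	if (r)
		demo_pool = r->next;
	pthread_mutex_unlock(&demo_pool_mutex);

	if (!r) {
		r = malloc(sizeof(*r));
		if (!r)
			abort();
		r->alloc = 0;
		r->values = NULL;
	}
	if (r->alloc < nr) {
		const char **v = realloc(r->values, nr * sizeof(*v));
		if (!v)
			abort();
		r->values = v;
		r->alloc = nr;
	}
	/* Hand out a clean buffer: no stale answers from the last user. */
	for (i = 0; i < r->alloc; i++)
		r->values[i] = NULL;
	r->next = NULL;
	return r;
}

/* Give the buffer back instead of freeing it. */
void demo_result_put(struct demo_result *r)
{
	assert(r);
	pthread_mutex_lock(&demo_pool_mutex);
	r->next = demo_pool;
	demo_pool = r;
	pthread_mutex_unlock(&demo_pool_mutex);
}
```

The pool trades allocation churn for a short critical section on every
get/put; whether that is a net win would need benchmarking.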

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v2 25/27] attr: store attribute stack in attr_check structure
  2017-01-25 19:57             ` Brandon Williams
  2017-01-25 20:10               ` Stefan Beller
@ 2017-01-25 20:14               ` Junio C Hamano
  2017-01-25 21:54                 ` Brandon Williams
  1 sibling, 1 reply; 111+ messages in thread
From: Junio C Hamano @ 2017-01-25 20:14 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, pclouds

Brandon Williams <bmwill@google.com> writes:

>> In my mind, the value of having a constant check_attr is primarily
>> that it gives us a stable pointer to serve as a hashmap key,
>> i.e. the identifier for each call site, in a later iteration.
>
> We didn't really discuss this notion of having the pointer be a key into
> a hashmap, what sort of information are you envisioning being stored in
> this sort of hashmap?

The "entries relevant to this attr_check() call, that is specific to
the <check_attr instance, the thread> tuple" (aka "what used to be
called the global attr_stack") we discussed would be the primary
example.  A thread is likely to be looping in a caller that has many
paths inside a directory, calling a function that has a call to
attr_check() for each path.  Having something that can be used to
identify the check_attr instance in a stable way, even when the
inner function is called and returns many times, would allow us to
populate the "attr stack" just once for the thread when it enters a
directory for the first time (remember, another thread may be
executing the same codepath, checking for paths in a different
directory) and keep using it.  There may be other mechanisms you can
come up with, so I wouldn't say it is the only valid way, but it is
a way.  That is why I said:

>> But all of the above comes from my intuition, and I'll very much
>> welcome to be proven wrong with an alternative design, or better
>> yet, a working code based on an alternative design ;-).

near the end of my message.

> One issue I can see with this is that the
> functions which have a static attr_check struct would still not be thread
> safe if the initialization of the structure isn't surrounded by a mutex
> itself. ie

Yes, that goes without saying.  That is why I suggested that Stefan
not do this:

> static struct attr_check *check;
>
> if (!check)
>   init(check);
>
> would need to be:
>
> lock()
> if (!check)
>   init(check);
> unlock();

but this:

	static struct attr_check *check;
	init(&check);

and hide the lock/unlock gymnastics inside the API.  I thought that
already was in what you inherited from him and started your work
on top of?
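
As a rough illustration of hiding the lock/unlock inside the API, here
is a minimal sketch.  The names attr_check_sketch,
attr_check_init_sketch() and call_site() are hypothetical and are not
taken from the series; a single pthread mutex is assumed only to keep
the example short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real "struct attr_check". */
struct attr_check_sketch {
	int nr;
};

/* One lock guarding first-time initialization (an assumption). */
static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Allocate *check on first use.  The caller passes the address of its
 * static pointer, so the NULL test happens while the lock is held.
 */
static void attr_check_init_sketch(struct attr_check_sketch **check)
{
	assert(check);
	pthread_mutex_lock(&init_mutex);
	if (!*check)
		*check = calloc(1, sizeof(**check));
	pthread_mutex_unlock(&init_mutex);
}

/* A call site in the "init(&check)" style. */
static struct attr_check_sketch *call_site(void)
{
	static struct attr_check_sketch *check;

	attr_check_init_sketch(&check);
	return check;
}
```

With this shape the call site never spells out lock()/unlock(), and
repeated calls to call_site() hand back the same pointer.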


* Re: [PATCH v2 25/27] attr: store attribute stack in attr_check structure
  2017-01-25 20:14               ` Junio C Hamano
@ 2017-01-25 21:54                 ` Brandon Williams
  2017-01-25 23:19                   ` Brandon Williams
  0 siblings, 1 reply; 111+ messages in thread
From: Brandon Williams @ 2017-01-25 21:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, sbeller, pclouds

On 01/25, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> >> In my mind, the value of having a constant check_attr is primarily
> >> that it gives us a stable pointer to serve as a hashmap key,
> >> i.e. the identifier for each call site, in a later iteration.
> >
> > We didn't really discuss this notion of having the pointer be a key into
> > a hashmap, what sort of information are you envisioning being stored in
> > this sort of hashmap?
> 
> The "entries relevant to this attr_check() call, that is specific to
> the <check_attr instance, the thread> tuple" (aka "what used to be
> called the global attr_stack") we discussed would be the primary
> example.  A thread is likely to be looping in a caller that has many
> paths inside a directory, calling a function that has a call to
> attr_check() for each path.  Having something that we can use to
> identify the check_attr instance in a stable way, even when the
> inner function is called and returns many times, would allow us to
> populate the "attr stack" just once for the thread when it enters a
> directory for the first time (remember, another thread may be
> executing the same codepath, checking for paths in a different
> directory) and keep using it.  There may be other mechanisms you can
> come up with, so I wouldn't say it is the only valid way, but it is
> a way.  That is why I said:
> 
> >> But all of the above comes from my intuition, and I'll very much
> >> welcome to be proven wrong with an alternative design, or better
> >> yet, a working code based on an alternative design ;-).
> 
> near the end of my message.
> 
> > One issue I can see with this is that the
> > functions which have a static attr_check struct would still not be thread
> > safe if the initialization of the structure isn't surrounded by a mutex
> > itself. ie
> 
> Yes, that goes without saying.  That is why I suggested that Stefan
> not do this:
> 
> > static struct attr_check *check;
> >
> > if (!check)
> >   init(check);
> >
> > would need to be:
> >
> > lock()
> > if (!check)
> >   init(check);
> > unlock();
> 
> but this:
> 
> 	static struct attr_check *check;
> 	init(&check);
> 
> and hide the lock/unlock gymnastics inside the API.  I thought that
> already was in what you inherited from him and started your work
> on top of?

I essentially built off of the series you had while using Stefan's
patches as inspiration, but I don't believe the kind of mechanism you
are describing existed in Stefan's series.  His series had a single lock
for the entire system, only allowing a single caller to be in it at any
given time.  This definitely isn't ideal, hence why I picked it up.

Implementation aside I want to try and nail down what the purpose of
this refactor is.  There are roughly two notions of being "thread-safe".

1. The first is that the subsystem itself is thread safe, that is
   multiple threads can be executing inside the subsystem without stepping
   on each other's work.

2. The second is that the object itself is thread safe or that multiple
   threads can use the same object.

I thought that the main purpose of this was to achieve (1) since
currently that is not the case.
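
To make the distinction concrete, the two notions could be sketched
roughly as follows.  All names here are made up for illustration and
neither function mirrors the actual attr.c code.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Notion 1: the subsystem is thread safe.  All scratch state lives in
 * the object the caller hands in, so threads that each bring their own
 * object never touch shared data.
 */
struct query_sketch {
	int result;
};

static void subsystem_query(struct query_sketch *q, int input)
{
	assert(q);
	q->result = input * 2;	/* writes only to the caller's object */
}

/*
 * Notion 2: the object is thread safe.  Several threads may pass the
 * same object, so the object carries its own lock.
 */
struct shared_query_sketch {
	pthread_mutex_t lock;
	int result;
};

static int shared_query(struct shared_query_sketch *q, int input)
{
	int ret;

	assert(q);
	pthread_mutex_lock(&q->lock);
	q->result = input * 2;
	ret = q->result;	/* copy out while still holding the lock */
	pthread_mutex_unlock(&q->lock);
	return ret;
}
```

The first form needs no locking at all as long as no two threads share
a query_sketch; the second pays for a lock on every call.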

-- 
Brandon Williams

* Re: [PATCH v2 25/27] attr: store attribute stack in attr_check structure
  2017-01-25 21:54                 ` Brandon Williams
@ 2017-01-25 23:19                   ` Brandon Williams
  0 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-25 23:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, sbeller, pclouds

On 01/25, Brandon Williams wrote:
> On 01/25, Junio C Hamano wrote:
> > Brandon Williams <bmwill@google.com> writes:
> > 
> > >> In my mind, the value of having a constant check_attr is primarily
> > >> that it gives us a stable pointer to serve as a hashmap key,
> > >> i.e. the identifier for each call site, in a later iteration.
> > >
> > > We didn't really discuss this notion of having the pointer be a key into
> > > a hashmap, what sort of information are you envisioning being stored in
> > > this sort of hashmap?
> > 
> > The "entries relevant to this attr_check() call, that is specific to
> > the <check_attr instance, the thread> tuple" (aka "what used to be
> > called the global attr_stack") we discussed would be the primary
> > example.  A thread is likely to be looping in a caller that has many
> > paths inside a directory, calling a function that has a call to
> > attr_check() for each path.  Having something that we can use to
> > identify the check_attr instance in a stable way, even when the
> > inner function is called and returns many times, would allow us to
> > populate the "attr stack" just once for the thread when it enters a
> > directory for the first time (remember, another thread may be
> > executing the same codepath, checking for paths in a different
> > directory) and keep using it.  There may be other mechanisms you can
> > come up with, so I wouldn't say it is the only valid way, but it is
> > a way.  That is why I said:
> > 
> > >> But all of the above comes from my intuition, and I'll very much
> > >> welcome to be proven wrong with an alternative design, or better
> > >> yet, a working code based on an alternative design ;-).
> > 
> > near the end of my message.
> > 
> > > One issue I can see with this is that the
> > > functions which have a static attr_check struct would still not be thread
> > > safe if the initialization of the structure isn't surrounded by a mutex
> > > itself. ie
> > 
> > Yes, that goes without saying.  That is why I suggested that Stefan
> > not do this:
> > 
> > > static struct attr_check *check;
> > >
> > > if (!check)
> > >   init(check);
> > >
> > > would need to be:
> > >
> > > lock()
> > > if (!check)
> > >   init(check);
> > > unlock();
> > 
> > but this:
> > 
> > 	static struct attr_check *check;
> > 	init(&check);
> > 
> > and hide the lock/unlock gymnastics inside the API.  I thought that
> > already was in what you inherited from him and started your work
> > on top of?
> 
> I essentially built off of the series you had while using Stefan's
> patches as inspiration, but I don't believe the kind of mechanism you
> are describing existed in Stefan's series.  His series had a single lock
> for the entire system, only allowing a single caller to be in it at any
> given time.  This definitely isn't ideal, hence why I picked it up.
> 
> Implementation aside I want to try and nail down what the purpose of
> this refactor is.  There are roughly two notions of being "thread-safe".
> 
> 1. The first is that the subsystem itself is thread safe, that is
>    multiple threads can be executing inside the subsystem without stepping
>    on each other's work.
> 
> 2. The second is that the object itself is thread safe or that multiple
>    threads can use the same object.
> 
> I thought that the main purpose of this was to achieve (1) since
> currently that is not the case.

Ok, so I discovered a very good reason why we should do as Stefan
originally did and split the question and answer (beyond the reasoning
for using the reference as a hashkey).

One motivation behind making this API thread-safe is that we can use it
in pathspec code to match against attributes.  This means that a
pathspec structure will contain an attr_check member describing the
attributes that a pathspec item is interested in.  Then the pathspec
structure is passed to match_pathspec() as a const pointer.  To me, when
passing something as 'const' I expect none of the members should change
at all.  The struct should remain exactly in the same form as before I
invoked the function.

Requiring the attr_check structure to be modified in the process of a
check_attrs() call would violate this "contract" when calling
match_pathspec() as the attr_check structure would have modified state.
The compiler wouldn't catch this as the "const" modifier isn't passed on
to the objects that a struct's pointer members point to.
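
In C, "const" on a pointer to a struct makes the members themselves
read-only but says nothing about what a pointer member points at, which
appears to be the situation described here.  A minimal sketch, using
the hypothetical types check_sketch and item_sketch rather than the
real pathspec and attr_check structures:

```c
#include <assert.h>

/* Hypothetical stand-in for an attr_check with internal state. */
struct check_sketch {
	int state;
};

/* Hypothetical stand-in for a pathspec item holding a check. */
struct item_sketch {
	int magic;
	struct check_sketch *check;
};

static void match_sketch(const struct item_sketch *item)
{
	assert(item && item->check);
	/*
	 * "item->magic = 1;" would be rejected by the compiler, because
	 * the member is const through this pointer.  The line below is
	 * accepted, because the pointee is not const.
	 */
	item->check->state++;
}
```

A caller that passes a const pointer therefore still sees
check->state change after the call.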

-- 
Brandon Williams

* [PATCH v3 00/27] Revamp the attribute system; another round
  2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
                     ` (26 preceding siblings ...)
  2017-01-23 20:35   ` [PATCH v2 27/27] attr: reformat git_attr_set_direction() function Brandon Williams
@ 2017-01-28  2:01   ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 01/27] commit.c: use strchrnul() to scan for one line Brandon Williams
                       ` (27 more replies)
  27 siblings, 28 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

Per some of the discussion online and off I locally broke up the question
and answer and I wasn't very thrilled with the outcome for a number of reasons.

1. The API is more complex.  Callers need to have two structures allocated
instead of one, one can be shared read-only while the other can't.  While this
may not be that big of a deal, it was more confusing to me.

2. Performance hit.  The allocation churn with creating/freeing a
scoreboard and the results struct adds up.  It even looks like the
cost of looking up a stack frame in a hashmap isn't very cheap.

  Here are some very rough performance measurements I made on my machine
  on linux.git by: `perf stat -r 50 git grep "asdfghjkl"`

  master: 0.302176063 seconds
  v1:     0.324243806 seconds
  v2:     0.304339636 seconds
  split:  0.349892023 seconds (hashtable of stacks, all_attr scoreboard
                               allocated per git_attr_check() call, split
                               question/answer)

After looking at this, I'm of the opinion that the API in v2 is the best route
to take.  It's a step up from what it is currently (at master) and there isn't a
performance degradation (ok there's a small bit but it seems within the margin
of error).  It also allows for easier adaptation of the API if we wanted to do
a change in the future since the primary functionality remains intact, or to do
optimizations like stack pruning (if we decided to go down that route).

Given the above, v3 is a reroll of the same design as in v2.  This is a good
milestone in improving the attribute system as it achieves the goal of making
the attribute subsystem thread-safe (ie multiple callers can be executing
inside the attribute system at the same time) and will enable a future series
to allow pathspec code to call into the attribute system.

Most of the changes in this revision are cosmetic (variable renames, code
movement, etc) but there was a memory leak that was also fixed.

Brandon Williams (8):
  attr: pass struct attr_check to collect_some_attrs
  attr: use hashmap for attribute dictionary
  attr: eliminate global check_all_attr array
  attr: remove maybe-real, maybe-macro from git_attr
  attr: tighten const correctness with git_attr and match_attr
  attr: store attribute stack in attr_check structure
  attr: push the bare repo check into read_attr()
  attr: reformat git_attr_set_direction() function

Junio C Hamano (17):
  commit.c: use strchrnul() to scan for one line
  attr.c: use strchrnul() to scan for one line
  attr.c: update a stale comment on "struct match_attr"
  attr.c: explain the lack of attr-name syntax check in parse_attr()
  attr.c: complete a sentence in a comment
  attr.c: mark where #if DEBUG ends more clearly
  attr.c: simplify macroexpand_one()
  attr.c: tighten constness around "git_attr" structure
  attr.c: plug small leak in parse_attr_line()
  attr.c: add push_stack() helper
  attr.c: outline the future plans by heavily commenting
  attr: rename function and struct related to checking attributes
  attr: (re)introduce git_check_attr() and struct attr_check
  attr: convert git_all_attrs() to use "struct attr_check"
  attr: convert git_check_attrs() callers to use the new API
  attr: retire git_check_attrs() API
  attr: change validity check for attribute names to use positive logic

Nguyễn Thái Ngọc Duy (1):
  attr: support quoting pathname patterns in C style

Stefan Beller (1):
  Documentation: fix a typo

 Documentation/gitattributes.txt               |  10 +-
 Documentation/technical/api-gitattributes.txt |  86 ++-
 archive.c                                     |  24 +-
 attr.c                                        | 878 ++++++++++++++++++--------
 attr.h                                        |  49 +-
 builtin/check-attr.c                          |  66 +-
 builtin/pack-objects.c                        |  19 +-
 commit.c                                      |   3 +-
 common-main.c                                 |   3 +
 convert.c                                     |  25 +-
 ll-merge.c                                    |  33 +-
 t/t0003-attributes.sh                         |  26 +
 userdiff.c                                    |  19 +-
 ws.c                                          |  19 +-
 14 files changed, 816 insertions(+), 444 deletions(-)

-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v3 01/27] commit.c: use strchrnul() to scan for one line
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 02/27] attr.c: " Brandon Williams
                       ` (26 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
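
Note for readers: strchrnul() is a GNU extension that returns a pointer
to the first occurrence of the character, or to the terminating NUL
when there is none.  A portable sketch of the behaviour this patch
relies on, with the hypothetical names strchrnul_sketch() and
first_line_len():

```c
#include <assert.h>
#include <string.h>

/* Hypothetical portable stand-in for the GNU strchrnul(). */
static const char *strchrnul_sketch(const char *s, int c)
{
	assert(s);
	while (*s && *s != (char)c)
		s++;
	return s;	/* points at the match or at the trailing NUL */
}

/* Length of the first line of buf, not counting the newline. */
static size_t first_line_len(const char *buf)
{
	return (size_t)(strchrnul_sketch(buf, '\n') - buf);
}
```

Because the result is never NULL, the caller can subtract pointers
without a separate "not found" branch.
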
 commit.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/commit.c b/commit.c
index 2cf85158b..0c4ee3de4 100644
--- a/commit.c
+++ b/commit.c
@@ -415,8 +415,7 @@ int find_commit_subject(const char *commit_buffer, const char **subject)
 		p++;
 	if (*p) {
 		p = skip_blank_lines(p + 2);
-		for (eol = p; *eol && *eol != '\n'; eol++)
-			; /* do nothing */
+		eol = strchrnul(p, '\n');
 	} else
 		eol = p;
 
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v3 02/27] attr.c: use strchrnul() to scan for one line
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 01/27] commit.c: use strchrnul() to scan for one line Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 03/27] attr.c: update a stale comment on "struct match_attr" Brandon Williams
                       ` (25 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/attr.c b/attr.c
index 1fcf042b8..04d24334e 100644
--- a/attr.c
+++ b/attr.c
@@ -402,8 +402,8 @@ static struct attr_stack *read_attr_from_index(const char *path, int macro_ok)
 	for (sp = buf; *sp; ) {
 		char *ep;
 		int more;
-		for (ep = sp; *ep && *ep != '\n'; ep++)
-			;
+
+		ep = strchrnul(sp, '\n');
 		more = (*ep == '\n');
 		*ep = '\0';
 		handle_attr_line(res, sp, path, ++lineno, macro_ok);
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v3 03/27] attr.c: update a stale comment on "struct match_attr"
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 01/27] commit.c: use strchrnul() to scan for one line Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 02/27] attr.c: " Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr() Brandon Williams
                       ` (24 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

When 82dce998 (attr: more matching optimizations from .gitignore,
2012-10-15) changed a pointer to a string "*pattern" into an
embedded "struct pattern" in struct match_attr, it forgot to update
the comment that describes the structure.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index 04d24334e..007f1a299 100644
--- a/attr.c
+++ b/attr.c
@@ -131,9 +131,8 @@ struct pattern {
  * If is_macro is true, then u.attr is a pointer to the git_attr being
  * defined.
  *
- * If is_macro is false, then u.pattern points at the filename pattern
- * to which the rule applies.  (The memory pointed to is part of the
- * memory block allocated for the match_attr instance.)
+ * If is_macro is false, then u.pat is the filename pattern to which the
+ * rule applies.
  *
  * In either case, num_attr is the number of attributes affected by
  * this rule, and state is an array listing them.  The attributes are
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v3 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr()
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (2 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 03/27] attr.c: update a stale comment on "struct match_attr" Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 05/27] attr.c: complete a sentence in a comment Brandon Williams
                       ` (23 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/attr.c b/attr.c
index 007f1a299..6b55a57ef 100644
--- a/attr.c
+++ b/attr.c
@@ -183,6 +183,12 @@ static const char *parse_attr(const char *src, int lineno, const char *cp,
 			return NULL;
 		}
 	} else {
+		/*
+		 * As this function is always called twice, once with
+		 * e == NULL in the first pass and then e != NULL in
+		 * the second pass, no need for invalid_attr_name()
+		 * check here.
+		 */
 		if (*cp == '-' || *cp == '!') {
 			e->setto = (*cp == '-') ? ATTR__FALSE : ATTR__UNSET;
 			cp++;
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v3 05/27] attr.c: complete a sentence in a comment
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (3 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr() Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 06/27] attr.c: mark where #if DEBUG ends more clearly Brandon Williams
                       ` (22 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 6b55a57ef..9bdf87a6f 100644
--- a/attr.c
+++ b/attr.c
@@ -300,7 +300,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
  * directory (again, reading the file from top to bottom) down to the
  * current directory, and then scan the list backwards to find the first match.
  * This is exactly the same as what is_excluded() does in dir.c to deal with
- * .gitignore
+ * .gitignore file and info/excludes file as a fallback.
  */
 
 static struct attr_stack {
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v3 06/27] attr.c: mark where #if DEBUG ends more clearly
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (4 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 05/27] attr.c: complete a sentence in a comment Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 07/27] attr.c: simplify macroexpand_one() Brandon Williams
                       ` (21 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 9bdf87a6f..17297fffe 100644
--- a/attr.c
+++ b/attr.c
@@ -469,7 +469,7 @@ static void debug_set(const char *what, const char *match, struct git_attr *attr
 #define debug_push(a) do { ; } while (0)
 #define debug_pop(a) do { ; } while (0)
 #define debug_set(a,b,c,d) do { ; } while (0)
-#endif
+#endif /* DEBUG_ATTR */
 
 static void drop_attr_stack(void)
 {
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v3 07/27] attr.c: simplify macroexpand_one()
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (5 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 06/27] attr.c: mark where #if DEBUG ends more clearly Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 08/27] attr.c: tighten constness around "git_attr" structure Brandon Williams
                       ` (20 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

The double-loop wants to do an early return immediately when one
matching macro is found.  Eliminate the extra variable 'a' used for
that purpose and rewrite the "assign the found item to 'a' to make
it non-NULL and force the loop(s) to terminate" with a direct return
from there.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
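
Note for readers: the shape of the rewrite, reduced to a standalone
sketch.  The frame_sketch type and find_depth() are hypothetical and
only mimic walking a stack of frames from the top, scanning each frame
from its last entry, and returning as soon as a match is seen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stack frame: an array of ids plus a link to the frame below. */
struct frame_sketch {
	const int *ids;
	int nr;
	const struct frame_sketch *prev;
};

/*
 * Return how many frames below the top the first match sits
 * (0 for the top frame), or -1 when nothing matches.
 */
static int find_depth(const struct frame_sketch *top, int wanted)
{
	const struct frame_sketch *f;
	int depth = 0;
	int i;

	for (f = top; f; f = f->prev, depth++) {
		for (i = f->nr - 1; 0 <= i; i--) {
			if (f->ids[i] == wanted)
				return depth;	/* leaves both loops at once */
		}
	}
	return -1;
}
```

No flag variable is needed in either loop condition, which is the
simplification the patch makes.
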
 attr.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/attr.c b/attr.c
index 17297fffe..e42f931b3 100644
--- a/attr.c
+++ b/attr.c
@@ -705,24 +705,21 @@ static int fill(const char *path, int pathlen, int basename_offset,
 static int macroexpand_one(int nr, int rem)
 {
 	struct attr_stack *stk;
-	struct match_attr *a = NULL;
 	int i;
 
 	if (check_all_attr[nr].value != ATTR__TRUE ||
 	    !check_all_attr[nr].attr->maybe_macro)
 		return rem;
 
-	for (stk = attr_stack; !a && stk; stk = stk->prev)
-		for (i = stk->num_matches - 1; !a && 0 <= i; i--) {
+	for (stk = attr_stack; stk; stk = stk->prev) {
+		for (i = stk->num_matches - 1; 0 <= i; i--) {
 			struct match_attr *ma = stk->attrs[i];
 			if (!ma->is_macro)
 				continue;
 			if (ma->u.attr->attr_nr == nr)
-				a = ma;
+				return fill_one("expand", ma, rem);
 		}
-
-	if (a)
-		rem = fill_one("expand", a, rem);
+	}
 
 	return rem;
 }
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v3 08/27] attr.c: tighten constness around "git_attr" structure
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (6 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 07/27] attr.c: simplify macroexpand_one() Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 09/27] attr.c: plug small leak in parse_attr_line() Brandon Williams
                       ` (19 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

It holds an interned string, and git_attr_name() is a way to peek
into it.  Make sure the involved pointer types are pointer-to-const.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 2 +-
 attr.h | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index e42f931b3..f7cf7ae30 100644
--- a/attr.c
+++ b/attr.c
@@ -43,7 +43,7 @@ static int cannot_trust_maybe_real;
 static struct git_attr_check *check_all_attr;
 static struct git_attr *(git_attr_hash[HASHSIZE]);
 
-char *git_attr_name(struct git_attr *attr)
+const char *git_attr_name(const struct git_attr *attr)
 {
 	return attr->name;
 }
diff --git a/attr.h b/attr.h
index 8b08d33af..00d7a662c 100644
--- a/attr.h
+++ b/attr.h
@@ -25,7 +25,7 @@ extern const char git_attr__false[];
  * Unset one is returned as NULL.
  */
 struct git_attr_check {
-	struct git_attr *attr;
+	const struct git_attr *attr;
 	const char *value;
 };
 
@@ -34,7 +34,7 @@ struct git_attr_check {
  * return value is a pointer to a null-delimited string that is part
  * of the internal data structure; it should not be modified or freed.
  */
-char *git_attr_name(struct git_attr *);
+extern const char *git_attr_name(const struct git_attr *);
 
 int git_check_attr(const char *path, int, struct git_attr_check *);
 
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v3 09/27] attr.c: plug small leak in parse_attr_line()
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (7 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 08/27] attr.c: tighten constness around "git_attr" structure Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 10/27] attr: support quoting pathname patterns in C style Brandon Williams
                       ` (18 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

If any error is noticed after the match_attr structure is allocated,
we shouldn't just return NULL from this function.

Add a fail_return label that frees the allocated structure and
returns NULL, and consistently jump there when we want to return
NULL after cleaning up.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
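
Note for readers: the cleanup-label pattern in isolation.  The type
rule_sketch and the function parse_rule_sketch() are hypothetical; the
point is only that every error path, before or after the allocation,
funnels through one label that frees the result.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical parsed rule. */
struct rule_sketch {
	int value;
};

static struct rule_sketch *parse_rule_sketch(const char *name, int value)
{
	struct rule_sketch *res = NULL;

	assert(name);
	if (!*name)
		goto fail_return;	/* res is still NULL here */

	res = calloc(1, sizeof(*res));
	if (!res)
		goto fail_return;

	if (value < 0)
		goto fail_return;	/* error after allocation: must not leak */

	res->value = value;
	return res;

fail_return:
	free(res);	/* free(NULL) is a no-op, so early jumps are safe */
	return NULL;
}
```

Initializing res to NULL is what lets the jumps that happen before the
allocation share the same label.
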
 attr.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/attr.c b/attr.c
index f7cf7ae30..d180c7833 100644
--- a/attr.c
+++ b/attr.c
@@ -223,7 +223,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		if (!macro_ok) {
 			fprintf(stderr, "%s not allowed: %s:%d\n",
 				name, src, lineno);
-			return NULL;
+			goto fail_return;
 		}
 		is_macro = 1;
 		name += strlen(ATTRIBUTE_MACRO_PREFIX);
@@ -233,7 +233,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 			fprintf(stderr,
 				"%.*s is not a valid attribute name: %s:%d\n",
 				namelen, name, src, lineno);
-			return NULL;
+			goto fail_return;
 		}
 	}
 	else
@@ -246,7 +246,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	for (cp = states, num_attr = 0; *cp; num_attr++) {
 		cp = parse_attr(src, lineno, cp, NULL);
 		if (!cp)
-			return NULL;
+			goto fail_return;
 	}
 
 	res = xcalloc(1,
@@ -267,7 +267,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		if (res->u.pat.flags & EXC_FLAG_NEGATIVE) {
 			warning(_("Negative patterns are ignored in git attributes\n"
 				  "Use '\\!' for literal leading exclamation."));
-			return NULL;
+			goto fail_return;
 		}
 	}
 	res->is_macro = is_macro;
@@ -283,6 +283,10 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	}
 
 	return res;
+
+fail_return:
+	free(res);
+	return NULL;
 }
 
 /*
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v3 10/27] attr: support quoting pathname patterns in C style
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (8 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 09/27] attr.c: plug small leak in parse_attr_line() Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 11/27] attr.c: add push_stack() helper Brandon Williams
                       ` (17 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, sbeller, gitster,
	Brandon Williams

From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

Full pattern must be quoted. So 'pat"t"ern attr' will give exactly
'pat"t"ern', not 'pattern'. Also clarify that leading whitespaces are
not part of the pattern and document comment syntax.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/gitattributes.txt |  8 +++++---
 attr.c                          | 15 +++++++++++++--
 t/t0003-attributes.sh           | 26 ++++++++++++++++++++++++++
 3 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index e0b66c122..3173dee7e 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -21,9 +21,11 @@ Each line in `gitattributes` file is of form:
 	pattern	attr1 attr2 ...
 
 That is, a pattern followed by an attributes list,
-separated by whitespaces.  When the pattern matches the
-path in question, the attributes listed on the line are given to
-the path.
+separated by whitespaces. Leading and trailing whitespaces are
+ignored. Lines that begin with '#' are ignored. Patterns
+that begin with a double quote are quoted in C style.
+When the pattern matches the path in question, the attributes
+listed on the line are given to the path.
 
 Each attribute can be in one of these states for a given path:
 
diff --git a/attr.c b/attr.c
index d180c7833..e1c630f79 100644
--- a/attr.c
+++ b/attr.c
@@ -13,6 +13,7 @@
 #include "attr.h"
 #include "dir.h"
 #include "utf8.h"
+#include "quote.h"
 
 const char git_attr__true[] = "(builtin)true";
 const char git_attr__false[] = "\0(builtin)false";
@@ -212,12 +213,21 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	const char *cp, *name, *states;
 	struct match_attr *res = NULL;
 	int is_macro;
+	struct strbuf pattern = STRBUF_INIT;
 
 	cp = line + strspn(line, blank);
 	if (!*cp || *cp == '#')
 		return NULL;
 	name = cp;
-	namelen = strcspn(name, blank);
+
+	if (*cp == '"' && !unquote_c_style(&pattern, name, &states)) {
+		name = pattern.buf;
+		namelen = pattern.len;
+	} else {
+		namelen = strcspn(name, blank);
+		states = name + namelen;
+	}
+
 	if (strlen(ATTRIBUTE_MACRO_PREFIX) < namelen &&
 	    starts_with(name, ATTRIBUTE_MACRO_PREFIX)) {
 		if (!macro_ok) {
@@ -239,7 +249,6 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	else
 		is_macro = 0;
 
-	states = name + namelen;
 	states += strspn(states, blank);
 
 	/* First pass to count the attr_states */
@@ -282,9 +291,11 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 			cannot_trust_maybe_real = 1;
 	}
 
+	strbuf_release(&pattern);
 	return res;
 
 fail_return:
+	strbuf_release(&pattern);
 	free(res);
 	return NULL;
 }
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index f0fbb4255..f19ae4f8c 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -13,10 +13,31 @@ attr_check () {
 	test_line_count = 0 err
 }
 
+attr_check_quote () {
+
+	path="$1"
+	quoted_path="$2"
+	expect="$3"
+
+	git check-attr test -- "$path" >actual &&
+	echo "\"$quoted_path\": test: $expect" >expect &&
+	test_cmp expect actual
+
+}
+
+test_expect_success 'open-quoted pathname' '
+	echo "\"a test=a" >.gitattributes &&
+	test_must_fail attr_check a a
+'
+
+
 test_expect_success 'setup' '
 	mkdir -p a/b/d a/c b &&
 	(
 		echo "[attr]notest !test"
+		echo "\" d \"	test=d"
+		echo " e	test=e"
+		echo " e\"	test=e"
 		echo "f	test=f"
 		echo "a/i test=a/i"
 		echo "onoff test -test"
@@ -69,6 +90,11 @@ test_expect_success 'command line checks' '
 '
 
 test_expect_success 'attribute test' '
+
+	attr_check " d " d &&
+	attr_check e e &&
+	attr_check_quote e\" e\\\" e &&
+
 	attr_check f f &&
 	attr_check a/f f &&
 	attr_check a/c/f f &&
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 11/27] attr.c: add push_stack() helper
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (9 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 10/27] attr: support quoting pathname patterns in C style Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 12/27] Documentation: fix a typo Brandon Williams
                       ` (16 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

There are too many repetitions of the "I have this new attr_stack
element; push it at the top of the stack" sequence.  The new helper function
push_stack() gives us a way to express what is going on at these
places, and as a side effect, halves the number of times we mention
the attr_stack global variable.
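
The shape of the helper can be sketched outside of attr.c as follows.
The names here (struct frame, push_frame) are made up for illustration
and carry fewer fields than the real attr_stack; the point is only
that a NULL element is a no-op, so callers no longer need their own
"if (elem)" guard around each push.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct attr_stack. */
struct frame {
	struct frame *prev;
	const char *origin;
};

/* Push elem on top of *stack_p; a NULL elem is silently ignored. */
static void push_frame(struct frame **stack_p, struct frame *elem,
		       const char *origin)
{
	if (!elem)
		return;
	elem->origin = origin;
	elem->prev = *stack_p;
	*stack_p = elem;
}
```

A caller can then pass the result of a reader that may return NULL
straight to push_frame() without checking it first.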

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 71 +++++++++++++++++++++++++++++++-----------------------------------
 1 file changed, 33 insertions(+), 38 deletions(-)

diff --git a/attr.c b/attr.c
index e1c630f79..8026d68bd 100644
--- a/attr.c
+++ b/attr.c
@@ -510,6 +510,18 @@ static int git_attr_system(void)
 
 static GIT_PATH_FUNC(git_path_info_attributes, INFOATTRIBUTES_FILE)
 
+static void push_stack(struct attr_stack **attr_stack_p,
+		       struct attr_stack *elem, char *origin, size_t originlen)
+{
+	if (elem) {
+		elem->origin = origin;
+		if (origin)
+			elem->originlen = originlen;
+		elem->prev = *attr_stack_p;
+		*attr_stack_p = elem;
+	}
+}
+
 static void bootstrap_attr_stack(void)
 {
 	struct attr_stack *elem;
@@ -517,37 +529,23 @@ static void bootstrap_attr_stack(void)
 	if (attr_stack)
 		return;
 
-	elem = read_attr_from_array(builtin_attr);
-	elem->origin = NULL;
-	elem->prev = attr_stack;
-	attr_stack = elem;
-
-	if (git_attr_system()) {
-		elem = read_attr_from_file(git_etc_gitattributes(), 1);
-		if (elem) {
-			elem->origin = NULL;
-			elem->prev = attr_stack;
-			attr_stack = elem;
-		}
-	}
+	push_stack(&attr_stack, read_attr_from_array(builtin_attr), NULL, 0);
+
+	if (git_attr_system())
+		push_stack(&attr_stack,
+			   read_attr_from_file(git_etc_gitattributes(), 1),
+			   NULL, 0);
 
 	if (!git_attributes_file)
 		git_attributes_file = xdg_config_home("attributes");
-	if (git_attributes_file) {
-		elem = read_attr_from_file(git_attributes_file, 1);
-		if (elem) {
-			elem->origin = NULL;
-			elem->prev = attr_stack;
-			attr_stack = elem;
-		}
-	}
+	if (git_attributes_file)
+		push_stack(&attr_stack,
+			   read_attr_from_file(git_attributes_file, 1),
+			   NULL, 0);
 
 	if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
 		elem = read_attr(GITATTRIBUTES_FILE, 1);
-		elem->origin = xstrdup("");
-		elem->originlen = 0;
-		elem->prev = attr_stack;
-		attr_stack = elem;
+		push_stack(&attr_stack, elem, xstrdup(""), 0);
 		debug_push(elem);
 	}
 
@@ -558,15 +556,12 @@ static void bootstrap_attr_stack(void)
 
 	if (!elem)
 		elem = xcalloc(1, sizeof(*elem));
-	elem->origin = NULL;
-	elem->prev = attr_stack;
-	attr_stack = elem;
+	push_stack(&attr_stack, elem, NULL, 0);
 }
 
 static void prepare_attr_stack(const char *path, int dirlen)
 {
 	struct attr_stack *elem, *info;
-	int len;
 	const char *cp;
 
 	/*
@@ -626,20 +621,21 @@ static void prepare_attr_stack(const char *path, int dirlen)
 
 		assert(attr_stack->origin);
 		while (1) {
-			len = strlen(attr_stack->origin);
+			size_t len = strlen(attr_stack->origin);
+			char *origin;
+
 			if (dirlen <= len)
 				break;
 			cp = memchr(path + len + 1, '/', dirlen - len - 1);
 			if (!cp)
 				cp = path + dirlen;
-			strbuf_add(&pathbuf, path, cp - path);
-			strbuf_addch(&pathbuf, '/');
-			strbuf_addstr(&pathbuf, GITATTRIBUTES_FILE);
+			strbuf_addf(&pathbuf,
+				    "%.*s/%s", (int)(cp - path), path,
+				    GITATTRIBUTES_FILE);
 			elem = read_attr(pathbuf.buf, 0);
 			strbuf_setlen(&pathbuf, cp - path);
-			elem->origin = strbuf_detach(&pathbuf, &elem->originlen);
-			elem->prev = attr_stack;
-			attr_stack = elem;
+			origin = strbuf_detach(&pathbuf, &len);
+			push_stack(&attr_stack, elem, origin, len);
 			debug_push(elem);
 		}
 
@@ -649,8 +645,7 @@ static void prepare_attr_stack(const char *path, int dirlen)
 	/*
 	 * Finally push the "info" one at the top of the stack.
 	 */
-	info->prev = attr_stack;
-	attr_stack = info;
+	push_stack(&attr_stack, info, NULL, 0);
 }
 
 static int path_matches(const char *pathname, int pathlen,
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 12/27] Documentation: fix a typo
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (10 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 11/27] attr.c: add push_stack() helper Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 13/27] attr.c: outline the future plans by heavily commenting Brandon Williams
                       ` (15 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller, gitster, pclouds, Brandon Williams

From: Stefan Beller <sbeller@google.com>

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/gitattributes.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 3173dee7e..a53d093ca 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -88,7 +88,7 @@ is either not set or empty, $HOME/.config/git/attributes is used instead.
 Attributes for all users on a system should be placed in the
 `$(prefix)/etc/gitattributes` file.
 
-Sometimes you would need to override an setting of an attribute
+Sometimes you would need to override a setting of an attribute
 for a path to `Unspecified` state.  This can be done by listing
 the name of the attribute prefixed with an exclamation point `!`.
 
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 13/27] attr.c: outline the future plans by heavily commenting
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (11 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 12/27] Documentation: fix a typo Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 14/27] attr: rename function and struct related to checking attributes Brandon Williams
                       ` (14 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 8026d68bd..50e5ee393 100644
--- a/attr.c
+++ b/attr.c
@@ -30,6 +30,11 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 #define DEBUG_ATTR 0
 #endif
 
+/*
+ * NEEDSWORK: the global dictionary of the interned attributes
+ * must stay a singleton even after we become thread-ready.
+ * Access to these must be surrounded with mutex when it happens.
+ */
 struct git_attr {
 	struct git_attr *next;
 	unsigned h;
@@ -39,10 +44,19 @@ struct git_attr {
 	char name[FLEX_ARRAY];
 };
 static int attr_nr;
+static struct git_attr *(git_attr_hash[HASHSIZE]);
+
+/*
+ * NEEDSWORK: maybe-real, maybe-macro are not property of
+ * an attribute, as it depends on what .gitattributes are
+ * read.  Once we introduce per git_attr_check attr_stack
+ * and check_all_attr, the optimization based on them will
+ * become unnecessary and can go away.  So is this variable.
+ */
 static int cannot_trust_maybe_real;
 
+/* NEEDSWORK: This will become per git_attr_check */
 static struct git_attr_check *check_all_attr;
-static struct git_attr *(git_attr_hash[HASHSIZE]);
 
 const char *git_attr_name(const struct git_attr *attr)
 {
@@ -102,6 +116,11 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 	a->maybe_real = 0;
 	git_attr_hash[pos] = a;
 
+	/*
+	 * NEEDSWORK: per git_attr_check check_all_attr
+	 * will be initialized a lot more lazily, not
+	 * like this, and not here.
+	 */
 	REALLOC_ARRAY(check_all_attr, attr_nr);
 	check_all_attr[a->attr_nr].attr = a;
 	check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
@@ -318,6 +337,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
  * .gitignore file and info/excludes file as a fallback.
  */
 
+/* NEEDSWORK: This will become per git_attr_check */
 static struct attr_stack {
 	struct attr_stack *prev;
 	char *origin;
@@ -382,6 +402,24 @@ static struct attr_stack *read_attr_from_array(const char **list)
 	return res;
 }
 
+/*
+ * NEEDSWORK: these two are tricky.  The callers assume there is a
+ * single, system-wide global state "where we read attributes from?"
+ * and when the state is flipped by calling git_attr_set_direction(),
+ * attr_stack is discarded so that subsequent attr_check will lazily
+ * read from the right place.  And they do not know or care who called
+ * by them uses the attribute subsystem, hence have no knowledge of
+ * existing git_attr_check instances or future ones that will be
+ * created).
+ *
+ * Probably we need a thread_local that holds these two variables,
+ * and a list of git_attr_check instances (which need to be maintained
+ * by hooking into git_attr_check_alloc(), git_attr_check_initl(), and
+ * git_attr_check_clear().  Then git_attr_set_direction() updates the
+ * fields in that thread_local for these two variables, iterate over
+ * all the active git_attr_check instances and discard the attr_stack
+ * they hold.  Yuck, but it sounds doable.
+ */
 static enum git_attr_direction direction;
 static struct index_state *use_index;
 
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 14/27] attr: rename function and struct related to checking attributes
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (12 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 13/27] attr.c: outline the future plans by heavily commenting Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 15/27] attr: (re)introduce git_check_attr() and struct attr_check Brandon Williams
                       ` (13 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

The traditional API to check attributes is to prepare an N-element
array of "struct git_attr_check" and pass N and the array to the
function "git_check_attr()" as arguments.

In preparation to revamp the API to pass a single structure, in
which these N elements are held, rename the type used for these
individual array elements to "struct attr_check_item" and rename
the function to "git_check_attrs()".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 archive.c              |  6 +++---
 attr.c                 | 12 ++++++------
 attr.h                 |  8 ++++----
 builtin/check-attr.c   | 19 ++++++++++---------
 builtin/pack-objects.c |  6 +++---
 convert.c              | 12 ++++++------
 ll-merge.c             | 10 +++++-----
 userdiff.c             |  4 ++--
 ws.c                   |  6 +++---
 9 files changed, 42 insertions(+), 41 deletions(-)

diff --git a/archive.c b/archive.c
index 01751e574..b76bd4691 100644
--- a/archive.c
+++ b/archive.c
@@ -87,7 +87,7 @@ void *sha1_file_to_archive(const struct archiver_args *args,
 	return buffer;
 }
 
-static void setup_archive_check(struct git_attr_check *check)
+static void setup_archive_check(struct attr_check_item *check)
 {
 	static struct git_attr *attr_export_ignore;
 	static struct git_attr *attr_export_subst;
@@ -123,7 +123,7 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 	struct archiver_context *c = context;
 	struct archiver_args *args = c->args;
 	write_archive_entry_fn_t write_entry = c->write_entry;
-	struct git_attr_check check[2];
+	struct attr_check_item check[2];
 	const char *path_without_prefix;
 	int err;
 
@@ -138,7 +138,7 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 	path_without_prefix = path.buf + args->baselen;
 
 	setup_archive_check(check);
-	if (!git_check_attr(path_without_prefix, ARRAY_SIZE(check), check)) {
+	if (!git_check_attrs(path_without_prefix, ARRAY_SIZE(check), check)) {
 		if (ATTR_TRUE(check[0].value))
 			return 0;
 		args->convert = ATTR_TRUE(check[1].value);
diff --git a/attr.c b/attr.c
index 50e5ee393..2f180d609 100644
--- a/attr.c
+++ b/attr.c
@@ -56,7 +56,7 @@ static struct git_attr *(git_attr_hash[HASHSIZE]);
 static int cannot_trust_maybe_real;
 
 /* NEEDSWORK: This will become per git_attr_check */
-static struct git_attr_check *check_all_attr;
+static struct attr_check_item *check_all_attr;
 
 const char *git_attr_name(const struct git_attr *attr)
 {
@@ -713,7 +713,7 @@ static int macroexpand_one(int attr_nr, int rem);
 
 static int fill_one(const char *what, struct match_attr *a, int rem)
 {
-	struct git_attr_check *check = check_all_attr;
+	struct attr_check_item *check = check_all_attr;
 	int i;
 
 	for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) {
@@ -778,7 +778,7 @@ static int macroexpand_one(int nr, int rem)
  * collected. Otherwise all attributes are collected.
  */
 static void collect_some_attrs(const char *path, int num,
-			       struct git_attr_check *check)
+			       struct attr_check_item *check)
 
 {
 	struct attr_stack *stk;
@@ -806,7 +806,7 @@ static void collect_some_attrs(const char *path, int num,
 		rem = 0;
 		for (i = 0; i < num; i++) {
 			if (!check[i].attr->maybe_real) {
-				struct git_attr_check *c;
+				struct attr_check_item *c;
 				c = check_all_attr + check[i].attr->attr_nr;
 				c->value = ATTR__UNSET;
 				rem++;
@@ -821,7 +821,7 @@ static void collect_some_attrs(const char *path, int num,
 		rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-int git_check_attr(const char *path, int num, struct git_attr_check *check)
+int git_check_attrs(const char *path, int num, struct attr_check_item *check)
 {
 	int i;
 
@@ -837,7 +837,7 @@ int git_check_attr(const char *path, int num, struct git_attr_check *check)
 	return 0;
 }
 
-int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
+int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
 {
 	int i, count, j;
 
diff --git a/attr.h b/attr.h
index 00d7a662c..efc7bb3b3 100644
--- a/attr.h
+++ b/attr.h
@@ -20,11 +20,11 @@ extern const char git_attr__false[];
 #define ATTR_UNSET(v) ((v) == NULL)
 
 /*
- * Send one or more git_attr_check to git_check_attr(), and
+ * Send one or more git_attr_check to git_check_attrs(), and
  * each 'value' member tells what its value is.
  * Unset one is returned as NULL.
  */
-struct git_attr_check {
+struct attr_check_item {
 	const struct git_attr *attr;
 	const char *value;
 };
@@ -36,7 +36,7 @@ struct git_attr_check {
  */
 extern const char *git_attr_name(const struct git_attr *);
 
-int git_check_attr(const char *path, int, struct git_attr_check *);
+int git_check_attrs(const char *path, int, struct attr_check_item *);
 
 /*
  * Retrieve all attributes that apply to the specified path.  *num
@@ -45,7 +45,7 @@ int git_check_attr(const char *path, int, struct git_attr_check *);
  * objects describing the attributes and their values.  *check must be
  * free()ed by the caller.
  */
-int git_all_attrs(const char *path, int *num, struct git_attr_check **check);
+int git_all_attrs(const char *path, int *num, struct attr_check_item **check);
 
 enum git_attr_direction {
 	GIT_ATTR_CHECKIN,
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index 53a5a18c1..889264a5b 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -24,8 +24,8 @@ static const struct option check_attr_options[] = {
 	OPT_END()
 };
 
-static void output_attr(int cnt, struct git_attr_check *check,
-	const char *file)
+static void output_attr(int cnt, struct attr_check_item *check,
+			const char *file)
 {
 	int j;
 	for (j = 0; j < cnt; j++) {
@@ -51,14 +51,15 @@ static void output_attr(int cnt, struct git_attr_check *check,
 	}
 }
 
-static void check_attr(const char *prefix, int cnt,
-	struct git_attr_check *check, const char *file)
+static void check_attr(const char *prefix,
+		       int cnt, struct attr_check_item *check,
+		       const char *file)
 {
 	char *full_path =
 		prefix_path(prefix, prefix ? strlen(prefix) : 0, file);
 	if (check != NULL) {
-		if (git_check_attr(full_path, cnt, check))
-			die("git_check_attr died");
+		if (git_check_attrs(full_path, cnt, check))
+			die("git_check_attrs died");
 		output_attr(cnt, check, file);
 	} else {
 		if (git_all_attrs(full_path, &cnt, &check))
@@ -69,8 +70,8 @@ static void check_attr(const char *prefix, int cnt,
 	free(full_path);
 }
 
-static void check_attr_stdin_paths(const char *prefix, int cnt,
-	struct git_attr_check *check)
+static void check_attr_stdin_paths(const char *prefix,
+				   int cnt, struct attr_check_item *check)
 {
 	struct strbuf buf = STRBUF_INIT;
 	struct strbuf unquoted = STRBUF_INIT;
@@ -99,7 +100,7 @@ static NORETURN void error_with_usage(const char *msg)
 
 int cmd_check_attr(int argc, const char **argv, const char *prefix)
 {
-	struct git_attr_check *check;
+	struct attr_check_item *check;
 	int cnt, i, doubledash, filei;
 
 	if (!is_bare_repository())
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 8841f8b36..8b8fbd814 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -894,7 +894,7 @@ static void write_pack_file(void)
 			written, nr_result);
 }
 
-static void setup_delta_attr_check(struct git_attr_check *check)
+static void setup_delta_attr_check(struct attr_check_item *check)
 {
 	static struct git_attr *attr_delta;
 
@@ -906,10 +906,10 @@ static void setup_delta_attr_check(struct git_attr_check *check)
 
 static int no_try_delta(const char *path)
 {
-	struct git_attr_check check[1];
+	struct attr_check_item check[1];
 
 	setup_delta_attr_check(check);
-	if (git_check_attr(path, ARRAY_SIZE(check), check))
+	if (git_check_attrs(path, ARRAY_SIZE(check), check))
 		return 0;
 	if (ATTR_FALSE(check->value))
 		return 1;
diff --git a/convert.c b/convert.c
index 4e17e45ed..1b9829279 100644
--- a/convert.c
+++ b/convert.c
@@ -1028,7 +1028,7 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
 	return 1;
 }
 
-static enum crlf_action git_path_check_crlf(struct git_attr_check *check)
+static enum crlf_action git_path_check_crlf(struct attr_check_item *check)
 {
 	const char *value = check->value;
 
@@ -1045,7 +1045,7 @@ static enum crlf_action git_path_check_crlf(struct git_attr_check *check)
 	return CRLF_UNDEFINED;
 }
 
-static enum eol git_path_check_eol(struct git_attr_check *check)
+static enum eol git_path_check_eol(struct attr_check_item *check)
 {
 	const char *value = check->value;
 
@@ -1058,7 +1058,7 @@ static enum eol git_path_check_eol(struct git_attr_check *check)
 	return EOL_UNSET;
 }
 
-static struct convert_driver *git_path_check_convert(struct git_attr_check *check)
+static struct convert_driver *git_path_check_convert(struct attr_check_item *check)
 {
 	const char *value = check->value;
 	struct convert_driver *drv;
@@ -1071,7 +1071,7 @@ static struct convert_driver *git_path_check_convert(struct git_attr_check *chec
 	return NULL;
 }
 
-static int git_path_check_ident(struct git_attr_check *check)
+static int git_path_check_ident(struct attr_check_item *check)
 {
 	const char *value = check->value;
 
@@ -1093,7 +1093,7 @@ static const char *conv_attr_name[] = {
 static void convert_attrs(struct conv_attrs *ca, const char *path)
 {
 	int i;
-	static struct git_attr_check ccheck[NUM_CONV_ATTRS];
+	static struct attr_check_item ccheck[NUM_CONV_ATTRS];
 
 	if (!ccheck[0].attr) {
 		for (i = 0; i < NUM_CONV_ATTRS; i++)
@@ -1102,7 +1102,7 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
 		git_config(read_convert_config, NULL);
 	}
 
-	if (!git_check_attr(path, NUM_CONV_ATTRS, ccheck)) {
+	if (!git_check_attrs(path, NUM_CONV_ATTRS, ccheck)) {
 		ca->crlf_action = git_path_check_crlf(ccheck + 4);
 		if (ca->crlf_action == CRLF_UNDEFINED)
 			ca->crlf_action = git_path_check_crlf(ccheck + 0);
diff --git a/ll-merge.c b/ll-merge.c
index ad8be42f9..198f07aca 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -336,13 +336,13 @@ static const struct ll_merge_driver *find_ll_merge_driver(const char *merge_attr
 	return &ll_merge_drv[LL_TEXT_MERGE];
 }
 
-static int git_path_check_merge(const char *path, struct git_attr_check check[2])
+static int git_path_check_merge(const char *path, struct attr_check_item check[2])
 {
 	if (!check[0].attr) {
 		check[0].attr = git_attr("merge");
 		check[1].attr = git_attr("conflict-marker-size");
 	}
-	return git_check_attr(path, 2, check);
+	return git_check_attrs(path, 2, check);
 }
 
 static void normalize_file(mmfile_t *mm, const char *path)
@@ -362,7 +362,7 @@ int ll_merge(mmbuffer_t *result_buf,
 	     mmfile_t *theirs, const char *their_label,
 	     const struct ll_merge_options *opts)
 {
-	static struct git_attr_check check[2];
+	static struct attr_check_item check[2];
 	static const struct ll_merge_options default_opts;
 	const char *ll_driver_name = NULL;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
@@ -398,12 +398,12 @@ int ll_merge(mmbuffer_t *result_buf,
 
 int ll_merge_marker_size(const char *path)
 {
-	static struct git_attr_check check;
+	static struct attr_check_item check;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 
 	if (!check.attr)
 		check.attr = git_attr("conflict-marker-size");
-	if (!git_check_attr(path, 1, &check) && check.value) {
+	if (!git_check_attrs(path, 1, &check) && check.value) {
 		marker_size = atoi(check.value);
 		if (marker_size <= 0)
 			marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
diff --git a/userdiff.c b/userdiff.c
index 2125d6da2..b0b44467a 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -263,7 +263,7 @@ struct userdiff_driver *userdiff_find_by_name(const char *name) {
 struct userdiff_driver *userdiff_find_by_path(const char *path)
 {
 	static struct git_attr *attr;
-	struct git_attr_check check;
+	struct attr_check_item check;
 
 	if (!attr)
 		attr = git_attr("diff");
@@ -271,7 +271,7 @@ struct userdiff_driver *userdiff_find_by_path(const char *path)
 
 	if (!path)
 		return NULL;
-	if (git_check_attr(path, 1, &check))
+	if (git_check_attrs(path, 1, &check))
 		return NULL;
 
 	if (ATTR_TRUE(check.value))
diff --git a/ws.c b/ws.c
index ea4b2b1df..fbd876e84 100644
--- a/ws.c
+++ b/ws.c
@@ -71,7 +71,7 @@ unsigned parse_whitespace_rule(const char *string)
 	return rule;
 }
 
-static void setup_whitespace_attr_check(struct git_attr_check *check)
+static void setup_whitespace_attr_check(struct attr_check_item *check)
 {
 	static struct git_attr *attr_whitespace;
 
@@ -82,10 +82,10 @@ static void setup_whitespace_attr_check(struct git_attr_check *check)
 
 unsigned whitespace_rule(const char *pathname)
 {
-	struct git_attr_check attr_whitespace_rule;
+	struct attr_check_item attr_whitespace_rule;
 
 	setup_whitespace_attr_check(&attr_whitespace_rule);
-	if (!git_check_attr(pathname, 1, &attr_whitespace_rule)) {
+	if (!git_check_attrs(pathname, 1, &attr_whitespace_rule)) {
 		const char *value;
 
 		value = attr_whitespace_rule.value;
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 15/27] attr: (re)introduce git_check_attr() and struct attr_check
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (13 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 14/27] attr: rename function and struct related to checking attributes Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-30 18:05       ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 16/27] attr: convert git_all_attrs() to use "struct attr_check" Brandon Williams
                       ` (12 subsequent siblings)
  27 siblings, 1 reply; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

A common pattern to check N attributes for many paths is to

 (1) prepare an array A of N attr_check_item items;
 (2) call git_attr() to intern the N attribute names and fill A;
 (3) repeatedly call git_check_attrs() for each path with N and A.

A look-up of these N attributes for a single path P scans the
entire attr_stack, starting from the .git/info/attributes file, then
the .gitattributes file in the directory the path P is in, and then
going upwards through the .gitattributes files in parent directories.

An earlier commit 06a604e6 (attr: avoid heavy work when we know the
specified attr is not defined, 2014-12-28) tried to optimize out
this scanning for one trivial special case: when the attribute being
sought is known not to exist, we do not have to scan for it.  While
this may be a cheap and effective heuristic, it would not work well
when N is (much) more than 1.

What we would want is a more customized way to skip irrelevant
entries in the attribute stack, and the definition of irrelevance
is tied to the set of attributes passed to git_check_attrs() call,
i.e. the set of attributes being sought.  The data necessary for
this optimization needs to live alongside the set of attributes, but
a simple array of attr_check_item simply does not have any place
for that.

Introduce "struct attr_check" that contains N, the number of
attributes being sought, and A, the array that holds N
attr_check_item items, and a function git_check_attr() that
takes a path P and this structure as its parameters.  This structure
can later be extended to hold extra data necessary for optimization.

Also, to make it easier to write the first two steps in common
cases, introduce the attr_check_initl() helper function, which takes
a NULL-terminated list of attribute names and initializes this
structure.
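
The two-pass varargs technique that attr_check_initl() relies on can
be sketched on its own.  The types and names below (struct name_list,
name_list_initl, name_list_free) are hypothetical and only store the
strings; the real helper interns each name with git_attr() and fills
attr_check_item entries instead.

```c
#include <stdarg.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct attr_check. */
struct name_list {
	int nr;
	const char **names;
};

/*
 * Build a list from a NULL-terminated sequence of strings.  "one"
 * must not be NULL.  Returns NULL if an allocation fails.
 */
static struct name_list *name_list_initl(const char *one, ...)
{
	struct name_list *list;
	va_list params;
	int cnt, i;

	/* First pass: count the arguments up to the NULL sentinel. */
	va_start(params, one);
	for (cnt = 1; va_arg(params, const char *) != NULL; cnt++)
		;
	va_end(params);

	list = malloc(sizeof(*list));
	if (!list)
		return NULL;
	list->names = calloc(cnt, sizeof(*list->names));
	if (!list->names) {
		free(list);
		return NULL;
	}
	list->nr = cnt;

	/* Second pass: restart the traversal and store each name. */
	list->names[0] = one;
	va_start(params, one);
	for (i = 1; i < cnt; i++)
		list->names[i] = va_arg(params, const char *);
	va_end(params);
	return list;
}

static void name_list_free(struct name_list *list)
{
	if (!list)
		return;
	free(list->names);
	free(list);
}
```

A call would look like name_list_initl("text", "eol", (const char *)NULL).
The cast on the sentinel is a precaution for platforms where a bare
NULL is not passed as a pointer through a variadic call.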

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 attr.h | 17 +++++++++++++++
 2 files changed, 91 insertions(+)

diff --git a/attr.c b/attr.c
index 2f180d609..de8bf35a3 100644
--- a/attr.c
+++ b/attr.c
@@ -132,6 +132,75 @@ struct git_attr *git_attr(const char *name)
 	return git_attr_internal(name, strlen(name));
 }
 
+struct attr_check *attr_check_alloc(void)
+{
+	return xcalloc(1, sizeof(struct attr_check));
+}
+
+struct attr_check *attr_check_initl(const char *one, ...)
+{
+	struct attr_check *check;
+	int cnt;
+	va_list params;
+	const char *param;
+
+	va_start(params, one);
+	for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++)
+		;
+	va_end(params);
+
+	check = attr_check_alloc();
+	check->nr = cnt;
+	check->alloc = cnt;
+	check->items = xcalloc(cnt, sizeof(struct attr_check_item));
+
+	check->items[0].attr = git_attr(one);
+	va_start(params, one);
+	for (cnt = 1; cnt < check->nr; cnt++) {
+		const struct git_attr *attr;
+		param = va_arg(params, const char *);
+		if (!param)
+			die("BUG: counted %d != ended at %d",
+			    check->nr, cnt);
+		attr = git_attr(param);
+		if (!attr)
+			die("BUG: %s: not a valid attribute name", param);
+		check->items[cnt].attr = attr;
+	}
+	va_end(params);
+	return check;
+}
+
+struct attr_check_item *attr_check_append(struct attr_check *check,
+					  const struct git_attr *attr)
+{
+	struct attr_check_item *item;
+
+	ALLOC_GROW(check->items, check->nr + 1, check->alloc);
+	item = &check->items[check->nr++];
+	item->attr = attr;
+	return item;
+}
+
+void attr_check_reset(struct attr_check *check)
+{
+	check->nr = 0;
+}
+
+void attr_check_clear(struct attr_check *check)
+{
+	free(check->items);
+	check->items = NULL;
+	check->alloc = 0;
+	check->nr = 0;
+}
+
+void attr_check_free(struct attr_check *check)
+{
+	attr_check_clear(check);
+	free(check);
+}
+
 /* What does a matched pattern decide? */
 struct attr_state {
 	struct git_attr *attr;
@@ -865,6 +934,11 @@ int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
 	return 0;
 }
 
+int git_check_attr(const char *path, struct attr_check *check)
+{
+	return git_check_attrs(path, check->nr, check->items);
+}
+
 void git_attr_set_direction(enum git_attr_direction new, struct index_state *istate)
 {
 	enum git_attr_direction old = direction;
diff --git a/attr.h b/attr.h
index efc7bb3b3..e611b139a 100644
--- a/attr.h
+++ b/attr.h
@@ -29,6 +29,22 @@ struct attr_check_item {
 	const char *value;
 };
 
+struct attr_check {
+	int nr;
+	int alloc;
+	struct attr_check_item *items;
+};
+
+extern struct attr_check *attr_check_alloc(void);
+extern struct attr_check *attr_check_initl(const char *, ...);
+
+extern struct attr_check_item *attr_check_append(struct attr_check *check,
+						 const struct git_attr *attr);
+
+extern void attr_check_reset(struct attr_check *check);
+extern void attr_check_clear(struct attr_check *check);
+extern void attr_check_free(struct attr_check *check);
+
 /*
  * Return the name of the attribute represented by the argument.  The
  * return value is a pointer to a null-delimited string that is part
@@ -37,6 +53,7 @@ struct attr_check_item {
 extern const char *git_attr_name(const struct git_attr *);
 
 int git_check_attrs(const char *path, int, struct attr_check_item *);
+extern int git_check_attr(const char *path, struct attr_check *check);
 
 /*
  * Retrieve all attributes that apply to the specified path.  *num
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 16/27] attr: convert git_all_attrs() to use "struct attr_check"
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (14 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 15/27] attr: (re)introduce git_check_attr() and struct attr_check Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28 23:50       ` Stefan Beller
  2017-01-30 18:06       ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 17/27] attr: convert git_check_attrs() callers to use the new API Brandon Williams
                       ` (11 subsequent siblings)
  27 siblings, 2 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

This updates the other two ways the attribute check is done via an
array of "struct attr_check_item" elements.  These two niches
appear only in "git check-attr".

 * The caller does not know offhand what attributes it wants to ask
   about and cannot use attr_check_initl() to prepare the
   attr_check structure.

 * The caller may not know what attributes it wants to ask about at
   all, and instead wants to learn everything that the given path has.

Such a caller can call attr_check_alloc() to allocate an empty
attr_check, and then call attr_check_append() to add attribute names
one by one.
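
The allocate-then-append flow described above can be sketched in isolation
as follows.  This is only an illustration under stated assumptions: the
`sketch_` names are hypothetical, attribute names are plain strings rather
than `struct git_attr` pointers, and the doubling growth policy is an
assumption made for the example, not the policy of git's ALLOC_GROW macro.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for attr_check_item and attr_check. */
struct sketch_item {
	const char *name;
	const char *value;
};

struct sketch_check {
	int nr;		/* number of items in use */
	int alloc;	/* number of items allocated */
	struct sketch_item *items;
};

/* Allocate an empty check; items are added later, one by one. */
struct sketch_check *sketch_check_alloc(void)
{
	struct sketch_check *check = calloc(1, sizeof(*check));
	if (!check)
		abort();
	return check;
}

/* Append one attribute name, growing the array when it is full. */
struct sketch_item *sketch_check_append(struct sketch_check *check,
					const char *name)
{
	struct sketch_item *item;

	assert(check != NULL);
	if (check->nr + 1 > check->alloc) {
		int alloc = check->alloc ? check->alloc * 2 : 4;
		struct sketch_item *grown;

		grown = realloc(check->items, alloc * sizeof(*grown));
		if (!grown)
			abort();
		check->items = grown;
		check->alloc = alloc;
	}
	item = &check->items[check->nr++];
	item->name = name;
	item->value = NULL;
	return item;
}

/* Forget the items but keep the allocation for reuse. */
void sketch_check_reset(struct sketch_check *check)
{
	check->nr = 0;
}

/* Release the item array and the check itself. */
void sketch_check_free(struct sketch_check *check)
{
	free(check->items);
	free(check);
}
```

Keeping the allocation across a reset is what lets one check be reused for
many paths in the "learn everything" case without reallocating each time.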

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c               | 168 ++++++++++++++++++++++++---------------------------
 attr.h               |   9 +--
 builtin/check-attr.c |  60 +++++++++---------
 3 files changed, 112 insertions(+), 125 deletions(-)

diff --git a/attr.c b/attr.c
index de8bf35a3..40818246f 100644
--- a/attr.c
+++ b/attr.c
@@ -132,75 +132,6 @@ struct git_attr *git_attr(const char *name)
 	return git_attr_internal(name, strlen(name));
 }
 
-struct attr_check *attr_check_alloc(void)
-{
-	return xcalloc(1, sizeof(struct attr_check));
-}
-
-struct attr_check *attr_check_initl(const char *one, ...)
-{
-	struct attr_check *check;
-	int cnt;
-	va_list params;
-	const char *param;
-
-	va_start(params, one);
-	for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++)
-		;
-	va_end(params);
-
-	check = attr_check_alloc();
-	check->nr = cnt;
-	check->alloc = cnt;
-	check->items = xcalloc(cnt, sizeof(struct attr_check_item));
-
-	check->items[0].attr = git_attr(one);
-	va_start(params, one);
-	for (cnt = 1; cnt < check->nr; cnt++) {
-		const struct git_attr *attr;
-		param = va_arg(params, const char *);
-		if (!param)
-			die("BUG: counted %d != ended at %d",
-			    check->nr, cnt);
-		attr = git_attr(param);
-		if (!attr)
-			die("BUG: %s: not a valid attribute name", param);
-		check->items[cnt].attr = attr;
-	}
-	va_end(params);
-	return check;
-}
-
-struct attr_check_item *attr_check_append(struct attr_check *check,
-					  const struct git_attr *attr)
-{
-	struct attr_check_item *item;
-
-	ALLOC_GROW(check->items, check->nr + 1, check->alloc);
-	item = &check->items[check->nr++];
-	item->attr = attr;
-	return item;
-}
-
-void attr_check_reset(struct attr_check *check)
-{
-	check->nr = 0;
-}
-
-void attr_check_clear(struct attr_check *check)
-{
-	free(check->items);
-	check->items = NULL;
-	check->alloc = 0;
-	check->nr = 0;
-}
-
-void attr_check_free(struct attr_check *check)
-{
-	attr_check_clear(check);
-	free(check);
-}
-
 /* What does a matched pattern decide? */
 struct attr_state {
 	struct git_attr *attr;
@@ -439,6 +370,75 @@ static void free_attr_elem(struct attr_stack *e)
 	free(e);
 }
 
+struct attr_check *attr_check_alloc(void)
+{
+	return xcalloc(1, sizeof(struct attr_check));
+}
+
+struct attr_check *attr_check_initl(const char *one, ...)
+{
+	struct attr_check *check;
+	int cnt;
+	va_list params;
+	const char *param;
+
+	va_start(params, one);
+	for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++)
+		;
+	va_end(params);
+
+	check = attr_check_alloc();
+	check->nr = cnt;
+	check->alloc = cnt;
+	check->items = xcalloc(cnt, sizeof(struct attr_check_item));
+
+	check->items[0].attr = git_attr(one);
+	va_start(params, one);
+	for (cnt = 1; cnt < check->nr; cnt++) {
+		const struct git_attr *attr;
+		param = va_arg(params, const char *);
+		if (!param)
+			die("BUG: counted %d != ended at %d",
+			    check->nr, cnt);
+		attr = git_attr(param);
+		if (!attr)
+			die("BUG: %s: not a valid attribute name", param);
+		check->items[cnt].attr = attr;
+	}
+	va_end(params);
+	return check;
+}
+
+struct attr_check_item *attr_check_append(struct attr_check *check,
+					  const struct git_attr *attr)
+{
+	struct attr_check_item *item;
+
+	ALLOC_GROW(check->items, check->nr + 1, check->alloc);
+	item = &check->items[check->nr++];
+	item->attr = attr;
+	return item;
+}
+
+void attr_check_reset(struct attr_check *check)
+{
+	check->nr = 0;
+}
+
+void attr_check_clear(struct attr_check *check)
+{
+	free(check->items);
+	check->items = NULL;
+	check->alloc = 0;
+	check->nr = 0;
+}
+
+void attr_check_free(struct attr_check *check)
+{
+	attr_check_clear(check);
+	free(check);
+}
+
 static const char *builtin_attr[] = {
 	"[attr]binary -diff -merge -text",
 	NULL,
@@ -906,32 +906,22 @@ int git_check_attrs(const char *path, int num, struct attr_check_item *check)
 	return 0;
 }
 
-int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
+void git_all_attrs(const char *path, struct attr_check *check)
 {
-	int i, count, j;
+	int i;
 
-	collect_some_attrs(path, 0, NULL);
+	attr_check_reset(check);
+	collect_some_attrs(path, check->nr, check->items);
 
-	/* Count the number of attributes that are set. */
-	count = 0;
-	for (i = 0; i < attr_nr; i++) {
-		const char *value = check_all_attr[i].value;
-		if (value != ATTR__UNSET && value != ATTR__UNKNOWN)
-			++count;
-	}
-	*num = count;
-	ALLOC_ARRAY(*check, count);
-	j = 0;
 	for (i = 0; i < attr_nr; i++) {
+		const char *name = check_all_attr[i].attr->name;
 		const char *value = check_all_attr[i].value;
-		if (value != ATTR__UNSET && value != ATTR__UNKNOWN) {
-			(*check)[j].attr = check_all_attr[i].attr;
-			(*check)[j].value = value;
-			++j;
-		}
+		struct attr_check_item *item;
+		if (value == ATTR__UNSET || value == ATTR__UNKNOWN)
+			continue;
+		item = attr_check_append(check, git_attr(name));
+		item->value = value;
 	}
-
-	return 0;
 }
 
 int git_check_attr(const char *path, struct attr_check *check)
diff --git a/attr.h b/attr.h
index e611b139a..9f2729842 100644
--- a/attr.h
+++ b/attr.h
@@ -56,13 +56,10 @@ int git_check_attrs(const char *path, int, struct attr_check_item *);
 extern int git_check_attr(const char *path, struct attr_check *check);
 
 /*
- * Retrieve all attributes that apply to the specified path.  *num
- * will be set to the number of attributes on the path; **check will
- * be set to point at a newly-allocated array of git_attr_check
- * objects describing the attributes and their values.  *check must be
- * free()ed by the caller.
+ * Retrieve all attributes that apply to the specified path.
+ * check holds the attributes and their values.
  */
-int git_all_attrs(const char *path, int *num, struct attr_check_item **check);
+extern void git_all_attrs(const char *path, struct attr_check *check);
 
 enum git_attr_direction {
 	GIT_ATTR_CHECKIN,
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index 889264a5b..40cdff13e 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -24,12 +24,13 @@ static const struct option check_attr_options[] = {
 	OPT_END()
 };
 
-static void output_attr(int cnt, struct attr_check_item *check,
-			const char *file)
+static void output_attr(struct attr_check *check, const char *file)
 {
 	int j;
+	int cnt = check->nr;
+
 	for (j = 0; j < cnt; j++) {
-		const char *value = check[j].value;
+		const char *value = check->items[j].value;
 
 		if (ATTR_TRUE(value))
 			value = "set";
@@ -42,36 +43,38 @@ static void output_attr(int cnt, struct attr_check_item *check,
 			printf("%s%c" /* path */
 			       "%s%c" /* attrname */
 			       "%s%c" /* attrvalue */,
-			       file, 0, git_attr_name(check[j].attr), 0, value, 0);
+			       file, 0,
+			       git_attr_name(check->items[j].attr), 0, value, 0);
 		} else {
 			quote_c_style(file, NULL, stdout, 0);
-			printf(": %s: %s\n", git_attr_name(check[j].attr), value);
+			printf(": %s: %s\n",
+			       git_attr_name(check->items[j].attr), value);
 		}
-
 	}
 }
 
 static void check_attr(const char *prefix,
-		       int cnt, struct attr_check_item *check,
+		       struct attr_check *check,
+		       int collect_all,
 		       const char *file)
 {
 	char *full_path =
 		prefix_path(prefix, prefix ? strlen(prefix) : 0, file);
-	if (check != NULL) {
-		if (git_check_attrs(full_path, cnt, check))
-			die("git_check_attrs died");
-		output_attr(cnt, check, file);
+
+	if (collect_all) {
+		git_all_attrs(full_path, check);
 	} else {
-		if (git_all_attrs(full_path, &cnt, &check))
-			die("git_all_attrs died");
-		output_attr(cnt, check, file);
-		free(check);
+		if (git_check_attr(full_path, check))
+			die("git_check_attr died");
 	}
+	output_attr(check, file);
+
 	free(full_path);
 }
 
 static void check_attr_stdin_paths(const char *prefix,
-				   int cnt, struct attr_check_item *check)
+				   struct attr_check *check,
+				   int collect_all)
 {
 	struct strbuf buf = STRBUF_INIT;
 	struct strbuf unquoted = STRBUF_INIT;
@@ -85,7 +88,7 @@ static void check_attr_stdin_paths(const char *prefix,
 				die("line is badly quoted");
 			strbuf_swap(&buf, &unquoted);
 		}
-		check_attr(prefix, cnt, check, buf.buf);
+		check_attr(prefix, check, collect_all, buf.buf);
 		maybe_flush_or_die(stdout, "attribute to stdout");
 	}
 	strbuf_release(&buf);
@@ -100,7 +103,7 @@ static NORETURN void error_with_usage(const char *msg)
 
 int cmd_check_attr(int argc, const char **argv, const char *prefix)
 {
-	struct attr_check_item *check;
+	struct attr_check *check;
 	int cnt, i, doubledash, filei;
 
 	if (!is_bare_repository())
@@ -160,28 +163,25 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix)
 			error_with_usage("No file specified");
 	}
 
-	if (all_attrs) {
-		check = NULL;
-	} else {
-		check = xcalloc(cnt, sizeof(*check));
+	check = attr_check_alloc();
+	if (!all_attrs) {
 		for (i = 0; i < cnt; i++) {
-			const char *name;
-			struct git_attr *a;
-			name = argv[i];
-			a = git_attr(name);
+			struct git_attr *a = git_attr(argv[i]);
 			if (!a)
 				return error("%s: not a valid attribute name",
-					name);
-			check[i].attr = a;
+					     argv[i]);
+			attr_check_append(check, a);
 		}
 	}
 
 	if (stdin_paths)
-		check_attr_stdin_paths(prefix, cnt, check);
+		check_attr_stdin_paths(prefix, check, all_attrs);
 	else {
 		for (i = filei; i < argc; i++)
-			check_attr(prefix, cnt, check, argv[i]);
+			check_attr(prefix, check, all_attrs, argv[i]);
 		maybe_flush_or_die(stdout, "attribute to stdout");
 	}
+
+	attr_check_free(check);
 	return 0;
 }
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 17/27] attr: convert git_check_attrs() callers to use the new API
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (15 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 16/27] attr: convert git_all_attrs() to use "struct attr_check" Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 18/27] attr: retire git_check_attrs() API Brandon Williams
                       ` (10 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

The remaining callers are all simple "I have N attributes I am
interested in.  I'll ask about them with various paths one by one".

After this step, no caller to git_check_attrs() remains.  After
removing it, we can extend "struct attr_check" with data that can
be used in optimizing the query for the specific N attributes it
contains.
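
The caller shape this conversion settles on (a function-local static check
that is built on first use and then reused for every path) can be sketched
on its own as below.  Everything here is hypothetical: the `sketch_` names
are invented, and sketch_query() is a toy rule standing in for a real
attribute lookup.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for attr_check_item and attr_check. */
struct sketch_item {
	const char *name;
	const char *value;
};

struct sketch_check {
	int nr;
	struct sketch_item *items;
};

/* Counts how many times a check was built; only for demonstration. */
int sketch_init_calls;

static struct sketch_check *sketch_make_check(const char *name)
{
	struct sketch_check *check = calloc(1, sizeof(*check));
	if (!check)
		abort();
	check->items = calloc(1, sizeof(*check->items));
	if (!check->items)
		abort();
	check->items[0].name = name;
	check->nr = 1;
	sketch_init_calls++;
	return check;
}

/* Toy lookup: pretend the attribute is set only for "*.txt" paths. */
static int sketch_query(const char *path, struct sketch_check *check)
{
	size_t len = strlen(path);
	int is_txt = len >= 4 && !strcmp(path + len - 4, ".txt");

	assert(check->nr == 1);
	check->items[0].value = is_txt ? "set" : NULL;
	return 0;
}

/* One attribute of interest, asked about for many paths. */
int sketch_path_is_text(const char *path)
{
	static struct sketch_check *check;

	if (!check)
		check = sketch_make_check("text");
	if (sketch_query(path, check))
		return 0;
	return check->items[0].value != NULL;
}
```

Note that a lazily initialized function-local static like this is not, by
itself, safe if two threads make the first call concurrently.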

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 archive.c              | 24 ++++++------------------
 builtin/pack-objects.c | 19 +++++--------------
 convert.c              | 17 ++++++-----------
 ll-merge.c             | 33 ++++++++++++++-------------------
 userdiff.c             | 19 ++++++++-----------
 ws.c                   | 19 ++++++-------------
 6 files changed, 45 insertions(+), 86 deletions(-)

diff --git a/archive.c b/archive.c
index b76bd4691..60b889198 100644
--- a/archive.c
+++ b/archive.c
@@ -87,19 +87,6 @@ void *sha1_file_to_archive(const struct archiver_args *args,
 	return buffer;
 }
 
-static void setup_archive_check(struct attr_check_item *check)
-{
-	static struct git_attr *attr_export_ignore;
-	static struct git_attr *attr_export_subst;
-
-	if (!attr_export_ignore) {
-		attr_export_ignore = git_attr("export-ignore");
-		attr_export_subst = git_attr("export-subst");
-	}
-	check[0].attr = attr_export_ignore;
-	check[1].attr = attr_export_subst;
-}
-
 struct directory {
 	struct directory *up;
 	struct object_id oid;
@@ -120,10 +107,10 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 		void *context)
 {
 	static struct strbuf path = STRBUF_INIT;
+	static struct attr_check *check;
 	struct archiver_context *c = context;
 	struct archiver_args *args = c->args;
 	write_archive_entry_fn_t write_entry = c->write_entry;
-	struct attr_check_item check[2];
 	const char *path_without_prefix;
 	int err;
 
@@ -137,11 +124,12 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 		strbuf_addch(&path, '/');
 	path_without_prefix = path.buf + args->baselen;
 
-	setup_archive_check(check);
-	if (!git_check_attrs(path_without_prefix, ARRAY_SIZE(check), check)) {
-		if (ATTR_TRUE(check[0].value))
+	if (!check)
+		check = attr_check_initl("export-ignore", "export-subst", NULL);
+	if (!git_check_attr(path_without_prefix, check)) {
+		if (ATTR_TRUE(check->items[0].value))
 			return 0;
-		args->convert = ATTR_TRUE(check[1].value);
+		args->convert = ATTR_TRUE(check->items[1].value);
 	}
 
 	if (S_ISDIR(mode) || S_ISGITLINK(mode)) {
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 8b8fbd814..181e4a198 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -894,24 +894,15 @@ static void write_pack_file(void)
 			written, nr_result);
 }
 
-static void setup_delta_attr_check(struct attr_check_item *check)
-{
-	static struct git_attr *attr_delta;
-
-	if (!attr_delta)
-		attr_delta = git_attr("delta");
-
-	check[0].attr = attr_delta;
-}
-
 static int no_try_delta(const char *path)
 {
-	struct attr_check_item check[1];
+	static struct attr_check *check;
 
-	setup_delta_attr_check(check);
-	if (git_check_attrs(path, ARRAY_SIZE(check), check))
+	if (!check)
+		check = attr_check_initl("delta", NULL);
+	if (git_check_attr(path, check))
 		return 0;
-	if (ATTR_FALSE(check->value))
+	if (ATTR_FALSE(check->items[0].value))
 		return 1;
 	return 0;
 }
diff --git a/convert.c b/convert.c
index 1b9829279..8d652bf27 100644
--- a/convert.c
+++ b/convert.c
@@ -1085,24 +1085,19 @@ struct conv_attrs {
 	int ident;
 };
 
-static const char *conv_attr_name[] = {
-	"crlf", "ident", "filter", "eol", "text",
-};
-#define NUM_CONV_ATTRS ARRAY_SIZE(conv_attr_name)
-
 static void convert_attrs(struct conv_attrs *ca, const char *path)
 {
-	int i;
-	static struct attr_check_item ccheck[NUM_CONV_ATTRS];
+	static struct attr_check *check;
 
-	if (!ccheck[0].attr) {
-		for (i = 0; i < NUM_CONV_ATTRS; i++)
-			ccheck[i].attr = git_attr(conv_attr_name[i]);
+	if (!check) {
+		check = attr_check_initl("crlf", "ident", "filter",
+					 "eol", "text", NULL);
 		user_convert_tail = &user_convert;
 		git_config(read_convert_config, NULL);
 	}
 
-	if (!git_check_attrs(path, NUM_CONV_ATTRS, ccheck)) {
+	if (!git_check_attr(path, check)) {
+		struct attr_check_item *ccheck = check->items;
 		ca->crlf_action = git_path_check_crlf(ccheck + 4);
 		if (ca->crlf_action == CRLF_UNDEFINED)
 			ca->crlf_action = git_path_check_crlf(ccheck + 0);
diff --git a/ll-merge.c b/ll-merge.c
index 198f07aca..ac0d4a5d7 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -336,15 +336,6 @@ static const struct ll_merge_driver *find_ll_merge_driver(const char *merge_attr
 	return &ll_merge_drv[LL_TEXT_MERGE];
 }
 
-static int git_path_check_merge(const char *path, struct attr_check_item check[2])
-{
-	if (!check[0].attr) {
-		check[0].attr = git_attr("merge");
-		check[1].attr = git_attr("conflict-marker-size");
-	}
-	return git_check_attrs(path, 2, check);
-}
-
 static void normalize_file(mmfile_t *mm, const char *path)
 {
 	struct strbuf strbuf = STRBUF_INIT;
@@ -362,7 +353,7 @@ int ll_merge(mmbuffer_t *result_buf,
 	     mmfile_t *theirs, const char *their_label,
 	     const struct ll_merge_options *opts)
 {
-	static struct attr_check_item check[2];
+	static struct attr_check *check;
 	static const struct ll_merge_options default_opts;
 	const char *ll_driver_name = NULL;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
@@ -376,10 +367,14 @@ int ll_merge(mmbuffer_t *result_buf,
 		normalize_file(ours, path);
 		normalize_file(theirs, path);
 	}
-	if (!git_path_check_merge(path, check)) {
-		ll_driver_name = check[0].value;
-		if (check[1].value) {
-			marker_size = atoi(check[1].value);
+
+	if (!check)
+		check = attr_check_initl("merge", "conflict-marker-size", NULL);
+
+	if (!git_check_attr(path, check)) {
+		ll_driver_name = check->items[0].value;
+		if (check->items[1].value) {
+			marker_size = atoi(check->items[1].value);
 			if (marker_size <= 0)
 				marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 		}
@@ -398,13 +393,13 @@ int ll_merge(mmbuffer_t *result_buf,
 
 int ll_merge_marker_size(const char *path)
 {
-	static struct attr_check_item check;
+	static struct attr_check *check;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 
-	if (!check.attr)
-		check.attr = git_attr("conflict-marker-size");
-	if (!git_check_attrs(path, 1, &check) && check.value) {
-		marker_size = atoi(check.value);
+	if (!check)
+		check = attr_check_initl("conflict-marker-size", NULL);
+	if (!git_check_attr(path, check) && check->items[0].value) {
+		marker_size = atoi(check->items[0].value);
 		if (marker_size <= 0)
 			marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 	}
diff --git a/userdiff.c b/userdiff.c
index b0b44467a..8b732e40b 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -262,25 +262,22 @@ struct userdiff_driver *userdiff_find_by_name(const char *name) {
 
 struct userdiff_driver *userdiff_find_by_path(const char *path)
 {
-	static struct git_attr *attr;
-	struct attr_check_item check;
-
-	if (!attr)
-		attr = git_attr("diff");
-	check.attr = attr;
+	static struct attr_check *check;
 
+	if (!check)
+		check = attr_check_initl("diff", NULL);
 	if (!path)
 		return NULL;
-	if (git_check_attrs(path, 1, &check))
+	if (git_check_attr(path, check))
 		return NULL;
 
-	if (ATTR_TRUE(check.value))
+	if (ATTR_TRUE(check->items[0].value))
 		return &driver_true;
-	if (ATTR_FALSE(check.value))
+	if (ATTR_FALSE(check->items[0].value))
 		return &driver_false;
-	if (ATTR_UNSET(check.value))
+	if (ATTR_UNSET(check->items[0].value))
 		return NULL;
-	return userdiff_find_by_name(check.value);
+	return userdiff_find_by_name(check->items[0].value);
 }
 
 struct userdiff_driver *userdiff_get_textconv(struct userdiff_driver *driver)
diff --git a/ws.c b/ws.c
index fbd876e84..a07caedd5 100644
--- a/ws.c
+++ b/ws.c
@@ -71,24 +71,17 @@ unsigned parse_whitespace_rule(const char *string)
 	return rule;
 }
 
-static void setup_whitespace_attr_check(struct attr_check_item *check)
-{
-	static struct git_attr *attr_whitespace;
-
-	if (!attr_whitespace)
-		attr_whitespace = git_attr("whitespace");
-	check[0].attr = attr_whitespace;
-}
-
 unsigned whitespace_rule(const char *pathname)
 {
-	struct attr_check_item attr_whitespace_rule;
+	static struct attr_check *attr_whitespace_rule;
+
+	if (!attr_whitespace_rule)
+		attr_whitespace_rule = attr_check_initl("whitespace", NULL);
 
-	setup_whitespace_attr_check(&attr_whitespace_rule);
-	if (!git_check_attrs(pathname, 1, &attr_whitespace_rule)) {
+	if (!git_check_attr(pathname, attr_whitespace_rule)) {
 		const char *value;
 
-		value = attr_whitespace_rule.value;
+		value = attr_whitespace_rule->items[0].value;
 		if (ATTR_TRUE(value)) {
 			/* true (whitespace) */
 			unsigned all_rule = ws_tab_width(whitespace_rule_cfg);
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 18/27] attr: retire git_check_attrs() API
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (16 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 17/27] attr: convert git_check_attrs() callers to use the new API Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:01     ` [PATCH v3 19/27] attr: pass struct attr_check to collect_some_attrs Brandon Williams
                       ` (9 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Since nobody uses the old API, make it file-scope static, and update
the documentation to describe the new API.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/technical/api-gitattributes.txt | 86 +++++++++++++++++----------
 attr.c                                        |  3 +-
 attr.h                                        |  1 -
 3 files changed, 58 insertions(+), 32 deletions(-)

diff --git a/Documentation/technical/api-gitattributes.txt b/Documentation/technical/api-gitattributes.txt
index 260266867..e7cbb7c13 100644
--- a/Documentation/technical/api-gitattributes.txt
+++ b/Documentation/technical/api-gitattributes.txt
@@ -16,10 +16,15 @@ Data Structure
 	of no interest to the calling programs.  The name of the
 	attribute can be retrieved by calling `git_attr_name()`.
 
-`struct git_attr_check`::
+`struct attr_check_item`::
 
-	This structure represents a set of attributes to check in a call
-	to `git_check_attr()` function, and receives the results.
+	This structure represents one attribute and its value.
+
+`struct attr_check`::
+
+	This structure represents a collection of `attr_check_item`.
+	It is passed to `git_check_attr()` function, specifying the
+	attributes to check, and receives their values.
 
 
 Attribute Values
@@ -27,7 +32,7 @@ Attribute Values
 
 An attribute for a path can be in one of four states: Set, Unset,
 Unspecified or set to a string, and `.value` member of `struct
-git_attr_check` records it.  There are three macros to check these:
+attr_check_item` records it.  There are three macros to check these:
 
 `ATTR_TRUE()`::
 
@@ -48,49 +53,51 @@ value of the attribute for the path.
 Querying Specific Attributes
 ----------------------------
 
-* Prepare an array of `struct git_attr_check` to define the list of
-  attributes you would want to check.  To populate this array, you would
-  need to define necessary attributes by calling `git_attr()` function.
+* Prepare `struct attr_check` using attr_check_initl()
+  function, enumerating the names of attributes whose values you are
+  interested in, terminated with a NULL pointer.  Alternatively, an
+  empty `struct attr_check` can be prepared by calling
+  `attr_check_alloc()` function and then attributes you want to
+  ask about can be added to it with `attr_check_append()`
+  function.
 
 * Call `git_check_attr()` to check the attributes for the path.
 
-* Inspect `git_attr_check` structure to see how each of the attribute in
-  the array is defined for the path.
+* Inspect the `attr_check` structure to see how each
+  attribute in the array is defined for the path.
 
 
 Example
 -------
 
-To see how attributes "crlf" and "indent" are set for different paths.
+To see how attributes "crlf" and "ident" are set for different paths.
 
-. Prepare an array of `struct git_attr_check` with two elements (because
-  we are checking two attributes).  Initialize their `attr` member with
-  pointers to `struct git_attr` obtained by calling `git_attr()`:
+. Prepare a `struct attr_check` with two elements (because
+  we are checking two attributes):
 
 ------------
-static struct git_attr_check check[2];
+static struct attr_check *check;
 static void setup_check(void)
 {
-	if (check[0].attr)
+	if (check)
 		return; /* already done */
-	check[0].attr = git_attr("crlf");
-	check[1].attr = git_attr("ident");
+	check = attr_check_initl("crlf", "ident", NULL);
 }
 ------------
 
-. Call `git_check_attr()` with the prepared array of `struct git_attr_check`:
+. Call `git_check_attr()` with the prepared `struct attr_check`:
 
 ------------
 	const char *path;
 
 	setup_check();
-	git_check_attr(path, ARRAY_SIZE(check), check);
+	git_check_attr(path, check);
 ------------
 
-. Act on `.value` member of the result, left in `check[]`:
+. Act on `.value` member of the result, left in `check->items[]`:
 
 ------------
-	const char *value = check[0].value;
+	const char *value = check->items[0].value;
 
 	if (ATTR_TRUE(value)) {
 		The attribute is Set, by listing only the name of the
@@ -109,20 +116,39 @@ static void setup_check(void)
 	}
 ------------
 
+To see how attributes in argv[] are set for different paths, only
+the first step in the above would be different.
+
+------------
+static struct attr_check *check;
+static void setup_check(const char **argv)
+{
+	check = attr_check_alloc();
+	while (*argv) {
+		struct git_attr *attr = git_attr(*argv);
+		attr_check_append(check, attr);
+		argv++;
+	}
+}
+------------
+
 
 Querying All Attributes
 -----------------------
 
 To get the values of all attributes associated with a file:
 
-* Call `git_all_attrs()`, which returns an array of `git_attr_check`
-  structures.
+* Prepare an empty `attr_check` structure by calling
+  `attr_check_alloc()`.
+
+* Call `git_all_attrs()`, which populates the `attr_check`
+  with the attributes attached to the path.
 
-* Iterate over the `git_attr_check` array to examine the attribute
-  names and values.  The name of the attribute described by a
-  `git_attr_check` object can be retrieved via
-  `git_attr_name(check[i].attr)`.  (Please note that no items will be
-  returned for unset attributes, so `ATTR_UNSET()` will return false
-  for all returned `git_array_check` objects.)
+* Iterate over the `attr_check.items[]` array to examine
+  the attribute names and values.  The name of the attribute
+  described by an `attr_check.items[]` object can be retrieved via
+  `git_attr_name(check->items[i].attr)`.  (Please note that no items
+  will be returned for unset attributes, so `ATTR_UNSET()` will return
+  false for all returned `attr_check.items[]` objects.)
 
-* Free the `git_array_check` array.
+* Free the `attr_check` struct by calling `attr_check_free()`.
diff --git a/attr.c b/attr.c
index 40818246f..c0e7893b5 100644
--- a/attr.c
+++ b/attr.c
@@ -890,7 +890,8 @@ static void collect_some_attrs(const char *path, int num,
 		rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-int git_check_attrs(const char *path, int num, struct attr_check_item *check)
+static int git_check_attrs(const char *path, int num,
+			   struct attr_check_item *check)
 {
 	int i;
 
diff --git a/attr.h b/attr.h
index 9f2729842..b2cfd8550 100644
--- a/attr.h
+++ b/attr.h
@@ -52,7 +52,6 @@ extern void attr_check_free(struct attr_check *check);
  */
 extern const char *git_attr_name(const struct git_attr *);
 
-int git_check_attrs(const char *path, int, struct attr_check_item *);
 extern int git_check_attr(const char *path, struct attr_check *check);
 
 /*
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 19/27] attr: pass struct attr_check to collect_some_attrs
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (17 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 18/27] attr: retire git_check_attrs() API Brandon Williams
@ 2017-01-28  2:01     ` Brandon Williams
  2017-01-28  2:02     ` [PATCH v3 20/27] attr: change validity check for attribute names to use positive logic Brandon Williams
                       ` (8 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:01 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

The old callchain used to take an array of attr_check_item items.
Instead pass the 'attr_check' container object to 'collect_some_attrs()'
and access the fields in the data structure directly.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 33 +++++++++++++--------------------
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/attr.c b/attr.c
index c0e7893b5..81a3c74d8 100644
--- a/attr.c
+++ b/attr.c
@@ -846,9 +846,7 @@ static int macroexpand_one(int nr, int rem)
  * check_all_attr. If num is non-zero, only attributes in check[] are
  * collected. Otherwise all attributes are collected.
  */
-static void collect_some_attrs(const char *path, int num,
-			       struct attr_check_item *check)
-
+static void collect_some_attrs(const char *path, struct attr_check *check)
 {
 	struct attr_stack *stk;
 	int i, pathlen, rem, dirlen;
@@ -871,17 +869,18 @@ static void collect_some_attrs(const char *path, int num,
 	prepare_attr_stack(path, dirlen);
 	for (i = 0; i < attr_nr; i++)
 		check_all_attr[i].value = ATTR__UNKNOWN;
-	if (num && !cannot_trust_maybe_real) {
+	if (check->nr && !cannot_trust_maybe_real) {
 		rem = 0;
-		for (i = 0; i < num; i++) {
-			if (!check[i].attr->maybe_real) {
+		for (i = 0; i < check->nr; i++) {
+			const struct git_attr *a = check->items[i].attr;
+			if (!a->maybe_real) {
 				struct attr_check_item *c;
-				c = check_all_attr + check[i].attr->attr_nr;
+				c = check_all_attr + a->attr_nr;
 				c->value = ATTR__UNSET;
 				rem++;
 			}
 		}
-		if (rem == num)
+		if (rem == check->nr)
 			return;
 	}
 
@@ -890,18 +889,17 @@ static void collect_some_attrs(const char *path, int num,
 		rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-static int git_check_attrs(const char *path, int num,
-			   struct attr_check_item *check)
+int git_check_attr(const char *path, struct attr_check *check)
 {
 	int i;
 
-	collect_some_attrs(path, num, check);
+	collect_some_attrs(path, check);
 
-	for (i = 0; i < num; i++) {
-		const char *value = check_all_attr[check[i].attr->attr_nr].value;
+	for (i = 0; i < check->nr; i++) {
+		const char *value = check_all_attr[check->items[i].attr->attr_nr].value;
 		if (value == ATTR__UNKNOWN)
 			value = ATTR__UNSET;
-		check[i].value = value;
+		check->items[i].value = value;
 	}
 
 	return 0;
@@ -912,7 +910,7 @@ void git_all_attrs(const char *path, struct attr_check *check)
 	int i;
 
 	attr_check_reset(check);
-	collect_some_attrs(path, check->nr, check->items);
+	collect_some_attrs(path, check);
 
 	for (i = 0; i < attr_nr; i++) {
 		const char *name = check_all_attr[i].attr->name;
@@ -925,11 +923,6 @@ void git_all_attrs(const char *path, struct attr_check *check)
 	}
 }
 
-int git_check_attr(const char *path, struct attr_check *check)
-{
-	return git_check_attrs(path, check->nr, check->items);
-}
-
 void git_attr_set_direction(enum git_attr_direction new, struct index_state *istate)
 {
 	enum git_attr_direction old = direction;
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 20/27] attr: change validity check for attribute names to use positive logic
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (18 preceding siblings ...)
  2017-01-28  2:01     ` [PATCH v3 19/27] attr: pass struct attr_check to collect_some_attrs Brandon Williams
@ 2017-01-28  2:02     ` Brandon Williams
  2017-01-28  2:02     ` [PATCH v3 21/27] attr: use hashmap for attribute dictionary Brandon Williams
                       ` (7 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:02 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, sbeller, pclouds, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

Convert 'invalid_attr_name()' to 'attr_name_valid()' and use positive
logic for the return value.  In addition create a helper function that
prints out an error message when an invalid attribute name is used.

We could later update the message to exactly spell out what the
rules for a good attribute name are, etc.
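
For readers who want to try the rule outside of git, the positive-logic
check can be sketched as a standalone function.  This mirrors the
function in the diff below, except that the empty-name test is written
as 'namelen == 0' because the length is unsigned here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if name[0..namelen) is a valid attribute name, 0 otherwise.
 * A name must be non-empty, must not begin with '-', and may only
 * contain characters from [-A-Za-z0-9_.].
 */
static int attr_name_valid(const char *name, size_t namelen)
{
	if (namelen == 0 || *name == '-')
		return 0;
	while (namelen--) {
		char ch = *name++;

		if (!(ch == '-' || ch == '.' || ch == '_' ||
		      ('0' <= ch && ch <= '9') ||
		      ('a' <= ch && ch <= 'z') ||
		      ('A' <= ch && ch <= 'Z')))
			return 0;
	}
	return 1;
}
```

With positive logic, call sites read as "if (!attr_name_valid(...))",
which avoids the double negative of "if (invalid_attr_name(...))"
returning -1 for failure.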

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/attr.c b/attr.c
index 81a3c74d8..9fe848f59 100644
--- a/attr.c
+++ b/attr.c
@@ -74,23 +74,33 @@ static unsigned hash_name(const char *name, int namelen)
 	return val;
 }
 
-static int invalid_attr_name(const char *name, int namelen)
+static int attr_name_valid(const char *name, size_t namelen)
 {
 	/*
 	 * Attribute name cannot begin with '-' and must consist of
 	 * characters from [-A-Za-z0-9_.].
 	 */
 	if (namelen <= 0 || *name == '-')
-		return -1;
+		return 0;
 	while (namelen--) {
 		char ch = *name++;
 		if (! (ch == '-' || ch == '.' || ch == '_' ||
 		       ('0' <= ch && ch <= '9') ||
 		       ('a' <= ch && ch <= 'z') ||
 		       ('A' <= ch && ch <= 'Z')) )
-			return -1;
+			return 0;
 	}
-	return 0;
+	return 1;
+}
+
+static void report_invalid_attr(const char *name, size_t len,
+				const char *src, int lineno)
+{
+	struct strbuf err = STRBUF_INIT;
+	strbuf_addf(&err, _("%.*s is not a valid attribute name"),
+		    (int) len, name);
+	fprintf(stderr, "%s: %s:%d\n", err.buf, src, lineno);
+	strbuf_release(&err);
 }
 
 static struct git_attr *git_attr_internal(const char *name, int len)
@@ -105,7 +115,7 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 			return a;
 	}
 
-	if (invalid_attr_name(name, len))
+	if (!attr_name_valid(name, len))
 		return NULL;
 
 	FLEX_ALLOC_MEM(a, name, name, len);
@@ -196,17 +206,15 @@ static const char *parse_attr(const char *src, int lineno, const char *cp,
 			cp++;
 			len--;
 		}
-		if (invalid_attr_name(cp, len)) {
-			fprintf(stderr,
-				"%.*s is not a valid attribute name: %s:%d\n",
-				len, cp, src, lineno);
+		if (!attr_name_valid(cp, len)) {
+			report_invalid_attr(cp, len, src, lineno);
 			return NULL;
 		}
 	} else {
 		/*
 		 * As this function is always called twice, once with
 		 * e == NULL in the first pass and then e != NULL in
-		 * the second pass, no need for invalid_attr_name()
+		 * the second pass, no need for attr_name_valid()
 		 * check here.
 		 */
 		if (*cp == '-' || *cp == '!') {
@@ -258,10 +266,8 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		name += strlen(ATTRIBUTE_MACRO_PREFIX);
 		name += strspn(name, blank);
 		namelen = strcspn(name, blank);
-		if (invalid_attr_name(name, namelen)) {
-			fprintf(stderr,
-				"%.*s is not a valid attribute name: %s:%d\n",
-				namelen, name, src, lineno);
+		if (!attr_name_valid(name, namelen)) {
+			report_invalid_attr(name, namelen, src, lineno);
 			goto fail_return;
 		}
 	}
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 21/27] attr: use hashmap for attribute dictionary
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (19 preceding siblings ...)
  2017-01-28  2:02     ` [PATCH v3 20/27] attr: change validity check for attribute names to use positive logic Brandon Williams
@ 2017-01-28  2:02     ` Brandon Williams
  2017-01-28  2:02     ` [PATCH v3 22/27] attr: eliminate global check_all_attr array Brandon Williams
                       ` (6 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:02 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

The current implementation of the attribute dictionary uses a custom
hashtable.  This patch modernizes the dictionary by converting it to the
built-in 'hashmap' structure.

Also, in order to enable a threaded API in the future, add an
accompanying mutex, which must be acquired prior to accessing the
dictionary of interned attributes.
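
The locking pattern can be sketched independently of git's hashmap.
The example below is only an illustration: it uses a small growable
array as a toy dictionary, and 'demo_attr', 'demo_dict' and
'demo_intern' are hypothetical names.  The point is that both the
lookup and the insertion happen while the mutex is held, so two threads
cannot intern the same name twice.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct git_attr. */
struct demo_attr {
	int attr_nr;  /* assigned in insertion order */
	char *name;
};

/* Toy dictionary: a growable array, not git's hashmap. */
static struct demo_attr **demo_dict;
static int demo_dict_nr;
static pthread_mutex_t demo_dict_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Look up 'name'; if it is missing, create an entry and store it.
 * Returns NULL only if memory allocation fails.
 */
static const struct demo_attr *demo_intern(const char *name)
{
	struct demo_attr *a = NULL;
	int i;

	pthread_mutex_lock(&demo_dict_mutex);

	for (i = 0; i < demo_dict_nr; i++) {
		if (!strcmp(demo_dict[i]->name, name)) {
			a = demo_dict[i];
			break;
		}
	}

	if (!a) {
		struct demo_attr *fresh = malloc(sizeof(*fresh));
		char *copy = malloc(strlen(name) + 1);
		struct demo_attr **grown =
			realloc(demo_dict, (demo_dict_nr + 1) * sizeof(*grown));

		if (grown)
			demo_dict = grown;
		if (fresh && copy && grown) {
			strcpy(copy, name);
			fresh->name = copy;
			/* The number is the dictionary size before insertion. */
			fresh->attr_nr = demo_dict_nr;
			demo_dict[demo_dict_nr++] = fresh;
			a = fresh;
		} else {
			free(fresh);
			free(copy);
		}
	}

	pthread_mutex_unlock(&demo_dict_mutex);
	return a;
}
```

The static initializer is used here for brevity; the patch itself
initializes its mutex at startup through attr_start().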

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c        | 173 +++++++++++++++++++++++++++++++++++++++++++---------------
 attr.h        |   2 +
 common-main.c |   3 +
 3 files changed, 133 insertions(+), 45 deletions(-)

diff --git a/attr.c b/attr.c
index 9fe848f59..e008f3026 100644
--- a/attr.c
+++ b/attr.c
@@ -14,6 +14,7 @@
 #include "dir.h"
 #include "utf8.h"
 #include "quote.h"
+#include "thread-utils.h"
 
 const char git_attr__true[] = "(builtin)true";
 const char git_attr__false[] = "\0(builtin)false";
@@ -23,28 +24,17 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 #define ATTR__UNSET NULL
 #define ATTR__UNKNOWN git_attr__unknown
 
-/* This is a randomly chosen prime. */
-#define HASHSIZE 257
-
 #ifndef DEBUG_ATTR
 #define DEBUG_ATTR 0
 #endif
 
-/*
- * NEEDSWORK: the global dictionary of the interned attributes
- * must stay a singleton even after we become thread-ready.
- * Access to these must be surrounded with mutex when it happens.
- */
 struct git_attr {
-	struct git_attr *next;
-	unsigned h;
-	int attr_nr;
+	int attr_nr; /* unique attribute number */
 	int maybe_macro;
 	int maybe_real;
-	char name[FLEX_ARRAY];
+	char name[FLEX_ARRAY]; /* attribute name */
 };
 static int attr_nr;
-static struct git_attr *(git_attr_hash[HASHSIZE]);
 
 /*
  * NEEDSWORK: maybe-real, maybe-macro are not property of
@@ -63,15 +53,94 @@ const char *git_attr_name(const struct git_attr *attr)
 	return attr->name;
 }
 
-static unsigned hash_name(const char *name, int namelen)
+struct attr_hashmap {
+	struct hashmap map;
+#ifndef NO_PTHREADS
+	pthread_mutex_t mutex;
+#endif
+};
+
+static inline void hashmap_lock(struct attr_hashmap *map)
+{
+#ifndef NO_PTHREADS
+	pthread_mutex_lock(&map->mutex);
+#endif
+}
+
+static inline void hashmap_unlock(struct attr_hashmap *map)
 {
-	unsigned val = 0, c;
+#ifndef NO_PTHREADS
+	pthread_mutex_unlock(&map->mutex);
+#endif
+}
 
-	while (namelen--) {
-		c = *name++;
-		val = ((val << 7) | (val >> 22)) ^ c;
-	}
-	return val;
+/*
+ * The global dictionary of all interned attributes.  This
+ * is a singleton object which is shared between threads.
+ * Access to this dictionary must be surrounded with a mutex.
+ */
+static struct attr_hashmap g_attr_hashmap;
+
+/* The container for objects stored in "struct attr_hashmap" */
+struct attr_hash_entry {
+	struct hashmap_entry ent; /* must be the first member! */
+	const char *key; /* the key; memory should be owned by value */
+	size_t keylen; /* length of the key */
+	void *value; /* the stored value */
+};
+
+/* attr_hashmap comparison function */
+static int attr_hash_entry_cmp(const struct attr_hash_entry *a,
+			       const struct attr_hash_entry *b,
+			       void *unused)
+{
+	return (a->keylen != b->keylen) || strncmp(a->key, b->key, a->keylen);
+}
+
+/* Initialize an 'attr_hashmap' object */
+static void attr_hashmap_init(struct attr_hashmap *map)
+{
+	hashmap_init(&map->map, (hashmap_cmp_fn) attr_hash_entry_cmp, 0);
+}
+
+/*
+ * Retrieve the 'value' stored in a hashmap given the provided 'key'.
+ * If there is no matching entry, return NULL.
+ */
+static void *attr_hashmap_get(struct attr_hashmap *map,
+			      const char *key, size_t keylen)
+{
+	struct attr_hash_entry k;
+	struct attr_hash_entry *e;
+
+	if (!map->map.tablesize)
+		attr_hashmap_init(map);
+
+	hashmap_entry_init(&k, memhash(key, keylen));
+	k.key = key;
+	k.keylen = keylen;
+	e = hashmap_get(&map->map, &k, NULL);
+
+	return e ? e->value : NULL;
+}
+
+/* Add 'value' to a hashmap based on the provided 'key'. */
+static void attr_hashmap_add(struct attr_hashmap *map,
+			     const char *key, size_t keylen,
+			     void *value)
+{
+	struct attr_hash_entry *e;
+
+	if (!map->map.tablesize)
+		attr_hashmap_init(map);
+
+	e = xmalloc(sizeof(struct attr_hash_entry));
+	hashmap_entry_init(e, memhash(key, keylen));
+	e->key = key;
+	e->keylen = keylen;
+	e->value = value;
+
+	hashmap_add(&map->map, e);
 }
 
 static int attr_name_valid(const char *name, size_t namelen)
@@ -103,37 +172,44 @@ static void report_invalid_attr(const char *name, size_t len,
 	strbuf_release(&err);
 }
 
-static struct git_attr *git_attr_internal(const char *name, int len)
+/*
+ * Given a 'name', lookup and return the corresponding attribute in the global
+ * dictionary.  If no entry is found, create a new attribute and store it in
+ * the dictionary.
+ */
+static struct git_attr *git_attr_internal(const char *name, int namelen)
 {
-	unsigned hval = hash_name(name, len);
-	unsigned pos = hval % HASHSIZE;
 	struct git_attr *a;
 
-	for (a = git_attr_hash[pos]; a; a = a->next) {
-		if (a->h == hval &&
-		    !memcmp(a->name, name, len) && !a->name[len])
-			return a;
-	}
-
-	if (!attr_name_valid(name, len))
+	if (!attr_name_valid(name, namelen))
 		return NULL;
 
-	FLEX_ALLOC_MEM(a, name, name, len);
-	a->h = hval;
-	a->next = git_attr_hash[pos];
-	a->attr_nr = attr_nr++;
-	a->maybe_macro = 0;
-	a->maybe_real = 0;
-	git_attr_hash[pos] = a;
+	hashmap_lock(&g_attr_hashmap);
+
+	a = attr_hashmap_get(&g_attr_hashmap, name, namelen);
+
+	if (!a) {
+		FLEX_ALLOC_MEM(a, name, name, namelen);
+		a->attr_nr = g_attr_hashmap.map.size;
+		a->maybe_real = 0;
+		a->maybe_macro = 0;
+
+		attr_hashmap_add(&g_attr_hashmap, a->name, namelen, a);
+		assert(a->attr_nr == (g_attr_hashmap.map.size - 1));
+
+		/*
+		 * NEEDSWORK: per git_attr_check check_all_attr
+		 * will be initialized a lot more lazily, not
+		 * like this, and not here.
+		 */
+		REALLOC_ARRAY(check_all_attr, ++attr_nr);
+		check_all_attr[a->attr_nr].attr = a;
+		check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
+		assert(a->attr_nr == (attr_nr - 1));
+	}
+
+	hashmap_unlock(&g_attr_hashmap);
 
-	/*
-	 * NEEDSWORK: per git_attr_check check_all_attr
-	 * will be initialized a lot more lazily, not
-	 * like this, and not here.
-	 */
-	REALLOC_ARRAY(check_all_attr, attr_nr);
-	check_all_attr[a->attr_nr].attr = a;
-	check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
 	return a;
 }
 
@@ -941,3 +1017,10 @@ void git_attr_set_direction(enum git_attr_direction new, struct index_state *ist
 		drop_attr_stack();
 	use_index = istate;
 }
+
+void attr_start(void)
+{
+#ifndef NO_PTHREADS
+	pthread_mutex_init(&g_attr_hashmap.mutex, NULL);
+#endif
+}
diff --git a/attr.h b/attr.h
index b2cfd8550..898e1a8c9 100644
--- a/attr.h
+++ b/attr.h
@@ -67,4 +67,6 @@ enum git_attr_direction {
 };
 void git_attr_set_direction(enum git_attr_direction, struct index_state *);
 
+extern void attr_start(void);
+
 #endif /* ATTR_H */
diff --git a/common-main.c b/common-main.c
index c654f9555..6a689007e 100644
--- a/common-main.c
+++ b/common-main.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "exec_cmd.h"
+#include "attr.h"
 
 /*
  * Many parts of Git have subprograms communicate via pipe, expect the
@@ -33,6 +34,8 @@ int main(int argc, const char **argv)
 
 	git_setup_gettext();
 
+	attr_start();
+
 	git_extract_argv0_path(argv[0]);
 
 	restore_sigpipe_to_default();
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 22/27] attr: eliminate global check_all_attr array
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (20 preceding siblings ...)
  2017-01-28  2:02     ` [PATCH v3 21/27] attr: use hashmap for attribute dictionary Brandon Williams
@ 2017-01-28  2:02     ` Brandon Williams
  2017-01-28  2:02     ` [PATCH v3 23/27] attr: remove maybe-real, maybe-macro from git_attr Brandon Williams
                       ` (5 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:02 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

The attribute system currently relies on 'check_all_attr', a global
array of 'attr_check_item' items that stores the value of each
attribute during the collection process.

This patch eliminates this global and instead creates an array per
'attr_check' instance which is then used in the attribute collection
process.  This brings the attribute system one step closer to being
thread-safe.
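
The per-instance array can be pictured with the following simplified
sketch.  It is not the code from this patch: 'demo_scratch',
'demo_slot' and 'demo_all_init' are hypothetical names, the dictionary
is reduced to its size, and errors are reported with a return value
rather than die().

```c
#include <assert.h>
#include <stdlib.h>

/* Sentinel meaning "not determined yet", compared by address. */
static const char demo_unknown[] = "(unknown)";

struct demo_slot {
	int attr_nr;        /* which attribute this slot belongs to */
	const char *value;  /* value collected for the current path */
};

/* Hypothetical stand-in for the all_attrs fields of attr_check. */
struct demo_scratch {
	int all_nr;
	struct demo_slot *all;
};

/*
 * Make check->all match a dictionary of 'dict_size' attributes and
 * reset every value.  Returns 0 on success, or -1 if the dictionary
 * appears to have shrunk or if allocation fails.
 */
static int demo_all_init(struct demo_scratch *check, int dict_size)
{
	int i;

	if (dict_size < check->all_nr)
		return -1; /* interned attributes are never deleted */

	/* Reallocate only when the dictionary has grown. */
	if (dict_size != check->all_nr) {
		struct demo_slot *grown =
			realloc(check->all, dict_size * sizeof(*grown));

		if (!grown)
			return -1;
		for (i = check->all_nr; i < dict_size; i++)
			grown[i].attr_nr = i;
		check->all = grown;
		check->all_nr = dict_size;
	}

	/* Every query starts from a clean slate. */
	for (i = 0; i < check->all_nr; i++)
		check->all[i].value = demo_unknown;
	return 0;
}
```

Because each caller owns its own array, two queries no longer write to
the same memory during collection; only the size comparison and the
growth step need to consult shared state.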

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 121 ++++++++++++++++++++++++++++++++++++++++++++---------------------
 attr.h |   5 +++
 2 files changed, 87 insertions(+), 39 deletions(-)

diff --git a/attr.c b/attr.c
index e008f3026..2637804b1 100644
--- a/attr.c
+++ b/attr.c
@@ -34,7 +34,6 @@ struct git_attr {
 	int maybe_real;
 	char name[FLEX_ARRAY]; /* attribute name */
 };
-static int attr_nr;
 
 /*
  * NEEDSWORK: maybe-real, maybe-macro are not property of
@@ -45,9 +44,6 @@ static int attr_nr;
  */
 static int cannot_trust_maybe_real;
 
-/* NEEDSWORK: This will become per git_attr_check */
-static struct attr_check_item *check_all_attr;
-
 const char *git_attr_name(const struct git_attr *attr)
 {
 	return attr->name;
@@ -143,6 +139,57 @@ static void attr_hashmap_add(struct attr_hashmap *map,
 	hashmap_add(&map->map, e);
 }
 
+struct all_attrs_item {
+	const struct git_attr *attr;
+	const char *value;
+};
+
+/*
+ * Reallocate and reinitialize the array of all attributes (which is used in
+ * the attribute collection process) in 'check' based on the global dictionary
+ * of attributes.
+ */
+static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check)
+{
+	int i;
+
+	hashmap_lock(map);
+
+	if (map->map.size < check->all_attrs_nr)
+		die("BUG: interned attributes shouldn't be deleted");
+
+	/*
+	 * If the number of attributes in the global dictionary has increased
+	 * (or this attr_check instance doesn't have an initialized all_attrs
+	 * field), reallocate the provided attr_check instance's all_attrs
+	 * field and fill each entry with its corresponding git_attr.
+	 */
+	if (map->map.size != check->all_attrs_nr) {
+		struct attr_hash_entry *e;
+		struct hashmap_iter iter;
+		hashmap_iter_init(&map->map, &iter);
+
+		REALLOC_ARRAY(check->all_attrs, map->map.size);
+		check->all_attrs_nr = map->map.size;
+
+		while ((e = hashmap_iter_next(&iter))) {
+			const struct git_attr *a = e->value;
+			check->all_attrs[a->attr_nr].attr = a;
+		}
+	}
+
+	hashmap_unlock(map);
+
+	/*
+	 * Re-initialize every entry in check->all_attrs.
+	 * This re-initialization can live outside of the locked region since
+	 * the attribute dictionary is no longer being accessed.
+	 */
+	for (i = 0; i < check->all_attrs_nr; i++) {
+		check->all_attrs[i].value = ATTR__UNKNOWN;
+	}
+}
+
 static int attr_name_valid(const char *name, size_t namelen)
 {
 	/*
@@ -196,16 +243,6 @@ static struct git_attr *git_attr_internal(const char *name, int namelen)
 
 		attr_hashmap_add(&g_attr_hashmap, a->name, namelen, a);
 		assert(a->attr_nr == (g_attr_hashmap.map.size - 1));
-
-		/*
-		 * NEEDSWORK: per git_attr_check check_all_attr
-		 * will be initialized a lot more lazily, not
-		 * like this, and not here.
-		 */
-		REALLOC_ARRAY(check_all_attr, ++attr_nr);
-		check_all_attr[a->attr_nr].attr = a;
-		check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
-		assert(a->attr_nr == (attr_nr - 1));
 	}
 
 	hashmap_unlock(&g_attr_hashmap);
@@ -513,6 +550,10 @@ void attr_check_clear(struct attr_check *check)
 	check->items = NULL;
 	check->alloc = 0;
 	check->nr = 0;
+
+	free(check->all_attrs);
+	check->all_attrs = NULL;
+	check->all_attrs_nr = 0;
 }
 
 void attr_check_free(struct attr_check *check)
@@ -860,16 +901,16 @@ static int path_matches(const char *pathname, int pathlen,
 			      pattern, prefix, pat->patternlen, pat->flags);
 }
 
-static int macroexpand_one(int attr_nr, int rem);
+static int macroexpand_one(struct all_attrs_item *all_attrs, int nr, int rem);
 
-static int fill_one(const char *what, struct match_attr *a, int rem)
+static int fill_one(const char *what, struct all_attrs_item *all_attrs,
+		    struct match_attr *a, int rem)
 {
-	struct attr_check_item *check = check_all_attr;
 	int i;
 
-	for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) {
+	for (i = a->num_attr - 1; rem > 0 && i >= 0; i--) {
 		struct git_attr *attr = a->state[i].attr;
-		const char **n = &(check[attr->attr_nr].value);
+		const char **n = &(all_attrs[attr->attr_nr].value);
 		const char *v = a->state[i].setto;
 
 		if (*n == ATTR__UNKNOWN) {
@@ -878,14 +919,15 @@ static int fill_one(const char *what, struct match_attr *a, int rem)
 				  attr, v);
 			*n = v;
 			rem--;
-			rem = macroexpand_one(attr->attr_nr, rem);
+			rem = macroexpand_one(all_attrs, attr->attr_nr, rem);
 		}
 	}
 	return rem;
 }
 
 static int fill(const char *path, int pathlen, int basename_offset,
-		struct attr_stack *stk, int rem)
+		struct attr_stack *stk, struct all_attrs_item *all_attrs,
+		int rem)
 {
 	int i;
 	const char *base = stk->origin ? stk->origin : "";
@@ -896,18 +938,18 @@ static int fill(const char *path, int pathlen, int basename_offset,
 			continue;
 		if (path_matches(path, pathlen, basename_offset,
 				 &a->u.pat, base, stk->originlen))
-			rem = fill_one("fill", a, rem);
+			rem = fill_one("fill", all_attrs, a, rem);
 	}
 	return rem;
 }
 
-static int macroexpand_one(int nr, int rem)
+static int macroexpand_one(struct all_attrs_item *all_attrs, int nr, int rem)
 {
 	struct attr_stack *stk;
 	int i;
 
-	if (check_all_attr[nr].value != ATTR__TRUE ||
-	    !check_all_attr[nr].attr->maybe_macro)
+	if (all_attrs[nr].value != ATTR__TRUE ||
+	    !all_attrs[nr].attr->maybe_macro)
 		return rem;
 
 	for (stk = attr_stack; stk; stk = stk->prev) {
@@ -916,7 +958,7 @@ static int macroexpand_one(int nr, int rem)
 			if (!ma->is_macro)
 				continue;
 			if (ma->u.attr->attr_nr == nr)
-				return fill_one("expand", ma, rem);
+				return fill_one("expand", all_attrs, ma, rem);
 		}
 	}
 
@@ -924,9 +966,9 @@ static int macroexpand_one(int nr, int rem)
 }
 
 /*
- * Collect attributes for path into the array pointed to by
- * check_all_attr. If num is non-zero, only attributes in check[] are
- * collected. Otherwise all attributes are collected.
+ * Collect attributes for path into the array pointed to by check->all_attrs.
+ * If check->nr is non-zero, only attributes in check->items[] are collected.
+ * Otherwise all attributes are collected.
  */
 static void collect_some_attrs(const char *path, struct attr_check *check)
 {
@@ -949,15 +991,15 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 	}
 
 	prepare_attr_stack(path, dirlen);
-	for (i = 0; i < attr_nr; i++)
-		check_all_attr[i].value = ATTR__UNKNOWN;
+	all_attrs_init(&g_attr_hashmap, check);
+
 	if (check->nr && !cannot_trust_maybe_real) {
 		rem = 0;
 		for (i = 0; i < check->nr; i++) {
 			const struct git_attr *a = check->items[i].attr;
 			if (!a->maybe_real) {
-				struct attr_check_item *c;
-				c = check_all_attr + a->attr_nr;
+				struct all_attrs_item *c;
+				c = check->all_attrs + a->attr_nr;
 				c->value = ATTR__UNSET;
 				rem++;
 			}
@@ -966,9 +1008,9 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 			return;
 	}
 
-	rem = attr_nr;
+	rem = check->all_attrs_nr;
 	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
-		rem = fill(path, pathlen, basename_offset, stk, rem);
+		rem = fill(path, pathlen, basename_offset, stk, check->all_attrs, rem);
 }
 
 int git_check_attr(const char *path, struct attr_check *check)
@@ -978,7 +1020,8 @@ int git_check_attr(const char *path, struct attr_check *check)
 	collect_some_attrs(path, check);
 
 	for (i = 0; i < check->nr; i++) {
-		const char *value = check_all_attr[check->items[i].attr->attr_nr].value;
+		size_t n = check->items[i].attr->attr_nr;
+		const char *value = check->all_attrs[n].value;
 		if (value == ATTR__UNKNOWN)
 			value = ATTR__UNSET;
 		check->items[i].value = value;
@@ -994,9 +1037,9 @@ void git_all_attrs(const char *path, struct attr_check *check)
 	attr_check_reset(check);
 	collect_some_attrs(path, check);
 
-	for (i = 0; i < attr_nr; i++) {
-		const char *name = check_all_attr[i].attr->name;
-		const char *value = check_all_attr[i].value;
+	for (i = 0; i < check->all_attrs_nr; i++) {
+		const char *name = check->all_attrs[i].attr->name;
+		const char *value = check->all_attrs[i].value;
 		struct attr_check_item *item;
 		if (value == ATTR__UNSET || value == ATTR__UNKNOWN)
 			continue;
diff --git a/attr.h b/attr.h
index 898e1a8c9..5aaf55c3e 100644
--- a/attr.h
+++ b/attr.h
@@ -4,6 +4,9 @@
 /* An attribute is a pointer to this opaque structure */
 struct git_attr;
 
+/* opaque structure used internally for attribute collection */
+struct all_attrs_item;
+
 /*
  * Given a string, return the gitattribute object that
  * corresponds to it.
@@ -33,6 +36,8 @@ struct attr_check {
 	int nr;
 	int alloc;
 	struct attr_check_item *items;
+	int all_attrs_nr;
+	struct all_attrs_item *all_attrs;
 };
 
 extern struct attr_check *attr_check_alloc(void);
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 23/27] attr: remove maybe-real, maybe-macro from git_attr
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (21 preceding siblings ...)
  2017-01-28  2:02     ` [PATCH v3 22/27] attr: eliminate global check_all_attr array Brandon Williams
@ 2017-01-28  2:02     ` Brandon Williams
  2017-01-28  2:02     ` [PATCH v3 24/27] attr: tighten const correctness with git_attr and match_attr Brandon Williams
                       ` (4 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:02 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

Whether a git attribute is real or a macro isn't a property of the
attribute itself; it depends on the attribute stack (that is, on which
.gitattributes files were read).

This patch removes the 'maybe_real' and 'maybe_macro' fields from
git_attr and instead adds a 'macro' field to all_attrs_item.  If
non-NULL, 'macro' indicates that a particular attribute is a macro for
the given attribute stack.  It's populated, through a quick scan of the
attribute stack, with the match_attr that corresponds to the macro's
definition.  This way the attribute stack only needs to be scanned a
single time prior to attribute collection instead of each time a macro
needs to be expanded.
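
The single scan can be sketched with simplified, hypothetical types
('demo_match', 'demo_frame', 'demo_determine_macros'); the real
structures carry more fields.  The first definition encountered for an
attribute number is kept, and later ones are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for match_attr. */
struct demo_match {
	int is_macro;  /* non-zero if this entry defines a macro */
	int attr_nr;   /* attribute number the macro is named after */
};

/* Hypothetical, simplified stand-in for attr_stack. */
struct demo_frame {
	struct demo_frame *prev;          /* frame visited next */
	int num_matches;
	const struct demo_match **attrs;
};

/*
 * Walk the stack once and remember, per attribute number, the first
 * macro definition encountered.  'macros' must have one slot per
 * attribute, and every slot must be NULL on entry.
 */
static void demo_determine_macros(const struct demo_match **macros,
				  const struct demo_frame *stack)
{
	for (; stack; stack = stack->prev) {
		int i;

		/* Entries are visited from last to first within a frame. */
		for (i = stack->num_matches - 1; i >= 0; i--) {
			const struct demo_match *ma = stack->attrs[i];

			if (ma->is_macro && !macros[ma->attr_nr])
				macros[ma->attr_nr] = ma;
		}
	}
}
```

After this pass, expanding a macro is a single array lookup by
attribute number rather than another walk over every frame.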

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 75 +++++++++++++++++++++++++++++++++---------------------------------
 1 file changed, 37 insertions(+), 38 deletions(-)

diff --git a/attr.c b/attr.c
index 2637804b1..8f4402ef3 100644
--- a/attr.c
+++ b/attr.c
@@ -30,20 +30,9 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 
 struct git_attr {
 	int attr_nr; /* unique attribute number */
-	int maybe_macro;
-	int maybe_real;
 	char name[FLEX_ARRAY]; /* attribute name */
 };
 
-/*
- * NEEDSWORK: maybe-real, maybe-macro are not property of
- * an attribute, as it depends on what .gitattributes are
- * read.  Once we introduce per git_attr_check attr_stack
- * and check_all_attr, the optimization based on them will
- * become unnecessary and can go away.  So is this variable.
- */
-static int cannot_trust_maybe_real;
-
 const char *git_attr_name(const struct git_attr *attr)
 {
 	return attr->name;
@@ -142,6 +131,12 @@ static void attr_hashmap_add(struct attr_hashmap *map,
 struct all_attrs_item {
 	const struct git_attr *attr;
 	const char *value;
+	/*
+	 * If 'macro' is non-NULL, indicates that 'attr' is a macro based on
+	 * the current attribute stack and contains a pointer to the match_attr
+	 * definition of the macro
+	 */
+	const struct match_attr *macro;
 };
 
 /*
@@ -187,6 +182,7 @@ static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check)
 	 */
 	for (i = 0; i < check->all_attrs_nr; i++) {
 		check->all_attrs[i].value = ATTR__UNKNOWN;
+		check->all_attrs[i].macro = NULL;
 	}
 }
 
@@ -238,8 +234,6 @@ static struct git_attr *git_attr_internal(const char *name, int namelen)
 	if (!a) {
 		FLEX_ALLOC_MEM(a, name, name, namelen);
 		a->attr_nr = g_attr_hashmap.map.size;
-		a->maybe_real = 0;
-		a->maybe_macro = 0;
 
 		attr_hashmap_add(&g_attr_hashmap, a->name, namelen, a);
 		assert(a->attr_nr == (g_attr_hashmap.map.size - 1));
@@ -402,7 +396,6 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		      (is_macro ? 0 : namelen + 1));
 	if (is_macro) {
 		res->u.attr = git_attr_internal(name, namelen);
-		res->u.attr->maybe_macro = 1;
 	} else {
 		char *p = (char *)&(res->state[num_attr]);
 		memcpy(p, name, namelen);
@@ -423,10 +416,6 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	/* Second pass to fill the attr_states */
 	for (cp = states, i = 0; *cp; i++) {
 		cp = parse_attr(src, lineno, cp, &(res->state[i]));
-		if (!is_macro)
-			res->state[i].attr->maybe_real = 1;
-		if (res->state[i].attr->maybe_macro)
-			cannot_trust_maybe_real = 1;
 	}
 
 	strbuf_release(&pattern);
@@ -904,7 +893,7 @@ static int path_matches(const char *pathname, int pathlen,
 static int macroexpand_one(struct all_attrs_item *all_attrs, int nr, int rem);
 
 static int fill_one(const char *what, struct all_attrs_item *all_attrs,
-		    struct match_attr *a, int rem)
+		    const struct match_attr *a, int rem)
 {
 	int i;
 
@@ -945,24 +934,34 @@ static int fill(const char *path, int pathlen, int basename_offset,
 
 static int macroexpand_one(struct all_attrs_item *all_attrs, int nr, int rem)
 {
-	struct attr_stack *stk;
-	int i;
+	const struct all_attrs_item *item = &all_attrs[nr];
 
-	if (all_attrs[nr].value != ATTR__TRUE ||
-	    !all_attrs[nr].attr->maybe_macro)
+	if (item->macro && item->value == ATTR__TRUE)
+		return fill_one("expand", all_attrs, item->macro, rem);
+	else
 		return rem;
+}
 
-	for (stk = attr_stack; stk; stk = stk->prev) {
-		for (i = stk->num_matches - 1; 0 <= i; i--) {
-			struct match_attr *ma = stk->attrs[i];
-			if (!ma->is_macro)
-				continue;
-			if (ma->u.attr->attr_nr == nr)
-				return fill_one("expand", all_attrs, ma, rem);
+/*
+ * Marks the attributes which are macros based on the attribute stack.
+ * This prevents having to search through the attribute stack each time
+ * a macro needs to be expanded during the fill stage.
+ */
+static void determine_macros(struct all_attrs_item *all_attrs,
+			     const struct attr_stack *stack)
+{
+	for (; stack; stack = stack->prev) {
+		int i;
+		for (i = stack->num_matches - 1; i >= 0; i--) {
+			const struct match_attr *ma = stack->attrs[i];
+			if (ma->is_macro) {
+				int n = ma->u.attr->attr_nr;
+				if (!all_attrs[n].macro) {
+					all_attrs[n].macro = ma;
+				}
+			}
 		}
 	}
-
-	return rem;
 }
 
 /*
@@ -992,15 +991,15 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 
 	prepare_attr_stack(path, dirlen);
 	all_attrs_init(&g_attr_hashmap, check);
+	determine_macros(check->all_attrs, attr_stack);
 
-	if (check->nr && !cannot_trust_maybe_real) {
+	if (check->nr) {
 		rem = 0;
 		for (i = 0; i < check->nr; i++) {
-			const struct git_attr *a = check->items[i].attr;
-			if (!a->maybe_real) {
-				struct all_attrs_item *c;
-				c = check->all_attrs + a->attr_nr;
-				c->value = ATTR__UNSET;
+			int n = check->items[i].attr->attr_nr;
+			struct all_attrs_item *item = &check->all_attrs[n];
+			if (item->macro) {
+				item->value = ATTR__UNSET;
 				rem++;
 			}
 		}
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 24/27] attr: tighten const correctness with git_attr and match_attr
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (22 preceding siblings ...)
  2017-01-28  2:02     ` [PATCH v3 23/27] attr: remove maybe-real, maybe-macro from git_attr Brandon Williams
@ 2017-01-28  2:02     ` Brandon Williams
  2017-01-28  2:02     ` [PATCH v3 25/27] attr: store attribute stack in attr_check structure Brandon Williams
                       ` (3 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:02 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c               | 12 ++++++------
 attr.h               |  2 +-
 builtin/check-attr.c |  3 ++-
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/attr.c b/attr.c
index 8f4402ef3..69643ae77 100644
--- a/attr.c
+++ b/attr.c
@@ -220,7 +220,7 @@ static void report_invalid_attr(const char *name, size_t len,
  * dictionary.  If no entry is found, create a new attribute and store it in
  * the dictionary.
  */
-static struct git_attr *git_attr_internal(const char *name, int namelen)
+static const struct git_attr *git_attr_internal(const char *name, int namelen)
 {
 	struct git_attr *a;
 
@@ -244,14 +244,14 @@ static struct git_attr *git_attr_internal(const char *name, int namelen)
 	return a;
 }
 
-struct git_attr *git_attr(const char *name)
+const struct git_attr *git_attr(const char *name)
 {
 	return git_attr_internal(name, strlen(name));
 }
 
 /* What does a matched pattern decide? */
 struct attr_state {
-	struct git_attr *attr;
+	const struct git_attr *attr;
 	const char *setto;
 };
 
@@ -278,7 +278,7 @@ struct pattern {
 struct match_attr {
 	union {
 		struct pattern pat;
-		struct git_attr *attr;
+		const struct git_attr *attr;
 	} u;
 	char is_macro;
 	unsigned num_attr;
@@ -898,7 +898,7 @@ static int fill_one(const char *what, struct all_attrs_item *all_attrs,
 	int i;
 
 	for (i = a->num_attr - 1; rem > 0 && i >= 0; i--) {
-		struct git_attr *attr = a->state[i].attr;
+		const struct git_attr *attr = a->state[i].attr;
 		const char **n = &(all_attrs[attr->attr_nr].value);
 		const char *v = a->state[i].setto;
 
@@ -922,7 +922,7 @@ static int fill(const char *path, int pathlen, int basename_offset,
 	const char *base = stk->origin ? stk->origin : "";
 
 	for (i = stk->num_matches - 1; 0 < rem && 0 <= i; i--) {
-		struct match_attr *a = stk->attrs[i];
+		const struct match_attr *a = stk->attrs[i];
 		if (a->is_macro)
 			continue;
 		if (path_matches(path, pathlen, basename_offset,
diff --git a/attr.h b/attr.h
index 5aaf55c3e..abebbc19c 100644
--- a/attr.h
+++ b/attr.h
@@ -11,7 +11,7 @@ struct all_attrs_item;
  * Given a string, return the gitattribute object that
  * corresponds to it.
  */
-struct git_attr *git_attr(const char *);
+const struct git_attr *git_attr(const char *);
 
 /* Internal use */
 extern const char git_attr__true[];
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index 40cdff13e..4d01ca0c8 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -166,7 +166,8 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix)
 	check = attr_check_alloc();
 	if (!all_attrs) {
 		for (i = 0; i < cnt; i++) {
-			struct git_attr *a = git_attr(argv[i]);
+			const struct git_attr *a = git_attr(argv[i]);
+
 			if (!a)
 				return error("%s: not a valid attribute name",
 					     argv[i]);
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 25/27] attr: store attribute stack in attr_check structure
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (23 preceding siblings ...)
  2017-01-28  2:02     ` [PATCH v3 24/27] attr: tighten const correctness with git_attr and match_attr Brandon Williams
@ 2017-01-28  2:02     ` Brandon Williams
  2017-01-28  2:02     ` [PATCH v3 26/27] attr: push the bare repo check into read_attr() Brandon Williams
                       ` (2 subsequent siblings)
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:02 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

The last big hurdle towards a thread-safe API for the attribute system
is the reliance on a global attribute stack that is modified during each
call into the attribute system.

This patch removes the global stack and instead stores a stack locally
in each attr_check instance.  This opens up the opportunity for
future optimizations to customize the attribute stack for the attributes
that a particular attr_check struct is interested in.

One caveat with pushing the attribute stack into the attr_check
structure is that the attribute system now needs to keep track of all
active attr_check instances.  Due to the direction mechanism the stack
needs to be dropped when the direction is switched.  In order to ensure
correctness when the direction is changed the attribute system needs to
iterate through all active attr_check instances and drop each of their
stacks.
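
To illustrate the bookkeeping described above, here is a minimal,
single-threaded sketch (not the patch itself).  The names 'struct frame',
'struct check', check_alloc(), check_free(), push_frame() and
set_direction() are hypothetical stand-ins for attr_stack, attr_check and
friends; the real code additionally guards the registry with a mutex.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for attr_stack and attr_check. */
struct frame {
	struct frame *prev;
};

struct check {
	struct frame *stack;	/* per-instance stack, rebuilt lazily */
};

/* Registry of live checks (assumption: no locking in this sketch). */
static struct check **checks;
static size_t checks_nr, checks_alloc;
static int direction;

static void drop_stack(struct frame **stack)
{
	while (*stack) {
		struct frame *f = *stack;
		*stack = f->prev;
		free(f);
	}
}

static struct check *check_alloc(void)
{
	struct check *c = calloc(1, sizeof(*c));
	if (!c)
		abort();
	if (checks_nr == checks_alloc) {
		checks_alloc = checks_alloc ? 2 * checks_alloc : 4;
		checks = realloc(checks, checks_alloc * sizeof(*checks));
		if (!checks)
			abort();
	}
	checks[checks_nr++] = c;	/* remember every live instance */
	return c;
}

static void check_free(struct check *c)
{
	size_t i;

	for (i = 0; i < checks_nr; i++)
		if (checks[i] == c)
			break;
	assert(i < checks_nr);	/* every live check must be registered */

	/* shift the remaining entries over the removed slot */
	for (; i + 1 < checks_nr; i++)
		checks[i] = checks[i + 1];
	checks_nr--;

	drop_stack(&c->stack);
	free(c);
}

static void push_frame(struct check *c)
{
	struct frame *f = calloc(1, sizeof(*f));
	if (!f)
		abort();
	f->prev = c->stack;
	c->stack = f;
}

/* Changing direction invalidates the stack of every registered check. */
static void set_direction(int new_direction)
{
	size_t i;

	if (new_direction != direction)
		for (i = 0; i < checks_nr; i++)
			drop_stack(&checks[i]->stack);
	direction = new_direction;
}
```

Calling set_direction() with the current value leaves all stacks alone;
calling it with a different value empties every registered stack, so the
next query has to rebuild it from scratch.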

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 284 +++++++++++++++++++++++++++++++++++++++++++++--------------------
 attr.h |   4 +-
 2 files changed, 199 insertions(+), 89 deletions(-)

diff --git a/attr.c b/attr.c
index 69643ae77..bcee0921d 100644
--- a/attr.c
+++ b/attr.c
@@ -445,17 +445,16 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
  * .gitignore file and info/excludes file as a fallback.
  */
 
-/* NEEDSWORK: This will become per git_attr_check */
-static struct attr_stack {
+struct attr_stack {
 	struct attr_stack *prev;
 	char *origin;
 	size_t originlen;
 	unsigned num_matches;
 	unsigned alloc;
 	struct match_attr **attrs;
-} *attr_stack;
+};
 
-static void free_attr_elem(struct attr_stack *e)
+static void attr_stack_free(struct attr_stack *e)
 {
 	int i;
 	free(e->origin);
@@ -478,9 +477,96 @@ static void free_attr_elem(struct attr_stack *e)
 	free(e);
 }
 
+static void drop_attr_stack(struct attr_stack **stack)
+{
+	while (*stack) {
+		struct attr_stack *elem = *stack;
+		*stack = elem->prev;
+		attr_stack_free(elem);
+	}
+}
+
+/* List of all attr_check structs; access should be surrounded by mutex */
+static struct check_vector {
+	size_t nr;
+	size_t alloc;
+	struct attr_check **checks;
+#ifndef NO_PTHREADS
+	pthread_mutex_t mutex;
+#endif
+} check_vector;
+
+static inline void vector_lock(void)
+{
+#ifndef NO_PTHREADS
+	pthread_mutex_lock(&check_vector.mutex);
+#endif
+}
+
+static inline void vector_unlock(void)
+{
+#ifndef NO_PTHREADS
+	pthread_mutex_unlock(&check_vector.mutex);
+#endif
+}
+
+static void check_vector_add(struct attr_check *c)
+{
+	vector_lock();
+
+	ALLOC_GROW(check_vector.checks,
+		   check_vector.nr + 1,
+		   check_vector.alloc);
+	check_vector.checks[check_vector.nr++] = c;
+
+	vector_unlock();
+}
+
+static void check_vector_remove(struct attr_check *check)
+{
+	int i;
+
+	vector_lock();
+
+	/* Find entry */
+	for (i = 0; i < check_vector.nr; i++)
+		if (check_vector.checks[i] == check)
+			break;
+
+	if (i >= check_vector.nr)
+		die("BUG: no entry found");
+
+	/* shift entries over */
+	for (; i < check_vector.nr - 1; i++)
+		check_vector.checks[i] = check_vector.checks[i + 1];
+
+	check_vector.nr--;
+
+	vector_unlock();
+}
+
+/* Iterate through all attr_check instances and drop their stacks */
+static void drop_all_attr_stacks(void)
+{
+	int i;
+
+	vector_lock();
+
+	for (i = 0; i < check_vector.nr; i++) {
+		drop_attr_stack(&check_vector.checks[i]->stack);
+	}
+
+	vector_unlock();
+}
+
 struct attr_check *attr_check_alloc(void)
 {
-	return xcalloc(1, sizeof(struct attr_check));
+	struct attr_check *c = xcalloc(1, sizeof(struct attr_check));
+
+	/* save pointer to the check struct */
+	check_vector_add(c);
+
+	return c;
 }
 
 struct attr_check *attr_check_initl(const char *one, ...)
@@ -543,12 +629,19 @@ void attr_check_clear(struct attr_check *check)
 	free(check->all_attrs);
 	check->all_attrs = NULL;
 	check->all_attrs_nr = 0;
+
+	drop_attr_stack(&check->stack);
 }
 
 void attr_check_free(struct attr_check *check)
 {
-	attr_check_clear(check);
-	free(check);
+	if (check) {
+		/* Remove check from the check vector */
+		check_vector_remove(check);
+
+		attr_check_clear(check);
+		free(check);
+	}
 }
 
 static const char *builtin_attr[] = {
@@ -705,15 +798,6 @@ static void debug_set(const char *what, const char *match, struct git_attr *attr
 #define debug_set(a,b,c,d) do { ; } while (0)
 #endif /* DEBUG_ATTR */
 
-static void drop_attr_stack(void)
-{
-	while (attr_stack) {
-		struct attr_stack *elem = attr_stack;
-		attr_stack = elem->prev;
-		free_attr_elem(elem);
-	}
-}
-
 static const char *git_etc_gitattributes(void)
 {
 	static const char *system_wide;
@@ -722,6 +806,14 @@ static const char *git_etc_gitattributes(void)
 	return system_wide;
 }
 
+static const char *get_home_gitattributes(void)
+{
+	if (!git_attributes_file)
+		git_attributes_file = xdg_config_home("attributes");
+
+	return git_attributes_file;
+}
+
 static int git_attr_system(void)
 {
 	return !git_env_bool("GIT_ATTR_NOSYSTEM", 0);
@@ -741,47 +833,50 @@ static void push_stack(struct attr_stack **attr_stack_p,
 	}
 }
 
-static void bootstrap_attr_stack(void)
+static void bootstrap_attr_stack(struct attr_stack **stack)
 {
-	struct attr_stack *elem;
+	struct attr_stack *e;
 
-	if (attr_stack)
+	if (*stack)
 		return;
 
-	push_stack(&attr_stack, read_attr_from_array(builtin_attr), NULL, 0);
-
-	if (git_attr_system())
-		push_stack(&attr_stack,
-			   read_attr_from_file(git_etc_gitattributes(), 1),
-			   NULL, 0);
+	/* builtin frame */
+	e = read_attr_from_array(builtin_attr);
+	push_stack(stack, e, NULL, 0);
 
-	if (!git_attributes_file)
-		git_attributes_file = xdg_config_home("attributes");
-	if (git_attributes_file)
-		push_stack(&attr_stack,
-			   read_attr_from_file(git_attributes_file, 1),
-			   NULL, 0);
+	/* system-wide frame */
+	if (git_attr_system()) {
+		e = read_attr_from_file(git_etc_gitattributes(), 1);
+		push_stack(stack, e, NULL, 0);
+	}
 
-	if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
-		elem = read_attr(GITATTRIBUTES_FILE, 1);
-		push_stack(&attr_stack, elem, xstrdup(""), 0);
-		debug_push(elem);
+	/* home directory */
+	if (get_home_gitattributes()) {
+		e = read_attr_from_file(get_home_gitattributes(), 1);
+		push_stack(stack, e, NULL, 0);
 	}
 
-	if (startup_info->have_repository)
-		elem = read_attr_from_file(git_path_info_attributes(), 1);
+	/* root directory */
+	if (!is_bare_repository() || direction == GIT_ATTR_INDEX)
+		e = read_attr(GITATTRIBUTES_FILE, 1);
 	else
-		elem = NULL;
+		e = xcalloc(1, sizeof(struct attr_stack));
+	push_stack(stack, e, xstrdup(""), 0);
 
-	if (!elem)
-		elem = xcalloc(1, sizeof(*elem));
-	push_stack(&attr_stack, elem, NULL, 0);
+	/* info frame */
+	if (startup_info->have_repository)
+		e = read_attr_from_file(git_path_info_attributes(), 1);
+	else
+		e = NULL;
+	if (!e)
+		e = xcalloc(1, sizeof(struct attr_stack));
+	push_stack(stack, e, NULL, 0);
 }
 
-static void prepare_attr_stack(const char *path, int dirlen)
+static void prepare_attr_stack(const char *path, int dirlen,
+			       struct attr_stack **stack)
 {
-	struct attr_stack *elem, *info;
-	const char *cp;
+	struct attr_stack *info;
 
 	/*
 	 * At the bottom of the attribute stack is the built-in
@@ -798,13 +893,13 @@ static void prepare_attr_stack(const char *path, int dirlen)
 	 * .gitattributes in deeper directories to shallower ones,
 	 * and finally use the built-in set as the default.
 	 */
-	bootstrap_attr_stack();
+	bootstrap_attr_stack(stack);
 
 	/*
 	 * Pop the "info" one that is always at the top of the stack.
 	 */
-	info = attr_stack;
-	attr_stack = info->prev;
+	info = *stack;
+	*stack = info->prev;
 
 	/*
 	 * Pop the ones from directories that are not the prefix of
@@ -812,18 +907,19 @@ static void prepare_attr_stack(const char *path, int dirlen)
 	 * the root one (whose origin is an empty string "") or the builtin
 	 * one (whose origin is NULL) without popping it.
 	 */
-	while (attr_stack->origin) {
-		int namelen = strlen(attr_stack->origin);
+	while ((*stack)->origin) {
+		int namelen = (*stack)->originlen;
+		struct attr_stack *elem;
 
-		elem = attr_stack;
+		elem = *stack;
 		if (namelen <= dirlen &&
 		    !strncmp(elem->origin, path, namelen) &&
 		    (!namelen || path[namelen] == '/'))
 			break;
 
 		debug_pop(elem);
-		attr_stack = elem->prev;
-		free_attr_elem(elem);
+		*stack = elem->prev;
+		attr_stack_free(elem);
 	}
 
 	/*
@@ -838,33 +934,43 @@ static void prepare_attr_stack(const char *path, int dirlen)
 		 */
 		struct strbuf pathbuf = STRBUF_INIT;
 
-		assert(attr_stack->origin);
-		while (1) {
-			size_t len = strlen(attr_stack->origin);
+		assert((*stack)->origin);
+		strbuf_addstr(&pathbuf, (*stack)->origin);
+		/* Build up to the directory 'path' is in */
+		while (pathbuf.len < dirlen) {
+			size_t len = pathbuf.len;
+			struct attr_stack *next;
 			char *origin;
 
-			if (dirlen <= len)
-				break;
-			cp = memchr(path + len + 1, '/', dirlen - len - 1);
-			if (!cp)
-				cp = path + dirlen;
-			strbuf_addf(&pathbuf,
-				    "%.*s/%s", (int)(cp - path), path,
-				    GITATTRIBUTES_FILE);
-			elem = read_attr(pathbuf.buf, 0);
-			strbuf_setlen(&pathbuf, cp - path);
-			origin = strbuf_detach(&pathbuf, &len);
-			push_stack(&attr_stack, elem, origin, len);
-			debug_push(elem);
-		}
+			/* Skip path-separator */
+			if (len < dirlen && is_dir_sep(path[len]))
+				len++;
+			/* Find the end of the next component */
+			while (len < dirlen && !is_dir_sep(path[len]))
+				len++;
+
+			if (pathbuf.len > 0)
+				strbuf_addch(&pathbuf, '/');
+			strbuf_add(&pathbuf, path + pathbuf.len,
+				   (len - pathbuf.len));
+			strbuf_addf(&pathbuf, "/%s", GITATTRIBUTES_FILE);
+
+			next = read_attr(pathbuf.buf, 0);
 
+			/* reset the pathbuf to not include "/.gitattributes" */
+			strbuf_setlen(&pathbuf, len);
+
+			origin = xstrdup(pathbuf.buf);
+			push_stack(stack, next, origin, len);
+
+		}
 		strbuf_release(&pathbuf);
 	}
 
 	/*
 	 * Finally push the "info" one at the top of the stack.
 	 */
-	push_stack(&attr_stack, info, NULL, 0);
+	push_stack(stack, info, NULL, 0);
 }
 
 static int path_matches(const char *pathname, int pathlen,
@@ -915,20 +1021,23 @@ static int fill_one(const char *what, struct all_attrs_item *all_attrs,
 }
 
 static int fill(const char *path, int pathlen, int basename_offset,
-		struct attr_stack *stk, struct all_attrs_item *all_attrs,
-		int rem)
+		const struct attr_stack *stack,
+		struct all_attrs_item *all_attrs, int rem)
 {
-	int i;
-	const char *base = stk->origin ? stk->origin : "";
-
-	for (i = stk->num_matches - 1; 0 < rem && 0 <= i; i--) {
-		const struct match_attr *a = stk->attrs[i];
-		if (a->is_macro)
-			continue;
-		if (path_matches(path, pathlen, basename_offset,
-				 &a->u.pat, base, stk->originlen))
-			rem = fill_one("fill", all_attrs, a, rem);
+	for (; rem > 0 && stack; stack = stack->prev) {
+		int i;
+		const char *base = stack->origin ? stack->origin : "";
+
+		for (i = stack->num_matches - 1; 0 < rem && 0 <= i; i--) {
+			const struct match_attr *a = stack->attrs[i];
+			if (a->is_macro)
+				continue;
+			if (path_matches(path, pathlen, basename_offset,
+					 &a->u.pat, base, stack->originlen))
+				rem = fill_one("fill", all_attrs, a, rem);
+		}
 	}
+
 	return rem;
 }
 
@@ -971,7 +1080,6 @@ static void determine_macros(struct all_attrs_item *all_attrs,
  */
 static void collect_some_attrs(const char *path, struct attr_check *check)
 {
-	struct attr_stack *stk;
 	int i, pathlen, rem, dirlen;
 	const char *cp, *last_slash = NULL;
 	int basename_offset;
@@ -989,9 +1097,9 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 		dirlen = 0;
 	}
 
-	prepare_attr_stack(path, dirlen);
+	prepare_attr_stack(path, dirlen, &check->stack);
 	all_attrs_init(&g_attr_hashmap, check);
-	determine_macros(check->all_attrs, attr_stack);
+	determine_macros(check->all_attrs, check->stack);
 
 	if (check->nr) {
 		rem = 0;
@@ -1008,8 +1116,7 @@ static void collect_some_attrs(const char *path, struct attr_check *check)
 	}
 
 	rem = check->all_attrs_nr;
-	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
-		rem = fill(path, pathlen, basename_offset, stk, check->all_attrs, rem);
+	fill(path, pathlen, basename_offset, check->stack, check->all_attrs, rem);
 }
 
 int git_check_attr(const char *path, struct attr_check *check)
@@ -1056,7 +1163,7 @@ void git_attr_set_direction(enum git_attr_direction new, struct index_state *ist
 
 	direction = new;
 	if (new != old)
-		drop_attr_stack();
+		drop_all_attr_stacks();
 	use_index = istate;
 }
 
@@ -1064,5 +1171,6 @@ void attr_start(void)
 {
 #ifndef NO_PTHREADS
 	pthread_mutex_init(&g_attr_hashmap.mutex, NULL);
+	pthread_mutex_init(&check_vector.mutex, NULL);
 #endif
 }
diff --git a/attr.h b/attr.h
index abebbc19c..6f4961fdb 100644
--- a/attr.h
+++ b/attr.h
@@ -4,8 +4,9 @@
 /* An attribute is a pointer to this opaque structure */
 struct git_attr;
 
-/* opaque structure used internally for attribute collection */
+/* opaque structures used internally for attribute collection */
 struct all_attrs_item;
+struct attr_stack;
 
 /*
  * Given a string, return the gitattribute object that
@@ -38,6 +39,7 @@ struct attr_check {
 	struct attr_check_item *items;
 	int all_attrs_nr;
 	struct all_attrs_item *all_attrs;
+	struct attr_stack *stack;
 };
 
 extern struct attr_check *attr_check_alloc(void);
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 26/27] attr: push the bare repo check into read_attr()
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (24 preceding siblings ...)
  2017-01-28  2:02     ` [PATCH v3 25/27] attr: store attribute stack in attr_check structure Brandon Williams
@ 2017-01-28  2:02     ` Brandon Williams
  2017-01-28  2:02     ` [PATCH v3 27/27] attr: reformat git_attr_set_direction() function Brandon Williams
  2017-02-02 19:14     ` [PATCH v3 00/27] Revamp the attribute system; another round Junio C Hamano
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:02 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

Push the bare repository check into the 'read_attr()' function.  This
avoids needing to have extra logic which creates an empty stack frame
when inside a bare repo as a similar bit of logic already exists in the
'read_attr()' function.
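
As a rough model of the resulting decision logic (a sketch, not the
patch itself), the function below returns which source would be consulted
first and which one serves as the fallback.  The enum, struct and
function names are hypothetical; only the ordering is taken from the diff
that follows.

```c
#include <assert.h>

/* Hypothetical mirrors of the three attribute directions. */
enum direction { DIR_CHECKIN, DIR_CHECKOUT, DIR_INDEX };

/* Where a .gitattributes file may be read from. */
enum source { SRC_NONE, SRC_INDEX, SRC_FILE };

struct read_plan {
	enum source first;	/* tried first */
	enum source fallback;	/* tried only if 'first' yields nothing */
};

static struct read_plan plan_read(enum direction dir, int is_bare)
{
	struct read_plan p = { SRC_NONE, SRC_NONE };

	if (dir == DIR_INDEX) {
		/* index-only, allowed even in a bare repository */
		p.first = SRC_INDEX;
	} else if (!is_bare) {
		if (dir == DIR_CHECKOUT) {
			p.first = SRC_INDEX;
			p.fallback = SRC_FILE;
		} else {
			/* check-in: prefer the work tree, then the index */
			p.first = SRC_FILE;
			p.fallback = SRC_INDEX;
		}
	}
	/* bare repo with a non-index direction: nothing is read */
	return p;
}

/* Small self-check covering the bare-repository corner cases. */
static void plan_read_selfcheck(void)
{
	assert(plan_read(DIR_CHECKIN, 1).first == SRC_NONE);
	assert(plan_read(DIR_INDEX, 1).first == SRC_INDEX);
}
```

When the plan comes back as SRC_NONE/SRC_NONE, the caller ends up with an
empty frame, which is why the separate bare-repository branch in
bootstrap_attr_stack() is no longer needed.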

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 114 +++++++++++++++++++++++++++++++----------------------------------
 1 file changed, 54 insertions(+), 60 deletions(-)

diff --git a/attr.c b/attr.c
index bcee0921d..62298ec2f 100644
--- a/attr.c
+++ b/attr.c
@@ -747,25 +747,28 @@ static struct attr_stack *read_attr_from_index(const char *path, int macro_ok)
 
 static struct attr_stack *read_attr(const char *path, int macro_ok)
 {
-	struct attr_stack *res;
+	struct attr_stack *res = NULL;
 
-	if (direction == GIT_ATTR_CHECKOUT) {
+	if (direction == GIT_ATTR_INDEX) {
 		res = read_attr_from_index(path, macro_ok);
-		if (!res)
-			res = read_attr_from_file(path, macro_ok);
-	}
-	else if (direction == GIT_ATTR_CHECKIN) {
-		res = read_attr_from_file(path, macro_ok);
-		if (!res)
-			/*
-			 * There is no checked out .gitattributes file there, but
-			 * we might have it in the index.  We allow operation in a
-			 * sparsely checked out work tree, so read from it.
-			 */
+	} else if (!is_bare_repository()) {
+		if (direction == GIT_ATTR_CHECKOUT) {
 			res = read_attr_from_index(path, macro_ok);
+			if (!res)
+				res = read_attr_from_file(path, macro_ok);
+		} else if (direction == GIT_ATTR_CHECKIN) {
+			res = read_attr_from_file(path, macro_ok);
+			if (!res)
+				/*
+				 * There is no checked out .gitattributes file
+				 * there, but we might have it in the index.
+				 * We allow operation in a sparsely checked out
+				 * work tree, so read from it.
+				 */
+				res = read_attr_from_index(path, macro_ok);
+		}
 	}
-	else
-		res = read_attr_from_index(path, macro_ok);
+
 	if (!res)
 		res = xcalloc(1, sizeof(*res));
 	return res;
@@ -857,10 +860,7 @@ static void bootstrap_attr_stack(struct attr_stack **stack)
 	}
 
 	/* root directory */
-	if (!is_bare_repository() || direction == GIT_ATTR_INDEX)
-		e = read_attr(GITATTRIBUTES_FILE, 1);
-	else
-		e = xcalloc(1, sizeof(struct attr_stack));
+	e = read_attr(GITATTRIBUTES_FILE, 1);
 	push_stack(stack, e, xstrdup(""), 0);
 
 	/* info frame */
@@ -877,6 +877,7 @@ static void prepare_attr_stack(const char *path, int dirlen,
 			       struct attr_stack **stack)
 {
 	struct attr_stack *info;
+	struct strbuf pathbuf = STRBUF_INIT;
 
 	/*
 	 * At the bottom of the attribute stack is the built-in
@@ -923,54 +924,47 @@ static void prepare_attr_stack(const char *path, int dirlen,
 	}
 
 	/*
-	 * Read from parent directories and push them down
+	 * bootstrap_attr_stack() should have added, and the
+	 * above loop should have stopped before popping, the
+	 * root element whose attr_stack->origin is set to an
+	 * empty string.
 	 */
-	if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
-		/*
-		 * bootstrap_attr_stack() should have added, and the
-		 * above loop should have stopped before popping, the
-		 * root element whose attr_stack->origin is set to an
-		 * empty string.
-		 */
-		struct strbuf pathbuf = STRBUF_INIT;
-
-		assert((*stack)->origin);
-		strbuf_addstr(&pathbuf, (*stack)->origin);
-		/* Build up to the directory 'path' is in */
-		while (pathbuf.len < dirlen) {
-			size_t len = pathbuf.len;
-			struct attr_stack *next;
-			char *origin;
-
-			/* Skip path-separator */
-			if (len < dirlen && is_dir_sep(path[len]))
-				len++;
-			/* Find the end of the next component */
-			while (len < dirlen && !is_dir_sep(path[len]))
-				len++;
-
-			if (pathbuf.len > 0)
-				strbuf_addch(&pathbuf, '/');
-			strbuf_add(&pathbuf, path + pathbuf.len,
-				   (len - pathbuf.len));
-			strbuf_addf(&pathbuf, "/%s", GITATTRIBUTES_FILE);
-
-			next = read_attr(pathbuf.buf, 0);
-
-			/* reset the pathbuf to not include "/.gitattributes" */
-			strbuf_setlen(&pathbuf, len);
-
-			origin = xstrdup(pathbuf.buf);
-			push_stack(stack, next, origin, len);
-
-		}
-		strbuf_release(&pathbuf);
+	assert((*stack)->origin);
+
+	strbuf_addstr(&pathbuf, (*stack)->origin);
+	/* Build up to the directory 'path' is in */
+	while (pathbuf.len < dirlen) {
+		size_t len = pathbuf.len;
+		struct attr_stack *next;
+		char *origin;
+
+		/* Skip path-separator */
+		if (len < dirlen && is_dir_sep(path[len]))
+			len++;
+		/* Find the end of the next component */
+		while (len < dirlen && !is_dir_sep(path[len]))
+			len++;
+
+		if (pathbuf.len > 0)
+			strbuf_addch(&pathbuf, '/');
+		strbuf_add(&pathbuf, path + pathbuf.len, (len - pathbuf.len));
+		strbuf_addf(&pathbuf, "/%s", GITATTRIBUTES_FILE);
+
+		next = read_attr(pathbuf.buf, 0);
+
+		/* reset the pathbuf to not include "/.gitattributes" */
+		strbuf_setlen(&pathbuf, len);
+
+		origin = xstrdup(pathbuf.buf);
+		push_stack(stack, next, origin, len);
 	}
 
 	/*
 	 * Finally push the "info" one at the top of the stack.
 	 */
 	push_stack(stack, info, NULL, 0);
+
+	strbuf_release(&pathbuf);
 }
 
 static int path_matches(const char *pathname, int pathlen,
-- 
2.11.0.483.g087da7b7c-goog



* [PATCH v3 27/27] attr: reformat git_attr_set_direction() function
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (25 preceding siblings ...)
  2017-01-28  2:02     ` [PATCH v3 26/27] attr: push the bare repo check into read_attr() Brandon Williams
@ 2017-01-28  2:02     ` Brandon Williams
  2017-02-02 19:14     ` [PATCH v3 00/27] Revamp the attribute system; another round Junio C Hamano
  27 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-28  2:02 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, sbeller, gitster, pclouds

Move 'git_attr_set_direction()' up to be closer to the variables that it
modifies, and rename the variable 'new' to 'new_direction' so that it is
more descriptive.

Update the comment about how 'direction' is used to read the state of
the world.  It should be noted that callers of
'git_attr_set_direction()' should ensure that other threads are not
making calls into the attribute system until after the call to
'git_attr_set_direction()' completes.  This function essentially acts as
a reset button for the attribute system and should be handled with care.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 49 ++++++++++++++++++++-----------------------------
 attr.h |  3 ++-
 2 files changed, 22 insertions(+), 30 deletions(-)

diff --git a/attr.c b/attr.c
index 62298ec2f..5493bff22 100644
--- a/attr.c
+++ b/attr.c
@@ -677,26 +677,30 @@ static struct attr_stack *read_attr_from_array(const char **list)
 }
 
 /*
- * NEEDSWORK: these two are tricky.  The callers assume there is a
- * single, system-wide global state "where we read attributes from?"
- * and when the state is flipped by calling git_attr_set_direction(),
- * attr_stack is discarded so that subsequent attr_check will lazily
- * read from the right place.  And they do not know or care who called
- * by them uses the attribute subsystem, hence have no knowledge of
- * existing git_attr_check instances or future ones that will be
- * created).
- *
- * Probably we need a thread_local that holds these two variables,
- * and a list of git_attr_check instances (which need to be maintained
- * by hooking into git_attr_check_alloc(), git_attr_check_initl(), and
- * git_attr_check_clear().  Then git_attr_set_direction() updates the
- * fields in that thread_local for these two variables, iterate over
- * all the active git_attr_check instances and discard the attr_stack
- * they hold.  Yuck, but it sounds doable.
+ * Callers into the attribute system assume there is a single, system-wide
+ * global state where attributes are read from and when the state is flipped by
+ * calling git_attr_set_direction(), the stack frames that have been
+ * constructed need to be discarded so that subsequent calls into the
+ * attribute system will lazily read from the right place.  Since changing
+ * direction causes a global paradigm shift, it should not ever be called while
+ * another thread could potentially be calling into the attribute system.
  */
 static enum git_attr_direction direction;
 static struct index_state *use_index;
 
+void git_attr_set_direction(enum git_attr_direction new_direction,
+			    struct index_state *istate)
+{
+	if (is_bare_repository() && new_direction != GIT_ATTR_INDEX)
+		die("BUG: non-INDEX attr direction in a bare repo");
+
+	if (new_direction != direction)
+		drop_all_attr_stacks();
+
+	direction = new_direction;
+	use_index = istate;
+}
+
 static struct attr_stack *read_attr_from_file(const char *path, int macro_ok)
 {
 	FILE *fp = fopen(path, "r");
@@ -1148,19 +1152,6 @@ void git_all_attrs(const char *path, struct attr_check *check)
 	}
 }
 
-void git_attr_set_direction(enum git_attr_direction new, struct index_state *istate)
-{
-	enum git_attr_direction old = direction;
-
-	if (is_bare_repository() && new != GIT_ATTR_INDEX)
-		die("BUG: non-INDEX attr direction in a bare repo");
-
-	direction = new;
-	if (new != old)
-		drop_all_attr_stacks();
-	use_index = istate;
-}
-
 void attr_start(void)
 {
 #ifndef NO_PTHREADS
diff --git a/attr.h b/attr.h
index 6f4961fdb..48ab3e1c2 100644
--- a/attr.h
+++ b/attr.h
@@ -72,7 +72,8 @@ enum git_attr_direction {
 	GIT_ATTR_CHECKOUT,
 	GIT_ATTR_INDEX
 };
-void git_attr_set_direction(enum git_attr_direction, struct index_state *);
+void git_attr_set_direction(enum git_attr_direction new_direction,
+			    struct index_state *istate);
 
 extern void attr_start(void);
 
-- 
2.11.0.483.g087da7b7c-goog



* Re: [PATCH v3 16/27] attr: convert git_all_attrs() to use "struct attr_check"
  2017-01-28  2:01     ` [PATCH v3 16/27] attr: convert git_all_attrs() to use "struct attr_check" Brandon Williams
@ 2017-01-28 23:50       ` Stefan Beller
  2017-01-29  2:44         ` Brandon Williams
  2017-01-30 18:06       ` Brandon Williams
  1 sibling, 1 reply; 111+ messages in thread
From: Stefan Beller @ 2017-01-28 23:50 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Junio C Hamano, Duy Nguyen

On Fri, Jan 27, 2017 at 6:01 PM, Brandon Williams <bmwill@google.com> wrote:
> From: Junio C Hamano <gitster@pobox.com>
>
> This updates the other two ways the attribute check is done via an
> array of "struct attr_check_item" elements.  These two niches
> appear only in "git check-attr".
>
>  * The caller does not know offhand what attributes it wants to ask
>    about and cannot use attr_check_initl() to prepare the
>    attr_check structure.
>
>  * The caller may not know what attributes it wants to ask at all,
>    and instead wants to learn everything that the given path has.
>
> Such a caller can call attr_check_alloc() to allocate an empty
> attr_check, and then call attr_check_append() to add attribute names
> one by one.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  attr.c               | 168 ++++++++++++++++++++++++---------------------------
>  attr.h               |   9 +--
>  builtin/check-attr.c |  60 +++++++++---------
>  3 files changed, 112 insertions(+), 125 deletions(-)
>
> diff --git a/attr.c b/attr.c
> index de8bf35a3..40818246f 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -132,75 +132,6 @@ struct git_attr *git_attr(const char *name)
>         return git_attr_internal(name, strlen(name));
>  }
>
> -struct attr_check *attr_check_alloc(void)
> -{
> -       return xcalloc(1, sizeof(struct attr_check));
> -}
> -
> -struct attr_check *attr_check_initl(const char *one, ...)
> -{
> -       struct attr_check *check;
> -       int cnt;
> -       va_list params;
> -       const char *param;
> -
> -       va_start(params, one);
> -       for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++)
> -               ;
> -       va_end(params);
> -
> -       check = attr_check_alloc();
> -       check->nr = cnt;
> -       check->alloc = cnt;
> -       check->items = xcalloc(cnt, sizeof(struct attr_check_item));
> -
> -       check->items[0].attr = git_attr(one);
> -       va_start(params, one);
> -       for (cnt = 1; cnt < check->nr; cnt++) {
> -               const struct git_attr *attr;
> -               param = va_arg(params, const char *);
> -               if (!param)
> -                       die("BUG: counted %d != ended at %d",
> -                           check->nr, cnt);
> -               attr = git_attr(param);
> -               if (!attr)
> -                       die("BUG: %s: not a valid attribute name", param);
> -               check->items[cnt].attr = attr;
> -       }
> -       va_end(params);
> -       return check;
> -}

This being moved down to below (being review churn) sounds like a
rebase mistake. ;)

> -
> -struct attr_check_item *attr_check_append(struct attr_check *check,
> -                                         const struct git_attr *attr)
> -{
> -       struct attr_check_item *item;
> -
> -       ALLOC_GROW(check->items, check->nr + 1, check->alloc);
> -       item = &check->items[check->nr++];
> -       item->attr = attr;
> -       return item;
> -}
> -
> -void attr_check_reset(struct attr_check *check)
> -{
> -       check->nr = 0;
> -}
> -
> -void attr_check_clear(struct attr_check *check)
> -{
> -       free(check->items);
> -       check->items = NULL;
> -       check->alloc = 0;
> -       check->nr = 0;
> -}
> -
> -void attr_check_free(struct attr_check *check)
> -{
> -       attr_check_clear(check);
> -       free(check);
> -}
> -
>  /* What does a matched pattern decide? */
>  struct attr_state {
>         struct git_attr *attr;
> @@ -439,6 +370,75 @@ static void free_attr_elem(struct attr_stack *e)
>         free(e);
>  }
>
> +struct attr_check *attr_check_alloc(void)
> +{
> +       return xcalloc(1, sizeof(struct attr_check));
> +}
> +
> +struct attr_check *attr_check_initl(const char *one, ...)
> +{
> +       struct attr_check *check;
> +       int cnt;
> +       va_list params;
> +       const char *param;
> +
> +       va_start(params, one);
> +       for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++)
> +               ;
> +       va_end(params);
> +
> +       check = attr_check_alloc();
> +       check->nr = cnt;
> +       check->alloc = cnt;
> +       check->items = xcalloc(cnt, sizeof(struct attr_check_item));
> +
> +       check->items[0].attr = git_attr(one);
> +       va_start(params, one);
> +       for (cnt = 1; cnt < check->nr; cnt++) {
> +               const struct git_attr *attr;
> +               param = va_arg(params, const char *);
> +               if (!param)
> +                       die("BUG: counted %d != ended at %d",
> +                           check->nr, cnt);
> +               attr = git_attr(param);
> +               if (!attr)
> +                       die("BUG: %s: not a valid attribute name", param);
> +               check->items[cnt].attr = attr;
> +       }
> +       va_end(params);
> +       return check;
> +}
> +
> +struct attr_check_item *attr_check_append(struct attr_check *check,
> +                                         const struct git_attr *attr)
> +{
> +       struct attr_check_item *item;
> +
> +       ALLOC_GROW(check->items, check->nr + 1, check->alloc);
> +       item = &check->items[check->nr++];
> +       item->attr = attr;
> +       return item;
> +}
> +
> +void attr_check_reset(struct attr_check *check)
> +{
> +       check->nr = 0;
> +}
> +
> +void attr_check_clear(struct attr_check *check)
> +{
> +       free(check->items);
> +       check->items = NULL;
> +       check->alloc = 0;
> +       check->nr = 0;
> +}
> +
> +void attr_check_free(struct attr_check *check)
> +{
> +       attr_check_clear(check);
> +       free(check);
> +}
> +
>  static const char *builtin_attr[] = {
>         "[attr]binary -diff -merge -text",
>         NULL,
> @@ -906,32 +906,22 @@ int git_check_attrs(const char *path, int num, struct attr_check_item *check)
>         return 0;
>  }

Below is where the actual change for this patch starts?

>
> -int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
> +void git_all_attrs(const char *path, struct attr_check *check)
>  {
> -       int i, count, j;
> +       int i;
>
> -       collect_some_attrs(path, 0, NULL);
> +       attr_check_reset(check);
> +       collect_some_attrs(path, check->nr, check->items);
>
> -       /* Count the number of attributes that are set. */
> -       count = 0;
> -       for (i = 0; i < attr_nr; i++) {
> -               const char *value = check_all_attr[i].value;
> -               if (value != ATTR__UNSET && value != ATTR__UNKNOWN)
> -                       ++count;
> -       }
> -       *num = count;
> -       ALLOC_ARRAY(*check, count);
> -       j = 0;
>         for (i = 0; i < attr_nr; i++) {
> +               const char *name = check_all_attr[i].attr->name;
>                 const char *value = check_all_attr[i].value;
> -               if (value != ATTR__UNSET && value != ATTR__UNKNOWN) {
> -                       (*check)[j].attr = check_all_attr[i].attr;
> -                       (*check)[j].value = value;
> -                       ++j;
> -               }
> +               struct attr_check_item *item;
> +               if (value == ATTR__UNSET || value == ATTR__UNKNOWN)
> +                       continue;
> +               item = attr_check_append(check, git_attr(name));
> +               item->value = value;
>         }
> -
> -       return 0;
>  }
>
>  int git_check_attr(const char *path, struct attr_check *check)
> diff --git a/attr.h b/attr.h
> index e611b139a..9f2729842 100644
> --- a/attr.h
> +++ b/attr.h
> @@ -56,13 +56,10 @@ int git_check_attrs(const char *path, int, struct attr_check_item *);
>  extern int git_check_attr(const char *path, struct attr_check *check);
>
>  /*
> - * Retrieve all attributes that apply to the specified path.  *num
> - * will be set to the number of attributes on the path; **check will
> - * be set to point at a newly-allocated array of git_attr_check
> - * objects describing the attributes and their values.  *check must be
> - * free()ed by the caller.
> + * Retrieve all attributes that apply to the specified path.
> + * check holds the attributes and their values.
>   */
> -int git_all_attrs(const char *path, int *num, struct attr_check_item **check);
> +extern void git_all_attrs(const char *path, struct attr_check *check);
>
>  enum git_attr_direction {
>         GIT_ATTR_CHECKIN,
> diff --git a/builtin/check-attr.c b/builtin/check-attr.c
> index 889264a5b..40cdff13e 100644
> --- a/builtin/check-attr.c
> +++ b/builtin/check-attr.c
> @@ -24,12 +24,13 @@ static const struct option check_attr_options[] = {
>         OPT_END()
>  };
>
> -static void output_attr(int cnt, struct attr_check_item *check,
> -                       const char *file)
> +static void output_attr(struct attr_check *check, const char *file)
>  {
>         int j;
> +       int cnt = check->nr;
> +
>         for (j = 0; j < cnt; j++) {
> -               const char *value = check[j].value;
> +               const char *value = check->items[j].value;
>
>                 if (ATTR_TRUE(value))
>                         value = "set";
> @@ -42,36 +43,38 @@ static void output_attr(int cnt, struct attr_check_item *check,
>                         printf("%s%c" /* path */
>                                "%s%c" /* attrname */
>                                "%s%c" /* attrvalue */,
> -                              file, 0, git_attr_name(check[j].attr), 0, value, 0);
> +                              file, 0,
> +                              git_attr_name(check->items[j].attr), 0, value, 0);
>                 } else {
>                         quote_c_style(file, NULL, stdout, 0);
> -                       printf(": %s: %s\n", git_attr_name(check[j].attr), value);
> +                       printf(": %s: %s\n",
> +                              git_attr_name(check->items[j].attr), value);
>                 }
> -
>         }
>  }
>
>  static void check_attr(const char *prefix,
> -                      int cnt, struct attr_check_item *check,
> +                      struct attr_check *check,
> +                      int collect_all,
>                        const char *file)
>  {
>         char *full_path =
>                 prefix_path(prefix, prefix ? strlen(prefix) : 0, file);
> -       if (check != NULL) {
> -               if (git_check_attrs(full_path, cnt, check))
> -                       die("git_check_attrs died");
> -               output_attr(cnt, check, file);
> +
> +       if (collect_all) {
> +               git_all_attrs(full_path, check);
>         } else {
> -               if (git_all_attrs(full_path, &cnt, &check))
> -                       die("git_all_attrs died");
> -               output_attr(cnt, check, file);
> -               free(check);
> +               if (git_check_attr(full_path, check))
> +                       die("git_check_attr died");
>         }
> +       output_attr(check, file);
> +
>         free(full_path);
>  }
>
>  static void check_attr_stdin_paths(const char *prefix,
> -                                  int cnt, struct attr_check_item *check)
> +                                  struct attr_check *check,
> +                                  int collect_all)
>  {
>         struct strbuf buf = STRBUF_INIT;
>         struct strbuf unquoted = STRBUF_INIT;
> @@ -85,7 +88,7 @@ static void check_attr_stdin_paths(const char *prefix,
>                                 die("line is badly quoted");
>                         strbuf_swap(&buf, &unquoted);
>                 }
> -               check_attr(prefix, cnt, check, buf.buf);
> +               check_attr(prefix, check, collect_all, buf.buf);
>                 maybe_flush_or_die(stdout, "attribute to stdout");
>         }
>         strbuf_release(&buf);
> @@ -100,7 +103,7 @@ static NORETURN void error_with_usage(const char *msg)
>
>  int cmd_check_attr(int argc, const char **argv, const char *prefix)
>  {
> -       struct attr_check_item *check;
> +       struct attr_check *check;
>         int cnt, i, doubledash, filei;
>
>         if (!is_bare_repository())
> @@ -160,28 +163,25 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix)
>                         error_with_usage("No file specified");
>         }
>
> -       if (all_attrs) {
> -               check = NULL;
> -       } else {
> -               check = xcalloc(cnt, sizeof(*check));
> +       check = attr_check_alloc();
> +       if (!all_attrs) {
>                 for (i = 0; i < cnt; i++) {
> -                       const char *name;
> -                       struct git_attr *a;
> -                       name = argv[i];
> -                       a = git_attr(name);
> +                       struct git_attr *a = git_attr(argv[i]);
>                         if (!a)
>                                 return error("%s: not a valid attribute name",
> -                                       name);
> -                       check[i].attr = a;
> +                                            argv[i]);
> +                       attr_check_append(check, a);
>                 }
>         }
>
>         if (stdin_paths)
> -               check_attr_stdin_paths(prefix, cnt, check);
> +               check_attr_stdin_paths(prefix, check, all_attrs);
>         else {
>                 for (i = filei; i < argc; i++)
> -                       check_attr(prefix, cnt, check, argv[i]);
> +                       check_attr(prefix, check, all_attrs, argv[i]);
>                 maybe_flush_or_die(stdout, "attribute to stdout");
>         }
> +
> +       attr_check_free(check);
>         return 0;
>  }
> --
> 2.11.0.483.g087da7b7c-goog
>

* Re: [PATCH v3 16/27] attr: convert git_all_attrs() to use "struct attr_check"
  2017-01-28 23:50       ` Stefan Beller
@ 2017-01-29  2:44         ` Brandon Williams
  0 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-29  2:44 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Junio C Hamano, Duy Nguyen

On 01/28, Stefan Beller wrote:
> 
> This being moved down to below (being review churn) sounds like a
> rebase mistake. ;)
> 

Yep, thanks for catching that.  I'll need to fix that up.

-- 
Brandon Williams

* [PATCH v3 15/27] attr: (re)introduce git_check_attr() and struct attr_check
  2017-01-28  2:01     ` [PATCH v3 15/27] attr: (re)introduce git_check_attr() and struct attr_check Brandon Williams
@ 2017-01-30 18:05       ` Brandon Williams
  0 siblings, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-30 18:05 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Stefan Beller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

A common pattern to check N attributes for many paths is to

 (1) prepare an array A of N attr_check_item items;
 (2) call git_attr() to intern the N attribute names and fill A;
 (3) repeatedly call git_check_attrs() for path with N and A;

A look-up for these N attributes for a single path P scans the
entire attr_stack, starting from the .git/info/attributes file and
then the .gitattributes file in the directory the path P is in, going
upwards through the .gitattributes files found in parent directories.

An earlier commit 06a604e6 (attr: avoid heavy work when we know the
specified attr is not defined, 2014-12-28) tried to optimize out
this scanning for one trivial special case: when the attribute being
sought is known not to exist, we do not have to scan for it.  While
this may be a cheap and effective heuristic, it would not work well
when N is (much) more than 1.

What we would want is a more customized way to skip irrelevant
entries in the attribute stack, and the definition of irrelevance
is tied to the set of attributes passed to git_check_attrs() call,
i.e. the set of attributes being sought.  The data necessary for
this optimization needs to live alongside the set of attributes, but
a simple array of attr_check_item simply does not have any place
for that.

Introduce "struct attr_check" that contains N, the number of
attributes being sought, and A, the array that holds N
attr_check_item items, and a function git_check_attr() that
takes a path P and this structure as its parameters.  This structure
can later be extended to hold extra data necessary for optimization.

Also, to make it easier to write the first two steps in common
cases, introduce attr_check_initl() helper function, which takes
a NULL-terminated list of attribute names and initializes this
structure.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---

This is the correct 15/27 patch that doesn't have the rebase mistake discovered
by Stefan.
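
The two-pass varargs approach that attr_check_initl() takes can be tried
outside of git with a minimal sketch like the one below.  It is only an
illustration under stated assumptions: the sketch_* functions are
hypothetical, attribute names are kept as plain strings instead of being
interned through git_attr(), and plain calloc() stands in for xcalloc().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

/* Stand-ins for the structures added by this patch. */
struct attr_check_item {
	const char *attr;	/* the real field is a "const struct git_attr *" */
	const char *value;
};

struct attr_check {
	int nr;
	int alloc;
	struct attr_check_item *items;
};

/* Hypothetical helper: build a check from a NULL-terminated name list. */
static struct attr_check *sketch_check_initl(const char *one, ...)
{
	struct attr_check *check;
	va_list params;
	int cnt, i;

	assert(one != NULL);

	/* First pass: count the names, "one" included. */
	va_start(params, one);
	for (cnt = 1; va_arg(params, const char *) != NULL; cnt++)
		;
	va_end(params);

	check = calloc(1, sizeof(*check));
	if (!check)
		abort();
	check->items = calloc(cnt, sizeof(*check->items));
	if (!check->items)
		abort();
	check->nr = check->alloc = cnt;

	/* Second pass: record each name in the order it was given. */
	check->items[0].attr = one;
	va_start(params, one);
	for (i = 1; i < cnt; i++)
		check->items[i].attr = va_arg(params, const char *);
	va_end(params);
	return check;
}

static void sketch_check_free(struct attr_check *check)
{
	free(check->items);
	free(check);
}
```

A caller would write something like
sketch_check_initl("text", "eol", (const char *)NULL) once and then reuse
the result for every path; the cast on the terminator keeps the variadic
call portable on platforms where NULL is a plain integer constant.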

 attr.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 attr.h | 17 +++++++++++++++
 2 files changed, 91 insertions(+)

diff --git a/attr.c b/attr.c
index 2f180d609..e3298516a 100644
--- a/attr.c
+++ b/attr.c
@@ -370,6 +370,75 @@ static void free_attr_elem(struct attr_stack *e)
 	free(e);
 }
 
+struct attr_check *attr_check_alloc(void)
+{
+	return xcalloc(1, sizeof(struct attr_check));
+}
+
+struct attr_check *attr_check_initl(const char *one, ...)
+{
+	struct attr_check *check;
+	int cnt;
+	va_list params;
+	const char *param;
+
+	va_start(params, one);
+	for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++)
+		;
+	va_end(params);
+
+	check = attr_check_alloc();
+	check->nr = cnt;
+	check->alloc = cnt;
+	check->items = xcalloc(cnt, sizeof(struct attr_check_item));
+
+	check->items[0].attr = git_attr(one);
+	va_start(params, one);
+	for (cnt = 1; cnt < check->nr; cnt++) {
+		const struct git_attr *attr;
+		param = va_arg(params, const char *);
+		if (!param)
+			die("BUG: counted %d != ended at %d",
+			    check->nr, cnt);
+		attr = git_attr(param);
+		if (!attr)
+			die("BUG: %s: not a valid attribute name", param);
+		check->items[cnt].attr = attr;
+	}
+	va_end(params);
+	return check;
+}
+
+struct attr_check_item *attr_check_append(struct attr_check *check,
+					  const struct git_attr *attr)
+{
+	struct attr_check_item *item;
+
+	ALLOC_GROW(check->items, check->nr + 1, check->alloc);
+	item = &check->items[check->nr++];
+	item->attr = attr;
+	return item;
+}
+
+void attr_check_reset(struct attr_check *check)
+{
+	check->nr = 0;
+}
+
+void attr_check_clear(struct attr_check *check)
+{
+	free(check->items);
+	check->items = NULL;
+	check->alloc = 0;
+	check->nr = 0;
+}
+
+void attr_check_free(struct attr_check *check)
+{
+	attr_check_clear(check);
+	free(check);
+}
+
 static const char *builtin_attr[] = {
 	"[attr]binary -diff -merge -text",
 	NULL,
@@ -865,6 +934,11 @@ int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
 	return 0;
 }
 
+int git_check_attr(const char *path, struct attr_check *check)
+{
+	return git_check_attrs(path, check->nr, check->items);
+}
+
 void git_attr_set_direction(enum git_attr_direction new, struct index_state *istate)
 {
 	enum git_attr_direction old = direction;
diff --git a/attr.h b/attr.h
index efc7bb3b3..e611b139a 100644
--- a/attr.h
+++ b/attr.h
@@ -29,6 +29,22 @@ struct attr_check_item {
 	const char *value;
 };
 
+struct attr_check {
+	int nr;
+	int alloc;
+	struct attr_check_item *items;
+};
+
+extern struct attr_check *attr_check_alloc(void);
+extern struct attr_check *attr_check_initl(const char *, ...);
+
+extern struct attr_check_item *attr_check_append(struct attr_check *check,
+						 const struct git_attr *attr);
+
+extern void attr_check_reset(struct attr_check *check);
+extern void attr_check_clear(struct attr_check *check);
+extern void attr_check_free(struct attr_check *check);
+
 /*
  * Return the name of the attribute represented by the argument.  The
  * return value is a pointer to a null-delimited string that is part
@@ -37,6 +53,7 @@ struct attr_check_item {
 extern const char *git_attr_name(const struct git_attr *);
 
 int git_check_attrs(const char *path, int, struct attr_check_item *);
+extern int git_check_attr(const char *path, struct attr_check *check);
 
 /*
  * Retrieve all attributes that apply to the specified path.  *num
-- 
2.11.0.483.g087da7b7c-goog


* [PATCH v3 16/27] attr: convert git_all_attrs() to use "struct attr_check"
  2017-01-28  2:01     ` [PATCH v3 16/27] attr: convert git_all_attrs() to use "struct attr_check" Brandon Williams
  2017-01-28 23:50       ` Stefan Beller
@ 2017-01-30 18:06       ` Brandon Williams
  1 sibling, 0 replies; 111+ messages in thread
From: Brandon Williams @ 2017-01-30 18:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Stefan Beller, Brandon Williams

From: Junio C Hamano <gitster@pobox.com>

This updates the other two ways the attribute check is done via an
array of "struct attr_check_item" elements.  These two niches
appear only in "git check-attr".

 * The caller does not know offhand what attributes it wants to ask
   about and cannot use attr_check_initl() to prepare the
   attr_check structure.

 * The caller may not know what attributes it wants to ask at all,
   and instead wants to learn everything that the given path has.

Such a caller can call attr_check_alloc() to allocate an empty
attr_check, and then call attr_check_append() to add attribute names
one by one.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---

This is the correct 16/27 patch that doesn't have the rebase mistake discovered
by Stefan.
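
For a caller that learns its attribute names at run time, the
alloc/append/reset cycle described above might look like the following
standalone sketch.  It is not the code from this patch: the sketch_*
names are hypothetical, names are plain strings rather than interned
"struct git_attr" pointers, and an explicit realloc() with an arbitrary
growth policy stands in for ALLOC_GROW().

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for the structures from attr.h. */
struct attr_check_item {
	const char *attr;
	const char *value;
};

struct attr_check {
	int nr;
	int alloc;
	struct attr_check_item *items;
};

/* Hypothetical helper: append one question, growing the array on demand. */
static struct attr_check_item *sketch_check_append(struct attr_check *check,
						    const char *attr)
{
	struct attr_check_item *item;

	assert(check != NULL);
	if (check->nr + 1 > check->alloc) {
		/* Arbitrary growth policy, chosen only for this sketch. */
		int new_alloc = (check->alloc + 16) * 3 / 2;
		struct attr_check_item *grown =
			realloc(check->items, (size_t)new_alloc * sizeof(*grown));
		if (!grown)
			abort();
		check->items = grown;
		check->alloc = new_alloc;
	}
	item = &check->items[check->nr++];
	item->attr = attr;
	item->value = NULL;	/* realloc() does not zero new memory */
	return item;
}

/* Forget the current entries but keep the allocation for the next path. */
static void sketch_check_reset(struct attr_check *check)
{
	check->nr = 0;
}
```

Resetting only "nr" is what lets the "all attributes" mode reuse a
single structure across paths: each call starts from an empty list
without paying for a fresh allocation.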

 attr.c               | 30 +++++++++-----------------
 attr.h               |  9 +++-----
 builtin/check-attr.c | 60 ++++++++++++++++++++++++++--------------------------
 3 files changed, 43 insertions(+), 56 deletions(-)

diff --git a/attr.c b/attr.c
index e3298516a..40818246f 100644
--- a/attr.c
+++ b/attr.c
@@ -906,32 +906,22 @@ int git_check_attrs(const char *path, int num, struct attr_check_item *check)
 	return 0;
 }
 
-int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
+void git_all_attrs(const char *path, struct attr_check *check)
 {
-	int i, count, j;
+	int i;
 
-	collect_some_attrs(path, 0, NULL);
+	attr_check_reset(check);
+	collect_some_attrs(path, check->nr, check->items);
 
-	/* Count the number of attributes that are set. */
-	count = 0;
-	for (i = 0; i < attr_nr; i++) {
-		const char *value = check_all_attr[i].value;
-		if (value != ATTR__UNSET && value != ATTR__UNKNOWN)
-			++count;
-	}
-	*num = count;
-	ALLOC_ARRAY(*check, count);
-	j = 0;
 	for (i = 0; i < attr_nr; i++) {
+		const char *name = check_all_attr[i].attr->name;
 		const char *value = check_all_attr[i].value;
-		if (value != ATTR__UNSET && value != ATTR__UNKNOWN) {
-			(*check)[j].attr = check_all_attr[i].attr;
-			(*check)[j].value = value;
-			++j;
-		}
+		struct attr_check_item *item;
+		if (value == ATTR__UNSET || value == ATTR__UNKNOWN)
+			continue;
+		item = attr_check_append(check, git_attr(name));
+		item->value = value;
 	}
-
-	return 0;
 }
 
 int git_check_attr(const char *path, struct attr_check *check)
diff --git a/attr.h b/attr.h
index e611b139a..9f2729842 100644
--- a/attr.h
+++ b/attr.h
@@ -56,13 +56,10 @@ int git_check_attrs(const char *path, int, struct attr_check_item *);
 extern int git_check_attr(const char *path, struct attr_check *check);
 
 /*
- * Retrieve all attributes that apply to the specified path.  *num
- * will be set to the number of attributes on the path; **check will
- * be set to point at a newly-allocated array of git_attr_check
- * objects describing the attributes and their values.  *check must be
- * free()ed by the caller.
+ * Retrieve all attributes that apply to the specified path.
+ * check holds the attributes and their values.
  */
-int git_all_attrs(const char *path, int *num, struct attr_check_item **check);
+extern void git_all_attrs(const char *path, struct attr_check *check);
 
 enum git_attr_direction {
 	GIT_ATTR_CHECKIN,
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index 889264a5b..40cdff13e 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -24,12 +24,13 @@ static const struct option check_attr_options[] = {
 	OPT_END()
 };
 
-static void output_attr(int cnt, struct attr_check_item *check,
-			const char *file)
+static void output_attr(struct attr_check *check, const char *file)
 {
 	int j;
+	int cnt = check->nr;
+
 	for (j = 0; j < cnt; j++) {
-		const char *value = check[j].value;
+		const char *value = check->items[j].value;
 
 		if (ATTR_TRUE(value))
 			value = "set";
@@ -42,36 +43,38 @@ static void output_attr(int cnt, struct attr_check_item *check,
 			printf("%s%c" /* path */
 			       "%s%c" /* attrname */
 			       "%s%c" /* attrvalue */,
-			       file, 0, git_attr_name(check[j].attr), 0, value, 0);
+			       file, 0,
+			       git_attr_name(check->items[j].attr), 0, value, 0);
 		} else {
 			quote_c_style(file, NULL, stdout, 0);
-			printf(": %s: %s\n", git_attr_name(check[j].attr), value);
+			printf(": %s: %s\n",
+			       git_attr_name(check->items[j].attr), value);
 		}
-
 	}
 }
 
 static void check_attr(const char *prefix,
-		       int cnt, struct attr_check_item *check,
+		       struct attr_check *check,
+		       int collect_all,
 		       const char *file)
 {
 	char *full_path =
 		prefix_path(prefix, prefix ? strlen(prefix) : 0, file);
-	if (check != NULL) {
-		if (git_check_attrs(full_path, cnt, check))
-			die("git_check_attrs died");
-		output_attr(cnt, check, file);
+
+	if (collect_all) {
+		git_all_attrs(full_path, check);
 	} else {
-		if (git_all_attrs(full_path, &cnt, &check))
-			die("git_all_attrs died");
-		output_attr(cnt, check, file);
-		free(check);
+		if (git_check_attr(full_path, check))
+			die("git_check_attr died");
 	}
+	output_attr(check, file);
+
 	free(full_path);
 }
 
 static void check_attr_stdin_paths(const char *prefix,
-				   int cnt, struct attr_check_item *check)
+				   struct attr_check *check,
+				   int collect_all)
 {
 	struct strbuf buf = STRBUF_INIT;
 	struct strbuf unquoted = STRBUF_INIT;
@@ -85,7 +88,7 @@ static void check_attr_stdin_paths(const char *prefix,
 				die("line is badly quoted");
 			strbuf_swap(&buf, &unquoted);
 		}
-		check_attr(prefix, cnt, check, buf.buf);
+		check_attr(prefix, check, collect_all, buf.buf);
 		maybe_flush_or_die(stdout, "attribute to stdout");
 	}
 	strbuf_release(&buf);
@@ -100,7 +103,7 @@ static NORETURN void error_with_usage(const char *msg)
 
 int cmd_check_attr(int argc, const char **argv, const char *prefix)
 {
-	struct attr_check_item *check;
+	struct attr_check *check;
 	int cnt, i, doubledash, filei;
 
 	if (!is_bare_repository())
@@ -160,28 +163,25 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix)
 			error_with_usage("No file specified");
 	}
 
-	if (all_attrs) {
-		check = NULL;
-	} else {
-		check = xcalloc(cnt, sizeof(*check));
+	check = attr_check_alloc();
+	if (!all_attrs) {
 		for (i = 0; i < cnt; i++) {
-			const char *name;
-			struct git_attr *a;
-			name = argv[i];
-			a = git_attr(name);
+			struct git_attr *a = git_attr(argv[i]);
 			if (!a)
 				return error("%s: not a valid attribute name",
-					name);
-			check[i].attr = a;
+					     argv[i]);
+			attr_check_append(check, a);
 		}
 	}
 
 	if (stdin_paths)
-		check_attr_stdin_paths(prefix, cnt, check);
+		check_attr_stdin_paths(prefix, check, all_attrs);
 	else {
 		for (i = filei; i < argc; i++)
-			check_attr(prefix, cnt, check, argv[i]);
+			check_attr(prefix, check, all_attrs, argv[i]);
 		maybe_flush_or_die(stdout, "attribute to stdout");
 	}
+
+	attr_check_free(check);
 	return 0;
 }
-- 
2.11.0.483.g087da7b7c-goog


* Re: [PATCH v3 00/27] Revamp the attribute system; another round
  2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
                       ` (26 preceding siblings ...)
  2017-01-28  2:02     ` [PATCH v3 27/27] attr: reformat git_attr_set_direction() function Brandon Williams
@ 2017-02-02 19:14     ` Junio C Hamano
  2017-02-09 17:18       ` Brandon Williams
  27 siblings, 1 reply; 111+ messages in thread
From: Junio C Hamano @ 2017-02-02 19:14 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, pclouds

Brandon Williams <bmwill@google.com> writes:

> Per some of the discussion online and off I locally broke up the question
> and answer and I wasn't very thrilled with the outcome for a number of reasons.
>
> 1. The API is more complex....
> 2. Performance hit....
> ...
> Given the above, v3 is a reroll of the same design as in v2.  This is a good
> milestone in improving the attribute system as it achieves the goal of making
> the attribute subsystem thread-safe (ie multiple callers can be executing
> inside the attribute system at the same time) and will enable a future series
> to allow pathspec code to call into the attribute system.
>
> Most of the changes in this revision are cosmetic (variable renames, code
> movement, etc) but there was a memory leak that was also fixed.

I am OK with the patches presented in this round, but let me explain
why I still expect that we would eventually end up splitting the
question & answer into separate data structures before we can truly
go multi-threaded.

A typical application would do

	for path taken from some set:
		do_something(path)

and "do something with path" would be another helper function, which
may do

	do_something(path):
		ask 'text' attribute for the path
		switch on the attribute and do different things

With the original API, the latter would statically allocate an array
of <question, answer> pairs, with an optimization to populate
<question> which is immutable (because the codepath is always and
only interested in 'text' attribute, and you need a hash lookup to
intern the string "text" which costs cycles) only once, and make a
call to git_check_attr() function with the "path".  This obviously
will not work when two threads are calling this helper function, as
the threads both want their git_check_attr() to return their answers
to the array, but the <answer> part is shared between the threads.

A naive and inefficient way to split questions and answers is to
have two arrays, allocating the former statically (protected under a
mutex, of course) to avoid repeated cost of interning, while
allocating the latter (and some working area per invocation, like
the check_all_attr[]) dynamically and place it on stack or heap.
Because do_something() will be called a number of times, the cost for
allocation and initialization of the <answer> part that is paid per
invocation will of course become very high.

We could in theory keep the original arrangement of having an
array of <question, answer> pairs and restructure the code to do:

	prepare the <question, answer> array
	for path taken from some set:
		do_something(the array, path)

That way, do_something() does not have to keep allocating,
initializing and destroying the array.

But after looking at the current set of codepaths, before coming to
the conclusion that we need to split the static part that is
specific to the callsite for git_check_attr() and the dynamic part
that is specific to the <callsite, thread> pair, I noticed that
typically the callers that can prepare the array before going into
the loop (which will eventually be spread across multiple threads)
are many levels away in the callchain, and they are not even aware
of what attributes are going to be requested in the leaf level
helper functions.  In other words, the approach to hoist "the
<question, answer> array" up in the callchain would not scale.  A
caller that loops over paths in the index and checks them out does
not want to know (and we do not want to tell it) what exact
attributes are involved in the decision convert_to_working_tree()
makes for each path, for example.

So how would we split questions and answers in a way that is not
naive and inefficient?  

I envision that we would allow the attribute subsystem to keep track
of the dynamic part, which will receive the answers, hold working
area like check_all_attr[], and has the equivalent to the "attr
stack", indexed by <thread-id, callsite> pair (and the
identification of "callsite" can be done by using the address of the
static part, i.e. the array of questions that we initialize just
once when interning the list of attribute names for the first time).

The API to prepare and ask for attributes may look like:

	static struct attr_static_part Q;
	struct attr_dynamic_part *D;

	attr_check_init(&Q, "text", ...);
	D = git_attr_check(&Q, path);

where Q contains an array of interned attributes (i.e. questions)
and other immutable things that are unique to this callsite, but can
be shared across multiple threads asking the same question from
here.  As an internal implementation detail, it probably will have a
mutex to make sure that init will run only once.

Then the implementation of git_attr_check(&Q, path) would be:

    - see if there is already the "dynamic part" allocated for the
      current thread asking the question Q.  If there is not,
      allocate one and remember it, so that it can be reused in
      later calls by the same thread; if there is, use that existing
      one.

    - reinitialize the "dynamic part" as needed, e.g. clear the
      equivalent to check_all_attr[], adjust the equivalent to
      attr_stack for the current path, etc.  Just like the current
      code optimizes for the case where the entire program (a single
      thread) will ask the same question for paths in traversal
      order (i.e. falling in the same directory), this will optimize
      for the access pattern where each thread asks the same
      question for paths in its traversal order.

    - do what the current collect_some_attrs() thing does.

And this hopefully won't be as costly as the naive and inefficient
one.
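
The first two steps above (find or create the per-thread dynamic part
for a callsite, then reinitialize it) can be sketched roughly as
follows.  Everything here is an assumption made for illustration: the
struct fields, the sketch_dynamic_part() name, and the use of a C11
_Thread_local list keyed by the address of the static part.  The actual
collection step is left out entirely.

```c
#include <assert.h>
#include <stdlib.h>

/* Immutable, per-callsite questions (fields are hypothetical). */
struct attr_static_part {
	int nr;
	const char **names;
};

/* Mutable, per-<thread, callsite> answers (fields are hypothetical). */
struct attr_dynamic_part {
	const struct attr_static_part *owner;	/* identifies the callsite */
	const char **values;			/* one answer slot per question */
	struct attr_dynamic_part *next;
};

/* Each thread keeps its own list, so no locking is needed to search it. */
static _Thread_local struct attr_dynamic_part *dynamic_parts;

static struct attr_dynamic_part *sketch_dynamic_part(const struct attr_static_part *q)
{
	struct attr_dynamic_part *d;
	int i;

	assert(q != NULL);

	/* Reuse the part this thread already has for this callsite, if any. */
	for (d = dynamic_parts; d; d = d->next)
		if (d->owner == q)
			break;

	if (!d) {
		d = calloc(1, sizeof(*d));
		if (!d)
			abort();
		d->values = calloc(q->nr ? (size_t)q->nr : 1, sizeof(*d->values));
		if (!d->values)
			abort();
		d->owner = q;
		d->next = dynamic_parts;
		dynamic_parts = d;
	}

	/* Reinitialize the answers before the next lookup fills them in. */
	for (i = 0; i < q->nr; i++)
		d->values[i] = NULL;
	return d;
}
```

A linear list is enough when a thread only ever asks a handful of
distinct questions; a hashmap keyed on the same address would be the
obvious replacement if that assumption does not hold.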

The reason why I was pushing hard to split the static part and the
dynamic part in our redesign of the API is primarily because I
didn't want to update the API callers twice.  But as I'd imagine that
your v3 (and your earlier "do not discard attr stack, but keep them
around, holding their tips in a hashmap for quick reuse") would at
least lay the foundation for the eventual shape of the API, let's
bite the bullet and accept that we will need to update the callers
again anyway.

Thanks.


* Re: [PATCH v3 00/27] Revamp the attribute system; another round
  2017-02-02 19:14     ` [PATCH v3 00/27] Revamp the attribute system; another round Junio C Hamano
@ 2017-02-09 17:18       ` Brandon Williams
  2017-02-09 19:31         ` Junio C Hamano
  0 siblings, 1 reply; 111+ messages in thread
From: Brandon Williams @ 2017-02-09 17:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, sbeller, pclouds

On 02/02, Junio C Hamano wrote:
> 	prepare the <question, answer> array
> 	for path taken from some set:
> 		do_something(the array, path)
> 
> That way, do_something() does not have to keep allocating,
> initializing and destroying the array.
> 
> But after looking at the current set of codepaths, before coming to
> the conclusion that we need to split the static part that is
> specific to the callsite for git_check_attr() and the dynamic part
> that is specific to the <callsite, thread> pair, I noticed that
> typically the callers that can prepare the array before going into
> the loop (which will eventually be spread across multiple threads)
> are many levels away in the callchain, and they are not even aware
> of what attributes are going to be requested in the leaf level
> helper functions.  In other words, the approach to hoist "the
> <question, answer> array" up in the callchain would not scale.  A
> caller that loops over paths in the index and checks them out does
> not want to know (and we do not want to tell it) what exact
> attributes are involved in the decision convert_to_working_tree()
> makes for each path, for example.

This was something that I was envisioning as well, though I didn't dig
very deep into the call stack.  Another means of doing this could be to
have the attr_check structure allocated and then have it configured at a
later point for the particular question being asked:

  alloc struct attr_check c;
  ... many call sites down
  configure(c, questions)
  for path
    do_something(c, path)

That also allows the same structure to be reused (just reconfigured) if
different attributes are needed at a later point in time.  Of course
this is just an idea and I'm not sure if this is the best way to do it
either.
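
A rough C sketch of that allocate-early, configure-late idea follows.
Every name in it (check_sketch, check_configure(), toy_lookup() and so
on) is hypothetical and is not the attr.c API, and the lookup is a toy
that only inspects the file suffix:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for attr_check; not the real structure. */
struct check_sketch {
	const char **questions; /* attribute names being asked about */
	const char **answers;   /* one answer slot per question */
	size_t nr;
};

/* Drop the current configuration so the structure can be reused. */
static void check_release(struct check_sketch *c)
{
	free(c->questions);
	free(c->answers);
	c->questions = NULL;
	c->answers = NULL;
	c->nr = 0;
}

/* (Re)configure an already allocated structure with a set of questions. */
static int check_configure(struct check_sketch *c, const char **names, size_t nr)
{
	check_release(c);
	if (!nr)
		return 0;
	c->questions = malloc(nr * sizeof(*c->questions));
	c->answers = calloc(nr, sizeof(*c->answers));
	if (!c->questions || !c->answers) {
		check_release(c);
		return -1;
	}
	memcpy(c->questions, names, nr * sizeof(*names));
	c->nr = nr;
	return 0;
}

static int has_suffix(const char *s, const char *suffix)
{
	size_t n = strlen(s), m = strlen(suffix);
	return n >= m && !strcmp(s + n - m, suffix);
}

/* Toy lookup (an assumption for illustration): decides by suffix only. */
static const char *toy_lookup(const char *attr, const char *path)
{
	if (!strcmp(attr, "text") && has_suffix(path, ".txt"))
		return "set";
	if (!strcmp(attr, "binary") && has_suffix(path, ".png"))
		return "set";
	return "unspecified";
}

/* Per-path work: fills the answer slots without allocating anything. */
static void do_something(struct check_sketch *c, const char *path)
{
	size_t i;

	for (i = 0; i < c->nr; i++)
		c->answers[i] = toy_lookup(c->questions[i], path);
}
```

In this sketch a caller high in the chain would own a zero-initialized
struct check_sketch, a leaf helper would call check_configure() once,
and the loop would call do_something() for each path, reconfiguring
only when a different set of attributes is needed.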

> 
> So how would we split questions and answers in a way that is not
> naive and inefficient?  
> 
> I envision that we would allow the attribute subsystem to keep track
> of the dynamic part, which receives the answers, holds a working
> area like check_all_attr[], and has the equivalent of the "attr
> stack", indexed by <thread-id, callsite> pair (the identification
> of "callsite" can be done by using the address of the static part,
> i.e. the array of questions that we initialize just once when
> interning the list of attribute names for the first time).
> 
> The API to prepare and ask for attributes may look like:
> 
> 	static struct attr_static_part Q;
> 	struct attr_dynamic_part *D;
> 
> 	attr_check_init(&Q, "text", ...);
> 	D = git_attr_check(&Q, path);
> 
> where Q contains an array of interned attributes (i.e. questions)
> and other immutable things that are unique to this callsite, but can
> be shared across multiple threads asking the same question from
> here.  As an internal implementation detail, it probably will have a
> mutex to make sure that init will run only once.
> 
> Then the implementation of git_attr_check(&Q, path) would be:
> 
>     - see if there is already the "dynamic part" allocated for the
>       current thread asking the question Q.  If there is not,
>       allocate one and remember it, so that it can be reused in
>       later calls by the same thread; if there is, use that existing
>       one.
> 
>     - reinitialize the "dynamic part" as needed, e.g. clear the
>       equivalent to check_all_attr[], adjust the equivalent to
>       attr_stack for the current path, etc.  Just like the current
>       code optimizes for the case where the entire program (a single
>       thread) will ask the same question for paths in traversal
>       order (i.e. falling in the same directory), this will optimize
>       for the access pattern where each thread asks the same
>       question for paths in its traversal order.
> 
>     - do what the current collect_some_attrs() thing does.
> 
> And this hopefully won't be as costly as the naive and inefficient
> one.

I agree, this sort of implementation wouldn't suffer from the same
allocation penalty that the naive implementation suffers from.  It
would be slightly challenging, though, to ensure that no memory is left
behind: not leaks in the strict sense, but memory that is never freed.
When a thread terminates, we would want to reclaim the memory used for
its dynamic part, which is stored inside the attribute system.
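
One way the reclaiming could work, sketched with POSIX thread-specific
data: pthread_key_create() accepts a destructor that runs when a thread
exits with a non-NULL value stored under the key.  The struct and
function names below (static_part, dynamic_part, dyn_get() and so on)
are made up for illustration and are not the attr.c API; parts_freed
exists only so the effect can be observed:

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical "static part": the questions, shared by all threads. */
struct static_part {
	const char **names;
	size_t nr;
};

/* Hypothetical "dynamic part": working area per <thread, callsite>. */
struct dynamic_part {
	const struct static_part *owner; /* callsite identity: address of Q */
	const char **answers;            /* one slot per question */
	struct dynamic_part *next;       /* other callsites used by this thread */
};

static pthread_key_t dyn_key;
static pthread_once_t dyn_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t stat_lock = PTHREAD_MUTEX_INITIALIZER;
static int parts_freed; /* demo counter only */

/* Destructor: frees every dynamic part the exiting thread created. */
static void dyn_free_list(void *p)
{
	struct dynamic_part *d = p;

	while (d) {
		struct dynamic_part *next = d->next;

		free(d->answers);
		free(d);
		pthread_mutex_lock(&stat_lock);
		parts_freed++;
		pthread_mutex_unlock(&stat_lock);
		d = next;
	}
}

static void dyn_key_init(void)
{
	pthread_key_create(&dyn_key, dyn_free_list);
}

/* Find or create the calling thread's dynamic part for callsite q. */
static struct dynamic_part *dyn_get(const struct static_part *q)
{
	struct dynamic_part *head, *d;

	pthread_once(&dyn_once, dyn_key_init);
	head = pthread_getspecific(dyn_key);
	for (d = head; d; d = d->next)
		if (d->owner == q)
			return d; /* reuse the existing one */

	d = calloc(1, sizeof(*d));
	if (!d)
		return NULL;
	d->answers = calloc(q->nr ? q->nr : 1, sizeof(*d->answers));
	if (!d->answers) {
		free(d);
		return NULL;
	}
	d->owner = q;
	d->next = head;
	if (pthread_setspecific(dyn_key, d)) {
		free(d->answers);
		free(d);
		return NULL;
	}
	return d;
}

/* Demo worker: asks twice and returns its argument if the part was reused. */
static void *worker(void *arg)
{
	struct dynamic_part *first = dyn_get(arg);

	return (first && first == dyn_get(arg)) ? arg : NULL;
}
```

The destructor only covers threads that actually exit; a long-lived
main thread would still need an explicit cleanup call, and the sketch
leaves that out.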

> 
> The reason why I was pushing hard to split the static part and the
> dynamic part in our redesign of the API is primarily because I didn't
> want to update the API callers twice.  But since I'd imagine that
> your v3 (and your earlier "do not discard attr stack, but keep them
> around, holding their tips in a hashmap for quick reuse") would at
> least lay the foundation for the eventual shape of the API, let's
> bite the bullet and accept that we will need to update the callers
> again anyway.
> 
> Thanks.
> 

At least v3 gets the attribute system to a state where further
improvements should be relatively easy to make.  And now as long as each
thread has a unique attr_check structure, multiple callers can exist
inside the attribute system at the same time.  There is still more work
to be done on it though.  My biggest remaining complaint is the
"direction" aspect of the system.  I would love to eliminate that
global state as well at some point, though I'm not sure how yet.

-- 
Brandon Williams


* Re: [PATCH v3 00/27] Revamp the attribute system; another round
  2017-02-09 17:18       ` Brandon Williams
@ 2017-02-09 19:31         ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2017-02-09 19:31 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, pclouds

Brandon Williams <bmwill@google.com> writes:

> At least v3 gets the attribute system to a state where further
> improvements should be relatively easy to make.  And now as long as each
> thread has a unique attr_check structure, multiple callers can exist
> inside the attribute system at the same time.  There is still more work
> to be done on it though.  My biggest remaining complaint is the
> "direction" aspect of the system.  I would love to eliminate that
> global state as well at some point, though I'm not sure how yet.

We are in agreement 100% ;-) The "direction" was the last thorn I
was fighting with (without successfully coming up with a usable
solution) when I stopped working on my original series before Stefan
took it over.



Thread overview: 111+ messages
2017-01-12 23:53 [PATCH 00/27] Revamp the attribute system; another round Brandon Williams
2017-01-12 23:53 ` [PATCH 01/27] commit.c: use strchrnul() to scan for one line Brandon Williams
2017-01-12 23:53 ` [PATCH 02/27] attr.c: " Brandon Williams
2017-01-12 23:53 ` [PATCH 03/27] attr.c: update a stale comment on "struct match_attr" Brandon Williams
2017-01-12 23:53 ` [PATCH 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr() Brandon Williams
2017-01-12 23:53 ` [PATCH 05/27] attr.c: complete a sentence in a comment Brandon Williams
2017-01-12 23:53 ` [PATCH 06/27] attr.c: mark where #if DEBUG ends more clearly Brandon Williams
2017-01-12 23:53 ` [PATCH 07/27] attr.c: simplify macroexpand_one() Brandon Williams
2017-01-12 23:53 ` [PATCH 08/27] attr.c: tighten constness around "git_attr" structure Brandon Williams
2017-01-12 23:53 ` [PATCH 09/27] attr.c: plug small leak in parse_attr_line() Brandon Williams
2017-01-12 23:53 ` [PATCH 10/27] attr: support quoting pathname patterns in C style Brandon Williams
2017-01-12 23:53 ` [PATCH 11/27] attr.c: add push_stack() helper Brandon Williams
2017-01-12 23:53 ` [PATCH 12/27] Documentation: fix a typo Brandon Williams
2017-01-12 23:53 ` [PATCH 13/27] attr.c: outline the future plans by heavily commenting Brandon Williams
2017-01-12 23:53 ` [PATCH 14/27] attr: rename function and struct related to checking attributes Brandon Williams
2017-01-12 23:53 ` [PATCH 15/27] attr: (re)introduce git_check_attr() and struct attr_check Brandon Williams
2017-01-12 23:53 ` [PATCH 16/27] attr: convert git_all_attrs() to use "struct attr_check" Brandon Williams
2017-01-12 23:53 ` [PATCH 17/27] attr: convert git_check_attrs() callers to use the new API Brandon Williams
2017-01-12 23:53 ` [PATCH 18/27] attr: retire git_check_attrs() API Brandon Williams
2017-01-12 23:53 ` [PATCH 19/27] attr: pass struct attr_check to collect_some_attrs Brandon Williams
2017-01-12 23:53 ` [PATCH 20/27] attr: change validity check for attribute names to use positive logic Brandon Williams
2017-01-12 23:53 ` [PATCH 21/27] attr: use hashmap for attribute dictionary Brandon Williams
2017-01-18 20:20   ` Stefan Beller
2017-01-18 20:23     ` Brandon Williams
2017-01-12 23:53 ` [PATCH 22/27] attr: eliminate global check_all_attr array Brandon Williams
2017-01-12 23:53 ` [PATCH 23/27] attr: remove maybe-real, maybe-macro from git_attr Brandon Williams
2017-01-12 23:53 ` [PATCH 24/27] attr: tighten const correctness with git_attr and match_attr Brandon Williams
2017-01-12 23:53 ` [PATCH 25/27] attr: store attribute stacks in hashmap Brandon Williams
2017-01-13 21:20   ` Junio C Hamano
2017-01-18 20:34     ` Brandon Williams
2017-01-23 18:08       ` Brandon Williams
2017-01-18 20:39   ` Stefan Beller
2017-01-18 20:45     ` Stefan Beller
2017-01-18 20:50     ` Brandon Williams
2017-01-12 23:53 ` [PATCH 26/27] attr: push the bare repo check into read_attr() Brandon Williams
2017-01-12 23:53 ` [PATCH 27/27] attr: reformat git_attr_set_direction() function Brandon Williams
2017-01-15 23:47 ` [PATCH 00/27] Revamp the attribute system; another round Junio C Hamano
2017-01-16  8:10   ` Jeff King
2017-01-23 20:34 ` [PATCH v2 " Brandon Williams
2017-01-23 20:34   ` [PATCH v2 01/27] commit.c: use strchrnul() to scan for one line Brandon Williams
2017-01-23 20:35   ` [PATCH v2 02/27] attr.c: " Brandon Williams
2017-01-23 20:35   ` [PATCH v2 03/27] attr.c: update a stale comment on "struct match_attr" Brandon Williams
2017-01-23 20:35   ` [PATCH v2 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr() Brandon Williams
2017-01-23 20:35   ` [PATCH v2 05/27] attr.c: complete a sentence in a comment Brandon Williams
2017-01-23 20:35   ` [PATCH v2 06/27] attr.c: mark where #if DEBUG ends more clearly Brandon Williams
2017-01-23 20:35   ` [PATCH v2 07/27] attr.c: simplify macroexpand_one() Brandon Williams
2017-01-23 20:35   ` [PATCH v2 08/27] attr.c: tighten constness around "git_attr" structure Brandon Williams
2017-01-23 20:35   ` [PATCH v2 09/27] attr.c: plug small leak in parse_attr_line() Brandon Williams
2017-01-23 20:35   ` [PATCH v2 10/27] attr: support quoting pathname patterns in C style Brandon Williams
2017-01-23 20:35   ` [PATCH v2 11/27] attr.c: add push_stack() helper Brandon Williams
2017-01-23 20:35   ` [PATCH v2 12/27] Documentation: fix a typo Brandon Williams
2017-01-23 20:35   ` [PATCH v2 13/27] attr.c: outline the future plans by heavily commenting Brandon Williams
2017-01-23 20:35   ` [PATCH v2 14/27] attr: rename function and struct related to checking attributes Brandon Williams
2017-01-23 20:35   ` [PATCH v2 15/27] attr: (re)introduce git_check_attr() and struct attr_check Brandon Williams
2017-01-23 20:35   ` [PATCH v2 16/27] attr: convert git_all_attrs() to use "struct attr_check" Brandon Williams
2017-01-23 20:35   ` [PATCH v2 17/27] attr: convert git_check_attrs() callers to use the new API Brandon Williams
2017-01-23 20:35   ` [PATCH v2 18/27] attr: retire git_check_attrs() API Brandon Williams
2017-01-23 20:35   ` [PATCH v2 19/27] attr: pass struct attr_check to collect_some_attrs Brandon Williams
2017-01-23 20:35   ` [PATCH v2 20/27] attr: change validity check for attribute names to use positive logic Brandon Williams
2017-01-23 20:35   ` [PATCH v2 21/27] attr: use hashmap for attribute dictionary Brandon Williams
2017-01-23 20:35   ` [PATCH v2 22/27] attr: eliminate global check_all_attr array Brandon Williams
2017-01-23 21:11     ` Junio C Hamano
2017-01-23 20:35   ` [PATCH v2 23/27] attr: remove maybe-real, maybe-macro from git_attr Brandon Williams
2017-01-23 20:35   ` [PATCH v2 24/27] attr: tighten const correctness with git_attr and match_attr Brandon Williams
2017-01-23 20:35   ` [PATCH v2 25/27] attr: store attribute stack in attr_check structure Brandon Williams
2017-01-23 21:42     ` Junio C Hamano
2017-01-23 22:06       ` Brandon Williams
2017-01-24  1:11         ` Brandon Williams
2017-01-24  2:28           ` Junio C Hamano
2017-01-25 19:57             ` Brandon Williams
2017-01-25 20:10               ` Stefan Beller
2017-01-25 20:14               ` Junio C Hamano
2017-01-25 21:54                 ` Brandon Williams
2017-01-25 23:19                   ` Brandon Williams
2017-01-23 20:35   ` [PATCH v2 26/27] attr: push the bare repo check into read_attr() Brandon Williams
2017-01-23 20:35   ` [PATCH v2 27/27] attr: reformat git_attr_set_direction() function Brandon Williams
2017-01-28  2:01   ` [PATCH v3 00/27] Revamp the attribute system; another round Brandon Williams
2017-01-28  2:01     ` [PATCH v3 01/27] commit.c: use strchrnul() to scan for one line Brandon Williams
2017-01-28  2:01     ` [PATCH v3 02/27] attr.c: " Brandon Williams
2017-01-28  2:01     ` [PATCH v3 03/27] attr.c: update a stale comment on "struct match_attr" Brandon Williams
2017-01-28  2:01     ` [PATCH v3 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr() Brandon Williams
2017-01-28  2:01     ` [PATCH v3 05/27] attr.c: complete a sentence in a comment Brandon Williams
2017-01-28  2:01     ` [PATCH v3 06/27] attr.c: mark where #if DEBUG ends more clearly Brandon Williams
2017-01-28  2:01     ` [PATCH v3 07/27] attr.c: simplify macroexpand_one() Brandon Williams
2017-01-28  2:01     ` [PATCH v3 08/27] attr.c: tighten constness around "git_attr" structure Brandon Williams
2017-01-28  2:01     ` [PATCH v3 09/27] attr.c: plug small leak in parse_attr_line() Brandon Williams
2017-01-28  2:01     ` [PATCH v3 10/27] attr: support quoting pathname patterns in C style Brandon Williams
2017-01-28  2:01     ` [PATCH v3 11/27] attr.c: add push_stack() helper Brandon Williams
2017-01-28  2:01     ` [PATCH v3 12/27] Documentation: fix a typo Brandon Williams
2017-01-28  2:01     ` [PATCH v3 13/27] attr.c: outline the future plans by heavily commenting Brandon Williams
2017-01-28  2:01     ` [PATCH v3 14/27] attr: rename function and struct related to checking attributes Brandon Williams
2017-01-28  2:01     ` [PATCH v3 15/27] attr: (re)introduce git_check_attr() and struct attr_check Brandon Williams
2017-01-30 18:05       ` Brandon Williams
2017-01-28  2:01     ` [PATCH v3 16/27] attr: convert git_all_attrs() to use "struct attr_check" Brandon Williams
2017-01-28 23:50       ` Stefan Beller
2017-01-29  2:44         ` Brandon Williams
2017-01-30 18:06       ` Brandon Williams
2017-01-28  2:01     ` [PATCH v3 17/27] attr: convert git_check_attrs() callers to use the new API Brandon Williams
2017-01-28  2:01     ` [PATCH v3 18/27] attr: retire git_check_attrs() API Brandon Williams
2017-01-28  2:01     ` [PATCH v3 19/27] attr: pass struct attr_check to collect_some_attrs Brandon Williams
2017-01-28  2:02     ` [PATCH v3 20/27] attr: change validity check for attribute names to use positive logic Brandon Williams
2017-01-28  2:02     ` [PATCH v3 21/27] attr: use hashmap for attribute dictionary Brandon Williams
2017-01-28  2:02     ` [PATCH v3 22/27] attr: eliminate global check_all_attr array Brandon Williams
2017-01-28  2:02     ` [PATCH v3 23/27] attr: remove maybe-real, maybe-macro from git_attr Brandon Williams
2017-01-28  2:02     ` [PATCH v3 24/27] attr: tighten const correctness with git_attr and match_attr Brandon Williams
2017-01-28  2:02     ` [PATCH v3 25/27] attr: store attribute stack in attr_check structure Brandon Williams
2017-01-28  2:02     ` [PATCH v3 26/27] attr: push the bare repo check into read_attr() Brandon Williams
2017-01-28  2:02     ` [PATCH v3 27/27] attr: reformat git_attr_set_direction() function Brandon Williams
2017-02-02 19:14     ` [PATCH v3 00/27] Revamp the attribute system; another round Junio C Hamano
2017-02-09 17:18       ` Brandon Williams
2017-02-09 19:31         ` Junio C Hamano
