All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
@ 2016-05-13  0:19 Stefan Beller
  2016-05-13  0:19 ` [PATCH 1/4] Documentation: correct typo in example for querying attributes Stefan Beller
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Stefan Beller @ 2016-05-13  0:19 UTC (permalink / raw)
  To: gitster; +Cc: pclouds, git, jrnieder, Jens.Lehmann, Stefan Beller

After some fruitful discussion[1] in which Junio suggested trying a very
different route[2] that is more general and not submodule related, I considered
doing a mock for this.

This lets you label arbitrary pathspecs, e.g. in git.git we may want to have:

    t/t[0-9]*.sh label=tests
    
such that

    $ git log --author=Beller ":(label=tests)"
    
would show all commits in which I touched tests.

This has suprisingly few lines of code, as the first 3 patches are refactoring.
The actual new feature is in the last patch.

This would solve the submodule issues I want to solve, as I can produce a
.gitattributes like:

    ./submodule1 label=default
    ./submodule2 label=default
    ./submodule3 label=optional-feature1
    
and then I'd instruct the users to clone like this:

    git clone .. superproject
    cd superproject
    git submodule update --init :(label:default)
    
The second part of the submodule series to collapse these three commands
will come as an extra series later, then.

What annoys me here:
Attributes can only be set once at the moment, there is no way to collect all
attributes. If we'd have such a collection feature we would be able to
configure this:

    *.[ch] label=C,code
    contrib/** label=oldstuff
    
and run this:

    git status ":(label:C oldstuff)"
    
which would be the equivalent to

    contrib/**.[ch]
    
as in this proposed implementation the labels which are given within one
pathspec item are logical AND. To get the logical OR, you'd have to give multiple
pathspec items, i.e. ":(label:C)" ":(label:oldstuff)"

Feedback welcome!

Thanks,
Stefan

[1] http://thread.gmane.org/gmane.comp.version-control.git/294212
[2] http://thread.gmane.org/gmane.comp.version-control.git/294212/focus=294391

Stefan Beller (4):
  Documentation: correct typo in example for querying attributes
  pathspec: move long magic parsing out of prefix_pathspec
  pathspec: move prefix check out of the inner loop
  pathspec: record labels

 Documentation/glossary-content.txt            |   5 ++
 Documentation/technical/api-gitattributes.txt |   2 +-
 attr.h                                        |   1 +
 dir.c                                         |  31 ++++++++
 pathspec.c                                    | 109 +++++++++++++++++---------
 pathspec.h                                    |   1 +
 t/t6134-pathspec-with-labels.sh               |  91 +++++++++++++++++++++
 7 files changed, 201 insertions(+), 39 deletions(-)
 create mode 100755 t/t6134-pathspec-with-labels.sh

-- 
2.8.2.400.g66c4903.dirty

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/4] Documentation: correct typo in example for querying attributes
  2016-05-13  0:19 [RFC PATCH 0/4] pathspec labels [WAS: submodule groups] Stefan Beller
@ 2016-05-13  0:19 ` Stefan Beller
  2016-05-13  0:19 ` [PATCH 2/4] pathspec: move long magic parsing out of prefix_pathspec Stefan Beller
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: Stefan Beller @ 2016-05-13  0:19 UTC (permalink / raw)
  To: gitster; +Cc: pclouds, git, jrnieder, Jens.Lehmann, Stefan Beller

The introductory text of the example has a typo in the
specification which makes it harder to follow that example.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/technical/api-gitattributes.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/technical/api-gitattributes.txt b/Documentation/technical/api-gitattributes.txt
index 2602668..36fc9af 100644
--- a/Documentation/technical/api-gitattributes.txt
+++ b/Documentation/technical/api-gitattributes.txt
@@ -61,7 +61,7 @@ Querying Specific Attributes
 Example
 -------
 
-To see how attributes "crlf" and "indent" are set for different paths.
+To see how attributes "crlf" and "ident" are set for different paths.
 
 . Prepare an array of `struct git_attr_check` with two elements (because
   we are checking two attributes).  Initialize their `attr` member with
-- 
2.8.2.400.g66c4903.dirty

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/4] pathspec: move long magic parsing out of prefix_pathspec
  2016-05-13  0:19 [RFC PATCH 0/4] pathspec labels [WAS: submodule groups] Stefan Beller
  2016-05-13  0:19 ` [PATCH 1/4] Documentation: correct typo in example for querying attributes Stefan Beller
@ 2016-05-13  0:19 ` Stefan Beller
  2016-05-13  0:19 ` [PATCH 3/4] pathspec: move prefix check out of the inner loop Stefan Beller
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: Stefan Beller @ 2016-05-13  0:19 UTC (permalink / raw)
  To: gitster; +Cc: pclouds, git, jrnieder, Jens.Lehmann, Stefan Beller

`prefix_pathspec` is quite a lengthy function and we plan on adding more.
Split it up for better readability. As we want to add code into the
inner loop of the long magic parsing, we also benefit from lower
indentation.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 pathspec.c | 84 +++++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 47 insertions(+), 37 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index c9e9b6c..eba37c2 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -88,6 +88,52 @@ static void prefix_short_magic(struct strbuf *sb, int prefixlen,
 	strbuf_addf(sb, ",prefix:%d)", prefixlen);
 }
 
+static void eat_long_magic(struct pathspec_item *item, const char *elt,
+		unsigned *magic, int *pathspec_prefix,
+		const char **copyfrom_, const char **long_magic_end)
+{
+	int i;
+	const char *copyfrom = *copyfrom_;
+	/* longhand */
+	const char *nextat;
+	for (copyfrom = elt + 2;
+	     *copyfrom && *copyfrom != ')';
+	     copyfrom = nextat) {
+		size_t len = strcspn(copyfrom, ",)");
+		if (copyfrom[len] == ',')
+			nextat = copyfrom + len + 1;
+		else
+			/* handle ')' and '\0' */
+			nextat = copyfrom + len;
+		if (!len)
+			continue;
+		for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
+			if (strlen(pathspec_magic[i].name) == len &&
+			    !strncmp(pathspec_magic[i].name, copyfrom, len)) {
+				*magic |= pathspec_magic[i].bit;
+				break;
+			}
+			if (starts_with(copyfrom, "prefix:")) {
+				char *endptr;
+				*pathspec_prefix = strtol(copyfrom + 7,
+							  &endptr, 10);
+				if (endptr - copyfrom != len)
+					die(_("invalid parameter for pathspec magic 'prefix'"));
+				/* "i" would be wrong, but it does not matter */
+				break;
+			}
+		}
+		if (ARRAY_SIZE(pathspec_magic) <= i)
+			die(_("Invalid pathspec magic '%.*s' in '%s'"),
+			    (int) len, copyfrom, elt);
+	}
+	if (*copyfrom != ')')
+		die(_("Missing ')' at the end of pathspec magic in '%s'"), elt);
+	*long_magic_end = copyfrom;
+	copyfrom++;
+	*copyfrom_ = copyfrom;
+}
+
 /*
  * Take an element of a pathspec and check for magic signatures.
  * Append the result to the prefix. Return the magic bitmap.
@@ -150,43 +196,7 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
 	    (flags & PATHSPEC_LITERAL_PATH)) {
 		; /* nothing to do */
 	} else if (elt[1] == '(') {
-		/* longhand */
-		const char *nextat;
-		for (copyfrom = elt + 2;
-		     *copyfrom && *copyfrom != ')';
-		     copyfrom = nextat) {
-			size_t len = strcspn(copyfrom, ",)");
-			if (copyfrom[len] == ',')
-				nextat = copyfrom + len + 1;
-			else
-				/* handle ')' and '\0' */
-				nextat = copyfrom + len;
-			if (!len)
-				continue;
-			for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
-				if (strlen(pathspec_magic[i].name) == len &&
-				    !strncmp(pathspec_magic[i].name, copyfrom, len)) {
-					magic |= pathspec_magic[i].bit;
-					break;
-				}
-				if (starts_with(copyfrom, "prefix:")) {
-					char *endptr;
-					pathspec_prefix = strtol(copyfrom + 7,
-								 &endptr, 10);
-					if (endptr - copyfrom != len)
-						die(_("invalid parameter for pathspec magic 'prefix'"));
-					/* "i" would be wrong, but it does not matter */
-					break;
-				}
-			}
-			if (ARRAY_SIZE(pathspec_magic) <= i)
-				die(_("Invalid pathspec magic '%.*s' in '%s'"),
-				    (int) len, copyfrom, elt);
-		}
-		if (*copyfrom != ')')
-			die(_("Missing ')' at the end of pathspec magic in '%s'"), elt);
-		long_magic_end = copyfrom;
-		copyfrom++;
+		eat_long_magic(item, elt, &magic, &pathspec_prefix, &copyfrom, &long_magic_end);
 	} else {
 		/* shorthand */
 		for (copyfrom = elt + 1;
-- 
2.8.2.400.g66c4903.dirty

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/4] pathspec: move prefix check out of the inner loop
  2016-05-13  0:19 [RFC PATCH 0/4] pathspec labels [WAS: submodule groups] Stefan Beller
  2016-05-13  0:19 ` [PATCH 1/4] Documentation: correct typo in example for querying attributes Stefan Beller
  2016-05-13  0:19 ` [PATCH 2/4] pathspec: move long magic parsing out of prefix_pathspec Stefan Beller
@ 2016-05-13  0:19 ` Stefan Beller
  2016-05-13  4:43   ` Junio C Hamano
  2016-05-13  0:19 ` [PATCH 4/4] pathspec: record labels Stefan Beller
  2016-05-15 10:06 ` [RFC PATCH 0/4] pathspec labels [WAS: submodule groups] Duy Nguyen
  4 siblings, 1 reply; 25+ messages in thread
From: Stefan Beller @ 2016-05-13  0:19 UTC (permalink / raw)
  To: gitster; +Cc: pclouds, git, jrnieder, Jens.Lehmann, Stefan Beller

The prefix check is not related the check of pathspec magic; also there
is no code that is relevant after we'd break the loop on a match for
"prefix:". So move the check before the loop and shortcircuit the outer
loop.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 pathspec.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index eba37c2..4dff252 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -107,21 +107,22 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
 			nextat = copyfrom + len;
 		if (!len)
 			continue;
+
+		if (starts_with(copyfrom, "prefix:")) {
+			char *endptr;
+			*pathspec_prefix = strtol(copyfrom + 7,
+						  &endptr, 10);
+			if (endptr - copyfrom != len)
+				die(_("invalid parameter for pathspec magic 'prefix'"));
+			continue;
+		}
+
 		for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
 			if (strlen(pathspec_magic[i].name) == len &&
 			    !strncmp(pathspec_magic[i].name, copyfrom, len)) {
 				*magic |= pathspec_magic[i].bit;
 				break;
 			}
-			if (starts_with(copyfrom, "prefix:")) {
-				char *endptr;
-				*pathspec_prefix = strtol(copyfrom + 7,
-							  &endptr, 10);
-				if (endptr - copyfrom != len)
-					die(_("invalid parameter for pathspec magic 'prefix'"));
-				/* "i" would be wrong, but it does not matter */
-				break;
-			}
 		}
 		if (ARRAY_SIZE(pathspec_magic) <= i)
 			die(_("Invalid pathspec magic '%.*s' in '%s'"),
-- 
2.8.2.400.g66c4903.dirty

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/4] pathspec: record labels
  2016-05-13  0:19 [RFC PATCH 0/4] pathspec labels [WAS: submodule groups] Stefan Beller
                   ` (2 preceding siblings ...)
  2016-05-13  0:19 ` [PATCH 3/4] pathspec: move prefix check out of the inner loop Stefan Beller
@ 2016-05-13  0:19 ` Stefan Beller
  2016-05-13  4:32   ` Junio C Hamano
  2016-05-13  5:26   ` Junio C Hamano
  2016-05-15 10:06 ` [RFC PATCH 0/4] pathspec labels [WAS: submodule groups] Duy Nguyen
  4 siblings, 2 replies; 25+ messages in thread
From: Stefan Beller @ 2016-05-13  0:19 UTC (permalink / raw)
  To: gitster; +Cc: pclouds, git, jrnieder, Jens.Lehmann, Stefan Beller

Labels were originally designed to manage large amount of
submodules, the discussion steered this in a more general
direction, such that other files can be labeled as well.

Labels are meant to describe arbitrary set of files, which
is not described via the tree layout.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/glossary-content.txt |  5 +++
 attr.h                             |  1 +
 dir.c                              | 31 +++++++++++++
 pathspec.c                         | 24 +++++++++-
 pathspec.h                         |  1 +
 t/t6134-pathspec-with-labels.sh    | 91 ++++++++++++++++++++++++++++++++++++++
 6 files changed, 152 insertions(+), 1 deletion(-)
 create mode 100755 t/t6134-pathspec-with-labels.sh

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index 8ad29e6..a1fc9e0 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -362,6 +362,11 @@ glob;;
 	For example, "Documentation/{asterisk}.html" matches
 	"Documentation/git.html" but not "Documentation/ppc/ppc.html"
 	or "tools/perf/Documentation/perf.html".
+
+label:<white space separated list>;;
+	Labels can be assigned to pathspecs in the .gitattributes file.
+	By specifying a list of labels the pattern will match only
+	files which have all of the listed labels.
 +
 Two consecutive asterisks ("`**`") in patterns matched against
 full pathname may have special meaning:
diff --git a/attr.h b/attr.h
index 8b08d33..f6fc7c3 100644
--- a/attr.h
+++ b/attr.h
@@ -18,6 +18,7 @@ extern const char git_attr__false[];
 #define ATTR_TRUE(v) ((v) == git_attr__true)
 #define ATTR_FALSE(v) ((v) == git_attr__false)
 #define ATTR_UNSET(v) ((v) == NULL)
+#define ATTR_CUSTOM(v) (!(ATTR_UNSET(v) || ATTR_FALSE(v) || ATTR_TRUE(v)))
 
 /*
  * Send one or more git_attr_check to git_check_attr(), and
diff --git a/dir.c b/dir.c
index 656f272..51d5965 100644
--- a/dir.c
+++ b/dir.c
@@ -9,6 +9,7 @@
  */
 #include "cache.h"
 #include "dir.h"
+#include "attr.h"
 #include "refs.h"
 #include "wildmatch.h"
 #include "pathspec.h"
@@ -208,6 +209,27 @@ int within_depth(const char *name, int namelen,
 	return 1;
 }
 
+void load_labels(const char *name, int namelen, struct string_list *list)
+{
+	static struct git_attr *attr;
+	struct git_attr_check check;
+	char *path = xmemdupz(name, namelen);
+
+	if (!attr)
+		attr = git_attr("label");
+	check.attr = attr;
+
+	if (git_check_attr(path, 1, &check))
+		die("git_check_attr died");
+
+	if (ATTR_CUSTOM(check.value)) {
+		string_list_split(list, check.value, ',', -1);
+		string_list_sort(list);
+	}
+
+	free(path);
+}
+
 #define DO_MATCH_EXCLUDE   1
 #define DO_MATCH_DIRECTORY 2
 
@@ -263,6 +285,15 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
 	    strncmp(item->match, name - prefix, item->prefix))
 		return 0;
 
+	if (item->group) {
+		struct string_list has_labels = STRING_LIST_INIT_DUP;
+		struct string_list_item *si;
+		load_labels(name, namelen, &has_labels);
+		for_each_string_list_item(si, item->group)
+			if (!string_list_has_string(&has_labels, si->string))
+				return 0;
+	}
+
 	/* If the match was just the prefix, we matched */
 	if (!*match)
 		return MATCHED_RECURSIVELY;
diff --git a/pathspec.c b/pathspec.c
index 4dff252..c227c25 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -94,6 +94,7 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
 {
 	int i;
 	const char *copyfrom = *copyfrom_;
+	const char *out;
 	/* longhand */
 	const char *nextat;
 	for (copyfrom = elt + 2;
@@ -117,6 +118,20 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
 			continue;
 		}
 
+		if (skip_prefix(copyfrom, "label:", &out)) {
+			struct strbuf sb = STRBUF_INIT;
+			size_t l = nextat - out;
+			strbuf_add(&sb, out, l);
+			if (!item->group) {
+				item->group = xmalloc(sizeof(*item->group));
+				string_list_init(item->group, 1);
+			}
+			string_list_split(item->group, sb.buf, ' ', -1);
+			string_list_remove_empty_items(item->group, 0);
+			strbuf_release(&sb);
+			continue;
+		}
+
 		for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
 			if (strlen(pathspec_magic[i].name) == len &&
 			    !strncmp(pathspec_magic[i].name, copyfrom, len)) {
@@ -425,7 +440,7 @@ void parse_pathspec(struct pathspec *pathspec,
 	for (i = 0; i < n; i++) {
 		unsigned short_magic;
 		entry = argv[i];
-
+		item[i].group = NULL;
 		item[i].magic = prefix_pathspec(item + i, &short_magic,
 						argv + i, flags,
 						prefix, prefixlen, entry);
@@ -502,6 +517,13 @@ void copy_pathspec(struct pathspec *dst, const struct pathspec *src)
 
 void free_pathspec(struct pathspec *pathspec)
 {
+	int i;
+	for (i = 0; i < pathspec->nr; i++) {
+		if (pathspec->items[i].group)
+			string_list_clear(pathspec->items[i].group, 1);
+		free(pathspec->items[i].group);
+	}
+
 	free(pathspec->items);
 	pathspec->items = NULL;
 }
diff --git a/pathspec.h b/pathspec.h
index 0c11262..e3f7ebf 100644
--- a/pathspec.h
+++ b/pathspec.h
@@ -32,6 +32,7 @@ struct pathspec {
 		int len, prefix;
 		int nowildcard_len;
 		int flags;
+		struct string_list *group;
 	} *items;
 };
 
diff --git a/t/t6134-pathspec-with-labels.sh b/t/t6134-pathspec-with-labels.sh
new file mode 100755
index 0000000..0c061ce
--- /dev/null
+++ b/t/t6134-pathspec-with-labels.sh
@@ -0,0 +1,91 @@
+#!/bin/sh
+
+test_description='test labels in pathspecs'
+. ./test-lib.sh
+
+test_expect_success 'setup a tree' '
+	for p in file sub/file sub/sub/file sub/file2 sub/sub/sub/file sub2/file; do
+		if echo $p | grep /; then
+			mkdir -p $(dirname $p)
+		fi &&
+		: >$p &&
+		git add $p &&
+		git commit -m $p
+	done &&
+	git log --oneline --format=%s >actual &&
+	cat <<EOF >expect &&
+sub2/file
+sub/sub/sub/file
+sub/file2
+sub/sub/file
+sub/file
+file
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'pathspec with labels and no .gitattributes exists' '
+	git ls-files ":(label:a)" >actual &&
+	test_must_be_empty actual
+'
+
+test_expect_success 'setup .gitattributes' '
+	cat <<-EOF >.gitattributes &&
+	/file label=b
+	sub/file label=a
+	sub/sub/* label=b,c
+	EOF
+	git add .gitattributes &&
+	git commit -m "add attributes"
+'
+
+test_expect_success 'check label' '
+	cat <<-EOF >expect &&
+	sub/file
+	EOF
+	git ls-files ":(label:a)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check label from label list' '
+	cat <<-EOF >expect &&
+	sub/sub/file
+	EOF
+	git ls-files ":(label:c)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check label with more labels' '
+	cat <<-EOF >expect &&
+	file
+	sub/sub/file
+	EOF
+	git ls-files ":(label:b)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check label with more labels but excluded path' '
+	cat <<-EOF >expect &&
+	sub/sub/file
+	EOF
+	git ls-files ":(label:b)" ":(exclude)./file" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check label specifying more labels' '
+	cat <<-EOF >expect &&
+	sub/sub/file
+	EOF
+	git ls-files ":(label:b c)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check label specifying more labels' '
+	cat <<-EOF >expect &&
+	sub/file
+	sub/sub/file
+	EOF
+	git ls-files ":(label:b c)" ":(label:a)" >actual &&
+	test_cmp expect actual
+'
+test_done
-- 
2.8.2.400.g66c4903.dirty

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/4] pathspec: record labels
  2016-05-13  0:19 ` [PATCH 4/4] pathspec: record labels Stefan Beller
@ 2016-05-13  4:32   ` Junio C Hamano
  2016-05-13  5:26     ` Stefan Beller
  2016-05-13  5:26   ` Junio C Hamano
  1 sibling, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2016-05-13  4:32 UTC (permalink / raw)
  To: Stefan Beller; +Cc: pclouds, git, jrnieder, Jens.Lehmann

Stefan Beller <sbeller@google.com> writes:

> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> index 8ad29e6..a1fc9e0 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -362,6 +362,11 @@ glob;;
>  	For example, "Documentation/{asterisk}.html" matches
>  	"Documentation/git.html" but not "Documentation/ppc/ppc.html"
>  	or "tools/perf/Documentation/perf.html".
> +
> +label:<white space separated list>;;
> +	Labels can be assigned to pathspecs in the .gitattributes file.
> +	By specifying a list of labels the pattern will match only
> +	files which have all of the listed labels.
>  +
>  Two consecutive asterisks ("`**`") in patterns matched against
>  full pathname may have special meaning:

I think the new text is stuffed in the middle of the "glob;;" entry.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] pathspec: move prefix check out of the inner loop
  2016-05-13  0:19 ` [PATCH 3/4] pathspec: move prefix check out of the inner loop Stefan Beller
@ 2016-05-13  4:43   ` Junio C Hamano
  0 siblings, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2016-05-13  4:43 UTC (permalink / raw)
  To: pclouds; +Cc: Stefan Beller, git, jrnieder, Jens.Lehmann

Stefan Beller <sbeller@google.com> writes:

> The prefix check is not related the check of pathspec magic; also there
> is no code that is relevant after we'd break the loop on a match for
> "prefix:". So move the check before the loop and shortcircuit the outer
> loop.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---

What were we thinking back when we added this in the middle of the
loop at 233c3e6c (parse_pathspec: preserve prefix length via
PATHSPEC_PREFIX_ORIGIN, 2013-07-14)?  I am a bit embarrassed.

>  pathspec.c | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/pathspec.c b/pathspec.c
> index eba37c2..4dff252 100644
> --- a/pathspec.c
> +++ b/pathspec.c
> @@ -107,21 +107,22 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
>  			nextat = copyfrom + len;
>  		if (!len)
>  			continue;
> +
> +		if (starts_with(copyfrom, "prefix:")) {
> +			char *endptr;
> +			*pathspec_prefix = strtol(copyfrom + 7,
> +						  &endptr, 10);
> +			if (endptr - copyfrom != len)
> +				die(_("invalid parameter for pathspec magic 'prefix'"));
> +			continue;
> +		}
> +
>  		for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
>  			if (strlen(pathspec_magic[i].name) == len &&
>  			    !strncmp(pathspec_magic[i].name, copyfrom, len)) {
>  				*magic |= pathspec_magic[i].bit;
>  				break;
>  			}
> -			if (starts_with(copyfrom, "prefix:")) {
> -				char *endptr;
> -				*pathspec_prefix = strtol(copyfrom + 7,
> -							  &endptr, 10);
> -				if (endptr - copyfrom != len)
> -					die(_("invalid parameter for pathspec magic 'prefix'"));
> -				/* "i" would be wrong, but it does not matter */
> -				break;
> -			}
>  		}
>  		if (ARRAY_SIZE(pathspec_magic) <= i)
>  			die(_("Invalid pathspec magic '%.*s' in '%s'"),

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/4] pathspec: record labels
  2016-05-13  4:32   ` Junio C Hamano
@ 2016-05-13  5:26     ` Stefan Beller
  0 siblings, 0 replies; 25+ messages in thread
From: Stefan Beller @ 2016-05-13  5:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, git, Jonathan Nieder, Jens Lehmann

On Thu, May 12, 2016 at 9:32 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
>> index 8ad29e6..a1fc9e0 100644
>> --- a/Documentation/glossary-content.txt
>> +++ b/Documentation/glossary-content.txt
>> @@ -362,6 +362,11 @@ glob;;
>>       For example, "Documentation/{asterisk}.html" matches
>>       "Documentation/git.html" but not "Documentation/ppc/ppc.html"
>>       or "tools/perf/Documentation/perf.html".
>> +
>> +label:<white space separated list>;;
>> +     Labels can be assigned to pathspecs in the .gitattributes file.
>> +     By specifying a list of labels the pattern will match only
>> +     files which have all of the listed labels.
>>  +
>>  Two consecutive asterisks ("`**`") in patterns matched against
>>  full pathname may have special meaning:
>
> I think the new text is stuffed in the middle of the "glob;;" entry.

It is. Will fix in a reroll.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/4] pathspec: record labels
  2016-05-13  0:19 ` [PATCH 4/4] pathspec: record labels Stefan Beller
  2016-05-13  4:32   ` Junio C Hamano
@ 2016-05-13  5:26   ` Junio C Hamano
  2016-05-13  5:41     ` Stefan Beller
  1 sibling, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2016-05-13  5:26 UTC (permalink / raw)
  To: Stefan Beller; +Cc: pclouds, git, jrnieder, Jens.Lehmann

Stefan Beller <sbeller@google.com> writes:

> +
> +label:<white space separated list>;;
> +	Labels can be assigned to pathspecs in the .gitattributes file.
> +	By specifying a list of labels the pattern will match only
> +	files which have all of the listed labels.
>  +

Attributes are given to paths, not pathspecs.  You specify which paths
the entry applies to by matching the pattern (which is at the beginning
of the line) against each path, and take the matching entries.

In any case, what the first sentence tries to explain applies to
each and every attribute .gitattributes file can define, and
explaining it should be better left for the first sentence in the
DESCRIPTION section.

As to the second sentence, I would say "When specifying the paths to
work on to various Git commands, the :(label=<label>,...)  magic
pathspec can be used to select paths that have all the labels given
by the 'label' attribute.", or something like that, to clarify where
that "specifying" and "match" happens (they do not happen here, but
happen when using magic pathspec).

> +void load_labels(const char *name, int namelen, struct string_list *list)

This must be file scope static, no?

> +{
> +	static struct git_attr *attr;
> +	struct git_attr_check check;
> +	char *path = xmemdupz(name, namelen);
> +
> +	if (!attr)
> +		attr = git_attr("label");
> +	check.attr = attr;
> +
> +	if (git_check_attr(path, 1, &check))
> +		die("git_check_attr died");
> +
> +	if (ATTR_CUSTOM(check.value)) {
> +		string_list_split(list, check.value, ',', -1);
> +		string_list_sort(list);
> +	}
> +
> +	free(path);
> +}

I am wondering where we should sanity check (and die, pointing out
an error in .gitattributes file).  Is this function a good place to
reject TRUE and FALSE?

By the way, ATTR_CUSTOM() is probably a bad name.  gitattributes.txt
calls them Set, Unset, Set to a value and Unspecified, and this is
trying to single out "Set to a value" case.  ATTR_STRING() may be
more appropriate.

> @@ -263,6 +285,15 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
>  	    strncmp(item->match, name - prefix, item->prefix))
>  		return 0;
>  
> +	if (item->group) {
> +		struct string_list has_labels = STRING_LIST_INIT_DUP;
> +		struct string_list_item *si;
> +		load_labels(name, namelen, &has_labels);
> +		for_each_string_list_item(si, item->group)
> +			if (!string_list_has_string(&has_labels, si->string))
> +				return 0;
> +	}
> +

Is this the right place, before we even check if the prefix already
covered everything?

We are looking at one element in the pathspec here.  If that element
is known to be a "group", and the path has all the required labels,
is it correct to fall through?

    Ahh, you are making ":(label=...)makefile" to say "paths must
    match the string 'makefile' in some way, and further the paths
    must have all these labels?  Then falling through is correct.
    The placement before the "prefix covered" looks still a bit
    strange, but understandable.

Is this code leaking has_labels every time it is called?

By the way, we should pick one between label and group and stick to
it, no?  Perhaps call the new field "item->label"?

>  	/* If the match was just the prefix, we matched */
>  	if (!*match)
>  		return MATCHED_RECURSIVELY;
> diff --git a/pathspec.c b/pathspec.c
> index 4dff252..c227c25 100644
> --- a/pathspec.c
> +++ b/pathspec.c
> @@ -94,6 +94,7 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
>  {
>  	int i;
>  	const char *copyfrom = *copyfrom_;
> +	const char *out;

It probably is meant as "output", but it looks more like the "body" or
the "contents" of the named magic to me.

>  	/* longhand */
>  	const char *nextat;
>  	for (copyfrom = elt + 2;
> @@ -117,6 +118,20 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
>  			continue;
>  		}
>  
> +		if (skip_prefix(copyfrom, "label:", &out)) {

This is good, but can you fix the "prefix:" one we see a few lines
above, too, using this to lose "copyfrom + 7" there?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/4] pathspec: record labels
  2016-05-13  5:26   ` Junio C Hamano
@ 2016-05-13  5:41     ` Stefan Beller
  2016-05-13  6:28       ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Stefan Beller @ 2016-05-13  5:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, git, Jonathan Nieder, Jens Lehmann

On Thu, May 12, 2016 at 10:26 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> +
>> +label:<white space separated list>;;
>> +     Labels can be assigned to pathspecs in the .gitattributes file.
>> +     By specifying a list of labels the pattern will match only
>> +     files which have all of the listed labels.
>>  +
>
> Attributes are given to paths, not pathspecs.  You specify which paths
> the entry applies to by matching the pattern (which is at the beginning
> of the line) against each path, and take the matching entries.
>
> In any case, what the first sentence tries to explain applies to
> each and every attribute .gitattributes file can define, and
> explaining it should be better left for the first sentence in the
> DESCRIPTION section.
>
> As to the second sentence, I would say "When specifying the paths to
> work on to various Git commands, the :(label=<label>,...)  magic
> pathspec can be used to select paths that have all the labels given
> by the 'label' attribute.", or something like that, to clarify where
> that "specifying" and "match" happens (they do not happen here, but
> happen when using magic pathspec).

Reminder for myself: add a test for

    ":(label=a) :(exclude,label=b)"

>
>> +void load_labels(const char *name, int namelen, struct string_list *list)
>
> This must be file scope static, no?

Yes. It seems like I forget these statics often. :(

>
>> +{
>> +     static struct git_attr *attr;
>> +     struct git_attr_check check;
>> +     char *path = xmemdupz(name, namelen);
>> +
>> +     if (!attr)
>> +             attr = git_attr("label");
>> +     check.attr = attr;
>> +
>> +     if (git_check_attr(path, 1, &check))
>> +             die("git_check_attr died");
>> +
>> +     if (ATTR_CUSTOM(check.value)) {
>> +             string_list_split(list, check.value, ',', -1);
>> +             string_list_sort(list);
>> +     }
>> +
>> +     free(path);
>> +}
>
> I am wondering where we should sanity check (and die, pointing out
> an error in .gitattributes file).  Is this function a good place to
> reject TRUE and FALSE?

Would it make sense to mark a file as

    "follows the labeling system, but has no label" (TRUE)
    "doesn't follow the labeling system at all" (FALSE)

This may be a useful addition later on to say

    "I want to talk only about labeled files, but not label 'a'"

I would think.


>
> By the way, ATTR_CUSTOM() is probably a bad name.  gitattributes.txt
> calls them Set, Unset, Set to a value and Unspecified, and this is
> trying to single out "Set to a value" case.  ATTR_STRING() may be
> more appropriate.

Ok.

>
>> @@ -263,6 +285,15 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
>>           strncmp(item->match, name - prefix, item->prefix))
>>               return 0;
>>
>> +     if (item->group) {
>> +             struct string_list has_labels = STRING_LIST_INIT_DUP;
>> +             struct string_list_item *si;
>> +             load_labels(name, namelen, &has_labels);
>> +             for_each_string_list_item(si, item->group)
>> +                     if (!string_list_has_string(&has_labels, si->string))
>> +                             return 0;
>> +     }
>> +
>
> Is this the right place, before we even check if the prefix already
> covered everything?
>
> We are looking at one element in the pathspec here.  If that element
> is known to be a "group", and the path has all the required labels,
> is it correct to fall through?
>
>     Ahh, you are making ":(label=...)makefile" to say "paths must
>     match the string 'makefile' in some way, and further the paths
>     must have all these labels?  Then falling through is correct.

This is how I understood your initial design idea.

    :(label=C_code)contrib/

gives all the retired C programs.

>     The placement before the "prefix covered" looks still a bit
>     strange, but understandable.
>
> Is this code leaking has_labels every time it is called?

It does.

>
> By the way, we should pick one between label and group and stick to
> it, no?  Perhaps call the new field "item->label"?

will do.

>
>>       /* If the match was just the prefix, we matched */
>>       if (!*match)
>>               return MATCHED_RECURSIVELY;
>> diff --git a/pathspec.c b/pathspec.c
>> index 4dff252..c227c25 100644
>> --- a/pathspec.c
>> +++ b/pathspec.c
>> @@ -94,6 +94,7 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
>>  {
>>       int i;
>>       const char *copyfrom = *copyfrom_;
>> +     const char *out;
>
> It probably is meant as "output", but it looks more like the "body" or
> the "contents" of the named magic to me.

ok.

>
>>       /* longhand */
>>       const char *nextat;
>>       for (copyfrom = elt + 2;
>> @@ -117,6 +118,20 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
>>                       continue;
>>               }
>>
>> +             if (skip_prefix(copyfrom, "label:", &out)) {
>
> This is good, but can you fix the "prefix:" one we see a few lines
> above, too, using this to lose "copyfrom + 7" there?

I can do that. Though I did not want to clutter this patch with it.

I wonder that you focus on the details already, but not on the grand
design of things. "Is it actually a sane thing I am proposing here?"
Though you may be biased as the the high level idea came from
you. :)

One of the things I switched last minute and tried to address in the
cover letter is the semantics of ORing or ANDing the labels given
within one pathspec item.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/4] pathspec: record labels
  2016-05-13  5:41     ` Stefan Beller
@ 2016-05-13  6:28       ` Junio C Hamano
  0 siblings, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2016-05-13  6:28 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Duy Nguyen, git, Jonathan Nieder, Jens Lehmann

Stefan Beller <sbeller@google.com> writes:

> Would it make sense to mark a file as
>
>     "follows the labeling system, but has no label" (TRUE)
>     "doesn't follow the labeling system at all" (FALSE)

Isn't the former be "label="

I do not know what you mean by the latter.  I would understand
"pretend this has all the labels under the sun", though.

>>     Ahh, you are making ":(label=...)makefile" to say "paths must
>>     match the string 'makefile' in some way, and further the paths
>>     must have all these labels?  Then falling through is correct.
>
> This is how I understood your initial design idea.
>
>     :(label=C_code)contrib/
>
> gives all the retired C programs.

OK, you probably meant s/C program/shell script/ with a different
label, but I think I got the idea.

> I wonder that you focus on the details already, but not on the grand
> design of things. "Is it actually a sane thing I am proposing here?"
> Though you may be biased as the the high level idea came from
> you. :)

Biased is the primary reason ;-) I trust that other reviewers can
stop me and correct course when I veer off the deep end by myself.

Just like you called this as a "mock", I am treating this as a
testbed to let you try out the "defaultGroup" thing to see how
flexibly you can express common wishes the end users have, i.e.
"does it give us an expressive enough system to replace repo?"
is the question I want this code to help answering.  And that is
why I was trying to make sure it is good enough quickly.

> One of the things I switched last minute and tried to address in
> the cover letter is the semantics of ORing or ANDing the labels
> given within one pathspec item.

That is something we may find out that the other way is more useful,
or your final choice is better, by seeing how easy to express common
patterns of submodule selections.  Perhaps we might end up wanting
both, but as you said, OR can be given by listing the same path with
differnt required-labels so I think what you have is good enough for
us to start experimenting with the higher layer.

One worry I have is that historically pathspec matching and
attribute matching have both been very hard to make it perform well.
The placement of string_list_has_string() and load_labels() at the
leaf level of the pathspec matching logic cannot be helped, but
these calls might turn out to be an unacceptable performance
bottleneck.  But it is a bit too early to worry about that, I think.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-13  0:19 [RFC PATCH 0/4] pathspec labels [WAS: submodule groups] Stefan Beller
                   ` (3 preceding siblings ...)
  2016-05-13  0:19 ` [PATCH 4/4] pathspec: record labels Stefan Beller
@ 2016-05-15 10:06 ` Duy Nguyen
  2016-05-15 18:19   ` Junio C Hamano
  2016-05-16 17:20   ` Stefan Beller
  4 siblings, 2 replies; 25+ messages in thread
From: Duy Nguyen @ 2016-05-15 10:06 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, Git Mailing List, Jonathan Nieder, Jens Lehmann

On Fri, May 13, 2016 at 7:19 AM, Stefan Beller <sbeller@google.com> wrote:
> After some fruitful discussion[1] in which Junio suggested trying a very
> different route[2] that is more general and not submodule related, I considered
> doing a mock for this.
>
> This lets you label arbitrary pathspecs, e.g. in git.git we may want to have:
>
>     t/t[0-9]*.sh label=tests
>
> such that
>
>     $ git log --author=Beller ":(label=tests)"
>
> would show all commits in which I touched tests.
>
> This has suprisingly few lines of code, as the first 3 patches are refactoring.
> The actual new feature is in the last patch.
>
> This would solve the submodule issues I want to solve, as I can produce a
> .gitattributes like:
>
>     ./submodule1 label=default
>     ./submodule2 label=default
>     ./submodule3 label=optional-feature1
>
> and then I'd instruct the users to clone like this:
>
>     git clone .. superproject
>     cd superproject
>     git submodule update --init :(label:default)
>
> The second part of the submodule series to collapse these three commands
> will come as an extra series later, then.

Yeah. I can see the use of this, even without submodules.

> What annoys me here:
> Attributes can only be set once at the moment, there is no way to collect all
> attributes. If we'd have such a collection feature we would be able to
> configure this:
>
>     *.[ch] label=C,code
>     contrib/** label=oldstuff

Instead of putting everything in under the same attribute name
"label", make one attribute per label? Would this work?

*.[ch] c-group code-group

And the pathspec magic name can be simply "attr", any git attribute
can be used with it, e.g. :(attr:c-group)

> and run this:
>
>     git status ":(label:C oldstuff)"
>
> which would be the equivalent to
>
>     contrib/**.[ch]
>
> as in this proposed implementation the labels which are given within one
> pathspec item are logical AND. To get the logical OR, you'd have to give multiple
> pathspec items, i.e. ":(label:C)" ":(label:oldstuff)"

It's even better if pathspec does not have to deal with combination
logic. Can attr macros deal with that (or if it can't at the moment,
can we improve it)?
-- 
Duy

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-15 10:06 ` [RFC PATCH 0/4] pathspec labels [WAS: submodule groups] Duy Nguyen
@ 2016-05-15 18:19   ` Junio C Hamano
  2016-05-15 19:33     ` Junio C Hamano
  2016-05-16 17:20   ` Stefan Beller
  1 sibling, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2016-05-15 18:19 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Stefan Beller, Git Mailing List, Jonathan Nieder, Jens Lehmann

Duy Nguyen <pclouds@gmail.com> writes:

> Instead of putting everything in under the same attribute name
> "label", make one attribute per label? Would this work?
>
> *.[ch] c-group code-group

The attribute subsystem expects that there will not be unbounded
large number of attributes, so this is not a good direction to go.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-15 18:19   ` Junio C Hamano
@ 2016-05-15 19:33     ` Junio C Hamano
  2016-05-16  0:03       ` Duy Nguyen
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2016-05-15 19:33 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Stefan Beller, Git Mailing List, Jonathan Nieder, Jens Lehmann

Junio C Hamano <gitster@pobox.com> writes:

> Duy Nguyen <pclouds@gmail.com> writes:
>
>> Instead of putting everything in under the same attribute name
>> "label", make one attribute per label? Would this work?
>>
>> *.[ch] c-group code-group
>
> The attribute subsystem expects that there will not be unbounded
> large number of attributes, so this is not a good direction to go.

Having said that, I do not mind too much if it turns out that it is
necessary to use an unbounded set of random attributes to solve a
specific problem, if the use case is good.

But even then, in order to avoid confusion and name clashes, I'd
prefer to see more like

	*.[ch] group-c group-code

that is matched by

	git cmd ':(group:c code)

i.e. reserve a single prefix that is not and will not be used for
other purposes.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-15 19:33     ` Junio C Hamano
@ 2016-05-16  0:03       ` Duy Nguyen
  0 siblings, 0 replies; 25+ messages in thread
From: Duy Nguyen @ 2016-05-16  0:03 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, Git Mailing List, Jonathan Nieder, Jens Lehmann

On Mon, May 16, 2016 at 2:33 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Duy Nguyen <pclouds@gmail.com> writes:
>>
>>> Instead of putting everything in under the same attribute name
>>> "label", make one attribute per label? Would this work?
>>>
>>> *.[ch] c-group code-group
>>
>> The attribute subsystem expects that there will not be unbounded
>> large number of attributes, so this is not a good direction to go.
>
> Having said that, I do not mind too much if it turns out that it is
> necessary to use an unbounded set of random attributes to solve a
> specific problem, if the use case is good.

I only have a vague idea so far, it seems to me a good idea to be able
to specify "binary-marked paths" or "filter attached paths" from
pathspec. We can already do this with scripts, but we need to be very
careful about quoting. And if it's pathspec it can be combined with
other magic (most likely negation)

> But even then, in order to avoid confusion and name clashes, I'd
> prefer to see more like
>
>         *.[ch] group-c group-code
>
> that is matched by
>
>         git cmd ':(group:c code)
>
> i.e. reserve a single prefix that is not and will not be used for
> other purposes.

For my above use case, i can still define macro group-binary that is
an alias of binary to get around this. So it's ok.
-- 
Duy

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-15 10:06 ` [RFC PATCH 0/4] pathspec labels [WAS: submodule groups] Duy Nguyen
  2016-05-15 18:19   ` Junio C Hamano
@ 2016-05-16 17:20   ` Stefan Beller
  2016-05-16 17:39     ` Junio C Hamano
  1 sibling, 1 reply; 25+ messages in thread
From: Stefan Beller @ 2016-05-16 17:20 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Junio C Hamano, Git Mailing List, Jonathan Nieder, Jens Lehmann

On Sun, May 15, 2016 at 3:06 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> Instead of putting everything in under the same attribute name
> "label", make one attribute per label? Would this work?
>
> *.[ch] c-group code-group
>
> And the pathspec magic name can be simply "attr", any git attribute
> can be used with it, e.g. :(attr:c-group)

So you want to be able to query something like:

    git ls-files ":(crlf=auto)"

as well?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-16 17:20   ` Stefan Beller
@ 2016-05-16 17:39     ` Junio C Hamano
  2016-05-16 17:48       ` Stefan Beller
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2016-05-16 17:39 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Duy Nguyen, Git Mailing List, Jonathan Nieder, Jens Lehmann

Stefan Beller <sbeller@google.com> writes:

> On Sun, May 15, 2016 at 3:06 AM, Duy Nguyen <pclouds@gmail.com> wrote:
>> Instead of putting everything in under the same attribute name
>> "label", make one attribute per label? Would this work?
>>
>> *.[ch] c-group code-group
>>
>> And the pathspec magic name can be simply "attr", any git attribute
>> can be used with it, e.g. :(attr:c-group)
>
> So you want to be able to query something like:
>
>     git ls-files ":(crlf=auto)"
>
> as well?

It would be more like

	git ls-files ":(attr:crlf=auto)"

It certainly sounds tempting, even though I do not want to keep what
this initial chunk of the series needs to do to the minimum.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-16 17:39     ` Junio C Hamano
@ 2016-05-16 17:48       ` Stefan Beller
  2016-05-16 21:18         ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Stefan Beller @ 2016-05-16 17:48 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, Git Mailing List, Jonathan Nieder, Jens Lehmann

On Mon, May 16, 2016 at 10:39 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> On Sun, May 15, 2016 at 3:06 AM, Duy Nguyen <pclouds@gmail.com> wrote:
>>> Instead of putting everything in under the same attribute name
>>> "label", make one attribute per label? Would this work?
>>>
>>> *.[ch] c-group code-group
>>>
>>> And the pathspec magic name can be simply "attr", any git attribute
>>> can be used with it, e.g. :(attr:c-group)
>>
>> So you want to be able to query something like:
>>
>>     git ls-files ":(crlf=auto)"
>>
>> as well?
>
> It would be more like
>
>         git ls-files ":(attr:crlf=auto)"
>
> It certainly sounds tempting, even though I do not want to keep what
> this initial chunk of the series needs to do to the minimum.

This is another case for using ':' instead of '='.
So I think ':' is better for this future enhancement.

Also this future enhancement may ask for

      git ls-files ":(attr:label=foo)"

or

      git ls-files ":(attr:-label)"

so the user can circumvent the warn and ignore strategy. :(

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-16 17:48       ` Stefan Beller
@ 2016-05-16 21:18         ` Junio C Hamano
  2016-05-16 21:36           ` Stefan Beller
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2016-05-16 21:18 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Duy Nguyen, Git Mailing List, Jonathan Nieder, Jens Lehmann

Stefan Beller <sbeller@google.com> writes:

> This is another case for using ':' instead of '='.
> So I think ':' is better for this future enhancement.
>
> Also this future enhancement may ask for
>
>       git ls-files ":(attr:label=foo)"
>
> or
>
>       git ls-files ":(attr:-label)"
>
> so the user can circumvent the warn and ignore strategy. :(

That is exactly what we want, I would say, if we want to accept
"arbitrary set of attributes with their states".

The "warn and ignore" comes only from "with '(:label=X Y Z)', we
inspect attributes X, Y and Z and insist them to be set to true; it
is a configuration error to set the label to anything but a string",
and accepting "arbitrary set of attributes with their states" makes
it moot (as I said earlier in $gmane/294776).

So are we leaning towards ":(attr:<state>)", where <state> is one of

    var=value
    var
    -var
    !var

now?  It makes the pathspec magic parser a bit more complex (and I
am not sure if we have to worry too much about quoting "value" part
to allow arbitrary sequence of bytes), though.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-16 21:18         ` Junio C Hamano
@ 2016-05-16 21:36           ` Stefan Beller
  2016-05-16 21:50             ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Stefan Beller @ 2016-05-16 21:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, Git Mailing List, Jonathan Nieder, Jens Lehmann

On Mon, May 16, 2016 at 2:18 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> This is another case for using ':' instead of '='.
>> So I think ':' is better for this future enhancement.
>>
>> Also this future enhancement may ask for
>>
>>       git ls-files ":(attr:label=foo)"
>>
>> or
>>
>>       git ls-files ":(attr:-label)"
>>
>> so the user can circumvent the warn and ignore strategy. :(
>
> That is exactly what we want, I would say, if we want to accept
> "arbitrary set of attributes with their states".
>
> The "warn and ignore" comes only from "with '(:label=X Y Z)', we
> inspect attributes X, Y and Z and insist them to be set to true; it
> is a configuration error to set the label to anything but a string",
> and accepting "arbitrary set of attributes with their states" makes
> it moot (as I said earlier in $gmane/294776).
>
> So are we leaning towards ":(attr:<state>)", where <state> is one of
>
>     var=value
>     var
>     -var
>     !var
>
> now?  It makes the pathspec magic parser a bit more complex (and I
> am not sure if we have to worry too much about quoting "value" part
> to allow arbitrary sequence of bytes), though.

And we want to have both "label=A B C" and attr:label=A B C" or *just* the
attr query?

We should not allow the user to add arbitrary attributes (i.e. labels).
Instead each of the "attr for labeling purposes" needs to follow a clear scheme
that allows us to add attributes later on that are outside of that scheme.

In a very first implementation we could require the attribute to start with
"label=". ;)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-16 21:36           ` Stefan Beller
@ 2016-05-16 21:50             ` Junio C Hamano
  2016-05-16 22:00               ` Stefan Beller
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2016-05-16 21:50 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Duy Nguyen, Git Mailing List, Jonathan Nieder, Jens Lehmann

Stefan Beller <sbeller@google.com> writes:

> And we want to have both "label=A B C" and attr:label=A B C" or *just* the
> attr query?

I think the choice at this point is between supporting just "label=A
B C" or supporting just "attr:eol=crlf text=auto !diff".

I think "attr:label=A" is merely a degenerated case of the latter.

> We should not allow the user to add arbitrary attributes (i.e. labels).

Hmph, why not?

> Instead each of the "attr for labeling purposes" needs to follow a clear scheme
> that allows us to add attributes later on that are outside of that scheme.

That was my initial reaction when I saw Duy's "attr:crlf=auto" (by
the way, there is no such setting; crlf should be one of TRUE, UNSET
or set to string "input") idea.  But I do not think of a good
argument to justify that arbitrary attributes are not allowed.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-16 21:50             ` Junio C Hamano
@ 2016-05-16 22:00               ` Stefan Beller
  2016-05-16 22:02                 ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Stefan Beller @ 2016-05-16 22:00 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, Git Mailing List, Jonathan Nieder, Jens Lehmann

On Mon, May 16, 2016 at 2:50 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> And we want to have both "label=A B C" and attr:label=A B C" or *just* the
>> attr query?
>
> I think the choice at this point is between supporting just "label=A
> B C" or supporting just "attr:eol=crlf text=auto !diff".
>
> I think "attr:label=A" is merely a degenerated case of the latter.
>
>> We should not allow the user to add arbitrary attributes (i.e. labels).

Because we cannot extend our attributes any further from that?

Consider a user starts using attr <foo> for their labeling purposes.
Later (in 2 years) we decide that <foo> is an attribute we want to
use to mark files as foo-ish. so we add meaning to that attribute
(just like eol.crlf does an arbitrary thing, foo would do another arbitrary
thing).

Then the user picks up the new version of Git and expects <foo> to
still be a "I use it for labeling purposes only" thing. They would not
expect to all files labeled as <foo> to start behaving <foo-ish> ?

> Hmph, why not?

We need a namespace for which
* we can guarantee that it is for labeling purposes only (even in the future)
* is obvious to the user to be a labeling name space

Starting with "label" offers both?

>
>> Instead each of the "attr for labeling purposes" needs to follow a clear scheme
>> that allows us to add attributes later on that are outside of that scheme.
>
> That was my initial reaction when I saw Duy's "attr:crlf=auto" (by
> the way, there is no such setting; crlf should be one of TRUE, UNSET
> or set to string "input") idea.  But I do not think of a good
> argument to justify that arbitrary attributes are not allowed.

I agree that querying for attr:eol or attr:diff is a good idea.

I do however want to point out that all labels for "labeling purposes"
MUST be a clear from the beginning, otherwise we'll have the maintenance
problem down the road?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-16 22:00               ` Stefan Beller
@ 2016-05-16 22:02                 ` Junio C Hamano
  2016-05-16 22:09                   ` Stefan Beller
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2016-05-16 22:02 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Duy Nguyen, Git Mailing List, Jonathan Nieder, Jens Lehmann

Stefan Beller <sbeller@google.com> writes:

>> Hmph, why not?
>
> We need a namespace for which
> * we can guarantee that it is for labeling purposes only (even in the future)
> * is obvious to the user to be a labeling name space
>
> Starting with "label" offers both?

Ah, of course.  I thought that you were trying to limit ":(attr:<attribute>)"
magic only to attributes that begin with "label-", which is where my
"why not?" comes from.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-16 22:02                 ` Junio C Hamano
@ 2016-05-16 22:09                   ` Stefan Beller
  2016-05-16 22:19                     ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Stefan Beller @ 2016-05-16 22:09 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, Git Mailing List, Jonathan Nieder, Jens Lehmann

On Mon, May 16, 2016 at 3:02 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>>> Hmph, why not?
>>
>> We need a namespace for which
>> * we can guarantee that it is for labeling purposes only (even in the future)
>> * is obvious to the user to be a labeling name space
>>
>> Starting with "label" offers both?
>
> Ah, of course.  I thought that you were trying to limit ":(attr:<attribute>)"
> magic only to attributes that begin with "label-", which is where my
> "why not?" comes from.

And going by the logic you presented before, we would
need to error out for the given pathspec ":(<string>)" if

* either the string is not well known (e.g. diif, eol )
* or is outside of the labeling namespace.

So we don't want to see users complaining about
"bug attr:foo worked as a label, now it is a feature; you broke my code"

We would need to ignore data from .gitattributes as it may be crafted from
a newer version of Git, but the command line argument still needs to die
for unknown arguments?

So asking for :(foo) would yield a

    fatal: attr 'foo' is not known to Git, nor is it in the labeling name space

I guess what I am asking is if there is a nice way to query "do we know
this attribute?"

Stefan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]
  2016-05-16 22:09                   ` Stefan Beller
@ 2016-05-16 22:19                     ` Junio C Hamano
  0 siblings, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2016-05-16 22:19 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Duy Nguyen, Git Mailing List, Jonathan Nieder, Jens Lehmann

Stefan Beller <sbeller@google.com> writes:

>> Ah, of course.  I thought that you were trying to limit ":(attr:<attribute>)"
>> magic only to attributes that begin with "label-", which is where my
>> "why not?" comes from.
>
> And going by the logic you presented before, we would
> need to error out for the given pathspec ":(<string>)" if
>
> * either the string is not well known (e.g. diif, eol )
> * or is outside of the labeling namespace.

I do not think that follows _my_ line of thought.  What is "well
known"?  Doesn't that change over time?

If we are to do the "attribute match", there is no useful
enforcement point.  An arbitrary version of Git cannot differentiate
a random cruft users will write in their .gitattributes from a
legitimate attribute that will be introduced in the future, so both
"data in .gitattributes" and "pathspec magic that referes to attribute"
cannot be limited by us.

So if we are going to take the arbitrary ":(attr:<attribute>)"
route, "label-" prefix would be limitation on the "core Git" side
and does not limit what end-user does.  We will promise that we
won't use names that begin with the prefix ourselves and leave them
up to the projects.  If the end user uses an attribute "foo", which
does not begin with "label-", the end user is risking to be broken
by future versions of Git.

If you want to compile an authoritative list of attributes used by
core Git and maintain it forever, you are welcome to add warning,
but I wouldn't bother if I were doing this series, at least at the
beginning.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2016-05-16 22:20 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-13  0:19 [RFC PATCH 0/4] pathspec labels [WAS: submodule groups] Stefan Beller
2016-05-13  0:19 ` [PATCH 1/4] Documentation: correct typo in example for querying attributes Stefan Beller
2016-05-13  0:19 ` [PATCH 2/4] pathspec: move long magic parsing out of prefix_pathspec Stefan Beller
2016-05-13  0:19 ` [PATCH 3/4] pathspec: move prefix check out of the inner loop Stefan Beller
2016-05-13  4:43   ` Junio C Hamano
2016-05-13  0:19 ` [PATCH 4/4] pathspec: record labels Stefan Beller
2016-05-13  4:32   ` Junio C Hamano
2016-05-13  5:26     ` Stefan Beller
2016-05-13  5:26   ` Junio C Hamano
2016-05-13  5:41     ` Stefan Beller
2016-05-13  6:28       ` Junio C Hamano
2016-05-15 10:06 ` [RFC PATCH 0/4] pathspec labels [WAS: submodule groups] Duy Nguyen
2016-05-15 18:19   ` Junio C Hamano
2016-05-15 19:33     ` Junio C Hamano
2016-05-16  0:03       ` Duy Nguyen
2016-05-16 17:20   ` Stefan Beller
2016-05-16 17:39     ` Junio C Hamano
2016-05-16 17:48       ` Stefan Beller
2016-05-16 21:18         ` Junio C Hamano
2016-05-16 21:36           ` Stefan Beller
2016-05-16 21:50             ` Junio C Hamano
2016-05-16 22:00               ` Stefan Beller
2016-05-16 22:02                 ` Junio C Hamano
2016-05-16 22:09                   ` Stefan Beller
2016-05-16 22:19                     ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.