git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH 1/1] attr: add native file mode values support
@ 2023-11-14  2:28 Joanna Wang
  2023-11-14  2:52 ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Joanna Wang @ 2023-11-14  2:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Joanna Wang, tboegi

>> Even if we assume that this code is currently meant to work only
>> with GIT_ATTR_CHECKIN, I do not think this is what you want.  When
>> asked to perform "git add . ':(exclude,mode=160000)'", you not only
>> want to exclude the submodules that are already known to this
>> superproject, but also a repository that _can_ become a submodule of
>> this superproject when added, no?
Sorry, I was totally ignorant of two key concepts that you mention here.
I somehow missed the concept of git_attr_direction altogether and I also
did not know submodules could be added with git-add (I was only
familiar with how
they are currently added in chromium via git update-index).
This makes sense now. I'll fix in the next patch.

>> On the other hand, the GIT_ATTR_CHECKOUT direction is hopefully much
>> simpler.  You'd see what the path in the index is, among a gitlink,
>> a regular non-executable file, an executable file, or a symlink.
I noticed for both the GIT_ATTR_CHECKOUT and GIT_ATTR_CHECKIN directions,
in read_attr(), the indexed .gitattributes file is checked with the
actual file as fallback or vice versa.
I would think that we'd only want to use attributes from one state
(e.g. what's actually in the file)
or the other (e.g. what's indexed).
So I guess I'm still not sure what the "direction" concept is.
For GIT_ATTR_CHECKOUT, would we want to fallback to lstat?

>> "ls-tree" documentation seems to call it %(objectmode).
I can change it to 'objectmode' in a followup patch, if there are no
objections to this.

>> I think the idea is that "mode" being a too generic word can be used
>> for totally different purposes
My thinking was, no matter how generic or rare a name we choose, there
is always a chance
no matter how tiny, that the name will be in use in someone's .gitattributes.
But if people are ok with choosing something less generic
and have that become a 'reserved' attribute name and not have any
existing values in .gitattributes
take precedence I can do that.
I just don't know how git has historically balanced breaking existing
workflows vs easier/more reasonable
implementation/behavior.

>> But stepping back a bit,
>> such an application is likely marking selected few paths with the
>> attribute, and paths for which "mode" was "unset" are now given
>> these natural "mode"; it is inevitable to crash with such uses.
I'm confused. This does not match what I think is the current behavior
of my patch.
If "mode" was unset or removed for a path (meaning '<path> !mode' was
added to .gitattributes),
the code in my patch would respect that and not return the native mode.
It would return 'unspecified' or 'unset'. I have tests for these in
the patch, but if I've missed some
test cases please let me know.


>> Again, this has one hole, I think.  Paths that are not mentioned
>> (not even with "!mode") would come to the function as ATTR__UNKNOWN
>> and trigger the fallback behaviour, even when other paths are given
>> end-user specified "mode" attribute values.
What you are describing sounds correct/what I intended.
So are you saying that the expected behavior is actually:
If the user sets 'mode' for 1+ paths in the repo, then the native mode
fallback should
NOT be used for all paths in the repo?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] attr: add native file mode values support
  2023-11-14  2:28 [PATCH 1/1] attr: add native file mode values support Joanna Wang
@ 2023-11-14  2:52 ` Junio C Hamano
  2023-11-14 21:49   ` [PATCH 1/1] attr: add builtin objectmode " Joanna Wang
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2023-11-14  2:52 UTC (permalink / raw)
  To: Joanna Wang; +Cc: git, tboegi

Joanna Wang <jojwang@google.com> writes:

>>> Even if we assume that this code is currently meant to work only
>>> with GIT_ATTR_CHECKIN, I do not think this is what you want.  When
>>> asked to perform "git add . ':(exclude,mode=160000)'", you not only
>>> want to exclude the submodules that are already known to this
>>> superproject, but also a repository that _can_ become a submodule of
>>> this superproject when added, no?
> Sorry, I was totally ignorant of ...

Nothing to be sorry about, and if I sounded like I was upset or
something, that wasn't my intention.  Reviewers and more experienced
developers read posted patches to give pieces of advice exactly
because no single human is perfect and knows everything.  We cover
holes in each others' knowledge and attention.

>>> On the other hand, the GIT_ATTR_CHECKOUT direction is hopefully much
>>> simpler.  You'd see what the path in the index is, among a gitlink,
>>> a regular non-executable file, an executable file, or a symlink.
> I noticed for both the GIT_ATTR_CHECKOUT and GIT_ATTR_CHECKIN directions,
> in read_attr(), the indexed .gitattributes file is checked with the
> actual file as fallback or vice versa.
> I would think that we'd only want to use attributes from one state
> (e.g. what's actually in the file)
> or the other (e.g. what's indexed).
> So I guess I'm still not sure what the "direction" concept is.
> For GIT_ATTR_CHECKOUT, would we want to fallback to lstat?

When checking things out of the index, the index should be the
source of the truth.  If something is not in the index, is there a
point in falling back to the workint tree state to decide if the
thing should be checked out of the index?

>>> But stepping back a bit,
>>> such an application is likely marking selected few paths with the
>>> attribute, and paths for which "mode" was "unset" are now given
>>> these natural "mode"; it is inevitable to crash with such uses.
> I'm confused. This does not match what I think is the current behavior
> of my patch.
> If "mode" was unset or removed for a path (meaning '<path> !mode' was
> added to .gitattributes),
> the code in my patch would respect that and not return the native mode.
> It would return 'unspecified' or 'unset'.

But the usual practice is *not* to cover all paths with explicit
attribute definition, isn't it?  If somebody used the "foo"
attribute in their project to decide certain paths are worth giving
special treatments, there are paths with that attribute, perhaps
(totally made up example):

	*.c	foo=yes

Now, if you add a new "built-in" attribute next to 'mode' that
assigns "foo" in attr.c:git_check_attr() the same way to those paths
whose value is still ATTR__UNKNOWN after collect_some_attrs returns,
wouldn't a "hello.c" file get attribute 'foo' with value 'yes',
while a "hello.h" file (not mentioned by .gitattributes) will get
whatever value the built-in logic computed for it?  If that existing
project were using "mode" (instead of "foo"), then doesn't this patch
change the behaviour for them?

>>> Again, this has one hole, I think.  Paths that are not mentioned
>>> (not even with "!mode") would come to the function as ATTR__UNKNOWN
>>> and trigger the fallback behaviour, even when other paths are given
>>> end-user specified "mode" attribute values.
> What you are describing sounds correct/what I intended.
> So are you saying that the expected behavior is actually:
> If the user sets 'mode' for 1+ paths in the repo, then the native mode
> fallback should
> NOT be used for all paths in the repo?

That might be workable, but it breaks a well working system suddenly
when somebody uses "mode" in their .gitattibutes and we do not even
warn about the breakage, which is not ideal.

I think, when we see such an attribute is defined while reading from
.gitattributes and friends, we would probably want to warn and
ignore their definition altogether and use *only* the computed
value.  This cannot be done in a backward compatible way, so it
would be better to carve out a namespace (and avoid overly generic
word like "mode") for ourselves to define built-in attributes that
do not overlap with end-user defined attribute names.

Perhaps call this 'builtin-object-mode' or something and document
that any attribute with a name that begins with 'builtin-' will
always get a computed value (possibly "unset"), it is an error
to define such an attribute in .gitattributes system, or something?



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/1] attr: add builtin objectmode values support
  2023-11-14  2:52 ` Junio C Hamano
@ 2023-11-14 21:49   ` Joanna Wang
  2023-11-16  1:26     ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Joanna Wang @ 2023-11-14 21:49 UTC (permalink / raw)
  To: gitster; +Cc: git, jojwang, tboegi

Gives all paths builtin objectmode values based on the paths' modes
(one of 100644, 100755, 120000, 040000, 160000). Users may use
this feature to filter by file types. For example a pathspec such as
':(attr:builtin_objectmode=160000)' could filter for submodules without
needing to have `builtin_objectmode=160000` to be set in .gitattributes
for every submodule path.

These values are also reflected in `git check-attr` results.
If the git_attr_direction is set to GIT_ATTR_INDEX or GIT_ATTR_CHECKIN
and a path is not found in the index, the value will be unspecified.

This patch also reserves the builtin_* attribute namespace for objectmode
and any future builtin attributes. Any user defined attributes using this
reserved namespace will result in a warning. This is a breaking change for
any existing builtin_* attributes.
Pathspecs with some builtin_* attribute name (excluding builtin_objectmode)
will behave like any attribute where there are no user specified values.

Signed-off-by: Joanna Wang <jojwang@google.com>
---

>But the usual practice is *not* to cover all paths with explicit
> attribute definition, isn't it?
Got it, the breaking change is that we're giving mode a value when
user may expect no value.

>When checking things out of the index, the index should be the
>source of the truth.  If something is not in the index, is there a
>point in falling back to the workint tree state to decide if the
>thing should be checked out of the index?
I don't think working tree objectmode fallback makes sense for
GIT_ATTR_CHECKOUT and GIT_ATTR_INDEX. Updated patch implements
without fallback.

>                static struct git_attr mode_attr;
>
>                if (!mode_attr)
>                        mode_attr = git_attr("mode");
Is there an idiomatic way to check if this git_attr (struct or pointer)
has been initialized? I could not find an anything for C but maybe I've
missed a common way to do this in the codebase.

I also addressed the style nits. Thanks for the review/fedback so far. 

 Documentation/gitattributes.txt | 14 +++++++
 attr.c                          | 67 +++++++++++++++++++++++++++++++--
 t/t0003-attributes.sh           | 44 ++++++++++++++++++++++
 t/t6135-pathspec-with-attrs.sh  | 25 ++++++++++++
 4 files changed, 147 insertions(+), 3 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 8c1793c148..84bad3e741 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -100,6 +100,20 @@ for a path to `Unspecified` state.  This can be done by listing
 the name of the attribute prefixed with an exclamation point `!`.
 
 
+RESERVED BUILTIN_* ATTRIBUTES
+-----------------------------
+
+builtin_* is a reserved namespace for builtin attribute values. Any
+user defined attributes under this namespace will result in a warning.
+
+`builtin_objectmode`
+^^^^^^^^^^^^^^^^^^^^
+This attribute is for filtering files by their file bit modes (40000,
+120000, 160000, 100755, 100644). e.g. ':(attr:builtin_objectmode=160000)'.
+You may also check these values with `git check-attr builtin_objectmode -- <file>`.
+If the object is not in the index `git check-attr --cached` will return unspecified.
+
+
 EFFECTS
 -------
 
diff --git a/attr.c b/attr.c
index e62876dfd3..8493f2c5c0 100644
--- a/attr.c
+++ b/attr.c
@@ -17,6 +17,7 @@
 #include "utf8.h"
 #include "quote.h"
 #include "read-cache-ll.h"
+#include "refs.h"
 #include "revision.h"
 #include "object-store-ll.h"
 #include "setup.h"
@@ -183,6 +184,15 @@ static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check)
 	}
 }
 
+/*
+ * Atribute name cannot begin with "builtin_" which
+ * is a reserved namespace for built in attributes values.
+ */
+static int attr_name_reserved(const char *name)
+{
+	return !strncmp(name, "builtin_", strlen("builtin_"));
+}
+
 static int attr_name_valid(const char *name, size_t namelen)
 {
 	/*
@@ -315,7 +325,7 @@ static const char *parse_attr(const char *src, int lineno, const char *cp,
 			cp++;
 			len--;
 		}
-		if (!attr_name_valid(cp, len)) {
+		if (!attr_name_valid(cp, len) || attr_name_reserved(cp)) {
 			report_invalid_attr(cp, len, src, lineno);
 			return NULL;
 		}
@@ -379,7 +389,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		name += strlen(ATTRIBUTE_MACRO_PREFIX);
 		name += strspn(name, blank);
 		namelen = strcspn(name, blank);
-		if (!attr_name_valid(name, namelen)) {
+		if (!attr_name_valid(name, namelen) || attr_name_reserved(name)) {
 			report_invalid_attr(name, namelen, src, lineno);
 			goto fail_return;
 		}
@@ -1240,6 +1250,57 @@ static struct object_id *default_attr_source(void)
 	return &attr_source;
 }
 
+static const char *builtin_object_mode_attr(struct index_state *istate, const char *path)
+{
+	unsigned int mode;
+	struct strbuf sb = STRBUF_INIT;
+	
+	if (direction == GIT_ATTR_CHECKIN) {
+		struct object_id oid;
+		struct stat st;
+		if (lstat(path, &st))
+			die_errno(_("unable to stat '%s'"), path);
+		mode = canon_mode(st.st_mode);
+		if (S_ISDIR(mode)) {
+			/* `path` is either a directory or it is a submodule,
+			 * in which case it is already indexed as submodule
+			 * or it does not exist in the index yet and we need to
+			 * check if we can resolve to a ref.
+			*/
+			int pos = index_name_pos(istate, path, strlen(path));
+			if (pos >= 0) {
+				 if (S_ISGITLINK(istate->cache[pos]->ce_mode))
+					 mode = istate->cache[pos]->ce_mode;
+			} else if (resolve_gitlink_ref(path, "HEAD", &oid) == 0)
+				mode = S_IFGITLINK;
+		}
+	} else {
+		/*
+		 * For GIT_ATTR_CHECKOUT and GIT_ATTR_INDEX we only check
+		 * for mode in the index.
+		 */
+		int pos = index_name_pos(istate, path, strlen(path));
+		if (pos >= 0)
+			mode = istate->cache[pos]->ce_mode;
+		else 
+			return ATTR__UNSET;
+	}
+	strbuf_addf(&sb, "%06o", mode);
+	return sb.buf;
+	
+}
+
+
+static const char *compute_builtin_attr(struct index_state *istate,
+					  const char *path,
+					  const struct git_attr *attr) {
+	const struct git_attr *object_mode_attr = git_attr("builtin_objectmode");
+	
+	if (attr == object_mode_attr)
+		return builtin_object_mode_attr(istate, path);
+	return ATTR__UNSET;
+}
+
 void git_check_attr(struct index_state *istate,
 		    const char *path,
 		    struct attr_check *check)
@@ -1253,7 +1314,7 @@ void git_check_attr(struct index_state *istate,
 		unsigned int n = check->items[i].attr->attr_nr;
 		const char *value = check->all_attrs[n].value;
 		if (value == ATTR__UNKNOWN)
-			value = ATTR__UNSET;
+			value = compute_builtin_attr(istate, path, check->all_attrs[n].attr);
 		check->items[i].value = value;
 	}
 }
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index aee2298f01..25aa3fbd05 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -19,6 +19,16 @@ attr_check () {
 	test_must_be_empty err
 }
 
+attr_check_object_mode () {
+	path="$1" &&
+	expect="$2" &&
+	check_opts="$3" &&
+	git check-attr $check_opts builtin_objectmode -- "$path" >actual 2>err &&
+	echo "$path: builtin_objectmode: $expect" >expect &&
+	test_cmp expect actual
+	test_must_be_empty err
+}
+
 attr_check_quote () {
 	path="$1" quoted_path="$2" expect="$3" &&
 
@@ -558,4 +568,38 @@ test_expect_success EXPENSIVE 'large attributes file ignored in index' '
 	test_cmp expect err
 '
 
+test_expect_success 'native object mode attributes work' '
+	>exec && chmod +x exec && attr_check_object_mode exec 100755 &&
+	>normal && attr_check_object_mode normal 100644 &&
+	mkdir dir && attr_check_object_mode dir 040000 &&
+	>to_sym ln -s to_sym sym && attr_check_object_mode sym 120000
+'
+
+test_expect_success 'native object mode attributes work with --cached' '
+	>normal && attr_check_object_mode normal unspecified --cached &&
+	git add normal && attr_check_object_mode normal 100644 --cached
+'
+
+test_expect_success 'check object mode attributes work for submodules' '
+	mkdir sub &&
+	(
+		cd sub &&
+		git init &&
+		mv .git .real &&
+		echo "gitdir: .real" >.git &&
+		test_commit first
+	) &&
+	attr_check_object_mode sub 160000 &&
+	attr_check_object_mode sub unspecified --cached &&
+	git add sub &&
+	attr_check_object_mode sub 160000 --cached
+'
+
+test_expect_success 'we do not allow user defined builtin_* attributes' '
+	echo "foo* builtin_foo" >.gitattributes &&
+	git add .gitattributes 2>actual &&
+	echo "builtin_foo is not a valid attribute name: .gitattributes:1" >expect &&
+	test_cmp expect actual
+'
+
 test_done
diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
index a9c1e4e0ec..b08a32ea68 100755
--- a/t/t6135-pathspec-with-attrs.sh
+++ b/t/t6135-pathspec-with-attrs.sh
@@ -295,4 +295,29 @@ test_expect_success 'reading from .gitattributes in a subdirectory (3)' '
 	test_cmp expect actual
 '
 
+test_expect_success 'pathspec with builtin_objectmode attr can be used' '
+	>mode_exec_file_1 &&
+
+	git status -s ":(attr:builtin_objectmode=100644)mode_exec_*" >actual &&
+	echo ?? mode_exec_file_1 >expect &&
+	test_cmp expect actual &&
+
+	git add mode_exec_file_1 && chmod +x mode_exec_file_1 &&
+	git status -s ":(attr:builtin_objectmode=100755)mode_exec_*" >actual &&
+	echo AM mode_exec_file_1 >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'builtin_objectmode attr can be excluded' '
+	>mode_1_regular &&
+	>mode_1_exec  && chmod +x mode_1_exec &&
+	git status -s ":(exclude,attr:builtin_objectmode=100644)" "mode_1_*" >actual &&
+	echo ?? mode_1_exec >expect &&
+	test_cmp expect actual &&
+
+	git status -s ":(exclude,attr:builtin_objectmode=100755)" "mode_1_*" >actual &&
+	echo ?? mode_1_regular >expect &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.43.0.rc0.421.g78406f8d94-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] attr: add builtin objectmode values support
  2023-11-14 21:49   ` [PATCH 1/1] attr: add builtin objectmode " Joanna Wang
@ 2023-11-16  1:26     ` Junio C Hamano
  2023-11-16  1:37       ` Eric Sunshine
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2023-11-16  1:26 UTC (permalink / raw)
  To: Joanna Wang; +Cc: git, tboegi

Joanna Wang <jojwang@google.com> writes:

>>                static struct git_attr mode_attr;
>>
>>                if (!mode_attr)
>>                        mode_attr = git_attr("mode");
> Is there an idiomatic way to check if this git_attr (struct or pointer)
> has been initialized?

Sorry (I think the above is a typo from the "to illustrate the idea"
code snippet in my message), the static one should be a pointer to a
struct, as git_attr() is meant to "intern" the given name and return
a singleton instance.

The idiom is to see if it is still NULL and then call git_attr().
Of course, you could repeatedly call git_attr() with the same name
and get the same singleton instance, so without the "have we
initialized?" check, your code will still be correct, but it would
incur a locked hashmap lookup (to ensure that we do not create the
second instance of the same name), so...

> +RESERVED BUILTIN_* ATTRIBUTES
> +-----------------------------
> +
> +builtin_* is a reserved namespace for builtin attribute values. Any
> +user defined attributes under this namespace will result in a warning.

warning and then?  I think we would want to do "warn and ignore",
not "warn and disable all the builtin_ computation", i.e.,

	... will be ignored and a warning message is given.

> +/*
> + * Atribute name cannot begin with "builtin_" which
> + * is a reserved namespace for built in attributes values.
> + */
> +static int attr_name_reserved(const char *name)
> +{
> +	return !strncmp(name, "builtin_", strlen("builtin_"));
> +}
> +

You'd want to use starts_with() probably.

> @@ -1240,6 +1250,57 @@ static struct object_id *default_attr_source(void)
>  	return &attr_source;
>  }
>  
> +static const char *builtin_object_mode_attr(struct index_state *istate, const char *path)
> +{
> +	unsigned int mode;
> +	struct strbuf sb = STRBUF_INIT;
> +	
> +	if (direction == GIT_ATTR_CHECKIN) {
> +		struct object_id oid;
> +		struct stat st;
> +		if (lstat(path, &st))
> +			die_errno(_("unable to stat '%s'"), path);
> +		mode = canon_mode(st.st_mode);
> +		if (S_ISDIR(mode)) {
> +			/* `path` is either a directory or it is a submodule,
> +			 * in which case it is already indexed as submodule
> +			 * or it does not exist in the index yet and we need to
> +			 * check if we can resolve to a ref.
> +			*/

	/*
	 * Our multi-line comment begins and ends with
	 * slash-asterisk and asterisk-slash on their
	 * own lines without anything else.
	 */

> +			int pos = index_name_pos(istate, path, strlen(path));
> +			if (pos >= 0) {
> +				 if (S_ISGITLINK(istate->cache[pos]->ce_mode))
> +					 mode = istate->cache[pos]->ce_mode;
> +			} else if (resolve_gitlink_ref(path, "HEAD", &oid) == 0)
> +				mode = S_IFGITLINK;
> +		}

Give {braces} around the else if block as the other side has it
(Documentation/CodingGuidelines):

	- When there are multiple arms to a conditional and some of them
	  require braces, enclose even a single line block in braces for
	  consistency.

> +	} else {
> +		/*
> +		 * For GIT_ATTR_CHECKOUT and GIT_ATTR_INDEX we only check
> +		 * for mode in the index.
> +		 */
> +		int pos = index_name_pos(istate, path, strlen(path));
> +		if (pos >= 0)
> +			mode = istate->cache[pos]->ce_mode;
> +		else 
> +			return ATTR__UNSET;
> +	}
> +	strbuf_addf(&sb, "%06o", mode);
> +	return sb.buf;
> +	

The struct itself will be reclaimed as it is on stack, and sb.buf is
given back to the caller, so there technically is no leak, but I think
a more kosher way to do this involves the use of strbuf_detach().

Let's lose the blank line at the end of the function that is not
serving much useful purpose.

>  void git_check_attr(struct index_state *istate,
>  		    const char *path,
>  		    struct attr_check *check)
> @@ -1253,7 +1314,7 @@ void git_check_attr(struct index_state *istate,
>  		unsigned int n = check->items[i].attr->attr_nr;
>  		const char *value = check->all_attrs[n].value;
>  		if (value == ATTR__UNKNOWN)
> -			value = ATTR__UNSET;
> +			value = compute_builtin_attr(istate, path, check->all_attrs[n].attr);
>  		check->items[i].value = value;
>  	}
>  }

Nicely done.

> diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
> index aee2298f01..25aa3fbd05 100755
> --- a/t/t0003-attributes.sh
> +++ b/t/t0003-attributes.sh
> @@ -19,6 +19,16 @@ attr_check () {
>  	test_must_be_empty err
>  }
>  
> +attr_check_object_mode () {
> +	path="$1" &&
> +	expect="$2" &&
> +	check_opts="$3" &&
> +	git check-attr $check_opts builtin_objectmode -- "$path" >actual 2>err &&
> +	echo "$path: builtin_objectmode: $expect" >expect &&
> +	test_cmp expect actual
> +	test_must_be_empty err
> +}
> +
>  attr_check_quote () {
>  	path="$1" quoted_path="$2" expect="$3" &&
>  
> @@ -558,4 +568,38 @@ test_expect_success EXPENSIVE 'large attributes file ignored in index' '
>  	test_cmp expect err
>  '
>  
> +test_expect_success 'native object mode attributes work' '

> +	>exec && chmod +x exec && attr_check_object_mode exec 100755 &&
> +	>normal && attr_check_object_mode normal 100644 &&
> +	mkdir dir && attr_check_object_mode dir 040000 &&

Just a style thing, but we tend to write one statement on each line.

> +	>to_sym ln -s to_sym sym && attr_check_object_mode sym 120000
> +'

You do not need to_sym file, do you?  Symbolic links can be left
dangling.

In any case, the above test would not work on a filesystem without
native executable bit support.  Also, you cannot do "ln -s" on
certain filesystems.  It is a good idea to split the above into
three tests, one that is run unconditionally (for "normal" and
"dir"), another that is run under the POSIXPERM prerequisite, and
the third that is run under the SYNLINKS prerequisite.

> +test_expect_success 'native object mode attributes work with --cached' '
> +	>normal && attr_check_object_mode normal unspecified --cached &&
> +	git add normal && attr_check_object_mode normal 100644 --cached
> +'

For "--cached test", on the other hand, we should be able to set the
executable bit or record a symbolic link regardless of the
filesystem using "update-index", e.g.,

	>normal && git add normal &&
	empty_blob=$(git rev-parse :normal)
	cat <<-EOF | git update-index --index-info
	100755 $empty_blob 0	exec
	120000 $empty_blob 0	symlink
	EOF
	attr_check_object_mode normal 100644 --cached &&
	attr_check_object_mode exec 100755 --cached &&
	attr_check_object_mode symlink 120000 --cached

or something.

> +test_expect_success 'check object mode attributes work for submodules' '
> +	mkdir sub &&
> +	(
> +		cd sub &&
> +		git init &&
> +		mv .git .real &&
> +		echo "gitdir: .real" >.git &&
> +		test_commit first
> +	) &&
> +	attr_check_object_mode sub 160000 &&
> +	attr_check_object_mode sub unspecified --cached &&
> +	git add sub &&
> +	attr_check_object_mode sub 160000 --cached
> +'

Sounds good.

> +test_expect_success 'we do not allow user defined builtin_* attributes' '
> +	echo "foo* builtin_foo" >.gitattributes &&
> +	git add .gitattributes 2>actual &&
> +	echo "builtin_foo is not a valid attribute name: .gitattributes:1" >expect &&
> +	test_cmp expect actual
> +'

Another thing that is worth checking is what happens when there is an
conflicting entry:

	echo "foo builtin_objectmode=200" >.gitattributes &&
	>foo &&
	attr_check_object_mode foo 100644

>  test_done
> diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
> index a9c1e4e0ec..b08a32ea68 100755
> --- a/t/t6135-pathspec-with-attrs.sh
> +++ b/t/t6135-pathspec-with-attrs.sh
> @@ -295,4 +295,29 @@ test_expect_success 'reading from .gitattributes in a subdirectory (3)' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'pathspec with builtin_objectmode attr can be used' '
> +	>mode_exec_file_1 &&
> +
> +	git status -s ":(attr:builtin_objectmode=100644)mode_exec_*" >actual &&
> +	echo ?? mode_exec_file_1 >expect &&
> +	test_cmp expect actual &&
> +
> +	git add mode_exec_file_1 && chmod +x mode_exec_file_1 &&
> +	git status -s ":(attr:builtin_objectmode=100755)mode_exec_*" >actual &&
> +	echo AM mode_exec_file_1 >expect &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'builtin_objectmode attr can be excluded' '
> +	>mode_1_regular &&
> +	>mode_1_exec  && chmod +x mode_1_exec &&
> +	git status -s ":(exclude,attr:builtin_objectmode=100644)" "mode_1_*" >actual &&
> +	echo ?? mode_1_exec >expect &&
> +	test_cmp expect actual &&
> +
> +	git status -s ":(exclude,attr:builtin_objectmode=100755)" "mode_1_*" >actual &&
> +	echo ?? mode_1_regular >expect &&
> +	test_cmp expect actual
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] attr: add builtin objectmode values support
  2023-11-16  1:26     ` Junio C Hamano
@ 2023-11-16  1:37       ` Eric Sunshine
  2023-11-16  2:53         ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Sunshine @ 2023-11-16  1:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Joanna Wang, git, tboegi

On Wed, Nov 15, 2023 at 8:26 PM Junio C Hamano <gitster@pobox.com> wrote:
> Joanna Wang <jojwang@google.com> writes:
> > +test_expect_success 'native object mode attributes work with --cached' '
> > +     >normal && attr_check_object_mode normal unspecified --cached &&
> > +     git add normal && attr_check_object_mode normal 100644 --cached
> > +'
>
> For "--cached test", on the other hand, we should be able to set the
> executable bit or record a symbolic link regardless of the
> filesystem using "update-index", e.g.,
>
>         empty_blob=$(git rev-parse :normal)
>         cat <<-EOF | git update-index --index-info
>         100755 $empty_blob 0    exec
>         120000 $empty_blob 0    symlink
>         EOF
>
> or something.

A bit more idiomatic in this codebase would be:

        git update-index --index-info <<-EOF
        100755 $empty_blob 0    exec
        120000 $empty_blob 0    symlink
        EOF

No need for `cat`.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] attr: add builtin objectmode values support
  2023-11-16  1:37       ` Eric Sunshine
@ 2023-11-16  2:53         ` Junio C Hamano
  2023-11-16  5:44           ` Joanna Wang
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2023-11-16  2:53 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Joanna Wang, git, tboegi

Eric Sunshine <sunshine@sunshineco.com> writes:

> A bit more idiomatic in this codebase would be:
>
>         git update-index --index-info <<-EOF
>         100755 $empty_blob 0    exec
>         120000 $empty_blob 0    symlink
>         EOF
>
> No need for `cat`.

Aye, of course.  We should not cat a single-source datastream into a
pipe.  Thanks for correcting me.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/1] attr: add builtin objectmode values support
  2023-11-16  2:53         ` Junio C Hamano
@ 2023-11-16  5:44           ` Joanna Wang
  2023-11-16  6:17             ` Junio C Hamano
  2023-11-16  7:57             ` Junio C Hamano
  0 siblings, 2 replies; 15+ messages in thread
From: Joanna Wang @ 2023-11-16  5:44 UTC (permalink / raw)
  To: gitster; +Cc: git, jojwang, sunshine, tboegi

Gives all paths builtin objectmode values based on the paths' modes
(one of 100644, 100755, 120000, 040000, 160000). Users may use
this feature to filter by file types. For example a pathspec such as
':(attr:builtin_objectmode=160000)' could filter for submodules without
needing to have `builtin_objectmode=160000` to be set in .gitattributes
for every submodule path.

These values are also reflected in `git check-attr` results.
If the git_attr_direction is set to GIT_ATTR_INDEX or GIT_ATTR_CHECKIN
and a path is not found in the index, the value will be unspecified.

This patch also reserves the builtin_* attribute namespace for objectmode
and any future builtin attributes. Any user defined attributes using this
reserved namespace will result in a warning. This is a breaking change for
any existing builtin_* attributes.
Pathspecs with some builtin_* attribute name (excluding builtin_objectmode)
will behave like any attribute where there are no user specified values.

Signed-off-by: Joanna Wang <jojwang@google.com>
---

>The idiom is to see if it is still NULL and then call git_attr().
Got it, thank you for explaining again.

>We need a precondition here:
>test_expect_success SYMLINKS
Torsten pointed this out in a previous version,
sorry i missed this until Junio pointed it out again.

I've addressed the all the latest feedback in this update.

 Documentation/gitattributes.txt |  15 +
 attr.c                          |  71 ++++-
 t/#t1700-split-index.sh#        | 547 ++++++++++++++++++++++++++++++++
 t/t0003-attributes.sh           |  75 +++++
 t/t6135-pathspec-with-attrs.sh  |  25 ++
 5 files changed, 730 insertions(+), 3 deletions(-)
 create mode 100755 t/#t1700-split-index.sh#

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 8c1793c148..784aa9d4de 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -100,6 +100,21 @@ for a path to `Unspecified` state.  This can be done by listing
 the name of the attribute prefixed with an exclamation point `!`.
 
 
+RESERVED BUILTIN_* ATTRIBUTES
+-----------------------------
+
+builtin_* is a reserved namespace for builtin attribute values. Any
+user defined attributes under this namespace will be ignored and
+trigger a warning.
+
+`builtin_objectmode`
+^^^^^^^^^^^^^^^^^^^^
+This attribute is for filtering files by their file bit modes (40000,
+120000, 160000, 100755, 100644). e.g. ':(attr:builtin_objectmode=160000)'.
+You may also check these values with `git check-attr builtin_objectmode -- <file>`.
+If the object is not in the index `git check-attr --cached` will return unspecified.
+
+
 EFFECTS
 -------
 
diff --git a/attr.c b/attr.c
index e62876dfd3..f8a94aeb23 100644
--- a/attr.c
+++ b/attr.c
@@ -17,6 +17,7 @@
 #include "utf8.h"
 #include "quote.h"
 #include "read-cache-ll.h"
+#include "refs.h"
 #include "revision.h"
 #include "object-store-ll.h"
 #include "setup.h"
@@ -183,6 +184,15 @@ static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check)
 	}
 }
 
+/*
+ * Atribute name cannot begin with "builtin_" which
+ * is a reserved namespace for built in attributes values.
+ */
+static int attr_name_reserved(const char *name)
+{
+	return starts_with(name, "builtin_");
+}
+
 static int attr_name_valid(const char *name, size_t namelen)
 {
 	/*
@@ -315,7 +325,7 @@ static const char *parse_attr(const char *src, int lineno, const char *cp,
 			cp++;
 			len--;
 		}
-		if (!attr_name_valid(cp, len)) {
+		if (!attr_name_valid(cp, len) || attr_name_reserved(cp)) {
 			report_invalid_attr(cp, len, src, lineno);
 			return NULL;
 		}
@@ -379,7 +389,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		name += strlen(ATTRIBUTE_MACRO_PREFIX);
 		name += strspn(name, blank);
 		namelen = strcspn(name, blank);
-		if (!attr_name_valid(name, namelen)) {
+		if (!attr_name_valid(name, namelen) || attr_name_reserved(name)) {
 			report_invalid_attr(name, namelen, src, lineno);
 			goto fail_return;
 		}
@@ -1240,6 +1250,61 @@ static struct object_id *default_attr_source(void)
 	return &attr_source;
 }
 
+static const char *builtin_object_mode_attr(struct index_state *istate, const char *path)
+{
+	unsigned int mode;
+	struct strbuf sb = STRBUF_INIT;
+	
+	if (direction == GIT_ATTR_CHECKIN) {
+		struct object_id oid;
+		struct stat st;
+		if (lstat(path, &st))
+			die_errno(_("unable to stat '%s'"), path);
+		mode = canon_mode(st.st_mode);
+		if (S_ISDIR(mode)) {
+			/*
+			 *`path` is either a directory or it is a submodule,
+			 * in which case it is already indexed as submodule
+			 * or it does not exist in the index yet and we need to
+			 * check if we can resolve to a ref.
+			*/
+			int pos = index_name_pos(istate, path, strlen(path));
+			if (pos >= 0) {
+				 if (S_ISGITLINK(istate->cache[pos]->ce_mode))
+					 mode = istate->cache[pos]->ce_mode;
+			} else if (resolve_gitlink_ref(path, "HEAD", &oid) == 0) {
+				mode = S_IFGITLINK;
+			}
+		}
+	} else {
+		/*
+		 * For GIT_ATTR_CHECKOUT and GIT_ATTR_INDEX we only check
+		 * for mode in the index.
+		 */
+		int pos = index_name_pos(istate, path, strlen(path));
+		if (pos >= 0)
+			mode = istate->cache[pos]->ce_mode;
+		else 
+			return ATTR__UNSET;
+	}
+	strbuf_addf(&sb, "%06o", mode);
+	return strbuf_detach(&sb, NULL);
+}
+
+
+static const char *compute_builtin_attr(struct index_state *istate,
+					  const char *path,
+					  const struct git_attr *attr) {
+	static const struct git_attr *object_mode_attr;
+
+	if (!object_mode_attr)
+		object_mode_attr = git_attr("builtin_objectmode");
+	
+	if (attr == object_mode_attr)
+		return builtin_object_mode_attr(istate, path);
+	return ATTR__UNSET;
+}
+
 void git_check_attr(struct index_state *istate,
 		    const char *path,
 		    struct attr_check *check)
@@ -1253,7 +1318,7 @@ void git_check_attr(struct index_state *istate,
 		unsigned int n = check->items[i].attr->attr_nr;
 		const char *value = check->all_attrs[n].value;
 		if (value == ATTR__UNKNOWN)
-			value = ATTR__UNSET;
+			value = compute_builtin_attr(istate, path, check->all_attrs[n].attr);
 		check->items[i].value = value;
 	}
 }
diff --git a/t/#t1700-split-index.sh# b/t/#t1700-split-index.sh#
new file mode 100755
index 0000000000..a7b7263b35
--- /dev/null
+++ b/t/#t1700-split-index.sh#
@@ -0,0 +1,547 @@
+#!/bin/sh
+
+test_description='split index mode tests'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# We need total control of index splitting here
+sane_unset GIT_TEST_SPLIT_INDEX
+
+# Testing a hard coded SHA against an index with an extension
+# that can vary from run to run is problematic so we disable
+# those extensions.
+sane_unset GIT_TEST_FSMONITOR
+sane_unset GIT_TEST_INDEX_THREADS
+
+# Create a file named as $1 with content read from stdin.
+# Set the file's mtime to a few seconds in the past to avoid racy situations.
+create_non_racy_file () {
+	cat >"$1" &&
+	test-tool chmtime =-5 "$1"
+}
+
+test_expect_success 'setup' '
+	test_oid_cache <<-EOF
+	own_v3 sha1:8299b0bcd1ac364e5f1d7768efb62fa2da79a339
+	own_v3 sha256:38a6d2925e3eceec33ad7b34cbff4e0086caa0daf28f31e51f5bd94b4a7af86b
+
+	base_v3 sha1:39d890139ee5356c7ef572216cebcd27aa41f9df
+	base_v3 sha256:c9baeadf905112bf6c17aefbd7d02267afd70ded613c30cafed2d40cb506e1ed
+
+	own_v4 sha1:432ef4b63f32193984f339431fd50ca796493569
+	own_v4 sha256:6738ac6319c25b694afa7bcc313deb182d1a59b68bf7a47b4296de83478c0420
+
+	base_v4 sha1:508851a7f0dfa8691e9f69c7f055865389012491
+	base_v4 sha256:3177d4adfdd4b6904f7e921d91d715a471c0dde7cf6a4bba574927f02b699508
+	EOF
+'
+
+test_expect_success 'enable split index' '
+	git config splitIndex.maxPercentChange 100 &&
+	git update-index --split-index &&
+	test-tool dump-split-index .git/index >actual &&
+	indexversion=$(git update-index --show-index-version) &&
+
+	# NEEDSWORK: Stop hard-coding checksums.
+	if test "$indexversion" = "4"
+	then
+		own=$(test_oid own_v4) &&
+		base=$(test_oid base_v4)
+	else
+		own=$(test_oid own_v3) &&
+		base=$(test_oid base_v3)
+	fi &&
+
+	cat >expect <<-EOF &&
+	own $own
+	base $base
+	replacements:
+	deletions:
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'add one file' '
+	create_non_racy_file one &&
+	git update-index --add one &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<-EOF &&
+	100644 $EMPTY_BLOB 0	one
+	EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	base $base
+	100644 $EMPTY_BLOB 0	one
+	replacements:
+	deletions:
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'disable split index' '
+	git update-index --no-split-index &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<-EOF &&
+	100644 $EMPTY_BLOB 0	one
+	EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	BASE=$(test-tool dump-split-index .git/index | sed -n "s/^own/base/p") &&
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	not a split index
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'enable split index again, "one" now belongs to base index"' '
+	git update-index --split-index &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<-EOF &&
+	100644 $EMPTY_BLOB 0	one
+	EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	$BASE
+	replacements:
+	deletions:
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'modify original file, base index untouched' '
+	echo modified | create_non_racy_file one &&
+	file1_blob=$(git hash-object one) &&
+	git update-index one &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<-EOF &&
+	100644 $file1_blob 0	one
+	EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	q_to_tab >expect <<-EOF &&
+	$BASE
+	100644 $file1_blob 0Q
+	replacements: 0
+	deletions:
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'add another file, which stays index' '
+	create_non_racy_file two &&
+	git update-index --add two &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<-EOF &&
+	100644 $file1_blob 0	one
+	100644 $EMPTY_BLOB 0	two
+	EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	q_to_tab >expect <<-EOF &&
+	$BASE
+	100644 $file1_blob 0Q
+	100644 $EMPTY_BLOB 0	two
+	replacements: 0
+	deletions:
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'remove file not in base index' '
+	git update-index --force-remove two &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<-EOF &&
+	100644 $file1_blob 0	one
+	EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	q_to_tab >expect <<-EOF &&
+	$BASE
+	100644 $file1_blob 0Q
+	replacements: 0
+	deletions:
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'remove file in base index' '
+	git update-index --force-remove one &&
+	git ls-files --stage >ls-files.actual &&
+	test_must_be_empty ls-files.actual &&
+
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	$BASE
+	replacements:
+	deletions: 0
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'add original file back' '
+	create_non_racy_file one &&
+	git update-index --add one &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<-EOF &&
+	100644 $EMPTY_BLOB 0	one
+	EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	$BASE
+	100644 $EMPTY_BLOB 0	one
+	replacements:
+	deletions: 0
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'add new file' '
+	create_non_racy_file two &&
+	git update-index --add two &&
+	git ls-files --stage >actual &&
+	cat >expect <<-EOF &&
+	100644 $EMPTY_BLOB 0	one
+	100644 $EMPTY_BLOB 0	two
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'unify index, two files remain' '
+	git update-index --no-split-index &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<-EOF &&
+	100644 $EMPTY_BLOB 0	one
+	100644 $EMPTY_BLOB 0	two
+	EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	not a split index
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'rev-parse --shared-index-path' '
+	test_create_repo split-index &&
+	(
+		cd split-index &&
+		git update-index --split-index &&
+		echo .git/sharedindex* >expect &&
+		git rev-parse --shared-index-path >actual &&
+		test_cmp expect actual &&
+		mkdir subdirectory &&
+		cd subdirectory &&
+		echo ../.git/sharedindex* >expect &&
+		git rev-parse --shared-index-path >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'set core.splitIndex config variable to true' '
+	git config core.splitIndex true &&
+	create_non_racy_file three &&
+	git update-index --add three &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<-EOF &&
+	100644 $EMPTY_BLOB 0	one
+	100644 $EMPTY_BLOB 0	three
+	100644 $EMPTY_BLOB 0	two
+	EOF
+	test_cmp ls-files.expect ls-files.actual &&
+	BASE=$(test-tool dump-split-index .git/index | grep "^base") &&
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	$BASE
+	replacements:
+	deletions:
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'set core.splitIndex config variable to false' '
+	git config core.splitIndex false &&
+	git update-index --force-remove three &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<-EOF &&
+	100644 $EMPTY_BLOB 0	one
+	100644 $EMPTY_BLOB 0	two
+	EOF
+	test_cmp ls-files.expect ls-files.actual &&
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	not a split index
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'set core.splitIndex config variable back to true' '
+	git config core.splitIndex true &&
+	create_non_racy_file three &&
+	git update-index --add three &&
+	BASE=$(test-tool dump-split-index .git/index | grep "^base") &&
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	$BASE
+	replacements:
+	deletions:
+	EOF
+	test_cmp expect actual &&
+	create_non_racy_file four &&
+	git update-index --add four &&
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	$BASE
+	100644 $EMPTY_BLOB 0	four
+	replacements:
+	deletions:
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'check behavior with splitIndex.maxPercentChange unset' '
+	git config --unset splitIndex.maxPercentChange &&
+	create_non_racy_file five &&
+	git update-index --add five &&
+	BASE=$(test-tool dump-split-index .git/index | grep "^base") &&
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	$BASE
+	replacements:
+	deletions:
+	EOF
+	test_cmp expect actual &&
+	create_non_racy_file six &&
+	git update-index --add six &&
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	$BASE
+	100644 $EMPTY_BLOB 0	six
+	replacements:
+	deletions:
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'check splitIndex.maxPercentChange set to 0' '
+	git config splitIndex.maxPercentChange 0 &&
+	create_non_racy_file seven &&
+	git update-index --add seven &&
+	BASE=$(test-tool dump-split-index .git/index | grep "^base") &&
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	$BASE
+	replacements:
+	deletions:
+	EOF
+	test_cmp expect actual &&
+	create_non_racy_file eight &&
+	git update-index --add eight &&
+	BASE=$(test-tool dump-split-index .git/index | grep "^base") &&
+	test-tool dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<-EOF &&
+	$BASE
+	replacements:
+	deletions:
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'shared index files expire after 2 weeks by default' '
+	create_non_racy_file ten &&
+	git update-index --add ten &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
+	just_under_2_weeks_ago=$((5-14*86400)) &&
+	test-tool chmtime =$just_under_2_weeks_ago .git/sharedindex.* &&
+	create_non_racy_file eleven &&
+	git update-index --add eleven &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
+	just_over_2_weeks_ago=$((-1-14*86400)) &&
+	test-tool chmtime =$just_over_2_weeks_ago .git/sharedindex.* &&
+	create_non_racy_file twelve &&
+	git update-index --add twelve &&
+	test $(ls .git/sharedindex.* | wc -l) -le 2
+'
+
+test_expect_success 'check splitIndex.sharedIndexExpire set to 16 days' '
+	git config splitIndex.sharedIndexExpire "16.days.ago" &&
+	test-tool chmtime =$just_over_2_weeks_ago .git/sharedindex.* &&
+	create_non_racy_file thirteen &&
+	git update-index --add thirteen &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
+	just_over_16_days_ago=$((-1-16*86400)) &&
+	test-tool chmtime =$just_over_16_days_ago .git/sharedindex.* &&
+	create_non_racy_file fourteen &&
+	git update-index --add fourteen &&
+	test $(ls .git/sharedindex.* | wc -l) -le 2
+'
+
+test_expect_success 'check splitIndex.sharedIndexExpire set to "never" and "now"' '
+	git config splitIndex.sharedIndexExpire never &&
+	just_10_years_ago=$((-365*10*86400)) &&
+	test-tool chmtime =$just_10_years_ago .git/sharedindex.* &&
+	create_non_racy_file fifteen &&
+	git update-index --add fifteen &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
+	git config splitIndex.sharedIndexExpire now &&
+	just_1_second_ago=-1 &&
+	test-tool chmtime =$just_1_second_ago .git/sharedindex.* &&
+	create_non_racy_file sixteen &&
+	git update-index --add sixteen &&
+	test $(ls .git/sharedindex.* | wc -l) -le 2
+'
+
+test_expect_success POSIXPERM 'same mode for index & split index' '
+	git init same-mode &&
+	(
+		cd same-mode &&
+		test_commit A &&
+		test_modebits .git/index >index_mode &&
+		test_must_fail git config core.sharedRepository &&
+		git -c core.splitIndex=true status &&
+		shared=$(ls .git/sharedindex.*) &&
+		case "$shared" in
+		*" "*)
+			# we have more than one???
+			false ;;
+		*)
+			test_modebits "$shared" >split_index_mode &&
+			test_cmp index_mode split_index_mode ;;
+		esac
+	)
+'
+
+while read -r mode modebits
+do
+	test_expect_success POSIXPERM "split index respects core.sharedrepository $mode" '
+		# Remove existing shared index files
+		git config core.splitIndex false &&
+		git update-index --force-remove one &&
+		rm -f .git/sharedindex.* &&
+		# Create one new shared index file
+		git config core.sharedrepository "$mode" &&
+		git config core.splitIndex true &&
+		create_non_racy_file one &&
+		git update-index --add one &&
+		echo "$modebits" >expect &&
+		test_modebits .git/index >actual &&
+		test_cmp expect actual &&
+		shared=$(ls .git/sharedindex.*) &&
+		case "$shared" in
+		*" "*)
+			# we have more than one???
+			false ;;
+		*)
+			test_modebits "$shared" >actual &&
+			test_cmp expect actual ;;
+		esac
+	'
+done <<\EOF
+0666 -rw-rw-rw-
+0642 -rw-r---w-
+EOF
+
+test_expect_success POSIXPERM,SANITY 'graceful handling when splitting index is not allowed' '
+	test_create_repo ro &&
+	(
+		cd ro &&
+		test_commit initial &&
+		git update-index --split-index &&
+		test -f .git/sharedindex.*
+	) &&
+	cp ro/.git/index new-index &&
+	test_when_finished "chmod u+w ro/.git" &&
+	chmod u-w ro/.git &&
+	GIT_INDEX_FILE="$(pwd)/new-index" git -C ro update-index --split-index &&
+	chmod u+w ro/.git &&
+	rm ro/.git/sharedindex.* &&
+	GIT_INDEX_FILE=new-index git ls-files >actual &&
+	echo initial.t >expected &&
+	test_cmp expected actual
+'
+
+test_expect_success 'writing split index with null sha1 does not write cache tree' '
+	git config core.splitIndex true &&
+	git config splitIndex.maxPercentChange 0 &&
+	git commit -m "commit" &&
+	{
+		git ls-tree HEAD &&
+		printf "160000 commit $ZERO_OID\\tbroken\\n"
+	} >broken-tree &&
+	echo "add broken entry" >msg &&
+
+	tree=$(git mktree <broken-tree) &&
+	test_tick &&
+	commit=$(git commit-tree $tree -p HEAD <msg) &&
+	git update-ref HEAD "$commit" &&
+	GIT_ALLOW_NULL_SHA1=1 git reset --hard &&
+	test_might_fail test-tool dump-cache-tree >cache-tree.out &&
+	test_line_count = 0 cache-tree.out
+'
+
+test_expect_success 'do not refresh null base index' '
+	test_create_repo merge &&
+	(
+		cd merge &&
+		test_commit initial &&
+		git checkout -b side-branch &&
+		test_commit extra &&
+		git checkout main &&
+		git update-index --split-index &&
+		test_commit more &&
+		# must not write a new shareindex, or we wont catch the problem
+		git -c splitIndex.maxPercentChange=100 merge --no-edit side-branch 2>err &&
+		# i.e. do not expect warnings like
+		# could not freshen shared index .../shareindex.00000...
+		test_must_be_empty err
+	)
+'
+
+test_expect_success 'reading split index at alternate location' '
+	git init reading-alternate-location &&
+	(
+		cd reading-alternate-location &&
+		>file-in-alternate &&
+		git update-index --split-index --add file-in-alternate
+	) &&
+	echo file-in-alternate >expect &&
+
+	# Should be able to find the shared index both right next to
+	# the specified split index file ...
+	GIT_INDEX_FILE=./reading-alternate-location/.git/index \
+	git ls-files --cached >actual &&
+	test_cmp expect actual &&
+
+	# ... and, for backwards compatibility, in the current GIT_DIR
+	# as well.
+	mv -v ./reading-alternate-location/.git/sharedindex.* .git &&
+	GIT_INDEX_FILE=./reading-alternate-location/.git/index \
+	git ls-files --cached >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'GIT_TEST_SPLIT_INDEX works' '
+	git init git-test-split-index &&
+	(
+		cd git-test-split-index &&
+		>file &&
+		GIT_TEST_SPLIT_INDEX=1 git update-index --add file &&
+		ls -l .git/sharedindex.* >actual &&
+		test_line_count = 1 actual
+	)
+'
+
+test_done
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index aee2298f01..86f8681570 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -19,6 +19,20 @@ attr_check () {
 	test_must_be_empty err
 }
 
+attr_check_object_mode_basic () {
+	path="$1" &&
+	expect="$2" &&
+	check_opts="$3" &&
+	git check-attr $check_opts builtin_objectmode -- "$path" >actual 2>err &&
+	echo "$path: builtin_objectmode: $expect" >expect &&
+	test_cmp expect actual
+}
+
+attr_check_object_mode () {
+	attr_check_object_mode_basic "$@" &&
+	test_must_be_empty err
+}
+
 attr_check_quote () {
 	path="$1" quoted_path="$2" expect="$3" &&
 
@@ -558,4 +572,65 @@ test_expect_success EXPENSIVE 'large attributes file ignored in index' '
 	test_cmp expect err
 '
 
+test_expect_success 'builtin object mode attributes work (dir and regular paths)' '
+	>normal &&
+	attr_check_object_mode normal 100644 &&
+	mkdir dir &&
+	attr_check_object_mode dir 040000
+'
+
+test_expect_success POSIXPERM 'builtin object mode attributes work (executable)' '
+	>exec && chmod +x exec &&
+	attr_check_object_mode exec 100755
+'
+
+test_expect_success SYMLINKS 'builtin object mode attributes work (symlinks)' '
+	>to_sym ln -s to_sym sym &&
+	attr_check_object_mode sym 120000
+'
+
+test_expect_success 'native object mode attributes work with --cached' '
+	>normal &&
+	git add normal &&
+	empty_blob=$(git rev-parse :normal) &&
+	git update-index --index-info <<-EOF &&
+	100755 $empty_blob 0	exec
+	120000 $empty_blob 0	symlink
+	EOF
+	attr_check_object_mode normal 100644 --cached &&
+	attr_check_object_mode exec 100755 --cached &&
+	attr_check_object_mode symlink 120000 --cached
+'
+
+test_expect_success 'check object mode attributes work for submodules' '
+	mkdir sub &&
+	(
+		cd sub &&
+		git init &&
+		mv .git .real &&
+		echo "gitdir: .real" >.git &&
+		test_commit first
+	) &&
+	attr_check_object_mode sub 160000 &&
+	attr_check_object_mode sub unspecified --cached &&
+	git add sub &&
+	attr_check_object_mode sub 160000 --cached
+'
+
+test_expect_success 'we do not allow user defined builtin_* attributes' '
+	echo "foo* builtin_foo" >.gitattributes &&
+	git add .gitattributes 2>actual &&
+	echo "builtin_foo is not a valid attribute name: .gitattributes:1" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'user defined builtin_objectmode values are ignored' '
+	echo "foo* builtin_objectmode=12345" >.gitattributes &&
+	git add .gitattributes &&
+	>foo_1 &&
+	attr_check_object_mode_basic foo_1 100644 &&
+	echo "builtin_objectmode is not a valid attribute name: .gitattributes:1" >expect &&
+	test_cmp expect err
+'
+
 test_done
diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
index a9c1e4e0ec..b08a32ea68 100755
--- a/t/t6135-pathspec-with-attrs.sh
+++ b/t/t6135-pathspec-with-attrs.sh
@@ -295,4 +295,29 @@ test_expect_success 'reading from .gitattributes in a subdirectory (3)' '
 	test_cmp expect actual
 '
 
+test_expect_success 'pathspec with builtin_objectmode attr can be used' '
+	>mode_exec_file_1 &&
+
+	git status -s ":(attr:builtin_objectmode=100644)mode_exec_*" >actual &&
+	echo ?? mode_exec_file_1 >expect &&
+	test_cmp expect actual &&
+
+	git add mode_exec_file_1 && chmod +x mode_exec_file_1 &&
+	git status -s ":(attr:builtin_objectmode=100755)mode_exec_*" >actual &&
+	echo AM mode_exec_file_1 >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'builtin_objectmode attr can be excluded' '
+	>mode_1_regular &&
+	>mode_1_exec  && chmod +x mode_1_exec &&
+	git status -s ":(exclude,attr:builtin_objectmode=100644)" "mode_1_*" >actual &&
+	echo ?? mode_1_exec >expect &&
+	test_cmp expect actual &&
+
+	git status -s ":(exclude,attr:builtin_objectmode=100755)" "mode_1_*" >actual &&
+	echo ?? mode_1_regular >expect &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.43.0.rc0.421.g78406f8d94-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] attr: add builtin objectmode values support
  2023-11-16  5:44           ` Joanna Wang
@ 2023-11-16  6:17             ` Junio C Hamano
  2023-11-16  8:08               ` Junio C Hamano
  2023-11-16  7:57             ` Junio C Hamano
  1 sibling, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2023-11-16  6:17 UTC (permalink / raw)
  To: Joanna Wang; +Cc: git, sunshine, tboegi

Joanna Wang <jojwang@google.com> writes:

>  t/#t1700-split-index.sh#        | 547 ++++++++++++++++++++++++++++++++

This was added by mistake and we can safely ignore it?  IOW, is
everything else in the patch, other than the addition of this path,
what you intended to send out?

Other than the removal of that file, I locally applied the following
fix-up while checking the difference relative to the previous
iteration.  Other than these, the updated version looked very clear
and nicely done.

Thanks.

 t/t0003-attributes.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index 86f8681570..774b52c298 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -580,12 +580,13 @@ test_expect_success 'builtin object mode attributes work (dir and regular paths)
 '
 
 test_expect_success POSIXPERM 'builtin object mode attributes work (executable)' '
-	>exec && chmod +x exec &&
+	>exec &&
+	chmod +x exec &&
 	attr_check_object_mode exec 100755
 '
 
 test_expect_success SYMLINKS 'builtin object mode attributes work (symlinks)' '
-	>to_sym ln -s to_sym sym &&
+	ln -s to_sym sym &&
 	attr_check_object_mode sym 120000
 '
 

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] attr: add builtin objectmode values support
  2023-11-16  5:44           ` Joanna Wang
  2023-11-16  6:17             ` Junio C Hamano
@ 2023-11-16  7:57             ` Junio C Hamano
  1 sibling, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2023-11-16  7:57 UTC (permalink / raw)
  To: Joanna Wang; +Cc: git, sunshine, tboegi

Joanna Wang <jojwang@google.com> writes:

> diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
> index a9c1e4e0ec..b08a32ea68 100755
> --- a/t/t6135-pathspec-with-attrs.sh
> +++ b/t/t6135-pathspec-with-attrs.sh
> @@ -295,4 +295,29 @@ test_expect_success 'reading from .gitattributes in a subdirectory (3)' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'pathspec with builtin_objectmode attr can be used' '
> +	>mode_exec_file_1 &&
> +
> +	git status -s ":(attr:builtin_objectmode=100644)mode_exec_*" >actual &&
> +	echo ?? mode_exec_file_1 >expect &&
> +	test_cmp expect actual &&
> +
> +	git add mode_exec_file_1 && chmod +x mode_exec_file_1 &&

This again would break with filesystems that are incapable of
supporting executable bit natively.  Writing one command per line,
doing something like this may help

	git add mode_exec_file_1 &&
	git update-index --chmod=+x mode_exec_file_1 &&

> +	git status -s ":(attr:builtin_objectmode=100755)mode_exec_*" >actual &&
> +	echo AM mode_exec_file_1 >expect &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'builtin_objectmode attr can be excluded' '
> +	>mode_1_regular &&
> +	>mode_1_exec  && chmod +x mode_1_exec &&

Ditto.

> +	git status -s ":(exclude,attr:builtin_objectmode=100644)" "mode_1_*" >actual &&
> +	echo ?? mode_1_exec >expect &&
> +	test_cmp expect actual &&
> +
> +	git status -s ":(exclude,attr:builtin_objectmode=100755)" "mode_1_*" >actual &&
> +	echo ?? mode_1_regular >expect &&
> +	test_cmp expect actual
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] attr: add builtin objectmode values support
  2023-11-16  6:17             ` Junio C Hamano
@ 2023-11-16  8:08               ` Junio C Hamano
  2023-11-16 17:54                 ` Joanna Wang
  2023-12-01  4:01                 ` Joanna Wang
  0 siblings, 2 replies; 15+ messages in thread
From: Junio C Hamano @ 2023-11-16  8:08 UTC (permalink / raw)
  To: Joanna Wang; +Cc: git, sunshine, tboegi

Junio C Hamano <gitster@pobox.com> writes:

> Other than the removal of that file, I locally applied the following
> fix-up while checking the difference relative to the previous
> iteration.

Cumulatively, aside from the removal of the t/#t* file, here is what
I ended up with so far.

Subject: [PATCH] SQUASH???

---
 Documentation/gitattributes.txt |  2 +-
 neue                            |  0
 t/t0003-attributes.sh           |  5 +++--
 t/t6135-pathspec-with-attrs.sh  | 10 ++++++----
 4 files changed, 10 insertions(+), 7 deletions(-)
 create mode 100644 neue

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 784aa9d4de..201bdf5edb 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -108,7 +108,7 @@ user defined attributes under this namespace will be ignored and
 trigger a warning.
 
 `builtin_objectmode`
-^^^^^^^^^^^^^^^^^^^^
+~~~~~~~~~~~~~~~~~~~~
 This attribute is for filtering files by their file bit modes (40000,
 120000, 160000, 100755, 100644). e.g. ':(attr:builtin_objectmode=160000)'.
 You may also check these values with `git check-attr builtin_objectmode -- <file>`.
diff --git a/neue b/neue
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index 86f8681570..774b52c298 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -580,12 +580,13 @@ test_expect_success 'builtin object mode attributes work (dir and regular paths)
 '
 
 test_expect_success POSIXPERM 'builtin object mode attributes work (executable)' '
-	>exec && chmod +x exec &&
+	>exec &&
+	chmod +x exec &&
 	attr_check_object_mode exec 100755
 '
 
 test_expect_success SYMLINKS 'builtin object mode attributes work (symlinks)' '
-	>to_sym ln -s to_sym sym &&
+	ln -s to_sym sym &&
 	attr_check_object_mode sym 120000
 '
 
diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
index b08a32ea68..f6403ebbda 100755
--- a/t/t6135-pathspec-with-attrs.sh
+++ b/t/t6135-pathspec-with-attrs.sh
@@ -295,22 +295,24 @@ test_expect_success 'reading from .gitattributes in a subdirectory (3)' '
 	test_cmp expect actual
 '
 
-test_expect_success 'pathspec with builtin_objectmode attr can be used' '
+test_expect_success POSIXPERM 'pathspec with builtin_objectmode attr can be used' '
 	>mode_exec_file_1 &&
 
 	git status -s ":(attr:builtin_objectmode=100644)mode_exec_*" >actual &&
 	echo ?? mode_exec_file_1 >expect &&
 	test_cmp expect actual &&
 
-	git add mode_exec_file_1 && chmod +x mode_exec_file_1 &&
+	git add mode_exec_file_1 &&
+	chmod +x mode_exec_file_1 &&
 	git status -s ":(attr:builtin_objectmode=100755)mode_exec_*" >actual &&
 	echo AM mode_exec_file_1 >expect &&
 	test_cmp expect actual
 '
 
-test_expect_success 'builtin_objectmode attr can be excluded' '
+test_expect_success POSIXPERM 'builtin_objectmode attr can be excluded' '
 	>mode_1_regular &&
-	>mode_1_exec  && chmod +x mode_1_exec &&
+	>mode_1_exec  &&
+	chmod +x mode_1_exec &&
 	git status -s ":(exclude,attr:builtin_objectmode=100644)" "mode_1_*" >actual &&
 	echo ?? mode_1_exec >expect &&
 	test_cmp expect actual &&
-- 
2.43.0-rc2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] attr: add builtin objectmode values support
  2023-11-16  8:08               ` Junio C Hamano
@ 2023-11-16 17:54                 ` Joanna Wang
  2023-12-01  4:01                 ` Joanna Wang
  1 sibling, 0 replies; 15+ messages in thread
From: Joanna Wang @ 2023-11-16 17:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, sunshine, tboegi

yes sorry, t/#t1700 was a mistake.
Thank you for the review and further fixes

On Thu, Nov 16, 2023 at 12:08 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Other than the removal of that file, I locally applied the following
> > fix-up while checking the difference relative to the previous
> > iteration.
>
> Cumulatively, aside from the removal of the t/#t* file, here is what
> I ended up with so far.
>
> Subject: [PATCH] SQUASH???
>
> ---
>  Documentation/gitattributes.txt |  2 +-
>  neue                            |  0
>  t/t0003-attributes.sh           |  5 +++--
>  t/t6135-pathspec-with-attrs.sh  | 10 ++++++----
>  4 files changed, 10 insertions(+), 7 deletions(-)
>  create mode 100644 neue
>
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index 784aa9d4de..201bdf5edb 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -108,7 +108,7 @@ user defined attributes under this namespace will be ignored and
>  trigger a warning.
>
>  `builtin_objectmode`
> -^^^^^^^^^^^^^^^^^^^^
> +~~~~~~~~~~~~~~~~~~~~
>  This attribute is for filtering files by their file bit modes (40000,
>  120000, 160000, 100755, 100644). e.g. ':(attr:builtin_objectmode=160000)'.
>  You may also check these values with `git check-attr builtin_objectmode -- <file>`.
> diff --git a/neue b/neue
> new file mode 100644
> index 0000000000..e69de29bb2
> diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
> index 86f8681570..774b52c298 100755
> --- a/t/t0003-attributes.sh
> +++ b/t/t0003-attributes.sh
> @@ -580,12 +580,13 @@ test_expect_success 'builtin object mode attributes work (dir and regular paths)
>  '
>
>  test_expect_success POSIXPERM 'builtin object mode attributes work (executable)' '
> -       >exec && chmod +x exec &&
> +       >exec &&
> +       chmod +x exec &&
>         attr_check_object_mode exec 100755
>  '
>
>  test_expect_success SYMLINKS 'builtin object mode attributes work (symlinks)' '
> -       >to_sym ln -s to_sym sym &&
> +       ln -s to_sym sym &&
>         attr_check_object_mode sym 120000
>  '
>
> diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
> index b08a32ea68..f6403ebbda 100755
> --- a/t/t6135-pathspec-with-attrs.sh
> +++ b/t/t6135-pathspec-with-attrs.sh
> @@ -295,22 +295,24 @@ test_expect_success 'reading from .gitattributes in a subdirectory (3)' '
>         test_cmp expect actual
>  '
>
> -test_expect_success 'pathspec with builtin_objectmode attr can be used' '
> +test_expect_success POSIXPERM 'pathspec with builtin_objectmode attr can be used' '
>         >mode_exec_file_1 &&
>
>         git status -s ":(attr:builtin_objectmode=100644)mode_exec_*" >actual &&
>         echo ?? mode_exec_file_1 >expect &&
>         test_cmp expect actual &&
>
> -       git add mode_exec_file_1 && chmod +x mode_exec_file_1 &&
> +       git add mode_exec_file_1 &&
> +       chmod +x mode_exec_file_1 &&
>         git status -s ":(attr:builtin_objectmode=100755)mode_exec_*" >actual &&
>         echo AM mode_exec_file_1 >expect &&
>         test_cmp expect actual
>  '
>
> -test_expect_success 'builtin_objectmode attr can be excluded' '
> +test_expect_success POSIXPERM 'builtin_objectmode attr can be excluded' '
>         >mode_1_regular &&
> -       >mode_1_exec  && chmod +x mode_1_exec &&
> +       >mode_1_exec  &&
> +       chmod +x mode_1_exec &&
>         git status -s ":(exclude,attr:builtin_objectmode=100644)" "mode_1_*" >actual &&
>         echo ?? mode_1_exec >expect &&
>         test_cmp expect actual &&
> --
> 2.43.0-rc2
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] attr: add builtin objectmode values support
  2023-11-16  8:08               ` Junio C Hamano
  2023-11-16 17:54                 ` Joanna Wang
@ 2023-12-01  4:01                 ` Joanna Wang
  2023-12-12 23:17                   ` Junio C Hamano
  1 sibling, 1 reply; 15+ messages in thread
From: Joanna Wang @ 2023-12-01  4:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, sunshine, tboegi

> Cumulatively, aside from the removal of the t/#t* file, here is what
> I ended up with so far.

I want to double check if I should followup here.
I assumed that you had already applied these final fixes on my behalf,
similar to my patch for enabling attr for `git-add`. But if I was wrong,
I'm happy to send another update with all the fixes.


On Thu, Nov 16, 2023 at 4:08 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Other than the removal of that file, I locally applied the following
> > fix-up while checking the difference relative to the previous
> > iteration.
>
> Cumulatively, aside from the removal of the t/#t* file, here is what
> I ended up with so far.
>
> Subject: [PATCH] SQUASH???
>
> ---
>  Documentation/gitattributes.txt |  2 +-
>  neue                            |  0
>  t/t0003-attributes.sh           |  5 +++--
>  t/t6135-pathspec-with-attrs.sh  | 10 ++++++----
>  4 files changed, 10 insertions(+), 7 deletions(-)
>  create mode 100644 neue
>
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index 784aa9d4de..201bdf5edb 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -108,7 +108,7 @@ user defined attributes under this namespace will be ignored and
>  trigger a warning.
>
>  `builtin_objectmode`
> -^^^^^^^^^^^^^^^^^^^^
> +~~~~~~~~~~~~~~~~~~~~
>  This attribute is for filtering files by their file bit modes (40000,
>  120000, 160000, 100755, 100644). e.g. ':(attr:builtin_objectmode=160000)'.
>  You may also check these values with `git check-attr builtin_objectmode -- <file>`.
> diff --git a/neue b/neue
> new file mode 100644
> index 0000000000..e69de29bb2
> diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
> index 86f8681570..774b52c298 100755
> --- a/t/t0003-attributes.sh
> +++ b/t/t0003-attributes.sh
> @@ -580,12 +580,13 @@ test_expect_success 'builtin object mode attributes work (dir and regular paths)
>  '
>
>  test_expect_success POSIXPERM 'builtin object mode attributes work (executable)' '
> -       >exec && chmod +x exec &&
> +       >exec &&
> +       chmod +x exec &&
>         attr_check_object_mode exec 100755
>  '
>
>  test_expect_success SYMLINKS 'builtin object mode attributes work (symlinks)' '
> -       >to_sym ln -s to_sym sym &&
> +       ln -s to_sym sym &&
>         attr_check_object_mode sym 120000
>  '
>
> diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
> index b08a32ea68..f6403ebbda 100755
> --- a/t/t6135-pathspec-with-attrs.sh
> +++ b/t/t6135-pathspec-with-attrs.sh
> @@ -295,22 +295,24 @@ test_expect_success 'reading from .gitattributes in a subdirectory (3)' '
>         test_cmp expect actual
>  '
>
> -test_expect_success 'pathspec with builtin_objectmode attr can be used' '
> +test_expect_success POSIXPERM 'pathspec with builtin_objectmode attr can be used' '
>         >mode_exec_file_1 &&
>
>         git status -s ":(attr:builtin_objectmode=100644)mode_exec_*" >actual &&
>         echo ?? mode_exec_file_1 >expect &&
>         test_cmp expect actual &&
>
> -       git add mode_exec_file_1 && chmod +x mode_exec_file_1 &&
> +       git add mode_exec_file_1 &&
> +       chmod +x mode_exec_file_1 &&
>         git status -s ":(attr:builtin_objectmode=100755)mode_exec_*" >actual &&
>         echo AM mode_exec_file_1 >expect &&
>         test_cmp expect actual
>  '
>
> -test_expect_success 'builtin_objectmode attr can be excluded' '
> +test_expect_success POSIXPERM 'builtin_objectmode attr can be excluded' '
>         >mode_1_regular &&
> -       >mode_1_exec  && chmod +x mode_1_exec &&
> +       >mode_1_exec  &&
> +       chmod +x mode_1_exec &&
>         git status -s ":(exclude,attr:builtin_objectmode=100644)" "mode_1_*" >actual &&
>         echo ?? mode_1_exec >expect &&
>         test_cmp expect actual &&
> --
> 2.43.0-rc2
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] attr: add builtin objectmode values support
  2023-12-01  4:01                 ` Joanna Wang
@ 2023-12-12 23:17                   ` Junio C Hamano
  2023-12-20 20:07                     ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2023-12-12 23:17 UTC (permalink / raw)
  To: Joanna Wang; +Cc: git, sunshine, tboegi

Joanna Wang <jojwang@google.com> writes:

>> Cumulatively, aside from the removal of the t/#t* file, here is what
>> I ended up with so far.
>
> I want to double check if I should followup here.
> I assumed that you had already applied these final fixes on my behalf,
> similar to my patch for enabling attr for `git-add`. But if I was wrong,
> I'm happy to send another update with all the fixes.

I've squashed the fix in, so no need to resend only to patch up what
I pointed out earlier.

The end result fails t0003 under GIT_TEST_PASSING_SANITIZE_LEAK
though.  As the synthetic attribute values are allocated without
being in the hashmap based on the value read from .gitattributes
files, somebody needs to hold pointers to them *and* we need to
avoid allocating unbounded number of them.

The attached is one possible way to plug the leak; I am not sure if
it is the best one, though.  One thing I like about the solution is
that the approach makes sure that the mode attributes we would ever
return are very tightly controlled and does not allow a buggy code
to come up with "mode" to be passed to this new helper function to
pass random and unsupported mode bits without triggering the BUG().

 attr.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git c/attr.c w/attr.c
index b03c20f768..679e42258c 100644
--- c/attr.c
+++ w/attr.c
@@ -1250,10 +1250,34 @@ static struct object_id *default_attr_source(void)
 	return &attr_source;
 }
 
+static const char *interned_mode_string(unsigned int mode)
+{
+	static struct {
+		unsigned int val;
+		char str[7];
+	} mode_string[] = {
+		{ .val = 0040000 },
+		{ .val = 0100644 },
+		{ .val = 0100755 },
+		{ .val = 0120000 },
+		{ .val = 0160000 },
+	};
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(mode_string); i++) {
+		if (mode_string[i].val != mode)
+			continue;
+		if (!*mode_string[i].str)
+			snprintf(mode_string[i].str, sizeof(mode_string[i].str),
+				 "%06o", mode);
+		return mode_string[i].str;
+	}
+	BUG("Unsupported mode 0%o", mode);
+}
+
 static const char *builtin_object_mode_attr(struct index_state *istate, const char *path)
 {
 	unsigned int mode;
-	struct strbuf sb = STRBUF_INIT;
 
 	if (direction == GIT_ATTR_CHECKIN) {
 		struct object_id oid;
@@ -1287,8 +1311,8 @@ static const char *builtin_object_mode_attr(struct index_state *istate, const ch
 		else
 			return ATTR__UNSET;
 	}
-	strbuf_addf(&sb, "%06o", mode);
-	return strbuf_detach(&sb, NULL);
+
+	return interned_mode_string(mode);
 }
 
 


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] attr: add builtin objectmode values support
  2023-12-12 23:17                   ` Junio C Hamano
@ 2023-12-20 20:07                     ` Junio C Hamano
  2024-01-12  6:12                       ` Joanna Wang
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2023-12-20 20:07 UTC (permalink / raw)
  To: git; +Cc: Joanna Wang, sunshine, tboegi

Junio C Hamano <gitster@pobox.com> writes:

> The attached is one possible way to plug the leak; I am not sure if
> it is the best one, though.  One thing I like about the solution is
> that the approach makes sure that the mode attributes we would ever
> return are very tightly controlled and does not allow a buggy code
> to come up with "mode" to be passed to this new helper function to
> pass random and unsupported mode bits without triggering the BUG().
>
>  attr.c | 30 +++++++++++++++++++++++++++---
>  1 file changed, 27 insertions(+), 3 deletions(-)

Anybody who want to propose a better leakfix (we cannot afford to do
with UNLEAK() as the number of leaked mode string will be unbounded)?

Otherwise, I'll squash it in to Jonanna's patch and merge it down to
'next'.

Thanks.

> diff --git c/attr.c w/attr.c
> index b03c20f768..679e42258c 100644
> --- c/attr.c
> +++ w/attr.c
> @@ -1250,10 +1250,34 @@ static struct object_id *default_attr_source(void)
>  	return &attr_source;
>  }
>  
> +static const char *interned_mode_string(unsigned int mode)
> +{
> +	static struct {
> +		unsigned int val;
> +		char str[7];
> +	} mode_string[] = {
> +		{ .val = 0040000 },
> +		{ .val = 0100644 },
> +		{ .val = 0100755 },
> +		{ .val = 0120000 },
> +		{ .val = 0160000 },
> +	};
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(mode_string); i++) {
> +		if (mode_string[i].val != mode)
> +			continue;
> +		if (!*mode_string[i].str)
> +			snprintf(mode_string[i].str, sizeof(mode_string[i].str),
> +				 "%06o", mode);
> +		return mode_string[i].str;
> +	}
> +	BUG("Unsupported mode 0%o", mode);
> +}
> +
>  static const char *builtin_object_mode_attr(struct index_state *istate, const char *path)
>  {
>  	unsigned int mode;
> -	struct strbuf sb = STRBUF_INIT;
>  
>  	if (direction == GIT_ATTR_CHECKIN) {
>  		struct object_id oid;
> @@ -1287,8 +1311,8 @@ static const char *builtin_object_mode_attr(struct index_state *istate, const ch
>  		else
>  			return ATTR__UNSET;
>  	}
> -	strbuf_addf(&sb, "%06o", mode);
> -	return strbuf_detach(&sb, NULL);
> +
> +	return interned_mode_string(mode);
>  }
>  
>  

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] attr: add builtin objectmode values support
  2023-12-20 20:07                     ` Junio C Hamano
@ 2024-01-12  6:12                       ` Joanna Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Joanna Wang @ 2024-01-12  6:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, sunshine, tboegi

(I was out all month.) Thank you for the additional fixes.


On Thu, Dec 21, 2023 at 4:07 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > The attached is one possible way to plug the leak; I am not sure if
> > it is the best one, though.  One thing I like about the solution is
> > that the approach makes sure that the mode attributes we would ever
> > return are very tightly controlled and does not allow a buggy code
> > to come up with "mode" to be passed to this new helper function to
> > pass random and unsupported mode bits without triggering the BUG().
> >
> >  attr.c | 30 +++++++++++++++++++++++++++---
> >  1 file changed, 27 insertions(+), 3 deletions(-)
>
> Anybody who want to propose a better leakfix (we cannot afford to do
> with UNLEAK() as the number of leaked mode string will be unbounded)?
>
> Otherwise, I'll squash it in to Jonanna's patch and merge it down to
> 'next'.
>
> Thanks.
>
> > diff --git c/attr.c w/attr.c
> > index b03c20f768..679e42258c 100644
> > --- c/attr.c
> > +++ w/attr.c
> > @@ -1250,10 +1250,34 @@ static struct object_id *default_attr_source(void)
> >       return &attr_source;
> >  }
> >
> > +static const char *interned_mode_string(unsigned int mode)
> > +{
> > +     static struct {
> > +             unsigned int val;
> > +             char str[7];
> > +     } mode_string[] = {
> > +             { .val = 0040000 },
> > +             { .val = 0100644 },
> > +             { .val = 0100755 },
> > +             { .val = 0120000 },
> > +             { .val = 0160000 },
> > +     };
> > +     int i;
> > +
> > +     for (i = 0; i < ARRAY_SIZE(mode_string); i++) {
> > +             if (mode_string[i].val != mode)
> > +                     continue;
> > +             if (!*mode_string[i].str)
> > +                     snprintf(mode_string[i].str, sizeof(mode_string[i].str),
> > +                              "%06o", mode);
> > +             return mode_string[i].str;
> > +     }
> > +     BUG("Unsupported mode 0%o", mode);
> > +}
> > +
> >  static const char *builtin_object_mode_attr(struct index_state *istate, const char *path)
> >  {
> >       unsigned int mode;
> > -     struct strbuf sb = STRBUF_INIT;
> >
> >       if (direction == GIT_ATTR_CHECKIN) {
> >               struct object_id oid;
> > @@ -1287,8 +1311,8 @@ static const char *builtin_object_mode_attr(struct index_state *istate, const ch
> >               else
> >                       return ATTR__UNSET;
> >       }
> > -     strbuf_addf(&sb, "%06o", mode);
> > -     return strbuf_detach(&sb, NULL);
> > +
> > +     return interned_mode_string(mode);
> >  }
> >
> >

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2024-01-12  6:12 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-14  2:28 [PATCH 1/1] attr: add native file mode values support Joanna Wang
2023-11-14  2:52 ` Junio C Hamano
2023-11-14 21:49   ` [PATCH 1/1] attr: add builtin objectmode " Joanna Wang
2023-11-16  1:26     ` Junio C Hamano
2023-11-16  1:37       ` Eric Sunshine
2023-11-16  2:53         ` Junio C Hamano
2023-11-16  5:44           ` Joanna Wang
2023-11-16  6:17             ` Junio C Hamano
2023-11-16  8:08               ` Junio C Hamano
2023-11-16 17:54                 ` Joanna Wang
2023-12-01  4:01                 ` Joanna Wang
2023-12-12 23:17                   ` Junio C Hamano
2023-12-20 20:07                     ` Junio C Hamano
2024-01-12  6:12                       ` Joanna Wang
2023-11-16  7:57             ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).