git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Bug/regression report - 'git stash push -u' fatal errors when sub-repo present
@ 2021-09-30  0:49 Robert Leftwich
  2021-09-30 10:06 ` René Scharfe
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Robert Leftwich @ 2021-09-30  0:49 UTC (permalink / raw)
  To: git

Thank you for filling out a Git bug report!
Please answer the following questions to help us understand your issue.

What did you do before the bug happened? (Steps to reproduce your issue)
  Initialised a repo and added a file or cloned a repo inside an existing
repo (think dependencies).
  See https://github.com/gitpod-io/gitpod/issues/5675 for background.

  In an existing repo:
    $ mkdir sub_test && cd sub_test/ && git init . && touch test.txt && git
add test.txt
    OR
    $ git clone https://github.com/stencila/test.git sub_test
    THEN
    $ git stash push -u

What did you expect to happen? (Expected behavior)
  Command should complete without error but ignore the directory (this is
the existing behavior prior to v2.31)
    $ git stash push -u
    Ignoring path sub_test
    Saved working directory and index state WIP on (no branch): 94f6e3e283
Git 2.30.2

What happened instead? (Actual behavior)
  Command failed
    $ git stash push -u
    error: sub_test/: is a directory - add files inside instead
    fatal: Unable to process path sub_test/
    Cannot save the untracked files

What's different between what you expected and what actually happened?
  Command failed

Anything else you want to add:
  It happens on all versions from v2.31 to current master.
  It is specifically related to this change:

https://github.com/git/git/commit/6e773527b6b03976cefbb0f9571bd40dd5995e6c#diff-70525b6b89c7cac91e520085d954a68671038d218b77d22855e938ab075a68d8L1006

  If this is the new expected behavior perhaps it can result in a better
error message and related documentation?

Please review the rest of the bug report below.
You can delete any lines you don't wish to share.


[System Info]
git version:
git version 2.33.0.610.gcefe983a32
cpu: x86_64
built from commit: cefe983a320c03d7843ac78e73bd513a27806845
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 5.4.0-1051-gke #54-Ubuntu SMP Thu Aug 5 18:52:13 UTC 2021
x86_64
compiler info: gnuc: 9.3
libc info: glibc: 2.31
$SHELL (typically, interactive shell): /bin/bash


[Enabled Hooks]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bug/regression report - 'git stash push -u' fatal errors when sub-repo present
  2021-09-30  0:49 Bug/regression report - 'git stash push -u' fatal errors when sub-repo present Robert Leftwich
@ 2021-09-30 10:06 ` René Scharfe
  2021-09-30 16:35   ` Junio C Hamano
  2021-10-07 20:31 ` [PATCH 1/3] t3905: show failure to ignore sub-repo René Scharfe
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: René Scharfe @ 2021-09-30 10:06 UTC (permalink / raw)
  To: Robert Leftwich, git, Derrick Stolee

Am 30.09.21 um 02:49 schrieb Robert Leftwich:
> Thank you for filling out a Git bug report!
> Please answer the following questions to help us understand your issue.
>
> What did you do before the bug happened? (Steps to reproduce your issue)
>   Initialised a repo and added a file or cloned a repo inside an existing
> repo (think dependencies).
>   See https://github.com/gitpod-io/gitpod/issues/5675 for background.
>
>   In an existing repo:
>     $ mkdir sub_test && cd sub_test/ && git init . && touch test.txt && git
> add test.txt
>     OR
>     $ git clone https://github.com/stencila/test.git sub_test
>     THEN
>     $ git stash push -u
>
> What did you expect to happen? (Expected behavior)
>   Command should complete without error but ignore the directory (this is
> the existing behavior prior to v2.31)
>     $ git stash push -u
>     Ignoring path sub_test
>     Saved working directory and index state WIP on (no branch): 94f6e3e283
> Git 2.30.2
>
> What happened instead? (Actual behavior)
>   Command failed
>     $ git stash push -u
>     error: sub_test/: is a directory - add files inside instead
>     fatal: Unable to process path sub_test/
>     Cannot save the untracked files
>
> What's different between what you expected and what actually happened?
>   Command failed
>
> Anything else you want to add:
>   It happens on all versions from v2.31 to current master.
>   It is specifically related to this change:
>
> https://github.com/git/git/commit/6e773527b6b03976cefbb0f9571bd40dd5995e6c#diff-70525b6b89c7cac91e520085d954a68671038d218b77d22855e938ab075a68d8L1006
>
>   If this is the new expected behavior perhaps it can result in a better
> error message and related documentation?
>
> Please review the rest of the bug report below.
> You can delete any lines you don't wish to share.

Looping in Stolee as the author of 6e773527b6 (sparse-index: convert
from full to sparse, 2021-03-30).

>
>
> [System Info]
> git version:
> git version 2.33.0.610.gcefe983a32
> cpu: x86_64
> built from commit: cefe983a320c03d7843ac78e73bd513a27806845
> sizeof-long: 8
> sizeof-size_t: 8
> shell-path: /bin/sh
> uname: Linux 5.4.0-1051-gke #54-Ubuntu SMP Thu Aug 5 18:52:13 UTC 2021
> x86_64
> compiler info: gnuc: 9.3
> libc info: glibc: 2.31
> $SHELL (typically, interactive shell): /bin/bash
>
>
> [Enabled Hooks]
>

Here's a raw patch for that.  Versions before 6e773527b6 pass the
included test.

The magic return value of 2 is a bit ugly, but allows adding the
additional check only to the call-site relevant to the bug report.

I don't know if other callers of verify_path() might also need that
check, or if it is too narrow.

René


---
 builtin/update-index.c             | 15 ++++++++++++++-
 read-cache.c                       |  2 +-
 t/t3905-stash-include-untracked.sh |  6 ++++++
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 187203e8bb..3d32db7304 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -445,10 +445,22 @@ static void chmod_path(char flip, const char *path)
 	die("git update-index: cannot chmod %cx '%s'", flip, path);
 }

+static int is_nonbare_repo_dir(const char *path)
+{
+	int ret;
+	struct strbuf sb = STRBUF_INIT;
+
+	strbuf_addstr(&sb, path);
+	ret = is_nonbare_repository_dir(&sb);
+	strbuf_release(&sb);
+	return ret;
+}
+
 static void update_one(const char *path)
 {
 	int stat_errno = 0;
 	struct stat st;
+	int ret;

 	if (mark_valid_only || mark_skip_worktree_only || force_remove ||
 	    mark_fsmonitor_only)
@@ -458,7 +470,8 @@ static void update_one(const char *path)
 		stat_errno = errno;
 	} /* else stat is valid */

-	if (!verify_path(path, st.st_mode)) {
+	ret = verify_path(path, st.st_mode);
+	if (ret == 0 || (ret == 2 && is_nonbare_repo_dir(path))) {
 		fprintf(stderr, "Ignoring path %s\n", path);
 		return;
 	}
diff --git a/read-cache.c b/read-cache.c
index f5d4385c40..0188b5b798 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1040,7 +1040,7 @@ int verify_path(const char *path, unsigned mode)
 			 * sparse directory entries.
 			 */
 			if (c == '\0')
-				return S_ISDIR(mode);
+				return S_ISDIR(mode) ? 2 : 0;
 		} else if (c == '\\' && protect_ntfs) {
 			if (is_ntfs_dotgit(path))
 				return 0;
diff --git a/t/t3905-stash-include-untracked.sh b/t/t3905-stash-include-untracked.sh
index dd2cdcc114..5390eec4e3 100755
--- a/t/t3905-stash-include-untracked.sh
+++ b/t/t3905-stash-include-untracked.sh
@@ -422,4 +422,10 @@ test_expect_success 'stash show --{include,only}-untracked on stashes without un
 	test_must_be_empty actual
 '

+test_expect_success 'stash -u ignores sub-repository' '
+	test_when_finished "rm -rf sub-repo" &&
+	git init sub-repo &&
+	git stash -u
+'
+
 test_done
--
2.33.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bug/regression report - 'git stash push -u' fatal errors when sub-repo present
  2021-09-30 10:06 ` René Scharfe
@ 2021-09-30 16:35   ` Junio C Hamano
  2021-10-01 14:25     ` René Scharfe
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2021-09-30 16:35 UTC (permalink / raw)
  To: René Scharfe; +Cc: Robert Leftwich, git, Derrick Stolee

René Scharfe <l.s.r@web.de> writes:

> Looping in Stolee as the author of 6e773527b6 (sparse-index: convert
> from full to sparse, 2021-03-30).
>
> Here's a raw patch for that.  Versions before 6e773527b6 pass the
> included test.
>
> The magic return value of 2 is a bit ugly, but allows adding the
> additional check only to the call-site relevant to the bug report.
>
> I don't know if other callers of verify_path() might also need that
> check, or if it is too narrow.

The callers inside read-cache.c, which I think were the original
intended audiences of the function, always call verify_path() with
the pathname suitable to be stored as a cache entry.

The callee never has to assume an extra trailing slash might exist
at the end.  And verify_path() said "no", if any caller passed an
extra trailing slash before the commit in question.

I _think_ we defined the new "tree" type entries the sparse index
stuff added to be the _only_ cache entries whose path always ends
with a slash?  If so, perhaps we should audit the existing callers 
and move the "paths that come from end-users to be entered to the
index must never end with a slash" sanity check this function gave
us, which was broken by 6e773527b6, perhaps?

"git update-index" should never allow to create a "tree" kind of
cache entry (making it sparse again should be the task of systems
internal, and should not be done by end-user feeding a pre-shrunk
"tree" kind of entry to record a sparsely populated state, if I
understand correctly), so its call to verify_path(), if it ends with
a directory separator, should say "that's not a good path".

Prehaps we can rename verify_path() to verify_path_internal() that
is _known_ to be called with names that are already vetted to be
stored in ce_name and convert callers inside read-cache.c to call
it, and write verify_path() as a thin wrapper of it to fail when the
former stops by returning S_ISDIR() at the place your fix touched,
or something?

Thanks.

>
> René
>
>
> ---
>  builtin/update-index.c             | 15 ++++++++++++++-
>  read-cache.c                       |  2 +-
>  t/t3905-stash-include-untracked.sh |  6 ++++++
>  3 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/update-index.c b/builtin/update-index.c
> index 187203e8bb..3d32db7304 100644
> --- a/builtin/update-index.c
> +++ b/builtin/update-index.c
> @@ -445,10 +445,22 @@ static void chmod_path(char flip, const char *path)
>  	die("git update-index: cannot chmod %cx '%s'", flip, path);
>  }
>
> +static int is_nonbare_repo_dir(const char *path)
> +{
> +	int ret;
> +	struct strbuf sb = STRBUF_INIT;
> +
> +	strbuf_addstr(&sb, path);
> +	ret = is_nonbare_repository_dir(&sb);
> +	strbuf_release(&sb);
> +	return ret;
> +}
> +
>  static void update_one(const char *path)
>  {
>  	int stat_errno = 0;
>  	struct stat st;
> +	int ret;
>
>  	if (mark_valid_only || mark_skip_worktree_only || force_remove ||
>  	    mark_fsmonitor_only)
> @@ -458,7 +470,8 @@ static void update_one(const char *path)
>  		stat_errno = errno;
>  	} /* else stat is valid */
>
> -	if (!verify_path(path, st.st_mode)) {
> +	ret = verify_path(path, st.st_mode);
> +	if (ret == 0 || (ret == 2 && is_nonbare_repo_dir(path))) {
>  		fprintf(stderr, "Ignoring path %s\n", path);
>  		return;
>  	}
> diff --git a/read-cache.c b/read-cache.c
> index f5d4385c40..0188b5b798 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1040,7 +1040,7 @@ int verify_path(const char *path, unsigned mode)
>  			 * sparse directory entries.
>  			 */
>  			if (c == '\0')
> -				return S_ISDIR(mode);
> +				return S_ISDIR(mode) ? 2 : 0;
>  		} else if (c == '\\' && protect_ntfs) {
>  			if (is_ntfs_dotgit(path))
>  				return 0;
> diff --git a/t/t3905-stash-include-untracked.sh b/t/t3905-stash-include-untracked.sh
> index dd2cdcc114..5390eec4e3 100755
> --- a/t/t3905-stash-include-untracked.sh
> +++ b/t/t3905-stash-include-untracked.sh
> @@ -422,4 +422,10 @@ test_expect_success 'stash show --{include,only}-untracked on stashes without un
>  	test_must_be_empty actual
>  '
>
> +test_expect_success 'stash -u ignores sub-repository' '
> +	test_when_finished "rm -rf sub-repo" &&
> +	git init sub-repo &&
> +	git stash -u
> +'
> +
>  test_done
> --
> 2.33.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bug/regression report - 'git stash push -u' fatal errors when sub-repo present
  2021-09-30 16:35   ` Junio C Hamano
@ 2021-10-01 14:25     ` René Scharfe
  2021-10-04 20:06       ` Derrick Stolee
  0 siblings, 1 reply; 11+ messages in thread
From: René Scharfe @ 2021-10-01 14:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Robert Leftwich, git, Derrick Stolee

Am 30.09.21 um 18:35 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
>
>> Looping in Stolee as the author of 6e773527b6 (sparse-index: convert
>> from full to sparse, 2021-03-30).
>>
>> Here's a raw patch for that.  Versions before 6e773527b6 pass the
>> included test.
>>
>> The magic return value of 2 is a bit ugly, but allows adding the
>> additional check only to the call-site relevant to the bug report.
>>
>> I don't know if other callers of verify_path() might also need that
>> check, or if it is too narrow.
>
> The callers inside read-cache.c, which I think were the original
> intended audiences of the function, always call verify_path() with
> the pathname suitable to be stored as a cache entry.
>
> The callee never has to assume an extra trailing slash might exist
> at the end.  And verify_path() said "no", if any caller passed an
> extra trailing slash before the commit in question.
>
> I _think_ we defined the new "tree" type entries the sparse index
> stuff added to be the _only_ cache entries whose path always ends
> with a slash?  If so, perhaps we should audit the existing callers
> and move the "paths that come from end-users to be entered to the
> index must never end with a slash" sanity check this function gave
> us, which was broken by 6e773527b6, perhaps?
>
> "git update-index" should never allow to create a "tree" kind of
> cache entry (making it sparse again should be the task of systems
> internal, and should not be done by end-user feeding a pre-shrunk
> "tree" kind of entry to record a sparsely populated state, if I
> understand correctly), so its call to verify_path(), if it ends with
> a directory separator, should say "that's not a good path".
>
> Prehaps we can rename verify_path() to verify_path_internal() that
> is _known_ to be called with names that are already vetted to be
> stored in ce_name and convert callers inside read-cache.c to call
> it, and write verify_path() as a thin wrapper of it to fail when the
> former stops by returning S_ISDIR() at the place your fix touched,
> or something?

Restoring the stricter check for the general case and providing a way
out for the code handling sparse indexes sounds like a good idea.

I don't know which callers are supposed to need that, but the following
patch passes all tests, including the new one.

René


---
 read-cache.c                       | 45 ++++++++++++++++++++----------
 t/t3905-stash-include-untracked.sh |  6 ++++
 2 files changed, 36 insertions(+), 15 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index f5d4385c40..a1da4619c4 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -849,6 +849,19 @@ struct cache_entry *make_empty_transient_cache_entry(size_t len,
 	return xcalloc(1, cache_entry_size(len));
 }

+enum verify_path_result {
+	PATH_OK,
+	PATH_INVALID,
+	PATH_DIR_WITH_SEP,
+};
+
+static enum verify_path_result verify_path_internal(const char *, unsigned);
+
+int verify_path(const char *path, unsigned mode)
+{
+	return verify_path_internal(path, mode) == PATH_OK;
+}
+
 struct cache_entry *make_cache_entry(struct index_state *istate,
 				     unsigned int mode,
 				     const struct object_id *oid,
@@ -859,7 +872,7 @@ struct cache_entry *make_cache_entry(struct index_state *istate,
 	struct cache_entry *ce, *ret;
 	int len;

-	if (!verify_path(path, mode)) {
+	if (verify_path_internal(path, mode) == PATH_INVALID) {
 		error(_("invalid path '%s'"), path);
 		return NULL;
 	}
@@ -993,60 +1006,62 @@ static int verify_dotfile(const char *rest, unsigned mode)
 	return 1;
 }

-int verify_path(const char *path, unsigned mode)
+static enum verify_path_result verify_path_internal(const char *path,
+						    unsigned mode)
 {
 	char c = 0;

 	if (has_dos_drive_prefix(path))
-		return 0;
+		return PATH_INVALID;

 	if (!is_valid_path(path))
-		return 0;
+		return PATH_INVALID;

 	goto inside;
 	for (;;) {
 		if (!c)
-			return 1;
+			return PATH_OK;
 		if (is_dir_sep(c)) {
 inside:
 			if (protect_hfs) {

 				if (is_hfs_dotgit(path))
-					return 0;
+					return PATH_INVALID;
 				if (S_ISLNK(mode)) {
 					if (is_hfs_dotgitmodules(path))
-						return 0;
+						return PATH_INVALID;
 				}
 			}
 			if (protect_ntfs) {
 #if defined GIT_WINDOWS_NATIVE || defined __CYGWIN__
 				if (c == '\\')
-					return 0;
+					return PATH_INVALID;
 #endif
 				if (is_ntfs_dotgit(path))
-					return 0;
+					return PATH_INVALID;
 				if (S_ISLNK(mode)) {
 					if (is_ntfs_dotgitmodules(path))
-						return 0;
+						return PATH_INVALID;
 				}
 			}

 			c = *path++;
 			if ((c == '.' && !verify_dotfile(path, mode)) ||
 			    is_dir_sep(c))
-				return 0;
+				return PATH_INVALID;
 			/*
 			 * allow terminating directory separators for
 			 * sparse directory entries.
 			 */
 			if (c == '\0')
-				return S_ISDIR(mode);
+				return S_ISDIR(mode) ? PATH_DIR_WITH_SEP :
+						       PATH_INVALID;
 		} else if (c == '\\' && protect_ntfs) {
 			if (is_ntfs_dotgit(path))
-				return 0;
+				return PATH_INVALID;
 			if (S_ISLNK(mode)) {
 				if (is_ntfs_dotgitmodules(path))
-					return 0;
+					return PATH_INVALID;
 			}
 		}

@@ -1349,7 +1364,7 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e

 	if (!ok_to_add)
 		return -1;
-	if (!verify_path(ce->name, ce->ce_mode))
+	if (verify_path_internal(ce->name, ce->ce_mode) == PATH_INVALID)
 		return error(_("invalid path '%s'"), ce->name);

 	if (!skip_df_check &&
diff --git a/t/t3905-stash-include-untracked.sh b/t/t3905-stash-include-untracked.sh
index dd2cdcc114..5390eec4e3 100755
--- a/t/t3905-stash-include-untracked.sh
+++ b/t/t3905-stash-include-untracked.sh
@@ -422,4 +422,10 @@ test_expect_success 'stash show --{include,only}-untracked on stashes without un
 	test_must_be_empty actual
 '

+test_expect_success 'stash -u ignores sub-repository' '
+	test_when_finished "rm -rf sub-repo" &&
+	git init sub-repo &&
+	git stash -u
+'
+
 test_done
--
2.33.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bug/regression report - 'git stash push -u' fatal errors when sub-repo present
  2021-10-01 14:25     ` René Scharfe
@ 2021-10-04 20:06       ` Derrick Stolee
  2021-10-04 20:52         ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Derrick Stolee @ 2021-10-04 20:06 UTC (permalink / raw)
  To: René Scharfe, Junio C Hamano; +Cc: Robert Leftwich, git, Derrick Stolee

On 10/1/2021 10:25 AM, René Scharfe wrote:
> Am 30.09.21 um 18:35 schrieb Junio C Hamano:
>> René Scharfe <l.s.r@web.de> writes:
>>
>>> Looping in Stolee as the author of 6e773527b6 (sparse-index: convert
>>> from full to sparse, 2021-03-30).

Thanks for looping me in. I'm still getting caught up from some
time off.

>>> Here's a raw patch for that.  Versions before 6e773527b6 pass the
>>> included test.
>>>
>>> The magic return value of 2 is a bit ugly, but allows adding the
>>> additional check only to the call-site relevant to the bug report.
>>>
>>> I don't know if other callers of verify_path() might also need that
>>> check, or if it is too narrow.
>>
>> The callers inside read-cache.c, which I think were the original
>> intended audiences of the function, always call verify_path() with
>> the pathname suitable to be stored as a cache entry.
>>
>> The callee never has to assume an extra trailing slash might exist
>> at the end.  And verify_path() said "no", if any caller passed an
>> extra trailing slash before the commit in question.
>>
>> I _think_ we defined the new "tree" type entries the sparse index
>> stuff added to be the _only_ cache entries whose path always ends
>> with a slash?  If so, perhaps we should audit the existing callers
>> and move the "paths that come from end-users to be entered to the
>> index must never end with a slash" sanity check this function gave
>> us, which was broken by 6e773527b6, perhaps?
>>
>> "git update-index" should never allow to create a "tree" kind of
>> cache entry (making it sparse again should be the task of systems
>> internal, and should not be done by end-user feeding a pre-shrunk
>> "tree" kind of entry to record a sparsely populated state, if I
>> understand correctly), so its call to verify_path(), if it ends with
>> a directory separator, should say "that's not a good path".
>>
>> Prehaps we can rename verify_path() to verify_path_internal() that
>> is _known_ to be called with names that are already vetted to be
>> stored in ce_name and convert callers inside read-cache.c to call
>> it, and write verify_path() as a thin wrapper of it to fail when the
>> former stops by returning S_ISDIR() at the place your fix touched,
>> or something?
> 
> Restoring the stricter check for the general case and providing a way
> out for the code handling sparse indexes sounds like a good idea.
> 
> I don't know which callers are supposed to need that, but the following
> patch passes all tests, including the new one.

I like this new approach.

> +enum verify_path_result {
> +	PATH_OK,
> +	PATH_INVALID,
> +	PATH_DIR_WITH_SEP,
> +};
> +
> +static enum verify_path_result verify_path_internal(const char *, unsigned);
> +
> +int verify_path(const char *path, unsigned mode)
> +{
> +	return verify_path_internal(path, mode) == PATH_OK;
> +}
> +

I was confused at first as to why the verify_path() implementation is
not near the implementation of verify_path_internal(), but I see how
you use the _internal() version in make_cache_tree(), justifying this
position.

>  struct cache_entry *make_cache_entry(struct index_state *istate,
>  				     unsigned int mode,
>  				     const struct object_id *oid,
...
>  			/*
>  			 * allow terminating directory separators for
>  			 * sparse directory entries.
>  			 */
>  			if (c == '\0')
> -				return S_ISDIR(mode);
> +				return S_ISDIR(mode) ? PATH_DIR_WITH_SEP :
> +						       PATH_INVALID;

The rewrite of this method is mostly mechanical except for this one
bit that is solving the issue at hand.

> @@ -1349,7 +1364,7 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
> 
>  	if (!ok_to_add)
>  		return -1;
> -	if (!verify_path(ce->name, ce->ce_mode))
> +	if (verify_path_internal(ce->name, ce->ce_mode) == PATH_INVALID)
>  		return error(_("invalid path '%s'"), ce->name);

And yes, I believe that make_cache_entry() and add_index_entry_with_check()
are the only places that need this internal version. If we find others,
then we can add them as necessary. The tests in t1092 are getting rather
robust, although they don't do much for this test case:
> +test_expect_success 'stash -u ignores sub-repository' '
> +	test_when_finished "rm -rf sub-repo" &&
> +	git init sub-repo &&
> +	git stash -u
> +'

Seems like a good test to have, anyway.

I look forward to seeing this as a full patch.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bug/regression report - 'git stash push -u' fatal errors when sub-repo present
  2021-10-04 20:06       ` Derrick Stolee
@ 2021-10-04 20:52         ` Junio C Hamano
  2021-10-04 22:29           ` Derrick Stolee
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2021-10-04 20:52 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: René Scharfe, Robert Leftwich, git, Derrick Stolee

Derrick Stolee <stolee@gmail.com> writes:

> And yes, I believe that make_cache_entry() and add_index_entry_with_check()
> are the only places that need this internal version. If we find others,
> then we can add them as necessary. The tests in t1092 are getting rather
> robust, although they don't do much for this test case:
>> +test_expect_success 'stash -u ignores sub-repository' '
>> +	test_when_finished "rm -rf sub-repo" &&
>> +	git init sub-repo &&
>> +	git stash -u
>> +'
>
> Seems like a good test to have, anyway.
>
> I look forward to seeing this as a full patch.

Just one thing I want to pick your brains for ;-)

I said this earlier ...

>>> "git update-index" should never allow to create a "tree" kind of
>>> cache entry (making it sparse again should be the task of systems
>>> internal, and should not be done by end-user feeding a pre-shrunk
>>> "tree" kind of entry to record a sparsely populated state, if I
>>> understand correctly), so its call to verify_path(), if it ends with
>>> a directory separator, should say "that's not a good path".

... without knowing what you had in mind when you did the "tree kind
of entry in the index".  Are we on the same page, or do we think it
might be beneficial to give end-users a long-enough rope
to hang themselves, aka get into the lower details of
implementation?

One _could_ imagine that allowing

 $ git update-index --cacheinfo 40000,609869396314577e5a,t/

given by the end user to drop all entries under t/* and replace them
with a single sparse-dir-entry might be a good way to allow
scripters the same power as the C-code to take advantage of the
sparse checkout feature.  It needs to be paired with some mechanism
to allow sparse-dir-entry observed by the end users with a plumbing,
e.g. even though ls-files unconditionally calls ensure_full_index(),

 $ git ls-files --show-sparse

may show the sparse-dir-entry by bypassing the call.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bug/regression report - 'git stash push -u' fatal errors when sub-repo present
  2021-10-04 20:52         ` Junio C Hamano
@ 2021-10-04 22:29           ` Derrick Stolee
  0 siblings, 0 replies; 11+ messages in thread
From: Derrick Stolee @ 2021-10-04 22:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, Robert Leftwich, git, Derrick Stolee

On 10/4/2021 4:52 PM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
>> And yes, I believe that make_cache_entry() and add_index_entry_with_check()
>> are the only places that need this internal version. If we find others,
>> then we can add them as necessary. The tests in t1092 are getting rather
>> robust, although they don't do much for this test case:
>>> +test_expect_success 'stash -u ignores sub-repository' '
>>> +	test_when_finished "rm -rf sub-repo" &&
>>> +	git init sub-repo &&
>>> +	git stash -u
>>> +'
>>
>> Seems like a good test to have, anyway.
>>
>> I look forward to seeing this as a full patch.
> 
> Just one thing I want to pick your brains for ;-)
> 
> I said this earlier ...
> 
>>>> "git update-index" should never allow to create a "tree" kind of
>>>> cache entry (making it sparse again should be the task of systems
>>>> internal, and should not be done by end-user feeding a pre-shrunk
>>>> "tree" kind of entry to record a sparsely populated state, if I
>>>> understand correctly), so its call to verify_path(), if it ends with
>>>> a directory separator, should say "that's not a good path".
> 
> ... without knowing what you had in mind when you did the "tree kind
> of entry in the index".  Are we on the same page, or do we think it
> might be beneficial to give end-users a long-enough rope
> to hang themselves, aka get into the lower details of
> implementation?
> 
> One _could_ imagine that allowing
> 
>  $ git update-index --cacheinfo 40000,609869396314577e5a,t/
> 
> given by the end user to drop all entries under t/* and replace them
> with a single sparse-dir-entry might be a good way to allow
> scripters the same power as the C-code to take advantage of the
> sparse checkout feature.  It needs to be paired with some mechanism
> to allow sparse-dir-entry observed by the end users with a plumbing,
> e.g. even though ls-files unconditionally calls ensure_full_index(),

I think this is an interesting capability, but I'm not sure I see
a use case that is worth the footgun, especially with the accidents
that can happen when working with submodules.

I think we can happily extend to include this functionality in the
future. Having an error condition now that we relax in the future is
the good kind of behavior change.

>  $ git ls-files --show-sparse
> 
> may show the sparse-dir-entry by bypassing the call.

I have something in the works for this, but I'm letting others send
their sparse-index work first. I have not forgotten your request for
such a feature!

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] t3905: show failure to ignore sub-repo
  2021-09-30  0:49 Bug/regression report - 'git stash push -u' fatal errors when sub-repo present Robert Leftwich
  2021-09-30 10:06 ` René Scharfe
@ 2021-10-07 20:31 ` René Scharfe
  2021-10-07 20:31 ` [PATCH 2/3] read-cache: add verify_path_internal() René Scharfe
  2021-10-07 20:31 ` [PATCH 3/3] read-cache: let verify_path() reject trailing dir separators again René Scharfe
  3 siblings, 0 replies; 11+ messages in thread
From: René Scharfe @ 2021-10-07 20:31 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Derrick Stolee, Robert Leftwich

"git stash" used to ignore sub-repositories until 6e773527b6
(sparse-index: convert from full to sparse, 2021-03-30).  Add a test
that demonstrates this regression.

Reported-by: Robert Leftwich <robert@gitpod.io>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/t3905-stash-include-untracked.sh | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/t/t3905-stash-include-untracked.sh b/t/t3905-stash-include-untracked.sh
index dd2cdcc114..19da46b7fb 100755
--- a/t/t3905-stash-include-untracked.sh
+++ b/t/t3905-stash-include-untracked.sh
@@ -422,4 +422,10 @@ test_expect_success 'stash show --{include,only}-untracked on stashes without un
 	test_must_be_empty actual
 '

+test_expect_failure 'stash -u ignores sub-repository' '
+	test_when_finished "rm -rf sub-repo" &&
+	git init sub-repo &&
+	git stash -u
+'
+
 test_done
--
2.33.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 2/3] read-cache: add verify_path_internal()
  2021-09-30  0:49 Bug/regression report - 'git stash push -u' fatal errors when sub-repo present Robert Leftwich
  2021-09-30 10:06 ` René Scharfe
  2021-10-07 20:31 ` [PATCH 1/3] t3905: show failure to ignore sub-repo René Scharfe
@ 2021-10-07 20:31 ` René Scharfe
  2021-10-07 20:31 ` [PATCH 3/3] read-cache: let verify_path() reject trailing dir separators again René Scharfe
  3 siblings, 0 replies; 11+ messages in thread
From: René Scharfe @ 2021-10-07 20:31 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Derrick Stolee, Robert Leftwich

Turn verify_path() into an internal function that distinguishes between
valid paths and those with trailing directory separators and rename it
to verify_path_internal().  Provide a wrapper with the old behavior
under the old name.  No functional change intended.  The new function
will be used in the next patch.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
 read-cache.c | 41 ++++++++++++++++++++++++++++-------------
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index b0a06db5c5..5b6fc08e46 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -849,6 +849,19 @@ struct cache_entry *make_empty_transient_cache_entry(size_t len,
 	return xcalloc(1, cache_entry_size(len));
 }

+enum verify_path_result {
+	PATH_OK,
+	PATH_INVALID,
+	PATH_DIR_WITH_SEP,
+};
+
+static enum verify_path_result verify_path_internal(const char *, unsigned);
+
+int verify_path(const char *path, unsigned mode)
+{
+	return verify_path_internal(path, mode) != PATH_INVALID;
+}
+
 struct cache_entry *make_cache_entry(struct index_state *istate,
 				     unsigned int mode,
 				     const struct object_id *oid,
@@ -993,60 +1006,62 @@ static int verify_dotfile(const char *rest, unsigned mode)
 	return 1;
 }

-int verify_path(const char *path, unsigned mode)
+static enum verify_path_result verify_path_internal(const char *path,
+						    unsigned mode)
 {
 	char c = 0;

 	if (has_dos_drive_prefix(path))
-		return 0;
+		return PATH_INVALID;

 	if (!is_valid_path(path))
-		return 0;
+		return PATH_INVALID;

 	goto inside;
 	for (;;) {
 		if (!c)
-			return 1;
+			return PATH_OK;
 		if (is_dir_sep(c)) {
 inside:
 			if (protect_hfs) {

 				if (is_hfs_dotgit(path))
-					return 0;
+					return PATH_INVALID;
 				if (S_ISLNK(mode)) {
 					if (is_hfs_dotgitmodules(path))
-						return 0;
+						return PATH_INVALID;
 				}
 			}
 			if (protect_ntfs) {
 #if defined GIT_WINDOWS_NATIVE || defined __CYGWIN__
 				if (c == '\\')
-					return 0;
+					return PATH_INVALID;
 #endif
 				if (is_ntfs_dotgit(path))
-					return 0;
+					return PATH_INVALID;
 				if (S_ISLNK(mode)) {
 					if (is_ntfs_dotgitmodules(path))
-						return 0;
+						return PATH_INVALID;
 				}
 			}

 			c = *path++;
 			if ((c == '.' && !verify_dotfile(path, mode)) ||
 			    is_dir_sep(c))
-				return 0;
+				return PATH_INVALID;
 			/*
 			 * allow terminating directory separators for
 			 * sparse directory entries.
 			 */
 			if (c == '\0')
-				return S_ISDIR(mode);
+				return S_ISDIR(mode) ? PATH_DIR_WITH_SEP :
+						       PATH_INVALID;
 		} else if (c == '\\' && protect_ntfs) {
 			if (is_ntfs_dotgit(path))
-				return 0;
+				return PATH_INVALID;
 			if (S_ISLNK(mode)) {
 				if (is_ntfs_dotgitmodules(path))
-					return 0;
+					return PATH_INVALID;
 			}
 		}

--
2.33.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 3/3] read-cache: let verify_path() reject trailing dir separators again
  2021-09-30  0:49 Bug/regression report - 'git stash push -u' fatal errors when sub-repo present Robert Leftwich
                   ` (2 preceding siblings ...)
  2021-10-07 20:31 ` [PATCH 2/3] read-cache: add verify_path_internal() René Scharfe
@ 2021-10-07 20:31 ` René Scharfe
  2021-10-08  9:04   ` Junio C Hamano
  3 siblings, 1 reply; 11+ messages in thread
From: René Scharfe @ 2021-10-07 20:31 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Derrick Stolee, Robert Leftwich

6e773527b6 (sparse-index: convert from full to sparse, 2021-03-30) made
verify_path() accept trailing directory separators for directories,
which is necessary for sparse directory entries.  This clemency causes
"git stash" to stumble over sub-repositories, though, and there may be
more unintended side-effects.

Avoid them by restoring the old verify_path() behavior and accepting
trailing directory separators only in places that are supposed to handle
sparse directory entries.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 read-cache.c                       | 6 +++---
 t/t3905-stash-include-untracked.sh | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 5b6fc08e46..f1aabc47b6 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -859,7 +859,7 @@ static enum verify_path_result verify_path_internal(const char *, unsigned);

 int verify_path(const char *path, unsigned mode)
 {
-	return verify_path_internal(path, mode) != PATH_INVALID;
+	return verify_path_internal(path, mode) == PATH_OK;
 }

 struct cache_entry *make_cache_entry(struct index_state *istate,
@@ -872,7 +872,7 @@ struct cache_entry *make_cache_entry(struct index_state *istate,
 	struct cache_entry *ce, *ret;
 	int len;

-	if (!verify_path(path, mode)) {
+	if (verify_path_internal(path, mode) == PATH_INVALID) {
 		error(_("invalid path '%s'"), path);
 		return NULL;
 	}
@@ -1364,7 +1364,7 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e

 	if (!ok_to_add)
 		return -1;
-	if (!verify_path(ce->name, ce->ce_mode))
+	if (verify_path_internal(ce->name, ce->ce_mode) == PATH_INVALID)
 		return error(_("invalid path '%s'"), ce->name);

 	if (!skip_df_check &&
diff --git a/t/t3905-stash-include-untracked.sh b/t/t3905-stash-include-untracked.sh
index 19da46b7fb..5390eec4e3 100755
--- a/t/t3905-stash-include-untracked.sh
+++ b/t/t3905-stash-include-untracked.sh
@@ -422,7 +422,7 @@ test_expect_success 'stash show --{include,only}-untracked on stashes without un
 	test_must_be_empty actual
 '

-test_expect_failure 'stash -u ignores sub-repository' '
+test_expect_success 'stash -u ignores sub-repository' '
 	test_when_finished "rm -rf sub-repo" &&
 	git init sub-repo &&
 	git stash -u
--
2.33.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] read-cache: let verify_path() reject trailing dir separators again
  2021-10-07 20:31 ` [PATCH 3/3] read-cache: let verify_path() reject trailing dir separators again René Scharfe
@ 2021-10-08  9:04   ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2021-10-08  9:04 UTC (permalink / raw)
  To: René Scharfe; +Cc: git, Derrick Stolee, Robert Leftwich

René Scharfe <l.s.r@web.de> writes:

> 6e773527b6 (sparse-index: convert from full to sparse, 2021-03-30) made
> verify_path() accept trailing directory separators for directories,
> which is necessary for sparse directory entries.  This clemency causes
> "git stash" to stumble over sub-repositories, though, and there may be
> more unintended side-effects.
>
> Avoid them by restoring the old verify_path() behavior and accepting
> trailing directory separators only in places that are supposed to handle
> sparse directory entries.
>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  read-cache.c                       | 6 +++---
>  t/t3905-stash-include-untracked.sh | 2 +-
>  2 files changed, 4 insertions(+), 4 deletions(-)

Makes sense.  Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-10-08  9:04 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-30  0:49 Bug/regression report - 'git stash push -u' fatal errors when sub-repo present Robert Leftwich
2021-09-30 10:06 ` René Scharfe
2021-09-30 16:35   ` Junio C Hamano
2021-10-01 14:25     ` René Scharfe
2021-10-04 20:06       ` Derrick Stolee
2021-10-04 20:52         ` Junio C Hamano
2021-10-04 22:29           ` Derrick Stolee
2021-10-07 20:31 ` [PATCH 1/3] t3905: show failure to ignore sub-repo René Scharfe
2021-10-07 20:31 ` [PATCH 2/3] read-cache: add verify_path_internal() René Scharfe
2021-10-07 20:31 ` [PATCH 3/3] read-cache: let verify_path() reject trailing dir separators again René Scharfe
2021-10-08  9:04   ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).