All of lore.kernel.org
 help / color / mirror / Atom feed
* Strange diff-index with fsmonitor, submodules
@ 2023-08-29  0:56 Jonathan Tan
  2023-08-29  1:20 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Jonathan Tan @ 2023-08-29  0:56 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

There is a strange interaction where diff-index not only produces
different results when run with and without fsmonitor, but produces
different results for 2 entries that as far as I can tell, should behave
identically (sibling files in the same directory - file_11 and file_12,
and both of these filenames are only mentioned once each in the entire
test).

You can see this with this patch:

diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 0c241d6c14..e9e5e32016 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -809,6 +809,11 @@ my_match_and_clean () {
                status --porcelain=v2 >actual.without &&
        test_cmp actual.with actual.without &&
 
+       git -C super --no-optional-locks diff-index --name-status HEAD >actual.with &&
+       git -C super --no-optional-locks -c core.fsmonitor=false \
+               diff-index --name-status HEAD >actual.without &&
+       test_cmp actual.with actual.without &&
+
        git -C super/dir_1/dir_2/sub reset --hard &&
        git -C super/dir_1/dir_2/sub clean -d -f
 }
@@ -837,6 +842,7 @@ test_expect_success 'submodule always visited' '
        # some dirt in the submodule and confirm matching output.
 
        # Completely clean status.
+       echo Now running for clean status &&
        my_match_and_clean &&

and this is the output:

++ echo Now running for clean status
Now running for clean status
++ my_match_and_clean
++ git -C super --no-optional-locks status --porcelain=v2
++ git -C super --no-optional-locks -c core.fsmonitor=false status --porcelain=v2
++ test_cmp actual.with actual.without
++ test 2 -ne 2
++ eval 'diff -u' '"$@"'
+++ diff -u actual.with actual.without
++ git -C super --no-optional-locks diff-index --name-status HEAD
++ git -C super --no-optional-locks -c core.fsmonitor=false diff-index --name-status HEAD
++ test_cmp actual.with actual.without
++ test 2 -ne 2
++ eval 'diff -u' '"$@"'
+++ diff -u actual.with actual.without
--- actual.with	2023-08-29 00:39:26
+++ actual.without	2023-08-29 00:39:26
@@ -1 +0,0 @@
-D	dir_1/file_11
error: last command exited with $?=1
not ok 61 - submodule always visited

Notice that with fsmonitor, diff-index reports a "D" line that is not
present when fsmonitor is off. To add to that, it only reports "D" for
file_11 when I would expect that if it reported file_11, it would report
file_12 as well.

I'll continue investigating this myself, but does anyone know what is
going on?

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: Strange diff-index with fsmonitor, submodules
  2023-08-29  0:56 Strange diff-index with fsmonitor, submodules Jonathan Tan
@ 2023-08-29  1:20 ` Junio C Hamano
  2023-08-29 12:45   ` Jeff Hostetler
  2023-08-29 16:57 ` Jeff Hostetler
  2023-09-06  6:02 ` [PATCH] [diff-lib] Fix check_removed when fsmonitor is on Josip Sokcevic
  2 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2023-08-29  1:20 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Jeff Hostetler, Eric DeCosta

Jonathan Tan <jonathantanmy@google.com> writes:

> There is a strange interaction where diff-index not only produces
> different results when run with and without fsmonitor, but produces
> different results for 2 entries that as far as I can tell, should behave
> identically (sibling files in the same directory - file_11 and file_12,
> and both of these filenames are only mentioned once each in the entire
> test).

Picking those who had non-clean-up changes to fsmonitor out of the
output from

    $ git shortlog -n --since=2.year --no-merges ':(glob)**/*fsmonitor*'

and adding them to CC: for their attention.

> You can see this with this patch:
>
> diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
> index 0c241d6c14..e9e5e32016 100755
> --- a/t/t7527-builtin-fsmonitor.sh
> +++ b/t/t7527-builtin-fsmonitor.sh
> @@ -809,6 +809,11 @@ my_match_and_clean () {
>                 status --porcelain=v2 >actual.without &&
>         test_cmp actual.with actual.without &&
>  
> +       git -C super --no-optional-locks diff-index --name-status HEAD >actual.with &&
> +       git -C super --no-optional-locks -c core.fsmonitor=false \
> +               diff-index --name-status HEAD >actual.without &&
> +       test_cmp actual.with actual.without &&
> +
>         git -C super/dir_1/dir_2/sub reset --hard &&
>         git -C super/dir_1/dir_2/sub clean -d -f
>  }
> @@ -837,6 +842,7 @@ test_expect_success 'submodule always visited' '
>         # some dirt in the submodule and confirm matching output.
>  
>         # Completely clean status.
> +       echo Now running for clean status &&
>         my_match_and_clean &&
>
> and this is the output:
>
> ++ echo Now running for clean status
> Now running for clean status
> ++ my_match_and_clean
> ++ git -C super --no-optional-locks status --porcelain=v2
> ++ git -C super --no-optional-locks -c core.fsmonitor=false status --porcelain=v2
> ++ test_cmp actual.with actual.without
> ++ test 2 -ne 2
> ++ eval 'diff -u' '"$@"'
> +++ diff -u actual.with actual.without
> ++ git -C super --no-optional-locks diff-index --name-status HEAD
> ++ git -C super --no-optional-locks -c core.fsmonitor=false diff-index --name-status HEAD
> ++ test_cmp actual.with actual.without
> ++ test 2 -ne 2
> ++ eval 'diff -u' '"$@"'
> +++ diff -u actual.with actual.without
> --- actual.with	2023-08-29 00:39:26
> +++ actual.without	2023-08-29 00:39:26
> @@ -1 +0,0 @@
> -D	dir_1/file_11
> error: last command exited with $?=1
> not ok 61 - submodule always visited
>
> Notice that with fsmonitor, diff-index reports a "D" line that is not
> present when fsmonitor is off. To add to that, it only reports "D" for
> file_11 when I would expect that if it reported file_11, it would report
> file_12 as well.
>
> I'll continue investigating this myself, but does anyone know what is
> going on?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Strange diff-index with fsmonitor, submodules
  2023-08-29  1:20 ` Junio C Hamano
@ 2023-08-29 12:45   ` Jeff Hostetler
  0 siblings, 0 replies; 18+ messages in thread
From: Jeff Hostetler @ 2023-08-29 12:45 UTC (permalink / raw)
  To: Junio C Hamano, Jonathan Tan; +Cc: git, Jeff Hostetler, Eric DeCosta



On 8/28/23 9:20 PM, Junio C Hamano wrote:
> Jonathan Tan <jonathantanmy@google.com> writes:
> 
>> There is a strange interaction where diff-index not only produces
>> different results when run with and without fsmonitor, but produces
>> different results for 2 entries that as far as I can tell, should behave
>> identically (sibling files in the same directory - file_11 and file_12,
>> and both of these filenames are only mentioned once each in the entire
>> test).
...
>> I'll continue investigating this myself, but does anyone know what is
>> going on?

Just a quick ACK. I'll take a look.

Thanks
Jeff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Strange diff-index with fsmonitor, submodules
  2023-08-29  0:56 Strange diff-index with fsmonitor, submodules Jonathan Tan
  2023-08-29  1:20 ` Junio C Hamano
@ 2023-08-29 16:57 ` Jeff Hostetler
  2023-08-29 17:01   ` Jonathan Tan
  2023-09-06  6:02 ` [PATCH] [diff-lib] Fix check_removed when fsmonitor is on Josip Sokcevic
  2 siblings, 1 reply; 18+ messages in thread
From: Jeff Hostetler @ 2023-08-29 16:57 UTC (permalink / raw)
  To: Jonathan Tan, git



On 8/28/23 8:56 PM, Jonathan Tan wrote:
> There is a strange interaction where diff-index not only produces
> different results when run with and without fsmonitor, but produces
> different results for 2 entries that as far as I can tell, should behave
> identically (sibling files in the same directory - file_11 and file_12,
> and both of these filenames are only mentioned once each in the entire
> test).
> 
> You can see this with this patch:

Jonathan, what platform are you seeing this on?

Jeff


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Strange diff-index with fsmonitor, submodules
  2023-08-29 16:57 ` Jeff Hostetler
@ 2023-08-29 17:01   ` Jonathan Tan
  0 siblings, 0 replies; 18+ messages in thread
From: Jonathan Tan @ 2023-08-29 17:01 UTC (permalink / raw)
  To: Jeff Hostetler; +Cc: Jonathan Tan, git

Jeff Hostetler <git@jeffhostetler.com> writes:
> 
> 
> On 8/28/23 8:56 PM, Jonathan Tan wrote:
> > There is a strange interaction where diff-index not only produces
> > different results when run with and without fsmonitor, but produces
> > different results for 2 entries that as far as I can tell, should behave
> > identically (sibling files in the same directory - file_11 and file_12,
> > and both of these filenames are only mentioned once each in the entire
> > test).
> > 
> > You can see this with this patch:
> 
> Jonathan, what platform are you seeing this on?
> 
> Jeff

Ah, I should have mentioned this. This is on Mac OS X.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] [diff-lib] Fix check_removed when fsmonitor is on
  2023-08-29  0:56 Strange diff-index with fsmonitor, submodules Jonathan Tan
  2023-08-29  1:20 ` Junio C Hamano
  2023-08-29 16:57 ` Jeff Hostetler
@ 2023-09-06  6:02 ` Josip Sokcevic
  2023-09-06 20:37   ` Jonathan Tan
  2023-09-11 17:09   ` [PATCH v3] " Josip Sokcevic
  2 siblings, 2 replies; 18+ messages in thread
From: Josip Sokcevic @ 2023-09-06  6:02 UTC (permalink / raw)
  To: jonathantanmy; +Cc: git, git, Josip Sokcevic

git-diff-index may return incorrect deleted entries when fsmonitor is used in a
repository with git submodules. This can be observed on Mac machines, but it
can affect all other supported platforms too.

If fsmonitor is used, `stat *st` may not be initialized. Since `lstat` calls
aren't not desired when fsmonitor is on, skip the entire gitlink check using
the same condition used to initialize `stat *st`.
---
This patch is using jonathantanmy@ test case, which now passes. It's
possible there are similar edge cases with fsmonitor, so I'll do more
extensive e2e testing tomorrow.
 diff-lib.c                   | 2 +-
 t/t7527-builtin-fsmonitor.sh | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/diff-lib.c b/diff-lib.c
index d8aa777a73..c67aa5ff89 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -46,7 +46,7 @@ static int check_removed(const struct index_state *istate, const struct cache_en
 	}
 	if (has_symlink_leading_path(ce->name, ce_namelen(ce)))
 		return 1;
-	if (S_ISDIR(st->st_mode)) {
+	if (!(ce->ce_flags & CE_FSMONITOR_VALID) && S_ISDIR(st->st_mode)) {
 		struct object_id sub;
 
 		/*
diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 0c241d6c14..894a80fbe8 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -824,6 +824,10 @@ test_expect_success 'submodule always visited' '
 
 	create_super super &&
 	create_sub sub &&
+	git -C super --no-optional-locks diff-index --name-status HEAD >actual.with &&
+	git -C super --no-optional-locks -c core.fsmonitor=false \
+	    diff-index --name-status HEAD >actual.without &&
+	    test_cmp actual.with actual.without &&
 
 	git -C super submodule add ../sub ./dir_1/dir_2/sub &&
 	git -C super commit -m "add sub" &&
@@ -837,6 +841,8 @@ test_expect_success 'submodule always visited' '
 	# some dirt in the submodule and confirm matching output.
 
 	# Completely clean status.
+	echo Now running for clean status &&
+
 	my_match_and_clean &&
 
 	# .M S..U
-- 
2.42.0.283.g2d96d420d3-goog



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] [diff-lib] Fix check_removed when fsmonitor is on
  2023-09-06  6:02 ` [PATCH] [diff-lib] Fix check_removed when fsmonitor is on Josip Sokcevic
@ 2023-09-06 20:37   ` Jonathan Tan
  2023-09-07 17:01     ` [PATCH v2] diff-lib: " Josip Sokcevic
  2023-09-11 17:09   ` [PATCH v3] " Josip Sokcevic
  1 sibling, 1 reply; 18+ messages in thread
From: Jonathan Tan @ 2023-09-06 20:37 UTC (permalink / raw)
  To: Josip Sokcevic; +Cc: Jonathan Tan, git, git

Josip Sokcevic <sokcevic@google.com> writes:
> git-diff-index may return incorrect deleted entries when fsmonitor is used in a
> repository with git submodules. This can be observed on Mac machines, but it
> can affect all other supported platforms too.
> 
> If fsmonitor is used, `stat *st` may not be initialized. Since `lstat` calls
> aren't not desired when fsmonitor is on, skip the entire gitlink check using
> the same condition used to initialize `stat *st`.

Ah, that's a great catch. Thanks for narrowing it down to
check_removed(). 

I see from "git blame" that this was introduced in 4f3d6d0261
(fsmonitor: skip lstat deletion check during git diff-index, 2021-03-
17). In that commit, subsequent code in check_removed() already used
"st" when it was possibly uninitialized, so perhaps no one noticed it.
I don't think anyone discussed it either on the mailing list thread that
introduced this patch [1].

[1] https://lore.kernel.org/git/pull.903.git.1615760258.gitgitgadget@gmail.com/

Looking at this function in more detail, though, I see that the solution
in this patch set is incomplete - both before and after this patch,
there are code paths in which "st" is never initialized, but "st->mode"
is used (not in check_removed(), but in its callers).

I think a more complete solution needs to look something like this:

        diff --git a/diff-lib.c b/diff-lib.c
        index d8aa777a73..5637237003 100644
        --- a/diff-lib.c
        +++ b/diff-lib.c
        @@ -39,10 +39,20 @@
         static int check_removed(const struct index_state *istate, const struct cache_entry *ce, struct stat *st)
         {
                assert(is_fsmonitor_refreshed(istate));
        -       if (!(ce->ce_flags & CE_FSMONITOR_VALID) && lstat(ce->name, st) < 0) {
        -               if (!is_missing_file_error(errno))
        -                       return -1;
        -               return 1;
        +       if (ce->ce_flags & CE_FSMONITOR_VALID) {
        +               /*
        +                * Both check_removed() and its callers expect lstat() to have
        +                * happened and, in particular, the st_mode field to be set.
        +                * Simulate this with the contents of ce.
        +                */
        +               memset(st, 0, sizeof(*st));
        +               st->st_mode = ce->ce_mode;
        +       } else {
        +               if (lstat(ce->name, st) < 0) {
        +                       if (!is_missing_file_error(errno))
        +                               return -1;
        +                       return 1;
        +               }
                }
                if (has_symlink_leading_path(ce->name, ce_namelen(ce)))
                        return 1;

Also, regarding the commit message, the title should be formatted like:

  diff-lib: simulate stat when using fsmonitor

or something like that (file, then colon, then message).

You'll also need a sign-off. See the [[sign-off]] section in
Documentation/SubmittingPatches for more information.

> diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
> index 0c241d6c14..894a80fbe8 100755
> --- a/t/t7527-builtin-fsmonitor.sh
> +++ b/t/t7527-builtin-fsmonitor.sh
> @@ -824,6 +824,10 @@ test_expect_success 'submodule always visited' '
>  
>  	create_super super &&
>  	create_sub sub &&
> +	git -C super --no-optional-locks diff-index --name-status HEAD >actual.with &&
> +	git -C super --no-optional-locks -c core.fsmonitor=false \
> +	    diff-index --name-status HEAD >actual.without &&
> +	    test_cmp actual.with actual.without &&
>  
>  	git -C super submodule add ../sub ./dir_1/dir_2/sub &&
>  	git -C super commit -m "add sub" &&

Any reason why these additions are here instead of where I put them (in
my_match_and_clean())? I think it makes more sense where I put them,
as then they would automatically apply to all tests.

> @@ -837,6 +841,8 @@ test_expect_success 'submodule always visited' '
>  	# some dirt in the submodule and confirm matching output.
>  
>  	# Completely clean status.
> +	echo Now running for clean status &&
> +
>  	my_match_and_clean &&
>  
>  	# .M S..U

You can remove this - I included this to show exactly where in the code
the test fails.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2] diff-lib: Fix check_removed when fsmonitor is on
  2023-09-06 20:37   ` Jonathan Tan
@ 2023-09-07 17:01     ` Josip Sokcevic
  2023-09-07 17:22       ` Jonathan Tan
  2023-09-07 18:07       ` Junio C Hamano
  0 siblings, 2 replies; 18+ messages in thread
From: Josip Sokcevic @ 2023-09-07 17:01 UTC (permalink / raw)
  To: jonathantanmy; +Cc: git, git, Josip Sokcevic

git-diff-index may return incorrect deleted entries when fsmonitor is used in a
repository with git submodules. This can be observed on Mac machines, but it
can affect all other supported platforms too.

If fsmonitor is used, `stat *st` may not be initialized. Since `lstat` calls
aren't not desired when fsmonitor is on, skip the entire gitlink check using
the same condition used to initialize `stat *st`.

Signed-off-by: Josip Sokcevic <sokcevic@google.com>
---
 diff-lib.c                   | 19 +++++++++++++++----
 t/t7527-builtin-fsmonitor.sh |  5 +++++
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index d8aa777a73..664613bb1b 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -39,11 +39,22 @@
 static int check_removed(const struct index_state *istate, const struct cache_entry *ce, struct stat *st)
 {
 	assert(is_fsmonitor_refreshed(istate));
-	if (!(ce->ce_flags & CE_FSMONITOR_VALID) && lstat(ce->name, st) < 0) {
-		if (!is_missing_file_error(errno))
-			return -1;
-		return 1;
+	if (ce->ce_flags & CE_FSMONITOR_VALID) {
+		/*
+		 * Both check_removed() and its callers expect lstat() to have
+		 * happened and, in particular, the st_mode field to be set.
+		 * Simulate this with the contents of ce.
+		 */
+		memset(st, 0, sizeof(*st));
+		st->st_mode = ce->ce_mode;
+	} else {
+		if (lstat(ce->name, st) < 0) {
+			if (!is_missing_file_error(errno))
+				return -1;
+			return 1;
+		}
 	}
+
 	if (has_symlink_leading_path(ce->name, ce_namelen(ce)))
 		return 1;
 	if (S_ISDIR(st->st_mode)) {
diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 0c241d6c14..78503158fd 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -809,6 +809,11 @@ my_match_and_clean () {
 		status --porcelain=v2 >actual.without &&
 	test_cmp actual.with actual.without &&
 
+	git -C super --no-optional-locks diff-index --name-status HEAD >actual.with &&
+	git -C super --no-optional-locks -c core.fsmonitor=false \
+		diff-index --name-status HEAD >actual.without &&
+	test_cmp actual.with actual.without &&
+
 	git -C super/dir_1/dir_2/sub reset --hard &&
 	git -C super/dir_1/dir_2/sub clean -d -f
 }
-- 
2.42.0.283.g2d96d420d3-goog



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] diff-lib: Fix check_removed when fsmonitor is on
  2023-09-07 17:01     ` [PATCH v2] diff-lib: " Josip Sokcevic
@ 2023-09-07 17:22       ` Jonathan Tan
  2023-09-07 18:07       ` Junio C Hamano
  1 sibling, 0 replies; 18+ messages in thread
From: Jonathan Tan @ 2023-09-07 17:22 UTC (permalink / raw)
  To: Josip Sokcevic; +Cc: Jonathan Tan, git, git

Josip Sokcevic <sokcevic@google.com> writes:
> git-diff-index may return incorrect deleted entries when fsmonitor is used in a
> repository with git submodules. This can be observed on Mac machines, but it
> can affect all other supported platforms too.
> 
> If fsmonitor is used, `stat *st` may not be initialized. Since `lstat` calls
> aren't not desired when fsmonitor is on, skip the entire gitlink check using
> the same condition used to initialize `stat *st`.

I think this paragraph is outdated - you'll need to update it to match
the code in this version.

> diff --git a/diff-lib.c b/diff-lib.c
> index d8aa777a73..664613bb1b 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -39,11 +39,22 @@
>  static int check_removed(const struct index_state *istate, const struct cache_entry *ce, struct stat *st)
>  {
>  	assert(is_fsmonitor_refreshed(istate));
> -	if (!(ce->ce_flags & CE_FSMONITOR_VALID) && lstat(ce->name, st) < 0) {
> -		if (!is_missing_file_error(errno))
> -			return -1;
> -		return 1;
> +	if (ce->ce_flags & CE_FSMONITOR_VALID) {
> +		/*
> +		 * Both check_removed() and its callers expect lstat() to have
> +		 * happened and, in particular, the st_mode field to be set.
> +		 * Simulate this with the contents of ce.
> +		 */
> +		memset(st, 0, sizeof(*st));
> +		st->st_mode = ce->ce_mode;
> +	} else {
> +		if (lstat(ce->name, st) < 0) {
> +			if (!is_missing_file_error(errno))
> +				return -1;
> +			return 1;
> +		}
>  	}
> +
>  	if (has_symlink_leading_path(ce->name, ce_namelen(ce)))
>  		return 1;
>  	if (S_ISDIR(st->st_mode)) {

I'm on the fence about whether the extra newline is necessary - the "if"
did get bigger, but the code is clear enough without the newline. I lean
towards removing it, to avoid cluttering the diff.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] diff-lib: Fix check_removed when fsmonitor is on
  2023-09-07 17:01     ` [PATCH v2] diff-lib: " Josip Sokcevic
  2023-09-07 17:22       ` Jonathan Tan
@ 2023-09-07 18:07       ` Junio C Hamano
  2023-09-07 23:08         ` Josip Sokcevic
  1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2023-09-07 18:07 UTC (permalink / raw)
  To: Josip Sokcevic; +Cc: jonathantanmy, git, git

Josip Sokcevic <sokcevic@google.com> writes:

> diff --git a/diff-lib.c b/diff-lib.c
> index d8aa777a73..664613bb1b 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -39,11 +39,22 @@
>  static int check_removed(const struct index_state *istate, const struct cache_entry *ce, struct stat *st)
>  {
>  	assert(is_fsmonitor_refreshed(istate));

Not a problem this patch introduces, but doesn't this call path

  diff_cache()
  -> unpack_trees()
     -> oneway_diff()
        -> do_oneway_diff()
           -> show_new_file(), show_modified()
               -> get_stat_data()
                  -> check_removed()

violate the assertion?  If so, perhaps we should rewrite it into a
more explicit "if (...) BUG(...)" that is not compiled away.

> -	if (!(ce->ce_flags & CE_FSMONITOR_VALID) && lstat(ce->name, st) < 0) {
> -		if (!is_missing_file_error(errno))
> -			return -1;
> -		return 1;
> +	if (ce->ce_flags & CE_FSMONITOR_VALID) {
> +		/*
> +		 * Both check_removed() and its callers expect lstat() to have
> +		 * happened and, in particular, the st_mode field to be set.
> +		 * Simulate this with the contents of ce.
> +		 */
> +		memset(st, 0, sizeof(*st));

It is true that the original, when CE_FSMONITOR_VALID bit is set,
bypasses lstat() altogether and leaves the contents of st completely
uninitialized, but this is still way too insufficient, isn't it?

There are three call sites of the check_removed() function.

 * The first one in run_diff_files() only cares about st.st_mode and
   other members of the structure are not looked at.  This makes
   readers wonder if the "st" parameter to check_removed() should
   become "mode_t *st_mode" to clarify this point, but the primary
   thing I want to say is that this caller will not mind if we leave
   other members of st bogus (like 0-bit filled) as long as the mode
   is set correctly.

 * The second one in run_diff_files() passes the resulting &st to
   match_stat_with_submodule(), which in turn passes it to
   ie_match_stat(), which cares about "struct stat" members that are
   used for quick change detection, like owner, group, mtime.
   Giving it a bogus st will most likely cause it to report a
   change.

 * The third one is in get_stat_data().  This also uses the &st to
   call match_stat_with_submodule(), so it is still totally broken
   to give it a bogus st, the same way as the second caller above.

> +		st->st_mode = ce->ce_mode;

Does this work correctly when the cache entry points at a gitlink,
which uses 0160000 that is not a valid st_mode?  I think you'd want
to use a reverse function of create_ce_mode().

> +	} else {
> +		if (lstat(ce->name, st) < 0) {
> +			if (!is_missing_file_error(errno))
> +				return -1;
> +			return 1;
> +		}
>  	}

At this point, if FSMONITOR_VALID bit is not set, we will always
perform lstat() and get all the members of st populated properly,
which is a definite improvement.

While I think this does not make it worse (it is an existing bug
that the code is broken for a ce with the CE_FSMONITOR_VALID bit
set), we may want to leave a note that we _know_ the code after this
patch is still broken.  "Simulate this with ..." -> "Just setting
st_mode is still insufficient and will break majority of callers".

It may make sense, until we clean it up, to disable the check for
the FSMONITOR_VALID bit in this codepath and always perform lstat().
Optimization matters, but computing quickly in order to return an
incorrect result is optimizing for a wrong thing.  I dunno.

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] diff-lib: Fix check_removed when fsmonitor is on
  2023-09-07 18:07       ` Junio C Hamano
@ 2023-09-07 23:08         ` Josip Sokcevic
  2023-09-08 15:00           ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Josip Sokcevic @ 2023-09-07 23:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: jonathantanmy, git, git

On Thu, Sep 7, 2023 at 11:07 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Josip Sokcevic <sokcevic@google.com> writes:
>
> > diff --git a/diff-lib.c b/diff-lib.c
> > index d8aa777a73..664613bb1b 100644
> > --- a/diff-lib.c
> > +++ b/diff-lib.c
> > @@ -39,11 +39,22 @@
> >  static int check_removed(const struct index_state *istate, const struct cache_entry *ce, struct stat *st)
> >  {
> >       assert(is_fsmonitor_refreshed(istate));
>
> Not a problem this patch introduces, but doesn't this call path
>
>   diff_cache()
>   -> unpack_trees()
>      -> oneway_diff()
>         -> do_oneway_diff()
>            -> show_new_file(), show_modified()
>                -> get_stat_data()
>                   -> check_removed()
>
> violate the assertion?  If so, perhaps we should rewrite it into a
> more explicit "if (...) BUG(...)" that is not compiled away.

True, I will update it.

>
> > -     if (!(ce->ce_flags & CE_FSMONITOR_VALID) && lstat(ce->name, st) < 0) {
> > -             if (!is_missing_file_error(errno))
> > -                     return -1;
> > -             return 1;
> > +     if (ce->ce_flags & CE_FSMONITOR_VALID) {
> > +             /*
> > +              * Both check_removed() and its callers expect lstat() to have
> > +              * happened and, in particular, the st_mode field to be set.
> > +              * Simulate this with the contents of ce.
> > +              */
> > +             memset(st, 0, sizeof(*st));
>
> It is true that the original, when CE_FSMONITOR_VALID bit is set,
> bypasses lstat() altogether and leaves the contents of st completely
> uninitialized, but this is still way too insufficient, isn't it?
>
> There are three call sites of the check_removed() function.
>
>  * The first one in run_diff_files() only cares about st.st_mode and
>    other members of the structure are not looked at.  This makes
>    readers wonder if the "st" parameter to check_removed() should
>    become "mode_t *st_mode" to clarify this point, but the primary
>    thing I want to say is that this caller will not mind if we leave
>    other members of st bogus (like 0-bit filled) as long as the mode
>    is set correctly.
>
>  * The second one in run_diff_files() passes the resulting &st to
>    match_stat_with_submodule(), which in turn passes it to
>    ie_match_stat(), which cares about "struct stat" members that are
>    used for quick change detection, like owner, group, mtime.
>    Giving it a bogus st will most likely cause it to report a
>    change.
>
>  * The third one is in get_stat_data().  This also uses the &st to
>    call match_stat_with_submodule(), so it is still totally broken
>    to give it a bogus st, the same way as the second caller above.
>
> > +             st->st_mode = ce->ce_mode;
>
> Does this work correctly when the cache entry points at a gitlink,
> which uses 0160000 that is not a valid st_mode?  I think you'd want
> to use a reverse function of create_ce_mode().

I realized this too but after I sent the patch. I don't think we have
a good way to reverse it, so the best we can do is to guess it. But I
don't think we should do that. Instead, should we just zero the struct
and add a TODO? Alternative could be to use a double pointer for stat
and do more checks in call sites, but I'm not familiar with the
codebase to how that branching would need to look like.

Any preference?

-- 
Josip Sokcevic

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] diff-lib: Fix check_removed when fsmonitor is on
  2023-09-07 23:08         ` Josip Sokcevic
@ 2023-09-08 15:00           ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2023-09-08 15:00 UTC (permalink / raw)
  To: Josip Sokcevic; +Cc: jonathantanmy, git, git

Josip Sokcevic <sokcevic@google.com> writes:

>> > @@ -39,11 +39,22 @@
>> >  static int check_removed(const struct index_state *istate, const struct cache_entry *ce, struct stat *st)
>> >  {
>> >       assert(is_fsmonitor_refreshed(istate));
>>
>> Not a problem this patch introduces, but doesn't this call path
>>
>>   diff_cache()
>>   -> unpack_trees()
>>      -> oneway_diff()
>>         -> do_oneway_diff()
>>            -> show_new_file(), show_modified()
>>                -> get_stat_data()
>>                   -> check_removed()
>>
>> violate the assertion?  If so, perhaps we should rewrite it into a
>> more explicit "if (...) BUG(...)" that is not compiled away.
>
> True, I will update it.

Not as a part of this series, though.

I think this came from the same series that were merged at 858119f6
(Merge branch 'nk/diff-index-fsmonitor', 2021-03-24), so if we just
simply do a moral "revert" of the series, then it should disappear.

More importantly, you'd want to ask help from folks who are more
familiar with fsmonitor, which would take some time, and fixing this
assertion can be backburnered while you are fixing the original
issue that got you started here.  I see JeffH is already Cc:ed,
which is good.

>> > -     if (!(ce->ce_flags & CE_FSMONITOR_VALID) && lstat(ce->name, st) < 0) {
>> > -             if (!is_missing_file_error(errno))
>> > -                     return -1;
>> > -             return 1;
>> > +     if (ce->ce_flags & CE_FSMONITOR_VALID) {
>> > +             /*
>> > +              * Both check_removed() and its callers expect lstat() to have
>> > +              * happened and, in particular, the st_mode field to be set.
>> > +              * Simulate this with the contents of ce.
>> > +              */
>> > +             memset(st, 0, sizeof(*st));
>>
>> It is true that the original, when CE_FSMONITOR_VALID bit is set,
>> bypasses lstat() altogether and leaves the contents of st completely
>> uninitialized, but this is still way too insufficient, isn't it?
>>
>> There are three call sites of the check_removed() function.
>>
>>  * The first one in run_diff_files() only cares about st.st_mode and
>>    other members of the structure are not looked at.  This makes
>>    readers wonder if the "st" parameter to check_removed() should
>>    become "mode_t *st_mode" to clarify this point, but the primary
>>    thing I want to say is that this caller will not mind if we leave
>>    other members of st bogus (like 0-bit filled) as long as the mode
>>    is set correctly.
>>
>>  * The second one in run_diff_files() passes the resulting &st to
>>    match_stat_with_submodule(), which in turn passes it to
>>    ie_match_stat(), which cares about "struct stat" members that are
>>    used for quick change detection, like owner, group, mtime.
>>    Giving it a bogus st will most likely cause it to report a
>>    change.
>>
>>  * The third one is in get_stat_data().  This also uses the &st to
>>    call match_stat_with_submodule(), so it is still totally broken
>>    to give it a bogus st, the same way as the second caller above.
>>
>> > +             st->st_mode = ce->ce_mode;
>>
>> Does this work correctly when the cache entry points at a gitlink,
>> which uses 0160000 that is not a valid st_mode?  I think you'd want
>> to use a reverse function of create_ce_mode().
>
> I realized this too but after I sent the patch. I don't think we have
> a good way to reverse it, so the best we can do is to guess it. But I
> don't think we should do that. Instead, should we just zero the struct
> and add a TODO? Alternative could be to use a double pointer for stat
> and do more checks in call sites, but I'm not familiar with the
> codebase to how that branching would need to look like.
>
> Any preference?

My preference in the short term actually is to teach this codepath
to ignore CE_FSMONITOR_VALID bit and always run lstat() instead.

That will unbreak this "optimization" that allowed the code to do a
wrong thing quickly.

The solution suitable for a bit longer term becomes rather obvious
when we step back a bit and think about the purpose of this block
inside "if ce_flags says CE_FSMONITOR_VALID is true, do this".  The
fsmonitor has already checked and knows that the path has not been
updated, i.e.  what we have in the in-core index still matches what
we would get from running lstat().  So now what we need to do is to
pretend as if we called lstat(), recreating the data in "struct
stat" from "struct cache_entry" well enough to fool ie_match_stat()
into saying that the cached stat data in ce still matches what is in
st.

Can we do that?  Of course we can.  When we added or refreshed the
cache entry, we should have had "struct stat" taken out of lstat()
for the path, and copied some data from the members of "struct stat"
into "struct cache_entry".  After all, that is how we can later ask
ie_match_stat() if the cached stat data in the "struct cache_entry"
matches the current "struct stat" data to tell if the path hasn't
been touched since we last refreshed.

We want to have a helper function that we can ask:

    With the data in this 'struct cache_entry', please populate this
    'struct stat'.  Do so well enough that ie_match_stat() says "ok,
    the path is unchanged".

You can write such a function by studying two code paths.

 * ie_match_stat() and its helpers to find out which members of
   "struct stat" matters (i.e. compared against corresponding
   members in "struct cache_entry").

 * add_to_index() and its helpers, fill_stat_cache_info() in
   particular, to see which members of "struct stat" are used to
   populate ce->ce_stat_data and how (e.g. some members lose
   precision and that is find---the reverse operation will also be
   lossy and the resulting "struct stat" you will forge will *not*
   exactly match what you would obtain from real lstat(), but it is
   good enough to fool ie_match_stat() because you'll be comparing
   against an existing ce that was populated in a lossy way).

Then, replace the above partial assignment to st with a call to such
a helper function, which should live in statinfo.c next to
fill_stat_data() and whose function signature should look like

    void fill_stat_from_stat_data(struct stat *st, struct stat_data *sd);

You'd probably need to define two macros to update nsec part of c/mtime
members that is conditional near where ST_CTIME_NSEC() macro is
defined to write that reverse function of fill_stat_data(), perhaps
along the lines of ...

	#ifdef NO_NSEC
	#define FILL_ST_CTIME_NSEC(st, nsec) do { ; /* noop */ } while (0)
	#else
	#ifdef USE_ST_TIMESPEC
	#define FILL_ST_CTIME_NSEC(st, nsec) (st)->st_ctimespec.tv_nsec = (nsec)
	#else
	#define FILL_ST_CTIME_NSEC(st, nsec) (st)->st_ctim.tv_nsec = (nsec)
	#endif
	#endif



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3] diff-lib: Fix check_removed when fsmonitor is on
  2023-09-06  6:02 ` [PATCH] [diff-lib] Fix check_removed when fsmonitor is on Josip Sokcevic
  2023-09-06 20:37   ` Jonathan Tan
@ 2023-09-11 17:09   ` Josip Sokcevic
  2023-09-11 22:53     ` Junio C Hamano
  1 sibling, 1 reply; 18+ messages in thread
From: Josip Sokcevic @ 2023-09-11 17:09 UTC (permalink / raw)
  To: gitster, jonathantanmy; +Cc: git, git, Josip Sokcevic

git-diff-index may return incorrect deleted entries when fsmonitor is used in a
repository with git submodules. This can be observed on Mac machines, but it
can affect all other supported platforms too.

If fsmonitor is used, `stat *st` is not initialized if cache_entry has
CE_FSMONITOR_VALID set. But, there are three call sites that rely on stat
afterwards, which can result in incorrect results.

This change partially reverts commit 4f3d6d0.

Signed-off-by: Josip Sokcevic <sokcevic@google.com>
---
 diff-lib.c                   | 12 ++++++------
 t/t7527-builtin-fsmonitor.sh |  5 +++++
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index d8aa777a73..5848e4f9ca 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -36,14 +36,14 @@
  * exists for ce that is a submodule -- it is a submodule that is not
  * checked out).  Return negative for an error.
  */
-static int check_removed(const struct index_state *istate, const struct cache_entry *ce, struct stat *st)
+static int check_removed(const struct cache_entry *ce, struct stat *st)
 {
-	assert(is_fsmonitor_refreshed(istate));
-	if (!(ce->ce_flags & CE_FSMONITOR_VALID) && lstat(ce->name, st) < 0) {
+	if (lstat(ce->name, st) < 0) {
 		if (!is_missing_file_error(errno))
 			return -1;
 		return 1;
 	}
+
 	if (has_symlink_leading_path(ce->name, ce_namelen(ce)))
 		return 1;
 	if (S_ISDIR(st->st_mode)) {
@@ -149,7 +149,7 @@ void run_diff_files(struct rev_info *revs, unsigned int option)
 			memset(&(dpath->parent[0]), 0,
 			       sizeof(struct combine_diff_parent)*5);
 
-			changed = check_removed(istate, ce, &st);
+			changed = check_removed(ce, &st);
 			if (!changed)
 				wt_mode = ce_mode_from_stat(ce, st.st_mode);
 			else {
@@ -229,7 +229,7 @@ void run_diff_files(struct rev_info *revs, unsigned int option)
 		} else {
 			struct stat st;
 
-			changed = check_removed(istate, ce, &st);
+			changed = check_removed(ce, &st);
 			if (changed) {
 				if (changed < 0) {
 					perror(ce->name);
@@ -303,7 +303,7 @@ static int get_stat_data(const struct index_state *istate,
 	if (!cached && !ce_uptodate(ce)) {
 		int changed;
 		struct stat st;
-		changed = check_removed(istate, ce, &st);
+		changed = check_removed(ce, &st);
 		if (changed < 0)
 			return -1;
 		else if (changed) {
diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 0c241d6c14..78503158fd 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -809,6 +809,11 @@ my_match_and_clean () {
 		status --porcelain=v2 >actual.without &&
 	test_cmp actual.with actual.without &&
 
+	git -C super --no-optional-locks diff-index --name-status HEAD >actual.with &&
+	git -C super --no-optional-locks -c core.fsmonitor=false \
+		diff-index --name-status HEAD >actual.without &&
+	test_cmp actual.with actual.without &&
+
 	git -C super/dir_1/dir_2/sub reset --hard &&
 	git -C super/dir_1/dir_2/sub clean -d -f
 }
-- 
2.42.0.283.g2d96d420d3-goog



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v3] diff-lib: Fix check_removed when fsmonitor is on
  2023-09-11 17:09   ` [PATCH v3] " Josip Sokcevic
@ 2023-09-11 22:53     ` Junio C Hamano
  2023-09-12  3:03       ` Josip Sokcevic
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2023-09-11 22:53 UTC (permalink / raw)
  To: Josip Sokcevic; +Cc: jonathantanmy, git, git

Josip Sokcevic <sokcevic@google.com> writes:

> git-diff-index may return incorrect deleted entries when fsmonitor is used in a
> repository with git submodules. This can be observed on Mac machines, but it
> can affect all other supported platforms too.
>
> If fsmonitor is used, `stat *st` is not initialized if cache_entry has
> CE_FSMONITOR_VALID set. But, there are three call sites that rely on stat
> afterwards, which can result in incorrect results.
>
> This change partially reverts commit 4f3d6d0.
>
> Signed-off-by: Josip Sokcevic <sokcevic@google.com>
> ---
>  diff-lib.c                   | 12 ++++++------
>  t/t7527-builtin-fsmonitor.sh |  5 +++++
>  2 files changed, 11 insertions(+), 6 deletions(-)

This certainly is more "complete" if simpler than the previous one
;-)

In the longer term, we would probably want to enable optimization
using what fsmonitor knows, but as we have seen in the review on
the previous round, this code needs a bit more work than the
original we are reverting here to get it right, and in the shorter
term, hopefully this would do.

Thanks.

> diff --git a/diff-lib.c b/diff-lib.c
> index d8aa777a73..5848e4f9ca 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -36,14 +36,14 @@
>   * exists for ce that is a submodule -- it is a submodule that is not
>   * checked out).  Return negative for an error.
>   */
> -static int check_removed(const struct index_state *istate, const struct cache_entry *ce, struct stat *st)
> +static int check_removed(const struct cache_entry *ce, struct stat *st)
>  {
> -	assert(is_fsmonitor_refreshed(istate));
> -	if (!(ce->ce_flags & CE_FSMONITOR_VALID) && lstat(ce->name, st) < 0) {
> +	if (lstat(ce->name, st) < 0) {
>  		if (!is_missing_file_error(errno))
>  			return -1;
>  		return 1;
>  	}
> +
>  	if (has_symlink_leading_path(ce->name, ce_namelen(ce)))
>  		return 1;
>  	if (S_ISDIR(st->st_mode)) {
> @@ -149,7 +149,7 @@ void run_diff_files(struct rev_info *revs, unsigned int option)
>  			memset(&(dpath->parent[0]), 0,
>  			       sizeof(struct combine_diff_parent)*5);
>  
> -			changed = check_removed(istate, ce, &st);
> +			changed = check_removed(ce, &st);
>  			if (!changed)
>  				wt_mode = ce_mode_from_stat(ce, st.st_mode);
>  			else {
> @@ -229,7 +229,7 @@ void run_diff_files(struct rev_info *revs, unsigned int option)
>  		} else {
>  			struct stat st;
>  
> -			changed = check_removed(istate, ce, &st);
> +			changed = check_removed(ce, &st);
>  			if (changed) {
>  				if (changed < 0) {
>  					perror(ce->name);
> @@ -303,7 +303,7 @@ static int get_stat_data(const struct index_state *istate,
>  	if (!cached && !ce_uptodate(ce)) {
>  		int changed;
>  		struct stat st;
> -		changed = check_removed(istate, ce, &st);
> +		changed = check_removed(ce, &st);
>  		if (changed < 0)
>  			return -1;
>  		else if (changed) {
> diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
> index 0c241d6c14..78503158fd 100755
> --- a/t/t7527-builtin-fsmonitor.sh
> +++ b/t/t7527-builtin-fsmonitor.sh
> @@ -809,6 +809,11 @@ my_match_and_clean () {
>  		status --porcelain=v2 >actual.without &&
>  	test_cmp actual.with actual.without &&
>  
> +	git -C super --no-optional-locks diff-index --name-status HEAD >actual.with &&
> +	git -C super --no-optional-locks -c core.fsmonitor=false \
> +		diff-index --name-status HEAD >actual.without &&
> +	test_cmp actual.with actual.without &&
> +
>  	git -C super/dir_1/dir_2/sub reset --hard &&
>  	git -C super/dir_1/dir_2/sub clean -d -f
>  }

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3] diff-lib: Fix check_removed when fsmonitor is on
  2023-09-11 22:53     ` Junio C Hamano
@ 2023-09-12  3:03       ` Josip Sokcevic
  2023-09-12 17:07         ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Josip Sokcevic @ 2023-09-12  3:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: jonathantanmy, git, git

On Mon, Sep 11, 2023 at 3:53 PM Junio C Hamano <gitster@pobox.com> wrote:
> This certainly is more "complete" if simpler than the previous one
> ;-)
>
> In the longer term, we would probably want to enable optimization
> using what fsmonitor knows, but as we have seen in the review on
> the previous round, this code needs a bit more work than the
> original we are reverting here to get it right, and in the shorter
> term, hopefully this would do.

Yes, I agree we should optimize this in a follow up. One thing I'm not
sure about is if we should try to construct `struct stat` using
`cache_index`, or we should check for `CE_FSMONITOR_VALID` in a way
that `stat` would no longer be needed for those code paths.


-- 
Josip Sokcevic

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3] diff-lib: Fix check_removed when fsmonitor is on
  2023-09-12  3:03       ` Josip Sokcevic
@ 2023-09-12 17:07         ` Junio C Hamano
  2023-09-14 22:39           ` Josip Sokcevic
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2023-09-12 17:07 UTC (permalink / raw)
  To: Josip Sokcevic; +Cc: jonathantanmy, git, git

Josip Sokcevic <sokcevic@google.com> writes:

> Yes, I agree we should optimize this in a follow up. One thing I'm not
> sure about is if we should try to construct `struct stat` using
> `cache_index`, or we should check for `CE_FSMONITOR_VALID` in a way
> that `stat` would no longer be needed for those code paths.

Good point.  

It seems to be entirely doable, even though the stench from the
abstraction layer violation may be horrible.

ie_match_stat(), which is called by match_stat_with_submodule(),
does pay attention to CE_FSMONITOR_VALID bit, so none of the members
of struct stat matters when the bit is set.  But the bit is not set,
all members that fill_stat_data() and ce_match_stat_basic() care
about do matter.  Other code that follows callers of check_removed()
do care about at least .st_mode member, and I suspect that in the
current code .st_mode is the only member that gets looked at.

So after all, I think your original "fix" was correct, but it took
us some time to figure out why it was, which means we would want to
explain it in the log message for developers who would want to touch
the same area in the future.

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3] diff-lib: Fix check_removed when fsmonitor is on
  2023-09-12 17:07         ` Junio C Hamano
@ 2023-09-14 22:39           ` Josip Sokcevic
  2023-09-18 16:35             ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Josip Sokcevic @ 2023-09-14 22:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: jonathantanmy, git, git

On Tue, Sep 12, 2023 at 10:07 AM Junio C Hamano <gitster@pobox.com> wrote:
> It seems to be entirely doable, even though the stench from the
> abstraction layer violation may be horrible.
>
> ie_match_stat(), which is called by match_stat_with_submodule(),
> does pay attention to CE_FSMONITOR_VALID bit, so none of the members
> of struct stat matters when the bit is set.  But the bit is not set,
> all members that fill_stat_data() and ce_match_stat_basic() care
> about do matter.  Other code that follows callers of check_removed()
> do care about at least .st_mode member, and I suspect that in the
> current code .st_mode is the only member that gets looked at.
>
> So after all, I think your original "fix" was correct, but it took
> us some time to figure out why it was, which means we would want to
> explain it in the log message for developers who would want to touch
> the same area in the future.

I finished testing this - after my original fix and after running all
tests, I can confirm that `st_mode` of `struct stat` is indeed not
consumed if CE_FSMONITOR_VALID is set. But, it's fragile and likely to
cause problems in the future. Your approach of constructing `struct
stat` based on `struct cache_entry` is the way to go.

I see you created a new set of patches in a separate thread, so I'll
start those tests and report back there.

Thanks!

-- 
Josip Sokcevic

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3] diff-lib: Fix check_removed when fsmonitor is on
  2023-09-14 22:39           ` Josip Sokcevic
@ 2023-09-18 16:35             ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2023-09-18 16:35 UTC (permalink / raw)
  To: Josip Sokcevic; +Cc: jonathantanmy, git, git

Josip Sokcevic <sokcevic@google.com> writes:

> I see you created a new set of patches in a separate thread, so I'll
> start those tests and report back there.

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-09-18 16:40 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-29  0:56 Strange diff-index with fsmonitor, submodules Jonathan Tan
2023-08-29  1:20 ` Junio C Hamano
2023-08-29 12:45   ` Jeff Hostetler
2023-08-29 16:57 ` Jeff Hostetler
2023-08-29 17:01   ` Jonathan Tan
2023-09-06  6:02 ` [PATCH] [diff-lib] Fix check_removed when fsmonitor is on Josip Sokcevic
2023-09-06 20:37   ` Jonathan Tan
2023-09-07 17:01     ` [PATCH v2] diff-lib: " Josip Sokcevic
2023-09-07 17:22       ` Jonathan Tan
2023-09-07 18:07       ` Junio C Hamano
2023-09-07 23:08         ` Josip Sokcevic
2023-09-08 15:00           ` Junio C Hamano
2023-09-11 17:09   ` [PATCH v3] " Josip Sokcevic
2023-09-11 22:53     ` Junio C Hamano
2023-09-12  3:03       ` Josip Sokcevic
2023-09-12 17:07         ` Junio C Hamano
2023-09-14 22:39           ` Josip Sokcevic
2023-09-18 16:35             ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.