git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] update-index: refresh should rewrite index in case of racy timestamps
@ 2021-12-22 13:56 Marc Strapetz via GitGitGadget
  2021-12-22 23:52 ` Junio C Hamano
  2022-01-05 13:15 ` [PATCH v2 0/2] " Marc Strapetz via GitGitGadget
  0 siblings, 2 replies; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2021-12-22 13:56 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz

From: Marc Strapetz <marc.strapetz@syntevo.com>

update-index --refresh and --really-refresh should force writing of the
index file if racy timestamps have been encountered, as status already
does [1].

Note that calling update-index still does not guarantee that there will
be no more racy timestamps afterwards (the same holds true for status):

- calling update-index immediately after touching and adding a file may
  still leave racy timestamps if all three operations occur within the
  racy-tolerance (usually 1 second unless USE_NSEC has been defined)

- calling update-index for timestamps which are set into the future
  will leave them racy

To guarantee that such racy timestamps will be resolved would require to
wait until the system clock has passed beyond these timestamps and only
then write the index file. Especially for future timestamps, this does
not seem feasible because of possibly long delays/hangs.

Both --refresh and --really-refresh may in theory be used in
combination with --unresolve and --again which may reset the
"active_cache_changed" flag. There is no difference of whether we
write the index due to racy timestamps or due to other
reasons, like if --really-refresh has detected CE_ENTRY_CHANGED in
refresh_cache(). Hence, we will set the "active_cache_changed" flag
immediately after calling refresh_cache().

[1] https://lore.kernel.org/git/d3dd805c-7c1d-30a9-6574-a7bfcb7fc013@syntevo.com/

Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
---
    update-index: refresh should rewrite index in case of racy timestamps
    
    This patch makes update-index --refresh write the index if it contains
    racy timestamps, as discussed at [1].
    
    [1]
    https://lore.kernel.org/git/d3dd805c-7c1d-30a9-6574-a7bfcb7fc013@syntevo.com/

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1105%2Fmstrap%2Ffeature%2Fupdate-index-refresh-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1105/mstrap/feature/update-index-refresh-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1105

 builtin/update-index.c               |  6 +++
 cache.h                              |  1 +
 read-cache.c                         |  2 +-
 t/t2108-update-index-refresh-racy.sh | 58 ++++++++++++++++++++++++++++
 4 files changed, 66 insertions(+), 1 deletion(-)
 create mode 100755 t/t2108-update-index-refresh-racy.sh

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 187203e8bb5..0a069281e23 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -787,6 +787,12 @@ static int refresh(struct refresh_params *o, unsigned int flag)
 	setup_work_tree();
 	read_cache();
 	*o->has_errors |= refresh_cache(o->flags | flag);
+	if (has_racy_timestamp(&the_index)) {
+		/* For racy timestamps we should set active_cache_changed immediately:
+		 * other callbacks may follow for which some of them may reset
+		 * active_cache_changed. */
+		active_cache_changed |= SOMETHING_CHANGED;
+	}
 	return 0;
 }
 
diff --git a/cache.h b/cache.h
index cfba463aa97..dd1932e2d0e 100644
--- a/cache.h
+++ b/cache.h
@@ -891,6 +891,7 @@ void *read_blob_data_from_index(struct index_state *, const char *, unsigned lon
 #define CE_MATCH_IGNORE_FSMONITOR 0X20
 int is_racy_timestamp(const struct index_state *istate,
 		      const struct cache_entry *ce);
+int has_racy_timestamp(struct index_state *istate);
 int ie_match_stat(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
 int ie_modified(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
 
diff --git a/read-cache.c b/read-cache.c
index cbe73f14e5e..ed297635a33 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2775,7 +2775,7 @@ static int repo_verify_index(struct repository *repo)
 	return verify_index_from(repo->index, repo->index_file);
 }
 
-static int has_racy_timestamp(struct index_state *istate)
+int has_racy_timestamp(struct index_state *istate)
 {
 	int entries = istate->cache_nr;
 	int i;
diff --git a/t/t2108-update-index-refresh-racy.sh b/t/t2108-update-index-refresh-racy.sh
new file mode 100755
index 00000000000..ece1151847c
--- /dev/null
+++ b/t/t2108-update-index-refresh-racy.sh
@@ -0,0 +1,58 @@
+#!/bin/sh
+
+test_description='update-index refresh tests related to racy timestamps'
+
+. ./test-lib.sh
+
+reset_mtime() {
+	test-tool chmtime =$(test-tool chmtime --get .git/fs-tstamp) $1
+}
+
+update_assert_unchanged() {
+	local ts1=$(test-tool chmtime --get .git/index) &&
+	git update-index $1 &&
+	local ts2=$(test-tool chmtime --get .git/index) &&
+	[ $ts1 -eq $ts2 ]
+}
+
+update_assert_changed() {
+	local ts1=$(test-tool chmtime --get .git/index) &&
+	test_might_fail git update-index $1 &&
+	local ts2=$(test-tool chmtime --get .git/index) &&
+	[ $ts1 -ne $ts2 ]
+}
+
+test_expect_success 'setup' '
+	touch .git/fs-tstamp &&
+	test-tool chmtime -1 .git/fs-tstamp &&
+	echo content >file &&
+	reset_mtime file &&
+
+	git add file &&
+	git commit -m "initial import"
+'
+
+test_expect_success '--refresh has no racy timestamps to fix' '
+	reset_mtime .git/index &&
+	test-tool chmtime +1 .git/index &&
+	update_assert_unchanged --refresh
+'
+
+test_expect_success '--refresh should fix racy timestamp' '
+	reset_mtime .git/index &&
+	update_assert_changed --refresh
+'
+
+test_expect_success '--really-refresh should fix racy timestamp' '
+	reset_mtime .git/index &&
+	update_assert_changed --really-refresh
+'
+
+test_expect_success '--refresh should fix racy timestamp even if needs update' '
+	echo content2 >file &&
+	reset_mtime file &&
+	reset_mtime .git/index &&
+	update_assert_changed --refresh
+'
+
+test_done

base-commit: 597af311a2899bfd6640b9b107622c5795d5f998
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] update-index: refresh should rewrite index in case of racy timestamps
  2021-12-22 13:56 [PATCH] update-index: refresh should rewrite index in case of racy timestamps Marc Strapetz via GitGitGadget
@ 2021-12-22 23:52 ` Junio C Hamano
  2021-12-23 18:24   ` Marc Strapetz
  2022-01-05 13:15 ` [PATCH v2 0/2] " Marc Strapetz via GitGitGadget
  1 sibling, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2021-12-22 23:52 UTC (permalink / raw)
  To: Marc Strapetz via GitGitGadget; +Cc: git, Marc Strapetz

"Marc Strapetz via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  builtin/update-index.c               |  6 +++
>  cache.h                              |  1 +
>  read-cache.c                         |  2 +-
>  t/t2108-update-index-refresh-racy.sh | 58 ++++++++++++++++++++++++++++
>  4 files changed, 66 insertions(+), 1 deletion(-)
>  create mode 100755 t/t2108-update-index-refresh-racy.sh
>
> diff --git a/builtin/update-index.c b/builtin/update-index.c
> index 187203e8bb5..0a069281e23 100644
> --- a/builtin/update-index.c
> +++ b/builtin/update-index.c
> @@ -787,6 +787,12 @@ static int refresh(struct refresh_params *o, unsigned int flag)
>  	setup_work_tree();
>  	read_cache();
>  	*o->has_errors |= refresh_cache(o->flags | flag);
> +	if (has_racy_timestamp(&the_index)) {
> +		/* For racy timestamps we should set active_cache_changed immediately:
> +		 * other callbacks may follow for which some of them may reset
> +		 * active_cache_changed. */
> +		active_cache_changed |= SOMETHING_CHANGED;
> +	}

Documentation/CodingGuidelines says:

 - Multi-line comments include their delimiters on separate lines from
   the text.  E.g.

	/*
	 * A very long
	 * multi-line comment.
	 */

The last half-sentence puzzles me, partly because of the word
"callback", which is an implementation detail of how --refresh and
other actions are triggered by the update-index command.  Calling
them "operation" or "action" might be easier to understand.  I dunno.

But more problematic is the word "reset", which at least to me
implies that the SOMETHING_CHANGED bit may be cleared by them, which
sounds just wrong and broken.

    ... goes and looks ...

Ah, there are cases where we do clear active_cache_changed when we
notice that an operation detected an error, to avoid spreading the
breakage by writing the index file out, and I think that is the
right thing to do.  Which means that the above patch is not quite
right.  Perhaps taking all of the above together, something like
this?

	*o->has_errors |= refresh_cache(o->flags | flag);
	if (*o->has_errors)
		active_cache_changed = 0; 
	else if (has_racy_timestamps(&the_index))
        	/*
		 * Even if nothing else has changed, updating the file
		 * increases the chance that racy timestamps become
		 * non-racy, helping future run-time performance.
		 */
		active_cache_changed |= SOMETHING_CHANGED;


> diff --git a/t/t2108-update-index-refresh-racy.sh b/t/t2108-update-index-refresh-racy.sh
> new file mode 100755
> index 00000000000..ece1151847c
> --- /dev/null
> +++ b/t/t2108-update-index-refresh-racy.sh
> @@ -0,0 +1,58 @@
> +#!/bin/sh
> +
> +test_description='update-index refresh tests related to racy timestamps'
> +
> +. ./test-lib.sh
> +
> +reset_mtime() {

Documentation/CodingGuidelines

 - We prefer a space between the function name and the parentheses,
   and no space inside the parentheses. The opening "{" should also
   be on the same line.

	(incorrect)
	my_function(){
		...

	(correct)
	my_function () {
		...

> +	test-tool chmtime =$(test-tool chmtime --get .git/fs-tstamp) $1

Even if we know all the existing callers pass a single word argument
to this function, it would be a good discipline to put double-quotes
around "$1" to assure the readers that we are future-proofed.

> +}
> +
> +update_assert_unchanged() {
> +	local ts1=$(test-tool chmtime --get .git/index) &&
> +	git update-index $1 &&
> +	local ts2=$(test-tool chmtime --get .git/index) &&
> +	[ $ts1 -eq $ts2 ]

Documentation/CodingGuidelines

 - We prefer "test" over "[ ... ]".

> +}
> +
> +update_assert_changed() {
> +	local ts1=$(test-tool chmtime --get .git/index) &&
> +	test_might_fail git update-index $1 &&
> +	local ts2=$(test-tool chmtime --get .git/index) &&
> +	[ $ts1 -ne $ts2 ]
> +}
> +
> +test_expect_success 'setup' '
> +	touch .git/fs-tstamp &&

Not that it is wrong, but do we need to create such a throw-away
file inside the .git directory?

When we care only the presence of a path, and not that the path has
the current timestamp, we prefer not to use "touch".

	>.git/fs-tstamp

I am debating myself which is more appropriate in this case.  A
mistaken implementation of "touch" could call gettimeofday() and use
the result to call utimes(), leaving wallclock timestamp in the
result, but redirecation to create or truncate the path is a more
guaranteed way to make sure the timestamp comes from the filesystem,
so it may be more suitable for our needs here.

> +	test-tool chmtime -1 .git/fs-tstamp &&
> +	echo content >file &&
> +	reset_mtime file &&
> +
> +	git add file &&
> +	git commit -m "initial import"
> +'
> +
> +test_expect_success '--refresh has no racy timestamps to fix' '
> +	reset_mtime .git/index &&
> +	test-tool chmtime +1 .git/index &&
> +	update_assert_unchanged --refresh
> +'
> +
> +test_expect_success '--refresh should fix racy timestamp' '
> +	reset_mtime .git/index &&
> +	update_assert_changed --refresh
> +'
> +
> +test_expect_success '--really-refresh should fix racy timestamp' '
> +	reset_mtime .git/index &&
> +	update_assert_changed --really-refresh
> +'
> +
> +test_expect_success '--refresh should fix racy timestamp even if needs update' '
> +	echo content2 >file &&
> +	reset_mtime file &&
> +	reset_mtime .git/index &&
> +	update_assert_changed --refresh
> +'
> +
> +test_done
>
> base-commit: 597af311a2899bfd6640b9b107622c5795d5f998

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] update-index: refresh should rewrite index in case of racy timestamps
  2021-12-22 23:52 ` Junio C Hamano
@ 2021-12-23 18:24   ` Marc Strapetz
  0 siblings, 0 replies; 20+ messages in thread
From: Marc Strapetz @ 2021-12-23 18:24 UTC (permalink / raw)
  To: Junio C Hamano, Marc Strapetz via GitGitGadget; +Cc: git

On 23/12/2021 00:52, Junio C Hamano wrote:
> Ah, there are cases where we do clear active_cache_changed when we
> notice that an operation detected an error, to avoid spreading the
> breakage by writing the index file out, and I think that is the
> right thing to do.  Which means that the above patch is not quite
> right.  Perhaps taking all of the above together, something like
> this?
> 
> 	*o->has_errors |= refresh_cache(o->flags | flag);
> 	if (*o->has_errors)
> 		active_cache_changed = 0;
> 	else if (has_racy_timestamps(&the_index))
>          	/*
> 		 * Even if nothing else has changed, updating the file
> 		 * increases the chance that racy timestamps become
> 		 * non-racy, helping future run-time performance.
> 		 */
> 		active_cache_changed |= SOMETHING_CHANGED;

I think it's safe to write the index even if refresh_cache() reports an 
"error" and we should actually do that:

The underlying refresh_index() will report an "error" only for "file: 
needs merge" and "file: needs update". In both cases, the corresponding 
entries will not have been updated. Every entry which has been updated 
is good on its own and writing these updates makes the index a little 
bit better. Subsequent calls to refresh_index() won't have to do the 
same work again (like invoking the quite expensive LFS filter).

This is also how cmd_status() currently works: it does not pay attention 
to the return value of refresh_index() and will always write the index 
if racy timestamps are encountered.

Overall, the "error" handling in update-index.c might not always do what 
one expects. Let's consider your suggested fix. When invoking:

update-index --refresh

this won't fix racy timestamps, however:

update-index --refresh --add untracked

will do. I think this is caused by active_cache_changed being used in 
two different ways: to indicate that the cache should be written and to 
indicate that it must not be written. It might be a good idea to take 
the latter "block index write" to a separate static variable in 
update-index.c.

Candidate usages of this new "block index write" variable will be in the 
existing callbacks: errors detected in unresolve_callback() should 
probably continue to block an index write, to ensure that either all or 
none of the specified files will be unresolved. For the 
reupdate_callback(), the underlying do_reupdate() seems to return 0 
always, so there is dead code in the callback (or am I completely 
blind?). To stay on the safe side, we may still continue to block an 
index write here. The refresh_callback() will never block an index write.

Does it make sense to clarify error handling in some preceding commit 
and only then address the razy timestamps? It will probably make this 
second commit clearer.

>> +}
>> +
>> +update_assert_changed() {
>> +	local ts1=$(test-tool chmtime --get .git/index) &&
>> +	test_might_fail git update-index $1 &&
>> +	local ts2=$(test-tool chmtime --get .git/index) &&
>> +	[ $ts1 -ne $ts2 ]
>> +}
>> +
>> +test_expect_success 'setup' '
>> +	touch .git/fs-tstamp &&
> 
> Not that it is wrong, but do we need to create such a throw-away
> file inside the .git directory?

We actually only need a timestamp for which we know that it is before 
the timestamp the next file system operation would create. I agree that 
it should be easy to rewrite that using "test-tool chmtime". This should 
also simplify reset_mtime().

Regarding all other comments, thanks, I'll address them as suggested for 
the next patch. And sorry for not checking CodingGuidelines before (I 
had completely missed this document).

-Marc

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 0/2] update-index: refresh should rewrite index in case of racy timestamps
  2021-12-22 13:56 [PATCH] update-index: refresh should rewrite index in case of racy timestamps Marc Strapetz via GitGitGadget
  2021-12-22 23:52 ` Junio C Hamano
@ 2022-01-05 13:15 ` Marc Strapetz via GitGitGadget
  2022-01-05 13:15   ` [PATCH v2 1/2] t7508: add tests capturing racy timestamp handling Marc Strapetz via GitGitGadget
                     ` (2 more replies)
  1 sibling, 3 replies; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2022-01-05 13:15 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz

This patch makes update-index --refresh write the index if it contains racy
timestamps, as discussed at [1].

Changes since v1:

 * main commit message now uses 'git update-index' and the paragraph was
   dropped
 * t/t7508-status.sh: two tests added which capture status racy handling
 * builtin/update-index.c: comment improved
 * t/t2108-update-index-refresh-racy.sh: major overhaul
   * one test case added
   * mtime-manipulations simplified and aligned to t7508
   * code style fixes, as discussed

[1]
https://lore.kernel.org/git/d3dd805c-7c1d-30a9-6574-a7bfcb7fc013@syntevo.com/

Marc Strapetz (2):
  t7508: add tests capturing racy timestamp handling
  update-index: refresh should rewrite index in case of racy timestamps

 builtin/update-index.c               | 11 +++++
 cache.h                              |  1 +
 read-cache.c                         |  2 +-
 t/t2108-update-index-refresh-racy.sh | 64 ++++++++++++++++++++++++++++
 t/t7508-status.sh                    | 28 ++++++++++++
 5 files changed, 105 insertions(+), 1 deletion(-)
 create mode 100755 t/t2108-update-index-refresh-racy.sh


base-commit: dcc0cd074f0c639a0df20461a301af6d45bd582e
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1105%2Fmstrap%2Ffeature%2Fupdate-index-refresh-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1105/mstrap/feature/update-index-refresh-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1105

Range-diff vs v1:

 -:  ----------- > 1:  7d58f806111 t7508: add tests capturing racy timestamp handling
 1:  8f9618a44c5 ! 2:  dfeabf6af15 update-index: refresh should rewrite index in case of racy timestamps
     @@ Metadata
       ## Commit message ##
          update-index: refresh should rewrite index in case of racy timestamps
      
     -    update-index --refresh and --really-refresh should force writing of the
     -    index file if racy timestamps have been encountered, as status already
     -    does [1].
     +    'git update-index --refresh' and '--really-refresh' should force writing
     +    of the index file if racy timestamps have been encountered, as
     +    'git status' already does [1].
      
     -    Note that calling update-index still does not guarantee that there will
     -    be no more racy timestamps afterwards (the same holds true for status):
     +    Note that calling 'git update-index --refresh' still does not guarantee
     +    that there will be no more racy timestamps afterwards (the same holds
     +    true for 'git status'):
      
     -    - calling update-index immediately after touching and adding a file may
     -      still leave racy timestamps if all three operations occur within the
     -      racy-tolerance (usually 1 second unless USE_NSEC has been defined)
     +    - calling 'git update-index --refresh' immediately after touching and
     +      adding a file may still leave racy timestamps if all three operations
     +      occur within the racy-tolerance (usually 1 second unless USE_NSEC has
     +      been defined)
      
     -    - calling update-index for timestamps which are set into the future
     -      will leave them racy
     +    - calling 'git update-index --refresh' for timestamps which are set into
     +      the future will leave them racy
      
          To guarantee that such racy timestamps will be resolved would require to
          wait until the system clock has passed beyond these timestamps and only
          then write the index file. Especially for future timestamps, this does
          not seem feasible because of possibly long delays/hangs.
      
     -    Both --refresh and --really-refresh may in theory be used in
     -    combination with --unresolve and --again which may reset the
     -    "active_cache_changed" flag. There is no difference of whether we
     -    write the index due to racy timestamps or due to other
     -    reasons, like if --really-refresh has detected CE_ENTRY_CHANGED in
     -    refresh_cache(). Hence, we will set the "active_cache_changed" flag
     -    immediately after calling refresh_cache().
     -
          [1] https://lore.kernel.org/git/d3dd805c-7c1d-30a9-6574-a7bfcb7fc013@syntevo.com/
      
          Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
     @@ builtin/update-index.c: static int refresh(struct refresh_params *o, unsigned in
       	read_cache();
       	*o->has_errors |= refresh_cache(o->flags | flag);
      +	if (has_racy_timestamp(&the_index)) {
     -+		/* For racy timestamps we should set active_cache_changed immediately:
     -+		 * other callbacks may follow for which some of them may reset
     -+		 * active_cache_changed. */
     ++		/*
     ++		 * Even if nothing else has changed, updating the file
     ++		 * increases the chance that racy timestamps become
     ++		 * non-racy, helping future run-time performance.
     ++		 * We do that even in case of "errors" returned by
     ++		 * refresh_cache() as these are no actual errors.
     ++		 * cmd_status() does the same.
     ++		 */
      +		active_cache_changed |= SOMETHING_CHANGED;
      +	}
       	return 0;
     @@ t/t2108-update-index-refresh-racy.sh (new)
      +
      +test_description='update-index refresh tests related to racy timestamps'
      +
     ++TEST_PASSES_SANITIZE_LEAK=true
      +. ./test-lib.sh
      +
     -+reset_mtime() {
     -+	test-tool chmtime =$(test-tool chmtime --get .git/fs-tstamp) $1
     -+}
     -+
     -+update_assert_unchanged() {
     -+	local ts1=$(test-tool chmtime --get .git/index) &&
     -+	git update-index $1 &&
     -+	local ts2=$(test-tool chmtime --get .git/index) &&
     -+	[ $ts1 -eq $ts2 ]
     ++reset_files () {
     ++	echo content >file &&
     ++	echo content >other &&
     ++	test-tool chmtime =1234567890 file &&
     ++	test-tool chmtime =1234567890 other
      +}
      +
     -+update_assert_changed() {
     -+	local ts1=$(test-tool chmtime --get .git/index) &&
     -+	test_might_fail git update-index $1 &&
     -+	local ts2=$(test-tool chmtime --get .git/index) &&
     -+	[ $ts1 -ne $ts2 ]
     ++update_assert_changed () {
     ++	test-tool chmtime =1234567890 .git/index &&
     ++	test_might_fail git update-index "$1" &&
     ++	test-tool chmtime --get .git/index >.git/out &&
     ++	! grep ^1234567890 .git/out
      +}
      +
      +test_expect_success 'setup' '
     -+	touch .git/fs-tstamp &&
     -+	test-tool chmtime -1 .git/fs-tstamp &&
     -+	echo content >file &&
     -+	reset_mtime file &&
     -+
     -+	git add file &&
     ++	reset_files &&
     ++	# we are calling reset_files() a couple of times during tests;
     ++	# test-tool chmtime does not change the ctime; to not weaken
     ++	# or even break our tests, disable ctime-checks entirely
     ++	git config core.trustctime false &&
     ++	git add file other &&
      +	git commit -m "initial import"
      +'
      +
      +test_expect_success '--refresh has no racy timestamps to fix' '
     -+	reset_mtime .git/index &&
     -+	test-tool chmtime +1 .git/index &&
     -+	update_assert_unchanged --refresh
     ++	reset_files &&
     ++	test-tool chmtime =1234567891 .git/index &&
     ++	git update-index --refresh &&
     ++	test-tool chmtime --get .git/index >.git/out &&
     ++	grep ^1234567891 .git/out
      +'
      +
      +test_expect_success '--refresh should fix racy timestamp' '
     -+	reset_mtime .git/index &&
     ++	reset_files &&
      +	update_assert_changed --refresh
      +'
      +
      +test_expect_success '--really-refresh should fix racy timestamp' '
     -+	reset_mtime .git/index &&
     ++	reset_files &&
      +	update_assert_changed --really-refresh
      +'
      +
     -+test_expect_success '--refresh should fix racy timestamp even if needs update' '
     ++test_expect_success '--refresh should fix racy timestamp if other file needs update' '
     ++	reset_files &&
     ++	echo content2 >other &&
     ++	test-tool chmtime =1234567890 other &&
     ++	update_assert_changed --refresh
     ++'
     ++
     ++test_expect_success '--refresh should fix racy timestamp if racy file needs update' '
     ++	reset_files &&
      +	echo content2 >file &&
     -+	reset_mtime file &&
     -+	reset_mtime .git/index &&
     ++	test-tool chmtime =1234567890 file &&
      +	update_assert_changed --refresh
      +'
      +

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 1/2] t7508: add tests capturing racy timestamp handling
  2022-01-05 13:15 ` [PATCH v2 0/2] " Marc Strapetz via GitGitGadget
@ 2022-01-05 13:15   ` Marc Strapetz via GitGitGadget
  2022-01-05 20:59     ` Junio C Hamano
  2022-01-05 13:15   ` [PATCH v2 2/2] update-index: refresh should rewrite index in case of racy timestamps Marc Strapetz via GitGitGadget
  2022-01-06 22:34   ` [PATCH v3 0/4] " Marc Strapetz via GitGitGadget
  2 siblings, 1 reply; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2022-01-05 13:15 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz, Marc Strapetz

From: Marc Strapetz <marc.strapetz@syntevo.com>

"git status" fixes racy timestamps regardless of the worktree being
dirty or not. The new test cases capture this behavior.

Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
---
 t/t7508-status.sh | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/t/t7508-status.sh b/t/t7508-status.sh
index 05c6c02435d..652cbb5ed2e 100755
--- a/t/t7508-status.sh
+++ b/t/t7508-status.sh
@@ -1656,4 +1656,32 @@ test_expect_success '--no-optional-locks prevents index update' '
 	! grep ^1234567890 out
 '
 
+test_expect_success 'racy timestamps will be fixed for clean worktree' '
+	echo content >racy-dirty &&
+	echo content >racy-racy &&
+	git add racy* &&
+	git commit -m "racy test files" &&
+	# let status rewrite the index, if necessary; after that we expect
+	# no more index writes unless caused by racy timestamps; note that
+	# timestamps may already be racy now (depending on previous tests)
+	git status &&
+	test-tool chmtime =1234567890 .git/index &&
+	test-tool chmtime --get .git/index >out &&
+	grep ^1234567890 out &&
+	git status &&
+	test-tool chmtime --get .git/index >out &&
+	! grep ^1234567890 out
+'
+
+test_expect_success 'racy timestamps will be fixed for dirty worktree' '
+	echo content2 >racy-dirty &&
+	git status &&
+	test-tool chmtime =1234567890 .git/index &&
+	test-tool chmtime --get .git/index >out &&
+	grep ^1234567890 out &&
+	git status &&
+	test-tool chmtime --get .git/index >out &&
+	! grep ^1234567890 out
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 2/2] update-index: refresh should rewrite index in case of racy timestamps
  2022-01-05 13:15 ` [PATCH v2 0/2] " Marc Strapetz via GitGitGadget
  2022-01-05 13:15   ` [PATCH v2 1/2] t7508: add tests capturing racy timestamp handling Marc Strapetz via GitGitGadget
@ 2022-01-05 13:15   ` Marc Strapetz via GitGitGadget
  2022-01-05 21:03     ` Junio C Hamano
  2022-01-06 22:34   ` [PATCH v3 0/4] " Marc Strapetz via GitGitGadget
  2 siblings, 1 reply; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2022-01-05 13:15 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz, Marc Strapetz

From: Marc Strapetz <marc.strapetz@syntevo.com>

'git update-index --refresh' and '--really-refresh' should force writing
of the index file if racy timestamps have been encountered, as
'git status' already does [1].

Note that calling 'git update-index --refresh' still does not guarantee
that there will be no more racy timestamps afterwards (the same holds
true for 'git status'):

- calling 'git update-index --refresh' immediately after touching and
  adding a file may still leave racy timestamps if all three operations
  occur within the racy-tolerance (usually 1 second unless USE_NSEC has
  been defined)

- calling 'git update-index --refresh' for timestamps which are set into
  the future will leave them racy

To guarantee that such racy timestamps will be resolved would require to
wait until the system clock has passed beyond these timestamps and only
then write the index file. Especially for future timestamps, this does
not seem feasible because of possibly long delays/hangs.

[1] https://lore.kernel.org/git/d3dd805c-7c1d-30a9-6574-a7bfcb7fc013@syntevo.com/

Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
---
 builtin/update-index.c               | 11 +++++
 cache.h                              |  1 +
 read-cache.c                         |  2 +-
 t/t2108-update-index-refresh-racy.sh | 64 ++++++++++++++++++++++++++++
 4 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100755 t/t2108-update-index-refresh-racy.sh

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 187203e8bb5..7e0a0d9bf80 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -787,6 +787,17 @@ static int refresh(struct refresh_params *o, unsigned int flag)
 	setup_work_tree();
 	read_cache();
 	*o->has_errors |= refresh_cache(o->flags | flag);
+	if (has_racy_timestamp(&the_index)) {
+		/*
+		 * Even if nothing else has changed, updating the file
+		 * increases the chance that racy timestamps become
+		 * non-racy, helping future run-time performance.
+		 * We do that even in case of "errors" returned by
+		 * refresh_cache() as these are no actual errors.
+		 * cmd_status() does the same.
+		 */
+		active_cache_changed |= SOMETHING_CHANGED;
+	}
 	return 0;
 }
 
diff --git a/cache.h b/cache.h
index cfba463aa97..dd1932e2d0e 100644
--- a/cache.h
+++ b/cache.h
@@ -891,6 +891,7 @@ void *read_blob_data_from_index(struct index_state *, const char *, unsigned lon
 #define CE_MATCH_IGNORE_FSMONITOR 0X20
 int is_racy_timestamp(const struct index_state *istate,
 		      const struct cache_entry *ce);
+int has_racy_timestamp(struct index_state *istate);
 int ie_match_stat(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
 int ie_modified(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
 
diff --git a/read-cache.c b/read-cache.c
index cbe73f14e5e..ed297635a33 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2775,7 +2775,7 @@ static int repo_verify_index(struct repository *repo)
 	return verify_index_from(repo->index, repo->index_file);
 }
 
-static int has_racy_timestamp(struct index_state *istate)
+int has_racy_timestamp(struct index_state *istate)
 {
 	int entries = istate->cache_nr;
 	int i;
diff --git a/t/t2108-update-index-refresh-racy.sh b/t/t2108-update-index-refresh-racy.sh
new file mode 100755
index 00000000000..171c37ebec9
--- /dev/null
+++ b/t/t2108-update-index-refresh-racy.sh
@@ -0,0 +1,64 @@
+#!/bin/sh
+
+test_description='update-index refresh tests related to racy timestamps'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+reset_files () {
+	echo content >file &&
+	echo content >other &&
+	test-tool chmtime =1234567890 file &&
+	test-tool chmtime =1234567890 other
+}
+
+update_assert_changed () {
+	test-tool chmtime =1234567890 .git/index &&
+	test_might_fail git update-index "$1" &&
+	test-tool chmtime --get .git/index >.git/out &&
+	! grep ^1234567890 .git/out
+}
+
+test_expect_success 'setup' '
+	reset_files &&
+	# we are calling reset_files() a couple of times during tests;
+	# test-tool chmtime does not change the ctime; to not weaken
+	# or even break our tests, disable ctime-checks entirely
+	git config core.trustctime false &&
+	git add file other &&
+	git commit -m "initial import"
+'
+
+test_expect_success '--refresh has no racy timestamps to fix' '
+	reset_files &&
+	test-tool chmtime =1234567891 .git/index &&
+	git update-index --refresh &&
+	test-tool chmtime --get .git/index >.git/out &&
+	grep ^1234567891 .git/out
+'
+
+test_expect_success '--refresh should fix racy timestamp' '
+	reset_files &&
+	update_assert_changed --refresh
+'
+
+test_expect_success '--really-refresh should fix racy timestamp' '
+	reset_files &&
+	update_assert_changed --really-refresh
+'
+
+test_expect_success '--refresh should fix racy timestamp if other file needs update' '
+	reset_files &&
+	echo content2 >other &&
+	test-tool chmtime =1234567890 other &&
+	update_assert_changed --refresh
+'
+
+test_expect_success '--refresh should fix racy timestamp if racy file needs update' '
+	reset_files &&
+	echo content2 >file &&
+	test-tool chmtime =1234567890 file &&
+	update_assert_changed --refresh
+'
+
+test_done
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] t7508: add tests capturing racy timestamp handling
  2022-01-05 13:15   ` [PATCH v2 1/2] t7508: add tests capturing racy timestamp handling Marc Strapetz via GitGitGadget
@ 2022-01-05 20:59     ` Junio C Hamano
  2022-01-06 10:21       ` Marc Strapetz
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2022-01-05 20:59 UTC (permalink / raw)
  To: Marc Strapetz via GitGitGadget; +Cc: git, Marc Strapetz

"Marc Strapetz via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Marc Strapetz <marc.strapetz@syntevo.com>
>
> "git status" fixes racy timestamps regardless of the worktree being
> dirty or not. The new test cases capture this behavior.
>
> Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
> ---
>  t/t7508-status.sh | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
>
> diff --git a/t/t7508-status.sh b/t/t7508-status.sh
> index 05c6c02435d..652cbb5ed2e 100755
> --- a/t/t7508-status.sh
> +++ b/t/t7508-status.sh
> @@ -1656,4 +1656,32 @@ test_expect_success '--no-optional-locks prevents index update' '
>  	! grep ^1234567890 out
>  '
>  
> +test_expect_success 'racy timestamps will be fixed for clean worktree' '
> +	echo content >racy-dirty &&
> +	echo content >racy-racy &&
> +	git add racy* &&
> +	git commit -m "racy test files" &&
> +	# let status rewrite the index, if necessary; after that we expect
> +	# no more index writes unless caused by racy timestamps; note that
> +	# timestamps may already be racy now (depending on previous tests)
> +	git status &&
> +	test-tool chmtime =1234567890 .git/index &&
> +	test-tool chmtime --get .git/index >out &&
> +	grep ^1234567890 out &&

If file contents were 1234567890999, this will still hit, but I do
not think that is what you wanted to see.  Perhaps

	git status &&
	echo 1234567890 >expect &&
	test-tool chmtime=$(cat expect) .git/index &&
	test-tool chmtime --get .git/index >actual &&
	test_cmp expect actual

or something?  But I think you inherited this bogosity from the
previous test, so I am OK to add a few more copies of the same
bogosity to the test.

Somebody later has to step in and clean them all up, though.  When
that happens, we should document how the magic 1234567890 timestamp
was chosen near its first use.

I think it is because it is a timestamp in year 2009, so as long as
your filetime clock is reasonably accurate, a write to the file
would never get such a low timestamp.

> +	git status &&
> +	test-tool chmtime --get .git/index >out &&
> +	! grep ^1234567890 out


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] update-index: refresh should rewrite index in case of racy timestamps
  2022-01-05 13:15   ` [PATCH v2 2/2] update-index: refresh should rewrite index in case of racy timestamps Marc Strapetz via GitGitGadget
@ 2022-01-05 21:03     ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2022-01-05 21:03 UTC (permalink / raw)
  To: Marc Strapetz via GitGitGadget; +Cc: git, Marc Strapetz

"Marc Strapetz via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +test_expect_success '--refresh has no racy timestamps to fix' '
> +	reset_files &&
> +	test-tool chmtime =1234567891 .git/index &&

Don't some people use Git on VFAT where the time resolution is 2
seconds?  1234567890 and 1234567891 differ only by one second, so
chmtime may not be able to store one of them exactly, so I am not
sure if this guarantees "no racy timestamps to fix".


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] t7508: add tests capturing racy timestamp handling
  2022-01-05 20:59     ` Junio C Hamano
@ 2022-01-06 10:21       ` Marc Strapetz
  0 siblings, 0 replies; 20+ messages in thread
From: Marc Strapetz @ 2022-01-06 10:21 UTC (permalink / raw)
  To: Junio C Hamano, Marc Strapetz via GitGitGadget; +Cc: git

On 05/01/2022 21:59, Junio C Hamano wrote:
>> From: Marc Strapetz <marc.strapetz@syntevo.com>
>>   
>> +test_expect_success 'racy timestamps will be fixed for clean worktree' '
>> +	echo content >racy-dirty &&
>> +	echo content >racy-racy &&
>> +	git add racy* &&
>> +	git commit -m "racy test files" &&
>> +	# let status rewrite the index, if necessary; after that we expect
>> +	# no more index writes unless caused by racy timestamps; note that
>> +	# timestamps may already be racy now (depending on previous tests)
>> +	git status &&
>> +	test-tool chmtime =1234567890 .git/index &&
>> +	test-tool chmtime --get .git/index >out &&
>> +	grep ^1234567890 out &&
> 
> If file contents were 1234567890999, this will still hit, but I do
> not think that is what you wanted to see.  Perhaps
> 
> 	git status &&
> 	echo 1234567890 >expect &&
> 	test-tool chmtime=$(cat expect) .git/index &&
> 	test-tool chmtime --get .git/index >actual &&
> 	test_cmp expect actual
> 
> or something?  But I think you inherited this bogosity from the
> previous test, so I am OK to add a few more copies of the same
> bogosity to the test.
> 
> Somebody later has to step in and clean them all up, though.  When
> that happens, we should document how the magic 1234567890 timestamp
> was chosen near its first use.

It seems like this pattern was used only once before my changes, hence I 
will extract to test-lib-functions.sh and fix the bogosity for the next 
version of my patch.

-Marc

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/4] update-index: refresh should rewrite index in case of racy timestamps
  2022-01-05 13:15 ` [PATCH v2 0/2] " Marc Strapetz via GitGitGadget
  2022-01-05 13:15   ` [PATCH v2 1/2] t7508: add tests capturing racy timestamp handling Marc Strapetz via GitGitGadget
  2022-01-05 13:15   ` [PATCH v2 2/2] update-index: refresh should rewrite index in case of racy timestamps Marc Strapetz via GitGitGadget
@ 2022-01-06 22:34   ` Marc Strapetz via GitGitGadget
  2022-01-06 22:34     ` [PATCH v3 1/4] test-lib: introduce API for verifying file mtime Marc Strapetz via GitGitGadget
                       ` (4 more replies)
  2 siblings, 5 replies; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2022-01-06 22:34 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz

This patch makes update-index --refresh write the index if it contains racy
timestamps, as discussed at [1].

Changes since v2:

 * new patch: test-lib: introduce API for verifying file mtime
 * new patch: t7508: fix bogus mtime verification for test
   "--no-optional-locks prevents index update"
 * change new tests in t2108 and t7508 to use new test-lib mtime API
 * fix "--refresh has no racy timestamps to fix" to use +60s mtime to be
   save on VFAT

Changes since v1:

 * main commit message now uses 'git update-index' and the paragraph was
   dropped
 * t/t7508-status.sh: two tests added which capture status racy handling
 * builtin/update-index.c: comment improved
 * t/t2108-update-index-refresh-racy.sh: major overhaul
   * one test case added
   * mtime-manipulations simplified and aligned to t7508
   * code style fixes, as discussed

[1]
https://lore.kernel.org/git/d3dd805c-7c1d-30a9-6574-a7bfcb7fc013@syntevo.com/

Marc Strapetz (4):
  test-lib: introduce API for verifying file mtime
  t7508: fix bogus mtime verification
  t7508: add tests capturing racy timestamp handling
  update-index: refresh should rewrite index in case of racy timestamps

 builtin/update-index.c               | 11 +++++
 cache.h                              |  1 +
 read-cache.c                         |  2 +-
 t/t2108-update-index-refresh-racy.sh | 64 ++++++++++++++++++++++++++++
 t/t7508-status.sh                    | 30 ++++++++++---
 t/test-lib-functions.sh              | 28 ++++++++++++
 6 files changed, 130 insertions(+), 6 deletions(-)
 create mode 100755 t/t2108-update-index-refresh-racy.sh


base-commit: dcc0cd074f0c639a0df20461a301af6d45bd582e
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1105%2Fmstrap%2Ffeature%2Fupdate-index-refresh-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1105/mstrap/feature/update-index-refresh-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1105

Range-diff vs v2:

 -:  ----------- > 1:  e6301e9d770 test-lib: introduce API for verifying file mtime
 -:  ----------- > 2:  d15a23cc804 t7508: fix bogus mtime verification
 1:  7d58f806111 ! 3:  3567ef91e7a t7508: add tests capturing racy timestamp handling
     @@ Commit message
      
       ## t/t7508-status.sh ##
      @@ t/t7508-status.sh: test_expect_success '--no-optional-locks prevents index update' '
     - 	! grep ^1234567890 out
     + 	! test_is_magic_mtime .git/index
       '
       
      +test_expect_success 'racy timestamps will be fixed for clean worktree' '
     @@ t/t7508-status.sh: test_expect_success '--no-optional-locks prevents index updat
      +	# no more index writes unless caused by racy timestamps; note that
      +	# timestamps may already be racy now (depending on previous tests)
      +	git status &&
     -+	test-tool chmtime =1234567890 .git/index &&
     -+	test-tool chmtime --get .git/index >out &&
     -+	grep ^1234567890 out &&
     ++	test_set_magic_mtime .git/index &&
      +	git status &&
     -+	test-tool chmtime --get .git/index >out &&
     -+	! grep ^1234567890 out
     ++	! test_is_magic_mtime .git/index
      +'
      +
      +test_expect_success 'racy timestamps will be fixed for dirty worktree' '
      +	echo content2 >racy-dirty &&
      +	git status &&
     -+	test-tool chmtime =1234567890 .git/index &&
     -+	test-tool chmtime --get .git/index >out &&
     -+	grep ^1234567890 out &&
     ++	test_set_magic_mtime .git/index &&
      +	git status &&
     -+	test-tool chmtime --get .git/index >out &&
     -+	! grep ^1234567890 out
     ++	! test_is_magic_mtime .git/index
      +'
      +
       test_done
 2:  dfeabf6af15 ! 4:  4a6b18fb304 update-index: refresh should rewrite index in case of racy timestamps
     @@ t/t2108-update-index-refresh-racy.sh (new)
      +reset_files () {
      +	echo content >file &&
      +	echo content >other &&
     -+	test-tool chmtime =1234567890 file &&
     -+	test-tool chmtime =1234567890 other
     ++	test_set_magic_mtime file &&
     ++	test_set_magic_mtime other
      +}
      +
      +update_assert_changed () {
     -+	test-tool chmtime =1234567890 .git/index &&
     ++	test_set_magic_mtime .git/index &&
      +	test_might_fail git update-index "$1" &&
     -+	test-tool chmtime --get .git/index >.git/out &&
     -+	! grep ^1234567890 .git/out
     ++	! test_is_magic_mtime .git/index
      +}
      +
      +test_expect_success 'setup' '
     @@ t/t2108-update-index-refresh-racy.sh (new)
      +
      +test_expect_success '--refresh has no racy timestamps to fix' '
      +	reset_files &&
     -+	test-tool chmtime =1234567891 .git/index &&
     ++	# set the index time far enough to the future;
     ++	# it must be at least 3 seconds for VFAT
     ++	test_set_magic_mtime .git/index +60 &&
      +	git update-index --refresh &&
     -+	test-tool chmtime --get .git/index >.git/out &&
     -+	grep ^1234567891 .git/out
     ++	test_is_magic_mtime .git/index +60
      +'
      +
      +test_expect_success '--refresh should fix racy timestamp' '
     @@ t/t2108-update-index-refresh-racy.sh (new)
      +test_expect_success '--refresh should fix racy timestamp if other file needs update' '
      +	reset_files &&
      +	echo content2 >other &&
     -+	test-tool chmtime =1234567890 other &&
     ++	test_set_magic_mtime other &&
      +	update_assert_changed --refresh
      +'
      +
      +test_expect_success '--refresh should fix racy timestamp if racy file needs update' '
      +	reset_files &&
      +	echo content2 >file &&
     -+	test-tool chmtime =1234567890 file &&
     ++	test_set_magic_mtime file &&
      +	update_assert_changed --refresh
      +'
      +

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 1/4] test-lib: introduce API for verifying file mtime
  2022-01-06 22:34   ` [PATCH v3 0/4] " Marc Strapetz via GitGitGadget
@ 2022-01-06 22:34     ` Marc Strapetz via GitGitGadget
  2022-01-06 23:55       ` Junio C Hamano
  2022-01-06 22:34     ` [PATCH v3 2/4] t7508: fix bogus mtime verification Marc Strapetz via GitGitGadget
                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2022-01-06 22:34 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz, Marc Strapetz

From: Marc Strapetz <marc.strapetz@syntevo.com>

Add functions `test_set_magic_mtime` and `test_is_magic_mtime` which can
be used to (re)set the mtime of a file to a predefined ("magic")
timestamp, then perform some operations and finally check for mtime
changes of the file.

The core implementation follows the suggestion from the
mailing list [1].

[1] https://lore.kernel.org/git/xmqqczl5hpaq.fsf@gitster.g/T/#u

Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
---
 t/test-lib-functions.sh | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 389153e5916..01dd8c01f59 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1806,3 +1806,31 @@ test_region () {
 test_readlink () {
 	perl -le 'print readlink($_) for @ARGV' "$@"
 }
+
+# Set a fixed "magic" mtime to the given file,
+# with an optional increment specified as second argument.
+# Use in combination with test_is_magic_mtime.
+test_set_magic_mtime () {
+	# We are using 1234567890 because it's a common timestamp used in
+	# various tests. It represents date 2009-02-13 which should be safe
+	# to use as long as the filetime clock is reasonably accurate.
+	local inc=${2:-0} &&
+	local mtime=$((1234567890 + $inc)) &&
+	test-tool chmtime =$mtime $1 &&
+	test_is_magic_mtime $1 $inc
+}
+
+# Test whether the given file has the "magic" mtime set,
+# with an optional increment specified as second argument.
+# Use in combination with test_set_magic_mtime.
+test_is_magic_mtime () {
+	local inc=${2:-0} &&
+	local mtime=$((1234567890 + $inc)) &&
+	echo $mtime >.git/test-mtime-expect &&
+	test-tool chmtime --get $1 >.git/test-mtime-actual &&
+	test_cmp .git/test-mtime-expect .git/test-mtime-actual
+	local ret=$?
+	rm .git/test-mtime-expect
+	rm .git/test-mtime-actual
+	return $ret
+}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 2/4] t7508: fix bogus mtime verification
  2022-01-06 22:34   ` [PATCH v3 0/4] " Marc Strapetz via GitGitGadget
  2022-01-06 22:34     ` [PATCH v3 1/4] test-lib: introduce API for verifying file mtime Marc Strapetz via GitGitGadget
@ 2022-01-06 22:34     ` Marc Strapetz via GitGitGadget
  2022-01-06 22:34     ` [PATCH v3 3/4] t7508: add tests capturing racy timestamp handling Marc Strapetz via GitGitGadget
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2022-01-06 22:34 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz, Marc Strapetz

From: Marc Strapetz <marc.strapetz@syntevo.com>

The current `grep`-approach in "--no-optional-locks prevents index
update" may fail e.g. for `out` file contents "1234567890999" [1].
Fix this by using test-lib's new mtime-verification API.

[1] https://lore.kernel.org/git/xmqqczl5hpaq.fsf@gitster.g/T/#u

Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
---
 t/t7508-status.sh | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/t/t7508-status.sh b/t/t7508-status.sh
index 05c6c02435d..b9efd2613d0 100755
--- a/t/t7508-status.sh
+++ b/t/t7508-status.sh
@@ -1647,13 +1647,11 @@ test_expect_success '"Initial commit" should not be noted in commit template' '
 '
 
 test_expect_success '--no-optional-locks prevents index update' '
-	test-tool chmtime =1234567890 .git/index &&
+	test_set_magic_mtime .git/index &&
 	git --no-optional-locks status &&
-	test-tool chmtime --get .git/index >out &&
-	grep ^1234567890 out &&
+	test_is_magic_mtime .git/index &&
 	git status &&
-	test-tool chmtime --get .git/index >out &&
-	! grep ^1234567890 out
+	! test_is_magic_mtime .git/index
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 3/4] t7508: add tests capturing racy timestamp handling
  2022-01-06 22:34   ` [PATCH v3 0/4] " Marc Strapetz via GitGitGadget
  2022-01-06 22:34     ` [PATCH v3 1/4] test-lib: introduce API for verifying file mtime Marc Strapetz via GitGitGadget
  2022-01-06 22:34     ` [PATCH v3 2/4] t7508: fix bogus mtime verification Marc Strapetz via GitGitGadget
@ 2022-01-06 22:34     ` Marc Strapetz via GitGitGadget
  2022-01-06 22:34     ` [PATCH v3 4/4] update-index: refresh should rewrite index in case of racy timestamps Marc Strapetz via GitGitGadget
  2022-01-07 11:17     ` [PATCH v4 0/4] " Marc Strapetz via GitGitGadget
  4 siblings, 0 replies; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2022-01-06 22:34 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz, Marc Strapetz

From: Marc Strapetz <marc.strapetz@syntevo.com>

"git status" fixes racy timestamps regardless of the worktree being
dirty or not. The new test cases capture this behavior.

Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
---
 t/t7508-status.sh | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/t/t7508-status.sh b/t/t7508-status.sh
index b9efd2613d0..2b7ef6c41a4 100755
--- a/t/t7508-status.sh
+++ b/t/t7508-status.sh
@@ -1654,4 +1654,26 @@ test_expect_success '--no-optional-locks prevents index update' '
 	! test_is_magic_mtime .git/index
 '
 
+test_expect_success 'racy timestamps will be fixed for clean worktree' '
+	echo content >racy-dirty &&
+	echo content >racy-racy &&
+	git add racy* &&
+	git commit -m "racy test files" &&
+	# let status rewrite the index, if necessary; after that we expect
+	# no more index writes unless caused by racy timestamps; note that
+	# timestamps may already be racy now (depending on previous tests)
+	git status &&
+	test_set_magic_mtime .git/index &&
+	git status &&
+	! test_is_magic_mtime .git/index
+'
+
+test_expect_success 'racy timestamps will be fixed for dirty worktree' '
+	echo content2 >racy-dirty &&
+	git status &&
+	test_set_magic_mtime .git/index &&
+	git status &&
+	! test_is_magic_mtime .git/index
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 4/4] update-index: refresh should rewrite index in case of racy timestamps
  2022-01-06 22:34   ` [PATCH v3 0/4] " Marc Strapetz via GitGitGadget
                       ` (2 preceding siblings ...)
  2022-01-06 22:34     ` [PATCH v3 3/4] t7508: add tests capturing racy timestamp handling Marc Strapetz via GitGitGadget
@ 2022-01-06 22:34     ` Marc Strapetz via GitGitGadget
  2022-01-07 11:17     ` [PATCH v4 0/4] " Marc Strapetz via GitGitGadget
  4 siblings, 0 replies; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2022-01-06 22:34 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz, Marc Strapetz

From: Marc Strapetz <marc.strapetz@syntevo.com>

'git update-index --refresh' and '--really-refresh' should force writing
of the index file if racy timestamps have been encountered, as
'git status' already does [1].

Note that calling 'git update-index --refresh' still does not guarantee
that there will be no more racy timestamps afterwards (the same holds
true for 'git status'):

- calling 'git update-index --refresh' immediately after touching and
  adding a file may still leave racy timestamps if all three operations
  occur within the racy-tolerance (usually 1 second unless USE_NSEC has
  been defined)

- calling 'git update-index --refresh' for timestamps which are set into
  the future will leave them racy

To guarantee that such racy timestamps will be resolved would require to
wait until the system clock has passed beyond these timestamps and only
then write the index file. Especially for future timestamps, this does
not seem feasible because of possibly long delays/hangs.

[1] https://lore.kernel.org/git/d3dd805c-7c1d-30a9-6574-a7bfcb7fc013@syntevo.com/

Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
---
 builtin/update-index.c               | 11 +++++
 cache.h                              |  1 +
 read-cache.c                         |  2 +-
 t/t2108-update-index-refresh-racy.sh | 64 ++++++++++++++++++++++++++++
 4 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100755 t/t2108-update-index-refresh-racy.sh

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 187203e8bb5..7e0a0d9bf80 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -787,6 +787,17 @@ static int refresh(struct refresh_params *o, unsigned int flag)
 	setup_work_tree();
 	read_cache();
 	*o->has_errors |= refresh_cache(o->flags | flag);
+	if (has_racy_timestamp(&the_index)) {
+		/*
+		 * Even if nothing else has changed, updating the file
+		 * increases the chance that racy timestamps become
+		 * non-racy, helping future run-time performance.
+		 * We do that even in case of "errors" returned by
+		 * refresh_cache() as these are no actual errors.
+		 * cmd_status() does the same.
+		 */
+		active_cache_changed |= SOMETHING_CHANGED;
+	}
 	return 0;
 }
 
diff --git a/cache.h b/cache.h
index cfba463aa97..dd1932e2d0e 100644
--- a/cache.h
+++ b/cache.h
@@ -891,6 +891,7 @@ void *read_blob_data_from_index(struct index_state *, const char *, unsigned lon
 #define CE_MATCH_IGNORE_FSMONITOR 0X20
 int is_racy_timestamp(const struct index_state *istate,
 		      const struct cache_entry *ce);
+int has_racy_timestamp(struct index_state *istate);
 int ie_match_stat(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
 int ie_modified(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
 
diff --git a/read-cache.c b/read-cache.c
index cbe73f14e5e..ed297635a33 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2775,7 +2775,7 @@ static int repo_verify_index(struct repository *repo)
 	return verify_index_from(repo->index, repo->index_file);
 }
 
-static int has_racy_timestamp(struct index_state *istate)
+int has_racy_timestamp(struct index_state *istate)
 {
 	int entries = istate->cache_nr;
 	int i;
diff --git a/t/t2108-update-index-refresh-racy.sh b/t/t2108-update-index-refresh-racy.sh
new file mode 100755
index 00000000000..bc5f2886faf
--- /dev/null
+++ b/t/t2108-update-index-refresh-racy.sh
@@ -0,0 +1,64 @@
+#!/bin/sh
+
+test_description='update-index refresh tests related to racy timestamps'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+reset_files () {
+	echo content >file &&
+	echo content >other &&
+	test_set_magic_mtime file &&
+	test_set_magic_mtime other
+}
+
+update_assert_changed () {
+	test_set_magic_mtime .git/index &&
+	test_might_fail git update-index "$1" &&
+	! test_is_magic_mtime .git/index
+}
+
+test_expect_success 'setup' '
+	reset_files &&
+	# we are calling reset_files() a couple of times during tests;
+	# test-tool chmtime does not change the ctime; to not weaken
+	# or even break our tests, disable ctime-checks entirely
+	git config core.trustctime false &&
+	git add file other &&
+	git commit -m "initial import"
+'
+
+test_expect_success '--refresh has no racy timestamps to fix' '
+	reset_files &&
+	# set the index time far enough to the future;
+	# it must be at least 3 seconds for VFAT
+	test_set_magic_mtime .git/index +60 &&
+	git update-index --refresh &&
+	test_is_magic_mtime .git/index +60
+'
+
+test_expect_success '--refresh should fix racy timestamp' '
+	reset_files &&
+	update_assert_changed --refresh
+'
+
+test_expect_success '--really-refresh should fix racy timestamp' '
+	reset_files &&
+	update_assert_changed --really-refresh
+'
+
+test_expect_success '--refresh should fix racy timestamp if other file needs update' '
+	reset_files &&
+	echo content2 >other &&
+	test_set_magic_mtime other &&
+	update_assert_changed --refresh
+'
+
+test_expect_success '--refresh should fix racy timestamp if racy file needs update' '
+	reset_files &&
+	echo content2 >file &&
+	test_set_magic_mtime file &&
+	update_assert_changed --refresh
+'
+
+test_done
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 1/4] test-lib: introduce API for verifying file mtime
  2022-01-06 22:34     ` [PATCH v3 1/4] test-lib: introduce API for verifying file mtime Marc Strapetz via GitGitGadget
@ 2022-01-06 23:55       ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2022-01-06 23:55 UTC (permalink / raw)
  To: Marc Strapetz via GitGitGadget; +Cc: git, Marc Strapetz

"Marc Strapetz via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +# Set a fixed "magic" mtime to the given file,
> +# with an optional increment specified as second argument.
> +# Use in combination with test_is_magic_mtime.
> +test_set_magic_mtime () {
> +	# We are using 1234567890 because it's a common timestamp used in
> +	# various tests. It represents date 2009-02-13 which should be safe
> +	# to use as long as the filetime clock is reasonably accurate.

In the original context of "setting an ancient time, and detect
filesystem modification by noticing that the timestamp has or has
not changed", such an ancient timestamp "should be safe to use", but
if you expose it to more general audience, the context of their use
must be in line with your intended use to be safe.

	# Set mtime to mid February 2009, before we run an operation
	# that may or may not touch the file.  If the file was
	# touched, its timestamp will not accidentally have such an
	# old timestamp, as long as your filesystem clock is
	# reasonably correct.

perhaps?

> +	local inc=${2:-0} &&
> +	local mtime=$((1234567890 + $inc)) &&
> +	test-tool chmtime =$mtime $1 &&
> +	test_is_magic_mtime $1 $inc
> +}

Also as a helper function in the library that is (hopefully) useful
to many other callers, make sure you got your quoting correct.

There is no rule that you must use filenames without SP in it in
your tests, for example, so make sure "$1" above are quoted.  The
same applies to the next function.

> +# Test whether the given file has the "magic" mtime set,
> +# with an optional increment specified as second argument.
> +# Use in combination with test_set_magic_mtime.
> +test_is_magic_mtime () {
> +	local inc=${2:-0} &&
> +	local mtime=$((1234567890 + $inc)) &&
> +	echo $mtime >.git/test-mtime-expect &&
> +	test-tool chmtime --get $1 >.git/test-mtime-actual &&
> +	test_cmp .git/test-mtime-expect .git/test-mtime-actual
> +	local ret=$?
> +	rm .git/test-mtime-expect
> +	rm .git/test-mtime-actual

Use "rm -f" here?  Otherwise, if the main test failed somewhere
before it runs test_cmp, we'd see an error from an attempt to remove
a file that does not exist.

> +	return $ret
> +}

Other than that, quite nicely done (both these two functions and its
users).

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v4 0/4] update-index: refresh should rewrite index in case of racy timestamps
  2022-01-06 22:34   ` [PATCH v3 0/4] " Marc Strapetz via GitGitGadget
                       ` (3 preceding siblings ...)
  2022-01-06 22:34     ` [PATCH v3 4/4] update-index: refresh should rewrite index in case of racy timestamps Marc Strapetz via GitGitGadget
@ 2022-01-07 11:17     ` Marc Strapetz via GitGitGadget
  2022-01-07 11:17       ` [PATCH v4 1/4] test-lib: introduce API for verifying file mtime Marc Strapetz via GitGitGadget
                         ` (3 more replies)
  4 siblings, 4 replies; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2022-01-07 11:17 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz

This patch makes update-index --refresh write the index if it contains racy
timestamps, as discussed at [1].

Changes since v3:

 * test-lib: improve API for verifying file mtime
   * fix quoting around "$1"
   * use "rm -f" for cleanup of auxiliary files
   * improve API description comments
 * Note that gitgitgadget's "freebsd_12" check is failing since a couple of
   days (unrelated to this pull request); hence, this check hasn't been
   applied to this patch series

Changes since v2:

 * new patch: test-lib: introduce API for verifying file mtime
 * new patch: t7508: fix bogus mtime verification for test
   "--no-optional-locks prevents index update"
 * change new tests in t2108 and t7508 to use new test-lib mtime API
 * fix "--refresh has no racy timestamps to fix" to use +60s mtime to be
   save on VFAT

Changes since v1:

 * main commit message now uses 'git update-index' and the paragraph was
   dropped
 * t/t7508-status.sh: two tests added which capture status racy handling
 * builtin/update-index.c: comment improved
 * t/t2108-update-index-refresh-racy.sh: major overhaul
   * one test case added
   * mtime-manipulations simplified and aligned to t7508
   * code style fixes, as discussed

[1]
https://lore.kernel.org/git/d3dd805c-7c1d-30a9-6574-a7bfcb7fc013@syntevo.com/

Marc Strapetz (4):
  test-lib: introduce API for verifying file mtime
  t7508: fix bogus mtime verification
  t7508: add tests capturing racy timestamp handling
  update-index: refresh should rewrite index in case of racy timestamps

 builtin/update-index.c               | 11 +++++
 cache.h                              |  1 +
 read-cache.c                         |  2 +-
 t/t2108-update-index-refresh-racy.sh | 64 ++++++++++++++++++++++++++++
 t/t7508-status.sh                    | 30 ++++++++++---
 t/test-lib-functions.sh              | 33 ++++++++++++++
 6 files changed, 135 insertions(+), 6 deletions(-)
 create mode 100755 t/t2108-update-index-refresh-racy.sh


base-commit: dcc0cd074f0c639a0df20461a301af6d45bd582e
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1105%2Fmstrap%2Ffeature%2Fupdate-index-refresh-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1105/mstrap/feature/update-index-refresh-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1105

Range-diff vs v3:

 1:  e6301e9d770 ! 1:  37c11bfafc4 test-lib: introduce API for verifying file mtime
     @@ t/test-lib-functions.sh: test_region () {
       	perl -le 'print readlink($_) for @ARGV' "$@"
       }
      +
     -+# Set a fixed "magic" mtime to the given file,
     -+# with an optional increment specified as second argument.
     -+# Use in combination with test_is_magic_mtime.
     ++# Set mtime to a fixed "magic" timestamp in mid February 2009, before we
     ++# run an operation that may or may not touch the file.  If the file was
     ++# touched, its timestamp will not accidentally have such an old timestamp,
     ++# as long as your filesystem clock is reasonably correct.  To verify the
     ++# timestamp, follow up with test_is_magic_mtime.
     ++#
     ++# An optional increment to the magic timestamp may be specified as second
     ++# argument.
      +test_set_magic_mtime () {
     -+	# We are using 1234567890 because it's a common timestamp used in
     -+	# various tests. It represents date 2009-02-13 which should be safe
     -+	# to use as long as the filetime clock is reasonably accurate.
      +	local inc=${2:-0} &&
      +	local mtime=$((1234567890 + $inc)) &&
     -+	test-tool chmtime =$mtime $1 &&
     -+	test_is_magic_mtime $1 $inc
     ++	test-tool chmtime =$mtime "$1" &&
     ++	test_is_magic_mtime "$1" $inc
      +}
      +
     -+# Test whether the given file has the "magic" mtime set,
     -+# with an optional increment specified as second argument.
     -+# Use in combination with test_set_magic_mtime.
     ++# Test whether the given file has the "magic" mtime set.  This is meant to
     ++# be used in combination with test_set_magic_mtime.
     ++#
     ++# An optional increment to the magic timestamp may be specified as second
     ++# argument.  Usually, this should be the same increment which was used for
     ++# the associated test_set_magic_mtime.
      +test_is_magic_mtime () {
      +	local inc=${2:-0} &&
      +	local mtime=$((1234567890 + $inc)) &&
      +	echo $mtime >.git/test-mtime-expect &&
     -+	test-tool chmtime --get $1 >.git/test-mtime-actual &&
     ++	test-tool chmtime --get "$1" >.git/test-mtime-actual &&
      +	test_cmp .git/test-mtime-expect .git/test-mtime-actual
      +	local ret=$?
     -+	rm .git/test-mtime-expect
     -+	rm .git/test-mtime-actual
     ++	rm -f .git/test-mtime-expect
     ++	rm -f .git/test-mtime-actual
      +	return $ret
      +}
 2:  d15a23cc804 = 2:  c97a41af389 t7508: fix bogus mtime verification
 3:  3567ef91e7a = 3:  82d0b6ab8d2 t7508: add tests capturing racy timestamp handling
 4:  4a6b18fb304 = 4:  e31edb74e24 update-index: refresh should rewrite index in case of racy timestamps

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v4 1/4] test-lib: introduce API for verifying file mtime
  2022-01-07 11:17     ` [PATCH v4 0/4] " Marc Strapetz via GitGitGadget
@ 2022-01-07 11:17       ` Marc Strapetz via GitGitGadget
  2022-01-07 11:17       ` [PATCH v4 2/4] t7508: fix bogus mtime verification Marc Strapetz via GitGitGadget
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2022-01-07 11:17 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz, Marc Strapetz

From: Marc Strapetz <marc.strapetz@syntevo.com>

Add functions `test_set_magic_mtime` and `test_is_magic_mtime` which can
be used to (re)set the mtime of a file to a predefined ("magic")
timestamp, then perform some operations and finally check for mtime
changes of the file.

The core implementation follows the suggestion from the
mailing list [1].

[1] https://lore.kernel.org/git/xmqqczl5hpaq.fsf@gitster.g/T/#u

Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
---
 t/test-lib-functions.sh | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 389153e5916..c1afa0884bf 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1806,3 +1806,36 @@ test_region () {
 test_readlink () {
 	perl -le 'print readlink($_) for @ARGV' "$@"
 }
+
+# Set mtime to a fixed "magic" timestamp in mid February 2009, before we
+# run an operation that may or may not touch the file.  If the file was
+# touched, its timestamp will not accidentally have such an old timestamp,
+# as long as your filesystem clock is reasonably correct.  To verify the
+# timestamp, follow up with test_is_magic_mtime.
+#
+# An optional increment to the magic timestamp may be specified as second
+# argument.
+test_set_magic_mtime () {
+	local inc=${2:-0} &&
+	local mtime=$((1234567890 + $inc)) &&
+	test-tool chmtime =$mtime "$1" &&
+	test_is_magic_mtime "$1" $inc
+}
+
+# Test whether the given file has the "magic" mtime set.  This is meant to
+# be used in combination with test_set_magic_mtime.
+#
+# An optional increment to the magic timestamp may be specified as second
+# argument.  Usually, this should be the same increment which was used for
+# the associated test_set_magic_mtime.
+test_is_magic_mtime () {
+	local inc=${2:-0} &&
+	local mtime=$((1234567890 + $inc)) &&
+	echo $mtime >.git/test-mtime-expect &&
+	test-tool chmtime --get "$1" >.git/test-mtime-actual &&
+	test_cmp .git/test-mtime-expect .git/test-mtime-actual
+	local ret=$?
+	rm -f .git/test-mtime-expect
+	rm -f .git/test-mtime-actual
+	return $ret
+}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 2/4] t7508: fix bogus mtime verification
  2022-01-07 11:17     ` [PATCH v4 0/4] " Marc Strapetz via GitGitGadget
  2022-01-07 11:17       ` [PATCH v4 1/4] test-lib: introduce API for verifying file mtime Marc Strapetz via GitGitGadget
@ 2022-01-07 11:17       ` Marc Strapetz via GitGitGadget
  2022-01-07 11:17       ` [PATCH v4 3/4] t7508: add tests capturing racy timestamp handling Marc Strapetz via GitGitGadget
  2022-01-07 11:17       ` [PATCH v4 4/4] update-index: refresh should rewrite index in case of racy timestamps Marc Strapetz via GitGitGadget
  3 siblings, 0 replies; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2022-01-07 11:17 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz, Marc Strapetz

From: Marc Strapetz <marc.strapetz@syntevo.com>

The current `grep`-approach in "--no-optional-locks prevents index
update" may fail e.g. for `out` file contents "1234567890999" [1].
Fix this by using test-lib's new mtime-verification API.

[1] https://lore.kernel.org/git/xmqqczl5hpaq.fsf@gitster.g/T/#u

Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
---
 t/t7508-status.sh | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/t/t7508-status.sh b/t/t7508-status.sh
index 05c6c02435d..b9efd2613d0 100755
--- a/t/t7508-status.sh
+++ b/t/t7508-status.sh
@@ -1647,13 +1647,11 @@ test_expect_success '"Initial commit" should not be noted in commit template' '
 '
 
 test_expect_success '--no-optional-locks prevents index update' '
-	test-tool chmtime =1234567890 .git/index &&
+	test_set_magic_mtime .git/index &&
 	git --no-optional-locks status &&
-	test-tool chmtime --get .git/index >out &&
-	grep ^1234567890 out &&
+	test_is_magic_mtime .git/index &&
 	git status &&
-	test-tool chmtime --get .git/index >out &&
-	! grep ^1234567890 out
+	! test_is_magic_mtime .git/index
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 3/4] t7508: add tests capturing racy timestamp handling
  2022-01-07 11:17     ` [PATCH v4 0/4] " Marc Strapetz via GitGitGadget
  2022-01-07 11:17       ` [PATCH v4 1/4] test-lib: introduce API for verifying file mtime Marc Strapetz via GitGitGadget
  2022-01-07 11:17       ` [PATCH v4 2/4] t7508: fix bogus mtime verification Marc Strapetz via GitGitGadget
@ 2022-01-07 11:17       ` Marc Strapetz via GitGitGadget
  2022-01-07 11:17       ` [PATCH v4 4/4] update-index: refresh should rewrite index in case of racy timestamps Marc Strapetz via GitGitGadget
  3 siblings, 0 replies; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2022-01-07 11:17 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz, Marc Strapetz

From: Marc Strapetz <marc.strapetz@syntevo.com>

"git status" fixes racy timestamps regardless of the worktree being
dirty or not. The new test cases capture this behavior.

Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
---
 t/t7508-status.sh | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/t/t7508-status.sh b/t/t7508-status.sh
index b9efd2613d0..2b7ef6c41a4 100755
--- a/t/t7508-status.sh
+++ b/t/t7508-status.sh
@@ -1654,4 +1654,26 @@ test_expect_success '--no-optional-locks prevents index update' '
 	! test_is_magic_mtime .git/index
 '
 
+test_expect_success 'racy timestamps will be fixed for clean worktree' '
+	echo content >racy-dirty &&
+	echo content >racy-racy &&
+	git add racy* &&
+	git commit -m "racy test files" &&
+	# let status rewrite the index, if necessary; after that we expect
+	# no more index writes unless caused by racy timestamps; note that
+	# timestamps may already be racy now (depending on previous tests)
+	git status &&
+	test_set_magic_mtime .git/index &&
+	git status &&
+	! test_is_magic_mtime .git/index
+'
+
+test_expect_success 'racy timestamps will be fixed for dirty worktree' '
+	echo content2 >racy-dirty &&
+	git status &&
+	test_set_magic_mtime .git/index &&
+	git status &&
+	! test_is_magic_mtime .git/index
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 4/4] update-index: refresh should rewrite index in case of racy timestamps
  2022-01-07 11:17     ` [PATCH v4 0/4] " Marc Strapetz via GitGitGadget
                         ` (2 preceding siblings ...)
  2022-01-07 11:17       ` [PATCH v4 3/4] t7508: add tests capturing racy timestamp handling Marc Strapetz via GitGitGadget
@ 2022-01-07 11:17       ` Marc Strapetz via GitGitGadget
  3 siblings, 0 replies; 20+ messages in thread
From: Marc Strapetz via GitGitGadget @ 2022-01-07 11:17 UTC (permalink / raw)
  To: git; +Cc: Marc Strapetz, Marc Strapetz

From: Marc Strapetz <marc.strapetz@syntevo.com>

'git update-index --refresh' and '--really-refresh' should force writing
of the index file if racy timestamps have been encountered, as
'git status' already does [1].

Note that calling 'git update-index --refresh' still does not guarantee
that there will be no more racy timestamps afterwards (the same holds
true for 'git status'):

- calling 'git update-index --refresh' immediately after touching and
  adding a file may still leave racy timestamps if all three operations
  occur within the racy-tolerance (usually 1 second unless USE_NSEC has
  been defined)

- calling 'git update-index --refresh' for timestamps which are set into
  the future will leave them racy

To guarantee that such racy timestamps will be resolved would require to
wait until the system clock has passed beyond these timestamps and only
then write the index file. Especially for future timestamps, this does
not seem feasible because of possibly long delays/hangs.

[1] https://lore.kernel.org/git/d3dd805c-7c1d-30a9-6574-a7bfcb7fc013@syntevo.com/

Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
---
 builtin/update-index.c               | 11 +++++
 cache.h                              |  1 +
 read-cache.c                         |  2 +-
 t/t2108-update-index-refresh-racy.sh | 64 ++++++++++++++++++++++++++++
 4 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100755 t/t2108-update-index-refresh-racy.sh

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 187203e8bb5..7e0a0d9bf80 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -787,6 +787,17 @@ static int refresh(struct refresh_params *o, unsigned int flag)
 	setup_work_tree();
 	read_cache();
 	*o->has_errors |= refresh_cache(o->flags | flag);
+	if (has_racy_timestamp(&the_index)) {
+		/*
+		 * Even if nothing else has changed, updating the file
+		 * increases the chance that racy timestamps become
+		 * non-racy, helping future run-time performance.
+		 * We do that even in case of "errors" returned by
+		 * refresh_cache() as these are no actual errors.
+		 * cmd_status() does the same.
+		 */
+		active_cache_changed |= SOMETHING_CHANGED;
+	}
 	return 0;
 }
 
diff --git a/cache.h b/cache.h
index cfba463aa97..dd1932e2d0e 100644
--- a/cache.h
+++ b/cache.h
@@ -891,6 +891,7 @@ void *read_blob_data_from_index(struct index_state *, const char *, unsigned lon
 #define CE_MATCH_IGNORE_FSMONITOR 0X20
 int is_racy_timestamp(const struct index_state *istate,
 		      const struct cache_entry *ce);
+int has_racy_timestamp(struct index_state *istate);
 int ie_match_stat(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
 int ie_modified(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
 
diff --git a/read-cache.c b/read-cache.c
index cbe73f14e5e..ed297635a33 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2775,7 +2775,7 @@ static int repo_verify_index(struct repository *repo)
 	return verify_index_from(repo->index, repo->index_file);
 }
 
-static int has_racy_timestamp(struct index_state *istate)
+int has_racy_timestamp(struct index_state *istate)
 {
 	int entries = istate->cache_nr;
 	int i;
diff --git a/t/t2108-update-index-refresh-racy.sh b/t/t2108-update-index-refresh-racy.sh
new file mode 100755
index 00000000000..bc5f2886faf
--- /dev/null
+++ b/t/t2108-update-index-refresh-racy.sh
@@ -0,0 +1,64 @@
+#!/bin/sh
+
+test_description='update-index refresh tests related to racy timestamps'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+reset_files () {
+	echo content >file &&
+	echo content >other &&
+	test_set_magic_mtime file &&
+	test_set_magic_mtime other
+}
+
+update_assert_changed () {
+	test_set_magic_mtime .git/index &&
+	test_might_fail git update-index "$1" &&
+	! test_is_magic_mtime .git/index
+}
+
+test_expect_success 'setup' '
+	reset_files &&
+	# we are calling reset_files() a couple of times during tests;
+	# test-tool chmtime does not change the ctime; to not weaken
+	# or even break our tests, disable ctime-checks entirely
+	git config core.trustctime false &&
+	git add file other &&
+	git commit -m "initial import"
+'
+
+test_expect_success '--refresh has no racy timestamps to fix' '
+	reset_files &&
+	# set the index time far enough to the future;
+	# it must be at least 3 seconds for VFAT
+	test_set_magic_mtime .git/index +60 &&
+	git update-index --refresh &&
+	test_is_magic_mtime .git/index +60
+'
+
+test_expect_success '--refresh should fix racy timestamp' '
+	reset_files &&
+	update_assert_changed --refresh
+'
+
+test_expect_success '--really-refresh should fix racy timestamp' '
+	reset_files &&
+	update_assert_changed --really-refresh
+'
+
+test_expect_success '--refresh should fix racy timestamp if other file needs update' '
+	reset_files &&
+	echo content2 >other &&
+	test_set_magic_mtime other &&
+	update_assert_changed --refresh
+'
+
+test_expect_success '--refresh should fix racy timestamp if racy file needs update' '
+	reset_files &&
+	echo content2 >file &&
+	test_set_magic_mtime file &&
+	update_assert_changed --refresh
+'
+
+test_done
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2022-01-07 11:17 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-22 13:56 [PATCH] update-index: refresh should rewrite index in case of racy timestamps Marc Strapetz via GitGitGadget
2021-12-22 23:52 ` Junio C Hamano
2021-12-23 18:24   ` Marc Strapetz
2022-01-05 13:15 ` [PATCH v2 0/2] " Marc Strapetz via GitGitGadget
2022-01-05 13:15   ` [PATCH v2 1/2] t7508: add tests capturing racy timestamp handling Marc Strapetz via GitGitGadget
2022-01-05 20:59     ` Junio C Hamano
2022-01-06 10:21       ` Marc Strapetz
2022-01-05 13:15   ` [PATCH v2 2/2] update-index: refresh should rewrite index in case of racy timestamps Marc Strapetz via GitGitGadget
2022-01-05 21:03     ` Junio C Hamano
2022-01-06 22:34   ` [PATCH v3 0/4] " Marc Strapetz via GitGitGadget
2022-01-06 22:34     ` [PATCH v3 1/4] test-lib: introduce API for verifying file mtime Marc Strapetz via GitGitGadget
2022-01-06 23:55       ` Junio C Hamano
2022-01-06 22:34     ` [PATCH v3 2/4] t7508: fix bogus mtime verification Marc Strapetz via GitGitGadget
2022-01-06 22:34     ` [PATCH v3 3/4] t7508: add tests capturing racy timestamp handling Marc Strapetz via GitGitGadget
2022-01-06 22:34     ` [PATCH v3 4/4] update-index: refresh should rewrite index in case of racy timestamps Marc Strapetz via GitGitGadget
2022-01-07 11:17     ` [PATCH v4 0/4] " Marc Strapetz via GitGitGadget
2022-01-07 11:17       ` [PATCH v4 1/4] test-lib: introduce API for verifying file mtime Marc Strapetz via GitGitGadget
2022-01-07 11:17       ` [PATCH v4 2/4] t7508: fix bogus mtime verification Marc Strapetz via GitGitGadget
2022-01-07 11:17       ` [PATCH v4 3/4] t7508: add tests capturing racy timestamp handling Marc Strapetz via GitGitGadget
2022-01-07 11:17       ` [PATCH v4 4/4] update-index: refresh should rewrite index in case of racy timestamps Marc Strapetz via GitGitGadget

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).