git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Utsav Shah via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Utsav Shah <ukshah2@illinois.edu>, Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v4 0/1] unpack-trees: skip stat on fsmonitor-valid files
Date: Wed, 20 Nov 2019 08:32:16 +0000	[thread overview]
Message-ID: <pull.424.v4.git.1574238737.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.424.v3.git.1573016055.gitgitgadget@gmail.com>

The index might be aware that a file hasn't modified via fsmonitor, but
unpack-trees did not pay attention to it and checked via ie_match_stat which
can be inefficient on certain filesystems. This significantly slows down
commands that run oneway_merge, like checkout and reset --hard.

This patch makes oneway_merge check whether a file is considered unchanged
through fsmonitor and skips ie_match_stat on it. unpack-trees also now
correctly copies over fsmonitor validity state from the source index.
Finally, for correctness, we force a refresh of fsmonitor state in
tweak_fsmonitor.

After this change, commands like stash (that use reset --hard internally) go
from 8s or more to ~2s on a 250k file repository on a mac.

Changes since the last version are:

 * The sanity checks around accessing the fsmonitor_dirty bitmap have been
   moved to another patch, which is in message id [1]
 * Unintended indentation changes in fsmonitor have been removed
 * A comment explaining what untracked->use_fsmonitor means has been
   re-added (it was dropped in the previous version)
 * A few "helped-by" entries have been added to the patch

1: (xmqqzhh0d0ma.fsf@gitster-ct.c.googlers.com)

Helped-by: Junio C Hamano gitster@pobox.com [gitster@pobox.com]Helped-by:
Kevin Willford Kevin.Willford@microsoft.com [Kevin.Willford@microsoft.com]
Signed-off-by: Utsav Shah utsav@dropbox.com [utsav@dropbox.com]

Utsav Shah (1):
  unpack-trees: skip stat on fsmonitor-valid files

 fsmonitor.c                 | 23 +++++++++++++++++------
 t/t7519-status-fsmonitor.sh |  9 +++++++--
 unpack-trees.c              |  6 +++++-
 3 files changed, 29 insertions(+), 9 deletions(-)


base-commit: 566a1439f6f56c2171b8853ddbca0ad3f5098770
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-424%2FUtsav2%2Fskip-lstat-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-424/Utsav2/skip-lstat-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/424

Range-diff vs v3:

 1:  4bea7075cf ! 1:  ea7880f2d0 unpack-trees: skip stat on fsmonitor-valid files
     @@ -17,52 +17,29 @@
          internally) go from 8s or more to ~2s on a 250k file repository on a
          mac.
      
     +    Changes since the last version are:
     +    * The sanity checks around accessing the fsmonitor_dirty bitmap have
     +    been moved to another patch, which is in message id [1]
     +    * Unintended indentation changes in fsmonitor have been removed
     +    * A comment explaining what untracked->use_fsmonitor means has been
     +    re-added (it was dropped in the previous version)
     +    * A few "helped-by" entries have been added to the patch
     +
     +    [1]: (xmqqzhh0d0ma.fsf@gitster-ct.c.googlers.com)
     +
     +    Helped-by: Junio C Hamano <gitster@pobox.com>
     +    Helped-by: Kevin Willford <Kevin.Willford@microsoft.com>
          Signed-off-by: Utsav Shah <utsav@dropbox.com>
      
       diff --git a/fsmonitor.c b/fsmonitor.c
       --- a/fsmonitor.c
       +++ b/fsmonitor.c
      @@
     - 
     - 	if (pos >= istate->cache_nr)
     - 		BUG("fsmonitor_dirty has more entries than the index (%"PRIuMAX" >= %u)",
     --		    (uintmax_t)pos, istate->cache_nr);
     -+			(uintmax_t)pos, istate->cache_nr);
     - 
     - 	ce = istate->cache[pos];
     - 	ce->ce_flags &= ~CE_FSMONITOR_VALID;
     -@@
     - 	}
     - 	istate->fsmonitor_dirty = fsmonitor_dirty;
     - 
     --	if (istate->fsmonitor_dirty->bit_size > istate->cache_nr)
     --		BUG("fsmonitor_dirty has more entries than the index (%"PRIuMAX" > %u)",
     --		    (uintmax_t)istate->fsmonitor_dirty->bit_size, istate->cache_nr);
     -+	if (!istate->split_index && istate->fsmonitor_dirty->bit_size > istate->cache_nr)
     -+		BUG("fsmonitor_dirty has more entries than the index (%"PRIuMAX" > %"PRIuMAX")",
     -+			(uintmax_t)istate->fsmonitor_dirty->bit_size, (uintmax_t)istate->cache_nr);
     -+
     - 
     - 	trace_printf_key(&trace_fsmonitor, "read fsmonitor extension successful");
     - 	return 0;
     -@@
     - 	uint32_t ewah_size = 0;
     - 	int fixup = 0;
     - 
     --	if (istate->fsmonitor_dirty->bit_size > istate->cache_nr)
     --		BUG("fsmonitor_dirty has more entries than the index (%"PRIuMAX" > %u)",
     --		    (uintmax_t)istate->fsmonitor_dirty->bit_size, istate->cache_nr);
     -+	if (!istate->split_index && istate->fsmonitor_dirty->bit_size > istate->cache_nr)
     -+		BUG("fsmonitor_dirty has more entries than the index (%"PRIuMAX" > %"PRIuMAX")",
     -+			(uintmax_t)istate->fsmonitor_dirty->bit_size, (uintmax_t)istate->cache_nr);
     - 
     - 	put_be32(&hdr_version, INDEX_EXTENSION_VERSION);
     - 	strbuf_add(sb, &hdr_version, sizeof(uint32_t));
     -@@
       		}
       		if (bol < query_result.len)
       			fsmonitor_refresh_callback(istate, buf + bol);
      +
     ++		/* Now mark the untracked cache for fsmonitor usage */
      +		if (istate->untracked)
      +			istate->untracked->use_fsmonitor = 1;
       	} else {
     @@ -88,11 +65,7 @@
       		if (istate->untracked)
       			istate->untracked->use_fsmonitor = 0;
      @@
     - 			/* Mark all previously saved entries as dirty */
     - 			if (istate->fsmonitor_dirty->bit_size > istate->cache_nr)
     - 				BUG("fsmonitor_dirty has more entries than the index (%"PRIuMAX" > %u)",
     --				    (uintmax_t)istate->fsmonitor_dirty->bit_size, istate->cache_nr);
     -+					(uintmax_t)istate->fsmonitor_dirty->bit_size, istate->cache_nr);
     + 				    (uintmax_t)istate->fsmonitor_dirty->bit_size, istate->cache_nr);
       			ewah_each_bit(istate->fsmonitor_dirty, fsmonitor_ewah_callback, istate);
       
      -			/* Now mark the untracked cache for fsmonitor usage */

-- 
gitgitgadget

  parent reply	other threads:[~2019-11-20  8:32 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-25 15:23 [PATCH 0/1] unpack-trees: skip lstat on files based on fsmonitor Utsav Shah via GitGitGadget
2019-10-25 15:23 ` [PATCH 1/1] unpack-trees: skip lstat " Utsav Shah via GitGitGadget
2019-10-28  3:37   ` Junio C Hamano
2019-10-28  6:39     ` Utsav Shah
2019-10-28 19:23       ` Kevin Willford
2019-10-29 19:06         ` Utsav Shah
2019-10-29 20:12           ` Kevin Willford
2019-10-29 23:50             ` Utsav Shah
2019-10-30  0:21               ` Junio C Hamano
2019-10-30 16:41                 ` Utsav Shah
2019-11-04  6:02                   ` Junio C Hamano
2019-11-05 15:27 ` [PATCH v2 0/1] unpack-trees: skip stat on fsmonitor-valid files Utsav Shah via GitGitGadget
2019-11-05 15:27   ` [PATCH v2 1/1] " Utsav Shah via GitGitGadget
2019-11-05 21:40     ` Kevin Willford
2019-11-06  4:36       ` Utsav Shah
2019-11-06 17:24         ` Kevin Willford
2019-11-06  4:54   ` [PATCH v3 0/1] " Utsav Shah via GitGitGadget
2019-11-06  4:54     ` [PATCH v3 1/1] " Utsav Shah via GitGitGadget
2019-11-06 10:46       ` Junio C Hamano
2019-11-06 22:33         ` Utsav Shah
2019-11-08  3:51           ` Utsav Shah
2019-11-08  4:11             ` Junio C Hamano
2019-11-06 10:16     ` [PATCH v3 0/1] " Junio C Hamano
2019-11-20  8:32     ` Utsav Shah via GitGitGadget [this message]
2019-11-20  8:32       ` [PATCH v4 1/1] " Utsav Shah via GitGitGadget
2019-11-21  4:15         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.424.v4.git.1574238737.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=ukshah2@illinois.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).