All of lore.kernel.org
 help / color / mirror / Atom feed
From: Utsav Shah <utsav@dropbox.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Utsav Shah via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, Utsav Shah <ukshah2@illinois.edu>
Subject: Re: [PATCH 1/1] unpack-trees: skip lstat based on fsmonitor
Date: Sun, 27 Oct 2019 23:39:42 -0700	[thread overview]
Message-ID: <CAPYzU3N9mDfHVogfq=mhJFj6VOjS2z4ui4msnDdK6pOtVBa_QA@mail.gmail.com> (raw)
In-Reply-To: <xmqq8sp5a6cd.fsf@gitster-ct.c.googlers.com>

Thanks for the review.

On Sun, Oct 27, 2019 at 8:37 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Utsav Shah via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Utsav Shah <utsav@dropbox.com>
> >
> > git stash runs git reset --hard as its final step, which can be fairly slow on a large repository.
> > This change lets us skip that if fsmonitor thinks those files aren't modified.
> >
> > git stash goes from ~8s -> 2s on my repository after this change.
>
> Please line-wrap overlong lines.
>
> More importantly, "stash" may be a mere symptom that does not
> deserve this much emphasis.  What you improved directly is "git
> reset --hard" isn't it?
>
>     The fsmonitor may know that a path hasn't been modified but
>     "git reset --hard" did not pay attention to it and performed
>     its own check based on ie_match_stat(), which was inefficient.
>
> or something like that?
>
> >       if (old && same(old, a)) {
> >               int update = 0;
> > -             if (o->reset && o->update && !ce_uptodate(old) && !ce_skip_worktree(old)) {
> > +             if (o->reset && o->update && !ce_uptodate(old) && !ce_skip_worktree(old) &&
> > +                 !(old->ce_flags & CE_FSMONITOR_VALID)) {
>
> I wonder if !ce_uptodate(old) should say "this one is up to date and
> not modified" when CE_FSMONITOR_VALID bit is set.  Are there other
> codepaths that use ce_uptodate(ce) to decide to do X without paying
> attention to CE_FSMONITOR_VALID bit?  If there are, are they buggy
> in the same way as you found this instance, or do they have legitimate
> reason why they only check ce_uptodate(ce) and ignore fsmonitor?
>

Yes, there are other code paths as well. After reading the code some
more, it seems like there's no legitimate need to ignore fsmonitor.

> If there isn't, would it make sense to get rid of CE_FSMONITOR_VALID
> bit and have fsmonitor directly set CE_UPTODATE bit instead?  That
> would make this fix unnecessary and fix other codepaths that check
> only ce_uptodate() without checking fsmonitor.
>

There's a few issues with replacing it entirely that I've found.

One is the  "CE_MATCH_IGNORE_FSMONITOR" flag. This flag can be set to
let ie_match_stat skip calling refresh_fsmonitor repeatedly. This is
set only in one place right now in preload-index, and it's unclear how
necessary this optimization even is, given that refresh_fsmonitor has
a check whether it's been called already, and returns if true.

The second is that git ls-files has an "f" option that makes it "use
lowercase letters for 'fsmonitor clean' files". I think this can
simply be replaced by checking if a file is up to date instead of
specifically via fsmonitor.

If we do go ahead with the replace, we will have to be diligent about
calling refresh_fsmonitor everywhere, or we will have correctness
issues. I patched git locally to do this, and immediately saw a bug in
git stash where the underlying git reset --hard skipped modifying a
file it should have. In my opinion refresh_fsmonitor should be called
somewhere top level, like an initialization, but I'm not sure if that
makes sense for all git subcommands.

Do you think it's worth cleaning up and sending this patch instead? It
will reduce the surface area of bugs and remove a bunch of functions
like mark_fsmonitor_valid/mark_fsmonitor_invalid

  reply	other threads:[~2019-10-28  6:39 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-25 15:23 [PATCH 0/1] unpack-trees: skip lstat on files based on fsmonitor Utsav Shah via GitGitGadget
2019-10-25 15:23 ` [PATCH 1/1] unpack-trees: skip lstat " Utsav Shah via GitGitGadget
2019-10-28  3:37   ` Junio C Hamano
2019-10-28  6:39     ` Utsav Shah [this message]
2019-10-28 19:23       ` Kevin Willford
2019-10-29 19:06         ` Utsav Shah
2019-10-29 20:12           ` Kevin Willford
2019-10-29 23:50             ` Utsav Shah
2019-10-30  0:21               ` Junio C Hamano
2019-10-30 16:41                 ` Utsav Shah
2019-11-04  6:02                   ` Junio C Hamano
2019-11-05 15:27 ` [PATCH v2 0/1] unpack-trees: skip stat on fsmonitor-valid files Utsav Shah via GitGitGadget
2019-11-05 15:27   ` [PATCH v2 1/1] " Utsav Shah via GitGitGadget
2019-11-05 21:40     ` Kevin Willford
2019-11-06  4:36       ` Utsav Shah
2019-11-06 17:24         ` Kevin Willford
2019-11-06  4:54   ` [PATCH v3 0/1] " Utsav Shah via GitGitGadget
2019-11-06  4:54     ` [PATCH v3 1/1] " Utsav Shah via GitGitGadget
2019-11-06 10:46       ` Junio C Hamano
2019-11-06 22:33         ` Utsav Shah
2019-11-08  3:51           ` Utsav Shah
2019-11-08  4:11             ` Junio C Hamano
2019-11-06 10:16     ` [PATCH v3 0/1] " Junio C Hamano
2019-11-20  8:32     ` [PATCH v4 " Utsav Shah via GitGitGadget
2019-11-20  8:32       ` [PATCH v4 1/1] " Utsav Shah via GitGitGadget
2019-11-21  4:15         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPYzU3N9mDfHVogfq=mhJFj6VOjS2z4ui4msnDdK6pOtVBa_QA@mail.gmail.com' \
    --to=utsav@dropbox.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=ukshah2@illinois.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.