From: Junio C Hamano <gitster@pobox.com>
To: "D. Ben Knoble" <ben.knoble@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: git-status performance with submodules
Date: Sun, 01 Dec 2019 22:50:29 -0800
Message-ID: <xmqq5zizz0ei.fsf@gitster-ct.c.googlers.com>
In-Reply-To: <CALnO6CCoXOZTsfag6yN_Ffn+H7KE-KTzm+P-GqLKnDMg8j_Qmg@mail.gmail.com> (D. Ben Knoble's message of "Mon, 2 Dec 2019 01:19:49 -0500")

"D. Ben Knoble" <ben.knoble@gmail.com> writes:

> ### What I am curious about
>
> From the traces (attached), it appears that git-status suffers from a lack of
> (possibly embarrassing) parallelism: I would expect each submodule to be
> independently check-able, ...
> ...
> What can we do to fix this? Is there a reason for this (really terribly slow)
> serial execution? Is this something developers haven't bothered to optimize
> ("unexpected use case")? If so, I would like to discuss taking a crack at it,
> because I do have at least one repository with this many submodules, and I
> care about its performance.

Nice to hear from somebody who cares about improving submodule
support.  Offhand, I cannot think of a reason why we inherently have
to process them serially.

But the way the "git status" code is structured, it probably takes a bit
of preparatory refactoring.  If I recall correctly, it walks each
path in the index of the superproject and notes how the file in the
working tree differs from the index and from HEAD, under
the assumption that inspecting each path is relatively cheap and
costs about the same.  You'd first need to restructure that part so that
inspection of groups of index entries can be sharded out to separate
subprocesses while the parent process waits, have them report back to
the parent process, and let the parent process continue with the
aggregated result, or something like that.
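
To make the shape of that restructuring concrete, here is a rough
sketch in Python rather than in git's own C.  It is an illustration
of the shard/inspect/aggregate idea only, not how "git status" is
implemented.  The names shard(), inspect_shard(), parallel_status()
and submodule_is_dirty() are all made up for this example, and a
thread pool stands in for the separate subprocesses (each thread may
itself spawn a "git" subprocess, as the sample inspector does).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def shard(entries, n_shards):
    """Split entries into at most n_shards non-empty round-robin groups."""
    groups = [entries[i::n_shards] for i in range(n_shards)]
    return [g for g in groups if g]


def inspect_shard(paths, inspect):
    """Worker: inspect every path in one shard and report the findings."""
    return {p: inspect(p) for p in paths}


def parallel_status(entries, inspect, n_shards=4):
    """Parent: hand shards to workers, wait, then aggregate the reports."""
    results = {}
    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        reports = pool.map(lambda s: inspect_shard(s, inspect),
                           shard(entries, n_shards))
        for partial in reports:
            results.update(partial)
    # Re-emit in the original (index) order so output stays deterministic.
    return {p: results[p] for p in entries}


def submodule_is_dirty(path):
    """Hypothetical inspector: ask the submodule's own git for its status."""
    out = subprocess.run(
        ["git", "-C", path, "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    ).stdout
    return bool(out.strip())
```

Calling parallel_status(paths, submodule_is_dirty) with a list of
submodule paths would then check them concurrently.  Round-robin
sharding is used here because the cost of each entry is not known
up front; a real implementation would likely want to treat the
expensive entries (submodules) differently from plain files.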

Thanks.
