From: Jonathan Tan <jonathantanmy@google.com>
To: gitster@pobox.com
Cc: jonathantanmy@google.com, git@vger.kernel.org, stolee@gmail.com,
peff@peff.net
Subject: Re: [PATCH v2 2/2] diff: restrict when prefetching occurs
Date: Thu, 2 Apr 2020 16:09:37 -0700
Message-ID: <20200402230937.47323-1-jonathantanmy@google.com>
In-Reply-To: <xmqq7dyx3b1o.fsf@gitster.c.googlers.com>
> > + int output_formats_to_prefetch = DIFF_FORMAT_DIFFSTAT |
> > + DIFF_FORMAT_NUMSTAT |
> > + DIFF_FORMAT_PATCH |
> > + DIFF_FORMAT_SHORTSTAT |
> > + DIFF_FORMAT_DIRSTAT;
>
> Would this want to be a "const int" (or even #define), I wonder. I
> do not care too much between the two, but leaving it as a variable
> makes me a bit nervous.
OK, will switch to "const int".
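For illustration, a minimal sketch of what the "const int" variant could look like. The FMT_* values and the wants_prefetch() helper are hypothetical stand-ins, not the real definitions from diff.h; only the shape of the mask test matters here.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration; the real ones live in diff.h. */
enum {
	FMT_RAW       = 1 << 0,
	FMT_DIFFSTAT  = 1 << 1,
	FMT_NUMSTAT   = 1 << 2,
	FMT_PATCH     = 1 << 3,
	FMT_SHORTSTAT = 1 << 4,
	FMT_DIRSTAT   = 1 << 5,
	FMT_NAME_ONLY = 1 << 6
};

/* Returns true if any requested output format needs blob contents. */
static bool wants_prefetch(int output_format)
{
	/* const: the mask is fixed, so nothing can modify it by accident. */
	const int output_formats_to_prefetch = FMT_DIFFSTAT |
					       FMT_NUMSTAT |
					       FMT_PATCH |
					       FMT_SHORTSTAT |
					       FMT_DIRSTAT;

	return (output_format & output_formats_to_prefetch) != 0;
}
```

With these stand-in values, wants_prefetch(FMT_PATCH) is true, while wants_prefetch(FMT_RAW | FMT_NAME_ONLY) is false.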
> > + if (options->repo == the_repository && has_promisor_remote() &&
> > + (options->output_format & output_formats_to_prefetch ||
> > + (!options->found_follow && options->break_opt != -1))) {
> > int i;
> > struct diff_queue_struct *q = &diff_queued_diff;
> > struct oid_array to_fetch = OID_ARRAY_INIT;
> >
> > for (i = 0; i < q->nr; i++) {
> > struct diff_filepair *p = q->queue[i];
> > - add_if_missing(options->repo, &to_fetch, p->one);
> > - add_if_missing(options->repo, &to_fetch, p->two);
> > + diff_add_if_missing(options->repo, &to_fetch, p->one);
> > + diff_add_if_missing(options->repo, &to_fetch, p->two);
> > }
> > +
> > + prefetched = 1;
> > +
>
> Wouldn't it logically make more sense to do this after calling
> promisor_remote_get_direct() and if to_fetch.nr is not 0, ...
>
> > /*
> > * NEEDSWORK: Consider deduplicating the OIDs sent.
> > */
> > promisor_remote_get_direct(options->repo,
> > to_fetch.oid, to_fetch.nr);
> > +
>
> ... namely, here?
>
> When (q->nr != 0), to_fetch.nr may not be zero, I suspect, but the
> original code before [1/2] protected against to_fetch.nr==0 case, so
> ...?
My idea is that this prefetch is a superset of what diffcore_rename()
wants to prefetch, so once we have run the logic here (even if nothing
ends up being fetched, which can happen when we already have all the
objects), we do not need to run it again in diffcore_rename().
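A rough sketch of that idea, with hypothetical names throughout (sketch_state, bulk_fetch(), std_stage() and rename_stage() are not functions in the patch): the flag records that the prefetch logic ran, not that anything was actually fetched.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state: counts bulk fetch attempts and tracks the flag. */
struct sketch_state {
	int fetch_calls;
	bool prefetched;
};

static void bulk_fetch(struct sketch_state *s, size_t nr_missing)
{
	(void)nr_missing;	/* a real fetch would request these objects */
	s->fetch_calls++;
}

/* Caller stage: prefetches a superset when the conditions hold. */
static void std_stage(struct sketch_state *s, bool conditions_hold,
		      size_t nr_missing)
{
	if (conditions_hold) {
		bulk_fetch(s, nr_missing);
		/* Set even when nr_missing == 0: the logic has already run. */
		s->prefetched = true;
	}
}

/* Rename stage: only prefetches if the caller has not already done so. */
static void rename_stage(struct sketch_state *s, size_t nr_missing)
{
	if (!s->prefetched)
		bulk_fetch(s, nr_missing);
}
```

Either way there is a single bulk fetch attempt: in the caller when its conditions hold, otherwise in the rename stage.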
> > + if (!prefetched) {
> > + /*
> > + * At this point we know there's actual work to do: we have rename
> > + * destinations that didn't find an exact match, and we have potential
> > + * sources. So we'll have to do inexact rename detection, which
> > + * requires looking at the blobs.
> > + *
> > + * If we haven't already prefetched, it's worth pre-fetching
> > + * them as a group now.
> > + */
>
> This comment makes me wonder if it would be even better to
>
> - prepare an empty to_fetch OID array in the caller,
>
> - if the output format is one of the ones that wants prefetch, add
> object names to to_fetch in the caller, BUT not fetch there.
>
> - pass &to_fetch by the caller to this function, and this code here
> may add even more objects,
>
> - then do the prefetch here (so a single promisor interaction will
> grab objects the caller would have fetched before calling us and
> the ones we want here), and then clear the to_fetch array.
>
> - the caller, after seeing this function returns, checks to_fetch
> and if it is not empty, fetches (i.e. the caller prepared list of
> objects based on the output type, we ended up not calling this
> helper, and then finally the caller does the prefetch).
>
> That way, the "unless we have already prefetched" logic can go, and
> we can lose one indentation level, no?
Does this mean that the only prefetch would occur in diffcore_rename()? I
don't think this will work, for 2 reasons:
- diffcore_std() calls diffcore_break() (which also reads blobs) before
diffcore_rename(), so a prefetch in diffcore_rename() would come too late
- (more importantly) there's a code path in diffcore_std() that does
not call diffcore_rename(), so we would still need some prefetching
logic in diffcore_std() in case diffcore_rename() is not called
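To make the ordering concern concrete, here is a simplified sketch. The struct and function are hypothetical, not the real struct diff_options or diffcore_std(), and it assumes for simplicity that each stage that runs reads blobs.

```c
#include <stdbool.h>

/* Hypothetical option flags; not the real struct diff_options. */
struct sketch_opts {
	bool do_break;			/* a break stage is requested */
	bool do_rename;			/* rename detection is requested */
	bool output_needs_blobs;	/* e.g. a patch-like output format */
};

enum first_reader {
	READER_NONE,
	READER_BREAK,
	READER_RENAME,
	READER_OUTPUT
};

/* Returns which stage would be the first to read blob contents. */
static enum first_reader first_blob_reader(const struct sketch_opts *o)
{
	if (o->do_break)
		return READER_BREAK;	/* runs before rename detection */
	if (o->do_rename)
		return READER_RENAME;
	if (o->output_needs_blobs)
		return READER_OUTPUT;	/* the rename stage never ran */
	return READER_NONE;
}
```

In this model, a prefetch that lives only in the rename stage is too late whenever the result is READER_BREAK, and never happens when it is READER_OUTPUT.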
> > + if (to_fetch.nr)
> > + promisor_remote_get_direct(options->repo,
> > + to_fetch.oid, to_fetch.nr);
>
> You no longer need the if(), no?
Ah, yes: with [1/2] the to_fetch.nr==0 case no longer needs a guard in
the caller. I'll remove the if().
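As a sketch of why the guard becomes redundant, assume a fetch helper that treats an empty list as a no-op. sketch_get_direct() and prefetch() below are hypothetical stand-ins, not promisor_remote_get_direct() itself.

```c
#include <stddef.h>

/* Counts how many requests the stand-in helper would have sent. */
static int sketch_requests;

/* Hypothetical bulk fetch helper that tolerates an empty list. */
static int sketch_get_direct(const int *ids, size_t nr)
{
	if (!nr)
		return 0;	/* nothing to fetch: succeed without a request */
	(void)ids;		/* a real helper would send these ids */
	sketch_requests++;
	return 0;
}

static void prefetch(const int *ids, size_t nr)
{
	/* No "if (nr)" guard needed: the helper handles nr == 0 itself. */
	sketch_get_direct(ids, nr);
}
```

Calling prefetch() with an empty list sends no request, so the caller can call it unconditionally.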