From: Karthik Nayak <karthik.188@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, ps@pks.im
Subject: Re: [PATCH v3 3/3] rev-list: add commit object support in `--missing` option
Date: Tue, 24 Oct 2023 13:34:01 +0200
Message-ID: <CAOLa=ZTUiYi1YqSuirmB9e_7B8ihC6rsFAR3VTDCL4SoLrEX_Q@mail.gmail.com>
In-Reply-To: <xmqq1qdptffk.fsf@gitster.g>
On Fri, Oct 20, 2023 at 6:41 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Rather, I was wondering if we need to use object flags to mark these
> > objects, or can do what we want to do without using any object flags
> > at all. For the purpose of reporting "missing" objects, wouldn't it
> > be sufficient to walk the object graph and report our findings as we
> > go? To avoid reporting the same object twice, as we reasonably can
> > expect that the missing objects are minority (compared to the total
> > number of objects), perhaps the codepath that makes such a report
> > can use a hashmap of object_ids or something, for example.
>
> Digging from the bottom,
>
> * builtin/rev-list.c:show_commit() gets "struct rev_list_info *"
> that has "struct rev_info *" [*].
>
> * list-objects.c:do_traverse() calls revision.c:get_revision() to
> obtain commits, some of which may be missing ones, and things
> behind get_revision() are responsible for marking the commit as
> missing. It has "struct traversal_context *", among whose
> members is the "revs" member that is the "struct rev_info *".
>
> * revision.c:get_revision() and machinery behind it ultimately
> discovers a missing commit in the revision.c:process_parents()
> that loops over the parents commit_list. It of course has access
> to "struct rev_info *".
>
> So, presumably, if we add a new member to "struct rev_info" that
> optionally [*] points at an oidset that records the object names of
> missing objects we discovered so far (i.e., the set of missing
> objects), the location we set the MISSING bit of a commit can
> instead add the object name of the commit to the set. And we can
> export a function that takes "struct rev_info *" and "struct object
> *" (or "struct object_id *") to check for membership in the "set of
> missing objects", which would be used where we checked the MISSING
> bit of a commit.
>
> I do not know the performance implications of going this route, but
> if we do not find a suitable vacant bit, we do not have to use any
> object flags bit to do this, if we go this route, I would think. I
> may be missing some details that breaks the above outline, though.
>
>
> [Footnotes]
>
> * A potential #leftoverbits tangent.
>
> Why is "rev_list_info" structure declared in <bisect.h>? I
> suspect that this is a fallout from recent header file shuffling,
> but given who uses it (among which is rev-list:show_commit() that
> has very little to do with bisection and uses the information in
> rev_list_info when doing its normal non-bisect things), it does
> not make much sense.
>
> * When .do_not_die_on_missing_objects is false, it can and should
> be left NULL, but presumably we use the "do not die" bit even
> when we are not necessarily collecting the missing objects? So
> the new member cannot replace the "do not die" bit completely.
Thanks for the suggestion, this does seem like a good way forward without
using object flags. The only performance concern I see is that the oidset
grows with the number of missing commits, but since we expect missing
objects to be a minority, I think that's acceptable.
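To make the outline above concrete, here is a rough, self-contained sketch of
the idea, not the actual patch. All names in it are made up for illustration:
`struct oid` stands in for `struct object_id`, `struct missing_set` for git's
oidset, and `struct rev_info_sketch` for `struct rev_info`. It also assumes
SHA-1 sized object names. A real implementation would presumably use the
existing oidset API rather than a hand-rolled table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define OID_RAWSZ 20 /* assumption: SHA-1 sized object names */

struct oid { unsigned char hash[OID_RAWSZ]; };

/* Hypothetical stand-in for an oidset: open addressing, linear probing. */
struct missing_set {
	struct oid *slots;
	unsigned char *used;
	size_t cap; /* 0 when empty, otherwise a power of two */
	size_t nr;
};

/* Hypothetical stand-in for the relevant part of "struct rev_info". */
struct rev_info_sketch {
	int do_not_die_on_missing_objects;
	struct missing_set *missing_commits; /* NULL when not collecting */
};

static size_t slot_of(const struct oid *oid, size_t cap)
{
	size_t h;

	assert(cap && !(cap & (cap - 1)));
	/* Object names are already well distributed; reuse their first bytes. */
	memcpy(&h, oid->hash, sizeof(h));
	return h & (cap - 1);
}

static size_t find_slot(const struct missing_set *set, const struct oid *oid)
{
	size_t i = slot_of(oid, set->cap);

	/* The table is kept at most half full, so an empty slot always exists. */
	while (set->used[i] && memcmp(set->slots[i].hash, oid->hash, OID_RAWSZ))
		i = (i + 1) & (set->cap - 1);
	return i;
}

static void grow(struct missing_set *set)
{
	struct missing_set bigger = { 0 };
	size_t i, j;

	bigger.cap = set->cap ? set->cap * 2 : 16;
	bigger.slots = calloc(bigger.cap, sizeof(*bigger.slots));
	bigger.used = calloc(bigger.cap, 1);
	if (!bigger.slots || !bigger.used)
		abort();
	for (i = 0; i < set->cap; i++) {
		if (!set->used[i])
			continue;
		j = find_slot(&bigger, &set->slots[i]);
		bigger.slots[j] = set->slots[i];
		bigger.used[j] = 1;
		bigger.nr++;
	}
	free(set->slots);
	free(set->used);
	*set = bigger;
}

/* Returns 1 if the commit was already recorded, 0 otherwise. */
int record_missing_commit(struct rev_info_sketch *revs, const struct oid *oid)
{
	struct missing_set *set = revs->missing_commits;
	size_t i;

	if (!set)
		return 0; /* not collecting missing objects */
	if (set->nr * 2 >= set->cap)
		grow(set);
	i = find_slot(set, oid);
	if (set->used[i])
		return 1;
	set->slots[i] = *oid;
	set->used[i] = 1;
	set->nr++;
	return 0;
}

/* Membership test that would replace checking a MISSING flag bit. */
int is_missing_commit(const struct rev_info_sketch *revs, const struct oid *oid)
{
	const struct missing_set *set = revs->missing_commits;

	if (!set || !set->cap)
		return 0;
	return set->used[find_slot(set, oid)];
}
```

In this sketch the spot that currently sets the flag would call
`record_missing_commit()`, and the spots that test the flag would call
`is_missing_commit()`. Memory use scales with the number of missing commits
only, not with the total number of objects walked.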
> Thanks for researching. It sounds like it may be a better bit to
> steal than the one used by the commit-graph, as long as there is no
> reason to expect that blame may want to work in a corrupt repository
> with missing objects, but when it happens, we may regret the
> decision we are making here.
>
I don't see blame working with missing commits though, because it relies on
parsing commits to get the information it shows to the user. So I think it's a
safe bit to steal. Also, if the need arises, we could always release the bit
and move to the oidset approach you outlined above.

That said, on the whole I think keeping this future-compatible makes a lot
more sense, so I'll send a patch series soon that uses an oidset instead of
flags.
- Karthik