git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Karthik Nayak <karthik.188@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, ps@pks.im
Subject: Re: [PATCH v3 3/3] rev-list: add commit object support in `--missing` option
Date: Tue, 24 Oct 2023 13:34:01 +0200	[thread overview]
Message-ID: <CAOLa=ZTUiYi1YqSuirmB9e_7B8ihC6rsFAR3VTDCL4SoLrEX_Q@mail.gmail.com> (raw)
In-Reply-To: <xmqq1qdptffk.fsf@gitster.g>

On Fri, Oct 20, 2023 at 6:41 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Rather, I was wondering if we need to use object flags to mark these
> > objects, or can do what we want to do without using any object flags
> > at all.  For the purpose of reporting "missing" objects, wouldn't it
> > be sufficient to walk the object graph and report our findings as we
> > go?  To avoid reporting the same object twice, as we reasonably can
> > expect that the missing objects are minority (compared to the total
> > number of objects), perhaps the codepath that makes such a report
> > can use a hashmap of object_ids or something, for example.
>
> Digging from the bottom,
>
>  * builtin/rev-list.c:show_commit() gets "struct rev_list_info *"
>    that has "struct rev_info *" [*].
>
>  * list-objects.c:do_traverse() calls revision.c:get_revision() to
>    obtain commits, some of which may be missing ones, and things
>    behind get_revision() are responsible for marking the commit as
>    missing.  It has "struct traversal_context *", among whose
>    members is the "revs" member that is the "struct rev_info *".
>
>  * revision.c:get_revision() and machinery behind it ultimately
>    discovers a missing commit in the revision.c:process_parents()
>    that loops over the parents commit_list.  It of course has access
>    to "struct rev_info *".
>
> So, presumably, if we add a new member to "struct rev_info" that
> optionally [*] points at an oidset that records the object names of
> missing objects we discovered so far (i.e., the set of missing
> objects), the location we set the MISSING bit of a commit can
> instead add the object name of the commit to the set.  And we can
> export a function that takes "struct rev_info *" and "struct object
> *" (or "struct object_id *") to check for membership in the "set of
> missing objects", which would be used where we checked the MISSING
> bit of a commit.
>
> I do not know the performance implications of going this route, but
> if we do not find a suitable vacant bit, we do not have to use any
> object flags bit to do this, if we go this route, I would think.  I
> may be missing some details that breaks the above outline, though.
>
>
> [Footnotes]
>
>  * A potential #leftoverbits tangent.
>
>    Why is "rev_list_info" structure declared in <bisect.h>?  I
>    suspect that this is a fallout from recent header file shuffling,
>    but given who uses it (among which is rev-list:show_commit() that
>    has very little to do with bisection and uses the information in
>    rev_list_info when doing its normal non-bisect things), it does
>    not make much sense.
>
>  * When .do_not_die_on_missing_objects is false, it can and should
>    be left NULL, but presumably we use the "do not die" bit even
>    when we are not necessarily collecting the missing objects?  So
>    the new member cannot replace the "do not die" bit completely.

Thanks for the suggestion, this does seem like a good way to go ahead without
using flags. The only performance issue being if there are too many commits
which are missing, then oidset would be large.

But I think that's okay though.

> Thanks for researching.  It sounds like it may be a better bit to
> steal than the one used by the commit-graph, as long as there is no
> reason to expect that blame may want to work in a corrupt repository
> with missing objects, but when it happens, we may regret the
> decision we are making here.
>

I don't see blame working with missing commits though, because it relies on
parsing commits to get information to show to the user. So I think it's a safe
bit to steal. Also, when the time comes we could always release the bit and
move to the solution you mentioned above.

Anyways on the whole I think keeping it future compatible makes a lot
more sense.
I'll send a patch series to implement an oidset instead of flags soon.

- Karthik

  reply	other threads:[~2023-10-24 11:34 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-09 10:55 [PATCH 0/3] rev-list: add support for commits in `--missing` Karthik Nayak
2023-10-09 10:55 ` [PATCH 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-09 10:55 ` [PATCH 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-09 10:55 ` [PATCH 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-09 22:02 ` [PATCH 0/3] rev-list: add support for commits in `--missing` Junio C Hamano
2023-10-10  6:19 ` Patrick Steinhardt
2023-10-10 17:09   ` Junio C Hamano
2023-10-11 10:37     ` Karthik Nayak
2023-10-11 16:54       ` Junio C Hamano
2023-10-12 10:44         ` Karthik Nayak
2023-10-12 11:04           ` Patrick Steinhardt
2023-10-12 13:23             ` Karthik Nayak
2023-10-12 16:17             ` Junio C Hamano
2023-10-13  5:53               ` Patrick Steinhardt
2023-10-13  8:38                 ` Patrick Steinhardt
2023-10-13 12:37                   ` [PATCH] commit: detect commits that exist in commit-graph but not in the ODB Patrick Steinhardt
2023-10-13 18:21                     ` Junio C Hamano
2023-10-17  6:37                       ` Patrick Steinhardt
2023-10-17 18:34                         ` Junio C Hamano
2023-10-19  6:45                           ` Patrick Steinhardt
2023-10-19  8:25                             ` Patrick Steinhardt
2023-10-19 17:16                               ` Junio C Hamano
2023-10-20 10:00                                 ` Jeff King
2023-10-20 17:35                                   ` Junio C Hamano
2023-10-23 10:15                                   ` Patrick Steinhardt
2023-10-13 17:07                   ` [PATCH 0/3] rev-list: add support for commits in `--missing` Junio C Hamano
2023-10-12 16:26           ` Junio C Hamano
2023-10-16 10:38 ` [PATCH v2 " Karthik Nayak
2023-10-16 10:38   ` [PATCH v2 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-16 10:38   ` [PATCH v2 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-16 10:38   ` [PATCH v2 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-16 16:24   ` [PATCH v2 0/3] rev-list: add support for commits in `--missing` Junio C Hamano
2023-10-16 19:01     ` Karthik Nayak
2023-10-16 20:33       ` Junio C Hamano
2023-10-19 12:10   ` [PATCH v3 " Karthik Nayak
2023-10-19 12:10     ` [PATCH v3 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-19 12:10     ` [PATCH v3 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-19 12:10     ` [PATCH v3 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-19 22:05       ` Junio C Hamano
2023-10-19 23:35         ` Junio C Hamano
2023-10-20 11:14           ` Karthik Nayak
2023-10-20 14:47             ` Karthik Nayak
2023-10-20 17:45               ` Junio C Hamano
2023-10-20 16:41           ` Junio C Hamano
2023-10-24 11:34             ` Karthik Nayak [this message]
2023-10-24 12:26     ` [PATCH v4 0/3] rev-list: add support for commits in `--missing` Karthik Nayak
2023-10-24 12:26       ` [PATCH v4 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-24 12:26       ` [PATCH v4 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-24 12:26       ` [PATCH v4 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-24 17:45         ` Junio C Hamano
2023-10-25  0:35           ` Junio C Hamano
2023-10-25  9:34           ` Karthik Nayak
2023-10-25  6:40         ` Patrick Steinhardt
2023-10-26 12:37           ` Junio C Hamano
2023-10-26 10:11       ` [PATCH v5 0/3] rev-list: add support for commits in `--missing` Karthik Nayak
2023-10-26 10:11         ` [PATCH v5 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-26 10:11         ` [PATCH v5 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-26 10:11         ` [PATCH v5 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-27  6:25           ` Patrick Steinhardt
2023-10-27  7:54             ` Karthik Nayak
2023-10-27  7:59             ` Karthik Nayak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOLa=ZTUiYi1YqSuirmB9e_7B8ihC6rsFAR3VTDCL4SoLrEX_Q@mail.gmail.com' \
    --to=karthik.188@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).