From: Junio C Hamano <gitster@pobox.com>
To: "Baumann, Moritz" <moritz.baumann@sap.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Feature Request: Option to make "git rev-list --objects" output duplicate objects
Date: Mon, 27 Mar 2023 09:07:05 -0700
Message-ID: <xmqqzg7ych5i.fsf@gitster.g>
In-Reply-To: <AS1PR02MB818545E563E40C5E31A02D1E948B9@AS1PR02MB8185.eurprd02.prod.outlook.com> (Moritz Baumann's message of "Mon, 27 Mar 2023 07:02:54 +0000")

"Baumann, Moritz" <moritz.baumann@sap.com> writes:

> The option would be called something like "--include-all-object-names" and
> belong to the category of options that only make sense in combination with
> "--objects". That name (hopefully) already explains the intended behavior:
>
>  * commits are not duplicated

Didn't the sentence above just say "all object names"?  Why not commits?

>  * as before, only changed blobs / subtrees are shown, however:

We are not showing "changed" things at all.  If you want that, you'd
need "git rev-list | git diff-tree --stdin" instead.

The difference matters if you are aiming for automation (i.e. not
casual browsing).  If you had a blob A, changed it in a commit to B,
and then put the original one A back in another commit, you should
see (in reverse history traversal) A and then B and then A again if
we were showing "changed" things.  But rev-list is not about showing
changes.  We show A in the latest commit, B in the previous commit,
and that is the end.  The original A need not be shown again, because
rev-list is enumerating the objects in the history and it knows that
it has already shown A.

If you did not have a blob C, and added two instances of the same
blob C to different paths, you should see C twice if we were showing
"changed" things.  But rev-list is not about showing changes.  We
show C once, at one of the paths, and do not show it again at the
other.  The second instance need not be shown, because rev-list is
enumerating the objects in the history and it knows that it has
already shown C.
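
The two scenarios above can be modeled with a small sketch.  This is
a toy illustration, not Git's implementation: it assumes each commit
is a flat path-to-blob mapping, listed newest first, and the names
enumerate_objects and list_changes are hypothetical.  The first
function mimics an enumeration that shows each object once; the
second mimics a per-commit comparison against the parent.

```python
from typing import Dict, List, Tuple

# Hypothetical toy model: a commit is a flat {path: blob_id} mapping.
# Real Git trees are nested; this is a deliberate simplification.
Commit = Dict[str, str]


def enumerate_objects(history: List[Commit]) -> List[Tuple[str, str]]:
    """Enumeration model: emit each blob once, at the first path seen."""
    seen = set()
    out = []
    for commit in history:  # newest commit first
        for path, blob in commit.items():
            if blob in seen:
                continue  # already shown, at whatever path
            seen.add(blob)
            out.append((blob, path))
    return out


def list_changes(history: List[Commit]) -> List[Tuple[str, str]]:
    """Change model: emit a blob whenever a path differs from the parent.

    Deletions are ignored; only the new side of each change is listed.
    """
    out = []
    for i, commit in enumerate(history):
        parent = history[i + 1] if i + 1 < len(history) else {}
        for path, blob in commit.items():
            if parent.get(path) != blob:
                out.append((blob, path))
    return out


# Blob A, changed to B, then A put back (newest first).
aba = [{"f": "A"}, {"f": "B"}, {"f": "A"}]
# Blob C added at two paths in one commit, on top of an empty commit.
dup = [{"x": "C", "y": "C"}, {}]

print(enumerate_objects(aba))  # [('A', 'f'), ('B', 'f')]
print(list_changes(aba))       # [('A', 'f'), ('B', 'f'), ('A', 'f')]
print(enumerate_objects(dup))  # [('C', 'x')]
print(list_changes(dup))       # [('C', 'x'), ('C', 'y')]
```

In this model the enumeration never repeats A or C, while the change
listing reports them each time a path's content differs from its
parent.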

>  * blobs / subtrees are duplicated in the output if they were previously shown
>    with a different name

I suspect that adding "if they were ... with a DIFFERENT name" would
probably make it prohibitively expensive.  The traversal done
by rev-list is fundamentally driven by "have we shown this object
(1-bit)?" to avoid duplicated work, and not "at which path did this
object appear (unbounded amount of information)?".
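
The cost difference between the two questions can be sketched as
follows.  This is a simplified model under stated assumptions, not
Git's code: a Python set stands in for the per-object "shown" flag,
the input is a flat stream of (object, path) pairs, and the function
names walk_seen_bit and walk_seen_names are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Entry = Tuple[str, str]  # (object_id, path); toy representation


def walk_seen_bit(entries: Iterable[Entry]) -> Tuple[List[Entry], int]:
    """Remember only whether an object was shown: one fact per object."""
    seen: Set[str] = set()
    out: List[Entry] = []
    for oid, path in entries:
        if oid not in seen:
            seen.add(oid)
            out.append((oid, path))
    return out, len(seen)


def walk_seen_names(entries: Iterable[Entry]) -> Tuple[List[Entry], int]:
    """Remember every path an object was shown at: state grows with paths."""
    seen: Set[Entry] = set()
    out: List[Entry] = []
    for oid, path in entries:
        if (oid, path) not in seen:
            seen.add((oid, path))
            out.append((oid, path))
    return out, len(seen)


# One blob C reachable at 1000 distinct paths, plus one repeated path.
entries = [("C", f"dir{i}/file") for i in range(1000)] + [("C", "dir0/file")]

shown_bit, state_bit = walk_seen_bit(entries)
shown_names, state_names = walk_seen_names(entries)

print(len(shown_bit), state_bit)      # 1 1
print(len(shown_names), state_names)  # 1000 1000
```

With the per-object flag, the remembered state is bounded by the
number of objects.  With per-name tracking it is bounded only by the
number of distinct (object, path) pairs, which is the unbounded
information referred to above.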

As I told you in the message you are responding to, I think I am OK
with the idea of showing a bit more from "rev-list --objects", but
such an enhancement may need to be designed a bit better, I am
afraid.

I suspect that "rev-list | diff-tree --stdin" without "--objects"
added to the upstream "rev-list" might be closer to what you are
looking for in this case, but I dunno.

Thanks.

Thread overview: 9+ messages
2023-03-24 15:51 Feature Request: Option to make "git rev-list --objects" output duplicate objects Baumann, Moritz
2023-03-24 16:50 ` Junio C Hamano
2023-03-27  7:02   ` Baumann, Moritz
2023-03-27 16:07     ` Junio C Hamano [this message]
2023-03-24 19:28 ` Jeff King
2023-03-28  8:08   ` Baumann, Moritz
2023-03-28 18:26     ` [PATCH] docs: document caveats of rev-list's object-name output Jeff King
2023-03-30 10:32       ` Baumann, Moritz
2023-03-28 18:32     ` Feature Request: Option to make "git rev-list --objects" output duplicate objects Jeff King
