All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Derrick Stolee <stolee@gmail.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>,
	Derrick Stolee <dstolee@microsoft.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: Question about get_cached_commit_buffer()
Date: Tue, 20 Feb 2018 17:57:27 -0500	[thread overview]
Message-ID: <20180220225726.GA17496@sigill.intra.peff.net> (raw)
In-Reply-To: <ecbbe515-b7a8-3dc8-7d14-32412e7b12c3@gmail.com>

On Tue, Feb 20, 2018 at 05:12:50PM -0500, Derrick Stolee wrote:

> In rev-list, the "--header" option outputs a value and expects the buffer to
> be cached. It outputs the header info only if get_cached_commit_buffer()
> returns a non-null buffer, giving incorrect output. If it called
> get_commit_buffer() instead, it would immediately call
> get_cached_commit_buffer() and on failure actually load the buffer.
> 
> This has not been a problem before, since the buffer was always loaded at
> some point for each commit (and saved because of the save_commit_buffer
> global).
> 
> I propose to make get_cached_commit_buffer() static to commit.c and convert
> all callers to get_commit_buffer(). Is there any reason to _not_ do this? It
> seems that there is no functional or performance change.

That helper was added in 152ff1cceb (provide helpers to access the
commit buffer, 2014-06-10). I think interesting part is the final
paragraph:

      Note that we also need to add a "get_cached" variant which
      returns NULL when we do not have a cached buffer. At first
      glance this seems to defeat the purpose of "get", which is
      to always provide a return value. However, some log code
      paths actually use the NULL-ness of commit->buffer as a
      boolean flag to decide whether to try printing the
      commit. At least for now, we want to continue supporting
      that use.

So I think a conversion to get_commit_buffer() would break the callers
that use the boolean nature for something useful. Unfortunately the
author did not enumerate exactly what those uses are, so we'll have to
dig. :)

My guess is that it has to do with showing some commits twice, since we
would normally have the buffer available the first time we hit the
commit, and then free it in finish_commit().

If we blame that rev-list line (and then walk back over a bunch of
uninteresting commits via parent re-blaming), it comes from 3131b71301
(Add "--show-all" revision walker flag for debugging, 2008-02-09).

So there it is. It does show commits multiple times, but suppresses the
verbose header after the first showing. If we do something like this:

  git rev-list --show-all --pretty --boundary c93150cfb0^-

you'll see some boundary commits that _don't_ have their pretty headers
shown. And with your proposed patch, we'd show them again. To keep the
same behavior we need to store that "we've already seen this" boolean
somewhere else (e.g., in an object flag; possibly SEEN, but that might
run afoul of other logic).

It looks like the call in log-tree comes from the same commit, and
serves the same purpose.

Aside from storing the boolean "did we show it" in another way, the
other option is to simply change the behavior and accept that we might
pretty-print the commit twice. This is a backwards-incompatible change,
but I'm not sure if anybody would care. According to that commit,
--show-all was added explicitly for debugging, and it has never been
documented.  I couldn't find any reference to people actually using it
on the list (a grep of the whole archive turns up 32 messages, most of
which are just it appearing in context; the only person mentioning its
actual use was Linus in 2008.

-Peff

  reply	other threads:[~2018-02-20 22:57 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-20 22:12 Question about get_cached_commit_buffer() Derrick Stolee
2018-02-20 22:57 ` Jeff King [this message]
2018-02-21 14:13   ` Derrick Stolee
2018-02-21 18:48     ` Jeff King
2018-02-21 18:52       ` Jeff King
2018-02-21 19:17         ` [PATCH] commit: drop uses of get_cached_commit_buffer() Derrick Stolee
2018-02-21 19:19           ` Derrick Stolee
2018-02-21 23:34             ` Jeff King
2018-02-21 23:13           ` Jeff King
2018-02-21 23:22             ` Stefan Beller
2018-02-21 23:29               ` Jeff King
2018-02-22  1:52             ` Derrick Stolee
2018-02-21 22:22       ` Question about get_cached_commit_buffer() Junio C Hamano
2018-02-21 22:50         ` Jeff King
2018-02-21 23:03           ` Junio C Hamano
2018-02-21 23:27             ` [PATCH] revision: drop --show-all option Jeff King
2018-02-21 23:45               ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180220225726.GA17496@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.