All of lore.kernel.org
 help / color / mirror / Atom feed
From: Phillip Wood <phillip.wood123@gmail.com>
To: Elijah Newren <newren@gmail.com>,
	Ian Jackson <ijackson@chiark.greenend.org.uk>
Cc: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Jonathan Nieder <jrnieder@gmail.com>,
	Paul Gevers <elbrus@debian.org>,
	Phillip Wood <phillip.wood@dunelm.org.uk>,
	Alban Gruin <alban.gruin@gmail.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH] sequencer: honor GIT_REFLOG_ACTION
Date: Thu, 2 Apr 2020 10:39:34 +0100	[thread overview]
Message-ID: <5c5532ac-7df5-e8ef-9122-f015783427c2@gmail.com> (raw)
In-Reply-To: <CABPp-BGo=6W5wfba7us8ca3eAfz04v8WxyOQ96DkoXn2fV=J1Q@mail.gmail.com>

On 02/04/2020 06:15, Elijah Newren wrote:
> On Wed, Apr 1, 2020 at 4:29 PM Ian Jackson
> <ijackson@chiark.greenend.org.uk> wrote:
>>
>> Hi.  Thanks for looking at this.
>>
>> Elijah Newren via GitGitGadget writes ("[PATCH] sequencer: honor GIT_REFLOG_ACTION"):
>>>      I'm not the best with getenv/setenv. The xstrdup() wrapping is
>>>      apparently necessary on mac and bsd. The xstrdup seems like it leaves us
>>>      with a memory leak, but since setenv(3) says to not alter or free it, I
>>>      think it's right. Anyone have any alternative suggestions?
>>
>> I can try to help.  It's not entirely trivial.
>>
>> The setenv interface is a wrapper around putenv.  putenv has had a
>> variety of different semantics.  Some of these sets of semantics
>> cannot be used to re-set the same environment variable without a
>> memory leak - and even figuring out what semantics you have would be
>> complex and tend to produce code which would fail in bad ways.
>> There's a short summary of the situation in Linux's putenv(3).
>>
>> Would it be possible for git to arrange to set GIT_REFLOG_ACTION only
>> when it is invoking subprocesses ? Otherwise it would update, and >> look at, a global variable of its own.  (Or a parameter to relevant
>> functions if one doesn't like the action-at-a-distance effect of a
>> global.)
>>
>> And, it seems to me that the reflog handling should be centralised.
>>
>>> +     char *reflog_action = getenv("GIT_REFLOG_ACTION");
>>>
>>>        va_start(ap, fmt);
>>>        strbuf_reset(&buf);
>>> -     strbuf_addstr(&buf, action_name(opts));
>>> +     strbuf_addstr(&buf, reflog_action ? reflog_action : action_name(opts));
>>
>> Open coding this kind of thing at every site which needs to think
>> about the reflog actions will surely result in some of the instances
>> having bugs.
>>
>> Writing a single function that contans this (or most of it) would
>> happily decouple all of its call sites from literally asking about
>> getenv("GIT_REFLOG_ACTION") thereby making it easier to do the
>> indirection-through-program-variables I suggest.
> 
> That sounds great, but I'm not sure that "only when invoking
> subprocesses" will limit the places where we set the environment
> variable all that much; it might actually expand it.

Off hand I think we'd need to change run_git_checkout(), 
run_git_commit() and do_merge() to set the environment variable and fix 
try_to_commit() to use a proper variable for the reflog message.

>  I wasn't there
> for the whole history, but my understanding is the rebase code has
> slowly transformed from the original all-shell rebase
> implementation(s), to being a helper program that the shell could call
> into for parts of its operations and passing control back and forth
> between shell and C, to being a reimplementation of just invoking the
> same commands that the shell script would have, to slowly transforming
> into an actual library where invocations of other git subprocesses are
> being replaced with relevant function calls.  It's a long cleanup
> process that is still ongoing.  I'd like to get to the point where we
> only invoke subprocesses if the user specifies --exec or a special
> merge strategy, but that's a goal with a longer term timeframe than
> fixing a 2.26 regression.
> 
>> Having said that,
>>
>>> diff --git a/t/t3406-rebase-message.sh b/t/t3406-rebase-message.sh
>>> index 61b76f33019..927a4f4a4e4 100755
>>> --- a/t/t3406-rebase-message.sh
>>> +++ b/t/t3406-rebase-message.sh
>>
>> This test case convinces me that the patch has the right behaviour for
>> at least the case I care about :-).
> 
> Cool, sounds like it's a good immediate fix for the 2.26 regression,
> and then longer term as we continue refactoring we can hopefully
> isolate subprocess handling and writing of state.
> 
> As a heads up, though, my personal plans for rebase (subject to buy-in
> from other stakeholders) is to make it do a lot more in-memory work.
> In particular, this means for common cases there will be no subprocess
> invocations, no writing of any state unless/until you hit a conflict,
> no updating of any files in the working tree until all commits have
> been created (or a conflict is hit), 

I'm with you up to here - it sounds fantastic

> and no updating of the branch
> until after all the commits have been created.

We only update the branch reflog at the end of the rebase now.

> Thus, for the common
> cases with no conflicts, there would only be 1 entry in the reflog of
> HEAD the entire operation, rather than approximately 1 per commit. 

This I'm not so sure about. In the past where I've messed up a rebase 
and not noticed until after a subsequent rebase it has been really 
useful to be able to go through HEAD's reflog and find out where exactly 
I messed up by looking at the sequence of picks for the first rebase. 
Specifically it shows which commits where squashed together which you 
cannot get by running git log on the result of the rebase.

> I
> have a proof-of-concept showing these ideas work for basic cases. 

Sounds exciting

Best Wishes

Phillip

  So,
> I hope your tests don't depend on the number of entries added to
> HEAD's reflog.
> 

  reply	other threads:[~2020-04-02  9:39 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-01 20:31 [PATCH] sequencer: honor GIT_REFLOG_ACTION Elijah Newren via GitGitGadget
2020-04-01 20:46 ` Junio C Hamano
2020-04-01 23:29 ` Ian Jackson
2020-04-02  5:15   ` Elijah Newren
2020-04-02  9:39     ` Phillip Wood [this message]
2020-04-02 17:40       ` Elijah Newren
2020-04-02  9:25 ` Phillip Wood
2020-04-02 17:01   ` Elijah Newren
2020-04-02 19:05     ` Phillip Wood
2020-04-07 14:50       ` Johannes Schindelin
2020-04-07 15:18         ` Elijah Newren
2020-04-07 22:37           ` Johannes Schindelin
2020-04-07 23:05             ` Junio C Hamano
2020-04-07 16:59 ` [PATCH v2] " Elijah Newren via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5c5532ac-7df5-e8ef-9122-f015783427c2@gmail.com \
    --to=phillip.wood123@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=alban.gruin@gmail.com \
    --cc=elbrus@debian.org \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=ijackson@chiark.greenend.org.uk \
    --cc=jrnieder@gmail.com \
    --cc=newren@gmail.com \
    --cc=phillip.wood@dunelm.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.