git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: Phillip Wood <phillip.wood123@gmail.com>,
	Matt R via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, Matt Rogers <mattr94@gmail.com>
Subject: Re: [PATCH 1/1] rebase -r: let `label` generate safer labels
Date: Tue, 3 Sep 2019 13:19:59 +0200 (CEST)	[thread overview]
Message-ID: <nycvar.QRO.7.76.6.1909031256290.46@tvgsbejvaqbjf.bet> (raw)
In-Reply-To: <xmqq5zmav9ej.fsf@gitster-ct.c.googlers.com>

Hi Junio,

On Mon, 2 Sep 2019, Junio C Hamano wrote:

> Phillip Wood <phillip.wood123@gmail.com> writes:
>
> >>   		for (p1 = label.buf; *p1; p1++)
> >> -			if (isspace(*p1))
> >> +			if (!(*p1 & 0x80) && !isalnum(*p1))
> >>   				*(char *)p1 = '-';
> >
> > I'm sightly concerned that this opens the possibility for unexpected
> > effects if two different labels get sanitized to the same string. I
> > suspect it's unlikely to happen in practice but doing something like
> > percent encoding non-alphanumeric characters would avoid the problem
> > entirely.
>
> I'd rather see 'x' used instead of '-' (double-or-more dashes and
> leading dash in refnames may currently be allowed but double-or-more
> exes and leading ex would be much more likely to stay valid) if we
> just want to redact invalid characters.

Hmm. Let's take a concrete example from the VFS for Git fork:

	Merge pull request #160: trace2:gvfs:experiment Add experimental regions and data events to help diagnose checkout and reset perf problems

(Yes, we have some quite verbose merge commits, with very, very long
onelines. Not a good practice, we stopped doing it, but it was well
within what Git allows.)

And now use dashes to encode all white-space:

	Merge-pull-request-#160:-trace2:gvfs:experiment-Add-experimental-regions-and-data-events-to-help-diagnose-checkout-and-reset-perf-problems

Pretty long, but looks okay. Of course, it does not work, because colons. So here is the label with Matt's patch:

	Merge-pull-request--160--trace2-gvfs-experiment-Add-experimental-regions-and-data-events-to-help-diagnose-checkout-and-reset-perf-problems

And here is the label with your proposed xs.

	Mergexpullxrequestxx160xxtrace2xgvfsxexperimentxAddxexperimentalxregionsxandxdataxeventsxtoxhelpxdiagnosexcheckoutxandxresetxperfxproblems

I cannot speak for you, of course, but I can speak for myself: this is
not only way too reminiscent of xoxoxothxbye, but it is also really,
totally unreadable.

If you care deeply about double dashes and leading dashes, how about
this instead?

		char *from, *to;

   		for (from = to = label.buf; *from; from++)
			if ((*from & 0x80) || isalnum(*from))
				*(to++) = *from;
			/* avoid leading dash and double-dashes */
			else if (to != label.buf && to[-1] != '-')
   				*(to++) = '-';
		strbuf_setlen(&label, to - label.buf);

That would result in

	Merge-pull-request-160-trace2-gvfs-experiment-Add-experimental-regions-and-data-events-to-help-diagnose-checkout-and-reset-perf-problems

> I see there are "lets make sure it is unique by suffixing "-%d" in
> other codepaths; would that help if this piece of code yields a
> label that is not unique?

I'm way ahead of you. The sequencer already goes out of its way to
guarantee the uniqueness of the labels (it's part of the design, as
applied in 1644c73c6d4 (rebase-helper --make-script: introduce a flag to
rebase merges, 2018-04-25)).

The patch you are looking at in this thread is only concerned about the
initial phase, `label_oid()` does a lot more. Not only does it make the
label unique (case-insensitively!), it also prevents it from looking
like a full 40-hex digit SHA-1, so that we can guarantee that unique
abbreviations of commit hashes will work as labels, too.

Ciao,
Dscho

  parent reply	other threads:[~2019-09-03 11:20 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-02 14:01 [PATCH 0/1] Make git rebase -r's label generation more resilient Johannes Schindelin via GitGitGadget
2019-09-02 14:01 ` [PATCH 1/1] rebase -r: let `label` generate safer labels Matt R via GitGitGadget
2019-09-02 17:57   ` Phillip Wood
2019-09-02 18:29     ` Junio C Hamano
2019-09-02 20:12       ` brian m. carlson
2019-09-02 21:24       ` Philip Oakley
     [not found]         ` <CAOjrSZtw+wYHxFRQCfb80xzm9OsGDh2rW8uD+AYYdmDPxk5DFQ@mail.gmail.com>
2019-09-02 22:13           ` Philip Oakley
2019-09-03 11:19       ` Johannes Schindelin [this message]
2019-09-03 19:51         ` Junio C Hamano
2019-09-03 22:40           ` Matt Rogers
2019-09-02 19:42     ` Johannes Schindelin
2019-09-03 18:10       ` Junio C Hamano
2019-11-18 20:51         ` Johannes Schindelin
2019-11-17 23:16 ` [PATCH v2 0/2] Make git rebase -r's label generation more resilient Johannes Schindelin via GitGitGadget
2019-11-17 23:16   ` [PATCH v2 1/2] rebase-merges: move labels' whitespace mangling into `label_oid()` Johannes Schindelin via GitGitGadget
2019-11-17 23:16   ` [PATCH v2 2/2] rebase -r: let `label` generate safer labels Matthew Rogers via GitGitGadget
2019-11-18  3:42   ` [PATCH v2 0/2] Make git rebase -r's label generation more resilient Junio C Hamano
2019-11-18 11:23     ` [PATCH] sequencer: handle rebase-merge for "onto" message Danh Doan
2019-11-18 11:57       ` [PATCH v2] sequencer: handle rebase-merges " Doan Tran Cong Danh
2019-11-18 20:15         ` Johannes Schindelin
2019-11-21  0:16       ` [PATCH] sequencer: handle rebase-merge " Junio C Hamano
2019-11-18 20:52     ` [PATCH v2 0/2] Make git rebase -r's label generation more resilient Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.QRO.7.76.6.1909031256290.46@tvgsbejvaqbjf.bet \
    --to=johannes.schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=mattr94@gmail.com \
    --cc=phillip.wood123@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).