From: "René Scharfe" <l.s.r@web.de>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Matheus Tavares" <matheus.bernardino@usp.br>
Cc: git@vger.kernel.org, gitster@pobox.com
Subject: Re: [PATCH] format-patch: warn if commit msg contains a patch delimiter
Date: Mon, 5 Sep 2022 12:57:45 +0200
Message-ID: <904b784d-a328-011f-c71a-c2092534e0f7@web.de>
In-Reply-To: <220905.864jxmme0a.gmgdl@evledraar.gmail.com>

On 05.09.22 at 10:01, Ævar Arnfjörð Bjarmason wrote:
>
> On Sun, Sep 04 2022, Matheus Tavares wrote:
>
>> When applying a patch, `git am` looks for special delimiter strings
>> (such as "---") to know where the message ends and the actual diff
>> starts. If one of these strings appears in the commit message itself,
>> `am` might get confused and fail to apply the patch properly. This has
>> already caused inconveniences in the past [1][2]. To help avoid such
>> problems, let's make `git format-patch` warn on commit messages
>> containing one of these strings.
>>
>> [1]: https://lore.kernel.org/git/20210113085846-mutt-send-email-mst@kernel.org/
>> [2]: https://lore.kernel.org/git/16297305.cDA1TJNmNo@earendil/
>
> I followed this topic with one eye, and have run into this myself in the
> past. I'm not against this warning, but I wonder if we can't fix
> "am/apply" to just be smarter. The cases I've seen are all ones where:
>
>  * We have a copy/pasted git diff, but we could disambiguate based on
>    (at least) the "---" line being a telltale for the "real" patch, and
>    the "X file changed..." diffstat.
>  * We have a not-quite-git-looking patch diff in the commit message
>    (which we'd normally detect and apply), as in your [2].
>
> Couldn't we just be a bit smarter about applying these, and do a
> look-ahead to find what the user meant?

Whatever we use to separate the message from the diff can be included in
that message by an unsuspecting user, and "---" can be part of a diff.  An
earlier discussion yielded an idea, but no implementation:
https://lore.kernel.org/git/20200204010524-mutt-send-email-mst@kernel.org/
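
To make the ambiguity concrete, here is a minimal sketch of a scan over a
commit message for lines that a "---" rule would treat as the start of
the patch.  It covers only the "---" rule, not the "diff -"/"Index: "
or scissors checks, and the helper names looks_like_dash_break() and
count_ambiguous_lines() are made up for illustration; they are not
functions in git:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: does the line buf[0..len) look like a "---"
 * patch break?  Accepts "--- <filename>" and "---" followed only by
 * whitespace up to the newline.
 */
static int looks_like_dash_break(const char *buf, size_t len)
{
	size_t i;

	if (len < 4 || memcmp(buf, "---", 3))
		return 0;
	/* a space followed by a non-space character: "--- <filename>" */
	if (len > 4 && buf[3] == ' ' && !isspace((unsigned char)buf[4]))
		return 1;
	/* only whitespace between "---" and the newline */
	for (i = 3; i < len; i++) {
		if (buf[i] == '\n')
			return 1;
		if (!isspace((unsigned char)buf[i]))
			break;
	}
	return 0;
}

/* Hypothetical helper: count the message lines that look like a break. */
static size_t count_ambiguous_lines(const char *msg, size_t len)
{
	size_t count = 0;

	assert(msg != NULL);
	while (len) {
		const char *eol = memchr(msg, '\n', len);
		/* include the newline in the line, if there is one */
		size_t linelen = eol ? (size_t)(eol - msg) + 1 : len;

		if (looks_like_dash_break(msg, linelen))
			count++;
		msg += linelen;
		len -= linelen;
	}
	return count;
}
```

A message that quotes a bare "---" line or a "--- a/file.c" header would
be counted here, while "----" would not.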

> In any case, having such a warning won't "settle" this issue, as we're
> able to deal with this non-ambiguity in commit objects/the push/fetch
> protocol. It's just "format-patch/am" as a "wire protocol" that has this
> issue.
>
> But anyway, that's the state of the world now, so warning() about it is
> fair, even if we had a fix for the "apply" part we might want to warn
> for a while to note that it's an issue on older gits.
>
>> +		if (pp->check_in_body_patch_breaks) {
>> +			strbuf_reset(&linebuf);
>> +			strbuf_add(&linebuf, line, linelen);
>> +			if (patchbreak(&linebuf) || is_scissors_line(linebuf.buf)) {
>> +				strbuf_strip_suffix(&linebuf, "\n");
>
> Hrm, it's a (small) shame that the patchbreak() function takes a "struct
> strbuf" rather than a char */size_t in this case (seemingly for no good
> reason, as it's "const"?).

A strbuf is always NUL-terminated; a length-limited string (char */size_t)
doesn't have to be.  That means the current implementation can use
functions like starts_with(), but a faithful version that promises to
stay within a given length cannot.  So the reason is probably
convenience.  With skip_prefix_mem() it wouldn't be that bad, though:

---
 mailinfo.c | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index 9621ba62a3..ae2e70e363 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -646,32 +646,30 @@ static void decode_transfer_encoding(struct mailinfo *mi, struct strbuf *line)
 	free(ret);
 }

-static inline int patchbreak(const struct strbuf *line)
+static int patchbreak(const char *buf, size_t len)
 {
-	size_t i;
-
 	/* Beginning of a "diff -" header? */
-	if (starts_with(line->buf, "diff -"))
+	if (skip_prefix_mem(buf, len, "diff -", &buf, &len))
 		return 1;

 	/* CVS "Index: " line? */
-	if (starts_with(line->buf, "Index: "))
+	if (skip_prefix_mem(buf, len, "Index: ", &buf, &len))
 		return 1;

 	/*
 	 * "--- <filename>" starts patches without headers
 	 * "---<sp>*" is a manual separator
 	 */
-	if (line->len < 4)
+	if (len < 4)
 		return 0;

-	if (starts_with(line->buf, "---")) {
+	if (skip_prefix_mem(buf, len, "---", &buf, &len)) {
 		/* space followed by a filename? */
-		if (line->buf[3] == ' ' && !isspace(line->buf[4]))
+		if (len > 1 && buf[0] == ' ' && !isspace(buf[1]))
 			return 1;
 		/* Just whitespace? */
-		for (i = 3; i < line->len; i++) {
-			unsigned char c = line->buf[i];
+		for (; len; buf++, len--) {
+			unsigned char c = buf[0];
 			if (c == '\n')
 				return 1;
 			if (!isspace(c))
@@ -682,14 +680,14 @@ static inline int patchbreak(const struct strbuf *line)
 	return 0;
 }

-static int is_scissors_line(const char *line)
+static int is_scissors_line(const char *line, size_t len)
 {
 	const char *c;
 	int scissors = 0, gap = 0;
 	const char *first_nonblank = NULL, *last_nonblank = NULL;
 	int visible, perforation = 0, in_perforation = 0;

-	for (c = line; *c; c++) {
+	for (c = line; len; c++, len--) {
 		if (isspace(*c)) {
 			if (in_perforation) {
 				perforation++;
@@ -705,12 +703,14 @@ static int is_scissors_line(const char *line)
 			perforation++;
 			continue;
 		}
-		if (starts_with(c, ">8") || starts_with(c, "8<") ||
-		    starts_with(c, ">%") || starts_with(c, "%<")) {
+		if (skip_prefix_mem(c, len, ">8", &c, &len) ||
+		    skip_prefix_mem(c, len, "8<", &c, &len) ||
+		    skip_prefix_mem(c, len, ">%", &c, &len) ||
+		    skip_prefix_mem(c, len, "%<", &c, &len)) {
 			in_perforation = 1;
 			perforation += 2;
 			scissors += 2;
-			c++;
+			c--, len++;
 			continue;
 		}
 		in_perforation = 0;
@@ -747,7 +747,8 @@ static int check_inbody_header(struct mailinfo *mi, const struct strbuf *line)
 {
 	if (mi->inbody_header_accum.len &&
 	    (line->buf[0] == ' ' || line->buf[0] == '\t')) {
-		if (mi->use_scissors && is_scissors_line(line->buf)) {
+		if (mi->use_scissors &&
+		    is_scissors_line(line->buf, line->len)) {
 			/*
 			 * This is a scissors line; do not consider this line
 			 * as a header continuation line.
@@ -808,7 +809,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 	if (convert_to_utf8(mi, line, mi->charset.buf))
 		return 0; /* mi->input_error already set */

-	if (mi->use_scissors && is_scissors_line(line->buf)) {
+	if (mi->use_scissors && is_scissors_line(line->buf, line->len)) {
 		int i;

 		strbuf_setlen(&mi->log_message, 0);
@@ -826,7 +827,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 		return 0;
 	}

-	if (patchbreak(line)) {
+	if (patchbreak(line->buf, line->len)) {
 		if (mi->message_id)
 			strbuf_addf(&mi->log_message,
 				    "Message-Id: %s\n", mi->message_id);
--
2.37.2

