All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] format-patch: teach `--header-cmd`
@ 2024-03-07 19:59 Kristoffer Haugsbakk
  2024-03-07 19:59 ` [PATCH 1/3] log-tree: take ownership of pointer Kristoffer Haugsbakk
                   ` (3 more replies)
  0 siblings, 4 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-07 19:59 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk, Jeff King, Maxim Cournoyer

(most of this is from the main commit with some elaboration in parts)

Teach git-format-patch(1) `--header-cmd` (with negation) and the
accompanying config variable `format.headerCmd` which allows the user to
add extra headers per-patch.

§ Motivation

format-patch knows `--add-header`. However, that seems most useful for
series-wide headers; you cannot really control what the header is like
per patch or specifically for the cover letter. To that end, teach
format-patch a new option which runs a command that has access to the
hash of the current commit (if it is a code patch) and the patch count
which is used for the patch files that this command outputs. Also
include an environment variable which tells the version of this API so
that the command can detect and error out in case the API changes.

This is inspired by `--header-cmd` of git-send-email(1).

§ Discussion

One may already get the commit hash from the commit by looking at the message id created by format-patch:

    Message-ID: <48c517ffb3dce2188aba5f0a2c1f4f9dc8df59f0.1709840255.git.code@khaugsbakk.name>

However that does not seem to be documented behavior, and relying on the
message id from various tools seems to have backfired before.[1]

It’s also more convenient to not have to parse any output from
format-patch.

One may also be interested in finding some information based on the
commit hash, not just simply to output the hash itself.

🔗 1: https://lore.kernel.org/git/20231106153214.s5abourejkuiwk64@pengutronix.de/

For example, the command could output a header for the current commit as
well as another header for the previously-published commits:

    X-Commit-Hash: 97b53c04894578b23d0c650f69885f734699afc7
    X-Previous-Commits:
        4ad5d4190649dcb5f26c73a6f15ab731891b9dfd
        d275d1d179b90592ddd7b5da2ae4573b3f7a37b7
        402b7937951073466bf4527caffd38175391c7da

One can imagine that (1) these previous commit hashes were stored on every
commit rewrite and (2) the commits that had been published previously
were also stored. Then the command just needed the current commit hash
in order to look up this information.

Now interested parties can use this information to track where the
patches come from.

This information could of course be given between the
three-dash/three-hyphen line and the patch proper. However, the
hypoethetical project in question might prefer to use this part for
extra patch information written by the author and leave the above
information for tooling; this way the extra information does not need to
disturb the reader.

§ Demonstration

The above current/previous hash example is taken from:

https://lore.kernel.org/git/97b53c04894578b23d0c650f69885f734699afc7.1709670287.git.code@khaugsbakk.name/

§ CC

For patch “log-tree: take ownership of pointer”:

Cc: Jeff King <peff@peff.net>

For the git-send-email(1) `--header-cmd` topic:[1]

Cc: Maxim Cournoyer <maxim.cournoyer@gmail.com>

🔗 1: https://lore.kernel.org/git/20230423122744.4865-1-maxim.cournoyer@gmail.com/

Kristoffer Haugsbakk (3):
  log-tree: take ownership of pointer
  format-patch: teach `--header-cmd`
  format-patch: check if header output looks valid

 Documentation/config/format.txt    |  5 ++
 Documentation/git-format-patch.txt | 26 ++++++++++
 builtin/log.c                      | 76 ++++++++++++++++++++++++++++++
 log-tree.c                         | 18 ++++++-
 revision.h                         |  2 +
 t/t4014-format-patch.sh            | 55 +++++++++++++++++++++
 6 files changed, 180 insertions(+), 2 deletions(-)

-- 
2.44.0.169.gd259cac85a8


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 1/3] log-tree: take ownership of pointer
  2024-03-07 19:59 [PATCH 0/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
@ 2024-03-07 19:59 ` Kristoffer Haugsbakk
  2024-03-12  9:29   ` Jeff King
  2024-03-07 19:59 ` [PATCH 2/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-07 19:59 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk

The MIME header handling started using string buffers in
d50b69b868d (log_write_email_headers: use strbufs, 2018-05-18). The
subject buffer is given to `extra_headers` without that variable taking
ownership; the commit “punts on that ownership” (in general, not just
for this buffer).

In an upcoming commit we will first assign `extra_headers` to the owned
pointer from another `strbuf`. In turn we need this variable to always
contain an owned pointer so that we can free it in the calling
function.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
---
 log-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/log-tree.c b/log-tree.c
index 337b9334cdb..2eabd19962b 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -519,7 +519,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			 extra_headers ? extra_headers : "",
 			 mime_boundary_leader, opt->mime_boundary,
 			 mime_boundary_leader, opt->mime_boundary);
-		extra_headers = subject_buffer.buf;
+		extra_headers = strbuf_detach(&subject_buffer, NULL);
 
 		if (opt->numbered_files)
 			strbuf_addf(&filename, "%d", opt->nr);
-- 
2.44.0.169.gd259cac85a8


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 2/3] format-patch: teach `--header-cmd`
  2024-03-07 19:59 [PATCH 0/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
  2024-03-07 19:59 ` [PATCH 1/3] log-tree: take ownership of pointer Kristoffer Haugsbakk
@ 2024-03-07 19:59 ` Kristoffer Haugsbakk
  2024-03-08 18:30   ` Kristoffer Haugsbakk
  2024-03-11 21:29   ` Jean-Noël Avila
  2024-03-07 19:59 ` [PATCH 3/3] format-patch: check if header output looks valid Kristoffer Haugsbakk
  2024-03-19 18:35 ` [PATCH v2 0/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
  3 siblings, 2 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-07 19:59 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk

Teach git-format-patch(1) `--header-cmd` (with negation) and the
accompanying config variable `format.headerCmd` which allows the user to
add extra headers per-patch.

format-patch knows `--add-header`. However, that seems most useful for
series-wide headers; you cannot really control what the header is like
per patch or specifically for the cover letter. To that end, teach
format-patch a new option which runs a command that has access to the
hash of the current commit (if it is a code patch) and the patch count
which is used for the patch files that this command outputs. Also
include an environment variable which tells the version of this API so
that the command can detect and error out in case the API changes.

This is inspired by `--header-cmd` of git-send-email(1).

§ Discussion

The command can use the provided commit hash to provide relevant
information in the header. For example, the command could output a
header for the current commit as well as the previously-published
commits:

    X-Commit-Hash: 97b53c04894578b23d0c650f69885f734699afc7
    X-Previous-Commits:
        4ad5d4190649dcb5f26c73a6f15ab731891b9dfd
        d275d1d179b90592ddd7b5da2ae4573b3f7a37b7
        402b7937951073466bf4527caffd38175391c7da

Now interested parties can use this information to track where the
patches come from.

This information could of course be given between the
three-dash/three-hyphen line and the patch proper. However, the project
might prefer to use this part for extra patch information written by the
author and leave the above information for tooling; this way the extra
information does not need to disturb the reader.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
---

Notes (series):
    Documentation/config/format.txt:
    • I get the impression that `_` is the convention for placeholders now:
      `_<cmd>_`

 Documentation/config/format.txt    |  5 ++++
 Documentation/git-format-patch.txt | 26 ++++++++++++++++++
 builtin/log.c                      | 43 ++++++++++++++++++++++++++++++
 log-tree.c                         | 16 ++++++++++-
 revision.h                         |  2 ++
 t/t4014-format-patch.sh            | 42 +++++++++++++++++++++++++++++
 6 files changed, 133 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/format.txt b/Documentation/config/format.txt
index 7410e930e53..c184b865824 100644
--- a/Documentation/config/format.txt
+++ b/Documentation/config/format.txt
@@ -31,6 +31,11 @@ format.headers::
 	Additional email headers to include in a patch to be submitted
 	by mail.  See linkgit:git-format-patch[1].
 
+format.headerCmd::
+	Command to run for each patch that should output RFC 2822 email
+	headers. Has access to some information per patch via
+	environment variables. See linkgit:git-format-patch[1].
+
 format.to::
 format.cc::
 	Additional recipients to include in a patch to be submitted
diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt
index 728bb3821c1..41c344902e9 100644
--- a/Documentation/git-format-patch.txt
+++ b/Documentation/git-format-patch.txt
@@ -303,6 +303,32 @@ feeding the result to `git send-email`.
 	`Cc:`, and custom) headers added so far from config or command
 	line.
 
+--[no-]header-cmd=<cmd>::
+	Run _<cmd>_ for each patch. _<cmd>_ should output valid RFC 2822
+	email headers. This can also be configured with
+	the configuration variable `format.headerCmd`. Can be turned off
+	with `--no-header-cmd`. This works independently of
+	`--[no-]add-header`.
++
+_<cmd>_ has access to these environment variables:
++
+	GIT_FP_HEADER_CMD_VERSION
++
+The version of this API. Currently `1`. _<cmd>_ may return exit code
+`2` in order to signal that it does not support the given version.
++
+	GIT_FP_HEADER_CMD_HASH
++
+The hash of the commit corresponding to the current patch. Not set if
+the current patch is the cover letter.
++
+	GIT_FP_HEADER_CMD_COUNT
++
+The current patch count. Increments for each patch.
++
+`git format-patch` will error out if _<cmd>_ returns a non-zero exit
+code.
+
 --[no-]cover-letter::
 	In addition to the patches, generate a cover letter file
 	containing the branch description, shortlog and the overall diffstat.  You can
diff --git a/builtin/log.c b/builtin/log.c
index db1808d7c13..eecbcdf1d6d 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -43,10 +43,13 @@
 #include "tmp-objdir.h"
 #include "tree.h"
 #include "write-or-die.h"
+#include "run-command.h"
 
 #define MAIL_DEFAULT_WRAP 72
 #define COVER_FROM_AUTO_MAX_SUBJECT_LEN 100
 #define FORMAT_PATCH_NAME_MAX_DEFAULT 64
+#define HC_VERSION "1"
+#define HC_NOT_SUPPORTED 2
 
 /* Set a default date-time format for git log ("log.date" config variable) */
 static const char *default_date_mode = NULL;
@@ -902,6 +905,7 @@ static int auto_number = 1;
 
 static char *default_attach = NULL;
 
+static const char *header_cmd = NULL;
 static struct string_list extra_hdr = STRING_LIST_INIT_NODUP;
 static struct string_list extra_to = STRING_LIST_INIT_NODUP;
 static struct string_list extra_cc = STRING_LIST_INIT_NODUP;
@@ -1100,6 +1104,8 @@ static int git_format_config(const char *var, const char *value,
 		format_no_prefix = 1;
 		return 0;
 	}
+	if (!strcmp(var, "format.headercmd"))
+		return git_config_string(&header_cmd, var, value);
 
 	/*
 	 * ignore some porcelain config which would otherwise be parsed by
@@ -1419,6 +1425,7 @@ static void make_cover_letter(struct rev_info *rev, int use_separate_file,
 		show_range_diff(rev->rdiff1, rev->rdiff2, &range_diff_opts);
 		strvec_clear(&other_arg);
 	}
+	free((char *)pp.after_subject);
 }
 
 static char *clean_message_id(const char *msg_id)
@@ -1865,6 +1872,35 @@ static void infer_range_diff_ranges(struct strbuf *r1,
 	}
 }
 
+/* Returns an owned pointer */
+static char *header_cmd_output(struct rev_info *rev, const struct commit *cmit)
+{
+	struct child_process header_cmd_proc = CHILD_PROCESS_INIT;
+	struct strbuf output = STRBUF_INIT;
+	int res;
+
+	strvec_pushl(&header_cmd_proc.args, header_cmd, NULL);
+	if (cmit)
+		strvec_pushf(&header_cmd_proc.env, "GIT_FP_HEADER_CMD_HASH=%s",
+			     oid_to_hex(&cmit->object.oid));
+	strvec_pushl(&header_cmd_proc.env,
+		     "GIT_FP_HEADER_CMD_VERSION=" HC_VERSION, NULL);
+	strvec_pushf(&header_cmd_proc.env, "GIT_FP_HEADER_CMD_COUNT=%" PRIuMAX,
+		     (uintmax_t)rev->nr);
+	res = capture_command(&header_cmd_proc, &output, 0);
+	if (res) {
+		if (res == HC_NOT_SUPPORTED)
+			die(_("header-cmd %s: returned exit "
+			      "code %d; the command does not support "
+			      "version " HC_VERSION),
+			    header_cmd, HC_NOT_SUPPORTED);
+		else
+			die(_("header-cmd %s: failed with exit code %d"),
+			    header_cmd, res);
+	}
+	return strbuf_detach(&output, NULL);
+}
+
 int cmd_format_patch(int argc, const char **argv, const char *prefix)
 {
 	struct commit *commit;
@@ -1955,6 +1991,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		OPT_GROUP(N_("Messaging")),
 		OPT_CALLBACK(0, "add-header", NULL, N_("header"),
 			    N_("add email header"), header_callback),
+		OPT_STRING(0, "header-cmd", &header_cmd, N_("email"), N_("command that will be run to generate headers")),
 		OPT_STRING_LIST(0, "to", &extra_to, N_("email"), N_("add To: header")),
 		OPT_STRING_LIST(0, "cc", &extra_cc, N_("email"), N_("add Cc: header")),
 		OPT_CALLBACK_F(0, "from", &from, N_("ident"),
@@ -2321,6 +2358,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	if (cover_letter) {
 		if (thread)
 			gen_message_id(&rev, "cover");
+		if (header_cmd)
+			rev.pe_headers = header_cmd_output(&rev, NULL);
 		make_cover_letter(&rev, !!output_directory,
 				  origin, nr, list, description_file, branch_name, quiet);
 		print_bases(&bases, rev.diffopt.file);
@@ -2330,6 +2369,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		/* interdiff/range-diff in cover-letter; omit from patches */
 		rev.idiff_oid1 = NULL;
 		rev.rdiff1 = NULL;
+		free((char *)rev.pe_headers);
 	}
 	rev.add_signoff = do_signoff;
 
@@ -2376,6 +2416,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 			gen_message_id(&rev, oid_to_hex(&commit->object.oid));
 		}
 
+		if (header_cmd)
+			rev.pe_headers = header_cmd_output(&rev, commit);
 		if (output_directory &&
 		    open_next_file(rev.numbered_files ? NULL : commit, NULL, &rev, quiet))
 			die(_("failed to create output files"));
@@ -2402,6 +2444,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		}
 		if (output_directory)
 			fclose(rev.diffopt.file);
+		free((char *)rev.pe_headers);
 	}
 	stop_progress(&progress);
 	free(list);
diff --git a/log-tree.c b/log-tree.c
index 2eabd19962b..3ca383d099f 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -469,12 +469,24 @@ void fmt_output_email_subject(struct strbuf *sb, struct rev_info *opt)
 	}
 }
 
+static char *extra_and_pe_headers(const char *extra_headers, const char *pe_headers) {
+	struct strbuf all_headers = STRBUF_INIT;
+
+	if (extra_headers)
+		strbuf_addstr(&all_headers, extra_headers);
+	if (pe_headers) {
+		strbuf_addstr(&all_headers, pe_headers);
+	}
+	return strbuf_detach(&all_headers, NULL);
+}
+
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			     const char **extra_headers_p,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart)
 {
-	const char *extra_headers = opt->extra_headers;
+	const char *extra_headers =
+		extra_and_pe_headers(opt->extra_headers, opt->pe_headers);
 	const char *name = oid_to_hex(opt->zero_commit ?
 				      null_oid() : &commit->object.oid);
 
@@ -519,6 +531,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			 extra_headers ? extra_headers : "",
 			 mime_boundary_leader, opt->mime_boundary,
 			 mime_boundary_leader, opt->mime_boundary);
+		free((char *)extra_headers);
 		extra_headers = strbuf_detach(&subject_buffer, NULL);
 
 		if (opt->numbered_files)
@@ -857,6 +870,7 @@ void show_log(struct rev_info *opt)
 
 	strbuf_release(&msgbuf);
 	free(ctx.notes_message);
+	free((char *)ctx.after_subject);
 
 	if (cmit_fmt_is_mail(ctx.fmt) && opt->idiff_oid1) {
 		struct diff_queue_struct dq;
diff --git a/revision.h b/revision.h
index 94c43138bc3..eb36bdea36e 100644
--- a/revision.h
+++ b/revision.h
@@ -291,6 +291,8 @@ struct rev_info {
 	struct string_list *ref_message_ids;
 	int		add_signoff;
 	const char	*extra_headers;
+	/* per-email headers */
+	const char	*pe_headers;
 	const char	*log_reencode;
 	const char	*subject_prefix;
 	int		patch_name_max;
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index e37a1411ee2..dfda21d4b2b 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -238,6 +238,48 @@ test_expect_failure 'configuration To: header (rfc2047)' '
 	grep "^To: =?UTF-8?q?R=20=C3=84=20Cipient?= <rcipient@example.com>\$" hdrs9
 '
 
+test_expect_success '--header-cmd' '
+	write_script cmd <<-\EOF &&
+	printf "X-S: $GIT_FP_HEADER_CMD_HASH\n"
+	printf "X-V: $GIT_FP_HEADER_CMD_VERSION\n"
+	printf "X-C: $GIT_FP_HEADER_CMD_COUNT\n"
+	EOF
+	expect_sha1=$(git rev-parse side) &&
+	git format-patch --header-cmd=./cmd --stdout main..side >patch &&
+	grep "^X-S: $expect_sha1" patch &&
+	grep "^X-V: 1" patch &&
+	grep "^X-C: 3" patch
+'
+
+test_expect_success '--header-cmd with no output works' '
+	write_script cmd <<-\EOF &&
+	exit 0
+	EOF
+	git format-patch --header-cmd=./cmd --stdout main..side
+'
+
+test_expect_success '--header-cmd reports failed command' '
+	write_script cmd <<-\EOF &&
+	exit 1
+	EOF
+		cat > expect <<-\EOF &&
+	fatal: header-cmd ./cmd: failed with exit code 1
+	EOF
+	test_must_fail git format-patch --header-cmd=./cmd --stdout main..side >actual 2>&1 &&
+	test_cmp expect actual
+'
+
+test_expect_success '--header-cmd reports exit code 2' '
+	write_script cmd <<-\EOF &&
+	exit 2
+	EOF
+	cat > expect <<-\EOF &&
+	fatal: header-cmd ./cmd: returned exit code 2; the command does not support version 1
+	EOF
+	test_must_fail git format-patch --header-cmd=./cmd --stdout main..side >actual 2>&1 &&
+	test_cmp expect actual
+'
+
 # check_patch <patch>: Verify that <patch> looks like a half-sane
 # patch email to avoid a false positive with !grep
 check_patch () {
-- 
2.44.0.169.gd259cac85a8


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 3/3] format-patch: check if header output looks valid
  2024-03-07 19:59 [PATCH 0/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
  2024-03-07 19:59 ` [PATCH 1/3] log-tree: take ownership of pointer Kristoffer Haugsbakk
  2024-03-07 19:59 ` [PATCH 2/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
@ 2024-03-07 19:59 ` Kristoffer Haugsbakk
  2024-03-19 18:35 ` [PATCH v2 0/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
  3 siblings, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-07 19:59 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk

Implement a function based on `mailinfo.c:is_mail`.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
---

Notes (series):
    Isolating this for review as its own commit so that I can point out the
    provenance. May well be squashed into the main patch eventually.

 builtin/log.c           | 33 +++++++++++++++++++++++++++++++++
 t/t4014-format-patch.sh | 13 +++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/builtin/log.c b/builtin/log.c
index eecbcdf1d6d..27e1a66dd03 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1872,6 +1872,35 @@ static void infer_range_diff_ranges(struct strbuf *r1,
 	}
 }
 
+static int is_mail(struct strbuf *sb)
+{
+	const char *header_regex = "^[!-9;-~]+:";
+	regex_t regex;
+	int ret = 1, i;
+	struct string_list list = STRING_LIST_INIT_DUP;
+
+	if (regcomp(&regex, header_regex, REG_NOSUB | REG_EXTENDED))
+		die("invalid pattern: %s", header_regex);
+	string_list_split(&list, sb->buf, '\n', -1);
+	for (i = 0; i < list.nr; i++) {
+		/* End of header */
+		if (!*list.items[i].string && i == (list.nr - 1))
+			break;
+		/* Ignore indented folded lines */
+		if (*list.items[i].string == '\t' ||
+		    *list.items[i].string == ' ')
+			continue;
+		/* It's a header if it matches header_regex */
+		if (regexec(&regex, list.items[i].string, 0, NULL, 0)) {
+			ret = 0;
+			break;
+		}
+	}
+	string_list_clear(&list, 1);
+	regfree(&regex);
+	return ret;
+}
+
 /* Returns an owned pointer */
 static char *header_cmd_output(struct rev_info *rev, const struct commit *cmit)
 {
@@ -1898,6 +1927,10 @@ static char *header_cmd_output(struct rev_info *rev, const struct commit *cmit)
 			die(_("header-cmd %s: failed with exit code %d"),
 			    header_cmd, res);
 	}
+	if (!is_mail(&output))
+		die(_("header-cmd %s: returned output which was "
+		      "not recognized as valid RFC 2822 headers"),
+		    header_cmd);
 	return strbuf_detach(&output, NULL);
 }
 
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index dfda21d4b2b..98e0eb706e6 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -258,6 +258,19 @@ test_expect_success '--header-cmd with no output works' '
 	git format-patch --header-cmd=./cmd --stdout main..side
 '
 
+test_expect_success '--header-cmd without headers-like output fails' '
+	write_script cmd <<-\EOF &&
+	printf "X-S: $GIT_FP_HEADER_CMD_HASH\n"
+	printf "\n"
+	printf "X-C: $GIT_FP_HEADER_CMD_COUNT\n"
+	EOF
+	cat > expect <<-\EOF &&
+	fatal: header-cmd ./cmd: returned output which was not recognized as valid RFC 2822 headers
+	EOF
+	test_must_fail git format-patch --header-cmd=./cmd --stdout main..side >actual 2>&1 &&
+	test_cmp expect actual
+'
+
 test_expect_success '--header-cmd reports failed command' '
 	write_script cmd <<-\EOF &&
 	exit 1
-- 
2.44.0.169.gd259cac85a8


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/3] format-patch: teach `--header-cmd`
  2024-03-07 19:59 ` [PATCH 2/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
@ 2024-03-08 18:30   ` Kristoffer Haugsbakk
  2024-03-11 21:29   ` Jean-Noël Avila
  1 sibling, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-08 18:30 UTC (permalink / raw)
  To: git

On Thu, Mar 7, 2024, at 20:59, Kristoffer Haugsbakk wrote:
> +test_expect_success '--header-cmd with no output works' '
> +	write_script cmd <<-\EOF &&
> +	exit 0
> +	EOF
> +	git format-patch --header-cmd=./cmd --stdout main..side
> +'

This can be simplified to `--header-cmd=true`.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/3] format-patch: teach `--header-cmd`
  2024-03-07 19:59 ` [PATCH 2/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
  2024-03-08 18:30   ` Kristoffer Haugsbakk
@ 2024-03-11 21:29   ` Jean-Noël Avila
  2024-03-12  8:13     ` Kristoffer Haugsbakk
  1 sibling, 1 reply; 34+ messages in thread
From: Jean-Noël Avila @ 2024-03-11 21:29 UTC (permalink / raw)
  To: Kristoffer Haugsbakk, git

Le 07/03/2024 à 20:59, Kristoffer Haugsbakk a écrit :
>  
> +format.headerCmd::
> +	Command to run for each patch that should output RFC 2822 email
> +	headers. Has access to some information per patch via
> +	environment variables. See linkgit:git-format-patch[1].
> +
>  format.to::
>  format.cc::
>  	Additional recipients to include in a patch to be submitted
> diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt
> index 728bb3821c1..41c344902e9 100644
> --- a/Documentation/git-format-patch.txt
> +++ b/Documentation/git-format-patch.txt
> @@ -303,6 +303,32 @@ feeding the result to `git send-email`.
>  	`Cc:`, and custom) headers added so far from config or command
>  	line.
>  
> +--[no-]header-cmd=<cmd>::
> +	Run _<cmd>_ for each patch. _<cmd>_ should output valid RFC 2822
> +	email headers. This can also be configured with
> +	the configuration variable `format.headerCmd`. Can be turned off
> +	with `--no-header-cmd`. This works independently of
> +	`--[no-]add-header`.
> ++
> +_<cmd>_ has access to these environment variables:
> ++
> +	GIT_FP_HEADER_CMD_VERSION

Better use a nested description list like this:

GIT_FP_HEADER_CMD_VERSION;;
  The version of this API. Currently `1`. _<cmd>_ may return exit code
  `2` in order to signal that it does not support the given version.

> ++
> +The version of this API. Currently `1`. _<cmd>_ may return exit code
> +`2` in order to signal that it does not support the given version.
> ++
> +	GIT_FP_HEADER_CMD_HASH
> ++
> +The hash of the commit corresponding to the current patch. Not set if
> +the current patch is the cover letter.
> ++
> +	GIT_FP_HEADER_CMD_COUNT
> ++
> +The current patch count. Increments for each patch.
> ++
> +`git format-patch` will error out if _<cmd>_ returns a non-zero exit
> +code.
> +
>  --[no-]cover-letter::
>  	In addition to the patches, generate a cover letter file
>  	containing the branch description, shortlog and the overall diffstat.  You can


Overall, thank you for correctly marking up placeholders and options.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/3] format-patch: teach `--header-cmd`
  2024-03-11 21:29   ` Jean-Noël Avila
@ 2024-03-12  8:13     ` Kristoffer Haugsbakk
  0 siblings, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-12  8:13 UTC (permalink / raw)
  To: Jean-Noël Avila; +Cc: git

On Mon, Mar 11, 2024, at 22:29, Jean-Noël Avila wrote:
>> +--[no-]header-cmd=<cmd>::
>> +	Run _<cmd>_ for each patch. _<cmd>_ should output valid RFC 2822
>> +	email headers. This can also be configured with
>> +	the configuration variable `format.headerCmd`. Can be turned off
>> +	with `--no-header-cmd`. This works independently of
>> +	`--[no-]add-header`.
>> ++
>> +_<cmd>_ has access to these environment variables:
>> ++
>> +	GIT_FP_HEADER_CMD_VERSION
>
> Better use a nested description list like this:
>
> GIT_FP_HEADER_CMD_VERSION;;
>   The version of this API. Currently `1`. _<cmd>_ may return exit code
>   `2` in order to signal that it does not support the given version.
>

Thanks, I’ll do that in the next version.

>> ++
>> +The version of this API. Currently `1`. _<cmd>_ may return exit code
>> +`2` in order to signal that it does not support the given version.
>> ++
>> +	GIT_FP_HEADER_CMD_HASH
>> ++
>> +The hash of the commit corresponding to the current patch. Not set if
>> +the current patch is the cover letter.
>> ++
>> +	GIT_FP_HEADER_CMD_COUNT
>> ++
>> +The current patch count. Increments for each patch.
>> ++
>> +`git format-patch` will error out if _<cmd>_ returns a non-zero exit
>> +code.
>> +
>>  --[no-]cover-letter::
>>  	In addition to the patches, generate a cover letter file
>>  	containing the branch description, shortlog and the overall diffstat.  You can
>
>
> Overall, thank you for correctly marking up placeholders and options.

Thanks for reviewing!

--
Kristoffer Haugsbakk

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/3] log-tree: take ownership of pointer
  2024-03-07 19:59 ` [PATCH 1/3] log-tree: take ownership of pointer Kristoffer Haugsbakk
@ 2024-03-12  9:29   ` Jeff King
  2024-03-12 17:43     ` Kristoffer Haugsbakk
  0 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2024-03-12  9:29 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git

On Thu, Mar 07, 2024 at 08:59:35PM +0100, Kristoffer Haugsbakk wrote:

> The MIME header handling started using string buffers in
> d50b69b868d (log_write_email_headers: use strbufs, 2018-05-18). The
> subject buffer is given to `extra_headers` without that variable taking
> ownership; the commit “punts on that ownership” (in general, not just
> for this buffer).
> 
> In an upcoming commit we will first assign `extra_headers` to the owned
> pointer from another `strbuf`. In turn we need this variable to always
> contain an owned pointer so that we can free it in the calling
> function.

Hmm, OK. This patch by itself introduces a memory leak. It would be nice
if we could couple it with the matching free() so that we can see that
the issue is fixed. It sounds like your patch 2 is going to introduce
such a free, but I'm not sure it's complete. It frees the old
extra_headers before reassigning it, but nobody cleans it up after
handling the final commit.

We should also drop the "static" from subject_buffer, if it is no longer
needed. Likewise, any strings that start owning memory (here or in patch
2) should probably drop their "const". That makes the ownership more
clear, and avoids ugly casts when freeing.

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/3] log-tree: take ownership of pointer
  2024-03-12  9:29   ` Jeff King
@ 2024-03-12 17:43     ` Kristoffer Haugsbakk
  2024-03-13  6:54       ` Jeff King
  0 siblings, 1 reply; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-12 17:43 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Hi Jeff and thanks for taking a look

On Tue, Mar 12, 2024, at 10:29, Jeff King wrote:
> On Thu, Mar 07, 2024 at 08:59:35PM +0100, Kristoffer Haugsbakk wrote:
>
>> The MIME header handling started using string buffers in
>> d50b69b868d (log_write_email_headers: use strbufs, 2018-05-18). The
>> subject buffer is given to `extra_headers` without that variable taking
>> ownership; the commit “punts on that ownership” (in general, not just
>> for this buffer).
>>
>> In an upcoming commit we will first assign `extra_headers` to the owned
>> pointer from another `strbuf`. In turn we need this variable to always
>> contain an owned pointer so that we can free it in the calling
>> function.
>
> Hmm, OK. This patch by itself introduces a memory leak. It would be nice
> if we could couple it with the matching free() so that we can see that
> the issue is fixed. It sounds like your patch 2 is going to introduce
> such a free, but I'm not sure it's complete.

Is it okay if it is done in patch 2?

> It frees the old extra_headers before reassigning it, but nobody
> cleans it up after handling the final commit.

I didn’t get any leak errors from the CI. `extra_headers` in `show_log`
is populated by calling `log_write_email_headers`. Then later it is
assigned to

    ctx.after_subject = extra_headers;

Then `ctx.after_subject is freed later

    free((char *)ctx.after_subject);

Am I missing something?

> We should also drop the "static" from subject_buffer, if it is no longer
> needed. Likewise, any strings that start owning memory (here or in patch
> 2) should probably drop their "const". That makes the ownership more
> clear, and avoids ugly casts when freeing.

Okay, I’ll do that.

Thanks

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/3] log-tree: take ownership of pointer
  2024-03-12 17:43     ` Kristoffer Haugsbakk
@ 2024-03-13  6:54       ` Jeff King
  2024-03-13 17:49         ` Kristoffer Haugsbakk
  0 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2024-03-13  6:54 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git

On Tue, Mar 12, 2024 at 06:43:55PM +0100, Kristoffer Haugsbakk wrote:

> > Hmm, OK. This patch by itself introduces a memory leak. It would be nice
> > if we could couple it with the matching free() so that we can see that
> > the issue is fixed. It sounds like your patch 2 is going to introduce
> > such a free, but I'm not sure it's complete.
> 
> Is it okay if it is done in patch 2?

I don't think it's the end of the world to do it in patch 2, as long as
we end up in a good spot. But IMHO it's really hard for reviewers to
understand what is going on, because it's intermingled with so many
other changes. It would be much easier to read if we had a preparatory
patch that switched the memory ownership of the field, and then built on
top of that.

But I recognize that sometimes that's hard to do, because the state is
so tangled that the functional change is what untangles it. I'm not sure
if that's the case here or not; you'd probably have a better idea as
somebody who looked carefully at it recently.

> > It frees the old extra_headers before reassigning it, but nobody
> > cleans it up after handling the final commit.
> 
> I didn’t get any leak errors from the CI. `extra_headers` in `show_log`
> is populated by calling `log_write_email_headers`. Then later it is
> assigned to
> 
>     ctx.after_subject = extra_headers;
> 
> Then `ctx.after_subject is freed later
> 
>     free((char *)ctx.after_subject);
> 
> Am I missing something?

Ah, I see. I was confused by looking for a free of an extra_headers
field. We have rev_info.extra_headers, and that is _not_ owned by
rev_info. We used to assign that to a variable in
log_write_email_headers(), but now we actually make a copy of it. And so
the copy is freed in that function when we replace it with a version
containing extra mime headers here:

                  strbuf_addf(&subject_buffer,
                           "%s"
                           "MIME-Version: 1.0\n"
                           "Content-Type: multipart/mixed;"
                           " boundary=\"%s%s\"\n"
                           "\n"
                           "This is a multi-part message in MIME "
                           "format.\n"
                           "--%s%s\n"
                           "Content-Type: text/plain; "
                           "charset=UTF-8; format=fixed\n"
                           "Content-Transfer-Encoding: 8bit\n\n",
                           extra_headers ? extra_headers : "",
                           mime_boundary_leader, opt->mime_boundary,
                           mime_boundary_leader, opt->mime_boundary);
                  free((char *)extra_headers);
                  extra_headers = strbuf_detach(&subject_buffer, NULL);

But the actual ownership is passed out via the extra_headers_p variable,
and that is what is assigned to ctx.after_subject (which now takes
ownership).

I think in the snippet I quoted above that extra_headers could never be
NULL now, right? We'll always return at least an empty string. But
moreover, we are formatting it into a strbuf, only to potentially copy
it it another strbuf. Couldn't we just do it all in one strbuf?

Something like this:

 log-tree.c | 29 ++++++++---------------------
 1 file changed, 8 insertions(+), 21 deletions(-)

diff --git a/log-tree.c b/log-tree.c
index 9196b4f1d4..0a703a0303 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -469,29 +469,22 @@ void fmt_output_email_subject(struct strbuf *sb, struct rev_info *opt)
 	}
 }
 
-static char *extra_and_pe_headers(const char *extra_headers, const char *pe_headers) {
-	struct strbuf all_headers = STRBUF_INIT;
-
-	if (extra_headers)
-		strbuf_addstr(&all_headers, extra_headers);
-	if (pe_headers) {
-		strbuf_addstr(&all_headers, pe_headers);
-	}
-	return strbuf_detach(&all_headers, NULL);
-}
-
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			     const char **extra_headers_p,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart)
 {
-	const char *extra_headers =
-		extra_and_pe_headers(opt->extra_headers, opt->pe_headers);
+	struct strbuf headers = STRBUF_INIT;
 	const char *name = oid_to_hex(opt->zero_commit ?
 				      null_oid() : &commit->object.oid);
 
 	*need_8bit_cte_p = 0; /* unknown */
 
+	if (opt->extra_headers)
+		strbuf_addstr(&headers, opt->extra_headers);
+	if (opt->pe_headers)
+		strbuf_addstr(&headers, opt->pe_headers);
+
 	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);
 	graph_show_oneline(opt->graph);
 	if (opt->message_id) {
@@ -508,16 +501,13 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		graph_show_oneline(opt->graph);
 	}
 	if (opt->mime_boundary && maybe_multipart) {
-		static struct strbuf subject_buffer = STRBUF_INIT;
 		static struct strbuf buffer = STRBUF_INIT;
 		struct strbuf filename =  STRBUF_INIT;
 		*need_8bit_cte_p = -1; /* NEVER */
 
-		strbuf_reset(&subject_buffer);
 		strbuf_reset(&buffer);
 
-		strbuf_addf(&subject_buffer,
-			 "%s"
+		strbuf_addf(&headers,
 			 "MIME-Version: 1.0\n"
 			 "Content-Type: multipart/mixed;"
 			 " boundary=\"%s%s\"\n"
@@ -528,11 +518,8 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			 "Content-Type: text/plain; "
 			 "charset=UTF-8; format=fixed\n"
 			 "Content-Transfer-Encoding: 8bit\n\n",
-			 extra_headers ? extra_headers : "",
 			 mime_boundary_leader, opt->mime_boundary,
 			 mime_boundary_leader, opt->mime_boundary);
-		free((char *)extra_headers);
-		extra_headers = strbuf_detach(&subject_buffer, NULL);
 
 		if (opt->numbered_files)
 			strbuf_addf(&filename, "%d", opt->nr);
@@ -552,7 +539,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		opt->diffopt.stat_sep = buffer.buf;
 		strbuf_release(&filename);
 	}
-	*extra_headers_p = extra_headers;
+	*extra_headers_p = headers.len ? strbuf_detach(&headers, NULL) : NULL;
 }
 
 static void show_sig_lines(struct rev_info *opt, int status, const char *bol)


The resulting code is shorter and (IMHO) easier to understand. It
avoids an extra allocation and copy when using mime. It also avoids the
allocation of an empty string when opt->extra_headers and
opt->pe_headers are both NULL. It does make an extra copy when
extra_headers is non-NULL but pe_headers is NULL (and you're not using
MIME), as we could just use opt->extra_headers as-is, then. But since
the caller needs to take ownership, we can't avoid that copy.

I think you could even do this cleanup before adding pe_headers,
especially if it was coupled with cleaning up the memory ownership
issues.

-Peff

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/3] log-tree: take ownership of pointer
  2024-03-13  6:54       ` Jeff King
@ 2024-03-13 17:49         ` Kristoffer Haugsbakk
  0 siblings, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-13 17:49 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Wed, Mar 13, 2024, at 07:54, Jeff King wrote:
> On Tue, Mar 12, 2024 at 06:43:55PM +0100, Kristoffer Haugsbakk wrote:
>
>> > Hmm, OK. This patch by itself introduces a memory leak. It would be nice
>> > if we could couple it with the matching free() so that we can see that
>> > the issue is fixed. It sounds like your patch 2 is going to introduce
>> > such a free, but I'm not sure it's complete.
>>
>> Is it okay if it is done in patch 2?
>
> I don't think it's the end of the world to do it in patch 2, as long as
> we end up in a good spot. But IMHO it's really hard for reviewers to
> understand what is going on, because it's intermingled with so many
> other changes. It would be much easier to read if we had a preparatory
> patch that switched the memory ownership of the field, and then built on
> top of that.

Sounds good. I’ll do that.

> But I recognize that sometimes that's hard to do, because the state is
> so tangled that the functional change is what untangles it. I'm not sure
> if that's the case here or not; you'd probably have a better idea as
> somebody who looked carefully at it recently.

Seems doable in this case.

By the way. I pretty much just elbowed in the changes I needed (like in
`revision.h`) in order to add this per-patch/cover letter headers
variable. Let me know if there are better ways to do it.

>> > It frees the old extra_headers before reassigning it, but nobody
>> > cleans it up after handling the final commit.
>>
>> I didn’t get any leak errors from the CI. `extra_headers` in `show_log`
>> is populated by calling `log_write_email_headers`. Then later it is
>> assigned to
>>
>>     ctx.after_subject = extra_headers;
>>
>> Then `ctx.after_subject is freed later
>>
>>     free((char *)ctx.after_subject);
>>
>> Am I missing something?
>
> Ah, I see. I was confused by looking for a free of an extra_headers
> field. We have rev_info.extra_headers, and that is _not_ owned by
> rev_info. We used to assign that to a variable in
> log_write_email_headers(), but now we actually make a copy of it. And so
> the copy is freed in that function when we replace it with a version
> containing extra mime headers here:
>
> [snip]
>
> But the actual ownership is passed out via the extra_headers_p variable,
> and that is what is assigned to ctx.after_subject (which now takes
> ownership).
>
> I think in the snippet I quoted above that extra_headers could never be
> NULL now, right? We'll always return at least an empty string. But
> moreover, we are formatting it into a strbuf, only to potentially copy
> it it another strbuf. Couldn't we just do it all in one strbuf?
>
> Something like this:
>
> [snip]
>
>
> The resulting code is shorter and (IMHO) easier to understand. It
> avoids an extra allocation and copy when using mime. It also avoids the
> allocation of an empty string when opt->extra_headers and
> opt->pe_headers are both NULL. It does make an extra copy when
> extra_headers is non-NULL but pe_headers is NULL (and you're not using
> MIME), as we could just use opt->extra_headers as-is, then. But since
> the caller needs to take ownership, we can't avoid that copy.
>
> I think you could even do this cleanup before adding pe_headers,
> especially if it was coupled with cleaning up the memory ownership
> issues.
>
> -Peff

I haven’t tried yet but this seems like a good plan. It was getting a
getting a bit too back and forth with my changes. So I’ll try to use
your patch and see if I can get a clean preparatory patch/commit before
the main change.

Cheers

-- 
Kristoffer Haugsbakk


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v2 0/3] format-patch: teach `--header-cmd`
  2024-03-07 19:59 [PATCH 0/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
                   ` (2 preceding siblings ...)
  2024-03-07 19:59 ` [PATCH 3/3] format-patch: check if header output looks valid Kristoffer Haugsbakk
@ 2024-03-19 18:35 ` Kristoffer Haugsbakk
  2024-03-19 18:35   ` [PATCH v2 1/3] revision: add a per-email field to rev-info Kristoffer Haugsbakk
                     ` (2 more replies)
  3 siblings, 3 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-19 18:35 UTC (permalink / raw)
  To: git
  Cc: Kristoffer Haugsbakk, Jeff King, Maxim Cournoyer, Jean-Noël Avila

(most of this is from the main commit/patch with some elaboration in
parts)

Teach git-format-patch(1) `--header-cmd` (with negation) and the
accompanying config variable `format.headerCmd` which allows the user to
add extra headers per-patch.

§ Motivation

format-patch knows `--add-header`. However, that seems most useful for
series-wide headers; you cannot really control what the header is like
per patch or specifically for the cover letter. To that end, teach
format-patch a new option which runs a command that has access to the
hash of the current commit (if it is a code patch) and the patch count
which is used for the patch files that this command outputs. Also
include an environment variable which tells the version of this API so
that the command can detect and error out in case the API changes.

This is inspired by `--header-cmd` of git-send-email(1).[1]

🔗 1: https://lore.kernel.org/git/20230423122744.4865-1-maxim.cournoyer@gmail.com/

§ Discussion

One can already get the commit hash from the first line of the patch file:

    From 04967d53399ed2db09d224008f557ec1a5847cc1 Mon Sep 17 00:00:00 2001

However, the point of this option is to be able to calculate header
output based primarily on the commit hash without having to go back and
forth between commands (i.e. running format-patch and then parsing the
patch files).

As an example, the command could output a header for the current commit
as well as another header for the previously-published commits:

    X-Commit-Hash: 97b53c04894578b23d0c650f69885f734699afc7
    X-Previous-Commits:
        4ad5d4190649dcb5f26c73a6f15ab731891b9dfd
        d275d1d179b90592ddd7b5da2ae4573b3f7a37b7
        402b7937951073466bf4527caffd38175391c7da

One can imagine that (1) these previous commit hashes were stored on every
commit rewrite and (2) the commits that had been published previously
were also stored. Then the command just needed the current commit hash
in order to look up this information.

Now interested parties can use this information to track where the
patches come from.

This information could of course be given between the three-dash/three-
hyphen line and the patch proper. However, the hypothetical project in
question might prefer to use this part for extra patch information
written by the author and leave the above information for tooling; this
way the extra information does not need to disturb the reader.

§ Demonstration

The above current/previous hash example is taken from:

https://lore.kernel.org/git/97b53c04894578b23d0c650f69885f734699afc7.1709670287.git.code@khaugsbakk.name/

§ Changes in v2

Changes based on feedback from Jean-Noël and Peff.

Peff suggested changes to the strbuf handling in `log_tree`, removing
constness in order to facilitate freeing without casting, and fleshing
out the preliminary patch.

In detail:

• revision: add a per-email field to rev-info
  • Replaces (subject) “log-tree: take ownership of pointer”
  • Separate out changes for `rev.info.pe_headers`, changes to
    ownership, const, and strbuf handling in `log_tree`
  • Link: https://lore.kernel.org/git/20240313065454.GB125150@coredump.intra.peff.net/
• format-patch: teach `--header-cmd`
  • Simplify tests: use `true` and `false` as commands
  • Fix indentation in code
  • Use AsciiDoc definition list
    • Link: https://lore.kernel.org/git/53ea3745-205b-40c0-a4c5-9be26d9b88bf@gmail.com/
• format-patch: check if header output looks valid
  • (no changes)

§ CC

Cc: Jeff King <peff@peff.net>
Cc: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Cc: "Jean-Noël Avila" <avila.jn@gmail.com>

§ CI

https://github.com/LemmingAvalanche/git/actions/runs/8332742200/job/22802597793

Fails but looks similar to the failures for `master` at 3bd955d2691 (The
ninth batch, 2024-03-18):

https://github.com/git/git/actions/runs/8333455189/job/22804953254

Kristoffer Haugsbakk (3):
  revision: add a per-email field to rev-info
  format-patch: teach `--header-cmd`
  format-patch: check if header output looks valid

 Documentation/config/format.txt    |  5 ++
 Documentation/git-format-patch.txt | 23 +++++++++
 builtin/log.c                      | 76 ++++++++++++++++++++++++++++++
 log-tree.c                         | 21 +++++----
 log-tree.h                         |  2 +-
 pretty.h                           |  2 +-
 revision.h                         |  4 +-
 t/t4014-format-patch.sh            | 49 +++++++++++++++++++
 8 files changed, 169 insertions(+), 13 deletions(-)

Range-diff against v1:
1:  3b12a8cf393 < -:  ----------- log-tree: take ownership of pointer
-:  ----------- > 1:  9a7102b708e revision: add a per-email field to rev-info
2:  f405a0140b5 ! 2:  8c511797a47 format-patch: teach `--header-cmd`
    @@ Commit message
     
     
      ## Notes (series) ##
    -    Documentation/config/format.txt:
    -    • I get the impression that `_` is the convention for placeholders now:
    -      `_<cmd>_`
    +    v2:
    +    • Simplify tests: use `true` and `false` as commands
    +    • Fix indentation
    +    • Don’t use `const` for owned pointer (avoid cast when freeing)
    +      • Link: https://lore.kernel.org/git/cover.1709841147.git.code@khaugsbakk.name/T/#m12d104a5a769c7f6e02b1d0a75855142004e9206
    +    • Use AsciiDoc definition list
    +      • Link: https://lore.kernel.org/git/53ea3745-205b-40c0-a4c5-9be26d9b88bf@gmail.com/
     
      ## Documentation/config/format.txt ##
     @@ Documentation/config/format.txt: format.headers::
    @@ Documentation/git-format-patch.txt: feeding the result to `git send-email`.
     ++
     +_<cmd>_ has access to these environment variables:
     ++
    -+	GIT_FP_HEADER_CMD_VERSION
    -++
    -+The version of this API. Currently `1`. _<cmd>_ may return exit code
    -+`2` in order to signal that it does not support the given version.
    -++
    -+	GIT_FP_HEADER_CMD_HASH
    -++
    -+The hash of the commit corresponding to the current patch. Not set if
    -+the current patch is the cover letter.
    -++
    -+	GIT_FP_HEADER_CMD_COUNT
    -++
    -+The current patch count. Increments for each patch.
    ++--
    ++GIT_FP_HEADER_CMD_VERSION;;
    ++	The version of this API. Currently `1`. _<cmd>_ may return exit code
    ++	`2` in order to signal that it does not support the given version.
    ++GIT_FP_HEADER_CMD_HASH;;
    ++	The hash of the commit corresponding to the current patch. Not set if
    ++	the current patch is the cover letter.
    ++GIT_FP_HEADER_CMD_COUNT;;
    ++	The current patch count. Increments for each patch.
    ++--
     ++
     +`git format-patch` will error out if _<cmd>_ returns a non-zero exit
     +code.
    @@ builtin/log.c: static void make_cover_letter(struct rev_info *rev, int use_separ
      		show_range_diff(rev->rdiff1, rev->rdiff2, &range_diff_opts);
      		strvec_clear(&other_arg);
      	}
    -+	free((char *)pp.after_subject);
    ++	free(pp.after_subject);
      }
      
      static char *clean_message_id(const char *msg_id)
    @@ builtin/log.c: int cmd_format_patch(int argc, const char **argv, const char *pre
      		/* interdiff/range-diff in cover-letter; omit from patches */
      		rev.idiff_oid1 = NULL;
      		rev.rdiff1 = NULL;
    -+		free((char *)rev.pe_headers);
    ++		free(rev.pe_headers);
      	}
      	rev.add_signoff = do_signoff;
      
    @@ builtin/log.c: int cmd_format_patch(int argc, const char **argv, const char *pre
      		}
      		if (output_directory)
      			fclose(rev.diffopt.file);
    -+		free((char *)rev.pe_headers);
    ++		free(rev.pe_headers);
      	}
      	stop_progress(&progress);
      	free(list);
     
    - ## log-tree.c ##
    -@@ log-tree.c: void fmt_output_email_subject(struct strbuf *sb, struct rev_info *opt)
    - 	}
    - }
    - 
    -+static char *extra_and_pe_headers(const char *extra_headers, const char *pe_headers) {
    -+	struct strbuf all_headers = STRBUF_INIT;
    -+
    -+	if (extra_headers)
    -+		strbuf_addstr(&all_headers, extra_headers);
    -+	if (pe_headers) {
    -+		strbuf_addstr(&all_headers, pe_headers);
    -+	}
    -+	return strbuf_detach(&all_headers, NULL);
    -+}
    -+
    - void log_write_email_headers(struct rev_info *opt, struct commit *commit,
    - 			     const char **extra_headers_p,
    - 			     int *need_8bit_cte_p,
    - 			     int maybe_multipart)
    - {
    --	const char *extra_headers = opt->extra_headers;
    -+	const char *extra_headers =
    -+		extra_and_pe_headers(opt->extra_headers, opt->pe_headers);
    - 	const char *name = oid_to_hex(opt->zero_commit ?
    - 				      null_oid() : &commit->object.oid);
    - 
    -@@ log-tree.c: void log_write_email_headers(struct rev_info *opt, struct commit *commit,
    - 			 extra_headers ? extra_headers : "",
    - 			 mime_boundary_leader, opt->mime_boundary,
    - 			 mime_boundary_leader, opt->mime_boundary);
    -+		free((char *)extra_headers);
    - 		extra_headers = strbuf_detach(&subject_buffer, NULL);
    - 
    - 		if (opt->numbered_files)
    -@@ log-tree.c: void show_log(struct rev_info *opt)
    - 
    - 	strbuf_release(&msgbuf);
    - 	free(ctx.notes_message);
    -+	free((char *)ctx.after_subject);
    - 
    - 	if (cmit_fmt_is_mail(ctx.fmt) && opt->idiff_oid1) {
    - 		struct diff_queue_struct dq;
    -
    - ## revision.h ##
    -@@ revision.h: struct rev_info {
    - 	struct string_list *ref_message_ids;
    - 	int		add_signoff;
    - 	const char	*extra_headers;
    -+	/* per-email headers */
    -+	const char	*pe_headers;
    - 	const char	*log_reencode;
    - 	const char	*subject_prefix;
    - 	int		patch_name_max;
    -
      ## t/t4014-format-patch.sh ##
     @@ t/t4014-format-patch.sh: test_expect_failure 'configuration To: header (rfc2047)' '
      	grep "^To: =?UTF-8?q?R=20=C3=84=20Cipient?= <rcipient@example.com>\$" hdrs9
    @@ t/t4014-format-patch.sh: test_expect_failure 'configuration To: header (rfc2047)
     +'
     +
     +test_expect_success '--header-cmd with no output works' '
    -+	write_script cmd <<-\EOF &&
    -+	exit 0
    -+	EOF
    -+	git format-patch --header-cmd=./cmd --stdout main..side
    ++	git format-patch --header-cmd=true --stdout main..side
     +'
     +
     +test_expect_success '--header-cmd reports failed command' '
    -+	write_script cmd <<-\EOF &&
    -+	exit 1
    -+	EOF
    -+		cat > expect <<-\EOF &&
    -+	fatal: header-cmd ./cmd: failed with exit code 1
    ++	cat > expect <<-\EOF &&
    ++	fatal: header-cmd false: failed with exit code 1
     +	EOF
    -+	test_must_fail git format-patch --header-cmd=./cmd --stdout main..side >actual 2>&1 &&
    ++	test_must_fail git format-patch --header-cmd=false --stdout main..side >actual 2>&1 &&
     +	test_cmp expect actual
     +'
     +
3:  0e8409227e4 ! 3:  c570467c8db format-patch: check if header output looks valid
    @@ builtin/log.c: static char *header_cmd_output(struct rev_info *rev, const struct
     
      ## t/t4014-format-patch.sh ##
     @@ t/t4014-format-patch.sh: test_expect_success '--header-cmd with no output works' '
    - 	git format-patch --header-cmd=./cmd --stdout main..side
    + 	git format-patch --header-cmd=true --stdout main..side
      '
      
     +test_expect_success '--header-cmd without headers-like output fails' '
    @@ t/t4014-format-patch.sh: test_expect_success '--header-cmd with no output works'
     +'
     +
      test_expect_success '--header-cmd reports failed command' '
    - 	write_script cmd <<-\EOF &&
    - 	exit 1
    + 	cat > expect <<-\EOF &&
    + 	fatal: header-cmd false: failed with exit code 1
-- 
2.44.0.144.g29ae9861142


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v2 1/3] revision: add a per-email field to rev-info
  2024-03-19 18:35 ` [PATCH v2 0/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
@ 2024-03-19 18:35   ` Kristoffer Haugsbakk
  2024-03-19 21:29     ` Jeff King
  2024-03-19 18:35   ` [PATCH v2 2/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
  2024-03-19 18:35   ` [PATCH v2 3/3] format-patch: check if header output looks valid Kristoffer Haugsbakk
  2 siblings, 1 reply; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-19 18:35 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk, Jeff King

Add `pe_header` to `rev_info` to store per-email headers.

The next commit will add an option to `format-patch` which will allow
the user to store headers per-email; a complement to options like
`--add-header`.

To make this possible we need a new field to store these headers. We
also need to take ownership of `extra_headers_p` in
`log_write_email_headers`; facilitate this by removing constness from
the relevant pointers.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
---

Notes (series):
    v2:
    • Replaces “log-tree: take ownership of pointer”
      • Link: https://lore.kernel.org/git/3b12a8cf393b6d8f0877fd7d87173c565d7d5a90.1709841147.git.code@khaugsbakk.name/
    • More preliminary work
      • Link: https://lore.kernel.org/git/20240313065454.GB125150@coredump.intra.peff.net/

 log-tree.c | 21 +++++++++++----------
 log-tree.h |  2 +-
 pretty.h   |  2 +-
 revision.h |  4 +++-
 4 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/log-tree.c b/log-tree.c
index e5438b029d9..f6cdde6e8f3 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -470,16 +470,21 @@ void fmt_output_email_subject(struct strbuf *sb, struct rev_info *opt)
 }
 
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
-			     const char **extra_headers_p,
+			     char **extra_headers_p,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart)
 {
-	const char *extra_headers = opt->extra_headers;
+	struct strbuf headers = STRBUF_INIT;
 	const char *name = oid_to_hex(opt->zero_commit ?
 				      null_oid() : &commit->object.oid);
 
 	*need_8bit_cte_p = 0; /* unknown */
 
+	if (opt->extra_headers)
+		strbuf_addstr(&headers, opt->extra_headers);
+	if (opt->pe_headers)
+		strbuf_addstr(&headers, opt->pe_headers);
+
 	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);
 	graph_show_oneline(opt->graph);
 	if (opt->message_id) {
@@ -496,16 +501,13 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		graph_show_oneline(opt->graph);
 	}
 	if (opt->mime_boundary && maybe_multipart) {
-		static struct strbuf subject_buffer = STRBUF_INIT;
 		static struct strbuf buffer = STRBUF_INIT;
 		struct strbuf filename =  STRBUF_INIT;
 		*need_8bit_cte_p = -1; /* NEVER */
 
-		strbuf_reset(&subject_buffer);
 		strbuf_reset(&buffer);
 
-		strbuf_addf(&subject_buffer,
-			 "%s"
+		strbuf_addf(&headers,
 			 "MIME-Version: 1.0\n"
 			 "Content-Type: multipart/mixed;"
 			 " boundary=\"%s%s\"\n"
@@ -516,10 +518,8 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			 "Content-Type: text/plain; "
 			 "charset=UTF-8; format=fixed\n"
 			 "Content-Transfer-Encoding: 8bit\n\n",
-			 extra_headers ? extra_headers : "",
 			 mime_boundary_leader, opt->mime_boundary,
 			 mime_boundary_leader, opt->mime_boundary);
-		extra_headers = subject_buffer.buf;
 
 		if (opt->numbered_files)
 			strbuf_addf(&filename, "%d", opt->nr);
@@ -539,7 +539,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		opt->diffopt.stat_sep = buffer.buf;
 		strbuf_release(&filename);
 	}
-	*extra_headers_p = extra_headers;
+	*extra_headers_p = headers.len ? strbuf_detach(&headers, NULL) : NULL;
 }
 
 static void show_sig_lines(struct rev_info *opt, int status, const char *bol)
@@ -678,7 +678,7 @@ void show_log(struct rev_info *opt)
 	struct log_info *log = opt->loginfo;
 	struct commit *commit = log->commit, *parent = log->parent;
 	int abbrev_commit = opt->abbrev_commit ? opt->abbrev : the_hash_algo->hexsz;
-	const char *extra_headers = opt->extra_headers;
+	char *extra_headers = opt->extra_headers;
 	struct pretty_print_context ctx = {0};
 
 	opt->loginfo = NULL;
@@ -857,6 +857,7 @@ void show_log(struct rev_info *opt)
 
 	strbuf_release(&msgbuf);
 	free(ctx.notes_message);
+	free(ctx.after_subject);
 
 	if (cmit_fmt_is_mail(ctx.fmt) && opt->idiff_oid1) {
 		struct diff_queue_struct dq;
diff --git a/log-tree.h b/log-tree.h
index 41c776fea52..94978e2c838 100644
--- a/log-tree.h
+++ b/log-tree.h
@@ -29,7 +29,7 @@ void format_decorations(struct strbuf *sb, const struct commit *commit,
 			int use_color, const struct decoration_options *opts);
 void show_decorations(struct rev_info *opt, struct commit *commit);
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
-			     const char **extra_headers_p,
+			     char **extra_headers_p,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart);
 void load_ref_decorations(struct decoration_filter *filter, int flags);
diff --git a/pretty.h b/pretty.h
index 421209e9ec2..bdce3191875 100644
--- a/pretty.h
+++ b/pretty.h
@@ -35,7 +35,7 @@ struct pretty_print_context {
 	 */
 	enum cmit_fmt fmt;
 	int abbrev;
-	const char *after_subject;
+	char *after_subject;
 	int preserve_subject;
 	struct date_mode date_mode;
 	unsigned date_mode_explicit:1;
diff --git a/revision.h b/revision.h
index 94c43138bc3..95e92397a7a 100644
--- a/revision.h
+++ b/revision.h
@@ -290,7 +290,9 @@ struct rev_info {
 	struct ident_split from_ident;
 	struct string_list *ref_message_ids;
 	int		add_signoff;
-	const char	*extra_headers;
+	char		*extra_headers;
+	/* per-email headers */
+	char		*pe_headers;
 	const char	*log_reencode;
 	const char	*subject_prefix;
 	int		patch_name_max;
-- 
2.44.0.144.g29ae9861142


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 2/3] format-patch: teach `--header-cmd`
  2024-03-19 18:35 ` [PATCH v2 0/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
  2024-03-19 18:35   ` [PATCH v2 1/3] revision: add a per-email field to rev-info Kristoffer Haugsbakk
@ 2024-03-19 18:35   ` Kristoffer Haugsbakk
  2024-03-19 18:35   ` [PATCH v2 3/3] format-patch: check if header output looks valid Kristoffer Haugsbakk
  2 siblings, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-19 18:35 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk

Teach git-format-patch(1) `--header-cmd` (with negation) and the
accompanying config variable `format.headerCmd` which allows the user to
add extra headers per-patch.

format-patch knows `--add-header`. However, that seems most useful for
series-wide headers; you cannot really control what the header is like
per patch or specifically for the cover letter. To that end, teach
format-patch a new option which runs a command that has access to the
hash of the current commit (if it is a code patch) and the patch count
which is used for the patch files that this command outputs. Also
include an environment variable which tells the version of this API so
that the command can detect and error out in case the API changes.

This is inspired by `--header-cmd` of git-send-email(1).

§ Discussion

The command can use the provided commit hash to provide relevant
information in the header. For example, the command could output a
header for the current commit as well as the previously-published
commits:

    X-Commit-Hash: 97b53c04894578b23d0c650f69885f734699afc7
    X-Previous-Commits:
        4ad5d4190649dcb5f26c73a6f15ab731891b9dfd
        d275d1d179b90592ddd7b5da2ae4573b3f7a37b7
        402b7937951073466bf4527caffd38175391c7da

Now interested parties can use this information to track where the
patches come from.

This information could of course be given between the
three-dash/three-hyphen line and the patch proper. However, the project
might prefer to use this part for extra patch information written by the
author and leave the above information for tooling; this way the extra
information does not need to disturb the reader.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
---

Notes (series):
    v2:
    • Simplify tests: use `true` and `false` as commands
    • Fix indentation
    • Don’t use `const` for owned pointer (avoid cast when freeing)
      • Link: https://lore.kernel.org/git/cover.1709841147.git.code@khaugsbakk.name/T/#m12d104a5a769c7f6e02b1d0a75855142004e9206
    • Use AsciiDoc definition list
      • Link: https://lore.kernel.org/git/53ea3745-205b-40c0-a4c5-9be26d9b88bf@gmail.com/

 Documentation/config/format.txt    |  5 ++++
 Documentation/git-format-patch.txt | 23 ++++++++++++++++
 builtin/log.c                      | 43 ++++++++++++++++++++++++++++++
 t/t4014-format-patch.sh            | 36 +++++++++++++++++++++++++
 4 files changed, 107 insertions(+)

diff --git a/Documentation/config/format.txt b/Documentation/config/format.txt
index 7410e930e53..c184b865824 100644
--- a/Documentation/config/format.txt
+++ b/Documentation/config/format.txt
@@ -31,6 +31,11 @@ format.headers::
 	Additional email headers to include in a patch to be submitted
 	by mail.  See linkgit:git-format-patch[1].
 
+format.headerCmd::
+	Command to run for each patch that should output RFC 2822 email
+	headers. Has access to some information per patch via
+	environment variables. See linkgit:git-format-patch[1].
+
 format.to::
 format.cc::
 	Additional recipients to include in a patch to be submitted
diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt
index 728bb3821c1..4f87fd25db9 100644
--- a/Documentation/git-format-patch.txt
+++ b/Documentation/git-format-patch.txt
@@ -303,6 +303,29 @@ feeding the result to `git send-email`.
 	`Cc:`, and custom) headers added so far from config or command
 	line.
 
+--[no-]header-cmd=<cmd>::
+	Run _<cmd>_ for each patch. _<cmd>_ should output valid RFC 2822
+	email headers. This can also be configured with
+	the configuration variable `format.headerCmd`. Can be turned off
+	with `--no-header-cmd`. This works independently of
+	`--[no-]add-header`.
++
+_<cmd>_ has access to these environment variables:
++
+--
+GIT_FP_HEADER_CMD_VERSION;;
+	The version of this API. Currently `1`. _<cmd>_ may return exit code
+	`2` in order to signal that it does not support the given version.
+GIT_FP_HEADER_CMD_HASH;;
+	The hash of the commit corresponding to the current patch. Not set if
+	the current patch is the cover letter.
+GIT_FP_HEADER_CMD_COUNT;;
+	The current patch count. Increments for each patch.
+--
++
+`git format-patch` will error out if _<cmd>_ returns a non-zero exit
+code.
+
 --[no-]cover-letter::
 	In addition to the patches, generate a cover letter file
 	containing the branch description, shortlog and the overall diffstat.  You can
diff --git a/builtin/log.c b/builtin/log.c
index e5da0d10434..bc656b5e0f8 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -43,10 +43,13 @@
 #include "tmp-objdir.h"
 #include "tree.h"
 #include "write-or-die.h"
+#include "run-command.h"
 
 #define MAIL_DEFAULT_WRAP 72
 #define COVER_FROM_AUTO_MAX_SUBJECT_LEN 100
 #define FORMAT_PATCH_NAME_MAX_DEFAULT 64
+#define HC_VERSION "1"
+#define HC_NOT_SUPPORTED 2
 
 /* Set a default date-time format for git log ("log.date" config variable) */
 static const char *default_date_mode = NULL;
@@ -902,6 +905,7 @@ static int auto_number = 1;
 
 static char *default_attach = NULL;
 
+static const char *header_cmd = NULL;
 static struct string_list extra_hdr = STRING_LIST_INIT_NODUP;
 static struct string_list extra_to = STRING_LIST_INIT_NODUP;
 static struct string_list extra_cc = STRING_LIST_INIT_NODUP;
@@ -1100,6 +1104,8 @@ static int git_format_config(const char *var, const char *value,
 		format_no_prefix = 1;
 		return 0;
 	}
+	if (!strcmp(var, "format.headercmd"))
+		return git_config_string(&header_cmd, var, value);
 
 	/*
 	 * ignore some porcelain config which would otherwise be parsed by
@@ -1419,6 +1425,7 @@ static void make_cover_letter(struct rev_info *rev, int use_separate_file,
 		show_range_diff(rev->rdiff1, rev->rdiff2, &range_diff_opts);
 		strvec_clear(&other_arg);
 	}
+	free(pp.after_subject);
 }
 
 static char *clean_message_id(const char *msg_id)
@@ -1869,6 +1876,35 @@ static void infer_range_diff_ranges(struct strbuf *r1,
 	}
 }
 
+/* Returns an owned pointer */
+static char *header_cmd_output(struct rev_info *rev, const struct commit *cmit)
+{
+	struct child_process header_cmd_proc = CHILD_PROCESS_INIT;
+	struct strbuf output = STRBUF_INIT;
+	int res;
+
+	strvec_pushl(&header_cmd_proc.args, header_cmd, NULL);
+	if (cmit)
+		strvec_pushf(&header_cmd_proc.env, "GIT_FP_HEADER_CMD_HASH=%s",
+			     oid_to_hex(&cmit->object.oid));
+	strvec_pushl(&header_cmd_proc.env,
+		     "GIT_FP_HEADER_CMD_VERSION=" HC_VERSION, NULL);
+	strvec_pushf(&header_cmd_proc.env, "GIT_FP_HEADER_CMD_COUNT=%" PRIuMAX,
+		     (uintmax_t)rev->nr);
+	res = capture_command(&header_cmd_proc, &output, 0);
+	if (res) {
+		if (res == HC_NOT_SUPPORTED)
+			die(_("header-cmd %s: returned exit "
+			      "code %d; the command does not support "
+			      "version " HC_VERSION),
+			    header_cmd, HC_NOT_SUPPORTED);
+		else
+			die(_("header-cmd %s: failed with exit code %d"),
+			    header_cmd, res);
+	}
+	return strbuf_detach(&output, NULL);
+}
+
 int cmd_format_patch(int argc, const char **argv, const char *prefix)
 {
 	struct commit *commit;
@@ -1959,6 +1995,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		OPT_GROUP(N_("Messaging")),
 		OPT_CALLBACK(0, "add-header", NULL, N_("header"),
 			    N_("add email header"), header_callback),
+		OPT_STRING(0, "header-cmd", &header_cmd, N_("email"), N_("command that will be run to generate headers")),
 		OPT_STRING_LIST(0, "to", &extra_to, N_("email"), N_("add To: header")),
 		OPT_STRING_LIST(0, "cc", &extra_cc, N_("email"), N_("add Cc: header")),
 		OPT_CALLBACK_F(0, "from", &from, N_("ident"),
@@ -2325,6 +2362,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	if (cover_letter) {
 		if (thread)
 			gen_message_id(&rev, "cover");
+		if (header_cmd)
+			rev.pe_headers = header_cmd_output(&rev, NULL);
 		make_cover_letter(&rev, !!output_directory,
 				  origin, nr, list, description_file, branch_name, quiet);
 		print_bases(&bases, rev.diffopt.file);
@@ -2334,6 +2373,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		/* interdiff/range-diff in cover-letter; omit from patches */
 		rev.idiff_oid1 = NULL;
 		rev.rdiff1 = NULL;
+		free(rev.pe_headers);
 	}
 	rev.add_signoff = do_signoff;
 
@@ -2380,6 +2420,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 			gen_message_id(&rev, oid_to_hex(&commit->object.oid));
 		}
 
+		if (header_cmd)
+			rev.pe_headers = header_cmd_output(&rev, commit);
 		if (output_directory &&
 		    open_next_file(rev.numbered_files ? NULL : commit, NULL, &rev, quiet))
 			die(_("failed to create output files"));
@@ -2406,6 +2448,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		}
 		if (output_directory)
 			fclose(rev.diffopt.file);
+		free(rev.pe_headers);
 	}
 	stop_progress(&progress);
 	free(list);
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index e37a1411ee2..dc85c4c28fe 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -238,6 +238,42 @@ test_expect_failure 'configuration To: header (rfc2047)' '
 	grep "^To: =?UTF-8?q?R=20=C3=84=20Cipient?= <rcipient@example.com>\$" hdrs9
 '
 
+test_expect_success '--header-cmd' '
+	write_script cmd <<-\EOF &&
+	printf "X-S: $GIT_FP_HEADER_CMD_HASH\n"
+	printf "X-V: $GIT_FP_HEADER_CMD_VERSION\n"
+	printf "X-C: $GIT_FP_HEADER_CMD_COUNT\n"
+	EOF
+	expect_sha1=$(git rev-parse side) &&
+	git format-patch --header-cmd=./cmd --stdout main..side >patch &&
+	grep "^X-S: $expect_sha1" patch &&
+	grep "^X-V: 1" patch &&
+	grep "^X-C: 3" patch
+'
+
+test_expect_success '--header-cmd with no output works' '
+	git format-patch --header-cmd=true --stdout main..side
+'
+
+test_expect_success '--header-cmd reports failed command' '
+	cat > expect <<-\EOF &&
+	fatal: header-cmd false: failed with exit code 1
+	EOF
+	test_must_fail git format-patch --header-cmd=false --stdout main..side >actual 2>&1 &&
+	test_cmp expect actual
+'
+
+test_expect_success '--header-cmd reports exit code 2' '
+	write_script cmd <<-\EOF &&
+	exit 2
+	EOF
+	cat > expect <<-\EOF &&
+	fatal: header-cmd ./cmd: returned exit code 2; the command does not support version 1
+	EOF
+	test_must_fail git format-patch --header-cmd=./cmd --stdout main..side >actual 2>&1 &&
+	test_cmp expect actual
+'
+
 # check_patch <patch>: Verify that <patch> looks like a half-sane
 # patch email to avoid a false positive with !grep
 check_patch () {
-- 
2.44.0.144.g29ae9861142


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 3/3] format-patch: check if header output looks valid
  2024-03-19 18:35 ` [PATCH v2 0/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
  2024-03-19 18:35   ` [PATCH v2 1/3] revision: add a per-email field to rev-info Kristoffer Haugsbakk
  2024-03-19 18:35   ` [PATCH v2 2/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
@ 2024-03-19 18:35   ` Kristoffer Haugsbakk
  2 siblings, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-19 18:35 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk

Implement a function based on `mailinfo.c:is_mail`.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
---

Notes (series):
    Isolating this for review as its own commit so that I can point out the
    provenance. May well be squashed into the main patch eventually.

 builtin/log.c           | 33 +++++++++++++++++++++++++++++++++
 t/t4014-format-patch.sh | 13 +++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/builtin/log.c b/builtin/log.c
index bc656b5e0f8..2902c2bf6fe 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1876,6 +1876,35 @@ static void infer_range_diff_ranges(struct strbuf *r1,
 	}
 }
 
+static int is_mail(struct strbuf *sb)
+{
+	const char *header_regex = "^[!-9;-~]+:";
+	regex_t regex;
+	int ret = 1, i;
+	struct string_list list = STRING_LIST_INIT_DUP;
+
+	if (regcomp(&regex, header_regex, REG_NOSUB | REG_EXTENDED))
+		die("invalid pattern: %s", header_regex);
+	string_list_split(&list, sb->buf, '\n', -1);
+	for (i = 0; i < list.nr; i++) {
+		/* End of header */
+		if (!*list.items[i].string && i == (list.nr - 1))
+			break;
+		/* Ignore indented folded lines */
+		if (*list.items[i].string == '\t' ||
+		    *list.items[i].string == ' ')
+			continue;
+		/* It's a header if it matches header_regex */
+		if (regexec(&regex, list.items[i].string, 0, NULL, 0)) {
+			ret = 0;
+			break;
+		}
+	}
+	string_list_clear(&list, 1);
+	regfree(&regex);
+	return ret;
+}
+
 /* Returns an owned pointer */
 static char *header_cmd_output(struct rev_info *rev, const struct commit *cmit)
 {
@@ -1902,6 +1931,10 @@ static char *header_cmd_output(struct rev_info *rev, const struct commit *cmit)
 			die(_("header-cmd %s: failed with exit code %d"),
 			    header_cmd, res);
 	}
+	if (!is_mail(&output))
+		die(_("header-cmd %s: returned output which was "
+		      "not recognized as valid RFC 2822 headers"),
+		    header_cmd);
 	return strbuf_detach(&output, NULL);
 }
 
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index dc85c4c28fe..533a5b246e5 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -255,6 +255,19 @@ test_expect_success '--header-cmd with no output works' '
 	git format-patch --header-cmd=true --stdout main..side
 '
 
+test_expect_success '--header-cmd without headers-like output fails' '
+	write_script cmd <<-\EOF &&
+	printf "X-S: $GIT_FP_HEADER_CMD_HASH\n"
+	printf "\n"
+	printf "X-C: $GIT_FP_HEADER_CMD_COUNT\n"
+	EOF
+	cat > expect <<-\EOF &&
+	fatal: header-cmd ./cmd: returned output which was not recognized as valid RFC 2822 headers
+	EOF
+	test_must_fail git format-patch --header-cmd=./cmd --stdout main..side >actual 2>&1 &&
+	test_cmp expect actual
+'
+
 test_expect_success '--header-cmd reports failed command' '
 	cat > expect <<-\EOF &&
 	fatal: header-cmd false: failed with exit code 1
-- 
2.44.0.144.g29ae9861142


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/3] revision: add a per-email field to rev-info
  2024-03-19 18:35   ` [PATCH v2 1/3] revision: add a per-email field to rev-info Kristoffer Haugsbakk
@ 2024-03-19 21:29     ` Jeff King
  2024-03-19 21:41       ` Kristoffer Haugsbakk
  2024-03-20  0:25       ` Jeff King
  0 siblings, 2 replies; 34+ messages in thread
From: Jeff King @ 2024-03-19 21:29 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git

On Tue, Mar 19, 2024 at 07:35:36PM +0100, Kristoffer Haugsbakk wrote:

> Add `pe_header` to `rev_info` to store per-email headers.

It is only just now that I realized that "pe" stands for per-email
(though to be fair I was not really focused on the intent of the series
when reading v1). Can we just call it per_email_headers or something?

> The next commit will add an option to `format-patch` which will allow
> the user to store headers per-email; a complement to options like
> `--add-header`.
> 
> To make this possible we need a new field to store these headers. We
> also need to take ownership of `extra_headers_p` in
> `log_write_email_headers`; facilitate this by removing constness from
> the relevant pointers.

There are three pointers at play here:

  - ctx.after_subject has its const removed, since it will now always be
    allocated by log_write_email_headers(), and then freed by the
    caller. Makes sense Though it looks like we only free in show_log(),
    and the free in make_cover_letter() is not added until patch 2?

  - rev_info.extra_headers has its const removed here, but I don't think
    that is helping anything. We only use it to write into the "headers"
    strbuf in log_write_email_headers(), which always returns
    headers.buf (or NULL).

  - rev.pe_headers is introduced as non-const because it is allocated
    and freed for each email. That makes some sense, though if we
    followed the pattern of rev.extra_headers, then the pointer is
    conceptually "const" within the rev_info struct, and it is the
    caller who keeps track of the allocation (using a to_free variable).
    Possibly we should do the same here?

I do still think this could be split in a more obvious way, leaving the
pe_headers bits until they are actually needed. Let me see if I can
sketch it up.

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/3] revision: add a per-email field to rev-info
  2024-03-19 21:29     ` Jeff King
@ 2024-03-19 21:41       ` Kristoffer Haugsbakk
  2024-03-20  0:25       ` Jeff King
  1 sibling, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-19 21:41 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Tue, Mar 19, 2024, at 22:29, Jeff King wrote:
> On Tue, Mar 19, 2024 at 07:35:36PM +0100, Kristoffer Haugsbakk wrote:
>
>> Add `pe_header` to `rev_info` to store per-email headers.
>
> It is only just now that I realized that "pe" stands for per-email
> (though to be fair I was not really focused on the intent of the series
> when reading v1). Can we just call it per_email_headers or something?

For sure.

>> The next commit will add an option to `format-patch` which will allow
>> the user to store headers per-email; a complement to options like
>> `--add-header`.
>>
>> To make this possible we need a new field to store these headers. We
>> also need to take ownership of `extra_headers_p` in
>> `log_write_email_headers`; facilitate this by removing constness from
>> the relevant pointers.
>
> There are three pointers at play here:
>
>   - ctx.after_subject has its const removed, since it will now always be
>     allocated by log_write_email_headers(), and then freed by the
>     caller. Makes sense Though it looks like we only free in show_log(),
>     and the free in make_cover_letter() is not added until patch 2?
>
>   - rev_info.extra_headers has its const removed here, but I don't think
>     that is helping anything. We only use it to write into the "headers"
>     strbuf in log_write_email_headers(), which always returns
>     headers.buf (or NULL).
>
>   - rev.pe_headers is introduced as non-const because it is allocated
>     and freed for each email. That makes some sense, though if we
>     followed the pattern of rev.extra_headers, then the pointer is
>     conceptually "const" within the rev_info struct, and it is the
>     caller who keeps track of the allocation (using a to_free variable).
>     Possibly we should do the same here?
>
> I do still think this could be split in a more obvious way, leaving the
> pe_headers bits until they are actually needed. Let me see if I can
> sketch it up.

Nice :)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/3] revision: add a per-email field to rev-info
  2024-03-19 21:29     ` Jeff King
  2024-03-19 21:41       ` Kristoffer Haugsbakk
@ 2024-03-20  0:25       ` Jeff King
  2024-03-20  0:27         ` [PATCH 1/6] shortlog: stop setting pp.print_email_subject Jeff King
                           ` (7 more replies)
  1 sibling, 8 replies; 34+ messages in thread
From: Jeff King @ 2024-03-20  0:25 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git

On Tue, Mar 19, 2024 at 05:29:40PM -0400, Jeff King wrote:

> There are three pointers at play here:
> 
>   - ctx.after_subject has its const removed, since it will now always be
>     allocated by log_write_email_headers(), and then freed by the
>     caller. Makes sense Though it looks like we only free in show_log(),
>     and the free in make_cover_letter() is not added until patch 2?
> 
>   - rev_info.extra_headers has its const removed here, but I don't think
>     that is helping anything. We only use it to write into the "headers"
>     strbuf in log_write_email_headers(), which always returns
>     headers.buf (or NULL).
> 
>   - rev.pe_headers is introduced as non-const because it is allocated
>     and freed for each email. That makes some sense, though if we
>     followed the pattern of rev.extra_headers, then the pointer is
>     conceptually "const" within the rev_info struct, and it is the
>     caller who keeps track of the allocation (using a to_free variable).
>     Possibly we should do the same here?
> 
> I do still think this could be split in a more obvious way, leaving the
> pe_headers bits until they are actually needed. Let me see if I can
> sketch it up.

OK, this rabbit hole went much deeper than I expected. ;)

I see why you wanted to drop the const from rev_info.extra_headers here.
We need the local extra_headers variable in show_log() to be non-const
(since it receives the output of log_write_email_headers). But we also
assign rev_info.extra_headers to that variable, and if it is const, the
compiler will complain.

But as it turns out, that assignment is not really necessary at all! It
is only used when you have extra headers along with a non-email format.
In most cases we simply ignore the headers for those formats, and in the
one case where we do respect them, I think it is doing the wrong thing.

So here are some patches which clean things up. They would make a
suitable base for your changes, I think, but IMHO they also stand on
their own as cleanups.

Having now stared at this code for a bit, I do think there's another,
much simpler option for your series: keep the same ugly static-strbuf
allocation pattern in log_write_email_headers(), but extend it further.
I'll show that in a moment, too.

  [1/6]: shortlog: stop setting pp.print_email_subject
  [2/6]: pretty: split oneline and email subject printing
  [3/6]: pretty: drop print_email_subject flag
  [4/6]: log: do not set up extra_headers for non-email formats
  [5/6]: format-patch: return an allocated string from log_write_email_headers()
  [6/6]: format-patch: simplify after-subject MIME header handling

 builtin/log.c      |  4 ++--
 builtin/rev-list.c |  1 +
 builtin/shortlog.c |  1 -
 log-tree.c         | 22 +++++++++-------------
 log-tree.h         |  2 +-
 pretty.c           | 43 ++++++++++++++++++++-----------------------
 pretty.h           | 11 +++++------
 7 files changed, 38 insertions(+), 46 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 1/6] shortlog: stop setting pp.print_email_subject
  2024-03-20  0:25       ` Jeff King
@ 2024-03-20  0:27         ` Jeff King
  2024-03-20  0:28         ` [PATCH 2/6] pretty: split oneline and email subject printing Jeff King
                           ` (6 subsequent siblings)
  7 siblings, 0 replies; 34+ messages in thread
From: Jeff King @ 2024-03-20  0:27 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git

When shortlog processes a commit using its internal traversal, it may
pretty-print the subject line for the summary view. When we do so, we
set the "print_email_subject" flag in the pretty-print context. But this
flag does nothing! Since we are using CMIT_FMT_USERFORMAT, we skip most
of the usual formatting code entirely.

This flag is there due to commit 6d167fd7cc (pretty: use
fmt_output_email_subject(), 2017-03-01). But that just switched us away
from setting an empty "subject" header field, which was similarly
useless. That was added by dd2e794a21 (Refactor pretty_print_commit
arguments into a struct, 2009-10-19). Before using the struct, we had to
pass _something_ as the argument, so we passed the empty string (a NULL
would have worked equally well).

So this setting has never done anything, and we can drop the line. That
shortens the code, but more importantly, makes it easier to reason about
and refactor the other users of this flag.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/shortlog.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index 1307ed2b88..3c7cd2d6ef 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -245,7 +245,6 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 
 	ctx.fmt = CMIT_FMT_USERFORMAT;
 	ctx.abbrev = log->abbrev;
-	ctx.print_email_subject = 1;
 	ctx.date_mode = log->date_mode;
 	ctx.output_encoding = get_log_output_encoding();
 
-- 
2.44.0.643.g35f318e502


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 2/6] pretty: split oneline and email subject printing
  2024-03-20  0:25       ` Jeff King
  2024-03-20  0:27         ` [PATCH 1/6] shortlog: stop setting pp.print_email_subject Jeff King
@ 2024-03-20  0:28         ` Jeff King
  2024-03-22 22:00           ` Kristoffer Haugsbakk
  2024-03-20  0:30         ` [PATCH 3/6] pretty: drop print_email_subject flag Jeff King
                           ` (5 subsequent siblings)
  7 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2024-03-20  0:28 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git

The pp_title_line() function is used for two formats: the oneline format
and the subject line of the email format. But most of the logic in the
function does not make any sense for oneline; it is about special
formatting of email headers.

Lumping the two formats together made sense long ago in 4234a76167
(Extend --pretty=oneline to cover the first paragraph, 2007-06-11), when
there was a lot of manual logic to paste lines together. But later,
88c44735ab (pretty: factor out format_subject(), 2008-12-27) pulled that
logic into its own function.

We can implement the oneline format by just calling that one function.
This makes the intention of the code much more clear, as we know we only
need to worry about those extra email options when dealing with actual
email.

While the intent here is cleanup, it is possible to trigger these cases
in practice by running format-patch with an explicit --oneline option.
But if you did, the results are basically nonsense. For example, with
the preserve_subject flag:

  $ printf "%s\n" one two three | git commit --allow-empty -F -
  $ git format-patch -1 --stdout -k | grep ^Subject
  Subject: =?UTF-8?q?one=0Atwo=0Athree?=
  $ git format-patch -1 --stdout -k --oneline --no-signature
  2af7fbe one
  two
  three

Or with extra headers:

  $ git format-patch -1 --stdout --cc=me --oneline --no-signature
  2af7fbe one two three
  Cc: me

So I'd actually consider this to be an improvement, though you are
probably crazy to use other formats with format-patch in the first place
(arguably it should forbid non-email formats entirely, but that's a
bigger change).

As a bonus, it eliminates some pointless extra allocations for the
oneline output. The email code, since it has to deal with wrapping,
formats into an extra auxiliary buffer. The speedup is tiny, though like
"rev-list --no-abbrev --format=oneline" seems to improve by a consistent
1-2% for me.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/log.c |  2 +-
 pretty.c      | 22 ++++++++++++----------
 pretty.h      |  8 ++++----
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index e5da0d1043..89cce9c29d 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1297,7 +1297,7 @@ static void prepare_cover_text(struct pretty_print_context *pp,
 		subject = subject_sb.buf;
 
 do_pp:
-	pp_title_line(pp, &subject, sb, encoding, need_8bit_cte);
+	pp_email_subject(pp, &subject, sb, encoding, need_8bit_cte);
 	pp_remainder(pp, &body, sb, 0);
 
 	strbuf_release(&description_sb);
diff --git a/pretty.c b/pretty.c
index bdbed4295a..be0f2f566d 100644
--- a/pretty.c
+++ b/pretty.c
@@ -2077,11 +2077,11 @@ static void pp_header(struct pretty_print_context *pp,
 	}
 }
 
-void pp_title_line(struct pretty_print_context *pp,
-		   const char **msg_p,
-		   struct strbuf *sb,
-		   const char *encoding,
-		   int need_8bit_cte)
+void pp_email_subject(struct pretty_print_context *pp,
+		      const char **msg_p,
+		      struct strbuf *sb,
+		      const char *encoding,
+		      int need_8bit_cte)
 {
 	static const int max_length = 78; /* per rfc2047 */
 	struct strbuf title;
@@ -2126,9 +2126,8 @@ void pp_title_line(struct pretty_print_context *pp,
 	if (pp->after_subject) {
 		strbuf_addstr(sb, pp->after_subject);
 	}
-	if (cmit_fmt_is_mail(pp->fmt)) {
-		strbuf_addch(sb, '\n');
-	}
+
+	strbuf_addch(sb, '\n');
 
 	if (pp->in_body_headers.nr) {
 		int i;
@@ -2328,8 +2327,11 @@ void pretty_print_commit(struct pretty_print_context *pp,
 	msg = skip_blank_lines(msg);
 
 	/* These formats treat the title line specially. */
-	if (pp->fmt == CMIT_FMT_ONELINE || cmit_fmt_is_mail(pp->fmt))
-		pp_title_line(pp, &msg, sb, encoding, need_8bit_cte);
+	if (pp->fmt == CMIT_FMT_ONELINE) {
+		msg = format_subject(sb, msg, " ");
+		strbuf_addch(sb, '\n');
+	} else if (cmit_fmt_is_mail(pp->fmt))
+		pp_email_subject(pp, &msg, sb, encoding, need_8bit_cte);
 
 	beginning_of_body = sb->len;
 	if (pp->fmt != CMIT_FMT_ONELINE)
diff --git a/pretty.h b/pretty.h
index 421209e9ec..d4ff79deb3 100644
--- a/pretty.h
+++ b/pretty.h
@@ -96,13 +96,13 @@ void pp_user_info(struct pretty_print_context *pp, const char *what,
 			const char *encoding);
 
 /*
- * Format title line of commit message taken from "msg_p" and
+ * Format subject line of commit message taken from "msg_p" and
  * put it into "sb".
  * First line of "msg_p" is also affected.
  */
-void pp_title_line(struct pretty_print_context *pp, const char **msg_p,
-			struct strbuf *sb, const char *encoding,
-			int need_8bit_cte);
+void pp_email_subject(struct pretty_print_context *pp, const char **msg_p,
+		      struct strbuf *sb, const char *encoding,
+		      int need_8bit_cte);
 
 /*
  * Get current state of commit message from "msg_p" and continue formatting
-- 
2.44.0.643.g35f318e502


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 3/6] pretty: drop print_email_subject flag
  2024-03-20  0:25       ` Jeff King
  2024-03-20  0:27         ` [PATCH 1/6] shortlog: stop setting pp.print_email_subject Jeff King
  2024-03-20  0:28         ` [PATCH 2/6] pretty: split oneline and email subject printing Jeff King
@ 2024-03-20  0:30         ` Jeff King
  2024-03-20  0:31         ` [PATCH 4/6] log: do not set up extra_headers for non-email formats Jeff King
                           ` (4 subsequent siblings)
  7 siblings, 0 replies; 34+ messages in thread
From: Jeff King @ 2024-03-20  0:30 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git

With one exception, the print_email_subject flag is set if and only if
the commit format is email based:

  - in make_cover_letter() we set it along with CMIT_FMT_EMAIL
    explicitly

  - in show_log(), we set it if cmit_fmt_is_mail() is true. That covers
    format-patch as well as "git log --format=email" (or mboxrd).

The one exception is "rev-list --format=email", which somewhat
nonsensically prints the author and date as email headers, but no
subject, like:

  $ git rev-list --format=email HEAD
  commit 64fc4c2cdd4db2645eaabb47aa4bac820b03cdba
  From: Jeff King <peff@peff.net>
  Date: Tue, 19 Mar 2024 19:39:26 -0400

  this is the subject

  this is the body

It's doubtful that this is a useful format at all (the "commit" lines
replace the "From" lines that would make it work as an actual mbox).
But I think that printing the subject as a header (like this patch does)
is the least surprising thing to do.

So let's drop this field, making the code a little simpler and easier to
reason about. Note that we do need to set the "rev" field of the
pretty_print_context in rev-list, since that is used to check for
subject_prefix, etc. It's not possible to set those fields via rev-list,
so we'll always just print "Subject: ". But unless we pass in our
rev_info, fmt_output_email_subject() would segfault trying to figure it
out.

Signed-off-by: Jeff King <peff@peff.net>
---
This one is not strictly necessary for building your series, but it
seemed like a useful simplification/cleanup made possible by the
previous commits.

I didn't bother with tests here or in the previous commit, because I
think the commands I've shown are basically nonsense. So while there are
user-visible changes, to me they are more "this is slightly less
nonsensical than the previous behavior", and the main motivation is the
cleanup.

 builtin/log.c      |  1 -
 builtin/rev-list.c |  1 +
 log-tree.c         |  1 -
 pretty.c           | 21 ++++++++-------------
 pretty.h           |  1 -
 5 files changed, 9 insertions(+), 16 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 89cce9c29d..071a7f3131 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1364,7 +1364,6 @@ static void make_cover_letter(struct rev_info *rev, int use_separate_file,
 	pp.fmt = CMIT_FMT_EMAIL;
 	pp.date_mode.type = DATE_RFC2822;
 	pp.rev = rev;
-	pp.print_email_subject = 1;
 	pp.encode_email_headers = rev->encode_email_headers;
 	pp_user_info(&pp, NULL, &sb, committer, encoding);
 	prepare_cover_text(&pp, description_file, branch_name, &sb,
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index ec455aa972..77803727e0 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -219,6 +219,7 @@ static void show_commit(struct commit *commit, void *data)
 		ctx.fmt = revs->commit_format;
 		ctx.output_encoding = get_log_output_encoding();
 		ctx.color = revs->diffopt.use_color;
+		ctx.rev = revs;
 		pretty_print_commit(&ctx, commit, &buf);
 		if (buf.len) {
 			if (revs->commit_format != CMIT_FMT_ONELINE)
diff --git a/log-tree.c b/log-tree.c
index e5438b029d..c27240a533 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -742,7 +742,6 @@ void show_log(struct rev_info *opt)
 		log_write_email_headers(opt, commit, &extra_headers,
 					&ctx.need_8bit_cte, 1);
 		ctx.rev = opt;
-		ctx.print_email_subject = 1;
 	} else if (opt->commit_format != CMIT_FMT_USERFORMAT) {
 		fputs(diff_get_color_opt(&opt->diffopt, DIFF_COMMIT), opt->diffopt.file);
 		if (opt->commit_format != CMIT_FMT_ONELINE)
diff --git a/pretty.c b/pretty.c
index be0f2f566d..eecbce82cf 100644
--- a/pretty.c
+++ b/pretty.c
@@ -2091,19 +2091,14 @@ void pp_email_subject(struct pretty_print_context *pp,
 				pp->preserve_subject ? "\n" : " ");
 
 	strbuf_grow(sb, title.len + 1024);
-	if (pp->print_email_subject) {
-		if (pp->rev)
-			fmt_output_email_subject(sb, pp->rev);
-		if (pp->encode_email_headers &&
-		    needs_rfc2047_encoding(title.buf, title.len))
-			add_rfc2047(sb, title.buf, title.len,
-						encoding, RFC2047_SUBJECT);
-		else
-			strbuf_add_wrapped_bytes(sb, title.buf, title.len,
+	fmt_output_email_subject(sb, pp->rev);
+	if (pp->encode_email_headers &&
+	    needs_rfc2047_encoding(title.buf, title.len))
+		add_rfc2047(sb, title.buf, title.len,
+			    encoding, RFC2047_SUBJECT);
+	else
+		strbuf_add_wrapped_bytes(sb, title.buf, title.len,
 					 -last_line_length(sb), 1, max_length);
-	} else {
-		strbuf_addbuf(sb, &title);
-	}
 	strbuf_addch(sb, '\n');
 
 	if (need_8bit_cte == 0) {
@@ -2319,7 +2314,7 @@ void pretty_print_commit(struct pretty_print_context *pp,
 	}
 
 	pp_header(pp, encoding, commit, &msg, sb);
-	if (pp->fmt != CMIT_FMT_ONELINE && !pp->print_email_subject) {
+	if (pp->fmt != CMIT_FMT_ONELINE && !cmit_fmt_is_mail(pp->fmt)) {
 		strbuf_addch(sb, '\n');
 	}
 
diff --git a/pretty.h b/pretty.h
index d4ff79deb3..021bc1d658 100644
--- a/pretty.h
+++ b/pretty.h
@@ -39,7 +39,6 @@ struct pretty_print_context {
 	int preserve_subject;
 	struct date_mode date_mode;
 	unsigned date_mode_explicit:1;
-	int print_email_subject;
 	int expand_tabs_in_log;
 	int need_8bit_cte;
 	char *notes_message;
-- 
2.44.0.643.g35f318e502


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 4/6] log: do not set up extra_headers for non-email formats
  2024-03-20  0:25       ` Jeff King
                           ` (2 preceding siblings ...)
  2024-03-20  0:30         ` [PATCH 3/6] pretty: drop print_email_subject flag Jeff King
@ 2024-03-20  0:31         ` Jeff King
  2024-03-22 22:04           ` Kristoffer Haugsbakk
  2024-03-20  0:35         ` [PATCH 5/6] format-patch: return an allocated string from log_write_email_headers() Jeff King
                           ` (3 subsequent siblings)
  7 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2024-03-20  0:31 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git

The commit pretty-printer code has an "after_subject" parameter which it
uses to insert extra headers into the email format. In show_log() we set
this by calling log_write_email_headers() if we are using an email
format, but otherwise default the variable to the rev_info.extra_headers
variable.

Since the pretty-printer code will ignore after_subject unless we are
using an email format, this default is pointless. We can just set
after_subject directly, eliminating an extra variable.

Signed-off-by: Jeff King <peff@peff.net>
---
This one is enabled by the previous commits. And after this now both
callers of log_write_email_headers() directly pass in "after_subject",
which makes the next steps easy.

 log-tree.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/log-tree.c b/log-tree.c
index c27240a533..a50f79ec60 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -678,7 +678,6 @@ void show_log(struct rev_info *opt)
 	struct log_info *log = opt->loginfo;
 	struct commit *commit = log->commit, *parent = log->parent;
 	int abbrev_commit = opt->abbrev_commit ? opt->abbrev : the_hash_algo->hexsz;
-	const char *extra_headers = opt->extra_headers;
 	struct pretty_print_context ctx = {0};
 
 	opt->loginfo = NULL;
@@ -739,7 +738,7 @@ void show_log(struct rev_info *opt)
 	 */
 
 	if (cmit_fmt_is_mail(opt->commit_format)) {
-		log_write_email_headers(opt, commit, &extra_headers,
+		log_write_email_headers(opt, commit, &ctx.after_subject,
 					&ctx.need_8bit_cte, 1);
 		ctx.rev = opt;
 	} else if (opt->commit_format != CMIT_FMT_USERFORMAT) {
@@ -807,7 +806,6 @@ void show_log(struct rev_info *opt)
 	ctx.date_mode = opt->date_mode;
 	ctx.date_mode_explicit = opt->date_mode_explicit;
 	ctx.abbrev = opt->diffopt.abbrev;
-	ctx.after_subject = extra_headers;
 	ctx.preserve_subject = opt->preserve_subject;
 	ctx.encode_email_headers = opt->encode_email_headers;
 	ctx.reflog_info = opt->reflog_info;
-- 
2.44.0.643.g35f318e502


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 5/6] format-patch: return an allocated string from log_write_email_headers()
  2024-03-20  0:25       ` Jeff King
                           ` (3 preceding siblings ...)
  2024-03-20  0:31         ` [PATCH 4/6] log: do not set up extra_headers for non-email formats Jeff King
@ 2024-03-20  0:35         ` Jeff King
  2024-03-22 22:06           ` Kristoffer Haugsbakk
  2024-03-20  0:35         ` [PATCH 6/6] format-patch: simplify after-subject MIME header handling Jeff King
                           ` (2 subsequent siblings)
  7 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2024-03-20  0:35 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git

When pretty-printing a commit in the email format, we have to fill in
the "after subject" field of the pretty_print_context with any extra
headers the user provided (e.g., from "--to" or "--cc" options) plus any
special MIME headers.

We return an out-pointer that sometimes points to a newly heap-allocated
string and sometimes not. To avoid leaking, we store the allocated
version in a buffer with static lifetime, which is ugly. Worse, as we
extend the header feature, we'll end up having to repeat this ugly
pattern.

Instead, let's have our out-pointer pass ownership back to the caller,
and duplicate the string when necessary. This does mean one extra
allocation per commit when you use extra headers, but in the context of
format-patch which is showing diffs, I don't think that's even
measurable.

Signed-off-by: Jeff King <peff@peff.net>
---
I don't think the extra allocation is a big deal, but if we do, there
are some other options:

  - instead of an out-pointer we could take a strbuf, and the caller
    could reset and reuse a strbuf for each commit

  - the after_subject stuff could become a callback; we discussed this a
    long time ago (I had no recollection of the thread until finding it
    in the archive just now):

      https://lore.kernel.org/git/20170325211149.yyvocmdfw4zbjyoi@sigill.intra.peff.net/

  - this log_write_email_headers() function prints part of its output to
    stdout, and shoves part of it into the after_subject field to be
    shown by the pretty-printer. I wonder if it could just format the
    subject itself (though that would make "rev-list --format=email"
    even more awkward, I guess).

 builtin/log.c |  1 +
 log-tree.c    | 11 ++++++-----
 log-tree.h    |  2 +-
 pretty.h      |  2 +-
 4 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 071a7f3131..c0a8bb95e9 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1370,6 +1370,7 @@ static void make_cover_letter(struct rev_info *rev, int use_separate_file,
 			   encoding, need_8bit_cte);
 	fprintf(rev->diffopt.file, "%s\n", sb.buf);
 
+	free(pp.after_subject);
 	strbuf_release(&sb);
 
 	shortlog_init(&log);
diff --git a/log-tree.c b/log-tree.c
index a50f79ec60..5092a75958 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -470,11 +470,11 @@ void fmt_output_email_subject(struct strbuf *sb, struct rev_info *opt)
 }
 
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
-			     const char **extra_headers_p,
+			     char **extra_headers_p,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart)
 {
-	const char *extra_headers = opt->extra_headers;
+	char *extra_headers = xstrdup_or_null(opt->extra_headers);
 	const char *name = oid_to_hex(opt->zero_commit ?
 				      null_oid() : &commit->object.oid);
 
@@ -496,12 +496,11 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		graph_show_oneline(opt->graph);
 	}
 	if (opt->mime_boundary && maybe_multipart) {
-		static struct strbuf subject_buffer = STRBUF_INIT;
+		struct strbuf subject_buffer = STRBUF_INIT;
 		static struct strbuf buffer = STRBUF_INIT;
 		struct strbuf filename =  STRBUF_INIT;
 		*need_8bit_cte_p = -1; /* NEVER */
 
-		strbuf_reset(&subject_buffer);
 		strbuf_reset(&buffer);
 
 		strbuf_addf(&subject_buffer,
@@ -519,7 +518,8 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			 extra_headers ? extra_headers : "",
 			 mime_boundary_leader, opt->mime_boundary,
 			 mime_boundary_leader, opt->mime_boundary);
-		extra_headers = subject_buffer.buf;
+		free(extra_headers);
+		extra_headers = strbuf_detach(&subject_buffer, NULL);
 
 		if (opt->numbered_files)
 			strbuf_addf(&filename, "%d", opt->nr);
@@ -854,6 +854,7 @@ void show_log(struct rev_info *opt)
 
 	strbuf_release(&msgbuf);
 	free(ctx.notes_message);
+	free(ctx.after_subject);
 
 	if (cmit_fmt_is_mail(ctx.fmt) && opt->idiff_oid1) {
 		struct diff_queue_struct dq;
diff --git a/log-tree.h b/log-tree.h
index 41c776fea5..94978e2c83 100644
--- a/log-tree.h
+++ b/log-tree.h
@@ -29,7 +29,7 @@ void format_decorations(struct strbuf *sb, const struct commit *commit,
 			int use_color, const struct decoration_options *opts);
 void show_decorations(struct rev_info *opt, struct commit *commit);
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
-			     const char **extra_headers_p,
+			     char **extra_headers_p,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart);
 void load_ref_decorations(struct decoration_filter *filter, int flags);
diff --git a/pretty.h b/pretty.h
index 021bc1d658..9cc9e5d42b 100644
--- a/pretty.h
+++ b/pretty.h
@@ -35,7 +35,7 @@ struct pretty_print_context {
 	 */
 	enum cmit_fmt fmt;
 	int abbrev;
-	const char *after_subject;
+	char *after_subject;
 	int preserve_subject;
 	struct date_mode date_mode;
 	unsigned date_mode_explicit:1;
-- 
2.44.0.643.g35f318e502


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 6/6] format-patch: simplify after-subject MIME header handling
  2024-03-20  0:25       ` Jeff King
                           ` (4 preceding siblings ...)
  2024-03-20  0:35         ` [PATCH 5/6] format-patch: return an allocated string from log_write_email_headers() Jeff King
@ 2024-03-20  0:35         ` Jeff King
  2024-03-22 22:08           ` Kristoffer Haugsbakk
  2024-03-20  0:43         ` [PATCH v2 1/3] revision: add a per-email field to rev-info Jeff King
  2024-03-22  9:59         ` [PATCH 7/6] format-patch: fix leak of empty header string Jeff King
  7 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2024-03-20  0:35 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git

In log_write_email_headers(), we append our MIME headers to the set of
extra headers by creating a new strbuf, adding the existing headers, and
then adding our new ones.  We had to do it this way when our output
buffer might point to the constant opt->extra_headers variable.

But since the previous commit, we always make a local copy of that
variable. Let's turn that into a strbuf, which lets the MIME code simply
append to it. That simplifies the function and avoids a pointless extra
copy of the headers.

Signed-off-by: Jeff King <peff@peff.net>
---
 log-tree.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/log-tree.c b/log-tree.c
index 5092a75958..eb2e841046 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -474,12 +474,15 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart)
 {
-	char *extra_headers = xstrdup_or_null(opt->extra_headers);
+	struct strbuf headers = STRBUF_INIT;
 	const char *name = oid_to_hex(opt->zero_commit ?
 				      null_oid() : &commit->object.oid);
 
 	*need_8bit_cte_p = 0; /* unknown */
 
+	if (opt->extra_headers)
+		strbuf_addstr(&headers, opt->extra_headers);
+
 	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);
 	graph_show_oneline(opt->graph);
 	if (opt->message_id) {
@@ -496,15 +499,13 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		graph_show_oneline(opt->graph);
 	}
 	if (opt->mime_boundary && maybe_multipart) {
-		struct strbuf subject_buffer = STRBUF_INIT;
 		static struct strbuf buffer = STRBUF_INIT;
 		struct strbuf filename =  STRBUF_INIT;
 		*need_8bit_cte_p = -1; /* NEVER */
 
 		strbuf_reset(&buffer);
 
-		strbuf_addf(&subject_buffer,
-			 "%s"
+		strbuf_addf(&headers,
 			 "MIME-Version: 1.0\n"
 			 "Content-Type: multipart/mixed;"
 			 " boundary=\"%s%s\"\n"
@@ -515,11 +516,8 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			 "Content-Type: text/plain; "
 			 "charset=UTF-8; format=fixed\n"
 			 "Content-Transfer-Encoding: 8bit\n\n",
-			 extra_headers ? extra_headers : "",
 			 mime_boundary_leader, opt->mime_boundary,
 			 mime_boundary_leader, opt->mime_boundary);
-		free(extra_headers);
-		extra_headers = strbuf_detach(&subject_buffer, NULL);
 
 		if (opt->numbered_files)
 			strbuf_addf(&filename, "%d", opt->nr);
@@ -539,7 +537,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		opt->diffopt.stat_sep = buffer.buf;
 		strbuf_release(&filename);
 	}
-	*extra_headers_p = extra_headers;
+	*extra_headers_p = headers.len ? strbuf_detach(&headers, NULL) : NULL;
 }
 
 static void show_sig_lines(struct rev_info *opt, int status, const char *bol)
-- 
2.44.0.643.g35f318e502

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/3] revision: add a per-email field to rev-info
  2024-03-20  0:25       ` Jeff King
                           ` (5 preceding siblings ...)
  2024-03-20  0:35         ` [PATCH 6/6] format-patch: simplify after-subject MIME header handling Jeff King
@ 2024-03-20  0:43         ` Jeff King
  2024-03-22 22:31           ` Kristoffer Haugsbakk
  2024-03-22  9:59         ` [PATCH 7/6] format-patch: fix leak of empty header string Jeff King
  7 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2024-03-20  0:43 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git

On Tue, Mar 19, 2024 at 08:25:55PM -0400, Jeff King wrote:

> Having now stared at this code for a bit, I do think there's another,
> much simpler option for your series: keep the same ugly static-strbuf
> allocation pattern in log_write_email_headers(), but extend it further.
> I'll show that in a moment, too.

So something like this:

diff --git a/log-tree.c b/log-tree.c
index e5438b029d..ae0f4fc502 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -474,12 +474,21 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart)
 {
-	const char *extra_headers = opt->extra_headers;
+	static struct strbuf headers = STRBUF_INIT;
 	const char *name = oid_to_hex(opt->zero_commit ?
 				      null_oid() : &commit->object.oid);
 
 	*need_8bit_cte_p = 0; /* unknown */
 
+	strbuf_reset(&headers);
+	if (opt->extra_headers)
+		strbuf_addstr(&headers, opt->extra_headers);
+	/*
+	 * here's where you'd do your pe_headers; I wonder if you could even
+	 * just run the header command directly here and not need to shove the
+	 * string into rev_info?
+	 */
+
 	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);
 	graph_show_oneline(opt->graph);
 	if (opt->message_id) {
@@ -496,16 +505,13 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		graph_show_oneline(opt->graph);
 	}
 	if (opt->mime_boundary && maybe_multipart) {
-		static struct strbuf subject_buffer = STRBUF_INIT;
 		static struct strbuf buffer = STRBUF_INIT;
 		struct strbuf filename =  STRBUF_INIT;
 		*need_8bit_cte_p = -1; /* NEVER */
 
-		strbuf_reset(&subject_buffer);
 		strbuf_reset(&buffer);
 
-		strbuf_addf(&subject_buffer,
-			 "%s"
+		strbuf_addf(&headers,
 			 "MIME-Version: 1.0\n"
 			 "Content-Type: multipart/mixed;"
 			 " boundary=\"%s%s\"\n"
@@ -516,10 +522,8 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			 "Content-Type: text/plain; "
 			 "charset=UTF-8; format=fixed\n"
 			 "Content-Transfer-Encoding: 8bit\n\n",
-			 extra_headers ? extra_headers : "",
 			 mime_boundary_leader, opt->mime_boundary,
 			 mime_boundary_leader, opt->mime_boundary);
-		extra_headers = subject_buffer.buf;
 
 		if (opt->numbered_files)
 			strbuf_addf(&filename, "%d", opt->nr);
@@ -539,7 +543,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		opt->diffopt.stat_sep = buffer.buf;
 		strbuf_release(&filename);
 	}
-	*extra_headers_p = extra_headers;
+	*extra_headers_p = headers.len ? headers.buf : NULL;
 }
 
 static void show_sig_lines(struct rev_info *opt, int status, const char *bol)

And then the callers can continue not caring about how or when to free
the returned pointer. I think in the long run the cleanups I showed are
a nicer place to end up, but I'd just worry that your feature work will
be held hostage by my desire to clean. ;)

If you did it this way (probably as a separate preparatory patch minus
the pe_headers comment), then either I could do my cleanups on top, or
they could even graduate independently (though obviously there will be a
little bit of tricky merging at the end).

-Peff

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 7/6] format-patch: fix leak of empty header string
  2024-03-20  0:25       ` Jeff King
                           ` (6 preceding siblings ...)
  2024-03-20  0:43         ` [PATCH v2 1/3] revision: add a per-email field to rev-info Jeff King
@ 2024-03-22  9:59         ` Jeff King
  2024-03-22 10:03           ` Kristoffer Haugsbakk
                             ` (2 more replies)
  7 siblings, 3 replies; 34+ messages in thread
From: Jeff King @ 2024-03-22  9:59 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: Junio C Hamano, git

On Tue, Mar 19, 2024 at 08:25:55PM -0400, Jeff King wrote:

>   [1/6]: shortlog: stop setting pp.print_email_subject
>   [2/6]: pretty: split oneline and email subject printing
>   [3/6]: pretty: drop print_email_subject flag
>   [4/6]: log: do not set up extra_headers for non-email formats
>   [5/6]: format-patch: return an allocated string from log_write_email_headers()
>   [6/6]: format-patch: simplify after-subject MIME header handling

These patches introduce a small leak into format-patch. I didn't notice
before because the "leaks" CI jobs were broken due to sanitizer problems
in the base image (which now seem fixed?).

Here's a fix that can go on top of jk/pretty-subject-cleanup. That topic
is not in 'next' yet, so I could also re-roll. The issue was subtle
enough that a separate commit is not such a bad thing, but I'm happy to
squash it in if we'd prefer.

-- >8 --
Subject: [PATCH] format-patch: fix leak of empty header string

The log_write_email_headers() function recently learned to return the
"extra_headers_p" variable to the caller as an allocated string. We
start by copying rev_info.extra_headers into a strbuf, and then detach
the strbuf at the end of the function. If there are no extra headers, we
leave the strbuf empty. Likewise, if there are no headers to return, we
pass back NULL.

This misses a corner case which can cause a leak. The "do we have any
headers to copy" check is done by looking for a NULL opt->extra_headers.
But the "do we have a non-empty string to return" check is done by
checking the length of the strbuf. That means if opt->extra_headers is
the empty string, we'll "copy" it into the strbuf, triggering an
allocation, but then leak the buffer when we return NULL from the
function.

We can solve this in one of two ways:

  1. Rather than checking headers->len at the end, we could check
     headers->alloc to see if we allocated anything. That retains the
     original behavior before the recent change, where an empty
     extra_headers string is "passed through" to the caller. In practice
     this doesn't matter, though (the code which eventually looks at the
     result treats NULL or the empty string the same).

  2. Only bother copying a non-empty string into the strbuf. This has
     the added bonus of avoiding a pointless allocation.

     Arguably strbuf_addstr() could do this optimization itself, though
     it may be slightly dangerous to do so (some existing callers may
     not get a fresh allocation when they expect to). In theory callers
     are all supposed to use strbuf_detach() in such a case, but there's
     no guarantee that this is the case.

This patch uses option 2. Without it, building with SANITIZE=leak shows
many errors in t4021 and elsewhere.

Signed-off-by: Jeff King <peff@peff.net>
---
 log-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/log-tree.c b/log-tree.c
index eb2e841046..59eeaef1f7 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -480,7 +480,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 
 	*need_8bit_cte_p = 0; /* unknown */
 
-	if (opt->extra_headers)
+	if (opt->extra_headers && *opt->extra_headers)
 		strbuf_addstr(&headers, opt->extra_headers);
 
 	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);
-- 
2.44.0.682.g01e1dab148


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 7/6] format-patch: fix leak of empty header string
  2024-03-22  9:59         ` [PATCH 7/6] format-patch: fix leak of empty header string Jeff King
@ 2024-03-22 10:03           ` Kristoffer Haugsbakk
  2024-03-22 16:50           ` Junio C Hamano
  2024-03-22 22:16           ` Kristoffer Haugsbakk
  2 siblings, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-22 10:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On Fri, Mar 22, 2024, at 10:59, Jeff King wrote:
> On Tue, Mar 19, 2024 at 08:25:55PM -0400, Jeff King wrote:
>
>>   [1/6]: shortlog: stop setting pp.print_email_subject
>>   [2/6]: pretty: split oneline and email subject printing
>>   [3/6]: pretty: drop print_email_subject flag
>>   [4/6]: log: do not set up extra_headers for non-email formats
>>   [5/6]: format-patch: return an allocated string from log_write_email_headers()
>>   [6/6]: format-patch: simplify after-subject MIME header handling
>
> These patches introduce a small leak into format-patch. I didn't notice
> before because the "leaks" CI jobs were broken due to sanitizer problems
> in the base image (which now seem fixed?).
>
> Here's a fix that can go on top of jk/pretty-subject-cleanup. That topic
> is not in 'next' yet, so I could also re-roll. The issue was subtle
> enough that a separate commit is not such a bad thing, but I'm happy to
> squash it in if we'd prefer.
>
> -- >8 --
> Subject: [PATCH] format-patch: fix leak of empty header string
> [snip]

Hi Peff, and thanks a lot for making this series.

I’ll have a look at it this evening.

Thanks

-- 
Kristoffer Haugsbakk

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 7/6] format-patch: fix leak of empty header string
  2024-03-22  9:59         ` [PATCH 7/6] format-patch: fix leak of empty header string Jeff King
  2024-03-22 10:03           ` Kristoffer Haugsbakk
@ 2024-03-22 16:50           ` Junio C Hamano
  2024-03-22 22:16           ` Kristoffer Haugsbakk
  2 siblings, 0 replies; 34+ messages in thread
From: Junio C Hamano @ 2024-03-22 16:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Kristoffer Haugsbakk, git

Jeff King <peff@peff.net> writes:

> On Tue, Mar 19, 2024 at 08:25:55PM -0400, Jeff King wrote:
>
>>   [1/6]: shortlog: stop setting pp.print_email_subject
>>   [2/6]: pretty: split oneline and email subject printing
>>   [3/6]: pretty: drop print_email_subject flag
>>   [4/6]: log: do not set up extra_headers for non-email formats
>>   [5/6]: format-patch: return an allocated string from log_write_email_headers()
>>   [6/6]: format-patch: simplify after-subject MIME header handling
>
> These patches introduce a small leak into format-patch. I didn't notice
> before because the "leaks" CI jobs were broken due to sanitizer problems
> in the base image (which now seem fixed?).
>
> Here's a fix that can go on top of jk/pretty-subject-cleanup. That topic
> is not in 'next' yet, so I could also re-roll. The issue was subtle
> enough that a separate commit is not such a bad thing, but I'm happy to
> squash it in if we'd prefer.

Indeed it is subtle and I like the corner case described separately
like this one does.  Very much appreciated.

Thanks.

> -- >8 --
> Subject: [PATCH] format-patch: fix leak of empty header string
>
> The log_write_email_headers() function recently learned to return the
> "extra_headers_p" variable to the caller as an allocated string. We
> start by copying rev_info.extra_headers into a strbuf, and then detach
> the strbuf at the end of the function. If there are no extra headers, we
> leave the strbuf empty. Likewise, if there are no headers to return, we
> pass back NULL.
>
> This misses a corner case which can cause a leak. The "do we have any
> headers to copy" check is done by looking for a NULL opt->extra_headers.
> But the "do we have a non-empty string to return" check is done by
> checking the length of the strbuf. That means if opt->extra_headers is
> the empty string, we'll "copy" it into the strbuf, triggering an
> allocation, but then leak the buffer when we return NULL from the
> function.
>
> We can solve this in one of two ways:
>
>   1. Rather than checking headers->len at the end, we could check
>      headers->alloc to see if we allocated anything. That retains the
>      original behavior before the recent change, where an empty
>      extra_headers string is "passed through" to the caller. In practice
>      this doesn't matter, though (the code which eventually looks at the
>      result treats NULL or the empty string the same).
>
>   2. Only bother copying a non-empty string into the strbuf. This has
>      the added bonus of avoiding a pointless allocation.
>
>      Arguably strbuf_addstr() could do this optimization itself, though
>      it may be slightly dangerous to do so (some existing callers may
>      not get a fresh allocation when they expect to). In theory callers
>      are all supposed to use strbuf_detach() in such a case, but there's
>      no guarantee that this is the case.
>
> This patch uses option 2. Without it, building with SANITIZE=leak shows
> many errors in t4021 and elsewhere.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  log-tree.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/log-tree.c b/log-tree.c
> index eb2e841046..59eeaef1f7 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -480,7 +480,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
>  
>  	*need_8bit_cte_p = 0; /* unknown */
>  
> -	if (opt->extra_headers)
> +	if (opt->extra_headers && *opt->extra_headers)
>  		strbuf_addstr(&headers, opt->extra_headers);
>  
>  	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/6] pretty: split oneline and email subject printing
  2024-03-20  0:28         ` [PATCH 2/6] pretty: split oneline and email subject printing Jeff King
@ 2024-03-22 22:00           ` Kristoffer Haugsbakk
  0 siblings, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-22 22:00 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Wed, Mar 20, 2024, at 01:28, Jeff King wrote:
> The pp_title_line() function is used for two formats: the oneline format
> and the subject line of the email format. But most of the logic in the
> function does not make any sense for oneline; it is about special
> formatting of email headers.
>
> Lumping the two formats together made sense long ago in 4234a76167
> (Extend --pretty=oneline to cover the first paragraph, 2007-06-11), when
> there was a lot of manual logic to paste lines together. But later,
> 88c44735ab (pretty: factor out format_subject(), 2008-12-27) pulled that
> logic into its own function.
>
> We can implement the oneline format by just calling that one function.
> This makes the intention of the code much more clear, as we know we only
> need to worry about those extra email options when dealing with actual
> email.
>
> While the intent here is cleanup, it is possible to trigger these cases
> in practice by running format-patch with an explicit --oneline option.
> But if you did, the results are basically nonsense. For example, with
> the preserve_subject flag:
>
>   $ printf "%s\n" one two three | git commit --allow-empty -F -
>   $ git format-patch -1 --stdout -k | grep ^Subject
>   Subject: =?UTF-8?q?one=0Atwo=0Athree?=
>   $ git format-patch -1 --stdout -k --oneline --no-signature
>   2af7fbe one
>   two
>   three
>
> Or with extra headers:
>
>   $ git format-patch -1 --stdout --cc=me --oneline --no-signature
>   2af7fbe one two three
>   Cc: me
>
> So I'd actually consider this to be an improvement, though you are
> probably crazy to use other formats with format-patch in the first place
> (arguably it should forbid non-email formats entirely, but that's a
> bigger change).

Makes sense. This make the code more focused.

> As a bonus, it eliminates some pointless extra allocations for the
> oneline output. The email code, since it has to deal with wrapping,
> formats into an extra auxiliary buffer. The speedup is tiny, though like
> "rev-list --no-abbrev --format=oneline" seems to improve by a consistent
> 1-2% for me.

Nice. That could add up when formatting a moderate amount of patches.

-- 
Kristoffer Haugsbakk


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 4/6] log: do not set up extra_headers for non-email formats
  2024-03-20  0:31         ` [PATCH 4/6] log: do not set up extra_headers for non-email formats Jeff King
@ 2024-03-22 22:04           ` Kristoffer Haugsbakk
  0 siblings, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-22 22:04 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Wed, Mar 20, 2024, at 01:31, Jeff King wrote:
> The commit pretty-printer code has an "after_subject" parameter which it
> uses to insert extra headers into the email format. In show_log() we set
> this by calling log_write_email_headers() if we are using an email
> format, but otherwise default the variable to the rev_info.extra_headers
> variable.
>
> Since the pretty-printer code will ignore after_subject unless we are
> using an email format, this default is pointless. We can just set
> after_subject directly, eliminating an extra variable.
>
> Signed-off-by: Jeff King <peff@peff.net>

Good. I did feel like the code was kind of daisy-chaining assignments
for no obvious reason.

> ---
> This one is enabled by the previous commits. And after this now both
> callers of log_write_email_headers() directly pass in "after_subject",
> which makes the next steps easy.

Yep, these changes are being done in a nice progression.

-- 
Kristoffer Haugsbakk


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 5/6] format-patch: return an allocated string from log_write_email_headers()
  2024-03-20  0:35         ` [PATCH 5/6] format-patch: return an allocated string from log_write_email_headers() Jeff King
@ 2024-03-22 22:06           ` Kristoffer Haugsbakk
  0 siblings, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-22 22:06 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Wed, Mar 20, 2024, at 01:35, Jeff King wrote:
> When pretty-printing a commit in the email format, we have to fill in
> the "after subject" field of the pretty_print_context with any extra
> headers the user provided (e.g., from "--to" or "--cc" options) plus any
> special MIME headers.
>
> We return an out-pointer that sometimes points to a newly heap-allocated
> string and sometimes not. To avoid leaking, we store the allocated
> version in a buffer with static lifetime, which is ugly. Worse, as we
> extend the header feature, we'll end up having to repeat this ugly
> pattern.
>
> Instead, let's have our out-pointer pass ownership back to the caller,
> and duplicate the string when necessary. This does mean one extra
> allocation per commit when you use extra headers, but in the context of
> format-patch which is showing diffs, I don't think that's even
> measurable.
>
> Signed-off-by: Jeff King <peff@peff.net>

Good presentation of motivation here.

> ---
> I don't think the extra allocation is a big deal, but if we do, there
> are some other options:
>
>   - instead of an out-pointer we could take a strbuf, and the caller
>     could reset and reuse a strbuf for each commit
>
>   - the after_subject stuff could become a callback; we discussed this a
>     long time ago (I had no recollection of the thread until finding it
>     in the archive just now):
>
>
> https://lore.kernel.org/git/20170325211149.yyvocmdfw4zbjyoi@sigill.intra.peff.net/
>
>   - this log_write_email_headers() function prints part of its output to
>     stdout, and shoves part of it into the after_subject field to be
>     shown by the pretty-printer. I wonder if it could just format the
>     subject itself (though that would make "rev-list --format=email"
>     even more awkward, I guess).

I don’t quite understand all of these alternatives but the first one
makes sense. Leave the responsibility to the caller. That could work.

-- 
Kristoffer Haugsbakk


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 6/6] format-patch: simplify after-subject MIME header handling
  2024-03-20  0:35         ` [PATCH 6/6] format-patch: simplify after-subject MIME header handling Jeff King
@ 2024-03-22 22:08           ` Kristoffer Haugsbakk
  0 siblings, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-22 22:08 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Wed, Mar 20, 2024, at 01:35, Jeff King wrote:
> In log_write_email_headers(), we append our MIME headers to the set of
> extra headers by creating a new strbuf, adding the existing headers, and
> then adding our new ones.  We had to do it this way when our output
> buffer might point to the constant opt->extra_headers variable.
>
> But since the previous commit, we always make a local copy of that
> variable. Let's turn that into a strbuf, which lets the MIME code simply
> append to it. That simplifies the function and avoids a pointless extra
> copy of the headers.
>
> Signed-off-by: Jeff King <peff@peff.net>

I like how all the previous work makes this change straightforward.

-- 
Kristoffer Haugsbakk


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 7/6] format-patch: fix leak of empty header string
  2024-03-22  9:59         ` [PATCH 7/6] format-patch: fix leak of empty header string Jeff King
  2024-03-22 10:03           ` Kristoffer Haugsbakk
  2024-03-22 16:50           ` Junio C Hamano
@ 2024-03-22 22:16           ` Kristoffer Haugsbakk
  2 siblings, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-22 22:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On Fri, Mar 22, 2024, at 10:59, Jeff King wrote:
> On Tue, Mar 19, 2024 at 08:25:55PM -0400, Jeff King wrote:
>
>>   [1/6]: shortlog: stop setting pp.print_email_subject
>>   [2/6]: pretty: split oneline and email subject printing
>>   [3/6]: pretty: drop print_email_subject flag
>>   [4/6]: log: do not set up extra_headers for non-email formats
>>   [5/6]: format-patch: return an allocated string from log_write_email_headers()
>>   [6/6]: format-patch: simplify after-subject MIME header handling
>
> These patches introduce a small leak into format-patch. I didn't notice
> before because the "leaks" CI jobs were broken due to sanitizer problems
> in the base image (which now seem fixed?).
>
> Here's a fix that can go on top of jk/pretty-subject-cleanup. That topic
> is not in 'next' yet, so I could also re-roll. The issue was subtle
> enough that a separate commit is not such a bad thing, but I'm happy to
> squash it in if we'd prefer.
>
> -- >8 --
> Subject: [PATCH] format-patch: fix leak of empty header string
>
> The log_write_email_headers() function recently learned to return the
> "extra_headers_p" variable to the caller as an allocated string. We
> start by copying rev_info.extra_headers into a strbuf, and then detach
> the strbuf at the end of the function. If there are no extra headers, we
> leave the strbuf empty. Likewise, if there are no headers to return, we
> pass back NULL.
>
> This misses a corner case which can cause a leak. The "do we have any
> headers to copy" check is done by looking for a NULL opt->extra_headers.
> But the "do we have a non-empty string to return" check is done by
> checking the length of the strbuf. That means if opt->extra_headers is
> the empty string, we'll "copy" it into the strbuf, triggering an
> allocation, but then leak the buffer when we return NULL from the
> function.
>
> We can solve this in one of two ways:
>
>   1. Rather than checking headers->len at the end, we could check
>      headers->alloc to see if we allocated anything. That retains the
>      original behavior before the recent change, where an empty
>      extra_headers string is "passed through" to the caller. In practice
>      this doesn't matter, though (the code which eventually looks at the
>      result treats NULL or the empty string the same).
>
>   2. Only bother copying a non-empty string into the strbuf. This has
>      the added bonus of avoiding a pointless allocation.
>
>      Arguably strbuf_addstr() could do this optimization itself, though
>      it may be slightly dangerous to do so (some existing callers may
>      not get a fresh allocation when they expect to). In theory callers
>      are all supposed to use strbuf_detach() in such a case, but there's
>      no guarantee that this is the case.
>
> This patch uses option 2. Without it, building with SANITIZE=leak shows
> many errors in t4021 and elsewhere.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  log-tree.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/log-tree.c b/log-tree.c
> index eb2e841046..59eeaef1f7 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -480,7 +480,7 @@ void log_write_email_headers(struct rev_info *opt,
> struct commit *commit,
>
>  	*need_8bit_cte_p = 0; /* unknown */
>
> -	if (opt->extra_headers)
> +	if (opt->extra_headers && *opt->extra_headers)
>  		strbuf_addstr(&headers, opt->extra_headers);
>
>  	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);
> --
> 2.44.0.682.g01e1dab148

I was wondering if the new empty-string check now makes the condition
look non-obvious. I mean given that

• You explain how headers-to-copy-check and have-non-empty-string are
  not the same
• You explain how strbuf_addstr() could do this itself (which makes
  sense) but how it could be risky

The condition looks bare without a comment. But the empty-string check
of course makes sense without this context. And it could also be read as
an optimization (and not a leak fix).

And maybe most people just `git log -S'*opt->extra_headers'` if they
have questions in their head. So no information is really missing.

-- 
Kristoffer Haugsbakk


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/3] revision: add a per-email field to rev-info
  2024-03-20  0:43         ` [PATCH v2 1/3] revision: add a per-email field to rev-info Jeff King
@ 2024-03-22 22:31           ` Kristoffer Haugsbakk
  0 siblings, 0 replies; 34+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-22 22:31 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Wed, Mar 20, 2024, at 01:43, Jeff King wrote:
> On Tue, Mar 19, 2024 at 08:25:55PM -0400, Jeff King wrote:
>
>> Having now stared at this code for a bit, I do think there's another,
>> much simpler option for your series: keep the same ugly static-strbuf
>> allocation pattern in log_write_email_headers(), but extend it further.
>> I'll show that in a moment, too.
>
> So something like this:
>
> diff --git a/log-tree.c b/log-tree.c
> index e5438b029d..ae0f4fc502 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -474,12 +474,21 @@ void log_write_email_headers(struct rev_info
> *opt, struct commit *commit,
>  			     int *need_8bit_cte_p,
>  			     int maybe_multipart)
>  {
> -	const char *extra_headers = opt->extra_headers;
> +	static struct strbuf headers = STRBUF_INIT;
>  	const char *name = oid_to_hex(opt->zero_commit ?
>  				      null_oid() : &commit->object.oid);
>
>  	*need_8bit_cte_p = 0; /* unknown */
>
> +	strbuf_reset(&headers);
> +	if (opt->extra_headers)
> +		strbuf_addstr(&headers, opt->extra_headers);
> +	/*
> +	 * here's where you'd do your pe_headers; I wonder if you could even
> +	 * just run the header command directly here and not need to shove the
> +	 * string into rev_info?
> +	 */
> +

Hmm. I’ll look into that. This seems like a nicer place to do it
compared to `log.c`.

>  	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n",
> name);
>  	graph_show_oneline(opt->graph);
>  	if (opt->message_id) {
> @@ -496,16 +505,13 @@ void log_write_email_headers(struct rev_info
> *opt, struct commit *commit,
>  		graph_show_oneline(opt->graph);
>  	}
>  	if (opt->mime_boundary && maybe_multipart) {
> -		static struct strbuf subject_buffer = STRBUF_INIT;
>  		static struct strbuf buffer = STRBUF_INIT;
>  		struct strbuf filename =  STRBUF_INIT;
>  		*need_8bit_cte_p = -1; /* NEVER */
>
> -		strbuf_reset(&subject_buffer);
>  		strbuf_reset(&buffer);
>
> -		strbuf_addf(&subject_buffer,
> -			 "%s"
> +		strbuf_addf(&headers,
>  			 "MIME-Version: 1.0\n"
>  			 "Content-Type: multipart/mixed;"
>  			 " boundary=\"%s%s\"\n"
> @@ -516,10 +522,8 @@ void log_write_email_headers(struct rev_info *opt,
> struct commit *commit,
>  			 "Content-Type: text/plain; "
>  			 "charset=UTF-8; format=fixed\n"
>  			 "Content-Transfer-Encoding: 8bit\n\n",
> -			 extra_headers ? extra_headers : "",
>  			 mime_boundary_leader, opt->mime_boundary,
>  			 mime_boundary_leader, opt->mime_boundary);
> -		extra_headers = subject_buffer.buf;
>
>  		if (opt->numbered_files)
>  			strbuf_addf(&filename, "%d", opt->nr);
> @@ -539,7 +543,7 @@ void log_write_email_headers(struct rev_info *opt,
> struct commit *commit,
>  		opt->diffopt.stat_sep = buffer.buf;
>  		strbuf_release(&filename);
>  	}
> -	*extra_headers_p = extra_headers;
> +	*extra_headers_p = headers.len ? headers.buf : NULL;
>  }
>
>  static void show_sig_lines(struct rev_info *opt, int status, const char *bol)
>
> And then the callers can continue not caring about how or when to free
> the returned pointer. I think in the long run the cleanups I showed are
> a nicer place to end up, but I'd just worry that your feature work will
> be held hostage by my desire to clean. ;)

Hah! Definitely don’t worry about that, this has been very helpful.

> If you did it this way (probably as a separate preparatory patch minus
> the pe_headers comment), then either I could do my cleanups on top, or
> they could even graduate independently (though obviously there will be a
> little bit of tricky merging at the end).
>
> -Peff

I think your series should take precedence. I’ll put my series on the
backburner for a while. There’s no rush with that one. These changes of
yours will make extending the header logic easier overall.

Then when yours is merged I’ll have an even easier time.

Thanks again

Kristoffer

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2024-03-22 22:31 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-07 19:59 [PATCH 0/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
2024-03-07 19:59 ` [PATCH 1/3] log-tree: take ownership of pointer Kristoffer Haugsbakk
2024-03-12  9:29   ` Jeff King
2024-03-12 17:43     ` Kristoffer Haugsbakk
2024-03-13  6:54       ` Jeff King
2024-03-13 17:49         ` Kristoffer Haugsbakk
2024-03-07 19:59 ` [PATCH 2/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
2024-03-08 18:30   ` Kristoffer Haugsbakk
2024-03-11 21:29   ` Jean-Noël Avila
2024-03-12  8:13     ` Kristoffer Haugsbakk
2024-03-07 19:59 ` [PATCH 3/3] format-patch: check if header output looks valid Kristoffer Haugsbakk
2024-03-19 18:35 ` [PATCH v2 0/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
2024-03-19 18:35   ` [PATCH v2 1/3] revision: add a per-email field to rev-info Kristoffer Haugsbakk
2024-03-19 21:29     ` Jeff King
2024-03-19 21:41       ` Kristoffer Haugsbakk
2024-03-20  0:25       ` Jeff King
2024-03-20  0:27         ` [PATCH 1/6] shortlog: stop setting pp.print_email_subject Jeff King
2024-03-20  0:28         ` [PATCH 2/6] pretty: split oneline and email subject printing Jeff King
2024-03-22 22:00           ` Kristoffer Haugsbakk
2024-03-20  0:30         ` [PATCH 3/6] pretty: drop print_email_subject flag Jeff King
2024-03-20  0:31         ` [PATCH 4/6] log: do not set up extra_headers for non-email formats Jeff King
2024-03-22 22:04           ` Kristoffer Haugsbakk
2024-03-20  0:35         ` [PATCH 5/6] format-patch: return an allocated string from log_write_email_headers() Jeff King
2024-03-22 22:06           ` Kristoffer Haugsbakk
2024-03-20  0:35         ` [PATCH 6/6] format-patch: simplify after-subject MIME header handling Jeff King
2024-03-22 22:08           ` Kristoffer Haugsbakk
2024-03-20  0:43         ` [PATCH v2 1/3] revision: add a per-email field to rev-info Jeff King
2024-03-22 22:31           ` Kristoffer Haugsbakk
2024-03-22  9:59         ` [PATCH 7/6] format-patch: fix leak of empty header string Jeff King
2024-03-22 10:03           ` Kristoffer Haugsbakk
2024-03-22 16:50           ` Junio C Hamano
2024-03-22 22:16           ` Kristoffer Haugsbakk
2024-03-19 18:35   ` [PATCH v2 2/3] format-patch: teach `--header-cmd` Kristoffer Haugsbakk
2024-03-19 18:35   ` [PATCH v2 3/3] format-patch: check if header output looks valid Kristoffer Haugsbakk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.