From: Jeff King <peff@peff.net>
To: Kristoffer Haugsbakk <code@khaugsbakk.name>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 1/3] log-tree: take ownership of pointer
Date: Wed, 13 Mar 2024 02:54:54 -0400
Message-ID: <20240313065454.GB125150@coredump.intra.peff.net>
In-Reply-To: <73a4cb87-2800-4ad1-b7a2-33c6465fcc50@app.fastmail.com>

On Tue, Mar 12, 2024 at 06:43:55PM +0100, Kristoffer Haugsbakk wrote:

> > Hmm, OK. This patch by itself introduces a memory leak. It would be nice
> > if we could couple it with the matching free() so that we can see that
> > the issue is fixed. It sounds like your patch 2 is going to introduce
> > such a free, but I'm not sure it's complete.
> 
> Is it okay if it is done in patch 2?

I don't think it's the end of the world to do it in patch 2, as long as
we end up in a good spot. But IMHO it's really hard for reviewers to
understand what is going on, because it's intermingled with so many
other changes. It would be much easier to read if we had a preparatory
patch that switched the memory ownership of the field, and then built on
top of that.

But I recognize that sometimes that's hard to do, because the state is
so tangled that the functional change is what untangles it. I'm not sure
if that's the case here or not; you'd probably have a better idea as
somebody who looked carefully at it recently.

> > It frees the old extra_headers before reassigning it, but nobody
> > cleans it up after handling the final commit.
> 
> I didn’t get any leak errors from the CI. `extra_headers` in `show_log`
> is populated by calling `log_write_email_headers`. Then later it is
> assigned to
> 
>     ctx.after_subject = extra_headers;
> 
> Then `ctx.after_subject` is freed later
> 
>     free((char *)ctx.after_subject);
> 
> Am I missing something?

Ah, I see. I was confused by looking for a free of an extra_headers
field. We have rev_info.extra_headers, and that is _not_ owned by
rev_info. We used to assign that to a variable in
log_write_email_headers(), but now we actually make a copy of it. And so
the copy is freed in that function when we replace it with a version
containing extra mime headers here:

                  strbuf_addf(&subject_buffer,
                           "%s"
                           "MIME-Version: 1.0\n"
                           "Content-Type: multipart/mixed;"
                           " boundary=\"%s%s\"\n"
                           "\n"
                           "This is a multi-part message in MIME "
                           "format.\n"
                           "--%s%s\n"
                           "Content-Type: text/plain; "
                           "charset=UTF-8; format=fixed\n"
                           "Content-Transfer-Encoding: 8bit\n\n",
                           extra_headers ? extra_headers : "",
                           mime_boundary_leader, opt->mime_boundary,
                           mime_boundary_leader, opt->mime_boundary);
                  free((char *)extra_headers);
                  extra_headers = strbuf_detach(&subject_buffer, NULL);

But the actual ownership is passed out via the extra_headers_p variable,
and that is what is assigned to ctx.after_subject (which now takes
ownership).
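
To spell out that handoff in isolation, here is a minimal standalone
sketch. It is not git code: struct opts and write_headers() are made-up
stand-ins for rev_info and log_write_email_headers(), and plain malloc()
replaces strbuf. The point is only that the field in the options struct
stays borrowed, while whatever comes out through the pointer parameter
belongs to the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for rev_info: extra_headers is borrowed, never freed here. */
struct opts {
	const char *extra_headers;
};

/*
 * Hypothetical stand-in for log_write_email_headers(). The string stored
 * in *out_p is always a fresh allocation that the caller must free (or
 * NULL if the allocation failed).
 */
static void write_headers(const struct opts *o, char **out_p)
{
	const char *src = o->extra_headers ? o->extra_headers : "";
	size_t len = strlen(src);
	char *copy;

	assert(out_p);
	copy = malloc(len + 1);
	if (copy)
		memcpy(copy, src, len + 1); /* include the trailing NUL */
	*out_p = copy;
}
```

A caller in this sketch would pass the address of its own pointer
(playing the role of ctx.after_subject), use the result, and free() it
when done; the original o->extra_headers is left untouched.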

I think in the snippet I quoted above that extra_headers could never be
NULL now, right? We'll always return at least an empty string. But
moreover, we are formatting it into a strbuf, only to potentially copy
it into another strbuf. Couldn't we just do it all in one strbuf?

Something like this:

 log-tree.c | 29 ++++++++---------------------
 1 file changed, 8 insertions(+), 21 deletions(-)

diff --git a/log-tree.c b/log-tree.c
index 9196b4f1d4..0a703a0303 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -469,29 +469,22 @@ void fmt_output_email_subject(struct strbuf *sb, struct rev_info *opt)
 	}
 }
 
-static char *extra_and_pe_headers(const char *extra_headers, const char *pe_headers) {
-	struct strbuf all_headers = STRBUF_INIT;
-
-	if (extra_headers)
-		strbuf_addstr(&all_headers, extra_headers);
-	if (pe_headers) {
-		strbuf_addstr(&all_headers, pe_headers);
-	}
-	return strbuf_detach(&all_headers, NULL);
-}
-
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			     const char **extra_headers_p,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart)
 {
-	const char *extra_headers =
-		extra_and_pe_headers(opt->extra_headers, opt->pe_headers);
+	struct strbuf headers = STRBUF_INIT;
 	const char *name = oid_to_hex(opt->zero_commit ?
 				      null_oid() : &commit->object.oid);
 
 	*need_8bit_cte_p = 0; /* unknown */
 
+	if (opt->extra_headers)
+		strbuf_addstr(&headers, opt->extra_headers);
+	if (opt->pe_headers)
+		strbuf_addstr(&headers, opt->pe_headers);
+
 	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);
 	graph_show_oneline(opt->graph);
 	if (opt->message_id) {
@@ -508,16 +501,13 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		graph_show_oneline(opt->graph);
 	}
 	if (opt->mime_boundary && maybe_multipart) {
-		static struct strbuf subject_buffer = STRBUF_INIT;
 		static struct strbuf buffer = STRBUF_INIT;
 		struct strbuf filename =  STRBUF_INIT;
 		*need_8bit_cte_p = -1; /* NEVER */
 
-		strbuf_reset(&subject_buffer);
 		strbuf_reset(&buffer);
 
-		strbuf_addf(&subject_buffer,
-			 "%s"
+		strbuf_addf(&headers,
 			 "MIME-Version: 1.0\n"
 			 "Content-Type: multipart/mixed;"
 			 " boundary=\"%s%s\"\n"
@@ -528,11 +518,8 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			 "Content-Type: text/plain; "
 			 "charset=UTF-8; format=fixed\n"
 			 "Content-Transfer-Encoding: 8bit\n\n",
-			 extra_headers ? extra_headers : "",
 			 mime_boundary_leader, opt->mime_boundary,
 			 mime_boundary_leader, opt->mime_boundary);
-		free((char *)extra_headers);
-		extra_headers = strbuf_detach(&subject_buffer, NULL);
 
 		if (opt->numbered_files)
 			strbuf_addf(&filename, "%d", opt->nr);
@@ -552,7 +539,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		opt->diffopt.stat_sep = buffer.buf;
 		strbuf_release(&filename);
 	}
-	*extra_headers_p = extra_headers;
+	*extra_headers_p = headers.len ? strbuf_detach(&headers, NULL) : NULL;
 }
 
 static void show_sig_lines(struct rev_info *opt, int status, const char *bol)


The resulting code is shorter and (IMHO) easier to understand. It
avoids an extra allocation and copy when using mime. It also avoids the
allocation of an empty string when opt->extra_headers and
opt->pe_headers are both NULL. It does make an extra copy when
extra_headers is non-NULL but pe_headers is NULL (and you're not using
MIME), as we could just use opt->extra_headers as-is, then. But since
the caller needs to take ownership, we can't avoid that copy.
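
The allocation behavior described above can be illustrated outside of
git with a rough sketch. Everything here is hypothetical: struct buf,
buf_add() and build_headers() are invented for the example and are much
simpler than git's real strbuf API. It accumulates every piece into one
buffer and returns NULL when nothing was written, releasing any
zero-length allocation instead of handing it out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal growable buffer; not git's strbuf. */
struct buf {
	char *data;
	size_t len;
};

static void buf_add(struct buf *b, const char *s)
{
	size_t n = strlen(s);
	char *p = realloc(b->data, b->len + n + 1);

	assert(p); /* sketch only: treat allocation failure as fatal */
	memcpy(p + b->len, s, n + 1); /* copy the trailing NUL too */
	b->data = p;
	b->len += n;
}

/* Returns a caller-owned string, or NULL when there is nothing to emit. */
static char *build_headers(const char *extra, const char *pe, const char *mime)
{
	struct buf b = { NULL, 0 };

	if (extra)
		buf_add(&b, extra);
	if (pe)
		buf_add(&b, pe);
	if (mime)
		buf_add(&b, mime);

	if (!b.len) {
		free(b.data); /* may be NULL; free(NULL) is a no-op */
		return NULL;
	}
	return b.data;
}
```

With all three inputs NULL, no allocation happens at all. With only
extra set, the result is still a copy, which is the unavoidable cost of
the caller owning what it gets back.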

I think you could even do this cleanup before adding pe_headers,
especially if it was coupled with cleaning up the memory ownership
issues.

-Peff
