All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] format-patch: fix ignored encode_email_headers for cover letter
@ 2023-11-09 11:19 Simon Ser
  2023-11-09 18:35 ` Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Simon Ser @ 2023-11-09 11:19 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

When writing the cover letter, the encode_email_headers option was
ignored. That is, UTF-8 subject lines and email addresses were
written out as-is, without any Q-encoding, even if
--encode-email-headers was passed on the command line.

This is due to encode_email_headers not being copied over from
struct rev_info to struct pretty_print_context. Fix that and add
a test.

Signed-off-by: Simon Ser <contact@emersion.fr>
---
 builtin/log.c           |  1 +
 t/t4014-format-patch.sh | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/builtin/log.c b/builtin/log.c
index ba775d7b5cf8..87fd1c8560de 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1364,6 +1364,7 @@ static void make_cover_letter(struct rev_info *rev, int use_separate_file,
 	pp.date_mode.type = DATE_RFC2822;
 	pp.rev = rev;
 	pp.print_email_subject = 1;
+	pp.encode_email_headers = rev->encode_email_headers;
 	pp_user_info(&pp, NULL, &sb, committer, encoding);
 	prepare_cover_text(&pp, description_file, branch_name, &sb,
 			   encoding, need_8bit_cte);
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 5ced27ed4571..e37a1411ee24 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -1906,6 +1906,16 @@ body" &&
 	grep "^body$" actual
 '
 
+test_expect_success 'cover letter with --cover-from-description subject (UTF-8 subject line)' '
+	test_config branch.rebuild-1.description "Café?
+
+body" &&
+	git checkout rebuild-1 &&
+	git format-patch --stdout --cover-letter --cover-from-description subject --encode-email-headers main >actual &&
+	grep "^Subject: \[PATCH 0/2\] =?UTF-8?q?Caf=C3=A9=3F?=$" actual &&
+	! grep "Café" actual
+'
+
 test_expect_success 'cover letter with format.coverFromDescription = auto (short subject line)' '
 	test_config branch.rebuild-1.description "config subject
 
-- 
2.42.0



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] format-patch: fix ignored encode_email_headers for cover letter
  2023-11-09 11:19 [PATCH] format-patch: fix ignored encode_email_headers for cover letter Simon Ser
@ 2023-11-09 18:35 ` Jeff King
  2023-11-10 10:36   ` Simon Ser
  2023-11-12 18:38   ` René Scharfe
  0 siblings, 2 replies; 6+ messages in thread
From: Jeff King @ 2023-11-09 18:35 UTC (permalink / raw)
  To: Simon Ser; +Cc: René Scharfe, git, Junio C Hamano

On Thu, Nov 09, 2023 at 11:19:56AM +0000, Simon Ser wrote:

> When writing the cover letter, the encode_email_headers option was
> ignored. That is, UTF-8 subject lines and email addresses were
> written out as-is, without any Q-encoding, even if
> --encode-email-headers was passed on the command line.
> 
> This is due to encode_email_headers not being copied over from
> struct rev_info to struct pretty_print_context. Fix that and add
> a test.

That makes sense, and your patch looks the right thing to do as an
immediate fix. But I have to wonder:

  1. Are there other bits that need to be copied? Grepping for other
     code that does the same thing, I see that show_log() and
     cmd_format_patch() copy a lot more. (For that matter, why doesn't
     make_cover_letter() just use the context set up by
     cmd_format_patch()?).

  2. Why are we copying this stuff at all? When we introduced the
     pretty-print context back in 6bf139440c (clean up calling
     conventions for pretty.c functions, 2011-05-26), the idea was just
     to keep all of the format options together. But later, 6d167fd7cc
     (pretty: use fmt_output_email_subject(), 2017-03-01) added a
     pointer to the rev_info directly. So could/should we just be using
     pp->rev->encode_email_headers here?

     Or if that field is not always set (or we want to override some
     elements), should there be a single helper function to initialize
     the pretty_print_context from a rev_info, that could be shared
     between spots like show_log() and make_cover_letter()?

I don't think that answering those questions needs to hold up your
patch. We can take it as a quick fix for a real bug, and then anybody
interested can dig further as a separate topic on top.

> diff --git a/builtin/log.c b/builtin/log.c
> index ba775d7b5cf8..87fd1c8560de 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -1364,6 +1364,7 @@ static void make_cover_letter(struct rev_info *rev, int use_separate_file,
>  	pp.date_mode.type = DATE_RFC2822;
>  	pp.rev = rev;
>  	pp.print_email_subject = 1;
> +	pp.encode_email_headers = rev->encode_email_headers;
>  	pp_user_info(&pp, NULL, &sb, committer, encoding);
>  	prepare_cover_text(&pp, description_file, branch_name, &sb,
>  			   encoding, need_8bit_cte);

This part looks obviously good.

> +test_expect_success 'cover letter with --cover-from-description subject (UTF-8 subject line)' '
> +	test_config branch.rebuild-1.description "Café?
> +
> +body" &&
> +	git checkout rebuild-1 &&
> +	git format-patch --stdout --cover-letter --cover-from-description subject --encode-email-headers main >actual &&
> +	grep "^Subject: \[PATCH 0/2\] =?UTF-8?q?Caf=C3=A9=3F?=$" actual &&
> +	! grep "Café" actual
> +'

The test looks correct to me.

Some of these long lines (and the in-string newlines!) make this ugly
and hard to read. But it is also just copying the already-ugly style of
nearby tests. So I'm OK with that. But a prettier version might be:

  test_expect_success 'cover letter respects --encode-email-headers' '
        test_config branch.rebuild-1.description "Café?" &&
        git checkout rebuild-1 &&
        git format-patch --stdout --encode-email-headers \
                --cover-letter --cover-from-description=subject \
                main >actual &&
        ...
  '

I also wondered if we could be just be testing this much more easily
with another header like "--to". But I guess that would be found in both
the cover letter and the actual patches (we also don't seem to encode
it even in the regular patches; is that a bug?).

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] format-patch: fix ignored encode_email_headers for cover letter
  2023-11-09 18:35 ` Jeff King
@ 2023-11-10 10:36   ` Simon Ser
  2023-11-10 21:48     ` Jeff King
  2023-11-12 18:38   ` René Scharfe
  1 sibling, 1 reply; 6+ messages in thread
From: Simon Ser @ 2023-11-10 10:36 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, git, Junio C Hamano

On Thursday, November 9th, 2023 at 19:35, Jeff King <peff@peff.net> wrote:

> That makes sense, and your patch looks the right thing to do as an
> immediate fix. But I have to wonder:
> 
>   1. Are there other bits that need to be copied? Grepping for other
>      code that does the same thing, I see that show_log() and
>      cmd_format_patch() copy a lot more. (For that matter, why doesn't
>      make_cover_letter() just use the context set up by
>      cmd_format_patch()?).
> 
>   2. Why are we copying this stuff at all? When we introduced the
>      pretty-print context back in 6bf139440c (clean up calling
>      conventions for pretty.c functions, 2011-05-26), the idea was just
>      to keep all of the format options together. But later, 6d167fd7cc
>      (pretty: use fmt_output_email_subject(), 2017-03-01) added a
>      pointer to the rev_info directly. So could/should we just be using
>      pp->rev->encode_email_headers here?
> 
>      Or if that field is not always set (or we want to override some
>      elements), should there be a single helper function to initialize
>      the pretty_print_context from a rev_info, that could be shared
>      between spots like show_log() and make_cover_letter()?
> 
> I don't think that answering those questions needs to hold up your
> patch. We can take it as a quick fix for a real bug, and then anybody
> interested can dig further as a separate topic on top.

These are good questions indeed. Unfortunately I don't hink I'll have time to
work on this though.

> > +test_expect_success 'cover letter with --cover-from-description subject (UTF-8 subject line)' '
> > +	test_config branch.rebuild-1.description "Café?
> > +
> > +body" &&
> > +	git checkout rebuild-1 &&
> > +	git format-patch --stdout --cover-letter --cover-from-description subject --encode-email-headers main >actual &&
> > +	grep "^Subject: \[PATCH 0/2\] =?UTF-8?q?Caf=C3=A9=3F?=$" actual &&
> > +	! grep "Café" actual
> > +'
> 
> The test looks correct to me.
> 
> Some of these long lines (and the in-string newlines!) make this ugly
> and hard to read. But it is also just copying the already-ugly style of
> nearby tests. So I'm OK with that. But a prettier version might be:
> 
>   test_expect_success 'cover letter respects --encode-email-headers' '
>         test_config branch.rebuild-1.description "Café?" &&
>         git checkout rebuild-1 &&
>         git format-patch --stdout --encode-email-headers \
>                 --cover-letter --cover-from-description=subject \
>                 main >actual &&
>         ...
>   '

Yeah, that sounds better indeed. Let me know if you want me to resend a cleaner
version of the test.

> I also wondered if we could be just be testing this much more easily
> with another header like "--to". But I guess that would be found in both
> the cover letter and the actual patches (we also don't seem to encode
> it even in the regular patches; is that a bug?).

That sounds like another bug indeed… But maybe that'll be harder to fix. To
Q-encode this field one needs to split off the full name and actual mail
address ("André <andre@example.org>" would be split into "André" and
"andre@example.org"), then Q-encode the full name, then join the strings
together again. In particular, it's incorrect to Q-encode the full string.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] format-patch: fix ignored encode_email_headers for cover letter
  2023-11-10 10:36   ` Simon Ser
@ 2023-11-10 21:48     ` Jeff King
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff King @ 2023-11-10 21:48 UTC (permalink / raw)
  To: Simon Ser; +Cc: René Scharfe, git, Junio C Hamano

On Fri, Nov 10, 2023 at 10:36:22AM +0000, Simon Ser wrote:

> > I don't think that answering those questions needs to hold up your
> > patch. We can take it as a quick fix for a real bug, and then anybody
> > interested can dig further as a separate topic on top.
> 
> These are good questions indeed. Unfortunately I don't hink I'll have time to
> work on this though.

That's OK. I think it's fine to stop here for now.

> > Some of these long lines (and the in-string newlines!) make this ugly
> > and hard to read. But it is also just copying the already-ugly style of
> > nearby tests. So I'm OK with that. But a prettier version might be:
> > 
> >   test_expect_success 'cover letter respects --encode-email-headers' '
> >         test_config branch.rebuild-1.description "Café?" &&
> >         git checkout rebuild-1 &&
> >         git format-patch --stdout --encode-email-headers \
> >                 --cover-letter --cover-from-description=subject \
> >                 main >actual &&
> >         ...
> >   '
> 
> Yeah, that sounds better indeed. Let me know if you want me to resend a cleaner
> version of the test.

I don't have a strong opinion, so I'd leave it up to you.

> > I also wondered if we could be just be testing this much more easily
> > with another header like "--to". But I guess that would be found in both
> > the cover letter and the actual patches (we also don't seem to encode
> > it even in the regular patches; is that a bug?).
> 
> That sounds like another bug indeed… But maybe that'll be harder to fix. To
> Q-encode this field one needs to split off the full name and actual mail
> address ("André <andre@example.org>" would be split into "André" and
> "andre@example.org"), then Q-encode the full name, then join the strings
> together again. In particular, it's incorrect to Q-encode the full string.

Yeah, without even looking at the code, I had a suspicion that this
would be an issue. I doubt that format-patch is doing much parsing at
all of what we feed it via --to.

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] format-patch: fix ignored encode_email_headers for cover letter
  2023-11-09 18:35 ` Jeff King
  2023-11-10 10:36   ` Simon Ser
@ 2023-11-12 18:38   ` René Scharfe
  2023-11-13 19:00     ` Jeff King
  1 sibling, 1 reply; 6+ messages in thread
From: René Scharfe @ 2023-11-12 18:38 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano, Simon Ser

Am 09.11.23 um 19:35 schrieb Jeff King:
> On Thu, Nov 09, 2023 at 11:19:56AM +0000, Simon Ser wrote:
>
>> When writing the cover letter, the encode_email_headers option was
>> ignored. That is, UTF-8 subject lines and email addresses were
>> written out as-is, without any Q-encoding, even if
>> --encode-email-headers was passed on the command line.
>>
>> This is due to encode_email_headers not being copied over from
>> struct rev_info to struct pretty_print_context. Fix that and add
>> a test.
>
> That makes sense, and your patch looks the right thing to do as an
> immediate fix. But I have to wonder:
>
>   1. Are there other bits that need to be copied?

Good question.

> Grepping for other
>      code that does the same thing, I see that show_log() and
>      cmd_format_patch() copy a lot more.

show_log() copies almost half of the struct 6d167fd7cc members
from struct rev_info!

cmd_format_patch() doesn't seem have a struct pretty_print_context,
though...

> (For that matter, why doesn't
>      make_cover_letter() just use the context set up by
>      cmd_format_patch()?).

... which answers this question, but did you perhaps mean a different
function?

>   2. Why are we copying this stuff at all? When we introduced the
>      pretty-print context back in 6bf139440c (clean up calling
>      conventions for pretty.c functions, 2011-05-26), the idea was just
>      to keep all of the format options together. But later, 6d167fd7cc
>      (pretty: use fmt_output_email_subject(), 2017-03-01) added a
>      pointer to the rev_info directly.

Hmm.  Was sticking the rev_info pointer in unwise?  Does it tangle up
things that should be kept separate?  It uses force_in_body_from,
grep_filter, sources, nr, total and subject_prefix from struct rev_info.
That seems a lot, but is just a small fraction of its total members and
we could just copy those we need.  Or prepare the subject string and
pass it in, as before 6d167fd7cc.

> So could/should we just be using
>      pp->rev->encode_email_headers here?

Perhaps.  If we want struct pretty_print_context to be a collection of
named parameters, though, then it makes sense to avoid using
complicated types to provide a nice interface to its callers, and
struct rev_info is one of our largest structs we have.

>      Or if that field is not always set (or we want to override some
>      elements), should there be a single helper function to initialize
>      the pretty_print_context from a rev_info, that could be shared
>      between spots like show_log() and make_cover_letter()?

That would help avoid forgetting to copy something.  But copying is
questionable in general, as you mentioned.  Given the extent of the
overlap, would it make sense to embed struct pretty_print_context in
struct rev_info instead?  Or a subset thereof?

René

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] format-patch: fix ignored encode_email_headers for cover letter
  2023-11-12 18:38   ` René Scharfe
@ 2023-11-13 19:00     ` Jeff King
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff King @ 2023-11-13 19:00 UTC (permalink / raw)
  To: René Scharfe; +Cc: git, Junio C Hamano, Simon Ser

On Sun, Nov 12, 2023 at 07:38:22PM +0100, René Scharfe wrote:

> > Grepping for other
> >      code that does the same thing, I see that show_log() and
> >      cmd_format_patch() copy a lot more.
> 
> show_log() copies almost half of the struct 6d167fd7cc members
> from struct rev_info!
> 
> cmd_format_patch() doesn't seem have a struct pretty_print_context,
> though...

Doh, you're right. I grepped for spots setting encode_email_headers, but
the one in cmd_format_patch() is copying it from the config-default into
the rev_info, not into the pretty-print context.

Which makes sense. It is going to call show_log() to show each commit,
which is where the value is copied into the pretty-print context.

> > (For that matter, why doesn't
> >      make_cover_letter() just use the context set up by
> >      cmd_format_patch()?).
> 
> ... which answers this question, but did you perhaps mean a different
> function?

Right, I was just confused.

> >   2. Why are we copying this stuff at all? When we introduced the
> >      pretty-print context back in 6bf139440c (clean up calling
> >      conventions for pretty.c functions, 2011-05-26), the idea was just
> >      to keep all of the format options together. But later, 6d167fd7cc
> >      (pretty: use fmt_output_email_subject(), 2017-03-01) added a
> >      pointer to the rev_info directly.
> 
> Hmm.  Was sticking the rev_info pointer in unwise?  Does it tangle up
> things that should be kept separate?  It uses force_in_body_from,
> grep_filter, sources, nr, total and subject_prefix from struct rev_info.
> That seems a lot, but is just a small fraction of its total members and
> we could just copy those we need.  Or prepare the subject string and
> pass it in, as before 6d167fd7cc.

I don't know that it was unwise. I was mostly just noting the history
because that explains why we _didn't_ simply refer to ctx->revs in
6bf139440c. Has the separation between the two been valuable since then?
I'm not sure. We do have some code paths that do not have a rev_info at
all (e.g., pp_commit_easy(), which makes an ad-hoc empty context
struct).

> > So could/should we just be using
> >      pp->rev->encode_email_headers here?
> 
> Perhaps.  If we want struct pretty_print_context to be a collection of
> named parameters, though, then it makes sense to avoid using
> complicated types to provide a nice interface to its callers, and
> struct rev_info is one of our largest structs we have.

Yeah, philosophically it may be better to keep the modules separated.
But if we end up having to just copy a ton of fields, it may not be as
practical. If we can at least factor that out into a single spot,
though, it may not be so bad.

> >      Or if that field is not always set (or we want to override some
> >      elements), should there be a single helper function to initialize
> >      the pretty_print_context from a rev_info, that could be shared
> >      between spots like show_log() and make_cover_letter()?
> 
> That would help avoid forgetting to copy something.  But copying is
> questionable in general, as you mentioned.  Given the extent of the
> overlap, would it make sense to embed struct pretty_print_context in
> struct rev_info instead?  Or a subset thereof?

I had a similar thought, but the pretty_print_context carries both input
options from the caller, as well as computed state used internally by
the pretty-print code. So you'd have to split those two up, I would
think. And now all of the pretty-print functions have to pass around
_two_ contexts.

That's more annoying, but arguably is a cleaner design (the internal
struct can be a private thing that is not even defined outside of
pretty.c). I dunno.

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-11-13 19:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-09 11:19 [PATCH] format-patch: fix ignored encode_email_headers for cover letter Simon Ser
2023-11-09 18:35 ` Jeff King
2023-11-10 10:36   ` Simon Ser
2023-11-10 21:48     ` Jeff King
2023-11-12 18:38   ` René Scharfe
2023-11-13 19:00     ` Jeff King

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.