* [PATCH] checkpatch: Only encode UTF-8 quoted printable mail headers
@ 2018-07-18 11:35 Geert Uytterhoeven
  2018-07-18 11:59 ` Joe Perches
From: Geert Uytterhoeven @ 2018-07-18 11:35 UTC (permalink / raw)
  To: Andy Whitcroft, Joe Perches, Andrew Morton
  Cc: Stephen Rothwell, linux-kernel, Geert Uytterhoeven

As Perl uses its own internal character encoding, always calling
encode("utf8", ...) on the author name may cause corruption, leading to
an author signoff mismatch.

This happens in the following cases:
  - If a patch is in ISO-8859, and contains a non-ASCII author name in
    the From: line, it is converted to UTF-8, while the Signed-off-by
    line will still be in ISO-8859.
  - If a patch is in UTF-8, and contains a non-ASCII author name in the
    body (not header) From: line, it is assumed to be encoded in Perl's
    internal character encoding, and converted to UTF-8 incorrectly,
    while the Signed-off-by line will be in real UTF-8.
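
The second case can be reproduced in isolation. The following is only a
minimal sketch, not code from checkpatch.pl; the name "René" and the
variable names are made up for illustration. It assumes the patch line was
read as raw bytes, without any input decoding layer.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Raw UTF-8 bytes for "René", as they would appear in a Signed-off-by
# line of a UTF-8 patch (illustrative value).
my $signoff = "Ren\xc3\xa9";

# encode() treats each byte as one character (U+00C3, U+00A9) and
# encodes both of them again, so two bytes become four.
my $author = encode("utf8", $signoff);

print "signoff: ", unpack("H*", $signoff), "\n";	# 52656ec3a9
print "author:  ", unpack("H*", $author), "\n";	# 52656ec383c2a9
print "mismatch\n" if ($author ne $signoff);
```

The two byte strings differ, which is what makes the author/sign-off
comparison fail.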

Fix this by only doing the encode step if the From: line used UTF-8
quoted-printable encoding, i.e. contains a "=?utf-8?" encoded-word.
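
As a rough illustration of the intended behaviour, the conditional encode
can be sketched as a standalone helper. author_from_line() is a
hypothetical name used only here (checkpatch.pl does this inline in
process()), and the sample From: lines are invented.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper wrapping the conditional encode step.
sub author_from_line {
	my ($line) = @_;

	return unless (decode("MIME-Header", $line) =~ /^From:\s*(.*)/);

	my $author = $1;
	# Only re-encode when the raw line carried a UTF-8 encoded-word.
	$author = encode("utf8", $author) if ($line =~ /=\?utf-8\?/i);
	$author =~ s/"//g;
	return $author;
}

# Header From: using a UTF-8 encoded-word.
my $from_header = author_from_line("From: =?UTF-8?q?Ren=C3=A9?= <rene\@example.com>");
# Body From: holding raw UTF-8 bytes.
my $from_body = author_from_line("From: Ren\xc3\xa9 <rene\@example.com>");

# What a UTF-8 Signed-off-by line would contain after the tag.
my $expected = "Ren\xc3\xa9 <rene\@example.com>";

print "header ok\n" if ($from_header eq $expected);
print "body ok\n" if ($from_body eq $expected);
```

Both forms should end up as the same UTF-8 byte string, so either one can
be compared directly against the Signed-off-by line.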

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
Fixes: bc76e3a125b44379 ("checkpatch: warn if missing author Signed-off-by")
in -next

To be folded into "checkpatch: Warn if missing author Signed-off-by" in
Andrew's tree.
---
 scripts/checkpatch.pl | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 3d01fee203c4775d..e847377779e7804f 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2523,7 +2523,8 @@ sub process {
 
 # Check the patch for a From:
 		if (decode("MIME-Header", $line) =~ /^From:\s*(.*)/) {
-			$author = encode("utf8", $1);
+			$author = $1;
+			$author = encode("utf8", $author) if $line =~ /=\?utf-8\?/i;
 			$author =~ s/"//g;
 		}
 
-- 
2.17.1



* Re: [PATCH] checkpatch: Only encode UTF-8 quoted printable mail headers
  2018-07-18 11:35 [PATCH] checkpatch: Only encode UTF-8 quoted printable mail headers Geert Uytterhoeven
@ 2018-07-18 11:59 ` Joe Perches
From: Joe Perches @ 2018-07-18 11:59 UTC (permalink / raw)
  To: Geert Uytterhoeven, Andy Whitcroft, Andrew Morton
  Cc: Stephen Rothwell, linux-kernel

On Wed, 2018-07-18 at 13:35 +0200, Geert Uytterhoeven wrote:
> As Perl uses its own internal character encoding, always calling
> encode("utf8", ...) on the author name may cause corruption, leading to
> an author signoff mismatch.
[]
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -2523,7 +2523,8 @@ sub process {
>  
>  # Check the patch for a From:
>  		if (decode("MIME-Header", $line) =~ /^From:\s*(.*)/) {
> -			$author = encode("utf8", $1);
> +			$author = $1;
> +			$author = encode("utf8", $author) if $line =~ /=\?utf-8\?/i;

trivial:

checkpatch uses parentheses around tests so

			$author = encode("utf8", $author) if ($line =~ /=\?utf-8\?/i);

