* [PATCH] checkpatch: Only encode UTF-8 quoted printable mail headers
@ 2018-07-18 11:35 Geert Uytterhoeven
2018-07-18 11:59 ` Joe Perches
From: Geert Uytterhoeven @ 2018-07-18 11:35 UTC
To: Andy Whitcroft, Joe Perches, Andrew Morton
Cc: Stephen Rothwell, linux-kernel, Geert Uytterhoeven
As Perl uses its own internal character encoding, always calling
encode("utf8", ...) on the author name may cause corruption, leading to
an author sign-off mismatch.

This happens in the following cases:
  - If a patch is in ISO-8859, and contains a non-ASCII author name in
    the From: line, it is converted to UTF-8, while the Signed-off-by
    line will still be in ISO-8859.
  - If a patch is in UTF-8, and contains a non-ASCII author name in the
    body (not header) From: line, it is assumed to be encoded in Perl's
    internal character encoding, and converted to UTF-8 incorrectly,
    while the Signed-off-by line will be in real UTF-8.

Fix this by only doing the encode step if the From: line used UTF-8
quoted-printable encoding.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
Fixes: bc76e3a125b44379 ("checkpatch: warn if missing author Signed-off-by")
in -next
To be folded into "checkpatch: Warn if missing author Signed-off-by" in
Andrew's tree.
---
scripts/checkpatch.pl | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 3d01fee203c4775d..e847377779e7804f 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2523,7 +2523,8 @@ sub process {
 # Check the patch for a From:
 		if (decode("MIME-Header", $line) =~ /^From:\s*(.*)/) {
-			$author = encode("utf8", $1);
+			$author = $1;
+			$author = encode("utf8", $author) if $line =~ /=\?utf-8\?/i;
 			$author =~ s/"//g;
 		}
--
2.17.1
* Re: [PATCH] checkpatch: Only encode UTF-8 quoted printable mail headers
2018-07-18 11:35 [PATCH] checkpatch: Only encode UTF-8 quoted printable mail headers Geert Uytterhoeven
@ 2018-07-18 11:59 ` Joe Perches
From: Joe Perches @ 2018-07-18 11:59 UTC
To: Geert Uytterhoeven, Andy Whitcroft, Andrew Morton
Cc: Stephen Rothwell, linux-kernel
On Wed, 2018-07-18 at 13:35 +0200, Geert Uytterhoeven wrote:
> As Perl uses its own internal character encoding, always calling
> encode("utf8", ...) on the author name may cause corruption, leading to
> an author sign-off mismatch.
[]
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -2523,7 +2523,8 @@ sub process {
>
>  # Check the patch for a From:
>  		if (decode("MIME-Header", $line) =~ /^From:\s*(.*)/) {
> -			$author = encode("utf8", $1);
> +			$author = $1;
> +			$author = encode("utf8", $author) if $line =~ /=\?utf-8\?/i;

trivial:

checkpatch uses parentheses around tests, so:

			$author = encode("utf8", $author) if ($line =~ /=\?utf-8\?/i);
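
The mismatch described in the patch can be reproduced outside checkpatch with a small standalone sketch. The helper name author_from_line, the sample name, and the example.org address are hypothetical and not part of checkpatch.pl. The helper only mirrors the patched logic, using the parenthesized form suggested above. The sign-off string is assumed to hold the raw UTF-8 bytes as they would appear in a UTF-8 patch body.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper mirroring the patched logic: decode the From: line,
# and re-encode to UTF-8 only if the raw line used a UTF-8 encoded-word.
sub author_from_line {
	my ($line) = @_;
	my $author;
	if (decode("MIME-Header", $line) =~ /^From:\s*(.*)/) {
		$author = $1;
		$author = encode("utf8", $author) if ($line =~ /=\?utf-8\?/i);
		$author =~ s/"//g;
	}
	return $author;
}

# Assumed sample data: the sign-off name as raw UTF-8 bytes.
my $signoff = "J\xC3\xBCrgen <j\@example.org>";

# Case 1: mail header From: line using a UTF-8 quoted-printable encoded-word.
my $header = 'From: =?UTF-8?Q?J=C3=BCrgen?= <j@example.org>';
my $from_header = author_from_line($header);
print "encoded-word header matches sign-off: ",
	($from_header eq $signoff ? "yes" : "no"), "\n";

# Case 2: the old behaviour, encoding bytes that are already UTF-8.
# Each byte is treated as a character and encoded again.
my $double = encode("utf8", $signoff);
print "unconditional encode matches sign-off: ",
	($double eq $signoff ? "yes" : "no"), "\n";
```

Run as a plain script, the first comparison should report a match and the second a mismatch, which is the false "missing author Signed-off-by" condition the patch avoids.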