From: Thomas Huth <thuth@redhat.com>
To: qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
	Markus Armbruster <armbru@redhat.com>
Subject: [Qemu-devel] [RFC PATCH 2/5] checkpatch: check utf-8 content from a commit log when it's missing from charset
Date: Thu, 26 Jan 2017 14:11:02 +0100
Message-ID: <1485436265-12573-3-git-send-email-thuth@redhat.com>
In-Reply-To: <1485436265-12573-1-git-send-email-thuth@redhat.com>

This is a port of the following commit from the Linux kernel:

commit fa64205df9dfd7b7662cc64a7e82115c00e428e5
Author: Pasi Savanainen <pasi.savanainen@nixu.com>
Date:   Thu Oct 4 17:13:29 2012 -0700

    checkpatch: check utf-8 content from a commit log when it's missing from charset

    Check that a commit log doesn't contain UTF-8 when a mail header
    explicitly defines a different charset, like

    'Content-Type: text/plain; charset="us-ascii"'

    Signed-off-by: Pasi Savanainen <pasi.savanainen@nixu.com>
    Cc: Joe Perches <joe@perches.com>
    Cc: Andy Whitcroft <apw@canonical.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Note: I slightly updated the regex for the "Content-Type" line check,
since most of the time the character set does not seem to be given
in quotes.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
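
For illustration only, here is a minimal standalone sketch of the logic the
hunk below adds, assuming a mail whose headers end at the first blank line
and whose patch body starts at a "diff --git" line. The helper names
declares_non_utf8_charset and utf8_in_commit_log and the pattern
$NON_ASCII_UTF8_SKETCH are hypothetical and do not exist in checkpatch.pl;
the real script tracks $in_header_lines and $in_commit_log differently and
uses a stricter $NON_ASCII_UTF8 pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Simplified stand-in for checkpatch.pl's $NON_ASCII_UTF8 (assumption: the
# real pattern is stricter): a UTF-8 lead byte followed by continuation bytes.
my $NON_ASCII_UTF8_SKETCH = qr/[\xC2-\xF4][\x80-\xBF]+/;

# Hypothetical helper: returns 1 if a header line declares a charset other
# than UTF-8, 0 otherwise.
sub declares_non_utf8_charset {
    my ($line) = @_;
    # Same pattern as in the patch; the charset value may or may not be quoted.
    return ($line =~ /^Content-Type:.+charset=(.+)$/ && $1 !~ /utf-8/i) ? 1 : 0;
}

# Hypothetical helper: returns the 1-based numbers of commit-log lines that
# contain UTF-8 although the headers declared a different charset.
sub utf8_in_commit_log {
    my (@lines) = @_;
    my $in_header_lines  = 1;
    my $non_utf8_charset = 0;
    my @hits;
    for my $i (0 .. $#lines) {
        my $line = $lines[$i];
        last if $line =~ /^diff --git /;    # patch body starts, stop scanning
        if ($in_header_lines) {
            if ($line eq '') {              # blank line ends the mail headers
                $in_header_lines = 0;
                next;
            }
            $non_utf8_charset = 1 if declares_non_utf8_charset($line);
            next;
        }
        push @hits, $i + 1
            if $non_utf8_charset && $line =~ $NON_ASCII_UTF8_SKETCH;
    }
    return @hits;
}

# Sample mail: us-ascii is declared, but the log contains a UTF-8 "e acute".
my @mail = (
    'From: Someone <someone@example.org>',
    'Content-Type: text/plain; charset=us-ascii',
    '',
    "Fix the caf\xC3\xA9 handling",
    '    diff --git is only recognised at the start of a line',
    'diff --git a/foo b/foo',
);
print join(',', utf8_in_commit_log(@mail)), "\n";
```

With the sample mail above the script prints 4, the number of the commit-log
line holding the two-byte sequence. A header such as
'Content-Type: text/plain; charset="UTF-8"' leaves the flag unset, so the
same log line would not be reported.
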
 scripts/checkpatch.pl | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 5da423a..0f88e3b 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1097,6 +1097,8 @@ sub process {
 	my $in_header_lines = 1;
 	my $in_commit_log = 0;		#Scanning lines before patch
 
+	my $non_utf8_charset = 0;
+
 	our @report = ();
 	our $cnt_lines = 0;
 	our $cnt_error = 0;
@@ -1324,8 +1326,15 @@ sub process {
 			$in_commit_log = 1;
 		}
 
-# Still not yet in a patch, check for any UTF-8
-		if ($in_commit_log && $realfile =~ /^$/ &&
+# Check if there is UTF-8 in a commit log when a mail header has explicitly
+# declined it, i.e. declared a charset other than UTF-8.
+		if ($in_header_lines &&
+		    $rawline =~ /^Content-Type:.+charset=(.+)$/ &&
+		    $1 !~ /utf-8/i) {
+			$non_utf8_charset = 1;
+		}
+
+		if ($in_commit_log && $non_utf8_charset && $realfile =~ /^$/ &&
 		    $rawline =~ /$NON_ASCII_UTF8/) {
 			WARN("8-bit UTF-8 used in possible commit log\n"
 			     . $herecurr);
-- 
1.8.3.1