All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC PATCH 0/5] checkpatch: Test bad UTF-8 encodings and updates to MAINTAINERS
@ 2017-01-26 13:11 Thomas Huth
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 1/5] checkpatch: add a check for utf-8 in commit logs Thomas Huth
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Thomas Huth @ 2017-01-26 13:11 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Markus Armbruster

I recently noticed that the checkpatch.pl script from the Linux kernel
nowadays issues a warning if you try to add, delete or move a file
without updating the MAINTAINERS file in the same turn, too. Since
we are also struggling with keeping the MAINTAINERS file of QEMU up to
date, it might be useful to have this feature in QEMU's checkpatch.pl,
too. So here is a port the upstream patches to our version of checkpatch,
including some patches to check for invalid utf-8 in the commit message
(which were a prerequisite to the MAINTAINERS patches due to the
$in_commit_log variable that is introduced there).

Thomas Huth (5):
  checkpatch: add a check for utf-8 in commit logs
  checkpatch: check utf-8 content from a commit log when it's missing
    from charset
  checkpatch: ignore email headers better
  checkpatch: emit a reminder about MAINTAINERS on file add/move/delete
  checkpatch: reduce MAINTAINERS update message frequency

 scripts/checkpatch.pl | 57 +++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 53 insertions(+), 4 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Qemu-devel] [RFC PATCH 1/5] checkpatch: add a check for utf-8 in commit logs
  2017-01-26 13:11 [Qemu-devel] [RFC PATCH 0/5] checkpatch: Test bad UTF-8 encodings and updates to MAINTAINERS Thomas Huth
@ 2017-01-26 13:11 ` Thomas Huth
  2017-01-30 14:12   ` Stefan Hajnoczi
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 2/5] checkpatch: check utf-8 content from a commit log when it's missing from charset Thomas Huth
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Thomas Huth @ 2017-01-26 13:11 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Markus Armbruster

This patch is a port of the following commit from the Linux kernel:

commit 15662b3e8644905032c2e26808401a487d4e90c1
Author: Joe Perches <joe@perches.com>
Date:   Mon Oct 31 17:13:12 2011 -0700

    checkpatch: add a --strict check for utf-8 in commit logs

    Some find using utf-8 in commit logs inappropriate.

    Some patch commit logs contain unintended utf-8 characters when doing
    things like copy/pasting compilation output.

    Look for the start of any commit log by skipping initial lines that look
    like email headers and "From: " lines.

    Stop looking for utf-8 at the first signature line.

    Signed-off-by: Joe Perches <joe@perches.com>
    Suggested-by: Andrew Morton <akpm@linux-foundation.org>
    Cc: Andy Whitcroft <apw@shadowen.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 scripts/checkpatch.pl | 30 ++++++++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index f084542..5da423a 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -172,9 +172,8 @@ our $NonptrType;
 our $Type;
 our $Declare;
 
-our $UTF8	= qr {
-	[\x09\x0A\x0D\x20-\x7E]              # ASCII
-	| [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
+our $NON_ASCII_UTF8	= qr{
+	[\xC2-\xDF][\x80-\xBF]               # non-overlong 2-byte
 	|  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
 	| [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
 	|  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
@@ -183,6 +182,11 @@ our $UTF8	= qr {
 	|  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
 }x;
 
+our $UTF8	= qr{
+	[\x09\x0A\x0D\x20-\x7E]              # ASCII
+	| $NON_ASCII_UTF8
+}x;
+
 # There are still some false positives, but this catches most
 # common cases.
 our $typeTypedefs = qr{(?x:
@@ -1090,6 +1094,9 @@ sub process {
 	my $signoff = 0;
 	my $is_patch = 0;
 
+	my $in_header_lines = 1;
+	my $in_commit_log = 0;		#Scanning lines before patch
+
 	our @report = ();
 	our $cnt_lines = 0;
 	our $cnt_error = 0;
@@ -1242,7 +1249,6 @@ sub process {
 		if ($line =~ /^diff --git.*?(\S+)$/) {
 			$realfile = $1;
 			$realfile =~ s@^([^/]*)/@@;
-
 		} elsif ($line =~ /^\+\+\+\s+(\S+)/) {
 			$realfile = $1;
 			$realfile =~ s@^([^/]*)/@@;
@@ -1281,6 +1287,7 @@ sub process {
 		if ($line =~ /^\s*signed-off-by:/i) {
 			# This is a signoff, if ugly, so do not double report.
 			$signoff++;
+			$in_commit_log = 0;
 			if (!($line =~ /^\s*Signed-off-by:/)) {
 				ERROR("The correct form is \"Signed-off-by\"\n" .
 					$herecurr);
@@ -1309,6 +1316,21 @@ sub process {
 			ERROR("Invalid UTF-8, patch and commit message should be encoded in UTF-8\n" . $hereptr);
 		}
 
+# Check if it's the start of a commit log
+# (not a header line and we haven't seen the patch filename)
+		if ($in_header_lines && $realfile =~ /^$/ &&
+		    $rawline !~ /^(commit\b|from\b|\w+:).+$/i) {
+			$in_header_lines = 0;
+			$in_commit_log = 1;
+		}
+
+# Still not yet in a patch, check for any UTF-8
+		if ($in_commit_log && $realfile =~ /^$/ &&
+		    $rawline =~ /$NON_ASCII_UTF8/) {
+			WARN("8-bit UTF-8 used in possible commit log\n"
+			     . $herecurr);
+		}
+
 # ignore non-hunk lines and lines being removed
 		next if (!$hunk_line || $line =~ /^-/);
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [RFC PATCH 2/5] checkpatch: check utf-8 content from a commit log when it's missing from charset
  2017-01-26 13:11 [Qemu-devel] [RFC PATCH 0/5] checkpatch: Test bad UTF-8 encodings and updates to MAINTAINERS Thomas Huth
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 1/5] checkpatch: add a check for utf-8 in commit logs Thomas Huth
@ 2017-01-26 13:11 ` Thomas Huth
  2017-01-30 14:13   ` Stefan Hajnoczi
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 3/5] checkpatch: ignore email headers better Thomas Huth
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Thomas Huth @ 2017-01-26 13:11 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Markus Armbruster

This is a port of the following commit from the Linux kernel:

commit fa64205df9dfd7b7662cc64a7e82115c00e428e5
Author: Pasi Savanainen <pasi.savanainen@nixu.com>
Date:   Thu Oct 4 17:13:29 2012 -0700

    checkpatch: check utf-8 content from a commit log when it's missing from charset

    Check that a commit log doesn't contain UTF-8 when a mail header
    explicitly defines a different charset, like

    'Content-Type: text/plain; charset="us-ascii"'

    Signed-off-by: Pasi Savanainen <pasi.savanainen@nixu.com>
    Cc: Joe Perches <joe@perches.com>
    Cc: Andy Whitcroft <apw@canonical.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Note: I slightly updated the regex for the "Content-Type" line check
since most of the time, the character set does not seem to be given
in quotes.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 scripts/checkpatch.pl | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 5da423a..0f88e3b 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1097,6 +1097,8 @@ sub process {
 	my $in_header_lines = 1;
 	my $in_commit_log = 0;		#Scanning lines before patch
 
+	my $non_utf8_charset = 0;
+
 	our @report = ();
 	our $cnt_lines = 0;
 	our $cnt_error = 0;
@@ -1324,8 +1326,15 @@ sub process {
 			$in_commit_log = 1;
 		}
 
-# Still not yet in a patch, check for any UTF-8
-		if ($in_commit_log && $realfile =~ /^$/ &&
+# Check if there is UTF-8 in a commit log when a mail header has explicitly
+# declined it, i.e defined some charset where it is missing.
+		if ($in_header_lines &&
+		    $rawline =~ /^Content-Type:.+charset=(.+)$/ &&
+		    $1 !~ /utf-8/i) {
+			$non_utf8_charset = 1;
+		}
+
+		if ($in_commit_log && $non_utf8_charset && $realfile =~ /^$/ &&
 		    $rawline =~ /$NON_ASCII_UTF8/) {
 			WARN("8-bit UTF-8 used in possible commit log\n"
 			     . $herecurr);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [RFC PATCH 3/5] checkpatch: ignore email headers better
  2017-01-26 13:11 [Qemu-devel] [RFC PATCH 0/5] checkpatch: Test bad UTF-8 encodings and updates to MAINTAINERS Thomas Huth
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 1/5] checkpatch: add a check for utf-8 in commit logs Thomas Huth
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 2/5] checkpatch: check utf-8 content from a commit log when it's missing from charset Thomas Huth
@ 2017-01-26 13:11 ` Thomas Huth
  2017-01-30 14:14   ` Stefan Hajnoczi
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 4/5] checkpatch: emit a reminder about MAINTAINERS on file add/move/delete Thomas Huth
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 5/5] checkpatch: reduce MAINTAINERS update message frequency Thomas Huth
  4 siblings, 1 reply; 16+ messages in thread
From: Thomas Huth @ 2017-01-26 13:11 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Markus Armbruster

This is a port of the following commit from the Linux kernel:

commit 29ee1b0c67e0dd7dea8dd718e8326076bce5b6fe
Author: Joe Perches <joe@perches.com>
Date:   Wed Aug 6 16:10:35 2014 -0700

    checkpatch: ignore email headers better

    There are some patches created by git format-patch that when scanned by
    checkpatch report errors on lines like

    To: address.tld

    This is a checkpatch false positive.

    Improve the logic a bit to ignore folded email headers to avoid emitting
    these messages.

    Signed-off-by: Joe Perches <joe@perches.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 scripts/checkpatch.pl | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 0f88e3b..52ad64a 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1094,7 +1094,7 @@ sub process {
 	my $signoff = 0;
 	my $is_patch = 0;
 
-	my $in_header_lines = 1;
+	my $in_header_lines = $file ? 0 : 1;
 	my $in_commit_log = 0;		#Scanning lines before patch
 
 	my $non_utf8_charset = 0;
@@ -1321,7 +1321,8 @@ sub process {
 # Check if it's the start of a commit log
 # (not a header line and we haven't seen the patch filename)
 		if ($in_header_lines && $realfile =~ /^$/ &&
-		    $rawline !~ /^(commit\b|from\b|\w+:).+$/i) {
+		    !($rawline =~ /^\s+\S/ ||
+		      $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) {
 			$in_header_lines = 0;
 			$in_commit_log = 1;
 		}
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [RFC PATCH 4/5] checkpatch: emit a reminder about MAINTAINERS on file add/move/delete
  2017-01-26 13:11 [Qemu-devel] [RFC PATCH 0/5] checkpatch: Test bad UTF-8 encodings and updates to MAINTAINERS Thomas Huth
                   ` (2 preceding siblings ...)
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 3/5] checkpatch: ignore email headers better Thomas Huth
@ 2017-01-26 13:11 ` Thomas Huth
  2017-01-30 14:15   ` Stefan Hajnoczi
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 5/5] checkpatch: reduce MAINTAINERS update message frequency Thomas Huth
  4 siblings, 1 reply; 16+ messages in thread
From: Thomas Huth @ 2017-01-26 13:11 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Markus Armbruster

This is a port of the following commit from the Linux kernel:

commit 13f1937ef33950b1112049972249e6191b82e6c9
Author: Joe Perches <joe@perches.com>
Date:   Wed Aug 6 16:10:59 2014 -0700

    checkpatch: emit a warning on file add/move/delete

    Whenever files are added, moved, or deleted, the MAINTAINERS file
    patterns can be out of sync or outdated.

    To try to keep MAINTAINERS more up-to-date, add a one-time warning
    whenever a patch does any of those.

    Signed-off-by: Joe Perches <joe@perches.com>
    Acked-by: Andy Whitcroft <apw@canonical.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Note: I've changed the "WARNING:" into a "NOTE:" since this is often
caused by false positives, so a warning seems too harsh to me.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 scripts/checkpatch.pl | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 52ad64a..e1be7b3 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1096,7 +1096,7 @@ sub process {
 
 	my $in_header_lines = $file ? 0 : 1;
 	my $in_commit_log = 0;		#Scanning lines before patch
-
+	my $reported_maintainer_file = 0;
 	my $non_utf8_charset = 0;
 
 	our @report = ();
@@ -1300,6 +1300,17 @@ sub process {
 			}
 		}
 
+# Check for added, moved or deleted files
+		if (!$reported_maintainer_file && !$in_commit_log &&
+		    ($line =~ /^(?:new|deleted) file mode\s*\d+\s*$/ ||
+		     $line =~ /^rename (?:from|to) [\w\/\.\-]+\s*$/ ||
+		     ($line =~ /\{\s*([\w\/\.\-]*)\s*\=\>\s*([\w\/\.\-]*)\s*\}/ &&
+		      (defined($1) || defined($2))))) {
+			$reported_maintainer_file = 1;
+			print "NOTE: added, moved or deleted file(s), "
+			      ."does MAINTAINERS need updating?\n" . $herecurr;
+		}
+
 # Check for wrappage within a valid hunk of the file
 		if ($realcnt != 0 && $line !~ m{^(?:\+|-| |\\ No newline|$)}) {
 			ERROR("patch seems to be corrupt (line wrapped?)\n" .
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [RFC PATCH 5/5] checkpatch: reduce MAINTAINERS update message frequency
  2017-01-26 13:11 [Qemu-devel] [RFC PATCH 0/5] checkpatch: Test bad UTF-8 encodings and updates to MAINTAINERS Thomas Huth
                   ` (3 preceding siblings ...)
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 4/5] checkpatch: emit a reminder about MAINTAINERS on file add/move/delete Thomas Huth
@ 2017-01-26 13:11 ` Thomas Huth
  2017-01-26 13:28   ` Paolo Bonzini
  2017-01-30 14:15   ` Stefan Hajnoczi
  4 siblings, 2 replies; 16+ messages in thread
From: Thomas Huth @ 2017-01-26 13:11 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Markus Armbruster

This is a port of the following commit from the Linux kernel:

commit e0d975b1b439c4fef58fbc306c542c94f48bb849
Author: Joe Perches <joe@perches.com>
Date:   Wed Dec 10 15:51:49 2014 -0800

    checkpatch: reduce MAINTAINERS update message frequency

    When files are being added/moved/deleted and a patch contains an update to
    the MAINTAINERS file, assume it's to update the MAINTAINERS file correctly
    and do not emit the "does MAINTAINERS need updating?" message.

    Reported by many people.

    Signed-off-by: Joe Perches <joe@perches.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 scripts/checkpatch.pl | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index e1be7b3..555a5b6 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1300,6 +1300,12 @@ sub process {
 			}
 		}
 
+# Check if MAINTAINERS is being updated.  If so, there's probably no need to
+# emit the "does MAINTAINERS need updating?" message on file add/move/delete
+		if ($line =~ /^\s*MAINTAINERS\s*\|/) {
+			$reported_maintainer_file = 1;
+		}
+
 # Check for added, moved or deleted files
 		if (!$reported_maintainer_file && !$in_commit_log &&
 		    ($line =~ /^(?:new|deleted) file mode\s*\d+\s*$/ ||
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 5/5] checkpatch: reduce MAINTAINERS update message frequency
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 5/5] checkpatch: reduce MAINTAINERS update message frequency Thomas Huth
@ 2017-01-26 13:28   ` Paolo Bonzini
  2017-01-26 13:39     ` Thomas Huth
  2017-01-30 14:15   ` Stefan Hajnoczi
  1 sibling, 1 reply; 16+ messages in thread
From: Paolo Bonzini @ 2017-01-26 13:28 UTC (permalink / raw)
  To: Thomas Huth, qemu-devel, Markus Armbruster



On 26/01/2017 14:11, Thomas Huth wrote:
> This is a port of the following commit from the Linux kernel:
> 
> commit e0d975b1b439c4fef58fbc306c542c94f48bb849
> Author: Joe Perches <joe@perches.com>
> Date:   Wed Dec 10 15:51:49 2014 -0800
> 
>     checkpatch: reduce MAINTAINERS update message frequency
> 
>     When files are being added/moved/deleted and a patch contains an update to
>     the MAINTAINERS file, assume it's to update the MAINTAINERS file correctly
>     and do not emit the "does MAINTAINERS need updating?" message.
> 
>     Reported by many people.
> 
>     Signed-off-by: Joe Perches <joe@perches.com>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  scripts/checkpatch.pl | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index e1be7b3..555a5b6 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -1300,6 +1300,12 @@ sub process {
>  			}
>  		}
>  
> +# Check if MAINTAINERS is being updated.  If so, there's probably no need to
> +# emit the "does MAINTAINERS need updating?" message on file add/move/delete
> +		if ($line =~ /^\s*MAINTAINERS\s*\|/) {
> +			$reported_maintainer_file = 1;
> +		}
> +
>  # Check for added, moved or deleted files
>  		if (!$reported_maintainer_file && !$in_commit_log &&
>  		    ($line =~ /^(?:new|deleted) file mode\s*\d+\s*$/ ||
> 

Maybe leave it as a warning given this change?

Paolo

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 5/5] checkpatch: reduce MAINTAINERS update message frequency
  2017-01-26 13:28   ` Paolo Bonzini
@ 2017-01-26 13:39     ` Thomas Huth
  2017-01-26 14:03       ` Cornelia Huck
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Huth @ 2017-01-26 13:39 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel, Markus Armbruster

On 26.01.2017 14:28, Paolo Bonzini wrote:
> 
> 
> On 26/01/2017 14:11, Thomas Huth wrote:
>> This is a port of the following commit from the Linux kernel:
>>
>> commit e0d975b1b439c4fef58fbc306c542c94f48bb849
>> Author: Joe Perches <joe@perches.com>
>> Date:   Wed Dec 10 15:51:49 2014 -0800
>>
>>     checkpatch: reduce MAINTAINERS update message frequency
>>
>>     When files are being added/moved/deleted and a patch contains an update to
>>     the MAINTAINERS file, assume it's to update the MAINTAINERS file correctly
>>     and do not emit the "does MAINTAINERS need updating?" message.
>>
>>     Reported by many people.
>>
>>     Signed-off-by: Joe Perches <joe@perches.com>
>>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>>
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>  scripts/checkpatch.pl | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
>> index e1be7b3..555a5b6 100755
>> --- a/scripts/checkpatch.pl
>> +++ b/scripts/checkpatch.pl
>> @@ -1300,6 +1300,12 @@ sub process {
>>  			}
>>  		}
>>  
>> +# Check if MAINTAINERS is being updated.  If so, there's probably no need to
>> +# emit the "does MAINTAINERS need updating?" message on file add/move/delete
>> +		if ($line =~ /^\s*MAINTAINERS\s*\|/) {
>> +			$reported_maintainer_file = 1;
>> +		}
>> +
>>  # Check for added, moved or deleted files
>>  		if (!$reported_maintainer_file && !$in_commit_log &&
>>  		    ($line =~ /^(?:new|deleted) file mode\s*\d+\s*$/ ||
>>
> 
> Maybe leave it as a warning given this change?

I think chances are high that it still pops up quite frequently with
false positives:

1) The above regex only triggers for patches that contain a diffstat. If
you run the script on patches without diffstat, you always get the
warning as soon as you add, delete or move a file, even if you update
the MAINTAINERS file in the same patch.

2) I think it is quite common in patch series to first introduce new
files in the first patches, and then update MAINTAINERS only once at the
end.

3) The MAINTAINERS file often covers whole folders with wildcard
expressions. So if you add/delete/rename a file within such a folder,
you don't need to update MAINTAINERS thanks to the wildcard.

I guess people might be annoyed if checkpatch.pl throws a warning in
these cases. So a "NOTE: ..." sounds more sane to me. But if you like,
we can also start with a WARNING first and only ease it if people start
to complain?

 Thomas

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 5/5] checkpatch: reduce MAINTAINERS update message frequency
  2017-01-26 13:39     ` Thomas Huth
@ 2017-01-26 14:03       ` Cornelia Huck
  0 siblings, 0 replies; 16+ messages in thread
From: Cornelia Huck @ 2017-01-26 14:03 UTC (permalink / raw)
  To: Thomas Huth; +Cc: Paolo Bonzini, qemu-devel, Markus Armbruster

On Thu, 26 Jan 2017 14:39:35 +0100
Thomas Huth <thuth@redhat.com> wrote:

> On 26.01.2017 14:28, Paolo Bonzini wrote:
> > 
> > 
> > On 26/01/2017 14:11, Thomas Huth wrote:
> >> This is a port of the following commit from the Linux kernel:
> >>
> >> commit e0d975b1b439c4fef58fbc306c542c94f48bb849
> >> Author: Joe Perches <joe@perches.com>
> >> Date:   Wed Dec 10 15:51:49 2014 -0800
> >>
> >>     checkpatch: reduce MAINTAINERS update message frequency
> >>
> >>     When files are being added/moved/deleted and a patch contains an update to
> >>     the MAINTAINERS file, assume it's to update the MAINTAINERS file correctly
> >>     and do not emit the "does MAINTAINERS need updating?" message.
> >>
> >>     Reported by many people.
> >>
> >>     Signed-off-by: Joe Perches <joe@perches.com>
> >>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> >>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> >>
> >> Signed-off-by: Thomas Huth <thuth@redhat.com>
> >> ---
> >>  scripts/checkpatch.pl | 6 ++++++
> >>  1 file changed, 6 insertions(+)
> >>
> >> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> >> index e1be7b3..555a5b6 100755
> >> --- a/scripts/checkpatch.pl
> >> +++ b/scripts/checkpatch.pl
> >> @@ -1300,6 +1300,12 @@ sub process {
> >>  			}
> >>  		}
> >>  
> >> +# Check if MAINTAINERS is being updated.  If so, there's probably no need to
> >> +# emit the "does MAINTAINERS need updating?" message on file add/move/delete
> >> +		if ($line =~ /^\s*MAINTAINERS\s*\|/) {
> >> +			$reported_maintainer_file = 1;
> >> +		}
> >> +
> >>  # Check for added, moved or deleted files
> >>  		if (!$reported_maintainer_file && !$in_commit_log &&
> >>  		    ($line =~ /^(?:new|deleted) file mode\s*\d+\s*$/ ||
> >>
> > 
> > Maybe leave it as a warning given this change?
> 
> I think chances are high that it still pops up quite frequently with
> false positives:
> 
> 1) The above regex only triggers for patches that contain a diffstat. If
> you run the script on patches without diffstat, you always get the
> warning as soon as you add, delete or move a file, even if you update
> the MAINTAINERS file in the same patch.
> 
> 2) I think it is quite common in patch series to first introduce new
> files in the first patches, and then update MAINTAINERS only once at the
> end.
> 
> 3) The MAINTAINERS file often covers whole folders with wildcard
> expressions. So if you add/delete/rename a file within such a folder,
> you don't need to update MAINTAINERS thanks to the wildcard.
> 
> I guess people might be annoyed if checkpatch.pl throws a warning in
> these cases. So a "NOTE: ..." sounds more sane to me. But if you like,
> we can also start with a WARNING first and only ease it if people start
> to complain?

I'd vote for 'NOTE:'. I'm always a bit annoyed if checkpatch triggers
for a file already covered by 3) (which seems to be the most common
case for patches I've been involved with).

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 1/5] checkpatch: add a check for utf-8 in commit logs
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 1/5] checkpatch: add a check for utf-8 in commit logs Thomas Huth
@ 2017-01-30 14:12   ` Stefan Hajnoczi
  2017-01-30 15:57     ` Thomas Huth
  0 siblings, 1 reply; 16+ messages in thread
From: Stefan Hajnoczi @ 2017-01-30 14:12 UTC (permalink / raw)
  To: Thomas Huth; +Cc: qemu-devel, Paolo Bonzini, Markus Armbruster

[-- Attachment #1: Type: text/plain, Size: 1422 bytes --]

On Thu, Jan 26, 2017 at 02:11:01PM +0100, Thomas Huth wrote:
> This patch is a port of the following commit from the Linux kernel:
> 
> commit 15662b3e8644905032c2e26808401a487d4e90c1
> Author: Joe Perches <joe@perches.com>
> Date:   Mon Oct 31 17:13:12 2011 -0700
> 
>     checkpatch: add a --strict check for utf-8 in commit logs
> 
>     Some find using utf-8 in commit logs inappropriate.
> 
>     Some patch commit logs contain unintended utf-8 characters when doing
>     things like copy/pasting compilation output.
> 
>     Look for the start of any commit log by skipping initial lines that look
>     like email headers and "From: " lines.
> 
>     Stop looking for utf-8 at the first signature line.
> 
>     Signed-off-by: Joe Perches <joe@perches.com>
>     Suggested-by: Andrew Morton <akpm@linux-foundation.org>
>     Cc: Andy Whitcroft <apw@shadowen.org>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  scripts/checkpatch.pl | 30 ++++++++++++++++++++++++++----
>  1 file changed, 26 insertions(+), 4 deletions(-)

This patch prevents including names with non-ASCII characters in the
commit description.  Some people care about the proper spelling of their
names.

Allowing UTF-8 in Signed-off-by and other headers isn't enough.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 2/5] checkpatch: check utf-8 content from a commit log when it's missing from charset
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 2/5] checkpatch: check utf-8 content from a commit log when it's missing from charset Thomas Huth
@ 2017-01-30 14:13   ` Stefan Hajnoczi
  0 siblings, 0 replies; 16+ messages in thread
From: Stefan Hajnoczi @ 2017-01-30 14:13 UTC (permalink / raw)
  To: Thomas Huth; +Cc: qemu-devel, Paolo Bonzini, Markus Armbruster

[-- Attachment #1: Type: text/plain, Size: 1217 bytes --]

On Thu, Jan 26, 2017 at 02:11:02PM +0100, Thomas Huth wrote:
> This is a port of the following commit from the Linux kernel:
> 
> commit fa64205df9dfd7b7662cc64a7e82115c00e428e5
> Author: Pasi Savanainen <pasi.savanainen@nixu.com>
> Date:   Thu Oct 4 17:13:29 2012 -0700
> 
>     checkpatch: check utf-8 content from a commit log when it's missing from charset
> 
>     Check that a commit log doesn't contain UTF-8 when a mail header
>     explicitly defines a different charset, like
> 
>     'Content-Type: text/plain; charset="us-ascii"'
> 
>     Signed-off-by: Pasi Savanainen <pasi.savanainen@nixu.com>
>     Cc: Joe Perches <joe@perches.com>
>     Cc: Andy Whitcroft <apw@canonical.com>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> 
> Note: I slightly updated the regex for the "Content-Type" line check
> since most of the time, the character set does not seem to be given
> in quotes.
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  scripts/checkpatch.pl | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 3/5] checkpatch: ignore email headers better
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 3/5] checkpatch: ignore email headers better Thomas Huth
@ 2017-01-30 14:14   ` Stefan Hajnoczi
  0 siblings, 0 replies; 16+ messages in thread
From: Stefan Hajnoczi @ 2017-01-30 14:14 UTC (permalink / raw)
  To: Thomas Huth; +Cc: qemu-devel, Paolo Bonzini, Markus Armbruster

[-- Attachment #1: Type: text/plain, Size: 1017 bytes --]

On Thu, Jan 26, 2017 at 02:11:03PM +0100, Thomas Huth wrote:
> This is a port of the following commit from the Linux kernel:
> 
> commit 29ee1b0c67e0dd7dea8dd718e8326076bce5b6fe
> Author: Joe Perches <joe@perches.com>
> Date:   Wed Aug 6 16:10:35 2014 -0700
> 
>     checkpatch: ignore email headers better
> 
>     There are some patches created by git format-patch that when scanned by
>     checkpatch report errors on lines like
> 
>     To: address.tld
> 
>     This is a checkpatch false positive.
> 
>     Improve the logic a bit to ignore folded email headers to avoid emitting
>     these messages.
> 
>     Signed-off-by: Joe Perches <joe@perches.com>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  scripts/checkpatch.pl | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 4/5] checkpatch: emit a reminder about MAINTAINERS on file add/move/delete
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 4/5] checkpatch: emit a reminder about MAINTAINERS on file add/move/delete Thomas Huth
@ 2017-01-30 14:15   ` Stefan Hajnoczi
  0 siblings, 0 replies; 16+ messages in thread
From: Stefan Hajnoczi @ 2017-01-30 14:15 UTC (permalink / raw)
  To: Thomas Huth; +Cc: qemu-devel, Paolo Bonzini, Markus Armbruster

[-- Attachment #1: Type: text/plain, Size: 1174 bytes --]

On Thu, Jan 26, 2017 at 02:11:04PM +0100, Thomas Huth wrote:
> This is a port of the following commit from the Linux kernel:
> 
> commit 13f1937ef33950b1112049972249e6191b82e6c9
> Author: Joe Perches <joe@perches.com>
> Date:   Wed Aug 6 16:10:59 2014 -0700
> 
>     checkpatch: emit a warning on file add/move/delete
> 
>     Whenever files are added, moved, or deleted, the MAINTAINERS file
>     patterns can be out of sync or outdated.
> 
>     To try to keep MAINTAINERS more up-to-date, add a one-time warning
>     whenever a patch does any of those.
> 
>     Signed-off-by: Joe Perches <joe@perches.com>
>     Acked-by: Andy Whitcroft <apw@canonical.com>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> 
> Note: I've changed the "WARNING:" into a "NOTE:" since this is often
> caused by false positives, so a warning seems too harsh to me.
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  scripts/checkpatch.pl | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)

Yes!

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 5/5] checkpatch: reduce MAINTAINERS update message frequency
  2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 5/5] checkpatch: reduce MAINTAINERS update message frequency Thomas Huth
  2017-01-26 13:28   ` Paolo Bonzini
@ 2017-01-30 14:15   ` Stefan Hajnoczi
  1 sibling, 0 replies; 16+ messages in thread
From: Stefan Hajnoczi @ 2017-01-30 14:15 UTC (permalink / raw)
  To: Thomas Huth; +Cc: qemu-devel, Paolo Bonzini, Markus Armbruster

[-- Attachment #1: Type: text/plain, Size: 982 bytes --]

On Thu, Jan 26, 2017 at 02:11:05PM +0100, Thomas Huth wrote:
> This is a port of the following commit from the Linux kernel:
> 
> commit e0d975b1b439c4fef58fbc306c542c94f48bb849
> Author: Joe Perches <joe@perches.com>
> Date:   Wed Dec 10 15:51:49 2014 -0800
> 
>     checkpatch: reduce MAINTAINERS update message frequency
> 
>     When files are being added/moved/deleted and a patch contains an update to
>     the MAINTAINERS file, assume it's to update the MAINTAINERS file correctly
>     and do not emit the "does MAINTAINERS need updating?" message.
> 
>     Reported by many people.
> 
>     Signed-off-by: Joe Perches <joe@perches.com>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  scripts/checkpatch.pl | 6 ++++++
>  1 file changed, 6 insertions(+)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 1/5] checkpatch: add a check for utf-8 in commit logs
  2017-01-30 14:12   ` Stefan Hajnoczi
@ 2017-01-30 15:57     ` Thomas Huth
  2017-02-01 16:27       ` Stefan Hajnoczi
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Huth @ 2017-01-30 15:57 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Paolo Bonzini, qemu-devel, Markus Armbruster

[-- Attachment #1: Type: text/plain, Size: 1948 bytes --]

On 30.01.2017 15:12, Stefan Hajnoczi wrote:
> On Thu, Jan 26, 2017 at 02:11:01PM +0100, Thomas Huth wrote:
>> This patch is a port of the following commit from the Linux kernel:
>>
>> commit 15662b3e8644905032c2e26808401a487d4e90c1
>> Author: Joe Perches <joe@perches.com>
>> Date:   Mon Oct 31 17:13:12 2011 -0700
>>
>>     checkpatch: add a --strict check for utf-8 in commit logs
>>
>>     Some find using utf-8 in commit logs inappropriate.
>>
>>     Some patch commit logs contain unintended utf-8 characters when doing
>>     things like copy/pasting compilation output.
>>
>>     Look for the start of any commit log by skipping initial lines that look
>>     like email headers and "From: " lines.
>>
>>     Stop looking for utf-8 at the first signature line.
>>
>>     Signed-off-by: Joe Perches <joe@perches.com>
>>     Suggested-by: Andrew Morton <akpm@linux-foundation.org>
>>     Cc: Andy Whitcroft <apw@shadowen.org>
>>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>>
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>  scripts/checkpatch.pl | 30 ++++++++++++++++++++++++++----
>>  1 file changed, 26 insertions(+), 4 deletions(-)
> 
> This patch prevents including names with non-ASCII characters in the
> commit description.  Some people care about the proper spelling of their
> names.
> 
> Allowing UTF-8 in Signed-off-by and other headers isn't enough.

Right, and I guess the folks from the kernel checkpatch noticed this,
too. That's likely why the next patch restricts this check to only
happen if the patch is from a mail with non-UTF-8 content type, but
still contains UTF-8 characters, i.e. there is really something fishy
with the character set of the patch description. Maybe I should squash
the two patches together, so that it is more obvious what is going on here?

 Thomas




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 1/5] checkpatch: add a check for utf-8 in commit logs
  2017-01-30 15:57     ` Thomas Huth
@ 2017-02-01 16:27       ` Stefan Hajnoczi
  0 siblings, 0 replies; 16+ messages in thread
From: Stefan Hajnoczi @ 2017-02-01 16:27 UTC (permalink / raw)
  To: Thomas Huth; +Cc: Paolo Bonzini, qemu-devel, Markus Armbruster

[-- Attachment #1: Type: text/plain, Size: 2331 bytes --]

On Mon, Jan 30, 2017 at 04:57:53PM +0100, Thomas Huth wrote:
> On 30.01.2017 15:12, Stefan Hajnoczi wrote:
> > On Thu, Jan 26, 2017 at 02:11:01PM +0100, Thomas Huth wrote:
> >> This patch is a port of the following commit from the Linux kernel:
> >>
> >> commit 15662b3e8644905032c2e26808401a487d4e90c1
> >> Author: Joe Perches <joe@perches.com>
> >> Date:   Mon Oct 31 17:13:12 2011 -0700
> >>
> >>     checkpatch: add a --strict check for utf-8 in commit logs
> >>
> >>     Some find using utf-8 in commit logs inappropriate.
> >>
> >>     Some patch commit logs contain unintended utf-8 characters when doing
> >>     things like copy/pasting compilation output.
> >>
> >>     Look for the start of any commit log by skipping initial lines that look
> >>     like email headers and "From: " lines.
> >>
> >>     Stop looking for utf-8 at the first signature line.
> >>
> >>     Signed-off-by: Joe Perches <joe@perches.com>
> >>     Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> >>     Cc: Andy Whitcroft <apw@shadowen.org>
> >>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> >>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> >>
> >> Signed-off-by: Thomas Huth <thuth@redhat.com>
> >> ---
> >>  scripts/checkpatch.pl | 30 ++++++++++++++++++++++++++----
> >>  1 file changed, 26 insertions(+), 4 deletions(-)
> > 
> > This patch prevents including names with non-ASCII characters in the
> > commit description.  Some people care about the proper spelling of their
> > names.
> > 
> > Allowing UTF-8 in Signed-off-by and other headers isn't enough.
> 
> Right, and I guess the folks from the kernel checkpatch noticed this,
> too. That's likely why the next patch restricts this check to only
> happen if the patch is from a mail with non-UTF-8 content type, but
> still contains UTF-8 characters, i.e. there is really something fishy
> with the character set of the patch description. Maybe I should squash
> the two patches together, so that it is more obvious what is going on here?

In that case I'm okay with this patch :).  There's no need to squash it.

Sorry, I didn't make the connection with the next patch.  I thought that
was simply for checking that the byte stream is valid according to the
claimed charset.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-02-01 16:27 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-26 13:11 [Qemu-devel] [RFC PATCH 0/5] checkpatch: Test bad UTF-8 encodings and updates to MAINTAINERS Thomas Huth
2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 1/5] checkpatch: add a check for utf-8 in commit logs Thomas Huth
2017-01-30 14:12   ` Stefan Hajnoczi
2017-01-30 15:57     ` Thomas Huth
2017-02-01 16:27       ` Stefan Hajnoczi
2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 2/5] checkpatch: check utf-8 content from a commit log when it's missing from charset Thomas Huth
2017-01-30 14:13   ` Stefan Hajnoczi
2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 3/5] checkpatch: ignore email headers better Thomas Huth
2017-01-30 14:14   ` Stefan Hajnoczi
2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 4/5] checkpatch: emit a reminder about MAINTAINERS on file add/move/delete Thomas Huth
2017-01-30 14:15   ` Stefan Hajnoczi
2017-01-26 13:11 ` [Qemu-devel] [RFC PATCH 5/5] checkpatch: reduce MAINTAINERS update message frequency Thomas Huth
2017-01-26 13:28   ` Paolo Bonzini
2017-01-26 13:39     ` Thomas Huth
2017-01-26 14:03       ` Cornelia Huck
2017-01-30 14:15   ` Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.