linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] checkpatch: warn on found Change-Id lines
@ 2011-10-10  5:56 Olof Johansson
  2011-10-10 17:09 ` Joe Perches
  0 siblings, 1 reply; 5+ messages in thread
From: Olof Johansson @ 2011-10-10  5:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, apw, Olof Johansson

Some external projects use repo, which uses a special-format Change-Id
field in the commit message for some internal bookkeeping (to re-associate
patches back to review entries, etc). Sometimes they sneak into patches
going upstream, which is embarrassing for the developer, and annoying
for the maintainer.

Add checking for these to checkpatch, to catch them before
posting. Provide a way to disable the check to keep checkpatch useful
for intra-project use.

Signed-off-by: Olof Johansson <olof@lixom.net>
---

 scripts/checkpatch.pl |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 3dfc471..bac1c93 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -18,6 +18,7 @@ my $quiet = 0;
 my $tree = 1;
 my $chk_signoff = 1;
 my $chk_patch = 1;
+my $chk_changeid = 1;
 my $tst_only;
 my $emacs = 0;
 my $terse = 0;
@@ -45,6 +46,7 @@ Options:
   -q, --quiet                quiet
   --no-tree                  run without a kernel tree
   --no-signoff               do not check for 'Signed-off-by' line
+  --no-change-id             don't complain about 'Change-Id' lines
   --patch                    treat FILE as patchfile (default)
   --emacs                    emacs compile window format
   --terse                    one line per report
@@ -99,6 +101,7 @@ GetOptions(
 	'q|quiet+'	=> \$quiet,
 	'tree!'		=> \$tree,
 	'signoff!'	=> \$chk_signoff,
+	'change-id!'	=> \$chk_changeid,
 	'patch!'	=> \$chk_patch,
 	'emacs!'	=> \$emacs,
 	'terse!'	=> \$terse,
@@ -341,6 +344,7 @@ sub deparenthesize {
 }
 
 $chk_signoff = 0 if ($file);
+$chk_changeid = 0 if ($file);
 
 my @dep_includes = ();
 my @dep_functions = ();
@@ -1578,6 +1582,10 @@ sub process {
 				}
 			}
 		}
+# Check for Change-Id lines:
+		if ($line =~ /^\s*change-id:/i && $chk_changeid) {
+			ERROR("CHANGE_ID", "Found Change-Id line\n", $herecurr);
+		}
 
 # Check for wrappage within a valid hunk of the file
 		if ($realcnt != 0 && $line !~ m{^(?:\+|-| |\\ No newline|$)}) {
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] checkpatch: warn on found Change-Id lines
  2011-10-10  5:56 [PATCH] checkpatch: warn on found Change-Id lines Olof Johansson
@ 2011-10-10 17:09 ` Joe Perches
  2011-10-10 22:10   ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Joe Perches @ 2011-10-10 17:09 UTC (permalink / raw)
  To: Olof Johansson; +Cc: linux-kernel, akpm, apw

On Sun, 2011-10-09 at 22:56 -0700, Olof Johansson wrote:
> Some external projects use repo, which uses a special-format Change-Id
> field in the commit message for some internal bookkeeping (to re-associate
> patches back to review entries, etc).

I believe Andrew has a patch from me in his queue
somewhere that tries to isolate the commit message
and does a couple of checks just on that content.

> Sometimes they sneak into patches
> going upstream, which is embarrassing for the developer, and annoying
> for the maintainer.
> 
> Add checking for these to checkpatch, to catch them before
> posting. Provide a way to disable the check to keep checkpatch useful
> for intra-project use.
[]
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> +my $chk_changeid = 1;

> +  --no-change-id             don't complain about 'Change-Id' lines

Adding piecemeal option overrides could eventually
make these sorts of things quite long.

Perhaps it's better to use:
--ignore=CHANGE_ID

or add a .checkpatch.conf file somewhere with this.

> +	'change-id!'	=> \$chk_changeid,

> +$chk_changeid = 0 if ($file);

> +# Check for Change-Id lines:
> +		if ($line =~ /^\s*change-id:/i && $chk_changeid) {
> +			ERROR("CHANGE_ID", "Found Change-Id line\n", $herecurr);
> +		}

I believe this will also match lines in any
patch context, so maybe this is not good.

I think this should be placed adjacent to the
RCS/CVS check.

Maybe it's better to add some "VCS_CRUFT"
option and track/emit all the RCS/SCCS/PVS/CVS
crud in one bundle.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] checkpatch: warn on found Change-Id lines
  2011-10-10 17:09 ` Joe Perches
@ 2011-10-10 22:10   ` Andrew Morton
  2011-10-10 22:32     ` [PATCH] checkpatch: Add a --strict check for utf-8 in commit logs Joe Perches
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2011-10-10 22:10 UTC (permalink / raw)
  To: Joe Perches; +Cc: Olof Johansson, linux-kernel, apw

On Mon, 10 Oct 2011 10:09:59 -0700
Joe Perches <joe@perches.com> wrote:

> On Sun, 2011-10-09 at 22:56 -0700, Olof Johansson wrote:
> > Some external projects use repo, which uses a special-format Change-Id
> > field in the commit message for some internal bookkeeping (to re-associate
> > patches back to review entries, etc).
> 
> I believe Andrew has a patch from me in his queue
> somewhere that tries to isolate the commit message
> and does a couple of checks just on that content.

I haven't seen a changelogged, signed-off, cc'ed-to-mailing-list
version.

Possibly gmail munched it.  I've been having a lot of inexplicable email
vanishings lately :(

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] checkpatch: Add a --strict check for utf-8 in commit logs
  2011-10-10 22:10   ` Andrew Morton
@ 2011-10-10 22:32     ` Joe Perches
  2011-10-10 22:42       ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Joe Perches @ 2011-10-10 22:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, Andy Whitcroft

Some find using utf-8 in commit logs inappropriate.

Some patch commit logs contain unintended utf-8 characters
when doing things like copy/pasting compilation output.

Look for the start of any commit log by skipping initial
lines that look like email headers and "From: " lines.

Stop looking for utf-8 at the first signature line.

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Joe Perches <joe@perches.com>

---

I don't feel strongly that this patch should be applied,
that's why it's a --strict check and not on by default,
but Andrew Morton seems to want something like it...

 scripts/checkpatch.pl |   30 ++++++++++++++++++++++++++----
 1 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 3dfc471..04c0a55 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -240,9 +240,8 @@ our $NonptrType;
 our $Type;
 our $Declare;
 
-our $UTF8	= qr {
-	[\x09\x0A\x0D\x20-\x7E]              # ASCII
-	| [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
+our $NON_ASCII_UTF8	= qr{
+	[\xC2-\xDF][\x80-\xBF]               # non-overlong 2-byte
 	|  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
 	| [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
 	|  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
@@ -251,6 +250,11 @@ our $UTF8	= qr {
 	|  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
 }x;
 
+our $UTF8	= qr{
+	[\x09\x0A\x0D\x20-\x7E]              # ASCII
+	| $NON_ASCII_UTF8
+}x;
+
 our $typeTypedefs = qr{(?x:
 	(?:__)?(?:u|s|be|le)(?:8|16|32|64)|
 	atomic_t
@@ -1330,6 +1334,9 @@ sub process {
 	my $signoff = 0;
 	my $is_patch = 0;
 
+	my $in_header_lines = 1;
+	my $in_commit_log = 0;		#Scanning lines before patch
+
 	our @report = ();
 	our $cnt_lines = 0;
 	our $cnt_error = 0;
@@ -1497,7 +1504,6 @@ sub process {
 		if ($line =~ /^diff --git.*?(\S+)$/) {
 			$realfile = $1;
 			$realfile =~ s@^([^/]*)/@@;
-
 		} elsif ($line =~ /^\+\+\+\s+(\S+)/) {
 			$realfile = $1;
 			$realfile =~ s@^([^/]*)/@@;
@@ -1536,6 +1542,7 @@ sub process {
 # Check the patch for a signoff:
 		if ($line =~ /^\s*signed-off-by:/i) {
 			$signoff++;
+			$in_commit_log = 0;
 		}
 
 # Check signature styles
@@ -1613,6 +1620,21 @@ sub process {
 			    "Invalid UTF-8, patch and commit message should be encoded in UTF-8\n" . $hereptr);
 		}
 
+# Check if it's the start of a commit log
+# (not a header line and we haven't seen the patch filename)
+		if ($in_header_lines && $realfile =~ /^$/ &&
+		    $rawline !~ /^(commit\b|from\b|\w+:).+$/i) {
+			$in_header_lines = 0;
+			$in_commit_log = 1;
+		}
+
+# Still not yet in a patch, check for any UTF-8
+		if ($in_commit_log && $realfile =~ /^$/ &&
+		    $rawline =~ /$NON_ASCII_UTF8/) {
+			CHK("UTF8_BEFORE_PATCH",
+			    "8-bit UTF-8 used in possible commit log\n" . $herecurr);
+		}
+
 # ignore non-hunk lines and lines being removed
 		next if (!$hunk_line || $line =~ /^-/);
 



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] checkpatch: Add a --strict check for utf-8 in commit logs
  2011-10-10 22:32     ` [PATCH] checkpatch: Add a --strict check for utf-8 in commit logs Joe Perches
@ 2011-10-10 22:42       ` Andrew Morton
  0 siblings, 0 replies; 5+ messages in thread
From: Andrew Morton @ 2011-10-10 22:42 UTC (permalink / raw)
  To: Joe Perches; +Cc: LKML, Andy Whitcroft

On Mon, 10 Oct 2011 15:32:30 -0700
Joe Perches <joe@perches.com> wrote:

> Some find using utf-8 in commit logs inappropriate.
> 
> Some patch commit logs contain unintended utf-8 characters
> when doing things like copy/pasting compilation output.
> 
> Look for the start of any commit log by skipping initial
> lines that look like email headers and "From: " lines.
> 
> Stop looking for utf-8 at the first signature line.
> 
> Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Joe Perches <joe@perches.com>
> 
> ---
> 
> I don't feel strongly that this patch should be applied,
> that's why it's a --strict check and not on by default,
> but Andrew Morton seems to want something like it...

Mainly because of the non-ascii single-quote chars which gcc emits in
its warning/error messages.  I use LANG=C to stop gcc from doing that,
and also have a proglet to undo this nonsense when I'm merging patches
(below) (I totally forget how it works).  But I see such things turning
up in the tree via other merge paths.



#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <stdlib.h>

static void dump(int *buf)
{
	if (buf[0] == 0xE2 && buf[1] == 0x80 && buf[2] == 0x98) {
		putchar('`');
		buf[0] = 0;
		buf[1] = 0;
		buf[2] = 0;
	} else if (buf[0] == 0xE2 && buf[1] == 0x80 && buf[2] == 0x99) {
		putchar('\'');
		buf[0] = 0;
		buf[1] = 0;
		buf[2] = 0;
	} else if (buf[0] == 0xa1) {
		putchar('`');
		goto move;
	} else if (buf[0] == 0xa2) {
		putchar('\'');
		goto move;
	} else {
		if (buf[0])
			putchar(buf[0]);
move:
		buf[0] = buf[1]; 
		buf[1] = buf[2];
		buf[2] = 0;
	}
}

static void doit(FILE *f)
{
	int buf[3] = {};
	int c;

	while ((c = fgetc(f)) != EOF) {
		dump(buf);
		buf[2] = c;
	}
	dump(buf);
	dump(buf);
	dump(buf);
}

int main(int argc, char *argv[])
{
	if (argc == 1) {
		doit(stdin);
	} else {
		int i;

		for (i = 1; i < argc; i++) {
			FILE *f = fopen(argv[i], "r");

			if (f == NULL) {
				fprintf(stderr, "%s: cannot open `%s': %s\n",
					argv[0], argv[1], strerror(errno));
				exit(1);
			}
			doit(f);
			fclose(f);
		}
	}
	exit(0);
}			


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2011-10-10 22:42 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-10  5:56 [PATCH] checkpatch: warn on found Change-Id lines Olof Johansson
2011-10-10 17:09 ` Joe Perches
2011-10-10 22:10   ` Andrew Morton
2011-10-10 22:32     ` [PATCH] checkpatch: Add a --strict check for utf-8 in commit logs Joe Perches
2011-10-10 22:42       ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).