linux-kernel.vger.kernel.org archive mirror
* [PATCH v3] checkpatch: add new warnings to author signoff checks.
@ 2020-10-05 19:24 Dwaipayan Ray
  2020-10-05 19:37 ` Joe Perches
  0 siblings, 1 reply; 8+ messages in thread
From: Dwaipayan Ray @ 2020-10-05 19:24 UTC (permalink / raw)
  To: joe; +Cc: linux-kernel-mentees, dwaipayanray1, lukas.bulwahn, linux-kernel

The author signed-off-by checks are currently very vague.
Cases like same name or same address are not handled separately.

For example, running checkpatch on commit be6577af0cef
("parisc: Add atomic64_set_release() define to avoid CPU soft lockups"),
gives:

WARNING: Missing Signed-off-by: line by nominal patch author
'John David Anglin <dave.anglin@bell.net>'

The signoff line was:
"Signed-off-by: Dave Anglin <dave.anglin@bell.net>"

Clearly the author has signed off but with a slightly different version
of his name. A more appropriate warning would have been to point out
the name mismatch instead.

Previously, $authorsignoff took only the values 0 or 1, indicating
whether a proper sign-off by the author is present.
Extend the checks to handle three new cases.

$authorsignoff values now denote the following:

0: Missing sign off by patch author.

1: Sign off present and identical.

2: Addresses match, but names are different.
   "James Watson <james@gmail.com>", "James <james@gmail.com>"

3: Names match, but addresses are different.
   "James Watson <james@watson.com>", "James Watson <james@gmail.com>"

4: Names match, addresses excluding subaddress details (RFC 5233) match.
   "James Watson <james@gmail.com>", "James Watson <james+a@gmail.com>"

For case 4, a --strict check message is generated, and for the
other cases 0, 2 and 3, warnings are generated.

Link: https://lore.kernel.org/lkml/7958ded756c895ca614ba900aae7b830a992475e.camel@perches.com/
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
---
 scripts/checkpatch.pl | 54 +++++++++++++++++++++++++++++++++++++++----
 1 file changed, 50 insertions(+), 4 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 31624bbb342e..e81f0bebbeb9 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2347,6 +2347,7 @@ sub process {
 	my $signoff = 0;
 	my $author = '';
 	my $authorsignoff = 0;
+	my $author_sob = '';
 	my $is_patch = 0;
 	my $is_binding_patch = -1;
 	my $in_header_lines = $file ? 0 : 1;
@@ -2674,9 +2675,34 @@ sub process {
 		if ($line =~ /^\s*signed-off-by:\s*(.*)/i) {
 			$signoff++;
 			$in_commit_log = 0;
-			if ($author ne '') {
+			if ($author ne ''  && $authorsignoff != 1) {
 				if (same_email_addresses($1, $author)) {
 					$authorsignoff = 1;
+				} else {
+					my $ctx = $1;
+					my ($email_name, $email_comment, $email_address, $comment1) = parse_email($ctx);
+					my ($author_name, $author_comment, $author_address, $comment2) = parse_email($author);
+
+					if ($email_address eq $author_address) {
+						$author_sob = $ctx;
+						$authorsignoff = 2;
+					} elsif ($email_name eq $author_name) {
+						$author_sob = $ctx;
+						$authorsignoff = 3;
+
+						my $address1 = $email_address;
+						my $address2 = $author_address;
+
+						if ($address1 =~ /(\S+)\+\S+(\@.*)/) {
+							$address1 = "$1$2";
+						}
+						if ($address2 =~ /(\S+)\+\S+(\@.*)/) {
+							$address2 = "$1$2";
+						}
+						if ($address1 eq $address2) {
+							$authorsignoff = 4;
+						}
+					}
 				}
 			}
 		}
@@ -6891,9 +6917,29 @@ sub process {
 		if ($signoff == 0) {
 			ERROR("MISSING_SIGN_OFF",
 			      "Missing Signed-off-by: line(s)\n");
-		} elsif (!$authorsignoff) {
-			WARN("NO_AUTHOR_SIGN_OFF",
-			     "Missing Signed-off-by: line by nominal patch author '$author'\n");
+		} elsif ($authorsignoff != 1) {
+			# authorsignoff values:
+			# 0 -> missing sign off
+			# 1 -> sign off identical
+			# 2 -> addresses match, names different
+			# 3 -> names match, addresses different
+			# 4 -> names match, addresses excluding subaddress details (refer RFC 5233) match
+
+			my $sob_msg = "'From: $author' != 'Signed-off-by: $author_sob'";
+
+			if ($authorsignoff == 0) {
+				WARN("NO_AUTHOR_SIGN_OFF",
+				     "Missing Signed-off-by: line by nominal patch author '$author'\n");
+			} elsif ($authorsignoff == 2) {
+				WARN("NO_AUTHOR_SIGN_OFF",
+				     "From:/Signed-off-by: email name mismatch: $sob_msg\n");
+			} elsif ($authorsignoff == 3) {
+				WARN("NO_AUTHOR_SIGN_OFF",
+				     "From:/Signed-off-by: email address mismatch: $sob_msg\n");
+			} elsif ($authorsignoff == 4) {
+				CHK("NO_AUTHOR_SIGN_OFF",
+				    "From:/Signed-off-by: email extension mismatch: $sob_msg\n");
+			}
 		}
 	}
 
-- 
2.27.0
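
The five $authorsignoff values described in the commit message can be sketched as a small classifier. This is only an illustration in Python, not the checkpatch code: `classify_signoff` and `strip_subaddress` are hypothetical names, `email.utils.parseaddr` stands in for checkpatch's parse_email(), and a single From:/Signed-off-by: pair is compared rather than every sign-off line in a patch.

```python
import re
from email.utils import parseaddr


def strip_subaddress(address: str) -> str:
    """Drop a '+detail' part, e.g. james+a@gmail.com -> james@gmail.com."""
    # Same pattern as the patch: /(\S+)\+\S+(\@.*)/ replaced by "$1$2".
    match = re.search(r"(\S+)\+\S+(@.*)", address)
    return match.group(1) + match.group(2) if match else address


def classify_signoff(author: str, signoff: str) -> int:
    """Return the status code (0-4) for one author/sign-off pair."""
    author_name, author_addr = parseaddr(author)
    sob_name, sob_addr = parseaddr(signoff)

    if author_name == sob_name and author_addr == sob_addr:
        return 1  # sign-off present and identical
    if author_addr == sob_addr:
        return 2  # addresses match, names differ
    if author_name == sob_name:
        if strip_subaddress(author_addr) == strip_subaddress(sob_addr):
            return 4  # names match, addresses differ only in subaddress
        return 3  # names match, addresses differ
    return 0  # neither name nor address matches


# The example from the commit message is classified as a name mismatch:
status = classify_signoff("John David Anglin <dave.anglin@bell.net>",
                          "Dave Anglin <dave.anglin@bell.net>")
print(status)  # -> 2
```

As in the patch, the address comparison is tried before the name comparison, so a pair that shares an address is reported as a name mismatch even when the names are completely unrelated.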



* Re: [PATCH v3] checkpatch: add new warnings to author signoff checks.
  2020-10-05 19:24 [PATCH v3] checkpatch: add new warnings to author signoff checks Dwaipayan Ray
@ 2020-10-05 19:37 ` Joe Perches
  2020-10-05 20:07   ` Dwaipayan Ray
  0 siblings, 1 reply; 8+ messages in thread
From: Joe Perches @ 2020-10-05 19:37 UTC (permalink / raw)
  To: Dwaipayan Ray; +Cc: linux-kernel-mentees, lukas.bulwahn, linux-kernel

On Tue, 2020-10-06 at 00:54 +0530, Dwaipayan Ray wrote:
> The author signed-off-by checks are currently very vague.
> Cases like same name or same address are not handled separately.

When you run tests for this, how many mismatches are
caused by name formatting changes like:

From: "Developer, J. Random" <jrd@bigcorp.com>
...
Signed-off-by: "J. Random Developer" <jrd@bigcorp.com>?

Should these differences generate a warning?



* Re: [PATCH v3] checkpatch: add new warnings to author signoff checks.
  2020-10-05 19:37 ` Joe Perches
@ 2020-10-05 20:07   ` Dwaipayan Ray
  2020-10-05 21:09     ` Joe Perches
  0 siblings, 1 reply; 8+ messages in thread
From: Dwaipayan Ray @ 2020-10-05 20:07 UTC (permalink / raw)
  To: Joe Perches; +Cc: linux-kernel-mentees, Lukas Bulwahn, linux-kernel

On Tue, Oct 6, 2020 at 1:07 AM Joe Perches <joe@perches.com> wrote:
>
> On Tue, 2020-10-06 at 00:54 +0530, Dwaipayan Ray wrote:
> > The author signed-off-by checks are currently very vague.
> > Cases like same name or same address are not handled separately.
>
> When you run tests for this, how many mismatches are
> caused by name formatting changes like:
>
> From: "Developer, J. Random" <jrd@bigcorp.com>
> ...
> Signed-off-by: "J. Random Developer" <jrd@bigcorp.com>?
>
> Should these differences generate a warning?
>

Hi,
I ran my tests on non-merge commits between v5.7 and v5.8.

There were a total of 250 NO_AUTHOR_SIGN_OFF warnings.

203 of these were email address mismatches.
32 of these were name mismatches.

So for the name mismatches, the typical cases are like:

'From: tannerlove <tannerlove@google.com>' != 'Signed-off-by: Tanner Love <tannerlove@google.com>'
'From: "朱灿灿" <zhucancan@vivo.com>' != 'Signed-off-by: zhucancan <zhucancan@vivo.com>'
'From: Yuval Basson <ybason@marvell.com>' != 'Signed-off-by: Yuval Bason <ybason@marvell.com>'
'From: allen <allen.chen@ite.com.tw>' != 'Signed-off-by: Allen Chen <allen.chen@ite.com.tw>'

I didn't find the exact formatting change you mentioned in my commit range.
But I did find something like:

'From: "Paul A. Clarke" <pc@us.ibm.com>' != 'Signed-off-by: Paul Clarke <pc@us.ibm.com>'

So some have parts of their names removed, some are written in a
different script or language, and others have different spellings or
initials. There is a wide variety of things happening here.

I think considering these, it should be warned about, and let people know
that there might be something wrong going on.

What do you think?

Thanks,
Dwaipayan.
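
For reference, the shares implied by the counts above can be worked out as follows. This is plain arithmetic on the three numbers reported in this message; the "other" bucket is simply the remainder, which the message does not break down further.

```python
# Counts reported for non-merge commits between v5.7 and v5.8.
total = 250
tally = {
    "address mismatch": 203,
    "name mismatch": 32,
}
# Remainder not itemised in the message (an inferred bucket).
tally["other"] = total - sum(tally.values())

for label, count in tally.items():
    print(f"{label}: {count} ({count / total:.1%})")
```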


* Re: [PATCH v3] checkpatch: add new warnings to author signoff checks.
  2020-10-05 20:07   ` Dwaipayan Ray
@ 2020-10-05 21:09     ` Joe Perches
  2020-10-06  4:23       ` Dwaipayan Ray
  0 siblings, 1 reply; 8+ messages in thread
From: Joe Perches @ 2020-10-05 21:09 UTC (permalink / raw)
  To: Dwaipayan Ray; +Cc: linux-kernel-mentees, Lukas Bulwahn, linux-kernel

On Tue, 2020-10-06 at 01:37 +0530, Dwaipayan Ray wrote:
> I think considering these, it should be warned about, and let people know
> that there might be something wrong going on.
> 
> What do you think?

Except for comments and quotes like:

	From: J. Random Developer (BigCorp) <jrd@bigcorp.com>
	Signed-off-by: "J. Random Developer" <jrd@bigcorp.com>

I think any time there's a mismatch, there
should be a warning emitted.

That includes any subaddress detail difference.
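
One way to read the exception above is that names are compared only after quotes and parenthesised comments are dropped. A minimal sketch of that idea in Python follows; `normalize_name` and `names_equivalent` are hypothetical helpers for illustration and are not how checkpatch's parse_email() is implemented.

```python
import re


def normalize_name(name: str) -> str:
    """Strip parenthesised comments and double quotes, then tidy spaces."""
    name = re.sub(r"\([^)]*\)", "", name)  # drop "(BigCorp)"-style comments
    name = name.replace('"', "")           # drop surrounding double quotes
    return " ".join(name.split())          # collapse leftover whitespace


def names_equivalent(first: str, second: str) -> bool:
    """True when two display names differ only by comments or quotes."""
    return normalize_name(first) == normalize_name(second)


# Differs only by a comment and quotes, so no warning would be expected:
print(names_equivalent('J. Random Developer (BigCorp)',
                       '"J. Random Developer"'))
# A dropped middle initial is still a real mismatch:
print(names_equivalent('"Paul A. Clarke"', 'Paul Clarke'))
```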




* Re: [PATCH v3] checkpatch: add new warnings to author signoff checks.
  2020-10-05 21:09     ` Joe Perches
@ 2020-10-06  4:23       ` Dwaipayan Ray
  2020-10-06  4:38         ` Lukas Bulwahn
  0 siblings, 1 reply; 8+ messages in thread
From: Dwaipayan Ray @ 2020-10-06  4:23 UTC (permalink / raw)
  To: Joe Perches; +Cc: linux-kernel-mentees, Lukas Bulwahn, linux-kernel

On Tue, Oct 6, 2020 at 2:39 AM Joe Perches <joe@perches.com> wrote:
> Except for comments and quotes like:
>
>         From: J. Random Developer (BigCorp) <jrd@bigcorp.com>
>         Signed-off-by: "J. Random Developer" <jrd@bigcorp.com>
>
> I think any time there's a mismatch, there
> should be a warning emitted.
>
> That includes any subaddress detail difference.
>
>
Hi,
Yeah these cases are being handled.

Comments and quotes don't generate any warning message but
all the other mismatches do.

Only the subaddress check generates a --strict check message;
all the others are WARN messages. This follows from our discussion at
https://lore.kernel.org/linux-kernel-mentees/7b52e085f0b69ad1742966f8eacd02deb9299b96.camel@perches.com/

So does it need to be changed to a WARN or is it fine like that?

Thanks,
Dwaipayan.


* Re: [PATCH v3] checkpatch: add new warnings to author signoff checks.
  2020-10-06  4:23       ` Dwaipayan Ray
@ 2020-10-06  4:38         ` Lukas Bulwahn
  2020-10-06 13:15           ` Dwaipayan Ray
  0 siblings, 1 reply; 8+ messages in thread
From: Lukas Bulwahn @ 2020-10-06  4:38 UTC (permalink / raw)
  To: Dwaipayan Ray
  Cc: Joe Perches, linux-kernel-mentees, Lukas Bulwahn, linux-kernel

On Tue, 6 Oct 2020, Dwaipayan Ray wrote:

> On Tue, Oct 6, 2020 at 2:39 AM Joe Perches <joe@perches.com> wrote:
> > Except for comments and quotes like:
> >
> >         From: J. Random Developer (BigCorp) <jrd@bigcorp.com>
> >         Signed-off-by: "J. Random Developer" <jrd@bigcorp.com>
> >
> > I think any time there's a mismatch, there
> > should be a warning emitted.
> >
> > That includes any subaddress detail difference.
> >
> >
> Hi,
> Yeah these cases are being handled.
> 
> Comments and quotes don't generate any warning message but
> all the other mismatches do.
> 
> Only the subaddress check generates a --strict check message;
> all the others are WARN messages. This follows from our discussion at
> https://lore.kernel.org/linux-kernel-mentees/7b52e085f0b69ad1742966f8eacd02deb9299b96.camel@perches.com/
> 
> So does it need to be changed to a WARN or is it fine like that?
>

I will repeat what I suggested before:

I think a complete mismatch, where neither the name nor the email
address matches the author, deserves to be reported as an ERROR.

Dwaipayan, if Joe does not disagree, could you change that in your PATCH v4?

Lukas


* Re: [PATCH v3] checkpatch: add new warnings to author signoff checks.
  2020-10-06  4:38         ` Lukas Bulwahn
@ 2020-10-06 13:15           ` Dwaipayan Ray
  2020-10-06 18:28             ` Joe Perches
  0 siblings, 1 reply; 8+ messages in thread
From: Dwaipayan Ray @ 2020-10-06 13:15 UTC (permalink / raw)
  To: Lukas Bulwahn; +Cc: Joe Perches, linux-kernel-mentees, linux-kernel

> > > Except for comments and quotes like:
> > >
> > >         From: J. Random Developer (BigCorp) <jrd@bigcorp.com>
> > >         Signed-off-by: "J. Random Developer" <jrd@bigcorp.com>
> > >
> > > I think any time there's a mismatch, there
> > > should be a warning emitted.
> > >
> > > That includes any subaddress detail difference.
> > >
> > >
> > Hi,
> > Yeah these cases are being handled.
> >
> > Comments and quotes don't generate any warning message but
> > all the other mismatches do.
> >
> > Only the subaddress check generates a --strict check message;
> > all the others are WARN messages. This follows from our discussion at
> > https://lore.kernel.org/linux-kernel-mentees/7b52e085f0b69ad1742966f8eacd02deb9299b96.camel@perches.com/
> >
> > So does it need to be changed to a WARN or is it fine like that?
> >
>
> I will repeat what I suggested before:
>
> I think a complete mismatch, where neither the name nor the email
> address matches the author, deserves to be reported as an ERROR.
>
> Dwaipayan, if Joe does not disagree, could you change that in your PATCH v4?
>

Yes, sure. I will do that once Joe confirms.

To summarize it, two changes that could be made are
the CHK for subaddress extension could be converted to
a WARN, and the WARN in case of a missing author signoff
could be converted to an ERROR.

Thanks,
Dwaipayan.


* Re: [PATCH v3] checkpatch: add new warnings to author signoff checks.
  2020-10-06 13:15           ` Dwaipayan Ray
@ 2020-10-06 18:28             ` Joe Perches
  0 siblings, 0 replies; 8+ messages in thread
From: Joe Perches @ 2020-10-06 18:28 UTC (permalink / raw)
  To: Dwaipayan Ray, Lukas Bulwahn; +Cc: linux-kernel-mentees, linux-kernel

On Tue, 2020-10-06 at 18:45 +0530, Dwaipayan Ray wrote:
> To summarize it, two changes that could be made are
> the CHK for subaddress extension could be converted to
> a WARN, and the WARN in case of a missing author signoff
> could be converted to an ERROR.

Sure, why not...
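
The severity change agreed here can be summarised as a lookup from status code to message type. The sketch below is illustrative Python: the v3 table follows the posted patch, while the v4 table only reflects the summary in this thread and is an assumption about what a follow-up patch might do. `V3_SEVERITY`, `V4_PROPOSED` and `severity_for` are hypothetical names.

```python
from typing import Dict, Optional

# Status codes: 0 missing, 1 identical, 2 name mismatch,
# 3 address mismatch, 4 subaddress (RFC 5233) mismatch.
V3_SEVERITY: Dict[int, str] = {0: "WARN", 2: "WARN", 3: "WARN", 4: "CHK"}

# Proposed in the thread: missing sign-off becomes an ERROR and the
# subaddress case is promoted from CHK to WARN (assumed, not from a patch).
V4_PROPOSED: Dict[int, str] = {**V3_SEVERITY, 0: "ERROR", 4: "WARN"}


def severity_for(status: int, table: Dict[int, str]) -> Optional[str]:
    """Return the message type for a status, or None when nothing is reported."""
    return table.get(status)  # status 1 (identical) has no entry


for status in range(5):
    print(status, severity_for(status, V3_SEVERITY),
          severity_for(status, V4_PROPOSED))
```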



