linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] checkpatch: improve handling of email comments
@ 2020-10-30  9:07 Dwaipayan Ray
  2020-10-30 11:54 ` Joe Perches
  0 siblings, 1 reply; 8+ messages in thread
From: Dwaipayan Ray @ 2020-10-30  9:07 UTC (permalink / raw)
  To: joe
  Cc: linux-kernel-mentees, dwaipayanray1, linux-kernel, lukas.bulwahn,
	yashsri421

checkpatch has limited support for parsing email comments. It only
support single name comments or single after address comments.
Whereas, RFC 5322 specifies that comments can be inserted in
between any tokens of the email fields.

Improve comment parsing mechanism in checkpatch.

What is handled now:

- Multiple name/address comments
- Comments anywhere in between name/address
- Nested comments like (John (Doe))


A brief analysis of checkpatch output on v5.0..v5.7 showed that
after these modifications, the number of BAD_SIGN_OFF warnings
came down from 2944 to 1424, and FROM_SIGN_OFF_MISMATCH came
down from 2366 to 2330.

So, a total of 1556 false positives were resolved in total.

Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
---
 scripts/checkpatch.pl | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index fab38b493cef..ae8436385fc1 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1183,14 +1183,20 @@ sub parse_email {
 		}
 	}
 
-	$comment = trim($comment);
+	# Comments in between name like John(A nice chap) Doe
+	while ($name =~ s/\s*($balanced_parens)\s*/ /) {
+		$name_comment .= trim($1);
+	}
 	$name = trim($name);
 	$name =~ s/^\"|\"$//g;
-	if ($name =~ s/(\s*\([^\)]+\))\s*//) {
-		$name_comment = trim($1);
+
+	# Comments in between address like <john(his account)@doe.com>
+	while ($address =~ s/\s*($balanced_parens)\s*//) {
+		$comment .= trim($1);
 	}
 	$address = trim($address);
 	$address =~ s/^\<|\>$//g;
+	$comment = trim($comment);
 
 	if ($name =~ /[^\w \-]/i) { ##has "must quote" chars
 		$name =~ s/(?<!\\)"/\\"/g; ##escape quotes
@@ -1205,8 +1211,6 @@ sub format_email {
 
 	my $formatted_email;
 
-	$name_comment = trim($name_comment);
-	$comment = trim($comment);
 	$name = trim($name);
 	$name =~ s/^\"|\"$//g;
 	$address = trim($address);
@@ -1216,6 +1220,11 @@ sub format_email {
 		$name = "\"$name\"";
 	}
 
+	$name_comment = trim($name_comment);
+	$name_comment =~ s/(.+)/ $1/;
+	$comment = trim($comment);
+	$comment =~ s/(.+)/ $1/;
+
 	if ("$name" eq "") {
 		$formatted_email = "$address";
 	} else {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] checkpatch: improve handling of email comments
  2020-10-30  9:07 [PATCH] checkpatch: improve handling of email comments Dwaipayan Ray
@ 2020-10-30 11:54 ` Joe Perches
  2020-10-30 11:58   ` Lukas Bulwahn
  0 siblings, 1 reply; 8+ messages in thread
From: Joe Perches @ 2020-10-30 11:54 UTC (permalink / raw)
  To: Dwaipayan Ray
  Cc: linux-kernel-mentees, linux-kernel, lukas.bulwahn, yashsri421

On Fri, 2020-10-30 at 14:37 +0530, Dwaipayan Ray wrote:
> checkpatch has limited support for parsing email comments. It only
> support single name comments or single after address comments.
> Whereas, RFC 5322 specifies that comments can be inserted in
> between any tokens of the email fields.
> 
> Improve comment parsing mechanism in checkpatch.
> 
> What is handled now:
> 
> - Multiple name/address comments
> - Comments anywhere in between name/address
> - Nested comments like (John (Doe))
>
> A brief analysis of checkpatch output on v5.0..v5.7 showed that
> after these modifications, the number of BAD_SIGN_OFF warnings
> came down from 2944 to 1424, and FROM_SIGN_OFF_MISMATCH came
> down from 2366 to 2330.
> 
> So, a total of 1556 false positives were resolved in total.

A mere reduction in messages emitted isn't necessarily good.

Please send me privately a complete list of these nominally
false positive messages that are no longer emitted.

I believe one of the relatively common incorrect messages is
for the cc: <stable@vger.kernel.org> where a version number is
continued on the same line after a #.

CC: stable@vger.kernel.org # for versions x.y.z and above




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] checkpatch: improve handling of email comments
  2020-10-30 11:54 ` Joe Perches
@ 2020-10-30 11:58   ` Lukas Bulwahn
  2020-10-30 12:21     ` Joe Perches
  2020-10-31  4:14     ` Joe Perches
  0 siblings, 2 replies; 8+ messages in thread
From: Lukas Bulwahn @ 2020-10-30 11:58 UTC (permalink / raw)
  To: Joe Perches
  Cc: Dwaipayan Ray, linux-kernel-mentees, linux-kernel, lukas.bulwahn,
	yashsri421



On Fri, 30 Oct 2020, Joe Perches wrote:

> On Fri, 2020-10-30 at 14:37 +0530, Dwaipayan Ray wrote:
> > checkpatch has limited support for parsing email comments. It only
> > support single name comments or single after address comments.
> > Whereas, RFC 5322 specifies that comments can be inserted in
> > between any tokens of the email fields.
> > 
> > Improve comment parsing mechanism in checkpatch.
> > 
> > What is handled now:
> > 
> > - Multiple name/address comments
> > - Comments anywhere in between name/address
> > - Nested comments like (John (Doe))
> >
> > A brief analysis of checkpatch output on v5.0..v5.7 showed that
> > after these modifications, the number of BAD_SIGN_OFF warnings
> > came down from 2944 to 1424, and FROM_SIGN_OFF_MISMATCH came
> > down from 2366 to 2330.
> > 
> > So, a total of 1556 false positives were resolved in total.
> 
> A mere reduction in messages emitted isn't necessarily good.
>

Agree. That is why I also went through the list of those warnings.

I could not spot any obvious true positive among the reduced ones.
 
> Please send me privately a complete list of these nominally
> false positive messages that are no longer emitted.
> 
> I believe one of the relatively common incorrect messages is
> for the cc: <stable@vger.kernel.org> where a version number is
> continued on the same line after a #.
> 
> CC: stable@vger.kernel.org # for versions x.y.z and above
>

That was one, another common pattern was just quotes put inconsistently at 
different places.

Lukas

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] checkpatch: improve handling of email comments
  2020-10-30 11:58   ` Lukas Bulwahn
@ 2020-10-30 12:21     ` Joe Perches
  2020-10-31  4:14     ` Joe Perches
  1 sibling, 0 replies; 8+ messages in thread
From: Joe Perches @ 2020-10-30 12:21 UTC (permalink / raw)
  To: Lukas Bulwahn
  Cc: Dwaipayan Ray, linux-kernel-mentees, linux-kernel, yashsri421

On Fri, 2020-10-30 at 12:58 +0100, Lukas Bulwahn wrote:
> On Fri, 30 Oct 2020, Joe Perches wrote:
> > On Fri, 2020-10-30 at 14:37 +0530, Dwaipayan Ray wrote:
> > > checkpatch has limited support for parsing email comments. It only
> > > support single name comments or single after address comments.
> > > Whereas, RFC 5322 specifies that comments can be inserted in
> > > between any tokens of the email fields.
> > > 
> > > Improve comment parsing mechanism in checkpatch.
> > > 
> > > What is handled now:
> > > 
> > > - Multiple name/address comments
> > > - Comments anywhere in between name/address
> > > - Nested comments like (John (Doe))
> > > 
> > > A brief analysis of checkpatch output on v5.0..v5.7 showed that
> > > after these modifications, the number of BAD_SIGN_OFF warnings
> > > came down from 2944 to 1424, and FROM_SIGN_OFF_MISMATCH came
> > > down from 2366 to 2330.
> > > 
> > > So, a total of 1556 false positives were resolved in total.
> > 
> > A mere reduction in messages emitted isn't necessarily good.
> > 
> 
> Agree. That is why I also went through the list of those warnings.

So sending me a copy of that list shouldn't be a burden.
 
> > Please send me privately a complete list of these nominally
> > false positive messages that are no longer emitted.
> > 
> > I believe one of the relatively common incorrect messages is
> > for the cc: <stable@vger.kernel.org> where a version number is
> > continued on the same line after a #.
> > 
> > CC: stable@vger.kernel.org # for versions x.y.z and above
> > 
> 
> That was one, another common pattern was just quotes put inconsistently at 
> different places.

Which to me is more an indication that a message
_should_ be emitted as many email clients do not like
to copy/paste incorrectly formatted email addresses
(ie: Missing necessary quotes when the name contains
characters like .) and that's a common way to cc a
reply to a possible commit message of an email.

Perhaps as well the .mailmap mechanism may not cope
with these differently formatted email addresses.



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] checkpatch: improve handling of email comments
  2020-10-30 11:58   ` Lukas Bulwahn
  2020-10-30 12:21     ` Joe Perches
@ 2020-10-31  4:14     ` Joe Perches
  2020-10-31  6:11       ` Dwaipayan Ray
  1 sibling, 1 reply; 8+ messages in thread
From: Joe Perches @ 2020-10-31  4:14 UTC (permalink / raw)
  To: Lukas Bulwahn
  Cc: Dwaipayan Ray, linux-kernel-mentees, linux-kernel, yashsri421

On Fri, 2020-10-30 at 12:58 +0100, Lukas Bulwahn wrote:
> 
> On Fri, 30 Oct 2020, Joe Perches wrote:
> 
> > On Fri, 2020-10-30 at 14:37 +0530, Dwaipayan Ray wrote:
> > > checkpatch has limited support for parsing email comments. It only
> > > support single name comments or single after address comments.
> > > Whereas, RFC 5322 specifies that comments can be inserted in
> > > between any tokens of the email fields.
> > > 
> > > Improve comment parsing mechanism in checkpatch.
> > > 
> > > What is handled now:
> > > 
> > > - Multiple name/address comments
> > > - Comments anywhere in between name/address
> > > - Nested comments like (John (Doe))
> > > 
> > > A brief analysis of checkpatch output on v5.0..v5.7 showed that
> > > after these modifications, the number of BAD_SIGN_OFF warnings
> > > came down from 2944 to 1424, and FROM_SIGN_OFF_MISMATCH came
> > > down from 2366 to 2330.
> > > 
> > > So, a total of 1556 false positives were resolved in total.
> > 
> > A mere reduction in messages emitted isn't necessarily good.
> > 
> 
> Agree. That is why I also went through the list of those warnings.
> 
> I could not spot any obvious true positive among the reduced ones.
>  
> 
> > Please send me privately a complete list of these nominally
> > false positive messages that are no longer emitted.
> > 
> > I believe one of the relatively common incorrect messages is
> > for the cc: <stable@vger.kernel.org> where a version number is
> > continued on the same line after a #.
> > 
> > CC: stable@vger.kernel.org # for versions x.y.z and above
> > 
> 
> That was one,

It's not just one, it's ~90% of the list that Dwaipayan sent me.

$ wc -l mismatches
831 mismatches

$ grep -v -i stable mismatches | wc -l
98

> another common pattern was just quotes put inconsistently at 
> different places.

Yes, there are some defects there.
But there are also now false negatives.

For instance, this is not appropriate to ignore:

WARNING:BAD_SIGN_OFF: email address 'jacek.anaszewski@gmail.com, linux-leds@vger.kernel.org, linux-kernel@vger.kernel.org, dmurphy@ti.com' might be better as 'jacek.anaszewski@gmail.com,linux-leds@vger.kernel.org, linux-kernel@vger.kernel.org, dmurphy@ti.com'

From the file that Dwaipayan sent me, all the rest not including the
stable variants, which IMO should be handled separately, are below.

Of these 98 in total, 60+% are unicode which IMO should always be quoted
and most are doubled with BAD_SIGN_OFF doubling FROM_SIGN_OFF_MISMATCH
(and I don't quite understand why it's "From:/" then "Signed-off-by:"

$ grep -v -i stable dwai | sort | uniq -c | sort -rn
     31 WARNING:BAD_SIGN_OFF: email address '周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>' might be better as '"周琰杰"(Zhou Yanjie) <zhouyanjie@wanyeetech.com>'
     30 WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email name mismatch: 'From: "周琰杰"(Zhou Yanjie) <zhouyanjie@wanyeetech.com>' != 'Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>'
     
These 29 in total would be better stripping any bits in parentheses from
the name portion only when _not_ inside quotes.

     20 WARNING:BAD_SIGN_OFF: email address 'Thomas Hellström (VMware) <thomas_os@shipmail.org>' might be better as '"Thomas Hellström"(VMware) <thomas_os@shipmail.org>'
      5 WARNING:BAD_SIGN_OFF: email address 'H. Peter Anvin (Intel) <hpa@zytor.com>' might be better as '"H. Peter Anvin"(Intel) <hpa@zytor.com>'
      

      1 WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email name mismatch: 'From: "Thomas Hellström"(VMware) <thomas_os@shipmail.org>' != 'Signed-off-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>'
      
      1 WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email name mismatch: 'From: "Srivatsa S. Bhat"(VMware) <srivatsa@csail.mit.edu>' != 'Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>'
      1 WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email name mismatch: 'From: JanNieuwenhuizen(janneke) <janneke@gnu.org>' != 'Signed-off-by: Jan Nieuwenhuizen <janneke@gnu.org>'  
      1 WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email name mismatch: 'From: "Frédéric Pierret"(fepitre) <frederic.pierret@qubes-os.org>' != 'Signed-off-by: Frédéric Pierret (fepitre) <frederic.pierret@qubes-os.org>'

So these 8 others are ones where quotes are either oddly placed
or perhaps should always exist and the comment in parentheses
is suggested poorly.  7 of these should be fixed and one should
still be reported.

      1 WARNING:BAD_SIGN_OFF: email address '"Thomas Hellström (VMware)" <thomas_os@shipmail.org>' might be better as '"Thomas Hellström"(VMware) <thomas_os@shipmail.org>'
      1 WARNING:BAD_SIGN_OFF: email address 'Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>' might be better as '"Srivatsa S. Bhat"(VMware) <srivatsa@csail.mit.edu>'
      1 WARNING:BAD_SIGN_OFF: email address '"Rantala, Tommi T. (Nokia - FI/Espoo)" <tommi.t.rantala@nokia.com>' might be better as '"Rantala, Tommi T."(Nokia - FI/Espoo) <tommi.t.rantala@nokia.com>'
      1 WARNING:BAD_SIGN_OFF: email address '"Kai Mäkisara (Kolumbus)" <kai.makisara@kolumbus.fi>' might be better as '"Kai Mäkisara"(Kolumbus) <kai.makisara@kolumbus.fi>'
      1 WARNING:BAD_SIGN_OFF: email address 'jacek.anaszewski@gmail.com, linux-leds@vger.kernel.org, linux-kernel@vger.kernel.org, dmurphy@ti.com' might be better as 'jacek.anaszewski@gmail.com,linux-leds@vger.kernel.org, linux-kernel@vger.kernel.org, dmurphy@ti.com'
      1 WARNING:BAD_SIGN_OFF: email address 'Frédéric Pierret (fepitre) <frederic.pierret@qubes-os.org>' might be better as '"Frédéric Pierret"(fepitre) <frederic.pierret@qubes-os.org>'
      1 WARNING:BAD_SIGN_OFF: email address 'David.Laight@aculab.com (big endian system concerns)' might be better as 'David.Laight@aculab.com(big endian system concerns)'
      1 WARNING:BAD_SIGN_OFF: email address 'apenwarr@foxnet.net (Avery Pennarun)' might be better as 'apenwarr@foxnet.net(Avery Pennarun)'




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] checkpatch: improve handling of email comments
  2020-10-31  4:14     ` Joe Perches
@ 2020-10-31  6:11       ` Dwaipayan Ray
  2020-10-31 11:14         ` Joe Perches
  0 siblings, 1 reply; 8+ messages in thread
From: Dwaipayan Ray @ 2020-10-31  6:11 UTC (permalink / raw)
  To: Joe Perches
  Cc: Lukas Bulwahn, linux-kernel-mentees, linux-kernel, Aditya Srivastava

On Sat, Oct 31, 2020 at 9:44 AM Joe Perches <joe@perches.com> wrote:
>
> On Fri, 2020-10-30 at 12:58 +0100, Lukas Bulwahn wrote:
> >
> > On Fri, 30 Oct 2020, Joe Perches wrote:
> >
> > > On Fri, 2020-10-30 at 14:37 +0530, Dwaipayan Ray wrote:
> > > > checkpatch has limited support for parsing email comments. It only
> > > > support single name comments or single after address comments.
> > > > Whereas, RFC 5322 specifies that comments can be inserted in
> > > > between any tokens of the email fields.
> > > >
> > > > Improve comment parsing mechanism in checkpatch.
> > > >
> > > > What is handled now:
> > > >
> > > > - Multiple name/address comments
> > > > - Comments anywhere in between name/address
> > > > - Nested comments like (John (Doe))
> > > >
> > > > A brief analysis of checkpatch output on v5.0..v5.7 showed that
> > > > after these modifications, the number of BAD_SIGN_OFF warnings
> > > > came down from 2944 to 1424, and FROM_SIGN_OFF_MISMATCH came
> > > > down from 2366 to 2330.
> > > >
> > > > So, a total of 1556 false positives were resolved in total.
> > >
> > > A mere reduction in messages emitted isn't necessarily good.
> > >
> >
> > Agree. That is why I also went through the list of those warnings.
> >
> > I could not spot any obvious true positive among the reduced ones.
> >
> >
> > > Please send me privately a complete list of these nominally
> > > false positive messages that are no longer emitted.
> > >
> > > I believe one of the relatively common incorrect messages is
> > > for the cc: <stable@vger.kernel.org> where a version number is
> > > continued on the same line after a #.
> > >
> > > CC: stable@vger.kernel.org # for versions x.y.z and above
> > >
> >
> > That was one,
>
> It's not just one, it's ~90% of the list that Dwaipayan sent me.
>
> $ wc -l mismatches
> 831 mismatches
>
> $ grep -v -i stable mismatches | wc -l
> 98
>
> > another common pattern was just quotes put inconsistently at
> > different places.
>
> Yes, there are some defects there.
> But there are also now false negatives.
>
> For instance, this is not appropriate to ignore:
>
> WARNING:BAD_SIGN_OFF: email address 'jacek.anaszewski@gmail.com, linux-leds@vger.kernel.org, linux-kernel@vger.kernel.org, dmurphy@ti.com' might be better as 'jacek.anaszewski@gmail.com,linux-leds@vger.kernel.org, linux-kernel@vger.kernel.org, dmurphy@ti.com'
>
> From the file that Dwaipayan sent me, all the rest not including the
> stable variants, which IMO should be handled separately, are below.
>
> Of these 98 in total, 60+% are unicode which IMO should always be quoted
> and most are doubled with BAD_SIGN_OFF doubling FROM_SIGN_OFF_MISMATCH
> (and I don't quite understand why it's "From:/" then "Signed-off-by:"
>
> $ grep -v -i stable dwai | sort | uniq -c | sort -rn
>      31 WARNING:BAD_SIGN_OFF: email address '周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>' might be better as '"周琰杰"(Zhou Yanjie) <zhouyanjie@wanyeetech.com>'
>      30 WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email name mismatch: 'From: "周琰杰"(Zhou Yanjie) <zhouyanjie@wanyeetech.com>' != 'Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>'
>
> These 29 in total would be better stripping any bits in parentheses from
> the name portion only when _not_ inside quotes.
>
>      20 WARNING:BAD_SIGN_OFF: email address 'Thomas Hellström (VMware) <thomas_os@shipmail.org>' might be better as '"Thomas Hellström"(VMware) <thomas_os@shipmail.org>'
>       5 WARNING:BAD_SIGN_OFF: email address 'H. Peter Anvin (Intel) <hpa@zytor.com>' might be better as '"H. Peter Anvin"(Intel) <hpa@zytor.com>'
>
>
>       1 WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email name mismatch: 'From: "Thomas Hellström"(VMware) <thomas_os@shipmail.org>' != 'Signed-off-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>'
>
>       1 WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email name mismatch: 'From: "Srivatsa S. Bhat"(VMware) <srivatsa@csail.mit.edu>' != 'Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>'
>       1 WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email name mismatch: 'From: JanNieuwenhuizen(janneke) <janneke@gnu.org>' != 'Signed-off-by: Jan Nieuwenhuizen <janneke@gnu.org>'
>       1 WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email name mismatch: 'From: "Frédéric Pierret"(fepitre) <frederic.pierret@qubes-os.org>' != 'Signed-off-by: Frédéric Pierret (fepitre) <frederic.pierret@qubes-os.org>'
>
> So these 8 others are ones where quotes are either oddly placed
> or perhaps should always exist and the comment in parentheses
> is suggested poorly.  7 of these should be fixed and one should
> still be reported.
>
>       1 WARNING:BAD_SIGN_OFF: email address '"Thomas Hellström (VMware)" <thomas_os@shipmail.org>' might be better as '"Thomas Hellström"(VMware) <thomas_os@shipmail.org>'
>       1 WARNING:BAD_SIGN_OFF: email address 'Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>' might be better as '"Srivatsa S. Bhat"(VMware) <srivatsa@csail.mit.edu>'
>       1 WARNING:BAD_SIGN_OFF: email address '"Rantala, Tommi T. (Nokia - FI/Espoo)" <tommi.t.rantala@nokia.com>' might be better as '"Rantala, Tommi T."(Nokia - FI/Espoo) <tommi.t.rantala@nokia.com>'
>       1 WARNING:BAD_SIGN_OFF: email address '"Kai Mäkisara (Kolumbus)" <kai.makisara@kolumbus.fi>' might be better as '"Kai Mäkisara"(Kolumbus) <kai.makisara@kolumbus.fi>'
>       1 WARNING:BAD_SIGN_OFF: email address 'jacek.anaszewski@gmail.com, linux-leds@vger.kernel.org, linux-kernel@vger.kernel.org, dmurphy@ti.com' might be better as 'jacek.anaszewski@gmail.com,linux-leds@vger.kernel.org, linux-kernel@vger.kernel.org, dmurphy@ti.com'
>       1 WARNING:BAD_SIGN_OFF: email address 'Frédéric Pierret (fepitre) <frederic.pierret@qubes-os.org>' might be better as '"Frédéric Pierret"(fepitre) <frederic.pierret@qubes-os.org>'
>       1 WARNING:BAD_SIGN_OFF: email address 'David.Laight@aculab.com (big endian system concerns)' might be better as 'David.Laight@aculab.com(big endian system concerns)'
>       1 WARNING:BAD_SIGN_OFF: email address 'apenwarr@foxnet.net (Avery Pennarun)' might be better as 'apenwarr@foxnet.net(Avery Pennarun)'
>
>
Hi,
Thanks for the review.

So I get that the parentheses from within quotes should not
be extracted. I will do that.

But for the names which should be quoted, I think the errors appeared
because of a parsing bug. There is no separate mechanism
to distinguish quoted and unquoted names currently.

Names which have must quote characters without any comments are
not warned about right now:

D. Ray <dwaipayanray1@gmail.com> doesn't throw any warning, while
D. Ray (Dwai) <dwaipayanray1@gmail.com> does.

Do you think this should be dealt separately from this patch?
Perhaps as another warning?

Thanks,
Dwaipayan.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] checkpatch: improve handling of email comments
  2020-10-31  6:11       ` Dwaipayan Ray
@ 2020-10-31 11:14         ` Joe Perches
  2020-10-31 21:08           ` Dwaipayan Ray
  0 siblings, 1 reply; 8+ messages in thread
From: Joe Perches @ 2020-10-31 11:14 UTC (permalink / raw)
  To: Dwaipayan Ray
  Cc: Lukas Bulwahn, linux-kernel-mentees, linux-kernel, Aditya Srivastava

On Sat, 2020-10-31 at 11:41 +0530, Dwaipayan Ray wrote:
> Names which have must quote characters without any comments are
> not warned about right now:
> 
> D. Ray <dwaipayanray1@gmail.com> doesn't throw any warning, while
> D. Ray (Dwai) <dwaipayanray1@gmail.com> does.

I agree that a comment in parentheses after the name and before
the email address is an issue that should be resolved.

I think your proposed solution isn't great through.

> Do you think this should be dealt separately from this patch?

I think the cc: stable@(?:vger\.)?kernel.org with additional
content on the same line should be separated from other email
addresses with additional content on the same line.

> Perhaps as another warning?

Dunno.

Try this git log grep:

$ git log --format=email -100000 | \
  grep -P '^(?:[\w\-]+-by:|cc:|CC:|Cc:)' | \
  grep -v 'stable\@' | \
  grep -P '\>.+'

This finds any signature/cc line with content after an
email address that end with a close angle bracket that
doesn't go to the stable address.

Think about what content after that close angle bracket
should and shoud not be allowed.

There are a few variants here:

o comments (optional whitespace, followed by '#' or '[' or '(' or c89)
o misuse of quote (around the whole name and address)
o Odd commas after '>' likely from defective cut'n'paste use

Then add this to the first grep to avoid the comments as above

$ git log --format=email -100000 | \
  grep -P '^(?:[\w\-]+-by:|cc:|CC:|Cc:)' | \
  grep -v 'stable\@' | \
  grep -P '\>.+' | \
  grep -vP '\>\s*(?:\#|\(|/\*|\[)'

Shouldn't all these be reported?
Are they if your patch is applied?

Then look at the addresses that do not have a close angle
bracket and also have more content after the email address.

$ git log --format=email -100000 | \
  grep -P '^(?:[\w\-]+-by:|cc:|CC:|Cc:)' | \
  grep -v 'stable@' | \
  grep -vP '<[\w\.\@\+\-]+>' | \
  grep -vP '[\w\.\@\+\-]+$'

What of all of these should be reported?

Happy testing...


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] checkpatch: improve handling of email comments
  2020-10-31 11:14         ` Joe Perches
@ 2020-10-31 21:08           ` Dwaipayan Ray
  0 siblings, 0 replies; 8+ messages in thread
From: Dwaipayan Ray @ 2020-10-31 21:08 UTC (permalink / raw)
  To: Joe Perches
  Cc: Lukas Bulwahn, linux-kernel-mentees, linux-kernel, Aditya Srivastava


> Try this git log grep:
>
> $ git log --format=email -100000 | \
>    grep -P '^(?:[\w\-]+-by:|cc:|CC:|Cc:)' | \
>    grep -v 'stable\@' | \
>    grep -P '\>.+'
>
> This finds any signature/cc line with content after an
> email address that end with a close angle bracket that
> doesn't go to the stable address.
>
> Think about what content after that close angle bracket
> should and shoud not be allowed.
>
> There are a few variants here:
>
> o comments (optional whitespace, followed by '#' or '[' or '(' or c89)
> o misuse of quote (around the whole name and address)
> o Odd commas after '>' likely from defective cut'n'paste use
>
> Then add this to the first grep to avoid the comments as above
>
> $ git log --format=email -100000 | \
>    grep -P '^(?:[\w\-]+-by:|cc:|CC:|Cc:)' | \
>    grep -v 'stable\@' | \
>    grep -P '\>.+' | \
>    grep -vP '\>\s*(?:\#|\(|/\*|\[)'
>
> Shouldn't all these be reported?
> Are they if your patch is applied?
>
> Then look at the addresses that do not have a close angle
> bracket and also have more content after the email address.
>
> $ git log --format=email -100000 | \
>    grep -P '^(?:[\w\-]+-by:|cc:|CC:|Cc:)' | \
>    grep -v 'stable@' | \
>    grep -vP '<[\w\.\@\+\-]+>' | \
>    grep -vP '[\w\.\@\+\-]+$'
>
> What of all of these should be reported?
>
> Happy testing...
>
Hi,
So I ran the tests and there are some interesting results.

The warnings were the same before and after this patch
was applied.

For illegal elements after the closing braces, there were
several errors for which no warnings were reported. These
are:

       6 Cc: Peter Zijlstra <peterz@infradead.org>,
       5 Reviewed-by: "Dietmar Eggemann <dietmar.eggemann@arm.com>"
       1 Suggested-by: Julia Lawall <julia.lawall@lip6.fr>.
       1 Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>"
       1 Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>"
       1 Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>"
       1 Signed-off-by: Veerabhadrarao Badiganti 
<vbadigan@codeaurora.org> Link:
       1 Signed-off-by: Jingle Wu <jingle.wu@emc.com.tw>.
       1 Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
       1 Reviewed-by: Max Gurtovoy <maxg@mellanox.com 
<mailto:maxg@mellanox.com>>
       1 Reviewed-by: Maulik Shah <mkshah@codeaurora.org> Thanks, Maulik
       1 Reviewed-by: Marc Zyngier <maz@kernel.org> Link: 
https://lore.kernel.org/r/20200826112331.047917603@linutronix.de
       1 Reviewed-by: David Sterba <dsterba@suse.com>i
       1 Reviewed-by: David Sterba <dsterba@suse.com>c
       1 Reviewed-by: David Sterba <dsterba@suse.com>3
       1 Reviewed-by: Christoph Hellwig <hch@lst.de>,
       1 Reviewed-by: Christian König <christian.koenig@amd.com> for both.
       1 Reviewed-by: Christian König <christian.koenig@amd.com>.
       1 Reported-by: Randy Dunlap <rdunlap@infradead.org>>
       1 Reported-by: Qian Cai <cai@redhat.com>>
       1 Reported-by: Qian Cai <cai@lca.pw> writes:
       1 Reported-by: kernel test robot <lkp@intel.com> for missing #include
       1 Reported-by: "kernelci.org bot" <bot@kernelci.org>"
       1 Reported-by: kbuild test robot <lkp@intel.com>]
       1 Reported-by: Bob Sanders <bob.sanders@hpe.com> and the LTP 
testsuite
       1 Cc: Wolfram Sang <wsa@kernel.org>,
       1 Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>,
       1 Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
       1 CC: Stephen Rothwell <sfr@canb.auug.org.au>,
       1 Cc: Sia, Jee Heng <jee.heng.sia@intel.com>; 
alsa-devel@alsa-project.org; linux-kernel@vger.kernel.org; 
clang-built-linux@googlegroups.com; Nathan Chancellor 
<natechancellor@gmail.com>
       1 Cc: Robert Sesek <rsesek@google.com>,
       1 CC: Peter Zijlstra <peterz@infradead.org>,
       1 Cc: Omar Sandoval <osandov@fb.com>,
       1 Cc: Michael Neuling <mikey@neuling.org> <mikey@neuling.org>
       1 Cc: Maxime Ripard <mripard@kernel.org>,
       1 Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
       1 Cc: Mark Scott <mscott@forcepoint.com>,
       1 Cc: Mark Rutland <mark.rutland@arm.com>.
       1 Cc: Mark Rutland <mark.rutland@arm.com>,
       1 Cc: Mark Rutland <mark.rutland@arm.com>,
       1 Cc: Laurent Pinchart <Laurent.pinchart@ideasonboard.com>,
       1 Cc: Kees Cook <keescook@google.com>,
       1 Cc: Jonas Karlman <jonas@kwiboo.se>,
       1 Cc: Jernej Skrabec <jernej.skrabec@siol.net>,
       1 Cc: Jason Wang <jasowang@redhat.com>; Parav Pandit 
<parav@mellanox.com>; virtualization@lists.linux-foundation.org; 
linux-kernel@vger.kernel.org; kernel-janitors@vger.kernel.org
       1 Cc: Jann Horn <jannh@google.com>,
       1 Cc: James Bottomley <james.bottomley@hansenpartnership.com>,
       1 Cc: Icenowy Zheng <icenowy@aosc.io>,
       1 Cc: Dan Murphy <dmurphy@ti.com>A
       1 Cc: Daniel Vetter <daniel@ffwll.ch>,
       1 Cc: Christoph Lameter <cl@linux.com>Cc: Pekka Enberg 
<penberg@kernel.org>
       1 Cc: Christoph Hellwig <hch@lst.de>,
       1 Cc: Christian König <christian.koenig@amd.com>.
       1 CC: "Chang S. Bae" <chang.seok.bae@intel.com>,
       1 Cc: Al Viro <viro@zeniv.linux.org.uk>e
       1 Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>A
       1 Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>A


For cases with no closing '>', following cases were not
reported by checkpatch:

       6 Cc: linux-arm-kernel@lists.infradead.org,
       1 Reviewed-by: Max Gurtovoy <maxg@mellanox.com 
<mailto:maxg@mellanox.com>>
       1 Cc: rostedt@goodmis.org,
       1 Cc: linux-wireless@vger.kernel.org,
       1 Cc: dri-devel@lists.freedesktop.org,

So it's mostly extra commas at the end. But other
illegal characters were also there. Also in one or two
cases, multiple addresses were stacked together.
I think these all deserve to be reported.

Thanks,
Dwaipayan.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-10-31 21:09 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-30  9:07 [PATCH] checkpatch: improve handling of email comments Dwaipayan Ray
2020-10-30 11:54 ` Joe Perches
2020-10-30 11:58   ` Lukas Bulwahn
2020-10-30 12:21     ` Joe Perches
2020-10-31  4:14     ` Joe Perches
2020-10-31  6:11       ` Dwaipayan Ray
2020-10-31 11:14         ` Joe Perches
2020-10-31 21:08           ` Dwaipayan Ray

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).