* [Linux-kernel-mentees] [PATCH RFC] checkpatch: extend same_email_address check to load mailmap entries
@ 2020-09-23 10:02 Dwaipayan Ray
2020-09-23 10:17 ` Joe Perches
0 siblings, 1 reply; 4+ messages in thread
From: Dwaipayan Ray @ 2020-09-23 10:02 UTC (permalink / raw)
To: joe; +Cc: dwaipayanray1, linux-kernel-mentees
Checkpatch throws a NO_AUTHOR_SIGN_OFF warning whenever either the
name or email address of author doesn't match any signed-off-by
entry. But it has no mechanism to check whether different email
addresses belong to the same author.
As a result, there have been multiple commits where the author has
signed off with a different email address which caused checkpatch
to product that warning.
An example is commit dc5bdb68b5b3 ("drm/fb-helper: Fix vt restore"),
where the author uses two different mail addresses, and checkpatch
generates a NO_AUTHOR_SIGN_OFF warning.
This was fixed by adding support for loading .mailmap entries.
If a mismatch in name or email address then occurs, the
entries in mailmap can be looked up to deterime the actual author
name and email address.
Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
---
scripts/checkpatch.pl | 113 +++++++++++++++++++++++++++++++++++++++++-
1 file changed, 111 insertions(+), 2 deletions(-)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 9e65d21456f1..bfa06c9b9625 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1128,6 +1128,78 @@ sub top_of_kernel_tree {
return 1;
}
+my $mailmap;
+
+sub read_mailmap {
+ $mailmap = {
+ names => {},
+ addresses => {}
+ };
+
+ return if (!defined($root) || !(-f "${root}/.mailmap"));
+
+ open(my $mailmap_file, '<', "${root}/.mailmap")
+ or warn "$P: Can't open .mailmap: $!\n";
+
+ while (<$mailmap_file>) {
+ s/#.*$//; #strip comments
+ s/^\s+|\s+$//g; #trim
+
+ next if (/^\s*$/); #skip empty lines
+ #entries have one of the following formats:
+ # name1 <mail1>
+ # <mail1> <mail2>
+ # name1 <mail1> <mail2>
+ # name1 <mail1> name2 <mail2>
+ # (see man git-shortlog)
+
+ if (/^([^<]+)<([^>]+)>$/) {
+ my $real_name = $1;
+ my $address = $2;
+
+ $real_name =~ s/\s+$//;
+ ($real_name, $address) = parse_email("$real_name <$address>");
+ $mailmap->{names}->{$address} = $real_name;
+
+ } elsif (/^<([^>]+)>\s*<([^>]+)>$/) {
+ my $real_address = $1;
+ my $wrong_address = $2;
+
+ $mailmap->{addresses}->{$wrong_address} = $real_address;
+
+ } elsif (/^(.+)<([^>]+)>\s*<([^>]+)>$/) {
+ my $real_name = $1;
+ my $real_address = $2;
+ my $wrong_address = $3;
+
+ $real_name =~ s/\s+$//;
+ ($real_name, $real_address) =
+ parse_email("$real_name <$real_address>");
+ $mailmap->{names}->{$wrong_address} = $real_name;
+ $mailmap->{addresses}->{$wrong_address} = $real_address;
+
+ } elsif (/^(.+)<([^>]+)>\s*(.+)\s*<([^>]+)>$/) {
+ my $real_name = $1;
+ my $real_address = $2;
+ my $wrong_name = $3;
+ my $wrong_address = $4;
+
+ $real_name =~ s/\s+$//;
+ ($real_name, $real_address) =
+ parse_email("$real_name <$real_address>");
+
+ $wrong_name =~ s/\s+$//;
+ ($wrong_name, $wrong_address) =
+ parse_email("$wrong_name <$wrong_address>");
+
+ my $wrong_email = format_email($wrong_name, $wrong_address);
+ $mailmap->{names}->{$wrong_email} = $real_name;
+ $mailmap->{addresses}->{$wrong_email} = $real_address;
+ }
+ }
+ close($mailmap_file);
+}
+
sub parse_email {
my ($formatted_email) = @_;
@@ -1210,14 +1282,50 @@ sub reformat_email {
return format_email($email_name, $email_address);
}
+sub mailmap_email {
+ my ($name, $address) = @_;
+
+ my $email = format_email($name, $address);
+ my $real_name = $name;
+ my $real_address = $address;
+
+ if (exists $mailmap->{names}->{$email} ||
+ exists $mailmap->{addresses}->{$email}) {
+ if (exists $mailmap->{names}->{$email}) {
+ $real_name = $mailmap->{names}->{$email};
+ }
+ if (exists $mailmap->{addresses}->{$email}) {
+ $real_address = $mailmap->{addresses}->{$email};
+ }
+ } else {
+ if (exists $mailmap->{names}->{$address}) {
+ $real_name = $mailmap->{names}->{$address};
+ }
+ if (exists $mailmap->{addresses}->{$address}) {
+ $real_address = $mailmap->{addresses}->{$address};
+ }
+ }
+ return ($real_name, $real_address);
+}
+
sub same_email_addresses {
my ($email1, $email2) = @_;
my ($email1_name, $name1_comment, $email1_address, $comment1) = parse_email($email1);
my ($email2_name, $name2_comment, $email2_address, $comment2) = parse_email($email2);
- return $email1_name eq $email2_name &&
- $email1_address eq $email2_address;
+ my $name_match = $email1_name eq $email2_name;
+ my $address_match = $email1_address eq $email2_address;
+
+ if(!$name_match || !$address_match) {
+ my ($real_name1, $real_address1) = mailmap_email($email1_name, $email1_address);
+ my ($real_name2, $real_address2) = mailmap_email($email2_name, $email2_address);
+
+ $name_match |= ($real_name1 eq $real_name2);
+ $address_match |= ($real_address1 eq $real_address2);
+ }
+
+ return $name_match && $address_match;
}
sub which {
@@ -2400,6 +2508,7 @@ sub process {
my $checklicenseline = 1;
+ read_mailmap();
sanitise_line_reset();
my $line;
foreach my $rawline (@rawlines) {
--
2.27.0
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [Linux-kernel-mentees] [PATCH RFC] checkpatch: extend same_email_address check to load mailmap entries
2020-09-23 10:02 [Linux-kernel-mentees] [PATCH RFC] checkpatch: extend same_email_address check to load mailmap entries Dwaipayan Ray
@ 2020-09-23 10:17 ` Joe Perches
2020-09-23 11:11 ` Dwaipayan Ray
0 siblings, 1 reply; 4+ messages in thread
From: Joe Perches @ 2020-09-23 10:17 UTC (permalink / raw)
To: Dwaipayan Ray; +Cc: linux-kernel-mentees
On Wed, 2020-09-23 at 15:32 +0530, Dwaipayan Ray wrote:
> Checkpatch throws a NO_AUTHOR_SIGN_OFF warning whenever either the
> name or email address of author doesn't match any signed-off-by
> entry. But it has no mechanism to check whether different email
> addresses belong to the same author.
>
> As a result, there have been multiple commits where the author has
> signed off with a different email address which caused checkpatch
> to product that warning.
>
> An example is commit dc5bdb68b5b3 ("drm/fb-helper: Fix vt restore"),
> where the author uses two different mail addresses, and checkpatch
> generates a NO_AUTHOR_SIGN_OFF warning.
.mailmap entries are used when a person no longer has
access to a particular email and a mapping is created to
allow these old/invalid email addresses to be converted
to a current/valid email address.
The idea here is that a person may have a .gitconfig or
equivalent that still uses an old/invalid email address
but uses a Signed-off-by: for a current/valid one, so
a warning _should_ be emitted for this case.
At least for checkpatch, mismatches in email addresses
should not be papered over by assuming equivalence between
multiple email addresses.
So this should be a separate test and not used in a function
named
"same_email_address". These are specifically _not_
the same email
addresses.
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Linux-kernel-mentees] [PATCH RFC] checkpatch: extend same_email_address check to load mailmap entries
2020-09-23 10:17 ` Joe Perches
@ 2020-09-23 11:11 ` Dwaipayan Ray
2020-09-23 15:08 ` Joe Perches
0 siblings, 1 reply; 4+ messages in thread
From: Dwaipayan Ray @ 2020-09-23 11:11 UTC (permalink / raw)
To: Joe Perches; +Cc: linux-kernel-mentees
> .mailmap entries are used when a person no longer has
> access to a particular email and a mapping is created to
> allow these old/invalid email addresses to be converted
> to a current/valid email address.
>
> The idea here is that a person may have a .gitconfig or
> equivalent that still uses an old/invalid email address
> but uses a Signed-off-by: for a current/valid one, so
> a warning _should_ be emitted for this case.
>
> At least for checkpatch, mismatches in email addresses
> should not be papered over by assuming equivalence between
> multiple email addresses.
>
> So this should be a separate test and not used in a function
> named
> "same_email_address". These are specifically _not_
> the same email
> addresses.
>
>
Hi,
Thanks for the clarification.
So instead of eliminating the warning completely will it be a
better alternative to display a more descriptive warning
on the lines of:
"WARNING: NO_AUTHOR_SIGN_OFF:
invalid/obsolete email address used in signoff by
author $author"
Or is it better to ignore the mailmap extension in checkpatch
and just display warnings for either mismatch in name/
mismatch in email.
Like, you had mentioned one in:
https://lore.kernel.org/linux-kernel-mentees/
7958ded756c895ca614ba900aae7b830a992475e.camel@perches.com/
In that case, same name, but different mail address was
reported as a different warning.
So, I believe two cases can be handled better:
1) Same name, different email
2) Same email, different name
For case 2, it might also be possible to ignore the warning
completely.
What do you think is better?
Thanks,
Dwaipayan.
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Linux-kernel-mentees] [PATCH RFC] checkpatch: extend same_email_address check to load mailmap entries
2020-09-23 11:11 ` Dwaipayan Ray
@ 2020-09-23 15:08 ` Joe Perches
0 siblings, 0 replies; 4+ messages in thread
From: Joe Perches @ 2020-09-23 15:08 UTC (permalink / raw)
To: Dwaipayan Ray; +Cc: linux-kernel-mentees
On Wed, 2020-09-23 at 16:41 +0530, Dwaipayan Ray wrote:
> > .mailmap entries are used when a person no longer has
> > access to a particular email and a mapping is created to
> > allow these old/invalid email addresses to be converted
> > to a current/valid email address.
> >
> > The idea here is that a person may have a .gitconfig or
> > equivalent that still uses an old/invalid email address
> > but uses a Signed-off-by: for a current/valid one, so
> > a warning _should_ be emitted for this case.
> >
> > At least for checkpatch, mismatches in email addresses
> > should not be papered over by assuming equivalence between
> > multiple email addresses.
> >
> > So this should be a separate test and not used in a function
> > named
> > "same_email_address". These are specifically _not_
> > the same email
> > addresses.
> >
> >
>
> Hi,
> Thanks for the clarification.
>
> So instead of eliminating the warning completely will it be a
> better alternative to display a more descriptive warning
> on the lines of:
>
> "WARNING: NO_AUTHOR_SIGN_OFF:
> invalid/obsolete email address used in signoff by
> author $author"
>
> Or is it better to ignore the mailmap extension in checkpatch
> and just display warnings for either mismatch in name/
> mismatch in email.
>
> Like, you had mentioned one in:
> https://lore.kernel.org/linux-kernel-mentees/
> 7958ded756c895ca614ba900aae7b830a992475e.camel@perches.com/
>
> In that case, same name, but different mail address was
> reported as a different warning.
>
> So, I believe two cases can be handled better:
> 1) Same name, different email
> 2) Same email, different name
>
> For case 2, it might also be possible to ignore the warning
> completely.
Maybe both.
I suggest seeing if extensions to email addresses like:
<local_part+extension@domain.tld>
could be compared to:
<local_part@domain.tld>
and comment blocks like:
"First Last" (via mailing list) <local_part@domain.tld>
could be compared to:
"First Last" <local_part@domain.tlc> (MAINTAINER)
Still, I think all of these should emit warnings or
--strict checks messages.
The idea is to notify that some oddity has occurred
rather than silently accept it.
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2020-09-23 19:55 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-23 10:02 [Linux-kernel-mentees] [PATCH RFC] checkpatch: extend same_email_address check to load mailmap entries Dwaipayan Ray
2020-09-23 10:17 ` Joe Perches
2020-09-23 11:11 ` Dwaipayan Ray
2020-09-23 15:08 ` Joe Perches
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).