All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dwaipayan Ray <dwaipayanray1@gmail.com>
To: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: linux-kernel-mentees@lists.linuxfoundation.org
Subject: Re: [Linux-kernel-mentees] checkpatch.pl investigation: NO_AUTHOR_SIGN_OFF issues
Date: Fri, 25 Sep 2020 12:59:31 +0530	[thread overview]
Message-ID: <CABJPP5Chc5+tNda3af88=pUC2XN8bEga4FiP=+KBwyx0UWOmyA@mail.gmail.com> (raw)
In-Reply-To: <alpine.DEB.2.21.2009250854230.5992@felia>

> Let us try to put some systematic structure into this handling of the
> NO_AUTHOR_SIGN_OFF.
>
> There are basically two aspects to consider:
>
> - which classes are potential mismatches can we potentially observe?
>
> - which level of severity do we assign to each class of mismatch?
>
> To the first point, you started with some initial classes above.
>
>   0) same name, same address (no mismatch)
>   1) same name, different address
>      subclasses: (how 'different' address?)
>      - email address differ by valid difference, e.g., extensions that
>        will go to the same inbox, e.g.,  the xyz+fds extension in mail
>        addresses. (that is your class 3, right?)
>      - two email addresses that are known to belong to the same individual
>        - known because of .mailmap
>        - known otherwise?
>   2) different name, same address
>      subclasses: (how 'different' name?)
>      - examples:
>        - Firstname, Lastname (but middle-name initials differ)
>        - special "regional" characters, like ü vs. ue, etc.
>        - firstname and lastname are reverted etc.
>   3) different name, different address (but still we believe it "matches")
>      combinations of subclasses of 1) and 2), which we want to consider.
>
> Then, to the second question:
>
> So, these different classes we can think of and "assign" different
> severities that checkpatch.pl can use.
>
> Checkpatch.pl already has four levels:
>
> three severity reporting levels: ERROR, WARN, CHECK, and
> the level _do not report_ (which is implemented by just ignoring some kind
> of difference).
>
> I think a lot of discussion will be around what severity to assign to
> which class above, and in some way, long-term maintainers have a larger
> say than we do here.
>
> So, let us first modify checkpatch.pl to identify the various classes
> above, maybe just starting very basic, with different name, same address
> and same name, different address and run it over the commit range and see
> which examples show up and how often.
>
> For now, we first just use checkpatch.pl to identify the different types
> and refine them into subclasses; then, we can discuss with severity should
> be assigned to which type of mismatch.
>
> The second question should not invalidate our data collection and
> identification of subclasses, though.
>
> Does that help? What do you think?
>
>
> Lukas

Yes, that should help I think. Currently checkpatch is very vague
about Author sign offs, so subclassing it will really be helpful.
But at the same point I don't think it should become
too complex either.

I have already prepared a patch for the basic two classes:
1) same name, different address
2) same address, different name

It would be great to have your feedback on this.

---
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 9e65d21456f1..e23e753fcaf0 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1210,14 +1210,22 @@ sub reformat_email {
  return format_email($email_name, $email_address);
 }

-sub same_email_addresses {
+sub same_mail_check {
  my ($email1, $email2) = @_;

  my ($email1_name, $name1_comment, $email1_address, $comment1) =
parse_email($email1);
  my ($email2_name, $name2_comment, $email2_address, $comment2) =
parse_email($email2);

- return $email1_name eq $email2_name &&
-        $email1_address eq $email2_address;
+ my $same_name = $email1_name eq $email2_name;
+ my $same_address= $email1_address eq $email2_address;
+
+ return ($same_name, $same_address);
+}
+
+sub same_email_addresses {
+ my ($same_name, $same_address) = same_mail_check(@_);
+
+ return $same_name && $same_address;
 }

 sub which {
@@ -2347,6 +2355,7 @@ sub process {
  my $signoff = 0;
  my $author = '';
  my $authorsignoff = 0;
+ my $authorsignerr = '';
  my $is_patch = 0;
  my $is_binding_patch = -1;
  my $in_header_lines = $file ? 0 : 1;
@@ -2677,6 +2686,13 @@ sub process {
  if ($author ne '') {
  if (same_email_addresses($1, $author)) {
  $authorsignoff = 1;
+ } else {
+ my ($same_name, $same_address) = same_mail_check($1, $author);
+ if($same_name) {
+ $authorsignerr = "ADDRESS_MISMATCH:${1}";
+ } elsif ($same_address) {
+ $authorsignerr = "NAME_MISMATCH:${1}";
+ }
  }
  }
  }
@@ -6891,8 +6907,16 @@ sub process {
  ERROR("MISSING_SIGN_OFF",
        "Missing Signed-off-by: line(s)\n");
  } elsif (!$authorsignoff) {
- WARN("NO_AUTHOR_SIGN_OFF",
-      "Missing Signed-off-by: line by nominal patch author '$author'\n");
+ if($authorsignerr =~ /^NAME_MISMATCH:(.*)/) {
+ WARN("NO_AUTHOR_SIGN_OFF",
+      "Author name mismatch: 'From: $author' != 'Signed-off-by: $1'\n");
+ } elsif ($authorsignerr =~ /^ADDRESS_MISMATCH:(.*)/) {
+ WARN("NO_AUTHOR_SIGN_OFF",
+      "Author email address mismatch: 'From: $author' !=
'Signed-off-by: $1'\n");
+ } else {
+ WARN("NO_AUTHOR_SIGN_OFF",
+ "Missing Signed-off-by: line by nominal patch author '$author'\n");
+ }
  }
  }
---

Thanks,
Dwaipayan.
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

  reply	other threads:[~2020-09-25  7:29 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-18 10:06 [Linux-kernel-mentees] checkpatch.pl investigation: NO_AUTHOR_SIGN_OFF issues Lukas Bulwahn
2020-09-18 10:29 ` Dwaipayan Ray
2020-09-18 10:44   ` Lukas Bulwahn
2020-09-21  9:07     ` Dwaipayan Ray
2020-09-21  9:12       ` Lukas Bulwahn
2020-09-21  9:15       ` Lukas Bulwahn
2020-09-22 13:21         ` Dwaipayan Ray
2020-09-22 18:38           ` Lukas Bulwahn
2020-09-22 19:08             ` Dwaipayan Ray
2020-09-23  7:32               ` Lukas Bulwahn
2020-09-23  7:38                 ` Dwaipayan Ray
2020-09-23  7:42                   ` Lukas Bulwahn
2020-09-25  4:18                     ` Dwaipayan Ray
2020-09-25  7:20                       ` Lukas Bulwahn
2020-09-25  7:29                         ` Dwaipayan Ray [this message]
2020-09-25  7:35                           ` Lukas Bulwahn
2020-09-26 11:31                             ` Dwaipayan Ray
2020-09-28 13:30                               ` Dwaipayan Ray
2020-09-28 14:09                                 ` Lukas Bulwahn
2020-09-28 14:20                                   ` Dwaipayan Ray
2020-09-28 15:09                                     ` Lukas Bulwahn
2020-09-28 15:06                               ` Lukas Bulwahn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABJPP5Chc5+tNda3af88=pUC2XN8bEga4FiP=+KBwyx0UWOmyA@mail.gmail.com' \
    --to=dwaipayanray1@gmail.com \
    --cc=linux-kernel-mentees@lists.linuxfoundation.org \
    --cc=lukas.bulwahn@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.