linux-kernel-mentees.lists.linuxfoundation.org archive mirror
 help / color / mirror / Atom feed
From: Lukas Bulwahn <lukas.bulwahn@gmail.com>
To: Joe Perches <joe@perches.com>
Cc: apw@canonical.com, Dwaipayan Ray <dwaipayanray1@gmail.com>,
	linux-kernel-mentees@lists.linuxfoundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [Linux-kernel-mentees] [PATCH v2] checkpatch: extend author Signed-off-by check for split From: header
Date: Mon, 21 Sep 2020 09:39:32 +0200 (CEST)	[thread overview]
Message-ID: <alpine.DEB.2.21.2009210921520.7483@felia> (raw)
In-Reply-To: <52ccb41c8922dda44ac325f2f3e09f81f1936611.camel@perches.com>

[-- Attachment #1: Type: text/plain, Size: 7958 bytes --]



On Sun, 20 Sep 2020, Joe Perches wrote:

> On Sun, 2020-09-20 at 21:52 +0530, Dwaipayan Ray wrote:
> > On Sun, Sep 20, 2020 at 8:39 PM Joe Perches <joe@perches.com> wrote:
> > > On Sun, 2020-09-20 at 14:47 +0530, Dwaipayan Ray wrote:
> > > > Checkpatch did not handle cases where the author From: header
> > > > was split into multiple lines. The author identity could not
> > > > be resolved and checkpatch generated a false NO_AUTHOR_SIGN_OFF
> > > > warning.
> > > 
> > I think there won't be any problem. Is my
> > observation correct?
> 
> Likely true, it probably doesn't matter.
> Still, I'd imagine it doesn't hurt either.
> 
> > > What I have does a bit more by saving any post-folding
> > > 
> > > "From: <name and email address>"
> > > 
> > > and comparing that to any "name and perhaps different
> > > email address" in a Signed-off-by: line.
> > > 
> > > A new message is emitted if the name matches but the
> > > email address is different.
> > > 
> > > Perhaps it's reasonable to apply your patch and then
> > > update it with something like the below:
> > > ---
> > >  scripts/checkpatch.pl | 32 ++++++++++++++++++++++++++++----
> > >  1 file changed, 28 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > > index 3e474072aa90..1ecc179e938d 100755
> > > --- a/scripts/checkpatch.pl
> > > +++ b/scripts/checkpatch.pl
> > > @@ -1240,6 +1240,15 @@ sub same_email_addresses {
> > >                $email1_address eq $email2_address;
> > >  }
> > > 
> > > +sub same_email_names {
> > > +       my ($email1, $email2) = @_;
> > > +
> > > +       my ($email1_name, $name1_comment, $email1_address, $comment1) = parse_email($email1);
> > > +       my ($email2_name, $name2_comment, $email2_address, $comment2) = parse_email($email2);
> > > +
> > > +       return $email1_name eq $email2_name;
> > > +}
> > > +
> > >  sub which {
> > >         my ($bin) = @_;
> > > 
> > > @@ -2679,20 +2688,32 @@ sub process {
> > >                 }
> > > 
> > >  # Check the patch for a From:
> > > -               if (decode("MIME-Header", $line) =~ /^From:\s*(.*)/) {
> > > +               if ($line =~ /^From:\s*(.*)/i) {
> > >                         $author = $1;
> > > -                       $author = encode("utf8", $author) if ($line =~ /=\?utf-8\?/i);
> > > +                       my $curline = $linenr;
> > > +                       while (defined($rawlines[$curline]) && $rawlines[$curline++] =~ /^\s(\s+)?(.*)/) {
> > > +                               $author .= ' ' if (defined($1));
> > > +                               $author .= "$2";
> > > +                       }
> > > +                       if ($author =~ /=\?utf-8\?/i) {
> > > +                               $author = decode("MIME-Header", $author);
> > > +                               $author = encode("utf8", $author);
> > > +                       }
> > > +
> > >                         $author =~ s/"//g;
> > >                         $author = reformat_email($author);
> > >                 }
> > > 
> > >  # Check the patch for a signoff:
> > >                 if ($line =~ /^\s*signed-off-by:\s*(.*)/i) {
> > > +                       my $sig = $1;
> > >                         $signoff++;
> > >                         $in_commit_log = 0;
> > >                         if ($author ne '') {
> > > -                               if (same_email_addresses($1, $author)) {
> > > -                                       $authorsignoff = 1;
> > > +                               if (same_email_addresses($sig, $author)) {
> > > +                                       $authorsignoff = "1";
> > > +                               } elsif (same_email_names($sig, $author)) {
> > > +                                       $authorsignoff = $sig;
> > >                                 }
> > >                         }
> > >                 }
> > > @@ -6937,6 +6958,9 @@ sub process {
> > >                 } elsif (!$authorsignoff) {
> > >                         WARN("NO_AUTHOR_SIGN_OFF",
> > >                              "Missing Signed-off-by: line by nominal patch author '$author'\n");
> > > +               } elsif ($authorsignoff ne "1") {
> > > +                       WARN("NO_AUTHOR_SIGN_OFF",
> > > +                            "From:/SoB: email address mismatch: 'From: $author' != 'Signed-off-by: $authorsignoff'\n");
> > >                 }
> > >         }
> > > 
> > > 
> > 
> > Yes, this is definitely more logical !
> > I was actually hoping to talk with you on this.
> 
> Hope granted, but only via email... (smile)
> 
> > The code you sent better handles name mismatches when
> > email addresses are same. But I also have found several
> > such commits in which the author have signed off using
> > a different email address than the one which he/she used
> > to send the patch.
> > 
> > For example, Lukas checked commits between v5.4 and
> > v5.8 and he found:
> >     175 Missing Signed-off-by: line by nominal patch authorrep
> >     'Daniel Vetter <daniel.vetter@ffwll.ch>'
> > 
> > Infact in all of those commits he signed off using a different
> > mail, Daniel Vetter <daniel.vetter@intel.com>.
> > 
> > So is it possible to resolve these using perhaps .mailmap
> > entries? Or should only the name mismatch part be better
> > handled? Or perhaps both?
> 
> Dunno.  It certainly can be improved...
> Try adding some more logic and see what you come up with.
> 
> btw:
> 
> The most frequent NO_AUTHOR_SIGN_OFF messages for v5.7..v5.8 are
> 
>      98 WARNING: From:/SoB: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'
>      24 WARNING: From:/SoB: email address mismatch: 'From: Thinh Nguyen <Thinh.Nguyen@synopsys.com>' != 'Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>'
>      19 WARNING: From:/SoB: email address mismatch: 'From: Wolfram Sang <wsa+renesas@sang-engineering.com>' != 'Signed-off-by: Wolfram Sang <wsa@kernel.org>'
>      11 WARNING: From:/SoB: email address mismatch: 'From: Luke Nelson <lukenels@cs.washington.edu>' != 'Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>'
>       8 WARNING: From:/SoB: email address mismatch: 'From: Christophe Leroy <christophe.leroy@c-s.fr>' != 'Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>'
>       6 WARNING: From:/SoB: email address mismatch: 'From: Davidlohr Bueso <dave@stgolabs.net>' != 'Signed-off-by: Davidlohr Bueso <dbueso@suse.de>'
>       5 WARNING: Missing Signed-off-by: line by nominal patch author '"Paul A. Clarke" <pc@us.ibm.com>'
>       4 WARNING: Missing Signed-off-by: line by nominal patch author 'Huang Ying <ying.huang@intel.com>'
>       3 WARNING: Missing Signed-off-by: line by nominal patch author '"Stephan Müller" <smueller@chronox.de>'
>

Great minds think alike.

I did a similar investigation on Friday after the first discussion with 
Dwaipayan:

https://lore.kernel.org/linux-kernel-mentees/alpine.DEB.2.21.2009181238230.14717@felia/T/#m1bf5f7ca876d33d4d53e492b5d8a6232437c921f

I hope Dwaipayan can come up with a '.AUTHOR_SIGN_OFF.mailmap' file that 
we can use to distinguish the known developers that knowingly and 
intentionally use different identities vs. the 'newbies' that should 
validly be warned.

We will see if Dwaipayan can come up with a good idea how to handle that.

Lukas

> For the Missing Signed-off-by: lines above,
> even after decoding, the email matches but
> the names do not.
> 
> From: "Paul A. Clarke" <pc@us.ibm.com>
> [...]
> Signed-off-by: Paul Clarke <pc@us.ibm.com>
> 
> From: Huang Ying <ying.huang@intel.com>
> [...]
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> 
> From: =?UTF-8?q?Stephan=20M=C3=BCller?= <smueller@chronox.de>
> [...]
> Signed-off-by: Stephan Mueller <smueller@chronox.de>
> 
> > Also, I would like to know if there are any more changes
> > required for the current patch or if it is good to go?
> 
> I think it's fine.
> 
> cheers, Joe
> 
> 

[-- Attachment #2: Type: text/plain, Size: 201 bytes --]

_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

  reply	other threads:[~2020-09-21  7:39 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-20  9:17 [Linux-kernel-mentees] [PATCH v2] checkpatch: extend author Signed-off-by check for split From: header Dwaipayan Ray
2020-09-20 15:09 ` Joe Perches
2020-09-20 16:22   ` Dwaipayan Ray
2020-09-20 16:54     ` Joe Perches
2020-09-21  7:39       ` Lukas Bulwahn [this message]
2020-09-21  9:47         ` Joe Perches
2020-09-20 17:39 ` Joe Perches
2020-09-21  7:49 ` Lukas Bulwahn
2020-09-21  8:31   ` Dwaipayan Ray
  -- strict thread matches above, loose matches on Subject: below --
2020-09-19 20:47 Dwaipayan Ray
2020-09-19 21:15 ` Dwaipayan Ray
2020-09-20  8:11   ` Lukas Bulwahn
2020-09-20  8:01 ` Lukas Bulwahn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.21.2009210921520.7483@felia \
    --to=lukas.bulwahn@gmail.com \
    --cc=apw@canonical.com \
    --cc=dwaipayanray1@gmail.com \
    --cc=joe@perches.com \
    --cc=linux-kernel-mentees@lists.linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).