linux-kernel-mentees.lists.linuxfoundation.org archive mirror
 help / color / mirror / Atom feed
From: Lukas Bulwahn <lukas.bulwahn@gmail.com>
To: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: linux-kernel-mentees@lists.linuxfoundation.org
Subject: Re: [Linux-kernel-mentees] checkpatch.pl investigation: NO_AUTHOR_SIGN_OFF issues
Date: Fri, 18 Sep 2020 12:44:14 +0200 (CEST)	[thread overview]
Message-ID: <alpine.DEB.2.21.2009181238230.14717@felia> (raw)
In-Reply-To: <CABJPP5DGpep-S5F+J=dxdhBBDp2J9wQBrJvH_Ys6uJDj3AfRRA@mail.gmail.com>



On Fri, 18 Sep 2020, Dwaipayan Ray wrote:

> On Fri, Sep 18, 2020 at 3:36 PM Lukas Bulwahn <lukas.bulwahn@gmail.com> wrote:
> >
> > Hi Dwaipayan, hi others,
> >
> > I had a quick look on the NO AUTHOR SIGN OFF issues reported by
> > checkpatch.pl on v5.4..v5.8 on my own small script.
> >
> > After collecting all the data in a tsv (no details on that tsv here):
> >
> > $ grep "NO_AUTHOR_SIGN_OFF" v5.4..v5.8.tsv | wc -l
> > 1064
> >
> > $ grep "NO_AUTHOR_SIGN_OFF" v5.4..v5.8.tsv  | cut -f 7 | sort  | uniq -c |
> > sort -nr | head -n 8
> >     175 Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
> >     116 Missing Signed-off-by: line by nominal patch author ''
> >      68 Missing Signed-off-by: line by nominal patch author 'Trond Myklebust <trondmy@gmail.com>'
> >      43 Missing Signed-off-by: line by nominal patch author 'Thinh Nguyen <Thinh.Nguyen@synopsys.com>'
> >      40 Missing Signed-off-by: line by nominal patch author 'Pascal van Leeuwen <pascalvanl@gmail.com>'
> >      36 Missing Signed-off-by: line by nominal patch author 'Alex Maftei <amaftei@solarflare.com>'
> >      31 Missing Signed-off-by: line by nominal patch author 'Valdis Kletnieks <valdis.kletnieks@vt.edu>'
> >      24 Missing Signed-off-by: line by nominal patch author 'Luke Nelson <lukenels@cs.washington.edu>'
> >
> >
> > Here a quick look at the first two:
> >
> >     175 Missing Signed-off-by: line by nominal patch author 'Daniel Vetter
> > <daniel.vetter@ffwll.ch>'
> >
> > $ grep "NO_AUTHOR_SIGN_OFF" v5.4..v5.8.tsv | grep "Daniel Vetter" \
> >   | cut -f 1 > commits-NO_AUTHOR_SIGN_OFF-Daniel-Vetter
> > $ cat commits-NO_AUTHOR_SIGN_OFF-Daniel-Vetter | \
> >   xargs git show  --format="%b" -s | \
> >   grep "Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>" | \
> >   wc -l
> >
> > So all 175 commits are of the type:
> >
> > Author: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ...
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >
> >
> > and the second:
> >
> >     116 Missing Signed-off-by: line by nominal patch author ''
> >
> > That is probably due to not parsing the patch author with a line break.
> >
> >
> > So, if we can find a solution for Daniel Vetter (of course, not
> > hard-coding it in checkpatch.pl), e.g., by adding a .mailmap entry for him
> > and making use of that in checkpatch.pl,
> >
> > ... and for the encoding problem, then we got around 27% of the
> > NO_AUTHOR_SIGN_OFF 'problems' solved. Next, we can continue to look at the
> > next few remaining ones. The longer tail of warnings are clearly warnings
> > that deserve to be pointed out to newbies with a broken setup.
> >
> > I hope you can continue to work on a solution for this class.
> >
> >
> > Thanks for your initial investigation,
> >
> > Lukas
> 
> Hi,
> I think I have figured out how to fix the authors with a line break.
> The checkpatch script keeps a track of the previous line in a variable
> $prevline. I can use it to fix the author's initial signature. I will
> send you a patch on this when it's done.
>

Okay, let us start to have this first patch correct and accepted by Joe 
Perches.
 
> And for Daniel Vetter's issue, it will be a bit more complex i think
> as checkpatch
> seems to check both author's name and author's email by parsing two
> strings.
>

When the first patch is accepted, let us discuss with Joe the best 
solution for this case.

> At revision 10b82d5176488acee2820e5a2cf0f2ec5c3488b6,
> scripts/checkpatch.pl,
> 
> 
> line 2673:
> if ($author ne '') {
> if (same_email_addresses($1, $author)) {
> $authorsignoff = 1;
> }
> }
> 
> line 1213:
> sub same_email_addresses {
> my ($email1, $email2) = @_;
> 
> my ($email1_name, $name1_comment, $email1_address, $comment1)
>  = parse_email($email1);
> my ($email2_name, $name2_comment, $email2_address, $comment2)
>  = parse_email($email2);
> 
> return $email1_name eq $email2_name &&
> $email1_address eq $email2_address;
> }
> 
> The easiest way at this point will be to modify same_email_address
> subroutine and add checks for other email addresses belonging to
> the same author by loading .mailmap.
> But will this make the subroutine heavy?
>

Yes, it is becoming heavier... but maybe still as performant as before if 
done right.

You could always run a quick fast path check and only if you cannot find 
the sign-off, you do a slow path, loading .mailmap and checking.

By the way, there is already the function read_mailmap in 
./scripts/get_maintainers.pl, so we would not need to implement that, just 
find a good way for both scripts to share this implementation. 

But let us do the first patch and then discuss with Joe.

Lukas
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

  reply	other threads:[~2020-09-18 10:44 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-18 10:06 [Linux-kernel-mentees] checkpatch.pl investigation: NO_AUTHOR_SIGN_OFF issues Lukas Bulwahn
2020-09-18 10:29 ` Dwaipayan Ray
2020-09-18 10:44   ` Lukas Bulwahn [this message]
2020-09-21  9:07     ` Dwaipayan Ray
2020-09-21  9:12       ` Lukas Bulwahn
2020-09-21  9:15       ` Lukas Bulwahn
2020-09-22 13:21         ` Dwaipayan Ray
2020-09-22 18:38           ` Lukas Bulwahn
2020-09-22 19:08             ` Dwaipayan Ray
2020-09-23  7:32               ` Lukas Bulwahn
2020-09-23  7:38                 ` Dwaipayan Ray
2020-09-23  7:42                   ` Lukas Bulwahn
2020-09-25  4:18                     ` Dwaipayan Ray
2020-09-25  7:20                       ` Lukas Bulwahn
2020-09-25  7:29                         ` Dwaipayan Ray
2020-09-25  7:35                           ` Lukas Bulwahn
2020-09-26 11:31                             ` Dwaipayan Ray
2020-09-28 13:30                               ` Dwaipayan Ray
2020-09-28 14:09                                 ` Lukas Bulwahn
2020-09-28 14:20                                   ` Dwaipayan Ray
2020-09-28 15:09                                     ` Lukas Bulwahn
2020-09-28 15:06                               ` Lukas Bulwahn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.21.2009181238230.14717@felia \
    --to=lukas.bulwahn@gmail.com \
    --cc=dwaipayanray1@gmail.com \
    --cc=linux-kernel-mentees@lists.linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).