linux-kernel-mentees.lists.linuxfoundation.org archive mirror
 help / color / mirror / Atom feed
From: Lukas Bulwahn <lukas.bulwahn@gmail.com>
To: Dwaipayan Ray <dwaipayanray1@gmail.com>,
	 Aditya Srivastava <yashsri421@gmail.com>
Cc: linux-kernel-mentees@lists.linuxfoundation.org
Subject: Re: [Linux-kernel-mentees] [PATCH RFC] checkpatch: improve handling of email comments
Date: Wed, 28 Oct 2020 16:59:47 +0100 (CET)	[thread overview]
Message-ID: <alpine.DEB.2.21.2010281659160.13040@felia> (raw)
In-Reply-To: <CABJPP5CkzaQ8x+b8UYZCLwXC1WaZL6+rpkespV684zkPAeJXrA@mail.gmail.com>



On Wed, 28 Oct 2020, Dwaipayan Ray wrote:

> On Wed, Oct 28, 2020 at 8:55 PM Dwaipayan Ray <dwaipayanray1@gmail.com> wrote:
> >
> > checkpatch has limited support for parsing email comments. It only
> > support single name comments or single after address comments.
> > Whereas, RFC 5322 specifies that comments can be inserted in
> > between any tokens of the email fields.
> >
> > On analyzing 50,000 commits from v5.4 it was found that there were
> > about 370 false positives resulting from wrong parsing of comments.
> >
> > Improve comment parsing mechanism in checkpatch.
> >
> > What is handled now:
> >
> > - Multiple name/address comments
> > - Comments anywhere in between name/address
> > - Multi level comments like (John (Doe) )
> >
> > Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
> > ---
> >  scripts/checkpatch.pl | 19 ++++++++++++++-----
> >  1 file changed, 14 insertions(+), 5 deletions(-)
> >
> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > index fab38b493cef..ae8436385fc1 100755
> > --- a/scripts/checkpatch.pl
> > +++ b/scripts/checkpatch.pl
> > @@ -1183,14 +1183,20 @@ sub parse_email {
> >                 }
> >         }
> >
> > -       $comment = trim($comment);
> > +       # Comments in between name like John(A nice chap) Doe
> > +       while ($name =~ s/\s*($balanced_parens)\s*/ /) {
> > +               $name_comment .= trim($1);
> > +       }
> >         $name = trim($name);
> >         $name =~ s/^\"|\"$//g;
> > -       if ($name =~ s/(\s*\([^\)]+\))\s*//) {
> > -               $name_comment = trim($1);
> > +
> > +       # Comments in between address like <john(his account)@doe.com>
> > +       while ($address =~ s/\s*($balanced_parens)\s*//) {
> > +               $comment .= trim($1);
> >         }
> >         $address = trim($address);
> >         $address =~ s/^\<|\>$//g;
> > +       $comment = trim($comment);
> >
> >         if ($name =~ /[^\w \-]/i) { ##has "must quote" chars
> >                 $name =~ s/(?<!\\)"/\\"/g; ##escape quotes
> > @@ -1205,8 +1211,6 @@ sub format_email {
> >
> >         my $formatted_email;
> >
> > -       $name_comment = trim($name_comment);
> > -       $comment = trim($comment);
> >         $name = trim($name);
> >         $name =~ s/^\"|\"$//g;
> >         $address = trim($address);
> > @@ -1216,6 +1220,11 @@ sub format_email {
> >                 $name = "\"$name\"";
> >         }
> >
> > +       $name_comment = trim($name_comment);
> > +       $name_comment =~ s/(.+)/ $1/;
> > +       $comment = trim($comment);
> > +       $comment =~ s/(.+)/ $1/;
> > +
> >         if ("$name" eq "") {
> >                 $formatted_email = "$address";
> >         } else {
> > --
> > 2.27.0
> >
> 
> Hi,
> This patch is a follow up to our discussion earlier about improving the
> comment handling at:
> https://lore.kernel.org/linux-kernel-mentees/alpine.DEB.2.21.2010251022060.25172@felia/
> 
> I have only tested it on a few patterns.  But I will run an
> evaluation again on those 50k commits to see what changes.
> 
> What do you think of this?
>

That was exactly the comment I wanted to make.

Commit message is clear; code addition also looks pretty clear.

We are just missing the evaluation to make sure how much we improve and 
that we did not add some stupid bug on the way.

Aditya, can you help us here with evaluation?

E.g., Dwaipayan takes from v5.4 into the future, e.g., until v5.8/v5.9 or 
so.

Aditya, you could then check e.g., v5.0..v5.4, so v5.4 down to the past as 
far as your computer and scripts handles within roughly half a day...

An evaluation on 100,000 commits is certainly a good basis.

Of course, every mentorship candidate can join here as well and show off 
their own evaluation script.

Lukas
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

  parent reply	other threads:[~2020-10-28 16:00 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-28 15:25 [Linux-kernel-mentees] [PATCH RFC] checkpatch: improve handling of email comments Dwaipayan Ray
2020-10-28 15:28 ` Dwaipayan Ray
2020-10-28 15:59   ` Lukas Bulwahn
2020-10-28 15:59   ` Lukas Bulwahn [this message]
2020-10-28 16:28     ` Aditya
2020-10-28 16:38       ` Dwaipayan Ray
2020-10-29 15:03         ` Aditya
2020-10-29 15:47           ` Dwaipayan Ray
2020-10-29 19:43             ` Dwaipayan Ray
2020-10-30  5:30               ` Lukas Bulwahn
2020-10-30  8:31                 ` Aditya
2020-10-30  8:47                 ` Dwaipayan Ray

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.21.2010281659160.13040@felia \
    --to=lukas.bulwahn@gmail.com \
    --cc=dwaipayanray1@gmail.com \
    --cc=linux-kernel-mentees@lists.linuxfoundation.org \
    --cc=yashsri421@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).