linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4] checkpatch: warn for non-standard fixes tag style
@ 2022-09-08 16:44 Niklas Söderlund
  2022-09-08 17:49 ` Philippe Schenker
  0 siblings, 1 reply; 7+ messages in thread
From: Niklas Söderlund @ 2022-09-08 16:44 UTC (permalink / raw)
  To: Dwaipayan Ray, Lukas Bulwahn, Joe Perches, Jonathan Corbet,
	Andy Whitcroft, linux-doc, linux-kernel, Philippe Schenker
  Cc: oss-drivers, Niklas Söderlund, Simon Horman, Louis Peens

Add a warning for fixes tags that does not fall in line with the
standards specified by the community.

Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
---
* Changes since v3
- Add test that title in tag match title of commit referenced by sha1.

* Changes since v2
- Change the pattern to match on 'fixes:?' to catch more malformed
  fixes tags.

* Changes since v1
- Update the documentation wording and add mention one cause of the
  message can be that email program splits the tag over multiple lines.
---
 Documentation/dev-tools/checkpatch.rst |  8 +++++
 scripts/checkpatch.pl                  | 41 ++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/Documentation/dev-tools/checkpatch.rst b/Documentation/dev-tools/checkpatch.rst
index b52452bc2963..8c8456a3bd18 100644
--- a/Documentation/dev-tools/checkpatch.rst
+++ b/Documentation/dev-tools/checkpatch.rst
@@ -612,6 +612,14 @@ Commit message
 
     See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
 
+  **BAD_FIXES_TAG**
+    The Fixes: tag is malformed or does not fall in line with the standards
+    specified by the community. This can occur if the tag have been split into
+    multiple lines (e.g., when pasted in email program with word wrapping
+    enabled).
+
+    See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
+
 
 Comparison style
 ----------------
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 79e759aac543..ac7ae2e4a1d8 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3140,6 +3140,47 @@ sub process {
 			}
 		}
 
+# Check Fixes: styles is correct
+		if (!$in_header_lines && $line =~ /^fixes:?/i) {
+			my $orig_commit = "";
+			my $id = "0123456789ab";
+			my $title = "commit title";
+			my $tag_case = 1;
+			my $tag_space = 1;
+			my $id_length = 1;
+			my $id_case = 1;
+			my $title_has_quotes = 0;
+
+			if ($line =~ /(fixes:?)\s+([0-9a-f]{5,})\s+($balanced_parens)/i) {
+				my $tag = $1;
+				$orig_commit = $2;
+				$title = $3;
+
+				$tag_case = 0 if $tag eq "Fixes:";
+				$tag_space = 0 if ($line =~ /^fixes:? [0-9a-f]{5,} ($balanced_parens)/i);
+
+				$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12}$/i);
+				$id_case = 0 if ($orig_commit !~ /[A-F]/);
+
+				# Always strip leading/trailing parens then double quotes if existing
+				$title = substr($title, 1, -1);
+				if ($title =~ /^".*"$/) {
+					$title = substr($title, 1, -1);
+					$title_has_quotes = 1;
+				}
+			}
+
+			my ($cid, $ctitle) = git_commit_info($orig_commit, $id,
+							     $title);
+
+			if ($ctitle ne $title || $tag_case || $tag_space ||
+			    $id_length || $id_case || !$title_has_quotes) {
+				WARN("BAD_FIXES_TAG",
+				     "Please use correct Fixes: style 'Fixes: <12 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr);
+
+			}
+		}
+
 # Check email subject for common tools that don't need to be mentioned
 		if ($in_header_lines &&
 		    $line =~ /^Subject:.*\b(?:checkpatch|sparse|smatch)\b[^:]/i) {
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v4] checkpatch: warn for non-standard fixes tag style
  2022-09-08 16:44 [PATCH v4] checkpatch: warn for non-standard fixes tag style Niklas Söderlund
@ 2022-09-08 17:49 ` Philippe Schenker
  2022-09-09  7:40   ` niklas.soderlund
  0 siblings, 1 reply; 7+ messages in thread
From: Philippe Schenker @ 2022-09-08 17:49 UTC (permalink / raw)
  To: corbet, niklas.soderlund, dwaipayanray1, linux-kernel, linux-doc,
	joe, lukas.bulwahn, apw
  Cc: sfr, louis.peens, simon.horman, oss-drivers

Hi Niklas, 

Thanks for adding me to cc. I will also add Stephen, as he also sent
some comments on my submission the exact same problem. I'm supportive of
your code as it has the nice advantage of suggesting the right format of
the tag if it might be wrong. However it seems lot of stuff is slightly
duplicated and lots of lines could be left away simplifying it greatly.
I don't want to hold anything up anyway so I'm fine with it, but will
stillleave some comments of things I think should be improved.

Please see my comments inline.


On Thu, 2022-09-08 at 18:44 +0200, Niklas Söderlund wrote:
> Add a warning for fixes tags that does not fall in line with the
> standards specified by the community.
> 
> Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> Reviewed-by: Simon Horman <simon.horman@corigine.com>
> Reviewed-by: Louis Peens <louis.peens@corigine.com>
> ---
> * Changes since v3
> - Add test that title in tag match title of commit referenced by sha1.
> 
> * Changes since v2
> - Change the pattern to match on 'fixes:?' to catch more malformed
>   fixes tags.
> 
> * Changes since v1
> - Update the documentation wording and add mention one cause of the
>   message can be that email program splits the tag over multiple
> lines.
> ---
>  Documentation/dev-tools/checkpatch.rst |  8 +++++
>  scripts/checkpatch.pl                  | 41
> ++++++++++++++++++++++++++
>  2 files changed, 49 insertions(+)
> 
> diff --git a/Documentation/dev-tools/checkpatch.rst
> b/Documentation/dev-tools/checkpatch.rst
> index b52452bc2963..8c8456a3bd18 100644
> --- a/Documentation/dev-tools/checkpatch.rst
> +++ b/Documentation/dev-tools/checkpatch.rst
> @@ -612,6 +612,14 @@ Commit message
>  
>      See:
> https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
>  
> +  **BAD_FIXES_TAG**
> +    The Fixes: tag is malformed or does not fall in line with the
> standards
> +    specified by the community. This can occur if the tag have been
> split into
> +    multiple lines (e.g., when pasted in email program with word
> wrapping
> +    enabled).
> +
> +    See:
> https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
> +
>  
>  Comparison style
>  ----------------
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 79e759aac543..ac7ae2e4a1d8 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -3140,6 +3140,47 @@ sub process {
>                         }
>                 }
>  
> +# Check Fixes: styles is correct
> +               if (!$in_header_lines && $line =~ /^fixes:?/i) {

I would check all lines that start with fixes, even if there is
whitespace in front (and then failing later on...)

if (!$in_header_lines && $line =~ /^\s*fixes:?/i) {


> +                       my $orig_commit = "";
> +                       my $id = "0123456789ab";
> +                       my $title = "commit title";
> +                       my $tag_case = 1;
> +                       my $tag_space = 1;
> +                       my $id_length = 1;
> +                       my $id_case = 1;
> +                       my $title_has_quotes = 0;
> +
> +                       if ($line =~ /(fixes:?)\s+([0-9a-
> f]{5,})\s+($balanced_parens)/i) {

If we check also the fixes: lines that begin with whitespace this would
be a good space to check that we do not want any whitespace in front of
Fixes: tag.

/(\s*fixes:?)\s+([0-9a-f]{5,})\s+($balanced_parens)/i) {

> +                               my $tag = $1;
> +                               $orig_commit = $2;
> +                               $title = $3;
> +
> +                               $tag_case = 0 if $tag eq "Fixes:";
> +                               $tag_space = 0 if ($line =~ /^fixes:?
> [0-9a-f]{5,} ($balanced_parens)/i);
> +
> +                               $id_length = 0 if ($orig_commit =~
> /^[0-9a-f]{12}$/i);

I suggest we borrow the patter that is also used in "Check for git id
commit length and improperly formed commit descriptions". This has the
reason as checking strictly for 12 chars is at the moment right but as
linux grows 13 chars will eventually come.

$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12,40}$/i);
 

> +                               $id_case = 0 if ($orig_commit !~ /[A-
> F]/);
> +
> +                               # Always strip leading/trailing parens
> then double quotes if existing
> +                               $title = substr($title, 1, -1);
> +                               if ($title =~ /^".*"$/) {
> +                                       $title = substr($title, 1, -
> 1);
> +                                       $title_has_quotes = 1;
> +                               }
> +                       }
> +
> +                       my ($cid, $ctitle) =
> git_commit_info($orig_commit, $id,
> +                                                            $title);
> +
> +                       if ($ctitle ne $title || $tag_case ||
> $tag_space ||
> +                           $id_length || $id_case ||
> !$title_has_quotes) {
> +                               WARN("BAD_FIXES_TAG",
> +                                    "Please use correct Fixes: style
> 'Fixes: <12 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid
> (\"$ctitle\")'\n" . $herecurr);
> +
> +                       }
> +               }
> +
>  # Check email subject for common tools that don't need to be
> mentioned
>                 if ($in_header_lines &&
>                     $line =~
> /^Subject:.*\b(?:checkpatch|sparse|smatch)\b[^:]/i) {


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4] checkpatch: warn for non-standard fixes tag style
  2022-09-08 17:49 ` Philippe Schenker
@ 2022-09-09  7:40   ` niklas.soderlund
  2022-09-09 14:39     ` Philippe Schenker
  2022-09-09 17:57     ` Joe Perches
  0 siblings, 2 replies; 7+ messages in thread
From: niklas.soderlund @ 2022-09-09  7:40 UTC (permalink / raw)
  To: Philippe Schenker
  Cc: corbet, dwaipayanray1, linux-kernel, linux-doc, joe,
	lukas.bulwahn, apw, sfr, louis.peens, simon.horman, oss-drivers

Hi Philippe,

On 2022-09-08 17:49:14 +0000, Philippe Schenker wrote:
> Hi Niklas, 
> 
> Thanks for adding me to cc. I will also add Stephen, as he also sent
> some comments on my submission the exact same problem. I'm supportive of
> your code as it has the nice advantage of suggesting the right format of
> the tag if it might be wrong. However it seems lot of stuff is slightly
> duplicated and lots of lines could be left away simplifying it greatly.
> I don't want to hold anything up anyway so I'm fine with it, but will
> stillleave some comments of things I think should be improved.

I agree the LoC could be reduced, I try to mimic the style from the 
"Check for git id commit length and improperly formed commit 
descriptions" check. As there is some overlap maybe one day someone 
cleverer then me can figure out how to share code between them.

> > +# Check Fixes: styles is correct
> > +               if (!$in_header_lines && $line =~ /^fixes:?/i) {
> 
> I would check all lines that start with fixes, even if there is
> whitespace in front (and then failing later on...)
> 
> if (!$in_header_lines && $line =~ /^\s*fixes:?/i) {

Good point, will do so in v5.

> If we check also the fixes: lines that begin with whitespace this 
> would be a good space to check that we do not want any whitespace in 
> front of Fixes: tag.
> 
> /(\s*fixes:?)\s+([0-9a-f]{5,})\s+($balanced_parens)/i) {

Ditto.

> > +                               $id_length = 0 if ($orig_commit =~
> > /^[0-9a-f]{12}$/i);
> 
> I suggest we borrow the patter that is also used in "Check for git id
> commit length and improperly formed commit descriptions". This has the
> reason as checking strictly for 12 chars is at the moment right but as
> linux grows 13 chars will eventually come.
> 
> $id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12,40}$/i);

This one bothers me a bit. I did do that before I sent out v1 but 
changed my mind. The reason being that the documentation asks explicitly 
for 12 chars [1]. I have no preference on keeping it strictly 12 or 
allowing anything in the 12 to 40 range, but i do think the check should 
reflect whats in the documentation. If we change this maybe we also need 
to update the documentation?

One argument to keep it strict is that when Linux the need 13 or more 
characters the documentation will need to be updated and it is natural 
that the script to check that the style documented is followed is 
updated at the same time.

1.  https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes

-- 
Kind Regards,
Niklas Söderlund

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4] checkpatch: warn for non-standard fixes tag style
  2022-09-09  7:40   ` niklas.soderlund
@ 2022-09-09 14:39     ` Philippe Schenker
  2022-09-09 17:57     ` Joe Perches
  1 sibling, 0 replies; 7+ messages in thread
From: Philippe Schenker @ 2022-09-09 14:39 UTC (permalink / raw)
  To: niklas.soderlund
  Cc: corbet, sfr, louis.peens, oss-drivers, dwaipayanray1,
	linux-kernel, joe, linux-doc, apw, lukas.bulwahn, simon.horman

On Fri, 2022-09-09 at 09:40 +0200, niklas.soderlund@corigine.com wrote:
> Hi Philippe,
> 
> On 2022-09-08 17:49:14 +0000, Philippe Schenker wrote:
> > Hi Niklas, 
> > 
> > Thanks for adding me to cc. I will also add Stephen, as he also sent
> > some comments on my submission the exact same problem. I'm
> > supportive of
> > your code as it has the nice advantage of suggesting the right
> > format of
> > the tag if it might be wrong. However it seems lot of stuff is
> > slightly
> > duplicated and lots of lines could be left away simplifying it
> > greatly.
> > I don't want to hold anything up anyway so I'm fine with it, but
> > will
> > stillleave some comments of things I think should be improved.
> 
> I agree the LoC could be reduced, I try to mimic the style from the 
> "Check for git id commit length and improperly formed commit 
> descriptions" check. As there is some overlap maybe one day someone 
> cleverer then me can figure out how to share code between them.
> 
> > > +# Check Fixes: styles is correct
> > > +               if (!$in_header_lines && $line =~ /^fixes:?/i) {
> > 
> > I would check all lines that start with fixes, even if there is
> > whitespace in front (and then failing later on...)
> > 
> > if (!$in_header_lines && $line =~ /^\s*fixes:?/i) {
> 
> Good point, will do so in v5.
> 
> > If we check also the fixes: lines that begin with whitespace this 
> > would be a good space to check that we do not want any whitespace in
> > front of Fixes: tag.
> > 
> > /(\s*fixes:?)\s+([0-9a-f]{5,})\s+($balanced_parens)/i) {
> 
> Ditto.
> 
> > > +                               $id_length = 0 if ($orig_commit =~
> > > /^[0-9a-f]{12}$/i);
> > 
> > I suggest we borrow the patter that is also used in "Check for git
> > id
> > commit length and improperly formed commit descriptions". This has
> > the
> > reason as checking strictly for 12 chars is at the moment right but
> > as
> > linux grows 13 chars will eventually come.
> > 
> > $id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12,40}$/i);
> 
> This one bothers me a bit. I did do that before I sent out v1 but 
> changed my mind. The reason being that the documentation asks
> explicitly 
> for 12 chars [1]. I have no preference on keeping it strictly 12 or 
> allowing anything in the 12 to 40 range, but i do think the check
> should 
> reflect whats in the documentation. If we change this maybe we also
> need 
> to update the documentation?
> 
> One argument to keep it strict is that when Linux the need 13 or more 
> characters the documentation will need to be updated and it is natural
> that the script to check that the style documented is followed is 
> updated at the same time.

In the end I'm fine with both variants. Up to you, I just would not have
made the whole check too strict as until now there was no check at all.
But the fact we are now even checking one space character it is also
fine for me that we limit this to strictly 12 characters.

> 
> 1. 
> https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4] checkpatch: warn for non-standard fixes tag style
  2022-09-09  7:40   ` niklas.soderlund
  2022-09-09 14:39     ` Philippe Schenker
@ 2022-09-09 17:57     ` Joe Perches
  2022-09-10  8:48       ` niklas.soderlund
  2022-09-11  2:47       ` Joe Perches
  1 sibling, 2 replies; 7+ messages in thread
From: Joe Perches @ 2022-09-09 17:57 UTC (permalink / raw)
  To: niklas.soderlund, Philippe Schenker
  Cc: corbet, dwaipayanray1, linux-kernel, linux-doc, lukas.bulwahn,
	apw, sfr, louis.peens, simon.horman, oss-drivers

On Fri, 2022-09-09 at 09:40 +0200, niklas.soderlund@corigine.com wrote:
> On 2022-09-08 17:49:14 +0000, Philippe Schenker wrote:
> > Thanks for adding me to cc. I will also add Stephen, as he also sent
> > some comments on my submission the exact same problem. I'm supportive of
> > your code as it has the nice advantage of suggesting the right format of
> > the tag if it might be wrong. However it seems lot of stuff is slightly
> > duplicated and lots of lines could be left away simplifying it greatly.

It's not very possible to reduce the line count here.
I mentioned the same thing in my first reply.

> > > +# Check Fixes: styles is correct
> > > +               if (!$in_header_lines && $line =~ /^fixes:?/i) {
> > 
> > I would check all lines that start with fixes, even if there is
> > whitespace in front (and then failing later on...)
> > 
> > if (!$in_header_lines && $line =~ /^\s*fixes:?/i) {

I think that's a poor idea.

You should really review git history for lines that start with fixes
and look at the number of false positives that would give.

Try this grep:

$ git log -100000 --no-merges --grep="^\s*fixes" -i --format=email -P | \
  grep -P -i "^\s*fixes)" | \
  grep -P -v "^Fixes: [0-9a-f]{12,}'
[...]

That is a greater than 10% false positive rate.

I think it's better to make sure that there is a likely SHA1 of some
minimum length after the fixes line.

And a relatively common defect is to have the word "commit" after fixes.

e.g.:

Fixes commit 554c0a3abf216 ("staging: Add rtl8723bs sdio wifi driver").
Fixes commit a2c60d42d97c ("Add files for new driver - part 16").

So maybe:

	if (!$in_header_lines &&
	    $line =~ /^\s*fixes:?\s*(?:commit\s*)?[0-9a-f]{5,}\b/i)




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4] checkpatch: warn for non-standard fixes tag style
  2022-09-09 17:57     ` Joe Perches
@ 2022-09-10  8:48       ` niklas.soderlund
  2022-09-11  2:47       ` Joe Perches
  1 sibling, 0 replies; 7+ messages in thread
From: niklas.soderlund @ 2022-09-10  8:48 UTC (permalink / raw)
  To: Joe Perches
  Cc: Philippe Schenker, corbet, dwaipayanray1, linux-kernel,
	linux-doc, lukas.bulwahn, apw, sfr, louis.peens, simon.horman,
	oss-drivers

Hi Joe,

Thanks for your feedback.

On 2022-09-09 10:57:10 -0700, Joe Perches wrote:
> I think it's better to make sure that there is a likely SHA1 of some
> minimum length after the fixes line.
> 
> And a relatively common defect is to have the word "commit" after fixes.
> 
> e.g.:
> 
> Fixes commit 554c0a3abf216 ("staging: Add rtl8723bs sdio wifi driver").
> Fixes commit a2c60d42d97c ("Add files for new driver - part 16").
> 
> So maybe:
> 
> 	if (!$in_header_lines &&
> 	    $line =~ /^\s*fixes:?\s*(?:commit\s*)?[0-9a-f]{5,}\b/i)

I'm sorry I just sent out v6 and missed this comment, which I think is a 
great idea. I will let v6 stew for a day or two in case there is other 
feedback and then spin a v7 with this addressed.

-- 
Kind Regards,
Niklas Söderlund

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4] checkpatch: warn for non-standard fixes tag style
  2022-09-09 17:57     ` Joe Perches
  2022-09-10  8:48       ` niklas.soderlund
@ 2022-09-11  2:47       ` Joe Perches
  1 sibling, 0 replies; 7+ messages in thread
From: Joe Perches @ 2022-09-11  2:47 UTC (permalink / raw)
  To: niklas.soderlund, Philippe Schenker
  Cc: corbet, dwaipayanray1, linux-kernel, linux-doc, lukas.bulwahn,
	apw, sfr, louis.peens, simon.horman, oss-drivers

On Fri, 2022-09-09 at 10:57 -0700, Joe Perches wrote:
> On Fri, 2022-09-09 at 09:40 +0200, niklas.soderlund@corigine.com wrote:
> > On 2022-09-08 17:49:14 +0000, Philippe Schenker wrote:
[]
> > > I would check all lines that start with fixes, even if there is
> > > whitespace in front (and then failing later on...)
> > > 
> > > if (!$in_header_lines && $line =~ /^\s*fixes:?/i) {
> 
> I think that's a poor idea.
> 
> You should really review git history for lines that start with fixes
> and look at the number of false positives that would give.
> 
> Try this grep:
> 
> $ git log -100000 --no-merges --grep="^\s*fixes" -i --format=email -P | \
>   grep -P -i "^\s*fixes)" | \
>   grep -P -v "^Fixes: [0-9a-f]{12,}'
> [...]
> 
> That is a greater than 10% false positive rate.

One day I'll be better at typing grep commands in the editor...

Try:

$ git log -100000 --no-merges --grep="^\s*fixes" -i --format=email -P | \
  grep -P -i "^\s*fixes" | \
  grep -P -v "^Fixes: [0-9a-f]{12,}"


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-09-11  2:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-08 16:44 [PATCH v4] checkpatch: warn for non-standard fixes tag style Niklas Söderlund
2022-09-08 17:49 ` Philippe Schenker
2022-09-09  7:40   ` niklas.soderlund
2022-09-09 14:39     ` Philippe Schenker
2022-09-09 17:57     ` Joe Perches
2022-09-10  8:48       ` niklas.soderlund
2022-09-11  2:47       ` Joe Perches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).