linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joe Perches <joe@perches.com>
To: Aditya Srivastava <yashsri421@gmail.com>
Cc: lukas.bulwahn@gmail.com,
	linux-kernel-mentees@lists.linuxfoundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4] checkpatch: add fix and improve warning msg for non-standard signature
Date: Sat, 28 Nov 2020 07:40:33 -0800	[thread overview]
Message-ID: <2f5c625f5f342042ab55902fe4b808bff8dd297b.camel@perches.com> (raw)
In-Reply-To: <20201128130501.23448-1-yashsri421@gmail.com>

On Sat, 2020-11-28 at 18:35 +0530, Aditya Srivastava wrote:
> Currently checkpatch warns for BAD_SIGN_OFF on non-standard signature
> styles.
> 
> A large number of these warnings occur because of typo mistakes in
> signature tags. An evaluation over v4.13..v5.8 showed that out of 539
> warnings due to non-standard signatures, 87 are due to typo mistakes.
> 
> Following are the standard signature tags which are often incorrectly
> used, along with their individual counts of incorrect use (over
> v4.13..v5.8):
> 
>  Reviewed-by: 42
>  Signed-off-by: 25
>  Reported-by: 6
>  Acked-by: 4
>  Tested-by: 4
>  Suggested-by: 4
> 
> Provide a fix by calculating levenshtein distance for the signature tag
> with all the standard signatures and suggest a fix with a signature, whose
> edit distance is less than or equal to 2 with the misspelled signature.
> 
> Out of the 86 mispelled signatures fixed with this approach, 85 were
> found to be good corrections and 1 was bad correction.
> 
> Following was found to be a bad correction:
>  Tweeted-by (count: 1) => Tested-by
> 
> Signed-off-by: Aditya Srivastava <yashsri421@gmail.com>
> ---
> changes in v2: modify commit message: replace specific example with overall evaluation, minor changes
> 
> changes in v3: summarize commit message
> 
> changes in v4: improve commit message; remove signature suggestions of small length (ie 'cc' and 'to')

Seems OKish but this needs style modifications as there are
several whitespace uses that don't match the typical forms
and perhaps some new function naming could be improved.

> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -506,6 +506,77 @@ our $signature_tags = qr{(?xi:
>  	Cc:
>  )};
>  
> 
> +sub get_min {

probably a poor name choice.  Maybe edit_distance_min

> +	my (@arr) = @_;
> +	my $len = scalar @arr;
> +	if((scalar @arr) < 1) {

space after if

> +		# if underflow, return
> +		return;
> +	}
> +	my $min = $arr[0];
> +	for my $i (0 .. ($len-1)) {
> +		if ($arr[$i] < $min) {
> +			$min = $arr[$i];
> +		}
> +	}
> +	return $min;
> +}
> +
> +sub get_edit_distance {
> +	my ($str1, $str2) = @_;

maybe lc($str) =~ s/-//g; here instead of the code in the caller

> +	my $len1 = length($str1);
> +	my $len2 = length($str2);
> +	# two dimensional array storing minimum edit distance
> +	my @distance;
> +	for my $i (0 .. $len1) {
> +		for my $j (0 .. $len2) {
> +			if ($i == 0) {
> +				$distance[$i][$j] = $j;
> +			}
> +			elsif ($j == 0) {

} elsif {

> +				$distance[$i][$j] = $i;
> +			}
> +			elsif (substr($str1, $i-1, 1) eq substr($str2, $j-1, 1)) {
> +				$distance[$i][$j] = $distance[$i - 1][$j - 1];
> +			}
> +			else {

} else {

> +				my $dist1 = $distance[$i][$j - 1]; #insert distance
> +				my $dist2 = $distance[$i - 1][$j]; # remove
> +				my $dist3 = $distance[$i - 1][$j - 1]; #replace
> +				$distance[$i][$j] = 1 + get_min($dist1, $dist2, $dist3);
> +			}
> +		}
> +	}
> +	return $distance[$len1][$len2];
> +}
> +
> +sub get_standard_signature {

find_standard_signature ?

> +	my ($sign_off) = @_;
> +	$sign_off = lc($sign_off);
> +	$sign_off =~ s/\-//g; # to match with formed hash

why not strip the dashes in get_edit_distance instead
of using this weird dance with dashes here?

> +	my @standard_signature_tags = (
> +		'signed-off-by:', 'co-developed-by:', 'acked-by:', 'tested-by:',
> +		'reviewed-by:', 'reported-by:', 'suggested-by:'
> +	);
> +	# setting default values
> +	my $standard_signature = 'signed-off-by';

why is does this need to be given a value?

> +	my $min_edit_distance = 20;
> +	my $edit_distance;
> +	foreach (@standard_signature_tags) {
> +		my $signature = $_;
> +		$_ =~ s/\-//g;

and this dancing here

> +		$edit_distance = get_edit_distance($sign_off, $_);
> +		if ($edit_distance < $min_edit_distance) {
> +			$min_edit_distance = $edit_distance;
> +			$standard_signature = $signature;
> +		}
> +	}
> +        if($min_edit_distance<=2) {

bad indentation, if (, spaces around test <=

> +		return ucfirst($standard_signature);
> +        }

bad indentation

> +	return "";
> +}
> +
>  our @typeListMisordered = (
>  	qr{char\s+(?:un)?signed},
>  	qr{int\s+(?:(?:un)?signed\s+)?short\s},
> @@ -2773,8 +2844,18 @@ sub process {
>  			my $ucfirst_sign_off = ucfirst(lc($sign_off));
>  
> 
>  			if ($sign_off !~ /$signature_tags/) {
> -				WARN("BAD_SIGN_OFF",
> -				     "Non-standard signature: $sign_off\n" . $herecurr);
> +				my $suggested_signature = get_standard_signature($sign_off);
> +				if ($suggested_signature eq "") {
> +					WARN("BAD_SIGN_OFF",
> +					"Non-standard signature: $sign_off\n" . $herecurr);

bad alignment

> +				}
> +				else {

} else {

> +					if (WARN("BAD_SIGN_OFF",
> +						 "Non-standard signature: $sign_off. Please use '$suggested_signature' instead\n" . $herecurr) &&

"perhaps" rather than "please use" or "likely typo of"

> +					    $fix) {
> +						$fixed[$fixlinenr] =~ s/$sign_off/$suggested_signature/;
> +					}
> +				}
>  			}
>  			if (defined $space_before && $space_before ne "") {
>  				if (WARN("BAD_SIGN_OFF",



  reply	other threads:[~2020-11-28 22:09 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-28 13:05 [PATCH v4] checkpatch: add fix and improve warning msg for non-standard signature Aditya Srivastava
2020-11-28 15:40 ` Joe Perches [this message]
2020-11-28 18:35   ` [PATCH v5] " Aditya Srivastava
2020-11-28 19:12     ` Joe Perches
2020-11-28 20:43       ` [PATCH v6] " Aditya Srivastava
2020-11-28 20:57         ` Joe Perches
  -- strict thread matches above, loose matches on Subject: below --
2020-11-23 15:16 [PATCH v3] checkpatch: add fix and improve warning msg for Non-standard signature Lukas Bulwahn
2020-11-23 17:24 ` [PATCH v4] " Aditya Srivastava
2020-11-23 17:33   ` Joe Perches
2020-11-24  3:12     ` Aditya
2020-11-24  6:54     ` Lukas Bulwahn
2020-11-24  7:26       ` Joe Perches
2020-11-24  7:44         ` Lukas Bulwahn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2f5c625f5f342042ab55902fe4b808bff8dd297b.camel@perches.com \
    --to=joe@perches.com \
    --cc=linux-kernel-mentees@lists.linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lukas.bulwahn@gmail.com \
    --cc=yashsri421@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).