linux-kernel.vger.kernel.org archive mirror
* [PATCH] checkpatch: Add test for commit id formatting style in commit log
       [not found]   ` <20140702131534.c613f55f79519b3862f79e40@linux-foundation.org>
@ 2014-07-02 22:00     ` Joe Perches
  2014-08-10 21:08       ` Geert Uytterhoeven
  0 siblings, 1 reply; 6+ messages in thread
From: Joe Perches @ 2014-07-02 22:00 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML

Commit logs have various forms of commit id references.

Try to standardize on a 12 character long lower case
commit id followed by the quoted subject line in
parentheses.

ie: commit 0123456789ab ("commit description")

If git and a git tree exist, look up the commit id and
emit the appropriate line as part of the message.

Signed-off-by: Joe Perches <joe@perches.com>
Requested-by: Andrew Morton <akpm@linux-foundation.org>
---
 scripts/checkpatch.pl | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 6c7cbaf..9b0734b 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -550,6 +550,34 @@ sub seed_camelcase_includes {
 	}
 }
 
+sub git_commit_info {
+	my ($commit, $id, $desc) = @_;
+
+	return ($id, $desc) if ((which("git") eq "") || !(-e ".git"));
+
+	my $output = `git log --no-color --format='%H %s' -1 $commit 2>&1`;
+	$output =~ s/^\s*//gm;
+	my @lines = split("\n", $output);
+
+	if ($lines[0] =~ /^error: short SHA1 $commit is ambiguous\./) {
+# Maybe one day convert this block of bash into something that returns
+# all matching commit ids, but it's very slow...
+#
+#		echo "checking commits $1..."
+#		git rev-list --remotes | grep -i "^$1" |
+#		while read line ; do
+#		    git log --format='%H %s' -1 $line |
+#		    echo "commit $(cut -c 1-12,41-)"
+#		done
+	} elsif ($lines[0] =~ /^fatal: ambiguous argument '$commit': unknown revision or path not in the working tree\./) {
+	} else {
+		$id = substr($lines[0], 0, 12);
+		$desc = substr($lines[0], 41);
+	}
+
+	return ($id, $desc);
+}
+
 $chk_signoff = 0 if ($file);
 
 my @rawlines = ();
@@ -674,6 +702,18 @@ sub format_email {
 	return $formatted_email;
 }
 
+sub which {
+    my ($bin) = @_;
+
+    foreach my $path (split(/:/, $ENV{PATH})) {
+	if (-e "$path/$bin") {
+	    return "$path/$bin";
+	}
+    }
+
+    return "";
+}
+
 sub which_conf {
 	my ($conf) = @_;
 
@@ -1958,6 +1998,20 @@ sub process {
 			      "Remove Gerrit Change-Id's before submitting upstream.\n" . $herecurr);
 		}
 
+# Check for improperly formed commit descriptions
+		if ($in_commit_log &&
+		    $line =~ /\bcommit\s+[0-9a-f]{5,}/i &&
+		    $line !~ /\b[Cc]ommit [0-9a-f]{12,16} \("/) {
+			$line =~ /\b(c)ommit\s+([0-9a-f]{5,})/i;
+			my $init_char = $1;
+			my $orig_commit = lc($2);
+			my $id = '0123456789ab';
+			my $desc = 'commit description';
+		        ($id, $desc) = git_commit_info($orig_commit, $id, $desc);
+			ERROR("GIT_COMMIT_ID",
+			      "Please use 12 to 16 chars for the git commit ID like: '${init_char}ommit $id (\"$desc\")'\n" . $herecurr);
+		}
+
 # Check for wrappage within a valid hunk of the file
 		if ($realcnt != 0 && $line !~ m{^(?:\+|-| |\\ No newline|$)}) {
 			ERROR("CORRUPTED_PATCH",
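
As a rough illustration of what git_commit_info() does with the
"%H %s" output, here is a minimal shell sketch. The function name
format_commit_ref and the placeholder hash are made up for this
example and are not part of checkpatch; it only mirrors the two
substr() calls in the patch (the first 12 characters for the id, and
everything after the 40 character hash plus one space for the subject).

```sh
#!/bin/sh
# Hypothetical helper, not part of checkpatch: turn one line of
# `git log --format='%H %s' -1 <rev>` output into the suggested reference.
format_commit_ref() {
    line=$1
    # First 12 hex digits of the 40 character SHA-1.
    id=$(printf '%s\n' "$line" | cut -c 1-12)
    # The subject starts at column 42: 40 hash characters plus one space.
    desc=$(printf '%s\n' "$line" | cut -c 42-)
    printf 'commit %s ("%s")\n' "$id" "$desc"
}

# Placeholder hash, not a real commit.
format_commit_ref '0123456789abcdef0123456789abcdef01234567 commit description'
# prints: commit 0123456789ab ("commit description")
```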




* Re: [PATCH] checkpatch: Add test for commit id formatting style in commit log
  2014-07-02 22:00     ` [PATCH] checkpatch: Add test for commit id formatting style in commit log Joe Perches
@ 2014-08-10 21:08       ` Geert Uytterhoeven
  2014-08-10 21:28         ` Joe Perches
  0 siblings, 1 reply; 6+ messages in thread
From: Geert Uytterhoeven @ 2014-08-10 21:08 UTC (permalink / raw)
  To: Joe Perches; +Cc: Andrew Morton, LKML, Git Mailing List

Hi Joe,

On Thu, Jul 3, 2014 at 12:00 AM, Joe Perches <joe@perches.com> wrote:
> Commit logs have various forms of commit id references.
>
> Try to standardize on a 12 character long lower case
> commit id followed by the quoted subject line in
> parentheses.
>
> ie: commit 0123456789ab ("commit description")

Now this is in mainline, checkpatch starts complaining about my "too long"
(40 chars) commit IDs in commit messages :-(

40 chars may be too long (but it's quick to copy-and-paste, as "git show"
shows that by default), but 12 sounds a bit short, as that's only 48 bits.

According to the Birthday Paradox (en.wikipedia.org/wiki/Birthday_problem),
there's a probability of 50% of a collision if you use 48-bit IDs in a
repository with ca. 16 million (2^24) objects. A Linux kernel repository
counts ca. 4 million objects, so we're getting close...

So soon we'll get "error: short SHA1 is ambiguous".
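
To put rough numbers on this, the usual birthday approximation
p = 1 - exp(-n^2 / 2^(bits+1)) can be evaluated with awk. This is only
a sketch: collision_probability is a made-up name, and the formula
estimates the chance that any two of n uniformly distributed IDs share
a prefix, which is not the same as the chance that one particular
abbreviated ID is ambiguous.

```sh
#!/bin/sh
# Hypothetical helper: birthday-bound estimate of the probability that
# at least two of N objects share the same BITS-bit abbreviated ID.
collision_probability() {
    awk -v n="$1" -v bits="$2" \
        'BEGIN { printf "%.4f\n", 1 - exp(-(n * n) / (2 * 2 ^ bits)) }'
}

collision_probability 16777216 48   # 2^24 objects, 12 hex digits: prints 0.3935
collision_probability 4000000 48    # ca. 4 million objects:       prints 0.0280
```

Under this approximation, 2^24 objects give about 39%, and the 50%
mark falls a little higher, at roughly 1.18 * 2^24 objects.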

BTW, is there actually an easy way to make "git show" show all options for
an ambiguous SHA1?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH] checkpatch: Add test for commit id formatting style in commit log
  2014-08-10 21:08       ` Geert Uytterhoeven
@ 2014-08-10 21:28         ` Joe Perches
  2014-08-10 21:35           ` Andrew Morton
  0 siblings, 1 reply; 6+ messages in thread
From: Joe Perches @ 2014-08-10 21:28 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Andrew Morton, LKML, Git Mailing List

On Sun, 2014-08-10 at 23:08 +0200, Geert Uytterhoeven wrote:
> Hi Joe,

Hi Geert.

> On Thu, Jul 3, 2014 at 12:00 AM, Joe Perches <joe@perches.com> wrote:
> > Commit logs have various forms of commit id references.
> >
> > Try to standardize on a 12 character long lower case
> > commit id followed by the quoted subject line in
> > parentheses.
> >
> > ie: commit 0123456789ab ("commit description")
> 
> Now this is in mainline, checkpatch starts complaining about my "too long"
> (40 chars) commit IDs in commit messages :-(
> 
> 40 chars may be too long (but it's quick to copy-and-paste, as "git show"
> shows that by default), but 12 sounds a bit short, as that's only 48 bits.

Right now, this test allows 12 to 16 byte length commit ids
without emitting a warning.

Andrew wanted this test, I don't care how long the commit id
is in the commit log.

> According to the Birthday Paradox (en.wikipedia.org/wiki/Birthday_problem),
> there's a probability of 50% of a collision if you use 48-bit IDs in a
> repository with ca. 16 million (2^24) objects. A Linux kernel repository
> counts ca. 4 million objects, so we're getting close...
> 
> So soon we'll get "error: short SHA1 is ambiguous".
> 
> BTW, is there actually an easy way to make "git show" show all options for
> an ambiguous SHA1?

Not so far as I know, but I'm nothing like a git expert.

The script I used before adding this to checkpatch was:

$ cat format_commit.sh 
#!/bin/bash

regex1="^error: short SHA1 $1 is ambiguous\."
regex2="fatal: ambiguous argument '$1': unknown revision or path not in the working tree\."

tmp=$(mktemp --tmpdir format_commit.XXXXXXXXXXXXX)

git log --format='%H ("%s")' -1 $1 > $tmp 2>&1

read line < $tmp

rm -f $tmp

if [[ $line =~ $regex1 ]] ; then
    echo "checking commits $1..."
    git rev-list --remotes | grep -i "^$1" |
    while read line ; do
        git log --format='%H ("%s")' -1 $line | 
        echo "commit $(cut -c 1-12,41-)"
    done
elif [[ $line =~ $regex2 ]] ; then
    echo "No matching commit"
    exit 1
else
    echo "commit $(echo $line | cut -c1-12,41-)"
fi

exit 0
$

so that running "$ format_commit.sh 1234" walks
_all_ the commit references with git rev-list and
greps that output for matches, but it is
darn slow...

$ time ./format_commit.sh 1234
checking commits 1234...
commit 1234351cba95 ("xfs: introduce xlog_copy_iovec")
commit 1234471e2d11 ("perf header: Fix numa topology printing")
commit 1234f4bada54 ("hwrng: Kconfig: remove dependency for atmel-rng driver")
commit 12340313cf94 ("MAINTAINERS: add new cgroup list to CC notice")
commit 12346037a718 ("UBIFS: dump more in the lprops debugging check")
commit 12342c475f5d ("iwlwifi: proper monitor support")
commit 1234010684bb ("Add notation that the Asus W5F laptop has a short cable instead of 80-wire.")
commit 123411f2d0da ("[CPUFREQ] dprintf format fixes in cpufreq/speedstep-centrino.c")

real	0m24.535s
user	0m21.668s
sys	0m5.332s
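
A possibly faster alternative, assuming a git recent enough to have
"git rev-parse --disambiguate=<prefix>" (which lists every object whose
name starts with the given prefix of at least four hex digits), is
sketched below. The function name list_commits_by_prefix is invented
for this example, and it has not been timed against the script above.

```sh
#!/bin/sh
# Hypothetical helper: list every commit whose id starts with PREFIX,
# formatted as: commit <12 chars> ("subject").
# Must be run inside a git work tree.
list_commits_by_prefix() {
    git rev-parse --disambiguate="$1" |
    while read -r sha; do
        # --disambiguate also reports trees, blobs and tags; keep commits only.
        [ "$(git cat-file -t "$sha")" = commit ] || continue
        git log -1 --abbrev=12 --format='commit %h ("%s")' "$sha"
    done
}

# Usage, from the top of a kernel tree:
#   list_commits_by_prefix 1234
```

Unlike the rev-list approach, this asks git's object lookup for the
matches directly instead of grepping the ids of every reachable commit.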




* Re: [PATCH] checkpatch: Add test for commit id formatting style in commit log
  2014-08-10 21:28         ` Joe Perches
@ 2014-08-10 21:35           ` Andrew Morton
  2014-08-10 21:41             ` Joe Perches
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2014-08-10 21:35 UTC (permalink / raw)
  To: Joe Perches; +Cc: Geert Uytterhoeven, LKML, Git Mailing List

On Sun, 10 Aug 2014 14:28:01 -0700 Joe Perches <joe@perches.com> wrote:

> > On Thu, Jul 3, 2014 at 12:00 AM, Joe Perches <joe@perches.com> wrote:
> > > Commit logs have various forms of commit id references.
> > >
> > > Try to standardize on a 12 character long lower case
> > > commit id followed by the quoted subject line in
> > > parentheses.
> > >
> > > ie: commit 0123456789ab ("commit description")
> > 
> > Now this is in mainline, checkpatch starts complaining about my "too long"
> > (40 chars) commit IDs in commit messages :-(
> > 
> > 40 chars may be too long (but it's quick to copy-and-paste, as "git show"
> > shows that by default), but 12 sounds a bit short, as that's only 48 bits.
> 
> Right now, this test allows 12 to 16 byte length commit ids
> without emitting a warning.
> 
> Andrew wanted this test, I don't care how long the commit id
> is in the commit log.

Well, I mainly wanted to stop having to add "commit description" when
people forget it.  The length check was perhaps a bit anal.  How about
we make it "12 or more"?




* Re: [PATCH] checkpatch: Add test for commit id formatting style in commit log
  2014-08-10 21:35           ` Andrew Morton
@ 2014-08-10 21:41             ` Joe Perches
  0 siblings, 0 replies; 6+ messages in thread
From: Joe Perches @ 2014-08-10 21:41 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Geert Uytterhoeven, LKML, Git Mailing List

On Sun, 2014-08-10 at 14:35 -0700, Andrew Morton wrote:
> On Sun, 10 Aug 2014 14:28:01 -0700 Joe Perches <joe@perches.com> wrote:
> > > On Thu, Jul 3, 2014 at 12:00 AM, Joe Perches <joe@perches.com> wrote:
> > > > Commit logs have various forms of commit id references.
> > > >
> > > > Try to standardize on a 12 character long lower case
> > > > commit id followed by the quoted subject line in
> > > > parentheses.
> > > >
> > > > ie: commit 0123456789ab ("commit description")
> > > 
> > > Now this is in mainline, checkpatch starts complaining about my "too long"
> > > (40 chars) commit IDs in commit messages :-(
> > > 
> > > 40 chars may be too long (but it's quick to copy-and-paste, as "git show"
> > > shows that by default), but 12 sounds a bit short, as that's only 48 bits.
> > 
> > Right now, this test allows 12 to 16 byte length commit ids
> > without emitting a warning.
> > 
> > Andrew wanted this test, I don't care how long the commit id
> > is in the commit log.
> 
> Well, I mainly wanted to stop having to add "commit description" when
> people forget it.  The length check was perhaps a bit anal.  How about
> we make it "12 or more"?

Fine by me, just change the 16 to 40
---
 scripts/checkpatch.pl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 31a731e..b385bcb 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2133,7 +2133,7 @@ sub process {
 # Check for improperly formed commit descriptions
 		if ($in_commit_log &&
 		    $line =~ /\bcommit\s+[0-9a-f]{5,}/i &&
-		    $line !~ /\b[Cc]ommit [0-9a-f]{12,16} \("/) {
+		    $line !~ /\b[Cc]ommit [0-9a-f]{12,40} \("/) {
 			$line =~ /\b(c)ommit\s+([0-9a-f]{5,})/i;
 			my $init_char = $1;
 			my $orig_commit = lc($2);
@@ -2141,7 +2141,7 @@ sub process {
 			my $desc = 'commit description';
 		        ($id, $desc) = git_commit_info($orig_commit, $id, $desc);
 			ERROR("GIT_COMMIT_ID",
-			      "Please use 12 to 16 chars for the git commit ID like: '${init_char}ommit $id (\"$desc\")'\n" . $herecurr);
+			      "Please use 12 or more chars for the git commit ID like: '${init_char}ommit $id (\"$desc\")'\n" . $herecurr);
 		}
 
 # Check for added, moved or deleted files
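
For readers who want to try the relaxed rule outside of checkpatch,
the two regex tests can be approximated in shell as below. This is a
sketch, not the checkpatch code: needs_commit_id_fix is a made-up name,
grep -E has no portable equivalent of Perl's \b so the word-boundary
anchors are dropped, and the $in_commit_log state is not modelled.

```sh
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when LINE mentions a commit id
# but not in the 'commit <12 to 40 hex digits> ("subject' form.
needs_commit_id_fix() {
    # No "commit <5 or more hex digits>" at all: nothing to complain about.
    printf '%s\n' "$1" | grep -Eiq 'commit[[:space:]]+[0-9a-f]{5,}' || return 1
    # Already in the preferred form: nothing to complain about.
    printf '%s\n' "$1" | grep -Eq '[Cc]ommit [0-9a-f]{12,40} \("' && return 1
    return 0
}

# Placeholder ids, not real commits.
needs_commit_id_fix 'fixed by commit 0123456' && echo "too short"
needs_commit_id_fix 'fixed by commit 0123456789ab' && echo "no description"
needs_commit_id_fix 'commit 0123456789abcdef0123456789abcdef01234567 ("x")' ||
    echo "40 chars accepted"
```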




* Re: [PATCH] checkpatch: Add test for commit id formatting style in commit log
@ 2014-08-27 17:11 Joe Lawrence
  0 siblings, 0 replies; 6+ messages in thread
From: Joe Lawrence @ 2014-08-27 17:11 UTC (permalink / raw)
  To: linux-kernel; +Cc: Joe Perches, Andrew Morton

> Try to standardize on a 12 character long lower case
> commit id followed by the quoted subject line in
> parentheses.
>
> ie: commit 0123456789ab ("commit description")

Hi Joe / Andrew,

I don't know if checkpatch is built for multiline regex, but I noticed
that this new check complained about the following commit message:

...
The PCI register read checking introduced in commit fe1b806f4f71
("qla2xxx: Disable adapter when we encounter a PCI disconnect") is
...

where 'commit hash ("message")' is split on multiple lines.

Dunno if that behavior is easily fixed, but thought I would report it.
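
One way to see the effect outside of checkpatch is to join the lines
before matching, as in the sketch below. It is only an illustration
of the idea: has_wellformed_commit_ref is a made-up name, and nothing
here claims to be how checkpatch does or should handle wrapped text.

```sh
#!/bin/sh
# Hypothetical helper: read commit log text on stdin and succeed if it
# contains 'commit <12 to 40 hex digits> ("subject")', even when the
# reference is wrapped across lines.
has_wellformed_commit_ref() {
    # Replace newlines with spaces so a wrapped reference becomes one line.
    tr '\n' ' ' | grep -Eq '[Cc]ommit [0-9a-f]{12,40} \("[^"]+"\)'
}

printf '%s\n' \
    'The PCI register read checking introduced in commit fe1b806f4f71' \
    '("qla2xxx: Disable adapter when we encounter a PCI disconnect") is' |
    has_wellformed_commit_ref && echo "reference found"
```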

Regards,

-- Joe

