linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] get_maintainer.pl: append reason for cc to the name by default
@ 2010-09-10  9:33 florian
  2010-09-10  9:42 ` Joe Perches
  2010-09-26 18:52 ` RFC: " Joe Perches
  0 siblings, 2 replies; 62+ messages in thread
From: florian @ 2010-09-10  9:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Florian Mickler, Andrew Morton (role:commit_signer),
	Joe Perches (role:commit_signer),
	Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

The script get_maintainer.pl is a very useful tool for deploying changes
made to the kernel. Among others it searches not only the MAINTAINERS
file but also the git history for people to send patches to.

This can be unexpected for the receiving side and can and does provoke
sometimes anger because it is not easy to determine if the sender did
explicitly put the receiving side on the cc list, or if they just
trolled the tree. The receiving side, if not used to be cc'd on many
things will check the patch, spend time investigating what the heck they
were cc'd just to realize, that there was no special reason.

As get_maintainer.pl is frequently used by kernel newcommers who _can_
not know whom to cc by themself, this anger then comes as a surprise for them
and definitely puts them in an awkward position.

By appending a  a note of the reason for the cc in the name, the reason
becomes clear and the receiving side is relieved from feeling obliged to
check the patch  while the sending side has a chance to adapt the
cc'list to their liking.

But the most useful aspect of this is, IMHO, that it makes it transparent who
just used get_maintainer.pl as a shortcut to increase his own
patch-throughput or who really put an effort in finding or editing the
cc'list to their likings.

Signed-off-by: Florian Mickler <florian@mickler.org>
---
 scripts/get_maintainer.pl |   77 ++++++++++++++++++++++++--------------------
 1 files changed, 42 insertions(+), 35 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index b228198..0cdc66a 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -255,6 +255,9 @@ close($maint);
 
 my %mailmap;
 
+#
+# read in a name->address lookup table
+#
 if ($email_remove_duplicates) {
     open(my $mailmap, '<', "${lk_path}.mailmap")
 	or warn "$P: Can't open .mailmap: $!\n";
@@ -265,10 +268,11 @@ if ($email_remove_duplicates) {
 	next if ($line =~ m/^\s*$/);
 
 	my ($name, $address) = parse_email($line);
-	$line = format_email($name, $address, $email_usename);
+	$line = format_email($name, $address, "", $email_usename);
 
 	next if ($line =~ m/^\s*$/);
 
+	#append address to name 
 	if (exists($mailmap{$name})) {
 	    my $obj = $mailmap{$name};
 	    push(@$obj, $address);
@@ -457,11 +461,11 @@ if ($email) {
     foreach my $chief (@penguin_chief) {
 	if ($chief =~ m/^(.*):(.*)/) {
 	    my $email_address;
-
-	    $email_address = format_email($1, $2, $email_usename);
+	    $email_address = format_email($1, $2, 'chief penguin', $email_usename);
 	    if ($email_git_penguin_chiefs) {
 		push(@email_to, [$email_address, 'chief penguin']);
 	    } else {
+		$email_address = format_email($1, $2, "", 0);
 		@email_to = grep($_->[0] !~ /${email_address}/, @email_to);
 	    }
 	}
@@ -469,10 +473,9 @@ if ($email) {
 
     foreach my $email (@file_emails) {
 	my ($name, $address) = parse_email($email);
-
-	my $tmp_email = format_email($name, $address, $email_usename);
-	push_email_address($tmp_email, '');
-	add_role($tmp_email, 'in file');
+	my $tmp_email = format_email($name, $address, "", $email_usename);
+	push_email_address($tmp_email, 'explicit cc', 'in file');
+	add_role($tmp_email, 'explicit cc', 'in file');
     }
 }
 
@@ -660,10 +663,13 @@ sub parse_email {
 }
 
 sub format_email {
-    my ($name, $address, $usename) = @_;
-
+    my ($name, $address, $simple_role, $usename) = @_;
     my $formatted_email;
 
+    my $role_display = "";
+    if($simple_role){
+    	$role_display = "(role:$simple_role) ";
+    }
     $name =~ s/^\s+|\s+$//g;
     $name =~ s/^\"|\"$//g;
     $address =~ s/^\s+|\s+$//g;
@@ -675,9 +681,9 @@ sub format_email {
 
     if ($usename) {
 	if ("$name" eq "") {
-	    $formatted_email = "$address";
+	    $formatted_email = "$role_display<$address>";
 	} else {
-	    $formatted_email = "$name <$address>";
+	    $formatted_email = "$name $role_display<$address>";
 	}
     } else {
 	$formatted_email = $address;
@@ -769,7 +775,7 @@ sub get_maintainer_role {
 	$role = "chief penguin";
     }
 
-    return $role . ":" . $subsystem;
+    return ($role, $role . ":" . $subsystem);
 }
 
 sub get_list_role {
@@ -836,14 +842,14 @@ sub add_categories {
 			if ($tv =~ m/^(\C):\s*(.*)/) {
 			    if ($1 eq "P") {
 				$name = $2;
-				$pvalue = format_email($name, $address, $email_usename);
+				$pvalue = format_email($name, $address, "maintainer", $email_usename);
 			    }
 			}
 		    }
 		}
 		if ($email_maintainer) {
-		    my $role = get_maintainer_role($i);
-		    push_email_addresses($pvalue, $role);
+		    my ($role, $debug_role) = get_maintainer_role($i);
+		    push_email_addresses($pvalue, $role, $debug_role);
 		}
 	    } elsif ($ptype eq "T") {
 		push(@scm, $pvalue);
@@ -870,7 +876,7 @@ sub email_inuse {
 }
 
 sub push_email_address {
-    my ($line, $role) = @_;
+    my ($line, $display_role, $debug_role) = @_;
 
     my ($name, $address) = parse_email($line);
 
@@ -879,9 +885,9 @@ sub push_email_address {
     }
 
     if (!$email_remove_duplicates) {
-	push(@email_to, [format_email($name, $address, $email_usename), $role]);
+	push(@email_to, [format_email($name, $address, $display_role, $email_usename), $debug_role]);
     } elsif (!email_inuse($name, $address)) {
-	push(@email_to, [format_email($name, $address, $email_usename), $role]);
+	push(@email_to, [format_email($name, $address, $display_role, $email_usename), $debug_role]);
 	$email_hash_name{$name}++;
 	$email_hash_address{$address}++;
     }
@@ -890,29 +896,29 @@ sub push_email_address {
 }
 
 sub push_email_addresses {
-    my ($address, $role) = @_;
+    my ($address, $display_role, $debug_role) = @_;
 
     my @address_list = ();
 
     if (rfc822_valid($address)) {
-	push_email_address($address, $role);
+	push_email_address($address, $display_role, $debug_role);
     } elsif (@address_list = rfc822_validlist($address)) {
 	my $array_count = shift(@address_list);
 	while (my $entry = shift(@address_list)) {
-	    push_email_address($entry, $role);
+	    push_email_address($entry, $display_role, $debug_role);
 	}
     } else {
-	if (!push_email_address($address, $role)) {
+	if (!push_email_address($address, $display_role, $debug_role)) {
 	    warn("Invalid MAINTAINERS address: '" . $address . "'\n");
 	}
     }
 }
 
 sub add_role {
-    my ($line, $role) = @_;
+    my ($line, $display_role, $role) = @_;
 
     my ($name, $address) = parse_email($line);
-    my $email = format_email($name, $address, $email_usename);
+    my $email = format_email($name, $address, $display_role, $email_usename);
 
     foreach my $entry (@email_to) {
 	if ($email_remove_duplicates) {
@@ -920,6 +926,7 @@ sub add_role {
 	    if (($name eq $entry_name || $address eq $entry_address)
 		&& ($role eq "" || !($entry->[1] =~ m/$role/))
 	    ) {
+		#append role
 		if ($entry->[1] eq "") {
 		    $entry->[1] = "$role";
 		} else {
@@ -930,6 +937,7 @@ sub add_role {
 	    if ($email eq $entry->[0]
 		&& ($role eq "" || !($entry->[1] =~ m/$role/))
 	    ) {
+		#append role
 		if ($entry->[1] eq "") {
 		    $entry->[1] = "$role";
 		} else {
@@ -958,18 +966,20 @@ sub mailmap {
 
     foreach my $line (@lines) {
 	my ($name, $address) = parse_email($line);
+
 	if (!exists($hash{$name})) {
 	    $hash{$name} = $address;
 	} elsif ($address ne $hash{$name}) {
 	    $address = $hash{$name};
-	    $line = format_email($name, $address, $email_usename);
+	    $line = format_email($name, $address, "", $email_usename);
 	}
+	
 	if (exists($mailmap{$name})) {
 	    my $obj = $mailmap{$name};
 	    foreach my $map_address (@$obj) {
 		if (($map_address eq $address) &&
 		    ($map_address ne $hash{$name})) {
-		    $line = format_email($name, $hash{$name}, $email_usename);
+		    $line = format_email($name, $hash{$name}, "", $email_usename);
 		}
 	    }
 	}
@@ -1021,7 +1031,7 @@ sub vcs_find_signers {
 
     foreach my $line (@lines) {
 	my ($name, $address) = parse_email($line);
-	$line = format_email($name, $address, 1);
+	$line = format_email($name, $address, "", 1);
     }
 
     return ($commits, @lines);
@@ -1136,13 +1146,10 @@ sub vcs_assign {
 	last if ($sign_offs < $email_git_min_signatures ||
 		 $count > $email_git_max_maintainers ||
 		 $percent < $email_git_min_percent);
-	push_email_address($line, '');
-	if ($output_rolestats) {
-	    my $fmt_percent = sprintf("%.0f", $percent);
-	    add_role($line, "$role:$sign_offs/$divisor=$fmt_percent%");
-	} else {
-	    add_role($line, $role);
-	}
+	my $fmt_percent = sprintf("%.0f", $percent);
+	my $debug_role = "$role:$sign_offs/$divisor=$fmt_percent%";
+	push_email_address($line, $role, $debug_role);
+	add_role($line, $role, $debug_role);
     }
 }
 
@@ -1248,7 +1255,7 @@ sub clean_file_emails {
 	    $name = '"' . substr($name, 2, length($name) - 2);
 	}
 
-	my $fmt_email = format_email($name, $address, $email_usename);
+	my $fmt_email = format_email($name, $address, "", $email_usename);
 	push(@fmt_emails, $fmt_email);
     }
     return @fmt_emails;
-- 
1.7.2


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10  9:33 [PATCH] get_maintainer.pl: append reason for cc to the name by default florian
@ 2010-09-10  9:42 ` Joe Perches
  2010-09-10  9:46   ` Wolfram Sang
                     ` (4 more replies)
  2010-09-26 18:52 ` RFC: " Joe Perches
  1 sibling, 5 replies; 62+ messages in thread
From: Joe Perches @ 2010-09-10  9:42 UTC (permalink / raw)
  To: florian
  Cc: Andrew Morton, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 2010-09-10 at 11:33 +0200, florian@mickler.org wrote:
> The script get_maintainer.pl is a very useful tool for deploying changes
> made to the kernel. Among others it searches not only the MAINTAINERS
> file but also the git history for people to send patches to.\
[]
> By appending a  a note of the reason for the cc in the name, the reason
> becomes clear and the receiving side is relieved from feeling obliged to
> check the patch  while the sending side has a chance to adapt the
> cc'list to their liking.

I don't like reading annotated email names myself.

As the coverage of file patterns in MAINTAINERS is
now pretty good, the --git results are less useful.

I'd much rather turn off --git as a default, or change
get_maintainers to add git signers only when there's no
specifically listed maintainer for a particular file.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10  9:42 ` Joe Perches
@ 2010-09-10  9:46   ` Wolfram Sang
  2010-09-10  9:53   ` Mark Brown
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 62+ messages in thread
From: Wolfram Sang @ 2010-09-10  9:46 UTC (permalink / raw)
  To: Joe Perches
  Cc: florian, Andrew Morton, Stephen Hemminger (role:commit_signer),
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1123 bytes --]

On Fri, Sep 10, 2010 at 02:42:47AM -0700, Joe Perches wrote:
> On Fri, 2010-09-10 at 11:33 +0200, florian@mickler.org wrote:
> > The script get_maintainer.pl is a very useful tool for deploying changes
> > made to the kernel. Among others it searches not only the MAINTAINERS
> > file but also the git history for people to send patches to.\
> []
> > By appending a  a note of the reason for the cc in the name, the reason
> > becomes clear and the receiving side is relieved from feeling obliged to
> > check the patch  while the sending side has a chance to adapt the
> > cc'list to their liking.
> 
> I don't like reading annotated email names myself.
> 
> As the coverage of file patterns in MAINTAINERS is
> now pretty good, the --git results are less useful.
> 
> I'd much rather turn off --git as a default, or change
> get_maintainers to add git signers only when there's no
> specifically listed maintainer for a particular file.

+1

-- 
Pengutronix e.K.                           | Wolfram Sang                |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10  9:42 ` Joe Perches
  2010-09-10  9:46   ` Wolfram Sang
@ 2010-09-10  9:53   ` Mark Brown
  2010-09-10 10:04     ` Joe Perches
  2010-09-11 21:22     ` Joe Perches
  2010-09-10 10:30   ` Florian Mickler
                     ` (2 subsequent siblings)
  4 siblings, 2 replies; 62+ messages in thread
From: Mark Brown @ 2010-09-10  9:53 UTC (permalink / raw)
  To: Joe Perches
  Cc: florian, Andrew Morton, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, Sep 10, 2010 at 02:42:47AM -0700, Joe Perches wrote:

> As the coverage of file patterns in MAINTAINERS is
> now pretty good, the --git results are less useful.

It's pretty good for subsystems and PC drivers but for embedded stuff it
is much more patchy and fairly unlikely to find someone who actually
knows about the specific hardware.

> I'd much rather turn off --git as a default, or change
> get_maintainers to add git signers only when there's no
> specifically listed maintainer for a particular file.

OTOH I'd still be somewhat in favour of this purely on the basis that
the default results have way too many false positives.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10  9:53   ` Mark Brown
@ 2010-09-10 10:04     ` Joe Perches
  2010-09-10 10:18       ` Mark Brown
  2010-09-10 10:22       ` Florian Mickler
  2010-09-11 21:22     ` Joe Perches
  1 sibling, 2 replies; 62+ messages in thread
From: Joe Perches @ 2010-09-10 10:04 UTC (permalink / raw)
  To: Mark Brown
  Cc: florian, Andrew Morton, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 2010-09-10 at 10:53 +0100, Mark Brown wrote:
> On Fri, Sep 10, 2010 at 02:42:47AM -0700, Joe Perches wrote:
> > As the coverage of file patterns in MAINTAINERS is
> > now pretty good, the --git results are less useful.
> It's pretty good for subsystems and PC drivers but for embedded stuff it
> is much more patchy and fairly unlikely to find someone who actually
> knows about the specific hardware.

Especially not so good for ARM stuff.

It'd be great if the ARM/embedded folk would spend
some effort improving the MAINTAINERS file pattern
coverage.

I wrote some scripts to compare files and git commits
to the individuals listed in MAINTAINERS as well as
the "first author" of files added vs email addresses
contained in the files.  Some nominal MAINTAINERS
haven't had commits or signoffs in quite awhile and
could be likely be sifted out.

One possible improvement that get_maintainers could
add is to [optionally] CC the author of the first
commit when there's no listed maintainer.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 10:04     ` Joe Perches
@ 2010-09-10 10:18       ` Mark Brown
  2010-09-10 10:47         ` Joe Perches
  2010-09-10 11:44         ` [PATCH] get_maintainer.pl: append reason for cc to the name by default Alan Cox
  2010-09-10 10:22       ` Florian Mickler
  1 sibling, 2 replies; 62+ messages in thread
From: Mark Brown @ 2010-09-10 10:18 UTC (permalink / raw)
  To: Joe Perches
  Cc: florian, Andrew Morton, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, Sep 10, 2010 at 03:04:26AM -0700, Joe Perches wrote:

> It'd be great if the ARM/embedded folk would spend
> some effort improving the MAINTAINERS file pattern
> coverage.

Half the problem is that a lot of drivers aren't maintained by the
people who wrote them - for example, they wrote the driver to get a
board working but have no ongoing interest or can only really comment on
the one specific configuration used on their particular hardware.  This
means getting people to add MAINTAINERS entries is a bit more tricky
than it needs to be, even if they could offer useful review on changes.

> One possible improvement that get_maintainers could
> add is to [optionally] CC the author of the first
> commit when there's no listed maintainer.

That might be a nice option (though it will still give false positives).
Something based on percentage of the driver written rather than log
entry counts might also be interesting.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 10:04     ` Joe Perches
  2010-09-10 10:18       ` Mark Brown
@ 2010-09-10 10:22       ` Florian Mickler
  2010-09-10 10:47         ` Joe Perches
  1 sibling, 1 reply; 62+ messages in thread
From: Florian Mickler @ 2010-09-10 10:22 UTC (permalink / raw)
  To: Joe Perches
  Cc: Mark Brown, Andrew Morton, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 10 Sep 2010 03:04:26 -0700
Joe Perches <joe@perches.com> wrote:

> On Fri, 2010-09-10 at 10:53 +0100, Mark Brown wrote:
> > On Fri, Sep 10, 2010 at 02:42:47AM -0700, Joe Perches wrote:
> > > As the coverage of file patterns in MAINTAINERS is
> > > now pretty good, the --git results are less useful.
> > It's pretty good for subsystems and PC drivers but for embedded stuff it
> > is much more patchy and fairly unlikely to find someone who actually
> > knows about the specific hardware.
> 
> Especially not so good for ARM stuff.
> 
> It'd be great if the ARM/embedded folk would spend
> some effort improving the MAINTAINERS file pattern
> coverage.

hm.. not likely to happen.

> One possible improvement that get_maintainers could
> add is to [optionally] CC the author of the first
> commit when there's no listed maintainer.
> 

I don't think the first commit has any special significance 
in general. 


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10  9:42 ` Joe Perches
  2010-09-10  9:46   ` Wolfram Sang
  2010-09-10  9:53   ` Mark Brown
@ 2010-09-10 10:30   ` Florian Mickler
  2010-09-10 11:04     ` Mark Brown
                       ` (2 more replies)
  2010-09-10 11:11   ` [PATCH] get_maintainer.pl: append reason for cc to the name by default Florian Mickler
  2010-09-11  0:13   ` Christoph Hellwig
  4 siblings, 3 replies; 62+ messages in thread
From: Florian Mickler @ 2010-09-10 10:30 UTC (permalink / raw)
  To: Joe Perches
  Cc: Andrew Morton, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 10 Sep 2010 02:42:47 -0700
Joe Perches <joe@perches.com> wrote:

> On Fri, 2010-09-10 at 11:33 +0200, florian@mickler.org wrote:
> > The script get_maintainer.pl is a very useful tool for deploying changes
> > made to the kernel. Among others it searches not only the MAINTAINERS
> > file but also the git history for people to send patches to.\
> []
> > By appending a  a note of the reason for the cc in the name, the reason
> > becomes clear and the receiving side is relieved from feeling obliged to
> > check the patch  while the sending side has a chance to adapt the
> > cc'list to their liking.
> 
> I don't like reading annotated email names myself.
> 
> As the coverage of file patterns in MAINTAINERS is
> now pretty good, the --git results are less useful.
> 
> I'd much rather turn off --git as a default, or change
> get_maintainers to add git signers only when there's no
> specifically listed maintainer for a particular file.
> 
> 

Of course, that is also an option. 

But I personally would love to be cc'd on files I committed stuff to,
as long as it is clear, that it was done in an automated way. 

People that get angered by being cc'd a lot as commit_signers can always
filter their mail for that. 
But I don't think it is the major part of kernel hackers that get angry
when they are cc'd. Most kernel hackers _love_ getting mail, or not? :) 
At least they should.

Your mileage may vary of course, but I can't see any harm done by this
change. Anyone familiar with get_maintainer.pl can always just put
"--nogit" in his .get_maintainer.conf file. 

I think the --git thing is great, as it really get's patches into the
kernel vs letting them rot on some mailinglist.

But  
Cheers,
Flo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 10:22       ` Florian Mickler
@ 2010-09-10 10:47         ` Joe Perches
  0 siblings, 0 replies; 62+ messages in thread
From: Joe Perches @ 2010-09-10 10:47 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Mark Brown, Andrew Morton, Stephen Hemminger, Wolfram Sang, linux-kernel

On Fri, 2010-09-10 at 12:22 +0200, Florian Mickler wrote:
> On Fri, 10 Sep 2010 03:04:26 -0700
> Joe Perches <joe@perches.com> wrote:
> > One possible improvement that get_maintainers could
> > add is to [optionally] CC the author of the first
> > commit when there's no listed maintainer.
> I don't think the first commit has any special significance 
> in general. 

It does depend on the age of the first commit.

The author of the first commit is most frequently
the primary file author and also a current maintainer
or a still significantly interested party.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 10:18       ` Mark Brown
@ 2010-09-10 10:47         ` Joe Perches
  2010-09-10 11:07           ` Mark Brown
  2010-09-11  0:22           ` [PATCH] scripts/get_maintainer.pl: Add --git-blame --rolestats "Authored lines" information Joe Perches
  2010-09-10 11:44         ` [PATCH] get_maintainer.pl: append reason for cc to the name by default Alan Cox
  1 sibling, 2 replies; 62+ messages in thread
From: Joe Perches @ 2010-09-10 10:47 UTC (permalink / raw)
  To: Mark Brown
  Cc: florian, Andrew Morton, Stephen Hemminger, Wolfram Sang, linux-kernel

On Fri, 2010-09-10 at 11:18 +0100, Mark Brown wrote:
> On Fri, Sep 10, 2010 at 03:04:26AM -0700, Joe Perches wrote:
> > One possible improvement that get_maintainers could
> > add is to [optionally] CC the author of the first
> > commit when there's no listed maintainer.
> 
> That might be a nice option (though it will still give false positives).

Anything that uses git will give false positives.
Anything that uses MAINTAINERS will give false positives.

> Something based on percentage of the driver written rather than log
> entry counts might also be interesting.

There is a --git-blame option, but it uses a count of file commits
active in the file rather than % of lines authored.

For instance:

$ ./scripts/get_maintainer.pl --git-blame --rolestats -f arch/arm/mach-kirkwood/common.c
Lennert Buytenhek <kernel@wantstofly.org> (odd fixer:ARM/Marvell Loki/...,commits:11/38=29%)
Nicolas Pitre <nico@fluxnic.net> (odd fixer:ARM/Marvell Loki/...,commit_signer:7/12=58%,commits:22/38=58%)
Russell King <linux@arm.linux.org.uk> (maintainer:ARM PORT,commits:3/38=8%)
Mark Brown <broonie@opensource.wolfsonmicro.com> (commit_signer:3/12=25%)
Liam Girdwood <lrg@slimlogic.co.uk> (commit_signer:3/12=25%)
Saeed Bishara <saeed@marvell.com> (commit_signer:3/12=25%,commits:6/38=16%)
Arnaud Patard <arnaud.patard@rtp-net.org> (commit_signer:2/12=17%)
Ronen Shitrit <rshitrit@marvell.com> (commits:5/38=13%)
linux-arm-kernel@lists.infradead.org (open list:ARM/Marvell Loki/...)
linux-kernel@vger.kernel.org (open list)

Maybe --rolestats should be extended to include line counts
if --git-blame is used.

--git-blame can be very slow, it shouldn't be a default option.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 10:30   ` Florian Mickler
@ 2010-09-10 11:04     ` Mark Brown
  2010-09-10 11:15       ` Florian Mickler
  2010-09-10 21:04     ` Andrew Morton
  2010-09-13  4:01     ` Valdis.Kletnieks
  2 siblings, 1 reply; 62+ messages in thread
From: Mark Brown @ 2010-09-10 11:04 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Joe Perches, Andrew Morton,
	Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, Sep 10, 2010 at 12:30:40PM +0200, Florian Mickler wrote:

> People that get angered by being cc'd a lot as commit_signers can always
> filter their mail for that. 
> But I don't think it is the major part of kernel hackers that get angry
> when they are cc'd. Most kernel hackers _love_ getting mail, or not? :) 
> At least they should.

Consider what happens if someone does lots of kernel wide changes or
generic code cleanup work like the pattern matching cleanups.  As
someone who gets some of this sort of stuff I find it's quite confusing
when I get patches sent to me and I have to figure out what I'm supposed
to do about them.

> I think the --git thing is great, as it really get's patches into the
> kernel vs letting them rot on some mailinglist.

It's a very useful tool - the issue is with people not realising that
it's not infalible and taking the results blindly.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 10:47         ` Joe Perches
@ 2010-09-10 11:07           ` Mark Brown
  2010-09-11  0:22           ` [PATCH] scripts/get_maintainer.pl: Add --git-blame --rolestats "Authored lines" information Joe Perches
  1 sibling, 0 replies; 62+ messages in thread
From: Mark Brown @ 2010-09-10 11:07 UTC (permalink / raw)
  To: Joe Perches
  Cc: florian, Andrew Morton, Stephen Hemminger, Wolfram Sang, linux-kernel

On Fri, Sep 10, 2010 at 03:47:45AM -0700, Joe Perches wrote:
> On Fri, 2010-09-10 at 11:18 +0100, Mark Brown wrote:

> > That might be a nice option (though it will still give false positives).

> Anything that uses git will give false positives.
> Anything that uses MAINTAINERS will give false positives.

Agreed, the main problem here is people thinking the tool is more magic
than it actually is rather than a problem in the tool itself.

> Maybe --rolestats should be extended to include line counts
> if --git-blame is used.

> --git-blame can be very slow, it shouldn't be a default option.

Yes, agreed.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10  9:42 ` Joe Perches
                     ` (2 preceding siblings ...)
  2010-09-10 10:30   ` Florian Mickler
@ 2010-09-10 11:11   ` Florian Mickler
  2010-09-10 15:12     ` Joe Perches
  2010-09-11  0:13   ` Christoph Hellwig
  4 siblings, 1 reply; 62+ messages in thread
From: Florian Mickler @ 2010-09-10 11:11 UTC (permalink / raw)
  To: Joe Perches
  Cc: Andrew Morton, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 10 Sep 2010 02:42:47 -0700
Joe Perches <joe@perches.com> wrote:

> On Fri, 2010-09-10 at 11:33 +0200, florian@mickler.org wrote:
> > The script get_maintainer.pl is a very useful tool for deploying changes
> > made to the kernel. Among others it searches not only the MAINTAINERS
> > file but also the git history for people to send patches to.\
> []
> > By appending a  a note of the reason for the cc in the name, the reason
> > becomes clear and the receiving side is relieved from feeling obliged to
> > check the patch  while the sending side has a chance to adapt the
> > cc'list to their liking.
> 
> I don't like reading annotated email names myself.

Btw, why? I think it is a great way to show that it was done in an
automated fashion. 

I wondered if tagging another way would be useful but thought the
role:what could be used to allow better filtering on the receiving
side. 

It also would show how many people used get_maintainer.pl. 

Maybe some people want to suppress the name-tagging via a cmdline, but
then I think the tagging should be default on or it could backpack on
the --git/--nogit option. 

Cheers,
Flo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 11:04     ` Mark Brown
@ 2010-09-10 11:15       ` Florian Mickler
  0 siblings, 0 replies; 62+ messages in thread
From: Florian Mickler @ 2010-09-10 11:15 UTC (permalink / raw)
  To: Mark Brown
  Cc: Joe Perches, Andrew Morton,
	Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 10 Sep 2010 12:04:31 +0100
Mark Brown <broonie@opensource.wolfsonmicro.com> wrote:

> On Fri, Sep 10, 2010 at 12:30:40PM +0200, Florian Mickler wrote:
> 
> > People that get angered by being cc'd a lot as commit_signers can always
> > filter their mail for that. 
> > But I don't think it is the major part of kernel hackers that get angry
> > when they are cc'd. Most kernel hackers _love_ getting mail, or not? :) 
> > At least they should.
> 
> Consider what happens if someone does lots of kernel wide changes or
> generic code cleanup work like the pattern matching cleanups.  As
> someone who gets some of this sort of stuff I find it's quite confusing
> when I get patches sent to me and I have to figure out what I'm supposed
> to do about them.

Yes, absolutely. That was what the patch was motivated by. 

> 
> > I think the --git thing is great, as it really get's patches into the
> > kernel vs letting them rot on some mailinglist.
> 
> It's a very useful tool - the issue is with people not realising that
> it's not infalible and taking the results blindly.

Yeah. I agree. That's why I think that making the "blindly take the
results" visible somehow would be nice. 

After all, if you can easy determine (or even have it filtered away)
that you were cc'd automatically it would spare you some time. 

I would probably don't have those mails be sent to my mobile device.
While I probably want things I'm cc'd manually tho. 


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 10:18       ` Mark Brown
  2010-09-10 10:47         ` Joe Perches
@ 2010-09-10 11:44         ` Alan Cox
  1 sibling, 0 replies; 62+ messages in thread
From: Alan Cox @ 2010-09-10 11:44 UTC (permalink / raw)
  To: Mark Brown
  Cc: Joe Perches, florian, Andrew Morton,
	Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

> > One possible improvement that get_maintainers could
> > add is to [optionally] CC the author of the first
> > commit when there's no listed maintainer.
> 
> That might be a nice option (though it will still give false positives).
> Something based on percentage of the driver written rather than log
> entry counts might also be interesting.

Within a time limit yes.

I'd guess a heuristic along the lines of

original committer && has committed in the past 6 months

might work rather well - and you could probably walk up the list of
signed-off-by this way with the same rule and take a fair shot at a
maintainer.

Trouble is for a lot of other stuff beyond that you need to hit the
person whose signoff is on most of the submits to that file/subtree in the
past six months but not count wide cross tree commits

That gets a bit harder to compute.

Seems like there is a social sciences or systems research project here
to do some graph theory and modelling into "who really maintains the Linux
kernel"

Alan

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 11:11   ` [PATCH] get_maintainer.pl: append reason for cc to the name by default Florian Mickler
@ 2010-09-10 15:12     ` Joe Perches
  2010-09-11  9:34       ` Florian Mickler
  0 siblings, 1 reply; 62+ messages in thread
From: Joe Perches @ 2010-09-10 15:12 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Andrew Morton, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 2010-09-10 at 13:11 +0200, Florian Mickler wrote:
> On Fri, 10 Sep 2010 02:42:47 -0700
> Joe Perches <joe@perches.com> wrote:
> > On Fri, 2010-09-10 at 11:33 +0200, florian@mickler.org wrote:
> > > The script get_maintainer.pl is a very useful tool for deploying changes
> > > made to the kernel. Among others it searches not only the MAINTAINERS
> > > file but also the git history for people to send patches to.\
> > []
> > > By appending a  a note of the reason for the cc in the name, the reason
> > > becomes clear and the receiving side is relieved from feeling obliged to
> > > check the patch  while the sending side has a chance to adapt the
> > > cc'list to their liking.
> > I don't like reading annotated email names myself.
> Btw, why? I think it is a great way to show that it was done in an
> automated fashion. 

More stuff than necessary to read, requires effort to find next.
Fewer names in visual field.

Also, it can affect systems that automatically add email addresses
to an address directory as it could store the entire content
before the < as the name.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 10:30   ` Florian Mickler
  2010-09-10 11:04     ` Mark Brown
@ 2010-09-10 21:04     ` Andrew Morton
  2010-09-10 21:39       ` Florian Mickler
  2010-09-10 21:44       ` Joe Perches
  2010-09-13  4:01     ` Valdis.Kletnieks
  2 siblings, 2 replies; 62+ messages in thread
From: Andrew Morton @ 2010-09-10 21:04 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Joe Perches, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 10 Sep 2010 12:30:40 +0200
Florian Mickler <florian@mickler.org> wrote:

> People that get angered by being cc'd a lot as commit_signers can always
> filter their mail for that. 

People who get angry at an unexpected cc need to get a clue.  Or
get slapped.

I mean really, it's not a big deal.  And the addition of the occasional
inappropriate cc is a far less serious problem than the omission of
someone who might have been interested in the change.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 21:04     ` Andrew Morton
@ 2010-09-10 21:39       ` Florian Mickler
  2010-09-10 21:44       ` Joe Perches
  1 sibling, 0 replies; 62+ messages in thread
From: Florian Mickler @ 2010-09-10 21:39 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Joe Perches, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 10 Sep 2010 14:04:04 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Fri, 10 Sep 2010 12:30:40 +0200
> Florian Mickler <florian@mickler.org> wrote:
> 
> > People that get angered by being cc'd a lot as commit_signers can always
> > filter their mail for that. 
> 
> People who get angry at an unexpected cc need to get a clue.  Or
> get slapped.
> 
> I mean really, it's not a big deal.  And the addition of the occasional
> inappropriate cc is a far less serious problem than the omission of
> someone who might have been interested in the change.

Indeed. So what is your opinion on the patch? 
In my opinion it is a net win, because realistically we can not go
around and slap them... can we? 

So this change might enable those who really have a problem with it, to
filter for it. 
Anybody who cares on the sending side, can edit it out, or perhaps,
when we make it optional, opt out of it. 

I personally would like such annotations on the sending side, because I
don't wanna anger people, and would perhaps when in doubt, trim the cc
list. But the annotation enables me to say: Hey, it was an
automatic cc'list. So you could have filtered it out. Don't be angry
with me! 

While on the receiving side, I might prioritize mails I get as an
maintainer (not that I am a maintainer.. but..), while other mail goes
to the read-on-a-rainy-day-mailbox. 

Cheers,
Flo


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 21:04     ` Andrew Morton
  2010-09-10 21:39       ` Florian Mickler
@ 2010-09-10 21:44       ` Joe Perches
  1 sibling, 0 replies; 62+ messages in thread
From: Joe Perches @ 2010-09-10 21:44 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Florian Mickler, Stephen Hemminger, Wolfram Sang, linux-kernel

On Fri, 2010-09-10 at 14:04 -0700, Andrew Morton wrote:
> People who get angry at an unexpected cc need to get a clue.
> I mean really, it's not a big deal.  And the addition of the occasional
> inappropriate cc is a far less serious problem than the omission of
> someone who might have been interested in the change.

I agree and I don't much like the suggested patch.

I get some of these even though all I ever do is
trivial stuff.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10  9:42 ` Joe Perches
                     ` (3 preceding siblings ...)
  2010-09-10 11:11   ` [PATCH] get_maintainer.pl: append reason for cc to the name by default Florian Mickler
@ 2010-09-11  0:13   ` Christoph Hellwig
  2010-09-11  0:31     ` Joe Perches
  4 siblings, 1 reply; 62+ messages in thread
From: Christoph Hellwig @ 2010-09-11  0:13 UTC (permalink / raw)
  To: Joe Perches
  Cc: florian, Andrew Morton, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, Sep 10, 2010 at 02:42:47AM -0700, Joe Perches wrote:
> I don't like reading annotated email names myself.
> 
> As the coverage of file patterns in MAINTAINERS is
> now pretty good, the --git results are less useful.
> 
> I'd much rather turn off --git as a default, or change
> get_maintainers to add git signers only when there's no
> specifically listed maintainer for a particular file.

Whoever made --git the default was on crack and this needs to be
reverted ASAP.  Just because people do global API changes it doesn't
mean they need to be cced on every second pathc to the kernel tree.

We already have a mailinglist for that purpose.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH] scripts/get_maintainer.pl: Add --git-blame --rolestats "Authored lines" information
  2010-09-10 10:47         ` Joe Perches
  2010-09-10 11:07           ` Mark Brown
@ 2010-09-11  0:22           ` Joe Perches
  2010-09-11  9:38             ` Florian Mickler
  1 sibling, 1 reply; 62+ messages in thread
From: Joe Perches @ 2010-09-11  0:22 UTC (permalink / raw)
  To: Mark Brown
  Cc: florian, Andrew Morton, Stephen Hemminger, Wolfram Sang, linux-kernel

> > Something based on percentage of the driver written rather than log
> > entry counts might also be interesting.
> There is a --git-blame option, but it uses a count of file commits
> active in the file rather than % of lines authored.

When options --git-blame and --rolestats are specified, add
the maintainers with the qualifying --git-min-percent amount
of lines authored of the complete file.  Does not add more
authors than specified by --git-max-maintainers.

For anyone using hg, this option works but is _very_ slow.
It's orders of magnitude slower than git slow.

The get_maintainer.pl version was incremented to 0.25.

This can be used with or without --git.

For instance:

$ ./scripts/get_maintainer.pl --git-blame --nogit --rolestats -f lib/bitmap.c
Paul Jackson <pj@sgi.com> (authored lines:406/613=66%,commits:7/20=35%)
Akinobu Mita <mita@miraclelinux.com> (authored lines:87/613=14%,commits:3/20=15%)
Reinette Chatre <reinette.chatre@linux.intel.com> (authored lines:42/613=7%)
Andrew Morton <akpm@linux-foundation.org> (commits:16/20=80%)
Paul Mundt <lethal@linux-sh.org> (commits:3/20=15%)
Randy Dunlap <randy.dunlap@oracle.com> (commits:2/20=10%)

$ ./scripts/get_maintainer.pl --git-blame --git --rolestats -f lib/bitmap.c
Andrew Morton <akpm@linux-foundation.org> (commit_signer:4/5=80%,commits:16/20=80%)
Akinobu Mita <akinobu.mita@gmail.com> (commit_signer:2/5=40%,authored lines:87/613=14%,commits:3/20=15%)
Jack Steiner <steiner@sgi.com> (commit_signer:1/5=20%)
Ben Hutchings <ben@decadent.org.uk> (commit_signer:1/5=20%)
Lee Schermerhorn <lee.schermerhorn@hp.com> (commit_signer:1/5=20%)
Paul Jackson <pj@sgi.com> (authored lines:406/613=66%,commits:7/20=35%)
Reinette Chatre <reinette.chatre@linux.intel.com> (authored lines:42/613=7%)
Paul Mundt <lethal@linux-sh.org> (commits:3/20=15%)
Randy Dunlap <randy.dunlap@oracle.com> (commits:2/20=10%)
linux-kernel@vger.kernel.org (open list)

Signed-off-by: Joe Perches <joe@perches.com>
---
 scripts/get_maintainer.pl |   60 ++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 57 insertions(+), 3 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index b228198..a91ae63 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -13,7 +13,7 @@
 use strict;
 
 my $P = $0;
-my $V = '0.24';
+my $V = '0.25';
 
 use Getopt::Long qw(:config no_auto_abbrev);
 
@@ -88,6 +88,7 @@ my %VCS_cmds_git = (
     "available" => '(which("git") ne "") && (-d ".git")',
     "find_signers_cmd" => "git log --no-color --since=\$email_git_since -- \$file",
     "find_commit_signers_cmd" => "git log --no-color -1 \$commit",
+    "find_commit_author_cmd" => "git log -1 --format=\"%an <%ae>\" \$commit",
     "blame_range_cmd" => "git blame -l -L \$diff_start,+\$diff_length \$file",
     "blame_file_cmd" => "git blame -l \$file",
     "commit_pattern" => "^commit [0-9a-f]{40,40}",
@@ -101,6 +102,7 @@ my %VCS_cmds_hg = (
 	"hg log --date=\$email_hg_since" .
 		" --template='commit {node}\\n{desc}\\n' -- \$file",
     "find_commit_signers_cmd" => "hg log --template='{desc}\\n' -r \$commit",
+    "find_commit_author_cmd" => "hg log -l 1 --template='{author}\\n' -r \$commit",
     "blame_range_cmd" => "",		# not supported
     "blame_file_cmd" => "hg blame -c \$file",
     "commit_pattern" => "^commit [0-9a-f]{40,40}",
@@ -1014,6 +1016,9 @@ sub vcs_find_signers {
     if (!$email_git_penguin_chiefs) {
 	@lines = grep(!/${penguin_chiefs}/i, @lines);
     }
+
+    return (0, @lines) if !@lines;
+
     # cut -f2- -d":"
     s/.*:\s*(.+)\s*/$1/ for (@lines);
 
@@ -1027,6 +1032,28 @@ sub vcs_find_signers {
     return ($commits, @lines);
 }
 
+sub vcs_find_author {
+    my ($cmd) = @_;
+    my @lines = ();
+
+    @lines = &{$VCS_cmds{"execute_cmd"}}($cmd);
+
+    if (!$email_git_penguin_chiefs) {
+	@lines = grep(!/${penguin_chiefs}/i, @lines);
+    }
+
+    return @lines if !@lines;
+
+## Reformat email addresses (with names) to avoid badly written signatures
+
+    foreach my $line (@lines) {
+	my ($name, $address) = parse_email($line);
+	$line = format_email($name, $address, 1);
+    }
+
+    return @lines;
+}
+
 sub vcs_save_commits {
     my ($cmd) = @_;
     my @lines = ();
@@ -1084,6 +1111,10 @@ sub vcs_blame {
 	@commits = vcs_save_commits($cmd);
     }
 
+    foreach my $commit (@commits) {
+	$commit =~ s/^\^//g;
+    }
+
     return @commits;
 }
 
@@ -1121,6 +1152,8 @@ sub vcs_assign {
 	@lines = mailmap(@lines);
     }
 
+    return if (@lines <= 0);
+
     @lines = sort(@lines);
 
     # uniq -c
@@ -1165,14 +1198,17 @@ sub vcs_file_blame {
     my ($file) = @_;
 
     my @signers = ();
+    my @all_commits = ();
     my @commits = ();
     my $total_commits;
+    my $total_lines;
 
     return if (!vcs_exists());
 
-    @commits = vcs_blame($file);
-    @commits = uniq(@commits);
+    @all_commits = vcs_blame($file);
+    @commits = uniq(@all_commits);
     $total_commits = @commits;
+    $total_lines = @all_commits;
 
     foreach my $commit (@commits) {
 	my $commit_count;
@@ -1182,10 +1218,28 @@ sub vcs_file_blame {
 	$cmd =~ s/(\$\w+)/$1/eeg;	#interpolate $cmd
 
 	($commit_count, @commit_signers) = vcs_find_signers($cmd);
+
 	push(@signers, @commit_signers);
     }
 
     if ($from_filename) {
+	if ($output_rolestats) {
+	    my @blame_signers;
+	    foreach my $commit (@commits) {
+		my $i;
+		my $cmd = $VCS_cmds{"find_commit_author_cmd"};
+		$cmd =~ s/(\$\w+)/$1/eeg;	#interpolate $cmd
+		my @author = vcs_find_author($cmd);
+		next if !@author;
+		my $count = grep(/$commit/, @all_commits);
+		for ($i = 0; $i < $count ; $i++) {
+		    push(@blame_signers, $author[0]);
+		}
+	    }
+	    if (@blame_signers) {
+		vcs_assign("authored lines", $total_lines, @blame_signers);
+	    }
+	}
 	vcs_assign("commits", $total_commits, @signers);
     } else {
 	vcs_assign("modified commits", $total_commits, @signers);



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-11  0:13   ` Christoph Hellwig
@ 2010-09-11  0:31     ` Joe Perches
  2010-09-11  0:45       ` Christoph Hellwig
  0 siblings, 1 reply; 62+ messages in thread
From: Joe Perches @ 2010-09-11  0:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: florian, Andrew Morton, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 2010-09-10 at 20:13 -0400, Christoph Hellwig wrote:
> Whoever made --git the default was on crack and this needs to be
> reverted ASAP.

That'd be Linus' suggestion and it's been the default
for get_maintainers since it was developed.

Reverted? no.
Changed, maybe.

I hardly use the --git option and I do many tree-wide changes.

If you don't like it, you could use a .get_maintainer.conf
file entry until some consensus is reached later.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-11  0:31     ` Joe Perches
@ 2010-09-11  0:45       ` Christoph Hellwig
  2010-09-11  0:56         ` Joe Perches
  2010-09-11  9:28         ` Florian Mickler
  0 siblings, 2 replies; 62+ messages in thread
From: Christoph Hellwig @ 2010-09-11  0:45 UTC (permalink / raw)
  To: Joe Perches
  Cc: Christoph Hellwig, florian, Andrew Morton,
	Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, Sep 10, 2010 at 05:31:14PM -0700, Joe Perches wrote:
> If you don't like it, you could use a .get_maintainer.conf
> file entry until some consensus is reached later.

I don't use the pice of junk at all.  What really pisses me off is that
I get spamed by all by tons of utterly useless cleanups patches (and
some useful but not at all interesting to me patches) because some
idiot made the bloody script return my address just because I touched a
file.  If it goes on like that I'll simply have to add a mail filter to
throw away all patches sent directly to me instead of a mailinglist.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-11  0:45       ` Christoph Hellwig
@ 2010-09-11  0:56         ` Joe Perches
  2010-09-11  9:28         ` Florian Mickler
  1 sibling, 0 replies; 62+ messages in thread
From: Joe Perches @ 2010-09-11  0:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: florian, Andrew Morton, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 2010-09-10 at 20:45 -0400, Christoph Hellwig wrote:
> What really pisses me off is that
> I get spamed by all by tons of utterly useless cleanups patches (and
> some useful but not at all interesting to me patches) because some
> idiot made the bloody script return my address just because I touched a
> file.  If it goes on like that I'll simply have to add a mail filter to
> throw away all patches sent directly to me instead of a mailinglist.

Seems like I'll have to ask Andrew to slap you next
time you meet. <That's a joke for those humor impaired>

Your choice, if you do that perhaps you should remove
your name from MAINTAINERS as well.

cheers, Joe


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-11  0:45       ` Christoph Hellwig
  2010-09-11  0:56         ` Joe Perches
@ 2010-09-11  9:28         ` Florian Mickler
  2010-09-13  7:16           ` Eric W. Biederman
  1 sibling, 1 reply; 62+ messages in thread
From: Florian Mickler @ 2010-09-11  9:28 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joe Perches, Andrew Morton,
	Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 10 Sep 2010 20:45:51 -0400
Christoph Hellwig <hch@infradead.org> wrote:

> On Fri, Sep 10, 2010 at 05:31:14PM -0700, Joe Perches wrote:
> > If you don't like it, you could use a .get_maintainer.conf
> > file entry until some consensus is reached later.
> 
> I don't use the pice of junk at all.  What really pisses me off is that
> I get spamed by all by tons of utterly useless cleanups patches (and
> some useful but not at all interesting to me patches) because some
> idiot made the bloody script return my address just because I touched a
> file.  If it goes on like that I'll simply have to add a mail filter to
> throw away all patches sent directly to me instead of a mailinglist.
> 

But would you also have the same attitude towards these specific
'spams' if it were made clear in the email adress field that you were
cc'd as a commit_signer  (vs maintainer)?

I know it is a bit hackisch to have stuff in the name field, but it is
rfc conform, it is even proposed for redirected email '(by way of...)'. 

Cheers,
Flo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 15:12     ` Joe Perches
@ 2010-09-11  9:34       ` Florian Mickler
  0 siblings, 0 replies; 62+ messages in thread
From: Florian Mickler @ 2010-09-11  9:34 UTC (permalink / raw)
  To: Joe Perches
  Cc: Andrew Morton, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 10 Sep 2010 08:12:37 -0700
Joe Perches <joe@perches.com> wrote:

> On Fri, 2010-09-10 at 13:11 +0200, Florian Mickler wrote:
> > On Fri, 10 Sep 2010 02:42:47 -0700
> > Joe Perches <joe@perches.com> wrote:
> > > On Fri, 2010-09-10 at 11:33 +0200, florian@mickler.org wrote:
> > > > The script get_maintainer.pl is a very useful tool for deploying changes
> > > > made to the kernel. Among others it searches not only the MAINTAINERS
> > > > file but also the git history for people to send patches to.\
> > > []
> > > > By appending a  a note of the reason for the cc in the name, the reason
> > > > becomes clear and the receiving side is relieved from feeling obliged to
> > > > check the patch  while the sending side has a chance to adapt the
> > > > cc'list to their liking.
> > > I don't like reading annotated email names myself.
> > Btw, why? I think it is a great way to show that it was done in an
> > automated fashion. 
> 
> More stuff than necessary to read, requires effort to find next. 

next what? 

I think the normal case is to ignore the name tag. It is only if you
really get many many emails that you will begin to filter for these
role-tags. And then you will probably do it in an automated fashion.



> Fewer names in visual field.

ok. But anybody who cares about his patch submission deeply can edit
the cc'list before sending. And then he is also likely to trim the
cc'list by leaving out people that did only trivial spelling fixes or
renamed stuff, or flamed them the last time.


> 
> Also, it can affect systems that automatically add email addresses
> to an address directory as it could store the entire content
> before the < as the name.

I think any system that relies on input it can not control has to deal
with these kind of stuff anyway. (Be it typo's or just plain wrong
names, or surename<->name switcheroos...)
So I don't think this is that big a concern.


Cheers,
Flo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] scripts/get_maintainer.pl: Add --git-blame --rolestats "Authored lines" information
  2010-09-11  0:22           ` [PATCH] scripts/get_maintainer.pl: Add --git-blame --rolestats "Authored lines" information Joe Perches
@ 2010-09-11  9:38             ` Florian Mickler
  2010-09-11  9:52               ` Joe Perches
  0 siblings, 1 reply; 62+ messages in thread
From: Florian Mickler @ 2010-09-11  9:38 UTC (permalink / raw)
  To: Joe Perches
  Cc: Mark Brown, Andrew Morton, Stephen Hemminger, Wolfram Sang, linux-kernel

On Fri, 10 Sep 2010 17:22:21 -0700
Joe Perches <joe@perches.com> wrote:

> > > Something based on percentage of the driver written rather than log
> > > entry counts might also be interesting.
> > There is a --git-blame option, but it uses a count of file commits
> > active in the file rather than % of lines authored.
> 
> When options --git-blame and --rolestats are specified, add
> the maintainers with the qualifying --git-min-percent amount
> of lines authored of the complete file.  Does not add more
> authors than specified by --git-max-maintainers.
> 
> For anyone using hg, this option works but is _very_ slow.
> It's orders of magnitude slower than git slow.
> 
> The get_maintainer.pl version was incremented to 0.25.
> 
> This can be used with or without --git.
> 
> For instance:
> 
> $ ./scripts/get_maintainer.pl --git-blame --nogit --rolestats -f lib/bitmap.c
> Paul Jackson <pj@sgi.com> (authored lines:406/613=66%,commits:7/20=35%)
> Akinobu Mita <mita@miraclelinux.com> (authored lines:87/613=14%,commits:3/20=15%)
> Reinette Chatre <reinette.chatre@linux.intel.com> (authored lines:42/613=7%)
> Andrew Morton <akpm@linux-foundation.org> (commits:16/20=80%)
> Paul Mundt <lethal@linux-sh.org> (commits:3/20=15%)
> Randy Dunlap <randy.dunlap@oracle.com> (commits:2/20=10%)
> 
> $ ./scripts/get_maintainer.pl --git-blame --git --rolestats -f lib/bitmap.c
> Andrew Morton <akpm@linux-foundation.org> (commit_signer:4/5=80%,commits:16/20=80%)
> Akinobu Mita <akinobu.mita@gmail.com> (commit_signer:2/5=40%,authored lines:87/613=14%,commits:3/20=15%)
> Jack Steiner <steiner@sgi.com> (commit_signer:1/5=20%)
> Ben Hutchings <ben@decadent.org.uk> (commit_signer:1/5=20%)
> Lee Schermerhorn <lee.schermerhorn@hp.com> (commit_signer:1/5=20%)
> Paul Jackson <pj@sgi.com> (authored lines:406/613=66%,commits:7/20=35%)
> Reinette Chatre <reinette.chatre@linux.intel.com> (authored lines:42/613=7%)
> Paul Mundt <lethal@linux-sh.org> (commits:3/20=15%)
> Randy Dunlap <randy.dunlap@oracle.com> (commits:2/20=10%)
> linux-kernel@vger.kernel.org (open list)
> 
> Signed-off-by: Joe Perches <joe@perches.com>

I think this is a good change. I also think this should be made
default. (linestat is probably more of a hint then nr of commits. 

But I don't think it really solves the same problem as the annotations
on the name. We really want to show who just used a script to cc people
and who made an effort to find the right people to send the patch to.

Cheers,
Flo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] scripts/get_maintainer.pl: Add --git-blame --rolestats "Authored lines" information
  2010-09-11  9:38             ` Florian Mickler
@ 2010-09-11  9:52               ` Joe Perches
  2010-09-11 10:02                 ` Florian Mickler
  0 siblings, 1 reply; 62+ messages in thread
From: Joe Perches @ 2010-09-11  9:52 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Mark Brown, Andrew Morton, Stephen Hemminger, Wolfram Sang, linux-kernel

On Sat, 2010-09-11 at 11:38 +0200, Florian Mickler wrote:
> On Fri, 10 Sep 2010 17:22:21 -0700
> Joe Perches <joe@perches.com> wrote:
> > > > Something based on percentage of the driver written rather than log
> > > > entry counts might also be interesting.
> > > There is a --git-blame option, but it uses a count of file commits
> > > active in the file rather than % of lines authored.
> > When options --git-blame and --rolestats are specified, add
> > the maintainers with the qualifying --git-min-percent amount
> > of lines authored of the complete file.  Does not add more
> > authors than specified by --git-max-maintainers.
> > For anyone using hg, this option works but is _very_ slow.
> > It's orders of magnitude slower than git slow.
> > The get_maintainer.pl version was incremented to 0.25.
> > This can be used with or without --git.
> I think this is a good change. I also think this should be made
> default. (linestat is probably more of a hint then nr of commits. 

It's OK but make it default?  No.

It's slower.  A lot slower.
git blame takes a very long time.  It can take minutes.
If you're using hg, it can take a lot more than that.

Run it on something like MAINTAINERS or a largish
active driver and see.

It also doesn't favor recent changes and newer, active
authors.  It also can overemphasis submitters that are
no longer active.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] scripts/get_maintainer.pl: Add --git-blame --rolestats "Authored lines" information
  2010-09-11  9:52               ` Joe Perches
@ 2010-09-11 10:02                 ` Florian Mickler
  2010-09-11 10:22                   ` Joe Perches
  2010-09-11 19:22                   ` [PATCH] Documentation/SubmittingPatches: Add and describe scripts/get_maintainer.pl Joe Perches
  0 siblings, 2 replies; 62+ messages in thread
From: Florian Mickler @ 2010-09-11 10:02 UTC (permalink / raw)
  To: Joe Perches
  Cc: Mark Brown, Andrew Morton, Stephen Hemminger, Wolfram Sang, linux-kernel

On Sat, 11 Sep 2010 02:52:30 -0700
Joe Perches <joe@perches.com> wrote:

> On Sat, 2010-09-11 at 11:38 +0200, Florian Mickler wrote:
> > On Fri, 10 Sep 2010 17:22:21 -0700
> > Joe Perches <joe@perches.com> wrote:
> > > > > Something based on percentage of the driver written rather than log
> > > > > entry counts might also be interesting.
> > > > There is a --git-blame option, but it uses a count of file commits
> > > > active in the file rather than % of lines authored.
> > > When options --git-blame and --rolestats are specified, add
> > > the maintainers with the qualifying --git-min-percent amount
> > > of lines authored of the complete file.  Does not add more
> > > authors than specified by --git-max-maintainers.
> > > For anyone using hg, this option works but is _very_ slow.
> > > It's orders of magnitude slower than git slow.
> > > The get_maintainer.pl version was incremented to 0.25.
> > > This can be used with or without --git.
> > I think this is a good change. I also think this should be made
> > default. (linestat is probably more of a hint then nr of commits. 
> 
> It's OK but make it default?  No.
> 
> It's slower.  A lot slower.
> git blame takes a very long time.  It can take minutes.
> If you're using hg, it can take a lot more than that.
> 
> Run it on something like MAINTAINERS or a largish
> active driver and see.

Point taken.

> It also doesn't favor recent changes and newer, active
> authors.  It also can overemphasis submitters that are
> no longer active.
> 
> 
But at least there is a reason to cc the guy. He wrote it after all.

Cheers,
Flo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] scripts/get_maintainer.pl: Add --git-blame --rolestats "Authored lines" information
  2010-09-11 10:02                 ` Florian Mickler
@ 2010-09-11 10:22                   ` Joe Perches
  2010-09-11 19:22                   ` [PATCH] Documentation/SubmittingPatches: Add and describe scripts/get_maintainer.pl Joe Perches
  1 sibling, 0 replies; 62+ messages in thread
From: Joe Perches @ 2010-09-11 10:22 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Mark Brown, Andrew Morton, Stephen Hemminger, Wolfram Sang, linux-kernel

On Sat, 2010-09-11 at 12:02 +0200, Florian Mickler wrote:
> On Sat, 11 Sep 2010 02:52:30 -0700
> Joe Perches <joe@perches.com> wrote:
> 
> > On Sat, 2010-09-11 at 11:38 +0200, Florian Mickler wrote:
> > > On Fri, 10 Sep 2010 17:22:21 -0700
> > > Joe Perches <joe@perches.com> wrote:
> > > > > > Something based on percentage of the driver written rather than log
> > > > > > entry counts might also be interesting.
> > > > > There is a --git-blame option, but it uses a count of file commits
> > > > > active in the file rather than % of lines authored.
> > > > When options --git-blame and --rolestats are specified, add
> > > > the maintainers with the qualifying --git-min-percent amount
> > > > of lines authored of the complete file.  Does not add more
> > > > authors than specified by --git-max-maintainers.
> > > > For anyone using hg, this option works but is _very_ slow.
> > > > It's orders of magnitude slower than git slow.
> > > > The get_maintainer.pl version was incremented to 0.25.
> > > > This can be used with or without --git.
> > > I think this is a good change. I also think this should be made
> > > default. (linestat is probably more of a hint then nr of commits. 
> > 
> > It's OK but make it default?  No.
> > 
> > It's slower.  A lot slower.
> > git blame takes a very long time.  It can take minutes.
> > If you're using hg, it can take a lot more than that.
> > 
> > Run it on something like MAINTAINERS or a largish
> > active driver and see.
> 
> Point taken.
> 
> > It also doesn't favor recent changes and newer, active
> > authors.  It also can overemphasis submitters that are
> > no longer active.
> > 
> > 
> But at least there is a reason to cc the guy. He wrote it after all.

That's why you might have an option to find the
first submitter.

The equivalent of:

git log --reverse --pretty="%an <%ae>" filename | head -1

Unfortunately, for anything in the tree older than 2.6.12
that shows Linus Torvalds at a no longer current address.

$ git log --reverse --pretty="%an <%ae>" kernel/sched.c | head -1
Linus Torvalds <torvalds@ppc970.osdl.org>



^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH] Documentation/SubmittingPatches: Add and describe scripts/get_maintainer.pl
  2010-09-11 10:02                 ` Florian Mickler
  2010-09-11 10:22                   ` Joe Perches
@ 2010-09-11 19:22                   ` Joe Perches
  2010-09-11 19:34                     ` Florian Mickler
  2010-09-11 19:43                     ` [PATCH V2] " Joe Perches
  1 sibling, 2 replies; 62+ messages in thread
From: Joe Perches @ 2010-09-11 19:22 UTC (permalink / raw)
  To: Florian Mickler, Andrew Morton
  Cc: Mark Brown, Stephen Hemminger, Wolfram Sang, linux-kernel

Signed-off-by: Joe Perches <joe@perches.com>
---
 Documentation/SubmittingPatches |   17 +++++++++++++++++
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 689e237..bb318f9 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -176,6 +176,23 @@ discussed should the patch then be submitted to Linus.
 
 6) Select your CC (e-mail carbon copy) list.
 
+Use
+	"./scripts/get_maintainer.pl --nogit --roles <patch>"
+to find the appropriate maintainers and mailing lists to CC.
+
+If the output email addresses doesn't show a "supporter" or "maintainer",
+you can expand the search to include git history with:
+
+	"./scripts/get_maintainer.pl --roles <patch>"
+
+Searching git history can often list individuals that may not be
+particularly interested in your patch, so use the list of names
+judiciously.
+
+./scripts/get_maintainers.pl options "--rolestats" and "--git-blame"
+can show a bit more information and might help you select the appropriate
+parties to CC.
+
 Unless you have a reason NOT to do so, CC linux-kernel@vger.kernel.org.
 
 Other kernel developers besides Linus need to be aware of your change,



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH] Documentation/SubmittingPatches: Add and describe scripts/get_maintainer.pl
  2010-09-11 19:22                   ` [PATCH] Documentation/SubmittingPatches: Add and describe scripts/get_maintainer.pl Joe Perches
@ 2010-09-11 19:34                     ` Florian Mickler
  2010-09-11 19:43                     ` [PATCH V2] " Joe Perches
  1 sibling, 0 replies; 62+ messages in thread
From: Florian Mickler @ 2010-09-11 19:34 UTC (permalink / raw)
  To: Joe Perches
  Cc: Andrew Morton, Mark Brown, Stephen Hemminger, Wolfram Sang, linux-kernel

On Sat, 11 Sep 2010 12:22:23 -0700
Joe Perches <joe@perches.com> wrote:

> +./scripts/get_maintainers.pl options "--rolestats" and "--git-blame"
> +can show a bit more information and might help you select the appropriate
> +parties to CC.

typo. s/maintainers/maintainer/

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH V2] Documentation/SubmittingPatches: Add and describe scripts/get_maintainer.pl
  2010-09-11 19:22                   ` [PATCH] Documentation/SubmittingPatches: Add and describe scripts/get_maintainer.pl Joe Perches
  2010-09-11 19:34                     ` Florian Mickler
@ 2010-09-11 19:43                     ` Joe Perches
  2010-09-12 16:18                       ` Florian Mickler
  1 sibling, 1 reply; 62+ messages in thread
From: Joe Perches @ 2010-09-11 19:43 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Andrew Morton, Mark Brown, Stephen Hemminger, Wolfram Sang, linux-kernel

Signed-off-by: Joe Perches <joe@perches.com>
---
 Documentation/SubmittingPatches |   17 +++++++++++++++++
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 689e237..bb318f9 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -176,6 +176,23 @@ discussed should the patch then be submitted to Linus.
 
 6) Select your CC (e-mail carbon copy) list.
 
+Use
+	"./scripts/get_maintainer.pl --nogit --roles <patch>"
+to find the appropriate maintainers and mailing lists to CC.
+
+If the output email addresses doesn't show a "supporter" or "maintainer",
+you can expand the search to include git history with:
+
+	"./scripts/get_maintainer.pl --roles <patch>"
+
+Searching git history can often list individuals that may not be
+particularly interested in your patch, so use the list of names
+judiciously.
+
+./scripts/get_maintainer.pl options "--rolestats" and "--git-blame"
+can show a bit more information and might help you select the appropriate
+parties to CC.
+
 Unless you have a reason NOT to do so, CC linux-kernel@vger.kernel.org.
 
 Other kernel developers besides Linus need to be aware of your change,



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10  9:53   ` Mark Brown
  2010-09-10 10:04     ` Joe Perches
@ 2010-09-11 21:22     ` Joe Perches
  1 sibling, 0 replies; 62+ messages in thread
From: Joe Perches @ 2010-09-11 21:22 UTC (permalink / raw)
  To: Mark Brown
  Cc: florian, Andrew Morton, Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Fri, 2010-09-10 at 10:53 +0100, Mark Brown wrote:
> On Fri, Sep 10, 2010 at 02:42:47AM -0700, Joe Perches wrote:
> > As the coverage of file patterns in MAINTAINERS is
> > now pretty good, the --git results are less useful.
> It's pretty good for subsystems and PC drivers but for embedded stuff it
> is much more patchy and fairly unlikely to find someone who actually
> knows about the specific hardware.
> > I'd much rather turn off --git as a default, or change
> > get_maintainers to add git signers only when there's no
> > specifically listed maintainer for a particular file.
> OTOH I'd still be somewhat in favour of this purely on the basis that
> the default results have way too many false positives.

Maybe something like this?

Add --git-fallback default enabled to add commit signers
only when there's no specifically listed maintainer.

Change --git default to off

 scripts/get_maintainer.pl |   41 +++++++++++++++++++++++------------------
 1 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index a91ae63..0c4de78 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -24,9 +24,10 @@ my $email_maintainer = 1;
 my $email_list = 1;
 my $email_subscriber_list = 0;
 my $email_git_penguin_chiefs = 0;
-my $email_git = 1;
+my $email_git = 0;
 my $email_git_all_signature_types = 0;
 my $email_git_blame = 0;
+my $email_git_fallback = 1;
 my $email_git_min_signatures = 1;
 my $email_git_max_maintainers = 5;
 my $email_git_min_percent = 5;
@@ -138,6 +139,7 @@ if (!GetOptions(
 		'git!' => \$email_git,
 		'git-all-signature-types!' => \$email_git_all_signature_types,
 		'git-blame!' => \$email_git_blame,
+		'git-fallback!' => \$email_git_fallback,
 		'git-chief-penguins!' => \$email_git_penguin_chiefs,
 		'git-min-signatures=i' => \$email_git_min_signatures,
 		'git-max-maintainers=i' => \$email_git_max_maintainers,
@@ -371,6 +373,7 @@ my @status = ();
 foreach my $file (@files) {
 
     my %hash;
+    my $has_exact_pd = 0;
     my $tvi = find_first_section();
     while ($tvi < @typevalue) {
 	my $start = find_starting_index($tvi);
@@ -405,6 +408,7 @@ foreach my $file (@files) {
 			    my $value_pd = ($value =~ tr@/@@);
 			    my $file_pd = ($file  =~ tr@/@@);
 			    $value_pd++ if (substr($value,-1,1) ne "/");
+			    $has_exact_pd = 1 if ($file_pd == $value_pd);
 			    if ($pattern_depth == 0 ||
 				(($file_pd - $value_pd) < $pattern_depth)) {
 				$hash{$tvi} = $value_pd;
@@ -420,26 +424,26 @@ foreach my $file (@files) {
 
     foreach my $line (sort {$hash{$b} <=> $hash{$a}} keys %hash) {
 	add_categories($line);
-	    if ($sections) {
-		my $i;
-		my $start = find_starting_index($line);
-		my $end = find_ending_index($line);
-		for ($i = $start; $i < $end; $i++) {
-		    my $line = $typevalue[$i];
-		    if ($line =~ /^[FX]:/) {		##Restore file patterns
-			$line =~ s/([^\\])\.([^\*])/$1\?$2/g;
-			$line =~ s/([^\\])\.$/$1\?/g;	##Convert . back to ?
-			$line =~ s/\\\./\./g;       	##Convert \. to .
-			$line =~ s/\.\*/\*/g;       	##Convert .* to *
-		    }
-		    $line =~ s/^([A-Z]):/$1:\t/g;
-		    print("$line\n");
+	if ($sections) {
+	    my $i;
+	    my $start = find_starting_index($line);
+	    my $end = find_ending_index($line);
+	    for ($i = $start; $i < $end; $i++) {
+		my $line = $typevalue[$i];
+		if ($line =~ /^[FX]:/) {		##Restore file patterns
+		    $line =~ s/([^\\])\.([^\*])/$1\?$2/g;
+		    $line =~ s/([^\\])\.$/$1\?/g;	##Convert . back to ?
+		    $line =~ s/\\\./\./g;       	##Convert \. to .
+		    $line =~ s/\.\*/\*/g;       	##Convert .* to *
 		}
-		print("\n");
+		$line =~ s/^([A-Z]):/$1:\t/g;
+		print("$line\n");
 	    }
+	    print("\n");
+	}
     }
 
-    if ($email && $email_git) {
+    if ($email && ($email_git || ($email_git_fallback && !$has_exact_pd))) {
 	vcs_file_signoffs($file);
     }
 
@@ -540,6 +544,7 @@ MAINTAINER field selection options:
     --git => include recent git \*-by: signers
     --git-all-signature-types => include signers regardless of signature type
         or use only ${signaturePattern} signers (default: $email_git_all_signature_types)
+    --git-fallback => use git when no exact MAINTAINERS pattern (default: $email_git_fallback)
     --git-chief-penguins => include ${penguin_chiefs}
     --git-min-signatures => number of signatures required (default: $email_git_min_signatures)
     --git-max-maintainers => maximum maintainers to add (default: $email_git_max_maintainers)
@@ -568,7 +573,7 @@ Output type options:
 Other options:
   --pattern-depth => Number of pattern directory traversals (default: 0 (all))
   --keywords => scan patch for keywords (default: 1 (on))
-  --sections => print the entire subsystem sections with pattern matches
+  --sections => only print the entire subsystem sections with pattern matches
   --version => show version
   --help => show this help information
 



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V2] Documentation/SubmittingPatches: Add and describe scripts/get_maintainer.pl
  2010-09-11 19:43                     ` [PATCH V2] " Joe Perches
@ 2010-09-12 16:18                       ` Florian Mickler
  0 siblings, 0 replies; 62+ messages in thread
From: Florian Mickler @ 2010-09-12 16:18 UTC (permalink / raw)
  To: Joe Perches
  Cc: Andrew Morton, Mark Brown, Stephen Hemminger, Wolfram Sang, linux-kernel

On Sat, 11 Sep 2010 12:43:31 -0700
Joe Perches <joe@perches.com> wrote:

> Signed-off-by: Joe Perches <joe@perches.com>
> ---
>  Documentation/SubmittingPatches |   17 +++++++++++++++++
>  1 files changed, 17 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
> index 689e237..bb318f9 100644
> --- a/Documentation/SubmittingPatches
> +++ b/Documentation/SubmittingPatches
> @@ -176,6 +176,23 @@ discussed should the patch then be submitted to Linus.
>  
>  6) Select your CC (e-mail carbon copy) list.
>  
> +Use
> +	"./scripts/get_maintainer.pl --nogit --roles <patch>"
> +to find the appropriate maintainers and mailing lists to CC.
> +
> +If the output email addresses doesn't show a "supporter" or "maintainer",
> +you can expand the search to include git history with:
> +
> +	"./scripts/get_maintainer.pl --roles <patch>"
> +
> +Searching git history can often list individuals that may not be
> +particularly interested in your patch, so use the list of names
> +judiciously.
> +
> +./scripts/get_maintainer.pl options "--rolestats" and "--git-blame"
> +can show a bit more information and might help you select the appropriate
> +parties to CC.
> +
>  Unless you have a reason NOT to do so, CC linux-kernel@vger.kernel.org.
>  
>  Other kernel developers besides Linus need to be aware of your change,
> 
> 

That is more to the point. I.e. it tackles the same problem
as my patch: The communication disconnect between sender and receiver.

But I'm a bit sceptical at how efficient this is. 

Cheers,
Flo


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-10 10:30   ` Florian Mickler
  2010-09-10 11:04     ` Mark Brown
  2010-09-10 21:04     ` Andrew Morton
@ 2010-09-13  4:01     ` Valdis.Kletnieks
  2010-09-13  5:21       ` [PATCH] get_maintainer.pl: Look for .get_maintainer.conf in lk, then $HOME then scripts Joe Perches
  2 siblings, 1 reply; 62+ messages in thread
From: Valdis.Kletnieks @ 2010-09-13  4:01 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Joe Perches, Andrew Morton,
	Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 529 bytes --]

On Fri, 10 Sep 2010 12:30:40 +0200, Florian Mickler said:

> Your mileage may vary of course, but I can't see any harm done by this
> change. Anyone familiar with get_maintainer.pl can always just put
> "--nogit" in his .get_maintainer.conf file. 

Any chance of getting that to be ~/.get_maintainer.conf rather than
./.get_maintainer.conf? I've just gotten bit like the 3rd or 4th time by "oh but
you didn't create that file in *this* tree" (I usually have a linus git tree, a linux-next
tree, and 3-4 -mm trees lying around).


[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH] get_maintainer.pl: Look for .get_maintainer.conf in lk, then $HOME then scripts
  2010-09-13  4:01     ` Valdis.Kletnieks
@ 2010-09-13  5:21       ` Joe Perches
  2010-09-13  6:13         ` Florian Mickler
  2010-09-13 13:21         ` Valdis.Kletnieks
  0 siblings, 2 replies; 62+ messages in thread
From: Joe Perches @ 2010-09-13  5:21 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Florian Mickler, Andrew Morton,
	Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Mon, 2010-09-13 at 00:01 -0400, Valdis.Kletnieks@vt.edu wrote:
> Any chance of getting that to be ~/.get_maintainer.conf rather than
> ./.get_maintainer.conf? I've just gotten bit like the 3rd or 4th time by "oh but
> you didn't create that file in *this* tree" (I usually have a linus git tree, a linux-next
> tree, and 3-4 -mm trees lying around).

Sure.

Signed-off-by: Joe Perches <joe@perches.com>
---
diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index b228198..ea54561 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -107,10 +107,12 @@ my %VCS_cmds_hg = (
     "blame_commit_pattern" => "^([0-9a-f]+):"
 );
 
-if (-f "${lk_path}.get_maintainer.conf") {
+my $conf = which_conf(".get_maintainer.conf");
+if (-f $conf) {
     my @conf_args;
-    open(my $conffile, '<', "${lk_path}.get_maintainer.conf")
-	or warn "$P: Can't open .get_maintainer.conf: $!\n";
+    open(my $conffile, '<', "$conf")
+	or warn "$P: Can't find a readable .get_maintainer.conf file $!\n";
+
     while (<$conffile>) {
 	my $line = $_;
 
@@ -952,6 +954,18 @@ sub which {
     return "";
 }
 
+sub which_conf {
+    my ($conf) = @_;
+
+    foreach my $path (split(/:/, ".:$ENV{HOME}:.scripts")) {
+	if (-e "$path/$conf") {
+	    return "$path/$conf";
+	}
+    }
+
+    return "";
+}
+
 sub mailmap {
     my (@lines) = @_;
     my %hash;



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: Look for .get_maintainer.conf in lk, then $HOME then scripts
  2010-09-13  5:21       ` [PATCH] get_maintainer.pl: Look for .get_maintainer.conf in lk, then $HOME then scripts Joe Perches
@ 2010-09-13  6:13         ` Florian Mickler
  2010-09-13 13:21         ` Valdis.Kletnieks
  1 sibling, 0 replies; 62+ messages in thread
From: Florian Mickler @ 2010-09-13  6:13 UTC (permalink / raw)
  To: Joe Perches
  Cc: Valdis.Kletnieks, Andrew Morton,
	Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Sun, 12 Sep 2010 22:21:39 -0700
Joe Perches <joe@perches.com> wrote:

> On Mon, 2010-09-13 at 00:01 -0400, Valdis.Kletnieks@vt.edu wrote:
> > Any chance of getting that to be ~/.get_maintainer.conf rather than
> > ./.get_maintainer.conf? I've just gotten bit like the 3rd or 4th time by "oh but
> > you didn't create that file in *this* tree" (I usually have a linus git tree, a linux-next
> > tree, and 3-4 -mm trees lying around).
> 
> Sure.
> 
> Signed-off-by: Joe Perches <joe@perches.com>
> ---
> diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
> index b228198..ea54561 100755
> --- a/scripts/get_maintainer.pl
> +++ b/scripts/get_maintainer.pl
> @@ -107,10 +107,12 @@ my %VCS_cmds_hg = (
>      "blame_commit_pattern" => "^([0-9a-f]+):"
>  );
>  
> -if (-f "${lk_path}.get_maintainer.conf") {
> +my $conf = which_conf(".get_maintainer.conf");
> +if (-f $conf) {
>      my @conf_args;
> -    open(my $conffile, '<', "${lk_path}.get_maintainer.conf")
> -	or warn "$P: Can't open .get_maintainer.conf: $!\n";
> +    open(my $conffile, '<', "$conf")
> +	or warn "$P: Can't find a readable .get_maintainer.conf file $!\n";
> +
>      while (<$conffile>) {
>  	my $line = $_;
>  
> @@ -952,6 +954,18 @@ sub which {
>      return "";
>  }
>  
> +sub which_conf {
> +    my ($conf) = @_;
> +
> +    foreach my $path (split(/:/, ".:$ENV{HOME}:.scripts")) {
> +	if (-e "$path/$conf") {
> +	    return "$path/$conf";
> +	}
> +    }
> +
> +    return "";
> +}
> +
>  sub mailmap {
>      my (@lines) = @_;
>      my %hash;
> 
> 

What about just using an array?

From: Florian Mickler <florian@mickler.org>
Date: Mon, 13 Sep 2010 07:57:29 +0200
Subject: [PATCH] get_maintainer.pl: read also ~/.get_maintainer.conf

Read first .get_maintainer.conf from current source tree, then from
home.
This means that cmdline switches trump source-tree-config which trump
~/.get_maintainer.conf.

Signed-off-by: Florian Mickler <florian@mickler.org>
---
 scripts/get_maintainer.pl |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index b228198..937da0b 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -107,10 +107,15 @@ my %VCS_cmds_hg = (
     "blame_commit_pattern" => "^([0-9a-f]+):"
 );
 
-if (-f "${lk_path}.get_maintainer.conf") {
+my @configfiles = ("${lk_path}.get_maintainer.conf",
"$ENV{ HOME }/.get_maintainer.conf"); +while (my $cf =
shift(@configfiles)){ my @conf_args;
-    open(my $conffile, '<', "${lk_path}.get_maintainer.conf")
-	or warn "$P: Can't open .get_maintainer.conf: $!\n";
+
+    next unless (-f $cf);
+
+    open(my $conffile, '<', $cf)
+	or warn "$P: Can't open $cf: $!\n";
+
     while (<$conffile>) {
 	my $line = $_;
 
-- 
1.7.2




^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-11  9:28         ` Florian Mickler
@ 2010-09-13  7:16           ` Eric W. Biederman
  2010-09-13  7:57             ` Joe Perches
  2010-09-13  9:01             ` Florian Mickler
  0 siblings, 2 replies; 62+ messages in thread
From: Eric W. Biederman @ 2010-09-13  7:16 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Christoph Hellwig, Joe Perches, Andrew Morton,
	Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

Florian Mickler <florian@mickler.org> writes:

> On Fri, 10 Sep 2010 20:45:51 -0400
> Christoph Hellwig <hch@infradead.org> wrote:
>
>> On Fri, Sep 10, 2010 at 05:31:14PM -0700, Joe Perches wrote:
>> > If you don't like it, you could use a .get_maintainer.conf
>> > file entry until some consensus is reached later.
>> 
>> I don't use the pice of junk at all.  What really pisses me off is that
>> I get spamed by all by tons of utterly useless cleanups patches (and
>> some useful but not at all interesting to me patches) because some
>> idiot made the bloody script return my address just because I touched a
>> file.  If it goes on like that I'll simply have to add a mail filter to
>> throw away all patches sent directly to me instead of a mailinglist.
>> 
>
> But would you also have the same attitude towards these specific
> 'spams' if it were made clear in the email adress field that you were
> cc'd as a commit_signer  (vs maintainer)?
>
> I know it is a bit hackisch to have stuff in the name field, but it is
> rfc conform, it is even proposed for redirected email '(by way
> of...)'. 

It is trivial for a human to look at a git log and see which changes
were just global cleanups and which changes were actual maintenance.
Apparently get_maintainers doesn't have that ability.

Have seen some files with something like 5 years of changes without a
single commit by a maintainer and the only changes happening to it are
global cleanup changes.

If get_maintainers would look at MAINTAINERS and validate or invalidate
that information by looking at git that would be useful.

If get_maintainers can't validate what in MAINTAINERS perhaps it can
suggest that the author read the git log to get an idea of who has been
working on the code lately.

I guess what would be very good would be if get_maintainers would
present the information from the git log not as a fact but instead as a
guess.  Presenting people who only made a single change to a file on
equal footing with people were involved with 95% of the changes to a
file gets silly.

For people who have just a few commits in a file I expect presenting the
commit id's and subject lines of the patches they have touched so a
person with a brain can easily filter out the nonsense hits would be
very useful.

At this point I agree with Christoph being CC'd on every mips syscall
change on every architecture for the rest of my life because
get_maintainers.pl is too stupid to even indicate to the user of the
script that it might have made a mistake is silly.

Adding an "ignore me" tag to email addresses generated by
get_maintainer.pl would only encourage people to filter those emails
when they get them, which is not what we want.  Let's see if we can make
the script smart enough so people sending patches can use their own
intelligence to keep from asking people to look at things that they have
utterly no interest in.

This whole thing reminds me of a quote my father used to have hanging
in his study: "To err is human. To really foul things up use a computer"

Eric

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-13  7:16           ` Eric W. Biederman
@ 2010-09-13  7:57             ` Joe Perches
  2010-09-13  8:54               ` Florian Mickler
  2010-09-13  9:01             ` Florian Mickler
  1 sibling, 1 reply; 62+ messages in thread
From: Joe Perches @ 2010-09-13  7:57 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Florian Mickler, Christoph Hellwig, Stephen Hemminger,
	Andrew Morton, linux-kernel, Wolfram Sang

On Mon, 2010-09-13 at 00:16 -0700, Eric W. Biederman wrote:
> It is trivial for a human to look at a git log and see which changes
> were just global cleanups and which changes were actual maintenance.
> Apparently get_maintainers doesn't have that ability.

Do you have a useful, trivial or non-trivial algorithm
to suggest or is that soft commenting?  All I'll say is
AI can be a surprisingly difficult field.

> Have seen some files with something like 5 years of changes without a
> single commit by a maintainer and the only changes happening to it are
> global cleanup changes.

Then likely there's no actual maintainer for that file.

> If get_maintainers would look at MAINTAINERS and validate or invalidate
> that information by looking at git that would be useful.

Some entries in MAINTAINERS are outdated.
Validating MAINTAINERS entries is probably best done once.

I suggest you try that concept out, see what you get, and
make public the results.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-13  7:57             ` Joe Perches
@ 2010-09-13  8:54               ` Florian Mickler
  2010-09-14 17:19                 ` Eric W. Biederman
  0 siblings, 1 reply; 62+ messages in thread
From: Florian Mickler @ 2010-09-13  8:54 UTC (permalink / raw)
  To: Joe Perches, Eric W. Biederman
  Cc: Christoph Hellwig, Stephen Hemminger, Andrew Morton,
	linux-kernel, Wolfram Sang

On Mon, 13 Sep 2010 00:57:45 -0700
Joe Perches <joe@perches.com> wrote:

> On Mon, 2010-09-13 at 00:16 -0700, Eric W. Biederman wrote:
> > It is trivial for a human to look at a git log and see which changes
> > were just global cleanups and which changes were actual maintenance.
> > Apparently get_maintainers doesn't have that ability.
> 
> Do you have a useful, trivial or non-trivial algorithm
> to suggest or is that soft commenting?  All I'll say is
> AI can be a surprisingly difficult field.

:) indeed. 

> 
> > Have seen some files with something like 5 years of changes without a
> > single commit by a maintainer and the only changes happening to it are
> > global cleanup changes.
> 
> Then likely there's no actual maintainer for that file.

and which means that get_maintainer.pl --git will output either nothing
(if we somehow get its heuristics to filter correctly) or wrong people. 

> 
> > If get_maintainers would look at MAINTAINERS and validate or invalidate
> > that information by looking at git that would be useful.
> 
> Some entries in MAINTAINERS are outdated.
> Validating MAINTAINERS entries is probably best done once.
> 
> I suggest you try that concept out, see what you get, and
> make public the results.

It is easy to make get_maintainer.pl output less people. 
What is not easy is to get it to decrease false-positives while
not decreasing it's detection rate. 

As far as I can see, Andrew is in favor of not caring about
false-positives in order to not sacrifice the detection rate of the
tool. 

My approach tried to lower the impact of false positives by allowing
people to filter between "cc'd as maintainer" and "cc'd as
commit_signer".  The former is pretty much never a false positive (as
long as MAINTAINERS is up to date) while the latter is more of a
hit'n'miss kind of method. 

Don't know. 

Regards,
Flo


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-13  7:16           ` Eric W. Biederman
  2010-09-13  7:57             ` Joe Perches
@ 2010-09-13  9:01             ` Florian Mickler
  2010-09-14 17:24               ` Eric W. Biederman
  1 sibling, 1 reply; 62+ messages in thread
From: Florian Mickler @ 2010-09-13  9:01 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Christoph Hellwig, Joe Perches, Andrew Morton,
	Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

On Mon, 13 Sep 2010 00:16:30 -0700
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Florian Mickler <florian@mickler.org> writes:
> 

> Adding an "ignore me" tag to email addresses generated by
> get_maintainer.pl would only encourage people to filter those emails
> when they get them, which is not what we want. 

If people get angry, we want them to have those emails filtered away.  

I for one, would not filter them away. I doubt many people will filter
them away. Postpone? Maybe. But filtering them to > /dev/null? I doubt
it. And if they do, it is their choice.

If you care about cc'ing someone deeply, you just put him manually on the
cc list and thus bypass any get_maintainer.pl-provoked filtering.



> This whole thing reminds me of a quote my father used to have hanging
> in his study: "To err is human. To really foul things up use a computer"

:)

> 
> Eric

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: Look for .get_maintainer.conf in lk, then $HOME then scripts
  2010-09-13  5:21       ` [PATCH] get_maintainer.pl: Look for .get_maintainer.conf in lk, then $HOME then scripts Joe Perches
  2010-09-13  6:13         ` Florian Mickler
@ 2010-09-13 13:21         ` Valdis.Kletnieks
  1 sibling, 0 replies; 62+ messages in thread
From: Valdis.Kletnieks @ 2010-09-13 13:21 UTC (permalink / raw)
  To: Joe Perches
  Cc: Florian Mickler, Andrew Morton,
	Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 610 bytes --]

On Sun, 12 Sep 2010 22:21:39 PDT, Joe Perches said:
> On Mon, 2010-09-13 at 00:01 -0400, Valdis.Kletnieks@vt.edu wrote:
> > Any chance of getting that to be ~/.get_maintainer.conf rather than
> > ./.get_maintainer.conf? I've just gotten bit like the 3rd or 4th time by "oh but
> > you didn't create that file in *this* tree" (I usually have a linus git tree, a linux-next
> > tree, and 3-4 -mm trees lying around).
> 
> Sure.
> 
> Signed-off-by: Joe Perches <joe@perches.com>
> ---
> diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
> index b228198..ea54561 100755

Thanks dude, you rock. :)

[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-13  8:54               ` Florian Mickler
@ 2010-09-14 17:19                 ` Eric W. Biederman
  2010-09-14 17:46                   ` Florian Mickler
  2010-09-14 23:15                   ` Joe Perches
  0 siblings, 2 replies; 62+ messages in thread
From: Eric W. Biederman @ 2010-09-14 17:19 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Joe Perches, Christoph Hellwig, Stephen Hemminger, Andrew Morton,
	linux-kernel, Wolfram Sang

Florian Mickler <florian@mickler.org> writes:

> On Mon, 13 Sep 2010 00:57:45 -0700
> Joe Perches <joe@perches.com> wrote:
>
>> On Mon, 2010-09-13 at 00:16 -0700, Eric W. Biederman wrote:
>> > It is trivial for a human to look at a git log and see which changes
>> > were just global cleanups and which changes were actual maintenance.
>> > Apparently get_maintainers doesn't have that ability.
>> 
>> Do you have a useful, trivial or non-trivial algorithm
>> to suggest or is that soft commenting?  All I'll say is
>> AI can be a surprisingly difficult field.
>
> :) indeed. 

Which is why the tool needs to assist a person in doing the work.
Please deliver a tool and not a broken solution.

>> > Have seen some files with something like 5 years of changes without a
>> > single commit by a maintainer and the only changes happening to it are
>> > global cleanup changes.
>> 
>> Then likely there's no actual maintainer for that file.
>
> and which means that get_maintainer.pl --git will output either nothing
> (if we somehow get its heuristics to filter correctly) or wrong people. 
>
>> 
>> > If get_maintainers would look at MAINTAINERS and validate or invalidate
>> > that information by looking at git that would be useful.
>> 
>> Some entries in MAINTAINERS are outdated.
>> Validating MAINTAINERS entries is probably best done once.
>> 
>> I suggest you try that concept out, see what you get, and
>> make public the results.
>
> It is easy to make get_maintainer.pl output less people. 
> What is not easy is to get it to decrease false-positives while
> not decreasing it's detection rate. 

What is needed is something other than output that is a list of
email addresses.

email address foo had x% of non-author signed off bys
email address foo had y% of author signed off bys
email address foo had y% of author commits.
email address foo came from the Maintainers file.

Additionally for email addresses that hit less often a list
of patch subject titles, and truncated sha1 patch ids.  So
with luck you can tell at a glance the person is of interest
and if not you can look at their commits quickly and see.

That is all pretty trivial, it should be fast and it should with
a little care let the bogus results be filtered out quickly.

> As far as I can see, Andrew is in favor of not caring about
> false-positives in order to not sacrifice the detection rate of the
> tool. 

Which means in time every long time developer will be copied on every
patch.  That is what we have lkml for.  I don't have a problem with the
tool returning false positives.  I do have a problem with the tool
taking away the ability and the responsibility of developers to pay
attention to which human beings they are sending their patches to.

I don't want the tool to do the filtering.  I want the tool to give
enough information that the person using the tool can get a feel for the
development history of the affected files and suggestions with a couple
of metrics how useful someone is when Cc'd on a commit.

> My approach tried to lower the impact of false positives by allowing
> people to filter between "cc'd as maintainer" and "cc'd as
> commit_signer".  The former is pretty much never a false positive (as
> long as MAINTAINERS is up to date) while the latter is more of a
> hit'n'miss kind of method. 

And right now get_maintainer.pl is decreasing the relevancy of cc lines
in commits, which if get_maintainers.pl is used enough could be a
vicious circle.

The problem as I see it is you present of a list of email addresses
without enough information for someone using the tool to guess how
accurate the results are.

Eric


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-13  9:01             ` Florian Mickler
@ 2010-09-14 17:24               ` Eric W. Biederman
  0 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2010-09-14 17:24 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Christoph Hellwig, Joe Perches, Andrew Morton,
	Stephen Hemminger (role:commit_signer),
	Wolfram Sang (role:commit_signer),
	linux-kernel

Florian Mickler <florian@mickler.org> writes:

> On Mon, 13 Sep 2010 00:16:30 -0700
> ebiederm@xmission.com (Eric W. Biederman) wrote:
>
>> Florian Mickler <florian@mickler.org> writes:
>> 
>
>> Adding an "ignore me" tag to email addresses generated by
>> get_maintainer.pl would only encourage people to filter those emails
>> when they get them, which is not what we want. 
>
> If people get angry, we want them to have those emails filtered away.

What we want is a better signal to noise ratio so people don't feel
compelled to filter or get angry.

What we really don't want are additional penalties for being a prolific
kernel developer, we have enough of those already.

Eric

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-14 17:19                 ` Eric W. Biederman
@ 2010-09-14 17:46                   ` Florian Mickler
  2010-09-15  3:28                     ` Joe Perches
  2010-09-14 23:15                   ` Joe Perches
  1 sibling, 1 reply; 62+ messages in thread
From: Florian Mickler @ 2010-09-14 17:46 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Joe Perches, Christoph Hellwig, Stephen Hemminger, Andrew Morton,
	linux-kernel, Wolfram Sang

On Tue, 14 Sep 2010 10:19:33 -0700
ebiederm@xmission.com (Eric W. Biederman) wrote:
> Florian Mickler <florian@mickler.org> writes:
>



> What is needed is something other than output that is a list of
> email addresses.
> 
> email address foo had x% of non-author signed off bys
> email address foo had y% of author signed off bys
> email address foo had y% of author commits.
> email address foo came from the Maintainers file.

Currently get_maintainer.pl only does signed-off-by counting, it doesn't
take authorship in account, IIRC. That is a good point. It's
information that is easily available. 

> 
> Additionally for email addresses that hit less often a list
> of patch subject titles, and truncated sha1 patch ids.  So
> with luck you can tell at a glance the person is of interest
> and if not you can look at their commits quickly and see.

An interactive mode in git shortlog form of the last year should be
possible, i guess. 

I wonder, if git send-email --cc-cmd allows for directing input towards
get_maintainer.pl. That would be awesome.

> 
> That is all pretty trivial, it should be fast and it should with
> a little care let the bogus results be filtered out quickly.
> 
> > As far as I can see, Andrew is in favor of not caring about
> > false-positives in order to not sacrifice the detection rate of the
> > tool. 
> 
> Which means in time every long time developer will be copied on every
> patch.  That is what we have lkml for.  I don't have a problem with the
> tool returning false positives.  I do have a problem with the tool
> taking away the ability and the responsibility of developers to pay
> attention to which human beings they are sending their patches to.

Fair enough. It does a one year cut-off for the history. But maybe that
is not the best approach. 

> 
> I don't want the tool to do the filtering.  I want the tool to give
> enough information that the person using the tool can get a feel for the
> development history of the affected files and suggestions with a couple
> of metrics how useful someone is when Cc'd on a commit.

I think this is a good approach. 

> 
> > My approach tried to lower the impact of false positives by allowing
> > people to filter between "cc'd as maintainer" and "cc'd as
> > commit_signer".  The former is pretty much never a false positive (as
> > long as MAINTAINERS is up to date) while the latter is more of a
> > hit'n'miss kind of method. 
> 
> And right now get_maintainer.pl is decreasing the relevancy of cc lines
> in commits, which if get_maintainers.pl is used enough could be a
> vicious circle.
> 
> The problem as I see it is you present of a list of email addresses
> without enough information for someone using the tool to guess how
> accurate the results are.

Yes. I guess my patch adresses that somewhat, as it puts more
information in the output by default. But it only uses the information
already present in the script. 

Regards,
Flo

> 
> Eric
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-14 17:19                 ` Eric W. Biederman
  2010-09-14 17:46                   ` Florian Mickler
@ 2010-09-14 23:15                   ` Joe Perches
  1 sibling, 0 replies; 62+ messages in thread
From: Joe Perches @ 2010-09-14 23:15 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Florian Mickler, Christoph Hellwig, Stephen Hemminger,
	Andrew Morton, linux-kernel, Wolfram Sang

On Tue, 2010-09-14 at 10:19 -0700, Eric W. Biederman wrote:
> Florian Mickler <florian@mickler.org> writes:
> > On Mon, 13 Sep 2010 00:57:45 -0700
> > Joe Perches <joe@perches.com> wrote:
> >> On Mon, 2010-09-13 at 00:16 -0700, Eric W. Biederman wrote:
> >> > It is trivial for a human to look at a git log and see which changes
> >> > were just global cleanups and which changes were actual maintenance.
> >> > Apparently get_maintainers doesn't have that ability.
> >> Do you have a useful, trivial or non-trivial algorithm
> >> to suggest or is that soft commenting?  All I'll say is
> >> AI can be a surprisingly difficult field.
> > :) indeed. 
> Which is why the tool needs to assist a person in doing the work.
> Please deliver a tool and not a broken solution.

Please suggest or implement improvements instead of merely asking
for something better.

> > It is easy to make get_maintainer.pl output less people. 
> > What is not easy is to get it to decrease false-positives while
> > not decreasing it's detection rate. 
> What is needed is something other than output that is a list of
> email addresses.

> email address foo had x% of non-author signed off bys
> email address foo had y% of author signed off bys
> email address foo had y% of author commits.
> email address foo came from the Maintainers file.

Already implemented.

Use --git --rolestats

> Additionally for email addresses that hit less often a list
> of patch subject titles, and truncated sha1 patch ids.  So
> with luck you can tell at a glance the person is of interest
> and if not you can look at their commits quickly and see.

Use git.  There's no reason to reimplement what git does.

> Which means in time every long time developer will be copied on every
> patch.

No, it doesn't mean that.  Inactive developers don't get listed
by the tool.

> And right now get_maintainer.pl is decreasing the relevancy of cc lines
> in commits, which if get_maintainers.pl is used enough could be a
> vicious circle.

I don't see that get_maintainer does what you suggest it does.
Find some examples and publish them.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-14 17:46                   ` Florian Mickler
@ 2010-09-15  3:28                     ` Joe Perches
  2010-09-15  4:34                       ` Florian Mickler
  0 siblings, 1 reply; 62+ messages in thread
From: Joe Perches @ 2010-09-15  3:28 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Eric W. Biederman, Christoph Hellwig, Stephen Hemminger,
	Andrew Morton, linux-kernel, Wolfram Sang

On Tue, 2010-09-14 at 19:46 +0200, Florian Mickler wrote:
> On Tue, 14 Sep 2010 10:19:33 -0700
> ebiederm@xmission.com (Eric W. Biederman) wrote:
> > Florian Mickler <florian@mickler.org> writes:
> >
> 
> 
> 
> > What is needed is something other than output that is a list of
> > email addresses.
> > 
> > email address foo had x% of non-author signed off bys
> > email address foo had y% of author signed off bys
> > email address foo had y% of author commits.
> > email address foo came from the Maintainers file.
> 
> Currently get_maintainer.pl only does signed-off-by counting, it doesn't
> take authorship in account, IIRC. That is a good point. It's
> information that is easily available. 
> 
> > 
> > Additionally for email addresses that hit less often a list
> > of patch subject titles, and truncated sha1 patch ids.  So
> > with luck you can tell at a glance the person is of interest
> > and if not you can look at their commits quickly and see.
> 
> An interactive mode in git shortlog form of the last year should be
> possible, i guess. 
> 
> I wonder, if git send-email --cc-cmd allows for directing input towards
> get_maintainer.pl. That would be awesome.

Yes it does.  You could use any script or direct command
with arguments you like.

I sent you privately a script that does that last week.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-15  3:28                     ` Joe Perches
@ 2010-09-15  4:34                       ` Florian Mickler
  2010-09-15  4:45                         ` Joe Perches
  0 siblings, 1 reply; 62+ messages in thread
From: Florian Mickler @ 2010-09-15  4:34 UTC (permalink / raw)
  To: Joe Perches
  Cc: Eric W. Biederman, Christoph Hellwig, Stephen Hemminger,
	Andrew Morton, linux-kernel, Wolfram Sang

On Tue, 14 Sep 2010 20:28:44 -0700
Joe Perches <joe@perches.com> wrote:

> On Tue, 2010-09-14 at 19:46 +0200, Florian Mickler wrote:
> > On Tue, 14 Sep 2010 10:19:33 -0700
> > ebiederm@xmission.com (Eric W. Biederman) wrote:
> > > Florian Mickler <florian@mickler.org> writes:
> > An interactive mode in git shortlog form of the last year should be
> > possible, i guess. 
> > 
> > I wonder, if git send-email --cc-cmd allows for directing input towards
> > get_maintainer.pl. That would be awesome.
> 
> Yes it does.  You could use any script or direct command
> with arguments you like.
> 
> I sent you privately a script that does that last week.

I think we misunderstand each other. 

For the reference, you talk about this snippet? Or did I overlook smth?

$ cat scripts/send-email-listed-MAINTAINERS-only
#!/bin/bash
if [[ $(basename $1) =~ ^0000- ]] ; then
    ./scripts/get_maintainer.pl --nogit $(dirname $1)/*
else
    ./scripts/get_maintainer.pl --nogit $1
fi


But for the interactive mode I had STDIN in mind. 
And now that I checked, it is indeed possible. I have a quick prove of
concept for an get_maintainer.pl --interactive mode that shows you the
cc's, lists their commits  and let's you choose whom to add to the cc
(using STDERR, as git-send-email.pl obviously expects adresses on
STDOUT).
I haven't experimented much with it, but if it works out expect a
patch. But first I need some coffee and breakfast, as it's morning
over here. :-)

Regards,
Flo



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-15  4:34                       ` Florian Mickler
@ 2010-09-15  4:45                         ` Joe Perches
  2010-09-15 12:49                           ` Florian Mickler
  0 siblings, 1 reply; 62+ messages in thread
From: Joe Perches @ 2010-09-15  4:45 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Eric W. Biederman, Christoph Hellwig, Stephen Hemminger,
	Andrew Morton, linux-kernel, Wolfram Sang

On Wed, 2010-09-15 at 06:34 +0200, Florian Mickler wrote:
> But for the interactive mode I had STDIN in mind. 
> And now that I checked, it is indeed possible. I have a quick prove of
> concept for an get_maintainer.pl --interactive mode that shows you the
> cc's, lists their commits  and let's you choose whom to add to the cc
> (using STDERR, as git-send-email.pl obviously expects adresses on
> STDOUT).
> I haven't experimented much with it, but if it works out expect a
> patch. But first I need some coffee and breakfast, as it's morning
> over here. :-)

I suggest you don't need to patch get_maintainer.pl,
but should add another script that calls it and/or
uses git commands directly.

cheers, Joe



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] get_maintainer.pl: append reason for cc to the name by default
  2010-09-15  4:45                         ` Joe Perches
@ 2010-09-15 12:49                           ` Florian Mickler
  0 siblings, 0 replies; 62+ messages in thread
From: Florian Mickler @ 2010-09-15 12:49 UTC (permalink / raw)
  To: Joe Perches
  Cc: Eric W. Biederman, Christoph Hellwig, Stephen Hemminger,
	Andrew Morton, linux-kernel, Wolfram Sang

On Tue, 14 Sep 2010 21:45:49 -0700
Joe Perches <joe@perches.com> wrote:

> On Wed, 2010-09-15 at 06:34 +0200, Florian Mickler wrote:
> > But for the interactive mode I had STDIN in mind. 
> > And now that I checked, it is indeed possible. I have a quick prove of
> > concept for an get_maintainer.pl --interactive mode that shows you the
> > cc's, lists their commits  and let's you choose whom to add to the cc
> > (using STDERR, as git-send-email.pl obviously expects adresses on
> > STDOUT).
> > I haven't experimented much with it, but if it works out expect a
> > patch. But first I need some coffee and breakfast, as it's morning
> > over here. :-)
> 
> I suggest you don't need to patch get_maintainer.pl,
> but should add another script that calls it and/or
> uses git commands directly.
> 
> cheers, Joe
> 
> 

Have a look at the patch I sent seperately to lkml. 
I think it integrates well.

Regards,
Flo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* RFC: get_maintainer.pl: append reason for cc to the name by default
  2010-09-10  9:33 [PATCH] get_maintainer.pl: append reason for cc to the name by default florian
  2010-09-10  9:42 ` Joe Perches
@ 2010-09-26 18:52 ` Joe Perches
  2010-09-27 14:57   ` Florian Mickler
  1 sibling, 1 reply; 62+ messages in thread
From: Joe Perches @ 2010-09-26 18:52 UTC (permalink / raw)
  To: Ted Ts'o, Florian Mickler
  Cc: Andrew Morton, Stephen Hemminger, Wolfram Sang, linux-kernel

On Fri, 2010-09-10 at 11:33 +0200, florian@mickler.org wrote:
> The script get_maintainer.pl is a very useful tool for deploying changes
> made to the kernel. Among others it searches not only the MAINTAINERS
> file but also the git history for people to send patches to.
> 
> This can be unexpected for the receiving side and can and does provoke
> sometimes anger because it is not easy to determine if the sender did
> explicitly put the receiving side on the cc list, or if they just
> trolled the tree. The receiving side, if not used to be cc'd on many
> things will check the patch, spend time investigating what the heck they
> were cc'd just to realize, that there was no special reason.
> 
> As get_maintainer.pl is frequently used by kernel newcomers who _can_
> not know whom to cc by themself, this anger then comes as a surprise for them
> and definitely puts them in an awkward position.
> 
> By appending a  a note of the reason for the cc in the name, the reason
> becomes clear and the receiving side is relieved from feeling obliged to
> check the patch  while the sending side has a chance to adapt the
> cc'list to their liking.
> 
> But the most useful aspect of this is, IMHO, that it makes it transparent who
> just used get_maintainer.pl as a shortcut to increase his own
> patch-throughput or who really put an effort in finding or editing the
> cc'list to their likings.

How about making --rolestats the default and adding a --nodecorate
option default off?

This would mean that any current script that expects bare email
addresses would need to add --nodecorate to get the old behavior.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: RFC: get_maintainer.pl: append reason for cc to the name by default
  2010-09-26 18:52 ` RFC: " Joe Perches
@ 2010-09-27 14:57   ` Florian Mickler
  2010-09-27 15:44     ` Ted Ts'o
  0 siblings, 1 reply; 62+ messages in thread
From: Florian Mickler @ 2010-09-27 14:57 UTC (permalink / raw)
  To: Joe Perches
  Cc: Ted Ts'o, Andrew Morton, Stephen Hemminger, Wolfram Sang,
	linux-kernel

On Sun, 26 Sep 2010 11:52:05 -0700
Joe Perches <joe@perches.com> wrote:

> On Fri, 2010-09-10 at 11:33 +0200, florian@mickler.org wrote:
> > The script get_maintainer.pl is a very useful tool for deploying changes
> > made to the kernel. Among others it searches not only the MAINTAINERS
> > file but also the git history for people to send patches to.
> > 
> > This can be unexpected for the receiving side and can and does provoke
> > sometimes anger because it is not easy to determine if the sender did
> > explicitly put the receiving side on the cc list, or if they just
> > trolled the tree. The receiving side, if not used to be cc'd on many
> > things will check the patch, spend time investigating what the heck they
> > were cc'd just to realize, that there was no special reason.
> > 
> > As get_maintainer.pl is frequently used by kernel newcomers who _can_
> > not know whom to cc by themself, this anger then comes as a surprise for them
> > and definitely puts them in an awkward position.
> > 
> > By appending a  a note of the reason for the cc in the name, the reason
> > becomes clear and the receiving side is relieved from feeling obliged to
> > check the patch  while the sending side has a chance to adapt the
> > cc'list to their liking.
> > 
> > But the most useful aspect of this is, IMHO, that it makes it transparent who
> > just used get_maintainer.pl as a shortcut to increase his own
> > patch-throughput or who really put an effort in finding or editing the
> > cc'list to their likings.
> 
> How about making --rolestats the default and adding a --nodecorate
> option default off?
> 
> This would mean that any current script that expects bare email
> addresses would need to add --nodecorate to get the old behavior.
> 

Would in essence have the same effect, but I think it's slightly better
to have some shorter tags in the mail addresses, as I expect them to
actually show up on lkml quite a bit.

Regards,
Flo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: RFC: get_maintainer.pl: append reason for cc to the name by default
  2010-09-27 14:57   ` Florian Mickler
@ 2010-09-27 15:44     ` Ted Ts'o
  2010-09-27 17:00       ` Florian Mickler
  0 siblings, 1 reply; 62+ messages in thread
From: Ted Ts'o @ 2010-09-27 15:44 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Joe Perches, Andrew Morton, Stephen Hemminger, Wolfram Sang,
	linux-kernel

On Mon, Sep 27, 2010 at 04:57:48PM +0200, Florian Mickler wrote:
> 
> Would in essence have the same effect, but I think it's slightly better
> to have some shorter tags in the mail addresses, as I expect them to
> actually show up on lkml quite a bit.

What if there are no tags on mail addresses that come from the
MAINTAINERS file, but only tags on the mail addresses that come from
guessing wildly based on git sign-offs?  People should just be looking
in the MAINTAINERS file, after all, and I don't think that's something
that needs an explanation.

The thing that needs explanation is when someone like Steve Hemminger
gets cc'ed on a patch for fs/ext4/acl.c, which really makes no sense
at all, where you desperately need some kind of tag:

shemminger@vyatta.com (Wild guess using get_maintainer.pl --git)

	      	    	  			  - Ted

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: RFC: get_maintainer.pl: append reason for cc to the name by default
  2010-09-27 15:44     ` Ted Ts'o
@ 2010-09-27 17:00       ` Florian Mickler
  2010-09-27 18:21         ` Ted Ts'o
  0 siblings, 1 reply; 62+ messages in thread
From: Florian Mickler @ 2010-09-27 17:00 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Joe Perches, Andrew Morton, Stephen Hemminger, Wolfram Sang,
	linux-kernel

On Mon, 27 Sep 2010 11:44:41 -0400
Ted Ts'o <tytso@mit.edu> wrote:

> On Mon, Sep 27, 2010 at 04:57:48PM +0200, Florian Mickler wrote:
> > 
> > Would in essence have the same effect, but I think it's slightly better
> > to have some shorter tags in the mail addresses, as I expect them to
> > actually show up on lkml quite a bit.
> 
> What if there are no tags on mail addresses that come from the
> MAINTAINERS file, but only tags on the mail addresses that come from
> guessing wildly based on git sign-offs?  People should just be looking
> in the MAINTAINERS file, after all, and I don't think that's something
> that needs an explanation.
> 
> The thing that needs explanation is when someone like Steve Hemminger
> gets cc'ed on a patch for fs/ext4/acl.c, which really makes no sense
> at all, where you desperately need some kind of tag:
> 
> shemminger@vyatta.com (Wild guess using get_maintainer.pl --git)
> 
> 	      	    	  			  - Ted

Might make sense to omit the tags on the MAINTAINER-source. I can
agree to your reasoning there. 


As far as I can see, the use and the use cases for the git-part of
get_maintainer.pl are mostly to get patches to not-so-well-maintained
parts of the tree upstream. What I envision for the git part is a
scoring based classification scheme that uses all readily available
information of the git-history to determine relevant people for patch
review and patch routing.

I already have implemented a small parser that extracts that information
out of git-log and makes them available to the script, but didn't have
time to wire it up yet. 

Another improvement (beyond finding a decent heuristic based on the
artifacts 'authorship', 'signed-off-by', 'reviewed-by', 'acked-by',
'committer' and nr-of-lines-changed.. and maybe time) is probably to
not make an arbitrarily 1-Year-Back cut-off, but to check the last N
commits on that region of the tree. (I'm thinking of the more
"settled down" areas of the tree here)
But let's see what I come up with...

Regards,
Flo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: RFC: get_maintainer.pl: append reason for cc to the name by default
  2010-09-27 17:00       ` Florian Mickler
@ 2010-09-27 18:21         ` Ted Ts'o
  2010-09-27 19:26           ` Florian Mickler
  0 siblings, 1 reply; 62+ messages in thread
From: Ted Ts'o @ 2010-09-27 18:21 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Joe Perches, Andrew Morton, Stephen Hemminger, Wolfram Sang,
	linux-kernel

On Mon, Sep 27, 2010 at 07:00:26PM +0200, Florian Mickler wrote:
> 
> Another improvement (beyond finding a decent heuristic based on the
> artifacts 'authorship', 'signed-off-by', 'reviewed-by', 'acked-by',
> 'committer' and nr-of-lines-changed.. and maybe time) is probably to
> not make an arbitrarily 1-Year-Back cut-off, but to check the last N
> commits on that region of the tree. (I'm thinking of the more
> "settled down" areas of the tree here)
> But let's see what I come up with...

I'd suggest a "stop list" of keywords that would cause a commit not be
considered at all.  i.e., words like "trivial" (the trivial tree often
bypasses the subsystem maintainer, and you don't want to identify
someone as the maintainer just because they submitted a change to
change "colour" to "color"), "checkpatch", or "cleanup".

The other thing I'd suggest is try to figure out hueristics when the
git log analysis should be expanded beyond the file which was
modified, to the subdirctory.  That is, if the patch touches a file
that rarely changes except for one spelling fix in the past year, but
the subsystem or device driver has activity in other files in that
subdirectory, it would be nice if get_maintainer.pl at least _tried_
to figure out this case of (for example)
drivers/media/video/omap/omap_voutlib.c has only one change in the
past year, and doesn't have an entry in MAINTAINERS, the history of
the subdirectory drivers/media/video/omap/ might be a better thing to
use when deciding who to bug about some trivial spelling chnage in
omap_voutlib.c.

Really, though, the right answer is to keep the MAINTAINERS file up to
date enough that we don't have to resort to having scripts attempt to
solve the AI problem.  (I've argued for not even trying, but clearly
people who have tried to argue for that have lost that battle; enough
people seem to think it's worth while to make wild guesses even though
the script is called get_maintainer.pl, and not
get_maintainer_or_make_wild_stabs_in_the_dark.pl.)

						- Ted

P.S.  Wouldn't it be better to train kernel newbies how to read
through the output of git log themselves?  I'm not sure that training
people to rely blindly on dumb scripts is in the end actually going to
be doing ourselves and the whole community a service.  Sigh....

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: RFC: get_maintainer.pl: append reason for cc to the name by default
  2010-09-27 18:21         ` Ted Ts'o
@ 2010-09-27 19:26           ` Florian Mickler
  2010-09-27 20:08             ` Joe Perches
  0 siblings, 1 reply; 62+ messages in thread
From: Florian Mickler @ 2010-09-27 19:26 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Joe Perches, Andrew Morton, Stephen Hemminger, Wolfram Sang,
	linux-kernel

On Mon, 27 Sep 2010 14:21:24 -0400
Ted Ts'o <tytso@mit.edu> wrote:

> (I've argued for not even trying, but clearly
> people who have tried to argue for that have lost that battle; enough
> people seem to think it's worth while to make wild guesses even though
> the script is called get_maintainer.pl, and not
> get_maintainer_or_make_wild_stabs_in_the_dark.pl.)
> 
> 						- Ted

Actually, I think get_maintainer_or_make_wild_stabs_in_the_dark.pl is
a better strategy to ensure that a patch doesn't get ignored then
to ... well.. do nothing.
 
> P.S.  Wouldn't it be better to train kernel newbies how to read
> through the output of git log themselves?  I'm not sure that training
> people to rely blindly on dumb scripts is in the end actually going to
> be doing ourselves and the whole community a service.  Sigh....

Yes and no. Reaching out to people is something kernel newbies
need to learn too... 
I'd agree if the end result of get_maintainer.pl would be to send it to
random machines... but we actually give out
addresses of human beings, so I don't think it is _that_ bad an idea. 

But we should definitely try to tune down the annoying part of
it by making it less random and maybe by adding a tag in the cc field...




^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: RFC: get_maintainer.pl: append reason for cc to the name by default
  2010-09-27 19:26           ` Florian Mickler
@ 2010-09-27 20:08             ` Joe Perches
  2010-09-27 20:47               ` Ted Ts'o
  0 siblings, 1 reply; 62+ messages in thread
From: Joe Perches @ 2010-09-27 20:08 UTC (permalink / raw)
  To: Florian Mickler
  Cc: Ted Ts'o, Andrew Morton, Stephen Hemminger, Wolfram Sang,
	linux-kernel

On Mon, 2010-09-27 at 21:26 +0200, Florian Mickler wrote:
> But we should definitely try to tune down the annoying part of
> it by making it less random and maybe by adding a tag in the cc field...

Less random: sure, whatever's reasonable.

I still think that tagging the name portion of an email address
is not a good idea, especially without some way of turning it
off.  Maybe a mechanism to optionally enable it would be ok.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: RFC: get_maintainer.pl: append reason for cc to the name by default
  2010-09-27 20:08             ` Joe Perches
@ 2010-09-27 20:47               ` Ted Ts'o
  2010-09-27 21:16                 ` Joe Perches
  0 siblings, 1 reply; 62+ messages in thread
From: Ted Ts'o @ 2010-09-27 20:47 UTC (permalink / raw)
  To: Joe Perches
  Cc: Florian Mickler, Andrew Morton, Stephen Hemminger, Wolfram Sang,
	linux-kernel

On Mon, Sep 27, 2010 at 01:08:17PM -0700, Joe Perches wrote:
> On Mon, 2010-09-27 at 21:26 +0200, Florian Mickler wrote:
> > But we should definitely try to tune down the annoying part of
> > it by making it less random and maybe by adding a tag in the cc field...
> 
> Less random: sure, whatever's reasonable.
> 
> I still think that tagging the name portion of an email address
> is not a good idea, especially without some way of turning it
> off.  Maybe a mechanism to optionally enable it would be ok.

Well, at the moment, what is currently shipping in git-head and 2.6.35
does such an __awful__ job that I think a lot of people would be a lot
happier if we could get the e-mail messages tagged.  Maybe the call
for that would be less if some of queued fixes for get_maintainer.pl
could get pushed out more quickly, or you made an out-of-tree version
of get_mainatiner.pl so that fixes could get pushed to the newbies
more quickly.

Still, if as you argue, the heuristic guessing from the git repository
becomes something which will (eventually) be rarely done, then what
harm does it done if when those time that get_maintainer.pl is
guessing wildly by trying to read git history, those e-mail names are
tagged?

I'd argue that users shouldn't be using get_maintainer.pl in scripts.
Those who use them properly can always cut away the tags.  If someone
is lazy enough to be using get_maintainer.pl in a script to send
e-mail, then I think it's only fair and transparent that the fact that
(a) they are using get_maintainer in a script, and (b)
get_maintainer.pl was forced to guess would be a good thing.

Think about it this way; if we see something to LKML with an a
get_maintainer.pl tag indicating that it had to guess, it's a hint
that the MAINTAINERS file needs to be updated?  Wouldn't that be a
valuable way of notifying people of that fact?   :-)

	     		  	    	 - Ted

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: RFC: get_maintainer.pl: append reason for cc to the name by default
  2010-09-27 20:47               ` Ted Ts'o
@ 2010-09-27 21:16                 ` Joe Perches
  2010-09-28  4:22                   ` Ted Ts'o
  2010-09-28  4:37                   ` Mark Brown
  0 siblings, 2 replies; 62+ messages in thread
From: Joe Perches @ 2010-09-27 21:16 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Florian Mickler, Andrew Morton, Stephen Hemminger, Wolfram Sang,
	linux-kernel

On Mon, 2010-09-27 at 16:47 -0400, Ted Ts'o wrote:
> On Mon, Sep 27, 2010 at 01:08:17PM -0700, Joe Perches wrote:
> > On Mon, 2010-09-27 at 21:26 +0200, Florian Mickler wrote:
> > > But we should definitely try to tune down the annoying part of
> > > it by making it less random and maybe by adding a tag in the cc field...
> > Less random: sure, whatever's reasonable.
> > I still think that tagging the name portion of an email address
> > is not a good idea, especially without some way of turning it
> > off.  Maybe a mechanism to optionally enable it would be ok.
> Well, at the moment, what is currently shipping in git-head and 2.6.35
> does such an __awful__ job that I think a lot of people would be a lot
> happier if we could get the e-mail messages tagged.

Ridiculous exaggeration.  The script doesn't do an _awful_ job.

It includes what some vocal sorts consider suboptimal additional
cc's in some circumstances.  It has for the last year and a half.

Those cc's generally take _seconds_ for a cc'd party to ignore.

> Maybe the call
> for that would be less if some of queued fixes for get_maintainer.pl
> could get pushed out more quickly, or you made an out-of-tree version
> of get_mainatiner.pl so that fixes could get pushed to the newbies
> more quickly.

I already posted the tree.
http://lkml.org/lkml/2010/9/22/464
http://repo.or.cz/w/linux-2.6/get_maintainer.git

> Wouldn't that be a
> valuable way of notifying people of that fact?   :-)

Not really, and it just makes email message threading uglier.

A lot of these are already known.  I've told you multiple
times that arch/arm pattern coverage is poor.  _No_
arch/arm maintainer has made any effort to better describe
the files as it's a fairly difficult job.  And I'm not the
arch/arm maintainer.

Mark Brown said: http://lkml.org/lkml/2010/9/10/116
----------------------------------------------------
On Fri, Sep 10, 2010 at 03:04:26AM -0700, Joe Perches wrote:

> It'd be great if the ARM/embedded folk would spend
> some effort improving the MAINTAINERS file pattern
> coverage.

Half the problem is that a lot of drivers aren't maintained by the
people who wrote them - for example, they wrote the driver to get a
board working but have no ongoing interest or can only really comment on
the one specific configuration used on their particular hardware.  This
means getting people to add MAINTAINERS entries is a bit more tricky
than it needs to be, even if they could offer useful review on changes.
---------------------------------------------------

So for Mark's case, the current behavior works reasonably well.

Tell me something Ted.  Have you in the last 5 years or
so done any work in the kernel outside of fs or modified
files outside of fs when fs structures weren't changed?

Are you representative of the typical user of a script like
get_maintainer or checkpatch?  Does it matter that much?

If you want it, I have a trivial script that shows the files
in the kernel that do not have an "exact pattern match"
in MAINTAINERS.  You could make the effort to find out
if those files should have pattern entries made.   I did
a lot of that privately and found out that most people
appear not to care.




^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: RFC: get_maintainer.pl: append reason for cc to the name by default
  2010-09-27 21:16                 ` Joe Perches
@ 2010-09-28  4:22                   ` Ted Ts'o
  2010-09-28  4:37                   ` Mark Brown
  1 sibling, 0 replies; 62+ messages in thread
From: Ted Ts'o @ 2010-09-28  4:22 UTC (permalink / raw)
  To: Joe Perches
  Cc: Florian Mickler, Andrew Morton, Stephen Hemminger, Wolfram Sang,
	linux-kernel

On Mon, Sep 27, 2010 at 02:16:45PM -0700, Joe Perches wrote:
> 
> Tell me something Ted.  Have you in the last 5 years or
> so done any work in the kernel outside of fs or modified
> files outside of fs when fs structures weren't changed?

Yes, I have.  And when I needed to figure out who to send my patches
to, I used "git log" to figure out where to send it.  Quite frankly it
was easier than using the MAINTAINERS file, just because it was
faster.  But I was a human, so I could filter out the trivial cleanup
patches, and I could do a better job figuring out how to pick out the
right information from git log output.  (And Linus was suggesting that
a human do it, not some script that takes anyone who has 5% of the
signoffs, even if that's only a single signoff.  Sigh.)

> Are you representative of the typical user of a script like
> get_maintainer or checkpatch?  Does it matter that much?

Well, I think I know something about good pedagogy for teaching
newcomers to the kernel development process how to be good kernel
developers.  And I think get_maintainers and checkpatch.pl --file are
really horrible pedogagical tools.  You really want to teach people
how to look through "git log" output.  You don't want them relying on
scripts.

Note that there are web pages out there with comments where people
have advertised scripts they have written which attempt to automate
trivial whitespace cleanups, and where users are bragging that "they
got their name in the Linux kernel" --- and they're using
get_maintainer.pl as a way of automating how to send it these trivial
patches just so they can say they have contributed something to the
kernel.  OK, we want to be welcoming people to be kernel developers
--- but do we really think this is really a good way of doing it?

Who do *you* think is the typical user of a script like
get_maintainer.pl?  Someone who is just trying to do a "kilroy was
here" in git?  Someone who is genuinely trying to learn how to
contribute positive value to Linux?  A regular kernel maintainer?

Given that the currently published version has extra cc lines, I
certainly won't use it in its current form.  And it's really not that
hard to look things up in MAINTAINERS or to use git log.  Heck, before
I make a change to file I generally want to look to see what sort of
recent changes there have been to the file in the first place.

And if it's a genuine newcomer who is interested in learning, again,
is get_maintainer.pl really doing them a service?  If so, what do you
think that service is?  I don't think a crutch which actively
discourages them from looking through MAINTAINERS or "git log" is in
fact a good thing --- for anyone.

						- Ted

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: RFC: get_maintainer.pl: append reason for cc to the name by default
  2010-09-27 21:16                 ` Joe Perches
  2010-09-28  4:22                   ` Ted Ts'o
@ 2010-09-28  4:37                   ` Mark Brown
  1 sibling, 0 replies; 62+ messages in thread
From: Mark Brown @ 2010-09-28  4:37 UTC (permalink / raw)
  To: Joe Perches
  Cc: Ted Ts'o, Florian Mickler, Andrew Morton, Stephen Hemminger,
	Wolfram Sang, linux-kernel

On Mon, Sep 27, 2010 at 02:16:45PM -0700, Joe Perches wrote:
> On Mon, 2010-09-27 at 16:47 -0400, Ted Ts'o wrote:

> Mark Brown said: http://lkml.org/lkml/2010/9/10/116
> ----------------------------------------------------
> On Fri, Sep 10, 2010 at 03:04:26AM -0700, Joe Perches wrote:

> > It'd be great if the ARM/embedded folk would spend
> > some effort improving the MAINTAINERS file pattern
> > coverage.

> Half the problem is that a lot of drivers aren't maintained by the
> people who wrote them - for example, they wrote the driver to get a
> board working but have no ongoing interest or can only really comment on
> the one specific configuration used on their particular hardware.  This
> means getting people to add MAINTAINERS entries is a bit more tricky
> than it needs to be, even if they could offer useful review on changes.
> ---------------------------------------------------

> So for Mark's case, the current behavior works reasonably well.

Actually I've got pretty much all the problems Ted has with it - when I
use get_maintainer it's done in conjunction with my knowledge of who the
people who turn up are and what the chances are that they'd have been
doing substantial work on the code.  Half the time it's just because I
need to look up someone's e-mail address.

> Tell me something Ted.  Have you in the last 5 years or
> so done any work in the kernel outside of fs or modified
> files outside of fs when fs structures weren't changed?

> Are you representative of the typical user of a script like
> get_maintainer or checkpatch?  Does it matter that much?

I've personally always thought of get_maintainer as being more useful
for more experienced developers since you have to have a reasonable idea
of what it's giving you to be able to parse the output effectively.

^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2010-09-28  4:37 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-10  9:33 [PATCH] get_maintainer.pl: append reason for cc to the name by default florian
2010-09-10  9:42 ` Joe Perches
2010-09-10  9:46   ` Wolfram Sang
2010-09-10  9:53   ` Mark Brown
2010-09-10 10:04     ` Joe Perches
2010-09-10 10:18       ` Mark Brown
2010-09-10 10:47         ` Joe Perches
2010-09-10 11:07           ` Mark Brown
2010-09-11  0:22           ` [PATCH] scripts/get_maintainer.pl: Add --git-blame --rolestats "Authored lines" information Joe Perches
2010-09-11  9:38             ` Florian Mickler
2010-09-11  9:52               ` Joe Perches
2010-09-11 10:02                 ` Florian Mickler
2010-09-11 10:22                   ` Joe Perches
2010-09-11 19:22                   ` [PATCH] Documentation/SubmittingPatches: Add and describe scripts/get_maintainer.pl Joe Perches
2010-09-11 19:34                     ` Florian Mickler
2010-09-11 19:43                     ` [PATCH V2] " Joe Perches
2010-09-12 16:18                       ` Florian Mickler
2010-09-10 11:44         ` [PATCH] get_maintainer.pl: append reason for cc to the name by default Alan Cox
2010-09-10 10:22       ` Florian Mickler
2010-09-10 10:47         ` Joe Perches
2010-09-11 21:22     ` Joe Perches
2010-09-10 10:30   ` Florian Mickler
2010-09-10 11:04     ` Mark Brown
2010-09-10 11:15       ` Florian Mickler
2010-09-10 21:04     ` Andrew Morton
2010-09-10 21:39       ` Florian Mickler
2010-09-10 21:44       ` Joe Perches
2010-09-13  4:01     ` Valdis.Kletnieks
2010-09-13  5:21       ` [PATCH] get_maintainer.pl: Look for .get_maintainer.conf in lk, then $HOME then scripts Joe Perches
2010-09-13  6:13         ` Florian Mickler
2010-09-13 13:21         ` Valdis.Kletnieks
2010-09-10 11:11   ` [PATCH] get_maintainer.pl: append reason for cc to the name by default Florian Mickler
2010-09-10 15:12     ` Joe Perches
2010-09-11  9:34       ` Florian Mickler
2010-09-11  0:13   ` Christoph Hellwig
2010-09-11  0:31     ` Joe Perches
2010-09-11  0:45       ` Christoph Hellwig
2010-09-11  0:56         ` Joe Perches
2010-09-11  9:28         ` Florian Mickler
2010-09-13  7:16           ` Eric W. Biederman
2010-09-13  7:57             ` Joe Perches
2010-09-13  8:54               ` Florian Mickler
2010-09-14 17:19                 ` Eric W. Biederman
2010-09-14 17:46                   ` Florian Mickler
2010-09-15  3:28                     ` Joe Perches
2010-09-15  4:34                       ` Florian Mickler
2010-09-15  4:45                         ` Joe Perches
2010-09-15 12:49                           ` Florian Mickler
2010-09-14 23:15                   ` Joe Perches
2010-09-13  9:01             ` Florian Mickler
2010-09-14 17:24               ` Eric W. Biederman
2010-09-26 18:52 ` RFC: " Joe Perches
2010-09-27 14:57   ` Florian Mickler
2010-09-27 15:44     ` Ted Ts'o
2010-09-27 17:00       ` Florian Mickler
2010-09-27 18:21         ` Ted Ts'o
2010-09-27 19:26           ` Florian Mickler
2010-09-27 20:08             ` Joe Perches
2010-09-27 20:47               ` Ted Ts'o
2010-09-27 21:16                 ` Joe Perches
2010-09-28  4:22                   ` Ted Ts'o
2010-09-28  4:37                   ` Mark Brown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).