linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/8] scripts/get_maintainer.pl: Add --interactive
@ 2010-09-23  3:17 Joe Perches
  2010-09-23  3:17 ` [PATCH 1/8] scripts/get_maintainer.pl: add interactive mode Joe Perches
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Joe Perches @ 2010-09-23  3:17 UTC
  To: linux-kernel

Add --interactive mode to allow some user interaction with the
script before returning a list of email addresses.

With --interactive, the script writes to STDERR until the list of email
addresses is approved; the approved list is then emitted normally to STDOUT.
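
The STDERR/STDOUT split can be illustrated with a small shell sketch. This
is not code from get_maintainer.pl: the function pick_addresses and the
example.org addresses are hypothetical stand-ins, and the answer is piped
in only so the sketch runs unattended.

```shell
#!/bin/sh
# Hypothetical stand-in for an interactive tool (not get_maintainer.pl itself):
# menu and prompt go to stderr, the approved list goes to stdout.
pick_addresses() {
    echo "* 1 alice@example.org" >&2
    echo "* 2 list@example.org" >&2
    printf 'Y(approve): ' >&2
    read -r answer
    if [ "$answer" = "y" ]; then
        echo "alice@example.org"
        echo "list@example.org"
    fi
}

# Command substitution captures stdout only. Stderr is discarded here so the
# demo runs unattended; in real use it would stay attached to the terminal.
approved=$(echo y | pick_addresses 2>/dev/null)
echo "$approved"
```

Because only stdout is captured, a calling program that reads the address
list never sees the menu text.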

Duplicate email address matching is much improved, and the .mailmap file
is now used correctly.

For instance, the previous RFC patch (http://lkml.org/lkml/2010/9/20/313)
did not correctly identify drivers/net/sky2.c patches by Stephen Hemminger,
because his maintainer address differs from the address he normally submits
from.

After this patch series is applied, it correctly shows:

$ ./scripts/get_maintainer.pl -i -f drivers/net/sky2.c

*  # email/list and role:stats                                        
*  1 Stephen Hemminger <shemminger@linux-foundation.org>              
     maintainer:SKGE, SKY2 10/100...
*  2 netdev@vger.kernel.org                                           
     open list:SKGE, SKY2 10/100...
*  3 linux-kernel@vger.kernel.org                                     
     open list

_#(toggle), A#(author), S#(signed) *(all), ^(none), O(options), Y(approve): g

*  # email/list and role:stats                                        auth sign
*  1 Stephen Hemminger <shemminger@linux-foundation.org>                26   37
     maintainer:SKGE, SKY2 10/100...,commit_signer:37/67=55%
*  2 "David S. Miller" <davem@davemloft.net>                             7   56
     commit_signer:56/67=84%
*  3 Mike McCormack <mikem@ring3k.org>                                  15   15
     commit_signer:15/67=22%
*  4 netdev@vger.kernel.org                                           
     open list:SKGE, SKY2 10/100...
*  5 linux-kernel@vger.kernel.org                                     
     open list

_#(toggle), A#(author), S#(signed) *(all), ^(none), O(options), Y(approve): b
git-blame can be very slow, please have patience...
*  # email/list and role:stats                                        auth sign
*  1 Stephen Hemminger <shemminger@linux-foundation.org>               248  265
     maintainer:SKGE, SKY2 10/100...,commit_signer:37/67=55%,authored lines:4534/5035=90%,commits:262/298=88%
*  2 "David S. Miller" <davem@davemloft.net>                             9  109
     commit_signer:56/67=84%,commits:99/298=33%
*  3 Mike McCormack <mikem@ring3k.org>                                  22   22
     commit_signer:15/67=22%,authored lines:343/5035=7%,commits:20/298=7%
*  4 Jeff Garzik <jgarzik@redhat.com>                                    3  168
     commits:168/298=56%
*  5 netdev@vger.kernel.org                                           
     open list:SKGE, SKY2 10/100...
*  6 linux-kernel@vger.kernel.org                                     
     open list

_#(toggle), A#(author), S#(signed) *(all), ^(none), O(options), Y(approve): y
Stephen Hemminger <shemminger@linux-foundation.org>
"David S. Miller" <davem@davemloft.net>
Mike McCormack <mikem@ring3k.org>
Jeff Garzik <jgarzik@redhat.com>
netdev@vger.kernel.org
linux-kernel@vger.kernel.org
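
The role:stats figures in the listing above (for example 37/67=55%) appear
to be count/total rounded to the nearest whole percent. That is an inference
from the output shown, not a statement about how the script computes them.
A minimal sketch of the arithmetic, using a hypothetical helper named
percent:

```shell
#!/bin/sh
# Round count/total to the nearest whole percent using integer arithmetic.
# percent() is a hypothetical helper, not a function from get_maintainer.pl.
percent() {
    count=$1
    total=$2
    echo $(( (count * 100 + total / 2) / total ))
}

# Figures taken from the listing above.
echo "commit_signer:37/67=$(percent 37 67)%"
echo "authored lines:4534/5035=$(percent 4534 5035)%"
echo "commits:262/298=$(percent 262 298)%"
```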

Joe Perches (6):
  scripts/get_maintainer.pl: Improve --interactive UI
  scripts/get_maintainer.pl: Update --interactive UI, improve hg runtime
  scripts/get_maintainer.pl: Use case insensitive name de-duplication
  scripts/get_maintainer.pl: Correct indentation in a few places
  scripts/get_maintainer.pl: Use mailmap in name deduplication and other updates
  scripts/get_maintainer.pl: Don't deduplicate unnamed addresses ie: mailing lists

florian@mickler.org (2):
  scripts/get_maintainer.pl: add interactive mode
  scripts/get_maintainer.pl: fix mailmap handling

 scripts/get_maintainer.pl | 1153 +++++++++++++++++++++++++++++++++++----------
 1 files changed, 906 insertions(+), 247 deletions(-)

Please apply.

The current version is available to pull from:

   git://repo.or.cz/linux-2.6/get_maintainer.git 20100922_gm

-- 
1.7.3



* [PATCH 1/8] scripts/get_maintainer.pl: add interactive mode
  2010-09-23  3:17 [PATCH 0/8] scripts/get_maintainer.pl: Add --interactive Joe Perches
@ 2010-09-23  3:17 ` Joe Perches
  2010-09-23  7:11   ` Florian Mickler
  2010-09-23  3:17 ` [PATCH 2/8] scripts/get_maintainer.pl: Improve --interactive UI Joe Perches
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 11+ messages in thread
From: Joe Perches @ 2010-09-23  3:17 UTC
  To: linux-kernel

From: florian@mickler.org <florian@mickler.org>

This is a first version of an interactive mode for
scripts/get_maintainer.pl.

It allows the user to interact with the script. Each cc candidate can be
selected or deselected, and a shortlog of authored commits can be
displayed for each candidate.

The menu is displayed on STDERR; the end result is written to STDOUT.
This unusual mechanism allows get_maintainer.pl to be used in interactive
mode via git send-email --cc-cmd.

Signed-off-by: Joe Perches <joe@perches.com>
---
 scripts/get_maintainer.pl |  146 +++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 141 insertions(+), 5 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index e5a400c..1ae8c50 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -33,6 +33,7 @@ my $email_git_max_maintainers = 5;
 my $email_git_min_percent = 5;
 my $email_git_since = "1-year-ago";
 my $email_hg_since = "-365";
+my $interactive = 0;
 my $email_remove_duplicates = 1;
 my $output_multiline = 1;
 my $output_separator = ", ";
@@ -52,6 +53,8 @@ my $help = 0;
 
 my $exit = 0;
 
+my %shortlog_buffer;
+
 my @penguin_chief = ();
 push(@penguin_chief, "Linus Torvalds:torvalds\@linux-foundation.org");
 #Andrew wants in on most everything - 2009/01/14
@@ -93,7 +96,8 @@ my %VCS_cmds_git = (
     "blame_range_cmd" => "git blame -l -L \$diff_start,+\$diff_length \$file",
     "blame_file_cmd" => "git blame -l \$file",
     "commit_pattern" => "^commit [0-9a-f]{40,40}",
-    "blame_commit_pattern" => "^([0-9a-f]+) "
+    "blame_commit_pattern" => "^([0-9a-f]+) ",
+    "shortlog_cmd" => "git log --no-color --oneline --since=\$email_git_since --author=\"\$email\" -- \$file"
 );
 
 my %VCS_cmds_hg = (
@@ -107,7 +111,8 @@ my %VCS_cmds_hg = (
     "blame_range_cmd" => "",		# not supported
     "blame_file_cmd" => "hg blame -c \$file",
     "commit_pattern" => "^commit [0-9a-f]{40,40}",
-    "blame_commit_pattern" => "^([0-9a-f]+):"
+    "blame_commit_pattern" => "^([0-9a-f]+):",
+    "shortlog_cmd" => "ht log --date=\$email_hg_since"
 );
 
 my $conf = which_conf(".get_maintainer.conf");
@@ -148,6 +153,7 @@ if (!GetOptions(
 		'git-min-percent=i' => \$email_git_min_percent,
 		'git-since=s' => \$email_git_since,
 		'hg-since=s' => \$email_hg_since,
+		'i|interactive!' => \$interactive,
 		'remove-duplicates!' => \$email_remove_duplicates,
 		'm!' => \$email_maintainer,
 		'n!' => \$email_usename,
@@ -225,6 +231,8 @@ if ($email_git_all_signature_types) {
     $signaturePattern = "(.+?)[Bb][Yy]:";
 }
 
+
+
 ## Read MAINTAINERS for type/value pairs
 
 my @typevalue = ();
@@ -450,10 +458,13 @@ foreach my $file (@files) {
 	($email_git || ($email_git_fallback && !$exact_pattern_match))) {
 	vcs_file_signoffs($file);
     }
-
     if ($email && $email_git_blame) {
 	vcs_file_blame($file);
     }
+    if ($email && $interactive){
+	vcs_file_shortlogs($file);
+
+    }
 }
 
 if ($keywords) {
@@ -486,9 +497,13 @@ if ($email) {
     }
 }
 
+
 if ($email || $email_list) {
     my @to = ();
     if ($email) {
+	if ($interactive) {
+	    @email_to = @{vcs_interactive_menu(\@email_to)};
+	}
 	@to = (@to, @email_to);
     }
     if ($email_list) {
@@ -501,7 +516,6 @@ if ($scm) {
     @scm = uniq(@scm);
     output(@scm);
 }
-
 if ($status) {
     @status = uniq(@status);
     output(@status);
@@ -556,6 +570,7 @@ MAINTAINER field selection options:
     --git-blame => use git blame to find modified commits for patch or file
     --git-since => git history to use (default: $email_git_since)
     --hg-since => hg history to use (default: $email_hg_since)
+    --interactive => display a menu (mostly useful if used with the --git option)
     --m => include maintainer(s) if any
     --n => include name 'Full Name <addr\@domain.tld>'
     --l => include list(s) if any
@@ -1156,6 +1171,127 @@ sub vcs_exists {
     return 0;
 }
 
+sub vcs_interactive_menu {
+    my $list_ref = shift;
+    my @list = @$list_ref;
+
+    return if (!vcs_exists());
+
+    my %selected;
+    my %shortlog;
+    my $input;
+    my $count = 0;
+
+    #select maintainers by default
+    foreach my $entry (@list){
+	    my $role = $entry->[1];
+	    $selected{$count} = ($role =~ /maintainer:|supporter:/);
+	    $count++;
+    }
+
+    #menu loop
+    do {
+	my $count = 0;
+	foreach my $entry (@list){
+	    my $email = $entry->[0];
+	    my $role = $entry->[1];
+	    if ($selected{$count}){
+		print STDERR "* ";
+	    } else {
+		print STDERR "  ";
+	    }
+	    print STDERR "$count: $email,\t\t $role";
+	    print STDERR "\n";
+	    if ($shortlog{$count}){
+		my $entries_ref = vcs_get_shortlog($email);
+		foreach my $entry_ref (@{$entries_ref}){
+		    my $filename = @{$entry_ref}[0];
+		    my @shortlog = @{@{$entry_ref}[1]};
+		    print STDERR "\tshortlog for $filename (authored commits: " . @shortlog . ").\n";
+		    foreach my $commit (@shortlog){
+			print STDERR "\t  $commit\n";
+		    }
+		    print STDERR "\n";
+		}
+	    }
+	    $count++;
+	}
+	print STDERR "\n";
+	print STDERR "Choose whom to cc by entering a commaseperated list of numbers and hitting enter.\n";
+	print STDERR "To show a short list of commits, precede the number by a '?',\n";
+	print STDERR "A blank line indicates that you are satisfied with your choice.\n";
+	$input = <STDIN>;
+	chomp($input);
+
+	my @wish = split(/[, ]+/,$input);
+	foreach my $nr (@wish){
+		my $logtoggle = 0;
+		if ($nr =~ /\?/){
+			$nr =~ s/\?//;
+			$logtoggle = 1;
+		}
+
+		#skip out of bounds numbers
+		next unless ($nr <= $count && $nr >= 0);
+
+		if ($logtoggle){
+			$shortlog{$nr} = !$shortlog{$nr};
+		} else {
+			$selected{$nr} = !$selected{$nr};
+
+			#switch shortlog on if an entry get's selected
+			if ($selected{$nr}){
+				$shortlog{$nr}=1;
+			}
+		}
+	};
+    } while(length($input) > 0);
+
+    #drop not selected entries
+    $count = 0;
+    my @new_emailto;
+    foreach my $entry (@list){
+	if ($selected{$count}){
+		push(@new_emailto,$list[$count]);
+		print STDERR "$count: ";
+		print STDERR $email_to[$count]->[0];
+		print STDERR ",\t\t ";
+		print STDERR $email_to[$count]->[1];
+		print STDERR "\n";
+	}
+	$count++;
+    }
+    return \@new_emailto;
+}
+
+sub vcs_get_shortlog {
+    my $arg = shift;
+    my ($name, $address) = parse_email($arg);
+    return $shortlog_buffer{$address};
+}
+
+sub vcs_file_shortlogs {
+    my ($file) = @_;
+    print STDERR "shortlog processing $file:";
+    foreach my $entry (@email_to){
+	my ($name, $address) = parse_email($entry->[0]);
+	print STDERR ".";
+	my $commits_ref = vcs_email_shortlog($address, $file);
+	push(@{$shortlog_buffer{$address}}, [ $file, $commits_ref ]);
+    }
+    print STDERR "\n";
+}
+
+sub vcs_email_shortlog {
+    my $email = shift;
+    my ($file) = @_;
+
+    my $cmd = $VCS_cmds{"shortlog_cmd"};
+    $cmd =~ s/(\$\w+)/$1/eeg;		#substitute variables
+    my @lines = &{$VCS_cmds{"execute_cmd"}}($cmd);
+    return \@lines;
+}
+
 sub vcs_assign {
     my ($role, $divisor, @lines) = @_;
 
@@ -1236,7 +1372,7 @@ sub vcs_file_blame {
 	my @commit_signers = ();
 
 	my $cmd = $VCS_cmds{"find_commit_signers_cmd"};
-	$cmd =~ s/(\$\w+)/$1/eeg;	#interpolate $cmd
+	$cmd =~ s/(\$\w+)/$1/eeg;	#substitute variables in $cmd
 
 	($commit_count, @commit_signers) = vcs_find_signers($cmd);
 
-- 
1.7.3



* [PATCH 2/8] scripts/get_maintainer.pl: Improve --interactive UI
  2010-09-23  3:17 [PATCH 0/8] scripts/get_maintainer.pl: Add --interactive Joe Perches
  2010-09-23  3:17 ` [PATCH 1/8] scripts/get_maintainer.pl: add interactive mode Joe Perches
@ 2010-09-23  3:17 ` Joe Perches
  2010-09-23  3:17 ` [PATCH 3/8] scripts/get_maintainer.pl: Update --interactive UI, improve hg runtime Joe Perches
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Joe Perches @ 2010-09-23  3:17 UTC
  To: linux-kernel

o Add searching by git-blame as well as git-history
o Add different selection toggles
o Add the ability to list commits by author or by sign-off type
o Use custom git and hg formats to make searching for subject/author
  a bit easier
o Move inlined section matching and searching of git/hg history to
  new get_maintainer subroutine
o Add subroutines save_commits_by_author and save_commits_by_signer
o Remove subroutines vcs_get_shortlog and vcs_email_shortlog
o Rename camelcase signaturePattern to signature_pattern

Update version to 0.26-beta.
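
To illustrate why the custom formats make searching easier, here is a rough
shell sketch that parses text laid out with the GitCommit:/GitAuthor:/
GitSubject: prefixes used in this patch. The sample commits, names and the
sample_log function are made up for illustration, and only a subset of the
fields is shown; the script itself does this matching in Perl.

```shell
#!/bin/sh
# Sample text in the tagged layout (GitCommit:/GitAuthor:/GitSubject:).
# The commits and names below are invented for this sketch.
sample_log() {
    cat <<'EOF'
GitCommit: 1111111111111111111111111111111111111111
GitAuthor: Alice Example <alice@example.org>
GitSubject: demo: fix a thing
Signed-off-by: Alice Example <alice@example.org>

GitCommit: 2222222222222222222222222222222222222222
GitAuthor: Bob Example <bob@example.org>
GitSubject: demo: add a thing
Signed-off-by: Bob Example <bob@example.org>
Acked-by: Alice Example <alice@example.org>
EOF
}

# A fixed prefix turns each field into a simple anchored one-line match.
commits=$(sample_log | grep -c '^GitCommit: [0-9a-f]\{40\}')
authors=$(sample_log | sed -n 's/^GitAuthor: //p')
echo "$commits commits"
echo "$authors"
```

With a unique tag at the start of each line, the author and subject can be
picked out without guessing where the default log header ends and the
commit body begins.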

Signed-off-by: Joe Perches <joe@perches.com>
---
 scripts/get_maintainer.pl |  640 +++++++++++++++++++++++++++------------------
 1 files changed, 388 insertions(+), 252 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index 1ae8c50..3fa639e 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -13,7 +13,7 @@
 use strict;
 
 my $P = $0;
-my $V = '0.25';
+my $V = '0.26-beta';
 
 use Getopt::Long qw(:config no_auto_abbrev);
 
@@ -53,7 +53,8 @@ my $help = 0;
 
 my $exit = 0;
 
-my %shortlog_buffer;
+my %commit_author_hash;
+my %commit_signer_hash;
 
 my @penguin_chief = ();
 push(@penguin_chief, "Linus Torvalds:torvalds\@linux-foundation.org");
@@ -77,7 +78,7 @@ my @signature_tags = ();
 push(@signature_tags, "Signed-off-by:");
 push(@signature_tags, "Reviewed-by:");
 push(@signature_tags, "Acked-by:");
-my $signaturePattern = "\(" . join("|", @signature_tags) . "\)";
+my $signature_pattern = "\(" . join("|", @signature_tags) . "\)";
 
 # rfc822 email address - preloaded methods go here.
 my $rfc822_lwsp = "(?:(?:\\r\\n)?[ \\t])";
@@ -90,29 +91,52 @@ my %VCS_cmds;
 my %VCS_cmds_git = (
     "execute_cmd" => \&git_execute_cmd,
     "available" => '(which("git") ne "") && (-d ".git")',
-    "find_signers_cmd" => "git log --no-color --since=\$email_git_since -- \$file",
-    "find_commit_signers_cmd" => "git log --no-color -1 \$commit",
-    "find_commit_author_cmd" => "git log -1 --format=\"%an <%ae>\" \$commit",
+    "find_signers_cmd" =>
+	"git log --no-color --since=\$email_git_since " .
+	    '--format="GitCommit: %H%n' .
+		      'GitAuthor: %an <%ae>%n' .
+		      'GitDate: %aD%n' .
+		      'GitSubject: %s%n' .
+		      '%b%n"' .
+	    " -- \$file",
+    "find_commit_signers_cmd" =>
+	"git log --no-color -1 " .
+	    '--format="GitCommit: %H%n' .
+		      'GitAuthor: %an <%ae>%n' .
+		      'GitDate: %aD%n' .
+		      'GitSubject: %s%n' .
+		      '%b%n"' .
+	    " \$commit",
+    "find_commit_author_cmd" =>
+	"git log --no-color " .
+	    '--format="GitAuthor: %an <%ae>"' .
+	    " -1 \$commit",
     "blame_range_cmd" => "git blame -l -L \$diff_start,+\$diff_length \$file",
     "blame_file_cmd" => "git blame -l \$file",
-    "commit_pattern" => "^commit [0-9a-f]{40,40}",
+    "commit_pattern" => "^GitCommit: ([0-9a-f]{40,40})",
     "blame_commit_pattern" => "^([0-9a-f]+) ",
-    "shortlog_cmd" => "git log --no-color --oneline --since=\$email_git_since --author=\"\$email\" -- \$file"
+    "author_pattern" => "^GitAuthor: (.*)",
+    "subject_pattern" => "^GitSubject: (.*)",
 );
 
 my %VCS_cmds_hg = (
     "execute_cmd" => \&hg_execute_cmd,
     "available" => '(which("hg") ne "") && (-d ".hg")',
     "find_signers_cmd" =>
-	"hg log --date=\$email_hg_since" .
-		" --template='commit {node}\\n{desc}\\n' -- \$file",
+	"hg log --date=\$email_hg_since " .
+	    "--template='HgCommit: {node}\\nHgAuthor: {author}\\nHgSubject: {desc}\\n'" .
+	    " -- \$file",
     "find_commit_signers_cmd" => "hg log --template='{desc}\\n' -r \$commit",
-    "find_commit_author_cmd" => "hg log -l 1 --template='{author}\\n' -r \$commit",
+    "find_commit_author_cmd" =>
+	"hg log -l 1 " .
+	    "--template='HgAuthor: {author}\\n'" .
+	    "-r \$commit",
     "blame_range_cmd" => "",		# not supported
     "blame_file_cmd" => "hg blame -c \$file",
-    "commit_pattern" => "^commit [0-9a-f]{40,40}",
+    "commit_pattern" => "^HgCommit: ([0-9a-f]{40,40})",
     "blame_commit_pattern" => "^([0-9a-f]+):",
-    "shortlog_cmd" => "ht log --date=\$email_hg_since"
+    "author_pattern" => "^HgAuthor: (.*)",
+    "subject_pattern" => "^HgSubject: (.*)",
 );
 
 my $conf = which_conf(".get_maintainer.conf");
@@ -193,13 +217,9 @@ if (-t STDIN && !@ARGV) {
     die "$P: missing patchfile or -f file - use --help if necessary\n";
 }
 
-if ($output_separator ne ", ") {
-    $output_multiline = 0;
-}
-
-if ($output_rolestats) {
-    $output_roles = 1;
-}
+$output_multiline = 0 if ($output_separator ne ", ");
+$output_rolestats = 1 if ($interactive);
+$output_roles = 1 if ($output_rolestats);
 
 if ($sections) {
     $email = 0;
@@ -228,11 +248,9 @@ if (!top_of_kernel_tree($lk_path)) {
 }
 
 if ($email_git_all_signature_types) {
-    $signaturePattern = "(.+?)[Bb][Yy]:";
+    $signature_pattern = "(.+?)[Bb][Yy]:";
 }
 
-
-
 ## Read MAINTAINERS for type/value pairs
 
 my @typevalue = ();
@@ -371,168 +389,186 @@ foreach my $file (@ARGV) {
 
 @file_emails = uniq(@file_emails);
 
+my %email_hash_name;
+my %email_hash_address;
 my @email_to = ();
+my %hash_list_to;
 my @list_to = ();
 my @scm = ();
 my @web = ();
 my @subsystem = ();
 my @status = ();
 
-# Find responsible parties
+my @to = get_maintainer();
 
-foreach my $file (@files) {
+@to = merge_email(@to);
 
-    my %hash;
-    my $exact_pattern_match = 0;
-    my $tvi = find_first_section();
-    while ($tvi < @typevalue) {
-	my $start = find_starting_index($tvi);
-	my $end = find_ending_index($tvi);
-	my $exclude = 0;
-	my $i;
-
-	#Do not match excluded file patterns
-
-	for ($i = $start; $i < $end; $i++) {
-	    my $line = $typevalue[$i];
-	    if ($line =~ m/^(\C):\s*(.*)/) {
-		my $type = $1;
-		my $value = $2;
-		if ($type eq 'X') {
-		    if (file_match_pattern($file, $value)) {
-			$exclude = 1;
-			last;
-		    }
-		}
-	    }
-	}
+output(@to) if (@to);
+
+if ($scm) {
+    @scm = uniq(@scm);
+    output(@scm);
+}
+
+if ($status) {
+    @status = uniq(@status);
+    output(@status);
+}
+
+if ($subsystem) {
+    @subsystem = uniq(@subsystem);
+    output(@subsystem);
+}
+
+if ($web) {
+    @web = uniq(@web);
+    output(@web);
+}
+
+exit($exit);
+
+sub get_maintainer {
+    %email_hash_name = ();
+    %email_hash_address = ();
+    %commit_author_hash = ();
+    %commit_signer_hash = ();
+    @email_to = ();
+    %hash_list_to = ();
+    @list_to = ();
+    @scm = ();
+    @web = ();
+    @subsystem = ();
+    @status = ();
+
+    # Find responsible parties
+
+    foreach my $file (@files) {
+
+	my %hash;
+	my $exact_pattern_match = 0;
+	my $tvi = find_first_section();
+	while ($tvi < @typevalue) {
+	    my $start = find_starting_index($tvi);
+	    my $end = find_ending_index($tvi);
+	    my $exclude = 0;
+	    my $i;
+
+	    #Do not match excluded file patterns
 
-	if (!$exclude) {
 	    for ($i = $start; $i < $end; $i++) {
 		my $line = $typevalue[$i];
 		if ($line =~ m/^(\C):\s*(.*)/) {
 		    my $type = $1;
 		    my $value = $2;
-		    if ($type eq 'F') {
+		    if ($type eq 'X') {
 			if (file_match_pattern($file, $value)) {
-			    my $value_pd = ($value =~ tr@/@@);
-			    my $file_pd = ($file  =~ tr@/@@);
-			    $value_pd++ if (substr($value,-1,1) ne "/");
-			    $value_pd = -1 if ($value =~ /^\.\*/);
-			    $exact_pattern_match = 1 if ($value_pd >= $file_pd);
-			    if ($pattern_depth == 0 ||
-				(($file_pd - $value_pd) < $pattern_depth)) {
-				$hash{$tvi} = $value_pd;
+			    $exclude = 1;
+			    last;
+			}
+		    }
+		}
+	    }
+
+	    if (!$exclude) {
+		for ($i = $start; $i < $end; $i++) {
+		    my $line = $typevalue[$i];
+		    if ($line =~ m/^(\C):\s*(.*)/) {
+			my $type = $1;
+			my $value = $2;
+			if ($type eq 'F') {
+			    if (file_match_pattern($file, $value)) {
+				my $value_pd = ($value =~ tr@/@@);
+				my $file_pd = ($file  =~ tr@/@@);
+				$value_pd++ if (substr($value,-1,1) ne "/");
+				$value_pd = -1 if ($value =~ /^\.\*/);
+				$exact_pattern_match = 1 if ($value_pd >= $file_pd);
+				if ($pattern_depth == 0 ||
+				    (($file_pd - $value_pd) < $pattern_depth)) {
+				    $hash{$tvi} = $value_pd;
+				}
 			    }
 			}
 		    }
 		}
 	    }
+	    $tvi = $end + 1;
 	}
 
-	$tvi = $end + 1;
-    }
-
-    foreach my $line (sort {$hash{$b} <=> $hash{$a}} keys %hash) {
-	add_categories($line);
-	if ($sections) {
-	    my $i;
-	    my $start = find_starting_index($line);
-	    my $end = find_ending_index($line);
-	    for ($i = $start; $i < $end; $i++) {
-		my $line = $typevalue[$i];
-		if ($line =~ /^[FX]:/) {		##Restore file patterns
-		    $line =~ s/([^\\])\.([^\*])/$1\?$2/g;
-		    $line =~ s/([^\\])\.$/$1\?/g;	##Convert . back to ?
-		    $line =~ s/\\\./\./g;       	##Convert \. to .
-		    $line =~ s/\.\*/\*/g;       	##Convert .* to *
+	foreach my $line (sort {$hash{$b} <=> $hash{$a}} keys %hash) {
+	    add_categories($line);
+	    if ($sections) {
+		my $i;
+		my $start = find_starting_index($line);
+		my $end = find_ending_index($line);
+		for ($i = $start; $i < $end; $i++) {
+		    my $line = $typevalue[$i];
+		    if ($line =~ /^[FX]:/) {		##Restore file patterns
+			$line =~ s/([^\\])\.([^\*])/$1\?$2/g;
+			$line =~ s/([^\\])\.$/$1\?/g;	##Convert . back to ?
+			$line =~ s/\\\./\./g;       	##Convert \. to .
+			$line =~ s/\.\*/\*/g;       	##Convert .* to *
+		    }
+		    $line =~ s/^([A-Z]):/$1:\t/g;
+		    print("$line\n");
 		}
-		$line =~ s/^([A-Z]):/$1:\t/g;
-		print("$line\n");
+		print("\n");
 	    }
-	    print("\n");
 	}
-    }
-
-    if ($email &&
-	($email_git || ($email_git_fallback && !$exact_pattern_match))) {
-	vcs_file_signoffs($file);
-    }
-    if ($email && $email_git_blame) {
-	vcs_file_blame($file);
-    }
-    if ($email && $interactive){
-	vcs_file_shortlogs($file);
 
+	if ($email && ($email_git ||
+		       ($email_git_fallback && !$exact_pattern_match))) {
+	    vcs_file_signoffs($file);
+	}
+	if ($email && $email_git_blame) {
+	    vcs_file_blame($file);
+	}
     }
-}
 
-if ($keywords) {
-    @keyword_tvi = sort_and_uniq(@keyword_tvi);
-    foreach my $line (@keyword_tvi) {
-	add_categories($line);
+    if ($keywords) {
+	@keyword_tvi = sort_and_uniq(@keyword_tvi);
+	foreach my $line (@keyword_tvi) {
+	    add_categories($line);
+	}
     }
-}
 
-if ($email) {
-    foreach my $chief (@penguin_chief) {
-	if ($chief =~ m/^(.*):(.*)/) {
-	    my $email_address;
+    if ($email) {
+	foreach my $chief (@penguin_chief) {
+	    if ($chief =~ m/^(.*):(.*)/) {
+		my $email_address;
 
-	    $email_address = format_email($1, $2, $email_usename);
-	    if ($email_git_penguin_chiefs) {
-		push(@email_to, [$email_address, 'chief penguin']);
-	    } else {
-		@email_to = grep($_->[0] !~ /${email_address}/, @email_to);
+		$email_address = format_email($1, $2, $email_usename);
+		if ($email_git_penguin_chiefs) {
+		    push(@email_to, [$email_address, 'chief penguin']);
+		} else {
+		    @email_to = grep($_->[0] !~ /${email_address}/, @email_to);
+		}
 	    }
 	}
-    }
 
-    foreach my $email (@file_emails) {
-	my ($name, $address) = parse_email($email);
+	foreach my $email (@file_emails) {
+	    my ($name, $address) = parse_email($email);
 
-	my $tmp_email = format_email($name, $address, $email_usename);
-	push_email_address($tmp_email, '');
-	add_role($tmp_email, 'in file');
+	    my $tmp_email = format_email($name, $address, $email_usename);
+	    push_email_address($tmp_email, '');
+	    add_role($tmp_email, 'in file');
+	}
     }
-}
-
 
-if ($email || $email_list) {
     my @to = ();
-    if ($email) {
-	if ($interactive) {
-	    @email_to = @{vcs_interactive_menu(\@email_to)};
+    if ($email || $email_list) {
+	if ($email) {
+	    @to = (@to, @email_to);
+	}
+	if ($email_list) {
+	    @to = (@to, @list_to);
 	}
-	@to = (@to, @email_to);
-    }
-    if ($email_list) {
-	@to = (@to, @list_to);
     }
-    output(merge_email(@to));
-}
-
-if ($scm) {
-    @scm = uniq(@scm);
-    output(@scm);
-}
-if ($status) {
-    @status = uniq(@status);
-    output(@status);
-}
 
-if ($subsystem) {
-    @subsystem = uniq(@subsystem);
-    output(@subsystem);
-}
+    @to = interactive_get_maintainer(\@to) if ($interactive);
 
-if ($web) {
-    @web = uniq(@web);
-    output(@web);
+    return @to;
 }
 
-exit($exit);
-
 sub file_match_pattern {
     my ($file, $pattern) = @_;
     if (substr($pattern, -1) eq "/") {
@@ -561,7 +597,7 @@ MAINTAINER field selection options:
   --email => print email address(es) if any
     --git => include recent git \*-by: signers
     --git-all-signature-types => include signers regardless of signature type
-        or use only ${signaturePattern} signers (default: $email_git_all_signature_types)
+        or use only ${signature_pattern} signers (default: $email_git_all_signature_types)
     --git-fallback => use git when no exact MAINTAINERS pattern (default: $email_git_fallback)
     --git-chief-penguins => include ${penguin_chiefs}
     --git-min-signatures => number of signatures required (default: $email_git_min_signatures)
@@ -847,11 +883,19 @@ sub add_categories {
 		}
 		if ($list_additional =~ m/subscribers-only/) {
 		    if ($email_subscriber_list) {
-			push(@list_to, [$list_address, "subscriber list${list_role}"]);
+			if (!$hash_list_to{$list_address}) {
+			    $hash_list_to{$list_address} = 1;
+			    push(@list_to, [$list_address,
+					    "subscriber list${list_role}"]);
+			}
 		    }
 		} else {
 		    if ($email_list) {
-			push(@list_to, [$list_address, "open list${list_role}"]);
+			if (!$hash_list_to{$list_address}) {
+			    $hash_list_to{$list_address} = 1;
+			    push(@list_to, [$list_address,
+					    "open list${list_role}"]);
+			}
 		    }
 		}
 	    } elsif ($ptype eq "M") {
@@ -882,9 +926,6 @@ sub add_categories {
     }
 }
 
-my %email_hash_name;
-my %email_hash_address;
-
 sub email_inuse {
     my ($name, $address) = @_;
 
@@ -1037,10 +1078,31 @@ sub hg_execute_cmd {
     return @lines;
 }
 
+sub extract_formatted_signatures {
+    my (@signature_lines) = @_;
+
+    my @type = @signature_lines;
+
+    s/\s*(.*):.*/$1/ for (@type);
+
+    # cut -f2- -d":"
+    s/\s*.*:\s*(.+)\s*/$1/ for (@signature_lines);
+
+## Reformat email addresses (with names) to avoid badly written signatures
+
+    foreach my $signer (@signature_lines) {
+	my ($name, $address) = parse_email($signer);
+	$signer = format_email($name, $address, 1);
+    }
+
+    return (\@type, \@signature_lines);
+}
+
 sub vcs_find_signers {
     my ($cmd) = @_;
-    my @lines = ();
     my $commits;
+    my @lines = ();
+    my @signatures = ();
 
     @lines = &{$VCS_cmds{"execute_cmd"}}($cmd);
 
@@ -1048,24 +1110,20 @@ sub vcs_find_signers {
 
     $commits = grep(/$pattern/, @lines);	# of commits
 
-    @lines = grep(/^[ \t]*${signaturePattern}.*\@.*$/, @lines);
-    if (!$email_git_penguin_chiefs) {
-	@lines = grep(!/${penguin_chiefs}/i, @lines);
-    }
-
-    return (0, @lines) if !@lines;
+    @signatures = grep(/^[ \t]*${signature_pattern}.*\@.*$/, @lines);
 
-    # cut -f2- -d":"
-    s/.*:\s*(.+)\s*/$1/ for (@lines);
+    return (0, @signatures) if !@signatures;
 
-## Reformat email addresses (with names) to avoid badly written signatures
+    save_commits_by_author(@lines) if ($interactive);
+    save_commits_by_signer(@lines) if ($interactive);
 
-    foreach my $line (@lines) {
-	my ($name, $address) = parse_email($line);
-	$line = format_email($name, $address, 1);
+    if (!$email_git_penguin_chiefs) {
+	@signatures = grep(!/${penguin_chiefs}/i, @signatures);
     }
 
-    return ($commits, @lines);
+    my ($types_ref, $signers_ref) = extract_formatted_signatures(@signatures);
+
+    return ($commits, @$signers_ref);
 }
 
 sub vcs_find_author {
@@ -1080,14 +1138,19 @@ sub vcs_find_author {
 
     return @lines if !@lines;
 
+    my @authors = ();
+    foreach my $line (@lines) {
+	push(@authors, $1) if ($line =~ m/$VCS_cmds{"author_pattern"}/);
+    }
+
 ## Reformat email addresses (with names) to avoid badly written signatures
 
-    foreach my $line (@lines) {
-	my ($name, $address) = parse_email($line);
-	$line = format_email($name, $address, 1);
+    foreach my $author (@authors) {
+	my ($name, $address) = parse_email($author);
+	$author = format_email($name, $address, 1);
     }
 
-    return @lines;
+    return @authors;
 }
 
 sub vcs_save_commits {
@@ -1171,125 +1234,198 @@ sub vcs_exists {
     return 0;
 }
 
-sub vcs_interactive_menu {
-    my $list_ref = shift;
+sub interactive_get_maintainer {
+    my ($list_ref) = @_;
     my @list = @$list_ref;
 
-    return if (!vcs_exists());
-
     my %selected;
-    my %shortlog;
-    my $input;
+    my %authored;
+    my %signed;
     my $count = 0;
 
     #select maintainers by default
     foreach my $entry (@list){
-	    my $role = $entry->[1];
-	    $selected{$count} = ($role =~ /maintainer:|supporter:/);
-	    $count++;
+	my $role = $entry->[1];
+	$selected{$count} = ($role =~ /^(maintainer|supporter|open list)/);
+	$authored{$count} = 0;
+	$signed{$count} = 0;
+	$count++;
     }
 
     #menu loop
-    do {
-	my $count = 0;
-	foreach my $entry (@list){
-	    my $email = $entry->[0];
-	    my $role = $entry->[1];
-	    if ($selected{$count}){
-		print STDERR "* ";
-	    } else {
-		print STDERR "  ";
-	    }
-	    print STDERR "$count: $email,\t\t $role";
-	    print STDERR "\n";
-	    if ($shortlog{$count}){
-		my $entries_ref = vcs_get_shortlog($email);
-		foreach my $entry_ref (@{$entries_ref}){
-		    my $filename = @{$entry_ref}[0];
-		    my @shortlog = @{@{$entry_ref}[1]};
-		    print STDERR "\tshortlog for $filename (authored commits: " . @shortlog . ").\n";
-		    foreach my $commit (@shortlog){
-			print STDERR "\t  $commit\n";
+    my $done = 0;
+    my $redraw = 1;
+    while (!$done) {
+	$count = 0;
+	if ($redraw) {
+	    foreach my $entry (@list) {
+		my $email = $entry->[0];
+		my $role = $entry->[1];
+		my $sel = "";
+		$sel = "*" if ($selected{$count});
+		my $commit_author = $commit_author_hash{$email};
+		my $commit_signer = $commit_signer_hash{$email};
+		my $authored = 0;
+		my $signed = 0;
+		$authored++ for (@{$commit_author});
+		$signed++ for (@{$commit_signer});
+		printf STDERR "%1s %2d %-52s",
+			      $sel, $count + 1, $email;
+		printf STDERR " Author:%3d Signer:%3d",
+			      $authored, $signed
+			      if ($authored > 0 || $signed > 0);
+		printf STDERR "\n     %s\n", $role;
+		if ($authored{$count}) {
+		    my $commit_author = $commit_author_hash{$email};
+		    foreach my $ref (@{$commit_author}) {
+			print STDERR "     Author: @{$ref}[1]\n";
 		    }
-		    print STDERR "\n";
 		}
+		if ($signed{$count}) {
+		    my $commit_signer = $commit_signer_hash{$email};
+		    foreach my $ref (@{$commit_signer}) {
+			print STDERR "     @{$ref}[2]: @{$ref}[1]\n";
+		    }
+		}
+
+		$count++;
 	    }
-	    $count++;
 	}
-	print STDERR "\n";
-	print STDERR "Choose whom to cc by entering a commaseperated list of numbers and hitting enter.\n";
-	print STDERR "To show a short list of commits, precede the number by a '?',\n";
-	print STDERR "A blank line indicates that you are satisfied with your choice.\n";
-	$input = <STDIN>;
+	print STDERR
+"\n#(toggle), A#(author), S#(signed) *(all), ^(none), O(options), Y(approve): ";
+	my $input = <STDIN>;
 	chomp($input);
 
-	my @wish = split(/[, ]+/,$input);
-	foreach my $nr (@wish){
-		my $logtoggle = 0;
-		if ($nr =~ /\?/){
-			$nr =~ s/\?//;
-			$logtoggle = 1;
+	my @wish = split(/[, ]+/, $input);
+	$redraw = 1;
+	foreach my $nr (@wish) {
+	    my $sel = lc(substr($nr, 0, 1));
+	    my $str = substr($nr, 1);
+	    my $val = 0;
+	    $val = $1 if $str =~ /^(\d+)$/;
+
+	    if ($sel eq "y") {
+		$interactive = 0;
+		$done = 1;
+		$output_rolestats = 0;
+		$output_roles = 0;
+		last;
+	    } elsif ($sel eq "*" || $sel eq '^') {
+		my $toggle = 0;
+		$toggle = 1 if ($sel eq '*');
+		for (my $i = 0; $i < $count; $i++) {
+		    $selected{$i} = $toggle;
 		}
-
-		#skip out of bounds numbers
-		next unless ($nr <= $count && $nr >= 0);
-
-		if ($logtoggle){
-			$shortlog{$nr} = !$shortlog{$nr};
-		} else {
-			$selected{$nr} = !$selected{$nr};
-
-			#switch shortlog on if an entry get's selected
-			if ($selected{$nr}){
-				$shortlog{$nr}=1;
-			}
+	    } elsif ($sel eq "0") {
+		for (my $i = 0; $i < $count; $i++) {
+		    $selected{$i} = !$selected{$i};
+		}
+	    } elsif ($sel eq "a") {
+		if ($val > 0 && $val <= $count) {
+		    $authored{$val - 1} = !$authored{$val - 1};
+		} elsif ($str eq '*' || $str eq '^') {
+		    my $toggle = 0;
+		    $toggle = 1 if ($str eq '*');
+		    for (my $i = 0; $i < $count; $i++) {
+			$authored{$i} = $toggle;
+		    }
+		}
+	    } elsif ($sel eq "s") {
+		if ($val > 0 && $val <= $count) {
+		    $signed{$val - 1} = !$signed{$val - 1};
+		} elsif ($str eq '*' || $str eq '^') {
+		    print("yes\n");
+		    my $toggle = 0;
+		    $toggle = 1 if ($str eq '*');
+		    for (my $i = 0; $i < $count; $i++) {
+			$signed{$i} = $toggle;
+		    }
+		}
+	    } elsif ($sel eq "o") {
+		print STDERR
+"0(toggle all) g(use git history([$email_git]) b(Use git blame[$email_git_blame])\n" .
+"c#(minimum commits[$email_git_min_signatures]) x#(max maintainers[$email_git_max_maintainers] d#(history to use[$email_git_since])\n";
+		$redraw = 0;
+	    } elsif ($sel eq "b") {
+		$email_git_blame = !$email_git_blame;
+		goto &get_maintainer;
+	    } elsif ($sel eq "g") {
+		$email_git = !$email_git;
+		goto &get_maintainer;
+	    } elsif ($sel eq "x") {
+		if ($val > 0) {
+		    $email_git_max_maintainers = $val;
 		}
-	};
-    } while(length($input) > 0);
+		goto &get_maintainer;
+	    } elsif ($sel eq "%") {
+		if ($val >= 0) {
+		    $email_git_min_percent = $val;
+		}
+		goto &get_maintainer;
+	    } elsif ($sel eq "c") {
+		if ($val >= 0) {
+		    $email_git_min_signatures = $val;
+		}
+		goto &get_maintainer;
+	    } elsif ($sel eq "d") {
+		$email_git_since = $str;
+		goto &get_maintainer;
+	    } elsif ($sel =~ /^\d+$/ && $sel > 0 && $nr <= $count) {
+		$selected{$nr - 1} = !$selected{$nr - 1};
+	    }
+	}
+    }
 
     #drop not selected entries
     $count = 0;
-    my @new_emailto;
-    foreach my $entry (@list){
-	if ($selected{$count}){
-		push(@new_emailto,$list[$count]);
-		print STDERR "$count: ";
-		print STDERR $email_to[$count]->[0];
-		print STDERR ",\t\t ";
-		print STDERR $email_to[$count]->[1];
-		print STDERR "\n";
+    my @new_emailto = ();
+    foreach my $entry (@list) {
+	if ($selected{$count}) {
+	    push(@new_emailto, $list[$count]);
 	}
 	$count++;
     }
-    return \@new_emailto;
+    return @new_emailto;
 }
 
-sub vcs_get_shortlog {
-    my $arg = shift;
-    my ($name, $address) = parse_email($arg);
-    return $shortlog_buffer{$address};
-}
+sub save_commits_by_author {
+    my (@lines) = @_;
 
-sub vcs_file_shortlogs {
-    my ($file) = @_;
-    print STDERR "shortlog processing $file:";
-    foreach my $entry (@email_to){
-	my ($name, $address) = parse_email($entry->[0]);
-	print STDERR ".";
-	my $commits_ref = vcs_email_shortlog($address, $file);
-	push(@{$shortlog_buffer{$address}}, [ $file, $commits_ref ]);
+    my @authors = ();
+    my @commits = ();
+    my @subjects = ();
+
+    foreach my $line (@lines) {
+	push(@authors, $1) if ($line =~ m/$VCS_cmds{"author_pattern"}/);
+	push(@commits, $1) if ($line =~ m/$VCS_cmds{"commit_pattern"}/);
+	push(@subjects, $1) if ($line =~ m/$VCS_cmds{"subject_pattern"}/);
+    }
+
+    for (my $i = 0; $i < @authors; $i++) {
+	push(@{$commit_author_hash{$authors[$i]}},
+	     [ ($commits[$i], $subjects[$i]) ]);
     }
-    print STDERR "\n";
 }
 
-sub vcs_email_shortlog {
-    my $email = shift;
-    my ($file) = @_;
+sub save_commits_by_signer {
+    my (@lines) = @_;
+
+    my $commit = "";
+    my $subject = "";
 
-    my $cmd = $VCS_cmds{"shortlog_cmd"};
-    $cmd =~ s/(\$\w+)/$1/eeg;		#substitute variables
-    my @lines = &{$VCS_cmds{"execute_cmd"}}($cmd);
-    return \@lines;
+    foreach my $line (@lines) {
+	$commit = $1 if ($line =~ m/$VCS_cmds{"commit_pattern"}/);
+	$subject = $1 if ($line =~ m/$VCS_cmds{"subject_pattern"}/);
+	if ($line =~ /^[ \t]*${signature_pattern}.*\@.*$/) {
+	    my @signature = ($line);
+	    my ($types_ref, $signers_ref) = extract_formatted_signatures(@signature);
+	    my @type = @$types_ref;
+	    my @signer = @$signers_ref;
+
+	    push(@{$commit_signer_hash{$signer[0]}},
+		 [ ($commit, $subject, $type[0]) ]);
+	}
+    }
 }
 
 sub vcs_assign {
-- 
1.7.3



* [PATCH 3/8] scripts/get_maintainer.pl: Update --interactive UI, improve hg runtime
  2010-09-23  3:17 [PATCH 0/8] scripts/get_maintainer.pl: Add --interactive Joe Perches
  2010-09-23  3:17 ` [PATCH 1/8] scripts/get_maintainer.pl: add interactive mode Joe Perches
  2010-09-23  3:17 ` [PATCH 2/8] scripts/get_maintainer.pl: Improve --interactive UI Joe Perches
@ 2010-09-23  3:17 ` Joe Perches
  2010-09-23  3:17 ` [PATCH 4/8] scripts/get_maintainer.pl: Use case insensitive name de-duplication Joe Perches
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Joe Perches @ 2010-09-23  3:17 UTC (permalink / raw)
  To: linux-kernel

o Add option --git-blame-signatures (default: on) to search each
  commit that blame reports for the current file for signatures
o Use consistent style in VCS_cmds_(git|hg)
o Add more options to be controlled at the --interactive prompt
o Speed up hg commit searching runtime by issuing a single
  hg command to search all modified commits instead of running
  multiple hg commands with a single commit each.
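
The hg speed-up in the last item can be sketched as below. This is only an illustration: the revision ids are made up, and the command template is an assumption modelled on the find_commit_signers_cmd entry in this patch. Nothing is executed, the sketch only builds the command strings.

```perl
use strict;
use warnings;

# Hypothetical revision ids, standing in for what "hg blame" would report.
my @commits = ("1a2b3c", "4d5e6f", "7a8b9c");

# Assumed template, modelled on the hg find_commit_signers_cmd entry.
my $template = "hg log --template='HgSubject: {desc}\\n' -r %s";

# Before: one hg invocation per commit.
my @per_commit_cmds = map { sprintf($template, $_) } @commits;

# After: join the revisions with " -r " so one invocation covers them all.
my $batched_cmd = sprintf($template, join(" -r ", @commits));

print scalar(@per_commit_cmds), " commands before, 1 after:\n";
print "$batched_cmd\n";
```

Because the template already ends in "-r", joining on " -r " yields one "-r" per revision, so process start-up is paid once rather than once per commit.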

Update version to 0.26-beta3

Signed-off-by: Joe Perches <joe@perches.com>
---
 scripts/get_maintainer.pl |  347 ++++++++++++++++++++++++++++++++++-----------
 1 files changed, 266 insertions(+), 81 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index 3fa639e..f511760 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -13,7 +13,7 @@
 use strict;
 
 my $P = $0;
-my $V = '0.26-beta';
+my $V = '0.26-beta3';
 
 use Getopt::Long qw(:config no_auto_abbrev);
 
@@ -27,6 +27,7 @@ my $email_git_penguin_chiefs = 0;
 my $email_git = 0;
 my $email_git_all_signature_types = 0;
 my $email_git_blame = 0;
+my $email_git_blame_signatures = 1;
 my $email_git_fallback = 1;
 my $email_git_min_signatures = 1;
 my $email_git_max_maintainers = 5;
@@ -51,6 +52,8 @@ my $pattern_depth = 0;
 my $version = 0;
 my $help = 0;
 
+my $vcs_used = 0;
+
 my $exit = 0;
 
 my %commit_author_hash;
@@ -78,7 +81,6 @@ my @signature_tags = ();
 push(@signature_tags, "Signed-off-by:");
 push(@signature_tags, "Reviewed-by:");
 push(@signature_tags, "Acked-by:");
-my $signature_pattern = "\(" . join("|", @signature_tags) . "\)";
 
 # rfc822 email address - preloaded methods go here.
 my $rfc822_lwsp = "(?:(?:\\r\\n)?[ \\t])";
@@ -100,16 +102,19 @@ my %VCS_cmds_git = (
 		      '%b%n"' .
 	    " -- \$file",
     "find_commit_signers_cmd" =>
-	"git log --no-color -1 " .
+	"git log --no-color " .
 	    '--format="GitCommit: %H%n' .
 		      'GitAuthor: %an <%ae>%n' .
 		      'GitDate: %aD%n' .
 		      'GitSubject: %s%n' .
 		      '%b%n"' .
-	    " \$commit",
+	    " -1 \$commit",
     "find_commit_author_cmd" =>
 	"git log --no-color " .
-	    '--format="GitAuthor: %an <%ae>"' .
+	    '--format="GitCommit: %H%n' .
+		      'GitAuthor: %an <%ae>%n' .
+		      'GitDate: %aD%n' .
+		      'GitSubject: %s%n"' .
 	    " -1 \$commit",
     "blame_range_cmd" => "git blame -l -L \$diff_start,+\$diff_length \$file",
     "blame_file_cmd" => "git blame -l \$file",
@@ -124,17 +129,24 @@ my %VCS_cmds_hg = (
     "available" => '(which("hg") ne "") && (-d ".hg")',
     "find_signers_cmd" =>
 	"hg log --date=\$email_hg_since " .
-	    "--template='HgCommit: {node}\\nHgAuthor: {author}\\nHgSubject: {desc}\\n'" .
+	    "--template='HgCommit: {node}\\n" .
+	                "HgAuthor: {author}\\n" .
+			"HgSubject: {desc}\\n'" .
 	    " -- \$file",
-    "find_commit_signers_cmd" => "hg log --template='{desc}\\n' -r \$commit",
+    "find_commit_signers_cmd" =>
+	"hg log " .
+	    "--template='HgSubject: {desc}\\n'" .
+	    " -r \$commit",
     "find_commit_author_cmd" =>
-	"hg log -l 1 " .
-	    "--template='HgAuthor: {author}\\n'" .
-	    "-r \$commit",
+	"hg log " .
+	    "--template='HgCommit: {node}\\n" .
+		        "HgAuthor: {author}\\n" .
+			"HgSubject: {desc|firstline}\\n'" .
+	    " -r \$commit",
     "blame_range_cmd" => "",		# not supported
-    "blame_file_cmd" => "hg blame -c \$file",
+    "blame_file_cmd" => "hg blame -n \$file",
     "commit_pattern" => "^HgCommit: ([0-9a-f]{40,40})",
-    "blame_commit_pattern" => "^([0-9a-f]+):",
+    "blame_commit_pattern" => "^([ 0-9a-f]+):",
     "author_pattern" => "^HgAuthor: (.*)",
     "subject_pattern" => "^HgSubject: (.*)",
 );
@@ -170,6 +182,7 @@ if (!GetOptions(
 		'git!' => \$email_git,
 		'git-all-signature-types!' => \$email_git_all_signature_types,
 		'git-blame!' => \$email_git_blame,
+		'git-blame-signatures!' => \$email_git_blame_signatures,
 		'git-fallback!' => \$email_git_fallback,
 		'git-chief-penguins!' => \$email_git_penguin_chiefs,
 		'git-min-signatures=i' => \$email_git_min_signatures,
@@ -247,10 +260,6 @@ if (!top_of_kernel_tree($lk_path)) {
 	. "a linux kernel source tree.\n";
 }
 
-if ($email_git_all_signature_types) {
-    $signature_pattern = "(.+?)[Bb][Yy]:";
-}
-
 ## Read MAINTAINERS for type/value pairs
 
 my @typevalue = ();
@@ -398,6 +407,7 @@ my @scm = ();
 my @web = ();
 my @subsystem = ();
 my @status = ();
+my $signature_pattern;
 
 my @to = get_maintainer();
 
@@ -440,6 +450,12 @@ sub get_maintainer {
     @subsystem = ();
     @status = ();
 
+    if ($email_git_all_signature_types) {
+	$signature_pattern = "(.+?)[Bb][Yy]:";
+    } else {
+	$signature_pattern = "\(" . join("|", @signature_tags) . "\)";
+    }
+
     # Find responsible parties
 
     foreach my $file (@files) {
@@ -1140,15 +1156,16 @@ sub vcs_find_author {
 
     my @authors = ();
     foreach my $line (@lines) {
-	push(@authors, $1) if ($line =~ m/$VCS_cmds{"author_pattern"}/);
+	if ($line =~ m/$VCS_cmds{"author_pattern"}/) {
+	    my $author = $1;
+	    my ($name, $address) = parse_email($author);
+	    $author = format_email($name, $address, 1);
+	    push(@authors, $author);
+	}
     }
 
-## Reformat email addresses (with names) to avoid badly written signatures
-
-    foreach my $author (@authors) {
-	my ($name, $address) = parse_email($author);
-	$author = format_email($name, $address, 1);
-    }
+    save_commits_by_author(@lines) if ($interactive);
+    save_commits_by_signer(@lines) if ($interactive);
 
     return @authors;
 }
@@ -1222,7 +1239,7 @@ sub vcs_exists {
     %VCS_cmds = %VCS_cmds_git;
     return 1 if eval $VCS_cmds{"available"};
     %VCS_cmds = %VCS_cmds_hg;
-    return 1 if eval $VCS_cmds{"available"};
+    return 2 if eval $VCS_cmds{"available"};
     %VCS_cmds = ();
     if (!$printed_novcs) {
 	warn("$P: No supported VCS found.  Add --nogit to options?\n");
@@ -1234,10 +1251,20 @@ sub vcs_exists {
     return 0;
 }
 
+sub vcs_is_git {
+    return $vcs_used == 1;
+}
+
+sub vcs_is_hg {
+    return $vcs_used == 2;
+}
+
 sub interactive_get_maintainer {
     my ($list_ref) = @_;
     my @list = @$list_ref;
 
+    vcs_exists();
+
     my %selected;
     my %authored;
     my %signed;
@@ -1254,10 +1281,13 @@ sub interactive_get_maintainer {
 
     #menu loop
     my $done = 0;
+    my $print_options = 0;
     my $redraw = 1;
     while (!$done) {
 	$count = 0;
 	if ($redraw) {
+	    printf STDERR "\n%1s %2s %-65sauth sign\n",
+		"*", "#", "email/list and role:stats";
 	    foreach my $entry (@list) {
 		my $email = $entry->[0];
 		my $role = $entry->[1];
@@ -1269,11 +1299,9 @@ sub interactive_get_maintainer {
 		my $signed = 0;
 		$authored++ for (@{$commit_author});
 		$signed++ for (@{$commit_signer});
-		printf STDERR "%1s %2d %-52s",
-			      $sel, $count + 1, $email;
-		printf STDERR " Author:%3d Signer:%3d",
-			      $authored, $signed
-			      if ($authored > 0 || $signed > 0);
+		printf STDERR "%1s %2d %-65s", $sel, $count + 1, $email;
+		printf STDERR "%4d %4d", $authored, $signed
+		    if ($authored > 0 || $signed > 0);
 		printf STDERR "\n     %s\n", $role;
 		if ($authored{$count}) {
 		    my $commit_author = $commit_author_hash{$email};
@@ -1291,15 +1319,42 @@ sub interactive_get_maintainer {
 		$count++;
 	    }
 	}
+	my $date_ref = \$email_git_since;
+	$date_ref = \$email_hg_since if (vcs_is_hg());
+	if ($print_options) {
+	    $print_options = 0;
+	    if (vcs_exists()) {
+		print STDERR
+"\nVersion Control options:\n" .
+"g  use git history      [$email_git]\n" .
+"gf use git-fallback     [$email_git_fallback]\n" .
+"b  use git blame        [$email_git_blame]\n" .
+"bs use blame signatures [$email_git_blame_signatures]\n" .
+"c# minimum commits      [$email_git_min_signatures]\n" .
+"%# min percent          [$email_git_min_percent]\n" .
+"d# history to use       [$$date_ref]\n" .
+"x# max maintainers      [$email_git_max_maintainers]\n" .
+"t  all signature types  [$email_git_all_signature_types]\n";
+	    }
+	    print STDERR "\nAdditional options:\n" .
+"0  toggle all\n" .
+"f  emails in file       [$file_emails]\n" .
+"k  keywords in file     [$keywords]\n" .
+"r  remove duplicates    [$email_remove_duplicates]\n" .
+"p# pattern match depth  [$pattern_depth]\n";
+	}
 	print STDERR
 "\n#(toggle), A#(author), S#(signed) *(all), ^(none), O(options), Y(approve): ";
+
 	my $input = <STDIN>;
 	chomp($input);
 
-	my @wish = split(/[, ]+/, $input);
 	$redraw = 1;
+	my $rerun = 0;
+	my @wish = split(/[, ]+/, $input);
 	foreach my $nr (@wish) {
-	    my $sel = lc(substr($nr, 0, 1));
+	    $nr = lc($nr);
+	    my $sel = substr($nr, 0, 1);
 	    my $str = substr($nr, 1);
 	    my $val = 0;
 	    $val = $1 if $str =~ /^(\d+)$/;
@@ -1310,6 +1365,8 @@ sub interactive_get_maintainer {
 		$output_rolestats = 0;
 		$output_roles = 0;
 		last;
+	    } elsif ($nr =~ /^\d+$/ && $nr > 0 && $nr <= $count) {
+		$selected{$nr - 1} = !$selected{$nr - 1};
 	    } elsif ($sel eq "*" || $sel eq '^') {
 		my $toggle = 0;
 		$toggle = 1 if ($sel eq '*');
@@ -1334,7 +1391,6 @@ sub interactive_get_maintainer {
 		if ($val > 0 && $val <= $count) {
 		    $signed{$val - 1} = !$signed{$val - 1};
 		} elsif ($str eq '*' || $str eq '^') {
-		    print("yes\n");
 		    my $toggle = 0;
 		    $toggle = 1 if ($str eq '*');
 		    for (my $i = 0; $i < $count; $i++) {
@@ -1342,38 +1398,71 @@ sub interactive_get_maintainer {
 		    }
 		}
 	    } elsif ($sel eq "o") {
-		print STDERR
-"0(toggle all) g(use git history([$email_git]) b(Use git blame[$email_git_blame])\n" .
-"c#(minimum commits[$email_git_min_signatures]) x#(max maintainers[$email_git_max_maintainers] d#(history to use[$email_git_since])\n";
-		$redraw = 0;
-	    } elsif ($sel eq "b") {
-		$email_git_blame = !$email_git_blame;
-		goto &get_maintainer;
+		$print_options = 1;
+		$redraw = 1;
 	    } elsif ($sel eq "g") {
-		$email_git = !$email_git;
-		goto &get_maintainer;
+		if ($str eq "f") {
+		    bool_invert(\$email_git_fallback);
+		} else {
+		    bool_invert(\$email_git);
+		}
+		$rerun = 1;
+	    } elsif ($sel eq "b") {
+		if ($str eq "s") {
+		    bool_invert(\$email_git_blame_signatures);
+		} else {
+		    bool_invert(\$email_git_blame);
+		}
+		$rerun = 1;
+	    } elsif ($sel eq "c") {
+		if ($val > 0) {
+		    $email_git_min_signatures = $val;
+		    $rerun = 1;
+		}
 	    } elsif ($sel eq "x") {
 		if ($val > 0) {
 		    $email_git_max_maintainers = $val;
+		    $rerun = 1;
 		}
-		goto &get_maintainer;
 	    } elsif ($sel eq "%") {
-		if ($val >= 0) {
+		if ($str ne "" && $val >= 0) {
 		    $email_git_min_percent = $val;
+		    $rerun = 1;
 		}
-		goto &get_maintainer;
-	    } elsif ($sel eq "c") {
-		if ($val >= 0) {
-		    $email_git_min_signatures = $val;
-		}
-		goto &get_maintainer;
 	    } elsif ($sel eq "d") {
-		$email_git_since = $str;
-		goto &get_maintainer;
-	    } elsif ($sel =~ /^\d+$/ && $sel > 0 && $nr <= $count) {
-		$selected{$nr - 1} = !$selected{$nr - 1};
+		if (vcs_is_git()) {
+		    $email_git_since = $str;
+		} elsif (vcs_is_hg()) {
+		    $email_hg_since = $str;
+		}
+		$rerun = 1;
+	    } elsif ($sel eq "t") {
+		bool_invert(\$email_git_all_signature_types);
+		$rerun = 1;
+	    } elsif ($sel eq "f") {
+		bool_invert(\$file_emails);
+		$rerun = 1;
+	    } elsif ($sel eq "r") {
+		bool_invert(\$email_remove_duplicates);
+		$rerun = 1;
+	    } elsif ($sel eq "k") {
+		bool_invert(\$keywords);
+		$rerun = 1;
+	    } elsif ($sel eq "p") {
+		if ($str ne "" && $val >= 0) {
+		    $pattern_depth = $val;
+		    $rerun = 1;
+		}
+	    } else {
+		print STDERR "invalid option: '$nr'\n";
+		$redraw = 0;
 	    }
 	}
+	if ($rerun) {
+	    print STDERR "git-blame can be very slow, please have patience..."
+		if ($email_git_blame);
+	    goto &get_maintainer;
+	}
     }
 
     #drop not selected entries
@@ -1388,6 +1477,16 @@ sub interactive_get_maintainer {
     return @new_emailto;
 }
 
+sub bool_invert {
+    my ($bool_ref) = @_;
+
+    if ($$bool_ref) {
+	$$bool_ref = 0;
+    } else {
+	$$bool_ref = 1;
+    }
+}
+
 sub save_commits_by_author {
     my (@lines) = @_;
 
@@ -1396,14 +1495,29 @@ sub save_commits_by_author {
     my @subjects = ();
 
     foreach my $line (@lines) {
-	push(@authors, $1) if ($line =~ m/$VCS_cmds{"author_pattern"}/);
+	if ($line =~ m/$VCS_cmds{"author_pattern"}/) {
+	    my $author = $1;
+	    my ($name, $address) = parse_email($author);
+	    $author = format_email($name, $address, 1);
+	    push(@authors, $author);
+	}
 	push(@commits, $1) if ($line =~ m/$VCS_cmds{"commit_pattern"}/);
 	push(@subjects, $1) if ($line =~ m/$VCS_cmds{"subject_pattern"}/);
     }
 
     for (my $i = 0; $i < @authors; $i++) {
-	push(@{$commit_author_hash{$authors[$i]}},
-	     [ ($commits[$i], $subjects[$i]) ]);
+	my $exists = 0;
+	foreach my $ref(@{$commit_author_hash{$authors[$i]}}) {
+	    if (@{$ref}[0] eq $commits[$i] &&
+		@{$ref}[1] eq $subjects[$i]) {
+		$exists = 1;
+		last;
+	    }
+	}
+	if (!$exists) {
+	    push(@{$commit_author_hash{$authors[$i]}},
+		 [ ($commits[$i], $subjects[$i]) ]);
+	}
     }
 }
 
@@ -1417,13 +1531,27 @@ sub save_commits_by_signer {
 	$commit = $1 if ($line =~ m/$VCS_cmds{"commit_pattern"}/);
 	$subject = $1 if ($line =~ m/$VCS_cmds{"subject_pattern"}/);
 	if ($line =~ /^[ \t]*${signature_pattern}.*\@.*$/) {
-	    my @signature = ($line);
-	    my ($types_ref, $signers_ref) = extract_formatted_signatures(@signature);
-	    my @type = @$types_ref;
-	    my @signer = @$signers_ref;
-
-	    push(@{$commit_signer_hash{$signer[0]}},
-		 [ ($commit, $subject, $type[0]) ]);
+	    my @signatures = ($line);
+	    my ($types_ref, $signers_ref) = extract_formatted_signatures(@signatures);
+	    my @types = @$types_ref;
+	    my @signers = @$signers_ref;
+
+	    my $type = $types[0];
+	    my $signer = $signers[0];
+
+	    my $exists = 0;
+	    foreach my $ref(@{$commit_signer_hash{$signer}}) {
+		if (@{$ref}[0] eq $commit &&
+		    @{$ref}[1] eq $subject &&
+		    @{$ref}[2] eq $type) {
+		    $exists = 1;
+		    last;
+		}
+	    }
+	    if (!$exists) {
+		push(@{$commit_signer_hash{$signer}},
+		     [ ($commit, $subject, $type) ]);
+	    }
 	}
     }
 }
@@ -1478,7 +1606,8 @@ sub vcs_file_signoffs {
     my @signers = ();
     my $commits;
 
-    return if (!vcs_exists());
+    $vcs_used = vcs_exists();
+    return if (!$vcs_used);
 
     my $cmd = $VCS_cmds{"find_signers_cmd"};
     $cmd =~ s/(\$\w+)/$1/eeg;		# interpolate $cmd
@@ -1496,37 +1625,93 @@ sub vcs_file_blame {
     my $total_commits;
     my $total_lines;
 
-    return if (!vcs_exists());
+    $vcs_used = vcs_exists();
+    return if (!$vcs_used);
 
     @all_commits = vcs_blame($file);
     @commits = uniq(@all_commits);
     $total_commits = @commits;
     $total_lines = @all_commits;
 
-    foreach my $commit (@commits) {
-	my $commit_count;
-	my @commit_signers = ();
+    if ($email_git_blame_signatures) {
+	if (vcs_is_hg()) {
+	    my $commit_count;
+	    my @commit_signers = ();
+	    my $commit = join(" -r ", @commits);
+	    my $cmd;
+
+	    $cmd = $VCS_cmds{"find_commit_signers_cmd"};
+	    $cmd =~ s/(\$\w+)/$1/eeg;	#substitute variables in $cmd
+
+	    ($commit_count, @commit_signers) = vcs_find_signers($cmd);
+
+	    push(@signers, @commit_signers);
+	} else {
+	    foreach my $commit (@commits) {
+		my $commit_count;
+		my @commit_signers = ();
+		my $cmd;
 
-	my $cmd = $VCS_cmds{"find_commit_signers_cmd"};
-	$cmd =~ s/(\$\w+)/$1/eeg;	#substitute variables in $cmd
+		$cmd = $VCS_cmds{"find_commit_signers_cmd"};
+		$cmd =~ s/(\$\w+)/$1/eeg;	#substitute variables in $cmd
 
-	($commit_count, @commit_signers) = vcs_find_signers($cmd);
+		($commit_count, @commit_signers) = vcs_find_signers($cmd);
 
-	push(@signers, @commit_signers);
+		push(@signers, @commit_signers);
+	    }
+	}
     }
 
     if ($from_filename) {
 	if ($output_rolestats) {
 	    my @blame_signers;
-	    foreach my $commit (@commits) {
-		my $i;
-		my $cmd = $VCS_cmds{"find_commit_author_cmd"};
-		$cmd =~ s/(\$\w+)/$1/eeg;	#interpolate $cmd
-		my @author = vcs_find_author($cmd);
-		next if !@author;
-		my $count = grep(/$commit/, @all_commits);
-		for ($i = 0; $i < $count ; $i++) {
-		    push(@blame_signers, $author[0]);
+	    if (vcs_is_hg()) {{		# Double brace for last exit
+		my $commit_count;
+		my @commit_signers = ();
+		@commits = uniq(@commits);
+		@commits = sort(@commits);
+		my $commit = join(" -r ", @commits);
+		my $cmd;
+
+		$cmd = $VCS_cmds{"find_commit_author_cmd"};
+		$cmd =~ s/(\$\w+)/$1/eeg;	#substitute variables in $cmd
+
+		my @lines = ();
+
+		@lines = &{$VCS_cmds{"execute_cmd"}}($cmd);
+
+		if (!$email_git_penguin_chiefs) {
+		    @lines = grep(!/${penguin_chiefs}/i, @lines);
+		}
+
+		last if !@lines;
+
+		my @authors = ();
+		foreach my $line (@lines) {
+		    if ($line =~ m/$VCS_cmds{"author_pattern"}/) {
+			my $author = $1;
+			my ($name, $address) = parse_email($author);
+			$author = format_email($name, $address, 1);
+			push(@authors, $author);
+		    }
+		}
+
+		save_commits_by_author(@lines) if ($interactive);
+		save_commits_by_signer(@lines) if ($interactive);
+
+		push(@signers, @authors);
+	    }}
+	    else {
+		foreach my $commit (@commits) {
+		    my $i;
+		    my $cmd = $VCS_cmds{"find_commit_author_cmd"};
+		    $cmd =~ s/(\$\w+)/$1/eeg;	#interpolate $cmd
+		    my @author = vcs_find_author($cmd);
+		    next if !@author;
+		    my $count = grep(/$commit/, @all_commits);
+		    for ($i = 0; $i < $count ; $i++) {
+			push(@blame_signers, $author[0]);
+		    }
 		}
 	    }
 	    if (@blame_signers) {
-- 
1.7.3



* [PATCH 4/8] scripts/get_maintainer.pl: Use case insensitive name de-duplication
  2010-09-23  3:17 [PATCH 0/8] scripts/get_maintainer.pl: Add --interactive Joe Perches
                   ` (2 preceding siblings ...)
  2010-09-23  3:17 ` [PATCH 3/8] scripts/get_maintainer.pl: Update --interactive UI, improve hg runtime Joe Perches
@ 2010-09-23  3:17 ` Joe Perches
  2010-09-23  3:17 ` [PATCH 5/8] scripts/get_maintainer.pl: fix mailmap handling Joe Perches
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Joe Perches @ 2010-09-23  3:17 UTC (permalink / raw)
  To: linux-kernel

Case insensitive name and email address matching can help reduce
duplication when authors don't always use the exact same signature.
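
A minimal sketch of the idea, assuming lookup hashes keyed on lower-cased names and addresses in the way email_inuse and push_email_address use them. The add_unique helper, the hash names and the sample people are hypothetical and are not part of the script.

```perl
use strict;
use warnings;

my (%seen_name, %seen_address);
my @email_to;

# Returns 1 if the entry was added, 0 if it was treated as a duplicate.
sub add_unique {
    my ($name, $address) = @_;

    # Keys are lower-cased, so "Jane" and "jane" land on the same slot.
    return 0 if ($name ne "" && exists $seen_name{lc($name)});
    return 0 if ($address ne "" && exists $seen_address{lc($address)});

    $seen_name{lc($name)}++ if ($name ne "");
    $seen_address{lc($address)}++ if ($address ne "");
    push(@email_to, "$name <$address>");
    return 1;
}

add_unique("Jane Hacker", 'jane@example.org');
add_unique("jane hacker", 'JANE@work.example.com');  # same name, other case
add_unique("J. Hacker", 'Jane@Example.org');         # same address, other case

print scalar(@email_to), " unique entry\n";
```

Only the first call adds an entry. The second is rejected on the name and the third on the address, and the stored entry keeps the spelling that was seen first.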

o Add a per-file exact_match hash for --interactive so git history
  is checked per file, and only when that file has no direct maintainer
o Make @interactive_to list global so save_commits_by_<foo> can check
  email names & addresses against this list for duplication
o Don't allow --interactive and --sections
o Rename subroutine get_maintainer to get_maintainers
o Add help text option to --interactive menu prompt
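
The per-file exact-match bookkeeping from the first item can be sketched roughly as follows. The patterns and file names are hypothetical, and plain prefix matching via index() is a simplifying assumption that stands in for the script's real pattern matching. The depth comparison mirrors the $value_pd / $file_pd test in the diff.

```perl
use strict;
use warnings;

# Hypothetical MAINTAINERS-style file patterns and input files.
my @patterns = ("drivers/net/sky2.c", "drivers/");
my @files = ("drivers/net/sky2.c", "drivers/net/other.c");

my %exact_pattern_match_hash;

# Pass 1: remember, per file, whether a pattern at least as deep as the
# file's own path matched it.
foreach my $file (@files) {
    my $file_pd = ($file =~ tr@/@@);           # count of "/" in the file
    foreach my $value (@patterns) {
        next if (index($file, $value) != 0);   # assumed prefix match only
        my $value_pd = ($value =~ tr@/@@);
        $value_pd++ if (substr($value, -1, 1) ne "/");
        $exact_pattern_match_hash{$file} = 1 if ($value_pd >= $file_pd);
    }
}

# Pass 2: only files without an exact match need the VCS fallback.
my @needs_history = grep { !$exact_pattern_match_hash{$_} } @files;
print "fall back to VCS history for: @needs_history\n";
```

Keeping the flag in a hash keyed by file name, rather than in one scalar per loop iteration, is what lets the history lookup run in a separate loop after all MAINTAINERS matches have been collected.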

Update version to 0.26-beta4

Signed-off-by: Joe Perches <joe@perches.com>
---
 scripts/get_maintainer.pl |  135 +++++++++++++++++++++++++++++++++-----------
 1 files changed, 101 insertions(+), 34 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index f511760..61d3bb5 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -13,7 +13,7 @@
 use strict;
 
 my $P = $0;
-my $V = '0.26-beta3';
+my $V = '0.26-beta4';
 
 use Getopt::Long qw(:config no_auto_abbrev);
 
@@ -242,6 +242,7 @@ if ($sections) {
     $subsystem = 0;
     $web = 0;
     $keywords = 0;
+    $interactive = 0;
 } else {
     my $selections = $email + $scm + $status + $subsystem + $web;
     if ($selections == 0) {
@@ -407,13 +408,15 @@ my @scm = ();
 my @web = ();
 my @subsystem = ();
 my @status = ();
+my @interactive_to = ();
 my $signature_pattern;
 
-my @to = get_maintainer();
+my @maintainers = get_maintainers();
 
-@to = merge_email(@to);
-
-output(@to) if (@to);
+if (@maintainers) {
+    @maintainers = merge_email(@maintainers);
+    output(@maintainers);
+}
 
 if ($scm) {
     @scm = uniq(@scm);
@@ -437,7 +440,7 @@ if ($web) {
 
 exit($exit);
 
-sub get_maintainer {
+sub get_maintainers {
     %email_hash_name = ();
     %email_hash_address = ();
     %commit_author_hash = ();
@@ -449,7 +452,7 @@ sub get_maintainer {
     @web = ();
     @subsystem = ();
     @status = ();
-
+    @interactive_to = ();
     if ($email_git_all_signature_types) {
 	$signature_pattern = "(.+?)[Bb][Yy]:";
     } else {
@@ -458,10 +461,11 @@ sub get_maintainer {
 
     # Find responsible parties
 
+    my %exact_pattern_match_hash;
+
     foreach my $file (@files) {
 
 	my %hash;
-	my $exact_pattern_match = 0;
 	my $tvi = find_first_section();
 	while ($tvi < @typevalue) {
 	    my $start = find_starting_index($tvi);
@@ -497,7 +501,9 @@ sub get_maintainer {
 				my $file_pd = ($file  =~ tr@/@@);
 				$value_pd++ if (substr($value,-1,1) ne "/");
 				$value_pd = -1 if ($value =~ /^\.\*/);
-				$exact_pattern_match = 1 if ($value_pd >= $file_pd);
+				if ($value_pd >= $file_pd) {
+				    $exact_pattern_match_hash{$file} = 1;
+				}
 				if ($pattern_depth == 0 ||
 				    (($file_pd - $value_pd) < $pattern_depth)) {
 				    $hash{$tvi} = $value_pd;
@@ -530,14 +536,6 @@ sub get_maintainer {
 		print("\n");
 	    }
 	}
-
-	if ($email && ($email_git ||
-		       ($email_git_fallback && !$exact_pattern_match))) {
-	    vcs_file_signoffs($file);
-	}
-	if ($email && $email_git_blame) {
-	    vcs_file_blame($file);
-	}
     }
 
     if ($keywords) {
@@ -547,6 +545,19 @@ sub get_maintainer {
 	}
     }
 
+    @interactive_to = (@email_to, @list_to);
+
+    foreach my $file (@files) {
+	if ($email &&
+	    ($email_git || ($email_git_fallback &&
+			    !$exact_pattern_match_hash{$file}))) {
+	    vcs_file_signoffs($file);
+	}
+	if ($email && $email_git_blame) {
+	    vcs_file_blame($file);
+	}
+    }
+
     if ($email) {
 	foreach my $chief (@penguin_chief) {
 	    if ($chief =~ m/^(.*):(.*)/) {
@@ -580,7 +591,10 @@ sub get_maintainer {
 	}
     }
 
-    @to = interactive_get_maintainer(\@to) if ($interactive);
+    if ($interactive) {
+	@interactive_to = @to;
+	@to = interactive_get_maintainers(\@interactive_to);
+    }
 
     return @to;
 }
@@ -899,16 +913,16 @@ sub add_categories {
 		}
 		if ($list_additional =~ m/subscribers-only/) {
 		    if ($email_subscriber_list) {
-			if (!$hash_list_to{$list_address}) {
-			    $hash_list_to{$list_address} = 1;
+			if (!$hash_list_to{lc($list_address)}) {
+			    $hash_list_to{lc($list_address)} = 1;
 			    push(@list_to, [$list_address,
 					    "subscriber list${list_role}"]);
 			}
 		    }
 		} else {
 		    if ($email_list) {
-			if (!$hash_list_to{$list_address}) {
-			    $hash_list_to{$list_address} = 1;
+			if (!$hash_list_to{lc($list_address)}) {
+			    $hash_list_to{lc($list_address)} = 1;
 			    push(@list_to, [$list_address,
 					    "open list${list_role}"]);
 			}
@@ -946,8 +960,8 @@ sub email_inuse {
     my ($name, $address) = @_;
 
     return 1 if (($name eq "") && ($address eq ""));
-    return 1 if (($name ne "") && exists($email_hash_name{$name}));
-    return 1 if (($address ne "") && exists($email_hash_address{$address}));
+    return 1 if (($name ne "") && exists($email_hash_name{lc($name)}));
+    return 1 if (($address ne "") && exists($email_hash_address{lc($address)}));
 
     return 0;
 }
@@ -965,8 +979,8 @@ sub push_email_address {
 	push(@email_to, [format_email($name, $address, $email_usename), $role]);
     } elsif (!email_inuse($name, $address)) {
 	push(@email_to, [format_email($name, $address, $email_usename), $role]);
-	$email_hash_name{$name}++;
-	$email_hash_address{$address}++;
+	$email_hash_name{lc($name)}++;
+	$email_hash_address{lc($address)}++;
     }
 
     return 1;
@@ -1259,7 +1273,7 @@ sub vcs_is_hg {
     return $vcs_used == 2;
 }
 
-sub interactive_get_maintainer {
+sub interactive_get_maintainers {
     my ($list_ref) = @_;
     my @list = @$list_ref;
 
@@ -1269,11 +1283,12 @@ sub interactive_get_maintainer {
     my %authored;
     my %signed;
     my $count = 0;
-
+    my $maintained = 0;
     #select maintainers by default
-    foreach my $entry (@list){
+    foreach my $entry (@list) {
 	my $role = $entry->[1];
-	$selected{$count} = ($role =~ /^(maintainer|supporter|open list)/);
+	$selected{$count} = ($role =~ /^(maintainer|supporter|open list)/i);
+	$maintained = 1 if ($role =~ /^(maintainer|supporter)/i);
 	$authored{$count} = 0;
 	$signed{$count} = 0;
 	$count++;
@@ -1286,8 +1301,14 @@ sub interactive_get_maintainer {
     while (!$done) {
 	$count = 0;
 	if ($redraw) {
-	    printf STDERR "\n%1s %2s %-65sauth sign\n",
-		"*", "#", "email/list and role:stats";
+	    printf STDERR "\n%1s %2s %-65s",
+			  "*", "#", "email/list and role:stats";
+	    if ($email_git ||
+		($email_git_fallback && !$maintained) ||
+		$email_git_blame) {
+		print STDERR "auth sign";
+	    }
+	    print STDERR "\n";
 	    foreach my $entry (@list) {
 		my $email = $entry->[0];
 		my $role = $entry->[1];
@@ -1453,6 +1474,27 @@ sub interactive_get_maintainer {
 		    $pattern_depth = $val;
 		    $rerun = 1;
 		}
+	    } elsif ($sel eq "h" || $sel eq "?") {
+		print STDERR <<EOT
+
+Interactive mode allows you to select the various maintainers, submitters,
+commit signers and mailing lists that could be CC'd on a patch.
+
+Any *'d entry is selected.
+
+If you have git or hg installed, you can choose to summarize the commit
+history of files in the patch.  Also, each line of the current file can
+be matched to its commit author and that commit's signers with blame.
+
+Various knobs exist to control the length of time for active commit
+tracking, the maximum number of commit authors and signers to add,
+and such.
+
+Enter selections at the prompt until you are satisfied that the selected
+maintainers are appropriate.  You may enter multiple selections separated
+by either commas or spaces.
+
+EOT
 	    } else {
 		print STDERR "invalid option: '$nr'\n";
 		$redraw = 0;
@@ -1461,7 +1503,7 @@ sub interactive_get_maintainer {
 	if ($rerun) {
 	    print STDERR "git-blame can be very slow, please have patience..."
 		if ($email_git_blame);
-	    goto &get_maintainer;
+	    goto &get_maintainers;
 	}
     }
 
@@ -1496,9 +1538,20 @@ sub save_commits_by_author {
 
     foreach my $line (@lines) {
 	if ($line =~ m/$VCS_cmds{"author_pattern"}/) {
+	    my $matched = 0;
 	    my $author = $1;
 	    my ($name, $address) = parse_email($author);
-	    $author = format_email($name, $address, 1);
+	    foreach my $to (@interactive_to) {
+		my ($to_name, $to_address) = parse_email($to->[0]);
+		if ($email_remove_duplicates &&
+		    ((lc($name) eq lc($to_name)) ||
+		     (lc($address) eq lc($to_address)))) {
+		    $author = $to->[0];
+		    $matched = 1;
+		    last;
+		}
+	    }
+	    $author = format_email($name, $address, 1) if (!$matched);
 	    push(@authors, $author);
 	}
 	push(@commits, $1) if ($line =~ m/$VCS_cmds{"commit_pattern"}/);
@@ -1539,6 +1592,20 @@ sub save_commits_by_signer {
 	    my $type = $types[0];
 	    my $signer = $signers[0];
 
+	    my $matched = 0;
+	    my ($name, $address) = parse_email($signer);
+	    foreach my $to (@interactive_to) {
+		my ($to_name, $to_address) = parse_email($to->[0]);
+		if ($email_remove_duplicates &&
+		    ((lc($name) eq lc($to_name)) ||
+		     (lc($address) eq lc($to_address)))) {
+		    $signer = $to->[0];
+		    $matched = 1;
+		    last;
+		}
+		$signer = format_email($name, $address, 1) if (!$matched);
+	    }
+
 	    my $exists = 0;
 	    foreach my $ref(@{$commit_signer_hash{$signer}}) {
 		if (@{$ref}[0] eq $commit &&
-- 
1.7.3



* [PATCH 5/8] scripts/get_maintainer.pl: fix mailmap handling
  2010-09-23  3:17 [PATCH 0/8] scripts/get_maintainer.pl: Add --interactive Joe Perches
                   ` (3 preceding siblings ...)
  2010-09-23  3:17 ` [PATCH 4/8] scripts/get_maintainer.pl: Use case insensitive name de-duplication Joe Perches
@ 2010-09-23  3:17 ` Joe Perches
  2010-09-23  7:12   ` Florian Mickler
  2010-09-23  3:17 ` [PATCH 6/8] scripts/get_maintainer.pl: Correct indentation in a few places Joe Perches
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 11+ messages in thread
From: Joe Perches @ 2010-09-23  3:17 UTC (permalink / raw)
  To: linux-kernel

From: florian@mickler.org <florian@mickler.org>

Implement it, like it is described in git-shortlog.

Signed-off-by: Florian Mickler <florian@mickler.org>
Signed-off-by: Joe Perches <joe@perches.com>
---
 scripts/get_maintainer.pl |  147 +++++++++++++++++++++++++++++++++------------
 1 files changed, 109 insertions(+), 38 deletions(-)
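
For readers unfamiliar with the four .mailmap entry formats named in
git-shortlog, here is a minimal standalone sketch of how each one can be
classified and then looked up. It is not the code from this patch:
parse_mailmap_line, map_email and every name and address in it are
hypothetical, and it uses plain "Name <address>" strings as keys instead of
the script's parse_email/format_email helpers.

```perl
use strict;
use warnings;

# Hypothetical helper: file one .mailmap line into two lookup tables.
# $map->{names}{KEY} and $map->{addresses}{KEY} hold the canonical name and
# address, where KEY is either a bare address or "Name <address>".
sub parse_mailmap_line {
    my ($map, $line) = @_;

    $line =~ s/#.*$//;          # strip comments
    $line =~ s/^\s+|\s+$//g;    # trim
    return if ($line eq "");

    if ($line =~ /^<([^<>\s]+)>\s*<([^<>\s]+)>$/) {
        # <proper address> <commit address>
        $map->{addresses}{$2} = $1;
    } elsif ($line =~ /^([^<]+?)\s*<([^<>\s]+)>\s*<([^<>\s]+)>$/) {
        # Proper Name <proper address> <commit address>
        $map->{names}{$3} = $1;
        $map->{addresses}{$3} = $2;
    } elsif ($line =~ /^([^<]+?)\s*<([^<>\s]+)>\s*([^<]+?)\s*<([^<>\s]+)>$/) {
        # Proper Name <proper address> Commit Name <commit address>
        $map->{names}{"$3 <$4>"} = $1;
        $map->{addresses}{"$3 <$4>"} = $2;
    } elsif ($line =~ /^([^<]+?)\s*<([^<>\s]+)>$/) {
        # Proper Name <commit address>
        $map->{names}{$2} = $1;
    }
}

# Hypothetical lookup: prefer an entry keyed on name plus address,
# otherwise fall back to an entry keyed on the bare address.
sub map_email {
    my ($map, $name, $address) = @_;
    my $full = "$name <$address>";
    my $key = (exists $map->{names}{$full} ||
               exists $map->{addresses}{$full}) ? $full : $address;
    my $real_name = $map->{names}{$key} // $name;
    my $real_address = $map->{addresses}{$key} // $address;
    return "$real_name <$real_address>";
}

my $map = { names => {}, addresses => {} };
parse_mailmap_line($map, $_) for (
    'Jane Doe <jane@new.example> <jane@old.example>',
    '<sam@new.example> <sam@old.example>',
);
print map_email($map, 'jane', 'jane@old.example'), "\n";
print map_email($map, 'Sam Roe', 'sam@old.example'), "\n";
```

The name-plus-address key is tried first so that an entry naming both the
commit name and the commit address takes precedence over one that names
only the address.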

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index 61d3bb5..faeace4 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -295,31 +295,76 @@ while (<$maint>) {
 }
 close($maint);
 
-my %mailmap;
 
-if ($email_remove_duplicates) {
-    open(my $mailmap, '<', "${lk_path}.mailmap")
+#
+# Read mail address map
+#
+
+my $mailmap = read_mailmap();
+
+sub read_mailmap {
+    my $mailmap = {
+	names => {},
+	addresses => {}
+   };
+
+    if (!$email_remove_duplicates) {
+	return $mailmap;
+    }
+
+    open(my $mailmap_file, '<', "${lk_path}.mailmap")
 	or warn "$P: Can't open .mailmap: $!\n";
-    while (<$mailmap>) {
-	my $line = $_;
 
-	next if ($line =~ m/^\s*#/);
-	next if ($line =~ m/^\s*$/);
+    while (<$mailmap_file>) {
+	s/#.*$//; #strip comments
+	s/^\s+|\s+$//g; #trim
 
-	my ($name, $address) = parse_email($line);
-	$line = format_email($name, $address, $email_usename);
+	next if (/^\s*$/); #skip empty lines
+	#entries have one of the following formats:
+	# name1 <mail1>
+	# <mail1> <mail2>
+	# name1 <mail1> <mail2>
+	# name1 <mail1> name2 <mail2>
+	# (see man git-shortlog)
+	if (/^(.+)<(.+)>$/) {
+		my $real_name = $1;
+		my $address = $2;
 
-	next if ($line =~ m/^\s*$/);
+		$real_name =~ s/\s+$//;
+		$mailmap->{names}->{$address} = $real_name;
 
-	if (exists($mailmap{$name})) {
-	    my $obj = $mailmap{$name};
-	    push(@$obj, $address);
-	} else {
-	    my @arr = ($address);
-	    $mailmap{$name} = \@arr;
+	} elsif (/^<([^\s]+)>\s*<([^\s]+)>$/) {
+		my $real_address = $1;
+		my $wrong_address = $2;
+
+		$mailmap->{addresses}->{$wrong_address} = $real_address;
+
+	} elsif (/^(.+)<([^\s]+)>\s*<([^\s]+)>$/) {
+		my $real_name= $1;
+		my $real_address = $2;
+		my $wrong_address = $3;
+
+		$real_name =~ s/\s+$//;
+
+		$mailmap->{names}->{$wrong_address} = $real_name;
+		$mailmap->{addresses}->{$wrong_address} = $real_address;
+
+	} elsif (/^(.+)<([^\s]+)>\s*([^\s].*)<([^\s]+)>$/) {
+		my $real_name = $1;
+		my $real_address = $2;
+		my $wrong_name = $3;
+		my $wrong_address = $4;
+
+		$real_name =~ s/\s+$//;
+		$wrong_name =~ s/\s+$//;
+
+		$mailmap->{names}->{format_email($wrong_name,$wrong_address,1)} = $real_name;
+		$mailmap->{addresses}->{format_email($wrong_name,$wrong_address,1)} = $real_address;
 	}
     }
-    close($mailmap);
+    close($mailmap_file);
+
+    return $mailmap;
 }
 
 ## use the filenames on the command line or find the filenames in the patchfiles
@@ -1061,30 +1106,58 @@ sub which_conf {
     return "";
 }
 
-sub mailmap {
-    my (@lines) = @_;
-    my %hash;
+sub mailmap_email {
+	my $line = shift;
 
-    foreach my $line (@lines) {
 	my ($name, $address) = parse_email($line);
-	if (!exists($hash{$name})) {
-	    $hash{$name} = $address;
-	} elsif ($address ne $hash{$name}) {
-	    $address = $hash{$name};
-	    $line = format_email($name, $address, $email_usename);
-	}
-	if (exists($mailmap{$name})) {
-	    my $obj = $mailmap{$name};
-	    foreach my $map_address (@$obj) {
-		if (($map_address eq $address) &&
-		    ($map_address ne $hash{$name})) {
-		    $line = format_email($name, $hash{$name}, $email_usename);
+	my $email = format_email($name, $address, 1);
+	my $real_name = $name;
+	my $real_address = $address;
+
+	if (exists $mailmap->{names}->{$email} || exists $mailmap->{addresses}->{$email}) {
+		if (exists $mailmap->{names}->{$email}) {
+			$real_name = $mailmap->{names}->{$email};
+		}
+		if (exists $mailmap->{addresses}->{$email}) {
+			$real_address = $mailmap->{addresses}->{$email};
+		}
+	} else {
+		if (exists $mailmap->{names}->{$address}) {
+			$real_name = $mailmap->{names}->{$address};
+		}
+		if (exists $mailmap->{addresses}->{$address}) {
+			$real_address = $mailmap->{addresses}->{$address};
 		}
-	    }
 	}
+	return format_email($real_name, $real_address, 1);
+}
+
+sub mailmap {
+    my (@addresses) = @_;
+
+    my @ret = ();
+    foreach my $line (@addresses) {
+	push(@ret, mailmap_email($line), 1);
     }
 
-    return @lines;
+    merge_by_realname(@ret) if $email_remove_duplicates;
+
+    return @ret;
+}
+
+sub merge_by_realname {
+	my %address_map;
+	my (@emails) = @_;
+	foreach my $email (@emails) {
+		my ($name, $address) = parse_email($email);
+		if (!exists $address_map{$name}) {
+			$address_map{$name} = $address;
+		} else {
+			$address = $address_map{$name};
+			$email = format_email($name,$address,1);
+		}
+	}
+
 }
 
 sub git_execute_cmd {
@@ -1636,9 +1709,7 @@ sub vcs_assign {
 	$divisor = 1;
     }
 
-    if ($email_remove_duplicates) {
-	@lines = mailmap(@lines);
-    }
+    @lines = mailmap(@lines);
 
     return if (@lines <= 0);
 
-- 
1.7.3



* [PATCH 6/8] scripts/get_maintainer.pl: Correct indentation in a few places
  2010-09-23  3:17 [PATCH 0/8] scripts/get_maintainer.pl: Add --interactive Joe Perches
                   ` (4 preceding siblings ...)
  2010-09-23  3:17 ` [PATCH 5/8] scripts/get_maintainer.pl: fix mailmap handling Joe Perches
@ 2010-09-23  3:17 ` Joe Perches
  2010-09-23  3:17 ` [PATCH 7/8] scripts/get_maintainer.pl: Use mailmap in name deduplication and other updates Joe Perches
  2010-09-23  3:17 ` [PATCH 8/8] scripts/get_maintainer.pl: Don't deduplicate unnamed addresses ie: mailing lists Joe Perches
  7 siblings, 0 replies; 11+ messages in thread
From: Joe Perches @ 2010-09-23  3:17 UTC (permalink / raw)
  To: linux-kernel

Also convert "You" to "you" in a help message.

Signed-off-by: Joe Perches <joe@perches.com>
---
 scripts/get_maintainer.pl |  156 ++++++++++++++++++++++----------------------
 1 files changed, 78 insertions(+), 78 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index faeace4..0abfdbc 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -306,7 +306,7 @@ sub read_mailmap {
     my $mailmap = {
 	names => {},
 	addresses => {}
-   };
+    };
 
     if (!$email_remove_duplicates) {
 	return $mailmap;
@@ -327,39 +327,39 @@ sub read_mailmap {
 	# name1 <mail1> name2 <mail2>
 	# (see man git-shortlog)
 	if (/^(.+)<(.+)>$/) {
-		my $real_name = $1;
-		my $address = $2;
+	    my $real_name = $1;
+	    my $address = $2;
 
-		$real_name =~ s/\s+$//;
-		$mailmap->{names}->{$address} = $real_name;
+	    $real_name =~ s/\s+$//;
+	    $mailmap->{names}->{$address} = $real_name;
 
 	} elsif (/^<([^\s]+)>\s*<([^\s]+)>$/) {
-		my $real_address = $1;
-		my $wrong_address = $2;
+	    my $real_address = $1;
+	    my $wrong_address = $2;
 
-		$mailmap->{addresses}->{$wrong_address} = $real_address;
+	    $mailmap->{addresses}->{$wrong_address} = $real_address;
 
 	} elsif (/^(.+)<([^\s]+)>\s*<([^\s]+)>$/) {
-		my $real_name= $1;
-		my $real_address = $2;
-		my $wrong_address = $3;
+	    my $real_name= $1;
+	    my $real_address = $2;
+	    my $wrong_address = $3;
 
-		$real_name =~ s/\s+$//;
+	    $real_name =~ s/\s+$//;
 
-		$mailmap->{names}->{$wrong_address} = $real_name;
-		$mailmap->{addresses}->{$wrong_address} = $real_address;
+	    $mailmap->{names}->{$wrong_address} = $real_name;
+	    $mailmap->{addresses}->{$wrong_address} = $real_address;
 
 	} elsif (/^(.+)<([^\s]+)>\s*([^\s].*)<([^\s]+)>$/) {
-		my $real_name = $1;
-		my $real_address = $2;
-		my $wrong_name = $3;
-		my $wrong_address = $4;
+	    my $real_name = $1;
+	    my $real_address = $2;
+	    my $wrong_name = $3;
+	    my $wrong_address = $4;
 
-		$real_name =~ s/\s+$//;
-		$wrong_name =~ s/\s+$//;
+	    $real_name =~ s/\s+$//;
+	    $wrong_name =~ s/\s+$//;
 
-		$mailmap->{names}->{format_email($wrong_name,$wrong_address,1)} = $real_name;
-		$mailmap->{addresses}->{format_email($wrong_name,$wrong_address,1)} = $real_address;
+	    $mailmap->{names}->{format_email($wrong_name,$wrong_address,1)} = $real_name;
+	    $mailmap->{addresses}->{format_email($wrong_name,$wrong_address,1)} = $real_address;
 	}
     }
     close($mailmap_file);
@@ -743,30 +743,30 @@ EOT
 }
 
 sub top_of_kernel_tree {
-	my ($lk_path) = @_;
+    my ($lk_path) = @_;
 
-	if ($lk_path ne "" && substr($lk_path,length($lk_path)-1,1) ne "/") {
-	    $lk_path .= "/";
-	}
-	if (   (-f "${lk_path}COPYING")
-	    && (-f "${lk_path}CREDITS")
-	    && (-f "${lk_path}Kbuild")
-	    && (-f "${lk_path}MAINTAINERS")
-	    && (-f "${lk_path}Makefile")
-	    && (-f "${lk_path}README")
-	    && (-d "${lk_path}Documentation")
-	    && (-d "${lk_path}arch")
-	    && (-d "${lk_path}include")
-	    && (-d "${lk_path}drivers")
-	    && (-d "${lk_path}fs")
-	    && (-d "${lk_path}init")
-	    && (-d "${lk_path}ipc")
-	    && (-d "${lk_path}kernel")
-	    && (-d "${lk_path}lib")
-	    && (-d "${lk_path}scripts")) {
-		return 1;
-	}
-	return 0;
+    if ($lk_path ne "" && substr($lk_path,length($lk_path)-1,1) ne "/") {
+	$lk_path .= "/";
+    }
+    if (   (-f "${lk_path}COPYING")
+	&& (-f "${lk_path}CREDITS")
+	&& (-f "${lk_path}Kbuild")
+	&& (-f "${lk_path}MAINTAINERS")
+	&& (-f "${lk_path}Makefile")
+	&& (-f "${lk_path}README")
+	&& (-d "${lk_path}Documentation")
+	&& (-d "${lk_path}arch")
+	&& (-d "${lk_path}include")
+	&& (-d "${lk_path}drivers")
+	&& (-d "${lk_path}fs")
+	&& (-d "${lk_path}init")
+	&& (-d "${lk_path}ipc")
+	&& (-d "${lk_path}kernel")
+	&& (-d "${lk_path}lib")
+	&& (-d "${lk_path}scripts")) {
+	return 1;
+    }
+    return 0;
 }
 
 sub parse_email {
@@ -1107,29 +1107,30 @@ sub which_conf {
 }
 
 sub mailmap_email {
-	my $line = shift;
+    my $line = shift;
 
-	my ($name, $address) = parse_email($line);
-	my $email = format_email($name, $address, 1);
-	my $real_name = $name;
-	my $real_address = $address;
-
-	if (exists $mailmap->{names}->{$email} || exists $mailmap->{addresses}->{$email}) {
-		if (exists $mailmap->{names}->{$email}) {
-			$real_name = $mailmap->{names}->{$email};
-		}
-		if (exists $mailmap->{addresses}->{$email}) {
-			$real_address = $mailmap->{addresses}->{$email};
-		}
-	} else {
-		if (exists $mailmap->{names}->{$address}) {
-			$real_name = $mailmap->{names}->{$address};
-		}
-		if (exists $mailmap->{addresses}->{$address}) {
-			$real_address = $mailmap->{addresses}->{$address};
-		}
+    my ($name, $address) = parse_email($line);
+    my $email = format_email($name, $address, 1);
+    my $real_name = $name;
+    my $real_address = $address;
+
+    if (exists $mailmap->{names}->{$email} ||
+	exists $mailmap->{addresses}->{$email}) {
+	if (exists $mailmap->{names}->{$email}) {
+	    $real_name = $mailmap->{names}->{$email};
+	}
+	if (exists $mailmap->{addresses}->{$email}) {
+	    $real_address = $mailmap->{addresses}->{$email};
+	}
+    } else {
+	if (exists $mailmap->{names}->{$address}) {
+	    $real_name = $mailmap->{names}->{$address};
+	}
+	if (exists $mailmap->{addresses}->{$address}) {
+	    $real_address = $mailmap->{addresses}->{$address};
 	}
-	return format_email($real_name, $real_address, 1);
+    }
+    return format_email($real_name, $real_address, 1);
 }
 
 sub mailmap {
@@ -1146,18 +1147,17 @@ sub mailmap {
 }
 
 sub merge_by_realname {
-	my %address_map;
-	my (@emails) = @_;
-	foreach my $email (@emails) {
-		my ($name, $address) = parse_email($email);
-		if (!exists $address_map{$name}) {
-			$address_map{$name} = $address;
-		} else {
-			$address = $address_map{$name};
-			$email = format_email($name,$address,1);
-		}
+    my %address_map;
+    my (@emails) = @_;
+    foreach my $email (@emails) {
+	my ($name, $address) = parse_email($email);
+	if (!exists $address_map{$name}) {
+	    $address_map{$name} = $address;
+	} else {
+	    $address = $address_map{$name};
+	    $email = format_email($name,$address,1);
 	}
-
+    }
 }
 
 sub git_execute_cmd {
@@ -1555,7 +1555,7 @@ commit signers and mailing lists that could be CC'd on a patch.
 
 Any *'d entry is selected.
 
-If you have git or hg installed, You can choose to summarize the commit
+If you have git or hg installed, you can choose to summarize the commit
 history of files in the patch.  Also, each line of the current file can
 be matched to its commit author and that commits signers with blame.
 
-- 
1.7.3



* [PATCH 7/8] scripts/get_maintainer.pl: Use mailmap in name deduplication and other updates
  2010-09-23  3:17 [PATCH 0/8] scripts/get_maintainer.pl: Add --interactive Joe Perches
                   ` (5 preceding siblings ...)
  2010-09-23  3:17 ` [PATCH 6/8] scripts/get_maintainer.pl: Correct indentation in a few places Joe Perches
@ 2010-09-23  3:17 ` Joe Perches
  2010-09-23  3:17 ` [PATCH 8/8] scripts/get_maintainer.pl: Don't deduplicate unnamed addresses ie: mailing lists Joe Perches
  7 siblings, 0 replies; 11+ messages in thread
From: Joe Perches @ 2010-09-23  3:17 UTC (permalink / raw)
  To: linux-kernel

Use Florian Mickler's mailmap routine to reduce name duplication.

o Add subroutine deduplicate_email to centralize code
o Add hashes for deduplicate_(name|address)_hash
o Remove now unused @interactive_to
o Whitespace neatening
o Add command line --help text
o Add --mailmap command line option control
o Interactive changes:
   - Add toggles for maintainer, git and list selections
   - Default selection is all
   - Add mailmap control

Update to 0.26-beta5

Signed-off-by: Joe Perches <joe@perches.com>
---
 scripts/get_maintainer.pl |  232 +++++++++++++++++++++++++++++----------------
 1 files changed, 148 insertions(+), 84 deletions(-)
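
As a rough illustration of the deduplication idea (first sighting wins,
matched case-insensitively on either name or address), here is a standalone
sketch. It is an assumption-laden reduction, not the deduplicate_email in
this patch: split_email and dedup are hypothetical names, the mailmap step
is left out, the second address is invented, and entries without a name are
matched by address only.

```perl
use strict;
use warnings;

# lc(key) => [ name, address ] as first seen
my (%seen_name, %seen_address);

# Hypothetical stand-in for the script's parse_email: split "Name <addr>".
sub split_email {
    my ($email) = @_;
    return ($1, $2) if ($email =~ /^\s*"?([^"<]*?)"?\s*<([^>]+)>\s*$/);
    return ("", $email);    # bare address, e.g. a mailing list
}

sub dedup {
    my ($email) = @_;
    my ($name, $address) = split_email($email);

    # Look for an earlier sighting by name first, then by address.
    my $hit;
    $hit = $seen_name{lc($name)} if ($name ne "");
    $hit //= $seen_address{lc($address)};

    if ($hit) {
        ($name, $address) = @$hit;
    } else {
        $seen_name{lc($name)} = [ $name, $address ] if ($name ne "");
        $seen_address{lc($address)} = [ $name, $address ];
    }
    return $name eq "" ? $address : "$name <$address>";
}

print dedup($_), "\n" for (
    'Stephen Hemminger <shemminger@linux-foundation.org>',
    'stephen hemminger <stephen@other.example>',    # invented second address
    'netdev@vger.kernel.org',
);
```

Keeping one table per key type lets a later entry match an earlier one even
when only the name or only the address agrees.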

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index 0abfdbc..e822518 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -13,7 +13,7 @@
 use strict;
 
 my $P = $0;
-my $V = '0.26-beta4';
+my $V = '0.26-beta5';
 
 use Getopt::Long qw(:config no_auto_abbrev);
 
@@ -36,6 +36,7 @@ my $email_git_since = "1-year-ago";
 my $email_hg_since = "-365";
 my $interactive = 0;
 my $email_remove_duplicates = 1;
+my $email_use_mailmap = 1;
 my $output_multiline = 1;
 my $output_separator = ", ";
 my $output_roles = 0;
@@ -192,6 +193,7 @@ if (!GetOptions(
 		'hg-since=s' => \$email_hg_since,
 		'i|interactive!' => \$interactive,
 		'remove-duplicates!' => \$email_remove_duplicates,
+		'mailmap!' => \$email_use_mailmap,
 		'm!' => \$email_maintainer,
 		'n!' => \$email_usename,
 		'l!' => \$email_list,
@@ -300,17 +302,17 @@ close($maint);
 # Read mail address map
 #
 
-my $mailmap = read_mailmap();
+my $mailmap;
+
+read_mailmap();
 
 sub read_mailmap {
-    my $mailmap = {
+    $mailmap = {
 	names => {},
 	addresses => {}
     };
 
-    if (!$email_remove_duplicates) {
-	return $mailmap;
-    }
+    return if (!$email_use_mailmap || !(-f "${lk_path}.mailmap"));
 
     open(my $mailmap_file, '<', "${lk_path}.mailmap")
 	or warn "$P: Can't open .mailmap: $!\n";
@@ -331,6 +333,7 @@ sub read_mailmap {
 	    my $address = $2;
 
 	    $real_name =~ s/\s+$//;
+	    ($real_name, $address) = parse_email("$real_name <$address>");
 	    $mailmap->{names}->{$address} = $real_name;
 
 	} elsif (/^<([^\s]+)>\s*<([^\s]+)>$/) {
@@ -340,12 +343,13 @@ sub read_mailmap {
 	    $mailmap->{addresses}->{$wrong_address} = $real_address;
 
 	} elsif (/^(.+)<([^\s]+)>\s*<([^\s]+)>$/) {
-	    my $real_name= $1;
+	    my $real_name = $1;
 	    my $real_address = $2;
 	    my $wrong_address = $3;
 
 	    $real_name =~ s/\s+$//;
-
+	    ($real_name, $real_address) =
+		parse_email("$real_name <$real_address>");
 	    $mailmap->{names}->{$wrong_address} = $real_name;
 	    $mailmap->{addresses}->{$wrong_address} = $real_address;
 
@@ -356,15 +360,19 @@ sub read_mailmap {
 	    my $wrong_address = $4;
 
 	    $real_name =~ s/\s+$//;
+	    ($real_name, $real_address) =
+		parse_email("$real_name <$real_address>");
+
 	    $wrong_name =~ s/\s+$//;
+	    ($wrong_name, $wrong_address) =
+		parse_email("$wrong_name <$wrong_address>");
 
-	    $mailmap->{names}->{format_email($wrong_name,$wrong_address,1)} = $real_name;
-	    $mailmap->{addresses}->{format_email($wrong_name,$wrong_address,1)} = $real_address;
+	    my $wrong_email = format_email($wrong_name, $wrong_address, 1);
+	    $mailmap->{names}->{$wrong_email} = $real_name;
+	    $mailmap->{addresses}->{$wrong_email} = $real_address;
 	}
     }
     close($mailmap_file);
-
-    return $mailmap;
 }
 
 ## use the filenames on the command line or find the filenames in the patchfiles
@@ -453,7 +461,8 @@ my @scm = ();
 my @web = ();
 my @subsystem = ();
 my @status = ();
-my @interactive_to = ();
+my %deduplicate_name_hash = ();
+my %deduplicate_address_hash = ();
 my $signature_pattern;
 
 my @maintainers = get_maintainers();
@@ -497,7 +506,8 @@ sub get_maintainers {
     @web = ();
     @subsystem = ();
     @status = ();
-    @interactive_to = ();
+    %deduplicate_name_hash = ();
+    %deduplicate_address_hash = ();
     if ($email_git_all_signature_types) {
 	$signature_pattern = "(.+?)[Bb][Yy]:";
     } else {
@@ -506,7 +516,7 @@ sub get_maintainers {
 
     # Find responsible parties
 
-    my %exact_pattern_match_hash;
+    my %exact_pattern_match_hash = ();
 
     foreach my $file (@files) {
 
@@ -590,7 +600,9 @@ sub get_maintainers {
 	}
     }
 
-    @interactive_to = (@email_to, @list_to);
+    foreach my $email (@email_to, @list_to) {
+	$email->[0] = deduplicate_email($email->[0]);
+    }
 
     foreach my $file (@files) {
 	if ($email &&
@@ -637,8 +649,7 @@ sub get_maintainers {
     }
 
     if ($interactive) {
-	@interactive_to = @to;
-	@to = interactive_get_maintainers(\@interactive_to);
+	@to = interactive_get_maintainers(\@to);
     }
 
     return @to;
@@ -702,8 +713,9 @@ Output type options:
 
 Other options:
   --pattern-depth => Number of pattern directory traversals (default: 0 (all))
-  --keywords => scan patch for keywords (default: 1 (on))
-  --sections => print the entire subsystem sections with pattern matches
+  --keywords => scan patch for keywords (default: $keywords)
+  --sections => print all of the subsystem sections with pattern matches
+  --mailmap => use .mailmap file (default: $email_use_mailmap)
   --version => show version
   --help => show this help information
 
@@ -1107,7 +1119,7 @@ sub which_conf {
 }
 
 sub mailmap_email {
-    my $line = shift;
+    my ($line) = @_;
 
     my ($name, $address) = parse_email($line);
     my $email = format_email($name, $address, 1);
@@ -1136,26 +1148,25 @@ sub mailmap_email {
 sub mailmap {
     my (@addresses) = @_;
 
-    my @ret = ();
+    my @mapped_emails = ();
     foreach my $line (@addresses) {
-	push(@ret, mailmap_email($line), 1);
+	push(@mapped_emails, mailmap_email($line));
     }
-
-    merge_by_realname(@ret) if $email_remove_duplicates;
-
-    return @ret;
+    merge_by_realname(@mapped_emails) if ($email_use_mailmap);
+    return @mapped_emails;
 }
 
 sub merge_by_realname {
     my %address_map;
     my (@emails) = @_;
+
     foreach my $email (@emails) {
 	my ($name, $address) = parse_email($email);
-	if (!exists $address_map{$name}) {
-	    $address_map{$name} = $address;
-	} else {
+	if (exists $address_map{$name}) {
 	    $address = $address_map{$name};
-	    $email = format_email($name,$address,1);
+	    $email = format_email($name, $address, 1);
+	} else {
+	    $address_map{$name} = $address;
 	}
     }
 }
@@ -1194,8 +1205,7 @@ sub extract_formatted_signatures {
 ## Reformat email addresses (with names) to avoid badly written signatures
 
     foreach my $signer (@signature_lines) {
-	my ($name, $address) = parse_email($signer);
-	$signer = format_email($name, $address, 1);
+	$signer = deduplicate_email($signer);
     }
 
     return (\@type, \@signature_lines);
@@ -1339,6 +1349,7 @@ sub vcs_exists {
 }
 
 sub vcs_is_git {
+    vcs_exists();
     return $vcs_used == 1;
 }
 
@@ -1357,11 +1368,9 @@ sub interactive_get_maintainers {
     my %signed;
     my $count = 0;
     my $maintained = 0;
-    #select maintainers by default
     foreach my $entry (@list) {
-	my $role = $entry->[1];
-	$selected{$count} = ($role =~ /^(maintainer|supporter|open list)/i);
-	$maintained = 1 if ($role =~ /^(maintainer|supporter)/i);
+	$maintained = 1 if ($entry->[1] =~ /^(maintainer|supporter)/i);
+	$selected{$count} = 1;
 	$authored{$count} = 0;
 	$signed{$count} = 0;
 	$count++;
@@ -1418,24 +1427,34 @@ sub interactive_get_maintainers {
 	if ($print_options) {
 	    $print_options = 0;
 	    if (vcs_exists()) {
-		print STDERR
-"\nVersion Control options:\n" .
-"g  use git history      [$email_git]\n" .
-"gf use git-fallback     [$email_git_fallback]\n" .
-"b  use git blame        [$email_git_blame]\n" .
-"bs use blame signatures [$email_git_blame_signatures]\n" .
-"c# minimum commits      [$email_git_min_signatures]\n" .
-"%# min percent          [$email_git_min_percent]\n" .
-"d# history to use       [$$date_ref]\n" .
-"x# max maintainers      [$email_git_max_maintainers]\n" .
-"t  all signature types  [$email_git_all_signature_types]\n";
+		print STDERR <<EOT
+
+Version Control options:
+g  use git history      [$email_git]
+gf use git-fallback     [$email_git_fallback]
+b  use git blame        [$email_git_blame]
+bs use blame signatures [$email_git_blame_signatures]
+c# minimum commits      [$email_git_min_signatures]
+%# min percent          [$email_git_min_percent]
+d# history to use       [$$date_ref]
+x# max maintainers      [$email_git_max_maintainers]
+t  all signature types  [$email_git_all_signature_types]
+m  use .mailmap         [$email_use_mailmap]
+EOT
 	    }
-	    print STDERR "\nAdditional options:\n" .
-"0  toggle all\n" .
-"f  emails in file       [$file_emails]\n" .
-"k  keywords in file     [$keywords]\n" .
-"r  remove duplicates    [$email_remove_duplicates]\n" .
-"p# pattern match depth  [$pattern_depth]\n";
+	    print STDERR <<EOT
+
+Additional options:
+0  toggle all
+tm toggle maintainers
+tg toggle git entries
+tl toggle open list entries
+ts toggle subscriber list entries
+f  emails in file       [$file_emails]
+k  keywords in file     [$keywords]
+r  remove duplicates    [$email_remove_duplicates]
+p# pattern match depth  [$pattern_depth]
+EOT
 	}
 	print STDERR
 "\n#(toggle), A#(author), S#(signed) *(all), ^(none), O(options), Y(approve): ";
@@ -1471,6 +1490,28 @@ sub interactive_get_maintainers {
 		for (my $i = 0; $i < $count; $i++) {
 		    $selected{$i} = !$selected{$i};
 		}
+	    } elsif ($sel eq "t") {
+		if (lc($str) eq "m") {
+		    for (my $i = 0; $i < $count; $i++) {
+			$selected{$i} = !$selected{$i}
+			    if ($list[$i]->[1] =~ /^(maintainer|supporter)/i);
+		    }
+		} elsif (lc($str) eq "g") {
+		    for (my $i = 0; $i < $count; $i++) {
+			$selected{$i} = !$selected{$i}
+			    if ($list[$i]->[1] =~ /^(author|commit|signer)/i);
+		    }
+		} elsif (lc($str) eq "l") {
+		    for (my $i = 0; $i < $count; $i++) {
+			$selected{$i} = !$selected{$i}
+			    if ($list[$i]->[1] =~ /^(open list)/i);
+		    }
+		} elsif (lc($str) eq "s") {
+		    for (my $i = 0; $i < $count; $i++) {
+			$selected{$i} = !$selected{$i}
+			    if ($list[$i]->[1] =~ /^(subscriber list)/i);
+		    }
+		}
 	    } elsif ($sel eq "a") {
 		if ($val > 0 && $val <= $count) {
 		    $authored{$val - 1} = !$authored{$val - 1};
@@ -1539,6 +1580,10 @@ sub interactive_get_maintainers {
 	    } elsif ($sel eq "r") {
 		bool_invert(\$email_remove_duplicates);
 		$rerun = 1;
+	    } elsif ($sel eq "m") {
+		bool_invert(\$email_use_mailmap);
+		read_mailmap();
+		$rerun = 1;
 	    } elsif ($sel eq "k") {
 		bool_invert(\$keywords);
 		$rerun = 1;
@@ -1602,6 +1647,36 @@ sub bool_invert {
     }
 }
 
+sub deduplicate_email {
+    my ($email) = @_;
+
+    my $matched = 0;
+    my ($name, $address) = parse_email($email);
+    $email = format_email($name, $address, 1);
+    $email = mailmap_email($email);
+
+    return $email if (!$email_remove_duplicates);
+
+    ($name, $address) = parse_email($email);
+
+    if ($deduplicate_name_hash{lc($name)}) {
+	$name = $deduplicate_name_hash{lc($name)}->[0];
+	$address = $deduplicate_name_hash{lc($name)}->[1];
+	$matched = 1;
+    } elsif ($deduplicate_address_hash{lc($address)}) {
+	$name = $deduplicate_address_hash{lc($address)}->[0];
+	$address = $deduplicate_address_hash{lc($address)}->[1];
+	$matched = 1;
+    }
+    if (!$matched) {
+	$deduplicate_name_hash{lc($name)} = [ $name, $address ];
+	$deduplicate_address_hash{lc($address)} = [ $name, $address ];
+    }
+    $email = format_email($name, $address, 1);
+    $email = mailmap_email($email);
+    return $email;
+}
+
 sub save_commits_by_author {
     my (@lines) = @_;
 
@@ -1611,20 +1686,8 @@ sub save_commits_by_author {
 
     foreach my $line (@lines) {
 	if ($line =~ m/$VCS_cmds{"author_pattern"}/) {
-	    my $matched = 0;
 	    my $author = $1;
-	    my ($name, $address) = parse_email($author);
-	    foreach my $to (@interactive_to) {
-		my ($to_name, $to_address) = parse_email($to->[0]);
-		if ($email_remove_duplicates &&
-		    ((lc($name) eq lc($to_name)) ||
-		     (lc($address) eq lc($to_address)))) {
-		    $author = $to->[0];
-		    $matched = 1;
-		    last;
-		}
-	    }
-	    $author = format_email($name, $address, 1) if (!$matched);
+	    $author = deduplicate_email($author);
 	    push(@authors, $author);
 	}
 	push(@commits, $1) if ($line =~ m/$VCS_cmds{"commit_pattern"}/);
@@ -1665,19 +1728,7 @@ sub save_commits_by_signer {
 	    my $type = $types[0];
 	    my $signer = $signers[0];
 
-	    my $matched = 0;
-	    my ($name, $address) = parse_email($signer);
-	    foreach my $to (@interactive_to) {
-		my ($to_name, $to_address) = parse_email($to->[0]);
-		if ($email_remove_duplicates &&
-		    ((lc($name) eq lc($to_name)) ||
-		     (lc($address) eq lc($to_address)))) {
-		    $signer = $to->[0];
-		    $matched = 1;
-		    last;
-		}
-		$signer = format_email($name, $address, 1) if (!$matched);
-	    }
+	    $signer = deduplicate_email($signer);
 
 	    my $exists = 0;
 	    foreach my $ref(@{$commit_signer_hash{$signer}}) {
@@ -1751,6 +1802,11 @@ sub vcs_file_signoffs {
     $cmd =~ s/(\$\w+)/$1/eeg;		# interpolate $cmd
 
     ($commits, @signers) = vcs_find_signers($cmd);
+
+    foreach my $signer (@signers) {
+	$signer = deduplicate_email($signer);
+    }
+
     vcs_assign("commit_signer", $commits, @signers);
 }
 
@@ -1828,9 +1884,8 @@ sub vcs_file_blame {
 		foreach my $line (@lines) {
 		    if ($line =~ m/$VCS_cmds{"author_pattern"}/) {
 			my $author = $1;
-			my ($name, $address) = parse_email($author);
-			$author = format_email($name, $address, 1);
-			push(@authors, $1);
+			$author = deduplicate_email($author);
+			push(@authors, $author);
 		    }
 		}
 
@@ -1846,9 +1901,12 @@ sub vcs_file_blame {
 		    $cmd =~ s/(\$\w+)/$1/eeg;	#interpolate $cmd
 		    my @author = vcs_find_author($cmd);
 		    next if !@author;
+
+		    my $formatted_author = deduplicate_email($author[0]);
+
 		    my $count = grep(/$commit/, @all_commits);
 		    for ($i = 0; $i < $count ; $i++) {
-			push(@blame_signers, $author[0]);
+			push(@blame_signers, $formatted_author);
 		    }
 		}
 	    }
@@ -1856,8 +1914,14 @@ sub vcs_file_blame {
 		vcs_assign("authored lines", $total_lines, @blame_signers);
 	    }
 	}
+	foreach my $signer (@signers) {
+	    $signer = deduplicate_email($signer);
+	}
 	vcs_assign("commits", $total_commits, @signers);
     } else {
+	foreach my $signer (@signers) {
+	    $signer = deduplicate_email($signer);
+	}
 	vcs_assign("modified commits", $total_commits, @signers);
     }
 }
-- 
1.7.3



* [PATCH 8/8] scripts/get_maintainer.pl: Don't deduplicate unnamed addresses ie: mailing lists
  2010-09-23  3:17 [PATCH 0/8] scripts/get_maintainer.pl: Add --interactive Joe Perches
                   ` (6 preceding siblings ...)
  2010-09-23  3:17 ` [PATCH 7/8] scripts/get_maintainer.pl: Use mailmap in name deduplication and other updates Joe Perches
@ 2010-09-23  3:17 ` Joe Perches
  7 siblings, 0 replies; 11+ messages in thread
From: Joe Perches @ 2010-09-23  3:17 UTC (permalink / raw)
  To: linux-kernel

Fix a defect with the first mailing list address being used for
each subsequent mailing list.

Updated to 0.26-beta6.

Signed-off-by: Joe Perches <joe@perches.com>
---
 scripts/get_maintainer.pl |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)
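
The defect can be reproduced in isolation. The sketch below is a
hypothetical reduction (first_address_for and its $skip_unnamed flag are
not in the script): when the lookup is keyed on the lowercased name, every
unnamed address shares the empty-string key, so the first list's address is
returned for all later lists unless empty names are skipped.

```perl
use strict;
use warnings;

# Hypothetical reduction of the name lookup: remember the first address seen
# for each lowercased name and reuse it on later sightings.
sub first_address_for {
    my ($seen, $name, $address, $skip_unnamed) = @_;
    return $address if ($skip_unnamed && $name eq "");
    $seen->{lc($name)} //= $address;
    return $seen->{lc($name)};
}

my @lists = ('netdev@vger.kernel.org', 'linux-kernel@vger.kernel.org');

# Without the guard both lists share the "" key, so the second list
# comes back as the first one.
my %unguarded;
my @before = map { first_address_for(\%unguarded, "", $_, 0) } @lists;

# With the guard each unnamed address is returned unchanged.
my %guarded;
my @after = map { first_address_for(\%guarded, "", $_, 1) } @lists;

print "unguarded: @before\n";
print "guarded:   @after\n";
```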

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index e822518..d21ec3a 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -13,7 +13,7 @@
 use strict;
 
 my $P = $0;
-my $V = '0.26-beta5';
+my $V = '0.26-beta6';
 
 use Getopt::Long qw(:config no_auto_abbrev);
 
@@ -1036,7 +1036,7 @@ sub push_email_address {
 	push(@email_to, [format_email($name, $address, $email_usename), $role]);
     } elsif (!email_inuse($name, $address)) {
 	push(@email_to, [format_email($name, $address, $email_usename), $role]);
-	$email_hash_name{lc($name)}++;
+	$email_hash_name{lc($name)}++ if ($name ne "");
 	$email_hash_address{lc($address)}++;
     }
 
@@ -1659,7 +1659,7 @@ sub deduplicate_email {
 
     ($name, $address) = parse_email($email);
 
-    if ($deduplicate_name_hash{lc($name)}) {
+    if ($name ne "" && $deduplicate_name_hash{lc($name)}) {
 	$name = $deduplicate_name_hash{lc($name)}->[0];
 	$address = $deduplicate_name_hash{lc($name)}->[1];
 	$matched = 1;
-- 
1.7.3



* Re: [PATCH 1/8] scripts/get_maintainer.pl: add interactive mode
  2010-09-23  3:17 ` [PATCH 1/8] scripts/get_maintainer.pl: add interactive mode Joe Perches
@ 2010-09-23  7:11   ` Florian Mickler
  0 siblings, 0 replies; 11+ messages in thread
From: Florian Mickler @ 2010-09-23  7:11 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Joe Perches, Stephen Hemminger, linux-kernel

Hi Andrew!

I don't know if my other mail came through to you. Joe thinks his emails
were probably discarded by linux-kernel@vger.kernel.org because of a
missing ">" in your address.

Anyway: 

On Wed, 22 Sep 2010 19:50:09 -0700
Joe Perches <joe@perches.com> wrote:

> From: florian@mickler.org <florian@mickler.org>

From: Florian Mickler <florian@mickler.org> 

My git-send-email somehow dropped the name.


> 
> This is a first version of an interactive mode for
> scripts/get_maintainer.pl .
> 
> It allows the user to interact with the script. Each cc candidate can be
> selected and deselected and a shortlog of authored commits can be
> displayed for each candidate.
> 
> The menu is displayed via STDERR, the end result is outputted to STDOUT.
> This unusual mechanism allows using get_maintainer.pl in interactive mode via
> git send-email --cc-cmd.
> 
> Signed-off-by: Joe Perches <joe@perches.com>


Also, you can add my Signed-off-by to this if it is needed.


Thx,
Flo
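
The STDERR/STDOUT split described in the quoted changelog can be
sketched as follows. This is only an illustration under stated
assumptions, not the patch itself: approve_list, its arguments and the
reduced two-command menu are hypothetical. The input handle is passed
in so the demo can run from a scripted string; in real use it would be
\*STDIN.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: show a menu on $prompt_fh, read commands from $in,
# and return the addresses that are still selected on approval.
sub approve_list {
    my ($in, $prompt_fh, @candidates) = @_;
    my %selected = map { $_ => 1 } @candidates;    # everything selected at first

    while (1) {
        my $i = 1;
        for my $addr (@candidates) {
            printf {$prompt_fh} "%s %2d %s\n",
                ($selected{$addr} ? "*" : " "), $i++, $addr;
        }
        print {$prompt_fh} "#(toggle), Y(approve): ";

        my $answer = <$in>;
        last if (!defined($answer));    # EOF: keep the current selection
        chomp($answer);
        last if ($answer =~ /^y$/i);

        if ($answer =~ /^(\d+)$/ && $1 >= 1 && $1 <= @candidates) {
            my $addr = $candidates[$1 - 1];
            $selected{$addr} = !$selected{$addr};
        }
    }
    return grep { $selected{$_} } @candidates;
}

# Scripted input for the demo: deselect entry 2, then approve.
my $scripted = "2\ny\n";
open(my $demo_in, '<', \$scripted) or die "cannot open in-memory input: $!";

my @approved = approve_list($demo_in, \*STDERR,
    'netdev@vger.kernel.org', 'linux-kernel@vger.kernel.org');

# Only the approved list goes to STDOUT, one address per line.
print "$_\n" for @approved;
```

Because only the final list is written to STDOUT, a caller that
captures STDOUT, such as git send-email --cc-cmd, never sees the menu
text.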


* Re: [PATCH 5/8] scripts/get_maintainer.pl: fix mailmap handling
  2010-09-23  3:17 ` [PATCH 5/8] scripts/get_maintainer.pl: fix mailmap handling Joe Perches
@ 2010-09-23  7:12   ` Florian Mickler
  0 siblings, 0 replies; 11+ messages in thread
From: Florian Mickler @ 2010-09-23  7:12 UTC (permalink / raw)
  To: Joe Perches; +Cc: Andrew Morton, Stephen Hemminger, linux-kernel

On Wed, 22 Sep 2010 19:50:13 -0700
Joe Perches <joe@perches.com> wrote:

> From: florian@mickler.org <florian@mickler.org>

Here again, my git-send-email borked it up: 

From: Florian Mickler <florian@mickler.org>

> 
> Implement it, like it is described in git-shortlog.
> 
> Signed-off-by: Florian Mickler <florian@mickler.org>
> Signed-off-by: Joe Perches <joe@perches.com>
> ---
>  scripts/get_maintainer.pl |  147 +++++++++++++++++++++++++++++++++------------
>  1 files changed, 109 insertions(+), 38 deletions(-)

Thx,
Flo
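
For readers unfamiliar with the mailmap format that git-shortlog
documents, here is a rough sketch of reading it. This is not the code
from the patch: parse_mailmap_line and canonicalize are hypothetical
helpers, the names and addresses are made up, and the fourth documented
form (Proper Name <proper@email> Commit Name <commit@email>) is
deliberately left out to keep the example short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, simplified mailmap reader. $map is keyed by the
# lowercased commit address and holds the proper name and/or address.
sub parse_mailmap_line {
    my ($line, $map) = @_;
    $line =~ s/#.*$//;          # drop comments
    $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
    return if ($line eq "");

    if ($line =~ /^<([^>]+)>\s*<([^>]+)>$/) {
        # <proper@email> <commit@email>
        $map->{lc($2)}{address} = $1;
    } elsif ($line =~ /^([^<]+?)\s*<([^>]+)>\s*<([^>]+)>$/) {
        # Proper Name <proper@email> <commit@email>
        $map->{lc($3)} = { name => $1, address => $2 };
    } elsif ($line =~ /^([^<]+?)\s*<([^>]+)>$/) {
        # Proper Name <commit@email>
        $map->{lc($2)}{name} = $1;
    }
    # Any other form is ignored by this sketch.
    return;
}

# Return the (name, address) pair to use for a commit author.
sub canonicalize {
    my ($map, $name, $address) = @_;
    my $entry = $map->{lc($address)} or return ($name, $address);
    return ($entry->{name} // $name, $entry->{address} // $address);
}

my %map;
parse_mailmap_line($_, \%map) for (
    '# example entries, all invented',
    'Jane Doe <jane@example.org> <jd@old.example.com>',
    'John Roe <jroe@example.org>',
);

my ($name, $address) = canonicalize(\%map, 'jd', 'JD@old.example.com');
print "$name <$address>\n";
```

Keying the table on the lowercased commit address keeps lookups
case-insensitive, which matches the case-insensitive name handling
introduced earlier in this series.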


end of thread, other threads:[~2010-09-23  7:12 UTC | newest]

Thread overview: 11+ messages
2010-09-23  3:17 [PATCH 0/8] scripts/get_maintainer.pl: Add --interactive Joe Perches
2010-09-23  3:17 ` [PATCH 1/8] scripts/get_maintainer.pl: add interactive mode Joe Perches
2010-09-23  7:11   ` Florian Mickler
2010-09-23  3:17 ` [PATCH 2/8] scripts/get_maintainer.pl: Improve --interactive UI Joe Perches
2010-09-23  3:17 ` [PATCH 3/8] scripts/get_maintainer.pl: Update --interactive UI, improve hg runtime Joe Perches
2010-09-23  3:17 ` [PATCH 4/8] scripts/get_maintainer.pl: Use case insensitive name de-duplication Joe Perches
2010-09-23  3:17 ` [PATCH 5/8] scripts/get_maintainer.pl: fix mailmap handling Joe Perches
2010-09-23  7:12   ` Florian Mickler
2010-09-23  3:17 ` [PATCH 6/8] scripts/get_maintainer.pl: Correct indentation in a few places Joe Perches
2010-09-23  3:17 ` [PATCH 7/8] scripts/get_maintainer.pl: Use mailmap in name deduplication and other updates Joe Perches
2010-09-23  3:17 ` [PATCH 8/8] scripts/get_maintainer.pl: Don't deduplicate unnamed addresses ie: mailing lists Joe Perches
