linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [patch] mm, vmpressure: add high level
@ 2013-10-17  0:43 David Rientjes
  2013-10-17  3:05 ` Anton Vorontsov
  2013-10-17  3:51 ` [bug] get_maintainer.pl incomplete output David Rientjes
  0 siblings, 2 replies; 14+ messages in thread
From: David Rientjes @ 2013-10-17  0:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Anton Vorontsov, Michal Hocko, Kirill A. Shutemov, linux-kernel,
	linux-mm

Vmpressure has two important levels: medium and critical.  Medium is 
defined at 60% and critical is defined at 95%.

We have a customer who needs a notification at a higher level than medium, 
which is slight to moderate reclaim activity, and before critical to start 
throttling incoming requests to save memory and avoid oom.

This patch adds the missing link: a high level defined at 80%.

In the future, it would probably be better to allow the user to specify an 
integer ratio for the notification rather than relying on arbitrarily 
specified levels.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/vmpressure.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/vmpressure.c b/mm/vmpressure.c
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -46,6 +46,7 @@ static const unsigned long vmpressure_win = SWAP_CLUSTER_MAX * 16;
  * unsuccessful reclaims there were.
  */
 static const unsigned int vmpressure_level_med = 60;
+static const unsigned int vmpressure_level_high = 80;
 static const unsigned int vmpressure_level_critical = 95;
 
 /*
@@ -88,6 +89,7 @@ static struct vmpressure *vmpressure_parent(struct vmpressure *vmpr)
 enum vmpressure_levels {
 	VMPRESSURE_LOW = 0,
 	VMPRESSURE_MEDIUM,
+	VMPRESSURE_HIGH,
 	VMPRESSURE_CRITICAL,
 	VMPRESSURE_NUM_LEVELS,
 };
@@ -95,6 +97,7 @@ enum vmpressure_levels {
 static const char * const vmpressure_str_levels[] = {
 	[VMPRESSURE_LOW] = "low",
 	[VMPRESSURE_MEDIUM] = "medium",
+	[VMPRESSURE_HIGH] = "high",
 	[VMPRESSURE_CRITICAL] = "critical",
 };
 
@@ -102,6 +105,8 @@ static enum vmpressure_levels vmpressure_level(unsigned long pressure)
 {
 	if (pressure >= vmpressure_level_critical)
 		return VMPRESSURE_CRITICAL;
+	else if (pressure >= vmpressure_level_high)
+		return VMPRESSURE_HIGH;
 	else if (pressure >= vmpressure_level_med)
 		return VMPRESSURE_MEDIUM;
 	return VMPRESSURE_LOW;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch] mm, vmpressure: add high level
  2013-10-17  0:43 [patch] mm, vmpressure: add high level David Rientjes
@ 2013-10-17  3:05 ` Anton Vorontsov
  2013-10-17  3:44   ` David Rientjes
  2013-10-17  3:51 ` [bug] get_maintainer.pl incomplete output David Rientjes
  1 sibling, 1 reply; 14+ messages in thread
From: Anton Vorontsov @ 2013-10-17  3:05 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Michal Hocko, Kirill A. Shutemov, linux-kernel, linux-mm

Hello David,

On Wed, Oct 16, 2013 at 05:43:55PM -0700, David Rientjes wrote:
> Vmpressure has two important levels: medium and critical.  Medium is 
> defined at 60% and critical is defined at 95%.
> 
> We have a customer who needs a notification at a higher level than medium, 
> which is slight to moderate reclaim activity, and before critical to start 
> throttling incoming requests to save memory and avoid oom.
> 
> This patch adds the missing link: a high level defined at 80%.
> 
> In the future, it would probably be better to allow the user to specify an 
> integer ratio for the notification rather than relying on arbitrarily 
> specified levels.

Does the customer need to differentiate the two levels (medium and high),
or the customer only interested in this (80%) specific level?

In the latter case, instead of adding a new level I would vote for adding
a [sysfs] knob for modifying medium level's threshold.

Thanks,

Anton

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch] mm, vmpressure: add high level
  2013-10-17  3:05 ` Anton Vorontsov
@ 2013-10-17  3:44   ` David Rientjes
  0 siblings, 0 replies; 14+ messages in thread
From: David Rientjes @ 2013-10-17  3:44 UTC (permalink / raw)
  To: Anton Vorontsov
  Cc: Andrew Morton, Michal Hocko, Kirill A. Shutemov, linux-kernel, linux-mm

On Wed, 16 Oct 2013, Anton Vorontsov wrote:

> > Vmpressure has two important levels: medium and critical.  Medium is 
> > defined at 60% and critical is defined at 95%.
> > 
> > We have a customer who needs a notification at a higher level than medium, 
> > which is slight to moderate reclaim activity, and before critical to start 
> > throttling incoming requests to save memory and avoid oom.
> > 
> > This patch adds the missing link: a high level defined at 80%.
> > 
> > In the future, it would probably be better to allow the user to specify an 
> > integer ratio for the notification rather than relying on arbitrarily 
> > specified levels.
> 
> Does the customer need to differentiate the two levels (medium and high),
> or the customer only interested in this (80%) specific level?
> 

Only high.

> In the latter case, instead of adding a new level I would vote for adding
> a [sysfs] knob for modifying medium level's threshold.
> 

Hmm, doesn't seem like such a good idea.  If one process depends on this 
being 60% and another depends on it being 80%, we're stuck.  I think it's 
legitimate to have things like low, medium, high, and critical as rough 
approximations (and to keep backwards compatibility), but as mentioned in 
the changelog I want to extend the interface to allow integer writes to 
specify their own ratio.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [bug] get_maintainer.pl incomplete output
  2013-10-17  0:43 [patch] mm, vmpressure: add high level David Rientjes
  2013-10-17  3:05 ` Anton Vorontsov
@ 2013-10-17  3:51 ` David Rientjes
  2013-10-17  4:03   ` Joe Perches
  2013-10-17 19:12   ` Andrew Morton
  1 sibling, 2 replies; 14+ messages in thread
From: David Rientjes @ 2013-10-17  3:51 UTC (permalink / raw)
  To: Joe Perches
  Cc: Andrew Morton, Anton Vorontsov, Michal Hocko, Kirill A. Shutemov,
	linux-kernel, linux-mm

Hi Joe,

I haven't looked closely at scripts/get_maintainer.pl, but I recently 
wrote a patch touching mm/vmpressure.c and it doesn't list the file's 
author, Anton Vorontsov <anton.vorontsov@linaro.org>.

Even when I do scripts/get_maintainer.pl -f mm/vmpressure.c, his entry is 
missing and git blame attributs >90% of the lines to his authorship.

$ ./scripts/get_maintainer.pl -f mm/vmpressure.c 
Tejun Heo <tj@kernel.org> (commit_signer:6/7=86%)
Michal Hocko <mhocko@suse.cz> (commit_signer:5/7=71%)
Andrew Morton <akpm@linux-foundation.org> (commit_signer:4/7=57%)
Li Zefan <lizefan@huawei.com> (commit_signer:3/7=43%)
"Kirill A. Shutemov" <kirill@shutemov.name> (commit_signer:1/7=14%)
linux-mm@kvack.org (open list:MEMORY MANAGEMENT)
linux-kernel@vger.kernel.org (open list)

Any ideas?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [bug] get_maintainer.pl incomplete output
  2013-10-17  3:51 ` [bug] get_maintainer.pl incomplete output David Rientjes
@ 2013-10-17  4:03   ` Joe Perches
  2013-10-17  4:19     ` David Rientjes
  2013-10-17 19:12   ` Andrew Morton
  1 sibling, 1 reply; 14+ messages in thread
From: Joe Perches @ 2013-10-17  4:03 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Anton Vorontsov, Michal Hocko, Kirill A. Shutemov,
	linux-kernel, linux-mm

On Wed, 2013-10-16 at 20:51 -0700, David Rientjes wrote:
> Hi Joe,

Hi David.

> I haven't looked closely at scripts/get_maintainer.pl, but I recently 
> wrote a patch touching mm/vmpressure.c and it doesn't list the file's 
> author, Anton Vorontsov <anton.vorontsov@linaro.org>.
> 
> Even when I do scripts/get_maintainer.pl -f mm/vmpressure.c, his entry is 
> missing and git blame attributs >90% of the lines to his authorship.
> 
> $ ./scripts/get_maintainer.pl -f mm/vmpressure.c 
> Tejun Heo <tj@kernel.org> (commit_signer:6/7=86%)
> Michal Hocko <mhocko@suse.cz> (commit_signer:5/7=71%)
> Andrew Morton <akpm@linux-foundation.org> (commit_signer:4/7=57%)
> Li Zefan <lizefan@huawei.com> (commit_signer:3/7=43%)
> "Kirill A. Shutemov" <kirill@shutemov.name> (commit_signer:1/7=14%)
> linux-mm@kvack.org (open list:MEMORY MANAGEMENT)
> linux-kernel@vger.kernel.org (open list)
> 
> Any ideas?

get_maintainer has a lot of options.

get_maintainer tries to find people that are either
listed in the MAINTAINERS file or that have recently
(in the last year by default) worked on the file.

If you want to find all authors, use the --git-blame option

It's not the default because it can take quite awhile to run.

If you always want --git-blame added, use a .get_maintainer.conf
file to override the default options.

$ time ./scripts/get_maintainer.pl -f --git-blame mm/vmpressure.c
Tejun Heo <tj@kernel.org> (commit_signer:6/7=86%,commits:5/6=83%)
Michal Hocko <mhocko@suse.cz> (commit_signer:5/7=71%,authored lines:22/387=6%,commits:5/6=83%)
Andrew Morton <akpm@linux-foundation.org> (commit_signer:4/7=57%,commits:4/6=67%)
Li Zefan <lizefan@huawei.com> (commit_signer:3/7=43%,commits:2/6=33%)
"Kirill A. Shutemov" <kirill@shutemov.name> (commit_signer:1/7=14%,commits:1/6=17%)
Anton Vorontsov <anton.vorontsov@linaro.org> (authored lines:354/387=91%)
linux-mm@kvack.org (open list:MEMORY MANAGEMENT)
linux-kernel@vger.kernel.org (open list)

real	0m28.529s
user	0m25.400s
sys	0m0.764s

The --help option is moderately descriptive.
If you think it needs to be better, please say how.

cheers, Joe

$ ./scripts/get_maintainer.pl --help
usage: ./scripts/get_maintainer.pl [options] patchfile
       ./scripts/get_maintainer.pl [options] -f file|directory
version: 0.26

MAINTAINER field selection options:
  --email => print email address(es) if any
    --git => include recent git *-by: signers
    --git-all-signature-types => include signers regardless of signature type
        or use only (Signed-off-by:|Reviewed-by:|Acked-by:) signers (default: 0)
    --git-fallback => use git when no exact MAINTAINERS pattern (default: 1)
    --git-chief-penguins => include (Linus Torvalds)
    --git-min-signatures => number of signatures required (default: 1)
    --git-max-maintainers => maximum maintainers to add (default: 5)
    --git-min-percent => minimum percentage of commits required (default: 5)
    --git-blame => use git blame to find modified commits for patch or file
    --git-since => git history to use (default: 1-year-ago)
    --hg-since => hg history to use (default: -365)
    --interactive => display a menu (mostly useful if used with the --git option)
    --m => include maintainer(s) if any
    --n => include name 'Full Name <addr@domain.tld>'
    --l => include list(s) if any
    --s => include subscriber only list(s) if any
    --remove-duplicates => minimize duplicate email names/addresses
    --roles => show roles (status:subsystem, git-signer, list, etc...)
    --rolestats => show roles and statistics (commits/total_commits, %)
    --file-emails => add email addresses found in -f file (default: 0 (off))
  --scm => print SCM tree(s) if any
  --status => print status if any
  --subsystem => print subsystem name if any
  --web => print website(s) if any

Output type options:
  --separator [, ] => separator for multiple entries on 1 line
    using --separator also sets --nomultiline if --separator is not [, ]
  --multiline => print 1 entry per line

Other options:
  --pattern-depth => Number of pattern directory traversals (default: 0 (all))
  --keywords => scan patch for keywords (default: 1)
  --sections => print all of the subsystem sections with pattern matches
  --mailmap => use .mailmap file (default: 1)
  --version => show version
  --help => show this help information

Default options:
  [--email --nogit --git-fallback --m --n --l --multiline -pattern-depth=0
   --remove-duplicates --rolestats]

Notes:
  Using "-f directory" may give unexpected results:
      Used with "--git", git signators for _all_ files in and below
          directory are examined as git recurses directories.
          Any specified X: (exclude) pattern matches are _not_ ignored.
      Used with "--nogit", directory is used as a pattern match,
          no individual file within the directory or subdirectory
          is matched.
      Used with "--git-blame", does not iterate all files in directory
  Using "--git-blame" is slow and may add old committers and authors
      that are no longer active maintainers to the output.
  Using "--roles" or "--rolestats" with git send-email --cc-cmd or any
      other automated tools that expect only ["name"] <email address>
      may not work because of additional output after <email address>.
  Using "--rolestats" and "--git-blame" shows the #/total=% commits,
      not the percentage of the entire file authored.  # of commits is
      not a good measure of amount of code authored.  1 major commit may
      contain a thousand lines, 5 trivial commits may modify a single line.
  If git is not installed, but mercurial (hg) is installed and an .hg
      repository exists, the following options apply to mercurial:
          --git,
          --git-min-signatures, --git-max-maintainers, --git-min-percent, and
          --git-blame
      Use --hg-since not --git-since to control date selection
  File ".get_maintainer.conf", if it exists in the linux kernel source root
      directory, can change whatever get_maintainer defaults are desired.
      Entries in this file can be any command line argument.
      This file is prepended to any additional command line arguments.
      Multiple lines and # comments are allowed.



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [bug] get_maintainer.pl incomplete output
  2013-10-17  4:03   ` Joe Perches
@ 2013-10-17  4:19     ` David Rientjes
  2013-10-17  4:36       ` Joe Perches
  0 siblings, 1 reply; 14+ messages in thread
From: David Rientjes @ 2013-10-17  4:19 UTC (permalink / raw)
  To: Joe Perches
  Cc: Andrew Morton, Anton Vorontsov, Michal Hocko, Kirill A. Shutemov,
	linux-kernel, linux-mm

On Wed, 16 Oct 2013, Joe Perches wrote:

> > I haven't looked closely at scripts/get_maintainer.pl, but I recently 
> > wrote a patch touching mm/vmpressure.c and it doesn't list the file's 
> > author, Anton Vorontsov <anton.vorontsov@linaro.org>.
> > 
> > Even when I do scripts/get_maintainer.pl -f mm/vmpressure.c, his entry is 
> > missing and git blame attributs >90% of the lines to his authorship.
> > 
> > $ ./scripts/get_maintainer.pl -f mm/vmpressure.c 
> > Tejun Heo <tj@kernel.org> (commit_signer:6/7=86%)
> > Michal Hocko <mhocko@suse.cz> (commit_signer:5/7=71%)
> > Andrew Morton <akpm@linux-foundation.org> (commit_signer:4/7=57%)
> > Li Zefan <lizefan@huawei.com> (commit_signer:3/7=43%)
> > "Kirill A. Shutemov" <kirill@shutemov.name> (commit_signer:1/7=14%)
> > linux-mm@kvack.org (open list:MEMORY MANAGEMENT)
> > linux-kernel@vger.kernel.org (open list)
> > 
> > Any ideas?
> 
> get_maintainer has a lot of options.
> 
> get_maintainer tries to find people that are either
> listed in the MAINTAINERS file or that have recently
> (in the last year by default) worked on the file.
> 
> If you want to find all authors, use the --git-blame option
> 
> It's not the default because it can take quite awhile to run.
> 

Hmm, it's a little strange to only consider recent activity when >90% of 
the lines were written by someone not listed.  Isn't there any faster way 
to determine that besides using the expensive git blame?  Something like 
weighing the output of "git show --shortstat" for all commits in "git log 
mm/vmpressure.c" to determine the most important recent changes?  That 
should be fairly cheap.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [bug] get_maintainer.pl incomplete output
  2013-10-17  4:19     ` David Rientjes
@ 2013-10-17  4:36       ` Joe Perches
  0 siblings, 0 replies; 14+ messages in thread
From: Joe Perches @ 2013-10-17  4:36 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Anton Vorontsov, Michal Hocko, Kirill A. Shutemov,
	linux-kernel, linux-mm

On Wed, 2013-10-16 at 21:19 -0700, David Rientjes wrote:
> On Wed, 16 Oct 2013, Joe Perches wrote:
> 
> > > I haven't looked closely at scripts/get_maintainer.pl, but I recently 
> > > wrote a patch touching mm/vmpressure.c and it doesn't list the file's 
> > > author, Anton Vorontsov <anton.vorontsov@linaro.org>.
> > > 
> > > Even when I do scripts/get_maintainer.pl -f mm/vmpressure.c, his entry is 
> > > missing and git blame attributs >90% of the lines to his authorship.
> > > 
> > > $ ./scripts/get_maintainer.pl -f mm/vmpressure.c 
> > > Tejun Heo <tj@kernel.org> (commit_signer:6/7=86%)
> > > Michal Hocko <mhocko@suse.cz> (commit_signer:5/7=71%)
> > > Andrew Morton <akpm@linux-foundation.org> (commit_signer:4/7=57%)
> > > Li Zefan <lizefan@huawei.com> (commit_signer:3/7=43%)
> > > "Kirill A. Shutemov" <kirill@shutemov.name> (commit_signer:1/7=14%)
> > > linux-mm@kvack.org (open list:MEMORY MANAGEMENT)
> > > linux-kernel@vger.kernel.org (open list)
> > > 
> > > Any ideas?
> > 
> > get_maintainer has a lot of options.
> > 
> > get_maintainer tries to find people that are either
> > listed in the MAINTAINERS file or that have recently
> > (in the last year by default) worked on the file.
> > 
> > If you want to find all authors, use the --git-blame option
> > 
> > It's not the default because it can take quite awhile to run.
> > 
> 
> Hmm, it's a little strange to only consider recent activity when >90% of 
> the lines were written by someone not listed.  Isn't there any faster way 
> to determine that besides using the expensive git blame?

Not so far as I know.

> Something like 
> weighing the output of "git show --shortstat" for all commits in "git log 
> mm/vmpressure.c" to determine the most important recent changes?  That 
> should be fairly cheap.

Important can be hard to determine.

git blame effectively does a --follow

get_maintainers already does a git log <file>
and accumulates all the signatures for the
time selected by --git-since

It doesn't weight each commit by +/- count.

I think the most significant negative to
the current get_maintainer is that only the
"signature" lines are considered.

The "Author:" line isn't.

The default ris for get_maintainer to only list
a maximum of 5 "maintainers" ordered by signature
count.

Anton has only signed/acked 1.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [bug] get_maintainer.pl incomplete output
  2013-10-17  3:51 ` [bug] get_maintainer.pl incomplete output David Rientjes
  2013-10-17  4:03   ` Joe Perches
@ 2013-10-17 19:12   ` Andrew Morton
  2013-10-17 19:23     ` Joe Perches
  2013-10-18  4:17     ` Joe Perches
  1 sibling, 2 replies; 14+ messages in thread
From: Andrew Morton @ 2013-10-17 19:12 UTC (permalink / raw)
  To: David Rientjes
  Cc: Joe Perches, Anton Vorontsov, Michal Hocko, Kirill A. Shutemov,
	linux-kernel, linux-mm

On Wed, 16 Oct 2013 20:51:18 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:

> I haven't looked closely at scripts/get_maintainer.pl, but I recently 
> wrote a patch touching mm/vmpressure.c and it doesn't list the file's 
> author, Anton Vorontsov <anton.vorontsov@linaro.org>.
> 
> Even when I do scripts/get_maintainer.pl -f mm/vmpressure.c, his entry is 
> missing and git blame attributs >90% of the lines to his authorship.
> 
> $ ./scripts/get_maintainer.pl -f mm/vmpressure.c 
> Tejun Heo <tj@kernel.org> (commit_signer:6/7=86%)
> Michal Hocko <mhocko@suse.cz> (commit_signer:5/7=71%)
> Andrew Morton <akpm@linux-foundation.org> (commit_signer:4/7=57%)
> Li Zefan <lizefan@huawei.com> (commit_signer:3/7=43%)
> "Kirill A. Shutemov" <kirill@shutemov.name> (commit_signer:1/7=14%)
> linux-mm@kvack.org (open list:MEMORY MANAGEMENT)
> linux-kernel@vger.kernel.org (open list)

get_maintainer should, by default, answer the question "who should I
email about this file".  It clearly isn't doing this, and that's a
pretty big fail.

I've learned not to trust it, so when I use it I always have to check
its homework with "git log | grep Author" :(

Joe, pretty please?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [bug] get_maintainer.pl incomplete output
  2013-10-17 19:12   ` Andrew Morton
@ 2013-10-17 19:23     ` Joe Perches
  2013-10-18  4:17     ` Joe Perches
  1 sibling, 0 replies; 14+ messages in thread
From: Joe Perches @ 2013-10-17 19:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Rientjes, Anton Vorontsov, Michal Hocko,
	Kirill A. Shutemov, linux-kernel, linux-mm

On Thu, 2013-10-17 at 12:12 -0700, Andrew Morton wrote:
> On Wed, 16 Oct 2013 20:51:18 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:
> 
> > I haven't looked closely at scripts/get_maintainer.pl, but I recently 
> > wrote a patch touching mm/vmpressure.c and it doesn't list the file's 
> > author, Anton Vorontsov <anton.vorontsov@linaro.org>.
> > 
> > Even when I do scripts/get_maintainer.pl -f mm/vmpressure.c, his entry is 
> > missing and git blame attributs >90% of the lines to his authorship.
> > 
> > $ ./scripts/get_maintainer.pl -f mm/vmpressure.c 
> > Tejun Heo <tj@kernel.org> (commit_signer:6/7=86%)
> > Michal Hocko <mhocko@suse.cz> (commit_signer:5/7=71%)
> > Andrew Morton <akpm@linux-foundation.org> (commit_signer:4/7=57%)
> > Li Zefan <lizefan@huawei.com> (commit_signer:3/7=43%)
> > "Kirill A. Shutemov" <kirill@shutemov.name> (commit_signer:1/7=14%)
> > linux-mm@kvack.org (open list:MEMORY MANAGEMENT)
> > linux-kernel@vger.kernel.org (open list)
> 
> get_maintainer should, by default, answer the question "who should I
> email about this file".  It clearly isn't doing this, and that's a
> pretty big fail.

I disagree.

It's decidedly good at doing precisely that when a
MAINTAINERS entry exists.

When no one is a listed maintainer, the results
can certainly be tweaked to be better.

> I've learned not to trust it, so when I use it I always have to check
> its homework with "git log | grep Author" :(
> 
> Joe, pretty please?

It's really a question of "how long ago is too long ago" as
older commits way too often also show old/invalid email
addresses.

I also don't want to wait over 30 seconds or so to find out
who is listed as a git signer/author by default.

Adding the commit listed "Author:" as a signer doesn't seem
too hard though.  I'll play with that.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [bug] get_maintainer.pl incomplete output
  2013-10-17 19:12   ` Andrew Morton
  2013-10-17 19:23     ` Joe Perches
@ 2013-10-18  4:17     ` Joe Perches
  2013-10-18 22:58       ` David Rientjes
  2013-11-14 21:56       ` [PATCH] get_maintainer: Add commit author information to --rolestats Joe Perches
  1 sibling, 2 replies; 14+ messages in thread
From: Joe Perches @ 2013-10-18  4:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Rientjes, Anton Vorontsov, Michal Hocko,
	Kirill A. Shutemov, linux-kernel, linux-mm

On Thu, 2013-10-17 at 12:12 -0700, Andrew Morton wrote:
> On Wed, 16 Oct 2013 20:51:18 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:
> 
> > I haven't looked closely at scripts/get_maintainer.pl, but I recently 
> > wrote a patch touching mm/vmpressure.c and it doesn't list the file's 
> > author, Anton Vorontsov <anton.vorontsov@linaro.org>.
> > 
> > Even when I do scripts/get_maintainer.pl -f mm/vmpressure.c, his entry is 
> > missing and git blame attributs >90% of the lines to his authorship.
> > 
> > $ ./scripts/get_maintainer.pl -f mm/vmpressure.c 
> > Tejun Heo <tj@kernel.org> (commit_signer:6/7=86%)
> > Michal Hocko <mhocko@suse.cz> (commit_signer:5/7=71%)
> > Andrew Morton <akpm@linux-foundation.org> (commit_signer:4/7=57%)
> > Li Zefan <lizefan@huawei.com> (commit_signer:3/7=43%)
> > "Kirill A. Shutemov" <kirill@shutemov.name> (commit_signer:1/7=14%)
> > linux-mm@kvack.org (open list:MEMORY MANAGEMENT)
> > linux-kernel@vger.kernel.org (open list)
> 
> get_maintainer should, by default, answer the question "who should I
> email about this file".  It clearly isn't doing this, and that's a
> pretty big fail.
> 
> I've learned not to trust it, so when I use it I always have to check
> its homework with "git log | grep Author" :(
> 
> Joe, pretty please?

Try this:

It adds authored/lines_added/lines_deleted to rolestats

For instance:

$ ./scripts/get_maintainer.pl -f mm/vmpressure.c
Tejun Heo <tj@kernel.org> (commit_signer:6/7=86%,authored:3/7=43%,removed_lines:15/21=71%)
Michal Hocko <mhocko@suse.cz> (commit_signer:5/7=71%,authored:3/7=43%,added_lines:22/408=5%,removed_lines:6/21=29%)
Andrew Morton <akpm@linux-foundation.org> (commit_signer:4/7=57%)
Li Zefan <lizefan@huawei.com> (commit_signer:3/7=43%)
"Kirill A. Shutemov" <kirill@shutemov.name> (commit_signer:1/7=14%)
Anton Vorontsov <anton.vorontsov@linaro.org> (authored:1/7=14%,added_lines:374/408=92%)
linux-mm@kvack.org (open list:MEMORY MANAGEMENT)
linux-kernel@vger.kernel.org (open list)

I haven't tested it much.

---

 scripts/get_maintainer.pl | 90 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 84 insertions(+), 6 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index 5e4fb14..ee9adb8 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -98,6 +98,7 @@ my %VCS_cmds_git = (
     "available" => '(which("git") ne "") && (-d ".git")',
     "find_signers_cmd" =>
 	"git log --no-color --follow --since=\$email_git_since " .
+	    '--numstat --no-merges ' .
 	    '--format="GitCommit: %H%n' .
 		      'GitAuthor: %an <%ae>%n' .
 		      'GitDate: %aD%n' .
@@ -106,6 +107,7 @@ my %VCS_cmds_git = (
 	    " -- \$file",
     "find_commit_signers_cmd" =>
 	"git log --no-color " .
+	    '--numstat ' .
 	    '--format="GitCommit: %H%n' .
 		      'GitAuthor: %an <%ae>%n' .
 		      'GitDate: %aD%n' .
@@ -114,6 +116,7 @@ my %VCS_cmds_git = (
 	    " -1 \$commit",
     "find_commit_author_cmd" =>
 	"git log --no-color " .
+	    '--numstat ' .
 	    '--format="GitCommit: %H%n' .
 		      'GitAuthor: %an <%ae>%n' .
 		      'GitDate: %aD%n' .
@@ -125,6 +128,7 @@ my %VCS_cmds_git = (
     "blame_commit_pattern" => "^([0-9a-f]+) ",
     "author_pattern" => "^GitAuthor: (.*)",
     "subject_pattern" => "^GitSubject: (.*)",
+    "stat_pattern" => "(\\d+)\\t(\\d+)\\t\$file",
 );
 
 my %VCS_cmds_hg = (
@@ -152,6 +156,7 @@ my %VCS_cmds_hg = (
     "blame_commit_pattern" => "^([ 0-9a-f]+):",
     "author_pattern" => "^HgAuthor: (.*)",
     "subject_pattern" => "^HgSubject: (.*)",
+    "stat_pattern" => "(\\d+)\t(\\d+)\t\$file",
 );
 
 my $conf = which_conf(".get_maintainer.conf");
@@ -1269,20 +1274,30 @@ sub extract_formatted_signatures {
 }
 
 sub vcs_find_signers {
-    my ($cmd) = @_;
+    my ($cmd, $file) = @_;
     my $commits;
     my @lines = ();
     my @signatures = ();
+    my @authors = ();
+    my @stats = ();
 
     @lines = &{$VCS_cmds{"execute_cmd"}}($cmd);
 
     my $pattern = $VCS_cmds{"commit_pattern"};
+    my $author_pattern = $VCS_cmds{"author_pattern"};
+    my $stat_pattern = $VCS_cmds{"stat_pattern"};
+
+    $stat_pattern =~ s/(\$\w+)/$1/eeg;		#interpolate $stat_pattern
 
     $commits = grep(/$pattern/, @lines);	# of commits
 
+    @authors = grep(/$author_pattern/, @lines);
     @signatures = grep(/^[ \t]*${signature_pattern}.*\@.*$/, @lines);
+    @stats = grep(/$stat_pattern/, @lines);
+
+#    print("stats: <@stats>\n");
 
-    return (0, @signatures) if !@signatures;
+    return (0, @signatures, @authors) if !@signatures;
 
     save_commits_by_author(@lines) if ($interactive);
     save_commits_by_signer(@lines) if ($interactive);
@@ -1291,9 +1306,10 @@ sub vcs_find_signers {
 	@signatures = grep(!/${penguin_chiefs}/i, @signatures);
     }
 
+    my ($author_ref, $authors_ref) = extract_formatted_signatures(@authors);
     my ($types_ref, $signers_ref) = extract_formatted_signatures(@signatures);
 
-    return ($commits, @$signers_ref);
+    return ($commits, $signers_ref, $authors_ref, \@stats);
 }
 
 sub vcs_find_author {
@@ -1849,7 +1865,12 @@ sub vcs_assign {
 sub vcs_file_signoffs {
     my ($file) = @_;
 
+    my $authors_ref;
+    my $signers_ref;
+    my $stats_ref;
+    my @authors = ();
     my @signers = ();
+    my @stats = ();
     my $commits;
 
     $vcs_used = vcs_exists();
@@ -1858,13 +1879,58 @@ sub vcs_file_signoffs {
     my $cmd = $VCS_cmds{"find_signers_cmd"};
     $cmd =~ s/(\$\w+)/$1/eeg;		# interpolate $cmd
 
-    ($commits, @signers) = vcs_find_signers($cmd);
+    ($commits, $signers_ref, $authors_ref, $stats_ref) = vcs_find_signers($cmd, $file);
+    @signers = @{$signers_ref};
+    @authors = @{$authors_ref};
+    @stats = @{$stats_ref};
+
+#    print("commits: <$commits>\nsigners:<@signers>\nauthors: <@authors>\nstats: <@stats>\n");
 
     foreach my $signer (@signers) {
 	$signer = deduplicate_email($signer);
     }
 
     vcs_assign("commit_signer", $commits, @signers);
+    vcs_assign("authored", $commits, @authors);
+    if ($#authors == $#stats) {
+	my $stat_pattern = $VCS_cmds{"stat_pattern"};
+	$stat_pattern =~ s/(\$\w+)/$1/eeg;	#interpolate $stat_pattern
+
+	my $added = 0;
+	my $deleted = 0;
+	for (my $i = 0; $i <= $#stats; $i++) {
+	    if ($stats[$i] =~ /$stat_pattern/) {
+		$added += $1;
+		$deleted += $2;
+	    }
+	}
+	my @tmp_authors = uniq(@authors);
+	foreach my $author (@tmp_authors) {
+	    $author = deduplicate_email($author);
+	}
+	@tmp_authors = uniq(@tmp_authors);
+	my @list_added = ();
+	my @list_deleted = ();
+	foreach my $author (@tmp_authors) {
+	    my $auth_added = 0;
+	    my $auth_deleted = 0;
+	    for (my $i = 0; $i <= $#stats; $i++) {
+		if ($author eq deduplicate_email($authors[$i]) &&
+		    $stats[$i] =~ /$stat_pattern/) {
+		    $auth_added += $1;
+		    $auth_deleted += $2;
+		}
+	    }
+	    for (my $i = 0; $i < $auth_added; $i++) {
+		push(@list_added, $author);
+	    }
+	    for (my $i = 0; $i < $auth_deleted; $i++) {
+		push(@list_deleted, $author);
+	    }
+	}
+	vcs_assign("added_lines", $added, @list_added);
+	vcs_assign("removed_lines", $deleted, @list_deleted);
+    }
 }
 
 sub vcs_file_blame {
@@ -1887,6 +1953,10 @@ sub vcs_file_blame {
     if ($email_git_blame_signatures) {
 	if (vcs_is_hg()) {
 	    my $commit_count;
+	    my $commit_authors_ref;
+	    my $commit_signers_ref;
+	    my $stats_ref;
+	    my @commit_authors = ();
 	    my @commit_signers = ();
 	    my $commit = join(" -r ", @commits);
 	    my $cmd;
@@ -1894,19 +1964,27 @@ sub vcs_file_blame {
 	    $cmd = $VCS_cmds{"find_commit_signers_cmd"};
 	    $cmd =~ s/(\$\w+)/$1/eeg;	#substitute variables in $cmd
 
-	    ($commit_count, @commit_signers) = vcs_find_signers($cmd);
+	    ($commit_count, $commit_signers_ref, $commit_authors_ref, $stats_ref) = vcs_find_signers($cmd, $file);
+	    @commit_authors = @{$commit_authors_ref};
+	    @commit_signers = @{$commit_signers_ref};
 
 	    push(@signers, @commit_signers);
 	} else {
 	    foreach my $commit (@commits) {
 		my $commit_count;
+		my $commit_authors_ref;
+		my $commit_signers_ref;
+		my $stats_ref;
+		my @commit_authors = ();
 		my @commit_signers = ();
 		my $cmd;
 
 		$cmd = $VCS_cmds{"find_commit_signers_cmd"};
 		$cmd =~ s/(\$\w+)/$1/eeg;	#substitute variables in $cmd
 
-		($commit_count, @commit_signers) = vcs_find_signers($cmd);
+		($commit_count, $commit_signers_ref, $commit_authors_ref, $stats_ref) = vcs_find_signers($cmd, $file);
+		@commit_authors = @{$commit_authors_ref};
+		@commit_signers = @{$commit_signers_ref};
 
 		push(@signers, @commit_signers);
 	    }


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [bug] get_maintainer.pl incomplete output
  2013-10-18  4:17     ` Joe Perches
@ 2013-10-18 22:58       ` David Rientjes
  2013-10-19  0:25         ` Joe Perches
  2013-11-14 21:56       ` [PATCH] get_maintainer: Add commit author information to --rolestats Joe Perches
  1 sibling, 1 reply; 14+ messages in thread
From: David Rientjes @ 2013-10-18 22:58 UTC (permalink / raw)
  To: Joe Perches
  Cc: Andrew Morton, Anton Vorontsov, Michal Hocko, Kirill A. Shutemov,
	linux-kernel, linux-mm

On Thu, 17 Oct 2013, Joe Perches wrote:

> Try this:
> 
> It adds authored/lines_added/lines_deleted to rolestats
> 
> For instance:
> 
> $ ./scripts/get_maintainer.pl -f mm/vmpressure.c
> Tejun Heo <tj@kernel.org> (commit_signer:6/7=86%,authored:3/7=43%,removed_lines:15/21=71%)
> Michal Hocko <mhocko@suse.cz> (commit_signer:5/7=71%,authored:3/7=43%,added_lines:22/408=5%,removed_lines:6/21=29%)
> Andrew Morton <akpm@linux-foundation.org> (commit_signer:4/7=57%)
> Li Zefan <lizefan@huawei.com> (commit_signer:3/7=43%)
> "Kirill A. Shutemov" <kirill@shutemov.name> (commit_signer:1/7=14%)
> Anton Vorontsov <anton.vorontsov@linaro.org> (authored:1/7=14%,added_lines:374/408=92%)
> linux-mm@kvack.org (open list:MEMORY MANAGEMENT)
> linux-kernel@vger.kernel.org (open list)
> 
> I haven't tested it much.
> 

This looks good, thanks very much!  I'm not sure how useful the 
removed_lines stat is, but perhaps it can be useful for someone to chime 
in if someone proposes a patch that includes support that had already been 
removed.

Once it's signed off, feel free to add

	Acked-by: David Rientjes <rientjes@google.com>

Thanks again.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [bug] get_maintainer.pl incomplete output
  2013-10-18 22:58       ` David Rientjes
@ 2013-10-19  0:25         ` Joe Perches
  0 siblings, 0 replies; 14+ messages in thread
From: Joe Perches @ 2013-10-19  0:25 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Anton Vorontsov, Michal Hocko, Kirill A. Shutemov,
	linux-kernel, linux-mm

On Fri, 2013-10-18 at 15:58 -0700, David Rientjes wrote:
> On Thu, 17 Oct 2013, Joe Perches wrote:
> > Try this:
> > It adds authored/lines_added/lines_deleted to rolestats
> > For instance:
> > $ ./scripts/get_maintainer.pl -f mm/vmpressure.c
> > Tejun Heo <tj@kernel.org> (commit_signer:6/7=86%,authored:3/7=43%,removed_lines:15/21=71%)
> > Michal Hocko <mhocko@suse.cz> (commit_signer:5/7=71%,authored:3/7=43%,added_lines:22/408=5%,removed_lines:6/21=29%)
> > Andrew Morton <akpm@linux-foundation.org> (commit_signer:4/7=57%)
> > Li Zefan <lizefan@huawei.com> (commit_signer:3/7=43%)
> > "Kirill A. Shutemov" <kirill@shutemov.name> (commit_signer:1/7=14%)
> > Anton Vorontsov <anton.vorontsov@linaro.org> (authored:1/7=14%,added_lines:374/408=92%)
> > linux-mm@kvack.org (open list:MEMORY MANAGEMENT)
> > linux-kernel@vger.kernel.org (open list)
> > 
> > I haven't tested it much.
> This looks good, thanks very much!

Oh sure, it's a trifle.

> I'm not sure how useful the 
> removed_lines stat is, but perhaps it can be useful for someone to chime 
> in if someone proposes a patch that includes support that had already been 
> removed.

The whole concept of lines +/- is kind of dubious.
Quantity really doesn't judge value.

It makes some sense for patches where new files
are added, but new files are a relatively a small
percentage of the overall kernel source tree.

Let it stew for awhile in Andrew's tree and when
-next opens up again, let's see if there are any
other comments about it.

My guess is very few people will notice.

The best mechanism is to have MAINTAINERS entries.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] get_maintainer: Add commit author information to --rolestats
  2013-10-18  4:17     ` Joe Perches
  2013-10-18 22:58       ` David Rientjes
@ 2013-11-14 21:56       ` Joe Perches
  2013-11-15  2:16         ` Chen Gang
  1 sibling, 1 reply; 14+ messages in thread
From: Joe Perches @ 2013-11-14 21:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: David Rientjes, linux-kernel, linux-mm, Chen Gang

get_maintainer currently uses "Signed-off-by" style lines
to find interested parties to send patches to when the
MAINTAINERS file does not have a specific section entry
with a matching file pattern.

Add statistics for commit authors and lines added and
deleted to the information provided by --rolestats.

These statistics are also emitted whenever --rolestats
and --git are selected even when there is a specified
maintainer.

This can have the effect of expanding the number of people
that are shown as possible "maintainers" of a particular
file because "authors", "added_lines", and "removed_lines"
are also used as criterion for the --max-maintainers option
separate from the "commit_signers".

The first "--git-max-maintainers" values of each criterion
are emitted.  Any "ties" are not shown.

For example: (forcedeth does not have a named maintainer)

Old output:

$ ./scripts/get_maintainer.pl -f drivers/net/ethernet/nvidia/forcedeth.c
"David S. Miller" <davem@davemloft.net> (commit_signer:8/10=80%)
Jiri Pirko <jiri@resnulli.us> (commit_signer:2/10=20%)
Patrick McHardy <kaber@trash.net> (commit_signer:2/10=20%)
Larry Finger <Larry.Finger@lwfinger.net> (commit_signer:1/10=10%)
Peter Zijlstra <peterz@infradead.org> (commit_signer:1/10=10%)
netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
linux-kernel@vger.kernel.org (open list)

New output:

$ ./scripts/get_maintainer.pl -f drivers/net/ethernet/nvidia/forcedeth.c 
"David S. Miller" <davem@davemloft.net> (commit_signer:8/10=80%)
Jiri Pirko <jiri@resnulli.us> (commit_signer:2/10=20%,authored:2/10=20%,removed_lines:3/33=9%)
Patrick McHardy <kaber@trash.net> (commit_signer:2/10=20%,authored:2/10=20%,added_lines:12/95=13%,removed_lines:10/33=30%)
Larry Finger <Larry.Finger@lwfinger.net> (commit_signer:1/10=10%,authored:1/10=10%,added_lines:35/95=37%)
Peter Zijlstra <peterz@infradead.org> (commit_signer:1/10=10%)
"Peter Huwe" <PeterHuewe@gmx.de> (authored:1/10=10%,removed_lines:15/33=45%)
Joe Perches <joe@perches.com> (authored:1/10=10%)
Neil Horman <nhorman@tuxdriver.com> (added_lines:40/95=42%)
Bill Pemberton <wfp5p@virginia.edu> (removed_lines:3/33=9%)
netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
linux-kernel@vger.kernel.org (open list)

Signed-off-by: Joe Perches <joe@perches.com>

---

Andrew, please replace the get_maintainer "try this"
patch with this.

It also fixes a defect in that proposal you may have
picked up separately.  That fix isn't yet in -next.

 scripts/get_maintainer.pl | 91 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 85 insertions(+), 6 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index 5e4fb14..9c3986f 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -98,6 +98,7 @@ my %VCS_cmds_git = (
     "available" => '(which("git") ne "") && (-d ".git")',
     "find_signers_cmd" =>
 	"git log --no-color --follow --since=\$email_git_since " .
+	    '--numstat --no-merges ' .
 	    '--format="GitCommit: %H%n' .
 		      'GitAuthor: %an <%ae>%n' .
 		      'GitDate: %aD%n' .
@@ -106,6 +107,7 @@ my %VCS_cmds_git = (
 	    " -- \$file",
     "find_commit_signers_cmd" =>
 	"git log --no-color " .
+	    '--numstat ' .
 	    '--format="GitCommit: %H%n' .
 		      'GitAuthor: %an <%ae>%n' .
 		      'GitDate: %aD%n' .
@@ -114,6 +116,7 @@ my %VCS_cmds_git = (
 	    " -1 \$commit",
     "find_commit_author_cmd" =>
 	"git log --no-color " .
+	    '--numstat ' .
 	    '--format="GitCommit: %H%n' .
 		      'GitAuthor: %an <%ae>%n' .
 		      'GitDate: %aD%n' .
@@ -125,6 +128,7 @@ my %VCS_cmds_git = (
     "blame_commit_pattern" => "^([0-9a-f]+) ",
     "author_pattern" => "^GitAuthor: (.*)",
     "subject_pattern" => "^GitSubject: (.*)",
+    "stat_pattern" => "^(\\d+)\\t(\\d+)\\t\$file\$",
 );
 
 my %VCS_cmds_hg = (
@@ -152,6 +156,7 @@ my %VCS_cmds_hg = (
     "blame_commit_pattern" => "^([ 0-9a-f]+):",
     "author_pattern" => "^HgAuthor: (.*)",
     "subject_pattern" => "^HgSubject: (.*)",
+    "stat_pattern" => "^(\\d+)\t(\\d+)\t\$file\$",
 );
 
 my $conf = which_conf(".get_maintainer.conf");
@@ -1269,20 +1274,30 @@ sub extract_formatted_signatures {
 }
 
 sub vcs_find_signers {
-    my ($cmd) = @_;
+    my ($cmd, $file) = @_;
     my $commits;
     my @lines = ();
     my @signatures = ();
+    my @authors = ();
+    my @stats = ();
 
     @lines = &{$VCS_cmds{"execute_cmd"}}($cmd);
 
     my $pattern = $VCS_cmds{"commit_pattern"};
+    my $author_pattern = $VCS_cmds{"author_pattern"};
+    my $stat_pattern = $VCS_cmds{"stat_pattern"};
+
+    $stat_pattern =~ s/(\$\w+)/$1/eeg;		#interpolate $stat_pattern
 
     $commits = grep(/$pattern/, @lines);	# of commits
 
+    @authors = grep(/$author_pattern/, @lines);
     @signatures = grep(/^[ \t]*${signature_pattern}.*\@.*$/, @lines);
+    @stats = grep(/$stat_pattern/, @lines);
 
-    return (0, @signatures) if !@signatures;
+#    print("stats: <@stats>\n");
+
+    return (0, \@signatures, \@authors, \@stats) if !@signatures;
 
     save_commits_by_author(@lines) if ($interactive);
     save_commits_by_signer(@lines) if ($interactive);
@@ -1291,9 +1306,10 @@ sub vcs_find_signers {
 	@signatures = grep(!/${penguin_chiefs}/i, @signatures);
     }
 
+    my ($author_ref, $authors_ref) = extract_formatted_signatures(@authors);
     my ($types_ref, $signers_ref) = extract_formatted_signatures(@signatures);
 
-    return ($commits, @$signers_ref);
+    return ($commits, $signers_ref, $authors_ref, \@stats);
 }
 
 sub vcs_find_author {
@@ -1849,7 +1865,12 @@ sub vcs_assign {
 sub vcs_file_signoffs {
     my ($file) = @_;
 
+    my $authors_ref;
+    my $signers_ref;
+    my $stats_ref;
+    my @authors = ();
     my @signers = ();
+    my @stats = ();
     my $commits;
 
     $vcs_used = vcs_exists();
@@ -1858,13 +1879,59 @@ sub vcs_file_signoffs {
     my $cmd = $VCS_cmds{"find_signers_cmd"};
     $cmd =~ s/(\$\w+)/$1/eeg;		# interpolate $cmd
 
-    ($commits, @signers) = vcs_find_signers($cmd);
+    ($commits, $signers_ref, $authors_ref, $stats_ref) = vcs_find_signers($cmd, $file);
+
+    @signers = @{$signers_ref} if defined $signers_ref;
+    @authors = @{$authors_ref} if defined $authors_ref;
+    @stats = @{$stats_ref} if defined $stats_ref;
+
+#    print("commits: <$commits>\nsigners:<@signers>\nauthors: <@authors>\nstats: <@stats>\n");
 
     foreach my $signer (@signers) {
 	$signer = deduplicate_email($signer);
     }
 
     vcs_assign("commit_signer", $commits, @signers);
+    vcs_assign("authored", $commits, @authors);
+    if ($#authors == $#stats) {
+	my $stat_pattern = $VCS_cmds{"stat_pattern"};
+	$stat_pattern =~ s/(\$\w+)/$1/eeg;	#interpolate $stat_pattern
+
+	my $added = 0;
+	my $deleted = 0;
+	for (my $i = 0; $i <= $#stats; $i++) {
+	    if ($stats[$i] =~ /$stat_pattern/) {
+		$added += $1;
+		$deleted += $2;
+	    }
+	}
+	my @tmp_authors = uniq(@authors);
+	foreach my $author (@tmp_authors) {
+	    $author = deduplicate_email($author);
+	}
+	@tmp_authors = uniq(@tmp_authors);
+	my @list_added = ();
+	my @list_deleted = ();
+	foreach my $author (@tmp_authors) {
+	    my $auth_added = 0;
+	    my $auth_deleted = 0;
+	    for (my $i = 0; $i <= $#stats; $i++) {
+		if ($author eq deduplicate_email($authors[$i]) &&
+		    $stats[$i] =~ /$stat_pattern/) {
+		    $auth_added += $1;
+		    $auth_deleted += $2;
+		}
+	    }
+	    for (my $i = 0; $i < $auth_added; $i++) {
+		push(@list_added, $author);
+	    }
+	    for (my $i = 0; $i < $auth_deleted; $i++) {
+		push(@list_deleted, $author);
+	    }
+	}
+	vcs_assign("added_lines", $added, @list_added);
+	vcs_assign("removed_lines", $deleted, @list_deleted);
+    }
 }
 
 sub vcs_file_blame {
@@ -1887,6 +1954,10 @@ sub vcs_file_blame {
     if ($email_git_blame_signatures) {
 	if (vcs_is_hg()) {
 	    my $commit_count;
+	    my $commit_authors_ref;
+	    my $commit_signers_ref;
+	    my $stats_ref;
+	    my @commit_authors = ();
 	    my @commit_signers = ();
 	    my $commit = join(" -r ", @commits);
 	    my $cmd;
@@ -1894,19 +1965,27 @@ sub vcs_file_blame {
 	    $cmd = $VCS_cmds{"find_commit_signers_cmd"};
 	    $cmd =~ s/(\$\w+)/$1/eeg;	#substitute variables in $cmd
 
-	    ($commit_count, @commit_signers) = vcs_find_signers($cmd);
+	    ($commit_count, $commit_signers_ref, $commit_authors_ref, $stats_ref) = vcs_find_signers($cmd, $file);
+	    @commit_authors = @{$commit_authors_ref} if defined $commit_authors_ref;
+	    @commit_signers = @{$commit_signers_ref} if defined $commit_signers_ref;
 
 	    push(@signers, @commit_signers);
 	} else {
 	    foreach my $commit (@commits) {
 		my $commit_count;
+		my $commit_authors_ref;
+		my $commit_signers_ref;
+		my $stats_ref;
+		my @commit_authors = ();
 		my @commit_signers = ();
 		my $cmd;
 
 		$cmd = $VCS_cmds{"find_commit_signers_cmd"};
 		$cmd =~ s/(\$\w+)/$1/eeg;	#substitute variables in $cmd
 
-		($commit_count, @commit_signers) = vcs_find_signers($cmd);
+		($commit_count, $commit_signers_ref, $commit_authors_ref, $stats_ref) = vcs_find_signers($cmd, $file);
+		@commit_authors = @{$commit_authors_ref} if defined $commit_authors_ref;
+		@commit_signers = @{$commit_signers_ref} if defined $commit_signers_ref;
 
 		push(@signers, @commit_signers);
 	    }


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] get_maintainer: Add commit author information to --rolestats
  2013-11-14 21:56       ` [PATCH] get_maintainer: Add commit author information to --rolestats Joe Perches
@ 2013-11-15  2:16         ` Chen Gang
  0 siblings, 0 replies; 14+ messages in thread
From: Chen Gang @ 2013-11-15  2:16 UTC (permalink / raw)
  To: Joe Perches
  Cc: Andrew Morton, David Rientjes, linux-kernel, linux-mm, Li Zefan,
	Thomas Gleixner, Richard Weinberger

On 11/15/2013 05:56 AM, Joe Perches wrote:
> get_maintainer currently uses "Signed-off-by" style lines
> to find interested parties to send patches to when the
> MAINTAINERS file does not have a specific section entry
> with a matching file pattern.
> 
> Add statistics for commit authors and lines added and
> deleted to the information provided by --rolestats.
> 
> These statistics are also emitted whenever --rolestats
> and --git are selected even when there is a specified
> maintainer.
> 
> This can have the effect of expanding the number of people
> that are shown as possible "maintainers" of a particular
> file because "authors", "added_lines", and "removed_lines"
> are also used as criterion for the --max-maintainers option
> separate from the "commit_signers".
> 
> The first "--git-max-maintainers" values of each criterion
> are emitted.  Any "ties" are not shown.
> 
> For example: (forcedeth does not have a named maintainer)
> 
> Old output:
> 
> $ ./scripts/get_maintainer.pl -f drivers/net/ethernet/nvidia/forcedeth.c
> "David S. Miller" <davem@davemloft.net> (commit_signer:8/10=80%)
> Jiri Pirko <jiri@resnulli.us> (commit_signer:2/10=20%)
> Patrick McHardy <kaber@trash.net> (commit_signer:2/10=20%)
> Larry Finger <Larry.Finger@lwfinger.net> (commit_signer:1/10=10%)
> Peter Zijlstra <peterz@infradead.org> (commit_signer:1/10=10%)
> netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
> linux-kernel@vger.kernel.org (open list)
> 
> New output:
> 
> $ ./scripts/get_maintainer.pl -f drivers/net/ethernet/nvidia/forcedeth.c 
> "David S. Miller" <davem@davemloft.net> (commit_signer:8/10=80%)
> Jiri Pirko <jiri@resnulli.us> (commit_signer:2/10=20%,authored:2/10=20%,removed_lines:3/33=9%)
> Patrick McHardy <kaber@trash.net> (commit_signer:2/10=20%,authored:2/10=20%,added_lines:12/95=13%,removed_lines:10/33=30%)
> Larry Finger <Larry.Finger@lwfinger.net> (commit_signer:1/10=10%,authored:1/10=10%,added_lines:35/95=37%)
> Peter Zijlstra <peterz@infradead.org> (commit_signer:1/10=10%)
> "Peter Hi? 1/2 we" <PeterHuewe@gmx.de> (authored:1/10=10%,removed_lines:15/33=45%)
> Joe Perches <joe@perches.com> (authored:1/10=10%)
> Neil Horman <nhorman@tuxdriver.com> (added_lines:40/95=42%)
> Bill Pemberton <wfp5p@virginia.edu> (removed_lines:3/33=9%)
> netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
> linux-kernel@vger.kernel.org (open list)
> 
> Signed-off-by: Joe Perches <joe@perches.com>
> 

Thank you very much!

At least for me, it is OK (I test 4 patches, include 2 patches which may
cause issue with original "./scripts/get_maintainers.pl", but not with
current one).


BTW:

I have to say sorry for you (also for many other members), sometimes
what I said is not quite suitable (although it seems still 'polite').

I should stick to the fact, but need continue improving communication
experience (e.g. how to say and say what), and always clear the goal:
"make contributions to outside with lowest negative effect".

I also often make mistakes (some of them are common sense), firstly, I
should face them directly with standard way (it is basic requirement to
me), and then should try to reduce these things.


Welcome any members suggestions and completions.

Thanks.

> ---
> 
> Andrew, please replace the get_maintainer "try this"
> patch with this.
> 
> It also fixes a defect in that proposal you may have
> picked up separately.  That fix isn't yet in -next.
> 
>  scripts/get_maintainer.pl | 91 +++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 85 insertions(+), 6 deletions(-)
> 
> diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
> index 5e4fb14..9c3986f 100755
> --- a/scripts/get_maintainer.pl
> +++ b/scripts/get_maintainer.pl
> @@ -98,6 +98,7 @@ my %VCS_cmds_git = (
>      "available" => '(which("git") ne "") && (-d ".git")',
>      "find_signers_cmd" =>
>  	"git log --no-color --follow --since=\$email_git_since " .
> +	    '--numstat --no-merges ' .
>  	    '--format="GitCommit: %H%n' .
>  		      'GitAuthor: %an <%ae>%n' .
>  		      'GitDate: %aD%n' .
> @@ -106,6 +107,7 @@ my %VCS_cmds_git = (
>  	    " -- \$file",
>      "find_commit_signers_cmd" =>
>  	"git log --no-color " .
> +	    '--numstat ' .
>  	    '--format="GitCommit: %H%n' .
>  		      'GitAuthor: %an <%ae>%n' .
>  		      'GitDate: %aD%n' .
> @@ -114,6 +116,7 @@ my %VCS_cmds_git = (
>  	    " -1 \$commit",
>      "find_commit_author_cmd" =>
>  	"git log --no-color " .
> +	    '--numstat ' .
>  	    '--format="GitCommit: %H%n' .
>  		      'GitAuthor: %an <%ae>%n' .
>  		      'GitDate: %aD%n' .
> @@ -125,6 +128,7 @@ my %VCS_cmds_git = (
>      "blame_commit_pattern" => "^([0-9a-f]+) ",
>      "author_pattern" => "^GitAuthor: (.*)",
>      "subject_pattern" => "^GitSubject: (.*)",
> +    "stat_pattern" => "^(\\d+)\\t(\\d+)\\t\$file\$",
>  );
>  
>  my %VCS_cmds_hg = (
> @@ -152,6 +156,7 @@ my %VCS_cmds_hg = (
>      "blame_commit_pattern" => "^([ 0-9a-f]+):",
>      "author_pattern" => "^HgAuthor: (.*)",
>      "subject_pattern" => "^HgSubject: (.*)",
> +    "stat_pattern" => "^(\\d+)\t(\\d+)\t\$file\$",
>  );
>  
>  my $conf = which_conf(".get_maintainer.conf");
> @@ -1269,20 +1274,30 @@ sub extract_formatted_signatures {
>  }
>  
>  sub vcs_find_signers {
> -    my ($cmd) = @_;
> +    my ($cmd, $file) = @_;
>      my $commits;
>      my @lines = ();
>      my @signatures = ();
> +    my @authors = ();
> +    my @stats = ();
>  
>      @lines = &{$VCS_cmds{"execute_cmd"}}($cmd);
>  
>      my $pattern = $VCS_cmds{"commit_pattern"};
> +    my $author_pattern = $VCS_cmds{"author_pattern"};
> +    my $stat_pattern = $VCS_cmds{"stat_pattern"};
> +
> +    $stat_pattern =~ s/(\$\w+)/$1/eeg;		#interpolate $stat_pattern
>  
>      $commits = grep(/$pattern/, @lines);	# of commits
>  
> +    @authors = grep(/$author_pattern/, @lines);
>      @signatures = grep(/^[ \t]*${signature_pattern}.*\@.*$/, @lines);
> +    @stats = grep(/$stat_pattern/, @lines);
>  
> -    return (0, @signatures) if !@signatures;
> +#    print("stats: <@stats>\n");
> +
> +    return (0, \@signatures, \@authors, \@stats) if !@signatures;
>  
>      save_commits_by_author(@lines) if ($interactive);
>      save_commits_by_signer(@lines) if ($interactive);
> @@ -1291,9 +1306,10 @@ sub vcs_find_signers {
>  	@signatures = grep(!/${penguin_chiefs}/i, @signatures);
>      }
>  
> +    my ($author_ref, $authors_ref) = extract_formatted_signatures(@authors);
>      my ($types_ref, $signers_ref) = extract_formatted_signatures(@signatures);
>  
> -    return ($commits, @$signers_ref);
> +    return ($commits, $signers_ref, $authors_ref, \@stats);
>  }
>  
>  sub vcs_find_author {
> @@ -1849,7 +1865,12 @@ sub vcs_assign {
>  sub vcs_file_signoffs {
>      my ($file) = @_;
>  
> +    my $authors_ref;
> +    my $signers_ref;
> +    my $stats_ref;
> +    my @authors = ();
>      my @signers = ();
> +    my @stats = ();
>      my $commits;
>  
>      $vcs_used = vcs_exists();
> @@ -1858,13 +1879,59 @@ sub vcs_file_signoffs {
>      my $cmd = $VCS_cmds{"find_signers_cmd"};
>      $cmd =~ s/(\$\w+)/$1/eeg;		# interpolate $cmd
>  
> -    ($commits, @signers) = vcs_find_signers($cmd);
> +    ($commits, $signers_ref, $authors_ref, $stats_ref) = vcs_find_signers($cmd, $file);
> +
> +    @signers = @{$signers_ref} if defined $signers_ref;
> +    @authors = @{$authors_ref} if defined $authors_ref;
> +    @stats = @{$stats_ref} if defined $stats_ref;
> +
> +#    print("commits: <$commits>\nsigners:<@signers>\nauthors: <@authors>\nstats: <@stats>\n");
>  
>      foreach my $signer (@signers) {
>  	$signer = deduplicate_email($signer);
>      }
>  
>      vcs_assign("commit_signer", $commits, @signers);
> +    vcs_assign("authored", $commits, @authors);
> +    if ($#authors == $#stats) {
> +	my $stat_pattern = $VCS_cmds{"stat_pattern"};
> +	$stat_pattern =~ s/(\$\w+)/$1/eeg;	#interpolate $stat_pattern
> +
> +	my $added = 0;
> +	my $deleted = 0;
> +	for (my $i = 0; $i <= $#stats; $i++) {
> +	    if ($stats[$i] =~ /$stat_pattern/) {
> +		$added += $1;
> +		$deleted += $2;
> +	    }
> +	}
> +	my @tmp_authors = uniq(@authors);
> +	foreach my $author (@tmp_authors) {
> +	    $author = deduplicate_email($author);
> +	}
> +	@tmp_authors = uniq(@tmp_authors);
> +	my @list_added = ();
> +	my @list_deleted = ();
> +	foreach my $author (@tmp_authors) {
> +	    my $auth_added = 0;
> +	    my $auth_deleted = 0;
> +	    for (my $i = 0; $i <= $#stats; $i++) {
> +		if ($author eq deduplicate_email($authors[$i]) &&
> +		    $stats[$i] =~ /$stat_pattern/) {
> +		    $auth_added += $1;
> +		    $auth_deleted += $2;
> +		}
> +	    }
> +	    for (my $i = 0; $i < $auth_added; $i++) {
> +		push(@list_added, $author);
> +	    }
> +	    for (my $i = 0; $i < $auth_deleted; $i++) {
> +		push(@list_deleted, $author);
> +	    }
> +	}
> +	vcs_assign("added_lines", $added, @list_added);
> +	vcs_assign("removed_lines", $deleted, @list_deleted);
> +    }
>  }
>  
>  sub vcs_file_blame {
> @@ -1887,6 +1954,10 @@ sub vcs_file_blame {
>      if ($email_git_blame_signatures) {
>  	if (vcs_is_hg()) {
>  	    my $commit_count;
> +	    my $commit_authors_ref;
> +	    my $commit_signers_ref;
> +	    my $stats_ref;
> +	    my @commit_authors = ();
>  	    my @commit_signers = ();
>  	    my $commit = join(" -r ", @commits);
>  	    my $cmd;
> @@ -1894,19 +1965,27 @@ sub vcs_file_blame {
>  	    $cmd = $VCS_cmds{"find_commit_signers_cmd"};
>  	    $cmd =~ s/(\$\w+)/$1/eeg;	#substitute variables in $cmd
>  
> -	    ($commit_count, @commit_signers) = vcs_find_signers($cmd);
> +	    ($commit_count, $commit_signers_ref, $commit_authors_ref, $stats_ref) = vcs_find_signers($cmd, $file);
> +	    @commit_authors = @{$commit_authors_ref} if defined $commit_authors_ref;
> +	    @commit_signers = @{$commit_signers_ref} if defined $commit_signers_ref;
>  
>  	    push(@signers, @commit_signers);
>  	} else {
>  	    foreach my $commit (@commits) {
>  		my $commit_count;
> +		my $commit_authors_ref;
> +		my $commit_signers_ref;
> +		my $stats_ref;
> +		my @commit_authors = ();
>  		my @commit_signers = ();
>  		my $cmd;
>  
>  		$cmd = $VCS_cmds{"find_commit_signers_cmd"};
>  		$cmd =~ s/(\$\w+)/$1/eeg;	#substitute variables in $cmd
>  
> -		($commit_count, @commit_signers) = vcs_find_signers($cmd);
> +		($commit_count, $commit_signers_ref, $commit_authors_ref, $stats_ref) = vcs_find_signers($cmd, $file);
> +		@commit_authors = @{$commit_authors_ref} if defined $commit_authors_ref;
> +		@commit_signers = @{$commit_signers_ref} if defined $commit_signers_ref;
>  
>  		push(@signers, @commit_signers);
>  	    }
> 
> 
> 
> 


-- 
Chen Gang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2013-11-15  2:14 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-17  0:43 [patch] mm, vmpressure: add high level David Rientjes
2013-10-17  3:05 ` Anton Vorontsov
2013-10-17  3:44   ` David Rientjes
2013-10-17  3:51 ` [bug] get_maintainer.pl incomplete output David Rientjes
2013-10-17  4:03   ` Joe Perches
2013-10-17  4:19     ` David Rientjes
2013-10-17  4:36       ` Joe Perches
2013-10-17 19:12   ` Andrew Morton
2013-10-17 19:23     ` Joe Perches
2013-10-18  4:17     ` Joe Perches
2013-10-18 22:58       ` David Rientjes
2013-10-19  0:25         ` Joe Perches
2013-11-14 21:56       ` [PATCH] get_maintainer: Add commit author information to --rolestats Joe Perches
2013-11-15  2:16         ` Chen Gang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).