All of lore.kernel.org
 help / color / mirror / Atom feed
* [rfc patch] sched/topology: fix domain reconstruction memory leakage
@ 2017-08-19  6:10 Mike Galbraith
  2017-08-19  6:22 ` Mike Galbraith
  2017-08-21  8:16 ` Peter Zijlstra
  0 siblings, 2 replies; 6+ messages in thread
From: Mike Galbraith @ 2017-08-19  6:10 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar; +Cc: LKML

Greetings,

While beating on cpu hotplug with the shiny new topology fixes
backported, my memory poor 8 socket box fairly quickly leaked itself to
death, 0c0e776a9b0f being the culprit.  With the below applied, box
took a severe beating overnight without a whimper.

I'm wondering (ergo rfc) if free_sched_groups() shouldn't be renamed to
put_sched_groups() instead, with overlapping domains taking a group
reference reference as well so they can put both sg/sgc rather than put
one free the other.  Those places that want an explicit free can pass
free to only explicitly free sg (or use two functions).  Minimalist
approach works (minus signs, yay), but could perhaps use some "pretty".

sched/topology: fix domain reconstruction memory leakage

Since 0c0e776a9b0f, build_sched_groups() takes a reference on each
sg and sgc during domain generation, where previously it only took
a reference on the first group.  Iterate groups and drop all added
references during domain destruction, otherwise CPU hotplug leaks.

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Fixes: 0c0e776a9b0f ("sched/topology: Rewrite get_group()")
---
 kernel/sched/topology.c |   19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -323,7 +323,7 @@ static struct root_domain *alloc_rootdom
 	return rd;
 }
 
-static void free_sched_groups(struct sched_group *sg, int free_sgc)
+static void free_sched_groups(struct sched_group *sg, int put)
 {
 	struct sched_group *tmp, *first;
 
@@ -334,10 +334,11 @@ static void free_sched_groups(struct sch
 	do {
 		tmp = sg->next;
 
-		if (free_sgc && atomic_dec_and_test(&sg->sgc->ref))
+		if (put && atomic_dec_and_test(&sg->sgc->ref))
 			kfree(sg->sgc);
+		if (put < 2 || atomic_dec_and_test(&sg->ref))
+			kfree(sg);
 
-		kfree(sg);
 		sg = tmp;
 	} while (sg != first);
 }
@@ -345,15 +346,11 @@ static void free_sched_groups(struct sch
 static void destroy_sched_domain(struct sched_domain *sd)
 {
 	/*
-	 * If its an overlapping domain it has private groups, iterate and
-	 * nuke them all.
+	 * If it's an overlapping domain it has private groups, iterate,
+	 * freeing groups, otherwise dropping group references.  In both
+	 * cases, we must drop group capacity references.
 	 */
-	if (sd->flags & SD_OVERLAP) {
-		free_sched_groups(sd->groups, 1);
-	} else if (atomic_dec_and_test(&sd->groups->ref)) {
-		kfree(sd->groups->sgc);
-		kfree(sd->groups);
-	}
+	free_sched_groups(sd->groups, !(sd->flags & SD_OVERLAP)+1);
 	if (sd->shared && atomic_dec_and_test(&sd->shared->ref))
 		kfree(sd->shared);
 	kfree(sd);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [rfc patch] sched/topology: fix domain reconstruction memory leakage
  2017-08-19  6:10 [rfc patch] sched/topology: fix domain reconstruction memory leakage Mike Galbraith
@ 2017-08-19  6:22 ` Mike Galbraith
  2017-08-21  8:16 ` Peter Zijlstra
  1 sibling, 0 replies; 6+ messages in thread
From: Mike Galbraith @ 2017-08-19  6:22 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar; +Cc: LKML

On Sat, 2017-08-19 at 08:10 +0200, Mike Galbraith wrote:
> 
> Signed-off-by: Mike Galbraith <efault@gmx.de>
(grr, wrong /me autographed the damn thing)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [rfc patch] sched/topology: fix domain reconstruction memory leakage
  2017-08-19  6:10 [rfc patch] sched/topology: fix domain reconstruction memory leakage Mike Galbraith
  2017-08-19  6:22 ` Mike Galbraith
@ 2017-08-21  8:16 ` Peter Zijlstra
  2017-08-21 11:45   ` Mike Galbraith
  2017-08-21 13:30   ` Mike Galbraith
  1 sibling, 2 replies; 6+ messages in thread
From: Peter Zijlstra @ 2017-08-21  8:16 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Ingo Molnar, LKML

On Sat, Aug 19, 2017 at 08:10:49AM +0200, Mike Galbraith wrote:
> Greetings,
> 
> While beating on cpu hotplug with the shiny new topology fixes
> backported, my memory poor 8 socket box fairly quickly leaked itself to
> death, 0c0e776a9b0f being the culprit.  With the below applied, box
> took a severe beating overnight without a whimper.
> 
> I'm wondering (ergo rfc) if free_sched_groups() shouldn't be renamed to
> put_sched_groups() instead, with overlapping domains taking a group
> reference reference as well so they can put both sg/sgc rather than put
> one free the other.  Those places that want an explicit free can pass
> free to only explicitly free sg (or use two functions).  Minimalist
> approach works (minus signs, yay), but could perhaps use some "pretty".
> 
> sched/topology: fix domain reconstruction memory leakage

I was sitting on this one:

  https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=sched/core&id=c63d18dd6ea59eec5cba857835f788943ff9f0d5

is that the same?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [rfc patch] sched/topology: fix domain reconstruction memory leakage
  2017-08-21  8:16 ` Peter Zijlstra
@ 2017-08-21 11:45   ` Mike Galbraith
  2017-08-21 13:30   ` Mike Galbraith
  1 sibling, 0 replies; 6+ messages in thread
From: Mike Galbraith @ 2017-08-21 11:45 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, LKML

On Mon, 2017-08-21 at 10:16 +0200, Peter Zijlstra wrote:
> On Sat, Aug 19, 2017 at 08:10:49AM +0200, Mike Galbraith wrote:
> > Greetings,
> > 
> > While beating on cpu hotplug with the shiny new topology fixes
> > backported, my memory poor 8 socket box fairly quickly leaked itself to
> > death, 0c0e776a9b0f being the culprit.  With the below applied, box
> > took a severe beating overnight without a whimper.
> > 
> > I'm wondering (ergo rfc) if free_sched_groups() shouldn't be renamed to
> > put_sched_groups() instead, with overlapping domains taking a group
> > reference reference as well so they can put both sg/sgc rather than put
> > one free the other.  Those places that want an explicit free can pass
> > free to only explicitly free sg (or use two functions).  Minimalist
> > approach works (minus signs, yay), but could perhaps use some "pretty".
> > 
> > sched/topology: fix domain reconstruction memory leakage
> 
> I was sitting on this one:
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=sched/core&id=c63d18dd6ea59eec5cba857835f788943ff9f0d5
> 
> is that the same?

If that doesn't muck up the other places that were doing explicit sd
free without putting sgc, cool (I still like rename.. but then me and
pretty don't seem to get along all that well, so whatever works;).

	-Mike

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [rfc patch] sched/topology: fix domain reconstruction memory leakage
  2017-08-21  8:16 ` Peter Zijlstra
  2017-08-21 11:45   ` Mike Galbraith
@ 2017-08-21 13:30   ` Mike Galbraith
  2017-08-21 13:43     ` Peter Zijlstra
  1 sibling, 1 reply; 6+ messages in thread
From: Mike Galbraith @ 2017-08-21 13:30 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, LKML

On Mon, 2017-08-21 at 10:16 +0200, Peter Zijlstra wrote:
> On Sat, Aug 19, 2017 at 08:10:49AM +0200, Mike Galbraith wrote:
> > Greetings,
> > 
> > While beating on cpu hotplug with the shiny new topology fixes
> > backported, my memory poor 8 socket box fairly quickly leaked itself to
> > death, 0c0e776a9b0f being the culprit.  With the below applied, box
> > took a severe beating overnight without a whimper.
> > 
> > I'm wondering (ergo rfc) if free_sched_groups() shouldn't be renamed to
> > put_sched_groups() instead, with overlapping domains taking a group
> > reference reference as well so they can put both sg/sgc rather than put
> > one free the other.  Those places that want an explicit free can pass
> > free to only explicitly free sg (or use two functions).  Minimalist
> > approach works (minus signs, yay), but could perhaps use some "pretty".
> > 
> > sched/topology: fix domain reconstruction memory leakage
> 
> I was sitting on this one:
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=sched/core&id=c63d18dd6ea59eec5cba857835f788943ff9f0d5

The comment in the patch reads better to me like so:

@@ -345,15 +346,12 @@ static void free_sched_groups(struct sch
 static void destroy_sched_domain(struct sched_domain *sd)
 {
 	/*
-	 * If its an overlapping domain it has private groups, iterate and
-	 * nuke them all.
+	 * A normal sched domain may have multiple group references, an
+	 * overlapping domain, having private groups, only one.  Iterate,
+	 * dropping group/capacity references, freeing where none remain.
 	 */
-	if (sd->flags & SD_OVERLAP) {
-		free_sched_groups(sd->groups, 1);
-	} else if (atomic_dec_and_test(&sd->groups->ref)) {
-		kfree(sd->groups->sgc);
-		kfree(sd->groups);
-	}
+	free_sched_groups(sd->groups, 1);
+
 	if (sd->shared && atomic_dec_and_test(&sd->shared->ref))
 		kfree(sd->shared);
 	kfree(sd);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [rfc patch] sched/topology: fix domain reconstruction memory leakage
  2017-08-21 13:30   ` Mike Galbraith
@ 2017-08-21 13:43     ` Peter Zijlstra
  0 siblings, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2017-08-21 13:43 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Ingo Molnar, LKML

On Mon, Aug 21, 2017 at 03:30:29PM +0200, Mike Galbraith wrote:
> On Mon, 2017-08-21 at 10:16 +0200, Peter Zijlstra wrote:
> > On Sat, Aug 19, 2017 at 08:10:49AM +0200, Mike Galbraith wrote:
> > > Greetings,
> > > 
> > > While beating on cpu hotplug with the shiny new topology fixes
> > > backported, my memory poor 8 socket box fairly quickly leaked itself to
> > > death, 0c0e776a9b0f being the culprit.  With the below applied, box
> > > took a severe beating overnight without a whimper.
> > > 
> > > I'm wondering (ergo rfc) if free_sched_groups() shouldn't be renamed to
> > > put_sched_groups() instead, with overlapping domains taking a group
> > > reference reference as well so they can put both sg/sgc rather than put
> > > one free the other.  Those places that want an explicit free can pass
> > > free to only explicitly free sg (or use two functions).  Minimalist
> > > approach works (minus signs, yay), but could perhaps use some "pretty".
> > > 
> > > sched/topology: fix domain reconstruction memory leakage
> > 
> > I was sitting on this one:
> > 
> >   https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=sched/core&id=c63d18dd6ea59eec5cba857835f788943ff9f0d5
> 
> The comment in the patch reads better to me like so:
> 
> @@ -345,15 +346,12 @@ static void free_sched_groups(struct sch
>  static void destroy_sched_domain(struct sched_domain *sd)
>  {
>  	/*
> +	 * A normal sched domain may have multiple group references, an
> +	 * overlapping domain, having private groups, only one.  Iterate,
> +	 * dropping group/capacity references, freeing where none remain.
>  	 */
> 	free_sched_groups(sd->groups, 1);

Made it so, thanks!

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-08-21 13:43 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-19  6:10 [rfc patch] sched/topology: fix domain reconstruction memory leakage Mike Galbraith
2017-08-19  6:22 ` Mike Galbraith
2017-08-21  8:16 ` Peter Zijlstra
2017-08-21 11:45   ` Mike Galbraith
2017-08-21 13:30   ` Mike Galbraith
2017-08-21 13:43     ` Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.