All of lore.kernel.org
 help / color / mirror / Atom feed
From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: Preeti Murthy <preeti.lkml@gmail.com>,
	mingo@kernel.org, hpa@zytor.com, linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	mikey@neuling.org, linux-tip-commits@vger.kernel.org
Subject: Re: [PATCH v2] sched: Check sched_domain before computing group power.
Date: Thu, 14 Nov 2013 11:36:27 +0530	[thread overview]
Message-ID: <52846863.4060001@linux.vnet.ibm.com> (raw)
In-Reply-To: <20131113112329.GC543@linux.vnet.ibm.com>

Hi,

On 11/13/2013 04:53 PM, Srikar Dronamraju wrote:
> * Preeti Murthy <preeti.lkml@gmail.com> [2013-11-13 16:22:37]:
> 
>> Hi Srikar,
>>
>> update_group_power() is called only during load balancing during
>> update_sg_lb_stats().
>> Load balancing begins at the base domain of the CPU,rq(cpu)->sd. This is
>> checked for
>> NULL. So how can update_group_power() be called in a scenario where the
>> base domain
>> of the CPU is not initialized? I say 'initialized' since you check for NULL
>> on rq(cpu)->sd.
>>
> 
> update_group_power() also gets called from init_sched_groups_power().
> And if you see the oops message, we know that the oops happens from that
> path. In build_sched_domains(), we do cpu_attach_domain() what updates
> rq->sd after the call to init_sched_groups_power(). So by the time
> init_sched_groups_power() is called rq->sd is not yet initialized. 
> 
> We only hit oops case, when the sd->flags has SD_OVERLAP set.
> 
> 
> 
>> In the changelog, you say 'updated'. Are you saying that it has a stale
>> value?
> 
> I said, "before the sched_domain for a cpu is updated", so its not yet
> updated or has stale value. As I said earlier in this mail, the
> initialization happens after we do a update_group_power().
> 
>> Please do elaborate on how you observed this.
>>
> 
> Does this clarify?

Yes it clarifies, thank you.

However I was thinking that a better fix would be to reorder the way we call
update_group_power() and cpu_attach_domain(). Why do we need to do
update_group_power() of the groups of the sched domains that would probably
degenerate in cpu_attach_domain()? So it seemed best to move update_group_power()
to after cpu_attach_domain() so that it saves unnecessary iterations over
sched domains which could degenerate, and it fixes the issue that you have brought out
as well. See below for the patch:

-------------------------------------------------------------------------------

sched: Update power of sched groups after sched domains have been attached to CPUs

From: Preeti U Murthy <preeti@linux.vnet.ibm.com>

Avoid iterating unnecessarily over the sched domains which could potentially
degenerate,  while updating sched groups' power. This can be done by moving
the call to init_sched_groups_power() to after cpu_attach_domain(), when the
possibility of degenerating sched domains is examined and appropriately sched
domains are degenerated.

But claim_allocations() which does a NULL on the struct sd_data members for
each sched domain should iterate over all the initally built sched domains.
So move claim_allocations() to a loop where we build sched groups for each domain.
We would not require to reference sd_data after sched domains and sched
groups have been built.

Another use of this re-ordering is with reference to the commit
"sched/fair: Fix group power_orig computation". After this commit,
we end up dereferencing cpu_rq(cpu)->sd in update_group_power(). This would
lead to a NULL pointer since this parameter is updated after call to
update_group_power() in build_sched_domains() during initialization of sched
domains per CPU. The below change prevents this from occuring.

Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com>
---
 kernel/sched/core.c |   20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e6a6244..d9703ac 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6211,17 +6211,7 @@ static int build_sched_domains(const struct cpumask *cpu_map,
 				if (build_sched_groups(sd, i))
 					goto error;
 			}
-		}
-	}
-
-	/* Calculate CPU power for physical packages and nodes */
-	for (i = nr_cpumask_bits-1; i >= 0; i--) {
-		if (!cpumask_test_cpu(i, cpu_map))
-			continue;
-
-		for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent) {
 			claim_allocations(i, sd);
-			init_sched_groups_power(i, sd);
 		}
 	}
 
@@ -6233,6 +6223,16 @@ static int build_sched_domains(const struct cpumask *cpu_map,
 	}
 	rcu_read_unlock();
 
+	/* Calculate CPU power for physical packages and nodes */
+	for (i = nr_cpumask_bits-1; i >= 0; i--) {
+		if (!cpumask_test_cpu(i, cpu_map))
+			continue;
+
+		for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent)
+			init_sched_groups_power(i, sd);
+	}
+
+
 	ret = 0;
 error:
 	__free_domain_allocs(&d, alloc_state, cpu_map);

> 
Regards
Preeti U. Murthy


  reply	other threads:[~2013-11-14  6:09 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-12 18:05 [tip:sched/core] sched/fair: Fix group power_orig computation tip-bot for Peter Zijlstra
2013-09-12 23:21 ` Michael Neuling
2013-11-12 10:55 ` Srikar Dronamraju
2013-11-12 11:57   ` Peter Zijlstra
2013-11-12 16:41     ` [PATCH v2] sched: Check sched_domain before computing group power Srikar Dronamraju
2013-11-12 17:03       ` Peter Zijlstra
2013-11-12 17:15         ` Srikar Dronamraju
2013-11-12 17:55           ` Peter Zijlstra
2013-11-13  5:55             ` Srikar Dronamraju
     [not found]       ` <CAM4v1pNMn=5oZDiX3fUp9uPkZTPJgk=vEKEjevzvpwn=PjTzXg@mail.gmail.com>
2013-11-13 11:23         ` Srikar Dronamraju
2013-11-14  6:06           ` Preeti U Murthy [this message]
2013-11-14  8:30             ` Peter Zijlstra
2013-11-14  9:12               ` Preeti U Murthy
2013-11-13 15:17       ` Peter Zijlstra
2013-11-14 10:50         ` Srikar Dronamraju
2013-11-14 11:15           ` Peter Zijlstra
2013-11-19 19:15         ` [tip:sched/urgent] " tip-bot for Srikar Dronamraju
2013-11-19 23:36           ` Yinghai Lu
2013-11-21 15:03             ` Peter Zijlstra
2013-11-21 17:22               ` Yinghai Lu
2013-11-21 22:03                 ` Yinghai Lu
2013-11-28  3:02                   ` David Rientjes
2013-11-28  7:07                     ` Yinghai Lu
2013-11-28  9:38                       ` Peter Zijlstra
2013-11-28 20:23                         ` Yinghai Lu
2013-12-06  6:24                       ` Yinghai Lu
2013-12-10 10:58                         ` Peter Zijlstra
2013-12-10 21:26                           ` Yinghai Lu
2013-11-22 12:07                 ` Peter Zijlstra
2013-11-23  5:00                   ` Yinghai Lu
2013-11-23 18:53                     ` Peter Zijlstra
2013-11-28  2:57           ` David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52846863.4060001@linux.vnet.ibm.com \
    --to=preeti@linux.vnet.ibm.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=mikey@neuling.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=preeti.lkml@gmail.com \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.