Date: Thu, 4 May 2017 15:31:22 +0200
From: Peter Zijlstra
To: Tejun Heo
Cc: Ingo Molnar, linux-kernel@vger.kernel.org, Linus Torvalds,
	Mike Galbraith, Paul Turner, Chris Mason, kernel-team@fb.com
Subject: Re: [PATCH 1/2] sched/fair: Use task_groups instead of leaf_cfs_rq_list to walk all cfs_rqs
Message-ID: <20170504133122.a6qjlj3hlblbjxux@hirez.programming.kicks-ass.net>
References: <20170426004039.GA3222@wtj.duckdns.org>
In-Reply-To: <20170426004039.GA3222@wtj.duckdns.org>

On Tue, Apr 25, 2017 at 05:40:39PM -0700, Tejun Heo wrote:
> task_groups list is protected by its own lock. While it allows RCU
> protected traversal and the order of operations guarantees that all
> online cfs_rqs will be visited, holding rq->lock won't protect against
> iterating an already unregistered cfs_rq.

So I think the below is sufficient. Yes, we can hit an (almost) dead
cfs_rq, but poking its bandwidth variables is harmless.
unthrottle_cfs_rq() also stops doing anything much when it finds the
cfs_rq empty, which must be the case if we're removing it.

I don't know Paul's opinion on RCU GPs happening while stop_machine(),
but just in case he feels that's fair game, I did add the
rcu_read_lock() thingies. The lockdep assert is mostly documentation,
to more easily see it is indeed held when we get there.

I left print_cfs_stats() using the leaf list; there is no point in
printing stuff that's empty.

This way we can avoid taking all rq locks on cgroup destruction.
---
 kernel/sched/fair.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cd6c3f957d44..c897ad6d714c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4644,22 +4644,32 @@ static void destroy_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
 
 static void __maybe_unused update_runtime_enabled(struct rq *rq)
 {
-	struct cfs_rq *cfs_rq;
+	struct task_group *tg;
 
-	for_each_leaf_cfs_rq(rq, cfs_rq) {
-		struct cfs_bandwidth *cfs_b = &cfs_rq->tg->cfs_bandwidth;
+	lockdep_assert_held(&rq->lock);
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(tg, &task_groups, list) {
+		struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;
+		struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
 
 		raw_spin_lock(&cfs_b->lock);
 		cfs_rq->runtime_enabled = cfs_b->quota != RUNTIME_INF;
 		raw_spin_unlock(&cfs_b->lock);
 	}
+	rcu_read_unlock();
 }
 
 static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
 {
-	struct cfs_rq *cfs_rq;
+	struct task_group *tg;
+
+	lockdep_assert_held(&rq->lock);
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(tg, &task_groups, list) {
+		struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
 
-	for_each_leaf_cfs_rq(rq, cfs_rq) {
 		if (!cfs_rq->runtime_enabled)
 			continue;
 
@@ -4677,6 +4687,7 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
 		if (cfs_rq_throttled(cfs_rq))
 			unthrottle_cfs_rq(cfs_rq);
 	}
+	rcu_read_unlock();
 }
 
 #else /* CONFIG_CFS_BANDWIDTH */
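
For readability, this is roughly how the two functions end up looking with
the above applied. It is a sketch reconstructed from the diff, not copied
from the tree; the unchanged lines between the two hunks of
unthrottle_offline_cfs_rqs() are elided since the diff does not show them.

/* Sketch only: reconstructed from the diff above. */
static void __maybe_unused update_runtime_enabled(struct rq *rq)
{
	struct task_group *tg;

	/* Caller holds this rq's lock; the assert is mostly documentation. */
	lockdep_assert_held(&rq->lock);

	/* Walk every task group under RCU instead of rq->leaf_cfs_rq_list... */
	rcu_read_lock();
	list_for_each_entry_rcu(tg, &task_groups, list) {
		struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;
		/* ...and pick the group's cfs_rq for this CPU directly. */
		struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];

		raw_spin_lock(&cfs_b->lock);
		cfs_rq->runtime_enabled = cfs_b->quota != RUNTIME_INF;
		raw_spin_unlock(&cfs_b->lock);
	}
	rcu_read_unlock();
}

static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
{
	struct task_group *tg;

	lockdep_assert_held(&rq->lock);

	rcu_read_lock();
	list_for_each_entry_rcu(tg, &task_groups, list) {
		struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];

		/* A dead or empty cfs_rq is skipped or handled harmlessly. */
		if (!cfs_rq->runtime_enabled)
			continue;

		/* ... unchanged lines between the diff's two hunks elided ... */

		if (cfs_rq_throttled(cfs_rq))
			unthrottle_cfs_rq(cfs_rq);
	}
	rcu_read_unlock();
}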