linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Prateek Sood <prsood@codeaurora.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: tj@kernel.org, lizefan@huawei.com, cgroups@vger.kernel.org,
	mingo@kernel.org, longman@redhat.com, boqun.feng@gmail.com,
	tglx@linutronix.de, linux-kernel@vger.kernel.org,
	sramana@codeaurora.org
Subject: Re: [PATCH] cgroup/cpuset: remove circular dependency deadlock
Date: Mon, 9 Oct 2017 18:57:46 +0530	[thread overview]
Message-ID: <4668d1ec-dc43-8a9c-4f94-a421683d3c17@codeaurora.org> (raw)
In-Reply-To: <20170907175107.GG17526@worktop.programming.kicks-ass.net>

On 09/07/2017 11:21 PM, Peter Zijlstra wrote:
> On Thu, Sep 07, 2017 at 07:26:23PM +0530, Prateek Sood wrote:
>> Remove circular dependency deadlock in a scenario where hotplug of CPU is
>> being done while there is updation in cgroup and cpuset triggered from
>> userspace.
> 
> You've forgotten to mention your solution to the deadlock, namely
> inverting cpuset_mutex and cpu_hotplug_lock.
> 
>> Signed-off-by: Prateek Sood <prsood@codeaurora.org>
>> ---
>>  kernel/cgroup/cpuset.c | 32 +++++++++++++++++++-------------
>>  1 file changed, 19 insertions(+), 13 deletions(-)
>>
>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>> index 2f4039b..60dc0ac 100644
>> --- a/kernel/cgroup/cpuset.c
>> +++ b/kernel/cgroup/cpuset.c
>> @@ -816,16 +816,15 @@ static int generate_sched_domains(cpumask_var_t **domains,
>>   * 'cpus' is removed, then call this routine to rebuild the
>>   * scheduler's dynamic sched domains.
>>   *
>> - * Call with cpuset_mutex held.  Takes get_online_cpus().
>>   */
>> -static void rebuild_sched_domains_locked(void)
>> +static void rebuild_sched_domains_cpuslocked(void)
>>  {
>>  	struct sched_domain_attr *attr;
>>  	cpumask_var_t *doms;
>>  	int ndoms;
>>  
>> +	lockdep_assert_cpus_held();
>>  	lockdep_assert_held(&cpuset_mutex);
>> -	get_online_cpus();
>>  
>>  	/*
>>  	 * We have raced with CPU hotplug. Don't do anything to avoid
>> @@ -833,27 +832,27 @@ static void rebuild_sched_domains_locked(void)
>>  	 * Anyways, hotplug work item will rebuild sched domains.
>>  	 */
>>  	if (!cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
>> -		goto out;
>> +		return;
>>  
>>  	/* Generate domain masks and attrs */
>>  	ndoms = generate_sched_domains(&doms, &attr);
>>  
>>  	/* Have scheduler rebuild the domains */
>>  	partition_sched_domains(ndoms, doms, attr);
>> -out:
>> -	put_online_cpus();
>>  }
>>  #else /* !CONFIG_SMP */
>> -static void rebuild_sched_domains_locked(void)
>> +static void rebuild_sched_domains_cpuslocked(void)
>>  {
>>  }
>>  #endif /* CONFIG_SMP */
>>  
>>  void rebuild_sched_domains(void)
>>  {
>> +	get_online_cpus();
>>  	mutex_lock(&cpuset_mutex);
>> -	rebuild_sched_domains_locked();
>> +	rebuild_sched_domains_cpuslocked();
>>  	mutex_unlock(&cpuset_mutex);
>> +	put_online_cpus();
>>  }
> 
> But if you invert these locks, the need for cpuset_hotplug_workfn() goes
> away, at least for the CPU part, and we can make in synchronous again.
> Yay!!
> 
> Also, I think new code should use cpus_read_lock() instead of
> get_online_cpus().
> 

Thanks for the review comments Peter.
For patch related to circular deadlock, I will send an updated version.

The callback making a call to cpuset_hotplug_workfn()in hotplug path are
        [CPUHP_AP_ACTIVE] = {
                .name                   = "sched:active",
                .startup.single         = sched_cpu_activate,
                .teardown.single        = sched_cpu_deactivate,
        },

if we make cpuset_hotplug_workfn() synchronous, deadlock might happen:
_cpu_down()
   cpus_write_lock()  //held
      cpuhp_kick_ap_work()
        cpuhp_kick_ap()
           __cpuhp_kick_ap()
              wake_up_process() //cpuhp_thread_fun
                wait_for_ap_thread() //wait for complete from cpuhp_thread_fun()

cpuhp_thread_fun()
   cpuhp_invoke_callback()
     sched_cpu_deactivate()
       cpuset_cpu_inactive()
          cpuset_update_active_cpus()
             cpuset_hotplug_work()
                rebuild_sched_domains()
                   cpus_read_lock() //waiting as acquired in _cpu_down()
                  

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project

  reply	other threads:[~2017-10-09 13:28 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-07 13:56 [PATCH] cgroup/cpuset: remove circular dependency deadlock Prateek Sood
2017-09-07 17:45 ` Peter Zijlstra
2017-09-08  2:13   ` Prateek Sood
2017-09-07 17:51 ` Peter Zijlstra
2017-10-09 13:27   ` Prateek Sood [this message]
2017-10-11  9:48     ` Peter Zijlstra
2017-10-25  8:39       ` Prateek Sood
2017-10-25  9:30         ` Peter Zijlstra
2017-10-26 11:52           ` Prateek Sood
2017-10-26 14:05             ` Waiman Long
2017-10-27  8:03               ` Prateek Sood
2017-10-30  7:16                 ` [PATCH v2] " Prateek Sood
2017-11-06  4:01                   ` Prateek Sood
2017-11-15 10:26                   ` Prateek Sood
2017-11-15 10:37                     ` Peter Zijlstra
2017-11-15 14:20                       ` [PATCH v3 0/2] Invert cpu_hotplug_lock and cpuset_mutex locking order Prateek Sood
2017-11-15 14:20                         ` [PATCH v3 1/2] cgroup/cpuset: remove circular dependency deadlock Prateek Sood
2017-11-15 14:20                         ` [PATCH v3 2/2] cpuset: Make cpuset hotplug synchronous Prateek Sood
2017-11-27 16:48                         ` [PATCH v3 0/2] Invert cpu_hotplug_lock and cpuset_mutex locking order Tejun Heo
2017-11-15 17:05                       ` [PATCH v2] cgroup/cpuset: remove circular dependency deadlock Tejun Heo
2017-11-15 17:18                         ` Prateek Sood
  -- strict thread matches above, loose matches on Subject: below --
2017-09-07  6:04 [PATCH] " Prateek Sood
2017-09-07  7:28 ` Peter Zijlstra
2017-09-07  8:56   ` Boqun Feng
2017-09-07  9:07     ` Prateek Sood
2017-09-07  9:05   ` Prateek Sood
2017-09-06 11:48 Prateek Sood
2017-09-06 12:56 ` Waiman Long
2017-09-06 14:23   ` Prateek Sood

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4668d1ec-dc43-8a9c-4f94-a421683d3c17@codeaurora.org \
    --to=prsood@codeaurora.org \
    --cc=boqun.feng@gmail.com \
    --cc=cgroups@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=longman@redhat.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=sramana@codeaurora.org \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).