linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Prateek Sood <prsood@codeaurora.org>
To: Peter Zijlstra <peterz@infradead.org>, Tejun Heo <tj@kernel.org>
Cc: avagin@gmail.com, mingo@kernel.org, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org, sramana@codeaurora.org
Subject: Re: [PATCH] cgroup/cpuset: fix circular locking dependency
Date: Fri, 8 Dec 2017 17:15:55 +0530	[thread overview]
Message-ID: <40968aea-cd73-5ce4-d559-962d91e315c5@codeaurora.org> (raw)
In-Reply-To: <4e63b5e9-1696-910f-16ac-4d4d7eb98725@codeaurora.org>

On 12/08/2017 03:10 PM, Prateek Sood wrote:
> On 12/05/2017 04:31 AM, Peter Zijlstra wrote:
>> On Mon, Dec 04, 2017 at 02:58:25PM -0800, Tejun Heo wrote:
>>> Hello, again.
>>>
>>> On Mon, Dec 04, 2017 at 12:22:19PM -0800, Tejun Heo wrote:
>>>> Hello,
>>>>
>>>> On Mon, Dec 04, 2017 at 10:44:49AM +0530, Prateek Sood wrote:
>>>>> Any feedback/suggestion for this patch?
>>>>
>>>> Sorry about the delay.  I'm a bit worried because it feels like we're
>>>> chasing a squirrel.  I'll think through the recent changes and this
>>>> one and get back to you.
>>>
>>> Can you please take a look at the following pending commit?
>>>
>>>   https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git/commit/?h=for-4.15-fixes&id=e8b3f8db7aad99fcc5234fc5b89984ff6620de3d
>>>
>>> AFAICS, this should remove the circular dependency you originally
>>> reported.  I'll revert the two cpuset commits for now.
>>
>> So I liked his patches in that we would be able to go back to
>> synchronous sched_domain building.
>>
> https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git/commit/?h=for-4.15-fixes&id=e8b3f8db7aad99fcc5234fc5b89984ff6620de3d
> 
> This will fix the original circular locking dependency issue.
> I will let you both (Peter & TJ) to decide on which one to
> pick.
> 
> 
> 

TJ & Peter,

There is one deadlock issue during cgroup migration from cpu
hotplug path when a task T is being moved from source to
destination cgroup.

kworker/0:0
cpuset_hotplug_workfn()
   cpuset_hotplug_update_tasks()
      hotplug_update_tasks_legacy()
        remove_tasks_in_empty_cpuset()
          cgroup_transfer_tasks() // stuck in iterator loop
            cgroup_migrate()
              cgroup_migrate_add_task()

In cgroup_migrate_add_task() it checks for PF_EXITING flag of task T.
Task T will not migrate to destination cgroup. css_task_iter_start()
will keep pointing to task T in loop waiting for task T cg_list node
to be removed.

Task T
do_exit()
  exit_signals() // sets PF_EXITING
  exit_task_namespaces()
    switch_task_namespaces()
      free_nsproxy()
        put_mnt_ns()
          drop_collected_mounts()
            namespace_unlock()
              synchronize_rcu()
                _synchronize_rcu_expedited()
                  schedule_work() // on cpu0 low priority worker pool
                  wait_event() // waiting for work item to execute

Task T inserted a work item in the worklist of cpu0 low priority
worker pool. It is waiting for expedited grace period work item
to execute. This work item will only be executed once kworker/0:0
complete execution of cpuset_hotplug_workfn().

kworker/0:0 ==> Task T ==>kworker/0:0
 
Following suggested patch might not be able to fix the above
mentioned case:
https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git/commit/?h=for-4.15-fixes&id=e8b3f8db7aad99fcc5234fc5b89984ff6620de3d

Combination of following patches fixes above mentioned scenario
as well:
1) Inverting cpuset_mutex and cpu_hotplug_lock locking sequence
2) Making cpuset hotplug work synchronous 



-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project

  reply	other threads:[~2017-12-08 11:46 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CANaxB-zT+sz=z1FFk5npnwMySdfKCBZDkM+P+=JXDkCXbh=rCw@mail.gmail.com>
2017-11-28 11:35 ` [PATCH] cgroup/cpuset: fix circular locking dependency Prateek Sood
2017-12-04  5:14   ` Prateek Sood
2017-12-04 20:22     ` Tejun Heo
2017-12-04 22:58       ` Tejun Heo
2017-12-04 23:01         ` Peter Zijlstra
2017-12-08  9:40           ` Prateek Sood
2017-12-08 11:45             ` Prateek Sood [this message]
2017-12-11 15:32               ` Tejun Heo
2017-12-13 14:28                 ` Prateek Sood
2017-12-13 15:40                   ` Tejun Heo
2017-12-15  8:54                     ` Prateek Sood
2017-12-15 13:22                       ` Tejun Heo
2017-12-15 19:06                         ` Prateek Sood
2017-12-19  7:26                           ` [PATCH] cgroup: Fix deadlock in cpu hotplug path Prateek Sood
2017-12-19 13:39                             ` Tejun Heo
2017-12-11 15:20           ` [PATCH] cgroup/cpuset: fix circular locking dependency Tejun Heo
2017-12-13  7:50             ` Prateek Sood
2017-12-13 16:06               ` Tejun Heo
2017-12-15 19:04                 ` Prateek Sood
2017-12-28 20:37                 ` Prateek Sood
2018-01-02 16:16                   ` Tejun Heo
2018-01-02 17:44                     ` Paul E. McKenney
2018-01-02 18:01                       ` Paul E. McKenney
2018-01-08 12:28                         ` Tejun Heo
2018-01-08 13:47                           ` [PATCH wq/for-4.16 1/2] workqueue: separate out init_rescuer() Tejun Heo
2018-01-08 13:47                             ` [PATCH wq/for-4.16 2/2] workqueue: allow WQ_MEM_RECLAIM on early init workqueues Tejun Heo
2018-01-08 22:52                           ` [PATCH] cgroup/cpuset: fix circular locking dependency Paul E. McKenney
2018-01-09  0:31                             ` Paul E. McKenney
2018-01-09  3:42                               ` Tejun Heo
2018-01-09  4:20                                 ` Paul E. McKenney
2018-01-09 13:44                                   ` Tejun Heo
2018-01-09 15:21                                     ` Paul E. McKenney
2018-01-09 15:37                                       ` Tejun Heo
2018-01-09 16:00                                         ` Paul E. McKenney
2018-01-10 20:08                                           ` Paul E. McKenney
2018-01-10 21:41                                             ` Tejun Heo
2018-01-10 22:10                                               ` Paul E. McKenney
2018-01-15 12:02                     ` Prateek Sood
2018-01-16 16:27                       ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=40968aea-cd73-5ce4-d559-962d91e315c5@codeaurora.org \
    --to=prsood@codeaurora.org \
    --cc=avagin@gmail.com \
    --cc=cgroups@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=sramana@codeaurora.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).