Date: Tue, 2 Jan 2018 08:16:56 -0800
From: Tejun Heo
To: Prateek Sood
Cc: Peter Zijlstra, avagin@gmail.com, mingo@kernel.org,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	sramana@codeaurora.org, "Paul E. McKenney"
Subject: Re: [PATCH] cgroup/cpuset: fix circular locking dependency
Message-ID: <20180102161656.GD3668920@devbig577.frc2.facebook.com>
References: <1511868946-23959-1-git-send-email-prsood@codeaurora.org>
 <623f214b-8b9a-f967-7a3d-ca9c06151267@codeaurora.org>
 <20171204202219.GF2421075@devbig577.frc2.facebook.com>
 <20171204225825.GP2421075@devbig577.frc2.facebook.com>
 <20171204230117.GF20227@worktop.programming.kicks-ass.net>
 <20171211152059.GH2421075@devbig577.frc2.facebook.com>
 <20171213160617.GQ3919388@devbig577.frc2.facebook.com>
 <9843d982-d201-8702-2e4e-0541a4d96b53@codeaurora.org>
In-Reply-To: <9843d982-d201-8702-2e4e-0541a4d96b53@codeaurora.org>

Hello,

On Fri, Dec 29, 2017 at 02:07:16AM +0530, Prateek Sood wrote:
> task T is waiting for cpuset_mutex acquired
> by kworker/2:1
>
> sh ==> cpuhp/2 ==> kworker/2:1 ==> sh
>
> kworker/2:3 ==> kthreadd ==> Task T ==> kworker/2:1
>
> It seems that my earlier patch set should fix this scenario:
> 1) Invert the locking order of cpuset_mutex and cpu_hotplug_lock.
> 2) Make the cpuset hotplug work synchronous.
>
> Could you please share your feedback?

Hmm... this can also be resolved by adding WQ_MEM_RECLAIM to the
workqueue used by synchronize_rcu, right?  Given the widespread use of
synchronize_rcu and friends, maybe that's the right solution, or at
least something we also need to do, for this particular deadlock.
(A minimal sketch of this idea is appended below.)

Again, I don't have anything against making the domain-rebuilding part
of cpuset operations synchronous, and these tricky deadlock scenarios
do indicate that doing so would probably be beneficial.  That said,
though, these scenarios look more like manifestations of other
problems, exposed through the kthreadd dependency, than anything else.

Thanks.

-- 
tejun
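
Below is a minimal, hypothetical sketch of the WQ_MEM_RECLAIM idea,
assuming RCU grace-period work is moved onto its own workqueue; the
name rcu_gp_wq, the "rcu_gp" label, and the early_initcall hook are
illustrative assumptions, not the patch discussed in this thread:

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/workqueue.h>

/* Dedicated workqueue for RCU grace-period work (hypothetical name). */
static struct workqueue_struct *rcu_gp_wq;

static int __init rcu_gp_wq_init(void)
{
	/*
	 * WQ_MEM_RECLAIM gives the workqueue a rescuer thread, so work
	 * queued here can always make forward progress even when
	 * kthreadd cannot fork a new worker -- breaking the
	 * "... ==> kthreadd ==> ..." link in the chain quoted above.
	 */
	rcu_gp_wq = alloc_workqueue("rcu_gp", WQ_MEM_RECLAIM, 0);
	if (!rcu_gp_wq)
		return -ENOMEM;
	return 0;
}
early_initcall(rcu_gp_wq_init);

Work queued on such a workqueue, rather than on system_wq, would be
guaranteed at least one execution context under memory pressure
instead of depending on kthreadd to create a new worker.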