From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1161030AbbHGPic (ORCPT ); Fri, 7 Aug 2015 11:38:32 -0400
Received: from mail-yk0-f180.google.com ([209.85.160.180]:34001 "EHLO mail-yk0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932077AbbHGPia (ORCPT ); Fri, 7 Aug 2015 11:38:30 -0400
Date: Fri, 7 Aug 2015 11:38:28 -0400
From: Tejun Heo
To: Peter Zijlstra
Cc: mingo@kernel.org, riel@redhat.com, dedekind1@gmail.com, linux-kernel@vger.kernel.org, mgorman@suse.de, rostedt@goodmis.org, juri.lelli@arm.com, Oleg Nesterov
Subject: Re: [RFC][PATCH 1/4] sched: Fix a race between __kthread_bind() and sched_setaffinity()
Message-ID: <20150807153828.GE14626@mtj.duckdns.org>
References: <20150515154333.712161952@infradead.org> <20150515154833.545640346@infradead.org> <20150515155653.GA23692@htj.duckdns.org> <20150807142708.GK16853@twins.programming.kicks-ass.net> <20150807151608.GD14626@mtj.duckdns.org> <20150807152956.GN16853@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150807152956.GN16853@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On Fri, Aug 07, 2015 at 05:29:56PM +0200, Peter Zijlstra wrote:
> Even if we were to strictly order those stores you could have (note
> there is no matching barrier, as there is only the one load, so ordering
> cannot help):
>
>	__kthread_bind()
>
>					sched_setaffinity()
>					  if (p->flags & PF_NO_SETAFFINITY) /* false-not-taken */
>	  p->flags |= PF_NO_SETAFFINITY;
>	  smp_wmb();
>	  do_set_cpus_allowed();
>					  set_cpus_allowed_ptr()
>
> > I think the code was better before. Can't we just revert workqueue.c
> > part?
>
> I agree that the new argument isn't pretty, but I cannot see how
> workqueues would not be affected by this.

So, the problem there is that __kthread_bind() doesn't grab the same
lock that the syscall side grabs. Workqueue, however, used
set_cpus_allowed_ptr(), which goes through the rq locking, so as long
as the check on the syscall side is moved inside the rq lock, it
should be fine.

Thanks.

--
tejun
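
To make the locking argument concrete, here is a minimal userspace sketch in C. It is not kernel code: every name in it (toy_task, toy_task_init, toy_bind, toy_setaffinity, TOY_NO_SETAFFINITY) is hypothetical, and a pthread mutex stands in for the rq lock. It assumes the binding side updates the flag and the mask under that lock, as a caller going through set_cpus_allowed_ptr() would, and shows the syscall-side check done inside the same lock rather than before it.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for PF_NO_SETAFFINITY; the value is arbitrary. */
#define TOY_NO_SETAFFINITY 0x1u

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct toy_task {
	pthread_mutex_t lock;		/* plays the role of the rq lock */
	unsigned int flags;
	unsigned long cpus_allowed;	/* one bit per CPU */
};

void toy_task_init(struct toy_task *t, unsigned long mask)
{
	pthread_mutex_init(&t->lock, NULL);
	t->flags = 0;
	t->cpus_allowed = mask;
}

/*
 * Binding side (assumption: it goes through the lock, like a
 * set_cpus_allowed_ptr() caller). The flag and the mask change together
 * while the lock is held.
 */
void toy_bind(struct toy_task *t, unsigned int cpu)
{
	pthread_mutex_lock(&t->lock);
	t->flags |= TOY_NO_SETAFFINITY;
	t->cpus_allowed = 1UL << cpu;
	pthread_mutex_unlock(&t->lock);
}

/*
 * Syscall side. The flag is tested inside the lock, so the test and the
 * mask update cannot be split by a concurrent toy_bind(): either the bind
 * has fully happened and we refuse, or it has not started and it will
 * overwrite our mask afterwards.
 */
int toy_setaffinity(struct toy_task *t, unsigned long mask)
{
	int ret = 0;

	pthread_mutex_lock(&t->lock);
	if (t->flags & TOY_NO_SETAFFINITY)
		ret = -EINVAL;
	else
		t->cpus_allowed = mask;
	pthread_mutex_unlock(&t->lock);

	return ret;
}
```

With the check outside the lock, the interleaving in the quoted diagram becomes possible again: the syscall side can see the flag clear, lose the CPU to the binder, and then overwrite the bound mask. The sketch only covers binders that take the lock; a binder that skips it, as __kthread_bind() did, is not protected by moving the check.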