From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753057AbdHBRMf (ORCPT ); Wed, 2 Aug 2017 13:12:35 -0400 Received: from bombadil.infradead.org ([65.50.211.133]:40427 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752827AbdHBRMd (ORCPT ); Wed, 2 Aug 2017 13:12:33 -0400 Date: Wed, 2 Aug 2017 19:12:23 +0200 From: Peter Zijlstra To: "Paul E. McKenney" Cc: Boqun Feng , linux-kernel@vger.kernel.org, Steven Rostedt , Krister Johansen , Paul Gortmaker , Thomas Gleixner , Ingo Molnar Subject: Re: [PATCH tip/sched/core] swait: Remove the lockless swait_active() check in swake_up*() Message-ID: <20170802171223.7sfh4f5rbuplu4th@hirez.programming.kicks-ass.net> References: <20170730134737.8374-1-boqun.feng@gmail.com> <20170802160713.GA1352@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170802160713.GA1352@linux.vnet.ibm.com> User-Agent: NeoMutt/20170609 (1.8.3) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Aug 02, 2017 at 09:07:13AM -0700, Paul E. McKenney wrote: > On Sun, Jul 30, 2017 at 09:47:35PM +0800, Boqun Feng wrote: > > Steven Rostedt reported a potential race in RCU core because of > > swake_up(): > > > > CPU0 CPU1 > > ---- ---- > > __call_rcu_core() { > > > > spin_lock(rnp_root) > > need_wake = __rcu_start_gp() { > > rcu_start_gp_advanced() { > > gp_flags = FLAG_INIT > > } > > } > > > > rcu_gp_kthread() { > > swait_event_interruptible(wq, > > gp_flags & FLAG_INIT) { > > spin_lock(q->lock) > > > > *fetch wq->task_list here! * > > > > list_add(wq->task_list, q->task_list) > > spin_unlock(q->lock); > > > > *fetch old value of gp_flags here * > > > > spin_unlock(rnp_root) > > > > rcu_gp_kthread_wake() { > > swake_up(wq) { > > swait_active(wq) { > > list_empty(wq->task_list) > > > > } * return false * > > > > if (condition) * false * > > schedule(); > > > > In this case, a wakeup is missed, which could cause the rcu_gp_kthread > > waits for a long time. > > > > The reason of this is that we do a lockless swait_active() check in > > swake_up(). To fix this, we can either 1) add a smp_mb() in swake_up() > > before swait_active() to provide the proper order or 2) simply remove > > the swait_active() in swake_up(). > > > > The solution 2 not only fixes this problem but also keeps the swait and > > wait API as close as possible, as wake_up() doesn't provide a full > > barrier and doesn't do a lockless check of the wait queue either. > > Moreover, there are users already using swait_active() to do their quick > > checks for the wait queues, so it make less sense that swake_up() and > > swake_up_all() do this on their own. > > > > This patch then removes the lockless swait_active() check in swake_up() > > and swake_up_all(). > > > > Reported-by: Steven Rostedt > > Signed-off-by: Boqun Feng > > Acked-by: Paul E. McKenney > > Tested-by: Paul E. McKenney > > Hearing no objections but not hearing anything else, either, I have > queued this for v4.14. If someone else would rather queue it, please > let me know. I have it too. Lets see who can get it into -tip first :-)