From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756046AbbBPQy2 (ORCPT ); Mon, 16 Feb 2015 11:54:28 -0500 Received: from mx1.redhat.com ([209.132.183.28]:48981 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755478AbbBPQyZ (ORCPT ); Mon, 16 Feb 2015 11:54:25 -0500 Date: Mon, 16 Feb 2015 17:51:56 +0100 From: Oleg Nesterov To: Peter Zijlstra Cc: "Paul E. McKenney" , linux-kernel@vger.kernel.org, dave@stgolabs.net, waiman.long@hp.com, raghavendra.kt@linux.vnet.ibm.com, Nicholas Mc Guire , Linus Torvalds , mingo@kernel.org Subject: Re: [PATCH] sched/completion: completion_done() should serialize with complete() Message-ID: <20150216165156.GA20089@redhat.com> References: <20150212003430.GA28656@linux.vnet.ibm.com> <20150212195913.GA30430@redhat.com> <20150216082113.GY5029@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150216082113.GY5029@twins.programming.kicks-ass.net> User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 02/16, Peter Zijlstra wrote: > > On Thu, Feb 12, 2015 at 08:59:13PM +0100, Oleg Nesterov wrote: > > Commit de30ec47302c "Remove unnecessary ->wait.lock serialization when > > reading completion state" was not correct, without lock/unlock the code > > like stop_machine_from_inactive_cpu() > > > > while (!completion_done()) > > cpu_relax(); > > > > can return before complete() finishes its spin_unlock() which writes to > > this memory. And spin_unlock_wait(). > > > > While at it, change try_wait_for_completion() to use READ_ONCE(). > > So I share Davidlohrs concern Ah. I forgot to reply to Davidlohr's email. Sorry. > if we should not simply revert that > change; but given we've now gone over it detail I suppose we should just > keep the optimized version. Yes, I was going to say that of course I won't argue if we simply revert that commit. As he rigthly pointed the lockless check doesn't make sense performance-wise. However, this code needs a comment to explain why we can't simply check ->done and return, unlock_wait() is more documentation than optimization. But, > I did add a comment to your patch; and queued the below for > sched/urgent. Thanks! Now this logic is actually documented ;) unlock_wait() alone could confuse the reader too. Oleg.