From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751002AbdAQCJI (ORCPT ); Mon, 16 Jan 2017 21:09:08 -0500
Received: from mail.efficios.com ([167.114.142.141]:53736 "EHLO mail.efficios.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1750734AbdAQCJH (ORCPT ); Mon, 16 Jan 2017 21:09:07 -0500
Date: Tue, 17 Jan 2017 02:09:38 +0000 (UTC)
From: Mathieu Desnoyers
To: Linus Torvalds
Cc: "Paul E. McKenney" , linux-kernel , Josh Triplett , KOSAKI Motohiro ,
        rostedt , Nicholas Miell , Ingo Molnar , One Thousand Gnomes ,
        Lai Jiangshan , Stephen Hemminger , Thomas Gleixner , Peter Zijlstra ,
        David Howells , bobby prani , Michael Kerrisk , Shuah Khan ,
        Andrew Morton
Message-ID: <792537721.5599.1484618978163.JavaMail.zimbra@efficios.com>
In-Reply-To:
References: <1484596275-30412-1-git-send-email-mathieu.desnoyers@efficios.com>
        <1587103499.5523.1484607367124.JavaMail.zimbra@efficios.com>
Subject: Re: [RFC PATCH] membarrier: handle nohz_full with expedited thread registration
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [167.114.142.141]
X-Mailer: Zimbra 8.7.1_GA_1670 (ZimbraWebClient - FF45 (Linux)/8.7.1_GA_1670)
Thread-Topic: membarrier: handle nohz_full with expedited thread registration
Thread-Index: er9ivjC9y8JLGoQHaJp8e3kT4yMHHA==
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

----- On Jan 16, 2017, at 6:50 PM, Linus Torvalds torvalds@linux-foundation.org wrote:

> Why not just make the write be a "smp_store_release()", and the read
> be a "smp_load_acquire()". That guarantees a certain amount of
> ordering. The only amount that I suspect makes sense, in fact.
>
> But it's not clear what the problem is, so..

If we only use a smp_store_release() for the store to membarrier_exped,
the "unregister" (setting back to 0) would be OK, but not the "register",
as the following scenario shows:

Initial values: A = B = 0

       CPU 0                            |    CPU 1 (no-hz full)
                                        |
                                        |    membarrier(REGISTER_EXPEDITED)
                                        |       (write barrier implied by store-release)
                                        |       set t->membarrier_exped = 1
                                        |       (store-release implies memory barrier before store)
                                        |    store B = 1
                                        |    barrier() (compiler-level barrier)
                                        |    store A = 1
x = load A                              |
membarrier(CMD_SHARED)                  |
  smp_mb() [1]                          |
  iter. on nohz cpus                    |
    if iter_t->membarrier_exped == 0    |
      (skip)                            |
  smp_mb() [2]                          |
y = load B                              |

Expect: if x == 1, then y == 1

CPU 0 can observe A == 1, membarrier_exped == 0, and B == 0, because
there is no memory barrier between the store to membarrier_exped and
the store to A on CPU 1.

What we seem to need on the registration/unregistration side is
store-acquire for registration, and store-release for unregistration.
This pairs with a load of membarrier_exped that has both acquire and
release barriers ([1] and [2] above).

> I'm not seeing how a regular fork() could possibly ever make sense to
> have the membarrier state in the newly forked process. Not that
> "fork()" is really well-defined for within a single thread anyway (it
> actually is as far as Linux is concerned, but not in POSIX, afaik).
>
> So if there is no major reason for it, I would strongly suggest that
> _if_ all this makes sense in the first place, the membarrier thing
> should just be cleared unconditionally both for exec and for
> clone/fork.

That's fine with me!

Thanks,

Mathieu

>
>               Linus

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
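
The barrier placement described in the message (a barrier after the store on
registration, a release store on unregistration, and a flag load bracketed by
the two full barriers [1] and [2]) can be sketched in userspace C11 atomics.
This is only an illustrative sketch, not the kernel patch: the variable
membarrier_exped here is a plain global standing in for t->membarrier_exped,
and the function names register_expedited(), unregister_expedited() and
scan_should_skip() are hypothetical. C11 has no "store-acquire", so the sketch
approximates it with a relaxed store followed by a sequentially consistent
fence. It does not model what happens to threads that are not skipped.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for t->membarrier_exped (assumption, not kernel code). */
static atomic_int membarrier_exped;

/*
 * Registration: the flag store must become visible before any later
 * store by this thread. A release store alone only orders *earlier*
 * accesses, so a full fence is placed after the store.
 */
static void register_expedited(void)
{
    atomic_store_explicit(&membarrier_exped, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Unregistration: earlier accesses by this thread must be visible
 * before the flag is cleared, which is what a release store provides.
 */
static void unregister_expedited(void)
{
    atomic_store_explicit(&membarrier_exped, 0, memory_order_release);
}

/*
 * Scanning side: the flag load sits between two full fences, matching
 * smp_mb() [1] and smp_mb() [2] in the diagram. Returns true when the
 * flag is 0, i.e. the case marked "(skip)" in the diagram.
 */
static bool scan_should_skip(void)
{
    atomic_thread_fence(memory_order_seq_cst);   /* [1] */
    int exped = atomic_load_explicit(&membarrier_exped, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* [2] */
    return exped == 0;
}
```

With the fence in register_expedited(), a scanner that observes a store made
after registration is also expected to observe membarrier_exped == 1, so it
would not take the "(skip)" branch for that thread.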