From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751285AbdH1HS5 (ORCPT );
	Mon, 28 Aug 2017 03:18:57 -0400
Received: from mail-pg0-f68.google.com ([74.125.83.68]:37982 "EHLO
	mail-pg0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751143AbdH1HS4 (ORCPT );
	Mon, 28 Aug 2017 03:18:56 -0400
Date: Mon, 28 Aug 2017 17:18:27 +1000
From: Nicholas Piggin
To: Linus Torvalds
Cc: Tim Chen , Mel Gorman , Peter Zijlstra , Ingo Molnar ,
	Andi Kleen , Kan Liang , Andrew Morton , Johannes Weiner ,
	Jan Kara , Christopher Lameter , "Eric W . Biederman" ,
	Davidlohr Bueso , linux-mm , Linux Kernel Mailing List
Subject: Re: [PATCH 2/2 v2] sched/wait: Introduce lock breaker in wake_up_page_bit
Message-ID: <20170828171827.1dc41715@roar.ozlabs.ibm.com>
In-Reply-To:
References: <83f675ad385d67760da4b99cd95ee912ca7c0b44.1503677178.git.tim.c.chen@linux.intel.com>
	<20170828111648.22f81bc5@roar.ozlabs.ibm.com>
	<20170828112959.05622961@roar.ozlabs.ibm.com>
Organization: IBM
X-Mailer: Claws Mail 3.15.0-dirty (GTK+ 2.24.31; x86_64-pc-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 27 Aug 2017 22:17:55 -0700
Linus Torvalds wrote:

> On Sun, Aug 27, 2017 at 6:29 PM, Nicholas Piggin wrote:
> >
> > BTW. since you are looking at this stuff, one other small problem I remember
> > with exclusive waiters is that losing to a concurrent locker puts them to
> > the back of the queue. I think that could be fixed with some small change to
> > the wait loops (first add to tail, then retries add to head). Thoughts?
>
> No, not that way.
>
> First off, it's oddly complicated, but more importantly, the real
> unfairness you lose to is not other things on the wait queue, but to
> other lockers that aren't on the wait-queue at all, but instead just
> come in and do a "test-and-set" without ever even going through the
> slow path.

Right, there is that unfairness *as well*. The requeue-to-tail logic
seems to make that worse, and the change I suggested looked like a
simple way to improve it.

>
> So instead of playing queuing games, you'd need to just change the
> unlock sequence. Right now we basically do:
>
>  - clear lock bit and atomically test if contended (and we play games
>    with bit numbering to do that atomic test efficiently)
>
>  - if contended, wake things up
>
> and you'd change the logic to be
>
>  - if contended, don't clear the lock bit at all, just transfer the
>    lock ownership directly to the waiters by walking the wait list
>
>  - clear the lock bit only once there are no more wait entries (either
>    because there were no waiters at all, or because all the entries were
>    just waiting for the lock to be released)
>
> which is certainly doable with a couple of small extensions to the
> page wait key data structure.

Yeah, that would be ideal. It's conceptually trivial; I guess care has
to be taken to transfer the memory ordering along with the lock. It
could be a good concept to apply elsewhere too.

>
> But most of my clever schemes the last few days were abject failures,
> and honestly, it's late in the rc.
>
> In fact, this late in the game I probably wouldn't even have committed
> the small cleanups I did if it wasn't for the fact that thinking of
> the whole WQ_FLAG_EXCLUSIVE bit made me find the bug.
>
> So the cleanups were actually what got me to look at the problem in
> the first place, and then I went "I'm going to commit the cleanup, and
> then I can think about the bug I just found".
>
> I'm just happy that the fix seems to be trivial.
> I was afraid I'd have
> to do something nastier (like have the EINTR case send another
> explicit wakeup to make up for the lost one, or some ugly hack like
> that).
>
> It was only when I started looking at the history of that code, and I
> saw the old bit_lock code, and I went "Hmm. That has the _same_ bug -
> oh wait, no it doesn't!" that I realized that there was that simple
> fix.
>
> You weren't cc'd on the earlier part of the discussion, you only got
> added when I realized what the history and simple fix was.

You're right, no such improvement would be appropriate for 4.14.

Thanks,
Nick