Subject: Re: [PATCH v2 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock
From: Alex Kogan
Date: Wed, 12 Jun 2019 00:38:29 -0400
To: "liwei (GF)"
Cc: linux@armlinux.org.uk, Peter Zijlstra, mingo@redhat.com, will.deacon@arm.com, arnd@arndb.de, Waiman Long, linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Thomas Gleixner, bp@alien8.de, hpa@zytor.com, x86@kernel.org, dave.dice@oracle.com, Rahul Yadav, Steven Sistare, Daniel Jordan
Message-Id: <54241445-458C-4AE2-840B-6DFCCD410399@oracle.com>
References: <20190329152006.110370-1-alex.kogan@oracle.com> <20190329152006.110370-4-alex.kogan@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi, Wei.

> On Jun 11, 2019, at 12:22 AM, liwei (GF) wrote:
>
> Hi Alex,
>
> On 2019/3/29 23:20, Alex Kogan wrote:
>> In CNA, spinning threads are organized in two queues, a main queue for
>> threads running on the same node as the current lock holder, and a
>> secondary queue for threads running on other nodes. At the unlock time,
>> the lock holder scans the main queue looking for a thread running on
>> the same node. If found (call it thread T), all threads in the main queue
>> between the current lock holder and T are moved to the end of the
>> secondary queue, and the lock is passed to T. If such T is not found, the
>> lock is passed to the first node in the secondary queue. Finally, if the
>> secondary queue is empty, the lock is passed to the next thread in the
>> main queue. For more details, see https://arxiv.org/abs/1810.05600.
>>
>> Note that this variant of CNA may introduce starvation by continuously
>> passing the lock to threads running on the same node.
>> This issue will be addressed later in the series.
>>
>> Enabling CNA is controlled via a new configuration option
>> (NUMA_AWARE_SPINLOCKS), which is enabled by default if NUMA is enabled.
>>
>> Signed-off-by: Alex Kogan
>> Reviewed-by: Steve Sistare
>> ---
>>  arch/x86/Kconfig                      |  14 +++
>>  include/asm-generic/qspinlock_types.h |  13 +++
>>  kernel/locking/mcs_spinlock.h         |  10 ++
>>  kernel/locking/qspinlock.c            |  29 +++++-
>>  kernel/locking/qspinlock_cna.h        | 173 ++++++++++++++++++++++++++++++++++
>>  5 files changed, 236 insertions(+), 3 deletions(-)
>>  create mode 100644 kernel/locking/qspinlock_cna.h
>>
> (SNIP)
>> +
>> +static __always_inline int get_node_index(struct mcs_spinlock *node)
>> +{
>> +	return decode_count(node->node_and_count++);
> When nesting level is > 4, it won't return a index >= 4 here and the numa node number
> is changed by mistake. It will go into a wrong way instead of the following branch.
>
>
> 	/*
> 	 * 4 nodes are allocated based on the assumption that there will
> 	 * not be nested NMIs taking spinlocks. That may not be true in
> 	 * some architectures even though the chance of needing more than
> 	 * 4 nodes will still be extremely unlikely. When that happens,
> 	 * we fall back to spinning on the lock directly without using
> 	 * any MCS node. This is not the most elegant solution, but is
> 	 * simple enough.
> 	 */
> 	if (unlikely(idx >= MAX_NODES)) {
> 		while (!queued_spin_trylock(lock))
> 			cpu_relax();
> 		goto release;
> 	}

Good point.
This patch does not handle count overflows gracefully.
It can be easily fixed by allocating more bits for the count; we don’t really need 30 bits for #NUMA nodes.

However, I am working on a new revision of the patch, in which the cna node encapsulates the mcs node (following Peter’s suggestion and similarly to pv_node).
With that approach, this issue is gone.
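
To make the overflow concrete, here is a minimal user-space sketch. It assumes the count sits in the low bits of node_and_count with the NUMA node ID above it; that layout, the `bits` parameter, and the helpers encode(), decode_numa_node() and fifth_index() are illustrative assumptions, not code from the patch. Only the names decode_count, get_node_index and node_and_count come from the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 4	/* per-CPU MCS nodes, as in the quoted comment */

/* Mask covering the low `bits` bits that hold the nesting count. */
static inline uint32_t count_mask(unsigned bits)
{
	assert(bits > 0 && bits < 32);
	return (1u << bits) - 1;
}

/* Assumed layout: NUMA node ID in the high bits, count in the low bits. */
static inline uint32_t encode(uint32_t numa_node, uint32_t count, unsigned bits)
{
	return (numa_node << bits) | (count & count_mask(bits));
}

static inline uint32_t decode_count(uint32_t v, unsigned bits)
{
	return v & count_mask(bits);
}

static inline uint32_t decode_numa_node(uint32_t v, unsigned bits)
{
	return v >> bits;
}

/*
 * Same shape as the quoted helper: post-increment the whole packed word
 * and return the count it held before the increment.
 */
static inline uint32_t get_node_index(uint32_t *node_and_count, unsigned bits)
{
	return decode_count((*node_and_count)++, bits);
}

/* Index seen by the fifth nested acquisition (hypothetical helper). */
static inline uint32_t fifth_index(unsigned bits, uint32_t *node_and_count)
{
	uint32_t idx = 0;

	for (int i = 0; i < MAX_NODES + 1; i++)
		idx = get_node_index(node_and_count, bits);
	return idx;
}
```

Under these assumptions, starting from encode(5, 0, 2), fifth_index(2, &word) returns 0 rather than 4, and the carry out of the 2-bit count makes node ID 5 read back as 6. With a 3-bit count, fifth_index(3, &word) returns 4, so a check such as idx >= MAX_NODES can take the fallback branch, and the node ID stays 5.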

Best regards,
Alex
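
The hand-off rule quoted from the patch description can be sketched in single-threaded C. This is only an illustration of the selection order, without any of the atomics or the MCS tail encoding that the kernel code needs. The types struct waiter and struct cna_queues and the function cna_pick_successor() are hypothetical names. Splicing the rest of the secondary queue in front of the main queue when the lock goes to a remote waiter is an assumption here; the description above does not spell that step out.

```c
#include <assert.h>
#include <stddef.h>

struct waiter {
	int numa_node;
	struct waiter *next;
};

struct cna_queues {
	struct waiter *main_head;	/* first waiter behind the lock holder */
	struct waiter *sec_head;	/* secondary queue: waiters on other nodes */
	struct waiter *sec_tail;
};

/* Returns the waiter that receives the lock, or NULL if nobody is waiting. */
static inline struct waiter *cna_pick_successor(struct cna_queues *q,
						int holder_node)
{
	struct waiter *prev = NULL, *cur, *succ;

	assert(q != NULL);

	/* Scan the main queue for a waiter T on the holder's node. */
	for (cur = q->main_head; cur; prev = cur, cur = cur->next) {
		if (cur->numa_node == holder_node)
			break;
	}

	if (cur) {
		/* Move the waiters skipped before T to the secondary tail. */
		if (prev) {
			prev->next = NULL;
			if (q->sec_tail)
				q->sec_tail->next = q->main_head;
			else
				q->sec_head = q->main_head;
			q->sec_tail = prev;
		}
		q->main_head = cur->next;
		cur->next = NULL;
		return cur;
	}

	if (q->sec_head) {
		/*
		 * No local waiter: pass the lock to the secondary head.
		 * Assumption: the rest of the secondary queue is spliced
		 * in front of the main queue.
		 */
		succ = q->sec_head;
		q->sec_tail->next = q->main_head;
		q->main_head = succ->next;
		q->sec_head = q->sec_tail = NULL;
		succ->next = NULL;
		return succ;
	}

	/* Secondary queue is empty: next waiter in the main queue. */
	succ = q->main_head;
	if (succ) {
		q->main_head = succ->next;
		succ->next = NULL;
	}
	return succ;
}
```

For a main queue of waiters on nodes 1, 1, 0, 1 and a holder on node 0, this picks the third waiter and parks the first two on the secondary queue. Always preferring a local waiter in this way is what gives rise to the starvation concern noted in the description.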