From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 26 Apr 2017 15:39:47 +0300
From: Yury Norov <ynorov@caviumnetworks.com>
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, Ingo Molnar, Arnd Bergmann,
 Catalin Marinas, Will Deacon, Jan Glauber
Subject: Re: [PATCH 3/3] arm64/locking: qspinlocks and qrwlocks support
Message-ID: <20170426123947.ogap6xgg7xc5wsd4@yury-N73SV>
References: <1491860104-4103-1-git-send-email-ynorov@caviumnetworks.com>
 <1491860104-4103-4-git-send-email-ynorov@caviumnetworks.com>
 <20170413181212.y3ezah76qoztxhnn@hirez.programming.kicks-ass.net>
 <20170420182318.4ddtfiobxz6hgbo4@yury-N73SV>
 <20170420190530.GA6873@worktop>
In-Reply-To: <20170420190530.GA6873@worktop>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: NeoMutt/20170113 (1.7.2)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 20, 2017 at 09:05:30PM +0200, Peter Zijlstra wrote:
> On Thu, Apr 20, 2017 at 09:23:18PM +0300, Yury Norov wrote:
> > Is there some test to reproduce the locking failure for the case?
> 
> Possibly sysvsem stress before commit:
> 
>   27d7be1801a4 ("ipc/sem.c: avoid using spin_unlock_wait()")
> 
> Although a similar scheme is also used in nf_conntrack, see commit:
> 
>   b316ff783d17 ("locking/spinlock, netfilter: Fix nf_conntrack_lock() barriers")
> 
> > I ask because I run locktorture for many hours on my qemu (emulating
> > cortex-a57), and I see no failures in the test reports. And Jan did it
> > on ThunderX, and Adam on QDF2400 without any problems. So even if I
> > rework those functions, how could I check them for correctness?
> 
> Running them doesn't prove them correct. Memory ordering bugs have been
> in the kernel for many years without 'ever' triggering. This is stuff
> you have to think about.
> 
> > Anyway, regarding the queued_spin_unlock_wait(), is my understanding
> > correct that you assume adding smp_mb() before entering the for(;;)
> > cycle, and using ldaxr/strxr instead of atomic_read()?
> 
> You'll have to ask Will, I always forget the arm64 details.

So, below is what I have. For queued_spin_unlock_wait(), the generated
code looks like this:

ffff0000080983a0 <queued_spin_unlock_wait>:
ffff0000080983a0:       d5033bbf        dmb     ish
ffff0000080983a4:       b9400007        ldr     w7, [x0]
ffff0000080983a8:       350000c7        cbnz    w7, ffff0000080983c0
ffff0000080983ac:       1400000e        b       ffff0000080983e4
ffff0000080983b0:       d503203f        yield
ffff0000080983b4:       d5033bbf        dmb     ish
ffff0000080983b8:       b9400007        ldr     w7, [x0]
ffff0000080983bc:       34000147        cbz     w7, ffff0000080983e4
ffff0000080983c0:       f2401cff        tst     x7, #0xff
ffff0000080983c4:       54ffff60        b.eq    ffff0000080983b0
ffff0000080983c8:       14000003        b       ffff0000080983d4
ffff0000080983cc:       d503201f        nop
ffff0000080983d0:       d503203f        yield
ffff0000080983d4:       d5033bbf        dmb     ish
ffff0000080983d8:       b9400007        ldr     w7, [x0]
ffff0000080983dc:       f2401cff        tst     x7, #0xff
ffff0000080983e0:       54ffff81        b.ne    ffff0000080983d0
ffff0000080983e4:       d50339bf        dmb     ishld
ffff0000080983e8:       d65f03c0        ret
ffff0000080983ec:       d503201f        nop

If I understand the documentation correctly, this is enough to check the
lock properly.
If not, please give me a clue. Will?

Yury

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 22dbde97eefa..2d80161ee367 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -25,6 +25,8 @@ config ARM64
 	select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
 	select ARCH_WANT_FRAME_POINTERS
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
+	select ARCH_USE_QUEUED_SPINLOCKS
+	select ARCH_USE_QUEUED_RWLOCKS
 	select ARM_AMBA
 	select ARM_ARCH_TIMER
 	select ARM_GIC
diff --git a/arch/arm64/include/asm/qrwlock.h b/arch/arm64/include/asm/qrwlock.h
new file mode 100644
index 000000000000..626f6ebfb52d
--- /dev/null
+++ b/arch/arm64/include/asm/qrwlock.h
@@ -0,0 +1,7 @@
+#ifndef _ASM_ARM64_QRWLOCK_H
+#define _ASM_ARM64_QRWLOCK_H
+
+#include <asm-generic/qrwlock_types.h>
+#include <asm-generic/qrwlock.h>
+
+#endif /* _ASM_ARM64_QRWLOCK_H */
diff --git a/arch/arm64/include/asm/qspinlock.h b/arch/arm64/include/asm/qspinlock.h
new file mode 100644
index 000000000000..09ef4f13f549
--- /dev/null
+++ b/arch/arm64/include/asm/qspinlock.h
@@ -0,0 +1,42 @@
+#ifndef _ASM_ARM64_QSPINLOCK_H
+#define _ASM_ARM64_QSPINLOCK_H
+
+#include <asm-generic/qspinlock_types.h>
+#include
+
+extern void queued_spin_unlock_wait(struct qspinlock *lock);
+#define queued_spin_unlock_wait queued_spin_unlock_wait
+
+#define queued_spin_unlock queued_spin_unlock
+/**
+ * queued_spin_unlock - release a queued spinlock
+ * @lock : Pointer to queued spinlock structure
+ *
+ * A smp_store_release() on the least-significant byte.
+ */
+static __always_inline void queued_spin_unlock(struct qspinlock *lock)
+{
+	smp_store_release((u8 *)lock, 0);
+}
+
+#define queued_spin_is_locked queued_spin_is_locked
+/**
+ * queued_spin_is_locked - is the spinlock locked?
+ * @lock: Pointer to queued spinlock structure
+ * Return: 1 if it is locked, 0 otherwise
+ */
+static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
+{
+	/*
+	 * See queued_spin_unlock_wait().
+	 *
+	 * Any !0 state indicates it is locked, even if _Q_LOCKED_VAL
+	 * isn't immediately observable.
+	 */
+	smp_mb();
+	return atomic_read(&lock->val);
+}
+
+#include <asm-generic/qspinlock.h>
+
+#endif /* _ASM_ARM64_QSPINLOCK_H */
diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h
index cae331d553f8..37713397e0c5 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -20,6 +20,10 @@
 #include
 #include
 
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#include <asm/qspinlock.h>
+#else
+
 /*
  * Spinlock implementation.
  *
@@ -187,6 +191,12 @@ static inline int arch_spin_is_contended(arch_spinlock_t *lock)
 }
 #define arch_spin_is_contended	arch_spin_is_contended
 
+#endif /* CONFIG_QUEUED_SPINLOCKS */
+
+#ifdef CONFIG_QUEUED_RWLOCKS
+#include <asm/qrwlock.h>
+#else
+
 /*
  * Write lock implementation.
  *
@@ -351,6 +361,8 @@ static inline int arch_read_trylock(arch_rwlock_t *rw)
 /* read_can_lock - would read_trylock() succeed? */
 #define arch_read_can_lock(x)	((x)->lock < 0x80000000)
 
+#endif /* CONFIG_QUEUED_RWLOCKS */
+
 #define arch_read_lock_flags(lock, flags)	arch_read_lock(lock)
 #define arch_write_lock_flags(lock, flags)	arch_write_lock(lock)
diff --git a/arch/arm64/include/asm/spinlock_types.h b/arch/arm64/include/asm/spinlock_types.h
index 55be59a35e3f..0f0f1561ab6a 100644
--- a/arch/arm64/include/asm/spinlock_types.h
+++ b/arch/arm64/include/asm/spinlock_types.h
@@ -16,9 +16,9 @@
 #ifndef __ASM_SPINLOCK_TYPES_H
 #define __ASM_SPINLOCK_TYPES_H
 
-#if !defined(__LINUX_SPINLOCK_TYPES_H) && !defined(__ASM_SPINLOCK_H)
-# error "please don't include this file directly"
-#endif
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#include <asm-generic/qspinlock_types.h>
+#else
 
 #include
 
@@ -36,10 +36,18 @@ typedef struct {
 
 #define __ARCH_SPIN_LOCK_UNLOCKED	{ 0 , 0 }
 
+#endif /* CONFIG_QUEUED_SPINLOCKS */
+
+#ifdef CONFIG_QUEUED_RWLOCKS
+#include <asm-generic/qrwlock_types.h>
+#else
+
 typedef struct {
 	volatile unsigned int lock;
 } arch_rwlock_t;
 
 #define __ARCH_RW_LOCK_UNLOCKED		{ 0 }
 
+#endif /* CONFIG_QUEUED_RWLOCKS */
+
 #endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 9d56467dc223..f48f6256e893 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -56,6 +56,7 @@ arm64-obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o	\
 arm64-obj-$(CONFIG_ARM64_RELOC_TEST)	+= arm64-reloc-test.o
 arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
 arm64-obj-$(CONFIG_CRASH_DUMP)		+= crash_dump.o
+arm64-obj-$(CONFIG_QUEUED_SPINLOCKS)	+= qspinlock.o
 
 obj-y					+= $(arm64-obj-y) vdso/ probes/
 obj-$(CONFIG_ARM64_ILP32)		+= vdso-ilp32/
diff --git a/arch/arm64/kernel/qspinlock.c b/arch/arm64/kernel/qspinlock.c
new file mode 100644
index 000000000000..924f19953adb
--- /dev/null
+++ b/arch/arm64/kernel/qspinlock.c
@@ -0,0 +1,34 @@
+#include
+#include
+
+void queued_spin_unlock_wait(struct qspinlock *lock)
+{
+	u32 val;
+
+	for (;;) {
+		smp_mb();
+		val = atomic_read(&lock->val);
+
+		if (!val) /* not locked, we're done */
+			goto done;
+
+		if (val & _Q_LOCKED_MASK) /* locked, go wait for unlock */
+			break;
+
+		/* not locked, but pending, wait until we observe the lock */
+		cpu_relax();
+	}
+
+	for (;;) {
+		smp_mb();
+		val = atomic_read(&lock->val);
+		if (!(val & _Q_LOCKED_MASK)) /* any unlock is good */
+			break;
+
+		cpu_relax();
+	}
+
+done:
+	smp_acquire__after_ctrl_dep();
+}
+EXPORT_SYMBOL(queued_spin_unlock_wait);
-- 
2.11.0