From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1031072AbbD1SjL (ORCPT ); Tue, 28 Apr 2015 14:39:11 -0400 Received: from mail-db3on0085.outbound.protection.outlook.com ([157.55.234.85]:56670 "EHLO emea01-db3-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1030639AbbD1SjH (ORCPT ); Tue, 28 Apr 2015 14:39:07 -0400 Authentication-Results: redhat.com; dkim=none (message not signed) header.d=none; Message-ID: <553FD3B6.3080909@ezchip.com> Date: Tue, 28 Apr 2015 14:38:46 -0400 From: Chris Metcalf User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: Peter Zijlstra CC: "Paul E. McKenney" , Manfred Spraul , Oleg Nesterov , Kirill Tkhai , , Ingo Molnar , Josh Poimboeuf Subject: Re: [PATCH 2/2] [PATCH] sched: Add smp_rmb() in task rq locking cycles References: <54E77CC0.5030401@colorfullife.com> <20150220184551.GQ2896@worktop.programming.kicks-ass.net> <20150425195602.GA26676@linux.vnet.ibm.com> <20150426105213.GA27191@linux.vnet.ibm.com> <20150428143357.GF23123@twins.programming.kicks-ass.net> <553FACF1.2020405@ezchip.com> <20150428164033.GJ5029@twins.programming.kicks-ass.net> <553FBC4F.3070104@ezchip.com> <20150428174323.GL5029@twins.programming.kicks-ass.net> <553FCAD0.9090403@ezchip.com> <20150428182410.GM5029@twins.programming.kicks-ass.net> In-Reply-To: <20150428182410.GM5029@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [12.216.194.146] X-ClientProxiedBy: BN1PR12CA0015.namprd12.prod.outlook.com (25.160.77.25) To DB5PR02MB0773.eurprd02.prod.outlook.com (25.161.243.144) X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:DB5PR02MB0773; X-Forefront-Antispam-Report: BMV:1;SFV:NSPM;SFS:(10009020)(6009001)(6049001)(377454003)(479174004)(51704005)(24454002)(86362001)(42186005)(62966003)(33656002)(76176999)(2950100001)(4001350100001)(59896002)(77096005)(92566002)(77156002)(50986999)(15975445007)(64126003)(46102003)(65816999)(47776003)(66066001)(54356999)(65806001)(23746002)(110136001)(122386002)(40100003)(87976001)(19580395003)(83506001)(93886004)(36756003)(50466002)(87266999)(5001920100001)(65956001)(21314002)(18886065003);DIR:OUT;SFP:1101;SCL:1;SRVR:DB5PR02MB0773;H:[10.7.0.41];FPR:;SPF:None;MLV:sfv;LANG:en; X-Microsoft-Antispam-PRVS: X-Exchange-Antispam-Report-Test: UriScan:; X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(601004)(5002010)(5005006)(3002001);SRVR:DB5PR02MB0773;BCL:0;PCL:0;RULEID:;SRVR:DB5PR02MB0773; X-Forefront-PRVS: 0560A2214D X-OriginatorOrg: ezchip.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 28 Apr 2015 18:38:58.7411 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB5PR02MB0773 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 04/28/2015 02:24 PM, Peter Zijlstra wrote: > A few questions: > On Tue, Apr 28, 2015 at 02:00:48PM -0400, Chris Metcalf wrote: > >>> static inline void arch_spin_lock(arch_spinlock_t *lock) >>> { >>> unsigned short head, tail; >>> >>> ___tns_lock(&lock->lock); /* XXX does the TNS imply a ___sync? */ > Does it? Something needs to provide the ACQUIRE semantics. Yes, __insn_xxx() is a compiler barrier on tile. Tile architectures do not need any hardware-level "acquire" semantics since normally control dependency is sufficient for lock acquisition. Loads and stores are issued in-order into the mesh network, but issued loads don't block further instruction issue until a register read dependency requires it. There is no speculative execution. >>> head = lock->head; >>> lock->head++; >>> ___tns_unlock(&lock->lock); >>> >>> while (READ_ONCE(lock->tail) != head) >>> cpu_relax(); >>> } >>> >>> static inline void arch_spin_unlock(arch_spinlock_t *lock) >>> { >>> /* >>> * can do with regular load/store because the lock owner >>> * is the only one going to do stores to the tail >>> */ >>> unsigned short tail = READ_ONCE(lock->tail); >>> smp_mb(); /* MB is stronger than RELEASE */ > Note that your code uses wmb(), wmb is strictly speaking not correct, > as its weaker than RELEASE. > > _However_ it doesn't make any practical difference since all three > barriers end up emitting __sync() so its not a bug per se. Yes, this code dates back to 2.6.18 or so; today I would use smp_store_release(). I like the trend in the kernel to move more towards the C11 memory order model; I think it will help a lot. -- Chris Metcalf, EZChip Semiconductor http://www.ezchip.com