Date: Tue, 6 Jun 2017 12:53:43 +0200
From: Peter Zijlstra
To: "Paul E. McKenney"
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org, jiangshanlai@gmail.com,
    dipankar@in.ibm.com, akpm@linux-foundation.org,
    mathieu.desnoyers@efficios.com, josh@joshtriplett.org, tglx@linutronix.de,
    rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com,
    fweisbec@gmail.com, oleg@redhat.com, Paolo Bonzini, kvm@vger.kernel.org,
    Linus Torvalds
Subject: Re: [PATCH RFC tip/core/rcu 1/2] srcu: Allow use of Tiny/Tree SRCU from both process and interrupt context
Message-ID: <20170606105343.ibhzrk6jwhmoja5t@hirez.programming.kicks-ass.net>
References: <20170605220919.GA27820@linux.vnet.ibm.com> <1496700591-30177-1-git-send-email-paulmck@linux.vnet.ibm.com>
In-Reply-To: <1496700591-30177-1-git-send-email-paulmck@linux.vnet.ibm.com>

On Mon, Jun 05, 2017 at 03:09:50PM -0700, Paul E. McKenney wrote:
> There would be a slowdown if 1) fast this_cpu_inc is not available and
> cannot be implemented (this usually means that atomic_inc has implicit
> memory barriers),

I don't get this. How is per-cpu crud related to being strongly ordered?

this_cpu_ has 3 forms:

  x86:         single instruction
  arm64,s390:  preempt_disable()+atomic_op
  generic:     local_irq_save()+normal_op

Only s390 is TSO, arm64 is very much a weak arch.

> and 2) local_irq_save/restore is slower than disabling
> preemption.  The main architecture with these constraints is s390, which
> however is already paying the price in __srcu_read_unlock and has not
> complained.

IIRC only PPC (and hopefully soon x86) has a local_irq_save() that is as
fast as preempt_disable().

> A valid optimization on s390 would be to skip the smp_mb;
> AIUI, this_cpu_inc implies a memory barrier (!) due to its implementation.

You mean the s390 this_cpu_inc() in specific, right? Because
this_cpu_inc() in general does not imply any such thing.
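
To make the three this_cpu_ forms above concrete, here is a rough user-space model in C. It is only a sketch: every model_* name is hypothetical, preemption and IRQ state are faked with plain counters, and none of this is the kernel's actual implementation. The relaxed atomic is a deliberate choice in the model, reflecting that the API as such promises no ordering.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for kernel state, so the sketch runs in user space. */
static int model_preempt_depth;   /* > 0 models "preemption disabled" */
static int model_irqs_disabled;   /* 1 models "local IRQs masked" */

static void model_preempt_disable(void) { model_preempt_depth++; }

static void model_preempt_enable(void)
{
	assert(model_preempt_depth > 0);
	model_preempt_depth--;
}

static int model_local_irq_save(void)
{
	int flags = model_irqs_disabled;   /* remember the previous state */

	model_irqs_disabled = 1;
	return flags;
}

static void model_local_irq_restore(int flags) { model_irqs_disabled = flags; }

/*
 * x86 form: the real thing is a single instruction on the per-CPU
 * variable, so nothing can interleave on the local CPU. A plain
 * increment stands in for it here; no barrier is implied.
 */
static long model_inc_single_insn(long *pcp)
{
	return ++*pcp;
}

/*
 * arm64/s390 form: preempt_disable() + atomic op. The atomic is
 * relaxed in this model: atomicity only, no ordering.
 */
static long model_inc_preempt_atomic(atomic_long *pcp)
{
	long val;

	model_preempt_disable();
	val = atomic_fetch_add_explicit(pcp, 1, memory_order_relaxed) + 1;
	model_preempt_enable();
	return val;
}

/* Generic form: local_irq_save() + normal op, again with no barrier. */
static long model_inc_irqsave(long *pcp)
{
	int flags = model_local_irq_save();
	long val = ++*pcp;

	model_local_irq_restore(flags);
	return val;
}
```

Each helper returns the new counter value. None of the three contains anything that acts as an smp_mb(); any ordering a particular architecture happens to provide comes from its instruction, not from the this_cpu_ API.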