Date: Tue, 10 Mar 2020 00:52:11 +0100
From: Frederic Weisbecker
To: "Paul E. McKenney"
Cc: Thomas Gleixner, LKML, Peter Zijlstra, Steven Rostedt,
    Masami Hiramatsu, Alexei Starovoitov, Mathieu Desnoyers, Joel Fernandes
Subject: Re: Instrumentation and RCU
Message-ID: <20200309235210.GB20868@lenoir>
References: <87mu8p797b.fsf@nanos.tec.linutronix.de> <20200309204710.GU2935@paulmck-ThinkPad-P72>
In-Reply-To: <20200309204710.GU2935@paulmck-ThinkPad-P72>

On Mon, Mar 09, 2020 at 01:47:10PM -0700, Paul E. McKenney wrote:
> On Mon, Mar 09, 2020 at 06:02:32PM +0100, Thomas Gleixner wrote:
> > #3) RCU idle
> >
> > Being able to trace code inside RCU idle sections is very similar to
> > the question raised in #1.
> >
> > Assume all of the instrumentation would be doing conditional RCU
> > schemes, i.e.:
> >
> >     if (rcuidle)
> >         ....
> >     else
> >         rcu_read_lock_sched()
> >
> > before invoking the actual instrumentation functions and of course
> > undoing that right after it, that really begs the question whether
> > it's worth it.
> >
> > Especially constructs like:
> >
> >     trace_hardirqs_off()
> >         idx = srcu_read_lock()
> >         rcu_irq_enter_irqson();
> >         ...
> >         rcu_irq_exit_irqson();
> >         srcu_read_unlock(idx);
> >
> >     if (user_mode)
> >         user_exit_irqsoff();
> >     else
> >         rcu_irq_enter();
> >
> > are really more than questionable. For 99.9999% of instrumentation
> > users it's absolutely irrelevant whether this traces the interrupt
> > disabled time of user_exit_irqsoff() or rcu_irq_enter() or not.
> >
> > But what's relevant is the tracer overhead which is e.g.
> > inflicted with today's trace_hardirqs_off/on() implementation because that
> > unconditionally uses the rcuidle variant with the srcu/rcu_irq dance
> > around every tracepoint.
> >
> > Even if the tracepoint sits in the ASM code it just covers about ~20
> > low level ASM instructions more. The tracer invocation, which is
> > even done twice when coming from user space on x86 (the second call
> > is optimized in the tracer C-code), costs definitely way more
> > cycles. When you take the srcu/rcu_irq dance into account it's a
> > complete disaster performance wise.
>
> Suppose that we had a variant of RCU that had about the same read-side
> overhead as Preempt-RCU, but which could be used from idle as well as
> from CPUs in the process of coming online or going offline? I have not
> thought through the irq/NMI/exception entry/exit cases, but I don't see
> why that would be a problem.
>
> This would have explicit critical-section entry/exit code, so it would
> not be any help for trampolines.
>
> Would such a variant of RCU help?
>
> Yeah, I know. Just what the kernel doesn't need, yet another variant
> of RCU...
>

I was thinking about having a tracing-specific implementation of RCU.
Last week Steve told me that the tracing ring buffer has its own ad-hoc
RCU implementation which schedules a thread on each CPU to complete a
grace period (did I understand it right?). Of course such a flavour of
RCU wouldn't be nice to nohz_full, but surely we can arrange some tweaks
for those who require strong isolation.

I'm sure you have a much better idea though.
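
For illustration only, here is a userspace toy model of the scheme
described above, assuming that description is right: a grace period is
completed by making every CPU run a trivial work item. It is not kernel
code and not the ring buffer's actual implementation. Every name in it
(toy_cpu, toy_start, toy_synchronize, toy_stop, TOY_NR_CPUS) is
hypothetical, and a pthread stands in for each CPU. The model relies on
one assumption: a read-side section that cannot be preempted never
interleaves with a work item on the same CPU, so once the work item has
run there, every section that started earlier on that CPU has ended.

```c
/* Toy userspace model, NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* One worker thread stands in for one CPU. */
struct toy_cpu {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    unsigned long requested;  /* sync items queued on this "CPU" */
    unsigned long completed;  /* sync items this "CPU" has run */
    bool stop;
};

static struct toy_cpu toy_cpus[TOY_NR_CPUS];

/* Worker loop: running the (empty) sync item is the quiescent state. */
static void *toy_cpu_thread(void *arg)
{
    struct toy_cpu *cpu = arg;

    pthread_mutex_lock(&cpu->lock);
    while (!cpu->stop) {
        if (cpu->completed != cpu->requested) {
            cpu->completed = cpu->requested;
            pthread_cond_broadcast(&cpu->cond);
        } else {
            pthread_cond_wait(&cpu->cond, &cpu->lock);
        }
    }
    pthread_mutex_unlock(&cpu->lock);
    return NULL;
}

/* Start one worker per modelled CPU. Returns 0 on success, -1 on error. */
int toy_start(void)
{
    for (int i = 0; i < TOY_NR_CPUS; i++) {
        struct toy_cpu *cpu = &toy_cpus[i];

        pthread_mutex_init(&cpu->lock, NULL);
        pthread_cond_init(&cpu->cond, NULL);
        cpu->requested = 0;
        cpu->completed = 0;
        cpu->stop = false;
        if (pthread_create(&cpu->thread, NULL, toy_cpu_thread, cpu))
            return -1;
    }
    return 0;
}

/*
 * Toy grace period: queue a sync item on every CPU, then wait until each
 * one has run it. Returns the number of CPUs that passed through.
 */
int toy_synchronize(void)
{
    unsigned long target[TOY_NR_CPUS];
    int passed = 0;

    /* Pass 1: queue the item everywhere so the CPUs run it in parallel. */
    for (int i = 0; i < TOY_NR_CPUS; i++) {
        struct toy_cpu *cpu = &toy_cpus[i];

        pthread_mutex_lock(&cpu->lock);
        target[i] = ++cpu->requested;
        pthread_cond_broadcast(&cpu->cond);
        pthread_mutex_unlock(&cpu->lock);
    }

    /* Pass 2: wait for each CPU to report that it ran our item. */
    for (int i = 0; i < TOY_NR_CPUS; i++) {
        struct toy_cpu *cpu = &toy_cpus[i];

        pthread_mutex_lock(&cpu->lock);
        while (cpu->completed < target[i])
            pthread_cond_wait(&cpu->cond, &cpu->lock);
        pthread_mutex_unlock(&cpu->lock);
        passed++;
    }
    return passed;
}

/* Stop and join all workers. Call only when no toy_synchronize() runs. */
void toy_stop(void)
{
    for (int i = 0; i < TOY_NR_CPUS; i++) {
        struct toy_cpu *cpu = &toy_cpus[i];

        pthread_mutex_lock(&cpu->lock);
        cpu->stop = true;
        pthread_cond_broadcast(&cpu->cond);
        pthread_mutex_unlock(&cpu->lock);
        pthread_join(cpu->thread, NULL);
    }
}
```

A caller would run toy_start() once, toy_synchronize() wherever it needs
to wait for earlier readers, and toy_stop() at the end. The model also
shows the cost mentioned above: every grace period wakes every CPU,
which is exactly what a nohz_full setup wants to avoid.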