From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2A55FC4320A for ; Thu, 5 Aug 2021 19:15:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 11F6661108 for ; Thu, 5 Aug 2021 19:15:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242025AbhHETQA (ORCPT ); Thu, 5 Aug 2021 15:16:00 -0400 Received: from mail.efficios.com ([167.114.26.124]:42972 "EHLO mail.efficios.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241945AbhHETP7 (ORCPT ); Thu, 5 Aug 2021 15:15:59 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.efficios.com (Postfix) with ESMTP id 83617372E0F; Thu, 5 Aug 2021 15:15:44 -0400 (EDT) Received: from mail.efficios.com ([127.0.0.1]) by localhost (mail03.efficios.com [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id wYvbfDrBUFde; Thu, 5 Aug 2021 15:15:43 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by mail.efficios.com (Postfix) with ESMTP id 6B9B6372A4E; Thu, 5 Aug 2021 15:15:43 -0400 (EDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.efficios.com 6B9B6372A4E DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=efficios.com; s=default; t=1628190943; bh=Ty7RbvkQqrCPIxm+YfHG7bWLDDAioo2iU/vXWJG3ZLY=; h=Date:From:To:Message-ID:MIME-Version; b=F7yl5RjfT3tytzXm6NvhJURLm2mDJevaq35feG8HgbkWAG56TVUPwE6Q1ifEEUQ63 pp8Wc3pvW+a+G+V3kh9fJEHIgiSl+tsQCZZYk5Ltt+ybVEyhQYMTjJ+ymn7rRjTWtq TrLK38eJLo9VKpVI/VzyUuZQkuqbNgxxCodGgas0ecqYf2/KeivHKhu44wIojSFmMT +Wu39o3qYbNdBTpYmYc3O26j4KXuIgbkEJc5agiS34M2M2QH1kt/sezUbsUEvIFHVW rVAP/OxJnhshQXulsc5+p7j9VCxqrsq29cxw2sJ8WtrDKnLaFDzHhjBhrqjH2tRNVx UtKjidwHSOrUw== X-Virus-Scanned: amavisd-new at efficios.com Received: from mail.efficios.com ([127.0.0.1]) by localhost (mail03.efficios.com [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id X73s26ayJ0cP; Thu, 5 Aug 2021 15:15:43 -0400 (EDT) Received: from mail03.efficios.com (mail03.efficios.com [167.114.26.124]) by mail.efficios.com (Postfix) with ESMTP id 56EAA372E0E; Thu, 5 Aug 2021 15:15:43 -0400 (EDT) Date: Thu, 5 Aug 2021 15:15:43 -0400 (EDT) From: Mathieu Desnoyers To: rostedt Cc: Ingo Molnar , Peter Zijlstra , Andrew Morton , paulmck , Stefan Metzmacher , stable , linux-kernel Message-ID: <1058325468.7289.1628190943244.JavaMail.zimbra@efficios.com> In-Reply-To: <20210805145631.609e0a80@oasis.local.home> References: <20210805132717.23813-1-mathieu.desnoyers@efficios.com> <20210805132717.23813-3-mathieu.desnoyers@efficios.com> <20210805145631.609e0a80@oasis.local.home> Subject: Re: [PATCH 2/3] Fix: tracepoint: static call function vs data state mismatch (v2) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [167.114.26.124] X-Mailer: Zimbra 8.8.15_GA_4101 (ZimbraWebClient - FF90 (Linux)/8.8.15_GA_4059) Thread-Topic: tracepoint: static call function vs data state mismatch (v2) Thread-Index: ern69LtzYAjFuaxy6YMDzDYXlkUC1w== Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org ----- On Aug 5, 2021, at 2:56 PM, rostedt rostedt@goodmis.org wrote: > Note, there shouldn't be a "(v2)" outside the "[PATCH ]" part. > Otherwise it gets added into the git commit during "git am". Out of curiosity, do you know any way to annotate my local commits to have the [PATCH v2] tag automatically generated by git send-email ? > > On Thu, 5 Aug 2021 09:27:16 -0400 > Mathieu Desnoyers wrote: > >> On a 1->0->1 callbacks transition, there is an issue with the new >> callback using the old callback's data. >> >> Considering __DO_TRACE_CALL: >> >> do { \ >> struct tracepoint_func *it_func_ptr; \ >> void *__data; \ >> it_func_ptr = \ >> rcu_dereference_raw((&__tracepoint_##name)->funcs); \ >> if (it_func_ptr) { \ >> __data = (it_func_ptr)->data; \ >> >> ----> [ delayed here on one CPU (e.g. vcpu preempted by the host) ] >> >> static_call(tp_func_##name)(__data, args); \ >> } \ >> } while (0) >> >> It has loaded the tp->funcs of the old callback, so it will try to use the old >> data. This can be fixed by adding a RCU sync anywhere in the 1->0->1 >> transition chain. >> >> On a N->2->1 transition, we need an rcu-sync because you may have a >> sequence of 3->2->1 (or 1->2->1) where the element 0 data is unchanged >> between 2->1, but was changed from 3->2 (or from 1->2), which may be >> observed by the static call. This can be fixed by adding an >> unconditional RCU sync in transition 2->1. >> >> A follow up fix will introduce a more lightweight scheme based on RCU >> get_state and cond_sync. > > I'll add here that this patch will cause a huge performance regression > on disabling the trace events, but the follow up patch will fix that. > > Before this patch: > > # trace-cmd start -e all > # time trace-cmd start -p nop > > real 0m0.778s > user 0m0.000s > sys 0m0.061s > > After this patch: > > # trace-cmd start -e all > # time trace-cmd start -p nop > > real 0m10.593s > user 0m0.017s > sys 0m0.259s > > > That's more than 10x slow down. Just under a second to disable all > events now goes to over 10 seconds! > > But after the next patch: > > # trace-cmd start -e all > # time trace-cmd start -p nop > > real 0m0.878s > user 0m0.000s > sys 0m0.103s > > Which is in the noise from before this patch. > > This is a big enough regression, I'll even add a Fixes tag to the next > patch on the final sha1 of this patch! Such that this patch won't be > backported without the next patch. This makes sense. I still wanted to keep the two patches separate so we would introduce the (slow) state machine in the first patch, and optimize for speed in the second. My intent is to facilitate of small logical changes, and make bisection more precise in the future if we introduce an issue here. Calling out more clearly how slow things become with this patch is indeed important. > >> >> Link: >> https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/ >> Fixes: d25e37d89dd2 ("tracepoint: Optimize using static_call()") > > For this patch, I would say the above is what this fixes. Yes. Thanks, Mathieu > > -- Steve > >> Fixes: 547305a64632 ("tracepoint: Fix out of sync data passing by static >> caller") >> Fixes: 352384d5c84e ("tracepoints: Update static_call before tp_funcs when >> adding a tracepoint") >> Signed-off-by: Mathieu Desnoyers >> Cc: Steven Rostedt >> Cc: Ingo Molnar >> Cc: Peter Zijlstra >> Cc: Andrew Morton >> Cc: "Paul E. McKenney" >> Cc: Stefan Metzmacher >> Cc: # 5.10+ > > --- -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com