From: Nadav Amit <namit@vmware.com>
To: Dave Hansen <dave.hansen@intel.com>
Cc: kernel test robot <oliver.sang@intel.com>, Ingo Molnar <mingo@kernel.org>, Dave Hansen <dave.hansen@linux.intel.com>, LKML <linux-kernel@vger.kernel.org>, "lkp@lists.01.org" <lkp@lists.01.org>, "lkp@intel.com" <lkp@intel.com>, "ying.huang@intel.com" <ying.huang@intel.com>, "feng.tang@intel.com" <feng.tang@intel.com>, "zhengjun.xing@linux.intel.com" <zhengjun.xing@linux.intel.com>, "fengwei.yin@intel.com" <fengwei.yin@intel.com>
Subject: Re: [x86/mm/tlb] 6035152d8e: will-it-scale.per_thread_ops -13.2% regression
Date: Thu, 17 Mar 2022 20:32:35 +0000
Message-ID: <DC37F01B-A80F-4839-B4FB-C21F64943E64@vmware.com>
In-Reply-To: <96f9b880-876f-bf4d-8eb0-9ae8bbc8df6d@intel.com>

> On Mar 17, 2022, at 12:11 PM, Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 3/17/22 12:02, Nadav Amit wrote:
>>> This new "early lazy check" behavior could theoretically work both ways.
>>> If threads tended to be waking up from idle when TLB flushes were being
>>> sent, this would tend to reduce the number of IPIs. But, since they
>>> tend to be going to sleep, it increases the number of IPIs.
>>>
>>> Anybody have a better theory? I think we should probably revert the commit.
>>
>> Let's get back to the motivation behind this patch.
>>
>> Originally we had an indirect branch that, on systems vulnerable to
>> Spectre v2, translates into a retpoline.
>>
>> So I would not characterize this patch's purpose as an "early lazy check"
>> but rather a "more efficient lazy check". Very little code was executed
>> between the call to on_each_cpu_cond_mask() and the actual check of
>> tlb_is_not_lazy(). So what seems to happen in this test case - according
>> to what you say - is that a *slower* check of is-lazy allows fewer IPIs
>> to be sent, since some cores go into an idle state in the meantime.
>>
>> Was this test run with retpolines? If there is a difference in
>> performance even without retpolines, I am probably wrong.
>
> Nope, no retpolines:

Err..

>> /sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling
>
> which is the same situation as the "Xeon Platinum 8358" which found this
> in 0day.
>
> Maybe the increased IPIs with this approach end up being a wash with the
> reduced retpoline overhead.
>
> Did you have any specific performance numbers that show the benefit on
> retpoline systems?

I profiled this thing to death at the time, but I don't have the numbers
with me now. I did not run will-it-scale, but a similar benchmark that
I wrote.

Another possible explanation is that this patch alone, without the
subsequent patches in the series, has some negative impact. I do not have
a good explanation for how, but can we rule this out? Can you please
clarify how the bot works: did it notice a performance regression and then
start bisecting, or does it just check one patch at a time? I ask because
I got a different report, saying that a subsequent patch
("x86/mm/tlb: Privatize cpu_tlbstate") made a 23.3% improvement [1] on a
very similar (yet different) test.

Without a good explanation, my knee-jerk reaction is that this looks like
a pathological case. I do not expect a performance improvement without
retpolines, and perhaps the few cycles by which the is-lazy test is moved
earlier matter. I'm not married to this patch, but before a revert it
would be good to know why it even matters. Can you confirm that reverting
this patch alone (without the rest of the series) even helps? If it does,
I'll try to run some tests to understand what the heck is going on.

[1] https://lists.ofono.org/hyperkitty/list/lkp@lists.01.org/thread/UTC7DVZX4O5DKT2WUTWBTCVQ6W5QLGFA/
Thread overview (22+ messages):

2022-03-17  9:04 [x86/mm/tlb] 6035152d8e: will-it-scale.per_thread_ops -13.2% regression — kernel test robot
2022-03-17 18:38 ` Dave Hansen
2022-03-17 19:02 ` Nadav Amit
2022-03-17 19:11 ` Dave Hansen
2022-03-17 20:32 ` Nadav Amit [this message]
2022-03-17 20:49 ` Dave Hansen
2022-03-18  2:56 ` Oliver Sang
2022-03-18  0:16 ` Dave Hansen
2022-03-18  0:20 ` Nadav Amit
2022-03-18  0:45 ` Dave Hansen
2022-03-18  3:02 ` Nadav Amit