From: Linus Torvalds <torvalds@linux-foundation.org> To: kernel test robot <oliver.sang@intel.com> Cc: Mikulas Patocka <mpatocka@redhat.com>, lkp@lists.01.org, lkp@intel.com, Matthew Wilcox <willy@infradead.org>, linux-kernel@vger.kernel.org, ying.huang@intel.com, feng.tang@intel.com, zhengjun.xing@linux.intel.com, fengwei.yin@intel.com, regressions@lists.linux.dev Subject: Re: d4252071b9: fxmark.ssd_ext4_no_jnl_DWTL_54_directio.works/sec -26.5% regression Date: Wed, 31 Aug 2022 09:46:12 -0700 [thread overview] Message-ID: <CAHk-=wgG=mttS-m2OLcnsTwia2roHR2b-DxXXG-tbDH8_cUNiA@mail.gmail.com> (raw) In-Reply-To: <Yw8L7HTZ/dE2/o9C@xsang-OptiPlex-9020> On Wed, Aug 31, 2022 at 12:21 AM kernel test robot <oliver.sang@intel.com> wrote: > > hi, pleased be noted that we read this patch and understand it as a fix, > also what we understand is, since the patch itself adds some memory barrier, > some regression in block IO area is kind of expected. Well, yes and no. It's a memory ordering fix, but the memory ordering part is one that should *not* have any actual impact on x86, because the addition of smp_mb__before_atomic() should be a total no-op, and "smp_load_acquire()" should only imply a compiler scheduling barrier. IOW, it most definitely shouldn't cause something like this: > FYI, we noticed a -26.5% regression of > fxmark.ssd_ext4_no_jnl_DWTL_54_directio.works/sec because at most it should have caused tiny perturbation of the instruction scheduling (obviously possibly register allocation, stack spill differences and and instruction choice). Except there was a change there that isn't just about memory ordering: > after more internal review, we still decided to report out to share our finding > in our tests, and for your information that how this patch could impact > performance in some cases. please let us know if you have any concern. Oh, it's absolutely interesting and unexpected. And I think the cause is obvious: our "set_buffer_uptodate()" *used* to use the BUFFER_FNS() macro, which does that bit setting conditionally. And while that isn't actually correct in an "atomic op" situation, it *is* fine in the case of set_buffer_uptodate(), since if the buffer was already uptodate, any other CPU looking at that bit will not be caring about what *this* CPU did. IOW, if this CPU sees the bit as having ever been uptodate before, then any barriers are irrelevant, because they are about the original setting of 'uptodate', not the new one. So I think we can just do this: --- a/include/linux/buffer_head.h +++ b/include/linux/buffer_head.h @@ -137,12 +137,14 @@ BUFFER_FNS(Defer_Completion, defer_completion) static __always_inline void set_buffer_uptodate(struct buffer_head *bh) { - /* - * make it consistent with folio_mark_uptodate - * pairs with smp_load_acquire in buffer_uptodate - */ - smp_mb__before_atomic(); - set_bit(BH_Uptodate, &bh->b_state); + if (!test_bit(BH_Uptodate, &bh->b_state)) { + /* + * make it consistent with folio_mark_uptodate + * pairs with smp_load_acquire in buffer_uptodate + */ + smp_mb__before_atomic(); + set_bit(BH_Uptodate, &bh->b_state); + } } static __always_inline void clear_buffer_uptodate(struct buffer_head *bh) and re-introduce the original code (maybe extend that comment to talk about this "only first up-to-date matters". HOWEVER. I'd love to hear if you have a clear profile change, and to see exactly which set_buffer_uptodate() is *so* important. Honestly, I didn't expect the buffer head functions to even really matter much any more, with pretty much all IO being about the page cache.. Linus
WARNING: multiple messages have this Message-ID (diff)
From: Linus Torvalds <torvalds@linux-foundation.org> To: lkp@lists.01.org Subject: Re: d4252071b9: fxmark.ssd_ext4_no_jnl_DWTL_54_directio.works/sec -26.5% regression Date: Wed, 31 Aug 2022 09:46:12 -0700 [thread overview] Message-ID: <CAHk-=wgG=mttS-m2OLcnsTwia2roHR2b-DxXXG-tbDH8_cUNiA@mail.gmail.com> (raw) In-Reply-To: <Yw8L7HTZ/dE2/o9C@xsang-OptiPlex-9020> [-- Attachment #1: Type: text/plain, Size: 3204 bytes --] On Wed, Aug 31, 2022 at 12:21 AM kernel test robot <oliver.sang@intel.com> wrote: > > hi, pleased be noted that we read this patch and understand it as a fix, > also what we understand is, since the patch itself adds some memory barrier, > some regression in block IO area is kind of expected. Well, yes and no. It's a memory ordering fix, but the memory ordering part is one that should *not* have any actual impact on x86, because the addition of smp_mb__before_atomic() should be a total no-op, and "smp_load_acquire()" should only imply a compiler scheduling barrier. IOW, it most definitely shouldn't cause something like this: > FYI, we noticed a -26.5% regression of > fxmark.ssd_ext4_no_jnl_DWTL_54_directio.works/sec because at most it should have caused tiny perturbation of the instruction scheduling (obviously possibly register allocation, stack spill differences and and instruction choice). Except there was a change there that isn't just about memory ordering: > after more internal review, we still decided to report out to share our finding > in our tests, and for your information that how this patch could impact > performance in some cases. please let us know if you have any concern. Oh, it's absolutely interesting and unexpected. And I think the cause is obvious: our "set_buffer_uptodate()" *used* to use the BUFFER_FNS() macro, which does that bit setting conditionally. And while that isn't actually correct in an "atomic op" situation, it *is* fine in the case of set_buffer_uptodate(), since if the buffer was already uptodate, any other CPU looking at that bit will not be caring about what *this* CPU did. IOW, if this CPU sees the bit as having ever been uptodate before, then any barriers are irrelevant, because they are about the original setting of 'uptodate', not the new one. So I think we can just do this: --- a/include/linux/buffer_head.h +++ b/include/linux/buffer_head.h @@ -137,12 +137,14 @@ BUFFER_FNS(Defer_Completion, defer_completion) static __always_inline void set_buffer_uptodate(struct buffer_head *bh) { - /* - * make it consistent with folio_mark_uptodate - * pairs with smp_load_acquire in buffer_uptodate - */ - smp_mb__before_atomic(); - set_bit(BH_Uptodate, &bh->b_state); + if (!test_bit(BH_Uptodate, &bh->b_state)) { + /* + * make it consistent with folio_mark_uptodate + * pairs with smp_load_acquire in buffer_uptodate + */ + smp_mb__before_atomic(); + set_bit(BH_Uptodate, &bh->b_state); + } } static __always_inline void clear_buffer_uptodate(struct buffer_head *bh) and re-introduce the original code (maybe extend that comment to talk about this "only first up-to-date matters". HOWEVER. I'd love to hear if you have a clear profile change, and to see exactly which set_buffer_uptodate() is *so* important. Honestly, I didn't expect the buffer head functions to even really matter much any more, with pretty much all IO being about the page cache.. Linus
next prev parent reply other threads:[~2022-08-31 16:46 UTC|newest] Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-08-31 7:21 d4252071b9: fxmark.ssd_ext4_no_jnl_DWTL_54_directio.works/sec -26.5% regression kernel test robot 2022-08-31 7:21 ` kernel test robot 2022-08-31 16:46 ` Linus Torvalds [this message] 2022-08-31 16:46 ` Linus Torvalds 2022-09-08 8:55 ` [LKP] " Yin, Fengwei 2022-09-08 8:55 ` Yin, Fengwei 2022-09-08 12:14 ` [LKP] " Linus Torvalds 2022-09-08 12:14 ` Linus Torvalds 2022-09-08 13:25 ` [LKP] " Yin, Fengwei 2022-09-08 13:25 ` Yin, Fengwei 2022-09-08 16:50 ` d4252071b9: fxmark.ssd_ext4_no_jnl_DWTL_54_directio.works/sec -26.5% regression #forregzbot Thorsten Leemhuis
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to='CAHk-=wgG=mttS-m2OLcnsTwia2roHR2b-DxXXG-tbDH8_cUNiA@mail.gmail.com' \ --to=torvalds@linux-foundation.org \ --cc=feng.tang@intel.com \ --cc=fengwei.yin@intel.com \ --cc=linux-kernel@vger.kernel.org \ --cc=lkp@intel.com \ --cc=lkp@lists.01.org \ --cc=mpatocka@redhat.com \ --cc=oliver.sang@intel.com \ --cc=regressions@lists.linux.dev \ --cc=willy@infradead.org \ --cc=ying.huang@intel.com \ --cc=zhengjun.xing@linux.intel.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.