* [PATCH] smp_rmb_cond
From: Matthew Wilcox @ 2021-04-19 20:12 UTC (permalink / raw)
To: David Howells; +Cc: linux-fsdevel
I see worse inlining decisions from gcc with this. Maybe you see
an improvement that would justify it?
[ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99998]
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index 4819d5e5a335..4cbc5bd5bcdd 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -60,6 +60,7 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
#define __smp_mb() asm volatile("lock; addl $0,-4(%%rsp)" ::: "memory", "cc")
#endif
#define __smp_rmb() dma_rmb()
+#define smp_rmb_cond(x) barrier()
#define __smp_wmb() barrier()
#define __smp_store_mb(var, value) do { (void)xchg(&var, value); } while (0)
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 640f09479bdf..cc0c864f90dc 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -89,6 +89,10 @@
#endif /* CONFIG_SMP */
+#ifndef smp_rmb_cond
+#define smp_rmb_cond(x) do { if (x) smp_rmb(); } while (0)
+#endif
+
#ifndef __smp_store_mb
#define __smp_store_mb(var, value) do { WRITE_ONCE(var, value); __smp_mb(); } while (0)
#endif
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 04a34c08e0a6..c45d491e9245 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -522,8 +522,7 @@ static inline int PageUptodate(struct page *page)
*
* See SetPageUptodate() for the other side of the story.
*/
- if (ret)
- smp_rmb();
+ smp_rmb_cond(ret);
return ret;
}
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 7a1414622051..260ef2474ff2 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -89,8 +89,7 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
* Make sure that all old data have been read before the buffer
* was reset. This is not needed when we just append data.
*/
- if (!len)
- smp_rmb();
+ smp_rmb_cond(!len);
va_copy(ap, args);
add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len, fmt, ap);
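[Editor's note: for readers outside the kernel tree, the pattern the new macro abstracts can be sketched in portable C11. The user-space names here are hypothetical, and atomic_thread_fence(memory_order_acquire) stands in for the kernel's smp_rmb():]

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Generic fallback from the patch, expressed with C11 atomics: issue the
 * read barrier only on the path where the condition was observed true. */
#define smp_rmb_cond(x)                                                 \
	do {                                                            \
		if (x)                                                  \
			atomic_thread_fence(memory_order_acquire);      \
	} while (0)

/* Mirrors the PageUptodate() call site: later reads of the page data must
 * not be reordered before the flag load when the flag was seen set. */
static bool flag_is_set(_Atomic int *flag)
{
	int ret = atomic_load_explicit(flag, memory_order_relaxed);

	smp_rmb_cond(ret);
	return ret != 0;
}
```

On strongly ordered architectures such as x86, the fence on the taken path compiles down to a compiler barrier, which is why the patch can override smp_rmb_cond() there with a plain barrier() and drop the branch entirely.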
* Re: [PATCH] smp_rmb_cond
From: David Howells @ 2021-04-19 20:20 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: dhowells, linux-fsdevel
Matthew Wilcox <willy@infradead.org> wrote:
> I see worse inlining decisions from gcc with this. Maybe you see
> an improvement that would justify it?
>
> [ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99998]
Perhaps attach the patch to the bz, see if the compiler guys can recommend
anything?
David
* Re: [PATCH] smp_rmb_cond
From: Matthew Wilcox @ 2021-04-19 22:17 UTC (permalink / raw)
To: David Howells; +Cc: linux-fsdevel
On Mon, Apr 19, 2021 at 09:20:40PM +0100, David Howells wrote:
> Matthew Wilcox <willy@infradead.org> wrote:
>
> > I see worse inlining decisions from gcc with this. Maybe you see
> > an improvement that would justify it?
> >
> > [ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99998]
>
> Perhaps attach the patch to the bz, see if the compiler guys can recommend
> anything?
Your test case loses the bogus branch:
0000000000000000 <PageUptodate>:
0: 48 8b 47 08 mov 0x8(%rdi),%rax
4: a8 01 test $0x1,%al
6: 74 04 je c <PageUptodate+0xc>
8: 48 8d 78 ff lea -0x1(%rax),%rdi
c: 8b 07 mov (%rdi),%eax
e: 48 c1 e8 02 shr $0x2,%rax
12: 24 01 and $0x1,%al
14: 74 00 je 16 <PageUptodate+0x16>
16: c3 retq
0000000000000017 <Page2Uptodate>:
17: 48 8b 47 08 mov 0x8(%rdi),%rax
1b: a8 01 test $0x1,%al
1d: 74 04 je 23 <Page2Uptodate+0xc>
1f: 48 8d 78 ff lea -0x1(%rax),%rdi
23: 8b 07 mov (%rdi),%eax
25: 48 c1 e8 02 shr $0x2,%rax
29: 83 e0 01 and $0x1,%eax
2c: c3 retq
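[Editor's note: a hypothetical user-space reconstruction of what such a micro test could look like. The struct layout and flag bit are chosen only to match the listings above (head word at offset 8, tail pages tagged with bit 0, uptodate at bit 2); they are not the real kernel definitions:]

```c
#include <stdbool.h>

#define barrier()	__asm__ __volatile__("" ::: "memory")
#define smp_rmb()	barrier()	/* x86: read barrier is a compiler barrier */
#define smp_rmb_cond(x)	barrier()	/* the patch's x86 override */

struct page {
	unsigned long flags;		/* offset 0 */
	unsigned long compound_head;	/* offset 8 */
};

static struct page *page_head(struct page *page)
{
	unsigned long head = page->compound_head;

	if (head & 1)			/* tail page: head pointer | 1 */
		return (struct page *)(head - 1);
	return page;
}

/* Open-coded form: gcc still emits a branch around the empty barrier. */
bool PageUptodate(struct page *page)
{
	bool ret = (page_head(page)->flags >> 2) & 1;

	if (ret)
		smp_rmb();
	return ret;
}

/* smp_rmb_cond() form: the branch disappears, as in the second listing. */
bool Page2Uptodate(struct page *page)
{
	bool ret = (page_head(page)->flags >> 2) & 1;

	smp_rmb_cond(ret);
	return ret;
}
```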
But that means gcc then does more inlining into functions that
call PageUptodate():
$ ./scripts/bloat-o-meter filemap-before.o filemap-after.o
add/remove: 0/0 grow/shrink: 3/4 up/down: 179/-91 (88)
Function old new delta
mapping_seek_hole_data 1203 1347 +144
__lock_page_killable 394 426 +32
next_uptodate_page 603 606 +3
wait_on_page_bit_common 582 576 -6
filemap_get_pages 1530 1512 -18
do_read_cache_page 1031 1012 -19
filemap_read_page 261 213 -48
Total: Before=24603, After=24691, chg +0.36%
But maybe you have a metric that shows this winning at scale instead
of in a microbenchmark?