All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mm: Make PageType more efficient
@ 2020-03-10 18:56 Matthew Wilcox
  2020-03-10 20:17 ` Alexander Duyck
  2020-03-11  1:10 ` Andrew Morton
  0 siblings, 2 replies; 9+ messages in thread
From: Matthew Wilcox @ 2020-03-10 18:56 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, linux-mm; +Cc: Matthew Wilcox (Oracle)

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

PageType is a little hard for GCC to reason about, By checking
((~A) & flag) instead of (flag & (A | MASK) == MASK), GCC can do
better optimisations, saving 652 bytes in page_alloc.o (which is
a heavy user of PageBuddy).

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/page-flags.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 1bf83c8fcaa7..8fc0876e2794 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -725,14 +725,14 @@ PAGEFLAG_FALSE(DoubleMap)
 #define PG_table	0x00000400
 #define PG_guard	0x00000800
 
-#define PageType(page, flag)						\
-	((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
-
 static inline int page_has_type(struct page *page)
 {
 	return (int)page->page_type < PAGE_MAPCOUNT_RESERVE;
 }
 
+#define PageType(page, flag)						\
+	(page_has_type(page) && (~page->page_type & flag))
+
 #define PAGE_TYPE_OPS(uname, lname)					\
 static __always_inline int Page##uname(struct page *page)		\
 {									\
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Make PageType more efficient
  2020-03-10 18:56 [PATCH] mm: Make PageType more efficient Matthew Wilcox
@ 2020-03-10 20:17 ` Alexander Duyck
  2020-03-10 20:37   ` Matthew Wilcox
  2020-03-11  1:10 ` Andrew Morton
  1 sibling, 1 reply; 9+ messages in thread
From: Alexander Duyck @ 2020-03-10 20:17 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Andrew Morton, Vlastimil Babka, linux-mm

On Tue, Mar 10, 2020 at 11:56 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
>
> PageType is a little hard for GCC to reason about, By checking
> ((~A) & flag) instead of (flag & (A | MASK) == MASK), GCC can do
> better optimisations, saving 652 bytes in page_alloc.o (which is
> a heavy user of PageBuddy).
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/page-flags.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 1bf83c8fcaa7..8fc0876e2794 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -725,14 +725,14 @@ PAGEFLAG_FALSE(DoubleMap)
>  #define PG_table       0x00000400
>  #define PG_guard       0x00000800
>
> -#define PageType(page, flag)                                           \
> -       ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
> -
>  static inline int page_has_type(struct page *page)
>  {
>         return (int)page->page_type < PAGE_MAPCOUNT_RESERVE;
>  }
>
> +#define PageType(page, flag)                                           \
> +       (page_has_type(page) && (~page->page_type & flag))
> +
>  #define PAGE_TYPE_OPS(uname, lname)                                    \
>  static __always_inline int Page##uname(struct page *page)              \
>  {                                                                      \

If I recall all the page type is doing is clearing a single bit to
indicate the page type, and only one page type is supposed to be set
at a time correct?

Is there any reason why we couldn't just do an addition and test?
Basically just add the flag + 1 and see if the value rolls over to 0.
I would think that would reduce to an even simpler setup since that
would be an addition with a test for carry flag or zero.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Make PageType more efficient
  2020-03-10 20:17 ` Alexander Duyck
@ 2020-03-10 20:37   ` Matthew Wilcox
  2020-03-10 21:50     ` Alexander Duyck
  0 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2020-03-10 20:37 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Andrew Morton, Vlastimil Babka, linux-mm

On Tue, Mar 10, 2020 at 01:17:17PM -0700, Alexander Duyck wrote:
> On Tue, Mar 10, 2020 at 11:56 AM Matthew Wilcox <willy@infradead.org> wrote:
> > PageType is a little hard for GCC to reason about, By checking
> > ((~A) & flag) instead of (flag & (A | MASK) == MASK), GCC can do
> > better optimisations, saving 652 bytes in page_alloc.o (which is
> > a heavy user of PageBuddy).
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> >  include/linux/page-flags.h | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> > index 1bf83c8fcaa7..8fc0876e2794 100644
> > --- a/include/linux/page-flags.h
> > +++ b/include/linux/page-flags.h
> > @@ -725,14 +725,14 @@ PAGEFLAG_FALSE(DoubleMap)
> >  #define PG_table       0x00000400
> >  #define PG_guard       0x00000800
> >
> > -#define PageType(page, flag)                                           \
> > -       ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
> > -
> >  static inline int page_has_type(struct page *page)
> >  {
> >         return (int)page->page_type < PAGE_MAPCOUNT_RESERVE;
> >  }
> >
> > +#define PageType(page, flag)                                           \
> > +       (page_has_type(page) && (~page->page_type & flag))
> > +
> >  #define PAGE_TYPE_OPS(uname, lname)                                    \
> >  static __always_inline int Page##uname(struct page *page)              \
> >  {                                                                      \
> 
> If I recall all the page type is doing is clearing a single bit to
> indicate the page type, and only one page type is supposed to be set
> at a time correct?
> 
> Is there any reason why we couldn't just do an addition and test?
> Basically just add the flag + 1 and see if the value rolls over to 0.
> I would think that would reduce to an even simpler setup since that
> would be an addition with a test for carry flag or zero.

I think we already allow for both PageKmemcg and PageTable to be set
on the same page.  I don't want to stop people from being able to do
combinations like that in the future.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Make PageType more efficient
  2020-03-10 20:37   ` Matthew Wilcox
@ 2020-03-10 21:50     ` Alexander Duyck
  2020-03-11 13:13       ` Matthew Wilcox
  0 siblings, 1 reply; 9+ messages in thread
From: Alexander Duyck @ 2020-03-10 21:50 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Andrew Morton, Vlastimil Babka, linux-mm

On Tue, Mar 10, 2020 at 1:37 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Mar 10, 2020 at 01:17:17PM -0700, Alexander Duyck wrote:
> > On Tue, Mar 10, 2020 at 11:56 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > PageType is a little hard for GCC to reason about, By checking
> > > ((~A) & flag) instead of (flag & (A | MASK) == MASK), GCC can do
> > > better optimisations, saving 652 bytes in page_alloc.o (which is
> > > a heavy user of PageBuddy).
> > >
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > ---
> > >  include/linux/page-flags.h | 6 +++---
> > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> > > index 1bf83c8fcaa7..8fc0876e2794 100644
> > > --- a/include/linux/page-flags.h
> > > +++ b/include/linux/page-flags.h
> > > @@ -725,14 +725,14 @@ PAGEFLAG_FALSE(DoubleMap)
> > >  #define PG_table       0x00000400
> > >  #define PG_guard       0x00000800
> > >
> > > -#define PageType(page, flag)                                           \
> > > -       ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
> > > -

From what I can tell this is the only consumer of PAGE_TYPE_BASE.
Since it is removed you can probably remove that definition as well.

> > >  static inline int page_has_type(struct page *page)
> > >  {
> > >         return (int)page->page_type < PAGE_MAPCOUNT_RESERVE;
> > >  }
> > >
> > > +#define PageType(page, flag)                                           \
> > > +       (page_has_type(page) && (~page->page_type & flag))
> > > +

You can probably spare a cycle or two here by testing for
"!(page->page_type & flag)". That way you avoid the extra bit flipping
since the compiler can just handle the result of the AND op as it sees
fit.

> > >  #define PAGE_TYPE_OPS(uname, lname)                                    \
> > >  static __always_inline int Page##uname(struct page *page)              \
> > >  {                                                                      \
> >
> > If I recall all the page type is doing is clearing a single bit to
> > indicate the page type, and only one page type is supposed to be set
> > at a time correct?
> >
> > Is there any reason why we couldn't just do an addition and test?
> > Basically just add the flag + 1 and see if the value rolls over to 0.
> > I would think that would reduce to an even simpler setup since that
> > would be an addition with a test for carry flag or zero.
>
> I think we already allow for both PageKmemcg and PageTable to be set
> on the same page.  I don't want to stop people from being able to do
> combinations like that in the future.

Okay, i wasn't aware of that. So that prevents us from simplifying this further.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Make PageType more efficient
  2020-03-10 18:56 [PATCH] mm: Make PageType more efficient Matthew Wilcox
  2020-03-10 20:17 ` Alexander Duyck
@ 2020-03-11  1:10 ` Andrew Morton
  2020-03-11 13:21   ` Matthew Wilcox
  1 sibling, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2020-03-11  1:10 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Vlastimil Babka, linux-mm, Alexander Duyck

On Tue, 10 Mar 2020 11:56:09 -0700 Matthew Wilcox <willy@infradead.org> wrote:

> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> PageType is a little hard for GCC to reason about, By checking
> ((~A) & flag) instead of (flag & (A | MASK) == MASK), GCC can do
> better optimisations, saving 652 bytes in page_alloc.o (which is
> a heavy user of PageBuddy).
> 
> ...
>
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -725,14 +725,14 @@ PAGEFLAG_FALSE(DoubleMap)
>  #define PG_table	0x00000400
>  #define PG_guard	0x00000800
>  
> -#define PageType(page, flag)						\
> -	((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
> -
>  static inline int page_has_type(struct page *page)
>  {
>  	return (int)page->page_type < PAGE_MAPCOUNT_RESERVE;
>  }
>  
> +#define PageType(page, flag)						\
> +	(page_has_type(page) && (~page->page_type & flag))
> +

`flag' should be parenthesized here.  As should `page' if one is even
more paranoid.

I tried this:

--- a/include/linux/page-flags.h~mm-make-pagetype-more-efficient-fix
+++ a/include/linux/page-flags.h
@@ -731,7 +731,7 @@ static inline int page_has_type(struct p
 }
 
 #define PageType(page, flag)						\
-	(page_has_type(page) && (~page->page_type & flag))
+	(page_has_type(page) && !(page->page_type & ~(flag)))
 
 #define PAGE_TYPE_OPS(uname, lname)					\
 static __always_inline int Page##uname(struct page *page)		\

and page_alloc.o shrunk by 6782 bytes, half of it in
get_page_from_freelist().  Something crazy is going on.  gcc-7.2.0.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Make PageType more efficient
  2020-03-10 21:50     ` Alexander Duyck
@ 2020-03-11 13:13       ` Matthew Wilcox
  2020-03-11 17:14         ` Ira Weiny
  2020-03-11 17:22         ` Alexander Duyck
  0 siblings, 2 replies; 9+ messages in thread
From: Matthew Wilcox @ 2020-03-11 13:13 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Andrew Morton, Vlastimil Babka, linux-mm

On Tue, Mar 10, 2020 at 02:50:50PM -0700, Alexander Duyck wrote:
> On Tue, Mar 10, 2020 at 1:37 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > > -#define PageType(page, flag)                                           \
> > > > -       ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
> > > > -
> 
> >From what I can tell this is the only consumer of PAGE_TYPE_BASE.
> Since it is removed you can probably remove that definition as well.

I _could_ ... I do want to indicate to people that they probably
shouldn't use those bits in order to leave space for overflow and
wraparound of _mapcount.

> > > > +#define PageType(page, flag)                                           \
> > > > +       (page_has_type(page) && (~page->page_type & flag))
> 
> You can probably spare a cycle or two here by testing for
> "!(page->page_type & flag)". That way you avoid the extra bit flipping
> since the compiler can just handle the result of the AND op as it sees
> fit.

GCC already knows to do that optimisation; mm/page_alloc.o is identical
(same md5sum) when changing from (~page->page_type & flag) to
!(page->page_type & flag).  So it's just a question of which one is
easier for humans to read and reason about.  Do you have an opinion
which one you'd like to see?



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Make PageType more efficient
  2020-03-11  1:10 ` Andrew Morton
@ 2020-03-11 13:21   ` Matthew Wilcox
  0 siblings, 0 replies; 9+ messages in thread
From: Matthew Wilcox @ 2020-03-11 13:21 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Vlastimil Babka, linux-mm, Alexander Duyck

On Tue, Mar 10, 2020 at 06:10:27PM -0700, Andrew Morton wrote:
> On Tue, 10 Mar 2020 11:56:09 -0700 Matthew Wilcox <willy@infradead.org> wrote:
> 
> > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> > 
> > PageType is a little hard for GCC to reason about, By checking
> > ((~A) & flag) instead of (flag & (A | MASK) == MASK), GCC can do
> > better optimisations, saving 652 bytes in page_alloc.o (which is
> > a heavy user of PageBuddy).
> > 
> > ...
> >
> > --- a/include/linux/page-flags.h
> > +++ b/include/linux/page-flags.h
> > @@ -725,14 +725,14 @@ PAGEFLAG_FALSE(DoubleMap)
> >  #define PG_table	0x00000400
> >  #define PG_guard	0x00000800
> >  
> > -#define PageType(page, flag)						\
> > -	((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
> > -
> >  static inline int page_has_type(struct page *page)
> >  {
> >  	return (int)page->page_type < PAGE_MAPCOUNT_RESERVE;
> >  }
> >  
> > +#define PageType(page, flag)						\
> > +	(page_has_type(page) && (~page->page_type & flag))
> > +
> 
> `flag' should be parenthesized here.  As should `page' if one is even
> more paranoid.

I suppose so.  The thing is, the flag name isn't specified by the user;
users say 'if (PageBuddy(page))' and the flag gets expanded through
PAGE_TYPE_OPS.  We also don't need to worry about the user doing 'if
(PageBuddy(page++))' as it'll expand into a call to

static __always_inline int PageBuddy(struct page *page)
{
	return PageType(page, PG_buddy)
}

> I tried this:
> 
> --- a/include/linux/page-flags.h~mm-make-pagetype-more-efficient-fix
> +++ a/include/linux/page-flags.h
> @@ -731,7 +731,7 @@ static inline int page_has_type(struct p
>  }
>  
>  #define PageType(page, flag)						\
> -	(page_has_type(page) && (~page->page_type & flag))
> +	(page_has_type(page) && !(page->page_type & ~(flag)))

Mr Boole would like to let you know you misapplied his rules of algebra.
~flag is going to expand into 0xfffffff7f and then gcc will figure out
the expression as a whole is always false, and so no page is ever a
buddy page, and ..

> and page_alloc.o shrunk by 6782 bytes, half of it in
> get_page_from_freelist().  Something crazy is going on.  gcc-7.2.0.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Make PageType more efficient
  2020-03-11 13:13       ` Matthew Wilcox
@ 2020-03-11 17:14         ` Ira Weiny
  2020-03-11 17:22         ` Alexander Duyck
  1 sibling, 0 replies; 9+ messages in thread
From: Ira Weiny @ 2020-03-11 17:14 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Alexander Duyck, Andrew Morton, Vlastimil Babka, linux-mm

On Wed, Mar 11, 2020 at 06:13:04AM -0700, Matthew Wilcox wrote:
> On Tue, Mar 10, 2020 at 02:50:50PM -0700, Alexander Duyck wrote:
> > On Tue, Mar 10, 2020 at 1:37 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > > > -#define PageType(page, flag)                                           \
> > > > > -       ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
> > > > > -
> > 
> > >From what I can tell this is the only consumer of PAGE_TYPE_BASE.
> > Since it is removed you can probably remove that definition as well.
> 
> I _could_ ... I do want to indicate to people that they probably
> shouldn't use those bits in order to leave space for overflow and
> wraparound of _mapcount.
> 
> > > > > +#define PageType(page, flag)                                           \
> > > > > +       (page_has_type(page) && (~page->page_type & flag))
> > 
> > You can probably spare a cycle or two here by testing for
> > "!(page->page_type & flag)". That way you avoid the extra bit flipping
> > since the compiler can just handle the result of the AND op as it sees
> > fit.
> 
> GCC already knows to do that optimisation; mm/page_alloc.o is identical
> (same md5sum) when changing from (~page->page_type & flag) to
> !(page->page_type & flag).  So it's just a question of which one is

No dog in this fight...  But for simpletons like me...

!(page->page_type & flag)

... makes much more sense,
Ira

> easier for humans to read and reason about.  Do you have an opinion
> which one you'd like to see?
> 
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm: Make PageType more efficient
  2020-03-11 13:13       ` Matthew Wilcox
  2020-03-11 17:14         ` Ira Weiny
@ 2020-03-11 17:22         ` Alexander Duyck
  1 sibling, 0 replies; 9+ messages in thread
From: Alexander Duyck @ 2020-03-11 17:22 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Andrew Morton, Vlastimil Babka, linux-mm

On Wed, Mar 11, 2020 at 6:13 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Mar 10, 2020 at 02:50:50PM -0700, Alexander Duyck wrote:
> > On Tue, Mar 10, 2020 at 1:37 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > > > -#define PageType(page, flag)                                           \
> > > > > -       ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
> > > > > -
> >
> > >From what I can tell this is the only consumer of PAGE_TYPE_BASE.
> > Since it is removed you can probably remove that definition as well.
>
> I _could_ ... I do want to indicate to people that they probably
> shouldn't use those bits in order to leave space for overflow and
> wraparound of _mapcount.

You already have some of this with the check against
PAGE_MAPCOUNT_RESERVE. That is guaranteeing the sign bit is set and
that at least one of the bits has been cleared as otherwise it would
be equal to 128. The bit you aren't enforcing though is the upper 4
bits. It might make sense to just replace the #define with a comment
further down that the upper 4 bit are reserved.

> > > > > +#define PageType(page, flag)                                           \
> > > > > +       (page_has_type(page) && (~page->page_type & flag))
> >
> > You can probably spare a cycle or two here by testing for
> > "!(page->page_type & flag)". That way you avoid the extra bit flipping
> > since the compiler can just handle the result of the AND op as it sees
> > fit.
>
> GCC already knows to do that optimisation; mm/page_alloc.o is identical
> (same md5sum) when changing from (~page->page_type & flag) to
> !(page->page_type & flag).  So it's just a question of which one is
> easier for humans to read and reason about.  Do you have an opinion
> which one you'd like to see?

So it looks like Andrew and Ira are kind of thinking the same way I am.

Also one other thing that occurred to me is that by breaking it up the
way you did I think it might open things up for possible races since
the flag and upper bits are being done as two separate checks. It
might make sense to just combine this all into one function and use
the READ_ONCE macro on the read of the page_type so that you can be
guaranteed that the reading of the variable is atomic.


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-03-11 17:23 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-10 18:56 [PATCH] mm: Make PageType more efficient Matthew Wilcox
2020-03-10 20:17 ` Alexander Duyck
2020-03-10 20:37   ` Matthew Wilcox
2020-03-10 21:50     ` Alexander Duyck
2020-03-11 13:13       ` Matthew Wilcox
2020-03-11 17:14         ` Ira Weiny
2020-03-11 17:22         ` Alexander Duyck
2020-03-11  1:10 ` Andrew Morton
2020-03-11 13:21   ` Matthew Wilcox

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.