linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* Bogus struct page layout on 32-bit
       [not found] ` <202104100656.N7EVvkNZ-lkp@intel.com>
@ 2021-04-10  2:43   ` Matthew Wilcox
  2021-04-10  6:21     ` Jesper Dangaard Brouer
                       ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Matthew Wilcox @ 2021-04-10  2:43 UTC (permalink / raw)
  To: kernel test robot
  Cc: linux-mm, kbuild-all, clang-built-linux, linux-kernel,
	linux-fsdevel, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, linuxppc-dev, linux-arm-kernel,
	Jesper Dangaard Brouer, David S. Miller

On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
> >> include/linux/mm_types.h:274:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == offsetof(struct folio, lru)"
>    FOLIO_MATCH(lru, lru);
>    include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
>            static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))

Well, this is interesting.  pahole reports:

struct page {
        long unsigned int          flags;                /*     0     4 */
        /* XXX 4 bytes hole, try to pack */
        union {
                struct {
                        struct list_head lru;            /*     8     8 */
...
struct folio {
        union {
                struct {
                        long unsigned int flags;         /*     0     4 */
                        struct list_head lru;            /*     4     8 */

so this assert has absolutely done its job.

But why has this assert triggered?  Why is struct page layout not what
we thought it was?  Turns out it's the dma_addr added in 2019 by commit
c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
the whole union gets moved out by 4 bytes.

Unfortunately, we can't just fix this by putting an 'unsigned long pad'
in front of it.  It still aligns the entire union to 8 bytes, and then
it skips another 4 bytes after the pad.

We can fix it like this ...

+++ b/include/linux/mm_types.h
@@ -96,11 +96,12 @@ struct page {
                        unsigned long private;
                };
                struct {        /* page_pool used by netstack */
+                       unsigned long _page_pool_pad;
                        /**
                         * @dma_addr: might require a 64-bit value even on
                         * 32-bit architectures.
                         */
-                       dma_addr_t dma_addr;
+                       dma_addr_t dma_addr __packed;
                };
                struct {        /* slab, slob and slub */
                        union {

but I don't know if GCC is smart enough to realise that dma_addr is now
on an 8 byte boundary and it can use a normal instruction to access it,
or whether it'll do something daft like use byte loads to access it.

We could also do:

+                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));

and I see pahole, at least sees this correctly:

                struct {
                        long unsigned int _page_pool_pad; /*     4     4 */
                        dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
                } __attribute__((__packed__)) __attribute__((__aligned__(4)));  

This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
/ dma_addr_t.  Advice, please?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10  2:43   ` Bogus struct page layout on 32-bit Matthew Wilcox
@ 2021-04-10  6:21     ` Jesper Dangaard Brouer
  2021-04-10  8:52       ` Ilias Apalodimas
  2021-04-10 14:17     ` David Laight
  2021-04-10 19:10     ` Arnd Bergmann
  2 siblings, 1 reply; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2021-04-10  6:21 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: kernel test robot, linux-mm, kbuild-all, clang-built-linux,
	linux-kernel, linux-fsdevel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev,
	linux-arm-kernel, David S. Miller, brouer, Ilias Apalodimas,
	Ivan Khoronzhuk, Matteo Croce, netdev

On Sat, 10 Apr 2021 03:43:13 +0100
Matthew Wilcox <willy@infradead.org> wrote:

> On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
> > >> include/linux/mm_types.h:274:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == offsetof(struct folio, lru)"  
> >    FOLIO_MATCH(lru, lru);
> >    include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
> >            static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))  
> 
> Well, this is interesting.  pahole reports:
> 
> struct page {
>         long unsigned int          flags;                /*     0     4 */
>         /* XXX 4 bytes hole, try to pack */
>         union {
>                 struct {
>                         struct list_head lru;            /*     8     8 */
> ...
> struct folio {
>         union {
>                 struct {
>                         long unsigned int flags;         /*     0     4 */
>                         struct list_head lru;            /*     4     8 */
> 
> so this assert has absolutely done its job.
> 
> But why has this assert triggered?  Why is struct page layout not what
> we thought it was?  Turns out it's the dma_addr added in 2019 by commit
> c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
> config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
> the whole union gets moved out by 4 bytes.

Argh, good that you are catching this!

> Unfortunately, we can't just fix this by putting an 'unsigned long pad'
> in front of it.  It still aligns the entire union to 8 bytes, and then
> it skips another 4 bytes after the pad.
> 
> We can fix it like this ...
> 
> +++ b/include/linux/mm_types.h
> @@ -96,11 +96,12 @@ struct page {
>                         unsigned long private;
>                 };
>                 struct {        /* page_pool used by netstack */
> +                       unsigned long _page_pool_pad;

I'm fine with this pad.  Matteo is currently proposing[1] to add a 32-bit
value after @dma_addr, and he could use this area instead.

[1] https://lore.kernel.org/netdev/20210409223801.104657-3-mcroce@linux.microsoft.com/

When adding/changing this, we need to make sure that it doesn't overlap
member @index, because network stack use/check page_is_pfmemalloc().
As far as my calculations this is safe to add.  I always try to keep an
eye out for this, but I wonder if we could have a build check like yours.


>                         /**
>                          * @dma_addr: might require a 64-bit value even on
>                          * 32-bit architectures.
>                          */
> -                       dma_addr_t dma_addr;
> +                       dma_addr_t dma_addr __packed;
>                 };
>                 struct {        /* slab, slob and slub */
>                         union {
> 
> but I don't know if GCC is smart enough to realise that dma_addr is now
> on an 8 byte boundary and it can use a normal instruction to access it,
> or whether it'll do something daft like use byte loads to access it.
> 
> We could also do:
> 
> +                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));
> 
> and I see pahole, at least sees this correctly:
> 
>                 struct {
>                         long unsigned int _page_pool_pad; /*     4     4 */
>                         dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
>                 } __attribute__((__packed__)) __attribute__((__aligned__(4)));  
> 
> This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> / dma_addr_t.  Advice, please?

I'm not sure that the 32-bit behavior is with 64-bit (dma) addrs.

I don't have any 32-bit boards with 64-bit DMA.  Cc. Ivan, wasn't your
board (572x ?) 32-bit with driver 'cpsw' this case (where Ivan added
XDP+page_pool) ?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10  6:21     ` Jesper Dangaard Brouer
@ 2021-04-10  8:52       ` Ilias Apalodimas
  2021-04-10 14:06         ` Matthew Wilcox
  2021-04-16  9:26         ` Grygorii Strashko
  0 siblings, 2 replies; 11+ messages in thread
From: Ilias Apalodimas @ 2021-04-10  8:52 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Matthew Wilcox, kernel test robot, Linux-MM, kbuild-all,
	clang-built-linux, open list, linux-fsdevel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev, Linux ARM,
	David S. Miller, Ivan Khoronzhuk, Matteo Croce, netdev,
	Grygorii Strashko

+CC Grygorii for the cpsw part as Ivan's email is not valid anymore

Thanks for catching this. Interesting indeed...

On Sat, 10 Apr 2021 at 09:22, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>
> On Sat, 10 Apr 2021 03:43:13 +0100
> Matthew Wilcox <willy@infradead.org> wrote:
>
> > On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
> > > >> include/linux/mm_types.h:274:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == offsetof(struct folio, lru)"
> > >    FOLIO_MATCH(lru, lru);
> > >    include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
> > >            static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
> >
> > Well, this is interesting.  pahole reports:
> >
> > struct page {
> >         long unsigned int          flags;                /*     0     4 */
> >         /* XXX 4 bytes hole, try to pack */
> >         union {
> >                 struct {
> >                         struct list_head lru;            /*     8     8 */
> > ...
> > struct folio {
> >         union {
> >                 struct {
> >                         long unsigned int flags;         /*     0     4 */
> >                         struct list_head lru;            /*     4     8 */
> >
> > so this assert has absolutely done its job.
> >
> > But why has this assert triggered?  Why is struct page layout not what
> > we thought it was?  Turns out it's the dma_addr added in 2019 by commit
> > c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
> > config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
> > the whole union gets moved out by 4 bytes.
>
> Argh, good that you are catching this!
>
> > Unfortunately, we can't just fix this by putting an 'unsigned long pad'
> > in front of it.  It still aligns the entire union to 8 bytes, and then
> > it skips another 4 bytes after the pad.
> >
> > We can fix it like this ...
> >
> > +++ b/include/linux/mm_types.h
> > @@ -96,11 +96,12 @@ struct page {
> >                         unsigned long private;
> >                 };
> >                 struct {        /* page_pool used by netstack */
> > +                       unsigned long _page_pool_pad;
>
> I'm fine with this pad.  Matteo is currently proposing[1] to add a 32-bit
> value after @dma_addr, and he could use this area instead.
>
> [1] https://lore.kernel.org/netdev/20210409223801.104657-3-mcroce@linux.microsoft.com/
>
> When adding/changing this, we need to make sure that it doesn't overlap
> member @index, because network stack use/check page_is_pfmemalloc().
> As far as my calculations this is safe to add.  I always try to keep an
> eye out for this, but I wonder if we could have a build check like yours.
>
>
> >                         /**
> >                          * @dma_addr: might require a 64-bit value even on
> >                          * 32-bit architectures.
> >                          */
> > -                       dma_addr_t dma_addr;
> > +                       dma_addr_t dma_addr __packed;
> >                 };
> >                 struct {        /* slab, slob and slub */
> >                         union {
> >
> > but I don't know if GCC is smart enough to realise that dma_addr is now
> > on an 8 byte boundary and it can use a normal instruction to access it,
> > or whether it'll do something daft like use byte loads to access it.
> >
> > We could also do:
> >
> > +                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));
> >
> > and I see pahole, at least sees this correctly:
> >
> >                 struct {
> >                         long unsigned int _page_pool_pad; /*     4     4 */
> >                         dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
> >                 } __attribute__((__packed__)) __attribute__((__aligned__(4)));
> >
> > This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> > / dma_addr_t.  Advice, please?
>
> I'm not sure that the 32-bit behavior is with 64-bit (dma) addrs.
>
> I don't have any 32-bit boards with 64-bit DMA.  Cc. Ivan, wasn't your
> board (572x ?) 32-bit with driver 'cpsw' this case (where Ivan added
> XDP+page_pool) ?
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10  8:52       ` Ilias Apalodimas
@ 2021-04-10 14:06         ` Matthew Wilcox
  2021-04-10 15:54           ` Russell King - ARM Linux admin
  2021-04-16  9:26         ` Grygorii Strashko
  1 sibling, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2021-04-10 14:06 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: Jesper Dangaard Brouer, kernel test robot, Linux-MM, kbuild-all,
	clang-built-linux, open list, linux-fsdevel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev, Linux ARM,
	David S. Miller, Ivan Khoronzhuk, Matteo Croce, netdev,
	Grygorii Strashko

How about moving the flags into the union?  A bit messy, but we don't
have to play games with __packed__.

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 1210a8e41fad..f374d2f06255 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -68,16 +68,22 @@ struct mem_cgroup;
 #endif
 
 struct page {
-	unsigned long flags;		/* Atomic flags, some possibly
-					 * updated asynchronously */
 	/*
-	 * Five words (20/40 bytes) are available in this union.
-	 * WARNING: bit 0 of the first word is used for PageTail(). That
-	 * means the other users of this union MUST NOT use the bit to
+	 * This union is six words (24 / 48 bytes) in size.
+	 * The first word is reserved for atomic flags, often updated
+	 * asynchronously.  Use the PageFoo() macros to access it.  Some
+	 * of the flags can be reused for your own purposes, but the
+	 * word as a whole often contains other information and overwriting
+	 * it will cause functions like page_zone() and page_node() to stop
+	 * working correctly.
+	 *
+	 * Bit 0 of the second word is used for PageTail(). That
+	 * means the other users of this union MUST leave the bit zero to
 	 * avoid collision and false-positive PageTail().
 	 */
 	union {
 		struct {	/* Page cache and anonymous pages */
+			unsigned long flags;
 			/**
 			 * @lru: Pageout list, eg. active_list protected by
 			 * lruvec->lru_lock.  Sometimes used as a generic list
@@ -96,6 +102,8 @@ struct page {
 			unsigned long private;
 		};
 		struct {	/* page_pool used by netstack */
+			unsigned long _pp_flags;
+			unsigned long _pp_pad;
 			/**
 			 * @dma_addr: might require a 64-bit value even on
 			 * 32-bit architectures.
@@ -103,6 +111,7 @@ struct page {
 			dma_addr_t dma_addr;
 		};
 		struct {	/* slab, slob and slub */
+			unsigned long _slab_flags;
 			union {
 				struct list_head slab_list;
 				struct {	/* Partial pages */
@@ -130,6 +139,7 @@ struct page {
 			};
 		};
 		struct {	/* Tail pages of compound page */
+			unsigned long _tail1_flags;
 			unsigned long compound_head;	/* Bit zero is set */
 
 			/* First tail page only */
@@ -139,12 +149,14 @@ struct page {
 			unsigned int compound_nr; /* 1 << compound_order */
 		};
 		struct {	/* Second tail page of compound page */
+			unsigned long _tail2_flags;
 			unsigned long _compound_pad_1;	/* compound_head */
 			atomic_t hpage_pinned_refcount;
 			/* For both global and memcg */
 			struct list_head deferred_list;
 		};
 		struct {	/* Page table pages */
+			unsigned long _pt_flags;
 			unsigned long _pt_pad_1;	/* compound_head */
 			pgtable_t pmd_huge_pte; /* protected by page->ptl */
 			unsigned long _pt_pad_2;	/* mapping */
@@ -159,6 +171,7 @@ struct page {
 #endif
 		};
 		struct {	/* ZONE_DEVICE pages */
+			unsigned long _zd_flags;
 			/** @pgmap: Points to the hosting device page map. */
 			struct dev_pagemap *pgmap;
 			void *zone_device_data;
@@ -174,8 +187,11 @@ struct page {
 			 */
 		};
 
-		/** @rcu_head: You can use this to free a page by RCU. */
-		struct rcu_head rcu_head;
+		struct {
+			unsigned long _rcu_flags;
+			/** @rcu_head: You can use this to free a page by RCU. */
+			struct rcu_head rcu_head;
+		};
 	};
 
 	union {		/* This union is 4 bytes in size. */

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* RE: Bogus struct page layout on 32-bit
  2021-04-10  2:43   ` Bogus struct page layout on 32-bit Matthew Wilcox
  2021-04-10  6:21     ` Jesper Dangaard Brouer
@ 2021-04-10 14:17     ` David Laight
  2021-04-10 19:10     ` Arnd Bergmann
  2 siblings, 0 replies; 11+ messages in thread
From: David Laight @ 2021-04-10 14:17 UTC (permalink / raw)
  To: 'Matthew Wilcox', kernel test robot
  Cc: linux-mm, kbuild-all, clang-built-linux, linux-kernel,
	linux-fsdevel, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, linuxppc-dev, linux-arm-kernel,
	Jesper Dangaard Brouer, David S. Miller

From: Matthew Wilcox
> Sent: 10 April 2021 03:43
> On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
> > >> include/linux/mm_types.h:274:1: error: static_assert failed due to requirement
> '__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page,
> lru) == offsetof(struct folio, lru)"
> >    FOLIO_MATCH(lru, lru);
> >    include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
> >            static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
> 
> Well, this is interesting.  pahole reports:
> 
> struct page {
>         long unsigned int          flags;                /*     0     4 */
>         /* XXX 4 bytes hole, try to pack */
>         union {
>                 struct {
>                         struct list_head lru;            /*     8     8 */
> ...
> struct folio {
>         union {
>                 struct {
>                         long unsigned int flags;         /*     0     4 */
>                         struct list_head lru;            /*     4     8 */
> 
> so this assert has absolutely done its job.
> 
> But why has this assert triggered?  Why is struct page layout not what
> we thought it was?  Turns out it's the dma_addr added in 2019 by commit
> c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
> config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
> the whole union gets moved out by 4 bytes.
> 
> Unfortunately, we can't just fix this by putting an 'unsigned long pad'
> in front of it.  It still aligns the entire union to 8 bytes, and then
> it skips another 4 bytes after the pad.
> 
> We can fix it like this ...
> 
> +++ b/include/linux/mm_types.h
> @@ -96,11 +96,12 @@ struct page {
>                         unsigned long private;
>                 };
>                 struct {        /* page_pool used by netstack */
> +                       unsigned long _page_pool_pad;
>                         /**
>                          * @dma_addr: might require a 64-bit value even on
>                          * 32-bit architectures.
>                          */
> -                       dma_addr_t dma_addr;
> +                       dma_addr_t dma_addr __packed;
>                 };
>                 struct {        /* slab, slob and slub */
>                         union {
> 
> but I don't know if GCC is smart enough to realise that dma_addr is now
> on an 8 byte boundary and it can use a normal instruction to access it,
> or whether it'll do something daft like use byte loads to access it.

I'm betting it will use byte accesses.
Checked - it does seem to use 4-byte accesses.
(godbolt is rather short of 32 bit compilers...)

> 
> We could also do:
> 
> +                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));

I wonder if __aligned(n) should be defined as
	__attribute__((packed,aligned(n))
I don't think you ever want the 'unpacked' variant.

But explicitly reducing the alignment of single member is much
better than the habit of marking the structure 'packed'.

(Never mind the habit of adding __packed 'because we don't want
the compiler to add random padding.)

> 
> and I see pahole, at least sees this correctly:
> 
>                 struct {
>                         long unsigned int _page_pool_pad; /*     4     4 */
>                         dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
>                 } __attribute__((__packed__)) __attribute__((__aligned__(4)));

Is the attribute on the struct an artifact of pahole?
It should just have an alignment of 4 without anything special.

> 
> This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> / dma_addr_t.  Advice, please?

Only those where a 64-bit value is 64-bit aligned.
So that excludes x86 (which can have 64-bit dma) but includes sparc32
(which probably doesn't).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10 14:06         ` Matthew Wilcox
@ 2021-04-10 15:54           ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 11+ messages in thread
From: Russell King - ARM Linux admin @ 2021-04-10 15:54 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Ilias Apalodimas, Jesper Dangaard Brouer, kernel test robot,
	Linux-MM, kbuild-all, clang-built-linux, open list,
	linux-fsdevel, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, linuxppc-dev, Linux ARM, David S. Miller,
	Ivan Khoronzhuk, Matteo Croce, netdev, Grygorii Strashko

On Sat, Apr 10, 2021 at 03:06:52PM +0100, Matthew Wilcox wrote:
> How about moving the flags into the union?  A bit messy, but we don't
> have to play games with __packed__.

Yes, that is probably the better solution, avoiding the games to try
and get the union appropriately placed on 32-bit systems.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10  2:43   ` Bogus struct page layout on 32-bit Matthew Wilcox
  2021-04-10  6:21     ` Jesper Dangaard Brouer
  2021-04-10 14:17     ` David Laight
@ 2021-04-10 19:10     ` Arnd Bergmann
  2021-04-11 22:35       ` Matthew Wilcox
  2 siblings, 1 reply; 11+ messages in thread
From: Arnd Bergmann @ 2021-04-10 19:10 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: kernel test robot, Linux-MM, kbuild-all, clang-built-linux,
	Linux Kernel Mailing List, Linux FS-devel Mailing List,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	linuxppc-dev, Linux ARM, Jesper Dangaard Brouer, David S. Miller

On Sat, Apr 10, 2021 at 4:44 AM Matthew Wilcox <willy@infradead.org> wrote:
> +                       dma_addr_t dma_addr __packed;
>                 };
>                 struct {        /* slab, slob and slub */
>                         union {
>
> but I don't know if GCC is smart enough to realise that dma_addr is now
> on an 8 byte boundary and it can use a normal instruction to access it,
> or whether it'll do something daft like use byte loads to access it.
>
> We could also do:
>
> +                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));
>
> and I see pahole, at least sees this correctly:
>
>                 struct {
>                         long unsigned int _page_pool_pad; /*     4     4 */
>                         dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
>                 } __attribute__((__packed__)) __attribute__((__aligned__(4)));
>
> This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> / dma_addr_t.  Advice, please?

I've tried out what gcc would make of this:  https://godbolt.org/z/aTEbxxbG3

struct page {
    short a;
    struct {
        short b;
        long long c __attribute__((packed, aligned(2)));
    } __attribute__((packed));
} __attribute__((aligned(8)));

In this structure, 'c' is clearly aligned to eight bytes, and gcc does
realize that
it is safe to use the 'ldrd' instruction for 32-bit arm, which is forbidden on
struct members with less than 4 byte alignment. However, it also complains
that passing a pointer to 'c' into a function that expects a 'long long' is not
allowed because alignof(c) is only '2' here.

(I used 'short' here because I having a 64-bit member misaligned by four
bytes wouldn't make a difference to the instructions on Arm, or any other
32-bit architecture I can think of, regardless of the ABI requirements).

      Arnd

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10 19:10     ` Arnd Bergmann
@ 2021-04-11 22:35       ` Matthew Wilcox
  0 siblings, 0 replies; 11+ messages in thread
From: Matthew Wilcox @ 2021-04-11 22:35 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: kernel test robot, Linux-MM, kbuild-all, clang-built-linux,
	Linux Kernel Mailing List, Linux FS-devel Mailing List,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	linuxppc-dev, Linux ARM, Jesper Dangaard Brouer, David S. Miller

On Sat, Apr 10, 2021 at 09:10:47PM +0200, Arnd Bergmann wrote:
> On Sat, Apr 10, 2021 at 4:44 AM Matthew Wilcox <willy@infradead.org> wrote:
> > +                       dma_addr_t dma_addr __packed;
> >                 };
> >                 struct {        /* slab, slob and slub */
> >                         union {
> >
> > but I don't know if GCC is smart enough to realise that dma_addr is now
> > on an 8 byte boundary and it can use a normal instruction to access it,
> > or whether it'll do something daft like use byte loads to access it.
> >
> > We could also do:
> >
> > +                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));
> >
> > and I see pahole, at least sees this correctly:
> >
> >                 struct {
> >                         long unsigned int _page_pool_pad; /*     4     4 */
> >                         dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
> >                 } __attribute__((__packed__)) __attribute__((__aligned__(4)));
> >
> > This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> > / dma_addr_t.  Advice, please?
> 
> I've tried out what gcc would make of this:  https://godbolt.org/z/aTEbxxbG3
> 
> struct page {
>     short a;
>     struct {
>         short b;
>         long long c __attribute__((packed, aligned(2)));
>     } __attribute__((packed));
> } __attribute__((aligned(8)));
> 
> In this structure, 'c' is clearly aligned to eight bytes, and gcc does
> realize that
> it is safe to use the 'ldrd' instruction for 32-bit arm, which is forbidden on
> struct members with less than 4 byte alignment. However, it also complains
> that passing a pointer to 'c' into a function that expects a 'long long' is not
> allowed because alignof(c) is only '2' here.
> 
> (I used 'short' here because I having a 64-bit member misaligned by four
> bytes wouldn't make a difference to the instructions on Arm, or any other
> 32-bit architecture I can think of, regardless of the ABI requirements).

So ... we could do this:

+++ b/include/linux/types.h
@@ -140,7 +140,7 @@ typedef u64 blkcnt_t;
  * so they don't care about the size of the actual bus addresses.
  */
 #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
-typedef u64 dma_addr_t;
+typedef u64 __attribute__((aligned(sizeof(void *)))) dma_addr_t;
 #else
 typedef u32 dma_addr_t;
 #endif

but I'm a little scared that this might have unintended consequences.
And Jesper points out that a big-endian 64-bit dma_addr_t can impersonate
a PageTail page, and we should solve that problem while we're at it.
So I don't think we should do this, but thought I should mention it as
a possibility.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10  8:52       ` Ilias Apalodimas
  2021-04-10 14:06         ` Matthew Wilcox
@ 2021-04-16  9:26         ` Grygorii Strashko
  2021-04-16 14:10           ` Arnd Bergmann
  2021-04-17 13:08           ` David Laight
  1 sibling, 2 replies; 11+ messages in thread
From: Grygorii Strashko @ 2021-04-16  9:26 UTC (permalink / raw)
  To: Ilias Apalodimas, Jesper Dangaard Brouer, Christoph Hellwig
  Cc: Matthew Wilcox, kernel test robot, Linux-MM, kbuild-all,
	clang-built-linux, open list, linux-fsdevel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev, Linux ARM,
	David S. Miller, Matteo Croce, netdev

Hi Ilias, All,

On 10/04/2021 11:52, Ilias Apalodimas wrote:
> +CC Grygorii for the cpsw part as Ivan's email is not valid anymore
> 
> Thanks for catching this. Interesting indeed...
> 
> On Sat, 10 Apr 2021 at 09:22, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>>
>> On Sat, 10 Apr 2021 03:43:13 +0100
>> Matthew Wilcox <willy@infradead.org> wrote:
>>
>>> On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
>>>>>> include/linux/mm_types.h:274:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == offsetof(struct folio, lru)"
>>>>     FOLIO_MATCH(lru, lru);
>>>>     include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
>>>>             static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
>>>
>>> Well, this is interesting.  pahole reports:
>>>
>>> struct page {
>>>          long unsigned int          flags;                /*     0     4 */
>>>          /* XXX 4 bytes hole, try to pack */
>>>          union {
>>>                  struct {
>>>                          struct list_head lru;            /*     8     8 */
>>> ...
>>> struct folio {
>>>          union {
>>>                  struct {
>>>                          long unsigned int flags;         /*     0     4 */
>>>                          struct list_head lru;            /*     4     8 */
>>>
>>> so this assert has absolutely done its job.
>>>
>>> But why has this assert triggered?  Why is struct page layout not what
>>> we thought it was?  Turns out it's the dma_addr added in 2019 by commit
>>> c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
>>> config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
>>> the whole union gets moved out by 4 bytes.
>>
>> Argh, good that you are catching this!
>>
>>> Unfortunately, we can't just fix this by putting an 'unsigned long pad'
>>> in front of it.  It still aligns the entire union to 8 bytes, and then
>>> it skips another 4 bytes after the pad.
>>>
>>> We can fix it like this ...
>>>
>>> +++ b/include/linux/mm_types.h
>>> @@ -96,11 +96,12 @@ struct page {
>>>                          unsigned long private;
>>>                  };
>>>                  struct {        /* page_pool used by netstack */
>>> +                       unsigned long _page_pool_pad;
>>
>> I'm fine with this pad.  Matteo is currently proposing[1] to add a 32-bit
>> value after @dma_addr, and he could use this area instead.
>>
>> [1] https://lore.kernel.org/netdev/20210409223801.104657-3-mcroce@linux.microsoft.com/
>>
>> When adding/changing this, we need to make sure that it doesn't overlap
>> member @index, because network stack use/check page_is_pfmemalloc().
>> As far as my calculations this is safe to add.  I always try to keep an
>> eye out for this, but I wonder if we could have a build check like yours.
>>
>>
>>>                          /**
>>>                           * @dma_addr: might require a 64-bit value even on
>>>                           * 32-bit architectures.
>>>                           */
>>> -                       dma_addr_t dma_addr;
>>> +                       dma_addr_t dma_addr __packed;
>>>                  };
>>>                  struct {        /* slab, slob and slub */
>>>                          union {
>>>
>>> but I don't know if GCC is smart enough to realise that dma_addr is now
>>> on an 8 byte boundary and it can use a normal instruction to access it,
>>> or whether it'll do something daft like use byte loads to access it.
>>>
>>> We could also do:
>>>
>>> +                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));
>>>
>>> and I see pahole, at least sees this correctly:
>>>
>>>                  struct {
>>>                          long unsigned int _page_pool_pad; /*     4     4 */
>>>                          dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
>>>                  } __attribute__((__packed__)) __attribute__((__aligned__(4)));
>>>
>>> This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
>>> / dma_addr_t.  Advice, please?
>>
>> I'm not sure that the 32-bit behavior is with 64-bit (dma) addrs.
>>
>> I don't have any 32-bit boards with 64-bit DMA.  Cc. Ivan, wasn't your
>> board (572x ?) 32-bit with driver 'cpsw' this case (where Ivan added
>> XDP+page_pool) ?

Sry, for delayed reply.

The TI platforms am3/4/5 (cpsw) and Keystone 2 (netcp) can do only 32bit DMA even in case of LPAE (dma-ranges are used).
Originally, as I remember, CONFIG_ARCH_DMA_ADDR_T_64BIT has not been selected for the LPAE case
on TI platforms and the fact that it became set is the result of multi-paltform/allXXXconfig/DMA
optimizations and unification.
(just checked - not set in 4.14)

Probable commit 4965a68780c5 ("arch: define the ARCH_DMA_ADDR_T_64BIT config symbol in lib/Kconfig").

The TI drivers have been updated, finally to accept ARCH_DMA_ADDR_T_64BIT=y by using things like (__force u32)
for example.

Honestly, I've done sanity check of CPSW with LPAE=y (ARCH_DMA_ADDR_T_64BIT=y) very long time ago.

-- 
Best regards,
grygorii

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-16  9:26         ` Grygorii Strashko
@ 2021-04-16 14:10           ` Arnd Bergmann
  2021-04-17 13:08           ` David Laight
  1 sibling, 0 replies; 11+ messages in thread
From: Arnd Bergmann @ 2021-04-16 14:10 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Ilias Apalodimas, Jesper Dangaard Brouer, Christoph Hellwig,
	Matthew Wilcox, kernel test robot, Linux-MM, kbuild-all,
	clang-built-linux, open list, linux-fsdevel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev, Linux ARM,
	David S. Miller, Matteo Croce, netdev

On Fri, Apr 16, 2021 at 11:27 AM 'Grygorii Strashko' via Clang Built
Linux <clang-built-linux@googlegroups.com> wrote:
> On 10/04/2021 11:52, Ilias Apalodimas wrote:
> > +CC Grygorii for the cpsw part as Ivan's email is not valid anymore
> The TI platforms am3/4/5 (cpsw) and Keystone 2 (netcp) can do only 32bit DMA even in case of LPAE (dma-ranges are used).
> Originally, as I remember, CONFIG_ARCH_DMA_ADDR_T_64BIT has not been selected for the LPAE case
> on TI platforms and the fact that it became set is the result of multi-paltform/allXXXconfig/DMA
> optimizations and unification.
> (just checked - not set in 4.14)
>
> Probable commit 4965a68780c5 ("arch: define the ARCH_DMA_ADDR_T_64BIT config symbol in lib/Kconfig").

I completely missed this change in the past, and I don't really agree
with it either.

Most 32-bit Arm platforms are in fact limited to 32-bit DMA, even when they have
MMIO or RAM areas above the 4GB boundary that require LPAE.

> The TI drivers have been updated, finally to accept ARCH_DMA_ADDR_T_64BIT=y by using
> things like (__force u32) for example.
>
> Honestly, I've done sanity check of CPSW with LPAE=y (ARCH_DMA_ADDR_T_64BIT=y) very long time ago.

This is of course a good idea, drivers should work with any
combination of 32-bit
or 64-bit phys_addr_t and dma_addr_t.

        Arnd

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: Bogus struct page layout on 32-bit
  2021-04-16  9:26         ` Grygorii Strashko
  2021-04-16 14:10           ` Arnd Bergmann
@ 2021-04-17 13:08           ` David Laight
  1 sibling, 0 replies; 11+ messages in thread
From: David Laight @ 2021-04-17 13:08 UTC (permalink / raw)
  To: 'Grygorii Strashko',
	Ilias Apalodimas, Jesper Dangaard Brouer, Christoph Hellwig
  Cc: Matthew Wilcox, kernel test robot, Linux-MM, kbuild-all,
	clang-built-linux, open list, linux-fsdevel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev, Linux ARM,
	David S. Miller, Matteo Croce, netdev

From: Grygorii Strashko
> Sent: 16 April 2021 10:27
...
> Sry, for delayed reply.
> 
> The TI platforms am3/4/5 (cpsw) and Keystone 2 (netcp) can do only 32bit DMA even in case of LPAE
> (dma-ranges are used).
> Originally, as I remember, CONFIG_ARCH_DMA_ADDR_T_64BIT has not been selected for the LPAE case
> on TI platforms and the fact that it became set is the result of multi-paltform/allXXXconfig/DMA
> optimizations and unification.
> (just checked - not set in 4.14)
> 
> Probable commit 4965a68780c5 ("arch: define the ARCH_DMA_ADDR_T_64BIT config symbol in lib/Kconfig").
> 
> The TI drivers have been updated, finally to accept ARCH_DMA_ADDR_T_64BIT=y by using things like
> (__force u32)
> for example.

Hmmm using (__force u32) is probably wrong.
If an address +length >= 2**32 can get passed then the IO request
needs to be errored (or a bounce buffer used).

Otherwise you can get particularly horrid corruptions.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-04-17 13:13 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20210409185105.188284-3-willy@infradead.org>
     [not found] ` <202104100656.N7EVvkNZ-lkp@intel.com>
2021-04-10  2:43   ` Bogus struct page layout on 32-bit Matthew Wilcox
2021-04-10  6:21     ` Jesper Dangaard Brouer
2021-04-10  8:52       ` Ilias Apalodimas
2021-04-10 14:06         ` Matthew Wilcox
2021-04-10 15:54           ` Russell King - ARM Linux admin
2021-04-16  9:26         ` Grygorii Strashko
2021-04-16 14:10           ` Arnd Bergmann
2021-04-17 13:08           ` David Laight
2021-04-10 14:17     ` David Laight
2021-04-10 19:10     ` Arnd Bergmann
2021-04-11 22:35       ` Matthew Wilcox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).