From: Arnd Bergmann <arnd@kernel.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: kernel test robot <lkp@intel.com>, Linux-MM <linux-mm@kvack.org>,
	kbuild-all@lists.01.org,
	clang-built-linux <clang-built-linux@googlegroups.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux FS-devel Mailing List <linux-fsdevel@vger.kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: Bogus struct page layout on 32-bit
Date: Sat, 10 Apr 2021 21:10:47 +0200
Message-ID: <CAK8P3a3uEGaEN-p06vFP+jwbFt3P=Bx4=aRN+kUyB4PcFPxLRg@mail.gmail.com>
In-Reply-To: <20210410024313.GX2531743@casper.infradead.org>

On Sat, Apr 10, 2021 at 4:44 AM Matthew Wilcox <willy@infradead.org> wrote:
> +                       dma_addr_t dma_addr __packed;
>                 };
>                 struct {        /* slab, slob and slub */
>                         union {
>
> but I don't know if GCC is smart enough to realise that dma_addr is now
> on an 8 byte boundary and it can use a normal instruction to access it,
> or whether it'll do something daft like use byte loads to access it.
>
> We could also do:
>
> +                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));
>
> and I see pahole, at least sees this correctly:
>
>                 struct {
>                         long unsigned int _page_pool_pad; /*     4     4 */
>                         dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
>                 } __attribute__((__packed__)) __attribute__((__aligned__(4)));
>
> This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> / dma_addr_t.  Advice, please?

I've tried out what gcc would make of this:  https://godbolt.org/z/aTEbxxbG3

struct page {
    short a;
    struct {
        short b;
        long long c __attribute__((packed, aligned(2)));
    } __attribute__((packed));
} __attribute__((aligned(8)));

In this structure, 'c' is clearly aligned to four bytes (it sits at offset 4
in a struct that is itself aligned to eight), and gcc does realize that it is
safe to use the 'ldrd' instruction for 32-bit arm, which is forbidden on
struct members with less than 4 byte alignment. However, it also complains
that passing a pointer to 'c' into a function that expects a pointer to
'long long' is not allowed because alignof(c) is only '2' here.
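
For anyone who wants to check the layout without reading the disassembly,
here is a minimal sketch that reports where 'c' lands and what alignment the
type system records for it. The struct is the one above; the three helper
functions are names I made up for illustration, and __alignof__ on a member
expression is a GNU extension, so this assumes gcc or clang.

```c
#include <assert.h>
#include <stddef.h>

/* The test structure from the message above, unchanged. */
struct page {
    short a;
    struct {
        short b;
        long long c __attribute__((packed, aligned(2)));
    } __attribute__((packed));
} __attribute__((aligned(8)));

/* The outer aligned(8) is what lets the compiler reason about the
 * absolute alignment of members, not just their declared alignment. */
static_assert(_Alignof(struct page) == 8, "struct page should be 8-byte aligned");

/* Hypothetical helper: byte offset of 'c' from the start of struct page. */
size_t page_c_offset(void)
{
    return offsetof(struct page, c);
}

/* Hypothetical helper: alignment declared for 'c' itself. This is the
 * value the pointer-passing diagnostic is based on. */
size_t page_c_declared_align(void)
{
    return __alignof__(((struct page *)0)->c);
}

/* Hypothetical helper: alignment 'c' is guaranteed to have in practice,
 * derived from the struct alignment and the member offset. */
size_t page_c_effective_align(void)
{
    size_t off = page_c_offset();
    size_t align = _Alignof(struct page);

    /* Halve the struct alignment until it divides the offset. */
    while (align > 1 && off % align != 0)
        align /= 2;
    return align;
}
```

The gap between the declared and the effective alignment is the whole story:
the compiler can use the effective value when it picks instructions, but a
plain pointer to 'c' only carries the declared one.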

(I used 'short' here because having a 64-bit member misaligned by four
bytes wouldn't make a difference to the instructions on Arm, or any other
32-bit architecture I can think of, regardless of the ABI requirements).
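
The four-byte case itself can be modeled on any host with fixed-width types.
This is only a sketch under stated assumptions: 'page_natural' and
'page_packed' are hypothetical stand-ins, not the kernel's struct page;
uint32_t plays the role of a 32-bit 'unsigned long'; and the explicit
aligned(8) imitates an ABI that wants 64-bit members on an 8-byte boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the layout on an ABI that aligns 64-bit
 * members to eight bytes: the inner struct inherits that alignment. */
struct page_natural {
    uint32_t flags;
    struct {
        uint32_t _page_pool_pad;
        uint64_t dma_addr __attribute__((aligned(8)));
    };
};

/* Hypothetical model of the proposed annotation: the member is packed
 * and then given word alignment again, so no padding is inserted. */
struct page_packed {
    uint32_t flags;
    struct {
        uint32_t _page_pool_pad;
        uint64_t dma_addr __attribute__((packed, aligned(4)));
    };
};

/* Offsets of the padding word and the DMA address in each model. */
size_t natural_pad_offset(void)
{
    return offsetof(struct page_natural, _page_pool_pad);
}

size_t natural_dma_offset(void)
{
    return offsetof(struct page_natural, dma_addr);
}

size_t packed_pad_offset(void)
{
    return offsetof(struct page_packed, _page_pool_pad);
}

size_t packed_dma_offset(void)
{
    return offsetof(struct page_packed, dma_addr);
}
```

In the packed model the offsets should match the pahole output quoted above
(pad at 4, dma_addr at 8), while the natural model pushes both members
further out.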

      Arnd
