From: Ard Biesheuvel <ardb@kernel.org>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: James Morse <james.morse@arm.com>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	linux-cxl@vger.kernel.org
Subject: Re: [RFC PATCH] Documentation/arm64: describe the kernel's expectations of 'memory'
Date: Mon, 17 May 2021 14:25:15 +0200
Message-ID: <CAMj1kXHg9bOWuKmiPLgAz44ezDrVoZrWBBepgqG3M8uB+kco7A@mail.gmail.com>
In-Reply-To: <20210517131725.00002068@Huawei.com>

On Mon, 17 May 2021 at 14:19, Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Mon, 17 May 2021 13:55:16 +0200
> Ard Biesheuvel <ardb@kernel.org> wrote:
>
> > On Mon, 17 May 2021 at 13:30, Jonathan Cameron
> > <Jonathan.Cameron@huawei.com> wrote:
> > >
> > > On Mon, 17 May 2021 11:33:19 +0100
> > > James Morse <james.morse@arm.com> wrote:
> > >
> > > > Standards such as CXL allow memory on PCIe devices to be made
> > > > available to the operating system for use as regular memory.
> > > >
> > > > Document linux's expectations around the behaviour of memory as the
> > > > implementations of these new standards may need special treatment in
> > > > the OS, firmware or bootloader.
> > > >
> > > > Signed-off-by: James Morse <james.morse@arm.com>
> > >
> > > Hi James,
> > >
> > > > +CC linux-cxl to pick up a few more interested people who might lose
> > > > this in the wash of linux-arm-kernel
> > >
> > > Good to see this description as there has been some confusion on this
> > > point. This basically looks like what I'd expect to see. Just a few
> > > comments around firmware description towards the end.
> > >
> > > > ---
> > > >  Documentation/arm64/memory.rst | 31 +++++++++++++++++++++++++++++++
> > > >  1 file changed, 31 insertions(+)
> > > >
> > > > diff --git a/Documentation/arm64/memory.rst b/Documentation/arm64/memory.rst
> > > > index 901cd094f4ec..951802aee55f 100644
> > > > --- a/Documentation/arm64/memory.rst
> > > > +++ b/Documentation/arm64/memory.rst
> > > > @@ -167,3 +167,34 @@ from a 52-bit space by enabling the following kernel config options:
> > > >
> > > >  Note that this option is only intended for debugging applications
> > > >  and should not be used in production.
> > > > +
> > > > +On device memory used as regular memory
> > > > +---------------------------------------
> > > > +Standards such as CXL allow memory on PCIe device to be made
> > > > +available to the operating system for use as regular memory.
> > > > +
> > > > +If memory is added to the UEFI memory map or DT, or discovered via ACPI's SRAT,
> > > > +linux expects it to function in the same way as the bulk DRAM. This section
> > >
> > > Linux
> > >
> > > > +terms this 'regular memory'.
> > > > +
> > > > +The kernel may use any attributes to map this memory, e.g. Device-nGnRnE or
> > > > +Normal Writeback-Cacheable. The kernel may not be in control of the attributes
> > > > +used, e.g. if the memory is used by a KVM guest.
> > > > +The kernel will perform cache maintenance to resolve mismatched attributes,
> > > > +e.g. invalidating clean stale lines after writing new data when the MMU is
> > > > +disabled.
> > > > +
> > > > +The memory may be used by any instruction supported by the CPUs.
> > > > +e.g. Even when the v8.1 LSE atomic instructions are supported, the v8.0
> > > > +exclusives are still used for the futex code, and conditional waits, and still
> > > > +used by existing user-space binaries. When the CPUs support features such as
> > > > +MTE, all regular memory must support MTE tags.
> > > > +
> > > > +On device memory that does not function in the same way as regular memory must
> > > > +not be added to the UEFI memory map or DT, or be discovered via ACPI's SRAT.
> > > > +
> > > > +On arm64, the kernel does not rewrite the UEFI memory map when memory is added
> > > > +or removed. On device memory that is present at boot, but must be removed later
> > >
> > > Might be worth giving an example of why memory 'must be removed'?  I'm not sure
> > > what you are getting at there.  Specific purpose memory?
> > >
> > > > +should be discovered via ACPI's SRAT to ensure it is not used for non-movable
> > > > +structures.
> > >
> > > Not sure I follow this part.  It could be of type EFI_MEMORY_SP.
> >
> > EFI_MEMORY_SP is an attribute, not a type.
>
> Good point.
>
> >
> > > It should be in SRAT as well, but the EFI type should be sufficient to avoid
> > > problems.
> > > "The SPM attribute serves as a hint to the OS to avoid allocating this memory
> > >  for core OS data or code that can not be relocated."
> > >
> > > Now I'm not sure the kernel is handling EFI_MEMORY_SP fully yet...  If
> > > we need to exclude this approach for now, then this text should perhaps
> > > call it out explicitly.
> > >
> >
> > The problem with EFI_MEMORY_SP is that it is not a type, but an
> > attribute,  which gives a hint to the OS about the nature of the
> > memory, which the OS is free to ignore.
>
> IIRC the way around that is to use the reserved type + EFI_MEMORY_SP.
> An unaware bootloader or OS will then not use it and hence we are safe.
> An aware driver can then decide it is safe to "hotplug" said memory.
>

True, but then, what good does it do to describe this memory in the
UEFI memory map in the first place?
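
To make the type/attribute distinction concrete, here is a minimal sketch in C
of how two consumers of the memory map might decide whether a range is usable.
The numeric values are those given in the UEFI specification; the trimmed-down
`struct mem_desc` and both helper functions are hypothetical and are not taken
from the kernel or from any bootloader.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined by the UEFI specification. */
#define EFI_RESERVED_MEMORY_TYPE 0u
#define EFI_CONVENTIONAL_MEMORY  7u
#define EFI_MEMORY_SP            0x0000000000040000ULL

/* Hypothetical descriptor holding only the two fields under discussion. */
struct mem_desc {
	uint32_t type;       /* what the range is */
	uint64_t attribute;  /* capability bits and hints about the range */
};

/* A consumer that does not know about EFI_MEMORY_SP looks at the type only. */
static bool unaware_may_allocate(const struct mem_desc *d)
{
	return d->type == EFI_CONVENTIONAL_MEMORY;
}

/* An aware consumer may choose to honour the hint, but is not obliged to. */
static bool aware_may_allocate_unmovable(const struct mem_desc *d)
{
	return d->type == EFI_CONVENTIONAL_MEMORY &&
	       !(d->attribute & EFI_MEMORY_SP);
}
```

Under these assumptions, a conventional-memory range carrying EFI_MEMORY_SP is
still fair game for the unaware consumer, while a reserved-type range is
skipped by both regardless of the attribute.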

> >
> > The UEFI memory map is not only consumed by the OS, but by any driver
> > or OS loader that executes in the EFI boot environment, e.g., GPU
> > drivers or shim/grub bootloaders. If these are not enlightened and
> > do not understand what EFI_MEMORY_SP means, they may (and are entitled
> > to) treat this EFI_MEMORY_SP memory as if it were regular memory. If GRUB loads
> > the kernel into EFI_MEMORY_SP memory, it had better behave like
> > regular memory or things will fall apart.
>
> Two separate issues here. The 'broken' one where _SP or indeed
> hotplug flag is no use, and the one where it is 'must be removed later'
> and we just don't want to put unmovable allocations in it.
>

I am not sure I follow the 'must be removed' thing. Why is that needed?

> >
> > This means that EFI_MEMORY_SP is really only suitable to describe
> > aspects of the memory range that can be happily ignored. MTE or
> > atomics capability must be described in a different way.
> >
>
> That's indeed the intent. These are just hints and indeed not suitable for
> the cases where things are broken (MTE / Atomics).  In those you
> should not be claiming it is normal memory at all.  SRAT doesn't help
> you with that though.
>
> The hotplug flag in SRAT is also only a hint. OS doesn't have
> to take any notice or support it nor does any boot loader.  Things
> will 'work' with the exception of hot-remove.  If you definitely don't
> want your memory to be used by the OS for normal purposes, then
> don't present it in a form where it might be.
>

Agreed
