From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, <linux-cxl@vger.kernel.org>
Subject: Re: [RFC PATCH] Documentation/arm64: describe the kernel's expectations of 'memory'
Date: Mon, 17 May 2021 13:17:25 +0100
Message-ID: <20210517131725.00002068@Huawei.com>
In-Reply-To: <CAMj1kXHvQ1v7W8jdTaEmp9q-Vj+S9dz0P-Q4YZd5iNnqaBSf+A@mail.gmail.com>

On Mon, 17 May 2021 13:55:16 +0200
Ard Biesheuvel <ardb@kernel.org> wrote:

> On Mon, 17 May 2021 at 13:30, Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Mon, 17 May 2021 11:33:19 +0100
> > James Morse <james.morse@arm.com> wrote:
> >  
> > > Standards such as CXL allow memory on PCIe devices to be made
> > > available to the operating system for use as regular memory.
> > >
> > > Document linux's expectations around the behaviour of memory as the
> > > implementations of these new standards may need special treatment in
> > > the OS, firmware or bootloader.
> > >
> > > Signed-off-by: James Morse <james.morse@arm.com>  
> >
> > Hi James,
> >
> > +CC linux-cxl to pick up a few more interesting people who might lose
> > this in the wash of linux-arm-kernel
> >
> > Good to see this description as there has been some confusion on this
> > point. This basically looks like what I'd expect to see. Just a few
> > comments around firmware description towards the end.
> >  
> > > ---
> > >  Documentation/arm64/memory.rst | 31 +++++++++++++++++++++++++++++++
> > >  1 file changed, 31 insertions(+)
> > >
> > > diff --git a/Documentation/arm64/memory.rst b/Documentation/arm64/memory.rst
> > > index 901cd094f4ec..951802aee55f 100644
> > > --- a/Documentation/arm64/memory.rst
> > > +++ b/Documentation/arm64/memory.rst
> > > @@ -167,3 +167,34 @@ from a 52-bit space by enabling the following kernel config options:
> > >
> > >  Note that this option is only intended for debugging applications
> > >  and should not be used in production.
> > > +
> > > +On device memory used as regular memory
> > > +---------------------------------------
> > > +Standards such as CXL allow memory on PCIe device to be made
> > > +available to the operating system for use as regular memory.
> > > +
> > > +If memory is added to the UEFI memory map or DT, or discovered via ACPI's SRAT,
> > > +linux expects it to function in the same way as the bulk DRAM. This section  
> >
> > Linux
> >  
> > > +terms this 'regular memory'.
> > > +
> > > +The kernel may use any attributes to map this memory, e.g. Device-nGnRnE or
> > > +Normal Writeback-Cacheable. The kernel may not be in control of the attributes
> > > +used, e.g. if the memory is used by a KVM guest.
> > > +The kernel will perform cache maintenance to resolve mismatched attributes,
> > > +e.g. invalidating clean stale lines after writing new data when the MMU is
> > > +disabled.
> > > +
> > > +The memory may be used by any instruction supported by the CPUs.
> > > +e.g. Even when the v8.1 LSE atomic instructions are supported, the v8.0
> > > +exclusives are still used for the futex code, and conditional waits, and still
> > > +used by existing user-space binaries. When the CPUs support features such as
> > > +MTE, all regular memory must support MTE tags.
> > > +
> > > +On device memory that does not function in the same way as regular memory must
> > > +not be added to the UEFI memory map or DT, or be discovered via ACPI's SRAT.
> > > +
> > > +On arm64, the kernel does not rewrite the UEFI memory map when memory is added
> > > +or removed. On device memory that is present at boot, but must be removed later  
> >
> > Might be worth giving an example of why memory 'must be removed'?  I'm not sure
> > what you are getting at there.  Specific purpose memory?
> >  
> > > +should be discovered via ACPI's SRAT to ensure it is not used for non-movable
> > > +structures.  
> >
> > Not sure I follow this part.  It could be of type EFI_MEMORY_SP.  
> 
> EFI_MEMORY_SP is an attribute, not a type.

Good point.

> 
> > It should be in SRAT as well, but the EFI type should be sufficient to avoid
> > problems.
> > "The SPM attribute serves as a hint to the OS to avoid allocating this memory
> >  for core OS data or code that can not be relocated."
> >
> > Now I'm not sure the kernel is handling EFI_MEMORY_SP fully yet...  If
> > we need to exclude this approach for now, then this text should perhaps
> > call it out explicitly.
> >  
> 
> The problem with EFI_MEMORY_SP is that it is not a type, but an
> attribute,  which gives a hint to the OS about the nature of the
> memory, which the OS is free to ignore.

IIRC the way around that is to use the reserved type + EFI_MEMORY_SP.
An unaware bootloader or OS will then not use it and hence we are safe.
An aware driver can then decide it is safe to "hotplug" said memory.
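
A minimal sketch of the reserved type + EFI_MEMORY_SP idea, for illustration
only. The type and attribute values are the ones the UEFI specification
defines; struct mem_desc and both helper functions are hypothetical names
invented for this example and are not the kernel's EFI code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined by the UEFI specification. */
#define EFI_RESERVED_TYPE        0u
#define EFI_CONVENTIONAL_MEMORY  7u
#define EFI_MEMORY_SP            0x0000000000040000ULL

/* Hypothetical, trimmed-down memory map entry (illustration only). */
struct mem_desc {
	uint32_t type;       /* EFI memory type */
	uint64_t attribute;  /* EFI_MEMORY_* attribute bits */
};

/*
 * An unaware boot loader or OS looks only at the type. It never
 * inspects EFI_MEMORY_SP, so the attribute alone cannot keep it out.
 */
static bool unaware_may_allocate(const struct mem_desc *md)
{
	assert(md != NULL);
	return md->type == EFI_CONVENTIONAL_MEMORY;
}

/*
 * An aware driver can recognise reserved + SP ranges and decide
 * later whether it is safe to "hotplug" them.
 */
static bool aware_may_hotplug(const struct mem_desc *md)
{
	assert(md != NULL);
	return md->type == EFI_RESERVED_TYPE &&
	       (md->attribute & EFI_MEMORY_SP) != 0;
}
```

With these helpers, a conventional-memory range carrying EFI_MEMORY_SP is
still fair game for the unaware consumer, while a reserved range carrying
the same attribute is skipped by it and left for the aware driver.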

> 
> The UEFI memory map is not only consumed by the OS, but by any driver
> or OS loader that executes in the EFI boot environment, e.g., GPU
> drivers or shim/grub bootloaders. If these are not enlightened and do
> not understand what EFI_MEMORY_SP means, they may (and are entitled to)
> treat this EFI_MEMORY_SP memory as if it were regular memory. If GRUB loads
> the kernel into EFI_MEMORY_SP memory, it had better behave like
> regular memory or things will fall apart.

Two separate issues here. The 'broken' case, where _SP or indeed the
hotplug flag is no use, and the 'must be removed later' case, where
we just don't want to put unmovable allocations in it.

> 
> This means that EFI_MEMORY_SP is really only suitable to describe
> aspects of the memory range that can be happily ignored. MTE or
> atomics capability must be described in a different way.
> 

That's indeed the intent. These are just hints and indeed not suitable for
the cases where things are broken (MTE / Atomics).  In those cases you
should not be claiming it is normal memory at all.  SRAT doesn't help
you with that though.

The hotplug flag in SRAT is also only a hint. The OS doesn't have
to take any notice of it or support it, nor does any boot loader.  Things
will 'work' with the exception of hot-remove.  If you definitely don't
want your memory to be used by the OS for normal purposes, then
don't present it in a form where it might be.
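
To illustrate why the hotplug flag is only a hint, here is a rough sketch.
The flag bit positions are those of the SRAT Memory Affinity Structure in
the ACPI specification. struct srat_mem, enum placement, place_range() and
the os_honours_hint parameter are all hypothetical names made up for this
example and do not reflect how any particular OS implements it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits of the SRAT Memory Affinity Structure (ACPI specification). */
#define SRAT_MEM_ENABLED        (1u << 0)
#define SRAT_MEM_HOT_PLUGGABLE  (1u << 1)

/* Hypothetical, trimmed-down view of one SRAT memory entry. */
struct srat_mem {
	uint64_t base;
	uint64_t length;
	uint32_t flags;
};

enum placement {
	PLACE_NONE,          /* entry disabled, range ignored */
	PLACE_ANY,           /* usable for any allocation, movable or not */
	PLACE_MOVABLE_ONLY,  /* kept free of unmovable allocations */
};

/*
 * os_honours_hint stands in for whatever policy the OS applies.
 * An OS that ignores the hint still boots and runs; it just may
 * place unmovable data in the range, which blocks hot-remove.
 */
static enum placement place_range(const struct srat_mem *m,
				  bool os_honours_hint)
{
	assert(m != NULL);
	if (!(m->flags & SRAT_MEM_ENABLED))
		return PLACE_NONE;
	if ((m->flags & SRAT_MEM_HOT_PLUGGABLE) && os_honours_hint)
		return PLACE_MOVABLE_ONLY;
	return PLACE_ANY;
}
```

The same entry yields PLACE_MOVABLE_ONLY or PLACE_ANY depending solely on
the consumer's policy, which is why the flag cannot be relied on to keep
memory out of normal use.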

Jonathan



> 
> 
> > > +e.g. the kernel text, page tables or the GIC ITS Pending Table.  
> >
> >


Thread overview: 5+ messages
     [not found] <20210517103319.5356-1-james.morse@arm.com>
2021-05-17 11:27 ` [RFC PATCH] Documentation/arm64: describe the kernel's expectations of 'memory' Jonathan Cameron
2021-05-17 11:55   ` Ard Biesheuvel
2021-05-17 12:17     ` Jonathan Cameron [this message]
2021-05-17 12:25       ` Ard Biesheuvel
2021-06-04 17:57         ` James Morse
