* Should bios always mark CXL DRAM as EFI_MEMORY_SP?
       [not found] <07cedbe6-00ab-52fc-9475-c8d7120f5a95@jagalactic.com>
@ 2022-01-27 16:18 ` John Groves
  2022-01-28  0:47   ` Dan Williams
  2022-01-28  4:08   ` Li Qiang (Johnny Li)
  0 siblings, 2 replies; 6+ messages in thread
From: John Groves @ 2022-01-27 16:18 UTC
  To: Dan Williams, linux-cxl; +Cc: Jonathan Cameron, Ben Widawsky, John Groves

I’d like to seek some feedback and see whether a consensus exists, or can be developed, regarding how system firmware (BIOS/EFI/etc.) should present CXL DRAM to a system in a pre-fabric world (CXL 1.1/2.0).

The CXL spec and the Intel documentation are pretty specific and useful, but one open issue does not seem to be specified outright: should the system mark CXL-attached DRAM as “specific purpose” (EFI_MEMORY_SP)? Consistency across platforms is certainly desirable. If this behavior is not prescribed, we could end up with inconsistent behavior across server and BIOS vendors.

If this is already specified, no need to read on (but please point me to where it’s specified).

Objective: I think everyone will likely agree that it should be possible to use CXL DRAM as either general-purpose memory, or via DAX, or a mix.

What’s the difference?

Memory marked as EFI_MEMORY_SP:

- Mappable via DAX
- Can be online-converted to general-purpose memory via daxctl
- Can be boot-converted to general-purpose memory with efi=nosoftreserve on the Linux command line

Memory NOT marked as EFI_MEMORY_SP:

- CXL memory is general-purpose memory (a NUMA node with no local CPU cores)
- Some of the contents appear to be used for in-memory metadata (presumably buddy lists, etc.)
- Can be boot-converted to DAX with an efi_fake_mem= argument on the Linux command line
- Currently cannot be online-converted to DAX-managed (can this work? Is it intended to be working?)

If online conversion from general-purpose to DAX is not going to work, it seems that the default should preserve the ability to use it either way: mark the memory as EFI_MEMORY_SP.
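
To make the two boot-time conversions above concrete, here is a minimal Python sketch that only formats the kernel command-line fragments. It assumes the efi_fake_mem= range syntax (size@start:attribute) described in the kernel's command-line parameter documentation and the UEFI value 0x40000 for the EFI_MEMORY_SP attribute bit. The helper names and the example range are hypothetical, and whether efi_fake_mem= is available depends on the kernel version and configuration (CONFIG_EFI_FAKE_MEMMAP).

```python
# Sketch only: formats boot arguments, does not touch the running system.
EFI_MEMORY_SP = 0x40000  # UEFI "specific purpose" memory attribute bit


def fake_mem_arg(size: str, start: str, attr: int = EFI_MEMORY_SP) -> str:
    """Format one efi_fake_mem= range as size@start:attr (hypothetical helper)."""
    return f"efi_fake_mem={size}@{start}:{attr:#x}"


def nosoftreserve_arg() -> str:
    """Boot argument asking Linux to ignore EFI_MEMORY_SP and treat the range as normal RAM."""
    return "efi=nosoftreserve"


if __name__ == "__main__":
    # Hypothetical example range: 16G of CXL DRAM starting at 0x1080000000.
    print(fake_mem_arg("16G", "0x1080000000"))
    print(nosoftreserve_arg())
```

The first argument would mark a range that firmware left as conventional memory as soft-reserved (so it can be handed to DAX); the second goes the other way for memory that firmware did mark EFI_MEMORY_SP.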

 

Is there a right and wrong answer regarding EFI_MEMORY_SP? How important is it to have consistency across platforms?

If there is a consensus, the next question is who should express it. Perhaps the CXL consortium. I’m a part of that, but it seemed like the Linux dev community was the right place to start.

Thanks for any thoughts.

John Groves
Micron





* Re: Should bios always mark CXL DRAM as EFI_MEMORY_SP?
  2022-01-27 16:18 ` Should bios always mark CXL DRAM as EFI_MEMORY_SP? John Groves
@ 2022-01-28  0:47   ` Dan Williams
  2022-01-28  4:08   ` Li Qiang (Johnny Li)
  1 sibling, 0 replies; 6+ messages in thread
From: Dan Williams @ 2022-01-28  0:47 UTC
  To: John Groves
  Cc: linux-cxl, Jonathan Cameron, Ben Widawsky, John Groves, Linux MM

[ add linux-mm since my opinion is not the only one that matters here ]

Responses inline below with only my Linux kernel developer hat on,
i.e. not necessarily the view of $current_employer:

On Thu, Jan 27, 2022 at 8:18 AM John Groves <john@jagalactic.com> wrote:
>
> I’d like to seek some feedback and see whether a consensus exists or can be developed regarding how system firmware (bios/efi/etc) should present CXL DRAM to a system in a pre-fabric world (CXL 1.1/2.0).
>
>
>
> The CXL spec, along with the Intel documentation are pretty specific and useful, but one open issue seems not to be outright specified: should the system mark CXL-attached DRAM as “specific purpose” (EFI_MEMORY_SP)? Consistency across platforms is certainly desirable. If this behavior is not prescribed, we could end up with inconsistent behavior across server and bios vendors.
>
>
>
> If this is already specified, no need to read on (but please point me to where it’s specified).

There is no specification for how an OS handles EFI_MEMORY_SP.
Everything below is only a Linux perspective and likely any other OS
you ask will give a different perspective.

> Objective: I think everyone will likely agree that it should be possible to use CXL DRAM as either general-purpose memory, or via DAX, or a mix.
[..]
>
> ·      Currently cannot be online-converted to DAX-managed (can this work? Is it intended to be working?)
>

There is no guaranteed way to un-online memory, especially ZONE_NORMAL
memory. There are heuristics to make it fail less often, but in
general it's not reliable so online-conversion to DAX-managed is not
being attempted for the general case.

> If online conversion from general-purpose to DAX is not going to work, it seems that the default should preserve the ability to use it either way: mark the memory as EFI_MEMORY_SP.

Yes, unfortunately that requires a paradigm shift for end users to
make a policy decision about memory where they did not need to make
one before. My hope is that distributions would set a default daxctl
policy to just online soft-reserved (Linux term for EFI_MEMORY_SP)
memory. That way savvy users have a control point to change the policy
to varying degrees of exclusive access through a DAX-device instance /
instances, and other users, that don't even know what EFI_MEMORY_SP
is, will see just another NUMA node by default.
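
For anyone wanting to check what a given platform did, soft-reserved ranges are expected to show up in /proc/iomem under the name "Soft Reserved". The following is a minimal Python sketch that parses that text; the function name and the sample map are made up for illustration, and reading real addresses from /proc/iomem normally requires root.

```python
import re

# One /proc/iomem entry: optional indentation, "start-end : name" (hex addresses).
_IOMEM_LINE = re.compile(r"^\s*([0-9a-f]+)-([0-9a-f]+) : (.+)$")


def soft_reserved_ranges(iomem_text):
    """Return (start, end) tuples for every entry named 'Soft Reserved'."""
    ranges = []
    for line in iomem_text.splitlines():
        match = _IOMEM_LINE.match(line)
        if match and match.group(3).strip() == "Soft Reserved":
            ranges.append((int(match.group(1), 16), int(match.group(2), 16)))
    return ranges


# Hypothetical sample, not taken from a real machine.
SAMPLE = """\
00100000-7fffffff : System RAM
100000000-47fffffff : System RAM
480000000-87fffffff : Soft Reserved
  480000000-87fffffff : dax0.0
"""

if __name__ == "__main__":
    for start, end in soft_reserved_ranges(SAMPLE):
        print(f"soft-reserved: {start:#x}-{end:#x}")
```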

> Is there a right and wrong answer re:EFI_MEMORY_SP? How important is it to have consistency across platforms?

The Principle of Least Surprise applies, and the vast bulk of users
simply don't know that they need to care about memory types and memory
performance classes. The ones that do know and care are also likely
the ones to be surprised if they cannot guarantee 100% exclusive
access, i.e. machines purpose built to run a workload where the
application gets 100% of the high performance memory.

The distro gets to decide the CONFIG_EFI_SOFT_RESERVE policy, and if
it chooses CONFIG_EFI_SOFT_RESERVE=y I think it should go further to
ship daxctl and a policy that onlines it by default.

https://github.com/pmem/ndctl/blob/main/Documentation/daxctl/daxctl-reconfigure-device.txt#L244
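
As a rough sketch of what such a policy boils down to, the Python below builds the daxctl command line for switching a device between the two modes. It assumes the reconfigure-device subcommand with its --mode=system-ram / --mode=devdax and --no-online options as described in the man page linked above; the device name dax0.0 and the helper itself are hypothetical, and the sketch only prints the command rather than running it.

```python
import shlex

VALID_MODES = ("system-ram", "devdax")


def reconfigure_cmd(device, mode, online=True):
    """Build the argv for 'daxctl reconfigure-device' (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    argv = ["daxctl", "reconfigure-device", f"--mode={mode}"]
    # --no-online only makes sense when hot-adding the memory as system RAM.
    if mode == "system-ram" and not online:
        argv.append("--no-online")
    argv.append(device)
    return argv


if __name__ == "__main__":
    # Default policy: online soft-reserved memory as just another NUMA node.
    print(shlex.join(reconfigure_cmd("dax0.0", "system-ram")))
    # Savvy-user policy: keep it as a device-dax instance for exclusive access.
    print(shlex.join(reconfigure_cmd("dax0.0", "devdax")))
```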

> If there is a consensus, the next question is who should express it. Perhaps the CXL consortium. I’m a part of that, but it seemed like the Linux dev community was the right place to start.

EFI_MEMORY_SP is defined as a hint, so to me that effectively kicks
all the policy questions over to OS specific / Distro specific
solution space.


* RE: Should bios always mark CXL DRAM as EFI_MEMORY_SP?
  2022-01-27 16:18 ` Should bios always mark CXL DRAM as EFI_MEMORY_SP? John Groves
  2022-01-28  0:47   ` Dan Williams
@ 2022-01-28  4:08   ` Li Qiang (Johnny Li)
  2022-01-28  5:28     ` Dan Williams
  1 sibling, 1 reply; 6+ messages in thread
From: Li Qiang (Johnny Li) @ 2022-01-28  4:08 UTC
  To: 'John Groves', 'Dan Williams', linux-cxl
  Cc: 'Jonathan Cameron', 'Ben Widawsky',
	'John Groves'

I think BIOS should follow the Device Scoped EFI Memory Type Structure (DSEMTS) in the CDAT spec v1.01.

In Table 8 (Device Scoped EFI Memory Type Structure), the field "EFI Memory Type and Attribute" is defined as follows:
  0 - EfiConventionalMemory
  1 - EfiConventionalMemory Type with EFI_MEMORY_SP Attribute
  2 - EfiReservedMemoryType
  3-255 - Reserved encoding
  The memory attribute EFI_MEMORY_NV may be inferred from the NonVolatile flag in DSMAS.
  Memory types other than EfiConventionalMemory and EfiReservedMemoryType are not permitted.
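
The encoding above can be sketched as a small decoder. This is only an illustration of the table as quoted here: the enum and function names are hypothetical, and the sketch does not parse an actual CDAT structure.

```python
from enum import Enum


class DsemtsMemoryType(Enum):
    """Values of the 'EFI Memory Type and Attribute' field as listed above."""
    CONVENTIONAL = 0      # EfiConventionalMemory
    CONVENTIONAL_SP = 1   # EfiConventionalMemory with EFI_MEMORY_SP
    RESERVED = 2          # EfiReservedMemoryType


def decode_dsemts(value):
    """Decode the one-byte field; return None for the reserved encodings 3-255."""
    if not 0 <= value <= 255:
        raise ValueError(f"field is one byte, got {value}")
    try:
        return DsemtsMemoryType(value)
    except ValueError:
        return None  # 3-255: reserved encoding


if __name__ == "__main__":
    for raw in (0, 1, 2, 3):
        print(raw, decode_dsemts(raw))
```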

Thanks
Johnny

* Re: Should bios always mark CXL DRAM as EFI_MEMORY_SP?
  2022-01-28  4:08   ` Li Qiang (Johnny Li)
@ 2022-01-28  5:28     ` Dan Williams
  2022-01-28 10:24       ` Jonathan Cameron
  0 siblings, 1 reply; 6+ messages in thread
From: Dan Williams @ 2022-01-28  5:28 UTC
  To: Li Qiang (Johnny Li)
  Cc: John Groves, linux-cxl, Jonathan Cameron, Ben Widawsky, John Groves

On Thu, Jan 27, 2022 at 8:12 PM Li Qiang (Johnny Li)
<johnny.li@montage-tech.com> wrote:
>
> I think BIOS should follow CDAT spec v1.01 Device Scoped EFI Memory Type Structure (DSEMTS) structure
>
> In Table 8 Device Scoped EFI Memory Type Structure, field EFI Memory Type and Attribute has below definition
>   0 – EfiConventionalMemory
>   1 - EfiConventionalMemory Type with EFI_MEMORY_SP Attribute
>   2 – EfiReservedMemoryType
>   3-255 – Reserved encoding
>   The memory attribute EFI_MEMORY_NV may be inferred from NonVolatile flag in DSMAS.
>   Memory types other than EfiConventionalMemory and EfiReservedMemoryType are not permitted.

Definitely BIOS should follow CDAT for the type, but it's not so clear
to me the same can be said about the attribute. I think the bigger
question is when should devices claim to be EFI_MEMORY_SP, and when
should BIOS apply EFI_MEMORY_SP regardless of what the device
advertises. EFI_MEMORY_SP is a claim about usage that the memory is
either too high performance or too low performance to be added to the
general memory pool by default. That's not a decision that a device
necessarily knows how to make on its own. The platform BIOS might have
a better chance of knowing the application the system was built for. The
OS kernel is somewhat blind to usage but OS policy can do the last
mile tuning of how much if any memory of a given performance class
should be set aside for exclusive access.


* Re: Should bios always mark CXL DRAM as EFI_MEMORY_SP?
  2022-01-28  5:28     ` Dan Williams
@ 2022-01-28 10:24       ` Jonathan Cameron
  2022-01-28 16:12         ` Dan Williams
  0 siblings, 1 reply; 6+ messages in thread
From: Jonathan Cameron @ 2022-01-28 10:24 UTC
  To: Dan Williams
  Cc: Li Qiang (Johnny Li), John Groves, linux-cxl, Ben Widawsky, John Groves

On Thu, 27 Jan 2022 21:28:40 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> On Thu, Jan 27, 2022 at 8:12 PM Li Qiang (Johnny Li)
> <johnny.li@montage-tech.com> wrote:
> >
> > I think BIOS should follow CDAT spec v1.01 Device Scoped EFI Memory Type Structure (DSEMTS) structure
> >
> > In Table 8 Device Scoped EFI Memory Type Structure, field EFI Memory Type and Attribute has below definition
> >   0 – EfiConventionalMemory
> >   1 - EfiConventionalMemory Type with EFI_MEMORY_SP Attribute
> >   2 – EfiReservedMemoryType
> >   3-255 – Reserved encoding
> >   The memory attribute EFI_MEMORY_NV may be inferred from NonVolatile flag in DSMAS.
> >   Memory types other than EfiConventionalMemory and EfiReservedMemoryType are not permitted.  
> 
> Definitely BIOS should follow CDAT for the type, but it's not so clear
> to me the same can be said about the attribute. I think the bigger
> question is when should devices claim to be EFI_MEMORY_SP, and when
> should BIOS apply EFI_MEMORY_SP regardless of what the device
> advertises. EFI_MEMORY_SP is a claim about usage that the memory is
> either too high performance or too low performance to be added to the
> general memory pool by default. That's not a decision that a device
> necessarily knows how to make on its own. The platform BIOS might have
> a better chance to know intended application the system was built. The
> OS kernel is somewhat blind to usage but OS policy can do the last
> mile tuning of how much if any memory of a given performance class
> should be set aside for exclusive access.

I'd add another spin based on where EFI_MEMORY_SP originally came from,
though it's not relevant to memory-only devices, which I think is what
is being discussed here.

For some devices the memory will work fine as general purpose RAM, but
it was put there with an intended use.  Typically something like DDR
attached to a GPU or other accelerator.  Might be nice and quick for
general use, but it's even quicker if the GPU is using it :)

Again, how much to reserve for which use case is an OS policy decision,
hence the hint from the attribute.  Neither the device nor the
BIOS can know the answer as it depends on what is actually being run
in the OS.

Jonathan


* Re: Should bios always mark CXL DRAM as EFI_MEMORY_SP?
  2022-01-28 10:24       ` Jonathan Cameron
@ 2022-01-28 16:12         ` Dan Williams
  0 siblings, 0 replies; 6+ messages in thread
From: Dan Williams @ 2022-01-28 16:12 UTC
  To: Jonathan Cameron
  Cc: Li Qiang (Johnny Li), John Groves, linux-cxl, Ben Widawsky, John Groves

On Fri, Jan 28, 2022 at 2:25 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Thu, 27 Jan 2022 21:28:40 -0800
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > On Thu, Jan 27, 2022 at 8:12 PM Li Qiang (Johnny Li)
> > <johnny.li@montage-tech.com> wrote:
> > >
> > > I think BIOS should follow CDAT spec v1.01 Device Scoped EFI Memory Type Structure (DSEMTS) structure
> > >
> > > In Table 8 Device Scoped EFI Memory Type Structure, field EFI Memory Type and Attribute has below definition
> > >   0 – EfiConventionalMemory
> > >   1 - EfiConventionalMemory Type with EFI_MEMORY_SP Attribute
> > >   2 – EfiReservedMemoryType
> > >   3-255 – Reserved encoding
> > >   The memory attribute EFI_MEMORY_NV may be inferred from NonVolatile flag in DSMAS.
> > >   Memory types other than EfiConventionalMemory and EfiReservedMemoryType are not permitted.
> >
> > Definitely BIOS should follow CDAT for the type, but it's not so clear
> > to me the same can be said about the attribute. I think the bigger
> > question is when should devices claim to be EFI_MEMORY_SP, and when
> > should BIOS apply EFI_MEMORY_SP regardless of what the device
> > advertises. EFI_MEMORY_SP is a claim about usage that the memory is
> > either too high performance or too low performance to be added to the
> > general memory pool by default. That's not a decision that a device
> > necessarily knows how to make on its own. The platform BIOS might have
> > a better chance to know intended application the system was built. The
> > OS kernel is somewhat blind to usage but OS policy can do the last
> > mile tuning of how much if any memory of a given performance class
> > should be set aside for exclusive access.
>
> I'd add another spin based on where EFI_MEMORY_SP originally came from,
> though it's not relevant to memory only devices which I think is what
> is being discussed here.
>
> For some devices the memory will work fine as general purpose RAM, but
> it was put there with an intended use.  Typically something like DDR
> attached to a GPU or other accelerator.  Might be nice and quick for
> general use, but it's even quicker if the GPU is using it :)
>
> Again, how much to reserve for what usecase is an OS policy decision
> hence the hint from the attribute.  Neither the device nor the
> bios can know the answer as it depends on what is actually being run
> in the OS.

I agree this was one of the original motivations, but every time I
talk to a GPU developer and bring up the case of the OS taking a page
and pinning it indefinitely, they balk. So I think this case is
covered by setting the type to EfiReservedMemoryType (hard-reserved),
with the GPU driver owning the policy about giving the memory to the OS
general pool, if ever.

