* DVSEC FLEX BUS CAP bit 3: MEM_HWINIT_MODE
@ 2022-06-27  6:04 Mani Subramaniyan
  2022-06-27 19:48 ` Dan Williams
  0 siblings, 1 reply; 5+ messages in thread
From: Mani Subramaniyan @ 2022-06-27  6:04 UTC (permalink / raw)
  To: linux-cxl

While reading the description of the DVSEC Flex Bus capability register (offset 0Ah, bit 3) in the CXL spec (both 1.1 and 2.0), I see this:
"3 RO Mem_HwInit_Mode: If set, indicates this CXL.mem capable device initializes memory with assistance from hardware and firmware located on the device. If clear, indicates memory is initialized by host software such as device driver. This bit must be ignored if CXL.mem_Capable=0."
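To make the bit semantics concrete, here is a minimal sketch of how a reader of this register might interpret bit 3. The position of Mem_HwInit_Mode comes from the spec text quoted above; the position of Mem_Capable (bit 2) is an assumption from my reading of the same register, and the function name is made up for illustration.

```python
# Sketch only: decode the 16-bit DVSEC Flex Bus capability value (offset 0x0A).
FLEXBUS_CAP_OFFSET = 0x0A
MEM_CAPABLE = 1 << 2       # assumed bit position, not quoted in this thread
MEM_HWINIT_MODE = 1 << 3   # bit 3, per the spec text quoted above


def mem_init_owner(cap: int) -> str:
    """Report who is expected to initialize memory for a given register value."""
    if not cap & MEM_CAPABLE:
        # The spec says bit 3 must be ignored when CXL.mem_Capable=0.
        return "not-applicable"
    if cap & MEM_HWINIT_MODE:
        return "device"          # hardware/firmware on the device
    return "host-software"       # e.g. a device driver


if __name__ == "__main__":
    for value in (0x0000, 0x0004, 0x000C):
        print(f"cap={value:#06x}: {mem_init_owner(value)}")
```

The value 0x0004 is the case I am experimenting with: CXL.mem capable, with Mem_HwInit_Mode clear.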

I figured that I would clear this bit so I can load my driver for the CXL type 3 device and also reset the memory active bit in the HDM entry (range low address bits). I was experimenting with this on QEMU version 6.0.50 from Ben Widawsky (branch cxl2v4-doe), running Ubuntu with a 64-bit kernel, version 5.18.0.rc7.
Here is what I see in dmesg output: 
[   62.234755] cxl_pci 0000:09:00.0: timeout awaiting memory active after 60 seconds
[   62.235793] cxl_mem mem0: Media not active (-110)
[   62.236447] cxl_mem: probe of mem0 failed with error -110
[  123.674758] cxl_pci 0000:09:00.0: timeout awaiting memory active after 60 seconds
[  123.675800] cxl_mem mem0: Media not active (-110)
[  123.676450] cxl_mem: probe of mem0 failed with error -110
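The log pattern above is what a poll-with-timeout loop would produce when the memory active bit never gets set. The following is a rough model of that behavior, not the kernel's actual code: the bit position of Memory_Active, the one-second poll interval, and all function names are assumptions; only the 60 second timeout and the -110 result are taken from the log.

```python
from typing import Callable

ETIMEDOUT = 110       # Linux errno value; matches the -110 in the log
MEM_ACTIVE = 1 << 1   # assumed bit position in the range size low register


def await_mem_active(read_range_size_low: Callable[[], int],
                     sleep: Callable[[float], None],
                     timeout_s: int = 60) -> int:
    """Poll once per second; return 0 once active, -ETIMEDOUT otherwise."""
    for _ in range(timeout_s):
        if read_range_size_low() & MEM_ACTIVE:
            return 0
        sleep(1.0)
    return -ETIMEDOUT


if __name__ == "__main__":
    # A device whose Memory_Active bit never gets set, as in my experiment.
    # The sleep is stubbed out so the sketch finishes immediately.
    print(await_mem_active(lambda: 0x0, sleep=lambda seconds: None))
```

Passing the register read and the sleep as callables keeps the sketch independent of any real hardware access.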

In spite of these error messages, I could use the CXL memory without any issues!

My purpose is to load my driver on this device and then initialize the memory and bring it online, or use the memory for my own software without giving it to the general Linux memory pool.

Since I am new to this alias, and not following the threads fully (did do a search though), can someone point me in the right direction/clues?

My questions are:
1. Shouldn't the Linux CXL code look at the MEM_HWINIT_MODE bit first and, if it is not set, not complain about the mem active bit, on the assumption that the software/driver will initialize the memory and make it active at a later time?
2. If a vendor driver loads, talks to the hardware to initialize it, and makes the hardware change the memory active bit state, how does the vendor driver, if it wants to, give the memory back to the kernel? Should there be a kernel call for that purpose, or is this so similar to dax initialization that no further kernel call is needed?

Thanks
ManiS


* RE: DVSEC FLEX BUS CAP bit 3: MEM_HWINIT_MODE
  2022-06-27  6:04 DVSEC FLEX BUS CAP bit 3: MEM_HWINIT_MODE Mani Subramaniyan
@ 2022-06-27 19:48 ` Dan Williams
  2022-06-28 15:34   ` Mani Subramaniyan
  0 siblings, 1 reply; 5+ messages in thread
From: Dan Williams @ 2022-06-27 19:48 UTC (permalink / raw)
  To: Mani Subramaniyan, linux-cxl

Mani Subramaniyan wrote:
> While reading the CXL spec (both 1.1 and 2.0), of the DVSEC Flex bus
> cap (offset 0ah, bit 3), I see this: "3 RO Mem_HwInit_Mode: If set,
> indicates this CXL.mem capable device initializes memory with
> assistance from hardware and firmware located on the device. If clear,
> indicates memory is initialized by host software such as device
> driver. This bit must be ignored if CXL.mem_Capable=0."

Hmm, to me the specification does not indicate how OS drivers can initialize memory
in the MEM_HWINIT_MODE=0 case. It says impossible things like the
following in "9.11.5 CXL 1.1 device discovery":

    The OS device driver is responsible for setting Memory_Active after
    memory initialization is complete. Any subsequent accesses to the HDM
    memory are decoded and routed to the local memory by the device.

...which is impossible because Memory_Active is an RO bit. So yes, you
have discovered a gap in the Linux handling of the MEM_HWINIT_MODE=0
case, but I think there is a gap in the specification as well, regarding
what the OS can expect of memory readiness in that case.

> 
> I figured that I would clear this bit so I can load my driver for the
> CXL type 3 device and also reset the memory active bit in the HDM
> entry (range low address bits). I was experimenting this on QEMU
> version 6.0.50 from Ben Widawsky (branch cxl2v4-doe), running an
> Ubuntu 64 bit Kernel ver 5.18.0.rc7.  Here is what I see in dmesg
> output: 
> [   62.234755] cxl_pci 0000:09:00.0: timeout awaiting memory active after 60 seconds
> [   62.235793] cxl_mem mem0: Media not active (-110)
> [   62.236447] cxl_mem: probe of mem0 failed with error -110
> [  123.674758] cxl_pci 0000:09:00.0: timeout awaiting memory active after 60 seconds
> [  123.675800] cxl_mem mem0: Media not active (-110)
> [  123.676450] cxl_mem: probe of mem0 failed with error -110
> 
> In spite of these error messages, I could use the CXL memory without any issues!

Sure, the driver has no role to play if something else has already
programmed the HDM decoders.

> My purpose is to load my driver on this device and then initialize and
> bring the memory on line or use the memory for my own software not
> giving it to general memory pool of Linux. 

This is a flow that the native Linux driver will support. Memory will be
assigned to the device-dax subsystem. From there it can either remain
dedicated, accessed through a device special file, or optionally be
shared with the general Linux memory pool. See the "persistent
reconfiguration" section of this man page for examples of what this
policy might look like (it is still a work in progress):

https://github.com/pmem/ndctl/blob/main/Documentation/daxctl/daxctl-reconfigure-device.txt#L244
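As a rough illustration of the two outcomes (dedicated device-dax access versus handing the range to the general memory pool), the sketch below builds the corresponding `daxctl reconfigure-device` command lines. The device name `dax0.0` and the helper function are hypothetical; the sketch only assembles arguments and does not run anything.

```python
import shlex
from typing import List


def daxctl_reconfigure(device: str, mode: str, online: bool = True) -> List[str]:
    """Build argv for `daxctl reconfigure-device`; nothing is executed here."""
    if mode not in ("devdax", "system-ram"):
        raise ValueError(f"unsupported mode: {mode}")
    argv = ["daxctl", "reconfigure-device", f"--mode={mode}"]
    if mode == "system-ram" and not online:
        # Hot-add the memory but leave onlining to a later step.
        argv.append("--no-online")
    argv.append(device)
    return argv


if __name__ == "__main__":
    # "dax0.0" is a placeholder device name.
    print(shlex.join(daxctl_reconfigure("dax0.0", "devdax")))
    print(shlex.join(daxctl_reconfigure("dax0.0", "system-ram", online=False)))
```

In devdax mode the range stays behind its device special file; in system-ram mode it is offered to the kernel as ordinary memory.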


> 
> Since I am new to this alias, and not following the threads fully (did
> do a search though), can someone point me in the right
> direction/clues?
> 
> My questions are:
> 1.	Shouldn't the Linux CXL code have looked at the MEM_HWINIT_MODE
> bit first and if that is not set, should not have complained about
> this mem active bit, assuming that the software/driver will initialize
> and then make the memory active at a later time ?

Perhaps, if there is clarity about the OS responsibilities in the
MEM_HWINIT_MODE=0 case.

> 2.	If a vendor driver loads,

For a generic memory expander device, why is a vendor driver in the
picture? For accelerators the expectation is that they can use some of
the core CXL infrastructure to manage HDM, but we (the upstream Linux
community) are still awaiting the first use case to take advantage of
those core APIs.

> talks to the hardware to initialize and make the hardware change the
> memory active bit state, how does the vendor driver, if it wants to,
> gives the memory back to the kernel? Should there be a kernel call for
> that purpose or this is very similar to dax initialization and so there
> is no further need for a kernel call? 

It's similar enough to dax that the shortest path to get what you
want is to just use dax and the generic driver, unless I am missing a
reason why that is not possible?


* RE: DVSEC FLEX BUS CAP bit 3: MEM_HWINIT_MODE
  2022-06-27 19:48 ` Dan Williams
@ 2022-06-28 15:34   ` Mani Subramaniyan
  2022-06-28 19:49     ` Dan Williams
  0 siblings, 1 reply; 5+ messages in thread
From: Mani Subramaniyan @ 2022-06-28 15:34 UTC (permalink / raw)
  To: Dan Williams, linux-cxl

Thanks Dan for your answers!
As for MEM_HwInit_Mode, the spec indicates that when it is set, the hardware has to set memory active in the range low address bits within 100 msec (it has to complete its internal initialization within that time).

This leads me to assume the following for a software initialization:
1. Both mem_hwinit_mode and mem_active are set to 0 by hardware.
2. OS loads a driver (either generic or vendor specific) that talks to the hardware and does the hardware initialization. Hardware, after completing the initialization, also changes the mem_active bit state to 1.
3. Software/driver checks the state of mem_active and then sets up that memory (either mmap for its own use or for system use).
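The three steps I am assuming can be modeled roughly as follows. This is a toy model, not a real hardware interface: the class, the `start_init` trigger, and the return strings are all made up to show that the driver only reads mem_active while the hardware is the one that changes it.

```python
class ModelDevice:
    """Toy model of a device with mem_hwinit_mode=0."""

    def __init__(self) -> None:
        self.mem_hwinit_mode = 0   # step 1: both bits start at 0
        self._mem_active = 0

    @property
    def mem_active(self) -> int:
        # Read-only from the driver's point of view: there is no setter.
        return self._mem_active

    def start_init(self) -> None:
        # Hypothetical vendor-specific trigger. The hardware itself flips
        # mem_active when its initialization finishes.
        self._mem_active = 1


def driver_probe(dev: ModelDevice) -> str:
    """Walk through steps 2 and 3 of the assumed flow."""
    if dev.mem_hwinit_mode == 0 and not dev.mem_active:
        dev.start_init()               # step 2: driver asks hardware to initialize
    if not dev.mem_active:             # step 3: driver checks the RO bit
        return "media not active"
    return "memory ready to map"


if __name__ == "__main__":
    print(driver_probe(ModelDevice()))
```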

You are right about the spec wording. Yes, the driver cannot make mem_active change state, but the hardware should change the state itself once it completes initialization.

Some other replies/questions inline...

Thanks
ManiS

-----Original Message-----
From: Dan Williams <dan.j.williams@intel.com> 

Mani Subramaniyan wrote:
> While reading the CXL spec (both 1.1 and 2.0), of the DVSEC Flex bus 
> cap (offset 0ah, bit 3), I see this: "3 RO Mem_HwInit_Mode: If set, 
> indicates this CXL.mem capable device initializes memory with 
> assistance from hardware and firmware located on the device. If clear, 
> indicates memory is initialized by host software such as device 
> driver. This bit must be ignored if CXL.mem_Capable=0."

Hmm, to me the specification does not indicate how OS drivers can initialize memory in the MEM_HWINIT_MODE=0 case. It says impossible things like the following in "9.11.5 CXL 1.1 device discovery":

    The OS device driver is responsible for setting Memory_Active after
    memory initialization is complete. Any subsequent accesses to the HDM
    memory are decoded and routed to the local memory by the device.

...which is impossible because Memory_Active is an RO bit. So yes, you have discovered a gap in the Linux handling of the MEM_HWINIT_MODE=0 case, but I think there is a gap in the specification as well, regarding what the OS can expect of memory readiness in that case.

ManiS: Agreed. The wording has to say that the hardware changes the value of mem_active to 1 when the initialization is complete and successful.

> 
> I figured that I would clear this bit so I can load my driver for the 
> CXL type 3 device and also reset the memory active bit in the HDM 
> entry (range low address bits). I was experimenting this on QEMU 
> version 6.0.50 from Ben Widawsky (branch cxl2v4-doe), running an 
> Ubuntu 64 bit Kernel ver 5.18.0.rc7.  Here is what I see in dmesg
> output: 
> [   62.234755] cxl_pci 0000:09:00.0: timeout awaiting memory active after 60 seconds
> [   62.235793] cxl_mem mem0: Media not active (-110)
> [   62.236447] cxl_mem: probe of mem0 failed with error -110
> [  123.674758] cxl_pci 0000:09:00.0: timeout awaiting memory active after 60 seconds
> [  123.675800] cxl_mem mem0: Media not active (-110)
> [  123.676450] cxl_mem: probe of mem0 failed with error -110
> 
> In spite of these error messages, I could use the CXL memory without any issues!

Sure, the driver has no role to play if something else has already programmed the HDM decoders.

> My purpose is to load my driver on this device and then initialize and 
> bring the memory on line or use the memory for my own software not 
> giving it to general memory pool of Linux.

This is a flow that the native Linux driver will support. Memory will be assigned to the device-dax subsystem. From there it can either remain dedicated, accessed through a device special file, or optionally be shared with the general Linux memory pool. See the "persistent reconfiguration" section of this man page for examples of what this policy might look like (it is still a work in progress):

https://github.com/pmem/ndctl/blob/main/Documentation/daxctl/daxctl-reconfigure-device.txt#L244

ManiS: Thanks for the link. Yes, that may work for my purpose.  
> 
> Since I am new to this alias, and not following the threads fully (did 
> do a search though), can someone point me in the right 
> direction/clues?
> 
> My questions are:
> 1.	Shouldn't the Linux CXL code have looked at the MEM_HWINIT_MODE
> bit first and if that is not set, should not have complained about 
> this mem active bit, assuming that the software/driver will initialize 
> and then make the memory active at a later time ?

Perhaps, if there is clarity about the OS responsibilities in the
MEM_HWINIT_MODE=0 case.

ManiS: Agreed. My interpretation is an extrapolation from the spec text for the mem_hwinit_mode=1 case.

> 2.	If a vendor driver loads,

For a generic memory expander device, why is a vendor driver in the picture? For accelerators the expectation is that they can use some of the core CXL infrastructure to manage HDM, but we (the upstream Linux community) are still awaiting the first use case to take advantage of those core APIs.

> talks to the hardware to initialize and make the hardware change the 
> memory active bit state, how does the vendor driver, if it wants to, 
> gives the memory back to the kernel? Should there be a kernel call for 
> that purpose or this is very similar to dax initialization and so 
> there is no further need for a kernel call?

It's similar enough to dax that the shortest path to get what you want is to just use dax and the generic driver, unless I am missing a reason why that is not possible?

ManiS: Yes, device dax is enough for most purposes. I can't think of a usage where we need anything more at this point. Thanks!


* RE: DVSEC FLEX BUS CAP bit 3: MEM_HWINIT_MODE
  2022-06-28 15:34   ` Mani Subramaniyan
@ 2022-06-28 19:49     ` Dan Williams
  2022-06-29 16:02       ` Mani Subramaniyan
  0 siblings, 1 reply; 5+ messages in thread
From: Dan Williams @ 2022-06-28 19:49 UTC (permalink / raw)
  To: Mani Subramaniyan, Dan Williams, linux-cxl

Mani Subramaniyan wrote:
> Thanks Dan for your answers!
> As for MEM_HwInit_Mode, spec indicates that when it is set, hardware
> has to set memory active in the range low address bits within 100msec
> (it has to complete its internal initializations within that time).
> 
> This leads me to assume the following for a software initialization:
> 1. Both mem_hwinit_mode and mem_active are set to 0 by hardware.
> 2. OS loads a driver (either generic or vendor specific) that talks to
> the hardware and does the hardware initialization. Hardware, after
> completing the initialization, also changes the mem_active bit state
> to 1.

If mem_hwinit_mode=0 then the device is using a custom memory
initialization flow and the generic driver can not be used. This is
because mem_hwinit_mode=0 implies a non-generic / non-standard
initialization mechanism. So I think a device that sets
mem_hwinit_mode=0 must either not specify the common class code and use
a vendor driver, or it needs to make sure that the platform firmware has
initialized the device before the generic OS driver attaches.
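Spelled out as a decision rule, the reasoning above looks roughly like this. It is only a sketch of the argument, not existing kernel logic: the function and parameter names are invented, and the mem_hwinit_mode=1 branch assumes the generic driver can simply wait for the device to report memory active.

```python
def generic_driver_can_attach(mem_hwinit_mode: int,
                              has_common_class_code: bool,
                              firmware_initialized: bool) -> bool:
    """Return True if the generic memory-expander driver is the right owner."""
    if not has_common_class_code:
        # Device opted out of the common class code: a vendor driver binds.
        return False
    if mem_hwinit_mode:
        # Device initializes its own memory (assumed: generic driver just waits).
        return True
    # mem_hwinit_mode=0 with the common class code is only workable when
    # platform firmware finished the custom initialization beforehand.
    return firmware_initialized


if __name__ == "__main__":
    # The experiment in this thread: common class code, hwinit_mode cleared,
    # and nothing initialized the device before the driver attached.
    print(generic_driver_can_attach(0, True, False))
```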


* RE: DVSEC FLEX BUS CAP bit 3: MEM_HWINIT_MODE
  2022-06-28 19:49     ` Dan Williams
@ 2022-06-29 16:02       ` Mani Subramaniyan
  0 siblings, 0 replies; 5+ messages in thread
From: Mani Subramaniyan @ 2022-06-29 16:02 UTC (permalink / raw)
  To: Dan Williams, linux-cxl

Yes, my mistake. Only vendor drivers have to load and initialize when mem_hwinit_mode is 0.

That said:
1. The CXL code in Linux has to check for the mem_hwinit_mode==0 case and take the right action. It doesn't check it right now. When I can, I will propose a patch for it here.
2. The CXL spec has to add some wording to clarify this point. I will follow up with the consortium.

Thanks
ManiS

-----Original Message-----
From: Dan Williams <dan.j.williams@intel.com> 
Sent: Tuesday, June 28, 2022 12:49 PM
To: Mani Subramaniyan <mani.subramaniyan@elastics.cloud>; Dan Williams <dan.j.williams@intel.com>; linux-cxl@vger.kernel.org
Subject: RE: DVSEC FLEX BUS CAP bit 3: MEM_HWINIT_MODE

Mani Subramaniyan wrote:
> Thanks Dan for your answers!
> As for MEM_HwInit_Mode, spec indicates that when it is set, hardware 
> has to set memory active in the range low address bits within 100msec 
> (it has to complete its internal initializations within that time).
> 
> This leads me to assume the following for a software initialization:
> 1. Both mem_hwinit_mode and mem_active are set to 0 by hardware.
> 2. OS loads a driver (either generic or vendor specific) that talks to 
> the hardware and does the hardware initialization. Hardware, after 
> completing the initialization, also changes the mem_active bit state 
> to 1.

If mem_hwinit_mode=0 then the device is using a custom memory initialization flow and the generic driver can not be used. This is because mem_hwinit_mode=0 implies a non-generic / non-standard initialization mechanism. So I think a device that sets
mem_hwinit_mode=0 must either not specify the common class code and use a vendor driver, or it needs to make sure that the platform firmware has initialized the device before the generic OS driver attaches.

