linux-kernel.vger.kernel.org archive mirror
* Re: [RFC PATCH 0/9] CXL 2.0 Support
       [not found] <FB00A034-7C6D-40B1-8452-318A3B052216@avery-design.com>
@ 2020-12-04 17:40 ` Chris Browy
  2020-12-04 18:12   ` Ben Widawsky
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Browy @ 2020-12-04 17:40 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: bhelgaas, dan.j.williams, ira.weiny, linux-acpi, linux-cxl,
	linux-kernel, linux-pci, rafael.j.wysocki, sean.v.kelley,
	vishal.l.verma

Hi Ben,

Trying to bring up the environment using the latest developments as follows:

1. Linux kernel baseline version is cloned using
     git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
   Using the master branch.  Merged the 9 CXL Linux kernel patches manually and built the kernel

2. QEMU baseline version is cloned using
     git clone https://gitlab.com/bwidawsk/qemu.git

3. UEFI baseline is cloned using
     git clone https://github.com/tianocore/edk2.git
   Using master and built

4. Now can run qemu as follows:
     The qcow2 we use is based on Ubuntu 20.10, updated with the kernel from 1) above

     QEMU command:

     sudo qemu-system-x86_64 -nic \
     user,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -machine \
     type=pc-q35-4.0,hmat=on,accel=kvm -enable-kvm -cpu host -smp \
     6,cores=6,threads=1,sockets=1 -m 8G -boot order=d -k 'en-us' -vga virtio \
     -drive file=/home/chris/Downloads/AQCXL/ubuntu_20.qcow,format=qcow2 -drive \
     if=pflash,format=raw,readonly,file=/home/chris/OVMF_CODE.fd \
     -drive if=pflash,format=raw,file=/home/chris/OVMF_VARS.fd \
     -object memory-backend-file,id=cxl-mem1,share,mem-path=/tmp/cxl-test/cxl,size=512M \
     -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,len-window-base=1,\
     window-base[0]=0x4c0000000,memdev[0]=cxl-mem1 \
     -device cxl-rp,id=rp0,bus=cxl.0,addr=0.0,chassis=0,slot=0  \
     -device cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M  2>&1 | tee -a \
     /home/chris/Downloads/AQCXL/log/qemu.log

   The qemu options are derived from looking at tests/qtest/cxl-test.c,
   along with hmat=on, which seemed to make sense.

   The system boots, and lspci -vvv shows the CXL device is enumerated, but
   no DOE capability register for CDAT access (see below).  Otherwise the
   DVSEC registers are present.

   acpidump indicates the CXL0 and CXLM devices, but no SRAT or HMAT tables are
   in the dump, which is curious.


35:00.0 Memory controller [0502]: Intel Corporation Device 0d93 (rev 01) (prog-if 10)
    Subsystem: Red Hat, Inc. Device 1100
    Physical Slot: 0
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Region 0: Memory at c0a00000 (64-bit, non-prefetchable) [size=64K]
    Region 2: Memory at c0a10000 (64-bit, non-prefetchable) [size=4K]
    Capabilities: [80] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
        DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <64ns
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
            TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis-, NROPrPrP-, LTR-
             10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt+, EETLPPrefix+, MaxEETLPPrefixes 4
               EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
             FRS-, TPHComp-, ExtTPHComp-
             AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
             AtomicOpsCtl: ReqEn-
        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
             EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [100 v1] Designated Vendor-Specific <?>
    Capabilities: [138 v1] Designated Vendor-Specific <?>
    Kernel driver in use: cxl_mem

Questions/Comments:
-------------------
1. Linux
  a. Is there a gitlab for the linux kernel patches for CXL?  This would
     facilitate review and code modifications.

2. UEFI (edk2 from tianocore)
  a. seems to support only CXL 1.1, which means only method #1 (Device
     option ROM) of the Coherent Device Attribute Table 1.02 spec
     for CDAT handling is possible now.

     Does device option ROM need to be added to QEMU CXL setup?

     Can we add a CXL 1.1 emulated device?

  b. lspci doesn’t show the existence of the DOE extended capability register
     in the CXL CT3D (needed to support method #2).  Are there more patches?

3. Do you have example user programs to share or better yet the CXL 2.0
   Sec 14.3.6.1 Application Layer/ Transaction layer test for CXL.mem?

4. What are the userspace system APIs for targeting CXL HDM address domain?
   Usually you can mmap a SPA if you know how to look it up.


Best Regards,
Chris Browy




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 0/9] CXL 2.0 Support
  2020-12-04 17:40 ` [RFC PATCH 0/9] CXL 2.0 Support Chris Browy
@ 2020-12-04 18:12   ` Ben Widawsky
  2020-12-04 19:26     ` Verma, Vishal L
       [not found]     ` <B4FDE6CB-DA00-4F58-AA9F-F04E678E076B@avery-design.com>
  0 siblings, 2 replies; 9+ messages in thread
From: Ben Widawsky @ 2020-12-04 18:12 UTC (permalink / raw)
  To: Chris Browy
  Cc: bhelgaas, dan.j.williams, ira.weiny, linux-acpi, linux-cxl,
	linux-kernel, linux-pci, rafael.j.wysocki, sean.v.kelley,
	vishal.l.verma

Hi Chris.

On 20-12-04 12:40:03, Chris Browy wrote:
> Hi Ben,
> 
> Trying to bring up the environment using the latest developments as follows:
> 
> 1. Linux kernel baseline version is cloned using
>      git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
>    Using master branch.  Merged the 9 CXL linux kernel patches manually and built kernel
> 
> 2. QEMU baseline version is cloned using
>      git clone https://gitlab.com/bwidawsk/qemu.git
> 
> 3. UEFI baseline is cloned using
>      git clone https://github.com/tianocore/edk2.git
>    Using master and built
> 
> 4. Now can run qemu as follows:
>      The qcow2 we use is based on Ubuntu 20.10 with updated with kernel from 1) above
> 
>      QEMU command:
> 
>      sudo qemu-system-x86_64 -nic \
>      user,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -machine \
>      type=pc-q35-4.0,hmat=on,accel=kvm -enable-kvm -cpu host -smp \
>      6,cores=6,threads=1,sockets=1 -m 8G -boot order=d -k 'en-us' -vga virtio \
>      -drive file=/home/chris/Downloads/AQCXL/ubuntu_20.qcow,format=qcow2 -drive \
>      if=pflash,format=raw,readonly,file=/home/chris/OVMF_CODE.fd \
>      -drive if=pflash,format=raw,file=/home/chris/OVMF_VARS.fd \
>      -object memory-backend-file,id=cxl-mem1,share,mem-path=/tmp/cxl-test/cxl,size=512M \
>      -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,len-window-base=1,\
>      window-base[0]=0x4c0000000,memdev[0]=cxl-mem1 \
>      -device cxl-rp,id=rp0,bus=cxl.0,addr=0.0,chassis=0,slot=0  \
>      -device cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M  2>&1 | tee -a \
>      /home/chris/Downloads/AQCXL/log/qemu.log
> 
>    The qemu options are derived from looking at the tests/qtests/cxl-test.c
>    along with the -hmat=on which seemed to make sense.
> 
>    The system boots and lspci -vvv shows the CXL device is enumerated.  But
>    no DOE capability register for CDAT access though (see below).  Otherwise the
>    DVSEC registers are present.

DOE is not supported yet in either Linux or QEMU. For us, CDAT isn't a high
priority yet so it likely won't be done for a while. I'd really like to see DOE
support added by someone - not me - so that we can wire it up. Not sure what
that would look like in the QEMU side.

> 
>    acpidump indicates the CXL0 and CXLM devices but no SRAT or HMAT tables are
>    in the dump which is curious.

I don't typically use HMAT, but I do have an SRAT in mine, so that's strange.
You should also have a CEDT.

> 
> 
> 35:00.0 Memory controller [0502]: Intel Corporation Device 0d93 (rev 01) (prog-if 10)
>     Subsystem: Red Hat, Inc. Device 1100
>     Physical Slot: 0
>     Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>     Latency: 0
>     Region 0: Memory at c0a00000 (64-bit, non-prefetchable) [size=64K]
>     Region 2: Memory at c0a10000 (64-bit, non-prefetchable) [size=4K]
>     Capabilities: [80] Express (v2) Endpoint, MSI 00
>         DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
>             ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
>         DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
>             RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
>             MaxPayload 128 bytes, MaxReadReq 128 bytes
>         DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
>         LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <64ns
>             ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>         LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
>             ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>         LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
>             TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt-
>         DevCap2: Completion Timeout: Not Supported, TimeoutDis-, NROPrPrP-, LTR-
>              10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt+, EETLPPrefix+, MaxEETLPPrefixes 4
>                EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
>              FRS-, TPHComp-, ExtTPHComp-
>              AtomicOpsCap: 32bit- 64bit- 128bitCAS-
>         DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>              AtomicOpsCtl: ReqEn-
>         LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
>              Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>              Compliance De-emphasis: -6dB
>         LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>              EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>     Capabilities: [100 v1] Designated Vendor-Specific <?>
>     Capabilities: [138 v1] Designated Vendor-Specific <?>
>     Kernel driver in use: cxl_mem
> 
> Questions/Comments:
> -------------------
> 1. Linux
>   a. Is there a gitlab for the linux kernel patches for CXL?  This would
>      facilitate review and code modifications.

We're hopefully going to send out v2 in the next couple of days. I'll push the
repo somewhere as well.

> 
> 2. UEFI (edk2 from tianocore)
>   a. seems to only support CXL 1.1 which means only method #1 (Device
>      option ROM) of Coherent Device Attribute Table_1.02 spec
>      for CDAT handling is possible now.
> 
>      Does device option ROM need to be added to QEMU CXL setup?
> 
>      Can we add a CXL 1.1 emulated device?

Patches welcome :-). I know of other people who want this, but I only care about
2.0+, so I have no intention to implement it.

> 
>   b. lspci doesn’t show the existence of the DOE extended capability register
>      in the CXL CT3D (needed to support method #2).  Are there more patches?

As above, it's not supported. I'm hoping someone else will do that work since I
don't care about it just yet.

> 
> 3. Do you have example user programs to share or better yet the CXL 2.0
>    Sec 14.3.6.1 Application Layer/ Transaction layer test for CXL.mem?
> 

I don't have any, mostly because I haven't actually implemented a lot of the real
CXL support. My primary concern was having the Linux driver be able to enumerate
devices and communicate with the device via the mailbox interface. v2 will
contain support for userspace to do this, which I think is a step toward what
you're asking for.

> 4. What are the userspace system APIs for targeting CXL HDM address domain?
>    Usually you can mmap a SPA if you know how to look it up.

I think Dan answered this in the other thread...

> 
> 
> Best Regards,
> Chris Browy
> 
> 
> 


* Re: [RFC PATCH 0/9] CXL 2.0 Support
  2020-12-04 18:12   ` Ben Widawsky
@ 2020-12-04 19:26     ` Verma, Vishal L
  2020-12-04 19:45       ` Dan Williams
       [not found]     ` <B4FDE6CB-DA00-4F58-AA9F-F04E678E076B@avery-design.com>
  1 sibling, 1 reply; 9+ messages in thread
From: Verma, Vishal L @ 2020-12-04 19:26 UTC (permalink / raw)
  To: Widawsky, Ben, cbrowy
  Cc: Kelley, Sean V, Weiny, Ira, linux-cxl, linux-kernel, Williams,
	Dan J, Wysocki, Rafael J, linux-pci, bhelgaas, linux-acpi

On Fri, 2020-12-04 at 10:12 -0800, Ben Widawsky wrote:
> Hi Chris.
> 
> On 20-12-04 12:40:03, Chris Browy wrote:
[..]
> 
> >    acpidump indicates the CXL0 and CXLM devices but no SRAT or HMAT tables are
> >    in the dump which is curious.
> 
> I don't typically use HMAT, but I do have an SRAT in mine, so that's strange.
> You should also have a CEDT.
> 
I suspect an SRAT is only added if you have distinct numa nodes. Adding
a few '-numa node' bits to the qemu command line should be enough to
make that happen.
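
For example, a fragment along these lines appended to the invocation
(backend names and sizes are illustrative, not from a tested command)
should yield two nodes and hence an SRAT:

```shell
# Illustrative: back each NUMA node with its own RAM object
-object memory-backend-ram,id=mem0,size=2048M \
-numa node,nodeid=0,memdev=mem0 \
-object memory-backend-ram,id=mem1,size=2048M \
-numa node,nodeid=1,memdev=mem1
```

Node distances can then be described with '-numa dist,src=...,dst=...,val=...'
if an SLIT is wanted too.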


* Re: [RFC PATCH 0/9] CXL 2.0 Support
  2020-12-04 19:26     ` Verma, Vishal L
@ 2020-12-04 19:45       ` Dan Williams
  0 siblings, 0 replies; 9+ messages in thread
From: Dan Williams @ 2020-12-04 19:45 UTC (permalink / raw)
  To: Verma, Vishal L
  Cc: Widawsky, Ben, cbrowy, Kelley, Sean V, Weiny, Ira, linux-cxl,
	linux-kernel, Wysocki, Rafael J, linux-pci, bhelgaas, linux-acpi

On Fri, Dec 4, 2020 at 11:26 AM Verma, Vishal L
<vishal.l.verma@intel.com> wrote:
>
> On Fri, 2020-12-04 at 10:12 -0800, Ben Widawsky wrote:
> > Hi Chris.
> >
> > On 20-12-04 12:40:03, Chris Browy wrote:
> [..]
> >
> > >    acpidump indicates the CXL0 and CXLM devices but no SRAT or HMAT tables are
> > >    in the dump which is curious.
> >
> > I don't typically use HMAT, but I do have an SRAT in mine, so that's strange.
> > You should also have a CEDT.
> >
> I suspect an SRAT is only added if you have distinct numa nodes. Adding
> a few '-numa node' bits to the qemu command line should be enough to
> make that happen.

For CXL-2.0-Type-3, BIOS is responsible for retrieving CDATs and
synthesizing SRAT/SLIT/HMAT tables for the CXL.mem that is mapped by
platform firmware. For CXL.mem that is mapped by the OS, there is no
requirement to publish updated ACPI tables. CXL.mem mapped by the OS
need only support native CXL memory enumeration and leave ACPI only
for static platform resources.


* Re: [RFC PATCH 0/9] CXL 2.0 Support
       [not found]     ` <B4FDE6CB-DA00-4F58-AA9F-F04E678E076B@avery-design.com>
@ 2020-12-07  4:40       ` Chris Browy
  2020-12-09  0:48         ` Ben Widawsky
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Browy @ 2020-12-07  4:40 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: bhelgaas, dan.j.williams, ira.weiny, linux-acpi, linux-cxl,
	linux-kernel, linux-pci, rafael.j.wysocki, sean.v.kelley,
	vishal.l.verma

Hi Ben,


>On Dec 4, 2020, at 1:12 PM, Ben Widawsky <ben.widawsky@intel.com> wrote:
>
>Hi Chris.
>
>On 20-12-04 12:40:03, Chris Browy wrote:
>
>Hi Ben,
>
>Trying to bring up the environment using the latest developments as follows:
>
>1. Linux kernel baseline version is cloned using
>     git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
>   Using master branch.  Merged the 9 CXL linux kernel patches manually and built kernel
>
>2. QEMU baseline version is cloned using
>     git clone https://gitlab.com/bwidawsk/qemu.git
>
>3. UEFI baseline is cloned using
>     git clone https://github.com/tianocore/edk2.git
>   Using master and built
>
>4. Now can run qemu as follows:
>     The qcow2 we use is based on Ubuntu 20.10 with updated with kernel from 1) above
>
>     QEMU command:
>
>     sudo qemu-system-x86_64 -nic \
>     user,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -machine \
>     type=pc-q35-4.0,hmat=on,accel=kvm -enable-kvm -cpu host -smp \
>     6,cores=6,threads=1,sockets=1 -m 8G -boot order=d -k 'en-us' -vga virtio \
>     -drive file=/home/chris/Downloads/AQCXL/ubuntu_20.qcow,format=qcow2 -drive \
>     if=pflash,format=raw,readonly,file=/home/chris/OVMF_CODE.fd \
>     -drive if=pflash,format=raw,file=/home/chris/OVMF_VARS.fd \
>     -object memory-backend-file,id=cxl-mem1,share,mem-path=/tmp/cxl-test/cxl,size=512M \
>     -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,len-window-base=1,\
>     window-base[0]=0x4c0000000,memdev[0]=cxl-mem1 \
>     -device cxl-rp,id=rp0,bus=cxl.0,addr=0.0,chassis=0,slot=0  \
>     -device cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M  2>&1 | tee -a \
>     /home/chris/Downloads/AQCXL/log/qemu.log
>
>   The qemu options are derived from looking at the tests/qtests/cxl-test.c
>   along with the -hmat=on which seemed to make sense.
>
>   The system boots and lspci -vvv shows the CXL device is enumerated.  But
>   no DOE capability register for CDAT access though (see below).  Otherwise the
>   DVSEC registers are present.
>
>DOE is not supported yet in either Linux or QEMU. For us, CDAT isn't a high
>priority yet so it likely won't be done for a while. I'd really like to see DOE
>support added by someone - not me - so that we can wire it up. Not sure what
>that would look like in the QEMU side.
>
>
>
>   acpidump indicates the CXL0 and CXLM devices but no SRAT or HMAT tables are
>   in the dump which is curious.
>
>I don't typically use HMAT, but I do have an SRAT in mine, so that's strange.
>You should also have a CEDT.
>
Could you provide the QEMU command line?  I was not successful adding a numa node for cxl-mem1
and RAM.  Leaving out the numa node for cxl-mem1, I can now see an SRAT table being created, but
that's not exactly the point.

Are you using UEFI or legacy BIOS to boot?

Reproducing a known working environment is better for now; we can deviate to new
configurations from there.

>
>
>35:00.0 Memory controller [0502]: Intel Corporation Device 0d93 (rev 01) (prog-if 10)
>    Subsystem: Red Hat, Inc. Device 1100
>    Physical Slot: 0
>    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>    Latency: 0
>    Region 0: Memory at c0a00000 (64-bit, non-prefetchable) [size=64K]
>    Region 2: Memory at c0a10000 (64-bit, non-prefetchable) [size=4K]
>    Capabilities: [80] Express (v2) Endpoint, MSI 00
>        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
>            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
>        DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
>            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
>            MaxPayload 128 bytes, MaxReadReq 128 bytes
>        DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
>        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <64ns
>            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
>            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>        LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
>            TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt-
>        DevCap2: Completion Timeout: Not Supported, TimeoutDis-, NROPrPrP-, LTR-
>             10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt+, EETLPPrefix+, MaxEETLPPrefixes 4
>               EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
>             FRS-, TPHComp-, ExtTPHComp-
>             AtomicOpsCap: 32bit- 64bit- 128bitCAS-
>        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>             AtomicOpsCtl: ReqEn-
>        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
>             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>             Compliance De-emphasis: -6dB
>        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>             EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>    Capabilities: [100 v1] Designated Vendor-Specific <?>
>    Capabilities: [138 v1] Designated Vendor-Specific <?>
>    Kernel driver in use: cxl_mem
Can you check this in QEMU's cxl_component_register_init_common() in cxl-component-utils.c?


I used devmem2 to dump out the CXL Component CXL.mem registers located at BAR + 0x1000.  Header
list:
     04110000 CXL CAP
     08010002 CXL RAS
     0D820004 CXL Link
     11010005 CXL HDM Decoder
     00000000

I think the 1st entry, the CXL Capability Header register, should instead be
0x03110001.


Capability_ID looks like it is set using ARRAY_FIELD_DP32

    /* CXL Capability Header Register */
    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ID, 1);
    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION, 1);
    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);


But the value reported for ID is ‘0’.

Since only the RAS, Link, and HDM headers are found, the Array_Size should be ‘3’
according to the spec:

Array_Size: This defines the number of elements present in
the CXL_Capability array, not including the
CXL_Capability_Header element. Each element is 1 DWORD in
size and is located contiguous with previous elements.


‘caps’ should be one less than what is calculated in the switch statement:

void cxl_component_register_init_common(uint32_t *reg_state, enum reg_type type)
{
    int caps = 0;
    switch (type) {
    case CXL2_DOWNSTREAM_PORT:
    case CXL2_DEVICE:
        /* CAP, RAS, Link */
        caps = 3;
        break;
    case CXL2_UPSTREAM_PORT:
    case CXL2_TYPE3_DEVICE:
    case CXL2_LOGICAL_DEVICE:
        /* + HDM */
        caps = 4;
        break;
    case CXL2_ROOT_PORT:
        /* + Extended Security, + Snoop */
        caps = 6;
        break;
    default:
        abort();
    }


>
>Questions/Comments:
>-------------------
>1. Linux
>  a. Is there a gitlab for the linux kernel patches for CXL?  This would
>     facilitate review and code modifications.
>
>We're hopefully going to send out v2 in the next couple of days. I'll push the
>repo somewhere as well.
>
That’s great!

>
>
>
>2. UEFI (edk2 from tianocore)
>  a. seems to only support CXL 1.1 which means only method #1 (Device
>     option ROM) of Coherent Device Attribute Table_1.02 spec
>     for CDAT handling is possible now.
>
>     Does device option ROM need to be added to QEMU CXL setup?
>
>     Can we add a CXL 1.1 emulated device?
>
>Patches welcome :-). I know of other people who want this, but I only care about
>2.0+, so I have no intention to implement it.
Can you summarize how the System Physical Address (SPA) gets assigned to the
CXL pmem, if not by the UEFI or CXL driver methods?

Is it some alternate method using the host bridge window-base and then
cxl_realize() in cxl_type3.c?

-device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,len-window-base=1, \
 window-base[0]=0x4c0000000,memdev[0]=cxl-mem1 \

The QEMU command line would tell us a lot, especially whether you
create two backend memories with NUMA nodes to model
RAM and PMEM.

Eventually the CXL driver would need to handle hot plug of
CT3D and access CDAT from the CT3D via DOE Mailbox.

As Dan mentions, the Coherent Device Attribute Table (CDAT) Specification r1.02
requires CDAT to build the SRAT/HMAT statically or dynamically:
 - Figure 1: Pre-boot CDAT Extraction Method (for CXL devices) (uses CDAT in option ROM)
 - Figure 2: OS Runtime CDAT Extraction Method (for CXL devices) (uses DOE mailbox)

>
>
>
>
>  b. lspci doesn’t show the existence of the DOE extended capability register
>     in the CXL CT3D (needed to support method #2).  Are there more patches?
>
>As above, it's not supported. I'm hoping someone else will do that work since I
>don't care about it just yet.

We can get a start on some QEMU updates to support DOE mailbox access.

>
>
>
>
>3. Do you have example user programs to share or better yet the CXL 2.0
>   Sec 14.3.6.1 Application Layer/ Transaction layer test for CXL.mem?
>
>I don't have, mostly because I haven't actually implemented a lot of the real
>CXL support. My primary concern was having the Linux driver be able to enumerate
>devices and communicate with the device via the mailbox interface. v2 will
>contain support for userspace to do this, which I think is a step toward what
>you're asking for.
>
Understood.  Looking forward to v2 and the Linux branch!  Working from a known-good
reference environment will allow for more in-depth code review.


>
>
>4. What are the userspace system APIs for targeting CXL HDM address domain?
>   Usually you can mmap a SPA if you know how to look it up.
>
>I think Dan answered this in the other thread…

Correct.


Best Regards,
Chris Browy





* Re: [RFC PATCH 0/9] CXL 2.0 Support
  2020-12-07  4:40       ` Chris Browy
@ 2020-12-09  0:48         ` Ben Widawsky
  0 siblings, 0 replies; 9+ messages in thread
From: Ben Widawsky @ 2020-12-09  0:48 UTC (permalink / raw)
  To: Chris Browy
  Cc: bhelgaas, dan.j.williams, ira.weiny, linux-acpi, linux-cxl,
	linux-kernel, linux-pci, rafael.j.wysocki, sean.v.kelley,
	vishal.l.verma

On 20-12-06 23:40:47, Chris Browy wrote:
> Hi Ben,
> 
> 
> >On Dec 4, 2020, at 1:12 PM, Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> >Hi Chris.
> >
> >On 20-12-04 12:40:03, Chris Browy wrote:
> >
> >Hi Ben,
> >
> >Trying to bring up the environment using the latest developments as follows:
> >
> >1. Linux kernel baseline version is cloned using
> >     git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
> >   Using master branch.  Merged the 9 CXL linux kernel patches manually and built kernel
> >
> >2. QEMU baseline version is cloned using
> >     git clone https://gitlab.com/bwidawsk/qemu.git
> >
> >3. UEFI baseline is cloned using
> >     git clone https://github.com/tianocore/edk2.git
> >   Using master and built
> >
> >4. Now can run qemu as follows:
> >     The qcow2 we use is based on Ubuntu 20.10 with updated with kernel from 1) above
> >
> >     QEMU command:
> >
> >     sudo qemu-system-x86_64 -nic \
> >     user,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -machine \
> >     type=pc-q35-4.0,hmat=on,accel=kvm -enable-kvm -cpu host -smp \
> >     6,cores=6,threads=1,sockets=1 -m 8G -boot order=d -k 'en-us' -vga virtio \
> >     -drive file=/home/chris/Downloads/AQCXL/ubuntu_20.qcow,format=qcow2 -drive \
> >     if=pflash,format=raw,readonly,file=/home/chris/OVMF_CODE.fd \
> >     -drive if=pflash,format=raw,file=/home/chris/OVMF_VARS.fd \
> >     -object memory-backend-file,id=cxl-mem1,share,mem-path=/tmp/cxl-test/cxl,size=512M \
> >     -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,len-window-base=1,\
> >     window-base[0]=0x4c0000000,memdev[0]=cxl-mem1 \
> >     -device cxl-rp,id=rp0,bus=cxl.0,addr=0.0,chassis=0,slot=0  \
> >     -device cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M  2>&1 | tee -a \
> >     /home/chris/Downloads/AQCXL/log/qemu.log
> >
> >   The qemu options are derived from looking at the tests/qtests/cxl-test.c
> >   along with the -hmat=on which seemed to make sense.
> >
> >   The system boots and lspci -vvv shows the CXL device is enumerated.  But
> >   no DOE capability register for CDAT access though (see below).  Otherwise the
> >   DVSEC registers are present.
> >
> >DOE is not supported yet in either Linux or QEMU. For us, CDAT isn't a high
> >priority yet so it likely won't be done for a while. I'd really like to see DOE
> >support added by someone - not me - so that we can wire it up. Not sure what
> >that would look like in the QEMU side.
> >
> >
> >
> >   acpidump indicates the CXL0 and CXLM devices but no SRAT or HMAT tables are
> >   in the dump which is curious.
> >
> >I don't typically use HMAT, but I do have an SRAT in mine, so that's strange.
> >You should also have a CEDT.
> >
> Could you provide the QEMU command line?  I was not successful adding numa node for cxl-mem1
> and RAM.  Leaving out numa node for cxl-mem1 I can now see  SRAT table being created but that’s
> not exactly the point.
> 
> Are you using UEFI or legacy BIOS to boot?
> 
> Reproducing a known working environment is better for now and then deviate for trying new
> configurations.

I'm using UEFI. I haven't built my own in a while. The script we use to run QEMU
pulls whatever is here: https://www.archlinux.org/packages/extra/any/edk2-ovmf/download/

-machine q35,accel=kvm,nvdimm,cxl
-m 8192M,slots=4,maxmem=40964M
-smp 8,sockets=2,cores=2,threads=2
-enable-kvm
-display none
-nographic
-drive if=pflash,format=raw,unit=0,file=OVMF_CODE.fd,readonly=on
-drive if=pflash,format=raw,unit=1,file=OVMF_VARS.fd
-debugcon file:uefi_debug.log
-global isa-debugcon.iobase=0x402
-drive file=root.img,format=raw,media=disk
-kernel ./mkosi.extra/boot/vmlinuz
-initrd mkosi.extra/boot/initramfs-5.10.0-rc2-BEN-00016-geb5b35e092af-dirty.img
-append selinux=0 audit=0 console=tty0 console=ttyS0 root=/dev/sda2 ignore_loglevel rw dyndbg="file drivers/cxl/mem.c +p"
-device e1000,netdev=net0,mac=52:54:00:12:34:56
-netdev user,id=net0,hostfwd=tcp::10022-:22
-object memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M
-device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,len-window-base=1,window-base[0]=0x4c0000000,memdev[0]=cxl-mem1
-device cxl-rp,id=rp0,bus=cxl.0,addr=0.0,chassis=0,slot=0
-device cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M
-snapshot
-object memory-backend-ram,id=mem0,size=2048M
-numa node,nodeid=0,memdev=mem0,
-numa cpu,node-id=0,socket-id=0
-object memory-backend-ram,id=mem1,size=2048M
-numa node,nodeid=1,memdev=mem1,
-numa cpu,node-id=1,socket-id=1
-object memory-backend-ram,id=mem2,size=2048M
-numa node,nodeid=2,memdev=mem2,
-object memory-backend-ram,id=mem3,size=2048M
-numa node,nodeid=3,memdev=mem3,
-numa node,nodeid=4,
-object memory-backend-file,id=nvmem0,share,mem-path=nvdimm-0,size=16384M,align=1G
-device nvdimm,memdev=nvmem0,id=nv0,label-size=2M,node=4
-numa node,nodeid=5,
-object memory-backend-file,id=nvmem1,share,mem-path=nvdimm-1,size=16384M,align=1G
-device nvdimm,memdev=nvmem1,id=nv1,label-size=2M,node=5
-numa dist,src=0,dst=0,val=10
-numa dist,src=0,dst=1,val=21
-numa dist,src=0,dst=2,val=12
-numa dist,src=0,dst=3,val=21
-numa dist,src=0,dst=4,val=17
-numa dist,src=0,dst=5,val=28
-numa dist,src=1,dst=1,val=10
-numa dist,src=1,dst=2,val=21
-numa dist,src=1,dst=3,val=12
-numa dist,src=1,dst=4,val=28
-numa dist,src=1,dst=5,val=17
-numa dist,src=2,dst=2,val=10
-numa dist,src=2,dst=3,val=21
-numa dist,src=2,dst=4,val=28
-numa dist,src=2,dst=5,val=28
-numa dist,src=3,dst=3,val=10
-numa dist,src=3,dst=4,val=28
-numa dist,src=3,dst=5,val=28
-numa dist,src=4,dst=4,val=10
-numa dist,src=4,dst=5,val=28
-numa dist,src=5,dst=5,val=10


> 
> >
> >
> >35:00.0 Memory controller [0502]: Intel Corporation Device 0d93 (rev 01) (prog-if 10)
> >    Subsystem: Red Hat, Inc. Device 1100
> >    Physical Slot: 0
> >    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> >    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> >    Latency: 0
> >    Region 0: Memory at c0a00000 (64-bit, non-prefetchable) [size=64K]
> >    Region 2: Memory at c0a10000 (64-bit, non-prefetchable) [size=4K]
> >    Capabilities: [80] Express (v2) Endpoint, MSI 00
> >        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
> >            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
> >        DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
> >            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> >            MaxPayload 128 bytes, MaxReadReq 128 bytes
> >        DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
> >        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <64ns
> >            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
> >        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
> >            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> >        LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
> >            TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt-
> >        DevCap2: Completion Timeout: Not Supported, TimeoutDis-, NROPrPrP-, LTR-
> >             10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt+, EETLPPrefix+, MaxEETLPPrefixes 4
> >               EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
> >             FRS-, TPHComp-, ExtTPHComp-
> >             AtomicOpsCap: 32bit- 64bit- 128bitCAS-
> >        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> >             AtomicOpsCtl: ReqEn-
> >        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
> >             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> >             Compliance De-emphasis: -6dB
> >        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
> >             EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> >    Capabilities: [100 v1] Designated Vendor-Specific <?>
> >    Capabilities: [138 v1] Designated Vendor-Specific <?>
> >    Kernel driver in use: cxl_mem
> Can you check this in QEMU's cxl_component_register_init_common() in cxl-component-utils.c?
> 
> 
> I used devmem2 to dump out the CXL Component CXL.mem registers located at BAR + 0x1000. Header
> List:
>      04110000 CXL CAP
>      08010002 CXL RAS
>      0D820004 CXL Link
>      11010005 CXL HDM Decoder
>      00000000
> 
> I think the 1st entry, the CXL Capability Header Register, should be
> 0x03110001
> 
> 
> Capability_ID looks like it is set using ARRAY_FIELD_DP32
> 
>     /* CXL Capability Header Register */
>     ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ID, 1);
>     ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION, 1);
>     ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
>     ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
> 
> 
> But the value reported for ID is ‘0’.
> 
> Since only the RAS, Link, and HDM headers are found, the Array_Size should be ‘3’
> according to the spec:
> 
> Array_Size: This defines the number of elements present in
> the CXL_Capability array, not including the
> CXL_Capability_Header element. Each element is 1 DWORD in
> size and is located contiguous with previous elements.
> 
> 
> ‘caps’ should be one less than that calculated in the switch statement:
> 
> void cxl_component_register_init_common(uint32_t *reg_state, enum reg_type type)
> {
>     int caps = 0;
>     switch (type) {
>     case CXL2_DOWNSTREAM_PORT:
>     case CXL2_DEVICE:
>         /* CAP, RAS, Link */
>         caps = 3;
>         break;
>     case CXL2_UPSTREAM_PORT:
>     case CXL2_TYPE3_DEVICE:
>     case CXL2_LOGICAL_DEVICE:
>         /* + HDM */
>         caps = 4;
>         break;
>     case CXL2_ROOT_PORT:
>         /* + Extended Security, + Snoop */
>         caps = 6;
>         break;
>     default:
>         abort();
>     }
> 
> 

You're correct. I've fixed it and pushed it to the v2 branch.

> >
> >Questions/Comments:
> >-------------------
> >1. Linux
> >  a. Is there a gitlab for the linux kernel patches for CXL?  This would
> >     facilitate review and code modifications.
> >
> >We're hopefully going to send out v2 in the next couple of days. I'll push the
> >repo somewhere as well.
> >
> That’s great!
> 
> >
> >
> >
> >2. UEFI (edk2 from tianocore)
> >  a. seems to only support CXL 1.1 which means only method #1 (Device
> >     option ROM) of Coherent Device Attribute Table_1.02 spec
> >     for CDAT handling is possible now.
> >
> >     Does device option ROM need to be added to QEMU CXL setup?
> >
> >     Can we add a CXL 1.1 emulated device?
> >
> >Patches welcome :-). I know of other people who want this, but I only care about
> >2.0+, so I have no intention to implement it.
> Can you summarize how the System Physical Address (SPA) gets assigned to the
> CXL pmem if not by the UEFI or CXL driver methods?

The PMEM address isn't currently assigned. It expects the driver to program the
HDMs. However, for testing I faked this a bit, grep for SET_PMEM_PADDR. I did
have a tool I used to verify the HDM programming, but it appears that's been
lost to the sands of time.


> 
> Is it some alternate method using the host bridge window-base and then
> cxl_realize() in cxl_type3.c?
> 
> -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,len-window-base=1, \
>  window-base[0]=0x4c0000000,memdev[0]=cxl-mem1 \
> 
> The full QEMU command would be informative, especially whether you
> create 2 memory backends with NUMA nodes to model
> RAM and PMEM.
> 
> Eventually the CXL driver would need to handle hot plug of
> CT3D and access CDAT from the CT3D via DOE Mailbox.
> 
> As Dan mentions, the Coherent Device Attribute Table (CDAT) Specification r1.02
> requires CDAT to build the SRAT/HMAT statically and dynamically:
>  - Figure 1 Pre-boot CDAT Extraction Method (for CXL devices) (uses CDAT in option ROM)
>  - Figure 2 OS Runtime CDAT Extraction Method (for CXL devices) (uses DOE mailbox)
> 
> >
> >
> >
> >
> >  b. lspci doesn’t show the existence of the DOE extended capability register
> >     in the CXL CT3D (needed to support method #2).  Are there more patches?
> >
> >As above, it's not supported. I'm hoping someone else will do that work since I
> >don't care about it just yet.
> 
> We can get a start on some QEMU updates to support DOE mailbox access.
> 
> >
> >
> >
> >
> >3. Do you have example user programs to share or better yet the CXL 2.0
> >   Sec 14.3.6.1 Application Layer/ Transaction layer test for CXL.mem?
> >
> >I don't have, mostly because I haven't actually implemented a lot of the real
> >CXL support. My primary concern was having the Linux driver be able to enumerate
> >devices and communicate with the device via the mailbox interface. v2 will
> >contain support for userspace to do this, which I think is a step toward what
> >you're asking for.
> >
> Understood. Looking forward to v2 and linux branch!  Working from a known good
> reference environment will allow for more in depth code review.
> 
> 
> >
> >
> >4. What are the userspace system APIs for targeting CXL HDM address domain?
> >   Usually you can mmap a SPA if you know how to look it up.
> >
> >I think Dan answered this in the other thread…
> 
> Correct 
> 
> 
> Best Regards,
> Chris Browy
> 
> 
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 0/9] CXL 2.0 Support
  2020-11-11  5:43 Ben Widawsky
  2020-11-11 22:06 ` Ben Widawsky
@ 2020-11-11 22:43 ` Bjorn Helgaas
  1 sibling, 0 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2020-11-11 22:43 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-kernel, linux-pci, linux-acpi, Dan Williams,
	Ira Weiny, Vishal Verma, Kelley, Sean V, Bjorn Helgaas,
	Rafael J . Wysocki

On Tue, Nov 10, 2020 at 09:43:47PM -0800, Ben Widawsky wrote:
> ...
> Ben Widawsky (5):
>   cxl/mem: Map memory device registers
>   cxl/mem: Find device capabilities
>   cxl/mem: Initialize the mailbox interface
>   cxl/mem: Implement polled mode mailbox
>   MAINTAINERS: Add maintainers of the CXL driver
> 
> Dan Williams (2):
>   cxl/mem: Add a driver for the type-3 mailbox

To include important words first and use "Type 3" as in the spec:

  cxl/mem: Add Type 3 mailbox driver

>   cxl/mem: Register CXL memX devices
> 
> Vishal Verma (2):
>   cxl/acpi: Add an acpi_cxl module for the CXL interconnect
>   cxl/acpi: add OSC support

For consistency:

  cxl/acpi: Add _OSC support

It's conventional in drivers/acpi and drivers/pci to capitalize the
"ACPI" and "PCI" initialisms except in actual C code.   Seems like
you're mostly doing the same with "CXL", except in the subject lines
above.  Since you're making a new directory, I guess you get to
choose.

I use "PCIe" (not "PCIE" or "PCI-E"; you have a mix) because that
seems to be the official abbreviation.


* Re: [RFC PATCH 0/9] CXL 2.0 Support
  2020-11-11  5:43 Ben Widawsky
@ 2020-11-11 22:06 ` Ben Widawsky
  2020-11-11 22:43 ` Bjorn Helgaas
  1 sibling, 0 replies; 9+ messages in thread
From: Ben Widawsky @ 2020-11-11 22:06 UTC (permalink / raw)
  To: linux-cxl
  Cc: linux-kernel, linux-pci, linux-acpi, Dan Williams, Ira Weiny,
	Vishal Verma, Kelley, Sean V, Bjorn Helgaas, Rafael J . Wysocki

Adding a cross reference to the QEMU work since I sent those patches after this:

https://gitlab.com/bwidawsk/qemu/-/tree/cxl-2.0
https://lists.nongnu.org/archive/html/qemu-devel/2020-11/msg02886.html

[snip]



* [RFC PATCH 0/9] CXL 2.0 Support
@ 2020-11-11  5:43 Ben Widawsky
  2020-11-11 22:06 ` Ben Widawsky
  2020-11-11 22:43 ` Bjorn Helgaas
  0 siblings, 2 replies; 9+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:43 UTC (permalink / raw)
  To: linux-cxl
  Cc: linux-kernel, linux-pci, linux-acpi, Dan Williams, Ira Weiny,
	Vishal Verma, Kelley, Sean V, Bjorn Helgaas, Rafael J . Wysocki,
	Ben Widawsky

Introduce support for “type-3” memory devices defined in the recently released
Compute Express Link (CXL) 2.0 specification. Specifically, these are the memory
devices defined by section 8.2.8.5 of the CXL 2.0 spec. A reference
implementation emulating these devices is being submitted to the QEMU mailing
list. “Type-3” is a CXL device that acts as a memory expander for RAM or PMEM.
It might be interleaved with other CXL devices in a given physical address
range.

These changes allow for foundational enumeration of CXL 2.0 memory devices. The
functionality present is:
- Initial driver bring-up
- Device enumeration and an initial sysfs representation
- Submit a basic firmware command via ‘mailbox’ to an emulated memory device
  with non-volatile capacity.

Some of the functionality that is still missing includes:
- Memory interleaving at the host bridge, root port, or switch level
- CXL 1.1 Root Complex Integrated Endpoint Support
- CXL 2.0 Hot plug support

In addition to the core functionality of discovering the spec defined registers
and resources, introduce a CXL device model that will be the foundation for
translating CXL capabilities into existing Linux infrastructure for Persistent
Memory and other memory devices. For now, this only includes support for the
management command mailbox that type-3 devices surface. These control devices
fill the role of “DIMMs” / nmemX memory-devices in LIBNVDIMM terms.

Now, while implementing the driver some feedback for the specification was
generated to cover perceived gaps and address conflicts. The feedback is
presented as a reference implementation in the driver and QEMU emulation.
Specifically the following concepts are original to the Linux implementation and
feedback / collaboration is requested to develop these into specification
proposals:
1. Top level ACPI object (ACPI0017)
2. HW imposed address space and interleave constraints
3. _OSC UUID A4D1629D-FF52-4888-BE96-E5CADE548DB1

ACPI0017
--------
Introduce a new ACPI namespace device with an _HID of ACPI0017. The purpose of
this object is twofold, support a legacy OS with a set of out-of-tree CXL
modules, and establish an attach point for a driver that knows about
interleaving. Both of these boil down to the same point, to centralize Operating
System support for resources described by the CXL Early Discovery Table (CEDT).

The legacy OS problem stems from the spec's description of a host bridge:
ACPI0016 is denoted as the _HID for host bridges, with a _CID of PNP0A08. In a
CXL unaware version of Linux, the core ACPI subsystem will bind a driver to
PNP0A08 and preclude a CXL-aware driver from binding to ACPI0016. An ACPI0017
device allows a standalone CXL-aware driver to register for handling /
coordinating CEDT and CXL-specific _OSC control.

Similarly when managing interleaving there needs to be some management layer
above the ACPI0016 device that is capable of assembling leaf nodes into
interleave sets. Just as ACPI0012 does this central coordination for
NFIT-defined resources, ACPI0017 does the same for CEDT-described resources.

Memory Windows
--------------
For CXL.mem capable platforms, there is a need for a mechanism for platform
firmware to make the Operating System aware of any restrictions that hardware
might have in address space. For example, in a system with 4 host bridges all
participating in an interleave set, the firmware needs to provide some
description of this. That information is missing from the CXL 2.0 spec as of
today, and it is also not implemented in the driver. A variety of ACPI-based
mechanisms, for example _CRS fields on the ACPI0017 device, were considered.


CXL Exclusive _OSC
------------------
CXL 2.0 definition provides new fields to _OSC for host bridges to allow for new
services provided by CXL - error handling, hot plug, capabilities, etc. This is
built on top of PCIe _OSC via a new UUID. A CXL unaware OS will use the old UUID
to configure the PCIe host bridge. The expectation is that a CXL aware OS uses
the new UUID to modify both CXL and PCIE capabilities in one shot. The issue
arises when trying to define a standalone CXL driver. The core OS will configure
the PCIe _OSC, but when the CXL driver attempts to set CXL _OSC the current
definition makes that driver re-specify PCIE capabilities. An isolated CXL-only
_OSC allows the PCIE core to be unchanged and let a CXL driver stack manage CXL
_OSC without the possibility of clobbering / colliding with PCIE core OSC
management.  The proposal moves the new _OSC dwords (SUPC and CTRC) to their own
_OSC UUID.

Next steps after this basic foundation is expanded command support and LIBNVDIMM
integration. This is the initial “release early / release often” version of the
Linux CXL enabling.


Ben Widawsky (5):
  cxl/mem: Map memory device registers
  cxl/mem: Find device capabilities
  cxl/mem: Initialize the mailbox interface
  cxl/mem: Implement polled mode mailbox
  MAINTAINERS: Add maintainers of the CXL driver

Dan Williams (2):
  cxl/mem: Add a driver for the type-3 mailbox
  cxl/mem: Register CXL memX devices

Vishal Verma (2):
  cxl/acpi: Add an acpi_cxl module for the CXL interconnect
  cxl/acpi: add OSC support

 MAINTAINERS           |   9 +
 drivers/Kconfig       |   1 +
 drivers/Makefile      |   1 +
 drivers/cxl/Kconfig   |  50 ++++
 drivers/cxl/Makefile  |   9 +
 drivers/cxl/acpi.c    | 325 ++++++++++++++++++++++
 drivers/cxl/acpi.h    |  33 +++
 drivers/cxl/bus.c     |  35 +++
 drivers/cxl/bus.h     |   8 +
 drivers/cxl/cxl.h     | 166 +++++++++++
 drivers/cxl/mem.c     | 631 ++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/pci.h     |  21 ++
 include/acpi/actbl1.h |  52 ++++
 13 files changed, 1341 insertions(+)
 create mode 100644 drivers/cxl/Kconfig
 create mode 100644 drivers/cxl/Makefile
 create mode 100644 drivers/cxl/acpi.c
 create mode 100644 drivers/cxl/acpi.h
 create mode 100644 drivers/cxl/bus.c
 create mode 100644 drivers/cxl/bus.h
 create mode 100644 drivers/cxl/cxl.h
 create mode 100644 drivers/cxl/mem.c
 create mode 100644 drivers/cxl/pci.h

-- 
2.29.2



end of thread, other threads:[~2020-12-09  0:48 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <FB00A034-7C6D-40B1-8452-318A3B052216@avery-design.com>
2020-12-04 17:40 ` [RFC PATCH 0/9] CXL 2.0 Support Chris Browy
2020-12-04 18:12   ` Ben Widawsky
2020-12-04 19:26     ` Verma, Vishal L
2020-12-04 19:45       ` Dan Williams
     [not found]     ` <B4FDE6CB-DA00-4F58-AA9F-F04E678E076B@avery-design.com>
2020-12-07  4:40       ` Chris Browy
2020-12-09  0:48         ` Ben Widawsky
2020-11-11  5:43 Ben Widawsky
2020-11-11 22:06 ` Ben Widawsky
2020-11-11 22:43 ` Bjorn Helgaas
