linux-cxl.vger.kernel.org archive mirror
From: Vikram Sethi <vsethi@nvidia.com>
To: "linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>,
	"Natu, Mahesh" <mahesh.natu@intel.com>,
	"Rudoff, Andy" <andy.rudoff@intel.com>,
	Jeff Smith <JSMITH@nvidia.com>,
	Mark Hairgrove <mhairgrove@nvidia.com>,
	"jglisse@redhat.com" <jglisse@redhat.com>,
	Vikram Sethi <vsethi@nvidia.com>
Subject: Onlining CXL Type2 device coherent memory
Date: Wed, 28 Oct 2020 23:05:48 +0000
Message-ID: <BL0PR12MB25321C8689BAFDF8678E5C69BD170@BL0PR12MB2532.namprd12.prod.outlook.com>

Hello, 
 
I wanted to kick off a discussion on how Linux will online the coherent memory
of CXL [1] type 2 devices, aka Host-managed Device Memory (HDM), for devices
that are available/plugged in at boot. A type 2 CXL device can be thought of
simply as an accelerator with coherent device memory that also implements
CXL.cache so it can cache system memory.
 
One could envision BIOS/UEFI exposing the HDM in the EFI memory map as
conventional memory, as well as in ACPI SRAT/SLIT/HMAT. However, at least on
some architectures (arm64), EFI conventional memory that is available at kernel
boot cannot be offlined, so this may not be suitable on all architectures.
 
Further, the device driver associated with the type 2 device/accelerator may
want to set aside a chunk of HDM for driver-private use. So the more
appropriate model may be something like the dev dax model, where the device
driver's probe/open calls add_memory_driver_managed() and the driver chooses
how much of the HDM to reserve and how much to make generally available for
application mmap/malloc.
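
To make the split described above concrete, here is a minimal userspace sketch
in Python (not kernel code) of how a driver might carve an HDM range into a
driver-private part and a part it would hand to add_memory_driver_managed().
The names Range and split_hdm, the "driver keeps the low end" policy, and the
128 MiB block size are all assumptions for illustration; the sketch also
assumes that hot-added ranges must be aligned to the memory block size.

```python
from dataclasses import dataclass

MiB = 1 << 20


@dataclass(frozen=True)
class Range:
    start: int  # physical start address in bytes
    size: int   # length in bytes

    @property
    def end(self) -> int:
        return self.start + self.size


def split_hdm(hdm: Range, driver_private: int, block_size: int = 128 * MiB):
    """Return (private, shared) for a hypothetical driver policy.

    The driver keeps at least `driver_private` bytes at the low end of the
    HDM. The shared part starts at the next block boundary and is trimmed
    down to a whole number of blocks. `shared` is None if nothing is left.
    Any unaligned tail past `shared.end` is simply left unused here.
    """
    if not 0 <= driver_private <= hdm.size:
        raise ValueError("driver_private must fit inside the HDM range")

    # Round the end of the private area up to a block boundary.
    shared_start = -(-(hdm.start + driver_private) // block_size) * block_size
    # Round the end of the HDM down to a block boundary.
    shared_end = (hdm.end // block_size) * block_size

    if shared_end <= shared_start:
        return hdm, None  # everything stays driver-private

    private = Range(hdm.start, shared_start - hdm.start)
    shared = Range(shared_start, shared_end - shared_start)
    return private, shared


if __name__ == "__main__":
    hdm = Range(start=0x10_0000_0000, size=16 * 1024 * MiB)
    private, shared = split_hdm(hdm, driver_private=1025 * MiB)
    print(f"private: {private.size // MiB} MiB at {private.start:#x}")
    print(f"shared:  {shared.size // MiB} MiB at {shared.start:#x}")
```

In a real driver the shared range would be what gets passed to
add_memory_driver_managed(); the private remainder would never be exposed to
the page allocator.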
 
Another thing to think about is whether the kernel relies on UEFI having fully
described the NUMA proximity domains and end-to-end NUMA distances for HDM, or
whether the kernel will provide some infrastructure to use the device-local
affinity information that the device provides in the Coherent Device Attribute
Table (CDAT) via a mailbox. In the latter case the kernel would add a new NUMA
node ID for the HDM and calculate its NUMA distances by adding the device-local
distance to the NUMA distance of the host bridge/root port. At least that's how
I think CDAT is supposed to work when the kernel doesn't want to rely on BIOS
tables.
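
The distance calculation described above can be sketched as follows. This is
only an illustration of the arithmetic, in Python: add_hdm_node is a
hypothetical helper, the SLIT is modelled as a plain list of lists, and
device_local_distance is assumed to already be a SLIT-style scalar (how CDAT
data would map onto such a value is left open here). The only fixed fact used
is that a SLIT entry of 10 denotes a node's distance to itself.

```python
LOCAL_DISTANCE = 10  # ACPI SLIT value for a node's distance to itself


def add_hdm_node(slit, host_bridge_node, device_local_distance):
    """Return a new distance matrix with one extra node for the HDM.

    Distance between an existing node i and the HDM node is taken as
    distance(i, host bridge node) + device_local_distance, in each direction.
    The input matrix is not modified.
    """
    n = len(slit)
    if not 0 <= host_bridge_node < n:
        raise ValueError("host_bridge_node is not a valid node index")

    # Existing node i -> HDM node.
    to_hdm = [slit[i][host_bridge_node] + device_local_distance
              for i in range(n)]
    # HDM node -> existing node j (kept separate in case SLIT is asymmetric).
    from_hdm = [slit[host_bridge_node][j] + device_local_distance
                for j in range(n)]

    new = [list(row) + [to_hdm[i]] for i, row in enumerate(slit)]
    new.append(from_hdm + [LOCAL_DISTANCE])
    return new


if __name__ == "__main__":
    slit = [[10, 21],
            [21, 10]]
    # Host bridge sits in node 0; the device adds a local distance of 5.
    for row in add_hdm_node(slit, host_bridge_node=0, device_local_distance=5):
        print(row)
```

With the two-node table in the example, the HDM becomes node 2, at distance 15
from node 0 and 26 from node 1.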
 
A similar question on NUMA node ID and distances for HDM arises for CXL hotplug.
Will the kernel rely on CDAT, create its own NUMA node ID and patch up the
distances, or will it rely on BIOS providing a proximity domain (PXM) reserved
at boot in SRAT, to be used later on hotplug?
 
Thanks,
Vikram
 
[1] https://www.computeexpresslink.org/download-the-specification


Thread overview: 17+ messages
2020-10-28 23:05 Vikram Sethi [this message]
2020-10-29 14:50 ` Onlining CXL Type2 device coherent memory Ben Widawsky
2020-10-30 20:37 ` Dan Williams
2020-10-30 20:59   ` Matthew Wilcox
2020-10-30 23:38     ` Dan Williams
2020-10-30 22:39   ` Vikram Sethi
2020-11-02 17:47     ` Dan Williams
2020-10-31 10:21   ` David Hildenbrand
2020-10-31 16:51     ` Dan Williams
2020-11-02  9:51       ` David Hildenbrand
2020-11-02 16:17         ` Vikram Sethi
2020-11-02 17:53           ` David Hildenbrand
2020-11-02 18:03             ` Dan Williams
2020-11-02 19:25               ` Vikram Sethi
2020-11-02 19:45                 ` Dan Williams
2020-11-03  3:56                 ` Alistair Popple
2020-11-02 18:34       ` Jonathan Cameron
