From: Vikram Sethi <vsethi@nvidia.com>
To: Dan Williams <dan.j.williams@intel.com>,
	"Yasunori Gotou (Fujitsu)" <y-goto@fujitsu.com>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"catalin.marinas@arm.com" <Catalin.Marinas@arm.com>,
	James Morse <james.morse@arm.com>
Cc: "Natu, Mahesh" <mahesh.natu@intel.com>
Subject: RE: Questions about CXL device (type 3 memory) hotplug
Date: Wed, 24 May 2023 14:47:31 +0000
Message-ID: <BN8PR12MB3330831F2E666E9BB1319E66BD419@BN8PR12MB3330.namprd12.prod.outlook.com>
In-Reply-To: <646d8c76811cb_250e29456@dwillia2-mobl3.amr.corp.intel.com.notmuch>

> From: Dan Williams <dan.j.williams@intel.com>
> Sent: Tuesday, May 23, 2023 11:03 PM
> To: Vikram Sethi <vsethi@nvidia.com>; Dan Williams
> <dan.j.williams@intel.com>; Yasunori Gotou (Fujitsu) <y-goto@fujitsu.com>;
> linux-cxl@vger.kernel.org; catalin.marinas@arm.com; James Morse
> <james.morse@arm.com>
> Cc: Natu, Mahesh <mahesh.natu@intel.com>
> Subject: RE: Questions about CXL device (type 3 memory) hotplug
> Vikram Sethi wrote:
> > > From: Dan Williams <dan.j.williams@intel.com>
> > > Sent: Tuesday, May 23, 2023 1:40 PM
> > > To: Vikram Sethi <vsethi@nvidia.com>; Dan Williams
> > > <dan.j.williams@intel.com>; Yasunori Gotou (Fujitsu)
> > > <y-goto@fujitsu.com>; linux-cxl@vger.kernel.org;
> > > catalin.marinas@arm.com; James Morse <james.morse@arm.com>
> > > Cc: Natu, Mahesh <mahesh.natu@intel.com>
> > > Subject: RE: Questions about CXL device (type 3 memory) hotplug
> > >
> > > Vikram Sethi wrote:
> > > > Hi Dan,
> > > >
> > > > > From: Dan Williams <dan.j.williams@intel.com>
> > > > > Sent: Monday, May 22, 2023 7:12 PM
> > > > > > To: Yasunori Gotou (Fujitsu) <y-goto@fujitsu.com>;
> > > > > > linux-cxl@vger.kernel.org
> > > > > Cc: 'Dan Williams' <dan.j.williams@intel.com>
> > > > > Subject: RE: Questions about CXL device (type 3 memory) hotplug
> > > > >
> > > > > > > Q4) Current CXL drivers/tools support Hot-removal request from PCIe?
> > > > > >
> > > > > > >     CXL specification says "In a managed Hot-Remove flow, software is
> > > > > > >     notified of a hot removal request."
> > > > >
> > > > > Currently there is a requirement that:
> > > > >
> > > > > cxl disable-memdev
> > > > >
> > > > > ...is run before the device can be removed. There is no warning
> > > > > from the PCI hotplug driver. Which means that if end user does
> > > > > the wrong sequence they can crash the kernel / remove memory
> > > > > > that may still be in active use.
> > > > >
> > > > > Is there any notion of a cache flush when memory is removed (or in
> > > > > future CXL reset)?
> > >
> > > No.
> > >
> > > > Generally, CPU caches must be flushed when memory is removed
> > > > because any evictions when the memory isn't present can cause
> > > > > async errors which can be fatal to the system or at least to VMs,
> > > > > depending on ISA.
> > >
> > > This seems incompatible with memory hotplug. The cache flushing is
> > > only done on the subsequent reuse of physical address range to make
> > > sure that any pending evictions are complete before the newly
> > > constituted address range is put into service, or that any prior
> > > > clean cache lines of old content are dropped. See
> > > > cxl_region_invalidate_memregion() for where this is called.
> > >
> > > > If the kernel does the cache flush, it must be done with only
> > > > > uncacheable mappings present to prevent speculative fetches after
> > > > > the cache flush.
> > >
> > > This is why the invalidation is done after physical address range is
> > > populated by new devices. To flush any speculative fetches to the
> > > old composition of the address range.
> > >
> > I don't think invalidate on the probe path will always work for
> > devices with snoop filters, including HDM-DB memory devices or CXL
> > type2 accelerators.  After CXL reset or hot plug insertion, a HDM-DB
> > device's snoop filter isn't tracking any lines checked out by the
> > host. Even if those were just clean lines in CPU caches, hosts can
> > send drop notifications in CXL in response to the cache flush
> > (MemClnEvict).  A device that isn't expecting this evict notification
> > can go into error state and optionally raise a device internal error
> > interrupt. So you could end up with a non functional device.
> 
> I don't understand this failure mode. Accelerator is added, driver sets up an
> HDM decode range and triggers CPU cache invalidation before mapping the
> memory into page tables. Wouldn't the device, upon receiving an invalidation
> request, just snoop its caches and say "nothing for me to do"?

The device's snoop filter is in a clean reset/power-on state. It is not
tracking anything checked out by the host CPU or a peer. If it starts
receiving writebacks or even CleanEvicts for its memory, that certainly looks
like an unexpected coherency message, and I know of at least one
implementation that triggers an error interrupt in response. I don't know of
a statement in the specification saying that this is expected and that
implementations should ignore it. If there is such a statement, could you
please point me to it?
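
To make the failure mode concrete, here is a toy userspace model of the
situation described above. It is only a sketch: the struct, the function
names, and the 8-line filter size are all hypothetical and are not taken from
the CXL specification or from any real device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SF_LINES 8 /* hypothetical: lines this toy filter can track */

/* Toy model of a device snoop filter; not any real implementation. */
struct snoop_filter {
	bool tracked[SF_LINES]; /* true while the host holds the line */
	bool error;             /* latched on an unexpected message */
};

/* Reset or hot-plug insertion: the filter forgets every line. */
void sf_reset(struct snoop_filter *sf)
{
	for (size_t i = 0; i < SF_LINES; i++)
		sf->tracked[i] = false;
	sf->error = false;
}

/* Host fetches a line from device memory; the filter tracks it. */
void sf_host_fetch(struct snoop_filter *sf, size_t line)
{
	assert(line < SF_LINES);
	sf->tracked[line] = true;
}

/*
 * Host sends a writeback or clean-evict for a line. Returns false and
 * latches the error if the filter was not tracking that line.
 */
bool sf_host_evict(struct snoop_filter *sf, size_t line)
{
	assert(line < SF_LINES);
	if (!sf->tracked[line]) {
		sf->error = true;
		return false;
	}
	sf->tracked[line] = false;
	return true;
}
```

In this model, calling sf_host_fetch(), then sf_reset(), then sf_host_evict()
on the same line returns false and latches the error, which is the sequence I
am worried about when the host still caches lines across a device reset.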

Memory removal needs a cache flush IMO, done in a way that prevents
speculative fetches. This can be done either in the kernel with only
uncacheable mappings present (in the arch callback, if possible) or via a FW
call.
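
The ordering argued for above can be sketched as a small userspace model. All
names here are hypothetical and this is not kernel code. It assumes the worst
case, that a speculative fetch refills the cache immediately whenever a
cacheable mapping still exists at flush time.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a removable memory range; names are illustrative only. */
struct mem_region {
	bool present;          /* backing device is still there */
	bool cacheable_mapped; /* a cacheable mapping still exists */
	bool lines_cached;     /* CPU caches may hold lines of it */
};

/* Tear down cacheable mappings, leaving only uncacheable ones. */
void region_unmap_cacheable(struct mem_region *r)
{
	r->cacheable_mapped = false;
}

/*
 * Flush CPU caches for the range. Worst-case assumption: if a cacheable
 * mapping remains, speculation refetches lines right after the flush.
 */
void region_flush(struct mem_region *r)
{
	r->lines_cached = r->cacheable_mapped;
}

/*
 * Remove the backing memory. Returns true only when no lines can still
 * be cached, so no eviction can target the absent or reset device.
 */
bool region_remove(struct mem_region *r)
{
	if (r->lines_cached)
		return false;
	r->present = false;
	return true;
}
```

Flushing while a cacheable mapping exists leaves region_remove() returning
false. Calling region_unmap_cacheable() first and then region_flush() lets
the removal succeed.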
