From: Vikram Sethi <vsethi@nvidia.com>
To: Dan Williams <dan.j.williams@intel.com>,
	"Yasunori Gotou (Fujitsu)" <y-goto@fujitsu.com>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"catalin.marinas@arm.com" <Catalin.Marinas@arm.com>,
	James Morse <james.morse@arm.com>
Cc: "Natu, Mahesh" <mahesh.natu@intel.com>
Subject: RE: Questions about CXL device (type 3 memory) hotplug
Date: Wed, 24 May 2023 00:02:31 +0000
Message-ID: <BYAPR12MB3336AB6A519C709DBEA75CB5BD419@BYAPR12MB3336.namprd12.prod.outlook.com>
In-Reply-To: <646d0892eadc3_afb77294cb@dwillia2-xfh.jf.intel.com.notmuch>

> From: Dan Williams <dan.j.williams@intel.com>
> Sent: Tuesday, May 23, 2023 1:40 PM
> To: Vikram Sethi <vsethi@nvidia.com>; Dan Williams
> <dan.j.williams@intel.com>; Yasunori Gotou (Fujitsu) <y-goto@fujitsu.com>;
> linux-cxl@vger.kernel.org; catalin.marinas@arm.com; James Morse
> <james.morse@arm.com>
> Cc: Natu, Mahesh <mahesh.natu@intel.com>
> Subject: RE: Questions about CXL device (type 3 memory) hotplug
> 
> Vikram Sethi wrote:
> > Hi Dan,
> >
> > > From: Dan Williams <dan.j.williams@intel.com>
> > > Sent: Monday, May 22, 2023 7:12 PM
> > > To: Yasunori Gotou (Fujitsu) <y-goto@fujitsu.com>; linux-
> > > cxl@vger.kernel.org
> > > Cc: 'Dan Williams' <dan.j.williams@intel.com>
> > > Subject: RE: Questions about CXL device (type 3 memory) hotplug
> > >
> > > > Q4) Current CXL drivers/tools support Hot-removal request from PCIe?
> > > >
> > > >     CXL specification says "In a managed Hot-Remove flow, software is
> > > >     notified of a hot removal request."
> > >
> > > Currently there is a requirement that:
> > >
> > > cxl disable-memdev
> > >
> > > ...is run before the device can be removed. There is no warning from
> > > the PCI hotplug driver. Which means that if end user does the wrong
> > > sequence they can crash the kernel / remove memory that may still be
> > > in active use.
> > >
> > Is there any notion of a cache flush when memory is removed (or in
> > future CXL reset)?
> 
> No.
> 
> > Generally, CPU caches must be flushed when memory is removed because
> > any evictions when the memory isn't present can cause async errors
> > which can be fatal to the system or at least to VMs, depending on ISA.
> 
> This seems incompatible with memory hotplug. The cache flushing is only
> done on the subsequent reuse of physical address range to make sure that
> any pending evictions are complete before the newly constituted address
> range is put into service, or that any prior clean cache lines of old content are
> dropped. See cxl_region_invalidate_memregion() for where this is called.
> 
> > If the kernel does the cache flush, it must be done with only
> > uncacheable mappings present to prevent speculative fetches after the
> > cache flush.
> 
> This is why the invalidation is done after physical address range is populated
> by new devices. To flush any speculative fetches to the old composition of
> the address range.
> 
I don't think invalidating on the probe path will always work for devices
with snoop filters, including HDM-DB memory devices or CXL Type 2
accelerators. After CXL reset or hot-plug insertion, an HDM-DB device's
snoop filter isn't tracking any lines checked out by the host. Even if
those were just clean lines in CPU caches, hosts can send drop
notifications over CXL (MemClnEvict) in response to the cache flush.
A device that isn't expecting this evict notification can go into an error
state and optionally raise a device internal error interrupt, so you could
end up with a non-functional device.
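
To make the ordering problem concrete, here is a toy user-space model,
not kernel code. Every name in it (toy_system, host_read, device_reset,
host_flush, the two scenario functions) is made up for illustration, and
it assumes the device treats a clean-evict for a line it is not tracking
as an error, which real devices may or may not do. It also ignores
speculative refills between the flush and the removal.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NLINES 8 /* toy address range of 8 cache lines */

/* Hypothetical model: host cache state plus the device's snoop filter. */
struct toy_system {
	bool host_cached[NLINES]; /* host holds a clean copy of the line */
	bool dev_tracked[NLINES]; /* snoop filter thinks the host holds it */
	bool dev_error;           /* device saw an evict it did not expect */
};

static void toy_init(struct toy_system *s)
{
	memset(s, 0, sizeof(*s));
}

/* Host reads a line: host caches it and the snoop filter records that. */
static void host_read(struct toy_system *s, unsigned int line)
{
	assert(line < NLINES);
	s->host_cached[line] = true;
	s->dev_tracked[line] = true;
}

/* Reset or remove/re-insert: snoop filter state is lost, host caches are not. */
static void device_reset(struct toy_system *s)
{
	memset(s->dev_tracked, 0, sizeof(s->dev_tracked));
	s->dev_error = false;
}

/*
 * Host flushes the range. Each dropped clean line sends an evict
 * notification (standing in for MemClnEvict). Returns how many of those
 * notifications hit a line the device was not tracking.
 */
static unsigned int host_flush(struct toy_system *s)
{
	unsigned int unexpected = 0;

	for (unsigned int i = 0; i < NLINES; i++) {
		if (!s->host_cached[i])
			continue;
		s->host_cached[i] = false;
		if (s->dev_tracked[i]) {
			s->dev_tracked[i] = false;
		} else {
			s->dev_error = true; /* assumption: device faults */
			unexpected++;
		}
	}
	return unexpected;
}

/* Flush only when the range is reused (probe path): device ends in error. */
static bool scenario_flush_on_probe(void)
{
	struct toy_system s;

	toy_init(&s);
	host_read(&s, 0);
	host_read(&s, 3);
	device_reset(&s);
	host_flush(&s);
	return s.dev_error;
}

/* Flush before removal: nothing is left in host caches to evict later. */
static bool scenario_flush_before_remove(void)
{
	struct toy_system s;

	toy_init(&s);
	host_read(&s, 0);
	host_read(&s, 3);
	host_flush(&s);
	device_reset(&s);
	host_flush(&s);
	return s.dev_error;
}
```

In this model scenario_flush_on_probe() leaves the device in the error
state while scenario_flush_before_remove() does not, which is the reason
I'd like the flush on the removal/reset side rather than only at probe.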

> > Even so, kernel VA based cache flushes will likely be slow, so may be
> > better to have the notion of an arch callback that can invoke firmware
> > to do the cache flush.
> > Perhaps arch_remove_memory is the right place to invoke such a cache
> > flush/FW call?
> > I think the CXL specification should also address the need for cache
> > flush when removing memory or doing CXL reset.
> 
> Seems out of scope for the CXL specification, this is up to each arch to
> handle.
> 
> Here is some discussion of what ARM is thinking about in this space:
> 
> https://lore.kernel.org/all/40cd479b-f0f8-5dba-0e41-4cef73693927@arm.com/
