From: "Yasunori Gotou (Fujitsu)" <y-goto@fujitsu.com>
To: 'Dan Williams' <dan.j.williams@intel.com>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>
Subject: RE: Questions about CXL device (type 3 memory) hotplug
Date: Fri, 26 May 2023 08:05:57 +0000
Message-ID: <TYWPR01MB10082A36ECFE209E4A05BA33190479@TYWPR01MB10082.jpnprd01.prod.outlook.com>
In-Reply-To: <TYWPR01MB10082A58633DCCB1AAC5DDC3C90419@TYWPR01MB10082.jpnprd01.prod.outlook.com>


> > > > > Q4) Current CXL drivers/tools support Hot-removal request from PCIe?
> > > > >
> > > > >     CXL specification says "In a managed Hot-Remove flow, software is
> > > > >     notified of a hot removal request."
> > > >
> > > > Currently there is a requirement that:
> > > >
> > > > cxl disable-memdev
> > > >
> > > > ...is run before the device can be removed. There is no warning
> > > > from the PCI hotplug driver. Which means that if end user does the
> > > > wrong sequence they can crash the kernel / remove memory that may
> > > > still be in active use.
> > >
> > > Ok.
> > > Though "Surprising remove" is not guaranteed by specification, I
> > > think "managed hot-removed flow" should be realized.
> > > I'll chase more what should we do about it.
> >
> > The nuance here is that even though the PCI hotplug driver supports an
> > attention button and pauses to let the OS acknowledge the removal.
> > That acknowledgement is not coordinated with the associated drivers
> > instead those drivers just receive a ->remove() notification that can not
> > be failed.
> >
> > So, this means that the CXL device must be shutdown manually with
> >
> > daxctl offline-memory
> > cxl disable-region
> > cxl disable-memdev
> >
> > ...*before* the hotplug attention button is pressed. If any of those
> > commands fail the device is in active use by the kernel and the
> > hotplug attempt needs to be cancelled. My expectation is that CXL
> > memory device removal is not possible in the majority of cases.
> > This is why the Dynamic Capacity Device definition in CXL 3.0 allows
> > for the flexibility of partial removal.
> 
> Hmmm, I mind something here, but I cannot make sentence what is it yet
> Probably, I need time to reconsider it. Please wait.

One thing I was concerned about was which document describes OS-triggered hot-remove, as opposed to PCIe-triggered hot-remove.
Many hardware/firmware developers are not familiar with how Linux works.
They may want to implement the same system not only for Linux but also for VMware or other operating systems,
and so they may want to follow only the specification or similar documents.
However, I found "CXL* Type 3 Memory Device Software Guide: 2.13.7 OS managed hot remove sequence":
https://cdrdv2-public.intel.com/643805/643805_CXL%20Memory%20Device%20SW%20Guide_Rev1p0.pdf
I can refer to it when I talk with them, so this concern is resolved.

My remaining questions are as follows.

Q6) Is there currently any way to hot-remove a device from outside the server?
    Currently, an administrator seems to need to log in to a server and run the offline and cxl disable commands
    to remove memory from it, right? In the future, however, software such as a memory pool manager,
    a Fabric Manager, or other management tools that manage the CXL devices of many servers
    will want to remove each server's devices from outside.
    I am not sure whether that is possible yet.
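
For reference, the manual sequence that an administrator has to run on the server today could be wrapped as in the sketch below. It only chains the three commands quoted above and stops at the first failure, which is the point where the hotplug attempt has to be cancelled. The device names (dax0.0, region0, mem0) and the function names are my assumptions for illustration, not something taken from the tools.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple


def teardown_commands(dax_dev: str, region: str, memdev: str) -> List[List[str]]:
    """Return the shutdown commands in the order they have to be run."""
    return [
        ["daxctl", "offline-memory", dax_dev],
        ["cxl", "disable-region", region],
        ["cxl", "disable-memdev", memdev],
    ]


def run_command(cmd: Sequence[str]) -> int:
    """Run one command on the local server and return its exit status."""
    return subprocess.run(list(cmd), check=False).returncode


def prepare_for_removal(
    commands: Sequence[Sequence[str]],
    run: Callable[[Sequence[str]], int] = run_command,
) -> Tuple[bool, List[List[str]]]:
    """Run the commands in order and stop at the first failure.

    Returns (ok, completed). If ok is False, the device is still in use
    and the attention button must not be pressed.
    """
    completed: List[List[str]] = []
    for cmd in commands:
        if run(cmd) != 0:
            return False, completed
        completed.append(list(cmd))
    return True, completed


# Example (hypothetical device names, needs root on the target server):
#   ok, done = prepare_for_removal(teardown_commands("dax0.0", "region0", "mem0"))
```

The `run` argument is replaceable only so that a management tool could substitute its own transport (for example, a remote shell). That part is exactly what Q6 is asking about, and the sketch does not answer it.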

Q7) Is there any interface to find out which blocks cannot be offlined?
    Even if DCD is supported, such manager software seems to need to repeat memory offline
    requests for many blocks and collect the blocks that succeeded until it reaches the requested amount.
    That may take a long time to complete if the success rate is low. I don't think a "time out"
    is a good idea for such a case.
    If there were an interface that let such an application know which blocks have long-term pinned pages
    or other obstacles, the software could avoid waiting for a long time.

    If not, I would like to investigate further how to make one.
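
To make the concern concrete, the retry loop I have in mind looks roughly like the sketch below. It uses the existing memory block sysfs interface (/sys/devices/system/memory/memoryN/state) and tries blocks one by one until the requested number is offlined. The function names are hypothetical, and a real tool would restrict the scan to the blocks that back the CXL region rather than every block in the system.

```python
from pathlib import Path
from typing import Dict, List

SYSFS_MEMORY = Path("/sys/devices/system/memory")


def list_blocks(root: Path = SYSFS_MEMORY) -> List[Path]:
    """Return the memoryN block directories sorted by block number."""
    blocks = [p for p in root.glob("memory[0-9]*") if p.is_dir()]
    return sorted(blocks, key=lambda p: int(p.name[len("memory"):]))


def try_offline(block: Path) -> bool:
    """Try to offline one block; return True if it ends up offline.

    The kernel rejects the write (for example with EBUSY) when pages in
    the block cannot be migrated. The error does not say why.
    """
    state = block / "state"
    try:
        if state.read_text().strip() == "offline":
            return True
        state.write_text("offline")
        return state.read_text().strip() == "offline"
    except OSError:
        return False


def offline_until(root: Path, wanted_blocks: int) -> Dict[str, List[str]]:
    """Offline blocks one by one until wanted_blocks have succeeded."""
    result: Dict[str, List[str]] = {"offlined": [], "failed": []}
    for block in list_blocks(root):
        if len(result["offlined"]) >= wanted_blocks:
            break
        key = "offlined" if try_offline(block) else "failed"
        result[key].append(block.name)
    return result
```

Each failed attempt can take a while because the kernel tries to migrate pages before giving up, and the caller learns nothing about the obstacle. That is why I am asking whether something better than this loop exists.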

Thanks,

---
Yasunori Goto
