All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Gregory Price <gourry.memverge@gmail.com>
Cc: <qemu-devel@nongnu.org>, <linux-cxl@vger.kernel.org>,
	Alison Schofield <alison.schofield@intel.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	"a.manzanares@samsung.com" <a.manzanares@samsung.com>,
	Ben Widawsky <bwidawsk@kernel.org>
Subject: Re: [PATCH RFC] hw/cxl: type 3 devices can now present volatile or persistent memory
Date: Mon, 10 Oct 2022 16:18:09 +0100	[thread overview]
Message-ID: <20221010161809.00006f8e@huawei.com> (raw)
In-Reply-To: <CAD3UvdT1ZHJDaqj05C+n7t4rM7yhjZyM6fooXAfG12rAYnBqmw@mail.gmail.com>

On Fri, 7 Oct 2022 10:50:12 -0400
Gregory Price <gourry.memverge@gmail.com> wrote:

> Now that i've had some time to look at the spec, the DVSEC CXL Capability
> register (8.1.3.1 in 3.0 spec)
> only supports enabling two HDM ranges at the moment, which to me means we
> should implement
> 
> memdev0=..., memdev1=...

That needs to work (for backwards compatibility)
but doesn't need to be the most optimal mapping.  So you could have
two HDM ranges defined, but if your software is clever enough to
use the CXL 2.0 infrastructure you can have more (Linux is).

> 
> Yesterday I pushed a patch proposal that separated the regions into
> persistent and volatile, but
> i had discovered that there isn't a good way (presently) to determine of a
> MemoryBacking object
> is both file File-type AND has pmem=true, meaning we'd have to either
> expose that interface
> (this seems dubious to me) or do the following:
Whilst there are good usecases for using actual volatile and non volatile
for the backends, I don't think we want to enforce that as it should
'work' with whatever backends are used.
> 
> memdev0=mem0,memdev0-pmem=true,memdev1=mem0,memdev0-pmem=false
> 
> and then simply store a bit for each hostmem region accordingly to report
> pmem/volatile.
> This would allow the flexibility for the backing device to be whatever the
> user wants while
> still being able to mark the behavior of the type-3 device as pmem or
> volatile.  Alternatively
> we could make `memdev0-type=` and allow for new types in the future I
> suppose.
> 
That was why I had one suggestion of just supporting two memdevs.

memdev-volatile and memdev-nonvolatile but then messing with whether the
whole memdev was presented depending on the set partition.  I'm increasingly
thinking that is the least ugly solution.  + until we implement set partition
along with a control on maximum size of partitionable region, we can
just use a pair of memdevs to provide all the information needed to emulate
the devices.

> The one thing I'm a bit confused on - the Media_type and Memory_Class
> fields in the DVSEC
> CXL Range registers (8.1.3.8.2).  Right now they're set to "010b" = Memory
> Characteristics
> are communicated via CDAT".  Do the devices presently emulate this? I'm
> finding it hard to pick
> apart the code to identify it.

Not upstream yet.

Patches on list + can be found at

https://gitlab.com/jic23/qemu/-/commits/cxl-2022-10-09
There are a few messy corners in that tree but it should work. I'll be
pushing out a new version in a few days.

I updated that in latest version to build the tables based on the
memdev provided.  We'll want to add the volatile support to that alongside
your patch.

Jonathan


WARNING: multiple messages have this Message-ID (diff)
From: Jonathan Cameron via <qemu-devel@nongnu.org>
To: Gregory Price <gourry.memverge@gmail.com>
Cc: <qemu-devel@nongnu.org>, <linux-cxl@vger.kernel.org>,
	Alison Schofield <alison.schofield@intel.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	"a.manzanares@samsung.com" <a.manzanares@samsung.com>,
	Ben Widawsky <bwidawsk@kernel.org>
Subject: Re: [PATCH RFC] hw/cxl: type 3 devices can now present volatile or persistent memory
Date: Mon, 10 Oct 2022 16:18:09 +0100	[thread overview]
Message-ID: <20221010161809.00006f8e@huawei.com> (raw)
In-Reply-To: <CAD3UvdT1ZHJDaqj05C+n7t4rM7yhjZyM6fooXAfG12rAYnBqmw@mail.gmail.com>

On Fri, 7 Oct 2022 10:50:12 -0400
Gregory Price <gourry.memverge@gmail.com> wrote:

> Now that i've had some time to look at the spec, the DVSEC CXL Capability
> register (8.1.3.1 in 3.0 spec)
> only supports enabling two HDM ranges at the moment, which to me means we
> should implement
> 
> memdev0=..., memdev1=...

That needs to work (for backwards compatibility)
but doesn't need to be the most optimal mapping.  So you could have
two HDM ranges defined, but if your software is clever enough to
use the CXL 2.0 infrastructure you can have more (Linux is).

> 
> Yesterday I pushed a patch proposal that separated the regions into
> persistent and volatile, but
> i had discovered that there isn't a good way (presently) to determine of a
> MemoryBacking object
> is both file File-type AND has pmem=true, meaning we'd have to either
> expose that interface
> (this seems dubious to me) or do the following:
Whilst there are good usecases for using actual volatile and non volatile
for the backends, I don't think we want to enforce that as it should
'work' with whatever backends are used.
> 
> memdev0=mem0,memdev0-pmem=true,memdev1=mem0,memdev0-pmem=false
> 
> and then simply store a bit for each hostmem region accordingly to report
> pmem/volatile.
> This would allow the flexibility for the backing device to be whatever the
> user wants while
> still being able to mark the behavior of the type-3 device as pmem or
> volatile.  Alternatively
> we could make `memdev0-type=` and allow for new types in the future I
> suppose.
> 
That was why I had one suggestion of just supporting two memdevs.

memdev-volatile and memdev-nonvolatile but then messing with whether the
whole memdev was presented depending on the set partition.  I'm increasingly
thinking that is the least ugly solution.  + until we implement set partition
along with a control on maximum size of partitionable region, we can
just use a pair of memdevs to provide all the information needed to emulate
the devices.

> The one thing I'm a bit confused on - the Media_type and Memory_Class
> fields in the DVSEC
> CXL Range registers (8.1.3.8.2).  Right now they're set to "010b" = Memory
> Characteristics
> are communicated via CDAT".  Do the devices presently emulate this? I'm
> finding it hard to pick
> apart the code to identify it.

Not upstream yet.

Patches on list + can be found at

https://gitlab.com/jic23/qemu/-/commits/cxl-2022-10-09
There are a few messy corners in that tree but it should work. I'll be
pushing out a new version in a few days.

I updated that in latest version to build the tables based on the
memdev provided.  We'll want to add the volatile support to that alongside
your patch.

Jonathan



  reply	other threads:[~2022-10-10 15:18 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-06  0:01 [PATCH RFC] hw/cxl: type 3 devices can now present volatile or persistent memory Gourry
2022-10-06  8:45 ` Jonathan Cameron
2022-10-06  8:45   ` Jonathan Cameron via
2022-10-06  8:50   ` Jonathan Cameron
2022-10-06  8:50     ` Jonathan Cameron via
2022-10-06 15:52     ` Gregory Price
2022-10-06 16:42       ` Jonathan Cameron
2022-10-06 16:42         ` Jonathan Cameron via
2022-10-06 17:29         ` Gregory Price
2022-10-07 14:50           ` Gregory Price
2022-10-10 15:18             ` Jonathan Cameron [this message]
2022-10-10 15:18               ` Jonathan Cameron via
2022-10-10 15:25               ` Gregory Price
2022-10-11  1:23                 ` Gregory Price
2022-10-11 17:14                   ` Davidlohr Bueso
2022-10-11 17:22                     ` Gregory Price
2022-10-11 17:28                       ` Jonathan Cameron
2022-10-11 17:28                         ` Jonathan Cameron via
2022-10-10 14:43           ` Jonathan Cameron
2022-10-10 14:43             ` Jonathan Cameron via
2022-10-10 15:20             ` Gregory Price
2022-10-10 16:26               ` Jonathan Cameron
2022-10-10 16:26                 ` Jonathan Cameron via
2022-10-10 16:32             ` Jonathan Cameron
2022-10-10 16:32               ` Jonathan Cameron via
2022-10-10 17:18             ` Davidlohr Bueso
2022-10-07 18:16         ` Davidlohr Bueso
2022-10-07 18:46           ` Gregory Price
2022-10-07 19:55             ` Davidlohr Bueso
2022-10-07 19:52     ` Davidlohr Bueso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221010161809.00006f8e@huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=a.manzanares@samsung.com \
    --cc=alison.schofield@intel.com \
    --cc=bwidawsk@kernel.org \
    --cc=dave@stgolabs.net \
    --cc=gourry.memverge@gmail.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.