linux-cxl.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>,
	linux-cxl@vger.kernel.org, Linux NVDIMM <nvdimm@lists.linux.dev>,
	patches@lists.linux.dev,
	Alison Schofield <alison.schofield@intel.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	John Hubbard <jhubbard@nvidia.com>
Subject: Re: [RFC PATCH 05/15] cxl/acpi: Reserve CXL resources from request_free_mem_region
Date: Tue, 19 Apr 2022 14:59:46 -0700	[thread overview]
Message-ID: <CAPcyv4hyTRm7K8gu4wdL_gaMm2C+Agg1V2-BbnmJ8Kf0OH4sng@mail.gmail.com> (raw)
In-Reply-To: <CAPcyv4hPBw0yJmu7qzSZ_gsbVuj+_R7-_r3+_W9-JsLTD6Uscw@mail.gmail.com>

On Tue, Apr 19, 2022 at 2:50 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Tue, Apr 19, 2022 at 9:43 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Mon, Apr 18, 2022 at 09:42:00AM -0700, Dan Williams wrote:
> > > [ add the usual HMM suspects Christoph, Jason, and John ]
> > >
> > > On Wed, Apr 13, 2022 at 11:38 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > Define an API which allows CXL drivers to manage CXL address space.
> > > > CXL is unique in that the address space and various properties are only
> > > > known after CXL drivers come up, and therefore cannot be part of core
> > > > memory enumeration.
> > >
> > > I think this buries the lead on the problem introduced by
> > > MEMORY_DEVICE_PRIVATE in the first place. Let's revisit that history
> > > before diving into what CXL needs.
> > >
> > >
> > > Commit 4ef589dc9b10 ("mm/hmm/devmem: device memory hotplug using
> > > ZONE_DEVICE") introduced the concept of MEMORY_DEVICE_PRIVATE. At its
> > > core MEMORY_DEVICE_PRIVATE uses the ZONE_DEVICE capability to annotate
> > > an "unused" physical address range with 'struct page' for the purpose
> > > of coordinating migration of buffers onto and off of a GPU /
> > > accelerator. The determination of "unused" was based on a heuristic,
> > > not a guarantee, that any address range not expressly conveyed in the
> > > platform firmware map of the system can be repurposed for software
> > > use. The CXL Fixed Memory Windows Structure  (CFMWS) definition
> > > explicitly breaks the assumptions of that heuristic.
> >
> > So CXL defines an address map that is not part of the FW list?
>
> It defines a super-set of 'potential' address space and a subset that
> is active in the FW list. It's similar to memory hotplug where an
> address range may come online after the fact, but unlike ACPI memory
> hotplug, FW is not involved in the hotplug path, and FW cannot predict
> what address ranges will come online. For example ACPI hotplug knows
> in advance to publish the ranges that can experience an online /
> insert event, CXL has many more degrees of freedom.
>
> >
> > > > It would be desirable to simply insert this address space into
> > > > iomem_resource with a new flag to denote this is CXL memory. This would
> > > > permit request_free_mem_region() to be reused for CXL memory provided it
> > > > learned some new tricks. For that, it is tempting to simply use
> > > > insert_resource(). The API was designed specifically for cases where new
> > > > devices may offer new address space. This cannot work in the general
> > > > case. Boot firmware can pass, some, none, or all of the CFMWS range as
> > > > various types of memory to the kernel, and this may be left alone,
> > > > merged, or even expanded.
> >
> > And then we understand that on CXL the FW is might pass stuff that
> > intersects with the actual CXL ranges?
> >
> > > > As a result iomem_resource may intersect CFMWS
> > > > regions in ways insert_resource cannot handle [2]. Similar reasoning
> > > > applies to allocate_resource().
> > > >
> > > > With the insert_resource option out, the only reasonable approach left
> > > > is to let the CXL driver manage the address space independently of
> > > > iomem_resource and attempt to prevent users of device private memory
> >
> > And finally due to all these FW problems we are going to make a 2nd
> > allocator for physical address space and just disable the normal one?
>
> No, or I am misunderstanding this comment. The CXL address space
> allocator is managing space that can be populated and become an
> iomem_resource. So it's not supplanting iomem_resource it is
> coordinating dynamic extensions to the FW map.
>
> > Then since DEVICE_PRIVATE is a notable user of this allocator we now
> > understand it becomes broken?
> >
> > Sounds horrible. IMHO you should fix the normal allocator somehow to
> > understand that the ranges from FW have been reprogrammed by Linux
>
> There is no reprogramming of the ranges from FW. CXL memory that is
> mapped as System RAM at boot will have the CXL decode configuration
> locked in all the participating devices. The remaining CXL decode
> space is then available for dynamic reconfiguration of CXL resources
> from the devices that the FW explicitly ignores, which is all
> hot-added devices and all persistent-memory capacity.
>
> > and
> > not try to build a whole different allocator in CXL code.
>
> I am not seeing much overlap for DEVICE_PRIVATE and CXL to share an
> allocator. CXL explicitly wants ranges that have been set aside for
> CXL and are related to 1 or more CXL host bridges. DEVICE_PRIVATE
> wants to consume an unused physical address range to proxy
> device-local-memory with no requirements on what range is chosen as
> long as it does not collide with anything else.

...or are you suggesting to represent CXL free memory capacity in
iomem_resource and augment the FW list early with CXL ranges. That
seems doable, but it would only represent the free CXL ranges in
iomem_resource as the populated CXL ranges cannot have their resources
reparented after the fact, and there is plenty of code that expects
"System RAM" to be a top-level resource.

  reply	other threads:[~2022-04-19 22:00 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-13 18:37 [RFC PATCH 00/15] Region driver Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 01/15] cxl/core: Use is_endpoint_decoder Ben Widawsky
2022-04-13 21:22   ` Dan Williams
     [not found]   ` <CGME20220415205052uscas1p209e03abf95b9c80b2ba1f287c82dfd80@uscas1p2.samsung.com>
2022-04-15 20:50     ` Adam Manzanares
2022-04-13 18:37 ` [RFC PATCH 02/15] cxl/core/hdm: Bail on endpoint init fail Ben Widawsky
2022-04-13 21:31   ` Dan Williams
     [not found]     ` <CGME20220418163713uscas1p17b3b1b45c7d27e54e3ecb62eb8af2469@uscas1p1.samsung.com>
2022-04-18 16:37       ` Adam Manzanares
2022-05-12 15:50         ` Ben Widawsky
2022-05-12 17:27           ` Luis Chamberlain
2022-05-13 12:09             ` Jonathan Cameron
2022-05-13 15:03               ` Dan Williams
2022-05-13 15:12               ` Luis Chamberlain
2022-05-13 19:14                 ` Dan Williams
2022-05-13 19:31                   ` Luis Chamberlain
2022-05-19  5:09                     ` Dan Williams
2022-04-13 18:37 ` [RFC PATCH 03/15] Revert "cxl/core: Convert decoder range to resource" Ben Widawsky
2022-04-13 21:43   ` Dan Williams
2022-05-12 16:09     ` Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 04/15] cxl/core: Create distinct decoder structs Ben Widawsky
2022-04-15  1:45   ` Dan Williams
2022-04-18 20:43     ` Dan Williams
2022-04-13 18:37 ` [RFC PATCH 05/15] cxl/acpi: Reserve CXL resources from request_free_mem_region Ben Widawsky
2022-04-18 16:42   ` Dan Williams
2022-04-19 16:43     ` Jason Gunthorpe
2022-04-19 21:50       ` Dan Williams
2022-04-19 21:59         ` Dan Williams [this message]
2022-04-19 23:04           ` Jason Gunthorpe
2022-04-20  0:47             ` Dan Williams
2022-04-20 14:34               ` Jason Gunthorpe
2022-04-20 15:32                 ` Dan Williams
2022-04-13 18:37 ` [RFC PATCH 06/15] cxl/acpi: Manage root decoder's address space Ben Widawsky
2022-04-18 22:15   ` Dan Williams
2022-05-12 19:18     ` Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 07/15] cxl/port: Surface ram and pmem resources Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 08/15] cxl/core/hdm: Allocate resources from the media Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 09/15] cxl/core/port: Add attrs for size and volatility Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 10/15] cxl/core: Extract IW/IG decoding Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 11/15] cxl/acpi: Use common " Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 12/15] cxl/region: Add region creation ABI Ben Widawsky
2022-05-04 22:56   ` Verma, Vishal L
2022-05-05  5:17     ` Dan Williams
2022-05-12 15:54       ` Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 13/15] cxl/core/port: Add attrs for root ways & granularity Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 14/15] cxl/region: Introduce configuration Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 15/15] cxl/region: Introduce a cxl_region driver Ben Widawsky
2022-05-20 16:23 ` [RFC PATCH 00/15] Region driver Jonathan Cameron
2022-05-20 16:41   ` Dan Williams
2022-05-31 12:21     ` Jonathan Cameron
2022-06-23  5:40       ` Dan Williams
2022-06-23 15:08         ` Jonathan Cameron
2022-06-23 17:33           ` Dan Williams
2022-06-23 23:44             ` Dan Williams
2022-06-24  9:08             ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPcyv4hyTRm7K8gu4wdL_gaMm2C+Agg1V2-BbnmJ8Kf0OH4sng@mail.gmail.com \
    --to=dan.j.williams@intel.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=alison.schofield@intel.com \
    --cc=ben.widawsky@intel.com \
    --cc=hch@infradead.org \
    --cc=ira.weiny@intel.com \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=nvdimm@lists.linux.dev \
    --cc=patches@lists.linux.dev \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).