From: Ben Widawsky <ben.widawsky@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: linux-cxl@vger.kernel.org,
Alison Schofield <alison.schofield@intel.com>,
Dan Williams <dan.j.williams@intel.com>,
Ira Weiny <ira.weiny@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>
Subject: Re: [RFC PATCH 4/4] cxl/region: Introduce concept of region configuration
Date: Mon, 14 Jun 2021 09:18:51 -0700 [thread overview]
Message-ID: <20210614161851.iipfwo5dvxkht7yw@intel.com> (raw)
In-Reply-To: <20210611145206.00003c50@Huawei.com>
On 21-06-11 14:52:06, Jonathan Cameron wrote:
> On Thu, 10 Jun 2021 11:57:25 -0700
> Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> > The region creation APIs leave a region unconfigured. Configuring the
> > region will work in the same way as similar subsystems such as devdax.
> > Sysfs attrs will be provided to allow userspace to configure the region.
> > Finally once all configuration is complete, userspace may "commit" the
> > config. What the kernel decides to do after a config is committed is out
> > of scope at this point.
> >
> > Introduced here are the most basic attributes needed to configure a
> > region.
> >
> > A x1 interleave example is provided below:
> >
> > decoder1.0
> > ├── create_region
> > ├── delete_region
> > ├── devtype
> > ├── locked
> > ├── region1.0:0
> > │ ├── offset
> > │ ├── size
> > │ ├── subsystem -> ../../../../../../../bus/cxl
> > │ ├── target0
> > │ ├── uevent
> > │ ├── uuid
> > │ └── verify
> > ├── size
> > ├── start
> > ├── subsystem -> ../../../../../../bus/cxl
> > ├── target_list
> > ├── target_type
> > └── uevent
> >
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> > Documentation/ABI/testing/sysfs-bus-cxl | 27 +++
> > drivers/cxl/mem.h | 2 +
> > drivers/cxl/region.c | 227 +++++++++++++++++++++++-
> > 3 files changed, 255 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> > index 699c8514fd7b..d7174a84f70d 100644
> > --- a/Documentation/ABI/testing/sysfs-bus-cxl
> > +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> > @@ -159,3 +159,30 @@ Description:
> > integer is returned describing the first error found in the
> > configuration. A verified region can still fail binding due to
> > lack of resources.
> > +
> > +What: /sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/{offset,size}
>
> I missed this before, but why do we need X.Y in the region naming given it's
> always inside decoderX.Y. Seems like RegionZ would work.
>
Yes. There are two options, either we do X.Y:Z, or Z where Z is globally unique
across all decoders. The reason for this is the devices are symlinked into sysfs
and so you can't have the same Z for multiple regions in different decoders.
My preference is to have regionX.y:[0-n] for each decoder, rather than the globally
unique. I'll entertain an argument the other way though.
> > +Date: June, 2021
> > +KernelVersion: v5.14
> > +Contact: linux-cxl@vger.kernel.org
> > +Description:
> > + A region resides within an address space that is claimed by a
> > + decoder. The region will be of some size within the address
> > + space and at some offset that must also reside within the
> > + address space. The size and position of the region is specified
> > + by these attributes.
> Could perhaps reword this. Something like.
>
> The region defined by size and offset must be fully contained
> within the address space.
> > +
> > +What: /sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/uuid
> > +Date: June, 2021
> > +KernelVersion: v5.14
> > +Contact: linux-cxl@vger.kernel.org
> > +Description:
> > + The unique identifier for the region.
> > +
> > +What: /sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/target[0-15]
> > +Date: June, 2021
> > +KernelVersion: v5.14
> > +Contact: linux-cxl@vger.kernel.org
> > +Description:
> > + Memory devices are the backing storage for a region. Each target
> > + must be populated with a memdev in order for the region to be
> > + eligible to be activated.
>
> How do you do that? What is written to this file?
A name of the dev is written here, ie "mem0". I can add this to the
documentation...
>
> > diff --git a/drivers/cxl/mem.h b/drivers/cxl/mem.h
> > index 9795aa924035..059fbf084fa1 100644
> > --- a/drivers/cxl/mem.h
> > +++ b/drivers/cxl/mem.h
> > @@ -58,6 +58,7 @@ static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
> > * @dev: This region's device.
> > * @id: This regions id. Id is globally unique across all regions.
> > * @res: Address space consumed by this region.
> > + * @uuid: The UUID for this region.
> > * @list: Node in decoders region list.
> > * @targets: The memory devices comprising the region.
> > */
> > @@ -65,6 +66,7 @@ struct cxl_region {
> > struct device dev;
> > int id;
> > struct resource res;
> > + uuid_t uuid;
> > struct list_head list;
> > struct cxl_memdev *targets[];
> > };
> > diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> > index ea1ac848c713..a69ee00514cb 100644
> > --- a/drivers/cxl/region.c
> > +++ b/drivers/cxl/region.c
> > @@ -3,7 +3,9 @@
> > #include <linux/io-64-nonatomic-lo-hi.h>
> > #include <linux/device.h>
> > #include <linux/module.h>
> > +#include <linux/sizes.h>
> > #include <linux/slab.h>
> > +#include <linux/uuid.h>
> > #include <linux/idr.h>
> > #include "cxl.h"
> > #include "mem.h"
> > @@ -20,15 +22,130 @@
> > * relationship between decoder and region when the region is interleaved.
> > */
> >
> > +static struct cxl_region *to_cxl_region(struct device *dev);
> > +
> > +#define cxl_region_ways(region) \
> > + to_cxl_decoder((region)->dev.parent)->interleave_ways
> > +
> > static ssize_t verify_show(struct device *dev, struct device_attribute *attr, char *buf)
> > {
> > + struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
> > + struct cxl_region *region = to_cxl_region(dev);
> > + struct resource decode_res;
> > + int i;
> > +
> > + decode_res = (struct resource)DEFINE_RES_MEM(cxld->range.start,
> > + range_len(&cxld->range));
> > +
> > + /* Invalid region size */
> > + if (!resource_contains(&decode_res, ®ion->res))
> > + return sysfs_emit(buf, "size");
>
> Perhaps "outside region"? Size might be fine, but not the offset.
> Also, docs say a negative integer is returned for this attribute.
> Those docs need to call out the full list of things that might be returned
> so that userspace can know what to expect.
Okay.
>
> > +
> > + if (resource_size(®ion->res) % (SZ_256M * cxld->interleave_ways))
> > + return sysfs_emit(buf, "alignment");
> > +
> > + /* Missing target memory device */
> > + for (i = 0; i < cxld->interleave_ways; i++)
> > + if (!region->targets[i])
> > + return sysfs_emit(buf, "memdev");
> > +
> > return sysfs_emit(buf, "0");
> > }
> >
> > static DEVICE_ATTR_RO(verify);
> >
> > +static bool is_region_active(struct cxl_region *region)
> > +{
> > + /* TODO: Regions can't be activated yet. */
> > + return false;
> > +}
> > +
> > +static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
> > + char *buf)
> > +{
> > + struct cxl_region *region = to_cxl_region(dev);
> > +
> > + return sysfs_emit(buf, "%#llx\n", region->res.start);
> > +}
> > +
> > +static ssize_t offset_store(struct device *dev, struct device_attribute *attr,
> > + const char *buf, size_t len)
> > +{
> > + struct cxl_region *region = to_cxl_region(dev);
> > + unsigned long long val;
> > + ssize_t rc;
> > +
> > + rc = kstrtoull(buf, 0, &val);
> > + if (rc)
> > + return rc;
> > +
> > + if (is_region_active(region)) {
> > + /* TODO: */
> > + } else {
> > + region->res.start = val;
> > + }
> > +
> > + return len;
> > +}
> > +
> > +static DEVICE_ATTR_RW(offset);
> > +
> > +static ssize_t size_show(struct device *dev, struct device_attribute *attr,
> > + char *buf)
> > +{
> > + struct cxl_region *region = to_cxl_region(dev);
> > +
> > + return sysfs_emit(buf, "%llu\n", resource_size(®ion->res));
> > +}
> > +
> > +static ssize_t size_store(struct device *dev, struct device_attribute *attr,
> > + const char *buf, size_t len)
> > +{
> > + struct cxl_region *region = to_cxl_region(dev);
> > + unsigned long long val;
> > + ssize_t rc;
> > +
> > + rc = kstrtoull(buf, 0, &val);
> > + if (rc)
> > + return rc;
> > +
> > + if (is_region_active(region)) {
> > + /* TODO: */
> > + } else {
> > + region->res.end = region->res.start + val - 1;
> > + }
> > +
> > + return len;
> > +}
> > +static DEVICE_ATTR_RW(size);
> > +
> > +static ssize_t uuid_show(struct device *dev, struct device_attribute *attr,
> > + char *buf)
> > +{
> > + struct cxl_region *region = to_cxl_region(dev);
> > +
> > + return sysfs_emit(buf, "%pUb\n", ®ion->uuid);
> > +}
> > +static ssize_t uuid_store(struct device *dev, struct device_attribute *attr,
> > + const char *buf, size_t len)
> > +{
> > + struct cxl_region *region = to_cxl_region(dev);
> > + ssize_t rc;
> > +
> > + if (len != UUID_STRING_LEN + 1)
> > + return -EINVAL;
> > +
> > + rc = uuid_parse(buf, ®ion->uuid);
> > +
> > + return rc ? rc : len;
> > +}
> > +static DEVICE_ATTR_RW(uuid);
> > +
> > static struct attribute *region_attrs[] = {
> > &dev_attr_verify.attr,
> > + &dev_attr_offset.attr,
> > + &dev_attr_size.attr,
> > + &dev_attr_uuid.attr,
> > NULL,
> > };
> >
> > @@ -36,8 +153,111 @@ static const struct attribute_group region_group = {
> > .attrs = region_attrs,
> > };
> >
> > +static size_t show_targetN(struct cxl_region *region, char *buf, int n)
> > +{
> > + if (region->targets[n])
> > + return sysfs_emit(buf, "%s\n", dev_name(®ion->targets[n]->dev));
> > + else
> > + return sysfs_emit(buf, "nil\n");
>
> This needs documenting in the ABI docs. I'd I guessed it would return an empty
> string.
Okay.
>
> > +}
> > +
> > +static size_t set_targetN(struct cxl_region *region, const char *buf, int n, size_t len)
> > +{
> > + struct device *memdev_dev;
> > + struct cxl_memdev *cxlmd;
> > + ssize_t rc;
> > + int val;
> > +
> > + rc = kstrtoint(buf, 0, &val);
> > + if (!rc && val == 0) {
> > + cxlmd = region->targets[n] = cxlmd;
> > + if (cxlmd)
> > + put_device(&cxlmd->dev);
> > + region->targets[n] = NULL;
> > + return len;
> > + }
> > +
> > + memdev_dev = bus_find_device_by_name(&cxl_bus_type, NULL, buf);
> > + if (!memdev_dev)
> > + return -ENOENT;
> > +
> > + cxlmd = to_cxl_memdev(memdev_dev);
> > + get_device(&cxlmd->dev);
> > + region->targets[n] = cxlmd;
> > +
> > + return len;
> > +}
> > +
> > +#define TARGET_ATTR_RW(n) \
> > + static ssize_t target##n##_show( \
G> > + struct device *dev, struct device_attribute *attr, char *buf) \
> > + { \
> > + return show_targetN(to_cxl_region(dev), buf, n); \
> > + } \
> > + static ssize_t target##n##_store(struct device *dev, \
> > + struct device_attribute *attr, \
> > + const char *buf, size_t len) \
> > + { \
> > + return set_targetN(to_cxl_region(dev), buf, n, len); \
> > + } \
> > + static DEVICE_ATTR_RW(target##n)
> > +
> > +TARGET_ATTR_RW(0);
> > +TARGET_ATTR_RW(1);
> > +TARGET_ATTR_RW(2);
> > +TARGET_ATTR_RW(3);
> > +TARGET_ATTR_RW(4);
> > +TARGET_ATTR_RW(5);
> > +TARGET_ATTR_RW(6);
> > +TARGET_ATTR_RW(7);
> > +TARGET_ATTR_RW(8);
> > +TARGET_ATTR_RW(9);
> > +TARGET_ATTR_RW(10);
> > +TARGET_ATTR_RW(11);
> > +TARGET_ATTR_RW(12);
> > +TARGET_ATTR_RW(13);
> > +TARGET_ATTR_RW(14);
> > +TARGET_ATTR_RW(15);
> > +
> > +static struct attribute *interleave_attrs[] = {
> > + &dev_attr_target0.attr,
> > + &dev_attr_target1.attr,
> > + &dev_attr_target2.attr,
> > + &dev_attr_target3.attr,
> > + &dev_attr_target4.attr,
> > + &dev_attr_target5.attr,
> > + &dev_attr_target6.attr,
> > + &dev_attr_target7.attr,
> > + &dev_attr_target8.attr,
> > + &dev_attr_target9.attr,
> > + &dev_attr_target10.attr,
> > + &dev_attr_target11.attr,
> > + &dev_attr_target12.attr,
> > + &dev_attr_target13.attr,
> > + &dev_attr_target14.attr,
> > + &dev_attr_target15.attr,
> > + NULL,
> > +};
> > +
> > +static umode_t visible_targets(struct kobject *kobj, struct attribute *a, int n)
> > +{
> > + struct device *dev = container_of(kobj, struct device, kobj);
> > + struct cxl_region *region = to_cxl_region(dev);
> > +
> > + if (n < cxl_region_ways(region))
> > + return a->mode;
> > + return 0;
> > +}
> > +
> > +static const struct attribute_group region_interleave_group = {
> > + .attrs = interleave_attrs,
> > + .is_visible = visible_targets,
> > +};
> > +
> > static const struct attribute_group *region_groups[] = {
> > ®ion_group,
> > + ®ion_interleave_group,
> > + NULL,
> > };
> >
> > static void cxl_region_release(struct device *dev);
> > @@ -58,8 +278,13 @@ static struct cxl_region *to_cxl_region(struct device *dev)
> >
> > void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region)
> > {
> > + int i;
> > +
> > ida_free(&cxld->region_ida, region->id);
> > - kfree(region->targets);
>
> This line looks like a bug in earlier patch as targets is allocated
> as part of the allocation of region.
Yep, thanks.
>
> > + for (i = 0; i < cxld->interleave_ways; i++) {
> > + if (region->targets[i])
> > + put_device(®ion->targets[i]->dev);
> > + }
> > kfree(region);
> > }
> >
>
next prev parent reply other threads:[~2021-06-14 16:18 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-10 18:57 [RFC PATCH 0/4] Region Creation Ben Widawsky
2021-06-10 18:57 ` [RFC PATCH 1/4] cxl/region: Add region creation ABI Ben Widawsky
2021-06-11 13:31 ` Jonathan Cameron
2021-06-16 17:38 ` Ben Widawsky
2021-06-10 18:57 ` [RFC PATCH 2/4] cxl/region: Create attribute structure / verify Ben Widawsky
2021-06-11 13:37 ` Jonathan Cameron
2021-06-12 0:59 ` Dan Williams
2021-06-14 16:12 ` Ben Widawsky
2021-06-10 18:57 ` [RFC PATCH 3/4] cxl: Move cxl_memdev conversion helper to mem.h Ben Widawsky
2021-06-10 18:57 ` [RFC PATCH 4/4] cxl/region: Introduce concept of region configuration Ben Widawsky
2021-06-11 13:52 ` Jonathan Cameron
2021-06-14 16:18 ` Ben Widawsky [this message]
2021-06-14 16:20 ` Jonathan Cameron
2021-06-11 13:11 ` [RFC PATCH 0/4] Region Creation Jonathan Cameron
2021-06-11 13:53 ` Jonathan Cameron
2021-06-11 16:12 ` Ben Widawsky
2021-06-12 0:44 ` Dan Williams
2021-06-14 8:20 ` Jonathan Cameron
2021-06-14 16:12 ` Ben Widawsky
2021-06-14 21:04 ` Dan Williams
2021-06-14 21:54 ` Ben Widawsky
2021-06-14 22:21 ` Dan Williams
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210614161851.iipfwo5dvxkht7yw@intel.com \
--to=ben.widawsky@intel.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=dan.j.williams@intel.com \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).