From: Dan Williams <dan.j.williams@intel.com>
To: Ben Widawsky <ben.widawsky@intel.com>
Cc: linux-cxl@vger.kernel.org, Linux NVDIMM <nvdimm@lists.linux.dev>,
	 Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Vishal L Verma <vishal.l.verma@intel.com>,
	 "Schofield, Alison" <alison.schofield@intel.com>,
	"Weiny, Ira" <ira.weiny@intel.com>
Subject: Re: [PATCH 17/23] cxl/mbox: Add exclusive kernel command support
Date: Tue, 10 Aug 2021 18:22:27 -0700
Message-ID: <CAPcyv4iLLPR+yijKNHceEKM4+fKQ4i6r+ZLYH+_b-ao6tznHLQ@mail.gmail.com>
In-Reply-To: <20210810220654.nztok7mxvjzaizhk@intel.com>

On Tue, Aug 10, 2021 at 3:07 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-08-10 14:52:18, Dan Williams wrote:
> > On Tue, Aug 10, 2021 at 2:35 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > On 21-08-09 15:29:18, Dan Williams wrote:
> > > > The CXL_PMEM driver expects exclusive control of the label storage area
> > > > space. Similar to the LIBNVDIMM expectation that the label storage area
> > > > is only writable from userspace when the corresponding memory device is
> > > > not active in any region, the expectation is the native CXL_PCI UAPI
> > > > path is disabled while the cxl_nvdimm for a given cxl_memdev device is
> > > > active in LIBNVDIMM.
> > > >
> > > > Add the ability to toggle the availability of a given command for the
> > > > UAPI path. Use that new capability to shutdown changes to partitions and
> > > > the label storage area while the cxl_nvdimm device is actively proxying
> > > > commands for LIBNVDIMM.
> > > >
> > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > > > ---
> > > >  drivers/cxl/core/mbox.c |    5 +++++
> > > >  drivers/cxl/cxlmem.h    |    2 ++
> > > >  drivers/cxl/pmem.c      |   35 +++++++++++++++++++++++++++++------
> > > >  3 files changed, 36 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > > > index 23100231e246..f26962d7cb65 100644
> > > > --- a/drivers/cxl/core/mbox.c
> > > > +++ b/drivers/cxl/core/mbox.c
> > > > @@ -409,6 +409,11 @@ static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
> > > >               }
> > > >       }
> > > >
> > > > +     if (test_bit(cmd->info.id, cxlm->exclusive_cmds)) {
> > > > +             rc = -EBUSY;
> > > > +             goto out;
> > > > +     }
> > > > +
> > >
> > > This breaks our current definition for cxl_raw_allow_all. All the test machinery
> >
> > That's deliberate; this exclusion is outside of the raw policy. I
> > don't think raw_allow_all should override kernel self protection of
> > data structures, like labels, that it needs to maintain consistency.
> > If userspace wants to use raw_allow_all to send LSA manipulation
> > commands it must do so while the device is not active on the nvdimm
> > side of the house. You'll see that:
> >
> > ndctl disable-region all
> > <mutate labels>
> > ndctl enable-region all
> >
> > ...is a common pattern from custom label update flows.
> >
>
> I won't argue about raw_allow_all since we never did document its debugfs
> meaning (however, my intention was always to let userspace trump the kernel
> (which was why we tainted)).

Yeah, we should document it, because in my mind the taint was for the
possibility of passing commands completely unknown to the kernel. If
someone really wants to subvert the kernel's label area coherency they
could simply have a vendor specific command that writes the labels.
Instead, if the kernel knows the opcode it is free to apply policy to
it as it sees fit, and if the opcode is unknown to the kernel then
raw_allow_all policy lets it through. We already have security
commands as another case of an opcode that the kernel knows about and
thinks is a good idea to block. This is a dynamic version of the same.
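
To make that split concrete, here is a rough userspace model of the
reasoning, not the driver code. The struct, the function names, and the
opcode values are made up for illustration, and returning -EPERM for an
unknown opcode when raw_allow_all is off is an assumption of the sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative opcode values only; not taken from the driver headers. */
enum {
	OP_SET_LSA     = 0x4103,	/* an opcode the kernel knows about */
	OP_VENDOR_TEST = 0xC000,	/* stands in for an unknown opcode */
};

/* Hypothetical policy state for one memory device. */
struct raw_policy {
	bool raw_allow_all;	/* override that only covers unknown opcodes */
	bool lsa_exclusive;	/* true while the kernel owns the label area */
};

static bool opcode_is_known(uint16_t op)
{
	return op == OP_SET_LSA;
}

/* Returns 0 if the raw opcode may be sent, otherwise a negative errno. */
static int raw_opcode_policy(const struct raw_policy *p, uint16_t op)
{
	/* Unknown to the kernel: only raw_allow_all decides (assumption). */
	if (!opcode_is_known(op))
		return p->raw_allow_all ? 0 : -EPERM;

	/* Known to the kernel: kernel policy applies, override or not. */
	if (op == OP_SET_LSA && p->lsa_exclusive)
		return -EBUSY;

	return 0;
}
```

With raw_allow_all set and the label area claimed, OP_SET_LSA is still
refused with -EBUSY while OP_VENDOR_TEST goes through.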

> Either way, could you please move the actual check to
> cxl_validate_cmd_from_user() instead of handle...(). Validate is the main
> function to determine whether a command is allowed to be sent on behalf of the
> user.  I think just putting it next to the enabled cmd check would make a lot
> more sense. And please add the EBUSY meaning to the kdocs.

Sure, sounds good.
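
Roughly what I have in mind for the reworked check, as a standalone
userspace sketch rather than the actual patch. The bitmap helpers, the
struct, and the command ids below are stand-ins for test_bit() /
set_bit(), struct cxl_mem, and the CXL_MEM_COMMAND_ID_* values, and
-ENOTTY for a command that is not enabled is an assumption here.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical command ids, standing in for CXL_MEM_COMMAND_ID_*. */
enum cmd_id { CMD_GET_LSA, CMD_SET_LSA, CMD_SET_PARTITION_INFO, CMD_ID_MAX };

#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_LONGS(n)	(((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Userspace stand-in for the two bitmaps in struct cxl_mem. */
struct mem_model {
	unsigned long enabled_cmds[BITMAP_LONGS(CMD_ID_MAX)];
	unsigned long exclusive_cmds[BITMAP_LONGS(CMD_ID_MAX)];
};

static bool bit_is_set(const unsigned long *map, unsigned int nr)
{
	return (map[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}

static void bit_set(unsigned long *map, unsigned int nr)
{
	assert(nr < CMD_ID_MAX);
	map[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

/* Returns 0 if the command may be submitted for the user, else -errno. */
static int validate_cmd_from_user(const struct mem_model *m, unsigned int id)
{
	if (id >= CMD_ID_MAX)
		return -ENOTTY;

	/* The command must be enabled for the hardware... */
	if (!bit_is_set(m->enabled_cmds, id))
		return -ENOTTY;

	/* ...and must not be claimed for exclusive kernel use. */
	if (bit_is_set(m->exclusive_cmds, id))
		return -EBUSY;

	return 0;
}
```

Keeping the exclusive test right after the enabled test means everything
that decides whether a user command may be sent lives in one function,
and -EBUSY gets a single documented meaning there.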

>
> > > for whether a command can be submitted was supposed to happen in
> > > cxl_validate_cmd_from_user(). Various versions of the original patches made
> > > cxl_mem_raw_command_allowed() grow more intelligence (ie. more than just the
> > > opcode). I think this check belongs there with more intelligence.
> > >
> > > I don't love the EBUSY because it already had a meaning for concurrent use of
> > > the mailbox, but I can't think of a better errno.
> >
> > It's the existing errno that happens from nvdimm land when the kernel
> > owns the label area, so it would be confusing to invent a new one for
> > the same behavior now:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/nvdimm/bus.c#n1013
> >
> > >
> > > >       dev_dbg(dev,
> > > >               "Submitting %s command for user\n"
> > > >               "\topcode: %x\n"
> > > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > > index df4f3636a999..f6cfe84a064c 100644
> > > > --- a/drivers/cxl/cxlmem.h
> > > > +++ b/drivers/cxl/cxlmem.h
> > > > @@ -102,6 +102,7 @@ struct cxl_mbox_cmd {
> > > >   * @mbox_mutex: Mutex to synchronize mailbox access.
> > > >   * @firmware_version: Firmware version for the memory device.
> > > >   * @enabled_cmds: Hardware commands found enabled in CEL.
> > > > + * @exclusive_cmds: Commands that are kernel-internal only
> > > >   * @pmem_range: Persistent memory capacity information.
> > > >   * @ram_range: Volatile memory capacity information.
> > > >   * @mbox_send: @dev specific transport for transmitting mailbox commands
> > > > @@ -117,6 +118,7 @@ struct cxl_mem {
> > > >       struct mutex mbox_mutex; /* Protects device mailbox and firmware */
> > > >       char firmware_version[0x10];
> > > >       DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
> > > > +     DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
> > > >
> > > >       struct range pmem_range;
> > > >       struct range ram_range;
> > > > diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> > > > index 9652c3ee41e7..11410df77444 100644
> > > > --- a/drivers/cxl/pmem.c
> > > > +++ b/drivers/cxl/pmem.c
> > > > @@ -16,9 +16,23 @@
> > > >   */
> > > >  static struct workqueue_struct *cxl_pmem_wq;
> > > >
> > > > -static void unregister_nvdimm(void *nvdimm)
> > > > +static void unregister_nvdimm(void *_cxl_nvd)
> > > >  {
> > > > -     nvdimm_delete(nvdimm);
> > > > +     struct cxl_nvdimm *cxl_nvd = _cxl_nvd;
> > > > +     struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> > > > +     struct cxl_mem *cxlm = cxlmd->cxlm;
> > > > +     struct device *dev = &cxl_nvd->dev;
> > > > +     struct nvdimm *nvdimm;
> > > > +
> > > > +     nvdimm = dev_get_drvdata(dev);
> > > > +     if (nvdimm)
> > > > +             nvdimm_delete(nvdimm);
> > > > +
> > > > +     mutex_lock(&cxlm->mbox_mutex);
> > > > +     clear_bit(CXL_MEM_COMMAND_ID_SET_PARTITION_INFO, cxlm->exclusive_cmds);
> > > > +     clear_bit(CXL_MEM_COMMAND_ID_SET_SHUTDOWN_STATE, cxlm->exclusive_cmds);
> > > > +     clear_bit(CXL_MEM_COMMAND_ID_SET_LSA, cxlm->exclusive_cmds);
> > > > +     mutex_unlock(&cxlm->mbox_mutex);
> > > >  }
> > > >
> > > >  static int match_nvdimm_bridge(struct device *dev, const void *data)
> > > > @@ -39,6 +53,8 @@ static struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
> > > >  static int cxl_nvdimm_probe(struct device *dev)
> > > >  {
> > > >       struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> > > > +     struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> > > > +     struct cxl_mem *cxlm = cxlmd->cxlm;
> > > >       struct cxl_nvdimm_bridge *cxl_nvb;
> > > >       unsigned long flags = 0;
> > > >       struct nvdimm *nvdimm;
> > > > @@ -52,17 +68,24 @@ static int cxl_nvdimm_probe(struct device *dev)
> > > >       if (!cxl_nvb->nvdimm_bus)
> > > >               goto out;
> > > >
> > > > +     mutex_lock(&cxlm->mbox_mutex);
> > > > +     set_bit(CXL_MEM_COMMAND_ID_SET_PARTITION_INFO, cxlm->exclusive_cmds);
> > > > +     set_bit(CXL_MEM_COMMAND_ID_SET_SHUTDOWN_STATE, cxlm->exclusive_cmds);
> > > > +     set_bit(CXL_MEM_COMMAND_ID_SET_LSA, cxlm->exclusive_cmds);
> > > > +     mutex_unlock(&cxlm->mbox_mutex);
> > > > +
> > >
> > > What's the concurrency this lock is trying to protect against?
> >
> > I can add a comment. It synchronizes against in-flight ioctl users to
> > make sure that any requests have completed before the policy changes.
> > I.e. do not allow userspace to race the nvdimm subsystem attaching to
> > get a consistent state of the persistent memory configuration.
> >
>
> Ah, so the expectation is that these things will be set not just on
> probe/unregister()? I would assume an IOCTL couldn't happen while
> probe/unregister is happening.

The ioctl is going through the cxl_pci driver. That driver has
finished probe and published the ioctl before this lockout can run in
cxl_nvdimm_probe(), so it's entirely possible that label writing
ioctls are in progress when cxl_nvdimm_probe() eventually fires.
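
A toy userspace model of that ordering, assuming the ioctl path checks
the exclusive state and performs the write under the same mutex that the
attach path takes to flip the policy. A pthread mutex stands in for
mbox_mutex, and every name below is hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the relevant parts of struct cxl_mem. */
struct mem_model {
	pthread_mutex_t mbox_mutex;	/* models cxlm->mbox_mutex */
	bool set_lsa_exclusive;		/* models one bit of exclusive_cmds */
	int label_writes;		/* counts completed label writes */
};

static int mem_model_init(struct mem_model *m)
{
	m->set_lsa_exclusive = false;
	m->label_writes = 0;
	return pthread_mutex_init(&m->mbox_mutex, NULL);
}

/* ioctl path: the policy check and the write share one critical section. */
static int user_set_lsa(struct mem_model *m)
{
	int rc = 0;

	pthread_mutex_lock(&m->mbox_mutex);
	if (m->set_lsa_exclusive)
		rc = -EBUSY;
	else
		m->label_writes++;	/* stands in for the mailbox transaction */
	pthread_mutex_unlock(&m->mbox_mutex);

	return rc;
}

/* probe path: blocks until any in-flight user command has finished. */
static void nvdimm_attach(struct mem_model *m)
{
	pthread_mutex_lock(&m->mbox_mutex);
	m->set_lsa_exclusive = true;
	pthread_mutex_unlock(&m->mbox_mutex);
}

/* unregister path: hands the command back to userspace. */
static void nvdimm_detach(struct mem_model *m)
{
	pthread_mutex_lock(&m->mbox_mutex);
	m->set_lsa_exclusive = false;
	pthread_mutex_unlock(&m->mbox_mutex);
}
```

Because nvdimm_attach() has to wait for the mutex, a label write that is
already in progress completes first, and every write that starts after
attach returns sees -EBUSY.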

The current policy for /sys/bus/nd/devices/nmemX devices is that
label writes are allowed as long as the nmemX device is not active in
any region. I was thinking the CXL policy is coarser: label writes via
/sys/bus/cxl/devices/memX ioctls are disallowed as long as the bridge
for that device into the nvdimm subsystem is active.
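
Side by side, the two policies look something like the sketch below.
Both structs and both function names are invented for illustration, and
the real checks live in the respective subsystems.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical view of an nmemX device on the nvdimm side. */
struct nmem_model {
	int active_regions;	/* regions this nmemX currently backs */
};

/* Hypothetical view of a memX device on the CXL side. */
struct cxl_memdev_model {
	bool nvdimm_bridge_active;	/* cxl_nvdimm attached to LIBNVDIMM */
};

/* nvdimm policy: label writes are refused only while a region is active. */
static int nmem_label_write_check(const struct nmem_model *n)
{
	return n->active_regions > 0 ? -EBUSY : 0;
}

/* CXL policy: label writes are refused for as long as the bridge is active. */
static int cxl_label_write_check(const struct cxl_memdev_model *c)
{
	return c->nvdimm_bridge_active ? -EBUSY : 0;
}
```

The difference shows up when the bridge is attached but no region is
active: the nmemX path accepts the label write, the memX ioctl path
does not.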


