linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ben Widawsky <ben.widawsky@intel.com>
To: Ariel.Sibley@microchip.com
Cc: linux-cxl@vger.kernel.org, linux-acpi@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org,
	linux-pci@vger.kernel.org, helgaas@kernel.org,
	cbrowy@avery-design.com, hch@infradead.org,
	dan.j.williams@intel.com, david@redhat.com, rientjes@google.com,
	ira.weiny@intel.com, jcm@jonmasters.org,
	Jonathan.Cameron@huawei.com, rafael.j.wysocki@intel.com,
	rdunlap@infradead.org, vishal.l.verma@intel.com,
	jgroves@micron.com, sean.v.kelley@intel.com,
	Ahmad.Danesh@microchip.com, Varada.Dighe@microchip.com,
	Kirthi.Shenoy@microchip.com, Sanjay.Goyal@microchip.com
Subject: Re: [PATCH v2 5/8] cxl/mem: Add a "RAW" send command
Date: Wed, 10 Feb 2021 11:12:29 -0800	[thread overview]
Message-ID: <20210210191229.xgsr4s27iveo37qv@intel.com> (raw)
In-Reply-To: <MN2PR11MB364513777E713224B3BB7D74888D9@MN2PR11MB3645.namprd11.prod.outlook.com>

On 21-02-10 18:46:04, Ariel.Sibley@microchip.com wrote:
> > > > > > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > > > > > index c4ba3aa0a05d..08eaa8e52083 100644
> > > > > > --- a/drivers/cxl/Kconfig
> > > > > > +++ b/drivers/cxl/Kconfig
> > > > > > @@ -33,6 +33,24 @@ config CXL_MEM
> > > > > >
> > > > > >           If unsure say 'm'.
> > > > > >
> > > > > > +config CXL_MEM_RAW_COMMANDS
> > > > > > +       bool "RAW Command Interface for Memory Devices"
> > > > > > +       depends on CXL_MEM
> > > > > > +       help
> > > > > > +         Enable CXL RAW command interface.
> > > > > > +
> > > > > > +         The CXL driver ioctl interface may assign a kernel ioctl command
> > > > > > +         number for each specification defined opcode. At any given point in
> > > > > > +         time the number of opcodes that the specification defines and a device
> > > > > > +         may implement may exceed the kernel's set of associated ioctl function
> > > > > > +         numbers. The mismatch is either by omission, specification is too new,
> > > > > > +         or by design. When prototyping new hardware, or developing /
> > > > > > debugging
> > > > > > +         the driver it is useful to be able to submit any possible command to
> > > > > > +         the hardware, even commands that may crash the kernel due to their
> > > > > > +         potential impact to memory currently in use by the kernel.
> > > > > > +
> > > > > > +         If developing CXL hardware or the driver say Y, otherwise say N.
> > > > >
> > > > > Blocking RAW commands by default will prevent vendors from developing user
> > > > > space tools that utilize vendor specific commands. Vendors of CXL.mem devices
> > > > > should take ownership of ensuring any vendor defined commands that could cause
> > > > > user data to be exposed or corrupted are disabled at the device level for
> > > > > shipping configurations.
> > > >
> > > > Thanks for brining this up Ariel. If there is a recommendation on how to codify
> > > > this, I would certainly like to know because the explanation will be long.
> > > >
> > > > ---
> > > >
> > > > The background:
> > > >
> > > > The enabling/disabling of the Kconfig option is driven by the distribution
> > > > and/or system integrator. Even if we made the default 'y', nothing stops them
> > > > from changing that. if you are using this driver in production and insist on
> > > > using RAW commands, you are free to carry around a small patch to get rid of the
> > > > WARN (it is a one-liner).
> > > >
> > > > To recap why this is in place - the driver owns the sanctity of the device and
> > > > therefore a [large] part of the whole system. What we can do as driver writers
> > > > is figure out the set of commands that are "safe" and allow those. Aside from
> > > > being able to validate them, we're able to mediate them with other parallel
> > > > operations that might conflict. We gain the ability to squint extra hard at bug
> > > > reports. We provide a reason to try to use a well defined part of the spec.
> > > > Realizing that only allowing that small set of commands in a rapidly growing
> > > > ecosystem is not a welcoming API; we decided on RAW.
> > > >
> > > > Vendor commands can be one of two types:
> > > > 1. Some functionality probably most vendors want.
> > > > 2. Functionality that is really single vendor specific.
> > > >
> > > > Hopefully we can agree that the path for case #1 is to work with the consortium
> > > > to standardize a command that does what is needed and that can eventually become
> > > > part of UAPI. The situation is unfortunate, but temporary. If you won't be able
> > > > to upgrade your kernel, patch out the WARN as above.
> > > >
> > > > The second situation is interesting and does need some more thought and
> > > > discussion.
> > > >
> > > > ---
> > > >
> > > > I see 3 realistic options for truly vendor specific commands.
> > > > 1. Tough noogies. Vendors aren't special and they shouldn't do that.
> > > > 2. modparam to disable the WARN for specific devices (let the sysadmin decide)
> > > > 3. Try to make them part of UAPI.
> > > >
> > > > The right answer to me is #1, but I also realize I live in the real world.
> > > >
> > > > #2 provides too much flexibility. Vendors will just do what they please and
> > > > distros and/or integrators will be seen as hostile if they don't accommodate.
> > > >
> > > > I like #3, but I have a feeling not everyone will agree. My proposal for vendor
> > > > specific commands is, if it's clear it's truly a unique command, allow adding it
> > > > as part of UAPI (moving it out of RAW). I expect like 5 of these, ever. If we
> > > > start getting multiple per vendor, we've failed. The infrastructure is already
> > > > in place to allow doing this pretty easily. I think we'd have to draw up some
> > > > guidelines (like adding test cases for the command) to allow these to come in.
> > > > Anything with command effects is going to need extra scrutiny.
> > >
> > > This would necessitate adding specific opcode values in the range C000h-FFFFh
> > > to UAPI, and those would then be allowed for all CXL.mem devices, correct?  If
> > > so, I do not think this is the right approach, as opcodes in this range are by
> > > definition vendor defined.  A given opcode value will have totally different
> > > effects depending on the vendor.
> > 
> > Perhaps I didn't explain well enough. The UAPI would define the command ID to
> > opcode mapping, for example 0xC000. There would be a validation step in the
> > driver where it determines if it's actually the correct hardware to execute on.
> > So it would be entirely possible to have multiple vendor commands with the same
> > opcode.
> > 
> > So UAPI might be this:
> >         ___C(GET_HEALTH_INFO, "Get Health Info"),                         \
> >         ___C(GET_LOG, "Get Log"),                                         \
> >         ___C(VENDOR_FOO_XXX, "FOO"),                                      \
> >         ___C(VENDOR_BAR_XXX, "BAR"),                                      \
> > 
> > User space just picks the command they want, FOO/BAR. If they use VENDOR_BAR_XXX
> > on VENDOR_FOO's hardware, they will get an error return value.
> 
> Would the driver be doing this enforcement of vendor ID / opcode
> compatibility, or would the error return value mentioned here be from the
> device?  My concern is where the same opcode has two meanings for different
> vendors.  For example, for Vendor A opcode 0xC000 might report some form of
> status information, but for Vendor B it might have data side effects.  There
> may not have been any UAPI intention to expose 0xC000 for Vendor B devices,
> but the existence of 0xC000 in UAPI for Vendor A results in the data
> corrupting version of 0xC000 for Vendor B being allowed.  It would seem to me
> that even if the commands are in UAPI, the driver would still need to rely on
> the contents of the CEL to determine if the command should be allowed.

I think I might not be properly understanding your concern. There are two types
of errors in UAPI that represent 3 error conditions:

1. errno from the ioctl - parameter invalid kind of stuff, this would include using
   the vendor A UAPI on vendor B's device (assuming matching opcodes).
2. errno from the ioctl - transport error of some sort in the mailbox command -
   timeout on doorbell kind of thing.
3. cxl_send_command.retval - Device's error code.

Did that address your concern?

>  
> > > I think you may be on to something with the command effects.  But rather than
> > > "extra scrutiny" for opcodes that have command effects, would it make sense to
> > > allow vendor defined opcodes that have Bit[5:0] in the Command Effect field of
> > > the CEL Entry Structure (Table 173) set to 0?  In conjunction, those bits
> > > represent any change to the configuration or data within the device.  For
> > > commands that have no such effects, is there harm to allowing them?  Of
> > > course, this approach relies on the vendor to not misrepresent the command
> > > effects.
> > >
> > 
> > That last sentence is what worries me :-)
> 
> One must also rely on the vendor to not simply corrupt data at random. :) IMO
> the contents of the CEL should be believed by the driver, rather than the
> driver treating the device as a hostile actor.
> 

I respect your opinion, but my opinion is that driver writers absolutely cannot
rely on that. It would further the conversation a great deal to get concrete
examples of commands that couldn't be part of the core spec and had no effects.
I assume all vendors are going to avoid doing that, which is a real shame.

So far I haven't seen the consortium shoot something down from a vendor because
it is too vendor specific...

> > 
> > 
> > > >
> > > > In my opinion, as maintainers of the driver, we do owe the community an answer
> > > > as to our direction for this. Dan, what is your thought?

  reply	other threads:[~2021-02-10 19:14 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-10  0:02 [PATCH v2 0/8] CXL 2.0 Support Ben Widawsky
2021-02-10  0:02 ` [PATCH v2 1/8] cxl/mem: Introduce a driver for CXL-2.0-Type-3 endpoints Ben Widawsky
2021-02-10 16:17   ` Jonathan Cameron
2021-02-10 17:12     ` Ben Widawsky
2021-02-10 17:23       ` Jonathan Cameron
2021-02-10  0:02 ` [PATCH v2 2/8] cxl/mem: Find device capabilities Ben Widawsky
2021-02-10 13:32   ` Jonathan Cameron
2021-02-10 15:07     ` Jonathan Cameron
2021-02-10 16:55       ` Ben Widawsky
2021-02-10 17:30         ` Jonathan Cameron
2021-02-10 18:16         ` Ben Widawsky
2021-02-11  9:55           ` Jonathan Cameron
2021-02-11 15:55             ` Ben Widawsky
2021-02-12 13:27               ` Jonathan Cameron
2021-02-12 15:54                 ` Ben Widawsky
2021-02-11 18:27             ` Ben Widawsky
2021-02-12 13:23               ` Jonathan Cameron
2021-02-10 19:32     ` Ben Widawsky
2021-02-10 17:41   ` Jonathan Cameron
2021-02-10 18:53     ` Ben Widawsky
2021-02-10 19:54       ` Dan Williams
2021-02-11 10:01         ` Jonathan Cameron
2021-02-11 16:04           ` Ben Widawsky
2021-02-10  0:02 ` [PATCH v2 3/8] cxl/mem: Register CXL memX devices Ben Widawsky
2021-02-10 18:17   ` Jonathan Cameron
2021-02-11 10:17     ` Jonathan Cameron
2021-02-11 20:40       ` Dan Williams
2021-02-12 13:33         ` Jonathan Cameron
2021-02-10  0:02 ` [PATCH v2 4/8] cxl/mem: Add basic IOCTL interface Ben Widawsky
2021-02-10 18:45   ` Jonathan Cameron
2021-02-10 20:22     ` Ben Widawsky
2021-02-11  4:40     ` Dan Williams
2021-02-11 10:06       ` Jonathan Cameron
2021-02-11 16:54         ` Ben Widawsky
2021-02-14 16:30   ` Al Viro
2021-02-14 23:14     ` Ben Widawsky
2021-02-14 23:50       ` Al Viro
2021-02-14 23:57         ` Al Viro
2021-02-10  0:02 ` [PATCH v2 5/8] cxl/mem: Add a "RAW" send command Ben Widawsky
2021-02-10 15:26   ` Ariel.Sibley
2021-02-10 16:49     ` Ben Widawsky
2021-02-10 18:03       ` Ariel.Sibley
2021-02-10 18:11         ` Ben Widawsky
2021-02-10 18:46           ` Ariel.Sibley
2021-02-10 19:12             ` Ben Widawsky [this message]
2021-02-11 16:43     ` Dan Williams
2021-02-11 11:19   ` Jonathan Cameron
2021-02-11 16:01     ` Ben Widawsky
2021-02-12 13:40       ` Jonathan Cameron
2021-02-10  0:02 ` [PATCH v2 6/8] cxl/mem: Enable commands via CEL Ben Widawsky
2021-02-11 12:02   ` Jonathan Cameron
2021-02-11 17:45     ` Ben Widawsky
2021-02-11 20:34       ` Dan Williams
2021-02-16 13:43     ` Bartosz Golaszewski
2021-02-10  0:02 ` [PATCH v2 7/8] cxl/mem: Add set of informational commands Ben Widawsky
2021-02-11 12:07   ` Jonathan Cameron
2021-02-10  0:02 ` [PATCH v2 8/8] MAINTAINERS: Add maintainers of the CXL driver Ben Widawsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210210191229.xgsr4s27iveo37qv@intel.com \
    --to=ben.widawsky@intel.com \
    --cc=Ahmad.Danesh@microchip.com \
    --cc=Ariel.Sibley@microchip.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=Kirthi.Shenoy@microchip.com \
    --cc=Sanjay.Goyal@microchip.com \
    --cc=Varada.Dighe@microchip.com \
    --cc=cbrowy@avery-design.com \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=hch@infradead.org \
    --cc=helgaas@kernel.org \
    --cc=ira.weiny@intel.com \
    --cc=jcm@jonmasters.org \
    --cc=jgroves@micron.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=rafael.j.wysocki@intel.com \
    --cc=rdunlap@infradead.org \
    --cc=rientjes@google.com \
    --cc=sean.v.kelley@intel.com \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).