linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: "Chen, Mike Ximing" <mike.ximing.chen@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"arnd@arndb.de" <arnd@arndb.de>,
	"pierre-louis.bossart@linux.intel.com" 
	<pierre-louis.bossart@linux.intel.com>
Subject: Re: [PATCH v8 04/20] dlb: add device ioctl layer and first three ioctls
Date: Tue, 12 Jan 2021 13:03:08 -0800	[thread overview]
Message-ID: <CAPcyv4hGxLZGEkfnqdLfF-a1CzfEjLux-TBxXztbknFhEe9mYA@mail.gmail.com> (raw)
In-Reply-To: <X/sYSfac3GQ8SsqO@kroah.com>

On Sun, Jan 10, 2021 at 7:06 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Sat, Jan 09, 2021 at 01:49:42PM -0800, Dan Williams wrote:
> > On Sat, Jan 9, 2021 at 12:34 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Sat, Jan 09, 2021 at 07:49:24AM +0000, Chen, Mike Ximing wrote:
> > > > > > +static int dlb_ioctl_arg_size[NUM_DLB_CMD] = {
> > > > > > + sizeof(struct dlb_get_device_version_args),
> > > > > > + sizeof(struct dlb_create_sched_domain_args),
> > > > > > + sizeof(struct dlb_get_num_resources_args)
> > > > >
> > > > > That list.
> > > > >
> > > > > Ugh, no.  that's no way to write maintainable code that you will be able
> > > > > to understand in 2 years.
> > > > >
> > > >
> > > > dlb_ioctl() was implemented with switch-case and real function calls previously.
> > > > We changed to the table/list implementation during a code restructure. I will move
> > > > back to the old implementation.
> > >
> > > Who said to change this?  And why did they say that?  Please go back to
> > > those developers and point them at this thread so that doesn't happen
> > > again...
> > >
> > > > > > +{
> > > > > > + struct dlb *dlb;
> > > > > > + dlb_ioctl_fn_t fn;
> > > > > > + u32 cmd_nr;
> > > > > > + void *karg;
> > > > > > + int size;
> > > > > > + int ret;
> > > > > > +
> > > > > > + dlb = f->private_data;
> > > > > > +
> > > > > > + if (!dlb) {
> > > > > > +         pr_err("dlb: [%s()] Invalid DLB data\n", __func__);
> > > > > > +         return -EFAULT;
> > > > >
> > > > > This error value is only allowed if you really do have a memory fault.
> > > > >
> > > > > Hint, you do not here.
> > > > >
> > > > > How can that value ever be NULL?
> > > > >
> > > >
> > > > It is targeted at some very rare cases, in which an ioctl command are called immediately after a device unbind (by a different process/application).
> > >
> > > And how can that happen?  If it really can happen, where is the lock
> > > that you are holding to keep that pointer from becoming "stale" right
> > > after you assign it?
> > >
> > > So either this never can happen, or your logic here for this type of
> > > thing is totally wrong.  Please work to determine which it is.
> >
> > I would have preferred a chance to offer a reviewed-by on this set
> > before it went out (per the required process) to validate that the
> > feedback on the lifetime handling was properly addressed, it wasn't,
> > but lets deal with this on the list now.
> >
> > The race to handle is the one identified by cdev_del():
> >
> >  * NOTE: This guarantees that cdev device will no longer be able to be
> >  * opened, however any cdevs already open will remain and their fops will
> >  * still be callable even after cdev_del returns.
> >
> > This means that the dlb->private_data is pointing to a live device, a
> > dying device, or NULL. Without revalidating to the dlb pointer under a
> > lock, or some other coordinated reference cout, it can transition
> > states underneath the running ioctl.
>
> But, that's only the case if this is the last cdev reference held here,
> right?  How can a close be called if a filehandle is still open such
> that an ioctl can be called?
>
> This should just be a "simple" char device operation, with no need to be
> fancy or anything odd like that, right?  If not, then yes, this really
> does need a real lock.
>
> > Greg, I'm thinking of taking a shot at a document, "botching up device
> > lifetimes",  in the same spirit as
> > Documentation/process/botching-up-ioctls.rst to lay out the different
> > schemes for revalidating driver private data in ioctls.
>
> Sure, but again, it should be "simple" in that an ioctl can't be called
> after close() happens.

Yes, for checking that file->private_data is not NULL it is sufficient
to just assume that ioctl() can't be called after close().

That's not my concern though. The open race that cdev_del() does not
address is ioctl() called after device-unbind. The open fd is never
revoked and can live past device_unregister() in which case the ioctl
needs to revalidate the device, or (not recommended) block unbind /
driver-remove while open file descriptors are outstanding.

>
> > It strikes me that a helper like this might address many of the common patterns:
> >
> > +/**
> > + * get_live_device() - increment reference count for device iff !dead
> > + * @dev: device.
> > + *
> > + * Forward the call to get_device() if the device is still alive. If
> > + * this is called with the device_lock() held then the device is
> > + * guaranteed to not die until the device_lock() is dropped.
> > + */
> > +struct device *get_live_device(struct device *dev)
> > +{
> > +       return dev && !dev->p->dead ? get_device(dev) : NULL;
> > +}
> > +EXPORT_SYMBOL_GPL(get_live_device);
>
> Ick, no, that's still racy :(

Hence the comment about needing to synchronize with the driver doing
device_unregister().

>
> And a cdev is not a "real" struct device, despite looking like one if
> you squint at it.  The patches from Christoph should be merged soon that
> remove the last remants of the logic that kind of looked like a struct
> device operation (with a kobject), and after that, I will clean it out
> to keep it from looking like it ties into the driver model entirely, as
> many people get this wrong, because it does not.

Agree, but many drivers still tie cdev lifetime to a struct device.

>
> > Alternatively, if device_lock() is too awkward for a driver it could
> > use its own lock and kill_device().
> >
> > ...am I off base thinking that cdev_del vs fops liveness is a
> > widespread problem worth documenting new design patterns?
>
> It shouldn't be a problem, again, because who would be able to close a
> char device node and still be able to call ioctl on it?  The VFS layer
> should prevent that from happening, right?

Per above, unbind vs and revoking new ioctl() invocations is the concern.

  reply	other threads:[~2021-01-12 21:51 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-05  2:58 [PATCH v8 00/20] dlb: introduce DLB device driver Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 01/20] dlb: add skeleton for DLB driver Mike Ximing Chen
2021-01-07 19:35   ` Greg KH
2021-01-09  6:09     ` Chen, Mike Ximing
2021-01-05  2:58 ` [PATCH v8 02/20] dlb: initialize device Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 03/20] dlb: add resource and device initialization Mike Ximing Chen
2021-01-07 19:40   ` Greg KH
2021-01-10 17:22     ` Chen, Mike Ximing
2021-01-05  2:58 ` [PATCH v8 04/20] dlb: add device ioctl layer and first three ioctls Mike Ximing Chen
2021-01-07 19:41   ` Greg KH
2021-01-09  6:17     ` Chen, Mike Ximing
2021-01-07 19:41   ` Greg KH
2021-01-09  7:55     ` Chen, Mike Ximing
2021-01-07 19:50   ` Greg KH
2021-01-09  7:49     ` Chen, Mike Ximing
2021-01-09  8:33       ` Greg KH
2021-01-09 21:49         ` Dan Williams
2021-01-10 15:07           ` Greg KH
2021-01-12 21:03             ` Dan Williams [this message]
2021-01-13  9:57               ` Greg KH
2021-01-21 23:14                 ` Dan Williams
2021-01-10  4:30         ` Chen, Mike Ximing
2021-01-05  2:58 ` [PATCH v8 05/20] dlb: add scheduling domain configuration Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 06/20] dlb: add domain software reset Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 07/20] dlb: add low-level register reset operations Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 08/20] dlb: add runtime power-management support Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 09/20] dlb: add queue create, reset, get-depth ioctls Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 10/20] dlb: add register operations for queue management Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 11/20] dlb: add ioctl to configure ports and query poll mode Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 12/20] dlb: add register operations for port management Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 13/20] dlb: add port mmap support Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 14/20] dlb: add start domain ioctl Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 15/20] dlb: add queue map, unmap, and pending unmap operations Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 16/20] dlb: add port map/unmap state machine Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 17/20] dlb: add static queue map register operations Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 18/20] dlb: add dynamic " Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 19/20] dlb: add queue unmap " Mike Ximing Chen
2021-01-05  2:58 ` [PATCH v8 20/20] dlb: queue map/unmap workqueue Mike Ximing Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPcyv4hGxLZGEkfnqdLfF-a1CzfEjLux-TBxXztbknFhEe9mYA@mail.gmail.com \
    --to=dan.j.williams@intel.com \
    --cc=arnd@arndb.de \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mike.ximing.chen@intel.com \
    --cc=pierre-louis.bossart@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).