linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Greg KH <gregkh@linuxfoundation.org>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	Tejun Heo <tj@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux-pm mailing list <linux-pm@vger.kernel.org>,
	jiangshanlai@gmail.com, "Rafael J. Wysocki" <rafael@kernel.org>,
	"Brown, Len" <len.brown@intel.com>, Pavel Machek <pavel@ucw.cz>,
	zwisler@kernel.org, Dave Jiang <dave.jiang@intel.com>,
	bvanassche@acm.org
Subject: Re: [driver-core PATCH v5 5/9] driver core: Establish clear order of operations for deferred probe and remove
Date: Tue, 27 Nov 2018 08:49:30 -0800	[thread overview]
Message-ID: <b1ee672ccc2746ce57fb6a0def3d553faa462785.camel@linux.intel.com> (raw)
In-Reply-To: <CAPcyv4ieNzbx5iN74ZCAKDesgYt_67iM_DC-A35M=Z9jfR+YAQ@mail.gmail.com>

On Mon, 2018-11-26 at 18:35 -0800, Dan Williams wrote:
> On Mon, Nov 5, 2018 at 1:12 PM Alexander Duyck
> <alexander.h.duyck@linux.intel.com> wrote:
> > 
> > This patch adds an additional bit to the device struct named async_probe.
> > This additional bit allows us to guarantee ordering between probe and
> > remove operations.
> > 
> > This allows us to guarantee that if we execute a remove operation or a
> > driver load attempt on a given interface it will not attempt to update the
> > driver member asynchronously following the earlier operation. Previously
> > this guarantee was not present and could result in us attempting to remove
> > a driver from an interface only to have it show up later when it is
> > asynchronously loaded.
> > 
> > One change I made in addition is I replaced the use of "bool X:1" to define
> > the bitfield to a "u8 X:1" setup in order to resolve some checkpatch
> > warnings.
> 
> The usage of "us" in the changelog through me off, please reword this
> to explicitly state the subject like: "The additional bit allows the
> driver core to guarantee ordering between probe and remove
> operations."

Okay, I will work on the wording here.

> > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> > ---
> >  drivers/base/dd.c      |  104 +++++++++++++++++++++++++++---------------------
> >  include/linux/device.h |    9 ++--
> >  2 files changed, 64 insertions(+), 49 deletions(-)
> > 
> > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > index e74cefeb5b69..ed19cf0d6f9a 100644
> > --- a/drivers/base/dd.c
> > +++ b/drivers/base/dd.c
> > @@ -472,6 +472,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
> >                  drv->bus->name, __func__, drv->name, dev_name(dev));
> >         WARN_ON(!list_empty(&dev->devres_head));
> > 
> > +       /* clear async_probe flag as we are no longer deferring driver load */
> > +       dev->async_probe = false;
> >  re_probe:
> >         dev->driver = drv;
> > 
> > @@ -771,6 +773,10 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
> > 
> >         device_lock(dev);
> > 
> > +       /* nothing to do if async_probe has been cleared */
> > +       if (!dev->async_probe)
> > +               goto out_unlock;
> > +
> >         if (dev->parent)
> >                 pm_runtime_get_sync(dev->parent);
> > 
> > @@ -781,7 +787,7 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
> > 
> >         if (dev->parent)
> >                 pm_runtime_put(dev->parent);
> > -
> > +out_unlock:
> >         device_unlock(dev);
> > 
> >         put_device(dev);
> > @@ -826,6 +832,7 @@ static int __device_attach(struct device *dev, bool allow_async)
> >                          */
> >                         dev_dbg(dev, "scheduling asynchronous probe\n");
> >                         get_device(dev);
> > +                       dev->async_probe = true;
> >                         async_schedule(__device_attach_async_helper, dev);
> >                 } else {
> >                         pm_request_idle(dev);
> > @@ -971,62 +978,69 @@ EXPORT_SYMBOL_GPL(driver_attach);
> >   */
> >  static void __device_release_driver(struct device *dev, struct device *parent)
> >  {
> > -       struct device_driver *drv;
> > +       struct device_driver *drv = dev->driver;
> > 
> > -       drv = dev->driver;
> > -       if (drv) {
> > -               while (device_links_busy(dev)) {
> > -                       __device_driver_unlock(dev, parent);
> > +       /*
> > +        * In the event that we are asked to release the driver on an
> > +        * interface that is still waiting on a probe we can just terminate
> > +        * the probe by setting async_probe to false. When the async call
> > +        * is finally completed it will see this state and just exit.
> > +        */
> > +       dev->async_probe = false;
> > +       if (!drv)
> > +               return;
> 
> Patch 4 deleted the async_synchronize_full() that would have flushed
> in-flight ->probe() relative to the current ->remove(). If remove runs
> before probe then would seem to be deadlock condition, but if
> ->remove() runs before probe then dev->driver is NULL and we abort. So
> I'm struggling to see what value dev->async_probe provides over
> dev->driver?

So the issue addressed by patch 4 is a deadlock on the device driver
lock. Also it didn't make much sense to flush per device when we really
only needed to flush things once and then clean up the devices.

What I am doing with the async_probe value is providing a way to
gracefully shutdown any deferred probe calls that may be outstanding
without having to resort to an explicit flush. The problem with the
flush is that you have to release the device lock and as soon as you
have done that the potential for something else to occur gets opended
up. The general issue I have with just trying to use dev->driver is
that it assumes the probe has already completed. That may not always be
the case. My main concern would be with a device that is popping in and
out of existence quickly so that something like a remove call is being
made before the driver has been loaded. I would prefer all possible
cases where probe and then remove is called in quick succession to
result in no driver being loaded instead of a driver loading on a
device that maybe shouldn't be there.

As far as the remove before probe case that should have no effect with
this code since it would just transition async_probe from false to
false so the extra write would have no effect.



  reply	other threads:[~2018-11-27 16:49 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-05 21:11 [driver-core PATCH v5 0/9] Add NUMA aware async_schedule calls Alexander Duyck
2018-11-05 21:11 ` [driver-core PATCH v5 1/9] workqueue: Provide queue_work_node to queue work near a given NUMA node Alexander Duyck
2018-11-06  0:42   ` Bart Van Assche
2018-11-06 16:27     ` Alexander Duyck
2018-11-05 21:11 ` [driver-core PATCH v5 2/9] async: Add support for queueing on specific " Alexander Duyck
2018-11-07  0:50   ` Bart Van Assche
2018-11-05 21:11 ` [driver-core PATCH v5 3/9] device core: Consolidate locking and unlocking of parent and device Alexander Duyck
2018-11-05 21:11 ` [driver-core PATCH v5 4/9] driver core: Move async_synchronize_full call Alexander Duyck
2018-11-06  1:04   ` Bart Van Assche
2018-11-06 16:18     ` Alexander Duyck
2018-11-06 17:22       ` Bart Van Assche
2018-11-05 21:12 ` [driver-core PATCH v5 5/9] driver core: Establish clear order of operations for deferred probe and remove Alexander Duyck
2018-11-06  4:10   ` kbuild test robot
2018-11-06 23:51     ` Bart Van Assche
2018-11-07  0:52       ` Alexander Duyck
2018-11-23  1:23       ` Rong Chen
2018-11-23 14:19         ` Bart Van Assche
2018-11-06 23:48   ` Bart Van Assche
2018-11-07  1:34     ` Joe Perches
2018-11-08 23:42       ` Bart Van Assche
2018-11-11 14:31     ` Pavel Machek
2018-11-27  2:35   ` Dan Williams
2018-11-27 16:49     ` Alexander Duyck [this message]
2018-11-05 21:12 ` [driver-core PATCH v5 6/9] driver core: Probe devices asynchronously instead of the driver Alexander Duyck
2018-11-07  0:22   ` Bart Van Assche
2018-11-05 21:12 ` [driver-core PATCH v5 7/9] driver core: Attach devices on CPU local to device node Alexander Duyck
2018-11-07  0:24   ` Bart Van Assche
2018-11-05 21:12 ` [driver-core PATCH v5 8/9] PM core: Use new async_schedule_dev command Alexander Duyck
2018-11-07  0:24   ` Bart Van Assche
2018-11-05 21:12 ` [driver-core PATCH v5 9/9] libnvdimm: Schedule device registration on node local to the device Alexander Duyck
2018-11-07  0:26   ` Bart Van Assche
2018-11-06  0:50 ` [driver-core PATCH v5 0/9] Add NUMA aware async_schedule calls Bart Van Assche
2018-11-06 16:25   ` Alexander Duyck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b1ee672ccc2746ce57fb6a0def3d553faa462785.camel@linux.intel.com \
    --to=alexander.h.duyck@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=bvanassche@acm.org \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jiangshanlai@gmail.com \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=pavel@ucw.cz \
    --cc=rafael@kernel.org \
    --cc=tj@kernel.org \
    --cc=zwisler@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).