All of lore.kernel.org
 help / color / mirror / Atom feed
From: Saravana Kannan <saravanak@google.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Cc: Android Kernel" <kernel-team@android.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v1] driver core: Allow device link operations inside sync_state()
Date: Thu, 14 Nov 2019 11:32:08 -0800	[thread overview]
Message-ID: <CAGETcx9UXatfj4nOYLWscZ7ewa1p15RR2sbHzOsrFCe6fTgrHg@mail.gmail.com> (raw)
In-Reply-To: <CAJZ5v0hZZzq=U+6Tpja+hsZOhPMDVik5E+9kX=q2nSCbE8Lf4Q@mail.gmail.com>

On Thu, Nov 14, 2019 at 1:13 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Wed, Nov 13, 2019 at 3:36 AM Saravana Kannan <saravanak@google.com> wrote:
> >
> > Some sync_state() implementations might need to call APIs that in turn
> > make calls to device link APIs. So, do the sync_state() callbacks
> > without holding the device link lock.
> >
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > ---
> >  drivers/base/core.c | 63 +++++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 55 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index e6d3e6d485da..d396b0597c10 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -48,6 +48,8 @@ early_param("sysfs.deprecated", sysfs_deprecated_setup);
> >  static LIST_HEAD(wait_for_suppliers);
> >  static DEFINE_MUTEX(wfs_lock);
> >  static LIST_HEAD(deferred_sync);
> > +static LIST_HEAD(sync_list);
> > +static DEFINE_MUTEX(sync_lock);
> >  static unsigned int defer_sync_state_count = 1;
> >
> >  #ifdef CONFIG_SRCU
> > @@ -695,7 +697,23 @@ int device_links_check_suppliers(struct device *dev)
> >         return ret;
> >  }
> >
> > -static void __device_links_supplier_sync_state(struct device *dev)
> > +/** __device_links_queue_sync_state - Queue a device for sync_state() callback
> > + * @dev: Device to call sync_state() on
> > + *
> > + * Queues a device for a sync_state() callback when the device links write lock
> > + * isn't held. This allows the sync_state() execution flow to use device links
> > + * APIs.  The caller must ensure this function is called with
> > + * device_links_write_lock() held.
> > + *
> > + * This function does a get_device() to make sure the device is not freed while
> > + * on this list.
> > + *
> > + * So the caller must also ensure that device_links_flush_sync_list() is called
> > + * as soon as the caller releases device_links_write_lock().  This is necessary
> > + * to make sure the sync_state() is called in a timely fashion and the
> > + * put_device() is called on this device.
> > + */
> > +static void __device_links_queue_sync_state(struct device *dev)
> >  {
> >         struct device_link *link;
> >
> > @@ -709,12 +727,35 @@ static void __device_links_supplier_sync_state(struct device *dev)
> >                         return;
> >         }
> >
> > -       if (dev->bus->sync_state)
> > -               dev->bus->sync_state(dev);
> > -       else if (dev->driver && dev->driver->sync_state)
> > -               dev->driver->sync_state(dev);
> > -
> >         dev->state_synced = true;

<combining emails>
> BTW, this should go into device_links_flush_sync_list() too, shouldn't it?

Once a device is added to a sync list, it shouldn't be readded to the
list again -- that'll cause list corruption. So, I intentionally put
it here. Once a device is added to the list, it WILL get synced -- so
it's okay to flag it as such here.

For example, after all the consumers of a supplier probe and it's been
added to the sync_list, a new consumer could attach itself to the
supplier and then probe. You don't want that to cause list corruption.
I'll add a comment like:

/* Flag here to avoid trying to add the same device to the sync_list twice */

> > +
> > +       mutex_lock(&sync_lock);
>
> Total nit: I add empty lines around lock/unlock as a rule to make them
> more visible.

Ack

> > +       WARN_ON(!list_empty(&dev->links.defer_sync));
> > +       if (list_empty(&dev->links.defer_sync)) {
>
> Do you really need to duplicate that check?
>
> > +               get_device(dev);
> > +               list_add_tail(&dev->links.defer_sync, &sync_list);
> > +       }
> > +       mutex_unlock(&sync_lock);
> > +}
>
> What about adding
>
> } else {
>         WARN_ON(1);
> }
>
> here instead?

Sounds good.

>
> > +
>
> Kerneldoc?

Looked too obvious of a function and a static one to add a kernel doc
for. I can add it since you seem to want one.

> > +static void device_links_flush_sync_list(void)
> > +{
> > +       struct device *dev, *tmp;
> > +
> > +       mutex_lock(&sync_lock);
> > +
> > +       list_for_each_entry_safe(dev, tmp, &sync_list, links.defer_sync) {
> > +               list_del_init(&dev->links.defer_sync);
> > +               device_lock(dev);
> > +               if (dev->bus->sync_state)
> > +                       dev->bus->sync_state(dev);
> > +               else if (dev->driver && dev->driver->sync_state)
> > +                       dev->driver->sync_state(dev);
> > +               device_unlock(dev);
> > +               put_device(dev);
> > +       }
> > +
> > +       mutex_unlock(&sync_lock);
> >  }
> >
> >  void device_links_supplier_sync_state_pause(void)
> > @@ -738,11 +779,16 @@ void device_links_supplier_sync_state_resume(void)
> >                 goto out;
> >
> >         list_for_each_entry_safe(dev, tmp, &deferred_sync, links.defer_sync) {
> > -               __device_links_supplier_sync_state(dev);
> > +               /*
> > +                * Delete from deferred_sync list before queuing it to
> > +                * sync_list because defer_sync is used for both lists.
> > +                */
> >                 list_del_init(&dev->links.defer_sync);
> > +               __device_links_queue_sync_state(dev);
> >         }
> >  out:
> >         device_links_write_unlock();
> > +       device_links_flush_sync_list();
>
> Wouldn't it be better to use a local list in this function instead of
> the global sync_list?

Yeah, not a bad idea. But this is more flexible if we find in the
future that we need to queue to sync list in a function A which is
called by function B with device link write lock held. In that case,
function A will have to do the flushing.

> I guess the idea is that you wouldn't be able to do the flush in
> device_links_driver_bound() below,

Why do you say that? Local list would work in that case too, no?

> but do you really need that flush?

Yeah, device_links_driver_bound() is where I actually found this issue
when trying to implement sync_state() for a downstream driver.

> It looks like this is the only place calling
> __device_links_queue_sync_state() and you do a flush right away after
> the loop, so why is the extra flush in device_links_driver_bound()
> needed?

No, I call it in device_links_driver_bound() too :)

>
> >  }
> >
> >  static int sync_state_resume_initcall(void)
> > @@ -815,12 +861,13 @@ void device_links_driver_bound(struct device *dev)
> >                 if (defer_sync_state_count)
> >                         __device_links_supplier_defer_sync(link->supplier);
> >                 else
> > -                       __device_links_supplier_sync_state(link->supplier);
> > +                       __device_links_queue_sync_state(link->supplier);

See, right here.

> >         }
> >
> >         dev->links.status = DL_DEV_DRIVER_BOUND;
> >
> >         device_links_write_unlock();
> > +       device_links_flush_sync_list();
>
> It looks like devices can be added to sync_list in parallel  with each
> other

Not sure what you mean by "in parallel with each other". But, two
different threads running device_links_supplier_sync_state_resume()
and device_links_driver_bound() could end up updating the sync_list
before either one of them gets to flush it.

> and so is it always OK to always flush all of them after one of
> them has been bound to a driver?

Not sure what you mean by "after one of them has been bound to a
driver". A device won't ever get added to the sync_list unless all
it's consumers have probed. So once it's added to sync_list, it's okay
to flush it any time.

>
> >  }
> >
> >  static void device_link_drop_managed(struct device_link *link)
> > --

So as far as I can tell, the stuff that might need fixing are:
1. The spacing nit.
2. Whether we should move to a local list -- I like to keep it as is
for the flexibility.
3. Add a comment for why I set sync_state = 1 in the queuing function.

I'll skip (2) and send a patch now. And if you want to change (2), I
can update the patch again.

Thanks,
Saravana

  reply	other threads:[~2019-11-14 19:32 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-13  2:35 [PATCH v1] driver core: Allow device link operations inside sync_state() Saravana Kannan
2019-11-13  2:35 ` [PATCH v1] driver core: Clarify documentation for fwnode_operations.add_links() Saravana Kannan
2019-11-13 10:20   ` Rafael J. Wysocki
2019-11-14  9:13 ` [PATCH v1] driver core: Allow device link operations inside sync_state() Rafael J. Wysocki
2019-11-14 19:32   ` Saravana Kannan [this message]
2019-11-14  9:15 ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGETcx9UXatfj4nOYLWscZ7ewa1p15RR2sbHzOsrFCe6fTgrHg@mail.gmail.com \
    --to=saravanak@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=kernel-team@android.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rafael@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.