linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel.vetter@ffwll.ch>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	Ramalingam C <ramalingam.c@intel.com>,
	Greg KH <gregkh@linuxfoundation.org>,
	Daniel Vetter <daniel.vetter@intel.com>
Subject: Re: [PATCH] drivers/base: use a worker for sysfs unbind
Date: Thu, 13 Dec 2018 13:36:22 +0100	[thread overview]
Message-ID: <CAKMK7uF7noCEgwE0QYZWQFx-OPxipAF1MojUZ8KTo_SXfQW8+w@mail.gmail.com> (raw)
In-Reply-To: <CAJZ5v0iWshem3kuurF53gutVJ8jFm_caAbetK2CiSCpyc6ReeQ@mail.gmail.com>

On Thu, Dec 13, 2018 at 11:23 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Dec 13, 2018 at 10:58 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Thu, Dec 13, 2018 at 10:38:14AM +0100, Rafael J. Wysocki wrote:
> > > On Mon, Dec 10, 2018 at 9:47 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > >
> > > > Drivers might want to remove some sysfs files, which needs the same
> > > > locks and ends up angering lockdep. Relevant snippet of the stack
> > > > trace:
> > > >
> > > >   kernfs_remove_by_name_ns+0x3b/0x80
> > > >   bus_remove_driver+0x92/0xa0
> > > >   acpi_video_unregister+0x24/0x40
> > > >   i915_driver_unload+0x42/0x130 [i915]
> > > >   i915_pci_remove+0x19/0x30 [i915]
> > > >   pci_device_remove+0x36/0xb0
> > > >   device_release_driver_internal+0x185/0x250
> > > >   unbind_store+0xaf/0x180
> > > >   kernfs_fop_write+0x104/0x190
> > >
> > > Is the acpi_bus_unregister_driver() in acpi_video_unregister() the
> > > source of the lockdep unhappiness?
> >
> > Yeah I guess I cut out too much of the lockdep splat. It complains about
> > kernfs_fop_write and kernfs_remove_by_name_ns acquiring the same lock
> > class. It's ofc not the same lock, so no real deadlock. Getting the
> > device_release_driver outside of the callchain under kernfs_fop_write,
> > which this patch does, "fixes" it. For "fixes" = shut up lockdep.
>
> OK, so the problem really is that the operation is started via sysfs
> which means that this code is running under a lock already.
>
> Which lock does lockdep complain about, exactly?

mutex_lock(&of->mutex);

> > Other options:
> > - Anotate the recursion with the usual lockdep annotations. Potentially
> >   results in lockdep not catching real deadlocks (you can still have other
> >   loops closing the deadlock, maybe through some subsystem/bus lock).
> >
> > - Rewrite kernfs_fop_write to drop the lock (optionally, for callbacks
> >   that know what they're doing), which should be fine if we refcount
> >   everything properly (bus, driver & device).
> >
> > - Also note that probably the same bug exists on the bind sysfs interface,
> >   but we don't use that, so I don't care :-)
> >
> > - Most of these issues are never visible in normal usage, since normally
> >   driver bind/unbind is done from a kthread or model_load/unload, neither
> >   of which is running in the context of that kernfs mutex kernfs_fop_write
> >   holds. That's why I think the task work is the best solution, since it
> >   changes the locking context of the unbind sysfs to match the locking
> >   context of module unload and hotunplug.
>
> I think that using a task work here makes sense.  There is a drawback,
> which is that the original sysfs write will not wait for the driver to
> actually be released before returning to user space AFAICS, but that
> probably isn't a big deal.

This would happen with a normal work_struct, which runs on some other
thread eventually. That added asynonchrouns execution uncovered lots
of bugs in our CI (fbcon isn't solid, let's put it that way). Hence
the task work, which will be run before the syscall returns to
userspace, but outside of anything else. Was originally created to
avoid locking inversion on the final fput, where the same "must
complete before returning to userspace, but outside of any other
locking context" issue was causing trouble.

> Also please note that the patch changes the code flow slightly,
> because passing a non-NULL parent pointer to
> device_release_driver_internal() potentially has side effects, but
> that should not be a big deal either.

I can do the old code exactly, but afaict the non-NULL parent just
takes care of the parent bus locking for us, instead of hand-rolling
it in the caller. But if I missed something, I can easily undo that
part.

> > Unfortunately that trick doesn't work for the bind sysfs file, since that way we can't thread the errno value back to userspace.
>
> Right.  That is unless we wait for the operation to complete and check
> the error left behind by it.  That should be doable, but somewhat
> complicated.

For real deadlocks this doesn't fix anything, it just hides it from
lockdep. cross-release lockdep would still complain. If we want to fix
the bind side _and_ keep reporting the errno from the driver's bind
function, then we need to rework kernfs to and add a callback which
doesn't hold the mutex. Should be doable, just a pile more work.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

  parent reply	other threads:[~2018-12-13 12:36 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-10  8:46 [PATCH] drivers/base: use a worker for sysfs unbind Daniel Vetter
2018-12-10 10:06 ` Greg Kroah-Hartman
2018-12-10 10:18   ` Daniel Vetter
2018-12-10 10:20     ` Daniel Vetter
2018-12-12 11:08       ` Daniel Vetter
2018-12-12 11:19         ` Greg Kroah-Hartman
2018-12-12 12:40           ` Daniel Vetter
2018-12-13  9:38 ` Rafael J. Wysocki
2018-12-13  9:58   ` Daniel Vetter
2018-12-13 10:23     ` Rafael J. Wysocki
2018-12-13 11:05       ` Rafael J. Wysocki
2018-12-13 12:36       ` Daniel Vetter [this message]
2018-12-13 16:18         ` Rafael J. Wysocki
2018-12-13 16:25           ` Daniel Vetter
2018-12-13 18:09             ` Rafael J. Wysocki
2018-12-17 19:48               ` Daniel Vetter
2018-12-18  0:03                 ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKMK7uF7noCEgwE0QYZWQFx-OPxipAF1MojUZ8KTo_SXfQW8+w@mail.gmail.com \
    --to=daniel.vetter@ffwll.ch \
    --cc=daniel.vetter@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rafael@kernel.org \
    --cc=ramalingam.c@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).