linux-kernel.vger.kernel.org archive mirror
From: Daniel Vetter <daniel@ffwll.ch>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	DRI Development <dri-devel@lists.freedesktop.org>,
	Ramalingam C <ramalingam.c@intel.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Daniel Vetter <daniel.vetter@intel.com>
Subject: Re: [PATCH] drivers/base: use a worker for sysfs unbind
Date: Wed, 12 Dec 2018 12:08:40 +0100
Message-ID: <20181212110840.GA21184@phenom.ffwll.local>
In-Reply-To: <20181210102058.GO21184@phenom.ffwll.local>

On Mon, Dec 10, 2018 at 11:20:58AM +0100, Daniel Vetter wrote:
> On Mon, Dec 10, 2018 at 11:18:32AM +0100, Daniel Vetter wrote:
> > On Mon, Dec 10, 2018 at 11:06:34AM +0100, Greg Kroah-Hartman wrote:
> > > On Mon, Dec 10, 2018 at 09:46:53AM +0100, Daniel Vetter wrote:
> > > > Drivers might want to remove some sysfs files, which needs the same
> > > > locks and ends up angering lockdep. Relevant snippet of the stack
> > > > trace:
> > > > 
> > > >   kernfs_remove_by_name_ns+0x3b/0x80
> > > >   bus_remove_driver+0x92/0xa0
> > > >   acpi_video_unregister+0x24/0x40
> > > >   i915_driver_unload+0x42/0x130 [i915]
> > > >   i915_pci_remove+0x19/0x30 [i915]
> > > >   pci_device_remove+0x36/0xb0
> > > >   device_release_driver_internal+0x185/0x250
> > > >   unbind_store+0xaf/0x180
> > > >   kernfs_fop_write+0x104/0x190
> > > > 
> > > > I've stumbled over this because some new patches by Ram connect the
> > > > snd-hda-intel unload (where we do use sysfs unbind) with the locking
> > > > chains in the i915 unload code (but without creating a new loop),
> > > > which upset our CI. But the bug is already there and can be easily
> > > > reproduced by unbinding i915 directly.
> > > 
> > > This is odd, why wouldn't any driver hit this issue?  And why now since
> > > you say this is triggerable today?
> > 
> > The above backtrace is triggered by unbinding i915 on current upstream
> > kernels. Note: it will crash later on rather badly in the
> > fbdev/fbcon/vtconsole hell, but that's a separate issue (which can be
> > worked around by first unbinding fbcon manually through sysfs).
> > 
> > > I know scsi was doing some strange things like trying to remove the
> > > device itself from a sysfs callback on the device, which requires it to
> > > just call a different kobject function created just for that type of
> > > thing.  Would that also make sense to do here instead of your workqueue?
> > 
> > Note how we blow up on unregistering sw device instances supported by i915
> > in entirely different subsystems. I guess most drivers just have sysfs
> > files for their own stuff, where this is done as you describe. The problem
> > is that there's an awful lot of unrelated stuff hanging off i915.
> > 
> > Or maybe acpi_video is busted, and should be using a different function.
> > You haven't said which one, and I have no idea which one it is ...
> > 
> > And in case the context wasn't clear: This is unbinding the i915 pci
> > driver, which triggers the above lockdep recursion splat.
> 
> btw another option for "fixing" this would be to annotate the mutex_lock
> in kernfs_remove_by_name_ns as recursive. Which just shuts up lockdep (and
> might hide some real bugs), but would get the job done since there's not
> actually a deadlock here. Just lockdep being annoyed.

So what's the pick? I can do the typing, but I don't understand the driver
core interactions well enough to know what would be best here.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
