All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
	LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Intel Graphics Development <intel-gfx@lists.freedesktop.org>,
	DRI Development <dri-devel@lists.freedesktop.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>
Subject: Re: [Intel-gfx] [PATCH] console: implement lockdep support for console_lock
Date: Mon, 24 Sep 2012 13:36:46 +0200	[thread overview]
Message-ID: <20120924113646.GB3824@bremse> (raw)
In-Reply-To: <20120922200629.GC14004@kroah.com>

On Sat, Sep 22, 2012 at 01:06:29PM -0700, Greg KH wrote:
> On Sat, Sep 22, 2012 at 07:52:11PM +0200, Daniel Vetter wrote:
> > Dave Airlie recently discovered a locking bug in the fbcon layer,
> > where a timer_del_sync (for the blinking cursor) deadlocks with the
> > timer itself, since both (want to) hold the console_lock:
> > 
> > https://lkml.org/lkml/2012/8/21/36
> > 
> > Unfortunately the console_lock isn't a plain mutex and hence has no
> > lockdep support. Which resulted in a few days wasted of tracking down
> > this bug (complicated by the fact that printk doesn't show anything
> > when the console is locked) instead of noticing the bug much earlier
> > with the lockdep splat.
> > 
> > Hence I've figured I need to fix that for the next deadlock involving
> > console_lock - and with kms/drm growing ever more complex locking
> > that'll eventually happen.
> > 
> > Now the console_lock has rather funky semantics, so after a quick irc
> > discussion with Thomas Gleixner and Dave Airlie I've quickly ditched
> > the original idead of switching to a real mutex (since it won't work)
> > and instead opted to annotate the console_lock with lockdep
> > information manually.
> > 
> > There are a few special cases:
> > - The console_lock state is protected by the console_sem, and usually
> >   grabbed/dropped at _lock/_unlock time. But the suspend/resume code
> >   drops the semaphore without dropping the console_lock (see
> >   suspend_console/resume_console). But since the same thread that did
> >   the suspend will do the resume, we don't need to fix up anything.
> > 
> > - In the printk code there's a special trylock, only used to kick off
> >   the logbuffer printk'ing in console_unlock. But all that happens
> >   while lockdep is disable (since printk does a few other evil
> >   tricks). So no issue there, either.
> > 
> > - The console_lock can also be acquired form irq context (but only
> >   with a trylock). lockdep already handles that.
> > 
> > This all leaves us with annotating the normal console_lock, _unlock
> > and _trylock functions.
> > 
> > And yes, it works - simply unloading a drm kms driver resulted in
> > lockdep complaining about the deadlock in fbcon_deinit:
> > 
> > ======================================================
> > [ INFO: possible circular locking dependency detected ]
> > 3.6.0-rc2+ #552 Not tainted
> > -------------------------------------------------------
> > kms-reload/3577 is trying to acquire lock:
> >  ((&info->queue)){+.+...}, at: [<ffffffff81058c70>] wait_on_work+0x0/0xa7
> > 
> > but task is already holding lock:
> >  (console_lock){+.+.+.}, at: [<ffffffff81264686>] bind_con_driver+0x38/0x263
> > 
> > which lock already depends on the new lock.
> > 
> > the existing dependency chain (in reverse order) is:
> > 
> > -> #1 (console_lock){+.+.+.}:
> >        [<ffffffff81087440>] lock_acquire+0x95/0x105
> >        [<ffffffff81040190>] console_lock+0x59/0x5b
> >        [<ffffffff81209cb6>] fb_flashcursor+0x2e/0x12c
> >        [<ffffffff81057c3e>] process_one_work+0x1d9/0x3b4
> >        [<ffffffff810584a2>] worker_thread+0x1a7/0x24b
> >        [<ffffffff8105ca29>] kthread+0x7f/0x87
> >        [<ffffffff813b1204>] kernel_thread_helper+0x4/0x10
> > 
> > -> #0 ((&info->queue)){+.+...}:
> >        [<ffffffff81086cb3>] __lock_acquire+0x999/0xcf6
> >        [<ffffffff81087440>] lock_acquire+0x95/0x105
> >        [<ffffffff81058cab>] wait_on_work+0x3b/0xa7
> >        [<ffffffff81058dd6>] __cancel_work_timer+0xbf/0x102
> >        [<ffffffff81058e33>] cancel_work_sync+0xb/0xd
> >        [<ffffffff8120a3b3>] fbcon_deinit+0x11c/0x1dc
> >        [<ffffffff81264793>] bind_con_driver+0x145/0x263
> >        [<ffffffff81264a45>] unbind_con_driver+0x14f/0x195
> >        [<ffffffff8126540c>] store_bind+0x1ad/0x1c1
> >        [<ffffffff8127cbb7>] dev_attr_store+0x13/0x1f
> >        [<ffffffff8116d884>] sysfs_write_file+0xe9/0x121
> >        [<ffffffff811145b2>] vfs_write+0x9b/0xfd
> >        [<ffffffff811147b7>] sys_write+0x3e/0x6b
> >        [<ffffffff813b0039>] system_call_fastpath+0x16/0x1b
> > 
> > other info that might help us debug this:
> > 
> >  Possible unsafe locking scenario:
> > 
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(console_lock);
> >                                lock((&info->queue));
> >                                lock(console_lock);
> >   lock((&info->queue));
> > 
> >  *** DEADLOCK ***
> > 
> > v2: Mark the lockdep_map static, noticed by Jani Nikula.
> > 
> > Cc: Dave Airlie <airlied@gmail.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
> > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> >  kernel/printk.c |    9 +++++++++
> >  1 file changed, 9 insertions(+)
> 
> So I'm guessing I should take this through the tty tree, right?  Any
> objections to that for 3.7?

I didn't know who would be the relevant maintainer, so just spammed a few
people. Would be awesome if you could merge these patches for 3.7, and at
least Alan Cox seems to like them:

http://marc.info/?l=linux-fbdev&m=134564125601147&w=1

Thanks, Daniel
> 
> thanks,
> 
> greg k-h

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

  reply	other threads:[~2012-09-24 11:36 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-17 23:03 [PATCH 1/2] [RESEND] console: use might_sleep in console_lock Daniel Vetter
2012-09-17 23:03 ` [PATCH 2/2] [RESEND] console: implement lockdep support for console_lock Daniel Vetter
2012-09-18  7:33   ` Jani Nikula
2012-09-18  7:33     ` Jani Nikula
2012-09-19  7:30     ` Daniel Vetter
2012-09-22 17:52     ` [PATCH] " Daniel Vetter
2012-09-22 20:06       ` [Intel-gfx] " Greg KH
2012-09-22 20:06         ` Greg KH
2012-09-24 11:36         ` Daniel Vetter [this message]
2012-10-02 12:56         ` [Intel-gfx] " Daniel Vetter
2012-10-02 13:28           ` Greg KH
2012-10-02 13:28             ` Greg KH
2012-10-02 13:31             ` [Intel-gfx] " Daniel Vetter
2012-09-24 12:17   ` [PATCH 2/2] [RESEND] " Peter Zijlstra
2012-09-24 12:24     ` Peter Zijlstra
2012-09-24 12:54       ` Daniel Vetter
2012-09-24 13:10         ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120924113646.GB3824@bremse \
    --to=daniel@ffwll.ch \
    --cc=a.p.zijlstra@chello.nl \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.