linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jarek Poplawski <jarkao2@o2.pl>
To: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Neil Brown <neilb@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Folkert van Heusden <folkert@vanheusden.com>,
	linux-kernel@vger.kernel.org,
	"J\. Bruce Fields" <bfields@fieldses.org>,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: [PATCH] Re: [2.6.20] BUG: workqueue leaked lock
Date: Wed, 21 Mar 2007 16:16:51 +0100	[thread overview]
Message-ID: <20070321151651.GA4547@ff.dom.local> (raw)
In-Reply-To: <20070321144620.GC78@tv-sign.ru>

On Wed, Mar 21, 2007 at 05:46:20PM +0300, Oleg Nesterov wrote:
> On 03/21, Jarek Poplawski wrote:
> >
> > On Tue, Mar 20, 2007 at 07:07:59PM +0300, Oleg Nesterov wrote:
> > > On 03/20, Jarek Poplawski wrote:
> > ...
> > > > >>> On Thu, 2007-03-15 at 11:06 -0800, Andrew Morton wrote:
> > > > >>>>> On Tue, 13 Mar 2007 17:50:14 +0100 Folkert van Heusden <folkert@vanheusden.com> wrote:
> > > > >>>>> ...
> > > > >>>>> [ 1756.728209] BUG: workqueue leaked lock or atomic: nfsd4/0x00000000/3577
> > > > >>>>> [ 1756.728271]     last function: laundromat_main+0x0/0x69 [nfsd]
> > > > >>>>> [ 1756.728392] 2 locks held by nfsd4/3577:
> > > > >>>>> [ 1756.728435]  #0:  (client_mutex){--..}, at: [<c1205b88>] mutex_lock+0x8/0xa
> > > > >>>>> [ 1756.728679]  #1:  (&inode->i_mutex){--..}, at: [<c1205b88>] mutex_lock+0x8/0xa
> > > > >>>>> [ 1756.728923]  [<c1003d57>] show_trace_log_lvl+0x1a/0x30
> > > > >>>>> [ 1756.729015]  [<c1003d7f>] show_trace+0x12/0x14
> > > > >>>>> [ 1756.729103]  [<c1003e79>] dump_stack+0x16/0x18
> > > > >>>>> [ 1756.729187]  [<c102c2e8>] run_workqueue+0x167/0x170
> > > > >>>>> [ 1756.729276]  [<c102c437>] worker_thread+0x146/0x165
> > > > >>>>> [ 1756.729368]  [<c102f797>] kthread+0x97/0xc4
> > > > >>>>> [ 1756.729456]  [<c1003bdb>] kernel_thread_helper+0x7/0x10
> > > > >>>>> [ 1756.729547]  =======================
> > ...
> > > > This check is valid with keventd, but it looks like nfsd runs
> > > > kthread by itself. I'm not sure it's illegal to hold locks then,
> > > 
> > > nfsd creates laundry_wq by itself, yes, but cwq->thread runs with
> > > lockdep_depth() == 0. Unless we have a bug with lockdep_depth(),
> > > lockdep_depth() != 0 means that work->func() returns with a lock
> > > held (or it can flush its own workqueue under lock, but in that case
> > > we should have a different trace).
> > 
> > IMHO you can only say this thread is supposed to run with
> > lockdep_depth() == 0. lockdep_depth is counted within a process,
> > which starts before f(), so the only way to say f() leaked locks
> > is to check these locks before and after f().
> 
> Sorry, I can't understand you. lockdep_depth is counted within a process,
> which starts before f(), yes. This process is cwq->thread, it was forked
> during create_workqueue(). It does not take any locks directly, only by
> calling work->func(). laundry_wq doesn't differ from keventd_wq or any other
> wq in this sense. nfsd does not "runs kthread by itself", it inserts the
> work and wakes up cwq->thread.

I think we know how it all should start and go. If only analyzing
the code could be enough, this current check of lockdep_depth()
is unnecessary too, because laundromat_code code looks as good
as run_workqueue code. I send it for testing and I mean it -
something strange is going here, so some checks should be added
- I didn't say mine is the right and will certainly help.  So, if
you have another idea for looking after it, let us know.

> 
> > > Personally I agree with Andrew:
> > > >
> > > > > OK.  That's not necessarily a bug: one could envisage a (weird) piece of
> > > > > code which takes a lock then releases it on a later workqueue invokation.
> > 
> > But this code is named here as laundromat_main and it doesn't
> > seem to work like this.
> 
> This means we have a problem with leak detection.

I totally agree!

> 
> > > > +		ld = lockdep_depth(current);
> > > > +
> > > >  		f(work);
> > > >  
> > > > -		if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
> > > > +		if (unlikely(in_atomic() || (ld -= lockdep_depth(current)))) {
> > > 
> > > and with this change we will also have a BUG report on "then releases it on a
> > > later workqueue invokation".
> > 
> > Then we could say at least this code is weird (buggy - in my opinion).
> > This patch doesn't change the way the "standard" code is treated,
> > so I cannot see any possibility to get it worse then now.
> 
> I didn't mean this patch makes things worse (except it conflicts with other
> patches in -mm tree). In fact, it may improve the diagnostics. My point is
> that this patch afaics has nothing to do with the discussed problem.

I thought you are trying to diagnose a bug and maybe Folkert could test,
if this patch makes any difference. Sorry, if I missed the real problem.

Cheers,
Jarek P.

  reply	other threads:[~2007-03-21 15:12 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-13 16:50 [2.6.20] BUG: workqueue leaked lock Folkert van Heusden
2007-03-15 19:06 ` Andrew Morton
2007-03-15 19:17   ` Folkert van Heusden
2007-03-16 14:49     ` Jarek Poplawski
2007-03-20 11:17     ` dquot.c: possible circular locking " Jarek Poplawski
2007-03-20 11:22       ` Jarek Poplawski
2007-03-20 11:31         ` Jarek Poplawski
2007-03-20 12:19           ` Jan Kara
2007-03-20 13:35             ` Arjan van de Ven
2007-03-20 14:21               ` Jan Kara
2007-03-20 14:18                 ` Arjan van de Ven
2007-03-20 13:44             ` Jarek Poplawski
2007-03-20 14:00               ` Jan Kara
2007-03-16  8:41   ` Peter Zijlstra
2007-03-16 11:39     ` Andrew Morton
2007-03-19  6:24       ` Neil Brown
2007-03-20  9:37         ` [PATCH] " Jarek Poplawski
2007-03-20 16:07           ` Oleg Nesterov
2007-03-21  8:05             ` Jarek Poplawski
2007-03-21 14:46               ` Oleg Nesterov
2007-03-21 15:16                 ` Jarek Poplawski [this message]
2007-03-21 15:17                   ` Folkert van Heusden
2007-03-21 15:29                   ` Oleg Nesterov
2007-03-21 18:16                     ` Oleg Nesterov
2007-03-22  6:11                       ` [PATCH] lockdep: lockdep_depth vs. debug_locks " Jarek Poplawski
2007-03-22  6:28                         ` Andrew Morton
2007-03-22  7:06                           ` Jarek Poplawski
2007-03-22  7:23                             ` Jarek Poplawski
2007-03-22  7:13                           ` Jarek Poplawski
2007-03-22  8:26                           ` Jarek Poplawski
2007-03-22  6:57                         ` [PATCH] lockdep: debug_show_all_locks & debug_show_held_locks vs. debug_locks Jarek Poplawski
2007-03-22  7:23                           ` Peter Zijlstra
2007-03-22  9:06                             ` Ingo Molnar
2007-03-22  7:22                         ` [PATCH] lockdep: lockdep_depth vs. debug_locks Re: [2.6.20] BUG: workqueue leaked lock Peter Zijlstra
2007-03-22  9:06                           ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070321151651.GA4547@ff.dom.local \
    --to=jarkao2@o2.pl \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=bfields@fieldses.org \
    --cc=folkert@vanheusden.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=neilb@suse.de \
    --cc=oleg@tv-sign.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).