All of lore.kernel.org
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: paulmck@linux.vnet.ibm.com, Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, C.Emde@osadl.org
Subject: Re: [PATCH 0/8] rcu: Ensure rcu read site is deadlock-immunity
Date: Wed, 07 Aug 2013 22:12:12 -0400	[thread overview]
Message-ID: <1375927932.6848.33.camel@gandalf.local.home> (raw)
In-Reply-To: <5202F8CC.2020703@cn.fujitsu.com>

On Thu, 2013-08-08 at 09:47 +0800, Lai Jiangshan wrote:

> > [  393.641012]        CPU0
> > [  393.641012]        ----
> > [  393.641012]   lock(&lock->wait_lock);
> > [  393.641012]   <Interrupt>
> > [  393.641012]     lock(&lock->wait_lock);
> 
> Patch2 causes it!
> When I found all lock which can (chained) nested in rcu_read_unlock_special(),
> I didn't notice rtmutex's lock->wait_lock is not nested in irq-disabled.
> 
> Two ways to fix it:
> 1) change rtmutex's lock->wait_lock, make it alwasys irq-disabled.
> 2) revert my patch2

Your patch 2 states:

"After patch 10f39bb1, "special & RCU_READ_UNLOCK_BLOCKED" can't be true
in irq nor softirq.(due to RCU_READ_UNLOCK_BLOCKED can only be set
when preemption)"

But then below we have:


> 
> > [  393.641012] 
> > [  393.641012]  *** DEADLOCK ***
> > [  393.641012] 
> > [  393.641012] no locks held by rcu_torture_rea/697.
> > [  393.641012] 
> > [  393.641012] stack backtrace:
> > [  393.641012] CPU: 3 PID: 697 Comm: rcu_torture_rea Not tainted 3.11.0-rc1+ #1
> > [  393.641012] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> > [  393.641012]  ffffffff8586fea0 ffff88001fcc3a78 ffffffff8187b4cb ffffffff8104a261
> > [  393.641012]  ffff88001e1a20c0 ffff88001fcc3ad8 ffffffff818773e4 0000000000000000
> > [  393.641012]  ffff880000000000 ffff880000000001 ffffffff81010a0a 0000000000000001
> > [  393.641012] Call Trace:
> > [  393.641012]  <IRQ>  [<ffffffff8187b4cb>] dump_stack+0x4f/0x84
> > [  393.641012]  [<ffffffff8104a261>] ? console_unlock+0x291/0x410
> > [  393.641012]  [<ffffffff818773e4>] print_usage_bug+0x1f5/0x206
> > [  393.641012]  [<ffffffff81010a0a>] ? save_stack_trace+0x2a/0x50
> > [  393.641012]  [<ffffffff810ae603>] mark_lock+0x283/0x2e0
> > [  393.641012]  [<ffffffff810ada10>] ? print_irq_inversion_bug.part.40+0x1f0/0x1f0
> > [  393.641012]  [<ffffffff810aef66>] __lock_acquire+0x906/0x1d40
> > [  393.641012]  [<ffffffff810ae94b>] ? __lock_acquire+0x2eb/0x1d40
> > [  393.641012]  [<ffffffff810ae94b>] ? __lock_acquire+0x2eb/0x1d40
> > [  393.641012]  [<ffffffff810b0a65>] lock_acquire+0x95/0x210
> > [  393.641012]  [<ffffffff818860f3>] ? rt_mutex_unlock+0x53/0x100
> > [  393.641012]  [<ffffffff81886d26>] _raw_spin_lock+0x36/0x50
> > [  393.641012]  [<ffffffff818860f3>] ? rt_mutex_unlock+0x53/0x100
> > [  393.641012]  [<ffffffff818860f3>] rt_mutex_unlock+0x53/0x100
> > [  393.641012]  [<ffffffff810ee3ca>] rcu_read_unlock_special+0x17a/0x2a0
> > [  393.641012]  [<ffffffff810ee803>] rcu_check_callbacks+0x313/0x950
> > [  393.641012]  [<ffffffff8107a6bd>] ? hrtimer_run_queues+0x1d/0x180
> > [  393.641012]  [<ffffffff810abb9d>] ? trace_hardirqs_off+0xd/0x10
> > [  393.641012]  [<ffffffff8105bae3>] update_process_times+0x43/0x80
> > [  393.641012]  [<ffffffff810a9801>] tick_sched_handle.isra.10+0x31/0x40
> > [  393.641012]  [<ffffffff810a98f7>] tick_sched_timer+0x47/0x70
> > [  393.641012]  [<ffffffff8107941c>] __run_hrtimer+0x7c/0x490
> > [  393.641012]  [<ffffffff810a260d>] ? ktime_get_update_offsets+0x4d/0xe0
> > [  393.641012]  [<ffffffff810a98b0>] ? tick_nohz_handler+0xa0/0xa0
> > [  393.641012]  [<ffffffff8107a017>] hrtimer_interrupt+0x107/0x260

The hrtimer_interrupt is calling a rt_mutex_unlock? How did that happen?
Did it first call a rt_mutex_lock?

If patch two was the culprit, I'm thinking the idea behind patch two is
wrong. The only option is to remove patch number two!

Or perhaps I missed something.

-- Steve


> > [  393.641012]  [<ffffffff81030173>] local_apic_timer_interrupt+0x33/0x60
> > [  393.641012]  [<ffffffff8103059e>] smp_apic_timer_interrupt+0x3e/0x60
> > [  393.641012]  [<ffffffff8189022f>] apic_timer_interrupt+0x6f/0x80
> > [  393.641012]  <EOI>  [<ffffffff810ee250>] ? rcu_scheduler_starting+0x60/0x60
> > [  393.641012]  [<ffffffff81072101>] ? __rcu_read_unlock+0x91/0xa0
> > [  393.641012]  [<ffffffff810e80e3>] rcu_torture_read_unlock+0x33/0x70
> > [  393.641012]  [<ffffffff810e8f54>] rcu_torture_reader+0xe4/0x450
> > [  393.641012]  [<ffffffff810e92c0>] ? rcu_torture_reader+0x450/0x450
> > [  393.641012]  [<ffffffff810e8e70>] ? rcutorture_trace_dump+0x30/0x30
> > [  393.641012]  [<ffffffff810759d6>] kthread+0xd6/0xe0
> > [  393.641012]  [<ffffffff818874bb>] ? _raw_spin_unlock_irq+0x2b/0x60
> > [  393.641012]  [<ffffffff81075900>] ? flush_kthread_worker+0x130/0x130
> > [  393.641012]  [<ffffffff8188f52c>] ret_from_fork+0x7c/0xb0
> > [  393.641012]  [<ffffffff81075900>] ? flush_kthread_worker+0x130/0x130
> > 
> > I don't see this without your patches.
> > 
> > .config attached.  The other configurations completed without errors.
> > Short tests, 30 minutes per configuration.
> > 
> > Thoughts?
> > 
> > 							Thanx, Paul



  reply	other threads:[~2013-08-08  2:12 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-07 10:24 [PATCH 0/8] rcu: Ensure rcu read site is deadlock-immunity Lai Jiangshan
2013-08-07 10:24 ` [PATCH 1/8] rcu: add a warn to rcu_preempt_note_context_switch() Lai Jiangshan
2013-10-30 11:02   ` Paul E. McKenney
2013-08-07 10:24 ` [PATCH 2/8] rcu: remove irq/softirq context check in rcu_read_unlock_special() Lai Jiangshan
2013-10-30 11:18   ` Paul E. McKenney
2013-08-07 10:24 ` [PATCH 3/8] rcu: keep irqs disabled " Lai Jiangshan
2013-08-07 10:25 ` [PATCH 4/8] rcu: delay task rcu state cleanup in exit_rcu() Lai Jiangshan
2013-08-07 10:25 ` [PATCH 5/8] rcu: eliminate deadlock for rcu read site Lai Jiangshan
2013-08-08 20:40   ` Paul E. McKenney
2013-08-09  9:31     ` Lai Jiangshan
2013-08-09 17:58       ` Paul E. McKenney
2013-08-12 13:55       ` Peter Zijlstra
2013-08-12 15:16         ` Paul E. McKenney
2013-08-12 16:21           ` Peter Zijlstra
2013-08-12 16:44             ` Paul E. McKenney
2013-08-10  3:43     ` Lai Jiangshan
2013-08-10 15:07       ` Paul E. McKenney
2013-08-10 15:08         ` Paul E. McKenney
2013-08-21  3:17         ` Paul E. McKenney
2013-08-21  3:25           ` Lai Jiangshan
2013-08-21 13:42             ` Paul E. McKenney
2013-08-12 13:53       ` Peter Zijlstra
2013-08-12 14:10         ` Steven Rostedt
2013-08-12 14:14           ` Peter Zijlstra
     [not found]           ` <CACvQF51-oAGkdxwku+orKSQ=SZd1A4saXzkrgcRGi+KnZUZYxQ@mail.gmail.com>
2013-08-22 14:34             ` Steven Rostedt
2013-08-22 14:41               ` Steven Rostedt
2013-08-23  6:26               ` Lai Jiangshan
     [not found]                 ` <CACvQF51kcLrJsa=zBKhLkJfFBh109TW+Zrepm+tRxmEp0gALbQ@mail.gmail.com>
2013-08-25 17:43                   ` Paul E. McKenney
2013-08-26  2:39                     ` Lai Jiangshan
2013-08-30  2:05                       ` Paul E. McKenney
2013-09-05 15:22                 ` Steven Rostedt
2013-08-07 10:25 ` [PATCH 6/8] rcu: call rcu_read_unlock_special() in rcu_preempt_check_callbacks() Lai Jiangshan
2013-08-07 10:25 ` [PATCH 7/8] rcu: add # of deferred _special() statistics Lai Jiangshan
2013-08-07 16:42   ` Paul E. McKenney
2013-08-07 10:25 ` [PATCH 8/8] rcu: remove irq work for rsp_wakeup() Lai Jiangshan
2013-08-07 12:38 ` [PATCH 0/8] rcu: Ensure rcu read site is deadlock-immunity Paul E. McKenney
2013-08-07 19:29   ` Carsten Emde
2013-08-07 19:52     ` Paul E. McKenney
2013-08-08  1:17     ` Lai Jiangshan
2013-08-08  0:36   ` Paul E. McKenney
2013-08-08  1:47     ` Lai Jiangshan
2013-08-08  2:12       ` Steven Rostedt [this message]
2013-08-08  2:33         ` Lai Jiangshan
2013-08-08  2:33           ` Paul E. McKenney
2013-08-08  3:10             ` Lai Jiangshan
2013-08-08  4:18               ` Paul E. McKenney
2013-08-08  5:27                 ` Lai Jiangshan
2013-08-08  7:05                   ` Paul E. McKenney
2013-08-08  2:45         ` Lai Jiangshan
     [not found] <CE792442-F438-4486-BAB7-12B0E1C57273@gmail.com>
2013-08-12 14:12 ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1375927932.6848.33.camel@gandalf.local.home \
    --to=rostedt@goodmis.org \
    --cc=C.Emde@osadl.org \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.