linux-kernel.vger.kernel.org archive mirror
From: Mike Galbraith <umgwanakikbuti@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	linux-rt-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule
Date: Fri, 25 Mar 2016 10:13:16 +0100
Message-ID: <1458897196.3870.8.camel@gmail.com>
In-Reply-To: <alpine.DEB.2.11.1603250951100.3978@nanos>

On Fri, 2016-03-25 at 09:52 +0100, Thomas Gleixner wrote:
> On Fri, 25 Mar 2016, Mike Galbraith wrote:
> > On Thu, 2016-03-24 at 12:06 +0100, Mike Galbraith wrote:
> > > On Thu, 2016-03-24 at 11:44 +0100, Thomas Gleixner wrote:
> > > >  
> > > > > On the bright side, with the busted migrate enable business reverted,
> > > > > plus one dinky change from me [1], master-rt.today has completed 100
> > > > > iterations of Steven's hotplug stress script along side endless
> > > > > futexstress, and is happily doing another 900 as I write this, so the
> > > > > next -rt should finally be hotplug deadlock free.
> > > > > 
> > > > > Thomas's state machinery seems to work wonders.  'course this being
> > > > > hotplug, the other shoe will likely apply itself to my backside soon.
> > > > 
> > > > That's a given :)
> > > 
> > > blk-mq applied it shortly after I was satisfied enough to poke xmit.
> > 
> > The other shoe is that notifiers can depend upon RCU grace periods, so
> > when pin_current_cpu() snags rcu_sched, the hotplug game is over.
> > 
> > blk_mq_queue_reinit_notify:
> >         /*
> >          * We need to freeze and reinit all existing queues.  Freezing
> >          * involves synchronous wait for an RCU grace period and doing it
> >          * one by one may take a long time.  Start freezing all queues in
> >          * one swoop and then wait for the completions so that freezing can
> >          * take place in parallel.
> >          */
> >         list_for_each_entry(q, &all_q_list, all_q_node)
> >                 blk_mq_freeze_queue_start(q);
> >         list_for_each_entry(q, &all_q_list, all_q_node) {
> >                 blk_mq_freeze_queue_wait(q);
> 
> Yeah, I stumbled over that already when analysing all the hotplug notifier
> sites. That's definitely a horrible one.
>  
> > Hohum (sharpens rock), next.
> 
> /me recommends frozen sharks

With the sharp rock below and the one I'll follow up with, master-rt on
my DL980 just passed 3 hours of endless hotplug stress concurrent with
endless tbench 8, stockfish and futextest.  It has never survived this
long with this load by a long shot.

hotplug/rt: Do not let pin_current_cpu() block RCU grace periods

Notifiers may depend upon grace periods continuing to advance,
as blk_mq_queue_reinit_notify() below does.

crash> bt ffff8803aee76400
PID: 1113   TASK: ffff8803aee76400  CPU: 0   COMMAND: "stress-cpu-hotp"
 #0 [ffff880396fe7ad8] __schedule at ffffffff816b7142
 #1 [ffff880396fe7b28] schedule at ffffffff816b797b
 #2 [ffff880396fe7b48] blk_mq_freeze_queue_wait at ffffffff8135c5ac
 #3 [ffff880396fe7b80] blk_mq_queue_reinit_notify at ffffffff8135f819
 #4 [ffff880396fe7b98] notifier_call_chain at ffffffff8109b8ed
 #5 [ffff880396fe7bd8] __raw_notifier_call_chain at ffffffff8109b91e
 #6 [ffff880396fe7be8] __cpu_notify at ffffffff81072825
 #7 [ffff880396fe7bf8] cpu_notify_nofail at ffffffff81072b15
 #8 [ffff880396fe7c08] notify_dead at ffffffff81072d06
 #9 [ffff880396fe7c38] cpuhp_invoke_callback at ffffffff81073718
#10 [ffff880396fe7c78] cpuhp_down_callbacks at ffffffff81073a70
#11 [ffff880396fe7cb8] _cpu_down at ffffffff816afc71
#12 [ffff880396fe7d38] do_cpu_down at ffffffff8107435c
#13 [ffff880396fe7d60] cpu_down at ffffffff81074390
#14 [ffff880396fe7d70] cpu_subsys_offline at ffffffff814cd854
#15 [ffff880396fe7d80] device_offline at ffffffff814c7cda
#16 [ffff880396fe7da8] online_store at ffffffff814c7dd0
#17 [ffff880396fe7dd0] dev_attr_store at ffffffff814c4fc8
#18 [ffff880396fe7de0] sysfs_kf_write at ffffffff812cfbe4
#19 [ffff880396fe7e08] kernfs_fop_write at ffffffff812cf172
#20 [ffff880396fe7e50] __vfs_write at ffffffff81241428
#21 [ffff880396fe7ed0] vfs_write at ffffffff81242535
#22 [ffff880396fe7f10] sys_write at ffffffff812438f9
#23 [ffff880396fe7f50] entry_SYSCALL_64_fastpath at ffffffff816bb4bc
    RIP: 00007fafd918acd0  RSP: 00007ffd2ca956e8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 000000000226a770  RCX: 00007fafd918acd0
    RDX: 0000000000000002  RSI: 00007fafd9cb9000  RDI: 0000000000000001
    RBP: 00007ffd2ca95700   R8: 000000000000000a   R9: 00007fafd9cb3700
    R10: 00000000ffffffff  R11: 0000000000000246  R12: 0000000000000007
    R13: 0000000000000001  R14: 0000000000000009  R15: 000000000000000a
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b

blk_mq_queue_reinit_notify:
        /*
         * We need to freeze and reinit all existing queues.  Freezing
         * involves synchronous wait for an RCU grace period and doing it
         * one by one may take a long time.  Start freezing all queues in
         * one swoop and then wait for the completions so that freezing can
         * take place in parallel.
         */
        list_for_each_entry(q, &all_q_list, all_q_node)
                blk_mq_freeze_queue_start(q);
        list_for_each_entry(q, &all_q_list, all_q_node) {
                blk_mq_freeze_queue_wait(q);

crash> bt ffff880176cc9900
PID: 17     TASK: ffff880176cc9900  CPU: 0   COMMAND: "rcu_sched"
 #0 [ffff880176cd7ab8] __schedule at ffffffff816b7142
 #1 [ffff880176cd7b08] schedule at ffffffff816b797b
 #2 [ffff880176cd7b28] rt_spin_lock_slowlock at ffffffff816b974d
 #3 [ffff880176cd7bc8] rt_spin_lock_fastlock at ffffffff811b0f3c
 #4 [ffff880176cd7be8] rt_spin_lock__no_mg at ffffffff816bac1b
 #5 [ffff880176cd7c08] pin_current_cpu at ffffffff8107406a
 #6 [ffff880176cd7c50] migrate_disable at ffffffff810a0e9e
 #7 [ffff880176cd7c70] rt_spin_lock at ffffffff816bad69
 #8 [ffff880176cd7c90] lock_timer_base at ffffffff810fc5e8
 #9 [ffff880176cd7cc8] try_to_del_timer_sync at ffffffff810fe290
#10 [ffff880176cd7cf0] del_timer_sync at ffffffff810fe381
#11 [ffff880176cd7d58] schedule_timeout at ffffffff816b9e4b
#12 [ffff880176cd7df0] rcu_gp_kthread at ffffffff810f52b4
#13 [ffff880176cd7e70] kthread at ffffffff8109a02f
#14 [ffff880176cd7f50] ret_from_fork at ffffffff816bb6f2

Game Over: the hotplug thread waits for a grace period, while
rcu_sched sits blocked in pin_current_cpu().

Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
---
 include/linux/sched.h |    1 +
 kernel/cpu.c          |    2 +-
 kernel/rcu/tree.c     |    3 +++
 3 files changed, 5 insertions(+), 1 deletion(-)

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1492,6 +1492,7 @@ struct task_struct {
 #ifdef CONFIG_COMPAT_BRK
 	unsigned brk_randomized:1;
 #endif
+	unsigned sched_is_rcu:1; /* RT: is a critical RCU thread */
 
 	unsigned long atomic_flags; /* Flags needing atomic access. */
 
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -156,7 +156,7 @@ void pin_current_cpu(void)
 	hp = this_cpu_ptr(&hotplug_pcp);
 
 	if (!hp->unplug || hp->refcount || force || preempt_count() > 1 ||
-	    hp->unplug == current) {
+	    hp->unplug == current || current->sched_is_rcu) {
 		hp->refcount++;
 		return;
 	}
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2100,6 +2100,9 @@ static int __noreturn rcu_gp_kthread(voi
 	struct rcu_state *rsp = arg;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
+	/* RT: pin_current_cpu() MUST NOT block RCU grace periods. */
+	current->sched_is_rcu = 1;
+
 	rcu_bind_gp_kthread();
 	for (;;) {
 
