rcu.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Uladzislau Rezki <urezki@gmail.com>
To: Michael Ellerman <mpe@ellerman.id.au>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Daniel Axtens <dja@axtens.net>,
	rcu@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: powerpc 5.10-rcN boot failures with RCU_SCALE_TEST=m
Date: Thu, 3 Dec 2020 15:34:45 +0100	[thread overview]
Message-ID: <20201203143445.GA22204@pc636> (raw)
In-Reply-To: <87sg8nv277.fsf@mpe.ellerman.id.au>

On Thu, Dec 03, 2020 at 05:22:20PM +1100, Michael Ellerman wrote:
> Uladzislau Rezki <urezki@gmail.com> writes:
> > On Thu, Dec 03, 2020 at 01:03:32AM +1100, Michael Ellerman wrote:
> ...
> >> 
> >> The SMP bringup stalls because _cpu_up() is blocked trying to take
> >> cpu_hotplug_lock for writing:
> >> 
> >> [  401.403132][    T0] task:swapper/0       state:D stack:12512 pid:    1 ppid:     0 flags:0x00000800
> >> [  401.403502][    T0] Call Trace:
> >> [  401.403907][    T0] [c0000000062c37d0] [c0000000062c3830] 0xc0000000062c3830 (unreliable)
> >> [  401.404068][    T0] [c0000000062c39b0] [c000000000019d70] __switch_to+0x2e0/0x4a0
> >> [  401.404189][    T0] [c0000000062c3a10] [c000000000b87228] __schedule+0x288/0x9b0
> >> [  401.404257][    T0] [c0000000062c3ad0] [c000000000b879b8] schedule+0x68/0x120
> >> [  401.404324][    T0] [c0000000062c3b00] [c000000000184ad4] percpu_down_write+0x164/0x170
> >> [  401.404390][    T0] [c0000000062c3b50] [c000000000116b68] _cpu_up+0x68/0x280
> >> [  401.404475][    T0] [c0000000062c3bb0] [c000000000116e70] cpu_up+0xf0/0x140
> >> [  401.404546][    T0] [c0000000062c3c30] [c00000000011776c] bringup_nonboot_cpus+0xac/0xf0
> >> [  401.404643][    T0] [c0000000062c3c80] [c000000000eea1b8] smp_init+0x40/0xcc
> >> [  401.404727][    T0] [c0000000062c3ce0] [c000000000ec43dc] kernel_init_freeable+0x1e0/0x3a0
> >> [  401.404799][    T0] [c0000000062c3db0] [c000000000011ec4] kernel_init+0x24/0x150
> >> [  401.404958][    T0] [c0000000062c3e20] [c00000000000daf0] ret_from_kernel_thread+0x5c/0x6c
> >> 
> >> It can't get it because kprobe_optimizer() has taken it for read and is now
> >> blocked waiting for synchronize_rcu_tasks():
> >> 
> >> [  401.418808][    T0] task:kworker/0:1     state:D stack:13392 pid:   12 ppid:     2 flags:0x00000800
> >> [  401.418951][    T0] Workqueue: events kprobe_optimizer
> >> [  401.419078][    T0] Call Trace:
> >> [  401.419121][    T0] [c0000000062ef650] [c0000000062ef710] 0xc0000000062ef710 (unreliable)
> >> [  401.419213][    T0] [c0000000062ef830] [c000000000019d70] __switch_to+0x2e0/0x4a0
> >> [  401.419281][    T0] [c0000000062ef890] [c000000000b87228] __schedule+0x288/0x9b0
> >> [  401.419347][    T0] [c0000000062ef950] [c000000000b879b8] schedule+0x68/0x120
> >> [  401.419415][    T0] [c0000000062ef980] [c000000000b8e664] schedule_timeout+0x2a4/0x340
> >> [  401.419484][    T0] [c0000000062efa80] [c000000000b894ec] wait_for_completion+0x9c/0x170
> >> [  401.419552][    T0] [c0000000062efae0] [c0000000001ac85c] __wait_rcu_gp+0x19c/0x210
> >> [  401.419619][    T0] [c0000000062efb40] [c0000000001ac90c] synchronize_rcu_tasks_generic+0x3c/0x70
> >> [  401.419690][    T0] [c0000000062efbe0] [c00000000022a3dc] kprobe_optimizer+0x1dc/0x470
> >> [  401.419757][    T0] [c0000000062efc60] [c000000000136684] process_one_work+0x2f4/0x530
> >> [  401.419823][    T0] [c0000000062efd20] [c000000000138d28] worker_thread+0x78/0x570
> >> [  401.419891][    T0] [c0000000062efdb0] [c000000000142424] kthread+0x194/0x1a0
> >> [  401.419976][    T0] [c0000000062efe20] [c00000000000daf0] ret_from_kernel_thread+0x5c/0x6c
> >> 
> >> But why is the synchronize_rcu_tasks() not completing?
> >> 
> > I think that it is because RCU is not fully initialized by that time.
> 
> Yeah that would explain it :)
> 
> > The 36dadef23fcc ("kprobes: Init kprobes in early_initcall") patch
> > switches to early_initcall() that has a higher priority sequence than
> > core_initcall() that is used to complete an RCU setup in the rcu_set_runtime_mode().
> 
> I was looking at debug_lockdep_rcu_enabled(), which is:
> 
> noinstr int notrace debug_lockdep_rcu_enabled(void)
> {
> 	return rcu_scheduler_active != RCU_SCHEDULER_INACTIVE && debug_locks &&
> 	       current->lockdep_recursion == 0;
> }
> 
> That is not firing any warnings for me because rcu_scheduler_active is:
> 
> (gdb) p/x rcu_scheduler_active
> $1 = 0x1
> 
> Which is:
> 
> #define RCU_SCHEDULER_INIT	1
> 

Agree with that.

> But that's different to RCU_SCHEDULER_RUNNING, which is set in
> rcu_set_runtime_mode() as you mentioned:
> 
> static int __init rcu_set_runtime_mode(void)
> {
> 	rcu_test_sync_prims();
> 	rcu_scheduler_active = RCU_SCHEDULER_RUNNING;
> 	kfree_rcu_scheduler_running();
> 	rcu_test_sync_prims();
> 	return 0;
> }
> 
BTW, since you can reproduce it and have a test setup, could you please
check that:

<snip>
-core_initcall(rcu_set_runtime_mode);
+early_initcall(rcu_set_runtime_mode);
<snip>

Just in case. The the synchronize_rcu_tasks() gets blocked:

<snip>
void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
    if (crcu_array[j] == crcu_array[i])
                                break;
        if (j == i) {
            wait_for_completion(&rs_array[i].completion); <--- here
            destroy_rcu_head_on_stack(&rs_array[i].head);
        }
...
<snip>

but that is obvious when looking at your full traces.

> The comment on rcu_scheduler_active implies that once we're at
> RCU_SCHEDULER_INIT things should work:
> 
> /*
>  * The rcu_scheduler_active variable is initialized to the value
>  * RCU_SCHEDULER_INACTIVE and transitions RCU_SCHEDULER_INIT just before the
>  * first task is spawned.  So when this variable is RCU_SCHEDULER_INACTIVE,
>  * RCU can assume that there is but one task, allowing RCU to (for example)
>  * optimize synchronize_rcu() to a simple barrier().  When this variable
>  * is RCU_SCHEDULER_INIT, RCU must actually do all the hard work required
>  * to detect real grace periods.  This variable is also used to suppress
>  * boot-time false positives from lockdep-RCU error checking.  Finally, it
>  * transitions from RCU_SCHEDULER_INIT to RCU_SCHEDULER_RUNNING after RCU
>  * is fully initialized, including all of its kthreads having been spawned.
>  */
> 
> 
> So I'm not sure, the comments and the debug checks imply that it is OK
> for kprobes to be using RCU this early.
>
Sounds like it should be possible.

> 
> I guess I'll keep digging.
> 
Thank you! I also will dig further with that even though i do not have a setup
for reproducing it.

--
Vlad Rezki

  reply	other threads:[~2020-12-03 14:51 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-27  2:02 powerpc 5.10-rcN boot failures with RCU_SCALE_TEST=m Daniel Axtens
2020-11-27 15:50 ` Paul E. McKenney
2020-11-27 16:41 ` Uladzislau Rezki
2020-12-02 14:03 ` Michael Ellerman
2020-12-02 14:39   ` Uladzislau Rezki
2020-12-03  6:22     ` Michael Ellerman
2020-12-03 14:34       ` Uladzislau Rezki [this message]
2020-12-04 20:23         ` Uladzislau Rezki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201203143445.GA22204@pc636 \
    --to=urezki@gmail.com \
    --cc=dja@axtens.net \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mpe@ellerman.id.au \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).