All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mark Rutland <mark.rutland@arm.com>
To: Sven Schnelle <svens@linux.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Yinan Liu <yinan@linux.alibaba.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Sachin Sant <sachinp@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org,
	Russell King <linux@armlinux.org.uk>,
	linux-arm-kernel@lists.infradead.org, hca@linux.ibm.com,
	linux-s390@vger.kernel.org,
	"Paul E. McKenney" <paulmck@kernel.org>
Subject: Re: ftrace hangs waiting for rcu
Date: Fri, 28 Jan 2022 16:11:57 +0000	[thread overview]
Message-ID: <YfQVzba5thVs+qap@FVFF77S0Q05N> (raw)
In-Reply-To: <yt9dee4rn8q7.fsf@linux.ibm.com>

On Fri, Jan 28, 2022 at 05:08:48PM +0100, Sven Schnelle wrote:
> Hi Mark,
> 
> Mark Rutland <mark.rutland@arm.com> writes:
> 
> > On arm64 I bisected this down to:
> >
> >   7a30871b6a27de1a ("rcu-tasks: Introduce ->percpu_enqueue_shift for dynamic queue selection")
> >
> > Which was going wrong because ilog2() rounds down, and so the shift was wrong
> > for any nr_cpus that was not a power-of-two. Paul had already fixed that in
> > rcu-next, and just sent a pull request to Linus:
> >
> >   https://lore.kernel.org/lkml/20220128143251.GA2398275@paulmck-ThinkPad-P17-Gen-1/
> >
> > With that applied, I no longer see these hangs.
> >
> > Does your s390 test machine have a non-power-of-two nr_cpus, and does that fix
> > the issue for you?
> 
> We noticed the PR from Paul and are currently testing the fix. So far
> it's looking good. The configuration where we have seen the hang is a
> bit unusual:
> 
> - 16 physical CPUs on the kvm host
> - 248 logical CPUs inside kvm

Aha! 248 is notably *NOT* a power of two, and in this case the shift would be
wrong (ilog2() would give 7, when we need a shift of 8).

So I suspect you're hitting the same issue as I was.

Thanks,
Mark.

> - debug kernel both on the host and kvm guest
> 
> So things are likely a bit slow in the kvm guest. Interesting is that
> the number of CPUs is even. But maybe RCU sees an odd number of CPUs
> and gets confused before all cpus are brought up. Have to read code/test
> to see whether that could be possible.
> 
> Thanks for investigating!
> Sven

WARNING: multiple messages have this Message-ID (diff)
From: Mark Rutland <mark.rutland@arm.com>
To: Sven Schnelle <svens@linux.ibm.com>
Cc: linux-s390@vger.kernel.org, Kees Cook <keescook@chromium.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	hca@linux.ibm.com, LKML <linux-kernel@vger.kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@kernel.org>,
	Sachin Sant <sachinp@linux.ibm.com>,
	Russell King <linux@armlinux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Yinan Liu <yinan@linux.alibaba.com>,
	linuxppc-dev@lists.ozlabs.org, Ard Biesheuvel <ardb@kernel.org>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: ftrace hangs waiting for rcu
Date: Fri, 28 Jan 2022 16:11:57 +0000	[thread overview]
Message-ID: <YfQVzba5thVs+qap@FVFF77S0Q05N> (raw)
In-Reply-To: <yt9dee4rn8q7.fsf@linux.ibm.com>

On Fri, Jan 28, 2022 at 05:08:48PM +0100, Sven Schnelle wrote:
> Hi Mark,
> 
> Mark Rutland <mark.rutland@arm.com> writes:
> 
> > On arm64 I bisected this down to:
> >
> >   7a30871b6a27de1a ("rcu-tasks: Introduce ->percpu_enqueue_shift for dynamic queue selection")
> >
> > Which was going wrong because ilog2() rounds down, and so the shift was wrong
> > for any nr_cpus that was not a power-of-two. Paul had already fixed that in
> > rcu-next, and just sent a pull request to Linus:
> >
> >   https://lore.kernel.org/lkml/20220128143251.GA2398275@paulmck-ThinkPad-P17-Gen-1/
> >
> > With that applied, I no longer see these hangs.
> >
> > Does your s390 test machine have a non-power-of-two nr_cpus, and does that fix
> > the issue for you?
> 
> We noticed the PR from Paul and are currently testing the fix. So far
> it's looking good. The configuration where we have seen the hang is a
> bit unusual:
> 
> - 16 physical CPUs on the kvm host
> - 248 logical CPUs inside kvm

Aha! 248 is notably *NOT* a power of two, and in this case the shift would be
wrong (ilog2() would give 7, when we need a shift of 8).

So I suspect you're hitting the same issue as I was.

Thanks,
Mark.

> - debug kernel both on the host and kvm guest
> 
> So things are likely a bit slow in the kvm guest. Interesting is that
> the number of CPUs is even. But maybe RCU sees an odd number of CPUs
> and gets confused before all cpus are brought up. Have to read code/test
> to see whether that could be possible.
> 
> Thanks for investigating!
> Sven

WARNING: multiple messages have this Message-ID (diff)
From: Mark Rutland <mark.rutland@arm.com>
To: Sven Schnelle <svens@linux.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Yinan Liu <yinan@linux.alibaba.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Sachin Sant <sachinp@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org,
	Russell King <linux@armlinux.org.uk>,
	linux-arm-kernel@lists.infradead.org, hca@linux.ibm.com,
	linux-s390@vger.kernel.org,
	"Paul E. McKenney" <paulmck@kernel.org>
Subject: Re: ftrace hangs waiting for rcu
Date: Fri, 28 Jan 2022 16:11:57 +0000	[thread overview]
Message-ID: <YfQVzba5thVs+qap@FVFF77S0Q05N> (raw)
In-Reply-To: <yt9dee4rn8q7.fsf@linux.ibm.com>

On Fri, Jan 28, 2022 at 05:08:48PM +0100, Sven Schnelle wrote:
> Hi Mark,
> 
> Mark Rutland <mark.rutland@arm.com> writes:
> 
> > On arm64 I bisected this down to:
> >
> >   7a30871b6a27de1a ("rcu-tasks: Introduce ->percpu_enqueue_shift for dynamic queue selection")
> >
> > Which was going wrong because ilog2() rounds down, and so the shift was wrong
> > for any nr_cpus that was not a power-of-two. Paul had already fixed that in
> > rcu-next, and just sent a pull request to Linus:
> >
> >   https://lore.kernel.org/lkml/20220128143251.GA2398275@paulmck-ThinkPad-P17-Gen-1/
> >
> > With that applied, I no longer see these hangs.
> >
> > Does your s390 test machine have a non-power-of-two nr_cpus, and does that fix
> > the issue for you?
> 
> We noticed the PR from Paul and are currently testing the fix. So far
> it's looking good. The configuration where we have seen the hang is a
> bit unusual:
> 
> - 16 physical CPUs on the kvm host
> - 248 logical CPUs inside kvm

Aha! 248 is notably *NOT* a power of two, and in this case the shift would be
wrong (ilog2() would give 7, when we need a shift of 8).

So I suspect you're hitting the same issue as I was.

Thanks,
Mark.

> - debug kernel both on the host and kvm guest
> 
> So things are likely a bit slow in the kvm guest. Interesting is that
> the number of CPUs is even. But maybe RCU sees an odd number of CPUs
> and gets confused before all cpus are brought up. Have to read code/test
> to see whether that could be possible.
> 
> Thanks for investigating!
> Sven

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-01-28 16:12 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-27 16:42 [PATCH] ftrace: Have architectures opt-in for mcount build time sorting Steven Rostedt
2022-01-27 16:42 ` Steven Rostedt
2022-01-27 16:42 ` Steven Rostedt
2022-01-27 18:23 ` Mark Rutland
2022-01-27 18:23   ` Mark Rutland
2022-01-27 18:23   ` Mark Rutland
2022-01-27 18:42   ` ftrace hangs waiting for rcu (was: Re: [PATCH] ftrace: Have architectures opt-in for mcount build time sorting) Sven Schnelle
2022-01-27 18:42     ` Sven Schnelle
2022-01-27 18:42     ` Sven Schnelle
2022-01-28 15:42     ` Mark Rutland
2022-01-28 15:42       ` Mark Rutland
2022-01-28 15:42       ` Mark Rutland
2022-01-28 16:08       ` ftrace hangs waiting for rcu Sven Schnelle
2022-01-28 16:08         ` Sven Schnelle
2022-01-28 16:08         ` Sven Schnelle
2022-01-28 16:11         ` Mark Rutland [this message]
2022-01-28 16:11           ` Mark Rutland
2022-01-28 16:11           ` Mark Rutland
2022-01-28 16:15           ` Paul E. McKenney
2022-01-28 16:15             ` Paul E. McKenney
2022-01-28 16:15             ` Paul E. McKenney
2022-01-28 17:47             ` Paul E. McKenney
2022-01-28 17:47               ` Paul E. McKenney
2022-01-28 17:47               ` Paul E. McKenney
2022-01-28 16:17           ` Sven Schnelle
2022-01-28 16:17             ` Sven Schnelle
2022-01-28 16:17             ` Sven Schnelle
2022-01-28 21:11 ` [PATCH] ftrace: Have architectures opt-in for mcount build time sorting Joe Lawrence
2022-01-28 21:11   ` Joe Lawrence
2022-01-28 21:11   ` Joe Lawrence
2022-01-28 21:39   ` Steven Rostedt
2022-01-28 21:39     ` Steven Rostedt
2022-01-28 21:39     ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YfQVzba5thVs+qap@FVFF77S0Q05N \
    --to=mark.rutland@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=ardb@kernel.org \
    --cc=hca@linux.ibm.com \
    --cc=keescook@chromium.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mingo@kernel.org \
    --cc=paulmck@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=sachinp@linux.ibm.com \
    --cc=svens@linux.ibm.com \
    --cc=yinan@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.