rcu.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Allow multiple GP misses before Panic
@ 2020-08-13 17:22 Chao Zhou
  2020-08-13 18:19 ` Paul E. McKenney
  0 siblings, 1 reply; 8+ messages in thread
From: Chao Zhou @ 2020-08-13 17:22 UTC (permalink / raw)
  To: rcu

Hi,

Some RCU stalls are transient and a system is fully capable to recover
after that, but we do want Panic after certain amount of GP misses.

Current module parameter rcu_cpu_stall_panic only turn on/off Panic,
and 1 GP miss will trigger Panic when it is enabled.

Plan to add a module parameter for users to fine-tune how many GP
misses are allowed before Panic.

To save our precious time, a diff has been tested on our systems and
it works and solves our problem in transient RCU stall events.

Your insights and guidance is highly appreciated.

Thanks!

Chao

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Allow multiple GP misses before Panic
  2020-08-13 17:22 Allow multiple GP misses before Panic Chao Zhou
@ 2020-08-13 18:19 ` Paul E. McKenney
  2020-08-13 18:50   ` Chao Zhou
  0 siblings, 1 reply; 8+ messages in thread
From: Paul E. McKenney @ 2020-08-13 18:19 UTC (permalink / raw)
  To: Chao Zhou; +Cc: rcu

On Thu, Aug 13, 2020 at 10:22:09AM -0700, Chao Zhou wrote:
> Hi,
> 
> Some RCU stalls are transient and a system is fully capable to recover
> after that, but we do want Panic after certain amount of GP misses.
> 
> Current module parameter rcu_cpu_stall_panic only turn on/off Panic,
> and 1 GP miss will trigger Panic when it is enabled.
> 
> Plan to add a module parameter for users to fine-tune how many GP
> misses are allowed before Panic.
> 
> To save our precious time, a diff has been tested on our systems and
> it works and solves our problem in transient RCU stall events.
> 
> Your insights and guidance is highly appreciated.

Please feel free to post a patch.  I could imagine a number of things
you might be doing from your description above:

1.	Having a different time for panic, so that (for example) an
	RCU CPU stall warning appears at 21 seconds (in mainline), and
	if the grace period still has not ended at some time specified
	by some kernel parameter.  For example, one approach would be
	to make the existing panic_on_rcu_stall sysctl take an integer
	instead of a boolean, and to make that integer specify how old
	the stall-warned grace period must be before panic() is invoked.

2.	Instead use the number of RCU CPU stall warning messages to
	trigger the panic, so that (for example), the panic would happen
	on the tenth message.  Again, the panic_on_rcu_stall sysctl
	might be used for this.

3.	Like #2, but reset the count every time a new grace period
	starts.  So if the panic_on_rcu_stall sysctl was set to
	ten, there would need to be ten RCU CPU stall warnings for
	the same grace period before panic() was invoked.

Of the above three, #1 and #3 seem the most attractive, with a slight
preference for #1.

Or did you have something else in mind?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Allow multiple GP misses before Panic
  2020-08-13 18:19 ` Paul E. McKenney
@ 2020-08-13 18:50   ` Chao Zhou
  2020-08-13 19:00     ` Chao Zhou
  2020-08-13 21:21     ` Paul E. McKenney
  0 siblings, 2 replies; 8+ messages in thread
From: Chao Zhou @ 2020-08-13 18:50 UTC (permalink / raw)
  To: paulmck; +Cc: rcu

Thanks Paul for the insights!

I studied the 3 options and think that #1+#3 offers both flexibility
to users and coverage of boundary user cases.

For example, as an user of RCU, we want the warnings to be spilled at
the default 21 seconds so that we know such events are happening. At
the same time, we want Panic to happen if the stall is long enough to
significantly affect available system memory on our system.

Here is the plan based on our discussion, please advise if not inline
with the idea:
1. modify panic_on_rcu_stall to be the maximum number of consecutive
warnings to trigger Panic.
    1) change its name to max_rcu_stall_to_panic,
    2) default value to 1, which is the same behavior as today's.
2. use ((struct rcu_state *)->gpnum - (struct rcu_data *)->gpnum) >=
max_rcu_stall_to_panic as condition to trigger Panic;
3. reset (struct rcu_data *)->gpnum to (struct rcu_state *)->gpnum
every time a new grace period starts;
4. add a new member (struct rcu_data *)->gpmiss that is incremented at
each miss to track how many misses so far for statistics/debug
purpose.

Your insights and advice are highly appreciated.

Thanks!

Chao

On Thu, Aug 13, 2020 at 11:19 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Aug 13, 2020 at 10:22:09AM -0700, Chao Zhou wrote:
> > Hi,
> >
> > Some RCU stalls are transient and a system is fully capable to recover
> > after that, but we do want Panic after certain amount of GP misses.
> >
> > Current module parameter rcu_cpu_stall_panic only turn on/off Panic,
> > and 1 GP miss will trigger Panic when it is enabled.
> >
> > Plan to add a module parameter for users to fine-tune how many GP
> > misses are allowed before Panic.
> >
> > To save our precious time, a diff has been tested on our systems and
> > it works and solves our problem in transient RCU stall events.
> >
> > Your insights and guidance is highly appreciated.
>
> Please feel free to post a patch.  I could imagine a number of things
> you might be doing from your description above:
>
> 1.      Having a different time for panic, so that (for example) an
>         RCU CPU stall warning appears at 21 seconds (in mainline), and
>         if the grace period still has not ended at some time specified
>         by some kernel parameter.  For example, one approach would be
>         to make the existing panic_on_rcu_stall sysctl take an integer
>         instead of a boolean, and to make that integer specify how old
>         the stall-warned grace period must be before panic() is invoked.
>
> 2.      Instead use the number of RCU CPU stall warning messages to
>         trigger the panic, so that (for example), the panic would happen
>         on the tenth message.  Again, the panic_on_rcu_stall sysctl
>         might be used for this.
>
> 3.      Like #2, but reset the count every time a new grace period
>         starts.  So if the panic_on_rcu_stall sysctl was set to
>         ten, there would need to be ten RCU CPU stall warnings for
>         the same grace period before panic() was invoked.
>
> Of the above three, #1 and #3 seem the most attractive, with a slight
> preference for #1.
>
> Or did you have something else in mind?
>
>                                                         Thanx, Paul

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Allow multiple GP misses before Panic
  2020-08-13 18:50   ` Chao Zhou
@ 2020-08-13 19:00     ` Chao Zhou
  2020-08-16 18:01       ` Paul E. McKenney
  2020-08-13 21:21     ` Paul E. McKenney
  1 sibling, 1 reply; 8+ messages in thread
From: Chao Zhou @ 2020-08-13 19:00 UTC (permalink / raw)
  To: paulmck; +Cc: rcu

Hi Paul,

Because sysctl panic_on_rcu_stall is public interface, it might have
already been used by adopters, will the change break them? Will a new
sysctl max_rcu_stall_to_panic be more un-interruptive? Appreciate your
insights about this .

On Thu, Aug 13, 2020 at 11:50 AM Chao Zhou <chaozhou1018@gmail.com> wrote:
>
> Thanks Paul for the insights!
>
> I studied the 3 options and think that #1+#3 offers both flexibility
> to users and coverage of boundary user cases.
>
> For example, as an user of RCU, we want the warnings to be spilled at
> the default 21 seconds so that we know such events are happening. At
> the same time, we want Panic to happen if the stall is long enough to
> significantly affect available system memory on our system.
>
> Here is the plan based on our discussion, please advise if not inline
> with the idea:
> 1. modify panic_on_rcu_stall to be the maximum number of consecutive
> warnings to trigger Panic.
>     1) change its name to max_rcu_stall_to_panic,
>     2) default value to 1, which is the same behavior as today's.
> 2. use ((struct rcu_state *)->gpnum - (struct rcu_data *)->gpnum) >=
> max_rcu_stall_to_panic as condition to trigger Panic;
> 3. reset (struct rcu_data *)->gpnum to (struct rcu_state *)->gpnum
> every time a new grace period starts;
> 4. add a new member (struct rcu_data *)->gpmiss that is incremented at
> each miss to track how many misses so far for statistics/debug
> purpose.
>
> Your insights and advice are highly appreciated.
>
> Thanks!
>
> Chao
>
> On Thu, Aug 13, 2020 at 11:19 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Aug 13, 2020 at 10:22:09AM -0700, Chao Zhou wrote:
> > > Hi,
> > >
> > > Some RCU stalls are transient and a system is fully capable to recover
> > > after that, but we do want Panic after certain amount of GP misses.
> > >
> > > Current module parameter rcu_cpu_stall_panic only turn on/off Panic,
> > > and 1 GP miss will trigger Panic when it is enabled.
> > >
> > > Plan to add a module parameter for users to fine-tune how many GP
> > > misses are allowed before Panic.
> > >
> > > To save our precious time, a diff has been tested on our systems and
> > > it works and solves our problem in transient RCU stall events.
> > >
> > > Your insights and guidance is highly appreciated.
> >
> > Please feel free to post a patch.  I could imagine a number of things
> > you might be doing from your description above:
> >
> > 1.      Having a different time for panic, so that (for example) an
> >         RCU CPU stall warning appears at 21 seconds (in mainline), and
> >         if the grace period still has not ended at some time specified
> >         by some kernel parameter.  For example, one approach would be
> >         to make the existing panic_on_rcu_stall sysctl take an integer
> >         instead of a boolean, and to make that integer specify how old
> >         the stall-warned grace period must be before panic() is invoked.
> >
> > 2.      Instead use the number of RCU CPU stall warning messages to
> >         trigger the panic, so that (for example), the panic would happen
> >         on the tenth message.  Again, the panic_on_rcu_stall sysctl
> >         might be used for this.
> >
> > 3.      Like #2, but reset the count every time a new grace period
> >         starts.  So if the panic_on_rcu_stall sysctl was set to
> >         ten, there would need to be ten RCU CPU stall warnings for
> >         the same grace period before panic() was invoked.
> >
> > Of the above three, #1 and #3 seem the most attractive, with a slight
> > preference for #1.
> >
> > Or did you have something else in mind?
> >
> >                                                         Thanx, Paul

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Allow multiple GP misses before Panic
  2020-08-13 18:50   ` Chao Zhou
  2020-08-13 19:00     ` Chao Zhou
@ 2020-08-13 21:21     ` Paul E. McKenney
  2020-08-13 21:57       ` Chao Zhou
  1 sibling, 1 reply; 8+ messages in thread
From: Paul E. McKenney @ 2020-08-13 21:21 UTC (permalink / raw)
  To: Chao Zhou; +Cc: rcu

On Thu, Aug 13, 2020 at 11:50:47AM -0700, Chao Zhou wrote:
> Thanks Paul for the insights!
> 
> I studied the 3 options and think that #1+#3 offers both flexibility
> to users and coverage of boundary user cases.
> 
> For example, as an user of RCU, we want the warnings to be spilled at
> the default 21 seconds so that we know such events are happening. At
> the same time, we want Panic to happen if the stall is long enough to
> significantly affect available system memory on our system.

OK, good, you want to panic only on long stalls, not on multiple short
stalls.  Makes sense!

By the way, the short stalls sometimes indicate bugs that should
be fixed.  For example, there might be a long loop in the kernel
that needs an added cond_resched().  Fixing those would make your
systems less noisy.

I am not asking you to do anything I don't do myself:

0a3b3c253a1e ("mm/mmap.c: Add cond_resched() for exit_mmap() CPU stalls")
9f47eb5461aa ("fs/btrfs: Add cond_resched() for try_release_extent_mapping() stalls")

> Here is the plan based on our discussion, please advise if not inline
> with the idea:
> 1. modify panic_on_rcu_stall to be the maximum number of consecutive
> warnings to trigger Panic.
>     1) change its name to max_rcu_stall_to_panic,

This change would make sense were it not for the possibility that there
are lots of existing "panic_on_rcu_stall" instances in scripts, boot
parameters, and so on.  We need the old name for compatibility, and with
"0" and "1" having the same meanings as before.  We -can- use other
numbers because those other numbers were not legal before.

So we don't get to change the name.

>     2) default value to 1, which is the same behavior as today's.

The default value must be zero, which needs to mean "don't ever panic".
Setting the value to "1" must mean "panic after the very first RCU CPU
stall warning.  Other values can have other meanings.

But you could use the existing panic_on_rcu_stall to be (say) the number
of stall-warning messages that will be tolerated in a given grace period
before panicking.

Then you could create a new panic_on_rcu_stall_age or similar that
does the time-based panicking.

Either way, people currently using panic_on_rcu_stall need to be able
to continue using panic_on_rcu_stall exactly as they are without seeing
any change in behavior.  We don't break users if we have a choice,
and here we really do have a choice.  So we don't break users.

> 2. use ((struct rcu_state *)->gpnum - (struct rcu_data *)->gpnum) >=
> max_rcu_stall_to_panic as condition to trigger Panic;

This probably won't do what you want.  This would trigger only if
the rcu_data structure's ->gpnum were to lag behind the rcu_state
structure's ->gpnum, which happens now whenever a CPU is idle for
an extended time period.

For the time-based approach (#1 below), the idea would be to compare
against the rcu_state structure's ->gp_start field, and even then only
when a grace period is actually in progress.  One way to do this is
to check against the "gps" local variable in check_cpu_stall() in
kernel/rcu/tree_stall.h.

> 3. reset (struct rcu_data *)->gpnum to (struct rcu_state *)->gpnum
> every time a new grace period starts;

Touching every rcu_data structure at the beginning of each grace
period will slow things down.  This slowdown is avoided in the
current code by having each CPU update its rcu_data structure's
->gpnum when it notices a new grace period.

And ->gpnum and ->completed have been combined into a single ->gp_seq
in recent kernels.

So it is better to compare against ->gp_start (AKA "gps") as discussed
above.

> 4. add a new member (struct rcu_data *)->gpmiss that is incremented at
> each miss to track how many misses so far for statistics/debug
> purpose.

This is for #3 below, correct?

If so, please instead create an ->nstalls_cur_gp or similar in the
rcu_state structure.  Zero this at the beginning of every grace
period (in record_gp_stall_check_time() in kernel/rcu/tree_stall.h)
and increment and check it in both legs of the "if" statement at the
end of check_cpu_stall() in this same file.

> Your insights and advice are highly appreciated.

Please let me know how it goes!

							Thanx, Paul

> Thanks!
> 
> Chao
> 
> On Thu, Aug 13, 2020 at 11:19 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Aug 13, 2020 at 10:22:09AM -0700, Chao Zhou wrote:
> > > Hi,
> > >
> > > Some RCU stalls are transient and a system is fully capable to recover
> > > after that, but we do want Panic after certain amount of GP misses.
> > >
> > > Current module parameter rcu_cpu_stall_panic only turn on/off Panic,
> > > and 1 GP miss will trigger Panic when it is enabled.
> > >
> > > Plan to add a module parameter for users to fine-tune how many GP
> > > misses are allowed before Panic.
> > >
> > > To save our precious time, a diff has been tested on our systems and
> > > it works and solves our problem in transient RCU stall events.
> > >
> > > Your insights and guidance is highly appreciated.
> >
> > Please feel free to post a patch.  I could imagine a number of things
> > you might be doing from your description above:
> >
> > 1.      Having a different time for panic, so that (for example) an
> >         RCU CPU stall warning appears at 21 seconds (in mainline), and
> >         if the grace period still has not ended at some time specified
> >         by some kernel parameter.  For example, one approach would be
> >         to make the existing panic_on_rcu_stall sysctl take an integer
> >         instead of a boolean, and to make that integer specify how old
> >         the stall-warned grace period must be before panic() is invoked.
> >
> > 2.      Instead use the number of RCU CPU stall warning messages to
> >         trigger the panic, so that (for example), the panic would happen
> >         on the tenth message.  Again, the panic_on_rcu_stall sysctl
> >         might be used for this.
> >
> > 3.      Like #2, but reset the count every time a new grace period
> >         starts.  So if the panic_on_rcu_stall sysctl was set to
> >         ten, there would need to be ten RCU CPU stall warnings for
> >         the same grace period before panic() was invoked.
> >
> > Of the above three, #1 and #3 seem the most attractive, with a slight
> > preference for #1.
> >
> > Or did you have something else in mind?
> >
> >                                                         Thanx, Paul

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Allow multiple GP misses before Panic
  2020-08-13 21:21     ` Paul E. McKenney
@ 2020-08-13 21:57       ` Chao Zhou
  0 siblings, 0 replies; 8+ messages in thread
From: Chao Zhou @ 2020-08-13 21:57 UTC (permalink / raw)
  To: paulmck; +Cc: rcu

Thanks Paul, please see inline @chao.

On Thu, Aug 13, 2020 at 2:21 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Aug 13, 2020 at 11:50:47AM -0700, Chao Zhou wrote:
> > Thanks Paul for the insights!
> >
> > I studied the 3 options and think that #1+#3 offers both flexibility
> > to users and coverage of boundary user cases.
> >
> > For example, as an user of RCU, we want the warnings to be spilled at
> > the default 21 seconds so that we know such events are happening. At
> > the same time, we want Panic to happen if the stall is long enough to
> > significantly affect available system memory on our system.
>
> OK, good, you want to panic only on long stalls, not on multiple short
> stalls.  Makes sense!
>
> By the way, the short stalls sometimes indicate bugs that should
> be fixed.  For example, there might be a long loop in the kernel
> that needs an added cond_resched().  Fixing those would make your
> systems less noisy.
>
> I am not asking you to do anything I don't do myself:
>
> 0a3b3c253a1e ("mm/mmap.c: Add cond_resched() for exit_mmap() CPU stalls")
> 9f47eb5461aa ("fs/btrfs: Add cond_resched() for try_release_extent_mapping() stalls")

@chao: Yes exactly, any stall will warn, that is why I want to add
miss statistics.
but mostly it is long stalls that hold on to Read Copy and detriment
system's capability to a point that we have to Panic :) .

cond_resched is exactly the API I am asking my team to do as one of
the ways to fix some of the stallers. Feel honored that we are on same
page.

>
> > Here is the plan based on our discussion, please advise if not inline
> > with the idea:
> > 1. modify panic_on_rcu_stall to be the maximum number of consecutive
> > warnings to trigger Panic.
> >     1) change its name to max_rcu_stall_to_panic,
>
> This change would make sense were it not for the possibility that there
> are lots of existing "panic_on_rcu_stall" instances in scripts, boot
> parameters, and so on.  We need the old name for compatibility, and with
> "0" and "1" having the same meanings as before.  We -can- use other
> numbers because those other numbers were not legal before.
>
> So we don't get to change the name.

@chao: Thanks that is why I post a follow up email about this to get
your advice. It is public API, better find another way.

>
> >     2) default value to 1, which is the same behavior as today's.
>
> The default value must be zero, which needs to mean "don't ever panic".
> Setting the value to "1" must mean "panic after the very first RCU CPU
> stall warning.  Other values can have other meanings.
>
> But you could use the existing panic_on_rcu_stall to be (say) the number
> of stall-warning messages that will be tolerated in a given grace period
> before panicking.
>
> Then you could create a new panic_on_rcu_stall_age or similar that
> does the time-based panicking.

@chao: good suggestions, will use in the patch.

>
> Either way, people currently using panic_on_rcu_stall need to be able
> to continue using panic_on_rcu_stall exactly as they are without seeing
> any change in behavior.  We don't break users if we have a choice,
> and here we really do have a choice.  So we don't break users.
>

@chao: sure will do. Also learning the best practices.

> > 2. use ((struct rcu_state *)->gpnum - (struct rcu_data *)->gpnum) >=
> > max_rcu_stall_to_panic as condition to trigger Panic;
>
> This probably won't do what you want.  This would trigger only if
> the rcu_data structure's ->gpnum were to lag behind the rcu_state
> structure's ->gpnum, which happens now whenever a CPU is idle for
> an extended time period.
>
> For the time-based approach (#1 below), the idea would be to compare
> against the rcu_state structure's ->gp_start field, and even then only
> when a grace period is actually in progress.  One way to do this is
> to check against the "gps" local variable in check_cpu_stall() in
> kernel/rcu/tree_stall.h.

@chao: good advice, will do in patch.

>
> > 3. reset (struct rcu_data *)->gpnum to (struct rcu_state *)->gpnum
> > every time a new grace period starts;
>
> Touching every rcu_data structure at the beginning of each grace
> period will slow things down.  This slowdown is avoided in the
> current code by having each CPU update its rcu_data structure's
> ->gpnum when it notices a new grace period.
>
> And ->gpnum and ->completed have been combined into a single ->gp_seq
> in recent kernels.
>
> So it is better to compare against ->gp_start (AKA "gps") as discussed
> above.

@chao: good advice, will do in patch.

>
> > 4. add a new member (struct rcu_data *)->gpmiss that is incremented at
> > each miss to track how many misses so far for statistics/debug
> > purpose.
>
> This is for #3 below, correct?
>
> If so, please instead create an ->nstalls_cur_gp or similar in the
> rcu_state structure.  Zero this at the beginning of every grace
> period (in record_gp_stall_check_time() in kernel/rcu/tree_stall.h)
> and increment and check it in both legs of the "if" statement at the
> end of check_cpu_stall() in this same file.

@chao: sure will do.

>
> > Your insights and advice are highly appreciated.
>
> Please let me know how it goes!

@chao: Thanks Paul, I think we are at a stage where I need to post a
patch for more suggestions.

>
>                                                         Thanx, Paul
>
> > Thanks!
> >
> > Chao
> >
> > On Thu, Aug 13, 2020 at 11:19 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Thu, Aug 13, 2020 at 10:22:09AM -0700, Chao Zhou wrote:
> > > > Hi,
> > > >
> > > > Some RCU stalls are transient and a system is fully capable to recover
> > > > after that, but we do want Panic after certain amount of GP misses.
> > > >
> > > > Current module parameter rcu_cpu_stall_panic only turn on/off Panic,
> > > > and 1 GP miss will trigger Panic when it is enabled.
> > > >
> > > > Plan to add a module parameter for users to fine-tune how many GP
> > > > misses are allowed before Panic.
> > > >
> > > > To save our precious time, a diff has been tested on our systems and
> > > > it works and solves our problem in transient RCU stall events.
> > > >
> > > > Your insights and guidance is highly appreciated.
> > >
> > > Please feel free to post a patch.  I could imagine a number of things
> > > you might be doing from your description above:
> > >
> > > 1.      Having a different time for panic, so that (for example) an
> > >         RCU CPU stall warning appears at 21 seconds (in mainline), and
> > >         if the grace period still has not ended at some time specified
> > >         by some kernel parameter.  For example, one approach would be
> > >         to make the existing panic_on_rcu_stall sysctl take an integer
> > >         instead of a boolean, and to make that integer specify how old
> > >         the stall-warned grace period must be before panic() is invoked.
> > >
> > > 2.      Instead use the number of RCU CPU stall warning messages to
> > >         trigger the panic, so that (for example), the panic would happen
> > >         on the tenth message.  Again, the panic_on_rcu_stall sysctl
> > >         might be used for this.
> > >
> > > 3.      Like #2, but reset the count every time a new grace period
> > >         starts.  So if the panic_on_rcu_stall sysctl was set to
> > >         ten, there would need to be ten RCU CPU stall warnings for
> > >         the same grace period before panic() was invoked.
> > >
> > > Of the above three, #1 and #3 seem the most attractive, with a slight
> > > preference for #1.
> > >
> > > Or did you have something else in mind?
> > >
> > >                                                         Thanx, Paul

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Allow multiple GP misses before Panic
  2020-08-13 19:00     ` Chao Zhou
@ 2020-08-16 18:01       ` Paul E. McKenney
  2020-08-16 23:56         ` Chao Zhou
  0 siblings, 1 reply; 8+ messages in thread
From: Paul E. McKenney @ 2020-08-16 18:01 UTC (permalink / raw)
  To: Chao Zhou; +Cc: rcu

On Thu, Aug 13, 2020 at 12:00:07PM -0700, Chao Zhou wrote:
> Hi Paul,
> 
> Because sysctl panic_on_rcu_stall is public interface, it might have
> already been used by adopters, will the change break them? Will a new
> sysctl max_rcu_stall_to_panic be more un-interruptive? Appreciate your
> insights about this .

It is a public interface, but using the previously forbidden values
other than zero and one is fine.

							Thanx, Paul

> On Thu, Aug 13, 2020 at 11:50 AM Chao Zhou <chaozhou1018@gmail.com> wrote:
> >
> > Thanks Paul for the insights!
> >
> > I studied the 3 options and think that #1+#3 offers both flexibility
> > to users and coverage of boundary user cases.
> >
> > For example, as an user of RCU, we want the warnings to be spilled at
> > the default 21 seconds so that we know such events are happening. At
> > the same time, we want Panic to happen if the stall is long enough to
> > significantly affect available system memory on our system.
> >
> > Here is the plan based on our discussion, please advise if not inline
> > with the idea:
> > 1. modify panic_on_rcu_stall to be the maximum number of consecutive
> > warnings to trigger Panic.
> >     1) change its name to max_rcu_stall_to_panic,
> >     2) default value to 1, which is the same behavior as today's.
> > 2. use ((struct rcu_state *)->gpnum - (struct rcu_data *)->gpnum) >=
> > max_rcu_stall_to_panic as condition to trigger Panic;
> > 3. reset (struct rcu_data *)->gpnum to (struct rcu_state *)->gpnum
> > every time a new grace period starts;
> > 4. add a new member (struct rcu_data *)->gpmiss that is incremented at
> > each miss to track how many misses so far for statistics/debug
> > purpose.
> >
> > Your insights and advice are highly appreciated.
> >
> > Thanks!
> >
> > Chao
> >
> > On Thu, Aug 13, 2020 at 11:19 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Thu, Aug 13, 2020 at 10:22:09AM -0700, Chao Zhou wrote:
> > > > Hi,
> > > >
> > > > Some RCU stalls are transient and a system is fully capable to recover
> > > > after that, but we do want Panic after certain amount of GP misses.
> > > >
> > > > Current module parameter rcu_cpu_stall_panic only turn on/off Panic,
> > > > and 1 GP miss will trigger Panic when it is enabled.
> > > >
> > > > Plan to add a module parameter for users to fine-tune how many GP
> > > > misses are allowed before Panic.
> > > >
> > > > To save our precious time, a diff has been tested on our systems and
> > > > it works and solves our problem in transient RCU stall events.
> > > >
> > > > Your insights and guidance is highly appreciated.
> > >
> > > Please feel free to post a patch.  I could imagine a number of things
> > > you might be doing from your description above:
> > >
> > > 1.      Having a different time for panic, so that (for example) an
> > >         RCU CPU stall warning appears at 21 seconds (in mainline), and
> > >         if the grace period still has not ended at some time specified
> > >         by some kernel parameter.  For example, one approach would be
> > >         to make the existing panic_on_rcu_stall sysctl take an integer
> > >         instead of a boolean, and to make that integer specify how old
> > >         the stall-warned grace period must be before panic() is invoked.
> > >
> > > 2.      Instead use the number of RCU CPU stall warning messages to
> > >         trigger the panic, so that (for example), the panic would happen
> > >         on the tenth message.  Again, the panic_on_rcu_stall sysctl
> > >         might be used for this.
> > >
> > > 3.      Like #2, but reset the count every time a new grace period
> > >         starts.  So if the panic_on_rcu_stall sysctl was set to
> > >         ten, there would need to be ten RCU CPU stall warnings for
> > >         the same grace period before panic() was invoked.
> > >
> > > Of the above three, #1 and #3 seem the most attractive, with a slight
> > > preference for #1.
> > >
> > > Or did you have something else in mind?
> > >
> > >                                                         Thanx, Paul

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Allow multiple GP misses before Panic
  2020-08-16 18:01       ` Paul E. McKenney
@ 2020-08-16 23:56         ` Chao Zhou
  0 siblings, 0 replies; 8+ messages in thread
From: Chao Zhou @ 2020-08-16 23:56 UTC (permalink / raw)
  To: paulmck; +Cc: rcu

Great, thanks!

On Sun, Aug 16, 2020 at 11:01 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Aug 13, 2020 at 12:00:07PM -0700, Chao Zhou wrote:
> > Hi Paul,
> >
> > Because sysctl panic_on_rcu_stall is public interface, it might have
> > already been used by adopters, will the change break them? Will a new
> > sysctl max_rcu_stall_to_panic be more un-interruptive? Appreciate your
> > insights about this .
>
> It is a public interface, but using the previously forbidden values
> other than zero and one is fine.
>
>                                                         Thanx, Paul
>
> > On Thu, Aug 13, 2020 at 11:50 AM Chao Zhou <chaozhou1018@gmail.com> wrote:
> > >
> > > Thanks Paul for the insights!
> > >
> > > I studied the 3 options and think that #1+#3 offers both flexibility
> > > to users and coverage of boundary user cases.
> > >
> > > For example, as an user of RCU, we want the warnings to be spilled at
> > > the default 21 seconds so that we know such events are happening. At
> > > the same time, we want Panic to happen if the stall is long enough to
> > > significantly affect available system memory on our system.
> > >
> > > Here is the plan based on our discussion, please advise if not inline
> > > with the idea:
> > > 1. modify panic_on_rcu_stall to be the maximum number of consecutive
> > > warnings to trigger Panic.
> > >     1) change its name to max_rcu_stall_to_panic,
> > >     2) default value to 1, which is the same behavior as today's.
> > > 2. use ((struct rcu_state *)->gpnum - (struct rcu_data *)->gpnum) >=
> > > max_rcu_stall_to_panic as condition to trigger Panic;
> > > 3. reset (struct rcu_data *)->gpnum to (struct rcu_state *)->gpnum
> > > every time a new grace period starts;
> > > 4. add a new member (struct rcu_data *)->gpmiss that is incremented at
> > > each miss to track how many misses so far for statistics/debug
> > > purpose.
> > >
> > > Your insights and advice are highly appreciated.
> > >
> > > Thanks!
> > >
> > > Chao
> > >
> > > On Thu, Aug 13, 2020 at 11:19 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > On Thu, Aug 13, 2020 at 10:22:09AM -0700, Chao Zhou wrote:
> > > > > Hi,
> > > > >
> > > > > Some RCU stalls are transient and a system is fully capable to recover
> > > > > after that, but we do want Panic after certain amount of GP misses.
> > > > >
> > > > > Current module parameter rcu_cpu_stall_panic only turn on/off Panic,
> > > > > and 1 GP miss will trigger Panic when it is enabled.
> > > > >
> > > > > Plan to add a module parameter for users to fine-tune how many GP
> > > > > misses are allowed before Panic.
> > > > >
> > > > > To save our precious time, a diff has been tested on our systems and
> > > > > it works and solves our problem in transient RCU stall events.
> > > > >
> > > > > Your insights and guidance is highly appreciated.
> > > >
> > > > Please feel free to post a patch.  I could imagine a number of things
> > > > you might be doing from your description above:
> > > >
> > > > 1.      Having a different time for panic, so that (for example) an
> > > >         RCU CPU stall warning appears at 21 seconds (in mainline), and
> > > >         if the grace period still has not ended at some time specified
> > > >         by some kernel parameter.  For example, one approach would be
> > > >         to make the existing panic_on_rcu_stall sysctl take an integer
> > > >         instead of a boolean, and to make that integer specify how old
> > > >         the stall-warned grace period must be before panic() is invoked.
> > > >
> > > > 2.      Instead use the number of RCU CPU stall warning messages to
> > > >         trigger the panic, so that (for example), the panic would happen
> > > >         on the tenth message.  Again, the panic_on_rcu_stall sysctl
> > > >         might be used for this.
> > > >
> > > > 3.      Like #2, but reset the count every time a new grace period
> > > >         starts.  So if the panic_on_rcu_stall sysctl was set to
> > > >         ten, there would need to be ten RCU CPU stall warnings for
> > > >         the same grace period before panic() was invoked.
> > > >
> > > > Of the above three, #1 and #3 seem the most attractive, with a slight
> > > > preference for #1.
> > > >
> > > > Or did you have something else in mind?
> > > >
> > > >                                                         Thanx, Paul

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-08-16 23:56 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-13 17:22 Allow multiple GP misses before Panic Chao Zhou
2020-08-13 18:19 ` Paul E. McKenney
2020-08-13 18:50   ` Chao Zhou
2020-08-13 19:00     ` Chao Zhou
2020-08-16 18:01       ` Paul E. McKenney
2020-08-16 23:56         ` Chao Zhou
2020-08-13 21:21     ` Paul E. McKenney
2020-08-13 21:57       ` Chao Zhou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).