From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: Byungchul Park <max.byungchul.park@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>,
	Byungchul Park <byungchul.park@lge.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>, rcu <rcu@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	kernel-team@lge.com
Subject: Re: [PATCH] rcu: Make jiffies_till_sched_qs writable
Date: Sun, 14 Jul 2019 06:56:10 -0700
Message-ID: <20190714135610.GJ26519@linux.ibm.com>
In-Reply-To: <CANrsvRNK=+SKHJNLmwNjp2tnjacJpqwFVQH9aRCj2E1L10GHDQ@mail.gmail.com>

On Sun, Jul 14, 2019 at 10:39:58PM +0900, Byungchul Park wrote:
> On Sun, Jul 14, 2019 at 2:41 AM Paul E. McKenney <paulmck@linux.ibm.com> wrote:
> >
> > On Sat, Jul 13, 2019 at 11:42:57AM -0400, Joel Fernandes wrote:
> > > On Sat, Jul 13, 2019 at 08:13:30AM -0700, Paul E. McKenney wrote:
> > > > On Sat, Jul 13, 2019 at 10:20:02AM -0400, Joel Fernandes wrote:
> > > > > On Sat, Jul 13, 2019 at 4:47 AM Byungchul Park
> > > > > <max.byungchul.park@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Jul 12, 2019 at 9:51 PM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > > >
> > > > > > > On Fri, Jul 12, 2019 at 03:32:40PM +0900, Byungchul Park wrote:
> > > > > > > > On Thu, Jul 11, 2019 at 03:58:39PM -0400, Joel Fernandes wrote:
> > > > > > > > > Hmm, speaking of grace period durations, it seems to me the maximum grace
> > > > > > > > > period ever is recorded in rcu_state.gp_max. However it is not read from
> > > > > > > > > anywhere.
> > > > > > > > >
> > > > > > > > > Any idea why it was added but not used?
> > > > > > > > >
> > > > > > > > > I am interested in dumping this value just for fun, and seeing what I get.
> > > > > > > > >
> > > > > > > > > > I also wonder whether it would be useful to dump it in
> > > > > > > > > > rcutorture/rcuperf to find any issues, or even to expose it in sys/proc fs
> > > > > > > > > > to see what worst-case grace periods look like.
> > > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > >       commit ae91aa0adb14dc33114d566feca2f7cb7a96b8b7
> > > > > > > >       rcu: Remove debugfs tracing
> > > > > > > >
> > > > > > > > > removed all debugfs tracing, including gp_max.
> > > > > > > > >
> > > > > > > > > Your idea sounds great, and it does not even look hard to add, e.g.:
> > > > > > > >
> > > > > > > > :)
> > > > > > > >
> > > > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > > > > index ad9dc86..86095ff 100644
> > > > > > > > --- a/kernel/rcu/tree.c
> > > > > > > > +++ b/kernel/rcu/tree.c
> > > > > > > > @@ -1658,8 +1658,10 @@ static void rcu_gp_cleanup(void)
> > > > > > > >       raw_spin_lock_irq_rcu_node(rnp);
> > > > > > > >       rcu_state.gp_end = jiffies;
> > > > > > > >       gp_duration = rcu_state.gp_end - rcu_state.gp_start;
> > > > > > > > -     if (gp_duration > rcu_state.gp_max)
> > > > > > > > +     if (gp_duration > rcu_state.gp_max) {
> > > > > > > >               rcu_state.gp_max = gp_duration;
> > > > > > > > +             trace_rcu_grace_period(something something);
> > > > > > > > +     }
> > > > > > >
> > > > > > > Yes, that makes sense. But I think it is much better off as a readable value
> > > > > > > > from a virtual fs. The drawbacks of tracing for this sort of thing are:
> > > > > > > >  - Tracing will only catch it if tracing is on
> > > > > > > >  - Tracing data can be lost if there are too many events, and then no one
> > > > > > > >    has a clue what the max gp time is.
> > > > > > > >  - The data is already available in rcu_state::gp_max so copying it into the
> > > > > > > >    trace buffer seems a bit pointless IMHO
> > > > > > > >  - It is a lot easier on one's eyes to process a single counter than to
> > > > > > > >    process heaps of traces.
> > > > > > >
> > > > > > > I think a minimal set of RCU counters exposed to /proc or /sys should not
> > > > > > > hurt and could do more good than not. The scheduler already does this for
> > > > > > > scheduler statistics. I have seen Peter complain a lot about new tracepoints
> > > > > > > but not much (or never) about new statistics.
> > > > > > >
> > > > > > > Tracing has its strengths but may not apply well here IMO. I think a counter
> > > > > > > like this could be useful for tuning of things like the jiffies_*_sched_qs,
> > > > > > > the stall timeouts and also any other RCU knobs. What do you think?
> > > > > >
> > > > > > > I prefer a proc/sys knob to a tracepoint for this. I only considered the
> > > > > > > tracepoint because a knob looks like undoing what Paul did in ae91aa0ad.
> > > > > > >
> > > > > > > I think your reasoning is sound. I just wondered what Paul thinks of it.
> > > > >
> > > > > I believe at least initially, a set of statistics can be made
> > > > > available only when rcutorture or rcuperf module is loaded. That way
> > > > > > they are purely for debugging and nothing needs to be exposed in
> > > > > > normally distributed kernels, thus reducing testability concerns.
> > > > >
> > > > > rcu_state::gp_max would be trivial to expose through this, but for
> > > > > other statistics that are more complicated - perhaps
> > > > > tracepoint_probe_register can be used to add hooks on to the
> > > > > tracepoints and generate statistics from them. Again the registration
> > > > > of the probe and the probe handler itself would all be in
> > > > > rcutorture/rcuperf test code and not a part of the kernel proper.
> > > > > Thoughts?
> > > >
> > > > It still feels like you guys are hyperfocusing on this one particular
> > > > knob.  I instead need you to look at the interrelating knobs as a group.
> > >
> > > Thanks for the hints, we'll do that.
> > >
> > > > On the debugging side, suppose someone gives you an RCU bug report.
> > > > What information will you need?  How can you best get that information
> > > > without excessive numbers of over-and-back interactions with the guy
> > > > reporting the bug?  As part of this last question, what information is
> > > > normally supplied with the bug?  Alternatively, what information are
> > > > bug reporters normally expected to provide when asked?
> > >
> > > I suppose I could dig out some of our Android bug reports of the past where
> > > there were RCU issues but if there's any fires you are currently fighting do
> > > send it our way as debugging homework ;-)
> >
> > Evading the questions, are we?
> >
> > OK, I can be flexible.  Suppose that you were getting RCU CPU stall
> > warnings featuring multi_cpu_stop() called from cpu_stopper_thread().
> > Of course, this really means that some other CPU/task is holding up
> > multi_cpu_stop() without also blocking the current grace period.
> >
> > What is the best way to work out what is really holding things up?
> 
> Either the stopper preempted another task that was in a critical section
> and the stopper itself has a problem (in the PREEMPT case), or the
> mechanisms for urgent control do not work correctly.
> 
> I don't know exactly what you intended, but I would check things like
> (1) irq disable / eqs / tick / scheduler events and (2) whether the special
> handling for each level of qs urgency has started correctly. For that
> purpose, the full history of those events would be more useful.
>
> And with more thought, we could come up with a good way to
> use that data to identify what the problem is. Am I understanding
> your point correctly? If so, Joel and I can start to work on it.
> Otherwise, please correct me.

I believe you are on the right track.  In short, it would be great if
the kernel would automatically dump out the needed information when
cpu_stopper gets stalled, sort of like RCU does (much of the time,
anyway) in its CPU stall warnings.  Given a patch that did this, I would
be quite happy to help advocate for it!

							Thanx, Paul
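
To make the "automatically dump out the needed information" idea above a bit more concrete, here is a minimal userspace sketch in C. It is not kernel code and not a proposed patch: struct stop_model, stop_model_report(), MODEL_NR_CPUS, and the tick-based timeout are all hypothetical names invented for illustration. It only assumes that a stop operation has a start time and that each CPU acknowledges it, and it shows a checker naming the CPUs still being waited on once the timeout has passed.

```c
#include <assert.h>
#include <stdio.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for this model */

/* Hypothetical model of one in-flight stop operation. */
struct stop_model {
	unsigned long start;		/* tick at which the stop was requested */
	unsigned long timeout;		/* ticks to wait before reporting a stall */
	int acked[MODEL_NR_CPUS];	/* nonzero once that CPU has acknowledged */
};

/*
 * If the stop has been pending for at least m->timeout ticks and some
 * CPUs have not acknowledged, print them to 'out' and return how many
 * are pending.  Otherwise print nothing and return 0.
 */
static int stop_model_report(const struct stop_model *m, unsigned long now,
			     FILE *out)
{
	unsigned long waited;
	int cpu, pending = 0;

	assert(m != NULL && out != NULL);

	/* Unsigned subtraction stays correct across counter wraparound. */
	waited = now - m->start;
	if (waited < m->timeout)
		return 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (!m->acked[cpu])
			pending++;
	if (!pending)
		return 0;

	fprintf(out, "stop stalled for %lu ticks, waiting on CPUs:", waited);
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (!m->acked[cpu])
			fprintf(out, " %d", cpu);
	fprintf(out, "\n");
	return pending;
}
```

In this model a periodic caller would invoke stop_model_report(&m, now, stderr). A real implementation would presumably also need per-CPU state such as the irq, tick, and scheduler history mentioned in the quoted reply, which this sketch leaves out.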

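The gp_max bookkeeping quoted earlier in this message can also be modeled in userspace to show what a single readable counter would look like. This is a sketch under stated assumptions, not the kernel's implementation: struct gp_stats and the gp_stats_*() helpers are hypothetical names, the tick values stand in for jiffies, and gp_stats_show() only imitates the one-value-per-read style of a proc/sys file.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the rcu_state fields in the diff. */
struct gp_stats {
	unsigned long gp_start;	/* tick at grace-period start */
	unsigned long gp_end;	/* tick at grace-period end */
	unsigned long gp_max;	/* longest grace period seen so far */
};

static void gp_stats_start(struct gp_stats *s, unsigned long now)
{
	assert(s != NULL);
	s->gp_start = now;
}

/* Record the end of a grace period; return 1 if it set a new maximum. */
static int gp_stats_end(struct gp_stats *s, unsigned long now)
{
	unsigned long gp_duration;

	assert(s != NULL);
	s->gp_end = now;
	/* Unsigned subtraction stays correct if the tick counter wraps. */
	gp_duration = s->gp_end - s->gp_start;
	if (gp_duration > s->gp_max) {
		s->gp_max = gp_duration;
		return 1;
	}
	return 0;
}

/* Format the maximum as one decimal value plus newline; returns length. */
static int gp_stats_show(const struct gp_stats *s, char *buf, size_t len)
{
	assert(s != NULL && buf != NULL);
	return snprintf(buf, len, "%lu\n", s->gp_max);
}
```

The point of the model is the one made in the quoted discussion: the maximum survives regardless of whether tracing was enabled, and a reader only has to look at one number.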