From: Joel Fernandes <joel@joelfernandes.org>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Byungchul Park <byungchul.park@lge.com>,
jiangshanlai@gmail.com, josh@joshtriplett.org,
rostedt@goodmis.org, mathieu.desnoyers@efficios.com,
linux-kernel@vger.kernel.org, kernel-team@lge.com,
peterz@infradead.org
Subject: Re: [PATCH] rcu: Report a quiescent state when it's exactly in the state
Date: Fri, 11 May 2018 23:30:37 -0700
Message-ID: <20180512063037.GC192642@joelaf.mtv.corp.google.com>
In-Reply-To: <20180512050824.GF26088@linux.vnet.ibm.com>
On Fri, May 11, 2018 at 10:08:24PM -0700, Paul E. McKenney wrote:
> On Fri, May 11, 2018 at 03:41:38PM -0700, Joel Fernandes wrote:
> > On Fri, May 11, 2018 at 09:17:46AM -0700, Paul E. McKenney wrote:
> > > On Fri, May 11, 2018 at 09:57:54PM +0900, Byungchul Park wrote:
> > > > Hello folks,
> > > >
> > > > I think I wrote the title in a misleading way.
> > > >
> > > > Please change the title to something else such as,
> > > > "rcu: Report a quiescent state when it's in the state" or,
> > > > "rcu: Add points reporting quiescent states where proper" or so on.
> > > >
> > > > On 2018-05-11 5:30 PM, Byungchul Park wrote:
> > > > >We expect a quiescent state of TASKS_RCU when cond_resched_tasks_rcu_qs()
> > > > >is called, no matter whether it actually be scheduled or not. However,
> > > > >it currently doesn't report the quiescent state when the task enters
> > > > >into __schedule() as it's called with preempt = true. So make it report
> > > > >the quiescent state unconditionally when cond_resched_tasks_rcu_qs() is
> > > > >called.
> > > > >
> > > > >And in TINY_RCU, even though the quiescent state of rcu_bh also should
> > > > >be reported when the tick interrupt comes from user, it doesn't. So make
> > > > >it reported.
> > > > >
> > > > >Lastly in TREE_RCU, rcu_note_voluntary_context_switch() should be
> > > > >reported when the tick interrupt comes from not only user but also idle,
> > > > >as an extended quiescent state.
> > > > >
> > > > >Signed-off-by: Byungchul Park <byungchul.park@lge.com>
> > > > >---
> > > > > include/linux/rcupdate.h | 4 ++--
> > > > > kernel/rcu/tiny.c | 6 +++---
> > > > > kernel/rcu/tree.c | 4 ++--
> > > > > 3 files changed, 7 insertions(+), 7 deletions(-)
> > > > >
> > > > >diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > > >index ee8cf5fc..7432261 100644
> > > > >--- a/include/linux/rcupdate.h
> > > > >+++ b/include/linux/rcupdate.h
> > > > >@@ -195,8 +195,8 @@ static inline void exit_tasks_rcu_finish(void) { }
> > > > > */
> > > > > #define cond_resched_tasks_rcu_qs() \
> > > > > do { \
> > > > >- if (!cond_resched()) \
> > > > >- rcu_note_voluntary_context_switch_lite(current); \
> > > > >+ rcu_note_voluntary_context_switch_lite(current); \
> > > > >+ cond_resched(); \
> > >
> > > Ah, good point.
> > >
> > > Peter, I have to ask... Why is "cond_resched()" considered a preemption
> > > while "schedule()" is not?
> >
> > In fact, something interesting I inferred from the __schedule loop related to
> > your question:
> >
> > switch_count can either be set to prev->nivcsw or prev->nvcsw. If we can
> > assume that switch_count reflects whether the context switch is involuntary
> > or voluntary,
> >
> >   task-running-state   preempt   switch_count
> >   0 (running)          1         involuntary
> >   0                    0         involuntary
> >   1                    0         voluntary
> >   1                    1         involuntary
> >
> > According to the above table, both the task's running state and the preempt
> > parameter to __schedule should be used together to determine if the switch is
> > a voluntary one or not.
> >
> > So this code in rcu_note_context_switch should really be:
> > if (!preempt && !(current->state & TASK_RUNNING))

I should have written this here: !preempt && current->state

> > rcu_note_voluntary_context_switch_lite(current);
> >
> > According to the above table, cond_resched always classifies as an
> > involuntary switch, which makes sense to me. Even though cond_resched is
> > explicitly called, it's still sort of involuntary in the sense that it's not called
> > into the scheduler for sleeping, but rather for seeing if something else can
> > run instead (a preemption point). In fact, none of the task deactivation in the
> > __schedule loop will run if cond_resched is used.
> >
> > I agree that if schedule was called directly but with TASK_RUNNING=1, then
> > that could probably be classified an involuntary switch too...
> >
> > Also since we're deciding to call rcu_note_voluntary_context_switch_lite
> > unconditionally, then IMO this comment on that macro:
> >
> > /*
> > * Note a voluntary context switch for RCU-tasks benefit. This is a
> > * macro rather than an inline function to avoid #include hell.
> > */
> > #ifdef CONFIG_TASKS_RCU
> > #define rcu_note_voluntary_context_switch_lite(t)
> >
> > Should be changed to:
> >
> > /*
> > * Note an attempt to perform a voluntary context switch for RCU-tasks
> > * benefit. This is called even in situations where a context switch
> > * didn't really happen even though it was requested. This is a
> > * macro rather than an inline function to avoid #include hell.
> > */
> > #ifdef CONFIG_TASKS_RCU
> > #define rcu_note_voluntary_context_switch_lite(t)
> >
> > Right?
> >
> > Correct me if I'm wrong about anything, thanks,
>
> The starting point for me is that Tasks RCU is a special-purpose mechanism
> for freeing trampolines in PREEMPT=y kernels. The approach is to arrange
> for the trampoline to be inaccessible to future execution, wait for a
> tasks-RCU grace period, then free the trampoline. So a tasks-RCU grace
> period must wait until all tasks have spent at least some time outside
> of a trampoline. My understanding is that trampolines cannot contain
> preemption points, such as cond_resched() and cond_resched_tasks_rcu_qs(),
> so we want to count them as quiescent states regardless of whether or
> not any associated context switch is counted as involuntary.
>
> What situations lead to the second line of your table above?
> The sched_yield() system call, but trampolines don't do system calls,
> either, as far as I know.
>
> So it looks to me like that test can leave out the TASK_RUNNING check.

I don't know enough about tasks-RCU to comment further, sorry. It will probably
take me a few more nights of reading to catch up on that. It's possible the
check is not needed and tasks-RCU can survive without it, but I was thinking
from a correctness and future-proofing standpoint... I generally don't like
inconsistencies. The check in the __schedule loop is:

    if (!preempt && prev->state) {
        ....
        // switch_count = voluntary context switch counter pointer
        ....
    } else {
        ....
        // switch_count = involuntary context switch counter pointer
        ....
    }

    // context switch really happening
    if (prev != next) {
        ....
        ++*switch_count;
    }

The first conditional above, if (!preempt...), is what I was referring to; it
also checks the state.
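
To make the classification concrete, here is a minimal user-space sketch in C. It is not kernel code: classify_switch(), account_switch(), struct model_task and MODEL_TASK_RUNNING are names made up for illustration. The only parts taken from the discussion are the "!preempt && prev->state" test and the idea of bumping a voluntary or involuntary counter (nvcsw / nivcsw in the kernel) when a switch really happens. It assumes, as in the kernel, that the running state is represented by 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for this sketch; these are not kernel identifiers. */
enum switch_kind { SWITCH_INVOLUNTARY, SWITCH_VOLUNTARY };

/* Assumption: 0 models the running state, nonzero models a sleeping state. */
#define MODEL_TASK_RUNNING 0L

/* Simplified stand-in for the few task fields this sketch needs. */
struct model_task {
	long state;
	unsigned long nvcsw;   /* voluntary context switches */
	unsigned long nivcsw;  /* involuntary context switches */
};

/* Mirrors the "if (!preempt && prev->state)" test from the __schedule loop. */
static enum switch_kind classify_switch(bool preempt, long prev_state)
{
	if (!preempt && prev_state != MODEL_TASK_RUNNING)
		return SWITCH_VOLUNTARY;
	return SWITCH_INVOLUNTARY;
}

/*
 * Pick the counter from the classification, then bump it only when a
 * context switch really happens (the "prev != next" case).
 */
static void account_switch(struct model_task *prev, bool preempt, bool switched)
{
	unsigned long *switch_count;

	assert(prev);
	switch_count = &prev->nivcsw;
	if (classify_switch(preempt, prev->state) == SWITCH_VOLUNTARY)
		switch_count = &prev->nvcsw;
	if (switched)
		++*switch_count;
}
```

Calling classify_switch() with the four (state, preempt) combinations reproduces the table: only a non-running task entering with preempt == false is counted as voluntary, so a cond_resched()-style call from a running task always lands on the involuntary side.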

That issue aside, I was mainly trying to answer your question about why
schedule() is or isn't a preemption point by sharing the table, but I
possibly caused more confusion, sorry :-(. I'll let Peter and Steven chime in,
since they know more than I do about that, and I will just shut up and listen
instead of adding more noise. :-D

thanks,
- Joel