* Normal RCU grace period can be stalled for long because need-resched flags not set?
@ 2019-07-03 15:25 Joel Fernandes
  2019-07-03 15:30 ` Steven Rostedt
  2019-07-03 16:10 ` Paul E. McKenney
  0 siblings, 2 replies; 30+ messages in thread
From: Joel Fernandes @ 2019-07-03 15:25 UTC (permalink / raw)
  To: Paul E. McKenney, Steven Rostedt, Mathieu Desnoyers, rcu

Hi!
I am measuring the performance of consolidated RCU vs RCU before the
consolidation of flavors happened (just for fun, and maybe to talk
about in a presentation).

What I did is limit the readers/writers in rcuperf to run on all
but one CPU. Then, on that one CPU, I had a thread doing
preempt_disable() + busy-wait + preempt_enable() in a loop.

I was hoping the preempt-disable busy-wait thread would stall the
regular readers, and it did.
But what I noticed is that grace periods take 100-200 milliseconds to
finish instead of the busy-wait time of 5-10 ms that I set. On closer
examination, it looks like even though preempt_enable() happens in
my loop, the need-resched flag is not set even though the grace period
is long overdue. So the thread does not reschedule.

For now, in my test I am just setting the need-resched flag manually
after a busy-wait.
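
A minimal userspace sketch of the loop described above, not the actual
test code. The stub functions and run_loop() are hypothetical stand-ins
that only model one thing: preempt_enable() reschedules when the
need-resched flag is set, and does nothing otherwise. The sketch assumes
the flag is set just before preempt_enable(), and it does not model
time, ticks, or RCU itself.

```c
#include <stdbool.h>

/* Toy userspace stand-ins for the kernel primitives (NOT the real ones). */
static int  preempt_depth;   /* models the preempt count */
static bool need_resched;    /* models the need-resched flag */
static int  resched_count;   /* context switches, i.e. quiescent states */

static void preempt_disable(void) { preempt_depth++; }

/* Reschedule on the outermost enable only if the flag is set. */
static void preempt_enable(void)
{
	if (--preempt_depth == 0 && need_resched) {
		need_resched = false;
		resched_count++;
	}
}

/* Placeholder for the 5-10 ms busy-wait; the model does not track time. */
static void busy_wait(void) { }

/* Run the loop and return how many times the thread rescheduled. */
static int run_loop(int iterations, bool set_flag_manually)
{
	preempt_depth = 0;
	need_resched = false;
	resched_count = 0;

	for (int i = 0; i < iterations; i++) {
		preempt_disable();
		busy_wait();
		if (set_flag_manually)
			need_resched = true;	/* the manual workaround */
		preempt_enable();
	}
	return resched_count;
}
```

In this model, run_loop(10, false) never reschedules, while
run_loop(10, true) reschedules once per iteration, which is the
difference the manual workaround makes.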

But I was thinking, can this really happen in real life? Say a CPU
is doing a lot of work under preempt_disable() but is diligent enough
to check the need-resched flag periodically. I believe some
spin-on-owner type locking primitives do this.

Even though the thread is stalling the grace period, it has no clue,
because no one told it that a GP is in progress and is being held up.
In the tick interrupt for that thread, rcu_need_deferred_qs()
returns false during the preempt-disable section. Can we do better for
such use cases, such as sending an IPI to the CPUs holding up the
grace period? Or even upgrading the grace period to an expedited one
if need be?

Expedited grace periods did not have such issues. However, I did notice
that sometimes the grace period would end not within 1 busy-wait
duration but within 2. The distribution was strongly bimodal, at
1*busy-wait and 2*busy-wait durations, for the expedited tests. (This
expedited test actually happened by accident, because the
preempt-disable in my loop was delaying init enough that the whole
test was running during init, during which synchronize_rcu() is
upgraded to expedited).

I am sorry if this is not a realistic real-life problem, but more a
"doctor it hurts if I do this" problem as Steven once said ;-)

I'll keep poking ;-)

 J.



Thread overview: 30+ messages
2019-07-03 15:25 Normal RCU grace period can be stalled for long because need-resched flags not set? Joel Fernandes
2019-07-03 15:30 ` Steven Rostedt
2019-07-03 16:41   ` Joel Fernandes
2019-07-03 16:43     ` Joel Fernandes
2019-07-03 17:39     ` Paul E. McKenney
2019-07-03 21:24       ` Joel Fernandes
2019-07-03 21:57         ` Paul E. McKenney
2019-07-03 22:24           ` Joel Fernandes
2019-07-03 23:01             ` Paul E. McKenney
2019-07-04  0:21               ` Joel Fernandes
2019-07-04  0:32                 ` Joel Fernandes
2019-07-04  0:50                   ` Paul E. McKenney
2019-07-04  3:24                     ` Joel Fernandes
2019-07-04 17:13                       ` Paul E. McKenney
2019-07-04 18:50                         ` Joel Fernandes
2019-07-04 22:17                           ` Paul E. McKenney
2019-07-05  0:08                             ` Joel Fernandes
2019-07-05  1:30                               ` Joel Fernandes
2019-07-05  1:57                                 ` Paul E. McKenney
2019-07-06 12:18                                   ` [attn: Steve] " Joel Fernandes
2019-07-06 18:05                                     ` Paul E. McKenney
2019-07-06 23:25                                       ` Steven Rostedt
2019-07-06 12:02                             ` Joel Fernandes
2019-07-06 18:21                               ` Paul E. McKenney
2019-07-06 23:03                                 ` Joel Fernandes
2019-07-07 11:19                                   ` Paul E. McKenney
2019-07-04  0:47                 ` Paul E. McKenney
2019-07-04 16:49                   ` Joel Fernandes
2019-07-04 17:08                     ` Paul E. McKenney
2019-07-03 16:10 ` Paul E. McKenney
