Subject: kernel-rt rcuc lock contention problem
From: Luiz Capitulino @ 2015-01-26 19:14 UTC
  To: paulmck; +Cc: linux-rt-users, Marcelo Tosatti

Paul,

We've been running cyclictest measurements inside a KVM guest,
and we observed spinlock contention among rcuc threads.

Basically, we have a 16-CPU NUMA machine that is very well set up
for RT. Both this machine and the guest run the RT kernel. As our
test-case requires an application in the guest taking 100% of the
CPU, the RT priority configuration that gives the best latency is
this one:

 263  FF   3  [rcuc/15]
  13  FF   3  [rcub/1]
  12  FF   3  [rcub/0]
 265  FF   2  [ksoftirqd/15]
3181  FF   1  qemu-kvm

In this configuration, the rcuc thread can preempt the guest's
vcpu thread. This shouldn't be a problem, except that in some
cases we see the rcuc/15 thread spend 10us or more spinning on
this spinlock (note that IRQs are disabled during this period):

__rcu_process_callbacks()
{
...
	local_irq_save(flags);
	if (cpu_needs_another_gp(rsp, rdp)) {
		raw_spin_lock(&rcu_get_root(rsp)->lock); /* irqs disabled. */
		rcu_start_gp(rsp);
		raw_spin_unlock_irqrestore(&rcu_get_root(rsp)->lock, flags);
...

We've tried playing with the rcu_nocbs= option. However, it
did not help because, for reasons we don't understand, the rcuc
threads have to handle grace period start even when callback
offloading is used. Handling this case requires this code path
to be executed.

We've cooked the following extremely dirty patch, just to see
what would happen:

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index eaed1ef..c0771cc 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -2298,9 +2298,19 @@ __rcu_process_callbacks(struct rcu_state *rsp)
 	/* Does this CPU require a not-yet-started grace period? */
 	local_irq_save(flags);
 	if (cpu_needs_another_gp(rsp, rdp)) {
-		raw_spin_lock(&rcu_get_root(rsp)->lock); /* irqs disabled. */
-		rcu_start_gp(rsp);
-		raw_spin_unlock_irqrestore(&rcu_get_root(rsp)->lock, flags);
+		for (;;) {
+			if (!raw_spin_trylock(&rcu_get_root(rsp)->lock)) {
+				local_irq_restore(flags);
+				local_bh_enable();
+				schedule_timeout_interruptible(2);
+				local_bh_disable();
+				local_irq_save(flags);
+				continue;
+			}
+			rcu_start_gp(rsp);
+			raw_spin_unlock_irqrestore(&rcu_get_root(rsp)->lock, flags);
+			break;
+		}
 	} else {
 		local_irq_restore(flags);
 	}

With this patch rcuc is gone from our traces and the scheduling
latency is reduced by 3us in our CPU-bound test-case.
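
For readers who want to experiment with the same idea outside the
kernel, the try-lock-and-back-off loop in the patch above can be
sketched in userspace C. This is only an illustration under
assumptions: lock_with_backoff() is a hypothetical helper, a pthread
mutex stands in for the root rcu_node raw spinlock, and nanosleep()
stands in for schedule_timeout_interruptible(). None of it is kernel
code, and it says nothing about whether the approach is safe for RCU.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/*
 * Hypothetical userspace analogue of the patch: instead of blocking
 * (or spinning) on a contended lock, try to take it and, on failure,
 * sleep for backoff_ns nanoseconds before retrying.
 *
 * backoff_ns must be in the range [0, 999999999].
 * Returns the number of failed attempts before the lock was taken.
 * On return the caller holds the lock and must unlock it.
 */
static int lock_with_backoff(pthread_mutex_t *lock, long backoff_ns)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = backoff_ns };
    int retries = 0;

    assert(lock != NULL);
    assert(backoff_ns >= 0 && backoff_ns < 1000000000L);

    /* pthread_mutex_trylock() returns 0 on success, EBUSY if held. */
    while (pthread_mutex_trylock(lock) != 0) {
        nanosleep(&ts, NULL);   /* give up the CPU instead of spinning */
        retries++;
    }
    return retries;
}
```

The trade-off is the same as in the patch: the waiter no longer
burns CPU time on the lock, but it may acquire the lock later than a
spinning waiter would, since it only rechecks after each sleep.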

Could you please advise on how to solve this contention problem?

For example, could we test whether the local CPU is nocb and, in
that case, skip rcu_start_gp() entirely?

Thanks!

Thread overview: 23+ messages
2015-01-26 19:14 kernel-rt rcuc lock contention problem Luiz Capitulino
2015-01-27 20:37 ` Paul E. McKenney
2015-01-28  1:55   ` Marcelo Tosatti
2015-01-28 14:18     ` Luiz Capitulino
2015-01-28 18:09       ` Paul E. McKenney
2015-01-28 18:39         ` Luiz Capitulino
2015-01-28 19:00           ` Paul E. McKenney
2015-01-28 19:06             ` Luiz Capitulino
2015-01-28 18:03     ` Paul E. McKenney
2015-01-28 18:25       ` Marcelo Tosatti
2015-01-28 18:55         ` Paul E. McKenney
2015-01-29 17:06           ` Steven Rostedt
2015-01-29 18:11             ` Paul E. McKenney
2015-01-29 18:13           ` Marcelo Tosatti
2015-01-29 18:36             ` Paul E. McKenney
2015-02-02 18:24           ` Marcelo Tosatti
2015-02-02 20:35             ` Steven Rostedt
2015-02-02 20:46               ` Marcelo Tosatti
2015-02-02 20:55                 ` Steven Rostedt
2015-02-02 21:02                   ` Marcelo Tosatti
2015-02-03 20:36                     ` Steven Rostedt
2015-02-03 20:57                       ` Paul E. McKenney
2015-02-03 23:55                       ` Marcelo Tosatti
