From: "Jürgen Groß" <jgross@suse.com>
To: Igor Druzhinin <igor.druzhinin@citrix.com>,
	xen-devel@lists.xenproject.org
Cc: "Kevin Tian" <kevin.tian@intel.com>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Julien Grall" <julien@xen.org>,
	"Jan Beulich" <jbeulich@suse.com>, "Wei Liu" <wl@xen.org>,
	"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>,
	"George Dunlap" <George.Dunlap@eu.citrix.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Ian Jackson" <ian.jackson@eu.citrix.com>,
	"Jun Nakajima" <jun.nakajima@intel.com>,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
Date: Mon, 2 Mar 2020 15:03:03 +0100	[thread overview]
Message-ID: <b287c3f5-4819-c6eb-6c77-dcb9cc5d5335@suse.com> (raw)
In-Reply-To: <c29bb636-a7d8-3bf0-ae59-f10a274a9238@citrix.com>

[-- Attachment #1: Type: text/plain, Size: 1915 bytes --]

On 02.03.20 14:25, Igor Druzhinin wrote:
> On 28/02/2020 07:10, Jürgen Groß wrote:
>>
>> I think you are just narrowing the window of the race:
>>
>> It is still possible for two cpus to enter rcu_barrier() and both make
>> it into the if ( !initial ) clause.
>>
>> Instead of introducing another atomic, I believe the following patch
>> (in place of yours) should do it:
>>
>> diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
>> index e6add0b120..0d5469a326 100644
>> --- a/xen/common/rcupdate.c
>> +++ b/xen/common/rcupdate.c
>> @@ -180,23 +180,17 @@ static void rcu_barrier_action(void)
>>
>>   void rcu_barrier(void)
>>   {
>> -    int initial = atomic_read(&cpu_count);
>> -
>>       while ( !get_cpu_maps() )
>>       {
>>           process_pending_softirqs();
>> -        if ( initial && !atomic_read(&cpu_count) )
>> +        if ( !atomic_read(&cpu_count) )
>>               return;
>>
>>           cpu_relax();
>> -        initial = atomic_read(&cpu_count);
>>       }
>>
>> -    if ( !initial )
>> -    {
>> -        atomic_set(&cpu_count, num_online_cpus());
>> +    if ( atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) == 0 )
>>           cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
>> -    }
>>
>>       while ( atomic_read(&cpu_count) )
>>       {
>>
>> Could you give that a try, please?
> 
> With this patch I cannot disable SMT at all.
> 
> The problem my diff solved was a race between two consecutive
> rcu_barrier() operations on CPU0 (the pattern specific to SMT on/off
> operations): some CPUs hadn't yet fully exited the cpu_count checking
> loop while cpu_count was already reinitialized on CPU0, leaving those
> CPUs stuck in the loop.

Ah, okay, then I believe a combination of the two patches is needed.

Something like the attached version?
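
For reference, the core of the attached version boils down to the
following (condensed sketch of the attached rcu_barrier(), comments
added here for illustration):

    void rcu_barrier(void)
    {
        /*
         * Only the cpu winning the cmpxchg arms a new barrier and kicks
         * the rcu softirq on all online cpus; any caller finding
         * cpu_count already non-zero just joins the barrier in progress.
         */
        if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )
            cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);

        /*
         * Each cpu's rcu_barrier_callback() decrements cpu_count, so all
         * callers simply wait here until the count has drained to zero.
         */
        while ( atomic_read(&cpu_count) )
        {
            process_pending_softirqs();
            cpu_relax();
        }
    }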


Juergen

[-- Attachment #2: 0002-xen-rcu-don-t-use-stop_machine_run-for-rcu_barrier.patch --]
[-- Type: text/x-patch, Size: 4793 bytes --]

From 560ecf8ca947b16aa5af7978905ace51965167e2 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wl@xen.org>
Date: Mon, 17 Feb 2020 06:58:49 +0100
Subject: [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

Today rcu_barrier() calls stop_machine_run() to synchronize all physical
cpus in order to ensure that all pending rcu calls have finished when it
returns.

As stop_machine_run() uses tasklets, this requires scheduling of idle
vcpus on all cpus, which in turn means that with core scheduling active
rcu_barrier() may only be called on idle cpus, as otherwise a scheduling
deadlock would occur.

There is no need at all to do the syncing of the cpus in tasklets, as rcu
activity is started in __do_softirq(), which is called whenever softirq
activity is allowed. So rcu_barrier() can easily be modified to use
softirqs for synchronizing the cpus, no longer requiring any scheduling
activity.

As there already is an rcu softirq, reuse it for the synchronization.

Finally switch rcu_barrier() to return void, as it now can never fail.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/rcupdate.c      | 49 ++++++++++++++++++++++++++--------------------
 xen/include/xen/rcupdate.h |  2 +-
 2 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index 079ea9d8a1..1f02a804e3 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -143,47 +143,51 @@ static int qhimark = 10000;
 static int qlowmark = 100;
 static int rsinterval = 1000;
 
-struct rcu_barrier_data {
-    struct rcu_head head;
-    atomic_t *cpu_count;
-};
+/*
+ * rcu_barrier() handling:
+ * cpu_count holds the number of cpus required to finish barrier handling.
+ * Cpus are synchronized via the softirq mechanism. rcu_barrier() is regarded
+ * to be active if cpu_count is not zero. In case rcu_barrier() is called on
+ * multiple cpus it is enough to check for cpu_count being not zero on entry
+ * and to call process_pending_softirqs() in a loop until cpu_count drops to
+ * zero, as syncing has been requested already and we don't need to sync
+ * multiple times.
+ */
+static atomic_t cpu_count = ATOMIC_INIT(0);
 
 static void rcu_barrier_callback(struct rcu_head *head)
 {
-    struct rcu_barrier_data *data = container_of(
-        head, struct rcu_barrier_data, head);
-    atomic_inc(data->cpu_count);
+    atomic_dec(&cpu_count);
 }
 
-static int rcu_barrier_action(void *_cpu_count)
+static void rcu_barrier_action(void)
 {
-    struct rcu_barrier_data data = { .cpu_count = _cpu_count };
-
-    ASSERT(!local_irq_is_enabled());
-    local_irq_enable();
+    struct rcu_head head;
 
     /*
      * When callback is executed, all previously-queued RCU work on this CPU
      * is completed. When all CPUs have executed their callback, data.cpu_count
      * will have been incremented to include every online CPU.
      */
-    call_rcu(&data.head, rcu_barrier_callback);
+    call_rcu(&head, rcu_barrier_callback);
 
-    while ( atomic_read(data.cpu_count) != num_online_cpus() )
+    while ( atomic_read(&cpu_count) )
     {
         process_pending_softirqs();
         cpu_relax();
     }
-
-    local_irq_disable();
-
-    return 0;
 }
 
-int rcu_barrier(void)
+void rcu_barrier(void)
 {
-    atomic_t cpu_count = ATOMIC_INIT(0);
-    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
+    if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )
+        cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
+
+    while ( atomic_read(&cpu_count) )
+    {
+        process_pending_softirqs();
+        cpu_relax();
+    }
 }
 
 /* Is batch a before batch b ? */
@@ -422,6 +426,9 @@ static void rcu_process_callbacks(void)
         rdp->process_callbacks = false;
         __rcu_process_callbacks(&rcu_ctrlblk, rdp);
     }
+
+    if ( atomic_read(&cpu_count) )
+        rcu_barrier_action();
 }
 
 static int __rcu_pending(struct rcu_ctrlblk *rcp, struct rcu_data *rdp)
diff --git a/xen/include/xen/rcupdate.h b/xen/include/xen/rcupdate.h
index 174d058113..87f35b7704 100644
--- a/xen/include/xen/rcupdate.h
+++ b/xen/include/xen/rcupdate.h
@@ -143,7 +143,7 @@ void rcu_check_callbacks(int cpu);
 void call_rcu(struct rcu_head *head, 
               void (*func)(struct rcu_head *head));
 
-int rcu_barrier(void);
+void rcu_barrier(void);
 
 void rcu_idle_enter(unsigned int cpu);
 void rcu_idle_exit(unsigned int cpu);
-- 
2.16.4


Thread overview: 27+ messages
2020-02-18 12:21 [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling Juergen Gross
2020-02-18 12:21 ` [Xen-devel] [PATCH v2 1/4] xen/rcu: use rcu softirq for forcing quiescent state Juergen Gross
2020-02-21 14:17   ` Andrew Cooper
2020-02-18 12:21 ` [Xen-devel] [PATCH v2 2/4] xen/rcu: don't use stop_machine_run() for rcu_barrier() Juergen Gross
2020-02-18 12:21 ` [Xen-devel] [PATCH v2 3/4] xen: add process_pending_softirqs_norcu() for keyhandlers Juergen Gross
2020-02-24 11:25   ` Roger Pau Monné
2020-02-24 11:44     ` Jürgen Groß
2020-02-24 12:02       ` Roger Pau Monné
2020-02-18 12:21 ` [Xen-devel] [PATCH v2 4/4] xen/rcu: add assertions to debug build Juergen Gross
2020-02-24 11:31   ` Roger Pau Monné
2020-02-24 11:45     ` Jürgen Groß
2020-02-18 13:15 ` [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling Igor Druzhinin
2020-02-19 16:48   ` Igor Druzhinin
2020-02-22  2:29 ` Igor Druzhinin
2020-02-22  6:05   ` Jürgen Groß
2020-02-22 12:32     ` Julien Grall
2020-02-22 13:56       ` Jürgen Groß
2020-02-22 16:42     ` Igor Druzhinin
2020-02-23 14:14       ` Jürgen Groß
2020-02-27 15:16         ` Igor Druzhinin
2020-02-27 15:21           ` Jürgen Groß
2020-02-28  7:10           ` Jürgen Groß
2020-03-02 13:25             ` Igor Druzhinin
2020-03-02 14:03               ` Jürgen Groß [this message]
2020-03-02 14:23                 ` Igor Druzhinin
2020-03-02 14:32                   ` Jürgen Groß
2020-03-02 22:29                     ` Igor Druzhinin
