From: "Jürgen Groß" <jgross@suse.com>
To: Igor Druzhinin <igor.druzhinin@citrix.com>,
	xen-devel@lists.xenproject.org
Cc: "Kevin Tian" <kevin.tian@intel.com>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Julien Grall" <julien@xen.org>,
	"Jan Beulich" <jbeulich@suse.com>, "Wei Liu" <wl@xen.org>,
	"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>,
	"George Dunlap" <George.Dunlap@eu.citrix.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Ian Jackson" <ian.jackson@eu.citrix.com>,
	"Jun Nakajima" <jun.nakajima@intel.com>,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
Date: Mon, 2 Mar 2020 15:03:03 +0100	[thread overview]
Message-ID: <b287c3f5-4819-c6eb-6c77-dcb9cc5d5335@suse.com> (raw)
In-Reply-To: <c29bb636-a7d8-3bf0-ae59-f10a274a9238@citrix.com>

[-- Attachment #1: Type: text/plain, Size: 1915 bytes --]

On 02.03.20 14:25, Igor Druzhinin wrote:
> On 28/02/2020 07:10, Jürgen Groß wrote:
>>
>> I think you are just narrowing the window of the race:
>>
>> It is still possible for two cpus to enter rcu_barrier() and both make
>> it into the if ( !initial ) clause.
>>
>> Instead of introducing another atomic, I believe the following patch
>> (in place of yours) should do it:
>>
>> diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
>> index e6add0b120..0d5469a326 100644
>> --- a/xen/common/rcupdate.c
>> +++ b/xen/common/rcupdate.c
>> @@ -180,23 +180,17 @@ static void rcu_barrier_action(void)
>>
>>   void rcu_barrier(void)
>>   {
>> -    int initial = atomic_read(&cpu_count);
>> -
>>       while ( !get_cpu_maps() )
>>       {
>>           process_pending_softirqs();
>> -        if ( initial && !atomic_read(&cpu_count) )
>> +        if ( !atomic_read(&cpu_count) )
>>               return;
>>
>>           cpu_relax();
>> -        initial = atomic_read(&cpu_count);
>>       }
>>
>> -    if ( !initial )
>> -    {
>> -        atomic_set(&cpu_count, num_online_cpus());
>> +    if ( atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) == 0 )
>>           cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
>> -    }
>>
>>       while ( atomic_read(&cpu_count) )
>>       {
>>
>> Could you give that a try, please?
> 
> With this patch I cannot disable SMT at all.
> 
> The problem my diff solved was a race between two consecutive
> rcu_barrier() operations on CPU0 (the pattern specific to SMT on/off
> operations): some CPUs hadn't yet fully exited the cpu_count checking
> loop while cpu_count was already reinitialized on CPU0, leaving those
> CPUs stuck in the loop.

Ah, okay, then I believe a combination of the two patches is needed.

Something like the attached version?
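
For reference, the core of the attached version boils down to the
following (condensed sketch of the attached rcu_barrier(), comments
added here for illustration):

    void rcu_barrier(void)
    {
        /*
         * Only the cpu winning the cmpxchg arms a new barrier and kicks
         * the rcu softirq on all online cpus; any caller finding
         * cpu_count already non-zero just joins the barrier in progress.
         */
        if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )
            cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);

        /*
         * Each cpu's rcu_barrier_callback() decrements cpu_count, so all
         * callers simply wait here until the count has drained to zero.
         */
        while ( atomic_read(&cpu_count) )
        {
            process_pending_softirqs();
            cpu_relax();
        }
    }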


Juergen

[-- Attachment #2: 0002-xen-rcu-don-t-use-stop_machine_run-for-rcu_barrier.patch --]
[-- Type: text/x-patch, Size: 4793 bytes --]

From 560ecf8ca947b16aa5af7978905ace51965167e2 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wl@xen.org>
Date: Mon, 17 Feb 2020 06:58:49 +0100
Subject: [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

Today rcu_barrier() calls stop_machine_run() to synchronize all physical
cpus in order to ensure that all pending rcu calls have finished when it
returns.

As stop_machine_run() uses tasklets, this requires scheduling of idle
vcpus on all cpus, which in turn means that with core scheduling active
rcu_barrier() may only be called on idle cpus, as otherwise a scheduling
deadlock would occur.

There is no need at all to do the syncing of the cpus in tasklets, as rcu
activity is started in __do_softirq(), which is called whenever softirq
activity is allowed. So rcu_barrier() can easily be modified to use
softirqs for synchronizing the cpus, no longer requiring any scheduling
activity.

As there already is an rcu softirq, reuse it for the synchronization.

Finally switch rcu_barrier() to return void, as it now can never fail.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/rcupdate.c      | 49 ++++++++++++++++++++++++++--------------------
 xen/include/xen/rcupdate.h |  2 +-
 2 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index 079ea9d8a1..1f02a804e3 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -143,47 +143,51 @@ static int qhimark = 10000;
 static int qlowmark = 100;
 static int rsinterval = 1000;
 
-struct rcu_barrier_data {
-    struct rcu_head head;
-    atomic_t *cpu_count;
-};
+/*
+ * rcu_barrier() handling:
+ * cpu_count holds the number of cpus required to finish barrier handling.
+ * Cpus are synchronized via the softirq mechanism. rcu_barrier() is regarded
+ * to be active if cpu_count is not zero. In case rcu_barrier() is called on
+ * multiple cpus it is enough to check for cpu_count being not zero on entry
+ * and to call process_pending_softirqs() in a loop until cpu_count drops to
+ * zero, as syncing has been requested already and we don't need to sync
+ * multiple times.
+ */
+static atomic_t cpu_count = ATOMIC_INIT(0);
 
 static void rcu_barrier_callback(struct rcu_head *head)
 {
-    struct rcu_barrier_data *data = container_of(
-        head, struct rcu_barrier_data, head);
-    atomic_inc(data->cpu_count);
+    atomic_dec(&cpu_count);
 }
 
-static int rcu_barrier_action(void *_cpu_count)
+static void rcu_barrier_action(void)
 {
-    struct rcu_barrier_data data = { .cpu_count = _cpu_count };
-
-    ASSERT(!local_irq_is_enabled());
-    local_irq_enable();
+    struct rcu_head head;
 
     /*
      * When callback is executed, all previously-queued RCU work on this CPU
      * is completed. When all CPUs have executed their callback, data.cpu_count
      * will have been incremented to include every online CPU.
      */
-    call_rcu(&data.head, rcu_barrier_callback);
+    call_rcu(&head, rcu_barrier_callback);
 
-    while ( atomic_read(data.cpu_count) != num_online_cpus() )
+    while ( atomic_read(&cpu_count) )
     {
         process_pending_softirqs();
         cpu_relax();
     }
-
-    local_irq_disable();
-
-    return 0;
 }
 
-int rcu_barrier(void)
+void rcu_barrier(void)
 {
-    atomic_t cpu_count = ATOMIC_INIT(0);
-    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
+    if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )
+        cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
+
+    while ( atomic_read(&cpu_count) )
+    {
+        process_pending_softirqs();
+        cpu_relax();
+    }
 }
 
 /* Is batch a before batch b ? */
@@ -422,6 +426,9 @@ static void rcu_process_callbacks(void)
         rdp->process_callbacks = false;
         __rcu_process_callbacks(&rcu_ctrlblk, rdp);
     }
+
+    if ( atomic_read(&cpu_count) )
+        rcu_barrier_action();
 }
 
 static int __rcu_pending(struct rcu_ctrlblk *rcp, struct rcu_data *rdp)
diff --git a/xen/include/xen/rcupdate.h b/xen/include/xen/rcupdate.h
index 174d058113..87f35b7704 100644
--- a/xen/include/xen/rcupdate.h
+++ b/xen/include/xen/rcupdate.h
@@ -143,7 +143,7 @@ void rcu_check_callbacks(int cpu);
 void call_rcu(struct rcu_head *head, 
               void (*func)(struct rcu_head *head));
 
-int rcu_barrier(void);
+void rcu_barrier(void);
 
 void rcu_idle_enter(unsigned int cpu);
 void rcu_idle_exit(unsigned int cpu);
-- 
2.16.4


Thread overview: 27+ messages
2020-02-18 12:21 [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling Juergen Gross
2020-02-18 12:21 ` [Xen-devel] [PATCH v2 1/4] xen/rcu: use rcu softirq for forcing quiescent state Juergen Gross
2020-02-21 14:17   ` Andrew Cooper
2020-02-18 12:21 ` [Xen-devel] [PATCH v2 2/4] xen/rcu: don't use stop_machine_run() for rcu_barrier() Juergen Gross
2020-02-18 12:21 ` [Xen-devel] [PATCH v2 3/4] xen: add process_pending_softirqs_norcu() for keyhandlers Juergen Gross
2020-02-24 11:25   ` Roger Pau Monné
2020-02-24 11:44     ` Jürgen Groß
2020-02-24 12:02       ` Roger Pau Monné
2020-02-18 12:21 ` [Xen-devel] [PATCH v2 4/4] xen/rcu: add assertions to debug build Juergen Gross
2020-02-24 11:31   ` Roger Pau Monné
2020-02-24 11:45     ` Jürgen Groß
2020-02-18 13:15 ` [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling Igor Druzhinin
2020-02-19 16:48   ` Igor Druzhinin
2020-02-22  2:29 ` Igor Druzhinin
2020-02-22  6:05   ` Jürgen Groß
2020-02-22 12:32     ` Julien Grall
2020-02-22 13:56       ` Jürgen Groß
2020-02-22 16:42     ` Igor Druzhinin
2020-02-23 14:14       ` Jürgen Groß
2020-02-27 15:16         ` Igor Druzhinin
2020-02-27 15:21           ` Jürgen Groß
2020-02-28  7:10           ` Jürgen Groß
2020-03-02 13:25             ` Igor Druzhinin
2020-03-02 14:03               ` Jürgen Groß [this message]
2020-03-02 14:23                 ` Igor Druzhinin
2020-03-02 14:32                   ` Jürgen Groß
2020-03-02 22:29                     ` Igor Druzhinin
