xen-devel.lists.xenproject.org archive mirror
* [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
From: Juergen Gross @ 2020-02-18 12:21 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Kevin Tian, Stefano Stabellini, Julien Grall,
	Jun Nakajima, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Jan Beulich, Roger Pau Monné

Today the RCU handling in Xen affects scheduling in several ways: it
raises sched softirqs without any real need, and it requires tasklets
for rcu_barrier(), which interacts badly with core scheduling.

This small series repairs those issues.

Additionally some ASSERT()s are added for verification of sane rcu
handling. In order to avoid those triggering right away, the obvious
violations are fixed.

Changes in V2:
- use get_cpu_maps() in rcu_barrier() handling
- avoid recursion in rcu_barrier() handling
- new patches 3 and 4

Juergen Gross (4):
  xen/rcu: use rcu softirq for forcing quiescent state
  xen/rcu: don't use stop_machine_run() for rcu_barrier()
  xen: add process_pending_softirqs_norcu() for keyhandlers
  xen/rcu: add assertions to debug build

 xen/arch/x86/mm/p2m-ept.c                   |  2 +-
 xen/arch/x86/numa.c                         |  4 +-
 xen/common/keyhandler.c                     |  6 +-
 xen/common/multicall.c                      |  1 +
 xen/common/rcupdate.c                       | 96 +++++++++++++++++++++--------
 xen/common/softirq.c                        | 19 ++++--
 xen/common/wait.c                           |  1 +
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  2 +-
 xen/drivers/passthrough/vtd/iommu.c         |  2 +-
 xen/drivers/vpci/msi.c                      |  4 +-
 xen/include/xen/rcupdate.h                  | 23 +++++--
 xen/include/xen/softirq.h                   |  2 +
 12 files changed, 118 insertions(+), 44 deletions(-)

-- 
2.16.4



* [Xen-devel] [PATCH v2 1/4] xen/rcu: use rcu softirq for forcing quiescent state
From: Juergen Gross @ 2020-02-18 12:21 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Julien Grall, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Jan Beulich

As rcu callbacks are processed in __do_softirq() there is no need to
use the scheduling softirq for forcing quiescent state. Any other
softirq would do the job and the scheduling one is the most expensive.

So use the already existing rcu softirq for that purpose. To tell apart
why the rcu softirq was raised, add a flag for the current usage.
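
A minimal sketch of the mechanism being relied upon (these are the
relevant lines of the existing __do_softirq() loop in
xen/common/softirq.c, visible as context in patch 3 of this series):

    cpu = smp_processor_id();

    if ( rcu_pending(cpu) )
        rcu_check_callbacks(cpu);

Raising any softirq on a cpu therefore gives pending rcu work a chance
to run; the softirq number chosen merely selects which handler is
invoked afterwards.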

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/rcupdate.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index 91d4ad0fd8..079ea9d8a1 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -89,6 +89,8 @@ struct rcu_data {
     /* 3) idle CPUs handling */
     struct timer idle_timer;
     bool idle_timer_active;
+
+    bool            process_callbacks;
 };
 
 /*
@@ -194,7 +196,7 @@ static void force_quiescent_state(struct rcu_data *rdp,
                                   struct rcu_ctrlblk *rcp)
 {
     cpumask_t cpumask;
-    raise_softirq(SCHEDULE_SOFTIRQ);
+    raise_softirq(RCU_SOFTIRQ);
     if (unlikely(rdp->qlen - rdp->last_rs_qlen > rsinterval)) {
         rdp->last_rs_qlen = rdp->qlen;
         /*
@@ -202,7 +204,7 @@ static void force_quiescent_state(struct rcu_data *rdp,
          * rdp->cpu is the current cpu.
          */
         cpumask_andnot(&cpumask, &rcp->cpumask, cpumask_of(rdp->cpu));
-        cpumask_raise_softirq(&cpumask, SCHEDULE_SOFTIRQ);
+        cpumask_raise_softirq(&cpumask, RCU_SOFTIRQ);
     }
 }
 
@@ -259,7 +261,10 @@ static void rcu_do_batch(struct rcu_data *rdp)
     if (!rdp->donelist)
         rdp->donetail = &rdp->donelist;
     else
+    {
+        rdp->process_callbacks = true;
         raise_softirq(RCU_SOFTIRQ);
+    }
 }
 
 /*
@@ -410,7 +415,13 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp,
 
 static void rcu_process_callbacks(void)
 {
-    __rcu_process_callbacks(&rcu_ctrlblk, &this_cpu(rcu_data));
+    struct rcu_data *rdp = &this_cpu(rcu_data);
+
+    if ( rdp->process_callbacks )
+    {
+        rdp->process_callbacks = false;
+        __rcu_process_callbacks(&rcu_ctrlblk, rdp);
+    }
 }
 
 static int __rcu_pending(struct rcu_ctrlblk *rcp, struct rcu_data *rdp)
@@ -518,6 +529,9 @@ static void rcu_idle_timer_handler(void* data)
 
 void rcu_check_callbacks(int cpu)
 {
+    struct rcu_data *rdp = &this_cpu(rcu_data);
+
+    rdp->process_callbacks = true;
     raise_softirq(RCU_SOFTIRQ);
 }
 
-- 
2.16.4



* [Xen-devel] [PATCH v2 2/4] xen/rcu: don't use stop_machine_run() for rcu_barrier()
From: Juergen Gross @ 2020-02-18 12:21 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Julien Grall, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Jan Beulich

Today rcu_barrier() is calling stop_machine_run() to synchronize all
physical cpus in order to ensure all pending rcu calls have finished
when returning.

As stop_machine_run() is using tasklets, this requires scheduling of
idle vcpus on all cpus. With core scheduling active, rcu_barrier() may
therefore be called on idle cpus only, as otherwise a scheduling
deadlock would occur.

There is no need at all to do the syncing of the cpus in tasklets, as
rcu activity is started in __do_softirq(), which is called whenever
softirq activity is allowed. So rcu_barrier() can easily be modified to
use softirqs for the synchronization of the cpus, no longer requiring
any scheduling activity.

As there already is a rcu softirq, reuse that for the synchronization.

Remove the barrier element from struct rcu_data as it isn't used.

Finally switch rcu_barrier() to return void as it now can never fail.
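
Roughly, the resulting synchronization scheme (a sketch of the flow
implemented below):

    rcu_barrier() (initiating cpu)      rcu_process_callbacks() (each cpu)
    ------------------------------      ----------------------------------
    get_cpu_maps()
    cpu_count = num_online_cpus()
    raise RCU_SOFTIRQ on all cpus  -->  call_rcu(&head, rcu_barrier_callback)
    wait for cpu_count == 0        <--  callback runs: atomic_dec(&cpu_count)
    put_cpu_maps()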

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- use get_cpu_maps()
- add recursion detection
---
 xen/common/rcupdate.c      | 72 ++++++++++++++++++++++++++++++++--------------
 xen/include/xen/rcupdate.h |  2 +-
 2 files changed, 51 insertions(+), 23 deletions(-)

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index 079ea9d8a1..e6add0b120 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -83,7 +83,6 @@ struct rcu_data {
     struct rcu_head **donetail;
     long            blimit;           /* Upper limit on a processed batch */
     int cpu;
-    struct rcu_head barrier;
     long            last_rs_qlen;     /* qlen during the last resched */
 
     /* 3) idle CPUs handling */
@@ -91,6 +90,7 @@ struct rcu_data {
     bool idle_timer_active;
 
     bool            process_callbacks;
+    bool            barrier_active;
 };
 
 /*
@@ -143,47 +143,68 @@ static int qhimark = 10000;
 static int qlowmark = 100;
 static int rsinterval = 1000;
 
-struct rcu_barrier_data {
-    struct rcu_head head;
-    atomic_t *cpu_count;
-};
+/*
+ * rcu_barrier() handling:
+ * cpu_count holds the number of cpus required to finish barrier handling.
+ * Cpus are synchronized via the softirq mechanism. rcu_barrier() is regarded to
+ * be active if cpu_count is not zero. In case rcu_barrier() is called on
+ * multiple cpus it is enough to check for cpu_count being not zero on entry
+ * and to call process_pending_softirqs() in a loop until cpu_count drops to
+ * zero, as syncing has been requested already and we don't need to sync
+ * multiple times.
+ */
+static atomic_t cpu_count = ATOMIC_INIT(0);
 
 static void rcu_barrier_callback(struct rcu_head *head)
 {
-    struct rcu_barrier_data *data = container_of(
-        head, struct rcu_barrier_data, head);
-    atomic_inc(data->cpu_count);
+    atomic_dec(&cpu_count);
 }
 
-static int rcu_barrier_action(void *_cpu_count)
+static void rcu_barrier_action(void)
 {
-    struct rcu_barrier_data data = { .cpu_count = _cpu_count };
-
-    ASSERT(!local_irq_is_enabled());
-    local_irq_enable();
+    struct rcu_head head;
 
     /*
      * When the callback is executed, all previously-queued RCU work on this
      * CPU is completed. When all CPUs have executed their callback, cpu_count
      * will have been decremented back to zero.
      */
-    call_rcu(&data.head, rcu_barrier_callback);
+    call_rcu(&head, rcu_barrier_callback);
 
-    while ( atomic_read(data.cpu_count) != num_online_cpus() )
+    while ( atomic_read(&cpu_count) )
     {
         process_pending_softirqs();
         cpu_relax();
     }
-
-    local_irq_disable();
-
-    return 0;
 }
 
-int rcu_barrier(void)
+void rcu_barrier(void)
 {
-    atomic_t cpu_count = ATOMIC_INIT(0);
-    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
+    int initial = atomic_read(&cpu_count);
+
+    while ( !get_cpu_maps() )
+    {
+        process_pending_softirqs();
+        if ( initial && !atomic_read(&cpu_count) )
+            return;
+
+        cpu_relax();
+        initial = atomic_read(&cpu_count);
+    }
+
+    if ( !initial )
+    {
+        atomic_set(&cpu_count, num_online_cpus());
+        cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
+    }
+
+    while ( atomic_read(&cpu_count) )
+    {
+        process_pending_softirqs();
+        cpu_relax();
+    }
+
+    put_cpu_maps();
 }
 
 /* Is batch a before batch b ? */
@@ -422,6 +443,13 @@ static void rcu_process_callbacks(void)
         rdp->process_callbacks = false;
         __rcu_process_callbacks(&rcu_ctrlblk, rdp);
     }
+
+    if ( atomic_read(&cpu_count) && !rdp->barrier_active )
+    {
+        rdp->barrier_active = true;
+        rcu_barrier_action();
+        rdp->barrier_active = false;
+    }
 }
 
 static int __rcu_pending(struct rcu_ctrlblk *rcp, struct rcu_data *rdp)
diff --git a/xen/include/xen/rcupdate.h b/xen/include/xen/rcupdate.h
index 174d058113..87f35b7704 100644
--- a/xen/include/xen/rcupdate.h
+++ b/xen/include/xen/rcupdate.h
@@ -143,7 +143,7 @@ void rcu_check_callbacks(int cpu);
 void call_rcu(struct rcu_head *head, 
               void (*func)(struct rcu_head *head));
 
-int rcu_barrier(void);
+void rcu_barrier(void);
 
 void rcu_idle_enter(unsigned int cpu);
 void rcu_idle_exit(unsigned int cpu);
-- 
2.16.4



* [Xen-devel] [PATCH v2 3/4] xen: add process_pending_softirqs_norcu() for keyhandlers
From: Juergen Gross @ 2020-02-18 12:21 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Kevin Tian, Stefano Stabellini, Julien Grall,
	Jun Nakajima, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Jan Beulich, Roger Pau Monné

Some keyhandlers are calling process_pending_softirqs() while holding
a rcu_read_lock(). This is wrong, as process_pending_softirqs() might
activate rcu calls which should not happen inside a rcu_read_lock().
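
An illustrative sketch of the problematic pattern (domlist_read_lock
taken via rcu_read_lock(), as in the dump handlers touched below):

    rcu_read_lock(&domlist_read_lock);
    for_each_domain ( d )
    {
        /* May run rcu work inside the read-side critical section: */
        process_pending_softirqs();
        ...
    }
    rcu_read_unlock(&domlist_read_lock);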

For that purpose add process_pending_softirqs_norcu() which will not
do any rcu activity and use this for keyhandlers.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/arch/x86/mm/p2m-ept.c                   |  2 +-
 xen/arch/x86/numa.c                         |  4 ++--
 xen/common/keyhandler.c                     |  6 +++---
 xen/common/softirq.c                        | 17 +++++++++++++----
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  2 +-
 xen/drivers/passthrough/vtd/iommu.c         |  2 +-
 xen/drivers/vpci/msi.c                      |  4 ++--
 xen/include/xen/softirq.h                   |  2 ++
 8 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index d4defa01c2..af2b012144 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -1342,7 +1342,7 @@ static void ept_dump_p2m_table(unsigned char key)
                            c ?: ept_entry->ipat ? '!' : ' ');
 
                 if ( !(record_counter++ % 100) )
-                    process_pending_softirqs();
+                    process_pending_softirqs_norcu();
             }
             unmap_domain_page(table);
         }
diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index f1066c59c7..cf6fcc9966 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -418,7 +418,7 @@ static void dump_numa(unsigned char key)
     printk("Memory location of each domain:\n");
     for_each_domain ( d )
     {
-        process_pending_softirqs();
+        process_pending_softirqs_norcu();
 
         printk("Domain %u (total: %u):\n", d->domain_id, domain_tot_pages(d));
 
@@ -462,7 +462,7 @@ static void dump_numa(unsigned char key)
             for ( j = 0; j < d->max_vcpus; j++ )
             {
                 if ( !(j & 0x3f) )
-                    process_pending_softirqs();
+                    process_pending_softirqs_norcu();
 
                 if ( vnuma->vcpu_to_vnode[j] == i )
                 {
diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
index 87bd145374..0d32bc4e2a 100644
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -263,7 +263,7 @@ static void dump_domains(unsigned char key)
     {
         unsigned int i;
 
-        process_pending_softirqs();
+        process_pending_softirqs_norcu();
 
         printk("General information for domain %u:\n", d->domain_id);
         printk("    refcnt=%d dying=%d pause_count=%d\n",
@@ -307,7 +307,7 @@ static void dump_domains(unsigned char key)
             for_each_sched_unit_vcpu ( unit, v )
             {
                 if ( !(v->vcpu_id & 0x3f) )
-                    process_pending_softirqs();
+                    process_pending_softirqs_norcu();
 
                 printk("    VCPU%d: CPU%d [has=%c] poll=%d "
                        "upcall_pend=%02x upcall_mask=%02x ",
@@ -337,7 +337,7 @@ static void dump_domains(unsigned char key)
         for_each_vcpu ( d, v )
         {
             if ( !(v->vcpu_id & 0x3f) )
-                process_pending_softirqs();
+                process_pending_softirqs_norcu();
 
             printk("Notifying guest %d:%d (virq %d, port %d)\n",
                    d->domain_id, v->vcpu_id,
diff --git a/xen/common/softirq.c b/xen/common/softirq.c
index b83ad96d6c..3fe75ca3e8 100644
--- a/xen/common/softirq.c
+++ b/xen/common/softirq.c
@@ -25,7 +25,7 @@ static softirq_handler softirq_handlers[NR_SOFTIRQS];
 static DEFINE_PER_CPU(cpumask_t, batch_mask);
 static DEFINE_PER_CPU(unsigned int, batching);
 
-static void __do_softirq(unsigned long ignore_mask)
+static void __do_softirq(unsigned long ignore_mask, bool rcu_allowed)
 {
     unsigned int i, cpu;
     unsigned long pending;
@@ -38,7 +38,7 @@ static void __do_softirq(unsigned long ignore_mask)
          */
         cpu = smp_processor_id();
 
-        if ( rcu_pending(cpu) )
+        if ( rcu_allowed && rcu_pending(cpu) )
             rcu_check_callbacks(cpu);
 
         if ( ((pending = (softirq_pending(cpu) & ~ignore_mask)) == 0)
@@ -55,13 +55,22 @@ void process_pending_softirqs(void)
 {
     ASSERT(!in_irq() && local_irq_is_enabled());
     /* Do not enter scheduler as it can preempt the calling context. */
-    __do_softirq((1ul << SCHEDULE_SOFTIRQ) | (1ul << SCHED_SLAVE_SOFTIRQ));
+    __do_softirq((1ul << SCHEDULE_SOFTIRQ) | (1ul << SCHED_SLAVE_SOFTIRQ),
+                 true);
+}
+
+void process_pending_softirqs_norcu(void)
+{
+    ASSERT(!in_irq() && local_irq_is_enabled());
+    /* Do not enter scheduler as it can preempt the calling context. */
+    __do_softirq((1ul << SCHEDULE_SOFTIRQ) | (1ul << SCHED_SLAVE_SOFTIRQ),
+                 false);
 }
 
 void do_softirq(void)
 {
     ASSERT_NOT_IN_ATOMIC();
-    __do_softirq(0);
+    __do_softirq(0, true);
 }
 
 void open_softirq(int nr, softirq_handler handler)
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 3112653960..880d64c748 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -587,7 +587,7 @@ static void amd_dump_p2m_table_level(struct page_info* pg, int level,
         struct amd_iommu_pte *pde = &table_vaddr[index];
 
         if ( !(index % 2) )
-            process_pending_softirqs();
+            process_pending_softirqs_norcu();
 
         if ( !pde->pr )
             continue;
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 3d60976dd5..c7bd8d4ada 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2646,7 +2646,7 @@ static void vtd_dump_p2m_table_level(paddr_t pt_maddr, int level, paddr_t gpa,
     for ( i = 0; i < PTE_NUM; i++ )
     {
         if ( !(i % 2) )
-            process_pending_softirqs();
+            process_pending_softirqs_norcu();
 
         pte = &pt_vaddr[i];
         if ( !dma_pte_present(*pte) )
diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
index 75010762ed..1d337604cc 100644
--- a/xen/drivers/vpci/msi.c
+++ b/xen/drivers/vpci/msi.c
@@ -321,13 +321,13 @@ void vpci_dump_msi(void)
                      * holding the lock.
                      */
                     printk("unable to print all MSI-X entries: %d\n", rc);
-                    process_pending_softirqs();
+                    process_pending_softirqs_norcu();
                     continue;
                 }
             }
 
             spin_unlock(&pdev->vpci->lock);
-            process_pending_softirqs();
+            process_pending_softirqs_norcu();
         }
     }
     rcu_read_unlock(&domlist_read_lock);
diff --git a/xen/include/xen/softirq.h b/xen/include/xen/softirq.h
index b4724f5c8b..b5bf3b83b1 100644
--- a/xen/include/xen/softirq.h
+++ b/xen/include/xen/softirq.h
@@ -37,7 +37,9 @@ void cpu_raise_softirq_batch_finish(void);
  * Process pending softirqs on this CPU. This should be called periodically
  * when performing work that prevents softirqs from running in a timely manner.
  * Use this instead of do_softirq() when you do not want to be preempted.
+ * The norcu variant is to be used while holding a rcu_read_lock().
  */
 void process_pending_softirqs(void);
+void process_pending_softirqs_norcu(void);
 
 #endif /* __XEN_SOFTIRQ_H__ */
-- 
2.16.4



* [Xen-devel] [PATCH v2 4/4] xen/rcu: add assertions to debug build
From: Juergen Gross @ 2020-02-18 12:21 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Julien Grall, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Jan Beulich

Xen's RCU implementation relies on no softirq handling taking place
while being in a RCU critical section. Add ASSERT()s in debug builds
in order to catch any violations.

For that purpose modify rcu_read_[un]lock() to use a dedicated percpu
counter instead of preempt_[en|dis]able(), as this makes it possible to
test that condition in __do_softirq() (ASSERT_NOT_IN_ATOMIC() is not
usable there due to __cpu_up() calling process_pending_softirqs()
while holding the cpu hotplug lock).
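
In debug builds a violation is then caught like this (illustrative
sketch using the names introduced by this patch):

    rcu_read_lock(&domlist_read_lock);     /* rcu_lock_cnt: 0 -> 1 */
    process_pending_softirqs();            /* __do_softirq() asserts
                                              rcu_quiesce_allowed() */
    rcu_read_unlock(&domlist_read_lock);   /* rcu_lock_cnt: 1 -> 0 */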

Dropping the now no longer needed #include of preempt.h in rcupdate.h
requires adding it in some sources.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/multicall.c     |  1 +
 xen/common/rcupdate.c      |  4 ++++
 xen/common/softirq.c       |  2 ++
 xen/common/wait.c          |  1 +
 xen/include/xen/rcupdate.h | 21 +++++++++++++++++----
 5 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/xen/common/multicall.c b/xen/common/multicall.c
index 5a199ebf8f..67f1a23485 100644
--- a/xen/common/multicall.c
+++ b/xen/common/multicall.c
@@ -10,6 +10,7 @@
 #include <xen/multicall.h>
 #include <xen/guest_access.h>
 #include <xen/perfc.h>
+#include <xen/preempt.h>
 #include <xen/trace.h>
 #include <asm/current.h>
 #include <asm/hardirq.h>
diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index e6add0b120..b03f4b44d9 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -46,6 +46,10 @@
 #include <xen/cpu.h>
 #include <xen/stop_machine.h>
 
+#ifndef NDEBUG
+DEFINE_PER_CPU(unsigned int, rcu_lock_cnt);
+#endif
+
 /* Global control variables for rcupdate callback mechanism. */
 static struct rcu_ctrlblk {
     long cur;           /* Current batch number.                      */
diff --git a/xen/common/softirq.c b/xen/common/softirq.c
index 3fe75ca3e8..18be8db0c6 100644
--- a/xen/common/softirq.c
+++ b/xen/common/softirq.c
@@ -30,6 +30,8 @@ static void __do_softirq(unsigned long ignore_mask, bool rcu_allowed)
     unsigned int i, cpu;
     unsigned long pending;
 
+    ASSERT(!rcu_allowed || rcu_quiesce_allowed());
+
     for ( ; ; )
     {
         /*
diff --git a/xen/common/wait.c b/xen/common/wait.c
index 24716e7676..9cdb174036 100644
--- a/xen/common/wait.c
+++ b/xen/common/wait.c
@@ -19,6 +19,7 @@
  * along with this program; If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <xen/preempt.h>
 #include <xen/sched.h>
 #include <xen/softirq.h>
 #include <xen/wait.h>
diff --git a/xen/include/xen/rcupdate.h b/xen/include/xen/rcupdate.h
index 87f35b7704..a5ee7fec2b 100644
--- a/xen/include/xen/rcupdate.h
+++ b/xen/include/xen/rcupdate.h
@@ -34,10 +34,23 @@
 #include <xen/cache.h>
 #include <xen/spinlock.h>
 #include <xen/cpumask.h>
-#include <xen/preempt.h>
+#include <xen/percpu.h>
 
 #define __rcu
 
+#ifndef NDEBUG
+DECLARE_PER_CPU(unsigned int, rcu_lock_cnt);
+
+#define rcu_quiesce_disable() (this_cpu(rcu_lock_cnt))++
+#define rcu_quiesce_enable()  (this_cpu(rcu_lock_cnt))--
+#define rcu_quiesce_allowed() (!this_cpu(rcu_lock_cnt))
+
+#else
+#define rcu_quiesce_disable() ((void)0)
+#define rcu_quiesce_enable()  ((void)0)
+#define rcu_quiesce_allowed() true
+#endif
+
 /**
  * struct rcu_head - callback structure for use with RCU
  * @next: next update requests in a list
@@ -90,16 +103,16 @@ typedef struct _rcu_read_lock rcu_read_lock_t;
  * will be deferred until the outermost RCU read-side critical section
  * completes.
  *
- * It is illegal to block while in an RCU read-side critical section.
+ * It is illegal to process softirqs while in an RCU read-side critical section.
  */
-#define rcu_read_lock(x)       ({ ((void)(x)); preempt_disable(); })
+#define rcu_read_lock(x)       ({ ((void)(x)); rcu_quiesce_disable(); })
 
 /**
  * rcu_read_unlock - marks the end of an RCU read-side critical section.
  *
  * See rcu_read_lock() for more information.
  */
-#define rcu_read_unlock(x)     ({ ((void)(x)); preempt_enable(); })
+#define rcu_read_unlock(x)     ({ ((void)(x)); rcu_quiesce_enable(); })
 
 /*
  * So where is rcu_write_lock()?  It does not exist, as there is no
-- 
2.16.4



* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
From: Igor Druzhinin @ 2020-02-18 13:15 UTC (permalink / raw)
  To: Juergen Gross, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jan Beulich,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Roger Pau Monné

On 18/02/2020 12:21, Juergen Gross wrote:
> Today the RCU handling in Xen is affecting scheduling in several ways.
> It is raising sched softirqs without any real need and it requires
> tasklets for rcu_barrier(), which interacts badly with core scheduling.
> 
> This small series repairs those issues.
> 
> Additionally some ASSERT()s are added for verification of sane rcu
> handling. In order to avoid those triggering right away the obvious
> violations are fixed.
> 

Initial test of the first 2 patches is promising. Will run more tests
overnight to see how stable it is.

Igor


* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
From: Igor Druzhinin @ 2020-02-19 16:48 UTC (permalink / raw)
  To: Juergen Gross, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jun Nakajima,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jan Beulich, Roger Pau Monné

On 18/02/2020 13:15, Igor Druzhinin wrote:
> On 18/02/2020 12:21, Juergen Gross wrote:
>> Today the RCU handling in Xen is affecting scheduling in several ways.
>> It is raising sched softirqs without any real need and it requires
>> tasklets for rcu_barrier(), which interacts badly with core scheduling.
>>
>> This small series repairs those issues.
>>
>> Additionally some ASSERT()s are added for verification of sane rcu
>> handling. In order to avoid those triggering right away the obvious
>> violations are fixed.
>>
> 
> Initial test of the first 2 patches is promising. Will run more tests
> over night to see how stable it is.

I stress-tested it overnight and it seems to work for our case.

Tested-by: Igor Druzhinin <igor.druzhinin@citrix.com>

Igor


* Re: [Xen-devel] [PATCH v2 1/4] xen/rcu: use rcu softirq for forcing quiescent state
From: Andrew Cooper @ 2020-02-21 14:17 UTC (permalink / raw)
  To: Juergen Gross, xen-devel
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Ian Jackson, Jan Beulich

On 18/02/2020 12:21, Juergen Gross wrote:
> As rcu callbacks are processed in __do_softirq() there is no need to
> use the scheduling softirq for forcing quiescent state. Any other
> softirq would do the job and the scheduling one is the most expensive.
>
> So use the already existing rcu softirq for that purpose. For telling
> apart why the rcu softirq was raised add a flag for the current usage.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
From: Igor Druzhinin @ 2020-02-22  2:29 UTC (permalink / raw)
  To: Juergen Gross, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jan Beulich,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Roger Pau Monné

On 18/02/2020 12:21, Juergen Gross wrote:
> Today the RCU handling in Xen is affecting scheduling in several ways.
> It is raising sched softirqs without any real need and it requires
> tasklets for rcu_barrier(), which interacts badly with core scheduling.
> 
> This small series repairs those issues.
> 
> Additionally some ASSERT()s are added for verification of sane rcu
> handling. In order to avoid those triggering right away the obvious
> violations are fixed.

I've done more testing of this with [1] and, unfortunately, it quite easily
deadlocks while without this series it doesn't.

Steps to repro:
- apply [1]
- take a host with considerable CPU count (~64)
- run a loop: xen-hptool smt-disable; xen-hptool smt-enable

[1] https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg01383.html

Igor


* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
From: Jürgen Groß @ 2020-02-22  6:05 UTC (permalink / raw)
  To: Igor Druzhinin, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jan Beulich,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Roger Pau Monné

On 22.02.20 03:29, Igor Druzhinin wrote:
> On 18/02/2020 12:21, Juergen Gross wrote:
>> Today the RCU handling in Xen is affecting scheduling in several ways.
>> It is raising sched softirqs without any real need and it requires
>> tasklets for rcu_barrier(), which interacts badly with core scheduling.
>>
>> This small series repairs those issues.
>>
>> Additionally some ASSERT()s are added for verification of sane rcu
>> handling. In order to avoid those triggering right away the obvious
>> violations are fixed.
> 
> I've done more testing of this with [1] and, unfortunately, it quite easily
> deadlocks while without this series it doesn't.
> 
> Steps to repro:
> - apply [1]
> - take a host with considerable CPU count (~64)
> - run a loop: xen-hptool smt-disable; xen-hptool smt-enable
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg01383.html

Yeah, the reason for that is that rcu_barrier() is a nop in this
situation without my patch, as stop_machine_run(), which is then called
by rcu_barrier(), will just return -EBUSY.

Juergen


* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
From: Julien Grall @ 2020-02-22 12:32 UTC (permalink / raw)
  To: Jürgen Groß, Igor Druzhinin, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Jan Beulich, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Jun Nakajima, Roger Pau Monné

Hi,

On 22/02/2020 06:05, Jürgen Groß wrote:
> On 22.02.20 03:29, Igor Druzhinin wrote:
>> On 18/02/2020 12:21, Juergen Gross wrote:
>>> Today the RCU handling in Xen is affecting scheduling in several ways.
>>> It is raising sched softirqs without any real need and it requires
>>> tasklets for rcu_barrier(), which interacts badly with core scheduling.
>>>
>>> This small series repairs those issues.
>>>
>>> Additionally some ASSERT()s are added for verification of sane rcu
>>> handling. In order to avoid those triggering right away the obvious
>>> violations are fixed.
>>
>> I've done more testing of this with [1] and, unfortunately, it quite 
>> easily
>> deadlocks while without this series it doesn't.
>>
>> Steps to repro:
>> - apply [1]
>> - take a host with considerable CPU count (~64)
>> - run a loop: xen-hptool smt-disable; xen-hptool smt-enable
>>
>> [1] 
>> https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg01383.html 
>>
> 
> Yeah, the reason for that is that rcu_barrier() is a nop in this
> situation without my patch, as the then called stop_machine_run() in
> rcu_barrier() will just return -EBUSY.

I think rcu_barrier() being a NOP is also a problem, as it means you
would be able to continue before the in-flight callback has been
completed.

But I am not entirely sure why a deadlock would happen with your
suggestion? Could you detail a bit more?

Cheers,

-- 
Julien Grall


* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
From: Jürgen Groß @ 2020-02-22 13:56 UTC (permalink / raw)
  To: Julien Grall, Igor Druzhinin, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Jan Beulich, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Jun Nakajima, Roger Pau Monné

On 22.02.20 13:32, Julien Grall wrote:
> Hi,
> 
> On 22/02/2020 06:05, Jürgen Groß wrote:
>> On 22.02.20 03:29, Igor Druzhinin wrote:
>>> On 18/02/2020 12:21, Juergen Gross wrote:
>>>> Today the RCU handling in Xen is affecting scheduling in several ways.
>>>> It is raising sched softirqs without any real need and it requires
>>>> tasklets for rcu_barrier(), which interacts badly with core scheduling.
>>>>
>>>> This small series repairs those issues.
>>>>
>>>> Additionally some ASSERT()s are added for verification of sane rcu
>>>> handling. In order to avoid those triggering right away the obvious
>>>> violations are fixed.
>>>
>>> I've done more testing of this with [1] and, unfortunately, it quite 
>>> easily
>>> deadlocks while without this series it doesn't.
>>>
>>> Steps to repro:
>>> - apply [1]
>>> - take a host with considerable CPU count (~64)
>>> - run a loop: xen-hptool smt-disable; xen-hptool smt-enable
>>>
>>> [1] 
>>> https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg01383.html 
>>>
>>
>> Yeah, the reason for that is that rcu_barrier() is a nop in this
>> situation without my patch, as the then called stop_machine_run() in
>> rcu_barrier() will just return -EBUSY.
> 
> I think rcu_barrier() been a NOP is also problem as it means you would 
> be able to continue before the in-flight callback has been completed.
> 
> But I am not entirely sure why a deadlock would happen with your 
> suggestion? Could you details a bit more?

get_cpu_maps() will return false as long as stop_machine_run() is
holding the lock, and rcu handling will loop until it gets the lock...
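
A rough sketch of the interaction (simplified; the exact
stop_machine_run() internals are assumed here):

    /* cpu running e.g. xen-hptool smt-disable: */
    stop_machine_run(...)
        get_cpu_maps();             /* held for the whole operation */
        ...                         /* rendezvous of all cpus       */
        put_cpu_maps();

    /* cpu reaching rcu_barrier() meanwhile: */
    while ( !get_cpu_maps() )       /* fails while the hotplug      */
        process_pending_softirqs(); /* operation holds the lock     */

i.e. the rcu handling spins until stop_machine_run() drops the lock
again.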


Juergen


* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
From: Igor Druzhinin @ 2020-02-22 16:42 UTC (permalink / raw)
  To: Jürgen Groß, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jan Beulich,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Roger Pau Monné

On 22/02/2020 06:05, Jürgen Groß wrote:
> On 22.02.20 03:29, Igor Druzhinin wrote:
>> On 18/02/2020 12:21, Juergen Gross wrote:
>>> Today the RCU handling in Xen is affecting scheduling in several ways.
>>> It is raising sched softirqs without any real need and it requires
>>> tasklets for rcu_barrier(), which interacts badly with core scheduling.
>>>
>>> This small series repairs those issues.
>>>
>>> Additionally some ASSERT()s are added for verification of sane rcu
>>> handling. In order to avoid those triggering right away the obvious
>>> violations are fixed.
>>
>> I've done more testing of this with [1] and, unfortunately, it quite easily
>> deadlocks while without this series it doesn't.
>>
>> Steps to repro:
>> - apply [1]
>> - take a host with considerable CPU count (~64)
>> - run a loop: xen-hptool smt-disable; xen-hptool smt-enable
>>
>> [1] https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg01383.html
> 
> Yeah, the reason for that is that rcu_barrier() is a nop in this
> situation without my patch, as the then called stop_machine_run() in
> rcu_barrier() will just return -EBUSY.

Are you sure that's the reason? I always have the following stack on CPU0:

(XEN) [  120.891143] *** Dumping CPU0 host state: ***
(XEN) [  120.895909] ----[ Xen-4.13.0  x86_64  debug=y   Not tainted ]----
(XEN) [  120.902487] CPU:    0
(XEN) [  120.905269] RIP:    e008:[<ffff82d0802aa750>] smp_send_call_function_mask+0x40/0x43
(XEN) [  120.913415] RFLAGS: 0000000000000286   CONTEXT: hypervisor
(XEN) [  120.919389] rax: 0000000000000000   rbx: ffff82d0805ddb78   rcx: 0000000000000001
(XEN) [  120.927362] rdx: ffff82d0805cdb00   rsi: ffff82d0805c7cd8   rdi: 0000000000000007
(XEN) [  120.935341] rbp: ffff8300920bfbc0   rsp: ffff8300920bfbb8   r8:  000000000000003b
(XEN) [  120.943310] r9:  0444444444444432   r10: 3333333333333333   r11: 0000000000000001
(XEN) [  120.951282] r12: ffff82d0805ddb78   r13: 0000000000000001   r14: ffff8300920bfc18
(XEN) [  120.959251] r15: ffff82d0802af646   cr0: 000000008005003b   cr4: 00000000003506e0
(XEN) [  120.967223] cr3: 00000000920b0000   cr2: ffff88820dffe7f8
(XEN) [  120.973125] fsb: 0000000000000000   gsb: ffff88821e3c0000   gss: 0000000000000000
(XEN) [  120.981094] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) [  120.988548] Xen code around <ffff82d0802aa750> (smp_send_call_function_mask+0x40/0x43):
(XEN) [  120.997037]  85 f9 ff fb 48 83 c4 08 <5b> 5d c3 9c 58 f6 c4 02 74 02 0f 0b 55 48 89 e5
(XEN) [  121.005442] Xen stack trace from rsp=ffff8300920bfbb8:
(XEN) [  121.011080]    ffff8300920bfc18 ffff8300920bfc00 ffff82d080242c84 ffff82d080389845
(XEN) [  121.019145]    ffff8300920bfc18 ffff82d0802af178 0000000000000000 0000001c1d27aff8
(XEN) [  121.027200]    0000000000000000 ffff8300920bfc80 ffff82d0802af1fa ffff82d080289adf
(XEN) [  121.035255]    fffffffffffffd55 0000000000000000 0000000000000000 0000000000000000
(XEN) [  121.043320]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [  121.051375]    000000000000003b 0000001c25e54bf1 0000000000000000 ffff8300920bfc80
(XEN) [  121.059443]    ffff82d0805c7300 ffff8300920bfcb0 ffff82d080245f4d ffff82d0802af4a2
(XEN) [  121.067498]    ffff82d0805c7300 ffff83042bb24f60 ffff82d08060f400 ffff8300920bfd00
(XEN) [  121.075553]    ffff82d080246781 ffff82d0805cdb00 ffff8300920bfd80 ffff82d0805c7040
(XEN) [  121.083621]    ffff82d0805cdb00 ffff82d0805cdb00 fffffffffffffff9 ffff8300920bffff
(XEN) [  121.091674]    0000000000000000 ffff8300920bfd30 ffff82d0802425a5 ffff82d0805c7040
(XEN) [  121.099739]    ffff82d0805cdb00 fffffffffffffff9 ffff8300920bffff ffff8300920bfd40
(XEN) [  121.107797]    ffff82d0802425e5 ffff8300920bfd80 ffff82d08022bc0f 0000000000000000
(XEN) [  121.115852]    ffff82d08022b600 ffff82d0804b3888 ffff82d0805cdb00 ffff82d0805cdb00
(XEN) [  121.123917]    fffffffffffffff9 ffff8300920bfdb0 ffff82d0802425a5 0000000000000003
(XEN) [  121.131975]    0000000000000001 00000000ffffffef ffff8300920bffff ffff8300920bfdc0
(XEN) [  121.140037]    ffff82d0802425e5 ffff8300920bfdd0 ffff82d08022b91b ffff8300920bfdf0
(XEN) [  121.148093]    ffff82d0802addb1 ffff83042b3b0000 0000000000000003 ffff8300920bfe30
(XEN) [  121.156150]    ffff82d0802ae086 ffff8300920bfe10 ffff83042b7e81e0 ffff83042b3b0000
(XEN) [  121.164216]    0000000000000000 0000000000000000 0000000000000000 ffff8300920bfe50
(XEN) [  121.172271] Xen call trace:
(XEN) [  121.175573]    [<ffff82d0802aa750>] R smp_send_call_function_mask+0x40/0x43
(XEN) [  121.183024]    [<ffff82d080242c84>] F on_selected_cpus+0xa4/0xde
(XEN) [  121.189520]    [<ffff82d0802af1fa>] F arch/x86/time.c#time_calibration+0x82/0x89
(XEN) [  121.197403]    [<ffff82d080245f4d>] F common/timer.c#execute_timer+0x49/0x64
(XEN) [  121.204951]    [<ffff82d080246781>] F common/timer.c#timer_softirq_action+0x116/0x24e
(XEN) [  121.213271]    [<ffff82d0802425a5>] F common/softirq.c#__do_softirq+0x85/0x90
(XEN) [  121.220890]    [<ffff82d0802425e5>] F process_pending_softirqs+0x35/0x37
(XEN) [  121.228086]    [<ffff82d08022bc0f>] F common/rcupdate.c#rcu_process_callbacks+0x1ef/0x20d
(XEN) [  121.236758]    [<ffff82d0802425a5>] F common/softirq.c#__do_softirq+0x85/0x90
(XEN) [  121.244378]    [<ffff82d0802425e5>] F process_pending_softirqs+0x35/0x37
(XEN) [  121.251568]    [<ffff82d08022b91b>] F rcu_barrier+0x58/0x6e
(XEN) [  121.257639]    [<ffff82d0802addb1>] F cpu_down_helper+0x11/0x32
(XEN) [  121.264051]    [<ffff82d0802ae086>] F arch/x86/sysctl.c#smt_up_down_helper+0x1d6/0x1fe
(XEN) [  121.272454]    [<ffff82d08020878d>] F common/domain.c#continue_hypercall_tasklet_handler+0x54/0xb8
(XEN) [  121.281900]    [<ffff82d0802454e6>] F common/tasklet.c#do_tasklet_work+0x81/0xb4
(XEN) [  121.289786]    [<ffff82d080245803>] F do_tasklet+0x58/0x85
(XEN) [  121.295771]    [<ffff82d08027a0b4>] F arch/x86/domain.c#idle_loop+0x87/0xcb

So it's not in the get_cpu_maps() loop. It seems to me it's not entering
time sync for some reason.

Igor


* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
From: Jürgen Groß @ 2020-02-23 14:14 UTC (permalink / raw)
  To: Igor Druzhinin, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jan Beulich,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Roger Pau Monné

On 22.02.20 17:42, Igor Druzhinin wrote:
> On 22/02/2020 06:05, Jürgen Groß wrote:
>> On 22.02.20 03:29, Igor Druzhinin wrote:
>>> On 18/02/2020 12:21, Juergen Gross wrote:
>>>> Today the RCU handling in Xen is affecting scheduling in several ways.
>>>> It is raising sched softirqs without any real need and it requires
>>>> tasklets for rcu_barrier(), which interacts badly with core scheduling.
>>>>
>>>> This small series repairs those issues.
>>>>
>>>> Additionally some ASSERT()s are added for verification of sane rcu
>>>> handling. In order to avoid those triggering right away the obvious
>>>> violations are fixed.
>>>
>>> I've done more testing of this with [1] and, unfortunately, it quite easily
>>> deadlocks while without this series it doesn't.
>>>
>>> Steps to repro:
>>> - apply [1]
>>> - take a host with considerable CPU count (~64)
>>> - run a loop: xen-hptool smt-disable; xen-hptool smt-enable
>>>
>>> [1] https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg01383.html
>>
>> Yeah, the reason for that is that rcu_barrier() is a nop in this
>> situation without my patch, as the then called stop_machine_run() in
>> rcu_barrier() will just return -EBUSY.
> 
> Are you sure that's ther reason? I always have the following stack on CPU0:
> 
> (XEN) [  120.891143] *** Dumping CPU0 host state: ***
> (XEN) [  120.895909] ----[ Xen-4.13.0  x86_64  debug=y   Not tainted ]----
> (XEN) [  120.902487] CPU:    0
> (XEN) [  120.905269] RIP:    e008:[<ffff82d0802aa750>] smp_send_call_function_mask+0x40/0x43
> (XEN) [  120.913415] RFLAGS: 0000000000000286   CONTEXT: hypervisor
> (XEN) [  120.919389] rax: 0000000000000000   rbx: ffff82d0805ddb78   rcx: 0000000000000001
> (XEN) [  120.927362] rdx: ffff82d0805cdb00   rsi: ffff82d0805c7cd8   rdi: 0000000000000007
> (XEN) [  120.935341] rbp: ffff8300920bfbc0   rsp: ffff8300920bfbb8   r8:  000000000000003b
> (XEN) [  120.943310] r9:  0444444444444432   r10: 3333333333333333   r11: 0000000000000001
> (XEN) [  120.951282] r12: ffff82d0805ddb78   r13: 0000000000000001   r14: ffff8300920bfc18
> (XEN) [  120.959251] r15: ffff82d0802af646   cr0: 000000008005003b   cr4: 00000000003506e0
> (XEN) [  120.967223] cr3: 00000000920b0000   cr2: ffff88820dffe7f8
> (XEN) [  120.973125] fsb: 0000000000000000   gsb: ffff88821e3c0000   gss: 0000000000000000
> (XEN) [  120.981094] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) [  120.988548] Xen code around <ffff82d0802aa750> (smp_send_call_function_mask+0x40/0x43):
> (XEN) [  120.997037]  85 f9 ff fb 48 83 c4 08 <5b> 5d c3 9c 58 f6 c4 02 74 02 0f 0b 55 48 89 e5
> (XEN) [  121.005442] Xen stack trace from rsp=ffff8300920bfbb8:
> (XEN) [  121.011080]    ffff8300920bfc18 ffff8300920bfc00 ffff82d080242c84 ffff82d080389845
> (XEN) [  121.019145]    ffff8300920bfc18 ffff82d0802af178 0000000000000000 0000001c1d27aff8
> (XEN) [  121.027200]    0000000000000000 ffff8300920bfc80 ffff82d0802af1fa ffff82d080289adf
> (XEN) [  121.035255]    fffffffffffffd55 0000000000000000 0000000000000000 0000000000000000
> (XEN) [  121.043320]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [  121.051375]    000000000000003b 0000001c25e54bf1 0000000000000000 ffff8300920bfc80
> (XEN) [  121.059443]    ffff82d0805c7300 ffff8300920bfcb0 ffff82d080245f4d ffff82d0802af4a2
> (XEN) [  121.067498]    ffff82d0805c7300 ffff83042bb24f60 ffff82d08060f400 ffff8300920bfd00
> (XEN) [  121.075553]    ffff82d080246781 ffff82d0805cdb00 ffff8300920bfd80 ffff82d0805c7040
> (XEN) [  121.083621]    ffff82d0805cdb00 ffff82d0805cdb00 fffffffffffffff9 ffff8300920bffff
> (XEN) [  121.091674]    0000000000000000 ffff8300920bfd30 ffff82d0802425a5 ffff82d0805c7040
> (XEN) [  121.099739]    ffff82d0805cdb00 fffffffffffffff9 ffff8300920bffff ffff8300920bfd40
> (XEN) [  121.107797]    ffff82d0802425e5 ffff8300920bfd80 ffff82d08022bc0f 0000000000000000
> (XEN) [  121.115852]    ffff82d08022b600 ffff82d0804b3888 ffff82d0805cdb00 ffff82d0805cdb00
> (XEN) [  121.123917]    fffffffffffffff9 ffff8300920bfdb0 ffff82d0802425a5 0000000000000003
> (XEN) [  121.131975]    0000000000000001 00000000ffffffef ffff8300920bffff ffff8300920bfdc0
> (XEN) [  121.140037]    ffff82d0802425e5 ffff8300920bfdd0 ffff82d08022b91b ffff8300920bfdf0
> (XEN) [  121.148093]    ffff82d0802addb1 ffff83042b3b0000 0000000000000003 ffff8300920bfe30
> (XEN) [  121.156150]    ffff82d0802ae086 ffff8300920bfe10 ffff83042b7e81e0 ffff83042b3b0000
> (XEN) [  121.164216]    0000000000000000 0000000000000000 0000000000000000 ffff8300920bfe50
> (XEN) [  121.172271] Xen call trace:
> (XEN) [  121.175573]    [<ffff82d0802aa750>] R smp_send_call_function_mask+0x40/0x43
> (XEN) [  121.183024]    [<ffff82d080242c84>] F on_selected_cpus+0xa4/0xde
> (XEN) [  121.189520]    [<ffff82d0802af1fa>] F arch/x86/time.c#time_calibration+0x82/0x89
> (XEN) [  121.197403]    [<ffff82d080245f4d>] F common/timer.c#execute_timer+0x49/0x64
> (XEN) [  121.204951]    [<ffff82d080246781>] F common/timer.c#timer_softirq_action+0x116/0x24e
> (XEN) [  121.213271]    [<ffff82d0802425a5>] F common/softirq.c#__do_softirq+0x85/0x90
> (XEN) [  121.220890]    [<ffff82d0802425e5>] F process_pending_softirqs+0x35/0x37
> (XEN) [  121.228086]    [<ffff82d08022bc0f>] F common/rcupdate.c#rcu_process_callbacks+0x1ef/0x20d
> (XEN) [  121.236758]    [<ffff82d0802425a5>] F common/softirq.c#__do_softirq+0x85/0x90
> (XEN) [  121.244378]    [<ffff82d0802425e5>] F process_pending_softirqs+0x35/0x37
> (XEN) [  121.251568]    [<ffff82d08022b91b>] F rcu_barrier+0x58/0x6e
> (XEN) [  121.257639]    [<ffff82d0802addb1>] F cpu_down_helper+0x11/0x32
> (XEN) [  121.264051]    [<ffff82d0802ae086>] F arch/x86/sysctl.c#smt_up_down_helper+0x1d6/0x1fe
> (XEN) [  121.272454]    [<ffff82d08020878d>] F common/domain.c#continue_hypercall_tasklet_handler+0x54/0xb8
> (XEN) [  121.281900]    [<ffff82d0802454e6>] F common/tasklet.c#do_tasklet_work+0x81/0xb4
> (XEN) [  121.289786]    [<ffff82d080245803>] F do_tasklet+0x58/0x85
> (XEN) [  121.295771]    [<ffff82d08027a0b4>] F arch/x86/domain.c#idle_loop+0x87/0xcb
> 
> So it's not in get_cpu_maps() loop. It seems to me it's not entering time sync for some
> reason.

Interesting. Looking further into that.

At least time_calibration() is missing a call to get_cpu_maps().
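
One possible shape of such a fix (only a sketch, untested; assuming
time_calibration() in arch/x86/time.c keeps its current structure):

    static void time_calibration(void *unused)
    {
        if ( !get_cpu_maps() )
        {
            /* Don't rendezvous with cpus being brought up or down. */
            return;
        }

        ... /* existing on_selected_cpus() rendezvous with cpu_online_map */

        put_cpu_maps();
    }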


Juergen


* Re: [Xen-devel] [PATCH v2 3/4] xen: add process_pending_softirqs_norcu() for keyhandlers
From: Roger Pau Monné @ 2020-02-24 11:25 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jun Nakajima,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jan Beulich, xen-devel

On Tue, Feb 18, 2020 at 01:21:13PM +0100, Juergen Gross wrote:
> Some keyhandlers are calling process_pending_softirqs() while holding
> a rcu_read_lock(). This is wrong, as process_pending_softirqs() might
> activate rcu calls which should not happen inside a rcu_read_lock().

It might be helpful to turn the ASSERT in process_pending_softirqs
into ASSERT_NOT_IN_ATOMIC also, as it would catch such misuses
AFAICT.

> 
> For that purpose add process_pending_softirqs_norcu() which will not
> do any rcu activity and use this for keyhandlers.

I wonder if for keyhandlers it might be easier to just disable the
watchdog in handle_keypress and remove the softirq processing from the
handlers.

At the end of the day we want the keyhandlers to run as fast as possible in
order to get the data out, and we only care about the watchdog not
triggering? (maybe I'm missing something here)

> +void process_pending_softirqs_norcu(void)
> +{
> +    ASSERT(!in_irq() && local_irq_is_enabled());
> +    /* Do not enter scheduler as it can preempt the calling context. */
> +    __do_softirq((1ul << SCHEDULE_SOFTIRQ) | (1ul << SCHED_SLAVE_SOFTIRQ),

Don't you also need to pass RCU_SOFTIRQ to the ignore mask in order to
avoid any RCU work happening?

Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 4/4] xen/rcu: add assertions to debug build
From: Roger Pau Monné @ 2020-02-24 11:31 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Jan Beulich,
	xen-devel

On Tue, Feb 18, 2020 at 01:21:14PM +0100, Juergen Gross wrote:
> Xen's RCU implementation relies on no softirq handling taking place
> while being in a RCU critical section. Add ASSERT()s in debug builds
> in order to catch any violations.
> 
> For that purpose modify rcu_read_[un]lock() to use a dedicated percpu
> counter instead of preempt_[en|dis]able() as this enables to test
> that condition in __do_softirq() (ASSERT_NOT_IN_ATOMIC() is not
> usable there due to __cpu_up() calling process_pending_softirqs()
> while holding the cpu hotplug lock).
> 
> Dropping the now no longer needed #include of preempt.h in rcupdate.h
> requires adding it in some sources.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>  xen/common/multicall.c     |  1 +
>  xen/common/rcupdate.c      |  4 ++++
>  xen/common/softirq.c       |  2 ++
>  xen/common/wait.c          |  1 +
>  xen/include/xen/rcupdate.h | 21 +++++++++++++++++----
>  5 files changed, 25 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/common/multicall.c b/xen/common/multicall.c
> index 5a199ebf8f..67f1a23485 100644
> --- a/xen/common/multicall.c
> +++ b/xen/common/multicall.c
> @@ -10,6 +10,7 @@
>  #include <xen/multicall.h>
>  #include <xen/guest_access.h>
>  #include <xen/perfc.h>
> +#include <xen/preempt.h>
>  #include <xen/trace.h>
>  #include <asm/current.h>
>  #include <asm/hardirq.h>
> diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
> index e6add0b120..b03f4b44d9 100644
> --- a/xen/common/rcupdate.c
> +++ b/xen/common/rcupdate.c
> @@ -46,6 +46,10 @@
>  #include <xen/cpu.h>
>  #include <xen/stop_machine.h>
>  
> +#ifndef NDEBUG
> +DEFINE_PER_CPU(unsigned int, rcu_lock_cnt);
> +#endif
> +
>  /* Global control variables for rcupdate callback mechanism. */
>  static struct rcu_ctrlblk {
>      long cur;           /* Current batch number.                      */
> diff --git a/xen/common/softirq.c b/xen/common/softirq.c
> index 3fe75ca3e8..18be8db0c6 100644
> --- a/xen/common/softirq.c
> +++ b/xen/common/softirq.c
> @@ -30,6 +30,8 @@ static void __do_softirq(unsigned long ignore_mask, bool rcu_allowed)
>      unsigned int i, cpu;
>      unsigned long pending;
>  
> +    ASSERT(!rcu_allowed || rcu_quiesce_allowed());
> +
>      for ( ; ; )
>      {
>          /*
> diff --git a/xen/common/wait.c b/xen/common/wait.c
> index 24716e7676..9cdb174036 100644
> --- a/xen/common/wait.c
> +++ b/xen/common/wait.c
> @@ -19,6 +19,7 @@
>   * along with this program; If not, see <http://www.gnu.org/licenses/>.
>   */
>  
> +#include <xen/preempt.h>
>  #include <xen/sched.h>
>  #include <xen/softirq.h>
>  #include <xen/wait.h>
> diff --git a/xen/include/xen/rcupdate.h b/xen/include/xen/rcupdate.h
> index 87f35b7704..a5ee7fec2b 100644
> --- a/xen/include/xen/rcupdate.h
> +++ b/xen/include/xen/rcupdate.h
> @@ -34,10 +34,23 @@
>  #include <xen/cache.h>
>  #include <xen/spinlock.h>
>  #include <xen/cpumask.h>
> -#include <xen/preempt.h>
> +#include <xen/percpu.h>
>  
>  #define __rcu
>  
> +#ifndef NDEBUG
> +DECLARE_PER_CPU(unsigned int, rcu_lock_cnt);
> +
> +#define rcu_quiesce_disable() (this_cpu(rcu_lock_cnt))++
> +#define rcu_quiesce_enable()  (this_cpu(rcu_lock_cnt))--

I think you need a barrier here like it's currently used in
preempt_{enabled/disable}, or use arch_lock_{acquire/release}_barrier
which would be better IMO.

> +#define rcu_quiesce_allowed() (!this_cpu(rcu_lock_cnt))

ASSERT_NOT_IN_ATOMIC should be expanded to also assert
!this_cpu(rcu_lock_cnt), or else missing pairs of
rcu_read_{lock/unlock} would be undetected.

Thanks, Roger.
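
[Illustration: a minimal sketch of the first suggestion, mirroring how
Xen's preempt_{dis,en}able() pair the counter update with a compiler
barrier(); whether barrier() or arch_lock_{acquire,release}_barrier()
is the better primitive is exactly the open question above:

    #ifndef NDEBUG
    DECLARE_PER_CPU(unsigned int, rcu_lock_cnt);

    /* Keep the compiler from moving critical-section accesses across the update. */
    #define rcu_quiesce_disable() do { this_cpu(rcu_lock_cnt)++; barrier(); } while ( 0 )
    #define rcu_quiesce_enable()  do { barrier(); this_cpu(rcu_lock_cnt)--; } while ( 0 )
    #define rcu_quiesce_allowed() (!this_cpu(rcu_lock_cnt))
    #endif
]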

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v2 3/4] xen: add process_pending_softirqs_norcu() for keyhandlers
  2020-02-24 11:25   ` Roger Pau Monné
@ 2020-02-24 11:44     ` Jürgen Groß
  2020-02-24 12:02       ` Roger Pau Monné
  0 siblings, 1 reply; 27+ messages in thread
From: Jürgen Groß @ 2020-02-24 11:44 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jun Nakajima,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jan Beulich, xen-devel

On 24.02.20 12:25, Roger Pau Monné wrote:
> On Tue, Feb 18, 2020 at 01:21:13PM +0100, Juergen Gross wrote:
>> Some keyhandlers are calling process_pending_softirqs() while holding
>> a rcu_read_lock(). This is wrong, as process_pending_softirqs() might
>> activate rcu calls which should not happen inside a rcu_read_lock().
> 
> It might be helpful to turn the ASSERT in process_pending_softirqs
into ASSERT_NOT_IN_ATOMIC also, as it would catch such misuses
> AFAICT.

No, this would trigger in __cpu_up() at system boot.

> 
>>
>> For that purpose add process_pending_softirqs_norcu() which will not
>> do any rcu activity and use this for keyhandlers.
> 
> I wonder if for keyhandlers it might be easier to just disable the
> watchdog in handle_keypress and remove the softirq processing from the
> handlers.
> 
> At the end of the day we want the keyhandlers to run as fast as possible in
> order to get the data out, and we only care about the watchdog not
> triggering? (maybe I'm missing something here)

It is not that simple, I believe.

You'd need to be very careful that other functionality wouldn't suffer.
I'm e.g. not sure time_calibration won't lead to a hanging system then.

> 
>> +void process_pending_softirqs_norcu(void)
>> +{
>> +    ASSERT(!in_irq() && local_irq_is_enabled());
>> +    /* Do not enter scheduler as it can preempt the calling context. */
>> +    __do_softirq((1ul << SCHEDULE_SOFTIRQ) | (1ul << SCHED_SLAVE_SOFTIRQ),
> 
> Don't you also need to pass RCU_SOFTIRQ to the ignore mask in order to
> avoid any RCU work happening?

Yes, that's probably a good idea.


Juergen
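
[Illustration: with the RCU_SOFTIRQ point folded in, the helper would
look roughly as below; the second __do_softirq() argument (rcu_allowed,
false for the _norcu variant) is inferred from the series and may not
match the final patch:

    void process_pending_softirqs_norcu(void)
    {
        ASSERT(!in_irq() && local_irq_is_enabled());
        /* Do not enter scheduler as it can preempt the calling context. */
        __do_softirq((1ul << SCHEDULE_SOFTIRQ) | (1ul << SCHED_SLAVE_SOFTIRQ) |
                     (1ul << RCU_SOFTIRQ), false);
    }
]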

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v2 4/4] xen/rcu: add assertions to debug build
  2020-02-24 11:31   ` Roger Pau Monné
@ 2020-02-24 11:45     ` Jürgen Groß
  0 siblings, 0 replies; 27+ messages in thread
From: Jürgen Groß @ 2020-02-24 11:45 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Jan Beulich,
	xen-devel

On 24.02.20 12:31, Roger Pau Monné wrote:
> On Tue, Feb 18, 2020 at 01:21:14PM +0100, Juergen Gross wrote:
>> Xen's RCU implementation relies on no softirq handling taking place
>> while being in a RCU critical section. Add ASSERT()s in debug builds
>> in order to catch any violations.
>>
>> For that purpose modify rcu_read_[un]lock() to use a dedicated percpu
>> counter instead of preempt_[en|dis]able(), as this enables testing
>> that condition in __do_softirq() (ASSERT_NOT_IN_ATOMIC() is not
>> usable there due to __cpu_up() calling process_pending_softirqs()
>> while holding the cpu hotplug lock).
>>
>> Dropping the now no longer needed #include of preempt.h in rcupdate.h
>> requires adding it in some sources.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>   xen/common/multicall.c     |  1 +
>>   xen/common/rcupdate.c      |  4 ++++
>>   xen/common/softirq.c       |  2 ++
>>   xen/common/wait.c          |  1 +
>>   xen/include/xen/rcupdate.h | 21 +++++++++++++++++----
>>   5 files changed, 25 insertions(+), 4 deletions(-)
>>
>> diff --git a/xen/common/multicall.c b/xen/common/multicall.c
>> index 5a199ebf8f..67f1a23485 100644
>> --- a/xen/common/multicall.c
>> +++ b/xen/common/multicall.c
>> @@ -10,6 +10,7 @@
>>   #include <xen/multicall.h>
>>   #include <xen/guest_access.h>
>>   #include <xen/perfc.h>
>> +#include <xen/preempt.h>
>>   #include <xen/trace.h>
>>   #include <asm/current.h>
>>   #include <asm/hardirq.h>
>> diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
>> index e6add0b120..b03f4b44d9 100644
>> --- a/xen/common/rcupdate.c
>> +++ b/xen/common/rcupdate.c
>> @@ -46,6 +46,10 @@
>>   #include <xen/cpu.h>
>>   #include <xen/stop_machine.h>
>>   
>> +#ifndef NDEBUG
>> +DEFINE_PER_CPU(unsigned int, rcu_lock_cnt);
>> +#endif
>> +
>>   /* Global control variables for rcupdate callback mechanism. */
>>   static struct rcu_ctrlblk {
>>       long cur;           /* Current batch number.                      */
>> diff --git a/xen/common/softirq.c b/xen/common/softirq.c
>> index 3fe75ca3e8..18be8db0c6 100644
>> --- a/xen/common/softirq.c
>> +++ b/xen/common/softirq.c
>> @@ -30,6 +30,8 @@ static void __do_softirq(unsigned long ignore_mask, bool rcu_allowed)
>>       unsigned int i, cpu;
>>       unsigned long pending;
>>   
>> +    ASSERT(!rcu_allowed || rcu_quiesce_allowed());
>> +
>>       for ( ; ; )
>>       {
>>           /*
>> diff --git a/xen/common/wait.c b/xen/common/wait.c
>> index 24716e7676..9cdb174036 100644
>> --- a/xen/common/wait.c
>> +++ b/xen/common/wait.c
>> @@ -19,6 +19,7 @@
>>    * along with this program; If not, see <http://www.gnu.org/licenses/>.
>>    */
>>   
>> +#include <xen/preempt.h>
>>   #include <xen/sched.h>
>>   #include <xen/softirq.h>
>>   #include <xen/wait.h>
>> diff --git a/xen/include/xen/rcupdate.h b/xen/include/xen/rcupdate.h
>> index 87f35b7704..a5ee7fec2b 100644
>> --- a/xen/include/xen/rcupdate.h
>> +++ b/xen/include/xen/rcupdate.h
>> @@ -34,10 +34,23 @@
>>   #include <xen/cache.h>
>>   #include <xen/spinlock.h>
>>   #include <xen/cpumask.h>
>> -#include <xen/preempt.h>
>> +#include <xen/percpu.h>
>>   
>>   #define __rcu
>>   
>> +#ifndef NDEBUG
>> +DECLARE_PER_CPU(unsigned int, rcu_lock_cnt);
>> +
>> +#define rcu_quiesce_disable() (this_cpu(rcu_lock_cnt))++
>> +#define rcu_quiesce_enable()  (this_cpu(rcu_lock_cnt))--
> 
> I think you need a barrier here like it's currently used in
> preempt_{enabled/disable}, or use arch_lock_{acquire/release}_barrier
> which would be better IMO.

Thanks, will do that.

> 
>> +#define rcu_quiesce_allowed() (!this_cpu(rcu_lock_cnt))
> 
> ASSERT_NOT_IN_ATOMIC should be expanded to also assert
> !this_cpu(rcu_lock_cnt), or else missing pairs of
> rcu_read_{lock/unlock} would be undetected.

Good idea.


Juergen
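
[Illustration: the ASSERT_NOT_IN_ATOMIC extension agreed here would
amount to roughly the following, assuming the existing definition in
xen/common/preempt.c; a sketch, not the committed change:

    #ifndef NDEBUG
    void ASSERT_NOT_IN_ATOMIC(void)
    {
        ASSERT(!preempt_count());
        ASSERT(!in_irq());
        ASSERT(local_irq_is_enabled());
        ASSERT(rcu_quiesce_allowed()); /* catch unpaired rcu_read_lock() */
    }
    #endif
]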

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v2 3/4] xen: add process_pending_softirqs_norcu() for keyhandlers
  2020-02-24 11:44     ` Jürgen Groß
@ 2020-02-24 12:02       ` Roger Pau Monné
  0 siblings, 0 replies; 27+ messages in thread
From: Roger Pau Monné @ 2020-02-24 12:02 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jun Nakajima,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jan Beulich, xen-devel

On Mon, Feb 24, 2020 at 12:44:48PM +0100, Jürgen Groß wrote:
> On 24.02.20 12:25, Roger Pau Monné wrote:
> > On Tue, Feb 18, 2020 at 01:21:13PM +0100, Juergen Gross wrote:
> > > Some keyhandlers are calling process_pending_softirqs() while holding
> > > a rcu_read_lock(). This is wrong, as process_pending_softirqs() might
> > > activate rcu calls which should not happen inside a rcu_read_lock().
> > 
> > It might be helpful to turn the ASSERT in process_pending_softirqs
> > into ASSERT_NOT_IN_ATOMIC also, as it would catch such misuses
> > AFAICT.
> 
> No, this would trigger in __cpu_up() at system boot.

Yes, saw that in the next patch.

> > 
> > > 
> > > For that purpose add process_pending_softirqs_norcu() which will not
> > > do any rcu activity and use this for keyhandlers.
> > 
> > I wonder if for keyhandlers it might be easier to just disable the
> > watchdog in handle_keypress and remove the softirq processing from the
> > handlers.
> > 
> > At the end of the day we want the keyhandlers to run as fast as possible in
> > order to get the data out, and we only care about the watchdog not
> > triggering? (maybe I'm missing something here)
> 
> It is not that simple, I believe.
> 
> You'd need to be very careful that other functionality wouldn't suffer.
> I'm e.g. not sure time_calibration won't lead to a hanging system then.

AFAICT time_calibration is used to sync the timestamps of the various
CPUs so that they don't drift too much, but I don't think not
executing it could lead to a hang; it would lead to (bigger) skews
between CPUs, and such skews happen anyway.

Thanks, Roger.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
  2020-02-23 14:14       ` Jürgen Groß
@ 2020-02-27 15:16         ` Igor Druzhinin
  2020-02-27 15:21           ` Jürgen Groß
  2020-02-28  7:10           ` Jürgen Groß
  0 siblings, 2 replies; 27+ messages in thread
From: Igor Druzhinin @ 2020-02-27 15:16 UTC (permalink / raw)
  To: Jürgen Groß, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jan Beulich,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Roger Pau Monné

On 23/02/2020 14:14, Jürgen Groß wrote:
> On 22.02.20 17:42, Igor Druzhinin wrote:
>> (XEN) [  120.891143] *** Dumping CPU0 host state: ***
>> (XEN) [  120.895909] ----[ Xen-4.13.0  x86_64  debug=y   Not tainted ]----
>> (XEN) [  120.902487] CPU:    0
>> (XEN) [  120.905269] RIP:    e008:[<ffff82d0802aa750>] smp_send_call_function_mask+0x40/0x43
>> (XEN) [  120.913415] RFLAGS: 0000000000000286   CONTEXT: hypervisor
>> (XEN) [  120.919389] rax: 0000000000000000   rbx: ffff82d0805ddb78   rcx: 0000000000000001
>> (XEN) [  120.927362] rdx: ffff82d0805cdb00   rsi: ffff82d0805c7cd8   rdi: 0000000000000007
>> (XEN) [  120.935341] rbp: ffff8300920bfbc0   rsp: ffff8300920bfbb8   r8:  000000000000003b
>> (XEN) [  120.943310] r9:  0444444444444432   r10: 3333333333333333   r11: 0000000000000001
>> (XEN) [  120.951282] r12: ffff82d0805ddb78   r13: 0000000000000001   r14: ffff8300920bfc18
>> (XEN) [  120.959251] r15: ffff82d0802af646   cr0: 000000008005003b   cr4: 00000000003506e0
>> (XEN) [  120.967223] cr3: 00000000920b0000   cr2: ffff88820dffe7f8
>> (XEN) [  120.973125] fsb: 0000000000000000   gsb: ffff88821e3c0000   gss: 0000000000000000
>> (XEN) [  120.981094] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
>> (XEN) [  120.988548] Xen code around <ffff82d0802aa750> (smp_send_call_function_mask+0x40/0x43):
>> (XEN) [  120.997037]  85 f9 ff fb 48 83 c4 08 <5b> 5d c3 9c 58 f6 c4 02 74 02 0f 0b 55 48 89 e5
>> (XEN) [  121.005442] Xen stack trace from rsp=ffff8300920bfbb8:
>> (XEN) [  121.011080]    ffff8300920bfc18 ffff8300920bfc00 ffff82d080242c84 ffff82d080389845
>> (XEN) [  121.019145]    ffff8300920bfc18 ffff82d0802af178 0000000000000000 0000001c1d27aff8
>> (XEN) [  121.027200]    0000000000000000 ffff8300920bfc80 ffff82d0802af1fa ffff82d080289adf
>> (XEN) [  121.035255]    fffffffffffffd55 0000000000000000 0000000000000000 0000000000000000
>> (XEN) [  121.043320]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> (XEN) [  121.051375]    000000000000003b 0000001c25e54bf1 0000000000000000 ffff8300920bfc80
>> (XEN) [  121.059443]    ffff82d0805c7300 ffff8300920bfcb0 ffff82d080245f4d ffff82d0802af4a2
>> (XEN) [  121.067498]    ffff82d0805c7300 ffff83042bb24f60 ffff82d08060f400 ffff8300920bfd00
>> (XEN) [  121.075553]    ffff82d080246781 ffff82d0805cdb00 ffff8300920bfd80 ffff82d0805c7040
>> (XEN) [  121.083621]    ffff82d0805cdb00 ffff82d0805cdb00 fffffffffffffff9 ffff8300920bffff
>> (XEN) [  121.091674]    0000000000000000 ffff8300920bfd30 ffff82d0802425a5 ffff82d0805c7040
>> (XEN) [  121.099739]    ffff82d0805cdb00 fffffffffffffff9 ffff8300920bffff ffff8300920bfd40
>> (XEN) [  121.107797]    ffff82d0802425e5 ffff8300920bfd80 ffff82d08022bc0f 0000000000000000
>> (XEN) [  121.115852]    ffff82d08022b600 ffff82d0804b3888 ffff82d0805cdb00 ffff82d0805cdb00
>> (XEN) [  121.123917]    fffffffffffffff9 ffff8300920bfdb0 ffff82d0802425a5 0000000000000003
>> (XEN) [  121.131975]    0000000000000001 00000000ffffffef ffff8300920bffff ffff8300920bfdc0
>> (XEN) [  121.140037]    ffff82d0802425e5 ffff8300920bfdd0 ffff82d08022b91b ffff8300920bfdf0
>> (XEN) [  121.148093]    ffff82d0802addb1 ffff83042b3b0000 0000000000000003 ffff8300920bfe30
>> (XEN) [  121.156150]    ffff82d0802ae086 ffff8300920bfe10 ffff83042b7e81e0 ffff83042b3b0000
>> (XEN) [  121.164216]    0000000000000000 0000000000000000 0000000000000000 ffff8300920bfe50
>> (XEN) [  121.172271] Xen call trace:
>> (XEN) [  121.175573]    [<ffff82d0802aa750>] R smp_send_call_function_mask+0x40/0x43
>> (XEN) [  121.183024]    [<ffff82d080242c84>] F on_selected_cpus+0xa4/0xde
>> (XEN) [  121.189520]    [<ffff82d0802af1fa>] F arch/x86/time.c#time_calibration+0x82/0x89
>> (XEN) [  121.197403]    [<ffff82d080245f4d>] F common/timer.c#execute_timer+0x49/0x64
>> (XEN) [  121.204951]    [<ffff82d080246781>] F common/timer.c#timer_softirq_action+0x116/0x24e
>> (XEN) [  121.213271]    [<ffff82d0802425a5>] F common/softirq.c#__do_softirq+0x85/0x90
>> (XEN) [  121.220890]    [<ffff82d0802425e5>] F process_pending_softirqs+0x35/0x37
>> (XEN) [  121.228086]    [<ffff82d08022bc0f>] F common/rcupdate.c#rcu_process_callbacks+0x1ef/0x20d
>> (XEN) [  121.236758]    [<ffff82d0802425a5>] F common/softirq.c#__do_softirq+0x85/0x90
>> (XEN) [  121.244378]    [<ffff82d0802425e5>] F process_pending_softirqs+0x35/0x37
>> (XEN) [  121.251568]    [<ffff82d08022b91b>] F rcu_barrier+0x58/0x6e
>> (XEN) [  121.257639]    [<ffff82d0802addb1>] F cpu_down_helper+0x11/0x32
>> (XEN) [  121.264051]    [<ffff82d0802ae086>] F arch/x86/sysctl.c#smt_up_down_helper+0x1d6/0x1fe
>> (XEN) [  121.272454]    [<ffff82d08020878d>] F common/domain.c#continue_hypercall_tasklet_handler+0x54/0xb8
>> (XEN) [  121.281900]    [<ffff82d0802454e6>] F common/tasklet.c#do_tasklet_work+0x81/0xb4
>> (XEN) [  121.289786]    [<ffff82d080245803>] F do_tasklet+0x58/0x85
>> (XEN) [  121.295771]    [<ffff82d08027a0b4>] F arch/x86/domain.c#idle_loop+0x87/0xcb
>>
>> So it's not in the get_cpu_maps() loop. It seems to me it's not entering the time sync for some
>> reason.
> 
> Interesting. Looking further into that.
> 
> At least time_calibration() fails to call get_cpu_maps().

I debugged this issue and the following fixes it:

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index ccf2ec6..36d98a4 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -153,6 +153,7 @@ static int rsinterval = 1000;
  * multiple times.
  */
 static atomic_t cpu_count = ATOMIC_INIT(0);
+static atomic_t done_count = ATOMIC_INIT(0);
 
 static void rcu_barrier_callback(struct rcu_head *head)
 {
@@ -175,6 +176,8 @@ static void rcu_barrier_action(void)
         process_pending_softirqs();
         cpu_relax();
     }
+
+    atomic_dec(&done_count);
 }
 
 void rcu_barrier(void)
@@ -194,10 +197,11 @@ void rcu_barrier(void)
     if ( !initial )
     {
         atomic_set(&cpu_count, num_online_cpus());
+        atomic_set(&done_count, num_online_cpus());
         cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
     }
 
-    while ( atomic_read(&cpu_count) )
+    while ( atomic_read(&done_count) )
     {
         process_pending_softirqs();
         cpu_relax();

Is there anything else that blocks v3 currently?

Igor

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
  2020-02-27 15:16         ` Igor Druzhinin
@ 2020-02-27 15:21           ` Jürgen Groß
  2020-02-28  7:10           ` Jürgen Groß
  1 sibling, 0 replies; 27+ messages in thread
From: Jürgen Groß @ 2020-02-27 15:21 UTC (permalink / raw)
  To: Igor Druzhinin, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jan Beulich,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Roger Pau Monné

On 27.02.20 16:16, Igor Druzhinin wrote:
> On 23/02/2020 14:14, Jürgen Groß wrote:
>> On 22.02.20 17:42, Igor Druzhinin wrote:
>>> [register dump and stack trace snipped -- quoted in full earlier in the thread]
>>>
>>> So it's not in the get_cpu_maps() loop. It seems to me it's not entering the time sync for some
>>> reason.
>>
>> Interesting. Looking further into that.
>>
>> At least time_calibration() fails to call get_cpu_maps().
> 
> I debugged this issue and the following fixes it:
> 
> [patch snipped -- quoted in full above]
> 
> Is there anything else that blocks v3 currently?

Thanks for the work!

I'll send V3 probably tomorrow.


Juergen

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
  2020-02-27 15:16         ` Igor Druzhinin
  2020-02-27 15:21           ` Jürgen Groß
@ 2020-02-28  7:10           ` Jürgen Groß
  2020-03-02 13:25             ` Igor Druzhinin
  1 sibling, 1 reply; 27+ messages in thread
From: Jürgen Groß @ 2020-02-28  7:10 UTC (permalink / raw)
  To: Igor Druzhinin, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jan Beulich,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Roger Pau Monné

On 27.02.20 16:16, Igor Druzhinin wrote:
> On 23/02/2020 14:14, Jürgen Groß wrote:
>> On 22.02.20 17:42, Igor Druzhinin wrote:
>>> [register dump and stack trace snipped -- quoted in full earlier in the thread]
>>>
>>> So it's not in the get_cpu_maps() loop. It seems to me it's not entering the time sync for some
>>> reason.
>>
>> Interesting. Looking further into that.
>>
>> At least time_calibration() fails to call get_cpu_maps().
> 
> I debugged this issue and the following fixes it:
> 
> diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
> index ccf2ec6..36d98a4 100644
> --- a/xen/common/rcupdate.c
> +++ b/xen/common/rcupdate.c
> @@ -153,6 +153,7 @@ static int rsinterval = 1000;
>    * multiple times.
>    */
>   static atomic_t cpu_count = ATOMIC_INIT(0);
> +static atomic_t done_count = ATOMIC_INIT(0);
>   
>   static void rcu_barrier_callback(struct rcu_head *head)
>   {
> @@ -175,6 +176,8 @@ static void rcu_barrier_action(void)
>           process_pending_softirqs();
>           cpu_relax();
>       }
> +
> +    atomic_dec(&done_count);
>   }
>   
>   void rcu_barrier(void)
> @@ -194,10 +197,11 @@ void rcu_barrier(void)
>       if ( !initial )
>       {
>           atomic_set(&cpu_count, num_online_cpus());
> +        atomic_set(&done_count, num_online_cpus());
>           cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
>       }
>   
> -    while ( atomic_read(&cpu_count) )
> +    while ( atomic_read(&done_count) )
>       {
>           process_pending_softirqs();
>           cpu_relax();

I think you are just narrowing the window of the race:

It is still possible to have two cpus entering rcu_barrier() and to
make it into the if ( !initial ) clause.

Instead of introducing another atomic I believe the following patch
instead of yours should do it:

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index e6add0b120..0d5469a326 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -180,23 +180,17 @@ static void rcu_barrier_action(void)

  void rcu_barrier(void)
  {
-    int initial = atomic_read(&cpu_count);
-
      while ( !get_cpu_maps() )
      {
          process_pending_softirqs();
-        if ( initial && !atomic_read(&cpu_count) )
+        if ( !atomic_read(&cpu_count) )
              return;

          cpu_relax();
-        initial = atomic_read(&cpu_count);
      }

-    if ( !initial )
-    {
-        atomic_set(&cpu_count, num_online_cpus());
+    if ( atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) == 0 )
          cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
-    }

      while ( atomic_read(&cpu_count) )
      {

Could you give that a try, please?


Juergen
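
[Illustration: the initiation race the cmpxchg closes, as a
hypothetical interleaving of two concurrent rcu_barrier() callers under
the old "initial" logic:

    /*
     * CPU0: initial = atomic_read(&cpu_count);   -> reads 0
     * CPU1: initial = atomic_read(&cpu_count);   -> also reads 0
     * CPU0: atomic_set(&cpu_count, N); raises RCU_SOFTIRQ
     * CPU1: atomic_set(&cpu_count, N); raises RCU_SOFTIRQ again,
     *       clobbering the accounting of CPU0's in-flight barrier.
     *
     * atomic_cmpxchg(&cpu_count, 0, N) lets only one caller move the
     * counter from 0 to N; the loser simply joins the wait loop.
     */
]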

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
  2020-02-28  7:10           ` Jürgen Groß
@ 2020-03-02 13:25             ` Igor Druzhinin
  2020-03-02 14:03               ` Jürgen Groß
  0 siblings, 1 reply; 27+ messages in thread
From: Igor Druzhinin @ 2020-03-02 13:25 UTC (permalink / raw)
  To: Jürgen Groß, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jan Beulich,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Roger Pau Monné

On 28/02/2020 07:10, Jürgen Groß wrote:
> 
> I think you are just narrowing the window of the race:
> 
> It is still possible to have two cpus entering rcu_barrier() and to
> make it into the if ( !initial ) clause.
> 
> Instead of introducing another atomic I believe the following patch
> instead of yours should do it:
> 
> diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
> index e6add0b120..0d5469a326 100644
> --- a/xen/common/rcupdate.c
> +++ b/xen/common/rcupdate.c
> @@ -180,23 +180,17 @@ static void rcu_barrier_action(void)
> 
>  void rcu_barrier(void)
>  {
> -    int initial = atomic_read(&cpu_count);
> -
>      while ( !get_cpu_maps() )
>      {
>          process_pending_softirqs();
> -        if ( initial && !atomic_read(&cpu_count) )
> +        if ( !atomic_read(&cpu_count) )
>              return;
> 
>          cpu_relax();
> -        initial = atomic_read(&cpu_count);
>      }
> 
> -    if ( !initial )
> -    {
> -        atomic_set(&cpu_count, num_online_cpus());
> +    if ( atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) == 0 )
>          cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
> -    }
> 
>      while ( atomic_read(&cpu_count) )
>      {
> 
> Could you give that a try, please?

With this patch I cannot disable SMT at all.

The problem that my diff solved was a race between 2 consecutive
rcu_barrier operations on CPU0 (the pattern specific to SMT-on/off
operation) where some CPUs didn't exit the cpu_count checking loop
completely but cpu_count is already reinitialized on CPU0 - this
results in some CPUs being stuck in the loop.

Igor
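
[Illustration: the back-to-back case described above, roughly:

    /*
     * CPU0 (initiator)                 CPU1 (in rcu_barrier_action())
     * ----------------                 ------------------------------
     * barrier #1: sees cpu_count == 0,
     * returns, starts barrier #2:
     * cpu_count = num_online_cpus()
     *                                  still polling cpu_count in its
     *                                  exit loop -> sees the new
     *                                  non-zero value and spins on a
     *                                  barrier it never queued a
     *                                  callback for
     *
     * done_count is only decremented after a cpu has fully left
     * rcu_barrier_action(), so waiting on it keeps barrier #2 from
     * starting while a slave is still inside barrier #1.
     */
]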

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
  2020-03-02 13:25             ` Igor Druzhinin
@ 2020-03-02 14:03               ` Jürgen Groß
  2020-03-02 14:23                 ` Igor Druzhinin
  0 siblings, 1 reply; 27+ messages in thread
From: Jürgen Groß @ 2020-03-02 14:03 UTC (permalink / raw)
  To: Igor Druzhinin, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jan Beulich,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Roger Pau Monné

[-- Attachment #1: Type: text/plain, Size: 1915 bytes --]

On 02.03.20 14:25, Igor Druzhinin wrote:
> On 28/02/2020 07:10, Jürgen Groß wrote:
>>
>> I think you are just narrowing the window of the race:
>>
>> It is still possible to have two cpus entering rcu_barrier() and to
>> make it into the if ( !initial ) clause.
>>
>> Instead of introducing another atomic I believe the following patch
>> instead of yours should do it:
>>
>> diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
>> index e6add0b120..0d5469a326 100644
>> --- a/xen/common/rcupdate.c
>> +++ b/xen/common/rcupdate.c
>> @@ -180,23 +180,17 @@ static void rcu_barrier_action(void)
>>
>>   void rcu_barrier(void)
>>   {
>> -    int initial = atomic_read(&cpu_count);
>> -
>>       while ( !get_cpu_maps() )
>>       {
>>           process_pending_softirqs();
>> -        if ( initial && !atomic_read(&cpu_count) )
>> +        if ( !atomic_read(&cpu_count) )
>>               return;
>>
>>           cpu_relax();
>> -        initial = atomic_read(&cpu_count);
>>       }
>>
>> -    if ( !initial )
>> -    {
>> -        atomic_set(&cpu_count, num_online_cpus());
>> +    if ( atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) == 0 )
>>           cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
>> -    }
>>
>>       while ( atomic_read(&cpu_count) )
>>       {
>>
>> Could you give that a try, please?
> 
> With this patch I cannot disable SMT at all.
> 
> The problem that my diff solved was a race between 2 consecutive
> rcu_barrier operations on CPU0 (the pattern specific to SMT-on/off
> operation) where some CPUs didn't exit the cpu_count checking loop
> completely but cpu_count is already reinitialized on CPU0 - this
> results in some CPUs being stuck in the loop.

Ah, okay, then I believe a combination of the two patches is needed.

Something like the attached version?


Juergen

[-- Attachment #2: 0002-xen-rcu-don-t-use-stop_machine_run-for-rcu_barrier.patch --]
[-- Type: text/x-patch, Size: 4793 bytes --]

From 560ecf8ca947b16aa5af7978905ace51965167e2 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wl@xen.org>
Date: Mon, 17 Feb 2020 06:58:49 +0100
Subject: [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

Today rcu_barrier() is calling stop_machine_run() to synchronize all
physical cpus in order to ensure all pending rcu calls have finished
when returning.

As stop_machine_run() is using tasklets, this requires scheduling of
idle vcpus on all cpus, imposing the need to call rcu_barrier() on idle
cpus only in case of core scheduling being active, as otherwise a
scheduling deadlock would occur.

There is no need at all to do the syncing of the cpus in tasklets, as
rcu activity is started in __do_softirq(), called whenever softirq
activity is allowed. So rcu_barrier() can easily be modified to use
softirq for synchronization of the cpus, no longer requiring any
scheduling activity.

As there already is a rcu softirq, reuse that for the synchronization.

Finally switch rcu_barrier() to return void as it now can never fail.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/rcupdate.c      | 49 ++++++++++++++++++++++++++--------------------
 xen/include/xen/rcupdate.h |  2 +-
 2 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index 079ea9d8a1..1f02a804e3 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -143,47 +143,51 @@ static int qhimark = 10000;
 static int qlowmark = 100;
 static int rsinterval = 1000;
 
-struct rcu_barrier_data {
-    struct rcu_head head;
-    atomic_t *cpu_count;
-};
+/*
+ * rcu_barrier() handling:
+ * cpu_count holds the number of cpus required to finish barrier handling.
+ * Cpus are synchronized via the softirq mechanism. rcu_barrier() is regarded
+ * as active if cpu_count is not zero. In case rcu_barrier() is called on
+ * multiple cpus it is enough to check for cpu_count being not zero on entry
+ * and to call process_pending_softirqs() in a loop until cpu_count drops to
+ * zero, as syncing has been requested already and we don't need to sync
+ * multiple times.
+ */
+static atomic_t cpu_count = ATOMIC_INIT(0);
 
 static void rcu_barrier_callback(struct rcu_head *head)
 {
-    struct rcu_barrier_data *data = container_of(
-        head, struct rcu_barrier_data, head);
-    atomic_inc(data->cpu_count);
+    atomic_dec(&cpu_count);
 }
 
-static int rcu_barrier_action(void *_cpu_count)
+static void rcu_barrier_action(void)
 {
-    struct rcu_barrier_data data = { .cpu_count = _cpu_count };
-
-    ASSERT(!local_irq_is_enabled());
-    local_irq_enable();
+    struct rcu_head head;
 
     /*
      * When callback is executed, all previously-queued RCU work on this CPU
      * is completed. When all CPUs have executed their callback, data.cpu_count
      * will have been incremented to include every online CPU.
      */
-    call_rcu(&data.head, rcu_barrier_callback);
+    call_rcu(&head, rcu_barrier_callback);
 
-    while ( atomic_read(data.cpu_count) != num_online_cpus() )
+    while ( atomic_read(&cpu_count) )
     {
         process_pending_softirqs();
         cpu_relax();
     }
-
-    local_irq_disable();
-
-    return 0;
 }
 
-int rcu_barrier(void)
+void rcu_barrier(void)
 {
-    atomic_t cpu_count = ATOMIC_INIT(0);
-    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
+    if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )
+        cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
+
+    while ( atomic_read(&cpu_count) )
+    {
+        process_pending_softirqs();
+        cpu_relax();
+    }
 }
 
 /* Is batch a before batch b ? */
@@ -422,6 +426,9 @@ static void rcu_process_callbacks(void)
         rdp->process_callbacks = false;
         __rcu_process_callbacks(&rcu_ctrlblk, rdp);
     }
+
+    if ( atomic_read(&cpu_count) )
+        rcu_barrier_action();
 }
 
 static int __rcu_pending(struct rcu_ctrlblk *rcp, struct rcu_data *rdp)
diff --git a/xen/include/xen/rcupdate.h b/xen/include/xen/rcupdate.h
index 174d058113..87f35b7704 100644
--- a/xen/include/xen/rcupdate.h
+++ b/xen/include/xen/rcupdate.h
@@ -143,7 +143,7 @@ void rcu_check_callbacks(int cpu);
 void call_rcu(struct rcu_head *head, 
               void (*func)(struct rcu_head *head));
 
-int rcu_barrier(void);
+void rcu_barrier(void);
 
 void rcu_idle_enter(unsigned int cpu);
 void rcu_idle_exit(unsigned int cpu);
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
  2020-03-02 14:03               ` Jürgen Groß
@ 2020-03-02 14:23                 ` Igor Druzhinin
  2020-03-02 14:32                   ` Jürgen Groß
  0 siblings, 1 reply; 27+ messages in thread
From: Igor Druzhinin @ 2020-03-02 14:23 UTC (permalink / raw)
  To: Jürgen Groß, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jan Beulich,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Roger Pau Monné

On 02/03/2020 14:03, Jürgen Groß wrote:
> On 02.03.20 14:25, Igor Druzhinin wrote:
>> On 28/02/2020 07:10, Jürgen Groß wrote:
>>>
>>> I think you are just narrowing the window of the race:
>>>
>>> It is still possible to have two cpus entering rcu_barrier() and to
>>> make it into the if ( !initial ) clause.
>>>
>>> Instead of introducing another atomic I believe the following patch
>>> instead of yours should do it:
>>>
>>> [patch snipped -- quoted in full above]
>>>
>>> Could you give that a try, please?
>>
>> With this patch I cannot disable SMT at all.
>>
>> The problem that my diff solved was a race between 2 consecutive
>> rcu_barrier operations on CPU0 (the pattern specific to SMT-on/off
>> operation) where some CPUs didn't exit the cpu_count checking loop
>> completely but cpu_count is already reinitialized on CPU0 - this
>> results in some CPUs being stuck in the loop.
> 
> Ah, okay, then I believe a combination of the two patches is needed.
> 
> Something like the attached version?

I apologise - my previous test result was from a machine booted in core mode.
I'm now testing it properly and the original patch seems to do the trick, but
I still don't understand how you can avoid the race with only 1 counter -
it's always possible that CPU1 is still in the cpu_count checking loop (even if
cpu_count is currently 0) when cpu_count is reinitialized.

I'm looking at your current version now. Was the removal of get_cpu_maps()
and recursion protection intentional? I suspect it would only work on the
latest master so I need to keep those for 4.13 testing.

Igor


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
  2020-03-02 14:23                 ` Igor Druzhinin
@ 2020-03-02 14:32                   ` Jürgen Groß
  2020-03-02 22:29                     ` Igor Druzhinin
  0 siblings, 1 reply; 27+ messages in thread
From: Jürgen Groß @ 2020-03-02 14:32 UTC (permalink / raw)
  To: Igor Druzhinin, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jan Beulich,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Roger Pau Monné

[-- Attachment #1: Type: text/plain, Size: 2831 bytes --]

On 02.03.20 15:23, Igor Druzhinin wrote:
> On 02/03/2020 14:03, Jürgen Groß wrote:
>> On 02.03.20 14:25, Igor Druzhinin wrote:
>>> On 28/02/2020 07:10, Jürgen Groß wrote:
>>>>
>>>> I think you are just narrowing the window of the race:
>>>>
>>>> It is still possible to have two cpus entering rcu_barrier() and to
>>>> make it into the if ( !initial ) clause.
>>>>
>>>> Instead of introducing another atomic I believe the following patch
>>>> instead of yours should do it:
>>>>
>>>> [patch snipped -- quoted in full above]
>>>>
>>>> Could you give that a try, please?
>>>
>>> With this patch I cannot disable SMT at all.
>>>
>>> The problem that my diff solved was a race between 2 consecutive
>>> rcu_barrier operations on CPU0 (the pattern specific to SMT-on/off
>>> operation) where some CPUs didn't exit the cpu_count checking loop
>>> completely but cpu_count is already reinitialized on CPU0 - this
>>> results in some CPUs being stuck in the loop.
>>
>> Ah, okay, then I believe a combination of the two patches is needed.
>>
>> Something like the attached version?
> 
> I apologise - my previous test result was from a machine booted in core mode.
> I'm now testing it properly and the original patch seems to do the trick, but
> I still don't understand how you can avoid the race with only 1 counter -
> it's always possible that CPU1 is still in the cpu_count checking loop (even if
> cpu_count is currently 0) when cpu_count is reinitialized.

I guess this is very very unlikely.

> I'm looking at your current version now. Was the removal of get_cpu_maps()
> and recursion protection intentional? I suspect it would only work on the
> latest master so I need to keep those for 4.13 testing.

Oh, sorry, this seems to be an old version.

Here comes the correct one.


Juergen

[-- Attachment #2: v3-0002-xen-rcu-don-t-use-stop_machine_run-for-rcu_barrie.patch --]
[-- Type: text/x-patch, Size: 6071 bytes --]

From ca740c277b2fa86e6c4d3e3dac6a8366c7898672 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Fri, 28 Feb 2020 07:43:56 +0100
Subject: [PATCH v3 2/4] xen/rcu: don't use stop_machine_run() for
 rcu_barrier()

Today rcu_barrier() is calling stop_machine_run() to synchronize all
physical cpus in order to ensure all pending rcu calls have finished
when returning.

As stop_machine_run() is using tasklets, this requires scheduling of
idle vcpus on all cpus, imposing the need to call rcu_barrier() on idle
cpus only in case of core scheduling being active, as otherwise a
scheduling deadlock would occur.

There is no need at all to do the syncing of the cpus in tasklets, as
rcu activity is started in __do_softirq(), called whenever softirq
activity is allowed. So rcu_barrier() can easily be modified to use
softirq for synchronization of the cpus, no longer requiring any
scheduling activity.

As there already is a rcu softirq, reuse that for the synchronization.

Remove the barrier element from struct rcu_data as it isn't used.

Finally switch rcu_barrier() to return void as it now can never fail.

Partially-based-on-patch-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- add recursion detection

V3:
- fix races (Igor Druzhinin)
---
 xen/common/rcupdate.c      | 85 +++++++++++++++++++++++++++++++---------------
 xen/include/xen/rcupdate.h |  2 +-
 2 files changed, 59 insertions(+), 28 deletions(-)

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index 03d84764d2..27d597bbeb 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -83,7 +83,6 @@ struct rcu_data {
     struct rcu_head **donetail;
     long            blimit;           /* Upper limit on a processed batch */
     int cpu;
-    struct rcu_head barrier;
     long            last_rs_qlen;     /* qlen during the last resched */
 
     /* 3) idle CPUs handling */
@@ -91,6 +90,7 @@ struct rcu_data {
     bool idle_timer_active;
 
     bool            process_callbacks;
+    bool            barrier_active;
 };
 
 /*
@@ -143,51 +143,75 @@ static int qhimark = 10000;
 static int qlowmark = 100;
 static int rsinterval = 1000;
 
-struct rcu_barrier_data {
-    struct rcu_head head;
-    atomic_t *cpu_count;
-};
+/*
+ * rcu_barrier() handling:
+ * cpu_count holds the number of cpus required to finish barrier handling.
+ * Cpus are synchronized via the softirq mechanism. rcu_barrier() is regarded
+ * as active if cpu_count is not zero. In case rcu_barrier() is called on
+ * multiple cpus it is enough to check for cpu_count being not zero on entry
+ * and to call process_pending_softirqs() in a loop until cpu_count drops to
+ * zero, as syncing has been requested already and we don't need to sync
+ * multiple times.
+ * In order to avoid hangs when rcu_barrier() is called multiple times on the
+ * same cpu in fast succession and a slave cpu couldn't drop out of the
+ * barrier handling fast enough, a second counter done_count is needed.
+ */
+static atomic_t cpu_count = ATOMIC_INIT(0);
+static atomic_t done_count = ATOMIC_INIT(0);
 
 static void rcu_barrier_callback(struct rcu_head *head)
 {
-    struct rcu_barrier_data *data = container_of(
-        head, struct rcu_barrier_data, head);
-    atomic_inc(data->cpu_count);
+    atomic_dec(&cpu_count);
 }
 
-static int rcu_barrier_action(void *_cpu_count)
+static void rcu_barrier_action(void)
 {
-    struct rcu_barrier_data data = { .cpu_count = _cpu_count };
-
-    ASSERT(!local_irq_is_enabled());
-    local_irq_enable();
+    struct rcu_head head;
 
     /*
      * When callback is executed, all previously-queued RCU work on this CPU
-     * is completed. When all CPUs have executed their callback, data.cpu_count
-     * will have been incremented to include every online CPU.
+     * is completed. When all CPUs have executed their callback, cpu_count
+     * will have been decremented to 0.
      */
-    call_rcu(&data.head, rcu_barrier_callback);
+    call_rcu(&head, rcu_barrier_callback);
 
-    while ( atomic_read(data.cpu_count) != num_online_cpus() )
+    while ( atomic_read(&cpu_count) )
     {
         process_pending_softirqs();
         cpu_relax();
     }
 
-    local_irq_disable();
-
-    return 0;
+    atomic_dec(&done_count);
 }
 
-/*
- * As rcu_barrier() is using stop_machine_run() it is allowed to be used in
- * idle context only (see comment for stop_machine_run()).
- */
-int rcu_barrier(void)
+void rcu_barrier(void)
 {
-    atomic_t cpu_count = ATOMIC_INIT(0);
-    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
+    unsigned int n_cpus;
+
+    while ( !get_cpu_maps() )
+    {
+        process_pending_softirqs();
+        if ( !atomic_read(&cpu_count) )
+            return;
+
+        cpu_relax();
+    }
+
+    n_cpus = num_online_cpus();
+
+    if ( atomic_cmpxchg(&cpu_count, 0, n_cpus) == 0 )
+    {
+        atomic_add(n_cpus, &done_count);
+        cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
+    }
+
+    while ( atomic_read(&done_count) )
+    {
+        process_pending_softirqs();
+        cpu_relax();
+    }
+
+    put_cpu_maps();
 }
 
 /* Is batch a before batch b ? */
@@ -426,6 +450,13 @@ static void rcu_process_callbacks(void)
         rdp->process_callbacks = false;
         __rcu_process_callbacks(&rcu_ctrlblk, rdp);
     }
+
+    if ( atomic_read(&cpu_count) && !rdp->barrier_active )
+    {
+        rdp->barrier_active = true;
+        rcu_barrier_action();
+        rdp->barrier_active = false;
+    }
 }
 
 static int __rcu_pending(struct rcu_ctrlblk *rcp, struct rcu_data *rdp)
diff --git a/xen/include/xen/rcupdate.h b/xen/include/xen/rcupdate.h
index 174d058113..87f35b7704 100644
--- a/xen/include/xen/rcupdate.h
+++ b/xen/include/xen/rcupdate.h
@@ -143,7 +143,7 @@ void rcu_check_callbacks(int cpu);
 void call_rcu(struct rcu_head *head, 
               void (*func)(struct rcu_head *head));
 
-int rcu_barrier(void);
+void rcu_barrier(void);
 
 void rcu_idle_enter(unsigned int cpu);
 void rcu_idle_exit(unsigned int cpu);
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling
  2020-03-02 14:32                   ` Jürgen Groß
@ 2020-03-02 22:29                     ` Igor Druzhinin
  0 siblings, 0 replies; 27+ messages in thread
From: Igor Druzhinin @ 2020-03-02 22:29 UTC (permalink / raw)
  To: Jürgen Groß, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Julien Grall, Jan Beulich,
	Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Roger Pau Monné

On 02/03/2020 14:32, Jürgen Groß wrote:
> On 02.03.20 15:23, Igor Druzhinin wrote:
>> On 02/03/2020 14:03, Jürgen Groß wrote:
>>> On 02.03.20 14:25, Igor Druzhinin wrote:
>>>> On 28/02/2020 07:10, Jürgen Groß wrote:
>>>>>
>>>>> I think you are just narrowing the window of the race:
>>>>>
>>>>> It is still possible for two CPUs to enter rcu_barrier() and for
>>>>> both to make it into the if ( !initial ) clause.
>>>>>
>>>>> Instead of introducing another atomic I believe the following patch
>>>>> instead of yours should do it:
>>>>>
>>>>> diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
>>>>> index e6add0b120..0d5469a326 100644
>>>>> --- a/xen/common/rcupdate.c
>>>>> +++ b/xen/common/rcupdate.c
>>>>> @@ -180,23 +180,17 @@ static void rcu_barrier_action(void)
>>>>>
>>>>>    void rcu_barrier(void)
>>>>>    {
>>>>> -    int initial = atomic_read(&cpu_count);
>>>>> -
>>>>>        while ( !get_cpu_maps() )
>>>>>        {
>>>>>            process_pending_softirqs();
>>>>> -        if ( initial && !atomic_read(&cpu_count) )
>>>>> +        if ( !atomic_read(&cpu_count) )
>>>>>                return;
>>>>>
>>>>>            cpu_relax();
>>>>> -        initial = atomic_read(&cpu_count);
>>>>>        }
>>>>>
>>>>> -    if ( !initial )
>>>>> -    {
>>>>> -        atomic_set(&cpu_count, num_online_cpus());
>>>>> +    if ( atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) == 0 )
>>>>>            cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
>>>>> -    }
>>>>>
>>>>>        while ( atomic_read(&cpu_count) )
>>>>>        {
>>>>>
>>>>> Could you give that a try, please?
>>>>
>>>> With this patch I cannot disable SMT at all.
>>>>
>>>> The problem that my diff solved was a race between two consecutive
>>>> rcu_barrier() operations on CPU0 (a pattern specific to the SMT
>>>> on/off operation) where some CPUs had not yet fully exited the
>>>> cpu_count checking loop while cpu_count was already reinitialized
>>>> on CPU0 - this left those CPUs stuck in the loop.
>>>
>>> Ah, okay, then I believe a combination of the two patches is needed.
>>>
>>> Something like the attached version?
>>
>> My apologies - my previous test result was from a machine booted in core
>> mode. I'm now testing it properly, and the original patch seems to do
>> the trick, but I still don't understand how you can avoid the race with
>> only one counter - it's always possible that CPU1 is still in the
>> cpu_count checking loop (even if cpu_count is currently 0) when
>> cpu_count is reinitialized.
> 
> I guess this is very, very unlikely.
> 
>> I'm looking at your current version now. Was the removal of get_cpu_maps()
>> and of the recursion protection intentional? I suspect it would only work
>> on the latest master, so I need to keep those for 4.13 testing.
> 
> Oh, sorry, this seems to be an old version.
> 
> Here comes the correct one.

I checked this version and I guess it should be fine for v3. However, I
wasn't able to check how well it works in core mode, as CPU offlining is
generally broken there at the moment (at least the box boots in core mode,
with rcu_barrier() called on CPU bring-up).

Igor
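
For readers following the exchange above: why the atomic_cmpxchg() variant
closes the double-initialisation window can be shown with a small
stand-alone demo (C11 atomics and pthreads; all names invented for the
example). However many threads race to claim the barrier, exactly one
compare-exchange against 0 succeeds, where a separate read-then-set pair
would let several "win":

    /* Build with: cc -pthread cmpxchg_demo.c */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    #define N_THREADS 8

    static atomic_int cpu_count;   /* 0 means "no barrier in flight" */
    static atomic_int initiators;  /* how many threads claimed the barrier */

    static void *try_initiate(void *arg)
    {
        int expected = 0;

        (void)arg;
        /* Compare against 0 and store N_THREADS as one atomic step. */
        if ( atomic_compare_exchange_strong(&cpu_count, &expected,
                                            N_THREADS) )
            atomic_fetch_add(&initiators, 1);  /* would raise the softirq */
        return NULL;
    }

    int main(void)
    {
        pthread_t t[N_THREADS];

        for ( int i = 0; i < N_THREADS; i++ )
            pthread_create(&t[i], NULL, try_initiate, NULL);
        for ( int i = 0; i < N_THREADS; i++ )
            pthread_join(t[i], NULL);

        printf("initiators: %d (always exactly 1)\n",
               atomic_load(&initiators));
        return 0;
    }

The stuck-CPU race Igor describes is a separate issue: a late CPU could
still be polling cpu_count from the previous barrier when a new initiator
re-arms it. That appears to be what the done_count handshake addresses,
since a new barrier can only start once every CPU has left the previous
cpu_count loop and decremented done_count to 0.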

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2020-03-02 22:29 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-18 12:21 [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling Juergen Gross
2020-02-18 12:21 ` [Xen-devel] [PATCH v2 1/4] xen/rcu: use rcu softirq for forcing quiescent state Juergen Gross
2020-02-21 14:17   ` Andrew Cooper
2020-02-18 12:21 ` [Xen-devel] [PATCH v2 2/4] xen/rcu: don't use stop_machine_run() for rcu_barrier() Juergen Gross
2020-02-18 12:21 ` [Xen-devel] [PATCH v2 3/4] xen: add process_pending_softirqs_norcu() for keyhandlers Juergen Gross
2020-02-24 11:25   ` Roger Pau Monné
2020-02-24 11:44     ` Jürgen Groß
2020-02-24 12:02       ` Roger Pau Monné
2020-02-18 12:21 ` [Xen-devel] [PATCH v2 4/4] xen/rcu: add assertions to debug build Juergen Gross
2020-02-24 11:31   ` Roger Pau Monné
2020-02-24 11:45     ` Jürgen Groß
2020-02-18 13:15 ` [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling Igor Druzhinin
2020-02-19 16:48   ` Igor Druzhinin
2020-02-22  2:29 ` Igor Druzhinin
2020-02-22  6:05   ` Jürgen Groß
2020-02-22 12:32     ` Julien Grall
2020-02-22 13:56       ` Jürgen Groß
2020-02-22 16:42     ` Igor Druzhinin
2020-02-23 14:14       ` Jürgen Groß
2020-02-27 15:16         ` Igor Druzhinin
2020-02-27 15:21           ` Jürgen Groß
2020-02-28  7:10           ` Jürgen Groß
2020-03-02 13:25             ` Igor Druzhinin
2020-03-02 14:03               ` Jürgen Groß
2020-03-02 14:23                 ` Igor Druzhinin
2020-03-02 14:32                   ` Jürgen Groß
2020-03-02 22:29                     ` Igor Druzhinin

This is a public inbox; see mirroring instructions
for how to clone and mirror all data and code used for this inbox,
as well as URLs for NNTP newsgroup(s).