* [Xen-devel] [PATCH v4 0/2] x86: scratch cpumask fixes/improvement
@ 2020-02-28  9:33 Roger Pau Monne
  2020-02-28  9:33 ` [Xen-devel] [PATCH v4 1/2] x86/smp: use a dedicated CPU mask in send_IPI_mask Roger Pau Monne
  2020-02-28  9:33 ` [Xen-devel] [PATCH v4 2/2] x86: add accessors for scratch cpu mask Roger Pau Monne
  0 siblings, 2 replies; 10+ messages in thread
From: Roger Pau Monne @ 2020-02-28  9:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

Hello,

The following series contains one more bugfix that removes the usage of
the scratch cpumask in send_IPI_mask, and introduces accessors to
get/put the per-CPU scratch cpumask in order to prevent such issues
from happening in the future.

Thanks, Roger.

Roger Pau Monne (2):
  x86/smp: use a dedicated CPU mask in send_IPI_mask
  x86: add accessors for scratch cpu mask

 xen/arch/x86/io_apic.c    |  6 ++++--
 xen/arch/x86/irq.c        | 14 ++++++++++----
 xen/arch/x86/mm.c         | 40 +++++++++++++++++++++++++++------------
 xen/arch/x86/msi.c        |  4 +++-
 xen/arch/x86/smp.c        | 29 +++++++++++++++++++++++++++-
 xen/arch/x86/smpboot.c    | 10 ++++++++--
 xen/include/asm-x86/smp.h | 14 ++++++++++++++
 7 files changed, 95 insertions(+), 22 deletions(-)

-- 
2.25.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Xen-devel] [PATCH v4 1/2] x86/smp: use a dedicated CPU mask in send_IPI_mask
  2020-02-28  9:33 [Xen-devel] [PATCH v4 0/2] x86: scratch cpumask fixes/improvement Roger Pau Monne
@ 2020-02-28  9:33 ` Roger Pau Monne
  2020-02-28 10:08   ` Jan Beulich
  2020-02-28  9:33 ` [Xen-devel] [PATCH v4 2/2] x86: add accessors for scratch cpu mask Roger Pau Monne
  1 sibling, 1 reply; 10+ messages in thread
From: Roger Pau Monne @ 2020-02-28  9:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

Some callers of send_IPI_mask pass the scratch cpumask as the mask
parameter, so the function itself cannot use the scratch cpumask. The
following trace, obtained with a debug patch, shows one of those
callers:

(XEN) scratch CPU mask already in use by arch/x86/mm.c#_get_page_type+0x1f9/0x1abf
(XEN) Xen BUG at smp.c:45
[...]
(XEN) Xen call trace:
(XEN)    [<ffff82d0802abb53>] R scratch_cpumask+0xd3/0xf9
(XEN)    [<ffff82d0802abc21>] F send_IPI_mask+0x72/0x1ca
(XEN)    [<ffff82d0802ac13e>] F flush_area_mask+0x10c/0x16c
(XEN)    [<ffff82d080296c56>] F arch/x86/mm.c#_get_page_type+0x3ff/0x1abf
(XEN)    [<ffff82d080298324>] F get_page_type+0xe/0x2c
(XEN)    [<ffff82d08038624f>] F pv_set_gdt+0xa1/0x2aa
(XEN)    [<ffff82d08027dfd6>] F arch_set_info_guest+0x1196/0x16ba
(XEN)    [<ffff82d080207a55>] F default_initialise_vcpu+0xc7/0xd4
(XEN)    [<ffff82d08027e55b>] F arch_initialise_vcpu+0x61/0xcd
(XEN)    [<ffff82d080207e78>] F do_vcpu_op+0x219/0x690
(XEN)    [<ffff82d08038be16>] F pv_hypercall+0x2f6/0x593
(XEN)    [<ffff82d080396432>] F lstar_enter+0x112/0x120

_get_page_type will use the scratch cpumask to call flush_tlb_mask,
which in turn calls send_IPI_mask.

Fix this by using a dedicated per-CPU cpumask in send_IPI_mask.

Fixes: 5500d265a2a8 ('x86/smp: use APIC ALLBUT destination shorthand when possible')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/smp.c     | 4 +++-
 xen/arch/x86/smpboot.c | 9 ++++++++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index 0461812cf6..072638f0f6 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -59,6 +59,8 @@ static void send_IPI_shortcut(unsigned int shortcut, int vector,
     apic_write(APIC_ICR, cfg);
 }
 
+DECLARE_PER_CPU(cpumask_var_t, send_ipi_cpumask);
+
 /*
  * send_IPI_mask(cpumask, vector): sends @vector IPI to CPUs in @cpumask,
  * excluding the local CPU. @cpumask may be empty.
@@ -67,7 +69,7 @@ static void send_IPI_shortcut(unsigned int shortcut, int vector,
 void send_IPI_mask(const cpumask_t *mask, int vector)
 {
     bool cpus_locked = false;
-    cpumask_t *scratch = this_cpu(scratch_cpumask);
+    cpumask_t *scratch = this_cpu(send_ipi_cpumask);
 
     if ( in_irq() || in_mce_handler() || in_nmi_handler() )
     {
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index ad49f2dcd7..6c548b0b53 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -57,6 +57,9 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, scratch_cpumask);
 static cpumask_t scratch_cpu0mask;
 
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, send_ipi_cpumask);
+static cpumask_t send_ipi_cpu0mask;
+
 cpumask_t cpu_online_map __read_mostly;
 EXPORT_SYMBOL(cpu_online_map);
 
@@ -930,6 +933,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove)
         FREE_CPUMASK_VAR(per_cpu(cpu_core_mask, cpu));
         if ( per_cpu(scratch_cpumask, cpu) != &scratch_cpu0mask )
             FREE_CPUMASK_VAR(per_cpu(scratch_cpumask, cpu));
+        if ( per_cpu(send_ipi_cpumask, cpu) != &send_ipi_cpu0mask )
+            FREE_CPUMASK_VAR(per_cpu(send_ipi_cpumask, cpu));
     }
 
     cleanup_cpu_root_pgt(cpu);
@@ -1034,7 +1039,8 @@ static int cpu_smpboot_alloc(unsigned int cpu)
 
     if ( !(cond_zalloc_cpumask_var(&per_cpu(cpu_sibling_mask, cpu)) &&
            cond_zalloc_cpumask_var(&per_cpu(cpu_core_mask, cpu)) &&
-           cond_alloc_cpumask_var(&per_cpu(scratch_cpumask, cpu))) )
+           cond_alloc_cpumask_var(&per_cpu(scratch_cpumask, cpu)) &&
+           cond_alloc_cpumask_var(&per_cpu(send_ipi_cpumask, cpu))) )
         goto out;
 
     rc = 0;
@@ -1175,6 +1181,7 @@ void __init smp_prepare_boot_cpu(void)
     cpumask_set_cpu(cpu, &cpu_present_map);
 #if NR_CPUS > 2 * BITS_PER_LONG
     per_cpu(scratch_cpumask, cpu) = &scratch_cpu0mask;
+    per_cpu(send_ipi_cpumask, cpu) = &send_ipi_cpu0mask;
 #endif
 
     get_cpu_info()->use_pv_cr3 = false;
-- 
2.25.0



* [Xen-devel] [PATCH v4 2/2] x86: add accessors for scratch cpu mask
  2020-02-28  9:33 [Xen-devel] [PATCH v4 0/2] x86: scratch cpumask fixes/improvement Roger Pau Monne
  2020-02-28  9:33 ` [Xen-devel] [PATCH v4 1/2] x86/smp: use a dedicated CPU mask in send_IPI_mask Roger Pau Monne
@ 2020-02-28  9:33 ` Roger Pau Monne
  2020-02-28 10:16   ` Jan Beulich
  1 sibling, 1 reply; 10+ messages in thread
From: Roger Pau Monne @ 2020-02-28  9:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

Current usage of the per-CPU scratch cpumask is dangerous since
there's no way to figure out if the mask is already being used except
for manual code inspection of all the callers and possible call paths.

This is unsafe and not reliable, so introduce a minimal get/put
infrastructure to prevent nested usage of the scratch mask and usage
in interrupt context.

Move the definition of scratch_cpumask to smp.c in order to place the
declaration and the accessors as close as possible.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v3:
 - Fix commit message.
 - Split the cpumask taken section into two in _clear_irq_vector.
 - Add an empty statement in do_mmuext_op to avoid a break.
 - Change the logic used to release the scratch cpumask in
   __do_update_va_mapping.
 - Add a %ps print to scratch_cpumask helper.
 - Remove printing the current IP, as that would be done by BUG
   anyway.
 - Pass the cpumask to put_scratch_cpumask and zap the pointer.

Changes since v1:
 - Use __builtin_return_address(0) instead of __func__.
 - Move declaration of scratch_cpumask and scratch_cpumask accessor to
   smp.c.
 - Do not allow usage in #MC or #NMI context.
---
 xen/arch/x86/io_apic.c    |  6 ++++--
 xen/arch/x86/irq.c        | 14 ++++++++++----
 xen/arch/x86/mm.c         | 40 +++++++++++++++++++++++++++------------
 xen/arch/x86/msi.c        |  4 +++-
 xen/arch/x86/smp.c        | 25 ++++++++++++++++++++++++
 xen/arch/x86/smpboot.c    |  1 -
 xen/include/asm-x86/smp.h | 14 ++++++++++++++
 7 files changed, 84 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index e98e08e9c8..0bb994f0ba 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -2236,10 +2236,11 @@ int io_apic_set_pci_routing (int ioapic, int pin, int irq, int edge_level, int a
     entry.vector = vector;
 
     if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) {
-        cpumask_t *mask = this_cpu(scratch_cpumask);
+        cpumask_t *mask = get_scratch_cpumask();
 
         cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
         SET_DEST(entry, logical, cpu_mask_to_apicid(mask));
+        put_scratch_cpumask(mask);
     } else {
         printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n",
                irq, CPUMASK_PR(desc->arch.cpu_mask), CPUMASK_PR(TARGET_CPUS));
@@ -2433,10 +2434,11 @@ int ioapic_guest_write(unsigned long physbase, unsigned int reg, u32 val)
 
     if ( cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS) )
     {
-        cpumask_t *mask = this_cpu(scratch_cpumask);
+        cpumask_t *mask = get_scratch_cpumask();
 
         cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
         SET_DEST(rte, logical, cpu_mask_to_apicid(mask));
+        put_scratch_cpumask(mask);
     }
     else
     {
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index cc2eb8e925..19488dae21 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -196,7 +196,7 @@ static void _clear_irq_vector(struct irq_desc *desc)
 {
     unsigned int cpu, old_vector, irq = desc->irq;
     unsigned int vector = desc->arch.vector;
-    cpumask_t *tmp_mask = this_cpu(scratch_cpumask);
+    cpumask_t *tmp_mask = get_scratch_cpumask();
 
     BUG_ON(!valid_irq_vector(vector));
 
@@ -208,6 +208,7 @@ static void _clear_irq_vector(struct irq_desc *desc)
         ASSERT(per_cpu(vector_irq, cpu)[vector] == irq);
         per_cpu(vector_irq, cpu)[vector] = ~irq;
     }
+    put_scratch_cpumask(tmp_mask);
 
     desc->arch.vector = IRQ_VECTOR_UNASSIGNED;
     cpumask_clear(desc->arch.cpu_mask);
@@ -227,8 +228,9 @@ static void _clear_irq_vector(struct irq_desc *desc)
 
     /* If we were in motion, also clear desc->arch.old_vector */
     old_vector = desc->arch.old_vector;
-    cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
 
+    cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
+    tmp_mask = get_scratch_cpumask();
     for_each_cpu(cpu, tmp_mask)
     {
         ASSERT(per_cpu(vector_irq, cpu)[old_vector] == irq);
@@ -236,6 +238,7 @@ static void _clear_irq_vector(struct irq_desc *desc)
         per_cpu(vector_irq, cpu)[old_vector] = ~irq;
     }
 
+    put_scratch_cpumask(tmp_mask);
     release_old_vec(desc);
 
     desc->arch.move_in_progress = 0;
@@ -1152,10 +1155,11 @@ static void irq_guest_eoi_timer_fn(void *data)
         break;
 
     case ACKTYPE_EOI:
-        cpu_eoi_map = this_cpu(scratch_cpumask);
+        cpu_eoi_map = get_scratch_cpumask();
         cpumask_copy(cpu_eoi_map, action->cpu_eoi_map);
         spin_unlock_irq(&desc->lock);
         on_selected_cpus(cpu_eoi_map, set_eoi_ready, desc, 0);
+        put_scratch_cpumask(cpu_eoi_map);
         return;
     }
 
@@ -2531,12 +2535,12 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
     unsigned int irq;
     static int warned;
     struct irq_desc *desc;
+    cpumask_t *affinity = get_scratch_cpumask();
 
     for ( irq = 0; irq < nr_irqs; irq++ )
     {
         bool break_affinity = false, set_affinity = true;
         unsigned int vector;
-        cpumask_t *affinity = this_cpu(scratch_cpumask);
 
         if ( irq == 2 )
             continue;
@@ -2640,6 +2644,8 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
                    irq, CPUMASK_PR(affinity));
     }
 
+    put_scratch_cpumask(affinity);
+
     /* That doesn't seem sufficient.  Give it 1ms. */
     local_irq_enable();
     mdelay(1);
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 70b87c4830..b3b09a0219 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1262,7 +1262,7 @@ void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
              (l1e_owner == pg_owner) )
         {
             struct vcpu *v;
-            cpumask_t *mask = this_cpu(scratch_cpumask);
+            cpumask_t *mask = get_scratch_cpumask();
 
             cpumask_clear(mask);
 
@@ -1279,6 +1279,7 @@ void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
 
             if ( !cpumask_empty(mask) )
                 flush_tlb_mask(mask);
+            put_scratch_cpumask(mask);
         }
 #endif /* CONFIG_PV_LDT_PAGING */
         put_page(page);
@@ -2903,7 +2904,7 @@ static int _get_page_type(struct page_info *page, unsigned long type,
                  * vital that no other CPUs are left with mappings of a frame
                  * which is about to become writeable to the guest.
                  */
-                cpumask_t *mask = this_cpu(scratch_cpumask);
+                cpumask_t *mask = get_scratch_cpumask();
 
                 BUG_ON(in_irq());
                 cpumask_copy(mask, d->dirty_cpumask);
@@ -2919,6 +2920,7 @@ static int _get_page_type(struct page_info *page, unsigned long type,
                     perfc_incr(need_flush_tlb_flush);
                     flush_tlb_mask(mask);
                 }
+                put_scratch_cpumask(mask);
 
                 /* We lose existing type and validity. */
                 nx &= ~(PGT_type_mask | PGT_validated);
@@ -3635,7 +3637,7 @@ long do_mmuext_op(
         case MMUEXT_TLB_FLUSH_MULTI:
         case MMUEXT_INVLPG_MULTI:
         {
-            cpumask_t *mask = this_cpu(scratch_cpumask);
+            cpumask_t *mask = get_scratch_cpumask();
 
             if ( unlikely(currd != pg_owner) )
                 rc = -EPERM;
@@ -3645,12 +3647,13 @@ long do_mmuext_op(
                                    mask)) )
                 rc = -EINVAL;
             if ( unlikely(rc) )
-                break;
-
-            if ( op.cmd == MMUEXT_TLB_FLUSH_MULTI )
+                ;
+            else if ( op.cmd == MMUEXT_TLB_FLUSH_MULTI )
                 flush_tlb_mask(mask);
             else if ( __addr_ok(op.arg1.linear_addr) )
                 flush_tlb_one_mask(mask, op.arg1.linear_addr);
+            put_scratch_cpumask(mask);
+
             break;
         }
 
@@ -3683,7 +3686,7 @@ long do_mmuext_op(
             else if ( likely(cache_flush_permitted(currd)) )
             {
                 unsigned int cpu;
-                cpumask_t *mask = this_cpu(scratch_cpumask);
+                cpumask_t *mask = get_scratch_cpumask();
 
                 cpumask_clear(mask);
                 for_each_online_cpu(cpu)
@@ -3691,6 +3694,7 @@ long do_mmuext_op(
                                              per_cpu(cpu_sibling_mask, cpu)) )
                         __cpumask_set_cpu(cpu, mask);
                 flush_mask(mask, FLUSH_CACHE);
+                put_scratch_cpumask(mask);
             }
             else
                 rc = -EINVAL;
@@ -4156,12 +4160,13 @@ long do_mmu_update(
          * Force other vCPU-s of the affected guest to pick up L4 entry
          * changes (if any).
          */
-        unsigned int cpu = smp_processor_id();
-        cpumask_t *mask = per_cpu(scratch_cpumask, cpu);
+        cpumask_t *mask = get_scratch_cpumask();
 
-        cpumask_andnot(mask, pt_owner->dirty_cpumask, cpumask_of(cpu));
+        cpumask_andnot(mask, pt_owner->dirty_cpumask,
+                       cpumask_of(smp_processor_id()));
         if ( !cpumask_empty(mask) )
             flush_mask(mask, FLUSH_TLB_GLOBAL | FLUSH_ROOT_PGTBL);
+        put_scratch_cpumask(mask);
     }
 
     perfc_add(num_page_updates, i);
@@ -4353,7 +4358,7 @@ static int __do_update_va_mapping(
             mask = d->dirty_cpumask;
             break;
         default:
-            mask = this_cpu(scratch_cpumask);
+            mask = get_scratch_cpumask();
             rc = vcpumask_to_pcpumask(d, const_guest_handle_from_ptr(bmap_ptr,
                                                                      void),
                                       mask);
@@ -4373,7 +4378,7 @@ static int __do_update_va_mapping(
             mask = d->dirty_cpumask;
             break;
         default:
-            mask = this_cpu(scratch_cpumask);
+            mask = get_scratch_cpumask();
             rc = vcpumask_to_pcpumask(d, const_guest_handle_from_ptr(bmap_ptr,
                                                                      void),
                                       mask);
@@ -4384,6 +4389,17 @@ static int __do_update_va_mapping(
         break;
     }
 
+    switch ( flags & ~UVMF_FLUSHTYPE_MASK )
+    {
+    case UVMF_LOCAL:
+    case UVMF_ALL:
+        break;
+
+    default:
+        put_scratch_cpumask(mask);
+    }
+
+
     return rc;
 }
 
diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
index 161ee60dbe..6d198f8665 100644
--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -159,13 +159,15 @@ void msi_compose_msg(unsigned vector, const cpumask_t *cpu_mask, struct msi_msg
 
     if ( cpu_mask )
     {
-        cpumask_t *mask = this_cpu(scratch_cpumask);
+        cpumask_t *mask;
 
         if ( !cpumask_intersects(cpu_mask, &cpu_online_map) )
             return;
 
+        mask = get_scratch_cpumask();
         cpumask_and(mask, cpu_mask, &cpu_online_map);
         msg->dest32 = cpu_mask_to_apicid(mask);
+        put_scratch_cpumask(mask);
     }
 
     msg->address_hi = MSI_ADDR_BASE_HI;
diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index 072638f0f6..945dbabefe 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -25,6 +25,31 @@
 #include <irq_vectors.h>
 #include <mach_apic.h>
 
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, scratch_cpumask);
+
+#ifndef NDEBUG
+cpumask_t *scratch_cpumask(bool use)
+{
+    static DEFINE_PER_CPU(void *, scratch_cpumask_use);
+
+    /*
+     * Due to reentrancy scratch cpumask cannot be used in IRQ, #MC or #NMI
+     * context.
+     */
+    BUG_ON(in_irq() || in_mce_handler() || in_nmi_handler());
+
+    if ( use && unlikely(this_cpu(scratch_cpumask_use)) )
+    {
+        printk("scratch CPU mask already in use by %ps (%p)\n",
+               this_cpu(scratch_cpumask_use), this_cpu(scratch_cpumask_use));
+        BUG();
+    }
+    this_cpu(scratch_cpumask_use) = use ? __builtin_return_address(0) : NULL;
+
+    return use ? this_cpu(scratch_cpumask) : NULL;
+}
+#endif
+
 /* Helper functions to prepare APIC register values. */
 static unsigned int prepare_ICR(unsigned int shortcut, int vector)
 {
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 6c548b0b53..e26b61a8b4 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -54,7 +54,6 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_mask);
 /* representing HT and core siblings of each logical CPU */
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
 
-DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, scratch_cpumask);
 static cpumask_t scratch_cpu0mask;
 
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, send_ipi_cpumask);
diff --git a/xen/include/asm-x86/smp.h b/xen/include/asm-x86/smp.h
index 92d69a5ea0..d2f0bb0b4f 100644
--- a/xen/include/asm-x86/smp.h
+++ b/xen/include/asm-x86/smp.h
@@ -23,6 +23,20 @@ DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_mask);
 DECLARE_PER_CPU(cpumask_var_t, cpu_core_mask);
 DECLARE_PER_CPU(cpumask_var_t, scratch_cpumask);
 
+#ifndef NDEBUG
+/* Not to be called directly, use {get/put}_scratch_cpumask(). */
+cpumask_t *scratch_cpumask(bool use);
+#define get_scratch_cpumask() scratch_cpumask(true)
+#define put_scratch_cpumask(m) do {             \
+    BUG_ON((m) != this_cpu(scratch_cpumask));   \
+    scratch_cpumask(false);                     \
+    (m) = NULL;                                 \
+} while ( false )
+#else
+#define get_scratch_cpumask() this_cpu(scratch_cpumask)
+#define put_scratch_cpumask(m)
+#endif
+
 /*
  * Do we, for platform reasons, need to actually keep CPUs online when we
  * would otherwise prefer them to be off?
-- 
2.25.0



* Re: [Xen-devel] [PATCH v4 1/2] x86/smp: use a dedicated CPU mask in send_IPI_mask
  2020-02-28  9:33 ` [Xen-devel] [PATCH v4 1/2] x86/smp: use a dedicated CPU mask in send_IPI_mask Roger Pau Monne
@ 2020-02-28 10:08   ` Jan Beulich
  2020-02-28 10:22     ` Roger Pau Monné
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2020-02-28 10:08 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Wei Liu, Andrew Cooper

On 28.02.2020 10:33, Roger Pau Monne wrote:
> --- a/xen/arch/x86/smp.c
> +++ b/xen/arch/x86/smp.c
> @@ -59,6 +59,8 @@ static void send_IPI_shortcut(unsigned int shortcut, int vector,
>      apic_write(APIC_ICR, cfg);
>  }
>  
> +DECLARE_PER_CPU(cpumask_var_t, send_ipi_cpumask);

This needs to be put in a header, so that ...

> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -57,6 +57,9 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
>  DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, scratch_cpumask);
>  static cpumask_t scratch_cpu0mask;
>  
> +DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, send_ipi_cpumask);

... this gets compiled with having seen the declaration, such
that if one gets changed without also changing the other, the
build will break.

Everything else looks fine to me.

Jan


* Re: [Xen-devel] [PATCH v4 2/2] x86: add accessors for scratch cpu mask
  2020-02-28  9:33 ` [Xen-devel] [PATCH v4 2/2] x86: add accessors for scratch cpu mask Roger Pau Monne
@ 2020-02-28 10:16   ` Jan Beulich
  2020-02-28 10:31     ` Roger Pau Monné
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2020-02-28 10:16 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Wei Liu, Andrew Cooper

On 28.02.2020 10:33, Roger Pau Monne wrote:
> Current usage of the per-CPU scratch cpumask is dangerous since
> there's no way to figure out if the mask is already being used except
> for manual code inspection of all the callers and possible call paths.
> 
> This is unsafe and not reliable, so introduce a minimal get/put
> infrastructure to prevent nested usage of the scratch mask and usage
> in interrupt context.
> 
> Move the definition of scratch_cpumask to smp.c in order to place the
> declaration and the accessors as close as possible.

You've changed one instance of "declaration", but not also the other.

> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -196,7 +196,7 @@ static void _clear_irq_vector(struct irq_desc *desc)
>  {
>      unsigned int cpu, old_vector, irq = desc->irq;
>      unsigned int vector = desc->arch.vector;
> -    cpumask_t *tmp_mask = this_cpu(scratch_cpumask);
> +    cpumask_t *tmp_mask = get_scratch_cpumask();
>  
>      BUG_ON(!valid_irq_vector(vector));
>  
> @@ -208,6 +208,7 @@ static void _clear_irq_vector(struct irq_desc *desc)
>          ASSERT(per_cpu(vector_irq, cpu)[vector] == irq);
>          per_cpu(vector_irq, cpu)[vector] = ~irq;
>      }
> +    put_scratch_cpumask(tmp_mask);
>  
>      desc->arch.vector = IRQ_VECTOR_UNASSIGNED;
>      cpumask_clear(desc->arch.cpu_mask);
> @@ -227,8 +228,9 @@ static void _clear_irq_vector(struct irq_desc *desc)
>  
>      /* If we were in motion, also clear desc->arch.old_vector */
>      old_vector = desc->arch.old_vector;
> -    cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
>  
> +    cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
> +    tmp_mask = get_scratch_cpumask();

Did you test this? It looks overwhelmingly likely that the two
lines need to be the other way around.

> @@ -4384,6 +4389,17 @@ static int __do_update_va_mapping(
>          break;
>      }
>  
> +    switch ( flags & ~UVMF_FLUSHTYPE_MASK )
> +    {
> +    case UVMF_LOCAL:
> +    case UVMF_ALL:
> +        break;
> +
> +    default:
> +        put_scratch_cpumask(mask);
> +    }
> +
> +
>      return rc;

No two successive blank lines please.

> --- a/xen/arch/x86/smp.c
> +++ b/xen/arch/x86/smp.c
> @@ -25,6 +25,31 @@
>  #include <irq_vectors.h>
>  #include <mach_apic.h>
>  
> +DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, scratch_cpumask);
> +
> +#ifndef NDEBUG
> +cpumask_t *scratch_cpumask(bool use)
> +{
> +    static DEFINE_PER_CPU(void *, scratch_cpumask_use);

I'd make this "const void *", btw.

> +    /*
> +     * Due to reentrancy scratch cpumask cannot be used in IRQ, #MC or #NMI
> +     * context.
> +     */
> +    BUG_ON(in_irq() || in_mce_handler() || in_nmi_handler());
> +
> +    if ( use && unlikely(this_cpu(scratch_cpumask_use)) )
> +    {
> +        printk("scratch CPU mask already in use by %ps (%p)\n",
> +               this_cpu(scratch_cpumask_use), this_cpu(scratch_cpumask_use));

Why the raw %p as well? We don't do so elsewhere, I think. Yes,
it's debugging code only, but I wonder anyway.

Jan


* Re: [Xen-devel] [PATCH v4 1/2] x86/smp: use a dedicated CPU mask in send_IPI_mask
  2020-02-28 10:08   ` Jan Beulich
@ 2020-02-28 10:22     ` Roger Pau Monné
  0 siblings, 0 replies; 10+ messages in thread
From: Roger Pau Monné @ 2020-02-28 10:22 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Fri, Feb 28, 2020 at 11:08:43AM +0100, Jan Beulich wrote:
> On 28.02.2020 10:33, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/smp.c
> > +++ b/xen/arch/x86/smp.c
> > @@ -59,6 +59,8 @@ static void send_IPI_shortcut(unsigned int shortcut, int vector,
> >      apic_write(APIC_ICR, cfg);
> >  }
> >  
> > +DECLARE_PER_CPU(cpumask_var_t, send_ipi_cpumask);
> 
> This needs to be put in a header, so that ...
> 
> > --- a/xen/arch/x86/smpboot.c
> > +++ b/xen/arch/x86/smpboot.c
> > @@ -57,6 +57,9 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
> >  DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, scratch_cpumask);
> >  static cpumask_t scratch_cpu0mask;
> >  
> > +DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, send_ipi_cpumask);
> 
> ... this gets compiled with having seen the declaration, such
> that if one gets changed without also changing the other, the
> build will break.

Right, was trying to limit the scope of the declaration, but your
suggestion makes sense.

> Everything else looks fine to me.

Thanks, Roger.


* Re: [Xen-devel] [PATCH v4 2/2] x86: add accessors for scratch cpu mask
  2020-02-28 10:16   ` Jan Beulich
@ 2020-02-28 10:31     ` Roger Pau Monné
  2020-02-28 11:15       ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Roger Pau Monné @ 2020-02-28 10:31 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Fri, Feb 28, 2020 at 11:16:55AM +0100, Jan Beulich wrote:
> On 28.02.2020 10:33, Roger Pau Monne wrote:
> > Current usage of the per-CPU scratch cpumask is dangerous since
> > there's no way to figure out if the mask is already being used except
> > for manual code inspection of all the callers and possible call paths.
> > 
> > This is unsafe and not reliable, so introduce a minimal get/put
> > infrastructure to prevent nested usage of the scratch mask and usage
> > in interrupt context.
> > 
> > Move the definition of scratch_cpumask to smp.c in order to place the
> > declaration and the accessors as close as possible.
> 
> You've changed one instance of "declaration", but not also the other.

Oh, sorry. Sadly you are not the only one with a cold this week :).

> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -196,7 +196,7 @@ static void _clear_irq_vector(struct irq_desc *desc)
> >  {
> >      unsigned int cpu, old_vector, irq = desc->irq;
> >      unsigned int vector = desc->arch.vector;
> > -    cpumask_t *tmp_mask = this_cpu(scratch_cpumask);
> > +    cpumask_t *tmp_mask = get_scratch_cpumask();
> >  
> >      BUG_ON(!valid_irq_vector(vector));
> >  
> > @@ -208,6 +208,7 @@ static void _clear_irq_vector(struct irq_desc *desc)
> >          ASSERT(per_cpu(vector_irq, cpu)[vector] == irq);
> >          per_cpu(vector_irq, cpu)[vector] = ~irq;
> >      }
> > +    put_scratch_cpumask(tmp_mask);
> >  
> >      desc->arch.vector = IRQ_VECTOR_UNASSIGNED;
> >      cpumask_clear(desc->arch.cpu_mask);
> > @@ -227,8 +228,9 @@ static void _clear_irq_vector(struct irq_desc *desc)
> >  
> >      /* If we were in motion, also clear desc->arch.old_vector */
> >      old_vector = desc->arch.old_vector;
> > -    cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
> >  
> > +    cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
> > +    tmp_mask = get_scratch_cpumask();
> 
> Did you test this? It looks overwhelmingly likely that the two
> lines need to be the other way around.

Urg, yes, I've tested it but likely failed to trigger this case, and
even worse failed to spot it in my own review. It's obviously wrong.

> > +    /*
> > +     * Due to reentrancy scratch cpumask cannot be used in IRQ, #MC or #NMI
> > +     * context.
> > +     */
> > +    BUG_ON(in_irq() || in_mce_handler() || in_nmi_handler());
> > +
> > +    if ( use && unlikely(this_cpu(scratch_cpumask_use)) )
> > +    {
> > +        printk("scratch CPU mask already in use by %ps (%p)\n",
> > +               this_cpu(scratch_cpumask_use), this_cpu(scratch_cpumask_use));
> 
> Why the raw %p as well? We don't do so elsewhere, I think. Yes,
> it's debugging code only, but I wonder anyway.

I use addr2line to find the offending line, and it's much easier to do
so if you have the address directly, rather than having to use nm in
order to figure out the address of the symbol and then add the offset.

Maybe I'm missing some other way to do this more easily?

Thanks, Roger.


* Re: [Xen-devel] [PATCH v4 2/2] x86: add accessors for scratch cpu mask
  2020-02-28 10:31     ` Roger Pau Monné
@ 2020-02-28 11:15       ` Jan Beulich
  2020-02-28 11:41         ` Roger Pau Monné
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2020-02-28 11:15 UTC (permalink / raw)
  To: Roger Pau Monné, Andrew Cooper; +Cc: xen-devel, Wei Liu

On 28.02.2020 11:31, Roger Pau Monné wrote:
> On Fri, Feb 28, 2020 at 11:16:55AM +0100, Jan Beulich wrote:
>> On 28.02.2020 10:33, Roger Pau Monne wrote:
>>> +    /*
>>> +     * Due to reentrancy scratch cpumask cannot be used in IRQ, #MC or #NMI
>>> +     * context.
>>> +     */
>>> +    BUG_ON(in_irq() || in_mce_handler() || in_nmi_handler());
>>> +
>>> +    if ( use && unlikely(this_cpu(scratch_cpumask_use)) )
>>> +    {
>>> +        printk("scratch CPU mask already in use by %ps (%p)\n",
>>> +               this_cpu(scratch_cpumask_use), this_cpu(scratch_cpumask_use));
>>
>> Why the raw %p as well? We don't do so elsewhere, I think. Yes,
>> it's debugging code only, but I wonder anyway.
> 
> I use addr2line to find the offending line, and it's much easier to do
> so if you have the address directly, rather than having to use nm in
> order to figure out the address of the symbol and then add the offset.
> 
> Maybe I'm missing some other way to do this more easily?

In such a case we may want to consider making %ps (and %pS)
print a hex presentation next to the decoded one, in debug
builds at least. Andrew, thoughts? (There may be cases where
this is not wanted, but if we made this a debug mode only
feature, I think it wouldn't do too much harm.)

Jan


* Re: [Xen-devel] [PATCH v4 2/2] x86: add accessors for scratch cpu mask
  2020-02-28 11:15       ` Jan Beulich
@ 2020-02-28 11:41         ` Roger Pau Monné
  2020-02-28 12:03           ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Roger Pau Monné @ 2020-02-28 11:41 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Fri, Feb 28, 2020 at 12:15:21PM +0100, Jan Beulich wrote:
> On 28.02.2020 11:31, Roger Pau Monné wrote:
> > On Fri, Feb 28, 2020 at 11:16:55AM +0100, Jan Beulich wrote:
> >> On 28.02.2020 10:33, Roger Pau Monne wrote:
> >>> +    /*
> >>> +     * Due to reentrancy scratch cpumask cannot be used in IRQ, #MC or #NMI
> >>> +     * context.
> >>> +     */
> >>> +    BUG_ON(in_irq() || in_mce_handler() || in_nmi_handler());
> >>> +
> >>> +    if ( use && unlikely(this_cpu(scratch_cpumask_use)) )
> >>> +    {
> >>> +        printk("scratch CPU mask already in use by %ps (%p)\n",
> >>> +               this_cpu(scratch_cpumask_use), this_cpu(scratch_cpumask_use));
> >>
> >> Why the raw %p as well? We don't do so elsewhere, I think. Yes,
> >> it's debugging code only, but I wonder anyway.
> > 
> > I use addr2line to find the offending line, and it's much easier to do
> > so if you have the address directly, rather than having to use nm in
> > order to figure out the address of the symbol and then add the offset.
> > 
> > Maybe I'm missing some other way to do this more easily?
> 
> In such a case we may want to consider making %ps (and %pS)
> print a hex presentation next to the decoded one, in debug
> builds at least. Andrew, thoughts? (There may be cases where
> this is not wanted, but if we made this a debug mode only
> feature, I think it wouldn't do too much harm.)

If you agree to make %p[sS] print the address then I can drop this
and send a patch to that effect (likely next week).

Thanks, Roger.


* Re: [Xen-devel] [PATCH v4 2/2] x86: add accessors for scratch cpu mask
  2020-02-28 11:41         ` Roger Pau Monné
@ 2020-02-28 12:03           ` Jan Beulich
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2020-02-28 12:03 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 28.02.2020 12:41, Roger Pau Monné wrote:
> On Fri, Feb 28, 2020 at 12:15:21PM +0100, Jan Beulich wrote:
>> On 28.02.2020 11:31, Roger Pau Monné wrote:
>>> On Fri, Feb 28, 2020 at 11:16:55AM +0100, Jan Beulich wrote:
>>>> On 28.02.2020 10:33, Roger Pau Monne wrote:
>>>>> +    /*
>>>>> +     * Due to reentrancy scratch cpumask cannot be used in IRQ, #MC or #NMI
>>>>> +     * context.
>>>>> +     */
>>>>> +    BUG_ON(in_irq() || in_mce_handler() || in_nmi_handler());
>>>>> +
>>>>> +    if ( use && unlikely(this_cpu(scratch_cpumask_use)) )
>>>>> +    {
>>>>> +        printk("scratch CPU mask already in use by %ps (%p)\n",
>>>>> +               this_cpu(scratch_cpumask_use), this_cpu(scratch_cpumask_use));
>>>>
>>>> Why the raw %p as well? We don't do so elsewhere, I think. Yes,
>>>> it's debugging code only, but I wonder anyway.
>>>
>>> I use addr2line to find the offending line, and it's much easier to do
>>> so if you have the address directly, rather than having to use nm in
>>> order to figure out the address of the symbol and then add the offset.
>>>
>>> Maybe I'm missing some other way to do this more easily?
>>
>> In such a case we may want to consider making %ps (and %pS)
>> print a hex presentation next to the decoded one, in debug
>> builds at least. Andrew, thoughts? (There may be cases where
>> this is not wanted, but if we made this a debug mode only
>> feature, I think it wouldn't do too much harm.)
> 
> If you agree to make %p[sS] print the address then I can drop this
> and send a patch to that effect (likely next week).

In principle I agree, but the effect in particular on stack
dumps needs to be looked at pretty closely.

Jan


end of thread, other threads:[~2020-02-28 12:03 UTC | newest]

Thread overview: 10+ messages
-- links below jump to the message on this page --
2020-02-28  9:33 [Xen-devel] [PATCH v4 0/2] x86: scratch cpumask fixes/improvement Roger Pau Monne
2020-02-28  9:33 ` [Xen-devel] [PATCH v4 1/2] x86/smp: use a dedicated CPU mask in send_IPI_mask Roger Pau Monne
2020-02-28 10:08   ` Jan Beulich
2020-02-28 10:22     ` Roger Pau Monné
2020-02-28  9:33 ` [Xen-devel] [PATCH v4 2/2] x86: add accessors for scratch cpu mask Roger Pau Monne
2020-02-28 10:16   ` Jan Beulich
2020-02-28 10:31     ` Roger Pau Monné
2020-02-28 11:15       ` Jan Beulich
2020-02-28 11:41         ` Roger Pau Monné
2020-02-28 12:03           ` Jan Beulich
