* [PATCH 0/9] x86: IRQ management adjustments
@ 2019-04-29 11:16 ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-04-29 11:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

First and foremost this series tries to deal with CPU offlining
issues, which have become more prominent with the recently
added SMT enable/disable operation in xen-hptool. Later patches
in the series then carry out more or less unrelated changes
(hopefully improvements) noticed while looking at various pieces
of the code involved.

The first patch introduces an ASSERT() which I've observed to
trigger every once in a while. I'm still trying to find the cause of
this, hence the RFC tag on that one patch.

1: x86/IRQ: deal with move-in-progress state in fixup_irqs()
2: x86/IRQ: deal with move cleanup count state in fixup_irqs()
3: x86/IRQ: improve dump_irqs()
4: x86/IRQ: desc->affinity should strictly represent the requested value
5: x86/IRQ: fix locking around vector management
6: x86/IRQ: reduce unused space in struct arch_irq_desc
7: x86/IRQ: drop redundant cpumask_empty() from move_masked_irq()
8: x86/IRQ: make fixup_irqs() skip unconnected internally used interrupts
9: x86/IO-APIC: drop an unused variable from setup_IO_APIC_irqs()

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH RFC 1/9] x86/IRQ: deal with move-in-progress state in fixup_irqs()
@ 2019-04-29 11:22   ` Jan Beulich
From: Jan Beulich @ 2019-04-29 11:22 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

The move-in-progress flag being set may prevent affinity changes, as
these often imply assignment of a new vector. When there's no possible
destination left for the IRQ, the clearing of the flag needs to happen
right in fixup_irqs().

Additionally __assign_irq_vector() needs to avoid setting the flag when
there's no online CPU left in what gets put into ->arch.old_cpu_mask.
The old vector can be released right away in this case.

Also extend the log message about broken affinity to include the new
affinity as well, making it possible to notice when an affinity change
did not actually take place. Swap the if/else-if order there at the
same time, to reduce the number of conditions checked.

Finally, replace two open-coded instances of the logic now provided by
the new helper function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
RFC: I've seen the new ASSERT() in irq_move_cleanup_interrupt() trigger.
     I'm pretty sure that this assertion triggering means something else
     is wrong, and has been even prior to this change (adding the
     assertion without any of the other changes here should be valid in
     my understanding).

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -242,6 +242,20 @@ void destroy_irq(unsigned int irq)
     xfree(action);
 }
 
+static void release_old_vec(struct irq_desc *desc)
+{
+    unsigned int vector = desc->arch.old_vector;
+
+    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
+    cpumask_clear(desc->arch.old_cpu_mask);
+
+    if ( desc->arch.used_vectors )
+    {
+        ASSERT(test_bit(vector, desc->arch.used_vectors));
+        clear_bit(vector, desc->arch.used_vectors);
+    }
+}
+
 static void __clear_irq_vector(int irq)
 {
     int cpu, vector, old_vector;
@@ -285,14 +299,7 @@ static void __clear_irq_vector(int irq)
         per_cpu(vector_irq, cpu)[old_vector] = ~irq;
     }
 
-    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
-    cpumask_clear(desc->arch.old_cpu_mask);
-
-    if ( desc->arch.used_vectors )
-    {
-        ASSERT(test_bit(old_vector, desc->arch.used_vectors));
-        clear_bit(old_vector, desc->arch.used_vectors);
-    }
+    release_old_vec(desc);
 
     desc->arch.move_in_progress = 0;
 }
@@ -517,12 +524,21 @@ next:
         /* Found one! */
         current_vector = vector;
         current_offset = offset;
-        if (old_vector > 0) {
-            desc->arch.move_in_progress = 1;
-            cpumask_copy(desc->arch.old_cpu_mask, desc->arch.cpu_mask);
+
+        if ( old_vector > 0 )
+        {
+            cpumask_and(desc->arch.old_cpu_mask, desc->arch.cpu_mask,
+                        &cpu_online_map);
             desc->arch.old_vector = desc->arch.vector;
+            if ( !cpumask_empty(desc->arch.old_cpu_mask) )
+                desc->arch.move_in_progress = 1;
+            else
+                /* This can happen while offlining a CPU. */
+                release_old_vec(desc);
         }
+
         trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, &tmp_mask);
+
         for_each_cpu(new_cpu, &tmp_mask)
             per_cpu(vector_irq, new_cpu)[vector] = irq;
         desc->arch.vector = vector;
@@ -691,14 +707,8 @@ void irq_move_cleanup_interrupt(struct c
 
         if ( desc->arch.move_cleanup_count == 0 )
         {
-            desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
-            cpumask_clear(desc->arch.old_cpu_mask);
-
-            if ( desc->arch.used_vectors )
-            {
-                ASSERT(test_bit(vector, desc->arch.used_vectors));
-                clear_bit(vector, desc->arch.used_vectors);
-            }
+            ASSERT(vector == desc->arch.old_vector);
+            release_old_vec(desc);
         }
 unlock:
         spin_unlock(&desc->lock);
@@ -2391,6 +2401,24 @@ void fixup_irqs(const cpumask_t *mask, b
             continue;
         }
 
+        /*
+         * In order for the affinity adjustment below to be successful, we
+         * need __assign_irq_vector() to succeed. This in particular means
+         * clearing desc->arch.move_in_progress if this would otherwise
+         * prevent the function from succeeding. Since there's no way for the
+         * flag to get cleared anymore when there's no possible destination
+         * left (the only possibility then would be the IRQs enabled window
+         * after this loop), there's then also no race with us doing it here.
+         *
+         * Therefore the logic here and there need to remain in sync.
+         */
+        if ( desc->arch.move_in_progress &&
+             !cpumask_intersects(mask, desc->arch.cpu_mask) )
+        {
+            release_old_vec(desc);
+            desc->arch.move_in_progress = 0;
+        }
+
         cpumask_and(&affinity, &affinity, mask);
         if ( cpumask_empty(&affinity) )
         {
@@ -2409,15 +2437,18 @@ void fixup_irqs(const cpumask_t *mask, b
         if ( desc->handler->enable )
             desc->handler->enable(desc);
 
+        cpumask_copy(&affinity, desc->affinity);
+
         spin_unlock(&desc->lock);
 
         if ( !verbose )
             continue;
 
-        if ( break_affinity && set_affinity )
-            printk("Broke affinity for irq %i\n", irq);
-        else if ( !set_affinity )
-            printk("Cannot set affinity for irq %i\n", irq);
+        if ( !set_affinity )
+            printk("Cannot set affinity for IRQ%u\n", irq);
+        else if ( break_affinity )
+            printk("Broke affinity for IRQ%u, new: %*pb\n",
+                   irq, nr_cpu_ids, &affinity);
     }
 
     /* That doesn't seem sufficient.  Give it 1ms. */

* [PATCH 2/9] x86/IRQ: deal with move cleanup count state in fixup_irqs()
@ 2019-04-29 11:23   ` Jan Beulich
From: Jan Beulich @ 2019-04-29 11:23 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

The cleanup IPI may get sent immediately before a CPU gets removed from
the online map. In such a case the IPI would get handled on the CPU
being offlined no earlier than in the interrupts-disabled window after
fixup_irqs()' main loop. This is too late, however, because a possible
affinity change may require a new vector assignment, which will fail
while the IRQ's move cleanup count is still non-zero.

To fix this:
- record the set of CPUs the cleanup IPI actually gets sent to alongside
  setting their count,
- adjust the count in fixup_irqs(), accounting for all CPUs that the
  cleanup IPI was sent to but that are no longer online,
- bail early from the cleanup IPI handler when the CPU is no longer
  online, to prevent double accounting.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
TBD: The proper recording of the IPI destinations actually makes the
     move_cleanup_count field redundant. Do we want to drop it, at the
     price of a few more CPU-mask operations?

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -658,6 +658,9 @@ void irq_move_cleanup_interrupt(struct c
     ack_APIC_irq();
 
     me = smp_processor_id();
+    if ( !cpu_online(me) )
+        return;
+
     for ( vector = FIRST_DYNAMIC_VECTOR;
           vector <= LAST_HIPRIORITY_VECTOR; vector++)
     {
@@ -717,11 +720,14 @@ unlock:
 
 static void send_cleanup_vector(struct irq_desc *desc)
 {
-    cpumask_t cleanup_mask;
+    cpumask_and(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
+                &cpu_online_map);
+    desc->arch.move_cleanup_count = cpumask_weight(desc->arch.old_cpu_mask);
 
-    cpumask_and(&cleanup_mask, desc->arch.old_cpu_mask, &cpu_online_map);
-    desc->arch.move_cleanup_count = cpumask_weight(&cleanup_mask);
-    send_IPI_mask(&cleanup_mask, IRQ_MOVE_CLEANUP_VECTOR);
+    if ( desc->arch.move_cleanup_count )
+        send_IPI_mask(desc->arch.old_cpu_mask, IRQ_MOVE_CLEANUP_VECTOR);
+    else
+        release_old_vec(desc);
 
     desc->arch.move_in_progress = 0;
 }
@@ -2394,6 +2400,16 @@ void fixup_irqs(const cpumask_t *mask, b
              vector <= LAST_HIPRIORITY_VECTOR )
             cpumask_and(desc->arch.cpu_mask, desc->arch.cpu_mask, mask);
 
+        if ( desc->arch.move_cleanup_count )
+        {
+            /* The cleanup IPI may have got sent while we were still online. */
+            cpumask_andnot(&affinity, desc->arch.old_cpu_mask,
+                           &cpu_online_map);
+            desc->arch.move_cleanup_count -= cpumask_weight(&affinity);
+            if ( !desc->arch.move_cleanup_count )
+                release_old_vec(desc);
+        }
+
         cpumask_copy(&affinity, desc->affinity);
         if ( !desc->action || cpumask_subset(&affinity, mask) )
         {

* [PATCH 3/9] x86/IRQ: improve dump_irqs()
@ 2019-04-29 11:23   ` Jan Beulich
From: Jan Beulich @ 2019-04-29 11:23 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Don't log a stray trailing comma. Shorten a few fields.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2318,7 +2318,7 @@ static void dump_irqs(unsigned char key)
 
         spin_lock_irqsave(&desc->lock, flags);
 
-        printk("   IRQ:%4d affinity:%*pb vec:%02x type=%-15s status=%08x ",
+        printk("   IRQ:%4d aff:%*pb vec:%02x %-15s status=%03x ",
                irq, nr_cpu_ids, cpumask_bits(desc->affinity), desc->arch.vector,
                desc->handler->typename, desc->status);
 
@@ -2329,23 +2329,21 @@ static void dump_irqs(unsigned char key)
         {
             action = (irq_guest_action_t *)desc->action;
 
-            printk("in-flight=%d domain-list=", action->in_flight);
+            printk("in-flight=%d%c",
+                   action->in_flight, action->nr_guests ? ' ' : '\n');
 
-            for ( i = 0; i < action->nr_guests; i++ )
+            for ( i = 0; i < action->nr_guests; )
             {
-                d = action->guest[i];
+                d = action->guest[i++];
                 pirq = domain_irq_to_pirq(d, irq);
                 info = pirq_info(d, pirq);
-                printk("%u:%3d(%c%c%c)",
+                printk("d%d:%3d(%c%c%c)%c",
                        d->domain_id, pirq,
                        evtchn_port_is_pending(d, info->evtchn) ? 'P' : '-',
                        evtchn_port_is_masked(d, info->evtchn) ? 'M' : '-',
-                       (info->masked ? 'M' : '-'));
-                if ( i != action->nr_guests )
-                    printk(",");
+                       info->masked ? 'M' : '-',
+                       i < action->nr_guests ? ',' : '\n');
             }
-
-            printk("\n");
         }
         else if ( desc->action )
             printk("%ps()\n", desc->action->handler);

* [PATCH 4/9] x86/IRQ: desc->affinity should strictly represent the requested value
@ 2019-04-29 11:24   ` Jan Beulich
From: Jan Beulich @ 2019-04-29 11:24 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

desc->arch.cpu_mask reflects the actual set of target CPUs. Don't ever
fiddle with desc->affinity itself, except to store caller-requested
values.

This renders both set_native_irq_info() uses (which weren't using proper
locking anyway) redundant, so drop the function altogether.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1042,7 +1042,6 @@ static void __init setup_IO_APIC_irqs(vo
             SET_DEST(entry, logical, cpu_mask_to_apicid(TARGET_CPUS));
             spin_lock_irqsave(&ioapic_lock, flags);
             __ioapic_write_entry(apic, pin, 0, entry);
-            set_native_irq_info(irq, TARGET_CPUS);
             spin_unlock_irqrestore(&ioapic_lock, flags);
         }
     }
@@ -2251,7 +2250,6 @@ int io_apic_set_pci_routing (int ioapic,
 
     spin_lock_irqsave(&ioapic_lock, flags);
     __ioapic_write_entry(ioapic, pin, 0, entry);
-    set_native_irq_info(irq, TARGET_CPUS);
     spin_unlock(&ioapic_lock);
 
     spin_lock(&desc->lock);
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -572,11 +572,16 @@ int assign_irq_vector(int irq, const cpu
 
     spin_lock_irqsave(&vector_lock, flags);
     ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS);
-    if (!ret) {
+    if ( !ret )
+    {
         ret = desc->arch.vector;
-        cpumask_copy(desc->affinity, desc->arch.cpu_mask);
+        if ( mask )
+            cpumask_copy(desc->affinity, mask);
+        else
+            cpumask_setall(desc->affinity);
     }
     spin_unlock_irqrestore(&vector_lock, flags);
+
     return ret;
 }
 
@@ -2318,9 +2323,10 @@ static void dump_irqs(unsigned char key)
 
         spin_lock_irqsave(&desc->lock, flags);
 
-        printk("   IRQ:%4d aff:%*pb vec:%02x %-15s status=%03x ",
-               irq, nr_cpu_ids, cpumask_bits(desc->affinity), desc->arch.vector,
-               desc->handler->typename, desc->status);
+        printk("   IRQ:%4d aff:%*pb/%*pb vec:%02x %-15s status=%03x ",
+               irq, nr_cpu_ids, cpumask_bits(desc->affinity),
+               nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
+               desc->arch.vector, desc->handler->typename, desc->status);
 
         if ( ssid )
             printk("Z=%-25s ", ssid);
@@ -2408,8 +2414,7 @@ void fixup_irqs(const cpumask_t *mask, b
                 release_old_vec(desc);
         }
 
-        cpumask_copy(&affinity, desc->affinity);
-        if ( !desc->action || cpumask_subset(&affinity, mask) )
+        if ( !desc->action || cpumask_subset(desc->affinity, mask) )
         {
             spin_unlock(&desc->lock);
             continue;
@@ -2433,12 +2438,13 @@ void fixup_irqs(const cpumask_t *mask, b
             desc->arch.move_in_progress = 0;
         }
 
-        cpumask_and(&affinity, &affinity, mask);
-        if ( cpumask_empty(&affinity) )
+        if ( !cpumask_intersects(mask, desc->affinity) )
         {
             break_affinity = true;
-            cpumask_copy(&affinity, mask);
+            cpumask_setall(&affinity);
         }
+        else
+            cpumask_copy(&affinity, desc->affinity);
 
         if ( desc->handler->disable )
             desc->handler->disable(desc);
--- a/xen/include/xen/irq.h
+++ b/xen/include/xen/irq.h
@@ -162,11 +162,6 @@ extern irq_desc_t *domain_spin_lock_irq_
 extern irq_desc_t *pirq_spin_lock_irq_desc(
     const struct pirq *, unsigned long *pflags);
 
-static inline void set_native_irq_info(unsigned int irq, const cpumask_t *mask)
-{
-    cpumask_copy(irq_to_desc(irq)->affinity, mask);
-}
-
 unsigned int set_desc_affinity(struct irq_desc *, const cpumask_t *);
 
 #ifndef arch_hwdom_irqs





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread


* [PATCH 5/9] x86/IRQ: fix locking around vector management
@ 2019-04-29 11:25   ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-04-29 11:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
fields, and hence ought to be called with the descriptor lock held in
addition to vector_lock. This is currently the case for only
set_desc_affinity() and destroy_irq(), which also clarifies what the
nesting behavior between the locks has to be. Reflect the new
expectation by having these functions all take a descriptor as
parameter instead of an interrupt number.

Drop one of the two leading underscores from all three functions at
the same time.

There's one case left where descriptors get manipulated with just
vector_lock held: setup_vector_irq() expects its caller to hold
vector_lock, and hence can't itself acquire the descriptor locks (doing
so would invert the lock order). I don't currently see how to address
this.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -27,6 +27,7 @@
 #include <public/physdev.h>
 
 static int parse_irq_vector_map_param(const char *s);
+static void _clear_irq_vector(struct irq_desc *desc);
 
 /* opt_noirqbalance: If true, software IRQ balancing/affinity is disabled. */
 bool __read_mostly opt_noirqbalance;
@@ -112,13 +113,12 @@ static void trace_irq_mask(u32 event, in
     trace_var(event, 1, sizeof(d), &d);
 }
 
-static int __init __bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
+static int __init _bind_irq_vector(struct irq_desc *desc, int vector,
+                                   const cpumask_t *cpu_mask)
 {
     cpumask_t online_mask;
     int cpu;
-    struct irq_desc *desc = irq_to_desc(irq);
 
-    BUG_ON((unsigned)irq >= nr_irqs);
     BUG_ON((unsigned)vector >= NR_VECTORS);
 
     cpumask_and(&online_mask, cpu_mask, &cpu_online_map);
@@ -129,9 +129,9 @@ static int __init __bind_irq_vector(int
         return 0;
     if ( desc->arch.vector != IRQ_VECTOR_UNASSIGNED )
         return -EBUSY;
-    trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, irq, vector, &online_mask);
+    trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, desc->irq, vector, &online_mask);
     for_each_cpu(cpu, &online_mask)
-        per_cpu(vector_irq, cpu)[vector] = irq;
+        per_cpu(vector_irq, cpu)[vector] = desc->irq;
     desc->arch.vector = vector;
     cpumask_copy(desc->arch.cpu_mask, &online_mask);
     if ( desc->arch.used_vectors )
@@ -145,12 +145,18 @@ static int __init __bind_irq_vector(int
 
 int __init bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
 {
+    struct irq_desc *desc = irq_to_desc(irq);
     unsigned long flags;
     int ret;
 
-    spin_lock_irqsave(&vector_lock, flags);
-    ret = __bind_irq_vector(irq, vector, cpu_mask);
-    spin_unlock_irqrestore(&vector_lock, flags);
+    BUG_ON((unsigned)irq >= nr_irqs);
+
+    spin_lock_irqsave(&desc->lock, flags);
+    spin_lock(&vector_lock);
+    ret = _bind_irq_vector(desc, vector, cpu_mask);
+    spin_unlock(&vector_lock);
+    spin_unlock_irqrestore(&desc->lock, flags);
+
     return ret;
 }
 
@@ -235,7 +241,9 @@ void destroy_irq(unsigned int irq)
 
     spin_lock_irqsave(&desc->lock, flags);
     desc->handler = &no_irq_type;
-    clear_irq_vector(irq);
+    spin_lock(&vector_lock);
+    _clear_irq_vector(desc);
+    spin_unlock(&vector_lock);
     desc->arch.used_vectors = NULL;
     spin_unlock_irqrestore(&desc->lock, flags);
 
@@ -256,11 +264,11 @@ static void release_old_vec(struct irq_d
     }
 }
 
-static void __clear_irq_vector(int irq)
+static void _clear_irq_vector(struct irq_desc *desc)
 {
-    int cpu, vector, old_vector;
+    unsigned int cpu;
+    int vector, old_vector, irq = desc->irq;
     cpumask_t tmp_mask;
-    struct irq_desc *desc = irq_to_desc(irq);
 
     BUG_ON(!desc->arch.vector);
 
@@ -306,11 +314,14 @@ static void __clear_irq_vector(int irq)
 
 void clear_irq_vector(int irq)
 {
+    struct irq_desc *desc = irq_to_desc(irq);
     unsigned long flags;
 
-    spin_lock_irqsave(&vector_lock, flags);
-    __clear_irq_vector(irq);
-    spin_unlock_irqrestore(&vector_lock, flags);
+    spin_lock_irqsave(&desc->lock, flags);
+    spin_lock(&vector_lock);
+    _clear_irq_vector(desc);
+    spin_unlock(&vector_lock);
+    spin_unlock_irqrestore(&desc->lock, flags);
 }
 
 int irq_to_vector(int irq)
@@ -445,8 +456,7 @@ static vmask_t *irq_get_used_vector_mask
     return ret;
 }
 
-static int __assign_irq_vector(
-    int irq, struct irq_desc *desc, const cpumask_t *mask)
+static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
 {
     /*
      * NOTE! The local APIC isn't very good at handling
@@ -460,7 +470,8 @@ static int __assign_irq_vector(
      * 0x80, because int 0x80 is hm, kind of importantish. ;)
      */
     static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0;
-    int cpu, err, old_vector;
+    unsigned int cpu;
+    int err, old_vector, irq = desc->irq;
     cpumask_t tmp_mask;
     vmask_t *irq_used_vectors = NULL;
 
@@ -570,8 +581,12 @@ int assign_irq_vector(int irq, const cpu
     
     BUG_ON(irq >= nr_irqs || irq <0);
 
-    spin_lock_irqsave(&vector_lock, flags);
-    ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS);
+    spin_lock_irqsave(&desc->lock, flags);
+
+    spin_lock(&vector_lock);
+    ret = _assign_irq_vector(desc, mask ?: TARGET_CPUS);
+    spin_unlock(&vector_lock);
+
     if ( !ret )
     {
         ret = desc->arch.vector;
@@ -580,7 +595,8 @@ int assign_irq_vector(int irq, const cpu
         else
             cpumask_setall(desc->affinity);
     }
-    spin_unlock_irqrestore(&vector_lock, flags);
+
+    spin_unlock_irqrestore(&desc->lock, flags);
 
     return ret;
 }
@@ -754,7 +770,6 @@ void irq_complete_move(struct irq_desc *
 
 unsigned int set_desc_affinity(struct irq_desc *desc, const cpumask_t *mask)
 {
-    unsigned int irq;
     int ret;
     unsigned long flags;
     cpumask_t dest_mask;
@@ -762,10 +777,8 @@ unsigned int set_desc_affinity(struct ir
     if (!cpumask_intersects(mask, &cpu_online_map))
         return BAD_APICID;
 
-    irq = desc->irq;
-
     spin_lock_irqsave(&vector_lock, flags);
-    ret = __assign_irq_vector(irq, desc, mask);
+    ret = _assign_irq_vector(desc, mask);
     spin_unlock_irqrestore(&vector_lock, flags);
 
     if (ret < 0)






* [PATCH 6/9] x86/IRQ: reduce unused space in struct arch_irq_desc
@ 2019-04-29 11:25   ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-04-29 11:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/include/asm-x86/irq.h
+++ b/xen/include/asm-x86/irq.h
@@ -35,8 +35,8 @@ struct arch_irq_desc {
         cpumask_var_t cpu_mask;
         cpumask_var_t old_cpu_mask;
         cpumask_var_t pending_mask;
-        unsigned move_cleanup_count;
         vmask_t *used_vectors;
+        unsigned move_cleanup_count;
         u8 move_in_progress : 1;
         s8 used;
 };





* [PATCH 7/9] x86/IRQ: drop redundant cpumask_empty() from move_masked_irq()
@ 2019-04-29 11:26   ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-04-29 11:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

The subsequent cpumask_intersects() covers the "empty" case quite fine.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -638,9 +638,6 @@ void move_masked_irq(struct irq_desc *de
     
     desc->status &= ~IRQ_MOVE_PENDING;
 
-    if (unlikely(cpumask_empty(pending_mask)))
-        return;
-
     if (!desc->handler->set_affinity)
         return;
 





* [PATCH 8/9] x86/IRQ: make fixup_irqs() skip unconnected internally used interrupts
@ 2019-04-29 11:26   ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-04-29 11:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Since the "Cannot set affinity ..." warning is a one-time one, avoid
triggering it already at boot time, when secondary threads get parked
while the serial console uses a PCI IRQ that is still unconnected at
that point.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2412,8 +2412,20 @@ void fixup_irqs(const cpumask_t *mask, b
         vector = irq_to_vector(irq);
         if ( vector >= FIRST_HIPRIORITY_VECTOR &&
              vector <= LAST_HIPRIORITY_VECTOR )
+        {
             cpumask_and(desc->arch.cpu_mask, desc->arch.cpu_mask, mask);
 
+            /*
+             * This can in particular happen when parking secondary threads
+             * during boot and when the serial console wants to use a PCI IRQ.
+             */
+            if ( desc->handler == &no_irq_type )
+            {
+                spin_unlock(&desc->lock);
+                continue;
+            }
+        }
+
         if ( desc->arch.move_cleanup_count )
         {
             /* The cleanup IPI may have got sent while we were still online. */






* [PATCH 9/9] x86/IO-APIC: drop an unused variable from setup_IO_APIC_irqs()
@ 2019-04-29 11:27   ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-04-29 11:27 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Must be a left-over from earlier days.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -984,8 +984,6 @@ static void __init setup_IO_APIC_irqs(vo
 
     for (apic = 0; apic < nr_ioapics; apic++) {
         for (pin = 0; pin < nr_ioapic_entries[apic]; pin++) {
-            struct irq_desc *desc;
-
             /*
              * add it to the IO-APIC irq-routing table:
              */
@@ -1038,7 +1036,6 @@ static void __init setup_IO_APIC_irqs(vo
             if (platform_legacy_irq(irq))
                 disable_8259A_irq(irq_to_desc(irq));
 
-            desc = irq_to_desc(irq);
             SET_DEST(entry, logical, cpu_mask_to_apicid(TARGET_CPUS));
             spin_lock_irqsave(&ioapic_lock, flags);
             __ioapic_write_entry(apic, pin, 0, entry);





* Re: [PATCH 9/9] x86/IO-APIC: drop an unused variable from setup_IO_APIC_irqs()
@ 2019-04-29 11:40     ` Andrew Cooper
  0 siblings, 0 replies; 196+ messages in thread
From: Andrew Cooper @ 2019-04-29 11:40 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 29/04/2019 12:27, Jan Beulich wrote:
> Must be a left-over from earlier days.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>



* Re: [PATCH 6/9] x86/IRQ: reduce unused space in struct arch_irq_desc
@ 2019-04-29 11:46     ` Andrew Cooper
  0 siblings, 0 replies; 196+ messages in thread
From: Andrew Cooper @ 2019-04-29 11:46 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 29/04/2019 12:25, Jan Beulich wrote:
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>



* Re: [PATCH RFC 1/9] x86/IRQ: deal with move-in-progress state in fixup_irqs()
@ 2019-04-29 12:55     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-04-29 12:55 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Roger Pau Monne, Wei Liu, Igor Druzhinin

>>> On 29.04.19 at 13:22, <JBeulich@suse.com> wrote:
> RFC: I've seen the new ASSERT() in irq_move_cleanup_interrupt() trigger.
>      I'm pretty sure that this assertion triggering means something else
>      is wrong, and has been even prior to this change (adding the
>      assertion without any of the other changes here should be valid in
>      my understanding).

So I think what is missing is updating of vector_irq ...

> @@ -2391,6 +2401,24 @@ void fixup_irqs(const cpumask_t *mask, b
>              continue;
>          }
>  
> +        /*
> +         * In order for the affinity adjustment below to be successful, we
> +         * need __assign_irq_vector() to succeed. This in particular means
> +         * clearing desc->arch.move_in_progress if this would otherwise
> +         * prevent the function from succeeding. Since there's no way for the
> +         * flag to get cleared anymore when there's no possible destination
> +         * left (the only possibility then would be the IRQs enabled window
> +         * after this loop), there's then also no race with us doing it here.
> +         *
> +         * Therefore the logic here and there need to remain in sync.
> +         */
> +        if ( desc->arch.move_in_progress &&
> +             !cpumask_intersects(mask, desc->arch.cpu_mask) )
> +        {
> +            release_old_vec(desc);
> +            desc->arch.move_in_progress = 0;
> +        }

... here and in the somewhat similar logic patch 2 inserts a few lines
up. I'm about to try this out, but given how rarely I've seen the
problem this will take a while to feel confident (if, of course, it helps
in the first place).

Jan





* Re: [PATCH RFC 1/9] x86/IRQ: deal with move-in-progress state in fixup_irqs()
@ 2019-04-29 13:08       ` Jan Beulich
From: Jan Beulich @ 2019-04-29 13:08 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Roger Pau Monne, Wei Liu, Igor Druzhinin

>>> On 29.04.19 at 14:55, <JBeulich@suse.com> wrote:
>>>> On 29.04.19 at 13:22, <JBeulich@suse.com> wrote:
>> RFC: I've seen the new ASSERT() in irq_move_cleanup_interrupt() trigger.
>>      I'm pretty sure that this assertion triggering means something else
>>      is wrong, and has been even prior to this change (adding the
>>      assertion without any of the other changes here should be valid in
>>      my understanding).
> 
> So I think what is missing is updating of vector_irq ...
> 
>> @@ -2391,6 +2401,24 @@ void fixup_irqs(const cpumask_t *mask, b
>>              continue;
>>          }
>>  
>> +        /*
>> +         * In order for the affinity adjustment below to be successful, we
>> +         * need __assign_irq_vector() to succeed. This in particular means
>> +         * clearing desc->arch.move_in_progress if this would otherwise
>> +         * prevent the function from succeeding. Since there's no way for the
>> +         * flag to get cleared anymore when there's no possible destination
>> +         * left (the only possibility then would be the IRQs enabled window
>> +         * after this loop), there's then also no race with us doing it here.
>> +         *
>> +         * Therefore the logic here and there need to remain in sync.
>> +         */
>> +        if ( desc->arch.move_in_progress &&
>> +             !cpumask_intersects(mask, desc->arch.cpu_mask) )
>> +        {
>> +            release_old_vec(desc);
>> +            desc->arch.move_in_progress = 0;
>> +        }
> 
> ... here and in the somewhat similar logic patch 2 inserts a few lines
> up. I'm about to try this out, but given how rarely I've seen the
> problem this will take a while to feel confident (if, of course, it helps
> in the first place).

Actually no, the 2nd patch doesn't need any change - the code
added there only deals with CPUs already marked offline.

Jan




* [PATCH v1b 1/9] x86/IRQ: deal with move-in-progress state in fixup_irqs()
@ 2019-04-29 15:40   ` Jan Beulich
From: Jan Beulich @ 2019-04-29 15:40 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Roger Pau Monne, Wei Liu, Igor Druzhinin

The flag being set may prevent affinity changes, as these often imply
assignment of a new vector. When there's no possible destination left
for the IRQ, the clearing of the flag needs to happen right from
fixup_irqs().

Additionally _assign_irq_vector() needs to avoid setting the flag when
there's no online CPU left in what gets put into ->arch.old_cpu_mask.
The old vector can be released right away in this case.

Also extend the log message about broken affinity to include the new
affinity as well, making it possible to notice when an affinity change
has not actually taken place. Swap the if/else-if order there at the
same time to reduce the number of conditions checked.

At the same time replace two open coded instances of the new helper
function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Also update vector_irq[] in the code added to fixup_irqs().

--- unstable.orig/xen/arch/x86/irq.c	2019-04-29 17:34:16.726542659 +0200
+++ unstable/xen/arch/x86/irq.c	2019-04-29 15:05:39.000000000 +0200
@@ -242,6 +242,20 @@ void destroy_irq(unsigned int irq)
     xfree(action);
 }
 
+static void release_old_vec(struct irq_desc *desc)
+{
+    unsigned int vector = desc->arch.old_vector;
+
+    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
+    cpumask_clear(desc->arch.old_cpu_mask);
+
+    if ( desc->arch.used_vectors )
+    {
+        ASSERT(test_bit(vector, desc->arch.used_vectors));
+        clear_bit(vector, desc->arch.used_vectors);
+    }
+}
+
 static void __clear_irq_vector(int irq)
 {
     int cpu, vector, old_vector;
@@ -285,14 +299,7 @@ static void __clear_irq_vector(int irq)
         per_cpu(vector_irq, cpu)[old_vector] = ~irq;
     }
 
-    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
-    cpumask_clear(desc->arch.old_cpu_mask);
-
-    if ( desc->arch.used_vectors )
-    {
-        ASSERT(test_bit(old_vector, desc->arch.used_vectors));
-        clear_bit(old_vector, desc->arch.used_vectors);
-    }
+    release_old_vec(desc);
 
     desc->arch.move_in_progress = 0;
 }
@@ -517,12 +524,21 @@ next:
         /* Found one! */
         current_vector = vector;
         current_offset = offset;
-        if (old_vector > 0) {
-            desc->arch.move_in_progress = 1;
-            cpumask_copy(desc->arch.old_cpu_mask, desc->arch.cpu_mask);
+
+        if ( old_vector > 0 )
+        {
+            cpumask_and(desc->arch.old_cpu_mask, desc->arch.cpu_mask,
+                        &cpu_online_map);
             desc->arch.old_vector = desc->arch.vector;
+            if ( !cpumask_empty(desc->arch.old_cpu_mask) )
+                desc->arch.move_in_progress = 1;
+            else
+                /* This can happen while offlining a CPU. */
+                release_old_vec(desc);
         }
+
         trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, &tmp_mask);
+
         for_each_cpu(new_cpu, &tmp_mask)
             per_cpu(vector_irq, new_cpu)[vector] = irq;
         desc->arch.vector = vector;
@@ -691,14 +707,8 @@ void irq_move_cleanup_interrupt(struct c
 
         if ( desc->arch.move_cleanup_count == 0 )
         {
-            desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
-            cpumask_clear(desc->arch.old_cpu_mask);
-
-            if ( desc->arch.used_vectors )
-            {
-                ASSERT(test_bit(vector, desc->arch.used_vectors));
-                clear_bit(vector, desc->arch.used_vectors);
-            }
+            ASSERT(vector == desc->arch.old_vector);
+            release_old_vec(desc);
         }
 unlock:
         spin_unlock(&desc->lock);
@@ -2391,6 +2401,33 @@ void fixup_irqs(const cpumask_t *mask, b
             continue;
         }
 
+        /*
+         * In order for the affinity adjustment below to be successful, we
+         * need __assign_irq_vector() to succeed. This in particular means
+         * clearing desc->arch.move_in_progress if this would otherwise
+         * prevent the function from succeeding. Since there's no way for the
+         * flag to get cleared anymore when there's no possible destination
+         * left (the only possibility then would be the IRQs enabled window
+         * after this loop), there's then also no race with us doing it here.
+         *
+         * Therefore the logic here and there need to remain in sync.
+         */
+        if ( desc->arch.move_in_progress &&
+             !cpumask_intersects(mask, desc->arch.cpu_mask) )
+        {
+            unsigned int cpu;
+
+            cpumask_and(&affinity, desc->arch.old_cpu_mask, &cpu_online_map);
+
+            spin_lock(&vector_lock);
+            for_each_cpu(cpu, &affinity)
+                per_cpu(vector_irq, cpu)[desc->arch.old_vector] = ~irq;
+            spin_unlock(&vector_lock);
+
+            release_old_vec(desc);
+            desc->arch.move_in_progress = 0;
+        }
+
         cpumask_and(&affinity, &affinity, mask);
         if ( cpumask_empty(&affinity) )
         {
@@ -2409,15 +2446,18 @@ void fixup_irqs(const cpumask_t *mask, b
         if ( desc->handler->enable )
             desc->handler->enable(desc);
 
+        cpumask_copy(&affinity, desc->affinity);
+
         spin_unlock(&desc->lock);
 
         if ( !verbose )
             continue;
 
-        if ( break_affinity && set_affinity )
-            printk("Broke affinity for irq %i\n", irq);
-        else if ( !set_affinity )
-            printk("Cannot set affinity for irq %i\n", irq);
+        if ( !set_affinity )
+            printk("Cannot set affinity for IRQ%u\n", irq);
+        else if ( break_affinity )
+            printk("Broke affinity for IRQ%u, new: %*pb\n",
+                   irq, nr_cpu_ids, &affinity);
     }
 
     /* That doesn't seem sufficient.  Give it 1ms. */





* Re: [PATCH v1b 1/9] x86/IRQ: deal with move-in-progress state in fixup_irqs()
@ 2019-05-03  9:19     ` Roger Pau Monné
From: Roger Pau Monné @ 2019-05-03  9:19 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Igor Druzhinin, Wei Liu, Andrew Cooper

On Mon, Apr 29, 2019 at 09:40:14AM -0600, Jan Beulich wrote:
> The flag being set may prevent affinity changes, as these often imply
> assignment of a new vector. When there's no possible destination left
> for the IRQ, the clearing of the flag needs to happen right from
> fixup_irqs().
> 
> Additionally _assign_irq_vector() needs to avoid setting the flag when
> there's no online CPU left in what gets put into ->arch.old_cpu_mask.
> The old vector can be released right away in this case.
> 
> Also extend the log message about broken affinity to include the new
> affinity as well, allowing to notice issues with affinity changes not
> actually having taken place. Swap the if/else-if order there at the
> same time to reduce the amount of conditions checked.
> 
> At the same time replace two open coded instances of the new helper
> function.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Also update vector_irq[] in the code added to fixup_irqs().
> 
> --- unstable.orig/xen/arch/x86/irq.c	2019-04-29 17:34:16.726542659 +0200
> +++ unstable/xen/arch/x86/irq.c	2019-04-29 15:05:39.000000000 +0200
> @@ -242,6 +242,20 @@ void destroy_irq(unsigned int irq)
>      xfree(action);
>  }
>  
> +static void release_old_vec(struct irq_desc *desc)
> +{
> +    unsigned int vector = desc->arch.old_vector;
> +
> +    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> +    cpumask_clear(desc->arch.old_cpu_mask);
> +
> +    if ( desc->arch.used_vectors )

Wouldn't it be better to clean the bitmap when vector !=
IRQ_VECTOR_UNASSIGNED?

I haven't checked all the callers, but I don't think it's valid to
call release_old_vec with desc->arch.old_vector ==
IRQ_VECTOR_UNASSIGNED, in which case I would add an ASSERT.

> +    {
> +        ASSERT(test_bit(vector, desc->arch.used_vectors));
> +        clear_bit(vector, desc->arch.used_vectors);
> +    }
> +}
> +
>  static void __clear_irq_vector(int irq)
>  {
>      int cpu, vector, old_vector;
> @@ -285,14 +299,7 @@ static void __clear_irq_vector(int irq)

Kind of unrelated, but I think the check at the top of
__clear_irq_vector should be:

BUG_ON(desc->arch.vector == IRQ_VECTOR_UNASSIGNED);

Rather than the current:

BUG_ON(!desc->arch.vector);

There's a lot of logic that would go extremely wrong if vector is -1.

>          per_cpu(vector_irq, cpu)[old_vector] = ~irq;
>      }
>  
> -    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> -    cpumask_clear(desc->arch.old_cpu_mask);
> -
> -    if ( desc->arch.used_vectors )
> -    {
> -        ASSERT(test_bit(old_vector, desc->arch.used_vectors));
> -        clear_bit(old_vector, desc->arch.used_vectors);
> -    }
> +    release_old_vec(desc);
>  
>      desc->arch.move_in_progress = 0;

While there it might be nice to convert move_in_progress to a boolean.

>  }
> @@ -517,12 +524,21 @@ next:
>          /* Found one! */
>          current_vector = vector;
>          current_offset = offset;
> -        if (old_vector > 0) {
> -            desc->arch.move_in_progress = 1;
> -            cpumask_copy(desc->arch.old_cpu_mask, desc->arch.cpu_mask);
> +
> +        if ( old_vector > 0 )
> +        {
> +            cpumask_and(desc->arch.old_cpu_mask, desc->arch.cpu_mask,
> +                        &cpu_online_map);
>              desc->arch.old_vector = desc->arch.vector;
> +            if ( !cpumask_empty(desc->arch.old_cpu_mask) )
> +                desc->arch.move_in_progress = 1;
> +            else
> +                /* This can happen while offlining a CPU. */
> +                release_old_vec(desc);
>          }
> +
>          trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, &tmp_mask);
> +
>          for_each_cpu(new_cpu, &tmp_mask)
>              per_cpu(vector_irq, new_cpu)[vector] = irq;
>          desc->arch.vector = vector;
> @@ -691,14 +707,8 @@ void irq_move_cleanup_interrupt(struct c
>  
>          if ( desc->arch.move_cleanup_count == 0 )
>          {
> -            desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> -            cpumask_clear(desc->arch.old_cpu_mask);
> -
> -            if ( desc->arch.used_vectors )
> -            {
> -                ASSERT(test_bit(vector, desc->arch.used_vectors));
> -                clear_bit(vector, desc->arch.used_vectors);
> -            }
> +            ASSERT(vector == desc->arch.old_vector);
> +            release_old_vec(desc);
>          }
>  unlock:
>          spin_unlock(&desc->lock);
> @@ -2391,6 +2401,33 @@ void fixup_irqs(const cpumask_t *mask, b
>              continue;
>          }
>  
> +        /*
> +         * In order for the affinity adjustment below to be successful, we
> +         * need __assign_irq_vector() to succeed. This in particular means
> +         * clearing desc->arch.move_in_progress if this would otherwise
> +         * prevent the function from succeeding. Since there's no way for the
> +         * flag to get cleared anymore when there's no possible destination
> +         * left (the only possibility then would be the IRQs enabled window
> +         * after this loop), there's then also no race with us doing it here.
> +         *
> +         * Therefore the logic here and there need to remain in sync.
> +         */
> +        if ( desc->arch.move_in_progress &&
> +             !cpumask_intersects(mask, desc->arch.cpu_mask) )
> +        {
> +            unsigned int cpu;
> +
> +            cpumask_and(&affinity, desc->arch.old_cpu_mask, &cpu_online_map);
> +
> +            spin_lock(&vector_lock);
> +            for_each_cpu(cpu, &affinity)
> +                per_cpu(vector_irq, cpu)[desc->arch.old_vector] = ~irq;
> +            spin_unlock(&vector_lock);
> +
> +            release_old_vec(desc);
> +            desc->arch.move_in_progress = 0;
> +        }
> +
>          cpumask_and(&affinity, &affinity, mask);
>          if ( cpumask_empty(&affinity) )
>          {
> @@ -2409,15 +2446,18 @@ void fixup_irqs(const cpumask_t *mask, b
>          if ( desc->handler->enable )
>              desc->handler->enable(desc);
>  
> +        cpumask_copy(&affinity, desc->affinity);
> +
>          spin_unlock(&desc->lock);
>  
>          if ( !verbose )
>              continue;
>  
> -        if ( break_affinity && set_affinity )
> -            printk("Broke affinity for irq %i\n", irq);
> -        else if ( !set_affinity )
> -            printk("Cannot set affinity for irq %i\n", irq);
> +        if ( !set_affinity )
> +            printk("Cannot set affinity for IRQ%u\n", irq);
> +        else if ( break_affinity )
> +            printk("Broke affinity for IRQ%u, new: %*pb\n",
> +                   irq, nr_cpu_ids, &affinity);

I guess it's fine to have those without rate-limiting because
fixup_irqs is only called for admin-triggered actions, so there's no
risk of console flooding.

Thanks, Roger.


* Re: [PATCH v1b 1/9] x86/IRQ: deal with move-in-progress state in fixup_irqs()
@ 2019-05-03 14:10       ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-03 14:10 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel, Igor Druzhinin

>>> On 03.05.19 at 11:19, <roger.pau@citrix.com> wrote:
> On Mon, Apr 29, 2019 at 09:40:14AM -0600, Jan Beulich wrote:
>> --- unstable.orig/xen/arch/x86/irq.c	
>> +++ unstable/xen/arch/x86/irq.c
>> @@ -242,6 +242,20 @@ void destroy_irq(unsigned int irq)
>>      xfree(action);
>>  }
>>  
>> +static void release_old_vec(struct irq_desc *desc)
>> +{
>> +    unsigned int vector = desc->arch.old_vector;
>> +
>> +    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
>> +    cpumask_clear(desc->arch.old_cpu_mask);
>> +
>> +    if ( desc->arch.used_vectors )
> 
> Wouldn't it be better to clean the bitmap when vector !=
> IRQ_VECTOR_UNASSIGNED?

No code path does / should call into here without the need to
actually release the previous vector.

> I haven't checked all the callers, but I don't think it's valid to
> call release_old_vec with desc->arch.old_vector ==
> IRQ_VECTOR_UNASSIGNED, in which case I would add an ASSERT.

Well, yes, I probably could. However, as much as I'm in
favor of ASSERT()s, I don't think it makes sense to ASSERT()
basically every bit of expected state. In the end there would
otherwise be more ASSERT()s than actual code.

>> +    {
>> +        ASSERT(test_bit(vector, desc->arch.used_vectors));
>> +        clear_bit(vector, desc->arch.used_vectors);
>> +    }
>> +}
>> +
>>  static void __clear_irq_vector(int irq)
>>  {
>>      int cpu, vector, old_vector;
>> @@ -285,14 +299,7 @@ static void __clear_irq_vector(int irq)
> 
> Kind of unrelated, but I think the check at the top of
> __clear_irq_vector should be:
> 
> BUG_ON(desc->arch.vector == IRQ_VECTOR_UNASSIGNED);
> 
> Rather than the current:
> 
> BUG_ON(!desc->arch.vector);
> 
> There's a lot of logic that would go extremely wrong if vector is -1.

Yes indeed. Do you want to send a patch, or should I add
one at the end of this series?
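For reference, the reason the existing check is too weak: IRQ_VECTOR_UNASSIGNED is -1, which is non-zero and therefore slips past BUG_ON(!desc->arch.vector). A minimal stand-alone illustration (plain C, not Xen code; the helper names are made up for the example):

```c
#include <stdbool.h>

#define IRQ_VECTOR_UNASSIGNED (-1)

/* The current check: only fires for vector == 0. */
static bool old_check_fires(int vector)
{
    return !vector;
}

/* The suggested check: fires for an unassigned (-1) vector. */
static bool new_check_fires(int vector)
{
    return vector == IRQ_VECTOR_UNASSIGNED;
}
```

old_check_fires(IRQ_VECTOR_UNASSIGNED) is false, so a -1 vector would sail through the existing BUG_ON() and reach the logic that treats it as an array index.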

>>          per_cpu(vector_irq, cpu)[old_vector] = ~irq;
>>      }
>>  
>> -    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
>> -    cpumask_clear(desc->arch.old_cpu_mask);
>> -
>> -    if ( desc->arch.used_vectors )
>> -    {
>> -        ASSERT(test_bit(old_vector, desc->arch.used_vectors));
>> -        clear_bit(old_vector, desc->arch.used_vectors);
>> -    }
>> +    release_old_vec(desc);
>>  
>>      desc->arch.move_in_progress = 0;
> 
> While there it might be nice to convert move_in_progress to a boolean.

This would grow the patch quite a bit I think, so I'd prefer not to.

>> @@ -2409,15 +2446,18 @@ void fixup_irqs(const cpumask_t *mask, b
>>          if ( desc->handler->enable )
>>              desc->handler->enable(desc);
>>  
>> +        cpumask_copy(&affinity, desc->affinity);
>> +
>>          spin_unlock(&desc->lock);
>>  
>>          if ( !verbose )
>>              continue;
>>  
>> -        if ( break_affinity && set_affinity )
>> -            printk("Broke affinity for irq %i\n", irq);
>> -        else if ( !set_affinity )
>> -            printk("Cannot set affinity for irq %i\n", irq);
>> +        if ( !set_affinity )
>> +            printk("Cannot set affinity for IRQ%u\n", irq);
>> +        else if ( break_affinity )
>> +            printk("Broke affinity for IRQ%u, new: %*pb\n",
>> +                   irq, nr_cpu_ids, &affinity);
> 
> I guess it's fine to have those without rate-limiting because
> fixup_irqs is only called for admin-triggered actions, so there's no
> risk of console flooding.

Right, plus I'd rather not hide any of these messages: their
being there was already a good indication that something
_might_ be going wrong. If we got to the point where we're
fully confident in the code, then we could think about lowering
their log level, or rate limiting them.

Jan





* Re: [PATCH 2/9] x86/IRQ: deal with move cleanup count state in fixup_irqs()
@ 2019-05-03 15:21     ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-03 15:21 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Mon, Apr 29, 2019 at 05:23:20AM -0600, Jan Beulich wrote:
> The cleanup IPI may get sent immediately before a CPU gets removed from
> the online map. In such a case the IPI would get handled on the CPU
> being offlined no earlier than in the interrupts disabled window after
> fixup_irqs()' main loop. This is too late, however, because a possible
> affinity change may incur the need for vector assignment, which will
> fail when the IRQ's move cleanup count is still non-zero.
> 
> To fix this
> - record the set of CPUs the cleanup IPIs gets actually sent to alongside
>   setting their count,
> - adjust the count in fixup_irqs(), accounting for all CPUs that the
>   cleanup IPI was sent to, but that are no longer online,
> - bail early from the cleanup IPI handler when the CPU is no longer
>   online, to prevent double accounting.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Just as a note, this whole interrupt migration business seems
extremely complex, and I wonder whether Xen really needs it, or what
exactly its performance gain is compared to simpler solutions. I
understand this is just fixes, but IMO it's making the logic even more
complex.

Maybe it would be simpler to have the interrupts hard-bound to pCPUs
and instead have a soft-affinity on the guest vCPUs that are assigned
as the destination?

> ---
> TBD: The proper recording of the IPI destinations actually makes the
>      move_cleanup_count field redundant. Do we want to drop it, at the
>      price of a few more CPU-mask operations?

AFAICT this is not a hot path, so I would remove the
move_cleanup_count field and just weight the cpu bitmap when needed.
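Deriving the count from the recorded mask on demand could be sketched like this (plain C with a 64-bit mask standing in for Xen's cpumask_t; the function and parameter names are illustrative, not Xen's):

```c
#include <stdint.h>

/* Number of cleanup IPIs still outstanding: CPUs the IPI was sent to
 * that are still online (offlined CPUs can no longer handle it). */
static unsigned int move_cleanup_remaining(uint64_t ipi_sent_mask,
                                           uint64_t online_mask)
{
    return (unsigned int)__builtin_popcountll(ipi_sent_mask & online_mask);
}
```

In Xen terms this would be a cpumask_and() followed by cpumask_weight(), which is why it only costs a few extra CPU-mask operations.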

Thanks, Roger.



* Re: [PATCH 3/9] x86/IRQ: improve dump_irqs()
@ 2019-05-03 15:43     ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-03 15:43 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Mon, Apr 29, 2019 at 05:23:49AM -0600, Jan Beulich wrote:
> Don't log a stray trailing comma. Shorten a few fields.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> -            for ( i = 0; i < action->nr_guests; i++ )
> +            for ( i = 0; i < action->nr_guests; )
>              {
> -                d = action->guest[i];
> +                d = action->guest[i++];

Per my taste I would leave the increment in the for, but it's just
taste.



* Re: [PATCH 4/9] x86/IRQ: desc->affinity should strictly represent the requested value
@ 2019-05-03 16:21     ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-03 16:21 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Mon, Apr 29, 2019 at 05:24:39AM -0600, Jan Beulich wrote:
> desc->arch.cpu_mask reflects the actual set of target CPUs. Don't ever
> fiddle with desc->affinity itself, except to store caller requested
> values.
> 
> This renders both set_native_irq_info() uses (which weren't using proper
> locking anyway) redundant - drop the function altogether.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> 
> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -1042,7 +1042,6 @@ static void __init setup_IO_APIC_irqs(vo
>              SET_DEST(entry, logical, cpu_mask_to_apicid(TARGET_CPUS));
>              spin_lock_irqsave(&ioapic_lock, flags);
>              __ioapic_write_entry(apic, pin, 0, entry);
> -            set_native_irq_info(irq, TARGET_CPUS);
>              spin_unlock_irqrestore(&ioapic_lock, flags);
>          }
>      }
> @@ -2251,7 +2250,6 @@ int io_apic_set_pci_routing (int ioapic,
>  
>      spin_lock_irqsave(&ioapic_lock, flags);
>      __ioapic_write_entry(ioapic, pin, 0, entry);
> -    set_native_irq_info(irq, TARGET_CPUS);
>      spin_unlock(&ioapic_lock);
>  
>      spin_lock(&desc->lock);
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -572,11 +572,16 @@ int assign_irq_vector(int irq, const cpu
>  
>      spin_lock_irqsave(&vector_lock, flags);
>      ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS);
> -    if (!ret) {
> +    if ( !ret )
> +    {
>          ret = desc->arch.vector;
> -        cpumask_copy(desc->affinity, desc->arch.cpu_mask);
> +        if ( mask )
> +            cpumask_copy(desc->affinity, mask);
> +        else
> +            cpumask_setall(desc->affinity);

I guess it's fine to use setall instead of copying the cpu online map
here?

AFAICT __assign_irq_vector already filters offline CPUs from the
passed mask.
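The equivalence in question can be seen with plain bit masks (a stand-in for cpumask_t, not Xen code): once the assignment path intersects the requested mask with the online map, a fully-set mask and a copy of the online map yield the same effective target set.

```c
#include <stdint.h>

/* Mimics the filtering done on the requested mask: only online CPUs
 * can end up as interrupt targets. */
static uint64_t effective_targets(uint64_t requested, uint64_t online)
{
    return requested & online;
}
```

So storing a setall mask in desc->affinity loses nothing, and has the advantage of staying valid as CPUs come and go.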

Thanks, Roger.



* Re: [PATCH v1b 1/9] x86/IRQ: deal with move-in-progress state in fixup_irqs()
@ 2019-05-06  7:15         ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-06  7:15 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel, Igor Druzhinin

>>> On 03.05.19 at 16:10, <JBeulich@suse.com> wrote:
>>>> On 03.05.19 at 11:19, <roger.pau@citrix.com> wrote:
>> On Mon, Apr 29, 2019 at 09:40:14AM -0600, Jan Beulich wrote:
>>> --- unstable.orig/xen/arch/x86/irq.c	
>>> +++ unstable/xen/arch/x86/irq.c
>>> @@ -242,6 +242,20 @@ void destroy_irq(unsigned int irq)
>>>      xfree(action);
>>>  }
>>>  
>>> +static void release_old_vec(struct irq_desc *desc)
>>> +{
>>> +    unsigned int vector = desc->arch.old_vector;
>>> +
>>> +    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
>>> +    cpumask_clear(desc->arch.old_cpu_mask);
>>> +
>>> +    if ( desc->arch.used_vectors )
>> 
>> Wouldn't it be better to clean the bitmap when vector !=
>> IRQ_VECTOR_UNASSIGNED?
> 
> No code path does / should call into here without the need to
> actually release the previous vector.
> 
>> I haven't checked all the callers, but I don't think it's valid to
>> call release_old_vec with desc->arch.old_vector ==
>> IRQ_VECTOR_UNASSIGNED, in which case I would add an ASSERT.
> 
> Well, yes, I probably could. However, as much as I'm in
> favor of ASSERT()s, I don't think it makes sense to ASSERT()
> basically every bit of expected state. In the end there would
> otherwise be more ASSERT()s than actual code.

Actually, upon second thought - let me add this, but then in an
even more strict form: Certain very low and very high numbered
vectors are illegal here as well, and we may then be able to use
the same validation helper elsewhere (in particular also for the
check that you've found to be wrong in _clear_irq_vector()).
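Such a stricter helper might take a shape like the following (a guess at the idea, not the committed code; the vector bounds are illustrative stand-ins for whatever limits Xen's vector layout actually defines):

```c
#include <stdbool.h>

/* Illustrative bounds: below the dynamic range sit exceptions and
 * reserved vectors, above it sit special-purpose vectors. */
#define FIRST_DYNAMIC_VECTOR 0x20
#define LAST_USABLE_VECTOR   0xef

static bool valid_irq_vector(unsigned int vector)
{
    /* Also rejects IRQ_VECTOR_UNASSIGNED (-1), which becomes a huge
     * value once converted to unsigned. */
    return vector >= FIRST_DYNAMIC_VECTOR && vector <= LAST_USABLE_VECTOR;
}
```

A single predicate like this could then back both the ASSERT() in release_old_vec() and the corrected BUG_ON() in _clear_irq_vector().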

Jan





* Re: [PATCH 2/9] x86/IRQ: deal with move cleanup count state in fixup_irqs()
@ 2019-05-06  7:44       ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-06  7:44 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 03.05.19 at 17:21, <roger.pau@citrix.com> wrote:
> On Mon, Apr 29, 2019 at 05:23:20AM -0600, Jan Beulich wrote:
>> The cleanup IPI may get sent immediately before a CPU gets removed from
>> the online map. In such a case the IPI would get handled on the CPU
>> being offlined no earlier than in the interrupts disabled window after
>> fixup_irqs()' main loop. This is too late, however, because a possible
>> affinity change may incur the need for vector assignment, which will
>> fail when the IRQ's move cleanup count is still non-zero.
>> 
>> To fix this
>> - record the set of CPUs the cleanup IPIs gets actually sent to alongside
>>   setting their count,
>> - adjust the count in fixup_irqs(), accounting for all CPUs that the
>>   cleanup IPI was sent to, but that are no longer online,
>> - bail early from the cleanup IPI handler when the CPU is no longer
>>   online, to prevent double accounting.
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

> Just as a note, this whole interrupt migration business seems
> extremely complex, and I wonder whether Xen really needs it, or what
> exactly its performance gain is compared to simpler solutions.

What more simple solutions would you think about? IRQ affinities
tracking their assigned-vCPU ones was added largely to avoid
high rate interrupts always arriving on a CPU other than the one
where the actual handling will take place. Arguably this may go
too far for low rate interrupts, but adding a respective heuristic
would rather further complicate handling.

> I understand this is just fixes, but IMO it's making the logic even more
> complex.
> 
> Maybe it would be simpler to have the interrupts hard-bound to pCPUs
> and instead have a soft-affinity on the guest vCPUs that are assigned
> as the destination?

How would one calculate the soft affinity of a vCPU that has
multiple IRQs (with at most partially overlapping affinities) to be
serviced by it?

>> ---
>> TBD: The proper recording of the IPI destinations actually makes the
>>      move_cleanup_count field redundant. Do we want to drop it, at the
>>      price of a few more CPU-mask operations?
> 
> AFAICT this is not a hot path, so I would remove the
> move_cleanup_count field and just weight the cpu bitmap when needed.

Added for v2 (pending successful testing).

Jan




* Re: [PATCH 3/9] x86/IRQ: improve dump_irqs()
@ 2019-05-06  8:06       ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-06  8:06 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 03.05.19 at 17:43, <roger.pau@citrix.com> wrote:
> On Mon, Apr 29, 2019 at 05:23:49AM -0600, Jan Beulich wrote:
>> Don't log a stray trailing comma. Shorten a few fields.
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> -            for ( i = 0; i < action->nr_guests; i++ )
>> +            for ( i = 0; i < action->nr_guests; )
>>              {
>> -                d = action->guest[i];
>> +                d = action->guest[i++];
> 
> Per my taste I would leave the increment in the for, but it's just
> taste.

If it was just taste, I'd have left it there, but there is

                printk("d%d:%3d(%c%c%c)%c",
                       d->domain_id, pirq,
                       evtchn_port_is_pending(d, info->evtchn) ? 'P' : '-',
                       evtchn_port_is_masked(d, info->evtchn) ? 'M' : '-',
                       info->masked ? 'M' : '-',
                       i < action->nr_guests ? ',' : '\n');

which depends on the early increment (or else one would need to add
" + 1" or " - 1" on one side of the <). In fact this change is the
"don't log a stray trailing comma" part.

Jan



* Re: [Xen-devel] [PATCH 3/9] x86/IRQ: improve dump_irqs()
@ 2019-05-06  8:06       ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-06  8:06 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 03.05.19 at 17:43, <roger.pau@citrix.com> wrote:
> On Mon, Apr 29, 2019 at 05:23:49AM -0600, Jan Beulich wrote:
>> Don't log a stray trailing comma. Shorten a few fields.
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> -            for ( i = 0; i < action->nr_guests; i++ )
>> +            for ( i = 0; i < action->nr_guests; )
>>              {
>> -                d = action->guest[i];
>> +                d = action->guest[i++];
> 
> Per my taste I would leave the increment in the for, but it's just
> taste.

If it was just taste, I'd have left it there, but there is

                printk("d%d:%3d(%c%c%c)%c",
                       d->domain_id, pirq,
                       evtchn_port_is_pending(d, info->evtchn) ? 'P' : '-',
                       evtchn_port_is_masked(d, info->evtchn) ? 'M' : '-',
                       info->masked ? 'M' : '-',
                       i < action->nr_guests ? ',' : '\n');

which depends on the early increment (or else would need adding
" + 1" or " - 1" on one side of the < . In fact this change is the
"don't log a stray trailing comma" part.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH 4/9] x86/IRQ: desc->affinity should strictly represent the requested value
@ 2019-05-06  8:14       ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-06  8:14 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 03.05.19 at 18:21, <roger.pau@citrix.com> wrote:
> On Mon, Apr 29, 2019 at 05:24:39AM -0600, Jan Beulich wrote:
>> desc->arch.cpu_mask reflects the actual set of target CPUs. Don't ever
>> fiddle with desc->affinity itself, except to store caller requested
>> values.
>> 
>> This renders both set_native_irq_info() uses (which weren't using proper
>> locking anyway) redundant - drop the function altogether.
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> --- a/xen/arch/x86/irq.c
>> +++ b/xen/arch/x86/irq.c
>> @@ -572,11 +572,16 @@ int assign_irq_vector(int irq, const cpu
>>  
>>      spin_lock_irqsave(&vector_lock, flags);
>>      ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS);
>> -    if (!ret) {
>> +    if ( !ret )
>> +    {
>>          ret = desc->arch.vector;
>> -        cpumask_copy(desc->affinity, desc->arch.cpu_mask);
>> +        if ( mask )
>> +            cpumask_copy(desc->affinity, mask);
>> +        else
>> +            cpumask_setall(desc->affinity);
> 
> I guess it's fine to use setall instead of copying the cpu online map
> here?

It's not only fine, it's actually one of the goals: This way you can set
affinities such that they won't need adjustment after bringing up
another CPU. I've added a respective sentence to the description.

> AFAICT __assign_irq_vector already filters offline CPUs from the
> passed mask.

Indeed. And all other involved code should, too, by now. I think
there is at least one place left somewhere where the online map is
used for setting affinities, but I suppose this can be dealt with at
another time.
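A toy model of the point above, with plain bitmaps standing in for cpumask_t and made-up mask values: an all-ones stored affinity (the cpumask_setall() analogue) needs no adjustment when CPUs come online, because offline CPUs are filtered out at use time anyway:

```c
typedef unsigned long mask_t;

/* Filter offline CPUs at use time, as the vector-assignment path
 * already does with the passed mask. */
static mask_t effective_targets(mask_t affinity, mask_t online)
{
    return affinity & online;
}

static int demo(void)
{
    mask_t affinity = ~0UL;     /* cpumask_setall() analogue */
    mask_t online = 0x3;        /* CPUs 0-1 online */

    if ( effective_targets(affinity, online) != 0x3 )
        return 0;

    online |= 0x4;              /* CPU 2 brought up later */

    /* The stored affinity needed no adjustment, yet covers the new CPU. */
    return effective_targets(affinity, online) == 0x7;
}
```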

Jan


* Re: [PATCH 5/9] x86/IRQ: fix locking around vector management
@ 2019-05-06 11:48     ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-06 11:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Mon, Apr 29, 2019 at 05:25:03AM -0600, Jan Beulich wrote:
> All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
> fields, and hence ought to be called with the descriptor lock held in
> addition to vector_lock. This is currently the case for only
> set_desc_affinity() and destroy_irq(), which also clarifies what the

AFAICT set_desc_affinity is called from set_ioapic_affinity_irq which in
turn is called from setup_ioapic_dest without holding the desc lock.
Is this fine because that's only used at boot time?

> nesting behavior between the locks has to be. Reflect the new
> expectation by having these functions all take a descriptor as
> parameter instead of an interrupt number.
> 
> Drop one of the two leading underscores from all three functions at
> the same time.
> 
> There's one case left where descriptors get manipulated with just
> vector_lock held: setup_vector_irq() assumes its caller to acquire
> vector_lock, and hence can't itself acquire the descriptor locks (wrong
> lock order). I don't currently see how to address this.

Can you take the desc lock and vector lock for each irq in the second
loop of setup_vector_irq and remove the vector locking from the caller?

That might be inefficient, but it's just done for CPU initialization.

AFAICT the first loop of setup_vector_irq doesn't require any locking
since it's per-cpu initialization.

> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Change looks good to me:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> 
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -27,6 +27,7 @@
>  #include <public/physdev.h>
>  
>  static int parse_irq_vector_map_param(const char *s);
> +static void _clear_irq_vector(struct irq_desc *desc);
>  
>  /* opt_noirqbalance: If true, software IRQ balancing/affinity is disabled. */
>  bool __read_mostly opt_noirqbalance;
> @@ -112,13 +113,12 @@ static void trace_irq_mask(u32 event, in
>      trace_var(event, 1, sizeof(d), &d);
>  }
>  
> -static int __init __bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
> +static int __init _bind_irq_vector(struct irq_desc *desc, int vector,

I wouldn't be opposed to adding ASSERTs here (and in
_{assign,bind,clear}_irq_vector, set_desc_affinity and destroy_irq)
to check for lock correctness.

Thanks, Roger.

* Re: [PATCH 5/9] x86/IRQ: fix locking around vector management
@ 2019-05-06 13:06       ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-06 13:06 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 06.05.19 at 13:48, <roger.pau@citrix.com> wrote:
> On Mon, Apr 29, 2019 at 05:25:03AM -0600, Jan Beulich wrote:
>> All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
>> fields, and hence ought to be called with the descriptor lock held in
>> addition to vector_lock. This is currently the case for only
>> set_desc_affinity() and destroy_irq(), which also clarifies what the
> 
> AFAICT set_desc_affinity is called from set_ioapic_affinity_irq which in
> turn is called from setup_ioapic_dest without holding the desc lock.
> Is this fine because that's only used at boot time?

No, this isn't fine, and it's also not only called at boot time. I
simply didn't spot this case of function re-use - I had come to
the conclusion that all calls to set_desc_affinity() would come
through the .set_affinity hook pointers (or happen sufficiently
early).

VT-d's adjust_irq_affinity() has a similar issue.

At boot time alone would be insufficient anyway. Not taking
locks can only be safe prior to bringing up APs; any later
skipping of locking would at least require additional justification.

>> nesting behavior between the locks has to be. Reflect the new
>> expectation by having these functions all take a descriptor as
>> parameter instead of an interrupt number.
>> 
>> Drop one of the two leading underscores from all three functions at
>> the same time.
>> 
>> There's one case left where descriptors get manipulated with just
>> vector_lock held: setup_vector_irq() assumes its caller to acquire
>> vector_lock, and hence can't itself acquire the descriptor locks (wrong
>> lock order). I don't currently see how to address this.
> 
> Can you take the desc lock and vector lock for each irq in the second
> loop of setup_vector_irq and remove the vector locking from the caller?
> 
> That might be inefficient, but it's just done for CPU initialization.
> 
> AFAICT the first loop of setup_vector_irq doesn't require any locking
> since it's per-cpu initialization.

It's not so much the first loop afaict. It's the combined action of
calling this function and setting the online bit which needs the
lock held around it. I.e. the function setting bits in various
descriptors' CPU masks (and the tracking of the vector -> IRQ
relationships) has to be atomic (to the outside world) with the
setting of the CPU's bit in cpu_online_map.
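The lock nesting being established can be modelled outside Xen as follows; plain flags stand in for the spinlocks, and order_ok merely records whether the discipline (descriptor lock outer, vector_lock inner) was kept — a path already holding vector_lock, as setup_vector_irq()'s caller does, could not take a descriptor lock without tripping this check:

```c
/* Illustrative model, not Xen code: descriptor lock is the outer
 * lock, vector_lock the inner one. */
static int desc_held, vector_held, order_ok = 1;

static void lock_desc(void)
{
    if ( vector_held )          /* outer lock taken after inner: inversion */
        order_ok = 0;
    desc_held = 1;
}

static void lock_vector(void)   { vector_held = 1; }
static void unlock_vector(void) { vector_held = 0; }
static void unlock_desc(void)   { desc_held = 0; }

/* The pattern the reworked vector-management functions rely on. */
static int clear_irq_vector_model(void)
{
    lock_desc();
    lock_vector();
    /* ... manipulate the descriptor's vector/mask fields here ... */
    unlock_vector();
    unlock_desc();

    return order_ok;
}
```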

>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Change looks good to me:
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, but I'll not add this for now, given the further locking to
be added as per above.

Jan


* Re: [PATCH 7/9] x86/IRQ: drop redundant cpumask_empty() from move_masked_irq()
@ 2019-05-06 13:39     ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-06 13:39 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Mon, Apr 29, 2019 at 05:26:10AM -0600, Jan Beulich wrote:
> The subsequent cpumask_intersects() covers the "empty" case quite fine.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

* Re: [PATCH 8/9] x86/IRQ: make fixup_irqs() skip unconnected internally used interrupts
@ 2019-05-06 13:52     ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-06 13:52 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Mon, Apr 29, 2019 at 05:26:41AM -0600, Jan Beulich wrote:
> Since the "Cannot set affinity ..." warning is a one time one, avoid
> triggering it already at boot time when parking secondary threads and
> the serial console uses a (still unconnected at that time) PCI IRQ.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -2412,8 +2412,20 @@ void fixup_irqs(const cpumask_t *mask, b
>          vector = irq_to_vector(irq);
>          if ( vector >= FIRST_HIPRIORITY_VECTOR &&
>               vector <= LAST_HIPRIORITY_VECTOR )
> +        {
>              cpumask_and(desc->arch.cpu_mask, desc->arch.cpu_mask, mask);
>  
> +            /*
> +             * This can in particular happen when parking secondary threads
> +             * during boot and when the serial console wants to use a PCI IRQ.
> +             */
> +            if ( desc->handler == &no_irq_type )

I found it weird that an IRQ has a vector assigned (in this case a
high-priority vector) but no irq type set.

Shouldn't the vector be assigned when the type is set?

> +            {
> +                spin_unlock(&desc->lock);
> +                continue;
> +            }
> +        }
> +
>          if ( desc->arch.move_cleanup_count )
>          {
>              /* The cleanup IPI may have got sent while we were still online. */

Thanks, Roger.

* Re: [PATCH 8/9] x86/IRQ: make fixup_irqs() skip unconnected internally used interrupts
@ 2019-05-06 14:25       ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-06 14:25 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 06.05.19 at 15:52, <roger.pau@citrix.com> wrote:
> On Mon, Apr 29, 2019 at 05:26:41AM -0600, Jan Beulich wrote:
>> --- a/xen/arch/x86/irq.c
>> +++ b/xen/arch/x86/irq.c
>> @@ -2412,8 +2412,20 @@ void fixup_irqs(const cpumask_t *mask, b
>>          vector = irq_to_vector(irq);
>>          if ( vector >= FIRST_HIPRIORITY_VECTOR &&
>>               vector <= LAST_HIPRIORITY_VECTOR )
>> +        {
>>              cpumask_and(desc->arch.cpu_mask, desc->arch.cpu_mask, mask);
>>  
>> +            /*
>> +             * This can in particular happen when parking secondary threads
>> +             * during boot and when the serial console wants to use a PCI IRQ.
>> +             */
>> +            if ( desc->handler == &no_irq_type )
> 
> I found it weird that an IRQ has a vector assigned (in this case a
> high-priority vector) but no irq type set.
> 
> Shouldn't the vector be assigned when the type is set?

In general I would agree, but the way the serial console IRQ
gets set up is different - see smp_intr_init(). When it's a PCI
IRQ (IO-APIC pin 16 or above), we'll know how to program
the IO-APIC RTE (edge/level, activity high/low) only when
Dom0 boots, and hence we don't set ->handler early.

Jan



* Re: [PATCH v1b 1/9] x86/IRQ: deal with move-in-progress state in fixup_irqs()
@ 2019-05-06 14:28           ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-06 14:28 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel, Igor Druzhinin

On Mon, May 06, 2019 at 01:15:59AM -0600, Jan Beulich wrote:
> >>> On 03.05.19 at 16:10, <JBeulich@suse.com> wrote:
> >>>> On 03.05.19 at 11:19, <roger.pau@citrix.com> wrote:
> >> On Mon, Apr 29, 2019 at 09:40:14AM -0600, Jan Beulich wrote:
> >>> --- unstable.orig/xen/arch/x86/irq.c	
> >>> +++ unstable/xen/arch/x86/irq.c
> >>> @@ -242,6 +242,20 @@ void destroy_irq(unsigned int irq)
> >>>      xfree(action);
> >>>  }
> >>>  
> >>> +static void release_old_vec(struct irq_desc *desc)
> >>> +{
> >>> +    unsigned int vector = desc->arch.old_vector;
> >>> +
> >>> +    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> >>> +    cpumask_clear(desc->arch.old_cpu_mask);
> >>> +
> >>> +    if ( desc->arch.used_vectors )
> >> 
> >> Wouldn't it be better to clean the bitmap when vector !=
> >> IRQ_VECTOR_UNASSIGNED?
> > 
> > No code path does / should call into here without the need to
> > actually release the previous vector.
> > 
> >> I haven't checked all the callers, but I don't think it's valid to
> >> call release_old_vec with desc->arch.old_vector ==
> >> IRQ_VECTOR_UNASSIGNED, in which case I would add an ASSERT.
> > 
> > Well, yes, I probably could. However, as much as I'm in
> > favor of ASSERT()s, I don't think it makes sense to ASSERT()
> > basically every bit of expected state. In the end there would
> > otherwise be more ASSERT()s than actual code.
> 
> Actually, upon second thought - let me add this, but then in an
> even more strict form: Certain very low and very high numbered
> vectors are illegal here as well, and we may then be able to use
> the same validation helper elsewhere (in particular also for the
> check that you've found to be wrong in _clear_irq_vector()).
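A rough sketch of the stricter validation helper described above: besides IRQ_VECTOR_UNASSIGNED it rejects very low and very high numbered vectors. The boundary macro names mirror Xen's, but the values and the helper's exact shape here are assumptions for illustration, not the eventual v2 code:

```c
/* Illustrative boundary values; Xen's real definitions differ. */
#define IRQ_VECTOR_UNASSIGNED  (-1)
#define FIRST_DYNAMIC_VECTOR   0x21
#define LAST_DYNAMIC_VECTOR    0xef

/* A vector is usable only within the dynamically allocatable range,
 * which also rules out IRQ_VECTOR_UNASSIGNED. */
static int valid_irq_vector(int vector)
{
    return vector >= FIRST_DYNAMIC_VECTOR && vector <= LAST_DYNAMIC_VECTOR;
}
```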

Thanks, that LGTM.

Roger.

* Re: [PATCH 8/9] x86/IRQ: make fixup_irqs() skip unconnected internally used interrupts
@ 2019-05-06 14:37         ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-06 14:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Mon, May 06, 2019 at 08:25:51AM -0600, Jan Beulich wrote:
> >>> On 06.05.19 at 15:52, <roger.pau@citrix.com> wrote:
> > On Mon, Apr 29, 2019 at 05:26:41AM -0600, Jan Beulich wrote:
> >> --- a/xen/arch/x86/irq.c
> >> +++ b/xen/arch/x86/irq.c
> >> @@ -2412,8 +2412,20 @@ void fixup_irqs(const cpumask_t *mask, b
> >>          vector = irq_to_vector(irq);
> >>          if ( vector >= FIRST_HIPRIORITY_VECTOR &&
> >>               vector <= LAST_HIPRIORITY_VECTOR )
> >> +        {
> >>              cpumask_and(desc->arch.cpu_mask, desc->arch.cpu_mask, mask);
> >>  
> >> +            /*
> >> +             * This can in particular happen when parking secondary threads
> >> +             * during boot and when the serial console wants to use a PCI IRQ.
> >> +             */
> >> +            if ( desc->handler == &no_irq_type )
> > 
> > I found it weird that an IRQ has a vector assigned (in this case a
> > high-priority vector) but no irq type set.
> > 
> > Shouldn't the vector be assigned when the type is set?
> 
> In general I would agree, but the way the serial console IRQ
> gets set up is different - see smp_intr_init(). When it's a PCI
> IRQ (IO-APIC pin 16 or above), we'll know how to program
> the IO-APIC RTE (edge/level, active high/low) only when
> Dom0 boots, and hence we don't set ->handler early.

Oh, OK. I guess assuming level-triggered, active-low behaviour unless
dom0 provides a different configuration is not safe.

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.
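As background for the trigger/polarity point above, here is a minimal sketch of the IO-APIC redirection table entry bits involved (field layout per the Intel 82093AA datasheet; the `rte_*` helper names are invented for illustration and are not Xen code):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative sketch only. The low 32 bits of an IO-APIC redirection
 * table entry encode, among other things, the trigger mode and polarity
 * that the hypervisor cannot know for a PCI IRQ until Dom0 reports the
 * configuration:
 *   bit 13 - polarity      (0 = active high, 1 = active low)
 *   bit 15 - trigger mode  (0 = edge,        1 = level)
 *   bit 16 - mask
 */
#define RTE_POLARITY_LOW   (1u << 13)
#define RTE_TRIGGER_LEVEL  (1u << 15)
#define RTE_MASKED         (1u << 16)

static inline bool rte_is_level(uint32_t rte)      { return rte & RTE_TRIGGER_LEVEL; }
static inline bool rte_is_active_low(uint32_t rte) { return rte & RTE_POLARITY_LOW; }

/* Typical PCI interrupt: level triggered, active low, initially masked. */
static inline uint32_t rte_for_pci(uint8_t vector)
{
    return RTE_TRIGGER_LEVEL | RTE_POLARITY_LOW | RTE_MASKED | vector;
}
```

Guessing these two bits wrong for a level-signalled line can lead to lost or screaming interrupts, which is why the entry is left unprogrammed (and `->handler` unset) until Dom0 supplies the real configuration.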

* Re: [PATCH v1b 1/9] x86/IRQ: deal with move-in-progress state in fixup_irqs()
@ 2019-05-06 15:00             ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-06 15:00 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel, Igor Druzhinin

>>> On 06.05.19 at 16:28, <roger.pau@citrix.com> wrote:
> On Mon, May 06, 2019 at 01:15:59AM -0600, Jan Beulich wrote:
>> >>> On 03.05.19 at 16:10, <JBeulich@suse.com> wrote:
>> >>>> On 03.05.19 at 11:19, <roger.pau@citrix.com> wrote:
>> >> On Mon, Apr 29, 2019 at 09:40:14AM -0600, Jan Beulich wrote:
>> >>> --- unstable.orig/xen/arch/x86/irq.c	
>> >>> +++ unstable/xen/arch/x86/irq.c
>> >>> @@ -242,6 +242,20 @@ void destroy_irq(unsigned int irq)
>> >>>      xfree(action);
>> >>>  }
>> >>>  
>> >>> +static void release_old_vec(struct irq_desc *desc)
>> >>> +{
>> >>> +    unsigned int vector = desc->arch.old_vector;
>> >>> +
>> >>> +    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
>> >>> +    cpumask_clear(desc->arch.old_cpu_mask);
>> >>> +
>> >>> +    if ( desc->arch.used_vectors )
>> >> 
>> >> Wouldn't it be better to clean the bitmap when vector !=
>> >> IRQ_VECTOR_UNASSIGNED?
>> > 
>> > No code path does / should call into here without the need to
>> > actually release the previous vector.
>> > 
>> >> I haven't checked all the callers, but I don't think it's valid to
>> >> call release_old_vec with desc->arch.old_vector ==
>> >> IRQ_VECTOR_UNASSIGNED, in which case I would add an ASSERT.
>> > 
>> > Well, yes, I probably could. However, as much as I'm in
>> > favor of ASSERT()s, I don't think it makes sense to ASSERT()
>> > basically every bit of expected state. In the end there would
>> > otherwise be more ASSERT()s than actual code.
>> 
>> Actually, upon second thought - let me add this, but then in an
>> even more strict form: Certain very low and very high numbered
>> vectors are illegal here as well, and we may then be able to use
>> the same validation helper elsewhere (in particular also for the
>> check that you've found to be wrong in _clear_irq_vector()).
> 
> Thanks, that LGTM.

And FTR - it _does_ trigger. I'm still struggling to explain why.
The only place where ->arch.move_in_progress gets set is
in _assign_irq_vector(), and the check I've put there for
debugging purposes doesn't trigger, i.e. the vectors put there
into ->arch.old_vector are valid.

Jan




* Re: [PATCH 2/9] x86/IRQ: deal with move cleanup count state in fixup_irqs()
@ 2019-05-07  7:28       ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-07  7:28 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 03.05.19 at 17:21, <roger.pau@citrix.com> wrote:
> On Mon, Apr 29, 2019 at 05:23:20AM -0600, Jan Beulich wrote:
>> ---
>> TBD: The proper recording of the IPI destinations actually makes the
>>      move_cleanup_count field redundant. Do we want to drop it, at the
>>      price of a few more CPU-mask operations?
> 
> AFAICT this is not a hot path, so I would remove the
> move_cleanup_count field and just weight the cpu bitmap when needed.

FTR: It's not fully redundant - the patch removing it that I had
added was actually the reason for seeing the ASSERT() trigger
that you did ask to add in patch 1. The reason is that the field
serves as a flag for irq_move_cleanup_interrupt() whether to
act on an IRQ in the first place. Without it, the function will
prematurely clean up the vector, thus confusing subsequent
code trying to do the cleanup when it's actually due.

Jan




* Re: [PATCH 2/9] x86/IRQ: deal with move cleanup count state in fixup_irqs()
@ 2019-05-07  8:12         ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-07  8:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Tue, May 07, 2019 at 01:28:36AM -0600, Jan Beulich wrote:
> >>> On 03.05.19 at 17:21, <roger.pau@citrix.com> wrote:
> > On Mon, Apr 29, 2019 at 05:23:20AM -0600, Jan Beulich wrote:
> >> ---
> >> TBD: The proper recording of the IPI destinations actually makes the
> >>      move_cleanup_count field redundant. Do we want to drop it, at the
> >>      price of a few more CPU-mask operations?
> > 
> > AFAICT this is not a hot path, so I would remove the
> > move_cleanup_count field and just weight the cpu bitmap when needed.
> 
> FTR: It's not fully redundant - the patch removing it that I had
> added was actually the reason for seeing the ASSERT() trigger
> that you did ask to add in patch 1. The reason is that the field
> serves as a flag for irq_move_cleanup_interrupt() whether to
> act on an IRQ in the first place. Without it, the function will
> prematurely clean up the vector, thus confusing subsequent
> code trying to do the cleanup when it's actually due.

So weighing desc->arch.old_cpu_mask is not equivalent to
move_cleanup_count?

Thanks, Roger.


* Re: [PATCH 2/9] x86/IRQ: deal with move cleanup count state in fixup_irqs()
@ 2019-05-07  9:28           ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-07  9:28 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 07.05.19 at 10:12, <roger.pau@citrix.com> wrote:
> On Tue, May 07, 2019 at 01:28:36AM -0600, Jan Beulich wrote:
>> >>> On 03.05.19 at 17:21, <roger.pau@citrix.com> wrote:
>> > On Mon, Apr 29, 2019 at 05:23:20AM -0600, Jan Beulich wrote:
>> >> ---
>> >> TBD: The proper recording of the IPI destinations actually makes the
>> >>      move_cleanup_count field redundant. Do we want to drop it, at the
>> >>      price of a few more CPU-mask operations?
>> > 
>> > AFAICT this is not a hot path, so I would remove the
>> > move_cleanup_count field and just weight the cpu bitmap when needed.
>> 
>> FTR: It's not fully redundant - the patch removing it that I had
>> added was actually the reason for seeing the ASSERT() trigger
>> that you did ask to add in patch 1. The reason is that the field
>> serves as a flag for irq_move_cleanup_interrupt() whether to
>> act on an IRQ in the first place. Without it, the function will
>> prematurely clean up the vector, thus confusing subsequent
>> code trying to do the cleanup when it's actually due.
> 
> So weighing desc->arch.old_cpu_mask is not equivalent to
> move_cleanup_count?

Not exactly, no: While the field gets set from the cpumask_weight()
result, it matters _when_ that happens. Prior to that point, what bits
are set in the mask is of no interest to irq_move_cleanup_interrupt().

Jan
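The timing point can be illustrated with a toy model (purely illustrative C, not Xen code; all names here are made up): the counter is snapshotted from the mask only when the cleanup IPIs are actually sent, so a live popcount of the mask would tell the cleanup handler to act too early:

```c
#include <stdbool.h>

/*
 * Toy model: why a live popcount of old_cpu_mask is not equivalent to
 * move_cleanup_count. The mask is populated as soon as a move starts,
 * but the counter - which irq_move_cleanup_interrupt() uses as its
 * "cleanup requested" flag - is only set when the IPIs go out.
 */
struct toy_desc {
    unsigned long old_cpu_mask;      /* CPUs still using the old vector */
    unsigned int move_cleanup_count; /* 0 => cleanup not yet requested */
};

static void start_move(struct toy_desc *d, unsigned long old_mask)
{
    d->old_cpu_mask = old_mask;      /* mask set early ... */
    d->move_cleanup_count = 0;       /* ... but cleanup not armed yet */
}

static void send_cleanup_ipis(struct toy_desc *d)
{
    /* Snapshot: only from here on may the handler act and decrement. */
    d->move_cleanup_count = (unsigned int)__builtin_popcountl(d->old_cpu_mask);
}

/* The check a cleanup handler would perform first. */
static bool cleanup_due(const struct toy_desc *d)
{
    return d->move_cleanup_count != 0;
}
```

In the window between `start_move()` and `send_cleanup_ipis()` the mask is non-empty, yet acting on it would release the vector prematurely - exactly the confusion described above.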




* [PATCH v2 00/12] x86: IRQ management adjustments
@ 2019-05-08 12:59   ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 12:59 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

First and foremost this series is trying to deal with CPU offlining
issues, which have become more prominent with the recently
added SMT enable/disable operation in xen-hptool. Later patches
in the series then carry out more or less unrelated changes
(hopefully improvements) noticed while looking at various pieces
of involved code.

01: deal with move-in-progress state in fixup_irqs()
02: deal with move cleanup count state in fixup_irqs()
03: avoid UB (or worse) in trace_irq_mask()
04: improve dump_irqs()
05: desc->affinity should strictly represent the requested value
06: consolidate use of ->arch.cpu_mask
07: fix locking around vector management
08: correct/tighten vector check in _clear_irq_vector()
09: make fixup_irqs() skip unconnected internally used interrupts
10: reduce unused space in struct arch_irq_desc
11: drop redundant cpumask_empty() from move_masked_irq()
12: simplify and rename pirq_acktype()

In principle patches 1-3, 5-7, and maybe 9 are backporting candidates.
Their intrusive nature makes wanting to do so questionable, though.

I'm omitting the final v1 "x86/IO-APIC: drop an unused variable from
setup_IO_APIC_irqs()" here, as it was acked already and is entirely
independent of this series. For other v2 specific information please
see the individual patches.

Jan




* [PATCH v2 01/12] x86/IRQ: deal with move-in-progress state in fixup_irqs()
@ 2019-05-08 13:03     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 13:03 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

The flag being set may prevent affinity changes, as these often imply
assignment of a new vector. When there's no possible destination left
for the IRQ, the clearing of the flag needs to happen right from
fixup_irqs().

Additionally _assign_irq_vector() needs to avoid setting the flag when
there's no online CPU left in what gets put into ->arch.old_cpu_mask.
The old vector can be released right away in this case.

Also extend the log message about broken affinity to include the new
affinity as well, making it possible to notice affinity changes that
did not actually take place. Swap the if/else-if order there at the
same time to reduce the number of conditions checked.

At the same time replace two open coded instances of the new helper
function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Add/use valid_irq_vector().
v1b: Also update vector_irq[] in the code added to fixup_irqs().

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -99,6 +99,11 @@ void unlock_vector_lock(void)
     spin_unlock(&vector_lock);
 }
 
+static inline bool valid_irq_vector(unsigned int vector)
+{
+    return vector >= FIRST_DYNAMIC_VECTOR && vector <= LAST_HIPRIORITY_VECTOR;
+}
+
 static void trace_irq_mask(u32 event, int irq, int vector, cpumask_t *mask)
 {
     struct {
@@ -242,6 +247,22 @@ void destroy_irq(unsigned int irq)
     xfree(action);
 }
 
+static void release_old_vec(struct irq_desc *desc)
+{
+    unsigned int vector = desc->arch.old_vector;
+
+    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
+    cpumask_clear(desc->arch.old_cpu_mask);
+
+    if ( !valid_irq_vector(vector) )
+        ASSERT_UNREACHABLE();
+    else if ( desc->arch.used_vectors )
+    {
+        ASSERT(test_bit(vector, desc->arch.used_vectors));
+        clear_bit(vector, desc->arch.used_vectors);
+    }
+}
+
 static void __clear_irq_vector(int irq)
 {
     int cpu, vector, old_vector;
@@ -285,14 +306,7 @@ static void __clear_irq_vector(int irq)
         per_cpu(vector_irq, cpu)[old_vector] = ~irq;
     }
 
-    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
-    cpumask_clear(desc->arch.old_cpu_mask);
-
-    if ( desc->arch.used_vectors )
-    {
-        ASSERT(test_bit(old_vector, desc->arch.used_vectors));
-        clear_bit(old_vector, desc->arch.used_vectors);
-    }
+    release_old_vec(desc);
 
     desc->arch.move_in_progress = 0;
 }
@@ -517,12 +531,21 @@ next:
         /* Found one! */
         current_vector = vector;
         current_offset = offset;
-        if (old_vector > 0) {
-            desc->arch.move_in_progress = 1;
-            cpumask_copy(desc->arch.old_cpu_mask, desc->arch.cpu_mask);
+
+        if ( old_vector > 0 )
+        {
+            cpumask_and(desc->arch.old_cpu_mask, desc->arch.cpu_mask,
+                        &cpu_online_map);
             desc->arch.old_vector = desc->arch.vector;
+            if ( !cpumask_empty(desc->arch.old_cpu_mask) )
+                desc->arch.move_in_progress = 1;
+            else
+                /* This can happen while offlining a CPU. */
+                release_old_vec(desc);
         }
+
         trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, &tmp_mask);
+
         for_each_cpu(new_cpu, &tmp_mask)
             per_cpu(vector_irq, new_cpu)[vector] = irq;
         desc->arch.vector = vector;
@@ -691,14 +714,8 @@ void irq_move_cleanup_interrupt(struct c
 
         if ( desc->arch.move_cleanup_count == 0 )
         {
-            desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
-            cpumask_clear(desc->arch.old_cpu_mask);
-
-            if ( desc->arch.used_vectors )
-            {
-                ASSERT(test_bit(vector, desc->arch.used_vectors));
-                clear_bit(vector, desc->arch.used_vectors);
-            }
+            ASSERT(vector == desc->arch.old_vector);
+            release_old_vec(desc);
         }
 unlock:
         spin_unlock(&desc->lock);
@@ -2391,6 +2408,33 @@ void fixup_irqs(const cpumask_t *mask, b
             continue;
         }
 
+        /*
+         * In order for the affinity adjustment below to be successful, we
+         * need __assign_irq_vector() to succeed. This in particular means
+         * clearing desc->arch.move_in_progress if this would otherwise
+         * prevent the function from succeeding. Since there's no way for the
+         * flag to get cleared anymore when there's no possible destination
+         * left (the only possibility then would be the IRQs enabled window
+         * after this loop), there's then also no race with us doing it here.
+         *
+         * Therefore the logic here and there need to remain in sync.
+         */
+        if ( desc->arch.move_in_progress &&
+             !cpumask_intersects(mask, desc->arch.cpu_mask) )
+        {
+            unsigned int cpu;
+
+            cpumask_and(&affinity, desc->arch.old_cpu_mask, &cpu_online_map);
+
+            spin_lock(&vector_lock);
+            for_each_cpu(cpu, &affinity)
+                per_cpu(vector_irq, cpu)[desc->arch.old_vector] = ~irq;
+            spin_unlock(&vector_lock);
+
+            release_old_vec(desc);
+            desc->arch.move_in_progress = 0;
+        }
+
         cpumask_and(&affinity, &affinity, mask);
         if ( cpumask_empty(&affinity) )
         {
@@ -2409,15 +2453,18 @@ void fixup_irqs(const cpumask_t *mask, b
         if ( desc->handler->enable )
             desc->handler->enable(desc);
 
+        cpumask_copy(&affinity, desc->affinity);
+
         spin_unlock(&desc->lock);
 
         if ( !verbose )
             continue;
 
-        if ( break_affinity && set_affinity )
-            printk("Broke affinity for irq %i\n", irq);
-        else if ( !set_affinity )
-            printk("Cannot set affinity for irq %i\n", irq);
+        if ( !set_affinity )
+            printk("Cannot set affinity for IRQ%u\n", irq);
+        else if ( break_affinity )
+            printk("Broke affinity for IRQ%u, new: %*pb\n",
+                   irq, nr_cpu_ids, &affinity);
     }
 
     /* That doesn't seem sufficient.  Give it 1ms. */
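As a standalone illustration of the range check introduced by valid_irq_vector() in the patch, the following compiles outside the hypervisor (the two vector bounds are example placeholders, not Xen's actual values; only the shape of the check matters):

```c
#include <stdbool.h>

/*
 * Example placeholder bounds - Xen's real FIRST_DYNAMIC_VECTOR and
 * LAST_HIPRIORITY_VECTOR differ. The point is that very low vectors
 * (exceptions, legacy range) and very high ones are never legitimate
 * targets for release_old_vec().
 */
#define EXAMPLE_FIRST_DYNAMIC_VECTOR   0x20u
#define EXAMPLE_LAST_HIPRIORITY_VECTOR 0xf7u

static bool valid_irq_vector(unsigned int vector)
{
    return vector >= EXAMPLE_FIRST_DYNAMIC_VECTOR &&
           vector <= EXAMPLE_LAST_HIPRIORITY_VECTOR;
}
```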




         if ( cpumask_empty(&affinity) )
         {
@@ -2409,15 +2453,18 @@ void fixup_irqs(const cpumask_t *mask, b
         if ( desc->handler->enable )
             desc->handler->enable(desc);
 
+        cpumask_copy(&affinity, desc->affinity);
+
         spin_unlock(&desc->lock);
 
         if ( !verbose )
             continue;
 
-        if ( break_affinity && set_affinity )
-            printk("Broke affinity for irq %i\n", irq);
-        else if ( !set_affinity )
-            printk("Cannot set affinity for irq %i\n", irq);
+        if ( !set_affinity )
+            printk("Cannot set affinity for IRQ%u\n", irq);
+        else if ( break_affinity )
+            printk("Broke affinity for IRQ%u, new: %*pb\n",
+                   irq, nr_cpu_ids, &affinity);
     }
 
     /* That doesn't seem sufficient.  Give it 1ms. */




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH v2 02/12] x86/IRQ: deal with move cleanup count state in fixup_irqs()
@ 2019-05-08 13:03     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 13:03 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

The cleanup IPI may get sent immediately before a CPU gets removed from
the online map. In such a case the IPI would get handled on the CPU
being offlined no earlier than in the interrupts-disabled window after
fixup_irqs()' main loop. This is too late, however, because a possible
affinity change may incur the need for vector assignment, which will
fail when the IRQ's move cleanup count is still non-zero.

To fix this:
- record the set of CPUs the cleanup IPI actually gets sent to, alongside
  setting their count,
- adjust the count in fixup_irqs(), accounting for all CPUs that the
  cleanup IPI was sent to but that are no longer online,
- bail early from the cleanup IPI handler when the CPU is no longer
  online, to prevent double accounting.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -665,6 +665,9 @@ void irq_move_cleanup_interrupt(struct c
     ack_APIC_irq();
 
     me = smp_processor_id();
+    if ( !cpu_online(me) )
+        return;
+
     for ( vector = FIRST_DYNAMIC_VECTOR;
           vector <= LAST_HIPRIORITY_VECTOR; vector++)
     {
@@ -724,11 +727,14 @@ unlock:
 
 static void send_cleanup_vector(struct irq_desc *desc)
 {
-    cpumask_t cleanup_mask;
+    cpumask_and(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
+                &cpu_online_map);
+    desc->arch.move_cleanup_count = cpumask_weight(desc->arch.old_cpu_mask);
 
-    cpumask_and(&cleanup_mask, desc->arch.old_cpu_mask, &cpu_online_map);
-    desc->arch.move_cleanup_count = cpumask_weight(&cleanup_mask);
-    send_IPI_mask(&cleanup_mask, IRQ_MOVE_CLEANUP_VECTOR);
+    if ( desc->arch.move_cleanup_count )
+        send_IPI_mask(desc->arch.old_cpu_mask, IRQ_MOVE_CLEANUP_VECTOR);
+    else
+        release_old_vec(desc);
 
     desc->arch.move_in_progress = 0;
 }
@@ -2401,6 +2407,16 @@ void fixup_irqs(const cpumask_t *mask, b
              vector <= LAST_HIPRIORITY_VECTOR )
             cpumask_and(desc->arch.cpu_mask, desc->arch.cpu_mask, mask);
 
+        if ( desc->arch.move_cleanup_count )
+        {
+            /* The cleanup IPI may have got sent while we were still online. */
+            cpumask_andnot(&affinity, desc->arch.old_cpu_mask,
+                           &cpu_online_map);
+            desc->arch.move_cleanup_count -= cpumask_weight(&affinity);
+            if ( !desc->arch.move_cleanup_count )
+                release_old_vec(desc);
+        }
+
         cpumask_copy(&affinity, desc->affinity);
         if ( !desc->action || cpumask_subset(&affinity, mask) )
         {




* [PATCH v2 03/12] x86/IRQ: avoid UB (or worse) in trace_irq_mask()
@ 2019-05-08 13:07     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 13:07 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Roger Pau Monne

Dynamically allocated CPU mask objects may be smaller than cpumask_t, so
copying has to be restricted to the actual allocation size. This is
particularly important since the function doesn't bail early when tracing
is not active, so even production builds would be affected by potential
misbehavior here.

Take the opportunity and also
- use initializers instead of assignment + memset(),
- constify the cpumask_t input pointer,
- u32 -> uint32_t.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.
---
TBD: I wonder whether the function shouldn't gain an early tb_init_done
     check, like many other trace_*() have.

George, despite your general request to be copied on entire series
rather than individual patches, I thought it would be better to copy
you on just this one (for its tracing aspect), as the patch here is
independent of the rest of the series, but at least one later patch
depends on the parameter constification done here.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -104,16 +104,19 @@ static inline bool valid_irq_vector(unsi
     return vector >= FIRST_DYNAMIC_VECTOR && vector <= LAST_HIPRIORITY_VECTOR;
 }
 
-static void trace_irq_mask(u32 event, int irq, int vector, cpumask_t *mask)
+static void trace_irq_mask(uint32_t event, int irq, int vector,
+                           const cpumask_t *mask)
 {
     struct {
         unsigned int irq:16, vec:16;
         unsigned int mask[6];
-    } d;
-    d.irq = irq;
-    d.vec = vector;
-    memset(d.mask, 0, sizeof(d.mask));
-    memcpy(d.mask, mask, min(sizeof(d.mask), sizeof(cpumask_t)));
+    } d = {
+       .irq = irq,
+       .vec = vector,
+    };
+
+    memcpy(d.mask, mask,
+           min(sizeof(d.mask), BITS_TO_LONGS(nr_cpu_ids) * sizeof(long)));
     trace_var(event, 1, sizeof(d), &d);
 }
 





* [PATCH v2 04/12] x86/IRQ: improve dump_irqs()
@ 2019-05-08 13:08     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 13:08 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Don't log a stray trailing comma. Shorten a few fields.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2328,7 +2328,7 @@ static void dump_irqs(unsigned char key)
 
         spin_lock_irqsave(&desc->lock, flags);
 
-        printk("   IRQ:%4d affinity:%*pb vec:%02x type=%-15s status=%08x ",
+        printk("   IRQ:%4d aff:%*pb vec:%02x %-15s status=%03x ",
                irq, nr_cpu_ids, cpumask_bits(desc->affinity), desc->arch.vector,
                desc->handler->typename, desc->status);
 
@@ -2339,23 +2339,21 @@ static void dump_irqs(unsigned char key)
         {
             action = (irq_guest_action_t *)desc->action;
 
-            printk("in-flight=%d domain-list=", action->in_flight);
+            printk("in-flight=%d%c",
+                   action->in_flight, action->nr_guests ? ' ' : '\n');
 
-            for ( i = 0; i < action->nr_guests; i++ )
+            for ( i = 0; i < action->nr_guests; )
             {
-                d = action->guest[i];
+                d = action->guest[i++];
                 pirq = domain_irq_to_pirq(d, irq);
                 info = pirq_info(d, pirq);
-                printk("%u:%3d(%c%c%c)",
+                printk("d%d:%3d(%c%c%c)%c",
                        d->domain_id, pirq,
                        evtchn_port_is_pending(d, info->evtchn) ? 'P' : '-',
                        evtchn_port_is_masked(d, info->evtchn) ? 'M' : '-',
-                       (info->masked ? 'M' : '-'));
-                if ( i != action->nr_guests )
-                    printk(",");
+                       info->masked ? 'M' : '-',
+                       i < action->nr_guests ? ',' : '\n');
             }
-
-            printk("\n");
         }
         else if ( desc->action )
             printk("%ps()\n", desc->action->handler);




* [PATCH v2 05/12] x86/IRQ: desc->affinity should strictly represent the requested value
@ 2019-05-08 13:09     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 13:09 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

desc->arch.cpu_mask reflects the actual set of target CPUs. Don't ever
fiddle with desc->affinity itself, except to store caller requested
values. Note that assign_irq_vector() now takes a NULL incoming CPU mask
to mean "all CPUs", rather than just "all currently online CPUs".
This way no further affinity adjustment is needed after onlining further
CPUs.

This renders both set_native_irq_info() uses (which weren't using proper
locking anyway) redundant - drop the function altogether.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1042,7 +1042,6 @@ static void __init setup_IO_APIC_irqs(vo
             SET_DEST(entry, logical, cpu_mask_to_apicid(TARGET_CPUS));
             spin_lock_irqsave(&ioapic_lock, flags);
             __ioapic_write_entry(apic, pin, 0, entry);
-            set_native_irq_info(irq, TARGET_CPUS);
             spin_unlock_irqrestore(&ioapic_lock, flags);
         }
     }
@@ -2251,7 +2250,6 @@ int io_apic_set_pci_routing (int ioapic,
 
     spin_lock_irqsave(&ioapic_lock, flags);
     __ioapic_write_entry(ioapic, pin, 0, entry);
-    set_native_irq_info(irq, TARGET_CPUS);
     spin_unlock(&ioapic_lock);
 
     spin_lock(&desc->lock);
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -582,11 +582,16 @@ int assign_irq_vector(int irq, const cpu
 
     spin_lock_irqsave(&vector_lock, flags);
     ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS);
-    if (!ret) {
+    if ( !ret )
+    {
         ret = desc->arch.vector;
-        cpumask_copy(desc->affinity, desc->arch.cpu_mask);
+        if ( mask )
+            cpumask_copy(desc->affinity, mask);
+        else
+            cpumask_setall(desc->affinity);
     }
     spin_unlock_irqrestore(&vector_lock, flags);
+
     return ret;
 }
 
@@ -2328,9 +2333,10 @@ static void dump_irqs(unsigned char key)
 
         spin_lock_irqsave(&desc->lock, flags);
 
-        printk("   IRQ:%4d aff:%*pb vec:%02x %-15s status=%03x ",
-               irq, nr_cpu_ids, cpumask_bits(desc->affinity), desc->arch.vector,
-               desc->handler->typename, desc->status);
+        printk("   IRQ:%4d aff:%*pb/%*pb vec:%02x %-15s status=%03x ",
+               irq, nr_cpu_ids, cpumask_bits(desc->affinity),
+               nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
+               desc->arch.vector, desc->handler->typename, desc->status);
 
         if ( ssid )
             printk("Z=%-25s ", ssid);
@@ -2418,8 +2424,7 @@ void fixup_irqs(const cpumask_t *mask, b
                 release_old_vec(desc);
         }
 
-        cpumask_copy(&affinity, desc->affinity);
-        if ( !desc->action || cpumask_subset(&affinity, mask) )
+        if ( !desc->action || cpumask_subset(desc->affinity, mask) )
         {
             spin_unlock(&desc->lock);
             continue;
@@ -2452,12 +2457,13 @@ void fixup_irqs(const cpumask_t *mask, b
             desc->arch.move_in_progress = 0;
         }
 
-        cpumask_and(&affinity, &affinity, mask);
-        if ( cpumask_empty(&affinity) )
+        if ( !cpumask_intersects(mask, desc->affinity) )
         {
             break_affinity = true;
-            cpumask_copy(&affinity, mask);
+            cpumask_setall(&affinity);
         }
+        else
+            cpumask_copy(&affinity, desc->affinity);
 
         if ( desc->handler->disable )
             desc->handler->disable(desc);
--- a/xen/include/xen/irq.h
+++ b/xen/include/xen/irq.h
@@ -162,11 +162,6 @@ extern irq_desc_t *domain_spin_lock_irq_
 extern irq_desc_t *pirq_spin_lock_irq_desc(
     const struct pirq *, unsigned long *pflags);
 
-static inline void set_native_irq_info(unsigned int irq, const cpumask_t *mask)
-{
-    cpumask_copy(irq_to_desc(irq)->affinity, mask);
-}
-
 unsigned int set_desc_affinity(struct irq_desc *, const cpumask_t *);
 
 #ifndef arch_hwdom_irqs




* [PATCH v2 06/12] x86/IRQ: consolidate use of ->arch.cpu_mask
@ 2019-05-08 13:10     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Mixed meanings were implied so far by different pieces of code - the
disagreement was in particular about whether offline CPUs' bits may
possibly be set. Switch to a mostly consistent meaning (the exception
being high-priority interrupts, which would perhaps better be switched
to the same model in due course). Use the field to
record the vector allocation mask, i.e. potentially including bits of
offline (parked) CPUs. This implies that before passing the mask to
certain functions (most notably cpu_mask_to_apicid()) it needs to be
further reduced to the online subset.

The exception of high priority interrupts is also why for the moment
_bind_irq_vector() is left as is, despite looking wrong: It's used
exclusively for IRQ0, which isn't supposed to move off CPU0 at any time.

The prior lack of restricting to online CPUs in set_desc_affinity()
before calling cpu_mask_to_apicid() in particular allowed (in x2APIC
clustered mode) offlined CPUs to end up enabled in an IRQ's destination
field. (I wonder whether vector_allocation_cpumask_flat() shouldn't
follow a similar model, using cpu_present_map in favor of
cpu_online_map.)

For IO-APIC code it was definitely wrong to potentially store, as a
fallback, TARGET_CPUS (i.e. all online ones) into the field, as that
would have caused problems when determining on which CPUs to release
vectors when they've gone out of use. Disable interrupts instead when
no valid target CPU can be established (which code elsewhere should
guarantee to never happen), and log a message in such an unlikely event.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.

--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -680,7 +680,7 @@ void /*__init*/ setup_ioapic_dest(void)
                 continue;
             irq = pin_2_irq(irq_entry, ioapic, pin);
             desc = irq_to_desc(irq);
-            BUG_ON(cpumask_empty(desc->arch.cpu_mask));
+            BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map));
             set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
         }
 
@@ -2197,7 +2197,6 @@ int io_apic_set_pci_routing (int ioapic,
 {
     struct irq_desc *desc = irq_to_desc(irq);
     struct IO_APIC_route_entry entry;
-    cpumask_t mask;
     unsigned long flags;
     int vector;
 
@@ -2232,11 +2231,17 @@ int io_apic_set_pci_routing (int ioapic,
         return vector;
     entry.vector = vector;
 
-    cpumask_copy(&mask, TARGET_CPUS);
-    /* Don't chance ending up with an empty mask. */
-    if (cpumask_intersects(&mask, desc->arch.cpu_mask))
-        cpumask_and(&mask, &mask, desc->arch.cpu_mask);
-    SET_DEST(entry, logical, cpu_mask_to_apicid(&mask));
+    if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) {
+        cpumask_t *mask = this_cpu(scratch_cpumask);
+
+        cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
+        SET_DEST(entry, logical, cpu_mask_to_apicid(mask));
+    } else {
+        printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n",
+               irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
+               nr_cpu_ids, cpumask_bits(TARGET_CPUS));
+        desc->status |= IRQ_DISABLED;
+    }
 
     apic_printk(APIC_DEBUG, KERN_DEBUG "IOAPIC[%d]: Set PCI routing entry "
 		"(%d-%d -> %#x -> IRQ %d Mode:%i Active:%i)\n", ioapic,
@@ -2422,7 +2427,21 @@ int ioapic_guest_write(unsigned long phy
     /* Set the vector field to the real vector! */
     rte.vector = desc->arch.vector;
 
-    SET_DEST(rte, logical, cpu_mask_to_apicid(desc->arch.cpu_mask));
+    if ( cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS) )
+    {
+        cpumask_t *mask = this_cpu(scratch_cpumask);
+
+        cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
+        SET_DEST(rte, logical, cpu_mask_to_apicid(mask));
+    }
+    else
+    {
+        gprintk(XENLOG_ERR, "IRQ%d: no target CPU (%*pb vs %*pb)\n",
+               irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
+               nr_cpu_ids, cpumask_bits(TARGET_CPUS));
+        desc->status |= IRQ_DISABLED;
+        rte.mask = 1;
+    }
 
     __ioapic_write_entry(apic, pin, 0, rte);
     
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -471,11 +471,13 @@ static int __assign_irq_vector(
      */
     static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0;
     int cpu, err, old_vector;
-    cpumask_t tmp_mask;
     vmask_t *irq_used_vectors = NULL;
 
     old_vector = irq_to_vector(irq);
-    if (old_vector > 0) {
+    if ( old_vector > 0 )
+    {
+        cpumask_t tmp_mask;
+
         cpumask_and(&tmp_mask, mask, &cpu_online_map);
         if (cpumask_intersects(&tmp_mask, desc->arch.cpu_mask)) {
             desc->arch.vector = old_vector;
@@ -498,7 +500,9 @@ static int __assign_irq_vector(
     else
         irq_used_vectors = irq_get_used_vector_mask(irq);
 
-    for_each_cpu(cpu, mask) {
+    for_each_cpu(cpu, mask)
+    {
+        const cpumask_t *vec_mask;
         int new_cpu;
         int vector, offset;
 
@@ -506,8 +510,7 @@ static int __assign_irq_vector(
         if (!cpu_online(cpu))
             continue;
 
-        cpumask_and(&tmp_mask, vector_allocation_cpumask(cpu),
-                    &cpu_online_map);
+        vec_mask = vector_allocation_cpumask(cpu);
 
         vector = current_vector;
         offset = current_offset;
@@ -528,7 +531,7 @@ next:
             && test_bit(vector, irq_used_vectors) )
             goto next;
 
-        for_each_cpu(new_cpu, &tmp_mask)
+        for_each_cpu(new_cpu, vec_mask)
             if (per_cpu(vector_irq, new_cpu)[vector] >= 0)
                 goto next;
         /* Found one! */
@@ -547,12 +550,12 @@ next:
                 release_old_vec(desc);
         }
 
-        trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, &tmp_mask);
+        trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, vec_mask);
 
-        for_each_cpu(new_cpu, &tmp_mask)
+        for_each_cpu(new_cpu, vec_mask)
             per_cpu(vector_irq, new_cpu)[vector] = irq;
         desc->arch.vector = vector;
-        cpumask_copy(desc->arch.cpu_mask, &tmp_mask);
+        cpumask_copy(desc->arch.cpu_mask, vec_mask);
 
         desc->arch.used = IRQ_USED;
         ASSERT((desc->arch.used_vectors == NULL)
@@ -783,6 +786,7 @@ unsigned int set_desc_affinity(struct ir
 
     cpumask_copy(desc->affinity, mask);
     cpumask_and(&dest_mask, mask, desc->arch.cpu_mask);
+    cpumask_and(&dest_mask, &dest_mask, &cpu_online_map);
 
     return cpu_mask_to_apicid(&dest_mask);
 }
--- a/xen/include/asm-x86/irq.h
+++ b/xen/include/asm-x86/irq.h
@@ -32,6 +32,12 @@ struct irq_desc;
 struct arch_irq_desc {
         s16 vector;                  /* vector itself is only 8 bits, */
         s16 old_vector;              /* but we use -1 for unassigned  */
+        /*
+         * Except for high priority interrupts @cpu_mask may have bits set for
+         * offline CPUs.  Consumers need to be careful to mask this down to
+         * online ones as necessary.  There is supposed to always be a non-
+         * empty intersection with cpu_online_map.
+         */
         cpumask_var_t cpu_mask;
         cpumask_var_t old_cpu_mask;
         cpumask_var_t pending_mask;






* [PATCH v2 07/12] x86/IRQ: fix locking around vector management
@ 2019-05-08 13:10     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
fields, and hence ought to be called with the descriptor lock held in
addition to vector_lock. This is currently the case for only
set_desc_affinity() (in the common case) and destroy_irq(), which also
clarifies what the nesting behavior between the locks has to be.
Reflect the new expectation by having these functions all take a
descriptor as parameter instead of an interrupt number.

Also take care of the two special cases of calls to set_desc_affinity():
set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() also get
called directly, and in these cases the descriptor locks hadn't been
acquired until now. For set_ioapic_affinity_irq() this means the IO-APIC
lock can then be acquired / released with plain spin_{,un}lock().

Drop one of the two leading underscores from all three functions at
the same time.

There's one case left where descriptors get manipulated with just
vector_lock held: setup_vector_irq() assumes its caller to acquire
vector_lock, and hence can't itself acquire the descriptor locks (wrong
lock order). I don't currently see how to address this.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Also adjust set_ioapic_affinity_irq() and VT-d's
    dma_msi_set_affinity().

--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -550,14 +550,14 @@ static void clear_IO_APIC (void)
 static void
 set_ioapic_affinity_irq(struct irq_desc *desc, const cpumask_t *mask)
 {
-    unsigned long flags;
     unsigned int dest;
     int pin, irq;
     struct irq_pin_list *entry;
 
     irq = desc->irq;
 
-    spin_lock_irqsave(&ioapic_lock, flags);
+    spin_lock(&ioapic_lock);
+
     dest = set_desc_affinity(desc, mask);
     if (dest != BAD_APICID) {
         if ( !x2apic_enabled )
@@ -580,8 +580,8 @@ set_ioapic_affinity_irq(struct irq_desc
             entry = irq_2_pin + entry->next;
         }
     }
-    spin_unlock_irqrestore(&ioapic_lock, flags);
 
+    spin_unlock(&ioapic_lock);
 }
 
 /*
@@ -674,16 +674,19 @@ void /*__init*/ setup_ioapic_dest(void)
     for (ioapic = 0; ioapic < nr_ioapics; ioapic++) {
         for (pin = 0; pin < nr_ioapic_entries[ioapic]; pin++) {
             struct irq_desc *desc;
+            unsigned long flags;
 
             irq_entry = find_irq_entry(ioapic, pin, mp_INT);
             if (irq_entry == -1)
                 continue;
             irq = pin_2_irq(irq_entry, ioapic, pin);
             desc = irq_to_desc(irq);
+
+            spin_lock_irqsave(&desc->lock, flags);
             BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map));
             set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
+            spin_unlock_irqrestore(&desc->lock, flags);
         }
-
     }
 }
 
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -27,6 +27,7 @@
 #include <public/physdev.h>
 
 static int parse_irq_vector_map_param(const char *s);
+static void _clear_irq_vector(struct irq_desc *desc);
 
 /* opt_noirqbalance: If true, software IRQ balancing/affinity is disabled. */
 bool __read_mostly opt_noirqbalance;
@@ -120,13 +121,12 @@ static void trace_irq_mask(uint32_t even
     trace_var(event, 1, sizeof(d), &d);
 }
 
-static int __init __bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
+static int __init _bind_irq_vector(struct irq_desc *desc, int vector,
+                                   const cpumask_t *cpu_mask)
 {
     cpumask_t online_mask;
     int cpu;
-    struct irq_desc *desc = irq_to_desc(irq);
 
-    BUG_ON((unsigned)irq >= nr_irqs);
     BUG_ON((unsigned)vector >= NR_VECTORS);
 
     cpumask_and(&online_mask, cpu_mask, &cpu_online_map);
@@ -137,9 +137,9 @@ static int __init __bind_irq_vector(int
         return 0;
     if ( desc->arch.vector != IRQ_VECTOR_UNASSIGNED )
         return -EBUSY;
-    trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, irq, vector, &online_mask);
+    trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, desc->irq, vector, &online_mask);
     for_each_cpu(cpu, &online_mask)
-        per_cpu(vector_irq, cpu)[vector] = irq;
+        per_cpu(vector_irq, cpu)[vector] = desc->irq;
     desc->arch.vector = vector;
     cpumask_copy(desc->arch.cpu_mask, &online_mask);
     if ( desc->arch.used_vectors )
@@ -153,12 +153,18 @@ static int __init __bind_irq_vector(int
 
 int __init bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
 {
+    struct irq_desc *desc = irq_to_desc(irq);
     unsigned long flags;
     int ret;
 
-    spin_lock_irqsave(&vector_lock, flags);
-    ret = __bind_irq_vector(irq, vector, cpu_mask);
-    spin_unlock_irqrestore(&vector_lock, flags);
+    BUG_ON((unsigned)irq >= nr_irqs);
+
+    spin_lock_irqsave(&desc->lock, flags);
+    spin_lock(&vector_lock);
+    ret = _bind_irq_vector(desc, vector, cpu_mask);
+    spin_unlock(&vector_lock);
+    spin_unlock_irqrestore(&desc->lock, flags);
+
     return ret;
 }
 
@@ -243,7 +249,9 @@ void destroy_irq(unsigned int irq)
 
     spin_lock_irqsave(&desc->lock, flags);
     desc->handler = &no_irq_type;
-    clear_irq_vector(irq);
+    spin_lock(&vector_lock);
+    _clear_irq_vector(desc);
+    spin_unlock(&vector_lock);
     desc->arch.used_vectors = NULL;
     spin_unlock_irqrestore(&desc->lock, flags);
 
@@ -266,11 +274,11 @@ static void release_old_vec(struct irq_d
     }
 }
 
-static void __clear_irq_vector(int irq)
+static void _clear_irq_vector(struct irq_desc *desc)
 {
-    int cpu, vector, old_vector;
+    unsigned int cpu;
+    int vector, old_vector, irq = desc->irq;
     cpumask_t tmp_mask;
-    struct irq_desc *desc = irq_to_desc(irq);
 
     BUG_ON(!desc->arch.vector);
 
@@ -316,11 +324,14 @@ static void __clear_irq_vector(int irq)
 
 void clear_irq_vector(int irq)
 {
+    struct irq_desc *desc = irq_to_desc(irq);
     unsigned long flags;
 
-    spin_lock_irqsave(&vector_lock, flags);
-    __clear_irq_vector(irq);
-    spin_unlock_irqrestore(&vector_lock, flags);
+    spin_lock_irqsave(&desc->lock, flags);
+    spin_lock(&vector_lock);
+    _clear_irq_vector(desc);
+    spin_unlock(&vector_lock);
+    spin_unlock_irqrestore(&desc->lock, flags);
 }
 
 int irq_to_vector(int irq)
@@ -455,8 +466,7 @@ static vmask_t *irq_get_used_vector_mask
     return ret;
 }
 
-static int __assign_irq_vector(
-    int irq, struct irq_desc *desc, const cpumask_t *mask)
+static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
 {
     /*
      * NOTE! The local APIC isn't very good at handling
@@ -470,7 +480,8 @@ static int __assign_irq_vector(
      * 0x80, because int 0x80 is hm, kind of importantish. ;)
      */
     static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0;
-    int cpu, err, old_vector;
+    unsigned int cpu;
+    int err, old_vector, irq = desc->irq;
     vmask_t *irq_used_vectors = NULL;
 
     old_vector = irq_to_vector(irq);
@@ -583,8 +594,12 @@ int assign_irq_vector(int irq, const cpu
     
     BUG_ON(irq >= nr_irqs || irq <0);
 
-    spin_lock_irqsave(&vector_lock, flags);
-    ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS);
+    spin_lock_irqsave(&desc->lock, flags);
+
+    spin_lock(&vector_lock);
+    ret = _assign_irq_vector(desc, mask ?: TARGET_CPUS);
+    spin_unlock(&vector_lock);
+
     if ( !ret )
     {
         ret = desc->arch.vector;
@@ -593,7 +608,8 @@ int assign_irq_vector(int irq, const cpu
         else
             cpumask_setall(desc->affinity);
     }
-    spin_unlock_irqrestore(&vector_lock, flags);
+
+    spin_unlock_irqrestore(&desc->lock, flags);
 
     return ret;
 }
@@ -767,7 +783,6 @@ void irq_complete_move(struct irq_desc *
 
 unsigned int set_desc_affinity(struct irq_desc *desc, const cpumask_t *mask)
 {
-    unsigned int irq;
     int ret;
     unsigned long flags;
     cpumask_t dest_mask;
@@ -775,10 +790,8 @@ unsigned int set_desc_affinity(struct ir
     if (!cpumask_intersects(mask, &cpu_online_map))
         return BAD_APICID;
 
-    irq = desc->irq;
-
     spin_lock_irqsave(&vector_lock, flags);
-    ret = __assign_irq_vector(irq, desc, mask);
+    ret = _assign_irq_vector(desc, mask);
     spin_unlock_irqrestore(&vector_lock, flags);
 
     if (ret < 0)
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
     unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
                              : NUMA_NO_NODE;
     const cpumask_t *cpumask = &cpu_online_map;
+    struct irq_desc *desc;
 
     if ( node < MAX_NUMNODES && node_online(node) &&
          cpumask_intersects(&node_to_cpumask(node), cpumask) )
         cpumask = &node_to_cpumask(node);
-    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
+
+    desc = irq_to_desc(drhd->iommu->msi.irq);
+    spin_lock_irq(&desc->lock);
+    dma_msi_set_affinity(desc, cpumask);
+    spin_unlock_irq(&desc->lock);
 }
 
 static int adjust_vtd_irq_affinities(void)






* [PATCH v2 08/12] x86/IRQs: correct/tighten vector check in _clear_irq_vector()
@ 2019-05-08 13:11     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 13:11 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

If any particular value were to be checked against, it would need to be
IRQ_VECTOR_UNASSIGNED.

Reported-by: Roger Pau Monné <roger.pau@citrix.com>

Be more strict though and use valid_irq_vector() instead.

Take the opportunity and also convert local variables to unsigned int.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -276,14 +276,13 @@ static void release_old_vec(struct irq_d
 
 static void _clear_irq_vector(struct irq_desc *desc)
 {
-    unsigned int cpu;
-    int vector, old_vector, irq = desc->irq;
+    unsigned int cpu, old_vector, irq = desc->irq;
+    unsigned int vector = desc->arch.vector;
     cpumask_t tmp_mask;
 
-    BUG_ON(!desc->arch.vector);
+    BUG_ON(!valid_irq_vector(vector));
 
     /* Always clear desc->arch.vector */
-    vector = desc->arch.vector;
     cpumask_and(&tmp_mask, desc->arch.cpu_mask, &cpu_online_map);
 
     for_each_cpu(cpu, &tmp_mask) {






* [PATCH v2 09/12] x86/IRQ: make fixup_irqs() skip unconnected internally used interrupts
@ 2019-05-08 13:12     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 13:12 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Since the "Cannot set affinity ..." warning is issued only once, avoid
triggering it already at boot time, when secondary threads are being
parked while the serial console uses a (still unconnected at that time)
PCI IRQ.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2428,8 +2428,20 @@ void fixup_irqs(const cpumask_t *mask, b
         vector = irq_to_vector(irq);
         if ( vector >= FIRST_HIPRIORITY_VECTOR &&
              vector <= LAST_HIPRIORITY_VECTOR )
+        {
             cpumask_and(desc->arch.cpu_mask, desc->arch.cpu_mask, mask);
 
+            /*
+             * This can in particular happen when parking secondary threads
+             * during boot and when the serial console wants to use a PCI IRQ.
+             */
+            if ( desc->handler == &no_irq_type )
+            {
+                spin_unlock(&desc->lock);
+                continue;
+            }
+        }
+
         if ( desc->arch.move_cleanup_count )
         {
             /* The cleanup IPI may have got sent while we were still online. */




* [PATCH v2 10/12] x86/IRQ: reduce unused space in struct arch_irq_desc
@ 2019-05-08 13:13     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 13:13 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

--- a/xen/include/asm-x86/irq.h
+++ b/xen/include/asm-x86/irq.h
@@ -41,8 +41,8 @@ struct arch_irq_desc {
         cpumask_var_t cpu_mask;
         cpumask_var_t old_cpu_mask;
         cpumask_var_t pending_mask;
-        unsigned move_cleanup_count;
         vmask_t *used_vectors;
+        unsigned move_cleanup_count;
         u8 move_in_progress : 1;
         s8 used;
 };




* [PATCH v2 11/12] x86/IRQ: drop redundant cpumask_empty() from move_masked_irq()
@ 2019-05-08 13:13     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 13:13 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

The subsequent cpumask_intersects() covers the "empty" case quite fine.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -650,9 +650,6 @@ void move_masked_irq(struct irq_desc *de
     
     desc->status &= ~IRQ_MOVE_PENDING;
 
-    if (unlikely(cpumask_empty(pending_mask)))
-        return;
-
     if (!desc->handler->set_affinity)
         return;
 



* [PATCH v2 12/12] x86/IRQ: simplify and rename pirq_acktype()
@ 2019-05-08 13:14     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 13:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Its only caller already has the IRQ descriptor in its hands, so there's
no need for the function to re-obtain it. As a result, the leading "p"
of its name is no longer appropriate and hence gets dropped.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1550,17 +1550,8 @@ int pirq_guest_unmask(struct domain *d)
     return 0;
 }
 
-static int pirq_acktype(struct domain *d, int pirq)
+static int irq_acktype(const struct irq_desc *desc)
 {
-    struct irq_desc  *desc;
-    int irq;
-
-    irq = domain_pirq_to_irq(d, pirq);
-    if ( irq <= 0 )
-        return ACKTYPE_NONE;
-
-    desc = irq_to_desc(irq);
-
     if ( desc->handler == &no_irq_type )
         return ACKTYPE_NONE;
 
@@ -1591,7 +1582,8 @@ static int pirq_acktype(struct domain *d
     if ( !strcmp(desc->handler->typename, "XT-PIC") )
         return ACKTYPE_UNMASK;
 
-    printk("Unknown PIC type '%s' for IRQ %d\n", desc->handler->typename, irq);
+    printk("Unknown PIC type '%s' for IRQ%d\n",
+           desc->handler->typename, desc->irq);
     BUG();
 
     return 0;
@@ -1668,7 +1660,7 @@ int pirq_guest_bind(struct vcpu *v, stru
         action->nr_guests   = 0;
         action->in_flight   = 0;
         action->shareable   = will_share;
-        action->ack_type    = pirq_acktype(v->domain, pirq->pirq);
+        action->ack_type    = irq_acktype(desc);
         init_timer(&action->eoi_timer, irq_guest_eoi_timer_fn, desc, 0);
 
         desc->status |= IRQ_GUEST;






* Re: [PATCH v2 07/12] x86/IRQ: fix locking around vector management
@ 2019-05-08 13:16       ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-08 13:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Kevin Tian, Wei Liu, Roger Pau Monne

>>> On 08.05.19 at 15:10, <JBeulich@suse.com> wrote:
> All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
> fields, and hence ought to be called with the descriptor lock held in
> addition to vector_lock. This is currently the case for only
> set_desc_affinity() (in the common case) and destroy_irq(), which also
> clarifies what the nesting behavior between the locks has to be.
> Reflect the new expectation by having these functions all take a
> descriptor as parameter instead of an interrupt number.
> 
> Also take care of the two special cases of calls to set_desc_affinity():
> set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called
> directly as well, and in these cases the descriptor locks hadn't got
> acquired till now. For set_ioapic_affinity_irq() this means acquiring /
> releasing of the IO-APIC lock can be plain spin_{,un}lock() then.
> 
> Drop one of the two leading underscores from all three functions at
> the same time.
> 
> There's one case left where descriptors get manipulated with just
> vector_lock held: setup_vector_irq() assumes its caller to acquire
> vector_lock, and hence can't itself acquire the descriptor locks (wrong
> lock order). I don't currently see how to address this.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Also adjust set_ioapic_affinity_irq() and VT-d's
>     dma_msi_set_affinity().

I'm sorry, Kevin, I should have Cc-ed you on this one.

Jan

> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -550,14 +550,14 @@ static void clear_IO_APIC (void)
>  static void
>  set_ioapic_affinity_irq(struct irq_desc *desc, const cpumask_t *mask)
>  {
> -    unsigned long flags;
>      unsigned int dest;
>      int pin, irq;
>      struct irq_pin_list *entry;
>  
>      irq = desc->irq;
>  
> -    spin_lock_irqsave(&ioapic_lock, flags);
> +    spin_lock(&ioapic_lock);
> +
>      dest = set_desc_affinity(desc, mask);
>      if (dest != BAD_APICID) {
>          if ( !x2apic_enabled )
> @@ -580,8 +580,8 @@ set_ioapic_affinity_irq(struct irq_desc
>              entry = irq_2_pin + entry->next;
>          }
>      }
> -    spin_unlock_irqrestore(&ioapic_lock, flags);
>  
> +    spin_unlock(&ioapic_lock);
>  }
>  
>  /*
> @@ -674,16 +674,19 @@ void /*__init*/ setup_ioapic_dest(void)
>      for (ioapic = 0; ioapic < nr_ioapics; ioapic++) {
>          for (pin = 0; pin < nr_ioapic_entries[ioapic]; pin++) {
>              struct irq_desc *desc;
> +            unsigned long flags;
>  
>              irq_entry = find_irq_entry(ioapic, pin, mp_INT);
>              if (irq_entry == -1)
>                  continue;
>              irq = pin_2_irq(irq_entry, ioapic, pin);
>              desc = irq_to_desc(irq);
> +
> +            spin_lock_irqsave(&desc->lock, flags);
>              BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, 
> &cpu_online_map));
>              set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
> +            spin_unlock_irqrestore(&desc->lock, flags);
>          }
> -
>      }
>  }
>  
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -27,6 +27,7 @@
>  #include <public/physdev.h>
>  
>  static int parse_irq_vector_map_param(const char *s);
> +static void _clear_irq_vector(struct irq_desc *desc);
>  
>  /* opt_noirqbalance: If true, software IRQ balancing/affinity is disabled. 
> */
>  bool __read_mostly opt_noirqbalance;
> @@ -120,13 +121,12 @@ static void trace_irq_mask(uint32_t even
>      trace_var(event, 1, sizeof(d), &d);
>  }
>  
> -static int __init __bind_irq_vector(int irq, int vector, const cpumask_t 
> *cpu_mask)
> +static int __init _bind_irq_vector(struct irq_desc *desc, int vector,
> +                                   const cpumask_t *cpu_mask)
>  {
>      cpumask_t online_mask;
>      int cpu;
> -    struct irq_desc *desc = irq_to_desc(irq);
>  
> -    BUG_ON((unsigned)irq >= nr_irqs);
>      BUG_ON((unsigned)vector >= NR_VECTORS);
>  
>      cpumask_and(&online_mask, cpu_mask, &cpu_online_map);
> @@ -137,9 +137,9 @@ static int __init __bind_irq_vector(int
>          return 0;
>      if ( desc->arch.vector != IRQ_VECTOR_UNASSIGNED )
>          return -EBUSY;
> -    trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, irq, vector, &online_mask);
> +    trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, desc->irq, vector, &online_mask);
>      for_each_cpu(cpu, &online_mask)
> -        per_cpu(vector_irq, cpu)[vector] = irq;
> +        per_cpu(vector_irq, cpu)[vector] = desc->irq;
>      desc->arch.vector = vector;
>      cpumask_copy(desc->arch.cpu_mask, &online_mask);
>      if ( desc->arch.used_vectors )
> @@ -153,12 +153,18 @@ static int __init __bind_irq_vector(int
>  
>  int __init bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
>  {
> +    struct irq_desc *desc = irq_to_desc(irq);
>      unsigned long flags;
>      int ret;
>  
> -    spin_lock_irqsave(&vector_lock, flags);
> -    ret = __bind_irq_vector(irq, vector, cpu_mask);
> -    spin_unlock_irqrestore(&vector_lock, flags);
> +    BUG_ON((unsigned)irq >= nr_irqs);
> +
> +    spin_lock_irqsave(&desc->lock, flags);
> +    spin_lock(&vector_lock);
> +    ret = _bind_irq_vector(desc, vector, cpu_mask);
> +    spin_unlock(&vector_lock);
> +    spin_unlock_irqrestore(&desc->lock, flags);
> +
>      return ret;
>  }
>  
> @@ -243,7 +249,9 @@ void destroy_irq(unsigned int irq)
>  
>      spin_lock_irqsave(&desc->lock, flags);
>      desc->handler = &no_irq_type;
> -    clear_irq_vector(irq);
> +    spin_lock(&vector_lock);
> +    _clear_irq_vector(desc);
> +    spin_unlock(&vector_lock);
>      desc->arch.used_vectors = NULL;
>      spin_unlock_irqrestore(&desc->lock, flags);
>  
> @@ -266,11 +274,11 @@ static void release_old_vec(struct irq_d
>      }
>  }
>  
> -static void __clear_irq_vector(int irq)
> +static void _clear_irq_vector(struct irq_desc *desc)
>  {
> -    int cpu, vector, old_vector;
> +    unsigned int cpu;
> +    int vector, old_vector, irq = desc->irq;
>      cpumask_t tmp_mask;
> -    struct irq_desc *desc = irq_to_desc(irq);
>  
>      BUG_ON(!desc->arch.vector);
>  
> @@ -316,11 +324,14 @@ static void __clear_irq_vector(int irq)
>  
>  void clear_irq_vector(int irq)
>  {
> +    struct irq_desc *desc = irq_to_desc(irq);
>      unsigned long flags;
>  
> -    spin_lock_irqsave(&vector_lock, flags);
> -    __clear_irq_vector(irq);
> -    spin_unlock_irqrestore(&vector_lock, flags);
> +    spin_lock_irqsave(&desc->lock, flags);
> +    spin_lock(&vector_lock);
> +    _clear_irq_vector(desc);
> +    spin_unlock(&vector_lock);
> +    spin_unlock_irqrestore(&desc->lock, flags);
>  }
>  
>  int irq_to_vector(int irq)
> @@ -455,8 +466,7 @@ static vmask_t *irq_get_used_vector_mask
>      return ret;
>  }
>  
> -static int __assign_irq_vector(
> -    int irq, struct irq_desc *desc, const cpumask_t *mask)
> +static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
>  {
>      /*
>       * NOTE! The local APIC isn't very good at handling
> @@ -470,7 +480,8 @@ static int __assign_irq_vector(
>       * 0x80, because int 0x80 is hm, kind of importantish. ;)
>       */
>      static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0;
> -    int cpu, err, old_vector;
> +    unsigned int cpu;
> +    int err, old_vector, irq = desc->irq;
>      vmask_t *irq_used_vectors = NULL;
>  
>      old_vector = irq_to_vector(irq);
> @@ -583,8 +594,12 @@ int assign_irq_vector(int irq, const cpu
>      
>      BUG_ON(irq >= nr_irqs || irq <0);
>  
> -    spin_lock_irqsave(&vector_lock, flags);
> -    ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS);
> +    spin_lock_irqsave(&desc->lock, flags);
> +
> +    spin_lock(&vector_lock);
> +    ret = _assign_irq_vector(desc, mask ?: TARGET_CPUS);
> +    spin_unlock(&vector_lock);
> +
>      if ( !ret )
>      {
>          ret = desc->arch.vector;
> @@ -593,7 +608,8 @@ int assign_irq_vector(int irq, const cpu
>          else
>              cpumask_setall(desc->affinity);
>      }
> -    spin_unlock_irqrestore(&vector_lock, flags);
> +
> +    spin_unlock_irqrestore(&desc->lock, flags);
>  
>      return ret;
>  }
> @@ -767,7 +783,6 @@ void irq_complete_move(struct irq_desc *
>  
>  unsigned int set_desc_affinity(struct irq_desc *desc, const cpumask_t 
> *mask)
>  {
> -    unsigned int irq;
>      int ret;
>      unsigned long flags;
>      cpumask_t dest_mask;
> @@ -775,10 +790,8 @@ unsigned int set_desc_affinity(struct ir
>      if (!cpumask_intersects(mask, &cpu_online_map))
>          return BAD_APICID;
>  
> -    irq = desc->irq;
> -
>      spin_lock_irqsave(&vector_lock, flags);
> -    ret = __assign_irq_vector(irq, desc, mask);
> +    ret = _assign_irq_vector(desc, mask);
>      spin_unlock_irqrestore(&vector_lock, flags);
>  
>      if (ret < 0)
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
>      unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
>                               : NUMA_NO_NODE;
>      const cpumask_t *cpumask = &cpu_online_map;
> +    struct irq_desc *desc;
>  
>      if ( node < MAX_NUMNODES && node_online(node) &&
>           cpumask_intersects(&node_to_cpumask(node), cpumask) )
>          cpumask = &node_to_cpumask(node);
> -    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
> +
> +    desc = irq_to_desc(drhd->iommu->msi.irq);
> +    spin_lock_irq(&desc->lock);
> +    dma_msi_set_affinity(desc, cpumask);
> +    spin_unlock_irq(&desc->lock);
>  }
>  
>  static int adjust_vtd_irq_affinities(void)
> 




* Re: [PATCH v2 07/12] x86/IRQ: fix locking around vector management
@ 2019-05-11  0:11         ` Tian, Kevin
  0 siblings, 0 replies; 196+ messages in thread
From: Tian, Kevin @ 2019-05-11  0:11 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Wednesday, May 8, 2019 9:16 PM
> 
> >>> On 08.05.19 at 15:10, <JBeulich@suse.com> wrote:
> > All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
> > fields, and hence ought to be called with the descriptor lock held in
> > addition to vector_lock. This is currently the case for only
> > set_desc_affinity() (in the common case) and destroy_irq(), which also
> > clarifies what the nesting behavior between the locks has to be.
> > Reflect the new expectation by having these functions all take a
> > descriptor as parameter instead of an interrupt number.
> >
> > Also take care of the two special cases of calls to set_desc_affinity():
> > set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called
> > directly as well, and in these cases the descriptor locks hadn't got
> > acquired till now. For set_ioapic_affinity_irq() this means acquiring /
> > releasing of the IO-APIC lock can be plain spin_{,un}lock() then.
> >
> > Drop one of the two leading underscores from all three functions at
> > the same time.
> >
> > There's one case left where descriptors get manipulated with just
> > vector_lock held: setup_vector_irq() assumes its caller to acquire
> > vector_lock, and hence can't itself acquire the descriptor locks (wrong
> > lock order). I don't currently see how to address this.
> >
> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > ---
> > v2: Also adjust set_ioapic_affinity_irq() and VT-d's
> >     dma_msi_set_affinity().
> 
> I'm sorry, Kevin, I should have Cc-ed you on this one.

Reviewed-by: Kevin Tian <kevin.tian@intel.com> for vtd part.


* Re: [PATCH v2 01/12] x86/IRQ: deal with move-in-progress state in fixup_irqs()
@ 2019-05-13  9:04       ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-13  9:04 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Wed, May 08, 2019 at 07:03:09AM -0600, Jan Beulich wrote:
> The flag being set may prevent affinity changes, as these often imply
> assignment of a new vector. When there's no possible destination left
> for the IRQ, the clearing of the flag needs to happen right from
> fixup_irqs().
> 
> Additionally _assign_irq_vector() needs to avoid setting the flag when
> there's no online CPU left in what gets put into ->arch.old_cpu_mask.
> The old vector can be released right away in this case.
> 
> Also extend the log message about broken affinity to include the new
> affinity as well, allowing to notice issues with affinity changes not
> actually having taken place. Swap the if/else-if order there at the
> same time to reduce the amount of conditions checked.
> 
> At the same time replace two open coded instances of the new helper
> function.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Thanks,

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

One comment below.

> ---
> v2: Add/use valid_irq_vector().
> v1b: Also update vector_irq[] in the code added to fixup_irqs().
> 
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -99,6 +99,11 @@ void unlock_vector_lock(void)
>      spin_unlock(&vector_lock);
>  }
>  
> +static inline bool valid_irq_vector(unsigned int vector)
> +{
> +    return vector >= FIRST_DYNAMIC_VECTOR && vector <= LAST_HIPRIORITY_VECTOR;
> +}
> +
>  static void trace_irq_mask(u32 event, int irq, int vector, cpumask_t *mask)
>  {
>      struct {
> @@ -242,6 +247,22 @@ void destroy_irq(unsigned int irq)
>      xfree(action);
>  }
>  
> +static void release_old_vec(struct irq_desc *desc)
> +{
> +    unsigned int vector = desc->arch.old_vector;
> +
> +    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> +    cpumask_clear(desc->arch.old_cpu_mask);
> +
> +    if ( !valid_irq_vector(vector) )
> +        ASSERT_UNREACHABLE();
> +    else if ( desc->arch.used_vectors )
> +    {
> +        ASSERT(test_bit(vector, desc->arch.used_vectors));
> +        clear_bit(vector, desc->arch.used_vectors);
> +    }
> +}
> +
>  static void __clear_irq_vector(int irq)
>  {
>      int cpu, vector, old_vector;
> @@ -285,14 +306,7 @@ static void __clear_irq_vector(int irq)
>          per_cpu(vector_irq, cpu)[old_vector] = ~irq;
>      }
>  
> -    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> -    cpumask_clear(desc->arch.old_cpu_mask);
> -
> -    if ( desc->arch.used_vectors )
> -    {
> -        ASSERT(test_bit(old_vector, desc->arch.used_vectors));
> -        clear_bit(old_vector, desc->arch.used_vectors);
> -    }
> +    release_old_vec(desc);
>  
>      desc->arch.move_in_progress = 0;
>  }
> @@ -517,12 +531,21 @@ next:
>          /* Found one! */
>          current_vector = vector;
>          current_offset = offset;
> -        if (old_vector > 0) {
> -            desc->arch.move_in_progress = 1;
> -            cpumask_copy(desc->arch.old_cpu_mask, desc->arch.cpu_mask);
> +
> +        if ( old_vector > 0 )

Maybe you could use valid_irq_vector here, or compare against
IRQ_VECTOR_UNASSIGNED?

The fact that IRQ_VECTOR_UNASSIGNED is a negative value is an
implementation detail that shouldn't be exposed directly in the code
IMO.

Roger.


* Re: [PATCH v2 03/12] x86/IRQ: avoid UB (or worse) in trace_irq_mask()
@ 2019-05-13  9:08       ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-13  9:08 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, xen-devel, Wei Liu, Andrew Cooper

On Wed, May 08, 2019 at 07:07:21AM -0600, Jan Beulich wrote:
> Dynamically allocated CPU mask objects may be smaller than cpumask_t, so
> copying has to be restricted to the actual allocation size. This is
> particulary important since the function doesn't bail early when tracing
> is not active, so even production builds would be affected by potential
> misbehavior here.
> 
> Take the opportunity and also
> - use initializers instead of assignment + memset(),
> - constify the cpumask_t input pointer,
> - u32 -> uint32_t.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.


* Re: [PATCH v2 01/12] x86/IRQ: deal with move-in-progress state in fixup_irqs()
@ 2019-05-13  9:09         ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-13  9:09 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 13.05.19 at 11:04, <roger.pau@citrix.com> wrote:
> On Wed, May 08, 2019 at 07:03:09AM -0600, Jan Beulich wrote:
>> The flag being set may prevent affinity changes, as these often imply
>> assignment of a new vector. When there's no possible destination left
>> for the IRQ, the clearing of the flag needs to happen right from
>> fixup_irqs().
>> 
>> Additionally _assign_irq_vector() needs to avoid setting the flag when
>> there's no online CPU left in what gets put into ->arch.old_cpu_mask.
>> The old vector can be released right away in this case.
>> 
>> Also extend the log message about broken affinity to include the new
>> affinity as well, allowing to notice issues with affinity changes not
>> actually having taken place. Swap the if/else-if order there at the
>> same time to reduce the amount of conditions checked.
>> 
>> At the same time replace two open coded instances of the new helper
>> function.
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Thanks,
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> @@ -517,12 +531,21 @@ next:
>>          /* Found one! */
>>          current_vector = vector;
>>          current_offset = offset;
>> -        if (old_vector > 0) {
>> -            desc->arch.move_in_progress = 1;
>> -            cpumask_copy(desc->arch.old_cpu_mask, desc->arch.cpu_mask);
>> +
>> +        if ( old_vector > 0 )
> 
> Maybe you could use valid_irq_vector here, or compare against
> IRQ_VECTOR_UNASSIGNED?

Not in this patch, but I'd like to widen the use of valid_irq_vector()
subsequently, which would likely also include this case.

Jan



* Re: [PATCH v2 03/12] x86/IRQ: avoid UB (or worse) in trace_irq_mask()
@ 2019-05-13 10:42       ` George Dunlap
  0 siblings, 0 replies; 196+ messages in thread
From: George Dunlap @ 2019-05-13 10:42 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, Roger Pau Monne

On 5/8/19 2:07 PM, Jan Beulich wrote:
> Dynamically allocated CPU mask objects may be smaller than cpumask_t, so
> copying has to be restricted to the actual allocation size. This is
> particulary important since the function doesn't bail early when tracing
> is not active, so even production builds would be affected by potential
> misbehavior here.
> 
> Take the opportunity and also
> - use initializers instead of assignment + memset(),
> - constify the cpumask_t input pointer,
> - u32 -> uint32_t.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: New.
> ---
> TBD: I wonder whether the function shouldn't gain an early tb_init_done
>      check, like many other trace_*() have.

Yeah, avoiding these memcopies when tracing is not enabled seems like a
good thing.

Either way:

Acked-by: George Dunlap <george.dunlap@citrix.com>

> 
> George, despite your general request to be copied on entire series
> rather than individual patches, I thought it would be better to copy
> you on just this one (for its tracing aspect), as the patch here is
> independent of the rest of the series, but at least one later patch
> depends on the parameter constification done here.

Yes, I think in this case this was the easiest thing for me.  Thanks. :-)

 -George


* Re: [PATCH v2 06/12] x86/IRQ: consolidate use of ->arch.cpu_mask
@ 2019-05-13 11:32       ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-13 11:32 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Wed, May 08, 2019 at 07:10:29AM -0600, Jan Beulich wrote:
> Mixed meaning was implied so far by different pieces of code -
> disagreement was in particular about whether to expect offline CPUs'
> bits to possibly be set. Switch to a mostly consistent meaning
> (exception being high priority interrupts, which would perhaps better
> be switched to the same model as well in due course). Use the field to
> record the vector allocation mask, i.e. potentially including bits of
> offline (parked) CPUs. This implies that before passing the mask to
> certain functions (most notably cpu_mask_to_apicid()) it needs to be
> further reduced to the online subset.
> 
> The exception of high priority interrupts is also why for the moment
> _bind_irq_vector() is left as is, despite looking wrong: It's used
> exclusively for IRQ0, which isn't supposed to move off CPU0 at any time.
> 
> The prior lack of restricting to online CPUs in set_desc_affinity()
> before calling cpu_mask_to_apicid() in particular allowed (in x2APIC
> clustered mode) offlined CPUs to end up enabled in an IRQ's destination
> field. (I wonder whether vector_allocation_cpumask_flat() shouldn't
> follow a similar model, using cpu_present_map in favor of
> cpu_online_map.)
> 
> For IO-APIC code it was definitely wrong to potentially store, as a
> fallback, TARGET_CPUS (i.e. all online ones) into the field, as that
> would have caused problems when determining on which CPUs to release
> vectors when they've gone out of use. Disable interrupts instead when
> no valid target CPU can be established (which code elsewhere should
> guarantee to never happen), and log a message in such an unlikely event.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Thanks.

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Some comments below.

> ---
> v2: New.
> 
> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -680,7 +680,7 @@ void /*__init*/ setup_ioapic_dest(void)
>                  continue;
>              irq = pin_2_irq(irq_entry, ioapic, pin);
>              desc = irq_to_desc(irq);
> -            BUG_ON(cpumask_empty(desc->arch.cpu_mask));
> +            BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map));

I wonder if maybe you could instead do:

if ( cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map) )
    set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
else
    ASSERT_UNREACHABLE();

I guess if the IRQ is in use by Xen itself the failure ought to be
fatal.

>              set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
>          }
>  
> @@ -2197,7 +2197,6 @@ int io_apic_set_pci_routing (int ioapic,
>  {
>      struct irq_desc *desc = irq_to_desc(irq);
>      struct IO_APIC_route_entry entry;
> -    cpumask_t mask;
>      unsigned long flags;
>      int vector;
>  
> @@ -2232,11 +2231,17 @@ int io_apic_set_pci_routing (int ioapic,
>          return vector;
>      entry.vector = vector;
>  
> -    cpumask_copy(&mask, TARGET_CPUS);
> -    /* Don't chance ending up with an empty mask. */
> -    if (cpumask_intersects(&mask, desc->arch.cpu_mask))
> -        cpumask_and(&mask, &mask, desc->arch.cpu_mask);
> -    SET_DEST(entry, logical, cpu_mask_to_apicid(&mask));
> +    if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) {
> +        cpumask_t *mask = this_cpu(scratch_cpumask);
> +
> +        cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
> +        SET_DEST(entry, logical, cpu_mask_to_apicid(mask));
> +    } else {
> +        printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n",
> +               irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
> +               nr_cpu_ids, cpumask_bits(TARGET_CPUS));
> +        desc->status |= IRQ_DISABLED;
> +    }

Hm, part of this file doesn't seem to use Xen coding style, but the
chunk you add below does use it. And there are functions (like
mask_and_ack_level_ioapic_irq that seem to use a mix of coding
styles).

I'm not sure what's the policy here, should new chunks follow Xen's
coding style?

>  
>      apic_printk(APIC_DEBUG, KERN_DEBUG "IOAPIC[%d]: Set PCI routing entry "
>  		"(%d-%d -> %#x -> IRQ %d Mode:%i Active:%i)\n", ioapic,
> @@ -2422,7 +2427,21 @@ int ioapic_guest_write(unsigned long phy
>      /* Set the vector field to the real vector! */
>      rte.vector = desc->arch.vector;
>  
> -    SET_DEST(rte, logical, cpu_mask_to_apicid(desc->arch.cpu_mask));
> +    if ( cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS) )
> +    {
> +        cpumask_t *mask = this_cpu(scratch_cpumask);
> +
> +        cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
> +        SET_DEST(rte, logical, cpu_mask_to_apicid(mask));
> +    }
> +    else
> +    {
> +        gprintk(XENLOG_ERR, "IRQ%d: no target CPU (%*pb vs %*pb)\n",
> +               irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
> +               nr_cpu_ids, cpumask_bits(TARGET_CPUS));
> +        desc->status |= IRQ_DISABLED;
> +        rte.mask = 1;
> +    }
>  
>      __ioapic_write_entry(apic, pin, 0, rte);
>      
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -471,11 +471,13 @@ static int __assign_irq_vector(
>       */
>      static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0;
>      int cpu, err, old_vector;
> -    cpumask_t tmp_mask;
>      vmask_t *irq_used_vectors = NULL;
>  
>      old_vector = irq_to_vector(irq);
> -    if (old_vector > 0) {
> +    if ( old_vector > 0 )

Another candidate to switch to valid_irq_vector or at least make an
explicit comparison with IRQ_VECTOR_UNASSIGNED.

Seeing your reply to my comment in that direction on a previous patch
this can be done as a follow up.

Roger.


> -    if (old_vector > 0) {
> +    if ( old_vector > 0 )

Another candidate to switch to valid_irq_vector or at least make an
explicit comparison with IRQ_VECTOR_UNASSIGNED.

Seeing your reply to my comment in that direction on a previous patch
this can be done as a follow up.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread
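The BUG_ON() change quoted above turns on the difference between an empty mask and a non-empty mask that contains no online CPUs. A minimal sketch with plain bitmasks (toy stand-ins, not Xen's real cpumask_t API) shows why cpumask_empty() is the weaker check:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for Xen's cpumask primitives: one word, one bit per CPU. */
typedef unsigned long toy_cpumask_t;

static bool toy_cpumask_empty(toy_cpumask_t m)
{
    return m == 0;
}

static bool toy_cpumask_intersects(toy_cpumask_t a, toy_cpumask_t b)
{
    return (a & b) != 0;
}
```

With an online map of 0x3 (CPUs 0-1) and an IRQ whose mask is 0xc (CPUs 2-3, both since offlined), the mask is non-empty yet has no valid target, so only the intersects() form catches the problem.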

* Re: [PATCH v2 03/12] x86/IRQ: avoid UB (or worse) in trace_irq_mask()
@ 2019-05-13 12:05         ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-13 12:05 UTC (permalink / raw)
  To: george.dunlap
  Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 13.05.19 at 12:42, <george.dunlap@citrix.com> wrote:
> On 5/8/19 2:07 PM, Jan Beulich wrote:
>> TBD: I wonder whether the function shouldn't gain an early tb_init_done
>>      check, like many other trace_*() have.
> 
> Yeah, avoiding these memcopies when tracing is not enabled seems like a
> good thing.

I've taken note to submit a respective follow-on patch.

> Either way:
> 
> Acked-by: George Dunlap <george.dunlap@citrix.com>

Thanks, Jan




^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v2 07/12] x86/IRQ: fix locking around vector management
@ 2019-05-13 13:48       ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-13 13:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Wed, May 08, 2019 at 07:10:59AM -0600, Jan Beulich wrote:
> All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
> fields, and hence ought to be called with the descriptor lock held in
> addition to vector_lock. This is currently the case for only
> set_desc_affinity() (in the common case) and destroy_irq(), which also
> clarifies what the nesting behavior between the locks has to be.
> Reflect the new expectation by having these functions all take a
> descriptor as parameter instead of an interrupt number.
> 
> Also take care of the two special cases of calls to set_desc_affinity():
> set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called
> directly as well, and in these cases the descriptor locks hadn't got
> acquired till now. For set_ioapic_affinity_irq() this means acquiring /
> releasing of the IO-APIC lock can be plain spin_{,un}lock() then.
> 
> Drop one of the two leading underscores from all three functions at
> the same time.
> 
> There's one case left where descriptors get manipulated with just
> vector_lock held: setup_vector_irq() assumes its caller to acquire
> vector_lock, and hence can't itself acquire the descriptor locks (wrong
> lock order). I don't currently see how to address this.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
>      unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
>                               : NUMA_NO_NODE;
>      const cpumask_t *cpumask = &cpu_online_map;
> +    struct irq_desc *desc;
>  
>      if ( node < MAX_NUMNODES && node_online(node) &&
>           cpumask_intersects(&node_to_cpumask(node), cpumask) )
>          cpumask = &node_to_cpumask(node);
> -    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
> +
> +    desc = irq_to_desc(drhd->iommu->msi.irq);
> +    spin_lock_irq(&desc->lock);

I would use the irqsave/irqrestore variants here for extra safety.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 196+ messages in thread
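The locking fix discussed above relies on a fixed nesting order (descriptor lock outer, vector lock inner). As a generic illustration of that discipline, unrelated to Xen's actual spinlock implementation, two pthread mutexes taken in the same order on every path avoid the ABBA deadlock:

```c
#include <assert.h>
#include <pthread.h>

/* Fixed nesting order: desc_lock is always taken before vector_lock. */
static pthread_mutex_t desc_lock   = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t vector_lock = PTHREAD_MUTEX_INITIALIZER;

static int assigned_vector = -1;

/* Writers and readers acquire in the same order, so no two threads can
 * each hold one lock while waiting for the other. */
static void assign_vector(int v)
{
    pthread_mutex_lock(&desc_lock);
    pthread_mutex_lock(&vector_lock);
    assigned_vector = v;
    pthread_mutex_unlock(&vector_lock);
    pthread_mutex_unlock(&desc_lock);
}

static int read_vector(void)
{
    pthread_mutex_lock(&desc_lock);
    pthread_mutex_lock(&vector_lock);
    int v = assigned_vector;
    pthread_mutex_unlock(&vector_lock);
    pthread_mutex_unlock(&desc_lock);
    return v;
}
```

The setup_vector_irq() caveat in the commit message is exactly a violation of this ordering: its caller already holds the inner lock, so the outer one cannot safely be taken.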

* Re: [PATCH v2 08/12] x86/IRQs: correct/tighten vector check in _clear_irq_vector()
@ 2019-05-13 14:01       ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-13 14:01 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Wed, May 08, 2019 at 07:11:52AM -0600, Jan Beulich wrote:
> If any particular value was to be checked against, it would need to be
> IRQ_VECTOR_UNASSIGNED.
> 
> Reported-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Be more strict though and use valid_irq_vector() instead.
> 
> Take the opportunity and also convert local variables to unsigned int.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.


^ permalink raw reply	[flat|nested] 196+ messages in thread
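The preference for valid_irq_vector() over `old_vector > 0` is that a range check documents intent where a sign check only happens to work. A standalone sketch of the idea (the vector bounds here are illustrative, not Xen's real values, which derive from its IDT layout):

```c
#include <assert.h>
#include <stdbool.h>

#define IRQ_VECTOR_UNASSIGNED  (-1)
/* Illustrative bounds only; not Xen's actual vector range. */
#define FIRST_DYNAMIC_VECTOR   0x20
#define LAST_DYNAMIC_VECTOR    0xef

static bool valid_irq_vector(int vector)
{
    return vector >= FIRST_DYNAMIC_VECTOR && vector <= LAST_DYNAMIC_VECTOR;
}
```

A bare `> 0` test would also accept values like 0x10, which can never have been assigned; the range check rejects them, and comparing explicitly against IRQ_VECTOR_UNASSIGNED makes the "no vector yet" case self-describing.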

* Re: [PATCH v2 12/12] x86/IRQ: simplify and rename pirq_acktype()
@ 2019-05-13 14:14       ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-13 14:14 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Wed, May 08, 2019 at 07:14:06AM -0600, Jan Beulich wrote:
> Its only caller already has the IRQ descriptor in its hands, so there's
> no need for the function to re-obtain it. As a result the leading p of
> its name is no longer appropriate and hence gets dropped.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.


^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v2 07/12] x86/IRQ: fix locking around vector management
@ 2019-05-13 14:19         ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-13 14:19 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 13.05.19 at 15:48, <roger.pau@citrix.com> wrote:
> On Wed, May 08, 2019 at 07:10:59AM -0600, Jan Beulich wrote:
>> --- a/xen/drivers/passthrough/vtd/iommu.c
>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>> @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
>>      unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
>>                               : NUMA_NO_NODE;
>>      const cpumask_t *cpumask = &cpu_online_map;
>> +    struct irq_desc *desc;
>>  
>>      if ( node < MAX_NUMNODES && node_online(node) &&
>>           cpumask_intersects(&node_to_cpumask(node), cpumask) )
>>          cpumask = &node_to_cpumask(node);
>> -    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
>> +
>> +    desc = irq_to_desc(drhd->iommu->msi.irq);
>> +    spin_lock_irq(&desc->lock);
> 
> I would use the irqsave/irqrestore variants here for extra safety.

Hmm, maybe. But I think we're in bigger trouble if IRQs indeed
ended up enabled at any of the two points where this function
gets called.

Jan




^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v2 07/12] x86/IRQ: fix locking around vector management
@ 2019-05-13 14:45           ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-13 14:45 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Mon, May 13, 2019 at 08:19:04AM -0600, Jan Beulich wrote:
> >>> On 13.05.19 at 15:48, <roger.pau@citrix.com> wrote:
> > On Wed, May 08, 2019 at 07:10:59AM -0600, Jan Beulich wrote:
> >> --- a/xen/drivers/passthrough/vtd/iommu.c
> >> +++ b/xen/drivers/passthrough/vtd/iommu.c
> >> @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
> >>      unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
> >>                               : NUMA_NO_NODE;
> >>      const cpumask_t *cpumask = &cpu_online_map;
> >> +    struct irq_desc *desc;
> >>  
> >>      if ( node < MAX_NUMNODES && node_online(node) &&
> >>           cpumask_intersects(&node_to_cpumask(node), cpumask) )
> >>          cpumask = &node_to_cpumask(node);
> >> -    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
> >> +
> >> +    desc = irq_to_desc(drhd->iommu->msi.irq);
> >> +    spin_lock_irq(&desc->lock);
> > 
> > I would use the irqsave/irqrestore variants here for extra safety.
> 
> Hmm, maybe. But I think we're in bigger trouble if IRQs indeed
> ended up enabled at any of the two points where this function
> gets called.

I think I'm misreading the above, but if you expect
adjust_irq_affinity to always be called with interrupts disabled, using
spin_unlock_irq is wrong as it unconditionally enables interrupts.

Roger.


^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v2 07/12] x86/IRQ: fix locking around vector management
@ 2019-05-13 15:05             ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-13 15:05 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 13.05.19 at 16:45, <roger.pau@citrix.com> wrote:
> On Mon, May 13, 2019 at 08:19:04AM -0600, Jan Beulich wrote:
>> >>> On 13.05.19 at 15:48, <roger.pau@citrix.com> wrote:
>> > On Wed, May 08, 2019 at 07:10:59AM -0600, Jan Beulich wrote:
>> >> --- a/xen/drivers/passthrough/vtd/iommu.c
>> >> +++ b/xen/drivers/passthrough/vtd/iommu.c
>> >> @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
>> >>      unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
>> >>                               : NUMA_NO_NODE;
>> >>      const cpumask_t *cpumask = &cpu_online_map;
>> >> +    struct irq_desc *desc;
>> >>  
>> >>      if ( node < MAX_NUMNODES && node_online(node) &&
>> >>           cpumask_intersects(&node_to_cpumask(node), cpumask) )
>> >>          cpumask = &node_to_cpumask(node);
>> >> -    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
>> >> +
>> >> +    desc = irq_to_desc(drhd->iommu->msi.irq);
>> >> +    spin_lock_irq(&desc->lock);
>> > 
>> > I would use the irqsave/irqrestore variants here for extra safety.
>> 
>> Hmm, maybe. But I think we're in bigger trouble if IRQs indeed
>> ended up enabled at any of the two points where this function
>> gets called.
> 
> I think I'm misreading the above, but if you expect
> adjust_irq_affinity to always be called with interrupts disabled using
> spin_unlock_irq is wrong as it unconditionally enables interrupts.

Oops - s/enabled/disabled/ in my earlier reply.

Jan




^ permalink raw reply	[flat|nested] 196+ messages in thread
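The exchange above hinges on the semantics of the two lock flavours: spin_lock_irq()/spin_unlock_irq() unconditionally disable and re-enable interrupts, while the irqsave/irqrestore pair restores whatever state the caller had. A toy model (a plain flag standing in for the CPU interrupt state; nothing here is Xen code) makes the hazard concrete:

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_enabled = true;  /* stand-in for the CPU's interrupt flag */

/* The _irq flavour disables on lock and unconditionally enables on unlock. */
static void toy_lock_irq(void)   { irqs_enabled = false; }
static void toy_unlock_irq(void) { irqs_enabled = true; }

/* The irqsave flavour records the caller's state and puts it back. */
static void toy_lock_irqsave(bool *flags)
{
    *flags = irqs_enabled;
    irqs_enabled = false;
}

static void toy_unlock_irqrestore(bool flags)
{
    irqs_enabled = flags;
}
```

If the caller already runs with interrupts disabled, unlock_irq() wrongly re-enables them on exit, which is the concern raised above; the save/restore variant leaves the caller's state intact.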

* Re: [PATCH v2 06/12] x86/IRQ: consolidate use of ->arch.cpu_mask
@ 2019-05-13 15:21         ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-13 15:21 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 13.05.19 at 13:32, <roger.pau@citrix.com> wrote:
> On Wed, May 08, 2019 at 07:10:29AM -0600, Jan Beulich wrote:
>> --- a/xen/arch/x86/io_apic.c
>> +++ b/xen/arch/x86/io_apic.c
>> @@ -680,7 +680,7 @@ void /*__init*/ setup_ioapic_dest(void)
>>                  continue;
>>              irq = pin_2_irq(irq_entry, ioapic, pin);
>>              desc = irq_to_desc(irq);
>> -            BUG_ON(cpumask_empty(desc->arch.cpu_mask));
>> +            BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map));
> 
> I wonder if maybe you could instead do:
> 
> if ( cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map) )
>     set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
> else
>     ASSERT_UNREACHABLE();
> 
> I guess if the IRQ is in use by Xen itself the failure ought to be
> fatal.

And imo also when it's another one (used by Dom0). Iirc we get
here only during Dom0 boot (the commented out __init serving as
a hint). Hence I think BUG_ON() is better in this case than any
form of assertion.

>> @@ -2232,11 +2231,17 @@ int io_apic_set_pci_routing (int ioapic,
>>          return vector;
>>      entry.vector = vector;
>>  
>> -    cpumask_copy(&mask, TARGET_CPUS);
>> -    /* Don't chance ending up with an empty mask. */
>> -    if (cpumask_intersects(&mask, desc->arch.cpu_mask))
>> -        cpumask_and(&mask, &mask, desc->arch.cpu_mask);
>> -    SET_DEST(entry, logical, cpu_mask_to_apicid(&mask));
>> +    if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) {
>> +        cpumask_t *mask = this_cpu(scratch_cpumask);
>> +
>> +        cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
>> +        SET_DEST(entry, logical, cpu_mask_to_apicid(mask));
>> +    } else {
>> +        printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n",
>> +               irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
>> +               nr_cpu_ids, cpumask_bits(TARGET_CPUS));
>> +        desc->status |= IRQ_DISABLED;
>> +    }
> 
> Hm, part of this file doesn't seem to use Xen coding style, but the
> chunk you add below does use it. And there are functions (like
> mask_and_ack_level_ioapic_irq that seem to use a mix of coding
> styles).
> 
> I'm not sure what's the policy here, should new chunks follow Xen's
> coding style?

Well, I've decided to match surrounding code's style, until the file
gets morphed into consistent shape.

Jan




^ permalink raw reply	[flat|nested] 196+ messages in thread
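The this_cpu(scratch_cpumask) use in the diff above replaces an on-stack cpumask_t with a preallocated per-CPU buffer, so that large masks don't eat into the small hypervisor stack. The same pattern in portable C, with thread-local storage standing in for Xen's per-CPU this_cpu() accessor (an approximation, since per-CPU and per-thread are not the same thing):

```c
#include <assert.h>
#include <string.h>

#define TOY_MASK_BYTES 512  /* big enough that a stack copy would be unwelcome */

/* One scratch buffer per thread, reused across calls instead of being
 * allocated on the stack each time. */
static _Thread_local unsigned char scratch_mask[TOY_MASK_BYTES];

static unsigned char *get_scratch_mask(void)
{
    /* Callers get a zeroed buffer; contents are only valid until the
     * next call from the same thread. */
    memset(scratch_mask, 0, sizeof(scratch_mask));
    return scratch_mask;
}
```

The trade-off is the same as in the patch: the buffer must not be held across any point where another user on the same CPU (here, thread) could reuse it.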

* [PATCH v3 00/15] x86: IRQ management adjustments
@ 2019-05-17 10:39   ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:39 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

First and foremost this series is trying to deal with CPU offlining
issues, which have become more prominent with the recently
added SMT enable/disable operation in xen-hptool. Later patches
in the series then carry out more or less unrelated changes
(hopefully improvements) noticed while looking at various pieces
of involved code.

01: deal with move-in-progress state in fixup_irqs()
02: deal with move cleanup count state in fixup_irqs()
03: improve dump_irqs()
04: desc->affinity should strictly represent the requested value
05: consolidate use of ->arch.cpu_mask
06: fix locking around vector management
07: target online CPUs when binding guest IRQ
08: correct/tighten vector check in _clear_irq_vector()
09: make fixup_irqs() skip unconnected internally used interrupts
10: drop redundant cpumask_empty() from move_masked_irq()
11: simplify and rename pirq_acktype()
12: add explicit tracing-enabled check to trace_irq_mask()
13: tighten vector checks
14: eliminate some on-stack cpumask_t instances
15: move {,_}clear_irq_vector()

In principle patches 1, 2, 4-7, and maybe 9 are backporting candidates.
Their intrusive nature makes wanting to do so questionable, though.

This (in particular patch 14) depends on "[PATCH 0/4] x86: EOI
timer corrections / improvements". For v3 specific information
please see the individual patches.

Jan




^ permalink raw reply	[flat|nested] 196+ messages in thread

* [PATCH v3 01/15] x86/IRQ: deal with move-in-progress state in fixup_irqs()
@ 2019-05-17 10:44     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:44 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

The flag being set may prevent affinity changes, as these often imply
assignment of a new vector. When there's no possible destination left
for the IRQ, the clearing of the flag needs to happen right from
fixup_irqs().

Additionally _assign_irq_vector() needs to avoid setting the flag when
there's no online CPU left in what gets put into ->arch.old_cpu_mask.
The old vector can be released right away in this case.

Also extend the log message about broken affinity to include the new
affinity as well, making it possible to notice when an affinity change
did not actually take place. Swap the if/else-if order there at the
same time to reduce the number of conditions checked.

At the same time replace two open-coded instances of this logic with the
new helper function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
v3: Move release_old_vec() further up (so a later patch won't need to).
    Re-base.
v2: Add/use valid_irq_vector().
v1b: Also update vector_irq[] in the code added to fixup_irqs().

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -99,6 +99,27 @@ void unlock_vector_lock(void)
     spin_unlock(&vector_lock);
 }
 
+static inline bool valid_irq_vector(unsigned int vector)
+{
+    return vector >= FIRST_DYNAMIC_VECTOR && vector <= LAST_HIPRIORITY_VECTOR;
+}
+
+static void release_old_vec(struct irq_desc *desc)
+{
+    unsigned int vector = desc->arch.old_vector;
+
+    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
+    cpumask_clear(desc->arch.old_cpu_mask);
+
+    if ( !valid_irq_vector(vector) )
+        ASSERT_UNREACHABLE();
+    else if ( desc->arch.used_vectors )
+    {
+        ASSERT(test_bit(vector, desc->arch.used_vectors));
+        clear_bit(vector, desc->arch.used_vectors);
+    }
+}
+
 static void trace_irq_mask(uint32_t event, int irq, int vector,
                            const cpumask_t *mask)
 {
@@ -288,14 +309,7 @@ static void __clear_irq_vector(int irq)
         per_cpu(vector_irq, cpu)[old_vector] = ~irq;
     }
 
-    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
-    cpumask_clear(desc->arch.old_cpu_mask);
-
-    if ( desc->arch.used_vectors )
-    {
-        ASSERT(test_bit(old_vector, desc->arch.used_vectors));
-        clear_bit(old_vector, desc->arch.used_vectors);
-    }
+    release_old_vec(desc);
 
     desc->arch.move_in_progress = 0;
 }
@@ -520,12 +534,21 @@ next:
         /* Found one! */
         current_vector = vector;
         current_offset = offset;
-        if (old_vector > 0) {
-            desc->arch.move_in_progress = 1;
-            cpumask_copy(desc->arch.old_cpu_mask, desc->arch.cpu_mask);
+
+        if ( old_vector > 0 )
+        {
+            cpumask_and(desc->arch.old_cpu_mask, desc->arch.cpu_mask,
+                        &cpu_online_map);
             desc->arch.old_vector = desc->arch.vector;
+            if ( !cpumask_empty(desc->arch.old_cpu_mask) )
+                desc->arch.move_in_progress = 1;
+            else
+                /* This can happen while offlining a CPU. */
+                release_old_vec(desc);
         }
+
         trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, &tmp_mask);
+
         for_each_cpu(new_cpu, &tmp_mask)
             per_cpu(vector_irq, new_cpu)[vector] = irq;
         desc->arch.vector = vector;
@@ -694,14 +717,8 @@ void irq_move_cleanup_interrupt(struct c
 
         if ( desc->arch.move_cleanup_count == 0 )
         {
-            desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
-            cpumask_clear(desc->arch.old_cpu_mask);
-
-            if ( desc->arch.used_vectors )
-            {
-                ASSERT(test_bit(vector, desc->arch.used_vectors));
-                clear_bit(vector, desc->arch.used_vectors);
-            }
+            ASSERT(vector == desc->arch.old_vector);
+            release_old_vec(desc);
         }
 unlock:
         spin_unlock(&desc->lock);
@@ -2400,6 +2417,33 @@ void fixup_irqs(const cpumask_t *mask, b
             continue;
         }
 
+        /*
+         * In order for the affinity adjustment below to be successful, we
+         * need __assign_irq_vector() to succeed. This in particular means
+         * clearing desc->arch.move_in_progress if this would otherwise
+         * prevent the function from succeeding. Since there's no way for the
+         * flag to get cleared anymore when there's no possible destination
+         * left (the only possibility then would be the IRQs enabled window
+         * after this loop), there's then also no race with us doing it here.
+         *
+         * Therefore the logic here and there need to remain in sync.
+         */
+        if ( desc->arch.move_in_progress &&
+             !cpumask_intersects(mask, desc->arch.cpu_mask) )
+        {
+            unsigned int cpu;
+
+            cpumask_and(&affinity, desc->arch.old_cpu_mask, &cpu_online_map);
+
+            spin_lock(&vector_lock);
+            for_each_cpu(cpu, &affinity)
+                per_cpu(vector_irq, cpu)[desc->arch.old_vector] = ~irq;
+            spin_unlock(&vector_lock);
+
+            release_old_vec(desc);
+            desc->arch.move_in_progress = 0;
+        }
+
         cpumask_and(&affinity, &affinity, mask);
         if ( cpumask_empty(&affinity) )
         {
@@ -2418,15 +2462,18 @@ void fixup_irqs(const cpumask_t *mask, b
         if ( desc->handler->enable )
             desc->handler->enable(desc);
 
+        cpumask_copy(&affinity, desc->affinity);
+
         spin_unlock(&desc->lock);
 
         if ( !verbose )
             continue;
 
-        if ( break_affinity && set_affinity )
-            printk("Broke affinity for irq %i\n", irq);
-        else if ( !set_affinity )
-            printk("Cannot set affinity for irq %i\n", irq);
+        if ( !set_affinity )
+            printk("Cannot set affinity for IRQ%u\n", irq);
+        else if ( break_affinity )
+            printk("Broke affinity for IRQ%u, new: %*pb\n",
+                   irq, nr_cpu_ids, &affinity);
     }
 
     /* That doesn't seem sufficient.  Give it 1ms. */





* [PATCH v3 02/15] x86/IRQ: deal with move cleanup count state in fixup_irqs()
@ 2019-05-17 10:45     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

The cleanup IPI may get sent immediately before a CPU gets removed from
the online map. In such a case the IPI would get handled on the CPU
being offlined no earlier than in the interrupts-disabled window after
fixup_irqs()' main loop. This is too late, however, because a possible
affinity change may require a new vector assignment, which will fail
while the IRQ's move cleanup count is still non-zero.

To fix this:
- record the set of CPUs the cleanup IPI actually gets sent to alongside
  setting the move cleanup count,
- adjust the count in fixup_irqs(), accounting for all CPUs that the
  cleanup IPI was sent to but that are no longer online,
- bail early from the cleanup IPI handler when the CPU is no longer
  online, to prevent double accounting.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -668,6 +668,9 @@ void irq_move_cleanup_interrupt(struct c
     ack_APIC_irq();
 
     me = smp_processor_id();
+    if ( !cpu_online(me) )
+        return;
+
     for ( vector = FIRST_DYNAMIC_VECTOR;
           vector <= LAST_HIPRIORITY_VECTOR; vector++)
     {
@@ -727,11 +730,14 @@ unlock:
 
 static void send_cleanup_vector(struct irq_desc *desc)
 {
-    cpumask_t cleanup_mask;
+    cpumask_and(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
+                &cpu_online_map);
+    desc->arch.move_cleanup_count = cpumask_weight(desc->arch.old_cpu_mask);
 
-    cpumask_and(&cleanup_mask, desc->arch.old_cpu_mask, &cpu_online_map);
-    desc->arch.move_cleanup_count = cpumask_weight(&cleanup_mask);
-    send_IPI_mask(&cleanup_mask, IRQ_MOVE_CLEANUP_VECTOR);
+    if ( desc->arch.move_cleanup_count )
+        send_IPI_mask(desc->arch.old_cpu_mask, IRQ_MOVE_CLEANUP_VECTOR);
+    else
+        release_old_vec(desc);
 
     desc->arch.move_in_progress = 0;
 }
@@ -2410,6 +2416,16 @@ void fixup_irqs(const cpumask_t *mask, b
              vector <= LAST_HIPRIORITY_VECTOR )
             cpumask_and(desc->arch.cpu_mask, desc->arch.cpu_mask, mask);
 
+        if ( desc->arch.move_cleanup_count )
+        {
+            /* The cleanup IPI may have got sent while we were still online. */
+            cpumask_andnot(&affinity, desc->arch.old_cpu_mask,
+                           &cpu_online_map);
+            desc->arch.move_cleanup_count -= cpumask_weight(&affinity);
+            if ( !desc->arch.move_cleanup_count )
+                release_old_vec(desc);
+        }
+
         cpumask_copy(&affinity, desc->affinity);
         if ( !desc->action || cpumask_subset(&affinity, mask) )
         {





* [PATCH v3 03/15] x86/IRQ: improve dump_irqs()
@ 2019-05-17 10:46     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:46 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Don't log a stray trailing comma. Shorten a few fields.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2334,7 +2334,7 @@ static void dump_irqs(unsigned char key)
 
         spin_lock_irqsave(&desc->lock, flags);
 
-        printk("   IRQ:%4d affinity:%*pb vec:%02x type=%-15s status=%08x ",
+        printk("   IRQ:%4d aff:%*pb vec:%02x %-15s status=%03x ",
                irq, nr_cpu_ids, cpumask_bits(desc->affinity), desc->arch.vector,
                desc->handler->typename, desc->status);
 
@@ -2345,23 +2345,21 @@ static void dump_irqs(unsigned char key)
         {
             action = (irq_guest_action_t *)desc->action;
 
-            printk("in-flight=%d domain-list=", action->in_flight);
+            printk("in-flight=%d%c",
+                   action->in_flight, action->nr_guests ? ' ' : '\n');
 
-            for ( i = 0; i < action->nr_guests; i++ )
+            for ( i = 0; i < action->nr_guests; )
             {
-                d = action->guest[i];
+                d = action->guest[i++];
                 pirq = domain_irq_to_pirq(d, irq);
                 info = pirq_info(d, pirq);
-                printk("%u:%3d(%c%c%c)",
+                printk("d%d:%3d(%c%c%c)%c",
                        d->domain_id, pirq,
                        evtchn_port_is_pending(d, info->evtchn) ? 'P' : '-',
                        evtchn_port_is_masked(d, info->evtchn) ? 'M' : '-',
-                       (info->masked ? 'M' : '-'));
-                if ( i != action->nr_guests )
-                    printk(",");
+                       info->masked ? 'M' : '-',
+                       i < action->nr_guests ? ',' : '\n');
             }
-
-            printk("\n");
         }
         else if ( desc->action )
             printk("%ps()\n", desc->action->handler);





* [PATCH v3 04/15] x86/IRQ: desc->affinity should strictly represent the requested value
@ 2019-05-17 10:46     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:46 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

desc->arch.cpu_mask reflects the actual set of target CPUs. Don't ever
fiddle with desc->affinity itself, except to store caller requested
values. Note that assign_irq_vector() now takes a NULL incoming CPU mask
to mean "all CPUs", rather than just "all currently online CPUs".
This way no further affinity adjustment is needed after onlining further
CPUs.

This renders both set_native_irq_info() uses (which weren't using proper
locking anyway) redundant - drop the function altogether.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1039,7 +1039,6 @@ static void __init setup_IO_APIC_irqs(vo
             SET_DEST(entry, logical, cpu_mask_to_apicid(TARGET_CPUS));
             spin_lock_irqsave(&ioapic_lock, flags);
             __ioapic_write_entry(apic, pin, 0, entry);
-            set_native_irq_info(irq, TARGET_CPUS);
             spin_unlock_irqrestore(&ioapic_lock, flags);
         }
     }
@@ -2248,7 +2247,6 @@ int io_apic_set_pci_routing (int ioapic,
 
     spin_lock_irqsave(&ioapic_lock, flags);
     __ioapic_write_entry(ioapic, pin, 0, entry);
-    set_native_irq_info(irq, TARGET_CPUS);
     spin_unlock(&ioapic_lock);
 
     spin_lock(&desc->lock);
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -582,11 +582,16 @@ int assign_irq_vector(int irq, const cpu
 
     spin_lock_irqsave(&vector_lock, flags);
     ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS);
-    if (!ret) {
+    if ( !ret )
+    {
         ret = desc->arch.vector;
-        cpumask_copy(desc->affinity, desc->arch.cpu_mask);
+        if ( mask )
+            cpumask_copy(desc->affinity, mask);
+        else
+            cpumask_setall(desc->affinity);
     }
     spin_unlock_irqrestore(&vector_lock, flags);
+
     return ret;
 }
 
@@ -2334,9 +2339,10 @@ static void dump_irqs(unsigned char key)
 
         spin_lock_irqsave(&desc->lock, flags);
 
-        printk("   IRQ:%4d aff:%*pb vec:%02x %-15s status=%03x ",
-               irq, nr_cpu_ids, cpumask_bits(desc->affinity), desc->arch.vector,
-               desc->handler->typename, desc->status);
+        printk("   IRQ:%4d aff:%*pb/%*pb vec:%02x %-15s status=%03x ",
+               irq, nr_cpu_ids, cpumask_bits(desc->affinity),
+               nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
+               desc->arch.vector, desc->handler->typename, desc->status);
 
         if ( ssid )
             printk("Z=%-25s ", ssid);
@@ -2424,8 +2430,7 @@ void fixup_irqs(const cpumask_t *mask, b
                 release_old_vec(desc);
         }
 
-        cpumask_copy(&affinity, desc->affinity);
-        if ( !desc->action || cpumask_subset(&affinity, mask) )
+        if ( !desc->action || cpumask_subset(desc->affinity, mask) )
         {
             spin_unlock(&desc->lock);
             continue;
@@ -2458,12 +2463,13 @@ void fixup_irqs(const cpumask_t *mask, b
             desc->arch.move_in_progress = 0;
         }
 
-        cpumask_and(&affinity, &affinity, mask);
-        if ( cpumask_empty(&affinity) )
+        if ( !cpumask_intersects(mask, desc->affinity) )
         {
             break_affinity = true;
-            cpumask_copy(&affinity, mask);
+            cpumask_setall(&affinity);
         }
+        else
+            cpumask_copy(&affinity, desc->affinity);
 
         if ( desc->handler->disable )
             desc->handler->disable(desc);
--- a/xen/include/xen/irq.h
+++ b/xen/include/xen/irq.h
@@ -162,11 +162,6 @@ extern irq_desc_t *domain_spin_lock_irq_
 extern irq_desc_t *pirq_spin_lock_irq_desc(
     const struct pirq *, unsigned long *pflags);
 
-static inline void set_native_irq_info(unsigned int irq, const cpumask_t *mask)
-{
-    cpumask_copy(irq_to_desc(irq)->affinity, mask);
-}
-
 unsigned int set_desc_affinity(struct irq_desc *, const cpumask_t *);
 
 #ifndef arch_hwdom_irqs






* [PATCH v3 05/15] x86/IRQ: consolidate use of ->arch.cpu_mask
@ 2019-05-17 10:47     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:47 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Different pieces of code have so far implied a mixed meaning for this
field - the disagreement was in particular about whether offline CPUs'
bits may be set. Switch to a mostly consistent meaning (the exception
being high priority interrupts, which would perhaps better be switched
to the same model as well in due course). Use the field to record the
vector allocation mask, i.e. potentially including bits of offline
(parked) CPUs. This implies that before the mask is passed to certain
functions (most notably cpu_mask_to_apicid()) it needs to be reduced
to its online subset.

The exception of high priority interrupts is also why for the moment
_bind_irq_vector() is left as is, despite looking wrong: It's used
exclusively for IRQ0, which isn't supposed to move off CPU0 at any time.

The prior lack of restricting to online CPUs in set_desc_affinity()
before calling cpu_mask_to_apicid() in particular allowed (in x2APIC
clustered mode) offlined CPUs to end up enabled in an IRQ's destination
field. (I wonder whether vector_allocation_cpumask_flat() shouldn't
follow a similar model, using cpu_present_map instead of
cpu_online_map.)

For IO-APIC code it was definitely wrong to potentially store, as a
fallback, TARGET_CPUS (i.e. all online CPUs) into the field, as that
would have caused problems when determining on which CPUs to release
vectors once they've gone out of use. Disable the interrupt instead
when no valid target CPU can be established (something code elsewhere
should guarantee never happens), and log a message in such an unlikely
event.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
v2: New.

--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -680,7 +680,7 @@ void /*__init*/ setup_ioapic_dest(void)
                 continue;
             irq = pin_2_irq(irq_entry, ioapic, pin);
             desc = irq_to_desc(irq);
-            BUG_ON(cpumask_empty(desc->arch.cpu_mask));
+            BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map));
             set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
         }
 
@@ -2194,7 +2194,6 @@ int io_apic_set_pci_routing (int ioapic,
 {
     struct irq_desc *desc = irq_to_desc(irq);
     struct IO_APIC_route_entry entry;
-    cpumask_t mask;
     unsigned long flags;
     int vector;
 
@@ -2229,11 +2228,17 @@ int io_apic_set_pci_routing (int ioapic,
         return vector;
     entry.vector = vector;
 
-    cpumask_copy(&mask, TARGET_CPUS);
-    /* Don't chance ending up with an empty mask. */
-    if (cpumask_intersects(&mask, desc->arch.cpu_mask))
-        cpumask_and(&mask, &mask, desc->arch.cpu_mask);
-    SET_DEST(entry, logical, cpu_mask_to_apicid(&mask));
+    if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) {
+        cpumask_t *mask = this_cpu(scratch_cpumask);
+
+        cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
+        SET_DEST(entry, logical, cpu_mask_to_apicid(mask));
+    } else {
+        printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n",
+               irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
+               nr_cpu_ids, cpumask_bits(TARGET_CPUS));
+        desc->status |= IRQ_DISABLED;
+    }
 
     apic_printk(APIC_DEBUG, KERN_DEBUG "IOAPIC[%d]: Set PCI routing entry "
 		"(%d-%d -> %#x -> IRQ %d Mode:%i Active:%i)\n", ioapic,
@@ -2419,7 +2424,21 @@ int ioapic_guest_write(unsigned long phy
     /* Set the vector field to the real vector! */
     rte.vector = desc->arch.vector;
 
-    SET_DEST(rte, logical, cpu_mask_to_apicid(desc->arch.cpu_mask));
+    if ( cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS) )
+    {
+        cpumask_t *mask = this_cpu(scratch_cpumask);
+
+        cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
+        SET_DEST(rte, logical, cpu_mask_to_apicid(mask));
+    }
+    else
+    {
+        gprintk(XENLOG_ERR, "IRQ%d: no target CPU (%*pb vs %*pb)\n",
+               irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
+               nr_cpu_ids, cpumask_bits(TARGET_CPUS));
+        desc->status |= IRQ_DISABLED;
+        rte.mask = 1;
+    }
 
     __ioapic_write_entry(apic, pin, 0, rte);
     
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -471,11 +471,13 @@ static int __assign_irq_vector(
      */
     static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0;
     int cpu, err, old_vector;
-    cpumask_t tmp_mask;
     vmask_t *irq_used_vectors = NULL;
 
     old_vector = irq_to_vector(irq);
-    if (old_vector > 0) {
+    if ( old_vector > 0 )
+    {
+        cpumask_t tmp_mask;
+
         cpumask_and(&tmp_mask, mask, &cpu_online_map);
         if (cpumask_intersects(&tmp_mask, desc->arch.cpu_mask)) {
             desc->arch.vector = old_vector;
@@ -498,7 +500,9 @@ static int __assign_irq_vector(
     else
         irq_used_vectors = irq_get_used_vector_mask(irq);
 
-    for_each_cpu(cpu, mask) {
+    for_each_cpu(cpu, mask)
+    {
+        const cpumask_t *vec_mask;
         int new_cpu;
         int vector, offset;
 
@@ -506,8 +510,7 @@ static int __assign_irq_vector(
         if (!cpu_online(cpu))
             continue;
 
-        cpumask_and(&tmp_mask, vector_allocation_cpumask(cpu),
-                    &cpu_online_map);
+        vec_mask = vector_allocation_cpumask(cpu);
 
         vector = current_vector;
         offset = current_offset;
@@ -528,7 +531,7 @@ next:
             && test_bit(vector, irq_used_vectors) )
             goto next;
 
-        for_each_cpu(new_cpu, &tmp_mask)
+        for_each_cpu(new_cpu, vec_mask)
             if (per_cpu(vector_irq, new_cpu)[vector] >= 0)
                 goto next;
         /* Found one! */
@@ -547,12 +550,12 @@ next:
                 release_old_vec(desc);
         }
 
-        trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, &tmp_mask);
+        trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, vec_mask);
 
-        for_each_cpu(new_cpu, &tmp_mask)
+        for_each_cpu(new_cpu, vec_mask)
             per_cpu(vector_irq, new_cpu)[vector] = irq;
         desc->arch.vector = vector;
-        cpumask_copy(desc->arch.cpu_mask, &tmp_mask);
+        cpumask_copy(desc->arch.cpu_mask, vec_mask);
 
         desc->arch.used = IRQ_USED;
         ASSERT((desc->arch.used_vectors == NULL)
@@ -783,6 +786,7 @@ unsigned int set_desc_affinity(struct ir
 
     cpumask_copy(desc->affinity, mask);
     cpumask_and(&dest_mask, mask, desc->arch.cpu_mask);
+    cpumask_and(&dest_mask, &dest_mask, &cpu_online_map);
 
     return cpu_mask_to_apicid(&dest_mask);
 }
--- a/xen/include/asm-x86/irq.h
+++ b/xen/include/asm-x86/irq.h
@@ -32,6 +32,12 @@ struct irq_desc;
 struct arch_irq_desc {
         s16 vector;                  /* vector itself is only 8 bits, */
         s16 old_vector;              /* but we use -1 for unassigned  */
+        /*
+         * Except for high priority interrupts @cpu_mask may have bits set for
+         * offline CPUs.  Consumers need to be careful to mask this down to
+         * online ones as necessary.  There is supposed to always be a non-
+         * empty intersection with cpu_online_map.
+         */
         cpumask_var_t cpu_mask;
         cpumask_var_t old_cpu_mask;
         cpumask_var_t pending_mask;





* [PATCH v3 06/15] x86/IRQ: fix locking around vector management
@ 2019-05-17 10:47     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:47 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
fields, and hence ought to be called with the descriptor lock held in
addition to vector_lock. This is currently the case only for
set_desc_affinity() (in the common case) and destroy_irq(), which also
clarifies what the nesting behavior between the two locks has to be.
Reflect the new expectation by having these functions all take a
descriptor as parameter instead of an interrupt number.

Also take care of the two special cases of calls to set_desc_affinity():
set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called
directly as well, and in these cases the descriptor lock hadn't been
acquired until now. For set_ioapic_affinity_irq() this means acquiring /
releasing the IO-APIC lock can then be done with plain spin_{,un}lock().

Drop one of the two leading underscores from all three functions at
the same time.

There's one case left where descriptors get manipulated with just
vector_lock held: setup_vector_irq() expects its caller to acquire
vector_lock, and hence can't itself acquire the descriptor locks (that
would be the wrong lock order). I don't currently see how to address
this.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com> [VT-d]
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
v3: Also drop one leading underscore from a comment. Re-base.
v2: Also adjust set_ioapic_affinity_irq() and VT-d's
    dma_msi_set_affinity().

--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -550,14 +550,14 @@ static void clear_IO_APIC (void)
 static void
 set_ioapic_affinity_irq(struct irq_desc *desc, const cpumask_t *mask)
 {
-    unsigned long flags;
     unsigned int dest;
     int pin, irq;
     struct irq_pin_list *entry;
 
     irq = desc->irq;
 
-    spin_lock_irqsave(&ioapic_lock, flags);
+    spin_lock(&ioapic_lock);
+
     dest = set_desc_affinity(desc, mask);
     if (dest != BAD_APICID) {
         if ( !x2apic_enabled )
@@ -580,8 +580,8 @@ set_ioapic_affinity_irq(struct irq_desc
             entry = irq_2_pin + entry->next;
         }
     }
-    spin_unlock_irqrestore(&ioapic_lock, flags);
 
+    spin_unlock(&ioapic_lock);
 }
 
 /*
@@ -674,16 +674,19 @@ void /*__init*/ setup_ioapic_dest(void)
     for (ioapic = 0; ioapic < nr_ioapics; ioapic++) {
         for (pin = 0; pin < nr_ioapic_entries[ioapic]; pin++) {
             struct irq_desc *desc;
+            unsigned long flags;
 
             irq_entry = find_irq_entry(ioapic, pin, mp_INT);
             if (irq_entry == -1)
                 continue;
             irq = pin_2_irq(irq_entry, ioapic, pin);
             desc = irq_to_desc(irq);
+
+            spin_lock_irqsave(&desc->lock, flags);
             BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map));
             set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
+            spin_unlock_irqrestore(&desc->lock, flags);
         }
-
     }
 }
 
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -27,6 +27,7 @@
 #include <public/physdev.h>
 
 static int parse_irq_vector_map_param(const char *s);
+static void _clear_irq_vector(struct irq_desc *desc);
 
 /* opt_noirqbalance: If true, software IRQ balancing/affinity is disabled. */
 bool __read_mostly opt_noirqbalance;
@@ -136,13 +137,12 @@ static void trace_irq_mask(uint32_t even
     trace_var(event, 1, sizeof(d), &d);
 }
 
-static int __init __bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
+static int __init _bind_irq_vector(struct irq_desc *desc, int vector,
+                                   const cpumask_t *cpu_mask)
 {
     cpumask_t online_mask;
     int cpu;
-    struct irq_desc *desc = irq_to_desc(irq);
 
-    BUG_ON((unsigned)irq >= nr_irqs);
     BUG_ON((unsigned)vector >= NR_VECTORS);
 
     cpumask_and(&online_mask, cpu_mask, &cpu_online_map);
@@ -153,9 +153,9 @@ static int __init __bind_irq_vector(int
         return 0;
     if ( desc->arch.vector != IRQ_VECTOR_UNASSIGNED )
         return -EBUSY;
-    trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, irq, vector, &online_mask);
+    trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, desc->irq, vector, &online_mask);
     for_each_cpu(cpu, &online_mask)
-        per_cpu(vector_irq, cpu)[vector] = irq;
+        per_cpu(vector_irq, cpu)[vector] = desc->irq;
     desc->arch.vector = vector;
     cpumask_copy(desc->arch.cpu_mask, &online_mask);
     if ( desc->arch.used_vectors )
@@ -169,12 +169,18 @@ static int __init __bind_irq_vector(int
 
 int __init bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
 {
+    struct irq_desc *desc = irq_to_desc(irq);
     unsigned long flags;
     int ret;
 
-    spin_lock_irqsave(&vector_lock, flags);
-    ret = __bind_irq_vector(irq, vector, cpu_mask);
-    spin_unlock_irqrestore(&vector_lock, flags);
+    BUG_ON((unsigned)irq >= nr_irqs);
+
+    spin_lock_irqsave(&desc->lock, flags);
+    spin_lock(&vector_lock);
+    ret = _bind_irq_vector(desc, vector, cpu_mask);
+    spin_unlock(&vector_lock);
+    spin_unlock_irqrestore(&desc->lock, flags);
+
     return ret;
 }
 
@@ -259,18 +265,20 @@ void destroy_irq(unsigned int irq)
 
     spin_lock_irqsave(&desc->lock, flags);
     desc->handler = &no_irq_type;
-    clear_irq_vector(irq);
+    spin_lock(&vector_lock);
+    _clear_irq_vector(desc);
+    spin_unlock(&vector_lock);
     desc->arch.used_vectors = NULL;
     spin_unlock_irqrestore(&desc->lock, flags);
 
     xfree(action);
 }
 
-static void __clear_irq_vector(int irq)
+static void _clear_irq_vector(struct irq_desc *desc)
 {
-    int cpu, vector, old_vector;
+    unsigned int cpu;
+    int vector, old_vector, irq = desc->irq;
     cpumask_t tmp_mask;
-    struct irq_desc *desc = irq_to_desc(irq);
 
     BUG_ON(!desc->arch.vector);
 
@@ -316,11 +324,14 @@ static void __clear_irq_vector(int irq)
 
 void clear_irq_vector(int irq)
 {
+    struct irq_desc *desc = irq_to_desc(irq);
     unsigned long flags;
 
-    spin_lock_irqsave(&vector_lock, flags);
-    __clear_irq_vector(irq);
-    spin_unlock_irqrestore(&vector_lock, flags);
+    spin_lock_irqsave(&desc->lock, flags);
+    spin_lock(&vector_lock);
+    _clear_irq_vector(desc);
+    spin_unlock(&vector_lock);
+    spin_unlock_irqrestore(&desc->lock, flags);
 }
 
 int irq_to_vector(int irq)
@@ -455,8 +466,7 @@ static vmask_t *irq_get_used_vector_mask
     return ret;
 }
 
-static int __assign_irq_vector(
-    int irq, struct irq_desc *desc, const cpumask_t *mask)
+static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
 {
     /*
      * NOTE! The local APIC isn't very good at handling
@@ -470,7 +480,8 @@ static int __assign_irq_vector(
      * 0x80, because int 0x80 is hm, kind of importantish. ;)
      */
     static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0;
-    int cpu, err, old_vector;
+    unsigned int cpu;
+    int err, old_vector, irq = desc->irq;
     vmask_t *irq_used_vectors = NULL;
 
     old_vector = irq_to_vector(irq);
@@ -583,8 +594,12 @@ int assign_irq_vector(int irq, const cpu
     
     BUG_ON(irq >= nr_irqs || irq <0);
 
-    spin_lock_irqsave(&vector_lock, flags);
-    ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS);
+    spin_lock_irqsave(&desc->lock, flags);
+
+    spin_lock(&vector_lock);
+    ret = _assign_irq_vector(desc, mask ?: TARGET_CPUS);
+    spin_unlock(&vector_lock);
+
     if ( !ret )
     {
         ret = desc->arch.vector;
@@ -593,7 +608,8 @@ int assign_irq_vector(int irq, const cpu
         else
             cpumask_setall(desc->affinity);
     }
-    spin_unlock_irqrestore(&vector_lock, flags);
+
+    spin_unlock_irqrestore(&desc->lock, flags);
 
     return ret;
 }
@@ -767,7 +783,6 @@ void irq_complete_move(struct irq_desc *
 
 unsigned int set_desc_affinity(struct irq_desc *desc, const cpumask_t *mask)
 {
-    unsigned int irq;
     int ret;
     unsigned long flags;
     cpumask_t dest_mask;
@@ -775,10 +790,8 @@ unsigned int set_desc_affinity(struct ir
     if (!cpumask_intersects(mask, &cpu_online_map))
         return BAD_APICID;
 
-    irq = desc->irq;
-
     spin_lock_irqsave(&vector_lock, flags);
-    ret = __assign_irq_vector(irq, desc, mask);
+    ret = _assign_irq_vector(desc, mask);
     spin_unlock_irqrestore(&vector_lock, flags);
 
     if (ret < 0)
@@ -2442,7 +2455,7 @@ void fixup_irqs(const cpumask_t *mask, b
 
         /*
          * In order for the affinity adjustment below to be successful, we
-         * need __assign_irq_vector() to succeed. This in particular means
+         * need _assign_irq_vector() to succeed. This in particular means
          * clearing desc->arch.move_in_progress if this would otherwise
          * prevent the function from succeeding. Since there's no way for the
          * flag to get cleared anymore when there's no possible destination
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
     unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
                              : NUMA_NO_NODE;
     const cpumask_t *cpumask = &cpu_online_map;
+    struct irq_desc *desc;
 
     if ( node < MAX_NUMNODES && node_online(node) &&
          cpumask_intersects(&node_to_cpumask(node), cpumask) )
         cpumask = &node_to_cpumask(node);
-    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
+
+    desc = irq_to_desc(drhd->iommu->msi.irq);
+    spin_lock_irq(&desc->lock);
+    dma_msi_set_affinity(desc, cpumask);
+    spin_unlock_irq(&desc->lock);
 }
 
 static int adjust_vtd_irq_affinities(void)




 
     xfree(action);
 }
 
-static void __clear_irq_vector(int irq)
+static void _clear_irq_vector(struct irq_desc *desc)
 {
-    int cpu, vector, old_vector;
+    unsigned int cpu;
+    int vector, old_vector, irq = desc->irq;
     cpumask_t tmp_mask;
-    struct irq_desc *desc = irq_to_desc(irq);
 
     BUG_ON(!desc->arch.vector);
 
@@ -316,11 +324,14 @@ static void __clear_irq_vector(int irq)
 
 void clear_irq_vector(int irq)
 {
+    struct irq_desc *desc = irq_to_desc(irq);
     unsigned long flags;
 
-    spin_lock_irqsave(&vector_lock, flags);
-    __clear_irq_vector(irq);
-    spin_unlock_irqrestore(&vector_lock, flags);
+    spin_lock_irqsave(&desc->lock, flags);
+    spin_lock(&vector_lock);
+    _clear_irq_vector(desc);
+    spin_unlock(&vector_lock);
+    spin_unlock_irqrestore(&desc->lock, flags);
 }
 
 int irq_to_vector(int irq)
@@ -455,8 +466,7 @@ static vmask_t *irq_get_used_vector_mask
     return ret;
 }
 
-static int __assign_irq_vector(
-    int irq, struct irq_desc *desc, const cpumask_t *mask)
+static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
 {
     /*
      * NOTE! The local APIC isn't very good at handling
@@ -470,7 +480,8 @@ static int __assign_irq_vector(
      * 0x80, because int 0x80 is hm, kind of importantish. ;)
      */
     static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0;
-    int cpu, err, old_vector;
+    unsigned int cpu;
+    int err, old_vector, irq = desc->irq;
     vmask_t *irq_used_vectors = NULL;
 
     old_vector = irq_to_vector(irq);
@@ -583,8 +594,12 @@ int assign_irq_vector(int irq, const cpu
     
     BUG_ON(irq >= nr_irqs || irq <0);
 
-    spin_lock_irqsave(&vector_lock, flags);
-    ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS);
+    spin_lock_irqsave(&desc->lock, flags);
+
+    spin_lock(&vector_lock);
+    ret = _assign_irq_vector(desc, mask ?: TARGET_CPUS);
+    spin_unlock(&vector_lock);
+
     if ( !ret )
     {
         ret = desc->arch.vector;
@@ -593,7 +608,8 @@ int assign_irq_vector(int irq, const cpu
         else
             cpumask_setall(desc->affinity);
     }
-    spin_unlock_irqrestore(&vector_lock, flags);
+
+    spin_unlock_irqrestore(&desc->lock, flags);
 
     return ret;
 }
@@ -767,7 +783,6 @@ void irq_complete_move(struct irq_desc *
 
 unsigned int set_desc_affinity(struct irq_desc *desc, const cpumask_t *mask)
 {
-    unsigned int irq;
     int ret;
     unsigned long flags;
     cpumask_t dest_mask;
@@ -775,10 +790,8 @@ unsigned int set_desc_affinity(struct ir
     if (!cpumask_intersects(mask, &cpu_online_map))
         return BAD_APICID;
 
-    irq = desc->irq;
-
     spin_lock_irqsave(&vector_lock, flags);
-    ret = __assign_irq_vector(irq, desc, mask);
+    ret = _assign_irq_vector(desc, mask);
     spin_unlock_irqrestore(&vector_lock, flags);
 
     if (ret < 0)
@@ -2442,7 +2455,7 @@ void fixup_irqs(const cpumask_t *mask, b
 
         /*
          * In order for the affinity adjustment below to be successful, we
-         * need __assign_irq_vector() to succeed. This in particular means
+         * need _assign_irq_vector() to succeed. This in particular means
          * clearing desc->arch.move_in_progress if this would otherwise
          * prevent the function from succeeding. Since there's no way for the
          * flag to get cleared anymore when there's no possible destination
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
     unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
                              : NUMA_NO_NODE;
     const cpumask_t *cpumask = &cpu_online_map;
+    struct irq_desc *desc;
 
     if ( node < MAX_NUMNODES && node_online(node) &&
          cpumask_intersects(&node_to_cpumask(node), cpumask) )
         cpumask = &node_to_cpumask(node);
-    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
+
+    desc = irq_to_desc(drhd->iommu->msi.irq);
+    spin_lock_irq(&desc->lock);
+    dma_msi_set_affinity(desc, cpumask);
+    spin_unlock_irq(&desc->lock);
 }
 
 static int adjust_vtd_irq_affinities(void)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* [PATCH v3 07/15] x86/IRQ: target online CPUs when binding guest IRQ
@ 2019-05-17 10:48     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:48 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

fixup_irqs() skips interrupts that have no action. Such interrupts can
therefore retain an affinity consisting only of offline CPUs. With
"noirqbalance" in effect, pirq_guest_bind() would so far have left them
alone, resulting in a non-working interrupt.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: New.
---
I've not observed this problem in practice - the change is just the
result of code inspection after having noticed action-less IRQs in 'i'
debug key output pointing at all parked/offline CPUs.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1683,9 +1683,27 @@ int pirq_guest_bind(struct vcpu *v, stru
 
         desc->status |= IRQ_GUEST;
 
-        /* Attempt to bind the interrupt target to the correct CPU. */
-        if ( !opt_noirqbalance && (desc->handler->set_affinity != NULL) )
-            desc->handler->set_affinity(desc, cpumask_of(v->processor));
+        /*
+         * Attempt to bind the interrupt target to the correct (or at least
+         * some online) CPU.
+         */
+        if ( desc->handler->set_affinity )
+        {
+            const cpumask_t *affinity = NULL;
+
+            if ( !opt_noirqbalance )
+                affinity = cpumask_of(v->processor);
+            else if ( !cpumask_intersects(desc->affinity, &cpu_online_map) )
+            {
+                cpumask_setall(desc->affinity);
+                affinity = &cpumask_all;
+            }
+            else if ( !cpumask_intersects(desc->arch.cpu_mask,
+                                          &cpu_online_map) )
+                affinity = desc->affinity;
+            if ( affinity )
+                desc->handler->set_affinity(desc, affinity);
+        }
 
         desc->status &= ~IRQ_DISABLED;
         desc->handler->startup(desc);

* [PATCH v3 08/15] x86/IRQs: correct/tighten vector check in _clear_irq_vector()
@ 2019-05-17 10:49     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

If any particular value were to be checked against, it would need to be
IRQ_VECTOR_UNASSIGNED.

Reported-by: Roger Pau Monné <roger.pau@citrix.com>

Be more strict though and use valid_irq_vector() instead.

Take the opportunity and also convert local variables to unsigned int.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
v2: New.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -276,14 +276,13 @@ void destroy_irq(unsigned int irq)
 
 static void _clear_irq_vector(struct irq_desc *desc)
 {
-    unsigned int cpu;
-    int vector, old_vector, irq = desc->irq;
+    unsigned int cpu, old_vector, irq = desc->irq;
+    unsigned int vector = desc->arch.vector;
     cpumask_t tmp_mask;
 
-    BUG_ON(!desc->arch.vector);
+    BUG_ON(!valid_irq_vector(vector));
 
     /* Always clear desc->arch.vector */
-    vector = desc->arch.vector;
     cpumask_and(&tmp_mask, desc->arch.cpu_mask, &cpu_online_map);
 
     for_each_cpu(cpu, &tmp_mask) {

* [PATCH v3 09/15] x86/IRQ: make fixup_irqs() skip unconnected internally used interrupts
@ 2019-05-17 10:49     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Since the "Cannot set affinity ..." warning is issued only once, avoid
triggering it already at boot time, when secondary threads are being
parked and the serial console uses a PCI IRQ that is still unconnected
at that point.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2452,8 +2452,20 @@ void fixup_irqs(const cpumask_t *mask, b
         vector = irq_to_vector(irq);
         if ( vector >= FIRST_HIPRIORITY_VECTOR &&
              vector <= LAST_HIPRIORITY_VECTOR )
+        {
             cpumask_and(desc->arch.cpu_mask, desc->arch.cpu_mask, mask);
 
+            /*
+             * This can in particular happen when parking secondary threads
+             * during boot and when the serial console wants to use a PCI IRQ.
+             */
+            if ( desc->handler == &no_irq_type )
+            {
+                spin_unlock(&desc->lock);
+                continue;
+            }
+        }
+
         if ( desc->arch.move_cleanup_count )
         {
             /* The cleanup IPI may have got sent while we were still online. */

* [PATCH v3 10/15] x86/IRQ: drop redundant cpumask_empty() from move_masked_irq()
@ 2019-05-17 10:50     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:50 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

The subsequent cpumask_intersects() covers the "empty" case quite fine.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -650,9 +650,6 @@ void move_masked_irq(struct irq_desc *de
     
     desc->status &= ~IRQ_MOVE_PENDING;
 
-    if (unlikely(cpumask_empty(pending_mask)))
-        return;
-
     if (!desc->handler->set_affinity)
         return;
 

* [PATCH v3 11/15] x86/IRQ: simplify and rename pirq_acktype()
@ 2019-05-17 10:51     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Its only caller already has the IRQ descriptor in its hands, so there's
no need for the function to re-obtain it. As a result the leading 'p' of
its name is no longer appropriate and hence gets dropped.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
v2: New.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1556,17 +1556,8 @@ int pirq_guest_unmask(struct domain *d)
     return 0;
 }
 
-static int pirq_acktype(struct domain *d, int pirq)
+static int irq_acktype(const struct irq_desc *desc)
 {
-    struct irq_desc  *desc;
-    int irq;
-
-    irq = domain_pirq_to_irq(d, pirq);
-    if ( irq <= 0 )
-        return ACKTYPE_NONE;
-
-    desc = irq_to_desc(irq);
-
     if ( desc->handler == &no_irq_type )
         return ACKTYPE_NONE;
 
@@ -1597,7 +1588,8 @@ static int pirq_acktype(struct domain *d
     if ( !strcmp(desc->handler->typename, "XT-PIC") )
         return ACKTYPE_UNMASK;
 
-    printk("Unknown PIC type '%s' for IRQ %d\n", desc->handler->typename, irq);
+    printk("Unknown PIC type '%s' for IRQ%d\n",
+           desc->handler->typename, desc->irq);
     BUG();
 
     return 0;
@@ -1674,7 +1666,7 @@ int pirq_guest_bind(struct vcpu *v, stru
         action->nr_guests   = 0;
         action->in_flight   = 0;
         action->shareable   = will_share;
-        action->ack_type    = pirq_acktype(v->domain, pirq->pirq);
+        action->ack_type    = irq_acktype(desc);
         init_timer(&action->eoi_timer, irq_guest_eoi_timer_fn, desc, 0);
 
         desc->status |= IRQ_GUEST;

* [PATCH v3 12/15] x86/IRQ: add explicit tracing-enabled check to trace_irq_mask()
@ 2019-05-17 10:51     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:51 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Roger Pau Monne

The setup for calling trace_var() (which itself checks tb_init_done) is
non-negligible, and hence a separate outermost check is warranted.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: New.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -121,8 +121,8 @@ static void release_old_vec(struct irq_d
     }
 }
 
-static void trace_irq_mask(uint32_t event, int irq, int vector,
-                           const cpumask_t *mask)
+static void _trace_irq_mask(uint32_t event, int irq, int vector,
+                            const cpumask_t *mask)
 {
     struct {
         unsigned int irq:16, vec:16;
@@ -137,6 +137,13 @@ static void trace_irq_mask(uint32_t even
     trace_var(event, 1, sizeof(d), &d);
 }
 
+static inline void trace_irq_mask(uint32_t event, int irq, int vector,
+                                  const cpumask_t *mask)
+{
+    if ( unlikely(tb_init_done) )
+        _trace_irq_mask(event, irq, vector, mask);
+}
+
 static int __init _bind_irq_vector(struct irq_desc *desc, int vector,
                                    const cpumask_t *cpu_mask)
 {

* [PATCH v3 13/15] x86/IRQ: tighten vector checks
@ 2019-05-17 10:52     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Use valid_irq_vector() rather than "> 0".

Also replace an open-coded use of IRQ_VECTOR_UNASSIGNED.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: New.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -342,7 +342,7 @@ void clear_irq_vector(int irq)
 
 int irq_to_vector(int irq)
 {
-    int vector = -1;
+    int vector = IRQ_VECTOR_UNASSIGNED;
 
     BUG_ON(irq >= nr_irqs || irq < 0);
 
@@ -452,15 +452,18 @@ static vmask_t *irq_get_used_vector_mask
             int vector;
             
             vector = irq_to_vector(irq);
-            if ( vector > 0 )
+            if ( valid_irq_vector(vector) )
             {
-                printk(XENLOG_INFO "IRQ %d already assigned vector %d\n",
+                printk(XENLOG_INFO "IRQ%d already assigned vector %02x\n",
                        irq, vector);
                 
                 ASSERT(!test_bit(vector, ret));
 
                 set_bit(vector, ret);
             }
+            else if ( vector != IRQ_VECTOR_UNASSIGNED )
+                printk(XENLOG_WARNING "IRQ%d mapped to bogus vector %02x\n",
+                       irq, vector);
         }
     }
     else if ( IO_APIC_IRQ(irq) &&
@@ -491,7 +494,7 @@ static int _assign_irq_vector(struct irq
     vmask_t *irq_used_vectors = NULL;
 
     old_vector = irq_to_vector(irq);
-    if ( old_vector > 0 )
+    if ( valid_irq_vector(old_vector) )
     {
         cpumask_t tmp_mask;
 
@@ -555,7 +558,7 @@ next:
         current_vector = vector;
         current_offset = offset;
 
-        if ( old_vector > 0 )
+        if ( valid_irq_vector(old_vector) )
         {
             cpumask_and(desc->arch.old_cpu_mask, desc->arch.cpu_mask,
                         &cpu_online_map);

* [PATCH v3 14/15] x86/IRQ: eliminate some on-stack cpumask_t instances
@ 2019-05-17 10:52     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Use scratch_cpumask where possible, to avoid creating these possibly
large stack objects. We can't use it in _assign_irq_vector() and
set_desc_affinity(), as these get called in IRQ context.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: New.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -285,14 +285,15 @@ static void _clear_irq_vector(struct irq
 {
     unsigned int cpu, old_vector, irq = desc->irq;
     unsigned int vector = desc->arch.vector;
-    cpumask_t tmp_mask;
+    cpumask_t *tmp_mask = this_cpu(scratch_cpumask);
 
     BUG_ON(!valid_irq_vector(vector));
 
     /* Always clear desc->arch.vector */
-    cpumask_and(&tmp_mask, desc->arch.cpu_mask, &cpu_online_map);
+    cpumask_and(tmp_mask, desc->arch.cpu_mask, &cpu_online_map);
 
-    for_each_cpu(cpu, &tmp_mask) {
+    for_each_cpu(cpu, tmp_mask)
+    {
         ASSERT( per_cpu(vector_irq, cpu)[vector] == irq );
         per_cpu(vector_irq, cpu)[vector] = ~irq;
     }
@@ -308,16 +309,17 @@ static void _clear_irq_vector(struct irq
 
     desc->arch.used = IRQ_UNUSED;
 
-    trace_irq_mask(TRC_HW_IRQ_CLEAR_VECTOR, irq, vector, &tmp_mask);
+    trace_irq_mask(TRC_HW_IRQ_CLEAR_VECTOR, irq, vector, tmp_mask);
 
     if ( likely(!desc->arch.move_in_progress) )
         return;
 
     /* If we were in motion, also clear desc->arch.old_vector */
     old_vector = desc->arch.old_vector;
-    cpumask_and(&tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
+    cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
 
-    for_each_cpu(cpu, &tmp_mask) {
+    for_each_cpu(cpu, tmp_mask)
+    {
         ASSERT( per_cpu(vector_irq, cpu)[old_vector] == irq );
         TRACE_3D(TRC_HW_IRQ_MOVE_FINISH, irq, old_vector, cpu);
         per_cpu(vector_irq, cpu)[old_vector] = ~irq;
@@ -1159,7 +1161,6 @@ static void irq_guest_eoi_timer_fn(void
     struct irq_desc *desc = data;
     unsigned int i, irq = desc - irq_desc;
     irq_guest_action_t *action;
-    cpumask_t cpu_eoi_map;
 
     spin_lock_irq(&desc->lock);
     
@@ -1189,14 +1190,18 @@ static void irq_guest_eoi_timer_fn(void
 
     switch ( action->ack_type )
     {
+        cpumask_t *cpu_eoi_map;
+
     case ACKTYPE_UNMASK:
         if ( desc->handler->end )
             desc->handler->end(desc, 0);
         break;
+
     case ACKTYPE_EOI:
-        cpumask_copy(&cpu_eoi_map, action->cpu_eoi_map);
+        cpu_eoi_map = this_cpu(scratch_cpumask);
+        cpumask_copy(cpu_eoi_map, action->cpu_eoi_map);
         spin_unlock_irq(&desc->lock);
-        on_selected_cpus(&cpu_eoi_map, set_eoi_ready, desc, 0);
+        on_selected_cpus(cpu_eoi_map, set_eoi_ready, desc, 0);
         return;
     }
 
@@ -2437,7 +2442,7 @@ void fixup_irqs(const cpumask_t *mask, b
     {
         bool break_affinity = false, set_affinity = true;
         unsigned int vector;
-        cpumask_t affinity;
+        cpumask_t *affinity = this_cpu(scratch_cpumask);
 
         if ( irq == 2 )
             continue;
@@ -2468,9 +2473,9 @@ void fixup_irqs(const cpumask_t *mask, b
         if ( desc->arch.move_cleanup_count )
         {
             /* The cleanup IPI may have got sent while we were still online. */
-            cpumask_andnot(&affinity, desc->arch.old_cpu_mask,
+            cpumask_andnot(affinity, desc->arch.old_cpu_mask,
                            &cpu_online_map);
-            desc->arch.move_cleanup_count -= cpumask_weight(&affinity);
+            desc->arch.move_cleanup_count -= cpumask_weight(affinity);
             if ( !desc->arch.move_cleanup_count )
                 release_old_vec(desc);
         }
@@ -2497,10 +2502,10 @@ void fixup_irqs(const cpumask_t *mask, b
         {
             unsigned int cpu;
 
-            cpumask_and(&affinity, desc->arch.old_cpu_mask, &cpu_online_map);
+            cpumask_and(affinity, desc->arch.old_cpu_mask, &cpu_online_map);
 
             spin_lock(&vector_lock);
-            for_each_cpu(cpu, &affinity)
+            for_each_cpu(cpu, affinity)
                 per_cpu(vector_irq, cpu)[desc->arch.old_vector] = ~irq;
             spin_unlock(&vector_lock);
 
@@ -2511,23 +2516,23 @@ void fixup_irqs(const cpumask_t *mask, b
         if ( !cpumask_intersects(mask, desc->affinity) )
         {
             break_affinity = true;
-            cpumask_setall(&affinity);
+            cpumask_setall(affinity);
         }
         else
-            cpumask_copy(&affinity, desc->affinity);
+            cpumask_copy(affinity, desc->affinity);
 
         if ( desc->handler->disable )
             desc->handler->disable(desc);
 
         if ( desc->handler->set_affinity )
-            desc->handler->set_affinity(desc, &affinity);
+            desc->handler->set_affinity(desc, affinity);
         else if ( !(warned++) )
             set_affinity = false;
 
         if ( desc->handler->enable )
             desc->handler->enable(desc);
 
-        cpumask_copy(&affinity, desc->affinity);
+        cpumask_copy(affinity, desc->affinity);
 
         spin_unlock(&desc->lock);
 
@@ -2538,7 +2543,7 @@ void fixup_irqs(const cpumask_t *mask, b
             printk("Cannot set affinity for IRQ%u\n", irq);
         else if ( break_affinity )
             printk("Broke affinity for IRQ%u, new: %*pb\n",
-                   irq, nr_cpu_ids, &affinity);
+                   irq, nr_cpu_ids, affinity);
     }
 
     /* That doesn't seem sufficient.  Give it 1ms. */




* [PATCH v3 15/15] x86/IRQ: move {,_}clear_irq_vector()
@ 2019-05-17 10:53     ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-17 10:53 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

This is largely to drop a forward declaration. There's one functional
change - clear_irq_vector() gets marked __init, as its only caller is
check_timer(). Beyond this only a few stray blanks get removed.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: New.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -27,7 +27,6 @@
 #include <public/physdev.h>
 
 static int parse_irq_vector_map_param(const char *s);
-static void _clear_irq_vector(struct irq_desc *desc);
 
 /* opt_noirqbalance: If true, software IRQ balancing/affinity is disabled. */
 bool __read_mostly opt_noirqbalance;
@@ -191,6 +190,67 @@ int __init bind_irq_vector(int irq, int
     return ret;
 }
 
+static void _clear_irq_vector(struct irq_desc *desc)
+{
+    unsigned int cpu, old_vector, irq = desc->irq;
+    unsigned int vector = desc->arch.vector;
+    cpumask_t *tmp_mask = this_cpu(scratch_cpumask);
+
+    BUG_ON(!valid_irq_vector(vector));
+
+    /* Always clear desc->arch.vector */
+    cpumask_and(tmp_mask, desc->arch.cpu_mask, &cpu_online_map);
+
+    for_each_cpu(cpu, tmp_mask)
+    {
+        ASSERT(per_cpu(vector_irq, cpu)[vector] == irq);
+        per_cpu(vector_irq, cpu)[vector] = ~irq;
+    }
+
+    desc->arch.vector = IRQ_VECTOR_UNASSIGNED;
+    cpumask_clear(desc->arch.cpu_mask);
+
+    if ( desc->arch.used_vectors )
+    {
+        ASSERT(test_bit(vector, desc->arch.used_vectors));
+        clear_bit(vector, desc->arch.used_vectors);
+    }
+
+    desc->arch.used = IRQ_UNUSED;
+
+    trace_irq_mask(TRC_HW_IRQ_CLEAR_VECTOR, irq, vector, tmp_mask);
+
+    if ( likely(!desc->arch.move_in_progress) )
+        return;
+
+    /* If we were in motion, also clear desc->arch.old_vector */
+    old_vector = desc->arch.old_vector;
+    cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
+
+    for_each_cpu(cpu, tmp_mask)
+    {
+        ASSERT(per_cpu(vector_irq, cpu)[old_vector] == irq);
+        TRACE_3D(TRC_HW_IRQ_MOVE_FINISH, irq, old_vector, cpu);
+        per_cpu(vector_irq, cpu)[old_vector] = ~irq;
+    }
+
+    release_old_vec(desc);
+
+    desc->arch.move_in_progress = 0;
+}
+
+void __init clear_irq_vector(int irq)
+{
+    struct irq_desc *desc = irq_to_desc(irq);
+    unsigned long flags;
+
+    spin_lock_irqsave(&desc->lock, flags);
+    spin_lock(&vector_lock);
+    _clear_irq_vector(desc);
+    spin_unlock(&vector_lock);
+    spin_unlock_irqrestore(&desc->lock, flags);
+}
+
 /*
  * Dynamic irq allocate and deallocation for MSI
  */
@@ -281,67 +341,6 @@ void destroy_irq(unsigned int irq)
     xfree(action);
 }
 
-static void _clear_irq_vector(struct irq_desc *desc)
-{
-    unsigned int cpu, old_vector, irq = desc->irq;
-    unsigned int vector = desc->arch.vector;
-    cpumask_t *tmp_mask = this_cpu(scratch_cpumask);
-
-    BUG_ON(!valid_irq_vector(vector));
-
-    /* Always clear desc->arch.vector */
-    cpumask_and(tmp_mask, desc->arch.cpu_mask, &cpu_online_map);
-
-    for_each_cpu(cpu, tmp_mask)
-    {
-        ASSERT( per_cpu(vector_irq, cpu)[vector] == irq );
-        per_cpu(vector_irq, cpu)[vector] = ~irq;
-    }
-
-    desc->arch.vector = IRQ_VECTOR_UNASSIGNED;
-    cpumask_clear(desc->arch.cpu_mask);
-
-    if ( desc->arch.used_vectors )
-    {
-        ASSERT(test_bit(vector, desc->arch.used_vectors));
-        clear_bit(vector, desc->arch.used_vectors);
-    }
-
-    desc->arch.used = IRQ_UNUSED;
-
-    trace_irq_mask(TRC_HW_IRQ_CLEAR_VECTOR, irq, vector, tmp_mask);
-
-    if ( likely(!desc->arch.move_in_progress) )
-        return;
-
-    /* If we were in motion, also clear desc->arch.old_vector */
-    old_vector = desc->arch.old_vector;
-    cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
-
-    for_each_cpu(cpu, tmp_mask)
-    {
-        ASSERT( per_cpu(vector_irq, cpu)[old_vector] == irq );
-        TRACE_3D(TRC_HW_IRQ_MOVE_FINISH, irq, old_vector, cpu);
-        per_cpu(vector_irq, cpu)[old_vector] = ~irq;
-    }
-
-    release_old_vec(desc);
-
-    desc->arch.move_in_progress = 0;
-}
-
-void clear_irq_vector(int irq)
-{
-    struct irq_desc *desc = irq_to_desc(irq);
-    unsigned long flags;
-
-    spin_lock_irqsave(&desc->lock, flags);
-    spin_lock(&vector_lock);
-    _clear_irq_vector(desc);
-    spin_unlock(&vector_lock);
-    spin_unlock_irqrestore(&desc->lock, flags);
-}
-
 int irq_to_vector(int irq)
 {
     int vector = IRQ_VECTOR_UNASSIGNED;





* Re: [PATCH v3 07/15] x86/IRQ: target online CPUs when binding guest IRQ
@ 2019-05-20 11:40       ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-20 11:40 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Fri, May 17, 2019 at 04:48:21AM -0600, Jan Beulich wrote:
> fixup_irqs() skips interrupts without action. Hence such interrupts can
> retain affinity to just offline CPUs. With "noirqbalance" in effect,
> pirq_guest_bind() so far would have left them alone, resulting in a non-
> working interrupt.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v3: New.
> ---
> I've not observed this problem in practice - the change is just the
> result of code inspection after having noticed action-less IRQs in 'i'
> debug key output pointing at all parked/offline CPUs.
> 
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1683,9 +1683,27 @@ int pirq_guest_bind(struct vcpu *v, stru
>  
>          desc->status |= IRQ_GUEST;
>  
> -        /* Attempt to bind the interrupt target to the correct CPU. */
> -        if ( !opt_noirqbalance && (desc->handler->set_affinity != NULL) )
> -            desc->handler->set_affinity(desc, cpumask_of(v->processor));
> +        /*
> +         * Attempt to bind the interrupt target to the correct (or at least
> +         * some online) CPU.
> +         */
> +        if ( desc->handler->set_affinity )
> +        {
> +            const cpumask_t *affinity = NULL;
> +
> +            if ( !opt_noirqbalance )
> +                affinity = cpumask_of(v->processor);
> +            else if ( !cpumask_intersects(desc->affinity, &cpu_online_map) )
> +            {
> +                cpumask_setall(desc->affinity);
> +                affinity = &cpumask_all;
> +            }
> +            else if ( !cpumask_intersects(desc->arch.cpu_mask,
> +                                          &cpu_online_map) )

I'm not sure I see the purpose of the desc->arch.cpu_mask check;
wouldn't it be better to just use else and set the affinity to
desc->affinity?

Or is it just an optimization to avoid doing the set_affinity call if
the interrupt is already bound to an online CPU?

Thanks, Roger.


* Re: [PATCH v3 12/15] x86/IRQ: add explicit tracing-enabled check to trace_irq_mask()
@ 2019-05-20 11:46       ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-20 11:46 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, xen-devel, Wei Liu, Andrew Cooper

On Fri, May 17, 2019 at 04:51:50AM -0600, Jan Beulich wrote:
> The setup for calling trace_var() (which itself checks tb_init_done) is
> non-negligible, and hence a separate outer-most check is warranted.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

I think a macro or helper would be helpful, i.e. trace_enabled or some
such. Checking tb_init_done directly is not as obvious as it could be IMO.

Roger.


* Re: [PATCH v3 13/15] x86/IRQ: tighten vector checks
@ 2019-05-20 14:04       ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-20 14:04 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Fri, May 17, 2019 at 04:52:32AM -0600, Jan Beulich wrote:
> Use valid_irq_vector() rather than "> 0".
> 
> Also replace an open-coded use of IRQ_VECTOR_UNASSIGNED.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

The question I have below is not directly related to the usage of
valid_irq_vector, but rather with the existing code.

> ---
> v3: New.
> 
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -342,7 +342,7 @@ void clear_irq_vector(int irq)
>  
>  int irq_to_vector(int irq)
>  {
> -    int vector = -1;
> +    int vector = IRQ_VECTOR_UNASSIGNED;
>  
>      BUG_ON(irq >= nr_irqs || irq < 0);
>  
> @@ -452,15 +452,18 @@ static vmask_t *irq_get_used_vector_mask
>              int vector;
>              
>              vector = irq_to_vector(irq);
> -            if ( vector > 0 )
> +            if ( valid_irq_vector(vector) )
>              {
> -                printk(XENLOG_INFO "IRQ %d already assigned vector %d\n",
> +                printk(XENLOG_INFO "IRQ%d already assigned vector %02x\n",
>                         irq, vector);
>                  
>                  ASSERT(!test_bit(vector, ret));
>  
>                  set_bit(vector, ret);
>              }
> +            else if ( vector != IRQ_VECTOR_UNASSIGNED )
> +                printk(XENLOG_WARNING "IRQ%d mapped to bogus vector %02x\n",
> +                       irq, vector);

Maybe add an assert_unreachable here? It seems really bogus to call
irq_get_used_vector_mask with an unassigned vector.

But I'm not sure I fully understand this piece of code, nor why an
IRQ without a valid vector assigned can end up mapped to a bogus
vector. Is this covering up for the lack of cleanup elsewhere?

Thanks, Roger.


* Re: [PATCH v3 14/15] x86/IRQ: eliminate some on-stack cpumask_t instances
@ 2019-05-20 14:22       ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-20 14:22 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Fri, May 17, 2019 at 04:52:54AM -0600, Jan Beulich wrote:
> Use scratch_cpumask where possible, to avoid creating these possibly
> large stack objects. We can't use it in _assign_irq_vector() and
> set_desc_affinity(), as these get called in IRQ context.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.


* Re: [PATCH v3 07/15] x86/IRQ: target online CPUs when binding guest IRQ
@ 2019-05-20 15:17         ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-20 15:17 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 20.05.19 at 13:40, <roger.pau@citrix.com> wrote:
> On Fri, May 17, 2019 at 04:48:21AM -0600, Jan Beulich wrote:
>> fixup_irqs() skips interrupts without action. Hence such interrupts can
>> retain affinity to just offline CPUs. With "noirqbalance" in effect,
>> pirq_guest_bind() so far would have left them alone, resulting in a non-
>> working interrupt.
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> v3: New.
>> ---
>> I've not observed this problem in practice - the change is just the
>> result of code inspection after having noticed action-less IRQs in 'i'
>> debug key output pointing at all parked/offline CPUs.
>> 
>> --- a/xen/arch/x86/irq.c
>> +++ b/xen/arch/x86/irq.c
>> @@ -1683,9 +1683,27 @@ int pirq_guest_bind(struct vcpu *v, stru
>>  
>>          desc->status |= IRQ_GUEST;
>>  
>> -        /* Attempt to bind the interrupt target to the correct CPU. */
>> -        if ( !opt_noirqbalance && (desc->handler->set_affinity != NULL) )
>> -            desc->handler->set_affinity(desc, cpumask_of(v->processor));
>> +        /*
>> +         * Attempt to bind the interrupt target to the correct (or at least
>> +         * some online) CPU.
>> +         */
>> +        if ( desc->handler->set_affinity )
>> +        {
>> +            const cpumask_t *affinity = NULL;
>> +
>> +            if ( !opt_noirqbalance )
>> +                affinity = cpumask_of(v->processor);
>> +            else if ( !cpumask_intersects(desc->affinity, &cpu_online_map) )
>> +            {
>> +                cpumask_setall(desc->affinity);
>> +                affinity = &cpumask_all;
>> +            }
>> +            else if ( !cpumask_intersects(desc->arch.cpu_mask,
>> +                                          &cpu_online_map) )
> 
> I'm not sure I see the purpose of the desc->arch.cpu_mask check,
> wouldn't it be better to just use else and set the affinity to
> desc->affinity?

We should avoid clobbering desc->affinity whenever possible: It
reflects (see the respective patch in this series) what was
requested by whatever "outside" party.

> Or is it just an optimization to avoid doing the set_affinity call if
> the interrupt is already bound to an online CPU?

This is a second aspect here indeed - why play with the IRQ if
it has a valid destination?

Jan




* Re: [PATCH v3 13/15] x86/IRQ: tighten vector checks
@ 2019-05-20 15:26         ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-20 15:26 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 20.05.19 at 16:04, <roger.pau@citrix.com> wrote:
> On Fri, May 17, 2019 at 04:52:32AM -0600, Jan Beulich wrote:
>> Use valid_irq_vector() rather than "> 0".
>> 
>> Also replace an open-coded use of IRQ_VECTOR_UNASSIGNED.
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

> The question I have below is not directly related to the usage of
> valid_irq_vector, but rather to the existing code.
> 
>> @@ -452,15 +452,18 @@ static vmask_t *irq_get_used_vector_mask
>>              int vector;
>>              
>>              vector = irq_to_vector(irq);
>> -            if ( vector > 0 )
>> +            if ( valid_irq_vector(vector) )
>>              {
>> -                printk(XENLOG_INFO "IRQ %d already assigned vector %d\n",
>> +                printk(XENLOG_INFO "IRQ%d already assigned vector %02x\n",
>>                         irq, vector);
>>                  
>>                  ASSERT(!test_bit(vector, ret));
>>  
>>                  set_bit(vector, ret);
>>              }
>> +            else if ( vector != IRQ_VECTOR_UNASSIGNED )
>> +                printk(XENLOG_WARNING "IRQ%d mapped to bogus vector %02x\n",
>> +                       irq, vector);
> 
> Maybe add an assert_unreachable here? It seems really bogus to call
> irq_get_used_vector_mask with an unassigned vector.

How that? This would e.g. get called the very first time a vector
is to be assigned. But I'm afraid I'm a little confused anyway by
the wording you use - after all this is the code path dealing with
an IRQ _not_ being marked as having no vector assigned, but
also not having a valid vector.

> But I'm not sure I fully understand this piece of code, nor why an
> IRQ without a valid vector assigned can still report one. Is this
> covering up for a lack of cleanup elsewhere?

I don't think so, no. However, users of irq_to_vector() need to
be careful: The function can legitimately return 0 (besides
IRQ_VECTOR_UNASSIGNED) as an error indication. I've tried to
do away with this, but quickly realized I'd better not do so. I've
not seen the printk() trigger, but I'd rather see the printk() log
a message telling us that we also need to exclude vector 0 than
have a wrong assertion fire.

Jan



* Re: [PATCH v3 07/15] x86/IRQ: target online CPUs when binding guest IRQ
@ 2019-05-22  9:41           ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-22  9:41 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Mon, May 20, 2019 at 09:17:19AM -0600, Jan Beulich wrote:
> >>> On 20.05.19 at 13:40, <roger.pau@citrix.com> wrote:
> > On Fri, May 17, 2019 at 04:48:21AM -0600, Jan Beulich wrote:
> >> fixup_irqs() skips interrupts without action. Hence such interrupts can
> >> retain affinity to just offline CPUs. With "noirqbalance" in effect,
> >> pirq_guest_bind() so far would have left them alone, resulting in a non-
> >> working interrupt.
> >> 
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >> ---
> >> v3: New.
> >> ---
> >> I've not observed this problem in practice - the change is just the
> >> result of code inspection after having noticed action-less IRQs in 'i'
> >> debug key output pointing at all parked/offline CPUs.
> >> 
> >> --- a/xen/arch/x86/irq.c
> >> +++ b/xen/arch/x86/irq.c
> >> @@ -1683,9 +1683,27 @@ int pirq_guest_bind(struct vcpu *v, stru
> >>  
> >>          desc->status |= IRQ_GUEST;
> >>  
> >> -        /* Attempt to bind the interrupt target to the correct CPU. */
> >> -        if ( !opt_noirqbalance && (desc->handler->set_affinity != NULL) )
> >> -            desc->handler->set_affinity(desc, cpumask_of(v->processor));
> >> +        /*
> >> +         * Attempt to bind the interrupt target to the correct (or at least
> >> +         * some online) CPU.
> >> +         */
> >> +        if ( desc->handler->set_affinity )
> >> +        {
> >> +            const cpumask_t *affinity = NULL;
> >> +
> >> +            if ( !opt_noirqbalance )
> >> +                affinity = cpumask_of(v->processor);
> >> +            else if ( !cpumask_intersects(desc->affinity, &cpu_online_map) )
> >> +            {
> >> +                cpumask_setall(desc->affinity);
> >> +                affinity = &cpumask_all;
> >> +            }
> >> +            else if ( !cpumask_intersects(desc->arch.cpu_mask,
> >> +                                          &cpu_online_map) )
> > 
> > I'm not sure I see the purpose of the desc->arch.cpu_mask check,
> > wouldn't it be better to just use else and set the affinity to
> > desc->affinity?
> 
> We should avoid clobbering desc->affinity whenever possible: It
> reflects (see the respective patch in this series) what was
> requested by whatever "outside" party.
> 
> > Or is it just an optimization to avoid doing the set_affinity call if
> > the interrupt is already bound to an online CPU?
> 
> This is a second aspect here indeed - why play with the IRQ if
> it has a valid destination?

Thanks for the clarification, that LGTM:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Roger.


* Re: [PATCH v3 13/15] x86/IRQ: tighten vector checks
@ 2019-05-22 16:42           ` Roger Pau Monné
  0 siblings, 0 replies; 196+ messages in thread
From: Roger Pau Monné @ 2019-05-22 16:42 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Mon, May 20, 2019 at 09:26:37AM -0600, Jan Beulich wrote:
> >>> On 20.05.19 at 16:04, <roger.pau@citrix.com> wrote:
> > On Fri, May 17, 2019 at 04:52:32AM -0600, Jan Beulich wrote:
> >> Use valid_irq_vector() rather than "> 0".
> >> 
> >> Also replace an open-coded use of IRQ_VECTOR_UNASSIGNED.
> >> 
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > 
> > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Thanks.
> 
> > The question I have below is not directly related to the usage of
> > valid_irq_vector, but rather to the existing code.
> > 
> >> @@ -452,15 +452,18 @@ static vmask_t *irq_get_used_vector_mask
> >>              int vector;
> >>              
> >>              vector = irq_to_vector(irq);
> >> -            if ( vector > 0 )
> >> +            if ( valid_irq_vector(vector) )
> >>              {
> >> -                printk(XENLOG_INFO "IRQ %d already assigned vector %d\n",
> >> +                printk(XENLOG_INFO "IRQ%d already assigned vector %02x\n",
> >>                         irq, vector);
> >>                  
> >>                  ASSERT(!test_bit(vector, ret));
> >>  
> >>                  set_bit(vector, ret);
> >>              }
> >> +            else if ( vector != IRQ_VECTOR_UNASSIGNED )
> >> +                printk(XENLOG_WARNING "IRQ%d mapped to bogus vector %02x\n",
> >> +                       irq, vector);
> > 
> > Maybe add an assert_unreachable here? It seems really bogus to call
> > irq_get_used_vector_mask with an unassigned vector.
> 
> How that? This would e.g. get called the very first time a vector
> is to be assigned. But I'm afraid I'm a little confused anyway by
> the wording you use - after all this is the code path dealing with
> an IRQ _not_ being marked as having no vector assigned, but
> also not having a valid vector.

Thanks for the clarification; from the name of the function I assumed it
must be called with an IRQ that has a vector assigned. If that's not
the case then I think it's fine.

Roger.


* Re: [PATCH v3 13/15] x86/IRQ: tighten vector checks
@ 2019-05-23  8:36             ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-05-23  8:36 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 22.05.19 at 18:42, <roger.pau@citrix.com> wrote:
> On Mon, May 20, 2019 at 09:26:37AM -0600, Jan Beulich wrote:
>> >>> On 20.05.19 at 16:04, <roger.pau@citrix.com> wrote:
>> > On Fri, May 17, 2019 at 04:52:32AM -0600, Jan Beulich wrote:
>> >> @@ -452,15 +452,18 @@ static vmask_t *irq_get_used_vector_mask
>> >>              int vector;
>> >>              
>> >>              vector = irq_to_vector(irq);
>> >> -            if ( vector > 0 )
>> >> +            if ( valid_irq_vector(vector) )
>> >>              {
>> >> -                printk(XENLOG_INFO "IRQ %d already assigned vector %d\n",
>> >> +                printk(XENLOG_INFO "IRQ%d already assigned vector %02x\n",
>> >>                         irq, vector);
>> >>                  
>> >>                  ASSERT(!test_bit(vector, ret));
>> >>  
>> >>                  set_bit(vector, ret);
>> >>              }
>> >> +            else if ( vector != IRQ_VECTOR_UNASSIGNED )
>> >> +                printk(XENLOG_WARNING "IRQ%d mapped to bogus vector %02x\n",
>> >> +                       irq, vector);
>> > 
>> > Maybe add an assert_unreachable here? It seems really bogus to call
>> > irq_get_used_vector_mask with an unassigned vector.
>> 
>> How that? This would e.g. get called the very first time a vector
>> is to be assigned. But I'm afraid I'm a little confused anyway by
>> the wording you use - after all this is the code path dealing with
>> an IRQ _not_ being marked as having no vector assigned, but
>> also not having a valid vector.
> 
> Thanks for the clarification; from the name of the function I assumed it
> must be called with an IRQ that has a vector assigned. If that's not
> the case then I think it's fine.
> 
> Roger.

Well, the name means "get the object where used vectors are to
be tracked for this IRQ", which carries no implication as to whether
a vector was already assigned.

Jan




* Re: [Xen-devel] [PATCH v3 01/15] x86/IRQ: deal with move-in-progress state in fixup_irqs()
  2019-05-17 10:44     ` [Xen-devel] " Jan Beulich
  (?)
@ 2019-07-03 15:39     ` Andrew Cooper
  2019-07-04  9:32       ` Jan Beulich
  -1 siblings, 1 reply; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 15:39 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:44, Jan Beulich wrote:
> The flag being set may prevent affinity changes, as these often imply
> assignment of a new vector. When there's no possible destination left
> for the IRQ, the clearing of the flag needs to happen right from
> fixup_irqs().
>
> Additionally _assign_irq_vector() needs to avoid setting the flag when
> there's no online CPU left in what gets put into ->arch.old_cpu_mask.
> The old vector can be released right away in this case.

This suggests that it is a bugfix, but it isn't clear what happens when
things go wrong.

> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -2418,15 +2462,18 @@ void fixup_irqs(const cpumask_t *mask, b
>          if ( desc->handler->enable )
>              desc->handler->enable(desc);
>  
> +        cpumask_copy(&affinity, desc->affinity);
> +
>          spin_unlock(&desc->lock);
>  
>          if ( !verbose )
>              continue;
>  
> -        if ( break_affinity && set_affinity )
> -            printk("Broke affinity for irq %i\n", irq);
> -        else if ( !set_affinity )
> -            printk("Cannot set affinity for irq %i\n", irq);
> +        if ( !set_affinity )
> +            printk("Cannot set affinity for IRQ%u\n", irq);
> +        else if ( break_affinity )
> +            printk("Broke affinity for IRQ%u, new: %*pb\n",
> +                   irq, nr_cpu_ids, &affinity);

While I certainly prefer this version, I should point out that you
refused to accept my patches like this, and for consistency with the
rest of the codebase, you should be using cpumask_bits().

~Andrew


* Re: [Xen-devel] [PATCH v3 02/15] x86/IRQ: deal with move cleanup count state in fixup_irqs()
  2019-05-17 10:45     ` [Xen-devel] " Jan Beulich
  (?)
@ 2019-07-03 16:32     ` Andrew Cooper
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 16:32 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:45, Jan Beulich wrote:
> The cleanup IPI may get sent immediately before a CPU gets removed from
> the online map. In such a case the IPI would get handled on the CPU
> being offlined no earlier than in the interrupts disabled window after
> fixup_irqs()' main loop. This is too late, however, because a possible
> affinity change may incur the need for vector assignment, which will
> fail when the IRQ's move cleanup count is still non-zero.
>
> To fix this
> - record the set of CPUs the cleanup IPIs gets actually sent to alongside
>   setting their count,
> - adjust the count in fixup_irqs(), accounting for all CPUs that the
>   cleanup IPI was sent to, but that are no longer online,
> - bail early from the cleanup IPI handler when the CPU is no longer
>   online, to prevent double accounting.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

* Re: [Xen-devel] [PATCH v3 03/15] x86/IRQ: improve dump_irqs()
  2019-05-17 10:46     ` [Xen-devel] " Jan Beulich
@ 2019-07-03 16:39     ` Andrew Cooper
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 16:39 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:46, Jan Beulich wrote:
> Don't log a stray trailing comma. Shorten a few fields.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

* Re: [Xen-devel] [PATCH v3 04/15] x86/IRQ: desc->affinity should strictly represent the requested value
  2019-05-17 10:46     ` [Xen-devel] " Jan Beulich
@ 2019-07-03 17:58     ` Andrew Cooper
  2019-07-04  9:37       ` Jan Beulich
  -1 siblings, 1 reply; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 17:58 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne


On 17/05/2019 11:46, Jan Beulich wrote:
> @@ -2334,9 +2339,10 @@ static void dump_irqs(unsigned char key)
>  
>          spin_lock_irqsave(&desc->lock, flags);
>  
> -        printk("   IRQ:%4d aff:%*pb vec:%02x %-15s status=%03x ",
> -               irq, nr_cpu_ids, cpumask_bits(desc->affinity), desc->arch.vector,
> -               desc->handler->typename, desc->status);
> +        printk("   IRQ:%4d aff:%*pb/%*pb vec:%02x %-15s status=%03x ",
> +               irq, nr_cpu_ids, cpumask_bits(desc->affinity),
> +               nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
> +               desc->arch.vector, desc->handler->typename, desc->status);

Taking a sample large system (Rome, with your x2apic series to be
specific), which is only half as large as typical high-end Skylake systems.

(XEN) IRQ information:
(XEN)    IRQ:   0 affinity:00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001 vec:f0 type=IO-APIC-edge    status=00000000 time.c#timer_interrupt()
(XEN)    IRQ:   1 affinity:00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001 vec:68 type=IO-APIC-edge    status=00000002 mapped, unbound
(XEN)    IRQ:   3 affinity:ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff vec:70 type=IO-APIC-edge    status=00000002 mapped, unbound
(XEN)    IRQ:   4 affinity:ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff vec:f1 type=IO-APIC-edge    status=00000000 ns16550.c#ns16550_interrupt()
(XEN)    IRQ:   5 affinity:00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001 vec:78 type=IO-APIC-edge    status=00000002 mapped, unbound
(XEN)    IRQ:   6 affinity:00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001 vec:88 type=IO-APIC-edge    status=00000002 mapped, unbound
(XEN)    IRQ:   7 affinity:ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff vec:90 type=IO-APIC-level   status=00000002 mapped, unbound
(XEN)    IRQ:   8 affinity:00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001 vec:98 type=IO-APIC-edge    status=00000030 in-flight=0 domain-list=0:  8(---),
(XEN)    IRQ:   9 affinity:00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001 vec:a0 type=IO-APIC-level   status=00000030 in-flight=1 domain-list=0:  9(PMM),

This change is going to double up the affinity block, which will make
the lines even longer.

Given that all examples I've ever spotted are either a single bit, or a
fully set block, %*pbl will render in a much shorter form, and keep the
line length reasonable.  (This in practice applies to the previous patch
as well).

~Andrew

* Re: [Xen-devel] [PATCH v3 05/15] x86/IRQ: consolidate use of ->arch.cpu_mask
  2019-05-17 10:47     ` [Xen-devel] " Jan Beulich
@ 2019-07-03 18:07     ` Andrew Cooper
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 18:07 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:47, Jan Beulich wrote:
> Mixed meaning was implied so far by different pieces of code -
> disagreement was in particular about whether to expect offline CPUs'
> bits to possibly be set. Switch to a mostly consistent meaning
> (exception being high priority interrupts, which would perhaps better
> be switched to the same model as well in due course). Use the field to
> record the vector allocation mask, i.e. potentially including bits of
> offline (parked) CPUs. This implies that before passing the mask to
> certain functions (most notably cpu_mask_to_apicid()) it needs to be
> further reduced to the online subset.
>
> The exception of high priority interrupts is also why for the moment
> _bind_irq_vector() is left as is, despite looking wrong: It's used
> exclusively for IRQ0, which isn't supposed to move off CPU0 at any time.
>
> The prior lack of restricting to online CPUs in set_desc_affinity()
> before calling cpu_mask_to_apicid() in particular allowed (in x2APIC
> clustered mode) offlined CPUs to end up enabled in an IRQ's destination
> field. (I wonder whether vector_allocation_cpumask_flat() shouldn't
> follow a similar model, using cpu_present_map in favor of
> cpu_online_map.)
>
> For IO-APIC code it was definitely wrong to potentially store, as a
> fallback, TARGET_CPUS (i.e. all online ones) into the field, as that
> would have caused problems when determining on which CPUs to release
> vectors when they've gone out of use. Disable interrupts instead when
> no valid target CPU can be established (which code elsewhere should
> guarantee to never happen), and log a message in such an unlikely event.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

* Re: [Xen-devel] [PATCH v3 06/15] x86/IRQ: fix locking around vector management
  2019-05-17 10:47     ` [Xen-devel] " Jan Beulich
@ 2019-07-03 18:23     ` Andrew Cooper
  2019-07-04  9:54       ` Jan Beulich
  -1 siblings, 1 reply; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 18:23 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:47, Jan Beulich wrote:
> All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
> fields, and hence ought to be called with the descriptor lock held in
> addition to vector_lock. This is currently the case for only
> set_desc_affinity() (in the common case) and destroy_irq(), which also
> clarifies what the nesting behavior between the locks has to be.
> Reflect the new expectation by having these functions all take a
> descriptor as parameter instead of an interrupt number.
>
> Also take care of the two special cases of calls to set_desc_affinity():
> set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called
> directly as well, and in these cases the descriptor locks hadn't got
> acquired till now. For set_ioapic_affinity_irq() this means acquiring /
> releasing of the IO-APIC lock can be plain spin_{,un}lock() then.
>
> Drop one of the two leading underscores from all three functions at
> the same time.
>
> There's one case left where descriptors get manipulated with just
> vector_lock held: setup_vector_irq() assumes its caller to acquire
> vector_lock, and hence can't itself acquire the descriptor locks (wrong
> lock order). I don't currently see how to address this.

In practice, the only mutation is setting a bit in cpu_mask for the
shared high priority vectors, so it looks to be safe in practice.  The
caller's use of the vector_lock looks like a bodge though.

However, this analysis needs to be added to the comment for
setup_vector_irq(), because the behaviour is extremely fragile and
mustn't change.

>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com> [VT-d]
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

With some form of adjustment to the comment for setup_vector_irq(), and
ideally to the commit message about safety in practice, Acked-by: Andrew
Cooper <andrew.cooper3@citrix.com>

* Re: [Xen-devel] [PATCH v3 07/15] x86/IRQ: target online CPUs when binding guest IRQ
  2019-05-17 10:48     ` [Xen-devel] " Jan Beulich
@ 2019-07-03 18:30     ` Andrew Cooper
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 18:30 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:48, Jan Beulich wrote:
> fixup_irqs() skips interrupts without action. Hence such interrupts can
> retain affinity to just offline CPUs. With "noirqbalance" in effect,
> pirq_guest_bind() so far would have left them alone, resulting in a non-
> working interrupt.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

* Re: [Xen-devel] [PATCH v3 08/15] x86/IRQs: correct/tighten vector check in _clear_irq_vector()
  2019-05-17 10:49     ` [Xen-devel] " Jan Beulich
@ 2019-07-03 18:31     ` Andrew Cooper
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 18:31 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:49, Jan Beulich wrote:
> If any particular value was to be checked against, it would need to be
> IRQ_VECTOR_UNASSIGNED.
>
> Reported-by: Roger Pau Monné <roger.pau@citrix.com>
>
> Be more strict though and use valid_irq_vector() instead.
>
> Take the opportunity and also convert local variables to unsigned int.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

* Re: [Xen-devel] [PATCH v3 09/15] x86/IRQ: make fixup_irqs() skip unconnected internally used interrupts
  2019-05-17 10:49     ` [Xen-devel] " Jan Beulich
@ 2019-07-03 18:36     ` Andrew Cooper
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 18:36 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:49, Jan Beulich wrote:
> Since the "Cannot set affinity ..." warning is a one-time one, avoid
> triggering it already at boot time when parking secondary threads and
> the serial console uses a (still unconnected at that time) PCI IRQ.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

* Re: [Xen-devel] [PATCH v3 10/15] x86/IRQ: drop redundant cpumask_empty() from move_masked_irq()
  2019-05-17 10:50     ` [Xen-devel] " Jan Beulich
@ 2019-07-03 18:38     ` Andrew Cooper
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 18:38 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:50, Jan Beulich wrote:
> The subsequent cpumask_intersects() covers the "empty" case quite fine.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

* Re: [Xen-devel] [PATCH v3 11/15] x86/IRQ: simplify and rename pirq_acktype()
  2019-05-17 10:51     ` [Xen-devel] " Jan Beulich
@ 2019-07-03 18:39     ` Andrew Cooper
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 18:39 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:51, Jan Beulich wrote:
> Its only caller already has the IRQ descriptor in its hands, so there's
> no need for the function to re-obtain it. As a result the leading p of
> its name is no longer appropriate and hence gets dropped.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

* Re: [Xen-devel] [PATCH v3 12/15] x86/IRQ: add explicit tracing-enabled check to trace_irq_mask()
  2019-05-17 10:51     ` [Xen-devel] " Jan Beulich
@ 2019-07-03 18:41     ` Andrew Cooper
  2019-07-04 10:01       ` Jan Beulich
  -1 siblings, 1 reply; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 18:41 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: George Dunlap, Wei Liu, Roger Pau Monne

On 17/05/2019 11:51, Jan Beulich wrote:
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -137,6 +137,13 @@ static void trace_irq_mask(uint32_t even
>      trace_var(event, 1, sizeof(d), &d);
>  }
>  
> +static inline void trace_irq_mask(uint32_t event, int irq, int vector,

No inline.  Otherwise, Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

> +                                  const cpumask_t *mask)
> +{
> +    if ( unlikely(tb_init_done) )
> +        _trace_irq_mask(event, irq, vector, mask);
> +}
> +
>  static int __init _bind_irq_vector(struct irq_desc *desc, int vector,
>                                     const cpumask_t *cpu_mask)
>  {

* Re: [Xen-devel] [PATCH v3 13/15] x86/IRQ: tighten vector checks
  2019-05-17 10:52     ` [Xen-devel] " Jan Beulich
@ 2019-07-03 18:42     ` Andrew Cooper
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 18:42 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:52, Jan Beulich wrote:
> Use valid_irq_vector() rather than "> 0".
>
> Also replace an open-coded use of IRQ_VECTOR_UNASSIGNED.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

* Re: [Xen-devel] [PATCH v3 14/15] x86/IRQ: eliminate some on-stack cpumask_t instances
  2019-05-20 14:22       ` [Xen-devel] " Roger Pau Monné
@ 2019-07-03 18:44       ` Andrew Cooper
  2019-07-04 10:04         ` Jan Beulich
  -1 siblings, 1 reply; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 18:44 UTC (permalink / raw)
  To: Roger Pau Monné, Jan Beulich; +Cc: xen-devel, Wei Liu

On 20/05/2019 15:22, Roger Pau Monné wrote:
> On Fri, May 17, 2019 at 04:52:54AM -0600, Jan Beulich wrote:
>> Use scratch_cpumask where possible, to avoid creating these possibly
>> large stack objects. We can't use it in _assign_irq_vector() and
>> set_desc_affinity(), as these get called in IRQ context.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com

Missing a trailing >

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

* Re: [Xen-devel] [PATCH v3 15/15] x86/IRQ: move {, _}clear_irq_vector()
  2019-05-17 10:53     ` [Xen-devel] " Jan Beulich
@ 2019-07-03 18:45     ` Andrew Cooper
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrew Cooper @ 2019-07-03 18:45 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:53, Jan Beulich wrote:
> This is largely to drop a forward declaration. There's one functional
> change - clear_irq_vector() gets marked __init, as its only caller is
> check_timer(). Beyond this only a few stray blanks get removed.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

* Re: [Xen-devel] [PATCH v3 01/15] x86/IRQ: deal with move-in-progress state in fixup_irqs()
  2019-07-03 15:39     ` Andrew Cooper
@ 2019-07-04  9:32       ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-07-04  9:32 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 03.07.2019 17:39, Andrew Cooper wrote:
> On 17/05/2019 11:44, Jan Beulich wrote:
>> The flag being set may prevent affinity changes, as these often imply
>> assignment of a new vector. When there's no possible destination left
>> for the IRQ, the clearing of the flag needs to happen right from
>> fixup_irqs().
>>
>> Additionally _assign_irq_vector() needs to avoid setting the flag when
>> there's no online CPU left in what gets put into ->arch.old_cpu_mask.
>> The old vector can be released right away in this case.
> 
> This suggests that it is a bugfix, but it isn't clear what happens when
> things go wrong.

The vector cleanup wouldn't ever trigger, as the IRQ wouldn't get
raised anymore to any of its prior target CPUs. Hence the immediate
cleanup that gets done in that case. I thought the 2nd sentence
would make this clear. If it doesn't, do you have a suggestion on
how to improve the text?

>> --- a/xen/arch/x86/irq.c
>> +++ b/xen/arch/x86/irq.c
>> @@ -2418,15 +2462,18 @@ void fixup_irqs(const cpumask_t *mask, b
>>           if ( desc->handler->enable )
>>               desc->handler->enable(desc);
>>   
>> +        cpumask_copy(&affinity, desc->affinity);
>> +
>>           spin_unlock(&desc->lock);
>>   
>>           if ( !verbose )
>>               continue;
>>   
>> -        if ( break_affinity && set_affinity )
>> -            printk("Broke affinity for irq %i\n", irq);
>> -        else if ( !set_affinity )
>> -            printk("Cannot set affinity for irq %i\n", irq);
>> +        if ( !set_affinity )
>> +            printk("Cannot set affinity for IRQ%u\n", irq);
>> +        else if ( break_affinity )
>> +            printk("Broke affinity for IRQ%u, new: %*pb\n",
>> +                   irq, nr_cpu_ids, &affinity);
> 
> While I certainly prefer this version, I should point out that you
> refused to accept my patches like this, and for consistency with the
> rest of the codebase, you should be using cpumask_bits().

Oh, indeed. I guess I had converted a debugging-only printk() into
this one without noticing the necessary tidying, especially since
elsewhere in the series I'm actually doing so already.

Jan

* Re: [Xen-devel] [PATCH v3 04/15] x86/IRQ: desc->affinity should strictly represent the requested value
  2019-07-03 17:58     ` Andrew Cooper
@ 2019-07-04  9:37       ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-07-04  9:37 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 03.07.2019 19:58, Andrew Cooper wrote:
> On 17/05/2019 11:46, Jan Beulich wrote:
>> @@ -2334,9 +2339,10 @@ static void dump_irqs(unsigned char key)
>>   
>>           spin_lock_irqsave(&desc->lock, flags);
>>   
>> -        printk("   IRQ:%4d aff:%*pb vec:%02x %-15s status=%03x ",
>> -               irq, nr_cpu_ids, cpumask_bits(desc->affinity), desc->arch.vector,
>> -               desc->handler->typename, desc->status);
>> +        printk("   IRQ:%4d aff:%*pb/%*pb vec:%02x %-15s status=%03x ",
>> +               irq, nr_cpu_ids, cpumask_bits(desc->affinity),
>> +               nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
>> +               desc->arch.vector, desc->handler->typename, desc->status);
> 
> Taking a sample large system (Rome, with your x2apic series to be
> specific), which is only half as large as typical high-end Skylake systems.
> 
> (XEN) IRQ information:
> (XEN)    IRQ:   0 affinity:00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001 vec:f0 type=IO-APIC-edge    status=00000000 time.c#timer_interrupt()
> (XEN)    IRQ:   1 affinity:00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001 vec:68 type=IO-APIC-edge    status=00000002 mapped, unbound
> (XEN)    IRQ:   3 affinity:ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff vec:70 type=IO-APIC-edge    status=00000002 mapped, unbound
> (XEN)    IRQ:   4 affinity:ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff vec:f1 type=IO-APIC-edge    status=00000000 ns16550.c#ns16550_interrupt()
> (XEN)    IRQ:   5 affinity:00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001 vec:78 type=IO-APIC-edge    status=00000002 mapped, unbound
> (XEN)    IRQ:   6 affinity:00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001 vec:88 type=IO-APIC-edge    status=00000002 mapped, unbound
> (XEN)    IRQ:   7 affinity:ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff vec:90 type=IO-APIC-level   status=00000002 mapped, unbound
> (XEN)    IRQ:   8 affinity:00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001 vec:98 type=IO-APIC-edge    status=00000030 in-flight=0 domain-list=0:  8(---),
> (XEN)    IRQ:   9 affinity:00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001 vec:a0 type=IO-APIC-level   status=00000030 in-flight=1 domain-list=0:  9(PMM),
> 
> This change is going to double up the affinity block, which will make
> the lines even longer.
> 
> Given that all examples I've ever spotted are either a single bit, or a
> fully set block, {%*pbl} will render in a much shorter, and keep the
> line length reasonable.  (This in practice applies to the previous patch
> as well).

With SMT off (on Intel systems) I've certainly observed every other bit
being set, which is why I had specifically decided against %*pbl. Plus
using %*pbl would break the tabular formatting. The only middle ground
I could see (still having the undesirable latter effect) would be to
pick between both forms based on the ratio between set bits and total
number of them (and perhaps using %*pb as long as the total number of
them is below a certain threshold). Thoughts?

Jan

* Re: [Xen-devel] [PATCH v3 06/15] x86/IRQ: fix locking around vector management
  2019-07-03 18:23     ` Andrew Cooper
@ 2019-07-04  9:54       ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-07-04  9:54 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 03.07.2019 20:23, Andrew Cooper wrote:
> On 17/05/2019 11:47, Jan Beulich wrote:
>> All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
>> fields, and hence ought to be called with the descriptor lock held in
>> addition to vector_lock. This is currently the case for only
>> set_desc_affinity() (in the common case) and destroy_irq(), which also
>> clarifies what the nesting behavior between the locks has to be.
>> Reflect the new expectation by having these functions all take a
>> descriptor as parameter instead of an interrupt number.
>>
>> Also take care of the two special cases of calls to set_desc_affinity():
>> set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called
>> directly as well, and in these cases the descriptor locks hadn't got
>> acquired till now. For set_ioapic_affinity_irq() this means acquiring /
>> releasing of the IO-APIC lock can be plain spin_{,un}lock() then.
>>
>> Drop one of the two leading underscores from all three functions at
>> the same time.
>>
>> There's one case left where descriptors get manipulated with just
>> vector_lock held: setup_vector_irq() assumes its caller to acquire
>> vector_lock, and hence can't itself acquire the descriptor locks (wrong
>> lock order). I don't currently see how to address this.
> 
> In practice, the only mutation is setting a bit in cpu_mask for the
> shared high priority vectors, so it looks to be safe in practice.

I had tried to convince myself that it's safe in practice, but I'm
afraid I couldn't (and hence wouldn't want to say so in the patch
description here). There's one important thing to pay attention to:
Not all manipulations of ->arch.cpu_mask are atomic (and there's
really no way for them to be, with our current cpumask
infrastructure) - all of them are assumed to be done under lock. And
other than when offlining a CPU we're not in a fully synchronized
state while onlining one.

> The caller's use of the vector_lock looks like a bodge though.

Well, it's definitely not nice, but unavoidable (afaict).

> However,  this analysis needs to be added to the comment for
> setup_vector_irq(), because the behaviour is extremely fragile and
> mustn't change.

Will do.

>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> Reviewed-by: Kevin Tian <kevin.tian@intel.com> [VT-d]
>> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> With some form of adjustment to the comment for setup_vector_irq(), and
> ideally to the commit message about safety in practice, Acked-by: Andrew
> Cooper <andrew.cooper3@citrix.com>

Thanks.

Jan

* Re: [Xen-devel] [PATCH v3 12/15] x86/IRQ: add explicit tracing-enabled check to trace_irq_mask()
  2019-07-03 18:41     ` Andrew Cooper
@ 2019-07-04 10:01       ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-07-04 10:01 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: George Dunlap, Wei Liu, Roger Pau Monne

On 03.07.2019 20:41, Andrew Cooper wrote:
> On 17/05/2019 11:51, Jan Beulich wrote:
>> --- a/xen/arch/x86/irq.c
>> +++ b/xen/arch/x86/irq.c
>> @@ -137,6 +137,13 @@ static void trace_irq_mask(uint32_t even
>>       trace_var(event, 1, sizeof(d), &d);
>>   }
>>   
>> +static inline void trace_irq_mask(uint32_t event, int irq, int vector,
> 
> No inline.

Well, I think in cases like this one we really want it, but anyway,
I'll drop it just to make progress here.

>  Otherwise, Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks, Jan

* Re: [Xen-devel] [PATCH v3 14/15] x86/IRQ: eliminate some on-stack cpumask_t instances
  2019-07-03 18:44       ` Andrew Cooper
@ 2019-07-04 10:04         ` Jan Beulich
  0 siblings, 0 replies; 196+ messages in thread
From: Jan Beulich @ 2019-07-04 10:04 UTC (permalink / raw)
  To: Andrew Cooper, Roger Pau Monné; +Cc: xen-devel, Wei Liu

On 03.07.2019 20:44, Andrew Cooper wrote:
> On 20/05/2019 15:22, Roger Pau Monné wrote:
>> On Fri, May 17, 2019 at 04:52:54AM -0600, Jan Beulich wrote:
>>> Use scratch_cpumask where possible, to avoid creating these possibly
>>> large stack objects. We can't use it in _assign_irq_vector() and
>>> set_desc_affinity(), as these get called in IRQ context.
>>>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com
> 
> Missing a trailing >

I had added that one already.

> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks again.

Jan

end of thread, other threads:[~2019-07-04 10:06 UTC | newest]

Thread overview: 196+ messages
2019-04-29 11:16 [PATCH 0/9] x86: IRQ management adjustments Jan Beulich
2019-04-29 11:16 ` [Xen-devel] " Jan Beulich
2019-04-29 11:22 ` [PATCH RFC 1/9] x86/IRQ: deal with move-in-progress state in fixup_irqs() Jan Beulich
2019-04-29 11:22   ` [Xen-devel] " Jan Beulich
2019-04-29 12:55   ` Jan Beulich
2019-04-29 12:55     ` [Xen-devel] " Jan Beulich
2019-04-29 13:08     ` Jan Beulich
2019-04-29 13:08       ` [Xen-devel] " Jan Beulich
2019-04-29 11:23 ` [PATCH 2/9] x86/IRQ: deal with move cleanup count " Jan Beulich
2019-04-29 11:23   ` [Xen-devel] " Jan Beulich
2019-05-03 15:21   ` Roger Pau Monné
2019-05-03 15:21     ` [Xen-devel] " Roger Pau Monné
2019-05-06  7:44     ` Jan Beulich
2019-05-06  7:44       ` [Xen-devel] " Jan Beulich
2019-05-07  7:28     ` Jan Beulich
2019-05-07  7:28       ` [Xen-devel] " Jan Beulich
2019-05-07  8:12       ` Roger Pau Monné
2019-05-07  8:12         ` [Xen-devel] " Roger Pau Monné
2019-05-07  9:28         ` Jan Beulich
2019-05-07  9:28           ` [Xen-devel] " Jan Beulich
2019-04-29 11:23 ` [PATCH 3/9] x86/IRQ: improve dump_irqs() Jan Beulich
2019-04-29 11:23   ` [Xen-devel] " Jan Beulich
2019-05-03 15:43   ` Roger Pau Monné
2019-05-03 15:43     ` [Xen-devel] " Roger Pau Monné
2019-05-06  8:06     ` Jan Beulich
2019-05-06  8:06       ` [Xen-devel] " Jan Beulich
2019-04-29 11:24 ` [PATCH 4/9] x86/IRQ: desc->affinity should strictly represent the requested value Jan Beulich
2019-04-29 11:24   ` [Xen-devel] " Jan Beulich
2019-05-03 16:21   ` Roger Pau Monné
2019-05-03 16:21     ` [Xen-devel] " Roger Pau Monné
2019-05-06  8:14     ` Jan Beulich
2019-05-06  8:14       ` [Xen-devel] " Jan Beulich
2019-04-29 11:25 ` [PATCH 5/9] x86/IRQ: fix locking around vector management Jan Beulich
2019-04-29 11:25   ` [Xen-devel] " Jan Beulich
2019-05-06 11:48   ` Roger Pau Monné
2019-05-06 11:48     ` [Xen-devel] " Roger Pau Monné
2019-05-06 13:06     ` Jan Beulich
2019-05-06 13:06       ` [Xen-devel] " Jan Beulich
2019-04-29 11:25 ` [PATCH 6/9] x86/IRQ: reduce unused space in struct arch_irq_desc Jan Beulich
2019-04-29 11:25   ` [Xen-devel] " Jan Beulich
2019-04-29 11:46   ` Andrew Cooper
2019-04-29 11:46     ` [Xen-devel] " Andrew Cooper
2019-04-29 11:26 ` [PATCH 7/9] x86/IRQ: drop redundant cpumask_empty() from move_masked_irq() Jan Beulich
2019-04-29 11:26   ` [Xen-devel] " Jan Beulich
2019-05-06 13:39   ` Roger Pau Monné
2019-05-06 13:39     ` [Xen-devel] " Roger Pau Monné
2019-04-29 11:26 ` [PATCH 8/9] x86/IRQ: make fixup_irqs() skip unconnected internally used interrupts Jan Beulich
2019-04-29 11:26   ` [Xen-devel] " Jan Beulich
2019-05-06 13:52   ` Roger Pau Monné
2019-05-06 13:52     ` [Xen-devel] " Roger Pau Monné
2019-05-06 14:25     ` Jan Beulich
2019-05-06 14:25       ` [Xen-devel] " Jan Beulich
2019-05-06 14:37       ` Roger Pau Monné
2019-05-06 14:37         ` [Xen-devel] " Roger Pau Monné
2019-04-29 11:27 ` [PATCH 9/9] x86/IO-APIC: drop an unused variable from setup_IO_APIC_irqs() Jan Beulich
2019-04-29 11:27   ` [Xen-devel] " Jan Beulich
2019-04-29 11:40   ` Andrew Cooper
2019-04-29 11:40     ` [Xen-devel] " Andrew Cooper
2019-04-29 15:40 ` [PATCH v1b 1/9] x86/IRQ: deal with move-in-progress state in fixup_irqs() Jan Beulich
2019-04-29 15:40   ` [Xen-devel] " Jan Beulich
2019-05-03  9:19   ` Roger Pau Monné
2019-05-03  9:19     ` [Xen-devel] " Roger Pau Monné
2019-05-03 14:10     ` Jan Beulich
2019-05-03 14:10       ` [Xen-devel] " Jan Beulich
2019-05-06  7:15       ` Jan Beulich
2019-05-06  7:15         ` [Xen-devel] " Jan Beulich
2019-05-06 14:28         ` Roger Pau Monné
2019-05-06 14:28           ` [Xen-devel] " Roger Pau Monné
2019-05-06 15:00           ` Jan Beulich
2019-05-06 15:00             ` [Xen-devel] " Jan Beulich
2019-05-08 12:59 ` [PATCH v2 00/12] x86: IRQ management adjustments Jan Beulich
2019-05-08 12:59   ` [Xen-devel] " Jan Beulich
2019-05-08 13:03   ` [PATCH v2 01/12] x86/IRQ: deal with move-in-progress state in fixup_irqs() Jan Beulich
2019-05-08 13:03     ` [Xen-devel] " Jan Beulich
2019-05-13  9:04     ` Roger Pau Monné
2019-05-13  9:04       ` [Xen-devel] " Roger Pau Monné
2019-05-13  9:09       ` Jan Beulich
2019-05-13  9:09         ` [Xen-devel] " Jan Beulich
2019-05-08 13:03   ` [PATCH v2 02/12] x86/IRQ: deal with move cleanup count " Jan Beulich
2019-05-08 13:03     ` [Xen-devel] " Jan Beulich
2019-05-08 13:07   ` [PATCH v2 03/12] x86/IRQ: avoid UB (or worse) in trace_irq_mask() Jan Beulich
2019-05-08 13:07     ` [Xen-devel] " Jan Beulich
2019-05-13  9:08     ` Roger Pau Monné
2019-05-13  9:08       ` [Xen-devel] " Roger Pau Monné
2019-05-13 10:42     ` George Dunlap
2019-05-13 10:42       ` [Xen-devel] " George Dunlap
2019-05-13 12:05       ` Jan Beulich
2019-05-13 12:05         ` [Xen-devel] " Jan Beulich
2019-05-08 13:08   ` [PATCH v2 04/12] x86/IRQ: improve dump_irqs() Jan Beulich
2019-05-08 13:08     ` [Xen-devel] " Jan Beulich
2019-05-08 13:09   ` [PATCH v2 05/12] x86/IRQ: desc->affinity should strictly represent the requested value Jan Beulich
2019-05-08 13:09     ` [Xen-devel] " Jan Beulich
2019-05-08 13:10   ` [PATCH v2 06/12] x86/IRQ: consolidate use of ->arch.cpu_mask Jan Beulich
2019-05-08 13:10     ` [Xen-devel] " Jan Beulich
2019-05-13 11:32     ` Roger Pau Monné
2019-05-13 11:32       ` [Xen-devel] " Roger Pau Monné
2019-05-13 15:21       ` Jan Beulich
2019-05-13 15:21         ` [Xen-devel] " Jan Beulich
2019-05-08 13:10   ` [PATCH v2 07/12] x86/IRQ: fix locking around vector management Jan Beulich
2019-05-08 13:10     ` [Xen-devel] " Jan Beulich
2019-05-08 13:16     ` Jan Beulich
2019-05-08 13:16       ` [Xen-devel] " Jan Beulich
2019-05-11  0:11       ` Tian, Kevin
2019-05-11  0:11         ` [Xen-devel] " Tian, Kevin
2019-05-13 13:48     ` Roger Pau Monné
2019-05-13 13:48       ` [Xen-devel] " Roger Pau Monné
2019-05-13 14:19       ` Jan Beulich
2019-05-13 14:19         ` [Xen-devel] " Jan Beulich
2019-05-13 14:45         ` Roger Pau Monné
2019-05-13 14:45           ` [Xen-devel] " Roger Pau Monné
2019-05-13 15:05           ` Jan Beulich
2019-05-13 15:05             ` [Xen-devel] " Jan Beulich
2019-05-08 13:11   ` [PATCH v2 08/12] x86/IRQs: correct/tighten vector check in _clear_irq_vector() Jan Beulich
2019-05-08 13:11     ` [Xen-devel] " Jan Beulich
2019-05-13 14:01     ` Roger Pau Monné
2019-05-13 14:01       ` [Xen-devel] " Roger Pau Monné
2019-05-08 13:12   ` [PATCH v2 09/12] x86/IRQ: make fixup_irqs() skip unconnected internally used interrupts Jan Beulich
2019-05-08 13:12     ` [Xen-devel] " Jan Beulich
2019-05-08 13:13   ` [PATCH v2 10/12] x86/IRQ: reduce unused space in struct arch_irq_desc Jan Beulich
2019-05-08 13:13     ` [Xen-devel] " Jan Beulich
2019-05-08 13:13   ` [PATCH v2 11/12] x86/IRQ: drop redundant cpumask_empty() from move_masked_irq() Jan Beulich
2019-05-08 13:13     ` [Xen-devel] " Jan Beulich
2019-05-08 13:14   ` [PATCH v2 12/12] x86/IRQ: simplify and rename pirq_acktype() Jan Beulich
2019-05-08 13:14     ` [Xen-devel] " Jan Beulich
2019-05-13 14:14     ` Roger Pau Monné
2019-05-13 14:14       ` [Xen-devel] " Roger Pau Monné
2019-05-17 10:39 ` [PATCH v3 00/15] x86: IRQ management adjustments Jan Beulich
2019-05-17 10:39   ` [Xen-devel] " Jan Beulich
2019-05-17 10:44   ` [PATCH v3 01/15] x86/IRQ: deal with move-in-progress state in fixup_irqs() Jan Beulich
2019-05-17 10:44     ` [Xen-devel] " Jan Beulich
2019-07-03 15:39     ` Andrew Cooper
2019-07-04  9:32       ` Jan Beulich
2019-05-17 10:45   ` [PATCH v3 02/15] x86/IRQ: deal with move cleanup count " Jan Beulich
2019-05-17 10:45     ` [Xen-devel] " Jan Beulich
2019-07-03 16:32     ` Andrew Cooper
2019-05-17 10:46   ` [PATCH v3 03/15] x86/IRQ: improve dump_irqs() Jan Beulich
2019-05-17 10:46     ` [Xen-devel] " Jan Beulich
2019-07-03 16:39     ` Andrew Cooper
2019-05-17 10:46   ` [PATCH v3 04/15] x86/IRQ: desc->affinity should strictly represent the requested value Jan Beulich
2019-05-17 10:46     ` [Xen-devel] " Jan Beulich
2019-07-03 17:58     ` Andrew Cooper
2019-07-04  9:37       ` Jan Beulich
2019-05-17 10:47   ` [PATCH v3 05/15] x86/IRQ: consolidate use of ->arch.cpu_mask Jan Beulich
2019-05-17 10:47     ` [Xen-devel] " Jan Beulich
2019-07-03 18:07     ` Andrew Cooper
2019-05-17 10:47   ` [PATCH v3 06/15] x86/IRQ: fix locking around vector management Jan Beulich
2019-05-17 10:47     ` [Xen-devel] " Jan Beulich
2019-07-03 18:23     ` Andrew Cooper
2019-07-04  9:54       ` Jan Beulich
2019-05-17 10:48   ` [PATCH v3 07/15] x86/IRQ: target online CPUs when binding guest IRQ Jan Beulich
2019-05-17 10:48     ` [Xen-devel] " Jan Beulich
2019-05-20 11:40     ` Roger Pau Monné
2019-05-20 11:40       ` [Xen-devel] " Roger Pau Monné
2019-05-20 15:17       ` Jan Beulich
2019-05-20 15:17         ` [Xen-devel] " Jan Beulich
2019-05-22  9:41         ` Roger Pau Monné
2019-05-22  9:41           ` [Xen-devel] " Roger Pau Monné
2019-07-03 18:30     ` Andrew Cooper
2019-05-17 10:49   ` [PATCH v3 08/15] x86/IRQs: correct/tighten vector check in _clear_irq_vector() Jan Beulich
2019-05-17 10:49     ` [Xen-devel] " Jan Beulich
2019-07-03 18:31     ` Andrew Cooper
2019-05-17 10:49   ` [PATCH v3 09/15] x86/IRQ: make fixup_irqs() skip unconnected internally used interrupts Jan Beulich
2019-05-17 10:49     ` [Xen-devel] " Jan Beulich
2019-07-03 18:36     ` Andrew Cooper
2019-05-17 10:50   ` [PATCH v3 10/15] x86/IRQ: drop redundant cpumask_empty() from move_masked_irq() Jan Beulich
2019-05-17 10:50     ` [Xen-devel] " Jan Beulich
2019-07-03 18:38     ` Andrew Cooper
2019-05-17 10:51   ` [PATCH v3 11/15] x86/IRQ: simplify and rename pirq_acktype() Jan Beulich
2019-05-17 10:51     ` [Xen-devel] " Jan Beulich
2019-07-03 18:39     ` Andrew Cooper
2019-05-17 10:51   ` [PATCH v3 12/15] x86/IRQ: add explicit tracing-enabled check to trace_irq_mask() Jan Beulich
2019-05-17 10:51     ` [Xen-devel] " Jan Beulich
2019-05-20 11:46     ` Roger Pau Monné
2019-05-20 11:46       ` [Xen-devel] " Roger Pau Monné
2019-07-03 18:41     ` Andrew Cooper
2019-07-04 10:01       ` Jan Beulich
2019-05-17 10:52   ` [PATCH v3 13/15] x86/IRQ: tighten vector checks Jan Beulich
2019-05-17 10:52     ` [Xen-devel] " Jan Beulich
2019-05-20 14:04     ` Roger Pau Monné
2019-05-20 14:04       ` [Xen-devel] " Roger Pau Monné
2019-05-20 15:26       ` Jan Beulich
2019-05-20 15:26         ` [Xen-devel] " Jan Beulich
2019-05-22 16:42         ` Roger Pau Monné
2019-05-22 16:42           ` [Xen-devel] " Roger Pau Monné
2019-05-23  8:36           ` Jan Beulich
2019-05-23  8:36             ` [Xen-devel] " Jan Beulich
2019-07-03 18:42     ` Andrew Cooper
2019-05-17 10:52   ` [PATCH v3 14/15] x86/IRQ: eliminate some on-stack cpumask_t instances Jan Beulich
2019-05-17 10:52     ` [Xen-devel] " Jan Beulich
2019-05-20 14:22     ` Roger Pau Monné
2019-05-20 14:22       ` [Xen-devel] " Roger Pau Monné
2019-07-03 18:44       ` Andrew Cooper
2019-07-04 10:04         ` Jan Beulich
2019-05-17 10:53   ` [PATCH v3 15/15] x86/IRQ: move {,_}clear_irq_vector() Jan Beulich
2019-05-17 10:53     ` [Xen-devel] " Jan Beulich
2019-07-03 18:45     ` [Xen-devel] [PATCH v3 15/15] x86/IRQ: move {, _}clear_irq_vector() Andrew Cooper
