* Re: [RFC v2 08/15] Update IRTE according to guest interrupt config changes
@ 2015-06-12  9:40 Wu, Feng
  2015-06-12 10:40 ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Wu, Feng @ 2015-06-12  9:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, keir, george.dunlap, andrew.cooper3, xen-devel,
	Zhang, Yang Z, Wu, Feng



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, June 09, 2015 11:06 PM
> To: Wu, Feng
> Cc: andrew.cooper3@citrix.com; george.dunlap@eu.citrix.com; Tian, Kevin;
> Zhang, Yang Z; xen-devel@lists.xen.org; keir@xen.org
> Subject: Re: [RFC v2 08/15] Update IRTE according to guest interrupt config
> changes
> 
> >>> On 08.05.15 at 11:07, <feng.wu@intel.com> wrote:
> > +static bool_t pi_find_dest_vcpu(struct domain *d, uint8_t dest_id,
> > +                                uint8_t dest_mode, uint8_t delivery_mode,
> > +                                uint8_t gvec, struct vcpu **dest_vcpu)
> > +{
> > +    struct vcpu *v, **dest_vcpu_array;
> > +    unsigned int dest_vcpu_num = 0;
> > +    int ret;
> 
> This, being given as operand to "return", should match in type with
> the function's return type.
> 
> > +    dest_vcpu_array = xzalloc_array(struct vcpu *, d->max_vcpus);
> 
> You realize that this can be quite big an allocation? (You could at
> least halve it by storing vCPU IDs instead of pointers, but if I'm
> not mistaken this could even be a simple bitmap.)

If we use vCPU IDs or a bitmap, we need to iterate over all the vCPUs in the
domain to find the right vCPU from the vcpu_id, right?

> 
> > +    if ( !dest_vcpu_array )
> > +    {
> > +        dprintk(XENLOG_G_INFO,
> > +                "dom%d: failed to allocate memeory.\n", d->domain_id);
> 
> Please fix the typo and remove the stop.
> 
> > +        return 0;
> > +    }
> > +
> > +    for_each_vcpu ( d, v )
> > +    {
> > +        if ( !vlapic_match_dest(vcpu_vlapic(v), NULL, 0,
> > +                                dest_id, dest_mode) )
> > +            continue;
> > +
> > +        dest_vcpu_array[dest_vcpu_num++] = v;
> > +    }
> > +
> > +    if ( delivery_mode == dest_LowestPrio )
> > +    {
> > +        if (  dest_vcpu_num != 0 )
> > +        {
> > +            *dest_vcpu = dest_vcpu_array[gvec % dest_vcpu_num];
> > +            ret = 1;
> > +        }
> > +        else
> > +            ret = 0;
> > +    }
> > +    else if (  dest_vcpu_num == 1 )
> > +    {
> > +        *dest_vcpu = dest_vcpu_array[0];
> > +        ret = 1;
> > +    }
> > +    else
> > +        ret = 0;
> > +
> > +    xfree(dest_vcpu_array);
> > +    return ret;
> > +}
> 
> Blank line before final return statement please.
> 
> > @@ -330,11 +398,40 @@ int pt_irq_create_bind(
> >          /* Calculate dest_vcpu_id for MSI-type pirq migration. */
> >          dest = pirq_dpci->gmsi.gflags & VMSI_DEST_ID_MASK;
> >          dest_mode = !!(pirq_dpci->gmsi.gflags & VMSI_DM_MASK);
> > +        delivery_mode = (pirq_dpci->gmsi.gflags >> GFLAGS_SHIFT_DELIV_MODE) &
> > +                        VMSI_DELIV_MASK;
> >          dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
> >          pirq_dpci->gmsi.dest_vcpu_id = dest_vcpu_id;
> >          spin_unlock(&d->event_lock);
> >          if ( dest_vcpu_id >= 0 )
> >              hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]);
> > +
> > +        /* Use interrupt posting if it is supported */
> > +        if ( iommu_intpost )
> > +        {
> > +            struct vcpu *vcpu = NULL;
> > +
> > +            if ( !pi_find_dest_vcpu(d, dest, dest_mode, delivery_mode,
> > +                                    pirq_dpci->gmsi.gvec, &vcpu) )
> 
> Why not have the function return the vCPU instead of passing it
> via indirection?

Good suggestion!

Thanks,
Feng

> 
> > +            {
> > +                dprintk(XENLOG_G_WARNING,
> > +                        "%pv: failed to find the dest vCPU for PI, guest "
> > +                        "vector:%u use software way to deliver the "
> > +                        " interrupts.\n", vcpu, pirq_dpci->gmsi.gvec);
> 
> You shouldn't be printing the (NULL) vCPU here. And please print
> vectors as hex values.
> 
> > +                break;
> > +            }
> > +
> > +            if ( pi_update_irte( vcpu, info, pirq_dpci->gmsi.gvec ) != 0 )
> > +            {
> > +                dprintk(XENLOG_G_WARNING,
> > +                        "%pv: failed to update PI IRTE, guest vector:%u "
> > +                        "use software way to deliver the interrupts.\n",
> > +                        vcpu, pirq_dpci->gmsi.gvec);
> > +
> > +                break;
> > +            }
> > +        }
> > +
> >          break;
> 
> By using if() / else if() you could drop _both_ break-s you add.
> 
> Jan


* Re: [RFC v2 08/15] Update IRTE according to guest interrupt config changes
  2015-06-12  9:40 [RFC v2 08/15] Update IRTE according to guest interrupt config changes Wu, Feng
@ 2015-06-12 10:40 ` Jan Beulich
  2015-06-15  9:18   ` Wu, Feng
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2015-06-12 10:40 UTC (permalink / raw)
  To: Feng Wu
  Cc: Kevin Tian, keir, george.dunlap, andrew.cooper3, xen-devel, Yang Z Zhang

>>> On 12.06.15 at 11:40, <feng.wu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Tuesday, June 09, 2015 11:06 PM
>> >>> On 08.05.15 at 11:07, <feng.wu@intel.com> wrote:
>> > +static bool_t pi_find_dest_vcpu(struct domain *d, uint8_t dest_id,
>> > +                                uint8_t dest_mode, uint8_t delivery_mode,
>> > +                                uint8_t gvec, struct vcpu **dest_vcpu)
>> > +{
>> > +    struct vcpu *v, **dest_vcpu_array;
>> > +    unsigned int dest_vcpu_num = 0;
>> > +    int ret;
>> 
>> This, being given as operand to "return", should match in type with
>> the function's return type.
>> 
>> > +    dest_vcpu_array = xzalloc_array(struct vcpu *, d->max_vcpus);
>> 
>> You realize that this can be quite big an allocation? (You could at
>> least halve it by storing vCPU IDs instead of pointers, but if I'm
>> not mistaken this could even be a simple bitmap.)
> 
> If we use vCPU IDs or a bitmap, we need to iterate over all the vCPUs in the
> domain to find the right vCPU from the vcpu_id, right?

Why? You can index d->vcpu[].
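
I.e., as a sketch, the lookup is just an array index (no loop needed):

    struct vcpu *v = d->vcpu[vcpu_id];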

Jan


* Re: [RFC v2 08/15] Update IRTE according to guest interrupt config changes
  2015-06-12 10:40 ` Jan Beulich
@ 2015-06-15  9:18   ` Wu, Feng
  0 siblings, 0 replies; 7+ messages in thread
From: Wu, Feng @ 2015-06-15  9:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, keir, george.dunlap, andrew.cooper3, xen-devel,
	Zhang, Yang Z, Wu, Feng



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, June 12, 2015 6:41 PM
> To: Wu, Feng
> Cc: andrew.cooper3@citrix.com; george.dunlap@eu.citrix.com; Tian, Kevin;
> Zhang, Yang Z; xen-devel@lists.xen.org; keir@xen.org
> Subject: RE: [RFC v2 08/15] Update IRTE according to guest interrupt config
> changes
> 
> >>> On 12.06.15 at 11:40, <feng.wu@intel.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Tuesday, June 09, 2015 11:06 PM
> >> >>> On 08.05.15 at 11:07, <feng.wu@intel.com> wrote:
> >> > +static bool_t pi_find_dest_vcpu(struct domain *d, uint8_t dest_id,
> >> > +                                uint8_t dest_mode, uint8_t delivery_mode,
> >> > +                                uint8_t gvec, struct vcpu **dest_vcpu)
> >> > +{
> >> > +    struct vcpu *v, **dest_vcpu_array;
> >> > +    unsigned int dest_vcpu_num = 0;
> >> > +    int ret;
> >>
> >> This, being given as operand to "return", should match in type with
> >> the function's return type.
> >>
> >> > +    dest_vcpu_array = xzalloc_array(struct vcpu *, d->max_vcpus);
> >>
> >> You realize that this can be quite big an allocation? (You could at
> >> least halve it by storing vCPU IDs instead of pointers, but if I'm
> >> not mistaken this could even be a simple bitmap.)
> >
> > If we use vCPU IDs or a bitmap, we need to iterate over all the vCPUs in the
> > domain to find the right vCPU from the vcpu_id, right?
> 
> Why? You can index d->vcpu[].

Oh, yes, I didn't notice that.

Thanks,
Feng

> 
> Jan


* Re: [RFC v2 08/15] Update IRTE according to guest interrupt config changes
  2015-06-16  8:08     ` Wu, Feng
@ 2015-06-16  8:17       ` Jan Beulich
  0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2015-06-16  8:17 UTC (permalink / raw)
  To: Feng Wu
  Cc: Kevin Tian, keir, george.dunlap, andrew.cooper3, xen-devel, Yang Z Zhang

>>> On 16.06.15 at 10:08, <feng.wu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Tuesday, June 09, 2015 11:06 PM
>> >>> On 08.05.15 at 11:07, <feng.wu@intel.com> wrote:
>> > +    dest_vcpu_array = xzalloc_array(struct vcpu *, d->max_vcpus);
>> 
>> You realize that this can be quite big an allocation? (You could at
>> least halve it by storing vCPU IDs instead of pointers, but if I'm
>> not mistaken this could even be a simple bitmap.)
> 
> I think storing the vCPU IDs may be better than using a bitmap, because with
> vCPU IDs we can easily find the destination vCPU ID via
> dest_vcpu_id_array[gvec % dest_vcpu_num] and then get the vCPU from d->vcpu[].
> However, if we use a bitmap here, we set the bits for those vCPUs which are
> present in the destination field and then need to use find_next_bit() to find
> the right vCPU ID, which may require a loop. See the following scenario:
> 
> - The guest has 8 vCPUs; in the interrupt destination fields it only
>   configures vCPUs 0, 1, 3, 4, 5 and 7 for the lowest-priority interrupt, so
>   the vCPU destination bitmap looks like this: 10111011b.
> - Suppose the guest vector is 44: 44 % 6 = 2, so we need find_next_bit() to
>   locate the second set bit (counting from 0), which is vCPU ID 3; this may
>   require a loop.
> 
> What is your opinion about this? Space vs Speed?

Since the array can get rather large (in particular, larger than a page),
space would seem the more important factor here.

Also - please trim your replies.

Jan


* Re: [RFC v2 08/15] Update IRTE according to guest interrupt config changes
  2015-06-09 15:06   ` Jan Beulich
@ 2015-06-16  8:08     ` Wu, Feng
  2015-06-16  8:17       ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Wu, Feng @ 2015-06-16  8:08 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, keir, george.dunlap, andrew.cooper3, xen-devel,
	Zhang, Yang Z, Wu, Feng



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, June 09, 2015 11:06 PM
> To: Wu, Feng
> Cc: andrew.cooper3@citrix.com; george.dunlap@eu.citrix.com; Tian, Kevin;
> Zhang, Yang Z; xen-devel@lists.xen.org; keir@xen.org
> Subject: Re: [RFC v2 08/15] Update IRTE according to guest interrupt config
> changes
> 
> >>> On 08.05.15 at 11:07, <feng.wu@intel.com> wrote:
> > +static bool_t pi_find_dest_vcpu(struct domain *d, uint8_t dest_id,
> > +                                uint8_t dest_mode, uint8_t
> delivery_mode,
> > +                                uint8_t gvec, struct vcpu
> **dest_vcpu)
> > +{
> > +    struct vcpu *v, **dest_vcpu_array;
> > +    unsigned int dest_vcpu_num = 0;
> > +    int ret;
> 
> This, being given as operand to "return", should match in type with
> the function's return type.
> 
> > +    dest_vcpu_array = xzalloc_array(struct vcpu *, d->max_vcpus);
> 
> You realize that this can be quite big an allocation? (You could at
> least halve it by storing vCPU IDs instead of pointers, but if I'm
> not mistaken this could even be a simple bitmap.)

I think storing the vCPU IDs may be better than using a bitmap, because with
vCPU IDs we can easily find the destination vCPU ID via
dest_vcpu_id_array[gvec % dest_vcpu_num] and then get the vCPU from d->vcpu[].
However, if we use a bitmap here, we set the bits for those vCPUs which are
present in the destination field and then need to use find_next_bit() to find
the right vCPU ID, which may require a loop. See the following scenario:

- The guest has 8 vCPUs; in the interrupt destination fields it only
  configures vCPUs 0, 1, 3, 4, 5 and 7 for the lowest-priority interrupt, so
  the vCPU destination bitmap looks like this: 10111011b.
- Suppose the guest vector is 44: 44 % 6 = 2, so we need find_next_bit() to
  locate the second set bit (counting from 0), which is vCPU ID 3; this may
  require a loop.
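
A standalone sketch of the two lookups (names invented for illustration; in
Xen the bitmap walk would use find_next_bit() over a real bitmap):

    #include <stdint.h>

    /* Array of destination vCPU IDs: O(1) pick. */
    static int pick_from_array(const unsigned int *ids, unsigned int n,
                               uint8_t gvec)
    {
        return n ? (int)ids[gvec % n] : -1;
    }

    /* Bitmap of destination vCPUs: walk to the (gvec % n)-th set bit. */
    static int pick_from_bitmap(unsigned long map, unsigned int n,
                                uint8_t gvec)
    {
        unsigned int target, seen = 0, bit;

        if ( !n )
            return -1;
        target = gvec % n;
        for ( bit = 0; bit < 8 * sizeof(map); bit++ )
            if ( ((map >> bit) & 1) && seen++ == target )
                return (int)bit;

        return -1;
    }

    /* E.g. map = 0xBB (vCPUs 0,1,3,4,5,7), n = 6, gvec = 44:
     * 44 % 6 = 2, and the set bit with index 2 is bit 3, i.e. vCPU ID 3. */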

What is your opinion about this? Space vs Speed?

Thanks,
Feng

> 
> > +    if ( !dest_vcpu_array )
> > +    {
> > +        dprintk(XENLOG_G_INFO,
> > +                "dom%d: failed to allocate memeory.\n", d->domain_id);
> 
> Please fix the typo and remove the stop.
> 
> > +        return 0;
> > +    }
> > +
> > +    for_each_vcpu ( d, v )
> > +    {
> > +        if ( !vlapic_match_dest(vcpu_vlapic(v), NULL, 0,
> > +                                dest_id, dest_mode) )
> > +            continue;
> > +
> > +        dest_vcpu_array[dest_vcpu_num++] = v;
> > +    }
> > +
> > +    if ( delivery_mode == dest_LowestPrio )
> > +    {
> > +        if (  dest_vcpu_num != 0 )
> > +        {
> > +            *dest_vcpu = dest_vcpu_array[gvec % dest_vcpu_num];
> > +            ret = 1;
> > +        }
> > +        else
> > +            ret = 0;
> > +    }
> > +    else if (  dest_vcpu_num == 1 )
> > +    {
> > +        *dest_vcpu = dest_vcpu_array[0];
> > +        ret = 1;
> > +    }
> > +    else
> > +        ret = 0;
> > +
> > +    xfree(dest_vcpu_array);
> > +    return ret;
> > +}
> 
> Blank line before final return statement please.
> 
> > @@ -330,11 +398,40 @@ int pt_irq_create_bind(
> >          /* Calculate dest_vcpu_id for MSI-type pirq migration. */
> >          dest = pirq_dpci->gmsi.gflags & VMSI_DEST_ID_MASK;
> >          dest_mode = !!(pirq_dpci->gmsi.gflags & VMSI_DM_MASK);
> > +        delivery_mode = (pirq_dpci->gmsi.gflags >> GFLAGS_SHIFT_DELIV_MODE) &
> > +                        VMSI_DELIV_MASK;
> >          dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
> >          pirq_dpci->gmsi.dest_vcpu_id = dest_vcpu_id;
> >          spin_unlock(&d->event_lock);
> >          if ( dest_vcpu_id >= 0 )
> >              hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]);
> > +
> > +        /* Use interrupt posting if it is supported */
> > +        if ( iommu_intpost )
> > +        {
> > +            struct vcpu *vcpu = NULL;
> > +
> > +            if ( !pi_find_dest_vcpu(d, dest, dest_mode, delivery_mode,
> > +                                    pirq_dpci->gmsi.gvec, &vcpu) )
> 
> Why not have the function return the vCPU instead of passing it
> via indirection?
> 
> > +            {
> > +                dprintk(XENLOG_G_WARNING,
> > +                        "%pv: failed to find the dest vCPU for PI, guest "
> > +                        "vector:%u use software way to deliver the "
> > +                        " interrupts.\n", vcpu, pirq_dpci->gmsi.gvec);
> 
> You shouldn't be printing the (NULL) vCPU here. And please print
> vectors as hex values.
> 
> > +                break;
> > +            }
> > +
> > +            if ( pi_update_irte( vcpu, info, pirq_dpci->gmsi.gvec ) != 0 )
> > +            {
> > +                dprintk(XENLOG_G_WARNING,
> > +                        "%pv: failed to update PI IRTE, guest vector:%u "
> > +                        "use software way to deliver the interrupts.\n",
> > +                        vcpu, pirq_dpci->gmsi.gvec);
> > +
> > +                break;
> > +            }
> > +        }
> > +
> >          break;
> 
> By using if() / else if() you could drop _both_ break-s you add.
> 
> Jan


* Re: [RFC v2 08/15] Update IRTE according to guest interrupt config changes
  2015-05-08  9:07 ` [RFC v2 08/15] Update IRTE according to guest interrupt config changes Feng Wu
@ 2015-06-09 15:06   ` Jan Beulich
  2015-06-16  8:08     ` Wu, Feng
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2015-06-09 15:06 UTC (permalink / raw)
  To: Feng Wu
  Cc: kevin.tian, keir, george.dunlap, andrew.cooper3, xen-devel, yang.z.zhang

>>> On 08.05.15 at 11:07, <feng.wu@intel.com> wrote:
> +static bool_t pi_find_dest_vcpu(struct domain *d, uint8_t dest_id,
> +                                uint8_t dest_mode, uint8_t delivery_mode,
> +                                uint8_t gvec, struct vcpu **dest_vcpu)
> +{
> +    struct vcpu *v, **dest_vcpu_array;
> +    unsigned int dest_vcpu_num = 0;
> +    int ret;

This, being given as operand to "return", should match in type with
the function's return type.

> +    dest_vcpu_array = xzalloc_array(struct vcpu *, d->max_vcpus);

You realize that this can be quite big an allocation? (You could at
least halve it by storing vCPU IDs instead of pointers, but if I'm
not mistaken this could even be a simple bitmap.)

> +    if ( !dest_vcpu_array )
> +    {
> +        dprintk(XENLOG_G_INFO,
> +                "dom%d: failed to allocate memeory.\n", d->domain_id);

Please fix the typo and remove the stop.

> +        return 0;
> +    }
> +
> +    for_each_vcpu ( d, v )
> +    {
> +        if ( !vlapic_match_dest(vcpu_vlapic(v), NULL, 0,
> +                                dest_id, dest_mode) )
> +            continue;
> +
> +        dest_vcpu_array[dest_vcpu_num++] = v;
> +    }
> +
> +    if ( delivery_mode == dest_LowestPrio )
> +    {
> +        if (  dest_vcpu_num != 0 )
> +        {
> +            *dest_vcpu = dest_vcpu_array[gvec % dest_vcpu_num];
> +            ret = 1;
> +        }
> +        else
> +            ret = 0;
> +    }
> +    else if (  dest_vcpu_num == 1 )
> +    {
> +        *dest_vcpu = dest_vcpu_array[0];
> +        ret = 1;
> +    }
> +    else
> +        ret = 0;
> +
> +    xfree(dest_vcpu_array);
> +    return ret;
> +}

Blank line before final return statement please.

> @@ -330,11 +398,40 @@ int pt_irq_create_bind(
>          /* Calculate dest_vcpu_id for MSI-type pirq migration. */
>          dest = pirq_dpci->gmsi.gflags & VMSI_DEST_ID_MASK;
>          dest_mode = !!(pirq_dpci->gmsi.gflags & VMSI_DM_MASK);
> +        delivery_mode = (pirq_dpci->gmsi.gflags >> GFLAGS_SHIFT_DELIV_MODE) &
> +                        VMSI_DELIV_MASK;
>          dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
>          pirq_dpci->gmsi.dest_vcpu_id = dest_vcpu_id;
>          spin_unlock(&d->event_lock);
>          if ( dest_vcpu_id >= 0 )
>              hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]);
> +
> +        /* Use interrupt posting if it is supported */
> +        if ( iommu_intpost )
> +        {
> +            struct vcpu *vcpu = NULL;
> +
> +            if ( !pi_find_dest_vcpu(d, dest, dest_mode, delivery_mode,
> +                                    pirq_dpci->gmsi.gvec, &vcpu) )

Why not have the function return the vCPU instead of passing it
via indirection?
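
For illustration, a sketch of that shape (not meant as the final form):

    static struct vcpu *pi_find_dest_vcpu(struct domain *d, uint8_t dest_id,
                                          uint8_t dest_mode,
                                          uint8_t delivery_mode, uint8_t gvec);

returning NULL when no suitable vCPU is found, which would also take care
of the bool_t/int mismatch noted further up.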

> +            {
> +                dprintk(XENLOG_G_WARNING,
> +                        "%pv: failed to find the dest vCPU for PI, guest "
> +                        "vector:%u use software way to deliver the "
> +                        " interrupts.\n", vcpu, pirq_dpci->gmsi.gvec);

You shouldn't be printing the (NULL) vCPU here. And please print
vectors as hex values.
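
E.g. (a sketch only; exact wording is up to you):

    dprintk(XENLOG_G_WARNING,
            "dom%d: no destination vCPU for PI, guest vector %#x, "
            "falling back to interrupt remapping\n",
            d->domain_id, pirq_dpci->gmsi.gvec);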

> +                break;
> +            }
> +
> +            if ( pi_update_irte( vcpu, info, pirq_dpci->gmsi.gvec ) != 0 )
> +            {
> +                dprintk(XENLOG_G_WARNING,
> +                        "%pv: failed to update PI IRTE, guest vector:%u "
> +                        "use software way to deliver the interrupts.\n",
> +                        vcpu, pirq_dpci->gmsi.gvec);
> +
> +                break;
> +            }
> +        }
> +
>          break;

By using if() / else if() you could drop _both_ break-s you add.
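
I.e. something along these lines (sketch):

    if ( !vcpu )
        dprintk(XENLOG_G_WARNING, ...);
    else if ( pi_update_irte(vcpu, info, pirq_dpci->gmsi.gvec) )
        dprintk(XENLOG_G_WARNING, ...);

with execution then falling through to the existing final break.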

Jan


* [RFC v2 08/15] Update IRTE according to guest interrupt config changes
  2015-05-08  9:07 [RFC v2 00/15] Add VT-d Posted-Interrupts support Feng Wu
@ 2015-05-08  9:07 ` Feng Wu
  2015-06-09 15:06   ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Feng Wu @ 2015-05-08  9:07 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, keir, george.dunlap, andrew.cooper3, jbeulich,
	yang.z.zhang, Feng Wu

When guest changes its interrupt configuration (such as, vector, etc.)
for direct-assigned devices, we need to update the associated IRTE
with the new guest vector, so external interrupts from the assigned
devices can be injected to guests without VM-Exit.

For lowest-priority interrupts, we use the vector-hashing mechanism to find
the destination vCPU. This follows the hardware behavior, since modern
Intel CPUs use vector hashing to handle the lowest-priority interrupt.

For multicast/broadcast vCPUs, we cannot handle them via interrupt posting,
so we still use interrupt remapping.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/drivers/passthrough/io.c | 99 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 98 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 9b77334..7b1c094 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -26,6 +26,7 @@
 #include <asm/hvm/iommu.h>
 #include <asm/hvm/support.h>
 #include <xen/hvm/irq.h>
+#include <asm/io_apic.h>
 
 static DEFINE_PER_CPU(struct list_head, dpci_list);
 
@@ -199,6 +200,73 @@ void free_hvm_irq_dpci(struct hvm_irq_dpci *dpci)
     xfree(dpci);
 }
 
+/*
+ * The purpose of this routine is to find the right destination vCPU for
+ * an interrupt which will be delivered by VT-d posted-interrupt. There
+ * are several cases as below:
+ *
+ * - For lowest-priority interrupts, we find the destination vCPU from the
+ *   guest vector using vector-hashing mechanism and return true. This follows
+ *   the hardware behavior, since modern Intel CPUs use vector hashing to
+ *   handle the lowest-priority interrupt.
+ * - Otherwise, for single destination interrupt, it is straightforward to
+ *   find the destination vCPU and return true.
+ * - For multicast/broadcast vCPUs, we cannot handle them via interrupt
+ *   posting, so return false.
+ *
+ *   Here are the details of the vector-hashing mechanism:
+ *   1. For lowest-priority interrupts, store all the possible destination
+ *      vCPUs in an array.
+ *   2. Use "gvec % max number of destination vCPUs" to find the right
+ *      destination vCPU in the array for the lowest-priority interrupt.
+ */
+static bool_t pi_find_dest_vcpu(struct domain *d, uint8_t dest_id,
+                                uint8_t dest_mode, uint8_t delivery_mode,
+                                uint8_t gvec, struct vcpu **dest_vcpu)
+{
+    struct vcpu *v, **dest_vcpu_array;
+    unsigned int dest_vcpu_num = 0;
+    int ret;
+
+    dest_vcpu_array = xzalloc_array(struct vcpu *, d->max_vcpus);
+    if ( !dest_vcpu_array )
+    {
+        dprintk(XENLOG_G_INFO,
+                "dom%d: failed to allocate memeory.\n", d->domain_id);
+        return 0;
+    }
+
+    for_each_vcpu ( d, v )
+    {
+        if ( !vlapic_match_dest(vcpu_vlapic(v), NULL, 0,
+                                dest_id, dest_mode) )
+            continue;
+
+        dest_vcpu_array[dest_vcpu_num++] = v;
+    }
+
+    if ( delivery_mode == dest_LowestPrio )
+    {
+        if (  dest_vcpu_num != 0 )
+        {
+            *dest_vcpu = dest_vcpu_array[gvec % dest_vcpu_num];
+            ret = 1;
+        }
+        else
+            ret = 0;
+    }
+    else if (  dest_vcpu_num == 1 )
+    {
+        *dest_vcpu = dest_vcpu_array[0];
+        ret = 1;
+    }
+    else
+        ret = 0;
+
+    xfree(dest_vcpu_array);
+    return ret;
+}
+
 int pt_irq_create_bind(
     struct domain *d, xen_domctl_bind_pt_irq_t *pt_irq_bind)
 {
@@ -257,7 +325,7 @@ int pt_irq_create_bind(
     {
     case PT_IRQ_TYPE_MSI:
     {
-        uint8_t dest, dest_mode;
+        uint8_t dest, dest_mode, delivery_mode;
         int dest_vcpu_id;
 
         if ( !(pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )
@@ -330,11 +398,40 @@ int pt_irq_create_bind(
         /* Calculate dest_vcpu_id for MSI-type pirq migration. */
         dest = pirq_dpci->gmsi.gflags & VMSI_DEST_ID_MASK;
         dest_mode = !!(pirq_dpci->gmsi.gflags & VMSI_DM_MASK);
+        delivery_mode = (pirq_dpci->gmsi.gflags >> GFLAGS_SHIFT_DELIV_MODE) &
+                        VMSI_DELIV_MASK;
         dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
         pirq_dpci->gmsi.dest_vcpu_id = dest_vcpu_id;
         spin_unlock(&d->event_lock);
         if ( dest_vcpu_id >= 0 )
             hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]);
+
+        /* Use interrupt posting if it is supported */
+        if ( iommu_intpost )
+        {
+            struct vcpu *vcpu = NULL;
+
+            if ( !pi_find_dest_vcpu(d, dest, dest_mode, delivery_mode,
+                                    pirq_dpci->gmsi.gvec, &vcpu) )
+            {
+                dprintk(XENLOG_G_WARNING,
+                        "%pv: failed to find the dest vCPU for PI, guest "
+                        "vector:%u use software way to deliver the "
+                        " interrupts.\n", vcpu, pirq_dpci->gmsi.gvec);
+                break;
+            }
+
+            if ( pi_update_irte( vcpu, info, pirq_dpci->gmsi.gvec ) != 0 )
+            {
+                dprintk(XENLOG_G_WARNING,
+                        "%pv: failed to update PI IRTE, guest vector:%u "
+                        "use software way to deliver the interrupts.\n",
+                        vcpu, pirq_dpci->gmsi.gvec);
+
+                break;
+            }
+        }
+
         break;
     }
 
-- 
2.1.0
