From: "Wu, Feng" <feng.wu@intel.com>
To: "Tian, Kevin" <kevin.tian@intel.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Cc: "Zhang, Yang Z" <yang.z.zhang@intel.com>,
	"Wu, Feng" <feng.wu@intel.com>, "keir@xen.org" <keir@xen.org>,
	"JBeulich@suse.com" <JBeulich@suse.com>
Subject: Re: [RFC v1 08/15] Update IRTE according to guest interrupt config changes
Date: Thu, 2 Apr 2015 08:02:51 +0000
Message-ID: <E959C4978C3B6342920538CF579893F002487127@SHSMSX104.ccr.corp.intel.com>
In-Reply-To: <AADFC41AFE54684AB9EE6CBC0274A5D1261FB859@SHSMSX101.ccr.corp.intel.com>



> -----Original Message-----
> From: Tian, Kevin
> Sent: Thursday, April 02, 2015 2:50 PM
> To: Wu, Feng; xen-devel@lists.xen.org
> Cc: JBeulich@suse.com; keir@xen.org; Zhang, Yang Z
> Subject: RE: [RFC v1 08/15] Update IRTE according to guest interrupt config
> changes
> 
> > From: Wu, Feng
> > Sent: Thursday, April 02, 2015 2:21 PM
> >
> >
> >
> > > -----Original Message-----
> > > From: Tian, Kevin
> > > Sent: Thursday, April 02, 2015 1:52 PM
> > > To: Wu, Feng; xen-devel@lists.xen.org
> > > Cc: JBeulich@suse.com; keir@xen.org; Zhang, Yang Z
> > > Subject: RE: [RFC v1 08/15] Update IRTE according to guest interrupt config
> > > changes
> > >
> > > > From: Wu, Feng
> > > > Sent: Wednesday, March 25, 2015 8:32 PM
> > > >
> > > > When guest changes its interrupt configuration (such as, vector, etc.)
> > > > for direct-assigned devices, we need to update the associated IRTE
> > > > with the new guest vector, so external interrupts from the assigned
> > > > devices can be injected to guests without VM-Exit.
> > > >
> > > > > For lowest-priority interrupts, we use a vector-hashing mechanism to find
> > > > > the destination vCPU. This follows the hardware behavior, since modern
> > > > > Intel CPUs use vector hashing to handle lowest-priority interrupts.
> > > > >
> > > > > Multicast/broadcast interrupts cannot be handled via interrupt posting,
> > > > > so we still use interrupt remapping for them.
> > > >
> > > > Signed-off-by: Feng Wu <feng.wu@intel.com>
> > > > ---
> > > > >  xen/drivers/passthrough/io.c | 77 +++++++++++++++++++++++++++++++++++++++++++-
> > > >  1 file changed, 76 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
> > > > index ae050df..1d9a132 100644
> > > > --- a/xen/drivers/passthrough/io.c
> > > > +++ b/xen/drivers/passthrough/io.c
> > > > @@ -26,6 +26,7 @@
> > > >  #include <asm/hvm/iommu.h>
> > > >  #include <asm/hvm/support.h>
> > > >  #include <xen/hvm/irq.h>
> > > > +#include <asm/io_apic.h>
> > > >
> > > >  static DEFINE_PER_CPU(struct list_head, dpci_list);
> > > >
> > > > @@ -199,6 +200,61 @@ void free_hvm_irq_dpci(struct hvm_irq_dpci
> > *dpci)
> > > >      xfree(dpci);
> > > >  }
> > > >
> > > > +/*
> > > > + * Here we handle the following cases:
> > > > > + * - For lowest-priority interrupts, we find the destination vCPU from the
> > > > > + *   guest vector using a vector-hashing mechanism and return true. This
> > > > > + *   follows the hardware behavior, since modern Intel CPUs use vector
> > > > > + *   hashing to handle lowest-priority interrupts.
> > > > > + * - Otherwise, for a single-destination interrupt, it is straightforward to
> > > > > + *   find the destination vCPU and return true.
> > > > > + * - Multicast/broadcast interrupts cannot be handled via interrupt
> > > > > + *   posting, so return false.
> > > > + */
> > > > +static bool_t pi_find_dest_vcpu(struct domain *d, uint8_t dest_id,
> > > > +                                uint8_t dest_mode, uint8_t
> > > > deliver_mode,
> > > > +                                uint32_t gvec, struct vcpu
> > > > **dest_vcpu)
> > > > +{
> > > > +    struct vcpu *v, **dest_vcpu_array;
> > > > +    unsigned int dest_vcpu_num = 0;
> > > > +    int ret;
> > > > +
> > > > +    if ( deliver_mode == dest_LowestPrio )
> > > > +        dest_vcpu_array = xzalloc_array(struct vcpu *, d->max_vcpus);
> > > > +
> > > > +    for_each_vcpu ( d, v )
> > > > +    {
> > > > +        if ( !vlapic_match_dest(vcpu_vlapic(v), NULL, 0,
> > > > +                                dest_id, dest_mode) )
> > > > +            continue;
> > > > +
> > > > > +        if ( deliver_mode == dest_LowestPrio )
> > > > > +            dest_vcpu_array[dest_vcpu_num] = v;
> > > > > +        else
> > > > > +            *dest_vcpu = v;
> > > > > +
> > > > > +        dest_vcpu_num++;
> > > > +    }
> > > > +
> > > > +    if ( deliver_mode == dest_LowestPrio )
> > > > +    {
> > > > +        if (  dest_vcpu_num != 0 )
> > > > +        {
> > > > +            *dest_vcpu = dest_vcpu_array[gvec % dest_vcpu_num];
> > > > +            ret = 1;
> > > > +        }
> > > > +        else
> > > > +            ret = 0;
> > > > +
> > > > +        xfree(dest_vcpu_array);
> > > > +        return ret;
> > > > +    }
> > > > +    else if (  dest_vcpu_num == 1 )
> > > > +        return 1;
> > > > +    else
> > > > +        return 0;
> > > > +}
> > > > +
> > > >  int pt_irq_create_bind(
> > > >      struct domain *d, xen_domctl_bind_pt_irq_t *pt_irq_bind)
> > > >  {
> > > > @@ -257,7 +313,7 @@ int pt_irq_create_bind(
> > > >      {
> > > >      case PT_IRQ_TYPE_MSI:
> > > >      {
> > > > -        uint8_t dest, dest_mode;
> > > > +        uint8_t dest, dest_mode, deliver_mode;
> > > >          int dest_vcpu_id;
> > > >
> > > >          if ( !(pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )
> > > > @@ -330,11 +386,30 @@ int pt_irq_create_bind(
> > > >          /* Calculate dest_vcpu_id for MSI-type pirq migration. */
> > > >          dest = pirq_dpci->gmsi.gflags & VMSI_DEST_ID_MASK;
> > > >          dest_mode = !!(pirq_dpci->gmsi.gflags & VMSI_DM_MASK);
> > > > > +        deliver_mode = (pirq_dpci->gmsi.gflags >> GFLAGS_SHIFT_DELIV_MODE) &
> > > > > +                        VMSI_DELIV_MASK;
> > > > >          dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
> > > >          pirq_dpci->gmsi.dest_vcpu_id = dest_vcpu_id;
> > > >          spin_unlock(&d->event_lock);
> > > >          if ( dest_vcpu_id >= 0 )
> > > >              hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]);
> > > > +
> > > > +        /* Use interrupt posting if it is supported */
> > > > +        if ( iommu_intpost )
> > > > +        {
> > > > +            struct vcpu *vcpu = NULL;
> > > > +
> > > > +            if ( !pi_find_dest_vcpu(d, dest, dest_mode, deliver_mode,
> > > > > +                                    pirq_dpci->gmsi.gvec, &vcpu) )
> > > > +                break;
> > > > +
> > >
> > > > Is it possible that this new pi_find_dest_vcpu will return a different target
> > > > from the earlier hvm_girq_dest_2_vcpu_id? If yes, it will cause tricky issues,
> > > > since earlier pirqs are migrated according to a different policy. We need to
> > > > consolidate the vcpu selection policies to keep them consistent.
> >
> > > In my understanding, what you described above is the software way to deliver
> > > interrupts to a vCPU. When posted interrupts are used, interrupts are
> > > delivered by hardware according to the settings in the IRTE, hence those
> > > software paths will not be touched for these interrupts. So do we need to
> > > care about how software might migrate the interrupts here?
> 
> Just curious why we can't use one policy for vcpu selection. If multicast
> handling is the difference, you may pass intpost as a parameter and use
> the same function.
> 

Digging into hvm_girq_dest_2_vcpu_id(), I find that it was introduced by
commit 023e3bc7, and it is just an optimization for interrupts with a single
destination. In most cases, the destination vCPU is determined by
vmsi_deliver().
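
To make the difference between the two selection policies concrete, here is a
minimal standalone sketch (not the Xen code). The names matched[],
single_dest_vcpu_id() and vector_hash_vcpu_id() are hypothetical, and
matched[] stands in for the result of running vlapic_match_dest() over every
vCPU. It assumes the single-destination lookup yields a negative ID unless
exactly one vCPU matches, which is what the "dest_vcpu_id >= 0" check in the
patch suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Single-destination policy (assumed behavior): return a vCPU ID only
 * when exactly one vCPU matches, otherwise -1.
 */
static int single_dest_vcpu_id(const uint8_t *matched, size_t nr_vcpus)
{
    int id = -1;
    size_t i, hits = 0;

    assert(matched != NULL);

    for ( i = 0; i < nr_vcpus; i++ )
    {
        if ( !matched[i] )
            continue;
        hits++;
        id = (int)i;
    }

    return hits == 1 ? id : -1;
}

/*
 * Vector hashing: among the matching vCPUs, pick entry (gvec % hits),
 * counting matches from 0 so that every index maps to a real vCPU.
 */
static int vector_hash_vcpu_id(const uint8_t *matched, size_t nr_vcpus,
                               uint8_t gvec)
{
    size_t i, hits = 0, want;

    assert(matched != NULL);

    for ( i = 0; i < nr_vcpus; i++ )
        if ( matched[i] )
            hits++;

    if ( hits == 0 )
        return -1;

    want = gvec % hits;
    for ( i = 0; i < nr_vcpus; i++ )
    {
        if ( !matched[i] )
            continue;
        if ( want == 0 )
            return (int)i;
        want--;
    }

    return -1; /* not reached: want < hits */
}
```

With vCPUs 1, 3 and 4 matching and guest vector 49, the single-destination
lookup gives up, while vector hashing picks index 49 % 3 = 1 among the
matches, i.e. vCPU 3. So the two paths can disagree only when more than one
vCPU matches.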

> >
> > >
> > > And why does failure to find dest_vcpu lead to a break rather than an error?
> >
> > We cannot post multicast/broadcast interrupts to a guest, and
> > pi_find_dest_vcpu() returns 0 when encountering a multicast/broadcast
> > interrupt. In that case, we still use the interrupt remapping mechanism for it.
> 
> Then you might handle posted interrupts first, and if the interrupt is
> multicast or there is no intpost support, fall back to the software style.

That is a good suggestion. I will think more about how to handle this better.
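
One possible shape for that ordering, as a rough standalone sketch only: the
decision is made once, posting is tried first, and everything else falls back
to remapping. struct dest_info, can_post() and choose_irte_format() are
hypothetical names for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of destination matching for one guest MSI. */
struct dest_info {
    unsigned int nr_matches; /* vCPUs matching dest_id/dest_mode */
    bool lowest_prio;        /* delivery mode is lowest priority */
};

enum irte_format { IRTE_REMAPPED, IRTE_POSTED };

/*
 * Posting needs a single target vCPU: either exactly one match, or a
 * lowest-priority interrupt with at least one match to hash over.
 */
static bool can_post(const struct dest_info *di)
{
    if ( di->lowest_prio )
        return di->nr_matches != 0;

    return di->nr_matches == 1;
}

/* Try posting first; multicast or missing intpost support falls back. */
static enum irte_format choose_irte_format(bool intpost_supported,
                                           const struct dest_info *di)
{
    assert(di != NULL);

    if ( intpost_supported && can_post(di) )
        return IRTE_POSTED;

    return IRTE_REMAPPED;
}
```

The point of the sketch is only that multicast/broadcast and "no intpost" end
up on the same fallback path, instead of being handled by an early break.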

Thanks,
Feng

> 
> Thanks
> Kevin
