All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Duan, Zhenzhong" <zhenzhong.duan@intel.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"clg@redhat.com" <clg@redhat.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"peterx@redhat.com" <peterx@redhat.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"jgg@nvidia.com" <jgg@nvidia.com>,
	"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
	"joao.m.martins@oracle.com" <joao.m.martins@oracle.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>,
	"Sun, Yi Y" <yi.y.sun@intel.com>,
	"Peng, Chao P" <chao.p.peng@intel.com>,
	Yi Sun <yi.y.sun@linux.intel.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Richard Henderson <richard.henderson@linaro.org>,
	Eduardo Habkost <eduardo@habkost.net>,
	Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Subject: Re: [PATCH v1 3/6] intel_iommu: Add a framework to check and sync host IOMMU cap/ecap
Date: Wed, 13 Mar 2024 07:17:43 -0400	[thread overview]
Message-ID: <20240313071647-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <SJ0PR11MB6744F2805D8EF6722C725DAC922A2@SJ0PR11MB6744.namprd11.prod.outlook.com>

On Wed, Mar 13, 2024 at 07:54:11AM +0000, Duan, Zhenzhong wrote:
> 
> 
> >-----Original Message-----
> >From: Michael S. Tsirkin <mst@redhat.com>
> >Subject: Re: [PATCH v1 3/6] intel_iommu: Add a framework to check and
> >sync host IOMMU cap/ecap
> >
> >On Wed, Mar 13, 2024 at 02:52:39AM +0000, Duan, Zhenzhong wrote:
> >> Hi Michael,
> >>
> >> >-----Original Message-----
> >> >From: Michael S. Tsirkin <mst@redhat.com>
> >> >Subject: Re: [PATCH v1 3/6] intel_iommu: Add a framework to check and
> >> >sync host IOMMU cap/ecap
> >> >
> >> >On Wed, Feb 28, 2024 at 05:44:29PM +0800, Zhenzhong Duan wrote:
> >> >> From: Yi Liu <yi.l.liu@intel.com>
> >> >>
> >> >> Add a framework to check and synchronize host IOMMU cap/ecap with
> >> >> vIOMMU cap/ecap.
> >> >>
> >> >> The sequence will be:
> >> >>
> >> >> vtd_cap_init() initializes iommu->cap/ecap.
> >> >> vtd_check_hdev() update iommu->cap/ecap based on host cap/ecap.
> >> >> iommu->cap_frozen set when machine create done, iommu->cap/ecap
> >> >become readonly.
> >> >>
> >> >> Implementation details for different backends will be in following
> >patches.
> >> >>
> >> >> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> >> >> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> >> >> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> >> >> ---
> >> >>  include/hw/i386/intel_iommu.h |  1 +
> >> >>  hw/i386/intel_iommu.c         | 50
> >> >++++++++++++++++++++++++++++++++++-
> >> >>  2 files changed, 50 insertions(+), 1 deletion(-)
> >> >>
> >> >> diff --git a/include/hw/i386/intel_iommu.h
> >> >b/include/hw/i386/intel_iommu.h
> >> >> index bbc7b96add..c71a133820 100644
> >> >> --- a/include/hw/i386/intel_iommu.h
> >> >> +++ b/include/hw/i386/intel_iommu.h
> >> >> @@ -283,6 +283,7 @@ struct IntelIOMMUState {
> >> >>
> >> >>      uint64_t cap;                   /* The value of capability reg */
> >> >>      uint64_t ecap;                  /* The value of extended capability reg */
> >> >> +    bool cap_frozen;                /* cap/ecap become read-only after
> >frozen */
> >> >>
> >> >>      uint32_t context_cache_gen;     /* Should be in [1,MAX] */
> >> >>      GHashTable *iotlb;              /* IOTLB */
> >> >> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> >> >> index ffa1ad6429..a9f9dfd6a7 100644
> >> >> --- a/hw/i386/intel_iommu.c
> >> >> +++ b/hw/i386/intel_iommu.c
> >> >> @@ -35,6 +35,8 @@
> >> >>  #include "sysemu/kvm.h"
> >> >>  #include "sysemu/dma.h"
> >> >>  #include "sysemu/sysemu.h"
> >> >> +#include "hw/vfio/vfio-common.h"
> >> >> +#include "sysemu/iommufd.h"
> >> >>  #include "hw/i386/apic_internal.h"
> >> >>  #include "kvm/kvm_i386.h"
> >> >>  #include "migration/vmstate.h"
> >> >> @@ -3819,6 +3821,38 @@ VTDAddressSpace
> >> >*vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
> >> >>      return vtd_dev_as;
> >> >>  }
> >> >>
> >> >> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
> >> >> +                                 IOMMULegacyDevice *ldev,
> >> >> +                                 Error **errp)
> >> >> +{
> >> >> +    return 0;
> >> >> +}
> >> >> +
> >> >> +static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
> >> >> +                                  IOMMUFDDevice *idev,
> >> >> +                                  Error **errp)
> >> >> +{
> >> >> +    return 0;
> >> >> +}
> >> >> +
> >> >> +static int vtd_check_hdev(IntelIOMMUState *s,
> >VTDHostIOMMUDevice
> >> >*vtd_hdev,
> >> >> +                          Error **errp)
> >> >> +{
> >> >> +    HostIOMMUDevice *base_dev = vtd_hdev->dev;
> >> >> +    IOMMUFDDevice *idev;
> >> >> +
> >> >> +    if (base_dev->type == HID_LEGACY) {
> >> >> +        IOMMULegacyDevice *ldev = container_of(base_dev,
> >> >> +                                               IOMMULegacyDevice, base);
> >> >> +
> >> >> +        return vtd_check_legacy_hdev(s, ldev, errp);
> >> >> +    }
> >> >> +
> >> >> +    idev = container_of(base_dev, IOMMUFDDevice, base);
> >> >> +
> >> >> +    return vtd_check_iommufd_hdev(s, idev, errp);
> >> >> +}
> >> >> +
> >> >>  static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int
> >> >devfn,
> >> >>                                      HostIOMMUDevice *base_dev, Error **errp)
> >> >>  {
> >> >> @@ -3829,6 +3863,7 @@ static int
> >vtd_dev_set_iommu_device(PCIBus
> >> >*bus, void *opaque, int devfn,
> >> >>          .devfn = devfn,
> >> >>      };
> >> >>      struct vtd_as_key *new_key;
> >> >> +    int ret;
> >> >>
> >> >>      assert(base_dev);
> >> >>
> >> >> @@ -3848,6 +3883,13 @@ static int
> >vtd_dev_set_iommu_device(PCIBus
> >> >*bus, void *opaque, int devfn,
> >> >>      vtd_hdev->iommu_state = s;
> >> >>      vtd_hdev->dev = base_dev;
> >> >>
> >> >> +    ret = vtd_check_hdev(s, vtd_hdev, errp);
> >> >> +    if (ret) {
> >> >> +        g_free(vtd_hdev);
> >> >> +        vtd_iommu_unlock(s);
> >> >> +        return ret;
> >> >> +    }
> >> >> +
> >> >>      new_key = g_malloc(sizeof(*new_key));
> >> >>      new_key->bus = bus;
> >> >>      new_key->devfn = devfn;
> >> >
> >> >
> >> >Okay. So when VFIO device is created, it will call
> >vtd_dev_set_iommu_device
> >> >and that in turn will update caps.
> >> >
> >> >
> >> >
> >> >
> >> >> @@ -4083,7 +4125,9 @@ static void vtd_init(IntelIOMMUState *s)
> >> >>      s->iq_dw = false;
> >> >>      s->next_frcd_reg = 0;
> >> >>
> >> >> -    vtd_cap_init(s);
> >> >> +    if (!s->cap_frozen) {
> >> >> +        vtd_cap_init(s);
> >> >> +    }
> >> >>
> >> >
> >> >If it's fronzen it's because VFIO was added after machine done.
> >> >And then what? I think caps are just wrong?
> >>
> >> Not quite get your question on caps being wrong. But try to explains:
> >>
> >> When a hot plugged vfio device's host iommu cap isn't compatible with
> >> vIOMMU's, hotplug should fail. Currently there is no check for this and
> >> allow hotplug to succeed, but then some issue will reveal later,
> >> e.g., vIOMMU's MGAW > host IOMMU's MGAW, guest can setup iova
> >> mapping beyond host supported iova range, then DMA will fail.
> >>
> >> In fact, before this series, cap is not impacted by VFIO, so it's same effect of
> >> frozen after machine done.
> >>
> >> >
> >> >
> >> >I think the way to approach this is just by specifying this
> >> >as an option on command line.
> >>
> >> Do you mean add a cap_frozen property to intel_iommu?
> >> Vtd_init() is called in realize() and system reset(), so I utilize realize() to init
> >cap
> >> and froze cap before system reset(). If cap_frozen is an option, when it's
> >set to
> >> false, cap could be updated every system reset and it's not a fix value any
> >more.
> >> This may break migration.
> >
> >No, I mean either
> >1. add some kind of vfio-iommu device that is not exposed to guest
> >   but is not hot pluggable
> 
> Not quite get, what will such vfio-iommu device be used for if not exposed to guest.

It will update the IOMMU.
And do so without need for tricky callbacks.

-- 
MST



  reply	other threads:[~2024-03-13 11:18 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-28  9:44 [PATCH v1 0/6] Check and sync host IOMMU cap/ecap with vIOMMU Zhenzhong Duan
2024-02-28  9:44 ` [PATCH v1 1/6] intel_iommu: Add set/unset_iommu_device callback Zhenzhong Duan
2024-02-28  9:44 ` [PATCH v1 2/6] intel_iommu: Extract out vtd_cap_init to initialize cap/ecap Zhenzhong Duan
2024-02-28  9:44 ` [PATCH v1 3/6] intel_iommu: Add a framework to check and sync host IOMMU cap/ecap Zhenzhong Duan
2024-03-12 17:03   ` Michael S. Tsirkin
2024-03-13  2:52     ` Duan, Zhenzhong
2024-03-13  7:07       ` Michael S. Tsirkin
2024-03-13  7:54         ` Duan, Zhenzhong
2024-03-13 11:17           ` Michael S. Tsirkin [this message]
2024-03-14  4:05             ` Duan, Zhenzhong
2024-03-18 13:20             ` Eric Auger
2024-03-28  7:20               ` Michael S. Tsirkin
2024-03-29  3:22                 ` Duan, Zhenzhong
2024-02-28  9:44 ` [PATCH v1 4/6] intel_iommu: Implement check and sync mechanism in iommufd mode Zhenzhong Duan
2024-02-28  9:44 ` [PATCH v1 5/6] intel_iommu: Use mgaw instead of s->aw_bits Zhenzhong Duan
2024-02-28  9:44 ` [PATCH v1 6/6] intel_iommu: Block migration if cap is updated Zhenzhong Duan
2024-03-04  6:35   ` Prasad Pandit
2024-03-04  8:11     ` Duan, Zhenzhong
2024-03-04  9:43       ` Prasad Pandit
2024-03-04 10:10         ` Duan, Zhenzhong
2024-03-04  4:17 ` [PATCH v1 0/6] Check and sync host IOMMU cap/ecap with vIOMMU Jason Wang
2024-03-04  6:13   ` Duan, Zhenzhong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240313071647-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=chao.p.peng@intel.com \
    --cc=clg@redhat.com \
    --cc=eduardo@habkost.net \
    --cc=eric.auger@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=joao.m.martins@oracle.com \
    --cc=kevin.tian@intel.com \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=nicolinc@nvidia.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=yi.l.liu@intel.com \
    --cc=yi.y.sun@intel.com \
    --cc=yi.y.sun@linux.intel.com \
    --cc=zhenzhong.duan@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.