iommu.lists.linux-foundation.org archive mirror
From: "Liu, Yi L" <yi.l.liu@intel.com>
To: "Tian, Kevin" <kevin.tian@intel.com>,
	Alex Williamson <alex.williamson@redhat.com>
Cc: "jean-philippe@linaro.org" <jean-philippe@linaro.org>,
	"Raj, Ashok" <ashok.raj@intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Sun, Yi Y" <yi.y.sun@intel.com>, "Wu, Hao" <hao.wu@intel.com>,
	"Tian,  Jun J" <jun.j.tian@intel.com>
Subject: RE: [PATCH v2 02/15] iommu: Report domain nesting info
Date: Tue, 16 Jun 2020 02:24:51 +0000
Message-ID: <DM5PR11MB1435C08C428B34EA4546BB49C39D0@DM5PR11MB1435.namprd11.prod.outlook.com>
In-Reply-To: <MWHPR11MB16456A9F54BA70D5381F2D758C9D0@MWHPR11MB1645.namprd11.prod.outlook.com>

> From: Tian, Kevin <kevin.tian@intel.com>
> Sent: Tuesday, June 16, 2020 9:56 AM
> 
> > From: Liu, Yi L <yi.l.liu@intel.com>
> > Sent: Monday, June 15, 2020 2:05 PM
> >
> > Hi Kevin,
> >
> > > From: Tian, Kevin <kevin.tian@intel.com>
> > > Sent: Monday, June 15, 2020 9:23 AM
> > >
> > > > From: Liu, Yi L <yi.l.liu@intel.com>
> > > > Sent: Friday, June 12, 2020 5:05 PM
> > > >
> > > > Hi Alex,
> > > >
> > > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > > Sent: Friday, June 12, 2020 3:30 AM
> > > > >
> > > > > On Thu, 11 Jun 2020 05:15:21 -0700
> > > > > Liu Yi L <yi.l.liu@intel.com> wrote:
> > > > >
> > > > > > > IOMMUs that support nesting translation need to report the
> > > > > > > capability info to userspace, e.g. the format of first level/stage
> > > > > > > paging structures.
> > > > > >
> > > > > > Cc: Kevin Tian <kevin.tian@intel.com>
> > > > > > CC: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > > > > > Cc: Alex Williamson <alex.williamson@redhat.com>
> > > > > > Cc: Eric Auger <eric.auger@redhat.com>
> > > > > > Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > > > > > Cc: Joerg Roedel <joro@8bytes.org>
> > > > > > Cc: Lu Baolu <baolu.lu@linux.intel.com>
> > > > > > Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> > > > > > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > > > > > ---
> > > > > > @Jean, Eric: as nesting was introduced for ARM, but looks like no
> > > > > > actual user of it. right? So I'm wondering if we can reuse
> > > > > > > DOMAIN_ATTR_NESTING to retrieve nesting info? how about your
> > > > > > > opinions?
> > > > > >
> > > > > >  include/linux/iommu.h      |  1 +
> > > > > > >  include/uapi/linux/iommu.h | 34 ++++++++++++++++++++++++++++++++++
> > > > > >  2 files changed, 35 insertions(+)
> > > > > >
> > > > > > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > > > > > > index 78a26ae..f6e4b49 100644
> > > > > > --- a/include/linux/iommu.h
> > > > > > +++ b/include/linux/iommu.h
> > > > > > @@ -126,6 +126,7 @@ enum iommu_attr {
> > > > > >  	DOMAIN_ATTR_FSL_PAMUV1,
> > > > > >  	DOMAIN_ATTR_NESTING,	/* two stages of translation */
> > > > > >  	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> > > > > > +	DOMAIN_ATTR_NESTING_INFO,
> > > > > >  	DOMAIN_ATTR_MAX,
> > > > > >  };
> > > > > >
> > > > > > > diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> > > > > > > index 303f148..02eac73 100644
> > > > > > --- a/include/uapi/linux/iommu.h
> > > > > > +++ b/include/uapi/linux/iommu.h
> > > > > > @@ -332,4 +332,38 @@ struct iommu_gpasid_bind_data {
> > > > > >  	};
> > > > > >  };
> > > > > >
> > > > > > +struct iommu_nesting_info {
> > > > > > +	__u32	size;
> > > > > > +	__u32	format;
> > > > > > +	__u32	features;
> > > > > > +#define IOMMU_NESTING_FEAT_SYSWIDE_PASID	(1 << 0)
> > > > > > +#define IOMMU_NESTING_FEAT_BIND_PGTBL		(1 << 1)
> > > > > > > +#define IOMMU_NESTING_FEAT_CACHE_INVLD		(1 << 2)
> > > > > > +	__u32	flags;
> > > > > > +	__u8	data[];
> > > > > > +};
> > > > > > +
> > > > > > +/*
> > > > > > + * @flags:	VT-d specific flags. Currently reserved for future
> > > > > > + *		extension.
> > > > > > > + * @addr_width:	The output addr width of first level/stage translation
> > > > > > > + * @pasid_bits:	Maximum supported PASID bits, 0 represents no PASID
> > > > > > > + *		support.
> > > > > > > + * @cap_reg:	Describe basic capabilities as defined in VT-d capability
> > > > > > > + *		register.
> > > > > > > + * @cap_mask:	Mark valid capability bits in @cap_reg.
> > > > > > > + * @ecap_reg:	Describe the extended capabilities as defined in VT-d
> > > > > > > + *		extended capability register.
> > > > > > + * @ecap_mask:	Mark the valid capability bits in @ecap_reg.
> > > > >
> > > > > > Please explain this a little further, why do we need to tell
> > > > > > userspace about cap/ecap register bits that aren't valid through this
> > > > > > interface?
> > > > > Thanks,
> > > >
> > > > > we only want to tell userspace about the bits marked in the
> > > > > cap/ecap_mask.
> > > > cap/ecap_mask is kind of white-list of the cap/ecap register.
> > > > userspace should only care about the bits in the white-list, for other
> > > > bits, it should ignore.
> > > >
> > > > Regards,
> > > > Yi Liu
> > >
> > > For invalid bits if kernel just clears them then do we still need additional
> > > mask bits to explicitly mark them out? I guess this might be the point that
> > > Alex asked...
> >
> > For invalid bits, kernel will clear them. But I think the mask bits are
> > still necessary. The mask bits tell user space which bits are related to
> > nesting. Without them, user space may have no idea about it.
> 
> userspace should know which bit is related to nesting and then should
> check that bit explicitly...

OK, so userspace can learn which bits are nesting related from its
understanding of the spec, right? If user space can get that, then I think
the cap/ecap mask bits are unnecessary.

> >
> > Maybe talking about QEMU's usage of the cap/ecap bits would help. QEMU vIOMMU
> > decides cap/ecap bits according to the QEMU cmdline. But not all of them are
> > compatible with hardware support, especially when the vIOMMU is built on
> > nesting. So it needs to sync the cap/ecap bits with the host side. Based on
> > the mask bits, QEMU can compare the cap/ecap bits configured by QEMU cmdline
> > with the cap/ecap bits reported by this interface. This comparison is limited
> > to the nesting related bits in cap/ecap, the other bits are not included
> > and can use the configuration by QEMU cmdline.
> 
> I didn't get this explanation. Based on patch [15/15], nesting capabilities
> are defined as:
> +/* Nesting Support Capability Alignment */
> +#define VTD_CAP_FL1GP		(1ULL << 56)
> +#define VTD_CAP_FL5LP		(1ULL << 60)
> +#define VTD_ECAP_PRS		(1ULL << 29)
> +#define VTD_ECAP_ERS		(1ULL << 30)
> +#define VTD_ECAP_SRS		(1ULL << 31)
> +#define VTD_ECAP_EAFS		(1ULL << 34)
> +#define VTD_ECAP_PASID		(1ULL << 40)
> 
> When QEMU gets a cmdline option it knows which bit out of above
> list should be checked against hardware capability. Then just do the
> check bit-by-bit. Why do we need mask bit in uapi to tell which bits
> are valid?

As in my reply above, if userspace has its own check list for the cap/ecap
bits, then the mask bits are not necessary.
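
To make the bit-by-bit check concrete, here is a minimal sketch of what
userspace could do without any mask in the uapi. The VTD_* defines are
copied from the patch [15/15] excerpt quoted above. The *_NESTING_BITS
groupings and both helper names are made up for illustration and are not
existing QEMU or kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Nesting related bits, as quoted from patch [15/15] in this thread. */
#define VTD_CAP_FL1GP		(1ULL << 56)
#define VTD_CAP_FL5LP		(1ULL << 60)
#define VTD_ECAP_PRS		(1ULL << 29)
#define VTD_ECAP_ERS		(1ULL << 30)
#define VTD_ECAP_SRS		(1ULL << 31)
#define VTD_ECAP_EAFS		(1ULL << 34)
#define VTD_ECAP_PASID		(1ULL << 40)

/* Assumption: userspace keeps its own check list instead of a uapi mask. */
#define VTD_CAP_NESTING_BITS	(VTD_CAP_FL1GP | VTD_CAP_FL5LP)
#define VTD_ECAP_NESTING_BITS	(VTD_ECAP_PRS | VTD_ECAP_ERS | VTD_ECAP_SRS | \
				 VTD_ECAP_EAFS | VTD_ECAP_PASID)

/*
 * Hypothetical helper: true if every nesting related bit requested in
 * @want is also set in @host. Bits outside @nesting_bits are ignored.
 */
static bool nesting_caps_compatible(uint64_t want, uint64_t host,
				    uint64_t nesting_bits)
{
	return ((want & nesting_bits) & ~host) == 0;
}

/*
 * Hypothetical helper: clear the nesting related bits in @want that
 * @host does not support, leaving all other bits as configured.
 */
static uint64_t nesting_caps_mask_out(uint64_t want, uint64_t host,
				      uint64_t nesting_bits)
{
	return want & (host | ~nesting_bits);
}
```

At machine init the second helper would trim the cmdline-configured
value, and at hot-plug time the first helper would decide whether the
new device can be accepted.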

> Unless 0/1 doesn't represent validity of some bit. Do we
> have such example?

Yes, like the PASID bits: that is 20 bits. But we already have pasid_bits
in the iommu_nesting_info_vtd structure, so it's not covered in the
ecap bits.
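
As a rough illustration of why such a field does not fit a per-bit 0/1
check: a width has to be compared numerically. The sketch below assumes
the guest-visible width must not exceed the host's pasid_bits, with 0
meaning no PASID support as in the comment on iommu_nesting_info_vtd.
The helper name and the comparison rule are assumptions for
illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: a guest PASID width fits if it does not exceed
 * the host width. A guest width of 0 (no PASID) always fits; a host
 * width of 0 rejects any non-zero guest width.
 */
static bool pasid_width_compatible(uint16_t guest_bits, uint16_t host_bits)
{
	return guest_bits <= host_bits;
}
```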

Regards,
Yi Liu

> >
> > The link below shows the current Intel vIOMMU usage of the cap/ecap bits.
> > For each assigned device, vIOMMU will compare the nesting related bits in
> > cap/ecap and mask out the bits which hardware doesn't support. After the
> > machine is initialized, the vIOMMU cap/ecap bits are determined. If user
> > hot-plugs devices to the VM, vIOMMU will fail it if the hardware cap/ecap bits
> > behind the hot-plugged device are not compatible with the determined vIOMMU
> > cap/ecap bits.
> >
> > https://www.spinics.net/lists/kvm/msg218294.html
> >
> > Regards,
> > Yi Liu
> >
> > > >
> > > > > Alex
> > > > >
> > > > >
> > > > > > + */
> > > > > > +struct iommu_nesting_info_vtd {
> > > > > > +	__u32	flags;
> > > > > > +	__u16	addr_width;
> > > > > > +	__u16	pasid_bits;
> > > > > > +	__u64	cap_reg;
> > > > > > +	__u64	cap_mask;
> > > > > > +	__u64	ecap_reg;
> > > > > > +	__u64	ecap_mask;
> > > > > > +};
> > > > > > +
> > > > > >  #endif /* _UAPI_IOMMU_H */
