KVM Archive on lore.kernel.org
From: "Liu, Yi L" <yi.l.liu@intel.com>
To: Auger Eric <eric.auger@redhat.com>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>,
	"joro@8bytes.org" <joro@8bytes.org>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	"jacob.jun.pan@linux.intel.com" <jacob.jun.pan@linux.intel.com>,
	"Raj, Ashok" <ashok.raj@intel.com>,
	"Tian, Jun J" <jun.j.tian@intel.com>,
	"Sun, Yi Y" <yi.y.sun@intel.com>,
	"jean-philippe@linaro.org" <jean-philippe@linaro.org>,
	"peterx@redhat.com" <peterx@redhat.com>,
	"Wu, Hao" <hao.wu@intel.com>,
	"stefanha@gmail.com" <stefanha@gmail.com>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH v4 05/15] vfio: Add PASID allocation/free support
Date: Tue, 7 Jul 2020 09:45:51 +0000
Message-ID: <CY4PR11MB14328D24A466BFF9461CF163C3660@CY4PR11MB1432.namprd11.prod.outlook.com>
In-Reply-To: <0d4ad4ac-ae89-a2ca-d302-94463ee5fc7f@redhat.com>

Hi Eric,

> From: Auger Eric <eric.auger@redhat.com>
> Sent: Monday, July 6, 2020 10:52 PM
> 
> Hi Yi,
> 
> On 7/4/20 1:26 PM, Liu Yi L wrote:
> > Shared Virtual Addressing (a.k.a. Shared Virtual Memory) allows sharing
> > multiple process virtual address spaces with the device for a simplified
> > programming model. PASID is used to tag a virtual address space in DMA
> > requests and to identify the related translation structure in IOMMU. When
> > a PASID-capable device is assigned to a VM, we want the same capability
> > of using PASID to tag guest process virtual address spaces to achieve
> > virtual SVA (vSVA).
> >
> > PASID management for guest is vendor specific. Some vendors (e.g. Intel
> > VT-d) require system-wide managed PASIDs across all devices, regardless
> > of whether a device is used by host or assigned to guest. Other vendors
> > (e.g. ARM SMMU) may allow PASIDs to be managed per-device, so they could
> > be fully delegated to the guest for assigned devices.
> >
> > For system-wide managed PASIDs, this patch introduces a vfio module to
> > handle explicit PASID alloc/free requests from guest. Allocated PASIDs
> > are associated with a process (or, mm_struct) in IOASID core. A vfio_mm
> > object is introduced to track mm_struct. Multiple VFIO containers within
> > a process share the same vfio_mm object.
> >
> > A quota mechanism is provided to prevent a malicious user from exhausting
> > available PASIDs. Currently the quota is a global parameter applied to
> > all VFIO devices. In the future per-device quota might be supported too.
> >
> > Cc: Kevin Tian <kevin.tian@intel.com>
> > CC: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Cc: Eric Auger <eric.auger@redhat.com>
> > Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > Cc: Joerg Roedel <joro@8bytes.org>
> > Cc: Lu Baolu <baolu.lu@linux.intel.com>
> > Suggested-by: Alex Williamson <alex.williamson@redhat.com>
> > Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> > ---
> > v3 -> v4:
> > *) fix lock leak in vfio_mm_get_from_task()
> > *) drop pasid_quota field in struct vfio_mm
> > *) vfio_mm_get_from_task() returns ERR_PTR(-ENOTTY) when !CONFIG_VFIO_PASID
> >
> > v1 -> v2:
> > *) added in v2, split from the pasid alloc/free support of v1
> > ---
> >  drivers/vfio/Kconfig      |   5 ++
> >  drivers/vfio/Makefile     |   1 +
> >  drivers/vfio/vfio_pasid.c | 152 ++++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/vfio.h      |  28 +++++++++
> >  4 files changed, 186 insertions(+)
> >  create mode 100644 drivers/vfio/vfio_pasid.c
> >
> > diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
> > index fd17db9..3d8a108 100644
> > --- a/drivers/vfio/Kconfig
> > +++ b/drivers/vfio/Kconfig
> > @@ -19,6 +19,11 @@ config VFIO_VIRQFD
> >  	depends on VFIO && EVENTFD
> >  	default n
> >
> > +config VFIO_PASID
> > +	tristate
> > +	depends on IOASID && VFIO
> > +	default n
> > +
> >  menuconfig VFIO
> >  	tristate "VFIO Non-Privileged userspace driver framework"
> >  	depends on IOMMU_API
> > diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
> > index de67c47..bb836a3 100644
> > --- a/drivers/vfio/Makefile
> > +++ b/drivers/vfio/Makefile
> > @@ -3,6 +3,7 @@ vfio_virqfd-y := virqfd.o
> >
> >  obj-$(CONFIG_VFIO) += vfio.o
> >  obj-$(CONFIG_VFIO_VIRQFD) += vfio_virqfd.o
> > +obj-$(CONFIG_VFIO_PASID) += vfio_pasid.o
> >  obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
> >  obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
> >  obj-$(CONFIG_VFIO_SPAPR_EEH) += vfio_spapr_eeh.o
> > diff --git a/drivers/vfio/vfio_pasid.c b/drivers/vfio/vfio_pasid.c
> > new file mode 100644
> > index 0000000..c46b870
> > --- /dev/null
> > +++ b/drivers/vfio/vfio_pasid.c
> > @@ -0,0 +1,152 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Copyright (C) 2020 Intel Corporation.
> > + *     Author: Liu Yi L <yi.l.liu@intel.com>
> > + *
> > + */
> > +
> > +#include <linux/vfio.h>
> > +#include <linux/eventfd.h>
> > +#include <linux/file.h>
> > +#include <linux/module.h>
> > +#include <linux/slab.h>
> > +#include <linux/sched/mm.h>
> > +
> > +#define DRIVER_VERSION  "0.1"
> > +#define DRIVER_AUTHOR   "Liu Yi L <yi.l.liu@intel.com>"
> > +#define DRIVER_DESC     "PASID management for VFIO bus drivers"
> > +
> > +#define VFIO_DEFAULT_PASID_QUOTA	1000
> > +static unsigned int pasid_quota = VFIO_DEFAULT_PASID_QUOTA;
> > +module_param_named(pasid_quota, pasid_quota, uint, 0444);
> > +MODULE_PARM_DESC(pasid_quota,
> > +		 " Set the quota for max number of PASIDs that an application is allowed to request (default 1000)");
> > +
> > +struct vfio_mm_token {
> > +	unsigned long long val;
> > +};
> > +
> > +struct vfio_mm {
> > +	struct kref		kref;
> > +	int			ioasid_sid;
> > +	struct list_head	next;
> > +	struct vfio_mm_token	token;
> > +};
> > +
> > +static struct vfio_pasid {
> > +	struct mutex		vfio_mm_lock;
> > +	struct list_head	vfio_mm_list;
> > +} vfio_pasid;
> > +
> > +/* called with vfio_pasid.vfio_mm_lock held */
> > +static void vfio_mm_release(struct kref *kref)
> > +{
> > +	struct vfio_mm *vmm = container_of(kref, struct vfio_mm, kref);
> > +
> > +	list_del(&vmm->next);
> > +	mutex_unlock(&vfio_pasid.vfio_mm_lock);
> > +	ioasid_free_set(vmm->ioasid_sid, true);
> > +	kfree(vmm);
> > +}
> > +
> > +void vfio_mm_put(struct vfio_mm *vmm)
> > +{
> > +	kref_put_mutex(&vmm->kref, vfio_mm_release, &vfio_pasid.vfio_mm_lock);
> > +}
> > +
> > +static void vfio_mm_get(struct vfio_mm *vmm)
> > +{
> > +	kref_get(&vmm->kref);
> > +}
> > +
> > +struct vfio_mm *vfio_mm_get_from_task(struct task_struct *task)
> > +{
> > +	struct mm_struct *mm = get_task_mm(task);
> > +	struct vfio_mm *vmm;
> > +	unsigned long long val = (unsigned long long) mm;
> > +	int ret;
> > +
> > +	mutex_lock(&vfio_pasid.vfio_mm_lock);
> > +	/* Search existing vfio_mm with current mm pointer */
> > +	list_for_each_entry(vmm, &vfio_pasid.vfio_mm_list, next) {
> > +		if (vmm->token.val == val) {
> > +			vfio_mm_get(vmm);
> > +			goto out;
> > +		}
> > +	}
> > +
> > +	vmm = kzalloc(sizeof(*vmm), GFP_KERNEL);
> > +	if (!vmm) {
> > +		vmm = ERR_PTR(-ENOMEM);
> > +		goto out;
> > +	}
> > +
> > +	/*
> > +	 * IOASID core provides a 'IOASID set' concept to track all
> > +	 * PASIDs associated with a token. Here we use mm_struct as
> > +	 * the token and create a IOASID set per mm_struct. All the
> > +	 * containers of the process share the same IOASID set.
> > +	 */
> > +	ret = ioasid_alloc_set((struct ioasid_set *) mm, pasid_quota,
> > +			       &vmm->ioasid_sid);
> > +	if (ret) {
> > +		kfree(vmm);
> > +		vmm = ERR_PTR(ret);
> > +		goto out;
> > +	}
> > +
> > +	kref_init(&vmm->kref);
> > +	vmm->token.val = val;
> > +
> > +	list_add(&vmm->next, &vfio_pasid.vfio_mm_list);
> > +out:
> > +	mutex_unlock(&vfio_pasid.vfio_mm_lock);
> > +	mmput(mm);
> > +	return vmm;
> > +}
> > +
> > +int vfio_pasid_alloc(struct vfio_mm *vmm, int min, int max)
> > +{
> > +	ioasid_t pasid;
> > +
> > +	pasid = ioasid_alloc(vmm->ioasid_sid, min, max, NULL);
> > +
> > +	return (pasid == INVALID_IOASID) ? -ENOSPC : pasid;
> > +}
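
For readers following the quota discussion, here is a rough userspace sketch of how a quota-bounded allocation with this error mapping could behave. It is only a model under stated assumptions: struct id_set, id_alloc(), pasid_alloc(), MODEL_MAX_ID and MODEL_INVALID are hypothetical names, and the bitmap does not reflect how the IOASID core is actually implemented. Only the "invalid ID becomes -ENOSPC" mapping is taken from vfio_pasid_alloc() above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_ID	64U	/* arbitrary size of the model's ID space */
#define MODEL_INVALID	(~0U)	/* plays the role of INVALID_IOASID */

/* Hypothetical stand-in for an IOASID set: a quota plus a usage bitmap. */
struct id_set {
	unsigned int quota;	/* max number of IDs this set may hold */
	unsigned int used;	/* number of IDs currently held */
	bool taken[MODEL_MAX_ID];
};

/* Return the lowest free ID in [min, max], or MODEL_INVALID. */
static unsigned int id_alloc(struct id_set *set, unsigned int min,
			     unsigned int max)
{
	unsigned int id;

	assert(set != NULL);

	if (set->used >= set->quota)
		return MODEL_INVALID;	/* quota exhausted */

	for (id = min; id <= max && id < MODEL_MAX_ID; id++) {
		if (!set->taken[id]) {
			set->taken[id] = true;
			set->used++;
			return id;
		}
	}
	return MODEL_INVALID;		/* range is empty or fully used */
}

/* Same error mapping as vfio_pasid_alloc(): invalid ID becomes -ENOSPC. */
static int pasid_alloc(struct id_set *set, unsigned int min, unsigned int max)
{
	unsigned int id = id_alloc(set, min, max);

	return (id == MODEL_INVALID) ? -ENOSPC : (int)id;
}
```

In this model, a set created with a quota of 2 hands out two IDs from the requested range and then fails with -ENOSPC, regardless of how many IDs remain free in the range.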
> > +
> > +void vfio_pasid_free_range(struct vfio_mm *vmm,
> > +			    ioasid_t min, ioasid_t max)
> > +{
> > +	ioasid_t pasid = min;
> > +
> > +	if (min > max)
> > +		return;
> nit: is that check really useful?

It looks redundant, since vfio_iommu_type1_pasid_request() already
performs the same check. I will remove it.
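
As a side note, a small userspace sketch (not part of the patch) suggests the loop is already a no-op when min > max, independent of the caller's check. free_one(), free_range() and count_freed() are hypothetical stand-ins; free_one() only counts calls where the real code would call ioasid_free(). The assert is an assumption of the sketch: with an unsigned counter, the "pasid <= max" condition can never become false when max is UINT_MAX.

```c
#include <assert.h>
#include <limits.h>

typedef unsigned int ioasid_t;

/* Hypothetical stand-in for ioasid_free(): it only counts the calls. */
static unsigned int freed_count;

static void free_one(ioasid_t pasid)
{
	(void)pasid;
	freed_count++;
}

/* Same loop shape as vfio_pasid_free_range(), without the min > max check. */
static void free_range(ioasid_t min, ioasid_t max)
{
	ioasid_t pasid;

	/*
	 * Sketch-only guard: pasid would wrap to 0 after UINT_MAX, so the
	 * loop condition below could never turn false for that bound.
	 */
	assert(max != UINT_MAX);

	for (pasid = min; pasid <= max; pasid++)
		free_one(pasid);
}

/* Reset the counter, free [min, max], and report how many IDs were freed. */
static unsigned int count_freed(ioasid_t min, ioasid_t max)
{
	freed_count = 0;
	free_range(min, max);
	return freed_count;
}
```

With min > max the loop condition is false on entry, so count_freed(10, 5) frees nothing, while count_freed(5, 10) frees the six IDs in the inclusive range.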

> > +
> > +	/*
> > +	 * IOASID core will notify PASID users (e.g. IOMMU driver) to
> > +	 * teardown necessary structures depending on the to-be-freed
> > +	 * PASID.
> > +	 */
> > +	for (; pasid <= max; pasid++)
> > +		ioasid_free(pasid);
> > +}
> > +
> > +static int __init vfio_pasid_init(void)
> > +{
> > +	mutex_init(&vfio_pasid.vfio_mm_lock);
> > +	INIT_LIST_HEAD(&vfio_pasid.vfio_mm_list);
> > +	return 0;
> > +}
> > +
> > +static void __exit vfio_pasid_exit(void)
> > +{
> > +	WARN_ON(!list_empty(&vfio_pasid.vfio_mm_list));
> Is it acceptable? Don't you need to cleanup everything here instead?

I think it is acceptable. VFIO_PASID is meant to be referenced by
VFIO_IOMMU_TYPE1 and possibly other modules. Once vfio_pasid_exit() is
triggered, its users (e.g. VFIO_IOMMU_TYPE1) have already been removed,
so all the vfio_mm instances should have been released by then. If any
remain, a vfio_mm has been leaked, which would be a bug in the user module.
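
To make the reasoning concrete, here is a minimal single-threaded userspace model of the get/put scheme (no locking, not the kernel code). struct obj, obj_get(), obj_put() and obj_leaked() are hypothetical names: obj_get() plays the role of vfio_mm_get_from_task(), obj_put() of vfio_mm_put(), and obj_leaked() of the list_empty() test behind the WARN_ON().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a vfio_mm-like object: one instance per token. */
struct obj {
	unsigned long token;
	unsigned int refcount;
	struct obj *next;
};

static struct obj *obj_list;	/* models vfio_pasid.vfio_mm_list */

/* Look up the object for @token, or create it; takes one reference. */
static struct obj *obj_get(unsigned long token)
{
	struct obj *o;

	for (o = obj_list; o; o = o->next) {
		if (o->token == token) {
			o->refcount++;
			return o;
		}
	}

	o = calloc(1, sizeof(*o));
	if (!o)
		return NULL;
	o->token = token;
	o->refcount = 1;
	o->next = obj_list;
	obj_list = o;
	return o;
}

/* Drop one reference; unlink and free the object on the last put. */
static void obj_put(struct obj *o)
{
	struct obj **pp;

	assert(o->refcount > 0);
	if (--o->refcount)
		return;

	for (pp = &obj_list; *pp; pp = &(*pp)->next) {
		if (*pp == o) {
			*pp = o->next;
			break;
		}
	}
	free(o);
}

/* Count objects still on the list; non-zero at exit would mean a leak. */
static unsigned int obj_leaked(void)
{
	unsigned int n = 0;
	struct obj *o;

	for (o = obj_list; o; o = o->next)
		n++;
	return n;
}
```

If every obj_get() is balanced by an obj_put(), the list is empty at exit time. A non-empty list means some user kept a reference it never dropped, which is the situation the WARN_ON() is meant to flag.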

Regards,
Yi Liu

> Thanks
> 
> Eric
> > +}
> > +
> > +module_init(vfio_pasid_init);
> > +module_exit(vfio_pasid_exit);
> > +
> > +MODULE_VERSION(DRIVER_VERSION);
> > +MODULE_LICENSE("GPL v2");
> > +MODULE_AUTHOR(DRIVER_AUTHOR);
> > +MODULE_DESCRIPTION(DRIVER_DESC);
> > diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> > index 38d3c6a..9da6468 100644
> > --- a/include/linux/vfio.h
> > +++ b/include/linux/vfio.h
> > @@ -97,6 +97,34 @@ extern int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
> >  extern void vfio_unregister_iommu_driver(
> >  				const struct vfio_iommu_driver_ops *ops);
> >
> > +struct vfio_mm;
> > +#if IS_ENABLED(CONFIG_VFIO_PASID)
> > +extern struct vfio_mm *vfio_mm_get_from_task(struct task_struct *task);
> > +extern void vfio_mm_put(struct vfio_mm *vmm);
> > +extern int vfio_pasid_alloc(struct vfio_mm *vmm, int min, int max);
> > +extern void vfio_pasid_free_range(struct vfio_mm *vmm,
> > +					ioasid_t min, ioasid_t max);
> > +#else
> > +static inline struct vfio_mm *vfio_mm_get_from_task(struct task_struct *task)
> > +{
> > +	return ERR_PTR(-ENOTTY);
> > +}
> > +
> > +static inline void vfio_mm_put(struct vfio_mm *vmm)
> > +{
> > +}
> > +
> > +static inline int vfio_pasid_alloc(struct vfio_mm *vmm, int min, int max)
> > +{
> > +	return -ENOTTY;
> > +}
> > +
> > +static inline void vfio_pasid_free_range(struct vfio_mm *vmm,
> > +					  ioasid_t min, ioasid_t max)
> > +{
> > +}
> > +#endif /* CONFIG_VFIO_PASID */
> > +
> >  /*
> >   * External user API
> >   */
> >
> Thanks
> 
> Eric



Thread overview: 43+ messages
2020-07-04 11:26 [PATCH v4 00/15] vfio: expose virtual Shared Virtual Addressing to VMs Liu Yi L
2020-07-04 11:26 ` [PATCH v4 01/15] vfio/type1: Refactor vfio_iommu_type1_ioctl() Liu Yi L
2020-07-06  9:34   ` Auger Eric
2020-07-06 12:27     ` Liu, Yi L
2020-07-06 12:55       ` Auger Eric
2020-07-06 13:00         ` Liu, Yi L
2020-07-04 11:26 ` [PATCH v4 02/15] iommu: Report domain nesting info Liu Yi L
2020-07-06  9:34   ` Auger Eric
2020-07-06 12:20     ` Liu, Yi L
2020-07-06 13:00       ` Auger Eric
2020-07-06 13:17         ` Liu, Yi L
2020-07-04 11:26 ` [PATCH v4 03/15] iommu/smmu: Report empty " Liu Yi L
2020-07-06 10:37   ` Auger Eric
2020-07-06 12:46     ` Liu, Yi L
2020-07-06 13:21       ` Auger Eric
2020-07-06 13:26         ` Liu, Yi L
2020-07-04 11:26 ` [PATCH v4 04/15] vfio/type1: Report iommu nesting info to userspace Liu Yi L
2020-07-06 10:37   ` Auger Eric
2020-07-06 13:10     ` Liu, Yi L
2020-07-06 13:45       ` Auger Eric
2020-07-07  9:31         ` Liu, Yi L
2020-07-08  8:08           ` Liu, Yi L
2020-07-08 19:29             ` Alex Williamson
2020-07-09  0:25               ` Liu, Yi L
2020-07-06 14:06   ` Auger Eric
2020-07-07  9:34     ` Liu, Yi L
2020-07-04 11:26 ` [PATCH v4 05/15] vfio: Add PASID allocation/free support Liu Yi L
2020-07-06 14:52   ` Auger Eric
2020-07-07  9:45     ` Liu, Yi L [this message]
2020-07-04 11:26 ` [PATCH v4 06/15] iommu/vt-d: Support setting ioasid set to domain Liu Yi L
2020-07-06 14:52   ` Auger Eric
2020-07-07  9:37     ` Liu, Yi L
2020-07-04 11:26 ` [PATCH v4 07/15] vfio/type1: Add VFIO_IOMMU_PASID_REQUEST (alloc/free) Liu Yi L
2020-07-06 15:17   ` Auger Eric
2020-07-07  9:51     ` Liu, Yi L
2020-07-04 11:26 ` [PATCH v4 08/15] iommu: Pass domain to sva_unbind_gpasid() Liu Yi L
2020-07-04 11:26 ` [PATCH v4 09/15] iommu/vt-d: Check ownership for PASIDs from user-space Liu Yi L
2020-07-04 11:26 ` [PATCH v4 10/15] vfio/type1: Support binding guest page tables to PASID Liu Yi L
2020-07-04 11:26 ` [PATCH v4 11/15] vfio/type1: Allow invalidating first-level/stage IOMMU cache Liu Yi L
2020-07-04 11:26 ` [PATCH v4 12/15] vfio/type1: Add vSVA support for IOMMU-backed mdevs Liu Yi L
2020-07-04 11:26 ` [PATCH v4 13/15] vfio/pci: Expose PCIe PASID capability to guest Liu Yi L
2020-07-04 11:26 ` [PATCH v4 14/15] vfio: Document dual stage control Liu Yi L
2020-07-04 11:26 ` [PATCH v4 15/15] iommu/vt-d: Support reporting nesting capability info Liu Yi L
