KVM Archive on lore.kernel.org
From: Alex Williamson <alex.williamson@redhat.com>
To: "Liu, Yi L" <yi.l.liu@intel.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	"jacob.jun.pan@linux.intel.com" <jacob.jun.pan@linux.intel.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"Raj, Ashok" <ashok.raj@intel.com>,
	"Tian, Jun J" <jun.j.tian@intel.com>,
	"Sun, Yi Y" <yi.y.sun@intel.com>,
	"jean-philippe@linaro.org" <jean-philippe@linaro.org>,
	"peterx@redhat.com" <peterx@redhat.com>,
	"Wu, Hao" <hao.wu@intel.com>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 06/14] vfio/type1: Add VFIO_IOMMU_PASID_REQUEST (alloc/free)
Date: Thu, 9 Jul 2020 08:27:51 -0600
Message-ID: <20200709082751.320742ab@x1.home>
In-Reply-To: <DM5PR11MB143584D5A0AAE13E0D2D04B7C3640@DM5PR11MB1435.namprd11.prod.outlook.com>

On Thu, 9 Jul 2020 07:16:31 +0000
"Liu, Yi L" <yi.l.liu@intel.com> wrote:

> Hi Alex,
> 
> After more thinking, it looks like adding an r-b tree is still not enough
> to solve the potential problem of freeing a range of PASIDs in one ioctl.
> If the caller gives [0, MAX_UINT] in the free request, the kernel still
> has to loop over all the PASIDs and search the r-b tree. Even if VFIO
> tracks the smallest/largest allocated PASID and limits the free range
> accordingly, it is still not efficient. For example, say the user has
> allocated two PASIDs (1 and 999) and gives the [0, MAX_UINT] range in the
> free request. VFIO will limit the free range to [1, 999], but it still
> needs to loop over PASIDs 1 - 999 and search the r-b tree for each.

That sounds like a poor tree implementation.  Look at vfio_find_dma()
for instance, it returns a node within the specified range.  If the
tree has two nodes within the specified range we should never need to
call a search function like vfio_find_dma() more than three times.  We
call it once, get the first node, remove it.  Call it again, get the
other node, remove it.  Call a third time, find no matches, we're done.
So such an implementation limits searches to N+1 where N is the number
of nodes within the range.

> So I'm wondering whether we can fall back to the prior proposal, which
> only frees one PASID per free request. What is your opinion?

Doesn't it still seem like it would be a useful user interface to have
a mechanism to free all pasids, by calling with exactly [0, MAX_UINT]?
I'm not sure if there's another use case for this given that the user
doesn't have strict control of the pasid values they get.  Thanks,

Alex

> > From: Liu, Yi L <yi.l.liu@intel.com>
> > Sent: Thursday, July 9, 2020 10:26 AM
> > 
> > Hi Kevin,
> >   
> > > From: Tian, Kevin <kevin.tian@intel.com>
> > > Sent: Thursday, July 9, 2020 10:18 AM
> > >  
> > > > From: Liu, Yi L <yi.l.liu@intel.com>
> > > > Sent: Thursday, July 9, 2020 10:08 AM
> > > >
> > > > Hi Kevin,
> > > >  
> > > > > From: Tian, Kevin <kevin.tian@intel.com>
> > > > > Sent: Thursday, July 9, 2020 9:57 AM
> > > > >  
> > > > > > From: Liu, Yi L <yi.l.liu@intel.com>
> > > > > > Sent: Thursday, July 9, 2020 8:32 AM
> > > > > >
> > > > > > Hi Alex,
> > > > > >  
> > > > > > > Alex Williamson <alex.williamson@redhat.com>
> > > > > > > Sent: Thursday, July 9, 2020 3:55 AM
> > > > > > >
> > > > > > > On Wed, 8 Jul 2020 08:16:16 +0000 "Liu, Yi L"
> > > > > > > <yi.l.liu@intel.com> wrote:
> > > > > > >  
> > > > > > > > Hi Alex,
> > > > > > > >  
> > > > > > > > > From: Liu, Yi L <yi.l.liu@intel.com>
> > > > > > > > > Sent: Friday, July 3, 2020 2:28 PM
> > > > > > > > >
> > > > > > > > > Hi Alex,
> > > > > > > > >  
> > > > > > > > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > > > > > > > Sent: Friday, July 3, 2020 5:19 AM
> > > > > > > > > >
> > > > > > > > > > On Wed, 24 Jun 2020 01:55:19 -0700 Liu Yi L
> > > > > > > > > > <yi.l.liu@intel.com> wrote:
> > > > > > > > > >  
> > > > > > > > > > > This patch allows user space to request PASID
> > > > > > > > > > > allocation/free, e.g. when serving the request from
> > > > > > > > > > > the guest.
> > > > > > > > > > >
> > > > > > > > > > > PASIDs that are not freed by userspace are
> > > > > > > > > > > automatically freed when the IOASID set is destroyed
> > > > > > > > > > > when process exits.
> > > > > > > > [...]  
> > > > > > > > > > > +static int vfio_iommu_type1_pasid_request(struct vfio_iommu *iommu,
> > > > > > > > > > > +					  unsigned long arg)
> > > > > > > > > > > +{
> > > > > > > > > > > +	struct vfio_iommu_type1_pasid_request req;
> > > > > > > > > > > +	unsigned long minsz;
> > > > > > > > > > > +
> > > > > > > > > > > +	minsz = offsetofend(struct vfio_iommu_type1_pasid_request, range);
> > > > > > > > > > > +
> > > > > > > > > > > +	if (copy_from_user(&req, (void __user *)arg, minsz))
> > > > > > > > > > > +		return -EFAULT;
> > > > > > > > > > > +
> > > > > > > > > > > +	if (req.argsz < minsz || (req.flags & ~VFIO_PASID_REQUEST_MASK))
> > > > > > > > > > > +		return -EINVAL;
> > > > > > > > > > > +
> > > > > > > > > > > +	if (req.range.min > req.range.max)
> > > > > > > > > >
> > > > > > > > > > Is it exploitable that a user can spin the kernel for a
> > > > > > > > > > long time in the case of a free by calling this with [0,
> > > > > > > > > > MAX_UINT] regardless of their actual allocations?
> > > > > > > > >
> > > > > > > > > IOASID can ensure that user can only free the PASIDs
> > > > > > > > > allocated to the user. but it's true, kernel needs to loop
> > > > > > > > > all the PASIDs within the range provided by user. it may
> > > > > > > > > take a long time. is there anything we can do? one thing
> > > > > > > > > may limit the range provided by user?
> > > > > > > >
> > > > > > > > thought about it more, we have per-VM pasid quota (say
> > > > > > > > 1000), so even if user passed down [0, MAX_UINT], kernel
> > > > > > > > will only loop the 1000 pasids at most. do you think we
> > > > > > > > still need to do something on it?
> > > > > > >
> > > > > > > How do you figure that?  vfio_iommu_type1_pasid_request()
> > > > > > > accepts the user's min/max so long as (max > min) and passes
> > > > > > > that to vfio_iommu_type1_pasid_free(), then to
> > > > > > > vfio_pasid_free_range() which loops as:
> > > > > > >
> > > > > > > 	ioasid_t pasid = min;
> > > > > > > 	for (; pasid <= max; pasid++)
> > > > > > > 		ioasid_free(pasid);
> > > > > > >
> > > > > > > A user might only be able to allocate 1000 pasids, but
> > > > > > > apparently they can ask to free all they want.
> > > > > > >
> > > > > > > It's also not obvious to me that calling ioasid_free() is only
> > > > > > > allowing the user to free their own pasid.  Does it?  It
> > > > > > > would be a pretty
> > > > >
> > > > > Agree. I thought ioasid_free should at least carry a token since
> > > > > the user space is only allowed to manage PASIDs in its own set...
> > > > >
> > > > > > > gaping hole if a user could free arbitrary pasids.  An r-b tree
> > > > > > > of pasids might help both for security and to bound spinning in a loop.
> > > > > >
> > > > > > oh, yes. BTW, instead of an r-b tree in VFIO, maybe we can add an
> > > > > > ioasid_set parameter to ioasid_free(), to prevent the user from
> > > > > > freeing PASIDs that don't belong to it. I remember Jacob
> > > > > > mentioned it before.
> > > > > >  
> > > > >
> > > > > check current ioasid_free:
> > > > >
> > > > >         spin_lock(&ioasid_allocator_lock);
> > > > >         ioasid_data = xa_load(&active_allocator->xa, ioasid);
> > > > >         if (!ioasid_data) {
> > > > >                 pr_err("Trying to free unknown IOASID %u\n", ioasid);
> > > > >                 goto exit_unlock;
> > > > >         }
> > > > >
> > > > > Allowing a user to trigger the above lock path MAX_UINT times
> > > > > might still be bad.
> > > >
> > > > yeah, how about the below two options:
> > > >
> > > > - compare max - min with the quota before calling ioasid_free().
> > > >   If max - min > the user's current quota, fail the request. If
> > > >   max - min < quota, call ioasid_free() one by one. This still
> > > >   triggers the above lock path up to quota times.
> > >
> > > This is definitely wrong. [min, max] is about the range of the PASID
> > > value, while quota is about the number of allocated PASIDs. It's a bit
> > > weird to mix two together.  
> > 
> > got it.
> >   
> > > btw what is the main purpose of allowing batch PASID free requests?
> > > Can we just simplify to allow one PASID in each free, just like how
> > > it is done in the allocation path?
> > 
> > the intention was to reuse the [min, max] range from the allocation path.
> > currently, we don't have such a requirement as far as I can see.
> >   
> > > >
> > > > - pass max and min to ioasid_free() and let ioasid_free() decide. This
> > > >   should avoid taking the lock multiple times, and ioasid keeps track
> > > >   of how many PASIDs have been allocated; if max - min is larger than
> > > >   the allocated number, it should fail anyway.
> > >
> > > What about Alex's r-b tree suggestion? Is there any downside in your mind?
> > 
> > no downside, I just wanted to reuse the tracking in ioasid_set. I can add
> > an r-b tree for allocated PASIDs and free only the PASIDs found in that
> > tree; others in the range would be ignored.
> > does it look good?
> > 
> > Regards,
> > Yi Liu
> >   
> > > Thanks,
> > > Kevin  
> 


