From: Alex Williamson <alex.williamson@redhat.com>
To: Shenming Lu <lushenming@huawei.com>
Cc: Cornelia Huck <cohuck@redhat.com>, Will Deacon <will@kernel.org>,
	"Robin Murphy" <robin.murphy@arm.com>,
	Joerg Roedel <joro@8bytes.org>,
	"Jean-Philippe Brucker" <jean-philippe@linaro.org>,
	Eric Auger <eric.auger@redhat.com>, <kvm@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<iommu@lists.linux-foundation.org>, <linux-api@vger.kernel.org>,
	Kevin Tian <kevin.tian@intel.com>,
	Lu Baolu <baolu.lu@linux.intel.com>, <yi.l.liu@intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	"Barry Song" <song.bao.hua@hisilicon.com>,
	<wanghaibin.wang@huawei.com>, <yuzenghui@huawei.com>
Subject: Re: [RFC PATCH v3 0/8] Add IOPF support for VFIO passthrough
Date: Mon, 24 May 2021 16:11:36 -0600
Message-ID: <20210524161136.03e9323d@x1.home.shazbot.org>
In-Reply-To: <accfb404-1d7b-8d73-6fb7-50011a3e546f@huawei.com>

On Fri, 21 May 2021 14:37:21 +0800
Shenming Lu <lushenming@huawei.com> wrote:

> Hi Alex,
> 
> On 2021/5/19 2:57, Alex Williamson wrote:
> > On Fri, 9 Apr 2021 11:44:12 +0800
> > Shenming Lu <lushenming@huawei.com> wrote:
> >   
> >> Hi,
> >>
> >> Requesting your comments and suggestions. :-)
> >>
> >> The static pinning and mapping problem in VFIO and possible solutions
> >> have been discussed a lot [1, 2]. One of the solutions is to add I/O
> >> Page Fault support for VFIO devices. Unlike relatively complicated
> >> software approaches such as presenting a vIOMMU that provides the DMA
> >> buffer information (possibly with para-virtualized optimizations),
> >> IOPF mainly depends on the hardware faulting capability, such as the
> >> PCIe PRI extension or the Arm SMMU stall model. What's more, IOPF
> >> support in the IOMMU driver has already been implemented for SVA [3].
> >> So in this series we add IOPF support for VFIO passthrough based on
> >> the IOPF part of SVA.  
> > 
> > The SVA proposals are being reworked to make use of a new IOASID
> > object, it's not clear to me that this shouldn't also make use of that
> > work as it does a significant expansion of the type1 IOMMU with fault
> > handlers that would duplicate new work using that new model.  
> 
> It seems that the IOASID extension for guest SVA would not affect this
> series; would we do any host-guest IOASID translation in the VFIO fault
> handler?

Surely it will; we don't currently have any IOMMU fault handling or
forwarding of IOMMU faults through to the vfio bus driver, and both of
those would be included in an IOASID implementation.  I think Jason's
vision is to use IOASID to deprecate type1 for all use cases, so even
if we were to choose to implement IOPF in type1, we should agree on
common interfaces with IOASID.
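
To make that concrete, here's a rough sketch of what a type1 fault
handler wired into the existing fault reporting interface might look
like.  This is purely illustrative: vfio_dma_fault_pin() is a
hypothetical helper, and a real implementation would also need fault
group and range tracking.

#include <linux/iommu.h>

/* hypothetical helper: pin the backing page(s) and map them at iova */
extern int vfio_dma_fault_pin(struct device *dev, u64 iova, u32 perm);

static int vfio_iommu_type1_fault(struct iommu_fault *fault, void *data)
{
	struct device *dev = data;
	struct iommu_page_response resp = {
		.argsz   = sizeof(resp),
		.version = IOMMU_PAGE_RESP_VERSION_1,
		.grpid   = fault->prm.grpid,
		.code    = IOMMU_PAGE_RESP_SUCCESS,
	};

	/* Only recoverable page requests can be handled here. */
	if (fault->type != IOMMU_FAULT_PAGE_REQ)
		return -EOPNOTSUPP;

	if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) {
		resp.flags |= IOMMU_PAGE_RESP_PASID_VALID;
		resp.pasid = fault->prm.pasid;
	}

	if (vfio_dma_fault_pin(dev, fault->prm.addr, fault->prm.perm))
		resp.code = IOMMU_PAGE_RESP_INVALID;

	return iommu_page_response(dev, &resp);
}

/* e.g. called when userspace enables IOPF on the container */
static int vfio_enable_iopf(struct device *dev)
{
	return iommu_register_device_fault_handler(dev,
						   vfio_iommu_type1_fault,
						   dev);
}

Whatever replaces type1 would need the same registration and response
plumbing, which is why the interfaces should be agreed on up front.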
 
> >> We have measured its performance with UADK [4] (passing through an
> >> accelerator to a VM (1U16G)) on a Hisilicon Kunpeng 920 board (and
> >> compared it with host SVA):
> >>
> >> Run hisi_sec_test...
> >>  - with varying sending times and message lengths
> >>  - with/without IOPF enabled (speed slowdown)
> >>
> >> when msg_len = 1MB (and PREMAP_LEN (in Patch 4) = 1):
> >>             slowdown (num of faults)
> >>  times      VFIO IOPF      host SVA
> >>  1          63.4% (518)    82.8% (512)
> >>  100        22.9% (1058)   47.9% (1024)
> >>  1000       2.6% (1071)    8.5% (1024)
> >>
> >> when msg_len = 10MB (and PREMAP_LEN = 512):
> >>             slowdown (num of faults)
> >>  times      VFIO IOPF
> >>  1          32.6% (13)
> >>  100        3.5% (26)
> >>  1000       1.6% (26)  
> > 
> > This seems like an example of how you can make a benchmark show
> > anything you want.  The best results would come from pre-mapping
> > everything, which is what we have without this series.  What is an
> > acceptable overhead to incur to avoid page pinning?  What if userspace
> > had more fine-grained control over which mappings were available for
> > faulting and which were statically mapped?  I don't really see what
> > sense the pre-mapping range makes.  If I assume the user is QEMU in a
> > non-vIOMMU configuration, pre-mapping the beginning of each RAM
> > section makes no logical sense relative to device DMA.  
> 
> As you said in Patch 4, we can introduce full end-to-end functionality
> before trying to improve performance, so I will drop the pre-mapping
> patch at this stage...
> 
> Is there a need for userspace to have more fine-grained control over
> which mappings are available for faulting? If so, we could evolve the
> MAP ioctl to support specifying the faulting range.
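
If we went down that route, I'd picture it looking something like the
below.  The flag name and bit are purely illustrative, not existing
uapi; only the struct layout and READ/WRITE flags are real:

#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Hypothetical extension of VFIO_IOMMU_MAP_DMA: mark this mapping as
 * faultable rather than statically pinned.  The IOPF bit is invented
 * here for illustration.
 */
#define VFIO_DMA_MAP_FLAG_IOPF	(1 << 2)	/* hypothetical */

int map_faultable(int container_fd, void *buf, __u64 iova, __u64 size)
{
	struct vfio_iommu_type1_dma_map map = {
		.argsz = sizeof(map),
		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE |
			 VFIO_DMA_MAP_FLAG_IOPF,
		.vaddr = (__u64)(unsigned long)buf,
		.iova  = iova,
		.size  = size,
	};

	return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}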

You're essentially enabling this for a vfio bus driver via patch 7/8,
pinning for selective DMA faulting.  How would a driver in userspace
make equivalent requests?  For performance, the user could mlock the
page, but they have no mechanism here to pre-fault it.  Should they?
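
On the CPU side the user at least has mlock() to control residency; a
vfio analogue might be a pre-fault ioctl along the lines below.  This
is entirely hypothetical, neither the struct nor the ioctl number
exists today:

#include <sys/mman.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Hypothetical: let userspace pre-fault (pin and map) a sub-range of
 * a faultable mapping that the device will touch immediately, e.g. a
 * descriptor ring, while leaving the rest on-demand.
 */
struct vfio_iommu_type1_dma_prefault {		/* hypothetical */
	__u32	argsz;
	__u32	flags;
	__u64	iova;	/* must lie within a faultable mapping */
	__u64	size;
};
#define VFIO_IOMMU_PREFAULT_DMA _IO(VFIO_TYPE, VFIO_BASE + 21)	/* hypothetical */

int prefault_ring(int container_fd, void *buf, size_t len,
		  __u64 ring_iova, __u64 ring_size)
{
	struct vfio_iommu_type1_dma_prefault pf = {
		.argsz = sizeof(pf),
		.iova  = ring_iova,
		.size  = ring_size,
	};

	/* keep the backing pages resident; this alone doesn't touch
	 * the IOMMU mapping */
	if (mlock(buf, len))
		return -1;

	/* hypothetical: populate the IOMMU mapping for the hot sub-range */
	return ioctl(container_fd, VFIO_IOMMU_PREFAULT_DMA, &pf);
}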

> As for the overhead of IOPF, it is unavoidable if we enable on-demand
> paging (page faults occur almost exclusively on first access), and
> there seems to be plenty of room for optimization compared to CPU page
> faulting.

Yes, there's of course going to be overhead in terms of latency for the
page faults.  My point was more that when a host is not under memory
pressure we should trend towards the performance of pinned, static
mappings and we should be able to create arbitrarily large pre-fault
behavior to show that.  But I think what we really want to enable via
IOPF is density, right?  Therefore how many more assigned device guests
can you run on a host with IOPF?  How does the slope, plateau, and
inflection point of their aggregate throughput compare to static
pinning?  VM startup time is probably also a useful metric, i.e. trading
device latency for startup latency.  Thanks,

Alex

