Date: Fri, 23 Jun 2017 14:34:34 -0600
From: Alex Williamson
To: Jacob Pan
Cc: iommu@lists.linux-foundation.org, LKML, Joerg Roedel, David Woodhouse, "Liu, Yi L", Lan Tianyu, "Tian, Kevin", Raj Ashok, Jean Delvare
Subject: Re: [RFC 8/9] iommu/intel-svm: notify page request to guest
Message-ID: <20170623143434.2473215b@w520.home>
In-Reply-To: <20170623131629.485750b6@jacob-builder>
References: <1497478983-77580-1-git-send-email-jacob.jun.pan@linux.intel.com> <1497478983-77580-9-git-send-email-jacob.jun.pan@linux.intel.com> <20170622165358.70cfe33b@w520.home> <20170623131629.485750b6@jacob-builder>

On Fri, 23 Jun 2017 13:16:29 -0700
Jacob Pan wrote:

> On Thu, 22 Jun 2017 16:53:58 -0600
> Alex Williamson wrote:
>
> > On Wed, 14 Jun 2017 15:23:02 -0700
> > Jacob Pan wrote:
> >
> > > If the source device of a page request has its PASID table pointer
> > > bond to a guest, the first level page tables are owned
> > > by the guest.
> > > In this case, we shall let guest OS to manage page fault.
> > >
> > > This patch uses the IOMMU fault notification API to send
> > > notifications, possibly via VFIO, to the guest OS. Once guest pages
> > > are fault in, guest will issue page response which will be passed
> > > down via the invalidation passdown APIs.
> > >
> > > Signed-off-by: Jacob Pan
> > > Signed-off-by: Ashok Raj
> > > ---
> > >  drivers/iommu/intel-svm.c | 81
> > > ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 80
> > > insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> > > index 23c4276..d1d2d23 100644
> > > --- a/drivers/iommu/intel-svm.c
> > > +++ b/drivers/iommu/intel-svm.c
> > > @@ -525,6 +525,80 @@ static bool access_error(struct vm_area_struct
> > > *vma, struct page_req_dsc *req) return (requested &
> > > ~vma->vm_flags) != 0; }
> > >
> > > +static int prq_to_iommu_prot(struct page_req_dsc *req)
> > > +{
> > > +	int prot = 0;
> > > +
> > > +	if (req->rd_req)
> > > +		prot |= IOMMU_READ;
> > > +	if (req->wr_req)
> > > +		prot |= IOMMU_WRITE;
> > > +	if (req->exe_req)
> > > +		prot |= IOMMU_EXEC;
> > > +	if (req->priv_req)
> > > +		prot |= IOMMU_PRIV;
> > > +
> > > +	return prot;
> > > +}
> > > +
> > > +static int intel_svm_prq_notify(struct device *dev, struct
> > > page_req_dsc *desc) +{
> > > +	int ret = 0;
> > > +	struct iommu_fault_event *event;
> > > +	struct pci_dev *pdev;
> > > +	struct device_domain_info *info;
> > > +	unsigned long buf_offset;
> > > +
> > > +	/**
> > > +	 * If caller does not provide struct device, this is the
> > > case where
> > > +	 * guest PASID table is bond to the device. So we need to
> > > retrieve
> > > +	 * struct device from the page request deescriptor then
> > > proceed.
> > > +	 */
> > > +	if (!dev) {
> > > +		pdev = pci_get_bus_and_slot(desc->bus,
> > > desc->devfn);
> > > +		if (!pdev) {
> > > +			pr_err("No PCI device found for PRQ
> > > [%02x:%02x.%d]\n",
> > > +				desc->bus, PCI_SLOT(desc->devfn),
> > > +				PCI_FUNC(desc->devfn));
> > > +			return -ENODEV;
> > > +		}
> > > +		/**
> > > +		 * Make sure PASID table pointer is bond to guest,
> > > if yes notify
> > > +		 * handler in the guest, e.g. via VFIO.
> > > +		 */
> > > +		info = pdev->dev.archdata.iommu;
> > > +		if (!info || !info->pasid_tbl_bond) {
> > > +			pr_debug("PRQ device pasid table not
> > > bond.\n");
> >
> > I can "bond" two things together, they are then "bound".
> >
> will fix that :)
> > > +			return -EINVAL;
> > > +		}
> > > +		dev = &pdev->dev;
> >
> > Leaks pdev reference.  Both normal and error path.
> >
> I guess you are referring to ref count in pci_get_bus_and_slot()? I did
> look at the code, it does not seem to do the count increment. Perhaps
> the comment is stale?
>
>  * pointer to its data structure.  The caller must decrement the
>  * reference count by calling pci_dev_put().  If no device is found,
>  * %NULL is returned.
>  */
> struct pci_dev *pci_get_domain_bus_and_slot(int domain, unsigned int bus,
> 					    unsigned int devfn)
> {
> 	struct pci_dev *dev = NULL;
>
> 	for_each_pci_dev(dev) {
        ^^^^^^^^^^^^^^^^ <-- look in here, it's trickier than it appears

> 		if (pci_domain_nr(dev->bus) == domain &&
> 		    (dev->bus->number == bus && dev->devfn == devfn))
> 			return dev;
> 	}
> 	return NULL;
> }
> EXPORT_SYMBOL(pci_get_domain_bus_and_slot);
>
> >
> > > +	}
> > > +
> > > +	pr_debug("Notify PRQ device [%02x:%02x.%d]\n",
> > > +		desc->bus, PCI_SLOT(desc->devfn),
> > > +		PCI_FUNC(desc->devfn));
> > > +	event = kzalloc(sizeof(*event) + sizeof(*desc),
> > > GFP_KERNEL);
> > > +	if (!event)
> > > +		return -ENOMEM;
> > > +
> > > +	get_device(dev);
> > > +	/* Fill in event data for device specific processing */
> > > +	event->dev = dev;
> > > +	buf_offset = offsetofend(struct iommu_fault_event, length);
> > > +	memcpy(buf_offset + event, desc, sizeof(*desc));
> > > +	event->addr = desc->addr;
> > > +	event->pasid = desc->pasid;
> > > +	event->prot = prq_to_iommu_prot(desc);
> > > +	event->length = sizeof(*desc);
> > > +	event->flags = IOMMU_FAULT_PAGE_REQ;
> > > +
> > > +	ret = iommu_fault_notifier_call_chain(event);
> > > +	put_device(dev);
> > > +	kfree(event);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > >  static irqreturn_t prq_event_thread(int irq, void *d)
> > >  {
> > >  	struct intel_iommu *iommu = d;
> > > @@ -548,7 +622,12 @@ static irqreturn_t prq_event_thread(int irq,
> > > void *d) handled = 1;
> > >
> > >  		req = &iommu->prq[head / sizeof(*req)];
> > > -
> > > +		/**
> > > +		 * If prq is to be handled outside iommu driver
> > > via receiver of
> > > +		 * the fault notifiers, we skip the page response
> > > here.
> > > +		 */
> > > +		if (!intel_svm_prq_notify(NULL, req))
> > > +			continue;
> > >  		result = QI_RESP_FAILURE;
> > >  		address = (u64)req->addr << VTD_PAGE_SHIFT;
> > >  		if (!req->pasid_present) {
> >
>
> [Jacob Pan]
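
To spell out the hint about for_each_pci_dev(): the macro iterates with pci_get_device(), which takes a reference on each device it returns and drops the reference on the device passed in as the previous position. Returning from inside the loop therefore hands the caller a device whose reference is still held, which is why the kernel-doc asks for pci_dev_put(). What follows is a minimal userspace sketch of that pattern, not kernel code; every model_* name and the three-entry device table are hypothetical stand-ins made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; only what the model needs. */
struct model_dev {
	unsigned int bus;
	unsigned int devfn;
	int refcount;		/* starts at 1: the "bus" owns one reference */
};

/* Hypothetical device table standing in for the PCI device list. */
static struct model_dev model_devs[] = {
	{ 0x00, 0x10, 1 },
	{ 0x00, 0x18, 1 },
	{ 0x01, 0x00, 1 },
};
#define MODEL_NDEVS (sizeof(model_devs) / sizeof(model_devs[0]))

/* Models pci_dev_put(): drop one reference, NULL is allowed. */
static void model_dev_put(struct model_dev *dev)
{
	if (dev)
		dev->refcount--;
}

/*
 * Models the get-next behaviour of pci_get_device(): drop the reference
 * on 'from', take a reference on the next device, and return it.
 */
static struct model_dev *model_get_next(struct model_dev *from)
{
	size_t next = from ? (size_t)(from - model_devs) + 1 : 0;

	model_dev_put(from);
	if (next >= MODEL_NDEVS)
		return NULL;
	model_devs[next].refcount++;
	return &model_devs[next];
}

/* Same shape as for_each_pci_dev(): the get/put is hidden in the macro. */
#define model_for_each_dev(d) while (((d) = model_get_next(d)) != NULL)

/* Same shape as the lookup quoted above. */
static struct model_dev *model_get_bus_and_slot(unsigned int bus,
						unsigned int devfn)
{
	struct model_dev *dev = NULL;

	model_for_each_dev(dev) {
		if (dev->bus == bus && dev->devfn == devfn)
			return dev;	/* iterator's reference is still held */
	}
	return NULL;	/* loop ran to the end, every reference was dropped */
}
```

In this model a successful lookup leaves the returned device one reference above its resting count until the caller invokes model_dev_put(), while a failed lookup leaves every count unchanged. Under the same reasoning, the patch would need a pci_dev_put(pdev) on both the -EINVAL path and the normal path once it is done with the device.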