Subject: Re: [PATCH v3 13/16] iommu/intel-svm: notify page request to guest
To: Jacob Pan, iommu@lists.linux-foundation.org, LKML, Joerg Roedel,
 David Woodhouse, Greg Kroah-Hartman, Rafael Wysocki, Alex Williamson
References: <1510944914-54430-1-git-send-email-jacob.jun.pan@linux.intel.com>
 <1510944914-54430-14-git-send-email-jacob.jun.pan@linux.intel.com>
Cc: Lan Tianyu, Jean Delvare
From: Lu Baolu
Message-ID: <5A264CB3.2060603@linux.intel.com>
Date: Tue, 5 Dec 2017 15:37:23 +0800
In-Reply-To: <1510944914-54430-14-git-send-email-jacob.jun.pan@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 11/18/2017 02:55 AM, Jacob Pan wrote:
> If the source device of a page request has its PASID table pointer
> bond to a guest, the first level page tables are owned by the guest.
> In this case, we shall let guest OS to manage page fault.
>
> This patch uses the IOMMU fault notification API to send notifications,
> possibly via VFIO, to the guest OS. Once guest pages are fault in, guest
> will issue page response which will be passed down via the invalidation
> passdown APIs.
>
> Signed-off-by: Jacob Pan
> Signed-off-by: Ashok Raj
> ---
>  drivers/iommu/intel-svm.c | 80 ++++++++++++++++++++++++++++++++++++++++++-----
>  include/linux/iommu.h     |  1 +
>  2 files changed, 74 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index f6697e5..77c25d8 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -555,6 +555,71 @@ static bool is_canonical_address(u64 addr)
>  	return (((saddr << shift) >> shift) == saddr);
>  }
>  
> +static int prq_to_iommu_prot(struct page_req_dsc *req)
> +{
> +	int prot = 0;
> +
> +	if (req->rd_req)
> +		prot |= IOMMU_FAULT_READ;
> +	if (req->wr_req)
> +		prot |= IOMMU_FAULT_WRITE;
> +	if (req->exe_req)
> +		prot |= IOMMU_FAULT_EXEC;
> +	if (req->priv_req)
> +		prot |= IOMMU_FAULT_PRIV;
> +
> +	return prot;
> +}
> +
> +static int intel_svm_prq_report(struct device *dev, struct page_req_dsc *desc)
> +{
> +	int ret = 0;

It seems that "ret" should be initialized to -EINVAL. Otherwise, this
function returns 0 for devices which have no fault handler,
prq_event_thread() takes the prq_advance path, and their page requests
are ignored by the IOMMU driver.

> +	struct iommu_fault_event event;
> +	struct pci_dev *pdev;
> +
> +	/**
> +	 * If caller does not provide struct device, this is the case where
> +	 * guest PASID table is bound to the device. So we need to retrieve
> +	 * struct device from the page request descriptor then proceed.
> +	 */
> +	if (!dev) {
> +		pdev = pci_get_bus_and_slot(desc->bus, desc->devfn);
> +		if (!pdev) {
> +			pr_err("No PCI device found for PRQ [%02x:%02x.%d]\n",
> +				desc->bus, PCI_SLOT(desc->devfn),
> +				PCI_FUNC(desc->devfn));
> +			return -ENODEV;
> +		}
> +		dev = &pdev->dev;
> +	} else if (dev_is_pci(dev)) {
> +		pdev = to_pci_dev(dev);
> +		pci_dev_get(pdev);
> +	} else
> +		return -ENODEV;
> +
> +	pr_debug("Notify PRQ device [%02x:%02x.%d]\n",
> +		desc->bus, PCI_SLOT(desc->devfn),
> +		PCI_FUNC(desc->devfn));
> +
> +	/* invoke device fault handler if registered */
> +	if (iommu_has_device_fault_handler(dev)) {
> +		/* Fill in event data for device specific processing */
> +		event.type = IOMMU_FAULT_PAGE_REQ;
> +		event.addr = desc->addr;
> +		event.pasid = desc->pasid;
> +		event.page_req_group_id = desc->prg_index;
> +		event.prot = prq_to_iommu_prot(desc);
> +		event.last_req = desc->lpig;
> +		event.pasid_valid = 1;
> +		event.iommu_private = desc->private;
> +		ret = iommu_report_device_fault(&pdev->dev, &event);
> +	}
> +
> +	pci_dev_put(pdev);
> +
> +	return ret;
> +}
> +
>  static irqreturn_t prq_event_thread(int irq, void *d)
>  {
>  	struct intel_iommu *iommu = d;
> @@ -578,7 +643,12 @@ static irqreturn_t prq_event_thread(int irq, void *d)
>  		handled = 1;
>  
>  		req = &iommu->prq[head / sizeof(*req)];
> -
> +		/**
> +		 * If prq is to be handled outside iommu driver via receiver of
> +		 * the fault notifiers, we skip the page response here.
> +		 */
> +		if (!intel_svm_prq_report(NULL, req))
> +			goto prq_advance;
>  		result = QI_RESP_FAILURE;
>  		address = (u64)req->addr << VTD_PAGE_SHIFT;
>  		if (!req->pasid_present) {
> @@ -649,11 +719,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
>  		if (WARN_ON(&sdev->list == &svm->devs))
>  			sdev = NULL;
>  
> -		if (sdev && sdev->ops && sdev->ops->fault_cb) {
> -			int rwxp = (req->rd_req << 3) | (req->wr_req << 2) |
> -				(req->exe_req << 1) | (req->priv_req);
> -			sdev->ops->fault_cb(sdev->dev, req->pasid, req->addr, req->private, rwxp, result);
> -		}
> +		intel_svm_prq_report(sdev->dev, req);

Do you mind explaining why we need to report this request twice, once
above with a NULL device and again here with sdev->dev?

Best regards,
Lu Baolu

>  		/* We get here in the error case where the PASID lookup failed,
>  		   and these can be NULL. Do not use them below this point! */
>  		sdev = NULL;
> @@ -679,7 +745,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
>  
>  			qi_submit_sync(&resp, iommu);
>  		}
> -
> +	prq_advance:
>  		head = (head + sizeof(*req)) & PRQ_RING_MASK;
>  	}
>  
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 841c044..3083796b 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -42,6 +42,7 @@
>   * if the IOMMU page table format is equivalent.
>   */
>  #define IOMMU_PRIV	(1 << 5)
> +#define IOMMU_EXEC	(1 << 6)
>  
>  struct iommu_ops;
>  struct iommu_group;
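
P.S. To illustrate the concern about the initial value of "ret", here is a
minimal userspace sketch. It is only a model under stated assumptions, not
the driver code: struct model_dev, model_prq_report() and
model_response_skipped() are hypothetical names, and a registered handler
is assumed to always return 0. The only behavior taken from the patch is
that the report function returns "ret" unchanged when no handler is
registered, and that the caller skips its own page response on a zero
return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device; it only records whether a
 * fault handler has been registered. */
struct model_dev {
	bool has_fault_handler;
};

/* Model of the report function: "ret" keeps its initial value unless
 * a handler is registered. Assumption: a registered handler succeeds. */
int model_prq_report(const struct model_dev *dev, int init_ret)
{
	int ret = init_ret;

	if (dev->has_fault_handler)
		ret = 0;

	return ret;
}

/* Model of the caller's check: a zero return means "handled
 * elsewhere", so the driver's own page response is skipped. */
bool model_response_skipped(const struct model_dev *dev, int init_ret)
{
	return model_prq_report(dev, init_ret) == 0;
}

/* Compare the two candidate initial values for both kinds of device. */
void check_model(void)
{
	struct model_dev no_handler = { .has_fault_handler = false };
	struct model_dev with_handler = { .has_fault_handler = true };

	/* ret = 0: a device without a handler never gets a response. */
	assert(model_response_skipped(&no_handler, 0));
	/* ret = -EINVAL: the driver falls through and responds itself. */
	assert(!model_response_skipped(&no_handler, -EINVAL));
	/* With a handler, the initial value makes no difference. */
	assert(model_response_skipped(&with_handler, 0));
	assert(model_response_skipped(&with_handler, -EINVAL));
}
```

In this model the initial value only matters for devices without a handler,
which is the case the comment on "ret" is about.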