From: "Burakov, Anatoly"
To: Jerin Jacob Kollanukkaran, David Marchand
Cc: dev, Thomas Monjalon, Ben Walker
Date: Tue, 9 Jul 2019 14:30:25 +0100
Subject: Re: [dpdk-dev] [EXT] Re: [PATCH] bus/pci: fix IOVA as VA mode selection
References: <20190708142450.51597-1-jerinj@marvell.com> <0947c33d-b3be-1acc-f98e-3635cc5658d2@intel.com>

On 09-Jul-19 1:11 PM, Jerin Jacob Kollanukkaran wrote:
>> -----Original Message-----
>> From: Burakov, Anatoly
>> Sent: Tuesday, July 9, 2019 5:10 PM
>> To: Jerin Jacob Kollanukkaran; David Marchand
>> Cc: dev; Thomas Monjalon; Ben Walker
>> Subject: Re: [EXT] Re: [dpdk-dev] [PATCH] bus/pci: fix IOVA as VA mode
>> selection
>>>>> ________________________________________
>>>>>
>>>>> On Mon, Jul 8, 2019 at 4:25 PM wrote:
>>>>> From: Jerin Jacob
>>>>>
>>>>> Existing logic fails to select IOVA mode as VA if driver request to
>>>>> enable IOVA as VA.
>>>>>
>>>>> IOVA as VA has more strict requirement than other modes, so enabling
>>>>> positive logic for IOVA as VA selection.
>>>>>
>>>>> This patch also updates the default IOVA mode as PA for PCI devices
>>>>> as it has to deal with DMA engines unlike the virtual devices that
>>>>> may need only IOVA as DC.
>>>>>
>>>>> We have three cases:
>>>>> - driver/hw supports IOVA as PA only
>>>>>
>>>>> [Jerin] It is not driver cap, it is more of system cap(IOMMU vs non
>>>>> IOMMU). We are already addressing that case
>>>>
>>>> I don't get how this works. How does "system capability" affect what
>>>> the device itself supports? Are we to assume that *all* hardware
>>>> support IOVA as VA by default? "System capability" is more of a bus
>>>> issue than an individual device issue, is it not?
>>>
>>> What I meant is, supporting VA vs PA is function of IOMMU(not the device
>>> attribute).
>>> Ie.
Device makes the bus master request, if IOMMU available and
>>> enabled in the SYSTEM , It goes over IOMMU and translate the IOVA to
>>> physical address.
>>>
>>> Another way to put is, Is there any _PCIe_ device which need/requires
>>> RTE_PCI_DRV_NEED_IOVA_AS_PA in rte_pci_driver.drv_flags
>>>
>>>
>>
>> Previously, as far as i can tell, the flag was used to indicate support for IOVA
>> as VA mode, not *requirement* for IOVA as VA mode. For example, there
>> are multiple patches [1][2][3][4] (i'm sure i can find more!) that added IOVA
>> as VA support to various drivers, and they all were worded it in this exact way
>> - "support for IOVA as VA mode", not "require IOVA as VA mode". As far as i
>> can tell, none of these drivers *require* IOVA as VA mode - they merely use
>> this flag to indicate support for it.
>
> Some class of devices NEED IOVA as VA for performance reasons.
> Specially the devices has HW mempool allocators. On those devices If we don’t use IOVA as VA,
> Upon getting packet from device, It needs to go over rte_mem_iova2virt() per
> packet see driver/net/dppa2. Which has real performance issue.

I wouldn't classify this as "needing" IOVA as VA. "Need" implies the
device cannot work without it, whereas in this case it's more of a
"highly recommended" than a "need".

>>
>> Now suddenly it turns out that someone somewhere "knew" that "IOVA as
>> VA" flag in PCI drivers is supposed to indicate *requirement* and not
>> support, and it appears that this knowledge was not communicated nor
>> documented anywhere, and is now treated as common knowledge.
>
> I think, the confusion here is, I was under impression that
> # If device supports IOVA as VA and system runs with IOMMU then
> the dpdk should run in IOVA as VA mode.
> If above statement true then we don’t really need a new flag.

Exactly. And the flag used to indicate that the device *supports* IOVA
as VA, not that it *requires* it.
>
> Couple of points to make forward progress:
> # If we think, there is a use case where device is IOVA as VA
> And system runs in IOMMU mode then for some reason DPDK needs
> to run in PA mode. If so, we need to create two flags
> RTE_PCI_DRV_IOVA_AS_VA - it can run either modes

There are use cases - KNI and igb_uio come to mind. Whether the IOMMU
uses VA or PA is a different question from whether the IOMMU is in use -
there is no law that states that, when using an IOMMU, IOVAs have to
have a 1:1 mapping with VA. An IOMMU requirement does not necessarily
imply IOVA as VA - it is perfectly legal to program the IOMMU to use
IOVA as PA (which we currently do when we e.g. use VFIO for some devices
and igb_uio for others).

> RTE_PCI_DRV_NEED_IOVA_AS_VA - it can run only on IOVA as VA

If we're adding a flag, we might as well not create confusion, and do
it consistently. If IOVA as PA is supported, have a flag to indicate
that. If IOVA as VA is supported, have a flag to indicate that. Absence
of either flag implies inability to work in that mode. I don't see how
this is less clear and self-documenting than having two IOVA as
VA-related flags that have slightly different meanings and imply things
not otherwise stated explicitly.

> # With top of tree, Currently it never runs in IOVA as VA mode.
> That’s a separate problem to fix. Which effect all the devices
> Currently supporting RTE_PCI_DRV_IOVA_AS_VA. Ie even though
> Device support RTE_PCI_DRV_IOVA_AS_VA, it is not running
> With IOMMU protection and/or root privilege is required to run DPDK.
>
>
>>
>> [1] http://patchwork.dpdk.org/patch/53206/
>> [2] http://patchwork.dpdk.org/patch/50274/
>> [3] http://patchwork.dpdk.org/patch/50991/
>> [4] http://patchwork.dpdk.org/patch/46134/
>>
>> --
>> Thanks,
>> Anatoly

--
Thanks,
Anatoly
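
For illustration only, the "one capability flag per mode" idea above
could look roughly like the sketch below. Everything in it is an
assumption made for the example: DRV_SUPPORTS_IOVA_AS_PA,
DRV_SUPPORTS_IOVA_AS_VA, enum iova_choice and pick_iova_mode() are
hypothetical names, not existing DPDK definitions, and real bus code
would also have to take IOMMU presence and the kernel driver in use
(VFIO, igb_uio, KNI) into account. The sketch only shows how a missing
flag is read as "cannot work in that mode".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability flags; these are NOT existing DPDK defines. */
#define DRV_SUPPORTS_IOVA_AS_PA (UINT32_C(1) << 0)
#define DRV_SUPPORTS_IOVA_AS_VA (UINT32_C(1) << 1)

/* Hypothetical result type for this sketch. */
enum iova_choice {
	IOVA_CHOICE_CONFLICT = -1, /* no single mode suits every driver */
	IOVA_CHOICE_DC = 0,        /* every driver can do both: don't care */
	IOVA_CHOICE_PA,            /* only IOVA as PA is common to all */
	IOVA_CHOICE_VA,            /* only IOVA as VA is common to all */
};

/*
 * Intersect the capability flags of all probed drivers. A driver that
 * does not set a flag is treated as unable to work in that mode.
 */
static enum iova_choice
pick_iova_mode(const uint32_t *drv_flags, size_t n)
{
	const uint32_t both =
		DRV_SUPPORTS_IOVA_AS_PA | DRV_SUPPORTS_IOVA_AS_VA;
	uint32_t common = both;
	size_t i;

	assert(drv_flags != NULL || n == 0);

	for (i = 0; i < n; i++)
		common &= drv_flags[i];

	if (common == both)
		return IOVA_CHOICE_DC;
	if (common == DRV_SUPPORTS_IOVA_AS_PA)
		return IOVA_CHOICE_PA;
	if (common == DRV_SUPPORTS_IOVA_AS_VA)
		return IOVA_CHOICE_VA;
	return IOVA_CHOICE_CONFLICT;
}
```

With flags like these, a driver that prefers IOVA as VA for performance
but can still work with PA would set both, while a driver that truly
cannot work with PA would set only the VA flag, so no separate "NEED"
flag is required to express the difference.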