From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Burakov, Anatoly"
Subject: Re: [PATCH v3] vfio: fix workaround of BAR0 mapping
Date: Fri, 13 Jul 2018 12:08:46 +0100
References: <20180712030833.4887-1-t.yoshimura8869@gmail.com> <20180713101145.4795-1-t.yoshimura8869@gmail.com> <4c46da4e-ab67-bb76-b42a-25646c79cd99@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: dev@dpdk.org
To: Takeshi Yoshimura
Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id BE1063230 for ; Fri, 13 Jul 2018 13:08:49 +0200 (CEST)
In-Reply-To: <4c46da4e-ab67-bb76-b42a-25646c79cd99@intel.com>
Content-Language: en-US
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

On 13-Jul-18 12:00 PM, Burakov, Anatoly wrote:
> On 13-Jul-18 11:11 AM, Takeshi Yoshimura wrote:
>> The workaround of BAR0 mapping gives up and immediately returns an
>> error if it cannot map around the MSI-X. However, recent version
>> of VFIO allows MSIX mapping (*).
>>
>> I fixed not to return immediately but try mapping. In old Linux, mmap
>> just fails and returns the same error as the code before my fix . In
>> recent Linux, mmap succeeds and this patch enables running DPDK in
>> specific environments (e.g., ppc64le with HGST NVMe)
>>
>> (*): "vfio-pci: Allow mapping MSIX BAR",
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
>> commit/id=a32295c612c57990d17fb0f41e7134394b2f35f6
>>
>> Fixes: 90a1633b2347 ("eal/linux: allow to map BARs with MSI-X tables")
>>
>> Signed-off-by: Takeshi Yoshimura
>> ---
>>
>> Thanks, Anatoly.
>>
>> I updated the patch not to affect behaviors of older Linux and
>> other environments as well as possible. This patch adds another
>> chance to mmap BAR0.
>>
>> I noticed that the check at line 350 already includes the check
>> of page size, so this patch does not fix the check.
>>
>> Regards,
>> Takeshi
>
> Hi Takeshi,
>
> Please correct me if i'm wrong, but i'm not sure the old behavior is kept.
>
> Let's say we're running an old kernel, which doesn't allow mapping MSI-X
> BARs. If MSI-X starts at beginning of the BAR (floor-aligned to page
> size), and ends at or beyond end of BAR (ceiling-aligned to page size).
> In that situation, old code just skipped the BAR and returned 0.
>
> We then exited the function, and there's a check for return value right
> after pci_vfio_mmap_bar() that stop continuing if we fail to map
> something. In the old code, we would continue as we went, and finish the
> rest of our mappings. With your new code, you're attempting to map the
> BAR, it fails, and you will return -1 on older kernels.
>
> I believe what we really need here is the following:
>
> 1) If this is a BAR containing MSI-X vector, first try mapping the
> entire BAR. If it succeeds, great - that would be your new kernel behavior.
> 2) If we failed on step 1), check to see if we can map around the BAR.
> If we can, try to map around it like the current code does. If we cannot
> map around it (i.e. if MSI-X vector, page aligned, occupies entire BAR),
> then we simply return 0 and skip the BAR.
>
> That, i would think, would keep the old behavior and enable the new one.
>
> Does that make sense?
>

I envision this to look something like this:

bool again = false;
do {
	if (again) {
		// set up mmap-around
		if (cannot map around)
			return 0;
	}
	// try mapping
	if (map_failed && msix_table->bar_index == bar_index) {
		again = true;
		continue;
	}
	if (map_failed)
		return -1;
	break/return 0;
} while (again);

-- 
Thanks,
Anatoly
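
P.S. For illustration only, the two-step fallback described above could be made concrete roughly as follows. This is a minimal sketch, not the actual DPDK code: `struct bar_info`, `map_fn`, `map_bar_with_fallback()`, `struct fake_kernel` and `fake_try_map()` are all hypothetical names invented for this example, and the real mmap() call is replaced by a callback so the control flow can be exercised without hardware. It retries at most once, so a second failure cannot loop.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: one mapping attempt. around_msix == false means
 * "map the entire BAR", true means "map only the regions around the table". */
typedef bool (*map_fn)(void *ctx, bool around_msix);

/* Hypothetical description of the BAR being mapped. */
struct bar_info {
	int bar_index;       /* BAR we are mapping */
	int msix_bar_index;  /* BAR that holds the MSI-X table */
	bool can_map_around; /* false if the page-aligned table covers the whole BAR */
};

/* Returns 0 if the BAR was mapped or deliberately skipped, -1 on failure. */
static int
map_bar_with_fallback(const struct bar_info *bar, map_fn try_map, void *ctx)
{
	/* Step 1: try the whole BAR; newer kernels allow this even with MSI-X. */
	if (try_map(ctx, false))
		return 0;

	/* A failure on a BAR without an MSI-X table is a real error. */
	if (bar->bar_index != bar->msix_bar_index)
		return -1;

	/* Step 2: older kernel. Skip the BAR if nothing can be mapped around. */
	if (!bar->can_map_around)
		return 0;

	/* Otherwise map around the table, as the current code does. */
	return try_map(ctx, true) ? 0 : -1;
}

/* Hypothetical stand-in for the kernel, used only to drive the sketch. */
struct fake_kernel {
	bool whole_bar_ok; /* mapping the entire BAR succeeds */
	bool around_ok;    /* mapping around the MSI-X table succeeds */
};

static bool
fake_try_map(void *ctx, bool around_msix)
{
	const struct fake_kernel *k = ctx;

	return around_msix ? k->around_ok : k->whole_bar_ok;
}
```

With this shape, an old kernel plus an MSI-X table that occupies the entire BAR yields 0 (the BAR is skipped), which is the old behavior, while a new kernel succeeds on the first attempt.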