Subject: Re: [PATCH net-next 3/3] net: stmmac: Introducing support for Page Pool
From: Jon Hunter
Date: Mon, 29 Jul 2019 22:33:04 +0100
To: Jose Abreu, Robin Murphy, "linux-kernel@vger.kernel.org", "netdev@vger.kernel.org", "linux-stm32@st-md-mailman.stormreply.com", "linux-arm-kernel@lists.infradead.org", Catalin Marinas, Will Deacon
CC: Joao Pinto, Alexandre Torgue, Maxime Ripard, Chen-Yu Tsai, Maxime Coquelin, linux-tegra, Giuseppe Cavallaro, "David S . Miller"
Message-ID: <8a60361f-b914-93ef-0d80-92ae4ad8b808@nvidia.com>

On 29/07/2019 15:08, Jose Abreu wrote:

...

>>> Hi Catalin and Will,
>>>
>>> Sorry to add you in such a long thread but we are seeing a DMA issue
>>> with stmmac driver in an ARM64 platform with IOMMU enabled.
>>>
>>> The issue seems to be solved when buffers allocation for DMA based
>>> transfers are *not* mapped with the DMA_ATTR_SKIP_CPU_SYNC flag *OR*
>>> when IOMMU is disabled.
>>>
>>> Notice that after transfer is done we do use
>>> dma_sync_single_for_{cpu,device} and then we reuse *the same* page for
>>> another transfer.
>>>
>>> Can you please comment on whether DMA_ATTR_SKIP_CPU_SYNC can not be used
>>> in ARM64 platforms with IOMMU ?
>>
>> In terms of what they do, there should be no difference on arm64 between:
>>
>>     dma_map_page(..., dir);
>>     ...
>>     dma_unmap_page(..., dir);
>>
>> and:
>>
>>     dma_map_page_attrs(..., dir, DMA_ATTR_SKIP_CPU_SYNC);
>>     dma_sync_single_for_device(..., dir);
>>     ...
>>     dma_sync_single_for_cpu(..., dir);
>>     dma_unmap_page_attrs(..., dir, DMA_ATTR_SKIP_CPU_SYNC);
>>
>> provided that the first sync covers the whole buffer and any subsequent
>> ones cover at least the parts of the buffer which may have changed. Plus
>> for coherent hardware it's entirely moot either way.
>
> Thanks for confirming. That's indeed what stmmac is doing when buffer is
> received by syncing the packet size to CPU.
>
>>
>> Given Jon's previous findings, I would lean towards the idea that
>> performing the extra (redundant) cache maintenance plus barrier in
>> dma_unmap is mostly just perturbing timing in the same way as the debug
>> print which also made things seem OK.
>
> Mikko said that Tegra186 is not coherent so we have to explicit flush
> pipeline but I don't understand why sync_single() is not doing it ...
>
> Jon, can you please remove *all* debug prints, hacks, etc ... and test
> this one in attach with plain -net tree ?

So far I have just been testing on the mainline kernel branch. The issue
still persists after applying this on mainline. I can test on the -net
tree, but I am not sure that will make a difference.

Cheers
Jon

-- 
nvpublic
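
For reference, the equivalence quoted above (implicit syncs in map/unmap versus DMA_ATTR_SKIP_CPU_SYNC plus explicit dma_sync_single_for_{device,cpu}) can be illustrated with a small userspace toy model. This is only a sketch under loud assumptions: it is not kernel code, every toy_* name is hypothetical, and "cache maintenance" is modelled as a plain memcpy between a CPU view and a device view of the buffer. It says nothing about timing, barriers or the IOMMU, which is where the discussion suspects the real problem lies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_LEN 64

/* Toy model only: 'ram' is the device's view, 'cache' is the CPU's view. */
struct toy_buf {
    unsigned char ram[TOY_BUF_LEN];
    unsigned char cache[TOY_BUF_LEN];
};

/* Hypothetical stand-in for dma_sync_single_for_device(): CPU view -> RAM. */
static void toy_sync_for_device(struct toy_buf *b, size_t len)
{
    assert(len <= TOY_BUF_LEN);
    memcpy(b->ram, b->cache, len);
}

/* Hypothetical stand-in for dma_sync_single_for_cpu(): RAM -> CPU view. */
static void toy_sync_for_cpu(struct toy_buf *b, size_t len)
{
    assert(len <= TOY_BUF_LEN);
    memcpy(b->cache, b->ram, len);
}

/* Stand-in for the device writing a received packet into RAM. */
static void toy_device_rx(struct toy_buf *b, const unsigned char *pkt, size_t len)
{
    assert(len <= TOY_BUF_LEN);
    memcpy(b->ram, pkt, len);
}

/* Models map/unmap without SKIP_CPU_SYNC: both sync the whole buffer. */
static void toy_rx_implicit(struct toy_buf *b, const unsigned char *pkt, size_t len)
{
    toy_sync_for_device(b, TOY_BUF_LEN);  /* done inside "map" */
    toy_device_rx(b, pkt, len);
    toy_sync_for_cpu(b, TOY_BUF_LEN);     /* done inside "unmap" */
}

/* Models map/unmap with SKIP_CPU_SYNC: the driver syncs explicitly. */
static void toy_rx_explicit(struct toy_buf *b, const unsigned char *pkt, size_t len)
{
    toy_sync_for_device(b, TOY_BUF_LEN);  /* first sync covers the whole buffer */
    toy_device_rx(b, pkt, len);
    toy_sync_for_cpu(b, len);             /* later sync covers only what changed */
}

/* Returns 1 if both sequences leave the CPU with the same view of the buffer. */
int toy_views_match(void)
{
    static const unsigned char pkt[] = { 0xde, 0xad, 0xbe, 0xef };
    struct toy_buf a, b;

    memset(&a, 0, sizeof(a));
    memset(&b, 0, sizeof(b));
    toy_rx_implicit(&a, pkt, sizeof(pkt));
    toy_rx_explicit(&b, pkt, sizeof(pkt));
    return memcmp(a.cache, b.cache, TOY_BUF_LEN) == 0;
}
```

Calling toy_views_match() from a main() is enough to exercise the model. The partial sync in toy_rx_explicit() mirrors the remark that stmmac syncs only the received packet size to the CPU; in this model that is sufficient because nothing outside the packet changed after the first full-buffer sync.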