From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-2.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_SANE_1
	autolearn=no autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id CBC39C7618E
	for ; Tue, 23 Jul 2019 13:20:01 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id AE698218BE
	for ; Tue, 23 Jul 2019 13:20:01 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S2390182AbfGWNUA (ORCPT ); Tue, 23 Jul 2019 09:20:00 -0400
Received: from foss.arm.com ([217.140.110.172]:54704 "EHLO foss.arm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1729749AbfGWNUA (ORCPT ); Tue, 23 Jul 2019 09:20:00 -0400
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id EDAAF28;
	Tue, 23 Jul 2019 06:19:58 -0700 (PDT)
Received: from [10.1.197.57] (e110467-lin.cambridge.arm.com [10.1.197.57])
	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id EA0C83F71F;
	Tue, 23 Jul 2019 06:19:56 -0700 (PDT)
Subject: Re: [PATCH net-next 3/3] net: stmmac: Introducing support for Page Pool
To: Jon Hunter, Jose Abreu, Lars Persson, Ilias Apalodimas
Cc: Joao Pinto, Alexandre Torgue, Maxime Ripard,
	"netdev@vger.kernel.org", "linux-kernel@vger.kernel.org",
	"linux-stm32@st-md-mailman.stormreply.com", Chen-Yu Tsai,
	Maxime Coquelin, linux-tegra, Giuseppe Cavallaro,
	"David S . Miller", "linux-arm-kernel@lists.infradead.org"
References: <1b254bb7fc6044c5e6e2fdd9e00088d1d13a808b.1562149883.git.joabreu@synopsys.com>
	<29dcc161-f7c8-026e-c3cc-5adb04df128c@nvidia.com>
	<20190722101830.GA24948@apalos>
	<11557fe0-0cba-cb49-0fb6-ad24792d4a53@nvidia.com>
	<6c769226-bdd9-6fe0-b96b-5a0d800fed24@arm.com>
	<8756d681-e167-fe4a-c6f0-47ae2dcbb100@nvidia.com>
From: Robin Murphy
Message-ID: <3255edfa-4465-204b-4751-8d40c8fb1382@arm.com>
Date: Tue, 23 Jul 2019 14:19:55 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
	Thunderbird/60.6.1
MIME-Version: 1.0
In-Reply-To: <8756d681-e167-fe4a-c6f0-47ae2dcbb100@nvidia.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-GB
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 23/07/2019 13:09, Jon Hunter wrote:
>
> On 23/07/2019 11:29, Robin Murphy wrote:
>> On 23/07/2019 11:07, Jose Abreu wrote:
>>> From: Jon Hunter
>>> Date: Jul/23/2019, 11:01:24 (UTC+00:00)
>>>
>>>> This appears to be a winner and by disabling the SMMU for the ethernet
>>>> controller and reverting commit 954a03be033c7cef80ddc232e7cbdb17df735663
>>>> this worked! So yes appears to be related to the SMMU being enabled. We
>>>> had to enable the SMMU for ethernet recently due to commit
>>>> 954a03be033c7cef80ddc232e7cbdb17df735663.
>>>
>>> Finally :)
>>>
>>> However, from "git show 954a03be033c7cef80ddc232e7cbdb17df735663":
>>>
>>> +         There are few reasons to allow unmatched stream bypass, and
>>> +         even fewer good ones.  If saying YES here breaks your board
>>> +         you should work on fixing your board.
>>>
>>> So, how can we fix this ? Is your ethernet DT node marked as
>>> "dma-coherent;" ?
>>
>> The first thing to try would be booting the failing setup with
>> "iommu.passthrough=1" (or using CONFIG_IOMMU_DEFAULT_PASSTHROUGH) - if
>> that makes things seem OK, then the problem is likely related to address
>> translation; if not, then it's probably time to start looking at nasties
>> like coherency and ordering, although in principle I wouldn't expect the
>> SMMU to have too much impact there.
>
> Setting "iommu.passthrough=1" works for me. However, I am not sure where
> to go from here, so any ideas you have would be great.

OK, so that really implies it's something to do with the addresses. From
a quick skim of the patch, I'm wondering if it's possible for buf->addr
and buf->page->dma_addr to get out-of-sync at any point. The nature of
the IOVA allocator makes it quite likely that a stale DMA address will
have been reused for a new mapping, so putting the wrong address in a
descriptor may well mean the DMA still ends up hitting a valid
translation, but which is now pointing to a different page.

>> Do you know if the SMMU interrupts are working correctly? If not, it's
>> possible that an incorrect address or mapping direction could lead to
>> the DMA transaction just being silently terminated without any fault
>> indication, which generally presents as inexplicable weirdness (I've
>> certainly seen that on another platform with the mix of an unsupported
>> interrupt controller and an 'imperfect' ethernet driver).
>
> If I simply remove the iommu node for the ethernet controller, then I
> see lots of ...
>
> [ 6.296121] arm-smmu 12000000.iommu: Unexpected global fault, this could be serious
> [ 6.296125] arm-smmu 12000000.iommu: GFSR 0x00000002, GFSYNR0 0x00000000, GFSYNR1 0x00000014, GFSYNR2 0x00000000
>
> So I assume that this is triggering the SMMU interrupt correctly.

According to tegra186.dtsi it appears you're using the MMU-500 combined
interrupt, so if global faults are being delivered then context faults
*should* also, but I'd be inclined to try a quick hack of the relevant
stmmac_desc_ops::set_addr callback to write some bogus unmapped address
just to make sure arm_smmu_context_fault() then screams as expected, and
we're not missing anything else.

Robin.
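
For illustration only, the set_addr hack suggested above could look roughly like the following user-space sketch. It is not the driver's code: struct mock_dma_desc, mock_dma_addr_t, MOCK_BOGUS_IOVA and both functions are hypothetical stand-ins, the descriptor layout (address split across des0/des1) is an assumption, and the bogus value would have to be an IOVA known to be unmapped on the system under test. A real kernel version would also need the usual endianness conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a 4-word DMA descriptor. */
struct mock_dma_desc {
	uint32_t des0;	/* assumed: buffer address, low 32 bits */
	uint32_t des1;	/* assumed: buffer address, high 32 bits */
	uint32_t des2;
	uint32_t des3;
};

/* Hypothetical stand-in for the kernel's dma_addr_t. */
typedef uint64_t mock_dma_addr_t;

/* Assumption: an IOVA with no valid translation; pick one for your setup. */
#define MOCK_BOGUS_IOVA 0xdead0000ULL

/* Normal behaviour: program the descriptor with the address it was given. */
void mock_set_addr(struct mock_dma_desc *p, mock_dma_addr_t addr)
{
	assert(p != NULL);
	p->des0 = (uint32_t)(addr & 0xffffffffULL);
	p->des1 = (uint32_t)(addr >> 32);
}

/*
 * Debug hack: discard the real address and program the bogus one, so the
 * device's next DMA through this descriptor should miss in the SMMU and
 * raise a context fault if fault reporting works.
 */
void mock_set_addr_bogus(struct mock_dma_desc *p, mock_dma_addr_t addr)
{
	(void)addr;	/* deliberately ignored */
	mock_set_addr(p, MOCK_BOGUS_IOVA);
}
```

The idea is to swap the bogus variant in for the normal callback for one test boot: if a context fault is then reported, fault delivery works and the silent misbehaviour points at a stale but still-mapped address rather than at a lost interrupt.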