From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
    aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-5.2 required=3.0 tests=BAYES_00,
    HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,
    USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
    by smtp.lore.kernel.org (Postfix) with ESMTP id AE00BC433E2
    for ; Wed, 16 Sep 2020 08:48:59 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
    by mail.kernel.org (Postfix) with ESMTP id 77B4E20872
    for ; Wed, 16 Sep 2020 08:48:59 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1726634AbgIPIs6 (ORCPT ); Wed, 16 Sep 2020 04:48:58 -0400
Received: from mail.kernel.org ([198.145.29.99]:41178 "EHLO mail.kernel.org"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1726473AbgIPIs4 (ORCPT ); Wed, 16 Sep 2020 04:48:56 -0400
Received: from gaia (unknown [46.69.195.48])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by mail.kernel.org (Postfix) with ESMTPSA id 7E5EF208E4;
    Wed, 16 Sep 2020 08:48:54 +0000 (UTC)
Date: Wed, 16 Sep 2020 09:48:52 +0100
From: Catalin Marinas
To: Will Deacon
Cc: Jason Gunthorpe, Benjamin Herrenschmidt, Lorenzo Pieralisi,
    Clint Sbisa, linux-pci@vger.kernel.org, Bjorn Helgaas,
    linux-arm-kernel@lists.infradead.org, Leon Romanovsky
Subject: Re: [PATCH] arm64: Enable PCI write-combine resources under sysfs
Message-ID: <20200916084851.GA3122@gaia>
References: <20200914143819.GC904879@nvidia.com>
    <375c478593945a416f3180c3773bcb5240d2e36c.camel@kernel.crashing.org>
    <1d6f2ceb8d3538c906a1fdb8cd3d4c74ccffa42e.camel@kernel.crashing.org>
    <20200914225740.GP904879@nvidia.com>
    <2b539df4c9ec703458e46da2fc879ee3b310b31c.camel@kernel.crashing.org>
    <20200915101831.GA2616@e121166-lin.cambridge.arm.com>
    <20200915110511.GQ904879@nvidia.com>
    <20200915234006.GI1573713@nvidia.com>
    <20200916083315.GC27496@willie-the-truck>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20200916083315.GC27496@willie-the-truck>
User-Agent: Mutt/1.10.1 (2018-07-13)
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org

On Wed, Sep 16, 2020 at 09:33:16AM +0100, Will Deacon wrote:
> On Tue, Sep 15, 2020 at 08:40:06PM -0300, Jason Gunthorpe wrote:
> > On Wed, Sep 16, 2020 at 09:17:38AM +1000, Benjamin Herrenschmidt wrote:
> > > On Tue, 2020-09-15 at 08:05 -0300, Jason Gunthorpe wrote:
> > > > > To sum it up:
> > > > >
> > > > > (1) RDMA drivers need a new mapping function/attribute to define their
> > > > > message push model. Actually the message model is not necessarily related
> > > > > to write combining a la x86, so we should probably come up with a better
> > > > > and consistent naming. Enabling this patchset may trigger performance
> > > > > regressions on mellanox drivers on arm64 - this ought to be
> > > > > addressed.
> > > >
> > > > It is pretty clear now that the certain ARM chips that don't do write
> > > > combining with pgprot_writecombine will performance regress if they
> > > > are running a certain uncommon Mellanox configuration. I suspect these
> > > > deployments are all running the out of tree patch for DEVICE_GRE
> > > > though.
> > >
> > > I'm not sure I understand...
> > >
> > > Today those ARM chips will not use pgprot_writecombine (at least not
> > > using that code path, they might still use it as the result of the
> > > other path in the driver that can enable it).
> >
> > Not quite, upstream kernel will never use WC on those
> > devices. DEVICE_GRE is not supported in upstream,
> > arch_can_pci_mmap_wc() is always false and the WC tester will always
> > fail.
> >
> > > With the patch, those device will now use MT_DEVICE_NC.
> >
> > Which doesn't do WC at all on some ARM implementations.
>
> Is that just TX2? I remember that thing being weird where GRE performed
> better than NC, but I thought that was a one off (and the thing is dead).

I recall something along these lines. Hopefully ARM updated the guidance
to licensees.

> NC is more permissive than GRE, so I think that's the right one to use; i.e.
> we go for the fewest number of restrictions on the hardware. If somebody
> screws up the uarch, that's up to them.

I agree, Normal NC is better as long as the BAR can tolerate read
side-effects.

--
Catalin
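
For readers following the thread, the decision being discussed can be sketched as a small userspace model. This is an illustration only and not kernel code: the enum, the struct, and both functions are hypothetical names. It assumes that a BAR's prefetchable flag is the indicator that reads have no side effects, and that a write-combine mapping is only offered when the architecture hook (arch_can_pci_mmap_wc() in the thread) reports support.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the memory types named in the thread. */
enum mem_type {
    MEM_DEVICE_NGNRE, /* strict Device mapping, used when WC is not available */
    MEM_DEVICE_GRE,   /* Device mapping that allows gathering and reordering */
    MEM_NORMAL_NC,    /* Normal non-cacheable, what the thread prefers for WC */
};

/* Hypothetical description of a PCI BAR. */
struct bar_info {
    bool prefetchable; /* assumed to mean: reads have no side effects */
};

/*
 * Model of the architecture hook. In the thread it is "always false"
 * upstream; the patch under discussion would make it true on arm64.
 */
static bool model_arch_can_mmap_wc(bool patch_applied)
{
    return patch_applied;
}

/*
 * Pick the mapping type for a userspace mmap of a BAR.
 * Normal NC is chosen only when WC was requested, the architecture
 * allows it, and the BAR tolerates speculative reads.
 */
static enum mem_type choose_mapping(const struct bar_info *bar,
                                    bool patch_applied, bool wc_requested)
{
    if (wc_requested && model_arch_can_mmap_wc(patch_applied) &&
        bar->prefetchable)
        return MEM_NORMAL_NC;
    return MEM_DEVICE_NGNRE;
}
```

MEM_DEVICE_GRE is listed only because the thread compares it with Normal NC; the model never selects it, matching the statement that DEVICE_GRE is not supported upstream.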