From: Linus Torvalds
Date: Wed, 19 Jul 2017 21:44:48 -0700
Subject: Re: [PATCH] efifb: allow user to disable write combined mapping.
To: Andy Lutomirski
Cc: Dave Airlie, Peter Jones, "the arch/x86 maintainers", Dave Airlie,
    Bartlomiej Zolnierkiewicz, "linux-fbdev@vger.kernel.org",
    Linux Kernel Mailing List, Peter Anvin
References: <20170718060909.5280-1-airlied@redhat.com>
    <20170718143404.omgxrujngj2rhiya@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 19, 2017 at 9:28 PM, Andy Lutomirski wrote:
>
> It shouldn't be that hard to hack up efifb to allocate some actual RAM
> as "framebuffer", unmap it from the direct map, and ioremap_wc() it as
> usual. Then you could see if PCIe is important for it.

The thing is, the "actual RAM" case is unlikely to show this issue.

RAM is special, even when you try to mark it WC or whatever. Yes, it
might be slowed down by lack of caching, but the uncore still *knows*
it is RAM. The accesses go to the memory controller, not the PCI side.

> WC streaming writes over PCIe end up doing 64 byte writes, right?
> Maybe the Matrox chip is just extremely slow handling 64b writes.

.. or maybe there is some unholy "management logic" thing that catches
those writes, because this is server hardware, and server vendors
invariably add "value add" (read; shit) to their hardware to justify
the high price.

Like the Intel "management console" that was such a "security feature".

I think one of the points of those magic graphics cards is that you
can export the frame buffer over the management network, so that you
can still run the graphical Windows GUI management stuff. Because you
wouldn't want to just ssh into it and run command line stuff.

So I wouldn't be surprised at all if the thing has a special back
channel to the network chip with a queue of changes going over
ethernet or something, and then when you stream things at high speeds
to the GPU DRAM, you fill up the management bandwidth.

If it was actual framebuffer DRAM, I would expect it to be *happy*
with streaming 64-bit writes. But some special "management interface
ASIC" that tries to keep track of GPU framebuffer "damage" might be
something else altogether.

             Linus