From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Thomas Hellstrom <thellstrom@vmware.com>, FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>, Russell King - ARM Linux <linux@arm.linux.org.uk>, Arnd Bergmann <arnd@arndb.de>, linux-kernel@vger.kernel.org, linaro-mm-sig@lists.linaro.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [Linaro-mm-sig] [RFC] ARM DMA mapping TODO, v1
Date: Sat, 30 Apr 2011 08:46:54 +1000
Message-ID: <1304117214.2513.262.camel@pasglop>
In-Reply-To: <20110429092712.5bbd6948@jbarnes-desktop>

On Fri, 2011-04-29 at 09:27 -0700, Jesse Barnes wrote:

> You must be making it sound worse than it really is, otherwise how
> would an embedded platform like the above deal with a display engine
> that needed a large, contiguous chunk of uncached memory for the
> display buffer? If the CPU is actively speculating into it and
> overwriting blits etc it would never work... Or do you do such
> reservations up front at 1G granularity??

Such embedded platforms have not been used with GPUs so far and our only implementation of 64-bit BookE is fortunately also completely cache coherent :-)

The good thing on ppc is that so far there is no new design coming from us or FSL that isn't cache coherent. The bad thing is that people seem to still try to pump out things using old 44x which isn't and somewhat seem to also want to use GPUs on them :-)

The 44x is a case where I have a small (64 entries) SW loaded TLB and I bolt the first 768M of the linear mapping (lowmem) using 3x256M entries. What "saves" it is that it's also an ancient design with essentially a busted prefetch engine that will thus cope with aliases as long as we don't explicitly access the cached and non-cached aliases simultaneously.

The nasty cases I have never really dealt with properly are the Apple machines and their non coherent AGP. Those processors were really not designed with the idea that one would do non-coherent DMA, especially the 970 (G5) and our Linux code really doesn't like it.

Things tend to "work" with DRI 1 because we allocate the AGP memory once in one big chunk (it's pages but they are allocated together and thus tend to be contiguous) so the possible issues with prefetch are so rare, I think we end up being lucky.

With DRI 2 dynamically mapping things in/out, we have a bigger problem and I don't know how to solve it other than forcing the DRM to allocate graphic objects in reserved areas of memory made of 16M pools that I unmap from the linear mapping.... (since I use 16M pages to map the linear mapping).

For ppc32 laptops it's even worse as I use 256MB BATs (block address translation, kind of special registers to create large static mappings) to map the linear mapping, which brings me back to the 44x case to some extent. I can't really do without at the moment, at the very least I require the kernel text / data / bss to be covered by BATs.

> > Right. We should still shoot HW designers who give up coherency for the
> > sake of 3D benchmarks. It's insanely stupid.
>
> Ah if it were that simple. :) There are big costs to implementing full
> coherency for all your devices, as you well know, so it's just not a
> question of benchmark optimization.

But it -is- that simple. You do have to deal with coherency anyways for your PHB unless you start advocating that we should make everything else non coherent as well. So you have the logic. Just make your GPU operate on the same protocol.

It's really only a perf tradeoff I believe. And a bad one.

Cheers,
Ben.