From mboxrd@z Thu Jan 1 00:00:00 1970
From: catalin.marinas@arm.com (Catalin Marinas)
Date: Fri, 30 Aug 2013 10:44:43 +0100
Subject: [PATCH] ARM64: KVM: Fix coherent_icache_guest_page() for host with external L3-cache.
In-Reply-To:
References: <747a0675165da4ef147bbda4e140549b@www.loen.fr>
 <5935339137684ecf90dd484cc5739548@www.loen.fr>
 <20130815165344.GA3853@cbox>
 <20130829105235.GC13704@arm.com>
 <20130829125337.GG13704@arm.com>
Message-ID: <20130830094443.GB62188@MacBook-Pro.local>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On Thu, Aug 29, 2013 at 05:02:50PM +0100, Anup Patel wrote:
> On Thu, Aug 29, 2013 at 6:23 PM, Catalin Marinas
> wrote:
> > On Thu, Aug 29, 2013 at 01:31:43PM +0100, Anup Patel wrote:
> >> On Thu, Aug 29, 2013 at 4:22 PM, Catalin Marinas
> >> wrote:
> >> > On Fri, Aug 16, 2013 at 07:57:55AM +0100, Anup Patel wrote:
> >> >> The approach of flushing d-cache by set/way upon first run of VCPU will
> >> >> not work because for set/way operations ARM ARM says: "For set/way
> >> >> operations, and for All (entire cache) operations, the point is defined to be
> >> >> to the next level of caching". In other words, set/way operations work upto
> >> >> point of unification.
> >> >
> >> > I don't understand where you got the idea that set/way operations work
> >> > up to the point of unification. This is incorrect, the set/way
> >> > operations work on the level of cache specified by bits 3:1 in the
> >> > register passed to the DC CISW instruction. For your L3 cache, those
> >> > bits would be 2 (and __flush_dcache_all() implementation does this
> >> > dynamically).
> >>
> >> The L3-cache is not visible to CPU. It is totally independent and transparent
> >> to CPU.
> >
> > OK. But you say that operations like DC CIVAC actually flush the L3? So
> > I don't see it as completely transparent to the CPU.
>
> It is transparent from CPU perspective.
> In other words, there is nothing in
> CPU for controlling/monitoring L3-cache.

We probably have a different understanding of "transparent". It doesn't
look to me like any more transparent than the L1 or L2 cache. Basically,
from a software perspective, it needs maintenance. Whether the CPU
explicitly asks the L3 cache for this or the L3 cache figures it on its
own based on the L1/L2 operations is irrelevant.

It would have been transparent if the software didn't need to know about
it at all, but it's not the case.

> > Do you have any configuration bits which would make the L3 completely
> > transparent like always caching even when accesses are non-cacheable and
> > DC ops to PoC ignoring it?
>
> Actually, L3-cache monitors the types of read/write generated by CPU (i.e.
> whether the request is cacheable/non-cacheable or whether the request is
> due to DC ops to PoC, or ...).
>
> To answer your query, there is no configuration to have L3 caching when
> accesses are non-cacheable and DC ops to PoC.

So it's an outer cache with some "improvements" to handle DC ops to PoC.
I think it was a pretty bad decision on the hardware side as we really
try to get rid of outer caches for many reasons:

1. Non-standard cache flushing operations (MMIO-based)
2. It may require cache maintenance by physical address - something
   hard to get in a guest OS (unless you virtualise L3 cache
   maintenance)
3. Are barriers like DSB propagated correctly? Does a DC op to PoC
   followed by DSB ensure that the L3 drained the cachelines to RAM?

I think point 2 isn't required because your L3 detects DC ops to PoC. I
hope point 3 is handled correctly (otherwise look how "nice" the mb()
macro on arm is to cope with L2x0).

If only 1 is left, we don't need the full outer_cache framework but it
still needs to be addressed since the assumption is that flush_cache_all
(or __flush_dcache_all) flushes all cache levels. These are not used in
generic code but are used during kernel booting, KVM and cpuidle
drivers.

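To illustrate the assumption about set/way flushing, here is a rough
userspace model in C of how a set/way walk picks its cache levels. It is
only a sketch, not the kernel's __flush_dcache_all assembly: the function
names are made up for this example, and the CLIDR_EL1 values mentioned
afterwards are hypothetical. The field positions (LoC in bits 26:24,
three Ctype bits per level, level minus one in bits 3:1 of the DC CISW
operand) follow the ARMv8 ARM.

```c
#include <assert.h>
#include <stdint.h>

/* Level of Coherence: CLIDR_EL1 bits 26:24. */
static unsigned clidr_loc(uint64_t clidr)
{
    return (clidr >> 24) & 0x7;
}

/* Ctype<level>: three bits per level, level counted from 1. */
static unsigned clidr_ctype(uint64_t clidr, unsigned level)
{
    assert(level >= 1 && level <= 7);
    return (clidr >> (3 * (level - 1))) & 0x7;
}

/* Value for bits 3:1 of the DC CISW operand: cache level minus one. */
static uint64_t setway_level_field(unsigned level)
{
    assert(level >= 1 && level <= 7);
    return (uint64_t)(level - 1) << 1;
}

/*
 * Count the levels a set/way walk would visit: levels 1 through LoC
 * that hold a data or unified cache (Ctype >= 2). Hypothetical helper,
 * for illustration only.
 */
static unsigned levels_walked_by_setway(uint64_t clidr)
{
    unsigned level, n = 0;

    for (level = 1; level <= clidr_loc(clidr); level++)
        if (clidr_ctype(clidr, level) >= 2)
            n++;
    return n;
}
```

With a hypothetical CLIDR_EL1 of 0x02000023 (split L1, unified L2,
LoC = 2) the walk covers two levels. Assuming the external L3 is not
reported in CLIDR_EL1, such a walk never issues an operation with level
field 2 (operand bits 3:1), so the L3 would need a separate flush.
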
> > Now, back to the idea of outer_cache framework for arm64. Does your CPU
> > have separate instructions for flushing this L3 cache?
>
> No, CPU does not have separate instruction for flushing L3-cache. On the
> other hand, L3-cache has MMIO registers which can be use to explicitly
> flush L3-cache.

I guess you use those in your firmware or boot loader since Linux
requires clean/invalidated caches at boot (and I plan to push a patch
which removes kernel D-cache cleaning during boot to spot such problems
early). A cpuidle driver would probably need this as well.

-- 
Catalin