Date: Wed, 18 May 2011 18:23:43 +0300
From: Avi Kivity
To: Jan Kiszka
Cc: qemu-devel
Subject: Re: [Qemu-devel] [RFC] Memory API
Message-ID: <4DD3E47F.9060104@redhat.com>
In-Reply-To: <4DD3E1B3.3020405@siemens.com>
References: <4DD3C5B9.1080908@redhat.com> <4DD3D236.90708@siemens.com>
 <4DD3D95E.2060105@redhat.com> <4DD3E1B3.3020405@siemens.com>

On 05/18/2011 06:11 PM, Jan Kiszka wrote:
> On 2011-05-18 16:36, Avi Kivity wrote:
> >> I would add another drawback:
> >>
> >> - Inability to identify the origin of region accesses and handle them
> >>   differently based on the source.
> >>
> >> That is at least problematic for the x86 APIC, which is CPU local. Our
> >> current way to deal with it is, well, very creative, and falls to
> >> dust if a guest actually tries to remap the APIC.
> >>
> >> However, I'm unsure if that can easily be addressed. As long as only x86
> >> is affected, it's tricky to ask for a big infrastructure to handle this
> >> special case. Maybe there are some other use cases, I don't know.
> >
> > We could implement it with a per-cpu MemoryRegion, with each cpu's
> > MemoryRegion populated by a different APIC sub-region.
>
> The tricky part is wiring this up efficiently for TCG, i.e. in QEMU's
> softmmu. I played with passing the issuing CPUState (or NULL for
> devices) down the MMIO handler chain. Not totally beautiful, as
> decentralized dispatching was still required, but at least only
> moderately invasive. Maybe your API allows for cleaning up the
> management and dispatching part; need to rethink...

My suggestion is the opposite - have a different MemoryRegion for each
cpu (e.g. CPUState::memory).  Then the TLBs will resolve to a different
ram_addr_t for the same physical address, for the local APIC range.

> >
> > There is nothing we can do with a return code.  You can't fail an mmio
> > that causes an overlapping physical memory map.
>
> We must fail such requests to make progress with the API. That may
> happen either on the caller side or in cpu_register_memory_region itself
> (hwerror). Otherwise the new API will just be a shiny new facade for an
> old and still fragile building.
>
> >>
> >>>    void cpu_unregister_memory_region(MemoryRegion *mr);
> >
> > Instead, we need cpu_unregister_memory_region() to restore any
> > previously hidden ranges.
>
> I disagree. Both approaches, rejecting overlaps or restoring them, imply
> subtle semantic changes that existing device models have to deal with.
> We can't use either of them without some review and conversion work. So
> better head for the clearer and, thus, cleaner approach.

We need to head for the more hardware-like approach.  What happens when
you program overlapping BARs?  I imagine the result is
implementation-defined, but ends up with one region decoded in
preference to the other.  There is simply no way to reject an
overlapping mapping.
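
Something along these lines, as a toy model only - ToyRegion, the
priority field and toy_decode() below are invented for illustration and
are not part of the proposed API:

/* Toy model of priority-based decoding for overlapping regions.
 * Nothing below is QEMU code; ToyRegion, the priority field and
 * toy_decode() are invented purely to illustrate "one region is
 * decoded in preference to the other" instead of rejecting the
 * registration. */

#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

typedef struct ToyRegion {
    const char *name;
    uint64_t addr;        /* start address in the physical map */
    uint64_t size;
    int priority;         /* higher priority wins on overlap */
} ToyRegion;

/* Return the region that decodes the access, or NULL for an
 * unassigned address.  Overlaps are resolved, never rejected. */
static const ToyRegion *toy_decode(const ToyRegion *regions, size_t n,
                                   uint64_t addr)
{
    const ToyRegion *best = NULL;

    for (size_t i = 0; i < n; i++) {
        const ToyRegion *r = &regions[i];
        if (addr >= r->addr && addr - r->addr < r->size) {
            if (!best || r->priority > best->priority) {
                best = r;
            }
        }
    }
    return best;
}

int main(void)
{
    /* A BAR programmed on top of RAM: both registrations succeed,
     * the BAR simply shadows RAM while it is mapped there. */
    ToyRegion map[] = {
        { "ram",     0x00000000, 0x10000000, 0 },
        { "pci-bar", 0x000c0000, 0x00020000, 1 },
    };
    uint64_t addrs[] = { 0x1000, 0xc8000, 0xf0000 };

    for (size_t i = 0; i < sizeof(addrs) / sizeof(addrs[0]); i++) {
        const ToyRegion *r = toy_decode(map, 2, addrs[i]);
        printf("0x%08" PRIx64 " -> %s\n", addrs[i], r ? r->name : "<hole>");
    }
    return 0;
}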

> >> - Get rid of wasteful PhysPageDesc at this chance?
> >
> > That's the intent, but not at this chance, rather later on.
>
> The features you expose to the users somehow have to be mapped onto
> data structures internally. Those need to support both
> registration/deregistration and lookup efficiently. By postponing that
> internal design to the point when we have already switched to the
> facade, the risk arises that a suboptimal interface was exposed and the
> conversion was done in vain.

That is true, and one of the reasons this is posted as an [RFC] (I
usually prefer [PATCH]).  However, I don't see a way to change the
internals without changing the API.

> > But I want
> > the API to be compatible with the goal so we don't have to touch all
> > devices again.
>
> We can't perform any proper change in this area without touching all
> users, some a bit more, some only minimally.

Right.  I want to touch the devices *once*, to convert them to this API,
then work on the internals.

> >
> >> - How to hook into the region maintenance (CPUPhysMemoryClient,
> >>   listening vs. filtering or modifying changes)? How to simplify
> >>   memory clients this way?
> >
> > I'd leave things as is, at least for the beginning.  CPUPhysMemoryClient
> > is global in nature, whereas MemoryRegion is local (offsets are relative
> > to the containing region).
>
> See [1]: We really need to get rid of slot management on the
> CPUPhysMemoryClient side. Your API provides a perfect opportunity to
> establish the infrastructure for slot tracking in a central place. We
> can then switch from reporting cpu_registering_memory events to
> reporting coalesced changes to slots, those slots that the core also
> uses. So a new CPUPhysMemoryClient API needs to be considered in this
> API change as well - or we change twice in the end.

The kvm memory slots (and hopefully future qemu memory slots) are a
flattened view of the MemoryRegion tree.  There is no 1:1 mapping.

-- 
error compiling committee.c: too many arguments to function
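
To make that last point concrete - a tree of regions with parent-relative
offsets flattening into a list of absolute-address slots - here is a toy
sketch; ToyNode, toy_flatten() and the example map are invented for
illustration and are neither the proposed API nor the kvm slot code:

/* Toy illustration of flattening a region tree (offsets relative to
 * the containing region) into a flat list of slots with absolute
 * addresses. */

#include <inttypes.h>
#include <stdio.h>

#define TOY_MAX_CHILDREN 4

typedef struct ToyNode {
    const char *name;
    uint64_t offset;      /* relative to the containing region */
    uint64_t size;
    int is_leaf;          /* only leaves become slots */
    const struct ToyNode *child[TOY_MAX_CHILDREN];
} ToyNode;

/* Walk the tree and emit one slot per leaf, with an absolute start
 * address.  A leaf reachable through two containers would produce two
 * slots (overlap clipping is left out of this sketch), so there is no
 * 1:1 mapping between regions and slots. */
static void toy_flatten(const ToyNode *node, uint64_t base)
{
    uint64_t start = base + node->offset;

    if (node->is_leaf) {
        printf("slot: 0x%08" PRIx64 " size 0x%" PRIx64 " (%s)\n",
               start, node->size, node->name);
        return;
    }
    for (int i = 0; i < TOY_MAX_CHILDREN && node->child[i]; i++) {
        toy_flatten(node->child[i], start);
    }
}

int main(void)
{
    ToyNode ram  = { "ram",  0x00000000, 0x08000000, 1, { NULL } };
    ToyNode bar0 = { "bar0", 0x00100000, 0x00001000, 1, { NULL } };
    ToyNode pci  = { "pci",  0xe0000000, 0x10000000, 0, { &bar0 } };
    ToyNode sys  = { "system", 0, 0x100000000ULL, 0, { &ram, &pci } };

    toy_flatten(&sys, 0);   /* ram at 0x0, bar0 at 0xe0100000 */
    return 0;
}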