From: Gleb Natapov
Date: Sun, 22 May 2011 11:41:42 +0300
Subject: Re: [Qemu-devel] [RFC] Memory API
Message-ID: <20110522084142.GS27310@redhat.com>
In-Reply-To: <4DD8C03E.3020904@redhat.com>
To: Avi Kivity
Cc: Jan Kiszka, qemu-devel

On Sun, May 22, 2011 at 10:50:22AM +0300, Avi Kivity wrote:
> On 05/20/2011 02:25 PM, Gleb Natapov wrote:
> >>
> >> A) Removing regions will change significantly. So far this is done
> >> by setting a region to IO_MEM_UNASSIGNED, keeping truncation. With
> >> the new API that will be a true removal which will additionally
> >> restore hidden regions.
> >>
> >And what problem do you expect may arise from that? Currently
> >accessing such a region after unassign results in undefined
> >behaviour, so this code is non-working today; you can't make it
> >worse.
> >
>
> If the conversion were perfect then yes. However there is a
> possibility that the conversion will not be perfect.
>
> It's also good to have the code document its intentions. If you see
> _overlap() you know there is dynamic address decoding going on, or
> something clever.
>
> >> B) Uncontrolled overlapping is a bug that should be caught by the
> >> core, and a new API is a perfect chance to do this.
> >>
> >Well, this will indeed introduce a difference in behaviour :) A guest
> >that ran before will abort now. Are you actually aware of any such
> >overlaps in the current code base?
>
> Put a BAR over another BAR, then unmap it.
>
_overlap will not help with that. PCI BARs can overlap, so _overlap
will be used to register them. You do not want to abort qemu when the
guest configures overlapping PCI BARs, do you?

> >But if priorities are going to stay, why not fail when two regions
> >with the same priority overlap? If that happens it means the memory
> >creation didn't pass through the point where the conflict should have
> >been resolved (by assigning different priorities), and that means the
> >overlap is unintentional, no?
>
> It may be intentional, as in the case of PCI, or PAM (though you can
> do PAM without priorities, by removing all but one of the subregions
> in the area).
>
When two PCI BARs overlap, somebody somewhere has to decide which one
of them will be visible. You can register both of them with the same
priority and let the core decide at the time it calculates the
flattened view, or you can assign different priorities at the PCI
layer. In the latter case you do not need the _overlap API.
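
To make the two options concrete, here is a rough sketch using the
registration calls proposed in the RFC (treat the exact names and
signatures as assumptions; pci_space, bar0, bar1 and the addresses are
made up for the example):

    /* Option 1: register both BARs at the same priority and let the
     * core pick a winner when it computes the flattened view. */
    memory_region_add_subregion_overlap(pci_space, 0xe0000000, bar0, 0);
    memory_region_add_subregion_overlap(pci_space, 0xe0000000, bar1, 0);

    /* Option 2: the PCI layer decides itself that bar0 hides bar1 and
     * registers only what is actually visible, using the plain
     * non-overlapping call -- no _overlap API needed. */
    memory_region_add_subregion(pci_space, 0xe0000000, bar0);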
> >I am starting to see how you can represent all these local decisions
> >as priority numbers and then travel this weighted tree to find which
> >memory region should be accessed (memory registration _has_ to be
> >hierarchical for that to work in a meaningful way).
>
> Priorities don't travel up the tree. They're used to resolve local
> conflicts *only*.
>
Yes, that is what I mean by "traveling a weighted tree". At each node
you look at the local priorities to decide where to move next.

> >I still don't see why it is better than flattening the tree at the
> >point of conflict.
>
> How do you decide which subregion wins?
>
The same way you decide what priority to assign to each region: you do
it at the point where the priorities are known. Suppose the chipset has
RAM from 0x0000 to 0x1fff and a PCI window from 0x2000 to 0x2fff, and
the PCI layer tries to map a BAR from 0x1500 to 0x3500. Since the
chipset code knows that RAM has higher priority, and it knows where the
PCI window ends, it can create two memory regions from that: RAM from
0x0000 to 0x1fff and the PCI BAR from 0x2000 to 0x2fff. How will your
tree look in this case, BTW?
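
In code, that chipset-side clipping might look roughly like this (a
sketch only; register_flat_region and the constants are hypothetical
and not part of the proposed API):

    #include <stdint.h>

    /* Hypothetical call that hands one conflict-free region to the
     * core; illustration only. */
    void register_flat_region(uint64_t start, uint64_t size);

    enum {
        RAM_END        = 0x2000, /* RAM covers 0x0000-0x1fff */
        PCI_WINDOW_END = 0x3000  /* window covers 0x2000-0x2fff */
    };

    /* The chipset clips a BAR mapping request against what it alone
     * knows: RAM always wins below RAM_END, and nothing is decoded
     * past the end of the PCI window.  Only the surviving piece is
     * registered, so the core never sees an overlap or a priority. */
    static void chipset_map_bar(uint64_t start, uint64_t end)
    {
        uint64_t lo = start < RAM_END ? RAM_END : start;
        uint64_t hi = end > PCI_WINDOW_END ? PCI_WINDOW_END : end;

        if (lo < hi) {
            register_flat_region(lo, hi - lo);
        }
    }

    /* chipset_map_bar(0x1500, 0x3500) registers just 0x2000-0x2fff. */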

> >> Not necessarily. It depends on how much added value buses like PCI
> >> or ISA or whatever can offer for managing I/O regions. For some
> >> purposes, it may as well be fine to just call the memory_* service
> >> directly and pass the result of some operation to the bus API later
> >> on.
> >
> >That depends on which memory_* service you are talking about. Just
> >creating an unattached memory region is OK. But if two independent
> >pieces of code want to map two different memory regions at the same
> >physical address, I do not see who will resolve the conflict.
>
> They have to ask the bus to _add_subregion(). Only the bus knows
> about the priorities (or the bus can ask them to create the
> subregions).
>
Yes, that is what I wrote and what Jan responded to. Buses all the way
down.

> >> > A PCI device will call the PCI subsystem. The PCI subsystem,
> >> > instead of assigning arbitrary priorities to all overlaps,
> >>
> >> Again: PCI will _not_ assign arbitrary priorities but only
> >> MEMORY_REGION_DEFAULT_PRIORITY, likely 0.
> >
> >That is as arbitrary as it gets. Assigning
> >MEMORY_REGION_DEFAULT_PRIORITY/2^0xfff would work equally well, so
> >what is not arbitrary about that number?
>
> That's just splitting hairs. Array indexes start from zero, an
> arbitrary but convenient number.
>
The point is that instead of assigning priorities, the PCI subsystem
may resolve the conflict before passing the registration on to the
chipset.

> >BTW, why wouldn't the PCI layer assign different priorities to
> >overlapping regions to let the core know which one should actually be
> >available? Why leave this decision to the core if it clearly belongs
> >to PCI?
>
> You mean overlapping BARs? If PCI wants BAR 1 to override BAR 2,
> then it can indicate that with priorities. If it doesn't want to, it
> can use the same priority for all regions.
>
> >> That does not specify how the PCI bridge or the chipset will tell
> >> that overlap-resolution lib _how_ overlapping regions shall be
> >> translated into a flat representation. And precisely here
> >> priorities come into play. They are the way to tell that lib either
> >> "region A shall override region B" if A has the higher prio, or "if
> >> regions A and B overlap, do whatever you want" if both have the
> >> same prio.
> >>
> >Yep! And the question is why this shouldn't be done at the level that
> >knows most about the conflict, instead of being propagated to the
> >core. I am not arguing that priorities do not exist! Obviously they
> >do. I am questioning the usefulness of making priorities part of the
> >memory core API.
> >
>
> The chipset knows about the priorities. How to communicate them to
> the core?
>
> - at runtime, with hierarchical dispatch of ->read() and ->write():
>   slow, and doesn't work at all for RAM.
> - using registration order: fragile
> - using priorities
>
- by resolving the overlaps and registering a flattened list with the
  core (see the example above).

> We need to get the information out of the chipset and into the core,
> so the core can make use of it (like flattening the tree to produce
> kvm slots).
>
> --
> I have a truly marvellous patch that fixes the bug which this
> signature is too narrow to contain.

--
        Gleb.