On 2011-05-18 21:10, Anthony Liguori wrote:
> On 05/18/2011 10:30 AM, Jan Kiszka wrote:
>> On 2011-05-18 17:17, Peter Maydell wrote:
>>> On 18 May 2011 16:11, Jan Kiszka wrote:
>>>> On 2011-05-18 16:36, Avi Kivity wrote:
>>>>> There is nothing we can do with a return code. You can't fail an mmio
>>>>> that causes overlapping physical memory map.
>>>>
>>>> We must fail such requests to make progress with the API. That may
>>>> happen either on caller side or in cpu_register_memory_region itself
>>>> (hwerror). Otherwise the new API will just be a shiny new facade for an
>>>> old and still fragile building.
>>>
>>> If we don't allow overlapping regions, then how do you implement
>>> things like "on startup board maps ROM into lower addresses
>>> over top of devices, but later it is unmapped and you can see
>>> the underlying devices" ? (You can't currently do this AFAIK,
>>> and it would be nice if the new API supported it.)
>>
>> Right, we can't do this properly, and that's why the attempt in the
>> i440fx chipset model is so horribly broken ATM.
>>
>> Just allowing overlapping does not solve this problem either. It does
>> not specify what region is on top and what is below (even worse if
>> multiple regions overlap at the place).
>>
>> We need some managing instance here, and that is e.g. the chipset that
>> provides control over the overlap in reality. It could hook up a
>> PhysMemClient to receive and redirect updates to subregions, or only
>> allow to register them in disabled state.
>
> I think that gets ugly pretty fast. The way this works IRL is that all
> I/O dispatches pass through the chipset. You literally need something
> as simple as:
>
> static void i440fx_io_intercept(void *opaque, uint64_t addr, uint32_t
>                                 value, int size, MemRegion *next)
> {
>     I440FX *s = opaque;
>
>     if (range_overlaps(addr, size, PAM_REGION)) {
>         ...
>     } else {
>         dispatch_io(next, addr, value, size);
>     }
> }
>
> There's no need for an explicit intercept mechanism if you make multiple
> levels have their own dispatch tables and register progressively larger
> regions. In fact....
>
> You really don't need to register 90% of the time. In the case of a PC
> with i440fx, it's really quite simple:
>
> if an I/O is to the APIC page,
>    it's handled by the APIC

It's not that simple. We need to tell apart:
 - if a cpu issued the request, and which one
   => forward to APIC
 - if the range was addressed by a device (PCI or other system bus
   devices)
   => forward to MSI or other MMIO handlers

> elif the I/O is in ROM regions:
>    if PAM RE or WE
>       redirect to RAM appropriately
>    else:
>       send to ROMs
> elif the I/O is in the PCI windows:
>    send to the PCI bus
> else:
>    send to the PIIX3
>
> For x86, you could easily completely skip the explicit registration and
> just have a direct dispatch to the i440fx and implement something
> slightly more fancy than the above.
>
> And I think this is true for most other types of boards too.

This all boils down to one thing: we need to stop accepting memory region
mappings from everywhere at core level and instead dispatch them properly
up the device tree. A device should register against its bus, which can
then forward or handle the mapping directly. And that handling requires a
central toolbox to avoid reinventing wheels.

That's a worthwhile change, though a much bigger one than I was
originally hoping to get away with.

Jan
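
For illustration only, the dispatch order quoted above, combined with the
CPU-versus-device distinction, could be sketched in C roughly as follows.
Everything in it is an assumption made for the example and not an existing
QEMU definition: the names (ChipsetState, AccessSource, Target,
chipset_route), the single pair of PAM flags standing in for the
per-segment PAM registers, the 0xc0000-0xfffff PAM range, and the
configurable PCI window. The APIC base is the architectural default,
0xfee00000.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for this sketch; none of them exist in QEMU. */
typedef enum { SRC_CPU, SRC_DEVICE } AccessSource;
typedef enum {
    TGT_APIC, TGT_MSI, TGT_RAM, TGT_ROM, TGT_PCI, TGT_PIIX3
} Target;

typedef struct {
    bool pam_read_enable;    /* simplified PAM RE: reads go to RAM */
    bool pam_write_enable;   /* simplified PAM WE: writes go to RAM */
    uint64_t pci_base;       /* start of the PCI window (assumed) */
    uint64_t pci_size;       /* length of the PCI window (assumed) */
} ChipsetState;

#define APIC_BASE 0xfee00000ULL  /* default local APIC page */
#define APIC_SIZE 0x1000ULL
#define PAM_BASE  0xc0000ULL     /* assumed PAM-controlled range */
#define PAM_SIZE  0x40000ULL

/* True if [addr, addr+size) intersects [base, base+len).
 * Ignores wraparound at the very top of the address space. */
static bool range_overlaps(uint64_t addr, uint64_t size,
                           uint64_t base, uint64_t len)
{
    return addr < base + len && base < addr + size;
}

/* Decide which component should handle an access. The order of the
 * checks mirrors the pseudocode in the thread. */
static Target chipset_route(const ChipsetState *s, AccessSource src,
                            uint64_t addr, unsigned size, bool is_write)
{
    assert(s && size > 0);

    if (range_overlaps(addr, size, APIC_BASE, APIC_SIZE)) {
        /* Same address, different meaning depending on who asks. */
        return src == SRC_CPU ? TGT_APIC : TGT_MSI;
    }
    if (range_overlaps(addr, size, PAM_BASE, PAM_SIZE)) {
        bool to_ram = is_write ? s->pam_write_enable
                               : s->pam_read_enable;
        return to_ram ? TGT_RAM : TGT_ROM;
    }
    if (range_overlaps(addr, size, s->pci_base, s->pci_size)) {
        return TGT_PCI;
    }
    return TGT_PIIX3;  /* everything else falls through */
}
```

With the read flag set and the write flag clear, a CPU read at 0xf0000
would be routed to RAM while a write to the same address still takes the
ROM path. A real model would also need to know which CPU issued the
access and would track PAM state per segment.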