On 2011-05-19 20:50, Gleb Natapov wrote:
> On Thu, May 19, 2011 at 08:45:23PM +0200, Jan Kiszka wrote:
>> On 2011-05-19 20:40, Gleb Natapov wrote:
>>> On Thu, May 19, 2011 at 08:27:50PM +0200, Jan Kiszka wrote:
>>>> On 2011-05-19 20:22, Gleb Natapov wrote:
>>>>> On Thu, May 19, 2011 at 08:11:39PM +0200, Jan Kiszka wrote:
>>>>>> On 2011-05-19 19:39, Gleb Natapov wrote:
>>>>>>> On Thu, May 19, 2011 at 03:37:58PM +0200, Jan Kiszka wrote:
>>>>>>>> On 2011-05-19 15:36, Anthony Liguori wrote:
>>>>>>>>> On 05/18/2011 02:40 PM, Jan Kiszka wrote:
>>>>>>>>>> On 2011-05-18 15:12, Avi Kivity wrote:
>>>>>>>>>>> void cpu_register_memory_region(MemoryRegion *mr,
>>>>>>>>>>>                                 target_phys_addr_t addr);
>>>>>>>>>>
>>>>>>>>>> OK, let's allow overlapping, but make it explicit:
>>>>>>>>>>
>>>>>>>>>> void cpu_register_memory_region_overlap(MemoryRegion *mr,
>>>>>>>>>>                                         target_phys_addr_t addr,
>>>>>>>>>>                                         int priority);
>>>>>>>>>
>>>>>>>>> The device doesn't actually know how overlapping is handled. This is
>>>>>>>>> based on the bus hierarchy.
>>>>>>>>
>>>>>>>> Devices don't register their regions, buses do.
>>>>>>>>
>>>>>>> Today a PCI device may register a region that overlaps with any other
>>>>>>> registered memory region without even knowing it. The guest can write
>>>>>>> any RAM address into a PCI BAR, and that RAM address will become an
>>>>>>> mmio area. More interesting is what happens when the guest reprograms
>>>>>>> the PCI BAR to another address - the RAM that was at the previous
>>>>>>> address just disappears. Obviously this is crazy behaviour, but the
>>>>>>> question is how do we want to handle it? One option is to disallow
>>>>>>> such overlapping registrations, another is to restore the RAM mapping
>>>>>>> after the PCI BAR is reprogrammed. If we choose the second one, PCI
>>>>>>> will not know that _overlap() should be called.
>>>>>>
>>>>>> BARs may overlap with other BARs or with RAM. That's well-known, so PCI
>>>>>> bridges need to register their regions with the _overlap variant
>>>>>> unconditionally. In contrast to the current PhysPageDesc mechanism, the
>>>>>
>>>>> With what priority? If it needs to call _overlap unconditionally, why
>>>>> not always call _overlap and drop the non-_overlap variant?
>>>>
>>>> Because we should catch accidental overlaps in all those non-PCI devices
>>>> with hard-wired addressing. That's a bug in the device/machine model and
>>>> should be reported as such by QEMU.
>>>
>>> Why should we complicate the API to catch unlikely errors? If you want to
>>> debug that, add a capability to dump the memory map from the monitor.
>>
>> Because we need to switch tons of code that so far saw a fairly
>> different reaction of the core to overlapping regions.
>>
> How so? Today, if there is an accidental overlap, the device will not
> function properly. With the new API it will be the same.

I rather expect subtle differences: so far, an overlapping registration
changed the existing regions; in the future those will recover.

>
>>>
>>>>
>>>>>
>>>>>> new region management will not cause any harm to overlapping regions so
>>>>>> that they can "recover" when the overlap is gone.
>>>>>>
>>>>>>>
>>>>>>> Another example may be the APIC region and PCI. They overlap, but
>>>>>>> neither the CPU nor PCI knows about it.
>>>>>>
>>>>>> And they do not need to. The APIC regions will be managed by the per-CPU
>>>>>> region management, reusing the toolbox we need for all bridges. It will
>>>>>> register the APIC page with a priority higher than the default one, thus
>>>>>> overriding everything that comes from the host bridge.
>>>>>> I think that reflects pretty well real machine behaviour.
>>>>>>
>>>>> What is "higher"? How does it know that the priority is high enough?
>>>>
>>>> Because no one else manages priorities at a specific hierarchy level.
>>>> There is only one.
>>>>
>>> PCI and CPU are on different hierarchy levels. PCI is under the PIIX,
>>> and the CPU is on the system bus.
>>
>> The priority for the APIC mapping will be applied at CPU level, of
>> course. So it will override everything, not just PCI.
>>
> So you do not need an explicit priority because the place in the
> hierarchy implicitly provides you with one.

Yes. Alternatively, you could add a prio offset to all mappings when
climbing one level up, provided that offset is larger than the prio range
locally available to each level.

Jan
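
To make that concrete, here is a toy sketch (self-contained C) of how the
hierarchy-implied priorities discussed above could resolve the APIC-vs-BAR
case. Only the MemoryRegion name and the idea of per-level priorities come
from this thread; the struct layout, the depth()/prio_at_level()/resolve()
helpers and the example region names are invented purely for illustration
and are not part of any proposed implementation.

#include <stdio.h>

typedef struct MemoryRegion MemoryRegion;
struct MemoryRegion {
    const char *name;
    int local_prio;        /* only compared among siblings of one level */
    MemoryRegion *parent;  /* container one level up, NULL at the root */
};

static int depth(const MemoryRegion *mr)
{
    int d = 0;
    while (mr->parent) {
        mr = mr->parent;
        d++;
    }
    return d;
}

/* Local priority of the ancestor of 'mr' sitting at 'level' (root = 0). */
static int prio_at_level(const MemoryRegion *mr, int level)
{
    for (int d = depth(mr); d > level; d--) {
        mr = mr->parent;
    }
    return mr->local_prio;
}

/*
 * Where two mappings claim the same address, compare their priority paths
 * from the (shared) root down; the first level at which they diverge
 * decides. If one path is a prefix of the other, the deeper, more
 * specific mapping wins.
 */
static const MemoryRegion *resolve(const MemoryRegion *a, const MemoryRegion *b)
{
    int da = depth(a), db = depth(b);
    int common = da < db ? da : db;

    for (int level = 1; level <= common; level++) {
        int pa = prio_at_level(a, level);
        int pb = prio_at_level(b, level);
        if (pa != pb) {
            return pa > pb ? a : b;
        }
    }
    return da >= db ? a : b;
}

int main(void)
{
    /* CPU-level container and its two children: the APIC page registered
     * via _overlap with prio 1, the host bridge at the default prio 0. */
    MemoryRegion cpu    = { "cpu-address-space", 0, NULL };
    MemoryRegion apic   = { "apic-page",         1, &cpu };
    MemoryRegion sysmem = { "host-bridge",       0, &cpu };

    /* A BAR registered via _overlap two levels further down; whatever its
     * local priority, it is still reached through the prio-0 host bridge. */
    MemoryRegion pci = { "pci-space",  0, &sysmem };
    MemoryRegion bar = { "e1000-bar0", 7, &pci };

    printf("winner at the APIC page: %s\n", resolve(&apic, &bar)->name);
    return 0;
}

The lexicographic comparison is just the hierarchy-implicit ordering
described above; the per-level offset alternative would collapse the same
ordering into a single integer, which only works if the offset added per
level exceeds the priority range actually used below it.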