* [Qemu-devel] Flatview rendering scalability issue
@ 2019-03-11 9:26 Sergio Lopez
2019-03-11 10:19 ` Paolo Bonzini
0 siblings, 1 reply; 6+ messages in thread
From: Sergio Lopez @ 2019-03-11 9:26 UTC
To: qemu-devel; +Cc: pbonzini
Hi,
Thanks to Q35/PCIe, we can now assign a large number of PCI devices to a
single VM, but it seems that Flatview rendering scales poorly (worse
than linear) when it has to deal with a large number of Memory Regions.
I've measured the cost of the pci_default_write_config() call at
virtio_write_config() for 1 PCI device vs. 100 PCI devices:
- 1 PCI device
write_config: 1879 us
write_config: 1037 us
write_config: 1 us
write_config: 3 us
write_config: 1783 us
write_config: 2652 us
write_config: 1 us
write_config: 2 us
write_config: 1551 us
- 100 PCI devices
write_config: 503963 us
write_config: 1 us
write_config: 493344 us
write_config: 0 us
write_config: 472946 us
write_config: 1 us
write_config: 495175 us
write_config: 1 us
write_config: 519312 us
write_config: 1 us
I guess this is a consequence of having to reset/rebuild the Flatview
when altering the PCI BAR regions.
Is this a known issue we're already working on?
Thanks,
Sergio (slp).
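The figures in the message above come from timing individual calls. A minimal sketch of that kind of measurement in plain C follows. It is not the instrumentation used in the message: `elapsed_us` and `time_call_us` are hypothetical helper names, and the callback only stands in for the pci_default_write_config() call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Whole microseconds between two CLOCK_MONOTONIC samples (end >= start). */
int64_t elapsed_us(const struct timespec *start, const struct timespec *end)
{
    int64_t sec = (int64_t)end->tv_sec - (int64_t)start->tv_sec;
    int64_t nsec = (int64_t)end->tv_nsec - (int64_t)start->tv_nsec;

    /* Combine first, then divide, so a negative nsec delta is handled. */
    return (sec * 1000000000LL + nsec) / 1000;
}

/* Time one invocation of fn(opaque) and print it in the thread's format. */
int64_t time_call_us(void (*fn)(void *), void *opaque)
{
    struct timespec start, end;
    int64_t us;

    assert(fn != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);
    fn(opaque);                       /* the call being measured */
    clock_gettime(CLOCK_MONOTONIC, &end);

    us = elapsed_us(&start, &end);
    printf("write_config: %lld us\n", (long long)us);
    return us;
}
```

Wrapping the call of interest in time_call_us() prints one line per call in the same shape as the log above. Inside QEMU one would more likely use its own clock helpers, which this sketch does not try to reproduce.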
* Re: [Qemu-devel] Flatview rendering scalability issue
2019-03-11 9:26 [Qemu-devel] Flatview rendering scalability issue Sergio Lopez
@ 2019-03-11 10:19 ` Paolo Bonzini
2019-03-11 13:48 ` Sergio Lopez
0 siblings, 1 reply; 6+ messages in thread
From: Paolo Bonzini @ 2019-03-11 10:19 UTC
To: Sergio Lopez, qemu-devel
On 11/03/19 10:26, Sergio Lopez wrote:
> I guess this is a consequence of having to reset/rebuild the Flatview
> when altering the PCI BAR regions.
>
> Is this a known issue we're already working on?
What version of QEMU is this?
The initialization is O(n^2) because the guest initializes one device at
a time, so you rebuild the FlatView first with 0 devices, then 1, then
2, etc. This is very hard to fix, if at all possible.
However, each FlatView creation should be O(n) where n is the number of
devices currently configured. Please check with "info mtree -f" that
you only have a fixed number of FlatViews. Old versions had one per device.
Paolo
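The quadratic cost described in the message above follows from summing the per-rebuild cost: if the rebuild that happens with k devices already configured costs on the order of k, then bringing up n devices one at a time costs 0 + 1 + ... + (n-1) = n(n-1)/2. A small sketch that counts this in abstract work units, not real time (`total_rebuild_work` is a hypothetical name):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Abstract cost of bringing up n devices one at a time, assuming the
 * FlatView rebuild done with k devices already configured costs k units.
 */
uint64_t total_rebuild_work(uint64_t n)
{
    uint64_t total = 0;

    for (uint64_t k = 0; k < n; k++) {
        total += k;   /* one O(k) rebuild per newly configured device */
    }
    return total;     /* equals n * (n - 1) / 2 */
}
```

Under that assumption, 10 devices cost 45 units and 100 devices cost 4950 units: ten times the devices, 110 times the work.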
* Re: [Qemu-devel] Flatview rendering scalability issue
2019-03-11 10:19 ` Paolo Bonzini
@ 2019-03-11 13:48 ` Sergio Lopez
2019-03-11 14:07 ` Paolo Bonzini
0 siblings, 1 reply; 6+ messages in thread
From: Sergio Lopez @ 2019-03-11 13:48 UTC
To: Paolo Bonzini; +Cc: qemu-devel
Paolo Bonzini writes:
> On 11/03/19 10:26, Sergio Lopez wrote:
>> I guess this is a consequence of having to reset/rebuild the Flatview
>> when altering the PCI BAR regions.
>>
>> Is this a known issue we're already working on?
>
> What version of QEMU is this?
This is upstream as of 6cb4f6db4f4367f (Mar 07 2019).
> The initialization is O(n^2) because the guest initializes one device at
> a time, so you rebuild the FlatView first with 0 devices, then 1, then
> 2, etc. This is very hard to fix, if at all possible.
>
> However, each FlatView creation should be O(n) where n is the number of
> devices currently configured. Please check with "info mtree -f" that
> you only have a fixed number of FlatViews. Old versions had one per device.
I'm seeing 9 FVs with 1 PCI, and 119 with 100 PCIs.
Thanks,
Sergio.
* Re: [Qemu-devel] Flatview rendering scalability issue
2019-03-11 13:48 ` Sergio Lopez
@ 2019-03-11 14:07 ` Paolo Bonzini
2019-03-11 14:35 ` Sergio Lopez
2019-03-12 3:23 ` Peter Xu
0 siblings, 2 replies; 6+ messages in thread
From: Paolo Bonzini @ 2019-03-11 14:07 UTC
To: Sergio Lopez; +Cc: qemu-devel, Peter Xu
On 11/03/19 14:48, Sergio Lopez wrote:
>> The initialization is O(n^2) because the guest initializes one device at
>> a time, so you rebuild the FlatView first with 0 devices, then 1, then
>> 2, etc. This is very hard to fix, if at all possible.
>>
>> However, each FlatView creation should be O(n) where n is the number of
>> devices currently configured. Please check with "info mtree -f" that
>> you only have a fixed number of FlatViews. Old versions had one per device.
> I'm seeing 9 FVs with 1 PCI, and 119 with 100 PCIs.
With
$ eval qemu-system-x86_64 -M q35 \
-device\ e1000,id=n{1,2,3,4,5,6,7,8}{1,2,3}
I only see 4 flat views ("system", "io", "memory", "(none)").
Probably you are using intel-iommu? Peter, it should be possible to
reorganize the VT-d memory regions like this:
    intel_iommu_ir (MMIO, not added to any container)

    vtd_root_dmar (container)
        intel_iommu_dmar (IOMMU), priority 0
        alias to intel_iommu_ir, priority 1

    vtd_root_nodmar
        alias to get_system_memory(), priority 0
        alias to intel_iommu_ir, priority 1

    vtd_root_0 memory region (container)
        vtd_root_dmar       # only one of these is enabled
        vtd_root_nodmar
where the vtd_root_dmar and vtd_root_nodmar memory regions are created
in vtd_init once and for all. Because all vtd_root_* memory regions
have only one child, memory.c will recognize that they represent the
same memory, and create at most two FlatViews (one for vtd_root_dmar,
one for vtd_root_nodmar).
Paolo
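The sharing argument in the message above can be illustrated with a toy model. This is a sketch under loud assumptions, not QEMU code: `ToyRegion`, `toy_add_child`, `toy_flatview_root` and `toy_count_distinct_roots` are hypothetical names, and the model ignores addresses, sizes and priorities, which the real logic in memory.c has to check. The idea shown is only that a container with exactly one enabled child exposes the same memory as that child, so many per-device containers can resolve to the same root and share one FlatView.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CHILDREN 4

/* Toy stand-in for a memory region; NOT QEMU's MemoryRegion. */
struct ToyRegion {
    const char *name;
    bool enabled;
    struct ToyRegion *alias;                      /* non-NULL for an alias */
    struct ToyRegion *children[TOY_MAX_CHILDREN];
    size_t nr_children;
};

void toy_add_child(struct ToyRegion *parent, struct ToyRegion *child)
{
    assert(parent->nr_children < TOY_MAX_CHILDREN);
    parent->children[parent->nr_children++] = child;
}

/*
 * Follow aliases and descend through containers that have exactly one
 * enabled child, returning the region a view would actually be built from.
 */
struct ToyRegion *toy_flatview_root(struct ToyRegion *mr)
{
    assert(mr != NULL);
    for (;;) {
        struct ToyRegion *only = NULL;
        size_t enabled = 0;

        if (mr->alias != NULL) {
            mr = mr->alias;
            continue;
        }
        for (size_t i = 0; i < mr->nr_children; i++) {
            if (mr->children[i]->enabled) {
                only = mr->children[i];
                enabled++;
            }
        }
        if (enabled != 1) {
            return mr;   /* leaf, or several enabled children: stop here */
        }
        mr = only;
    }
}

/* Number of distinct resolved roots, i.e. views needed, for n containers. */
size_t toy_count_distinct_roots(struct ToyRegion **roots, size_t n)
{
    size_t distinct = 0;

    for (size_t i = 0; i < n; i++) {
        struct ToyRegion *r = toy_flatview_root(roots[i]);
        bool seen = false;

        for (size_t j = 0; j < i && !seen; j++) {
            seen = (toy_flatview_root(roots[j]) == r);
        }
        if (!seen) {
            distinct++;
        }
    }
    return distinct;
}
```

With one vtd_root_N container per device, each holding an alias to a shared vtd_root_dmar and an alias to a shared vtd_root_nodmar with only one of the two enabled, every container resolves to one of the two shared regions, so the count stays at two however many devices there are.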
* Re: [Qemu-devel] Flatview rendering scalability issue
2019-03-11 14:07 ` Paolo Bonzini
@ 2019-03-11 14:35 ` Sergio Lopez
2019-03-12 3:23 ` Peter Xu
1 sibling, 0 replies; 6+ messages in thread
From: Sergio Lopez @ 2019-03-11 14:35 UTC
To: Paolo Bonzini; +Cc: qemu-devel, Peter Xu
Paolo Bonzini writes:
> On 11/03/19 14:48, Sergio Lopez wrote:
>>> The initialization is O(n^2) because the guest initializes one device at
>>> a time, so you rebuild the FlatView first with 0 devices, then 1, then
>>> 2, etc. This is very hard to fix, if at all possible.
>>>
>>> However, each FlatView creation should be O(n) where n is the number of
>>> devices currently configured. Please check with "info mtree -f" that
>>> you only have a fixed number of FlatViews. Old versions had one per device.
>> I'm seeing 9 FVs with 1 PCI, and 119 with 100 PCIs.
>
> With
>
> $ eval qemu-system-x86_64 -M q35 \
> -device\ e1000,id=n{1,2,3,4,5,6,7,8}{1,2,3}
>
> I only see 4 flat views ("system", "io", "memory", "(none)").
>
> Probably you are using intel-iommu? Peter, it should be possible to
> reorganize the VT-d memory regions like this:
You're right, the number of FVs goes down drastically after removing
intel-iommu, and the slowness during Guest PCI initialization disappears
with it.
Thanks,
Sergio.
* Re: [Qemu-devel] Flatview rendering scalability issue
2019-03-11 14:07 ` Paolo Bonzini
2019-03-11 14:35 ` Sergio Lopez
@ 2019-03-12 3:23 ` Peter Xu
1 sibling, 0 replies; 6+ messages in thread
From: Peter Xu @ 2019-03-12 3:23 UTC
To: Paolo Bonzini; +Cc: Sergio Lopez, qemu-devel
On Mon, Mar 11, 2019 at 03:07:43PM +0100, Paolo Bonzini wrote:
> On 11/03/19 14:48, Sergio Lopez wrote:
> >> The initialization is O(n^2) because the guest initializes one device at
> >> a time, so you rebuild the FlatView first with 0 devices, then 1, then
> >> 2, etc. This is very hard to fix, if at all possible.
> >>
> >> However, each FlatView creation should be O(n) where n is the number of
> >> devices currently configured. Please check with "info mtree -f" that
> >> you only have a fixed number of FlatViews. Old versions had one per device.
> > I'm seeing 9 FVs with 1 PCI, and 119 with 100 PCIs.
>
> With
>
> $ eval qemu-system-x86_64 -M q35 \
> -device\ e1000,id=n{1,2,3,4,5,6,7,8}{1,2,3}
>
> I only see 4 flat views ("system", "io", "memory", "(none)").
>
> Probably you are using intel-iommu? Peter, it should be possible to
> reorganize the VT-d memory regions like this:
>
>     intel_iommu_ir (MMIO, not added to any container)
>
>     vtd_root_dmar (container)
>         intel_iommu_dmar (IOMMU), priority 0
>         alias to intel_iommu_ir, priority 1
>
>     vtd_root_nodmar
>         alias to get_system_memory(), priority 0
>         alias to intel_iommu_ir, priority 1
>
>     vtd_root_0 memory region (container)
>         vtd_root_dmar       # only one of these is enabled
>         vtd_root_nodmar
>
> where the vtd_root_dmar and vtd_root_nodmar memory regions are created
> in vtd_init once and for all. Because all vtd_root_* memory regions
> have only one child, memory.c will recognize that they represent the
> same memory, and create at most two FlatViews (one for vtd_root_dmar,
> one for vtd_root_nodmar).
Yes, this sounds good. The only thing I'm still uncertain about is the
IOMMU notifiers, which should be per-device (for real). So far they are
embedded in IOMMUMemoryRegion, which also includes the real MR
object:
struct IOMMUMemoryRegion {
    MemoryRegion parent_obj;
    QLIST_HEAD(, IOMMUNotifier) iommu_notify;
    IOMMUNotifierFlag iommu_notify_flags;
};
Maybe I should also make parent_obj a pointer to the created MRs
mentioned above, so IOMMUMemoryRegion only contains notification
information rather than real MRs (otherwise we won't have a chance to
share memory regions between devices)?
(But if so, then TYPE_INTEL_IOMMU_MEMORY_REGION might not be able to
inherit from TYPE_IOMMU_MEMORY_REGION directly, and I've not thought
about the details of that yet.)
Regards,
--
Peter Xu
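One possible reading of the idea in the message above, sketched with toy types: the per-device object keeps only its notifier list and a pointer to a region that can be shared between devices, instead of embedding the region. Everything here is hypothetical (`ToyMemoryRegion`, `ToyNotifier`, `ToyDeviceIOMMU` and the `toy_*` functions are not QEMU APIs), and the sketch says nothing about how the QOM type hierarchy would have to change.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; the real types live in QEMU's memory API. */
struct ToyMemoryRegion {
    const char *name;
};

struct ToyNotifier {
    void (*notify)(struct ToyNotifier *n, unsigned long iova);
    struct ToyNotifier *next;
    size_t hits;              /* bookkeeping used by toy_record_cb */
    unsigned long last_iova;
};

/* Per-device state: notification data plus a pointer to a shared region. */
struct ToyDeviceIOMMU {
    struct ToyMemoryRegion *shared_mr;  /* pointer instead of embedded MR */
    struct ToyNotifier *notifiers;      /* stays per-device */
};

/* Sample callback: remember how often and for which address it ran. */
void toy_record_cb(struct ToyNotifier *n, unsigned long iova)
{
    n->hits++;
    n->last_iova = iova;
}

void toy_register_notifier(struct ToyDeviceIOMMU *dev, struct ToyNotifier *n)
{
    assert(n->notify != NULL);
    n->next = dev->notifiers;
    dev->notifiers = n;
}

/* Deliver an event to one device's notifiers; returns how many ran. */
size_t toy_notify(struct ToyDeviceIOMMU *dev, unsigned long iova)
{
    size_t count = 0;

    for (struct ToyNotifier *n = dev->notifiers; n != NULL; n = n->next) {
        n->notify(n, iova);
        count++;
    }
    return count;
}
```

Two ToyDeviceIOMMU instances can then point at the same ToyMemoryRegion while an event sent to one device reaches only that device's notifiers, which is the separation the message is asking about.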