qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/1] Skip flatview_simplify() for cpu vendor zhaoxin
@ 2020-10-16 11:29 FelixCuioc
  2020-10-16 11:29 ` [PATCH 1/1] " FelixCuioc
  0 siblings, 1 reply; 15+ messages in thread
From: FelixCuioc @ 2020-10-16 11:29 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost, Alex Williamson
  Cc: TonyWWang-oc, RockCui-oc, qemu-devel, CobeChen-oc

The actual situation we encountered is:
When assign EHCI device to the virtual machine,
after initializing EHCI in seabios,it will continuously
send dma cycles.
After flatview_simplify(),the IOVA mappings of the
EHCI device will be innocently unmapped between the
delate and add phases of the VFIO MemoryListener.
But the EHCI device is always sending DMA cycle.
At this time,the IOMMU will block the DMA cycle.
The purpose of this patch is to skip flatview_simplify()
for zx vendor.

FelixCuioc (1):
  Skip flatview_simplify() for cpu vendor zhaoxin

 softmmu/memory.c  | 20 +++++++++++++++++++-
 target/i386/cpu.c |  7 +++++++
 target/i386/cpu.h |  3 +++
 3 files changed, 29 insertions(+), 1 deletion(-)

-- 
2.17.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-16 11:29 [PATCH 0/1] Skip flatview_simplify() for cpu vendor zhaoxin FelixCuioc
@ 2020-10-16 11:29 ` FelixCuioc
  2020-10-16 11:42   ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: FelixCuioc @ 2020-10-16 11:29 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost, Alex Williamson
  Cc: TonyWWang-oc, RockCui-oc, qemu-devel, CobeChen-oc

The issue here is that an assinged EHCI device accesses
an adjacent mapping between the delete and add phases
of the VFIO MemoryListener.
We want to skip flatview_simplify() is to prevent EHCI
device IOVA mappings from being unmapped.

Signed-off-by: FelixCuioc <FelixCui-oc@zhaoxin.com>
---
 softmmu/memory.c  | 20 +++++++++++++++++++-
 target/i386/cpu.c |  7 +++++++
 target/i386/cpu.h |  3 +++
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index 403ff3abc9..a998018d87 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -700,6 +700,22 @@ static MemoryRegion *memory_region_get_flatview_root(MemoryRegion *mr)
     return NULL;
 }
 
+static bool skip_simplify(void)
+{
+#ifdef I386_CPU_H
+    char vendor[CPUID_VENDOR_SZ + 1] = { 0 };
+    host_get_vendor(vendor);
+    if (!strncmp(vendor, CPUID_VENDOR_VIA, strlen(CPUID_VENDOR_VIA))
+        || !strncmp(vendor, CPUID_VENDOR_ZHAOXIN,
+                  strlen(CPUID_VENDOR_ZHAOXIN))) {
+        return true;
+    }
+    return false;
+#else
+    return false;
+#endif
+}
+
 /* Render a memory topology into a list of disjoint absolute ranges. */
 static FlatView *generate_memory_topology(MemoryRegion *mr)
 {
@@ -713,7 +729,9 @@ static FlatView *generate_memory_topology(MemoryRegion *mr)
                              addrrange_make(int128_zero(), int128_2_64()),
                              false, false);
     }
-    flatview_simplify(view);
+    if (!skip_simplify()) {
+        flatview_simplify(view);
+    }
 
     view->dispatch = address_space_dispatch_new(view);
     for (i = 0; i < view->nr; i++) {
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 5d713c8528..6bda35a03e 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1559,6 +1559,13 @@ void host_cpuid(uint32_t function, uint32_t count,
     if (edx)
         *edx = vec[3];
 }
+void host_get_vendor(char *vendor)
+{
+  uint32_t eax, ebx, ecx, edx;
+
+  host_cpuid(0x0, 0, &eax, &ebx, &ecx, &edx);
+  x86_cpu_vendor_words2str(vendor, ebx, edx, ecx);
+}
 
 void host_vendor_fms(char *vendor, int *family, int *model, int *stepping)
 {
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 51c1d5f60a..e8e26e6a53 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -832,6 +832,8 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
 
 #define CPUID_VENDOR_VIA   "CentaurHauls"
 
+#define CPUID_VENDOR_ZHAOXIN   "  Shanghai  "
+
 #define CPUID_VENDOR_HYGON    "HygonGenuine"
 
 #define IS_INTEL_CPU(env) ((env)->cpuid_vendor1 == CPUID_VENDOR_INTEL_1 && \
@@ -1918,6 +1920,7 @@ void cpu_clear_apic_feature(CPUX86State *env);
 void host_cpuid(uint32_t function, uint32_t count,
                 uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx);
 void host_vendor_fms(char *vendor, int *family, int *model, int *stepping);
+void host_get_vendor(char *vendor);
 
 /* helper.c */
 bool x86_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-16 11:29 ` [PATCH 1/1] " FelixCuioc
@ 2020-10-16 11:42   ` Paolo Bonzini
  2020-10-19  6:55     ` 答复: " FelixCui-oc
  2020-10-19 19:02     ` Alex Williamson
  0 siblings, 2 replies; 15+ messages in thread
From: Paolo Bonzini @ 2020-10-16 11:42 UTC (permalink / raw)
  To: FelixCuioc, Richard Henderson, Eduardo Habkost, Alex Williamson
  Cc: TonyWWang-oc, Alex Williamson, RockCui-oc, qemu-devel, CobeChen-oc

On 16/10/20 13:29, FelixCuioc wrote:
> The issue here is that an assinged EHCI device accesses
> an adjacent mapping between the delete and add phases
> of the VFIO MemoryListener.
> We want to skip flatview_simplify() is to prevent EHCI
> device IOVA mappings from being unmapped.

Hi,

there is indeed a bug, but I have already explained last month
(https://mail.gnu.org/archive/html/qemu-devel/2020-09/msg01279.html)
that this patch is conceptually wrong:

1) you're adding host_get_vendor conditioned on compiling the x86
emulator, so you are breaking compilation on non-x86 machines.

2) you're adding a check for the host, but the bug applies to all hosts.
 If there is a bug on x86 hardware emulation, it should be fixed even
when emulating x86 from ARM.  It should also apply to all CPU vendors.

Alex, the issue here is that the delete+add passes are racing against an
assigned device's DMA. For KVM we were thinking of changing the whole
memory map with a single ioctl, but that's much easier because KVM
builds its page tables lazily. It would be possible for the IOMMU too
but it would require a relatively complicated comparison of the old and
new memory maps in the kernel.

Paolo



^ permalink raw reply	[flat|nested] 15+ messages in thread

* 答复: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-16 11:42   ` Paolo Bonzini
@ 2020-10-19  6:55     ` FelixCui-oc
  2020-10-19 19:02     ` Alex Williamson
  1 sibling, 0 replies; 15+ messages in thread
From: FelixCui-oc @ 2020-10-19  6:55 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost, Alex Williamson
  Cc: Tony W Wang-oc, RockCui-oc, qemu-devel, CobeChen-oc

[-- Attachment #1: Type: text/plain, Size: 2145 bytes --]

hi paolo,


>2) you're adding a check for the host, but the bug applies to all hosts.
>If there is a bug on x86 hardware emulation, it should be fixed even
>when emulating x86 from ARM.  It should also apply to all CPU vendors.


What is the progress of handling this bug ?

If the processing is more complicated, can we temporarily remove flatview_simplify()?


hi Alex,


>the issue here is that the delete+add passes are racing against an
>assigned device's DMA


Please help comment how to solve this problem.


Best regards

Felixcui-oc


________________________________
发件人: Paolo Bonzini <pbonzini@redhat.com>
发送时间: 2020年10月16日 19:42:29
收件人: FelixCui-oc; Richard Henderson; Eduardo Habkost; Alex Williamson
抄送: qemu-devel@nongnu.org; RockCui-oc; Tony W Wang-oc; CobeChen-oc; Alex Williamson
主题: Re: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin

On 16/10/20 13:29, FelixCuioc wrote:
> The issue here is that an assinged EHCI device accesses
> an adjacent mapping between the delete and add phases
> of the VFIO MemoryListener.
> We want to skip flatview_simplify() is to prevent EHCI
> device IOVA mappings from being unmapped.

Hi,

there is indeed a bug, but I have already explained last month
(https://mail.gnu.org/archive/html/qemu-devel/2020-09/msg01279.html)
that this patch is conceptually wrong:

1) you're adding host_get_vendor conditioned on compiling the x86
emulator, so you are breaking compilation on non-x86 machines.

2) you're adding a check for the host, but the bug applies to all hosts.
 If there is a bug on x86 hardware emulation, it should be fixed even
when emulating x86 from ARM.  It should also apply to all CPU vendors.

Alex, the issue here is that the delete+add passes are racing against an
assigned device's DMA. For KVM we were thinking of changing the whole
memory map with a single ioctl, but that's much easier because KVM
builds its page tables lazily. It would be possible for the IOMMU too
but it would require a relatively complicated comparison of the old and
new memory maps in the kernel.

Paolo


[-- Attachment #2: Type: text/html, Size: 5308 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-16 11:42   ` Paolo Bonzini
  2020-10-19  6:55     ` 答复: " FelixCui-oc
@ 2020-10-19 19:02     ` Alex Williamson
  2020-10-20  9:24       ` Paolo Bonzini
  1 sibling, 1 reply; 15+ messages in thread
From: Alex Williamson @ 2020-10-19 19:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: FelixCuioc, Eduardo Habkost, CobeChen-oc, qemu-devel,
	TonyWWang-oc, RockCui-oc, Richard Henderson

On Fri, 16 Oct 2020 13:42:29 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 16/10/20 13:29, FelixCuioc wrote:
> > The issue here is that an assinged EHCI device accesses
> > an adjacent mapping between the delete and add phases
> > of the VFIO MemoryListener.
> > We want to skip flatview_simplify() is to prevent EHCI
> > device IOVA mappings from being unmapped.  
> 
> Hi,
> 
> there is indeed a bug, but I have already explained last month
> (https://mail.gnu.org/archive/html/qemu-devel/2020-09/msg01279.html)
> that this patch is conceptually wrong:
> 
> 1) you're adding host_get_vendor conditioned on compiling the x86
> emulator, so you are breaking compilation on non-x86 machines.
> 
> 2) you're adding a check for the host, but the bug applies to all hosts.
>  If there is a bug on x86 hardware emulation, it should be fixed even
> when emulating x86 from ARM.  It should also apply to all CPU vendors.
> 
> Alex, the issue here is that the delete+add passes are racing against an
> assigned device's DMA. For KVM we were thinking of changing the whole
> memory map with a single ioctl, but that's much easier because KVM
> builds its page tables lazily. It would be possible for the IOMMU too
> but it would require a relatively complicated comparison of the old and
> new memory maps in the kernel.

We can only build IOMMU page tables lazily if we get faults, which we
generally don't.  We also cannot atomically update IOMMU page tables
relative to a device, so "housekeeping" updates of mappings to (I
assume) consolidate KVM memory slots doesn't work so well when the
device is still running.  Stopping the device via something like the
bus-master enable bit also sounds like a whole set of problems itself.
I assume these simplified mappings also reduce our resolution for later
unmaps, which isn't necessarily a win for an assigned device either if
it exposes the race again each boot.

Maybe the question is why we don't see these errors more regularly, is
there something unique about the memory layout of this platform versus
others that causes larger memory regions to be coalesced together only
to be later unmapped and provide more exposure to this issue?  Thanks,

Alex



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-19 19:02     ` Alex Williamson
@ 2020-10-20  9:24       ` Paolo Bonzini
  2020-10-20 22:44         ` Alex Williamson
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2020-10-20  9:24 UTC (permalink / raw)
  To: Alex Williamson
  Cc: FelixCuioc, Eduardo Habkost, CobeChen-oc, qemu-devel,
	TonyWWang-oc, RockCui-oc, Richard Henderson

On 19/10/20 21:02, Alex Williamson wrote:
>> For KVM we were thinking of changing the whole
>> memory map with a single ioctl, but that's much easier because KVM
>> builds its page tables lazily. It would be possible for the IOMMU too
>> but it would require a relatively complicated comparison of the old and
>> new memory maps in the kernel.
>
> We can only build IOMMU page tables lazily if we get faults, which we
> generally don't.  We also cannot atomically update IOMMU page tables
> relative to a device,

Yeah, I didn't mean building IOMMU page tables lazily, rather replacing
the whole VFIO memory map with a single ioctl.

I don't think that requires atomic updates of the IOMMU page table root,
but it would require atomic updates of IOMMU page table entires; VFIO
would compare the old and new memory map and modify the page tables when
it sees a difference.  Is that possible?

Paolo

> so "housekeeping" updates of mappings to (I
> assume) consolidate KVM memory slots doesn't work so well when the
> device is still running.  Stopping the device via something like the
> bus-master enable bit also sounds like a whole set of problems itself.
> I assume these simplified mappings also reduce our resolution for later
> unmaps, which isn't necessarily a win for an assigned device either if
> it exposes the race again each boot.
> 
> Maybe the question is why we don't see these errors more regularly, is
> there something unique about the memory layout of this platform versus
> others that causes larger memory regions to be coalesced together only
> to be later unmapped and provide more exposure to this issue?  Thanks,



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-20  9:24       ` Paolo Bonzini
@ 2020-10-20 22:44         ` Alex Williamson
  2020-10-21  7:37           ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Williamson @ 2020-10-20 22:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: FelixCuioc, Eduardo Habkost, CobeChen-oc, qemu-devel,
	TonyWWang-oc, RockCui-oc, Richard Henderson

On Tue, 20 Oct 2020 11:24:58 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 19/10/20 21:02, Alex Williamson wrote:
> >> For KVM we were thinking of changing the whole
> >> memory map with a single ioctl, but that's much easier because KVM
> >> builds its page tables lazily. It would be possible for the IOMMU too
> >> but it would require a relatively complicated comparison of the old and
> >> new memory maps in the kernel.  
> >
> > We can only build IOMMU page tables lazily if we get faults, which we
> > generally don't.  We also cannot atomically update IOMMU page tables
> > relative to a device,  
> 
> Yeah, I didn't mean building IOMMU page tables lazily, rather replacing
> the whole VFIO memory map with a single ioctl.
> 
> I don't think that requires atomic updates of the IOMMU page table root,
> but it would require atomic updates of IOMMU page table entires; VFIO
> would compare the old and new memory map and modify the page tables when
> it sees a difference.  Is that possible?

Theoretically possible, probably.  I imagine any IOMMU worth it's
silicon should be able to update ptes atomically, but there's probably
a substantial re-engineering of the internal IOMMU API and external
vfio IOMMU uAPI to get there.  For example, I don't expect we have
support at the IOMMU driver level to reshuffle ptes when an IOMMU super
page is broken.  Instead, for an unmap operation, the IOMMU API allows
the driver to return a larger number of unmapped pages than requested.
I'd be nervous about an agreed baseline for modifications to pages
covered by an IOMMU super page as well.  The vfio IOMMU type1 uAPI is
largely a reflection of this internal API with further restrictions for
tracking and accounting of user mappings.  Therefore we don't allow
mappings that modify or overlap existing mappings nor do we allow an
unmap which bisects any existing mapping.

To support a memory map approach (which implicitly negates those sorts
of rules) we'd need to know if the IOMMU driver itself can atomically
handle arbitrary maps and unmaps, performing any necessary super page
fix-ups atomically.  The internal mechanics of the vfio IOMMU would
need to change quite a bit too for tracking and pinning, I suspect.

Do we necessarily need a memory map ioctl for this or could it be the
QEMU code that compares the old and new maps to trigger map and unmap
ioctls?  For example (aiui) our race is that if we have contiguous
memory regions A and B and flatview_simplify() tries to expand A and
delete B we'll see a series of listener notifications deleting A and B
and adding A'.  But the vfio QEMU code could parse the memory map to
determine that old A + B is functionally equivalent to A' and do
nothing.  Do you foresee any breakdowns for such an approach?  Hotplug
concerns me in that a new device only has the current simplified
flatview, ex. we only know A' rather than A + B, so we can't get back
to A + !B like a device with more history could.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-20 22:44         ` Alex Williamson
@ 2020-10-21  7:37           ` Paolo Bonzini
  2020-10-21 13:16             ` 答复: " FelixCui-oc
  2020-10-21 18:49             ` Alex Williamson
  0 siblings, 2 replies; 15+ messages in thread
From: Paolo Bonzini @ 2020-10-21  7:37 UTC (permalink / raw)
  To: Alex Williamson
  Cc: FelixCuioc, Eduardo Habkost, CobeChen-oc, qemu-devel,
	TonyWWang-oc, RockCui-oc, Richard Henderson

On 21/10/20 00:44, Alex Williamson wrote:
> Do we necessarily need a memory map ioctl for this or could it be the
> QEMU code that compares the old and new maps to trigger map and unmap
> ioctls?  For example (aiui) our race is that if we have contiguous
> memory regions A and B and flatview_simplify() tries to expand A and
> delete B we'll see a series of listener notifications deleting A and B
> and adding A'.  But the vfio QEMU code could parse the memory map to
> determine that old A + B is functionally equivalent to A' and do
> nothing.

I think the issue is a bit different, and in fact there are two sides of
the same issue.  Say you have A (large) and it is replaced by A'
(smaller) + B, then:

* the first part of A disappears for a moment before A' appears.  This
is something that QEMU can work around, by not doing anything

* the second part of A disappears for a moment before B appears.  This
is the root API issue and not something that QEMU can work around; and
in fact it is not even fixed by removing flatview_simplify.

Felix, did you identify the regions whose simplification causes the bug?
 Is this RAM (for example due to the PAM registers) or something else?

Paolo



^ permalink raw reply	[flat|nested] 15+ messages in thread

* 答复: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-21  7:37           ` Paolo Bonzini
@ 2020-10-21 13:16             ` FelixCui-oc
  2020-10-21 18:49             ` Alex Williamson
  1 sibling, 0 replies; 15+ messages in thread
From: FelixCui-oc @ 2020-10-21 13:16 UTC (permalink / raw)
  To: Paolo Bonzini, Alex Williamson
  Cc: CobeChen-oc, qemu-devel, Tony W Wang-oc, RockCui-oc,
	Richard Henderson, Eduardo Habkost

[-- Attachment #1: Type: text/plain, Size: 1957 bytes --]

hi Paolo,

>Felix, did you identify the regions whose simplification causes the bug?
>Is this RAM (for example due to the PAM registers) or something else?


yes, this bug is caused by write PAM register.The actual situation is that the

properties of some ranges are changed from RW to readonly.This situation

will cause the old ranges to be unmapped.


Best regards

Felixcui-oc


________________________________
发件人: Paolo Bonzini <pbonzini@redhat.com>
发送时间: 2020年10月21日 15:37:53
收件人: Alex Williamson
抄送: FelixCui-oc; Richard Henderson; Eduardo Habkost; qemu-devel@nongnu.org; RockCui-oc; Tony W Wang-oc; CobeChen-oc
主题: Re: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin

On 21/10/20 00:44, Alex Williamson wrote:
> Do we necessarily need a memory map ioctl for this or could it be the
> QEMU code that compares the old and new maps to trigger map and unmap
> ioctls?  For example (aiui) our race is that if we have contiguous
> memory regions A and B and flatview_simplify() tries to expand A and
> delete B we'll see a series of listener notifications deleting A and B
> and adding A'.  But the vfio QEMU code could parse the memory map to
> determine that old A + B is functionally equivalent to A' and do
> nothing.

I think the issue is a bit different, and in fact there are two sides of
the same issue.  Say you have A (large) and it is replaced by A'
(smaller) + B, then:

* the first part of A disappears for a moment before A' appears.  This
is something that QEMU can work around, by not doing anything

* the second part of A disappears for a moment before B appears.  This
is the root API issue and not something that QEMU can work around; and
in fact it is not even fixed by removing flatview_simplify.

Felix, did you identify the regions whose simplification causes the bug?
 Is this RAM (for example due to the PAM registers) or something else?

Paolo


[-- Attachment #2: Type: text/html, Size: 5140 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-21  7:37           ` Paolo Bonzini
  2020-10-21 13:16             ` 答复: " FelixCui-oc
@ 2020-10-21 18:49             ` Alex Williamson
  2020-10-21 19:50               ` Paolo Bonzini
  1 sibling, 1 reply; 15+ messages in thread
From: Alex Williamson @ 2020-10-21 18:49 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: FelixCuioc, Eduardo Habkost, CobeChen-oc, qemu-devel,
	TonyWWang-oc, RockCui-oc, Richard Henderson

On Wed, 21 Oct 2020 09:37:53 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 21/10/20 00:44, Alex Williamson wrote:
> > Do we necessarily need a memory map ioctl for this or could it be the
> > QEMU code that compares the old and new maps to trigger map and unmap
> > ioctls?  For example (aiui) our race is that if we have contiguous
> > memory regions A and B and flatview_simplify() tries to expand A and
> > delete B we'll see a series of listener notifications deleting A and B
> > and adding A'.  But the vfio QEMU code could parse the memory map to
> > determine that old A + B is functionally equivalent to A' and do
> > nothing.  
> 
> I think the issue is a bit different, and in fact there are two sides of
> the same issue.  Say you have A (large) and it is replaced by A'
> (smaller) + B, then:
> 
> * the first part of A disappears for a moment before A' appears.  This
> is something that QEMU can work around, by not doing anything
> 
> * the second part of A disappears for a moment before B appears.  This
> is the root API issue and not something that QEMU can work around; and
> in fact it is not even fixed by removing flatview_simplify.

Right, our current uAPI does not support a mechanism to atomically
change a mapping, but likewise we're probably not going to have devices
performing DMA to regions that are being remapped.  We know that
removing flatview_simplify() resolves this issue and FelixCui's update
suggests we do have a case where the permission changes of an adjacent
range is triggering a range consolidation, which we see as the range
being removed and added as something else, larger or smaller.

I can understand the general benefit of flatview_simplify(), but I
wonder if the best short term solution is to skip operating on the x86
PAM range, which I understand to be a small number of memory chunks
below 1MB.  I might also wonder why the EHCI controller on this
platform is choosing that range for DMA.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-21 18:49             ` Alex Williamson
@ 2020-10-21 19:50               ` Paolo Bonzini
  2020-10-22  3:02                 ` 答复: " FelixCui-oc
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2020-10-21 19:50 UTC (permalink / raw)
  To: Alex Williamson
  Cc: FelixCuioc, Eduardo Habkost, CobeChen-oc, qemu-devel,
	TonyWWang-oc, RockCui-oc, Richard Henderson

On 21/10/20 20:49, Alex Williamson wrote:
> I can understand the general benefit of flatview_simplify(), but I
> wonder if the best short term solution is to skip operating on the x86
> PAM range, which I understand to be a small number of memory chunks
> below 1MB.

I'd rather remove flatview_simplify altogether, it probably triggers
relatively rarely.  Possibly do not let it operate on RAM/ROM regions,
only on I/O regions.

> I might also wonder why the EHCI controller on this
> platform is choosing that range for DMA.

I assume it's the BIOS's driver and it's choosing a range in low memory,
but still I'm not sure why its DMA is racing against the PAM update
(which is done very early).  Felix, do you know the answer?

Paolo



^ permalink raw reply	[flat|nested] 15+ messages in thread

* 答复: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-21 19:50               ` Paolo Bonzini
@ 2020-10-22  3:02                 ` FelixCui-oc
  2020-10-22  3:30                   ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: FelixCui-oc @ 2020-10-22  3:02 UTC (permalink / raw)
  To: Paolo Bonzini, Alex Williamson
  Cc: CobeChen-oc, qemu-devel, Tony W Wang-oc, RockCui-oc,
	Richard Henderson, Eduardo Habkost

[-- Attachment #1: Type: text/plain, Size: 1889 bytes --]

hi  ,

>I assume it's the BIOS's driver and it's choosing a range in low memory,
>but still I'm not sure why its DMA is racing against the PAM update
>(which is done very early).  Felix, do you know the answer?


This bug is triggered by make_bios_readonly() in seabios. Make_bios_readonly() will write pam register.

And ehci_setup() is executed before make_bios_readonly(). After initializing EHCI in seabios, it will

continuously send dma cycles. So when write pam register, this bug appeared.


In addition, before write pam registers, flatview_simplify() has merged a very large range.For example,

this large range is 0xc0000-0xbfffffff. So even if EHCI is configured to not allocate buffers in low memory,

this bug will still occur.Thanks.


Best regards

Felixcui-oc

________________________________
发件人: Paolo Bonzini <pbonzini@redhat.com>
发送时间: 2020年10月22日 3:50:15
收件人: Alex Williamson
抄送: FelixCui-oc; Richard Henderson; Eduardo Habkost; qemu-devel@nongnu.org; RockCui-oc; Tony W Wang-oc; CobeChen-oc
主题: Re: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin

On 21/10/20 20:49, Alex Williamson wrote:
> I can understand the general benefit of flatview_simplify(), but I
> wonder if the best short term solution is to skip operating on the x86
> PAM range, which I understand to be a small number of memory chunks
> below 1MB.

I'd rather remove flatview_simplify altogether, it probably triggers
relatively rarely.  Possibly do not let it operate on RAM/ROM regions,
only on I/O regions.

> I might also wonder why the EHCI controller on this
> platform is choosing that range for DMA.

I assume it's the BIOS's driver and it's choosing a range in low memory,
but still I'm not sure why its DMA is racing against the PAM update
(which is done very early).  Felix, do you know the answer?

Paolo


[-- Attachment #2: Type: text/html, Size: 6515 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: 答复: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-22  3:02                 ` 答复: " FelixCui-oc
@ 2020-10-22  3:30                   ` Paolo Bonzini
  2020-10-22  6:31                     ` 答复: " FelixCui-oc
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2020-10-22  3:30 UTC (permalink / raw)
  To: FelixCui-oc, Alex Williamson
  Cc: CobeChen-oc, qemu-devel, Tony W Wang-oc, RockCui-oc,
	Richard Henderson, Eduardo Habkost

On 22/10/20 05:02, FelixCui-oc wrote:
> In addition, before write pam registers, flatview_simplify() has merged
> a very large range.For example,
> 
> this large range is 0xc0000-0xbfffffff. So even if EHCI is configured to
> not allocate buffers in low memory,
> 
> this bug will still occur.Thanks.

So removing flatview_simplify() works because the higher area (0x10000
and above) remains the same.  I guess the simplest thing to do is to
apply flatview_simplify() only to I/O areas, though we can also consider
removing it completely.  I'm not sure in which case it would provide a
noticeable improvement.

Paolo



^ permalink raw reply	[flat|nested] 15+ messages in thread

* 答复: 答复: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-22  3:30                   ` Paolo Bonzini
@ 2020-10-22  6:31                     ` FelixCui-oc
  2020-10-27  3:18                       ` FelixCui-oc
  0 siblings, 1 reply; 15+ messages in thread
From: FelixCui-oc @ 2020-10-22  6:31 UTC (permalink / raw)
  To: Paolo Bonzini, Alex Williamson
  Cc: CobeChen-oc, qemu-devel, Tony W Wang-oc, RockCui-oc,
	Richard Henderson, Eduardo Habkost

[-- Attachment #1: Type: text/plain, Size: 1424 bytes --]

hi paolo,


>So removing flatview_simplify() works because the higher area (0x10000

>and above) remains the same.  I guess the simplest thing to do is to
>apply flatview_simplify() only to I/O areas, though we can also consider
>removing it completely.  I'm not sure in which case it would provide a
>noticeable improvement.


I agree with you very much. Both of these cases can solve this bug.

Thanks.


Best regards

Felixcui-oc

________________________________
发件人: Paolo Bonzini <pbonzini@redhat.com>
发送时间: 2020年10月22日 11:30:37
收件人: FelixCui-oc; Alex Williamson
抄送: Richard Henderson; Eduardo Habkost; qemu-devel@nongnu.org; RockCui-oc; Tony W Wang-oc; CobeChen-oc
主题: Re: 答复: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin

On 22/10/20 05:02, FelixCui-oc wrote:
> In addition, before write pam registers, flatview_simplify() has merged
> a very large range.For example,
>
> this large range is 0xc0000-0xbfffffff. So even if EHCI is configured to
> not allocate buffers in low memory,
>
> this bug will still occur.Thanks.

So removing flatview_simplify() works because the higher area (0x10000
and above) remains the same.  I guess the simplest thing to do is to
apply flatview_simplify() only to I/O areas, though we can also consider
removing it completely.  I'm not sure in which case it would provide a
noticeable improvement.

Paolo


[-- Attachment #2: Type: text/html, Size: 5242 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* 答复: 答复: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin
  2020-10-22  6:31                     ` 答复: " FelixCui-oc
@ 2020-10-27  3:18                       ` FelixCui-oc
  0 siblings, 0 replies; 15+ messages in thread
From: FelixCui-oc @ 2020-10-27  3:18 UTC (permalink / raw)
  To: Paolo Bonzini, Alex Williamson
  Cc: CobeChen-oc, qemu-devel, Tony W Wang-oc, RockCui-oc,
	Richard Henderson, Eduardo Habkost

[-- Attachment #1: Type: text/plain, Size: 2266 bytes --]

hi paolo,

>So removing flatview_simplify() works because the higher area (0x10000
>and above) remains the same.  I guess the simplest thing to do is to
>apply flatview_simplify() only to I/O areas, though we can also consider
>removing it completely.  I'm not sure in which case it would provide a
>noticeable improvement.

Which one do you think is better? I prefer to remove flatview_simplify() directly.
Please help give some suggestions , and then i will resubmit the patch.

Best regards
FelixCui-oc


________________________________
发件人: FelixCui-oc
发送时间: 2020年10月22日 14:31:26
收件人: Paolo Bonzini; Alex Williamson
抄送: Richard Henderson; Eduardo Habkost; qemu-devel@nongnu.org; RockCui-oc; Tony W Wang-oc; CobeChen-oc
主题: 答复: 答复: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin


hi paolo,


>So removing flatview_simplify() works because the higher area (0x10000

>and above) remains the same.  I guess the simplest thing to do is to
>apply flatview_simplify() only to I/O areas, though we can also consider
>removing it completely.  I'm not sure in which case it would provide a
>noticeable improvement.


I agree with you very much. Both of these cases can solve this bug.

Thanks.


Best regards

Felixcui-oc

________________________________
发件人: Paolo Bonzini <pbonzini@redhat.com>
发送时间: 2020年10月22日 11:30:37
收件人: FelixCui-oc; Alex Williamson
抄送: Richard Henderson; Eduardo Habkost; qemu-devel@nongnu.org; RockCui-oc; Tony W Wang-oc; CobeChen-oc
主题: Re: 答复: [PATCH 1/1] Skip flatview_simplify() for cpu vendor zhaoxin

On 22/10/20 05:02, FelixCui-oc wrote:
> In addition, before write pam registers, flatview_simplify() has merged
> a very large range.For example,
>
> this large range is 0xc0000-0xbfffffff. So even if EHCI is configured to
> not allocate buffers in low memory,
>
> this bug will still occur.Thanks.

So removing flatview_simplify() works because the higher area (0x10000
and above) remains the same.  I guess the simplest thing to do is to
apply flatview_simplify() only to I/O areas, though we can also consider
removing it completely.  I'm not sure in which case it would provide a
noticeable improvement.

Paolo


[-- Attachment #2: Type: text/html, Size: 10188 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-10-27  3:25 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-16 11:29 [PATCH 0/1] Skip flatview_simplify() for cpu vendor zhaoxin FelixCuioc
2020-10-16 11:29 ` [PATCH 1/1] " FelixCuioc
2020-10-16 11:42   ` Paolo Bonzini
2020-10-19  6:55     ` 答复: " FelixCui-oc
2020-10-19 19:02     ` Alex Williamson
2020-10-20  9:24       ` Paolo Bonzini
2020-10-20 22:44         ` Alex Williamson
2020-10-21  7:37           ` Paolo Bonzini
2020-10-21 13:16             ` 答复: " FelixCui-oc
2020-10-21 18:49             ` Alex Williamson
2020-10-21 19:50               ` Paolo Bonzini
2020-10-22  3:02                 ` 答复: " FelixCui-oc
2020-10-22  3:30                   ` Paolo Bonzini
2020-10-22  6:31                     ` 答复: " FelixCui-oc
2020-10-27  3:18                       ` FelixCui-oc

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).