* [PATCH] ARM: cache coherence problem in guestcopy.c
@ 2013-06-18  5:03 Jaeyong Yoo
  2013-06-18  9:20 ` Ian Campbell
From: Jaeyong Yoo @ 2013-06-18  5:03 UTC
  To: xen-devel; +Cc: Jaeyong Yoo

I've encountered a rather unusual bug while implementing live
migration on the Arndale board.
After resume, the domU kernel starts invoking hypercalls, and at some
point the hypercall parameters delivered to Xen are corrupted.

After some debugging (with the help of a HW debugger), I found that
cache pollution happens; here is the detailed sequence (a sketch of
the guest-side pattern follows the list):

1) The domU kernel allocates a local variable, struct xen_add_to_physmap
   xatp, whose GVA is 0xc785fe38 (note: not cache-line aligned)
   (see the gnttab_map function in linux/drivers/xen/grant-table.c).

2) The GVA of xatp is mapped into the Xen page table by the
   raw_copy_from_guest function; the corresponding Xen VA is 0xae48ee38,
   and its contents are cached.

3) The domU kernel reuses xatp to invoke a second hypercall with
   different parameters.

4) The GVA of xatp is mapped again at the same Xen VA, and the data
   cached at step 2) (the first hypercall's parameters) is loaded.
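
To make the reuse pattern concrete, here is a minimal guest-side
sketch. The loop body and field values are paraphrased from memory, so
treat the details as an approximation of the real gnttab_map code
rather than a quote of it:

    /* Sketch of the guest-side pattern (approximating gnttab_map() in
     * linux/drivers/xen/grant-table.c): the same stack slot -- hence
     * the same GVA -- is reused for every hypercall. */
    static void map_frames(unsigned long *frames, unsigned int nr)
    {
        struct xen_add_to_physmap xatp; /* on the stack: not cache-line aligned */
        unsigned int i;

        for ( i = 0; i < nr; i++ )
        {
            xatp.domid = DOMID_SELF;
            xatp.space = XENMAPSPACE_grant_table;
            xatp.idx   = i;
            xatp.gpfn  = frames[i];
            HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp);
        }
    }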

The patch below prevents the problem.

I'm wondering whether, as a better solution, unmap_domain_page should
invalidate the cache before unmapping the page?
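
For concreteness, that alternative would look roughly like the sketch
below (assumptions: the VA passed to unmap_domain_page lies within the
page that was mapped, and flushing the whole page on every unmap is an
acceptable cost):

    /* Sketch only: flush the whole page before tearing the mapping down. */
    void unmap_domain_page(const void *va)
    {
        flush_xen_dcache_va_range((void *)((unsigned long)va & PAGE_MASK),
                                  PAGE_SIZE);
        /* ... existing slot refcount / unmap logic ... */
    }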


Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
---
 xen/arch/arm/guestcopy.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
index d146cd6..9c4a7e4 100644
--- a/xen/arch/arm/guestcopy.c
+++ b/xen/arch/arm/guestcopy.c
@@ -25,6 +25,7 @@ unsigned long raw_copy_to_guest(void *to, const void *from, unsigned len)
         p += offset;
         memcpy(p, from, size);
 
+        flush_xen_dcache_va_range(p, size);
         unmap_domain_page(p - offset);
         len -= size;
         from += size;
@@ -55,6 +56,7 @@ unsigned long raw_clear_guest(void *to, unsigned len)
         p += offset;
         memset(p, 0x00, size);
 
+        flush_xen_dcache_va_range(p, size);
         unmap_domain_page(p - offset);
         len -= size;
         to += size;
@@ -84,6 +86,7 @@ unsigned long raw_copy_from_guest(void *to, const void __user *from, unsigned le
 
         memcpy(to, p, size);
 
+        flush_xen_dcache_va_range(p, size);
         unmap_domain_page(p);
         len -= size;
         from += size;
-- 
1.7.9.5

* Re: [PATCH] ARM: cache coherence problem in guestcopy.c
@ 2013-06-18 11:22 Jaeyong Yoo
  2013-06-18 11:45 ` Ian Campbell
From: Jaeyong Yoo @ 2013-06-18 11:22 UTC
  To: Ian Campbell; +Cc: Tim Deegan, xen-devel

> 
> So I think we probably actually need the dcache flush in domain_map_page
> at the "/* Commandeer this 2MB slot */" point. In that context I don't
> think we can avoid flushing anything other than the complete 2MB
> mapping. Does this work for you too?

I am not sure that this would work. If we call map_domain_page and
unmap_domain_page with the same mfn over and over again while the
refcount stays non-zero (say, 5), then the flush is never reached. And
I think we do need the flush, for the reason below:

> 
> The laziness of the remappings makes me wonder though. Do you know if
> the slot is reused between step #2 and #3?  Otherwise I'd expect us to
> reuse the existing mapping with the cache intact. The caches are PIPT so
> I wouldn't expect the address aliasing to be an issue. Unless the
> mapping is reused for something else I'm not too sure where the cache
> pollution is coming from.

Let me try to explain in more detail.
We can consider domU a producer (writing values to the mfn) and the
hypervisor a consumer (reading values from the mfn). While domU is
invoking multiple hypercalls, it reuses the same mfn and the same
mapping in the Xen page table.

       (consumer)             (producer)
           xen                   DomU
             \                    /   (writing path)
            (cache)              /
                 \              /
(reading path)    \           /
_______________________________________
                    |   mfn   |        (physical memory)
---------------------------------------

As the figure shows, Xen "may" keep reading the cached value while
domU is writing different values to the mfn. Here is my observation of
where the cache pollution happens:

The pollution actually happens in the second cache line.
The domU-side hypercall param's local address is 0xc785fe38 (cache line
size = 0x40 bytes), and the hypercall param is 24 bytes, so it straddles
two cache lines. When the hypervisor reads the hypercall param, it reads
the first 8 bytes correctly (meaning the first cache line was flushed),
while the other 16 bytes are polluted (meaning the second cache line was
not). Honestly, I'm not sure why the first cache line is flushed and the
second is not. We could also cache_line_align the hypercall param
struct, but that only helps as long as every hypercall param is smaller
than a cache line.
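
The straddling is easy to verify arithmetically (a standalone check,
assuming a 64-byte line; the addresses are the ones observed above):

    /* Check that a 24-byte param at 0xc785fe38 straddles two 64-byte
     * cache lines, with 8 bytes in the first line and 16 in the second. */
    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uintptr_t va = 0xc785fe38, size = 24, line = 0x40;
        uintptr_t first = va & ~(line - 1);              /* 0xc785fe00 */
        uintptr_t last  = (va + size - 1) & ~(line - 1); /* 0xc785fe40 */
        printf("lines 0x%lx..0x%lx, %lu bytes in the first line\n",
               (unsigned long)first, (unsigned long)last,
               (unsigned long)(first + line - va));      /* prints 8 */
        return 0;
    }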

I hope the alignment of the figure is not broken :)

Best,
Jaeyong

* Re: [PATCH] ARM: cache coherence problem in guestcopy.c
@ 2013-06-18 12:05 Jaeyong Yoo
  2013-06-18 12:18 ` Ian Campbell
From: Jaeyong Yoo @ 2013-06-18 12:05 UTC
  To: Ian Campbell; +Cc: Tim Deegan, xen-devel

> 
> But all of the caches on this platform are PIPT (right?) so isn't it
> actually:
> 
>        (consumer)             (producer)
>             xen                   DomU
>               \                 /   (writing path)
>                \               /
>                 \             /
> (reading path)   \           /
>                   \         /
>                     (cache)
>                        ||
>                        ||
>                        \/
> _______________________________________
>                      |   mfn   |        (physical memory)
> ---------------------------------------
> 
> 
> Or are you saying that the writing path is uncached?

Oops, my mistake. As far as I know it is PIPT, and the writes should
also be cached.

> 
> I was chatting with Tim and he suggested that the issue might be the
> ReOrder Buffer, which is virtually tagged. In that case a DMB ought to
> be sufficient and not a full cache flush, we think.
> 
> We were also speculating that we probably want some DMBs in
> context_switch_{from,to} as well as at return_to_guest.

Actually, I just learned about the ReOrder Buffer, and it does look
plausible.
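
(For my own notes, the distinction as I understand it, sketched with
the usual ARMv7 inline-asm encodings -- I'm assuming these match what
Xen's asm-arm headers define:)

    /* A barrier only orders accesses; unlike a cache flush it does not
     * write dirty lines back (sketch; assumed to match Xen's macros). */
    #define dmb() asm volatile ("dmb" : : : "memory") /* ordering only         */
    #define dsb() asm volatile ("dsb" : : : "memory") /* ordering + completion */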

Best,
Jaeyong

* Re: [PATCH] ARM: cache coherence problem in guestcopy.c
@ 2013-06-20  8:34 Jaeyong Yoo
  2013-06-25  9:22 ` Ian Campbell
  2013-07-02  9:38 ` Ian Campbell
From: Jaeyong Yoo @ 2013-06-20  8:34 UTC
  To: Ian Campbell; +Cc: Tim Deegan, xen-devel

> On Tue, 2013-06-18 at 13:18 +0100, Ian Campbell wrote:
> > On Tue, 2013-06-18 at 12:05 +0000, Jaeyong Yoo wrote:
> > > > We were also speculating that we probably want some DMBs in
> > > > context_switch_{from,to} as well as at return_to_guest.
> > > 
> > > Actually, I just learned ReOrder Buffer, and it looks like so.
> 
> Does this patch help with the issue you are seeing?

I tried the combinations, and they do not work. I think my problem may
stem from a different cause. Since this problem happens while we try
to migrate domU, something really weird may be going on.

Actually, one of my colleagues told me that the problem I'm having
magically disappeared when he tried copying more vcpu registers and
scattering lots of printks from place to place. At this point, I'm not
sure whether this is a general problem in Xen or a problem in our
migration code, but I'm inclined to believe it is the migration code.
Since I'm continuing to investigate this issue, I will let you know if
anything turns up.

> I think in theory it should *BUT* (and it's a big But)...
> 
> ... it doesn't work for me on an issue I've been seeing booting dom0 on
> the Foundation model. However doing the dmb in map_domain_page() *twice*

BTW, did you put the barriers in map_domain_page? The patch below
looks like it puts them in unmap_domain_page.

> does work for me. In fact doing a dmb+nop works too. This is
> inexplicable to me. It may be a model bug or it may (more likely) be
> indicative of some other underlying issue. I also tried dsb() instead of
> dmb() and that doesn't make a difference.
> 
> Anyway, I'm interested about its behaviour on real hardware.
> 
> These are doing full system barriers. I think in reality only a "dmb
> nsh" should be needed for this use case, since we only care about the
> local processor. I didn't want to go there when the supposedly more
> obvious case wasn't doing as I expected!
> 
> Ian.
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index beae61e..ef67953 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -218,6 +218,13 @@ void *map_domain_page(unsigned long mfn)
> 
>      local_irq_save(flags);
> 
> +    /*
> +     * Ensure that any previous writes which occurred via another
> +     * mapping, specifically the guest's own mapping have been
> +     * completed such that they are visible via this mapping.
> +     */
> +    dmb();
> +
>      /* The map is laid out as an open-addressed hash table where each
>       * entry is a 2MB superpage pte.  We use the available bits of each
>       * PTE as a reference count; when the refcount is zero the slot can
> @@ -286,6 +297,13 @@ void unmap_domain_page(const void *va)
> 
>      map[slot].pt.avail--;
> 
> +    /*
> +     * Ensure that any writes which occurred via this mapping have been
> +     * completed such that they are visible via other virtual
> +     * mappings, specifically the guest's own mapping.
> +     */
> +    dmb();
> +
>      local_irq_restore(flags);
> }
> 
> 
> 
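
(For reference, if we end up wanting the local-only variant you
mention, I'd guess it would be spelled something like the sketch below
-- assuming ARMv7 assembler syntax; I haven't checked what, if
anything, Xen's headers already provide:)

    /* "dmb nsh": barrier restricted to the non-shareable domain, i.e.
     * ordering guaranteed only for the local processor (sketch). */
    #define local_dmb() asm volatile ("dmb nsh" : : : "memory")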


Thread overview: 17+ messages
2013-06-18  5:03 [PATCH] ARM: cache coherence problem in guestcopy.c Jaeyong Yoo
2013-06-18  9:20 ` Ian Campbell
2013-06-18 11:22 Jaeyong Yoo
2013-06-18 11:45 ` Ian Campbell
2013-06-18 12:05 Jaeyong Yoo
2013-06-18 12:18 ` Ian Campbell
2013-06-19 15:12   ` Ian Campbell
2013-06-20 11:55   ` Stefano Stabellini
2013-06-20 12:19     ` Tim Deegan
2013-06-25  9:43       ` Ian Campbell
2013-06-20  8:34 Jaeyong Yoo
2013-06-25  9:22 ` Ian Campbell
2013-07-02  9:38 ` Ian Campbell
2013-07-02 12:14   ` Sengul Thomas
2013-07-02 12:24     ` Sengul Thomas
2013-07-02 12:33     ` Ian Campbell
2013-07-02 12:39       ` Sengul Thomas
