* xen/arm and swiotlb-xen: possible data corruption
From: Stefano Stabellini @ 2017-03-02 1:05 UTC (permalink / raw)
To: edgar.iglesias; +Cc: xen-devel, julien.grall, sstabellini
Hi all,
Edgar reported data corruption of network packets in dom0 when
swiotlb-xen is in use. He also reported that the following patch "fixes"
the problem for him:
 static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
                size_t size, enum dma_data_direction dir)
 {
-       dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_MAP);
+       printk("%s: addr=%lx size=%zd\n", __func__, handle, size);
+       dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size + 64, dir, DMA_MAP);
I am thinking that the problem has something to do with cacheline
alignment on the Xen side
(xen/common/grant_table.c:__gnttab_cache_flush).
If op == GNTTAB_CACHE_INVAL, we call invalidate_dcache_va_range; if op
== GNTTAB_CACHE_CLEAN, we call clean_dcache_va_range instead. The
parameter, v, could be non-cacheline aligned.
invalidate_dcache_va_range can handle an unaligned address, while
clean_dcache_va_range cannot.
Edgar, does the appended patch fix the problem for you?
---
diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index 86de0b6..9cdf2fb 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -322,10 +322,30 @@ static inline int invalidate_dcache_va_range(const void *p, unsigned long size)
 
 static inline int clean_dcache_va_range(const void *p, unsigned long size)
 {
-    const void *end;
+    size_t off;
+    const void *end = p + size;
+
     dsb(sy);           /* So the CPU issues all writes to the range */
-    for ( end = p + size; p < end; p += cacheline_bytes )
+
+    off = (unsigned long)p % cacheline_bytes;
+    if ( off )
+    {
+        p -= off;
         asm volatile (__clean_dcache_one(0) : : "r" (p));
+        p += cacheline_bytes;
+        size -= cacheline_bytes - off;
+    }
+    off = (unsigned long)end % cacheline_bytes;
+    if ( off )
+    {
+        end -= off;
+        size -= off;
+        asm volatile (__clean_dcache_one(0) : : "r" (end));
+    }
+
+    for ( ; p < end; p += cacheline_bytes )
+        asm volatile (__clean_dcache_one(0) : : "r" (p));
+
     dsb(sy);           /* So we know the flushes happen before continuing */
     /* ARM callers assume that dcache_* functions cannot fail. */
     return 0;
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
* Re: xen/arm and swiotlb-xen: possible data corruption
From: Edgar E. Iglesias @ 2017-03-02 8:38 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: xen-devel, julien.grall
On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
> Hi all,
>
> Edgar reported a data corruption on network packets in dom0 when the
> swiotlb-xen is in use. He also reported that the following patch "fixes"
> the problem for him:
>
> static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
> size_t size, enum dma_data_direction dir)
> {
> - dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_MAP);
> + printk("%s: addr=%lx size=%zd\n", __func__, handle, size);
> + dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size + 64, dir, DMA_MAP);
>
> I am thinking that the problem has something to do with cacheline
> alignment on the Xen side
> (xen/common/grant_table.c:__gnttab_cache_flush).
>
> If op == GNTTAB_CACHE_INVAL, we call invalidate_dcache_va_range; if op
> == GNTTAB_CACHE_CLEAN, we call clean_dcache_va_range instead. The
> parameter, v, could be non-cacheline aligned.
>
> invalidate_dcache_va_range is capable of handling a not aligned address,
> while clean_dcache_va_range does not.
>
> Edgar, does the appended patch fix the problem for you?
Thanks Stefano,
This does indeed fix the issue for me.
Cheers,
Edgar
>
> ---
>
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index 86de0b6..9cdf2fb 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -322,10 +322,30 @@ static inline int invalidate_dcache_va_range(const void *p, unsigned long size)
>
> static inline int clean_dcache_va_range(const void *p, unsigned long size)
> {
> - const void *end;
> + size_t off;
> + const void *end = p + size;
> +
> dsb(sy); /* So the CPU issues all writes to the range */
> - for ( end = p + size; p < end; p += cacheline_bytes )
> +
> + off = (unsigned long)p % cacheline_bytes;
> + if ( off )
> + {
> + p -= off;
> asm volatile (__clean_dcache_one(0) : : "r" (p));
> + p += cacheline_bytes;
> + size -= cacheline_bytes - off;
> + }
> + off = (unsigned long)end % cacheline_bytes;
> + if ( off )
> + {
> + end -= off;
> + size -= off;
> + asm volatile (__clean_dcache_one(0) : : "r" (end));
> + }
> +
> + for ( ; p < end; p += cacheline_bytes )
> + asm volatile (__clean_dcache_one(0) : : "r" (p));
> +
> dsb(sy); /* So we know the flushes happen before continuing */
> /* ARM callers assume that dcache_* functions cannot fail. */
> return 0;
* Re: xen/arm and swiotlb-xen: possible data corruption
From: Edgar E. Iglesias @ 2017-03-02 8:53 UTC (permalink / raw)
To: Edgar E. Iglesias; +Cc: xen-devel, julien.grall, Stefano Stabellini
On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
> On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
> > Hi all,
> >
> > Edgar reported a data corruption on network packets in dom0 when the
> > swiotlb-xen is in use. He also reported that the following patch "fixes"
> > the problem for him:
> >
> > static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
> > size_t size, enum dma_data_direction dir)
> > {
> > - dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_MAP);
> > + printk("%s: addr=%lx size=%zd\n", __func__, handle, size);
> > + dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size + 64, dir, DMA_MAP);
> >
> > I am thinking that the problem has something to do with cacheline
> > alignment on the Xen side
> > (xen/common/grant_table.c:__gnttab_cache_flush).
> >
> > If op == GNTTAB_CACHE_INVAL, we call invalidate_dcache_va_range; if op
> > == GNTTAB_CACHE_CLEAN, we call clean_dcache_va_range instead. The
> > parameter, v, could be non-cacheline aligned.
> >
> > invalidate_dcache_va_range is capable of handling a not aligned address,
> > while clean_dcache_va_range does not.
> >
> > Edgar, does the appended patch fix the problem for you?
>
>
> Thanks Stefano,
>
> This does indeed fix the issue for me.
Hi again,
Looking at the code, the problem here is that we may flush one cache line
fewer than expected.
This smaller patch fixes it for me too:
diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index c492d6d..fa1b4dd 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -325,7 +325,9 @@ static inline int clean_dcache_va_range(const void *p, unsigned long size)
 {
     const void *end;
     dsb(sy);           /* So the CPU issues all writes to the range */
-    for ( end = p + size; p < end; p += cacheline_bytes )
+
+    end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);
+    for ( ; p < end; p += cacheline_bytes )
         asm volatile (__clean_dcache_one(0) : : "r" (p));
     dsb(sy);           /* So we know the flushes happen before continuing */
     /* ARM callers assume that dcache_* functions cannot fail. */
Anyway, I'm OK with either fix.
Cheers,
Edgar
>
> Cheers,
> Edgar
>
>
> >
> > ---
> >
> > diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> > index 86de0b6..9cdf2fb 100644
> > --- a/xen/include/asm-arm/page.h
> > +++ b/xen/include/asm-arm/page.h
> > @@ -322,10 +322,30 @@ static inline int invalidate_dcache_va_range(const void *p, unsigned long size)
> >
> > static inline int clean_dcache_va_range(const void *p, unsigned long size)
> > {
> > - const void *end;
> > + size_t off;
> > + const void *end = p + size;
> > +
> > dsb(sy); /* So the CPU issues all writes to the range */
> > - for ( end = p + size; p < end; p += cacheline_bytes )
> > +
> > + off = (unsigned long)p % cacheline_bytes;
> > + if ( off )
> > + {
> > + p -= off;
> > asm volatile (__clean_dcache_one(0) : : "r" (p));
> > + p += cacheline_bytes;
> > + size -= cacheline_bytes - off;
> > + }
> > + off = (unsigned long)end % cacheline_bytes;
> > + if ( off )
> > + {
> > + end -= off;
> > + size -= off;
> > + asm volatile (__clean_dcache_one(0) : : "r" (end));
> > + }
> > +
> > + for ( ; p < end; p += cacheline_bytes )
> > + asm volatile (__clean_dcache_one(0) : : "r" (p));
> > +
> > dsb(sy); /* So we know the flushes happen before continuing */
> > /* ARM callers assume that dcache_* functions cannot fail. */
> > return 0;
* Re: xen/arm and swiotlb-xen: possible data corruption
From: Julien Grall @ 2017-03-02 17:56 UTC (permalink / raw)
To: Edgar E. Iglesias, Edgar E. Iglesias; +Cc: xen-devel, nd, Stefano Stabellini
Hi Edgar,
On 02/03/17 08:53, Edgar E. Iglesias wrote:
> On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
>> On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
>>> Hi all,
>>>
>>> Edgar reported a data corruption on network packets in dom0 when the
>>> swiotlb-xen is in use. He also reported that the following patch "fixes"
>>> the problem for him:
>>>
>>> static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
>>> size_t size, enum dma_data_direction dir)
>>> {
>>> - dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_MAP);
>>> + printk("%s: addr=%lx size=%zd\n", __func__, handle, size);
>>> + dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size + 64, dir, DMA_MAP);
>>>
>>> I am thinking that the problem has something to do with cacheline
>>> alignment on the Xen side
>>> (xen/common/grant_table.c:__gnttab_cache_flush).
>>>
>>> If op == GNTTAB_CACHE_INVAL, we call invalidate_dcache_va_range; if op
>>> == GNTTAB_CACHE_CLEAN, we call clean_dcache_va_range instead. The
>>> parameter, v, could be non-cacheline aligned.
>>>
>>> invalidate_dcache_va_range is capable of handling a not aligned address,
>>> while clean_dcache_va_range does not.
>>>
>>> Edgar, does the appended patch fix the problem for you?
>>
>>
>> Thanks Stefano,
>>
>> This does indeed fix the issue for me.
>
>
> Hi again,
>
> Looking at the code, the problem here is that we may flush one cache line
> less than expected.
>
> This smaller patch fixes it for me too:
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index c492d6d..fa1b4dd 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -325,7 +325,9 @@ static inline int clean_dcache_va_range(const void *p, unsigned long size)
> {
> const void *end;
> dsb(sy); /* So the CPU issues all writes to the range */
> - for ( end = p + size; p < end; p += cacheline_bytes )
> +
> + end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);
> + for ( ; p < end; p += cacheline_bytes )
> asm volatile (__clean_dcache_one(0) : : "r" (p));
> dsb(sy); /* So we know the flushes happen before continuing */
> /* ARM callers assume that dcache_* functions cannot fail. */
>
>
> Anyway, I'm OK with either fix.
I would prefer your version compared to Stefano's.
Cheers,
--
Julien Grall
* Re: xen/arm and swiotlb-xen: possible data corruption
From: Stefano Stabellini @ 2017-03-02 19:12 UTC (permalink / raw)
To: Julien Grall
Cc: Edgar E. Iglesias, Edgar E. Iglesias, nd, Stefano Stabellini, xen-devel
On Thu, 2 Mar 2017, Julien Grall wrote:
> On 02/03/17 08:53, Edgar E. Iglesias wrote:
> > On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
> > > On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
> > > > Hi all,
> > > >
> > > > Edgar reported a data corruption on network packets in dom0 when the
> > > > swiotlb-xen is in use. He also reported that the following patch "fixes"
> > > > the problem for him:
> > > >
> > > > static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t
> > > > handle,
> > > > size_t size, enum dma_data_direction dir)
> > > > {
> > > > - dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size,
> > > > dir, DMA_MAP);
> > > > + printk("%s: addr=%lx size=%zd\n", __func__, handle, size);
> > > > + dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size +
> > > > 64, dir, DMA_MAP);
> > > >
> > > > I am thinking that the problem has something to do with cacheline
> > > > alignment on the Xen side
> > > > (xen/common/grant_table.c:__gnttab_cache_flush).
> > > >
> > > > If op == GNTTAB_CACHE_INVAL, we call invalidate_dcache_va_range; if op
> > > > == GNTTAB_CACHE_CLEAN, we call clean_dcache_va_range instead. The
> > > > parameter, v, could be non-cacheline aligned.
> > > >
> > > > invalidate_dcache_va_range is capable of handling a not aligned address,
> > > > while clean_dcache_va_range does not.
> > > >
> > > > Edgar, does the appended patch fix the problem for you?
> > >
> > >
> > > Thanks Stefano,
> > >
> > > This does indeed fix the issue for me.
Thanks for reporting and testing!
> > Hi again,
> >
> > Looking at the code, the problem here is that we may flush one cache line
> > less than expected.
> >
> > This smaller patch fixes it for me too:
> > diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> > index c492d6d..fa1b4dd 100644
> > --- a/xen/include/asm-arm/page.h
> > +++ b/xen/include/asm-arm/page.h
> > @@ -325,7 +325,9 @@ static inline int clean_dcache_va_range(const void *p,
> > unsigned long size)
> > {
> > const void *end;
> > dsb(sy); /* So the CPU issues all writes to the range */
> > - for ( end = p + size; p < end; p += cacheline_bytes )
> > +
> > + end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);
> > + for ( ; p < end; p += cacheline_bytes )
> > asm volatile (__clean_dcache_one(0) : : "r" (p));
> > dsb(sy); /* So we know the flushes happen before continuing
> > */
> > /* ARM callers assume that dcache_* functions cannot fail. */
> >
> >
> > Anyway, I'm OK with either fix.
>
> I would prefer your version compare to Stefano's one.
Julien, from looking at the two diffs, this is simpler and nicer, but if
you look at xen/include/asm-arm/page.h, my patch made
clean_dcache_va_range consistent with invalidate_dcache_va_range. For
consistency, I would prefer to deal with the two functions the same way.
Although it is not a spec requirement, I also think that it is a good
idea to issue cache flushes from cacheline-aligned addresses, as
invalidate_dcache_va_range and Linux do, to make it more obvious
what is going on.
* Re: xen/arm and swiotlb-xen: possible data corruption
From: Julien Grall @ 2017-03-02 19:32 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Edgar E. Iglesias, Edgar E. Iglesias, nd, xen-devel
Hi Stefano,
On 02/03/17 19:12, Stefano Stabellini wrote:
> On Thu, 2 Mar 2017, Julien Grall wrote:
>> On 02/03/17 08:53, Edgar E. Iglesias wrote:
>>> On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
>>>> On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
> Julien, from looking at the two diffs, this is simpler and nicer, but if
> you look at xen/include/asm-arm/page.h, my patch made
> clean_dcache_va_range consistent with invalidate_dcache_va_range. For
> consistency, I would prefer to deal with the two functions the same way.
> Although it is not a spec requirement, I also think that it is a good
> idea to issue cache flushes from cacheline aligned addresses, like
> invalidate_dcache_va_range does and Linux does, to make more obvious
> what is going on.
invalidate_dcache_va_range is split because the cache instruction differs
for the start and end lines when they are unaligned: for those you want
clean & invalidate rather than plain invalidate.
If you look at the implementation of other cache helpers in Linux (see
dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only
align start & end.
Also, invalidate_dcache_va_range uses a modulo operation, which I would
rather avoid. The modulo in this case will not be optimized away by the
compiler because cacheline_bytes is not a compile-time constant.
So I still prefer to keep this function really simple.
BTW, you would also need to fix clean_and_invalidate_dcache_va_range.
--
Julien Grall
* Re: xen/arm and swiotlb-xen: possible data corruption
From: Stefano Stabellini @ 2017-03-02 22:39 UTC (permalink / raw)
To: Julien Grall
Cc: Edgar E. Iglesias, Edgar E. Iglesias, nd, Stefano Stabellini, xen-devel
On Thu, 2 Mar 2017, Julien Grall wrote:
> Hi Stefano,
>
> On 02/03/17 19:12, Stefano Stabellini wrote:
> > On Thu, 2 Mar 2017, Julien Grall wrote:
> > > On 02/03/17 08:53, Edgar E. Iglesias wrote:
> > > > On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
> > > > > On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
> > Julien, from looking at the two diffs, this is simpler and nicer, but if
> > you look at xen/include/asm-arm/page.h, my patch made
> > clean_dcache_va_range consistent with invalidate_dcache_va_range. For
> > consistency, I would prefer to deal with the two functions the same way.
> > Although it is not a spec requirement, I also think that it is a good
> > idea to issue cache flushes from cacheline aligned addresses, like
> > invalidate_dcache_va_range does and Linux does, to make more obvious
> > what is going on.
>
> invalid_dcache_va_range is split because the cache instruction differs for the
> start and end if unaligned. For them you want to use clean & invalidate rather
> than invalidate.
>
> If you look at the implementation of other cache helpers in Linux (see
> dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only align
> start & end.
I don't think so, unless I am reading dcache_by_line_op wrong.
> Also, the invalid_dcache_va_range is using modulo which I would rather avoid.
> The modulo in this case will not be optimized by the compiler because
> cacheline_bytes is not a constant.
That is a good point. What if I replace the modulo op with
p & (cacheline_bytes - 1)
in invalidate_dcache_va_range, then add similar code to
clean_dcache_va_range and clean_and_invalidate_dcache_va_range?
> BTW, you would also need to fix clean_and_invalidate_dcache_va_range.
I'll do that, thanks for the reminder.
* Re: xen/arm and swiotlb-xen: possible data corruption
From: Edgar E. Iglesias @ 2017-03-02 22:55 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Edgar E. Iglesias, Julien Grall, nd, xen-devel
On Thu, Mar 02, 2017 at 02:39:55PM -0800, Stefano Stabellini wrote:
> On Thu, 2 Mar 2017, Julien Grall wrote:
> > Hi Stefano,
> >
> > On 02/03/17 19:12, Stefano Stabellini wrote:
> > > On Thu, 2 Mar 2017, Julien Grall wrote:
> > > > On 02/03/17 08:53, Edgar E. Iglesias wrote:
> > > > > On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
> > > > > > On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
> > > Julien, from looking at the two diffs, this is simpler and nicer, but if
> > > you look at xen/include/asm-arm/page.h, my patch made
> > > clean_dcache_va_range consistent with invalidate_dcache_va_range. For
> > > consistency, I would prefer to deal with the two functions the same way.
> > > Although it is not a spec requirement, I also think that it is a good
> > > idea to issue cache flushes from cacheline aligned addresses, like
> > > invalidate_dcache_va_range does and Linux does, to make more obvious
> > > what is going on.
> >
> > invalid_dcache_va_range is split because the cache instruction differs for the
> > start and end if unaligned. For them you want to use clean & invalidate rather
> > than invalidate.
> >
> > If you look at the implementation of other cache helpers in Linux (see
> > dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only align
> > start & end.
>
> I don't think so, unless I am reading dcache_by_line_op wrong.
>
>
> > Also, the invalid_dcache_va_range is using modulo which I would rather avoid.
> > The modulo in this case will not be optimized by the compiler because
> > cacheline_bytes is not a constant.
>
> That is a good point. What if I replace the modulo op with
>
> p & (cacheline_bytes - 1)
>
> in invalidate_dcache_va_range, then add the similar code to
> clean_dcache_va_range and clean_and_invalidate_dcache_va_range?
Yeah, if there were some kind of generic ALIGN or ROUND_DOWN macro, we could do:
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -325,7 +325,9 @@ static inline int clean_dcache_va_range(const void *p, unsigned long size)
 {
     const void *end;
     dsb(sy);           /* So the CPU issues all writes to the range */
-    for ( end = p + size; p < end; p += cacheline_bytes )
+
+    p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);
+    end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);
+    for ( ; p < end; p += cacheline_bytes )
         asm volatile (__clean_dcache_one(0) : : "r" (p));
     dsb(sy);           /* So we know the flushes happen before continuing */
     /* ARM callers assume that dcache_* functions cannot fail. */
I think that would achieve the same result as your patch Stefano?
Cheers,
Edgar
>
>
> > BTW, you would also need to fix clean_and_invalidate_dcache_va_range.
>
> I'll do that, thanks for the reminder.
* Re: xen/arm and swiotlb-xen: possible data corruption
From: Stefano Stabellini @ 2017-03-02 23:07 UTC (permalink / raw)
To: Edgar E. Iglesias
Cc: nd, Julien Grall, Stefano Stabellini, xen-devel, Edgar E. Iglesias
On Thu, 2 Mar 2017, Edgar E. Iglesias wrote:
> On Thu, Mar 02, 2017 at 02:39:55PM -0800, Stefano Stabellini wrote:
> > On Thu, 2 Mar 2017, Julien Grall wrote:
> > > Hi Stefano,
> > >
> > > On 02/03/17 19:12, Stefano Stabellini wrote:
> > > > On Thu, 2 Mar 2017, Julien Grall wrote:
> > > > > On 02/03/17 08:53, Edgar E. Iglesias wrote:
> > > > > > On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
> > > > > > > On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
> > > > Julien, from looking at the two diffs, this is simpler and nicer, but if
> > > > you look at xen/include/asm-arm/page.h, my patch made
> > > > clean_dcache_va_range consistent with invalidate_dcache_va_range. For
> > > > consistency, I would prefer to deal with the two functions the same way.
> > > > Although it is not a spec requirement, I also think that it is a good
> > > > idea to issue cache flushes from cacheline aligned addresses, like
> > > > invalidate_dcache_va_range does and Linux does, to make more obvious
> > > > what is going on.
> > >
> > > invalid_dcache_va_range is split because the cache instruction differs for the
> > > start and end if unaligned. For them you want to use clean & invalidate rather
> > > than invalidate.
> > >
> > > If you look at the implementation of other cache helpers in Linux (see
> > > dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only align
> > > start & end.
> >
> > I don't think so, unless I am reading dcache_by_line_op wrong.
> >
> >
> > > Also, the invalid_dcache_va_range is using modulo which I would rather avoid.
> > > The modulo in this case will not be optimized by the compiler because
> > > cacheline_bytes is not a constant.
> >
> > That is a good point. What if I replace the modulo op with
> >
> > p & (cacheline_bytes - 1)
> >
> > in invalidate_dcache_va_range, then add the similar code to
> > clean_dcache_va_range and clean_and_invalidate_dcache_va_range?
>
>
> Yeah, if there was some kind of generic ALIGN or ROUND_DOWN macro we could do:
>
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -325,7 +325,9 @@ static inline int clean_dcache_va_range(const void *p, unsigned long size)
> {
> const void *end;
> dsb(sy); /* So the CPU issues all writes to the range */
> - for ( end = p + size; p < end; p += cacheline_bytes )
> +
> + p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);
> + end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);
Even simpler:
end = p + size;
p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);
> + for ( ; p < end; p += cacheline_bytes )
> asm volatile (__clean_dcache_one(0) : : "r" (p));
> dsb(sy); /* So we know the flushes happen before continuing */
> /* ARM callers assume that dcache_* functions cannot fail. */
>
> I think that would achieve the same result as your patch Stefano?
Yes, indeed, that's better.
* Re: xen/arm and swiotlb-xen: possible data corruption
From: Julien Grall @ 2017-03-02 23:19 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Edgar E. Iglesias, Edgar E. Iglesias, nd, xen-devel
On 02/03/2017 22:39, Stefano Stabellini wrote:
> On Thu, 2 Mar 2017, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 02/03/17 19:12, Stefano Stabellini wrote:
>>> On Thu, 2 Mar 2017, Julien Grall wrote:
>>>> On 02/03/17 08:53, Edgar E. Iglesias wrote:
>>>>> On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
>>>>>> On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
>>> Julien, from looking at the two diffs, this is simpler and nicer, but if
>>> you look at xen/include/asm-arm/page.h, my patch made
>>> clean_dcache_va_range consistent with invalidate_dcache_va_range. For
>>> consistency, I would prefer to deal with the two functions the same way.
>>> Although it is not a spec requirement, I also think that it is a good
>>> idea to issue cache flushes from cacheline aligned addresses, like
>>> invalidate_dcache_va_range does and Linux does, to make more obvious
>>> what is going on.
>>
>> invalid_dcache_va_range is split because the cache instruction differs for the
>> start and end if unaligned. For them you want to use clean & invalidate rather
>> than invalidate.
>>
>> If you look at the implementation of other cache helpers in Linux (see
>> dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only align
>> start & end.
>
> I don't think so, unless I am reading dcache_by_line_op wrong.
.macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
        dcache_line_size \tmp1, \tmp2
        add     \size, \kaddr, \size
        sub     \tmp2, \tmp1, #1
        bic     \kaddr, \kaddr, \tmp2
9998:
        .if     (\op == cvau || \op == cvac)
alternative_if_not ARM64_WORKAROUND_CLEAN_CACHE
        dc      \op, \kaddr
alternative_else
        dc      civac, \kaddr
alternative_endif
        .else
        dc      \op, \kaddr
        .endif
        add     \kaddr, \kaddr, \tmp1
        cmp     \kaddr, \size
        b.lo    9998b
        dsb     \domain
.endm
It ends up with only one cache instruction in the resulting assembly
because the .if/.else are assembly-time directives, not runtime branches.
Cheers,
--
Julien Grall
* Re: xen/arm and swiotlb-xen: possible data corruption
From: Julien Grall @ 2017-03-02 23:24 UTC (permalink / raw)
To: Stefano Stabellini, Edgar E. Iglesias; +Cc: Edgar E. Iglesias, nd, xen-devel
On 02/03/2017 23:07, Stefano Stabellini wrote:
> On Thu, 2 Mar 2017, Edgar E. Iglesias wrote:
>> On Thu, Mar 02, 2017 at 02:39:55PM -0800, Stefano Stabellini wrote:
>>> On Thu, 2 Mar 2017, Julien Grall wrote:
>>>> Hi Stefano,
>>>>
>>>> On 02/03/17 19:12, Stefano Stabellini wrote:
>>>>> On Thu, 2 Mar 2017, Julien Grall wrote:
>>>>>> On 02/03/17 08:53, Edgar E. Iglesias wrote:
>>>>>>> On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
>>>>>>>> On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
>>>>> Julien, from looking at the two diffs, this is simpler and nicer, but if
>>>>> you look at xen/include/asm-arm/page.h, my patch made
>>>>> clean_dcache_va_range consistent with invalidate_dcache_va_range. For
>>>>> consistency, I would prefer to deal with the two functions the same way.
>>>>> Although it is not a spec requirement, I also think that it is a good
>>>>> idea to issue cache flushes from cacheline aligned addresses, like
>>>>> invalidate_dcache_va_range does and Linux does, to make more obvious
>>>>> what is going on.
>>>>
>>>> invalid_dcache_va_range is split because the cache instruction differs for the
>>>> start and end if unaligned. For them you want to use clean & invalidate rather
>>>> than invalidate.
>>>>
>>>> If you look at the implementation of other cache helpers in Linux (see
>>>> dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only align
>>>> start & end.
>>>
>>> I don't think so, unless I am reading dcache_by_line_op wrong.
>>>
>>>
>>>> Also, the invalid_dcache_va_range is using modulo which I would rather avoid.
>>>> The modulo in this case will not be optimized by the compiler because
>>>> cacheline_bytes is not a constant.
>>>
>>> That is a good point. What if I replace the modulo op with
>>>
>>> p & (cacheline_bytes - 1)
>>>
>>> in invalidate_dcache_va_range, then add the similar code to
>>> clean_dcache_va_range and clean_and_invalidate_dcache_va_range?
>>
>>
>> Yeah, if there was some kind of generic ALIGN or ROUND_DOWN macro we could do:
>>
>> --- a/xen/include/asm-arm/page.h
>> +++ b/xen/include/asm-arm/page.h
>> @@ -325,7 +325,9 @@ static inline int clean_dcache_va_range(const void *p, unsigned long size)
>> {
>> const void *end;
>> dsb(sy); /* So the CPU issues all writes to the range */
>> - for ( end = p + size; p < end; p += cacheline_bytes )
>> +
>> + p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);
>> + end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);
>
> Even simpler:
>
> end = p + size;
> p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);
We don't have any ALIGN macro in Xen, and the way we use the term "align"
in Xen is very similar to ROUNDUP.
However, a simple p = (void *)((uintptr_t)p & ~(cacheline_bytes - 1))
should work here.
Cheers,
--
Julien Grall
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
* Re: xen/arm and swiotlb-xen: possible data corruption
2017-03-02 23:19 ` Julien Grall
@ 2017-03-03 0:53 ` Stefano Stabellini
2017-03-03 16:20 ` Julien Grall
0 siblings, 1 reply; 13+ messages in thread
From: Stefano Stabellini @ 2017-03-03 0:53 UTC (permalink / raw)
To: Julien Grall
Cc: Edgar E. Iglesias, Edgar E. Iglesias, nd, Stefano Stabellini, xen-devel
On Thu, 2 Mar 2017, Julien Grall wrote:
> On 02/03/2017 22:39, Stefano Stabellini wrote:
> > On Thu, 2 Mar 2017, Julien Grall wrote:
> > > Hi Stefano,
> > >
> > > On 02/03/17 19:12, Stefano Stabellini wrote:
> > > > On Thu, 2 Mar 2017, Julien Grall wrote:
> > > > > On 02/03/17 08:53, Edgar E. Iglesias wrote:
> > > > > > On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
> > > > > > > On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini
> > > > > > > wrote:
> > > > Julien, from looking at the two diffs, this is simpler and nicer, but if
> > > > you look at xen/include/asm-arm/page.h, my patch made
> > > > clean_dcache_va_range consistent with invalidate_dcache_va_range. For
> > > > consistency, I would prefer to deal with the two functions the same way.
> > > > Although it is not a spec requirement, I also think that it is a good
> > > > idea to issue cache flushes from cacheline aligned addresses, like
> > > > invalidate_dcache_va_range does and Linux does, to make more obvious
> > > > what is going on.
> > >
> > > invalidate_dcache_va_range is split because the cache instruction differs
> > > for the start and end if unaligned. For them you want to use
> > > clean & invalidate rather than invalidate.
> > >
> > > If you look at the implementation of other cache helpers in Linux (see
> > > dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only
> > > align start & end.
> >
> > I don't think so, unless I am reading dcache_by_line_op wrong.
>
> 343 .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
> 344 dcache_line_size \tmp1, \tmp2
> 345 add \size, \kaddr, \size
> 346 sub \tmp2, \tmp1, #1
> 347 bic \kaddr, \kaddr, \tmp2
> 348 9998:
> 349 .if (\op == cvau || \op == cvac)
> 350 alternative_if_not ARM64_WORKAROUND_CLEAN_CACHE
> 351 dc \op, \kaddr
> 352 alternative_else
> 353 dc civac, \kaddr
> 354 alternative_endif
> 355 .else
> 356 dc \op, \kaddr
> 357 .endif
> 358 add \kaddr, \kaddr, \tmp1
> 359 cmp \kaddr, \size
> 360 b.lo 9998b
> 361 dsb \domain
> 362 .endm
> 363
>
> It has only one cache instruction in the resulting assembly because it has
> .if/.else assembly directives.
Yes, but it does not only align the start and end: all cache instructions
are called on aligned addresses, right?
* Re: xen/arm and swiotlb-xen: possible data corruption
2017-03-03 0:53 ` Stefano Stabellini
@ 2017-03-03 16:20 ` Julien Grall
0 siblings, 0 replies; 13+ messages in thread
From: Julien Grall @ 2017-03-03 16:20 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Edgar E. Iglesias, Edgar E. Iglesias, nd, xen-devel
Hi Stefano,
On 03/03/17 00:53, Stefano Stabellini wrote:
> On Thu, 2 Mar 2017, Julien Grall wrote:
>> On 02/03/2017 22:39, Stefano Stabellini wrote:
>>> On Thu, 2 Mar 2017, Julien Grall wrote:
>>>> Hi Stefano,
>>>>
>>>> On 02/03/17 19:12, Stefano Stabellini wrote:
>>>>> On Thu, 2 Mar 2017, Julien Grall wrote:
>>>>>> On 02/03/17 08:53, Edgar E. Iglesias wrote:
>>>>>>> On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
>>>>>>>> On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini
>>>>>>>> wrote:
>>>>> Julien, from looking at the two diffs, this is simpler and nicer, but if
>>>>> you look at xen/include/asm-arm/page.h, my patch made
>>>>> clean_dcache_va_range consistent with invalidate_dcache_va_range. For
>>>>> consistency, I would prefer to deal with the two functions the same way.
>>>>> Although it is not a spec requirement, I also think that it is a good
>>>>> idea to issue cache flushes from cacheline aligned addresses, like
>>>>> invalidate_dcache_va_range does and Linux does, to make more obvious
>>>>> what is going on.
>>>>
>>>> invalidate_dcache_va_range is split because the cache instruction differs
>>>> for the start and end if unaligned. For them you want to use
>>>> clean & invalidate rather than invalidate.
>>>>
>>>> If you look at the implementation of other cache helpers in Linux (see
>>>> dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only
>>>> align start & end.
>>>
>>> I don't think so, unless I am reading dcache_by_line_op wrong.
>>
>> 343 .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
>> 344 dcache_line_size \tmp1, \tmp2
>> 345 add \size, \kaddr, \size
>> 346 sub \tmp2, \tmp1, #1
>> 347 bic \kaddr, \kaddr, \tmp2
>> 348 9998:
>> 349 .if (\op == cvau || \op == cvac)
>> 350 alternative_if_not ARM64_WORKAROUND_CLEAN_CACHE
>> 351 dc \op, \kaddr
>> 352 alternative_else
>> 353 dc civac, \kaddr
>> 354 alternative_endif
>> 355 .else
>> 356 dc \op, \kaddr
>> 357 .endif
>> 358 add \kaddr, \kaddr, \tmp1
>> 359 cmp \kaddr, \size
>> 360 b.lo 9998b
>> 361 dsb \domain
>> 362 .endm
>> 363
>>
>> It has only one cache instruction in the resulting assembly because it has
>> .if/.else assembly directives.
>
> Yes, but it does not only align the start and end: all cache instructions
> are called on aligned addresses, right?
I don't think so. The instruction "bic \kaddr, \kaddr, \tmp2" will
make sure the start address is aligned to the cache line size.
The C version of the assembly code is exactly what you wrote in the
previous e-mail:
end = p + size;
p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);
Cheers,
--
Julien Grall
Thread overview: 13+ messages
2017-03-02 1:05 xen/arm and swiotlb-xen: possible data corruption Stefano Stabellini
2017-03-02 8:38 ` Edgar E. Iglesias
2017-03-02 8:53 ` Edgar E. Iglesias
2017-03-02 17:56 ` Julien Grall
2017-03-02 19:12 ` Stefano Stabellini
2017-03-02 19:32 ` Julien Grall
2017-03-02 22:39 ` Stefano Stabellini
2017-03-02 22:55 ` Edgar E. Iglesias
2017-03-02 23:07 ` Stefano Stabellini
2017-03-02 23:24 ` Julien Grall
2017-03-02 23:19 ` Julien Grall
2017-03-03 0:53 ` Stefano Stabellini
2017-03-03 16:20 ` Julien Grall