* xen/arm and swiotlb-xen: possible data corruption
@ 2017-03-02  1:05 Stefano Stabellini

From: Stefano Stabellini
To: edgar.iglesias
Cc: xen-devel, julien.grall, sstabellini

Hi all,

Edgar reported data corruption on network packets in dom0 when
swiotlb-xen is in use. He also reported that the following patch
"fixes" the problem for him:

static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
        size_t size, enum dma_data_direction dir)
{
-       dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_MAP);
+       printk("%s: addr=%lx size=%zd\n", __func__, handle, size);
+       dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size + 64, dir, DMA_MAP);

I suspect the problem has something to do with cacheline alignment on
the Xen side (xen/common/grant_table.c:__gnttab_cache_flush).

If op == GNTTAB_CACHE_INVAL, we call invalidate_dcache_va_range; if
op == GNTTAB_CACHE_CLEAN, we call clean_dcache_va_range instead. The
parameter, v, could be non-cacheline-aligned.

invalidate_dcache_va_range is capable of handling an unaligned address,
while clean_dcache_va_range is not.

Edgar, does the appended patch fix the problem for you?

---

diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index 86de0b6..9cdf2fb 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -322,10 +322,30 @@ static inline int invalidate_dcache_va_range(const void *p, unsigned long size)

 static inline int clean_dcache_va_range(const void *p, unsigned long size)
 {
-    const void *end;
+    size_t off;
+    const void *end = p + size;
+
     dsb(sy);           /* So the CPU issues all writes to the range */
-    for ( end = p + size; p < end; p += cacheline_bytes )
+
+    off = (unsigned long)p % cacheline_bytes;
+    if ( off )
+    {
+        p -= off;
         asm volatile (__clean_dcache_one(0) : : "r" (p));
+        p += cacheline_bytes;
+        size -= cacheline_bytes - off;
+    }
+    off = (unsigned long)end % cacheline_bytes;
+    if ( off )
+    {
+        end -= off;
+        size -= off;
+        asm volatile (__clean_dcache_one(0) : : "r" (end));
+    }
+
+    for ( ; p < end; p += cacheline_bytes )
+        asm volatile (__clean_dcache_one(0) : : "r" (p));
+
     dsb(sy);           /* So we know the flushes happen before continuing */
     /* ARM callers assume that dcache_* functions cannot fail. */
     return 0;

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
* Re: xen/arm and swiotlb-xen: possible data corruption
  2017-03-02  8:38 ` Edgar E. Iglesias

From: Edgar E. Iglesias
To: Stefano Stabellini
Cc: xen-devel, julien.grall

On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
> Hi all,
>
> Edgar reported data corruption on network packets in dom0 when
> swiotlb-xen is in use. [...]
>
> invalidate_dcache_va_range is capable of handling an unaligned address,
> while clean_dcache_va_range is not.
>
> Edgar, does the appended patch fix the problem for you?

Thanks Stefano,

This does indeed fix the issue for me.

Cheers,
Edgar
* Re: xen/arm and swiotlb-xen: possible data corruption
  2017-03-02  8:53 ` Edgar E. Iglesias

From: Edgar E. Iglesias
To: Edgar E. Iglesias
Cc: xen-devel, julien.grall, Stefano Stabellini

On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
> On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
> > [...]
> > Edgar, does the appended patch fix the problem for you?
>
> Thanks Stefano,
>
> This does indeed fix the issue for me.

Hi again,

Looking at the code, the problem here is that we may flush one cache
line less than expected.

This smaller patch fixes it for me too:

diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index c492d6d..fa1b4dd 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -325,7 +325,9 @@ static inline int clean_dcache_va_range(const void *p, unsigned long size)
 {
     const void *end;
     dsb(sy);           /* So the CPU issues all writes to the range */
-    for ( end = p + size; p < end; p += cacheline_bytes )
+
+    end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);
+    for ( ; p < end; p += cacheline_bytes )
         asm volatile (__clean_dcache_one(0) : : "r" (p));
     dsb(sy);           /* So we know the flushes happen before continuing */
     /* ARM callers assume that dcache_* functions cannot fail. */

Anyway, I'm OK with either fix.

Cheers,
Edgar
* Re: xen/arm and swiotlb-xen: possible data corruption
  2017-03-02 17:56 ` Julien Grall

From: Julien Grall
To: Edgar E. Iglesias, Edgar E. Iglesias
Cc: xen-devel, nd, Stefano Stabellini

Hi Edgar,

On 02/03/17 08:53, Edgar E. Iglesias wrote:
> Hi again,
>
> Looking at the code, the problem here is that we may flush one cache
> line less than expected.
>
> This smaller patch fixes it for me too:
> [...]
> +
> +    end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);
> +    for ( ; p < end; p += cacheline_bytes )
>          asm volatile (__clean_dcache_one(0) : : "r" (p));
> [...]
>
> Anyway, I'm OK with either fix.

I would prefer your version compared to Stefano's.

Cheers,

--
Julien Grall
* Re: xen/arm and swiotlb-xen: possible data corruption
  2017-03-02 19:12 ` Stefano Stabellini

From: Stefano Stabellini
To: Julien Grall
Cc: Edgar E. Iglesias, Edgar E. Iglesias, nd, Stefano Stabellini, xen-devel

On Thu, 2 Mar 2017, Julien Grall wrote:
> On 02/03/17 08:53, Edgar E. Iglesias wrote:
> > > Thanks Stefano,
> > >
> > > This does indeed fix the issue for me.

Thanks for reporting and testing!

> > Looking at the code, the problem here is that we may flush one cache
> > line less than expected.
> >
> > This smaller patch fixes it for me too:
> > [...]
> >
> > Anyway, I'm OK with either fix.
>
> I would prefer your version compared to Stefano's.

Julien, from looking at the two diffs, this one is simpler and nicer,
but if you look at xen/include/asm-arm/page.h, my patch made
clean_dcache_va_range consistent with invalidate_dcache_va_range. For
consistency, I would prefer to deal with the two functions the same way.
Although it is not a spec requirement, I also think it is a good idea to
issue cache flushes from cacheline-aligned addresses, as
invalidate_dcache_va_range and Linux do, to make it more obvious what is
going on.
* Re: xen/arm and swiotlb-xen: possible data corruption
  2017-03-02 19:32 ` Julien Grall

From: Julien Grall
To: Stefano Stabellini
Cc: Edgar E. Iglesias, Edgar E. Iglesias, nd, xen-devel

Hi Stefano,

On 02/03/17 19:12, Stefano Stabellini wrote:
> Julien, from looking at the two diffs, this one is simpler and nicer,
> but if you look at xen/include/asm-arm/page.h, my patch made
> clean_dcache_va_range consistent with invalidate_dcache_va_range.
> [...] to make it more obvious what is going on.

invalidate_dcache_va_range is split because the cache instruction
differs for the start and end lines when they are unaligned: for those
you want clean & invalidate rather than plain invalidate.

If you look at the implementation of other cache helpers in Linux (see
dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only
align start & end.

Also, invalidate_dcache_va_range uses a modulo, which I would rather
avoid. The modulo in this case will not be optimized away by the
compiler because cacheline_bytes is not a constant. So I still prefer to
keep this function really simple.

BTW, you would also need to fix clean_and_invalidate_dcache_va_range.

--
Julien Grall
* Re: xen/arm and swiotlb-xen: possible data corruption
  2017-03-02 22:39 ` Stefano Stabellini

From: Stefano Stabellini
To: Julien Grall
Cc: Edgar E. Iglesias, Edgar E. Iglesias, nd, Stefano Stabellini, xen-devel

On Thu, 2 Mar 2017, Julien Grall wrote:
> invalidate_dcache_va_range is split because the cache instruction
> differs for the start and end lines when they are unaligned: for those
> you want clean & invalidate rather than plain invalidate.
>
> If you look at the implementation of other cache helpers in Linux (see
> dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will
> only align start & end.

I don't think so, unless I am reading dcache_by_line_op wrong.

> Also, invalidate_dcache_va_range uses a modulo, which I would rather
> avoid. The modulo in this case will not be optimized away by the
> compiler because cacheline_bytes is not a constant.

That is a good point. What if I replace the modulo op with

    p & (cacheline_bytes - 1)

in invalidate_dcache_va_range, then add similar code to
clean_dcache_va_range and clean_and_invalidate_dcache_va_range?

> BTW, you would also need to fix clean_and_invalidate_dcache_va_range.

I'll do that, thanks for the reminder.
* Re: xen/arm and swiotlb-xen: possible data corruption
  2017-03-02 22:55 ` Edgar E. Iglesias

From: Edgar E. Iglesias
To: Stefano Stabellini
Cc: Edgar E. Iglesias, Julien Grall, nd, xen-devel

On Thu, Mar 02, 2017 at 02:39:55PM -0800, Stefano Stabellini wrote:
> That is a good point. What if I replace the modulo op with
>
>     p & (cacheline_bytes - 1)
>
> in invalidate_dcache_va_range, then add similar code to
> clean_dcache_va_range and clean_and_invalidate_dcache_va_range?

Yeah, if there was some kind of generic ALIGN or ROUND_DOWN macro we
could do:

--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -325,7 +325,9 @@ static inline int clean_dcache_va_range(const void *p, unsigned long size)
 {
     const void *end;
     dsb(sy);           /* So the CPU issues all writes to the range */
-    for ( end = p + size; p < end; p += cacheline_bytes )
+
+    p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);
+    end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);
+    for ( ; p < end; p += cacheline_bytes )
         asm volatile (__clean_dcache_one(0) : : "r" (p));
     dsb(sy);           /* So we know the flushes happen before continuing */
     /* ARM callers assume that dcache_* functions cannot fail. */

I think that would achieve the same result as your patch, Stefano?

Cheers,
Edgar
* Re: xen/arm and swiotlb-xen: possible data corruption
  2017-03-02 23:07 ` Stefano Stabellini

From: Stefano Stabellini
To: Edgar E. Iglesias
Cc: nd, Julien Grall, Stefano Stabellini, xen-devel, Edgar E. Iglesias

On Thu, 2 Mar 2017, Edgar E. Iglesias wrote:
> Yeah, if there was some kind of generic ALIGN or ROUND_DOWN macro we
> could do:
>
> [...]
> +    p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);
> +    end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);

Even simpler:

    end = p + size;
    p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);

> +    for ( ; p < end; p += cacheline_bytes )
>          asm volatile (__clean_dcache_one(0) : : "r" (p));
> [...]
>
> I think that would achieve the same result as your patch, Stefano?

Yes, indeed, that's better.
* Re: xen/arm and swiotlb-xen: possible data corruption
  2017-03-02 23:24 ` Julien Grall

From: Julien Grall
To: Stefano Stabellini, Edgar E. Iglesias
Cc: Edgar E. Iglesias, nd, xen-devel

On 02/03/2017 23:07, Stefano Stabellini wrote:
> Even simpler:
>
>     end = p + size;
>     p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);

We don't have any ALIGN macro in Xen, and the way we use the term
"align" in Xen is very similar to ROUNDUP. However, a simple

    p = (void *)((uintptr_t)p & ~(cacheline_bytes - 1));

should work here.

Cheers,

--
Julien Grall
* Re: xen/arm and swiotlb-xen: possible data corruption 2017-03-02 22:39 ` Stefano Stabellini 2017-03-02 22:55 ` Edgar E. Iglesias @ 2017-03-02 23:19 ` Julien Grall 2017-03-03 0:53 ` Stefano Stabellini 1 sibling, 1 reply; 13+ messages in thread From: Julien Grall @ 2017-03-02 23:19 UTC (permalink / raw) To: Stefano Stabellini; +Cc: Edgar E. Iglesias, Edgar E. Iglesias, nd, xen-devel On 02/03/2017 22:39, Stefano Stabellini wrote: > On Thu, 2 Mar 2017, Julien Grall wrote: >> Hi Stefano, >> >> On 02/03/17 19:12, Stefano Stabellini wrote: >>> On Thu, 2 Mar 2017, Julien Grall wrote: >>>> On 02/03/17 08:53, Edgar E. Iglesias wrote: >>>>> On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote: >>>>>> On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote: >>> Julien, from looking at the two diffs, this is simpler and nicer, but if >>> you look at xen/include/asm-arm/page.h, my patch made >>> clean_dcache_va_range consistent with invalidate_dcache_va_range. For >>> consistency, I would prefer to deal with the two functions the same way. >>> Although it is not a spec requirement, I also think that it is a good >>> idea to issue cache flushes from cacheline aligned addresses, like >>> invalidate_dcache_va_range does and Linux does, to make more obvious >>> what is going on. >> >> invalid_dcache_va_range is split because the cache instruction differs for the >> start and end if unaligned. For them you want to use clean & invalidate rather >> than invalidate. >> >> If you look at the implementation of other cache helpers in Linux (see >> dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only align >> start & end. > > I don't think so, unless I am reading dcache_by_line_op wrong. 
.macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
	dcache_line_size \tmp1, \tmp2
	add	\size, \kaddr, \size
	sub	\tmp2, \tmp1, #1
	bic	\kaddr, \kaddr, \tmp2
9998:
	.if	(\op == cvau || \op == cvac)
alternative_if_not ARM64_WORKAROUND_CLEAN_CACHE
	dc	\op, \kaddr
alternative_else
	dc	civac, \kaddr
alternative_endif
	.else
	dc	\op, \kaddr
	.endif
	add	\kaddr, \kaddr, \tmp1
	cmp	\kaddr, \size
	b.lo	9998b
	dsb	\domain
	.endm

It has only one cache instruction in the resulting assembly because it has
.if/.else assembly directives.

Cheers,

--
Julien Grall
* Re: xen/arm and swiotlb-xen: possible data corruption
  2017-03-02 23:19 ` Julien Grall
@ 2017-03-03  0:53   ` Stefano Stabellini
  2017-03-03 16:20     ` Julien Grall
  0 siblings, 1 reply; 13+ messages in thread
From: Stefano Stabellini @ 2017-03-03 0:53 UTC (permalink / raw)
To: Julien Grall
Cc: Edgar E. Iglesias, Edgar E. Iglesias, nd, Stefano Stabellini, xen-devel

On Thu, 2 Mar 2017, Julien Grall wrote:
> On 02/03/2017 22:39, Stefano Stabellini wrote:
> > On Thu, 2 Mar 2017, Julien Grall wrote:
> > > Hi Stefano,
> > >
> > > On 02/03/17 19:12, Stefano Stabellini wrote:
> > > > On Thu, 2 Mar 2017, Julien Grall wrote:
> > > > > On 02/03/17 08:53, Edgar E. Iglesias wrote:
> > > > > > On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
> > > > > > > On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
> > > > Julien, from looking at the two diffs, this is simpler and nicer, but if
> > > > you look at xen/include/asm-arm/page.h, my patch made
> > > > clean_dcache_va_range consistent with invalidate_dcache_va_range. For
> > > > consistency, I would prefer to deal with the two functions the same way.
> > > > Although it is not a spec requirement, I also think that it is a good
> > > > idea to issue cache flushes from cacheline aligned addresses, like
> > > > invalidate_dcache_va_range does and Linux does, to make more obvious
> > > > what is going on.
> > >
> > > invalidate_dcache_va_range is split because the cache instruction differs
> > > for the start and end if unaligned. For them you want to use clean &
> > > invalidate rather than invalidate.
> > >
> > > If you look at the implementation of other cache helpers in Linux (see
> > > dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only
> > > align start & end.
> >
> > I don't think so, unless I am reading dcache_by_line_op wrong.
>
> .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
> 	dcache_line_size \tmp1, \tmp2
> 	add	\size, \kaddr, \size
> 	sub	\tmp2, \tmp1, #1
> 	bic	\kaddr, \kaddr, \tmp2
> 9998:
> 	.if	(\op == cvau || \op == cvac)
> alternative_if_not ARM64_WORKAROUND_CLEAN_CACHE
> 	dc	\op, \kaddr
> alternative_else
> 	dc	civac, \kaddr
> alternative_endif
> 	.else
> 	dc	\op, \kaddr
> 	.endif
> 	add	\kaddr, \kaddr, \tmp1
> 	cmp	\kaddr, \size
> 	b.lo	9998b
> 	dsb	\domain
> 	.endm
>
> It has only one cache instruction in the resulting assembly because it has
> .if/.else assembly directives.

Yes, but it does not only align start and end; all cache instructions
are called on aligned addresses, right?
* Re: xen/arm and swiotlb-xen: possible data corruption
  2017-03-03  0:53 ` Stefano Stabellini
@ 2017-03-03 16:20   ` Julien Grall
  0 siblings, 0 replies; 13+ messages in thread
From: Julien Grall @ 2017-03-03 16:20 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Edgar E. Iglesias, Edgar E. Iglesias, nd, xen-devel

Hi Stefano,

On 03/03/17 00:53, Stefano Stabellini wrote:
> On Thu, 2 Mar 2017, Julien Grall wrote:
>> On 02/03/2017 22:39, Stefano Stabellini wrote:
>>> On Thu, 2 Mar 2017, Julien Grall wrote:
>>>> Hi Stefano,
>>>>
>>>> On 02/03/17 19:12, Stefano Stabellini wrote:
>>>>> On Thu, 2 Mar 2017, Julien Grall wrote:
>>>>>> On 02/03/17 08:53, Edgar E. Iglesias wrote:
>>>>>>> On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
>>>>>>>> On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
>>>>> Julien, from looking at the two diffs, this is simpler and nicer, but if
>>>>> you look at xen/include/asm-arm/page.h, my patch made
>>>>> clean_dcache_va_range consistent with invalidate_dcache_va_range. For
>>>>> consistency, I would prefer to deal with the two functions the same way.
>>>>> Although it is not a spec requirement, I also think that it is a good
>>>>> idea to issue cache flushes from cacheline aligned addresses, like
>>>>> invalidate_dcache_va_range does and Linux does, to make more obvious
>>>>> what is going on.
>>>>
>>>> invalidate_dcache_va_range is split because the cache instruction differs
>>>> for the start and end if unaligned. For them you want to use clean &
>>>> invalidate rather than invalidate.
>>>>
>>>> If you look at the implementation of other cache helpers in Linux (see
>>>> dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only
>>>> align start & end.
>>>
>>> I don't think so, unless I am reading dcache_by_line_op wrong.
>>
>> .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
>> 	dcache_line_size \tmp1, \tmp2
>> 	add	\size, \kaddr, \size
>> 	sub	\tmp2, \tmp1, #1
>> 	bic	\kaddr, \kaddr, \tmp2
>> 9998:
>> 	.if	(\op == cvau || \op == cvac)
>> alternative_if_not ARM64_WORKAROUND_CLEAN_CACHE
>> 	dc	\op, \kaddr
>> alternative_else
>> 	dc	civac, \kaddr
>> alternative_endif
>> 	.else
>> 	dc	\op, \kaddr
>> 	.endif
>> 	add	\kaddr, \kaddr, \tmp1
>> 	cmp	\kaddr, \size
>> 	b.lo	9998b
>> 	dsb	\domain
>> 	.endm
>>
>> It has only one cache instruction in the resulting assembly because it has
>> .if/.else assembly directives.
>
> Yes, but it does not only align start and end, all cache instructions
> are called on aligned addresses, right?

I don't think so. The instruction "bic \kaddr, \kaddr, \tmp2" will make
sure the start address is aligned to a cache line size.

The C version of the assembly code is exactly what you wrote in the
previous e-mail:

    end = p + size;
    p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);

Cheers,

--
Julien Grall
Thread overview: 13+ messages
2017-03-02  1:05 xen/arm and swiotlb-xen: possible data corruption Stefano Stabellini
2017-03-02  8:38 ` Edgar E. Iglesias
2017-03-02  8:53   ` Edgar E. Iglesias
2017-03-02 17:56     ` Julien Grall
2017-03-02 19:12       ` Stefano Stabellini
2017-03-02 19:32         ` Julien Grall
2017-03-02 22:39           ` Stefano Stabellini
2017-03-02 22:55             ` Edgar E. Iglesias
2017-03-02 23:07               ` Stefano Stabellini
2017-03-02 23:24                 ` Julien Grall
2017-03-02 23:19             ` Julien Grall
2017-03-03  0:53               ` Stefano Stabellini
2017-03-03 16:20                 ` Julien Grall