linux-kernel.vger.kernel.org archive mirror
* [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
@ 2012-12-06 13:08 Dongxiao Xu
  2012-12-06 13:37 ` [Xen-devel] " Jan Beulich
  2012-12-07 14:08 ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 16+ messages in thread
From: Dongxiao Xu @ 2012-12-06 13:08 UTC (permalink / raw)
  To: konrad.wilk, xen-devel; +Cc: linux-kernel

While mapping sg buffers, it is also necessary to check for DMA buffers
that cross a page boundary. If a guest DMA buffer crosses a page
boundary, Xen should exchange contiguous memory for it.

In addition, the original page contents need to be backed up and
copied back after the memory exchange is done.

This fixes issues where a device DMAs into a static software buffer
that crosses a page boundary and whose pages are not contiguous in
real hardware.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
---
 drivers/xen/swiotlb-xen.c |   47 ++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 46 insertions(+), 1 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 58db6df..e8f0cfb 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -461,6 +461,22 @@ xen_swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
 }
 EXPORT_SYMBOL_GPL(xen_swiotlb_sync_single_for_device);
 
+static bool
+check_continguous_region(unsigned long vstart, unsigned long order)
+{
+	unsigned long prev_ma = xen_virt_to_bus((void *)vstart);
+	unsigned long next_ma;
+	int i;
+
+	for (i = 1; i < (1 << order); i++) {
+		next_ma = xen_virt_to_bus((void *)(vstart + i * PAGE_SIZE));
+		if (next_ma != prev_ma + PAGE_SIZE)
+			return false;
+		prev_ma = next_ma;
+	}
+	return true;
+}
+
 /*
  * Map a set of buffers described by scatterlist in streaming mode for DMA.
  * This is the scatter-gather version of the above xen_swiotlb_map_page
@@ -489,7 +505,36 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
 
 	for_each_sg(sgl, sg, nelems, i) {
 		phys_addr_t paddr = sg_phys(sg);
-		dma_addr_t dev_addr = xen_phys_to_bus(paddr);
+		unsigned long vstart, order;
+		dma_addr_t dev_addr;
+
+		/*
+		 * While mapping sg buffers, checking to cross page DMA buffer
+		 * is also needed. If the guest DMA buffer crosses page
+		 * boundary, Xen should exchange contiguous memory for it.
+		 * Besides, it is needed to backup the original page contents
+		 * and copy it back after memory exchange is done.
+		 */
+		if (range_straddles_page_boundary(paddr, sg->length)) {
+			vstart = (unsigned long)__va(paddr & PAGE_MASK);
+			order = get_order(sg->length + (paddr & ~PAGE_MASK));
+			if (!check_continguous_region(vstart, order)) {
+				unsigned long buf;
+				buf = __get_free_pages(GFP_KERNEL, order);
+				memcpy((void *)buf, (void *)vstart,
+					PAGE_SIZE * (1 << order));
+				if (xen_create_contiguous_region(vstart, order,
+						fls64(paddr))) {
+					free_pages(buf, order);
+					return 0;
+				}
+				memcpy((void *)vstart, (void *)buf,
+					PAGE_SIZE * (1 << order));
+				free_pages(buf, order);
+			}
+		}
+
+		dev_addr = xen_phys_to_bus(paddr);
 
 		if (swiotlb_force ||
 		    !dma_capable(hwdev, dev_addr, sg->length) ||
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2012-12-06 13:08 [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook Dongxiao Xu
@ 2012-12-06 13:37 ` Jan Beulich
  2012-12-07 14:11   ` Konrad Rzeszutek Wilk
  2012-12-07 14:08 ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2012-12-06 13:37 UTC (permalink / raw)
  To: Dongxiao Xu; +Cc: xen-devel, konrad.wilk, linux-kernel

>>> On 06.12.12 at 14:08, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
> While mapping sg buffers, checking to cross page DMA buffer is
> also needed. If the guest DMA buffer crosses page boundary, Xen
> should exchange contiguous memory for it.
> 
> Besides, it is needed to backup the original page contents
> and copy it back after memory exchange is done.
> 
> This fixes issues if device DMA into software static buffers,
> and in case the static buffer cross page boundary which pages are
> not contiguous in real hardware.
> 
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
> ---
>  drivers/xen/swiotlb-xen.c |   47 
> ++++++++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 46 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 58db6df..e8f0cfb 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -461,6 +461,22 @@ xen_swiotlb_sync_single_for_device(struct device *hwdev, 
> dma_addr_t dev_addr,
>  }
>  EXPORT_SYMBOL_GPL(xen_swiotlb_sync_single_for_device);
>  
> +static bool
> +check_continguous_region(unsigned long vstart, unsigned long order)

check_continguous_region(unsigned long vstart, unsigned int order)

But - why do you need to do this check order based in the first
place? Checking the actual length of the buffer should suffice.

> +{
> +	unsigned long prev_ma = xen_virt_to_bus((void *)vstart);
> +	unsigned long next_ma;

phys_addr_t or some such for both of them.

> +	int i;

unsigned long

> +
> +	for (i = 1; i < (1 << order); i++) {

1UL

> +		next_ma = xen_virt_to_bus((void *)(vstart + i * PAGE_SIZE));
> +		if (next_ma != prev_ma + PAGE_SIZE)
> +			return false;
> +		prev_ma = next_ma;
> +	}
> +	return true;
> +}
> +
>  /*
>   * Map a set of buffers described by scatterlist in streaming mode for DMA.
>   * This is the scatter-gather version of the above xen_swiotlb_map_page
> @@ -489,7 +505,36 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct 
> scatterlist *sgl,
>  
>  	for_each_sg(sgl, sg, nelems, i) {
>  		phys_addr_t paddr = sg_phys(sg);
> -		dma_addr_t dev_addr = xen_phys_to_bus(paddr);
> +		unsigned long vstart, order;
> +		dma_addr_t dev_addr;
> +
> +		/*
> +		 * While mapping sg buffers, checking to cross page DMA buffer
> +		 * is also needed. If the guest DMA buffer crosses page
> +		 * boundary, Xen should exchange contiguous memory for it.
> +		 * Besides, it is needed to backup the original page contents
> +		 * and copy it back after memory exchange is done.
> +		 */
> +		if (range_straddles_page_boundary(paddr, sg->length)) {
> +			vstart = (unsigned long)__va(paddr & PAGE_MASK);
> +			order = get_order(sg->length + (paddr & ~PAGE_MASK));
> +			if (!check_continguous_region(vstart, order)) {
> +				unsigned long buf;
> +				buf = __get_free_pages(GFP_KERNEL, order);
> +				memcpy((void *)buf, (void *)vstart,
> +					PAGE_SIZE * (1 << order));
> +				if (xen_create_contiguous_region(vstart, order,
> +						fls64(paddr))) {
> +					free_pages(buf, order);
> +					return 0;
> +				}
> +				memcpy((void *)vstart, (void *)buf,
> +					PAGE_SIZE * (1 << order));
> +				free_pages(buf, order);
> +			}
> +		}
> +
> +		dev_addr = xen_phys_to_bus(paddr);
>  
>  		if (swiotlb_force ||
>  		    !dma_capable(hwdev, dev_addr, sg->length) ||

How about swiotlb_map_page() (for the compound page case)?

Jan



* Re: [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2012-12-06 13:08 [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook Dongxiao Xu
  2012-12-06 13:37 ` [Xen-devel] " Jan Beulich
@ 2012-12-07 14:08 ` Konrad Rzeszutek Wilk
  2012-12-11  6:39   ` Xu, Dongxiao
  1 sibling, 1 reply; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-12-07 14:08 UTC (permalink / raw)
  To: Dongxiao Xu; +Cc: xen-devel, linux-kernel

On Thu, Dec 06, 2012 at 09:08:42PM +0800, Dongxiao Xu wrote:
> While mapping sg buffers, checking to cross page DMA buffer is
> also needed. If the guest DMA buffer crosses page boundary, Xen
> should exchange contiguous memory for it.

So this is when we cross those 2MB contiguous swaths of buffers.
Wouldn't we get the same problem with the 'map_page' call? If the
driver tried to map say a 4MB DMA region?

What if this check was done in the routines that provide the
software static buffers, and there tried to provide a nice
DMA-contiguous swath of pages?

> 
> Besides, it is needed to backup the original page contents
> and copy it back after memory exchange is done.
> 
> This fixes issues if device DMA into software static buffers,
> and in case the static buffer cross page boundary which pages are
> not contiguous in real hardware.
> 
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
> ---
>  drivers/xen/swiotlb-xen.c |   47 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 46 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 58db6df..e8f0cfb 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -461,6 +461,22 @@ xen_swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
>  }
>  EXPORT_SYMBOL_GPL(xen_swiotlb_sync_single_for_device);
>  
> +static bool
> +check_continguous_region(unsigned long vstart, unsigned long order)
> +{
> +	unsigned long prev_ma = xen_virt_to_bus((void *)vstart);
> +	unsigned long next_ma;
> +	int i;
> +
> +	for (i = 1; i < (1 << order); i++) {
> +		next_ma = xen_virt_to_bus((void *)(vstart + i * PAGE_SIZE));
> +		if (next_ma != prev_ma + PAGE_SIZE)
> +			return false;
> +		prev_ma = next_ma;
> +	}
> +	return true;
> +}
> +
>  /*
>   * Map a set of buffers described by scatterlist in streaming mode for DMA.
>   * This is the scatter-gather version of the above xen_swiotlb_map_page
> @@ -489,7 +505,36 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
>  
>  	for_each_sg(sgl, sg, nelems, i) {
>  		phys_addr_t paddr = sg_phys(sg);
> -		dma_addr_t dev_addr = xen_phys_to_bus(paddr);
> +		unsigned long vstart, order;
> +		dma_addr_t dev_addr;
> +
> +		/*
> +		 * While mapping sg buffers, checking to cross page DMA buffer
> +		 * is also needed. If the guest DMA buffer crosses page
> +		 * boundary, Xen should exchange contiguous memory for it.
> +		 * Besides, it is needed to backup the original page contents
> +		 * and copy it back after memory exchange is done.
> +		 */
> +		if (range_straddles_page_boundary(paddr, sg->length)) {
> +			vstart = (unsigned long)__va(paddr & PAGE_MASK);
> +			order = get_order(sg->length + (paddr & ~PAGE_MASK));
> +			if (!check_continguous_region(vstart, order)) {
> +				unsigned long buf;
> +				buf = __get_free_pages(GFP_KERNEL, order);
> +				memcpy((void *)buf, (void *)vstart,
> +					PAGE_SIZE * (1 << order));
> +				if (xen_create_contiguous_region(vstart, order,
> +						fls64(paddr))) {
> +					free_pages(buf, order);
> +					return 0;
> +				}
> +				memcpy((void *)vstart, (void *)buf,
> +					PAGE_SIZE * (1 << order));
> +				free_pages(buf, order);
> +			}
> +		}
> +
> +		dev_addr = xen_phys_to_bus(paddr);
>  
>  		if (swiotlb_force ||
>  		    !dma_capable(hwdev, dev_addr, sg->length) ||
> -- 
> 1.7.1
> 


* Re: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2012-12-06 13:37 ` [Xen-devel] " Jan Beulich
@ 2012-12-07 14:11   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-12-07 14:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Dongxiao Xu, xen-devel, linux-kernel

On Thu, Dec 06, 2012 at 01:37:41PM +0000, Jan Beulich wrote:
> >>> On 06.12.12 at 14:08, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
> > While mapping sg buffers, checking to cross page DMA buffer is
> > also needed. If the guest DMA buffer crosses page boundary, Xen
> > should exchange contiguous memory for it.
> > 
> > Besides, it is needed to backup the original page contents
> > and copy it back after memory exchange is done.
> > 
> > This fixes issues if device DMA into software static buffers,
> > and in case the static buffer cross page boundary which pages are
> > not contiguous in real hardware.
> > 
> > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> > Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
> > ---
> >  drivers/xen/swiotlb-xen.c |   47 
> > ++++++++++++++++++++++++++++++++++++++++++++-
> >  1 files changed, 46 insertions(+), 1 deletions(-)
> > 
> > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > index 58db6df..e8f0cfb 100644
> > --- a/drivers/xen/swiotlb-xen.c
> > +++ b/drivers/xen/swiotlb-xen.c
> > @@ -461,6 +461,22 @@ xen_swiotlb_sync_single_for_device(struct device *hwdev, 
> > dma_addr_t dev_addr,
> >  }
> >  EXPORT_SYMBOL_GPL(xen_swiotlb_sync_single_for_device);
> >  
> > +static bool
> > +check_continguous_region(unsigned long vstart, unsigned long order)
> 
> check_continguous_region(unsigned long vstart, unsigned int order)
> 
> But - why do you need to do this check order based in the first
> place? Checking the actual length of the buffer should suffice.
> 
> > +{
> > +	unsigned long prev_ma = xen_virt_to_bus((void *)vstart);
> > +	unsigned long next_ma;
> 
> phys_addr_t or some such for both of them.
> 
> > +	int i;
> 
> unsigned long
> 
> > +
> > +	for (i = 1; i < (1 << order); i++) {
> 
> 1UL
> 
> > +		next_ma = xen_virt_to_bus((void *)(vstart + i * PAGE_SIZE));
> > +		if (next_ma != prev_ma + PAGE_SIZE)
> > +			return false;
> > +		prev_ma = next_ma;
> > +	}
> > +	return true;
> > +}
> > +
> >  /*
> >   * Map a set of buffers described by scatterlist in streaming mode for DMA.
> >   * This is the scatter-gather version of the above xen_swiotlb_map_page
> > @@ -489,7 +505,36 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct 
> > scatterlist *sgl,
> >  
> >  	for_each_sg(sgl, sg, nelems, i) {
> >  		phys_addr_t paddr = sg_phys(sg);
> > -		dma_addr_t dev_addr = xen_phys_to_bus(paddr);
> > +		unsigned long vstart, order;
> > +		dma_addr_t dev_addr;
> > +
> > +		/*
> > +		 * While mapping sg buffers, checking to cross page DMA buffer
> > +		 * is also needed. If the guest DMA buffer crosses page
> > +		 * boundary, Xen should exchange contiguous memory for it.
> > +		 * Besides, it is needed to backup the original page contents
> > +		 * and copy it back after memory exchange is done.
> > +		 */
> > +		if (range_straddles_page_boundary(paddr, sg->length)) {
> > +			vstart = (unsigned long)__va(paddr & PAGE_MASK);
> > +			order = get_order(sg->length + (paddr & ~PAGE_MASK));
> > +			if (!check_continguous_region(vstart, order)) {
> > +				unsigned long buf;
> > +				buf = __get_free_pages(GFP_KERNEL, order);
> > +				memcpy((void *)buf, (void *)vstart,
> > +					PAGE_SIZE * (1 << order));
> > +				if (xen_create_contiguous_region(vstart, order,
> > +						fls64(paddr))) {
> > +					free_pages(buf, order);
> > +					return 0;
> > +				}
> > +				memcpy((void *)vstart, (void *)buf,
> > +					PAGE_SIZE * (1 << order));
> > +				free_pages(buf, order);
> > +			}
> > +		}
> > +
> > +		dev_addr = xen_phys_to_bus(paddr);
> >  
> >  		if (swiotlb_force ||
> >  		    !dma_capable(hwdev, dev_addr, sg->length) ||
> 
> How about swiotlb_map_page() (for the compound page case)?

Heh. Thanks - I just got to your reply now and had the same question.

Interestingly enough, this looks like a problem that has been present
forever, and nobody has ever hit it.

Worse, the problem is even present if a driver uses pci_alloc_coherent
and asks for a 3MB region or such - as we can at most give out only
2MB swaths.

> 
> Jan
> 


* RE: [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2012-12-07 14:08 ` Konrad Rzeszutek Wilk
@ 2012-12-11  6:39   ` Xu, Dongxiao
  2012-12-11 17:06     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 16+ messages in thread
From: Xu, Dongxiao @ 2012-12-11  6:39 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, linux-kernel

> -----Original Message-----
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Friday, December 07, 2012 10:09 PM
> To: Xu, Dongxiao
> Cc: xen-devel@lists.xen.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg
> hook
> 
> On Thu, Dec 06, 2012 at 09:08:42PM +0800, Dongxiao Xu wrote:
> > While mapping sg buffers, checking to cross page DMA buffer is also
> > needed. If the guest DMA buffer crosses page boundary, Xen should
> > exchange contiguous memory for it.
> 
> So this is when we cross those 2MB contingous swatch of buffers.
> Wouldn't we get the same problem with the 'map_page' call? If the driver tried
> to map say a 4MB DMA region?

Yes, it also needs such a check, as I just replied in Jan's mail.

> 
> What if this check was done in the routines that provide the software static
> buffers and there try to provide a nice DMA contingous swatch of pages?

Yes, this approach also came to our mind, but it requires modifying the driver itself.
If so, the driver must not use such static buffers (e.g., from kmalloc) for DMA, even if the buffer is contiguous in the native case.
Is this acceptable to kernel/driver upstream?

Thanks,
Dongxiao

> 
> >
> > Besides, it is needed to backup the original page contents and copy it
> > back after memory exchange is done.
> >
> > This fixes issues if device DMA into software static buffers, and in
> > case the static buffer cross page boundary which pages are not
> > contiguous in real hardware.
> >
> > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> > Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
> > ---
> >  drivers/xen/swiotlb-xen.c |   47
> ++++++++++++++++++++++++++++++++++++++++++++-
> >  1 files changed, 46 insertions(+), 1 deletions(-)
> >
> > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > index 58db6df..e8f0cfb 100644
> > --- a/drivers/xen/swiotlb-xen.c
> > +++ b/drivers/xen/swiotlb-xen.c
> > @@ -461,6 +461,22 @@ xen_swiotlb_sync_single_for_device(struct device
> > *hwdev, dma_addr_t dev_addr,  }
> > EXPORT_SYMBOL_GPL(xen_swiotlb_sync_single_for_device);
> >
> > +static bool
> > +check_continguous_region(unsigned long vstart, unsigned long order) {
> > +	unsigned long prev_ma = xen_virt_to_bus((void *)vstart);
> > +	unsigned long next_ma;
> > +	int i;
> > +
> > +	for (i = 1; i < (1 << order); i++) {
> > +		next_ma = xen_virt_to_bus((void *)(vstart + i * PAGE_SIZE));
> > +		if (next_ma != prev_ma + PAGE_SIZE)
> > +			return false;
> > +		prev_ma = next_ma;
> > +	}
> > +	return true;
> > +}
> > +
> >  /*
> >   * Map a set of buffers described by scatterlist in streaming mode for
> DMA.
> >   * This is the scatter-gather version of the above
> > xen_swiotlb_map_page @@ -489,7 +505,36 @@
> > xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist
> > *sgl,
> >
> >  	for_each_sg(sgl, sg, nelems, i) {
> >  		phys_addr_t paddr = sg_phys(sg);
> > -		dma_addr_t dev_addr = xen_phys_to_bus(paddr);
> > +		unsigned long vstart, order;
> > +		dma_addr_t dev_addr;
> > +
> > +		/*
> > +		 * While mapping sg buffers, checking to cross page DMA buffer
> > +		 * is also needed. If the guest DMA buffer crosses page
> > +		 * boundary, Xen should exchange contiguous memory for it.
> > +		 * Besides, it is needed to backup the original page contents
> > +		 * and copy it back after memory exchange is done.
> > +		 */
> > +		if (range_straddles_page_boundary(paddr, sg->length)) {
> > +			vstart = (unsigned long)__va(paddr & PAGE_MASK);
> > +			order = get_order(sg->length + (paddr & ~PAGE_MASK));
> > +			if (!check_continguous_region(vstart, order)) {
> > +				unsigned long buf;
> > +				buf = __get_free_pages(GFP_KERNEL, order);
> > +				memcpy((void *)buf, (void *)vstart,
> > +					PAGE_SIZE * (1 << order));
> > +				if (xen_create_contiguous_region(vstart, order,
> > +						fls64(paddr))) {
> > +					free_pages(buf, order);
> > +					return 0;
> > +				}
> > +				memcpy((void *)vstart, (void *)buf,
> > +					PAGE_SIZE * (1 << order));
> > +				free_pages(buf, order);
> > +			}
> > +		}
> > +
> > +		dev_addr = xen_phys_to_bus(paddr);
> >
> >  		if (swiotlb_force ||
> >  		    !dma_capable(hwdev, dev_addr, sg->length) ||
> > --
> > 1.7.1
> >


* Re: [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2012-12-11  6:39   ` Xu, Dongxiao
@ 2012-12-11 17:06     ` Konrad Rzeszutek Wilk
  2012-12-12  1:03       ` Xu, Dongxiao
  0 siblings, 1 reply; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-12-11 17:06 UTC (permalink / raw)
  To: Xu, Dongxiao; +Cc: xen-devel, linux-kernel

On Tue, Dec 11, 2012 at 06:39:35AM +0000, Xu, Dongxiao wrote:
> > -----Original Message-----
> > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> > Sent: Friday, December 07, 2012 10:09 PM
> > To: Xu, Dongxiao
> > Cc: xen-devel@lists.xen.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg
> > hook
> > 
> > On Thu, Dec 06, 2012 at 09:08:42PM +0800, Dongxiao Xu wrote:
> > > While mapping sg buffers, checking to cross page DMA buffer is also
> > > needed. If the guest DMA buffer crosses page boundary, Xen should
> > > exchange contiguous memory for it.
> > 
> > So this is when we cross those 2MB contingous swatch of buffers.
> > Wouldn't we get the same problem with the 'map_page' call? If the driver tried
> > to map say a 4MB DMA region?
> 
> Yes, it also needs such check, as I just replied to Jan's mail.
> 
> > 
> > What if this check was done in the routines that provide the software static
> > buffers and there try to provide a nice DMA contingous swatch of pages?
> 
> Yes, this approach also came to our mind, which needs to modify the driver itself.
> If so, it requires driver not using such static buffers (e.g., from kmalloc) to do DMA even if the buffer is continuous in native.

I am a bit lost here.

Is the issue you found only with drivers that do not use DMA API?
Can you perhaps point me to the code that triggered this fix in the first
place?
> Is this acceptable by kernel/driver upstream?

I am still not completely clear on what you had in mind. The one method I
thought about that might help in this is to have Xen-SWIOTLB
track which memory ranges were exchanged (so xen_swiotlb_fixup
would save the *buf and the size for each call to
xen_create_contiguous_region in a list or array).

When xen_swiotlb_map/xen_swiotlb_map_sg_attrs are called, they
would consult said array/list to see if the region they retrieved
crosses said 2MB chunks. If so... and here I am unsure of
what would be the best way to proceed.


* RE: [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2012-12-11 17:06     ` Konrad Rzeszutek Wilk
@ 2012-12-12  1:03       ` Xu, Dongxiao
  2012-12-12  9:38         ` [Xen-devel] " Jan Beulich
  2012-12-13 16:34         ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 16+ messages in thread
From: Xu, Dongxiao @ 2012-12-12  1:03 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, linux-kernel

> -----Original Message-----
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Wednesday, December 12, 2012 1:07 AM
> To: Xu, Dongxiao
> Cc: xen-devel@lists.xen.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg
> hook
> 
> On Tue, Dec 11, 2012 at 06:39:35AM +0000, Xu, Dongxiao wrote:
> > > -----Original Message-----
> > > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> > > Sent: Friday, December 07, 2012 10:09 PM
> > > To: Xu, Dongxiao
> > > Cc: xen-devel@lists.xen.org; linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH] xen/swiotlb: Exchange to contiguous memory for
> > > map_sg hook
> > >
> > > On Thu, Dec 06, 2012 at 09:08:42PM +0800, Dongxiao Xu wrote:
> > > > While mapping sg buffers, checking to cross page DMA buffer is
> > > > also needed. If the guest DMA buffer crosses page boundary, Xen
> > > > should exchange contiguous memory for it.
> > >
> > > So this is when we cross those 2MB contingous swatch of buffers.
> > > Wouldn't we get the same problem with the 'map_page' call? If the
> > > driver tried to map say a 4MB DMA region?
> >
> > Yes, it also needs such check, as I just replied to Jan's mail.
> >
> > >
> > > What if this check was done in the routines that provide the
> > > software static buffers and there try to provide a nice DMA contingous
> swatch of pages?
> >
> > Yes, this approach also came to our mind, which needs to modify the driver
> itself.
> > If so, it requires driver not using such static buffers (e.g., from kmalloc) to do
> DMA even if the buffer is continuous in native.
> 
> I am bit loss here.
> 
> Is the issue you found only with drivers that do not use DMA API?
> Can you perhaps point me to the code that triggered this fix in the first place?

Yes, we met this issue with a specific SAS device/driver that calls into the libata-core code; see the function ata_dev_read_id(), called from ata_dev_reread_id() in drivers/ata/libata-core.c.

In the above function, the target buffer is (void *)dev->link->ap->sector_buf, which is a 512-byte static buffer that unfortunately crosses a page boundary.

> > Is this acceptable by kernel/driver upstream?
> 
> I am still not completely clear on what you had in mind. The one method I
> thought about that might help in this is to have Xen-SWIOTLB track which
> memory ranges were exchanged (so xen_swiotlb_fixup would save the *buf
> and the size for each call to xen_create_contiguous_region in a list or array).
> 
> When xen_swiotlb_map/xen_swiotlb_map_sg_attrs care called they would
> consult said array/list to see if the region they retrieved crosses said 2MB
> chunks. If so.. and here I am unsure of what would be the best way to proceed.

We thought we could solve the issue in several ways:

1) Like the previous patch I sent out, we check the DMA region in xen_swiotlb_map_page() and xen_swiotlb_map_sg_attr(), and if the DMA region crosses a page boundary, we exchange the memory and copy the contents. However, this has a race condition: while copying the memory contents (we introduced two memory copies in the patch), other code may also access the page and see incorrect values.
2) Mostly the same as item 1); the only difference is that we do the memory content copy inside the Xen hypervisor rather than in Dom0. This requires adding certain flags to indicate memory movement in the XENMEM_exchange hypercall.
3) As you also mentioned, this is not a common case; it is only triggered by some specific devices/drivers. We could fix it in the affected driver to avoid DMA into static buffers, like (void *)dev->link->ap->sector_buf in the above case. But I am not sure whether that is acceptable to kernel/driver upstream.

Thanks,
Dongxiao




* Re: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2012-12-12  1:03       ` Xu, Dongxiao
@ 2012-12-12  9:38         ` Jan Beulich
  2012-12-19 20:09           ` Konrad Rzeszutek Wilk
  2012-12-13 16:34         ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2012-12-12  9:38 UTC (permalink / raw)
  To: Dongxiao Xu, Konrad Rzeszutek Wilk; +Cc: xen-devel, linux-kernel

>>> On 12.12.12 at 02:03, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:
>> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
>> On Tue, Dec 11, 2012 at 06:39:35AM +0000, Xu, Dongxiao wrote:
>> > > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
>> > > What if this check was done in the routines that provide the
>> > > software static buffers and there try to provide a nice DMA contingous
>> swatch of pages?
>> >
>> > Yes, this approach also came to our mind, which needs to modify the driver
>> itself.
>> > If so, it requires driver not using such static buffers (e.g., from 
> kmalloc) to do
>> DMA even if the buffer is continuous in native.
>> 
>> I am bit loss here.
>> 
>> Is the issue you found only with drivers that do not use DMA API?
>> Can you perhaps point me to the code that triggered this fix in the first 
> place?
> 
> Yes, we met this issue on a specific SAS device/driver, and it calls into 
> libata-core code, you can refer to function ata_dev_read_id() called from 
> ata_dev_reread_id() in drivers/ata/libata-core.c.
> 
> In the above function, the target buffer is (void *)dev->link->ap->sector_buf, 
> which is 512 bytes static buffer and unfortunately it across the page 
> boundary.

I wonder whether such use of sg_init_one()/sg_set_buf() is correct
in the first place. While there aren't any restrictions documented for
its use, one clearly can't pass in whatever one wants (a pointer into
vmalloc()-ed memory, for instance, won't work afaict).

I didn't go through all other users of it, but quite a few of the uses
elsewhere look similarly questionable.

>> I am still not completely clear on what you had in mind. The one method I
>> thought about that might help in this is to have Xen-SWIOTLB track which
>> memory ranges were exchanged (so xen_swiotlb_fixup would save the *buf
>> and the size for each call to xen_create_contiguous_region in a list or 
> array).
>> 
>> When xen_swiotlb_map/xen_swiotlb_map_sg_attrs care called they would
>> consult said array/list to see if the region they retrieved crosses said 2MB
>> chunks. If so.. and here I am unsure of what would be the best way to 
> proceed.
> 
> We thought we can solve the issue in several ways:
> 
> 1) Like the previous patch I sent out, we check the DMA region in 
> xen_swiotlb_map_page() and xen_swiotlb_map_sg_attr(), and if DMA region 
> crosses page boundary, we exchange the memory and copy the content. However 
> it has race condition that when copying the memory content (we introduced two 
> memory copies in the patch), some other code may also visit the page, which 
> may encounter incorrect values.

That's why, after mapping a buffer (or SG list), one has to call the
sync functions before looking at the data. Any race as described by
you is therefore a programming error.

Jan




* Re: [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2012-12-12  1:03       ` Xu, Dongxiao
  2012-12-12  9:38         ` [Xen-devel] " Jan Beulich
@ 2012-12-13 16:34         ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-12-13 16:34 UTC (permalink / raw)
  To: Xu, Dongxiao; +Cc: xen-devel, linux-kernel

On Wed, Dec 12, 2012 at 01:03:55AM +0000, Xu, Dongxiao wrote:
> > -----Original Message-----
> > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> > Sent: Wednesday, December 12, 2012 1:07 AM
> > To: Xu, Dongxiao
> > Cc: xen-devel@lists.xen.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg
> > hook
> > 
> > On Tue, Dec 11, 2012 at 06:39:35AM +0000, Xu, Dongxiao wrote:
> > > > -----Original Message-----
> > > > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> > > > Sent: Friday, December 07, 2012 10:09 PM
> > > > To: Xu, Dongxiao
> > > > Cc: xen-devel@lists.xen.org; linux-kernel@vger.kernel.org
> > > > Subject: Re: [PATCH] xen/swiotlb: Exchange to contiguous memory for
> > > > map_sg hook
> > > >
> > > > On Thu, Dec 06, 2012 at 09:08:42PM +0800, Dongxiao Xu wrote:
> > > > > While mapping sg buffers, checking to cross page DMA buffer is
> > > > > also needed. If the guest DMA buffer crosses page boundary, Xen
> > > > > should exchange contiguous memory for it.
> > > >
> > > > So this is when we cross those 2MB contingous swatch of buffers.
> > > > Wouldn't we get the same problem with the 'map_page' call? If the
> > > > driver tried to map say a 4MB DMA region?
> > >
> > > Yes, it also needs such check, as I just replied to Jan's mail.
> > >
> > > >
> > > > What if this check was done in the routines that provide the
> > > > software static buffers and there try to provide a nice DMA contingous
> > swatch of pages?
> > >
> > > Yes, this approach also came to our mind, which needs to modify the driver
> > itself.
> > > If so, it requires driver not using such static buffers (e.g., from kmalloc) to do
> > DMA even if the buffer is continuous in native.
> > 
> > I am bit loss here.
> > 
> > Is the issue you found only with drivers that do not use DMA API?
> > Can you perhaps point me to the code that triggered this fix in the first place?
> 
> Yes, we met this issue on a specific SAS device/driver, and it calls into libata-core code, you can refer to function ata_dev_read_id() called from ata_dev_reread_id() in drivers/ata/libata-core.c.
> 
> In the above function, the target buffer is (void *)dev->link->ap->sector_buf, which is 512 bytes static buffer and unfortunately it across the page boundary.

Hm, that looks like a DMA API violation. If you ran the code with
CONFIG_DMA_API_DEBUG, did it complain about this? I recall the floppy
driver doing something similar, and the DMA debug code was quite verbose
in pointing this out.

> 
> > > Is this acceptable by kernel/driver upstream?
> > 
> > I am still not completely clear on what you had in mind. The one method I
> > thought about that might help in this is to have Xen-SWIOTLB track which
> > memory ranges were exchanged (so xen_swiotlb_fixup would save the *buf
> > and the size for each call to xen_create_contiguous_region in a list or array).
> > 
> > When xen_swiotlb_map/xen_swiotlb_map_sg_attrs care called they would
> > consult said array/list to see if the region they retrieved crosses said 2MB
> > chunks. If so.. and here I am unsure of what would be the best way to proceed.
> 
> We thought we can solve the issue in several ways:
> 
> 1) Like the previous patch I sent out, we check the DMA region in xen_swiotlb_map_page() and xen_swiotlb_map_sg_attr(), and if DMA region crosses page boundary, we exchange the memory and copy the content. However it has race condition that when copying the memory content (we introduced two memory copies in the patch), some other code may also visit the page, which may encounter incorrect values.
> 2) Mostly the same as item 1), the only difference is that we put the memory content copy inside Xen hypervisor but not in Dom0. This requires we add certain flags to indicating memory moving in the XENMEM_exchange hypercall.
> 3) As you also mentioned, this is not a common case, it is only triggered by some specific devices/drivers. we can fix it in certain driver to avoid DMA into static buffers. Like (void *)dev->link->ap->sector_buf in the above case. But I am not sure whether it is acceptable by kernel/driver upstream.
> 
> Thanks,
> Dongxiao
> 
> 


* Re: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2012-12-12  9:38         ` [Xen-devel] " Jan Beulich
@ 2012-12-19 20:09           ` Konrad Rzeszutek Wilk
  2012-12-20  1:23             ` Xu, Dongxiao
  0 siblings, 1 reply; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-12-19 20:09 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Dongxiao Xu, xen-devel, linux-kernel

On Wed, Dec 12, 2012 at 09:38:23AM +0000, Jan Beulich wrote:
> >>> On 12.12.12 at 02:03, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:
> >> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> >> On Tue, Dec 11, 2012 at 06:39:35AM +0000, Xu, Dongxiao wrote:
> >> > > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> >> > > What if this check was done in the routines that provide the
> >> > > software static buffers and there try to provide a nice DMA contingous
> >> swatch of pages?
> >> >
> >> > Yes, this approach also came to our mind, which needs to modify the driver
> >> itself.
> >> > If so, it requires driver not using such static buffers (e.g., from 
> > kmalloc) to do
> >> DMA even if the buffer is continuous in native.
> >> 
> >> I am bit loss here.
> >> 
> >> Is the issue you found only with drivers that do not use DMA API?
> >> Can you perhaps point me to the code that triggered this fix in the first 
> > place?
> > 
> > Yes, we met this issue on a specific SAS device/driver, and it calls into 
> > libata-core code, you can refer to function ata_dev_read_id() called from 
> > ata_dev_reread_id() in drivers/ata/libata-core.c.
> > 
> > In the above function, the target buffer is (void *)dev->link->ap->sector_buf, 
> > which is 512 bytes static buffer and unfortunately it across the page 
> > boundary.
> 
> I wonder whether such use of sg_init_one()/sg_set_buf() is correct
> in the first place. While there aren't any restrictions documented for
> its use, one clearly can't pass in whatever one wants (a pointer into
> vmalloc()-ed memory, for instance, won't work afaict).
> 
> I didn't go through all other users of it, but quite a few of the uses
> elsewhere look similarly questionable.
> 
> >> I am still not completely clear on what you had in mind. The one method I
> >> thought about that might help in this is to have Xen-SWIOTLB track which
> >> memory ranges were exchanged (so xen_swiotlb_fixup would save the *buf
> >> and the size for each call to xen_create_contiguous_region in a list or 
> > array).
> >> 
> >> When xen_swiotlb_map/xen_swiotlb_map_sg_attrs care called they would
> >> consult said array/list to see if the region they retrieved crosses said 2MB
> >> chunks. If so.. and here I am unsure of what would be the best way to 
> > proceed.

And now that I have finally looked at the code, I see that I misunderstood your initial description.

> > 
> > We thought we can solve the issue in several ways:
> > 
> > 1) Like the previous patch I sent out, we check the DMA region in 
> > xen_swiotlb_map_page() and xen_swiotlb_map_sg_attr(), and if DMA region 
> > crosses page boundary, we exchange the memory and copy the content. However 
> > it has race condition that when copying the memory content (we introduced two 
> > memory copies in the patch), some other code may also visit the page, which 
> > may encounter incorrect values.
> 
> That's why, after mapping a buffer (or SG list) one has to call the
> sync functions before looking at data. Any race as described by
> you is therefore a programming error.

I am with Jan here. It looks like libata is depending on the DMA unmap
to do this. But it is unclear to me when ata_qc_complete() is actually
called to unmap the buffer (and hence sync it).

> 
> Jan
> 
> 


* RE: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2012-12-19 20:09           ` Konrad Rzeszutek Wilk
@ 2012-12-20  1:23             ` Xu, Dongxiao
  2012-12-20  8:56               ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: Xu, Dongxiao @ 2012-12-20  1:23 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Jan Beulich; +Cc: xen-devel, linux-kernel

> -----Original Message-----
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Thursday, December 20, 2012 4:09 AM
> To: Jan Beulich
> Cc: Xu, Dongxiao; xen-devel@lists.xen.org; linux-kernel@vger.kernel.org
> Subject: Re: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory
> for map_sg hook
> 
> On Wed, Dec 12, 2012 at 09:38:23AM +0000, Jan Beulich wrote:
> > >>> On 12.12.12 at 02:03, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:
> > >> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com] On Tue,
> > >> Dec 11, 2012 at 06:39:35AM +0000, Xu, Dongxiao wrote:
> > >> > > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> > >> > > What if this check was done in the routines that provide the
> > >> > > software static buffers and there try to provide a nice DMA
> > >> > > contingous
> > >> swatch of pages?
> > >> >
> > >> > Yes, this approach also came to our mind, which needs to modify
> > >> > the driver
> > >> itself.
> > >> > If so, it requires driver not using such static buffers (e.g.,
> > >> > from
> > > kmalloc) to do
> > >> DMA even if the buffer is continuous in native.
> > >>
> > >> I am bit loss here.
> > >>
> > >> Is the issue you found only with drivers that do not use DMA API?
> > >> Can you perhaps point me to the code that triggered this fix in the
> > >> first
> > > place?
> > >
> > > Yes, we met this issue on a specific SAS device/driver, and it calls
> > > into libata-core code, you can refer to function ata_dev_read_id()
> > > called from
> > > ata_dev_reread_id() in drivers/ata/libata-core.c.
> > >
> > > In the above function, the target buffer is (void
> > > *)dev->link->ap->sector_buf, which is 512 bytes static buffer and
> > > unfortunately it across the page boundary.
> >
> > I wonder whether such use of sg_init_one()/sg_set_buf() is correct in
> > the first place. While there aren't any restrictions documented for
> > its use, one clearly can't pass in whatever one wants (a pointer into
> > vmalloc()-ed memory, for instance, won't work afaict).
> >
> > I didn't go through all other users of it, but quite a few of the uses
> > elsewhere look similarly questionable.
> >
> > >> I am still not completely clear on what you had in mind. The one
> > >> method I thought about that might help in this is to have
> > >> Xen-SWIOTLB track which memory ranges were exchanged (so
> > >> xen_swiotlb_fixup would save the *buf and the size for each call to
> > >> xen_create_contiguous_region in a list or
> > > array).
> > >>
> > >> When xen_swiotlb_map/xen_swiotlb_map_sg_attrs care called they
> > >> would consult said array/list to see if the region they retrieved
> > >> crosses said 2MB chunks. If so.. and here I am unsure of what would
> > >> be the best way to
> > > proceed.
> 
> And from finally looking at the code I misunderstood your initial description.
> 
> > >
> > > We thought we can solve the issue in several ways:
> > >
> > > 1) Like the previous patch I sent out, we check the DMA region in
> > > xen_swiotlb_map_page() and xen_swiotlb_map_sg_attr(), and if DMA
> > > region crosses page boundary, we exchange the memory and copy the
> > > content. However it has race condition that when copying the memory
> > > content (we introduced two memory copies in the patch), some other
> > > code may also visit the page, which may encounter incorrect values.
> >
> > That's why, after mapping a buffer (or SG list) one has to call the
> > sync functions before looking at data. Any race as described by you is
> > therefore a programming error.
> 
> I am with Jan here. It looks like the libata is depending on the dma_unmap to
> do this. But it is unclear to me when the ata_qc_complete is actually called to
> unmap the buffer (and hence sync it).

Sorry, maybe I am still not describing this issue clearly.

Take the libata case as an example: the static DMA buffer (dev->link->ap->sector_buf, labeled Data Structure B in the diagram below) is laid out as follows:

-------------------------------------Page boundary
<Data Structure A>
<Data Structure B>
-------------------------------------Page boundary
<Data Structure B (cross page)>
<Data Structure C>
-------------------------------------Page boundary

Where Structure B is our DMA target.

For Data Structure B itself, simultaneous access is not a concern; either a lock or the sync functions will take care of it.
What we are not sure about is reads/writes of A and C from another processor: while we are copying the page contents, another CPU may access A or C.

Thanks,
Dongxiao
> 
> >
> > Jan
> >
> >


* RE: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2012-12-20  1:23             ` Xu, Dongxiao
@ 2012-12-20  8:56               ` Jan Beulich
  2013-01-07  7:17                 ` Xu, Dongxiao
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2012-12-20  8:56 UTC (permalink / raw)
  To: Dongxiao Xu; +Cc: xen-devel, Konrad Rzeszutek Wilk, linux-kernel

>>> On 20.12.12 at 02:23, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:
> Sorry, maybe I am still not describing this issue clearly.

No, at least I understood you the way you re-describe below.

> Take the libata case as an example, the static DMA buffer locates 
> (dev->link->ap->sector_buf , here we use Data Structure B in the graph) in 
> the following structure:
> 
> -------------------------------------Page boundary
> <Data Structure A>
> <Data Structure B>
> -------------------------------------Page boundary
> <Data Structure B (cross page)>
> <Data Structure C>
> -------------------------------------Page boundary
> 
> Where Structure B is our DMA target.
> 
> For Data Structure B, we didn't care about the simultaneous access, either 
> lock or sync function will take care of it.
> What we are not sure is "read/write of A and C from other processor". As we 
> will have memory copy for the pages, and at the same time, other CPU may 
> access A/C.

The question is whether what libata does here is valid in the first
place - fill an SG list entry with something that crosses a page
boundary and is not a compound page.

Jan



* RE: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2012-12-20  8:56               ` Jan Beulich
@ 2013-01-07  7:17                 ` Xu, Dongxiao
  2013-01-07  8:46                   ` Jan Beulich
  2013-01-07 15:55                   ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 16+ messages in thread
From: Xu, Dongxiao @ 2013-01-07  7:17 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Konrad Rzeszutek Wilk, linux-kernel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, December 20, 2012 4:56 PM
> To: Xu, Dongxiao
> Cc: xen-devel@lists.xen.org; Konrad Rzeszutek Wilk;
> linux-kernel@vger.kernel.org
> Subject: RE: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory
> for map_sg hook
> 
> >>> On 20.12.12 at 02:23, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:
> > Sorry, maybe I am still not describing this issue clearly.
> 
> No, at least I understood you the way you re-describe below.
> 
> > Take the libata case as an example, the static DMA buffer locates
> > (dev->link->ap->sector_buf , here we use Data Structure B in the graph) in
> > the following structure:
> >
> > -------------------------------------Page boundary
> > <Data Structure A>
> > <Data Structure B>
> > -------------------------------------Page boundary
> > <Data Structure B (cross page)>
> > <Data Structure C>
> > -------------------------------------Page boundary
> >
> > Where Structure B is our DMA target.
> >
> > For Data Structure B, we didn't care about the simultaneous access, either
> > lock or sync function will take care of it.
> > What we are not sure is "read/write of A and C from other processor". As we
> > will have memory copy for the pages, and at the same time, other CPU may
> > access A/C.
> 
> The question is whether what libata does here is valid in the first
> place - fill an SG list entry with something that crosses a page
> boundary and is not a compound page.

Sorry for the late response on this thread.

To make sure I understand you correctly: do you mean the correct fix should be applied to the libata driver, making sure it DMAs from/to a proper place (for example, memory allocated via the DMA API) rather than a certain field in a static structure?

If we fix it on the device driver side, then we may not need to modify the xen-swiotlb part.

Thanks,
Dongxiao

> 
> Jan



* RE: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2013-01-07  7:17                 ` Xu, Dongxiao
@ 2013-01-07  8:46                   ` Jan Beulich
  2013-01-07 15:55                   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 16+ messages in thread
From: Jan Beulich @ 2013-01-07  8:46 UTC (permalink / raw)
  To: dongxiao.xu; +Cc: xen-devel, konrad.wilk, linux-kernel

>>> "Xu, Dongxiao" <dongxiao.xu@intel.com> 01/07/13 8:17 AM >>>
> From: Jan Beulich [mailto:JBeulich@suse.com]
> >>> On 20.12.12 at 02:23, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:
>> > Take the libata case as an example, the static DMA buffer locates
>> > (dev->link->ap->sector_buf , here we use Data Structure B in the graph) in
>> > the following structure:
>> >
>> > -------------------------------------Page boundary
>> > <Data Structure A>
>> > <Data Structure B>
>> > -------------------------------------Page boundary
>> > <Data Structure B (cross page)>
>> > <Data Structure C>
>> > -------------------------------------Page boundary
>> >
>> > Where Structure B is our DMA target.
>> >
>> > For Data Structure B, we didn't care about the simultaneous access, either
>> > lock or sync function will take care of it.
>> > What we are not sure is "read/write of A and C from other processor". As we
>> > will have memory copy for the pages, and at the same time, other CPU may
>> > access A/C.
>> 
>> The question is whether what libata does here is valid in the first
>> place - fill an SG list entry with something that crosses a page
>> boundary and is not a compound page.
>
>To make sure I understand you correctly, so do you mean the correct fix should be
> applied to libata driver, and make sure it DMA from/to correct place (for example,
> some memory allocated by DMA API), but not such certain field in a static structure?

Yes.

Jan



* Re: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
  2013-01-07  7:17                 ` Xu, Dongxiao
  2013-01-07  8:46                   ` Jan Beulich
@ 2013-01-07 15:55                   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-01-07 15:55 UTC (permalink / raw)
  To: Xu, Dongxiao; +Cc: Jan Beulich, xen-devel, linux-kernel

On Mon, Jan 07, 2013 at 07:17:33AM +0000, Xu, Dongxiao wrote:
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: Thursday, December 20, 2012 4:56 PM
> > To: Xu, Dongxiao
> > Cc: xen-devel@lists.xen.org; Konrad Rzeszutek Wilk;
> > linux-kernel@vger.kernel.org
> > Subject: RE: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory
> > for map_sg hook
> > 
> > >>> On 20.12.12 at 02:23, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:
> > > Sorry, maybe I am still not describing this issue clearly.
> > 
> > No, at least I understood you the way you re-describe below.
> > 
> > > Take the libata case as an example, the static DMA buffer locates
> > > (dev->link->ap->sector_buf , here we use Data Structure B in the graph) in
> > > the following structure:
> > >
> > > -------------------------------------Page boundary
> > > <Data Structure A>
> > > <Data Structure B>
> > > -------------------------------------Page boundary
> > > <Data Structure B (cross page)>
> > > <Data Structure C>
> > > -------------------------------------Page boundary
> > >
> > > Where Structure B is our DMA target.
> > >
> > > For Data Structure B, we didn't care about the simultaneous access, either
> > > lock or sync function will take care of it.
> > > What we are not sure is "read/write of A and C from other processor". As we
> > > will have memory copy for the pages, and at the same time, other CPU may
> > > access A/C.
> > 
> > The question is whether what libata does here is valid in the first
> > place - fill an SG list entry with something that crosses a page
> > boundary and is not a compound page.
> 
> Sorry for the late response about this thread.
> 
> To make sure I understand you correctly, so do you mean the correct fix should be applied to libata driver, and make sure it DMA from/to correct place (for example, some memory allocated by DMA API), but not such certain field in a static structure?

Or with the bare-metal swiotlb if the user booted with 'swiotlb=force'.

> 
> If we fix it in device driver side, then we may not need to modify the xen-swiotlb part.
> 
Correct. This is a bug in the device driver, where it checks the contents of its buffer
_before_ doing a DMA unmap or DMA sync.


* RE: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook
@ 2012-12-11  6:27 Xu, Dongxiao
  0 siblings, 0 replies; 16+ messages in thread
From: Xu, Dongxiao @ 2012-12-11  6:27 UTC (permalink / raw)
  To: Jan Beulich, konrad.wilk; +Cc: xen-devel, linux-kernel


> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, December 06, 2012 9:38 PM
> To: Xu, Dongxiao
> Cc: xen-devel@lists.xen.org; konrad.wilk@oracle.com;
> linux-kernel@vger.kernel.org
> Subject: Re: [Xen-devel] [PATCH] xen/swiotlb: Exchange to contiguous memory
> for map_sg hook
> 
> >>> On 06.12.12 at 14:08, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
> > While mapping sg buffers, checking to cross page DMA buffer is also
> > needed. If the guest DMA buffer crosses page boundary, Xen should
> > exchange contiguous memory for it.
> >
> > Besides, it is needed to backup the original page contents and copy it
> > back after memory exchange is done.
> >
> > This fixes issues if device DMA into software static buffers, and in
> > case the static buffer cross page boundary which pages are not
> > contiguous in real hardware.
> >
> > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> > Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
> > ---
> >  drivers/xen/swiotlb-xen.c |   47
> > ++++++++++++++++++++++++++++++++++++++++++++-
> >  1 files changed, 46 insertions(+), 1 deletions(-)
> >
> > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > index 58db6df..e8f0cfb 100644
> > --- a/drivers/xen/swiotlb-xen.c
> > +++ b/drivers/xen/swiotlb-xen.c
> > @@ -461,6 +461,22 @@ xen_swiotlb_sync_single_for_device(struct device
> > *hwdev, dma_addr_t dev_addr,  }
> > EXPORT_SYMBOL_GPL(xen_swiotlb_sync_single_for_device);
> >
> > +static bool
> > +check_continguous_region(unsigned long vstart, unsigned long order)
> 
> check_continguous_region(unsigned long vstart, unsigned int order)
> 
> But - why do you need to do this check order based in the first place? Checking
> the actual length of the buffer should suffice.

Thanks, the word "continguous" is mistyped in the function name; it should be "contiguous".

The check_contiguous_region() function is used to check whether the pages are contiguous in hardware.
The length only indicates whether the buffer crosses a page boundary; if the buffer crosses pages and they are not contiguous in hardware, we do need to exchange the memory in Xen.

> 
> > +{
> > +	unsigned long prev_ma = xen_virt_to_bus((void *)vstart);
> > +	unsigned long next_ma;
> 
> phys_addr_t or some such for both of them.

Thanks.
Should be dma_addr_t?

> 
> > +	int i;
> 
> unsigned long

Thanks.

> 
> > +
> > +	for (i = 1; i < (1 << order); i++) {
> 
> 1UL

Thanks.

> 
> > +		next_ma = xen_virt_to_bus((void *)(vstart + i * PAGE_SIZE));
> > +		if (next_ma != prev_ma + PAGE_SIZE)
> > +			return false;
> > +		prev_ma = next_ma;
> > +	}
> > +	return true;
> > +}
> > +
> >  /*
> >   * Map a set of buffers described by scatterlist in streaming mode for
> DMA.
> >   * This is the scatter-gather version of the above
> > xen_swiotlb_map_page @@ -489,7 +505,36 @@
> > xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist
> > *sgl,
> >
> >  	for_each_sg(sgl, sg, nelems, i) {
> >  		phys_addr_t paddr = sg_phys(sg);
> > -		dma_addr_t dev_addr = xen_phys_to_bus(paddr);
> > +		unsigned long vstart, order;
> > +		dma_addr_t dev_addr;
> > +
> > +		/*
> > +		 * While mapping sg buffers, checking to cross page DMA buffer
> > +		 * is also needed. If the guest DMA buffer crosses page
> > +		 * boundary, Xen should exchange contiguous memory for it.
> > +		 * Besides, it is needed to backup the original page contents
> > +		 * and copy it back after memory exchange is done.
> > +		 */
> > +		if (range_straddles_page_boundary(paddr, sg->length)) {
> > +			vstart = (unsigned long)__va(paddr & PAGE_MASK);
> > +			order = get_order(sg->length + (paddr & ~PAGE_MASK));
> > +			if (!check_continguous_region(vstart, order)) {
> > +				unsigned long buf;
> > +				buf = __get_free_pages(GFP_KERNEL, order);
> > +				memcpy((void *)buf, (void *)vstart,
> > +					PAGE_SIZE * (1 << order));
> > +				if (xen_create_contiguous_region(vstart, order,
> > +						fls64(paddr))) {
> > +					free_pages(buf, order);
> > +					return 0;
> > +				}
> > +				memcpy((void *)vstart, (void *)buf,
> > +					PAGE_SIZE * (1 << order));
> > +				free_pages(buf, order);
> > +			}
> > +		}
> > +
> > +		dev_addr = xen_phys_to_bus(paddr);
> >
> >  		if (swiotlb_force ||
> >  		    !dma_capable(hwdev, dev_addr, sg->length) ||
> 
> How about swiotlb_map_page() (for the compound page case)?

Yes, this also needs similar handling.

One thing that needs further consideration is that the above approach introduces two memory copies, which creates a race condition: while we are exchanging/copying the pages, dom0 may access other elements in those same pages.

One choice is to move the memory copy into the hypervisor, which requires modifying the XENMEM_exchange hypercall to add a flag indicating whether the exchange needs memory copying.

Another choice is to solve this issue on the driver side by avoiding DMA into such static buffers. It is easy to modify one driver, but it may be difficult to audit so many device drivers.

Thanks,
Dongxiao

> 
> Jan



end of thread, other threads:[~2013-01-07 15:55 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-12-06 13:08 [PATCH] xen/swiotlb: Exchange to contiguous memory for map_sg hook Dongxiao Xu
2012-12-06 13:37 ` [Xen-devel] " Jan Beulich
2012-12-07 14:11   ` Konrad Rzeszutek Wilk
2012-12-07 14:08 ` Konrad Rzeszutek Wilk
2012-12-11  6:39   ` Xu, Dongxiao
2012-12-11 17:06     ` Konrad Rzeszutek Wilk
2012-12-12  1:03       ` Xu, Dongxiao
2012-12-12  9:38         ` [Xen-devel] " Jan Beulich
2012-12-19 20:09           ` Konrad Rzeszutek Wilk
2012-12-20  1:23             ` Xu, Dongxiao
2012-12-20  8:56               ` Jan Beulich
2013-01-07  7:17                 ` Xu, Dongxiao
2013-01-07  8:46                   ` Jan Beulich
2013-01-07 15:55                   ` Konrad Rzeszutek Wilk
2012-12-13 16:34         ` Konrad Rzeszutek Wilk
2012-12-11  6:27 [Xen-devel] " Xu, Dongxiao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).