All of lore.kernel.org
* [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force
@ 2014-02-20  1:56 jiang.biao2
  2014-02-20  9:39 ` Sathya Perla
  0 siblings, 1 reply; 12+ messages in thread
From: jiang.biao2 @ 2014-02-20  1:56 UTC (permalink / raw)
  To: netdev
  Cc: sathya.perla, subbu.seetharaman, ajit.khaparde, wang.liang82,
	cai.qu, li.fengmao, long.chun, David Miller


From: Li Fengmao <li.fengmao@zte.com.cn>

Packets are dropped with the kernel param "swiotlb=force" on
Emulex 10Gb NICs using the be2net driver. The problem is caused by
receiving an skb without calling pci_unmap_page() in get_rx_page_info().
rx_page_info->last_page_user is initialized to false in
be_post_rx_frags() when the current frag is mapped in the first half of
the same page as another frag. But in that case, with the
"swiotlb=force" param, data cannot be copied into the page of
rx_page_info without calling pci_unmap_page(), so the data frag mapped
in the first half of the page is dropped.

This can be solved by creating a one-to-one mapping between frag
and page, and deleting rx_page_info->last_page_user to ensure that
pci_unmap_page() is called when handling each received frag.

Steps to reproduce the bug:
1. Prepare an Emulex Corporation OneConnect 10Gb NIC.
2. Add the kernel param "swiotlb=force" in /boot/grub/grub.conf.
3. Reboot the system (e.g. run the reboot command).
4. Activate the interface (e.g. ifconfig eth0 192.168.1.2 up).
5. Packets are dropped when pinging 192.168.1.2 from another host.

Signed-off-by: Li Fengmao <li.fengmao@zte.com.cn>
Signed-off-by: Long Chun <long.chun@zte.com.cn>
Reviewed-by: Wang Liang <wang.liang82@zte.com.cn>
Reviewed-by: Cai Qu <cai.qu@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>

--- old/drivers/net/ethernet/emulex/benet/be_main.c	2014-02-20 08:49:49.322503588 +0800
+++ new/drivers/net/ethernet/emulex/benet/be_main.c	2014-02-20 08:56:38.796503104 +0800
@@ -1018,12 +1018,9 @@ get_rx_page_info(struct be_adapter *adap
 	rx_page_info = &rxo->page_info_tbl[frag_idx];
 	BUG_ON(!rx_page_info->page);

-	if (rx_page_info->last_page_user) {
-		dma_unmap_page(&adapter->pdev->dev,
-		               dma_unmap_addr(rx_page_info, bus),
-		               adapter->big_page_size, DMA_FROM_DEVICE);
-		rx_page_info->last_page_user = false;
-	}
+	dma_unmap_page(&adapter->pdev->dev,
+	               dma_unmap_addr(rx_page_info, bus),
+	               rx_frag_size, DMA_FROM_DEVICE);

 	atomic_dec(&rxq->used);
 	return rx_page_info;
@@ -1344,20 +1341,15 @@ static void be_post_rx_frags(struct be_r

 	page_info = &rxo->page_info_tbl[rxq->head];
 	for (posted = 0; posted < MAX_RX_POST && !page_info->page; posted++) {
-		if (!pagep) {
-			pagep = be_alloc_pages(adapter->big_page_size, gfp);
-			if (unlikely(!pagep)) {
-				rx_stats(rxo)->rx_post_fail++;
-				break;
-			}
-			page_dmaaddr = dma_map_page(&adapter->pdev->dev, pagep,
-			                            0, adapter->big_page_size,
-			                            DMA_FROM_DEVICE);
-			page_info->page_offset = 0;
-		} else {
-			get_page(pagep);
-			page_info->page_offset = page_offset + rx_frag_size;
+		pagep = be_alloc_pages(rx_frag_size, gfp);
+		if (unlikely(!pagep)) {
+			rx_stats(rxo)->rx_post_fail++;
+			break;
 		}
+		page_dmaaddr = dma_map_page(&adapter->pdev->dev, pagep,
+		                            0, rx_frag_size,
+		                            DMA_FROM_DEVICE);
+		page_info->page_offset = 0;
 		page_offset = page_info->page_offset;
 		page_info->page = pagep;
 		dma_unmap_addr_set(page_info, bus, page_dmaaddr);
@@ -1367,12 +1359,7 @@ static void be_post_rx_frags(struct be_r
 		rxd->fragpa_lo = cpu_to_le32(frag_dmaaddr & 0xFFFFFFFF);
 		rxd->fragpa_hi = cpu_to_le32(upper_32_bits(frag_dmaaddr));

-		/* Any space left in the current big page for another frag? */
-		if ((page_offset + rx_frag_size + rx_frag_size) >
-					adapter->big_page_size) {
-			pagep = NULL;
-			page_info->last_page_user = true;
-		}
+		pagep = NULL;

 		prev_page_info = page_info;
 		queue_head_inc(rxq);

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force
  2014-02-20  1:56 [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force jiang.biao2
@ 2014-02-20  9:39 ` Sathya Perla
       [not found]   ` <OF23C58A81.24C211B9-ON48257C86.0025CED7-48257C86.0026D02E@zte.com.cn>
  2014-02-22  1:49   ` Ben Hutchings
  0 siblings, 2 replies; 12+ messages in thread
From: Sathya Perla @ 2014-02-20  9:39 UTC (permalink / raw)
  To: jiang.biao2, netdev
  Cc: Subramanian Seetharaman, Ajit Khaparde, wang.liang82, cai.qu,
	li.fengmao, long.chun, David Miller

> -----Original Message-----
> From: jiang.biao2@zte.com.cn [mailto:jiang.biao2@zte.com.cn]
> 
> 
> From: Li Fengmao <li.fengmao@zte.com.cn>
> 
> There will be packet drop with kernel param "swiotlb = force" on
> Emulex 10Gb NIC using be2net driver. The problem is caused by
> receiving skb without calling pci_unmap_page() in get_rx_page_info().
> rx_page_info->last_page_user is initialized to false in
> be_post_rx_frags() when current frag are mapped in the first half of
> the same page with another frag. But in that case with
> "swiotlb = force" param, data can not be copied into the page of
> rx_page_info without calling pci_unmap_page, so the data frag mapped
> in the first half of the page will be dropped.
> 
> It can be solved by creating only a mapping relation between frag
> and page, and deleting rx_page_info->last_page_user to ensure
> calling pci_unmap_page when handling each receiving frag.

This patch uses an entire page for each RX frag (whose default size is 2048).
Consequently, on platforms like ppc64 where the default PAGE_SIZE is 64K,
memory usage becomes very inefficient.

Instead, I've tried a partial-page mapping scheme. This retains the
page-sharing logic, but unmaps each frag separately so that
the data is copied from the bounce buffers.

Please see if this works for you; thanks.

---
diff --git a/drivers/net/ethernet/emulex/benet/be.h b/drivers/net/ethernet/emulex/benet/be.h
index a150401..013777a 100644
--- a/drivers/net/ethernet/emulex/benet/be.h
+++ b/drivers/net/ethernet/emulex/benet/be.h
@@ -263,7 +263,6 @@ struct be_rx_page_info {
 	struct page *page;
 	DEFINE_DMA_UNMAP_ADDR(bus);
 	u16 page_offset;
-	bool last_page_user;
 };
 
 struct be_rx_stats {
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 4f87f5c..9d8ea91 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -1448,13 +1448,8 @@ static struct be_rx_page_info *get_rx_page_info(struct be_rx_obj *rxo)
 	rx_page_info = &rxo->page_info_tbl[frag_idx];
 	BUG_ON(!rx_page_info->page);
 
-	if (rx_page_info->last_page_user) {
-		dma_unmap_page(&adapter->pdev->dev,
-			       dma_unmap_addr(rx_page_info, bus),
-			       adapter->big_page_size, DMA_FROM_DEVICE);
-		rx_page_info->last_page_user = false;
-	}
-
+	dma_unmap_page(&adapter->pdev->dev, dma_unmap_addr(rx_page_info, bus),
+		       rx_frag_size, DMA_FROM_DEVICE);
 	queue_tail_inc(rxq);
 	atomic_dec(&rxq->used);
 	return rx_page_info;
@@ -1761,12 +1756,12 @@ static inline struct page *be_alloc_pages(u32 size, gfp_t gfp)
 static void be_post_rx_frags(struct be_rx_obj *rxo, gfp_t gfp)
 {
 	struct be_adapter *adapter = rxo->adapter;
-	struct be_rx_page_info *page_info = NULL, *prev_page_info = NULL;
+	struct be_rx_page_info *page_info = NULL;
 	struct be_queue_info *rxq = &rxo->q;
 	struct page *pagep = NULL;
 	struct device *dev = &adapter->pdev->dev;
 	struct be_eth_rx_d *rxd;
-	u64 page_dmaaddr = 0, frag_dmaaddr;
+	u64 frag_dmaaddr;
 	u32 posted, page_offset = 0;
 
 	page_info = &rxo->page_info_tbl[rxq->head];
@@ -1777,24 +1772,22 @@ static void be_post_rx_frags(struct be_rx_obj *rxo, gfp_t gfp)
 				rx_stats(rxo)->rx_post_fail++;
 				break;
 			}
-			page_dmaaddr = dma_map_page(dev, pagep, 0,
-						    adapter->big_page_size,
-						    DMA_FROM_DEVICE);
-			if (dma_mapping_error(dev, page_dmaaddr)) {
-				put_page(pagep);
-				pagep = NULL;
-				rx_stats(rxo)->rx_post_fail++;
-				break;
-			}
 			page_info->page_offset = 0;
 		} else {
 			get_page(pagep);
 			page_info->page_offset = page_offset + rx_frag_size;
 		}
 		page_offset = page_info->page_offset;
+
+		frag_dmaaddr = dma_map_page(dev, pagep, page_offset,
+					    rx_frag_size, DMA_FROM_DEVICE);
+		if (dma_mapping_error(dev, frag_dmaaddr)) {
+			put_page(pagep);
+			rx_stats(rxo)->rx_post_fail++;
+			break;
+		}
 		page_info->page = pagep;
-		dma_unmap_addr_set(page_info, bus, page_dmaaddr);
-		frag_dmaaddr = page_dmaaddr + page_info->page_offset;
+		dma_unmap_addr_set(page_info, bus, frag_dmaaddr);
 
 		rxd = queue_head_node(rxq);
 		rxd->fragpa_lo = cpu_to_le32(frag_dmaaddr & 0xFFFFFFFF);
@@ -1802,17 +1795,12 @@ static void be_post_rx_frags(struct be_rx_obj *rxo, gfp_t gfp)
 
 		/* Any space left in the current big page for another frag? */
 		if ((page_offset + rx_frag_size + rx_frag_size) >
-					adapter->big_page_size) {
+					adapter->big_page_size)
 			pagep = NULL;
-			page_info->last_page_user = true;
-		}
 
-		prev_page_info = page_info;
 		queue_head_inc(rxq);
 		page_info = &rxo->page_info_tbl[rxq->head];
 	}
-	if (pagep)
-		prev_page_info->last_page_user = true;
 
 	if (posted) {
 		atomic_add(posted, &rxq->used);



> 
> Steps to reproduce the bug:
> 1. Prepare a Emulex Corporation OneConnect 10Gb NIC.
> 2. Add the kernel param like "swiotlb = force" in /boot/grub/grub.conf .
> 3. Reboot the system. (e.g exec reboot command)
> 4. Activate the interface. (e.g ifconfig eth0 192.168.1.2 up)
> 5. There will be packet drop when ping 192.168.1.2 from another host.
> 
> Signed-off-by: Li Fengmao <li.fengmao@zte.com.cn>
> Signed-off-by: Long Chun <long.chun@zte.com.cn>
> Reviewed-by: Wang Liang <wang.liang82@zte.com.cn>
> Reviewed-by: Cai Qu <cai.qu@zte.com.cn>
> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
> 
> --- old/drivers/net/ethernet/emulex/benet/be_main.c	2014-02-20
> 08:49:49.322503588 +0800
> +++ new/drivers/net/ethernet/emulex/benet/be_main.c	2014-02-20
> 08:56:38.796503104 +0800
> @@ -1018,12 +1018,9 @@ get_rx_page_info(struct be_adapter *adap
>  	rx_page_info = &rxo->page_info_tbl[frag_idx];
>  	BUG_ON(!rx_page_info->page);
> 
> -	if (rx_page_info->last_page_user) {
> -		dma_unmap_page(&adapter->pdev->dev,
> -		               dma_unmap_addr(rx_page_info, bus),
> -		               adapter->big_page_size, DMA_FROM_DEVICE);
> -		rx_page_info->last_page_user = false;
> -	}
> +	dma_unmap_page(&adapter->pdev->dev,
> +	               dma_unmap_addr(rx_page_info, bus),
> +	               rx_frag_size, DMA_FROM_DEVICE);
> 
>  	atomic_dec(&rxq->used);
>  	return rx_page_info;
> @@ -1344,20 +1341,15 @@ static void be_post_rx_frags(struct be_r
> 
>  	page_info = &rxo->page_info_tbl[rxq->head];
>  	for (posted = 0; posted < MAX_RX_POST && !page_info->page; posted++) {
> -		if (!pagep) {
> -			pagep = be_alloc_pages(adapter->big_page_size, gfp);
> -			if (unlikely(!pagep)) {
> -				rx_stats(rxo)->rx_post_fail++;
> -				break;
> -			}
> -			page_dmaaddr = dma_map_page(&adapter->pdev->dev, pagep,
> -			                            0, adapter->big_page_size,
> -			                            DMA_FROM_DEVICE);
> -			page_info->page_offset = 0;
> -		} else {
> -			get_page(pagep);
> -			page_info->page_offset = page_offset + rx_frag_size;
> +		pagep = be_alloc_pages(rx_frag_size, gfp);
> +		if (unlikely(!pagep)) {
> +			rx_stats(rxo)->rx_post_fail++;
> +			break;
>  		}
> +		page_dmaaddr = dma_map_page(&adapter->pdev->dev, pagep,
> +		                            0, rx_frag_size,
> +		                            DMA_FROM_DEVICE);
> +		page_info->page_offset = 0;
>  		page_offset = page_info->page_offset;
>  		page_info->page = pagep;
>  		dma_unmap_addr_set(page_info, bus, page_dmaaddr);
> @@ -1367,12 +1359,7 @@ static void be_post_rx_frags(struct be_r
>  		rxd->fragpa_lo = cpu_to_le32(frag_dmaaddr & 0xFFFFFFFF);
>  		rxd->fragpa_hi = cpu_to_le32(upper_32_bits(frag_dmaaddr));
> 
> -		/* Any space left in the current big page for another frag? */
> -		if ((page_offset + rx_frag_size + rx_frag_size) >
> -					adapter->big_page_size) {
> -			pagep = NULL;
> -			page_info->last_page_user = true;
> -		}
> +		pagep = NULL;
> 
>  		prev_page_info = page_info;
>  		queue_head_inc(rxq);


* RE: RE: [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force
       [not found]   ` <OF23C58A81.24C211B9-ON48257C86.0025CED7-48257C86.0026D02E@zte.com.cn>
@ 2014-02-21  7:15     ` Sathya Perla
  0 siblings, 0 replies; 12+ messages in thread
From: Sathya Perla @ 2014-02-21  7:15 UTC (permalink / raw)
  To: jiang.biao2, netdev
  Cc: Subramanian Seetharaman, Ajit Khaparde, wang.liang82, cai.qu,
	li.fengmao, long.chun, David Miller

> -----Original Message-----
> From: jiang.biao2@zte.com.cn [mailto:jiang.biao2@zte.com.cn]
> 
> > >
> > > From: Li Fengmao <li.fengmao@zte.com.cn>
> > >
> > > There will be packet drop with kernel param "swiotlb = force" on
> > > Emulex 10Gb NIC using be2net driver. The problem is caused by
> > > receiving skb without calling pci_unmap_page() in get_rx_page_info().
> > > rx_page_info->last_page_user is initialized to false in
> > > be_post_rx_frags() when current frag are mapped in the first half of
> > > the same page with another frag. But in that case with
> > > "swiotlb = force" param, data can not be copied into the page of
> > > rx_page_info without calling pci_unmap_page, so the data frag mapped
> > > in the first half of the page will be dropped.
> > >
> > > It can be solved by creating only a mapping relation between frag
> > > and page, and deleting rx_page_info->last_page_user to ensure
> > > calling pci_unmap_page when handling each receiving frag.
> >
> > This patch uses an entire page for each RX frag (whose default size is 2048).
> > Consequently, on platforms like ppc64 where the default PAGE_SIZE is 64K,
> > memory usage becomes very inefficient.
> >
> > Instead, I've tried a partial-page mapping scheme. This retains the
> > page sharing logic, but un-maps each frag separately so that
> > the data is copied from the bounce buffers.
> >
> > Pls see if this works for you;  thanks.
> >
> Our patch indeed has a problem because it uses an entire page via alloc_pages();
> maybe kmalloc() would be more appropriate.
> Your modification works for us; it should be the right patch for the bug.

Thanks for verifying. I'll re-post my patch with a proper sign-off.

-Sathya


* Re: [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force
  2014-02-20  9:39 ` Sathya Perla
       [not found]   ` <OF23C58A81.24C211B9-ON48257C86.0025CED7-48257C86.0026D02E@zte.com.cn>
@ 2014-02-22  1:49   ` Ben Hutchings
  2014-02-24  6:43     ` Sathya Perla
  2014-02-25 14:06     ` Sathya Perla
  1 sibling, 2 replies; 12+ messages in thread
From: Ben Hutchings @ 2014-02-22  1:49 UTC (permalink / raw)
  To: Sathya Perla
  Cc: jiang.biao2, netdev, Subramanian Seetharaman, Ajit Khaparde,
	wang.liang82, cai.qu, li.fengmao, long.chun, David Miller


On Thu, 2014-02-20 at 09:39 +0000, Sathya Perla wrote:
> > -----Original Message-----
> > From: jiang.biao2@zte.com.cn [mailto:jiang.biao2@zte.com.cn]
> > 
> > 
> > From: Li Fengmao <li.fengmao@zte.com.cn>
> > 
> > There will be packet drop with kernel param "swiotlb = force" on
> > Emulex 10Gb NIC using be2net driver. The problem is caused by
> > receiving skb without calling pci_unmap_page() in get_rx_page_info().
> > rx_page_info->last_page_user is initialized to false in
> > be_post_rx_frags() when current frag are mapped in the first half of
> > the same page with another frag. But in that case with
> > "swiotlb = force" param, data can not be copied into the page of
> > rx_page_info without calling pci_unmap_page, so the data frag mapped
> > in the first half of the page will be dropped.
> > 
> > It can be solved by creating only a mapping relation between frag
> > and page, and deleting rx_page_info->last_page_user to ensure
> > calling pci_unmap_page when handling each receiving frag.
> 
> This patch uses an entire page for each RX frag (whose default size is 2048).
> Consequently, on platforms like ppc64 where the default PAGE_SIZE is 64K,
> memory usage becomes very inefficient.
> 
> Instead, I've tried a partial-page mapping scheme. This retains the
> page sharing logic, but un-maps each frag separately so that
> the data is copied from the bounce buffers.
[...]

You don't need to map/unmap each fragment separately; you can sync a
sub-page range with dma_sync_single_for_cpu().

Ben.

-- 
Ben Hutchings
I haven't lost my mind; it's backed up on tape somewhere.



* RE: [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force
  2014-02-22  1:49   ` Ben Hutchings
@ 2014-02-24  6:43     ` Sathya Perla
  2014-02-25 14:06     ` Sathya Perla
  1 sibling, 0 replies; 12+ messages in thread
From: Sathya Perla @ 2014-02-24  6:43 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: jiang.biao2, netdev, Subramanian Seetharaman, Ajit Khaparde,
	wang.liang82, cai.qu, li.fengmao, long.chun, David Miller

> -----Original Message-----
> From: Ben Hutchings [mailto:ben@decadent.org.uk]
> 
> On Thu, 2014-02-20 at 09:39 +0000, Sathya Perla wrote:
> > > -----Original Message-----
> > > From: jiang.biao2@zte.com.cn [mailto:jiang.biao2@zte.com.cn]
> > >
> > >
> > > From: Li Fengmao <li.fengmao@zte.com.cn>
> > >
> > > There will be packet drop with kernel param "swiotlb = force" on
> > > Emulex 10Gb NIC using be2net driver. The problem is caused by
> > > receiving skb without calling pci_unmap_page() in get_rx_page_info().
> > > rx_page_info->last_page_user is initialized to false in
> > > be_post_rx_frags() when current frag are mapped in the first half of
> > > the same page with another frag. But in that case with
> > > "swiotlb = force" param, data can not be copied into the page of
> > > rx_page_info without calling pci_unmap_page, so the data frag mapped
> > > in the first half of the page will be dropped.
> > >
> > > It can be solved by creating only a mapping relation between frag
> > > and page, and deleting rx_page_info->last_page_user to ensure
> > > calling pci_unmap_page when handling each receiving frag.
> >
> > This patch uses an entire page for each RX frag (whose default size is 2048).
> > Consequently, on platforms like ppc64 where the default PAGE_SIZE is 64K,
> > memory usage becomes very inefficient.
> >
> > Instead, I've tried a partial-page mapping scheme. This retains the
> > page sharing logic, but un-maps each frag separately so that
> > the data is copied from the bounce buffers.
> [...]
> 
> You don't need to map/unmap each fragment separately; you can sync a
> sub-page range with dma_sync_single_for_cpu().
> 

Ha, yes! Thanks for the tip....
-Sathya


* RE: [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force
  2014-02-22  1:49   ` Ben Hutchings
  2014-02-24  6:43     ` Sathya Perla
@ 2014-02-25 14:06     ` Sathya Perla
  2014-02-25 16:58       ` Ben Hutchings
  1 sibling, 1 reply; 12+ messages in thread
From: Sathya Perla @ 2014-02-25 14:06 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: jiang.biao2, netdev, Subramanian Seetharaman, Ajit Khaparde,
	wang.liang82, cai.qu, li.fengmao, long.chun, David Miller

> -----Original Message-----
> From: Ben Hutchings [mailto:ben@decadent.org.uk]
> 
> On Thu, 2014-02-20 at 09:39 +0000, Sathya Perla wrote:
> > > -----Original Message-----
> > > From: jiang.biao2@zte.com.cn [mailto:jiang.biao2@zte.com.cn]
> > >
> > >
> > > From: Li Fengmao <li.fengmao@zte.com.cn>
> > >
> > > There will be packet drop with kernel param "swiotlb = force" on
> > > Emulex 10Gb NIC using be2net driver. The problem is caused by
> > > receiving skb without calling pci_unmap_page() in get_rx_page_info().
> > > rx_page_info->last_page_user is initialized to false in
> > > be_post_rx_frags() when current frag are mapped in the first half of
> > > the same page with another frag. But in that case with
> > > "swiotlb = force" param, data can not be copied into the page of
> > > rx_page_info without calling pci_unmap_page, so the data frag mapped
> > > in the first half of the page will be dropped.
> > >
> > > It can be solved by creating only a mapping relation between frag
> > > and page, and deleting rx_page_info->last_page_user to ensure
> > > calling pci_unmap_page when handling each receiving frag.
> >
> > This patch uses an entire page for each RX frag (whose default size is 2048).
> > Consequently, on platforms like ppc64 where the default PAGE_SIZE is 64K,
> > memory usage becomes very inefficient.
> >
> > Instead, I've tried a partial-page mapping scheme. This retains the
> > page sharing logic, but un-maps each frag separately so that
> > the data is copied from the bounce buffers.
> [...]
> 
> You don't need to map/unmap each fragment separately; you can sync a
> sub-page range with dma_sync_single_for_cpu().
> 

Ben, after syncing each frag with a dma_sync_single_for_cpu() call,
would I still need to dma_unmap_page() after DMA is done on all the frags
of the page? I thought I needed to.

I'm confused to see that swiotlb_bounce() (which copies data from the bounce buffers)
is called from both swiotlb_sync_single_for_cpu() and swiotlb_unmap_page().
I of course don't want the data to be copied from the bounce buffers twice!

thanks,
-Sathya


* Re: [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force
  2014-02-25 14:06     ` Sathya Perla
@ 2014-02-25 16:58       ` Ben Hutchings
  2014-02-26  4:54         ` Sathya Perla
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Hutchings @ 2014-02-25 16:58 UTC (permalink / raw)
  To: Sathya Perla
  Cc: jiang.biao2, netdev, Subramanian Seetharaman, Ajit Khaparde,
	wang.liang82, cai.qu, li.fengmao, long.chun, David Miller


On Tue, 2014-02-25 at 14:06 +0000, Sathya Perla wrote:
> > -----Original Message-----
> > From: Ben Hutchings [mailto:ben@decadent.org.uk]
> > 
> > On Thu, 2014-02-20 at 09:39 +0000, Sathya Perla wrote:
> > > > -----Original Message-----
> > > > From: jiang.biao2@zte.com.cn [mailto:jiang.biao2@zte.com.cn]
> > > >
> > > >
> > > > From: Li Fengmao <li.fengmao@zte.com.cn>
> > > >
> > > > There will be packet drop with kernel param "swiotlb = force" on
> > > > Emulex 10Gb NIC using be2net driver. The problem is caused by
> > > > receiving skb without calling pci_unmap_page() in get_rx_page_info().
> > > > rx_page_info->last_page_user is initialized to false in
> > > > be_post_rx_frags() when current frag are mapped in the first half of
> > > > the same page with another frag. But in that case with
> > > > "swiotlb = force" param, data can not be copied into the page of
> > > > rx_page_info without calling pci_unmap_page, so the data frag mapped
> > > > in the first half of the page will be dropped.
> > > >
> > > > It can be solved by creating only a mapping relation between frag
> > > > and page, and deleting rx_page_info->last_page_user to ensure
> > > > calling pci_unmap_page when handling each receiving frag.
> > >
> > > This patch uses an entire page for each RX frag (whose default size is 2048).
> > > Consequently, on platforms like ppc64 where the default PAGE_SIZE is 64K,
> > > memory usage becomes very inefficient.
> > >
> > > Instead, I've tried a partial-page mapping scheme. This retains the
> > > page sharing logic, but un-maps each frag separately so that
> > > the data is copied from the bounce buffers.
> > [...]
> > 
> > You don't need to map/unmap each fragment separately; you can sync a
> > sub-page range with dma_sync_single_for_cpu().
> > 
> 
> Ben, after syncing each frag with a dma_sync_single_for_cpu() call,
> would I still need to dma_unmap_page() after DMA is done on all the frags
> of the page? I thought I needed to.

Yes.

> I'm confused to see that swiotlb_bounce() (that copies data from the bounce buffers)
> is called from both swiotlb_sync_single_for_cpu() and swiotlb_unmap_page().
> I ofcourse don't want the data to be copied from the bounce buffers twice!

Right, unmap includes a sync so you should:
- for the last frag per page, dma_unmap_page() only
- for every other frag, dma_sync_single_for_cpu() only

Ben.

-- 
Ben Hutchings
Everything should be made as simple as possible, but not simpler.
                                                           - Albert Einstein



* RE: [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force
  2014-02-25 16:58       ` Ben Hutchings
@ 2014-02-26  4:54         ` Sathya Perla
  2014-03-01  1:00           ` Ben Hutchings
  0 siblings, 1 reply; 12+ messages in thread
From: Sathya Perla @ 2014-02-26  4:54 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: jiang.biao2, netdev, Subramanian Seetharaman, Ajit Khaparde,
	wang.liang82, cai.qu, li.fengmao, long.chun, David Miller

> -----Original Message-----
> From: Ben Hutchings [mailto:ben@decadent.org.uk]
> 
...
> > > > >
> > > > > From: Li Fengmao <li.fengmao@zte.com.cn>
> > > > >
> > > > > There will be packet drop with kernel param "swiotlb = force" on
> > > > > Emulex 10Gb NIC using be2net driver. The problem is caused by
> > > > > receiving skb without calling pci_unmap_page() in get_rx_page_info().
> > > > > rx_page_info->last_page_user is initialized to false in
> > > > > be_post_rx_frags() when current frag are mapped in the first half of
> > > > > the same page with another frag. But in that case with
> > > > > "swiotlb = force" param, data can not be copied into the page of
> > > > > rx_page_info without calling pci_unmap_page, so the data frag mapped
> > > > > in the first half of the page will be dropped.
> > > > >
> > > > > It can be solved by creating only a mapping relation between frag
> > > > > and page, and deleting rx_page_info->last_page_user to ensure
> > > > > calling pci_unmap_page when handling each receiving frag.
> > > >
> > > > This patch uses an entire page for each RX frag (whose default size is 2048).
> > > > Consequently, on platforms like ppc64 where the default PAGE_SIZE is 64K,
> > > > memory usage becomes very inefficient.
> > > >
> > > > Instead, I've tried a partial-page mapping scheme. This retains the
> > > > page sharing logic, but un-maps each frag separately so that
> > > > the data is copied from the bounce buffers.
> > > [...]
> > >
> > > You don't need to map/unmap each fragment separately; you can sync a
> > > sub-page range with dma_sync_single_for_cpu().
> > >
> >
> > Ben, after syncing each frag with a dma_sync_single_for_cpu() call,
> > would I still need to dma_unmap_page() after DMA is done on all the frags
> > of the page? I thought I needed to.
> 
> Yes.
> 
> > I'm confused to see that swiotlb_bounce() (that copies data from the bounce buffers)
> > is called from both swiotlb_sync_single_for_cpu() and swiotlb_unmap_page().
> > I ofcourse don't want the data to be copied from the bounce buffers twice!
> 
> Right, unmap includes a sync so you should:
> - for the last frag per page, dma_unmap_page() only
> - for every other frag, dma_sync_single_for_cpu() only

Ben, dma_unmap_page() requires us to pass the same dma_addr and size returned by
(and passed to) the dma_map_page() call.
So, for the last frag, when I call dma_unmap_page() with the page address (and not
the frag address) and the page size, won't it copy the data belonging to all the
previous frags of the page again?!

 thanks,
-Sathya


* Re: [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force
  2014-02-26  4:54         ` Sathya Perla
@ 2014-03-01  1:00           ` Ben Hutchings
  2014-03-03  7:24             ` Sathya Perla
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Hutchings @ 2014-03-01  1:00 UTC (permalink / raw)
  To: Sathya Perla
  Cc: jiang.biao2, netdev, Subramanian Seetharaman, Ajit Khaparde,
	wang.liang82, cai.qu, li.fengmao, long.chun, David Miller


On Wed, 2014-02-26 at 04:54 +0000, Sathya Perla wrote:
> > -----Original Message-----
> > From: Ben Hutchings [mailto:ben@decadent.org.uk]
> > 
> ...
> > > > > >
> > > > > > From: Li Fengmao <li.fengmao@zte.com.cn>
> > > > > >
> > > > > > There will be packet drop with kernel param "swiotlb = force" on
> > > > > > Emulex 10Gb NIC using be2net driver. The problem is caused by
> > > > > > receiving skb without calling pci_unmap_page() in get_rx_page_info().
> > > > > > rx_page_info->last_page_user is initialized to false in
> > > > > > be_post_rx_frags() when current frag are mapped in the first half of
> > > > > > the same page with another frag. But in that case with
> > > > > > "swiotlb = force" param, data can not be copied into the page of
> > > > > > rx_page_info without calling pci_unmap_page, so the data frag mapped
> > > > > > in the first half of the page will be dropped.
> > > > > >
> > > > > > It can be solved by creating only a mapping relation between frag
> > > > > > and page, and deleting rx_page_info->last_page_user to ensure
> > > > > > calling pci_unmap_page when handling each receiving frag.
> > > > >
> > > > > This patch uses an entire page for each RX frag (whose default size is 2048).
> > > > > Consequently, on platforms like ppc64 where the default PAGE_SIZE is 64K,
> > > > > memory usage becomes very inefficient.
> > > > >
> > > > > Instead, I've tried a partial-page mapping scheme. This retains the
> > > > > page sharing logic, but un-maps each frag separately so that
> > > > > the data is copied from the bounce buffers.
> > > > [...]
> > > >
> > > > You don't need to map/unmap each fragment separately; you can sync a
> > > > sub-page range with dma_sync_single_for_cpu().
> > > >
> > >
> > > Ben, after syncing each frag with a dma_sync_single_for_cpu() call,
> > > would I still need to dma_unmap_page() after DMA is done on all the frags
> > > of the page? I thought I needed to.
> > 
> > Yes.
> > 
> > > I'm confused to see that swiotlb_bounce() (that copies data from the bounce buffers)
> > > is called from both swiotlb_sync_single_for_cpu() and swiotlb_unmap_page().
> > > I ofcourse don't want the data to be copied from the bounce buffers twice!
> > 
> > Right, unmap includes a sync so you should:
> > - for the last frag per page, dma_unmap_page() only
> > - for every other frag, dma_sync_single_for_cpu() only
> 
> Ben, the dma_unmap_page() requires us to pass the same dma_addr and size returned by
> (and passed to) the dma_map_page() call.
> So, for the last frag, when I call dma_map_page() with the page_addr (and not the frag_addr) and the
> page size, won't it try to copy the data belonging to all the previous frags of page again?!!

When using bounce buffers (e.g. swiotlb), yes, this results in another
copy, which is inefficient. But when using a real IOMMU, neither
function will copy. I would optimise for IOMMUs.

Ben.

-- 
Ben Hutchings
If at first you don't succeed, you're doing about average.



* RE: [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force
  2014-03-01  1:00           ` Ben Hutchings
@ 2014-03-03  7:24             ` Sathya Perla
  0 siblings, 0 replies; 12+ messages in thread
From: Sathya Perla @ 2014-03-03  7:24 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: jiang.biao2, netdev, Subramanian Seetharaman, Ajit Khaparde,
	wang.liang82, cai.qu, li.fengmao, long.chun, David Miller

> -----Original Message-----
> From: Ben Hutchings [mailto:ben@decadent.org.uk]
> 
> On Wed, 2014-02-26 at 04:54 +0000, Sathya Perla wrote:
> > > -----Original Message-----
> > > From: Ben Hutchings [mailto:ben@decadent.org.uk]
> > >
> > ...
> > > > > > >
> > > > > > > From: Li Fengmao <li.fengmao@zte.com.cn>
> > > > > > >
> > > > > > > There will be packet drop with the kernel param "swiotlb=force" on an
> > > > > > > Emulex 10Gb NIC using the be2net driver. The problem is caused by
> > > > > > > receiving an skb without calling pci_unmap_page() in get_rx_page_info().
> > > > > > > rx_page_info->last_page_user is initialized to false in
> > > > > > > be_post_rx_frags() when the current frag is mapped in the first half of
> > > > > > > the same page as another frag. But in that case, with the
> > > > > > > "swiotlb=force" param, data cannot be copied into the page of
> > > > > > > rx_page_info without calling pci_unmap_page(), so the data frag mapped
> > > > > > > in the first half of the page will be dropped.
> > > > > > >
> > > > > > > It can be solved by creating a mapping relation only between frag
> > > > > > > and page, and deleting rx_page_info->last_page_user to ensure that
> > > > > > > pci_unmap_page() is called when handling each received frag.
> > > > > >
> > > > > > This patch uses an entire page for each RX frag (whose default size is 2048).
> > > > > > Consequently, on platforms like ppc64 where the default PAGE_SIZE is 64K,
> > > > > > memory usage becomes very inefficient.
> > > > > >
> > > > > > Instead, I've tried a partial-page mapping scheme. This retains the
> > > > > > page sharing logic, but un-maps each frag separately so that
> > > > > > the data is copied from the bounce buffers.
> > > > > [...]
> > > > >
> > > > > You don't need to map/unmap each fragment separately; you can sync a
> > > > > sub-page range with dma_sync_single_for_cpu().
> > > > >
> > > >
> > > > Ben, after syncing each frag with a dma_sync_single_for_cpu() call,
> > > > would I still need to dma_unmap_page() after DMA is done on all the frags
> > > > of the page? I thought I needed to.
> > >
> > > Yes.
> > >
> > > > I'm confused to see that swiotlb_bounce() (that copies data from the bounce buffers)
> > > > is called from both swiotlb_sync_single_for_cpu() and swiotlb_unmap_page().
> > > > I of course don't want the data to be copied from the bounce buffers twice!
> > >
> > > Right, unmap includes a sync so you should:
> > > - for the last frag per page, dma_unmap_page() only
> > > - for every other frag, dma_sync_single_for_cpu() only
> >
> > Ben, the dma_unmap_page() requires us to pass the same dma_addr and size returned by
> > (and passed to) the dma_map_page() call.
> > So, for the last frag, when I call dma_unmap_page() with the page_addr (and not the
> > frag_addr) and the page size, won't it try to copy the data belonging to all the
> > previous frags of the page again?!
> 
> When using bounce buffers (e.g. swiotlb), yes this results in another
> copy, which is inefficient.  But when using a real IOMMU neither
> function will copy.  I would optimise for IOMMUs.

Ok, that sounds right.....thanks for the help!
-Sathya

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force
  2014-02-19  8:25 jiang.biao2
@ 2014-02-19 21:44 ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2014-02-19 21:44 UTC (permalink / raw)
  To: jiang.biao2
  Cc: netdev, sathya.perla, subbu.seetharaman, ajit.khaparde,
	wang.liang82, cai.qu, li.fengmao, long.chun

From: jiang.biao2@zte.com.cn
Date: Wed, 19 Feb 2014 16:25:40 +0800

> +        dma_unmap_page(&adapter->pdev->dev,
> +               dma_unmap_addr(rx_page_info, bus),
> +               rx_frag_size, DMA_FROM_DEVICE);

This is not indented properly.

Arguments on the second and subsequent lines of a function call must
start exactly at the first column after the opening parenthesis of the
function call on the first line.

You must use the appropriate number of TAB and SPACE characters
necessary to do so.

If you are indenting only using TAB characters, you are doing it wrong.


* [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force
@ 2014-02-19  8:25 jiang.biao2
  2014-02-19 21:44 ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: jiang.biao2 @ 2014-02-19  8:25 UTC (permalink / raw)
  To: netdev
  Cc: Sathya Perla, Subbu Seetharaman, Ajit Khaparde, wang.liang82,
	cai.qu, li.fengmao, long.chun


From: Li Fengmao <li.fengmao@zte.com.cn>

There will be packet drop with the kernel param "swiotlb=force" on an
Emulex 10Gb NIC using the be2net driver. The problem is caused by
receiving an skb without calling pci_unmap_page() in get_rx_page_info().
rx_page_info->last_page_user is initialized to false in
be_post_rx_frags() when the current frag is mapped in the first half of
the same page as another frag. But in that case, with the
"swiotlb=force" param, data cannot be copied into the page of
rx_page_info without calling pci_unmap_page(), so the data frag mapped
in the first half of the page will be dropped.

It can be solved by creating a mapping relation only between frag
and page, and deleting rx_page_info->last_page_user to ensure that
pci_unmap_page() is called when handling each received frag.

Steps to reproduce the bug:
1. Prepare an Emulex Corporation OneConnect 10Gb NIC.
2. Add the kernel param "swiotlb=force" in /boot/grub/grub.conf.
3. Reboot the system (e.g. run the reboot command).
4. Activate the interface (e.g. ifconfig eth0 192.168.1.2 up).
5. There will be packet drop when pinging 192.168.1.2 from another host.

Signed-off-by: Li Fengmao <li.fengmao@zte.com.cn>
Signed-off-by: Long Chun <long.chun@zte.com.cn>
Reviewed-by: Wang Liang <wang.liang82@zte.com.cn>
Reviewed-by: Cai Qu <cai.qu@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>

--- old/drivers/net/ethernet/emulex/benet/be_main.c	2014-02-18 03:34:15.206388270 -0500
+++ new/drivers/net/ethernet/emulex/benet/be_main.c	2014-02-18 03:44:17.368388223 -0500
@@ -1018,13 +1018,9 @@ get_rx_page_info(struct be_adapter *adap
 	rx_page_info = &rxo->page_info_tbl[frag_idx];
 	BUG_ON(!rx_page_info->page);

-	if (rx_page_info->last_page_user) {
-		dma_unmap_page(&adapter->pdev->dev,
-			       dma_unmap_addr(rx_page_info, bus),
-			       adapter->big_page_size, DMA_FROM_DEVICE);
-		rx_page_info->last_page_user = false;
-	}
-
+        dma_unmap_page(&adapter->pdev->dev,
+               dma_unmap_addr(rx_page_info, bus),
+               rx_frag_size, DMA_FROM_DEVICE);
 	atomic_dec(&rxq->used);
 	return rx_page_info;
 }
@@ -1344,20 +1340,15 @@ static void be_post_rx_frags(struct be_r

 	page_info = &rxo->page_info_tbl[rxq->head];
 	for (posted = 0; posted < MAX_RX_POST && !page_info->page; posted++) {
-		if (!pagep) {
-			pagep = be_alloc_pages(adapter->big_page_size, gfp);
-			if (unlikely(!pagep)) {
-				rx_stats(rxo)->rx_post_fail++;
-				break;
-			}
-			page_dmaaddr = dma_map_page(&adapter->pdev->dev, pagep,
-						    0, adapter->big_page_size,
-						    DMA_FROM_DEVICE);
-			page_info->page_offset = 0;
-		} else {
-			get_page(pagep);
-			page_info->page_offset = page_offset + rx_frag_size;
+		pagep = be_alloc_pages(rx_frag_size, gfp);
+		if (unlikely(!pagep)) {
+			rx_stats(rxo)->rx_post_fail++;
+			break;
 		}
+		page_dmaaddr = dma_map_page(&adapter->pdev->dev, pagep,
+						    0, rx_frag_size,
+						    DMA_FROM_DEVICE);
+		page_info->page_offset = 0;
 		page_offset = page_info->page_offset;
 		page_info->page = pagep;
 		dma_unmap_addr_set(page_info, bus, page_dmaaddr);
@@ -1367,12 +1358,7 @@ static void be_post_rx_frags(struct be_r
 		rxd->fragpa_lo = cpu_to_le32(frag_dmaaddr & 0xFFFFFFFF);
 		rxd->fragpa_hi = cpu_to_le32(upper_32_bits(frag_dmaaddr));

-		/* Any space left in the current big page for another frag? */
-		if ((page_offset + rx_frag_size + rx_frag_size) >
-					adapter->big_page_size) {
-			pagep = NULL;
-			page_info->last_page_user = true;
-		}
+		pagep = NULL;

 		prev_page_info = page_info;
 		queue_head_inc(rxq);


Thread overview: 12+ messages
2014-02-20  1:56 [PATCH] be2net: Bugfix for packet drop with kernel param swiotlb=force jiang.biao2
2014-02-20  9:39 ` Sathya Perla
     [not found]   ` <OF23C58A81.24C211B9-ON48257C86.0025CED7-48257C86.0026D02E@zte.com.cn>
2014-02-21  7:15     ` Sathya Perla
2014-02-22  1:49   ` Ben Hutchings
2014-02-24  6:43     ` Sathya Perla
2014-02-25 14:06     ` Sathya Perla
2014-02-25 16:58       ` Ben Hutchings
2014-02-26  4:54         ` Sathya Perla
2014-03-01  1:00           ` Ben Hutchings
2014-03-03  7:24             ` Sathya Perla
  -- strict thread matches above, loose matches on Subject: below --
2014-02-19  8:25 jiang.biao2
2014-02-19 21:44 ` David Miller
