All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] Implement persistent grant in xen-netfront/netback
@ 2012-11-15  7:03 Annie Li
  2012-11-15  7:04 ` [PATCH 1/4] xen/netback: implements persistent grant with one page pool Annie Li
                   ` (8 more replies)
  0 siblings, 9 replies; 42+ messages in thread
From: Annie Li @ 2012-11-15  7:03 UTC (permalink / raw)
  To: xen-devel, netdev, konrad.wilk, Ian.Campbell; +Cc: annie.li

This patch implements persistent grants for xen-netfront/netback. This
mechanism maintains page pools in netback/netfront, these page pools is used to
save grant pages which are mapped. This way improve performance which is wasted
when doing grant operations.

Current netback/netfront does map/unmap grant operations frequently when
transmitting/receiving packets, and grant operations costs much cpu clock. In
this patch, netfront/netback maps grant pages when needed and then saves them
into a page pool for future use. All these pages will be unmapped when
removing/releasing the net device.

In netfront, two pools are maintained for transmitting and receiving packets.
When new grant pages are needed, the driver gets grant pages from this pool
first. If no free grant page exists, it allocates new page, maps it and then
saves it into the pool. The pool size for transmit/receive is exactly tx/rx
ring size. The driver uses memcpy(not grantcopy) to copy data grant pages.
Here, memcpy is copying the whole page size data. I tried to copy len size data
from offset, but network does not seem work well. I am trying to find the root
cause now.

In netback, it also maintains two page pools for tx/rx. When netback gets a
request, it does a search first to find out whether the grant reference of
this request is already mapped into its page pool. If the grant ref is mapped,
the address of this mapped page is gotten and memcpy is used to copy data
between grant pages. However, if the grant ref is not mapped, a new page is
allocated, mapped with this grant ref, and then saved into page pool for
future use. Similarly, memcpy replaces grant copy to copy data between grant
pages. In this implementation, two arrays(gnttab_tx_vif,gnttab_rx_vif) are
used to save vif pointer for every request because current netback is not
per-vif based. This would be changed after implementing 1:1 model in netback.

This patch supports both persistent-grant and non persistent grant. A new
xenstore key "feature-persistent-grants" is used to represent this feature.

This patch is based on linux3.4-rc3. I hit netperf/netserver failure on
linux latest version v3.7-rc1, v3.7-rc2 and v3.7-rc4. Not sure whether this
netperf/netserver failure connects compound page commit in v3.7-rc1, but I did
hit BUG_ON with debug patch from thread
http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.html


Annie Li (4):
  xen/netback: implements persistent grant with one page pool.
  xen/netback: Split one page pool into two(tx/rx) page pool.
  Xen/netfront: Implement persistent grant in netfront.
  fix code indent issue in xen-netfront.

 drivers/net/xen-netback/common.h    |   24 ++-
 drivers/net/xen-netback/interface.c |   26 +++
 drivers/net/xen-netback/netback.c   |  215 ++++++++++++++++++--
 drivers/net/xen-netback/xenbus.c    |   14 ++-
 drivers/net/xen-netfront.c          |  378 +++++++++++++++++++++++++++++------
 5 files changed, 570 insertions(+), 87 deletions(-)

-- 
1.7.3.4

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
  2012-11-15  7:03 [PATCH 0/4] Implement persistent grant in xen-netfront/netback Annie Li
@ 2012-11-15  7:04 ` Annie Li
  2012-11-15  9:10   ` Ian Campbell
  2012-11-15  9:57   ` Roger Pau Monné
  2012-11-15  7:04 ` [PATCH 2/4] xen/netback: Split one page pool into two(tx/rx) " Annie Li
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 42+ messages in thread
From: Annie Li @ 2012-11-15  7:04 UTC (permalink / raw)
  To: xen-devel, netdev, konrad.wilk, Ian.Campbell; +Cc: annie.li

This patch implements persistent grant in netback driver. Tx and rx
share the same page pool, this pool will be split into two parts
in next patch.

Signed-off-by: Annie Li <annie.li@oracle.com>
---
 drivers/net/xen-netback/common.h    |   18 +++-
 drivers/net/xen-netback/interface.c |   22 ++++
 drivers/net/xen-netback/netback.c   |  212 +++++++++++++++++++++++++++++++----
 drivers/net/xen-netback/xenbus.c    |   14 ++-
 4 files changed, 239 insertions(+), 27 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 94b79c3..a85cac6 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -45,8 +45,19 @@
 #include <xen/grant_table.h>
 #include <xen/xenbus.h>
 
+#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
+#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
+#define MAXIMUM_OUTSTANDING_BLOCK_REQS \
+			(XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE)
+
 struct xen_netbk;
 
+struct persistent_entry {
+	grant_ref_t forgranted;
+	struct page *fpage;
+	struct gnttab_map_grant_ref map;
+};
+
 struct xenvif {
 	/* Unique identifier for this interface. */
 	domid_t          domid;
@@ -75,6 +86,7 @@ struct xenvif {
 
 	/* Internal feature information. */
 	u8 can_queue:1;	    /* can queue packets for receiver? */
+	u8 persistent_grant:1;
 
 	/*
 	 * Allow xenvif_start_xmit() to peek ahead in the rx request
@@ -98,6 +110,9 @@ struct xenvif {
 	struct net_device *dev;
 
 	wait_queue_head_t waiting_to_free;
+
+	struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];
+	unsigned int persistent_gntcnt;
 };
 
 static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
@@ -105,9 +120,6 @@ static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
 	return to_xenbus_device(vif->dev->dev.parent);
 }
 
-#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
-#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
-
 struct xenvif *xenvif_alloc(struct device *parent,
 			    domid_t domid,
 			    unsigned int handle);
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index b7d41f8..226d159 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -300,6 +300,8 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 		return ERR_PTR(err);
 	}
 
+	vif->persistent_gntcnt = 0;
+
 	netdev_dbg(dev, "Successfully created xenvif\n");
 	return vif;
 }
@@ -343,6 +345,23 @@ err:
 	return err;
 }
 
+void xenvif_free_grants(struct persistent_entry **pers_entry,
+			unsigned int count)
+{
+	int i, ret;
+	struct gnttab_unmap_grant_ref unmap;
+
+	for (i = 0; i < count; i++) {
+		gnttab_set_unmap_op(&unmap,
+			(unsigned long)page_to_pfn(pers_entry[i]->fpage),
+			GNTMAP_host_map,
+			pers_entry[i]->map.handle);
+		ret = gnttab_unmap_refs(&unmap, &pers_entry[i]->fpage,
+					1, false);
+		BUG_ON(ret);
+	}
+}
+
 void xenvif_disconnect(struct xenvif *vif)
 {
 	struct net_device *dev = vif->dev;
@@ -366,6 +385,9 @@ void xenvif_disconnect(struct xenvif *vif)
 	unregister_netdev(vif->dev);
 
 	xen_netbk_unmap_frontend_rings(vif);
+	if (vif->persistent_grant)
+		xenvif_free_grants(vif->persistent_gnt,
+				   vif->persistent_gntcnt);
 
 	free_netdev(vif->dev);
 }
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 2596401..a26d3fc 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -80,6 +80,8 @@ union page_ext {
 	void *mapping;
 };
 
+struct xenvif;
+
 struct xen_netbk {
 	wait_queue_head_t wq;
 	struct task_struct *task;
@@ -102,6 +104,7 @@ struct xen_netbk {
 
 	struct pending_tx_info pending_tx_info[MAX_PENDING_REQS];
 	struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS];
+	struct xenvif *gnttab_tx_vif[MAX_PENDING_REQS];
 
 	u16 pending_ring[MAX_PENDING_REQS];
 
@@ -111,12 +114,139 @@ struct xen_netbk {
 	 * straddles two buffers in the frontend.
 	 */
 	struct gnttab_copy grant_copy_op[2*XEN_NETIF_RX_RING_SIZE];
+	struct xenvif *gnttab_rx_vif[2*XEN_NETIF_RX_RING_SIZE];
 	struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE];
 };
 
 static struct xen_netbk *xen_netbk;
 static int xen_netbk_group_nr;
 
+static struct persistent_entry*
+get_per_gnt(struct persistent_entry **pers_entry,
+	    unsigned int count, grant_ref_t gref)
+{
+	int i;
+
+	for (i = 0; i < count; i++)
+		if (gref == pers_entry[i]->forgranted)
+			return pers_entry[i];
+
+	return NULL;
+}
+
+static void*
+map_new_gnt(struct persistent_entry **pers_entry, unsigned int count,
+	    grant_ref_t ref, domid_t domid)
+{
+	struct gnttab_map_grant_ref *map;
+	struct page *page;
+	unsigned long vaddr;
+	unsigned long pfn;
+	uint32_t flags;
+	int ret = 0;
+
+	pers_entry[count] = (struct persistent_entry *)
+			    kmalloc(sizeof(struct persistent_entry),
+				    GFP_KERNEL);
+	if (!pers_entry[count])
+		return ERR_PTR(-ENOMEM);
+
+	map = &pers_entry[count]->map;
+	page = alloc_page(GFP_KERNEL);
+	if (!page) {
+		kfree(pers_entry[count]);
+		pers_entry[count] = NULL;
+		return ERR_PTR(-ENOMEM);
+	}
+
+	pers_entry[count]->fpage = page;
+	pfn = page_to_pfn(page);
+	vaddr = (unsigned long)pfn_to_kaddr(pfn);
+	flags = GNTMAP_host_map;
+
+	gnttab_set_map_op(map, vaddr, flags, ref, domid);
+	ret = gnttab_map_refs(map, NULL, &page, 1);
+	BUG_ON(ret);
+
+	pers_entry[count]->forgranted = ref;
+
+	return page_address(page);
+}
+
+static void*
+get_ref_page(struct persistent_entry **pers_entry, unsigned int *count,
+	     grant_ref_t ref, domid_t domid, unsigned int total)
+{
+	struct persistent_entry *per_gnt;
+	void *vaddr;
+
+	per_gnt = get_per_gnt(pers_entry, *count, ref);
+
+	if (per_gnt != NULL)
+		return page_address(per_gnt->fpage);
+	else {
+		BUG_ON(*count >= total);
+		vaddr = map_new_gnt(pers_entry, *count, ref, domid);
+		if (IS_ERR_OR_NULL(vaddr))
+			return vaddr;
+		*count += 1;
+		return vaddr;
+	}
+}
+
+static int
+grant_memory_copy_op(unsigned int cmd, void *vuop, unsigned int count,
+		     struct xen_netbk *netbk, bool tx_pool)
+{
+	int i;
+	struct xenvif *vif;
+	struct gnttab_copy *uop = vuop;
+	unsigned int *gnt_count;
+	unsigned int gnt_total;
+	struct persistent_entry **pers_entry;
+	int ret = 0;
+
+	BUG_ON(cmd != GNTTABOP_copy);
+	for (i = 0; i < count; i++) {
+		if (tx_pool)
+			vif = netbk->gnttab_tx_vif[i];
+		else
+			vif = netbk->gnttab_rx_vif[i];
+
+		pers_entry = vif->persistent_gnt;
+		gnt_count = &vif->persistent_gntcnt;
+		gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS;
+
+		if (vif->persistent_grant) {
+			void *saddr, *daddr;
+
+			saddr = uop[i].source.domid == DOMID_SELF ?
+				(void *) uop[i].source.u.gmfn :
+				get_ref_page(pers_entry, gnt_count,
+					     uop[i].source.u.ref,
+					     uop[i].source.domid,
+					     gnt_total);
+			if (IS_ERR_OR_NULL(saddr))
+				return -ENOMEM;
+
+			daddr = uop[i].dest.domid == DOMID_SELF ?
+				(void *) uop[i].dest.u.gmfn :
+				get_ref_page(pers_entry, gnt_count,
+					     uop[i].dest.u.ref,
+					     uop[i].dest.domid,
+					     gnt_total);
+			if (IS_ERR_OR_NULL(daddr))
+				return -ENOMEM;
+
+			memcpy(daddr+uop[i].dest.offset,
+			       saddr+uop[i].source.offset, uop[i].len);
+		} else
+			ret = HYPERVISOR_grant_table_op(cmd, &uop[i], 1);
+	}
+
+	return ret;
+}
+
 void xen_netbk_add_xenvif(struct xenvif *vif)
 {
 	int i;
@@ -387,13 +517,15 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
  * Set up the grant operations for this fragment. If it's a flipping
  * interface, we also set up the unmap request from here.
  */
-static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
-				struct netrx_pending_operations *npo,
-				struct page *page, unsigned long size,
-				unsigned long offset, int *head)
+static int netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
+			       struct netrx_pending_operations *npo,
+			       struct page *page, unsigned long size,
+			       unsigned long offset, int *head,
+			       struct xenvif **grxvif)
 {
 	struct gnttab_copy *copy_gop;
 	struct netbk_rx_meta *meta;
+	int count = 0;
 	/*
 	 * These variables are used iff get_page_ext returns true,
 	 * in which case they are guaranteed to be initialized.
@@ -425,6 +557,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 			bytes = MAX_BUFFER_OFFSET - npo->copy_off;
 
 		copy_gop = npo->copy + npo->copy_prod++;
+		*grxvif = vif;
+		grxvif++;
+		count++;
+
 		copy_gop->flags = GNTCOPY_dest_gref;
 		if (foreign) {
 			struct xen_netbk *netbk = &xen_netbk[group];
@@ -438,7 +574,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 		} else {
 			void *vaddr = page_address(page);
 			copy_gop->source.domid = DOMID_SELF;
-			copy_gop->source.u.gmfn = virt_to_mfn(vaddr);
+			if (!vif->persistent_grant)
+				copy_gop->source.u.gmfn = virt_to_mfn(vaddr);
+			else
+				copy_gop->source.u.gmfn = (unsigned long)vaddr;
 		}
 		copy_gop->source.offset = offset;
 		copy_gop->dest.domid = vif->domid;
@@ -460,6 +599,7 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 		*head = 0; /* There must be something in this buffer now. */
 
 	}
+	return count;
 }
 
 /*
@@ -474,8 +614,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
  * zero GSO descriptors (for non-GSO packets) or one descriptor (for
  * frontend-side LRO).
  */
-static int netbk_gop_skb(struct sk_buff *skb,
-			 struct netrx_pending_operations *npo)
+static int netbk_gop_skb(struct xen_netbk *netbk,
+			 struct sk_buff *skb,
+			 struct netrx_pending_operations *npo,
+			 struct xenvif **grxvif)
 {
 	struct xenvif *vif = netdev_priv(skb->dev);
 	int nr_frags = skb_shinfo(skb)->nr_frags;
@@ -518,17 +660,19 @@ static int netbk_gop_skb(struct sk_buff *skb,
 		if (data + len > skb_tail_pointer(skb))
 			len = skb_tail_pointer(skb) - data;
 
-		netbk_gop_frag_copy(vif, skb, npo,
-				    virt_to_page(data), len, offset, &head);
+		grxvif += netbk_gop_frag_copy(vif, skb, npo,
+					      virt_to_page(data), len,
+					      offset, &head, grxvif);
+
 		data += len;
 	}
 
 	for (i = 0; i < nr_frags; i++) {
-		netbk_gop_frag_copy(vif, skb, npo,
-				    skb_frag_page(&skb_shinfo(skb)->frags[i]),
-				    skb_frag_size(&skb_shinfo(skb)->frags[i]),
-				    skb_shinfo(skb)->frags[i].page_offset,
-				    &head);
+		grxvif += netbk_gop_frag_copy(vif, skb, npo,
+				skb_frag_page(&skb_shinfo(skb)->frags[i]),
+				skb_frag_size(&skb_shinfo(skb)->frags[i]),
+				skb_shinfo(skb)->frags[i].page_offset,
+				&head, grxvif);
 	}
 
 	return npo->meta_prod - old_meta_prod;
@@ -593,6 +737,8 @@ struct skb_cb_overlay {
 static void xen_netbk_rx_action(struct xen_netbk *netbk)
 {
 	struct xenvif *vif = NULL, *tmp;
+	struct xenvif **grxvif = netbk->gnttab_rx_vif;
+	int old_copy_prod = 0;
 	s8 status;
 	u16 irq, flags;
 	struct xen_netif_rx_response *resp;
@@ -619,7 +765,9 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
 		nr_frags = skb_shinfo(skb)->nr_frags;
 
 		sco = (struct skb_cb_overlay *)skb->cb;
-		sco->meta_slots_used = netbk_gop_skb(skb, &npo);
+		sco->meta_slots_used = netbk_gop_skb(netbk, skb, &npo, grxvif);
+		grxvif += npo.copy_prod - old_copy_prod;
+		old_copy_prod = npo.copy_prod;
 
 		count += nr_frags + 1;
 
@@ -636,8 +784,10 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
 		return;
 
 	BUG_ON(npo.copy_prod > ARRAY_SIZE(netbk->grant_copy_op));
-	ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, &netbk->grant_copy_op,
-					npo.copy_prod);
+	ret = grant_memory_copy_op(GNTTABOP_copy,
+				   &netbk->grant_copy_op,
+				   npo.copy_prod, netbk,
+				   false);
 	BUG_ON(ret != 0);
 
 	while ((skb = __skb_dequeue(&rxq)) != NULL) {
@@ -918,7 +1068,8 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
 						  struct xenvif *vif,
 						  struct sk_buff *skb,
 						  struct xen_netif_tx_request *txp,
-						  struct gnttab_copy *gop)
+						  struct gnttab_copy *gop,
+						  struct xenvif **gtxvif)
 {
 	struct skb_shared_info *shinfo = skb_shinfo(skb);
 	skb_frag_t *frags = shinfo->frags;
@@ -944,7 +1095,11 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
 		gop->source.domid = vif->domid;
 		gop->source.offset = txp->offset;
 
-		gop->dest.u.gmfn = virt_to_mfn(page_address(page));
+		if (!vif->persistent_grant)
+			gop->dest.u.gmfn = virt_to_mfn(page_address(page));
+		else
+			gop->dest.u.gmfn = (unsigned long)page_address(page);
+
 		gop->dest.domid = DOMID_SELF;
 		gop->dest.offset = txp->offset;
 
@@ -952,6 +1107,9 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
 		gop->flags = GNTCOPY_source_gref;
 
 		gop++;
+		*gtxvif = vif;
+		gtxvif++;
+
 
 		memcpy(&pending_tx_info[pending_idx].req, txp, sizeof(*txp));
 		xenvif_get(vif);
@@ -1218,6 +1376,7 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size)
 static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
 {
 	struct gnttab_copy *gop = netbk->tx_copy_ops, *request_gop;
+	struct xenvif **gtxvif = netbk->gnttab_tx_vif;
 	struct sk_buff *skb;
 	int ret;
 
@@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
 		gop->source.domid = vif->domid;
 		gop->source.offset = txreq.offset;
 
-		gop->dest.u.gmfn = virt_to_mfn(page_address(page));
+		if (!vif->persistent_grant)
+			gop->dest.u.gmfn = virt_to_mfn(page_address(page));
+		else
+			gop->dest.u.gmfn = (unsigned long)page_address(page);
+
 		gop->dest.domid = DOMID_SELF;
 		gop->dest.offset = txreq.offset;
 
@@ -1346,6 +1509,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
 		gop->flags = GNTCOPY_source_gref;
 
 		gop++;
+		*gtxvif++ = vif;
 
 		memcpy(&netbk->pending_tx_info[pending_idx].req,
 		       &txreq, sizeof(txreq));
@@ -1369,12 +1533,13 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
 		netbk->pending_cons++;
 
 		request_gop = xen_netbk_get_requests(netbk, vif,
-						     skb, txfrags, gop);
+						     skb, txfrags, gop, gtxvif);
 		if (request_gop == NULL) {
 			kfree_skb(skb);
 			netbk_tx_err(vif, &txreq, idx);
 			continue;
 		}
+		gtxvif += request_gop - gop;
 		gop = request_gop;
 
 		vif->tx.req_cons = idx;
@@ -1467,8 +1632,9 @@ static void xen_netbk_tx_action(struct xen_netbk *netbk)
 
 	if (nr_gops == 0)
 		return;
-	ret = HYPERVISOR_grant_table_op(GNTTABOP_copy,
-					netbk->tx_copy_ops, nr_gops);
+	ret = grant_memory_copy_op(GNTTABOP_copy,
+				   netbk->tx_copy_ops, nr_gops,
+				   netbk, true);
 	BUG_ON(ret);
 
 	xen_netbk_tx_submit(netbk);
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 410018c..938e908 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -114,6 +114,13 @@ static int netback_probe(struct xenbus_device *dev,
 			goto abort_transaction;
 		}
 
+		err = xenbus_printf(xbt, dev->nodename,
+				    "feature-persistent-grants", "%u", 1);
+		if (err) {
+			message = "writing feature-persistent-grants";
+			goto abort_transaction;
+		}
+
 		err = xenbus_transaction_end(xbt, 0);
 	} while (err == -EAGAIN);
 
@@ -453,7 +460,12 @@ static int connect_rings(struct backend_info *be)
 		val = 0;
 	vif->csum = !val;
 
-	/* Map the shared frame, irq etc. */
+	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent-grants",
+			 "%u", &val) < 0)
+		val = 0;
+	vif->persistent_grant = !!val;
+
+/* Map the shared frame, irq etc. */
 	err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn);
 	if (err) {
 		xenbus_dev_fatal(dev, err,
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 2/4] xen/netback: Split one page pool into two(tx/rx) page pool.
  2012-11-15  7:03 [PATCH 0/4] Implement persistent grant in xen-netfront/netback Annie Li
  2012-11-15  7:04 ` [PATCH 1/4] xen/netback: implements persistent grant with one page pool Annie Li
@ 2012-11-15  7:04 ` Annie Li
  2012-11-15  9:15   ` Ian Campbell
  2012-11-15  7:05 ` [PATCH 3/4] Xen/netfront: Implement persistent grant in netfront Annie Li
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 42+ messages in thread
From: Annie Li @ 2012-11-15  7:04 UTC (permalink / raw)
  To: xen-devel, netdev, konrad.wilk, Ian.Campbell; +Cc: annie.li

For tx path, this implementation simplifies the work of searching out
grant page from page pool based on grant reference.

Signed-off-by: Annie Li <annie.li@oracle.com>
---
 drivers/net/xen-netback/common.h    |   14 ++++++++++----
 drivers/net/xen-netback/interface.c |   12 ++++++++----
 drivers/net/xen-netback/netback.c   |   15 +++++++++------
 3 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index a85cac6..02c8573 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -47,8 +47,6 @@
 
 #define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
 #define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
-#define MAXIMUM_OUTSTANDING_BLOCK_REQS \
-			(XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE)
 
 struct xen_netbk;
 
@@ -111,8 +109,16 @@ struct xenvif {
 
 	wait_queue_head_t waiting_to_free;
 
-	struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];
-	unsigned int persistent_gntcnt;
+	struct persistent_entry *persistent_tx_gnt[XEN_NETIF_TX_RING_SIZE];
+
+	/*
+	 * 2*XEN_NETIF_RX_RING_SIZE is for the case of each head/fragment page
+	 * using 2 copy operations.
+	 */
+	struct persistent_entry *persistent_rx_gnt[2*XEN_NETIF_RX_RING_SIZE];
+
+	unsigned int persistent_tx_gntcnt;
+	unsigned int persistent_rx_gntcnt;
 };
 
 static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 226d159..ecbe116 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -300,7 +300,8 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 		return ERR_PTR(err);
 	}
 
-	vif->persistent_gntcnt = 0;
+	vif->persistent_tx_gntcnt = 0;
+	vif->persistent_rx_gntcnt = 0;
 
 	netdev_dbg(dev, "Successfully created xenvif\n");
 	return vif;
@@ -385,9 +386,12 @@ void xenvif_disconnect(struct xenvif *vif)
 	unregister_netdev(vif->dev);
 
 	xen_netbk_unmap_frontend_rings(vif);
-	if (vif->persistent_grant)
-		xenvif_free_grants(vif->persistent_gnt,
-				   vif->persistent_gntcnt);
+	if (vif->persistent_grant) {
+		xenvif_free_grants(vif->persistent_tx_gnt,
+				   vif->persistent_tx_gntcnt);
+		xenvif_free_grants(vif->persistent_rx_gnt,
+				   vif->persistent_rx_gntcnt);
+	}
 
 	free_netdev(vif->dev);
 }
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index a26d3fc..ec59c73 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -208,14 +208,17 @@ grant_memory_copy_op(unsigned int cmd, void *vuop, unsigned int count,
 
 	BUG_ON(cmd != GNTTABOP_copy);
 	for (i = 0; i < count; i++) {
-		if (tx_pool)
+		if (tx_pool) {
 			vif = netbk->gnttab_tx_vif[i];
-		else
+			gnt_count = &vif->persistent_tx_gntcnt;
+			gnt_total = XEN_NETIF_TX_RING_SIZE;
+			pers_entry = vif->persistent_tx_gnt;
+		} else {
 			vif = netbk->gnttab_rx_vif[i];
-
-		pers_entry = vif->persistent_gnt;
-		gnt_count = &vif->persistent_gntcnt;
-		gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS;
+			gnt_count = &vif->persistent_rx_gntcnt;
+			gnt_total = 2*XEN_NETIF_RX_RING_SIZE;
+			pers_entry = vif->persistent_rx_gnt;
+		}
 
 		if (vif->persistent_grant) {
 			void *saddr, *daddr;
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 3/4] Xen/netfront: Implement persistent grant in netfront.
  2012-11-15  7:03 [PATCH 0/4] Implement persistent grant in xen-netfront/netback Annie Li
  2012-11-15  7:04 ` [PATCH 1/4] xen/netback: implements persistent grant with one page pool Annie Li
  2012-11-15  7:04 ` [PATCH 2/4] xen/netback: Split one page pool into two(tx/rx) " Annie Li
@ 2012-11-15  7:05 ` Annie Li
  2012-11-15 10:52   ` [Xen-devel] " Roger Pau Monné
  2012-11-15  7:05 ` [PATCH 4/4] fix code indent issue in xen-netfront Annie Li
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 42+ messages in thread
From: Annie Li @ 2012-11-15  7:05 UTC (permalink / raw)
  To: xen-devel, netdev, konrad.wilk, Ian.Campbell; +Cc: annie.li

Tx/rx page pool are maintained. New grant is mapped and put into
pool, unmap only happens when releasing/removing device.

Signed-off-by: Annie Li <annie.li@oracle.com>
---
 drivers/net/xen-netfront.c |  372 +++++++++++++++++++++++++++++++++++++-------
 1 files changed, 315 insertions(+), 57 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 0ebbb19..17b81c0 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -79,6 +79,13 @@ struct netfront_stats {
 	struct u64_stats_sync	syncp;
 };
 
+struct gnt_list {
+	grant_ref_t		gref;
+	struct			page *gnt_pages;
+	void			*gnt_target;
+	struct			gnt_list *tail;
+};
+
 struct netfront_info {
 	struct list_head list;
 	struct net_device *netdev;
@@ -109,6 +116,10 @@ struct netfront_info {
 	grant_ref_t grant_tx_ref[NET_TX_RING_SIZE];
 	unsigned tx_skb_freelist;
 
+	struct gnt_list *tx_grant[NET_TX_RING_SIZE];
+	struct gnt_list *tx_gnt_list;
+	unsigned int tx_gnt_cnt;
+
 	spinlock_t   rx_lock ____cacheline_aligned_in_smp;
 	struct xen_netif_rx_front_ring rx;
 	int rx_ring_ref;
@@ -126,6 +137,10 @@ struct netfront_info {
 	grant_ref_t gref_rx_head;
 	grant_ref_t grant_rx_ref[NET_RX_RING_SIZE];
 
+	struct gnt_list *rx_grant[NET_RX_RING_SIZE];
+	struct gnt_list *rx_gnt_list;
+	unsigned int rx_gnt_cnt;
+
 	unsigned long rx_pfn_array[NET_RX_RING_SIZE];
 	struct multicall_entry rx_mcl[NET_RX_RING_SIZE+1];
 	struct mmu_update rx_mmu[NET_RX_RING_SIZE];
@@ -134,6 +149,7 @@ struct netfront_info {
 	struct netfront_stats __percpu *stats;
 
 	unsigned long rx_gso_checksum_fixup;
+	u8 persistent_gnt:1;
 };
 
 struct netfront_rx_info {
@@ -194,6 +210,16 @@ static grant_ref_t xennet_get_rx_ref(struct netfront_info *np,
 	return ref;
 }
 
+static struct gnt_list *xennet_get_rx_grant(struct netfront_info *np,
+					    RING_IDX ri)
+{
+	int i = xennet_rxidx(ri);
+	struct gnt_list *gntlist = np->rx_grant[i];
+	np->rx_grant[i] = NULL;
+
+	return gntlist;
+}
+
 #ifdef CONFIG_SYSFS
 static int xennet_sysfs_addif(struct net_device *netdev);
 static void xennet_sysfs_delif(struct net_device *netdev);
@@ -231,6 +257,68 @@ static void xennet_maybe_wake_tx(struct net_device *dev)
 		netif_wake_queue(dev);
 }
 
+static grant_ref_t xennet_alloc_rx_ref(struct net_device *dev,
+				       unsigned long mfn, void *vaddr,
+				       unsigned int id,
+				       grant_ref_t ref)
+{
+	struct netfront_info *np = netdev_priv(dev);
+	grant_ref_t gnt_ref;
+	struct gnt_list *gnt_list_entry;
+
+	if (np->persistent_gnt && np->rx_gnt_cnt) {
+		gnt_list_entry = np->rx_gnt_list;
+		np->rx_gnt_list = np->rx_gnt_list->tail;
+		np->rx_gnt_cnt--;
+
+		gnt_list_entry->gnt_target = vaddr;
+		gnt_ref = gnt_list_entry->gref;
+		np->rx_grant[id] = gnt_list_entry;
+	} else {
+		struct page *page;
+
+		BUG_ON(!np->persistent_gnt && np->rx_gnt_cnt);
+		if (!ref)
+			gnt_ref =
+				gnttab_claim_grant_reference(&np->gref_rx_head);
+		else
+			gnt_ref = ref;
+		BUG_ON((signed short)gnt_ref < 0);
+
+		if (np->persistent_gnt) {
+			page = alloc_page(GFP_KERNEL);
+			if (!page) {
+				if (!ref)
+					gnttab_release_grant_reference(
+							&np->gref_rx_head, ref);
+				return -ENOMEM;
+			}
+			mfn = pfn_to_mfn(page_to_pfn(page));
+
+			gnt_list_entry = kmalloc(sizeof(struct gnt_list),
+						 GFP_KERNEL);
+			if (!gnt_list_entry) {
+				__free_page(page);
+				if (!ref)
+					gnttab_release_grant_reference(
+							&np->gref_rx_head, ref);
+				return -ENOMEM;
+			}
+			gnt_list_entry->gref = gnt_ref;
+			gnt_list_entry->gnt_pages = page;
+			gnt_list_entry->gnt_target = vaddr;
+
+			np->rx_grant[id] = gnt_list_entry;
+		}
+
+		gnttab_grant_foreign_access_ref(gnt_ref, np->xbdev->otherend_id,
+						mfn, 0);
+	}
+	np->grant_rx_ref[id] = gnt_ref;
+
+	return gnt_ref;
+}
+
 static void xennet_alloc_rx_buffers(struct net_device *dev)
 {
 	unsigned short id;
@@ -240,8 +328,6 @@ static void xennet_alloc_rx_buffers(struct net_device *dev)
 	int i, batch_target, notify;
 	RING_IDX req_prod = np->rx.req_prod_pvt;
 	grant_ref_t ref;
-	unsigned long pfn;
-	void *vaddr;
 	struct xen_netif_rx_request *req;
 
 	if (unlikely(!netif_carrier_ok(dev)))
@@ -306,19 +392,16 @@ no_skb:
 		BUG_ON(np->rx_skbs[id]);
 		np->rx_skbs[id] = skb;
 
-		ref = gnttab_claim_grant_reference(&np->gref_rx_head);
-		BUG_ON((signed short)ref < 0);
-		np->grant_rx_ref[id] = ref;
+		page = skb_frag_page(&skb_shinfo(skb)->frags[0]);
 
-		pfn = page_to_pfn(skb_frag_page(&skb_shinfo(skb)->frags[0]));
-		vaddr = page_address(skb_frag_page(&skb_shinfo(skb)->frags[0]));
+		ref = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)),
+					  page_address(page), id, 0);
+		if ((signed short)ref < 0) {
+			__skb_queue_tail(&np->rx_batch, skb);
+			break;
+		}
 
 		req = RING_GET_REQUEST(&np->rx, req_prod + i);
-		gnttab_grant_foreign_access_ref(ref,
-						np->xbdev->otherend_id,
-						pfn_to_mfn(pfn),
-						0);
-
 		req->id = id;
 		req->gref = ref;
 	}
@@ -375,17 +458,30 @@ static void xennet_tx_buf_gc(struct net_device *dev)
 
 			id  = txrsp->id;
 			skb = np->tx_skbs[id].skb;
-			if (unlikely(gnttab_query_foreign_access(
-				np->grant_tx_ref[id]) != 0)) {
-				printk(KERN_ALERT "xennet_tx_buf_gc: warning "
-				       "-- grant still in use by backend "
-				       "domain.\n");
-				BUG();
+
+			if (np->persistent_gnt) {
+				struct gnt_list *gnt_list_entry;
+
+				gnt_list_entry = np->tx_grant[id];
+				BUG_ON(!gnt_list_entry);
+
+				gnt_list_entry->tail = np->tx_gnt_list;
+				np->tx_gnt_list = gnt_list_entry;
+				np->tx_gnt_cnt++;
+			} else {
+				if (unlikely(gnttab_query_foreign_access(
+					np->grant_tx_ref[id]) != 0)) {
+					printk(KERN_ALERT "xennet_tx_buf_gc: warning "
+					       "-- grant still in use by backend "
+					       "domain.\n");
+					BUG();
+				}
+
+				gnttab_end_foreign_access_ref(
+					np->grant_tx_ref[id], GNTMAP_readonly);
+				gnttab_release_grant_reference(
+				      &np->gref_tx_head, np->grant_tx_ref[id]);
 			}
-			gnttab_end_foreign_access_ref(
-				np->grant_tx_ref[id], GNTMAP_readonly);
-			gnttab_release_grant_reference(
-				&np->gref_tx_head, np->grant_tx_ref[id]);
 			np->grant_tx_ref[id] = GRANT_INVALID_REF;
 			add_id_to_freelist(&np->tx_skb_freelist, np->tx_skbs, id);
 			dev_kfree_skb_irq(skb);
@@ -409,6 +505,59 @@ static void xennet_tx_buf_gc(struct net_device *dev)
 	xennet_maybe_wake_tx(dev);
 }
 
+static grant_ref_t xennet_alloc_tx_ref(struct net_device *dev,
+				       unsigned long mfn,
+				       unsigned int id)
+{
+	struct netfront_info *np = netdev_priv(dev);
+	grant_ref_t ref;
+	struct page *granted_page;
+
+	if (np->persistent_gnt && np->tx_gnt_cnt) {
+		struct gnt_list *gnt_list_entry;
+
+		gnt_list_entry = np->tx_gnt_list;
+		np->tx_gnt_list = np->tx_gnt_list->tail;
+		np->tx_gnt_cnt--;
+
+		ref = gnt_list_entry->gref;
+		np->tx_grant[id] = gnt_list_entry;
+	} else {
+		struct gnt_list *gnt_list_entry;
+
+		BUG_ON(!np->persistent_gnt && np->tx_gnt_cnt);
+		ref = gnttab_claim_grant_reference(&np->gref_tx_head);
+		BUG_ON((signed short)ref < 0);
+
+		if (np->persistent_gnt) {
+			granted_page = alloc_page(GFP_KERNEL);
+			if (!granted_page) {
+				gnttab_release_grant_reference(
+							&np->gref_tx_head, ref);
+				return -ENOMEM;
+			}
+
+			mfn = pfn_to_mfn(page_to_pfn(granted_page));
+			gnt_list_entry = kmalloc(sizeof(struct gnt_list),
+						 GFP_KERNEL);
+			if (!gnt_list_entry) {
+				__free_page(granted_page);
+				gnttab_release_grant_reference(
+							&np->gref_tx_head, ref);
+				return -ENOMEM;
+			}
+
+			gnt_list_entry->gref = ref;
+			gnt_list_entry->gnt_pages = granted_page;
+			np->tx_grant[id] = gnt_list_entry;
+		}
+		gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id,
+						mfn, 0);
+	}
+
+	return ref;
+}
+
 static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev,
 			      struct xen_netif_tx_request *tx)
 {
@@ -421,6 +570,9 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev,
 	unsigned int len = skb_headlen(skb);
 	unsigned int id;
 	grant_ref_t ref;
+	struct gnt_list *gnt_list_entry;
+	void *out_addr;
+	void *in_addr;
 	int i;
 
 	/* While the header overlaps a page boundary (including being
@@ -436,17 +588,19 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev,
 		np->tx_skbs[id].skb = skb_get(skb);
 		tx = RING_GET_REQUEST(&np->tx, prod++);
 		tx->id = id;
-		ref = gnttab_claim_grant_reference(&np->gref_tx_head);
-		BUG_ON((signed short)ref < 0);
-
 		mfn = virt_to_mfn(data);
-		gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id,
-						mfn, GNTMAP_readonly);
-
+		ref = xennet_alloc_tx_ref(dev, mfn, id);
 		tx->gref = np->grant_tx_ref[id] = ref;
 		tx->offset = offset;
 		tx->size = len;
 		tx->flags = 0;
+		if (np->persistent_gnt) {
+			gnt_list_entry = np->tx_grant[id];
+			out_addr = page_address(gnt_list_entry->gnt_pages);
+			in_addr = (void *)((unsigned long)data
+							& ~(PAGE_SIZE-1));
+			memcpy(out_addr, in_addr, PAGE_SIZE);
+		}
 	}
 
 	/* Grant backend access to each skb fragment page. */
@@ -459,17 +613,19 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev,
 		np->tx_skbs[id].skb = skb_get(skb);
 		tx = RING_GET_REQUEST(&np->tx, prod++);
 		tx->id = id;
-		ref = gnttab_claim_grant_reference(&np->gref_tx_head);
-		BUG_ON((signed short)ref < 0);
-
 		mfn = pfn_to_mfn(page_to_pfn(skb_frag_page(frag)));
-		gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id,
-						mfn, GNTMAP_readonly);
-
+		ref = xennet_alloc_tx_ref(dev, mfn, id);
 		tx->gref = np->grant_tx_ref[id] = ref;
 		tx->offset = frag->page_offset;
 		tx->size = skb_frag_size(frag);
 		tx->flags = 0;
+		if (np->persistent_gnt) {
+			gnt_list_entry = np->tx_grant[id];
+			out_addr = page_address(gnt_list_entry->gnt_pages);
+			in_addr = (void *)((unsigned long)page_address(
+					skb_frag_page(frag)) & ~(PAGE_SIZE-1));
+			memcpy(out_addr, in_addr, PAGE_SIZE);
+		}
 	}
 
 	np->tx.req_prod_pvt = prod;
@@ -491,6 +647,9 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	unsigned int offset = offset_in_page(data);
 	unsigned int len = skb_headlen(skb);
 	unsigned long flags;
+	struct gnt_list *gnt_list_entry;
+	void *out_addr;
+	void *in_addr;
 
 	frags += DIV_ROUND_UP(offset + len, PAGE_SIZE);
 	if (unlikely(frags > MAX_SKB_FRAGS + 1)) {
@@ -517,16 +676,20 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	tx = RING_GET_REQUEST(&np->tx, i);
 
 	tx->id   = id;
-	ref = gnttab_claim_grant_reference(&np->gref_tx_head);
-	BUG_ON((signed short)ref < 0);
 	mfn = virt_to_mfn(data);
-	gnttab_grant_foreign_access_ref(
-		ref, np->xbdev->otherend_id, mfn, GNTMAP_readonly);
+	ref = xennet_alloc_tx_ref(dev, mfn, id);
 	tx->gref = np->grant_tx_ref[id] = ref;
 	tx->offset = offset;
 	tx->size = len;
 	extra = NULL;
 
+	if (np->persistent_gnt) {
+		gnt_list_entry = np->tx_grant[id];
+		out_addr = page_address(gnt_list_entry->gnt_pages);
+		in_addr = (void *)((unsigned long)data & ~(PAGE_SIZE-1));
+		memcpy(out_addr, in_addr, PAGE_SIZE);
+	}
+
 	tx->flags = 0;
 	if (skb->ip_summed == CHECKSUM_PARTIAL)
 		/* local packet? */
@@ -595,13 +758,17 @@ static int xennet_close(struct net_device *dev)
 }
 
 static void xennet_move_rx_slot(struct netfront_info *np, struct sk_buff *skb,
-				grant_ref_t ref)
+				grant_ref_t ref, RING_IDX cons)
 {
 	int new = xennet_rxidx(np->rx.req_prod_pvt);
 
 	BUG_ON(np->rx_skbs[new]);
 	np->rx_skbs[new] = skb;
 	np->grant_rx_ref[new] = ref;
+
+	if (np->persistent_gnt)
+		np->rx_grant[new] = xennet_get_rx_grant(np, cons);
+
 	RING_GET_REQUEST(&np->rx, np->rx.req_prod_pvt)->id = new;
 	RING_GET_REQUEST(&np->rx, np->rx.req_prod_pvt)->gref = ref;
 	np->rx.req_prod_pvt++;
@@ -644,7 +811,7 @@ static int xennet_get_extras(struct netfront_info *np,
 
 		skb = xennet_get_rx_skb(np, cons);
 		ref = xennet_get_rx_ref(np, cons);
-		xennet_move_rx_slot(np, skb, ref);
+		xennet_move_rx_slot(np, skb, ref, cons);
 	} while (extra->flags & XEN_NETIF_EXTRA_FLAG_MORE);
 
 	np->rx.rsp_cons = cons;
@@ -665,6 +832,12 @@ static int xennet_get_responses(struct netfront_info *np,
 	int frags = 1;
 	int err = 0;
 	unsigned long ret;
+	struct gnt_list *gnt_list_entry;
+
+	if (np->persistent_gnt) {
+		gnt_list_entry = xennet_get_rx_grant(np, cons);
+		BUG_ON(!gnt_list_entry);
+	}
 
 	if (rx->flags & XEN_NETRXF_extra_info) {
 		err = xennet_get_extras(np, extras, rp);
@@ -677,7 +850,7 @@ static int xennet_get_responses(struct netfront_info *np,
 			if (net_ratelimit())
 				dev_warn(dev, "rx->offset: %x, size: %u\n",
 					 rx->offset, rx->status);
-			xennet_move_rx_slot(np, skb, ref);
+			xennet_move_rx_slot(np, skb, ref, cons);
 			err = -EINVAL;
 			goto next;
 		}
@@ -695,11 +868,29 @@ static int xennet_get_responses(struct netfront_info *np,
 			goto next;
 		}
 
-		ret = gnttab_end_foreign_access_ref(ref, 0);
-		BUG_ON(!ret);
-
-		gnttab_release_grant_reference(&np->gref_rx_head, ref);
-
+		if (!np->persistent_gnt) {
+			ret = gnttab_end_foreign_access_ref(ref, 0);
+			BUG_ON(!ret);
+			gnttab_release_grant_reference(&np->gref_rx_head, ref);
+		} else {
+			struct page *grant_page;
+			void *grant_target;
+
+			grant_page = gnt_list_entry->gnt_pages;
+			grant_target = gnt_list_entry->gnt_target;
+			BUG_ON(grant_page == 0);
+			BUG_ON(grant_target == 0);
+
+			if (rx->status > 0)
+				memcpy(grant_target+rx->offset,
+				       page_address(grant_page)+rx->offset,
+				       rx->status); /* status encodes size */
+
+			gnt_list_entry->gref = ref;
+			gnt_list_entry->tail = np->rx_gnt_list;
+			np->rx_gnt_list = gnt_list_entry;
+			np->rx_gnt_cnt++;
+		}
 		__skb_queue_tail(list, skb);
 
 next:
@@ -716,6 +907,10 @@ next:
 		rx = RING_GET_RESPONSE(&np->rx, cons + frags);
 		skb = xennet_get_rx_skb(np, cons + frags);
 		ref = xennet_get_rx_ref(np, cons + frags);
+		if (np->persistent_gnt) {
+			gnt_list_entry = xennet_get_rx_grant(np, cons + frags);
+			BUG_ON(!gnt_list_entry);
+		}
 		frags++;
 	}
 
@@ -1090,16 +1285,32 @@ static void xennet_release_tx_bufs(struct netfront_info *np)
 	struct sk_buff *skb;
 	int i;
 
+	if (np->persistent_gnt) {
+		struct gnt_list *gnt_list_entry;
+
+		while (np->tx_gnt_list) {
+			gnt_list_entry = np->tx_gnt_list;
+			np->tx_gnt_list = np->tx_gnt_list->tail;
+			gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL);
+			gnttab_release_grant_reference(&np->gref_tx_head,
+						       gnt_list_entry->gref);
+
+			__free_page(gnt_list_entry->gnt_pages);
+			kfree(gnt_list_entry);
+		}
+	}
+
 	for (i = 0; i < NET_TX_RING_SIZE; i++) {
 		/* Skip over entries which are actually freelist references */
 		if (skb_entry_is_link(&np->tx_skbs[i]))
 			continue;
-
 		skb = np->tx_skbs[i].skb;
-		gnttab_end_foreign_access_ref(np->grant_tx_ref[i],
-					      GNTMAP_readonly);
-		gnttab_release_grant_reference(&np->gref_tx_head,
-					       np->grant_tx_ref[i]);
+		if (!np->persistent_gnt) {
+			gnttab_end_foreign_access_ref(np->grant_tx_ref[i],
+						      GNTMAP_readonly);
+			gnttab_release_grant_reference(&np->gref_tx_head,
+						       np->grant_tx_ref[i]);
+		}
 		np->grant_tx_ref[i] = GRANT_INVALID_REF;
 		add_id_to_freelist(&np->tx_skb_freelist, np->tx_skbs, i);
 		dev_kfree_skb_irq(skb);
@@ -1124,6 +1335,20 @@ static void xennet_release_rx_bufs(struct netfront_info *np)
 
 	spin_lock_bh(&np->rx_lock);
 
+	if (np->persistent_gnt) {
+		struct gnt_list *gnt_list_entry;
+
+		while (np->rx_gnt_list) {
+			gnt_list_entry = np->rx_gnt_list;
+			np->rx_gnt_list = np->rx_gnt_list->tail;
+			gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL);
+			gnttab_release_grant_reference(&np->gref_rx_head,
+							gnt_list_entry->gref);
+			__free_page(gnt_list_entry->gnt_pages);
+			kfree(gnt_list_entry);
+		}
+	}
+
 	for (id = 0; id < NET_RX_RING_SIZE; id++) {
 		ref = np->grant_rx_ref[id];
 		if (ref == GRANT_INVALID_REF) {
@@ -1132,8 +1357,10 @@ static void xennet_release_rx_bufs(struct netfront_info *np)
 		}
 
 		skb = np->rx_skbs[id];
-		mfn = gnttab_end_foreign_transfer_ref(ref);
-		gnttab_release_grant_reference(&np->gref_rx_head, ref);
+		if (!np->persistent_gnt) {
+			mfn = gnttab_end_foreign_transfer_ref(ref);
+			gnttab_release_grant_reference(&np->gref_rx_head, ref);
+		}
 		np->grant_rx_ref[id] = GRANT_INVALID_REF;
 
 		if (0 == mfn) {
@@ -1607,6 +1834,13 @@ again:
 		goto abort_transaction;
 	}
 
+	err = xenbus_printf(xbt, dev->nodename, "feature-persistent-grants",
+			    "%u", info->persistent_gnt);
+	if (err) {
+		message = "writing feature-persistent-grants";
+		xenbus_dev_fatal(dev, err, "%s", message);
+	}
+
 	err = xenbus_transaction_end(xbt, 0);
 	if (err) {
 		if (err == -EAGAIN)
@@ -1634,6 +1868,7 @@ static int xennet_connect(struct net_device *dev)
 	grant_ref_t ref;
 	struct xen_netif_rx_request *req;
 	unsigned int feature_rx_copy;
+	int ret, val;
 
 	err = xenbus_scanf(XBT_NIL, np->xbdev->otherend,
 			   "feature-rx-copy", "%u", &feature_rx_copy);
@@ -1646,6 +1881,13 @@ static int xennet_connect(struct net_device *dev)
 		return -ENODEV;
 	}
 
+	err = xenbus_scanf(XBT_NIL, np->xbdev->otherend,
+			   "feature-persistent-grants", "%u", &val);
+	if (err != 1)
+		val = 0;
+
+	np->persistent_gnt = !!val;
+
 	err = talk_to_netback(np->xbdev, np);
 	if (err)
 		return err;
@@ -1657,9 +1899,24 @@ static int xennet_connect(struct net_device *dev)
 	spin_lock_bh(&np->rx_lock);
 	spin_lock_irq(&np->tx_lock);
 
+	np->tx_gnt_cnt = 0;
+	np->rx_gnt_cnt = 0;
+
 	/* Step 1: Discard all pending TX packet fragments. */
 	xennet_release_tx_bufs(np);
 
+	if (np->persistent_gnt) {
+		struct gnt_list *gnt_list_entry;
+
+		while (np->rx_gnt_list) {
+			gnt_list_entry = np->rx_gnt_list;
+			np->rx_gnt_list = np->rx_gnt_list->tail;
+			gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL);
+			__free_page(gnt_list_entry->gnt_pages);
+			kfree(gnt_list_entry);
+		}
+	}
+
 	/* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */
 	for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE; i++) {
 		skb_frag_t *frag;
@@ -1673,10 +1930,11 @@ static int xennet_connect(struct net_device *dev)
 
 		frag = &skb_shinfo(skb)->frags[0];
 		page = skb_frag_page(frag);
-		gnttab_grant_foreign_access_ref(
-			ref, np->xbdev->otherend_id,
-			pfn_to_mfn(page_to_pfn(page)),
-			0);
+		ret = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)),
+					  page_address(page), requeue_idx, ref);
+		if ((signed short)ret < 0)
+			break;
+
 		req->gref = ref;
 		req->id   = requeue_idx;
 
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 4/4] fix code indent issue in xen-netfront.
  2012-11-15  7:03 [PATCH 0/4] Implement persistent grant in xen-netfront/netback Annie Li
                   ` (2 preceding siblings ...)
  2012-11-15  7:05 ` [PATCH 3/4] Xen/netfront: Implement persistent grant in netfront Annie Li
@ 2012-11-15  7:05 ` Annie Li
  2012-11-15  7:40 ` [PATCH 0/4] Implement persistent grant in xen-netfront/netback Pasi Kärkkäinen
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 42+ messages in thread
From: Annie Li @ 2012-11-15  7:05 UTC (permalink / raw)
  To: xen-devel, netdev, konrad.wilk, Ian.Campbell; +Cc: annie.li

Signed-off-by: Annie Li <annie.li@oracle.com>
---
 drivers/net/xen-netfront.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 17b81c0..66bb29f 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -202,7 +202,7 @@ static struct sk_buff *xennet_get_rx_skb(struct netfront_info *np,
 }
 
 static grant_ref_t xennet_get_rx_ref(struct netfront_info *np,
-					    RING_IDX ri)
+				     RING_IDX ri)
 {
 	int i = xennet_rxidx(ri);
 	grant_ref_t ref = np->grant_rx_ref[i];
@@ -1420,7 +1420,7 @@ static void xennet_uninit(struct net_device *dev)
 }
 
 static netdev_features_t xennet_fix_features(struct net_device *dev,
-	netdev_features_t features)
+					     netdev_features_t features)
 {
 	struct netfront_info *np = netdev_priv(dev);
 	int val;
@@ -1447,7 +1447,7 @@ static netdev_features_t xennet_fix_features(struct net_device *dev,
 }
 
 static int xennet_set_features(struct net_device *dev,
-	netdev_features_t features)
+			       netdev_features_t features)
 {
 	if (!(features & NETIF_F_SG) && dev->mtu > ETH_DATA_LEN) {
 		netdev_info(dev, "Reducing MTU because no SG offload");
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  7:03 [PATCH 0/4] Implement persistent grant in xen-netfront/netback Annie Li
                   ` (3 preceding siblings ...)
  2012-11-15  7:05 ` [PATCH 4/4] fix code indent issue in xen-netfront Annie Li
@ 2012-11-15  7:40 ` Pasi Kärkkäinen
  2012-11-15  8:38   ` [Xen-devel] " ANNIE LI
  2012-11-15  8:53 ` Ian Campbell
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 42+ messages in thread
From: Pasi Kärkkäinen @ 2012-11-15  7:40 UTC (permalink / raw)
  To: Annie Li; +Cc: netdev, xen-devel, Ian.Campbell, konrad.wilk

Hello,

On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> This patch implements persistent grants for xen-netfront/netback. This
> mechanism maintains page pools in netback/netfront, these page pools is used to
> save grant pages which are mapped. This way improve performance which is wasted
> when doing grant operations.
> 
> Current netback/netfront does map/unmap grant operations frequently when
> transmitting/receiving packets, and grant operations costs much cpu clock. In
> this patch, netfront/netback maps grant pages when needed and then saves them
> into a page pool for future use. All these pages will be unmapped when
> removing/releasing the net device.
> 

Do you have performance numbers available already? with/without persistent grants? 


> In netfront, two pools are maintained for transmitting and receiving packets.
> When new grant pages are needed, the driver gets grant pages from this pool
> first. If no free grant page exists, it allocates new page, maps it and then
> saves it into the pool. The pool size for transmit/receive is exactly tx/rx
> ring size. The driver uses memcpy(not grantcopy) to copy data grant pages.
> Here, memcpy is copying the whole page size data. I tried to copy len size data
> from offset, but network does not seem work well. I am trying to find the root
> cause now.
> 
> In netback, it also maintains two page pools for tx/rx. When netback gets a
> request, it does a search first to find out whether the grant reference of
> this request is already mapped into its page pool. If the grant ref is mapped,
> the address of this mapped page is gotten and memcpy is used to copy data
> between grant pages. However, if the grant ref is not mapped, a new page is
> allocated, mapped with this grant ref, and then saved into page pool for
> future use. Similarly, memcpy replaces grant copy to copy data between grant
> pages. In this implementation, two arrays(gnttab_tx_vif,gnttab_rx_vif) are
> used to save vif pointer for every request because current netback is not
> per-vif based. This would be changed after implementing 1:1 model in netback.
> 

Btw is xen-netback/xen-netfront multiqueue support something you're planning to implement aswell? 
multiqueue allows single vif scaling to multiple vcpus/cores.


Thanks,

-- Pasi


> This patch supports both persistent-grant and non persistent grant. A new
> xenstore key "feature-persistent-grants" is used to represent this feature.
> 
> This patch is based on linux3.4-rc3. I hit netperf/netserver failure on
> linux latest version v3.7-rc1, v3.7-rc2 and v3.7-rc4. Not sure whether this
> netperf/netserver failure connects compound page commit in v3.7-rc1, but I did
> hit BUG_ON with debug patch from thread
> http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.html
> 
> 
> Annie Li (4):
>   xen/netback: implements persistent grant with one page pool.
>   xen/netback: Split one page pool into two(tx/rx) page pool.
>   Xen/netfront: Implement persistent grant in netfront.
>   fix code indent issue in xen-netfront.
> 
>  drivers/net/xen-netback/common.h    |   24 ++-
>  drivers/net/xen-netback/interface.c |   26 +++
>  drivers/net/xen-netback/netback.c   |  215 ++++++++++++++++++--
>  drivers/net/xen-netback/xenbus.c    |   14 ++-
>  drivers/net/xen-netfront.c          |  378 +++++++++++++++++++++++++++++------
>  5 files changed, 570 insertions(+), 87 deletions(-)
> 
> -- 
> 1.7.3.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  7:40 ` [PATCH 0/4] Implement persistent grant in xen-netfront/netback Pasi Kärkkäinen
@ 2012-11-15  8:38   ` ANNIE LI
  2012-11-15  8:51     ` Ian Campbell
                       ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: ANNIE LI @ 2012-11-15  8:38 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: xen-devel, netdev, konrad.wilk, Ian.Campbell



On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> Hello,
>
> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
>> This patch implements persistent grants for xen-netfront/netback. This
>> mechanism maintains page pools in netback/netfront, these page pools is used to
>> save grant pages which are mapped. This way improve performance which is wasted
>> when doing grant operations.
>>
>> Current netback/netfront does map/unmap grant operations frequently when
>> transmitting/receiving packets, and grant operations costs much cpu clock. In
>> this patch, netfront/netback maps grant pages when needed and then saves them
>> into a page pool for future use. All these pages will be unmapped when
>> removing/releasing the net device.
>>
> Do you have performance numbers available already? with/without persistent grants?
I have some simple netperf/netserver test result with/without persistent 
grants,

Following is result of with persistent grant patch,

Guests, Sum,      Avg,     Min,     Max
  1,  15106.4,  15106.4, 15106.36, 15106.36
  2,  13052.7,  6526.34,  6261.81,  6790.86
  3,  12675.1,  6337.53,  6220.24,  6454.83
  4,  13194,  6596.98,  6274.70,  6919.25


Following are result of without persistent patch

Guests, Sum,     Avg,    Min,        Max
  1,  10864.1,  10864.1, 10864.10, 10864.10
  2,  10898.5,  5449.24,  4862.08,  6036.40
  3,  10734.5,  5367.26,  5261.43,  5473.08
  4,  10924,    5461.99,  5314.84,  5609.14

>> In netfront, two pools are maintained for transmitting and receiving packets.
>> When new grant pages are needed, the driver gets grant pages from this pool
>> first. If no free grant page exists, it allocates new page, maps it and then
>> saves it into the pool. The pool size for transmit/receive is exactly tx/rx
>> ring size. The driver uses memcpy(not grantcopy) to copy data grant pages.
>> Here, memcpy is copying the whole page size data. I tried to copy len size data
>> from offset, but network does not seem work well. I am trying to find the root
>> cause now.
>>
>> In netback, it also maintains two page pools for tx/rx. When netback gets a
>> request, it does a search first to find out whether the grant reference of
>> this request is already mapped into its page pool. If the grant ref is mapped,
>> the address of this mapped page is gotten and memcpy is used to copy data
>> between grant pages. However, if the grant ref is not mapped, a new page is
>> allocated, mapped with this grant ref, and then saved into page pool for
>> future use. Similarly, memcpy replaces grant copy to copy data between grant
>> pages. In this implementation, two arrays(gnttab_tx_vif,gnttab_rx_vif) are
>> used to save vif pointer for every request because current netback is not
>> per-vif based. This would be changed after implementing 1:1 model in netback.
>>
> Btw is xen-netback/xen-netfront multiqueue support something you're planning to implement aswell?
Currently, some patches exist for implementing 1:1 model in netback, but 
this should be different from what you mentioned, and they are not ready 
for upstream.
These patches make netback thread per VIF, and mainly implement some 
concepts from netchannel2, such as multipage rings, seperate tx and rx 
rings, seperate tx and rx event channels, etc.

Thanks
Annie

> multiqueue allows single vif scaling to multiple vcpus/cores.
>
>
> Thanks,
>
> -- Pasi
>
>
>> This patch supports both persistent-grant and non persistent grant. A new
>> xenstore key "feature-persistent-grants" is used to represent this feature.
>>
>> This patch is based on linux3.4-rc3. I hit netperf/netserver failure on
>> linux latest version v3.7-rc1, v3.7-rc2 and v3.7-rc4. Not sure whether this
>> netperf/netserver failure connects compound page commit in v3.7-rc1, but I did
>> hit BUG_ON with debug patch from thread
>> http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.html
>>
>>
>> Annie Li (4):
>>    xen/netback: implements persistent grant with one page pool.
>>    xen/netback: Split one page pool into two(tx/rx) page pool.
>>    Xen/netfront: Implement persistent grant in netfront.
>>    fix code indent issue in xen-netfront.
>>
>>   drivers/net/xen-netback/common.h    |   24 ++-
>>   drivers/net/xen-netback/interface.c |   26 +++
>>   drivers/net/xen-netback/netback.c   |  215 ++++++++++++++++++--
>>   drivers/net/xen-netback/xenbus.c    |   14 ++-
>>   drivers/net/xen-netfront.c          |  378 +++++++++++++++++++++++++++++------
>>   5 files changed, 570 insertions(+), 87 deletions(-)
>>
>> -- 
>> 1.7.3.4
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  8:38   ` [Xen-devel] " ANNIE LI
@ 2012-11-15  8:51     ` Ian Campbell
  2012-11-15  9:02       ` ANNIE LI
  2012-11-15  9:35     ` Wei Liu
  2012-11-15 10:56     ` Roger Pau Monné
  2 siblings, 1 reply; 42+ messages in thread
From: Ian Campbell @ 2012-11-15  8:51 UTC (permalink / raw)
  To: ANNIE LI; +Cc: Pasi Kärkkäinen, xen-devel, netdev, konrad.wilk

On Thu, 2012-11-15 at 08:38 +0000, ANNIE LI wrote:
> 
> On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> > Hello,
> >
> > On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> >> This patch implements persistent grants for xen-netfront/netback. This
> >> mechanism maintains page pools in netback/netfront, these page pools is used to
> >> save grant pages which are mapped. This way improve performance which is wasted
> >> when doing grant operations.
> >>
> >> Current netback/netfront does map/unmap grant operations frequently when
> >> transmitting/receiving packets, and grant operations costs much cpu clock. In
> >> this patch, netfront/netback maps grant pages when needed and then saves them
> >> into a page pool for future use. All these pages will be unmapped when
> >> removing/releasing the net device.
> >>
> > Do you have performance numbers available already? with/without persistent grants?
> I have some simple netperf/netserver test result with/without persistent 
> grants,
> 
> Following is result of with persistent grant patch,

> Guests, Sum,      Avg,     Min,     Max
>   1,  15106.4,  15106.4, 15106.36, 15106.36
>   2,  13052.7,  6526.34,  6261.81,  6790.86
>   3,  12675.1,  6337.53,  6220.24,  6454.83
>   4,  13194,  6596.98,  6274.70,  6919.25

Are these pairs of guests or individual ones?

I think the really interesting cases are when you get up to larger
numbers of guests, aren't they? ISTR that for blkio things got most
interesting WRT persistent grants at the dozens of guests stage. Do you
have any numbers for those?

Have you run any tests other than netperf?

Do you have numbers for a a persistent capable backend with a
non-persistent frontend and vice versa?


> 
> 
> Following are result of without persistent patch
> 
> Guests, Sum,     Avg,    Min,        Max
>   1,  10864.1,  10864.1, 10864.10, 10864.10
>   2,  10898.5,  5449.24,  4862.08,  6036.40
>   3,  10734.5,  5367.26,  5261.43,  5473.08
>   4,  10924,    5461.99,  5314.84,  5609.14

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  7:03 [PATCH 0/4] Implement persistent grant in xen-netfront/netback Annie Li
                   ` (4 preceding siblings ...)
  2012-11-15  7:40 ` [PATCH 0/4] Implement persistent grant in xen-netfront/netback Pasi Kärkkäinen
@ 2012-11-15  8:53 ` Ian Campbell
  2012-11-15 11:14   ` ANNIE LI
  2012-11-15  8:56 ` Ian Campbell
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 42+ messages in thread
From: Ian Campbell @ 2012-11-15  8:53 UTC (permalink / raw)
  To: Annie Li; +Cc: xen-devel, netdev, konrad.wilk

On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
> 
> This patch is based on linux3.4-rc3. I hit netperf/netserver failure
> on linux latest version v3.7-rc1, v3.7-rc2 and v3.7-rc4. Not sure
> whether thisnetperf/netserver failure connects compound page commit in
> v3.7-rc1, but I did hit BUG_ON with debug patch from thread
> http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.html

Do you think you could cook up a netfront fix similar in principal to
the 6a8ed462f16b8455eec5ae00eb6014159a6721f0 fix for netback?

Ian.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  7:03 [PATCH 0/4] Implement persistent grant in xen-netfront/netback Annie Li
                   ` (5 preceding siblings ...)
  2012-11-15  8:53 ` Ian Campbell
@ 2012-11-15  8:56 ` Ian Campbell
  2012-11-15 11:14   ` [Xen-devel] " ANNIE LI
  2012-11-16  9:57 ` Ian Campbell
  2012-11-16 15:18 ` Konrad Rzeszutek Wilk
  8 siblings, 1 reply; 42+ messages in thread
From: Ian Campbell @ 2012-11-15  8:56 UTC (permalink / raw)
  To: Annie Li; +Cc: xen-devel, netdev, konrad.wilk

On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:

> 
> This patch implements persistent grants for xen-netfront/netback. This
> mechanism maintains page pools in netback/netfront, these page pools
> is used to save grant pages which are mapped. This way improve
> performance which is wasted when doing grant operations. 

Please can you send a patch against xen-unstable.hg to document this new
protocol variant in xen/include/public/io/netif.h. This header is a bit
under-documented but lets not let it fall further behind (if you want to
go further and document the existing features, in the style of the
current blkif.h, then that would be awesome!).

You may also want to provide a similar patch to Linux's copy which is in
linux/include/xen/interface/io/netif.h

Ian.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  8:51     ` Ian Campbell
@ 2012-11-15  9:02       ` ANNIE LI
  0 siblings, 0 replies; 42+ messages in thread
From: ANNIE LI @ 2012-11-15  9:02 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Pasi Kärkkäinen, xen-devel, netdev, konrad.wilk



On 2012-11-15 16:51, Ian Campbell wrote:
> On Thu, 2012-11-15 at 08:38 +0000, ANNIE LI wrote:
>> On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
>>> Hello,
>>>
>>> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
>>>> This patch implements persistent grants for xen-netfront/netback. This
>>>> mechanism maintains page pools in netback/netfront, these page pools is used to
>>>> save grant pages which are mapped. This way improve performance which is wasted
>>>> when doing grant operations.
>>>>
>>>> Current netback/netfront does map/unmap grant operations frequently when
>>>> transmitting/receiving packets, and grant operations costs much cpu clock. In
>>>> this patch, netfront/netback maps grant pages when needed and then saves them
>>>> into a page pool for future use. All these pages will be unmapped when
>>>> removing/releasing the net device.
>>>>
>>> Do you have performance numbers available already? with/without persistent grants?
>> I have some simple netperf/netserver test result with/without persistent
>> grants,
>>
>> Following is result of with persistent grant patch,
>> Guests, Sum,      Avg,     Min,     Max
>>    1,  15106.4,  15106.4, 15106.36, 15106.36
>>    2,  13052.7,  6526.34,  6261.81,  6790.86
>>    3,  12675.1,  6337.53,  6220.24,  6454.83
>>    4,  13194,  6596.98,  6274.70,  6919.25
> Are these pairs of guests or individual ones?

They are pairs of guests.

>
> I think the really interesting cases are when you get up to larger
> numbers of guests, aren't they?
Right.
> ISTR that for blkio things got most
> interesting WRT persistent grants at the dozens of guests stage. Do you
> have any numbers for those?
No, I will run more test with more gusets.
>
> Have you run any tests other than netperf?
No, I didn't.
>
> Do you have numbers for a a persistent capable backend with a
> non-persistent frontend and vice versa?
I did it, but the test only runs with 4 guests too,will test with more 
guests.

Thanks
Annie
>
>
>>
>> Following are result of without persistent patch
>>
>> Guests, Sum,     Avg,    Min,        Max
>>    1,  10864.1,  10864.1, 10864.10, 10864.10
>>    2,  10898.5,  5449.24,  4862.08,  6036.40
>>    3,  10734.5,  5367.26,  5261.43,  5473.08
>>    4,  10924,    5461.99,  5314.84,  5609.14
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
  2012-11-15  7:04 ` [PATCH 1/4] xen/netback: implements persistent grant with one page pool Annie Li
@ 2012-11-15  9:10   ` Ian Campbell
  2012-11-16  2:18     ` [Xen-devel] " ANNIE LI
  2012-11-15  9:57   ` Roger Pau Monné
  1 sibling, 1 reply; 42+ messages in thread
From: Ian Campbell @ 2012-11-15  9:10 UTC (permalink / raw)
  To: Annie Li; +Cc: xen-devel, netdev, konrad.wilk

On Thu, 2012-11-15 at 07:04 +0000, Annie Li wrote:
> This patch implements persistent grant in netback driver. Tx and rx
> share the same page pool, this pool will be split into two parts
> in next patch.
> 
> Signed-off-by: Annie Li <annie.li@oracle.com>
> ---
>  drivers/net/xen-netback/common.h    |   18 +++-
>  drivers/net/xen-netback/interface.c |   22 ++++
>  drivers/net/xen-netback/netback.c   |  212 +++++++++++++++++++++++++++++++----
>  drivers/net/xen-netback/xenbus.c    |   14 ++-
>  4 files changed, 239 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index 94b79c3..a85cac6 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -45,8 +45,19 @@
>  #include <xen/grant_table.h>
>  #include <xen/xenbus.h>
> 
> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> +#define MAXIMUM_OUTSTANDING_BLOCK_REQS \

BLOCK?

> +                       (XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE)
> +
>  struct xen_netbk;
> 
> +struct persistent_entry {
> +       grant_ref_t forgranted;
> +       struct page *fpage;
> +       struct gnttab_map_grant_ref map;
> +};

Isn't this duplicating a bunch of infrastructure which is also in
blkback? Can we put it into some common helpers please?

> +
>  struct xenvif {
>         /* Unique identifier for this interface. */
>         domid_t          domid;
> @@ -75,6 +86,7 @@ struct xenvif {
> 
>         /* Internal feature information. */
>         u8 can_queue:1;     /* can queue packets for receiver? */
> +       u8 persistent_grant:1;
> 
>         /*
>          * Allow xenvif_start_xmit() to peek ahead in the rx request
> @@ -98,6 +110,9 @@ struct xenvif {
>         struct net_device *dev;
> 
>         wait_queue_head_t waiting_to_free;
> +
> +       struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];

What is the per-vif memory overhead of this array?

> +static struct persistent_entry*
> +get_per_gnt(struct persistent_entry **pers_entry,
> +           unsigned int count, grant_ref_t gref)
> +{
> +       int i;
> +
> +       for (i = 0; i < count; i++)
> +               if (gref == pers_entry[i]->forgranted)
> +                       return pers_entry[i];

Isn't this linear scan rather expensive? I think Roger implemented some
sort of hash lookup for blkback which I think is required here too (and
should be easy if you make that code common).

> +
> +       return NULL;
> +}
> +

> @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>                 gop->source.domid = vif->domid;
>                 gop->source.offset = txreq.offset;
> 
> -               gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +               if (!vif->persistent_grant)
> +                       gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +               else
> +                       gop->dest.u.gmfn = (unsigned long)page_address(page);

page_address doesn't return any sort of frame number, does it? This is
rather confusing...

> @@ -453,7 +460,12 @@ static int connect_rings(struct backend_info *be)
>                 val = 0;
>         vif->csum = !val;
> 
> -       /* Map the shared frame, irq etc. */
> +       if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent-grants",
> +                        "%u", &val) < 0)
> +               val = 0;
> +       vif->persistent_grant = !!val;
> +
> +/* Map the shared frame, irq etc. */

Please run the patches through checkpatch.pl

>         err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn);
>         if (err) {
>                 xenbus_dev_fatal(dev, err,
> --
> 1.7.3.4
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 2/4] xen/netback: Split one page pool into two(tx/rx) page pool.
  2012-11-15  7:04 ` [PATCH 2/4] xen/netback: Split one page pool into two(tx/rx) " Annie Li
@ 2012-11-15  9:15   ` Ian Campbell
  2012-11-16  3:10     ` ANNIE LI
  0 siblings, 1 reply; 42+ messages in thread
From: Ian Campbell @ 2012-11-15  9:15 UTC (permalink / raw)
  To: Annie Li; +Cc: xen-devel, netdev, konrad.wilk

On Thu, 2012-11-15 at 07:04 +0000, Annie Li wrote:
> For tx path, this implementation simplifies the work of searching out
> grant page from page pool based on grant reference.

It's still a linear search though, and it doesn't look much simpler to
me:
  	for (i = 0; i < count; i++) {
		if (tx_pool)
			vif = netbk->gnttab_tx_vif[i];
		else
			vif = netbk->gnttab_rx_vif[i];

		pers_entry = vif->persistent_gnt;
		gnt_count = &vif->persistent_gntcnt;
		gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS;
becomes:
	for (i = 0; i < count; i++) {
		if (tx_pool) {
			vif = netbk->gnttab_tx_vif[i];
			gnt_count = &vif->persistent_tx_gntcnt;
			gnt_total = XEN_NETIF_TX_RING_SIZE;
			pers_entry = vif->persistent_tx_gnt;
		} else {
			vif = netbk->gnttab_rx_vif[i];
			gnt_count = &vif->persistent_rx_gntcnt;
			gnt_total = 2*XEN_NETIF_RX_RING_SIZE;
			pers_entry = vif->persistent_rx_gnt;
		}

> @@ -111,8 +109,16 @@ struct xenvif {
>  
>  	wait_queue_head_t waiting_to_free;
>  
> -	struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];
> -	unsigned int persistent_gntcnt;
> +	struct persistent_entry *persistent_tx_gnt[XEN_NETIF_TX_RING_SIZE];
> +
> +	/*
> +	 * 2*XEN_NETIF_RX_RING_SIZE is for the case of each head/fragment page

Shouldn't that been incorporated into MAXIMUM_OUTSTANDING_BLOCK_REQS
(sic) too?

> +	 * using 2 copy operations.
> +	 */
> +	struct persistent_entry *persistent_rx_gnt[2*XEN_NETIF_RX_RING_SIZE];

What is the per-vif memory overhead after this change?

Ian.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  8:38   ` [Xen-devel] " ANNIE LI
  2012-11-15  8:51     ` Ian Campbell
@ 2012-11-15  9:35     ` Wei Liu
  2012-11-15 11:12       ` [Xen-devel] " ANNIE LI
  2012-11-16 15:34       ` Konrad Rzeszutek Wilk
  2012-11-15 10:56     ` Roger Pau Monné
  2 siblings, 2 replies; 42+ messages in thread
From: Wei Liu @ 2012-11-15  9:35 UTC (permalink / raw)
  To: ANNIE LI; +Cc: netdev, xen-devel, Ian Campbell, Konrad Rzeszutek Wilk


[-- Attachment #1.1: Type: text/plain, Size: 1799 bytes --]

On Thu, Nov 15, 2012 at 4:38 PM, ANNIE LI <annie.li@oracle.com> wrote:

>
>
> On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
>
>> Hello,
>>
>> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
>>
>>> This patch implements persistent grants for xen-netfront/netback. This
>>> mechanism maintains page pools in netback/netfront, these page pools is
>>> used to
>>> save grant pages which are mapped. This way improve performance which is
>>> wasted
>>> when doing grant operations.
>>>
>>> Current netback/netfront does map/unmap grant operations frequently when
>>> transmitting/receiving packets, and grant operations costs much cpu
>>> clock. In
>>> this patch, netfront/netback maps grant pages when needed and then saves
>>> them
>>> into a page pool for future use. All these pages will be unmapped when
>>> removing/releasing the net device.
>>>
>>>  Do you have performance numbers available already? with/without
>> persistent grants?
>>
> I have some simple netperf/netserver test result with/without persistent
> grants,
>
> Following is result of with persistent grant patch,
>
> Guests, Sum,      Avg,     Min,     Max
>  1,  15106.4,  15106.4, 15106.36, 15106.36
>  2,  13052.7,  6526.34,  6261.81,  6790.86
>  3,  12675.1,  6337.53,  6220.24,  6454.83
>  4,  13194,  6596.98,  6274.70,  6919.25
>
>
> Following are result of without persistent patch
>
> Guests, Sum,     Avg,    Min,        Max
>  1,  10864.1,  10864.1, 10864.10, 10864.10
>  2,  10898.5,  5449.24,  4862.08,  6036.40
>  3,  10734.5,  5367.26,  5261.43,  5473.08
>  4,  10924,    5461.99,  5314.84,  5609.14
>
>
>
Interesting results. Have you tested how good it is on a 10G nic, i.e.
guest sending packets
through physical network to another host.


Wei.

[-- Attachment #1.2: Type: text/html, Size: 2488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
  2012-11-15  7:04 ` [PATCH 1/4] xen/netback: implements persistent grant with one page pool Annie Li
  2012-11-15  9:10   ` Ian Campbell
@ 2012-11-15  9:57   ` Roger Pau Monné
  2012-11-16  2:49     ` ANNIE LI
  1 sibling, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2012-11-15  9:57 UTC (permalink / raw)
  To: Annie Li; +Cc: xen-devel, netdev, konrad.wilk, Ian Campbell

On 15/11/12 08:04, Annie Li wrote:
> This patch implements persistent grant in netback driver. Tx and rx
> share the same page pool, this pool will be split into two parts
> in next patch.
> 
> Signed-off-by: Annie Li <annie.li@oracle.com>
> ---
>  drivers/net/xen-netback/common.h    |   18 +++-
>  drivers/net/xen-netback/interface.c |   22 ++++
>  drivers/net/xen-netback/netback.c   |  212 +++++++++++++++++++++++++++++++----
>  drivers/net/xen-netback/xenbus.c    |   14 ++-
>  4 files changed, 239 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index 94b79c3..a85cac6 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -45,8 +45,19 @@
>  #include <xen/grant_table.h>
>  #include <xen/xenbus.h>
> 
> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> +#define MAXIMUM_OUTSTANDING_BLOCK_REQS \
> +                       (XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE)
> +
>  struct xen_netbk;
> 
> +struct persistent_entry {
> +       grant_ref_t forgranted;
> +       struct page *fpage;
> +       struct gnttab_map_grant_ref map;
> +};

This should be common with the blkback implementation, I think we should
move some structures/functions from blkback to a common place. When I
implementated some functions in blkback I though they could be reused by
other backends that wanted to use persistent grants, so I keep them free
of blkback specific structures.

>  struct xenvif {
>         /* Unique identifier for this interface. */
>         domid_t          domid;
> @@ -75,6 +86,7 @@ struct xenvif {
> 
>         /* Internal feature information. */
>         u8 can_queue:1;     /* can queue packets for receiver? */
> +       u8 persistent_grant:1;
> 
>         /*
>          * Allow xenvif_start_xmit() to peek ahead in the rx request
> @@ -98,6 +110,9 @@ struct xenvif {
>         struct net_device *dev;
> 
>         wait_queue_head_t waiting_to_free;
> +
> +       struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];
> +       unsigned int persistent_gntcnt;

This should be a red-black tree, which has the property of a search time
<= O(log n), using an array is more expensive in terms of memory and has
a worse search time O(n), this is specially interesting for netback,
which can have twice as much persistent grants as blkback (because two
rings are used).

Take a look at the following functions from blkback; foreach_grant,
add_persistent_gnt and get_persistent_gnt. They are generic functions to
deal with persistent grants.

>  };
> 
>  static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
> @@ -105,9 +120,6 @@ static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
>         return to_xenbus_device(vif->dev->dev.parent);
>  }
> 
> -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> -
>  struct xenvif *xenvif_alloc(struct device *parent,
>                             domid_t domid,
>                             unsigned int handle);
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index b7d41f8..226d159 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -300,6 +300,8 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
>                 return ERR_PTR(err);
>         }
> 
> +       vif->persistent_gntcnt = 0;
> +
>         netdev_dbg(dev, "Successfully created xenvif\n");
>         return vif;
>  }
> @@ -343,6 +345,23 @@ err:
>         return err;
>  }
> 
> +void xenvif_free_grants(struct persistent_entry **pers_entry,
> +                       unsigned int count)
> +{
> +       int i, ret;
> +       struct gnttab_unmap_grant_ref unmap;
> +
> +       for (i = 0; i < count; i++) {
> +               gnttab_set_unmap_op(&unmap,
> +                       (unsigned long)page_to_pfn(pers_entry[i]->fpage),
> +                       GNTMAP_host_map,
> +                       pers_entry[i]->map.handle);
> +               ret = gnttab_unmap_refs(&unmap, &pers_entry[i]->fpage,
> +                                       1, false);

This is not correct, you should call gnttab_set_unmap_op on a batch of
grants (up to BLKIF_MAX_SEGMENTS_PER_REQUEST), and then call
gnttab_unmap_refs on all of them. Here is a simple example (take a look
at blkback.c function xen_blkif_schedule to see an example with a
red-black tree, I think this part of the code should also be made common):

for (i = 0, segs_to_unmap = 0; i < count; i++) {
	gnttab_set_unmap_op(&unmap[segs_to_unmap],
		(unsigned long)page_to_pfn(pers_entry[i]->fpage),
		GNTMAP_host_map,
		pers_entry[i]->map.handle);
	pages[segs_to_unmap] =
		(unsigned long)page_to_pfn(pers_entry[i]->fpage);
	if (++segs_to_unmap == BLKIF_MAX_SEGMENTS_PER_REQUEST ||
		(i + 1) == count) {
		ret = gnttab_unmap_refs(unmap, NULL, pages,
			segs_to_unmap);
		BUG_ON(ret);
		segs_to_unmap == 0;
	}
}

> +               BUG_ON(ret);
> +       }
> +}
> +
>  void xenvif_disconnect(struct xenvif *vif)
>  {
>         struct net_device *dev = vif->dev;
> @@ -366,6 +385,9 @@ void xenvif_disconnect(struct xenvif *vif)
>         unregister_netdev(vif->dev);
> 
>         xen_netbk_unmap_frontend_rings(vif);
> +       if (vif->persistent_grant)
> +               xenvif_free_grants(vif->persistent_gnt,
> +                                  vif->persistent_gntcnt);
> 
>         free_netdev(vif->dev);
>  }
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 2596401..a26d3fc 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -80,6 +80,8 @@ union page_ext {
>         void *mapping;
>  };
> 
> +struct xenvif;
> +
>  struct xen_netbk {
>         wait_queue_head_t wq;
>         struct task_struct *task;
> @@ -102,6 +104,7 @@ struct xen_netbk {
> 
>         struct pending_tx_info pending_tx_info[MAX_PENDING_REQS];
>         struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS];
> +       struct xenvif *gnttab_tx_vif[MAX_PENDING_REQS];
> 
>         u16 pending_ring[MAX_PENDING_REQS];
> 
> @@ -111,12 +114,139 @@ struct xen_netbk {
>          * straddles two buffers in the frontend.
>          */
>         struct gnttab_copy grant_copy_op[2*XEN_NETIF_RX_RING_SIZE];
> +       struct xenvif *gnttab_rx_vif[2*XEN_NETIF_RX_RING_SIZE];
>         struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE];
>  };
> 
>  static struct xen_netbk *xen_netbk;
>  static int xen_netbk_group_nr;
> 
> +static struct persistent_entry*
> +get_per_gnt(struct persistent_entry **pers_entry,
> +           unsigned int count, grant_ref_t gref)
> +{
> +       int i;
> +
> +       for (i = 0; i < count; i++)
> +               if (gref == pers_entry[i]->forgranted)
> +                       return pers_entry[i];
> +
> +       return NULL;
> +}

This should be replaced with common code shared with all persistent
backends implementations.

> +
> +static void*
> +map_new_gnt(struct persistent_entry **pers_entry, unsigned int count,
> +           grant_ref_t ref, domid_t domid)
> +{
> +       struct gnttab_map_grant_ref *map;
> +       struct page *page;
> +       unsigned long vaddr;
> +       unsigned long pfn;
> +       uint32_t flags;
> +       int ret = 0;
> +
> +       pers_entry[count] = (struct persistent_entry *)
> +                           kmalloc(sizeof(struct persistent_entry),
> +                                   GFP_KERNEL);
> +       if (!pers_entry[count])
> +               return ERR_PTR(-ENOMEM);
> +
> +       map = &pers_entry[count]->map;
> +       page = alloc_page(GFP_KERNEL);
> +       if (!page) {
> +               kfree(pers_entry[count]);
> +               pers_entry[count] = NULL;
> +               return ERR_PTR(-ENOMEM);
> +       }
> +
> +       pers_entry[count]->fpage = page;
> +       pfn = page_to_pfn(page);
> +       vaddr = (unsigned long)pfn_to_kaddr(pfn);
> +       flags = GNTMAP_host_map;
> +
> +       gnttab_set_map_op(map, vaddr, flags, ref, domid);
> +       ret = gnttab_map_refs(map, NULL, &page, 1);
> +       BUG_ON(ret);

This is highly inefficient, one of the points of using gnttab_set_map_op
is that you can queue a bunch of grants, and then map them at the same
time using gnttab_map_refs, but here you are using it to map a single
grant at a time. You should instead see how much grants you need to map
to complete the request and map them all at the same time.

> +
> +       pers_entry[count]->forgranted = ref;
> +
> +       return page_address(page);
> +}
> +
> +static void*
> +get_ref_page(struct persistent_entry **pers_entry, unsigned int *count,
> +            grant_ref_t ref, domid_t domid, unsigned int total)
> +{
> +       struct persistent_entry *per_gnt;
> +       void *vaddr;
> +
> +       per_gnt = get_per_gnt(pers_entry, *count, ref);
> +
> +       if (per_gnt != NULL)
> +               return page_address(per_gnt->fpage);
> +       else {
> +               BUG_ON(*count >= total);
> +               vaddr = map_new_gnt(pers_entry, *count, ref, domid);
> +               if (IS_ERR_OR_NULL(vaddr))
> +                       return vaddr;
> +               *count += 1;
> +               return vaddr;
> +       }
> +}
> +
> +static int
> +grant_memory_copy_op(unsigned int cmd, void *vuop, unsigned int count,
> +                    struct xen_netbk *netbk, bool tx_pool)
> +{
> +       int i;
> +       struct xenvif *vif;
> +       struct gnttab_copy *uop = vuop;
> +       unsigned int *gnt_count;
> +       unsigned int gnt_total;
> +       struct persistent_entry **pers_entry;
> +       int ret = 0;
> +
> +       BUG_ON(cmd != GNTTABOP_copy);
> +       for (i = 0; i < count; i++) {
> +               if (tx_pool)
> +                       vif = netbk->gnttab_tx_vif[i];
> +               else
> +                       vif = netbk->gnttab_rx_vif[i];
> +
> +               pers_entry = vif->persistent_gnt;
> +               gnt_count = &vif->persistent_gntcnt;
> +               gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS;
> +
> +               if (vif->persistent_grant) {
> +                       void *saddr, *daddr;
> +
> +                       saddr = uop[i].source.domid == DOMID_SELF ?
> +                               (void *) uop[i].source.u.gmfn :
> +                               get_ref_page(pers_entry, gnt_count,
> +                                            uop[i].source.u.ref,
> +                                            uop[i].source.domid,
> +                                            gnt_total);
> +                       if (IS_ERR_OR_NULL(saddr))
> +                               return -ENOMEM;
> +
> +                       daddr = uop[i].dest.domid == DOMID_SELF ?
> +                               (void *) uop[i].dest.u.gmfn :
> +                               get_ref_page(pers_entry, gnt_count,
> +                                            uop[i].dest.u.ref,
> +                                            uop[i].dest.domid,
> +                                            gnt_total);
> +                       if (IS_ERR_OR_NULL(daddr))
> +                               return -ENOMEM;
> +
> +                       memcpy(daddr+uop[i].dest.offset,
> +                              saddr+uop[i].source.offset, uop[i].len);
> +               } else
> +                       ret = HYPERVISOR_grant_table_op(cmd, &uop[i], 1);
> +       }
> +
> +       return ret;
> +}
> +
>  void xen_netbk_add_xenvif(struct xenvif *vif)
>  {
>         int i;
> @@ -387,13 +517,15 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
>   * Set up the grant operations for this fragment. If it's a flipping
>   * interface, we also set up the unmap request from here.
>   */
> -static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
> -                               struct netrx_pending_operations *npo,
> -                               struct page *page, unsigned long size,
> -                               unsigned long offset, int *head)
> +static int netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
> +                              struct netrx_pending_operations *npo,
> +                              struct page *page, unsigned long size,
> +                              unsigned long offset, int *head,
> +                              struct xenvif **grxvif)
>  {
>         struct gnttab_copy *copy_gop;
>         struct netbk_rx_meta *meta;
> +       int count = 0;
>         /*
>          * These variables are used iff get_page_ext returns true,
>          * in which case they are guaranteed to be initialized.
> @@ -425,6 +557,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>                         bytes = MAX_BUFFER_OFFSET - npo->copy_off;
> 
>                 copy_gop = npo->copy + npo->copy_prod++;
> +               *grxvif = vif;
> +               grxvif++;
> +               count++;
> +
>                 copy_gop->flags = GNTCOPY_dest_gref;
>                 if (foreign) {
>                         struct xen_netbk *netbk = &xen_netbk[group];
> @@ -438,7 +574,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>                 } else {
>                         void *vaddr = page_address(page);
>                         copy_gop->source.domid = DOMID_SELF;
> -                       copy_gop->source.u.gmfn = virt_to_mfn(vaddr);
> +                       if (!vif->persistent_grant)
> +                               copy_gop->source.u.gmfn = virt_to_mfn(vaddr);
> +                       else
> +                               copy_gop->source.u.gmfn = (unsigned long)vaddr;
>                 }
>                 copy_gop->source.offset = offset;
>                 copy_gop->dest.domid = vif->domid;
> @@ -460,6 +599,7 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>                 *head = 0; /* There must be something in this buffer now. */
> 
>         }
> +       return count;
>  }
> 
>  /*
> @@ -474,8 +614,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>   * zero GSO descriptors (for non-GSO packets) or one descriptor (for
>   * frontend-side LRO).
>   */
> -static int netbk_gop_skb(struct sk_buff *skb,
> -                        struct netrx_pending_operations *npo)
> +static int netbk_gop_skb(struct xen_netbk *netbk,
> +                        struct sk_buff *skb,
> +                        struct netrx_pending_operations *npo,
> +                        struct xenvif **grxvif)
>  {
>         struct xenvif *vif = netdev_priv(skb->dev);
>         int nr_frags = skb_shinfo(skb)->nr_frags;
> @@ -518,17 +660,19 @@ static int netbk_gop_skb(struct sk_buff *skb,
>                 if (data + len > skb_tail_pointer(skb))
>                         len = skb_tail_pointer(skb) - data;
> 
> -               netbk_gop_frag_copy(vif, skb, npo,
> -                                   virt_to_page(data), len, offset, &head);
> +               grxvif += netbk_gop_frag_copy(vif, skb, npo,
> +                                             virt_to_page(data), len,
> +                                             offset, &head, grxvif);
> +
>                 data += len;
>         }
> 
>         for (i = 0; i < nr_frags; i++) {
> -               netbk_gop_frag_copy(vif, skb, npo,
> -                                   skb_frag_page(&skb_shinfo(skb)->frags[i]),
> -                                   skb_frag_size(&skb_shinfo(skb)->frags[i]),
> -                                   skb_shinfo(skb)->frags[i].page_offset,
> -                                   &head);
> +               grxvif += netbk_gop_frag_copy(vif, skb, npo,
> +                               skb_frag_page(&skb_shinfo(skb)->frags[i]),
> +                               skb_frag_size(&skb_shinfo(skb)->frags[i]),
> +                               skb_shinfo(skb)->frags[i].page_offset,
> +                               &head, grxvif);
>         }
> 
>         return npo->meta_prod - old_meta_prod;
> @@ -593,6 +737,8 @@ struct skb_cb_overlay {
>  static void xen_netbk_rx_action(struct xen_netbk *netbk)
>  {
>         struct xenvif *vif = NULL, *tmp;
> +       struct xenvif **grxvif = netbk->gnttab_rx_vif;
> +       int old_copy_prod = 0;
>         s8 status;
>         u16 irq, flags;
>         struct xen_netif_rx_response *resp;
> @@ -619,7 +765,9 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>                 nr_frags = skb_shinfo(skb)->nr_frags;
> 
>                 sco = (struct skb_cb_overlay *)skb->cb;
> -               sco->meta_slots_used = netbk_gop_skb(skb, &npo);
> +               sco->meta_slots_used = netbk_gop_skb(netbk, skb, &npo, grxvif);
> +               grxvif += npo.copy_prod - old_copy_prod;
> +               old_copy_prod = npo.copy_prod;
> 
>                 count += nr_frags + 1;
> 
> @@ -636,8 +784,10 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>                 return;
> 
>         BUG_ON(npo.copy_prod > ARRAY_SIZE(netbk->grant_copy_op));
> -       ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, &netbk->grant_copy_op,
> -                                       npo.copy_prod);
> +       ret = grant_memory_copy_op(GNTTABOP_copy,
> +                                  &netbk->grant_copy_op,
> +                                  npo.copy_prod, netbk,
> +                                  false);
>         BUG_ON(ret != 0);
> 
>         while ((skb = __skb_dequeue(&rxq)) != NULL) {
> @@ -918,7 +1068,8 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
>                                                   struct xenvif *vif,
>                                                   struct sk_buff *skb,
>                                                   struct xen_netif_tx_request *txp,
> -                                                 struct gnttab_copy *gop)
> +                                                 struct gnttab_copy *gop,
> +                                                 struct xenvif **gtxvif)
>  {
>         struct skb_shared_info *shinfo = skb_shinfo(skb);
>         skb_frag_t *frags = shinfo->frags;
> @@ -944,7 +1095,11 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
>                 gop->source.domid = vif->domid;
>                 gop->source.offset = txp->offset;
> 
> -               gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +               if (!vif->persistent_grant)
> +                       gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +               else
> +                       gop->dest.u.gmfn = (unsigned long)page_address(page);
> +
>                 gop->dest.domid = DOMID_SELF;
>                 gop->dest.offset = txp->offset;
> 
> @@ -952,6 +1107,9 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
>                 gop->flags = GNTCOPY_source_gref;
> 
>                 gop++;
> +               *gtxvif = vif;
> +               gtxvif++;
> +
> 
>                 memcpy(&pending_tx_info[pending_idx].req, txp, sizeof(*txp));
>                 xenvif_get(vif);
> @@ -1218,6 +1376,7 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size)
>  static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>  {
>         struct gnttab_copy *gop = netbk->tx_copy_ops, *request_gop;
> +       struct xenvif **gtxvif = netbk->gnttab_tx_vif;
>         struct sk_buff *skb;
>         int ret;
> 
> @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>                 gop->source.domid = vif->domid;
>                 gop->source.offset = txreq.offset;
> 
> -               gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +               if (!vif->persistent_grant)
> +                       gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +               else
> +                       gop->dest.u.gmfn = (unsigned long)page_address(page);
> +
>                 gop->dest.domid = DOMID_SELF;
>                 gop->dest.offset = txreq.offset;
> 
> @@ -1346,6 +1509,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>                 gop->flags = GNTCOPY_source_gref;
> 
>                 gop++;
> +               *gtxvif++ = vif;
> 
>                 memcpy(&netbk->pending_tx_info[pending_idx].req,
>                        &txreq, sizeof(txreq));
> @@ -1369,12 +1533,13 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>                 netbk->pending_cons++;
> 
>                 request_gop = xen_netbk_get_requests(netbk, vif,
> -                                                    skb, txfrags, gop);
> +                                                    skb, txfrags, gop, gtxvif);
>                 if (request_gop == NULL) {
>                         kfree_skb(skb);
>                         netbk_tx_err(vif, &txreq, idx);
>                         continue;
>                 }
> +               gtxvif += request_gop - gop;
>                 gop = request_gop;
> 
>                 vif->tx.req_cons = idx;
> @@ -1467,8 +1632,9 @@ static void xen_netbk_tx_action(struct xen_netbk *netbk)
> 
>         if (nr_gops == 0)
>                 return;
> -       ret = HYPERVISOR_grant_table_op(GNTTABOP_copy,
> -                                       netbk->tx_copy_ops, nr_gops);
> +       ret = grant_memory_copy_op(GNTTABOP_copy,
> +                                  netbk->tx_copy_ops, nr_gops,
> +                                  netbk, true);
>         BUG_ON(ret);
> 
>         xen_netbk_tx_submit(netbk);
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> index 410018c..938e908 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -114,6 +114,13 @@ static int netback_probe(struct xenbus_device *dev,
>                         goto abort_transaction;
>                 }
> 
> +               err = xenbus_printf(xbt, dev->nodename,
> +                                   "feature-persistent-grants", "%u", 1);
> +               if (err) {
> +                       message = "writing feature-persistent-grants";
> +                       goto abort_transaction;
> +               }
> +
>                 err = xenbus_transaction_end(xbt, 0);
>         } while (err == -EAGAIN);
> 
> @@ -453,7 +460,12 @@ static int connect_rings(struct backend_info *be)
>                 val = 0;
>         vif->csum = !val;
> 
> -       /* Map the shared frame, irq etc. */
> +       if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent-grants",
> +                        "%u", &val) < 0)

In block devices "feature-persistent" is used, so I think that for
clearness it should be announced the same way in net.

> +               val = 0;
> +       vif->persistent_grant = !!val;
> +
> +/* Map the shared frame, irq etc. */
>         err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn);
>         if (err) {
>                 xenbus_dev_fatal(dev, err,
> --
> 1.7.3.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 3/4] Xen/netfront: Implement persistent grant in netfront.
  2012-11-15  7:05 ` [PATCH 3/4] Xen/netfront: Implement persistent grant in netfront Annie Li
@ 2012-11-15 10:52   ` Roger Pau Monné
  2012-11-16  5:22     ` ANNIE LI
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2012-11-15 10:52 UTC (permalink / raw)
  To: Annie Li; +Cc: xen-devel, netdev, konrad.wilk, Ian Campbell

On 15/11/12 08:05, Annie Li wrote:
> Tx/rx page pool are maintained. New grant is mapped and put into
> pool, unmap only happens when releasing/removing device.
> 
> Signed-off-by: Annie Li <annie.li@oracle.com>
> ---
>  drivers/net/xen-netfront.c |  372 +++++++++++++++++++++++++++++++++++++-------
>  1 files changed, 315 insertions(+), 57 deletions(-)
> 
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 0ebbb19..17b81c0 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -79,6 +79,13 @@ struct netfront_stats {
>         struct u64_stats_sync   syncp;
>  };
> 
> +struct gnt_list {
> +       grant_ref_t             gref;
> +       struct                  page *gnt_pages;
> +       void                    *gnt_target;
> +       struct                  gnt_list *tail;
> +};

This could also be shared with blkfront.

> +
>  struct netfront_info {
>         struct list_head list;
>         struct net_device *netdev;
> @@ -109,6 +116,10 @@ struct netfront_info {
>         grant_ref_t grant_tx_ref[NET_TX_RING_SIZE];
>         unsigned tx_skb_freelist;
> 
> +       struct gnt_list *tx_grant[NET_TX_RING_SIZE];
> +       struct gnt_list *tx_gnt_list;
> +       unsigned int tx_gnt_cnt;

I don't understand this, why do you need both an array and a list? I'm
not familiar with net code, so I don't know if this is some kind of
special netfront thing?

Anyway if you have to use a list I would recommend using one of the list
constructions that's already in the kernel, it simplifies the code and
makes it more easy to understand than creating your own list structure.

> +
>         spinlock_t   rx_lock ____cacheline_aligned_in_smp;
>         struct xen_netif_rx_front_ring rx;
>         int rx_ring_ref;
> @@ -126,6 +137,10 @@ struct netfront_info {
>         grant_ref_t gref_rx_head;
>         grant_ref_t grant_rx_ref[NET_RX_RING_SIZE];
> 
> +       struct gnt_list *rx_grant[NET_RX_RING_SIZE];
> +       struct gnt_list *rx_gnt_list;
> +       unsigned int rx_gnt_cnt;

Same comment above here.

> +
>         unsigned long rx_pfn_array[NET_RX_RING_SIZE];
>         struct multicall_entry rx_mcl[NET_RX_RING_SIZE+1];
>         struct mmu_update rx_mmu[NET_RX_RING_SIZE];
> @@ -134,6 +149,7 @@ struct netfront_info {
>         struct netfront_stats __percpu *stats;
> 
>         unsigned long rx_gso_checksum_fixup;
> +       u8 persistent_gnt:1;
>  };
> 
>  struct netfront_rx_info {
> @@ -194,6 +210,16 @@ static grant_ref_t xennet_get_rx_ref(struct netfront_info *np,
>         return ref;
>  }
> 
> +static struct gnt_list *xennet_get_rx_grant(struct netfront_info *np,
> +                                           RING_IDX ri)
> +{
> +       int i = xennet_rxidx(ri);
> +       struct gnt_list *gntlist = np->rx_grant[i];
> +       np->rx_grant[i] = NULL;

Ok, I think I get why do you need both an array and a list, is that
because netfront doesn't have some kind of shadow ring to keep track of
issued requests?

So each issued request has an associated gnt_list with the list of used
grants? If so it would be good to add a comment about it.

> +       return gntlist;
> +}
> +
>  #ifdef CONFIG_SYSFS
>  static int xennet_sysfs_addif(struct net_device *netdev);
>  static void xennet_sysfs_delif(struct net_device *netdev);
> @@ -231,6 +257,68 @@ static void xennet_maybe_wake_tx(struct net_device *dev)
>                 netif_wake_queue(dev);
>  }
> 
> +static grant_ref_t xennet_alloc_rx_ref(struct net_device *dev,
> +                                      unsigned long mfn, void *vaddr,
> +                                      unsigned int id,
> +                                      grant_ref_t ref)
> +{
> +       struct netfront_info *np = netdev_priv(dev);
> +       grant_ref_t gnt_ref;
> +       struct gnt_list *gnt_list_entry;
> +
> +       if (np->persistent_gnt && np->rx_gnt_cnt) {
> +               gnt_list_entry = np->rx_gnt_list;
> +               np->rx_gnt_list = np->rx_gnt_list->tail;
> +               np->rx_gnt_cnt--;
> +
> +               gnt_list_entry->gnt_target = vaddr;
> +               gnt_ref = gnt_list_entry->gref;
> +               np->rx_grant[id] = gnt_list_entry;
> +       } else {
> +               struct page *page;
> +
> +               BUG_ON(!np->persistent_gnt && np->rx_gnt_cnt);
> +               if (!ref)
> +                       gnt_ref =
> +                               gnttab_claim_grant_reference(&np->gref_rx_head);
> +               else
> +                       gnt_ref = ref;
> +               BUG_ON((signed short)gnt_ref < 0);
> +
> +               if (np->persistent_gnt) {

So you are only using persistent grants if the backend also supports
them. Have you benchmarked the performance of a persistent frontend with
a non-persistent backend. In the block case, usign a persistent frontend
with a non-persistent backend let to an overall performance improvement,
so blkfront uses persistent grants even if blkback doesn't support them.
Take a look at the following graph:

http://xenbits.xen.org/people/royger/persistent_grants/nonpers_read.png

> +                       page = alloc_page(GFP_KERNEL);
> +                       if (!page) {
> +                               if (!ref)
> +                                       gnttab_release_grant_reference(
> +                                                       &np->gref_rx_head, ref);
> +                               return -ENOMEM;
> +                       }
> +                       mfn = pfn_to_mfn(page_to_pfn(page));
> +
> +                       gnt_list_entry = kmalloc(sizeof(struct gnt_list),
> +                                                GFP_KERNEL);
> +                       if (!gnt_list_entry) {
> +                               __free_page(page);
> +                               if (!ref)
> +                                       gnttab_release_grant_reference(
> +                                                       &np->gref_rx_head, ref);
> +                               return -ENOMEM;
> +                       }
> +                       gnt_list_entry->gref = gnt_ref;
> +                       gnt_list_entry->gnt_pages = page;
> +                       gnt_list_entry->gnt_target = vaddr;
> +
> +                       np->rx_grant[id] = gnt_list_entry;
> +               }
> +
> +               gnttab_grant_foreign_access_ref(gnt_ref, np->xbdev->otherend_id,
> +                                               mfn, 0);
> +       }
> +       np->grant_rx_ref[id] = gnt_ref;
> +
> +       return gnt_ref;
> +}
> +
>  static void xennet_alloc_rx_buffers(struct net_device *dev)
>  {
>         unsigned short id;
> @@ -240,8 +328,6 @@ static void xennet_alloc_rx_buffers(struct net_device *dev)
>         int i, batch_target, notify;
>         RING_IDX req_prod = np->rx.req_prod_pvt;
>         grant_ref_t ref;
> -       unsigned long pfn;
> -       void *vaddr;
>         struct xen_netif_rx_request *req;
> 
>         if (unlikely(!netif_carrier_ok(dev)))
> @@ -306,19 +392,16 @@ no_skb:
>                 BUG_ON(np->rx_skbs[id]);
>                 np->rx_skbs[id] = skb;
> 
> -               ref = gnttab_claim_grant_reference(&np->gref_rx_head);
> -               BUG_ON((signed short)ref < 0);
> -               np->grant_rx_ref[id] = ref;
> +               page = skb_frag_page(&skb_shinfo(skb)->frags[0]);
> 
> -               pfn = page_to_pfn(skb_frag_page(&skb_shinfo(skb)->frags[0]));
> -               vaddr = page_address(skb_frag_page(&skb_shinfo(skb)->frags[0]));
> +               ref = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)),
> +                                         page_address(page), id, 0);
> +               if ((signed short)ref < 0) {
> +                       __skb_queue_tail(&np->rx_batch, skb);
> +                       break;
> +               }
> 
>                 req = RING_GET_REQUEST(&np->rx, req_prod + i);
> -               gnttab_grant_foreign_access_ref(ref,
> -                                               np->xbdev->otherend_id,
> -                                               pfn_to_mfn(pfn),
> -                                               0);
> -
>                 req->id = id;
>                 req->gref = ref;
>         }
> @@ -375,17 +458,30 @@ static void xennet_tx_buf_gc(struct net_device *dev)
> 
>                         id  = txrsp->id;
>                         skb = np->tx_skbs[id].skb;
> -                       if (unlikely(gnttab_query_foreign_access(
> -                               np->grant_tx_ref[id]) != 0)) {
> -                               printk(KERN_ALERT "xennet_tx_buf_gc: warning "
> -                                      "-- grant still in use by backend "
> -                                      "domain.\n");
> -                               BUG();
> +
> +                       if (np->persistent_gnt) {
> +                               struct gnt_list *gnt_list_entry;
> +
> +                               gnt_list_entry = np->tx_grant[id];
> +                               BUG_ON(!gnt_list_entry);
> +
> +                               gnt_list_entry->tail = np->tx_gnt_list;
> +                               np->tx_gnt_list = gnt_list_entry;
> +                               np->tx_gnt_cnt++;
> +                       } else {
> +                               if (unlikely(gnttab_query_foreign_access(
> +                                       np->grant_tx_ref[id]) != 0)) {
> +                                       printk(KERN_ALERT "xennet_tx_buf_gc: warning "
> +                                              "-- grant still in use by backend "
> +                                              "domain.\n");
> +                                       BUG();
> +                               }
> +
> +                               gnttab_end_foreign_access_ref(
> +                                       np->grant_tx_ref[id], GNTMAP_readonly);

If I've read the code correctly, you are giving this frame both
read/write permissions to the other end on xennet_alloc_tx_ref, but then
you are only removing the read permissions? (see comment below on the
xennet_alloc_tx_ref function).

> +                               gnttab_release_grant_reference(
> +                                     &np->gref_tx_head, np->grant_tx_ref[id]);
>                         }
> -                       gnttab_end_foreign_access_ref(
> -                               np->grant_tx_ref[id], GNTMAP_readonly);
> -                       gnttab_release_grant_reference(
> -                               &np->gref_tx_head, np->grant_tx_ref[id]);
>                         np->grant_tx_ref[id] = GRANT_INVALID_REF;
>                         add_id_to_freelist(&np->tx_skb_freelist, np->tx_skbs, id);
>                         dev_kfree_skb_irq(skb);
> @@ -409,6 +505,59 @@ static void xennet_tx_buf_gc(struct net_device *dev)
>         xennet_maybe_wake_tx(dev);
>  }
> 
> +static grant_ref_t xennet_alloc_tx_ref(struct net_device *dev,
> +                                      unsigned long mfn,
> +                                      unsigned int id)
> +{
> +       struct netfront_info *np = netdev_priv(dev);
> +       grant_ref_t ref;
> +       struct page *granted_page;
> +
> +       if (np->persistent_gnt && np->tx_gnt_cnt) {
> +               struct gnt_list *gnt_list_entry;
> +
> +               gnt_list_entry = np->tx_gnt_list;
> +               np->tx_gnt_list = np->tx_gnt_list->tail;
> +               np->tx_gnt_cnt--;
> +
> +               ref = gnt_list_entry->gref;
> +               np->tx_grant[id] = gnt_list_entry;
> +       } else {
> +               struct gnt_list *gnt_list_entry;
> +
> +               BUG_ON(!np->persistent_gnt && np->tx_gnt_cnt);
> +               ref = gnttab_claim_grant_reference(&np->gref_tx_head);
> +               BUG_ON((signed short)ref < 0);
> +
> +               if (np->persistent_gnt) {
> +                       granted_page = alloc_page(GFP_KERNEL);
> +                       if (!granted_page) {
> +                               gnttab_release_grant_reference(
> +                                                       &np->gref_tx_head, ref);
> +                               return -ENOMEM;
> +                       }
> +
> +                       mfn = pfn_to_mfn(page_to_pfn(granted_page));
> +                       gnt_list_entry = kmalloc(sizeof(struct gnt_list),
> +                                                GFP_KERNEL);
> +                       if (!gnt_list_entry) {
> +                               __free_page(granted_page);
> +                               gnttab_release_grant_reference(
> +                                                       &np->gref_tx_head, ref);
> +                               return -ENOMEM;
> +                       }
> +
> +                       gnt_list_entry->gref = ref;
> +                       gnt_list_entry->gnt_pages = granted_page;
> +                       np->tx_grant[id] = gnt_list_entry;
> +               }
> +               gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id,
> +                                               mfn, 0);

If you are not always using persistent grants I guess you need to give
read only permissions to this frame (GNTMAP_readonly). Also, for keeping
things in logical order, isn't it best that this function comes before
xennet_tx_buf_gc?

> +       }
> +
> +       return ref;
> +}
> +
> @@ -1132,8 +1357,10 @@ static void xennet_release_rx_bufs(struct netfront_info *np)
>                 }
> 
>                 skb = np->rx_skbs[id];
> -               mfn = gnttab_end_foreign_transfer_ref(ref);
> -               gnttab_release_grant_reference(&np->gref_rx_head, ref);
> +               if (!np->persistent_gnt) {
> +                       mfn = gnttab_end_foreign_transfer_ref(ref);
> +                       gnttab_release_grant_reference(&np->gref_rx_head, ref);
> +               }
>                 np->grant_rx_ref[id] = GRANT_INVALID_REF;
> 
>                 if (0 == mfn) {
> @@ -1607,6 +1834,13 @@ again:
>                 goto abort_transaction;
>         }
> 
> +       err = xenbus_printf(xbt, dev->nodename, "feature-persistent-grants",
> +                           "%u", info->persistent_gnt);

As in netback, I think "feature-persistent" should be used.

> +       if (err) {
> +               message = "writing feature-persistent-grants";
> +               xenbus_dev_fatal(dev, err, "%s", message);
> +       }
> +
>         err = xenbus_transaction_end(xbt, 0);
>         if (err) {
>                 if (err == -EAGAIN)
> @@ -1634,6 +1868,7 @@ static int xennet_connect(struct net_device *dev)
>         grant_ref_t ref;
>         struct xen_netif_rx_request *req;
>         unsigned int feature_rx_copy;
> +       int ret, val;
> 
>         err = xenbus_scanf(XBT_NIL, np->xbdev->otherend,
>                            "feature-rx-copy", "%u", &feature_rx_copy);
> @@ -1646,6 +1881,13 @@ static int xennet_connect(struct net_device *dev)
>                 return -ENODEV;
>         }
> 
> +       err = xenbus_scanf(XBT_NIL, np->xbdev->otherend,
> +                          "feature-persistent-grants", "%u", &val);
> +       if (err != 1)
> +               val = 0;
> +
> +       np->persistent_gnt = !!val;
> +
>         err = talk_to_netback(np->xbdev, np);
>         if (err)
>                 return err;
> @@ -1657,9 +1899,24 @@ static int xennet_connect(struct net_device *dev)
>         spin_lock_bh(&np->rx_lock);
>         spin_lock_irq(&np->tx_lock);
> 
> +       np->tx_gnt_cnt = 0;
> +       np->rx_gnt_cnt = 0;
> +
>         /* Step 1: Discard all pending TX packet fragments. */
>         xennet_release_tx_bufs(np);
> 
> +       if (np->persistent_gnt) {
> +               struct gnt_list *gnt_list_entry;
> +
> +               while (np->rx_gnt_list) {
> +                       gnt_list_entry = np->rx_gnt_list;
> +                       np->rx_gnt_list = np->rx_gnt_list->tail;
> +                       gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL);
> +                       __free_page(gnt_list_entry->gnt_pages);
> +                       kfree(gnt_list_entry);
> +               }
> +       }
> +
>         /* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */
>         for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE; i++) {
>                 skb_frag_t *frag;
> @@ -1673,10 +1930,11 @@ static int xennet_connect(struct net_device *dev)
> 
>                 frag = &skb_shinfo(skb)->frags[0];
>                 page = skb_frag_page(frag);
> -               gnttab_grant_foreign_access_ref(
> -                       ref, np->xbdev->otherend_id,
> -                       pfn_to_mfn(page_to_pfn(page)),
> -                       0);
> +               ret = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)),
> +                                         page_address(page), requeue_idx, ref);
> +               if ((signed short)ret < 0)
> +                       break;
> +
>                 req->gref = ref;
>                 req->id   = requeue_idx;
> 
> --
> 1.7.3.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  8:38   ` [Xen-devel] " ANNIE LI
  2012-11-15  8:51     ` Ian Campbell
  2012-11-15  9:35     ` Wei Liu
@ 2012-11-15 10:56     ` Roger Pau Monné
  2012-11-15 11:14       ` ANNIE LI
                         ` (2 more replies)
  2 siblings, 3 replies; 42+ messages in thread
From: Roger Pau Monné @ 2012-11-15 10:56 UTC (permalink / raw)
  To: ANNIE LI
  Cc: Pasi Kärkkäinen, netdev, xen-devel, Ian Campbell, konrad.wilk

On 15/11/12 09:38, ANNIE LI wrote:
> 
> 
> On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
>> Hello,
>>
>> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
>>> This patch implements persistent grants for xen-netfront/netback. This
>>> mechanism maintains page pools in netback/netfront, these page pools is used to
>>> save grant pages which are mapped. This way improve performance which is wasted
>>> when doing grant operations.
>>>
>>> Current netback/netfront does map/unmap grant operations frequently when
>>> transmitting/receiving packets, and grant operations costs much cpu clock. In
>>> this patch, netfront/netback maps grant pages when needed and then saves them
>>> into a page pool for future use. All these pages will be unmapped when
>>> removing/releasing the net device.
>>>
>> Do you have performance numbers available already? with/without persistent grants?
> I have some simple netperf/netserver test result with/without persistent 
> grants,
> 
> Following is result of with persistent grant patch,
> 
> Guests, Sum,      Avg,     Min,     Max
>   1,  15106.4,  15106.4, 15106.36, 15106.36
>   2,  13052.7,  6526.34,  6261.81,  6790.86
>   3,  12675.1,  6337.53,  6220.24,  6454.83
>   4,  13194,  6596.98,  6274.70,  6919.25
> 
> 
> Following are result of without persistent patch
> 
> Guests, Sum,     Avg,    Min,        Max
>   1,  10864.1,  10864.1, 10864.10, 10864.10
>   2,  10898.5,  5449.24,  4862.08,  6036.40
>   3,  10734.5,  5367.26,  5261.43,  5473.08
>   4,  10924,    5461.99,  5314.84,  5609.14

In the block case, performance improvement is seen when using a large
number of guests, could you perform the same benchmark increasing the
number of guests to 15?

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  9:35     ` Wei Liu
@ 2012-11-15 11:12       ` ANNIE LI
  2012-11-16 15:34       ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 42+ messages in thread
From: ANNIE LI @ 2012-11-15 11:12 UTC (permalink / raw)
  To: Wei Liu; +Cc: netdev, xen-devel, Ian Campbell, Konrad Rzeszutek Wilk



On 2012-11-15 17:35, Wei Liu wrote:
>
> On Thu, Nov 15, 2012 at 4:38 PM, ANNIE LI <annie.li@oracle.com 
> <mailto:annie.li@oracle.com>> wrote:
>
>
>
>     On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
>
>         Hello,
>
>         On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
>
>             This patch implements persistent grants for
>             xen-netfront/netback. This
>             mechanism maintains page pools in netback/netfront, these
>             page pools is used to
>             save grant pages which are mapped. This way improve
>             performance which is wasted
>             when doing grant operations.
>
>             Current netback/netfront does map/unmap grant operations
>             frequently when
>             transmitting/receiving packets, and grant operations costs
>             much cpu clock. In
>             this patch, netfront/netback maps grant pages when needed
>             and then saves them
>             into a page pool for future use. All these pages will be
>             unmapped when
>             removing/releasing the net device.
>
>         Do you have performance numbers available already?
>         with/without persistent grants?
>
>     I have some simple netperf/netserver test result with/without
>     persistent grants,
>
>     Following is result of with persistent grant patch,
>
>     Guests, Sum,      Avg,     Min,     Max
>      1,  15106.4,  15106.4, 15106.36, 15106.36
>      2,  13052.7,  6526.34,  6261.81,  6790.86
>      3,  12675.1,  6337.53,  6220.24,  6454.83
>      4,  13194,  6596.98,  6274.70,  6919.25
>
>
>     Following are result of without persistent patch
>
>     Guests, Sum,     Avg,    Min,        Max
>      1,  10864.1,  10864.1, 10864.10, 10864.10
>      2,  10898.5,  5449.24,  4862.08,  6036.40
>      3,  10734.5,  5367.26,  5261.43,  5473.08
>      4,  10924,    5461.99,  5314.84,  5609.14
>
>
>
> Interesting results. Have you tested how good it is on a 10G nic, i.e. 
> guest sending packets
> through physical network to another host.
Not yet.

Thanks
Annie
>
>
> Wei.
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15 10:56     ` Roger Pau Monné
@ 2012-11-15 11:14       ` ANNIE LI
  2012-11-15 11:15       ` Ian Campbell
  2012-11-16 15:21       ` Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 42+ messages in thread
From: ANNIE LI @ 2012-11-15 11:14 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Pasi Kärkkäinen, netdev, xen-devel, Ian Campbell, konrad.wilk



On 2012-11-15 18:56, Roger Pau Monné wrote:
> On 15/11/12 09:38, ANNIE LI wrote:
>>
>> On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
>>> Hello,
>>>
>>> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
>>>> This patch implements persistent grants for xen-netfront/netback. This
>>>> mechanism maintains page pools in netback/netfront, these page pools is used to
>>>> save grant pages which are mapped. This way improve performance which is wasted
>>>> when doing grant operations.
>>>>
>>>> Current netback/netfront does map/unmap grant operations frequently when
>>>> transmitting/receiving packets, and grant operations costs much cpu clock. In
>>>> this patch, netfront/netback maps grant pages when needed and then saves them
>>>> into a page pool for future use. All these pages will be unmapped when
>>>> removing/releasing the net device.
>>>>
>>> Do you have performance numbers available already? with/without persistent grants?
>> I have some simple netperf/netserver test result with/without persistent
>> grants,
>>
>> Following is result of with persistent grant patch,
>>
>> Guests, Sum,      Avg,     Min,     Max
>>    1,  15106.4,  15106.4, 15106.36, 15106.36
>>    2,  13052.7,  6526.34,  6261.81,  6790.86
>>    3,  12675.1,  6337.53,  6220.24,  6454.83
>>    4,  13194,  6596.98,  6274.70,  6919.25
>>
>>
>> Following are result of without persistent patch
>>
>> Guests, Sum,     Avg,    Min,        Max
>>    1,  10864.1,  10864.1, 10864.10, 10864.10
>>    2,  10898.5,  5449.24,  4862.08,  6036.40
>>    3,  10734.5,  5367.26,  5261.43,  5473.08
>>    4,  10924,    5461.99,  5314.84,  5609.14
> In the block case, performance improvement is seen when using a large
> number of guests, could you perform the same benchmark increasing the
> number of guests to 15?
Sure, but my current local environment does not meet such test 
requirement, let me find an environment and then start such test.

Thanks
Annie
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  8:53 ` Ian Campbell
@ 2012-11-15 11:14   ` ANNIE LI
  0 siblings, 0 replies; 42+ messages in thread
From: ANNIE LI @ 2012-11-15 11:14 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, netdev, konrad.wilk



On 2012-11-15 16:53, Ian Campbell wrote:
> On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
>> This patch is based on linux3.4-rc3. I hit netperf/netserver failure
>> on linux latest version v3.7-rc1, v3.7-rc2 and v3.7-rc4. Not sure
>> whether thisnetperf/netserver failure connects compound page commit in
>> v3.7-rc1, but I did hit BUG_ON with debug patch from thread
>> http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.html
> Do you think you could cook up a netfront fix similar in principal to
> the 6a8ed462f16b8455eec5ae00eb6014159a6721f0 fix for netback?

Ok, let me have a try to do similar thing in netfront.

Thanks
Annie
>
> Ian.
>
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  8:56 ` Ian Campbell
@ 2012-11-15 11:14   ` ANNIE LI
  0 siblings, 0 replies; 42+ messages in thread
From: ANNIE LI @ 2012-11-15 11:14 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, netdev, konrad.wilk



On 2012-11-15 16:56, Ian Campbell wrote:
> On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
>
>> This patch implements persistent grants for xen-netfront/netback. This
>> mechanism maintains page pools in netback/netfront, these page pools
>> is used to save grant pages which are mapped. This way improve
>> performance which is wasted when doing grant operations.
> Please can you send a patch against xen-unstable.hg to document this new
> protocol variant in xen/include/public/io/netif.h. This header is a bit
> under-documented but lets not let it fall further behind (if you want to
> go further and document the existing features, in the style of the
> current blkif.h, then that would be awesome!).
Ok, no problem.
>
> You may also want to provide a similar patch to Linux's copy which is in
> linux/include/xen/interface/io/netif.h
Ok, will do that.

Thanks
Annie
>
> Ian.
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15 10:56     ` Roger Pau Monné
  2012-11-15 11:14       ` ANNIE LI
@ 2012-11-15 11:15       ` Ian Campbell
  2012-11-15 18:29         ` Konrad Rzeszutek Wilk
  2012-11-16 15:21       ` Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 42+ messages in thread
From: Ian Campbell @ 2012-11-15 11:15 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: ANNIE LI, Pasi Kärkkäinen, netdev, xen-devel, konrad.wilk

On Thu, 2012-11-15 at 10:56 +0000, Roger Pau Monne wrote:
> On 15/11/12 09:38, ANNIE LI wrote:
> > 
> > 
> > On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> >> Hello,
> >>
> >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> >>> This patch implements persistent grants for xen-netfront/netback. This
> >>> mechanism maintains page pools in netback/netfront, these page pools is used to
> >>> save grant pages which are mapped. This way improve performance which is wasted
> >>> when doing grant operations.
> >>>
> >>> Current netback/netfront does map/unmap grant operations frequently when
> >>> transmitting/receiving packets, and grant operations costs much cpu clock. In
> >>> this patch, netfront/netback maps grant pages when needed and then saves them
> >>> into a page pool for future use. All these pages will be unmapped when
> >>> removing/releasing the net device.
> >>>
> >> Do you have performance numbers available already? with/without persistent grants?
> > I have some simple netperf/netserver test result with/without persistent 
> > grants,
> > 
> > Following is result of with persistent grant patch,
> > 
> > Guests, Sum,      Avg,     Min,     Max
> >   1,  15106.4,  15106.4, 15106.36, 15106.36
> >   2,  13052.7,  6526.34,  6261.81,  6790.86
> >   3,  12675.1,  6337.53,  6220.24,  6454.83
> >   4,  13194,  6596.98,  6274.70,  6919.25
> > 
> > 
> > Following are result of without persistent patch
> > 
> > Guests, Sum,     Avg,    Min,        Max
> >   1,  10864.1,  10864.1, 10864.10, 10864.10
> >   2,  10898.5,  5449.24,  4862.08,  6036.40
> >   3,  10734.5,  5367.26,  5261.43,  5473.08
> >   4,  10924,    5461.99,  5314.84,  5609.14
> 
> In the block case, performance improvement is seen when using a large
> number of guests, could you perform the same benchmark increasing the
> number of guests to 15?

It would also be nice to see some analysis of the numbers which justify
why this change is a good one without every reviewer having to evaluate
the raw data themselves. In fact this should really be part of the
commit message.

Ian.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15 11:15       ` Ian Campbell
@ 2012-11-15 18:29         ` Konrad Rzeszutek Wilk
  2012-11-15 19:11           ` Ian Campbell
  0 siblings, 1 reply; 42+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-15 18:29 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Roger Pau Monne, ANNIE LI, Pasi Kärkkäinen, netdev, xen-devel

On Thu, Nov 15, 2012 at 11:15:06AM +0000, Ian Campbell wrote:
> On Thu, 2012-11-15 at 10:56 +0000, Roger Pau Monne wrote:
> > On 15/11/12 09:38, ANNIE LI wrote:
> > > 
> > > 
> > > On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> > >> Hello,
> > >>
> > >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> > >>> This patch implements persistent grants for xen-netfront/netback. This
> > >>> mechanism maintains page pools in netback/netfront, these page pools is used to
> > >>> save grant pages which are mapped. This way improve performance which is wasted
> > >>> when doing grant operations.
> > >>>
> > >>> Current netback/netfront does map/unmap grant operations frequently when
> > >>> transmitting/receiving packets, and grant operations costs much cpu clock. In
> > >>> this patch, netfront/netback maps grant pages when needed and then saves them
> > >>> into a page pool for future use. All these pages will be unmapped when
> > >>> removing/releasing the net device.
> > >>>
> > >> Do you have performance numbers available already? with/without persistent grants?
> > > I have some simple netperf/netserver test result with/without persistent 
> > > grants,
> > > 
> > > Following is result of with persistent grant patch,
> > > 
> > > Guests, Sum,      Avg,     Min,     Max
> > >   1,  15106.4,  15106.4, 15106.36, 15106.36
> > >   2,  13052.7,  6526.34,  6261.81,  6790.86
> > >   3,  12675.1,  6337.53,  6220.24,  6454.83
> > >   4,  13194,  6596.98,  6274.70,  6919.25
> > > 
> > > 
> > > Following are result of without persistent patch
> > > 
> > > Guests, Sum,     Avg,    Min,        Max
> > >   1,  10864.1,  10864.1, 10864.10, 10864.10
> > >   2,  10898.5,  5449.24,  4862.08,  6036.40
> > >   3,  10734.5,  5367.26,  5261.43,  5473.08
> > >   4,  10924,    5461.99,  5314.84,  5609.14
> > 
> > In the block case, performance improvement is seen when using a large
> > number of guests, could you perform the same benchmark increasing the
> > number of guests to 15?
> 
> It would also be nice to see some analysis of the numbers which justify
> why this change is a good one without every reviewer having to evaluate
> the raw data themselves. In fact this should really be part of the
> commit message.

You mean like a nice graph, eh?

I will run these patches on my 32GB box and see if I can give you
a nice PDF/jpg.

> 
> Ian.
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15 18:29         ` Konrad Rzeszutek Wilk
@ 2012-11-15 19:11           ` Ian Campbell
  2012-11-16 15:23             ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 42+ messages in thread
From: Ian Campbell @ 2012-11-15 19:11 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Roger Pau Monne, ANNIE LI, Pasi Kärkkäinen, netdev, xen-devel

On Thu, 2012-11-15 at 18:29 +0000, Konrad Rzeszutek Wilk wrote:
> On Thu, Nov 15, 2012 at 11:15:06AM +0000, Ian Campbell wrote:
> > On Thu, 2012-11-15 at 10:56 +0000, Roger Pau Monne wrote:
> > > On 15/11/12 09:38, ANNIE LI wrote:
> > > > 
> > > > 
> > > > On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> > > >> Hello,
> > > >>
> > > >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> > > >>> This patch implements persistent grants for xen-netfront/netback. This
> > > >>> mechanism maintains page pools in netback/netfront, these page pools is used to
> > > >>> save grant pages which are mapped. This way improve performance which is wasted
> > > >>> when doing grant operations.
> > > >>>
> > > >>> Current netback/netfront does map/unmap grant operations frequently when
> > > >>> transmitting/receiving packets, and grant operations costs much cpu clock. In
> > > >>> this patch, netfront/netback maps grant pages when needed and then saves them
> > > >>> into a page pool for future use. All these pages will be unmapped when
> > > >>> removing/releasing the net device.
> > > >>>
> > > >> Do you have performance numbers available already? with/without persistent grants?
> > > > I have some simple netperf/netserver test result with/without persistent 
> > > > grants,
> > > > 
> > > > Following is result of with persistent grant patch,
> > > > 
> > > > Guests, Sum,      Avg,     Min,     Max
> > > >   1,  15106.4,  15106.4, 15106.36, 15106.36
> > > >   2,  13052.7,  6526.34,  6261.81,  6790.86
> > > >   3,  12675.1,  6337.53,  6220.24,  6454.83
> > > >   4,  13194,  6596.98,  6274.70,  6919.25
> > > > 
> > > > 
> > > > Following are result of without persistent patch
> > > > 
> > > > Guests, Sum,     Avg,    Min,        Max
> > > >   1,  10864.1,  10864.1, 10864.10, 10864.10
> > > >   2,  10898.5,  5449.24,  4862.08,  6036.40
> > > >   3,  10734.5,  5367.26,  5261.43,  5473.08
> > > >   4,  10924,    5461.99,  5314.84,  5609.14
> > > 
> > > In the block case, performance improvement is seen when using a large
> > > number of guests, could you perform the same benchmark increasing the
> > > number of guests to 15?
> > 
> > It would also be nice to see some analysis of the numbers which justify
> > why this change is a good one without every reviewer having to evaluate
> > the raw data themselves. In fact this should really be part of the
> > commit message.
> 
> You mean like a nice graph, eh?

Together with an analysis of what it means and why it is a good thing,
yes.

Ian.

> 
> I will run these patches on my 32GB box and see if I can give you
> a nice PDF/jpg.
> 
> > 
> > Ian.
> > 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
  2012-11-15  9:10   ` Ian Campbell
@ 2012-11-16  2:18     ` ANNIE LI
  2012-11-16  9:27       ` Ian Campbell
  0 siblings, 1 reply; 42+ messages in thread
From: ANNIE LI @ 2012-11-16  2:18 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, netdev, konrad.wilk



On 2012-11-15 17:10, Ian Campbell wrote:
> On Thu, 2012-11-15 at 07:04 +0000, Annie Li wrote:
>> This patch implements persistent grant in netback driver. Tx and rx
>> share the same page pool, this pool will be split into two parts
>> in next patch.
>>
>> Signed-off-by: Annie Li<annie.li@oracle.com>
>> ---
>>   drivers/net/xen-netback/common.h    |   18 +++-
>>   drivers/net/xen-netback/interface.c |   22 ++++
>>   drivers/net/xen-netback/netback.c   |  212 +++++++++++++++++++++++++++++++----
>>   drivers/net/xen-netback/xenbus.c    |   14 ++-
>>   4 files changed, 239 insertions(+), 27 deletions(-)
>>
>> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
>> index 94b79c3..a85cac6 100644
>> --- a/drivers/net/xen-netback/common.h
>> +++ b/drivers/net/xen-netback/common.h
>> @@ -45,8 +45,19 @@
>>   #include<xen/grant_table.h>
>>   #include<xen/xenbus.h>
>>
>> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
>> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
>> +#define MAXIMUM_OUTSTANDING_BLOCK_REQS \
> BLOCK?

Oh, an error when splitting the patch, will fix it, thanks.

>
>> +                       (XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE)
>> +
>>   struct xen_netbk;
>>
>> +struct persistent_entry {
>> +       grant_ref_t forgranted;
>> +       struct page *fpage;
>> +       struct gnttab_map_grant_ref map;
>> +};
> Isn't this duplicating a bunch of infrastructure which is also in
> blkback? Can we put it into some common helpers please?

Yes,

"struct gnttab_map_grant_ref map" can be changed to handle like blkback to keep same as blkback, and share them in common helpers.


>
>> +
>>   struct xenvif {
>>          /* Unique identifier for this interface. */
>>          domid_t          domid;
>> @@ -75,6 +86,7 @@ struct xenvif {
>>
>>          /* Internal feature information. */
>>          u8 can_queue:1;     /* can queue packets for receiver? */
>> +       u8 persistent_grant:1;
>>
>>          /*
>>           * Allow xenvif_start_xmit() to peek ahead in the rx request
>> @@ -98,6 +110,9 @@ struct xenvif {
>>          struct net_device *dev;
>>
>>          wait_queue_head_t waiting_to_free;
>> +
>> +       struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];
> What is the per-vif memory overhead of this array?
In this patch,
The maximum of memory overhead is about

(XEN_NETIF_TX_RING_SIZE+XEN_NETIF_RX_RING_SIZE)*PAGE_SIZE  (plus size of grant_ref_t and handle)
which is about 512 PAGE_SIZE. Normally, without heavy network offload, this maximum can not be reached.

In next patch of splitting tx/rx pool, the maximum is about (256+512)PAGE_SIZE.

>
>> +static struct persistent_entry*
>> +get_per_gnt(struct persistent_entry **pers_entry,
>> +           unsigned int count, grant_ref_t gref)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i<  count; i++)
>> +               if (gref == pers_entry[i]->forgranted)
>> +                       return pers_entry[i];
> Isn't this linear scan rather expensive? I think Roger implemented some
> sort of hash lookup for blkback which I think is required here too (and
> should be easy if you make that code common).

Agree, thanks.

>
>> +
>> +       return NULL;
>> +}
>> +
>> @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>>                  gop->source.domid = vif->domid;
>>                  gop->source.offset = txreq.offset;
>>
>> -               gop->dest.u.gmfn = virt_to_mfn(page_address(page));
>> +               if (!vif->persistent_grant)
>> +                       gop->dest.u.gmfn = virt_to_mfn(page_address(page));
>> +               else
>> +                       gop->dest.u.gmfn = (unsigned long)page_address(page);
> page_address doesn't return any sort of frame number, does it? This is
> rather confusing...

Yes. I only use dest.u.gmfn element to save the page_address here for 
future memcpy, and it does not mean to use frame number actually. To 
avoid confusion, here I can use

gop->dest.u.gmfn = virt_to_mfn(page_address(page));

and then call mfn_to_virt when doing memcpy.


>
>> @@ -453,7 +460,12 @@ static int connect_rings(struct backend_info *be)
>>                  val = 0;
>>          vif->csum = !val;
>>
>> -       /* Map the shared frame, irq etc. */
>> +       if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent-grants",
>> +                        "%u",&val)<  0)
>> +               val = 0;
>> +       vif->persistent_grant = !!val;
>> +
>> +/* Map the shared frame, irq etc. */
> Please run the patches through checkpatch.pl

Yes, I run checkpatch.pl before posting them. The only warning exists in 
initial code netfront.c, it is a printk code in xennet_tx_buf_gc, I did 
not fix that.

Thanks
Annie
>
>>          err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn);
>>          if (err) {
>>                  xenbus_dev_fatal(dev, err,
>> --
>> 1.7.3.4
>>
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
  2012-11-15  9:57   ` Roger Pau Monné
@ 2012-11-16  2:49     ` ANNIE LI
  2012-11-16  7:57       ` ANNIE LI
  2012-11-16  9:32       ` Ian Campbell
  0 siblings, 2 replies; 42+ messages in thread
From: ANNIE LI @ 2012-11-16  2:49 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, netdev, konrad.wilk, Ian Campbell



On 2012-11-15 17:57, Roger Pau Monné wrote:
> On 15/11/12 08:04, Annie Li wrote:
>> This patch implements persistent grant in netback driver. Tx and rx
>> share the same page pool, this pool will be split into two parts
>> in next patch.
>>
>> Signed-off-by: Annie Li<annie.li@oracle.com>
>> ---
>>   drivers/net/xen-netback/common.h    |   18 +++-
>>   drivers/net/xen-netback/interface.c |   22 ++++
>>   drivers/net/xen-netback/netback.c   |  212 +++++++++++++++++++++++++++++++----
>>   drivers/net/xen-netback/xenbus.c    |   14 ++-
>>   4 files changed, 239 insertions(+), 27 deletions(-)
>>
>> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
>> index 94b79c3..a85cac6 100644
>> --- a/drivers/net/xen-netback/common.h
>> +++ b/drivers/net/xen-netback/common.h
>> @@ -45,8 +45,19 @@
>>   #include<xen/grant_table.h>
>>   #include<xen/xenbus.h>
>>
>> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
>> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
>> +#define MAXIMUM_OUTSTANDING_BLOCK_REQS \
>> +                       (XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE)
>> +
>>   struct xen_netbk;
>>
>> +struct persistent_entry {
>> +       grant_ref_t forgranted;
>> +       struct page *fpage;
>> +       struct gnttab_map_grant_ref map;
>> +};
> This should be common with the blkback implementation, I think we should
> move some structures/functions from blkback to a common place. When I
> implementated some functions in blkback I though they could be reused by
> other backends that wanted to use persistent grants, so I keep them free
> of blkback specific structures.

Good idea, thanks.

>
>>   struct xenvif {
>>          /* Unique identifier for this interface. */
>>          domid_t          domid;
>> @@ -75,6 +86,7 @@ struct xenvif {
>>
>>          /* Internal feature information. */
>>          u8 can_queue:1;     /* can queue packets for receiver? */
>> +       u8 persistent_grant:1;
>>
>>          /*
>>           * Allow xenvif_start_xmit() to peek ahead in the rx request
>> @@ -98,6 +110,9 @@ struct xenvif {
>>          struct net_device *dev;
>>
>>          wait_queue_head_t waiting_to_free;
>> +
>> +       struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];
>> +       unsigned int persistent_gntcnt;
> This should be a red-black tree, which has the property of a search time
> <= O(log n), using an array is more expensive in terms of memory and has
> a worse search time O(n), this is specially interesting for netback,
> which can have twice as much persistent grants as blkback (because two
> rings are used).

Right, thanks.

>
> Take a look at the following functions from blkback; foreach_grant,
> add_persistent_gnt and get_persistent_gnt. They are generic functions to
> deal with persistent grants.

Ok, thanks.
Or moving those functions into a separate common file?

>
>>   };
>>
>>   static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
>> @@ -105,9 +120,6 @@ static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
>>          return to_xenbus_device(vif->dev->dev.parent);
>>   }
>>
>> -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
>> -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
>> -
>>   struct xenvif *xenvif_alloc(struct device *parent,
>>                              domid_t domid,
>>                              unsigned int handle);
>> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
>> index b7d41f8..226d159 100644
>> --- a/drivers/net/xen-netback/interface.c
>> +++ b/drivers/net/xen-netback/interface.c
>> @@ -300,6 +300,8 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
>>                  return ERR_PTR(err);
>>          }
>>
>> +       vif->persistent_gntcnt = 0;
>> +
>>          netdev_dbg(dev, "Successfully created xenvif\n");
>>          return vif;
>>   }
>> @@ -343,6 +345,23 @@ err:
>>          return err;
>>   }
>>
>> +void xenvif_free_grants(struct persistent_entry **pers_entry,
>> +                       unsigned int count)
>> +{
>> +       int i, ret;
>> +       struct gnttab_unmap_grant_ref unmap;
>> +
>> +       for (i = 0; i<  count; i++) {
>> +               gnttab_set_unmap_op(&unmap,
>> +                       (unsigned long)page_to_pfn(pers_entry[i]->fpage),
>> +                       GNTMAP_host_map,
>> +                       pers_entry[i]->map.handle);
>> +               ret = gnttab_unmap_refs(&unmap,&pers_entry[i]->fpage,
>> +                                       1, false);
> This is not correct, you should call gnttab_set_unmap_op on a batch of
> grants (up to BLKIF_MAX_SEGMENTS_PER_REQUEST), and then call
> gnttab_unmap_refs on all of them. Here is a simple example (take a look
> at blkback.c function xen_blkif_schedule to see an example with a
> red-black tree, I think this part of the code should also be made common):
>
> for (i = 0, segs_to_unmap = 0; i<  count; i++) {
> 	gnttab_set_unmap_op(&unmap[segs_to_unmap],
> 		(unsigned long)page_to_pfn(pers_entry[i]->fpage),
> 		GNTMAP_host_map,
> 		pers_entry[i]->map.handle);
> 	pages[segs_to_unmap] =
> 		(unsigned long)page_to_pfn(pers_entry[i]->fpage);
> 	if (++segs_to_unmap == BLKIF_MAX_SEGMENTS_PER_REQUEST ||
> 		(i + 1) == count) {
> 		ret = gnttab_unmap_refs(unmap, NULL, pages,
> 			segs_to_unmap);
> 		BUG_ON(ret);
> 		segs_to_unmap == 0;
> 	}
> }

Got it, thanks.

>
>> +               BUG_ON(ret);
>> +       }
>> +}
>> +
>>   void xenvif_disconnect(struct xenvif *vif)
>>   {
>>          struct net_device *dev = vif->dev;
>> @@ -366,6 +385,9 @@ void xenvif_disconnect(struct xenvif *vif)
>>          unregister_netdev(vif->dev);
>>
>>          xen_netbk_unmap_frontend_rings(vif);
>> +       if (vif->persistent_grant)
>> +               xenvif_free_grants(vif->persistent_gnt,
>> +                                  vif->persistent_gntcnt);
>>
>>          free_netdev(vif->dev);
>>   }
>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>> index 2596401..a26d3fc 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -80,6 +80,8 @@ union page_ext {
>>          void *mapping;
>>   };
>>
>> +struct xenvif;
>> +
>>   struct xen_netbk {
>>          wait_queue_head_t wq;
>>          struct task_struct *task;
>> @@ -102,6 +104,7 @@ struct xen_netbk {
>>
>>          struct pending_tx_info pending_tx_info[MAX_PENDING_REQS];
>>          struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS];
>> +       struct xenvif *gnttab_tx_vif[MAX_PENDING_REQS];
>>
>>          u16 pending_ring[MAX_PENDING_REQS];
>>
>> @@ -111,12 +114,139 @@ struct xen_netbk {
>>           * straddles two buffers in the frontend.
>>           */
>>          struct gnttab_copy grant_copy_op[2*XEN_NETIF_RX_RING_SIZE];
>> +       struct xenvif *gnttab_rx_vif[2*XEN_NETIF_RX_RING_SIZE];
>>          struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE];
>>   };
>>
>>   static struct xen_netbk *xen_netbk;
>>   static int xen_netbk_group_nr;
>>
>> +static struct persistent_entry*
>> +get_per_gnt(struct persistent_entry **pers_entry,
>> +           unsigned int count, grant_ref_t gref)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i<  count; i++)
>> +               if (gref == pers_entry[i]->forgranted)
>> +                       return pers_entry[i];
>> +
>> +       return NULL;
>> +}
> This should be replaced with common code shared with all persistent
> backends implementations.

Ok, thanks.

>
>> +
>> +static void*
>> +map_new_gnt(struct persistent_entry **pers_entry, unsigned int count,
>> +           grant_ref_t ref, domid_t domid)
>> +{
>> +       struct gnttab_map_grant_ref *map;
>> +       struct page *page;
>> +       unsigned long vaddr;
>> +       unsigned long pfn;
>> +       uint32_t flags;
>> +       int ret = 0;
>> +
>> +       pers_entry[count] = (struct persistent_entry *)
>> +                           kmalloc(sizeof(struct persistent_entry),
>> +                                   GFP_KERNEL);
>> +       if (!pers_entry[count])
>> +               return ERR_PTR(-ENOMEM);
>> +
>> +       map =&pers_entry[count]->map;
>> +       page = alloc_page(GFP_KERNEL);
>> +       if (!page) {
>> +               kfree(pers_entry[count]);
>> +               pers_entry[count] = NULL;
>> +               return ERR_PTR(-ENOMEM);
>> +       }
>> +
>> +       pers_entry[count]->fpage = page;
>> +       pfn = page_to_pfn(page);
>> +       vaddr = (unsigned long)pfn_to_kaddr(pfn);
>> +       flags = GNTMAP_host_map;
>> +
>> +       gnttab_set_map_op(map, vaddr, flags, ref, domid);
>> +       ret = gnttab_map_refs(map, NULL,&page, 1);
>> +       BUG_ON(ret);
> This is highly inefficient, one of the points of using gnttab_set_map_op
> is that you can queue a bunch of grants, and then map them at the same
> time using gnttab_map_refs, but here you are using it to map a single
> grant at a time. You should instead see how much grants you need to map
> to complete the request and map them all at the same time.

Yes, it is inefficient here. But this is limited by current netback 
implementation. Current netback is not per-VIF based(not like blkback 
does). After combining persistent grant and non persistent grant 
together, every vif request in the queue may/may not support persistent 
grant. I have to judge whether every vif in the queue supports 
persistent grant or not. If it support, memcpy is used, if not, 
grantcopy is used.
After making netback per-VIF works, this issue can be fixed.

>
>> +
>> +       pers_entry[count]->forgranted = ref;
>> +
>> +       return page_address(page);
>> +}
>> +
>> +static void*
>> +get_ref_page(struct persistent_entry **pers_entry, unsigned int *count,
>> +            grant_ref_t ref, domid_t domid, unsigned int total)
>> +{
>> +       struct persistent_entry *per_gnt;
>> +       void *vaddr;
>> +
>> +       per_gnt = get_per_gnt(pers_entry, *count, ref);
>> +
>> +       if (per_gnt != NULL)
>> +               return page_address(per_gnt->fpage);
>> +       else {
>> +               BUG_ON(*count>= total);
>> +               vaddr = map_new_gnt(pers_entry, *count, ref, domid);
>> +               if (IS_ERR_OR_NULL(vaddr))
>> +                       return vaddr;
>> +               *count += 1;
>> +               return vaddr;
>> +       }
>> +}
>> +
>> +static int
>> +grant_memory_copy_op(unsigned int cmd, void *vuop, unsigned int count,
>> +                    struct xen_netbk *netbk, bool tx_pool)
>> +{
>> +       int i;
>> +       struct xenvif *vif;
>> +       struct gnttab_copy *uop = vuop;
>> +       unsigned int *gnt_count;
>> +       unsigned int gnt_total;
>> +       struct persistent_entry **pers_entry;
>> +       int ret = 0;
>> +
>> +       BUG_ON(cmd != GNTTABOP_copy);
>> +       for (i = 0; i<  count; i++) {
>> +               if (tx_pool)
>> +                       vif = netbk->gnttab_tx_vif[i];
>> +               else
>> +                       vif = netbk->gnttab_rx_vif[i];
>> +
>> +               pers_entry = vif->persistent_gnt;
>> +               gnt_count =&vif->persistent_gntcnt;
>> +               gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS;
>> +
>> +               if (vif->persistent_grant) {
>> +                       void *saddr, *daddr;
>> +
>> +                       saddr = uop[i].source.domid == DOMID_SELF ?
>> +                               (void *) uop[i].source.u.gmfn :
>> +                               get_ref_page(pers_entry, gnt_count,
>> +                                            uop[i].source.u.ref,
>> +                                            uop[i].source.domid,
>> +                                            gnt_total);
>> +                       if (IS_ERR_OR_NULL(saddr))
>> +                               return -ENOMEM;
>> +
>> +                       daddr = uop[i].dest.domid == DOMID_SELF ?
>> +                               (void *) uop[i].dest.u.gmfn :
>> +                               get_ref_page(pers_entry, gnt_count,
>> +                                            uop[i].dest.u.ref,
>> +                                            uop[i].dest.domid,
>> +                                            gnt_total);
>> +                       if (IS_ERR_OR_NULL(daddr))
>> +                               return -ENOMEM;
>> +
>> +                       memcpy(daddr+uop[i].dest.offset,
>> +                              saddr+uop[i].source.offset, uop[i].len);
>> +               } else
>> +                       ret = HYPERVISOR_grant_table_op(cmd,&uop[i], 1);
>> +       }
>> +
>> +       return ret;
>> +}
>> +
>>   void xen_netbk_add_xenvif(struct xenvif *vif)
>>   {
>>          int i;
>> @@ -387,13 +517,15 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
>>    * Set up the grant operations for this fragment. If it's a flipping
>>    * interface, we also set up the unmap request from here.
>>    */
>> -static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>> -                               struct netrx_pending_operations *npo,
>> -                               struct page *page, unsigned long size,
>> -                               unsigned long offset, int *head)
>> +static int netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>> +                              struct netrx_pending_operations *npo,
>> +                              struct page *page, unsigned long size,
>> +                              unsigned long offset, int *head,
>> +                              struct xenvif **grxvif)
>>   {
>>          struct gnttab_copy *copy_gop;
>>          struct netbk_rx_meta *meta;
>> +       int count = 0;
>>          /*
>>           * These variables are used iff get_page_ext returns true,
>>           * in which case they are guaranteed to be initialized.
>> @@ -425,6 +557,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>>                          bytes = MAX_BUFFER_OFFSET - npo->copy_off;
>>
>>                  copy_gop = npo->copy + npo->copy_prod++;
>> +               *grxvif = vif;
>> +               grxvif++;
>> +               count++;
>> +
>>                  copy_gop->flags = GNTCOPY_dest_gref;
>>                  if (foreign) {
>>                          struct xen_netbk *netbk =&xen_netbk[group];
>> @@ -438,7 +574,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>>                  } else {
>>                          void *vaddr = page_address(page);
>>                          copy_gop->source.domid = DOMID_SELF;
>> -                       copy_gop->source.u.gmfn = virt_to_mfn(vaddr);
>> +                       if (!vif->persistent_grant)
>> +                               copy_gop->source.u.gmfn = virt_to_mfn(vaddr);
>> +                       else
>> +                               copy_gop->source.u.gmfn = (unsigned long)vaddr;
>>                  }
>>                  copy_gop->source.offset = offset;
>>                  copy_gop->dest.domid = vif->domid;
>> @@ -460,6 +599,7 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>>                  *head = 0; /* There must be something in this buffer now. */
>>
>>          }
>> +       return count;
>>   }
>>
>>   /*
>> @@ -474,8 +614,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>>    * zero GSO descriptors (for non-GSO packets) or one descriptor (for
>>    * frontend-side LRO).
>>    */
>> -static int netbk_gop_skb(struct sk_buff *skb,
>> -                        struct netrx_pending_operations *npo)
>> +static int netbk_gop_skb(struct xen_netbk *netbk,
>> +                        struct sk_buff *skb,
>> +                        struct netrx_pending_operations *npo,
>> +                        struct xenvif **grxvif)
>>   {
>>          struct xenvif *vif = netdev_priv(skb->dev);
>>          int nr_frags = skb_shinfo(skb)->nr_frags;
>> @@ -518,17 +660,19 @@ static int netbk_gop_skb(struct sk_buff *skb,
>>                  if (data + len>  skb_tail_pointer(skb))
>>                          len = skb_tail_pointer(skb) - data;
>>
>> -               netbk_gop_frag_copy(vif, skb, npo,
>> -                                   virt_to_page(data), len, offset,&head);
>> +               grxvif += netbk_gop_frag_copy(vif, skb, npo,
>> +                                             virt_to_page(data), len,
>> +                                             offset,&head, grxvif);
>> +
>>                  data += len;
>>          }
>>
>>          for (i = 0; i<  nr_frags; i++) {
>> -               netbk_gop_frag_copy(vif, skb, npo,
>> -                                   skb_frag_page(&skb_shinfo(skb)->frags[i]),
>> -                                   skb_frag_size(&skb_shinfo(skb)->frags[i]),
>> -                                   skb_shinfo(skb)->frags[i].page_offset,
>> -&head);
>> +               grxvif += netbk_gop_frag_copy(vif, skb, npo,
>> +                               skb_frag_page(&skb_shinfo(skb)->frags[i]),
>> +                               skb_frag_size(&skb_shinfo(skb)->frags[i]),
>> +                               skb_shinfo(skb)->frags[i].page_offset,
>> +&head, grxvif);
>>          }
>>
>>          return npo->meta_prod - old_meta_prod;
>> @@ -593,6 +737,8 @@ struct skb_cb_overlay {
>>   static void xen_netbk_rx_action(struct xen_netbk *netbk)
>>   {
>>          struct xenvif *vif = NULL, *tmp;
>> +       struct xenvif **grxvif = netbk->gnttab_rx_vif;
>> +       int old_copy_prod = 0;
>>          s8 status;
>>          u16 irq, flags;
>>          struct xen_netif_rx_response *resp;
>> @@ -619,7 +765,9 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>>                  nr_frags = skb_shinfo(skb)->nr_frags;
>>
>>                  sco = (struct skb_cb_overlay *)skb->cb;
>> -               sco->meta_slots_used = netbk_gop_skb(skb,&npo);
>> +               sco->meta_slots_used = netbk_gop_skb(netbk, skb,&npo, grxvif);
>> +               grxvif += npo.copy_prod - old_copy_prod;
>> +               old_copy_prod = npo.copy_prod;
>>
>>                  count += nr_frags + 1;
>>
>> @@ -636,8 +784,10 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>>                  return;
>>
>>          BUG_ON(npo.copy_prod>  ARRAY_SIZE(netbk->grant_copy_op));
>> -       ret = HYPERVISOR_grant_table_op(GNTTABOP_copy,&netbk->grant_copy_op,
>> -                                       npo.copy_prod);
>> +       ret = grant_memory_copy_op(GNTTABOP_copy,
>> +&netbk->grant_copy_op,
>> +                                  npo.copy_prod, netbk,
>> +                                  false);
>>          BUG_ON(ret != 0);
>>
>>          while ((skb = __skb_dequeue(&rxq)) != NULL) {
>> @@ -918,7 +1068,8 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
>>                                                    struct xenvif *vif,
>>                                                    struct sk_buff *skb,
>>                                                    struct xen_netif_tx_request *txp,
>> -                                                 struct gnttab_copy *gop)
>> +                                                 struct gnttab_copy *gop,
>> +                                                 struct xenvif **gtxvif)
>>   {
>>          struct skb_shared_info *shinfo = skb_shinfo(skb);
>>          skb_frag_t *frags = shinfo->frags;
>> @@ -944,7 +1095,11 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
>>                  gop->source.domid = vif->domid;
>>                  gop->source.offset = txp->offset;
>>
>> -               gop->dest.u.gmfn = virt_to_mfn(page_address(page));
>> +               if (!vif->persistent_grant)
>> +                       gop->dest.u.gmfn = virt_to_mfn(page_address(page));
>> +               else
>> +                       gop->dest.u.gmfn = (unsigned long)page_address(page);
>> +
>>                  gop->dest.domid = DOMID_SELF;
>>                  gop->dest.offset = txp->offset;
>>
>> @@ -952,6 +1107,9 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
>>                  gop->flags = GNTCOPY_source_gref;
>>
>>                  gop++;
>> +               *gtxvif = vif;
>> +               gtxvif++;
>> +
>>
>>                  memcpy(&pending_tx_info[pending_idx].req, txp, sizeof(*txp));
>>                  xenvif_get(vif);
>> @@ -1218,6 +1376,7 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size)
>>   static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>>   {
>>          struct gnttab_copy *gop = netbk->tx_copy_ops, *request_gop;
>> +       struct xenvif **gtxvif = netbk->gnttab_tx_vif;
>>          struct sk_buff *skb;
>>          int ret;
>>
>> @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>>                  gop->source.domid = vif->domid;
>>                  gop->source.offset = txreq.offset;
>>
>> -               gop->dest.u.gmfn = virt_to_mfn(page_address(page));
>> +               if (!vif->persistent_grant)
>> +                       gop->dest.u.gmfn = virt_to_mfn(page_address(page));
>> +               else
>> +                       gop->dest.u.gmfn = (unsigned long)page_address(page);
>> +
>>                  gop->dest.domid = DOMID_SELF;
>>                  gop->dest.offset = txreq.offset;
>>
>> @@ -1346,6 +1509,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>>                  gop->flags = GNTCOPY_source_gref;
>>
>>                  gop++;
>> +               *gtxvif++ = vif;
>>
>>                  memcpy(&netbk->pending_tx_info[pending_idx].req,
>>                         &txreq, sizeof(txreq));
>> @@ -1369,12 +1533,13 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>>                  netbk->pending_cons++;
>>
>>                  request_gop = xen_netbk_get_requests(netbk, vif,
>> -                                                    skb, txfrags, gop);
>> +                                                    skb, txfrags, gop, gtxvif);
>>                  if (request_gop == NULL) {
>>                          kfree_skb(skb);
>>                          netbk_tx_err(vif,&txreq, idx);
>>                          continue;
>>                  }
>> +               gtxvif += request_gop - gop;
>>                  gop = request_gop;
>>
>>                  vif->tx.req_cons = idx;
>> @@ -1467,8 +1632,9 @@ static void xen_netbk_tx_action(struct xen_netbk *netbk)
>>
>>          if (nr_gops == 0)
>>                  return;
>> -       ret = HYPERVISOR_grant_table_op(GNTTABOP_copy,
>> -                                       netbk->tx_copy_ops, nr_gops);
>> +       ret = grant_memory_copy_op(GNTTABOP_copy,
>> +                                  netbk->tx_copy_ops, nr_gops,
>> +                                  netbk, true);
>>          BUG_ON(ret);
>>
>>          xen_netbk_tx_submit(netbk);
>> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
>> index 410018c..938e908 100644
>> --- a/drivers/net/xen-netback/xenbus.c
>> +++ b/drivers/net/xen-netback/xenbus.c
>> @@ -114,6 +114,13 @@ static int netback_probe(struct xenbus_device *dev,
>>                          goto abort_transaction;
>>                  }
>>
>> +               err = xenbus_printf(xbt, dev->nodename,
>> +                                   "feature-persistent-grants", "%u", 1);
>> +               if (err) {
>> +                       message = "writing feature-persistent-grants";
>> +                       goto abort_transaction;
>> +               }
>> +
>>                  err = xenbus_transaction_end(xbt, 0);
>>          } while (err == -EAGAIN);
>>
>> @@ -453,7 +460,12 @@ static int connect_rings(struct backend_info *be)
>>                  val = 0;
>>          vif->csum = !val;
>>
>> -       /* Map the shared frame, irq etc. */
>> +       if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent-grants",
>> +                        "%u",&val)<  0)
> In block devices "feature-persistent" is used, so I think that for
> clearness it should be announced the same way in net.
Is it  "feature-persistent" ? I checked your RFC patch, the key is 
"feature-persistent-grants".

Thanks
Annie
>
>> +               val = 0;
>> +       vif->persistent_grant = !!val;
>> +
>> +/* Map the shared frame, irq etc. */
>>          err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn);
>>          if (err) {
>>                  xenbus_dev_fatal(dev, err,
>> --
>> 1.7.3.4
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
>>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 2/4] xen/netback: Split one page pool into two(tx/rx) page pool.
  2012-11-15  9:15   ` Ian Campbell
@ 2012-11-16  3:10     ` ANNIE LI
  0 siblings, 0 replies; 42+ messages in thread
From: ANNIE LI @ 2012-11-16  3:10 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, netdev, konrad.wilk



On 2012-11-15 17:15, Ian Campbell wrote:
> On Thu, 2012-11-15 at 07:04 +0000, Annie Li wrote:
>> For tx path, this implementation simplifies the work of searching out
>> grant page from page pool based on grant reference.
> It's still a linear search though, and it doesn't look much simpler to
> me:
>    	for (i = 0; i<  count; i++) {
> 		if (tx_pool)
> 			vif = netbk->gnttab_tx_vif[i];
> 		else
> 			vif = netbk->gnttab_rx_vif[i];
>
> 		pers_entry = vif->persistent_gnt;
> 		gnt_count =&vif->persistent_gntcnt;
> 		gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS;
> becomes:
> 	for (i = 0; i<  count; i++) {
> 		if (tx_pool) {
> 			vif = netbk->gnttab_tx_vif[i];
> 			gnt_count =&vif->persistent_tx_gntcnt;
> 			gnt_total = XEN_NETIF_TX_RING_SIZE;
> 			pers_entry = vif->persistent_tx_gnt;
> 		} else {
> 			vif = netbk->gnttab_rx_vif[i];
> 			gnt_count =&vif->persistent_rx_gntcnt;
> 			gnt_total = 2*XEN_NETIF_RX_RING_SIZE;
> 			pers_entry = vif->persistent_rx_gnt;
> 		}

Yes, the code is not simpler. If we make netback per-VIF based, then 
these code will disappear.
The simplifying here means for tx path, the max search index is 
XEN_NETIF_TX_RING_SIZE(256 here), and this change can save some time 
when searching out grant page for specific grant reference.

>
>> @@ -111,8 +109,16 @@ struct xenvif {
>>
>>   	wait_queue_head_t waiting_to_free;
>>
>> -	struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];
>> -	unsigned int persistent_gntcnt;
>> +	struct persistent_entry *persistent_tx_gnt[XEN_NETIF_TX_RING_SIZE];
>> +
>> +	/*
>> +	 * 2*XEN_NETIF_RX_RING_SIZE is for the case of each head/fragment page
> Shouldn't that been incorporated into MAXIMUM_OUTSTANDING_BLOCK_REQS
> (sic) too?

Yes, the total value is same as MAXIMUM_OUTSTANDING_BLOCK_REQS. But here

2*XEN_NETIF_RX_RING_SIZE means it is only used by rx path, and it is used just like other elements in netback structure, such as grant_copy_op, meta, etc.


>> +	 * using 2 copy operations.
>> +	 */
>> +	struct persistent_entry *persistent_rx_gnt[2*XEN_NETIF_RX_RING_SIZE];
> What is the per-vif memory overhead after this change?

Per-vif memory overhead is following,
for tx path, it is about XEN_NETIF_RX_RING_SIZE*PAGE_SIZE  (256 
PAGE_SIZE here)
for rx path, it is about 2*XEN_NETIF_RX_RING_SIZE*PAGE_SIZE  (512 
PAGE_SIZE here)

I can add some comment here.

Thanks
Annie
>
> Ian.
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 3/4] Xen/netfront: Implement persistent grant in netfront.
  2012-11-15 10:52   ` [Xen-devel] " Roger Pau Monné
@ 2012-11-16  5:22     ` ANNIE LI
  2012-11-16  7:58       ` ANNIE LI
  0 siblings, 1 reply; 42+ messages in thread
From: ANNIE LI @ 2012-11-16  5:22 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, netdev, konrad.wilk, Ian Campbell



On 2012-11-15 18:52, Roger Pau Monné wrote:
> On 15/11/12 08:05, Annie Li wrote:
>> Tx/rx page pool are maintained. New grant is mapped and put into
>> pool, unmap only happens when releasing/removing device.
>>
>> Signed-off-by: Annie Li<annie.li@oracle.com>
>> ---
>>   drivers/net/xen-netfront.c |  372 +++++++++++++++++++++++++++++++++++++-------
>>   1 files changed, 315 insertions(+), 57 deletions(-)
>>
>> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
>> index 0ebbb19..17b81c0 100644
>> --- a/drivers/net/xen-netfront.c
>> +++ b/drivers/net/xen-netfront.c
>> @@ -79,6 +79,13 @@ struct netfront_stats {
>>          struct u64_stats_sync   syncp;
>>   };
>>
>> +struct gnt_list {
>> +       grant_ref_t             gref;
>> +       struct                  page *gnt_pages;
>> +       void                    *gnt_target;
>> +       struct                  gnt_list *tail;
>> +};
> This could also be shared with blkfront.

Netfront does not have the shadow like blkfront, and it needs the 
gnt_target to save skb address of rx path. So we can share this too? 
blkfront would not use it actually.

>
>> +
>>   struct netfront_info {
>>          struct list_head list;
>>          struct net_device *netdev;
>> @@ -109,6 +116,10 @@ struct netfront_info {
>>          grant_ref_t grant_tx_ref[NET_TX_RING_SIZE];
>>          unsigned tx_skb_freelist;
>>
>> +       struct gnt_list *tx_grant[NET_TX_RING_SIZE];
>> +       struct gnt_list *tx_gnt_list;
>> +       unsigned int tx_gnt_cnt;
> I don't understand this, why do you need both an array and a list?

The array tx_grant is just like other tx_skbs, grant_tx_ref. It saves 
grant entries corresponding every request in the ring. This is what 
netfront different from blkfront, netfront does not have shadow ring, 
and it only uses a ring size array to track every request in the ring.
The list is like a pool to save all available persistent grants.

> I'm
> not familiar with net code, so I don't know if this is some kind of
> special netfront thing?

Yes, this is different from blkfront. netfront uses ring size arrays to 
track every request in the ring.

>
> Anyway if you have to use a list I would recommend using one of the list
> constructions that's already in the kernel, it simplifies the code and
> makes it more easy to understand than creating your own list structure.

Ok, thanks.

>
>> +
>>          spinlock_t   rx_lock ____cacheline_aligned_in_smp;
>>          struct xen_netif_rx_front_ring rx;
>>          int rx_ring_ref;
>> @@ -126,6 +137,10 @@ struct netfront_info {
>>          grant_ref_t gref_rx_head;
>>          grant_ref_t grant_rx_ref[NET_RX_RING_SIZE];
>>
>> +       struct gnt_list *rx_grant[NET_RX_RING_SIZE];
>> +       struct gnt_list *rx_gnt_list;
>> +       unsigned int rx_gnt_cnt;
> Same comment above here.

Same as above.

>
>> +
>>          unsigned long rx_pfn_array[NET_RX_RING_SIZE];
>>          struct multicall_entry rx_mcl[NET_RX_RING_SIZE+1];
>>          struct mmu_update rx_mmu[NET_RX_RING_SIZE];
>> @@ -134,6 +149,7 @@ struct netfront_info {
>>          struct netfront_stats __percpu *stats;
>>
>>          unsigned long rx_gso_checksum_fixup;
>> +       u8 persistent_gnt:1;
>>   };
>>
>>   struct netfront_rx_info {
>> @@ -194,6 +210,16 @@ static grant_ref_t xennet_get_rx_ref(struct netfront_info *np,
>>          return ref;
>>   }
>>
>> +static struct gnt_list *xennet_get_rx_grant(struct netfront_info *np,
>> +                                           RING_IDX ri)
>> +{
>> +       int i = xennet_rxidx(ri);
>> +       struct gnt_list *gntlist = np->rx_grant[i];
>> +       np->rx_grant[i] = NULL;
> Ok, I think I get why do you need both an array and a list, is that
> because netfront doesn't have some kind of shadow ring to keep track of
> issued requests?

Yes.

>
> So each issued request has an associated gnt_list with the list of used
> grants?

gnt_list is kind of free grants. It is like a pool of free grants. If 
free grants exist in this list, free grant will be gotten from this 
list. If no, new grant will be allocated. In xennet_tx_buf_gc, free 
grants will be put into the list again if response status is OK.

> If so it would be good to add a comment about it.
>
>> +       return gntlist;
>> +}
>> +
>>   #ifdef CONFIG_SYSFS
>>   static int xennet_sysfs_addif(struct net_device *netdev);
>>   static void xennet_sysfs_delif(struct net_device *netdev);
>> @@ -231,6 +257,68 @@ static void xennet_maybe_wake_tx(struct net_device *dev)
>>                  netif_wake_queue(dev);
>>   }
>>
>> +static grant_ref_t xennet_alloc_rx_ref(struct net_device *dev,
>> +                                      unsigned long mfn, void *vaddr,
>> +                                      unsigned int id,
>> +                                      grant_ref_t ref)
>> +{
>> +       struct netfront_info *np = netdev_priv(dev);
>> +       grant_ref_t gnt_ref;
>> +       struct gnt_list *gnt_list_entry;
>> +
>> +       if (np->persistent_gnt&&  np->rx_gnt_cnt) {
>> +               gnt_list_entry = np->rx_gnt_list;
>> +               np->rx_gnt_list = np->rx_gnt_list->tail;
>> +               np->rx_gnt_cnt--;
>> +
>> +               gnt_list_entry->gnt_target = vaddr;
>> +               gnt_ref = gnt_list_entry->gref;
>> +               np->rx_grant[id] = gnt_list_entry;
>> +       } else {
>> +               struct page *page;
>> +
>> +               BUG_ON(!np->persistent_gnt&&  np->rx_gnt_cnt);
>> +               if (!ref)
>> +                       gnt_ref =
>> +                               gnttab_claim_grant_reference(&np->gref_rx_head);
>> +               else
>> +                       gnt_ref = ref;
>> +               BUG_ON((signed short)gnt_ref<  0);
>> +
>> +               if (np->persistent_gnt) {
> So you are only using persistent grants if the backend also supports
> them.

Current implementation is:
If netback supports persistent grant, the frontend will work with 
persistent grant feature too.
If netback does not support persistent grant, the frontend will work 
without persistent grant feature.

> Have you benchmarked the performance of a persistent frontend with
> a non-persistent backend.

I  remember I did some test before, not so sure. Will check it.

>   In the block case, usign a persistent frontend
> with a non-persistent backend let to an overall performance improvement,
> so blkfront uses persistent grants even if blkback doesn't support them.
> Take a look at the following graph:
>
> http://xenbits.xen.org/people/royger/persistent_grants/nonpers_read.png

Good idea, that makes sense. I will change netfront too, thanks.

>
>> +                       page = alloc_page(GFP_KERNEL);
>> +                       if (!page) {
>> +                               if (!ref)
>> +                                       gnttab_release_grant_reference(
>> +&np->gref_rx_head, ref);
>> +                               return -ENOMEM;
>> +                       }
>> +                       mfn = pfn_to_mfn(page_to_pfn(page));
>> +
>> +                       gnt_list_entry = kmalloc(sizeof(struct gnt_list),
>> +                                                GFP_KERNEL);
>> +                       if (!gnt_list_entry) {
>> +                               __free_page(page);
>> +                               if (!ref)
>> +                                       gnttab_release_grant_reference(
>> +&np->gref_rx_head, ref);
>> +                               return -ENOMEM;
>> +                       }
>> +                       gnt_list_entry->gref = gnt_ref;
>> +                       gnt_list_entry->gnt_pages = page;
>> +                       gnt_list_entry->gnt_target = vaddr;
>> +
>> +                       np->rx_grant[id] = gnt_list_entry;
>> +               }
>> +
>> +               gnttab_grant_foreign_access_ref(gnt_ref, np->xbdev->otherend_id,
>> +                                               mfn, 0);
>> +       }
>> +       np->grant_rx_ref[id] = gnt_ref;
>> +
>> +       return gnt_ref;
>> +}
>> +
>>   static void xennet_alloc_rx_buffers(struct net_device *dev)
>>   {
>>          unsigned short id;
>> @@ -240,8 +328,6 @@ static void xennet_alloc_rx_buffers(struct net_device *dev)
>>          int i, batch_target, notify;
>>          RING_IDX req_prod = np->rx.req_prod_pvt;
>>          grant_ref_t ref;
>> -       unsigned long pfn;
>> -       void *vaddr;
>>          struct xen_netif_rx_request *req;
>>
>>          if (unlikely(!netif_carrier_ok(dev)))
>> @@ -306,19 +392,16 @@ no_skb:
>>                  BUG_ON(np->rx_skbs[id]);
>>                  np->rx_skbs[id] = skb;
>>
>> -               ref = gnttab_claim_grant_reference(&np->gref_rx_head);
>> -               BUG_ON((signed short)ref<  0);
>> -               np->grant_rx_ref[id] = ref;
>> +               page = skb_frag_page(&skb_shinfo(skb)->frags[0]);
>>
>> -               pfn = page_to_pfn(skb_frag_page(&skb_shinfo(skb)->frags[0]));
>> -               vaddr = page_address(skb_frag_page(&skb_shinfo(skb)->frags[0]));
>> +               ref = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)),
>> +                                         page_address(page), id, 0);
>> +               if ((signed short)ref<  0) {
>> +                       __skb_queue_tail(&np->rx_batch, skb);
>> +                       break;
>> +               }
>>
>>                  req = RING_GET_REQUEST(&np->rx, req_prod + i);
>> -               gnttab_grant_foreign_access_ref(ref,
>> -                                               np->xbdev->otherend_id,
>> -                                               pfn_to_mfn(pfn),
>> -                                               0);
>> -
>>                  req->id = id;
>>                  req->gref = ref;
>>          }
>> @@ -375,17 +458,30 @@ static void xennet_tx_buf_gc(struct net_device *dev)
>>
>>                          id  = txrsp->id;
>>                          skb = np->tx_skbs[id].skb;
>> -                       if (unlikely(gnttab_query_foreign_access(
>> -                               np->grant_tx_ref[id]) != 0)) {
>> -                               printk(KERN_ALERT "xennet_tx_buf_gc: warning "
>> -                                      "-- grant still in use by backend "
>> -                                      "domain.\n");
>> -                               BUG();
>> +
>> +                       if (np->persistent_gnt) {
>> +                               struct gnt_list *gnt_list_entry;
>> +
>> +                               gnt_list_entry = np->tx_grant[id];
>> +                               BUG_ON(!gnt_list_entry);
>> +
>> +                               gnt_list_entry->tail = np->tx_gnt_list;
>> +                               np->tx_gnt_list = gnt_list_entry;
>> +                               np->tx_gnt_cnt++;
>> +                       } else {
>> +                               if (unlikely(gnttab_query_foreign_access(
>> +                                       np->grant_tx_ref[id]) != 0)) {
>> +                                       printk(KERN_ALERT "xennet_tx_buf_gc: warning "
>> +                                              "-- grant still in use by backend "
>> +                                              "domain.\n");
>> +                                       BUG();
>> +                               }
>> +
>> +                               gnttab_end_foreign_access_ref(
>> +                                       np->grant_tx_ref[id], GNTMAP_readonly);
> If I've read the code correctly, you are giving this frame both
> read/write permissions to the other end on xennet_alloc_tx_ref, but then
> you are only removing the read permissions? (see comment below on the
> xennet_alloc_tx_ref function).

Yes, this is a bug.
For non persistent grant, it should remove the read permissions. For 
persistent grant, it should remove both.
As mentioned above, it is better to enable persistent grant, I will 
change code and not consider non persistent grant.
See comments below about why needing both permissions in 
xennet_alloc_tx_ref.

>
>> +                               gnttab_release_grant_reference(
>> +&np->gref_tx_head, np->grant_tx_ref[id]);
>>                          }
>> -                       gnttab_end_foreign_access_ref(
>> -                               np->grant_tx_ref[id], GNTMAP_readonly);
>> -                       gnttab_release_grant_reference(
>> -&np->gref_tx_head, np->grant_tx_ref[id]);
>>                          np->grant_tx_ref[id] = GRANT_INVALID_REF;
>>                          add_id_to_freelist(&np->tx_skb_freelist, np->tx_skbs, id);
>>                          dev_kfree_skb_irq(skb);
>> @@ -409,6 +505,59 @@ static void xennet_tx_buf_gc(struct net_device *dev)
>>          xennet_maybe_wake_tx(dev);
>>   }
>>
>> +static grant_ref_t xennet_alloc_tx_ref(struct net_device *dev,
>> +                                      unsigned long mfn,
>> +                                      unsigned int id)
>> +{
>> +       struct netfront_info *np = netdev_priv(dev);
>> +       grant_ref_t ref;
>> +       struct page *granted_page;
>> +
>> +       if (np->persistent_gnt&&  np->tx_gnt_cnt) {
>> +               struct gnt_list *gnt_list_entry;
>> +
>> +               gnt_list_entry = np->tx_gnt_list;
>> +               np->tx_gnt_list = np->tx_gnt_list->tail;
>> +               np->tx_gnt_cnt--;
>> +
>> +               ref = gnt_list_entry->gref;
>> +               np->tx_grant[id] = gnt_list_entry;
>> +       } else {
>> +               struct gnt_list *gnt_list_entry;
>> +
>> +               BUG_ON(!np->persistent_gnt&&  np->tx_gnt_cnt);
>> +               ref = gnttab_claim_grant_reference(&np->gref_tx_head);
>> +               BUG_ON((signed short)ref<  0);
>> +
>> +               if (np->persistent_gnt) {
>> +                       granted_page = alloc_page(GFP_KERNEL);
>> +                       if (!granted_page) {
>> +                               gnttab_release_grant_reference(
>> +&np->gref_tx_head, ref);
>> +                               return -ENOMEM;
>> +                       }
>> +
>> +                       mfn = pfn_to_mfn(page_to_pfn(granted_page));
>> +                       gnt_list_entry = kmalloc(sizeof(struct gnt_list),
>> +                                                GFP_KERNEL);
>> +                       if (!gnt_list_entry) {
>> +                               __free_page(granted_page);
>> +                               gnttab_release_grant_reference(
>> +&np->gref_tx_head, ref);
>> +                               return -ENOMEM;
>> +                       }
>> +
>> +                       gnt_list_entry->gref = ref;
>> +                       gnt_list_entry->gnt_pages = granted_page;
>> +                       np->tx_grant[id] = gnt_list_entry;
>> +               }
>> +               gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id,
>> +                                               mfn, 0);
> If you are not always using persistent grants I guess you need to give
> read only permissions to this frame (GNTMAP_readonly).

We can not use GNTMAP_readonly here because tx path packet data will be 
copied into these persistent grant pages. Mabbe it is better to use 
GNTMAP_readonly for nonpersistent and 0 for persistent grant.
As mentioned above, it is better to enable persistent grant, I will 
change code and not consider non persistent grant.

>   Also, for keeping
> things in logical order, isn't it best that this function comes before
> xennet_tx_buf_gc?

xennet_alloc_tx_ref is called by following function xennet_make_frags, 
so I assume xennet_alloc_tx_ref is better to be close to 
xennet_make_frags. Xennet_tx_buf_gc does not have any connection with 
xennet_alloc_tx_ref, did I miss something?

>
>> +       }
>> +
>> +       return ref;
>> +}
>> +
>> @@ -1132,8 +1357,10 @@ static void xennet_release_rx_bufs(struct netfront_info *np)
>>                  }
>>
>>                  skb = np->rx_skbs[id];
>> -               mfn = gnttab_end_foreign_transfer_ref(ref);
>> -               gnttab_release_grant_reference(&np->gref_rx_head, ref);
>> +               if (!np->persistent_gnt) {
>> +                       mfn = gnttab_end_foreign_transfer_ref(ref);
>> +                       gnttab_release_grant_reference(&np->gref_rx_head, ref);
>> +               }
>>                  np->grant_rx_ref[id] = GRANT_INVALID_REF;
>>
>>                  if (0 == mfn) {
>> @@ -1607,6 +1834,13 @@ again:
>>                  goto abort_transaction;
>>          }
>>
>> +       err = xenbus_printf(xbt, dev->nodename, "feature-persistent-grants",
>> +                           "%u", info->persistent_gnt);
> As in netback, I think "feature-persistent" should be used.

Same in blkback, I assume it is  "feature-persistent-grants", right?
I referred your RFC patch, did you change it later? Or I missed something?

Thanks
Annie
>
>> +       if (err) {
>> +               message = "writing feature-persistent-grants";
>> +               xenbus_dev_fatal(dev, err, "%s", message);
>> +       }
>> +
>>          err = xenbus_transaction_end(xbt, 0);
>>          if (err) {
>>                  if (err == -EAGAIN)
>> @@ -1634,6 +1868,7 @@ static int xennet_connect(struct net_device *dev)
>>          grant_ref_t ref;
>>          struct xen_netif_rx_request *req;
>>          unsigned int feature_rx_copy;
>> +       int ret, val;
>>
>>          err = xenbus_scanf(XBT_NIL, np->xbdev->otherend,
>>                             "feature-rx-copy", "%u",&feature_rx_copy);
>> @@ -1646,6 +1881,13 @@ static int xennet_connect(struct net_device *dev)
>>                  return -ENODEV;
>>          }
>>
>> +       err = xenbus_scanf(XBT_NIL, np->xbdev->otherend,
>> +                          "feature-persistent-grants", "%u",&val);
>> +       if (err != 1)
>> +               val = 0;
>> +
>> +       np->persistent_gnt = !!val;
>> +
>>          err = talk_to_netback(np->xbdev, np);
>>          if (err)
>>                  return err;
>> @@ -1657,9 +1899,24 @@ static int xennet_connect(struct net_device *dev)
>>          spin_lock_bh(&np->rx_lock);
>>          spin_lock_irq(&np->tx_lock);
>>
>> +       np->tx_gnt_cnt = 0;
>> +       np->rx_gnt_cnt = 0;
>> +
>>          /* Step 1: Discard all pending TX packet fragments. */
>>          xennet_release_tx_bufs(np);
>>
>> +       if (np->persistent_gnt) {
>> +               struct gnt_list *gnt_list_entry;
>> +
>> +               while (np->rx_gnt_list) {
>> +                       gnt_list_entry = np->rx_gnt_list;
>> +                       np->rx_gnt_list = np->rx_gnt_list->tail;
>> +                       gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL);
>> +                       __free_page(gnt_list_entry->gnt_pages);
>> +                       kfree(gnt_list_entry);
>> +               }
>> +       }
>> +
>>          /* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */
>>          for (requeue_idx = 0, i = 0; i<  NET_RX_RING_SIZE; i++) {
>>                  skb_frag_t *frag;
>> @@ -1673,10 +1930,11 @@ static int xennet_connect(struct net_device *dev)
>>
>>                  frag =&skb_shinfo(skb)->frags[0];
>>                  page = skb_frag_page(frag);
>> -               gnttab_grant_foreign_access_ref(
>> -                       ref, np->xbdev->otherend_id,
>> -                       pfn_to_mfn(page_to_pfn(page)),
>> -                       0);
>> +               ret = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)),
>> +                                         page_address(page), requeue_idx, ref);
>> +               if ((signed short)ret<  0)
>> +                       break;
>> +
>>                  req->gref = ref;
>>                  req->id   = requeue_idx;
>>
>> --
>> 1.7.3.4
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
>>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
  2012-11-16  2:49     ` ANNIE LI
@ 2012-11-16  7:57       ` ANNIE LI
  2012-11-16  9:32       ` Ian Campbell
  1 sibling, 0 replies; 42+ messages in thread
From: ANNIE LI @ 2012-11-16  7:57 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, netdev, konrad.wilk, Ian Campbell



On 2012-11-16 10:49, ANNIE LI wrote:
>
>
> On 2012-11-15 17:57, Roger Pau Monné wrote:
>>>
>>> @@ -453,7 +460,12 @@ static int connect_rings(struct backend_info *be)
>>>                  val = 0;
>>>          vif->csum = !val;
>>>
>>> -       /* Map the shared frame, irq etc. */
>>> +       if (xenbus_scanf(XBT_NIL, dev->otherend, 
>>> "feature-persistent-grants",
>>> +                        "%u",&val)<  0)
>> In block devices "feature-persistent" is used, so I think that for
>> clearness it should be announced the same way in net.
> Is it  "feature-persistent" ? I checked your RFC patch, the key is 
> "feature-persistent-grants".
>
>
My mistake.
In your v2 patch, it is "feature-persistent". I will change the code as 
blkback/blkfront.

Thanks
Annie

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 3/4] Xen/netfront: Implement persistent grant in netfront.
  2012-11-16  5:22     ` ANNIE LI
@ 2012-11-16  7:58       ` ANNIE LI
  0 siblings, 0 replies; 42+ messages in thread
From: ANNIE LI @ 2012-11-16  7:58 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: netdev, xen-devel, Ian Campbell, konrad.wilk



On 2012-11-16 13:22, ANNIE LI wrote:
>
>
> On 2012-11-15 18:52, Roger Pau Monné wrote:
>>> +       err = xenbus_printf(xbt, dev->nodename, 
>>> "feature-persistent-grants",
>>> +                           "%u", info->persistent_gnt);
>> As in netback, I think "feature-persistent" should be used.
>
> Same in blkback, I assume it is  "feature-persistent-grants", right?
> I referred your RFC patch, did you change it later? Or I missed 
> something?
>
>
My mistake.
In your v2 patch, it is "feature-persistent". I will change the code as 
blkback/blkfront.

Thanks
Annie

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
  2012-11-16  2:18     ` [Xen-devel] " ANNIE LI
@ 2012-11-16  9:27       ` Ian Campbell
  2012-11-16  9:55         ` ANNIE LI
  0 siblings, 1 reply; 42+ messages in thread
From: Ian Campbell @ 2012-11-16  9:27 UTC (permalink / raw)
  To: ANNIE LI; +Cc: xen-devel, netdev, konrad.wilk

On Fri, 2012-11-16 at 02:18 +0000, ANNIE LI wrote:
> In this patch,
> The maximum of memory overhead is about
> 
> (XEN_NETIF_TX_RING_SIZE+XEN_NETIF_RX_RING_SIZE)*PAGE_SIZE  (plus size of grant_ref_t and handle)
> which is about 512 PAGE_SIZE. Normally, without heavy network offload, this maximum can not be reached.
> 
> In next patch of splitting tx/rx pool, the maximum is about

"about" or just "is"?

>  (256+512)PAGE_SIZE.

IOW 3MB.

> >
> >> +
> >> +       return NULL;
> >> +}
> >> +
> >> @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
> >>                  gop->source.domid = vif->domid;
> >>                  gop->source.offset = txreq.offset;
> >>
> >> -               gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> >> +               if (!vif->persistent_grant)
> >> +                       gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> >> +               else
> >> +                       gop->dest.u.gmfn = (unsigned long)page_address(page);
> > page_address doesn't return any sort of frame number, does it? This is
> > rather confusing...
> 
> Yes. I only use dest.u.gmfn element to save the page_address here for 
> future memcpy, and it does not mean to use frame number actually. To 
> avoid confusion, here I can use
> 
> gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> 
> and then call mfn_to_virt when doing memcpy.

It seems a bit odd to be using the gop structure in this way when you
aren't actually doing a grant op on it. 

While investigating I noticed:
+static int
+grant_memory_copy_op(unsigned int cmd, void *vuop, unsigned int count,
+                    struct xen_netbk *netbk, bool tx_pool)
...
+       struct gnttab_copy *uop = vuop;

Why void *vuop? Why not struct gnttab_copy * in the parameter?

I also noticed your new grant_memory_copy_op() seems to have unbatched
the grant ops in the non-persistent case, which is going to suck for
performance in non-persistent mode. You need to pull the conditional and
the HYPERVISOR_grant_table_op outside the loop and pass it full array
instead of doing them one at a time.

Ian

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
  2012-11-16  2:49     ` ANNIE LI
  2012-11-16  7:57       ` ANNIE LI
@ 2012-11-16  9:32       ` Ian Campbell
  2012-11-16 11:34         ` ANNIE LI
  1 sibling, 1 reply; 42+ messages in thread
From: Ian Campbell @ 2012-11-16  9:32 UTC (permalink / raw)
  To: ANNIE LI; +Cc: Roger Pau Monne, xen-devel, netdev, konrad.wilk

On Fri, 2012-11-16 at 02:49 +0000, ANNIE LI wrote:
> >
> > Take a look at the following functions from blkback; foreach_grant,
> > add_persistent_gnt and get_persistent_gnt. They are generic functions to
> > deal with persistent grants.
> 
> Ok, thanks.
> Or moving those functions into a separate common file?

Please put them somewhere common.

> > This is highly inefficient, one of the points of using gnttab_set_map_op
> > is that you can queue a bunch of grants, and then map them at the same
> > time using gnttab_map_refs, but here you are using it to map a single
> > grant at a time. You should instead see how much grants you need to map
> > to complete the request and map them all at the same time.
> 
> Yes, it is inefficient here. But this is limited by current netback
> implementation. Current netback is not per-VIF based(not like blkback
> does). After combining persistent grant and non persistent grant
> together, every vif request in the queue may/may not support persistent
> grant. I have to judge whether every vif in the queue supports
> persistent grant or not. If it support, memcpy is used, if not,
> grantcopy is used.

You could (and should) still batch all the grant copies into one
hypercall, e.g. walk the list either doing memcpy or queuing up copyops
as appropriate, then at the end if the queue is non-zero length issue
the hypercall.

I'd expect this lack of batching here and in the other case I just
spotted to have a detrimental affect on guests running with this patch
but not using persistent grants. Did you benchmark that case?

> After making netback per-VIF works, this issue can be fixed.

You've mentioned improvements which are conditional on this work a few
times I think, perhaps it makes sense to make that change first?

Ian.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
  2012-11-16  9:27       ` Ian Campbell
@ 2012-11-16  9:55         ` ANNIE LI
  0 siblings, 0 replies; 42+ messages in thread
From: ANNIE LI @ 2012-11-16  9:55 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, netdev, konrad.wilk



On 2012-11-16 17:27, Ian Campbell wrote:
> On Fri, 2012-11-16 at 02:18 +0000, ANNIE LI wrote:
>> In this patch,
>> The maximum of memory overhead is about
>>
>> (XEN_NETIF_TX_RING_SIZE+XEN_NETIF_RX_RING_SIZE)*PAGE_SIZE  (plus size of grant_ref_t and handle)
>> which is about 512 PAGE_SIZE. Normally, without heavy network offload, this maximum can not be reached.
>>
>> In next patch of splitting tx/rx pool, the maximum is about
> "about" or just "is"?

For only grant pages, it is this value. I took into account other 
element of grant_ref_t and map(change to handle in future)....

>
>>   (256+512)PAGE_SIZE.
> IOW 3MB.
>
>>>> +
>>>> +       return NULL;
>>>> +}
>>>> +
>>>> @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>>>>                   gop->source.domid = vif->domid;
>>>>                   gop->source.offset = txreq.offset;
>>>>
>>>> -               gop->dest.u.gmfn = virt_to_mfn(page_address(page));
>>>> +               if (!vif->persistent_grant)
>>>> +                       gop->dest.u.gmfn = virt_to_mfn(page_address(page));
>>>> +               else
>>>> +                       gop->dest.u.gmfn = (unsigned long)page_address(page);
>>> page_address doesn't return any sort of frame number, does it? This is
>>> rather confusing...
>> Yes. I only use dest.u.gmfn element to save the page_address here for
>> future memcpy, and it does not mean to use frame number actually. To
>> avoid confusion, here I can use
>>
>> gop->dest.u.gmfn = virt_to_mfn(page_address(page));
>>
>> and then call mfn_to_virt when doing memcpy.
> It seems a bit odd to be using the gop structure in this way when you
> aren't actually doing a grant op on it.
>
> While investigating I noticed:
> +static int
> +grant_memory_copy_op(unsigned int cmd, void *vuop, unsigned int count,
> +                    struct xen_netbk *netbk, bool tx_pool)
> ...
> +       struct gnttab_copy *uop = vuop;
>
> Why void *vuop? Why not struct gnttab_copy * in the parameter?

Sorry, my mistake.

>
> I also noticed your new grant_memory_copy_op() seems to have unbatched
> the grant ops in the non-persistent case, which is going to suck for
> performance in non-persistent mode. You need to pull the conditional and
> the HYPERVISOR_grant_table_op outside the loop and pass it full array
> instead of doing them one at a time.

This still connects with netback per-VIF implementation.
Currently, these could not be pulled out outside since netback queue may 
contains persistent and nonpersistent in the same queue. I did consider 
to implement per-VIF first and then the persistent grant,
but thinking of it is part of wei's patch combined with other patches, 
and finally decided to implement per-VIF later.

But this does limit implementation of persistent grant.

Thanks
Annie
>
> Ian
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  7:03 [PATCH 0/4] Implement persistent grant in xen-netfront/netback Annie Li
                   ` (6 preceding siblings ...)
  2012-11-15  8:56 ` Ian Campbell
@ 2012-11-16  9:57 ` Ian Campbell
  2012-11-16 11:37   ` ANNIE LI
  2012-11-16 15:18 ` Konrad Rzeszutek Wilk
  8 siblings, 1 reply; 42+ messages in thread
From: Ian Campbell @ 2012-11-16  9:57 UTC (permalink / raw)
  To: Annie Li; +Cc: xen-devel, netdev, konrad.wilk

On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
> This patch implements persistent grants for xen-netfront/netback.

Hang on a sec. It has just occurred to me that netfront/netback in the
current mainline kernels don't currently use grant maps at all, they use
grant copy on both the tx and rx paths.

The supposed benefit of persistent grants is to avoid the TLB shootdowns
on grant unmap, but in the current code there should be exactly zero of
those.

If I understand correctly this patch goes from using grant copy
operations to persistently mapping frames and then using memcpy on those
buffers to copy in/out to local buffers. I'm finding it hard to think of
a reason why this should perform any better, do you have a theory which
explains it? (my best theory is that it has a beneficial impact on where
the cache locality of the data, but netperf doesn't typically actually
access the data so I'm not sure why that would matter)

Also AIUI this is also doing persistent grants for both Tx and Rx
directions?

For guest Rx does this mean it now copies twice, in dom0 from the DMA
buffer to the guest provided buffer and then again in the guest from the
granted buffer to a normal one?

For guest Tx how do you handle the lifecycle of the grant mapped pages
which are being sent up into the dom0 network stack? Or are you also now
copying twice in this case? (i.e. guest copies into a granted buffer and
dom0 copies out into a local buffer?)

Did you do measurement of the Tx and Rx cases independently? Do you know
that they both benefit from this change (rather than for example an
improvement in one direction masking a regression in the other). Were
the numbers you previously posted in one particular direction or did you
measure both?

Ian.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
  2012-11-16  9:32       ` Ian Campbell
@ 2012-11-16 11:34         ` ANNIE LI
  0 siblings, 0 replies; 42+ messages in thread
From: ANNIE LI @ 2012-11-16 11:34 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Roger Pau Monne, xen-devel, netdev, konrad.wilk



On 2012-11-16 17:32, Ian Campbell wrote:
> On Fri, 2012-11-16 at 02:49 +0000, ANNIE LI wrote:
>>> Take a look at the following functions from blkback; foreach_grant,
>>> add_persistent_gnt and get_persistent_gnt. They are generic functions to
>>> deal with persistent grants.
>> Ok, thanks.
>> Or moving those functions into a separate common file?
> Please put them somewhere common.

Ok.

>>> This is highly inefficient, one of the points of using gnttab_set_map_op
>>> is that you can queue a bunch of grants, and then map them at the same
>>> time using gnttab_map_refs, but here you are using it to map a single
>>> grant at a time. You should instead see how much grants you need to map
>>> to complete the request and map them all at the same time.
>> Yes, it is inefficient here. But this is limited by current netback
>> implementation. Current netback is not per-VIF based(not like blkback
>> does). After combining persistent grant and non persistent grant
>> together, every vif request in the queue may/may not support persistent
>> grant. I have to judge whether every vif in the queue supports
>> persistent grant or not. If it support, memcpy is used, if not,
>> grantcopy is used.
> You could (and should) still batch all the grant copies into one
> hypercall, e.g. walk the list either doing memcpy or queuing up copyops
> as appropriate, then at the end if the queue is non-zero length issue
> the hypercall.

This still connects with netback per-VIF implementation.

> I'd expect this lack of batching here and in the other case I just
> spotted to have a detrimental affect on guests running with this patch
> but not using persistent grants. Did you benchmark that case?

I did some test before.
But I'd better to create more detailed result under in different case.

>> After making netback per-VIF works, this issue can be fixed.
> You've mentioned improvements which are conditional on this work a few
> times I think, perhaps it makes sense to make that change first?

Yes, I did consider to implement per-VIF first before persistent grant. 
But thinking of it is part of wei's patch and combined with other 
patches, and decided to implement it later. But making the change first 
would make things more clear.

Thanks
Annie
> Ian.
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message tomajordomo@vger.kernel.org
> More majordomo info athttp://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-16  9:57 ` Ian Campbell
@ 2012-11-16 11:37   ` ANNIE LI
  2012-11-16 11:46     ` Ian Campbell
  0 siblings, 1 reply; 42+ messages in thread
From: ANNIE LI @ 2012-11-16 11:37 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, netdev, konrad.wilk



On 2012-11-16 17:57, Ian Campbell wrote:
> On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
>> This patch implements persistent grants for xen-netfront/netback.
> Hang on a sec. It has just occurred to me that netfront/netback in the
> current mainline kernels don't currently use grant maps at all, they use
> grant copy on both the tx and rx paths.

Ah, this patch is based on v3.4-rc3.
Current mainline kernel does not pass the netperf/netserver case. As I 
mentioned earlier, I hit BUG_ON with your debug patch too when testing 
mainline kernel with netperf/netserver.
This is interesting, I should have check the latest code.

>
> The supposed benefit of persistent grants is to avoid the TLB shootdowns
> on grant unmap, but in the current code there should be exactly zero of
> those.

Is there any performance document about current grant copy code in 
mainline kernel?

>
> If I understand correctly this patch goes from using grant copy
> operations to persistently mapping frames and then using memcpy on those
> buffers to copy in/out to local buffers. I'm finding it hard to think of
> a reason why this should perform any better, do you have a theory which
> explains it?

This patch is aiming to fix spin lock issue of grant operations, it 
comes out to avoid possible grant operations(including grant map and copy).

> (my best theory is that it has a beneficial impact on where
> the cache locality of the data, but netperf doesn't typically actually
> access the data so I'm not sure why that would matter)
>
> Also AIUI this is also doing persistent grants for both Tx and Rx
> directions?

Yes.

>
> For guest Rx does this mean it now copies twice, in dom0 from the DMA
> buffer to the guest provided buffer and then again in the guest from the
> granted buffer to a normal one?

Yes.

>
> For guest Tx how do you handle the lifecycle of the grant mapped pages
> which are being sent up into the dom0 network stack? Or are you also now
> copying twice in this case? (i.e. guest copies into a granted buffer and
> dom0 copies out into a local buffer?)

Copy twice: guest copies into a granted buffer and dom0 copies out into 
a local buffer.

>
> Did you do measurement of the Tx and Rx cases independently?

No.

> Do you know
> that they both benefit from this change (rather than for example an
> improvement in one direction masking a regression in the other).

On theory, this implementation avoid spinlock issue of grant operation, 
so they should both benefit from it.

> Were
> the numbers you previously posted in one particular direction or did you
> measure both?

One particular direction, one runs as server, the other runs as client.

Thanks
Annie
>
> Ian.
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-16 11:37   ` ANNIE LI
@ 2012-11-16 11:46     ` Ian Campbell
  2012-11-17  4:39       ` annie li
  0 siblings, 1 reply; 42+ messages in thread
From: Ian Campbell @ 2012-11-16 11:46 UTC (permalink / raw)
  To: ANNIE LI; +Cc: xen-devel, netdev, konrad.wilk

On Fri, 2012-11-16 at 11:37 +0000, ANNIE LI wrote:
> 
> On 2012-11-16 17:57, Ian Campbell wrote:
> > On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
> >> This patch implements persistent grants for xen-netfront/netback.
> > Hang on a sec. It has just occurred to me that netfront/netback in the
> > current mainline kernels don't currently use grant maps at all, they use
> > grant copy on both the tx and rx paths.
> 
> Ah, this patch is based on v3.4-rc3.

Nothing has changed in more recent kernels in this regard.

> >
> > The supposed benefit of persistent grants is to avoid the TLB shootdowns
> > on grant unmap, but in the current code there should be exactly zero of
> > those.
> 
> Is there any performance document about current grant copy code in 
> mainline kernel?

Not AFAIK.

> > If I understand correctly this patch goes from using grant copy
> > operations to persistently mapping frames and then using memcpy on those
> > buffers to copy in/out to local buffers. I'm finding it hard to think of
> > a reason why this should perform any better, do you have a theory which
> > explains it?
> 
> This patch is aiming to fix spin lock issue of grant operations, it 
> comes out to avoid possible grant operations(including grant map and copy).

Makes sense. This is the sort of thing which ought to feature
prominently in commit messages and/or introductory mails.

> > Do you know
> > that they both benefit from this change (rather than for example an
> > improvement in one direction masking a regression in the other).
> 
> On theory, this implementation avoid spinlock issue of grant operation, 
> so they should both benefit from it.

It seems like having netfront simply allocate itself a pool of grant
references which it reuses would give equivalent benefits whilst being a
smaller patch, with no protocol change and avoiding double copying. In
fact by avoiding the double copy I'd expect it to be even better.

> > Were
> > the numbers you previously posted in one particular direction or did you
> > measure both?
> 
> One particular direction, one runs as server, the other runs as client.

I think you need to measure both dom0->domU and domU->dom0 to get the
full picture since AIUI netperf sends the bulk data in only one
direction with just ACKs coming back the other way.

Ian.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  7:03 [PATCH 0/4] Implement persistent grant in xen-netfront/netback Annie Li
                   ` (7 preceding siblings ...)
  2012-11-16  9:57 ` Ian Campbell
@ 2012-11-16 15:18 ` Konrad Rzeszutek Wilk
  8 siblings, 0 replies; 42+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-16 15:18 UTC (permalink / raw)
  To: Annie Li; +Cc: xen-devel, netdev, Ian.Campbell

On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> This patch implements persistent grants for xen-netfront/netback. This
> mechanism maintains page pools in netback/netfront, these page pools is used to
> save grant pages which are mapped. This way improve performance which is wasted
> when doing grant operations.
> 
> Current netback/netfront does map/unmap grant operations frequently when
> transmitting/receiving packets, and grant operations costs much cpu clock. In
> this patch, netfront/netback maps grant pages when needed and then saves them
> into a page pool for future use. All these pages will be unmapped when
> removing/releasing the net device.
> 
> In netfront, two pools are maintained for transmitting and receiving packets.
> When new grant pages are needed, the driver gets grant pages from this pool
> first. If no free grant page exists, it allocates new page, maps it and then
> saves it into the pool. The pool size for transmit/receive is exactly tx/rx
> ring size. The driver uses memcpy(not grantcopy) to copy data grant pages.
> Here, memcpy is copying the whole page size data. I tried to copy len size data
> from offset, but network does not seem work well. I am trying to find the root
> cause now.
> 
> In netback, it also maintains two page pools for tx/rx. When netback gets a
> request, it does a search first to find out whether the grant reference of
> this request is already mapped into its page pool. If the grant ref is mapped,
> the address of this mapped page is gotten and memcpy is used to copy data
> between grant pages. However, if the grant ref is not mapped, a new page is
> allocated, mapped with this grant ref, and then saved into page pool for
> future use. Similarly, memcpy replaces grant copy to copy data between grant
> pages. In this implementation, two arrays(gnttab_tx_vif,gnttab_rx_vif) are
> used to save vif pointer for every request because current netback is not
> per-vif based. This would be changed after implementing 1:1 model in netback.
> 
> This patch supports both persistent-grant and non persistent grant. A new
> xenstore key "feature-persistent-grants" is used to represent this feature.
> 
> This patch is based on linux3.4-rc3. I hit netperf/netserver failure on
> linux latest version v3.7-rc1, v3.7-rc2 and v3.7-rc4. Not sure whether this
> netperf/netserver failure connects compound page commit in v3.7-rc1, but I did
> hit BUG_ON with debug patch from thread
> http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.html

FYI, I get this:

 477.814511] BUG: sleeping function called from invalid context at /home/konrad/ssd/linux/mm/page_alloc.c:2487
[  477.815281] in_atomic(): 1, irqs_disabled(): 1, pid: 3017, name: netperf
[  477.815281] Pid: 3017, comm: netperf Not tainted 3.5.0upstream-00004-g69047bb #1
[  477.815281] Call Trace:
[  477.815281]  [<ffffffff810b990a>] __might_sleep+0xda/0x100
[  477.815281]  [<ffffffff81142e93>] __alloc_pages_nodemask+0x223/0x920
[  477.815281]  [<ffffffff81158439>] ? zone_statistics+0x99/0xc0
[  477.815281]  [<ffffffff81076e79>] ? default_spin_lock_flags+0x9/0x10
[  477.815281]  [<ffffffff81615e3a>] ? _raw_spin_lock_irqsave+0x3a/0x50
[  477.815281]  [<ffffffff81076e79>] ? default_spin_lock_flags+0x9/0x10
[  477.815281]  [<ffffffff81098977>] ? lock_timer_base+0x37/0x70
[  477.815281]  [<ffffffff8109a03d>] ? mod_timer_pending+0x11d/0x230
[  477.815281]  [<ffffffff81616144>] ? _raw_spin_unlock_bh+0x24/0x30
[  477.815281]  [<ffffffff8117e7e1>] alloc_pages_current+0xb1/0x110
[  477.815281]  [<ffffffffa0034238>] xennet_alloc_tx_ref+0x78/0x1c0 [xen_netfront]
[  477.815281]  [<ffffffffa00344eb>] xennet_start_xmit+0x16b/0x9f0 [xen_netfront]
[  477.815281]  [<ffffffff814c69eb>] dev_hard_start_xmit+0x2fb/0x6f0
[  477.815281]  [<ffffffff814e4566>] sch_direct_xmit+0x116/0x1e0
[  477.815281]  [<ffffffff814c6f6a>] dev_queue_xmit+0x18a/0x6b0
[  477.815281]  [<ffffffff8151264e>] ip_finish_output+0x18e/0x300
[  477.815281]  [<ffffffff81512821>] ip_output+0x61/0xa0
[  477.815281]  [<ffffffff81511b82>] ? __ip_local_out+0xa2/0xb0
[  477.815281]  [<ffffffff81511bb4>] ip_local_out+0x24/0x30
[  477.815281]  [<ffffffff81511ffe>] ip_queue_xmit+0x15e/0x410
[  477.815281]  [<ffffffff81528354>] tcp_transmit_skb+0x424/0x8f0
[  477.815281]  [<ffffffff8152a8c2>] tcp_write_xmit+0x1f2/0x9c0
[  477.815281]  [<ffffffff81182194>] ? ksize+0x14/0x70
[  477.815281]  [<ffffffff8152b711>] __tcp_push_pending_frames+0x21/0x90
[  477.815281]  [<ffffffff8151db23>] tcp_sendmsg+0x983/0xcd0
[  477.815281]  [<ffffffff81540daf>] inet_sendmsg+0x7f/0xd0
[  477.815281]  [<ffffffff81290dde>] ? selinux_socket_sendmsg+0x1e/0x20
[  477.815281]  [<ffffffff814aed13>] sock_sendmsg+0xf3/0x120
[  477.815281]  [<ffffffff81076f48>] ? pvclock_clocksource_read+0x58/0xd0
[  477.815281]  [<ffffffff812de7c0>] ? timerqueue_add+0x60/0xb0
[  477.815281]  [<ffffffff810b0b85>] ? enqueue_hrtimer+0x25/0xb0
[  477.815281]  [<ffffffff814af4d4>] sys_sendto+0x104/0x140
[  477.815281]  [<ffffffff81041279>] ? xen_clocksource_read+0x39/0x50
[  477.815281]  [<ffffffff81041419>] ? xen_clocksource_get_cycles+0x9/0x10
[  477.815281]  [<ffffffff810d3242>] ? getnstimeofday+0x52/0xe0
[  477.815281]  [<ffffffff8161dfb9>] system_call_fastpath+0x16/0x1b

> 
> 
> Annie Li (4):
>   xen/netback: implements persistent grant with one page pool.
>   xen/netback: Split one page pool into two(tx/rx) page pool.
>   Xen/netfront: Implement persistent grant in netfront.
>   fix code indent issue in xen-netfront.
> 
>  drivers/net/xen-netback/common.h    |   24 ++-
>  drivers/net/xen-netback/interface.c |   26 +++
>  drivers/net/xen-netback/netback.c   |  215 ++++++++++++++++++--
>  drivers/net/xen-netback/xenbus.c    |   14 ++-
>  drivers/net/xen-netfront.c          |  378 +++++++++++++++++++++++++++++------
>  5 files changed, 570 insertions(+), 87 deletions(-)
> 
> -- 
> 1.7.3.4

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15 10:56     ` Roger Pau Monné
  2012-11-15 11:14       ` ANNIE LI
  2012-11-15 11:15       ` Ian Campbell
@ 2012-11-16 15:21       ` Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 42+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-16 15:21 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: ANNIE LI, Pasi Kärkkäinen, netdev, xen-devel, Ian Campbell

On Thu, Nov 15, 2012 at 11:56:17AM +0100, Roger Pau Monné wrote:
> On 15/11/12 09:38, ANNIE LI wrote:
> > 
> > 
> > On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> >> Hello,
> >>
> >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> >>> This patch implements persistent grants for xen-netfront/netback. This
> >>> mechanism maintains page pools in netback/netfront, these page pools is used to
> >>> save grant pages which are mapped. This way improve performance which is wasted
> >>> when doing grant operations.
> >>>
> >>> Current netback/netfront does map/unmap grant operations frequently when
> >>> transmitting/receiving packets, and grant operations costs much cpu clock. In
> >>> this patch, netfront/netback maps grant pages when needed and then saves them
> >>> into a page pool for future use. All these pages will be unmapped when
> >>> removing/releasing the net device.
> >>>
> >> Do you have performance numbers available already? with/without persistent grants?
> > I have some simple netperf/netserver test result with/without persistent 
> > grants,
> > 
> > Following is result of with persistent grant patch,
> > 
> > Guests, Sum,      Avg,     Min,     Max
> >   1,  15106.4,  15106.4, 15106.36, 15106.36
> >   2,  13052.7,  6526.34,  6261.81,  6790.86
> >   3,  12675.1,  6337.53,  6220.24,  6454.83
> >   4,  13194,  6596.98,  6274.70,  6919.25
> > 
> > 
> > Following are result of without persistent patch
> > 
> > Guests, Sum,     Avg,    Min,        Max
> >   1,  10864.1,  10864.1, 10864.10, 10864.10
> >   2,  10898.5,  5449.24,  4862.08,  6036.40
> >   3,  10734.5,  5367.26,  5261.43,  5473.08
> >   4,  10924,    5461.99,  5314.84,  5609.14
> 
> In the block case, performance improvement is seen when using a large
> number of guests, could you perform the same benchmark increasing the
> number of guests to 15?

Keep in mind that one of the things that is limiting these numbers is
that netback is very CPU intensive. So I think it could get much much
faster - but netback pegs at 100%. With Wei Liu's patches the CPU usage
did drop by 40% (this is when I tested the old netback with Wei's
netback patches)- so we should see even a further speed increase.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15 19:11           ` Ian Campbell
@ 2012-11-16 15:23             ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 42+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-16 15:23 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Roger Pau Monne, ANNIE LI, Pasi Kärkkäinen, netdev, xen-devel

On Thu, Nov 15, 2012 at 07:11:07PM +0000, Ian Campbell wrote:
> On Thu, 2012-11-15 at 18:29 +0000, Konrad Rzeszutek Wilk wrote:
> > On Thu, Nov 15, 2012 at 11:15:06AM +0000, Ian Campbell wrote:
> > > On Thu, 2012-11-15 at 10:56 +0000, Roger Pau Monne wrote:
> > > > On 15/11/12 09:38, ANNIE LI wrote:
> > > > > 
> > > > > 
> > > > > On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> > > > >> Hello,
> > > > >>
> > > > >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> > > > >>> This patch implements persistent grants for xen-netfront/netback. This
> > > > >>> mechanism maintains page pools in netback/netfront, these page pools is used to
> > > > >>> save grant pages which are mapped. This way improve performance which is wasted
> > > > >>> when doing grant operations.
> > > > >>>
> > > > >>> Current netback/netfront does map/unmap grant operations frequently when
> > > > >>> transmitting/receiving packets, and grant operations costs much cpu clock. In
> > > > >>> this patch, netfront/netback maps grant pages when needed and then saves them
> > > > >>> into a page pool for future use. All these pages will be unmapped when
> > > > >>> removing/releasing the net device.
> > > > >>>
> > > > >> Do you have performance numbers available already? with/without persistent grants?
> > > > > I have some simple netperf/netserver test result with/without persistent 
> > > > > grants,
> > > > > 
> > > > > Following is result of with persistent grant patch,
> > > > > 
> > > > > Guests, Sum,      Avg,     Min,     Max
> > > > >   1,  15106.4,  15106.4, 15106.36, 15106.36
> > > > >   2,  13052.7,  6526.34,  6261.81,  6790.86
> > > > >   3,  12675.1,  6337.53,  6220.24,  6454.83
> > > > >   4,  13194,  6596.98,  6274.70,  6919.25
> > > > > 
> > > > > 
> > > > > Following are result of without persistent patch
> > > > > 
> > > > > Guests, Sum,     Avg,    Min,        Max
> > > > >   1,  10864.1,  10864.1, 10864.10, 10864.10
> > > > >   2,  10898.5,  5449.24,  4862.08,  6036.40
> > > > >   3,  10734.5,  5367.26,  5261.43,  5473.08
> > > > >   4,  10924,    5461.99,  5314.84,  5609.14
> > > > 
> > > > In the block case, performance improvement is seen when using a large
> > > > number of guests, could you perform the same benchmark increasing the
> > > > number of guests to 15?
> > > 
> > > It would also be nice to see some analysis of the numbers which justify
> > > why this change is a good one without every reviewer having to evaluate
> > > the raw data themselves. In fact this should really be part of the
> > > commit message.
> > 
> > You mean like a nice graph, eh?
> 
> Together with an analysis of what it means and why it is a good thing,
> yes.

OK, lets put that on the TODO list for the next posting. In the meantime -
it sounds like you (the maintainer) are happy with the direction this is going.

The other things we want to do _after_ these patches is to look at the Wei
Liu patches and try to address the different reviewers comments.

The neat thing about them is that they have a concept of a page pool system.
And with persistent pages in both blkback and netback this gets more exciting.


> 
> Ian.
> 
> > 
> > I will run these patches on my 32GB box and see if I can give you
> > a nice PDF/jpg.
> > 
> > > 
> > > Ian.
> > > 
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-15  9:35     ` Wei Liu
  2012-11-15 11:12       ` [Xen-devel] " ANNIE LI
@ 2012-11-16 15:34       ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 42+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-16 15:34 UTC (permalink / raw)
  To: Wei Liu
  Cc: ANNIE LI, Pasi Kärkkäinen, netdev, xen-devel, Ian Campbell

On Thu, Nov 15, 2012 at 05:35:13PM +0800, Wei Liu wrote:
> On Thu, Nov 15, 2012 at 4:38 PM, ANNIE LI <annie.li@oracle.com> wrote:
> 
> >
> >
> > On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> >
> >> Hello,
> >>
> >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> >>
> >>> This patch implements persistent grants for xen-netfront/netback. This
> >>> mechanism maintains page pools in netback/netfront, these page pools is
> >>> used to
> >>> save grant pages which are mapped. This way improve performance which is
> >>> wasted
> >>> when doing grant operations.
> >>>
> >>> Current netback/netfront does map/unmap grant operations frequently when
> >>> transmitting/receiving packets, and grant operations costs much cpu
> >>> clock. In
> >>> this patch, netfront/netback maps grant pages when needed and then saves
> >>> them
> >>> into a page pool for future use. All these pages will be unmapped when
> >>> removing/releasing the net device.
> >>>
> >>>  Do you have performance numbers available already? with/without
> >> persistent grants?
> >>
> > I have some simple netperf/netserver test result with/without persistent
> > grants,
> >
> > Following is result of with persistent grant patch,
> >
> > Guests, Sum,      Avg,     Min,     Max
> >  1,  15106.4,  15106.4, 15106.36, 15106.36
> >  2,  13052.7,  6526.34,  6261.81,  6790.86
> >  3,  12675.1,  6337.53,  6220.24,  6454.83
> >  4,  13194,  6596.98,  6274.70,  6919.25
> >
> >
> > Following are result of without persistent patch
> >
> > Guests, Sum,     Avg,    Min,        Max
> >  1,  10864.1,  10864.1, 10864.10, 10864.10
> >  2,  10898.5,  5449.24,  4862.08,  6036.40
> >  3,  10734.5,  5367.26,  5261.43,  5473.08
> >  4,  10924,    5461.99,  5314.84,  5609.14
> >
> >
> >
> Interesting results. Have you tested how good it is on a 10G nic, i.e.
> guest sending packets
> through physical network to another host.

Not yet. This was done with two guests pounding each other. I am
setting two machines up for Annie so she can do that type of testing
and also with more guests.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
  2012-11-16 11:46     ` Ian Campbell
@ 2012-11-17  4:39       ` annie li
  0 siblings, 0 replies; 42+ messages in thread
From: annie li @ 2012-11-17  4:39 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, netdev, konrad.wilk



On 2012-11-16 19:46, Ian Campbell wrote:
>
> It seems like having netfront simply allocate itself a pool of grant
> references which it reuses would give equivalent benefits whilst being a
> smaller patch, with no protocol change and avoiding double copying. In
> fact by avoiding the double copy I'd expect it to be even better.
>
>   

OK, I can try this implementation.
And I'd better to compare the performance between double copy and this 
implementation to see how much grant lock affects grant copy too.

> I think you need to measure both dom0->domU and domU->dom0 to get the
> full picture since AIUI netperf sends the bulk data in only one
> direction with just ACKs coming back the other way.
>   
Yes.
My environment does not meet requirement of more VMs, I would do more 
thorough test after Konrad setups the environment.

Thanks
Annie

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2012-11-17  4:39 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-15  7:03 [PATCH 0/4] Implement persistent grant in xen-netfront/netback Annie Li
2012-11-15  7:04 ` [PATCH 1/4] xen/netback: implements persistent grant with one page pool Annie Li
2012-11-15  9:10   ` Ian Campbell
2012-11-16  2:18     ` [Xen-devel] " ANNIE LI
2012-11-16  9:27       ` Ian Campbell
2012-11-16  9:55         ` ANNIE LI
2012-11-15  9:57   ` Roger Pau Monné
2012-11-16  2:49     ` ANNIE LI
2012-11-16  7:57       ` ANNIE LI
2012-11-16  9:32       ` Ian Campbell
2012-11-16 11:34         ` ANNIE LI
2012-11-15  7:04 ` [PATCH 2/4] xen/netback: Split one page pool into two(tx/rx) " Annie Li
2012-11-15  9:15   ` Ian Campbell
2012-11-16  3:10     ` ANNIE LI
2012-11-15  7:05 ` [PATCH 3/4] Xen/netfront: Implement persistent grant in netfront Annie Li
2012-11-15 10:52   ` [Xen-devel] " Roger Pau Monné
2012-11-16  5:22     ` ANNIE LI
2012-11-16  7:58       ` ANNIE LI
2012-11-15  7:05 ` [PATCH 4/4] fix code indent issue in xen-netfront Annie Li
2012-11-15  7:40 ` [PATCH 0/4] Implement persistent grant in xen-netfront/netback Pasi Kärkkäinen
2012-11-15  8:38   ` [Xen-devel] " ANNIE LI
2012-11-15  8:51     ` Ian Campbell
2012-11-15  9:02       ` ANNIE LI
2012-11-15  9:35     ` Wei Liu
2012-11-15 11:12       ` [Xen-devel] " ANNIE LI
2012-11-16 15:34       ` Konrad Rzeszutek Wilk
2012-11-15 10:56     ` Roger Pau Monné
2012-11-15 11:14       ` ANNIE LI
2012-11-15 11:15       ` Ian Campbell
2012-11-15 18:29         ` Konrad Rzeszutek Wilk
2012-11-15 19:11           ` Ian Campbell
2012-11-16 15:23             ` Konrad Rzeszutek Wilk
2012-11-16 15:21       ` Konrad Rzeszutek Wilk
2012-11-15  8:53 ` Ian Campbell
2012-11-15 11:14   ` ANNIE LI
2012-11-15  8:56 ` Ian Campbell
2012-11-15 11:14   ` [Xen-devel] " ANNIE LI
2012-11-16  9:57 ` Ian Campbell
2012-11-16 11:37   ` ANNIE LI
2012-11-16 11:46     ` Ian Campbell
2012-11-17  4:39       ` annie li
2012-11-16 15:18 ` Konrad Rzeszutek Wilk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.