* [PATCH 0/8] Bugfix and mechanical works for Xen network driver
@ 2013-02-15 16:00 Wei Liu
From: Wei Liu @ 2013-02-15 16:00 UTC (permalink / raw)
  To: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li

This patch series contains a small bug fix plus mechanical work for the Xen
network driver.

 * bug fix: don't bind kthread to specific cpu core
 * allow host admin to unload netback
 * multi-page ring support
 * split event channels support


* [PATCH 1/8] netback: don't bind kthread to cpu
From: Wei Liu @ 2013-02-15 16:00 UTC (permalink / raw)
  To: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li; +Cc: Wei Liu

The initialization process assumes that the online CPUs are numbered from 0
to xen_netbk_group_nr-1, which is not always true.

As we only need a pool of worker threads, simply don't bind them to specific
CPUs.
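
For reference, a condensed sketch of the worker creation path after this
change (the kthread_create() call sits just outside the hunk below, so the
exact context is approximate and error handling is elided):

	netbk->task = kthread_create(xen_netbk_kthread, (void *)netbk,
				     "netback/%u", group);
	if (IS_ERR(netbk->task))
		goto failed_init;

	/* no kthread_bind(netbk->task, group) -- CPU "group" may not
	 * be online, so leave the thread free to run on any CPU */

	wake_up_process(netbk->task);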

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/netback.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 3ae49b1..db8d45a 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1729,8 +1729,6 @@ static int __init netback_init(void)
 			goto failed_init;
 		}
 
-		kthread_bind(netbk->task, group);
-
 		INIT_LIST_HEAD(&netbk->net_schedule_list);
 
 		spin_lock_init(&netbk->net_schedule_list_lock);
-- 
1.7.10.4

* [PATCH 2/8] netback: add module unload function
From: Wei Liu @ 2013-02-15 16:00 UTC (permalink / raw)
  To: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li; +Cc: Wei Liu

Enable users to unload the netback module. Users should make sure there is no
vif running before unloading.
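
With this applied the module can be removed once all vifs are gone, e.g.
(assuming the module was loaded as xen-netback):

	# modprobe -r xen-netback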

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/common.h  |    1 +
 drivers/net/xen-netback/netback.c |   18 ++++++++++++++++++
 drivers/net/xen-netback/xenbus.c  |    5 +++++
 3 files changed, 24 insertions(+)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 9d7f172..35d8772 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -120,6 +120,7 @@ void xenvif_get(struct xenvif *vif);
 void xenvif_put(struct xenvif *vif);
 
 int xenvif_xenbus_init(void);
+void xenvif_xenbus_exit(void);
 
 int xenvif_schedulable(struct xenvif *vif);
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index db8d45a..de59098 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1761,5 +1761,23 @@ failed_init:
 
 module_init(netback_init);
 
+static void __exit netback_exit(void)
+{
+	int group, i;
+	xenvif_xenbus_exit();
+	for (group = 0; group < xen_netbk_group_nr; group++) {
+		struct xen_netbk *netbk = &xen_netbk[group];
+		for (i = 0; i < MAX_PENDING_REQS; i++) {
+			if (netbk->mmap_pages[i])
+				__free_page(netbk->mmap_pages[i]);
+		}
+		del_timer_sync(&netbk->net_timer);
+		kthread_stop(netbk->task);
+	}
+	vfree(xen_netbk);
+}
+
+module_exit(netback_exit);
+
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_ALIAS("xen-backend:vif");
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 410018c..65d14f2 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -485,3 +485,8 @@ int xenvif_xenbus_init(void)
 {
 	return xenbus_register_backend(&netback_driver);
 }
+
+void xenvif_xenbus_exit(void)
+{
+	return xenbus_unregister_driver(&netback_driver);
+}
-- 
1.7.10.4

* [PATCH 3/8] netback: get/put module along with vif connect/disconnect
From: Wei Liu @ 2013-02-15 16:00 UTC (permalink / raw)
  To: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li; +Cc: Wei Liu

If there is a vif running and the user unloads netback, the guest's network
interface just mysteriously stops working. So we need to prevent unloading the
netback module while there is a vif running.

The disconnect function of a vif may get called by the generic framework even
before the vif connects, so there is an extra check on whether we actually need
to put the module when disconnecting a vif.
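
The resulting pairing, condensed from the diff below:

	xenvif_connect():
		__module_get(THIS_MODULE);	/* taken before any setup */
		...
	err:
		module_put(THIS_MODULE);	/* dropped again on failure */

	xenvif_disconnect():
		if (vif->irq)			/* set only by a successful
						   connect */
			need_module_put = 1;
		...
		if (need_module_put)
			module_put(THIS_MODULE);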

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/interface.c |   18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 221f426..db638e1 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -314,6 +314,8 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
 	if (vif->irq)
 		return 0;
 
+	__module_get(THIS_MODULE);
+
 	err = xen_netbk_map_frontend_rings(vif, tx_ring_ref, rx_ring_ref);
 	if (err < 0)
 		goto err;
@@ -341,6 +343,7 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
 err_unmap:
 	xen_netbk_unmap_frontend_rings(vif);
 err:
+	module_put(THIS_MODULE);
 	return err;
 }
 
@@ -358,18 +361,31 @@ void xenvif_carrier_off(struct xenvif *vif)
 
 void xenvif_disconnect(struct xenvif *vif)
 {
+	/*
+	 * This function may get called even before the vif connects.
+	 * Set need_module_put if vif->irq != 0, which means the vif
+	 * has already connected and we should call module_put to
+	 * balance the previous __module_get.
+	 */
+	int need_module_put = 0;
+
 	if (netif_carrier_ok(vif->dev))
 		xenvif_carrier_off(vif);
 
 	atomic_dec(&vif->refcnt);
 	wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
 
-	if (vif->irq)
+	if (vif->irq) {
 		unbind_from_irqhandler(vif->irq, vif);
+		need_module_put = 1;
+	}
 
 	unregister_netdev(vif->dev);
 
 	xen_netbk_unmap_frontend_rings(vif);
 
 	free_netdev(vif->dev);
+
+	if (need_module_put)
+		module_put(THIS_MODULE);
 }
-- 
1.7.10.4

* [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring
From: Wei Liu @ 2013-02-15 16:00 UTC (permalink / raw)
  To: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li
  Cc: Wei Liu, Roger Pau Monne, Stefano Stabellini, Mukesh Rathor

Extend the xenbus client interface so that a shared ring can span multiple
pages: the grant, map and unmap calls now take an array of grant references /
handles plus a page count instead of a single reference. Also bundle the
corresponding fixes for Xen frontends and backends in this patch.
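
For a single-page ring (the common case converted throughout this patch), a
caller now looks like this -- a sketch condensed from the blkfront/netfront
hunks below:

	int grefs[1];

	err = xenbus_grant_ring(dev, info->ring.sring, 1, grefs);
	if (err < 0)
		goto fail;
	info->ring_ref = grefs[0];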

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Roger Pau Monne <roger.pau@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 drivers/block/xen-blkback/xenbus.c |   14 +-
 drivers/block/xen-blkfront.c       |    6 +-
 drivers/net/xen-netback/netback.c  |    4 +-
 drivers/net/xen-netfront.c         |    9 +-
 drivers/pci/xen-pcifront.c         |    5 +-
 drivers/xen/xen-pciback/xenbus.c   |   10 +-
 drivers/xen/xenbus/xenbus_client.c |  314 ++++++++++++++++++++++++++----------
 include/xen/xenbus.h               |   17 +-
 8 files changed, 270 insertions(+), 109 deletions(-)

diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 6398072..384ff24 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -122,7 +122,8 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid)
 	return blkif;
 }
 
-static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page,
+static int xen_blkif_map(struct xen_blkif *blkif, int *shared_pages,
+			 int nr_pages,
 			 unsigned int evtchn)
 {
 	int err;
@@ -131,7 +132,8 @@ static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page,
 	if (blkif->irq)
 		return 0;
 
-	err = xenbus_map_ring_valloc(blkif->be->dev, shared_page, &blkif->blk_ring);
+	err = xenbus_map_ring_valloc(blkif->be->dev, shared_pages,
+				     nr_pages, &blkif->blk_ring);
 	if (err < 0)
 		return err;
 
@@ -726,7 +728,7 @@ again:
 static int connect_ring(struct backend_info *be)
 {
 	struct xenbus_device *dev = be->dev;
-	unsigned long ring_ref;
+	int ring_ref;
 	unsigned int evtchn;
 	unsigned int pers_grants;
 	char protocol[64] = "";
@@ -767,14 +769,14 @@ static int connect_ring(struct backend_info *be)
 	be->blkif->vbd.feature_gnt_persistent = pers_grants;
 	be->blkif->vbd.overflow_max_grants = 0;
 
-	pr_info(DRV_PFX "ring-ref %ld, event-channel %d, protocol %d (%s) %s\n",
+	pr_info(DRV_PFX "ring-ref %d, event-channel %d, protocol %d (%s) %s\n",
 		ring_ref, evtchn, be->blkif->blk_protocol, protocol,
 		pers_grants ? "persistent grants" : "");
 
 	/* Map the shared frame, irq etc. */
-	err = xen_blkif_map(be->blkif, ring_ref, evtchn);
+	err = xen_blkif_map(be->blkif, &ring_ref, 1, evtchn);
 	if (err) {
-		xenbus_dev_fatal(dev, err, "mapping ring-ref %lu port %u",
+		xenbus_dev_fatal(dev, err, "mapping ring-ref %u port %u",
 				 ring_ref, evtchn);
 		return err;
 	}
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 96e9b00..12c9ebd 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -991,6 +991,7 @@ static int setup_blkring(struct xenbus_device *dev,
 {
 	struct blkif_sring *sring;
 	int err;
+	int grefs[1];
 
 	info->ring_ref = GRANT_INVALID_REF;
 
@@ -1004,13 +1005,14 @@ static int setup_blkring(struct xenbus_device *dev,
 
 	sg_init_table(info->sg, BLKIF_MAX_SEGMENTS_PER_REQUEST);
 
-	err = xenbus_grant_ring(dev, virt_to_mfn(info->ring.sring));
+	err = xenbus_grant_ring(dev, info->ring.sring,
+				1, grefs);
 	if (err < 0) {
 		free_page((unsigned long)sring);
 		info->ring.sring = NULL;
 		goto fail;
 	}
-	info->ring_ref = err;
+	info->ring_ref = grefs[0];
 
 	err = xenbus_alloc_evtchn(dev, &info->evtchn);
 	if (err)
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index de59098..98ccea9 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1665,7 +1665,7 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
 	int err = -ENOMEM;
 
 	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
-				     tx_ring_ref, &addr);
+				     &tx_ring_ref, 1, &addr);
 	if (err)
 		goto err;
 
@@ -1673,7 +1673,7 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
 	BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE);
 
 	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
-				     rx_ring_ref, &addr);
+				     &rx_ring_ref, 1, &addr);
 	if (err)
 		goto err;
 
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 7ffa43b..8bd75a1 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1501,6 +1501,7 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
 	struct xen_netif_tx_sring *txs;
 	struct xen_netif_rx_sring *rxs;
 	int err;
+	int grefs[1];
 	struct net_device *netdev = info->netdev;
 
 	info->tx_ring_ref = GRANT_INVALID_REF;
@@ -1524,13 +1525,13 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
 	SHARED_RING_INIT(txs);
 	FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE);
 
-	err = xenbus_grant_ring(dev, virt_to_mfn(txs));
+	err = xenbus_grant_ring(dev, txs, 1, grefs);
 	if (err < 0) {
 		free_page((unsigned long)txs);
 		goto fail;
 	}
 
-	info->tx_ring_ref = err;
+	info->tx_ring_ref = grefs[0];
 	rxs = (struct xen_netif_rx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH);
 	if (!rxs) {
 		err = -ENOMEM;
@@ -1540,12 +1541,12 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
 	SHARED_RING_INIT(rxs);
 	FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE);
 
-	err = xenbus_grant_ring(dev, virt_to_mfn(rxs));
+	err = xenbus_grant_ring(dev, rxs, 1, grefs);
 	if (err < 0) {
 		free_page((unsigned long)rxs);
 		goto fail;
 	}
-	info->rx_ring_ref = err;
+	info->rx_ring_ref = grefs[0];
 
 	err = xenbus_alloc_evtchn(dev, &info->evtchn);
 	if (err)
diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
index 966abc6..016a2bb 100644
--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -772,12 +772,13 @@ static int pcifront_publish_info(struct pcifront_device *pdev)
 {
 	int err = 0;
 	struct xenbus_transaction trans;
+	int grefs[1];
 
-	err = xenbus_grant_ring(pdev->xdev, virt_to_mfn(pdev->sh_info));
+	err = xenbus_grant_ring(pdev->xdev, pdev->sh_info, 1, grefs);
 	if (err < 0)
 		goto out;
 
-	pdev->gnt_ref = err;
+	pdev->gnt_ref = grefs[0];
 
 	err = xenbus_alloc_evtchn(pdev->xdev, &pdev->evtchn);
 	if (err)
diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index 64b11f9..4655851 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -98,17 +98,17 @@ static void free_pdev(struct xen_pcibk_device *pdev)
 	kfree(pdev);
 }
 
-static int xen_pcibk_do_attach(struct xen_pcibk_device *pdev, int gnt_ref,
-			     int remote_evtchn)
+static int xen_pcibk_do_attach(struct xen_pcibk_device *pdev, int *gnt_ref,
+			       int nr_grefs, int remote_evtchn)
 {
 	int err = 0;
 	void *vaddr;
 
 	dev_dbg(&pdev->xdev->dev,
 		"Attaching to frontend resources - gnt_ref=%d evtchn=%d\n",
-		gnt_ref, remote_evtchn);
+		gnt_ref[0], remote_evtchn);
 
-	err = xenbus_map_ring_valloc(pdev->xdev, gnt_ref, &vaddr);
+	err = xenbus_map_ring_valloc(pdev->xdev, gnt_ref, nr_grefs, &vaddr);
 	if (err < 0) {
 		xenbus_dev_fatal(pdev->xdev, err,
 				"Error mapping other domain page in ours.");
@@ -172,7 +172,7 @@ static int xen_pcibk_attach(struct xen_pcibk_device *pdev)
 		goto out;
 	}
 
-	err = xen_pcibk_do_attach(pdev, gnt_ref, remote_evtchn);
+	err = xen_pcibk_do_attach(pdev, &gnt_ref, 1, remote_evtchn);
 	if (err)
 		goto out;
 
diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index 1bac743..7c1bd49 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -54,14 +54,16 @@ struct xenbus_map_node {
 		struct vm_struct *area; /* PV */
 		struct page *page;     /* HVM */
 	};
-	grant_handle_t handle;
+	grant_handle_t handle[XENBUS_MAX_RING_PAGES];
+	unsigned int   nr_handles;
 };
 
 static DEFINE_SPINLOCK(xenbus_valloc_lock);
 static LIST_HEAD(xenbus_valloc_pages);
 
 struct xenbus_ring_ops {
-	int (*map)(struct xenbus_device *dev, int gnt, void **vaddr);
+	int (*map)(struct xenbus_device *dev, int *gnt, int nr_gnts,
+		   void **vaddr);
 	int (*unmap)(struct xenbus_device *dev, void *vaddr);
 };
 
@@ -357,17 +359,39 @@ static void xenbus_switch_fatal(struct xenbus_device *dev, int depth, int err,
 /**
  * xenbus_grant_ring
  * @dev: xenbus device
- * @ring_mfn: mfn of ring to grant
-
- * Grant access to the given @ring_mfn to the peer of the given device.  Return
- * 0 on success, or -errno on error.  On error, the device will switch to
+ * @vaddr: starting virtual address of the ring
+ * @nr_pages: number of pages to be granted
+ * @grefs: grant reference array to be filled in
+ *
+ * Grant access to the given @vaddr to the peer of the given device.
+ * Then fill in @grefs with grant references.  Return 0 on success, or
+ * -errno on error.  On error, the device will switch to
  * XenbusStateClosing, and the error will be saved in the store.
  */
-int xenbus_grant_ring(struct xenbus_device *dev, unsigned long ring_mfn)
+int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
+		      int nr_pages, int *grefs)
 {
-	int err = gnttab_grant_foreign_access(dev->otherend_id, ring_mfn, 0);
-	if (err < 0)
-		xenbus_dev_fatal(dev, err, "granting access to ring page");
+	int i;
+	int err;
+
+	for (i = 0; i < nr_pages; i++) {
+		unsigned long addr = (unsigned long)vaddr +
+			(PAGE_SIZE * i);
+		err = gnttab_grant_foreign_access(dev->otherend_id,
+						  virt_to_mfn(addr), 0);
+		if (err < 0) {
+			xenbus_dev_fatal(dev, err,
+					 "granting access to ring page");
+			goto fail;
+		}
+		grefs[i] = err;
+	}
+
+	return 0;
+
+fail:
+	/* grefs[i] was never set for the failing page; start at i - 1 */
+	for (i--; i >= 0; i--)
+		gnttab_end_foreign_access_ref(grefs[i], 0);
 	return err;
 }
 EXPORT_SYMBOL_GPL(xenbus_grant_ring);
@@ -448,7 +472,8 @@ EXPORT_SYMBOL_GPL(xenbus_free_evtchn);
 /**
  * xenbus_map_ring_valloc
  * @dev: xenbus device
- * @gnt_ref: grant reference
+ * @gnt_ref: grant reference array
+ * @nr_grefs: number of grant references
  * @vaddr: pointer to address to be filled out by mapping
  *
  * Based on Rusty Russell's skeleton driver's map_page.
@@ -459,51 +484,61 @@ EXPORT_SYMBOL_GPL(xenbus_free_evtchn);
  * or -ENOMEM on error. If an error is returned, device will switch to
  * XenbusStateClosing and the error message will be saved in XenStore.
  */
-int xenbus_map_ring_valloc(struct xenbus_device *dev, int gnt_ref, void **vaddr)
+int xenbus_map_ring_valloc(struct xenbus_device *dev, int *gnt_ref,
+			   int nr_grefs, void **vaddr)
 {
-	return ring_ops->map(dev, gnt_ref, vaddr);
+	return ring_ops->map(dev, gnt_ref, nr_grefs, vaddr);
 }
 EXPORT_SYMBOL_GPL(xenbus_map_ring_valloc);
 
 static int xenbus_map_ring_valloc_pv(struct xenbus_device *dev,
-				     int gnt_ref, void **vaddr)
+				     int *gnt_ref, int nr_grefs, void **vaddr)
 {
-	struct gnttab_map_grant_ref op = {
-		.flags = GNTMAP_host_map | GNTMAP_contains_pte,
-		.ref   = gnt_ref,
-		.dom   = dev->otherend_id,
-	};
+	struct gnttab_map_grant_ref op;
 	struct xenbus_map_node *node;
 	struct vm_struct *area;
-	pte_t *pte;
+	pte_t *pte[XENBUS_MAX_RING_PAGES];
+	int i;
+	int err = GNTST_okay;
+	int vma_leaked; /* used in rollback */
 
 	*vaddr = NULL;
 
+	if (nr_grefs > XENBUS_MAX_RING_PAGES)
+		return -EINVAL;
+
 	node = kzalloc(sizeof(*node), GFP_KERNEL);
 	if (!node)
 		return -ENOMEM;
 
-	area = alloc_vm_area(PAGE_SIZE, &pte);
+	area = alloc_vm_area(PAGE_SIZE * nr_grefs, pte);
 	if (!area) {
 		kfree(node);
 		return -ENOMEM;
 	}
 
-	op.host_addr = arbitrary_virt_to_machine(pte).maddr;
-
-	if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
-		BUG();
-
-	if (op.status != GNTST_okay) {
-		free_vm_area(area);
-		kfree(node);
-		xenbus_dev_fatal(dev, op.status,
-				 "mapping in shared page %d from domain %d",
-				 gnt_ref, dev->otherend_id);
-		return op.status;
+	/* Issue one hypercall per entry; roll back if an error occurs. */
+	for (i = 0; i < nr_grefs; i++) {
+		op.flags = GNTMAP_host_map | GNTMAP_contains_pte;
+		op.ref   = gnt_ref[i];
+		op.dom   = dev->otherend_id;
+		op.host_addr = arbitrary_virt_to_machine(pte[i]).maddr;
+
+		if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
+			BUG();
+
+		if (op.status != GNTST_okay) {
+			err = op.status;
+			xenbus_dev_fatal(dev, op.status,
+				 "mapping in shared page (%d/%d) %d from domain %d",
+				 i+1, nr_grefs, gnt_ref[i], dev->otherend_id);
+			node->handle[i] = INVALID_GRANT_HANDLE;
+			goto rollback;
+		} else
+			node->handle[i] = op.handle;
 	}
 
-	node->handle = op.handle;
+	node->nr_handles = nr_grefs;
 	node->area = area;
 
 	spin_lock(&xenbus_valloc_lock);
@@ -512,31 +547,73 @@ static int xenbus_map_ring_valloc_pv(struct xenbus_device *dev,
 
 	*vaddr = area->addr;
 	return 0;
+
+rollback:
+	vma_leaked = 0;
+	for ( ; i >= 0; i--) {
+		if (node->handle[i] != INVALID_GRANT_HANDLE) {
+			struct gnttab_unmap_grant_ref unmap_op;
+			unmap_op.dev_bus_addr = 0;
+			unmap_op.host_addr =
+				arbitrary_virt_to_machine(pte[i]).maddr;
+			unmap_op.handle = node->handle[i];
+
+			if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
+						      &unmap_op, 1))
+				BUG();
+
+			if (unmap_op.status != GNTST_okay) {
+				pr_alert("rollback mapping (%d/%d) %d from domain %d, err = %d",
+					 i+1, nr_grefs, gnt_ref[i],
+					 dev->otherend_id,
+					 unmap_op.status);
+				vma_leaked = 1;
+			}
+			node->handle[i] = INVALID_GRANT_HANDLE;
+		}
+	}
+
+	if (!vma_leaked)
+		free_vm_area(area);
+	else
+		pr_alert("leaking vm area %p size %d page(s)", area, nr_grefs);
+
+	kfree(node);
+
+	return err;
 }
 
 static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
-				      int gnt_ref, void **vaddr)
+				      int *gnt_ref, int nr_grefs, void **vaddr)
 {
 	struct xenbus_map_node *node;
 	int err;
 	void *addr;
+	int vma_leaked = 0; /* may be tested at out_err before being set */
 
 	*vaddr = NULL;
 
+	if (nr_grefs > XENBUS_MAX_RING_PAGES)
+		return -EINVAL;
+
 	node = kzalloc(sizeof(*node), GFP_KERNEL);
 	if (!node)
 		return -ENOMEM;
 
-	err = alloc_xenballooned_pages(1, &node->page, false /* lowmem */);
+	err = alloc_xenballooned_pages(nr_grefs, &node->page,
+				       false /* lowmem */);
 	if (err)
 		goto out_err;
 
 	addr = pfn_to_kaddr(page_to_pfn(node->page));
 
-	err = xenbus_map_ring(dev, gnt_ref, &node->handle, addr);
+	err = xenbus_map_ring(dev, gnt_ref, nr_grefs, node->handle,
+			      addr, &vma_leaked);
 	if (err)
 		goto out_err;
 
+	node->nr_handles = nr_grefs;
+
 	spin_lock(&xenbus_valloc_lock);
 	list_add(&node->next, &xenbus_valloc_pages);
 	spin_unlock(&xenbus_valloc_lock);
@@ -545,7 +622,8 @@ static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
 	return 0;
 
  out_err:
-	free_xenballooned_pages(1, &node->page);
+	if (!vma_leaked)
+		free_xenballooned_pages(nr_grefs, &node->page);
 	kfree(node);
 	return err;
 }
@@ -554,36 +632,75 @@ static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
 /**
  * xenbus_map_ring
  * @dev: xenbus device
- * @gnt_ref: grant reference
+ * @gnt_ref: grant reference array
+ * @nr_grefs: number of grant references
  * @handle: pointer to grant handle to be filled
  * @vaddr: address to be mapped to
+ * @vma_leaked: set when a failed mapping could not be cleaned up and the vma leaked
  *
- * Map a page of memory into this domain from another domain's grant table.
+ * Map pages of memory into this domain from another domain's grant table.
  * xenbus_map_ring does not allocate the virtual address space (you must do
- * this yourself!). It only maps in the page to the specified address.
+ * this yourself!). It only maps in the pages to the specified address.
  * Returns 0 on success, and GNTST_* (see xen/include/interface/grant_table.h)
  * or -ENOMEM on error. If an error is returned, device will switch to
- * XenbusStateClosing and the error message will be saved in XenStore.
+ * XenbusStateClosing and the last error message will be saved in XenStore.
  */
-int xenbus_map_ring(struct xenbus_device *dev, int gnt_ref,
-		    grant_handle_t *handle, void *vaddr)
+int xenbus_map_ring(struct xenbus_device *dev, int *gnt_ref, int nr_grefs,
+		    grant_handle_t *handle, void *vaddr, int *vma_leaked)
 {
 	struct gnttab_map_grant_ref op;
+	int i;
+	int err = GNTST_okay;
+
+	for (i = 0; i < nr_grefs; i++) {
+		unsigned long addr = (unsigned long)vaddr +
+			(PAGE_SIZE * i);
+		gnttab_set_map_op(&op, (unsigned long)addr,
+				  GNTMAP_host_map, gnt_ref[i],
+				  dev->otherend_id);
+
+		if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref,
+					      &op, 1))
+			BUG();
+
+		if (op.status != GNTST_okay) {
+			xenbus_dev_fatal(dev, op.status,
+				 "mapping in shared page (%d/%d) %d from domain %d",
+				 i+1, nr_grefs, gnt_ref[i], dev->otherend_id);
+			handle[i] = INVALID_GRANT_HANDLE;
+			goto rollback;
+		} else
+			handle[i] = op.handle;
+	}
 
-	gnttab_set_map_op(&op, (unsigned long)vaddr, GNTMAP_host_map, gnt_ref,
-			  dev->otherend_id);
-
-	if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
-		BUG();
+	return 0;
 
-	if (op.status != GNTST_okay) {
-		xenbus_dev_fatal(dev, op.status,
-				 "mapping in shared page %d from domain %d",
-				 gnt_ref, dev->otherend_id);
-	} else
-		*handle = op.handle;
+rollback:
+	*vma_leaked = 0;
+	for ( ; i >= 0; i--) {
+		if (handle[i] != INVALID_GRANT_HANDLE) {
+			struct gnttab_unmap_grant_ref unmap_op;
+			unsigned long addr = (unsigned long)vaddr +
+				(PAGE_SIZE * i);
+			gnttab_set_unmap_op(&unmap_op, (phys_addr_t)addr,
+					    GNTMAP_host_map, handle[i]);
+
+			if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
+						      &unmap_op, 1))
+				BUG();
+
+			if (unmap_op.status != GNTST_okay) {
+				pr_alert("rollback mapping (%d/%d) %d from domain %d, err = %d",
+					 i+1, nr_grefs, gnt_ref[i],
+					 dev->otherend_id,
+					 unmap_op.status);
+				*vma_leaked = 1;
+			}
+			handle[i] = INVALID_GRANT_HANDLE;
+		}
+	}
 
-	return op.status;
+	return err;
 }
 EXPORT_SYMBOL_GPL(xenbus_map_ring);
 
@@ -609,10 +726,11 @@ EXPORT_SYMBOL_GPL(xenbus_unmap_ring_vfree);
 static int xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, void *vaddr)
 {
 	struct xenbus_map_node *node;
-	struct gnttab_unmap_grant_ref op = {
-		.host_addr = (unsigned long)vaddr,
-	};
+	struct gnttab_unmap_grant_ref op[XENBUS_MAX_RING_PAGES];
 	unsigned int level;
+	int i;
+	int last_error = GNTST_okay;
+	int vma_leaked;
 
 	spin_lock(&xenbus_valloc_lock);
 	list_for_each_entry(node, &xenbus_valloc_pages, next) {
@@ -631,22 +749,39 @@ static int xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, void *vaddr)
 		return GNTST_bad_virt_addr;
 	}
 
-	op.handle = node->handle;
-	op.host_addr = arbitrary_virt_to_machine(
-		lookup_address((unsigned long)vaddr, &level)).maddr;
+	for (i = 0; i < node->nr_handles; i++) {
+		unsigned long addr = (unsigned long)vaddr +
+			(PAGE_SIZE * i);
+		op[i].dev_bus_addr = 0;
+		op[i].handle = node->handle[i];
+		op[i].host_addr = arbitrary_virt_to_machine(
+			lookup_address((unsigned long)addr, &level)).maddr;
+	}
 
-	if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, &op, 1))
+	if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, op,
+				      node->nr_handles))
 		BUG();
 
-	if (op.status == GNTST_okay)
+	vma_leaked = 0;
+	for (i = 0; i < node->nr_handles; i++) {
+		if (op[i].status != GNTST_okay) {
+			last_error = op[i].status;
+			vma_leaked = 1;
+			xenbus_dev_error(dev, op[i].status,
+				 "unmapping page (%d/%d) at handle %d error %d",
+				 i+1, node->nr_handles, node->handle[i],
+				 op[i].status);
+		}
+	}
+
+	if (!vma_leaked)
 		free_vm_area(node->area);
 	else
-		xenbus_dev_error(dev, op.status,
-				 "unmapping page at handle %d error %d",
-				 node->handle, op.status);
+		pr_alert("leaking vm area %p size %d page(s)",
+			 node->area, node->nr_handles);
 
 	kfree(node);
-	return op.status;
+	return last_error;
 }
 
 static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
@@ -673,10 +808,10 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
 		return GNTST_bad_virt_addr;
 	}
 
-	rv = xenbus_unmap_ring(dev, node->handle, addr);
+	rv = xenbus_unmap_ring(dev, node->handle, node->nr_handles, addr);
 
 	if (!rv)
-		free_xenballooned_pages(1, &node->page);
+		free_xenballooned_pages(node->nr_handles, &node->page);
 	else
 		WARN(1, "Leaking %p\n", vaddr);
 
@@ -687,7 +822,8 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
 /**
  * xenbus_unmap_ring
  * @dev: xenbus device
- * @handle: grant handle
+ * @handle: grant handle array
+ * @nr_handles: number of grant handles
  * @vaddr: addr to unmap
  *
  * Unmap a page of memory in this domain that was imported from another domain.
@@ -695,21 +831,33 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
  * (see xen/include/interface/grant_table.h).
  */
 int xenbus_unmap_ring(struct xenbus_device *dev,
-		      grant_handle_t handle, void *vaddr)
+		      grant_handle_t *handle, int nr_handles,
+		      void *vaddr)
 {
 	struct gnttab_unmap_grant_ref op;
+	int last_error = GNTST_okay;
+	int i;
+
+	for (i = 0; i < nr_handles; i++) {
+		unsigned long addr = (unsigned long)vaddr +
+			(PAGE_SIZE * i);
+		gnttab_set_unmap_op(&op, (unsigned long)addr,
+				    GNTMAP_host_map, handle[i]);
+
+		if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
+					      &op, 1))
+			BUG();
+
+		if (op.status != GNTST_okay) {
+			xenbus_dev_error(dev, op.status,
+				 "unmapping page (%d/%d) at handle %d error %d",
+				 i+1, nr_handles, handle[i], op.status);
+			last_error = op.status;
+		}
+		/* invalidate only after reporting, so the message
+		 * shows the real handle */
+		handle[i] = INVALID_GRANT_HANDLE;
+	}
 
-	gnttab_set_unmap_op(&op, (unsigned long)vaddr, GNTMAP_host_map, handle);
-
-	if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, &op, 1))
-		BUG();
-
-	if (op.status != GNTST_okay)
-		xenbus_dev_error(dev, op.status,
-				 "unmapping page at handle %d error %d",
-				 handle, op.status);
-
-	return op.status;
+	return last_error;
 }
 EXPORT_SYMBOL_GPL(xenbus_unmap_ring);
 
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index 0a7515c..b7d9613 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -46,6 +46,11 @@
 #include <xen/interface/io/xenbus.h>
 #include <xen/interface/io/xs_wire.h>
 
+/* Max pages supported by multi-page ring in the backend */
+#define XENBUS_MAX_RING_PAGE_ORDER  2
+#define XENBUS_MAX_RING_PAGES       (1U << XENBUS_MAX_RING_PAGE_ORDER)
+#define INVALID_GRANT_HANDLE        (~0U)
+
 /* Register callback to watch this node. */
 struct xenbus_watch
 {
@@ -195,15 +200,17 @@ int xenbus_watch_pathfmt(struct xenbus_device *dev, struct xenbus_watch *watch,
 			 const char *pathfmt, ...);
 
 int xenbus_switch_state(struct xenbus_device *dev, enum xenbus_state new_state);
-int xenbus_grant_ring(struct xenbus_device *dev, unsigned long ring_mfn);
+int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
+		      int nr_pages, int *grefs);
 int xenbus_map_ring_valloc(struct xenbus_device *dev,
-			   int gnt_ref, void **vaddr);
-int xenbus_map_ring(struct xenbus_device *dev, int gnt_ref,
-			   grant_handle_t *handle, void *vaddr);
+			   int *gnt_ref, int nr_grefs, void **vaddr);
+int xenbus_map_ring(struct xenbus_device *dev, int *gnt_ref, int nr_grefs,
+		    grant_handle_t *handle, void *vaddr, int *vma_leaked);
 
 int xenbus_unmap_ring_vfree(struct xenbus_device *dev, void *vaddr);
 int xenbus_unmap_ring(struct xenbus_device *dev,
-		      grant_handle_t handle, void *vaddr);
+		      grant_handle_t *handle, int nr_handles,
+		      void *vaddr);
 
 int xenbus_alloc_evtchn(struct xenbus_device *dev, int *port);
 int xenbus_bind_evtchn(struct xenbus_device *dev, int remote_port, int *port);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring
  2013-02-15 16:00 [PATCH 0/8] Bugfix and mechanical works for Xen network driver Wei Liu
                   ` (6 preceding siblings ...)
  2013-02-15 16:00 ` [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring Wei Liu
@ 2013-02-15 16:00 ` Wei Liu
  2013-02-15 16:00 ` [PATCH 5/8] netback: multi-page ring support Wei Liu
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 91+ messages in thread
From: Wei Liu @ 2013-02-15 16:00 UTC (permalink / raw)
  To: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li
  Cc: Stefano Stabellini, Wei Liu, Roger Pau Monne

Also bundle fixes for xen frontends and backends in this patch.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Roger Pau Monne <roger.pau@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 drivers/block/xen-blkback/xenbus.c |   14 +-
 drivers/block/xen-blkfront.c       |    6 +-
 drivers/net/xen-netback/netback.c  |    4 +-
 drivers/net/xen-netfront.c         |    9 +-
 drivers/pci/xen-pcifront.c         |    5 +-
 drivers/xen/xen-pciback/xenbus.c   |   10 +-
 drivers/xen/xenbus/xenbus_client.c |  314 ++++++++++++++++++++++++++----------
 include/xen/xenbus.h               |   17 +-
 8 files changed, 270 insertions(+), 109 deletions(-)

diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 6398072..384ff24 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -122,7 +122,8 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid)
 	return blkif;
 }
 
-static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page,
+static int xen_blkif_map(struct xen_blkif *blkif, int *shared_pages,
+			 int nr_pages,
 			 unsigned int evtchn)
 {
 	int err;
@@ -131,7 +132,8 @@ static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page,
 	if (blkif->irq)
 		return 0;
 
-	err = xenbus_map_ring_valloc(blkif->be->dev, shared_page, &blkif->blk_ring);
+	err = xenbus_map_ring_valloc(blkif->be->dev, shared_pages,
+				     nr_pages, &blkif->blk_ring);
 	if (err < 0)
 		return err;
 
@@ -726,7 +728,7 @@ again:
 static int connect_ring(struct backend_info *be)
 {
 	struct xenbus_device *dev = be->dev;
-	unsigned long ring_ref;
+	int ring_ref;
 	unsigned int evtchn;
 	unsigned int pers_grants;
 	char protocol[64] = "";
@@ -767,14 +769,14 @@ static int connect_ring(struct backend_info *be)
 	be->blkif->vbd.feature_gnt_persistent = pers_grants;
 	be->blkif->vbd.overflow_max_grants = 0;
 
-	pr_info(DRV_PFX "ring-ref %ld, event-channel %d, protocol %d (%s) %s\n",
+	pr_info(DRV_PFX "ring-ref %d, event-channel %d, protocol %d (%s) %s\n",
 		ring_ref, evtchn, be->blkif->blk_protocol, protocol,
 		pers_grants ? "persistent grants" : "");
 
 	/* Map the shared frame, irq etc. */
-	err = xen_blkif_map(be->blkif, ring_ref, evtchn);
+	err = xen_blkif_map(be->blkif, &ring_ref, 1, evtchn);
 	if (err) {
-		xenbus_dev_fatal(dev, err, "mapping ring-ref %lu port %u",
+		xenbus_dev_fatal(dev, err, "mapping ring-ref %u port %u",
 				 ring_ref, evtchn);
 		return err;
 	}
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 96e9b00..12c9ebd 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -991,6 +991,7 @@ static int setup_blkring(struct xenbus_device *dev,
 {
 	struct blkif_sring *sring;
 	int err;
+	int grefs[1];
 
 	info->ring_ref = GRANT_INVALID_REF;
 
@@ -1004,13 +1005,14 @@ static int setup_blkring(struct xenbus_device *dev,
 
 	sg_init_table(info->sg, BLKIF_MAX_SEGMENTS_PER_REQUEST);
 
-	err = xenbus_grant_ring(dev, virt_to_mfn(info->ring.sring));
+	err = xenbus_grant_ring(dev, info->ring.sring,
+				1, grefs);
 	if (err < 0) {
 		free_page((unsigned long)sring);
 		info->ring.sring = NULL;
 		goto fail;
 	}
-	info->ring_ref = err;
+	info->ring_ref = grefs[0];
 
 	err = xenbus_alloc_evtchn(dev, &info->evtchn);
 	if (err)
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index de59098..98ccea9 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1665,7 +1665,7 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
 	int err = -ENOMEM;
 
 	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
-				     tx_ring_ref, &addr);
+				     &tx_ring_ref, 1, &addr);
 	if (err)
 		goto err;
 
@@ -1673,7 +1673,7 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
 	BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE);
 
 	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
-				     rx_ring_ref, &addr);
+				     &rx_ring_ref, 1, &addr);
 	if (err)
 		goto err;
 
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 7ffa43b..8bd75a1 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1501,6 +1501,7 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
 	struct xen_netif_tx_sring *txs;
 	struct xen_netif_rx_sring *rxs;
 	int err;
+	int grefs[1];
 	struct net_device *netdev = info->netdev;
 
 	info->tx_ring_ref = GRANT_INVALID_REF;
@@ -1524,13 +1525,13 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
 	SHARED_RING_INIT(txs);
 	FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE);
 
-	err = xenbus_grant_ring(dev, virt_to_mfn(txs));
+	err = xenbus_grant_ring(dev, txs, 1, grefs);
 	if (err < 0) {
 		free_page((unsigned long)txs);
 		goto fail;
 	}
 
-	info->tx_ring_ref = err;
+	info->tx_ring_ref = grefs[0];
 	rxs = (struct xen_netif_rx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH);
 	if (!rxs) {
 		err = -ENOMEM;
@@ -1540,12 +1541,12 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
 	SHARED_RING_INIT(rxs);
 	FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE);
 
-	err = xenbus_grant_ring(dev, virt_to_mfn(rxs));
+	err = xenbus_grant_ring(dev, rxs, 1, grefs);
 	if (err < 0) {
 		free_page((unsigned long)rxs);
 		goto fail;
 	}
-	info->rx_ring_ref = err;
+	info->rx_ring_ref = grefs[0];
 
 	err = xenbus_alloc_evtchn(dev, &info->evtchn);
 	if (err)
diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
index 966abc6..016a2bb 100644
--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -772,12 +772,13 @@ static int pcifront_publish_info(struct pcifront_device *pdev)
 {
 	int err = 0;
 	struct xenbus_transaction trans;
+	int grefs[1];
 
-	err = xenbus_grant_ring(pdev->xdev, virt_to_mfn(pdev->sh_info));
+	err = xenbus_grant_ring(pdev->xdev, pdev->sh_info, 1, grefs);
 	if (err < 0)
 		goto out;
 
-	pdev->gnt_ref = err;
+	pdev->gnt_ref = grefs[0];
 
 	err = xenbus_alloc_evtchn(pdev->xdev, &pdev->evtchn);
 	if (err)
diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index 64b11f9..4655851 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -98,17 +98,17 @@ static void free_pdev(struct xen_pcibk_device *pdev)
 	kfree(pdev);
 }
 
-static int xen_pcibk_do_attach(struct xen_pcibk_device *pdev, int gnt_ref,
-			     int remote_evtchn)
+static int xen_pcibk_do_attach(struct xen_pcibk_device *pdev, int *gnt_ref,
+			       int nr_grefs, int remote_evtchn)
 {
 	int err = 0;
 	void *vaddr;
 
 	dev_dbg(&pdev->xdev->dev,
 		"Attaching to frontend resources - gnt_ref=%d evtchn=%d\n",
-		gnt_ref, remote_evtchn);
+		gnt_ref[0], remote_evtchn);
 
-	err = xenbus_map_ring_valloc(pdev->xdev, gnt_ref, &vaddr);
+	err = xenbus_map_ring_valloc(pdev->xdev, gnt_ref, nr_grefs, &vaddr);
 	if (err < 0) {
 		xenbus_dev_fatal(pdev->xdev, err,
 				"Error mapping other domain page in ours.");
@@ -172,7 +172,7 @@ static int xen_pcibk_attach(struct xen_pcibk_device *pdev)
 		goto out;
 	}
 
-	err = xen_pcibk_do_attach(pdev, gnt_ref, remote_evtchn);
+	err = xen_pcibk_do_attach(pdev, &gnt_ref, 1, remote_evtchn);
 	if (err)
 		goto out;
 
diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index 1bac743..7c1bd49 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -54,14 +54,16 @@ struct xenbus_map_node {
 		struct vm_struct *area; /* PV */
 		struct page *page;     /* HVM */
 	};
-	grant_handle_t handle;
+	grant_handle_t handle[XENBUS_MAX_RING_PAGES];
+	unsigned int   nr_handles;
 };
 
 static DEFINE_SPINLOCK(xenbus_valloc_lock);
 static LIST_HEAD(xenbus_valloc_pages);
 
 struct xenbus_ring_ops {
-	int (*map)(struct xenbus_device *dev, int gnt, void **vaddr);
+	int (*map)(struct xenbus_device *dev, int *gnt, int nr_gnts,
+		   void **vaddr);
 	int (*unmap)(struct xenbus_device *dev, void *vaddr);
 };
 
@@ -357,17 +359,39 @@ static void xenbus_switch_fatal(struct xenbus_device *dev, int depth, int err,
 /**
  * xenbus_grant_ring
  * @dev: xenbus device
- * @ring_mfn: mfn of ring to grant
-
- * Grant access to the given @ring_mfn to the peer of the given device.  Return
- * 0 on success, or -errno on error.  On error, the device will switch to
+ * @vaddr: starting virtual address of the ring
+ * @nr_pages: number of pages to be granted
+ * @grefs: grant reference array to be filled in
+ *
+ * Grant access to the given @vaddr to the peer of the given device.
+ * Then fill in @grefs with grant references.  Return 0 on success, or
+ * -errno on error.  On error, the device will switch to
  * XenbusStateClosing, and the error will be saved in the store.
  */
-int xenbus_grant_ring(struct xenbus_device *dev, unsigned long ring_mfn)
+int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
+		      int nr_pages, int *grefs)
 {
-	int err = gnttab_grant_foreign_access(dev->otherend_id, ring_mfn, 0);
-	if (err < 0)
-		xenbus_dev_fatal(dev, err, "granting access to ring page");
+	int i;
+	int err;
+
+	for (i = 0; i < nr_pages; i++) {
+		unsigned long addr = (unsigned long)vaddr +
+			(PAGE_SIZE * i);
+		err = gnttab_grant_foreign_access(dev->otherend_id,
+						  virt_to_mfn(addr), 0);
+		if (err < 0) {
+			xenbus_dev_fatal(dev, err,
+					 "granting access to ring page");
+			goto fail;
+		}
+		grefs[i] = err;
+	}
+
+	return 0;
+
+fail:
+	for ( ; i >= 0; i--)
+		gnttab_end_foreign_access_ref(grefs[i], 0);
 	return err;
 }
 EXPORT_SYMBOL_GPL(xenbus_grant_ring);
@@ -448,7 +472,8 @@ EXPORT_SYMBOL_GPL(xenbus_free_evtchn);
 /**
  * xenbus_map_ring_valloc
  * @dev: xenbus device
- * @gnt_ref: grant reference
+ * @gnt_ref: grant reference array
+ * @nr_grefs: number of grant references
  * @vaddr: pointer to address to be filled out by mapping
  *
  * Based on Rusty Russell's skeleton driver's map_page.
@@ -459,51 +484,61 @@ EXPORT_SYMBOL_GPL(xenbus_free_evtchn);
  * or -ENOMEM on error. If an error is returned, device will switch to
  * XenbusStateClosing and the error message will be saved in XenStore.
  */
-int xenbus_map_ring_valloc(struct xenbus_device *dev, int gnt_ref, void **vaddr)
+int xenbus_map_ring_valloc(struct xenbus_device *dev, int *gnt_ref,
+			   int nr_grefs, void **vaddr)
 {
-	return ring_ops->map(dev, gnt_ref, vaddr);
+	return ring_ops->map(dev, gnt_ref, nr_grefs, vaddr);
 }
 EXPORT_SYMBOL_GPL(xenbus_map_ring_valloc);
 
 static int xenbus_map_ring_valloc_pv(struct xenbus_device *dev,
-				     int gnt_ref, void **vaddr)
+				     int *gnt_ref, int nr_grefs, void **vaddr)
 {
-	struct gnttab_map_grant_ref op = {
-		.flags = GNTMAP_host_map | GNTMAP_contains_pte,
-		.ref   = gnt_ref,
-		.dom   = dev->otherend_id,
-	};
+	struct gnttab_map_grant_ref op;
 	struct xenbus_map_node *node;
 	struct vm_struct *area;
-	pte_t *pte;
+	pte_t *pte[XENBUS_MAX_RING_PAGES];
+	int i;
+	int err = GNTST_okay;
+	int vma_leaked; /* used in rollback */
 
 	*vaddr = NULL;
 
+	if (nr_grefs > XENBUS_MAX_RING_PAGES)
+		return -EINVAL;
+
 	node = kzalloc(sizeof(*node), GFP_KERNEL);
 	if (!node)
 		return -ENOMEM;
 
-	area = alloc_vm_area(PAGE_SIZE, &pte);
+	area = alloc_vm_area(PAGE_SIZE * nr_grefs, pte);
 	if (!area) {
 		kfree(node);
 		return -ENOMEM;
 	}
 
-	op.host_addr = arbitrary_virt_to_machine(pte).maddr;
-
-	if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
-		BUG();
-
-	if (op.status != GNTST_okay) {
-		free_vm_area(area);
-		kfree(node);
-		xenbus_dev_fatal(dev, op.status,
-				 "mapping in shared page %d from domain %d",
-				 gnt_ref, dev->otherend_id);
-		return op.status;
+	/* Issue hypercall for individual entry, rollback if error occurs. */
+	for (i = 0; i < nr_grefs; i++) {
+		op.flags = GNTMAP_host_map | GNTMAP_contains_pte;
+		op.ref   = gnt_ref[i];
+		op.dom   = dev->otherend_id;
+		op.host_addr = arbitrary_virt_to_machine(pte[i]).maddr;
+
+		if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
+			BUG();
+
+		if (op.status != GNTST_okay) {
+			err = op.status;
+			xenbus_dev_fatal(dev, op.status,
+				 "mapping in shared page (%d/%d) %d from domain %d",
+				 i+1, nr_grefs, gnt_ref[i], dev->otherend_id);
+			node->handle[i] = INVALID_GRANT_HANDLE;
+			goto rollback;
+		} else
+			node->handle[i] = op.handle;
 	}
 
-	node->handle = op.handle;
+	node->nr_handles = nr_grefs;
 	node->area = area;
 
 	spin_lock(&xenbus_valloc_lock);
@@ -512,31 +547,73 @@ static int xenbus_map_ring_valloc_pv(struct xenbus_device *dev,
 
 	*vaddr = area->addr;
 	return 0;
+
+rollback:
+	vma_leaked = 0;
+	for ( ; i >= 0; i--) {
+		if (node->handle[i] != INVALID_GRANT_HANDLE) {
+			struct gnttab_unmap_grant_ref unmap_op;
+			unmap_op.dev_bus_addr = 0;
+			unmap_op.host_addr =
+				arbitrary_virt_to_machine(pte[i]).maddr;
+			unmap_op.handle = node->handle[i];
+
+			if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
+						      &unmap_op, 1))
+				BUG();
+
+			if (unmap_op.status != GNTST_okay) {
+				pr_alert("rollback mapping (%d/%d) %d from domain %d, err = %d",
+					 i+1, nr_grefs, gnt_ref[i],
+					 dev->otherend_id,
+					 unmap_op.status);
+				vma_leaked = 1;
+			}
+			node->handle[i] = INVALID_GRANT_HANDLE;
+		}
+	}
+
+	if (!vma_leaked)
+		free_vm_area(area);
+	else
+		pr_alert("leaking vm area %p size %d page(s)", area, nr_grefs);
+
+	kfree(node);
+
+	return err;
 }
 
 static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
-				      int gnt_ref, void **vaddr)
+				      int *gnt_ref, int nr_grefs, void **vaddr)
 {
 	struct xenbus_map_node *node;
 	int err;
 	void *addr;
+	int vma_leaked;
 
 	*vaddr = NULL;
 
+	if (nr_grefs > XENBUS_MAX_RING_PAGES)
+		return -EINVAL;
+
 	node = kzalloc(sizeof(*node), GFP_KERNEL);
 	if (!node)
 		return -ENOMEM;
 
-	err = alloc_xenballooned_pages(1, &node->page, false /* lowmem */);
+	err = alloc_xenballooned_pages(nr_grefs, &node->page,
+				       false /* lowmem */);
 	if (err)
 		goto out_err;
 
 	addr = pfn_to_kaddr(page_to_pfn(node->page));
 
-	err = xenbus_map_ring(dev, gnt_ref, &node->handle, addr);
+	err = xenbus_map_ring(dev, gnt_ref, nr_grefs, node->handle,
+			      addr, &vma_leaked);
 	if (err)
 		goto out_err;
 
+	node->nr_handles = nr_grefs;
+
 	spin_lock(&xenbus_valloc_lock);
 	list_add(&node->next, &xenbus_valloc_pages);
 	spin_unlock(&xenbus_valloc_lock);
@@ -545,7 +622,8 @@ static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
 	return 0;
 
  out_err:
-	free_xenballooned_pages(1, &node->page);
+	if (!vma_leaked)
+		free_xenballooned_pages(nr_grefs, &node->page);
 	kfree(node);
 	return err;
 }
@@ -554,36 +632,75 @@ static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
 /**
  * xenbus_map_ring
  * @dev: xenbus device
- * @gnt_ref: grant reference
+ * @gnt_ref: grant reference array
+ * @nr_grefs: number of grant reference
  * @handle: pointer to grant handle to be filled
  * @vaddr: address to be mapped to
+ * @vma_leaked: cannot clean up a failed mapping, vma leaked
  *
- * Map a page of memory into this domain from another domain's grant table.
+ * Map pages of memory into this domain from another domain's grant table.
  * xenbus_map_ring does not allocate the virtual address space (you must do
- * this yourself!). It only maps in the page to the specified address.
+ * this yourself!). It only maps in the pages to the specified address.
  * Returns 0 on success, and GNTST_* (see xen/include/interface/grant_table.h)
  * or -ENOMEM on error. If an error is returned, device will switch to
- * XenbusStateClosing and the error message will be saved in XenStore.
+ * XenbusStateClosing and the last error message will be saved in XenStore.
  */
-int xenbus_map_ring(struct xenbus_device *dev, int gnt_ref,
-		    grant_handle_t *handle, void *vaddr)
+int xenbus_map_ring(struct xenbus_device *dev, int *gnt_ref, int nr_grefs,
+		    grant_handle_t *handle, void *vaddr, int *vma_leaked)
 {
 	struct gnttab_map_grant_ref op;
+	int i;
+	int err = GNTST_okay;
+
+	for (i = 0; i < nr_grefs; i++) {
+		unsigned long addr = (unsigned long)vaddr +
+			(PAGE_SIZE * i);
+		gnttab_set_map_op(&op, (unsigned long)addr,
+				  GNTMAP_host_map, gnt_ref[i],
+				  dev->otherend_id);
+
+		if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref,
+					      &op, 1))
+			BUG();
+
+		if (op.status != GNTST_okay) {
+			xenbus_dev_fatal(dev, op.status,
+				 "mapping in shared page (%d/%d) %d from domain %d",
+				 i+1, nr_grefs, gnt_ref[i], dev->otherend_id);
+			handle[i] = INVALID_GRANT_HANDLE;
+			goto rollback;
+		} else
+			handle[i] = op.handle;
+	}
 
-	gnttab_set_map_op(&op, (unsigned long)vaddr, GNTMAP_host_map, gnt_ref,
-			  dev->otherend_id);
-
-	if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
-		BUG();
+	return 0;
 
-	if (op.status != GNTST_okay) {
-		xenbus_dev_fatal(dev, op.status,
-				 "mapping in shared page %d from domain %d",
-				 gnt_ref, dev->otherend_id);
-	} else
-		*handle = op.handle;
+rollback:
+	*vma_leaked = 0;
+	for ( ; i >= 0; i--) {
+		if (handle[i] != INVALID_GRANT_HANDLE) {
+			struct gnttab_unmap_grant_ref unmap_op;
+			unsigned long addr = (unsigned long)vaddr +
+				(PAGE_SIZE * i);
+			gnttab_set_unmap_op(&unmap_op, (phys_addr_t)addr,
+					    GNTMAP_host_map, handle[i]);
+
+			if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
+						      &unmap_op, 1))
+				BUG();
+
+			if (unmap_op.status != GNTST_okay) {
+				pr_alert("rollback mapping (%d/%d) %d from domain %d, err = %d",
+					 i+1, nr_grefs, gnt_ref[i],
+					 dev->otherend_id,
+					 unmap_op.status);
+				*vma_leaked = 1;
+			}
+			handle[i] = INVALID_GRANT_HANDLE;
+		}
+	}
 
-	return op.status;
+	return err;
 }
 EXPORT_SYMBOL_GPL(xenbus_map_ring);
 
@@ -609,10 +726,11 @@ EXPORT_SYMBOL_GPL(xenbus_unmap_ring_vfree);
 static int xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, void *vaddr)
 {
 	struct xenbus_map_node *node;
-	struct gnttab_unmap_grant_ref op = {
-		.host_addr = (unsigned long)vaddr,
-	};
+	struct gnttab_unmap_grant_ref op[XENBUS_MAX_RING_PAGES];
 	unsigned int level;
+	int i;
+	int last_error = GNTST_okay;
+	int vma_leaked;
 
 	spin_lock(&xenbus_valloc_lock);
 	list_for_each_entry(node, &xenbus_valloc_pages, next) {
@@ -631,22 +749,39 @@ static int xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, void *vaddr)
 		return GNTST_bad_virt_addr;
 	}
 
-	op.handle = node->handle;
-	op.host_addr = arbitrary_virt_to_machine(
-		lookup_address((unsigned long)vaddr, &level)).maddr;
+	for (i = 0; i < node->nr_handles; i++) {
+		unsigned long addr = (unsigned long)vaddr +
+			(PAGE_SIZE * i);
+		op[i].dev_bus_addr = 0;
+		op[i].handle = node->handle[i];
+		op[i].host_addr = arbitrary_virt_to_machine(
+			lookup_address((unsigned long)addr, &level)).maddr;
+	}
 
-	if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, &op, 1))
+	if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, op,
+				      node->nr_handles))
 		BUG();
 
-	if (op.status == GNTST_okay)
+	vma_leaked = 0;
+	for (i = 0; i < node->nr_handles; i++) {
+		if (op[i].status != GNTST_okay) {
+			last_error = op[i].status;
+			vma_leaked = 1;
+			xenbus_dev_error(dev, op[i].status,
+				 "unmapping page (%d/%d) at handle %d error %d",
+				 i+1, node->nr_handles, node->handle[i],
+				 op[i].status);
+		}
+	}
+
+	if (!vma_leaked)
 		free_vm_area(node->area);
 	else
-		xenbus_dev_error(dev, op.status,
-				 "unmapping page at handle %d error %d",
-				 node->handle, op.status);
+		pr_alert("leaking vm area %p size %d page(s)",
+			 node->area, node->nr_handles);
 
 	kfree(node);
-	return op.status;
+	return last_error;
 }
 
 static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
@@ -673,10 +808,10 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
 		return GNTST_bad_virt_addr;
 	}
 
-	rv = xenbus_unmap_ring(dev, node->handle, addr);
+	rv = xenbus_unmap_ring(dev, node->handle, node->nr_handles, addr);
 
 	if (!rv)
-		free_xenballooned_pages(1, &node->page);
+		free_xenballooned_pages(node->nr_handles, &node->page);
 	else
 		WARN(1, "Leaking %p\n", vaddr);
 
@@ -687,7 +822,8 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
 /**
  * xenbus_unmap_ring
  * @dev: xenbus device
- * @handle: grant handle
+ * @handle: grant handle array
+ * @nr_handles: number of grant handles
  * @vaddr: addr to unmap
  *
  * Unmap a page of memory in this domain that was imported from another domain.
@@ -695,21 +831,33 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
  * (see xen/include/interface/grant_table.h).
  */
 int xenbus_unmap_ring(struct xenbus_device *dev,
-		      grant_handle_t handle, void *vaddr)
+		      grant_handle_t *handle, int nr_handles,
+		      void *vaddr)
 {
 	struct gnttab_unmap_grant_ref op;
+	int last_error = GNTST_okay;
+	int i;
+
+	for (i = 0; i < nr_handles; i++) {
+		unsigned long addr = (unsigned long)vaddr +
+			(PAGE_SIZE * i);
+		gnttab_set_unmap_op(&op, (unsigned long)addr,
+				    GNTMAP_host_map, handle[i]);
+
+		if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
+					      &op, 1))
+			BUG();
+
+		if (op.status != GNTST_okay) {
+			xenbus_dev_error(dev, op.status,
+				 "unmapping page (%d/%d) at handle %d error %d",
+				 i+1, nr_handles, handle[i], op.status);
+			last_error = op.status;
+		}
+		handle[i] = INVALID_GRANT_HANDLE;
+	}
 
-	gnttab_set_unmap_op(&op, (unsigned long)vaddr, GNTMAP_host_map, handle);
-
-	if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, &op, 1))
-		BUG();
-
-	if (op.status != GNTST_okay)
-		xenbus_dev_error(dev, op.status,
-				 "unmapping page at handle %d error %d",
-				 handle, op.status);
-
-	return op.status;
+	return last_error;
 }
 EXPORT_SYMBOL_GPL(xenbus_unmap_ring);
 
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index 0a7515c..b7d9613 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -46,6 +46,11 @@
 #include <xen/interface/io/xenbus.h>
 #include <xen/interface/io/xs_wire.h>
 
+/* Max pages supported by multi-page ring in the backend */
+#define XENBUS_MAX_RING_PAGE_ORDER  2
+#define XENBUS_MAX_RING_PAGES       (1U << XENBUS_MAX_RING_PAGE_ORDER)
+#define INVALID_GRANT_HANDLE        (~0U)
+
 /* Register callback to watch this node. */
 struct xenbus_watch
 {
@@ -195,15 +200,17 @@ int xenbus_watch_pathfmt(struct xenbus_device *dev, struct xenbus_watch *watch,
 			 const char *pathfmt, ...);
 
 int xenbus_switch_state(struct xenbus_device *dev, enum xenbus_state new_state);
-int xenbus_grant_ring(struct xenbus_device *dev, unsigned long ring_mfn);
+int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
+		      int nr_pages, int *grefs);
 int xenbus_map_ring_valloc(struct xenbus_device *dev,
-			   int gnt_ref, void **vaddr);
-int xenbus_map_ring(struct xenbus_device *dev, int gnt_ref,
-			   grant_handle_t *handle, void *vaddr);
+			   int *gnt_ref, int nr_grefs, void **vaddr);
+int xenbus_map_ring(struct xenbus_device *dev, int *gnt_ref, int nr_grefs,
+		    grant_handle_t *handle, void *vaddr, int *vma_leaked);
 
 int xenbus_unmap_ring_vfree(struct xenbus_device *dev, void *vaddr);
 int xenbus_unmap_ring(struct xenbus_device *dev,
-		      grant_handle_t handle, void *vaddr);
+		      grant_handle_t *handle, int nr_handles,
+		      void *vaddr);
 
 int xenbus_alloc_evtchn(struct xenbus_device *dev, int *port);
 int xenbus_bind_evtchn(struct xenbus_device *dev, int remote_port, int *port);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 91+ messages in thread

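For illustration, a minimal sketch of how a mapping-side backend might
use the extended interface above; the helper name example_map_ring is
invented, the grant reference array is assumed to come from xenstore,
and error handling beyond the calls shown is elided.

#include <xen/xenbus.h>

/*
 * Hypothetical backend helper (not part of this series): map a
 * multi-page ring shared by a frontend into one contiguous virtual
 * address range, then tear it down again.
 */
static int example_map_ring(struct xenbus_device *dev,
			    int *gnt_refs, int nr_grefs)
{
	void *vaddr;
	int err;

	/* Map nr_grefs pages; on failure the device is switched to
	 * XenbusStateClosing and the last error is saved in xenstore. */
	err = xenbus_map_ring_valloc(dev, gnt_refs, nr_grefs, &vaddr);
	if (err)
		return err;

	/* ... initialise a back ring over vaddr with
	 * BACK_RING_INIT(..., PAGE_SIZE * nr_grefs) ... */

	/* A single call unmaps and frees the whole range. */
	return xenbus_unmap_ring_vfree(dev, vaddr);
}
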
* [PATCH 5/8] netback: multi-page ring support
  2013-02-15 16:00 [PATCH 0/8] Bugfix and mechanical works for Xen network driver Wei Liu
                   ` (8 preceding siblings ...)
  2013-02-15 16:00 ` [PATCH 5/8] netback: multi-page ring support Wei Liu
@ 2013-02-15 16:00 ` Wei Liu
  2013-03-04 21:00   ` [Xen-devel] " Konrad Rzeszutek Wilk
                     ` (3 more replies)
  2013-02-15 16:00 ` [PATCH 6/8] netfront: " Wei Liu
                   ` (7 subsequent siblings)
  17 siblings, 4 replies; 91+ messages in thread
From: Wei Liu @ 2013-02-15 16:00 UTC (permalink / raw)
  To: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li; +Cc: Wei Liu

Extend netback to support multi-page shared rings. The backend
advertises the maximum ring page order it supports via the
max-tx-ring-page-order and max-rx-ring-page-order xenstore keys; when
the frontend publishes tx-ring-order / rx-ring-order plus per-page
tx-ring-ref%u / rx-ring-ref%u references, all pages are mapped as a
single ring, otherwise the old single-page tx-ring-ref / rx-ring-ref
protocol is used.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/common.h    |   30 ++++++--
 drivers/net/xen-netback/interface.c |   46 +++++++++--
 drivers/net/xen-netback/netback.c   |   73 ++++++++----------
 drivers/net/xen-netback/xenbus.c    |  143 +++++++++++++++++++++++++++++++++--
 4 files changed, 229 insertions(+), 63 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 35d8772..f541ba9 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -45,6 +45,12 @@
 #include <xen/grant_table.h>
 #include <xen/xenbus.h>
 
+#define NETBK_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER
+#define NETBK_MAX_RING_PAGES      (1U << NETBK_MAX_RING_PAGE_ORDER)
+
+#define NETBK_MAX_TX_RING_SIZE XEN_NETIF_TX_RING_SIZE(NETBK_MAX_RING_PAGES)
+#define NETBK_MAX_RX_RING_SIZE XEN_NETIF_RX_RING_SIZE(NETBK_MAX_RING_PAGES)
+
 struct xen_netbk;
 
 struct xenvif {
@@ -66,6 +72,8 @@ struct xenvif {
 	/* The shared rings and indexes. */
 	struct xen_netif_tx_back_ring tx;
 	struct xen_netif_rx_back_ring rx;
+	unsigned int nr_tx_handles;
+	unsigned int nr_rx_handles;
 
 	/* Frontend feature information. */
 	u8 can_sg:1;
@@ -105,15 +113,19 @@ static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
 	return to_xenbus_device(vif->dev->dev.parent);
 }
 
-#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
-#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
+#define XEN_NETIF_TX_RING_SIZE(_nr_pages)		\
+	__CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages))
+#define XEN_NETIF_RX_RING_SIZE(_nr_pages)		\
+	__CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE * (_nr_pages))
 
 struct xenvif *xenvif_alloc(struct device *parent,
 			    domid_t domid,
 			    unsigned int handle);
 
-int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
-		   unsigned long rx_ring_ref, unsigned int evtchn);
+int xenvif_connect(struct xenvif *vif,
+		   unsigned long *tx_ring_ref, unsigned int tx_ring_order,
+		   unsigned long *rx_ring_ref, unsigned int rx_ring_order,
+		   unsigned int evtchn);
 void xenvif_disconnect(struct xenvif *vif);
 
 void xenvif_get(struct xenvif *vif);
@@ -129,10 +141,12 @@ int xen_netbk_rx_ring_full(struct xenvif *vif);
 int xen_netbk_must_stop_queue(struct xenvif *vif);
 
 /* (Un)Map communication rings. */
-void xen_netbk_unmap_frontend_rings(struct xenvif *vif);
+void xen_netbk_unmap_frontend_rings(struct xenvif *vif, void *addr);
 int xen_netbk_map_frontend_rings(struct xenvif *vif,
-				 grant_ref_t tx_ring_ref,
-				 grant_ref_t rx_ring_ref);
+				 void **addr,
+				 int domid,
+				 int *ring_ref,
+				 unsigned int ring_ref_count);
 
 /* (De)Register a xenvif with the netback backend. */
 void xen_netbk_add_xenvif(struct xenvif *vif);
@@ -158,4 +172,6 @@ void xenvif_carrier_off(struct xenvif *vif);
 /* Returns number of ring slots required to send an skb to the frontend */
 unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb);
 
+extern unsigned int MODPARM_netback_max_tx_ring_page_order;
+extern unsigned int MODPARM_netback_max_rx_ring_page_order;
 #endif /* __XEN_NETBACK__COMMON_H__ */
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index db638e1..fa4d46d 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -305,10 +305,16 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	return vif;
 }
 
-int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
-		   unsigned long rx_ring_ref, unsigned int evtchn)
+int xenvif_connect(struct xenvif *vif,
+		   unsigned long *tx_ring_ref, unsigned int tx_ring_ref_count,
+		   unsigned long *rx_ring_ref, unsigned int rx_ring_ref_count,
+		   unsigned int evtchn)
 {
 	int err = -ENOMEM;
+	void *addr;
+	struct xen_netif_tx_sring *txs;
+	struct xen_netif_rx_sring *rxs;
+	int tmp[NETBK_MAX_RING_PAGES], i;
 
 	/* Already connected through? */
 	if (vif->irq)
@@ -316,15 +322,36 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
 
 	__module_get(THIS_MODULE);
 
-	err = xen_netbk_map_frontend_rings(vif, tx_ring_ref, rx_ring_ref);
+	for (i = 0; i < tx_ring_ref_count; i++)
+		tmp[i] = tx_ring_ref[i];
+
+	err = xen_netbk_map_frontend_rings(vif, &addr, vif->domid,
+					   tmp, tx_ring_ref_count);
 	if (err < 0)
 		goto err;
 
+	txs = addr;
+	BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE * tx_ring_ref_count);
+	vif->nr_tx_handles = tx_ring_ref_count;
+
+	for (i = 0; i < rx_ring_ref_count; i++)
+		tmp[i] = rx_ring_ref[i];
+
+	err = xen_netbk_map_frontend_rings(vif, &addr, vif->domid,
+					   tmp, rx_ring_ref_count);
+
+	if (err < 0)
+		goto err_tx_unmap;
+
+	rxs = addr;
+	BACK_RING_INIT(&vif->rx, rxs, PAGE_SIZE * rx_ring_ref_count);
+	vif->nr_rx_handles = rx_ring_ref_count;
+
 	err = bind_interdomain_evtchn_to_irqhandler(
 		vif->domid, evtchn, xenvif_interrupt, 0,
 		vif->dev->name, vif);
 	if (err < 0)
-		goto err_unmap;
+		goto err_rx_unmap;
 	vif->irq = err;
 	disable_irq(vif->irq);
 
@@ -340,8 +367,12 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
 	rtnl_unlock();
 
 	return 0;
-err_unmap:
-	xen_netbk_unmap_frontend_rings(vif);
+err_rx_unmap:
+	xen_netbk_unmap_frontend_rings(vif, (void *)vif->rx.sring);
+	vif->nr_rx_handles = 0;
+err_tx_unmap:
+	xen_netbk_unmap_frontend_rings(vif, (void *)vif->tx.sring);
+	vif->nr_tx_handles = 0;
 err:
 	module_put(THIS_MODULE);
 	return err;
@@ -382,7 +413,8 @@ void xenvif_disconnect(struct xenvif *vif)
 
 	unregister_netdev(vif->dev);
 
-	xen_netbk_unmap_frontend_rings(vif);
+	xen_netbk_unmap_frontend_rings(vif, (void *)vif->tx.sring);
+	xen_netbk_unmap_frontend_rings(vif, (void *)vif->rx.sring);
 
 	free_netdev(vif->dev);
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 98ccea9..644c760 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -47,6 +47,18 @@
 #include <asm/xen/hypercall.h>
 #include <asm/xen/page.h>
 
+unsigned int MODPARM_netback_max_rx_ring_page_order = NETBK_MAX_RING_PAGE_ORDER;
+module_param_named(netback_max_rx_ring_page_order,
+		   MODPARM_netback_max_rx_ring_page_order, uint, 0);
+MODULE_PARM_DESC(netback_max_rx_ring_page_order,
+		 "Maximum supported receiver ring page order");
+
+unsigned int MODPARM_netback_max_tx_ring_page_order = NETBK_MAX_RING_PAGE_ORDER;
+module_param_named(netback_max_tx_ring_page_order,
+		   MODPARM_netback_max_tx_ring_page_order, uint, 0);
+MODULE_PARM_DESC(netback_max_tx_ring_page_order,
+		 "Maximum supported transmitter ring page order");
+
 struct pending_tx_info {
 	struct xen_netif_tx_request req;
 	struct xenvif *vif;
@@ -59,7 +72,7 @@ struct netbk_rx_meta {
 	int gso_size;
 };
 
-#define MAX_PENDING_REQS 256
+#define MAX_PENDING_REQS NETBK_MAX_TX_RING_SIZE
 
 /* Discriminate from any valid pending_idx value. */
 #define INVALID_PENDING_IDX 0xFFFF
@@ -111,8 +124,8 @@ struct xen_netbk {
 	 * head/fragment page uses 2 copy operations because it
 	 * straddles two buffers in the frontend.
 	 */
-	struct gnttab_copy grant_copy_op[2*XEN_NETIF_RX_RING_SIZE];
-	struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE];
+	struct gnttab_copy grant_copy_op[2*NETBK_MAX_RX_RING_SIZE];
+	struct netbk_rx_meta meta[2*NETBK_MAX_RX_RING_SIZE];
 };
 
 static struct xen_netbk *xen_netbk;
@@ -262,7 +275,8 @@ int xen_netbk_rx_ring_full(struct xenvif *vif)
 	RING_IDX needed = max_required_rx_slots(vif);
 
 	return ((vif->rx.sring->req_prod - peek) < needed) ||
-	       ((vif->rx.rsp_prod_pvt + XEN_NETIF_RX_RING_SIZE - peek) < needed);
+	       ((vif->rx.rsp_prod_pvt +
+		 XEN_NETIF_RX_RING_SIZE(vif->nr_rx_handles) - peek) < needed);
 }
 
 int xen_netbk_must_stop_queue(struct xenvif *vif)
@@ -657,7 +671,8 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
 		__skb_queue_tail(&rxq, skb);
 
 		/* Filled the batch queue? */
-		if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)
+		if (count + MAX_SKB_FRAGS >=
+		    XEN_NETIF_RX_RING_SIZE(vif->nr_rx_handles))
 			break;
 	}
 
@@ -1292,12 +1307,12 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
 			continue;
 
 		if (vif->tx.sring->req_prod - vif->tx.req_cons >
-		    XEN_NETIF_TX_RING_SIZE) {
+		    XEN_NETIF_TX_RING_SIZE(vif->nr_tx_handles)) {
 			netdev_err(vif->dev,
 				   "Impossible number of requests. "
 				   "req_prod %d, req_cons %d, size %ld\n",
 				   vif->tx.sring->req_prod, vif->tx.req_cons,
-				   XEN_NETIF_TX_RING_SIZE);
+				   XEN_NETIF_TX_RING_SIZE(vif->nr_tx_handles));
 			netbk_fatal_tx_err(vif);
 			continue;
 		}
@@ -1644,48 +1659,22 @@ static int xen_netbk_kthread(void *data)
 	return 0;
 }
 
-void xen_netbk_unmap_frontend_rings(struct xenvif *vif)
+void xen_netbk_unmap_frontend_rings(struct xenvif *vif, void *addr)
 {
-	if (vif->tx.sring)
-		xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif),
-					vif->tx.sring);
-	if (vif->rx.sring)
-		xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif),
-					vif->rx.sring);
+	if (addr)
+		xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif), addr);
 }
 
 int xen_netbk_map_frontend_rings(struct xenvif *vif,
-				 grant_ref_t tx_ring_ref,
-				 grant_ref_t rx_ring_ref)
+				 void **vaddr,
+				 int domid,
+				 int *ring_ref,
+				 unsigned int ring_ref_count)
 {
-	void *addr;
-	struct xen_netif_tx_sring *txs;
-	struct xen_netif_rx_sring *rxs;
-
-	int err = -ENOMEM;
-
-	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
-				     &tx_ring_ref, 1, &addr);
-	if (err)
-		goto err;
-
-	txs = (struct xen_netif_tx_sring *)addr;
-	BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE);
+	int err = 0;
 
 	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
-				     &rx_ring_ref, 1, &addr);
-	if (err)
-		goto err;
-
-	rxs = (struct xen_netif_rx_sring *)addr;
-	BACK_RING_INIT(&vif->rx, rxs, PAGE_SIZE);
-
-	vif->rx_req_cons_peek = 0;
-
-	return 0;
-
-err:
-	xen_netbk_unmap_frontend_rings(vif);
+				     ring_ref, ring_ref_count, vaddr);
 	return err;
 }
 
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 65d14f2..1791807 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -114,6 +114,33 @@ static int netback_probe(struct xenbus_device *dev,
 			goto abort_transaction;
 		}
 
+		/* Multi-page ring support */
+		if (MODPARM_netback_max_tx_ring_page_order >
+		    NETBK_MAX_RING_PAGE_ORDER)
+			MODPARM_netback_max_tx_ring_page_order =
+				NETBK_MAX_RING_PAGE_ORDER;
+		err = xenbus_printf(xbt, dev->nodename,
+				    "max-tx-ring-page-order",
+				    "%u",
+				    MODPARM_netback_max_tx_ring_page_order);
+		if (err) {
+			message = "writing max-tx-ring-page-order";
+			goto abort_transaction;
+		}
+
+		if (MODPARM_netback_max_rx_ring_page_order >
+		    NETBK_MAX_RING_PAGE_ORDER)
+			MODPARM_netback_max_rx_ring_page_order =
+				NETBK_MAX_RING_PAGE_ORDER;
+		err = xenbus_printf(xbt, dev->nodename,
+				    "max-rx-ring-page-order",
+				    "%u",
+				    MODPARM_netback_max_rx_ring_page_order);
+		if (err) {
+			message = "writing max-rx-ring-page-order";
+			goto abort_transaction;
+		}
+
 		err = xenbus_transaction_end(xbt, 0);
 	} while (err == -EAGAIN);
 
@@ -392,22 +419,107 @@ static int connect_rings(struct backend_info *be)
 {
 	struct xenvif *vif = be->vif;
 	struct xenbus_device *dev = be->dev;
-	unsigned long tx_ring_ref, rx_ring_ref;
 	unsigned int evtchn, rx_copy;
 	int err;
 	int val;
+	unsigned long tx_ring_ref[NETBK_MAX_RING_PAGES];
+	unsigned long rx_ring_ref[NETBK_MAX_RING_PAGES];
+	unsigned int  tx_ring_order;
+	unsigned int  rx_ring_order;
 
 	err = xenbus_gather(XBT_NIL, dev->otherend,
-			    "tx-ring-ref", "%lu", &tx_ring_ref,
-			    "rx-ring-ref", "%lu", &rx_ring_ref,
 			    "event-channel", "%u", &evtchn, NULL);
 	if (err) {
 		xenbus_dev_fatal(dev, err,
-				 "reading %s/ring-ref and event-channel",
+				 "reading %s/event-channel",
 				 dev->otherend);
 		return err;
 	}
 
+	err = xenbus_scanf(XBT_NIL, dev->otherend, "tx-ring-order", "%u",
+			   &tx_ring_order);
+	if (err < 0) {
+		tx_ring_order = 0;
+
+		err = xenbus_scanf(XBT_NIL, dev->otherend, "tx-ring-ref", "%lu",
+				   &tx_ring_ref[0]);
+		if (err < 0) {
+			xenbus_dev_fatal(dev, err, "reading %s/tx-ring-ref",
+					 dev->otherend);
+			return err;
+		}
+	} else {
+		unsigned int i;
+
+		if (tx_ring_order > MODPARM_netback_max_tx_ring_page_order) {
+			err = -EINVAL;
+			xenbus_dev_fatal(dev, err,
+					 "%s/tx-ring-page-order too big",
+					 dev->otherend);
+			return err;
+		}
+
+		for (i = 0; i < (1U << tx_ring_order); i++) {
+			char ring_ref_name[sizeof("tx-ring-ref") + 2];
+
+			snprintf(ring_ref_name, sizeof(ring_ref_name),
+				 "tx-ring-ref%u", i);
+
+			err = xenbus_scanf(XBT_NIL, dev->otherend,
+					   ring_ref_name, "%lu",
+					   &tx_ring_ref[i]);
+			if (err < 0) {
+				xenbus_dev_fatal(dev, err,
+						 "reading %s/%s",
+						 dev->otherend,
+						 ring_ref_name);
+				return err;
+			}
+		}
+	}
+
+	err = xenbus_scanf(XBT_NIL, dev->otherend, "rx-ring-order", "%u",
+			   &rx_ring_order);
+	if (err < 0) {
+		rx_ring_order = 0;
+
+		err = xenbus_scanf(XBT_NIL, dev->otherend, "rx-ring-ref", "%lu",
+				   &rx_ring_ref[0]);
+		if (err < 0) {
+			xenbus_dev_fatal(dev, err, "reading %s/rx-ring-ref",
+					 dev->otherend);
+			return err;
+		}
+	} else {
+		unsigned int i;
+
+		if (rx_ring_order > MODPARM_netback_max_rx_ring_page_order) {
+			err = -EINVAL;
+			xenbus_dev_fatal(dev, err,
+					 "%s/rx-ring-page-order too big",
+					 dev->otherend);
+			return err;
+		}
+
+		for (i = 0; i < (1U << rx_ring_order); i++) {
+			char ring_ref_name[sizeof("rx-ring-ref") + 2];
+
+			snprintf(ring_ref_name, sizeof(ring_ref_name),
+				 "rx-ring-ref%u", i);
+
+			err = xenbus_scanf(XBT_NIL, dev->otherend,
+					   ring_ref_name, "%lu",
+					   &rx_ring_ref[i]);
+			if (err < 0) {
+				xenbus_dev_fatal(dev, err,
+						 "reading %s/%s",
+						 dev->otherend,
+						 ring_ref_name);
+				return err;
+			}
+		}
+	}
+
 	err = xenbus_scanf(XBT_NIL, dev->otherend, "request-rx-copy", "%u",
 			   &rx_copy);
 	if (err == -ENOENT) {
@@ -454,11 +566,28 @@ static int connect_rings(struct backend_info *be)
 	vif->csum = !val;
 
 	/* Map the shared frame, irq etc. */
-	err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn);
+	err = xenvif_connect(vif, tx_ring_ref, (1U << tx_ring_order),
+			     rx_ring_ref, (1U << rx_ring_order),
+			     evtchn);
 	if (err) {
+		/* Construct ref lists " 1 2 3" / " 4 5 6"; 12 bytes per entry. */
+		int i;
+		char txs[12 * (1U << MODPARM_netback_max_tx_ring_page_order)];
+		char rxs[12 * (1U << MODPARM_netback_max_rx_ring_page_order)];
+
+		txs[0] = rxs[0] = 0;
+
+		for (i = 0; i < (1U << tx_ring_order); i++)
+			snprintf(txs+strlen(txs), sizeof(txs)-strlen(txs)-1,
+				 " %lu", tx_ring_ref[i]);
+
+		for (i = 0; i < (1U << rx_ring_order); i++)
+			snprintf(rxs+strlen(rxs), sizeof(rxs)-strlen(rxs)-1,
+				 " %lu", rx_ring_ref[i]);
+
 		xenbus_dev_fatal(dev, err,
-				 "mapping shared-frames %lu/%lu port %u",
-				 tx_ring_ref, rx_ring_ref, evtchn);
+				 "mapping shared-frames%s /%s port %u",
+				 txs, rxs, evtchn);
 		return err;
 	}
 	return 0;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 91+ messages in thread

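To make the handshake in connect_rings() above concrete, here is a
sketch of the frontend's side of the negotiation; the helper name
write_tx_ring_refs is invented, an open transaction xbt and a
populated ring_ref array are assumed, and netfront's actual version
appears in patch 6/8.

#include <xen/xenbus.h>

/* Hypothetical frontend helper publishing a multi-page tx ring. */
static int write_tx_ring_refs(struct xenbus_transaction xbt,
			      struct xenbus_device *dev,
			      int *ring_ref, unsigned int order)
{
	unsigned int i;
	int err;

	/* Order 0: keep the legacy single-page key for old backends. */
	if (order == 0)
		return xenbus_printf(xbt, dev->nodename,
				     "tx-ring-ref", "%u", ring_ref[0]);

	err = xenbus_printf(xbt, dev->nodename, "tx-ring-order",
			    "%u", order);
	if (err)
		return err;

	for (i = 0; i < (1U << order); i++) {
		char name[sizeof("tx-ring-ref") + 3];

		snprintf(name, sizeof(name), "tx-ring-ref%u", i);
		err = xenbus_printf(xbt, dev->nodename, name,
				    "%u", ring_ref[i]);
		if (err)
			return err;
	}
	return 0;
}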

* [PATCH 6/8] netfront: multi-page ring support
  2013-02-15 16:00 [PATCH 0/8] Bugfix and mechanical works for Xen network driver Wei Liu
                   ` (9 preceding siblings ...)
  2013-02-15 16:00 ` Wei Liu
@ 2013-02-15 16:00 ` Wei Liu
  2013-02-26  6:52   ` ANNIE LI
                     ` (3 more replies)
  2013-02-15 16:00 ` Wei Liu
                   ` (6 subsequent siblings)
  17 siblings, 4 replies; 91+ messages in thread
From: Wei Liu @ 2013-02-15 16:00 UTC (permalink / raw)
  To: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li; +Cc: Wei Liu

Extend netfront to support multi-page TX and RX rings. The frontend
reads the backend's max-tx-ring-page-order / max-rx-ring-page-order
keys, clamps them to its own maximum, allocates 1 << order pages per
ring and publishes the grant references via tx-ring-order /
rx-ring-order and tx-ring-ref%u / rx-ring-ref%u, falling back to the
single-page keys when the backend does not advertise multi-page
support.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netfront.c |  246 +++++++++++++++++++++++++++++++-------------
 1 file changed, 174 insertions(+), 72 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 8bd75a1..de73a71 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -67,9 +67,18 @@ struct netfront_cb {
 
 #define GRANT_INVALID_REF	0
 
-#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
-#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
-#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE, 256)
+#define XENNET_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER
+#define XENNET_MAX_RING_PAGES      (1U << XENNET_MAX_RING_PAGE_ORDER)
+
+#define NET_TX_RING_SIZE(_nr_pages)			\
+	__CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages))
+#define NET_RX_RING_SIZE(_nr_pages)			\
+	__CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE * (_nr_pages))
+
+#define XENNET_MAX_TX_RING_SIZE NET_TX_RING_SIZE(XENNET_MAX_RING_PAGES)
+#define XENNET_MAX_RX_RING_SIZE NET_RX_RING_SIZE(XENNET_MAX_RING_PAGES)
+
+#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE(1), 256)
 
 struct netfront_stats {
 	u64			rx_packets;
@@ -80,6 +90,11 @@ struct netfront_stats {
 };
 
 struct netfront_info {
+	/* Statistics */
+	struct netfront_stats __percpu *stats;
+
+	unsigned long rx_gso_checksum_fixup;
+
 	struct list_head list;
 	struct net_device *netdev;
 
@@ -90,7 +105,9 @@ struct netfront_info {
 
 	spinlock_t   tx_lock;
 	struct xen_netif_tx_front_ring tx;
-	int tx_ring_ref;
+	int tx_ring_ref[XENNET_MAX_RING_PAGES];
+	unsigned int tx_ring_page_order;
+	unsigned int tx_ring_pages;
 
 	/*
 	 * {tx,rx}_skbs store outstanding skbuffs. Free tx_skb entries
@@ -104,36 +121,33 @@ struct netfront_info {
 	union skb_entry {
 		struct sk_buff *skb;
 		unsigned long link;
-	} tx_skbs[NET_TX_RING_SIZE];
+	} tx_skbs[XENNET_MAX_TX_RING_SIZE];
 	grant_ref_t gref_tx_head;
-	grant_ref_t grant_tx_ref[NET_TX_RING_SIZE];
+	grant_ref_t grant_tx_ref[XENNET_MAX_TX_RING_SIZE];
 	unsigned tx_skb_freelist;
 
 	spinlock_t   rx_lock ____cacheline_aligned_in_smp;
 	struct xen_netif_rx_front_ring rx;
-	int rx_ring_ref;
+	int rx_ring_ref[XENNET_MAX_RING_PAGES];
+	unsigned int rx_ring_page_order;
+	unsigned int rx_ring_pages;
 
 	/* Receive-ring batched refills. */
 #define RX_MIN_TARGET 8
 #define RX_DFL_MIN_TARGET 64
-#define RX_MAX_TARGET min_t(int, NET_RX_RING_SIZE, 256)
+#define RX_MAX_TARGET min_t(int, NET_RX_RING_SIZE(1), 256)
 	unsigned rx_min_target, rx_max_target, rx_target;
 	struct sk_buff_head rx_batch;
 
 	struct timer_list rx_refill_timer;
 
-	struct sk_buff *rx_skbs[NET_RX_RING_SIZE];
+	struct sk_buff *rx_skbs[XENNET_MAX_RX_RING_SIZE];
 	grant_ref_t gref_rx_head;
-	grant_ref_t grant_rx_ref[NET_RX_RING_SIZE];
-
-	unsigned long rx_pfn_array[NET_RX_RING_SIZE];
-	struct multicall_entry rx_mcl[NET_RX_RING_SIZE+1];
-	struct mmu_update rx_mmu[NET_RX_RING_SIZE];
-
-	/* Statistics */
-	struct netfront_stats __percpu *stats;
+	grant_ref_t grant_rx_ref[XENNET_MAX_RX_RING_SIZE];
 
-	unsigned long rx_gso_checksum_fixup;
+	unsigned long rx_pfn_array[XENNET_MAX_RX_RING_SIZE];
+	struct multicall_entry rx_mcl[XENNET_MAX_RX_RING_SIZE+1];
+	struct mmu_update rx_mmu[XENNET_MAX_RX_RING_SIZE];
 };
 
 struct netfront_rx_info {
@@ -171,15 +185,15 @@ static unsigned short get_id_from_freelist(unsigned *head,
 	return id;
 }
 
-static int xennet_rxidx(RING_IDX idx)
+static int xennet_rxidx(RING_IDX idx, struct netfront_info *info)
 {
-	return idx & (NET_RX_RING_SIZE - 1);
+	return idx & (NET_RX_RING_SIZE(info->rx_ring_pages) - 1);
 }
 
 static struct sk_buff *xennet_get_rx_skb(struct netfront_info *np,
 					 RING_IDX ri)
 {
-	int i = xennet_rxidx(ri);
+	int i = xennet_rxidx(ri, np);
 	struct sk_buff *skb = np->rx_skbs[i];
 	np->rx_skbs[i] = NULL;
 	return skb;
@@ -188,7 +202,7 @@ static struct sk_buff *xennet_get_rx_skb(struct netfront_info *np,
 static grant_ref_t xennet_get_rx_ref(struct netfront_info *np,
 					    RING_IDX ri)
 {
-	int i = xennet_rxidx(ri);
+	int i = xennet_rxidx(ri, np);
 	grant_ref_t ref = np->grant_rx_ref[i];
 	np->grant_rx_ref[i] = GRANT_INVALID_REF;
 	return ref;
@@ -301,7 +315,7 @@ no_skb:
 
 		skb->dev = dev;
 
-		id = xennet_rxidx(req_prod + i);
+		id = xennet_rxidx(req_prod + i, np);
 
 		BUG_ON(np->rx_skbs[id]);
 		np->rx_skbs[id] = skb;
@@ -653,7 +667,7 @@ static int xennet_close(struct net_device *dev)
 static void xennet_move_rx_slot(struct netfront_info *np, struct sk_buff *skb,
 				grant_ref_t ref)
 {
-	int new = xennet_rxidx(np->rx.req_prod_pvt);
+	int new = xennet_rxidx(np->rx.req_prod_pvt, np);
 
 	BUG_ON(np->rx_skbs[new]);
 	np->rx_skbs[new] = skb;
@@ -1109,7 +1123,7 @@ static void xennet_release_tx_bufs(struct netfront_info *np)
 	struct sk_buff *skb;
 	int i;
 
-	for (i = 0; i < NET_TX_RING_SIZE; i++) {
+	for (i = 0; i < NET_TX_RING_SIZE(np->tx_ring_pages); i++) {
 		/* Skip over entries which are actually freelist references */
 		if (skb_entry_is_link(&np->tx_skbs[i]))
 			continue;
@@ -1143,7 +1157,7 @@ static void xennet_release_rx_bufs(struct netfront_info *np)
 
 	spin_lock_bh(&np->rx_lock);
 
-	for (id = 0; id < NET_RX_RING_SIZE; id++) {
+	for (id = 0; id < NET_RX_RING_SIZE(np->rx_ring_pages); id++) {
 		ref = np->grant_rx_ref[id];
 		if (ref == GRANT_INVALID_REF) {
 			unused++;
@@ -1324,13 +1338,13 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev)
 
 	/* Initialise tx_skbs as a free chain containing every entry. */
 	np->tx_skb_freelist = 0;
-	for (i = 0; i < NET_TX_RING_SIZE; i++) {
+	for (i = 0; i < XENNET_MAX_TX_RING_SIZE; i++) {
 		skb_entry_set_link(&np->tx_skbs[i], i+1);
 		np->grant_tx_ref[i] = GRANT_INVALID_REF;
 	}
 
 	/* Clear out rx_skbs */
-	for (i = 0; i < NET_RX_RING_SIZE; i++) {
+	for (i = 0; i < XENNET_MAX_RX_RING_SIZE; i++) {
 		np->rx_skbs[i] = NULL;
 		np->grant_rx_ref[i] = GRANT_INVALID_REF;
 	}
@@ -1428,13 +1442,6 @@ static int netfront_probe(struct xenbus_device *dev,
 	return err;
 }
 
-static void xennet_end_access(int ref, void *page)
-{
-	/* This frees the page as a side-effect */
-	if (ref != GRANT_INVALID_REF)
-		gnttab_end_foreign_access(ref, 0, (unsigned long)page);
-}
-
 static void xennet_disconnect_backend(struct netfront_info *info)
 {
 	/* Stop old i/f to prevent errors whilst we rebuild the state. */
@@ -1448,12 +1455,12 @@ static void xennet_disconnect_backend(struct netfront_info *info)
 		unbind_from_irqhandler(info->netdev->irq, info->netdev);
 	info->evtchn = info->netdev->irq = 0;
 
-	/* End access and free the pages */
-	xennet_end_access(info->tx_ring_ref, info->tx.sring);
-	xennet_end_access(info->rx_ring_ref, info->rx.sring);
+	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->tx.sring);
+	free_pages((unsigned long)info->tx.sring, info->tx_ring_page_order);
+
+	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->rx.sring);
+	free_pages((unsigned long)info->rx.sring, info->rx_ring_page_order);
 
-	info->tx_ring_ref = GRANT_INVALID_REF;
-	info->rx_ring_ref = GRANT_INVALID_REF;
 	info->tx.sring = NULL;
 	info->rx.sring = NULL;
 }
@@ -1501,11 +1508,14 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
 	struct xen_netif_tx_sring *txs;
 	struct xen_netif_rx_sring *rxs;
 	int err;
-	int grefs[1];
 	struct net_device *netdev = info->netdev;
+	unsigned int max_tx_ring_page_order, max_rx_ring_page_order;
+	int i;
 
-	info->tx_ring_ref = GRANT_INVALID_REF;
-	info->rx_ring_ref = GRANT_INVALID_REF;
+	for (i = 0; i < XENNET_MAX_RING_PAGES; i++) {
+		info->tx_ring_ref[i] = GRANT_INVALID_REF;
+		info->rx_ring_ref[i] = GRANT_INVALID_REF;
+	}
 	info->rx.sring = NULL;
 	info->tx.sring = NULL;
 	netdev->irq = 0;
@@ -1516,50 +1526,100 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
 		goto fail;
 	}
 
-	txs = (struct xen_netif_tx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH);
+	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
+			   "max-tx-ring-page-order", "%u",
+			   &max_tx_ring_page_order);
+	if (err < 0) {
+		info->tx_ring_page_order = 0;
+		dev_info(&dev->dev, "single tx ring\n");
+	} else {
+		if (max_tx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) {
+			dev_info(&dev->dev,
+				 "backend ring page order %d too large, clamp to %d\n",
+				 max_tx_ring_page_order,
+				 XENNET_MAX_RING_PAGE_ORDER);
+			max_tx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER;
+		}
+		info->tx_ring_page_order = max_tx_ring_page_order;
+		dev_info(&dev->dev, "multi-page tx ring, order = %d\n",
+			 info->tx_ring_page_order);
+	}
+	info->tx_ring_pages = (1U << info->tx_ring_page_order);
+
+	txs = (struct xen_netif_tx_sring *)
+		__get_free_pages(__GFP_ZERO | GFP_NOIO | __GFP_HIGH,
+				 info->tx_ring_page_order);
 	if (!txs) {
 		err = -ENOMEM;
 		xenbus_dev_fatal(dev, err, "allocating tx ring page");
 		goto fail;
 	}
 	SHARED_RING_INIT(txs);
-	FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE);
+	FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE * info->tx_ring_pages);
+
+	err = xenbus_grant_ring(dev, txs, info->tx_ring_pages,
+				info->tx_ring_ref);
+	if (err < 0)
+		goto grant_tx_ring_fail;
 
-	err = xenbus_grant_ring(dev, txs, 1, grefs);
+	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
+			   "max-rx-ring-page-order", "%u",
+			   &max_rx_ring_page_order);
 	if (err < 0) {
-		free_page((unsigned long)txs);
-		goto fail;
+		info->rx_ring_page_order = 0;
+		dev_info(&dev->dev, "single rx ring\n");
+	} else {
+		if (max_rx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) {
+			dev_info(&dev->dev,
+				 "backend ring page order %d too large, clamp to %d\n",
+				 max_rx_ring_page_order,
+				 XENNET_MAX_RING_PAGE_ORDER);
+			max_rx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER;
+		}
+		info->rx_ring_page_order = max_rx_ring_page_order;
+		dev_info(&dev->dev, "multi-page rx ring, order = %d\n",
+			 info->rx_ring_page_order);
 	}
+	info->rx_ring_pages = (1U << info->rx_ring_page_order);
 
-	info->tx_ring_ref = grefs[0];
-	rxs = (struct xen_netif_rx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH);
+	rxs = (struct xen_netif_rx_sring *)
+		__get_free_pages(__GFP_ZERO | GFP_NOIO | __GFP_HIGH,
+				 info->rx_ring_page_order);
 	if (!rxs) {
 		err = -ENOMEM;
 		xenbus_dev_fatal(dev, err, "allocating rx ring page");
-		goto fail;
+		goto alloc_rx_ring_fail;
 	}
 	SHARED_RING_INIT(rxs);
-	FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE);
+	FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE * info->rx_ring_pages);
 
-	err = xenbus_grant_ring(dev, rxs, 1, grefs);
-	if (err < 0) {
-		free_page((unsigned long)rxs);
-		goto fail;
-	}
-	info->rx_ring_ref = grefs[0];
+	err = xenbus_grant_ring(dev, rxs, info->rx_ring_pages,
+				info->rx_ring_ref);
+	if (err < 0)
+		goto grant_rx_ring_fail;
 
 	err = xenbus_alloc_evtchn(dev, &info->evtchn);
 	if (err)
-		goto fail;
+		goto alloc_evtchn_fail;
 
 	err = bind_evtchn_to_irqhandler(info->evtchn, xennet_interrupt,
 					0, netdev->name, netdev);
 	if (err < 0)
-		goto fail;
+		goto bind_fail;
 	netdev->irq = err;
 	return 0;
 
- fail:
+bind_fail:
+	xenbus_free_evtchn(dev, info->evtchn);
+alloc_evtchn_fail:
+	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->rx.sring);
+grant_rx_ring_fail:
+	free_pages((unsigned long)info->rx.sring, info->rx_ring_page_order);
+alloc_rx_ring_fail:
+	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->tx.sring);
+grant_tx_ring_fail:
+	free_pages((unsigned long)info->tx.sring, info->tx_ring_page_order);
+fail:
 	return err;
 }
 
@@ -1570,6 +1630,7 @@ static int talk_to_netback(struct xenbus_device *dev,
 	const char *message;
 	struct xenbus_transaction xbt;
 	int err;
+	int i;
 
 	/* Create shared ring, alloc event channel. */
 	err = setup_netfront(dev, info);
@@ -1583,18 +1644,58 @@ again:
 		goto destroy_ring;
 	}
 
-	err = xenbus_printf(xbt, dev->nodename, "tx-ring-ref", "%u",
-			    info->tx_ring_ref);
-	if (err) {
-		message = "writing tx ring-ref";
-		goto abort_transaction;
+	if (info->tx_ring_page_order == 0) {
+		err = xenbus_printf(xbt, dev->nodename, "tx-ring-ref", "%u",
+				    info->tx_ring_ref[0]);
+		if (err) {
+			message = "writing tx ring-ref";
+			goto abort_transaction;
+		}
+	} else {
+		err = xenbus_printf(xbt, dev->nodename, "tx-ring-order", "%u",
+				    info->tx_ring_page_order);
+		if (err) {
+			message = "writing tx-ring-order";
+			goto abort_transaction;
+		}
+		for (i = 0; i < info->tx_ring_pages; i++) {
+			char name[sizeof("tx-ring-ref")+3];
+			snprintf(name, sizeof(name), "tx-ring-ref%u", i);
+			err = xenbus_printf(xbt, dev->nodename, name, "%u",
+					    info->tx_ring_ref[i]);
+			if (err) {
+				message = "writing tx ring-ref";
+				goto abort_transaction;
+			}
+		}
 	}
-	err = xenbus_printf(xbt, dev->nodename, "rx-ring-ref", "%u",
-			    info->rx_ring_ref);
-	if (err) {
-		message = "writing rx ring-ref";
-		goto abort_transaction;
+
+	if (info->rx_ring_page_order == 0) {
+		err = xenbus_printf(xbt, dev->nodename, "rx-ring-ref", "%u",
+				    info->rx_ring_ref[0]);
+		if (err) {
+			message = "writing rx ring-ref";
+			goto abort_transaction;
+		}
+	} else {
+		err = xenbus_printf(xbt, dev->nodename, "rx-ring-order", "%u",
+				    info->rx_ring_page_order);
+		if (err) {
+			message = "writing rx-ring-order";
+			goto abort_transaction;
+		}
+		for (i = 0; i < info->rx_ring_pages; i++) {
+			char name[sizeof("rx-ring-ref")+3];
+			snprintf(name, sizeof(name), "rx-ring-ref%u", i);
+			err = xenbus_printf(xbt, dev->nodename, name, "%u",
+					    info->rx_ring_ref[i]);
+			if (err) {
+				message = "writing rx ring-ref";
+				goto abort_transaction;
+			}
+		}
 	}
+
 	err = xenbus_printf(xbt, dev->nodename,
 			    "event-channel", "%u", info->evtchn);
 	if (err) {
@@ -1681,7 +1782,8 @@ static int xennet_connect(struct net_device *dev)
 	xennet_release_tx_bufs(np);
 
 	/* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */
-	for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE; i++) {
+	for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE(np->rx_ring_pages);
+	     i++) {
 		skb_frag_t *frag;
 		const struct page *page;
 		if (!np->rx_skbs[i])
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 91+ messages in thread

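A back-of-envelope check of what multi-page rings buy:
__CONST_RING_SIZE rounds the entry count down to a power of two, so
assuming 4 KiB pages, a 64-byte shared-ring header and 12-byte tx
entries (the standard netif layout; stated here as assumptions, not
taken from the patches), a 1/2/4-page tx ring yields 256/512/1024
slots. This is why patch 5/8 can grow MAX_PENDING_REQS from 256 to
NETBK_MAX_TX_RING_SIZE. A standalone userspace sketch of the
arithmetic:

#include <stdio.h>

#define RING_PAGE_SIZE 4096u
#define SRING_HDR        64u	/* producer/consumer counters + padding */
#define TX_ENTRY         12u	/* max(request, response) size, netif tx */

/* Round down to a power of two, as __CONST_RING_SIZE does. */
static unsigned int rd_pow2(unsigned int x)
{
	unsigned int r = 1;

	while (r * 2 <= x)
		r *= 2;
	return r;
}

int main(void)
{
	unsigned int pages;

	for (pages = 1; pages <= 4; pages *= 2)
		printf("%u page(s): %u tx slots\n", pages,
		       rd_pow2((pages * RING_PAGE_SIZE - SRING_HDR) / TX_ENTRY));
	return 0;	/* prints 256, 512 and 1024 slots */
}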
 		goto fail;
 	}
 
-	txs = (struct xen_netif_tx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH);
+	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
+			   "max-tx-ring-page-order", "%u",
+			   &max_tx_ring_page_order);
+	if (err < 0) {
+		info->tx_ring_page_order = 0;
+		dev_info(&dev->dev, "single tx ring\n");
+	} else {
+		if (max_tx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) {
+			dev_info(&dev->dev,
+				 "backend ring page order %d too large, clamp to %d\n",
+				 max_tx_ring_page_order,
+				 XENNET_MAX_RING_PAGE_ORDER);
+			max_tx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER;
+		}
+		info->tx_ring_page_order = max_tx_ring_page_order;
+		dev_info(&dev->dev, "multi-page tx ring, order = %d\n",
+			 info->tx_ring_page_order);
+	}
+	info->tx_ring_pages = (1U << info->tx_ring_page_order);
+
+	txs = (struct xen_netif_tx_sring *)
+		__get_free_pages(__GFP_ZERO | GFP_NOIO | __GFP_HIGH,
+				 info->tx_ring_page_order);
 	if (!txs) {
 		err = -ENOMEM;
 		xenbus_dev_fatal(dev, err, "allocating tx ring page");
 		goto fail;
 	}
 	SHARED_RING_INIT(txs);
-	FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE);
+	FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE * info->tx_ring_pages);
+
+	err = xenbus_grant_ring(dev, txs, info->tx_ring_pages,
+				info->tx_ring_ref);
+	if (err < 0)
+		goto grant_tx_ring_fail;
 
-	err = xenbus_grant_ring(dev, txs, 1, grefs);
+	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
+			   "max-rx-ring-page-order", "%u",
+			   &max_rx_ring_page_order);
 	if (err < 0) {
-		free_page((unsigned long)txs);
-		goto fail;
+		info->rx_ring_page_order = 0;
+		dev_info(&dev->dev, "single rx ring\n");
+	} else {
+		if (max_rx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) {
+			dev_info(&dev->dev,
+				 "backend ring page order %d too large, clamp to %d\n",
+				 max_rx_ring_page_order,
+				 XENNET_MAX_RING_PAGE_ORDER);
+			max_rx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER;
+		}
+		info->rx_ring_page_order = max_rx_ring_page_order;
+		dev_info(&dev->dev, "multi-page rx ring, order = %d\n",
+			 info->rx_ring_page_order);
 	}
+	info->rx_ring_pages = (1U << info->rx_ring_page_order);
 
-	info->tx_ring_ref = grefs[0];
-	rxs = (struct xen_netif_rx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH);
+	rxs = (struct xen_netif_rx_sring *)
+		__get_free_pages(__GFP_ZERO | GFP_NOIO | __GFP_HIGH,
+				 info->rx_ring_page_order);
 	if (!rxs) {
 		err = -ENOMEM;
 		xenbus_dev_fatal(dev, err, "allocating rx ring page");
-		goto fail;
+		goto alloc_rx_ring_fail;
 	}
 	SHARED_RING_INIT(rxs);
-	FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE);
+	FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE * info->rx_ring_pages);
 
-	err = xenbus_grant_ring(dev, rxs, 1, grefs);
-	if (err < 0) {
-		free_page((unsigned long)rxs);
-		goto fail;
-	}
-	info->rx_ring_ref = grefs[0];
+	err = xenbus_grant_ring(dev, rxs, info->rx_ring_pages,
+				info->rx_ring_ref);
+	if (err < 0)
+		goto grant_rx_ring_fail;
 
 	err = xenbus_alloc_evtchn(dev, &info->evtchn);
 	if (err)
-		goto fail;
+		goto alloc_evtchn_fail;
 
 	err = bind_evtchn_to_irqhandler(info->evtchn, xennet_interrupt,
 					0, netdev->name, netdev);
 	if (err < 0)
-		goto fail;
+		goto bind_fail;
 	netdev->irq = err;
 	return 0;
 
- fail:
+bind_fail:
+	xenbus_free_evtchn(dev, info->evtchn);
+alloc_evtchn_fail:
+	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->rx.sring);
+grant_rx_ring_fail:
+	free_pages((unsigned long)info->rx.sring, info->rx_ring_page_order);
+alloc_rx_ring_fail:
+	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->tx.sring);
+grant_tx_ring_fail:
+	free_pages((unsigned long)info->tx.sring, info->tx_ring_page_order);
+fail:
 	return err;
 }
 
@@ -1570,6 +1630,7 @@ static int talk_to_netback(struct xenbus_device *dev,
 	const char *message;
 	struct xenbus_transaction xbt;
 	int err;
+	int i;
 
 	/* Create shared ring, alloc event channel. */
 	err = setup_netfront(dev, info);
@@ -1583,18 +1644,58 @@ again:
 		goto destroy_ring;
 	}
 
-	err = xenbus_printf(xbt, dev->nodename, "tx-ring-ref", "%u",
-			    info->tx_ring_ref);
-	if (err) {
-		message = "writing tx ring-ref";
-		goto abort_transaction;
+	if (info->tx_ring_page_order == 0) {
+		err = xenbus_printf(xbt, dev->nodename, "tx-ring-ref", "%u",
+				    info->tx_ring_ref[0]);
+		if (err) {
+			message = "writing tx ring-ref";
+			goto abort_transaction;
+		}
+	} else {
+		err = xenbus_printf(xbt, dev->nodename, "tx-ring-order", "%u",
+				    info->tx_ring_page_order);
+		if (err) {
+			message = "writing tx-ring-order";
+			goto abort_transaction;
+		}
+		for (i = 0; i < info->tx_ring_pages; i++) {
+			char name[sizeof("tx-ring-ref")+3];
+			snprintf(name, sizeof(name), "tx-ring-ref%u", i);
+			err = xenbus_printf(xbt, dev->nodename, name, "%u",
+					    info->tx_ring_ref[i]);
+			if (err) {
+				message = "writing tx ring-ref";
+				goto abort_transaction;
+			}
+		}
 	}
-	err = xenbus_printf(xbt, dev->nodename, "rx-ring-ref", "%u",
-			    info->rx_ring_ref);
-	if (err) {
-		message = "writing rx ring-ref";
-		goto abort_transaction;
+
+	if (info->rx_ring_page_order == 0) {
+		err = xenbus_printf(xbt, dev->nodename, "rx-ring-ref", "%u",
+				    info->rx_ring_ref[0]);
+		if (err) {
+			message = "writing rx ring-ref";
+			goto abort_transaction;
+		}
+	} else {
+		err = xenbus_printf(xbt, dev->nodename, "rx-ring-order", "%u",
+				    info->rx_ring_page_order);
+		if (err) {
+			message = "writing rx-ring-order";
+			goto abort_transaction;
+		}
+		for (i = 0; i < info->rx_ring_pages; i++) {
+			char name[sizeof("rx-ring-ref")+3];
+			snprintf(name, sizeof(name), "rx-ring-ref%u", i);
+			err = xenbus_printf(xbt, dev->nodename, name, "%u",
+					    info->rx_ring_ref[i]);
+			if (err) {
+				message = "writing rx ring-ref";
+				goto abort_transaction;
+			}
+		}
 	}
+
 	err = xenbus_printf(xbt, dev->nodename,
 			    "event-channel", "%u", info->evtchn);
 	if (err) {
@@ -1681,7 +1782,8 @@ static int xennet_connect(struct net_device *dev)
 	xennet_release_tx_bufs(np);
 
 	/* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */
-	for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE; i++) {
+	for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE(np->rx_ring_pages);
+	     i++) {
 		skb_frag_t *frag;
 		const struct page *page;
 		if (!np->rx_skbs[i])
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH 7/8] netback: split event channels support
  2013-02-15 16:00 [PATCH 0/8] Bugfix and mechanical works for Xen network driver Wei Liu
                   ` (12 preceding siblings ...)
  2013-02-15 16:00 ` [PATCH 7/8] netback: split event channels support Wei Liu
@ 2013-02-15 16:00 ` Wei Liu
  2013-03-04 21:22   ` Konrad Rzeszutek Wilk
  2013-03-04 21:22   ` Konrad Rzeszutek Wilk
  2013-02-15 16:00 ` [PATCH 8/8] netfront: " Wei Liu
                   ` (3 subsequent siblings)
  17 siblings, 2 replies; 91+ messages in thread
From: Wei Liu @ 2013-02-15 16:00 UTC (permalink / raw)
  To: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li; +Cc: Wei Liu

Netback and netfront currently use a single event channel for both tx and rx
notifications. This can cause unnecessary wake-ups of the processing
routines. This patch adds a new feature called feature-split-event-channels
to netback, enabling it to handle tx and rx events separately.

Netback uses tx_irq to notify the guest of tx completions and rx_irq for rx
notifications.

If the frontend doesn't support this feature, tx_irq == rx_irq.
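In outline, the binding logic added to xenvif_connect() reduces to the
following condensed sketch of the interface.c hunk below (error handling
and the disable_irq() calls are elided):

	if (tx_evtchn == rx_evtchn) {
		/* Single event channel: one handler covers both tx and rx. */
		err = bind_interdomain_evtchn_to_irqhandler(
			vif->domid, tx_evtchn, xenvif_interrupt, 0,
			vif->dev->name, vif);
		vif->tx_irq = vif->rx_irq = err;
	} else {
		/* Split event channels: dedicated tx and rx handlers. */
		err = bind_interdomain_evtchn_to_irqhandler(
			vif->domid, tx_evtchn, xenvif_tx_interrupt, 0,
			vif->dev->name, vif);
		vif->tx_irq = err;

		err = bind_interdomain_evtchn_to_irqhandler(
			vif->domid, rx_evtchn, xenvif_rx_interrupt, 0,
			vif->dev->name, vif);
		vif->rx_irq = err;
	}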

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/common.h    |   10 +++--
 drivers/net/xen-netback/interface.c |   78 ++++++++++++++++++++++++++++-------
 drivers/net/xen-netback/netback.c   |    7 ++--
 drivers/net/xen-netback/xenbus.c    |   44 ++++++++++++++++----
 4 files changed, 109 insertions(+), 30 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index f541ba9..cc2a9f0 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -63,8 +63,11 @@ struct xenvif {
 
 	u8               fe_dev_addr[6];
 
-	/* Physical parameters of the comms window. */
-	unsigned int     irq;
+	/* Physical parameters of the comms window.
+	 * When feature-split-event-channels = 0, tx_irq = rx_irq.
+	 */
+	unsigned int tx_irq;
+	unsigned int rx_irq;
 
 	/* List of frontends to notify after a batch of frames sent. */
 	struct list_head notify_list;
@@ -122,10 +125,11 @@ struct xenvif *xenvif_alloc(struct device *parent,
 			    domid_t domid,
 			    unsigned int handle);
 
+/* When feature-split-event-channels == 0, tx_evtchn == rx_evtchn */
 int xenvif_connect(struct xenvif *vif,
 		   unsigned long *tx_ring_ref, unsigned int tx_ring_order,
 		   unsigned long *rx_ring_ref, unsigned int rx_ring_order,
-		   unsigned int evtchn);
+		   unsigned int tx_evtchn, unsigned int rx_evtchn);
 void xenvif_disconnect(struct xenvif *vif);
 
 void xenvif_get(struct xenvif *vif);
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index fa4d46d..c9ebe21 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -60,7 +60,8 @@ static int xenvif_rx_schedulable(struct xenvif *vif)
 	return xenvif_schedulable(vif) && !xen_netbk_rx_ring_full(vif);
 }
 
-static irqreturn_t xenvif_interrupt(int irq, void *dev_id)
+/* Tx interrupt handler used when feature-split-event-channels == 1 */
+static irqreturn_t xenvif_tx_interrupt(int tx_irq, void *dev_id)
 {
 	struct xenvif *vif = dev_id;
 
@@ -69,12 +70,31 @@ static irqreturn_t xenvif_interrupt(int irq, void *dev_id)
 
 	xen_netbk_schedule_xenvif(vif);
 
+	return IRQ_HANDLED;
+}
+
+/* Rx interrupt handler used when feature-split-event-channels == 1 */
+static irqreturn_t xenvif_rx_interrupt(int rx_irq, void *dev_id)
+{
+	struct xenvif *vif = dev_id;
+
+	if (vif->netbk == NULL)
+		return IRQ_NONE;
+
 	if (xenvif_rx_schedulable(vif))
 		netif_wake_queue(vif->dev);
 
 	return IRQ_HANDLED;
 }
 
+/* Used when feature-split-event-channels == 0 */
+static irqreturn_t xenvif_interrupt(int irq, void *dev_id)
+{
+	xenvif_tx_interrupt(irq, dev_id);
+	xenvif_rx_interrupt(irq, dev_id);
+	return IRQ_HANDLED;
+}
+
 static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct xenvif *vif = netdev_priv(dev);
@@ -125,13 +145,15 @@ static struct net_device_stats *xenvif_get_stats(struct net_device *dev)
 static void xenvif_up(struct xenvif *vif)
 {
 	xen_netbk_add_xenvif(vif);
-	enable_irq(vif->irq);
+	enable_irq(vif->tx_irq);
+	enable_irq(vif->rx_irq);
 	xen_netbk_check_rx_xenvif(vif);
 }
 
 static void xenvif_down(struct xenvif *vif)
 {
-	disable_irq(vif->irq);
+	disable_irq(vif->tx_irq);
+	disable_irq(vif->rx_irq);
 	del_timer_sync(&vif->credit_timeout);
 	xen_netbk_deschedule_xenvif(vif);
 	xen_netbk_remove_xenvif(vif);
@@ -308,7 +330,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 int xenvif_connect(struct xenvif *vif,
 		   unsigned long *tx_ring_ref, unsigned int tx_ring_ref_count,
 		   unsigned long *rx_ring_ref, unsigned int rx_ring_ref_count,
-		   unsigned int evtchn)
+		   unsigned int tx_evtchn, unsigned int rx_evtchn)
 {
 	int err = -ENOMEM;
 	void *addr;
@@ -317,7 +339,7 @@ int xenvif_connect(struct xenvif *vif,
 	int tmp[NETBK_MAX_RING_PAGES], i;
 
 	/* Already connected through? */
-	if (vif->irq)
+	if (vif->tx_irq)
 		return 0;
 
 	__module_get(THIS_MODULE);
@@ -347,13 +369,32 @@ int xenvif_connect(struct xenvif *vif,
 	BACK_RING_INIT(&vif->rx, rxs, PAGE_SIZE * rx_ring_ref_count);
 	vif->nr_rx_handles = rx_ring_ref_count;
 
-	err = bind_interdomain_evtchn_to_irqhandler(
-		vif->domid, evtchn, xenvif_interrupt, 0,
-		vif->dev->name, vif);
-	if (err < 0)
-		goto err_rx_unmap;
-	vif->irq = err;
-	disable_irq(vif->irq);
+	if (tx_evtchn == rx_evtchn) { /* feature-split-event-channels == 0 */
+		err = bind_interdomain_evtchn_to_irqhandler(
+			vif->domid, tx_evtchn, xenvif_interrupt, 0,
+			vif->dev->name, vif);
+		if (err < 0)
+			goto err_rx_unmap;
+		vif->tx_irq = vif->rx_irq = err;
+		disable_irq(vif->tx_irq);
+		disable_irq(vif->rx_irq);
+	} else { /* feature-split-event-channels == 1 */
+		err = bind_interdomain_evtchn_to_irqhandler(
+			vif->domid, tx_evtchn, xenvif_tx_interrupt, 0,
+			vif->dev->name, vif);
+		if (err < 0)
+			goto err_rx_unmap;
+		vif->tx_irq = err;
+		disable_irq(vif->tx_irq);
+
+		err = bind_interdomain_evtchn_to_irqhandler(
+			vif->domid, rx_evtchn, xenvif_rx_interrupt, 0,
+			vif->dev->name, vif);
+		if (err < 0)
+			goto err_tx_unbind;
+		vif->rx_irq = err;
+		disable_irq(vif->rx_irq);
+	}
 
 	xenvif_get(vif);
 
@@ -367,6 +408,10 @@ int xenvif_connect(struct xenvif *vif,
 	rtnl_unlock();
 
 	return 0;
+
+err_tx_unbind:
+	unbind_from_irqhandler(vif->tx_irq, vif);
+	vif->tx_irq = 0;
 err_rx_unmap:
 	xen_netbk_unmap_frontend_rings(vif, (void *)vif->rx.sring);
 	vif->nr_rx_handles = 0;
@@ -406,8 +451,13 @@ void xenvif_disconnect(struct xenvif *vif)
 	atomic_dec(&vif->refcnt);
 	wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
 
-	if (vif->irq) {
-		unbind_from_irqhandler(vif->irq, vif);
+	if (vif->tx_irq) {
+		if (vif->tx_irq == vif->rx_irq)
+			unbind_from_irqhandler(vif->tx_irq, vif);
+		else {
+			unbind_from_irqhandler(vif->tx_irq, vif);
+			unbind_from_irqhandler(vif->rx_irq, vif);
+		}
 		need_module_put = 1;
 	}
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 644c760..5ac8c35 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -639,7 +639,7 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
 {
 	struct xenvif *vif = NULL, *tmp;
 	s8 status;
-	u16 irq, flags;
+	u16 flags;
 	struct xen_netif_rx_response *resp;
 	struct sk_buff_head rxq;
 	struct sk_buff *skb;
@@ -750,7 +750,6 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
 					 sco->meta_slots_used);
 
 		RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&vif->rx, ret);
-		irq = vif->irq;
 		if (ret && list_empty(&vif->notify_list))
 			list_add_tail(&vif->notify_list, &notify);
 
@@ -762,7 +761,7 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
 	}
 
 	list_for_each_entry_safe(vif, tmp, &notify, notify_list) {
-		notify_remote_via_irq(vif->irq);
+		notify_remote_via_irq(vif->rx_irq);
 		list_del_init(&vif->notify_list);
 	}
 
@@ -1595,7 +1594,7 @@ static void make_tx_response(struct xenvif *vif,
 	vif->tx.rsp_prod_pvt = ++i;
 	RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&vif->tx, notify);
 	if (notify)
-		notify_remote_via_irq(vif->irq);
+		notify_remote_via_irq(vif->tx_irq);
 }
 
 static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif,
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 1791807..6822d89 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -141,6 +141,15 @@ static int netback_probe(struct xenbus_device *dev,
 			goto abort_transaction;
 		}
 
+		/* Split event channels support */
+		err = xenbus_printf(xbt, dev->nodename,
+				    "feature-split-event-channels",
+				    "%u", 1);
+		if (err) {
+			message = "writing feature-split-event-channels";
+			goto abort_transaction;
+		}
+
 		err = xenbus_transaction_end(xbt, 0);
 	} while (err == -EAGAIN);
 
@@ -419,7 +428,7 @@ static int connect_rings(struct backend_info *be)
 {
 	struct xenvif *vif = be->vif;
 	struct xenbus_device *dev = be->dev;
-	unsigned int evtchn, rx_copy;
+	unsigned int tx_evtchn, rx_evtchn, rx_copy;
 	int err;
 	int val;
 	unsigned long tx_ring_ref[NETBK_MAX_RING_PAGES];
@@ -428,12 +437,22 @@ static int connect_rings(struct backend_info *be)
 	unsigned int  rx_ring_order;
 
 	err = xenbus_gather(XBT_NIL, dev->otherend,
-			    "event-channel", "%u", &evtchn, NULL);
+			    "event-channel", "%u", &tx_evtchn, NULL);
 	if (err) {
-		xenbus_dev_fatal(dev, err,
-				 "reading %s/event-channel",
-				 dev->otherend);
-		return err;
+		/* try split event channels */
+		err = xenbus_gather(XBT_NIL, dev->otherend,
+				    "event-channel-tx", "%u", &tx_evtchn,
+				    "event-channel-rx", "%u", &rx_evtchn,
+				    NULL);
+		if (err) {
+			xenbus_dev_fatal(dev, err,
+					 "reading %s/event-channel(-tx/rx)",
+					 dev->otherend);
+			return err;
+		}
+	} else { /* frontend doesn't support split event channels */
+		rx_evtchn = tx_evtchn;
+		dev_info(&dev->dev, "single event channel\n");
 	}
 
 	err = xenbus_scanf(XBT_NIL, dev->otherend, "tx-ring-order", "%u",
@@ -568,12 +587,19 @@ static int connect_rings(struct backend_info *be)
 	/* Map the shared frame, irq etc. */
 	err = xenvif_connect(vif, tx_ring_ref, (1U << tx_ring_order),
 			     rx_ring_ref, (1U << rx_ring_order),
-			     evtchn);
+			     tx_evtchn, rx_evtchn);
 	if (err) {
 		/* construct 1 2 3 / 4 5 6 */
 		int i;
 		char txs[3 * (1U << MODPARM_netback_max_tx_ring_page_order)];
 		char rxs[3 * (1U << MODPARM_netback_max_rx_ring_page_order)];
+		char evtchns[20];
+
+		if (tx_evtchn == rx_evtchn)
+			snprintf(evtchns, sizeof(evtchns)-1, "%u", tx_evtchn);
+		else
+			snprintf(evtchns, sizeof(evtchns)-1, "%u/%u",
+				 tx_evtchn, rx_evtchn);
 
 		txs[0] = rxs[0] = 0;
 
@@ -586,8 +612,8 @@ static int connect_rings(struct backend_info *be)
 				 " %lu", rx_ring_ref[i]);
 
 		xenbus_dev_fatal(dev, err,
-				 "mapping shared-frames%s /%s port %u",
-				 txs, rxs, evtchn);
+				 "mapping shared-frames%s /%s port %s",
+				 txs, rxs, evtchns);
 		return err;
 	}
 	return 0;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH 8/8] netfront: split event channels support
  2013-02-15 16:00 [PATCH 0/8] Bugfix and mechanical works for Xen network driver Wei Liu
                   ` (14 preceding siblings ...)
  2013-02-15 16:00 ` [PATCH 8/8] netfront: " Wei Liu
@ 2013-02-15 16:00 ` Wei Liu
  2013-03-04 21:24   ` Konrad Rzeszutek Wilk
  2013-03-04 21:24   ` Konrad Rzeszutek Wilk
  2013-02-26  3:07 ` [PATCH 0/8] Bugfix and mechanical works for Xen network driver ANNIE LI
  2013-02-26  3:07 ` [Xen-devel] " ANNIE LI
  17 siblings, 2 replies; 91+ messages in thread
From: Wei Liu @ 2013-02-15 16:00 UTC (permalink / raw)
  To: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li; +Cc: Wei Liu

If this feature is not activated, rx_irq == tx_irq. See the corresponding
netback change log for details.
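
In outline, the frontend side of the negotiation reduces to the following
condensed sketch of the setup_netfront() and talk_to_netback() hunks below
(error handling elided):

	unsigned int feature_split_evtchn;

	/* Probe the backend; a missing key means "not supported". */
	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
			   "feature-split-event-channels", "%u",
			   &feature_split_evtchn);
	if (err < 0)
		feature_split_evtchn = 0;

	/* Allocate and bind one or two event channels accordingly. */
	if (feature_split_evtchn)
		err = setup_netfront_split(info);  /* tx_evtchn != rx_evtchn */
	else
		err = setup_netfront_single(info); /* tx_evtchn == rx_evtchn */

	/* talk_to_netback() then writes either "event-channel" or the
	 * "event-channel-tx"/"event-channel-rx" pair to xenstore.
	 */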

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netfront.c |  184 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 152 insertions(+), 32 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index de73a71..ea9b656 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -100,7 +100,12 @@ struct netfront_info {
 
 	struct napi_struct napi;
 
-	unsigned int evtchn;
+	/* 
+	 * Split event channels support, tx_* == rx_* when using
+	 * single event channel.
+	 */
+	unsigned int tx_evtchn, rx_evtchn;
+	unsigned int tx_irq, rx_irq;
 	struct xenbus_device *xbdev;
 
 	spinlock_t   tx_lock;
@@ -344,7 +349,7 @@ no_skb:
  push:
 	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&np->rx, notify);
 	if (notify)
-		notify_remote_via_irq(np->netdev->irq);
+		notify_remote_via_irq(np->rx_irq);
 }
 
 static int xennet_open(struct net_device *dev)
@@ -633,7 +638,7 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&np->tx, notify);
 	if (notify)
-		notify_remote_via_irq(np->netdev->irq);
+		notify_remote_via_irq(np->tx_irq);
 
 	u64_stats_update_begin(&stats->syncp);
 	stats->tx_bytes += skb->len;
@@ -1263,26 +1268,41 @@ static int xennet_set_features(struct net_device *dev,
 	return 0;
 }
 
-static irqreturn_t xennet_interrupt(int irq, void *dev_id)
+/* Used for tx completion */
+static irqreturn_t xennet_tx_interrupt(int tx_irq, void *dev_id)
 {
-	struct net_device *dev = dev_id;
-	struct netfront_info *np = netdev_priv(dev);
+	struct netfront_info *np = dev_id;
+	struct net_device *dev = np->netdev;
 	unsigned long flags;
 
 	spin_lock_irqsave(&np->tx_lock, flags);
+	xennet_tx_buf_gc(dev);
+	spin_unlock_irqrestore(&np->tx_lock, flags);
 
-	if (likely(netif_carrier_ok(dev))) {
-		xennet_tx_buf_gc(dev);
-		/* Under tx_lock: protects access to rx shared-ring indexes. */
-		if (RING_HAS_UNCONSUMED_RESPONSES(&np->rx))
-			napi_schedule(&np->napi);
-	}
+	return IRQ_HANDLED;
+}
 
-	spin_unlock_irqrestore(&np->tx_lock, flags);
+/* Used for rx */
+static irqreturn_t xennet_rx_interrupt(int rx_irq, void *dev_id)
+{
+	struct netfront_info *np = dev_id;
+	struct net_device *dev = np->netdev;
+
+	if (likely(netif_carrier_ok(dev) &&
+		   RING_HAS_UNCONSUMED_RESPONSES(&np->rx)))
+		napi_schedule(&np->napi);
 
 	return IRQ_HANDLED;
 }
 
+/* Used for single event channel configuration */
+static irqreturn_t xennet_interrupt(int irq, void *dev_id)
+{
+	xennet_tx_interrupt(irq, dev_id);
+	xennet_rx_interrupt(irq, dev_id);
+	return IRQ_HANDLED;
+}
+
 #ifdef CONFIG_NET_POLL_CONTROLLER
 static void xennet_poll_controller(struct net_device *dev)
 {
@@ -1451,9 +1471,14 @@ static void xennet_disconnect_backend(struct netfront_info *info)
 	spin_unlock_irq(&info->tx_lock);
 	spin_unlock_bh(&info->rx_lock);
 
-	if (info->netdev->irq)
-		unbind_from_irqhandler(info->netdev->irq, info->netdev);
-	info->evtchn = info->netdev->irq = 0;
+	if (info->tx_irq && (info->tx_irq == info->rx_irq))
+		unbind_from_irqhandler(info->tx_irq, info);
+	if (info->tx_irq && (info->tx_irq != info->rx_irq)) {
+		unbind_from_irqhandler(info->tx_irq, info);
+		unbind_from_irqhandler(info->rx_irq, info);
+	}
+	info->tx_evtchn = info->rx_evtchn = 0;
+	info->tx_irq = info->rx_irq = 0;
 
 	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->tx.sring);
 	free_pages((unsigned long)info->tx.sring, info->tx_ring_page_order);
@@ -1503,11 +1528,86 @@ static int xen_net_read_mac(struct xenbus_device *dev, u8 mac[])
 	return 0;
 }
 
+static int setup_netfront_single(struct netfront_info *info)
+{
+	int err;
+
+	err = xenbus_alloc_evtchn(info->xbdev, &info->tx_evtchn);
+	if (err < 0)
+		goto fail;
+
+	err = bind_evtchn_to_irqhandler(info->tx_evtchn,
+					xennet_interrupt,
+					0, info->netdev->name, info);
+	if (err < 0)
+		goto bind_fail;
+	info->rx_evtchn = info->tx_evtchn;
+	info->rx_irq = info->tx_irq = err;
+	dev_info(&info->xbdev->dev,
+		 "single event channel, evtchn = %d, irq = %d\n",
+		 info->tx_evtchn, info->tx_irq);
+
+	return 0;
+
+bind_fail:
+	xenbus_free_evtchn(info->xbdev, info->tx_evtchn);
+	info->tx_evtchn = 0;
+fail:
+	return err;
+}
+
+static int setup_netfront_split(struct netfront_info *info)
+{
+	int err;
+
+	err = xenbus_alloc_evtchn(info->xbdev, &info->tx_evtchn);
+	if (err)
+		goto fail;
+	err = xenbus_alloc_evtchn(info->xbdev, &info->rx_evtchn);
+	if (err)
+		goto alloc_rx_evtchn_fail;
+
+	err = bind_evtchn_to_irqhandler(info->tx_evtchn,
+					xennet_tx_interrupt,
+					0, info->netdev->name, info);
+	if (err < 0)
+		goto bind_tx_fail;
+	info->tx_irq = err;
+
+	err = bind_evtchn_to_irqhandler(info->rx_evtchn,
+					xennet_rx_interrupt,
+					0, info->netdev->name, info);
+	if (err < 0)
+		goto bind_rx_fail;
+
+	info->rx_irq = err;
+
+	dev_info(&info->xbdev->dev,
+		 "split event channels, tx_evtchn/irq = %d/%d, rx_evtchn/irq = %d/%d",
+		 info->tx_evtchn, info->tx_irq,
+		 info->rx_evtchn, info->rx_irq);
+
+	return 0;
+
+bind_rx_fail:
+	unbind_from_irqhandler(info->tx_irq, info);
+	info->tx_irq = 0;
+bind_tx_fail:
+	xenbus_free_evtchn(info->xbdev, info->rx_evtchn);
+	info->rx_evtchn = 0;
+alloc_rx_evtchn_fail:
+	xenbus_free_evtchn(info->xbdev, info->tx_evtchn);
+	info->tx_evtchn = 0;
+fail:
+	return err;
+}
+
 static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
 {
 	struct xen_netif_tx_sring *txs;
 	struct xen_netif_rx_sring *rxs;
 	int err;
+	unsigned int feature_split_evtchn;
 	struct net_device *netdev = info->netdev;
 	unsigned int max_tx_ring_page_order, max_rx_ring_page_order;
 	int i;
@@ -1527,6 +1627,12 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
 	}
 
 	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
+			   "feature-split-event-channels", "%u",
+			   &feature_split_evtchn);
+	if (err < 0)
+		feature_split_evtchn = 0;
+
+	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
 			   "max-tx-ring-page-order", "%u",
 			   &max_tx_ring_page_order);
 	if (err < 0) {
@@ -1598,20 +1704,17 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
 	if (err < 0)
 		goto grant_rx_ring_fail;
 
-	err = xenbus_alloc_evtchn(dev, &info->evtchn);
+	if (feature_split_evtchn)
+		err = setup_netfront_split(info);
+	else
+		err = setup_netfront_single(info);
+
 	if (err)
-		goto alloc_evtchn_fail;
+		goto setup_evtchn_fail;
 
-	err = bind_evtchn_to_irqhandler(info->evtchn, xennet_interrupt,
-					0, netdev->name, netdev);
-	if (err < 0)
-		goto bind_fail;
-	netdev->irq = err;
 	return 0;
 
-bind_fail:
-	xenbus_free_evtchn(dev, info->evtchn);
-alloc_evtchn_fail:
+setup_evtchn_fail:
 	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->rx.sring);
 grant_rx_ring_fail:
 	free_pages((unsigned long)info->rx.sring, info->rx_ring_page_order);
@@ -1696,11 +1799,26 @@ again:
 		}
 	}
 
-	err = xenbus_printf(xbt, dev->nodename,
-			    "event-channel", "%u", info->evtchn);
-	if (err) {
-		message = "writing event-channel";
-		goto abort_transaction;
+	if (info->tx_evtchn == info->rx_evtchn) {
+		err = xenbus_printf(xbt, dev->nodename,
+				    "event-channel", "%u", info->tx_evtchn);
+		if (err) {
+			message = "writing event-channel";
+			goto abort_transaction;
+		}
+	} else {
+		err = xenbus_printf(xbt, dev->nodename,
+				    "event-channel-tx", "%u", info->tx_evtchn);
+		if (err) {
+			message = "writing event-channel-tx";
+			goto abort_transaction;
+		}
+		err = xenbus_printf(xbt, dev->nodename,
+				    "event-channel-rx", "%u", info->rx_evtchn);
+		if (err) {
+			message = "writing event-channel-rx";
+			goto abort_transaction;
+		}
 	}
 
 	err = xenbus_printf(xbt, dev->nodename, "request-rx-copy", "%u",
@@ -1814,7 +1932,9 @@ static int xennet_connect(struct net_device *dev)
 	 * packets.
 	 */
 	netif_carrier_on(np->netdev);
-	notify_remote_via_irq(np->netdev->irq);
+	notify_remote_via_irq(np->tx_irq);
+	if (np->tx_irq != np->rx_irq)
+		notify_remote_via_irq(np->rx_irq);
 	xennet_tx_buf_gc(dev);
 	xennet_alloc_rx_buffers(dev);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring
  2013-02-15 16:00 ` [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring Wei Liu
  2013-02-15 16:17   ` Jan Beulich
@ 2013-02-15 16:17   ` Jan Beulich
  2013-02-15 16:33     ` Wei Liu
  2013-02-15 16:33     ` Wei Liu
  2013-03-04 21:12   ` Konrad Rzeszutek Wilk
                     ` (3 subsequent siblings)
  5 siblings, 2 replies; 91+ messages in thread
From: Jan Beulich @ 2013-02-15 16:17 UTC (permalink / raw)
  To: Wei Liu
  Cc: ian.campbell, Roger Pau Monne, Stefano Stabellini, xen-devel,
	annie.li, konrad.wilk, netdev

>>> On 15.02.13 at 17:00, Wei Liu <wei.liu2@citrix.com> wrote:
> Also bundle fixes for xen frontends and backends in this patch.

Could you at least enumerate those fixes in the patch description?

Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring
  2013-02-15 16:17   ` [Xen-devel] " Jan Beulich
@ 2013-02-15 16:33     ` Wei Liu
  2013-02-15 16:59       ` Jan Beulich
  2013-02-15 16:59       ` [Xen-devel] " Jan Beulich
  2013-02-15 16:33     ` Wei Liu
  1 sibling, 2 replies; 91+ messages in thread
From: Wei Liu @ 2013-02-15 16:33 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian Campbell, Roger Pau Monne, Stefano Stabellini,
	xen-devel, annie.li, konrad.wilk, netdev

On Fri, 2013-02-15 at 16:17 +0000, Jan Beulich wrote:
> >>> On 15.02.13 at 17:00, Wei Liu <wei.liu2@citrix.com> wrote:
> > Also bundle fixes for xen frontends and backends in this patch.
> 
> Could you at least enumerate those fixes in the patch description?
> 
> Jan
> 

Mostly mechanical fixes to adapt to new xenbus client interface, which
includes: xen-pciback, xen-pcifront, xen-blkback, xen-blkfront,
xen-netback, xen-netfront.

The above description will be added to patch description.

Wei.

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring
  2013-02-15 16:33     ` Wei Liu
  2013-02-15 16:59       ` Jan Beulich
@ 2013-02-15 16:59       ` Jan Beulich
  2013-02-15 17:01         ` Wei Liu
  2013-02-15 17:01         ` [Xen-devel] " Wei Liu
  1 sibling, 2 replies; 91+ messages in thread
From: Jan Beulich @ 2013-02-15 16:59 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ian Campbell, Roger Pau Monne, Stefano Stabellini, xen-devel,
	annie.li, konrad.wilk, netdev

>>> On 15.02.13 at 17:33, Wei Liu <wei.liu2@citrix.com> wrote:
> On Fri, 2013-02-15 at 16:17 +0000, Jan Beulich wrote:
>> >>> On 15.02.13 at 17:00, Wei Liu <wei.liu2@citrix.com> wrote:
>> > Also bundle fixes for xen frontends and backends in this patch.
>> 
>> Could you at least enumerate those fixes in the patch description?
> 
> Mostly mechanical fixes to adapt to new xenbus client interface, which
> includes: xen-pciback, xen-pcifront, xen-blkback, xen-blkfront,
> xen-netback, xen-netfront.

But "fixes" and "changes" are two different terms.

Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring
  2013-02-15 16:59       ` [Xen-devel] " Jan Beulich
  2013-02-15 17:01         ` Wei Liu
@ 2013-02-15 17:01         ` Wei Liu
  1 sibling, 0 replies; 91+ messages in thread
From: Wei Liu @ 2013-02-15 17:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian Campbell, Roger Pau Monne, Stefano Stabellini,
	xen-devel, annie.li, konrad.wilk, netdev

On Fri, 2013-02-15 at 16:59 +0000, Jan Beulich wrote:
> >>> On 15.02.13 at 17:33, Wei Liu <wei.liu2@citrix.com> wrote:
> > On Fri, 2013-02-15 at 16:17 +0000, Jan Beulich wrote:
> >> >>> On 15.02.13 at 17:00, Wei Liu <wei.liu2@citrix.com> wrote:
> >> > Also bundle fixes for xen frontends and backends in this patch.
> >> 
> >> Could you at least enumerate those fixes in the patch description?
> > 
> > Mostly mechanical fixes to adapt to new xenbus client interface, which
> > includes: xen-pciback, xen-pcifront, xen-blkback, xen-blkfront,
> > xen-netback, xen-netfront.
> 
> But "fixes" and "changes" are two different terms.
> 

I "fixed" the build breakage. :-)


Wei.

> Jan
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 0/8] Bugfix and mechanical works for Xen network driver
  2013-02-15 16:00 [PATCH 0/8] Bugfix and mechanical works for Xen network driver Wei Liu
                   ` (16 preceding siblings ...)
  2013-02-26  3:07 ` [PATCH 0/8] Bugfix and mechanical works for Xen network driver ANNIE LI
@ 2013-02-26  3:07 ` ANNIE LI
  2013-02-26 11:33   ` Wei Liu
  2013-02-26 11:33   ` Wei Liu
  17 siblings, 2 replies; 91+ messages in thread
From: ANNIE LI @ 2013-02-26  3:07 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, ian.campbell, konrad.wilk

What version are these patches based on?
I tried v3.8-rc7 and 3.8-rc6; patches 3/8, 4/8 ... cannot be merged
successfully. Can you rebase the series?

Thanks
Annie

On 2013-2-16 0:00, Wei Liu wrote:
> This patch series contains a small fix plus mechanical works for xen network
> driver.
>
>   * bug fix: don't bind kthread to specific cpu core
>   * allow host admin to unload netback
>   * multi-page ring support
>   * split event channels support
>

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 6/8] netfront: multi-page ring support
  2013-02-15 16:00 ` [PATCH 6/8] netfront: " Wei Liu
@ 2013-02-26  6:52   ` ANNIE LI
  2013-02-26 12:35     ` Wei Liu
  2013-02-26 12:35     ` Wei Liu
  2013-02-26  6:52   ` ANNIE LI
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 91+ messages in thread
From: ANNIE LI @ 2013-02-26  6:52 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, ian.campbell, konrad.wilk



On 2013-2-16 0:00, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>   drivers/net/xen-netfront.c |  246 +++++++++++++++++++++++++++++++-------------
>   1 file changed, 174 insertions(+), 72 deletions(-)
>
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 8bd75a1..de73a71 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -67,9 +67,19 @@ struct netfront_cb {
>
>   #define GRANT_INVALID_REF	0
>
> -#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> -#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> -#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE, 256)
> +#define XENNET_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER
> +#define XENNET_MAX_RING_PAGES      (1U << XENNET_MAX_RING_PAGE_ORDER)
> +
> +
> +#define NET_TX_RING_SIZE(_nr_pages)			\
> +	__CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages))
> +#define NET_RX_RING_SIZE(_nr_pages)			\
> +	__CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE * (_nr_pages))
> +
> +#define XENNET_MAX_TX_RING_SIZE NET_TX_RING_SIZE(XENNET_MAX_RING_PAGES)
> +#define XENNET_MAX_RX_RING_SIZE NET_RX_RING_SIZE(XENNET_MAX_RING_PAGES)
> +
> +#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE(1), 256)

Not using a multi-page ring here?
In xennet_create_dev, gnttab_alloc_grant_references allocates
TX_MAX_TARGET grant references for tx. In xennet_release_tx_bufs,
NET_TX_RING_SIZE(np->tx_ring_pages) grants are processed, and
NET_TX_RING_SIZE(np->tx_ring_pages) is totally different from
TX_MAX_TARGET if np->tx_ring_pages is not 1. Although skb_entry_is_link
helps to not release invalid grants, lots of null loops seem
unnecessary. I think TX_MAX_TARGET should be changed into some variable
connected with np->tx_ring_pages. Or did you intend to use a one-page
ring here?
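
Something like the following is what I have in mind (just a sketch, not
tested; tx_max_target would be a new field in struct netfront_info, set
after np->tx_ring_pages has been negotiated with the backend):

	/* Scale the tx target with the negotiated number of ring pages
	 * instead of hard-coding NET_TX_RING_SIZE(1).
	 */
	np->tx_max_target = min_t(int,
				  NET_TX_RING_SIZE(np->tx_ring_pages), 256);

	/* ...and allocate a matching number of tx grant references, so
	 * the allocation in xennet_create_dev and the release loop in
	 * xennet_release_tx_bufs agree on the ring size.
	 */
	if (gnttab_alloc_grant_references(np->tx_max_target,
					  &np->gref_tx_head) < 0) {
		pr_alert("can't alloc tx grant refs\n");
		goto exit;	/* error path as in xennet_create_dev */
	}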

>
>   struct netfront_stats {
>   	u64			rx_packets;
> @@ -80,6 +90,11 @@ struct netfront_stats {
>   };
>
>   struct netfront_info {
> +	/* Statistics */
> +	struct netfront_stats __percpu *stats;
> +
> +	unsigned long rx_gso_checksum_fixup;
> +
>   	struct list_head list;
>   	struct net_device *netdev;
>
> @@ -90,7 +105,9 @@ struct netfront_info {
>
>   	spinlock_t   tx_lock;
>   	struct xen_netif_tx_front_ring tx;
> -	int tx_ring_ref;
> +	int tx_ring_ref[XENNET_MAX_RING_PAGES];
> +	unsigned int tx_ring_page_order;
> +	unsigned int tx_ring_pages;
>
>   	/*
>   	 * {tx,rx}_skbs store outstanding skbuffs. Free tx_skb entries
> @@ -104,36 +121,33 @@ struct netfront_info {
>   	union skb_entry {
>   		struct sk_buff *skb;
>   		unsigned long link;
> -	} tx_skbs[NET_TX_RING_SIZE];
> +	} tx_skbs[XENNET_MAX_TX_RING_SIZE];
>   	grant_ref_t gref_tx_head;
> -	grant_ref_t grant_tx_ref[NET_TX_RING_SIZE];
> +	grant_ref_t grant_tx_ref[XENNET_MAX_TX_RING_SIZE];
>   	unsigned tx_skb_freelist;
>
>   	spinlock_t   rx_lock ____cacheline_aligned_in_smp;
>   	struct xen_netif_rx_front_ring rx;
> -	int rx_ring_ref;
> +	int rx_ring_ref[XENNET_MAX_RING_PAGES];
> +	unsigned int rx_ring_page_order;
> +	unsigned int rx_ring_pages;
>
>   	/* Receive-ring batched refills. */
>   #define RX_MIN_TARGET 8
>   #define RX_DFL_MIN_TARGET 64
> -#define RX_MAX_TARGET min_t(int, NET_RX_RING_SIZE, 256)
> +#define RX_MAX_TARGET min_t(int, NET_RX_RING_SIZE(1), 256)

Not using a multi-page ring here?
(See the comments on the tx side above.)

Thanks
Annie

>   	unsigned rx_min_target, rx_max_target, rx_target;
>   	struct sk_buff_head rx_batch;
>
>   	struct timer_list rx_refill_timer;
>
> -	struct sk_buff *rx_skbs[NET_RX_RING_SIZE];
> +	struct sk_buff *rx_skbs[XENNET_MAX_RX_RING_SIZE];
>   	grant_ref_t gref_rx_head;
> -	grant_ref_t grant_rx_ref[NET_RX_RING_SIZE];
> -
> -	unsigned long rx_pfn_array[NET_RX_RING_SIZE];
> -	struct multicall_entry rx_mcl[NET_RX_RING_SIZE+1];
> -	struct mmu_update rx_mmu[NET_RX_RING_SIZE];
> -
> -	/* Statistics */
> -	struct netfront_stats __percpu *stats;
> +	grant_ref_t grant_rx_ref[XENNET_MAX_RX_RING_SIZE];
>
> -	unsigned long rx_gso_checksum_fixup;
> +	unsigned long rx_pfn_array[XENNET_MAX_RX_RING_SIZE];
> +	struct multicall_entry rx_mcl[XENNET_MAX_RX_RING_SIZE+1];
> +	struct mmu_update rx_mmu[XENNET_MAX_RX_RING_SIZE];
>   };
>
>   struct netfront_rx_info {
> @@ -171,15 +185,15 @@ static unsigned short get_id_from_freelist(unsigned *head,
>   	return id;
>   }
>
> -static int xennet_rxidx(RING_IDX idx)
> +static int xennet_rxidx(RING_IDX idx, struct netfront_info *info)
>   {
> -	return idx & (NET_RX_RING_SIZE - 1);
> +	return idx & (NET_RX_RING_SIZE(info->rx_ring_pages) - 1);
>   }
>
>   static struct sk_buff *xennet_get_rx_skb(struct netfront_info *np,
>   					 RING_IDX ri)
>   {
> -	int i = xennet_rxidx(ri);
> +	int i = xennet_rxidx(ri, np);
>   	struct sk_buff *skb = np->rx_skbs[i];
>   	np->rx_skbs[i] = NULL;
>   	return skb;
> @@ -188,7 +202,7 @@ static struct sk_buff *xennet_get_rx_skb(struct netfront_info *np,
>   static grant_ref_t xennet_get_rx_ref(struct netfront_info *np,
>   					    RING_IDX ri)
>   {
> -	int i = xennet_rxidx(ri);
> +	int i = xennet_rxidx(ri, np);
>   	grant_ref_t ref = np->grant_rx_ref[i];
>   	np->grant_rx_ref[i] = GRANT_INVALID_REF;
>   	return ref;
> @@ -301,7 +315,7 @@ no_skb:
>
>   		skb->dev = dev;
>
> -		id = xennet_rxidx(req_prod + i);
> +		id = xennet_rxidx(req_prod + i, np);
>
>   		BUG_ON(np->rx_skbs[id]);
>   		np->rx_skbs[id] = skb;
> @@ -653,7 +667,7 @@ static int xennet_close(struct net_device *dev)
>   static void xennet_move_rx_slot(struct netfront_info *np, struct sk_buff *skb,
>   				grant_ref_t ref)
>   {
> -	int new = xennet_rxidx(np->rx.req_prod_pvt);
> +	int new = xennet_rxidx(np->rx.req_prod_pvt, np);
>
>   	BUG_ON(np->rx_skbs[new]);
>   	np->rx_skbs[new] = skb;
> @@ -1109,7 +1123,7 @@ static void xennet_release_tx_bufs(struct netfront_info *np)
>   	struct sk_buff *skb;
>   	int i;
>
> -	for (i = 0; i < NET_TX_RING_SIZE; i++) {
> +	for (i = 0; i < NET_TX_RING_SIZE(np->tx_ring_pages); i++) {
>   		/* Skip over entries which are actually freelist references */
>   		if (skb_entry_is_link(&np->tx_skbs[i]))
>   			continue;
> @@ -1143,7 +1157,7 @@ static void xennet_release_rx_bufs(struct netfront_info *np)
>
>   	spin_lock_bh(&np->rx_lock);
>
> -	for (id = 0; id < NET_RX_RING_SIZE; id++) {
> +	for (id = 0; id < NET_RX_RING_SIZE(np->rx_ring_pages); id++) {
>   		ref = np->grant_rx_ref[id];
>   		if (ref == GRANT_INVALID_REF) {
>   			unused++;
> @@ -1324,13 +1338,13 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev)
>
>   	/* Initialise tx_skbs as a free chain containing every entry. */
>   	np->tx_skb_freelist = 0;
> -	for (i = 0; i < NET_TX_RING_SIZE; i++) {
> +	for (i = 0; i < XENNET_MAX_TX_RING_SIZE; i++) {
>   		skb_entry_set_link(&np->tx_skbs[i], i+1);
>   		np->grant_tx_ref[i] = GRANT_INVALID_REF;
>   	}
>
>   	/* Clear out rx_skbs */
> -	for (i = 0; i < NET_RX_RING_SIZE; i++) {
> +	for (i = 0; i < XENNET_MAX_RX_RING_SIZE; i++) {
>   		np->rx_skbs[i] = NULL;
>   		np->grant_rx_ref[i] = GRANT_INVALID_REF;
>   	}
> @@ -1428,13 +1442,6 @@ static int netfront_probe(struct xenbus_device *dev,
>   	return err;
>   }
>
> -static void xennet_end_access(int ref, void *page)
> -{
> -	/* This frees the page as a side-effect */
> -	if (ref != GRANT_INVALID_REF)
> -		gnttab_end_foreign_access(ref, 0, (unsigned long)page);
> -}
> -
>   static void xennet_disconnect_backend(struct netfront_info *info)
>   {
>   	/* Stop old i/f to prevent errors whilst we rebuild the state. */
> @@ -1448,12 +1455,12 @@ static void xennet_disconnect_backend(struct netfront_info *info)
>   		unbind_from_irqhandler(info->netdev->irq, info->netdev);
>   	info->evtchn = info->netdev->irq = 0;
>
> -	/* End access and free the pages */
> -	xennet_end_access(info->tx_ring_ref, info->tx.sring);
> -	xennet_end_access(info->rx_ring_ref, info->rx.sring);
> +	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->tx.sring);
> +	free_pages((unsigned long)info->tx.sring, info->tx_ring_page_order);
> +
> +	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->rx.sring);
> +	free_pages((unsigned long)info->rx.sring, info->rx_ring_page_order);
>
> -	info->tx_ring_ref = GRANT_INVALID_REF;
> -	info->rx_ring_ref = GRANT_INVALID_REF;
>   	info->tx.sring = NULL;
>   	info->rx.sring = NULL;
>   }
> @@ -1501,11 +1508,14 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
>   	struct xen_netif_tx_sring *txs;
>   	struct xen_netif_rx_sring *rxs;
>   	int err;
> -	int grefs[1];
>   	struct net_device *netdev = info->netdev;
> +	unsigned int max_tx_ring_page_order, max_rx_ring_page_order;
> +	int i;
>
> -	info->tx_ring_ref = GRANT_INVALID_REF;
> -	info->rx_ring_ref = GRANT_INVALID_REF;
> +	for (i = 0; i < XENNET_MAX_RING_PAGES; i++) {
> +		info->tx_ring_ref[i] = GRANT_INVALID_REF;
> +		info->rx_ring_ref[i] = GRANT_INVALID_REF;
> +	}
>   	info->rx.sring = NULL;
>   	info->tx.sring = NULL;
>   	netdev->irq = 0;
> @@ -1516,50 +1526,100 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
>   		goto fail;
>   	}
>
> -	txs = (struct xen_netif_tx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH);
> +	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
> +			   "max-tx-ring-page-order", "%u",
> +			&max_tx_ring_page_order);
> +	if (err < 0) {
> +		info->tx_ring_page_order = 0;
> +		dev_info(&dev->dev, "single tx ring\n");
> +	} else {
> +		if (max_tx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) {
> +			dev_info(&dev->dev,
> +				 "backend ring page order %d too large, clamp to %d\n",
> +				 max_tx_ring_page_order,
> +				 XENNET_MAX_RING_PAGE_ORDER);
> +			max_tx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER;
> +		}
> +		info->tx_ring_page_order = max_tx_ring_page_order;
> +		dev_info(&dev->dev, "multi-page tx ring, order = %d\n",
> +			 info->tx_ring_page_order);
> +	}
> +	info->tx_ring_pages = (1U << info->tx_ring_page_order);
> +
> +	txs = (struct xen_netif_tx_sring *)
> +		__get_free_pages(__GFP_ZERO | GFP_NOIO | __GFP_HIGH,
> +				 info->tx_ring_page_order);
>   	if (!txs) {
>   		err = -ENOMEM;
>   		xenbus_dev_fatal(dev, err, "allocating tx ring page");
>   		goto fail;
>   	}
>   	SHARED_RING_INIT(txs);
> -	FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE);
> +	FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE * info->tx_ring_pages);
> +
> +	err = xenbus_grant_ring(dev, txs, info->tx_ring_pages,
> +				info->tx_ring_ref);
> +	if (err < 0)
> +		goto grant_tx_ring_fail;
>
> -	err = xenbus_grant_ring(dev, txs, 1, grefs);
> +	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
> +			   "max-rx-ring-page-order", "%u",
> +			&max_rx_ring_page_order);
>   	if (err < 0) {
> -		free_page((unsigned long)txs);
> -		goto fail;
> +		info->rx_ring_page_order = 0;
> +		dev_info(&dev->dev, "single rx ring\n");
> +	} else {
> +		if (max_rx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) {
> +			dev_info(&dev->dev,
> +				 "backend ring page order %d too large, clamp to %d\n",
> +				 max_rx_ring_page_order,
> +				 XENNET_MAX_RING_PAGE_ORDER);
> +			max_rx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER;
> +		}
> +		info->rx_ring_page_order = max_rx_ring_page_order;
> +		dev_info(&dev->dev, "multi-page rx ring, order = %d\n",
> +			 info->rx_ring_page_order);
>   	}
> +	info->rx_ring_pages = (1U << info->rx_ring_page_order);
>
> -	info->tx_ring_ref = grefs[0];
> -	rxs = (struct xen_netif_rx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH);
> +	rxs = (struct xen_netif_rx_sring *)
> +		__get_free_pages(__GFP_ZERO | GFP_NOIO | __GFP_HIGH,
> +				 info->rx_ring_page_order);
>   	if (!rxs) {
>   		err = -ENOMEM;
>   		xenbus_dev_fatal(dev, err, "allocating rx ring page");
> -		goto fail;
> +		goto alloc_rx_ring_fail;
>   	}
>   	SHARED_RING_INIT(rxs);
> -	FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE);
> +	FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE * info->rx_ring_pages);
>
> -	err = xenbus_grant_ring(dev, rxs, 1, grefs);
> -	if (err < 0) {
> -		free_page((unsigned long)rxs);
> -		goto fail;
> -	}
> -	info->rx_ring_ref = grefs[0];
> +	err = xenbus_grant_ring(dev, rxs, info->rx_ring_pages,
> +				info->rx_ring_ref);
> +	if (err < 0)
> +		goto grant_rx_ring_fail;
>
>   	err = xenbus_alloc_evtchn(dev, &info->evtchn);
>   	if (err)
> -		goto fail;
> +		goto alloc_evtchn_fail;
>
>   	err = bind_evtchn_to_irqhandler(info->evtchn, xennet_interrupt,
>   					0, netdev->name, netdev);
>   	if (err < 0)
> -		goto fail;
> +		goto bind_fail;
>   	netdev->irq = err;
>   	return 0;
>
> - fail:
> +bind_fail:
> +	xenbus_free_evtchn(dev, info->evtchn);
> +alloc_evtchn_fail:
> +	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->rx.sring);
> +grant_rx_ring_fail:
> +	free_pages((unsigned long)info->rx.sring, info->rx_ring_page_order);
> +alloc_rx_ring_fail:
> +	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->tx.sring);
> +grant_tx_ring_fail:
> +	free_pages((unsigned long)info->tx.sring, info->tx_ring_page_order);
> +fail:
>   	return err;
>   }
>
> @@ -1570,6 +1630,7 @@ static int talk_to_netback(struct xenbus_device *dev,
>   	const char *message;
>   	struct xenbus_transaction xbt;
>   	int err;
> +	int i;
>
>   	/* Create shared ring, alloc event channel. */
>   	err = setup_netfront(dev, info);
> @@ -1583,18 +1644,58 @@ again:
>   		goto destroy_ring;
>   	}
>
> -	err = xenbus_printf(xbt, dev->nodename, "tx-ring-ref", "%u",
> -			    info->tx_ring_ref);
> -	if (err) {
> -		message = "writing tx ring-ref";
> -		goto abort_transaction;
> +	if (info->tx_ring_page_order == 0) {
> +		err = xenbus_printf(xbt, dev->nodename, "tx-ring-ref", "%u",
> +				    info->tx_ring_ref[0]);
> +		if (err) {
> +			message = "writing tx ring-ref";
> +			goto abort_transaction;
> +		}
> +	} else {
> +		err = xenbus_printf(xbt, dev->nodename, "tx-ring-order", "%u",
> +				    info->tx_ring_page_order);
> +		if (err) {
> +			message = "writing tx-ring-order";
> +			goto abort_transaction;
> +		}
> +		for (i = 0; i < info->tx_ring_pages; i++) {
> +			char name[sizeof("tx-ring-ref")+3];
> +			snprintf(name, sizeof(name), "tx-ring-ref%u", i);
> +			err = xenbus_printf(xbt, dev->nodename, name, "%u",
> +					    info->tx_ring_ref[i]);
> +			if (err) {
> +				message = "writing tx ring-ref";
> +				goto abort_transaction;
> +			}
> +		}
>   	}
> -	err = xenbus_printf(xbt, dev->nodename, "rx-ring-ref", "%u",
> -			    info->rx_ring_ref);
> -	if (err) {
> -		message = "writing rx ring-ref";
> -		goto abort_transaction;
> +
> +	if (info->rx_ring_page_order == 0) {
> +		err = xenbus_printf(xbt, dev->nodename, "rx-ring-ref", "%u",
> +				    info->rx_ring_ref[0]);
> +		if (err) {
> +			message = "writing rx ring-ref";
> +			goto abort_transaction;
> +		}
> +	} else {
> +		err = xenbus_printf(xbt, dev->nodename, "rx-ring-order", "%u",
> +				    info->rx_ring_page_order);
> +		if (err) {
> +			message = "writing rx-ring-order";
> +			goto abort_transaction;
> +		}
> +		for (i = 0; i < info->rx_ring_pages; i++) {
> +			char name[sizeof("rx-ring-ref")+3];
> +			snprintf(name, sizeof(name), "rx-ring-ref%u", i);
> +			err = xenbus_printf(xbt, dev->nodename, name, "%u",
> +					    info->rx_ring_ref[i]);
> +			if (err) {
> +				message = "writing rx ring-ref";
> +				goto abort_transaction;
> +			}
> +		}
>   	}
> +
>   	err = xenbus_printf(xbt, dev->nodename,
>   			    "event-channel", "%u", info->evtchn);
>   	if (err) {
> @@ -1681,7 +1782,8 @@ static int xennet_connect(struct net_device *dev)
>   	xennet_release_tx_bufs(np);
>
>   	/* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */
> -	for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE; i++) {
> +	for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE(np->rx_ring_pages);
> +	     i++) {
>   		skb_frag_t *frag;
>   		const struct page *page;
>   		if (!np->rx_skbs[i])

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 0/8] Bugfix and mechanical works for Xen network driver
  2013-02-26  3:07 ` [Xen-devel] " ANNIE LI
@ 2013-02-26 11:33   ` Wei Liu
  2013-02-26 11:33   ` Wei Liu
  1 sibling, 0 replies; 91+ messages in thread
From: Wei Liu @ 2013-02-26 11:33 UTC (permalink / raw)
  To: ANNIE LI; +Cc: wei.liu2, xen-devel, netdev, Ian Campbell, konrad.wilk

On Tue, 2013-02-26 at 03:07 +0000, ANNIE LI wrote:
> What version are these patches based on?
> I tried v3.8-rc7 and 3.8-rc6; patches 3/8, 4/8 ... cannot be merged
> successfully. Can you rebase the series?
> 

IIRC we had some XSA patches after this series, or I just developed it
on top of an old branch. I will rebase it soon.


Wei.

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 6/8] netfront: multi-page ring support
  2013-02-26  6:52   ` ANNIE LI
  2013-02-26 12:35     ` Wei Liu
@ 2013-02-26 12:35     ` Wei Liu
  2013-02-27  7:39       ` ANNIE LI
  2013-02-27  7:39       ` ANNIE LI
  1 sibling, 2 replies; 91+ messages in thread
From: Wei Liu @ 2013-02-26 12:35 UTC (permalink / raw)
  To: ANNIE LI; +Cc: wei.liu2, xen-devel, netdev, Ian Campbell, konrad.wilk

On Tue, 2013-02-26 at 06:52 +0000, ANNIE LI wrote:
> 
> On 2013-2-16 0:00, Wei Liu wrote:
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > ---
> >   drivers/net/xen-netfront.c |  246 +++++++++++++++++++++++++++++++-------------
> >   1 file changed, 174 insertions(+), 72 deletions(-)
> >
> > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> > index 8bd75a1..de73a71 100644
> > --- a/drivers/net/xen-netfront.c
> > +++ b/drivers/net/xen-netfront.c
> > @@ -67,9 +67,19 @@ struct netfront_cb {
> >
> >   #define GRANT_INVALID_REF   0
> >
> > -#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> > -#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> > -#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE, 256)
> > +#define XENNET_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER
> > +#define XENNET_MAX_RING_PAGES      (1U << XENNET_MAX_RING_PAGE_ORDER)
> > +
> > +
> > +#define NET_TX_RING_SIZE(_nr_pages)                  \
> > +     __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages))
> > +#define NET_RX_RING_SIZE(_nr_pages)                  \
> > +     __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE * (_nr_pages))
> > +
> > +#define XENNET_MAX_TX_RING_SIZE NET_TX_RING_SIZE(XENNET_MAX_RING_PAGES)
> > +#define XENNET_MAX_RX_RING_SIZE NET_RX_RING_SIZE(XENNET_MAX_RING_PAGES)
> > +
> > +#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE(1), 256)
> 
> Not using a multi-page ring here?
> In xennet_create_dev, gnttab_alloc_grant_references allocates
> TX_MAX_TARGET grant references for tx. In xennet_release_tx_bufs,
> NET_TX_RING_SIZE(np->tx_ring_pages) grants are processed, and
> NET_TX_RING_SIZE(np->tx_ring_pages) is totally different from
> TX_MAX_TARGET if np->tx_ring_pages is not 1. Although skb_entry_is_link
> helps to not release invalid grants, lots of null loops seem
> unnecessary. I think TX_MAX_TARGET should be changed into some variable
> connected with np->tx_ring_pages. Or did you intend to use a one-page
> ring here?
> 

Looking back at my history, this limitation was introduced because if we
have a multi-page backend and a single-page frontend, the backend skb
processing could overlap.

I agree with you that this limit should be variable, but as we still use
the M:N model, the safe option is to cap this limit to 1 page.

Another option is to check the validity of skbs before processing them. I
will look into that as well.

The same reason applies to the RX ring as well.
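
For reference, the kind of check I have in mind is along these lines
(only a rough, untested sketch of xennet_release_tx_bufs; the real code
would keep its existing freelist handling):

	struct sk_buff *skb;
	int i;

	for (i = 0; i < XENNET_MAX_TX_RING_SIZE; i++) {
		/* Skip freelist links, as today. */
		if (skb_entry_is_link(&np->tx_skbs[i]))
			continue;
		/* Slots never populated at the negotiated ring size still
		 * hold GRANT_INVALID_REF, so use that as a validity marker
		 * instead of trusting the loop bound alone.
		 */
		if (np->grant_tx_ref[i] == GRANT_INVALID_REF)
			continue;
		skb = np->tx_skbs[i].skb;
		gnttab_end_foreign_access_ref(np->grant_tx_ref[i],
					      GNTMAP_readonly);
		np->grant_tx_ref[i] = GRANT_INVALID_REF;
		dev_kfree_skb_irq(skb);
	}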


Wei.

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 6/8] netfront: multi-page ring support
  2013-02-26 12:35     ` Wei Liu
@ 2013-02-27  7:39       ` ANNIE LI
  2013-02-27 15:49         ` Wei Liu
  2013-02-27 15:49         ` Wei Liu
  2013-02-27  7:39       ` ANNIE LI
  1 sibling, 2 replies; 91+ messages in thread
From: ANNIE LI @ 2013-02-27  7:39 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, Ian Campbell, konrad.wilk



On 2013-2-26 20:35, Wei Liu wrote:
> On Tue, 2013-02-26 at 06:52 +0000, ANNIE LI wrote:
>> On 2013-2-16 0:00, Wei Liu wrote:
>>> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
>>> ---
>>>    drivers/net/xen-netfront.c |  246 +++++++++++++++++++++++++++++++-------------
>>>    1 file changed, 174 insertions(+), 72 deletions(-)
>>>
>>> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
>>> index 8bd75a1..de73a71 100644
>>> --- a/drivers/net/xen-netfront.c
>>> +++ b/drivers/net/xen-netfront.c
>>> @@ -67,9 +67,19 @@ struct netfront_cb {
>>>
>>>    #define GRANT_INVALID_REF   0
>>>
>>> -#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
>>> -#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
>>> -#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE, 256)
>>> +#define XENNET_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER
>>> +#define XENNET_MAX_RING_PAGES      (1U << XENNET_MAX_RING_PAGE_ORDER)
>>> +
>>> +
>>> +#define NET_TX_RING_SIZE(_nr_pages)                  \
>>> +     __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages))
>>> +#define NET_RX_RING_SIZE(_nr_pages)                  \
>>> +     __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE * (_nr_pages))
>>> +
>>> +#define XENNET_MAX_TX_RING_SIZE NET_TX_RING_SIZE(XENNET_MAX_RING_PAGES)
>>> +#define XENNET_MAX_RX_RING_SIZE NET_RX_RING_SIZE(XENNET_MAX_RING_PAGES)
>>> +
>>> +#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE(1), 256)
>> Not using a multi-page ring here?
>> In xennet_create_dev, gnttab_alloc_grant_references allocates
>> TX_MAX_TARGET grant references for tx. In xennet_release_tx_bufs,
>> NET_TX_RING_SIZE(np->tx_ring_pages) grants are processed, and
>> NET_TX_RING_SIZE(np->tx_ring_pages) is totally different from
>> TX_MAX_TARGET if np->tx_ring_pages is not 1. Although skb_entry_is_link
>> helps to not release invalid grants, lots of null loops seem
>> unnecessary. I think TX_MAX_TARGET should be changed into some variable
>> connected with np->tx_ring_pages. Or did you intend to use a one-page
>> ring here?
>>
> Looking back at my history, this limitation was introduced because if we
> have a multi-page backend and a single-page frontend, the backend skb
> processing could overlap.

I did not see the overlap you mentioned here in netback. Although
netback supports multi-page rings, netback->vif still uses a single page
if the frontend only supports a single page. Netfront and netback
negotiate this through xenstore in your 5/8 patch. The requests and
responses should not have any overlap between netback and netfront. Am I
missing something?
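
Just to spell out my understanding of that negotiation (a sketch pieced
together from the 5/8 and 6/8 patches; the backend-side write is my
assumption of what 5/8 does, and the frontend side mirrors
setup_netfront):

	/* Backend: advertise the largest tx ring order it supports. */
	err = xenbus_printf(XBT_NIL, dev->nodename,
			    "max-tx-ring-page-order", "%u", max_order);

	/* Frontend: read the advertised limit, clamp it to its own
	 * maximum, then publish the order it actually uses.
	 */
	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
			   "max-tx-ring-page-order", "%u", &max_order);
	if (err < 0)
		max_order = 0;	/* old backend: single-page ring */
	info->tx_ring_page_order = min_t(unsigned int, max_order,
					 XENNET_MAX_RING_PAGE_ORDER);
	err = xenbus_printf(xbt, dev->nodename, "tx-ring-order", "%u",
			    info->tx_ring_page_order);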

>
> I agree with you that this limit should be variable, but as we still use
> the M:N model, the safe option is to cap this limit to 1 page.

Yes, the M:N model is still used here. But the shared ring should be the
same for netback->vif and netfront.

Thanks
Annie

>
> Another option is to check validity of skbs before processing them. I
> will look into that as well.
>
> The same reason applies to the RX ring as well.
>
>
> Wei.
>

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 6/8] netfront: multi-page ring support
  2013-02-27  7:39       ` ANNIE LI
  2013-02-27 15:49         ` Wei Liu
@ 2013-02-27 15:49         ` Wei Liu
  2013-02-28  5:19           ` ANNIE LI
  2013-02-28  5:19           ` ANNIE LI
  1 sibling, 2 replies; 91+ messages in thread
From: Wei Liu @ 2013-02-27 15:49 UTC (permalink / raw)
  To: ANNIE LI; +Cc: wei.liu2, xen-devel, netdev, Ian Campbell, konrad.wilk

On Wed, 2013-02-27 at 07:39 +0000, ANNIE LI wrote:
> 
> On 2013-2-26 20:35, Wei Liu wrote:
> > On Tue, 2013-02-26 at 06:52 +0000, ANNIE LI wrote:
> >> On 2013-2-16 0:00, Wei Liu wrote:
> >>> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> >>> ---
> >>>    drivers/net/xen-netfront.c |  246 +++++++++++++++++++++++++++++++-------------
> >>>    1 file changed, 174 insertions(+), 72 deletions(-)
> >>>
> >>> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> >>> index 8bd75a1..de73a71 100644
> >>> --- a/drivers/net/xen-netfront.c
> >>> +++ b/drivers/net/xen-netfront.c
> >>> @@ -67,9 +67,19 @@ struct netfront_cb {
> >>>
> >>>    #define GRANT_INVALID_REF   0
> >>>
> >>> -#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> >>> -#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> >>> -#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE, 256)
> >>> +#define XENNET_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER
> >>> +#define XENNET_MAX_RING_PAGES      (1U << XENNET_MAX_RING_PAGE_ORDER)
> >>> +
> >>> +
> >>> +#define NET_TX_RING_SIZE(_nr_pages)                  \
> >>> +     __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages))
> >>> +#define NET_RX_RING_SIZE(_nr_pages)                  \
> >>> +     __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE * (_nr_pages))
> >>> +
> >>> +#define XENNET_MAX_TX_RING_SIZE NET_TX_RING_SIZE(XENNET_MAX_RING_PAGES)
> >>> +#define XENNET_MAX_RX_RING_SIZE NET_RX_RING_SIZE(XENNET_MAX_RING_PAGES)
> >>> +
> >>> +#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE(1), 256)
> >> Not using a multi-page ring here?
> >> In xennet_create_dev, gnttab_alloc_grant_references allocates
> >> TX_MAX_TARGET grant references for tx. In xennet_release_tx_bufs,
> >> NET_TX_RING_SIZE(np->tx_ring_pages) grants are processed, and
> >> NET_TX_RING_SIZE(np->tx_ring_pages) is totally different from
> >> TX_MAX_TARGET if np->tx_ring_pages is not 1. Although skb_entry_is_link
> >> helps to not release invalid grants, lots of null loops seem
> >> unnecessary. I think TX_MAX_TARGET should be changed into some variable
> >> connected with np->tx_ring_pages. Or did you intend to use a one-page
> >> ring here?
> >>
> > Looking back at my history, this limitation was introduced because if we
> > have a multi-page backend and a single-page frontend, the backend skb
> > processing could overlap.
> 
> I did not see the overlap you mentioned here in netback. Although
> netback supports multi-page rings, netback->vif still uses a single page
> if the frontend only supports a single page. Netfront and netback
> negotiate this through xenstore in your 5/8 patch. The requests and
> responses should not have any overlap between netback and netfront. Am I
> missing something?
> 

I tried to dig up the mail archive just now and realized that the bug
report was in a private mail exchange with Konrad.

I don't really remember the details now since it is more than a year old,
but you can find a trace in Konrad's tree, CS 5b4c3dd5b255. All I can
remember is that this bug was triggered by a mixed old/new
frontend/backend combination.

I think this cap can be removed if we make all buffers in netfront
dynamically allocated.
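
Something along these lines at connect time (a rough, untested sketch;
tx_skbs and grant_tx_ref would become pointers in struct netfront_info
instead of fixed XENNET_MAX_*-sized arrays):

	/* Size the bookkeeping arrays from the negotiated ring size so
	 * that a compile-time cap like TX_MAX_TARGET is no longer needed.
	 */
	np->tx_skbs = kcalloc(NET_TX_RING_SIZE(np->tx_ring_pages),
			      sizeof(*np->tx_skbs), GFP_KERNEL);
	np->grant_tx_ref = kcalloc(NET_TX_RING_SIZE(np->tx_ring_pages),
				   sizeof(*np->grant_tx_ref), GFP_KERNEL);
	if (!np->tx_skbs || !np->grant_tx_ref)
		return -ENOMEM;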


Wei.

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 6/8] netfront: multi-page ring support
  2013-02-27 15:49         ` Wei Liu
  2013-02-28  5:19           ` ANNIE LI
@ 2013-02-28  5:19           ` ANNIE LI
  2013-02-28 11:02             ` Wei Liu
  2013-02-28 11:02             ` Wei Liu
  1 sibling, 2 replies; 91+ messages in thread
From: ANNIE LI @ 2013-02-28  5:19 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, Ian Campbell, konrad.wilk



On 2013-2-27 23:49, Wei Liu wrote:
> On Wed, 2013-02-27 at 07:39 +0000, ANNIE LI wrote:
>> On 2013-2-26 20:35, Wei Liu wrote:
>>> On Tue, 2013-02-26 at 06:52 +0000, ANNIE LI wrote:
>>>> On 2013-2-16 0:00, Wei Liu wrote:
>>>>> Signed-off-by: Wei Liu<wei.liu2@citrix.com>
>>>>> ---
>>>>>     drivers/net/xen-netfront.c |  246 +++++++++++++++++++++++++++++++-------------
>>>>>     1 file changed, 174 insertions(+), 72 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
>>>>> index 8bd75a1..de73a71 100644
>>>>> --- a/drivers/net/xen-netfront.c
>>>>> +++ b/drivers/net/xen-netfront.c
>>>>> @@ -67,9 +67,19 @@ struct netfront_cb {
>>>>>
>>>>>     #define GRANT_INVALID_REF   0
>>>>>
>>>>> -#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
>>>>> -#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
>>>>> -#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE, 256)
>>>>> +#define XENNET_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER
>>>>> +#define XENNET_MAX_RING_PAGES      (1U<<    XENNET_MAX_RING_PAGE_ORDER)
>>>>> +
>>>>> +
>>>>> +#define NET_TX_RING_SIZE(_nr_pages)                  \
>>>>> +     __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages))
>>>>> +#define NET_RX_RING_SIZE(_nr_pages)                  \
>>>>> +     __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE * (_nr_pages))
>>>>> +
>>>>> +#define XENNET_MAX_TX_RING_SIZE NET_TX_RING_SIZE(XENNET_MAX_RING_PAGES)
>>>>> +#define XENNET_MAX_RX_RING_SIZE NET_RX_RING_SIZE(XENNET_MAX_RING_PAGES)
>>>>> +
>>>>> +#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE(1), 256)
>>>> Not using multi-page ring here?
>>>> In xennet_create_dev, gnttab_alloc_grant_references allocates
>>>> TX_MAX_TARGET number of grant reference for tx. In
>>>> xennet_release_tx_bufs, NET_TX_RING_SIZE(np->tx_ring_pages) numbers of
>>>> grants are processed. And NET_RX_RING_SIZE(np->tx_ring_pages) is totally
>>>> different from TX_MAX_TARGET if np->rx_ring_pages is not 1. Although
>>>> skb_entry_is_link helps to not release invalid grants, lots of null loop
>>>> seems unnecessary. I think TX_MAX_TARGET should be changed into some
>>>> variable connected with np->tx_ring_pages. Or you intended to use one
>>>> page ring here?
>>>>
>>> Looking back my history, this limitation was introduced because if we
>>> have a multi-page backend and single page frontend, the backend skb
>>> processing could overlap.
>> I did not see the overlap you mentioned here in netback. Although
>> netback supports multi-page, netback->vif still uses single page if the
>> frontend only supports single page. Netfront and netback negotiate this
>> through xenstore in your 5/8 patch. The requests and response should not
>> have any overlap between netback and netfront. Am I missing something?
>>
> I tried to dig up the mail archive just now and realized that the bug
> report was in a private mail exchange with Konrad.
>
> I don't really remember the details now since it is more than a year
> old, but you can find a trace in Konrad's tree, CS 5b4c3dd5b255. All I
> can remember is that this bug was triggered by a mixed old/new
> frontend/backend pair.

I checked the code in Konrad's tree, and I think the overlap issue you 
mentioned exists with the original netback (without multi-page ring) and 
the newer netfront. The original netback does not support a multi-page 
ring, and your newer netfront before this bug fix used "#define 
TX_MAX_TARGET XENNET_MAX_TX_RING_SIZE" directly. So that would cause an 
overlap when netfront allocates rx skbs.
"#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE(1), 256)" limits the 
netfront to a single-page ring; it fixed the overlap issue, but it is 
not enough.
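
To put rough numbers on it (mine, for illustration): with 4K pages,
NET_TX_RING_SIZE(1) works out to 256 slots, and with a max ring page
order of 2 XENNET_MAX_TX_RING_SIZE would be 1024. A frontend sized for
1024 entries against a ring that actually holds 256 can push its
producer index far past slots the other end has not consumed yet, which
is the overlap.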

>
> I think this cap can be removed if we make all buffers in netfront
> dynamically allocated.

Yes, making TX_MAX_TARGET dynamic would fix this issue.

Thanks
Annie
>
>
> Wei.
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 6/8] netfront: multi-page ring support
  2013-02-28  5:19           ` ANNIE LI
@ 2013-02-28 11:02             ` Wei Liu
  2013-02-28 12:55               ` annie li
  2013-02-28 12:55               ` annie li
  2013-02-28 11:02             ` Wei Liu
  1 sibling, 2 replies; 91+ messages in thread
From: Wei Liu @ 2013-02-28 11:02 UTC (permalink / raw)
  To: ANNIE LI; +Cc: xen-devel, netdev, Ian Campbell, konrad.wilk

On Thu, Feb 28, 2013 at 05:19:43AM +0000, ANNIE LI wrote:
> 
> 
> On 2013-2-27 23:49, Wei Liu wrote:
> > On Wed, 2013-02-27 at 07:39 +0000, ANNIE LI wrote:
> >> On 2013-2-26 20:35, Wei Liu wrote:
> >>> On Tue, 2013-02-26 at 06:52 +0000, ANNIE LI wrote:
> >>>> On 2013-2-16 0:00, Wei Liu wrote:
> >>>>> Signed-off-by: Wei Liu<wei.liu2@citrix.com>
> >>>>> ---
> >>>>>     drivers/net/xen-netfront.c |  246 +++++++++++++++++++++++++++++++-------------
> >>>>>     1 file changed, 174 insertions(+), 72 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> >>>>> index 8bd75a1..de73a71 100644
> >>>>> --- a/drivers/net/xen-netfront.c
> >>>>> +++ b/drivers/net/xen-netfront.c
> >>>>> @@ -67,9 +67,19 @@ struct netfront_cb {
> >>>>>
> >>>>>     #define GRANT_INVALID_REF   0
> >>>>>
> >>>>> -#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> >>>>> -#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> >>>>> -#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE, 256)
> >>>>> +#define XENNET_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER
> >>>>> +#define XENNET_MAX_RING_PAGES      (1U<<    XENNET_MAX_RING_PAGE_ORDER)
> >>>>> +
> >>>>> +
> >>>>> +#define NET_TX_RING_SIZE(_nr_pages)                  \
> >>>>> +     __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages))
> >>>>> +#define NET_RX_RING_SIZE(_nr_pages)                  \
> >>>>> +     __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE * (_nr_pages))
> >>>>> +
> >>>>> +#define XENNET_MAX_TX_RING_SIZE NET_TX_RING_SIZE(XENNET_MAX_RING_PAGES)
> >>>>> +#define XENNET_MAX_RX_RING_SIZE NET_RX_RING_SIZE(XENNET_MAX_RING_PAGES)
> >>>>> +
> >>>>> +#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE(1), 256)
> >>>> Not using multi-page ring here?
> >>>> In xennet_create_dev, gnttab_alloc_grant_references allocates
> >>>> TX_MAX_TARGET number of grant reference for tx. In
> >>>> xennet_release_tx_bufs, NET_TX_RING_SIZE(np->tx_ring_pages) numbers of
> >>>> grants are processed. And NET_RX_RING_SIZE(np->tx_ring_pages) is totally
> >>>> different from TX_MAX_TARGET if np->rx_ring_pages is not 1. Although
> >>>> skb_entry_is_link helps to not release invalid grants, lots of null loop
> >>>> seems unnecessary. I think TX_MAX_TARGET should be changed into some
> >>>> variable connected with np->tx_ring_pages. Or you intended to use one
> >>>> page ring here?
> >>>>
> >>> Looking back my history, this limitation was introduced because if we
> >>> have a multi-page backend and single page frontend, the backend skb
> >>> processing could overlap.
> >> I did not see the overlap you mentioned here in netback. Although
> >> netback supports multi-page, netback->vif still uses single page if the
> >> frontend only supports single page. Netfront and netback negotiate this
> >> through xenstore in your 5/8 patch. The requests and response should not
> >> have any overlap between netback and netfront. Am I missing something?
> >>
> > I tried to dig up the mail archive just now and realized that the bug
> > report was in a private mail exchange with Konrad.
> >
> > I don't really remember the details now since it is more than a year
> > old, but you can find a trace in Konrad's tree, CS 5b4c3dd5b255. All I
> > can remember is that this bug was triggered by a mixed old/new
> > frontend/backend pair.
> 
> I checked the code in Konrad's tree, and I think the overlap issue you 
> mentioned exists with the original netback (without multi-page ring) and 
> the newer netfront. The original netback does not support a multi-page 
> ring, and your newer netfront before this bug fix used "#define 
> TX_MAX_TARGET XENNET_MAX_TX_RING_SIZE" directly. So that would cause an 
> overlap when netfront allocates rx skbs.
> "#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE(1), 256)" limits the 
> netfront to a single-page ring; it fixed the overlap issue, but it is 
> not enough.
> 

Yes. I just saw a bug report on the xen-users list yesterday for the
same issue in the original netback (1 page ring), so the overlap issue
is not introduced by the multi-page ring implementation. If your team
also sees that issue, do you have a patch to fix it?


Wei.

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 6/8] netfront: multi-page ring support
  2013-02-28 11:02             ` Wei Liu
  2013-02-28 12:55               ` annie li
@ 2013-02-28 12:55               ` annie li
  1 sibling, 0 replies; 91+ messages in thread
From: annie li @ 2013-02-28 12:55 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, Ian Campbell, konrad.wilk


On 2013-2-28 19:02, Wei Liu wrote:
> On Thu, Feb 28, 2013 at 05:19:43AM +0000, ANNIE LI wrote:
>> I checked the code in Konrad's tree, and I think the overlap issue you
>> mentioned exists with the original netback (without multi-page ring) and
>> the newer netfront. The original netback does not support a multi-page
>> ring, and your newer netfront before this bug fix used "#define
>> TX_MAX_TARGET XENNET_MAX_TX_RING_SIZE" directly. So that would cause an
>> overlap when netfront allocates rx skbs.
>> "#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE(1), 256)" limits the
>> netfront to a single-page ring; it fixed the overlap issue, but it is
>> not enough.
>>
> Yes. I just saw a bug report on the xen-users list yesterday for the
> same issue in the original netback (1 page ring), so the overlap issue
> is not introduced by the multi-page ring implementation. If your team
> also sees that issue, do you have a patch to fix it?

No. We thought your patch fixed it, and I did not check it further at 
that time.
Are you sure they are the same? What is the thread title on xen-users?
The overlap issue here exists in netfront when netfront allocates skbs 
greedily. In Konrad's tree merged with your patch, a netfront with 
"#define TX_MAX_TARGET XENNET_MAX_TX_RING_SIZE" should hit this overlap 
issue when it runs against a single-page-ring netback.

Thanks
Annie

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 1/8] netback: don't bind kthread to cpu
  2013-02-15 16:00 ` [PATCH 1/8] netback: don't bind kthread to cpu Wei Liu
@ 2013-03-04 20:51   ` Konrad Rzeszutek Wilk
  2013-03-05 13:30     ` Wei Liu
  2013-03-05 13:30     ` Wei Liu
  2013-03-04 20:51   ` Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-04 20:51 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, ian.campbell, annie.li

On Fri, Feb 15, 2013 at 04:00:02PM +0000, Wei Liu wrote:
> The initialization process makes an assumption that the online cpus are
> numbered from 0 to xen_netbk_group_nr-1,  which is not always true.

And xen_netbk_group_nr is num_online_cpus()?

So under what conditions does this change? Is this when CPU hotplug is
involved and CPUs go offline? In which case, should there be a CPU
hotplug notifier to re-bind the workers as appropriate?
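
Something along these lines is what I mean (sketch only, with the
hotplug notifier API of this era; netbk_cpu_callback and the re-bind
policy are made up here, nothing from this series, and frozen-action
variants are ignored for brevity):

  static int netbk_cpu_callback(struct notifier_block *nfb,
                                unsigned long action, void *hcpu)
  {
          unsigned int cpu = (unsigned long)hcpu;

          /* Re-bind the group's worker when its CPU comes back online. */
          if (action == CPU_ONLINE && cpu < xen_netbk_group_nr)
                  set_cpus_allowed_ptr(xen_netbk[cpu].task,
                                       cpumask_of(cpu));
          return NOTIFY_OK;
  }

  static struct notifier_block netbk_cpu_notifier = {
          .notifier_call = netbk_cpu_callback,
  };

registered with register_cpu_notifier() from netback_init().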

> 
> As we only need a pool of worker threads, simply don't bind them to specific
> cpus.

OK. Is there another method of doing this? Are there patches to make the
threads try to be vCPU->guest affine?

> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  drivers/net/xen-netback/netback.c |    2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 3ae49b1..db8d45a 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -1729,8 +1729,6 @@ static int __init netback_init(void)
>  			goto failed_init;
>  		}
>  
> -		kthread_bind(netbk->task, group);
> -
>  		INIT_LIST_HEAD(&netbk->net_schedule_list);
>  
>  		spin_lock_init(&netbk->net_schedule_list_lock);
> -- 
> 1.7.10.4
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 2/8] netback: add module unload function
  2013-02-15 16:00 ` Wei Liu
@ 2013-03-04 20:55   ` Konrad Rzeszutek Wilk
  2013-03-04 20:58     ` Andrew Cooper
                       ` (3 more replies)
  2013-03-04 20:55   ` Konrad Rzeszutek Wilk
                     ` (2 subsequent siblings)
  3 siblings, 4 replies; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-04 20:55 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, ian.campbell, annie.li

On Fri, Feb 15, 2013 at 04:00:03PM +0000, Wei Liu wrote:
> Enable users to unload netback module. Users should make sure there is not vif
> runnig.

'sure there are no vif's running.'

Any way of making this VIF part automatic? Meaning that netback can
figure out whether there are VIFs running, and if so, not unload all of
the parts and just mention that you are leaking memory.

This looks quite dangerous - meaning that if there are guests running
and we do 'rmmod xen_netback' for fun, it looks like we could crash
dom0?

> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  drivers/net/xen-netback/common.h  |    1 +
>  drivers/net/xen-netback/netback.c |   18 ++++++++++++++++++
>  drivers/net/xen-netback/xenbus.c  |    5 +++++
>  3 files changed, 24 insertions(+)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index 9d7f172..35d8772 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -120,6 +120,7 @@ void xenvif_get(struct xenvif *vif);
>  void xenvif_put(struct xenvif *vif);
>  
>  int xenvif_xenbus_init(void);
> +void xenvif_xenbus_exit(void);
>  
>  int xenvif_schedulable(struct xenvif *vif);
>  
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index db8d45a..de59098 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -1761,5 +1761,23 @@ failed_init:
>  
>  module_init(netback_init);
>  
> +static void __exit netback_exit(void)
> +{
> +	int group, i;
> +	xenvif_xenbus_exit();

You should check the return code of this function.

> +	for (group = 0; group < xen_netbk_group_nr; group++) {
> +		struct xen_netbk *netbk = &xen_netbk[group];
> +		for (i = 0; i < MAX_PENDING_REQS; i++) {
> +			if (netbk->mmap_pages[i])
> +				__free_page(netbk->mmap_pages[i]);
> +		}
> +		del_timer_sync(&netbk->net_timer);
> +		kthread_stop(netbk->task);
> +	}
> +	vfree(xen_netbk);
> +}
> +
> +module_exit(netback_exit);
> +
>  MODULE_LICENSE("Dual BSD/GPL");
>  MODULE_ALIAS("xen-backend:vif");
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> index 410018c..65d14f2 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -485,3 +485,8 @@ int xenvif_xenbus_init(void)
>  {
>  	return xenbus_register_backend(&netback_driver);
>  }
> +
> +void xenvif_xenbus_exit(void)
> +{
> +	return xenbus_unregister_driver(&netback_driver);
> +}
> -- 
> 1.7.10.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 3/8] netback: get/put module along with vif connect/disconnect
  2013-02-15 16:00 ` Wei Liu
  2013-03-04 20:56   ` Konrad Rzeszutek Wilk
@ 2013-03-04 20:56   ` Konrad Rzeszutek Wilk
  2013-03-05 10:02   ` David Vrabel
  2013-03-05 10:02   ` [Xen-devel] " David Vrabel
  3 siblings, 0 replies; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-04 20:56 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, ian.campbell, annie.li

On Fri, Feb 15, 2013 at 04:00:04PM +0000, Wei Liu wrote:
> If there is vif running and user unloads netback, guest's network interface
> just mysteriously stops working. So we need to prevent unloading netback
> module if there is vif running.
> 
> The disconnect function of vif may get called by the generic framework even
> before vif connects, so there is an extra check on whether we actually need to
> put module when disconnecting a vif.

Ah, I think this patch should come before the "netback: add module unload function" patch.

> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  drivers/net/xen-netback/interface.c |   18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index 221f426..db638e1 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -314,6 +314,8 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
>  	if (vif->irq)
>  		return 0;
>  
> +	__module_get(THIS_MODULE);
> +
>  	err = xen_netbk_map_frontend_rings(vif, tx_ring_ref, rx_ring_ref);
>  	if (err < 0)
>  		goto err;
> @@ -341,6 +343,7 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
>  err_unmap:
>  	xen_netbk_unmap_frontend_rings(vif);
>  err:
> +	module_put(THIS_MODULE);
>  	return err;
>  }
>  
> @@ -358,18 +361,31 @@ void xenvif_carrier_off(struct xenvif *vif)
>  
>  void xenvif_disconnect(struct xenvif *vif)
>  {
> +	/*
> +	 * This function may get called even before vif connets, set

connects

> +	 * need_module_put if vif->irq != 0, which means vif has
> +	 * already connected, we should call module_put to balance the
> +	 * previous __module_get.
> +	 */
> +	int need_module_put = 0;
> +
>  	if (netif_carrier_ok(vif->dev))
>  		xenvif_carrier_off(vif);
>  
>  	atomic_dec(&vif->refcnt);
>  	wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
>  
> -	if (vif->irq)
> +	if (vif->irq) {
>  		unbind_from_irqhandler(vif->irq, vif);
> +		need_module_put = 1;
> +	}
>  
>  	unregister_netdev(vif->dev);
>  
>  	xen_netbk_unmap_frontend_rings(vif);
>  
>  	free_netdev(vif->dev);
> +
> +	if (need_module_put)
> +		module_put(THIS_MODULE);
>  }
> -- 
> 1.7.10.4
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 2/8] netback: add module unload function
  2013-03-04 20:55   ` [Xen-devel] " Konrad Rzeszutek Wilk
  2013-03-04 20:58     ` Andrew Cooper
@ 2013-03-04 20:58     ` Andrew Cooper
  2013-03-05 13:30     ` Wei Liu
  2013-03-05 13:30     ` Wei Liu
  3 siblings, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2013-03-04 20:58 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Wei Liu, netdev, annie.li, Ian Campbell, xen-devel

On 04/03/13 20:55, Konrad Rzeszutek Wilk wrote:
> On Fri, Feb 15, 2013 at 04:00:03PM +0000, Wei Liu wrote:
>> Enable users to unload netback module. Users should make sure there is not vif
>> runnig.
> 'sure there are no vif's running.'

If we are picking at grammar, no apostrophe in 'vifs'

~Andrew

>
> Any way of making this VIF part automatic? Meaning that netback can
> figure out whether there are VIFs running, and if so, not unload all of
> the parts and just mention that you are leaking memory.
>
> This looks quite dangerous - meaning that if there are guests running
> and we do 'rmmod xen_netback' for fun, it looks like we could crash
> dom0?
>
>> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
>> ---
>>  drivers/net/xen-netback/common.h  |    1 +
>>  drivers/net/xen-netback/netback.c |   18 ++++++++++++++++++
>>  drivers/net/xen-netback/xenbus.c  |    5 +++++
>>  3 files changed, 24 insertions(+)
>>
>> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
>> index 9d7f172..35d8772 100644
>> --- a/drivers/net/xen-netback/common.h
>> +++ b/drivers/net/xen-netback/common.h
>> @@ -120,6 +120,7 @@ void xenvif_get(struct xenvif *vif);
>>  void xenvif_put(struct xenvif *vif);
>>  
>>  int xenvif_xenbus_init(void);
>> +void xenvif_xenbus_exit(void);
>>  
>>  int xenvif_schedulable(struct xenvif *vif);
>>  
>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>> index db8d45a..de59098 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -1761,5 +1761,23 @@ failed_init:
>>  
>>  module_init(netback_init);
>>  
>> +static void __exit netback_exit(void)
>> +{
>> +	int group, i;
>> +	xenvif_xenbus_exit();
> You should check the return code of this function.
>
>> +	for (group = 0; group < xen_netbk_group_nr; group++) {
>> +		struct xen_netbk *netbk = &xen_netbk[group];
>> +		for (i = 0; i < MAX_PENDING_REQS; i++) {
>> +			if (netbk->mmap_pages[i])
>> +				__free_page(netbk->mmap_pages[i]);
>> +		}
>> +		del_timer_sync(&netbk->net_timer);
>> +		kthread_stop(netbk->task);
>> +	}
>> +	vfree(xen_netbk);
>> +}
>> +
>> +module_exit(netback_exit);
>> +
>>  MODULE_LICENSE("Dual BSD/GPL");
>>  MODULE_ALIAS("xen-backend:vif");
>> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
>> index 410018c..65d14f2 100644
>> --- a/drivers/net/xen-netback/xenbus.c
>> +++ b/drivers/net/xen-netback/xenbus.c
>> @@ -485,3 +485,8 @@ int xenvif_xenbus_init(void)
>>  {
>>  	return xenbus_register_backend(&netback_driver);
>>  }
>> +
>> +void xenvif_xenbus_exit(void)
>> +{
>> +	return xenbus_unregister_driver(&netback_driver);
>> +}
>> -- 
>> 1.7.10.4
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
>>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 5/8] netback: multi-page ring support
  2013-02-15 16:00 ` Wei Liu
@ 2013-03-04 21:00   ` Konrad Rzeszutek Wilk
  2013-03-04 21:00   ` Konrad Rzeszutek Wilk
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-04 21:00 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, ian.campbell, annie.li

On Fri, Feb 15, 2013 at 04:00:06PM +0000, Wei Liu wrote:

Please add a bit more description. Say which XenBus feature this is
using. And that probably means another patch to the Xen tree to
document this.
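
From the diff, the negotiation appears to boil down to these xenstore
keys (my reading of the patch, not documented anywhere yet):

  max-tx-ring-page-order   written by netback: max tx ring order accepted
  max-rx-ring-page-order   written by netback: max rx ring order accepted
  tx-ring-order            written by netfront: tx ring order chosen
  tx-ring-ref0..refN-1     written by netfront: one grant ref per tx page
  rx-ring-order            written by netfront: rx ring order chosen
  rx-ring-ref0..refN-1     written by netfront: one grant ref per rx page

with the old single tx-ring-ref/rx-ring-ref keys kept as the fallback
when no *-ring-order key is present.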


> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  drivers/net/xen-netback/common.h    |   30 ++++++--
>  drivers/net/xen-netback/interface.c |   46 +++++++++--
>  drivers/net/xen-netback/netback.c   |   73 ++++++++----------
>  drivers/net/xen-netback/xenbus.c    |  143 +++++++++++++++++++++++++++++++++--
>  4 files changed, 229 insertions(+), 63 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index 35d8772..f541ba9 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -45,6 +45,12 @@
>  #include <xen/grant_table.h>
>  #include <xen/xenbus.h>
>  
> +#define NETBK_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER
> +#define NETBK_MAX_RING_PAGES      (1U << NETBK_MAX_RING_PAGE_ORDER)
> +
> +#define NETBK_MAX_TX_RING_SIZE XEN_NETIF_TX_RING_SIZE(NETBK_MAX_RING_PAGES)
> +#define NETBK_MAX_RX_RING_SIZE XEN_NETIF_RX_RING_SIZE(NETBK_MAX_RING_PAGES)
> +
>  struct xen_netbk;
>  
>  struct xenvif {
> @@ -66,6 +72,8 @@ struct xenvif {
>  	/* The shared rings and indexes. */
>  	struct xen_netif_tx_back_ring tx;
>  	struct xen_netif_rx_back_ring rx;
> +	unsigned int nr_tx_handles;
> +	unsigned int nr_rx_handles;
>  
>  	/* Frontend feature information. */
>  	u8 can_sg:1;
> @@ -105,15 +113,19 @@ static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
>  	return to_xenbus_device(vif->dev->dev.parent);
>  }
>  
> -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> +#define XEN_NETIF_TX_RING_SIZE(_nr_pages)		\
> +	__CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages))
> +#define XEN_NETIF_RX_RING_SIZE(_nr_pages)		\
> +	__CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE * (_nr_pages))
>  
>  struct xenvif *xenvif_alloc(struct device *parent,
>  			    domid_t domid,
>  			    unsigned int handle);
>  
> -int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
> -		   unsigned long rx_ring_ref, unsigned int evtchn);
> +int xenvif_connect(struct xenvif *vif,
> +		   unsigned long *tx_ring_ref, unsigned int tx_ring_order,
> +		   unsigned long *rx_ring_ref, unsigned int rx_ring_order,
> +		   unsigned int evtchn);
>  void xenvif_disconnect(struct xenvif *vif);
>  
>  void xenvif_get(struct xenvif *vif);
> @@ -129,10 +141,12 @@ int xen_netbk_rx_ring_full(struct xenvif *vif);
>  int xen_netbk_must_stop_queue(struct xenvif *vif);
>  
>  /* (Un)Map communication rings. */
> -void xen_netbk_unmap_frontend_rings(struct xenvif *vif);
> +void xen_netbk_unmap_frontend_rings(struct xenvif *vif, void *addr);
>  int xen_netbk_map_frontend_rings(struct xenvif *vif,
> -				 grant_ref_t tx_ring_ref,
> -				 grant_ref_t rx_ring_ref);
> +				 void **addr,
> +				 int domid,
> +				 int *ring_ref,
> +				 unsigned int ring_ref_count);
>  
>  /* (De)Register a xenvif with the netback backend. */
>  void xen_netbk_add_xenvif(struct xenvif *vif);
> @@ -158,4 +172,6 @@ void xenvif_carrier_off(struct xenvif *vif);
>  /* Returns number of ring slots required to send an skb to the frontend */
>  unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb);
>  
> +extern unsigned int MODPARM_netback_max_tx_ring_page_order;
> +extern unsigned int MODPARM_netback_max_rx_ring_page_order;
>  #endif /* __XEN_NETBACK__COMMON_H__ */
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index db638e1..fa4d46d 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -305,10 +305,16 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
>  	return vif;
>  }
>  
> -int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
> -		   unsigned long rx_ring_ref, unsigned int evtchn)
> +int xenvif_connect(struct xenvif *vif,
> +		   unsigned long *tx_ring_ref, unsigned int tx_ring_ref_count,
> +		   unsigned long *rx_ring_ref, unsigned int rx_ring_ref_count,
> +		   unsigned int evtchn)
>  {
>  	int err = -ENOMEM;
> +	void *addr;
> +	struct xen_netif_tx_sring *txs;
> +	struct xen_netif_rx_sring *rxs;
> +	int tmp[NETBK_MAX_RING_PAGES], i;
>  
>  	/* Already connected through? */
>  	if (vif->irq)
> @@ -316,15 +322,36 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
>  
>  	__module_get(THIS_MODULE);
>  
> -	err = xen_netbk_map_frontend_rings(vif, tx_ring_ref, rx_ring_ref);
> +	for (i = 0; i < tx_ring_ref_count; i++)
> +		tmp[i] = tx_ring_ref[i];
> +
> +	err = xen_netbk_map_frontend_rings(vif, &addr, vif->domid,
> +					   tmp, tx_ring_ref_count);
>  	if (err < 0)
>  		goto err;
>  
> +	txs = addr;
> +	BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE * tx_ring_ref_count);
> +	vif->nr_tx_handles = tx_ring_ref_count;
> +
> +	for (i = 0; i < rx_ring_ref_count; i++)
> +		tmp[i] = rx_ring_ref[i];
> +
> +	err = xen_netbk_map_frontend_rings(vif, &addr, vif->domid,
> +					   tmp, rx_ring_ref_count);
> +
> +	if (err < 0)
> +		goto err_tx_unmap;
> +
> +	rxs = addr;
> +	BACK_RING_INIT(&vif->rx, rxs, PAGE_SIZE * rx_ring_ref_count);
> +	vif->nr_rx_handles = rx_ring_ref_count;
> +
>  	err = bind_interdomain_evtchn_to_irqhandler(
>  		vif->domid, evtchn, xenvif_interrupt, 0,
>  		vif->dev->name, vif);
>  	if (err < 0)
> -		goto err_unmap;
> +		goto err_rx_unmap;
>  	vif->irq = err;
>  	disable_irq(vif->irq);
>  
> @@ -340,8 +367,12 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
>  	rtnl_unlock();
>  
>  	return 0;
> -err_unmap:
> -	xen_netbk_unmap_frontend_rings(vif);
> +err_rx_unmap:
> +	xen_netbk_unmap_frontend_rings(vif, (void *)vif->rx.sring);
> +	vif->nr_rx_handles = 0;
> +err_tx_unmap:
> +	xen_netbk_unmap_frontend_rings(vif, (void *)vif->tx.sring);
> +	vif->nr_tx_handles = 0;
>  err:
>  	module_put(THIS_MODULE);
>  	return err;
> @@ -382,7 +413,8 @@ void xenvif_disconnect(struct xenvif *vif)
>  
>  	unregister_netdev(vif->dev);
>  
> -	xen_netbk_unmap_frontend_rings(vif);
> +	xen_netbk_unmap_frontend_rings(vif, (void *)vif->tx.sring);
> +	xen_netbk_unmap_frontend_rings(vif, (void *)vif->rx.sring);
>  
>  	free_netdev(vif->dev);
>  
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 98ccea9..644c760 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -47,6 +47,19 @@
>  #include <asm/xen/hypercall.h>
>  #include <asm/xen/page.h>
>  
> +unsigned int MODPARM_netback_max_rx_ring_page_order = NETBK_MAX_RING_PAGE_ORDER;
> +module_param_named(netback_max_rx_ring_page_order,
> +		   MODPARM_netback_max_rx_ring_page_order, uint, 0);
> +MODULE_PARM_DESC(netback_max_rx_ring_page_order,
> +		 "Maximum supported receiver ring page order");
> +
> +unsigned int MODPARM_netback_max_tx_ring_page_order = NETBK_MAX_RING_PAGE_ORDER;
> +module_param_named(netback_max_tx_ring_page_order,
> +		   MODPARM_netback_max_tx_ring_page_order, uint, 0);
> +MODULE_PARM_DESC(netback_max_tx_ring_page_order,
> +		 "Maximum supported transmitter ring page order");
> +
> +

These should also show up in Documentation/ABI/sysfs/stable/xen* or
somewhere similar.

But more importantly: why not make the TX and RX orders equal and just
have one option?
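
Something like this sketch, say (the folded-together parameter name is
made up):

  unsigned int netback_max_ring_page_order = NETBK_MAX_RING_PAGE_ORDER;
  module_param(netback_max_ring_page_order, uint, 0);
  MODULE_PARM_DESC(netback_max_ring_page_order,
                   "Maximum supported ring page order (tx and rx)");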

>  struct pending_tx_info {
>  	struct xen_netif_tx_request req;
>  	struct xenvif *vif;
> @@ -59,7 +72,7 @@ struct netbk_rx_meta {
>  	int gso_size;
>  };
>  
> -#define MAX_PENDING_REQS 256
> +#define MAX_PENDING_REQS NETBK_MAX_TX_RING_SIZE
>  
>  /* Discriminate from any valid pending_idx value. */
>  #define INVALID_PENDING_IDX 0xFFFF
> @@ -111,8 +124,8 @@ struct xen_netbk {
>  	 * head/fragment page uses 2 copy operations because it
>  	 * straddles two buffers in the frontend.
>  	 */
> -	struct gnttab_copy grant_copy_op[2*XEN_NETIF_RX_RING_SIZE];
> -	struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE];
> +	struct gnttab_copy grant_copy_op[2*NETBK_MAX_RX_RING_SIZE];
> +	struct netbk_rx_meta meta[2*NETBK_MAX_RX_RING_SIZE];
>  };
>  
>  static struct xen_netbk *xen_netbk;
> @@ -262,7 +275,8 @@ int xen_netbk_rx_ring_full(struct xenvif *vif)
>  	RING_IDX needed = max_required_rx_slots(vif);
>  
>  	return ((vif->rx.sring->req_prod - peek) < needed) ||
> -	       ((vif->rx.rsp_prod_pvt + XEN_NETIF_RX_RING_SIZE - peek) < needed);
> +	       ((vif->rx.rsp_prod_pvt +
> +		 XEN_NETIF_RX_RING_SIZE(vif->nr_rx_handles) - peek) < needed);
>  }
>  
>  int xen_netbk_must_stop_queue(struct xenvif *vif)
> @@ -657,7 +671,8 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>  		__skb_queue_tail(&rxq, skb);
>  
>  		/* Filled the batch queue? */
> -		if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)
> +		if (count + MAX_SKB_FRAGS >=
> +		    XEN_NETIF_RX_RING_SIZE(vif->nr_rx_handles))
>  			break;
>  	}
>  
> @@ -1292,12 +1307,12 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>  			continue;
>  
>  		if (vif->tx.sring->req_prod - vif->tx.req_cons >
> -		    XEN_NETIF_TX_RING_SIZE) {
> +		    XEN_NETIF_TX_RING_SIZE(vif->nr_tx_handles)) {
>  			netdev_err(vif->dev,
>  				   "Impossible number of requests. "
>  				   "req_prod %d, req_cons %d, size %ld\n",
>  				   vif->tx.sring->req_prod, vif->tx.req_cons,
> -				   XEN_NETIF_TX_RING_SIZE);
> +				   XEN_NETIF_TX_RING_SIZE(vif->nr_tx_handles));
>  			netbk_fatal_tx_err(vif);
>  			continue;
>  		}
> @@ -1644,48 +1659,22 @@ static int xen_netbk_kthread(void *data)
>  	return 0;
>  }
>  
> -void xen_netbk_unmap_frontend_rings(struct xenvif *vif)
> +void xen_netbk_unmap_frontend_rings(struct xenvif *vif, void *addr)
>  {
> -	if (vif->tx.sring)
> -		xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif),
> -					vif->tx.sring);
> -	if (vif->rx.sring)
> -		xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif),
> -					vif->rx.sring);
> +	if (addr)
> +		xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif), addr);
>  }
>  
>  int xen_netbk_map_frontend_rings(struct xenvif *vif,
> -				 grant_ref_t tx_ring_ref,
> -				 grant_ref_t rx_ring_ref)
> +				 void **vaddr,
> +				 int domid,
> +				 int *ring_ref,
> +				 unsigned int ring_ref_count)
>  {
> -	void *addr;
> -	struct xen_netif_tx_sring *txs;
> -	struct xen_netif_rx_sring *rxs;
> -
> -	int err = -ENOMEM;
> -
> -	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
> -				     &tx_ring_ref, 1, &addr);
> -	if (err)
> -		goto err;
> -
> -	txs = (struct xen_netif_tx_sring *)addr;
> -	BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE);
> +	int err = 0;
>  
>  	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
> -				     &rx_ring_ref, 1, &addr);
> -	if (err)
> -		goto err;
> -
> -	rxs = (struct xen_netif_rx_sring *)addr;
> -	BACK_RING_INIT(&vif->rx, rxs, PAGE_SIZE);
> -
> -	vif->rx_req_cons_peek = 0;
> -
> -	return 0;
> -
> -err:
> -	xen_netbk_unmap_frontend_rings(vif);
> +				     ring_ref, ring_ref_count, vaddr);
>  	return err;
>  }
>  
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> index 65d14f2..1791807 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -114,6 +114,33 @@ static int netback_probe(struct xenbus_device *dev,
>  			goto abort_transaction;
>  		}
>  
> +		/* Multi-page ring support */
> +		if (MODPARM_netback_max_tx_ring_page_order >
> +		    NETBK_MAX_RING_PAGE_ORDER)
> +			MODPARM_netback_max_tx_ring_page_order =
> +				NETBK_MAX_RING_PAGE_ORDER;
> +		err = xenbus_printf(xbt, dev->nodename,
> +				    "max-tx-ring-page-order",
> +				    "%u",
> +				    MODPARM_netback_max_tx_ring_page_order);
> +		if (err) {
> +			message = "writing max-tx-ring-page-order";
> +			goto abort_transaction;
> +		}
> +
> +		if (MODPARM_netback_max_rx_ring_page_order >
> +		    NETBK_MAX_RING_PAGE_ORDER)
> +			MODPARM_netback_max_rx_ring_page_order =
> +				NETBK_MAX_RING_PAGE_ORDER;
> +		err = xenbus_printf(xbt, dev->nodename,
> +				    "max-rx-ring-page-order",
> +				    "%u",
> +				    MODPARM_netback_max_rx_ring_page_order);
> +		if (err) {
> +			message = "writing max-rx-ring-page-order";
> +			goto abort_transaction;
> +		}
> +
>  		err = xenbus_transaction_end(xbt, 0);
>  	} while (err == -EAGAIN);
>  
> @@ -392,22 +419,107 @@ static int connect_rings(struct backend_info *be)
>  {
>  	struct xenvif *vif = be->vif;
>  	struct xenbus_device *dev = be->dev;
> -	unsigned long tx_ring_ref, rx_ring_ref;
>  	unsigned int evtchn, rx_copy;
>  	int err;
>  	int val;
> +	unsigned long tx_ring_ref[NETBK_MAX_RING_PAGES];
> +	unsigned long rx_ring_ref[NETBK_MAX_RING_PAGES];
> +	unsigned int  tx_ring_order;
> +	unsigned int  rx_ring_order;
>  
>  	err = xenbus_gather(XBT_NIL, dev->otherend,
> -			    "tx-ring-ref", "%lu", &tx_ring_ref,
> -			    "rx-ring-ref", "%lu", &rx_ring_ref,
>  			    "event-channel", "%u", &evtchn, NULL);
>  	if (err) {
>  		xenbus_dev_fatal(dev, err,
> -				 "reading %s/ring-ref and event-channel",
> +				 "reading %s/event-channel",
>  				 dev->otherend);
>  		return err;
>  	}
>  
> +	err = xenbus_scanf(XBT_NIL, dev->otherend, "tx-ring-order", "%u",
> +			   &tx_ring_order);
> +	if (err < 0) {
> +		tx_ring_order = 0;
> +
> +		err = xenbus_scanf(XBT_NIL, dev->otherend, "tx-ring-ref", "%lu",
> +				   &tx_ring_ref[0]);
> +		if (err < 0) {
> +			xenbus_dev_fatal(dev, err, "reading %s/tx-ring-ref",
> +					 dev->otherend);
> +			return err;
> +		}
> +	} else {
> +		unsigned int i;
> +
> +		if (tx_ring_order > MODPARM_netback_max_tx_ring_page_order) {
> +			err = -EINVAL;
> +			xenbus_dev_fatal(dev, err,
> +					 "%s/tx-ring-page-order too big",
> +					 dev->otherend);
> +			return err;
> +		}
> +
> +		for (i = 0; i < (1U << tx_ring_order); i++) {
> +			char ring_ref_name[sizeof("tx-ring-ref") + 2];
> +
> +			snprintf(ring_ref_name, sizeof(ring_ref_name),
> +				 "tx-ring-ref%u", i);
> +
> +			err = xenbus_scanf(XBT_NIL, dev->otherend,
> +					   ring_ref_name, "%lu",
> +					   &tx_ring_ref[i]);
> +			if (err < 0) {
> +				xenbus_dev_fatal(dev, err,
> +						 "reading %s/%s",
> +						 dev->otherend,
> +						 ring_ref_name);
> +				return err;
> +			}
> +		}
> +	}
> +
> +	err = xenbus_scanf(XBT_NIL, dev->otherend, "rx-ring-order", "%u",
> +			   &rx_ring_order);
> +	if (err < 0) {
> +		rx_ring_order = 0;
> +
> +		err = xenbus_scanf(XBT_NIL, dev->otherend, "rx-ring-ref", "%lu",
> +				   &rx_ring_ref[0]);
> +		if (err < 0) {
> +			xenbus_dev_fatal(dev, err, "reading %s/rx-ring-ref",
> +					 dev->otherend);
> +			return err;
> +		}
> +	} else {
> +		unsigned int i;
> +
> +		if (rx_ring_order > MODPARM_netback_max_rx_ring_page_order) {
> +			err = -EINVAL;
> +			xenbus_dev_fatal(dev, err,
> +					 "%s/rx-ring-page-order too big",
> +					 dev->otherend);
> +			return err;
> +		}
> +
> +		for (i = 0; i < (1U << rx_ring_order); i++) {
> +			char ring_ref_name[sizeof("rx-ring-ref") + 2];
> +
> +			snprintf(ring_ref_name, sizeof(ring_ref_name),
> +				 "rx-ring-ref%u", i);
> +
> +			err = xenbus_scanf(XBT_NIL, dev->otherend,
> +					   ring_ref_name, "%lu",
> +					   &rx_ring_ref[i]);
> +			if (err < 0) {
> +				xenbus_dev_fatal(dev, err,
> +						 "reading %s/%s",
> +						 dev->otherend,
> +						 ring_ref_name);
> +				return err;
> +			}
> +		}
> +	}
> +
>  	err = xenbus_scanf(XBT_NIL, dev->otherend, "request-rx-copy", "%u",
>  			   &rx_copy);
>  	if (err == -ENOENT) {
> @@ -454,11 +566,28 @@ static int connect_rings(struct backend_info *be)
>  	vif->csum = !val;
>  
>  	/* Map the shared frame, irq etc. */
> -	err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn);
> +	err = xenvif_connect(vif, tx_ring_ref, (1U << tx_ring_order),
> +			     rx_ring_ref, (1U << rx_ring_order),
> +			     evtchn);
>  	if (err) {
> +		/* construct 1 2 3 / 4 5 6 */
> +		int i;
> +		char txs[3 * (1U << MODPARM_netback_max_tx_ring_page_order)];
> +		char rxs[3 * (1U << MODPARM_netback_max_rx_ring_page_order)];
> +
> +		txs[0] = rxs[0] = 0;
> +
> +		for (i = 0; i < (1U << tx_ring_order); i++)
> +			snprintf(txs+strlen(txs), sizeof(txs)-strlen(txs)-1,
> +				 " %lu", tx_ring_ref[i]);
> +
> +		for (i = 0; i < (1U << rx_ring_order); i++)
> +			snprintf(rxs+strlen(rxs), sizeof(rxs)-strlen(rxs)-1,
> +				 " %lu", rx_ring_ref[i]);
> +
>  		xenbus_dev_fatal(dev, err,
> -				 "mapping shared-frames %lu/%lu port %u",
> -				 tx_ring_ref, rx_ring_ref, evtchn);
> +				 "mapping shared-frames%s /%s port %u",
> +				 txs, rxs, evtchn);
>  		return err;
>  	}
>  	return 0;
> -- 
> 1.7.10.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 5/8] netback: multi-page ring support
  2013-02-15 16:00 ` Wei Liu
  2013-03-04 21:00   ` [Xen-devel] " Konrad Rzeszutek Wilk
@ 2013-03-04 21:00   ` Konrad Rzeszutek Wilk
  2013-03-05 10:41   ` David Vrabel
  2013-03-05 10:41   ` [Xen-devel] " David Vrabel
  3 siblings, 0 replies; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-04 21:00 UTC (permalink / raw)
  To: Wei Liu; +Cc: netdev, annie.li, ian.campbell, xen-devel

On Fri, Feb 15, 2013 at 04:00:06PM +0000, Wei Liu wrote:

Please add a bit more description. Say which XenBus feature this
is using. And that probably means another patch to the Xen
tree to document this.
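
For reference, the xenstore handshake this series implements looks roughly
like the sketch below (key names taken from the patch itself; the layout is
illustrative, not a spec), and the commit message could spell this out:

	backend (netback_probe) advertises:
		max-tx-ring-page-order = "2"
		max-rx-ring-page-order = "2"

	a multi-page-aware frontend then writes, e.g. for order 1:
		tx-ring-order = "1"
		tx-ring-ref0 = "<gref>"
		tx-ring-ref1 = "<gref>"
		rx-ring-order = "1"
		rx-ring-ref0 = "<gref>"
		rx-ring-ref1 = "<gref>"

	an old frontend keeps using the single-page keys:
		tx-ring-ref = "<gref>"
		rx-ring-ref = "<gref>"

which is the fallback path in the connect_rings() hunk below.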


> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  drivers/net/xen-netback/common.h    |   30 ++++++--
>  drivers/net/xen-netback/interface.c |   46 +++++++++--
>  drivers/net/xen-netback/netback.c   |   73 ++++++++----------
>  drivers/net/xen-netback/xenbus.c    |  143 +++++++++++++++++++++++++++++++++--
>  4 files changed, 229 insertions(+), 63 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index 35d8772..f541ba9 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -45,6 +45,12 @@
>  #include <xen/grant_table.h>
>  #include <xen/xenbus.h>
>  
> +#define NETBK_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER
> +#define NETBK_MAX_RING_PAGES      (1U << NETBK_MAX_RING_PAGE_ORDER)
> +
> +#define NETBK_MAX_TX_RING_SIZE XEN_NETIF_TX_RING_SIZE(NETBK_MAX_RING_PAGES)
> +#define NETBK_MAX_RX_RING_SIZE XEN_NETIF_RX_RING_SIZE(NETBK_MAX_RING_PAGES)
> +
>  struct xen_netbk;
>  
>  struct xenvif {
> @@ -66,6 +72,8 @@ struct xenvif {
>  	/* The shared rings and indexes. */
>  	struct xen_netif_tx_back_ring tx;
>  	struct xen_netif_rx_back_ring rx;
> +	unsigned int nr_tx_handles;
> +	unsigned int nr_rx_handles;
>  
>  	/* Frontend feature information. */
>  	u8 can_sg:1;
> @@ -105,15 +113,19 @@ static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
>  	return to_xenbus_device(vif->dev->dev.parent);
>  }
>  
> -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> +#define XEN_NETIF_TX_RING_SIZE(_nr_pages)		\
> +	__CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages))
> +#define XEN_NETIF_RX_RING_SIZE(_nr_pages)		\
> +	__CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE * (_nr_pages))
>  
>  struct xenvif *xenvif_alloc(struct device *parent,
>  			    domid_t domid,
>  			    unsigned int handle);
>  
> -int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
> -		   unsigned long rx_ring_ref, unsigned int evtchn);
> +int xenvif_connect(struct xenvif *vif,
> +		   unsigned long *tx_ring_ref, unsigned int tx_ring_order,
> +		   unsigned long *rx_ring_ref, unsigned int rx_ring_order,
> +		   unsigned int evtchn);
>  void xenvif_disconnect(struct xenvif *vif);
>  
>  void xenvif_get(struct xenvif *vif);
> @@ -129,10 +141,12 @@ int xen_netbk_rx_ring_full(struct xenvif *vif);
>  int xen_netbk_must_stop_queue(struct xenvif *vif);
>  
>  /* (Un)Map communication rings. */
> -void xen_netbk_unmap_frontend_rings(struct xenvif *vif);
> +void xen_netbk_unmap_frontend_rings(struct xenvif *vif, void *addr);
>  int xen_netbk_map_frontend_rings(struct xenvif *vif,
> -				 grant_ref_t tx_ring_ref,
> -				 grant_ref_t rx_ring_ref);
> +				 void **addr,
> +				 int domid,
> +				 int *ring_ref,
> +				 unsigned int ring_ref_count);
>  
>  /* (De)Register a xenvif with the netback backend. */
>  void xen_netbk_add_xenvif(struct xenvif *vif);
> @@ -158,4 +172,6 @@ void xenvif_carrier_off(struct xenvif *vif);
>  /* Returns number of ring slots required to send an skb to the frontend */
>  unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb);
>  
> +extern unsigned int MODPARM_netback_max_tx_ring_page_order;
> +extern unsigned int MODPARM_netback_max_rx_ring_page_order;
>  #endif /* __XEN_NETBACK__COMMON_H__ */
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index db638e1..fa4d46d 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -305,10 +305,16 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
>  	return vif;
>  }
>  
> -int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
> -		   unsigned long rx_ring_ref, unsigned int evtchn)
> +int xenvif_connect(struct xenvif *vif,
> +		   unsigned long *tx_ring_ref, unsigned int tx_ring_ref_count,
> +		   unsigned long *rx_ring_ref, unsigned int rx_ring_ref_count,
> +		   unsigned int evtchn)
>  {
>  	int err = -ENOMEM;
> +	void *addr;
> +	struct xen_netif_tx_sring *txs;
> +	struct xen_netif_rx_sring *rxs;
> +	int tmp[NETBK_MAX_RING_PAGES], i;
>  
>  	/* Already connected through? */
>  	if (vif->irq)
> @@ -316,15 +322,36 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
>  
>  	__module_get(THIS_MODULE);
>  
> -	err = xen_netbk_map_frontend_rings(vif, tx_ring_ref, rx_ring_ref);
> +	for (i = 0; i < tx_ring_ref_count; i++)
> +		tmp[i] = tx_ring_ref[i];
> +
> +	err = xen_netbk_map_frontend_rings(vif, &addr, vif->domid,
> +					   tmp, tx_ring_ref_count);
>  	if (err < 0)
>  		goto err;
>  
> +	txs = addr;
> +	BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE * tx_ring_ref_count);
> +	vif->nr_tx_handles = tx_ring_ref_count;
> +
> +	for (i = 0; i < rx_ring_ref_count; i++)
> +		tmp[i] = rx_ring_ref[i];
> +
> +	err = xen_netbk_map_frontend_rings(vif, &addr, vif->domid,
> +					   tmp, rx_ring_ref_count);
> +
> +	if (err < 0)
> +		goto err_tx_unmap;
> +
> +	rxs = addr;
> +	BACK_RING_INIT(&vif->rx, rxs, PAGE_SIZE * rx_ring_ref_count);
> +	vif->nr_rx_handles = rx_ring_ref_count;
> +
>  	err = bind_interdomain_evtchn_to_irqhandler(
>  		vif->domid, evtchn, xenvif_interrupt, 0,
>  		vif->dev->name, vif);
>  	if (err < 0)
> -		goto err_unmap;
> +		goto err_rx_unmap;
>  	vif->irq = err;
>  	disable_irq(vif->irq);
>  
> @@ -340,8 +367,12 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
>  	rtnl_unlock();
>  
>  	return 0;
> -err_unmap:
> -	xen_netbk_unmap_frontend_rings(vif);
> +err_rx_unmap:
> +	xen_netbk_unmap_frontend_rings(vif, (void *)vif->rx.sring);
> +	vif->nr_rx_handles = 0;
> +err_tx_unmap:
> +	xen_netbk_unmap_frontend_rings(vif, (void *)vif->tx.sring);
> +	vif->nr_tx_handles = 0;
>  err:
>  	module_put(THIS_MODULE);
>  	return err;
> @@ -382,7 +413,8 @@ void xenvif_disconnect(struct xenvif *vif)
>  
>  	unregister_netdev(vif->dev);
>  
> -	xen_netbk_unmap_frontend_rings(vif);
> +	xen_netbk_unmap_frontend_rings(vif, (void *)vif->tx.sring);
> +	xen_netbk_unmap_frontend_rings(vif, (void *)vif->rx.sring);
>  
>  	free_netdev(vif->dev);
>  
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 98ccea9..644c760 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -47,6 +47,19 @@
>  #include <asm/xen/hypercall.h>
>  #include <asm/xen/page.h>
>  
> +unsigned int MODPARM_netback_max_rx_ring_page_order = NETBK_MAX_RING_PAGE_ORDER;
> +module_param_named(netback_max_rx_ring_page_order,
> +		   MODPARM_netback_max_rx_ring_page_order, uint, 0);
> +MODULE_PARM_DESC(netback_max_rx_ring_page_order,
> +		 "Maximum supported receiver ring page order");
> +
> +unsigned int MODPARM_netback_max_tx_ring_page_order = NETBK_MAX_RING_PAGE_ORDER;
> +module_param_named(netback_max_tx_ring_page_order,
> +		   MODPARM_netback_max_tx_ring_page_order, uint, 0);
> +MODULE_PARM_DESC(netback_max_tx_ring_page_order,
> +		 "Maximum supported transmitter ring page order");
> +
> +

These should also show up in Documentation/ABI/stable/sysfs-* (e.g.
sysfs-bus-xen-backend) or somewhere similar.

But more importantly: why not make the TX order equal the RX order
and just have one option?
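
Something like this single knob would do (untested sketch; the parameter
name here is made up):

	unsigned int netback_max_ring_page_order = NETBK_MAX_RING_PAGE_ORDER;
	module_param(netback_max_ring_page_order, uint, 0);
	MODULE_PARM_DESC(netback_max_ring_page_order,
			 "Maximum supported ring page order (TX and RX)");

with both the netback_probe() writes and the connect_rings() checks reading
the same variable.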

>  struct pending_tx_info {
>  	struct xen_netif_tx_request req;
>  	struct xenvif *vif;
> @@ -59,7 +72,7 @@ struct netbk_rx_meta {
>  	int gso_size;
>  };
>  
> -#define MAX_PENDING_REQS 256
> +#define MAX_PENDING_REQS NETBK_MAX_TX_RING_SIZE
>  
>  /* Discriminate from any valid pending_idx value. */
>  #define INVALID_PENDING_IDX 0xFFFF
> @@ -111,8 +124,8 @@ struct xen_netbk {
>  	 * head/fragment page uses 2 copy operations because it
>  	 * straddles two buffers in the frontend.
>  	 */
> -	struct gnttab_copy grant_copy_op[2*XEN_NETIF_RX_RING_SIZE];
> -	struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE];
> +	struct gnttab_copy grant_copy_op[2*NETBK_MAX_RX_RING_SIZE];
> +	struct netbk_rx_meta meta[2*NETBK_MAX_RX_RING_SIZE];
>  };
>  
>  static struct xen_netbk *xen_netbk;
> @@ -262,7 +275,8 @@ int xen_netbk_rx_ring_full(struct xenvif *vif)
>  	RING_IDX needed = max_required_rx_slots(vif);
>  
>  	return ((vif->rx.sring->req_prod - peek) < needed) ||
> -	       ((vif->rx.rsp_prod_pvt + XEN_NETIF_RX_RING_SIZE - peek) < needed);
> +	       ((vif->rx.rsp_prod_pvt +
> +		 XEN_NETIF_RX_RING_SIZE(vif->nr_rx_handles) - peek) < needed);
>  }
>  
>  int xen_netbk_must_stop_queue(struct xenvif *vif)
> @@ -657,7 +671,8 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>  		__skb_queue_tail(&rxq, skb);
>  
>  		/* Filled the batch queue? */
> -		if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)
> +		if (count + MAX_SKB_FRAGS >=
> +		    XEN_NETIF_RX_RING_SIZE(vif->nr_rx_handles))
>  			break;
>  	}
>  
> @@ -1292,12 +1307,12 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>  			continue;
>  
>  		if (vif->tx.sring->req_prod - vif->tx.req_cons >
> -		    XEN_NETIF_TX_RING_SIZE) {
> +		    XEN_NETIF_TX_RING_SIZE(vif->nr_tx_handles)) {
>  			netdev_err(vif->dev,
>  				   "Impossible number of requests. "
>  				   "req_prod %d, req_cons %d, size %ld\n",
>  				   vif->tx.sring->req_prod, vif->tx.req_cons,
> -				   XEN_NETIF_TX_RING_SIZE);
> +				   XEN_NETIF_TX_RING_SIZE(vif->nr_tx_handles));
>  			netbk_fatal_tx_err(vif);
>  			continue;
>  		}
> @@ -1644,48 +1659,22 @@ static int xen_netbk_kthread(void *data)
>  	return 0;
>  }
>  
> -void xen_netbk_unmap_frontend_rings(struct xenvif *vif)
> +void xen_netbk_unmap_frontend_rings(struct xenvif *vif, void *addr)
>  {
> -	if (vif->tx.sring)
> -		xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif),
> -					vif->tx.sring);
> -	if (vif->rx.sring)
> -		xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif),
> -					vif->rx.sring);
> +	if (addr)
> +		xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif), addr);
>  }
>  
>  int xen_netbk_map_frontend_rings(struct xenvif *vif,
> -				 grant_ref_t tx_ring_ref,
> -				 grant_ref_t rx_ring_ref)
> +				 void **vaddr,
> +				 int domid,
> +				 int *ring_ref,
> +				 unsigned int ring_ref_count)
>  {
> -	void *addr;
> -	struct xen_netif_tx_sring *txs;
> -	struct xen_netif_rx_sring *rxs;
> -
> -	int err = -ENOMEM;
> -
> -	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
> -				     &tx_ring_ref, 1, &addr);
> -	if (err)
> -		goto err;
> -
> -	txs = (struct xen_netif_tx_sring *)addr;
> -	BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE);
> +	int err = 0;
>  
>  	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
> -				     &rx_ring_ref, 1, &addr);
> -	if (err)
> -		goto err;
> -
> -	rxs = (struct xen_netif_rx_sring *)addr;
> -	BACK_RING_INIT(&vif->rx, rxs, PAGE_SIZE);
> -
> -	vif->rx_req_cons_peek = 0;
> -
> -	return 0;
> -
> -err:
> -	xen_netbk_unmap_frontend_rings(vif);
> +				     ring_ref, ring_ref_count, vaddr);
>  	return err;
>  }
>  
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> index 65d14f2..1791807 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -114,6 +114,33 @@ static int netback_probe(struct xenbus_device *dev,
>  			goto abort_transaction;
>  		}
>  
> +		/* Multi-page ring support */
> +		if (MODPARM_netback_max_tx_ring_page_order >
> +		    NETBK_MAX_RING_PAGE_ORDER)
> +			MODPARM_netback_max_tx_ring_page_order =
> +				NETBK_MAX_RING_PAGE_ORDER;
> +		err = xenbus_printf(xbt, dev->nodename,
> +				    "max-tx-ring-page-order",
> +				    "%u",
> +				    MODPARM_netback_max_tx_ring_page_order);
> +		if (err) {
> +			message = "writing max-tx-ring-page-order";
> +			goto abort_transaction;
> +		}
> +
> +		if (MODPARM_netback_max_rx_ring_page_order >
> +		    NETBK_MAX_RING_PAGE_ORDER)
> +			MODPARM_netback_max_rx_ring_page_order =
> +				NETBK_MAX_RING_PAGE_ORDER;
> +		err = xenbus_printf(xbt, dev->nodename,
> +				    "max-rx-ring-page-order",
> +				    "%u",
> +				    MODPARM_netback_max_rx_ring_page_order);
> +		if (err) {
> +			message = "writing max-rx-ring-page-order";
> +			goto abort_transaction;
> +		}
> +
>  		err = xenbus_transaction_end(xbt, 0);
>  	} while (err == -EAGAIN);
>  
> @@ -392,22 +419,107 @@ static int connect_rings(struct backend_info *be)
>  {
>  	struct xenvif *vif = be->vif;
>  	struct xenbus_device *dev = be->dev;
> -	unsigned long tx_ring_ref, rx_ring_ref;
>  	unsigned int evtchn, rx_copy;
>  	int err;
>  	int val;
> +	unsigned long tx_ring_ref[NETBK_MAX_RING_PAGES];
> +	unsigned long rx_ring_ref[NETBK_MAX_RING_PAGES];
> +	unsigned int  tx_ring_order;
> +	unsigned int  rx_ring_order;
>  
>  	err = xenbus_gather(XBT_NIL, dev->otherend,
> -			    "tx-ring-ref", "%lu", &tx_ring_ref,
> -			    "rx-ring-ref", "%lu", &rx_ring_ref,
>  			    "event-channel", "%u", &evtchn, NULL);
>  	if (err) {
>  		xenbus_dev_fatal(dev, err,
> -				 "reading %s/ring-ref and event-channel",
> +				 "reading %s/event-channel",
>  				 dev->otherend);
>  		return err;
>  	}
>  
> +	err = xenbus_scanf(XBT_NIL, dev->otherend, "tx-ring-order", "%u",
> +			   &tx_ring_order);
> +	if (err < 0) {
> +		tx_ring_order = 0;
> +
> +		err = xenbus_scanf(XBT_NIL, dev->otherend, "tx-ring-ref", "%lu",
> +				   &tx_ring_ref[0]);
> +		if (err < 0) {
> +			xenbus_dev_fatal(dev, err, "reading %s/tx-ring-ref",
> +					 dev->otherend);
> +			return err;
> +		}
> +	} else {
> +		unsigned int i;
> +
> +		if (tx_ring_order > MODPARM_netback_max_tx_ring_page_order) {
> +			err = -EINVAL;
> +			xenbus_dev_fatal(dev, err,
> +					 "%s/tx-ring-page-order too big",
> +					 dev->otherend);
> +			return err;
> +		}
> +
> +		for (i = 0; i < (1U << tx_ring_order); i++) {
> +			char ring_ref_name[sizeof("tx-ring-ref") + 2];
> +
> +			snprintf(ring_ref_name, sizeof(ring_ref_name),
> +				 "tx-ring-ref%u", i);
> +
> +			err = xenbus_scanf(XBT_NIL, dev->otherend,
> +					   ring_ref_name, "%lu",
> +					   &tx_ring_ref[i]);
> +			if (err < 0) {
> +				xenbus_dev_fatal(dev, err,
> +						 "reading %s/%s",
> +						 dev->otherend,
> +						 ring_ref_name);
> +				return err;
> +			}
> +		}
> +	}
> +
> +	err = xenbus_scanf(XBT_NIL, dev->otherend, "rx-ring-order", "%u",
> +			   &rx_ring_order);
> +	if (err < 0) {
> +		rx_ring_order = 0;
> +
> +		err = xenbus_scanf(XBT_NIL, dev->otherend, "rx-ring-ref", "%lu",
> +				   &rx_ring_ref[0]);
> +		if (err < 0) {
> +			xenbus_dev_fatal(dev, err, "reading %s/rx-ring-ref",
> +					 dev->otherend);
> +			return err;
> +		}
> +	} else {
> +		unsigned int i;
> +
> +		if (rx_ring_order > MODPARM_netback_max_rx_ring_page_order) {
> +			err = -EINVAL;
> +			xenbus_dev_fatal(dev, err,
> +					 "%s/rx-ring-page-order too big",
> +					 dev->otherend);
> +			return err;
> +		}
> +
> +		for (i = 0; i < (1U << rx_ring_order); i++) {
> +			char ring_ref_name[sizeof("rx-ring-ref") + 2];
> +
> +			snprintf(ring_ref_name, sizeof(ring_ref_name),
> +				 "rx-ring-ref%u", i);
> +
> +			err = xenbus_scanf(XBT_NIL, dev->otherend,
> +					   ring_ref_name, "%lu",
> +					   &rx_ring_ref[i]);
> +			if (err < 0) {
> +				xenbus_dev_fatal(dev, err,
> +						 "reading %s/%s",
> +						 dev->otherend,
> +						 ring_ref_name);
> +				return err;
> +			}
> +		}
> +	}
> +
>  	err = xenbus_scanf(XBT_NIL, dev->otherend, "request-rx-copy", "%u",
>  			   &rx_copy);
>  	if (err == -ENOENT) {
> @@ -454,11 +566,28 @@ static int connect_rings(struct backend_info *be)
>  	vif->csum = !val;
>  
>  	/* Map the shared frame, irq etc. */
> -	err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn);
> +	err = xenvif_connect(vif, tx_ring_ref, (1U << tx_ring_order),
> +			     rx_ring_ref, (1U << rx_ring_order),
> +			     evtchn);
>  	if (err) {
> +		/* construct 1 2 3 / 4 5 6 */
> +		int i;
> +		char txs[3 * (1U << MODPARM_netback_max_tx_ring_page_order)];
> +		char rxs[3 * (1U << MODPARM_netback_max_rx_ring_page_order)];
> +
> +		txs[0] = rxs[0] = 0;
> +
> +		for (i = 0; i < (1U << tx_ring_order); i++)
> +			snprintf(txs+strlen(txs), sizeof(txs)-strlen(txs)-1,
> +				 " %lu", tx_ring_ref[i]);
> +
> +		for (i = 0; i < (1U << rx_ring_order); i++)
> +			snprintf(rxs+strlen(rxs), sizeof(rxs)-strlen(rxs)-1,
> +				 " %lu", rx_ring_ref[i]);
> +
>  		xenbus_dev_fatal(dev, err,
> -				 "mapping shared-frames %lu/%lu port %u",
> -				 tx_ring_ref, rx_ring_ref, evtchn);
> +				 "mapping shared-frames%s /%s port %u",
> +				 txs, rxs, evtchn);
>  		return err;
>  	}
>  	return 0;
> -- 
> 1.7.10.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring
  2013-02-15 16:00 ` [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring Wei Liu
                     ` (2 preceding siblings ...)
  2013-03-04 21:12   ` Konrad Rzeszutek Wilk
@ 2013-03-04 21:12   ` Konrad Rzeszutek Wilk
  2013-03-05 10:25   ` David Vrabel
  2013-03-05 10:25   ` [Xen-devel] " David Vrabel
  5 siblings, 0 replies; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-04 21:12 UTC (permalink / raw)
  To: Wei Liu
  Cc: xen-devel, netdev, ian.campbell, annie.li, Roger Pau Monne,
	Stefano Stabellini, Mukesh Rathor

On Fri, Feb 15, 2013 at 04:00:05PM +0000, Wei Liu wrote:
> Also bundle fixes for xen frontends and backends in this patch.

Please explain what this does. I have a fairly good idea
of what it does, but folks who later find this git commit
in the source history might not understand how it is
supposed to be used, or why it is better to have more than
order-0 (single-page) rings.
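
For the record, the obvious benefit is ring depth. With the macros introduced
earlier in this series (and assuming the usual 4 KiB PAGE_SIZE, where a
single-page netif ring holds 256 slots):

	XEN_NETIF_TX_RING_SIZE(1) == 256	/* today's single-page ring */
	XEN_NETIF_TX_RING_SIZE(4) == 1024	/* order-2 ring, 4x the slots */

__CONST_RING_SIZE rounds down to a power of two, so the scaling here is
exact. A paragraph along those lines in the commit message would help.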



> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Roger Pau Monne <roger.pau@citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Cc: Mukesh Rathor <mukesh.rathor@oracle.com>
> ---
>  drivers/block/xen-blkback/xenbus.c |   14 +-
>  drivers/block/xen-blkfront.c       |    6 +-
>  drivers/net/xen-netback/netback.c  |    4 +-
>  drivers/net/xen-netfront.c         |    9 +-
>  drivers/pci/xen-pcifront.c         |    5 +-
>  drivers/xen/xen-pciback/xenbus.c   |   10 +-
>  drivers/xen/xenbus/xenbus_client.c |  314 ++++++++++++++++++++++++++----------
>  include/xen/xenbus.h               |   17 +-
>  8 files changed, 270 insertions(+), 109 deletions(-)
> 
> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> index 6398072..384ff24 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -122,7 +122,8 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid)
>  	return blkif;
>  }
>  
> -static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page,
> +static int xen_blkif_map(struct xen_blkif *blkif, int *shared_pages,
> +			 int nr_pages,
>  			 unsigned int evtchn)
>  {
>  	int err;
> @@ -131,7 +132,8 @@ static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page,
>  	if (blkif->irq)
>  		return 0;
>  
> -	err = xenbus_map_ring_valloc(blkif->be->dev, shared_page, &blkif->blk_ring);
> +	err = xenbus_map_ring_valloc(blkif->be->dev, shared_pages,
> +				     nr_pages, &blkif->blk_ring);
>  	if (err < 0)
>  		return err;
>  
> @@ -726,7 +728,7 @@ again:
>  static int connect_ring(struct backend_info *be)
>  {
>  	struct xenbus_device *dev = be->dev;
> -	unsigned long ring_ref;
> +	int ring_ref;
>  	unsigned int evtchn;
>  	unsigned int pers_grants;
>  	char protocol[64] = "";
> @@ -767,14 +769,14 @@ static int connect_ring(struct backend_info *be)
>  	be->blkif->vbd.feature_gnt_persistent = pers_grants;
>  	be->blkif->vbd.overflow_max_grants = 0;
>  
> -	pr_info(DRV_PFX "ring-ref %ld, event-channel %d, protocol %d (%s) %s\n",
> +	pr_info(DRV_PFX "ring-ref %d, event-channel %d, protocol %d (%s) %s\n",
>  		ring_ref, evtchn, be->blkif->blk_protocol, protocol,
>  		pers_grants ? "persistent grants" : "");
>  
>  	/* Map the shared frame, irq etc. */
> -	err = xen_blkif_map(be->blkif, ring_ref, evtchn);
> +	err = xen_blkif_map(be->blkif, &ring_ref, 1, evtchn);
>  	if (err) {
> -		xenbus_dev_fatal(dev, err, "mapping ring-ref %lu port %u",
> +		xenbus_dev_fatal(dev, err, "mapping ring-ref %u port %u",
>  				 ring_ref, evtchn);
>  		return err;
>  	}
> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> index 96e9b00..12c9ebd 100644
> --- a/drivers/block/xen-blkfront.c
> +++ b/drivers/block/xen-blkfront.c
> @@ -991,6 +991,7 @@ static int setup_blkring(struct xenbus_device *dev,
>  {
>  	struct blkif_sring *sring;
>  	int err;
> +	int grefs[1];

Perhaps all of those '1's should have a #define.

Say 'XEN_DEFAULT_RING_PAGE_SIZE'?
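
i.e. something along these lines (name as suggested; untested):

	#define XEN_DEFAULT_RING_PAGE_SIZE 1	/* ring size, in pages */

	int grefs[XEN_DEFAULT_RING_PAGE_SIZE];

	err = xenbus_grant_ring(dev, info->ring.sring,
				XEN_DEFAULT_RING_PAGE_SIZE, grefs);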

>  
>  	info->ring_ref = GRANT_INVALID_REF;
>  
> @@ -1004,13 +1005,14 @@ static int setup_blkring(struct xenbus_device *dev,
>  
>  	sg_init_table(info->sg, BLKIF_MAX_SEGMENTS_PER_REQUEST);
>  
> -	err = xenbus_grant_ring(dev, virt_to_mfn(info->ring.sring));
> +	err = xenbus_grant_ring(dev, info->ring.sring,
> +				1, grefs);
>  	if (err < 0) {
>  		free_page((unsigned long)sring);
>  		info->ring.sring = NULL;
>  		goto fail;
>  	}
> -	info->ring_ref = err;
> +	info->ring_ref = grefs[0];
>  
>  	err = xenbus_alloc_evtchn(dev, &info->evtchn);
>  	if (err)
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index de59098..98ccea9 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -1665,7 +1665,7 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
>  	int err = -ENOMEM;
>  
>  	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
> -				     tx_ring_ref, &addr);
> +				     &tx_ring_ref, 1, &addr);
>  	if (err)
>  		goto err;
>  
> @@ -1673,7 +1673,7 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
>  	BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE);
>  
>  	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
> -				     rx_ring_ref, &addr);
> +				     &rx_ring_ref, 1, &addr);
>  	if (err)
>  		goto err;
>  
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 7ffa43b..8bd75a1 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -1501,6 +1501,7 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
>  	struct xen_netif_tx_sring *txs;
>  	struct xen_netif_rx_sring *rxs;
>  	int err;
> +	int grefs[1];
>  	struct net_device *netdev = info->netdev;
>  
>  	info->tx_ring_ref = GRANT_INVALID_REF;
> @@ -1524,13 +1525,13 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
>  	SHARED_RING_INIT(txs);
>  	FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE);
>  
> -	err = xenbus_grant_ring(dev, virt_to_mfn(txs));
> +	err = xenbus_grant_ring(dev, txs, 1, grefs);
>  	if (err < 0) {
>  		free_page((unsigned long)txs);
>  		goto fail;
>  	}
>  
> -	info->tx_ring_ref = err;
> +	info->tx_ring_ref = grefs[0];
>  	rxs = (struct xen_netif_rx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH);
>  	if (!rxs) {
>  		err = -ENOMEM;
> @@ -1540,12 +1541,12 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
>  	SHARED_RING_INIT(rxs);
>  	FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE);
>  
> -	err = xenbus_grant_ring(dev, virt_to_mfn(rxs));
> +	err = xenbus_grant_ring(dev, rxs, 1, grefs);
>  	if (err < 0) {
>  		free_page((unsigned long)rxs);
>  		goto fail;
>  	}
> -	info->rx_ring_ref = err;
> +	info->rx_ring_ref = grefs[0];
>  
>  	err = xenbus_alloc_evtchn(dev, &info->evtchn);
>  	if (err)
> diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
> index 966abc6..016a2bb 100644
> --- a/drivers/pci/xen-pcifront.c
> +++ b/drivers/pci/xen-pcifront.c
> @@ -772,12 +772,13 @@ static int pcifront_publish_info(struct pcifront_device *pdev)
>  {
>  	int err = 0;
>  	struct xenbus_transaction trans;
> +	int grefs[1];
>  
> -	err = xenbus_grant_ring(pdev->xdev, virt_to_mfn(pdev->sh_info));
> +	err = xenbus_grant_ring(pdev->xdev, pdev->sh_info, 1, grefs);
>  	if (err < 0)
>  		goto out;
>  
> -	pdev->gnt_ref = err;
> +	pdev->gnt_ref = grefs[0];
>  
>  	err = xenbus_alloc_evtchn(pdev->xdev, &pdev->evtchn);
>  	if (err)
> diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
> index 64b11f9..4655851 100644
> --- a/drivers/xen/xen-pciback/xenbus.c
> +++ b/drivers/xen/xen-pciback/xenbus.c
> @@ -98,17 +98,17 @@ static void free_pdev(struct xen_pcibk_device *pdev)
>  	kfree(pdev);
>  }
>  
> -static int xen_pcibk_do_attach(struct xen_pcibk_device *pdev, int gnt_ref,
> -			     int remote_evtchn)
> +static int xen_pcibk_do_attach(struct xen_pcibk_device *pdev, int *gnt_ref,
> +			       int nr_grefs, int remote_evtchn)
>  {
>  	int err = 0;
>  	void *vaddr;
>  
>  	dev_dbg(&pdev->xdev->dev,
>  		"Attaching to frontend resources - gnt_ref=%d evtchn=%d\n",
> -		gnt_ref, remote_evtchn);
> +		gnt_ref[0], remote_evtchn);
>  
> -	err = xenbus_map_ring_valloc(pdev->xdev, gnt_ref, &vaddr);
> +	err = xenbus_map_ring_valloc(pdev->xdev, gnt_ref, nr_grefs, &vaddr);
>  	if (err < 0) {
>  		xenbus_dev_fatal(pdev->xdev, err,
>  				"Error mapping other domain page in ours.");
> @@ -172,7 +172,7 @@ static int xen_pcibk_attach(struct xen_pcibk_device *pdev)
>  		goto out;
>  	}
>  
> -	err = xen_pcibk_do_attach(pdev, gnt_ref, remote_evtchn);
> +	err = xen_pcibk_do_attach(pdev, &gnt_ref, 1, remote_evtchn);
>  	if (err)
>  		goto out;
>  
> diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
> index 1bac743..7c1bd49 100644
> --- a/drivers/xen/xenbus/xenbus_client.c
> +++ b/drivers/xen/xenbus/xenbus_client.c
> @@ -54,14 +54,16 @@ struct xenbus_map_node {
>  		struct vm_struct *area; /* PV */
>  		struct page *page;     /* HVM */
>  	};
> -	grant_handle_t handle;
> +	grant_handle_t handle[XENBUS_MAX_RING_PAGES];
> +	unsigned int   nr_handles;
>  };
>  
>  static DEFINE_SPINLOCK(xenbus_valloc_lock);
>  static LIST_HEAD(xenbus_valloc_pages);
>  
>  struct xenbus_ring_ops {
> -	int (*map)(struct xenbus_device *dev, int gnt, void **vaddr);
> +	int (*map)(struct xenbus_device *dev, int *gnt, int nr_gnts,
> +		   void **vaddr);
>  	int (*unmap)(struct xenbus_device *dev, void *vaddr);
>  };
>  
> @@ -357,17 +359,39 @@ static void xenbus_switch_fatal(struct xenbus_device *dev, int depth, int err,
>  /**
>   * xenbus_grant_ring
>   * @dev: xenbus device
> - * @ring_mfn: mfn of ring to grant
> -
> - * Grant access to the given @ring_mfn to the peer of the given device.  Return
> - * 0 on success, or -errno on error.  On error, the device will switch to
> + * @vaddr: starting virtual address of the ring
> + * @nr_pages: number of pages to be granted
> + * @grefs: grant reference array to be filled in
> + *
> + * Grant access to the given @vaddr to the peer of the given device.
> + * Then fill in @grefs with grant references.  Return 0 on success, or
> + * -errno on error.  On error, the device will switch to
>   * XenbusStateClosing, and the error will be saved in the store.
>   */
> -int xenbus_grant_ring(struct xenbus_device *dev, unsigned long ring_mfn)
> +int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
> +		      int nr_pages, int *grefs)
>  {
> -	int err = gnttab_grant_foreign_access(dev->otherend_id, ring_mfn, 0);
> -	if (err < 0)
> -		xenbus_dev_fatal(dev, err, "granting access to ring page");
> +	int i;
> +	int err;
> +
> +	for (i = 0; i < nr_pages; i++) {
> +		unsigned long addr = (unsigned long)vaddr +
> +			(PAGE_SIZE * i);
> +		err = gnttab_grant_foreign_access(dev->otherend_id,
> +						  virt_to_mfn(addr), 0);
> +		if (err < 0) {
> +			xenbus_dev_fatal(dev, err,
> +					 "granting access to ring page");
> +			goto fail;
> +		}
> +		grefs[i] = err;
> +	}
> +
> +	return 0;
> +
> +fail:
> +	for ( ; i >= 0; i--)
> +		gnttab_end_foreign_access_ref(grefs[i], 0);
>  	return err;
>  }
>  EXPORT_SYMBOL_GPL(xenbus_grant_ring);
> @@ -448,7 +472,8 @@ EXPORT_SYMBOL_GPL(xenbus_free_evtchn);
>  /**
>   * xenbus_map_ring_valloc
>   * @dev: xenbus device
> - * @gnt_ref: grant reference
> + * @gnt_ref: grant reference array
> + * @nr_grefs: number of grant references
>   * @vaddr: pointer to address to be filled out by mapping
>   *
>   * Based on Rusty Russell's skeleton driver's map_page.
> @@ -459,51 +484,61 @@ EXPORT_SYMBOL_GPL(xenbus_free_evtchn);
>   * or -ENOMEM on error. If an error is returned, device will switch to
>   * XenbusStateClosing and the error message will be saved in XenStore.
>   */
> -int xenbus_map_ring_valloc(struct xenbus_device *dev, int gnt_ref, void **vaddr)
> +int xenbus_map_ring_valloc(struct xenbus_device *dev, int *gnt_ref,
> +			   int nr_grefs, void **vaddr)
>  {
> -	return ring_ops->map(dev, gnt_ref, vaddr);
> +	return ring_ops->map(dev, gnt_ref, nr_grefs, vaddr);
>  }
>  EXPORT_SYMBOL_GPL(xenbus_map_ring_valloc);
>  
>  static int xenbus_map_ring_valloc_pv(struct xenbus_device *dev,
> -				     int gnt_ref, void **vaddr)
> +				     int *gnt_ref, int nr_grefs, void **vaddr)
>  {
> -	struct gnttab_map_grant_ref op = {
> -		.flags = GNTMAP_host_map | GNTMAP_contains_pte,
> -		.ref   = gnt_ref,
> -		.dom   = dev->otherend_id,
> -	};
> +	struct gnttab_map_grant_ref op;
>  	struct xenbus_map_node *node;
>  	struct vm_struct *area;
> -	pte_t *pte;
> +	pte_t *pte[XENBUS_MAX_RING_PAGES];
> +	int i;
> +	int err = GNTST_okay;
> +	int vma_leaked; /* used in rollback */

bool
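
i.e.:

	bool vma_leaked = false;	/* set when rollback fails */

with 'vma_leaked = true;' at the failure sites, rather than 0/1 in an int.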

>  
>  	*vaddr = NULL;
>  
> +	if (nr_grefs > XENBUS_MAX_RING_PAGES)
> +		return -EINVAL;
> +
>  	node = kzalloc(sizeof(*node), GFP_KERNEL);
>  	if (!node)
>  		return -ENOMEM;
>  
> -	area = alloc_vm_area(PAGE_SIZE, &pte);
> +	area = alloc_vm_area(PAGE_SIZE * nr_grefs, pte);
>  	if (!area) {
>  		kfree(node);
>  		return -ENOMEM;
>  	}
>  
> -	op.host_addr = arbitrary_virt_to_machine(pte).maddr;
> -
> -	if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
> -		BUG();
> -
> -	if (op.status != GNTST_okay) {
> -		free_vm_area(area);
> -		kfree(node);
> -		xenbus_dev_fatal(dev, op.status,
> -				 "mapping in shared page %d from domain %d",
> -				 gnt_ref, dev->otherend_id);
> -		return op.status;
> +	/* Issue hypercall for individual entry, rollback if error occurs. */
> +	for (i = 0; i < nr_grefs; i++) {
> +		op.flags = GNTMAP_host_map | GNTMAP_contains_pte;
> +		op.ref   = gnt_ref[i];
> +		op.dom   = dev->otherend_id;
> +		op.host_addr = arbitrary_virt_to_machine(pte[i]).maddr;
> +
> +		if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
> +			BUG();
> +
> +		if (op.status != GNTST_okay) {
> +			err = op.status;
> +			xenbus_dev_fatal(dev, op.status,
> +				 "mapping in shared page (%d/%d) %d from domain %d",
> +				 i+1, nr_grefs, gnt_ref[i], dev->otherend_id);
> +			node->handle[i] = INVALID_GRANT_HANDLE;
> +			goto rollback;
> +		} else
> +			node->handle[i] = op.handle;
>  	}
>  
> -	node->handle = op.handle;
> +	node->nr_handles = nr_grefs;
>  	node->area = area;
>  
>  	spin_lock(&xenbus_valloc_lock);
> @@ -512,31 +547,73 @@ static int xenbus_map_ring_valloc_pv(struct xenbus_device *dev,
>  
>  	*vaddr = area->addr;
>  	return 0;
> +
> +rollback:
> +	vma_leaked = 0;
> +	for ( ; i >= 0; i--) {
> +		if (node->handle[i] != INVALID_GRANT_HANDLE) {
> +			struct gnttab_unmap_grant_ref unmap_op;
> +			unmap_op.dev_bus_addr = 0;
> +			unmap_op.host_addr =
> +				arbitrary_virt_to_machine(pte[i]).maddr;
> +			unmap_op.handle = node->handle[i];
> +
> +			if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
> +						      &unmap_op, 1))
> +				BUG();
> +
> +			if (unmap_op.status != GNTST_okay) {
> +				pr_alert("rollback mapping (%d/%d) %d from domain %d, err = %d",
> +					 i+1, nr_grefs, gnt_ref[i],
> +					 dev->otherend_id,
> +					 unmap_op.status);
> +				vma_leaked = 1;
> +			}
> +			node->handle[i] = INVALID_GRANT_HANDLE;
> +		}
> +	}
> +
> +	if (!vma_leaked)
> +		free_vm_area(area);
> +	else
> +		pr_alert("leaking vm area %p size %d page(s)", area, nr_grefs);
> +
> +	kfree(node);
> +
> +	return err;
>  }
>  
>  static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
> -				      int gnt_ref, void **vaddr)
> +				      int *gnt_ref, int nr_grefs, void **vaddr)
>  {
>  	struct xenbus_map_node *node;
>  	int err;
>  	void *addr;
> +	int vma_leaked;
>  
>  	*vaddr = NULL;
>  
> +	if (nr_grefs > XENBUS_MAX_RING_PAGES)
> +		return -EINVAL;
> +
>  	node = kzalloc(sizeof(*node), GFP_KERNEL);
>  	if (!node)
>  		return -ENOMEM;
>  
> -	err = alloc_xenballooned_pages(1, &node->page, false /* lowmem */);
> +	err = alloc_xenballooned_pages(nr_grefs, &node->page,
> +				       false /* lowmem */);
>  	if (err)
>  		goto out_err;
>  
>  	addr = pfn_to_kaddr(page_to_pfn(node->page));
>  
> -	err = xenbus_map_ring(dev, gnt_ref, &node->handle, addr);
> +	err = xenbus_map_ring(dev, gnt_ref, nr_grefs, node->handle,
> +			      addr, &vma_leaked);
>  	if (err)
>  		goto out_err;
>  
> +	node->nr_handles = nr_grefs;
> +
>  	spin_lock(&xenbus_valloc_lock);
>  	list_add(&node->next, &xenbus_valloc_pages);
>  	spin_unlock(&xenbus_valloc_lock);
> @@ -545,7 +622,8 @@ static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
>  	return 0;
>  
>   out_err:
> -	free_xenballooned_pages(1, &node->page);
> +	if (!vma_leaked)
> +		free_xenballooned_pages(nr_grefs, &node->page);
>  	kfree(node);
>  	return err;
>  }
> @@ -554,36 +632,75 @@ static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
>  /**
>   * xenbus_map_ring
>   * @dev: xenbus device
> - * @gnt_ref: grant reference
> + * @gnt_ref: grant reference array
> + * @nr_grefs: number of grant reference
>   * @handle: pointer to grant handle to be filled
>   * @vaddr: address to be mapped to
> + * @vma_leaked: cannot clean up a failed mapping, vma leaked
>   *
> - * Map a page of memory into this domain from another domain's grant table.
> + * Map pages of memory into this domain from another domain's grant table.
>   * xenbus_map_ring does not allocate the virtual address space (you must do
> - * this yourself!). It only maps in the page to the specified address.
> + * this yourself!). It only maps in the pages to the specified address.
>   * Returns 0 on success, and GNTST_* (see xen/include/interface/grant_table.h)
>   * or -ENOMEM on error. If an error is returned, device will switch to
> - * XenbusStateClosing and the error message will be saved in XenStore.
> + * XenbusStateClosing and the last error message will be saved in XenStore.
>   */
> -int xenbus_map_ring(struct xenbus_device *dev, int gnt_ref,
> -		    grant_handle_t *handle, void *vaddr)
> +int xenbus_map_ring(struct xenbus_device *dev, int *gnt_ref, int nr_grefs,
> +		    grant_handle_t *handle, void *vaddr, int *vma_leaked)
>  {
>  	struct gnttab_map_grant_ref op;
> +	int i;
> +	int err = GNTST_okay;
> +
> +	for (i = 0; i < nr_grefs; i++) {
> +		unsigned long addr = (unsigned long)vaddr +
> +			(PAGE_SIZE * i);
> +		gnttab_set_map_op(&op, (unsigned long)addr,
> +				  GNTMAP_host_map, gnt_ref[i],
> +				  dev->otherend_id);
> +
> +		if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref,
> +					      &op, 1))
> +			BUG();
> +
> +		if (op.status != GNTST_okay) {
> +			xenbus_dev_fatal(dev, op.status,
> +				 "mapping in shared page (%d/%d) %d from domain %d",
> +				 i+1, nr_grefs, gnt_ref[i], dev->otherend_id);
> +			handle[i] = INVALID_GRANT_HANDLE;
> +			goto rollback;
> +		} else
> +			handle[i] = op.handle;
> +	}
>  
> -	gnttab_set_map_op(&op, (unsigned long)vaddr, GNTMAP_host_map, gnt_ref,
> -			  dev->otherend_id);
> -
> -	if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
> -		BUG();
> +	return 0;
>  
> -	if (op.status != GNTST_okay) {
> -		xenbus_dev_fatal(dev, op.status,
> -				 "mapping in shared page %d from domain %d",
> -				 gnt_ref, dev->otherend_id);
> -	} else
> -		*handle = op.handle;
> +rollback:
> +	*vma_leaked = 0;
> +	for ( ; i >= 0; i--) {
> +		if (handle[i] != INVALID_GRANT_HANDLE) {
> +			struct gnttab_unmap_grant_ref unmap_op;
> +			unsigned long addr = (unsigned long)vaddr +
> +				(PAGE_SIZE * i);
> +			gnttab_set_unmap_op(&unmap_op, (phys_addr_t)addr,
> +					    GNTMAP_host_map, handle[i]);
> +
> +			if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
> +						      &unmap_op, 1))
> +				BUG();
> +
> +			if (unmap_op.status != GNTST_okay) {
> +				pr_alert("rollback mapping (%d/%d) %d from domain %d, err = %d",
> +					 i+1, nr_grefs, gnt_ref[i],
> +					 dev->otherend_id,
> +					 unmap_op.status);
> +				*vma_leaked = 1;
> +			}
> +			handle[i] = INVALID_GRANT_HANDLE;
> +		}
> +	}
>  
> -	return op.status;
> +	return err;
>  }
>  EXPORT_SYMBOL_GPL(xenbus_map_ring);
>  
> @@ -609,10 +726,11 @@ EXPORT_SYMBOL_GPL(xenbus_unmap_ring_vfree);
>  static int xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, void *vaddr)
>  {
>  	struct xenbus_map_node *node;
> -	struct gnttab_unmap_grant_ref op = {
> -		.host_addr = (unsigned long)vaddr,
> -	};
> +	struct gnttab_unmap_grant_ref op[XENBUS_MAX_RING_PAGES];
>  	unsigned int level;
> +	int i;
> +	int last_error = GNTST_okay;
> +	int vma_leaked;

bool
>  
>  	spin_lock(&xenbus_valloc_lock);
>  	list_for_each_entry(node, &xenbus_valloc_pages, next) {
> @@ -631,22 +749,39 @@ static int xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, void *vaddr)
>  		return GNTST_bad_virt_addr;
>  	}
>  
> -	op.handle = node->handle;
> -	op.host_addr = arbitrary_virt_to_machine(
> -		lookup_address((unsigned long)vaddr, &level)).maddr;
> +	for (i = 0; i < node->nr_handles; i++) {
> +		unsigned long addr = (unsigned long)vaddr +
> +			(PAGE_SIZE * i);
> +		op[i].dev_bus_addr = 0;
> +		op[i].handle = node->handle[i];
> +		op[i].host_addr = arbitrary_virt_to_machine(
> +			lookup_address((unsigned long)addr, &level)).maddr;
> +	}
>  
> -	if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, &op, 1))
> +	if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, op,
> +				      node->nr_handles))
>  		BUG();
>  
> -	if (op.status == GNTST_okay)
> +	vma_leaked = 0;
> +	for (i = 0; i < node->nr_handles; i++) {
> +		if (op[i].status != GNTST_okay) {
> +			last_error = op[i].status;
> +			vma_leaked = 1;
> +			xenbus_dev_error(dev, op[i].status,
> +				 "unmapping page (%d/%d) at handle %d error %d",
> +				 i+1, node->nr_handles, node->handle[i],
> +				 op[i].status);
> +		}
> +	}
> +
> +	if (!vma_leaked)
>  		free_vm_area(node->area);
>  	else
> -		xenbus_dev_error(dev, op.status,
> -				 "unmapping page at handle %d error %d",
> -				 node->handle, op.status);
> +		pr_alert("leaking vm area %p size %d page(s)",
> +			 node->area, node->nr_handles);
>  
>  	kfree(node);
> -	return op.status;
> +	return last_error;
>  }
>  
>  static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
> @@ -673,10 +808,10 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
>  		return GNTST_bad_virt_addr;
>  	}
>  
> -	rv = xenbus_unmap_ring(dev, node->handle, addr);
> +	rv = xenbus_unmap_ring(dev, node->handle, node->nr_handles, addr);
>  
>  	if (!rv)
> -		free_xenballooned_pages(1, &node->page);
> +		free_xenballooned_pages(node->nr_handles, &node->page);
>  	else
>  		WARN(1, "Leaking %p\n", vaddr);
>  
> @@ -687,7 +822,8 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
>  /**
>   * xenbus_unmap_ring
>   * @dev: xenbus device
> - * @handle: grant handle
> + * @handle: grant handle array
> + * @nr_handles: number of grant handles
>   * @vaddr: addr to unmap
>   *
>   * Unmap a page of memory in this domain that was imported from another domain.
> @@ -695,21 +831,33 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
>   * (see xen/include/interface/grant_table.h).
>   */
>  int xenbus_unmap_ring(struct xenbus_device *dev,
> -		      grant_handle_t handle, void *vaddr)
> +		      grant_handle_t *handle, int nr_handles,
> +		      void *vaddr)
>  {
>  	struct gnttab_unmap_grant_ref op;
> +	int last_error = GNTST_okay;
> +	int i;
> +
> +	for (i = 0; i < nr_handles; i++) {
> +		unsigned long addr = (unsigned long)vaddr +
> +			(PAGE_SIZE * i);
> +		gnttab_set_unmap_op(&op, (unsigned long)addr,
> +				    GNTMAP_host_map, handle[i]);
> +		handle[i] = INVALID_GRANT_HANDLE;
> +
> +		if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
> +					      &op, 1))
> +			BUG();
> +
> +		if (op.status != GNTST_okay) {
> +			xenbus_dev_error(dev, op.status,
> +				 "unmapping page (%d/%d) at handle %d error %d",
> +				 i+1, nr_handles, handle[i], op.status);
> +			last_error = op.status;
> +		}
> +	}
>  
> -	gnttab_set_unmap_op(&op, (unsigned long)vaddr, GNTMAP_host_map, handle);
> -
> -	if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, &op, 1))
> -		BUG();
> -
> -	if (op.status != GNTST_okay)
> -		xenbus_dev_error(dev, op.status,
> -				 "unmapping page at handle %d error %d",
> -				 handle, op.status);
> -
> -	return op.status;
> +	return last_error;
>  }
>  EXPORT_SYMBOL_GPL(xenbus_unmap_ring);
>  
> diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
> index 0a7515c..b7d9613 100644
> --- a/include/xen/xenbus.h
> +++ b/include/xen/xenbus.h
> @@ -46,6 +46,11 @@
>  #include <xen/interface/io/xenbus.h>
>  #include <xen/interface/io/xs_wire.h>
>  
> +/* Max pages supported by multi-page ring in the backend */
> +#define XENBUS_MAX_RING_PAGE_ORDER  2
> +#define XENBUS_MAX_RING_PAGES       (1U << XENBUS_MAX_RING_PAGE_ORDER)

So why this value? If this is a mechanical change, shouldn't the order be '0'?
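
(As posted, order 2 means 1U << 2 == 4 pages, i.e. up to 16 KiB of shared
ring with 4 KiB pages.)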

> +#define INVALID_GRANT_HANDLE        (~0U)
> +
>  /* Register callback to watch this node. */
>  struct xenbus_watch
>  {
> @@ -195,15 +200,17 @@ int xenbus_watch_pathfmt(struct xenbus_device *dev, struct xenbus_watch *watch,
>  			 const char *pathfmt, ...);
>  
>  int xenbus_switch_state(struct xenbus_device *dev, enum xenbus_state new_state);
> -int xenbus_grant_ring(struct xenbus_device *dev, unsigned long ring_mfn);
> +int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
> +		      int nr_pages, int *grefs);
>  int xenbus_map_ring_valloc(struct xenbus_device *dev,
> -			   int gnt_ref, void **vaddr);
> -int xenbus_map_ring(struct xenbus_device *dev, int gnt_ref,
> -			   grant_handle_t *handle, void *vaddr);
> +			   int *gnt_ref, int nr_grefs, void **vaddr);
> +int xenbus_map_ring(struct xenbus_device *dev, int *gnt_ref, int nr_grefs,
> +		    grant_handle_t *handle, void *vaddr, int *vma_leaked);
>  
>  int xenbus_unmap_ring_vfree(struct xenbus_device *dev, void *vaddr);
>  int xenbus_unmap_ring(struct xenbus_device *dev,
> -		      grant_handle_t handle, void *vaddr);
> +		      grant_handle_t *handle, int nr_handles,
> +		      void *vaddr);
>  
>  int xenbus_alloc_evtchn(struct xenbus_device *dev, int *port);
>  int xenbus_bind_evtchn(struct xenbus_device *dev, int remote_port, int *port);
> -- 
> 1.7.10.4
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring
  2013-02-15 16:00 ` [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring Wei Liu
  2013-02-15 16:17   ` Jan Beulich
  2013-02-15 16:17   ` [Xen-devel] " Jan Beulich
@ 2013-03-04 21:12   ` Konrad Rzeszutek Wilk
  2013-03-04 21:12   ` Konrad Rzeszutek Wilk
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-04 21:12 UTC (permalink / raw)
  To: Wei Liu
  Cc: ian.campbell, Stefano Stabellini, netdev, xen-devel, annie.li,
	Roger Pau Monne

On Fri, Feb 15, 2013 at 04:00:05PM +0000, Wei Liu wrote:
> Also bundle fixes for xen frontends and backends in this patch.

Please explain what this does. I have a fairly good idea
of what it does, but folks who later find this git commit
in the source history might not understand how it is
supposed to be used, or why it is better to have more than
order-0 (single-page) rings.



> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Roger Pau Monne <roger.pau@citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Cc: Mukesh Rathor <mukesh.rathor@oracle.com>
> ---
>  drivers/block/xen-blkback/xenbus.c |   14 +-
>  drivers/block/xen-blkfront.c       |    6 +-
>  drivers/net/xen-netback/netback.c  |    4 +-
>  drivers/net/xen-netfront.c         |    9 +-
>  drivers/pci/xen-pcifront.c         |    5 +-
>  drivers/xen/xen-pciback/xenbus.c   |   10 +-
>  drivers/xen/xenbus/xenbus_client.c |  314 ++++++++++++++++++++++++++----------
>  include/xen/xenbus.h               |   17 +-
>  8 files changed, 270 insertions(+), 109 deletions(-)
> 
> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> index 6398072..384ff24 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -122,7 +122,8 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid)
>  	return blkif;
>  }
>  
> -static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page,
> +static int xen_blkif_map(struct xen_blkif *blkif, int *shared_pages,
> +			 int nr_pages,
>  			 unsigned int evtchn)
>  {
>  	int err;
> @@ -131,7 +132,8 @@ static int xen_blkif_map(struct xen_blkif *blkif, unsigned long shared_page,
>  	if (blkif->irq)
>  		return 0;
>  
> -	err = xenbus_map_ring_valloc(blkif->be->dev, shared_page, &blkif->blk_ring);
> +	err = xenbus_map_ring_valloc(blkif->be->dev, shared_pages,
> +				     nr_pages, &blkif->blk_ring);
>  	if (err < 0)
>  		return err;
>  
> @@ -726,7 +728,7 @@ again:
>  static int connect_ring(struct backend_info *be)
>  {
>  	struct xenbus_device *dev = be->dev;
> -	unsigned long ring_ref;
> +	int ring_ref;
>  	unsigned int evtchn;
>  	unsigned int pers_grants;
>  	char protocol[64] = "";
> @@ -767,14 +769,14 @@ static int connect_ring(struct backend_info *be)
>  	be->blkif->vbd.feature_gnt_persistent = pers_grants;
>  	be->blkif->vbd.overflow_max_grants = 0;
>  
> -	pr_info(DRV_PFX "ring-ref %ld, event-channel %d, protocol %d (%s) %s\n",
> +	pr_info(DRV_PFX "ring-ref %d, event-channel %d, protocol %d (%s) %s\n",
>  		ring_ref, evtchn, be->blkif->blk_protocol, protocol,
>  		pers_grants ? "persistent grants" : "");
>  
>  	/* Map the shared frame, irq etc. */
> -	err = xen_blkif_map(be->blkif, ring_ref, evtchn);
> +	err = xen_blkif_map(be->blkif, &ring_ref, 1, evtchn);
>  	if (err) {
> -		xenbus_dev_fatal(dev, err, "mapping ring-ref %lu port %u",
> +		xenbus_dev_fatal(dev, err, "mapping ring-ref %u port %u",
>  				 ring_ref, evtchn);
>  		return err;
>  	}
> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> index 96e9b00..12c9ebd 100644
> --- a/drivers/block/xen-blkfront.c
> +++ b/drivers/block/xen-blkfront.c
> @@ -991,6 +991,7 @@ static int setup_blkring(struct xenbus_device *dev,
>  {
>  	struct blkif_sring *sring;
>  	int err;
> +	int grefs[1];

Perhaps all of those '1's should have a #define.

Say 'XEN_DEFAULT_RING_PAGE_SIZE'?

>  
>  	info->ring_ref = GRANT_INVALID_REF;
>  
> @@ -1004,13 +1005,14 @@ static int setup_blkring(struct xenbus_device *dev,
>  
>  	sg_init_table(info->sg, BLKIF_MAX_SEGMENTS_PER_REQUEST);
>  
> -	err = xenbus_grant_ring(dev, virt_to_mfn(info->ring.sring));
> +	err = xenbus_grant_ring(dev, info->ring.sring,
> +				1, grefs);
>  	if (err < 0) {
>  		free_page((unsigned long)sring);
>  		info->ring.sring = NULL;
>  		goto fail;
>  	}
> -	info->ring_ref = err;
> +	info->ring_ref = grefs[0];
>  
>  	err = xenbus_alloc_evtchn(dev, &info->evtchn);
>  	if (err)
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index de59098..98ccea9 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -1665,7 +1665,7 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
>  	int err = -ENOMEM;
>  
>  	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
> -				     tx_ring_ref, &addr);
> +				     &tx_ring_ref, 1, &addr);
>  	if (err)
>  		goto err;
>  
> @@ -1673,7 +1673,7 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
>  	BACK_RING_INIT(&vif->tx, txs, PAGE_SIZE);
>  
>  	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
> -				     rx_ring_ref, &addr);
> +				     &rx_ring_ref, 1, &addr);
>  	if (err)
>  		goto err;
>  
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 7ffa43b..8bd75a1 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -1501,6 +1501,7 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
>  	struct xen_netif_tx_sring *txs;
>  	struct xen_netif_rx_sring *rxs;
>  	int err;
> +	int grefs[1];
>  	struct net_device *netdev = info->netdev;
>  
>  	info->tx_ring_ref = GRANT_INVALID_REF;
> @@ -1524,13 +1525,13 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
>  	SHARED_RING_INIT(txs);
>  	FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE);
>  
> -	err = xenbus_grant_ring(dev, virt_to_mfn(txs));
> +	err = xenbus_grant_ring(dev, txs, 1, grefs);
>  	if (err < 0) {
>  		free_page((unsigned long)txs);
>  		goto fail;
>  	}
>  
> -	info->tx_ring_ref = err;
> +	info->tx_ring_ref = grefs[0];
>  	rxs = (struct xen_netif_rx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH);
>  	if (!rxs) {
>  		err = -ENOMEM;
> @@ -1540,12 +1541,12 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
>  	SHARED_RING_INIT(rxs);
>  	FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE);
>  
> -	err = xenbus_grant_ring(dev, virt_to_mfn(rxs));
> +	err = xenbus_grant_ring(dev, rxs, 1, grefs);
>  	if (err < 0) {
>  		free_page((unsigned long)rxs);
>  		goto fail;
>  	}
> -	info->rx_ring_ref = err;
> +	info->rx_ring_ref = grefs[0];
>  
>  	err = xenbus_alloc_evtchn(dev, &info->evtchn);
>  	if (err)
> diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
> index 966abc6..016a2bb 100644
> --- a/drivers/pci/xen-pcifront.c
> +++ b/drivers/pci/xen-pcifront.c
> @@ -772,12 +772,13 @@ static int pcifront_publish_info(struct pcifront_device *pdev)
>  {
>  	int err = 0;
>  	struct xenbus_transaction trans;
> +	int grefs[1];
>  
> -	err = xenbus_grant_ring(pdev->xdev, virt_to_mfn(pdev->sh_info));
> +	err = xenbus_grant_ring(pdev->xdev, pdev->sh_info, 1, grefs);
>  	if (err < 0)
>  		goto out;
>  
> -	pdev->gnt_ref = err;
> +	pdev->gnt_ref = grefs[0];
>  
>  	err = xenbus_alloc_evtchn(pdev->xdev, &pdev->evtchn);
>  	if (err)
> diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
> index 64b11f9..4655851 100644
> --- a/drivers/xen/xen-pciback/xenbus.c
> +++ b/drivers/xen/xen-pciback/xenbus.c
> @@ -98,17 +98,17 @@ static void free_pdev(struct xen_pcibk_device *pdev)
>  	kfree(pdev);
>  }
>  
> -static int xen_pcibk_do_attach(struct xen_pcibk_device *pdev, int gnt_ref,
> -			     int remote_evtchn)
> +static int xen_pcibk_do_attach(struct xen_pcibk_device *pdev, int *gnt_ref,
> +			       int nr_grefs, int remote_evtchn)
>  {
>  	int err = 0;
>  	void *vaddr;
>  
>  	dev_dbg(&pdev->xdev->dev,
>  		"Attaching to frontend resources - gnt_ref=%d evtchn=%d\n",
> -		gnt_ref, remote_evtchn);
> +		gnt_ref[0], remote_evtchn);
>  
> -	err = xenbus_map_ring_valloc(pdev->xdev, gnt_ref, &vaddr);
> +	err = xenbus_map_ring_valloc(pdev->xdev, gnt_ref, nr_grefs, &vaddr);
>  	if (err < 0) {
>  		xenbus_dev_fatal(pdev->xdev, err,
>  				"Error mapping other domain page in ours.");
> @@ -172,7 +172,7 @@ static int xen_pcibk_attach(struct xen_pcibk_device *pdev)
>  		goto out;
>  	}
>  
> -	err = xen_pcibk_do_attach(pdev, gnt_ref, remote_evtchn);
> +	err = xen_pcibk_do_attach(pdev, &gnt_ref, 1, remote_evtchn);
>  	if (err)
>  		goto out;
>  
> diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
> index 1bac743..7c1bd49 100644
> --- a/drivers/xen/xenbus/xenbus_client.c
> +++ b/drivers/xen/xenbus/xenbus_client.c
> @@ -54,14 +54,16 @@ struct xenbus_map_node {
>  		struct vm_struct *area; /* PV */
>  		struct page *page;     /* HVM */
>  	};
> -	grant_handle_t handle;
> +	grant_handle_t handle[XENBUS_MAX_RING_PAGES];
> +	unsigned int   nr_handles;
>  };
>  
>  static DEFINE_SPINLOCK(xenbus_valloc_lock);
>  static LIST_HEAD(xenbus_valloc_pages);
>  
>  struct xenbus_ring_ops {
> -	int (*map)(struct xenbus_device *dev, int gnt, void **vaddr);
> +	int (*map)(struct xenbus_device *dev, int *gnt, int nr_gnts,
> +		   void **vaddr);
>  	int (*unmap)(struct xenbus_device *dev, void *vaddr);
>  };
>  
> @@ -357,17 +359,39 @@ static void xenbus_switch_fatal(struct xenbus_device *dev, int depth, int err,
>  /**
>   * xenbus_grant_ring
>   * @dev: xenbus device
> - * @ring_mfn: mfn of ring to grant
> -
> - * Grant access to the given @ring_mfn to the peer of the given device.  Return
> - * 0 on success, or -errno on error.  On error, the device will switch to
> + * @vaddr: starting virtual address of the ring
> + * @nr_pages: number of pages to be granted
> + * @grefs: grant reference array to be filled in
> + *
> + * Grant access to the given @vaddr to the peer of the given device.
> + * Then fill in @grefs with grant references.  Return 0 on success, or
> + * -errno on error.  On error, the device will switch to
>   * XenbusStateClosing, and the error will be saved in the store.
>   */
> -int xenbus_grant_ring(struct xenbus_device *dev, unsigned long ring_mfn)
> +int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
> +		      int nr_pages, int *grefs)
>  {
> -	int err = gnttab_grant_foreign_access(dev->otherend_id, ring_mfn, 0);
> -	if (err < 0)
> -		xenbus_dev_fatal(dev, err, "granting access to ring page");
> +	int i;
> +	int err;
> +
> +	for (i = 0; i < nr_pages; i++) {
> +		unsigned long addr = (unsigned long)vaddr +
> +			(PAGE_SIZE * i);
> +		err = gnttab_grant_foreign_access(dev->otherend_id,
> +						  virt_to_mfn(addr), 0);
> +		if (err < 0) {
> +			xenbus_dev_fatal(dev, err,
> +					 "granting access to ring page");
> +			goto fail;
> +		}
> +		grefs[i] = err;
> +	}
> +
> +	return 0;
> +
> +fail:
> +	for (i--; i >= 0; i--)
> +		gnttab_end_foreign_access_ref(grefs[i], 0);
>  	return err;
>  }
>  EXPORT_SYMBOL_GPL(xenbus_grant_ring);
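
A quick illustration for the other frontends that will need converting --
with the new signature, sharing a two-page ring looks roughly like this
(untested sketch, 'dev' being the struct xenbus_device *):

	int grefs[2];
	void *sring = (void *)__get_free_pages(__GFP_ZERO | GFP_NOIO |
					       __GFP_HIGH, 1);
	int err;

	if (!sring)
		return -ENOMEM;

	err = xenbus_grant_ring(dev, sring, 2, grefs);
	if (err < 0) {
		free_pages((unsigned long)sring, 1);
		return err;
	}
	/* grefs[0] and grefs[1] can now be written to xenstore. */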
> @@ -448,7 +472,8 @@ EXPORT_SYMBOL_GPL(xenbus_free_evtchn);
>  /**
>   * xenbus_map_ring_valloc
>   * @dev: xenbus device
> - * @gnt_ref: grant reference
> + * @gnt_ref: grant reference array
> + * @nr_grefs: number of grant references
>   * @vaddr: pointer to address to be filled out by mapping
>   *
>   * Based on Rusty Russell's skeleton driver's map_page.
> @@ -459,51 +484,61 @@ EXPORT_SYMBOL_GPL(xenbus_free_evtchn);
>   * or -ENOMEM on error. If an error is returned, device will switch to
>   * XenbusStateClosing and the error message will be saved in XenStore.
>   */
> -int xenbus_map_ring_valloc(struct xenbus_device *dev, int gnt_ref, void **vaddr)
> +int xenbus_map_ring_valloc(struct xenbus_device *dev, int *gnt_ref,
> +			   int nr_grefs, void **vaddr)
>  {
> -	return ring_ops->map(dev, gnt_ref, vaddr);
> +	return ring_ops->map(dev, gnt_ref, nr_grefs, vaddr);
>  }
>  EXPORT_SYMBOL_GPL(xenbus_map_ring_valloc);
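
And the matching backend side, for reference (sketch only; the second
argument must hold as many references as the frontend granted):

	int ring_ref[2];	/* previously read from xenstore */
	void *addr;
	int err;

	err = xenbus_map_ring_valloc(dev, ring_ref, 2, &addr);
	if (err < 0)
		return err;
	/* addr now covers 2 * PAGE_SIZE of mapped ring; tear it down
	 * later with xenbus_unmap_ring_vfree(dev, addr). */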
>  
>  static int xenbus_map_ring_valloc_pv(struct xenbus_device *dev,
> -				     int gnt_ref, void **vaddr)
> +				     int *gnt_ref, int nr_grefs, void **vaddr)
>  {
> -	struct gnttab_map_grant_ref op = {
> -		.flags = GNTMAP_host_map | GNTMAP_contains_pte,
> -		.ref   = gnt_ref,
> -		.dom   = dev->otherend_id,
> -	};
> +	struct gnttab_map_grant_ref op;
>  	struct xenbus_map_node *node;
>  	struct vm_struct *area;
> -	pte_t *pte;
> +	pte_t *pte[XENBUS_MAX_RING_PAGES];
> +	int i;
> +	int err = GNTST_okay;
> +	int vma_leaked; /* used in rollback */

bool

>  
>  	*vaddr = NULL;
>  
> +	if (nr_grefs > XENBUS_MAX_RING_PAGES)
> +		return -EINVAL;
> +
>  	node = kzalloc(sizeof(*node), GFP_KERNEL);
>  	if (!node)
>  		return -ENOMEM;
>  
> -	area = alloc_vm_area(PAGE_SIZE, &pte);
> +	area = alloc_vm_area(PAGE_SIZE * nr_grefs, pte);
>  	if (!area) {
>  		kfree(node);
>  		return -ENOMEM;
>  	}
>  
> -	op.host_addr = arbitrary_virt_to_machine(pte).maddr;
> -
> -	if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
> -		BUG();
> -
> -	if (op.status != GNTST_okay) {
> -		free_vm_area(area);
> -		kfree(node);
> -		xenbus_dev_fatal(dev, op.status,
> -				 "mapping in shared page %d from domain %d",
> -				 gnt_ref, dev->otherend_id);
> -		return op.status;
> +	/* Issue hypercall for individual entry, rollback if error occurs. */
> +	for (i = 0; i < nr_grefs; i++) {
> +		op.flags = GNTMAP_host_map | GNTMAP_contains_pte;
> +		op.ref   = gnt_ref[i];
> +		op.dom   = dev->otherend_id;
> +		op.host_addr = arbitrary_virt_to_machine(pte[i]).maddr;
> +
> +		if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
> +			BUG();
> +
> +		if (op.status != GNTST_okay) {
> +			err = op.status;
> +			xenbus_dev_fatal(dev, op.status,
> +				 "mapping in shared page (%d/%d) %d from domain %d",
> +				 i+1, nr_grefs, gnt_ref[i], dev->otherend_id);
> +			node->handle[i] = INVALID_GRANT_HANDLE;
> +			goto rollback;
> +		} else
> +			node->handle[i] = op.handle;
>  	}
>  
> -	node->handle = op.handle;
> +	node->nr_handles = nr_grefs;
>  	node->area = area;
>  
>  	spin_lock(&xenbus_valloc_lock);
> @@ -512,31 +547,73 @@ static int xenbus_map_ring_valloc_pv(struct xenbus_device *dev,
>  
>  	*vaddr = area->addr;
>  	return 0;
> +
> +rollback:
> +	vma_leaked = 0;
> +	for ( ; i >= 0; i--) {
> +		if (node->handle[i] != INVALID_GRANT_HANDLE) {
> +			struct gnttab_unmap_grant_ref unmap_op;
> +			unmap_op.dev_bus_addr = 0;
> +			unmap_op.host_addr =
> +				arbitrary_virt_to_machine(pte[i]).maddr;
> +			unmap_op.handle = node->handle[i];
> +
> +			if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
> +						      &unmap_op, 1))
> +				BUG();
> +
> +			if (unmap_op.status != GNTST_okay) {
> +				pr_alert("rollback mapping (%d/%d) %d from domain %d, err = %d\n",
> +					 i+1, nr_grefs, gnt_ref[i],
> +					 dev->otherend_id,
> +					 unmap_op.status);
> +				vma_leaked = 1;
> +			}
> +			node->handle[i] = INVALID_GRANT_HANDLE;
> +		}
> +	}
> +
> +	if (!vma_leaked)
> +		free_vm_area(area);
> +	else
> +		pr_alert("leaking vm area %p size %d page(s)\n", area, nr_grefs);
> +
> +	kfree(node);
> +
> +	return err;
>  }
>  
>  static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
> -				      int gnt_ref, void **vaddr)
> +				      int *gnt_ref, int nr_grefs, void **vaddr)
>  {
>  	struct xenbus_map_node *node;
>  	int err;
>  	void *addr;
> +	int vma_leaked = 0; /* may be set by xenbus_map_ring() */
>  
>  	*vaddr = NULL;
>  
> +	if (nr_grefs > XENBUS_MAX_RING_PAGES)
> +		return -EINVAL;
> +
>  	node = kzalloc(sizeof(*node), GFP_KERNEL);
>  	if (!node)
>  		return -ENOMEM;
>  
> -	err = alloc_xenballooned_pages(1, &node->page, false /* lowmem */);
> +	err = alloc_xenballooned_pages(nr_grefs, &node->page,
> +				       false /* lowmem */);
>  	if (err)
>  		goto out_err;
>  
>  	addr = pfn_to_kaddr(page_to_pfn(node->page));
>  
> -	err = xenbus_map_ring(dev, gnt_ref, &node->handle, addr);
> +	err = xenbus_map_ring(dev, gnt_ref, nr_grefs, node->handle,
> +			      addr, &vma_leaked);
>  	if (err)
>  		goto out_err;
>  
> +	node->nr_handles = nr_grefs;
> +
>  	spin_lock(&xenbus_valloc_lock);
>  	list_add(&node->next, &xenbus_valloc_pages);
>  	spin_unlock(&xenbus_valloc_lock);
> @@ -545,7 +622,8 @@ static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
>  	return 0;
>  
>   out_err:
> -	free_xenballooned_pages(1, &node->page);
> +	if (!vma_leaked)
> +		free_xenballooned_pages(nr_grefs, &node->page);
>  	kfree(node);
>  	return err;
>  }
> @@ -554,36 +632,75 @@ static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
>  /**
>   * xenbus_map_ring
>   * @dev: xenbus device
> - * @gnt_ref: grant reference
> + * @gnt_ref: grant reference array
> + * @nr_grefs: number of grant references
>   * @handle: pointer to grant handle to be filled
>   * @vaddr: address to be mapped to
> + * @vma_leaked: set when a failed mapping cannot be rolled back (vma leaked)
>   *
> - * Map a page of memory into this domain from another domain's grant table.
> + * Map pages of memory into this domain from another domain's grant table.
>   * xenbus_map_ring does not allocate the virtual address space (you must do
> - * this yourself!). It only maps in the page to the specified address.
> + * this yourself!). It only maps in the pages to the specified address.
>   * Returns 0 on success, and GNTST_* (see xen/include/interface/grant_table.h)
>   * or -ENOMEM on error. If an error is returned, device will switch to
> - * XenbusStateClosing and the error message will be saved in XenStore.
> + * XenbusStateClosing and the last error message will be saved in XenStore.
>   */
> -int xenbus_map_ring(struct xenbus_device *dev, int gnt_ref,
> -		    grant_handle_t *handle, void *vaddr)
> +int xenbus_map_ring(struct xenbus_device *dev, int *gnt_ref, int nr_grefs,
> +		    grant_handle_t *handle, void *vaddr, int *vma_leaked)
>  {
>  	struct gnttab_map_grant_ref op;
> +	int i;
> +	int err = GNTST_okay;
> +
> +	for (i = 0; i < nr_grefs; i++) {
> +		unsigned long addr = (unsigned long)vaddr +
> +			(PAGE_SIZE * i);
> +		gnttab_set_map_op(&op, (unsigned long)addr,
> +				  GNTMAP_host_map, gnt_ref[i],
> +				  dev->otherend_id);
> +
> +		if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref,
> +					      &op, 1))
> +			BUG();
> +
> +		if (op.status != GNTST_okay) {
> +			xenbus_dev_fatal(dev, op.status,
> +				 "mapping in shared page (%d/%d) %d from domain %d",
> +				 i+1, nr_grefs, gnt_ref[i], dev->otherend_id);
> +			handle[i] = INVALID_GRANT_HANDLE;
> +			goto rollback;
> +		} else
> +			handle[i] = op.handle;
> +	}
>  
> -	gnttab_set_map_op(&op, (unsigned long)vaddr, GNTMAP_host_map, gnt_ref,
> -			  dev->otherend_id);
> -
> -	if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
> -		BUG();
> +	return 0;
>  
> -	if (op.status != GNTST_okay) {
> -		xenbus_dev_fatal(dev, op.status,
> -				 "mapping in shared page %d from domain %d",
> -				 gnt_ref, dev->otherend_id);
> -	} else
> -		*handle = op.handle;
> +rollback:
> +	*vma_leaked = 0;
> +	for ( ; i >= 0; i--) {
> +		if (handle[i] != INVALID_GRANT_HANDLE) {
> +			struct gnttab_unmap_grant_ref unmap_op;
> +			unsigned long addr = (unsigned long)vaddr +
> +				(PAGE_SIZE * i);
> +			gnttab_set_unmap_op(&unmap_op, (phys_addr_t)addr,
> +					    GNTMAP_host_map, handle[i]);
> +
> +			if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
> +						      &unmap_op, 1))
> +				BUG();
> +
> +			if (unmap_op.status != GNTST_okay) {
> +				pr_alert("rollback mapping (%d/%d) %d from domain %d, err = %d\n",
> +					 i+1, nr_grefs, gnt_ref[i],
> +					 dev->otherend_id,
> +					 unmap_op.status);
> +				*vma_leaked = 1;
> +			}
> +			handle[i] = INVALID_GRANT_HANDLE;
> +		}
> +	}
>  
> -	return op.status;
> +	return err;
>  }
>  EXPORT_SYMBOL_GPL(xenbus_map_ring);
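
Worth spelling out for callers of the bare xenbus_map_ring(): on failure
the backing pages may only be released when *vma_leaked comes back clear,
along the lines of (sketch, mirroring the HVM valloc path above):

	int err, leaked = 0;

	err = xenbus_map_ring(dev, grefs, nr, handles, addr, &leaked);
	if (err) {
		if (!leaked)
			free_xenballooned_pages(nr, &pages);
		else
			pr_alert("xenbus ring pages leaked\n");
		return err;
	}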
>  
> @@ -609,10 +726,11 @@ EXPORT_SYMBOL_GPL(xenbus_unmap_ring_vfree);
>  static int xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, void *vaddr)
>  {
>  	struct xenbus_map_node *node;
> -	struct gnttab_unmap_grant_ref op = {
> -		.host_addr = (unsigned long)vaddr,
> -	};
> +	struct gnttab_unmap_grant_ref op[XENBUS_MAX_RING_PAGES];
>  	unsigned int level;
> +	int i;
> +	int last_error = GNTST_okay;
> +	int vma_leaked;

bool
>  
>  	spin_lock(&xenbus_valloc_lock);
>  	list_for_each_entry(node, &xenbus_valloc_pages, next) {
> @@ -631,22 +749,39 @@ static int xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, void *vaddr)
>  		return GNTST_bad_virt_addr;
>  	}
>  
> -	op.handle = node->handle;
> -	op.host_addr = arbitrary_virt_to_machine(
> -		lookup_address((unsigned long)vaddr, &level)).maddr;
> +	for (i = 0; i < node->nr_handles; i++) {
> +		unsigned long addr = (unsigned long)vaddr +
> +			(PAGE_SIZE * i);
> +		op[i].dev_bus_addr = 0;
> +		op[i].handle = node->handle[i];
> +		op[i].host_addr = arbitrary_virt_to_machine(
> +			lookup_address((unsigned long)addr, &level)).maddr;
> +	}
>  
> -	if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, &op, 1))
> +	if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, op,
> +				      node->nr_handles))
>  		BUG();
>  
> -	if (op.status == GNTST_okay)
> +	vma_leaked = 0;
> +	for (i = 0; i < node->nr_handles; i++) {
> +		if (op[i].status != GNTST_okay) {
> +			last_error = op[i].status;
> +			vma_leaked = 1;
> +			xenbus_dev_error(dev, op[i].status,
> +				 "unmapping page (%d/%d) at handle %d error %d",
> +				 i+1, node->nr_handles, node->handle[i],
> +				 op[i].status);
> +		}
> +	}
> +
> +	if (!vma_leaked)
>  		free_vm_area(node->area);
>  	else
> -		xenbus_dev_error(dev, op.status,
> -				 "unmapping page at handle %d error %d",
> -				 node->handle, op.status);
> +		pr_alert("leaking vm area %p size %d page(s)\n",
> +			 node->area, node->nr_handles);
>  
>  	kfree(node);
> -	return op.status;
> +	return last_error;
>  }
>  
>  static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
> @@ -673,10 +808,10 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
>  		return GNTST_bad_virt_addr;
>  	}
>  
> -	rv = xenbus_unmap_ring(dev, node->handle, addr);
> +	rv = xenbus_unmap_ring(dev, node->handle, node->nr_handles, addr);
>  
>  	if (!rv)
> -		free_xenballooned_pages(1, &node->page);
> +		free_xenballooned_pages(node->nr_handles, &node->page);
>  	else
>  		WARN(1, "Leaking %p\n", vaddr);
>  
> @@ -687,7 +822,8 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
>  /**
>   * xenbus_unmap_ring
>   * @dev: xenbus device
> - * @handle: grant handle
> + * @handle: grant handle array
> + * @nr_handles: number of grant handles
>   * @vaddr: addr to unmap
>   *
>   * Unmap a page of memory in this domain that was imported from another domain.
> @@ -695,21 +831,33 @@ static int xenbus_unmap_ring_vfree_hvm(struct xenbus_device *dev, void *vaddr)
>   * (see xen/include/interface/grant_table.h).
>   */
>  int xenbus_unmap_ring(struct xenbus_device *dev,
> -		      grant_handle_t handle, void *vaddr)
> +		      grant_handle_t *handle, int nr_handles,
> +		      void *vaddr)
>  {
>  	struct gnttab_unmap_grant_ref op;
> +	int last_error = GNTST_okay;
> +	int i;
> +
> +	for (i = 0; i < nr_handles; i++) {
> +		unsigned long addr = (unsigned long)vaddr +
> +			(PAGE_SIZE * i);
> +		gnttab_set_unmap_op(&op, (unsigned long)addr,
> +				    GNTMAP_host_map, handle[i]);
> +
> +		if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
> +					      &op, 1))
> +			BUG();
> +
> +		if (op.status != GNTST_okay) {
> +			xenbus_dev_error(dev, op.status,
> +				 "unmapping page (%d/%d) at handle %d error %d",
> +				 i+1, nr_handles, handle[i], op.status);
> +			last_error = op.status;
> +		}
> +		handle[i] = INVALID_GRANT_HANDLE;
> +	}
>  
> -	gnttab_set_unmap_op(&op, (unsigned long)vaddr, GNTMAP_host_map, handle);
> -
> -	if (HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, &op, 1))
> -		BUG();
> -
> -	if (op.status != GNTST_okay)
> -		xenbus_dev_error(dev, op.status,
> -				 "unmapping page at handle %d error %d",
> -				 handle, op.status);
> -
> -	return op.status;
> +	return last_error;
>  }
>  EXPORT_SYMBOL_GPL(xenbus_unmap_ring);
>  
> diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
> index 0a7515c..b7d9613 100644
> --- a/include/xen/xenbus.h
> +++ b/include/xen/xenbus.h
> @@ -46,6 +46,11 @@
>  #include <xen/interface/io/xenbus.h>
>  #include <xen/interface/io/xs_wire.h>
>  
> +/* Max pages supported by multi-page ring in the backend */
> +#define XENBUS_MAX_RING_PAGE_ORDER  2
> +#define XENBUS_MAX_RING_PAGES       (1U << XENBUS_MAX_RING_PAGE_ORDER)

So why this value? If this is a mechanical change shouldn't the order be '0'?

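(For context, with 4K pages __CONST_RING_SIZE() works out to roughly:

	order 0: 1 page  -> 256 xen_netif_tx slots
	order 2: 4 pages -> 1024 xen_netif_tx slots

so presumably the non-zero value is staging for the multi-page ring
patches later in the series -- if so, that belongs in the changelog.)
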
> +#define INVALID_GRANT_HANDLE        (~0U)
> +
>  /* Register callback to watch this node. */
>  struct xenbus_watch
>  {
> @@ -195,15 +200,17 @@ int xenbus_watch_pathfmt(struct xenbus_device *dev, struct xenbus_watch *watch,
>  			 const char *pathfmt, ...);
>  
>  int xenbus_switch_state(struct xenbus_device *dev, enum xenbus_state new_state);
> -int xenbus_grant_ring(struct xenbus_device *dev, unsigned long ring_mfn);
> +int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
> +		      int nr_pages, int *grefs);
>  int xenbus_map_ring_valloc(struct xenbus_device *dev,
> -			   int gnt_ref, void **vaddr);
> -int xenbus_map_ring(struct xenbus_device *dev, int gnt_ref,
> -			   grant_handle_t *handle, void *vaddr);
> +			   int *gnt_ref, int nr_grefs, void **vaddr);
> +int xenbus_map_ring(struct xenbus_device *dev, int *gnt_ref, int nr_grefs,
> +		    grant_handle_t *handle, void *vaddr, int *vma_leaked);
>  
>  int xenbus_unmap_ring_vfree(struct xenbus_device *dev, void *vaddr);
>  int xenbus_unmap_ring(struct xenbus_device *dev,
> -		      grant_handle_t handle, void *vaddr);
> +		      grant_handle_t *handle, int nr_handles,
> +		      void *vaddr);
>  
>  int xenbus_alloc_evtchn(struct xenbus_device *dev, int *port);
>  int xenbus_bind_evtchn(struct xenbus_device *dev, int remote_port, int *port);
> -- 
> 1.7.10.4
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 6/8] netfront: multi-page ring support
  2013-02-15 16:00 ` [PATCH 6/8] netfront: " Wei Liu
                     ` (2 preceding siblings ...)
  2013-03-04 21:16   ` Konrad Rzeszutek Wilk
@ 2013-03-04 21:16   ` Konrad Rzeszutek Wilk
  3 siblings, 0 replies; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-04 21:16 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, ian.campbell, annie.li

On Fri, Feb 15, 2013 at 04:00:07PM +0000, Wei Liu wrote:

Please:
 1) Explain the new PV protocol (you could just do a copy-n-paste
    from what you had in the backend; my reading of the keys is
    sketched below).
 2) Also submit a patch to the Xen hypervisor tree for the new XenBus
    extension.
 3) Explain in which scenarios this benefits the user.
 4) Also provide a Documentation/ABI/stable/sysfs-bus-xen-frontend
    entry to explain the new parameter.
 
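For 1), my own reading of the keys from talk_to_netback(), taking order 2
as an example, is:

	tx-ring-order = "2"
	tx-ring-ref0 ... tx-ring-ref3 = "<grant ref>"
	rx-ring-order = "2"
	rx-ring-ref0 ... rx-ring-ref3 = "<grant ref>"
	event-channel = "<port>"

with the plain tx-ring-ref/rx-ring-ref keys retained for the order-0
(single page) case -- please confirm that in the protocol write-up.
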
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  drivers/net/xen-netfront.c |  246 +++++++++++++++++++++++++++++++-------------
>  1 file changed, 174 insertions(+), 72 deletions(-)
> 
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 8bd75a1..de73a71 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -67,9 +67,19 @@ struct netfront_cb {
>  
>  #define GRANT_INVALID_REF	0
>  
> -#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> -#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> -#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE, 256)
> +#define XENNET_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER
> +#define XENNET_MAX_RING_PAGES      (1U << XENNET_MAX_RING_PAGE_ORDER)
> +
> +
> +#define NET_TX_RING_SIZE(_nr_pages)			\
> +	__CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages))
> +#define NET_RX_RING_SIZE(_nr_pages)			\
> +	__CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE * (_nr_pages))
> +
> +#define XENNET_MAX_TX_RING_SIZE NET_TX_RING_SIZE(XENNET_MAX_RING_PAGES)
> +#define XENNET_MAX_RX_RING_SIZE NET_RX_RING_SIZE(XENNET_MAX_RING_PAGES)
> +
> +#define TX_MAX_TARGET min_t(int, NET_TX_RING_SIZE(1), 256)
>  
>  struct netfront_stats {
>  	u64			rx_packets;
> @@ -80,6 +90,11 @@ struct netfront_stats {
>  };
>  
>  struct netfront_info {
> +	/* Statistics */
> +	struct netfront_stats __percpu *stats;
> +
> +	unsigned long rx_gso_checksum_fixup;
> +
>  	struct list_head list;
>  	struct net_device *netdev;
>  
> @@ -90,7 +105,9 @@ struct netfront_info {
>  
>  	spinlock_t   tx_lock;
>  	struct xen_netif_tx_front_ring tx;
> -	int tx_ring_ref;
> +	int tx_ring_ref[XENNET_MAX_RING_PAGES];
> +	unsigned int tx_ring_page_order;
> +	unsigned int tx_ring_pages;
>  
>  	/*
>  	 * {tx,rx}_skbs store outstanding skbuffs. Free tx_skb entries
> @@ -104,36 +121,33 @@ struct netfront_info {
>  	union skb_entry {
>  		struct sk_buff *skb;
>  		unsigned long link;
> -	} tx_skbs[NET_TX_RING_SIZE];
> +	} tx_skbs[XENNET_MAX_TX_RING_SIZE];
>  	grant_ref_t gref_tx_head;
> -	grant_ref_t grant_tx_ref[NET_TX_RING_SIZE];
> +	grant_ref_t grant_tx_ref[XENNET_MAX_TX_RING_SIZE];
>  	unsigned tx_skb_freelist;
>  
>  	spinlock_t   rx_lock ____cacheline_aligned_in_smp;
>  	struct xen_netif_rx_front_ring rx;
> -	int rx_ring_ref;
> +	int rx_ring_ref[XENNET_MAX_RING_PAGES];
> +	unsigned int rx_ring_page_order;
> +	unsigned int rx_ring_pages;
>  
>  	/* Receive-ring batched refills. */
>  #define RX_MIN_TARGET 8
>  #define RX_DFL_MIN_TARGET 64
> -#define RX_MAX_TARGET min_t(int, NET_RX_RING_SIZE, 256)
> +#define RX_MAX_TARGET min_t(int, NET_RX_RING_SIZE(1), 256)
>  	unsigned rx_min_target, rx_max_target, rx_target;
>  	struct sk_buff_head rx_batch;
>  
>  	struct timer_list rx_refill_timer;
>  
> -	struct sk_buff *rx_skbs[NET_RX_RING_SIZE];
> +	struct sk_buff *rx_skbs[XENNET_MAX_RX_RING_SIZE];
>  	grant_ref_t gref_rx_head;
> -	grant_ref_t grant_rx_ref[NET_RX_RING_SIZE];
> -
> -	unsigned long rx_pfn_array[NET_RX_RING_SIZE];
> -	struct multicall_entry rx_mcl[NET_RX_RING_SIZE+1];
> -	struct mmu_update rx_mmu[NET_RX_RING_SIZE];
> -
> -	/* Statistics */
> -	struct netfront_stats __percpu *stats;
> +	grant_ref_t grant_rx_ref[XENNET_MAX_RX_RING_SIZE];
>  
> -	unsigned long rx_gso_checksum_fixup;
> +	unsigned long rx_pfn_array[XENNET_MAX_RX_RING_SIZE];
> +	struct multicall_entry rx_mcl[XENNET_MAX_RX_RING_SIZE+1];
> +	struct mmu_update rx_mmu[XENNET_MAX_RX_RING_SIZE];
>  };
>  
>  struct netfront_rx_info {
> @@ -171,15 +185,15 @@ static unsigned short get_id_from_freelist(unsigned *head,
>  	return id;
>  }
>  
> -static int xennet_rxidx(RING_IDX idx)
> +static int xennet_rxidx(RING_IDX idx, struct netfront_info *info)
>  {
> -	return idx & (NET_RX_RING_SIZE - 1);
> +	return idx & (NET_RX_RING_SIZE(info->rx_ring_pages) - 1);
>  }
>  
>  static struct sk_buff *xennet_get_rx_skb(struct netfront_info *np,
>  					 RING_IDX ri)
>  {
> -	int i = xennet_rxidx(ri);
> +	int i = xennet_rxidx(ri, np);
>  	struct sk_buff *skb = np->rx_skbs[i];
>  	np->rx_skbs[i] = NULL;
>  	return skb;
> @@ -188,7 +202,7 @@ static struct sk_buff *xennet_get_rx_skb(struct netfront_info *np,
>  static grant_ref_t xennet_get_rx_ref(struct netfront_info *np,
>  					    RING_IDX ri)
>  {
> -	int i = xennet_rxidx(ri);
> +	int i = xennet_rxidx(ri, np);
>  	grant_ref_t ref = np->grant_rx_ref[i];
>  	np->grant_rx_ref[i] = GRANT_INVALID_REF;
>  	return ref;
> @@ -301,7 +315,7 @@ no_skb:
>  
>  		skb->dev = dev;
>  
> -		id = xennet_rxidx(req_prod + i);
> +		id = xennet_rxidx(req_prod + i, np);
>  
>  		BUG_ON(np->rx_skbs[id]);
>  		np->rx_skbs[id] = skb;
> @@ -653,7 +667,7 @@ static int xennet_close(struct net_device *dev)
>  static void xennet_move_rx_slot(struct netfront_info *np, struct sk_buff *skb,
>  				grant_ref_t ref)
>  {
> -	int new = xennet_rxidx(np->rx.req_prod_pvt);
> +	int new = xennet_rxidx(np->rx.req_prod_pvt, np);
>  
>  	BUG_ON(np->rx_skbs[new]);
>  	np->rx_skbs[new] = skb;
> @@ -1109,7 +1123,7 @@ static void xennet_release_tx_bufs(struct netfront_info *np)
>  	struct sk_buff *skb;
>  	int i;
>  
> -	for (i = 0; i < NET_TX_RING_SIZE; i++) {
> +	for (i = 0; i < NET_TX_RING_SIZE(np->tx_ring_pages); i++) {
>  		/* Skip over entries which are actually freelist references */
>  		if (skb_entry_is_link(&np->tx_skbs[i]))
>  			continue;
> @@ -1143,7 +1157,7 @@ static void xennet_release_rx_bufs(struct netfront_info *np)
>  
>  	spin_lock_bh(&np->rx_lock);
>  
> -	for (id = 0; id < NET_RX_RING_SIZE; id++) {
> +	for (id = 0; id < NET_RX_RING_SIZE(np->rx_ring_pages); id++) {
>  		ref = np->grant_rx_ref[id];
>  		if (ref == GRANT_INVALID_REF) {
>  			unused++;
> @@ -1324,13 +1338,13 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev)
>  
>  	/* Initialise tx_skbs as a free chain containing every entry. */
>  	np->tx_skb_freelist = 0;
> -	for (i = 0; i < NET_TX_RING_SIZE; i++) {
> +	for (i = 0; i < XENNET_MAX_TX_RING_SIZE; i++) {
>  		skb_entry_set_link(&np->tx_skbs[i], i+1);
>  		np->grant_tx_ref[i] = GRANT_INVALID_REF;
>  	}
>  
>  	/* Clear out rx_skbs */
> -	for (i = 0; i < NET_RX_RING_SIZE; i++) {
> +	for (i = 0; i < XENNET_MAX_RX_RING_SIZE; i++) {
>  		np->rx_skbs[i] = NULL;
>  		np->grant_rx_ref[i] = GRANT_INVALID_REF;
>  	}
> @@ -1428,13 +1442,6 @@ static int netfront_probe(struct xenbus_device *dev,
>  	return err;
>  }
>  
> -static void xennet_end_access(int ref, void *page)
> -{
> -	/* This frees the page as a side-effect */
> -	if (ref != GRANT_INVALID_REF)
> -		gnttab_end_foreign_access(ref, 0, (unsigned long)page);
> -}
> -
>  static void xennet_disconnect_backend(struct netfront_info *info)
>  {
>  	/* Stop old i/f to prevent errors whilst we rebuild the state. */
> @@ -1448,12 +1455,12 @@ static void xennet_disconnect_backend(struct netfront_info *info)
>  		unbind_from_irqhandler(info->netdev->irq, info->netdev);
>  	info->evtchn = info->netdev->irq = 0;
>  
> -	/* End access and free the pages */
> -	xennet_end_access(info->tx_ring_ref, info->tx.sring);
> -	xennet_end_access(info->rx_ring_ref, info->rx.sring);
> +	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->tx.sring);
> +	free_pages((unsigned long)info->tx.sring, info->tx_ring_page_order);
> +
> +	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->rx.sring);
> +	free_pages((unsigned long)info->rx.sring, info->rx_ring_page_order);
>  
> -	info->tx_ring_ref = GRANT_INVALID_REF;
> -	info->rx_ring_ref = GRANT_INVALID_REF;
>  	info->tx.sring = NULL;
>  	info->rx.sring = NULL;
>  }
> @@ -1501,11 +1508,14 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
>  	struct xen_netif_tx_sring *txs;
>  	struct xen_netif_rx_sring *rxs;
>  	int err;
> -	int grefs[1];
>  	struct net_device *netdev = info->netdev;
> +	unsigned int max_tx_ring_page_order, max_rx_ring_page_order;
> +	int i;
>  
> -	info->tx_ring_ref = GRANT_INVALID_REF;
> -	info->rx_ring_ref = GRANT_INVALID_REF;
> +	for (i = 0; i < XENNET_MAX_RING_PAGES; i++) {
> +		info->tx_ring_ref[i] = GRANT_INVALID_REF;
> +		info->rx_ring_ref[i] = GRANT_INVALID_REF;
> +	}
>  	info->rx.sring = NULL;
>  	info->tx.sring = NULL;
>  	netdev->irq = 0;
> @@ -1516,50 +1526,100 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
>  		goto fail;
>  	}
>  
> -	txs = (struct xen_netif_tx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH);
> +	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
> +			   "max-tx-ring-page-order", "%u",
> +			   &max_tx_ring_page_order);
> +	if (err < 0) {
> +		info->tx_ring_page_order = 0;
> +		dev_info(&dev->dev, "single tx ring\n");
> +	} else {
> +		if (max_tx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) {
> +			dev_info(&dev->dev,
> +				 "backend ring page order %d too large, clamp to %d\n",
> +				 max_tx_ring_page_order,
> +				 XENNET_MAX_RING_PAGE_ORDER);
> +			max_tx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER;
> +		}
> +		info->tx_ring_page_order = max_tx_ring_page_order;
> +		dev_info(&dev->dev, "multi-page tx ring, order = %d\n",
> +			 info->tx_ring_page_order);
> +	}
> +	info->tx_ring_pages = (1U << info->tx_ring_page_order);
> +
> +	txs = (struct xen_netif_tx_sring *)
> +		__get_free_pages(__GFP_ZERO | GFP_NOIO | __GFP_HIGH,
> +				 info->tx_ring_page_order);
>  	if (!txs) {
>  		err = -ENOMEM;
>  		xenbus_dev_fatal(dev, err, "allocating tx ring page");
>  		goto fail;
>  	}
>  	SHARED_RING_INIT(txs);
> -	FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE);
> +	FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE * info->tx_ring_pages);
> +
> +	err = xenbus_grant_ring(dev, txs, info->tx_ring_pages,
> +				info->tx_ring_ref);
> +	if (err < 0)
> +		goto grant_tx_ring_fail;
>  
> -	err = xenbus_grant_ring(dev, txs, 1, grefs);
> +	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
> +			   "max-rx-ring-page-order", "%u",
> +			   &max_rx_ring_page_order);
>  	if (err < 0) {
> -		free_page((unsigned long)txs);
> -		goto fail;
> +		info->rx_ring_page_order = 0;
> +		dev_info(&dev->dev, "single rx ring\n");
> +	} else {
> +		if (max_rx_ring_page_order > XENNET_MAX_RING_PAGE_ORDER) {
> +			dev_info(&dev->dev,
> +				 "backend ring page order %d too large, clamp to %d\n",
> +				 max_rx_ring_page_order,
> +				 XENNET_MAX_RING_PAGE_ORDER);
> +			max_rx_ring_page_order = XENNET_MAX_RING_PAGE_ORDER;
> +		}
> +		info->rx_ring_page_order = max_rx_ring_page_order;
> +		dev_info(&dev->dev, "multi-page rx ring, order = %d\n",
> +			 info->rx_ring_page_order);
>  	}
> +	info->rx_ring_pages = (1U << info->rx_ring_page_order);
>  
> -	info->tx_ring_ref = grefs[0];
> -	rxs = (struct xen_netif_rx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH);
> +	rxs = (struct xen_netif_rx_sring *)
> +		__get_free_pages(__GFP_ZERO | GFP_NOIO | __GFP_HIGH,
> +				 info->rx_ring_page_order);
>  	if (!rxs) {
>  		err = -ENOMEM;
>  		xenbus_dev_fatal(dev, err, "allocating rx ring page");
> -		goto fail;
> +		goto alloc_rx_ring_fail;
>  	}
>  	SHARED_RING_INIT(rxs);
> -	FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE);
> +	FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE * info->rx_ring_pages);
>  
> -	err = xenbus_grant_ring(dev, rxs, 1, grefs);
> -	if (err < 0) {
> -		free_page((unsigned long)rxs);
> -		goto fail;
> -	}
> -	info->rx_ring_ref = grefs[0];
> +	err = xenbus_grant_ring(dev, rxs, info->rx_ring_pages,
> +				info->rx_ring_ref);
> +	if (err < 0)
> +		goto grant_rx_ring_fail;
>  
>  	err = xenbus_alloc_evtchn(dev, &info->evtchn);
>  	if (err)
> -		goto fail;
> +		goto alloc_evtchn_fail;
>  
>  	err = bind_evtchn_to_irqhandler(info->evtchn, xennet_interrupt,
>  					0, netdev->name, netdev);
>  	if (err < 0)
> -		goto fail;
> +		goto bind_fail;
>  	netdev->irq = err;
>  	return 0;
>  
> - fail:
> +bind_fail:
> +	xenbus_free_evtchn(dev, info->evtchn);
> +alloc_evtchn_fail:
> +	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->rx.sring);
> +grant_rx_ring_fail:
> +	free_pages((unsigned long)info->rx.sring, info->rx_ring_page_order);
> +alloc_rx_ring_fail:
> +	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->tx.sring);
> +grant_tx_ring_fail:
> +	free_pages((unsigned long)info->tx.sring, info->tx_ring_page_order);
> +fail:
>  	return err;
>  }
>  
> @@ -1570,6 +1630,7 @@ static int talk_to_netback(struct xenbus_device *dev,
>  	const char *message;
>  	struct xenbus_transaction xbt;
>  	int err;
> +	int i;
>  
>  	/* Create shared ring, alloc event channel. */
>  	err = setup_netfront(dev, info);
> @@ -1583,18 +1644,58 @@ again:
>  		goto destroy_ring;
>  	}
>  
> -	err = xenbus_printf(xbt, dev->nodename, "tx-ring-ref", "%u",
> -			    info->tx_ring_ref);
> -	if (err) {
> -		message = "writing tx ring-ref";
> -		goto abort_transaction;
> +	if (info->tx_ring_page_order == 0) {
> +		err = xenbus_printf(xbt, dev->nodename, "tx-ring-ref", "%u",
> +				    info->tx_ring_ref[0]);
> +		if (err) {
> +			message = "writing tx ring-ref";
> +			goto abort_transaction;
> +		}
> +	} else {
> +		err = xenbus_printf(xbt, dev->nodename, "tx-ring-order", "%u",
> +				    info->tx_ring_page_order);
> +		if (err) {
> +			message = "writing tx-ring-order";
> +			goto abort_transaction;
> +		}
> +		for (i = 0; i < info->tx_ring_pages; i++) {
> +			char name[sizeof("tx-ring-ref")+3];
> +			snprintf(name, sizeof(name), "tx-ring-ref%u", i);
> +			err = xenbus_printf(xbt, dev->nodename, name, "%u",
> +					    info->tx_ring_ref[i]);
> +			if (err) {
> +				message = "writing tx ring-ref";
> +				goto abort_transaction;
> +			}
> +		}
>  	}
> -	err = xenbus_printf(xbt, dev->nodename, "rx-ring-ref", "%u",
> -			    info->rx_ring_ref);
> -	if (err) {
> -		message = "writing rx ring-ref";
> -		goto abort_transaction;
> +
> +	if (info->rx_ring_page_order == 0) {
> +		err = xenbus_printf(xbt, dev->nodename, "rx-ring-ref", "%u",
> +				    info->rx_ring_ref[0]);
> +		if (err) {
> +			message = "writing rx ring-ref";
> +			goto abort_transaction;
> +		}
> +	} else {
> +		err = xenbus_printf(xbt, dev->nodename, "rx-ring-order", "%u",
> +				    info->rx_ring_page_order);
> +		if (err) {
> +			message = "writing rx-ring-order";
> +			goto abort_transaction;
> +		}
> +		for (i = 0; i < info->rx_ring_pages; i++) {
> +			char name[sizeof("rx-ring-ref")+3];
> +			snprintf(name, sizeof(name), "rx-ring-ref%u", i);
> +			err = xenbus_printf(xbt, dev->nodename, name, "%u",
> +					    info->rx_ring_ref[i]);
> +			if (err) {
> +				message = "writing rx ring-ref";
> +				goto abort_transaction;
> +			}
> +		}
>  	}
> +
>  	err = xenbus_printf(xbt, dev->nodename,
>  			    "event-channel", "%u", info->evtchn);
>  	if (err) {
> @@ -1681,7 +1782,8 @@ static int xennet_connect(struct net_device *dev)
>  	xennet_release_tx_bufs(np);
>  
>  	/* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */
> -	for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE; i++) {
> +	for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE(np->rx_ring_pages);
> +	     i++) {
>  		skb_frag_t *frag;
>  		const struct page *page;
>  		if (!np->rx_skbs[i])
> -- 
> 1.7.10.4
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 7/8] netback: split event channels support
  2013-02-15 16:00 ` Wei Liu
  2013-03-04 21:22   ` Konrad Rzeszutek Wilk
@ 2013-03-04 21:22   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-04 21:22 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, ian.campbell, annie.li

On Fri, Feb 15, 2013 at 04:00:08PM +0000, Wei Liu wrote:
> Netback and netfront only use one event channel to do tx / rx notification.
> This may cause unnecessary wake-up of process routines. This patch adds a new
> feature called feautre-split-event-channel to netback, enabling it to handle

'feature', not 'feautre'. In the code it also appears to be plural:

feature-split-event-channels

and there is also event-channel-tx and event-channel-rx attribute.


You also need to send a patch to the Xen hypervisor tree for these new
XenBus attributes and provide the appropriate syntax.

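Concretely, the frontend nodes would end up looking like (going by
connect_rings() further down):

	event-channel-tx = "<port>"	(tx completion notifications)
	event-channel-rx = "<port>"	(rx notifications)

in place of the single event-channel key, once both ends advertise
feature-split-event-channels.
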
> Tx and Rx event separately.
> 
> Netback will use tx_irq to notify guest for tx completion, rx_irq for rx
> notification.
> 
> If frontend doesn't support this feature, tx_irq = rx_irq.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  drivers/net/xen-netback/common.h    |   10 +++--
>  drivers/net/xen-netback/interface.c |   78 ++++++++++++++++++++++++++++-------
>  drivers/net/xen-netback/netback.c   |    7 ++--
>  drivers/net/xen-netback/xenbus.c    |   44 ++++++++++++++++----
>  4 files changed, 109 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index f541ba9..cc2a9f0 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -63,8 +63,11 @@ struct xenvif {
>  
>  	u8               fe_dev_addr[6];
>  
> -	/* Physical parameters of the comms window. */
> -	unsigned int     irq;
> +	/* Physical parameters of the comms window.
> +	 * When feature-split-event-channels = 0, tx_irq = rx_irq.
> +	 */
> +	unsigned int tx_irq;
> +	unsigned int rx_irq;
>  
>  	/* List of frontends to notify after a batch of frames sent. */
>  	struct list_head notify_list;
> @@ -122,10 +125,11 @@ struct xenvif *xenvif_alloc(struct device *parent,
>  			    domid_t domid,
>  			    unsigned int handle);
>  
> +/* When feature-split-event-channels == 0, tx_evtchn == rx_evtchn */
>  int xenvif_connect(struct xenvif *vif,
>  		   unsigned long *tx_ring_ref, unsigned int tx_ring_order,
>  		   unsigned long *rx_ring_ref, unsigned int rx_ring_order,
> -		   unsigned int evtchn);
> +		   unsigned int tx_evtchn, unsigned int rx_evtchn);
>  void xenvif_disconnect(struct xenvif *vif);
>  
>  void xenvif_get(struct xenvif *vif);
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index fa4d46d..c9ebe21 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -60,7 +60,8 @@ static int xenvif_rx_schedulable(struct xenvif *vif)
>  	return xenvif_schedulable(vif) && !xen_netbk_rx_ring_full(vif);
>  }
>  
> -static irqreturn_t xenvif_interrupt(int irq, void *dev_id)
> +/* Tx interrupt handler used when feature-split-event-channels == 1 */
> +static irqreturn_t xenvif_tx_interrupt(int tx_irq, void *dev_id)
>  {
>  	struct xenvif *vif = dev_id;
>  
> @@ -69,12 +70,31 @@ static irqreturn_t xenvif_interrupt(int irq, void *dev_id)
>  
>  	xen_netbk_schedule_xenvif(vif);
>  
> +	return IRQ_HANDLED;
> +}
> +
> +/* Rx interrupt handler used when feature-split-event-channels == 1 */
> +static irqreturn_t xenvif_rx_interrupt(int rx_irq, void *dev_id)
> +{
> +	struct xenvif *vif = dev_id;
> +
> +	if (vif->netbk == NULL)
> +		return IRQ_NONE;
> +
>  	if (xenvif_rx_schedulable(vif))
>  		netif_wake_queue(vif->dev);
>  
>  	return IRQ_HANDLED;
>  }
>  
> +/* Used when feature-split-event-channels == 0 */
> +static irqreturn_t xenvif_interrupt(int irq, void *dev_id)
> +{
> +	xenvif_tx_interrupt(irq, dev_id);
> +	xenvif_rx_interrupt(irq, dev_id);
> +	return IRQ_HANDLED;
> +}
> +
>  static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  {
>  	struct xenvif *vif = netdev_priv(dev);
> @@ -125,13 +145,15 @@ static struct net_device_stats *xenvif_get_stats(struct net_device *dev)
>  static void xenvif_up(struct xenvif *vif)
>  {
>  	xen_netbk_add_xenvif(vif);
> -	enable_irq(vif->irq);
> +	enable_irq(vif->tx_irq);
> +	enable_irq(vif->rx_irq);
>  	xen_netbk_check_rx_xenvif(vif);
>  }
>  
>  static void xenvif_down(struct xenvif *vif)
>  {
> -	disable_irq(vif->irq);
> +	disable_irq(vif->tx_irq);
> +	disable_irq(vif->rx_irq);
>  	del_timer_sync(&vif->credit_timeout);
>  	xen_netbk_deschedule_xenvif(vif);
>  	xen_netbk_remove_xenvif(vif);
> @@ -308,7 +330,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
>  int xenvif_connect(struct xenvif *vif,
>  		   unsigned long *tx_ring_ref, unsigned int tx_ring_ref_count,
>  		   unsigned long *rx_ring_ref, unsigned int rx_ring_ref_count,
> -		   unsigned int evtchn)
> +		   unsigned int tx_evtchn, unsigned int rx_evtchn)
>  {
>  	int err = -ENOMEM;
>  	void *addr;
> @@ -317,7 +339,7 @@ int xenvif_connect(struct xenvif *vif,
>  	int tmp[NETBK_MAX_RING_PAGES], i;
>  
>  	/* Already connected through? */
> -	if (vif->irq)
> +	if (vif->tx_irq)
>  		return 0;
>  
>  	__module_get(THIS_MODULE);
> @@ -347,13 +369,32 @@ int xenvif_connect(struct xenvif *vif,
>  	BACK_RING_INIT(&vif->rx, rxs, PAGE_SIZE * rx_ring_ref_count);
>  	vif->nr_rx_handles = rx_ring_ref_count;
>  
> -	err = bind_interdomain_evtchn_to_irqhandler(
> -		vif->domid, evtchn, xenvif_interrupt, 0,
> -		vif->dev->name, vif);
> -	if (err < 0)
> -		goto err_rx_unmap;
> -	vif->irq = err;
> -	disable_irq(vif->irq);
> +	if (tx_evtchn == rx_evtchn) { /* feature-split-event-channels == 0 */
> +		err = bind_interdomain_evtchn_to_irqhandler(
> +			vif->domid, tx_evtchn, xenvif_interrupt, 0,
> +			vif->dev->name, vif);
> +		if (err < 0)
> +			goto err_rx_unmap;
> +		vif->tx_irq = vif->rx_irq = err;
> +		disable_irq(vif->tx_irq);
> +		disable_irq(vif->rx_irq);
> +	} else { /* feature-split-event-channels == 1 */
> +		err = bind_interdomain_evtchn_to_irqhandler(
> +			vif->domid, tx_evtchn, xenvif_tx_interrupt, 0,
> +			vif->dev->name, vif);
> +		if (err < 0)
> +			goto err_rx_unmap;
> +		vif->tx_irq = err;
> +		disable_irq(vif->tx_irq);
> +
> +		err = bind_interdomain_evtchn_to_irqhandler(
> +			vif->domid, rx_evtchn, xenvif_rx_interrupt, 0,
> +			vif->dev->name, vif);
> +		if (err < 0)
> +			goto err_tx_unbind;
> +		vif->rx_irq = err;
> +		disable_irq(vif->rx_irq);
> +	}
>  
>  	xenvif_get(vif);
>  
> @@ -367,6 +408,10 @@ int xenvif_connect(struct xenvif *vif,
>  	rtnl_unlock();
>  
>  	return 0;
> +
> +err_tx_unbind:
> +	unbind_from_irqhandler(vif->tx_irq, vif);
> +	vif->tx_irq = 0;
>  err_rx_unmap:
>  	xen_netbk_unmap_frontend_rings(vif, (void *)vif->rx.sring);
>  	vif->nr_rx_handles = 0;
> @@ -406,8 +451,13 @@ void xenvif_disconnect(struct xenvif *vif)
>  	atomic_dec(&vif->refcnt);
>  	wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
>  
> -	if (vif->irq) {
> -		unbind_from_irqhandler(vif->irq, vif);
> +	if (vif->tx_irq) {
> +		if (vif->tx_irq == vif->rx_irq)
> +			unbind_from_irqhandler(vif->tx_irq, vif);
> +		else {
> +			unbind_from_irqhandler(vif->tx_irq, vif);
> +			unbind_from_irqhandler(vif->rx_irq, vif);
> +		}
>  		need_module_put = 1;
>  	}
>  
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 644c760..5ac8c35 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -639,7 +639,7 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>  {
>  	struct xenvif *vif = NULL, *tmp;
>  	s8 status;
> -	u16 irq, flags;
> +	u16 flags;
>  	struct xen_netif_rx_response *resp;
>  	struct sk_buff_head rxq;
>  	struct sk_buff *skb;
> @@ -750,7 +750,6 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>  					 sco->meta_slots_used);
>  
>  		RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&vif->rx, ret);
> -		irq = vif->irq;
>  		if (ret && list_empty(&vif->notify_list))
>  			list_add_tail(&vif->notify_list, &notify);
>  
> @@ -762,7 +761,7 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>  	}
>  
>  	list_for_each_entry_safe(vif, tmp, &notify, notify_list) {
> -		notify_remote_via_irq(vif->irq);
> +		notify_remote_via_irq(vif->rx_irq);
>  		list_del_init(&vif->notify_list);
>  	}
>  
> @@ -1595,7 +1594,7 @@ static void make_tx_response(struct xenvif *vif,
>  	vif->tx.rsp_prod_pvt = ++i;
>  	RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&vif->tx, notify);
>  	if (notify)
> -		notify_remote_via_irq(vif->irq);
> +		notify_remote_via_irq(vif->tx_irq);
>  }
>  
>  static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif,
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> index 1791807..6822d89 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -141,6 +141,15 @@ static int netback_probe(struct xenbus_device *dev,
>  			goto abort_transaction;
>  		}
>  
> +		/* Split event channels support */
> +		err = xenbus_printf(xbt, dev->nodename,
> +				    "feature-split-event-channels",
> +				    "%u", 1);
> +		if (err) {
> +			message = "writing feature-split-event-channels";
> +			goto abort_transaction;
> +		}

I wouldn't abort b/c of it. Just continue on without using this.

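Something like this, perhaps (a sketch of the suggested fallback, not
code from this series):

	/* Advisory key: if the write fails, fall back to a single
	 * event channel instead of aborting the transaction.
	 */
	err = xenbus_printf(xbt, dev->nodename,
			    "feature-split-event-channels", "%u", 1);
	if (err)
		pr_debug("Error writing feature-split-event-channels\n");
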
> +
>  		err = xenbus_transaction_end(xbt, 0);
>  	} while (err == -EAGAIN);
>  
> @@ -419,7 +428,7 @@ static int connect_rings(struct backend_info *be)
>  {
>  	struct xenvif *vif = be->vif;
>  	struct xenbus_device *dev = be->dev;
> -	unsigned int evtchn, rx_copy;
> +	unsigned int tx_evtchn, rx_evtchn, rx_copy;
>  	int err;
>  	int val;
>  	unsigned long tx_ring_ref[NETBK_MAX_RING_PAGES];
> @@ -428,12 +437,22 @@ static int connect_rings(struct backend_info *be)
>  	unsigned int  rx_ring_order;
>  
>  	err = xenbus_gather(XBT_NIL, dev->otherend,
> -			    "event-channel", "%u", &evtchn, NULL);
> +			    "event-channel", "%u", &tx_evtchn, NULL);
>  	if (err) {
> -		xenbus_dev_fatal(dev, err,
> -				 "reading %s/event-channel",
> -				 dev->otherend);
> -		return err;
> +		/* try split event channels */
> +		err = xenbus_gather(XBT_NIL, dev->otherend,
> +				    "event-channel-tx", "%u", &tx_evtchn,
> +				    "event-channel-rx", "%u", &rx_evtchn,
> +				    NULL);
> +		if (err) {
> +			xenbus_dev_fatal(dev, err,
> +					 "reading %s/event-channel(-tx/rx)",
> +					 dev->otherend);
> +			return err;
> +		}
> +	} else { /* frontend doesn't support split event channels */
> +		rx_evtchn = tx_evtchn;
> +		dev_info(&dev->dev, "single event channel\n");
>  	}
>  
>  	err = xenbus_scanf(XBT_NIL, dev->otherend, "tx-ring-order", "%u",
> @@ -568,12 +587,19 @@ static int connect_rings(struct backend_info *be)
>  	/* Map the shared frame, irq etc. */
>  	err = xenvif_connect(vif, tx_ring_ref, (1U << tx_ring_order),
>  			     rx_ring_ref, (1U << rx_ring_order),
> -			     evtchn);
> +			     tx_evtchn, rx_evtchn);
>  	if (err) {
>  		/* construct 1 2 3 / 4 5 6 */
>  		int i;
>  		char txs[3 * (1U << MODPARM_netback_max_tx_ring_page_order)];
>  		char rxs[3 * (1U << MODPARM_netback_max_rx_ring_page_order)];
> +		char evtchns[20];
> +
> +		if (tx_evtchn == rx_evtchn)
> +			snprintf(evtchns, sizeof(evtchns)-1, "%u", tx_evtchn);
> +		else
> +			snprintf(evtchns, sizeof(evtchns)-1, "%u/%u",
> +				 tx_evtchn, rx_evtchn);
>  
>  		txs[0] = rxs[0] = 0;
>  
> @@ -586,8 +612,8 @@ static int connect_rings(struct backend_info *be)
>  				 " %lu", rx_ring_ref[i]);
>  
>  		xenbus_dev_fatal(dev, err,
> -				 "mapping shared-frames%s /%s port %u",
> -				 txs, rxs, evtchn);
> +				 "mapping shared-frames%s /%s port %s",
> +				 txs, rxs, evtchns);
>  		return err;
>  	}
>  	return 0;
> -- 
> 1.7.10.4
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 8/8] netfront: split event channels support
  2013-02-15 16:00 ` Wei Liu
  2013-03-04 21:24   ` Konrad Rzeszutek Wilk
@ 2013-03-04 21:24   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-04 21:24 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, ian.campbell, annie.li

On Fri, Feb 15, 2013 at 04:00:09PM +0000, Wei Liu wrote:
> If this feature is not activated, rx_irq = tx_irq. See corresponding netback
> change log for details.

To make it easier for people who use 'git log --grep', one usually says
"See xen/netfront: Use .. " for the frontend implementation. That way
you can just copy-and-paste the title of the patch into the search.

You should also include a description of the protocol in this patch.

And explain how it benefits the networking subsystem?

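Judging from the two patches, the negotiation seems to be (example
channel numbers invented here):

  1. netback writes feature-split-event-channels = "1" to its
     xenstore directory;
  2. netfront reads that key; if it is present and non-zero, the
     frontend allocates two event channels and writes
         event-channel-tx = "17"
         event-channel-rx = "18"
     otherwise it falls back to the traditional single
         event-channel = "17"
  3. netback binds xenvif_tx_interrupt/xenvif_rx_interrupt to the
     split channels, or the combined xenvif_interrupt when only
     event-channel is present.

Spelling that out in the commit message would help.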

> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  drivers/net/xen-netfront.c |  184 ++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 152 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index de73a71..ea9b656 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -100,7 +100,12 @@ struct netfront_info {
>  
>  	struct napi_struct napi;
>  
> -	unsigned int evtchn;
> +	/* 
> +	 * Split event channels support, tx_* == rx_* when using
> +	 * single event channel.
> +	 */
> +	unsigned int tx_evtchn, rx_evtchn;
> +	unsigned int tx_irq, rx_irq;
>  	struct xenbus_device *xbdev;
>  
>  	spinlock_t   tx_lock;
> @@ -344,7 +349,7 @@ no_skb:
>   push:
>  	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&np->rx, notify);
>  	if (notify)
> -		notify_remote_via_irq(np->netdev->irq);
> +		notify_remote_via_irq(np->rx_irq);
>  }
>  
>  static int xennet_open(struct net_device *dev)
> @@ -633,7 +638,7 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  
>  	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&np->tx, notify);
>  	if (notify)
> -		notify_remote_via_irq(np->netdev->irq);
> +		notify_remote_via_irq(np->tx_irq);
>  
>  	u64_stats_update_begin(&stats->syncp);
>  	stats->tx_bytes += skb->len;
> @@ -1263,26 +1268,41 @@ static int xennet_set_features(struct net_device *dev,
>  	return 0;
>  }
>  
> -static irqreturn_t xennet_interrupt(int irq, void *dev_id)
> +/* Used for tx completion */
> +static irqreturn_t xennet_tx_interrupt(int tx_irq, void *dev_id)
>  {
> -	struct net_device *dev = dev_id;
> -	struct netfront_info *np = netdev_priv(dev);
> +	struct netfront_info *np = dev_id;
> +	struct net_device *dev = np->netdev;
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&np->tx_lock, flags);
> +	xennet_tx_buf_gc(dev);
> +	spin_unlock_irqrestore(&np->tx_lock, flags);
>  
> -	if (likely(netif_carrier_ok(dev))) {
> -		xennet_tx_buf_gc(dev);
> -		/* Under tx_lock: protects access to rx shared-ring indexes. */
> -		if (RING_HAS_UNCONSUMED_RESPONSES(&np->rx))
> -			napi_schedule(&np->napi);
> -	}
> +	return IRQ_HANDLED;
> +}
>  
> -	spin_unlock_irqrestore(&np->tx_lock, flags);
> +/* Used for rx */
> +static irqreturn_t xennet_rx_interrupt(int rx_irq, void *dev_id)
> +{
> +	struct netfront_info *np = dev_id;
> +	struct net_device *dev = np->netdev;
> +
> +	if (likely(netif_carrier_ok(dev) &&
> +		   RING_HAS_UNCONSUMED_RESPONSES(&np->rx)))
> +		napi_schedule(&np->napi);
>  
>  	return IRQ_HANDLED;
>  }
>  
> +/* Used for single event channel configuration */
> +static irqreturn_t xennet_interrupt(int irq, void *dev_id)
> +{
> +	xennet_tx_interrupt(irq, dev_id);
> +	xennet_rx_interrupt(irq, dev_id);
> +	return IRQ_HANDLED;
> +}
> +
>  #ifdef CONFIG_NET_POLL_CONTROLLER
>  static void xennet_poll_controller(struct net_device *dev)
>  {
> @@ -1451,9 +1471,14 @@ static void xennet_disconnect_backend(struct netfront_info *info)
>  	spin_unlock_irq(&info->tx_lock);
>  	spin_unlock_bh(&info->rx_lock);
>  
> -	if (info->netdev->irq)
> -		unbind_from_irqhandler(info->netdev->irq, info->netdev);
> -	info->evtchn = info->netdev->irq = 0;
> +	if (info->tx_irq && (info->tx_irq == info->rx_irq))
> +		unbind_from_irqhandler(info->tx_irq, info);
> +	if (info->tx_irq && (info->tx_irq != info->rx_irq)) {
> +		unbind_from_irqhandler(info->tx_irq, info);
> +		unbind_from_irqhandler(info->rx_irq, info);
> +	}
> +	info->tx_evtchn = info->rx_evtchn = 0;
> +	info->tx_irq = info->rx_irq = 0;
>  
>  	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->tx.sring);
>  	free_pages((unsigned long)info->tx.sring, info->tx_ring_page_order);
> @@ -1503,11 +1528,86 @@ static int xen_net_read_mac(struct xenbus_device *dev, u8 mac[])
>  	return 0;
>  }
>  
> +static int setup_netfront_single(struct netfront_info *info)
> +{
> +	int err;
> +
> +	err = xenbus_alloc_evtchn(info->xbdev, &info->tx_evtchn);
> +	if (err < 0)
> +		goto fail;
> +
> +	err = bind_evtchn_to_irqhandler(info->tx_evtchn,
> +					xennet_interrupt,
> +					0, info->netdev->name, info);
> +	if (err < 0)
> +		goto bind_fail;
> +	info->rx_evtchn = info->tx_evtchn;
> +	info->rx_irq = info->tx_irq = err;
> +	dev_info(&info->xbdev->dev,
> +		 "single event channel, evtchn = %d, irq = %d\n",
> +		 info->tx_evtchn, info->tx_irq);
> +
> +	return 0;
> +
> +bind_fail:
> +	xenbus_free_evtchn(info->xbdev, info->tx_evtchn);
> +	info->tx_evtchn = 0;
> +fail:
> +	return err;
> +}
> +
> +static int setup_netfront_split(struct netfront_info *info)
> +{
> +	int err;
> +
> +	err = xenbus_alloc_evtchn(info->xbdev, &info->tx_evtchn);
> +	if (err)
> +		goto fail;
> +	err = xenbus_alloc_evtchn(info->xbdev, &info->rx_evtchn);
> +	if (err)
> +		goto alloc_rx_evtchn_fail;
> +
> +	err = bind_evtchn_to_irqhandler(info->tx_evtchn,
> +					xennet_tx_interrupt,
> +					0, info->netdev->name, info);
> +	if (err < 0)
> +		goto bind_tx_fail;
> +	info->tx_irq = err;
> +
> +	err = bind_evtchn_to_irqhandler(info->rx_evtchn,
> +					xennet_rx_interrupt,
> +					0, info->netdev->name, info);
> +	if (err < 0)
> +		goto bind_rx_fail;
> +
> +	info->rx_irq = err;
> +
> +	dev_info(&info->xbdev->dev,
> +		 "split event channels, tx_evtchn/irq = %d/%d, rx_evtchn/irq = %d/%d",
> +		 info->tx_evtchn, info->tx_irq,
> +		 info->rx_evtchn, info->rx_irq);
> +
> +	return 0;
> +
> +bind_rx_fail:
> +	unbind_from_irqhandler(info->tx_irq, info);
> +	info->tx_irq = 0;
> +bind_tx_fail:
> +	xenbus_free_evtchn(info->xbdev, info->rx_evtchn);
> +	info->rx_evtchn = 0;
> +alloc_rx_evtchn_fail:
> +	xenbus_free_evtchn(info->xbdev, info->tx_evtchn);
> +	info->tx_evtchn = 0;
> +fail:
> +	return err;
> +}
> +
>  static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
>  {
>  	struct xen_netif_tx_sring *txs;
>  	struct xen_netif_rx_sring *rxs;
>  	int err;
> +	unsigned int feature_split_evtchn;
>  	struct net_device *netdev = info->netdev;
>  	unsigned int max_tx_ring_page_order, max_rx_ring_page_order;
>  	int i;
> @@ -1527,6 +1627,12 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
>  	}
>  
>  	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
> +			   "feature-split-event-channels", "%u",
> +			   &feature_split_evtchn);
> +	if (err < 0)
> +		feature_split_evtchn = 0;
> +
> +	err = xenbus_scanf(XBT_NIL, info->xbdev->otherend,
>  			   "max-tx-ring-page-order", "%u",
>  			   &max_tx_ring_page_order);
>  	if (err < 0) {
> @@ -1598,20 +1704,17 @@ static int setup_netfront(struct xenbus_device *dev, struct netfront_info *info)
>  	if (err < 0)
>  		goto grant_rx_ring_fail;
>  
> -	err = xenbus_alloc_evtchn(dev, &info->evtchn);
> +	if (feature_split_evtchn)
> +		err = setup_netfront_split(info);
> +	else
> +		err = setup_netfront_single(info);
> +
>  	if (err)
> -		goto alloc_evtchn_fail;
> +		goto setup_evtchn_fail;
>  
> -	err = bind_evtchn_to_irqhandler(info->evtchn, xennet_interrupt,
> -					0, netdev->name, netdev);
> -	if (err < 0)
> -		goto bind_fail;
> -	netdev->irq = err;
>  	return 0;
>  
> -bind_fail:
> -	xenbus_free_evtchn(dev, info->evtchn);
> -alloc_evtchn_fail:
> +setup_evtchn_fail:
>  	xenbus_unmap_ring_vfree(info->xbdev, (void *)info->rx.sring);
>  grant_rx_ring_fail:
>  	free_pages((unsigned long)info->rx.sring, info->rx_ring_page_order);
> @@ -1696,11 +1799,26 @@ again:
>  		}
>  	}
>  
> -	err = xenbus_printf(xbt, dev->nodename,
> -			    "event-channel", "%u", info->evtchn);
> -	if (err) {
> -		message = "writing event-channel";
> -		goto abort_transaction;
> +	if (info->tx_evtchn == info->rx_evtchn) {
> +		err = xenbus_printf(xbt, dev->nodename,
> +				    "event-channel", "%u", info->tx_evtchn);
> +		if (err) {
> +			message = "writing event-channel";
> +			goto abort_transaction;
> +		}
> +	} else {
> +		err = xenbus_printf(xbt, dev->nodename,
> +				    "event-channel-tx", "%u", info->tx_evtchn);
> +		if (err) {
> +			message = "writing event-channel-tx";
> +			goto abort_transaction;
> +		}
> +		err = xenbus_printf(xbt, dev->nodename,
> +				    "event-channel-rx", "%u", info->rx_evtchn);
> +		if (err) {
> +			message = "writing event-channel-rx";
> +			goto abort_transaction;
> +		}
>  	}
>  
>  	err = xenbus_printf(xbt, dev->nodename, "request-rx-copy", "%u",
> @@ -1814,7 +1932,9 @@ static int xennet_connect(struct net_device *dev)
>  	 * packets.
>  	 */
>  	netif_carrier_on(np->netdev);
> -	notify_remote_via_irq(np->netdev->irq);
> +	notify_remote_via_irq(np->tx_irq);
> +	if (np->tx_irq != np->rx_irq)
> +		notify_remote_via_irq(np->rx_irq);
>  	xennet_tx_buf_gc(dev);
>  	xennet_alloc_rx_buffers(dev);
>  
> -- 
> 1.7.10.4
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 2/8] netback: add module unload function
  2013-02-15 16:00 ` Wei Liu
  2013-03-04 20:55   ` [Xen-devel] " Konrad Rzeszutek Wilk
  2013-03-04 20:55   ` Konrad Rzeszutek Wilk
@ 2013-03-04 21:58   ` Stephen Hemminger
  2013-03-05 13:30     ` Wei Liu
  2013-03-05 13:30     ` Wei Liu
  2013-03-04 21:58   ` Stephen Hemminger
  3 siblings, 2 replies; 91+ messages in thread
From: Stephen Hemminger @ 2013-03-04 21:58 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li

On Fri, 15 Feb 2013 16:00:03 +0000
Wei Liu <wei.liu2@citrix.com> wrote:

> Enable users to unload netback module. Users should make sure there is not vif
> runnig.


Isn't it likely that some admin might be trying to clean up or shut down and there
is an ordering problem? What happens if vifs are still running?

Why depend on users? You should allow it at any time, do refcounting, and auto-destroy the vifs for them.
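
Something like the pattern patch 3/8 in this series appears to use
(sketch only; surrounding code elided):

	/* in xenvif_connect(): one module reference per connected vif */
	__module_get(THIS_MODULE);

	/* in xenvif_disconnect(): drop the reference once the vif is
	 * torn down, so the module use count reaches zero only when
	 * no vif is left
	 */
	if (need_module_put)
		module_put(THIS_MODULE);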

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 3/8] netback: get/put module along with vif connect/disconnect
  2013-02-15 16:00 ` Wei Liu
                     ` (2 preceding siblings ...)
  2013-03-05 10:02   ` David Vrabel
@ 2013-03-05 10:02   ` David Vrabel
  2013-03-05 13:30     ` Wei Liu
  2013-03-05 13:30     ` [Xen-devel] " Wei Liu
  3 siblings, 2 replies; 91+ messages in thread
From: David Vrabel @ 2013-03-05 10:02 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li

On 15/02/13 16:00, Wei Liu wrote:
> If there is vif running and user unloads netback, guest's network interface
> just mysteriously stops working. So we need to prevent unloading netback
> module if there is vif running.

It's not mysterious -- it is cleanly disconnected, and will reconnect
when the module is reinserted.

Being able to unload modules while they are in use is standard so I
don't think this should be applied.

David

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring
  2013-02-15 16:00 ` [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring Wei Liu
                     ` (4 preceding siblings ...)
  2013-03-05 10:25   ` David Vrabel
@ 2013-03-05 10:25   ` David Vrabel
  5 siblings, 0 replies; 91+ messages in thread
From: David Vrabel @ 2013-03-05 10:25 UTC (permalink / raw)
  To: Wei Liu
  Cc: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li,
	Stefano Stabellini, Roger Pau Monne

On 15/02/13 16:00, Wei Liu wrote:
> Also bundle fixes for xen frontends and backends in this patch.

When changing APIs you don't need to call out required changes to
callers as "fixes".

> --- a/drivers/xen/xenbus/xenbus_client.c
> +++ b/drivers/xen/xenbus/xenbus_client.c
> @@ -357,17 +359,39 @@ static void xenbus_switch_fatal(struct xenbus_device *dev, int depth, int err,
>  /**
>   * xenbus_grant_ring
>   * @dev: xenbus device
> - * @ring_mfn: mfn of ring to grant
> -
> - * Grant access to the given @ring_mfn to the peer of the given device.  Return
> - * 0 on success, or -errno on error.  On error, the device will switch to
> + * @vaddr: starting virtual address of the ring
> + * @nr_pages: number of pages to be granted
> + * @grefs: grant reference array to be filled in
> + *
> + * Grant access to the given @vaddr to the peer of the given device.
> + * Then fill in @grefs with grant references.  Return 0 on success, or
> + * -errno on error.  On error, the device will switch to
>   * XenbusStateClosing, and the error will be saved in the store.
>   */
> -int xenbus_grant_ring(struct xenbus_device *dev, unsigned long ring_mfn)
> +int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
> +		      int nr_pages, int *grefs)

This call previously returned the grant ref in an int, but grant refs are
really of (unsigned) type grant_ref_t.  Make grefs an array of
grant_ref_t's.

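That is, presumably:

	int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
			      int nr_pages, grant_ref_t *grefs);
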
>  static int xenbus_map_ring_valloc_pv(struct xenbus_device *dev,
> -				     int gnt_ref, void **vaddr)
> +				     int *gnt_ref, int nr_grefs, void **vaddr)
[...]
> +	/* Issue hypercall for individual entry, rollback if error occurs. */
> +	for (i = 0; i < nr_grefs; i++) {
> +		op.flags = GNTMAP_host_map | GNTMAP_contains_pte;
> +		op.ref   = gnt_ref[i];
> +		op.dom   = dev->otherend_id;
> +		op.host_addr = arbitrary_virt_to_machine(pte[i]).maddr;
> +
> +		if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
> +			BUG();

I think these hypercalls should be batched into one
GNTTABOP_map_grant_ref call.  I think this will make your 'rollback'
easier as well.

Similarly, for all the other places you have map/unmap in a loop.

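Roughly (sketch only; XENBUS_MAX_RING_PAGES is assumed here as the
page-count counterpart of XENBUS_MAX_RING_PAGE_ORDER):

	struct gnttab_map_grant_ref op[XENBUS_MAX_RING_PAGES];
	int i;

	for (i = 0; i < nr_grefs; i++) {
		op[i].flags = GNTMAP_host_map | GNTMAP_contains_pte;
		op[i].ref   = gnt_ref[i];
		op[i].dom   = dev->otherend_id;
		op[i].host_addr = arbitrary_virt_to_machine(pte[i]).maddr;
	}

	if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, op, nr_grefs))
		BUG();

	/* then inspect each op[i].status and unmap the entries that
	 * did succeed if any single one failed */
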
> +int xenbus_map_ring(struct xenbus_device *dev, int *gnt_ref, int nr_grefs,
> +		    grant_handle_t *handle, void *vaddr, int *vma_leaked)

grant_ref_t for the array.

> @@ -195,15 +200,17 @@ int xenbus_watch_pathfmt(struct xenbus_device *dev, struct xenbus_watch *watch,
>  			 const char *pathfmt, ...);
>  
>  int xenbus_switch_state(struct xenbus_device *dev, enum xenbus_state new_state);
> -int xenbus_grant_ring(struct xenbus_device *dev, unsigned long ring_mfn);
> +int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
> +		      int nr_gages, int *grefs);

nr_pages?

David

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 5/8] netback: multi-page ring support
  2013-02-15 16:00 ` Wei Liu
                     ` (2 preceding siblings ...)
  2013-03-05 10:41   ` David Vrabel
@ 2013-03-05 10:41   ` David Vrabel
  3 siblings, 0 replies; 91+ messages in thread
From: David Vrabel @ 2013-03-05 10:41 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, ian.campbell, konrad.wilk, annie.li

On 15/02/13 16:00, Wei Liu wrote:
> [nothing]

You need to describe the protocol used to negotiate this.  What happens
when a frontend without such support connects to a backend with support?
And vice-versa?

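Judging from the xenbus code below, it seems to be (ring refs
invented here):

  netback writes:    max-tx-ring-page-order = "2"
  new netfront:      tx-ring-order = "1"
                     tx-ring-ref0 = "8", tx-ring-ref1 = "9"
  old netfront:      tx-ring-ref = "8"  (i.e. a single page)

and the same again for rx -- but that belongs in the commit message.
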
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -45,6 +45,12 @@
>  #include <xen/grant_table.h>
>  #include <xen/xenbus.h>
>  
> +#define NETBK_MAX_RING_PAGE_ORDER XENBUS_MAX_RING_PAGE_ORDER
> +#define NETBK_MAX_RING_PAGES      (1U << NETBK_MAX_RING_PAGE_ORDER)
> +
> +#define NETBK_MAX_TX_RING_SIZE XEN_NETIF_TX_RING_SIZE(NETBK_MAX_RING_PAGES)
> +#define NETBK_MAX_RX_RING_SIZE XEN_NETIF_RX_RING_SIZE(NETBK_MAX_RING_PAGES)

See comment below.

> @@ -105,15 +113,19 @@ static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
>  	return to_xenbus_device(vif->dev->dev.parent);
>  }
>  
> -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> +#define XEN_NETIF_TX_RING_SIZE(_nr_pages)		\
> +	__CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE * (_nr_pages))

Ring size is no longer const.

Would these be better as inline functions with a struct xenvif parameter?

Would need to fix up the MAX_XX_RING_SIZE macros above.

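e.g. something like (untested; the nr_tx_handles field name is a
guess based on the rx side):

	static inline unsigned int xen_netif_tx_ring_size(struct xenvif *vif)
	{
		return __RING_SIZE(vif->tx.sring,
				   PAGE_SIZE * vif->nr_tx_handles);
	}
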
> index db638e1..fa4d46d 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -305,10 +305,16 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
>  	return vif;
>  }
>  
> -int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
> -		   unsigned long rx_ring_ref, unsigned int evtchn)
> +int xenvif_connect(struct xenvif *vif,
> +		   unsigned long *tx_ring_ref, unsigned int tx_ring_ref_count,
> +		   unsigned long *rx_ring_ref, unsigned int rx_ring_ref_count,
> +		   unsigned int evtchn)
>  {
>  	int err = -ENOMEM;
> +	void *addr;
> +	struct xen_netif_tx_sring *txs;
> +	struct xen_netif_rx_sring *rxs;
> +	int tmp[NETBK_MAX_RING_PAGES], i;

grant_ref_t, and elsewhere probably -- I didn't check.

> @@ -382,7 +413,8 @@ void xenvif_disconnect(struct xenvif *vif)
>  
>  	unregister_netdev(vif->dev);
>  
> -	xen_netbk_unmap_frontend_rings(vif);
> +	xen_netbk_unmap_frontend_rings(vif, (void *)vif->tx.sring);
> +	xen_netbk_unmap_frontend_rings(vif, (void *)vif->rx.sring);

Don't need the casts here.

> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -47,6 +47,19 @@
>  #include <asm/xen/hypercall.h>
>  #include <asm/xen/page.h>
>  
> +unsigned int MODPARM_netback_max_rx_ring_page_order = NETBK_MAX_RING_PAGE_ORDER;
> +module_param_named(netback_max_rx_ring_page_order,
> +		   MODPARM_netback_max_rx_ring_page_order, uint, 0);

Please don't prefix new module parameters with "netback",
"max_rx_ring_page_order" is fine.

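i.e. something like:

	module_param_named(max_rx_ring_page_order,
			   MODPARM_netback_max_rx_ring_page_order, uint, 0);
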
> +MODULE_PARM_DESC(netback_max_rx_ring_page_order,
> +		 "Maximum supported receiver ring page order");
> +
> +unsigned int MODPARM_netback_max_tx_ring_page_order = NETBK_MAX_RING_PAGE_ORDER;
> +module_param_named(netback_max_tx_ring_page_order,
> +		   MODPARM_netback_max_tx_ring_page_order, uint, 0);

Ditto.

> +MODULE_PARM_DESC(netback_max_tx_ring_page_order,
> +		 "Maximum supported transmitter ring page order");
> +
> +
[...]
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
[...]
> +	err = xenbus_scanf(XBT_NIL, dev->otherend, "tx-ring-order", "%u",
> +			   &tx_ring_order);
> +	if (err < 0) {
> +		tx_ring_order = 0;
> +
> +		err = xenbus_scanf(XBT_NIL, dev->otherend, "tx-ring-ref", "%lu",
> +				   &tx_ring_ref[0]);
> +		if (err < 0) {
> +			xenbus_dev_fatal(dev, err, "reading %s/tx-ring-ref",
> +					 dev->otherend);
> +			return err;
> +		}
> +	} else {
> +		unsigned int i;
> +
> +		if (tx_ring_order > MODPARM_netback_max_tx_ring_page_order) {
> +			err = -EINVAL;
> +			xenbus_dev_fatal(dev, err,
> +					 "%s/tx-ring-page-order too big",
> +					 dev->otherend);
> +			return err;
> +		}
> +
> +		for (i = 0; i < (1U << tx_ring_order); i++) {
> +			char ring_ref_name[sizeof("tx-ring-ref") + 2];
> +
> +			snprintf(ring_ref_name, sizeof(ring_ref_name),
> +				 "tx-ring-ref%u", i);
> +
> +			err = xenbus_scanf(XBT_NIL, dev->otherend,
> +					   ring_ref_name, "%lu",
> +					   &tx_ring_ref[i]);
> +			if (err < 0) {
> +				xenbus_dev_fatal(dev, err,
> +						 "reading %s/%s",
> +						 dev->otherend,
> +						 ring_ref_name);
> +				return err;
> +			}
> +		}

Refactor this whole if/else block and the similar code below for rx into
a common library function?

It will be useful for blkback etc. as well.

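Perhaps something of this shape (name and signature invented here
purely as illustration):

	int xenbus_read_ring_refs(struct xenbus_device *dev,
				  const char *prefix,	/* "tx" / "rx" */
				  unsigned int max_order,
				  unsigned long *ring_ref,
				  unsigned int *ring_order);

covering both the bare "-ring-ref" key and the "-ring-order" plus
"-ring-ref%u" forms, so blkback could reuse it too.
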
> @@ -454,11 +566,28 @@ static int connect_rings(struct backend_info *be)
>  	vif->csum = !val;
>  
>  	/* Map the shared frame, irq etc. */
> -	err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn);
> +	err = xenvif_connect(vif, tx_ring_ref, (1U << tx_ring_order),
> +			     rx_ring_ref, (1U << rx_ring_order),
> +			     evtchn);
>  	if (err) {
> +		/* construct 1 2 3 / 4 5 6 */

This comment doesn't make sense to me.

David

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 2/8] netback: add module unload function
  2013-03-04 21:58   ` Stephen Hemminger
  2013-03-05 13:30     ` Wei Liu
@ 2013-03-05 13:30     ` Wei Liu
  1 sibling, 0 replies; 91+ messages in thread
From: Wei Liu @ 2013-03-05 13:30 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: wei.liu2, xen-devel, netdev, Ian Campbell, konrad.wilk, annie.li

On Mon, 2013-03-04 at 21:58 +0000, Stephen Hemminger wrote:
> On Fri, 15 Feb 2013 16:00:03 +0000
> Wei Liu <wei.liu2@citrix.com> wrote:
> 
> > Enable users to unload netback module. Users should make sure there is not vif
> > runnig.
> 
> 
> Isn't it likely that some admin might be trying to cleanup or do shutdown and there
> is an ordering problem. What happens if vif's are still running?
> 
> Why depend on users? You should allow it anytime and do refcounting and auto-destroy the vif's for them.

Sure, we should not depend on users. I will move my vif get/put module
patch before this one.


Wei.

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 2/8] netback: add module unload function
  2013-03-04 20:55   ` [Xen-devel] " Konrad Rzeszutek Wilk
  2013-03-04 20:58     ` Andrew Cooper
  2013-03-04 20:58     ` [Xen-devel] " Andrew Cooper
@ 2013-03-05 13:30     ` Wei Liu
  2013-03-05 13:30     ` Wei Liu
  3 siblings, 0 replies; 91+ messages in thread
From: Wei Liu @ 2013-03-05 13:30 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: wei.liu2, xen-devel, netdev, Ian Campbell, annie.li

On Mon, 2013-03-04 at 20:55 +0000, Konrad Rzeszutek Wilk wrote:
> On Fri, Feb 15, 2013 at 04:00:03PM +0000, Wei Liu wrote:
> > Enable users to unload netback module. Users should make sure there is not vif
> > runnig.
> 
> 'sure there are no vif's running.'
> 
> Any way of making this VIF part be automatic? Meaning that netback
> can figure out if there are VIFs running and if so don't unload
> all of the parts and just mention that you are leaking memory.
> 
> This looks quite dangerous - meaning if there are guests running and
> we for fun do 'rmmod xen_netback' it looks like we could crash dom0?
> 

Dom0 will not crash. But as you suggested in a later email, I should
move the get/put module patch before this one.

The rationale behind this patch is that if anything wrong is discovered
inside netback, we can just migrate all VMs to a new host, unload the
old netback, load the new one, then migrate all VMs back. This should be
useful for both production and testing.


Wei.

> > 
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > ---
> >  drivers/net/xen-netback/common.h  |    1 +
> >  drivers/net/xen-netback/netback.c |   18 ++++++++++++++++++
> >  drivers/net/xen-netback/xenbus.c  |    5 +++++
> >  3 files changed, 24 insertions(+)
> > 
> > diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> > index 9d7f172..35d8772 100644
> > --- a/drivers/net/xen-netback/common.h
> > +++ b/drivers/net/xen-netback/common.h
> > @@ -120,6 +120,7 @@ void xenvif_get(struct xenvif *vif);
> >  void xenvif_put(struct xenvif *vif);
> >  
> >  int xenvif_xenbus_init(void);
> > +void xenvif_xenbus_exit(void);
> >  
> >  int xenvif_schedulable(struct xenvif *vif);
> >  
> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> > index db8d45a..de59098 100644
> > --- a/drivers/net/xen-netback/netback.c
> > +++ b/drivers/net/xen-netback/netback.c
> > @@ -1761,5 +1761,23 @@ failed_init:
> >  
> >  module_init(netback_init);
> >  
> > +static void __exit netback_exit(void)
> > +{
> > +	int group, i;
> > +	xenvif_xenbus_exit();
> 
> You should check the return code of this function.
> 
> > +	for (group = 0; group < xen_netbk_group_nr; group++) {
> > +		struct xen_netbk *netbk = &xen_netbk[group];
> > +		for (i = 0; i < MAX_PENDING_REQS; i++) {
> > +			if (netbk->mmap_pages[i])
> > +				__free_page(netbk->mmap_pages[i]);
> > +		}
> > +		del_timer_sync(&netbk->net_timer);
> > +		kthread_stop(netbk->task);
> > +	}
> > +	vfree(xen_netbk);
> > +}
> > +
> > +module_exit(netback_exit);
> > +
> >  MODULE_LICENSE("Dual BSD/GPL");
> >  MODULE_ALIAS("xen-backend:vif");
> > diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> > index 410018c..65d14f2 100644
> > --- a/drivers/net/xen-netback/xenbus.c
> > +++ b/drivers/net/xen-netback/xenbus.c
> > @@ -485,3 +485,8 @@ int xenvif_xenbus_init(void)
> >  {
> >  	return xenbus_register_backend(&netback_driver);
> >  }
> > +
> > +void xenvif_xenbus_exit(void)
> > +{
> > +	return xenbus_unregister_driver(&netback_driver);
> > +}
> > -- 
> > 1.7.10.4
> > 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 1/8] netback: don't bind kthread to cpu
  2013-03-04 20:51   ` Konrad Rzeszutek Wilk
  2013-03-05 13:30     ` Wei Liu
@ 2013-03-05 13:30     ` Wei Liu
  2013-03-05 13:56       ` Konrad Rzeszutek Wilk
  2013-03-05 13:56       ` [Xen-devel] " Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 91+ messages in thread
From: Wei Liu @ 2013-03-05 13:30 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: wei.liu2, xen-devel, netdev, Ian Campbell, annie.li

On Mon, 2013-03-04 at 20:51 +0000, Konrad Rzeszutek Wilk wrote:
> On Fri, Feb 15, 2013 at 04:00:02PM +0000, Wei Liu wrote:
> > The initialization process makes an assumption that the online cpus are
> > numbered from 0 to xen_netbk_group_nr-1,  which is not always true.
> 
> And xen_netbk_group_nr is num_online_cpus()?
> 

Yes.

> So under what conditions does this change? Is this when the CPU hotplug
> is involved and the CPUs go offline? In which case should there be a

Yes, the hotplug path.

> CPU hotplug notifier to re-bind the workers are appropiate?
> 
> > 
> > As we only need a pool of worker threads, simply don't bind them to specific
> > cpus.
> 
> OK. Is there another method of doing this? Are there patches to make the thread
> try to be vCPU->guest affinite?
> 

No, not at the moment.


Wei.

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 3/8] netback: get/put module along with vif connect/disconnect
  2013-03-05 10:02   ` [Xen-devel] " David Vrabel
  2013-03-05 13:30     ` Wei Liu
@ 2013-03-05 13:30     ` Wei Liu
  2013-03-05 14:07       ` David Vrabel
  2013-03-05 14:07       ` [Xen-devel] " David Vrabel
  1 sibling, 2 replies; 91+ messages in thread
From: Wei Liu @ 2013-03-05 13:30 UTC (permalink / raw)
  To: David Vrabel
  Cc: wei.liu2, xen-devel, netdev, Ian Campbell, konrad.wilk, annie.li

On Tue, 2013-03-05 at 10:02 +0000, David Vrabel wrote:
> On 15/02/13 16:00, Wei Liu wrote:
> > If there is vif running and user unloads netback, guest's network interface
> > just mysteriously stops working. So we need to prevent unloading netback
> > module if there is vif running.
> 
> It's not mysterious -- it is cleanly disconnected, and will reconnect
> when the module is reinserted.
> 

From a guest's POV, it just stops without any sign. This should be
prevented IMHO.

Netback / netfront lose all state when netback is unloaded, and
netfront doesn't support reconfiguration at the moment. My guess is that
this is why netback didn't even have an unload function in the first
place.

> Being able to unload modules while they are in use is standard so I
> don't think this should be applied.
> 

I don't think this is true from a module dependency point of view - just
try to unload any in-use module and rmmod / modprobe will give you a
fatal error.

The situation for netback is a bit different: the module dependency is
not within the same kernel, so we can only express it by explicitly
getting/putting this module.
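
Roughly (untested sketch of what the get/put patch does):

	/* in xenvif_connect(): pin netback while the vif is live */
	if (!try_module_get(THIS_MODULE))
		return -ENODEV;

	/* in xenvif_disconnect(): drop the reference again */
	module_put(THIS_MODULE);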


Wei.

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 1/8] netback: don't bind kthread to cpu
  2013-03-05 13:30     ` Wei Liu
  2013-03-05 13:56       ` Konrad Rzeszutek Wilk
@ 2013-03-05 13:56       ` Konrad Rzeszutek Wilk
  2013-03-05 14:04         ` Wei Liu
                           ` (3 more replies)
  1 sibling, 4 replies; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-05 13:56 UTC (permalink / raw)
  To: Wei Liu; +Cc: netdev, annie.li, Ian Campbell, xen-devel

On Tue, Mar 05, 2013 at 01:30:10PM +0000, Wei Liu wrote:
> On Mon, 2013-03-04 at 20:51 +0000, Konrad Rzeszutek Wilk wrote:
> > On Fri, Feb 15, 2013 at 04:00:02PM +0000, Wei Liu wrote:
> > > The initialization process makes an assumption that the online cpus are
> > > numbered from 0 to xen_netbk_group_nr-1,  which is not always true.
> > 
> > And xen_netbk_group_nr is num_online_cpus()?
> > 
> 
> Yes.
> 
> > So under what conditions does this change? Is this when the CPU hotplug
> > is involved and the CPUs go offline? In which case should there be a
> 
> Yes, the hotplug path.
> 
> > CPU hotplug notifier to re-bind the workers are appropiate?

?
Can't that option be explored?
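
Something like this, say (hypothetical, untested):

static int netbk_cpu_callback(struct notifier_block *nfb,
			      unsigned long action, void *hcpu)
{
	switch (action & ~CPU_TASKS_FROZEN) {
	case CPU_ONLINE:
		/* bind an unbound worker to the newly onlined cpu */
		break;
	case CPU_DOWN_PREPARE:
		/* let the departing cpu's worker float again */
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block netbk_cpu_nb = {
	.notifier_call = netbk_cpu_callback,
};

/* in netback_init(): */
register_cpu_notifier(&netbk_cpu_nb);
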
> > 
> > > 
> > > As we only need a pool of worker threads, simply don't bind them to specific
> > > cpus.
> > 
> > OK. Is there another method of doing this? Are there patches to make the thread
> > try to be vCPU->guest affinite?
> > 
> 
> No, not at the moment.
> 
> 
> Wei.
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 1/8] netback: don't bind kthread to cpu
  2013-03-05 13:56       ` [Xen-devel] " Konrad Rzeszutek Wilk
@ 2013-03-05 14:04         ` Wei Liu
  2013-03-05 14:04         ` Wei Liu
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 91+ messages in thread
From: Wei Liu @ 2013-03-05 14:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: wei.liu2, netdev, annie.li, Ian Campbell, xen-devel

On Tue, 2013-03-05 at 13:56 +0000, Konrad Rzeszutek Wilk wrote:
> On Tue, Mar 05, 2013 at 01:30:10PM +0000, Wei Liu wrote:
> > On Mon, 2013-03-04 at 20:51 +0000, Konrad Rzeszutek Wilk wrote:
> > > On Fri, Feb 15, 2013 at 04:00:02PM +0000, Wei Liu wrote:
> > > > The initialization process makes an assumption that the online cpus are
> > > > numbered from 0 to xen_netbk_group_nr-1,  which is not always true.
> > > 
> > > And xen_netbk_group_nr is num_online_cpus()?
> > > 
> > 
> > Yes.
> > 
> > > So under what conditions does this change? Is this when the CPU hotplug
> > > is involved and the CPUs go offline? In which case should there be a
> > 
> > Yes, the hotplug path.
> > 
> > > CPU hotplug notifier to re-bind the workers are appropiate?
> 
> ?
> Can't that option be explored?

Thinking about it. Should be doable.


Wei.

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 3/8] netback: get/put module along with vif connect/disconnect
  2013-03-05 13:30     ` [Xen-devel] " Wei Liu
  2013-03-05 14:07       ` David Vrabel
@ 2013-03-05 14:07       ` David Vrabel
  2013-03-05 14:44         ` Wei Liu
                           ` (3 more replies)
  1 sibling, 4 replies; 91+ messages in thread
From: David Vrabel @ 2013-03-05 14:07 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, Ian Campbell, konrad.wilk, annie.li

On 05/03/13 13:30, Wei Liu wrote:
> On Tue, 2013-03-05 at 10:02 +0000, David Vrabel wrote:
>> On 15/02/13 16:00, Wei Liu wrote:
>>> If there is vif running and user unloads netback, guest's network interface
>>> just mysteriously stops working. So we need to prevent unloading netback
>>> module if there is vif running.
>>
>> It's not mysterious -- it is cleanly disconnected, and will reconnect
>> when the module is reinserted.
>>
> 
> From a guest's POV, it just stops without any sign. This should be
> prevented IMHO.

This is a bug in the frontend or a bug in the backend failing to
disconnect correctly.

I posted a series of "xen-foofront: handle backend CLOSED without
CLOSING" patches that may help here. (It didn't get applied to netfront
for some reason.)
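
The gist is to treat an unexpected Closed as if Closing had been seen,
in each frontend's otherend_changed handler -- roughly this shape (from
memory, untested):

	case XenbusStateClosed:
		if (dev->state == XenbusStateClosed)
			break;
		/* missed the backend's Closing state: fall through */
	case XenbusStateClosing:
		/* disconnect, then wait for the backend to reconnect */
		break;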

Disabling module unload doesn't prevent this from happening anyway.  You
can always manually unbind the backend device from the xen-netback
driver, which has the same effect as unloading the module.

> Netback / netfront lose all states when netback is unloaded. And
> netfront doesn't support reconfiguration at the moment. My guess is that
> this is the reason why netback doesn't even have unload function at
> first.

If netfront cannot handle reconnect then that's a bug in the frontend or
a bug in the backend xenbus code not setting up the reconnect correctly.

>> Being able to unload modules while they are in use is standard so I
>> don't think this should be applied.
> 
> I don't think this is true from a module dependency point of view - just
> try to unload any in use module, rmmod / modprobe will give you a fatal
> error.

Try it with any other network interface driver and it will unload just fine.

David

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 1/8] netback: don't bind kthread to cpu
  2013-03-05 13:56       ` [Xen-devel] " Konrad Rzeszutek Wilk
                           ` (2 preceding siblings ...)
  2013-03-05 14:42         ` David Vrabel
@ 2013-03-05 14:42         ` David Vrabel
  2013-03-05 15:52           ` Konrad Rzeszutek Wilk
  2013-03-05 15:52           ` Konrad Rzeszutek Wilk
  3 siblings, 2 replies; 91+ messages in thread
From: David Vrabel @ 2013-03-05 14:42 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Wei Liu, netdev, annie.li, Ian Campbell, xen-devel

On 05/03/13 13:56, Konrad Rzeszutek Wilk wrote:
> On Tue, Mar 05, 2013 at 01:30:10PM +0000, Wei Liu wrote:
>> On Mon, 2013-03-04 at 20:51 +0000, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Feb 15, 2013 at 04:00:02PM +0000, Wei Liu wrote:
>>>> The initialization process makes an assumption that the online cpus are
>>>> numbered from 0 to xen_netbk_group_nr-1,  which is not always true.
>>>
>>> And xen_netbk_group_nr is num_online_cpus()?
>>>
>>
>> Yes.
>>
>>> So under what conditions does this change? Is this when the CPU hotplug
>>> is involved and the CPUs go offline? In which case should there be a
>>
>> Yes, the hotplug path.
>>
>>> CPU hotplug notifier to re-bind the workers are appropiate?
> 
> ?
> Can't that option be explored?

I'm not sure binding netback threads to particular VCPUs is useful
without also binding the events to the corresponding VCPUs.

I would hope that the scheduler would tend towards the correct behavior
if threads aren't bound, anyway.
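
Binding the events would be something along these lines (sketch only;
the evtchn layer would also need to rebind the channel to that vcpu):

	irq_set_affinity_hint(vif->irq, cpumask_of(cpu));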

David

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 3/8] netback: get/put module along with vif connect/disconnect
  2013-03-05 14:07       ` [Xen-devel] " David Vrabel
@ 2013-03-05 14:44         ` Wei Liu
  2013-03-05 14:44         ` Wei Liu
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 91+ messages in thread
From: Wei Liu @ 2013-03-05 14:44 UTC (permalink / raw)
  To: David Vrabel
  Cc: wei.liu2, xen-devel, netdev, Ian Campbell, konrad.wilk, annie.li

On Tue, 2013-03-05 at 14:07 +0000, David Vrabel wrote:
> On 05/03/13 13:30, Wei Liu wrote:
> > On Tue, 2013-03-05 at 10:02 +0000, David Vrabel wrote:
> >> On 15/02/13 16:00, Wei Liu wrote:
> >>> If there is vif running and user unloads netback, guest's network interface
> >>> just mysteriously stops working. So we need to prevent unloading netback
> >>> module if there is vif running.
> >>
> >> It's not mysterious -- it is cleanly disconnected, and will reconnect
> >> when the module is reinserted.
> >>
> > 
> > From a guest's POV, it just stops without any sign. This should be
> > prevented IMHO.
> 
> This is a bug in the frontend or a bug in the backend failing to
> disconnect correctly.
> 
> I posted a series of "xen-foofront: handle backend CLOSED without
> CLOSING" patches that may help here. (I didn't get applied to netfront
> for some reason.)
> 

Any links? And the reason why it was not applied?

> Disabling module unload doesn't prevent this from happening away.  You
> can always manually unbind the backend device from the xen-netback
> driver which has the same effect as unloading the module.
> 

Yes, but that's not a normal use case.

> > Netback / netfront lose all states when netback is unloaded. And
> > netfront doesn't support reconfiguration at the moment. My guess is that
> > this is the reason why netback doesn't even have unload function at
> > first.
> 
> If netfront cannot handle reconnect then that's a bug in the frontend or
> a bug in the backend xenbus code not setting up the reconnect correctly.
> 

AFAICT, various frontends (hvc, fb, blk etc.) don't respond to the
Closing / Closed / Reconfigur{ing,ed} XenbusState changes. Is that the
"bug" you're referring to?

> >> Being able to unload modules while they are in use is standard so I
> >> don't think this should be applied.
> > 
> > I don't think this is true from a module dependency point of view - just
> > try to unload any in use module, rmmod / modprobe will give you a fatal
> > error.
> 
> Try it with any other network interface driver and it will unload just fine.
> 

I don't think we are talking about the same thing here...

If you unload a network module in Dom0, that's fine, because you lose
your interface in Dom0 as well. But for a DomU, the frontend doesn't
know about this.



Wei.

> David

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 1/8] netback: don't bind kthread to cpu
  2013-03-05 14:42         ` [Xen-devel] " David Vrabel
@ 2013-03-05 15:52           ` Konrad Rzeszutek Wilk
  2013-03-05 15:52           ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-05 15:52 UTC (permalink / raw)
  To: David Vrabel; +Cc: Wei Liu, netdev, annie.li, Ian Campbell, xen-devel

On Tue, Mar 05, 2013 at 02:42:26PM +0000, David Vrabel wrote:
> On 05/03/13 13:56, Konrad Rzeszutek Wilk wrote:
> > On Tue, Mar 05, 2013 at 01:30:10PM +0000, Wei Liu wrote:
> >> On Mon, 2013-03-04 at 20:51 +0000, Konrad Rzeszutek Wilk wrote:
> >>> On Fri, Feb 15, 2013 at 04:00:02PM +0000, Wei Liu wrote:
> >>>> The initialization process makes an assumption that the online cpus are
> >>>> numbered from 0 to xen_netbk_group_nr-1,  which is not always true.
> >>>
> >>> And xen_netbk_group_nr is num_online_cpus()?
> >>>
> >>
> >> Yes.
> >>
> >>> So under what conditions does this change? Is this when the CPU hotplug
> >>> is involved and the CPUs go offline? In which case should there be a
> >>
> >> Yes, the hotplug path.
> >>
> >>> CPU hotplug notifier to re-bind the workers are appropiate?
> > 
> > ?
> > Can't that option be explored?
> 
> I'm not sure binding netback threads to particular VCPUs is useful
> without also binding the events to the corresponding VCPUs.
> 
> I would hope that the scheduler would tend to the correct behavior if
> threads aren't bound, anyway.

That is fine too. If that is what we want, we just need to make the git
commit message more descriptive about why we don't want to bind
to VCPUs.
> 
> David

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [Xen-devel] [PATCH 3/8] netback: get/put module along with vif connect/disconnect
  2013-03-05 14:07       ` [Xen-devel] " David Vrabel
                           ` (2 preceding siblings ...)
  2013-03-05 15:53         ` Konrad Rzeszutek Wilk
@ 2013-03-05 15:53         ` Konrad Rzeszutek Wilk
  3 siblings, 0 replies; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-05 15:53 UTC (permalink / raw)
  To: David Vrabel; +Cc: Wei Liu, netdev, annie.li, Ian Campbell, xen-devel

On Tue, Mar 05, 2013 at 02:07:42PM +0000, David Vrabel wrote:
> On 05/03/13 13:30, Wei Liu wrote:
> > On Tue, 2013-03-05 at 10:02 +0000, David Vrabel wrote:
> >> On 15/02/13 16:00, Wei Liu wrote:
> >>> If there is vif running and user unloads netback, guest's network interface
> >>> just mysteriously stops working. So we need to prevent unloading netback
> >>> module if there is vif running.
> >>
> >> It's not mysterious -- it is cleanly disconnected, and will reconnect
> >> when the module is reinserted.
> >>
> > 
> > From a guest's POV, it just stops without any sign. This should be
> > prevented IMHO.
> 
> This is a bug in the frontend or a bug in the backend failing to
> disconnect correctly.
> 
> I posted a series of "xen-foofront: handle backend CLOSED without
> CLOSING" patches that may help here. (I didn't get applied to netfront
> for some reason.)

Hm, could you resend it please and make sure that the networking
maintainer is on the To list?

> 
> Disabling module unload doesn't prevent this from happening away.  You
> can always manually unbind the backend device from the xen-netback
> driver which has the same effect as unloading the module.
> 
> > Netback / netfront lose all states when netback is unloaded. And
> > netfront doesn't support reconfiguration at the moment. My guess is that
> > this is the reason why netback doesn't even have unload function at
> > first.
> 
> If netfront cannot handle reconnect then that's a bug in the frontend or
> a bug in the backend xenbus code not setting up the reconnect correctly.
> 
> >> Being able to unload modules while they are in use is standard so I
> >> don't think this should be applied.
> > 
> > I don't think this is true from a module dependency point of view - just
> > try to unload any in use module, rmmod / modprobe will give you a fatal
> > error.
> 
> Try it with any other network interface driver and it will unload just fine.
> 
> David
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

end of thread, other threads:[~2013-03-05 15:53 UTC | newest]

Thread overview: 91+ messages
2013-02-15 16:00 [PATCH 0/8] Bugfix and mechanical works for Xen network driver Wei Liu
2013-02-15 16:00 ` [PATCH 1/8] netback: don't bind kthread to cpu Wei Liu
2013-03-04 20:51   ` Konrad Rzeszutek Wilk
2013-03-05 13:30     ` Wei Liu
2013-03-05 13:30     ` Wei Liu
2013-03-05 13:56       ` Konrad Rzeszutek Wilk
2013-03-05 13:56       ` [Xen-devel] " Konrad Rzeszutek Wilk
2013-03-05 14:04         ` Wei Liu
2013-03-05 14:04         ` Wei Liu
2013-03-05 14:42         ` David Vrabel
2013-03-05 14:42         ` [Xen-devel] " David Vrabel
2013-03-05 15:52           ` Konrad Rzeszutek Wilk
2013-03-05 15:52           ` Konrad Rzeszutek Wilk
2013-03-04 20:51   ` Konrad Rzeszutek Wilk
2013-02-15 16:00 ` Wei Liu
2013-02-15 16:00 ` [PATCH 2/8] netback: add module unload function Wei Liu
2013-02-15 16:00 ` Wei Liu
2013-03-04 20:55   ` [Xen-devel] " Konrad Rzeszutek Wilk
2013-03-04 20:58     ` Andrew Cooper
2013-03-04 20:58     ` [Xen-devel] " Andrew Cooper
2013-03-05 13:30     ` Wei Liu
2013-03-05 13:30     ` Wei Liu
2013-03-04 20:55   ` Konrad Rzeszutek Wilk
2013-03-04 21:58   ` Stephen Hemminger
2013-03-05 13:30     ` Wei Liu
2013-03-05 13:30     ` Wei Liu
2013-03-04 21:58   ` Stephen Hemminger
2013-02-15 16:00 ` [PATCH 3/8] netback: get/put module along with vif connect/disconnect Wei Liu
2013-02-15 16:00 ` Wei Liu
2013-03-04 20:56   ` Konrad Rzeszutek Wilk
2013-03-04 20:56   ` Konrad Rzeszutek Wilk
2013-03-05 10:02   ` David Vrabel
2013-03-05 10:02   ` [Xen-devel] " David Vrabel
2013-03-05 13:30     ` Wei Liu
2013-03-05 13:30     ` [Xen-devel] " Wei Liu
2013-03-05 14:07       ` David Vrabel
2013-03-05 14:07       ` [Xen-devel] " David Vrabel
2013-03-05 14:44         ` Wei Liu
2013-03-05 14:44         ` Wei Liu
2013-03-05 15:53         ` Konrad Rzeszutek Wilk
2013-03-05 15:53         ` [Xen-devel] " Konrad Rzeszutek Wilk
2013-02-15 16:00 ` [PATCH 4/8] xenbus_client: Extend interface to support multi-page ring Wei Liu
2013-02-15 16:17   ` Jan Beulich
2013-02-15 16:17   ` [Xen-devel] " Jan Beulich
2013-02-15 16:33     ` Wei Liu
2013-02-15 16:59       ` Jan Beulich
2013-02-15 16:59       ` [Xen-devel] " Jan Beulich
2013-02-15 17:01         ` Wei Liu
2013-02-15 17:01         ` [Xen-devel] " Wei Liu
2013-02-15 16:33     ` Wei Liu
2013-03-04 21:12   ` Konrad Rzeszutek Wilk
2013-03-04 21:12   ` Konrad Rzeszutek Wilk
2013-03-05 10:25   ` David Vrabel
2013-03-05 10:25   ` [Xen-devel] " David Vrabel
2013-02-15 16:00 ` Wei Liu
2013-02-15 16:00 ` [PATCH 5/8] netback: multi-page ring support Wei Liu
2013-02-15 16:00 ` Wei Liu
2013-03-04 21:00   ` [Xen-devel] " Konrad Rzeszutek Wilk
2013-03-04 21:00   ` Konrad Rzeszutek Wilk
2013-03-05 10:41   ` David Vrabel
2013-03-05 10:41   ` [Xen-devel] " David Vrabel
2013-02-15 16:00 ` [PATCH 6/8] netfront: " Wei Liu
2013-02-26  6:52   ` ANNIE LI
2013-02-26 12:35     ` Wei Liu
2013-02-26 12:35     ` Wei Liu
2013-02-27  7:39       ` ANNIE LI
2013-02-27 15:49         ` Wei Liu
2013-02-27 15:49         ` Wei Liu
2013-02-28  5:19           ` ANNIE LI
2013-02-28  5:19           ` ANNIE LI
2013-02-28 11:02             ` Wei Liu
2013-02-28 12:55               ` annie li
2013-02-28 12:55               ` annie li
2013-02-28 11:02             ` Wei Liu
2013-02-27  7:39       ` ANNIE LI
2013-02-26  6:52   ` ANNIE LI
2013-03-04 21:16   ` Konrad Rzeszutek Wilk
2013-03-04 21:16   ` Konrad Rzeszutek Wilk
2013-02-15 16:00 ` Wei Liu
2013-02-15 16:00 ` [PATCH 7/8] netback: split event channels support Wei Liu
2013-02-15 16:00 ` Wei Liu
2013-03-04 21:22   ` Konrad Rzeszutek Wilk
2013-03-04 21:22   ` Konrad Rzeszutek Wilk
2013-02-15 16:00 ` [PATCH 8/8] netfront: " Wei Liu
2013-02-15 16:00 ` Wei Liu
2013-03-04 21:24   ` Konrad Rzeszutek Wilk
2013-03-04 21:24   ` Konrad Rzeszutek Wilk
2013-02-26  3:07 ` [PATCH 0/8] Bugfix and mechanical works for Xen network driver ANNIE LI
2013-02-26  3:07 ` [Xen-devel] " ANNIE LI
2013-02-26 11:33   ` Wei Liu
2013-02-26 11:33   ` Wei Liu
