* [PATCH] xen-netback: Using a new state bit instead of carrier
@ 2014-07-30 19:50 Zoltan Kiss
  2014-07-30 19:50 ` [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive Zoltan Kiss
  2014-07-30 19:50 ` Zoltan Kiss
  0 siblings, 2 replies; 19+ messages in thread
From: Zoltan Kiss @ 2014-07-30 19:50 UTC (permalink / raw)
  To: Wei Liu, Ian Campbell
  Cc: Zoltan Kiss, David Vrabel, netdev, linux-kernel, xen-devel

This patch introduces a new state bit VIF_STATUS_CONNECTED to track whether the
vif is in a connected state. Using carrier will not work with the next patch
in this series, which aims to turn the carrier temporarily off if the guest
doesn't seem to be able to receive packets.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org 
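
As a rough illustration of the pattern this patch introduces (not part of
the patch itself; the demo_* names and the simplified struct below are made
up for the sketch), the connection state moves into a bit of an unsigned
long and is handled with the atomic bitops API instead of the carrier:

#include <linux/bitops.h>
#include <linux/netdevice.h>

enum demo_state_bit_shift {
	/* set while the rings are connected, independent of the carrier */
	DEMO_STATUS_CONNECTED,
};

struct demo_vif {
	struct net_device *dev;
	unsigned long status;	/* bitmap indexed by the enum above */
};

static bool demo_schedulable(struct demo_vif *vif)
{
	/* "connected" no longer depends on netif_carrier_ok() */
	return netif_running(vif->dev) &&
	       test_bit(DEMO_STATUS_CONNECTED, &vif->status);
}

static void demo_connect(struct demo_vif *vif)
{
	set_bit(DEMO_STATUS_CONNECTED, &vif->status);
	netif_carrier_on(vif->dev);	/* carrier can now be toggled freely */
}

static void demo_disconnect(struct demo_vif *vif)
{
	/* idempotent: only tear down once */
	if (test_and_clear_bit(DEMO_STATUS_CONNECTED, &vif->status))
		netif_carrier_off(vif->dev);
}
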
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 28c9822..7231359 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -198,6 +198,11 @@ struct xenvif_queue { /* Per-queue data for xenvif */
 	struct xenvif_stats stats;
 };
 
+enum vif_state_bit_shift {
+	/* This bit marks that the vif is connected */
+	VIF_STATUS_CONNECTED
+};
+
 struct xenvif {
 	/* Unique identifier for this interface. */
 	domid_t          domid;
@@ -220,6 +225,7 @@ struct xenvif {
 	 * frontend is rogue.
 	 */
 	bool disabled;
+	unsigned long status;
 
 	/* Queues */
 	struct xenvif_queue *queues;
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index bd59d9d..fbdadb3 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -55,7 +55,8 @@ static inline void xenvif_stop_queue(struct xenvif_queue *queue)
 
 int xenvif_schedulable(struct xenvif *vif)
 {
-	return netif_running(vif->dev) && netif_carrier_ok(vif->dev);
+	return netif_running(vif->dev) &&
+		test_bit(VIF_STATUS_CONNECTED, &vif->status);
 }
 
 static irqreturn_t xenvif_tx_interrupt(int irq, void *dev_id)
@@ -267,7 +268,7 @@ static void xenvif_down(struct xenvif *vif)
 static int xenvif_open(struct net_device *dev)
 {
 	struct xenvif *vif = netdev_priv(dev);
-	if (netif_carrier_ok(dev))
+	if (test_bit(VIF_STATUS_CONNECTED, &vif->status))
 		xenvif_up(vif);
 	netif_tx_start_all_queues(dev);
 	return 0;
@@ -276,7 +277,7 @@ static int xenvif_open(struct net_device *dev)
 static int xenvif_close(struct net_device *dev)
 {
 	struct xenvif *vif = netdev_priv(dev);
-	if (netif_carrier_ok(dev))
+	if (test_bit(VIF_STATUS_CONNECTED, &vif->status))
 		xenvif_down(vif);
 	netif_tx_stop_all_queues(dev);
 	return 0;
@@ -528,6 +529,7 @@ void xenvif_carrier_on(struct xenvif *vif)
 	if (!vif->can_sg && vif->dev->mtu > ETH_DATA_LEN)
 		dev_set_mtu(vif->dev, ETH_DATA_LEN);
 	netdev_update_features(vif->dev);
+	set_bit(VIF_STATUS_CONNECTED, &vif->status);
 	netif_carrier_on(vif->dev);
 	if (netif_running(vif->dev))
 		xenvif_up(vif);
@@ -625,9 +627,11 @@ void xenvif_carrier_off(struct xenvif *vif)
 	struct net_device *dev = vif->dev;
 
 	rtnl_lock();
-	netif_carrier_off(dev); /* discard queued packets */
-	if (netif_running(dev))
-		xenvif_down(vif);
+	if (test_and_clear_bit(VIF_STATUS_CONNECTED, &vif->status)) {
+		netif_carrier_off(dev); /* discard queued packets */
+		if (netif_running(dev))
+			xenvif_down(vif);
+	}
 	rtnl_unlock();
 }
 
@@ -656,8 +660,7 @@ void xenvif_disconnect(struct xenvif *vif)
 	unsigned int num_queues = vif->num_queues;
 	unsigned int queue_index;
 
-	if (netif_carrier_ok(vif->dev))
-		xenvif_carrier_off(vif);
+	xenvif_carrier_off(vif);
 
 	for (queue_index = 0; queue_index < num_queues; ++queue_index) {
 		queue = &vif->queues[queue_index];
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 769e553..6c4cc0f 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1953,7 +1953,7 @@ int xenvif_kthread_guest_rx(void *data)
 		 * context so we defer it here, if this thread is
 		 * associated with queue 0.
 		 */
-		if (unlikely(queue->vif->disabled && netif_carrier_ok(queue->vif->dev) && queue->id == 0))
+		if (unlikely(queue->vif->disabled && queue->id == 0))
 			xenvif_carrier_off(queue->vif);
 
 		if (kthread_should_stop())


* [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive
  2014-07-30 19:50 [PATCH] xen-netback: Using a new state bit instead of carrier Zoltan Kiss
@ 2014-07-30 19:50 ` Zoltan Kiss
  2014-08-01  4:53   ` David Miller
                     ` (5 more replies)
  2014-07-30 19:50 ` Zoltan Kiss
  1 sibling, 6 replies; 19+ messages in thread
From: Zoltan Kiss @ 2014-07-30 19:50 UTC (permalink / raw)
  To: Wei Liu, Ian Campbell
  Cc: Zoltan Kiss, David Vrabel, netdev, linux-kernel, xen-devel

Currently, when the guest is not able to receive more packets, the qdisc
layer starts a timer, and when it fires, the qdisc is restarted to try
delivering a packet again. This is a very slow way to drain the queues: it
consumes unnecessary resources and slows down the shutdown of other guests.
This patch changes the behaviour by turning the carrier off when that timer
fires, so all the packets which were stuck waiting for that vif are freed up.
Instead of the rx_queue_purge bool it uses the QUEUE_STATUS_RX_PURGE_EVENT
bit to signal the thread that either the timeout happened or an RX interrupt
arrived, so the thread can check what it should do. It also disables NAPI,
so the guest can't transmit, but leaves the interrupts on, so the carrier
can be brought back up later.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org 
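
As a simplified sketch of the control flow the two new bits implement (not
part of the patch; the demo_* names are illustrative, and the rtnl locking,
timer handling and per-queue wakeup are omitted): start_xmit stops the
queue and arms the rx_stalled timer when no ring slots are available; the
timer callback, or the RX interrupt while the queue is marked stalled, sets
QUEUE_STATUS_RX_PURGE_EVENT and kicks the RX thread, which then reacts
roughly as follows:

#include <linux/bitops.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>

enum { DEMO_RX_PURGE_EVENT, DEMO_RX_STALLED };

struct demo_queue {
	struct net_device *dev;
	struct sk_buff_head rx_queue;	/* backend's internal queue */
	unsigned long status;
	bool slots_available;		/* stand-in for ring slot accounting */
};

static void demo_handle_purge_event(struct demo_queue *q)
{
	/* remember that this queue caused the stall */
	set_bit(DEMO_RX_STALLED, &q->status);

	if (!q->slots_available) {
		/* the guest still posted nothing: drop the carrier and
		 * free everything that was queued for this vif
		 */
		netif_carrier_off(q->dev);
		skb_queue_purge(&q->rx_queue);
	} else if (!netif_carrier_ok(q->dev)) {
		/* the guest came back after a stall: resurrect the carrier */
		clear_bit(DEMO_RX_STALLED, &q->status);
		netif_carrier_on(q->dev);
		netif_tx_wake_all_queues(q->dev);
	} else {
		/* queuing was stopped but slots showed up in time */
		clear_bit(DEMO_RX_STALLED, &q->status);
		netif_tx_wake_all_queues(q->dev);
	}
}
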
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 7231359..66b4bef 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -176,9 +176,9 @@ struct xenvif_queue { /* Per-queue data for xenvif */
 	struct xen_netif_rx_back_ring rx;
 	struct sk_buff_head rx_queue;
 	RING_IDX rx_last_skb_slots;
-	bool rx_queue_purge;
+	unsigned long status;
 
-	struct timer_list wake_queue;
+	struct timer_list rx_stalled;
 
 	struct gnttab_copy grant_copy_op[MAX_GRANT_COPY_OPS];
 
@@ -198,9 +198,17 @@ struct xenvif_queue { /* Per-queue data for xenvif */
 	struct xenvif_stats stats;
 };
 
-enum vif_state_bit_shift {
+enum state_bit_shift {
 	/* This bit marks that the vif is connected */
-	VIF_STATUS_CONNECTED
+	VIF_STATUS_CONNECTED,
+	/* This bit signals to the RX thread that queuing was stopped
+	 * (in start_xmit), and either the timer fired or an RX interrupt came
+	 */
+	QUEUE_STATUS_RX_PURGE_EVENT,
+	/* This bit tells to the interrupt handler that this queue was the
+	 * reason for the carrier off, so it should kick the thread
+	 */
+	QUEUE_STATUS_RX_STALLED
 };
 
 struct xenvif {
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index fa7d8ac..d2e61f4 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -79,7 +79,8 @@ int xenvif_poll(struct napi_struct *napi, int budget)
 	 * for this vif to deschedule it from NAPI. But this interface
 	 * will be turned off in thread context later.
 	 */
-	if (unlikely(queue->vif->disabled)) {
+	if (unlikely(queue->vif->disabled ||
+		     !netif_carrier_ok(queue->vif->dev))) {
 		napi_complete(napi);
 		return 0;
 	}
@@ -97,7 +98,13 @@ int xenvif_poll(struct napi_struct *napi, int budget)
 static irqreturn_t xenvif_rx_interrupt(int irq, void *dev_id)
 {
 	struct xenvif_queue *queue = dev_id;
+	struct netdev_queue *net_queue =
+		netdev_get_tx_queue(queue->vif->dev, queue->id);
 
+	if (unlikely((netif_tx_queue_stopped(net_queue) ||
+		      !netif_carrier_ok(queue->vif->dev)) &&
+		     test_bit(QUEUE_STATUS_RX_STALLED, &queue->status)))
+		set_bit(QUEUE_STATUS_RX_PURGE_EVENT, &queue->status);
 	xenvif_kick_thread(queue);
 
 	return IRQ_HANDLED;
@@ -126,15 +133,13 @@ void xenvif_wake_queue(struct xenvif_queue *queue)
 }
 
 /* Callback to wake the queue and drain it on timeout */
-static void xenvif_wake_queue_callback(unsigned long data)
+static void xenvif_rx_stalled(unsigned long data)
 {
 	struct xenvif_queue *queue = (struct xenvif_queue *)data;
 
 	if (xenvif_queue_stopped(queue)) {
-		netdev_err(queue->vif->dev, "draining TX queue\n");
-		queue->rx_queue_purge = true;
+		set_bit(QUEUE_STATUS_RX_PURGE_EVENT, &queue->status);
 		xenvif_kick_thread(queue);
-		xenvif_wake_queue(queue);
 	}
 }
 
@@ -183,11 +188,11 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * drain.
 	 */
 	if (!xenvif_rx_ring_slots_available(queue, min_slots_needed)) {
-		queue->wake_queue.function = xenvif_wake_queue_callback;
-		queue->wake_queue.data = (unsigned long)queue;
+		queue->rx_stalled.function = xenvif_rx_stalled;
+		queue->rx_stalled.data = (unsigned long)queue;
 		xenvif_stop_queue(queue);
-		mod_timer(&queue->wake_queue,
-			jiffies + rx_drain_timeout_jiffies);
+		mod_timer(&queue->rx_stalled,
+			  jiffies + rx_drain_timeout_jiffies);
 	}
 
 	skb_queue_tail(&queue->rx_queue, skb);
@@ -515,7 +520,7 @@ int xenvif_init_queue(struct xenvif_queue *queue)
 		queue->grant_tx_handle[i] = NETBACK_INVALID_HANDLE;
 	}
 
-	init_timer(&queue->wake_queue);
+	init_timer(&queue->rx_stalled);
 
 	netif_napi_add(queue->vif->dev, &queue->napi, xenvif_poll,
 			XENVIF_NAPI_WEIGHT);
@@ -666,7 +671,7 @@ void xenvif_disconnect(struct xenvif *vif)
 		queue = &vif->queues[queue_index];
 
 		if (queue->task) {
-			del_timer_sync(&queue->wake_queue);
+			del_timer_sync(&queue->rx_stalled);
 			kthread_stop(queue->task);
 			queue->task = NULL;
 		}
@@ -708,16 +713,12 @@ void xenvif_free(struct xenvif *vif)
 	/* Here we want to avoid timeout messages if an skb can be legitimately
 	 * stuck somewhere else. Realistically this could be an another vif's
 	 * internal or QDisc queue. That another vif also has this
-	 * rx_drain_timeout_msecs timeout, but the timer only ditches the
-	 * internal queue. After that, the QDisc queue can put in worst case
-	 * XEN_NETIF_RX_RING_SIZE / MAX_SKB_FRAGS skbs into that another vif's
-	 * internal queue, so we need several rounds of such timeouts until we
-	 * can be sure that no another vif should have skb's from us. We are
-	 * not sending more skb's, so newly stuck packets are not interesting
-	 * for us here.
+	 * rx_drain_timeout_msecs timeout, so give it time to drain out.
+	 * Although if that other guest wakes up just before its timeout happens
+	 * and takes only one skb from QDisc, it can hold onto other skbs for a
+	 * longer period.
 	 */
-	unsigned int worst_case_skb_lifetime = (rx_drain_timeout_msecs/1000) *
-		DIV_ROUND_UP(XENVIF_QUEUE_LENGTH, (XEN_NETIF_RX_RING_SIZE / MAX_SKB_FRAGS));
+	unsigned int worst_case_skb_lifetime = (rx_drain_timeout_msecs/1000);
 
 	unregister_netdev(vif->dev);
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 6c4cc0f..7dac833 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1869,8 +1869,7 @@ void xenvif_idx_unmap(struct xenvif_queue *queue, u16 pending_idx)
 static inline int rx_work_todo(struct xenvif_queue *queue)
 {
 	return (!skb_queue_empty(&queue->rx_queue) &&
-	       xenvif_rx_ring_slots_available(queue, queue->rx_last_skb_slots)) ||
-	       queue->rx_queue_purge;
+	       xenvif_rx_ring_slots_available(queue, queue->rx_last_skb_slots));
 }
 
 static inline int tx_work_todo(struct xenvif_queue *queue)
@@ -1944,6 +1943,7 @@ int xenvif_kthread_guest_rx(void *data)
 		wait_event_interruptible(queue->wq,
 					 rx_work_todo(queue) ||
 					 queue->vif->disabled ||
+					 test_bit(QUEUE_STATUS_RX_PURGE_EVENT, &queue->status) ||
 					 kthread_should_stop());
 
 		/* This frontend is found to be rogue, disable it in
@@ -1955,24 +1955,78 @@ int xenvif_kthread_guest_rx(void *data)
 		 */
 		if (unlikely(queue->vif->disabled && queue->id == 0))
 			xenvif_carrier_off(queue->vif);
+		else if (unlikely(test_and_clear_bit(QUEUE_STATUS_RX_PURGE_EVENT,
+						     &queue->status))) {
+			/* Either the last unsuccessful skb or at least 1 slot
+			 * should fit
+			 */
+			int needed = queue->rx_last_skb_slots ?
+				     queue->rx_last_skb_slots : 1;
 
-		if (kthread_should_stop())
-			break;
-
-		if (queue->rx_queue_purge) {
+			/* It is assumed that if the guest posts new
+			 * slots after this, the RX interrupt will set
+			 * the bit and wake up the thread again
+			 */
+			set_bit(QUEUE_STATUS_RX_STALLED, &queue->status);
+			if (!xenvif_rx_ring_slots_available(queue, needed)) {
+				rtnl_lock();
+				if (netif_carrier_ok(queue->vif->dev)) {
+					/* Timer fired and there are still no
+					 * slots. Turn off everything except the
+					 * interrupts
+					 */
+					netif_carrier_off(queue->vif->dev);
+					skb_queue_purge(&queue->rx_queue);
+					queue->rx_last_skb_slots = 0;
+					if (net_ratelimit())
+						netdev_err(queue->vif->dev, "Carrier off due to lack of guest response on queue %d\n", queue->id);
+				}
+				rtnl_unlock();
+			} else if (!netif_carrier_ok(queue->vif->dev)) {
+				unsigned int num_queues = queue->vif->num_queues;
+				unsigned int i;
+				/* The carrier was down, but an interrupt kicked
+				 * the thread again after new requests were
+				 * posted
+				 */
+				clear_bit(QUEUE_STATUS_RX_STALLED,
+					  &queue->status);
+				rtnl_lock();
+				netif_carrier_on(queue->vif->dev);
+				netif_tx_wake_all_queues(queue->vif->dev);
+				rtnl_unlock();
+
+				for (i = 0; i < num_queues; ++i) {
+					struct xenvif_queue *temp = &queue->vif->queues[i];
+					xenvif_napi_schedule_or_enable_events(temp);
+				}
+				if (net_ratelimit())
+					netdev_err(queue->vif->dev, "Carrier on again\n");
+				continue;
+			} else {
+				/* Queuing was stopped, but the guest posted
+				 * new requests
+				 */
+				clear_bit(QUEUE_STATUS_RX_STALLED,
+					  &queue->status);
+				del_timer_sync(&queue->rx_stalled);
+				xenvif_start_queue(queue);
+				continue;
+			}
+		} else if (!netif_carrier_ok(queue->vif->dev)) {
+			/* Another queue stalled and turned the carrier off, so
+			 * purge our internal queue
+			 */
 			skb_queue_purge(&queue->rx_queue);
-			queue->rx_queue_purge = false;
+			queue->rx_last_skb_slots = 0;
 		}
 
+		if (kthread_should_stop())
+			break;
+
 		if (!skb_queue_empty(&queue->rx_queue))
 			xenvif_rx_action(queue);
 
-		if (skb_queue_empty(&queue->rx_queue) &&
-		    xenvif_queue_stopped(queue)) {
-			del_timer_sync(&queue->wake_queue);
-			xenvif_start_queue(queue);
-		}
-
 		cond_resched();
 	}
 



* Re: [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive
  2014-07-30 19:50 ` [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive Zoltan Kiss
  2014-08-01  4:53   ` David Miller
@ 2014-08-01  4:53   ` David Miller
  2014-08-01 10:52   ` Wei Liu
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2014-08-01  4:53 UTC (permalink / raw)
  To: zoltan.kiss
  Cc: wei.liu2, Ian.Campbell, david.vrabel, netdev, linux-kernel, xen-devel

From: Zoltan Kiss <zoltan.kiss@citrix.com>
Date: Wed, 30 Jul 2014 20:50:49 +0100

> Currently when the guest is not able to receive more packets, qdisc layer starts
> a timer, and when it goes off, qdisc is started again to deliver a packet again.
> This is a very slow way to drain the queues, consumes unnecessary resources and
> slows down other guests shutdown.
> This patch change the behaviour by turning the carrier off when that timer
> fires, so all the packets are freed up which were stucked waiting for that vif.
> Instead of the rx_queue_purge bool it uses the VIF_STATUS_RX_PURGE_EVENT bit to
> signal the thread that either the timout happened or an RX interrupt arrived, so
> the thread can check what it should do. It also disables NAPI, so the guest
> can't transmit, but leaves the interrupts on, so it can resurrect.
> 
> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>

When posting a multi-part patch set, number your patches and have a header
"[PATCH 0/N] " posting which describes at a high level what the patch
series is doing, and why.

> +				for (i = 0; i < num_queues; ++i) {

Please use the more canonical "i++" increment.
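
For reference, the loop in question with the post-increment would read
(illustrative, taken from the hunk quoted above):

	for (i = 0; i < num_queues; i++) {
		struct xenvif_queue *temp = &queue->vif->queues[i];
		xenvif_napi_schedule_or_enable_events(temp);
	}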

Thanks.



* Re: [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive
  2014-07-30 19:50 ` [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive Zoltan Kiss
                     ` (2 preceding siblings ...)
  2014-08-01 10:52   ` Wei Liu
@ 2014-08-01 10:52   ` Wei Liu
  2014-08-04 15:04     ` Zoltan Kiss
  2014-08-04 15:04     ` Zoltan Kiss
  2014-08-04 13:35   ` [Xen-devel] " David Vrabel
  2014-08-04 13:35   ` David Vrabel
  5 siblings, 2 replies; 19+ messages in thread
From: Wei Liu @ 2014-08-01 10:52 UTC (permalink / raw)
  To: Zoltan Kiss
  Cc: Wei Liu, Ian Campbell, David Vrabel, netdev, linux-kernel, xen-devel

On Wed, Jul 30, 2014 at 08:50:49PM +0100, Zoltan Kiss wrote:
> Currently when the guest is not able to receive more packets, qdisc layer starts
> a timer, and when it goes off, qdisc is started again to deliver a packet again.
> This is a very slow way to drain the queues, consumes unnecessary resources and
> slows down other guests shutdown.
> This patch change the behaviour by turning the carrier off when that timer
> fires, so all the packets are freed up which were stucked waiting for that vif.
> Instead of the rx_queue_purge bool it uses the VIF_STATUS_RX_PURGE_EVENT bit to
> signal the thread that either the timout happened or an RX interrupt arrived, so
> the thread can check what it should do. It also disables NAPI, so the guest
> can't transmit, but leaves the interrupts on, so it can resurrect.
> 
> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: xen-devel@lists.xenproject.org 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index 7231359..66b4bef 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -176,9 +176,9 @@ struct xenvif_queue { /* Per-queue data for xenvif */
>  	struct xen_netif_rx_back_ring rx;
>  	struct sk_buff_head rx_queue;
>  	RING_IDX rx_last_skb_slots;
> -	bool rx_queue_purge;
> +	unsigned long status;
>  
> -	struct timer_list wake_queue;
> +	struct timer_list rx_stalled;
>  
>  	struct gnttab_copy grant_copy_op[MAX_GRANT_COPY_OPS];
>  
> @@ -198,9 +198,17 @@ struct xenvif_queue { /* Per-queue data for xenvif */
>  	struct xenvif_stats stats;
>  };
>  
> -enum vif_state_bit_shift {
> +enum state_bit_shift {

Why change the name?

>  	/* This bit marks that the vif is connected */
> -	VIF_STATUS_CONNECTED
> +	VIF_STATUS_CONNECTED,
> +	/* This bit signals to the RX thread that queuing was stopped
> +	 * (in start_xmit), and either the timer fired or an RX interrupt came
> +	 */
> +	QUEUE_STATUS_RX_PURGE_EVENT,
> +	/* This bit tells to the interrupt handler that this queue was the

I don't think you need that "to".

> +	 * reason for the carrier off, so it should kick the thread
> +	 */

I guess you're saying "so it should purge the queue"? I don't see
xenvif_kick_thread guarded by QUEUE_STATUS_RX_STALLED, but I do see that
QUEUE_STATUS_RX_PURGE_EVENT is set due to QUEUE_STATUS_RX_STALLED in the RX
interrupt handler.

> +	QUEUE_STATUS_RX_STALLED

What's the interaction of these two bits? In the code below I see one being
set depending on tests of the other in various places. It would be good if
you could provide a more detailed description of the control flow.

>  };
>  
>  struct xenvif {
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index fa7d8ac..d2e61f4 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -79,7 +79,8 @@ int xenvif_poll(struct napi_struct *napi, int budget)
>  	 * for this vif to deschedule it from NAPI. But this interface
>  	 * will be turned off in thread context later.
>  	 */
> -	if (unlikely(queue->vif->disabled)) {
> +	if (unlikely(queue->vif->disabled ||
> +		     !netif_carrier_ok(queue->vif->dev))) {
>  		napi_complete(napi);
>  		return 0;
>  	}
> @@ -97,7 +98,13 @@ int xenvif_poll(struct napi_struct *napi, int budget)
>  static irqreturn_t xenvif_rx_interrupt(int irq, void *dev_id)
>  {
>  	struct xenvif_queue *queue = dev_id;
> +	struct netdev_queue *net_queue =
> +		netdev_get_tx_queue(queue->vif->dev, queue->id);
>  
> +	if (unlikely((netif_tx_queue_stopped(net_queue) ||
> +		      !netif_carrier_ok(queue->vif->dev)) &&
> +		     test_bit(QUEUE_STATUS_RX_STALLED, &queue->status)))
> +		set_bit(QUEUE_STATUS_RX_PURGE_EVENT, &queue->status);
>  	xenvif_kick_thread(queue);
>  
>  	return IRQ_HANDLED;
> @@ -126,15 +133,13 @@ void xenvif_wake_queue(struct xenvif_queue *queue)
>  }
>  
>  /* Callback to wake the queue and drain it on timeout */
> -static void xenvif_wake_queue_callback(unsigned long data)
> +static void xenvif_rx_stalled(unsigned long data)
>  {
>  	struct xenvif_queue *queue = (struct xenvif_queue *)data;
>  
>  	if (xenvif_queue_stopped(queue)) {
> -		netdev_err(queue->vif->dev, "draining TX queue\n");
> -		queue->rx_queue_purge = true;
> +		set_bit(QUEUE_STATUS_RX_PURGE_EVENT, &queue->status);
>  		xenvif_kick_thread(queue);
> -		xenvif_wake_queue(queue);
>  	}
>  }
>  
> @@ -183,11 +188,11 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	 * drain.
>  	 */
>  	if (!xenvif_rx_ring_slots_available(queue, min_slots_needed)) {
> -		queue->wake_queue.function = xenvif_wake_queue_callback;
> -		queue->wake_queue.data = (unsigned long)queue;
> +		queue->rx_stalled.function = xenvif_rx_stalled;
> +		queue->rx_stalled.data = (unsigned long)queue;
>  		xenvif_stop_queue(queue);
> -		mod_timer(&queue->wake_queue,
> -			jiffies + rx_drain_timeout_jiffies);
> +		mod_timer(&queue->rx_stalled,
> +			  jiffies + rx_drain_timeout_jiffies);
>  	}
>  
>  	skb_queue_tail(&queue->rx_queue, skb);
> @@ -515,7 +520,7 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>  		queue->grant_tx_handle[i] = NETBACK_INVALID_HANDLE;
>  	}
>  
> -	init_timer(&queue->wake_queue);
> +	init_timer(&queue->rx_stalled);
>  
>  	netif_napi_add(queue->vif->dev, &queue->napi, xenvif_poll,
>  			XENVIF_NAPI_WEIGHT);
> @@ -666,7 +671,7 @@ void xenvif_disconnect(struct xenvif *vif)
>  		queue = &vif->queues[queue_index];
>  
>  		if (queue->task) {
> -			del_timer_sync(&queue->wake_queue);
> +			del_timer_sync(&queue->rx_stalled);
>  			kthread_stop(queue->task);
>  			queue->task = NULL;
>  		}
> @@ -708,16 +713,12 @@ void xenvif_free(struct xenvif *vif)
>  	/* Here we want to avoid timeout messages if an skb can be legitimately
>  	 * stuck somewhere else. Realistically this could be an another vif's
>  	 * internal or QDisc queue. That another vif also has this
> -	 * rx_drain_timeout_msecs timeout, but the timer only ditches the
> -	 * internal queue. After that, the QDisc queue can put in worst case
> -	 * XEN_NETIF_RX_RING_SIZE / MAX_SKB_FRAGS skbs into that another vif's
> -	 * internal queue, so we need several rounds of such timeouts until we
> -	 * can be sure that no another vif should have skb's from us. We are
> -	 * not sending more skb's, so newly stuck packets are not interesting
> -	 * for us here.
> +	 * rx_drain_timeout_msecs timeout, so give it time to drain out.
> +	 * Although if that other guest wakes up just before its timeout happens
> +	 * and takes only one skb from QDisc, it can hold onto other skbs for a
> +	 * longer period.
>  	 */
> -	unsigned int worst_case_skb_lifetime = (rx_drain_timeout_msecs/1000) *
> -		DIV_ROUND_UP(XENVIF_QUEUE_LENGTH, (XEN_NETIF_RX_RING_SIZE / MAX_SKB_FRAGS));
> +	unsigned int worst_case_skb_lifetime = (rx_drain_timeout_msecs/1000);
>  
>  	unregister_netdev(vif->dev);
>  
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 6c4cc0f..7dac833 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -1869,8 +1869,7 @@ void xenvif_idx_unmap(struct xenvif_queue *queue, u16 pending_idx)
>  static inline int rx_work_todo(struct xenvif_queue *queue)
>  {
>  	return (!skb_queue_empty(&queue->rx_queue) &&
> -	       xenvif_rx_ring_slots_available(queue, queue->rx_last_skb_slots)) ||
> -	       queue->rx_queue_purge;
> +	       xenvif_rx_ring_slots_available(queue, queue->rx_last_skb_slots));
>  }
>  
>  static inline int tx_work_todo(struct xenvif_queue *queue)
> @@ -1944,6 +1943,7 @@ int xenvif_kthread_guest_rx(void *data)
>  		wait_event_interruptible(queue->wq,
>  					 rx_work_todo(queue) ||
>  					 queue->vif->disabled ||
> +					 test_bit(QUEUE_STATUS_RX_PURGE_EVENT, &queue->status) ||
>  					 kthread_should_stop());
>  
>  		/* This frontend is found to be rogue, disable it in
> @@ -1955,24 +1955,78 @@ int xenvif_kthread_guest_rx(void *data)
>  		 */
>  		if (unlikely(queue->vif->disabled && queue->id == 0))
>  			xenvif_carrier_off(queue->vif);
> +		else if (unlikely(test_and_clear_bit(QUEUE_STATUS_RX_PURGE_EVENT,
> +						     &queue->status))) {
> +			/* Either the last unsuccesful skb or at least 1 slot
> +			 * should fit
> +			 */
> +			int needed = queue->rx_last_skb_slots ?
> +				     queue->rx_last_skb_slots : 1;
>  
> -		if (kthread_should_stop())
> -			break;
> -

Is it really necessary to move kthread_should_stop after the queue purging
logic? Wouldn't it be better to leave it here so that we don't do any more
unnecessary work and just quit?

> -		if (queue->rx_queue_purge) {
> +			/* It is assumed that if the guest post new
> +			 * slots after this, the RX interrupt will set
> +			 * the bit and wake up the thread again
> +			 */

This needs more detail. Reading this comment together with the "set_bit"
that follows is a bit confusing.

I can tell after reading the code that "the bit" refers to
QUEUE_STATUS_RX_PURGE_EVENT. The following line sets
QUEUE_STATUS_RX_STALLED which has the effect of setting
QUEUE_STATUS_RX_PURGE_EVENT in rx interrupt handler.

> +			set_bit(QUEUE_STATUS_RX_STALLED, &queue->status);
> +			if (!xenvif_rx_ring_slots_available(queue, needed)) {
> +				rtnl_lock();
> +				if (netif_carrier_ok(queue->vif->dev)) {
> +					/* Timer fired and there are still no
> +					 * slots. Turn off everything except the
> +					 * interrupts
> +					 */
> +					netif_carrier_off(queue->vif->dev);
> +					skb_queue_purge(&queue->rx_queue);
> +					queue->rx_last_skb_slots = 0;
> +					if (net_ratelimit())
> +						netdev_err(queue->vif->dev, "Carrier off due to lack of guest response on queue %d\n", queue->id);
> +				}
> +				rtnl_unlock();
> +			} else if (!netif_carrier_ok(queue->vif->dev)) {
> +				unsigned int num_queues = queue->vif->num_queues;
> +				unsigned int i;
> +				/* The carrier was down, but an interrupt kicked
> +				 * the thread again after new requests were
> +				 * posted
> +				 */
> +				clear_bit(QUEUE_STATUS_RX_STALLED,
> +					  &queue->status);
> +				rtnl_lock();
> +				netif_carrier_on(queue->vif->dev);
> +				netif_tx_wake_all_queues(queue->vif->dev);
> +				rtnl_unlock();
> +
> +				for (i = 0; i < num_queues; ++i) {
> +					struct xenvif_queue *temp = &queue->vif->queues[i];
> +					xenvif_napi_schedule_or_enable_events(temp);
> +				}
> +				if (net_ratelimit())
> +					netdev_err(queue->vif->dev, "Carrier on again\n");
> +				continue;
> +			} else {
> +				/* Queuing were stopped, but the guest posted
> +				 * new requests
> +				 */
> +				clear_bit(QUEUE_STATUS_RX_STALLED,
> +					  &queue->status);
> +				del_timer_sync(&queue->rx_stalled);
> +				xenvif_start_queue(queue);
> +				continue;
> +			}
> +		} else if (!netif_carrier_ok(queue->vif->dev)) {
> +			/* Another queue stalled and turned the carrier off, so
> +			 * purge our internal queue
> +			 */

If I'm not misreading, this branch doesn't have QUEUE_STATUS_RX_PURGE_EVENT
set but you still purge the internal queue. So QUEUE_STATUS_RX_PURGE_EVENT
only controls QDisc purging? If so, I think you need to say that clearly in
the comment above.

I've tried my best to understand what's going on in this patch. AFAICT
this looks like an improvement rather than a bug fix to me. I need to weigh
the complexity introduced against the benefit gained, but so far I don't
have a very good clue. I think it's better to have a 00/N patch describing
the high-level design and motivation, as DaveM suggested.

Wei.

>  			skb_queue_purge(&queue->rx_queue);
> -			queue->rx_queue_purge = false;
> +			queue->rx_last_skb_slots = 0;
>  		}
>  
> +		if (kthread_should_stop())
> +			break;
> +
>  		if (!skb_queue_empty(&queue->rx_queue))
>  			xenvif_rx_action(queue);
>  
> -		if (skb_queue_empty(&queue->rx_queue) &&
> -		    xenvif_queue_stopped(queue)) {
> -			del_timer_sync(&queue->wake_queue);
> -			xenvif_start_queue(queue);
> -		}
> -
>  		cond_resched();
>  	}
>  



* Re: [Xen-devel] [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive
  2014-07-30 19:50 ` [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive Zoltan Kiss
                     ` (3 preceding siblings ...)
  2014-08-01 10:52   ` Wei Liu
@ 2014-08-04 13:35   ` David Vrabel
  2014-08-04 15:13     ` Zoltan Kiss
  2014-08-04 15:13     ` [Xen-devel] " Zoltan Kiss
  2014-08-04 13:35   ` David Vrabel
  5 siblings, 2 replies; 19+ messages in thread
From: David Vrabel @ 2014-08-04 13:35 UTC (permalink / raw)
  To: Zoltan Kiss, Wei Liu, Ian Campbell
  Cc: netdev, xen-devel, David Vrabel, linux-kernel

On 30/07/14 20:50, Zoltan Kiss wrote:
> Currently when the guest is not able to receive more packets, qdisc layer starts
> a timer, and when it goes off, qdisc is started again to deliver a packet again.
> This is a very slow way to drain the queues, consumes unnecessary resources and
> slows down other guests shutdown.
> This patch change the behaviour by turning the carrier off when that timer
> fires, so all the packets are freed up which were stucked waiting for that vif.
> Instead of the rx_queue_purge bool it uses the VIF_STATUS_RX_PURGE_EVENT bit to
> signal the thread that either the timout happened or an RX interrupt arrived, so
> the thread can check what it should do. It also disables NAPI, so the guest
> can't transmit, but leaves the interrupts on, so it can resurrect.

I don't think you should disable NAPI, particularly since you have to
fiddle with the bits directly instead of using the API to do so.

I looked at some hardware drivers and none of them disabled NAPI -- they
just allow it to naturally end once hardware queues are drained.

netback is a little different as a frontend could stop processing
to-guest packets but continue sending from-guest ones.  I don't see any
problem with allowing this.
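
For reference, a rough sketch of the two alternatives being contrasted here
(illustrative only, not a proposed change): quiescing NAPI explicitly
through the API, versus simply letting the poll handler complete when there
is nothing to do.

#include <linux/netdevice.h>

/* Explicit quiesce/resume through the NAPI API: */
static void demo_quiesce(struct napi_struct *napi)
{
	napi_disable(napi);	/* waits for any in-flight poll to finish */
}

static void demo_resume(struct napi_struct *napi)
{
	napi_enable(napi);
	napi_schedule(napi);	/* pick up anything that arrived meanwhile */
}

/* Letting NAPI wind down naturally, as most hardware drivers do: the poll
 * handler simply completes early when there is no work left.
 */
static int demo_poll(struct napi_struct *napi, int budget)
{
	int work_done = 0;	/* this sketch processes nothing */

	if (work_done < budget)
		napi_complete(napi);
	return work_done;
}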

> @@ -1955,24 +1955,78 @@ int xenvif_kthread_guest_rx(void *data)
>  		 */
>  		if (unlikely(queue->vif->disabled && queue->id == 0))
>  			xenvif_carrier_off(queue->vif);
> +		else if (unlikely(test_and_clear_bit(QUEUE_STATUS_RX_PURGE_EVENT,
> +						     &queue->status))) {
> +			/* Either the last unsuccesful skb or at least 1 slot
> +			 * should fit
> +			 */
> +			int needed = queue->rx_last_skb_slots ?
> +				     queue->rx_last_skb_slots : 1;
>  
> -		if (kthread_should_stop())
> -			break;
> -
> -		if (queue->rx_queue_purge) {
> +			/* It is assumed that if the guest post new
> +			 * slots after this, the RX interrupt will set
> +			 * the bit and wake up the thread again
> +			 */
> +			set_bit(QUEUE_STATUS_RX_STALLED, &queue->status);

This big if needs to go in a new function.
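Something along these lines, perhaps (only a sketch of the shape; the
helper name is made up and the carrier-on/queue-restart branches are
elided, so don't take it as the final code):

static void xenvif_rx_purge_event(struct xenvif_queue *queue)
{
	/* Either the last unsuccessful skb or at least 1 slot should fit */
	int needed = queue->rx_last_skb_slots ?
		     queue->rx_last_skb_slots : 1;

	set_bit(QUEUE_STATUS_RX_STALLED, &queue->status);

	if (!xenvif_rx_ring_slots_available(queue, needed)) {
		rtnl_lock();
		if (netif_carrier_ok(queue->vif->dev)) {
			/* Timer fired and there are still no slots */
			netif_carrier_off(queue->vif->dev);
			skb_queue_purge(&queue->rx_queue);
			queue->rx_last_skb_slots = 0;
		}
		rtnl_unlock();
	}
	/* ... carrier-on and queue-restart branches would follow here */
}

and the kthread loop just calls it when it sees the PURGE_EVENT bit.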

David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive
  2014-08-01 10:52   ` Wei Liu
@ 2014-08-04 15:04     ` Zoltan Kiss
  2014-08-04 15:04     ` Zoltan Kiss
  1 sibling, 0 replies; 19+ messages in thread
From: Zoltan Kiss @ 2014-08-04 15:04 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, David Vrabel, netdev, linux-kernel, xen-devel

On 01/08/14 11:52, Wei Liu wrote:
> On Wed, Jul 30, 2014 at 08:50:49PM +0100, Zoltan Kiss wrote:
>> --- a/drivers/net/xen-netback/common.h
>> +++ b/drivers/net/xen-netback/common.h
>> @@ -176,9 +176,9 @@ struct xenvif_queue { /* Per-queue data for xenvif */
>>   	struct xen_netif_rx_back_ring rx;
>>   	struct sk_buff_head rx_queue;
>>   	RING_IDX rx_last_skb_slots;
>> -	bool rx_queue_purge;
>> +	unsigned long status;
>>
>> -	struct timer_list wake_queue;
>> +	struct timer_list rx_stalled;
>>
>>   	struct gnttab_copy grant_copy_op[MAX_GRANT_COPY_OPS];
>>
>> @@ -198,9 +198,17 @@ struct xenvif_queue { /* Per-queue data for xenvif */
>>   	struct xenvif_stats stats;
>>   };
>>
>> -enum vif_state_bit_shift {
>> +enum state_bit_shift {
>
> Why change the name?
It's better to change this in the first patch. I've prepared this on a 
pre-multiqueue codebase, where these bits were for the whole vif, not 
for a queue.

>> +	 * reason for the carrier off, so it should kick the thread
>> +	 */
>
> I guess you're saying "so it should purge the queue"? I don't see
> xenvif_kick_thread guarded by QUEUE_STATUS_RX_STALLED but I do see
> QUEUE_STATUS_RX_PURGE_EVENT is set due to QUEUE_STATUS_RX_STALLED in rx
> interrupt handler.
>
>> +	QUEUE_STATUS_RX_STALLED
>
> What's the interaction of these two bits? I see in below code one is set
> under certain test result of the other in various places. It would be
> good if you can provide some detailed description on the control flow.
I'll extend that description. The STALLED bit marks the queues which were
blocked and brought the carrier off. Otherwise another queue which is
still operational could turn the carrier back on even while the blocked
ones are still blocked, which can cause fast carrier flapping. I think
it's better not to turn the carrier on until at least one of the blocked
queues comes back alive, but I'm open to other opinions.
Also, the conditions were wrong in the interrupt handler; I'll fix that too.
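To show the intent, the RX interrupt handler should do roughly this
(only a sketch, not the exact code I'll send):

static irqreturn_t xenvif_rx_interrupt(int irq, void *dev_id)
{
	struct xenvif_queue *queue = dev_id;

	/* If this queue stalled earlier and brought the carrier down,
	 * the guest posting new requests is the event that lets the
	 * thread re-evaluate, so signal it before kicking the thread.
	 */
	if (test_bit(QUEUE_STATUS_RX_STALLED, &queue->status))
		set_bit(QUEUE_STATUS_RX_PURGE_EVENT, &queue->status);

	xenvif_kick_thread(queue);

	return IRQ_HANDLED;
}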

>> @@ -1955,24 +1955,78 @@ int xenvif_kthread_guest_rx(void *data)
>>   		 */
>>   		if (unlikely(queue->vif->disabled && queue->id == 0))
>>   			xenvif_carrier_off(queue->vif);
>> +		else if (unlikely(test_and_clear_bit(QUEUE_STATUS_RX_PURGE_EVENT,
>> +						     &queue->status))) {
>> +			/* Either the last unsuccesful skb or at least 1 slot
>> +			 * should fit
>> +			 */
>> +			int needed = queue->rx_last_skb_slots ?
>> +				     queue->rx_last_skb_slots : 1;
>>
>> -		if (kthread_should_stop())
>> -			break;
>> -
>
> Is it really necessary to move kthread_should_stop after the queue
> purging logic? Wouldn't it be better to leave it here so that we don't
> do any more unnecessary work and just quit?
I just left it where it was, after the check for vif->disabled, but
indeed we can move it before that.
>
>> -		if (queue->rx_queue_purge) {
>> +			/* It is assumed that if the guest post new
>> +			 * slots after this, the RX interrupt will set
>> +			 * the bit and wake up the thread again
>> +			 */
>
> This needs more detail. Reading this comment and the "set_bit" that
> follows is a bit confusing.
>
> I can tell after reading the code that "the bit" refers to
> QUEUE_STATUS_RX_PURGE_EVENT. The following line sets
> QUEUE_STATUS_RX_STALLED which has the effect of setting
> QUEUE_STATUS_RX_PURGE_EVENT in rx interrupt handler.
>
Ok
>> +			set_bit(QUEUE_STATUS_RX_STALLED, &queue->status);
>> +			if (!xenvif_rx_ring_slots_available(queue, needed)) {
>> +				rtnl_lock();
>> +				if (netif_carrier_ok(queue->vif->dev)) {
>> +					/* Timer fired and there are still no
>> +					 * slots. Turn off everything except the
>> +					 * interrupts
>> +					 */
>> +					netif_carrier_off(queue->vif->dev);
>> +					skb_queue_purge(&queue->rx_queue);
>> +					queue->rx_last_skb_slots = 0;
>> +					if (net_ratelimit())
>> +						netdev_err(queue->vif->dev, "Carrier off due to lack of guest response on queue %d\n", queue->id);
>> +				}
>> +				rtnl_unlock();
>> +			} else if (!netif_carrier_ok(queue->vif->dev)) {
>> +				unsigned int num_queues = queue->vif->num_queues;
>> +				unsigned int i;
>> +				/* The carrier was down, but an interrupt kicked
>> +				 * the thread again after new requests were
>> +				 * posted
>> +				 */
>> +				clear_bit(QUEUE_STATUS_RX_STALLED,
>> +					  &queue->status);
>> +				rtnl_lock();
>> +				netif_carrier_on(queue->vif->dev);
>> +				netif_tx_wake_all_queues(queue->vif->dev);
>> +				rtnl_unlock();
>> +
>> +				for (i = 0; i < num_queues; ++i) {
>> +					struct xenvif_queue *temp = &queue->vif->queues[i];
>> +					xenvif_napi_schedule_or_enable_events(temp);
>> +				}
>> +				if (net_ratelimit())
>> +					netdev_err(queue->vif->dev, "Carrier on again\n");
>> +				continue;
>> +			} else {
>> +				/* Queuing were stopped, but the guest posted
>> +				 * new requests
>> +				 */
>> +				clear_bit(QUEUE_STATUS_RX_STALLED,
>> +					  &queue->status);
>> +				del_timer_sync(&queue->rx_stalled);
>> +				xenvif_start_queue(queue);
>> +				continue;
>> +			}
>> +		} else if (!netif_carrier_ok(queue->vif->dev)) {
>> +			/* Another queue stalled and turned the carrier off, so
>> +			 * purge our internal queue
>> +			 */
>
> If I'm not misreading, this branch doesn't have
> QUEUE_STATUS_RX_PURGE_EVENT set but you still purge the internal queue.
> So QUEUE_STATUS_RX_PURGE_EVENT only controls qdisc purging? If so, I
> think you need to say that clearly in the comment above.
Nope. PURGE_EVENT is only set for the queues which were blocked in 
start_xmit, and the STALLED bit makes sure that only they can turn the 
carrier on again. This branch is for the queues which are still 
functioning. The qdisc was already purged when the carrier was turned off.
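To make the arming path clearer, roughly this is what happens (sketch
only; the callback name and the timeout variable are placeholders, the
real patch differs in details):

/* In xenvif_start_xmit(), when there is no room on the ring for the skb,
 * the subqueue is stopped and the timer armed (sketch):
 *
 *	netif_tx_stop_queue(netdev_get_tx_queue(dev, queue->id));
 *	mod_timer(&queue->rx_stalled,
 *		  jiffies + rx_drain_timeout_jiffies);
 */

/* When the timer fires, only queues that went through the path above
 * get the PURGE_EVENT bit, which is also what allows them to turn the
 * carrier back on later. The callback name is a placeholder.
 */
static void xenvif_rx_stalled_callback(unsigned long data)
{
	struct xenvif_queue *queue = (struct xenvif_queue *)data;

	set_bit(QUEUE_STATUS_RX_PURGE_EVENT, &queue->status);
	xenvif_kick_thread(queue);
}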
>
> I've tried my best to understand what's going on in this patch. AFAICT
> this looks like an improvement but not a bug fix to me. I need to
> justify the complexity introduced and the benefit gained, but so far I
> don't have a very good picture. I think it's better to have a 00/N patch
> to describe the high level design and motivation, as DaveM suggested.
Indeed, it's not trivial, as there are a lot of racing possibilities due 
to multiqueue. I'll resend with more comments and explanation.
>
> Wei.
>
>>   			skb_queue_purge(&queue->rx_queue);
>> -			queue->rx_queue_purge = false;
>> +			queue->rx_last_skb_slots = 0;
>>   		}
>>
>> +		if (kthread_should_stop())
>> +			break;
>> +
>>   		if (!skb_queue_empty(&queue->rx_queue))
>>   			xenvif_rx_action(queue);
>>
>> -		if (skb_queue_empty(&queue->rx_queue) &&
>> -		    xenvif_queue_stopped(queue)) {
>> -			del_timer_sync(&queue->wake_queue);
>> -			xenvif_start_queue(queue);
>> -		}
>> -
>>   		cond_resched();
>>   	}
>>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive
  2014-08-04 13:35   ` [Xen-devel] " David Vrabel
  2014-08-04 15:13     ` Zoltan Kiss
@ 2014-08-04 15:13     ` Zoltan Kiss
  2014-08-08 16:33       ` Stephen Hemminger
  2014-08-08 16:33       ` [Xen-devel] " Stephen Hemminger
  1 sibling, 2 replies; 19+ messages in thread
From: Zoltan Kiss @ 2014-08-04 15:13 UTC (permalink / raw)
  To: David Vrabel, Wei Liu, Ian Campbell; +Cc: netdev, xen-devel, linux-kernel

On 04/08/14 14:35, David Vrabel wrote:
> On 30/07/14 20:50, Zoltan Kiss wrote:
>> Currently, when the guest is not able to receive more packets, the qdisc
>> layer starts a timer, and when it fires, the qdisc is started again to
>> try to deliver a packet. This is a very slow way to drain the queues,
>> consumes unnecessary resources and slows down other guests' shutdown.
>> This patch changes the behaviour by turning the carrier off when that
>> timer fires, so all the packets which were stuck waiting for that vif
>> are freed up. Instead of the rx_queue_purge bool it uses the
>> VIF_STATUS_RX_PURGE_EVENT bit to signal the thread that either the
>> timeout happened or an RX interrupt arrived, so the thread can check
>> what it should do. It also disables NAPI, so the guest can't transmit,
>> but leaves the interrupts on, so it can recover.
>
> I don't think you should disable NAPI, particularly since you have to
> fiddle with the bits directly instead of using the API to do so.
Since then I've found a much better way, see xenvif_poll.
>
> I looked at some hardware drivers and none of them disabled NAPI -- they
> just allow it to naturally end once hardware queues are drained.
>
> netback is a little different as a frontend could stop processing
> to-guest packets but continue sending from-guest ones.  I don't see any
> problem with allowing this.
Yes, that might be the case. Just a hunch tells me it could be wrong,
but I couldn't come up with a scenario to prove it. Let's hear others'
opinions about it:
Is it good or bad if a guest interface with the carrier off still keeps
sending packets towards the backend? With hardware drivers it can be
trusted that when the carrier goes down there won't be new packets
(apart from a few still in the receive queue of the driver), so NAPI
will be descheduled automatically. But in this scenario the guest could
still post new packets on the ring, and they will be received unless
NAPI is explicitly descheduled.
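For reference, this is the kind of thing I have in mind in xenvif_poll
(just a sketch; whether the check keys off the STALLED bit or the
carrier state is still open):

static int xenvif_poll(struct napi_struct *napi, int budget)
{
	struct xenvif_queue *queue =
		container_of(napi, struct xenvif_queue, napi);
	int work_done;

	/* Pretend there is nothing to do while the queue is stalled, so
	 * NAPI completes and the guest can't keep transmitting, but the
	 * interrupt stays enabled so the queue can come back.
	 */
	if (unlikely(test_bit(QUEUE_STATUS_RX_STALLED, &queue->status))) {
		napi_complete(napi);
		return 0;
	}

	work_done = xenvif_tx_action(queue, budget);

	if (work_done < budget) {
		napi_complete(napi);
		xenvif_napi_schedule_or_enable_events(queue);
	}

	return work_done;
}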

>
>> @@ -1955,24 +1955,78 @@ int xenvif_kthread_guest_rx(void *data)
>>   		 */
>>   		if (unlikely(queue->vif->disabled && queue->id == 0))
>>   			xenvif_carrier_off(queue->vif);
>> +		else if (unlikely(test_and_clear_bit(QUEUE_STATUS_RX_PURGE_EVENT,
>> +						     &queue->status))) {
>> +			/* Either the last unsuccesful skb or at least 1 slot
>> +			 * should fit
>> +			 */
>> +			int needed = queue->rx_last_skb_slots ?
>> +				     queue->rx_last_skb_slots : 1;
>>
>> -		if (kthread_should_stop())
>> -			break;
>> -
>> -		if (queue->rx_queue_purge) {
>> +			/* It is assumed that if the guest post new
>> +			 * slots after this, the RX interrupt will set
>> +			 * the bit and wake up the thread again
>> +			 */
>> +			set_bit(QUEUE_STATUS_RX_STALLED, &queue->status);
>
> This big if needs to go in a new function.
>
> David
>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive
  2014-08-04 15:13     ` [Xen-devel] " Zoltan Kiss
  2014-08-08 16:33       ` Stephen Hemminger
@ 2014-08-08 16:33       ` Stephen Hemminger
  2014-08-11 12:31         ` Zoltan Kiss
  2014-08-11 12:31         ` [Xen-devel] " Zoltan Kiss
  1 sibling, 2 replies; 19+ messages in thread
From: Stephen Hemminger @ 2014-08-08 16:33 UTC (permalink / raw)
  To: Zoltan Kiss
  Cc: David Vrabel, Wei Liu, Ian Campbell, netdev, xen-devel, linux-kernel

This idea of bouncing the carrier is wrong. If the guest is flow blocked
you don't want to toggle the carrier. That will cause problems because
applications that watch for carrier transitions, like routing daemons,
will be notified.

If a routing daemon is running this will also lead to link flapping,
which is very bad and causes lots of extra work for peer routing daemons.

Carrier is not a suitable flow control mechanism.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive
  2014-08-08 16:33       ` [Xen-devel] " Stephen Hemminger
  2014-08-11 12:31         ` Zoltan Kiss
@ 2014-08-11 12:31         ` Zoltan Kiss
  2014-08-11 12:37           ` David Vrabel
  2014-08-11 12:37           ` [Xen-devel] " David Vrabel
  1 sibling, 2 replies; 19+ messages in thread
From: Zoltan Kiss @ 2014-08-11 12:31 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: David Vrabel, Wei Liu, Ian Campbell, netdev, xen-devel, linux-kernel

On 08/08/14 17:33, Stephen Hemminger wrote:
> This idea of bouncing the carrier is wrong. If the guest is flow blocked
> you don't want to toggle the carrier. That will cause problems because
> applications that watch for carrier transitions, like routing daemons,
> will be notified.
>
> If a routing daemon is running this will also lead to link flapping,
> which is very bad and causes lots of extra work for peer routing daemons.
>
> Carrier is not a suitable flow control mechanism.
>

Hi,

Indeed, I also had some concerns about using the carrier state to solve
this problem, as the notifier can kick off a lot of things, and flapping
is not impossible. That's why the frontend has 10 seconds by default to
do something. Practice shows that if a frontend can't do any receive
work for that long, it is unlikely to be able to do it soon.
So in the worst case carrier flapping can happen only every 10 seconds;
I think that's manageable. I think the majority of users have simple
bridged setups where this carrier change doesn't trigger any expensive
operation.
The reason we chose a carrier change for this purpose is that we needed
something which ditched everything in the qdisc and made sure nothing
would be queued up there until there is a chance we can transmit to the
guest. Calling dev_deactivate straight away seemed less appropriate.

Regards,

Zoltan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive
  2014-08-11 12:31         ` [Xen-devel] " Zoltan Kiss
  2014-08-11 12:37           ` David Vrabel
@ 2014-08-11 12:37           ` David Vrabel
  1 sibling, 0 replies; 19+ messages in thread
From: David Vrabel @ 2014-08-11 12:37 UTC (permalink / raw)
  To: Zoltan Kiss, Stephen Hemminger
  Cc: Wei Liu, Ian Campbell, netdev, xen-devel, linux-kernel

On 11/08/14 13:31, Zoltan Kiss wrote:
> On 08/08/14 17:33, Stephen Hemminger wrote:
>> This idea of bouncing the carrier is wrong. If the guest is flow blocked
>> you don't want to toggle the carrier. That will cause problems because
>> applications that watch for carrier transitions, like routing daemons,
>> will be notified.
>>
>> If a routing daemon is running this will also lead to link flapping,
>> which is very bad and causes lots of extra work for peer routing daemons.
>>
>> Carrier is not a suitable flow control mechanism.
>>
> 
> Hi,
> 
> Indeed, I also had some concerns about using the carrier state to solve
> this problem, as the notifier can kick off a lot of things, and flapping
> is not impossible. That's why the frontend has 10 seconds by default to
> do something. Practice shows that if a frontend can't do any receive
> work for that long, it is unlikely to be able to do it soon.
> So in the worst case carrier flapping can happen only every 10 seconds;
> I think that's manageable. I think the majority of users have simple
> bridged setups where this carrier change doesn't trigger any expensive
> operation.
> The reason we chose a carrier change for this purpose is that we needed
> something which ditched everything in the qdisc and made sure nothing
> would be queued up there until there is a chance we can transmit to the
> guest. Calling dev_deactivate straight away seemed less appropriate.

I do think we need to revisit this and introduce a per-queue
stop_and_flush operation we can use instead.
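Something like the following, although the hard part is the qdisc side
(this is a hypothetical sketch, not an existing API; flushing what the
qdisc has already queued for this subqueue would still need core
support):

static void xenvif_queue_stop_and_flush(struct xenvif_queue *queue)
{
	/* Stop the matching tx subqueue and drop what netback itself has
	 * queued for the guest, without touching the carrier.
	 */
	netif_tx_stop_queue(netdev_get_tx_queue(queue->vif->dev,
						queue->id));
	skb_queue_purge(&queue->rx_queue);
	queue->rx_last_skb_slots = 0;
}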

David

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2014-08-11 12:37 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-30 19:50 [PATCH] xen-netback: Using a new state bit instead of carrier Zoltan Kiss
2014-07-30 19:50 ` [PATCH] xen-netback: Turn off the carrier if the guest is not able to receive Zoltan Kiss
2014-08-01  4:53   ` David Miller
2014-08-01  4:53   ` David Miller
2014-08-01 10:52   ` Wei Liu
2014-08-01 10:52   ` Wei Liu
2014-08-04 15:04     ` Zoltan Kiss
2014-08-04 15:04     ` Zoltan Kiss
2014-08-04 13:35   ` [Xen-devel] " David Vrabel
2014-08-04 15:13     ` Zoltan Kiss
2014-08-04 15:13     ` [Xen-devel] " Zoltan Kiss
2014-08-08 16:33       ` Stephen Hemminger
2014-08-08 16:33       ` [Xen-devel] " Stephen Hemminger
2014-08-11 12:31         ` Zoltan Kiss
2014-08-11 12:31         ` [Xen-devel] " Zoltan Kiss
2014-08-11 12:37           ` David Vrabel
2014-08-11 12:37           ` [Xen-devel] " David Vrabel
2014-08-04 13:35   ` David Vrabel
2014-07-30 19:50 ` Zoltan Kiss
