netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] sfc: efx: add support for skb->xmit_more
@ 2014-10-13 16:59 Daniel Borkmann
  2014-10-13 18:02 ` Edward Cree
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Borkmann @ 2014-10-13 16:59 UTC (permalink / raw)
  To: davem; +Cc: nikolay, netdev, Shradha Shah

Add support for xmit_more batching: efx has BQL support, thus we need
to only move the queue hang check before the hw tail pointer write
and test for xmit_more bit resp. whether the queue/stack has stopped.
Thanks to Nikolay Aleksandrov who had a Solarflare NIC to test!

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Nikolay Aleksandrov <nikolay@redhat.com>
Cc: Shradha Shah <sshah@solarflare.com>
---
 drivers/net/ethernet/sfc/tx.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index 3206098..566432c 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
@@ -439,13 +439,13 @@ finish_packet:
 
 	netdev_tx_sent_queue(tx_queue->core_txq, skb->len);
 
+	efx_tx_maybe_stop_queue(tx_queue);
+
 	/* Pass off to hardware */
-	efx_nic_push_buffers(tx_queue);
+	if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+		efx_nic_push_buffers(tx_queue);
 
 	tx_queue->tx_packets++;
-
-	efx_tx_maybe_stop_queue(tx_queue);
-
 	return NETDEV_TX_OK;
 
  dma_err:
@@ -1308,11 +1308,12 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
 
 	netdev_tx_sent_queue(tx_queue->core_txq, skb->len);
 
-	/* Pass off to hardware */
-	efx_nic_push_buffers(tx_queue);
-
 	efx_tx_maybe_stop_queue(tx_queue);
 
+	/* Pass off to hardware */
+	if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+		efx_nic_push_buffers(tx_queue);
+
 	tx_queue->tso_bursts++;
 	return NETDEV_TX_OK;
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] sfc: efx: add support for skb->xmit_more
  2014-10-13 16:59 [PATCH] sfc: efx: add support for skb->xmit_more Daniel Borkmann
@ 2014-10-13 18:02 ` Edward Cree
  2014-10-13 19:18   ` Daniel Borkmann
  0 siblings, 1 reply; 13+ messages in thread
From: Edward Cree @ 2014-10-13 18:02 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: davem, nikolay, netdev, Shradha Shah, linux-net-drivers

On 13/10/14 17:59, Daniel Borkmann wrote:
> Add support for xmit_more batching: efx has BQL support, thus we need
> to only move the queue hang check before the hw tail pointer write
> and test for xmit_more bit resp. whether the queue/stack has stopped.
> Thanks to Nikolay Aleksandrov who had a Solarflare NIC to test!
I see two issues here.

1) The error handling path (labels dma_err: and err:) will unwind back
to tx_queue->insert_count, which (AFAICT) only gets updated by
efx_nic_push_buffers() (at least on EF10, where that calls
efx_ef10_tx_write()).
So if we transmit a bunch of skbs with xmit_more, then get an error, we
will unwind all the way back to the start, whereas (again, AFAICT) the
previous skbs were nicely completed and safe to send.
The unwind will leave us in a good state, but we'll have failed to send
some packets that there was nothing wrong with.
I think the appropriate fix for this would be to maintain two separate
insert_counts, as xmit_more means we can't conflate "last write pointer
pushed" and "write pointer at start of this skb" any longer.

2) I think we shouldn't do PIO when xmit_more is set - it's probably bad
for performance (though I haven't measured), and I'm not entirely
convinced it will behave correctly.  I think
efx_nic_tx_is_empty(tx_queue) will return true if we have xmit_more'd a
PIO packet, enabling us to potentially PIO the next packet as well, thus
overwriting the PIO buffer before the NIC's had a chance to read it.
I'm also unsure about what happens to TX Push (efx_ef10_push_tx_desc),
for similar reasons.
So I think TX PIO and TX Push both need to be suppressed when
skb->xmit_more (as a correctness issue), and should also be suppressed
on the !xmit_more skb that 'uncorks' a previous row of xmit_mores - as a
performance issue at the least, and for TX Push I'm also unsure about
the correctness (though I haven't worked that one through in detail).

Until these issues are addressed, I have to NAK this.  Tomorrow I'll try
to rustle up an rfc patch that handles these issues, unless someone
beats me to it.

-Edward

> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> Acked-by: Nikolay Aleksandrov <nikolay@redhat.com>
> Cc: Shradha Shah <sshah@solarflare.com>
> ---
>  drivers/net/ethernet/sfc/tx.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
> index 3206098..566432c 100644
> --- a/drivers/net/ethernet/sfc/tx.c
> +++ b/drivers/net/ethernet/sfc/tx.c
> @@ -439,13 +439,13 @@ finish_packet:
>  
>  	netdev_tx_sent_queue(tx_queue->core_txq, skb->len);
>  
> +	efx_tx_maybe_stop_queue(tx_queue);
> +
>  	/* Pass off to hardware */
> -	efx_nic_push_buffers(tx_queue);
> +	if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
> +		efx_nic_push_buffers(tx_queue);
>  
>  	tx_queue->tx_packets++;
> -
> -	efx_tx_maybe_stop_queue(tx_queue);
> -
>  	return NETDEV_TX_OK;
>  
>   dma_err:
> @@ -1308,11 +1308,12 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
>  
>  	netdev_tx_sent_queue(tx_queue->core_txq, skb->len);
>  
> -	/* Pass off to hardware */
> -	efx_nic_push_buffers(tx_queue);
> -
>  	efx_tx_maybe_stop_queue(tx_queue);
>  
> +	/* Pass off to hardware */
> +	if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
> +		efx_nic_push_buffers(tx_queue);
> +
>  	tx_queue->tso_bursts++;
>  	return NETDEV_TX_OK;
>  

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] sfc: efx: add support for skb->xmit_more
  2014-10-13 18:02 ` Edward Cree
@ 2014-10-13 19:18   ` Daniel Borkmann
  2014-10-14 18:41     ` [PATCH RFC net-next] sfc: " Edward Cree
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Borkmann @ 2014-10-13 19:18 UTC (permalink / raw)
  To: Edward Cree; +Cc: davem, nikolay, netdev, Shradha Shah, linux-net-drivers

On 10/13/2014 08:02 PM, Edward Cree wrote:
...
> Until these issues are addressed, I have to NAK this.  Tomorrow I'll try
> to rustle up an rfc patch that handles these issues, unless someone
> beats me to it.

Thanks for your feedback! Yes, please feel free to take this onwards and
integrate/adapt it to your needs for sfc's xmit_more support.

Thanks a lot,
Daniel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH RFC net-next] sfc: add support for skb->xmit_more
  2014-10-13 19:18   ` Daniel Borkmann
@ 2014-10-14 18:41     ` Edward Cree
  2014-10-14 21:15       ` David Miller
  2014-10-16 16:36       ` Robert Stonehouse
  0 siblings, 2 replies; 13+ messages in thread
From: Edward Cree @ 2014-10-14 18:41 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Edward Cree, davem, nikolay, netdev, Shradha Shah, Jon Cooper,
	linux-net-drivers

Don't ring the doorbell, and don't do PIO.  This will also prevent
 TX Push, because there will be more than one buffer waiting when
 the doorbell is rung.

Signed-off-by: Edward Cree <ecree@solarflare.com>
---

 drivers/net/ethernet/sfc/nic.h | 29 +++++++++++++++++++++++-----
 drivers/net/ethernet/sfc/tx.c  | 43 +++++++++++++++++-------------------------
 2 files changed, 41 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h
index 60f8514..f77cce0 100644
--- a/drivers/net/ethernet/sfc/nic.h
+++ b/drivers/net/ethernet/sfc/nic.h
@@ -71,9 +71,17 @@ efx_tx_desc(struct efx_tx_queue *tx_queue, unsigned int index)
 	return ((efx_qword_t *) (tx_queue->txd.buf.addr)) + index;
 }
 
-/* Report whether the NIC considers this TX queue empty, given the
- * write_count used for the last doorbell push.  May return false
- * negative.
+/* Get partner of a TX queue, seen as part of the same net core queue */
+static struct efx_tx_queue *efx_tx_queue_partner(struct efx_tx_queue *tx_queue)
+{
+	if (tx_queue->queue & EFX_TXQ_TYPE_OFFLOAD)
+		return tx_queue - EFX_TXQ_TYPE_OFFLOAD;
+	else
+		return tx_queue + EFX_TXQ_TYPE_OFFLOAD;
+}
+
+/* Report whether this TX queue would be empty for the given write_count.
+ * May return false negative.
  */
 static inline bool __efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue,
 					 unsigned int write_count)
@@ -86,9 +94,18 @@ static inline bool __efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue,
 	return ((empty_read_count ^ write_count) & ~EFX_EMPTY_COUNT_VALID) == 0;
 }
 
-static inline bool efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue)
+/* Decide whether we can use TX PIO, ie. write packet data directly into
+ * a buffer on the device.  This can reduce latency at the expense of
+ * throughput, so we only do this if both hardware and software TX rings
+ * are empty.  This also ensures that only one packet at a time can be
+ * using the PIO buffer.
+ */
+static inline bool efx_nic_may_tx_pio(struct efx_tx_queue *tx_queue)
 {
-	return __efx_nic_tx_is_empty(tx_queue, tx_queue->write_count);
+	struct efx_tx_queue *partner = efx_tx_queue_partner(tx_queue);
+	return tx_queue->piobuf &&
+	       __efx_nic_tx_is_empty(tx_queue, tx_queue->insert_count) &&
+	       __efx_nic_tx_is_empty(partner, partner->insert_count);
 }
 
 /* Decide whether to push a TX descriptor to the NIC vs merely writing
@@ -96,6 +113,8 @@ static inline bool efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue)
  * descriptor to an empty queue, but is otherwise pointless.  Further,
  * Falcon and Siena have hardware bugs (SF bug 33851) that may be
  * triggered if we don't check this.
+ * We use the write_count used for the last doorbell push, to get the
+ * NIC's view of the tx queue.
  */
 static inline bool efx_nic_may_push_tx_desc(struct efx_tx_queue *tx_queue,
 					    unsigned int write_count)
diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index 3206098..aaf2987 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
@@ -132,15 +132,6 @@ unsigned int efx_tx_max_skb_descs(struct efx_nic *efx)
 	return max_descs;
 }
 
-/* Get partner of a TX queue, seen as part of the same net core queue */
-static struct efx_tx_queue *efx_tx_queue_partner(struct efx_tx_queue *tx_queue)
-{
-	if (tx_queue->queue & EFX_TXQ_TYPE_OFFLOAD)
-		return tx_queue - EFX_TXQ_TYPE_OFFLOAD;
-	else
-		return tx_queue + EFX_TXQ_TYPE_OFFLOAD;
-}
-
 static void efx_tx_maybe_stop_queue(struct efx_tx_queue *txq1)
 {
 	/* We need to consider both queues that the net core sees as one */
@@ -344,6 +335,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 	struct efx_nic *efx = tx_queue->efx;
 	struct device *dma_dev = &efx->pci_dev->dev;
 	struct efx_tx_buffer *buffer;
+	unsigned int old_insert_count = tx_queue->insert_count;
 	skb_frag_t *fragment;
 	unsigned int len, unmap_len = 0;
 	dma_addr_t dma_addr, unmap_addr = 0;
@@ -351,8 +343,6 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 	unsigned short dma_flags;
 	int i = 0;
 
-	EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
-
 	if (skb_shinfo(skb)->gso_size)
 		return efx_enqueue_skb_tso(tx_queue, skb);
 
@@ -369,9 +359,8 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 
 	/* Consider using PIO for short packets */
 #ifdef EFX_USE_PIO
-	if (skb->len <= efx_piobuf_size && tx_queue->piobuf &&
-	    efx_nic_tx_is_empty(tx_queue) &&
-	    efx_nic_tx_is_empty(efx_tx_queue_partner(tx_queue))) {
+	if (skb->len <= efx_piobuf_size && !skb->xmit_more &&
+	    efx_nic_may_tx_pio(tx_queue)) {
 		buffer = efx_enqueue_skb_pio(tx_queue, skb);
 		dma_flags = EFX_TX_BUF_OPTION;
 		goto finish_packet;
@@ -439,13 +428,14 @@ finish_packet:
 
 	netdev_tx_sent_queue(tx_queue->core_txq, skb->len);
 
+	efx_tx_maybe_stop_queue(tx_queue);
+
 	/* Pass off to hardware */
-	efx_nic_push_buffers(tx_queue);
+	if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+		efx_nic_push_buffers(tx_queue);
 
 	tx_queue->tx_packets++;
 
-	efx_tx_maybe_stop_queue(tx_queue);
-
 	return NETDEV_TX_OK;
 
  dma_err:
@@ -458,7 +448,7 @@ finish_packet:
 	dev_kfree_skb_any(skb);
 
 	/* Work backwards until we hit the original insert pointer value */
-	while (tx_queue->insert_count != tx_queue->write_count) {
+	while (tx_queue->insert_count != old_insert_count) {
 		unsigned int pkts_compl = 0, bytes_compl = 0;
 		--tx_queue->insert_count;
 		buffer = __efx_tx_queue_get_insert_buffer(tx_queue);
@@ -989,12 +979,13 @@ static int efx_tso_put_header(struct efx_tx_queue *tx_queue,
 /* Remove buffers put into a tx_queue.  None of the buffers must have
  * an skb attached.
  */
-static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue)
+static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue,
+			       unsigned int insert_count)
 {
 	struct efx_tx_buffer *buffer;
 
 	/* Work backwards until we hit the original insert pointer value */
-	while (tx_queue->insert_count != tx_queue->write_count) {
+	while (tx_queue->insert_count != insert_count) {
 		--tx_queue->insert_count;
 		buffer = __efx_tx_queue_get_insert_buffer(tx_queue);
 		efx_dequeue_buffer(tx_queue, buffer, NULL, NULL);
@@ -1258,14 +1249,13 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
 			       struct sk_buff *skb)
 {
 	struct efx_nic *efx = tx_queue->efx;
+	unsigned int old_insert_count = tx_queue->insert_count;
 	int frag_i, rc;
 	struct tso_state state;
 
 	/* Find the packet protocol and sanity-check it */
 	state.protocol = efx_tso_check_protocol(skb);
 
-	EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
-
 	rc = tso_start(&state, efx, skb);
 	if (rc)
 		goto mem_err;
@@ -1308,11 +1298,12 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
 
 	netdev_tx_sent_queue(tx_queue->core_txq, skb->len);
 
-	/* Pass off to hardware */
-	efx_nic_push_buffers(tx_queue);
-
 	efx_tx_maybe_stop_queue(tx_queue);
 
+	/* Pass off to hardware */
+	if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+		efx_nic_push_buffers(tx_queue);
+
 	tx_queue->tso_bursts++;
 	return NETDEV_TX_OK;
 
@@ -1336,6 +1327,6 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
 		dma_unmap_single(&efx->pci_dev->dev, state.header_dma_addr,
 				 state.header_unmap_len, DMA_TO_DEVICE);
 
-	efx_enqueue_unwind(tx_queue);
+	efx_enqueue_unwind(tx_queue, old_insert_count);
 	return NETDEV_TX_OK;
 }
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC net-next] sfc: add support for skb->xmit_more
  2014-10-14 18:41     ` [PATCH RFC net-next] sfc: " Edward Cree
@ 2014-10-14 21:15       ` David Miller
  2014-10-15 11:05         ` Edward Cree
  2014-10-16 16:36       ` Robert Stonehouse
  1 sibling, 1 reply; 13+ messages in thread
From: David Miller @ 2014-10-14 21:15 UTC (permalink / raw)
  To: ecree; +Cc: dborkman, nikolay, netdev, sshah, jcooper, linux-net-drivers

From: Edward Cree <ecree@solarflare.com>
Date: Tue, 14 Oct 2014 19:41:37 +0100

> Don't ring the doorbell, and don't do PIO.  This will also prevent
>  TX Push, because there will be more than one buffer waiting when
>  the doorbell is rung.
> 
> Signed-off-by: Edward Cree <ecree@solarflare.com>

This looks good to me, mind if I apply this now?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC net-next] sfc: add support for skb->xmit_more
  2014-10-14 21:15       ` David Miller
@ 2014-10-15 11:05         ` Edward Cree
  2014-10-15 16:20           ` David Miller
  0 siblings, 1 reply; 13+ messages in thread
From: Edward Cree @ 2014-10-15 11:05 UTC (permalink / raw)
  To: David Miller; +Cc: dborkman, nikolay, netdev, sshah, jcooper, linux-net-drivers

On 14/10/14 22:15, David Miller wrote:
> From: Edward Cree <ecree@solarflare.com>
> Date: Tue, 14 Oct 2014 19:41:37 +0100
>
>> Don't ring the doorbell, and don't do PIO.  This will also prevent
>>  TX Push, because there will be more than one buffer waiting when
>>  the doorbell is rung.
>>
>> Signed-off-by: Edward Cree <ecree@solarflare.com>
> This looks good to me, mind if I apply this now?
I'd rather wait until Jon Cooper's had a chance to look at it; he
understands our TX path better than I.

-Edward

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC net-next] sfc: add support for skb->xmit_more
  2014-10-15 11:05         ` Edward Cree
@ 2014-10-15 16:20           ` David Miller
  2014-10-16 15:42             ` Jonathan Cooper
  0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2014-10-15 16:20 UTC (permalink / raw)
  To: ecree; +Cc: dborkman, nikolay, netdev, sshah, jcooper, linux-net-drivers

From: Edward Cree <ecree@solarflare.com>
Date: Wed, 15 Oct 2014 12:05:35 +0100

> On 14/10/14 22:15, David Miller wrote:
>> From: Edward Cree <ecree@solarflare.com>
>> Date: Tue, 14 Oct 2014 19:41:37 +0100
>>
>>> Don't ring the doorbell, and don't do PIO.  This will also prevent
>>>  TX Push, because there will be more than one buffer waiting when
>>>  the doorbell is rung.
>>>
>>> Signed-off-by: Edward Cree <ecree@solarflare.com>
>> This looks good to me, mind if I apply this now?
> I'd rather wait until Jon Cooper's had a chance to look at it; he
> understands our TX path better than I.

Ok.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC net-next] sfc: add support for skb->xmit_more
  2014-10-15 16:20           ` David Miller
@ 2014-10-16 15:42             ` Jonathan Cooper
  0 siblings, 0 replies; 13+ messages in thread
From: Jonathan Cooper @ 2014-10-16 15:42 UTC (permalink / raw)
  To: David Miller; +Cc: ecree, dborkman, nikolay, netdev, sshah, linux-net-drivers

On 15/10/14 17:20, David Miller wrote:
> From: Edward Cree <ecree@solarflare.com>
> Date: Wed, 15 Oct 2014 12:05:35 +0100
>
>> On 14/10/14 22:15, David Miller wrote:
>>> From: Edward Cree <ecree@solarflare.com>
>>> Date: Tue, 14 Oct 2014 19:41:37 +0100
>>>
>>>> Don't ring the doorbell, and don't do PIO.  This will also prevent
>>>>   TX Push, because there will be more than one buffer waiting when
>>>>   the doorbell is rung.
>>>>
>>>> Signed-off-by: Edward Cree <ecree@solarflare.com>
>>> This looks good to me, mind if I apply this now?
>> I'd rather wait until Jon Cooper's had a chance to look at it; he
>> understands our TX path better than I.
> Ok.
Looks good to me.

Acked-by: Jon Cooper <jcooper@solarflare.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC net-next] sfc: add support for skb->xmit_more
  2014-10-14 18:41     ` [PATCH RFC net-next] sfc: " Edward Cree
  2014-10-14 21:15       ` David Miller
@ 2014-10-16 16:36       ` Robert Stonehouse
  2014-10-17 14:32         ` [PATCH " Edward Cree
  1 sibling, 1 reply; 13+ messages in thread
From: Robert Stonehouse @ 2014-10-16 16:36 UTC (permalink / raw)
  To: Edward Cree, Daniel Borkmann
  Cc: davem, nikolay, netdev, Shradha Shah, Jon Cooper, linux-net-drivers

On 14/10/14 19:41, Edward Cree wrote:
> Don't ring the doorbell, and don't do PIO.  This will also prevent
>   TX Push, because there will be more than one buffer waiting when
>   the doorbell is rung.
>
> Signed-off-by: Edward Cree <ecree@solarflare.com>
> ---
>
> @@ -351,8 +343,6 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
>   	unsigned short dma_flags;
>   	int i = 0;
>   
> -	EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
> -
>   	if (skb_shinfo(skb)->gso_size)
>   		return efx_enqueue_skb_tso(tx_queue, skb);

Would it be possible to keep a weaker version of this check i.e.
EFX_BUG_ON_PARANOID(tx_queue->write_count > tx_queue->insert_count);

> @@ -1258,14 +1249,13 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
>   			       struct sk_buff *skb)
>   {
>   	struct efx_nic *efx = tx_queue->efx;
> +	unsigned int old_insert_count = tx_queue->insert_count;
>   	int frag_i, rc;
>   	struct tso_state state;
>   
>   	/* Find the packet protocol and sanity-check it */
>   	state.protocol = efx_tso_check_protocol(skb);
>   
> -	EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
> -
>   	rc = tso_start(&state, efx, skb);
>   	if (rc)
>   		goto mem_err;

The same would apply here.

Thanks
Rob

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH net-next] sfc: add support for skb->xmit_more
  2014-10-16 16:36       ` Robert Stonehouse
@ 2014-10-17 14:32         ` Edward Cree
  2014-10-18  3:47           ` David Miller
  2014-10-22  8:58           ` Ben Hutchings
  0 siblings, 2 replies; 13+ messages in thread
From: Edward Cree @ 2014-10-17 14:32 UTC (permalink / raw)
  To: Robert Stonehouse
  Cc: Edward Cree, Daniel Borkmann, davem, nikolay, netdev,
	Shradha Shah, Jon Cooper, linux-net-drivers

Don't ring the doorbell, and don't do PIO.  This will also prevent
 TX Push, because there will be more than one buffer waiting when
 the doorbell is rung.

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/nic.h | 29 +++++++++++++++++++++++-----
 drivers/net/ethernet/sfc/tx.c  | 43 +++++++++++++++++++-----------------------
 2 files changed, 43 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h
index 60f8514..f77cce0 100644
--- a/drivers/net/ethernet/sfc/nic.h
+++ b/drivers/net/ethernet/sfc/nic.h
@@ -71,9 +71,17 @@ efx_tx_desc(struct efx_tx_queue *tx_queue, unsigned int index)
 	return ((efx_qword_t *) (tx_queue->txd.buf.addr)) + index;
 }
 
-/* Report whether the NIC considers this TX queue empty, given the
- * write_count used for the last doorbell push.  May return false
- * negative.
+/* Get partner of a TX queue, seen as part of the same net core queue */
+static struct efx_tx_queue *efx_tx_queue_partner(struct efx_tx_queue *tx_queue)
+{
+	if (tx_queue->queue & EFX_TXQ_TYPE_OFFLOAD)
+		return tx_queue - EFX_TXQ_TYPE_OFFLOAD;
+	else
+		return tx_queue + EFX_TXQ_TYPE_OFFLOAD;
+}
+
+/* Report whether this TX queue would be empty for the given write_count.
+ * May return false negative.
  */
 static inline bool __efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue,
 					 unsigned int write_count)
@@ -86,9 +94,18 @@ static inline bool __efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue,
 	return ((empty_read_count ^ write_count) & ~EFX_EMPTY_COUNT_VALID) == 0;
 }
 
-static inline bool efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue)
+/* Decide whether we can use TX PIO, ie. write packet data directly into
+ * a buffer on the device.  This can reduce latency at the expense of
+ * throughput, so we only do this if both hardware and software TX rings
+ * are empty.  This also ensures that only one packet at a time can be
+ * using the PIO buffer.
+ */
+static inline bool efx_nic_may_tx_pio(struct efx_tx_queue *tx_queue)
 {
-	return __efx_nic_tx_is_empty(tx_queue, tx_queue->write_count);
+	struct efx_tx_queue *partner = efx_tx_queue_partner(tx_queue);
+	return tx_queue->piobuf &&
+	       __efx_nic_tx_is_empty(tx_queue, tx_queue->insert_count) &&
+	       __efx_nic_tx_is_empty(partner, partner->insert_count);
 }
 
 /* Decide whether to push a TX descriptor to the NIC vs merely writing
@@ -96,6 +113,8 @@ static inline bool efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue)
  * descriptor to an empty queue, but is otherwise pointless.  Further,
  * Falcon and Siena have hardware bugs (SF bug 33851) that may be
  * triggered if we don't check this.
+ * We use the write_count used for the last doorbell push, to get the
+ * NIC's view of the tx queue.
  */
 static inline bool efx_nic_may_push_tx_desc(struct efx_tx_queue *tx_queue,
 					    unsigned int write_count)
diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index 3206098..ee84a90 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
@@ -132,15 +132,6 @@ unsigned int efx_tx_max_skb_descs(struct efx_nic *efx)
 	return max_descs;
 }
 
-/* Get partner of a TX queue, seen as part of the same net core queue */
-static struct efx_tx_queue *efx_tx_queue_partner(struct efx_tx_queue *tx_queue)
-{
-	if (tx_queue->queue & EFX_TXQ_TYPE_OFFLOAD)
-		return tx_queue - EFX_TXQ_TYPE_OFFLOAD;
-	else
-		return tx_queue + EFX_TXQ_TYPE_OFFLOAD;
-}
-
 static void efx_tx_maybe_stop_queue(struct efx_tx_queue *txq1)
 {
 	/* We need to consider both queues that the net core sees as one */
@@ -344,6 +335,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 	struct efx_nic *efx = tx_queue->efx;
 	struct device *dma_dev = &efx->pci_dev->dev;
 	struct efx_tx_buffer *buffer;
+	unsigned int old_insert_count = tx_queue->insert_count;
 	skb_frag_t *fragment;
 	unsigned int len, unmap_len = 0;
 	dma_addr_t dma_addr, unmap_addr = 0;
@@ -351,7 +343,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 	unsigned short dma_flags;
 	int i = 0;
 
-	EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
+	EFX_BUG_ON_PARANOID(tx_queue->write_count > tx_queue->insert_count);
 
 	if (skb_shinfo(skb)->gso_size)
 		return efx_enqueue_skb_tso(tx_queue, skb);
@@ -369,9 +361,8 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 
 	/* Consider using PIO for short packets */
 #ifdef EFX_USE_PIO
-	if (skb->len <= efx_piobuf_size && tx_queue->piobuf &&
-	    efx_nic_tx_is_empty(tx_queue) &&
-	    efx_nic_tx_is_empty(efx_tx_queue_partner(tx_queue))) {
+	if (skb->len <= efx_piobuf_size && !skb->xmit_more &&
+	    efx_nic_may_tx_pio(tx_queue)) {
 		buffer = efx_enqueue_skb_pio(tx_queue, skb);
 		dma_flags = EFX_TX_BUF_OPTION;
 		goto finish_packet;
@@ -439,13 +430,14 @@ finish_packet:
 
 	netdev_tx_sent_queue(tx_queue->core_txq, skb->len);
 
+	efx_tx_maybe_stop_queue(tx_queue);
+
 	/* Pass off to hardware */
-	efx_nic_push_buffers(tx_queue);
+	if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+		efx_nic_push_buffers(tx_queue);
 
 	tx_queue->tx_packets++;
 
-	efx_tx_maybe_stop_queue(tx_queue);
-
 	return NETDEV_TX_OK;
 
  dma_err:
@@ -458,7 +450,7 @@ finish_packet:
 	dev_kfree_skb_any(skb);
 
 	/* Work backwards until we hit the original insert pointer value */
-	while (tx_queue->insert_count != tx_queue->write_count) {
+	while (tx_queue->insert_count != old_insert_count) {
 		unsigned int pkts_compl = 0, bytes_compl = 0;
 		--tx_queue->insert_count;
 		buffer = __efx_tx_queue_get_insert_buffer(tx_queue);
@@ -989,12 +981,13 @@ static int efx_tso_put_header(struct efx_tx_queue *tx_queue,
 /* Remove buffers put into a tx_queue.  None of the buffers must have
  * an skb attached.
  */
-static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue)
+static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue,
+			       unsigned int insert_count)
 {
 	struct efx_tx_buffer *buffer;
 
 	/* Work backwards until we hit the original insert pointer value */
-	while (tx_queue->insert_count != tx_queue->write_count) {
+	while (tx_queue->insert_count != insert_count) {
 		--tx_queue->insert_count;
 		buffer = __efx_tx_queue_get_insert_buffer(tx_queue);
 		efx_dequeue_buffer(tx_queue, buffer, NULL, NULL);
@@ -1258,13 +1251,14 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
 			       struct sk_buff *skb)
 {
 	struct efx_nic *efx = tx_queue->efx;
+	unsigned int old_insert_count = tx_queue->insert_count;
 	int frag_i, rc;
 	struct tso_state state;
 
 	/* Find the packet protocol and sanity-check it */
 	state.protocol = efx_tso_check_protocol(skb);
 
-	EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
+	EFX_BUG_ON_PARANOID(tx_queue->write_count > tx_queue->insert_count);
 
 	rc = tso_start(&state, efx, skb);
 	if (rc)
@@ -1308,11 +1302,12 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
 
 	netdev_tx_sent_queue(tx_queue->core_txq, skb->len);
 
-	/* Pass off to hardware */
-	efx_nic_push_buffers(tx_queue);
-
 	efx_tx_maybe_stop_queue(tx_queue);
 
+	/* Pass off to hardware */
+	if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+		efx_nic_push_buffers(tx_queue);
+
 	tx_queue->tso_bursts++;
 	return NETDEV_TX_OK;
 
@@ -1336,6 +1331,6 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
 		dma_unmap_single(&efx->pci_dev->dev, state.header_dma_addr,
 				 state.header_unmap_len, DMA_TO_DEVICE);
 
-	efx_enqueue_unwind(tx_queue);
+	efx_enqueue_unwind(tx_queue, old_insert_count);
 	return NETDEV_TX_OK;
 }
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next] sfc: add support for skb->xmit_more
  2014-10-17 14:32         ` [PATCH " Edward Cree
@ 2014-10-18  3:47           ` David Miller
  2014-10-22  8:58           ` Ben Hutchings
  1 sibling, 0 replies; 13+ messages in thread
From: David Miller @ 2014-10-18  3:47 UTC (permalink / raw)
  To: ecree
  Cc: rstonehouse, dborkman, nikolay, netdev, sshah, jcooper,
	linux-net-drivers

From: Edward Cree <ecree@solarflare.com>
Date: Fri, 17 Oct 2014 15:32:25 +0100

> Don't ring the doorbell, and don't do PIO.  This will also prevent
>  TX Push, because there will be more than one buffer waiting when
>  the doorbell is rung.
> 
> Signed-off-by: Edward Cree <ecree@solarflare.com>

Applied, thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next] sfc: add support for skb->xmit_more
  2014-10-17 14:32         ` [PATCH " Edward Cree
  2014-10-18  3:47           ` David Miller
@ 2014-10-22  8:58           ` Ben Hutchings
  2014-10-22 12:36             ` Edward Cree
  1 sibling, 1 reply; 13+ messages in thread
From: Ben Hutchings @ 2014-10-22  8:58 UTC (permalink / raw)
  To: Edward Cree
  Cc: Robert Stonehouse, Daniel Borkmann, davem, nikolay, netdev,
	Shradha Shah, Jon Cooper, linux-net-drivers

[-- Attachment #1: Type: text/plain, Size: 684 bytes --]

On Fri, 2014-10-17 at 15:32 +0100, Edward Cree wrote:
[...]
> @@ -351,7 +343,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
>  	unsigned short dma_flags;
>  	int i = 0;
>  
> -	EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
> +	EFX_BUG_ON_PARANOID(tx_queue->write_count > tx_queue->insert_count);
[...]

Doesn't this break after 2^32 descriptors?  It seems like you would need
a similar comparison to time_after(); possibly:

	EFX_BUG_ON_PARANOID((int)(tx_queue->write_count - tx_queue->insert_count) > 0);

Ben.

-- 
Ben Hutchings
For every action, there is an equal and opposite criticism. - Harrison

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next] sfc: add support for skb->xmit_more
  2014-10-22  8:58           ` Ben Hutchings
@ 2014-10-22 12:36             ` Edward Cree
  0 siblings, 0 replies; 13+ messages in thread
From: Edward Cree @ 2014-10-22 12:36 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Robert Stonehouse, Daniel Borkmann, davem, nikolay, netdev,
	Shradha Shah, Jon Cooper, linux-net-drivers

On 22/10/14 09:58, Ben Hutchings wrote:
> On Fri, 2014-10-17 at 15:32 +0100, Edward Cree wrote:
> [...]
>> @@ -351,7 +343,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
>>  	unsigned short dma_flags;
>>  	int i = 0;
>>  
>> -	EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
>> +	EFX_BUG_ON_PARANOID(tx_queue->write_count > tx_queue->insert_count);
> [...]
>
> Doesn't this break after 2^32 descriptors?  It seems like you would need
> a similar comparison to time_after(); possibly:
>
> 	EFX_BUG_ON_PARANOID((int)(tx_queue->write_count - tx_queue->insert_count) > 0);
>
> Ben.
>
Yes, Jon already spotted that.  We just diked it out - see "[PATCH net]
sfc: remove incorrect EFX_BUG_ON_PARANOID check"
http://patchwork.ozlabs.org/patch/401503/
But your suggestion is a good one; feel free to post as a new patch on
that thread.
-Edward

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2014-10-22 12:36 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-13 16:59 [PATCH] sfc: efx: add support for skb->xmit_more Daniel Borkmann
2014-10-13 18:02 ` Edward Cree
2014-10-13 19:18   ` Daniel Borkmann
2014-10-14 18:41     ` [PATCH RFC net-next] sfc: " Edward Cree
2014-10-14 21:15       ` David Miller
2014-10-15 11:05         ` Edward Cree
2014-10-15 16:20           ` David Miller
2014-10-16 15:42             ` Jonathan Cooper
2014-10-16 16:36       ` Robert Stonehouse
2014-10-17 14:32         ` [PATCH " Edward Cree
2014-10-18  3:47           ` David Miller
2014-10-22  8:58           ` Ben Hutchings
2014-10-22 12:36             ` Edward Cree

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).