* [PATCH net-next 0/2] mvneta xmit_more and bql support
From: Marcin Wojtas @ 2016-09-13  7:00 UTC
To: linux-kernel, linux-arm-kernel, netdev
Cc: davem, linux, sebastian.hesselbarth, andrew, jason, thomas.petazzoni,
    gregory.clement, nadavh, alior, simon.guinot, nitroshift, mw, jaz

Hi,

This short patchset introduces two enhancements to the mvneta driver:
TX packet concatenation using the xmit_more mechanism, and byte queue
limits (BQL), in order to decrease latency on saturated links.

Any comments or feedback would be welcome.

Best regards,
Marcin

Marcin Wojtas (1):
  net: mvneta: add BQL support

Simon Guinot (1):
  net: mvneta: add xmit_more support

 drivers/net/ethernet/marvell/mvneta.c | 33 +++++++++++++++++++++++++++------
 1 file changed, 27 insertions(+), 6 deletions(-)

-- 
1.8.3.1
* [PATCH net-next 1/2] net: mvneta: add xmit_more support
From: Marcin Wojtas @ 2016-09-13  7:00 UTC
To: linux-kernel, linux-arm-kernel, netdev
Cc: davem, linux, sebastian.hesselbarth, andrew, jason, thomas.petazzoni,
    gregory.clement, nadavh, alior, simon.guinot, nitroshift, mw, jaz

From: Simon Guinot <simon.guinot@sequanux.org>

Based on the skb's xmit_more flag, TX descriptors can be concatenated
before flushing. This commit delays the Tx descriptor flush if the
queue is running and if there are more skbs to send.

Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index d41c28d..b9dccea 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -512,6 +512,7 @@ struct mvneta_tx_queue {
 	 * descriptor ring
 	 */
 	int count;
+	int pending;
 	int tx_stop_threshold;
 	int tx_wake_threshold;
 
@@ -802,8 +803,9 @@ static void mvneta_txq_pend_desc_add(struct mvneta_port *pp,
 	/* Only 255 descriptors can be added at once ; Assume caller
 	 * process TX desriptors in quanta less than 256
 	 */
-	val = pend_desc;
+	val = pend_desc + txq->pending;
 	mvreg_write(pp, MVNETA_TXQ_UPDATE_REG(txq->id), val);
+	txq->pending = 0;
 }
 
 /* Get pointer to next TX descriptor to be processed (send) by HW */
@@ -2357,11 +2359,14 @@ out:
 		struct netdev_queue *nq = netdev_get_tx_queue(dev, txq_id);
 
 		txq->count += frags;
-		mvneta_txq_pend_desc_add(pp, txq, frags);
-
 		if (txq->count >= txq->tx_stop_threshold)
 			netif_tx_stop_queue(nq);
 
+		if (!skb->xmit_more || netif_xmit_stopped(nq))
+			mvneta_txq_pend_desc_add(pp, txq, frags);
+		else
+			txq->pending += frags;
+
 		u64_stats_update_begin(&stats->syncp);
 		stats->tx_packets++;
 		stats->tx_bytes += len;
-- 
1.8.3.1
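[Editor's note: for context, a minimal sketch of the generic xmit_more
batching pattern the patch above applies to mvneta. struct my_priv and
the my_ring_*() helpers are hypothetical placeholders; only
skb->xmit_more, netif_xmit_stopped(), netdev_get_tx_queue() and
NETDEV_TX_OK are real kernel APIs of this era.]

	/* Sketch: queue the descriptor, but ring the doorbell only when
	 * the stack signals no more skbs are coming, or when the queue
	 * was stopped and must make forward progress.
	 */
	static netdev_tx_t my_start_xmit(struct sk_buff *skb,
					 struct net_device *dev)
	{
		struct my_priv *priv = netdev_priv(dev);
		struct netdev_queue *nq = netdev_get_tx_queue(dev, 0);

		my_ring_write_desc(priv, skb);	/* no doorbell yet */
		priv->pending++;

		if (!skb->xmit_more || netif_xmit_stopped(nq)) {
			my_ring_kick(priv, priv->pending); /* flush to HW */
			priv->pending = 0;
		}
		return NETDEV_TX_OK;
	}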
* Re: [PATCH net-next 1/2] net: mvneta: add xmit_more support
From: Eric Dumazet @ 2016-09-13 14:33 UTC
To: Marcin Wojtas
Cc: linux-kernel, linux-arm-kernel, netdev, davem, linux,
    sebastian.hesselbarth, andrew, jason, thomas.petazzoni,
    gregory.clement, nadavh, alior, simon.guinot, nitroshift, jaz

On Tue, 2016-09-13 at 09:00 +0200, Marcin Wojtas wrote:
> From: Simon Guinot <simon.guinot@sequanux.org>
>
> Based on the skb's xmit_more flag, TX descriptors can be concatenated
> before flushing. This commit delays the Tx descriptor flush if the
> queue is running and if there are more skbs to send.
>
> Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
> ---
> @@ -802,8 +803,9 @@ static void mvneta_txq_pend_desc_add(struct mvneta_port *pp,
>  	/* Only 255 descriptors can be added at once ; Assume caller
>  	 * process TX desriptors in quanta less than 256
>  	 */

Hi Marcin

Well, given the above comment, and the fact that MVNETA_MAX_TXD == 532,
it looks like you might add a bug if more than 256 skbs are given to
your ndo_start_xmit() with skb->xmit_more = 1

I therefore suggest you make sure it does not happen.

	txq->pending += frags;
	if (!skb->xmit_more ||
	    txq->pending > 256 - MVNETA_MAX_SKB_DESCS ||
	    netif_xmit_stopped(nq))
		mvneta_txq_pend_desc_add(pp, txq)
* Re: [PATCH net-next 1/2] net: mvneta: add xmit_more support
From: Eric Dumazet @ 2016-09-13 14:42 UTC
To: Marcin Wojtas
Cc: linux-kernel, linux-arm-kernel, netdev, davem, linux,
    sebastian.hesselbarth, andrew, jason, thomas.petazzoni,
    gregory.clement, nadavh, alior, simon.guinot, nitroshift, jaz

On Tue, 2016-09-13 at 07:33 -0700, Eric Dumazet wrote:
> Hi Marcin
>
> Well, given the above comment, and the fact that MVNETA_MAX_TXD == 532,
> it looks like you might add a bug if more than 256 skbs are given to
> your ndo_start_xmit() with skb->xmit_more = 1
>
> I therefore suggest you make sure it does not happen.
>
> 	txq->pending += frags;
> 	if (!skb->xmit_more ||
> 	    txq->pending > 256 - MVNETA_MAX_SKB_DESCS ||
> 	    netif_xmit_stopped(nq))
> 		mvneta_txq_pend_desc_add(pp, txq)

Another solution would be to test the potential overflow in mvneta_tx()
and force a mvneta_txq_pend_desc_add(pp, txq) _before_ adding the
descriptors of the "about to be cooked" TSO packet.

(This is because MVNETA_MAX_SKB_DESCS is 217, so 255 - 217 leaves little
room for xmit_more to show its power)
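[Editor's note: combining Eric's two suggestions, the flush logic in
mvneta_tx() could look roughly like the sketch below. This is
illustrative only and untested; the pend_desc argument handling would
have to match whatever refactoring the final revision settles on. The
names reused here (txq->pending, MVNETA_MAX_SKB_DESCS, frags, nq) come
from the patch and review above.]

	/* Sketch: never let pending descriptors exceed what one write
	 * to MVNETA_TXQ_UPDATE_REG can report (255), and pre-flush
	 * before a large (possibly TSO) packet is built.
	 */
	if (txq->pending + MVNETA_MAX_SKB_DESCS > 255)
		mvneta_txq_pend_desc_add(pp, txq, 0);	/* forced flush */

	/* ... build descriptors for the current skb ... */

	txq->pending += frags;
	if (!skb->xmit_more || netif_xmit_stopped(nq))
		mvneta_txq_pend_desc_add(pp, txq, 0);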
* [PATCH net-next 2/2] net: mvneta: add BQL support
From: Marcin Wojtas @ 2016-09-13  7:00 UTC
To: linux-kernel, linux-arm-kernel, netdev
Cc: davem, linux, sebastian.hesselbarth, andrew, jason, thomas.petazzoni,
    gregory.clement, nadavh, alior, simon.guinot, nitroshift, mw, jaz

Tests showed that when the whole bandwidth is consumed, the latency for
various kinds of traffic can reach high values. With a saturated link
(e.g. with iperf from target to host) a simple ping could take a
significant amount of time. BQL proved to improve this situation when
implemented in the mvneta driver. Measurements of ping latency for 3
link speeds:

  Speed | Latency w/o BQL | Latency with BQL
   10   |     7-14 ms     |      3.5 ms
  100   |     2-12 ms     |      0.6 ms
  1000  |  often timeout  |     up to 2 ms

Decreasing latency as above results in a slight performance cost: -4 kpps
(-1.4%) when pushing 64B packets via two bridged interfaces of Armada 38x.
For 1500B packets in the same setup, the mpstat tool showed +8% of CPU
occupation (default affinity, second CPU idle). This cost seems
reasonable to take, considering the latency improvements.

This commit adds the byte queue limit mechanism to the mvneta driver.

Signed-off-by: Marcin Wojtas <mw@semihalf.com>
---
 drivers/net/ethernet/marvell/mvneta.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index b9dccea..bb5df35 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1719,8 +1719,10 @@ static struct mvneta_tx_queue *mvneta_tx_done_policy(struct mvneta_port *pp,
 
 /* Free tx queue skbuffs */
 static void mvneta_txq_bufs_free(struct mvneta_port *pp,
-				 struct mvneta_tx_queue *txq, int num)
+				 struct mvneta_tx_queue *txq, int num,
+				 struct netdev_queue *nq)
 {
+	unsigned int bytes_compl = 0, pkts_compl = 0;
 	int i;
 
 	for (i = 0; i < num; i++) {
@@ -1728,6 +1730,11 @@ static void mvneta_txq_bufs_free(struct mvneta_port *pp,
 			txq->txq_get_index;
 		struct sk_buff *skb = txq->tx_skb[txq->txq_get_index];
 
+		if (skb) {
+			bytes_compl += skb->len;
+			pkts_compl++;
+		}
+
 		mvneta_txq_inc_get(txq);
 
 		if (!IS_TSO_HEADER(txq, tx_desc->buf_phys_addr))
@@ -1738,6 +1745,8 @@ static void mvneta_txq_bufs_free(struct mvneta_port *pp,
 			continue;
 		dev_kfree_skb_any(skb);
 	}
+
+	netdev_tx_completed_queue(nq, pkts_compl, bytes_compl);
 }
 
 /* Handle end of transmission */
@@ -1751,7 +1760,7 @@ static void mvneta_txq_done(struct mvneta_port *pp,
 	if (!tx_done)
 		return;
 
-	mvneta_txq_bufs_free(pp, txq, tx_done);
+	mvneta_txq_bufs_free(pp, txq, tx_done, nq);
 
 	txq->count -= tx_done;
 
@@ -2358,6 +2367,8 @@ out:
 		struct mvneta_pcpu_stats *stats = this_cpu_ptr(pp->stats);
 		struct netdev_queue *nq = netdev_get_tx_queue(dev, txq_id);
 
+		netdev_tx_sent_queue(nq, len);
+
 		txq->count += frags;
 		if (txq->count >= txq->tx_stop_threshold)
 			netif_tx_stop_queue(nq);
@@ -2385,9 +2396,10 @@ static void mvneta_txq_done_force(struct mvneta_port *pp,
 				  struct mvneta_tx_queue *txq)
 
 {
+	struct netdev_queue *nq = netdev_get_tx_queue(pp->dev, txq->id);
 	int tx_done = txq->count;
 
-	mvneta_txq_bufs_free(pp, txq, tx_done);
+	mvneta_txq_bufs_free(pp, txq, tx_done, nq);
 
 	/* reset txq */
 	txq->count = 0;
@@ -2884,6 +2896,8 @@ static int mvneta_txq_init(struct mvneta_port *pp,
 static void mvneta_txq_deinit(struct mvneta_port *pp,
 			      struct mvneta_tx_queue *txq)
 {
+	struct netdev_queue *nq = netdev_get_tx_queue(pp->dev, txq->id);
+
 	kfree(txq->tx_skb);
 
 	if (txq->tso_hdrs)
@@ -2895,6 +2909,8 @@ static void mvneta_txq_deinit(struct mvneta_port *pp,
 			  txq->size * MVNETA_DESC_ALIGNED_SIZE,
 			  txq->descs, txq->descs_phys);
 
+	netdev_tx_reset_queue(nq);
+
 	txq->descs = NULL;
 	txq->last_desc = 0;
 	txq->next_desc_to_proc = 0;
-- 
1.8.3.1
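[Editor's note: the three BQL calls used in this patch follow a fixed
contract; these are real kernel APIs from include/linux/netdevice.h.
Condensed, with placement noted in comments; nq is the netdev_queue of
the TX ring, and the variable names match the patch above.]

	netdev_tx_sent_queue(nq, len);		/* ndo_start_xmit: after the
						 * skb's descriptors are queued */
	netdev_tx_completed_queue(nq, pkts_compl, bytes_compl);
						/* TX completion: once per batch
						 * of reclaimed skbs */
	netdev_tx_reset_queue(nq);		/* ring teardown/reset: clears
						 * the accounting state */

BQL then throttles the queue (netif_xmit_stopped() becomes true) once
enough unacknowledged bytes are in flight, which is what bounds the
latency measured in the commit message.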