* [RFC PATCH net-next] net: pktgen: packet bursting via skb->xmit_more
@ 2014-09-26  0:46 Alexei Starovoitov
  2014-09-26  1:20 ` Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Alexei Starovoitov @ 2014-09-26  0:46 UTC (permalink / raw)
  To: David S. Miller
  Cc: Jesper Dangaard Brouer, Eric Dumazet, John Fastabend, netdev

This patch demonstrates the effect of delaying the update of the HW
tail pointer (based on an earlier patch by Jesper).

burst=1 is the default. It sends one packet with xmit_more=false.
burst=2 sends one packet with xmit_more=true and
        a 2nd copy of the same packet with xmit_more=false.
burst=3 sends two copies of the same packet with xmit_more=true and
        a 3rd copy with xmit_more=false.

Performance with ixgbe:

usec 30:
burst=1  tx:9.2 Mpps
burst=2  tx:13.6 Mpps
burst=3  tx:14.5 Mpps full 10G line rate

usec 1 (default):
burst=1,4,100 tx:3.9 Mpps

usec 0:
burst=1  tx:4.9 Mpps
burst=2  tx:6.6 Mpps
burst=3  tx:7.9 Mpps
burst=4  tx:8.7 Mpps
burst=8  tx:10.3 Mpps
burst=128  tx:12.4 Mpps

Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---

TX queue size and IRQ affinity were left at defaults.
Pause frames are off.

Nice to finally see line rate generated by a single CPU.

Compared to Jesper's patch, this one amortizes the cost
of spin_lock and atomic_inc by doing HARD_TX_LOCK and
atomic_add(N) once across N packets.

 net/core/pktgen.c |   33 ++++++++++++++++++++++++++++++---
 1 file changed, 30 insertions(+), 3 deletions(-)

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 5c728aa..47557ba 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -387,6 +387,7 @@ struct pktgen_dev {
 	u16 queue_map_min;
 	u16 queue_map_max;
 	__u32 skb_priority;	/* skb priority field */
+	int burst;		/* number of duplicated packets to burst */
 	int node;               /* Memory node */
 
 #ifdef CONFIG_XFRM
@@ -613,6 +614,9 @@ static int pktgen_if_show(struct seq_file *seq, void *v)
 	if (pkt_dev->traffic_class)
 		seq_printf(seq, "     traffic_class: 0x%02x\n", pkt_dev->traffic_class);
 
+	if (pkt_dev->burst > 1)
+		seq_printf(seq, "     burst: %d\n", pkt_dev->burst);
+
 	if (pkt_dev->node >= 0)
 		seq_printf(seq, "     node: %d\n", pkt_dev->node);
 
@@ -1124,6 +1128,16 @@ static ssize_t pktgen_if_write(struct file *file,
 			pkt_dev->dst_mac_count);
 		return count;
 	}
+	if (!strcmp(name, "burst")) {
+		len = num_arg(&user_buffer[i], 10, &value);
+		if (len < 0)
+			return len;
+
+		i += len;
+		pkt_dev->burst = value < 1 ? 1 : value;
+		sprintf(pg_result, "OK: burst=%d", pkt_dev->burst);
+		return count;
+	}
 	if (!strcmp(name, "node")) {
 		len = num_arg(&user_buffer[i], 10, &value);
 		if (len < 0)
@@ -3299,7 +3313,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 {
 	struct net_device *odev = pkt_dev->odev;
 	struct netdev_queue *txq;
-	int ret;
+	int burst_cnt, ret;
+	bool more;
 
 	/* If device is offline, then don't send */
 	if (unlikely(!netif_running(odev) || !netif_carrier_ok(odev))) {
@@ -3347,8 +3362,14 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 		pkt_dev->last_ok = 0;
 		goto unlock;
 	}
-	atomic_inc(&(pkt_dev->skb->users));
-	ret = netdev_start_xmit(pkt_dev->skb, odev, txq, false);
+	atomic_add(pkt_dev->burst, &pkt_dev->skb->users);
+
+	burst_cnt = 0;
+
+xmit_more:
+	more = ++burst_cnt < pkt_dev->burst;
+
+	ret = netdev_start_xmit(pkt_dev->skb, odev, txq, more);
 
 	switch (ret) {
 	case NETDEV_TX_OK:
@@ -3356,6 +3377,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 		pkt_dev->sofar++;
 		pkt_dev->seq_num++;
 		pkt_dev->tx_bytes += pkt_dev->last_pkt_size;
+		if (more)
+			goto xmit_more;
 		break;
 	case NET_XMIT_DROP:
 	case NET_XMIT_CN:
@@ -3374,6 +3397,9 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 		atomic_dec(&(pkt_dev->skb->users));
 		pkt_dev->last_ok = 0;
 	}
+
+	if (unlikely(pkt_dev->burst - burst_cnt > 0))
+		atomic_sub(pkt_dev->burst - burst_cnt, &pkt_dev->skb->users);
 unlock:
 	HARD_TX_UNLOCK(odev, txq);
 
@@ -3572,6 +3598,7 @@ static int pktgen_add_device(struct pktgen_thread *t, const char *ifname)
 	pkt_dev->svlan_p = 0;
 	pkt_dev->svlan_cfi = 0;
 	pkt_dev->svlan_id = 0xffff;
+	pkt_dev->burst = 1;
 	pkt_dev->node = -1;
 
 	err = pktgen_setup_dev(t->net, pkt_dev, ifname);
-- 
1.7.9.5

* Re: [PATCH v2 net-next] mlx4: optimize xmit path
@ 2014-09-29 17:46 Alexei Starovoitov
  2014-09-29 18:08 ` Eric Dumazet
  0 siblings, 1 reply; 31+ messages in thread
From: Alexei Starovoitov @ 2014-09-29 17:46 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Or Gerlitz, David S. Miller, Jesper Dangaard Brouer,
	Eric Dumazet, John Fastabend, Linux Netdev List, Amir Vadai,
	Or Gerlitz

On Sun, Sep 28, 2014 at 9:19 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>         send_doorbell = !skb->xmit_more || netif_xmit_stopped(ring->tx_queue);
>
>         if (ring->bf_enabled && desc_size <= MAX_BF && !bounce &&
>             !vlan_tx_tag_present(skb) && send_doorbell) {

This patch is good,

but I've been thinking more about bf+xmit_more and want
to double-check my understanding of that scenario:
xmit_more=true will queue descriptors normally, and
the last xmit_more=false packet will write into BF.
I guess BF is supposed to pick up the earlier ones from
the queue, otherwise the whole thing is broken.
So if BF can indeed pick up the whole chain, then
it should be faster than doing iowrite32(), right?



Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-26  0:46 [RFC PATCH net-next] net: pktgen: packet bursting via skb->xmit_more Alexei Starovoitov
2014-09-26  1:20 ` Eric Dumazet
2014-09-26  7:42   ` Eric Dumazet
2014-09-26 15:44     ` Eric Dumazet
2014-09-26 15:59       ` Alexei Starovoitov
2014-09-26 16:06         ` Eric Dumazet
2014-09-27 20:43     ` Eric Dumazet
2014-09-27 20:55       ` Or Gerlitz
2014-09-27 21:30         ` Eric Dumazet
2014-09-27 22:56           ` [PATCH net-next] mlx4: optimize xmit path Eric Dumazet
2014-09-27 23:44             ` Hannes Frederic Sowa
2014-09-28  0:05               ` Eric Dumazet
2014-09-28  0:22                 ` Hannes Frederic Sowa
2014-09-28 12:42             ` Eric Dumazet
2014-09-28 14:35             ` Or Gerlitz
2014-09-28 16:03               ` Eric Dumazet
2014-09-29  4:19             ` [PATCH v2 " Eric Dumazet
2014-09-30 12:01               ` Amir Vadai
2014-09-30 12:11                 ` Eric Dumazet
2014-10-02  4:35               ` Eric Dumazet
2014-10-02  8:03                 ` Amir Vadai
2014-10-02  8:29                   ` Jesper Dangaard Brouer
2014-10-02  8:57                     ` Amir Vadai
2014-10-02 11:45                   ` Eric Dumazet
2014-10-02 11:56                     ` Amir Vadai
2014-10-02 12:07                       ` Eric Dumazet
2014-10-02 12:45                         ` Amir Vadai
2014-09-26  8:05 ` [RFC PATCH net-next] net: pktgen: packet bursting via skb->xmit_more Jesper Dangaard Brouer
2014-09-27 20:59 ` Or Gerlitz
2014-09-29 17:46 [PATCH v2 net-next] mlx4: optimize xmit path Alexei Starovoitov
2014-09-29 18:08 ` Eric Dumazet
