All of lore.kernel.org
 help / color / mirror / Atom feed
* pktgen and spin_lock_bh in xmit path
@ 2009-10-20  3:38 Ben Greear
  2009-10-20  3:48 ` Eric Dumazet
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Greear @ 2009-10-20  3:38 UTC (permalink / raw)
  To: NetDev

I'm having strange issues when running pktgen on 10G interfaces while 
also running
pktgen on mac-vlans on that interface, when the mac-vlan pktgen threads 
are on a different
CPU.

First, lockdep gives up and says that things are not properly 
annotated.  I believe this is because
the macvlan tx path will lock it's txq and will also lock the 
lower-dev's txq.  To fix this, perhaps
we need some new lockdep aware primitives for netdev txq locking?

Second, is using _bh() locking really sufficient if we have pktgen 
writing to a physical device
and also have other pktgen threads writing to that same device though 
mac-vlans?   I'm seeing
deadlocks spinning on the _bh() lock in pktgen as well as strange 
corruptions, so I think there
must be *some* problem somewhere, I just don't know quite what it is yet.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com> 
Candela Technologies Inc  http://www.candelatech.com



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20  3:38 pktgen and spin_lock_bh in xmit path Ben Greear
@ 2009-10-20  3:48 ` Eric Dumazet
  2009-10-20  4:52   ` Ben Greear
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2009-10-20  3:48 UTC (permalink / raw)
  To: Ben Greear; +Cc: NetDev

Ben Greear a écrit :
> I'm having strange issues when running pktgen on 10G interfaces while
> also running
> pktgen on mac-vlans on that interface, when the mac-vlan pktgen threads
> are on a different
> CPU.
> 
> First, lockdep gives up and says that things are not properly
> annotated.  I believe this is because
> the macvlan tx path will lock it's txq and will also lock the
> lower-dev's txq.  To fix this, perhaps
> we need some new lockdep aware primitives for netdev txq locking?
> 
> Second, is using _bh() locking really sufficient if we have pktgen
> writing to a physical device
> and also have other pktgen threads writing to that same device though
> mac-vlans?   I'm seeing
> deadlocks spinning on the _bh() lock in pktgen as well as strange
> corruptions, so I think there
> must be *some* problem somewhere, I just don't know quite what it is yet.
> 

Could you please give us a copy if your pktgen scripts ?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20  3:48 ` Eric Dumazet
@ 2009-10-20  4:52   ` Ben Greear
  2009-10-20 17:37     ` Ben Greear
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Greear @ 2009-10-20  4:52 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: NetDev

Eric Dumazet wrote:
> Ben Greear a écrit :
>   
>> I'm having strange issues when running pktgen on 10G interfaces while
>> also running
>> pktgen on mac-vlans on that interface, when the mac-vlan pktgen threads
>> are on a different
>> CPU.
>>
>> First, lockdep gives up and says that things are not properly
>> annotated.  I believe this is because
>> the macvlan tx path will lock it's txq and will also lock the
>> lower-dev's txq.  To fix this, perhaps
>> we need some new lockdep aware primitives for netdev txq locking?
>>
>> Second, is using _bh() locking really sufficient if we have pktgen
>> writing to a physical device
>> and also have other pktgen threads writing to that same device though
>> mac-vlans?   I'm seeing
>> deadlocks spinning on the _bh() lock in pktgen as well as strange
>> corruptions, so I think there
>> must be *some* problem somewhere, I just don't know quite what it is yet.
>>
>>     
>
> Could you please give us a copy if your pktgen scripts ?
>   
I'm driving it with another program, and my pktgen is a bit hacked, but 
the basic idea is:

1 pktgen connection on cpu 0 running as fast as it can (trying for 
10Gbps, but getting maybe 3-4),
  running between two 10G ports (intel 82599).
  Multi-pkt is set to 10,000 on each side.
3 pairs of mac-vlans on each of the two physical 10G ports.
 3 pktgen 'connections' between these..each are running at about 1Gbps.
 These 3 pktgen connections are on CPU 4.
 Multi-pkt is set to 1 since multi-pkt is a very bad idea on virtual 
devices.

1514 byte pkts.  No IPs on the interfaces, using ToS in pktgen, but 
nothing else is configured to
care.

The two physical ports are cabled together directly with a fibre cable.

All pktgen connections are full duplex (both sides transmitting to each 
other..and I have
rx logic to gather stats on received pkts as well).  With no kernel 
debugging, this can run right at 10Gbps bi-directional,
with lockdep it gets around 5-6Gbps in each direction.

The lockup often occurs near starting/stopping pktgen, but also happens 
while just normally
running under load, usually within 10 minutes.

I tried and failed to reproduce this on a 1G network, but maybe I'm just 
not getting (un)lucky,
didn't try for too long.

Among other things, it appears as if the mac-vlan interfaces sometimes 
become locked to transmit
by pktgen, but a raw socket in user-space can send on them fine.  I'm 
going to add some debugging
for this particular issue tomorrow to try to figure out why that happens.

Please note I have the rest of my network patches applied (but not using 
any proprietary modules),
so it could easily be something I've caused.  I think fixing lockdep to 
work with the txq _bh locks
would be a good first step to fixing this..

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com> 
Candela Technologies Inc  http://www.candelatech.com



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20  4:52   ` Ben Greear
@ 2009-10-20 17:37     ` Ben Greear
  2009-10-20 17:44       ` Eric Dumazet
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Greear @ 2009-10-20 17:37 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: NetDev, robert

On 10/19/2009 09:52 PM, Ben Greear wrote:
> Eric Dumazet wrote:
>> Ben Greear a écrit :
>>> I'm having strange issues when running pktgen on 10G interfaces while
>>> also running
>>> pktgen on mac-vlans on that interface, when the mac-vlan pktgen threads
>>> are on a different
>>> CPU.


I think I found the problem.  First, lockdep was not the issue, and mac-vlans
were properly setting up the lockdep keys.  I would have expected lockdep to
figure out I was trying to lock a non-valid lock, but maybe something else
kept that from happening.

Second:  I think the problem can only happen on my code tree because I
added code to allow mac-vlans to return NETDEV_TX_BUSY
when a hacked varient of dev_queue_xmit decided it could not immediately
transmit a packet.  Without my change, a packet would have to be created fresh
in this scenario, so it would not hit the bug.

However, I think pktgen might still need a similar fix because other drivers or
logic might also change the skb tx-queue map.

Here is the problem, or at least one of them:

pktgen tries to xmit, but gets NETDEV_TX_BUSY.  During the xmit attempt, the
skb queue map was changed to that of the underlying device, which was 4.  Note
that mac-vlans have only a single tx queue.
pktgen will retry this skb, but it never resets the skb queue back to 0.
This means that it will soon be accessing txq[4], which is corrupting
memory.  Things rapidly decline from here!

Here is a patch for comment, in case the pktgen folks would like to
apply something similar:

@@ -3991,11 +4001,26 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev, u64 now)
                 }
         }

-       if (!pkt_dev->skb) {
+       if ((!pkt_dev->skb) || (pkt_dev->clone_count <= 1)) {
+               /** If clone count is low, that might be because device is a layered
+                * virtual device, like mac-vlan.  In that case, the queue-map may be
+                * changed while transmitting out the lower levels, so we need to
+                * reset this here so we don't accidentally use a bogus queue.
+                */
+       reset_queue_map:
                 set_cur_queue_map(pkt_dev);
                 queue_map = pkt_dev->cur_queue_map;
         } else {
                 queue_map = skb_get_queue_mapping(pkt_dev->skb);
+               if (unlikely(queue_map >= odev->num_tx_queues)) {
+                       static int do_once = 1;
+                       if (do_once) {
+                               printk("pktgen ERROR:  queue_map range error, queue_map: %i  num_tx_queues: %i  iface: %s\n",
+                                      queue_map, odev->num_tx_queues, odev->name);
+                               WARN_ON(1);
+                       }
+                       goto reset_queue_map;
+               }
         }

         txq = netdev_get_tx_queue(odev, queue_map);


Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20 17:37     ` Ben Greear
@ 2009-10-20 17:44       ` Eric Dumazet
  2009-10-20 17:54         ` Ben Greear
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2009-10-20 17:44 UTC (permalink / raw)
  To: Ben Greear; +Cc: NetDev, robert

Ben Greear a écrit :
> On 10/19/2009 09:52 PM, Ben Greear wrote:
>> Eric Dumazet wrote:
>>> Ben Greear a écrit :
>>>> I'm having strange issues when running pktgen on 10G interfaces while
>>>> also running
>>>> pktgen on mac-vlans on that interface, when the mac-vlan pktgen threads
>>>> are on a different
>>>> CPU.
> 
> 
> I think I found the problem.  First, lockdep was not the issue, and
> mac-vlans
> were properly setting up the lockdep keys.  I would have expected
> lockdep to
> figure out I was trying to lock a non-valid lock, but maybe something else
> kept that from happening.
> 
> Second:  I think the problem can only happen on my code tree because I
> added code to allow mac-vlans to return NETDEV_TX_BUSY
> when a hacked varient of dev_queue_xmit decided it could not immediately
> transmit a packet.  Without my change, a packet would have to be created
> fresh
> in this scenario, so it would not hit the bug.
> 
> However, I think pktgen might still need a similar fix because other
> drivers or
> logic might also change the skb tx-queue map.
> 
> Here is the problem, or at least one of them:
> 
> pktgen tries to xmit, but gets NETDEV_TX_BUSY.  During the xmit attempt,
> the
> skb queue map was changed to that of the underlying device, which was
> 4.  Note
> that mac-vlans have only a single tx queue.

Thats not true since commit 2c11455321f37da6fe6cc36353149f9ac9183334
Date:   Thu Sep 3 00:11:45 2009 +0000
(macvlan: add multiqueue capability )

    macvlan devices are currently not multi-queue capable.

    We can do that defining rtnl_link_ops method,
    get_tx_queues(), called from rtnl_create_link()

    This new method gets num_tx_queues/real_num_tx_queues
    from lower device.

    macvlan_get_tx_queues() is a copy of vlan_get_tx_queues().

    Because macvlan_start_xmit() has to update netdev_queue
    stats only (and not dev->stats), I chose to change
    tx_errors/tx_aborted_errors accounting to tx_dropped,
    since netdev_queue structure doesnt define tx_errors /
    tx_aborted_errors.


> pktgen will retry this skb, but it never resets the skb queue back to 0.
> This means that it will soon be accessing txq[4], which is corrupting
> memory.  Things rapidly decline from here!

Something is really wrong on your kernel :)

> 
> Here is a patch for comment, in case the pktgen folks would like to
> apply something similar:
> 
> @@ -3991,11 +4001,26 @@ static void pktgen_xmit(struct pktgen_dev
> *pkt_dev, u64 now)
>                 }
>         }
> 
> -       if (!pkt_dev->skb) {
> +       if ((!pkt_dev->skb) || (pkt_dev->clone_count <= 1)) {
> +               /** If clone count is low, that might be because device
> is a layered
> +                * virtual device, like mac-vlan.  In that case, the
> queue-map may be
> +                * changed while transmitting out the lower levels, so
> we need to
> +                * reset this here so we don't accidentally use a bogus
> queue.
> +                */
> +       reset_queue_map:
>                 set_cur_queue_map(pkt_dev);
>                 queue_map = pkt_dev->cur_queue_map;
>         } else {
>                 queue_map = skb_get_queue_mapping(pkt_dev->skb);
> +               if (unlikely(queue_map >= odev->num_tx_queues)) {
> +                       static int do_once = 1;
> +                       if (do_once) {
> +                               printk("pktgen ERROR:  queue_map range
> error, queue_map: %i  num_tx_queues: %i  iface: %s\n",
> +                                      queue_map, odev->num_tx_queues,
> odev->name);
> +                               WARN_ON(1);
> +                       }
> +                       goto reset_queue_map;
> +               }
>         }
> 
>         txq = netdev_get_tx_queue(odev, queue_map);

Please try last kernel before posting patches :)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20 17:44       ` Eric Dumazet
@ 2009-10-20 17:54         ` Ben Greear
  2009-10-20 18:35           ` Eric Dumazet
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Greear @ 2009-10-20 17:54 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: NetDev, robert

On 10/20/2009 10:44 AM, Eric Dumazet wrote:

>
> Please try last kernel before posting patches :)

I should have looked at the latest code, but even if mac-vlan is no longer
an issue, I suspect it may still be possible to get into this case with
other virtual devices because pktgen plays tricks by cloning pkts.

That said, I have no proof, so no further arguments from me.

Thanks for pointing out the new mac-vlan patch.

Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20 17:54         ` Ben Greear
@ 2009-10-20 18:35           ` Eric Dumazet
  2009-10-20 18:54             ` Eric Dumazet
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2009-10-20 18:35 UTC (permalink / raw)
  To: Ben Greear; +Cc: NetDev, robert, David S. Miller

Ben Greear a écrit :

> I should have looked at the latest code, but even if mac-vlan is no longer
> an issue, I suspect it may still be possible to get into this case with
> other virtual devices because pktgen plays tricks by cloning pkts.

Sure, you may be right.

> 
> That said, I have no proof, so no further arguments from me.
> 
> Thanks for pointing out the new mac-vlan patch.
> 

fill_packet() does the mapping setting, so if a device/tunnel transmit change it,
we might have a problem for cloned pktgen packets.

Hmm... dev_pick_tx() logic might be the problem...

u16 skb_tx_hash(const struct net_device *dev, const struct sk_buff *skb)
{
        u32 hash;

        if (skb_rx_queue_recorded(skb)) {
                hash = skb_get_rx_queue(skb);
                while (unlikely(hash >= dev->real_num_tx_queues))
                        hash -= dev->real_num_tx_queues;
                return hash;
        }

        if (skb->sk && skb->sk->sk_hash)
                hash = skb->sk->sk_hash;
        else
                hash = skb->protocol;

        hash = jhash_1word(hash, skb_tx_hashrnd);

        return (u16) (((u64) hash * dev->real_num_tx_queues) >> 32);
}
EXPORT_SYMBOL(skb_tx_hash);

static struct netdev_queue *dev_pick_tx(struct net_device *dev,
                                        struct sk_buff *skb)
{
        const struct net_device_ops *ops = dev->netdev_ops;
        u16 queue_index = 0;

        if (ops->ndo_select_queue)
                queue_index = ops->ndo_select_queue(dev, skb);
        else if (dev->real_num_tx_queues > 1)
                queue_index = skb_tx_hash(dev, skb);

        skb_set_queue_mapping(skb, queue_index);
        return netdev_get_tx_queue(dev, queue_index);
}




So if pktgen does a skb_set_queue_mapping(skb, 0);
dev_pick_tx() will call skb_tx_hash().
skb_tx_hash() will consider mapping is not set and will
generate a new one based on hash value.


David, we are changing skb->mapping while transmitting it...

So yes, it can break pktgen if its skbs are cloned.

Maybe pktgen should call skb_record_rx_queue() instead of skb_set_queue_mapping()

Or maybe we should avoid this +/- 1 thing we do in skb_record_rx_queue(), since
we use same skb field for recording rx_queue or tx_queue :(


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20 18:35           ` Eric Dumazet
@ 2009-10-20 18:54             ` Eric Dumazet
  2009-10-20 20:16               ` Ben Greear
  2009-10-20 21:10               ` Ben Greear
  0 siblings, 2 replies; 22+ messages in thread
From: Eric Dumazet @ 2009-10-20 18:54 UTC (permalink / raw)
  Cc: Ben Greear, NetDev, robert, David S. Miller

Eric Dumazet a écrit :

> David, we are changing skb->mapping while transmitting it...
> 
> So yes, it can break pktgen if its skbs are cloned.
> 
> Maybe pktgen should call skb_record_rx_queue() instead of skb_set_queue_mapping()
> 
> Or maybe we should avoid this +/- 1 thing we do in skb_record_rx_queue(), since
> we use same skb field for recording rx_queue or tx_queue :(
> 

A possible fix would be :

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 86acdba..bbe4b2d 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -2554,7 +2554,6 @@ static struct sk_buff *fill_packet_ipv4(struct net_device *odev,
 	__be16 *vlan_encapsulated_proto = NULL;  /* packet type ID field (or len) for VLAN tag */
 	__be16 *svlan_tci = NULL;                /* Encapsulates priority and SVLAN ID */
 	__be16 *svlan_encapsulated_proto = NULL; /* packet type ID field (or len) for SVLAN tag */
-	u16 queue_map;
 
 	if (pkt_dev->nr_labels)
 		protocol = htons(ETH_P_MPLS_UC);
@@ -2565,7 +2564,6 @@ static struct sk_buff *fill_packet_ipv4(struct net_device *odev,
 	/* Update any of the values, used when we're incrementing various
 	 * fields.
 	 */
-	queue_map = pkt_dev->cur_queue_map;
 	mod_cur_headers(pkt_dev);
 
 	datalen = (odev->hard_header_len + 16) & ~0xf;
@@ -2605,7 +2603,6 @@ static struct sk_buff *fill_packet_ipv4(struct net_device *odev,
 	skb->network_header = skb->tail;
 	skb->transport_header = skb->network_header + sizeof(struct iphdr);
 	skb_put(skb, sizeof(struct iphdr) + sizeof(struct udphdr));
-	skb_set_queue_mapping(skb, queue_map);
 	iph = ip_hdr(skb);
 	udph = udp_hdr(skb);
 
@@ -2896,7 +2893,6 @@ static struct sk_buff *fill_packet_ipv6(struct net_device *odev,
 	__be16 *vlan_encapsulated_proto = NULL;  /* packet type ID field (or len) for VLAN tag */
 	__be16 *svlan_tci = NULL;                /* Encapsulates priority and SVLAN ID */
 	__be16 *svlan_encapsulated_proto = NULL; /* packet type ID field (or len) for SVLAN tag */
-	u16 queue_map;
 
 	if (pkt_dev->nr_labels)
 		protocol = htons(ETH_P_MPLS_UC);
@@ -2907,7 +2903,6 @@ static struct sk_buff *fill_packet_ipv6(struct net_device *odev,
 	/* Update any of the values, used when we're incrementing various
 	 * fields.
 	 */
-	queue_map = pkt_dev->cur_queue_map;
 	mod_cur_headers(pkt_dev);
 
 	skb = __netdev_alloc_skb(odev,
@@ -2946,7 +2941,6 @@ static struct sk_buff *fill_packet_ipv6(struct net_device *odev,
 	skb->network_header = skb->tail;
 	skb->transport_header = skb->network_header + sizeof(struct ipv6hdr);
 	skb_put(skb, sizeof(struct ipv6hdr) + sizeof(struct udphdr));
-	skb_set_queue_mapping(skb, queue_map);
 	iph = ipv6_hdr(skb);
 	udph = udp_hdr(skb);
 
@@ -3437,7 +3431,13 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 	if (pkt_dev->delay && pkt_dev->last_ok)
 		spin(pkt_dev, pkt_dev->next_tx);
 
-	queue_map = skb_get_queue_mapping(pkt_dev->skb);
+	queue_map = pkt_dev->cur_queue_map;
+	/*
+	 * tells skb_tx_hash() to use this tx queue.
+	 * We should reset skb->mapping before each xmit() because
+	 * xmit() might change it.
+	 */
+	skb_record_rx_queue(pkt_dev->skb, queue_map);
 	txq = netdev_get_tx_queue(odev, queue_map);
 
 	__netif_tx_lock_bh(txq);




^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20 18:54             ` Eric Dumazet
@ 2009-10-20 20:16               ` Ben Greear
  2009-10-20 21:10               ` Ben Greear
  1 sibling, 0 replies; 22+ messages in thread
From: Ben Greear @ 2009-10-20 20:16 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: NetDev, robert, David S. Miller

On 10/20/2009 11:54 AM, Eric Dumazet wrote:
> Eric Dumazet a écrit :
>
>> David, we are changing skb->mapping while transmitting it...
>>
>> So yes, it can break pktgen if its skbs are cloned.
>>
>> Maybe pktgen should call skb_record_rx_queue() instead of skb_set_queue_mapping()
>>
>> Or maybe we should avoid this +/- 1 thing we do in skb_record_rx_queue(), since
>> we use same skb field for recording rx_queue or tx_queue :(
>>
>
> A possible fix would be :

I tried your patch.  It's not crashing, but the link is bouncing around:

Oct 20 13:13:34 localhost kernel: ixgbe 0000:05:00.0: master disable timed out
Oct 20 13:13:35 localhost kernel: ixgbe 0000:05:00.0: master disable timed out
Oct 20 13:13:35 localhost kernel: ixgbe 0000:05:00.0: master disable timed out
...

Oct 20 13:14:06 localhost kernel: ixgbe 0000:05:00.0: master disable timed out
Oct 20 13:14:06 localhost kernel: ixgbe: eth9 NIC Link is Down
Oct 20 13:14:06 localhost kernel: ixgbe: eth9 NIC Link is Up 10 Gbps, Flow Control: RX/TX
Oct 20 13:14:09 localhost kernel: ixgbe: eth8 NIC Link is Up 10 Gbps, Flow Control: RX/TX
Oct 20 13:14:12 localhost kernel: ixgbe 0000:05:00.1: master disable timed out
Oct 20 13:14:15 localhost kernel: ixgbe: eth9 NIC Link is Up 10 Gbps, Flow Control: RX/TX
Oct 20 13:14:19 localhost kernel: ixgbe: eth9 NIC Link is Down
Oct 20 13:14:19 localhost kernel: ixgbe 0000:05:00.0: master disable timed out


I'm not sure if this is another weirdness in my system or something
else...

Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20 18:54             ` Eric Dumazet
  2009-10-20 20:16               ` Ben Greear
@ 2009-10-20 21:10               ` Ben Greear
  2009-10-20 21:22                 ` Eric Dumazet
  2009-10-21  5:12                 ` Krishna Kumar2
  1 sibling, 2 replies; 22+ messages in thread
From: Ben Greear @ 2009-10-20 21:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: NetDev, robert, David S. Miller

On 10/20/2009 11:54 AM, Eric Dumazet wrote:

> -	queue_map = skb_get_queue_mapping(pkt_dev->skb);
> +	queue_map = pkt_dev->cur_queue_map;
> +	/*
> +	 * tells skb_tx_hash() to use this tx queue.
> +	 * We should reset skb->mapping before each xmit() because
> +	 * xmit() might change it.
> +	 */
> +	skb_record_rx_queue(pkt_dev->skb, queue_map);
>   	txq = netdev_get_tx_queue(odev, queue_map);

I think that must be wrong.  The record_rx_queue sets it to queue_map + 1,
but the hard-start-xmit method (in ixgbe/ixgbe_main.c, at least), takes the
skb->queue_map and uses it as an index with no subtraction.

This causes watchdog timeouts because we are calling txq_trans_update in pktgen on
queue 0, for instance, but sending pkts on queue 1.  If queue 1 is ever
busy when the WD fires, link is reset.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20 21:10               ` Ben Greear
@ 2009-10-20 21:22                 ` Eric Dumazet
  2009-10-20 21:30                   ` Ben Greear
  2009-10-21  5:12                 ` Krishna Kumar2
  1 sibling, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2009-10-20 21:22 UTC (permalink / raw)
  To: Ben Greear; +Cc: NetDev, robert, David S. Miller

Ben Greear a écrit :
> On 10/20/2009 11:54 AM, Eric Dumazet wrote:
> 
>> -    queue_map = skb_get_queue_mapping(pkt_dev->skb);
>> +    queue_map = pkt_dev->cur_queue_map;
>> +    /*
>> +     * tells skb_tx_hash() to use this tx queue.
>> +     * We should reset skb->mapping before each xmit() because
>> +     * xmit() might change it.
>> +     */
>> +    skb_record_rx_queue(pkt_dev->skb, queue_map);
>>       txq = netdev_get_tx_queue(odev, queue_map);
> 
> I think that must be wrong.  The record_rx_queue sets it to queue_map + 1,
> but the hard-start-xmit method (in ixgbe/ixgbe_main.c, at least), takes the
> skb->queue_map and uses it as an index with no subtraction.


Yes but check net/core/dev.c I quoted in my previous mail :
We change queue_map if skb goes through dev_queue_xmit()
(as done by macvlan)

u16 skb_tx_hash(const struct net_device *dev, const struct sk_buff *skb)
{
        u32 hash;

        if (skb_rx_queue_recorded(skb)) {
                hash = skb_get_rx_queue(skb);
                while (unlikely(hash >= dev->real_num_tx_queues))
                        hash -= dev->real_num_tx_queues;
                return hash;
        }

        if (skb->sk && skb->sk->sk_hash)
                hash = skb->sk->sk_hash;
        else
                hash = skb->protocol;

        hash = jhash_1word(hash, skb_tx_hashrnd);

        return (u16) (((u64) hash * dev->real_num_tx_queues) >> 32);
}
EXPORT_SYMBOL(skb_tx_hash);

static struct netdev_queue *dev_pick_tx(struct net_device *dev,
                                        struct sk_buff *skb)
{
        const struct net_device_ops *ops = dev->netdev_ops;
        u16 queue_index = 0;

        if (ops->ndo_select_queue)
                queue_index = ops->ndo_select_queue(dev, skb);
        else if (dev->real_num_tx_queues > 1)
                queue_index = skb_tx_hash(dev, skb);

        skb_set_queue_mapping(skb, queue_index);
        return netdev_get_tx_queue(dev, queue_index);
}

So if skb->queue_mapping was X+1 before entering dev_pick_tx(), it is X when
leaving dev_pick_tx()

> 
> This causes watchdog timeouts because we are calling txq_trans_update in
> pktgen on
> queue 0, for instance, but sending pkts on queue 1.  If queue 1 is ever
> busy when the WD fires, link is reset.
> 

Problem is not pktgen IMHO. 

Problem is skb->queue_mapping has different meaning if skb is directly given to a real device -> start_xmit()

( In this case skb->queue_mapping should be between [O ... real_num_tx_queues-1])

But if it goes through dev_queue_xmit(), it should be set between [1 .. real_num_tx_queues], because
dev_pick_tx() will decrement skb->queue_mapping

In fact skb->queue_mapping only works for forwarded packets, not locally generated ones.

I am too tired to cook a fix at this moment, sorry :(

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20 21:22                 ` Eric Dumazet
@ 2009-10-20 21:30                   ` Ben Greear
  2009-10-20 21:57                     ` Eric Dumazet
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Greear @ 2009-10-20 21:30 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: NetDev, robert, David S. Miller

On 10/20/2009 02:22 PM, Eric Dumazet wrote:

> Problem is skb->queue_mapping has different meaning if skb is directly given to a real device ->  start_xmit()
>
> ( In this case skb->queue_mapping should be between [O ... real_num_tx_queues-1])
>
> But if it goes through dev_queue_xmit(), it should be set between [1 .. real_num_tx_queues], because
> dev_pick_tx() will decrement skb->queue_mapping
>
> In fact skb->queue_mapping only works for forwarded packets, not locally generated ones.
>
> I am too tired to cook a fix at this moment, sorry :(

That's definitely a nasty little issue.  Using skb_set_queue_mapping
in pktgen makes it run for me, but may just be getting lucky with the
mac-vlan interfaces which will do the dev_queue_xmit (but, I don't so much
care exactly what queue is used as long as things don't crash and the
link doesn't reset).

Don't worry about a quick patch on my account.  I seem to have it working
to at least some degree (no funny crashes, no link watchdog timeouts).

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20 21:30                   ` Ben Greear
@ 2009-10-20 21:57                     ` Eric Dumazet
  2009-10-20 23:17                       ` Ben Greear
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2009-10-20 21:57 UTC (permalink / raw)
  To: Ben Greear; +Cc: NetDev, robert, David S. Miller

Ben Greear a écrit :
> That's definitely a nasty little issue.  Using skb_set_queue_mapping
> in pktgen makes it run for me, but may just be getting lucky with the
> mac-vlan interfaces which will do the dev_queue_xmit (but, I don't so much
> care exactly what queue is used as long as things don't crash and the
> link doesn't reset).
> 
> Don't worry about a quick patch on my account.  I seem to have it working
> to at least some degree (no funny crashes, no link watchdog timeouts).
> 

Could you try following patch ?

This makes queue_mapping invariant if set in range [0 ... real_num_tx_queues-1]

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index df7b23a..3ef2429 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2023,17 +2023,17 @@ static inline void skb_copy_queue_mapping(struct sk_buff *to, const struct sk_bu
 
 static inline void skb_record_rx_queue(struct sk_buff *skb, u16 rx_queue)
 {
-	skb->queue_mapping = rx_queue + 1;
+	skb->queue_mapping = rx_queue;
 }
 
 static inline u16 skb_get_rx_queue(const struct sk_buff *skb)
 {
-	return skb->queue_mapping - 1;
+	return skb->queue_mapping;
 }
 
 static inline bool skb_rx_queue_recorded(const struct sk_buff *skb)
 {
-	return (skb->queue_mapping != 0);
+	return (skb->queue_mapping != 0xffff);
 }
 
 extern u16 skb_tx_hash(const struct net_device *dev,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 80a9616..5f04f00 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -198,6 +198,7 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
 	memset(skb, 0, offsetof(struct sk_buff, tail));
 	skb->truesize = size + sizeof(struct sk_buff);
 	atomic_set(&skb->users, 1);
+	skb->queue_mapping = 0xffff;
 	skb->head = data;
 	skb->data = data;
 	skb_reset_tail_pointer(skb);



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20 21:57                     ` Eric Dumazet
@ 2009-10-20 23:17                       ` Ben Greear
  2009-10-21  3:05                         ` Ben Greear
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Greear @ 2009-10-20 23:17 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: NetDev, robert, David S. Miller

On 10/20/2009 02:57 PM, Eric Dumazet wrote:
> Ben Greear a écrit :
>> That's definitely a nasty little issue.  Using skb_set_queue_mapping
>> in pktgen makes it run for me, but may just be getting lucky with the
>> mac-vlan interfaces which will do the dev_queue_xmit (but, I don't so much
>> care exactly what queue is used as long as things don't crash and the
>> link doesn't reset).
>>
>> Don't worry about a quick patch on my account.  I seem to have it working
>> to at least some degree (no funny crashes, no link watchdog timeouts).
>>
>
> Could you try following patch ?
>
> This makes queue_mapping invariant if set in range [0 ... real_num_tx_queues-1]

Yes, that runs w/out causing link resets and without crashes (just tested it for
a few minutes).

Interestingly, the pkts sent by pktgen on the mac-vlans end up in
tx-queues that match processor ID, even though I'm on .31 where mac-vlans
have only one tx-queue and pktgen is setting the queue to 0 in the skb
(per your previous patch).

Must be something somewhere doing the mapping of processor ids to txqs.

I'll try to figure this out this evening or tomorrow.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20 23:17                       ` Ben Greear
@ 2009-10-21  3:05                         ` Ben Greear
  2009-10-21  3:12                           ` Eric Dumazet
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Greear @ 2009-10-21  3:05 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: NetDev, robert, David S. Miller

Ben Greear wrote:
> On 10/20/2009 02:57 PM, Eric Dumazet wrote:
>> Ben Greear a écrit :
>>> That's definitely a nasty little issue.  Using skb_set_queue_mapping
>>> in pktgen makes it run for me, but may just be getting lucky with the
>>> mac-vlan interfaces which will do the dev_queue_xmit (but, I don't 
>>> so much
>>> care exactly what queue is used as long as things don't crash and the
>>> link doesn't reset).
>>>
>>> Don't worry about a quick patch on my account.  I seem to have it 
>>> working
>>> to at least some degree (no funny crashes, no link watchdog timeouts).
>>>
>>
>> Could you try following patch ?
>>
>> This makes queue_mapping invariant if set in range [0 ... 
>> real_num_tx_queues-1]
>
> Yes, that runs w/out causing link resets and without crashes (just 
> tested it for
> a few minutes).
>
> Interestingly, the pkts sent by pktgen on the mac-vlans end up in
> tx-queues that match processor ID, even though I'm on .31 where mac-vlans
> have only one tx-queue and pktgen is setting the queue to 0 in the skb
> (per your previous patch).
Ok, this is because ixgbe implements the ndo_select_queue, which is 
called from
dev_pick_tx.

So, as far as I can tell, as long as you are using ixgbe with 82559, it 
doesn't matter what you set
for the queue-map in the skb..it's always over-written.

I don't know if this is a bug or a feature, but it explains the behavior 
with tx and rx queues
that I saw when using pktgen on mac-vlans...

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com> 
Candela Technologies Inc  http://www.candelatech.com



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-21  3:05                         ` Ben Greear
@ 2009-10-21  3:12                           ` Eric Dumazet
  2009-10-21  3:59                             ` Eric Dumazet
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2009-10-21  3:12 UTC (permalink / raw)
  To: Ben Greear; +Cc: NetDev, robert, David S. Miller

Ben Greear a écrit :
> Ben Greear wrote:
>> On 10/20/2009 02:57 PM, Eric Dumazet wrote:
>>> Ben Greear a écrit :
>>>> That's definitely a nasty little issue.  Using skb_set_queue_mapping
>>>> in pktgen makes it run for me, but may just be getting lucky with the
>>>> mac-vlan interfaces which will do the dev_queue_xmit (but, I don't
>>>> so much
>>>> care exactly what queue is used as long as things don't crash and the
>>>> link doesn't reset).
>>>>
>>>> Don't worry about a quick patch on my account.  I seem to have it
>>>> working
>>>> to at least some degree (no funny crashes, no link watchdog timeouts).
>>>>
>>>
>>> Could you try following patch ?
>>>
>>> This makes queue_mapping invariant if set in range [0 ...
>>> real_num_tx_queues-1]
>>
>> Yes, that runs w/out causing link resets and without crashes (just
>> tested it for
>> a few minutes).
>>
>> Interestingly, the pkts sent by pktgen on the mac-vlans end up in
>> tx-queues that match processor ID, even though I'm on .31 where mac-vlans
>> have only one tx-queue and pktgen is setting the queue to 0 in the skb
>> (per your previous patch).
> Ok, this is because ixgbe implements the ndo_select_queue, which is
> called from
> dev_pick_tx.
> 
> So, as far as I can tell, as long as you are using ixgbe with 82559, it
> doesn't matter what you set
> for the queue-map in the skb..it's always over-written.
> 
> I don't know if this is a bug or a feature, but it explains the behavior
> with tx and rx queues
> that I saw when using pktgen on mac-vlans...
> 

We have many bugs in this area :)

So we probably need to reset skb_set_queue_mapping(skb, queue_map);
each time skb is transmitted by pktgen.

Or else, pktgen will break if using bonding driver -> ixgbe, since bonding
uses only one txqueue (it is not yet multiqueue aware)

Thanks, I'll take care of official patches submission


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-21  3:12                           ` Eric Dumazet
@ 2009-10-21  3:59                             ` Eric Dumazet
  2009-10-21  5:00                               ` Ben Greear
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2009-10-21  3:59 UTC (permalink / raw)
  To: Ben Greear; +Cc: NetDev, robert, David S. Miller

Eric Dumazet a écrit :
> Ben Greear a écrit :
>> Ben Greear wrote:
>>> On 10/20/2009 02:57 PM, Eric Dumazet wrote:
>>>> Ben Greear a écrit :
>>>>> That's definitely a nasty little issue.  Using skb_set_queue_mapping
>>>>> in pktgen makes it run for me, but may just be getting lucky with the
>>>>> mac-vlan interfaces which will do the dev_queue_xmit (but, I don't
>>>>> so much
>>>>> care exactly what queue is used as long as things don't crash and the
>>>>> link doesn't reset).
>>>>>
>>>>> Don't worry about a quick patch on my account.  I seem to have it
>>>>> working
>>>>> to at least some degree (no funny crashes, no link watchdog timeouts).
>>>>>
>>>> Could you try following patch ?
>>>>
>>>> This makes queue_mapping invariant if set in range [0 ...
>>>> real_num_tx_queues-1]
>>> Yes, that runs w/out causing link resets and without crashes (just
>>> tested it for
>>> a few minutes).
>>>
>>> Interestingly, the pkts sent by pktgen on the mac-vlans end up in
>>> tx-queues that match processor ID, even though I'm on .31 where mac-vlans
>>> have only one tx-queue and pktgen is setting the queue to 0 in the skb
>>> (per your previous patch).
>> Ok, this is because ixgbe implements the ndo_select_queue, which is
>> called from
>> dev_pick_tx.
>>
>> So, as far as I can tell, as long as you are using ixgbe with 82559, it
>> doesn't matter what you set
>> for the queue-map in the skb..it's always over-written.
>>
>> I don't know if this is a bug or a feature, but it explains the behavior
>> with tx and rx queues
>> that I saw when using pktgen on mac-vlans...
>>
> 
> We have many bugs in this area :)
> 
> So we probably need to reset skb_set_queue_mapping(skb, queue_map);
> each time skb is transmitted by pktgen.
> 
> Or else, pktgen will break if using bonding driver -> ixgbe, since bonding
> uses only one txqueue (it is not yet multiqueue aware)
> 

After some thoughts, I believe user is in error :)

pktgen should not use "clone XXX" pkts if macvlan is used (or any other driver
that ultimatly calls dev_queue_xmit() and queue packet), since skb queue anchor
is shared and would be overwritten.

1) Only way to use "clone XXXX" pkts is when using real device.

2) Also, using macvlan in pktgen is sub-optimal, since you can already put any
MAC addresses in pktgen pkts, you dont need to go through macvlan layer.

3) If ixgbe overwrites skb->queue_mapping to current cpu, you should setup pktgen
 queue_map_min and queue_map_max to match you cpu number, or use QUEUE_MAP_CPU pktgen flag
Or else, pktgen wont get  the appropriate txq (and lock) before calling driver start_xmit()

Unfortunatly, the (queue_map_min==queue_map_max) case needs a patch that might be not present in 2.6.31
(commit 896a7cf8d846a9e86fb823be16f4f14ffeb7f074 : pktgen: Fix multiqueue handling)


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-21  3:59                             ` Eric Dumazet
@ 2009-10-21  5:00                               ` Ben Greear
  2009-10-21  5:14                                 ` Eric Dumazet
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Greear @ 2009-10-21  5:00 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: NetDev, robert, David S. Miller

Eric Dumazet wrote:
> pktgen should not use "clone XXX" pkts if macvlan is used (or any other driver
> that ultimatly calls dev_queue_xmit() and queue packet), since skb queue anchor
> is shared and would be overwritten.
>   
> After some thoughts, I believe user is in error :)
I tried to explain in my original post:  The problem arises when
when the hard-start-xmit fails with NETDEV_TX_BUSY.  Part of the
hard-start-xmit logic for virtual devices can call dev_queue_xmit, which 
can ultimately
change the queue mapping and yet may still return NETDEV_TX_BUSY.

pktgen would try to resend this skb next loop, and this is where it would
blow up.

I have a patched macvlan logic and a patched dev queue xmit logic that 
allows
me to return NETDEV_TX_BUSY when underlying device fails to transmit.

It may be that my hacked macvlan is the only virtual device that could ever
return NETDEV_TX_BUSY, and if that is the case, I don't think the bug
could ever be hit in official kernel code.  My opinion is that the 
current pktgen code makes
too many assumptions, so unless there is a performance penalty, I still
think it should be cleaned up.  But, I may be too paranoid.
> 1) Only way to use "clone XXXX" pkts is when using real device.
>   
Agreed, and I was not cloning pkts on the mac-vlan interface.
> 2) Also, using macvlan in pktgen is sub-optimal, since you can already put any
> MAC addresses in pktgen pkts, you dont need to go through macvlan layer.
>   
It's sub-optimal for massive pkt pushing, but still useful for sending 
multiple distinct flows
across a single physical wire.
> 3) If ixgbe overwrites skb->queue_mapping to current cpu, you should setup pktgen
>  queue_map_min and queue_map_max to match you cpu number, or use QUEUE_MAP_CPU pktgen flag
> Or else, pktgen wont get  the appropriate txq (and lock) before calling driver start_xmit()
>   
The hard-start-xmit path doesn't call the driver's queue-mapping logic, 
so you
only get that fun when transmitting through mac-vlans (or .1q vlans, 
etc).  There appears to be
no watchdog for virtual devices, and the dev_queue_xmit path updates the 
proper txq, so, as long as you
aren't using that +1 variant of the skb set queue map logic in pktgen, I 
think you will be fine.  The
current code is fine in this manner, but your patch broke it w/out the 
second patch to remove the +1
logic.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com> 
Candela Technologies Inc  http://www.candelatech.com



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-20 21:10               ` Ben Greear
  2009-10-20 21:22                 ` Eric Dumazet
@ 2009-10-21  5:12                 ` Krishna Kumar2
  2009-10-21  5:32                   ` Ben Greear
  1 sibling, 1 reply; 22+ messages in thread
From: Krishna Kumar2 @ 2009-10-21  5:12 UTC (permalink / raw)
  To: Ben Greear; +Cc: David S. Miller, Eric Dumazet, NetDev, netdev-owner, robert

Ben Greear <greearb@candelatech.com>  wrote on 10/21/2009 02:40:13 AM:

Coming back a bit to this post:

> > -   queue_map = skb_get_queue_mapping(pkt_dev->skb);
> > +   queue_map = pkt_dev->cur_queue_map;
> > +   /*
> > +    * tells skb_tx_hash() to use this tx queue.
> > +    * We should reset skb->mapping before each xmit() because
> > +    * xmit() might change it.
> > +    */
> > +   skb_record_rx_queue(pkt_dev->skb, queue_map);
> >      txq = netdev_get_tx_queue(odev, queue_map);
>
> I think that must be wrong.  The record_rx_queue sets it to queue_map +
1,
> but the hard-start-xmit method (in ixgbe/ixgbe_main.c, at least), takes
the
> skb->queue_map and uses it as an index with no subtraction.

But that should work fine. record_rx_q sets queue_mapping to +1,
but skb_tx_hash calls skb_get_rx_queue, which does a -1 on this
value, and updates that value into queue_mapping. Hence it will
not cross the txq boundary. Drivers can use the queue_map value
directly without requiring to subtract.

Thanks,

- KK

>
> This causes watchdog timeouts because we are calling txq_trans_update in
pktgen on
> queue 0, for instance, but sending pkts on queue 1.  If queue 1 is ever
> busy when the WD fires, link is reset.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-21  5:00                               ` Ben Greear
@ 2009-10-21  5:14                                 ` Eric Dumazet
  2009-10-21  5:40                                   ` Ben Greear
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2009-10-21  5:14 UTC (permalink / raw)
  To: Ben Greear; +Cc: NetDev, robert, David S. Miller

Ben Greear a écrit :
> Eric Dumazet wrote:
>> pktgen should not use "clone XXX" pkts if macvlan is used (or any
>> other driver
>> that ultimatly calls dev_queue_xmit() and queue packet), since skb
>> queue anchor
>> is shared and would be overwritten.
>>   After some thoughts, I believe user is in error :)
> I tried to explain in my original post:  The problem arises when
> when the hard-start-xmit fails with NETDEV_TX_BUSY.  Part of the
> hard-start-xmit logic for virtual devices can call dev_queue_xmit, which
> can ultimately
> change the queue mapping and yet may still return NETDEV_TX_BUSY.
> 
> pktgen would try to resend this skb next loop, and this is where it would
> blow up.
> 
> I have a patched macvlan logic and a patched dev queue xmit logic that
> allows
> me to return NETDEV_TX_BUSY when underlying device fails to transmit.
> 
> It may be that my hacked macvlan is the only virtual device that could ever
> return NETDEV_TX_BUSY, and if that is the case, I don't think the bug
> could ever be hit in official kernel code.  My opinion is that the
> current pktgen code makes
> too many assumptions, so unless there is a performance penalty, I still
> think it should be cleaned up.  But, I may be too paranoid.

If a virtual device changes skb->queue_map, it must consume skb,
or it breaks caller.

Alternative would be to restore queue_map to its initial value in
your hacked macvlan when it wants to return NETDEV_TX_BUSY status.


We could add a WARN_ON(skb_get_queue_mapping(pkt_dev->skb) != queue_map);
in pktgen, to catch driver errors but pktgen assumption is right IMHO

@@ -3466,6 +3471,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 		/* fallthru */
 	case NETDEV_TX_LOCKED:
 	case NETDEV_TX_BUSY:
+		WARN_ON(skb_get_queue_mapping(pkt_dev->skb) != queue_map);
 		/* Retry it next time */
 		atomic_dec(&(pkt_dev->skb->users));
 		pkt_dev->last_ok = 0;

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-21  5:12                 ` Krishna Kumar2
@ 2009-10-21  5:32                   ` Ben Greear
  0 siblings, 0 replies; 22+ messages in thread
From: Ben Greear @ 2009-10-21  5:32 UTC (permalink / raw)
  To: Krishna Kumar2
  Cc: David S. Miller, Eric Dumazet, NetDev, netdev-owner, robert

Krishna Kumar2 wrote:
> Ben Greear <greearb@candelatech.com>  wrote on 10/21/2009 02:40:13 AM:
>
> Coming back a bit to this post:
>
>   
>>> -   queue_map = skb_get_queue_mapping(pkt_dev->skb);
>>> +   queue_map = pkt_dev->cur_queue_map;
>>> +   /*
>>> +    * tells skb_tx_hash() to use this tx queue.
>>> +    * We should reset skb->mapping before each xmit() because
>>> +    * xmit() might change it.
>>> +    */
>>> +   skb_record_rx_queue(pkt_dev->skb, queue_map);
>>>      txq = netdev_get_tx_queue(odev, queue_map);
>>>       
>> I think that must be wrong.  The record_rx_queue sets it to queue_map + 1,
>>     
>> but the hard-start-xmit method (in ixgbe/ixgbe_main.c, at least), takes the
>> skb->queue_map and uses it as an index with no subtraction.
>>     
>
> But that should work fine. record_rx_q sets queue_mapping to +1,
> but skb_tx_hash calls skb_get_rx_queue, which does a -1 on this
> value, and updates that value into queue_mapping. Hence it will
> not cross the txq boundary. Drivers can use the queue_map value
> directly without requiring to subtract.
>   
When using pktgen on real physical hardware, there is none of the 
skb_tx_hash or dev_queue_xmit
logic called, just the hard-start-xmit.  That is why it fails to update 
the proper queue with
his first patch. 

On virtual devices like mac-vlans, the logic probably worked ok since it 
goes
through dev_queue_xmit.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com> 
Candela Technologies Inc  http://www.candelatech.com



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: pktgen and spin_lock_bh in xmit path
  2009-10-21  5:14                                 ` Eric Dumazet
@ 2009-10-21  5:40                                   ` Ben Greear
  0 siblings, 0 replies; 22+ messages in thread
From: Ben Greear @ 2009-10-21  5:40 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: NetDev, robert, David S. Miller

Eric Dumazet wrote:
> Ben Greear a écrit :
>   
>> Eric Dumazet wrote:
>>     
>>> pktgen should not use "clone XXX" pkts if macvlan is used (or any
>>> other driver
>>> that ultimatly calls dev_queue_xmit() and queue packet), since skb
>>> queue anchor
>>> is shared and would be overwritten.
>>>   After some thoughts, I believe user is in error :)
>>>       
>> I tried to explain in my original post:  The problem arises when
>> when the hard-start-xmit fails with NETDEV_TX_BUSY.  Part of the
>> hard-start-xmit logic for virtual devices can call dev_queue_xmit, which
>> can ultimately
>> change the queue mapping and yet may still return NETDEV_TX_BUSY.
>>
>> pktgen would try to resend this skb next loop, and this is where it would
>> blow up.
>>
>> I have a patched macvlan logic and a patched dev queue xmit logic that
>> allows
>> me to return NETDEV_TX_BUSY when underlying device fails to transmit.
>>
>> It may be that my hacked macvlan is the only virtual device that could ever
>> return NETDEV_TX_BUSY, and if that is the case, I don't think the bug
>> could ever be hit in official kernel code.  My opinion is that the
>> current pktgen code makes
>> too many assumptions, so unless there is a performance penalty, I still
>> think it should be cleaned up.  But, I may be too paranoid.
>>     
>
> If a virtual device changes skb->queue_map, it must consume skb,
> or it breaks caller.
>   
> Alternative would be to restore queue_map to its initial value in
> your hacked macvlan when it wants to return NETDEV_TX_BUSY status.
>   
Yeah, that sounds fine to me, should be easy and cheap.
> We could add a WARN_ON(skb_get_queue_mapping(pkt_dev->skb) != queue_map);
> in pktgen, to catch driver errors but pktgen assumption is right IMHO
>
> @@ -3466,6 +3471,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
>  		/* fallthru */
>  	case NETDEV_TX_LOCKED:
>  	case NETDEV_TX_BUSY:
> +		WARN_ON(skb_get_queue_mapping(pkt_dev->skb) != queue_map);
>  		/* Retry it next time */
>  		atomic_dec(&(pkt_dev->skb->users));
>  		pkt_dev->last_ok = 0;
>   
That looks good too.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com> 
Candela Technologies Inc  http://www.candelatech.com



^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2009-10-21  5:40 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-20  3:38 pktgen and spin_lock_bh in xmit path Ben Greear
2009-10-20  3:48 ` Eric Dumazet
2009-10-20  4:52   ` Ben Greear
2009-10-20 17:37     ` Ben Greear
2009-10-20 17:44       ` Eric Dumazet
2009-10-20 17:54         ` Ben Greear
2009-10-20 18:35           ` Eric Dumazet
2009-10-20 18:54             ` Eric Dumazet
2009-10-20 20:16               ` Ben Greear
2009-10-20 21:10               ` Ben Greear
2009-10-20 21:22                 ` Eric Dumazet
2009-10-20 21:30                   ` Ben Greear
2009-10-20 21:57                     ` Eric Dumazet
2009-10-20 23:17                       ` Ben Greear
2009-10-21  3:05                         ` Ben Greear
2009-10-21  3:12                           ` Eric Dumazet
2009-10-21  3:59                             ` Eric Dumazet
2009-10-21  5:00                               ` Ben Greear
2009-10-21  5:14                                 ` Eric Dumazet
2009-10-21  5:40                                   ` Ben Greear
2009-10-21  5:12                 ` Krishna Kumar2
2009-10-21  5:32                   ` Ben Greear

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.