All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] net: release dst entry in dev_queue_xmit()
@ 2009-03-20 11:40 Eric Dumazet
  2009-03-20 14:10 ` Neil Horman
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Eric Dumazet @ 2009-03-20 11:40 UTC (permalink / raw)
  To: David S. Miller; +Cc: Linux Netdev List

One point of contention in high network loads is the dst_release() performed
when a transmited skb is freed. This is because NIC tx completion calls skb free
long after original call to dev_queue_xmit(skb).

CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
quite visible if one CPU is 100% handling softirqs for a network device,
since dst_clone() is done by other cpus, involving cache line ping pongs.

I believe we can release dst in dev_queue_xmit(), while cpu cache is hot, since
caller of dev_queue_xmit() had to hold a reference on dst right before. This reduce
work to be done by softirq handler, and decrease cache misses.

I also believe only pktgen can call dev_queue_xmit() with skb which have
a skb->users != 1. But pkthen skbs have a NULL dst entry.

I added a WARN_ON_ONCE() to catch other cases, and not release skb->dst
if skb->users != 1


Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>


diff --git a/net/core/dev.c b/net/core/dev.c
index f112970..9e0fd01 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1852,6 +1852,20 @@ gso:
 	if (q->enqueue) {
 		spinlock_t *root_lock = qdisc_lock(q);
 
+		/*
+		 * Release dst while its refcnt is hot in CPU cache, instead
+		 * of waiting NIC tx completion
+		 */
+		if (likely(skb->dst)) {
+			if (!WARN_ON_ONCE(atomic_read(&skb->users) != 1)) {
+				int newrefcnt;
+
+				smp_mb__before_atomic_dec();
+				newrefcnt = atomic_dec_return(&skb->dst->__refcnt);
+				WARN_ON(newrefcnt < 0);
+				skb->dst = NULL;
+			}
+		}
 		spin_lock(root_lock);
 
 		if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [RFC] net: release dst entry in dev_queue_xmit()
  2009-03-20 11:40 [RFC] net: release dst entry in dev_queue_xmit() Eric Dumazet
@ 2009-03-20 14:10 ` Neil Horman
  2009-03-25  6:43 ` David Miller
  2009-05-12  8:12 ` [PATCH] net: release dst entry in dev_hard_start_xmit() Eric Dumazet
  2 siblings, 0 replies; 28+ messages in thread
From: Neil Horman @ 2009-03-20 14:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S. Miller, Linux Netdev List

On Fri, Mar 20, 2009 at 12:40:22PM +0100, Eric Dumazet wrote:
> One point of contention in high network loads is the dst_release() performed
> when a transmited skb is freed. This is because NIC tx completion calls skb free
> long after original call to dev_queue_xmit(skb).
> 
> CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
> quite visible if one CPU is 100% handling softirqs for a network device,
> since dst_clone() is done by other cpus, involving cache line ping pongs.
> 
> I believe we can release dst in dev_queue_xmit(), while cpu cache is hot, since
> caller of dev_queue_xmit() had to hold a reference on dst right before. This reduce
> work to be done by softirq handler, and decrease cache misses.
> 
> I also believe only pktgen can call dev_queue_xmit() with skb which have
> a skb->users != 1. But pkthen skbs have a NULL dst entry.
> 
> I added a WARN_ON_ONCE() to catch other cases, and not release skb->dst
> if skb->users != 1
> 
> 
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> 
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index f112970..9e0fd01 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1852,6 +1852,20 @@ gso:
>  	if (q->enqueue) {
>  		spinlock_t *root_lock = qdisc_lock(q);
>  
> +		/*
> +		 * Release dst while its refcnt is hot in CPU cache, instead
> +		 * of waiting NIC tx completion
> +		 */
> +		if (likely(skb->dst)) {
> +			if (!WARN_ON_ONCE(atomic_read(&skb->users) != 1)) {
> +				int newrefcnt;
> +
> +				smp_mb__before_atomic_dec();
> +				newrefcnt = atomic_dec_return(&skb->dst->__refcnt);
> +				WARN_ON(newrefcnt < 0);
> +				skb->dst = NULL;
> +			}
> +		}
>  		spin_lock(root_lock);
>  
>  		if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

I think this seems like a pretty good idea.  I thought for a moment that some
stacked interfaces (bonds, vlan devices), might have a problem with this, since
they tend to pass through dev_queue_xmit twice, but I can't see a problem with
either one of those cases either, since neither of thier xmit routines makes any
use of the dst pointer.  I'd say include it

Acked-by: Neil Horman <nhorman@tuxdriver.com>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] net: release dst entry in dev_queue_xmit()
  2009-03-20 11:40 [RFC] net: release dst entry in dev_queue_xmit() Eric Dumazet
  2009-03-20 14:10 ` Neil Horman
@ 2009-03-25  6:43 ` David Miller
  2009-03-25  7:13   ` Eric Dumazet
  2009-05-12  8:12 ` [PATCH] net: release dst entry in dev_hard_start_xmit() Eric Dumazet
  2 siblings, 1 reply; 28+ messages in thread
From: David Miller @ 2009-03-25  6:43 UTC (permalink / raw)
  To: dada1; +Cc: netdev

From: Eric Dumazet <dada1@cosmosbay.com>
Date: Fri, 20 Mar 2009 12:40:22 +0100

> I believe we can release dst in dev_queue_xmit(), while cpu cache is
> hot, since caller of dev_queue_xmit() had to hold a reference on dst
> right before. This reduce work to be done by softirq handler, and
> decrease cache misses.
> 

This will break various packet schedulers and classifiers.

Heck sch_sfq.c even uses skb->dst as part of it's flow hash
function :-)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] net: release dst entry in dev_queue_xmit()
  2009-03-25  6:43 ` David Miller
@ 2009-03-25  7:13   ` Eric Dumazet
  2009-03-25  7:17     ` David Miller
  0 siblings, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2009-03-25  7:13 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

David Miller a écrit :
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Fri, 20 Mar 2009 12:40:22 +0100
> 
>> I believe we can release dst in dev_queue_xmit(), while cpu cache is
>> hot, since caller of dev_queue_xmit() had to hold a reference on dst
>> right before. This reduce work to be done by softirq handler, and
>> decrease cache misses.
>>
> 
> This will break various packet schedulers and classifiers.
> 
> Heck sch_sfq.c even uses skb->dst as part of it's flow hash
> function :-)

Well, as one of the hash perturbator, for other protocols than
IPV4 & IPV6...

        default:
                h = (unsigned long)skb->dst ^ skb->protocol;
                h2 = (unsigned long)skb->sk;
        }

        return sfq_fold_hash(q, h, h2);

But teql indeed mandates dst in __teql_resolve() 

Darn...

This dst freeing should be done very late then, in the NIC driver itself, just
before giving skb to hardware, or right before in dev_hard_start_xmit()

If done in dev_hard_start_xmit(), skb could be requeued (because of NETDEV_TX_BUSY).
Then if requeued, maybe at this time, dst being NULL is not a problem ?

Thanks a lot David


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] net: release dst entry in dev_queue_xmit()
  2009-03-25  7:13   ` Eric Dumazet
@ 2009-03-25  7:17     ` David Miller
  2009-03-25 18:22       ` Jarek Poplawski
  0 siblings, 1 reply; 28+ messages in thread
From: David Miller @ 2009-03-25  7:17 UTC (permalink / raw)
  To: dada1; +Cc: netdev

From: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed, 25 Mar 2009 08:13:30 +0100

> If done in dev_hard_start_xmit(), skb could be requeued (because of
> NETDEV_TX_BUSY).  Then if requeued, maybe at this time, dst being
> NULL is not a problem ?

Usually it should be OK because the packet schedulers have
a sort-of one-behind state that allows them to reinsert
the SKB into their queue datastructures without reclassifying.

But I'm not %100 sure there isn't some case that might still
need skb->dst there.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] net: release dst entry in dev_queue_xmit()
  2009-03-25  7:17     ` David Miller
@ 2009-03-25 18:22       ` Jarek Poplawski
  2009-03-25 18:41         ` Eric Dumazet
  0 siblings, 1 reply; 28+ messages in thread
From: Jarek Poplawski @ 2009-03-25 18:22 UTC (permalink / raw)
  To: David Miller; +Cc: dada1, netdev

David Miller wrote, On 03/25/2009 08:17 AM:

> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Wed, 25 Mar 2009 08:13:30 +0100
> 
>> If done in dev_hard_start_xmit(), skb could be requeued (because of
>> NETDEV_TX_BUSY).  Then if requeued, maybe at this time, dst being
>> NULL is not a problem ?
> 
> Usually it should be OK because the packet schedulers have
> a sort-of one-behind state that allows them to reinsert
> the SKB into their queue datastructures without reclassifying.


Actually, since David has dumped requeuing there is no reinserting;
this last one "requeued" skb is buffered at the top in q->gso_skb
and waiting for better times.

Jarek P.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] net: release dst entry in dev_queue_xmit()
  2009-03-25 18:22       ` Jarek Poplawski
@ 2009-03-25 18:41         ` Eric Dumazet
  2009-03-25 19:18           ` Jarek Poplawski
  0 siblings, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2009-03-25 18:41 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David Miller, netdev

Jarek Poplawski a écrit :
> David Miller wrote, On 03/25/2009 08:17 AM:
> 
>> From: Eric Dumazet <dada1@cosmosbay.com>
>> Date: Wed, 25 Mar 2009 08:13:30 +0100
>>
>>> If done in dev_hard_start_xmit(), skb could be requeued (because of
>>> NETDEV_TX_BUSY).  Then if requeued, maybe at this time, dst being
>>> NULL is not a problem ?
>> Usually it should be OK because the packet schedulers have
>> a sort-of one-behind state that allows them to reinsert
>> the SKB into their queue datastructures without reclassifying.
> 
> 
> Actually, since David has dumped requeuing there is no reinserting;
> this last one "requeued" skb is buffered at the top in q->gso_skb
> and waiting for better times.

Yes indeed, this is what I thought too, thanks Jarek.

I tested following patch today on my machine, but obviously could not try 
all possible quirks :)

[PATCH] net: release dst entry in dev_hard_start_xmit()

One point of contention in high network loads is the dst_release() performed
when a transmited skb is freed. This is because NIC tx completion calls skb free
long after original call to dev_queue_xmit(skb).

CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
quite visible if one CPU is 100% handling softirqs for a network device,
since dst_clone() is done by other cpus, involving cache line ping pongs.

I believe we can release dst in dev_hard_start_xmit(), while cpu cache is hot, since
caller of dev_queue_xmit() had to hold a reference on dst right before. This reduce
work to be done by softirq handler, and decrease cache misses.

I also believe only pktgen can call dev_queue_xmit() with skb which have
a skb->users != 1. But pkthen skbs have a NULL dst entry.

I added a WARN_ON_ONCE() to catch other cases, and not release skb->dst
if skb->users != 1

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

diff --git a/net/core/dev.c b/net/core/dev.c
index e3fe5c7..a622db6 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1664,6 +1664,26 @@ static int dev_gso_segment(struct sk_buff *skb)
 	return 0;
 }
 
+
+/*
+ * Release dst while its refcnt is likely hot in CPU cache, instead
+ * of waiting NIC tx completion.
+ * We inline dst_release() code for performance reason
+ */
+static void release_skb_dst(struct sk_buff *skb)
+{
+	if (likely(skb->dst)) {
+		if (!WARN_ON_ONCE(atomic_read(&skb->users) != 1)) {
+			int newrefcnt;
+
+			smp_mb__before_atomic_dec();
+			newrefcnt = atomic_dec_return(&skb->dst->__refcnt);
+			WARN_ON(newrefcnt < 0);
+			skb->dst = NULL;
+		}
+	}
+}
+
 int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 			struct netdev_queue *txq)
 {
@@ -1681,6 +1701,7 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 				goto gso;
 		}
 
+		release_skb_dst(skb);
 		return ops->ndo_start_xmit(skb, dev);
 	}
 
@@ -1691,6 +1712,7 @@ gso:
 
 		skb->next = nskb->next;
 		nskb->next = NULL;
+		release_skb_dst(nskb);
 		rc = ops->ndo_start_xmit(nskb, dev);
 		if (unlikely(rc)) {
 			nskb->next = skb->next;


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [RFC] net: release dst entry in dev_queue_xmit()
  2009-03-25 18:41         ` Eric Dumazet
@ 2009-03-25 19:18           ` Jarek Poplawski
  2009-03-25 19:40             ` Eric Dumazet
  0 siblings, 1 reply; 28+ messages in thread
From: Jarek Poplawski @ 2009-03-25 19:18 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

On Wed, Mar 25, 2009 at 07:41:27PM +0100, Eric Dumazet wrote:
> Jarek Poplawski a écrit :
> > David Miller wrote, On 03/25/2009 08:17 AM:
> > 
> >> From: Eric Dumazet <dada1@cosmosbay.com>
> >> Date: Wed, 25 Mar 2009 08:13:30 +0100
> >>
> >>> If done in dev_hard_start_xmit(), skb could be requeued (because of
> >>> NETDEV_TX_BUSY).  Then if requeued, maybe at this time, dst being
> >>> NULL is not a problem ?
> >> Usually it should be OK because the packet schedulers have
> >> a sort-of one-behind state that allows them to reinsert
> >> the SKB into their queue datastructures without reclassifying.
> > 
> > 
> > Actually, since David has dumped requeuing there is no reinserting;
> > this last one "requeued" skb is buffered at the top in q->gso_skb
> > and waiting for better times.
> 
> Yes indeed, this is what I thought too, thanks Jarek.

Alas I'm a bit concerned with virtual devs, e.g. now I'm looking at
xmits in macvlan and pppoe. Maybe this patch should exclude them?

Jarek P.

> 
> I tested following patch today on my machine, but obviously could not try 
> all possible quirks :)
> 
> [PATCH] net: release dst entry in dev_hard_start_xmit()
> 
> One point of contention in high network loads is the dst_release() performed
> when a transmited skb is freed. This is because NIC tx completion calls skb free
> long after original call to dev_queue_xmit(skb).
> 
> CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
> quite visible if one CPU is 100% handling softirqs for a network device,
> since dst_clone() is done by other cpus, involving cache line ping pongs.
> 
> I believe we can release dst in dev_hard_start_xmit(), while cpu cache is hot, since
> caller of dev_queue_xmit() had to hold a reference on dst right before. This reduce
> work to be done by softirq handler, and decrease cache misses.
> 
> I also believe only pktgen can call dev_queue_xmit() with skb which have
> a skb->users != 1. But pkthen skbs have a NULL dst entry.
> 
> I added a WARN_ON_ONCE() to catch other cases, and not release skb->dst
> if skb->users != 1
> 
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index e3fe5c7..a622db6 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1664,6 +1664,26 @@ static int dev_gso_segment(struct sk_buff *skb)
>  	return 0;
>  }
>  
> +
> +/*
> + * Release dst while its refcnt is likely hot in CPU cache, instead
> + * of waiting NIC tx completion.
> + * We inline dst_release() code for performance reason
> + */
> +static void release_skb_dst(struct sk_buff *skb)
> +{
> +	if (likely(skb->dst)) {
> +		if (!WARN_ON_ONCE(atomic_read(&skb->users) != 1)) {
> +			int newrefcnt;
> +
> +			smp_mb__before_atomic_dec();
> +			newrefcnt = atomic_dec_return(&skb->dst->__refcnt);
> +			WARN_ON(newrefcnt < 0);
> +			skb->dst = NULL;
> +		}
> +	}
> +}
> +
>  int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
>  			struct netdev_queue *txq)
>  {
> @@ -1681,6 +1701,7 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
>  				goto gso;
>  		}
>  
> +		release_skb_dst(skb);
>  		return ops->ndo_start_xmit(skb, dev);
>  	}
>  
> @@ -1691,6 +1712,7 @@ gso:
>  
>  		skb->next = nskb->next;
>  		nskb->next = NULL;
> +		release_skb_dst(nskb);
>  		rc = ops->ndo_start_xmit(nskb, dev);
>  		if (unlikely(rc)) {
>  			nskb->next = skb->next;
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] net: release dst entry in dev_queue_xmit()
  2009-03-25 19:18           ` Jarek Poplawski
@ 2009-03-25 19:40             ` Eric Dumazet
  2009-03-25 19:54               ` Jarek Poplawski
  0 siblings, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2009-03-25 19:40 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David Miller, netdev

Jarek Poplawski a écrit :
> On Wed, Mar 25, 2009 at 07:41:27PM +0100, Eric Dumazet wrote:
>> Jarek Poplawski a écrit :
>>> David Miller wrote, On 03/25/2009 08:17 AM:
>>>
>>>> From: Eric Dumazet <dada1@cosmosbay.com>
>>>> Date: Wed, 25 Mar 2009 08:13:30 +0100
>>>>
>>>>> If done in dev_hard_start_xmit(), skb could be requeued (because of
>>>>> NETDEV_TX_BUSY).  Then if requeued, maybe at this time, dst being
>>>>> NULL is not a problem ?
>>>> Usually it should be OK because the packet schedulers have
>>>> a sort-of one-behind state that allows them to reinsert
>>>> the SKB into their queue datastructures without reclassifying.
>>>
>>> Actually, since David has dumped requeuing there is no reinserting;
>>> this last one "requeued" skb is buffered at the top in q->gso_skb
>>> and waiting for better times.
>> Yes indeed, this is what I thought too, thanks Jarek.
> 
> Alas I'm a bit concerned with virtual devs, e.g. now I'm looking at
> xmits in macvlan and pppoe. Maybe this patch should exclude them?
> 

Yes, MACVLAN :) its macvlan_start_xmit() function calls
dev_queue_xmit(skb) again, so we go back to packet schedulers and
classifiers, they might need dst again :(

Only other potential problem I found was in 
drivers/net/appletalk/ipddp.c

static int ipddp_xmit(struct sk_buff *skb, struct net_device *dev)
{
__be32 paddr = ((struct rtable*)skb->dst)->rt_gateway;

Not sure this driver is still supported, or if this paddr can be found elsewhere...
 __sk_dst_get(skb->sk) ???


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] net: release dst entry in dev_queue_xmit()
  2009-03-25 19:40             ` Eric Dumazet
@ 2009-03-25 19:54               ` Jarek Poplawski
  2009-03-25 20:28                 ` Eric Dumazet
  0 siblings, 1 reply; 28+ messages in thread
From: Jarek Poplawski @ 2009-03-25 19:54 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

On Wed, Mar 25, 2009 at 08:40:05PM +0100, Eric Dumazet wrote:
> Jarek Poplawski a écrit :
...
> > Alas I'm a bit concerned with virtual devs, e.g. now I'm looking at
> > xmits in macvlan and pppoe. Maybe this patch should exclude them?
> > 
> 
> Yes, MACVLAN :) its macvlan_start_xmit() function calls
> dev_queue_xmit(skb) again, so we go back to packet schedulers and
> classifiers, they might need dst again :(
> 
> Only other potential problem I found was in 
> drivers/net/appletalk/ipddp.c
> 
> static int ipddp_xmit(struct sk_buff *skb, struct net_device *dev)
> {
> __be32 paddr = ((struct rtable*)skb->dst)->rt_gateway;
> 
> Not sure this driver is still supported, or if this paddr can be found elsewhere...
>  __sk_dst_get(skb->sk) ???
> 

I guess you've considered the loopback too...

Jarek P.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] net: release dst entry in dev_queue_xmit()
  2009-03-25 19:54               ` Jarek Poplawski
@ 2009-03-25 20:28                 ` Eric Dumazet
  2009-03-25 21:12                   ` Jarek Poplawski
  0 siblings, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2009-03-25 20:28 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David Miller, netdev

Jarek Poplawski a écrit :

> I guess you've considered the loopback too...
> 

I had for the first patch (carefuly avoiding loopback being not responsive) :
I was releasing dst only if enqueue() was called.

You are right, loopback doesnt work anymore with last patch,
and I have no idea why...

Do you know why ?

Thank you


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] net: release dst entry in dev_queue_xmit()
  2009-03-25 20:28                 ` Eric Dumazet
@ 2009-03-25 21:12                   ` Jarek Poplawski
  2009-03-25 21:20                     ` Patrick McHardy
  0 siblings, 1 reply; 28+ messages in thread
From: Jarek Poplawski @ 2009-03-25 21:12 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

On Wed, Mar 25, 2009 at 09:28:37PM +0100, Eric Dumazet wrote:
> Jarek Poplawski a écrit :
> 
> > I guess you've considered the loopback too...
> > 
> 
> I had for the first patch (carefuly avoiding loopback being not responsive) :
> I was releasing dst only if enqueue() was called.
> 
> You are right, loopback doesnt work anymore with last patch,
> and I have no idea why...
> 
> Do you know why ?

Hmm.. isn't this test for dst == NULL in ip_rcv_finish expected to be
negative for loopback source?

Jarek P.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] net: release dst entry in dev_queue_xmit()
  2009-03-25 21:12                   ` Jarek Poplawski
@ 2009-03-25 21:20                     ` Patrick McHardy
  0 siblings, 0 replies; 28+ messages in thread
From: Patrick McHardy @ 2009-03-25 21:20 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Eric Dumazet, David Miller, netdev

Jarek Poplawski wrote:
> On Wed, Mar 25, 2009 at 09:28:37PM +0100, Eric Dumazet wrote:
>   
>> Jarek Poplawski a écrit :
>>
>>     
>>> I guess you've considered the loopback too...
>>>
>>>       
>> I had for the first patch (carefuly avoiding loopback being not responsive) :
>> I was releasing dst only if enqueue() was called.
>>
>> You are right, loopback doesnt work anymore with last patch,
>> and I have no idea why...
>>
>> Do you know why ?
>>     
>
> Hmm.. isn't this test for dst == NULL in ip_rcv_finish expected to be
> negative for loopback source?

ip_route_input() doesn't accept loopback addresses, so loopback packets
already need to have a dst_entry attached.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] net: release dst entry in dev_hard_start_xmit()
  2009-03-20 11:40 [RFC] net: release dst entry in dev_queue_xmit() Eric Dumazet
  2009-03-20 14:10 ` Neil Horman
  2009-03-25  6:43 ` David Miller
@ 2009-05-12  8:12 ` Eric Dumazet
  2009-05-12  8:21   ` Eric Dumazet
  2 siblings, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2009-05-12  8:12 UTC (permalink / raw)
  To: David S. Miller; +Cc: Linux Netdev List, Jarek Poplawski, Patrick McHardy

One point of contention in high network loads is the dst_release() performed
when a transmited skb is freed. This is because NIC tx completion calls
dev_kree_skb() long after original call to dev_queue_xmit(skb).

CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
quite visible if one CPU is 100% handling softirqs for a network device,
since dst_clone() is done by other cpus, involving cache line ping pongs.

It seems right place to release dst is in dev_hard_start_xmit(), for most
devices but ones that are virtual, and some exceptions.

David Miller suggested to define a new device flag, set in alloc_netdev_mq()
(so that most devices set it at init time), and carefuly unset in devices
which dont want a NULL skb->dst in their ndo_start_xmit().

List of devices that must clear this flag is :

- loopback device, because it calls netif_rx() and quoting Patrick :
    "ip_route_input() doesn't accept loopback addresses, so loopback packets
     already need to have a dst_entry attached."
- appletalk/ipddp.c : needs skb->dst in its xmit function

- And all devices that call again dev_queue_xmit() from their xmit function
(as some classifiers need skb->dst) : bonding, macvlan, eql, ifb, hdlc_fr

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
 drivers/net/appletalk/ipddp.c   |    1 +
 drivers/net/bonding/bond_main.c |    1 +
 drivers/net/eql.c               |    1 +
 drivers/net/ifb.c               |    1 +
 drivers/net/loopback.c          |    1 +
 drivers/net/macvlan.c           |    1 +
 drivers/net/wan/hdlc_fr.c       |    1 +
 include/linux/if.h              |    3 +++
 net/core/dev.c                  |    9 +++++++++
 9 files changed, 19 insertions(+)

diff --git a/drivers/net/appletalk/ipddp.c b/drivers/net/appletalk/ipddp.c
index da64ba8..0268561 100644
--- a/drivers/net/appletalk/ipddp.c
+++ b/drivers/net/appletalk/ipddp.c
@@ -71,6 +71,7 @@ static struct net_device * __init ipddp_init(void)
 	if (!dev)
 		return ERR_PTR(-ENOMEM);
 
+	dev->priv_flags &= IFF_XMIT_DST_RELEASE;
 	strcpy(dev->name, "ipddp%d");
 
 	if (version_printed++ == 0)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 815191d..a29f421 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -5148,6 +5148,7 @@ int bond_create(char *name, struct bond_params *params)
 		goto out_rtnl;
 	}
 
+	bond_dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
 	if (!name) {
 		res = dev_alloc_name(bond_dev, "bond%d");
 		if (res < 0)
diff --git a/drivers/net/eql.c b/drivers/net/eql.c
index 5210bb1..19b7dd9 100644
--- a/drivers/net/eql.c
+++ b/drivers/net/eql.c
@@ -194,6 +194,7 @@ static void __init eql_setup(struct net_device *dev)
 
 	dev->type       	= ARPHRD_SLIP;
 	dev->tx_queue_len 	= 5;		/* Hands them off fast */
+	dev->priv_flags	       &= ~IFF_XMIT_DST_RELEASE;
 }
 
 static int eql_open(struct net_device *dev)
diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index 60a2630..96713ef 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -156,6 +156,7 @@ static void ifb_setup(struct net_device *dev)
 
 	dev->flags |= IFF_NOARP;
 	dev->flags &= ~IFF_MULTICAST;
+	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
 	random_ether_addr(dev->dev_addr);
 }
 
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 6f71157..da472c6 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -170,6 +170,7 @@ static void loopback_setup(struct net_device *dev)
 	dev->tx_queue_len	= 0;
 	dev->type		= ARPHRD_LOOPBACK;	/* 0x0001*/
 	dev->flags		= IFF_LOOPBACK;
+	dev->priv_flags	       &= ~IFF_XMIT_DST_RELEASE;
 	dev->features 		= NETIF_F_SG | NETIF_F_FRAGLIST
 		| NETIF_F_TSO
 		| NETIF_F_NO_CSUM
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 329cd50..d5334b4 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -414,6 +414,7 @@ static void macvlan_setup(struct net_device *dev)
 {
 	ether_setup(dev);
 
+	dev->priv_flags	       &= ~IFF_XMIT_DST_RELEASE;
 	dev->netdev_ops		= &macvlan_netdev_ops;
 	dev->destructor		= free_netdev;
 	dev->header_ops		= &macvlan_hard_header_ops,
diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c
index 8005301..bfa0161 100644
--- a/drivers/net/wan/hdlc_fr.c
+++ b/drivers/net/wan/hdlc_fr.c
@@ -1054,6 +1054,7 @@ static void pvc_setup(struct net_device *dev)
 	dev->flags = IFF_POINTOPOINT;
 	dev->hard_header_len = 10;
 	dev->addr_len = 2;
+	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
 }
 
 static const struct net_device_ops pvc_ops = {
diff --git a/include/linux/if.h b/include/linux/if.h
index 1108f3e..b9a6229 100644
--- a/include/linux/if.h
+++ b/include/linux/if.h
@@ -67,6 +67,9 @@
 #define IFF_ISATAP	0x80		/* ISATAP interface (RFC4214)	*/
 #define IFF_MASTER_ARPMON 0x100		/* bonding master, ARP mon in use */
 #define IFF_WAN_HDLC	0x200		/* WAN HDLC device		*/
+#define IFF_XMIT_DST_RELEASE 0x400	/* dev_hard_start_xmit() is allowed to
+					 * release skb->dst
+					 */
 
 #define IF_GET_IFACE	0x0001		/* for querying only */
 #define IF_GET_PROTO	0x0002
diff --git a/net/core/dev.c b/net/core/dev.c
index 14dd725..b9aef53 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1688,6 +1688,14 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 				goto gso;
 		}
 
+		/*
+		 * If device doesnt need skb->dst, release it right now while
+		 * its hot in this cpu cache
+		 */
+		if ((dev->priv_flags & IFF_XMIT_DST_RELEASE) && skb->dst) {
+			dst_release(skb->dst);
+			skb->dst = NULL;
+		}
 		rc = ops->ndo_start_xmit(skb, dev);
 		/*
 		 * TODO: if skb_orphan() was called by
@@ -5028,6 +5036,7 @@ struct net_device *alloc_netdev_mq(int sizeof_priv, const char *name,
 	netdev_init_queues(dev);
 
 	INIT_LIST_HEAD(&dev->napi_list);
+	dev->priv_flags = IFF_XMIT_DST_RELEASE;
 	setup(dev);
 	strcpy(dev->name, name);
 	return dev;


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH] net: release dst entry in dev_hard_start_xmit()
  2009-05-12  8:12 ` [PATCH] net: release dst entry in dev_hard_start_xmit() Eric Dumazet
@ 2009-05-12  8:21   ` Eric Dumazet
  2009-05-12 19:26     ` [PATCH, v2] " Eric Dumazet
  2009-05-12 19:27     ` [PATCH] " Jarek Poplawski
  0 siblings, 2 replies; 28+ messages in thread
From: Eric Dumazet @ 2009-05-12  8:21 UTC (permalink / raw)
  To: David S. Miller; +Cc: Linux Netdev List, Jarek Poplawski, Patrick McHardy

Eric Dumazet a écrit :
> One point of contention in high network loads is the dst_release() performed
> when a transmited skb is freed. This is because NIC tx completion calls
> dev_kree_skb() long after original call to dev_queue_xmit(skb).
> 
> CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
> quite visible if one CPU is 100% handling softirqs for a network device,
> since dst_clone() is done by other cpus, involving cache line ping pongs.
> 
> It seems right place to release dst is in dev_hard_start_xmit(), for most
> devices but ones that are virtual, and some exceptions.
> 
> David Miller suggested to define a new device flag, set in alloc_netdev_mq()
> (so that most devices set it at init time), and carefuly unset in devices
> which dont want a NULL skb->dst in their ndo_start_xmit().
> 
> List of devices that must clear this flag is :
> 
> - loopback device, because it calls netif_rx() and quoting Patrick :
>     "ip_route_input() doesn't accept loopback addresses, so loopback packets
>      already need to have a dst_entry attached."
> - appletalk/ipddp.c : needs skb->dst in its xmit function
> 
> - And all devices that call again dev_queue_xmit() from their xmit function
> (as some classifiers need skb->dst) : bonding, macvlan, eql, ifb, hdlc_fr
> 
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> ---
>  drivers/net/appletalk/ipddp.c   |    1 +
>  drivers/net/bonding/bond_main.c |    1 +
>  drivers/net/eql.c               |    1 +
>  drivers/net/ifb.c               |    1 +
>  drivers/net/loopback.c          |    1 +
>  drivers/net/macvlan.c           |    1 +
>  drivers/net/wan/hdlc_fr.c       |    1 +
>  include/linux/if.h              |    3 +++
>  net/core/dev.c                  |    9 +++++++++
>  9 files changed, 19 insertions(+)
> 
> diff --git a/drivers/net/appletalk/ipddp.c b/drivers/net/appletalk/ipddp.c
> index da64ba8..0268561 100644
> --- a/drivers/net/appletalk/ipddp.c
> +++ b/drivers/net/appletalk/ipddp.c
> @@ -71,6 +71,7 @@ static struct net_device * __init ipddp_init(void)
>  	if (!dev)
>  		return ERR_PTR(-ENOMEM);
>  
> +	dev->priv_flags &= IFF_XMIT_DST_RELEASE;


Oops, I forgot the  ~ here, sorry

>  	strcpy(dev->name, "ipddp%d");
>  
>  	if (version_printed++ == 0)
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 815191d..a29f421 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -5148,6 +5148,7 @@ int bond_create(char *name, struct bond_params *params)
>  		goto out_rtnl;
>  	}
>  
> +	bond_dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
>  	if (!name) {
>  		res = dev_alloc_name(bond_dev, "bond%d");
>  		if (res < 0)
> diff --git a/drivers/net/eql.c b/drivers/net/eql.c
> index 5210bb1..19b7dd9 100644
> --- a/drivers/net/eql.c
> +++ b/drivers/net/eql.c
> @@ -194,6 +194,7 @@ static void __init eql_setup(struct net_device *dev)
>  
>  	dev->type       	= ARPHRD_SLIP;
>  	dev->tx_queue_len 	= 5;		/* Hands them off fast */
> +	dev->priv_flags	       &= ~IFF_XMIT_DST_RELEASE;
>  }
>  
>  static int eql_open(struct net_device *dev)
> diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
> index 60a2630..96713ef 100644
> --- a/drivers/net/ifb.c
> +++ b/drivers/net/ifb.c
> @@ -156,6 +156,7 @@ static void ifb_setup(struct net_device *dev)
>  
>  	dev->flags |= IFF_NOARP;
>  	dev->flags &= ~IFF_MULTICAST;
> +	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
>  	random_ether_addr(dev->dev_addr);
>  }
>  
> diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> index 6f71157..da472c6 100644
> --- a/drivers/net/loopback.c
> +++ b/drivers/net/loopback.c
> @@ -170,6 +170,7 @@ static void loopback_setup(struct net_device *dev)
>  	dev->tx_queue_len	= 0;
>  	dev->type		= ARPHRD_LOOPBACK;	/* 0x0001*/
>  	dev->flags		= IFF_LOOPBACK;
> +	dev->priv_flags	       &= ~IFF_XMIT_DST_RELEASE;
>  	dev->features 		= NETIF_F_SG | NETIF_F_FRAGLIST
>  		| NETIF_F_TSO
>  		| NETIF_F_NO_CSUM
> diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
> index 329cd50..d5334b4 100644
> --- a/drivers/net/macvlan.c
> +++ b/drivers/net/macvlan.c
> @@ -414,6 +414,7 @@ static void macvlan_setup(struct net_device *dev)
>  {
>  	ether_setup(dev);
>  
> +	dev->priv_flags	       &= ~IFF_XMIT_DST_RELEASE;
>  	dev->netdev_ops		= &macvlan_netdev_ops;
>  	dev->destructor		= free_netdev;
>  	dev->header_ops		= &macvlan_hard_header_ops,
> diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c
> index 8005301..bfa0161 100644
> --- a/drivers/net/wan/hdlc_fr.c
> +++ b/drivers/net/wan/hdlc_fr.c
> @@ -1054,6 +1054,7 @@ static void pvc_setup(struct net_device *dev)
>  	dev->flags = IFF_POINTOPOINT;
>  	dev->hard_header_len = 10;
>  	dev->addr_len = 2;
> +	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
>  }
>  
>  static const struct net_device_ops pvc_ops = {
> diff --git a/include/linux/if.h b/include/linux/if.h
> index 1108f3e..b9a6229 100644
> --- a/include/linux/if.h
> +++ b/include/linux/if.h
> @@ -67,6 +67,9 @@
>  #define IFF_ISATAP	0x80		/* ISATAP interface (RFC4214)	*/
>  #define IFF_MASTER_ARPMON 0x100		/* bonding master, ARP mon in use */
>  #define IFF_WAN_HDLC	0x200		/* WAN HDLC device		*/
> +#define IFF_XMIT_DST_RELEASE 0x400	/* dev_hard_start_xmit() is allowed to
> +					 * release skb->dst
> +					 */
>  
>  #define IF_GET_IFACE	0x0001		/* for querying only */
>  #define IF_GET_PROTO	0x0002
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 14dd725..b9aef53 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1688,6 +1688,14 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
>  				goto gso;
>  		}
>  
> +		/*
> +		 * If device doesnt need skb->dst, release it right now while
> +		 * its hot in this cpu cache
> +		 */
> +		if ((dev->priv_flags & IFF_XMIT_DST_RELEASE) && skb->dst) {
> +			dst_release(skb->dst);
> +			skb->dst = NULL;
> +		}
>  		rc = ops->ndo_start_xmit(skb, dev);
>  		/*
>  		 * TODO: if skb_orphan() was called by
> @@ -5028,6 +5036,7 @@ struct net_device *alloc_netdev_mq(int sizeof_priv, const char *name,
>  	netdev_init_queues(dev);
>  
>  	INIT_LIST_HEAD(&dev->napi_list);
> +	dev->priv_flags = IFF_XMIT_DST_RELEASE;
>  	setup(dev);
>  	strcpy(dev->name, name);
>  	return dev;
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH, v2] net: release dst entry in dev_hard_start_xmit()
  2009-05-12  8:21   ` Eric Dumazet
@ 2009-05-12 19:26     ` Eric Dumazet
  2009-05-19  5:19       ` David Miller
  2009-05-12 19:27     ` [PATCH] " Jarek Poplawski
  1 sibling, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2009-05-12 19:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: Linux Netdev List, Jarek Poplawski, Patrick McHardy

Eric Dumazet a écrit :
> Eric Dumazet a écrit :
>> One point of contention in high network loads is the dst_release() performed
>> when a transmited skb is freed. This is because NIC tx completion calls
>> dev_kree_skb() long after original call to dev_queue_xmit(skb).
>>
>> CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
>> quite visible if one CPU is 100% handling softirqs for a network device,
>> since dst_clone() is done by other cpus, involving cache line ping pongs.
>>
>> It seems right place to release dst is in dev_hard_start_xmit(), for most
>> devices but ones that are virtual, and some exceptions.
>>
>> David Miller suggested to define a new device flag, set in alloc_netdev_mq()
>> (so that most devices set it at init time), and carefuly unset in devices
>> which dont want a NULL skb->dst in their ndo_start_xmit().
>>
>> List of devices that must clear this flag is :
>>
>> - loopback device, because it calls netif_rx() and quoting Patrick :
>>     "ip_route_input() doesn't accept loopback addresses, so loopback packets
>>      already need to have a dst_entry attached."
>> - appletalk/ipddp.c : needs skb->dst in its xmit function
>>
>> - And all devices that call again dev_queue_xmit() from their xmit function
>> (as some classifiers need skb->dst) : bonding, macvlan, eql, ifb, hdlc_fr
>>
>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>> ---
>>  drivers/net/appletalk/ipddp.c   |    1 +
>>  drivers/net/bonding/bond_main.c |    1 +
>>  drivers/net/eql.c               |    1 +
>>  drivers/net/ifb.c               |    1 +
>>  drivers/net/loopback.c          |    1 +
>>  drivers/net/macvlan.c           |    1 +
>>  drivers/net/wan/hdlc_fr.c       |    1 +
>>  include/linux/if.h              |    3 +++
>>  net/core/dev.c                  |    9 +++++++++
>>  9 files changed, 19 insertions(+)
>>
>> diff --git a/drivers/net/appletalk/ipddp.c b/drivers/net/appletalk/ipddp.c
>> index da64ba8..0268561 100644
>> --- a/drivers/net/appletalk/ipddp.c
>> +++ b/drivers/net/appletalk/ipddp.c
>> @@ -71,6 +71,7 @@ static struct net_device * __init ipddp_init(void)
>>  	if (!dev)
>>  		return ERR_PTR(-ENOMEM);
>>  
>> +	dev->priv_flags &= IFF_XMIT_DST_RELEASE;
> 
> 
> Oops, I forgot the  ~ here, sorry

Here is a second version, including change to vlan code as well.

Thank you

[PATCH] net: release dst entry in dev_hard_start_xmit()

One point of contention in high network loads is the dst_release() performed
when a transmited skb is freed. This is because NIC tx completion calls
dev_kree_skb() long after original call to dev_queue_xmit(skb).

CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
quite visible if one CPU is 100% handling softirqs for a network device,
since dst_clone() is done by other cpus, involving cache line ping pongs.

It seems right place to release dst is in dev_hard_start_xmit(), for most
devices but ones that are virtual, and some exceptions.

David Miller suggested to define a new device flag, set in alloc_netdev_mq()
(so that most devices set it at init time), and carefuly unset in devices
which dont want a NULL skb->dst in their ndo_start_xmit().

List of devices that must clear this flag is :

- loopback device, because it calls netif_rx() and quoting Patrick :
    "ip_route_input() doesn't accept loopback addresses, so loopback packets
     already need to have a dst_entry attached."
- appletalk/ipddp.c : needs skb->dst in its xmit function

- And all devices that call again dev_queue_xmit() from their xmit function
(as some classifiers need skb->dst) : bonding, vlan, macvlan, eql, ifb, hdlc_fr

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
 drivers/net/appletalk/ipddp.c   |    1 +
 drivers/net/bonding/bond_main.c |    1 +
 drivers/net/eql.c               |    1 +
 drivers/net/ifb.c               |    1 +
 drivers/net/loopback.c          |    1 +
 drivers/net/macvlan.c           |    1 +
 drivers/net/wan/hdlc_fr.c       |    1 +
 include/linux/if.h              |    3 +++
 net/8021q/vlan_dev.c            |    1 +
 net/core/dev.c                  |    9 +++++++++
 10 files changed, 20 insertions(+)

diff --git a/drivers/net/appletalk/ipddp.c b/drivers/net/appletalk/ipddp.c
index da64ba8..f939e92 100644
--- a/drivers/net/appletalk/ipddp.c
+++ b/drivers/net/appletalk/ipddp.c
@@ -71,6 +71,7 @@ static struct net_device * __init ipddp_init(void)
 	if (!dev)
 		return ERR_PTR(-ENOMEM);
 
+	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
 	strcpy(dev->name, "ipddp%d");
 
 	if (version_printed++ == 0)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 815191d..a29f421 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -5148,6 +5148,7 @@ int bond_create(char *name, struct bond_params *params)
 		goto out_rtnl;
 	}
 
+	bond_dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
 	if (!name) {
 		res = dev_alloc_name(bond_dev, "bond%d");
 		if (res < 0)
diff --git a/drivers/net/eql.c b/drivers/net/eql.c
index 5210bb1..19b7dd9 100644
--- a/drivers/net/eql.c
+++ b/drivers/net/eql.c
@@ -194,6 +194,7 @@ static void __init eql_setup(struct net_device *dev)
 
 	dev->type       	= ARPHRD_SLIP;
 	dev->tx_queue_len 	= 5;		/* Hands them off fast */
+	dev->priv_flags	       &= ~IFF_XMIT_DST_RELEASE;
 }
 
 static int eql_open(struct net_device *dev)
diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index 60a2630..96713ef 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -156,6 +156,7 @@ static void ifb_setup(struct net_device *dev)
 
 	dev->flags |= IFF_NOARP;
 	dev->flags &= ~IFF_MULTICAST;
+	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
 	random_ether_addr(dev->dev_addr);
 }
 
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 6f71157..da472c6 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -170,6 +170,7 @@ static void loopback_setup(struct net_device *dev)
 	dev->tx_queue_len	= 0;
 	dev->type		= ARPHRD_LOOPBACK;	/* 0x0001*/
 	dev->flags		= IFF_LOOPBACK;
+	dev->priv_flags	       &= ~IFF_XMIT_DST_RELEASE;
 	dev->features 		= NETIF_F_SG | NETIF_F_FRAGLIST
 		| NETIF_F_TSO
 		| NETIF_F_NO_CSUM
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 329cd50..d5334b4 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -414,6 +414,7 @@ static void macvlan_setup(struct net_device *dev)
 {
 	ether_setup(dev);
 
+	dev->priv_flags	       &= ~IFF_XMIT_DST_RELEASE;
 	dev->netdev_ops		= &macvlan_netdev_ops;
 	dev->destructor		= free_netdev;
 	dev->header_ops		= &macvlan_hard_header_ops,
diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c
index 8005301..bfa0161 100644
--- a/drivers/net/wan/hdlc_fr.c
+++ b/drivers/net/wan/hdlc_fr.c
@@ -1054,6 +1054,7 @@ static void pvc_setup(struct net_device *dev)
 	dev->flags = IFF_POINTOPOINT;
 	dev->hard_header_len = 10;
 	dev->addr_len = 2;
+	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
 }
 
 static const struct net_device_ops pvc_ops = {
diff --git a/include/linux/if.h b/include/linux/if.h
index 1108f3e..b9a6229 100644
--- a/include/linux/if.h
+++ b/include/linux/if.h
@@ -67,6 +67,9 @@
 #define IFF_ISATAP	0x80		/* ISATAP interface (RFC4214)	*/
 #define IFF_MASTER_ARPMON 0x100		/* bonding master, ARP mon in use */
 #define IFF_WAN_HDLC	0x200		/* WAN HDLC device		*/
+#define IFF_XMIT_DST_RELEASE 0x400	/* dev_hard_start_xmit() is allowed to
+					 * release skb->dst
+					 */
 
 #define IF_GET_IFACE	0x0001		/* for querying only */
 #define IF_GET_PROTO	0x0002
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 25ba41e..faa535f 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -739,6 +739,7 @@ void vlan_setup(struct net_device *dev)
 	ether_setup(dev);
 
 	dev->priv_flags		|= IFF_802_1Q_VLAN;
+	dev->priv_flags		&= ~IFF_XMIT_DST_RELEASE;
 	dev->tx_queue_len	= 0;
 
 	dev->netdev_ops		= &vlan_netdev_ops;
diff --git a/net/core/dev.c b/net/core/dev.c
index 14dd725..b9aef53 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1688,6 +1688,14 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 				goto gso;
 		}
 
+		/*
+		 * If device doesnt need skb->dst, release it right now while
+		 * its hot in this cpu cache
+		 */
+		if ((dev->priv_flags & IFF_XMIT_DST_RELEASE) && skb->dst) {
+			dst_release(skb->dst);
+			skb->dst = NULL;
+		}
 		rc = ops->ndo_start_xmit(skb, dev);
 		/*
 		 * TODO: if skb_orphan() was called by
@@ -5028,6 +5036,7 @@ struct net_device *alloc_netdev_mq(int sizeof_priv, const char *name,
 	netdev_init_queues(dev);
 
 	INIT_LIST_HEAD(&dev->napi_list);
+	dev->priv_flags = IFF_XMIT_DST_RELEASE;
 	setup(dev);
 	strcpy(dev->name, name);
 	return dev;


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH] net: release dst entry in dev_hard_start_xmit()
  2009-05-12  8:21   ` Eric Dumazet
  2009-05-12 19:26     ` [PATCH, v2] " Eric Dumazet
@ 2009-05-12 19:27     ` Jarek Poplawski
  2009-05-12 19:44       ` Eric Dumazet
  1 sibling, 1 reply; 28+ messages in thread
From: Jarek Poplawski @ 2009-05-12 19:27 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S. Miller, Linux Netdev List, Patrick McHardy

On Tue, May 12, 2009 at 10:21:12AM +0200, Eric Dumazet wrote:
> Eric Dumazet a écrit :
...
> > List of devices that must clear this flag is :
> > 
> > - loopback device, because it calls netif_rx() and quoting Patrick :
> >     "ip_route_input() doesn't accept loopback addresses, so loopback packets
> >      already need to have a dst_entry attached."
> > - appletalk/ipddp.c : needs skb->dst in its xmit function
> > 
> > - And all devices that call again dev_queue_xmit() from their xmit function
> > (as some classifiers need skb->dst) : bonding, macvlan, eql, ifb, hdlc_fr

Why not vlan (and/or maybe others in net/ using dev_queue_xmit)?

Jarek P.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] net: release dst entry in dev_hard_start_xmit()
  2009-05-12 19:27     ` [PATCH] " Jarek Poplawski
@ 2009-05-12 19:44       ` Eric Dumazet
  2009-05-12 20:05         ` Jarek Poplawski
  0 siblings, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2009-05-12 19:44 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David S. Miller, Linux Netdev List, Patrick McHardy

Jarek Poplawski a écrit :
> On Tue, May 12, 2009 at 10:21:12AM +0200, Eric Dumazet wrote:
>> Eric Dumazet a écrit :
> ...
>>> List of devices that must clear this flag is :
>>>
>>> - loopback device, because it calls netif_rx() and quoting Patrick :
>>>     "ip_route_input() doesn't accept loopback addresses, so loopback packets
>>>      already need to have a dst_entry attached."
>>> - appletalk/ipddp.c : needs skb->dst in its xmit function
>>>
>>> - And all devices that call again dev_queue_xmit() from their xmit function
>>> (as some classifiers need skb->dst) : bonding, macvlan, eql, ifb, hdlc_fr
> 
> Why not vlan (and/or maybe others in net/ using dev_queue_xmit)?
> 

Yes I spoted vlan earlier this afternoon.

For other net/*, I didnt not find candidates yet.



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] net: release dst entry in dev_hard_start_xmit()
  2009-05-12 19:44       ` Eric Dumazet
@ 2009-05-12 20:05         ` Jarek Poplawski
  2009-05-12 20:24           ` Jarek Poplawski
  2009-05-12 20:52           ` Eric Dumazet
  0 siblings, 2 replies; 28+ messages in thread
From: Jarek Poplawski @ 2009-05-12 20:05 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S. Miller, Linux Netdev List, Patrick McHardy

On Tue, May 12, 2009 at 09:44:52PM +0200, Eric Dumazet wrote:
> Jarek Poplawski a écrit :
> > On Tue, May 12, 2009 at 10:21:12AM +0200, Eric Dumazet wrote:
> >> Eric Dumazet a écrit :
> > ...
> >>> List of devices that must clear this flag is :
> >>>
> >>> - loopback device, because it calls netif_rx() and quoting Patrick :
> >>>     "ip_route_input() doesn't accept loopback addresses, so loopback packets
> >>>      already need to have a dst_entry attached."
> >>> - appletalk/ipddp.c : needs skb->dst in its xmit function
> >>>
> >>> - And all devices that call again dev_queue_xmit() from their xmit function
> >>> (as some classifiers need skb->dst) : bonding, macvlan, eql, ifb, hdlc_fr
> > 
> > Why not vlan (and/or maybe others in net/ using dev_queue_xmit)?
> > 
> 
> Yes I spoted vlan earlier this afternoon.
> 
> For other net/*, I didnt not find candidates yet.

Hmm..., if vlan, then why not pppoe (and/or maybe others in drivers/net/
using dev_queue_xmit)?

Jarek P.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] net: release dst entry in dev_hard_start_xmit()
  2009-05-12 20:05         ` Jarek Poplawski
@ 2009-05-12 20:24           ` Jarek Poplawski
  2009-05-12 20:52           ` Eric Dumazet
  1 sibling, 0 replies; 28+ messages in thread
From: Jarek Poplawski @ 2009-05-12 20:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S. Miller, Linux Netdev List, Patrick McHardy

On Tue, May 12, 2009 at 10:05:15PM +0200, Jarek Poplawski wrote:
...
> Hmm..., if vlan, then why not pppoe (and/or maybe others in drivers/net/
> using dev_queue_xmit)?

I wonder if we can't simplify this e.g. by checking indirectly for
some hardware flag/capability like checksums etc. Some older hardware
could be omitted, by is it really so big deal?

Jarek P.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] net: release dst entry in dev_hard_start_xmit()
  2009-05-12 20:05         ` Jarek Poplawski
  2009-05-12 20:24           ` Jarek Poplawski
@ 2009-05-12 20:52           ` Eric Dumazet
  2009-05-12 20:59             ` Jarek Poplawski
  1 sibling, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2009-05-12 20:52 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David S. Miller, Linux Netdev List, Patrick McHardy

Jarek Poplawski a écrit :
> On Tue, May 12, 2009 at 09:44:52PM +0200, Eric Dumazet wrote:
>> Jarek Poplawski a écrit :
>>> On Tue, May 12, 2009 at 10:21:12AM +0200, Eric Dumazet wrote:
>>>> Eric Dumazet a écrit :
>>> ...
>>>>> List of devices that must clear this flag is :
>>>>>
>>>>> - loopback device, because it calls netif_rx() and quoting Patrick :
>>>>>     "ip_route_input() doesn't accept loopback addresses, so loopback packets
>>>>>      already need to have a dst_entry attached."
>>>>> - appletalk/ipddp.c : needs skb->dst in its xmit function
>>>>>
>>>>> - And all devices that call again dev_queue_xmit() from their xmit function
>>>>> (as some classifiers need skb->dst) : bonding, macvlan, eql, ifb, hdlc_fr
>>> Why not vlan (and/or maybe others in net/ using dev_queue_xmit)?
>>>
>> Yes I spoted vlan earlier this afternoon.
>>
>> For other net/*, I didnt not find candidates yet.
> 
> Hmm..., if vlan, then why not pppoe (and/or maybe others in drivers/net/
> using dev_queue_xmit)?
> 

As I said, I didnt found other relevant uses, but I am probably wrong :)

About ppoe for example, two calls to dev_queue_xmit() are not relevant.

One is from 

static int pppoe_sendmsg(struct kiocb *iocb, struct socket *sock,
          struct msghdr *m, size_t total_len)

This is a normal direct call, not a call from its ndo_start_xmit()

Second one is from

static int __pppoe_xmit(struct sock *sk, struct sk_buff *skb)

Same observation, there is no impact for this one as well.



So we have to care on device drivers that have a ndo_start_xmit() call that could
re-enter dev_queue_xmit(). Not care about drivers that call dev_queue_xmit() from
other paths.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] net: release dst entry in dev_hard_start_xmit()
  2009-05-12 20:52           ` Eric Dumazet
@ 2009-05-12 20:59             ` Jarek Poplawski
  0 siblings, 0 replies; 28+ messages in thread
From: Jarek Poplawski @ 2009-05-12 20:59 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S. Miller, Linux Netdev List, Patrick McHardy

On Tue, May 12, 2009 at 10:52:27PM +0200, Eric Dumazet wrote:
> Jarek Poplawski a écrit :
> > On Tue, May 12, 2009 at 09:44:52PM +0200, Eric Dumazet wrote:
> >> Jarek Poplawski a écrit :
> >>> On Tue, May 12, 2009 at 10:21:12AM +0200, Eric Dumazet wrote:
> >>>> Eric Dumazet a écrit :
> >>> ...
> >>>>> List of devices that must clear this flag is :
> >>>>>
> >>>>> - loopback device, because it calls netif_rx() and quoting Patrick :
> >>>>>     "ip_route_input() doesn't accept loopback addresses, so loopback packets
> >>>>>      already need to have a dst_entry attached."
> >>>>> - appletalk/ipddp.c : needs skb->dst in its xmit function
> >>>>>
> >>>>> - And all devices that call again dev_queue_xmit() from their xmit function
> >>>>> (as some classifiers need skb->dst) : bonding, macvlan, eql, ifb, hdlc_fr
> >>> Why not vlan (and/or maybe others in net/ using dev_queue_xmit)?
> >>>
> >> Yes I spoted vlan earlier this afternoon.
> >>
> >> For other net/*, I didnt not find candidates yet.
> > 
> > Hmm..., if vlan, then why not pppoe (and/or maybe others in drivers/net/
> > using dev_queue_xmit)?
> > 
> 
> As I said, I didnt found other relevant uses, but I am probably wrong :)
> 
> About ppoe for example, two calls to dev_queue_xmit() are not relevant.
> 
> One is from 
> 
> static int pppoe_sendmsg(struct kiocb *iocb, struct socket *sock,
>           struct msghdr *m, size_t total_len)
> 
> This is a normal direct call, not a call from its ndo_start_xmit()
> 
> Second one is from
> 
> static int __pppoe_xmit(struct sock *sk, struct sk_buff *skb)
> 
> Same observation, there is no impact for this one as well.
> 
> 
> 
> So we have to care on device drivers that have a ndo_start_xmit() call that could
> re-enter dev_queue_xmit(). Not care about drivers that call dev_queue_xmit() from
> other paths.
> 

Isn't it called through ppp_generic's ndo_start_xmit()?

Jarek P.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH, v2] net: release dst entry in dev_hard_start_xmit()
  2009-05-12 19:26     ` [PATCH, v2] " Eric Dumazet
@ 2009-05-19  5:19       ` David Miller
  2009-05-19  5:44         ` Eric Dumazet
  2009-05-19 19:44         ` Eric Dumazet
  0 siblings, 2 replies; 28+ messages in thread
From: David Miller @ 2009-05-19  5:19 UTC (permalink / raw)
  To: dada1; +Cc: netdev, jarkao2, kaber

From: Eric Dumazet <dada1@cosmosbay.com>
Date: Tue, 12 May 2009 21:26:35 +0200

> [PATCH] net: release dst entry in dev_hard_start_xmit()
 ...
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

Applied, thanks Eric.

Eric, please followup and double-check the pppoe paths
that Jarek mentioned.  I never saw that fully resolved.

Thanks!

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH, v2] net: release dst entry in dev_hard_start_xmit()
  2009-05-19  5:19       ` David Miller
@ 2009-05-19  5:44         ` Eric Dumazet
  2009-05-19 19:44         ` Eric Dumazet
  1 sibling, 0 replies; 28+ messages in thread
From: Eric Dumazet @ 2009-05-19  5:44 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, jarkao2, kaber

David Miller a écrit :
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Tue, 12 May 2009 21:26:35 +0200
> 
>> [PATCH] net: release dst entry in dev_hard_start_xmit()
>  ...
>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> 
> Applied, thanks Eric.
> 
> Eric, please followup and double-check the pppoe paths
> that Jarek mentioned.  I never saw that fully resolved.
> 

Ah OK, I 'll do this David

Thanks !



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH, v2] net: release dst entry in dev_hard_start_xmit()
  2009-05-19  5:19       ` David Miller
  2009-05-19  5:44         ` Eric Dumazet
@ 2009-05-19 19:44         ` Eric Dumazet
  2009-05-19 21:09           ` Jarek Poplawski
  1 sibling, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2009-05-19 19:44 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, jarkao2, kaber

David Miller a écrit :
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Tue, 12 May 2009 21:26:35 +0200
> 
>> [PATCH] net: release dst entry in dev_hard_start_xmit()
>  ...
>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> 
> Applied, thanks Eric.
> 
> Eric, please followup and double-check the pppoe paths
> that Jarek mentioned.  I never saw that fully resolved.
> 

[PATCH] ppp: unset IFF_XMIT_DST_RELEASE in ppp_setup()

Jarek pointed pppoe can call back dev_queue_xmit(), and might need
skb->dst, so its safer to unset IFF_XMIT_DST_RELEASE on ppp devices.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
diff --git a/drivers/net/ppp_generic.c b/drivers/net/ppp_generic.c
index 8ee9142..639d11b 100644
--- a/drivers/net/ppp_generic.c
+++ b/drivers/net/ppp_generic.c
@@ -1054,6 +1054,7 @@ static void ppp_setup(struct net_device *dev)
 	dev->type = ARPHRD_PPP;
 	dev->flags = IFF_POINTOPOINT | IFF_NOARP | IFF_MULTICAST;
 	dev->features |= NETIF_F_NETNS_LOCAL;
+	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
 }
 
 /*


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH, v2] net: release dst entry in dev_hard_start_xmit()
  2009-05-19 19:44         ` Eric Dumazet
@ 2009-05-19 21:09           ` Jarek Poplawski
  2009-05-19 21:21             ` Eric Dumazet
  2009-05-19 21:24             ` David Miller
  0 siblings, 2 replies; 28+ messages in thread
From: Jarek Poplawski @ 2009-05-19 21:09 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, kaber

On Tue, May 19, 2009 at 09:44:14PM +0200, Eric Dumazet wrote:
> David Miller a écrit :
> > From: Eric Dumazet <dada1@cosmosbay.com>
> > Date: Tue, 12 May 2009 21:26:35 +0200
> > 
> >> [PATCH] net: release dst entry in dev_hard_start_xmit()
> >  ...
> >> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> > 
> > Applied, thanks Eric.
> > 
> > Eric, please followup and double-check the pppoe paths
> > that Jarek mentioned.  I never saw that fully resolved.
> > 
> 
> [PATCH] ppp: unset IFF_XMIT_DST_RELEASE in ppp_setup()
> 
> Jarek pointed pppoe can call back dev_queue_xmit(), and might need
> skb->dst, so its safer to unset IFF_XMIT_DST_RELEASE on ppp devices.

Hmm... Of course, this patch looks OK to me, but actually my main
concern was more general. We avoid adding such flags for each "real"
dev, but if so IMHO it would be safer to generally add them to all
"virtual" devs - needed or not. You prefer to do this only where
necessary, but it's not always clear if it's omitted on purpose or
by chance. So, now I'm wondering about xen-netfront - needlessly I
hope ;-)

Jarek P.

> 
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> ---
> diff --git a/drivers/net/ppp_generic.c b/drivers/net/ppp_generic.c
> index 8ee9142..639d11b 100644
> --- a/drivers/net/ppp_generic.c
> +++ b/drivers/net/ppp_generic.c
> @@ -1054,6 +1054,7 @@ static void ppp_setup(struct net_device *dev)
>  	dev->type = ARPHRD_PPP;
>  	dev->flags = IFF_POINTOPOINT | IFF_NOARP | IFF_MULTICAST;
>  	dev->features |= NETIF_F_NETNS_LOCAL;
> +	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
>  }
>  
>  /*
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH, v2] net: release dst entry in dev_hard_start_xmit()
  2009-05-19 21:09           ` Jarek Poplawski
@ 2009-05-19 21:21             ` Eric Dumazet
  2009-05-19 21:24             ` David Miller
  1 sibling, 0 replies; 28+ messages in thread
From: Eric Dumazet @ 2009-05-19 21:21 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David Miller, netdev, kaber

Jarek Poplawski a écrit :
> On Tue, May 19, 2009 at 09:44:14PM +0200, Eric Dumazet wrote:
>> David Miller a écrit :
>>> From: Eric Dumazet <dada1@cosmosbay.com>
>>> Date: Tue, 12 May 2009 21:26:35 +0200
>>>
>>>> [PATCH] net: release dst entry in dev_hard_start_xmit()
>>>  ...
>>>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>>> Applied, thanks Eric.
>>>
>>> Eric, please followup and double-check the pppoe paths
>>> that Jarek mentioned.  I never saw that fully resolved.
>>>
>> [PATCH] ppp: unset IFF_XMIT_DST_RELEASE in ppp_setup()
>>
>> Jarek pointed pppoe can call back dev_queue_xmit(), and might need
>> skb->dst, so its safer to unset IFF_XMIT_DST_RELEASE on ppp devices.
> 
> Hmm... Of course, this patch looks OK to me, but actually my main
> concern was more general. We avoid adding such flags for each "real"
> dev, but if so IMHO it would be safer to generally add them to all
> "virtual" devs - needed or not. You prefer to do this only where
> necessary, but it's not always clear if it's omitted on purpose or
> by chance. So, now I'm wondering about xen-netfront - needlessly I
> hope ;-)
> 

This is the deal in fact, tracking all valid uses, and I'll check this.

Another path would have to set the flag only for fast devices (Gb and 10Gb)




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH, v2] net: release dst entry in dev_hard_start_xmit()
  2009-05-19 21:09           ` Jarek Poplawski
  2009-05-19 21:21             ` Eric Dumazet
@ 2009-05-19 21:24             ` David Miller
  1 sibling, 0 replies; 28+ messages in thread
From: David Miller @ 2009-05-19 21:24 UTC (permalink / raw)
  To: jarkao2; +Cc: dada1, netdev, kaber

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Tue, 19 May 2009 23:09:37 +0200

> Hmm... Of course, this patch looks OK to me, but actually my main
> concern was more general. We avoid adding such flags for each "real"
> dev, but if so IMHO it would be safer to generally add them to all
> "virtual" devs - needed or not. You prefer to do this only where
> necessary, but it's not always clear if it's omitted on purpose or
> by chance. So, now I'm wondering about xen-netfront - needlessly I
> hope ;-)

It is an issue that surely needs to be fleshed out, to make
sure we use this where possible without breaking odd cases
too easily.

But for now I'm applying Eric's patch because it does fix
something until we have these issues sorted more generally.

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2009-05-19 21:24 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-03-20 11:40 [RFC] net: release dst entry in dev_queue_xmit() Eric Dumazet
2009-03-20 14:10 ` Neil Horman
2009-03-25  6:43 ` David Miller
2009-03-25  7:13   ` Eric Dumazet
2009-03-25  7:17     ` David Miller
2009-03-25 18:22       ` Jarek Poplawski
2009-03-25 18:41         ` Eric Dumazet
2009-03-25 19:18           ` Jarek Poplawski
2009-03-25 19:40             ` Eric Dumazet
2009-03-25 19:54               ` Jarek Poplawski
2009-03-25 20:28                 ` Eric Dumazet
2009-03-25 21:12                   ` Jarek Poplawski
2009-03-25 21:20                     ` Patrick McHardy
2009-05-12  8:12 ` [PATCH] net: release dst entry in dev_hard_start_xmit() Eric Dumazet
2009-05-12  8:21   ` Eric Dumazet
2009-05-12 19:26     ` [PATCH, v2] " Eric Dumazet
2009-05-19  5:19       ` David Miller
2009-05-19  5:44         ` Eric Dumazet
2009-05-19 19:44         ` Eric Dumazet
2009-05-19 21:09           ` Jarek Poplawski
2009-05-19 21:21             ` Eric Dumazet
2009-05-19 21:24             ` David Miller
2009-05-12 19:27     ` [PATCH] " Jarek Poplawski
2009-05-12 19:44       ` Eric Dumazet
2009-05-12 20:05         ` Jarek Poplawski
2009-05-12 20:24           ` Jarek Poplawski
2009-05-12 20:52           ` Eric Dumazet
2009-05-12 20:59             ` Jarek Poplawski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.