* [PATCH net-next] net: properly flush delay-freed skbs
@ 2016-11-23 16:44 Eric Dumazet
2016-11-23 17:12 ` Alexander Duyck
2016-11-26 0:49 ` David Miller
0 siblings, 2 replies; 4+ messages in thread
From: Eric Dumazet @ 2016-11-23 16:44 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Jesper Dangaard Brouer, Alexander Duyck
From: Eric Dumazet <edumazet@google.com>
Typical NAPI drivers use napi_consume_skb(skb) at TX completion time.
This put skb in a percpu special queue, napi_alloc_cache, to get bulk
frees.
It turns out the queue is not flushed and hits the NAPI_SKB_CACHE_SIZE
limit quite often, with skbs that were queued hundreds of usec earlier.
I measured this can take ~6000 nsec to perform one flush.
__kfree_skb_flush() can be called from two points right now :
1) From net_tx_action(), but only for skbs that were queued to
sd->completion_queue.
-> Irrelevant for NAPI drivers in normal operation.
2) From net_rx_action(), but only under high stress or if RPS/RFS has a
pending action.
This patch changes net_rx_action() to perform the flush in all cases and
after more urgent operations happened (like kicking remote CPUS for
RPS/RFS).
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
---
net/core/dev.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index f71b34ab57a5132647729d20e21376d362d4e630..048b46b7c92ae10080226ea7050fad3529920baa 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5260,7 +5260,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
if (list_empty(&list)) {
if (!sd_has_rps_ipi_waiting(sd) && list_empty(&repoll))
- return;
+ goto out;
break;
}
@@ -5278,7 +5278,6 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
}
}
- __kfree_skb_flush();
local_irq_disable();
list_splice_tail_init(&sd->poll_list, &list);
@@ -5288,6 +5287,8 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
__raise_softirq_irqoff(NET_RX_SOFTIRQ);
net_rps_action_and_irq_enable(sd);
+out:
+ __kfree_skb_flush();
}
struct netdev_adjacent {
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH net-next] net: properly flush delay-freed skbs
2016-11-23 16:44 [PATCH net-next] net: properly flush delay-freed skbs Eric Dumazet
@ 2016-11-23 17:12 ` Alexander Duyck
2016-11-23 18:11 ` Jesper Dangaard Brouer
2016-11-26 0:49 ` David Miller
1 sibling, 1 reply; 4+ messages in thread
From: Alexander Duyck @ 2016-11-23 17:12 UTC (permalink / raw)
To: Eric Dumazet
Cc: David Miller, netdev, Jesper Dangaard Brouer, Alexander Duyck
On Wed, Nov 23, 2016 at 8:44 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Typical NAPI drivers use napi_consume_skb(skb) at TX completion time.
> This put skb in a percpu special queue, napi_alloc_cache, to get bulk
> frees.
>
> It turns out the queue is not flushed and hits the NAPI_SKB_CACHE_SIZE
> limit quite often, with skbs that were queued hundreds of usec earlier.
> I measured this can take ~6000 nsec to perform one flush.
>
> __kfree_skb_flush() can be called from two points right now :
>
> 1) From net_tx_action(), but only for skbs that were queued to
> sd->completion_queue.
>
> -> Irrelevant for NAPI drivers in normal operation.
>
> 2) From net_rx_action(), but only under high stress or if RPS/RFS has a
> pending action.
>
> This patch changes net_rx_action() to perform the flush in all cases and
> after more urgent operations happened (like kicking remote CPUS for
> RPS/RFS).
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jesper Dangaard Brouer <brouer@redhat.com>
> Cc: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
Yeah, we didn't intent the data to be sitting around that long. The
change looks good to me.
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net-next] net: properly flush delay-freed skbs
2016-11-23 17:12 ` Alexander Duyck
@ 2016-11-23 18:11 ` Jesper Dangaard Brouer
0 siblings, 0 replies; 4+ messages in thread
From: Jesper Dangaard Brouer @ 2016-11-23 18:11 UTC (permalink / raw)
To: Alexander Duyck
Cc: Eric Dumazet, David Miller, netdev, Alexander Duyck, brouer
On Wed, 23 Nov 2016 09:12:50 -0800
Alexander Duyck <alexander.duyck@gmail.com> wrote:
> On Wed, Nov 23, 2016 at 8:44 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > Typical NAPI drivers use napi_consume_skb(skb) at TX completion time.
> > This put skb in a percpu special queue, napi_alloc_cache, to get bulk
> > frees.
> >
> > It turns out the queue is not flushed and hits the NAPI_SKB_CACHE_SIZE
> > limit quite often, with skbs that were queued hundreds of usec earlier.
> > I measured this can take ~6000 nsec to perform one flush.
> >
> > __kfree_skb_flush() can be called from two points right now :
> >
> > 1) From net_tx_action(), but only for skbs that were queued to
> > sd->completion_queue.
> >
> > -> Irrelevant for NAPI drivers in normal operation.
> >
> > 2) From net_rx_action(), but only under high stress or if RPS/RFS has a
> > pending action.
> >
> > This patch changes net_rx_action() to perform the flush in all cases and
> > after more urgent operations happened (like kicking remote CPUS for
> > RPS/RFS).
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: Jesper Dangaard Brouer <brouer@redhat.com>
> > Cc: Alexander Duyck <alexander.h.duyck@intel.com>
> > ---
>
> Yeah, we didn't intent the data to be sitting around that long. The
> change looks good to me.
>
> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Also looks good to me! Thanks for catching this.
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net-next] net: properly flush delay-freed skbs
2016-11-23 16:44 [PATCH net-next] net: properly flush delay-freed skbs Eric Dumazet
2016-11-23 17:12 ` Alexander Duyck
@ 2016-11-26 0:49 ` David Miller
1 sibling, 0 replies; 4+ messages in thread
From: David Miller @ 2016-11-26 0:49 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev, brouer, alexander.h.duyck
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 23 Nov 2016 08:44:56 -0800
> From: Eric Dumazet <edumazet@google.com>
>
> Typical NAPI drivers use napi_consume_skb(skb) at TX completion time.
> This put skb in a percpu special queue, napi_alloc_cache, to get bulk
> frees.
>
> It turns out the queue is not flushed and hits the NAPI_SKB_CACHE_SIZE
> limit quite often, with skbs that were queued hundreds of usec earlier.
> I measured this can take ~6000 nsec to perform one flush.
>
> __kfree_skb_flush() can be called from two points right now :
>
> 1) From net_tx_action(), but only for skbs that were queued to
> sd->completion_queue.
>
> -> Irrelevant for NAPI drivers in normal operation.
>
> 2) From net_rx_action(), but only under high stress or if RPS/RFS has a
> pending action.
>
> This patch changes net_rx_action() to perform the flush in all cases and
> after more urgent operations happened (like kicking remote CPUS for
> RPS/RFS).
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jesper Dangaard Brouer <brouer@redhat.com>
> Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Applied.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2016-11-26 0:50 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-23 16:44 [PATCH net-next] net: properly flush delay-freed skbs Eric Dumazet
2016-11-23 17:12 ` Alexander Duyck
2016-11-23 18:11 ` Jesper Dangaard Brouer
2016-11-26 0:49 ` David Miller
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.