net: fix memory leaks in flush_backlog() with RPS

Message ID 20200502031516.2825-1-cai@lca.pw
State New

Commit Message

Qian Cai May 2, 2020, 3:15 a.m. UTC
netif_receive_skb_list_internal() can call enqueue_to_backlog() to queue
skbs on softnet_data.input_pkt_queue, and ip_route_input_slow() then
allocates a dst_entry that is attached with skb_dst_set(). Later,

cleanup_net
  default_device_exit_batch
    unregister_netdevice_many
      rollback_registered_many
        flush_all_backlogs

will call flush_backlog() for all CPUs which would call kfree_skb() for
each skb on the input_pkt_queue without calling skb_dst_drop() first.

unreferenced object 0xffff97008e4c4040 (size 176):
 comm "softirq", pid 0, jiffies 4295173845 (age 32012.550s)
 hex dump (first 32 bytes):
   00 d0 a5 74 04 97 ff ff 40 72 1a 96 ff ff ff ff  ...t....@r......
   c1 a3 c5 95 ff ff ff ff 00 00 00 00 00 00 00 00  ................
 backtrace:
   [<0000000030483fae>] kmem_cache_alloc+0x184/0x430
   [<000000007ae17545>] dst_alloc+0x8e/0x128
   [<000000001efe9a1f>] rt_dst_alloc+0x6f/0x1e0
   rt_dst_alloc at net/ipv4/route.c:1628
   [<00000000e67d4dac>] ip_route_input_rcu+0xdfe/0x1640
   ip_route_input_slow at net/ipv4/route.c:2218
   (inlined by) ip_route_input_rcu at net/ipv4/route.c:2348
   [<000000009f30cbc0>] ip_route_input_noref+0xab/0x1a0
   [<000000004f53bd04>] arp_process+0x83a/0xf50
   arp_process at net/ipv4/arp.c:813 (discriminator 1)
   [<0000000061fd547d>] arp_rcv+0x276/0x330
   [<0000000007dbfa7a>] __netif_receive_skb_list_core+0x4d2/0x500
   [<0000000062d5f6d2>] netif_receive_skb_list_internal+0x4cb/0x7d0
   [<000000002baa2b74>] gro_normal_list+0x55/0xc0
   [<0000000093d04885>] napi_complete_done+0xea/0x350
   [<00000000467dd088>] tg3_poll_msix+0x174/0x310 [tg3]
   [<00000000498af7d9>] net_rx_action+0x278/0x890
   [<000000001e81d7e6>] __do_softirq+0xd9/0x589
   [<00000000087ee354>] irq_exit+0xa2/0xc0
   [<000000001c4db0cd>] do_IRQ+0x87/0x180

Signed-off-by: Qian Cai <cai@lca.pw>
---
 net/core/dev.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Eric Dumazet May 2, 2020, 3:32 a.m. UTC | #1
On 5/1/20 8:15 PM, Qian Cai wrote:
> netif_receive_skb_list_internal() can call enqueue_to_backlog() to queue
> skbs on softnet_data.input_pkt_queue, and ip_route_input_slow() then
> allocates a dst_entry that is attached with skb_dst_set(). Later,
> 
> cleanup_net
>   default_device_exit_batch
>     unregister_netdevice_many
>       rollback_registered_many
>         flush_all_backlogs
> 
> will call flush_backlog() for all CPUs which would call kfree_skb() for
> each skb on the input_pkt_queue without calling skb_dst_drop() first.
> 
> unreferenced object 0xffff97008e4c4040 (size 176):
>  comm "softirq", pid 0, jiffies 4295173845 (age 32012.550s)
>  hex dump (first 32 bytes):
>    00 d0 a5 74 04 97 ff ff 40 72 1a 96 ff ff ff ff  ...t....@r......
>    c1 a3 c5 95 ff ff ff ff 00 00 00 00 00 00 00 00  ................
>  backtrace:
>    [<0000000030483fae>] kmem_cache_alloc+0x184/0x430
>    [<000000007ae17545>] dst_alloc+0x8e/0x128
>    [<000000001efe9a1f>] rt_dst_alloc+0x6f/0x1e0
>    rt_dst_alloc at net/ipv4/route.c:1628
>    [<00000000e67d4dac>] ip_route_input_rcu+0xdfe/0x1640
>    ip_route_input_slow at net/ipv4/route.c:2218
>    (inlined by) ip_route_input_rcu at net/ipv4/route.c:2348
>    [<000000009f30cbc0>] ip_route_input_noref+0xab/0x1a0
>    [<000000004f53bd04>] arp_process+0x83a/0xf50
>    arp_process at net/ipv4/arp.c:813 (discriminator 1)
>    [<0000000061fd547d>] arp_rcv+0x276/0x330
>    [<0000000007dbfa7a>] __netif_receive_skb_list_core+0x4d2/0x500
>    [<0000000062d5f6d2>] netif_receive_skb_list_internal+0x4cb/0x7d0
>    [<000000002baa2b74>] gro_normal_list+0x55/0xc0
>    [<0000000093d04885>] napi_complete_done+0xea/0x350
>    [<00000000467dd088>] tg3_poll_msix+0x174/0x310 [tg3]
>    [<00000000498af7d9>] net_rx_action+0x278/0x890
>    [<000000001e81d7e6>] __do_softirq+0xd9/0x589
>    [<00000000087ee354>] irq_exit+0xa2/0xc0
>    [<000000001c4db0cd>] do_IRQ+0x87/0x180
> 
> Signed-off-by: Qian Cai <cai@lca.pw>
> ---
>  net/core/dev.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 522288177bbd..b898cd3036da 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5496,6 +5496,7 @@ static void flush_backlog(struct work_struct *work)
>  	skb_queue_walk_safe(&sd->input_pkt_queue, skb, tmp) {
>  		if (skb->dev->reg_state == NETREG_UNREGISTERING) {
>  			__skb_unlink(skb, &sd->input_pkt_queue);
> +			skb_dst_drop(skb);
>  			kfree_skb(skb);
>  			input_queue_head_incr(sd);
>  		}
> 


kfree_skb() is supposed to call skb_dst_drop() (look in skb_release_head_state())

If you think about it, we would have hundreds of similar bugs if this was not the case.
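
To illustrate the point above, here is a minimal userspace sketch, not kernel code. All `model_*` names are hypothetical stand-ins that only mimic the shape of kfree_skb(), skb_dst_drop() and dst_release(); the real kernel functions differ in detail. The sketch assumes that the free path itself drops the dst reference and that the drop clears the pointer, so an explicit drop right before the free changes nothing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
struct model_dst { int refcnt; };
struct model_skb { struct model_dst *dst; };

static int model_dst_frees; /* number of dst objects actually freed */

/* Drop one reference; free the dst when the last reference goes away. */
static void model_dst_release(struct model_dst *dst)
{
	if (!dst)
		return;
	assert(dst->refcnt > 0); /* catch over-release in the model */
	if (--dst->refcnt == 0) {
		free(dst);
		model_dst_frees++;
	}
}

/* Shaped like skb_dst_drop(): release the reference, then clear the pointer. */
static void model_skb_dst_drop(struct model_skb *skb)
{
	model_dst_release(skb->dst);
	skb->dst = NULL;
}

/* Shaped like kfree_skb(): the head-state release drops the dst itself. */
static void model_kfree_skb(struct model_skb *skb)
{
	model_skb_dst_drop(skb);
	free(skb);
}

/* Allocate an skb that holds the only reference to a fresh dst. */
static struct model_skb *model_alloc_skb_with_dst(void)
{
	struct model_skb *skb = calloc(1, sizeof(*skb));
	struct model_dst *dst = calloc(1, sizeof(*dst));

	if (!skb || !dst) {
		free(skb);
		free(dst);
		return NULL;
	}
	dst->refcnt = 1;
	skb->dst = dst;
	return skb;
}
```

In this model, calling model_skb_dst_drop() before model_kfree_skb() frees the dst exactly once, the same as calling model_kfree_skb() alone, because the second drop sees a NULL pointer.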

Qian Cai May 2, 2020, 4:12 a.m. UTC | #2
> On May 1, 2020, at 11:32 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> kfree_skb() is supposed to call skb_dst_drop() (look in skb_release_head_state())
> 
> If you think about it, we would have hundreds of similar bugs if this was not the case.

Thanks for the quick response. The funny thing is that once I applied this patch, the leaks went away. It could be that the fuzzers do not always reproduce the leaks, or it could be that call_rcu() in skb_dst_drop() takes a long time waiting for grace periods, which may confuse kmemleak because the skb is already gone.
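
The deferred-free idea in the second hypothesis can be sketched in userspace as follows. This is only a model under stated assumptions: the `model_*` names are hypothetical, the "grace period" is a plain function call, and nothing here shows how kmemleak actually treats such objects. It only demonstrates that an object released through a call_rcu()-style callback stays allocated, with no owner pointing at it, until the callback runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an RCU callback head. */
struct model_rcu_head {
	struct model_rcu_head *next;
	void (*func)(struct model_rcu_head *head);
};

/* Hypothetical refcounted object with an embedded callback head. */
struct model_dst {
	int refcnt;
	struct model_rcu_head rcu;
};

static struct model_rcu_head *model_pending; /* callbacks awaiting the "grace period" */
static int model_dst_destroyed;              /* objects actually freed so far */

/* Queue a callback instead of running it right away. */
static void model_call_rcu(struct model_rcu_head *head,
			   void (*func)(struct model_rcu_head *head))
{
	head->func = func;
	head->next = model_pending;
	model_pending = head;
}

/* Pretend a grace period has elapsed: run every queued callback. */
static void model_grace_period(void)
{
	while (model_pending) {
		struct model_rcu_head *head = model_pending;

		model_pending = head->next;
		head->func(head);
	}
}

/* Callback: recover the enclosing object from its head and free it. */
static void model_dst_destroy_rcu(struct model_rcu_head *head)
{
	struct model_dst *dst = (struct model_dst *)
		((char *)head - offsetof(struct model_dst, rcu));

	free(dst);
	model_dst_destroyed++;
}

/* Dropping the last reference only schedules the free. */
static void model_dst_release(struct model_dst *dst)
{
	assert(dst->refcnt > 0);
	if (--dst->refcnt == 0)
		model_call_rcu(&dst->rcu, model_dst_destroy_rcu);
}
```

After model_dst_release() drops the last reference, model_dst_destroyed is still 0; it becomes 1 only once model_grace_period() has run. The thread does not establish whether this window explains the kmemleak report.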

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 522288177bbd..b898cd3036da 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5496,6 +5496,7 @@  static void flush_backlog(struct work_struct *work)
 	skb_queue_walk_safe(&sd->input_pkt_queue, skb, tmp) {
 		if (skb->dev->reg_state == NETREG_UNREGISTERING) {
 			__skb_unlink(skb, &sd->input_pkt_queue);
+			skb_dst_drop(skb);
 			kfree_skb(skb);
 			input_queue_head_incr(sd);
 		}