Netdev Archive on lore.kernel.org
* [PATCH V2 net] ice: fix memory leak of aRFS after resuming from suspend
@ 2021-03-19  6:40 Yongxin Liu
  2021-03-31  2:28 ` Liu, Yongxin
  0 siblings, 1 reply; 4+ messages in thread
From: Yongxin Liu @ 2021-03-19  6:40 UTC
  To: brett.creeley, madhu.chittim, anthony.l.nguyen, andrewx.bowers,
	jeffrey.t.kirsher
  Cc: netdev

In ice_suspend(), ice_clear_interrupt_scheme() is called, and
irq_free_descs() is eventually called to free each irq and its descriptor.

In ice_resume(), ice_init_interrupt_scheme() is called to allocate new irqs.
However, in ice_rebuild_arfs(), struct irq_glue and struct cpu_rmap may
not be freed if the irqs that were released in ice_suspend() have been
reassigned to other devices, because the irq descriptor's affinity_notify
is then lost.

So call ice_free_cpu_rx_rmap() before ice_clear_interrupt_scheme(), which
ensures that all irq_glue and cpu_rmap structures are released before the
corresponding irqs and descriptors are released.

This fixes the following memory leak:

unreferenced object 0xffff95bd951afc00 (size 512):
  comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s)
  hex dump (first 32 bytes):
    18 00 00 00 18 00 18 00 70 fc 1a 95 bd 95 ff ff  ........p.......
    00 00 ff ff 01 00 ff ff 02 00 ff ff 03 00 ff ff  ................
  backtrace:
    [<0000000072e4b914>] __kmalloc+0x336/0x540
    [<0000000054642a87>] alloc_cpu_rmap+0x3b/0xb0
    [<00000000f220deec>] ice_set_cpu_rx_rmap+0x6a/0x110 [ice]
    [<000000002370a632>] ice_probe+0x941/0x1180 [ice]
    [<00000000d692edba>] local_pci_probe+0x47/0xa0
    [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30
    [<00000000555a9e4a>] process_one_work+0x1dd/0x410
    [<000000002c4b414a>] worker_thread+0x221/0x3f0
    [<00000000bb2b556b>] kthread+0x14c/0x170
    [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30
unreferenced object 0xffff95bd81b0a2a0 (size 96):
  comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s)
  hex dump (first 32 bytes):
    38 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00  8...............
    b0 a2 b0 81 bd 95 ff ff b0 a2 b0 81 bd 95 ff ff  ................
  backtrace:
    [<00000000582dd5c5>] kmem_cache_alloc_trace+0x31f/0x4c0
    [<000000002659850d>] irq_cpu_rmap_add+0x25/0xe0
    [<00000000495a3055>] ice_set_cpu_rx_rmap+0xb4/0x110 [ice]
    [<000000002370a632>] ice_probe+0x941/0x1180 [ice]
    [<00000000d692edba>] local_pci_probe+0x47/0xa0
    [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30
    [<00000000555a9e4a>] process_one_work+0x1dd/0x410
    [<000000002c4b414a>] worker_thread+0x221/0x3f0
    [<00000000bb2b556b>] kthread+0x14c/0x170
    [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30

Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>
---
 drivers/net/ethernet/intel/ice/ice_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 2c23c8f468a5..9c2d567a2534 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -4568,6 +4568,7 @@ static int __maybe_unused ice_suspend(struct device *dev)
 			continue;
 		ice_vsi_free_q_vectors(pf->vsi[v]);
 	}
+	ice_free_cpu_rx_rmap(ice_get_main_vsi(pf));
 	ice_clear_interrupt_scheme(pf);
 
 	pci_save_state(pdev);
-- 
2.14.5



* RE: [PATCH V2 net] ice: fix memory leak of aRFS after resuming from suspend
  2021-03-19  6:40 [PATCH V2 net] ice: fix memory leak of aRFS after resuming from suspend Yongxin Liu
@ 2021-03-31  2:28 ` Liu, Yongxin
  2021-04-01 20:26   ` Nguyen, Anthony L
  0 siblings, 1 reply; 4+ messages in thread
From: Liu, Yongxin @ 2021-03-31  2:28 UTC
  To: brett.creeley, madhu.chittim, anthony.l.nguyen, andrewx.bowers,
	jeffrey.t.kirsher
  Cc: netdev

Hello Brett,

Could you please help to review this V2?


Thanks,
Yongxin




* Re: [PATCH V2 net] ice: fix memory leak of aRFS after resuming from suspend
  2021-03-31  2:28 ` Liu, Yongxin
@ 2021-04-01 20:26   ` Nguyen, Anthony L
  2021-04-07 21:21     ` Brelinski, TonyX
  0 siblings, 1 reply; 4+ messages in thread
From: Nguyen, Anthony L @ 2021-04-01 20:26 UTC
  To: Chittim, Madhu, Yongxin.Liu, andrewx.bowers, jeffrey.t.kirsher,
	Creeley, Brett
  Cc: netdev, intel-wired-lan

On Wed, 2021-03-31 at 02:28 +0000, Liu, Yongxin wrote:
> Hello Brett,
> 
> Could you please help to review this V2?
> 

Hi Yongxin,

I have applied this to the Intel-wired-lan tree so it can go through some
testing. I am also adding the Intel-wired-lan list for reviews.

Thanks,
Tony



* RE: [PATCH V2 net] ice: fix memory leak of aRFS after resuming from suspend
  2021-04-01 20:26   ` Nguyen, Anthony L
@ 2021-04-07 21:21     ` Brelinski, TonyX
  0 siblings, 0 replies; 4+ messages in thread
From: Brelinski, TonyX @ 2021-04-07 21:21 UTC
  To: Nguyen, Anthony L, Chittim, Madhu, Yongxin.Liu, andrewx.bowers,
	jeffrey.t.kirsher, Creeley, Brett
  Cc: netdev, intel-wired-lan


Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> A Contingent Worker at Intel


