All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lorenzo Bianconi <lorenzo@kernel.org>
To: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>,
	nbd@nbd.name, linux-wireless@vger.kernel.org
Subject: Re: [PATCH] mt76: usb: fix possible memory leak during suspend/resume
Date: Sat, 13 Apr 2019 12:10:59 +0200	[thread overview]
Message-ID: <20190413101056.GA7940@localhost.localdomain> (raw)
In-Reply-To: <20190413083050.GA7434@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 8194 bytes --]

> On Fri, Apr 12, 2019 at 06:27:48PM +0200, Lorenzo Bianconi wrote:
> > > > On Fri, Apr 12, 2019 at 02:27:16PM +0200, Lorenzo Bianconi wrote:
> > > > > Disable mt76u_tx_tasklet at the end of mt76u_stop_queues in order to
> > > > > properly deallocate all pending skbs during suspend/resume phase
> > > > 
> > > > On suspend/resume tx skb's are processed after tasklet_enable()
> > > > in resume callback. There is issue with device removal though
> > > > (during suspend or otherwise).
> > > 
> > > Hi Stanislaw,
> > > 
> > > I guess the right moment to deallocate the skbs is during suspend since resume
> > > can happen in very far future
> 
> Yes, it's better to free on suspend, but in practice does not really matter since
> system is disabled till resume.
> 
> > > > > Fixes: b40b15e1521f ("mt76: add usb support to mt76 layer")
> > > > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > > > > ---
> > > > >  drivers/net/wireless/mediatek/mt76/usb.c | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
> > > > > index a3acc070063a..575207133775 100644
> > > > > --- a/drivers/net/wireless/mediatek/mt76/usb.c
> > > > > +++ b/drivers/net/wireless/mediatek/mt76/usb.c
> > > > > @@ -842,10 +842,10 @@ static void mt76u_stop_tx(struct mt76_dev *dev)
> > > > >  void mt76u_stop_queues(struct mt76_dev *dev)
> > > > >  {
> > > > >  	tasklet_disable(&dev->usb.rx_tasklet);
> > > > > -	tasklet_disable(&dev->usb.tx_tasklet);
> > > > > -
> > > > >  	mt76u_stop_rx(dev);
> > > > > +
> > > > >  	mt76u_stop_tx(dev);
> > > > > +	tasklet_disable(&dev->usb.tx_tasklet);
> > > > 
> > > > If tasklet is scheduled and we disable it and never enable, we end up
> > > > with infinite loop in tasklet_action_common(). This patch make the
> > > > problem less reproducible since tasklet_disable() is moved after
> > > > usb_kill_urb() -> tasklet_schedule(), but it is still possible.
> > > 
> > > I can see the point here. Maybe we can just run tasklet_kill instead of
> > > tasklet_disable here (at least for tx one)
> 
> I think you have right as tasklet_kill() will wait for scheduled tasklet .
> Originally in my patch (see below) I used wait_event as I thought
> tasklet_kill() may prevent scheduled tasklet to be executed (hence cause
> leak) but that seems to be not true.

I agree with rx side (good catch!!), but on tx one I guess usb_kill_urb()
is already waiting for tx pending so we just need to use tasklet_kill
at the end of mt76u_stop_queues, in this way we will free pending skbs during
suspend

Regards,
Lorenzo

> 
> > I think it is enough even for rx side since usb_kill_urb() will wait the end of
> > mt76u_complete_rx() and tasklet_kill() will wait for the completion of
> > mt76u_rx_tasklet(). We will need to remove tasklet_enable in resume routines.
> 
> No, because rx urb's are resubmitted on mt76u_rx_tasklet(). I use usb_poison_urb()
> to prevent that in my patch.
> 
> Stanislaw
> 
> commit 088b1209302e17cda688c9b562e18970c92823ac
> Author: Stanislaw Gruszka <sgruszka@redhat.com>
> Date:   Thu Apr 4 13:37:43 2019 +0200
> 
>     mt76usb: fix tx/rx stop
>     
>     Disabling tasklets on stopping rx/tx is wrong. If blocked tasklet
>     is scheduled and we remove device we get 100% cpu usage:
>     
>       PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>         9 root      20   0       0      0      0 R  93.8   0.0   1:47.19 ksoftirqd/0
>     
>     by using memory after free and eventually crash:
>     
>     [ 2068.591964] RIP: 0010:tasklet_action_common.isra.17+0x66/0x100
>     [ 2068.591966] Code: 41 89 f5 eb 25 f0 48 0f ba 33 00 0f 83 b1 00 00 00 48 8b 7a 20 48 8b 42 18 e8 56 a3 b5 00 f0 80 23 fd 48 89 ea 48 85 ed 74 53 <48> 8b 2a 48 8d 5a 08 f0 48 0f ba 6a 08 01 72 0b 8b 42 10 85 c0 74
>     [ 2068.591968] RSP: 0018:ffff98758c34be58 EFLAGS: 00010206
>     [ 2068.591969] RAX: ffff98758e6966d0 RBX: ffff98756e69aef8 RCX: 0000000000000006
>     [ 2068.591970] RDX: 01060a053d060305 RSI: 0000000000000006 RDI: ffff98758e6966d0
>     [ 2068.591971] RBP: 01060a053d060305 R08: 0000000000000000 R09: 00000000000203c0
>     [ 2068.591971] R10: 000003ff65b34f08 R11: 0000000000000001 R12: ffff98758e6966d0
>     [ 2068.591972] R13: 0000000000000006 R14: 0000000000000040 R15: 0000000000000006
>     [ 2068.591974] FS:  0000000000000000(0000) GS:ffff98758e680000(0000) knlGS:0000000000000000
>     [ 2068.591975] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>     [ 2068.591975] CR2: 00002c5f73a6cc20 CR3: 00000002f920a001 CR4: 00000000003606e0
>     [ 2068.591977] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>     [ 2068.591978] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>     [ 2068.591978] Call Trace:
>     [ 2068.591985]  __do_softirq+0xe3/0x30a
>     [ 2068.591989]  ? sort_range+0x20/0x20
>     [ 2068.591990]  run_ksoftirqd+0x26/0x40
>     [ 2068.591992]  smpboot_thread_fn+0xc5/0x160
>     [ 2068.591995]  kthread+0x112/0x130
>     [ 2068.591997]  ? kthread_create_on_node+0x40/0x40
>     [ 2068.591998]  ret_from_fork+0x35/0x40
>     [ 2068.591999] Modules linked in: ccm arc4 fuse rfcomm cmac bnep sunrpc snd_hda_codec_hdmi snd_soc_skl snd_soc_core snd_soc_acpi_intel_match snd_hda_codec_realtek snd_soc_acpi snd_hda_codec_generic snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core iTCO_wdt snd_hda_intel intel_rapl iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp btusb mei_wdt coretemp btrtl snd_hda_codec btbcm btintel intel_cstate snd_hwdep intel_uncore uvcvideo snd_hda_core videobuf2_vmalloc videobuf2_memops intel_rapl_perf wmi_bmof videobuf2_v4l2 intel_wmi_thunderbolt snd_seq bluetooth joydev videobuf2_common snd_seq_device snd_pcm videodev media i2c_i801 snd_timer idma64 ecdh_generic intel_lpss_pci intel_lpss mei_me mei ucsi_acpi typec_ucsi processor_thermal_device intel_soc_dts_iosf intel_pch_thermal typec thinkpad_acpi wmi snd soundcore rfkill int3403_thermal int340x_thermal_zone int3400_thermal acpi_thermal_rel acpi_pad pcc_cpufreq uas usb_storage crc32c_intel i915 i2c_algo_bit nvme serio_raw
>     [ 2068.592033]  drm_kms_helper e1000e nvme_core drm video ipv6 [last unloaded: cfg80211]
>     
>     Fortunate thing is that this not happen frequently, as scheduling
>     tasklet on blocked state is very exceptional, though might happen.
>     
>     Due to different RX/TX tasklet processing fix is different for those.
>     For TX just wait until queues are empty. For RX assure rx_tasklet
>     do fail to resubmit buffers by poisoning urb's and kill the tasklet.
>     
>     Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
> 
> diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c
> index 15825e9a458f..7bf91566e212 100644

[...]

> @@ -830,20 +831,35 @@ static void mt76u_free_tx(struct mt76_dev *dev)
>  static void mt76u_stop_tx(struct mt76_dev *dev)
>  {
>  	struct mt76_queue *q;
> -	int i, j;
> +	int i, j, ret;
>  
>  	for (i = 0; i < IEEE80211_NUM_ACS; i++) {
>  		q = dev->q_tx[i].q;
>  		for (j = 0; j < q->ndesc; j++)
>  			usb_kill_urb(q->entry[j].urb);
>  	}
> +
> +	ret = wait_event_timeout(dev->tx_wait, !mt76_has_tx_pending(dev), HZ/5);
> +	if (!ret)
> +		dev_err(dev->dev, "timed out waiting for pending tx\n");
>  }
>  
> -void mt76u_stop_queues(struct mt76_dev *dev)
> +int mt76u_start_queues(struct mt76_dev *dev)

maybe mt76u_resume_rx_queue would be more clear

>  {
> -	tasklet_disable(&dev->usb.rx_tasklet);
> -	tasklet_disable(&dev->usb.tx_tasklet);
> +	struct mt76_queue *q = &dev->q_rx[MT_RXQ_MAIN];
> +	int i;
> +
> +	/* Nothing to do for TX */
>  
> +	for (i = 0; i < q->ndesc; i++)
> +		usb_unpoison_urb(q->entry[i].urb);
> +
> +	return mt76u_submit_rx_buffers(dev);
> +}
> +EXPORT_SYMBOL_GPL(mt76u_start_queues);
> +
> +void mt76u_stop_queues(struct mt76_dev *dev)
> +{
>  	mt76u_stop_rx(dev);
>  	mt76u_stop_tx(dev);
>  }


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

  reply	other threads:[~2019-04-13 10:11 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cover.1555071870.git.lorenzo@kernel.org>
2019-04-12 12:27 ` [PATCH] mt76: usb: fix possible memory leak during suspend/resume Lorenzo Bianconi
2019-04-12 14:54   ` Stanislaw Gruszka
2019-04-12 15:35     ` Lorenzo Bianconi
2019-04-12 16:27       ` Lorenzo Bianconi
2019-04-13  8:30         ` Stanislaw Gruszka
2019-04-13 10:10           ` Lorenzo Bianconi [this message]
2019-04-15 11:53             ` Stanislaw Gruszka
2019-04-15 15:04               ` Lorenzo Bianconi
2019-04-16  8:04                 ` Stanislaw Gruszka
2019-04-16  8:12                   ` Lorenzo Bianconi
2019-04-16  8:27                     ` Stanislaw Gruszka
2019-04-16  8:33                       ` Lorenzo Bianconi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190413101056.GA7940@localhost.localdomain \
    --to=lorenzo@kernel.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=lorenzo.bianconi@redhat.com \
    --cc=nbd@nbd.name \
    --cc=sgruszka@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.