All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
To: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: linux-wireless@vger.kernel.org, Felix Fietkau <nbd@nbd.name>,
	Ryder Lee <ryder.lee@mediatek.com>, Roy Luo <royluo@google.com>
Subject: Re: [RFC] mt76: fix tx hung regression on MT7630E
Date: Mon, 29 Jul 2019 16:02:41 +0200	[thread overview]
Message-ID: <20190729140241.GC4030@localhost.localdomain> (raw)
In-Reply-To: <20190729125351.GA3086@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 3146 bytes --]

> On Fri, Jul 26, 2019 at 02:10:56PM +0200, Stanislaw Gruszka wrote:
> > Since 41634aa8d6db ("mt76: only schedule txqs from the tx tasklet")
> > I can observe firmware hangs on MT7630E on station mode: tx stop
> > functioning after minor activity (rx keep working) and on module
> > unload device fail to stop with messages:
> > 
> > [ 5446.141413] mt76x0e 0000:06:00.0: TX DMA did not stop
> > [ 5449.176764] mt76x0e 0000:06:00.0: TX DMA did not stop
> > 
> > Loading module again results in failure to associate with AP.
> > Only machine power off / power on cycle can make device work again.
> > 
> > I have no idea why the commit caused F/W hangs. Most likely some proper
> > fix is needed of changing registers programming (or assuring proper order
> > of actions), but so far I can not came up with any better fix than
> > a partial revert of 41634aa8d6db.
> 
> The difference is that with 41634aa8d6db we can run mt76x02_poll_tx()
> and mt76x02_tx_tasklet() in parallel on 2 different CPUs. Without
> the commit, since tasklet is not scheduled from mt76_wake_tx_queue(),
> it does not happen.
> 
> I'm not quite sure why, but MT7630E does not like when we poll tx status
> on 2 cpus at once. Change like below:

Hi Stanislaw,

have you tried to look at the commit: 6fe533378795f87
("mt76: mt76x02: remove irqsave/restore in locking for tx status fifo")?
Could it be a race between a registermap update and a stats load? (we are using a
different lock now)

> 
> diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c b/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c
> index 467b28379870..622251faa415 100644
> --- a/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c
> +++ b/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c
> @@ -170,7 +170,7 @@ static int mt76x02_poll_tx(struct napi_struct *napi, int budget)
>  					       mt76.tx_napi);
>  	int i;
>  
> -	mt76x02_mac_poll_tx_status(dev, false);
> +	mt76x02_mac_poll_tx_status(dev, true);

I am not sure if we really need mt76x02_mac_poll_tx_status() here since we run
it in mt76x02_tx_complete_skb() and in mt76x02_tx_tasklet(). Anyway the only
difference doing so is we do not run mt76x02_send_tx_status().

Regards,
Lorenzo

>  
>  	for (i = MT_TXQ_MCU; i >= 0; i--)
>  		mt76_queue_tx_cleanup(dev, i, false);
> 
> is sufficient to avoid the hangs.
> 
> > diff --git a/drivers/net/wireless/mediatek/mt76/tx.c b/drivers/net/wireless/mediatek/mt76/tx.c
> > index 5397827668b9..fefe0ee52584 100644
> > --- a/drivers/net/wireless/mediatek/mt76/tx.c
> > +++ b/drivers/net/wireless/mediatek/mt76/tx.c
> > @@ -598,7 +598,7 @@ void mt76_wake_tx_queue(struct ieee80211_hw *hw, struct ieee80211_txq *txq)
> >  	if (!test_bit(MT76_STATE_RUNNING, &dev->state))
> >  		return;
> >  
> > -	tasklet_schedule(&dev->tx_tasklet);
> > +	mt76_txq_schedule(dev, txq->ac);
> 
> However I'm not sure if change to tasklet_schedule() is indeed correct,
> since on tasklet we schedule all queues, hence queues that could
> possibly be still blocked? And on mt76_wake_tx_queue() we schedule only
> one queue.
> 
> Stanislaw

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

  reply	other threads:[~2019-07-29 14:02 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-26 12:10 [RFC] mt76: fix tx hung regression on MT7630E Stanislaw Gruszka
2019-07-29 12:53 ` Stanislaw Gruszka
2019-07-29 14:02   ` Lorenzo Bianconi [this message]
2019-07-30 13:54     ` Stanislaw Gruszka
2019-07-30 14:55       ` Lorenzo Bianconi
2019-07-31  8:19         ` Stanislaw Gruszka
2019-07-31  8:51           ` Stanislaw Gruszka
2019-07-31  9:09             ` Lorenzo Bianconi
2019-08-05 10:01               ` Stanislaw Gruszka
2019-08-05 11:27                 ` Lorenzo Bianconi
2019-08-05 12:39                   ` Stanislaw Gruszka
2019-08-12 11:50                     ` Stanislaw Gruszka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190729140241.GC4030@localhost.localdomain \
    --to=lorenzo.bianconi@redhat.com \
    --cc=linux-wireless@vger.kernel.org \
    --cc=nbd@nbd.name \
    --cc=royluo@google.com \
    --cc=ryder.lee@mediatek.com \
    --cc=sgruszka@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.