From: Michal Kazior <michal.kazior@tieto.com>
To: Ben Greear <greearb@candelatech.com>
Cc: ath10k <ath10k@lists.infradead.org>
Subject: Re: Hard lockup during vif restart tests.
Date: Thu, 18 Sep 2014 08:23:17 +0200
Message-ID: <CA+BoTQkcP=Yc8kA2=OotrWbaAf09aMOmiPQp5W2j1vxsu3x4Kg@mail.gmail.com>
In-Reply-To: <5419AE32.6010805@candelatech.com>

On 17 September 2014 17:52, Ben Greear <greearb@candelatech.com> wrote:
> On 09/16/2014 11:34 PM, Michal Kazior wrote:
>> On 16 September 2014 20:42, Ben Greear <greearb@candelatech.com> wrote:
>>> This is on a 3.14.14+ hacked kernel, with CT firmware.
>>>
>>> Test case is to restart stations (and the AP
>>> on the other side) every 10-30 seconds.
>>> After a bit, the station machine locked up hard.
>>>
>>> I have no idea how to trouble-shoot this better, so this is
>>> just FYI.
>>>
>> [...]
>>> ath10k: boot warm reset complete
>>> ath10k: failed to power up target using warm reset: -110
>>> ath10k: trying cold reset
>>> ath10k: boot cold reset
>>> ath10k: boot cold reset complete
>>> [hang, even sysrq will not work]
>>
>> There's a known problem with cold reset being capable of locking up
>> entire system (depends on the pci-e controller, e.g. AP135 splats a
>> Data Bus Error instead).
>>
>> Actually warm reset can do the same in some corner cases: try running
>> Rx traffic and just start the recovery sequence (without actually
>> crashing the fw). My x86 locks up very easily with this.
>>
>> I strongly suggest you use reset_mode=1 when you load ath10k_pci so
>> cold reset isn't used. This may result in ath10k being unable to bring
>> up the device in some rare cases (e.g. after an IOMMU fault if your
>> system supports it) but I believe it's far better than having the
>> whole system lock up.
>>
>> My suspicion is tx/rx rings, dma transfer engines, internal irqs
>> aren't stopped properly. I have a prototype patch for the warm reset
>> problem but it's incomplete and I'm not sure if I can share it yet.
>
> I will try the warm-reset-only flag, and I do hope you have success
> with the warm/cold reset fixes.

The prototype sort of works as it is now, but it's ugly.
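
For reference, the warm-reset-only flag mentioned above is the reset_mode
parameter of ath10k_pci. A minimal sketch of making it persistent, assuming
your distribution reads module options from /etc/modprobe.d (the file name
ath10k.conf is only an example, not something from this thread):

```
# /etc/modprobe.d/ath10k.conf (hypothetical file name)
# reset_mode=1 limits ath10k_pci to warm reset, so cold reset is not attempted
options ath10k_pci reset_mode=1
```

For a one-off load the same value can be passed directly, e.g.
"modprobe ath10k_pci reset_mode=1". As noted earlier in the thread, this may
leave ath10k unable to bring up the device in some rare cases.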


> But, I still wonder if we could just reset less often and maybe
> make it a bit harder to hit these problems?
>
> Why do we reset the firmware/NIC when we admin down/up the
> vif (when a single vif is active)?  Couldn't we just keep
> the firmware active in this state and not risk lockup due
> to reset?

If you bring down the last interface, mac80211 calls drv_stop(). There
isn't any real need to keep the device up and running after that, other
than to work around the reset issue. But then you need to deal with
firmware quirks. I recall it could report Rx indications after all
vdevs had been removed (and this is now also observable with 10.2
during probing/bootup). It's just simpler to reboot the firmware on
drv_stop()/drv_start().


Michał
