* Linux freezes after a time while running
@ 2016-10-28 19:05 Conrad Kostecki
  2016-10-31 10:12 ` Michal Kazior
  0 siblings, 1 reply; 14+ messages in thread
From: Conrad Kostecki @ 2016-10-28 19:05 UTC (permalink / raw)
  To: ath10k

Hi!
In order to create a dual band AP, I've bought two Compex LE900VX cards.
As the mainboard has only one PCIe slot, I also bought a Mikrotik RB14e miniPCIe->PCIe adapter, which can carry up to 4 miniPCIe cards.
 
Currently, I am running HostAPd 2.6 and Kernel 4.8.4 with 10.2.4.70.56 ath10k-firmware.
Both cards are being detected fine and working.
 
When I start HostAPd, the whole server freezes after a while, usually after 30-120 minutes. Until it freezes, HostAPd works perfectly fine. When I do not start HostAPd, no freeze occurs and the whole system runs stably.
 
No errors were logged in dmesg.
 
Any ideas on how to debug this?
Are the cards maybe faulty?
I already asked on the HostAPd mailing list and was told to ask here ;-)
 
Cheers
Conrad

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k


* Re: Linux freezes after a time while running
  2016-10-28 19:05 Linux freezes after a time while running Conrad Kostecki
@ 2016-10-31 10:12 ` Michal Kazior
  2016-10-31 16:48   ` Re[2]: " Conrad Kostecki
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Kazior @ 2016-10-31 10:12 UTC (permalink / raw)
  To: Conrad Kostecki; +Cc: ath10k

On 28 October 2016 at 21:05, Conrad Kostecki <ck+ath10k@bl4ckb0x.de> wrote:
> Hi!
> In order to create a dual band AP, I've bought two Compex LE900VX cards.
> As the mainboard has only one PCIe slot, I also bought a Mikrotik RB14e miniPCIe->PCIe adapter, which can carry up to 4 miniPCIe cards.
>
> Currently, I am running HostAPd 2.6 and Kernel 4.8.4 with 10.2.4.70.56 ath10k-firmware.
> Both cards are being detected fine and working.
>
> When I do start HostAPd, the whole server just freezes after a time, usally after 30-120 minutes. Until it freezes, HostAPd is working perfectly fine. When I do not start HostAPd, no freeze occurs and the whole system is running stable.
>
> No errors were logged in dmesg.
>
> Any Ideas, how to debug this?
> Are the cards maybe faulty?
> I already asked on the HostAPd-mailinglist and was asked, to ask here ;-)

You could try loading ath10k_pci with the reset_mode=1 parameter.

Cold reset is known to cause some problems after certain firmware
crashes and I've personally experienced system freezes on x86 (MIPS
tends to spit "data bus error" and doesn't lock up).
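
A minimal sketch of one way to make the parameter persistent, assuming a
system that reads module options from /etc/modprobe.d. The file name
ath10k.conf and the helper name are illustrative and not from this thread:

```shell
#!/bin/sh
# Write a modprobe options file that sets reset_mode=1 for ath10k_pci.
# The target directory is an argument so the helper can be tried on a scratch
# directory first; /etc/modprobe.d is an assumed default, and ath10k.conf is
# an arbitrary file name chosen for this sketch.
write_reset_mode_conf() {
    conf_dir="${1:-/etc/modprobe.d}"
    printf 'options ath10k_pci reset_mode=1\n' > "$conf_dir/ath10k.conf"
}

# Typical use (as root), followed by a driver reload:
#   write_reset_mode_conf
#   modprobe -r ath10k_pci && modprobe ath10k_pci
# For a one-off test without a config file:
#   modprobe ath10k_pci reset_mode=1
# The active value can be read back from sysfs:
#   cat /sys/module/ath10k_pci/parameters/reset_mode
```

If the driver is built into the kernel rather than loaded as a module,
the equivalent would be ath10k_pci.reset_mode=1 on the kernel command
line.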


Michał


* Re[2]: Linux freezes after a time while running
  2016-10-31 10:12 ` Michal Kazior
@ 2016-10-31 16:48   ` Conrad Kostecki
  2016-11-02  3:31     ` Michal Kazior
  0 siblings, 1 reply; 14+ messages in thread
From: Conrad Kostecki @ 2016-10-31 16:48 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k

Hello Michał,

On 31.10.2016 11:12:03, "Michal Kazior" <michal.kazior@tieto.com> wrote:

>  You could try loading ath10k_pci with reset_mode=1 parameter.
>
>Cold reset is known to cause some problems after firmware certain
>crashes and I've personally experienced system freezes on x86 (MIPS
>tends to spit "data bus error" and doesn't lock up).
Thank you very much for your answer. I've now set reset_mode=1,
which seems to be active, as I can see in dmesg:
[    8.471659] ath10k_pci 0000:08:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 1
[    8.587267] ath10k_pci 0000:09:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 1

After starting HostAPd, I powered up my Squeezebox Radio to connect via
WiFi.
After just a few minutes it crashed, as expected, but it did not restart
the whole server.
Is this due to reset_mode=1? I was now able to capture a lot of
information from dmesg.
You can clearly see that the firmware crashed. The HostAPd process is
still running, but the WiFi can no longer be detected.

As the log is very long, I've put it on pastebin:
http://pastebin.com/83WZktp6

At timestamp 250, WiFi1 (2.4GHz) comes up, and at timestamp 356 WiFi2
(5GHz) comes up.
By timestamp 691, ath10k_pci has crashed and WiFi has stopped working.
Normally, at this point the whole server would reboot.

I've also now tried the newest firmware, 10.2.4.70.58, with no luck.

Any ideas?

Cheers and Thanks
Conrad



* Re: Re[2]: Linux freezes after a time while running
  2016-10-31 16:48   ` Re[2]: " Conrad Kostecki
@ 2016-11-02  3:31     ` Michal Kazior
  2016-11-02  9:00       ` Conrad Kostecki
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Kazior @ 2016-11-02  3:31 UTC (permalink / raw)
  To: Conrad Kostecki; +Cc: ath10k

On 31 October 2016 at 17:48, Conrad Kostecki <ck+ath10k@bl4ckb0x.de> wrote:
> Hello Michał,
>
> On 31.10.2016 11:12:03, "Michal Kazior" <michal.kazior@tieto.com> wrote:
>
>>  You could try loading ath10k_pci with reset_mode=1 parameter.
>>
>> Cold reset is known to cause some problems after firmware certain
>> crashes and I've personally experienced system freezes on x86 (MIPS
>> tends to spit "data bus error" and doesn't lock up).
>
> thank you very much for your answer. I've now set reset_mode=1,
> which seems to be now active, as I can see in dmesg:
> [    8.471659] ath10k_pci 0000:08:00.0: pci irq msi oper_irq_mode 2 irq_mode
> 0 reset_mode 1
> [    8.587267] ath10k_pci 0000:09:00.0: pci irq msi oper_irq_mode 2 irq_mode
> 0 reset_mode 1
>
> After starting HostAPd, I powered up my Squeezebox Radio to connect via
> WiFi.
> Just after a few minutes, it crashed, as expected, but it did not restart
> the whole server.
> It this due reset_mode=1? I was now able to capture a lot of information
> from dmesg.
> You can clearly see, that the firmware crashed. The HostAPd process is still
> running,
> but the WiFi can be detected anymore.
>
> As it's very much, I've put this on pastebin: http://pastebin.com/83WZktp6
>
> You can see at mark 250, WiFi1 (2.4GHz) comes up and at mark 356 WIFI2
> (5GHz) comes up.
> By mark 691, ath10k_pci crashed and WiFi stopped working. Normally, at this
> point the whole server would reboot.
>
> I've also now tried the newest firmware 10.2.4.70.58 with no luck.
>
> Any Ideas?

The driver is unable to retrieve the register dump and a lot of
failures are happening. This doesn't look like a firmware crash per se.
It looks more like a device failure, caused by the host refusing PCIe
access or something similar (much like when the x86 IOMMU refuses DMA
access on a use-after-free), being reported as one: the target CPU
catches the fault and runs the handler, which is treated as an uncaught
assert and is reported to the host the same way an assert would be, but
with a different register dump.

I suspect the PCIe link gets broken in one direction, because attempting
a cold reset crashed the host even harder.

Looks like the device went into a very confused state due to a PCIe link
failure starting from:

  [  691.609836] pcieport 0000:00:02.0: AER: Multiple Uncorrected (Non-Fatal) error received: id=0800

I'm not really familiar with these. Perhaps there's a PCIe bridge
problem on your host platform or maybe an electrical issue (e.g.
insufficient power supply to handle short bursts?).
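
As a side note on reading that log line: the id field in an AER message
is the PCIe requester ID of the error source, packed as 8 bits of bus,
5 bits of device and 3 bits of function. A small sketch that decodes it
(the helper name is made up for illustration):

```shell
#!/bin/sh
# Decode the 16-bit hex "id=" value from an AER log line into bus:device.function.
aer_id_to_bdf() {
    id=$((0x$1))
    # bus = bits 15..8, device = bits 7..3, function = bits 2..0
    printf '%02x:%02x.%x\n' $((id >> 8)) $(((id >> 3) & 0x1f)) $((id & 0x7))
}

aer_id_to_bdf 0800   # prints 08:00.0
```

For id=0800 this gives 08:00.0, which matches the address of the first
ath10k card in the earlier dmesg excerpt (ath10k_pci 0000:08:00.0). On a
system with pciutils installed, something like "lspci -vvv -s 08:00.0"
should show that device's Advanced Error Reporting capability, if it
has one.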


Michal


* Re: Re[2]: Linux freezes after a time while running
  2016-11-02  3:31     ` Michal Kazior
@ 2016-11-02  9:00       ` Conrad Kostecki
  2016-11-02 15:27         ` Michal Kazior
  0 siblings, 1 reply; 14+ messages in thread
From: Conrad Kostecki @ 2016-11-02  9:00 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k

Hi!

> Michal Kazior <michal.kazior@tieto.com> wrote on 2 November 2016 at 04:31:
> I suspect the pcie link gets broken one direction because attempting a
> cold reset did crash the host even harder.

From what I can see, the WiFi runs stably if nobody connects to it. The failure starts when somebody connects and generates some throughput.

> Looks like the device went into a very confused state due to pcie link
> failure starting from:

Well, I did a test and removed the Mikrotik RB14e, as it has a PLX chip and provides an extra PCIe switch. Instead, I took a simple passive miniPCIe->PCIe adapter and put only one card into it. The problem still exists, so at least the RB14e is not the cause.

>   [  691.609836] pcieport 0000:00:02.0: AER: Multiple Uncorrected
> (Non-Fatal) error received: id=0800

I see, so the firmware crash is not the cause; the error reported by AER is. Thanks for clarifying!

> I'm not really familiar with these. Perhaps there's a pcie bridge
> problem on your host platform or maybe an electrical issue (e.g.
> insufficient power supply to handle short bursts?).

Well, the mainboard is microATX. It's powered by a Seasonic 400W passive PSU, and the whole system draws only about 40-50W, so the PSU itself should be sufficient? ;-) IIRC a PCIe slot should provide up to 75W?

I am out of ideas. I know that a year ago these cards were working perfectly. Even a replacement card shows the same errors, so it's not a defective card.

Cheers
Conrad


* Re: Re[2]: Linux freezes after a time while running
  2016-11-02  9:00       ` Conrad Kostecki
@ 2016-11-02 15:27         ` Michal Kazior
  2016-11-02 16:15           ` Conrad Kostecki
  2016-11-07 13:38           ` Re[2]: " Conrad Kostecki
  0 siblings, 2 replies; 14+ messages in thread
From: Michal Kazior @ 2016-11-02 15:27 UTC (permalink / raw)
  To: Conrad Kostecki; +Cc: ath10k

On 2 November 2016 at 10:00, Conrad Kostecki <ck+ath10k@bl4ckb0x.de> wrote:
> Hi!
>
>> Michal Kazior <michal.kazior@tieto.com> wrote on 2 November 2016 at 04:31:
>> I suspect the pcie link gets broken one direction because attempting a
>> cold reset did crash the host even harder.
>
> What I can see, the WiFi runs stable, if nobody connects to it. The failure starts, when somebody connects and does some throughput.
>
>> Looks like the device went into a very confused state due to pcie link
>> failure starting from:
>
> Well, I did a test and removed the Mikrotik RB14e, as it has a PLX chip and provides an extra pcie switch. Instead, I took a simple passive miniPCIe->PCIe adapter and build only one card into it. The problem still exists. So, the RB14e is not the cause at least.
>
>>   [ 691.609836] pcieport 0000:00:02.0: AER: Multiple Uncorrected
>> (Non-Fatal) error received: id=0800
>
> I see, so the firmware crash it not the cause, but the error reported by AER. Thanks for clarify!
>
>> I'm not really familiar with these. Perhaps there's a pcie bridge
>> problem on your host platform or maybe an electrical issue (e.g.
>> insufficient power supply to handle short bursts?).
>
> Well, the mainboard is microATX. So its powered by a Seasonic 400W passive PSU, but the whole system takes about 40-50w. So the PSU itself should be enough? ;-) IIRC PCIe should provide up to 75W?
>
> I am out of ideas.. I know, a year ago, that cards were perfectly working. Even a test replacement shows the same errors, so it's not a defective card.

Were the cards working with this particular microATX mainboard before though?

Maybe the adapter is to blame? Or maybe it's a faulty mobo? You could
try reducing txpower of the card with "iw" and see if it makes a
difference.
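
A sketch of how that could look. One thing to watch for: according to
iw's usage text, "set txpower" takes its value in mBm (dBm times 100),
not in dBm. The helper name and the interface name wlan0 below are
placeholders:

```shell
#!/bin/sh
# Convert a whole-number dBm value to the mBm unit that iw expects
# (1 dBm = 100 mBm).
dbm_to_mbm() {
    echo $(( $1 * 100 ))
}

# Limit the card to 15 dBm; "wlan0" is a placeholder interface name:
#   iw dev wlan0 set txpower fixed "$(dbm_to_mbm 15)"
# Check what the driver actually applied:
#   iw dev wlan0 info
```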


Michal


* Re: Re[2]: Linux freezes after a time while running
  2016-11-02 15:27         ` Michal Kazior
@ 2016-11-02 16:15           ` Conrad Kostecki
  2016-11-02 17:24             ` Re[4]: " Conrad Kostecki
  2016-11-07 13:38           ` Re[2]: " Conrad Kostecki
  1 sibling, 1 reply; 14+ messages in thread
From: Conrad Kostecki @ 2016-11-02 16:15 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k

Hi!

> Michal Kazior <michal.kazior@tieto.com> wrote on 2 November 2016 at 16:27:

> Were the cards working with this particular microATX mainboard before though?

Not on this mobo; it's new, so I don't know.

But on my older Soekris net6501 there was the same problem: the system crashed and froze when WiFi was enabled and in use, with the same type of Compex WiFi cards. I do know that the Soekris ran stably for 1-2 years before the problems started there as well. The miniPCIe cards were inserted directly, without any adapter.

Just a wild guess, as I am running Gentoo with my own kernel: could something be missing in my kernel that leads to this problem?
 
> Maybe the adapter is to blame?
I don't think so, as the active and the passive adapter show the same error.

> Or maybe it's a faulty mobo?
Well, other PCIe cards (e.g. USB 3.0, S-ATA) work just fine.

> You could try reducing txpower of the card with "iw" and see if it makes a
> difference.

That does not seem to work?

iw dev wlp7s0 set txpower fixed 15
command failed: Operation not supported (-95)

iw dev wlp7s0 set txpower limit 15
command failed: Operation not supported (-95)

Cheers
Conrad


* Re[4]: Linux freezes after a time while running
  2016-11-02 16:15           ` Conrad Kostecki
@ 2016-11-02 17:24             ` Conrad Kostecki
  0 siblings, 0 replies; 14+ messages in thread
From: Conrad Kostecki @ 2016-11-02 17:24 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k


On 02.11.2016 17:15:15, "Conrad Kostecki" <ck+ath10k@bl4ckb0x.de> wrote:

>That does not seem to work?
>
>iw dev wlp7s0 set txpower fixed 15
>command failed: Operation not supported (-95)
>
>iw dev wlp7s0 set txpower limit 15
>command failed: Operation not supported (-95)
>
>

Sorry, my fault. Setting it to e.g. 20mW worked.
But it did not help, as the problem still happens with lower txpower.

Cheers
Conrad



* Re: Re[2]: Linux freezes after a time while running
  2016-11-02 15:27         ` Michal Kazior
  2016-11-02 16:15           ` Conrad Kostecki
@ 2016-11-07 13:38           ` Conrad Kostecki
  2016-11-07 16:43             ` Michal Kazior
  1 sibling, 1 reply; 14+ messages in thread
From: Conrad Kostecki @ 2016-11-07 13:38 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k

Hello Michal,

> Michal Kazior <michal.kazior@tieto.com> wrote on 2 November 2016 at 16:27:

> Were the cards working with this particular microATX mainboard before though?

I was now able to borrow a Soekris net6501, the same model I used to have, and inserted the two Compex cards into it.

Interestingly, it's not working. The Soekris net6501 shows the same behavior: after a WiFi client connects, the ath10k firmware crashes. As this mainboard does not have AER, I can't see the exact cause.

What I can say is that the Soekris net6501 I used to have worked fine for years before it started to freeze. As the same now happens with the borrowed Soekris net6501 (it does not freeze with reset_mode=1), I am unsure whether this is really a hardware problem. Maybe you have a clue?

Cheers
Conrad


* Re: Re[2]: Linux freezes after a time while running
  2016-11-07 13:38           ` Re[2]: " Conrad Kostecki
@ 2016-11-07 16:43             ` Michal Kazior
  2016-11-16 22:21               ` Re[4]: " Conrad Kostecki
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Kazior @ 2016-11-07 16:43 UTC (permalink / raw)
  To: Conrad Kostecki; +Cc: ath10k

On 7 November 2016 at 06:38, Conrad Kostecki <ck+ath10k@bl4ckb0x.de> wrote:
> Hello Michal,
>
>> Michal Kazior <michal.kazior@tieto.com> wrote on 2 November 2016 at 16:27:
>
>> Were the cards working with this particular microATX mainboard before though?
>
> i was now able to borrow a Soekris net6501, the same model, which I used to have. I've now inserted there also those two Compex cards.
>
> Interestingly, it's not working. The Soekris net6501 shows the same behavior. After a WiFi client connects, atk10k firmware crashed. As this mainboard does not have AER, I can't see the exact cause.
>
> But what I can say, my Soekris net6501, which I used to have, was for years working fine, before it started to freeze. As the same now happens with my borrowed Soekris net6501 (It does not freeze with reboot_mode=1), I am unsure, if this is really a hardware problem.. maybe you have a clou?

I assume you used a different kernel in the past compared to the
recent test you did. You could try re-testing the older kernel
(assuming you remember which one it was) and if it works you could
bisect your way to find the commit that breaks it for you.


Michal


* Re[4]: Linux freezes after a time while running
  2016-11-07 16:43             ` Michal Kazior
@ 2016-11-16 22:21               ` Conrad Kostecki
  2016-11-18 14:47                 ` Michal Kazior
  0 siblings, 1 reply; 14+ messages in thread
From: Conrad Kostecki @ 2016-11-16 22:21 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k

Hello Michal,

On 07.11.2016 17:43:02, "Michal Kazior" <michal.kazior@tieto.com> wrote:

>
>I assume you used a different kernel in the past compared to the
>recent test you did. You could try re-testing the older kernel
>(assuming you remember which one it was) and if it works you could
>bisect your way to find the commit that breaks it for you.
>
>
I don't have the older kernel config anymore, but it was pretty much the
same as my current one.
I've now done some tests and the results are quite interesting. The
kernel version did not matter for me.

With the current firmware-5.bin, it always crashes. I tried different
BIOS options; nothing helped.

BUT: Downgrading to firmware-2.bin helps.
ath10k_pci 0000:03:00.0: firmware ver 10.1.467.3-1 api 2 features wmi-10.x, has-wmi-mgmt-tx,no-p2p crc32 2c3ffc2f

With such old firmware, the WiFi runs stably. NO firmware crash happens.
As soon as I change the firmware back to firmware-5.bin, it crashes
again.
So something seems to have changed. Can this be debugged somehow?
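
For anyone wanting to repeat this comparison, a hedged sketch: ath10k
appears to try the highest-numbered firmware API file it finds first and
to fall back to lower-numbered ones, so moving firmware-5.bin out of the
way is one way to make it load an older file. The directory
/lib/firmware/ath10k/QCA988X/hw2.0 is an assumption for these cards and
is not stated in this thread; the helper name is made up as well.

```shell
#!/bin/sh
# Move one firmware API file aside so the driver cannot pick it up.
# $1: firmware directory, $2: API number of the file to disable.
disable_fw_api() {
    fw_dir="$1"
    api="$2"
    if [ -f "$fw_dir/firmware-$api.bin" ]; then
        mv "$fw_dir/firmware-$api.bin" "$fw_dir/firmware-$api.bin.disabled"
    fi
}

# Example (as root), then reload the driver and check the "firmware ver"
# line in dmesg. The path below is an assumption:
#   disable_fw_api /lib/firmware/ath10k/QCA988X/hw2.0 5
#   modprobe -r ath10k_pci && modprobe ath10k_pci
#   dmesg | grep 'firmware ver'
```

If firmware-3.bin or firmware-4.bin are also present, they would
presumably be picked up before firmware-2.bin and would need the same
treatment.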

With firmware-2.bin, I am also getting errors like these, but WiFi works fine:
[   82.504901] ath10k_pci 0000:03:00.0: SWBA overrun on vdev 0, skipped old beacon
[   82.556103] ath10k_pci 0000:03:00.0: SWBA overrun on vdev 1, skipped old beacon
[   87.113085] ath10k_warn: 89 callbacks suppressed

For now, firmware-2.bin is my workaround, but IMHO it's not a solution.
Any ideas?

Cheers
Conrad



* Re: Re[4]: Linux freezes after a time while running
  2016-11-16 22:21               ` Re[4]: " Conrad Kostecki
@ 2016-11-18 14:47                 ` Michal Kazior
  2016-11-18 17:21                   ` Ben Greear
  2016-11-18 20:55                   ` Re[6]: " Conrad Kostecki
  0 siblings, 2 replies; 14+ messages in thread
From: Michal Kazior @ 2016-11-18 14:47 UTC (permalink / raw)
  To: Conrad Kostecki; +Cc: ath10k

On 16 November 2016 at 23:21, Conrad Kostecki <ck+ath10k@bl4ckb0x.de> wrote:
> Hello Michal,
>
> On 07.11.2016 17:43:02, "Michal Kazior" <michal.kazior@tieto.com> wrote:
>
>>
>> I assume you used a different kernel in the past compared to the
>> recent test you did. You could try re-testing the older kernel
>> (assuming you remember which one it was) and if it works you could
>> bisect your way to find the commit that breaks it for you.
>>
>>
> I don't have the older kernel config anymore, but it was pretty the same, as
> my current one.
> I've now done some tests and it's quite interesseting. Kernelversion did not
> matter for me.
>
> Running with current firmware-5.bin, it's always crashing. I tried different
> BIOS options, nothing helped.
>
> BUT: Downgrade to firmware-2.bin helps.
> ath10k_pci 0000:03:00.0: firmware ver 10.1.467.3-1 api 2 features wmi-10.x,
> has-wmi-mgmt-tx,no-p2p crc32 2c3ffc2f
>
> Running such old firmware, the Wifi just runs stable. NO firmware crash
> happens. When I just change firmware back to firmware-5.bin, it crashes
> again.
> So there seems to changed something. Can be this debugged somehow?
>
> With firmware-2.bin, I am also getting such errors, but WiFi works fine:
> [   82.504901] ath10k_pci 0000:03:00.0: SWBA overrun on vdev 0, skipped old
> beacon
> [   82.556103] ath10k_pci 0000:03:00.0: SWBA overrun on vdev 1, skipped old
> beacon
> [   87.113085] ath10k_warn: 89 callbacks suppressed
>
> Currently, the workaround is for me firmware-2.bin, but IMHO it's not a
> solution..
> Any Ideas?

Hmm.. looks like there's a stall in target-host communication for ~5
seconds (89 suppressed warnings match 2 vifs beaconing at ~100ms
interval).
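
The arithmetic behind that estimate, as a small sketch. It assumes one
suppressed warning per missed beacon per vif; the helper name is
illustrative:

```shell
#!/bin/sh
# Estimate a stall length in seconds from the number of suppressed warnings.
# $1: suppressed warnings, $2: number of beaconing vifs, $3: beacon interval in ms
stall_seconds() {
    awk -v n="$1" -v vifs="$2" -v ms="$3" \
        'BEGIN { printf "%.2f\n", n * ms / (vifs * 1000) }'
}

stall_seconds 89 2 100   # prints 4.45
```

That is in line with the gap of roughly 4.6 seconds between the
timestamps 82.5 and 87.1 in the quoted log.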

Did you try running without multi-BSS, i.e. just one AP vif? That's
probably not going to help but it's worth ruling that out.


Michał


* Re: Linux freezes after a time while running
  2016-11-18 14:47                 ` Michal Kazior
@ 2016-11-18 17:21                   ` Ben Greear
  2016-11-18 20:55                   ` Re[6]: " Conrad Kostecki
  1 sibling, 0 replies; 14+ messages in thread
From: Ben Greear @ 2016-11-18 17:21 UTC (permalink / raw)
  To: Michal Kazior, Conrad Kostecki; +Cc: ath10k

On 11/18/2016 06:47 AM, Michal Kazior wrote:
> On 16 November 2016 at 23:21, Conrad Kostecki <ck+ath10k@bl4ckb0x.de> wrote:
>> Hello Michal,
>>
>> On 07.11.2016 17:43:02, "Michal Kazior" <michal.kazior@tieto.com> wrote:
>>
>>>
>>> I assume you used a different kernel in the past compared to the
>>> recent test you did. You could try re-testing the older kernel
>>> (assuming you remember which one it was) and if it works you could
>>> bisect your way to find the commit that breaks it for you.
>>>
>>>
>> I don't have the older kernel config anymore, but it was pretty the same, as
>> my current one.
>> I've now done some tests and it's quite interesseting. Kernelversion did not
>> matter for me.
>>
>> Running with current firmware-5.bin, it's always crashing. I tried different
>> BIOS options, nothing helped.
>>
>> BUT: Downgrade to firmware-2.bin helps.
>> ath10k_pci 0000:03:00.0: firmware ver 10.1.467.3-1 api 2 features wmi-10.x,
>> has-wmi-mgmt-tx,no-p2p crc32 2c3ffc2f
>>
>> Running such old firmware, the Wifi just runs stable. NO firmware crash
>> happens. When I just change firmware back to firmware-5.bin, it crashes
>> again.
>> So there seems to changed something. Can be this debugged somehow?
>>
>> With firmware-2.bin, I am also getting such errors, but WiFi works fine:
>> [   82.504901] ath10k_pci 0000:03:00.0: SWBA overrun on vdev 0, skipped old
>> beacon
>> [   82.556103] ath10k_pci 0000:03:00.0: SWBA overrun on vdev 1, skipped old
>> beacon
>> [   87.113085] ath10k_warn: 89 callbacks suppressed
>>
>> Currently, the workaround is for me firmware-2.bin, but IMHO it's not a
>> solution..
>> Any Ideas?
>
> Hmm.. looks like there's a stall in target-host communication for ~5
> seconds (89 suppressed warnings match 2 vifs beaconing at ~100ms
> interval).
>
> Did you try running without multi-BSS, i.e. just one AP vif? That's
> probably not going to help but it's worth ruling that out.

While backporting some 10.2 code into my 10.1 tree, I found a change to
the CE logic in the firmware that appeared to cause hangs in our longer
duration runs.

I'll trade a hint of how to possibly fix this for a hint of how to fix
the warm-start/cold-start bug that is evidently fixed in 10.2 upstream firmware :)

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re[6]: Linux freezes after a time while running
  2016-11-18 14:47                 ` Michal Kazior
  2016-11-18 17:21                   ` Ben Greear
@ 2016-11-18 20:55                   ` Conrad Kostecki
  1 sibling, 0 replies; 14+ messages in thread
From: Conrad Kostecki @ 2016-11-18 20:55 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k

Hi Michal,

On 18.11.2016 15:47:42, "Michal Kazior" <michal.kazior@tieto.com> wrote:

>Hmm.. looks like there's a stall in target-host communication for ~5
>seconds (89 suppressed warnings match 2 vifs beaconing at ~100ms
>interval).
>
>Did you try running without multi-BSS, i.e. just one AP vif? That's
>probably not going to help but it's worth ruling that out.
>
>

I've now run a test and, as you said, it did not help.

Cheers
Conrad


