* ath10k kernel requirements (coherent_pool etc)
@ 2018-08-08 16:56 Arthur Watt
  2018-08-09  9:42 ` Sebastian Gottschall
  2018-08-09 11:04 ` Michał Kazior
  0 siblings, 2 replies; 4+ messages in thread
From: Arthur Watt @ 2018-08-08 16:56 UTC (permalink / raw)
  To: ath10k

Good afternoon.

We have been testing 3 x ath10k devices on a 4.14 based Linux kernel with a Cavium processor.
Up to this point we have been seeing some unexplained crashes from the ath10k devices (firmware crashes due to corrupt HTC messages are the most common).
We have tried various module and firmware versions (including wireless-latest) and all seem to show the same errors.

Is it possible that there is an inherent memory corruption problem with the ath10k driver/firmware? Since other list members don't seem to be shouting about this, we started by investigating the kernel configuration rather than the driver/firmware.

This morning I made a kernel parameter change:
       coherent_pool=256M
Since making this change the number of faults seems to have reduced.
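
For readers who want to confirm what a running system actually booted with, here is a minimal sketch (not part of the original message) that pulls the coherent_pool value out of a kernel command line and converts it to bytes. The helper name coherent_pool_bytes and the sample command line are made up for illustration, and the suffix handling is a simplified stand-in for the kernel's own size parser.

```python
import re

# Simplified size suffixes (powers of 1024); the kernel's parser accepts more.
_SUFFIX = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def coherent_pool_bytes(cmdline):
    """Return the coherent_pool size in bytes, or None if the parameter is absent."""
    match = re.search(
        r"(?:^|\s)coherent_pool=(\d+)([KMG]?)(?=\s|$)", cmdline, re.IGNORECASE
    )
    if match is None:
        return None
    return int(match.group(1)) * _SUFFIX[match.group(2).upper()]


# Hypothetical command line, used only to exercise the parser.
example = "console=ttyS0,115200 coherent_pool=256M root=/dev/sda1"
print(coherent_pool_bytes(example))
```

On a live system the input string would come from /proc/cmdline instead of the hard-coded example.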

Does anyone know if there are any Linux kernel configuration parameters needed in order for multiple ath10k devices to behave reliably?
In particular, do I need a coherent_pool configured, and what page size is optimal for ath10k operation?
Our ath10k devices are QCA9984 and QCA988X.

Thank you
Arthur



_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

* Re: ath10k kernel requirements (coherent_pool etc)
  2018-08-08 16:56 ath10k kernel requirements (coherent_pool etc) Arthur Watt
@ 2018-08-09  9:42 ` Sebastian Gottschall
  2018-08-09 11:04 ` Michał Kazior
  1 sibling, 0 replies; 4+ messages in thread
From: Sebastian Gottschall @ 2018-08-09  9:42 UTC (permalink / raw)
  To: ath10k


I can only say that I run multiple QCA988X cards on x64 and on ARM
Cortex-A9 devices, and I also run dual QCA9984 configurations on IPQ8064
ARM CPUs, with no issues.

It would be good to see some error logs. It could be related to DMA
memory address/space problems, which are CPU specific.


Sebastian

* Re: ath10k kernel requirements (coherent_pool etc)
  2018-08-08 16:56 ath10k kernel requirements (coherent_pool etc) Arthur Watt
  2018-08-09  9:42 ` Sebastian Gottschall
@ 2018-08-09 11:04 ` Michał Kazior
  1 sibling, 0 replies; 4+ messages in thread
From: Michał Kazior @ 2018-08-09 11:04 UTC (permalink / raw)
  To: Arthur Watt; +Cc: ath10k

I recall we had some weird issues with multiple devices at some stage,
with qca61x4 and qca988x on the same system. I can't remember anymore
what it was exactly, but it should be somewhere in the git log. I also
personally had issues with KVM PCI passthrough where interrupts were
somehow lost during firmware upload. Corrupted HTC messages may
suggest incomplete transfers and/or interrupts being propagated
out of order with regard to other transfers.

Is ath10k starting up with MSI interrupts or legacy? If it's MSI, try
forcing legacy. If it's legacy, it might be worth checking whether you
can get MSI running.
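
One way to answer the MSI-or-legacy question is to look at /proc/interrupts. The following is only a sketch under assumptions: the function name ath10k_irq_modes is invented, the sample text is fabricated for illustration, and the column layout of /proc/interrupts differs between architectures and kernel versions, so the simple "MSI" substring test may need adjusting on a given platform.

```python
def ath10k_irq_modes(proc_interrupts_text):
    """Map each ath10k IRQ number to 'msi' or 'legacy'.

    Heuristic: a line whose interrupt-controller column mentions MSI
    (e.g. PCI-MSI, ITS-MSI) is treated as MSI, anything else as legacy.
    """
    modes = {}
    for line in proc_interrupts_text.splitlines():
        if "ath10k" not in line:
            continue
        irq = line.split(":", 1)[0].strip()
        modes[irq] = "msi" if "MSI" in line else "legacy"
    return modes


# Fabricated sample, not output from the system discussed in this thread.
sample = """\
           CPU0       CPU1
 45:      10234          0   PCI-MSI 524288-edge      ath10k_pci
 46:        512          0     GICv3  67 Level     ath10k_pci
"""
print(ath10k_irq_modes(sample))
```

To force one mode or the other, ath10k_pci is believed to expose an irq_mode module parameter (0 auto, 1 legacy, 2 MSI); that is stated from memory, so check `modinfo ath10k_pci` on your build before relying on it.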

Or maybe coherent memory mapping management is buggy for your
platform in the kernel (i.e. an architectural bug) or in U-Boot. The CPU
is 64-bit, right? What is the DMA mask that gets used? Maybe it's too
wide and the PCI controller (on either the host or the QCA chip side)
gets confused. Or maybe the alignment is wrong.
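
To make the DMA mask and alignment points concrete, here is a small worked sketch. It assumes nothing about what ath10k or the platform actually configures: dma_bit_mask mirrors the arithmetic of the kernel's DMA_BIT_MASK(n) macro, while buffer_ok and the addresses below are purely illustrative.

```python
def dma_bit_mask(bits):
    """Same arithmetic as the kernel's DMA_BIT_MASK(n): the low n bits set."""
    return (1 << bits) - 1


def buffer_ok(addr, length, mask_bits, alignment):
    """True if [addr, addr + length) is reachable under the mask and addr is aligned."""
    last = addr + length - 1
    fits = last <= dma_bit_mask(mask_bits)  # no address bit above the mask is set
    aligned = addr % alignment == 0
    return fits and aligned


# A 4 KiB buffer placed just above the 4 GiB boundary (illustrative address).
addr = 0x1_0000_0000
print(buffer_ok(addr, 0x1000, 32, 4))  # unreachable for a device limited to 32 bits
print(buffer_ok(addr, 0x1000, 64, 4))  # reachable if the full 64-bit mask is honoured
```

If the host hands out addresses that pass the second check but the device or PCI controller only decodes 32 bits, the upper bits are silently lost, which is one way a mask that is "too wide" could produce corrupted transfers.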


Michał

* Re: ath10k kernel requirements (coherent_pool etc)
@ 2018-10-02 17:01 Adam Cottrel
  0 siblings, 0 replies; 4+ messages in thread
From: Adam Cottrel @ 2018-10-02 17:01 UTC (permalink / raw)
  To: ath10k

Hi,

I would like to pick up this thread as I have been doing a lot of work with the ATH10K firmware, and we are still no clearer on a workable solution on the Linux 4.14.4 kernel.

We see the firmware crash when the kernel memory starts to get fragmented - we are using a dd copy operation to simulate the clean cache becoming full. We can mitigate (but not stop) the crashes by forcing a cache clear (e.g. drop_caches) at regular intervals, or by forcing kswapd to run more often by using the highest watermark_scale_factor (10%).
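
For anyone trying to correlate the crashes with fragmentation, a rough sketch of one way to quantify it from /proc/buddyinfo follows. The function name high_order_free_pages, the min_order threshold, and the sample line are all assumptions made for illustration; the sample is fabricated and does not come from the affected system.

```python
def high_order_free_pages(buddyinfo_text, min_order=4):
    """Per (node, zone), count free pages held in blocks of at least 2**min_order pages."""
    result = {}
    for line in buddyinfo_text.splitlines():
        parts = line.split()
        # Expected shape: "Node 0, zone Normal c0 c1 ... cN", one count per order.
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        zone = (parts[1].rstrip(","), parts[3])
        counts = [int(c) for c in parts[4:]]
        result[zone] = sum(
            n << order for order, n in enumerate(counts) if order >= min_order
        )
    return result


# Fabricated sample: plenty of small free blocks, almost no large ones.
sample = "Node 0, zone   Normal   4000   900   120    30     2     1     0     0     0     0     0\n"
print(high_order_free_pages(sample))
```

Logging this value alongside the crash timestamps, before and after writing to /proc/sys/vm/drop_caches or changing /proc/sys/vm/watermark_scale_factor, would show whether the mitigations really restore large contiguous blocks.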

The ATH10K driver can be made to crash with legacy or MSI interrupts, and the crash occurs with or without SMP active. Furthermore, enabling or disabling the IOMMU does not change the fault symptoms - toggling it effectively switches the DMA ops between SWIO and the hardware IOMMU on the 64-bit Arm core.

There appear to be two types of crash. In the first, we seem to get an assertion in the QCA9984/QCA988X device, with an immediate crash notification via the NAPI callbacks.

The second type is more mysterious: the firmware just hangs after completing one of the many receive handshakes, and is then eventually picked up by the firmware as a crash.

To start, it is not clear what can make a remote device fail - can anyone help explain what might be causing the QCA988x and QCA9984 to fail in the first place?
Best,
Adam
