ath9k-devel.lists.ath9k.org archive mirror
* [ath9k-devel] Interrupt issue on ARM with SMP
@ 2016-08-25  3:50 Lamar Hansford
  2016-08-25 15:48 ` Adrian Chadd
  0 siblings, 1 reply; 7+ messages in thread
From: Lamar Hansford @ 2016-08-25  3:50 UTC (permalink / raw)
  To: ath9k-devel

Hi,
I am trying to port the driver to an ARM platform using SMP where we see a hard lock-up (no panic).  To address this and a number of secondary issues I am refactoring the interrupt handling code and I need to understand the interrupt hand-shake process.

I am using:
*  NVidia T30
* AR9485 (AR9003)

My current understanding of  the code is:
-1-  ISR
    -1- disable all interrupts (IER|AR_INTR_ASYNC_ENABLE|AR_INTR_SYNC_ENABLE)
    -2- schedule tasklet
-2- Tasklet (enable interrupts)
    -1- write default AR_IER
    -2- write default AR_INTR_ASYNC_ENABLE
    -3- write default AR_INTR_ASYNC_MASK
    -4- write default AR_INTR_SYNC_ENABLE
    -5- write default AR_INTR_SYNC_MASK

I would like to change this to:
-1-  ISR
    -1- mask interrupt to be handled
-2- Tasklet (unmask interrupts)
    -1- Un-mask interrupts which are handled

TEST SCENARIO:
* Set interface to manual in /etc/network/interfaces
* modprobe ath9k
* iw wlan0 set monitor none
* ifconfig wlan0 up

ISSUES:
* I see only two interrupts after reset_complete
* in the first interrupt ATH_OP_HW_RESET is pending.  So ignored
* The second interrupt occurs very close in time after reset_complete which follows sequence below
* no more interrupts

ath_isr
[ 3551.293766] ath: phy6: ISR-IRQ-ISR: 0x81000012
[ 3551.298242] ath: phy6: ISR-IRQ-IMR: 0x81800175
[ 3551.302718] ath: phy6: ISR-IRQ-IMR2: 0xc10000

ar9003_hw_get_isr
[ 3385.525838] ath: phy6: ATH9K REG-ISR-ASYNC-CAUSE: 0 (0x0)
[ 3385.531271] ath: phy6: ATH9K REG-ISR-ASYNC-MASK: 2 (0x2)
[ 3385.536617] ath: phy6: ATH9K REG-ISR-SYNC-CAUSE: 0 (0x0)
[ 3385.541961] ath: phy6: ATH9K REG-ISR-SYNC-MASK: 23f60
[ 3385.547133] ath: phy6: ATH9K REG-ISR-ISR: 0x0, (0x81000012)

This causes the ISR not to be processed, because of:
        if (!isr && !sync_cause && !async_cause)
                return false;
and
        if (async_cause & async_mask) {
                if ((REG_READ(ah, AR_RTC_STATUS) & AR_RTC_STATUS_M)
                                == AR_RTC_STATUS_ON){
                        isr = REG_READ(ah, AR_ISR);
                }
        }
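
A simplified model of those two checks shows why the ARM values are dropped while the x86 values are accepted. This is only a sketch: `struct isr_snapshot` and `model_get_isr` are hypothetical names, the RTC status check is reduced to a boolean, and the ordering (async check first, then the early return) is an assumption about how the two fragments fit together.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the register values printed in the logs. */
struct isr_snapshot {
    uint32_t async_cause; /* AR_INTR_ASYNC_CAUSE */
    uint32_t async_mask;  /* AR_INTR_ASYNC_MASK */
    uint32_t sync_cause;  /* AR_INTR_SYNC_CAUSE */
    uint32_t ar_isr;      /* raw AR_ISR contents */
    bool rtc_on;          /* stands in for the AR_RTC_STATUS check */
};

/*
 * Returns true and stores the ISR value only when one of the cause
 * registers reports something; otherwise the interrupt is not processed.
 */
static bool model_get_isr(const struct isr_snapshot *s, uint32_t *isr_out)
{
    uint32_t isr = 0;

    /* AR_ISR is consulted only if an unmasked async cause is present. */
    if ((s->async_cause & s->async_mask) && s->rtc_on)
        isr = s->ar_isr;

    /* Nothing reported anywhere: bail out without touching *isr_out. */
    if (!isr && !s->sync_cause && !s->async_cause)
        return false;

    *isr_out = isr;
    return true;
}
```

With the ARM values from the log (async cause 0x0, async mask 0x2, sync cause 0x0, AR_ISR 0x81000012) the model returns false even though AR_ISR is non-zero. With the x86 values (async cause 0x2, async mask 0x2, AR_ISR 0x4) it returns true with an ISR of 0x4.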

If I bypass these checks and read and process the ISR anyway, I still get no further interrupts; I only get sleep timer calls.

On x86 (which works) I get continual interrupts which exhibit the following:
Aug 25 03:15:09 s0000 kernel: [190154.647784] ath: phy52: AR_IMR 0x81800175 IER 0x1
Aug 25 03:15:09 s0000 kernel: [190154.651795] ath: phy52: ATH9K-ISR IMR: 0x81800175
Aug 25 03:15:09 s0000 kernel: [190154.651804] ath: phy52: ATH9K-ISR IMR2: 0xc10000
Aug 25 03:15:09 s0000 kernel: [190154.651810] ath: phy52: ATH9K-ISR ISR: 0x4
Aug 25 03:15:09 s0000 kernel: [190154.651820] ath: phy52: ATH9K REG-ISR-ASYNC-CAUSE: 2
Aug 25 03:15:09 s0000 kernel: [190154.651824] ath: phy52: ATH9K REG-ISR-ASYNC-MASK: 2
Aug 25 03:15:09 s0000 kernel: [190154.651827] ath: phy52: ATH9K REG-ISR-SYNC: 0

Any ideas what to explore next?

Additional questions:
* What is the process to mask interrupts?
* Is it possible to mask interrupts without disabling all of them? (normally interrupts remain enabled but only the procedure is masked until serviced).





This email and any attachments may contain private, confidential and privileged material for the sole use of the intended recipient. If you are not the intended recipient, please immediately delete this email and any attachments.


* [ath9k-devel] Interrupt issue on ARM with SMP
  2016-08-25  3:50 [ath9k-devel] Interrupt issue on ARM with SMP Lamar Hansford
@ 2016-08-25 15:48 ` Adrian Chadd
  2016-08-25 18:15   ` Lamar Hansford
  0 siblings, 1 reply; 7+ messages in thread
From: Adrian Chadd @ 2016-08-25 15:48 UTC (permalink / raw)
  To: ath9k-devel

Does the ARM board do legacy interrupts as well as MSI?


-a



* [ath9k-devel] Interrupt issue on ARM with SMP
  2016-08-25 15:48 ` Adrian Chadd
@ 2016-08-25 18:15   ` Lamar Hansford
  2016-08-26  1:33     ` Adrian Chadd
  0 siblings, 1 reply; 7+ messages in thread
From: Lamar Hansford @ 2016-08-25 18:15 UTC (permalink / raw)
  To: ath9k-devel

I am using legacy interrupts in this case.  I found the issue where the interrupts were not firing (stupid issue on my part).  Working on getting the data through the pipe.

I still have a question on the meaning for the ASYNC cause.

After receiving and clearing interrupts (writing the ISR back to itself), I mask the bits until they are serviced.  When the interrupts are masked I get an async cause of 0x0 and the interrupts are ignored.

Is the AR_INTR_ASYNC_CAUSE indicating that the interrupt has not changed since the last IMR update?

If so this would be odd behavior (normally an IMR will prevent spurious ISR to relieve loading).

Thanks in advance,
-Lamar







* [ath9k-devel] Interrupt issue on ARM with SMP
  2016-08-25 18:15   ` Lamar Hansford
@ 2016-08-26  1:33     ` Adrian Chadd
  2016-08-26 14:35       ` Lamar Hansford
  0 siblings, 1 reply; 7+ messages in thread
From: Adrian Chadd @ 2016-08-26  1:33 UTC (permalink / raw)
  To: ath9k-devel

On 25 August 2016 at 11:15, Lamar Hansford <Lamar.Hansford@maxpoint.com> wrote:
> I am using legacy interrupts in this case.  I found the issue where the interrupts were not firing (stupid issue on my part).  Working on getting the data through the pipe.
>
> I still have a question on the meaning for the ASYNC cause.
>
> After receiving and clearing interrupts (writing the ISR back to itself), I mask the bits until they are serviced.  When the interrupts are masked I get an async cause of 0x0 and the interrupts are ignored.
>
> Is the AR_INTR_ASYNC_CAUSE indicating that the interrupt has not changed since the last IMR update?

Right. It's not triggering an interrupt, so ASYNC_CAUSE won't get
triggered. AR_ISR will still show the status, but if it's disabled, it
won't trigger an interrupt at all.

> If so this would be odd behavior (normally an IMR will prevent spurious ISR to relieve loading).

AR_IMR_* is the MAC status control register - it controls which
MAC/PHY events generate interrupts.

SYNC_CAUSE / ASYNC_CAUSE is the whole chip interrupt - it includes
interrupts for hardware errors too, sleep state accesses, etc.

AR_IER controls whether the MAC generates interrupts at all.

So my suggestion is you only enable/disable AR_IER, and you don't try
to enable/disable bits in the AR_IMR register.

SYNC vs ASYNC - the TL;DR is that ASYNC always stays asserted until
you ACK the underlying condition, SYNC gets triggered once until you
ACK the SYNC_CAUSE register bit.

E.g., I do GPIO interrupts on ath9k (haven't integrated / published it
yet.) I use SYNC interrupts otherwise I keep getting the GPIO
interrupts until the GPIO bits are serviced, which can take time. The
underlying event (polarity low or high) can be asserted for quite some
time and I don't want to be spammed with interrupts.
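
The SYNC/ASYNC distinction described above can be illustrated with a toy model. This is a sketch of the description only, not of the actual hardware logic: `struct irq_source` and every function name here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one interrupt source; not the real hardware logic. */
struct irq_source {
    bool condition; /* underlying event, e.g. a GPIO level */
    bool cause_bit; /* latched cause bit awaiting an ACK (sync mode only) */
};

/* Async mode as described: asserted for as long as the condition holds. */
static bool async_fires(const struct irq_source *s)
{
    return s->condition;
}

/* Sync mode as described: fires once, then stays quiet until the cause bit is ACKed. */
static bool sync_fires(struct irq_source *s)
{
    if (s->condition && !s->cause_bit) {
        s->cause_bit = true;
        return true;
    }
    return false;
}

/* ACK the latched cause bit so that sync mode can fire again. */
static void sync_ack(struct irq_source *s)
{
    s->cause_bit = false;
}

/*
 * Count how often each mode would interrupt over `polls` checks while the
 * condition stays asserted and nothing is ACKed.
 */
static void count_interrupts(int polls, int *async_count, int *sync_count)
{
    struct irq_source s = { .condition = true, .cause_bit = false };

    *async_count = 0;
    *sync_count = 0;
    for (int i = 0; i < polls; i++) {
        if (async_fires(&s))
            (*async_count)++;
        if (sync_fires(&s))
            (*sync_count)++;
    }
}
```

In this model a condition that stays asserted across five checks interrupts five times in async mode but only once in sync mode, which matches the GPIO motivation given above.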


-adrian



* [ath9k-devel] Interrupt issue on ARM with SMP
  2016-08-26  1:33     ` Adrian Chadd
@ 2016-08-26 14:35       ` Lamar Hansford
  2016-08-28  4:44         ` Adrian Chadd
  0 siblings, 1 reply; 7+ messages in thread
From: Lamar Hansford @ 2016-08-26 14:35 UTC (permalink / raw)
  To: ath9k-devel

Thanks for the explanation.

What I currently have in my code is that when I receive the first interrupt with ASYNC_CAUSE=2, I disable interrupts (IER=0).

After I do this I see 2-3 more interrupts with ASYNC_CAUSE=0.

Would this be the Atheros chipset generating the interrupts or spurious interrupts caused by the SoC?

My thought in  addressing this would be to disable the IRQ when I receive an interrupt with cause, with a timer to re-enable the interrupt at a period which allows the spurious condition to settle.   Of course, in order to play nice with shared (legacy) interrupts I need to implement MSI interrupt schema or else I would disable interrupts for all shared handlers.

A quick note:
The prolific use of spin_lock_irqsave and spin_lock_bh is bad, very bad.   This disables interrupts for the entire system and not merely the Atheros driver.  This will cause adverse system impacts such as deadlocks and latency issues.

On the ARM platform we can run only for a few seconds (under traffic) before we deadlock the entire system (hard lock with no panic).   Under SMP we get an almost immediate hard lock-up (no panic).   I have ripped out all of the IRQ disables in order to get the system running to the point where I can debug the interrupt handling.

Generally speaking you never want to use spinlocks on long-running tasks.  However, I see spinlocks being used in place of mutexes (which allow the thread to sleep) for potentially long-running tasks such as DMA and flush actions.

Is there any reason why these would be implemented not only with a spinlock, but with a spinlock which disables IRQs?   Is there any reason a mutex would not be better in these cases?

Thanks for your help here!
-Lamar



* [ath9k-devel] Interrupt issue on ARM with SMP
  2016-08-26 14:35       ` Lamar Hansford
@ 2016-08-28  4:44         ` Adrian Chadd
  2016-08-29 15:27           ` Lamar Hansford
  0 siblings, 1 reply; 7+ messages in thread
From: Adrian Chadd @ 2016-08-28  4:44 UTC (permalink / raw)
  To: ath9k-devel

Hi,

So, interrupt handling is finicky. It's possible there's already a
posted interrupt waiting somewhere. Ideally you'd do a write-then-read
in each of the interrupt blocks between your device and the CPU to
ensure things are synced. Otherwise the different frequencies in
different blocks in the ARM code mean you get interrupts after you've
cleared the interrupt itself, because it hasn't yet posted in that
particular hardware block.
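
The write-then-read idea can be sketched as below. The names `reg_write_flush` and `disable_and_flush` are hypothetical, and a plain `volatile` pointer stands in for the mapped register; in a driver the access would go through the kernel's MMIO helpers instead. In userspace the read-back has no special effect; on a bus with posted writes, such as PCI, a read from the device is what forces the earlier write to land first.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Write a register, then read the same register back. On a bus with
 * posted writes the read cannot complete until the write has reached
 * the device, so returning from here means the write has landed.
 */
static inline void reg_write_flush(volatile uint32_t *reg, uint32_t val)
{
    *reg = val;  /* may still be sitting in a bridge or write buffer */
    (void)*reg;  /* read-back pushes the posted write through */
}

/* Disable the interrupt source and make sure the disable has taken effect. */
static void disable_and_flush(volatile uint32_t *ier_reg)
{
    reg_write_flush(ier_reg, 0);
}
```

The same pattern would need to be repeated for each interrupt block between the device and the CPU, as suggested above; which blocks those are depends on the SoC.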

Yay, etc.



-adrian


* [ath9k-devel] Interrupt issue on ARM with SMP
  2016-08-28  4:44         ` Adrian Chadd
@ 2016-08-29 15:27           ` Lamar Hansford
  0 siblings, 0 replies; 7+ messages in thread
From: Lamar Hansford @ 2016-08-29 15:27 UTC (permalink / raw)
  To: ath9k-devel

Another error on my end.  I've ignored any interrupts which do not have async/sync cause and added a mitigation timer and interrupts are firing as expected.   Of course now the packets are getting thrown away at the DMA :).  That is likely due to a BSP issue.
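
As a rough sketch of that approach (not the actual patch): interrupts with no async or sync cause are ignored, and an accepted interrupt disables the source until a hold-off period has passed. All names, the tick-based timer, and the `HOLDOFF_TICKS` value are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum irq_result { IRQ_IGNORED, IRQ_ACCEPTED };

/* Hypothetical driver-side state; not taken from ath9k. */
struct irq_state {
    bool enabled;         /* models the device interrupt enable */
    uint32_t reenable_at; /* tick at which the hold-off ends */
};

#define HOLDOFF_TICKS 2u  /* hypothetical mitigation period */

/* Interrupt entry: drop anything without a cause, otherwise start the hold-off. */
static enum irq_result on_interrupt(struct irq_state *st, uint32_t async_cause,
                                    uint32_t sync_cause, uint32_t now)
{
    if (!async_cause && !sync_cause)
        return IRQ_IGNORED; /* spurious: leave the state untouched */

    st->enabled = false;
    st->reenable_at = now + HOLDOFF_TICKS;
    return IRQ_ACCEPTED;
}

/* Timer callback: re-enable interrupts once the hold-off has expired. */
static void on_tick(struct irq_state *st, uint32_t now)
{
    if (!st->enabled && now >= st->reenable_at)
        st->enabled = true;
}
```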

Thanks for all of the help!
-Lamar

