* ath9k: irq storm after suspend/resume @ 2011-08-27 11:32 Clemens Buchacher 2011-08-29 7:55 ` Mohammed Shafi 0 siblings, 1 reply; 38+ messages in thread From: Clemens Buchacher @ 2011-08-27 11:32 UTC (permalink / raw) To: linux-wireless Hi, I am trying to debug this issue: https://bugzilla.kernel.org/show_bug.cgi?id=39112 An IRQ storm triggers during resume after suspend to RAM. The IRQ storm also triggers during insmod after suspend/resume, even if ath9k is not loaded during suspend. In the former case, the IRQ storm happens right after the PCI subsystem restores the config space for ath9k. In the latter case, the IRQ storm happens immediately after request_irq in ath_pci_probe. So I have moved ath9k_init_device ahead of request_irq, in order to call ath9k_hw_set_interrupts(ah, 0) before enabling the interrupt. But still, the interrupts come. Note that according to /proc/interrupts, the IRQ line is not shared with any other device. I did not manage to determine which interrupt it is exactly, because the device is not in a ready state (SC_OP_INVALID is set) when they happen (in either scenario that triggers the IRQ storm). And SC_OP_INVALID is cleared only much later in ath9k_start. So, I am at a loss. Any ideas? Below is the output I get during insmod'ing right after resume (complete logs can be found in the bug report). Below that are the changes I made to produce the output. Clemens --- [ 53.387989] ath9k 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 [ 53.389306] ath9k 0000:02:00.0: setting latency timer to 64 [ 53.389371] ath: reset power on [ 53.392725] ath: UNDEFINED -> AWAKE [ 53.392783] ath: serialize_regmode is 0 [ 53.438096] ath: Eeprom VER: 14, REV: 13 [ 53.438098] ath: Initialize ANI [ 53.438112] ath: Enable MIB counters [ 53.438143] ath: Setup TX queue: 9 [ 53.438144] ath: Set queue properties for: 9 [ 53.438146] ath: Setup TX queue: 8 [ 53.438147] ath: Set queue properties for: 8 [ 53.438149] ath: Set queue properties for: 8 [ 53.438151] ath: Reset TX queue: 8 [ 53.438176] ath: tx ok 0x0 err 0x0 desc 0x100 eol 0x100 urn 0x0 [ 53.438178] ath: Setup TX queue: 0 [ 53.438179] ath: Set queue properties for: 0 [ 53.438181] ath: Setup TX queue: 1 [ 53.438182] ath: Set queue properties for: 1 [ 53.438184] ath: Setup TX queue: 2 [ 53.438185] ath: Set queue properties for: 2 [ 53.438186] ath: Setup TX queue: 3 [ 53.438187] ath: Set queue properties for: 3 [ 53.439692] ath: TX streams 1, RX streams: 1 [ 53.439694] ath: EEPROM regdomain: 0x60 [ 53.439695] ath: EEPROM indicates we should expect a direct regpair map [ 53.439698] ath: Country alpha2 being used: 00 [ 53.439699] ath: Regpair used: 0x60 [ 53.439702] cfg80211: Updating information on frequency 2412 MHz for a 20 MHz width channel with regulatory rule: [ 53.439704] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439706] cfg80211: Updating information on frequency 2417 MHz for a 20 MHz width channel with regulatory rule: [ 53.439709] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439710] cfg80211: Updating information on frequency 2422 MHz for a 20 MHz width channel with regulatory rule: [ 53.439713] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439714] cfg80211: Updating information on frequency 2427 MHz for a 20 MHz width channel with regulatory rule: [ 53.439717] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439719] cfg80211: Updating information on frequency 2432 MHz for a 20 MHz width channel with regulatory rule: [ 53.439721] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439723] cfg80211: Updating information on frequency 2437 MHz for a 20 MHz width channel with regulatory rule: [ 53.439725] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439727] cfg80211: Updating information on frequency 2442 MHz for a 20 MHz width channel with regulatory rule: [ 53.439729] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439731] cfg80211: Updating information on frequency 2447 MHz for a 20 MHz width channel with regulatory rule: [ 53.439733] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439735] cfg80211: Updating information on frequency 2452 MHz for a 20 MHz width channel with regulatory rule: [ 53.439737] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439739] cfg80211: Updating information on frequency 2457 MHz for a 20 MHz width channel with regulatory rule: [ 53.439741] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439743] cfg80211: Updating information on frequency 2462 MHz for a 20 MHz width channel with regulatory rule: [ 53.439746] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439747] cfg80211: Updating information on frequency 2467 MHz for a 20 MHz width channel with regulatory rule: [ 53.439750] cfg80211: 2457000 KHz - 2482000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439751] cfg80211: Updating information on frequency 2472 MHz for a 20 MHz width channel with regulatory rule: [ 53.439754] cfg80211: 2457000 KHz - 2482000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439756] cfg80211: Updating information on frequency 2484 MHz for a 20 MHz width channel with regulatory rule: [ 53.439758] cfg80211: 2474000 KHz - 2494000 KHz @ KHz), (N/A mBi, 2000 mBm) [ 53.439761] ath: tx DMA: 512 buffers 1 desc/buf [ 53.439772] ath: tx DMA map: ffff880071f90000 (54600) -> 71f90000 (54600) [ 53.439792] ath: beacon DMA: 4 buffers 1 desc/buf [ 53.439795] ath: beacon DMA map: ffff880071ecc000 (416) -> 71ecc000 (416) [ 53.439797] ath: cachelsz 64 rxbufsize 1926 [ 53.439799] ath: rx DMA: 512 buffers 1 desc/buf [ 53.439811] ath: rx DMA map: ffff880071fb0000 (54600) -> 71fb0000 (54600) [ 53.441193] cfg80211: Ignoring regulatory request Set by core since the driver uses its own custom regulatory domain [ 53.441269] ieee80211 phy1: Selected rate control algorithm 'ath9k_rate_control' [ 53.441628] Registered led device: ath9k-phy1 [ 53.441641] ath9k: disabling interrupts [ 53.442979] ath: disable IER [ 53.442990] ath: 0x0 => 0x0 [ 53.442992] ath: new IMR 0x0 [ 53.443013] ath9k: requesting irq [ 53.444293] ath9k: interrupt with invalid flag set (1 times) [ 53.444296] wait 500 ms [ 53.446255] ath9k: interrupt with invalid flag set (2 times) [ 53.446288] ath9k: interrupt with invalid flag set (4 times) [ 53.446330] ath9k: interrupt with invalid flag set (8 times) [ 53.446374] ath9k: interrupt with invalid flag set (16 times) [ 53.446453] ath9k: interrupt with invalid flag set (32 times) [ 53.446572] ath9k: interrupt with invalid flag set (64 times) [ 53.446821] ath9k: interrupt with invalid flag set (128 times) [ 53.447319] ath9k: interrupt with invalid flag set (256 times) [ 53.448526] ath9k: interrupt with invalid flag set (512 times) [ 53.450942] ath9k: interrupt with invalid flag set (1024 times) [ 53.454793] ath9k: interrupt with invalid flag set (2048 times) [ 53.462493] ath9k: interrupt with invalid flag set (4096 times) [ 53.477738] ath9k: interrupt with invalid flag set (8192 times) [ 53.508164] ath9k: interrupt with invalid flag set (16384 times) [ 53.571798] ath9k: interrupt with invalid flag set (32768 times) [ 53.697472] ath9k: interrupt with invalid flag set (65536 times) [ 53.832522] irq 17: nobody cared (try booting with the "irqpoll" option) [ 53.834253] Pid: 0, comm: kworker/0:0 Tainted: G C 3.1.0-rc2-ath9k-debug #16 [ 53.835991] Call Trace: [ 53.837727] <IRQ> [<ffffffff810c64fd>] __report_bad_irq+0x3d/0xe0 [ 53.839487] [<ffffffff810c693d>] note_interrupt+0x14d/0x210 [ 53.841220] [<ffffffff810c49d9>] handle_irq_event_percpu+0xc9/0x2a0 [ 53.842952] [<ffffffff810c4bf8>] handle_irq_event+0x48/0x70 [ 53.844659] [<ffffffff810c727a>] handle_fasteoi_irq+0x5a/0xe0 [ 53.846379] [<ffffffff81016982>] handle_irq+0x22/0x40 [ 53.848073] [<ffffffff813eb45a>] do_IRQ+0x5a/0xe0 [ 53.849751] [<ffffffff813e80ae>] common_interrupt+0x6e/0x6e [ 53.851438] <EOI> [<ffffffff813e4a96>] ? schedule+0x316/0x9b0 [ 53.853151] [<ffffffff812732bb>] ? intel_idle+0xcb/0x120 [ 53.854822] [<ffffffff8127329d>] ? intel_idle+0xad/0x120 [ 53.856502] [<ffffffff81311e66>] cpuidle_idle_call+0xc6/0x350 [ 53.858166] [<ffffffff81013229>] cpu_idle+0xc9/0x120 [ 53.859806] [<ffffffff813d6f71>] start_secondary+0x202/0x209 [ 53.861438] handlers: [ 53.863009] [<ffffffffa0557600>] ath_isr [ 53.864524] Disabling IRQ #17 [ 53.943777] done waiting [ 53.944849] ieee80211 phy1: Atheros AR9285 Rev:2 mem=0xffffc90010ea0000, irq=17 diff --git a/drivers/net/wireless/ath/ath9k/hw.c b/drivers/net/wireless/ath/ath9k/hw.c index 8dcefe7..fe7780c 100644 --- a/drivers/net/wireless/ath/ath9k/hw.c +++ b/drivers/net/wireless/ath/ath9k/hw.c @@ -16,6 +16,7 @@ #include <linux/io.h> #include <linux/slab.h> +#include <linux/delay.h> #include <asm/unaligned.h> #include "hw.h" @@ -532,6 +533,7 @@ static int __ath9k_hw_init(struct ath_hw *ah) ah->WARegVal |= (AR_WA_D3_L1_DISABLE | AR_WA_ASPM_TIMER_BASED_DISABLE); + ath_dbg(common, ATH_DBG_RESET, "reset power on\n"); if (!ath9k_hw_set_reset_reg(ah, ATH9K_RESET_POWER_ON)) { ath_err(common, "Couldn't reset chip\n"); return -EIO; diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c index 9098aaa..fdd5dd1 100644 --- a/drivers/net/wireless/ath/ath9k/main.c +++ b/drivers/net/wireless/ath/ath9k/main.c @@ -16,6 +16,7 @@ #include <linux/nl80211.h> #include <linux/delay.h> +#include <linux/log2.h> #include "ath9k.h" #include "btcoex.h" @@ -770,14 +771,28 @@ irqreturn_t ath_isr(int irq, void *dev) * touch anything. Note this can happen early * on if the IRQ is shared. */ - if (sc->sc_flags & SC_OP_INVALID) + if (sc->sc_flags & SC_OP_INVALID) { + static int count = 0; + if (count < INT_MAX) { + count++; + if (is_power_of_2(count)) + printk(KERN_DEBUG "ath9k: interrupt with invalid flag set (%d times)\n", count); + } return IRQ_NONE; + } /* shared irq, not for us */ - if (!ath9k_hw_intrpend(ah)) + if (!ath9k_hw_intrpend(ah)) { + static int count = 0; + if (count < INT_MAX) { + count++; + if (is_power_of_2(count)) + printk(KERN_DEBUG "ath9k: shared irq, not for us (%d times)\n", count); + } return IRQ_NONE; + } /* * Figure out the reason(s) for the interrupt. Note @@ -792,8 +807,15 @@ irqreturn_t ath_isr(int irq, void *dev) * If there are no status bits set, then this interrupt was not * for me (should have been caught above). */ - if (!status) + if (!status) { + static int count = 0; + if (count < INT_MAX) { + count++; + if (is_power_of_2(count)) + printk(KERN_DEBUG "ath9k: interrupt, but no status bit set (%d times)\n", count); + } return IRQ_NONE; + } /* Cache the status */ sc->intrstatus = status; diff --git a/drivers/net/wireless/ath/ath9k/pci.c b/drivers/net/wireless/ath/ath9k/pci.c index be4ea13..6fdabd8 100644 --- a/drivers/net/wireless/ath/ath9k/pci.c +++ b/drivers/net/wireless/ath/ath9k/pci.c @@ -242,21 +242,35 @@ static int ath_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) /* Will be cleared in ath9k_start() */ sc->sc_flags |= SC_OP_INVALID; + pci_read_config_word(pdev, PCI_SUBSYSTEM_ID, &subsysid); + ret = ath9k_init_device(id->device, sc, subsysid, &ath_pci_bus_ops); + if (ret) { + dev_err(&pdev->dev, "Failed to initialize device\n"); + goto err_init; + } + + { + struct ath_hw *ah = sc->sc_ah; + printk(KERN_INFO "ath9k: disabling interrupts\n"); + ath9k_ps_wakeup(sc); + mutex_lock(&sc->mutex); + ath9k_hw_set_interrupts(ah, 0); + mutex_unlock(&sc->mutex); + ath9k_ps_restore(sc); + } + + printk(KERN_INFO "ath9k: requesting irq\n"); ret = request_irq(pdev->irq, ath_isr, IRQF_SHARED, "ath9k", sc); if (ret) { dev_err(&pdev->dev, "request_irq failed\n"); goto err_irq; } + printk(KERN_INFO "wait 500 ms\n"); + msleep(500); + printk(KERN_INFO "done waiting\n"); sc->irq = pdev->irq; - pci_read_config_word(pdev, PCI_SUBSYSTEM_ID, &subsysid); - ret = ath9k_init_device(id->device, sc, subsysid, &ath_pci_bus_ops); - if (ret) { - dev_err(&pdev->dev, "Failed to initialize device\n"); - goto err_init; - } - ath9k_hw_name(sc->sc_ah, hw_name, sizeof(hw_name)); wiphy_info(hw->wiphy, "%s mem=0x%lx, irq=%d\n", hw_name, (unsigned long)mem, pdev->irq); ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-08-27 11:32 ath9k: irq storm after suspend/resume Clemens Buchacher @ 2011-08-29 7:55 ` Mohammed Shafi 2011-08-29 11:53 ` Mohammed Shafi 0 siblings, 1 reply; 38+ messages in thread From: Mohammed Shafi @ 2011-08-29 7:55 UTC (permalink / raw) To: Clemens Buchacher; +Cc: linux-wireless, beta992 On Sat, Aug 27, 2011 at 5:02 PM, Clemens Buchacher <drizzd@aon.at> wrote: > Hi, > Hi Clemens, > I am trying to debug this issue: > https://bugzilla.kernel.org/show_bug.cgi?id=39112 > > An IRQ storm triggers during resume after suspend to RAM. The IRQ > storm also triggers during insmod after suspend/resume, even if > ath9k is not loaded during suspend. a recent commit, not sure this will help suspend/resume commit 0682c9b52bf51fbc67c4e79fcbdadcf70bd600f8 Author: Rajkumar Manoharan <rmanohar@qca.qualcomm.com> Date: Sat Aug 13 10:28:09 2011 +0530 ath9k: Fix rx overrun interrupt storm Whenever RXEOL is received, both RXORN and RXEOL got cleared to avoid rx overrun interrupt storm. This was handled only for edma chips. The same scenario was also observered with AR9280, doing frequent channel type switch b/w HT20/40 with bidi traffic that is causing failure to stop rx dma. This patch clears the RXEOL & RXORN interrupts for all chips. > > In the former case, the IRQ storm happens right after the PCI > subsystem restores the config space for ath9k. > > In the latter case, the IRQ storm happens immediately after > request_irq in ath_pci_probe. So I have moved ath9k_init_device > ahead of request_irq, in order to call ath9k_hw_set_interrupts(ah, > 0) before enabling the interrupt. > > But still, the interrupts come. Note that according to > /proc/interrupts, the IRQ line is not shared with any other device. > I did not manage to determine which interrupt it is exactly, > because the device is not in a ready state (SC_OP_INVALID is set) > when they happen (in either scenario that triggers the IRQ storm). > And SC_OP_INVALID is cleared only much later in ath9k_start. > > So, I am at a loss. Any ideas? please provide the lspci -vvvxx. also looking at /sys/kernel/debug/ieee80211/phy0/ath9k$ sudo cat interrupt. also this additional information might help: in case have you seen this is happening in 32 bit also ? is this happening in wireless-testing Linux 3.1-rc3 ? or the latest compat wireless? i did some preliminary testing, not able to recreate it. will try further.thanks! > > Below is the output I get during insmod'ing right after resume > (complete logs can be found in the bug report). Below that are the > changes I made to produce the output. > > Clemens > --- > > [ 53.387989] ath9k 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 > [ 53.389306] ath9k 0000:02:00.0: setting latency timer to 64 > [ 53.389371] ath: reset power on > [ 53.392725] ath: UNDEFINED -> AWAKE > [ 53.392783] ath: serialize_regmode is 0 > [ 53.438096] ath: Eeprom VER: 14, REV: 13 > [ 53.438098] ath: Initialize ANI > [ 53.438112] ath: Enable MIB counters > [ 53.438143] ath: Setup TX queue: 9 > [ 53.438144] ath: Set queue properties for: 9 > [ 53.438146] ath: Setup TX queue: 8 > [ 53.438147] ath: Set queue properties for: 8 > [ 53.438149] ath: Set queue properties for: 8 > [ 53.438151] ath: Reset TX queue: 8 > [ 53.438176] ath: tx ok 0x0 err 0x0 desc 0x100 eol 0x100 urn 0x0 > [ 53.438178] ath: Setup TX queue: 0 > [ 53.438179] ath: Set queue properties for: 0 > [ 53.438181] ath: Setup TX queue: 1 > [ 53.438182] ath: Set queue properties for: 1 > [ 53.438184] ath: Setup TX queue: 2 > [ 53.438185] ath: Set queue properties for: 2 > [ 53.438186] ath: Setup TX queue: 3 > [ 53.438187] ath: Set queue properties for: 3 > [ 53.439692] ath: TX streams 1, RX streams: 1 > [ 53.439694] ath: EEPROM regdomain: 0x60 > [ 53.439695] ath: EEPROM indicates we should expect a direct regpair map > [ 53.439698] ath: Country alpha2 being used: 00 > [ 53.439699] ath: Regpair used: 0x60 > [ 53.439702] cfg80211: Updating information on frequency 2412 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439704] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439706] cfg80211: Updating information on frequency 2417 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439709] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439710] cfg80211: Updating information on frequency 2422 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439713] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439714] cfg80211: Updating information on frequency 2427 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439717] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439719] cfg80211: Updating information on frequency 2432 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439721] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439723] cfg80211: Updating information on frequency 2437 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439725] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439727] cfg80211: Updating information on frequency 2442 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439729] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439731] cfg80211: Updating information on frequency 2447 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439733] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439735] cfg80211: Updating information on frequency 2452 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439737] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439739] cfg80211: Updating information on frequency 2457 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439741] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439743] cfg80211: Updating information on frequency 2462 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439746] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439747] cfg80211: Updating information on frequency 2467 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439750] cfg80211: 2457000 KHz - 2482000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439751] cfg80211: Updating information on frequency 2472 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439754] cfg80211: 2457000 KHz - 2482000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439756] cfg80211: Updating information on frequency 2484 MHz for a 20 MHz width channel with regulatory rule: > [ 53.439758] cfg80211: 2474000 KHz - 2494000 KHz @ KHz), (N/A mBi, 2000 mBm) > [ 53.439761] ath: tx DMA: 512 buffers 1 desc/buf > [ 53.439772] ath: tx DMA map: ffff880071f90000 (54600) -> 71f90000 (54600) > [ 53.439792] ath: beacon DMA: 4 buffers 1 desc/buf > [ 53.439795] ath: beacon DMA map: ffff880071ecc000 (416) -> 71ecc000 (416) > [ 53.439797] ath: cachelsz 64 rxbufsize 1926 > [ 53.439799] ath: rx DMA: 512 buffers 1 desc/buf > [ 53.439811] ath: rx DMA map: ffff880071fb0000 (54600) -> 71fb0000 (54600) > [ 53.441193] cfg80211: Ignoring regulatory request Set by core since the driver uses its own custom regulatory domain > [ 53.441269] ieee80211 phy1: Selected rate control algorithm 'ath9k_rate_control' > [ 53.441628] Registered led device: ath9k-phy1 > [ 53.441641] ath9k: disabling interrupts > [ 53.442979] ath: disable IER > [ 53.442990] ath: 0x0 => 0x0 > [ 53.442992] ath: new IMR 0x0 > [ 53.443013] ath9k: requesting irq > [ 53.444293] ath9k: interrupt with invalid flag set (1 times) > [ 53.444296] wait 500 ms > [ 53.446255] ath9k: interrupt with invalid flag set (2 times) > [ 53.446288] ath9k: interrupt with invalid flag set (4 times) > [ 53.446330] ath9k: interrupt with invalid flag set (8 times) > [ 53.446374] ath9k: interrupt with invalid flag set (16 times) > [ 53.446453] ath9k: interrupt with invalid flag set (32 times) > [ 53.446572] ath9k: interrupt with invalid flag set (64 times) > [ 53.446821] ath9k: interrupt with invalid flag set (128 times) > [ 53.447319] ath9k: interrupt with invalid flag set (256 times) > [ 53.448526] ath9k: interrupt with invalid flag set (512 times) > [ 53.450942] ath9k: interrupt with invalid flag set (1024 times) > [ 53.454793] ath9k: interrupt with invalid flag set (2048 times) > [ 53.462493] ath9k: interrupt with invalid flag set (4096 times) > [ 53.477738] ath9k: interrupt with invalid flag set (8192 times) > [ 53.508164] ath9k: interrupt with invalid flag set (16384 times) > [ 53.571798] ath9k: interrupt with invalid flag set (32768 times) > [ 53.697472] ath9k: interrupt with invalid flag set (65536 times) > [ 53.832522] irq 17: nobody cared (try booting with the "irqpoll" option) > [ 53.834253] Pid: 0, comm: kworker/0:0 Tainted: G C 3.1.0-rc2-ath9k-debug #16 > [ 53.835991] Call Trace: > [ 53.837727] <IRQ> [<ffffffff810c64fd>] __report_bad_irq+0x3d/0xe0 > [ 53.839487] [<ffffffff810c693d>] note_interrupt+0x14d/0x210 > [ 53.841220] [<ffffffff810c49d9>] handle_irq_event_percpu+0xc9/0x2a0 > [ 53.842952] [<ffffffff810c4bf8>] handle_irq_event+0x48/0x70 > [ 53.844659] [<ffffffff810c727a>] handle_fasteoi_irq+0x5a/0xe0 > [ 53.846379] [<ffffffff81016982>] handle_irq+0x22/0x40 > [ 53.848073] [<ffffffff813eb45a>] do_IRQ+0x5a/0xe0 > [ 53.849751] [<ffffffff813e80ae>] common_interrupt+0x6e/0x6e > [ 53.851438] <EOI> [<ffffffff813e4a96>] ? schedule+0x316/0x9b0 > [ 53.853151] [<ffffffff812732bb>] ? intel_idle+0xcb/0x120 > [ 53.854822] [<ffffffff8127329d>] ? intel_idle+0xad/0x120 > [ 53.856502] [<ffffffff81311e66>] cpuidle_idle_call+0xc6/0x350 > [ 53.858166] [<ffffffff81013229>] cpu_idle+0xc9/0x120 > [ 53.859806] [<ffffffff813d6f71>] start_secondary+0x202/0x209 > [ 53.861438] handlers: > [ 53.863009] [<ffffffffa0557600>] ath_isr > [ 53.864524] Disabling IRQ #17 > [ 53.943777] done waiting > [ 53.944849] ieee80211 phy1: Atheros AR9285 Rev:2 mem=0xffffc90010ea0000, irq=17 > > diff --git a/drivers/net/wireless/ath/ath9k/hw.c b/drivers/net/wireless/ath/ath9k/hw.c > index 8dcefe7..fe7780c 100644 > --- a/drivers/net/wireless/ath/ath9k/hw.c > +++ b/drivers/net/wireless/ath/ath9k/hw.c > @@ -16,6 +16,7 @@ > > #include <linux/io.h> > #include <linux/slab.h> > +#include <linux/delay.h> > #include <asm/unaligned.h> > > #include "hw.h" > @@ -532,6 +533,7 @@ static int __ath9k_hw_init(struct ath_hw *ah) > ah->WARegVal |= (AR_WA_D3_L1_DISABLE | > AR_WA_ASPM_TIMER_BASED_DISABLE); > > + ath_dbg(common, ATH_DBG_RESET, "reset power on\n"); > if (!ath9k_hw_set_reset_reg(ah, ATH9K_RESET_POWER_ON)) { > ath_err(common, "Couldn't reset chip\n"); > return -EIO; > diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c > index 9098aaa..fdd5dd1 100644 > --- a/drivers/net/wireless/ath/ath9k/main.c > +++ b/drivers/net/wireless/ath/ath9k/main.c > @@ -16,6 +16,7 @@ > > #include <linux/nl80211.h> > #include <linux/delay.h> > +#include <linux/log2.h> > #include "ath9k.h" > #include "btcoex.h" > > @@ -770,14 +771,28 @@ irqreturn_t ath_isr(int irq, void *dev) > * touch anything. Note this can happen early > * on if the IRQ is shared. > */ > - if (sc->sc_flags & SC_OP_INVALID) > + if (sc->sc_flags & SC_OP_INVALID) { > + static int count = 0; > + if (count < INT_MAX) { > + count++; > + if (is_power_of_2(count)) > + printk(KERN_DEBUG "ath9k: interrupt with invalid flag set (%d times)\n", count); > + } > return IRQ_NONE; > + } > > > /* shared irq, not for us */ > > - if (!ath9k_hw_intrpend(ah)) > + if (!ath9k_hw_intrpend(ah)) { > + static int count = 0; > + if (count < INT_MAX) { > + count++; > + if (is_power_of_2(count)) > + printk(KERN_DEBUG "ath9k: shared irq, not for us (%d times)\n", count); > + } > return IRQ_NONE; > + } > > /* > * Figure out the reason(s) for the interrupt. Note > @@ -792,8 +807,15 @@ irqreturn_t ath_isr(int irq, void *dev) > * If there are no status bits set, then this interrupt was not > * for me (should have been caught above). > */ > - if (!status) > + if (!status) { > + static int count = 0; > + if (count < INT_MAX) { > + count++; > + if (is_power_of_2(count)) > + printk(KERN_DEBUG "ath9k: interrupt, but no status bit set (%d times)\n", count); > + } > return IRQ_NONE; > + } > > /* Cache the status */ > sc->intrstatus = status; > diff --git a/drivers/net/wireless/ath/ath9k/pci.c b/drivers/net/wireless/ath/ath9k/pci.c > index be4ea13..6fdabd8 100644 > --- a/drivers/net/wireless/ath/ath9k/pci.c > +++ b/drivers/net/wireless/ath/ath9k/pci.c > @@ -242,21 +242,35 @@ static int ath_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) > /* Will be cleared in ath9k_start() */ > sc->sc_flags |= SC_OP_INVALID; > > + pci_read_config_word(pdev, PCI_SUBSYSTEM_ID, &subsysid); > + ret = ath9k_init_device(id->device, sc, subsysid, &ath_pci_bus_ops); > + if (ret) { > + dev_err(&pdev->dev, "Failed to initialize device\n"); > + goto err_init; > + } > + > + { > + struct ath_hw *ah = sc->sc_ah; > + printk(KERN_INFO "ath9k: disabling interrupts\n"); > + ath9k_ps_wakeup(sc); > + mutex_lock(&sc->mutex); > + ath9k_hw_set_interrupts(ah, 0); > + mutex_unlock(&sc->mutex); > + ath9k_ps_restore(sc); > + } > + > + printk(KERN_INFO "ath9k: requesting irq\n"); > ret = request_irq(pdev->irq, ath_isr, IRQF_SHARED, "ath9k", sc); > if (ret) { > dev_err(&pdev->dev, "request_irq failed\n"); > goto err_irq; > } > + printk(KERN_INFO "wait 500 ms\n"); > + msleep(500); > + printk(KERN_INFO "done waiting\n"); > > sc->irq = pdev->irq; > > - pci_read_config_word(pdev, PCI_SUBSYSTEM_ID, &subsysid); > - ret = ath9k_init_device(id->device, sc, subsysid, &ath_pci_bus_ops); > - if (ret) { > - dev_err(&pdev->dev, "Failed to initialize device\n"); > - goto err_init; > - } > - > ath9k_hw_name(sc->sc_ah, hw_name, sizeof(hw_name)); > wiphy_info(hw->wiphy, "%s mem=0x%lx, irq=%d\n", > hw_name, (unsigned long)mem, pdev->irq); > -- > To unsubscribe from this list: send the line "unsubscribe linux-wireless" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- shafi ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-08-29 7:55 ` Mohammed Shafi @ 2011-08-29 11:53 ` Mohammed Shafi 2011-08-29 15:12 ` Mohammed Shafi 0 siblings, 1 reply; 38+ messages in thread From: Mohammed Shafi @ 2011-08-29 11:53 UTC (permalink / raw) To: Clemens Buchacher; +Cc: linux-wireless, beta992 On Mon, Aug 29, 2011 at 1:25 PM, Mohammed Shafi <shafi.wireless@gmail.com> wrote: > On Sat, Aug 27, 2011 at 5:02 PM, Clemens Buchacher <drizzd@aon.at> wrote: >> Hi, >> > Hi Clemens, > >> I am trying to debug this issue: >> https://bugzilla.kernel.org/show_bug.cgi?id=39112 >> >> An IRQ storm triggers during resume after suspend to RAM. The IRQ >> storm also triggers during insmod after suspend/resume, even if >> ath9k is not loaded during suspend. > > a recent commit, not sure this will help suspend/resume > > commit 0682c9b52bf51fbc67c4e79fcbdadcf70bd600f8 > Author: Rajkumar Manoharan <rmanohar@qca.qualcomm.com> > Date: Sat Aug 13 10:28:09 2011 +0530 > > ath9k: Fix rx overrun interrupt storm > > Whenever RXEOL is received, both RXORN and RXEOL got cleared > to avoid rx overrun interrupt storm. This was handled only for > edma chips. The same scenario was also observered with AR9280, > doing frequent channel type switch b/w HT20/40 with bidi traffic > that is causing failure to stop rx dma. This patch clears > the RXEOL & RXORN interrupts for all chips. > > >> >> In the former case, the IRQ storm happens right after the PCI >> subsystem restores the config space for ath9k. >> >> In the latter case, the IRQ storm happens immediately after >> request_irq in ath_pci_probe. So I have moved ath9k_init_device >> ahead of request_irq, in order to call ath9k_hw_set_interrupts(ah, >> 0) before enabling the interrupt. >> >> But still, the interrupts come. Note that according to >> /proc/interrupts, the IRQ line is not shared with any other device. >> I did not manage to determine which interrupt it is exactly, >> because the device is not in a ready state (SC_OP_INVALID is set) >> when they happen (in either scenario that triggers the IRQ storm). >> And SC_OP_INVALID is cleared only much later in ath9k_start. >> >> So, I am at a loss. Any ideas? > > please provide the lspci -vvvxx. > also looking at > /sys/kernel/debug/ieee80211/phy0/ath9k$ sudo cat interrupt. hi, i think this will help, please get the message sudo modprobe ath9k debug=0xffffffff. few fatal PCI interrupt messages are based on ATH_DEBUG_ANY. thanks. > > also this additional information might help: > in case have you seen this is happening in 32 bit also ? > is this happening in wireless-testing Linux 3.1-rc3 ? or the latest > compat wireless? > i did some preliminary testing, not able to recreate it. will try > further.thanks! > >> >> Below is the output I get during insmod'ing right after resume >> (complete logs can be found in the bug report). Below that are the >> changes I made to produce the output. >> >> Clemens >> --- >> >> [ 53.387989] ath9k 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 >> [ 53.389306] ath9k 0000:02:00.0: setting latency timer to 64 >> [ 53.389371] ath: reset power on >> [ 53.392725] ath: UNDEFINED -> AWAKE >> [ 53.392783] ath: serialize_regmode is 0 >> [ 53.438096] ath: Eeprom VER: 14, REV: 13 >> [ 53.438098] ath: Initialize ANI >> [ 53.438112] ath: Enable MIB counters >> [ 53.438143] ath: Setup TX queue: 9 >> [ 53.438144] ath: Set queue properties for: 9 >> [ 53.438146] ath: Setup TX queue: 8 >> [ 53.438147] ath: Set queue properties for: 8 >> [ 53.438149] ath: Set queue properties for: 8 >> [ 53.438151] ath: Reset TX queue: 8 >> [ 53.438176] ath: tx ok 0x0 err 0x0 desc 0x100 eol 0x100 urn 0x0 >> [ 53.438178] ath: Setup TX queue: 0 >> [ 53.438179] ath: Set queue properties for: 0 >> [ 53.438181] ath: Setup TX queue: 1 >> [ 53.438182] ath: Set queue properties for: 1 >> [ 53.438184] ath: Setup TX queue: 2 >> [ 53.438185] ath: Set queue properties for: 2 >> [ 53.438186] ath: Setup TX queue: 3 >> [ 53.438187] ath: Set queue properties for: 3 >> [ 53.439692] ath: TX streams 1, RX streams: 1 >> [ 53.439694] ath: EEPROM regdomain: 0x60 >> [ 53.439695] ath: EEPROM indicates we should expect a direct regpair map >> [ 53.439698] ath: Country alpha2 being used: 00 >> [ 53.439699] ath: Regpair used: 0x60 >> [ 53.439702] cfg80211: Updating information on frequency 2412 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439704] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439706] cfg80211: Updating information on frequency 2417 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439709] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439710] cfg80211: Updating information on frequency 2422 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439713] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439714] cfg80211: Updating information on frequency 2427 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439717] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439719] cfg80211: Updating information on frequency 2432 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439721] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439723] cfg80211: Updating information on frequency 2437 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439725] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439727] cfg80211: Updating information on frequency 2442 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439729] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439731] cfg80211: Updating information on frequency 2447 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439733] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439735] cfg80211: Updating information on frequency 2452 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439737] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439739] cfg80211: Updating information on frequency 2457 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439741] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439743] cfg80211: Updating information on frequency 2462 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439746] cfg80211: 2402000 KHz - 2472000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439747] cfg80211: Updating information on frequency 2467 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439750] cfg80211: 2457000 KHz - 2482000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439751] cfg80211: Updating information on frequency 2472 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439754] cfg80211: 2457000 KHz - 2482000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439756] cfg80211: Updating information on frequency 2484 MHz for a 20 MHz width channel with regulatory rule: >> [ 53.439758] cfg80211: 2474000 KHz - 2494000 KHz @ KHz), (N/A mBi, 2000 mBm) >> [ 53.439761] ath: tx DMA: 512 buffers 1 desc/buf >> [ 53.439772] ath: tx DMA map: ffff880071f90000 (54600) -> 71f90000 (54600) >> [ 53.439792] ath: beacon DMA: 4 buffers 1 desc/buf >> [ 53.439795] ath: beacon DMA map: ffff880071ecc000 (416) -> 71ecc000 (416) >> [ 53.439797] ath: cachelsz 64 rxbufsize 1926 >> [ 53.439799] ath: rx DMA: 512 buffers 1 desc/buf >> [ 53.439811] ath: rx DMA map: ffff880071fb0000 (54600) -> 71fb0000 (54600) >> [ 53.441193] cfg80211: Ignoring regulatory request Set by core since the driver uses its own custom regulatory domain >> [ 53.441269] ieee80211 phy1: Selected rate control algorithm 'ath9k_rate_control' >> [ 53.441628] Registered led device: ath9k-phy1 >> [ 53.441641] ath9k: disabling interrupts >> [ 53.442979] ath: disable IER >> [ 53.442990] ath: 0x0 => 0x0 >> [ 53.442992] ath: new IMR 0x0 >> [ 53.443013] ath9k: requesting irq >> [ 53.444293] ath9k: interrupt with invalid flag set (1 times) >> [ 53.444296] wait 500 ms >> [ 53.446255] ath9k: interrupt with invalid flag set (2 times) >> [ 53.446288] ath9k: interrupt with invalid flag set (4 times) >> [ 53.446330] ath9k: interrupt with invalid flag set (8 times) >> [ 53.446374] ath9k: interrupt with invalid flag set (16 times) >> [ 53.446453] ath9k: interrupt with invalid flag set (32 times) >> [ 53.446572] ath9k: interrupt with invalid flag set (64 times) >> [ 53.446821] ath9k: interrupt with invalid flag set (128 times) >> [ 53.447319] ath9k: interrupt with invalid flag set (256 times) >> [ 53.448526] ath9k: interrupt with invalid flag set (512 times) >> [ 53.450942] ath9k: interrupt with invalid flag set (1024 times) >> [ 53.454793] ath9k: interrupt with invalid flag set (2048 times) >> [ 53.462493] ath9k: interrupt with invalid flag set (4096 times) >> [ 53.477738] ath9k: interrupt with invalid flag set (8192 times) >> [ 53.508164] ath9k: interrupt with invalid flag set (16384 times) >> [ 53.571798] ath9k: interrupt with invalid flag set (32768 times) >> [ 53.697472] ath9k: interrupt with invalid flag set (65536 times) >> [ 53.832522] irq 17: nobody cared (try booting with the "irqpoll" option) >> [ 53.834253] Pid: 0, comm: kworker/0:0 Tainted: G C 3.1.0-rc2-ath9k-debug #16 >> [ 53.835991] Call Trace: >> [ 53.837727] <IRQ> [<ffffffff810c64fd>] __report_bad_irq+0x3d/0xe0 >> [ 53.839487] [<ffffffff810c693d>] note_interrupt+0x14d/0x210 >> [ 53.841220] [<ffffffff810c49d9>] handle_irq_event_percpu+0xc9/0x2a0 >> [ 53.842952] [<ffffffff810c4bf8>] handle_irq_event+0x48/0x70 >> [ 53.844659] [<ffffffff810c727a>] handle_fasteoi_irq+0x5a/0xe0 >> [ 53.846379] [<ffffffff81016982>] handle_irq+0x22/0x40 >> [ 53.848073] [<ffffffff813eb45a>] do_IRQ+0x5a/0xe0 >> [ 53.849751] [<ffffffff813e80ae>] common_interrupt+0x6e/0x6e >> [ 53.851438] <EOI> [<ffffffff813e4a96>] ? schedule+0x316/0x9b0 >> [ 53.853151] [<ffffffff812732bb>] ? intel_idle+0xcb/0x120 >> [ 53.854822] [<ffffffff8127329d>] ? intel_idle+0xad/0x120 >> [ 53.856502] [<ffffffff81311e66>] cpuidle_idle_call+0xc6/0x350 >> [ 53.858166] [<ffffffff81013229>] cpu_idle+0xc9/0x120 >> [ 53.859806] [<ffffffff813d6f71>] start_secondary+0x202/0x209 >> [ 53.861438] handlers: >> [ 53.863009] [<ffffffffa0557600>] ath_isr >> [ 53.864524] Disabling IRQ #17 >> [ 53.943777] done waiting >> [ 53.944849] ieee80211 phy1: Atheros AR9285 Rev:2 mem=0xffffc90010ea0000, irq=17 >> >> diff --git a/drivers/net/wireless/ath/ath9k/hw.c b/drivers/net/wireless/ath/ath9k/hw.c >> index 8dcefe7..fe7780c 100644 >> --- a/drivers/net/wireless/ath/ath9k/hw.c >> +++ b/drivers/net/wireless/ath/ath9k/hw.c >> @@ -16,6 +16,7 @@ >> >> #include <linux/io.h> >> #include <linux/slab.h> >> +#include <linux/delay.h> >> #include <asm/unaligned.h> >> >> #include "hw.h" >> @@ -532,6 +533,7 @@ static int __ath9k_hw_init(struct ath_hw *ah) >> ah->WARegVal |= (AR_WA_D3_L1_DISABLE | >> AR_WA_ASPM_TIMER_BASED_DISABLE); >> >> + ath_dbg(common, ATH_DBG_RESET, "reset power on\n"); >> if (!ath9k_hw_set_reset_reg(ah, ATH9K_RESET_POWER_ON)) { >> ath_err(common, "Couldn't reset chip\n"); >> return -EIO; >> diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c >> index 9098aaa..fdd5dd1 100644 >> --- a/drivers/net/wireless/ath/ath9k/main.c >> +++ b/drivers/net/wireless/ath/ath9k/main.c >> @@ -16,6 +16,7 @@ >> >> #include <linux/nl80211.h> >> #include <linux/delay.h> >> +#include <linux/log2.h> >> #include "ath9k.h" >> #include "btcoex.h" >> >> @@ -770,14 +771,28 @@ irqreturn_t ath_isr(int irq, void *dev) >> * touch anything. Note this can happen early >> * on if the IRQ is shared. >> */ >> - if (sc->sc_flags & SC_OP_INVALID) >> + if (sc->sc_flags & SC_OP_INVALID) { >> + static int count = 0; >> + if (count < INT_MAX) { >> + count++; >> + if (is_power_of_2(count)) >> + printk(KERN_DEBUG "ath9k: interrupt with invalid flag set (%d times)\n", count); >> + } >> return IRQ_NONE; >> + } >> >> >> /* shared irq, not for us */ >> >> - if (!ath9k_hw_intrpend(ah)) >> + if (!ath9k_hw_intrpend(ah)) { >> + static int count = 0; >> + if (count < INT_MAX) { >> + count++; >> + if (is_power_of_2(count)) >> + printk(KERN_DEBUG "ath9k: shared irq, not for us (%d times)\n", count); >> + } >> return IRQ_NONE; >> + } >> >> /* >> * Figure out the reason(s) for the interrupt. Note >> @@ -792,8 +807,15 @@ irqreturn_t ath_isr(int irq, void *dev) >> * If there are no status bits set, then this interrupt was not >> * for me (should have been caught above). >> */ >> - if (!status) >> + if (!status) { >> + static int count = 0; >> + if (count < INT_MAX) { >> + count++; >> + if (is_power_of_2(count)) >> + printk(KERN_DEBUG "ath9k: interrupt, but no status bit set (%d times)\n", count); >> + } >> return IRQ_NONE; >> + } >> >> /* Cache the status */ >> sc->intrstatus = status; >> diff --git a/drivers/net/wireless/ath/ath9k/pci.c b/drivers/net/wireless/ath/ath9k/pci.c >> index be4ea13..6fdabd8 100644 >> --- a/drivers/net/wireless/ath/ath9k/pci.c >> +++ b/drivers/net/wireless/ath/ath9k/pci.c >> @@ -242,21 +242,35 @@ static int ath_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) >> /* Will be cleared in ath9k_start() */ >> sc->sc_flags |= SC_OP_INVALID; >> >> + pci_read_config_word(pdev, PCI_SUBSYSTEM_ID, &subsysid); >> + ret = ath9k_init_device(id->device, sc, subsysid, &ath_pci_bus_ops); >> + if (ret) { >> + dev_err(&pdev->dev, "Failed to initialize device\n"); >> + goto err_init; >> + } >> + >> + { >> + struct ath_hw *ah = sc->sc_ah; >> + printk(KERN_INFO "ath9k: disabling interrupts\n"); >> + ath9k_ps_wakeup(sc); >> + mutex_lock(&sc->mutex); >> + ath9k_hw_set_interrupts(ah, 0); >> + mutex_unlock(&sc->mutex); >> + ath9k_ps_restore(sc); >> + } >> + >> + printk(KERN_INFO "ath9k: requesting irq\n"); >> ret = request_irq(pdev->irq, ath_isr, IRQF_SHARED, "ath9k", sc); >> if (ret) { >> dev_err(&pdev->dev, "request_irq failed\n"); >> goto err_irq; >> } >> + printk(KERN_INFO "wait 500 ms\n"); >> + msleep(500); >> + printk(KERN_INFO "done waiting\n"); >> >> sc->irq = pdev->irq; >> >> - pci_read_config_word(pdev, PCI_SUBSYSTEM_ID, &subsysid); >> - ret = ath9k_init_device(id->device, sc, subsysid, &ath_pci_bus_ops); >> - if (ret) { >> - dev_err(&pdev->dev, "Failed to initialize device\n"); >> - goto err_init; >> - } >> - >> ath9k_hw_name(sc->sc_ah, hw_name, sizeof(hw_name)); >> wiphy_info(hw->wiphy, "%s mem=0x%lx, irq=%d\n", >> hw_name, (unsigned long)mem, pdev->irq); >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > shafi > -- shafi ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-08-29 11:53 ` Mohammed Shafi @ 2011-08-29 15:12 ` Mohammed Shafi 2011-08-30 6:41 ` Clemens Buchacher 0 siblings, 1 reply; 38+ messages in thread From: Mohammed Shafi @ 2011-08-29 15:12 UTC (permalink / raw) To: Clemens Buchacher; +Cc: linux-wireless, beta992 >>> But still, the interrupts come. Note that according to >>> /proc/interrupts, the IRQ line is not shared with any other device. >>> I did not manage to determine which interrupt it is exactly, >>> because the device is not in a ready state (SC_OP_INVALID is set) >>> when they happen (in either scenario that triggers the IRQ storm). >>> And SC_OP_INVALID is cleared only much later in ath9k_start. >>> >>> So, I am at a loss. Any ideas? >> >> please provide the lspci -vvvxx. >> also looking at >> /sys/kernel/debug/ieee80211/phy0/ath9k$ sudo cat interrupt. > > hi, i think this will help, please get the message sudo modprobe ath9k > debug=0xffffffff. > few fatal PCI interrupt messages are based on ATH_DEBUG_ANY. > thanks. > we can also try to disable MIB interrupts though its handled properly now in ath9k http://www.kernel.org/pub/linux/kernel/people/mcgrof/patches/ath9k/2008-09-25/0001-ath9k-disable-MIB-interrupts-to-fix-interrupt-storm.patch -- shafi ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-08-29 15:12 ` Mohammed Shafi @ 2011-08-30 6:41 ` Clemens Buchacher 2011-08-30 9:33 ` Mohammed Shafi 0 siblings, 1 reply; 38+ messages in thread From: Clemens Buchacher @ 2011-08-30 6:41 UTC (permalink / raw) To: Mohammed Shafi; +Cc: linux-wireless, beta992 Hi Mohammed, On Mon, Aug 29, 2011 at 08:42:33PM +0530, Mohammed Shafi wrote: > > >> But still, the interrupts come. Note that according to > >> /proc/interrupts, the IRQ line is not shared with any other device. > >> I did not manage to determine which interrupt it is exactly, > >> because the device is not in a ready state (SC_OP_INVALID is set) > >> when they happen (in either scenario that triggers the IRQ storm). > >> And SC_OP_INVALID is cleared only much later in ath9k_start. > >> > >> So, I am at a loss. Any ideas? > > > > please provide the lspci -vvvxx. Please see below. > >> also looking at > >> /sys/kernel/debug/ieee80211/phy0/ath9k$ sudo cat interrupt. Those interrupt counters are always zero, because ath_isr never gets to the point where it would gather statistics. The interrupt routine exits right at the start, because SC_OP_INVALID is still set. if (sc->sc_flags & SC_OP_INVALID) return IRQ_NONE; By the time the invalid flag is cleared, the IRQ line has long since been disabled, due to 10000 spurios interrupts during less than 500 ms. > > hi, i think this will help, please get the message sudo modprobe ath9k > > debug=0xffffffff. > > few fatal PCI interrupt messages are based on ATH_DEBUG_ANY. Whenever I did that in the past, it just added lots of PDADC debug messages. > we can also try to disable MIB interrupts though its handled properly > now in ath9k > > http://www.kernel.org/pub/linux/kernel/people/mcgrof/patches/ath9k/2008-09-25/0001-ath9k-disable-MIB-interrupts-to-fix-interrupt-storm.patch But I am already disabling all interrupts by setting the mask to 0. Unless there are some non-maskable ones? I wonder if the device is in some crashed state at this point. Is it possible to reset the device in ath_pci_probe? > a recent commit, not sure this will help suspend/resume > > commit 0682c9b52bf51fbc67c4e79fcbdadcf70bd600f8 > Author: Rajkumar Manoharan <rmanohar@qca.qualcomm.com> > Date: Sat Aug 13 10:28:09 2011 +0530 > > ath9k: Fix rx overrun interrupt storm For the same reason as above, this patch does not touch any code that would get executed. > > also this additional information might help: > > in case have you seen this is happening in 32 bit also ? I have never had a 32-bit system on this machine. > > is this happening in wireless-testing Linux 3.1-rc3 ? or the latest > > compat wireless? I think I tried last week, but I can try again. > > i did some preliminary testing, not able to recreate it. will try > > further.thanks! Thanks for trying. Did you turn off network manager? As I described here, it can make the bug go away. [1] https://bugzilla.kernel.org/show_bug.cgi?id=39112#c5 Clemens --- 02:00.0 Network controller: Atheros Communications Inc. AR9285 Wireless Network Adapter (PCI-Express) (rev 01) Subsystem: AzureWave Device 1089 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 17 Region 0: Memory at d2c00000 (64-bit, non-prefetchable) [size=64K] Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold+) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit- Address: 00000000 Data: 0000 Capabilities: [60] Express (v2) Legacy Endpoint, MSI 00 DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 128 bytes, MaxReadReq 512 bytes DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend- LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us ClockPM- Surprise- LLActRep- BwNot- LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Not Supported, TimeoutDis+ DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -6dB Capabilities: [100 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr- CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn- Capabilities: [140 v1] Virtual Channel Caps: LPEVC=0 RefClk=100ns PATEntryBits=1 Arb: Fixed- WRR32- WRR64- WRR128- Ctrl: ArbSelect=Fixed Status: InProgress- VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans- Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256- Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff Status: NegoPending- InProgress- Capabilities: [160 v1] Device Serial Number 00-15-17-ff-ff-24-14-12 Capabilities: [170 v1] Power Budgeting <?> Kernel driver in use: ath9k Kernel modules: ath9k 00: 8c 16 2b 00 07 00 10 00 01 00 80 02 10 00 00 00 10: 04 00 c0 d2 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 3b 1a 89 10 30: 00 00 00 00 40 00 00 00 00 00 00 00 03 01 00 00 ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-08-30 6:41 ` Clemens Buchacher @ 2011-08-30 9:33 ` Mohammed Shafi 2011-08-30 9:41 ` Mohammed Shafi 2011-09-01 6:24 ` Clemens Buchacher 0 siblings, 2 replies; 38+ messages in thread From: Mohammed Shafi @ 2011-08-30 9:33 UTC (permalink / raw) To: Clemens Buchacher; +Cc: linux-wireless, beta992 On Tue, Aug 30, 2011 at 12:11 PM, Clemens Buchacher <drizzd@aon.at> wrote: > Hi Mohammed, > > On Mon, Aug 29, 2011 at 08:42:33PM +0530, Mohammed Shafi wrote: >> >> >> But still, the interrupts come. Note that according to >> >> /proc/interrupts, the IRQ line is not shared with any other device. >> >> I did not manage to determine which interrupt it is exactly, >> >> because the device is not in a ready state (SC_OP_INVALID is set) >> >> when they happen (in either scenario that triggers the IRQ storm). >> >> And SC_OP_INVALID is cleared only much later in ath9k_start. >> >> >> >> So, I am at a loss. Any ideas? >> > >> > please provide the lspci -vvvxx. > > Please see below. thanks! > >> >> also looking at >> >> /sys/kernel/debug/ieee80211/phy0/ath9k$ sudo cat interrupt. > > Those interrupt counters are always zero, because ath_isr never > gets to the point where it would gather statistics. The interrupt > routine exits right at the start, because SC_OP_INVALID is still > set. yes it is, though not a good idea, just thinking of we could get some thing by not setting SC_OP_INVALID flag in ath_pci_probe(it was added to fix a panic, but it did not cause panic for me now). > > if (sc->sc_flags & SC_OP_INVALID) > return IRQ_NONE; > > By the time the invalid flag is cleared, the IRQ line has long > since been disabled, due to 10000 spurios interrupts during less > than 500 ms. > >> > hi, i think this will help, please get the message sudo modprobe ath9k >> > debug=0xffffffff. >> > few fatal PCI interrupt messages are based on ATH_DEBUG_ANY. > > Whenever I did that in the past, it just added lots of PDADC debug > messages. though we might get some PCI fatal interrupts. > >> we can also try to disable MIB interrupts though its handled properly >> now in ath9k >> >> http://www.kernel.org/pub/linux/kernel/people/mcgrof/patches/ath9k/2008-09-25/0001-ath9k-disable-MIB-interrupts-to-fix-interrupt-storm.patch > > But I am already disabling all interrupts by setting the mask to 0. > Unless there are some non-maskable ones? > > I wonder if the device is in some crashed state at this point. Is > it possible to reset the device in ath_pci_probe? i don't think ath_reset cannot be called > >> a recent commit, not sure this will help suspend/resume >> >> commit 0682c9b52bf51fbc67c4e79fcbdadcf70bd600f8 >> Author: Rajkumar Manoharan <rmanohar@qca.qualcomm.com> >> Date: Sat Aug 13 10:28:09 2011 +0530 >> >> ath9k: Fix rx overrun interrupt storm > > For the same reason as above, this patch does not touch any code > that would get executed. > >> > also this additional information might help: >> > in case have you seen this is happening in 32 bit also ? > > I have never had a 32-bit system on this machine. > >> > is this happening in wireless-testing Linux 3.1-rc3 ? or the latest >> > compat wireless? > > I think I tried last week, but I can try again. > >> > i did some preliminary testing, not able to recreate it. will try >> > further.thanks! > > Thanks for trying. Did you turn off network manager? As I described > here, it can make the bug go away. i am bit confused looking at bug comments. please correct me. the bug comments say that disabling/making the Network-Manager to sleep triggers the problem. > > [1] https://bugzilla.kernel.org/show_bug.cgi?id=39112#c5 > > Clemens > --- > > 02:00.0 Network controller: Atheros Communications Inc. AR9285 Wireless Network Adapter (PCI-Express) (rev 01) > Subsystem: AzureWave Device 1089 > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- > Latency: 0, Cache Line Size: 64 bytes > Interrupt: pin A routed to IRQ 17 > Region 0: Memory at d2c00000 (64-bit, non-prefetchable) [size=64K] > Capabilities: [40] Power Management version 3 > Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold+) > Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- > Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit- > Address: 00000000 Data: 0000 > Capabilities: [60] Express (v2) Legacy Endpoint, MSI 00 > DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us > ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- > DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- > RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- > MaxPayload 128 bytes, MaxReadReq 512 bytes > DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend- > LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us > ClockPM- Surprise- LLActRep- BwNot- > LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+ > ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- > LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- > DevCap2: Completion Timeout: Not Supported, TimeoutDis+ > DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- > LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB > Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- > Compliance De-emphasis: -6dB > LnkSta2: Current De-emphasis Level: -6dB > Capabilities: [100 v1] Advanced Error Reporting > UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- > UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- > UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- > CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr- > CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ > AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn- > Capabilities: [140 v1] Virtual Channel > Caps: LPEVC=0 RefClk=100ns PATEntryBits=1 > Arb: Fixed- WRR32- WRR64- WRR128- > Ctrl: ArbSelect=Fixed > Status: InProgress- > VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans- > Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256- > Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff > Status: NegoPending- InProgress- > Capabilities: [160 v1] Device Serial Number 00-15-17-ff-ff-24-14-12 > Capabilities: [170 v1] Power Budgeting <?> > Kernel driver in use: ath9k > Kernel modules: ath9k > 00: 8c 16 2b 00 07 00 10 00 01 00 80 02 10 00 00 00 > 10: 04 00 c0 d2 00 00 00 00 00 00 00 00 00 00 00 00 > 20: 00 00 00 00 00 00 00 00 00 00 00 00 3b 1a 89 10 > 30: 00 00 00 00 40 00 00 00 00 00 00 00 03 01 00 00 > -- shafi ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-08-30 9:33 ` Mohammed Shafi @ 2011-08-30 9:41 ` Mohammed Shafi 2011-09-01 6:24 ` Clemens Buchacher 1 sibling, 0 replies; 38+ messages in thread From: Mohammed Shafi @ 2011-08-30 9:41 UTC (permalink / raw) To: Clemens Buchacher; +Cc: linux-wireless, beta992 > yes it is, though not a good idea, just thinking of we could get some > thing by not setting SC_OP_INVALID flag in ath_pci_probe(it was added > to fix a panic, but it did not cause panic for me now). sorry please don't try it, not setting the SC_OP_INVALID flag in ath_pci_probe, the system did panic. > >> >> if (sc->sc_flags & SC_OP_INVALID) >> return IRQ_NONE; >> ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-08-30 9:33 ` Mohammed Shafi 2011-08-30 9:41 ` Mohammed Shafi @ 2011-09-01 6:24 ` Clemens Buchacher 2011-09-26 9:24 ` Mohammed Shafi 1 sibling, 1 reply; 38+ messages in thread From: Clemens Buchacher @ 2011-09-01 6:24 UTC (permalink / raw) To: Mohammed Shafi; +Cc: linux-wireless, beta992 Mohammed Shafi <shafi.wireless@gmail.com> wrote: > >i am bit confused looking at bug comments. please correct me. >the bug comments say that disabling/making the Network-Manager to >sleep triggers the problem. That is correct. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-09-01 6:24 ` Clemens Buchacher @ 2011-09-26 9:24 ` Mohammed Shafi 2011-09-27 21:42 ` Clemens Buchacher 0 siblings, 1 reply; 38+ messages in thread From: Mohammed Shafi @ 2011-09-26 9:24 UTC (permalink / raw) To: Clemens Buchacher; +Cc: linux-wireless, beta992 [-- Attachment #1: Type: text/plain, Size: 489 bytes --] Hi, On Thu, Sep 1, 2011 at 11:54 AM, Clemens Buchacher <drizzd@aon.at> wrote: > Mohammed Shafi <shafi.wireless@gmail.com> wrote: >> >>i am bit confused looking at bug comments. please correct me. >>the bug comments say that disabling/making the Network-Manager to >>sleep triggers the problem. > That is correct. is the issue still seen? any improvement with the attached patch.. will just again take a look at this thread and ath_pci_probe once i finish my work. thanks! > -- shafi [-- Attachment #2: 0001-ath9k-disable-interrupt-before-free_irq.patch --] [-- Type: text/x-diff, Size: 1271 bytes --] From 3550cd7a113e17c4e7ced8acbd7f14a3a73586ed Mon Sep 17 00:00:00 2001 From: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com> Date: Sat, 24 Sep 2011 21:07:16 +0530 Subject: [PATCH] ath9k: disable interrupt before free_irq free_irq claims that for a shared IRQ the caller must ensure that the interrupt is disabled on the card it drives before calling it. will soon test and send a proper patch Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com> --- drivers/net/wireless/ath/ath9k/pci.c | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/drivers/net/wireless/ath/ath9k/pci.c b/drivers/net/wireless/ath/ath9k/pci.c index d67d6ee..35537ca 100644 --- a/drivers/net/wireless/ath/ath9k/pci.c +++ b/drivers/net/wireless/ath/ath9k/pci.c @@ -285,11 +285,18 @@ static void ath_pci_remove(struct pci_dev *pdev) { struct ieee80211_hw *hw = pci_get_drvdata(pdev); struct ath_softc *sc = hw->priv; + struct ath_hw *ah = sc->sc_ah; void __iomem *mem = sc->mem; if (!is_ath9k_unloaded) sc->sc_ah->ah_flags |= AH_UNPLUGGED; + + /* disable interrupts */ + ah->imask &= ~ATH9K_INT_GLOBAL; + ath9k_hw_disable_interrupts(ah); + ath9k_deinit_device(sc); + free_irq(sc->irq, sc); ieee80211_free_hw(sc->hw); -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-09-26 9:24 ` Mohammed Shafi @ 2011-09-27 21:42 ` Clemens Buchacher 2011-09-29 8:18 ` Mohammed Shafi 0 siblings, 1 reply; 38+ messages in thread From: Clemens Buchacher @ 2011-09-27 21:42 UTC (permalink / raw) To: Mohammed Shafi; +Cc: linux-wireless, beta992 Hello Mohammed, On Mon, Sep 26, 2011 at 02:54:18PM +0530, Mohammed Shafi wrote: > > is the issue still seen? any improvement with the attached patch.. I still have the issue. No change with current wl-testing plus your patch. Below dmesg output of: modprobe ath9k debug=0x0000fffb echo mem >/sys/power/state # resume Clemens --- [ 27.575562] ath9k 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 [ 27.577970] ath9k 0000:02:00.0: setting latency timer to 64 [ 27.581405] ath: UNDEFINED -> AWAKE [ 27.581469] ath: serialize_regmode is 0 [ 27.626195] ath: Eeprom VER: 14, REV: 13 [ 27.626199] ath: Initialize ANI [ 27.626203] ath: Enable MIB counters [ 27.626240] ath: Setup TX queue: 9 [ 27.626243] ath: Set queue properties for: 9 [ 27.626248] ath: Setup TX queue: 8 [ 27.626251] ath: Set queue properties for: 8 [ 27.626255] ath: Set queue properties for: 8 [ 27.626259] ath: Reset TX queue: 8 [ 27.626286] ath: tx ok 0x0 err 0x0 desc 0x100 eol 0x100 urn 0x0 [ 27.626290] ath: Setup TX queue: 0 [ 27.626293] ath: Set queue properties for: 0 [ 27.626297] ath: Setup TX queue: 1 [ 27.626300] ath: Set queue properties for: 1 [ 27.626304] ath: Setup TX queue: 2 [ 27.626307] ath: Set queue properties for: 2 [ 27.626310] ath: Setup TX queue: 3 [ 27.626313] ath: Set queue properties for: 3 [ 27.627816] ath: TX streams 1, RX streams: 1 [ 27.627819] ath: EEPROM regdomain: 0x60 [ 27.627822] ath: EEPROM indicates we should expect a direct regpair map [ 27.627828] ath: Country alpha2 being used: 00 [ 27.627830] ath: Regpair used: 0x60 [ 27.627838] ath: tx DMA: 512 buffers 1 desc/buf [ 27.627857] ath: tx DMA map: ffff88006f250000 (54600) -> 6f250000 (54600) [ 27.627891] ath: beacon DMA: 4 buffers 1 desc/buf [ 27.627897] ath: beacon DMA map: ffff88007198d000 (416) -> 7198d000 (416) [ 27.627902] ath: cachelsz 64 rxbufsize 1926 [ 27.627906] ath: rx DMA: 512 buffers 1 desc/buf [ 27.627924] ath: rx DMA map: ffff88006f2b0000 (54600) -> 6f2b0000 (54600) [ 27.629650] ieee80211 phy1: Selected rate control algorithm 'ath9k_rate_control' [ 27.630460] Registered led device: ath9k-phy1 [ 27.630475] ieee80211 phy1: Atheros AR9285 Rev:2 mem=0xffffc90021ba0000, irq=17 [ 43.855411] PM: Syncing filesystems ... done. [ 43.948410] PM: Preparing system for mem sleep [ 43.949937] Freezing user space processes ... (elapsed 0.01 seconds) done. [ 43.963254] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done. [ 43.976527] PM: Entering mem sleep [ 43.976595] Suspending console(s) (use no_console_suspend to debug) [ 43.977193] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 43.989161] jmb38x_ms 0000:06:00.3: PCI INT B disabled [ 43.989194] sdhci-pci 0000:06:00.0: PCI INT B disabled [ 43.989232] ath: AWAKE -> FULL-SLEEP [ 43.998370] sd 0:0:0:0: [sda] Stopping disk [ 44.196369] snd_hda_intel 0000:00:1b.0: PCI INT A disabled [ 44.608701] PM: suspend of devices complete after 633.019 msecs [ 44.609142] jme 0000:06:00.5: PME# enabled [ 44.609152] jme 0000:06:00.5: wake-up capability enabled by ACPI [ 44.648588] PM: late suspend of devices complete after 39.960 msecs [ 44.648748] ACPI: Preparing to enter system sleep state S3 [ 44.951903] PM: Saving platform NVS memory [ 44.952780] Disabling non-boot CPUs ... [ 44.955081] CPU 1 is now offline [ 45.057696] CPU 2 is now offline [ 45.160822] CPU 3 is now offline [ 45.161227] Extended CMOS year: 2000 [ 45.161461] ACPI: Low-level resume complete [ 45.161521] PM: Restoring platform NVS memory [ 45.161951] Extended CMOS year: 2000 [ 45.161974] Enabling non-boot CPUs ... [ 45.162070] Booting Node 0 Processor 1 APIC 0x4 [ 45.162071] smpboot cpu 1: start_ip = 96000 [ 45.173166] Calibrating delay loop (skipped) already calibrated this CPU [ 45.193433] NMI watchdog enabled, takes one hw-pmu counter. [ 45.193554] CPU1 is up [ 45.193622] Booting Node 0 Processor 2 APIC 0x1 [ 45.193623] smpboot cpu 2: start_ip = 96000 [ 45.194742] Switched to NOHz mode on CPU #1 [ 45.204719] Calibrating delay loop (skipped) already calibrated this CPU [ 45.225042] NMI watchdog enabled, takes one hw-pmu counter. [ 45.225176] CPU2 is up [ 45.225257] Booting Node 0 Processor 3 APIC 0x5 [ 45.225259] smpboot cpu 3: start_ip = 96000 [ 45.228010] Switched to NOHz mode on CPU #2 [ 45.236352] Calibrating delay loop (skipped) already calibrated this CPU [ 45.256763] NMI watchdog enabled, takes one hw-pmu counter. [ 45.256925] CPU3 is up [ 45.258831] Switched to NOHz mode on CPU #3 [ 45.259072] ACPI: Waking up from system sleep state S3 [ 45.580717] intel ips 0000:00:1f.6: MCP limit exceeded: Avg temp 20253, limit 9000 [ 45.658244] i915 0000:00:02.0: restoring config space at offset 0xf (was 0x100, writing 0x10b) [ 45.658253] i915 0000:00:02.0: restoring config space at offset 0x1 (was 0x900007, writing 0x900407) [ 45.658277] mei 0000:00:16.0: restoring config space at offset 0xf (was 0x100, writing 0x10b) [ 45.658296] mei 0000:00:16.0: restoring config space at offset 0x4 (was 0x4, writing 0xd540a004) [ 45.658304] mei 0000:00:16.0: restoring config space at offset 0x1 (was 0x180000, writing 0x180006) [ 45.658332] pci 0000:00:1a.0: restoring config space at offset 0xf (was 0x100, writing 0x10b) [ 45.658350] pci 0000:00:1a.0: restoring config space at offset 0x4 (was 0x0, writing 0xd5408000) [ 45.658358] pci 0000:00:1a.0: restoring config space at offset 0x1 (was 0x2900000, writing 0x2900006) [ 45.658391] snd_hda_intel 0000:00:1b.0: restoring config space at offset 0xf (was 0x100, writing 0x105) [ 45.658410] snd_hda_intel 0000:00:1b.0: restoring config space at offset 0x4 (was 0x4, writing 0xd5400004) [ 45.658415] snd_hda_intel 0000:00:1b.0: restoring config space at offset 0x3 (was 0x0, writing 0x10) [ 45.658421] snd_hda_intel 0000:00:1b.0: restoring config space at offset 0x1 (was 0x100004, writing 0x100002) [ 45.658455] pcieport 0000:00:1c.0: restoring config space at offset 0xf (was 0x100, writing 0x10010b) [ 45.658467] pcieport 0000:00:1c.0: restoring config space at offset 0x9 (was 0x1fff1, writing 0x7c717c61) [ 45.658479] pcieport 0000:00:1c.0: restoring config space at offset 0x3 (was 0x810000, writing 0x810010) [ 45.658485] pcieport 0000:00:1c.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100407) [ 45.658538] pcieport 0000:00:1c.1: restoring config space at offset 0xf (was 0x200, writing 0x100203) [ 45.658550] pcieport 0000:00:1c.1: restoring config space at offset 0x9 (was 0x1fff1, writing 0x7c517c41) [ 45.658561] pcieport 0000:00:1c.1: restoring config space at offset 0x3 (was 0x810000, writing 0x810010) [ 45.658568] pcieport 0000:00:1c.1: restoring config space at offset 0x1 (was 0x100000, writing 0x100407) [ 45.658620] pcieport 0000:00:1c.2: restoring config space at offset 0xf (was 0x300, writing 0x100304) [ 45.658632] pcieport 0000:00:1c.2: restoring config space at offset 0x9 (was 0x1fff1, writing 0x7c317c21) [ 45.658643] pcieport 0000:00:1c.2: restoring config space at offset 0x3 (was 0x810000, writing 0x810010) [ 45.658650] pcieport 0000:00:1c.2: restoring config space at offset 0x1 (was 0x100000, writing 0x100407) [ 45.658703] pcieport 0000:00:1c.5: restoring config space at offset 0xf (was 0x200, writing 0x100203) [ 45.658714] pcieport 0000:00:1c.5: restoring config space at offset 0x9 (was 0x1fff1, writing 0x7c117c01) [ 45.658726] pcieport 0000:00:1c.5: restoring config space at offset 0x3 (was 0x810000, writing 0x810010) [ 45.658732] pcieport 0000:00:1c.5: restoring config space at offset 0x1 (was 0x100000, writing 0x100407) [ 45.658780] pci 0000:00:1d.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) [ 45.658799] pci 0000:00:1d.0: restoring config space at offset 0x4 (was 0x0, writing 0xd5407000) [ 45.658807] pci 0000:00:1d.0: restoring config space at offset 0x1 (was 0x2900000, writing 0x2900006) [ 45.658831] pci 0000:00:1e.0: restoring config space at offset 0xf (was 0x0, writing 0x100000) [ 45.658930] ahci 0000:00:1f.2: restoring config space at offset 0x1 (was 0x2b00007, writing 0x2b00407) [ 45.659017] intel ips 0000:00:1f.6: restoring config space at offset 0x1 (was 0x100400, writing 0x100406) [ 45.659057] ath9k 0000:02:00.0: restoring config space at offset 0xf (was 0x1ff, writing 0x103) [ 45.659077] ath9k 0000:02:00.0: restoring config space at offset 0x4 (was 0x4, writing 0xd2c00004) [ 45.659082] ath9k 0000:02:00.0: restoring config space at offset 0x3 (was 0x0, writing 0x10) [ 45.659088] ath9k 0000:02:00.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100007) [ 45.659129] sdhci-pci 0000:06:00.0: restoring config space at offset 0xf (was 0x2ff, writing 0x204) [ 45.659157] sdhci-pci 0000:06:00.0: restoring config space at offset 0x4 (was 0x0, writing 0xd0407000) [ 45.659163] sdhci-pci 0000:06:00.0: restoring config space at offset 0x3 (was 0x800000, writing 0x800010) [ 45.659172] sdhci-pci 0000:06:00.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100007) [ 45.659218] pci 0000:06:00.2: restoring config space at offset 0xf (was 0x2ff, writing 0x204) [ 45.659246] pci 0000:06:00.2: restoring config space at offset 0x4 (was 0x0, writing 0xd0406000) [ 45.659253] pci 0000:06:00.2: restoring config space at offset 0x3 (was 0x800000, writing 0x800010) [ 45.659262] pci 0000:06:00.2: restoring config space at offset 0x1 (was 0x100000, writing 0x100003) [ 45.659308] jmb38x_ms 0000:06:00.3: restoring config space at offset 0xf (was 0x2ff, writing 0x204) [ 45.659336] jmb38x_ms 0000:06:00.3: restoring config space at offset 0x4 (was 0x0, writing 0xd0405000) [ 45.659342] jmb38x_ms 0000:06:00.3: restoring config space at offset 0x3 (was 0x800000, writing 0x800010) [ 45.659351] jmb38x_ms 0000:06:00.3: restoring config space at offset 0x1 (was 0x100000, writing 0x100007) [ 45.659397] pci 0000:06:00.4: restoring config space at offset 0xf (was 0x2ff, writing 0x204) [ 45.659425] pci 0000:06:00.4: restoring config space at offset 0x4 (was 0x0, writing 0xd0404000) [ 45.659431] pci 0000:06:00.4: restoring config space at offset 0x3 (was 0x800000, writing 0x800010) [ 45.659441] pci 0000:06:00.4: restoring config space at offset 0x1 (was 0x100000, writing 0x100007) [ 45.659487] jme 0000:06:00.5: restoring config space at offset 0xf (was 0x100, writing 0x103) [ 45.659508] jme 0000:06:00.5: restoring config space at offset 0x7 (was 0x1, writing 0xa001) [ 45.659520] jme 0000:06:00.5: restoring config space at offset 0x4 (was 0x0, writing 0xd0400000) [ 45.659526] jme 0000:06:00.5: restoring config space at offset 0x3 (was 0x800000, writing 0x800010) [ 45.659535] jme 0000:06:00.5: restoring config space at offset 0x1 (was 0x100000, writing 0x100007) [ 45.659759] PM: early resume of devices complete after 1.603 msecs [ 45.659979] i915 0000:00:02.0: setting latency timer to 64 [ 45.659996] snd_hda_intel 0000:00:1b.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22 [ 45.660003] snd_hda_intel 0000:00:1b.0: setting latency timer to 64 [ 45.660050] snd_hda_intel 0000:00:1b.0: irq 45 for MSI/MSI-X [ 45.660126] pci 0000:00:1e.0: setting latency timer to 64 [ 45.660144] ahci 0000:00:1f.2: setting latency timer to 64 [ 45.660211] ath: FULL-SLEEP -> AWAKE [ 45.663932] sdhci-pci 0000:06:00.0: PCI INT B -> GSI 18 (level, low) -> IRQ 18 [ 45.663974] sdhci-pci 0000:06:00.0: setting latency timer to 64 [ 45.664037] jmb38x_ms 0000:06:00.3: PCI INT B -> GSI 18 (level, low) -> IRQ 18 [ 45.664051] jmb38x_ms 0000:06:00.3: setting latency timer to 64 [ 45.664113] jme 0000:06:00.5: wake-up capability disabled by ACPI [ 45.664124] jme 0000:06:00.5: PME# disabled [ 45.664503] ath: disable IER [ 45.664521] ath: Disable MIB counters [ 45.664560] ath: disable IER [ 45.664575] ath: 0x30 => 0x30 [ 45.664576] ath: new IMR 0x30 [ 45.664672] sd 0:0:0:0: [sda] Starting disk [ 45.667161] ath: Getting spur idx:0 is2Ghz:1 val:8000 [ 45.668810] ath: Reset TX queue: 0 [ 45.668813] ath: tx ok 0x0 err 0x0 desc 0x101 eol 0x101 urn 0x0 [ 45.668815] ath: Reset TX queue: 1 [ 45.668817] ath: tx ok 0x0 err 0x0 desc 0x103 eol 0x103 urn 0x0 [ 45.668819] ath: Reset TX queue: 2 [ 45.668821] ath: tx ok 0x0 err 0x0 desc 0x107 eol 0x107 urn 0x0 [ 45.668823] ath: Reset TX queue: 3 [ 45.668825] ath: tx ok 0x0 err 0x0 desc 0x10f eol 0x10f urn 0x0 [ 45.668827] ath: Reset TXQ, inactive queue: 4 [ 45.668828] ath: Reset TXQ, inactive queue: 5 [ 45.668830] ath: Reset TXQ, inactive queue: 6 [ 45.668831] ath: Reset TXQ, inactive queue: 7 [ 45.668832] ath: Reset TX queue: 8 [ 45.668842] ath: tx ok 0x0 err 0x0 desc 0x10f eol 0x10f urn 0x0 [ 45.668844] ath: Reset TX queue: 9 [ 45.668854] ath: tx ok 0x0 err 0x0 desc 0x10f eol 0x10f urn 0x0 [ 45.668861] ath: ah->misc_mode 0x4 [ 45.670264] ath: Running PA Calibration [ 45.670963] ath: enabling ADC Gain Calibration. [ 45.670964] ath: enabling ADC DC Calibration. [ 45.670965] ath: enabling IQ Calibration. [ 45.670969] ath: starting ADC Gain Calibration [ 45.671139] ath: AWAKE -> FULL-SLEEP [ 45.679552] Extended CMOS year: 2000 [ 45.983193] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ 45.985133] ata2.00: ACPI cmd ef/10:06:00:00:00:a0 (SET FEATURES) succeeded [ 45.985136] ata2.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [ 45.988885] ata2.00: ACPI cmd ef/10:06:00:00:00:a0 (SET FEATURES) succeeded [ 45.988889] ata2.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [ 45.989852] ata6: SATA link down (SStatus 0 SControl 300) [ 45.990375] ata2.00: configured for UDMA/100 [ 46.120272] irq 17: nobody cared (try booting with the "irqpoll" option) [ 46.120275] Pid: 0, comm: swapper Tainted: G C 3.1.0-rc7-wl-ARCH3.1.0-rc7-ath9k-debug #2 [ 46.120277] Call Trace: [ 46.120280] <IRQ> [<ffffffff810c68bd>] __report_bad_irq+0x3d/0xe0 [ 46.120290] [<ffffffff810c6cfd>] note_interrupt+0x14d/0x210 [ 46.120293] [<ffffffff810c4c39>] handle_irq_event_percpu+0xc9/0x2a0 [ 46.120296] [<ffffffff810c4e58>] handle_irq_event+0x48/0x70 [ 46.120299] [<ffffffff810c763a>] handle_fasteoi_irq+0x5a/0xe0 [ 46.120303] [<ffffffff81016982>] handle_irq+0x22/0x40 [ 46.120308] [<ffffffff813ec09a>] do_IRQ+0x5a/0xe0 [ 46.120312] [<ffffffff813e8cee>] common_interrupt+0x6e/0x6e [ 46.120313] <EOI> [<ffffffff8131234a>] ? poll_idle+0x3a/0x80 [ 46.120319] [<ffffffff81312323>] ? poll_idle+0x13/0x80 [ 46.120321] [<ffffffff813126f6>] cpuidle_idle_call+0xc6/0x350 [ 46.120324] [<ffffffff81013229>] cpu_idle+0xc9/0x120 [ 46.120328] [<ffffffff813c64e7>] rest_init+0x9b/0xa4 [ 46.120330] [<ffffffff8174dc15>] start_kernel+0x3bf/0x3cc [ 46.120333] [<ffffffff8174d347>] x86_64_start_reservations+0x132/0x136 [ 46.120336] [<ffffffff8174d140>] ? early_idt_handlers+0x140/0x140 [ 46.120338] [<ffffffff8174d44d>] x86_64_start_kernel+0x102/0x111 [ 46.120340] handlers: [ 46.120344] [<ffffffffa053c020>] ath_isr [ 46.120345] Disabling IRQ #17 [ 48.371839] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 48.479286] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [ 48.479457] ata1.00: ACPI cmd ef/10:06:00:00:00:a0 (SET FEATURES) succeeded [ 48.479463] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [ 48.516626] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [ 48.516804] ata1.00: ACPI cmd ef/10:06:00:00:00:a0 (SET FEATURES) succeeded [ 48.516809] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [ 48.518731] ata1.00: configured for UDMA/133 [ 48.526624] ath: FULL-SLEEP -> AWAKE [ 48.530116] ath: ah->misc_mode 0x4 [ 48.530142] ath: AWAKE -> FULL-SLEEP [ 48.530172] PM: resume of devices complete after 2875.994 msecs [ 48.530607] PM: Finishing wakeup. [ 48.530608] Restarting tasks ... done. [ 48.533504] video LNXVIDEO:02: Restoring backlight state ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-09-27 21:42 ` Clemens Buchacher @ 2011-09-29 8:18 ` Mohammed Shafi 2011-09-29 10:33 ` Adrian Chadd 0 siblings, 1 reply; 38+ messages in thread From: Mohammed Shafi @ 2011-09-29 8:18 UTC (permalink / raw) To: Clemens Buchacher; +Cc: linux-wireless, beta992 On Wed, Sep 28, 2011 at 3:12 AM, Clemens Buchacher <drizzd@aon.at> wrote: > Hello Mohammed, > > On Mon, Sep 26, 2011 at 02:54:18PM +0530, Mohammed Shafi wrote: >> >> is the issue still seen? any improvement with the attached patch.. > > I still have the issue. No change with current wl-testing plus your > patch. Clemens thanks for taking time and test it, I wll see if i can find something else. > > Below dmesg output of: > > modprobe ath9k debug=0x0000fffb > echo mem >/sys/power/state > # resume > > Clemens > > --- > [ 27.575562] ath9k 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 > [ 27.577970] ath9k 0000:02:00.0: setting latency timer to 64 > [ 27.581405] ath: UNDEFINED -> AWAKE > [ 27.581469] ath: serialize_regmode is 0 > [ 27.626195] ath: Eeprom VER: 14, REV: 13 > [ 27.626199] ath: Initialize ANI > [ 27.626203] ath: Enable MIB counters > [ 27.626240] ath: Setup TX queue: 9 > [ 27.626243] ath: Set queue properties for: 9 > [ 27.626248] ath: Setup TX queue: 8 > [ 27.626251] ath: Set queue properties for: 8 > [ 27.626255] ath: Set queue properties for: 8 > [ 27.626259] ath: Reset TX queue: 8 > [ 27.626286] ath: tx ok 0x0 err 0x0 desc 0x100 eol 0x100 urn 0x0 > [ 27.626290] ath: Setup TX queue: 0 > [ 27.626293] ath: Set queue properties for: 0 > [ 27.626297] ath: Setup TX queue: 1 > [ 27.626300] ath: Set queue properties for: 1 > [ 27.626304] ath: Setup TX queue: 2 > [ 27.626307] ath: Set queue properties for: 2 > [ 27.626310] ath: Setup TX queue: 3 > [ 27.626313] ath: Set queue properties for: 3 > [ 27.627816] ath: TX streams 1, RX streams: 1 > [ 27.627819] ath: EEPROM regdomain: 0x60 > [ 27.627822] ath: EEPROM indicates we should expect a direct regpair map > [ 27.627828] ath: Country alpha2 being used: 00 > [ 27.627830] ath: Regpair used: 0x60 > [ 27.627838] ath: tx DMA: 512 buffers 1 desc/buf > [ 27.627857] ath: tx DMA map: ffff88006f250000 (54600) -> 6f250000 (54600) > [ 27.627891] ath: beacon DMA: 4 buffers 1 desc/buf > [ 27.627897] ath: beacon DMA map: ffff88007198d000 (416) -> 7198d000 (416) > [ 27.627902] ath: cachelsz 64 rxbufsize 1926 > [ 27.627906] ath: rx DMA: 512 buffers 1 desc/buf > [ 27.627924] ath: rx DMA map: ffff88006f2b0000 (54600) -> 6f2b0000 (54600) > [ 27.629650] ieee80211 phy1: Selected rate control algorithm 'ath9k_rate_control' > [ 27.630460] Registered led device: ath9k-phy1 > [ 27.630475] ieee80211 phy1: Atheros AR9285 Rev:2 mem=0xffffc90021ba0000, irq=17 > [ 43.855411] PM: Syncing filesystems ... done. > [ 43.948410] PM: Preparing system for mem sleep > [ 43.949937] Freezing user space processes ... (elapsed 0.01 seconds) done. > [ 43.963254] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done. > [ 43.976527] PM: Entering mem sleep > [ 43.976595] Suspending console(s) (use no_console_suspend to debug) > [ 43.977193] sd 0:0:0:0: [sda] Synchronizing SCSI cache > [ 43.989161] jmb38x_ms 0000:06:00.3: PCI INT B disabled > [ 43.989194] sdhci-pci 0000:06:00.0: PCI INT B disabled > [ 43.989232] ath: AWAKE -> FULL-SLEEP > [ 43.998370] sd 0:0:0:0: [sda] Stopping disk > [ 44.196369] snd_hda_intel 0000:00:1b.0: PCI INT A disabled > [ 44.608701] PM: suspend of devices complete after 633.019 msecs > [ 44.609142] jme 0000:06:00.5: PME# enabled > [ 44.609152] jme 0000:06:00.5: wake-up capability enabled by ACPI > [ 44.648588] PM: late suspend of devices complete after 39.960 msecs > [ 44.648748] ACPI: Preparing to enter system sleep state S3 > [ 44.951903] PM: Saving platform NVS memory > [ 44.952780] Disabling non-boot CPUs ... > [ 44.955081] CPU 1 is now offline > [ 45.057696] CPU 2 is now offline > [ 45.160822] CPU 3 is now offline > [ 45.161227] Extended CMOS year: 2000 > [ 45.161461] ACPI: Low-level resume complete > [ 45.161521] PM: Restoring platform NVS memory > [ 45.161951] Extended CMOS year: 2000 > [ 45.161974] Enabling non-boot CPUs ... > [ 45.162070] Booting Node 0 Processor 1 APIC 0x4 > [ 45.162071] smpboot cpu 1: start_ip = 96000 > [ 45.173166] Calibrating delay loop (skipped) already calibrated this CPU > [ 45.193433] NMI watchdog enabled, takes one hw-pmu counter. > [ 45.193554] CPU1 is up > [ 45.193622] Booting Node 0 Processor 2 APIC 0x1 > [ 45.193623] smpboot cpu 2: start_ip = 96000 > [ 45.194742] Switched to NOHz mode on CPU #1 > [ 45.204719] Calibrating delay loop (skipped) already calibrated this CPU > [ 45.225042] NMI watchdog enabled, takes one hw-pmu counter. > [ 45.225176] CPU2 is up > [ 45.225257] Booting Node 0 Processor 3 APIC 0x5 > [ 45.225259] smpboot cpu 3: start_ip = 96000 > [ 45.228010] Switched to NOHz mode on CPU #2 > [ 45.236352] Calibrating delay loop (skipped) already calibrated this CPU > [ 45.256763] NMI watchdog enabled, takes one hw-pmu counter. > [ 45.256925] CPU3 is up > [ 45.258831] Switched to NOHz mode on CPU #3 > [ 45.259072] ACPI: Waking up from system sleep state S3 > [ 45.580717] intel ips 0000:00:1f.6: MCP limit exceeded: Avg temp 20253, limit 9000 > [ 45.658244] i915 0000:00:02.0: restoring config space at offset 0xf (was 0x100, writing 0x10b) > [ 45.658253] i915 0000:00:02.0: restoring config space at offset 0x1 (was 0x900007, writing 0x900407) > [ 45.658277] mei 0000:00:16.0: restoring config space at offset 0xf (was 0x100, writing 0x10b) > [ 45.658296] mei 0000:00:16.0: restoring config space at offset 0x4 (was 0x4, writing 0xd540a004) > [ 45.658304] mei 0000:00:16.0: restoring config space at offset 0x1 (was 0x180000, writing 0x180006) > [ 45.658332] pci 0000:00:1a.0: restoring config space at offset 0xf (was 0x100, writing 0x10b) > [ 45.658350] pci 0000:00:1a.0: restoring config space at offset 0x4 (was 0x0, writing 0xd5408000) > [ 45.658358] pci 0000:00:1a.0: restoring config space at offset 0x1 (was 0x2900000, writing 0x2900006) > [ 45.658391] snd_hda_intel 0000:00:1b.0: restoring config space at offset 0xf (was 0x100, writing 0x105) > [ 45.658410] snd_hda_intel 0000:00:1b.0: restoring config space at offset 0x4 (was 0x4, writing 0xd5400004) > [ 45.658415] snd_hda_intel 0000:00:1b.0: restoring config space at offset 0x3 (was 0x0, writing 0x10) > [ 45.658421] snd_hda_intel 0000:00:1b.0: restoring config space at offset 0x1 (was 0x100004, writing 0x100002) > [ 45.658455] pcieport 0000:00:1c.0: restoring config space at offset 0xf (was 0x100, writing 0x10010b) > [ 45.658467] pcieport 0000:00:1c.0: restoring config space at offset 0x9 (was 0x1fff1, writing 0x7c717c61) > [ 45.658479] pcieport 0000:00:1c.0: restoring config space at offset 0x3 (was 0x810000, writing 0x810010) > [ 45.658485] pcieport 0000:00:1c.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100407) > [ 45.658538] pcieport 0000:00:1c.1: restoring config space at offset 0xf (was 0x200, writing 0x100203) > [ 45.658550] pcieport 0000:00:1c.1: restoring config space at offset 0x9 (was 0x1fff1, writing 0x7c517c41) > [ 45.658561] pcieport 0000:00:1c.1: restoring config space at offset 0x3 (was 0x810000, writing 0x810010) > [ 45.658568] pcieport 0000:00:1c.1: restoring config space at offset 0x1 (was 0x100000, writing 0x100407) > [ 45.658620] pcieport 0000:00:1c.2: restoring config space at offset 0xf (was 0x300, writing 0x100304) > [ 45.658632] pcieport 0000:00:1c.2: restoring config space at offset 0x9 (was 0x1fff1, writing 0x7c317c21) > [ 45.658643] pcieport 0000:00:1c.2: restoring config space at offset 0x3 (was 0x810000, writing 0x810010) > [ 45.658650] pcieport 0000:00:1c.2: restoring config space at offset 0x1 (was 0x100000, writing 0x100407) > [ 45.658703] pcieport 0000:00:1c.5: restoring config space at offset 0xf (was 0x200, writing 0x100203) > [ 45.658714] pcieport 0000:00:1c.5: restoring config space at offset 0x9 (was 0x1fff1, writing 0x7c117c01) > [ 45.658726] pcieport 0000:00:1c.5: restoring config space at offset 0x3 (was 0x810000, writing 0x810010) > [ 45.658732] pcieport 0000:00:1c.5: restoring config space at offset 0x1 (was 0x100000, writing 0x100407) > [ 45.658780] pci 0000:00:1d.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) > [ 45.658799] pci 0000:00:1d.0: restoring config space at offset 0x4 (was 0x0, writing 0xd5407000) > [ 45.658807] pci 0000:00:1d.0: restoring config space at offset 0x1 (was 0x2900000, writing 0x2900006) > [ 45.658831] pci 0000:00:1e.0: restoring config space at offset 0xf (was 0x0, writing 0x100000) > [ 45.658930] ahci 0000:00:1f.2: restoring config space at offset 0x1 (was 0x2b00007, writing 0x2b00407) > [ 45.659017] intel ips 0000:00:1f.6: restoring config space at offset 0x1 (was 0x100400, writing 0x100406) > [ 45.659057] ath9k 0000:02:00.0: restoring config space at offset 0xf (was 0x1ff, writing 0x103) > [ 45.659077] ath9k 0000:02:00.0: restoring config space at offset 0x4 (was 0x4, writing 0xd2c00004) > [ 45.659082] ath9k 0000:02:00.0: restoring config space at offset 0x3 (was 0x0, writing 0x10) > [ 45.659088] ath9k 0000:02:00.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100007) > [ 45.659129] sdhci-pci 0000:06:00.0: restoring config space at offset 0xf (was 0x2ff, writing 0x204) > [ 45.659157] sdhci-pci 0000:06:00.0: restoring config space at offset 0x4 (was 0x0, writing 0xd0407000) > [ 45.659163] sdhci-pci 0000:06:00.0: restoring config space at offset 0x3 (was 0x800000, writing 0x800010) > [ 45.659172] sdhci-pci 0000:06:00.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100007) > [ 45.659218] pci 0000:06:00.2: restoring config space at offset 0xf (was 0x2ff, writing 0x204) > [ 45.659246] pci 0000:06:00.2: restoring config space at offset 0x4 (was 0x0, writing 0xd0406000) > [ 45.659253] pci 0000:06:00.2: restoring config space at offset 0x3 (was 0x800000, writing 0x800010) > [ 45.659262] pci 0000:06:00.2: restoring config space at offset 0x1 (was 0x100000, writing 0x100003) > [ 45.659308] jmb38x_ms 0000:06:00.3: restoring config space at offset 0xf (was 0x2ff, writing 0x204) > [ 45.659336] jmb38x_ms 0000:06:00.3: restoring config space at offset 0x4 (was 0x0, writing 0xd0405000) > [ 45.659342] jmb38x_ms 0000:06:00.3: restoring config space at offset 0x3 (was 0x800000, writing 0x800010) > [ 45.659351] jmb38x_ms 0000:06:00.3: restoring config space at offset 0x1 (was 0x100000, writing 0x100007) > [ 45.659397] pci 0000:06:00.4: restoring config space at offset 0xf (was 0x2ff, writing 0x204) > [ 45.659425] pci 0000:06:00.4: restoring config space at offset 0x4 (was 0x0, writing 0xd0404000) > [ 45.659431] pci 0000:06:00.4: restoring config space at offset 0x3 (was 0x800000, writing 0x800010) > [ 45.659441] pci 0000:06:00.4: restoring config space at offset 0x1 (was 0x100000, writing 0x100007) > [ 45.659487] jme 0000:06:00.5: restoring config space at offset 0xf (was 0x100, writing 0x103) > [ 45.659508] jme 0000:06:00.5: restoring config space at offset 0x7 (was 0x1, writing 0xa001) > [ 45.659520] jme 0000:06:00.5: restoring config space at offset 0x4 (was 0x0, writing 0xd0400000) > [ 45.659526] jme 0000:06:00.5: restoring config space at offset 0x3 (was 0x800000, writing 0x800010) > [ 45.659535] jme 0000:06:00.5: restoring config space at offset 0x1 (was 0x100000, writing 0x100007) > [ 45.659759] PM: early resume of devices complete after 1.603 msecs > [ 45.659979] i915 0000:00:02.0: setting latency timer to 64 > [ 45.659996] snd_hda_intel 0000:00:1b.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22 > [ 45.660003] snd_hda_intel 0000:00:1b.0: setting latency timer to 64 > [ 45.660050] snd_hda_intel 0000:00:1b.0: irq 45 for MSI/MSI-X > [ 45.660126] pci 0000:00:1e.0: setting latency timer to 64 > [ 45.660144] ahci 0000:00:1f.2: setting latency timer to 64 > [ 45.660211] ath: FULL-SLEEP -> AWAKE > [ 45.663932] sdhci-pci 0000:06:00.0: PCI INT B -> GSI 18 (level, low) -> IRQ 18 > [ 45.663974] sdhci-pci 0000:06:00.0: setting latency timer to 64 > [ 45.664037] jmb38x_ms 0000:06:00.3: PCI INT B -> GSI 18 (level, low) -> IRQ 18 > [ 45.664051] jmb38x_ms 0000:06:00.3: setting latency timer to 64 > [ 45.664113] jme 0000:06:00.5: wake-up capability disabled by ACPI > [ 45.664124] jme 0000:06:00.5: PME# disabled > [ 45.664503] ath: disable IER > [ 45.664521] ath: Disable MIB counters > [ 45.664560] ath: disable IER > [ 45.664575] ath: 0x30 => 0x30 > [ 45.664576] ath: new IMR 0x30 > [ 45.664672] sd 0:0:0:0: [sda] Starting disk > [ 45.667161] ath: Getting spur idx:0 is2Ghz:1 val:8000 > [ 45.668810] ath: Reset TX queue: 0 > [ 45.668813] ath: tx ok 0x0 err 0x0 desc 0x101 eol 0x101 urn 0x0 > [ 45.668815] ath: Reset TX queue: 1 > [ 45.668817] ath: tx ok 0x0 err 0x0 desc 0x103 eol 0x103 urn 0x0 > [ 45.668819] ath: Reset TX queue: 2 > [ 45.668821] ath: tx ok 0x0 err 0x0 desc 0x107 eol 0x107 urn 0x0 > [ 45.668823] ath: Reset TX queue: 3 > [ 45.668825] ath: tx ok 0x0 err 0x0 desc 0x10f eol 0x10f urn 0x0 > [ 45.668827] ath: Reset TXQ, inactive queue: 4 > [ 45.668828] ath: Reset TXQ, inactive queue: 5 > [ 45.668830] ath: Reset TXQ, inactive queue: 6 > [ 45.668831] ath: Reset TXQ, inactive queue: 7 > [ 45.668832] ath: Reset TX queue: 8 > [ 45.668842] ath: tx ok 0x0 err 0x0 desc 0x10f eol 0x10f urn 0x0 > [ 45.668844] ath: Reset TX queue: 9 > [ 45.668854] ath: tx ok 0x0 err 0x0 desc 0x10f eol 0x10f urn 0x0 > [ 45.668861] ath: ah->misc_mode 0x4 > [ 45.670264] ath: Running PA Calibration > [ 45.670963] ath: enabling ADC Gain Calibration. > [ 45.670964] ath: enabling ADC DC Calibration. > [ 45.670965] ath: enabling IQ Calibration. > [ 45.670969] ath: starting ADC Gain Calibration > [ 45.671139] ath: AWAKE -> FULL-SLEEP > [ 45.679552] Extended CMOS year: 2000 > [ 45.983193] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) > [ 45.985133] ata2.00: ACPI cmd ef/10:06:00:00:00:a0 (SET FEATURES) succeeded > [ 45.985136] ata2.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out > [ 45.988885] ata2.00: ACPI cmd ef/10:06:00:00:00:a0 (SET FEATURES) succeeded > [ 45.988889] ata2.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out > [ 45.989852] ata6: SATA link down (SStatus 0 SControl 300) > [ 45.990375] ata2.00: configured for UDMA/100 > [ 46.120272] irq 17: nobody cared (try booting with the "irqpoll" option) > [ 46.120275] Pid: 0, comm: swapper Tainted: G C 3.1.0-rc7-wl-ARCH3.1.0-rc7-ath9k-debug #2 > [ 46.120277] Call Trace: > [ 46.120280] <IRQ> [<ffffffff810c68bd>] __report_bad_irq+0x3d/0xe0 > [ 46.120290] [<ffffffff810c6cfd>] note_interrupt+0x14d/0x210 > [ 46.120293] [<ffffffff810c4c39>] handle_irq_event_percpu+0xc9/0x2a0 > [ 46.120296] [<ffffffff810c4e58>] handle_irq_event+0x48/0x70 > [ 46.120299] [<ffffffff810c763a>] handle_fasteoi_irq+0x5a/0xe0 > [ 46.120303] [<ffffffff81016982>] handle_irq+0x22/0x40 > [ 46.120308] [<ffffffff813ec09a>] do_IRQ+0x5a/0xe0 > [ 46.120312] [<ffffffff813e8cee>] common_interrupt+0x6e/0x6e > [ 46.120313] <EOI> [<ffffffff8131234a>] ? poll_idle+0x3a/0x80 > [ 46.120319] [<ffffffff81312323>] ? poll_idle+0x13/0x80 > [ 46.120321] [<ffffffff813126f6>] cpuidle_idle_call+0xc6/0x350 > [ 46.120324] [<ffffffff81013229>] cpu_idle+0xc9/0x120 > [ 46.120328] [<ffffffff813c64e7>] rest_init+0x9b/0xa4 > [ 46.120330] [<ffffffff8174dc15>] start_kernel+0x3bf/0x3cc > [ 46.120333] [<ffffffff8174d347>] x86_64_start_reservations+0x132/0x136 > [ 46.120336] [<ffffffff8174d140>] ? early_idt_handlers+0x140/0x140 > [ 46.120338] [<ffffffff8174d44d>] x86_64_start_kernel+0x102/0x111 > [ 46.120340] handlers: > [ 46.120344] [<ffffffffa053c020>] ath_isr > [ 46.120345] Disabling IRQ #17 > [ 48.371839] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) > [ 48.479286] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out > [ 48.479457] ata1.00: ACPI cmd ef/10:06:00:00:00:a0 (SET FEATURES) succeeded > [ 48.479463] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out > [ 48.516626] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out > [ 48.516804] ata1.00: ACPI cmd ef/10:06:00:00:00:a0 (SET FEATURES) succeeded > [ 48.516809] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out > [ 48.518731] ata1.00: configured for UDMA/133 > [ 48.526624] ath: FULL-SLEEP -> AWAKE > [ 48.530116] ath: ah->misc_mode 0x4 > [ 48.530142] ath: AWAKE -> FULL-SLEEP > [ 48.530172] PM: resume of devices complete after 2875.994 msecs > [ 48.530607] PM: Finishing wakeup. > [ 48.530608] Restarting tasks ... done. > [ 48.533504] video LNXVIDEO:02: Restoring backlight state > -- shafi ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-09-29 8:18 ` Mohammed Shafi @ 2011-09-29 10:33 ` Adrian Chadd 2011-09-29 17:11 ` Clemens Buchacher [not found] ` <CAD2nsn0Z2J4r4tN_fLjx5bbvz2bg6NVcQ8vppJbbNcgOF8pFew@mail.gmail.com> 0 siblings, 2 replies; 38+ messages in thread From: Adrian Chadd @ 2011-09-29 10:33 UTC (permalink / raw) To: Mohammed Shafi; +Cc: Clemens Buchacher, linux-wireless, beta992 Hm, Has someone figured out which ISR bits are being triggered? If not, I can likely whip up a patch which adds some relevant printk's. I think it's worth establishing: * is it a sync/async interrupt; * is it a fatal interrupt (eg something like a PCI bus error or transaction timeout) or is it a normal ISR bit that keeps firing; * .. and which one is firing. Adrian ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-09-29 10:33 ` Adrian Chadd @ 2011-09-29 17:11 ` Clemens Buchacher [not found] ` <CAD2nsn0Z2J4r4tN_fLjx5bbvz2bg6NVcQ8vppJbbNcgOF8pFew@mail.gmail.com> 1 sibling, 0 replies; 38+ messages in thread From: Clemens Buchacher @ 2011-09-29 17:11 UTC (permalink / raw) To: Adrian Chadd; +Cc: Mohammed Shafi, linux-wireless, beta992 Hi Adrian, On Thu, Sep 29, 2011 at 06:33:42PM +0800, Adrian Chadd wrote: > > Has someone figured out which ISR bits are being triggered? No. I tried to move the SC_OP_INVALID check until after the call to ath9k_hw_getisr in ath_isr. But that caused the kernel to freeze IIRC. What I also tried and failed to do a hardware reset before request_irq gets called. For ath9k_hw_reset I need an initialized ath_hw struct, but that's done after request_irq in ath9k_init_device. I remember I tried to delay request_irq, but for some reason that did not work out either. > If not, I can likely whip up a patch which adds some relevant > printk's. I think it's worth establishing: > > * is it a sync/async interrupt; > * is it a fatal interrupt (eg something like a PCI bus error or > transaction timeout) or is it a normal ISR bit that keeps firing; > * .. and which one is firing. If you have an idea how to do this, I think that would be most helpful. Clemens ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <CAD2nsn0Z2J4r4tN_fLjx5bbvz2bg6NVcQ8vppJbbNcgOF8pFew@mail.gmail.com>]
[parent not found: <CAJ-VmokcM4KmzV7Rn9PA68iEiTJiPw=ffYgNFLDAHShxD0HNAg@mail.gmail.com>]
* Re: ath9k: irq storm after suspend/resume [not found] ` <CAJ-VmokcM4KmzV7Rn9PA68iEiTJiPw=ffYgNFLDAHShxD0HNAg@mail.gmail.com> @ 2011-10-03 8:48 ` Clemens Buchacher 2011-10-04 7:58 ` Adrian Chadd 0 siblings, 1 reply; 38+ messages in thread From: Clemens Buchacher @ 2011-10-03 8:48 UTC (permalink / raw) To: Adrian Chadd; +Cc: Mohammed Shafi, linux-wireless (re-added linux-wireless) On Fri, Sep 30, 2011 at 01:46:06PM +0800, Adrian Chadd wrote: > > if (sc->sc_flags & SC_OP_INVALID) { > printk(KERN_ERR "AR_INTR_SYNC_CAUSE: 0x%08x, AR_INTR_ASYNC_CAUSE: 0x%08x\n", > REG_READ(ah, AR_INTR_SYNC_CAUSE), REG_READ(ah, AR_INTR_ASYNC_CAUSE)); > printk(KERN_ERR "AR_IMR: 0x%08x, AR_ISR: 0x%08x\n", > REG_READ(ah, AR_IMR), > REG_READ(ah, AR_ISR)); > return IRQ_NONE; > } I only output the first 99 and then every 2^n'th interrupt. The log gets truncated otherwise. If I load the module first, then suspend/resume, I get this until "nobody cared". for 1, 2, ..., 99, 128, 256: AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 AR_IMR: 0xdeadbeef, AR_ISR: 0xdeadbeef for 512: AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 AR_IMR: 0x00000000, AR_ISR: 0x00000208 for 1024: AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 AR_IMR: 0x81800964, AR_ISR: 0x00000008 for 2048: AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 AR_IMR: 0xdeadbeef, AR_ISR: 0xdeadbeef for 2^12, 2^13, ..., 2^16 AR_INTR_SYNC_CAUSE: 0x00020000, AR_INTR_ASYNC_CAUSE: 0x00000000 AR_IMR: 0xdeadbeef, AR_ISR: 0xdeadbeef The last ones indeed correspond to AR_INTR_SYNC_MAC_SLEEP_ACCESS that you mentioned. Note that this output is interleaved with device initialization. So I don't know if that causes the register contents to become valid, or if it changes the contents. If I suspend/resume first, then load the module, then wait for 500 ms after request_irq I get only zeroes, repeated 109 times: for 1, 2, ..., 99, 2^7, 2^8, ..., 2^16: AR_IMR: 0x00000000, AR_ISR: 0x00000000 AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 Clemens --- diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c index edaa784..634256b 100644 --- a/drivers/net/wireless/ath/ath9k/main.c +++ b/drivers/net/wireless/ath/ath9k/main.c @@ -16,6 +16,7 @@ #include <linux/nl80211.h> #include <linux/delay.h> +#include <linux/log2.h> #include "ath9k.h" #include "btcoex.h" @@ -765,8 +766,19 @@ irqreturn_t ath_isr(int irq, void *dev) * touch anything. Note this can happen early * on if the IRQ is shared. */ - if (sc->sc_flags & SC_OP_INVALID) + if (sc->sc_flags & SC_OP_INVALID) { + static int count = 0; + if (count < INT_MAX) + count++; + if (count < 100 || is_power_of_2(count)) { + printk(KERN_ERR "AR_INTR_SYNC_CAUSE: 0x%08x, AR_INTR_ASYNC_CAUSE: 0x%08x\n", + REG_READ(ah, AR_INTR_SYNC_CAUSE), REG_READ(ah, AR_INTR_ASYNC_CAUSE)); + printk(KERN_ERR "AR_IMR: 0x%08x, AR_ISR: 0x%08x\n", + REG_READ(ah, AR_IMR), + REG_READ(ah, AR_ISR)); + } return IRQ_NONE; + } /* shared irq, not for us */ diff --git a/drivers/net/wireless/ath/ath9k/pci.c b/drivers/net/wireless/ath/ath9k/pci.c index d67d6ee..76006a9 100644 --- a/drivers/net/wireless/ath/ath9k/pci.c +++ b/drivers/net/wireless/ath/ath9k/pci.c @@ -246,20 +246,24 @@ static int ath_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) /* Will be cleared in ath9k_start() */ sc->sc_flags |= SC_OP_INVALID; + ret = ath9k_init_device(id->device, sc, &ath_pci_bus_ops); + if (ret) { + dev_err(&pdev->dev, "Failed to initialize device\n"); + goto err_init; + } + + printk(KERN_INFO "ath9k: requesting irq\n"); ret = request_irq(pdev->irq, ath_isr, IRQF_SHARED, "ath9k", sc); if (ret) { dev_err(&pdev->dev, "request_irq failed\n"); goto err_irq; } + printk(KERN_INFO "wait 500 ms\n"); + msleep(500); + printk(KERN_INFO "done waiting\n"); sc->irq = pdev->irq; - ret = ath9k_init_device(id->device, sc, &ath_pci_bus_ops); - if (ret) { - dev_err(&pdev->dev, "Failed to initialize device\n"); - goto err_init; - } - ath9k_hw_name(sc->sc_ah, hw_name, sizeof(hw_name)); wiphy_info(hw->wiphy, "%s mem=0x%lx, irq=%d\n", hw_name, (unsigned long)mem, pdev->irq); ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-03 8:48 ` Clemens Buchacher @ 2011-10-04 7:58 ` Adrian Chadd 2011-10-04 18:15 ` Clemens Buchacher 2011-10-04 18:36 ` Clemens Buchacher 0 siblings, 2 replies; 38+ messages in thread From: Adrian Chadd @ 2011-10-04 7:58 UTC (permalink / raw) To: Clemens Buchacher; +Cc: Mohammed Shafi, linux-wireless Right, this looks a bit evil. ;) On 3 October 2011 16:48, Clemens Buchacher <drizzd@aon.at> wrote: > for 1, 2, ..., 99, 128, 256: > AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 > AR_IMR: 0xdeadbeef, AR_ISR: 0xdeadbeef .. that says some part of the MAC is very asleep. But no interrupt cause bits are set. So I bet it's a shared interrupt. > for 512: > AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 > AR_IMR: 0x00000000, AR_ISR: 0x00000208 That looks valid, but why are you receiving an interrupt here? Is it some other device on a shared irq? Or a non-acked interrupt? Ie - there's no sync/async cause bits set and IMR is 0, the status bits are no tx pkt / no rx pkt. So that looks fine. You shouldn't have received an interrupt. > for 1024: > AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 > AR_IMR: 0x81800964, AR_ISR: 0x00000008 .. hm, why'd you receive an interrupt here? the ISR bits aren't set that way. Again, I think this is another shared or non-acked interrupt coming in. > for 2048: > AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 > AR_IMR: 0xdeadbeef, AR_ISR: 0xdeadbeef > > for 2^12, 2^13, ..., 2^16 > AR_INTR_SYNC_CAUSE: 0x00020000, AR_INTR_ASYNC_CAUSE: 0x00000000 > AR_IMR: 0xdeadbeef, AR_ISR: 0xdeadbeef > > The last ones indeed correspond to AR_INTR_SYNC_MAC_SLEEP_ACCESS > that you mentioned. Note that this output is interleaved with > device initialization. So I don't know if that causes the register > contents to become valid, or if it changes the contents. > > If I suspend/resume first, then load the module, then wait for 500 > ms after request_irq I get only zeroes, repeated 109 times: > > for 1, 2, ..., 99, 2^7, 2^8, ..., 2^16: > AR_IMR: 0x00000000, AR_ISR: 0x00000000 > AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 Would you mind also printing the contents of AR_IER there? I wonder if AR_IER is incorrectly being enabled. I think this may be a shared interrupt. What else is using the same IRQ line? What about only printing out those lines if the interrupt is actually generated by the NIC? There's a function further down in the isr routine which checks whether the NIC itself posted the interrupt (it checks SYNC_CAUSE/ASYNC_CAUSE.) That should cut down the spam quite a bit and only log the relevant entries for ath9k. Adrian ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-04 7:58 ` Adrian Chadd @ 2011-10-04 18:15 ` Clemens Buchacher 2011-10-04 21:11 ` Adrian Chadd 2011-10-04 18:36 ` Clemens Buchacher 1 sibling, 1 reply; 38+ messages in thread From: Clemens Buchacher @ 2011-10-04 18:15 UTC (permalink / raw) To: Adrian Chadd; +Cc: Mohammed Shafi, linux-wireless On Tue, Oct 04, 2011 at 03:58:09PM +0800, Adrian Chadd wrote: > > On 3 October 2011 16:48, Clemens Buchacher <drizzd@aon.at> wrote: > > > for 1, 2, ..., 99, 128, 256: > > AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 > > AR_IMR: 0xdeadbeef, AR_ISR: 0xdeadbeef > > .. that says some part of the MAC is very asleep. But no interrupt > cause bits are set. So I bet it's a shared interrupt. Unfortunately, that's not it. From /proc/interrupts: CPU0 CPU1 CPU2 CPU3 17: 24958 25047 25012 24984 IO-APIC-fasteoi ath9k This is right after the IRQ is disabled because of too many unhandled interrupts. > > If I suspend/resume first, then load the module, then wait for 500 > > ms after request_irq I get only zeroes, repeated 109 times: > > > > for 1, 2, ..., 99, 2^7, 2^8, ..., 2^16: > > AR_IMR: 0x00000000, AR_ISR: 0x00000000 > > AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 > > Would you mind also printing the contents of AR_IER there? I wonder if > AR_IER is incorrectly being enabled. Not much more helpful, I'm afraid: AR_IMR: 0x00000000, AR_ISR: 0x00000000, AR_IER: 0x00000000 AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 (repeated 109 times) It seems that at this point the device is in a bad state. Clemens ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-04 18:15 ` Clemens Buchacher @ 2011-10-04 21:11 ` Adrian Chadd 2011-10-05 6:28 ` Clemens Buchacher 0 siblings, 1 reply; 38+ messages in thread From: Adrian Chadd @ 2011-10-04 21:11 UTC (permalink / raw) To: Clemens Buchacher; +Cc: Mohammed Shafi, linux-wireless On 5 October 2011 02:15, Clemens Buchacher <drizzd@aon.at> wrote: >> Would you mind also printing the contents of AR_IER there? I wonder if >> AR_IER is incorrectly being enabled. > > Not much more helpful, I'm afraid: > > AR_IMR: 0x00000000, AR_ISR: 0x00000000, AR_IER: 0x00000000 > AR_INTR_SYNC_CAUSE: 0x00000000, AR_INTR_ASYNC_CAUSE: 0x00000000 > (repeated 109 times) > > It seems that at this point the device is in a bad state. Well, the question now is whether the device itself is actually firing off interrupts. >From what you've shown there, the device _isn't_ firing off interrupts the normal way. So I wonder if it's the card or whether it's something else (eg the PCI bus glue.) Here, you could try inserting this in the SC_OP_INVALID bit, right after you do some logging. I'm still assuming your AR9285 is somehow firing off interrupts. At this point however I'm all just guessing here. It'd be nice to know whether the device is _really_ sending interrupts or not. /* Mark _all_ async interrupts as serviced */ REG_WRITE(ah, AR_INTR_ASYNC_CAUSE, 0xffffffff); REG_READ(ah, AR_INTR_ASYNC_CAUSE); /* Disable sync interrupt sources */ REG_WRITE(ah, AR_INTR_SYNC_ENABLE, 0); /* Mark _all_ sync interrupts as serviced */ REG_WRITE(ah, AR_INTR_SYNC_CAUSE, 0xffffffff); REG_READ(ah, AR_INTR_SYNC_CAUSE); /* Do a reset of the AHB and host interface */ REG_WRITE(ah, AR_RC, AR_RC_AHB | AR_RC_HOSTIF); DELAY(100); REG_WRITE(ah, AR_RC, 0); ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-04 21:11 ` Adrian Chadd @ 2011-10-05 6:28 ` Clemens Buchacher 2011-10-05 13:02 ` Adrian Chadd 0 siblings, 1 reply; 38+ messages in thread From: Clemens Buchacher @ 2011-10-05 6:28 UTC (permalink / raw) To: Adrian Chadd; +Cc: Mohammed Shafi, linux-wireless On Wed, Oct 05, 2011 at 05:11:00AM +0800, Adrian Chadd wrote: > > Well, the question now is whether the device itself is actually firing > off interrupts. > > From what you've shown there, the device _isn't_ firing off interrupts > the normal way. > So I wonder if it's the card or whether it's something else (eg the > PCI bus glue.) > > Here, you could try inserting this in the SC_OP_INVALID bit, right > after you do some logging. I'm still assuming your AR9285 is somehow > firing off interrupts. > At this point however I'm all just guessing here. It'd be nice to know > whether the device is _really_ sending interrupts or not. No change with the code below, except that the timing is slightly different. > /* Mark _all_ async interrupts as serviced */ > REG_WRITE(ah, AR_INTR_ASYNC_CAUSE, 0xffffffff); > REG_READ(ah, AR_INTR_ASYNC_CAUSE); > > /* Disable sync interrupt sources */ > REG_WRITE(ah, AR_INTR_SYNC_ENABLE, 0); > /* Mark _all_ sync interrupts as serviced */ > REG_WRITE(ah, AR_INTR_SYNC_CAUSE, 0xffffffff); > REG_READ(ah, AR_INTR_SYNC_CAUSE); > > /* Do a reset of the AHB and host interface */ > REG_WRITE(ah, AR_RC, AR_RC_AHB | AR_RC_HOSTIF); > DELAY(100); Since I could not find a suitable DELAY macro, I used this instead: int delay = 100; DO_DELAY(delay); > REG_WRITE(ah, AR_RC, 0); For the record, instead of going to full sleep, I can reproduce the issue with echo devices >/sys/power/pm_test echo mem >/sys/power/state but not with echo freezer >/sys/power/pm_test echo mem >/sys/power/state So it could still be a different device that's causing the issue? I can try unloading all kinds of modules. Or is there an option to suspend just one device? Clemens ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-05 6:28 ` Clemens Buchacher @ 2011-10-05 13:02 ` Adrian Chadd 2011-10-12 13:10 ` Mohammed Shafi 0 siblings, 1 reply; 38+ messages in thread From: Adrian Chadd @ 2011-10-05 13:02 UTC (permalink / raw) To: Clemens Buchacher; +Cc: Mohammed Shafi, linux-wireless On 5 October 2011 14:28, Clemens Buchacher <drizzd@aon.at> wrote: > For the record, instead of going to full sleep, I can reproduce the > issue with > > echo devices >/sys/power/pm_test > echo mem >/sys/power/state > > but not with > > echo freezer >/sys/power/pm_test > echo mem >/sys/power/state > > So it could still be a different device that's causing the issue? I > can try unloading all kinds of modules. Or is there an option to > suspend just one device? Right. So its likely something (more) odd. Yes, see if that's the case :) At this point I'm going to punt this to the power management/ACPI people .:) Good luck! Adrian ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-05 13:02 ` Adrian Chadd @ 2011-10-12 13:10 ` Mohammed Shafi 2011-10-15 9:39 ` Clemens Buchacher 0 siblings, 1 reply; 38+ messages in thread From: Mohammed Shafi @ 2011-10-12 13:10 UTC (permalink / raw) To: Adrian Chadd, Clemens Buchacher; +Cc: linux-wireless Hi Clemens, please try this in case it helps anything to debug the issue. this should avoid dead beef and the interrupt when we access MAC registers when the MAC is asleep. also i did not see MAC irq being not triggerred in your case(AR_INTR_ASYNC_CAUSE should be 0x2) or may be its not an interrupt triggered by the chip itself. thanks diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c index 93fbe6f..51b7c54 100644 --- a/drivers/net/wireless/ath/ath9k/main.c +++ b/drivers/net/wireless/ath/ath9k/main.c @@ -770,6 +770,7 @@ irqreturn_t ath_isr(int irq, void *dev) enum ath9k_int status; bool sched = false; + ath9k_ps_wakeup(sc); /* * The hardware is not ready/present, don't * touch anything. Note this can happen early On Wed, Oct 5, 2011 at 6:32 PM, Adrian Chadd <adrian@freebsd.org> wrote: > On 5 October 2011 14:28, Clemens Buchacher <drizzd@aon.at> wrote: >> For the record, instead of going to full sleep, I can reproduce the >> issue with >> >> echo devices >/sys/power/pm_test >> echo mem >/sys/power/state >> >> but not with >> >> echo freezer >/sys/power/pm_test >> echo mem >/sys/power/state >> >> So it could still be a different device that's causing the issue? I >> can try unloading all kinds of modules. Or is there an option to >> suspend just one device? > > Right. So its likely something (more) odd. Yes, see if that's the case > :) At this point I'm going to punt this to the power management/ACPI > people .:) > > Good luck! > > > Adrian > -- shafi ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-12 13:10 ` Mohammed Shafi @ 2011-10-15 9:39 ` Clemens Buchacher 2011-10-15 10:01 ` Adrian Chadd 0 siblings, 1 reply; 38+ messages in thread From: Clemens Buchacher @ 2011-10-15 9:39 UTC (permalink / raw) To: Mohammed Shafi; +Cc: Adrian Chadd, linux-wireless Hi Mohammed, On Wed, Oct 12, 2011 at 06:40:06PM +0530, Mohammed Shafi wrote: > > please try this in case it helps anything > to debug the issue. With this, the previous instaces of 0xdeadbeaf are replaced with zeroes, and I do not get any non-zero traces for SYNC_CAUSE or ASYNC_CAUSE any more. > this should avoid dead beef and the interrupt when we access MAC > registers when the MAC is asleep. also i did not see MAC irq being not > triggerred in your case(AR_INTR_ASYNC_CAUSE should be 0x2) or may be > its not an interrupt triggered by the chip itself. thanks If it's not triggered by ath9k, I do wonder why it works if I suspend in connected mode, and why reloading the driver usually fixes the issue. Maybe there is something in the driver unloading code that we could look at? Clemens ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-15 9:39 ` Clemens Buchacher @ 2011-10-15 10:01 ` Adrian Chadd 2011-10-18 6:44 ` Clemens Buchacher 0 siblings, 1 reply; 38+ messages in thread From: Adrian Chadd @ 2011-10-15 10:01 UTC (permalink / raw) To: Clemens Buchacher; +Cc: Mohammed Shafi, linux-wireless I've found some perhaps relevant bits of code here which force the chip into and out of reset/sleep states. It's possible that your BIOS doesn't fully reset the NIC upon startup. It's also possible there are issues with the ath9k power save/resume code. It's also possible they're all compounding to make things worse. Are you able to completely and utterly disable any/all power saving on ath9k, including APSM if you have it enabled? Unfortunately the information I have is a bit sketchy. I'll do some further digging (and hope I don't bug Atheros people too much :) to see if I can extract out some more information about what's going on. Thanks, Adrian ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-15 10:01 ` Adrian Chadd @ 2011-10-18 6:44 ` Clemens Buchacher 2011-10-18 7:05 ` Adrian Chadd 0 siblings, 1 reply; 38+ messages in thread From: Clemens Buchacher @ 2011-10-18 6:44 UTC (permalink / raw) To: Adrian Chadd; +Cc: Mohammed Shafi, linux-wireless On Sat, Oct 15, 2011 at 06:01:38PM +0800, Adrian Chadd wrote: > > Are you able to completely and utterly disable any/all power saving on > ath9k, including APSM if you have it enabled? Unfortunately, no change with the following options disabled: PCIEAER y -> n PCIEASPM y -> n PM_RUNTIME y -> n Anything else I should disable? Clemens ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-18 6:44 ` Clemens Buchacher @ 2011-10-18 7:05 ` Adrian Chadd 2011-10-21 10:22 ` Clemens Buchacher 0 siblings, 1 reply; 38+ messages in thread From: Adrian Chadd @ 2011-10-18 7:05 UTC (permalink / raw) To: Clemens Buchacher; +Cc: Mohammed Shafi, linux-wireless Hm, which version of the AR9285 is it, exactly? Would you just do a quick check (or code hack) and see if it matches the AR_SREV_9285SE_20 (or similar) macro? (I see a WAR in the reference code for that particular model AR9285, that isn't in ath9k.) I'm honestly still not convinced that it's (soley) the AR9285. You _could_ do something naughty like disabling all interrupts and then powering off the NIC (but not detaching it from the bus glue, like unloading the module does.) Good luck, Adrian ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-18 7:05 ` Adrian Chadd @ 2011-10-21 10:22 ` Clemens Buchacher 2011-10-21 14:10 ` [ath9k-devel] " Adrian Chadd 0 siblings, 1 reply; 38+ messages in thread From: Clemens Buchacher @ 2011-10-21 10:22 UTC (permalink / raw) To: Adrian Chadd; +Cc: Mohammed Shafi, linux-wireless On Tue, Oct 18, 2011 at 03:05:38PM +0800, Adrian Chadd wrote: > > Hm, which version of the AR9285 is it, exactly? > Would you just do a quick check (or code hack) and see if it matches > the AR_SREV_9285SE_20 (or similar) macro? It matches AR_SREV_9285, but it does not match AR_SREV_9285E_20. Clemens ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-21 10:22 ` Clemens Buchacher @ 2011-10-21 14:10 ` Adrian Chadd 0 siblings, 0 replies; 38+ messages in thread From: Adrian Chadd @ 2011-10-21 14:10 UTC (permalink / raw) To: Clemens Buchacher; +Cc: Mohammed Shafi, linux-wireless, ath9k-devel On 21 October 2011 18:22, Clemens Buchacher <drizzd@aon.at> wrote: > It matches AR_SREV_9285, but it does not match AR_SREV_9285E_20. Ok. Well, I've read the ar9285 host documentation that I've gotten my hands on and I can't see anything obvious. All I can think of is that something else is throwing those interrupts - whether it's the NIC throwing the error, or whether its the bus itself throwing an error and you're seeing the AR9285 because it's the only thing attached. I however know next to nothing about PCI/PCIe and the various fruity behaviour seen. I assume you have disabled the NIC power save option. Not just APSM, etc, but the powersave options in iw. I don't know what they are, sorry, only that they exist. I suggest modifying ath_intr() so if there is an interrupt storm, disable _all_ sources of interrupts (ie, set sync_cause, async_cause to 0, set imr to 0, set ier to 0) and see if it continues occuring. If the interrupt storm continue occuring, try putting the whole NIC to sleep entirely (ie, call pcipowersave to put the NIC to sleep, then make sure the NIC -stays- asleep somehow, ie so nothing else tries waking it up - I don't know how to do this, likely you'll have to clear some interface flag so the rest of the ath9k and 802.11 stack thinks the nic is actually down for the count) and see if you're still getting spurious interrupts. If you are at this point, the only other possible thing I can think of doing involves a PCIe bus analyser, and you/I don't have access to those. :-) Adrian ^ permalink raw reply [flat|nested] 38+ messages in thread
* [ath9k-devel] ath9k: irq storm after suspend/resume @ 2011-10-21 14:10 ` Adrian Chadd 0 siblings, 0 replies; 38+ messages in thread From: Adrian Chadd @ 2011-10-21 14:10 UTC (permalink / raw) To: ath9k-devel On 21 October 2011 18:22, Clemens Buchacher <drizzd@aon.at> wrote: > It matches AR_SREV_9285, but it does not match AR_SREV_9285E_20. Ok. Well, I've read the ar9285 host documentation that I've gotten my hands on and I can't see anything obvious. All I can think of is that something else is throwing those interrupts - whether it's the NIC throwing the error, or whether its the bus itself throwing an error and you're seeing the AR9285 because it's the only thing attached. I however know next to nothing about PCI/PCIe and the various fruity behaviour seen. I assume you have disabled the NIC power save option. Not just APSM, etc, but the powersave options in iw. I don't know what they are, sorry, only that they exist. I suggest modifying ath_intr() so if there is an interrupt storm, disable _all_ sources of interrupts (ie, set sync_cause, async_cause to 0, set imr to 0, set ier to 0) and see if it continues occuring. If the interrupt storm continue occuring, try putting the whole NIC to sleep entirely (ie, call pcipowersave to put the NIC to sleep, then make sure the NIC -stays- asleep somehow, ie so nothing else tries waking it up - I don't know how to do this, likely you'll have to clear some interface flag so the rest of the ath9k and 802.11 stack thinks the nic is actually down for the count) and see if you're still getting spurious interrupts. If you are at this point, the only other possible thing I can think of doing involves a PCIe bus analyser, and you/I don't have access to those. :-) Adrian ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-21 14:10 ` [ath9k-devel] " Adrian Chadd @ 2011-10-21 19:03 ` Clemens Buchacher -1 siblings, 0 replies; 38+ messages in thread From: Clemens Buchacher @ 2011-10-21 19:03 UTC (permalink / raw) To: Adrian Chadd; +Cc: Mohammed Shafi, linux-wireless, ath9k-devel On Fri, Oct 21, 2011 at 10:10:21PM +0800, Adrian Chadd wrote: > > If the interrupt storm continue occuring, try putting the whole NIC to > sleep entirely (ie, call pcipowersave to put the NIC to sleep, then > make sure the NIC -stays- asleep somehow, ie so nothing else tries > waking it up. Is it possible to figure out whether or not it is asleep in the interrupt routine? ^ permalink raw reply [flat|nested] 38+ messages in thread
* [ath9k-devel] ath9k: irq storm after suspend/resume @ 2011-10-21 19:03 ` Clemens Buchacher 0 siblings, 0 replies; 38+ messages in thread From: Clemens Buchacher @ 2011-10-21 19:03 UTC (permalink / raw) To: ath9k-devel On Fri, Oct 21, 2011 at 10:10:21PM +0800, Adrian Chadd wrote: > > If the interrupt storm continue occuring, try putting the whole NIC to > sleep entirely (ie, call pcipowersave to put the NIC to sleep, then > make sure the NIC -stays- asleep somehow, ie so nothing else tries > waking it up. Is it possible to figure out whether or not it is asleep in the interrupt routine? ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-21 14:10 ` [ath9k-devel] " Adrian Chadd @ 2011-10-21 20:20 ` Clemens Buchacher -1 siblings, 0 replies; 38+ messages in thread From: Clemens Buchacher @ 2011-10-21 20:20 UTC (permalink / raw) To: Adrian Chadd; +Cc: Mohammed Shafi, linux-wireless, ath9k-devel On Fri, Oct 21, 2011 at 10:10:21PM +0800, Adrian Chadd wrote: > > I assume you have disabled the NIC power save option. Not just APSM, > etc, but the powersave options in iw. I don't know what they are, > sorry, only that they exist. I am not sure if that's what you mean, but if I set "iwconfig wlan0 power off" it makes no difference. > I suggest modifying ath_intr() so if there is an interrupt storm, > disable _all_ sources of interrupts (ie, set sync_cause, async_cause > to 0, set imr to 0, set ier to 0) and see if it continues occuring. They keep coming with this too. > If the interrupt storm continue occuring, try putting the whole NIC to > sleep entirely (ie, call pcipowersave to put the NIC to sleep, then > make sure the NIC -stays- asleep somehow, ie so nothing else tries > waking it up - I don't know how to do this, likely you'll have to > clear some interface flag so the rest of the ath9k and 802.11 stack > thinks the nic is actually down for the count) and see if you're still > getting spurious interrupts. I am now calling ath9k_hw_ops(sc->sc_ah)->config_pci_powersave(sc->sc_ah, true); ath9k_setpower(sc, ATH9K_PM_FULL_SLEEP); right before request_irq, and the interrupts still come. And the device is in FULL_SLEEP the whole time, at least according to ah->power_mode. Clemens ^ permalink raw reply [flat|nested] 38+ messages in thread
* [ath9k-devel] ath9k: irq storm after suspend/resume @ 2011-10-21 20:20 ` Clemens Buchacher 0 siblings, 0 replies; 38+ messages in thread From: Clemens Buchacher @ 2011-10-21 20:20 UTC (permalink / raw) To: ath9k-devel On Fri, Oct 21, 2011 at 10:10:21PM +0800, Adrian Chadd wrote: > > I assume you have disabled the NIC power save option. Not just APSM, > etc, but the powersave options in iw. I don't know what they are, > sorry, only that they exist. I am not sure if that's what you mean, but if I set "iwconfig wlan0 power off" it makes no difference. > I suggest modifying ath_intr() so if there is an interrupt storm, > disable _all_ sources of interrupts (ie, set sync_cause, async_cause > to 0, set imr to 0, set ier to 0) and see if it continues occuring. They keep coming with this too. > If the interrupt storm continue occuring, try putting the whole NIC to > sleep entirely (ie, call pcipowersave to put the NIC to sleep, then > make sure the NIC -stays- asleep somehow, ie so nothing else tries > waking it up - I don't know how to do this, likely you'll have to > clear some interface flag so the rest of the ath9k and 802.11 stack > thinks the nic is actually down for the count) and see if you're still > getting spurious interrupts. I am now calling ath9k_hw_ops(sc->sc_ah)->config_pci_powersave(sc->sc_ah, true); ath9k_setpower(sc, ATH9K_PM_FULL_SLEEP); right before request_irq, and the interrupts still come. And the device is in FULL_SLEEP the whole time, at least according to ah->power_mode. Clemens ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-21 20:20 ` [ath9k-devel] " Clemens Buchacher @ 2011-10-22 0:47 ` Adrian Chadd -1 siblings, 0 replies; 38+ messages in thread From: Adrian Chadd @ 2011-10-22 0:47 UTC (permalink / raw) To: Clemens Buchacher; +Cc: Mohammed Shafi, linux-wireless, ath9k-devel .. then I don't think it's specifically the NIC. >From what I understand, unless there are bus errors happening (and that'd occur with the NIC awake or having something being written to it) the NIC shouldn't be generating interrupts if: * sync/async cause are 0; * ier is 0; * imr is 0. Now maybe some part of ath9k is still trying to write to the NIC when it's asleep, but it would set some bits in sync_cause/async_cause. At this point I'd really suggest whacking other devices into the slot to see if they generate the same kind of behaviour. Adrian ^ permalink raw reply [flat|nested] 38+ messages in thread
* [ath9k-devel] ath9k: irq storm after suspend/resume @ 2011-10-22 0:47 ` Adrian Chadd 0 siblings, 0 replies; 38+ messages in thread From: Adrian Chadd @ 2011-10-22 0:47 UTC (permalink / raw) To: ath9k-devel .. then I don't think it's specifically the NIC. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-22 0:47 ` [ath9k-devel] " Adrian Chadd @ 2011-10-22 7:22 ` Clemens Buchacher -1 siblings, 0 replies; 38+ messages in thread From: Clemens Buchacher @ 2011-10-22 7:22 UTC (permalink / raw) To: Adrian Chadd; +Cc: Mohammed Shafi, linux-wireless, ath9k-devel Hi, On Sat, Oct 22, 2011 at 08:47:33AM +0800, Adrian Chadd wrote: > > At this point I'd really suggest whacking other devices into the slot > to see if they generate the same kind of behaviour. Looks like you had it right from the start. After unloading the jme module [*1*] I cannot reproduce the IRQ storm any more. From looking at dmesg it should have been obvious, since it also gets IRQ 17 assigned. I don't know how I missed that, I was sure I had checked. Sorry guys, for most likely wasting your time. And thanks for your patience. Best, Clemens [*1*] $ lspci 06:00.5 Ethernet controller: JMicron Technology Corp. JMC250 PCI Express Gigabit Ethernet Controller (rev 03)) $ dmesg|grep -i jme [ 6.160187] jme: JMicron JMC2XX ethernet driver version 1.0.8 [ 6.160278] jme 0000:06:00.5: PCI INT A -> GSI 17 (level, low) -> IRQ 17 [ 6.160305] jme 0000:06:00.5: setting latency timer to 64 [ 6.161154] jme 0000:06:00.5: eth0: JMC250 Gigabit Ethernet chiprev:23 pcirev:3 macaddr:bc:ae:c5:2f:91:47 [ 18.354697] jme 0000:06:00.5: PCI INT A disabled ^ permalink raw reply [flat|nested] 38+ messages in thread
* [ath9k-devel] ath9k: irq storm after suspend/resume @ 2011-10-22 7:22 ` Clemens Buchacher 0 siblings, 0 replies; 38+ messages in thread From: Clemens Buchacher @ 2011-10-22 7:22 UTC (permalink / raw) To: ath9k-devel Hi, On Sat, Oct 22, 2011 at 08:47:33AM +0800, Adrian Chadd wrote: > > At this point I'd really suggest whacking other devices into the slot > to see if they generate the same kind of behaviour. Looks like you had it right from the start. After unloading the jme module [*1*] I cannot reproduce the IRQ storm any more. From looking at dmesg it should have been obvious, since it also gets IRQ 17 assigned. I don't know how I missed that, I was sure I had checked. Sorry guys, for most likely wasting your time. And thanks for your patience. Best, Clemens [*1*] $ lspci 06:00.5 Ethernet controller: JMicron Technology Corp. JMC250 PCI Express Gigabit Ethernet Controller (rev 03)) $ dmesg|grep -i jme [ 6.160187] jme: JMicron JMC2XX ethernet driver version 1.0.8 [ 6.160278] jme 0000:06:00.5: PCI INT A -> GSI 17 (level, low) -> IRQ 17 [ 6.160305] jme 0000:06:00.5: setting latency timer to 64 [ 6.161154] jme 0000:06:00.5: eth0: JMC250 Gigabit Ethernet chiprev:23 pcirev:3 macaddr:bc:ae:c5:2f:91:47 [ 18.354697] jme 0000:06:00.5: PCI INT A disabled ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-22 7:22 ` [ath9k-devel] " Clemens Buchacher @ 2011-10-22 7:29 ` Adrian Chadd -1 siblings, 0 replies; 38+ messages in thread From: Adrian Chadd @ 2011-10-22 7:29 UTC (permalink / raw) To: Clemens Buchacher; +Cc: Mohammed Shafi, linux-wireless, ath9k-devel On 22 October 2011 15:22, Clemens Buchacher <drizzd@aon.at> wrote: > Looks like you had it right from the start. After unloading the jme > module [*1*] I cannot reproduce the IRQ storm any more. From > looking at dmesg it should have been obvious, since it also gets > IRQ 17 assigned. I don't know how I missed that, I was sure I had > checked. > > Sorry guys, for most likely wasting your time. And thanks for your > patience. Ah, eek! Well, you're welcome! And I hope you/others have a better understanding of how the interrupt setup on these NICs work. (I certainly have a better understanding now of the host interface and what we can poke when it's awake and asleep.) Good luck! Adrian ^ permalink raw reply [flat|nested] 38+ messages in thread
* [ath9k-devel] ath9k: irq storm after suspend/resume @ 2011-10-22 7:29 ` Adrian Chadd 0 siblings, 0 replies; 38+ messages in thread From: Adrian Chadd @ 2011-10-22 7:29 UTC (permalink / raw) To: ath9k-devel On 22 October 2011 15:22, Clemens Buchacher <drizzd@aon.at> wrote: > Looks like you had it right from the start. After unloading the jme > module [*1*] I cannot reproduce the IRQ storm any more. From > looking at dmesg it should have been obvious, since it also gets > IRQ 17 assigned. I don't know how I missed that, I was sure I had > checked. > > Sorry guys, for most likely wasting your time. And thanks for your > patience. Ah, eek! Well, you're welcome! And I hope you/others have a better understanding of how the interrupt setup on these NICs work. (I certainly have a better understanding now of the host interface and what we can poke when it's awake and asleep.) Good luck! Adrian ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ath9k: irq storm after suspend/resume 2011-10-04 7:58 ` Adrian Chadd 2011-10-04 18:15 ` Clemens Buchacher @ 2011-10-04 18:36 ` Clemens Buchacher 1 sibling, 0 replies; 38+ messages in thread From: Clemens Buchacher @ 2011-10-04 18:36 UTC (permalink / raw) To: Adrian Chadd; +Cc: Mohammed Shafi, linux-wireless On Tue, Oct 04, 2011 at 03:58:09PM +0800, Adrian Chadd wrote: > > There's a function further down in the isr routine which checks > whether the NIC itself posted the interrupt (it checks > SYNC_CAUSE/ASYNC_CAUSE.) That should cut down the spam quite a > bit and only log the relevant entries for ath9k. If I do this no interrupts are logged at all. In previous traces SYNC_CAUSE became non-zero only because the debugging output delayed the interrupt handler enough to give the initialization code a chance to run. Clemens ^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2011-10-22 7:29 UTC | newest] Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-08-27 11:32 ath9k: irq storm after suspend/resume Clemens Buchacher 2011-08-29 7:55 ` Mohammed Shafi 2011-08-29 11:53 ` Mohammed Shafi 2011-08-29 15:12 ` Mohammed Shafi 2011-08-30 6:41 ` Clemens Buchacher 2011-08-30 9:33 ` Mohammed Shafi 2011-08-30 9:41 ` Mohammed Shafi 2011-09-01 6:24 ` Clemens Buchacher 2011-09-26 9:24 ` Mohammed Shafi 2011-09-27 21:42 ` Clemens Buchacher 2011-09-29 8:18 ` Mohammed Shafi 2011-09-29 10:33 ` Adrian Chadd 2011-09-29 17:11 ` Clemens Buchacher [not found] ` <CAD2nsn0Z2J4r4tN_fLjx5bbvz2bg6NVcQ8vppJbbNcgOF8pFew@mail.gmail.com> [not found] ` <CAJ-VmokcM4KmzV7Rn9PA68iEiTJiPw=ffYgNFLDAHShxD0HNAg@mail.gmail.com> 2011-10-03 8:48 ` Clemens Buchacher 2011-10-04 7:58 ` Adrian Chadd 2011-10-04 18:15 ` Clemens Buchacher 2011-10-04 21:11 ` Adrian Chadd 2011-10-05 6:28 ` Clemens Buchacher 2011-10-05 13:02 ` Adrian Chadd 2011-10-12 13:10 ` Mohammed Shafi 2011-10-15 9:39 ` Clemens Buchacher 2011-10-15 10:01 ` Adrian Chadd 2011-10-18 6:44 ` Clemens Buchacher 2011-10-18 7:05 ` Adrian Chadd 2011-10-21 10:22 ` Clemens Buchacher 2011-10-21 14:10 ` Adrian Chadd 2011-10-21 14:10 ` [ath9k-devel] " Adrian Chadd 2011-10-21 19:03 ` Clemens Buchacher 2011-10-21 19:03 ` [ath9k-devel] " Clemens Buchacher 2011-10-21 20:20 ` Clemens Buchacher 2011-10-21 20:20 ` [ath9k-devel] " Clemens Buchacher 2011-10-22 0:47 ` Adrian Chadd 2011-10-22 0:47 ` [ath9k-devel] " Adrian Chadd 2011-10-22 7:22 ` Clemens Buchacher 2011-10-22 7:22 ` [ath9k-devel] " Clemens Buchacher 2011-10-22 7:29 ` Adrian Chadd 2011-10-22 7:29 ` [ath9k-devel] " Adrian Chadd 2011-10-04 18:36 ` Clemens Buchacher
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.