From mboxrd@z Thu Jan 1 00:00:00 1970 Return-path: Received: from mail-ia0-f170.google.com ([209.85.210.170]:43094 "EHLO mail-ia0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753976Ab3CXSzp (ORCPT ); Sun, 24 Mar 2013 14:55:45 -0400 Received: by mail-ia0-f170.google.com with SMTP id h8so4964789iaa.15 for ; Sun, 24 Mar 2013 11:55:45 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: References: From: Robert Shade Date: Sun, 24 Mar 2013 14:55:25 -0400 Message-ID: (sfid-20130324_195558_331845_DB4AC464) Subject: Re: Auth Packet TX Delay To: Adrian Chadd Cc: linux-wireless@vger.kernel.org, ath9k-devel@lists.ath9k.org Content-Type: text/plain; charset=ISO-8859-1 Sender: linux-wireless-owner@vger.kernel.org List-ID: I've done some more testing on this and it looks like the auth packets aren't delayed, they're never sent. The "TX Complete" messages are from the queue being purged before a reset due to channel change. It only gets in this state when it's unable to change the channel due to the timeout on setting the AR_PHY_AGC_CONTROL register. One thing that does look interesting is that, when we fail to change the channel, ath_complete_reset is never called and therefore ieee80211_wake_queues is never called either. Are stop/wake queues calls supposed to be balanced? Felix's "ath9k_hw: improve reset reliability after errors" did help a bit, but I still get the DMA errors periodically and it eventually got into the state where the calibration fails leading to not sending any auth packets. Based on your suggestions, I've been testing a few days with the following patch and I have yet to see any DMA messages or the auth stuck state. I had to check for SC_OP_INVALID because the initial reset in ath9k_start would cause a kernel panic when the system was being initialized from a powered off state. I don't know enough about the internals to know if I should expect that, but I'll need to set up a serial console in order to capture the panic. diff --git a/drivers/net/wireless/ath/ath9k/hw.c b/drivers/net/wireless/ath/ath9k/hw.c index 07e2526..05eebdb 100644 --- a/drivers/net/wireless/ath/ath9k/hw.c +++ b/drivers/net/wireless/ath/ath9k/hw.c @@ -1456,6 +1456,7 @@ static bool ath9k_hw_set_reset_reg(struct ath_hw *ah, u32 type) static bool ath9k_hw_chip_reset(struct ath_hw *ah, struct ath9k_channel *chan) { + struct ath_softc *sc = ah->hw->priv; int reset_type = ATH9K_RESET_WARM; if (AR_SREV_9280(ah)) { @@ -1463,9 +1464,12 @@ static bool ath9k_hw_chip_reset(struct ath_hw *ah, reset_type = ATH9K_RESET_POWER_ON; else reset_type = ATH9K_RESET_COLD; - } else if (ah->chip_fullsleep || REG_READ(ah, AR_Q_TXE) || - (REG_READ(ah, AR_CR) & AR_CR_RXE)) + } else if (AR_SREV_9160(ah) && !test_bit(SC_OP_INVALID, &sc->sc_flags)) { + reset_type = ATH9K_RESET_COLD; + } else if (ah->chip_fullsleep || REG_READ(ah, AR_Q_TXE) || + (REG_READ(ah, AR_CR) & AR_CR_RXE)) { reset_type = ATH9K_RESET_COLD; + } if (!ath9k_hw_set_reset_reg(ah, reset_type)) return false; diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c index 6e66f9c..09b4699 100644 --- a/drivers/net/wireless/ath/ath9k/main.c +++ b/drivers/net/wireless/ath/ath9k/main.c @@ -252,7 +252,7 @@ static int ath_reset_internal(struct ath_softc *sc, struct ath9k_channel *hchan) struct ath_hw *ah = sc->sc_ah; struct ath_common *common = ath9k_hw_common(ah); struct ath9k_hw_cal_data *caldata = NULL; - bool fastcc = true; + bool fastcc = !AR_SREV_9160(ah); int r; __ath_cancel_work(sc); -- On Sat, Feb 9, 2013 at 2:39 AM, Adrian Chadd wrote: > > On 8 February 2013 10:48, Robert Shade wrote: > > Maybe I have the terminology wrong, but I thought a cold reset meant > > toggling the PCI reset line. > > That's a very very cold reset line, that resets the PCI bus glue > inside the chip. > > There's a bunch of different reset lines; most of the device can be > reset without resetting the PCI/PCIe host interface. > > So a "cold" reset here is resetting almost everything inside the chip > by pulling down those reset lines and stopping/restarting clocks. > > > Adrian From mboxrd@z Thu Jan 1 00:00:00 1970 From: Robert Shade Date: Sun, 24 Mar 2013 14:55:25 -0400 Subject: [ath9k-devel] Auth Packet TX Delay In-Reply-To: References: Message-ID: List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: ath9k-devel@lists.ath9k.org I've done some more testing on this and it looks like the auth packets aren't delayed, they're never sent. The "TX Complete" messages are from the queue being purged before a reset due to channel change. It only gets in this state when it's unable to change the channel due to the timeout on setting the AR_PHY_AGC_CONTROL register. One thing that does look interesting is that, when we fail to change the channel, ath_complete_reset is never called and therefore ieee80211_wake_queues is never called either. Are stop/wake queues calls supposed to be balanced? Felix's "ath9k_hw: improve reset reliability after errors" did help a bit, but I still get the DMA errors periodically and it eventually got into the state where the calibration fails leading to not sending any auth packets. Based on your suggestions, I've been testing a few days with the following patch and I have yet to see any DMA messages or the auth stuck state. I had to check for SC_OP_INVALID because the initial reset in ath9k_start would cause a kernel panic when the system was being initialized from a powered off state. I don't know enough about the internals to know if I should expect that, but I'll need to set up a serial console in order to capture the panic. diff --git a/drivers/net/wireless/ath/ath9k/hw.c b/drivers/net/wireless/ath/ath9k/hw.c index 07e2526..05eebdb 100644 --- a/drivers/net/wireless/ath/ath9k/hw.c +++ b/drivers/net/wireless/ath/ath9k/hw.c @@ -1456,6 +1456,7 @@ static bool ath9k_hw_set_reset_reg(struct ath_hw *ah, u32 type) static bool ath9k_hw_chip_reset(struct ath_hw *ah, struct ath9k_channel *chan) { + struct ath_softc *sc = ah->hw->priv; int reset_type = ATH9K_RESET_WARM; if (AR_SREV_9280(ah)) { @@ -1463,9 +1464,12 @@ static bool ath9k_hw_chip_reset(struct ath_hw *ah, reset_type = ATH9K_RESET_POWER_ON; else reset_type = ATH9K_RESET_COLD; - } else if (ah->chip_fullsleep || REG_READ(ah, AR_Q_TXE) || - (REG_READ(ah, AR_CR) & AR_CR_RXE)) + } else if (AR_SREV_9160(ah) && !test_bit(SC_OP_INVALID, &sc->sc_flags)) { + reset_type = ATH9K_RESET_COLD; + } else if (ah->chip_fullsleep || REG_READ(ah, AR_Q_TXE) || + (REG_READ(ah, AR_CR) & AR_CR_RXE)) { reset_type = ATH9K_RESET_COLD; + } if (!ath9k_hw_set_reset_reg(ah, reset_type)) return false; diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c index 6e66f9c..09b4699 100644 --- a/drivers/net/wireless/ath/ath9k/main.c +++ b/drivers/net/wireless/ath/ath9k/main.c @@ -252,7 +252,7 @@ static int ath_reset_internal(struct ath_softc *sc, struct ath9k_channel *hchan) struct ath_hw *ah = sc->sc_ah; struct ath_common *common = ath9k_hw_common(ah); struct ath9k_hw_cal_data *caldata = NULL; - bool fastcc = true; + bool fastcc = !AR_SREV_9160(ah); int r; __ath_cancel_work(sc); -- On Sat, Feb 9, 2013 at 2:39 AM, Adrian Chadd wrote: > > On 8 February 2013 10:48, Robert Shade wrote: > > Maybe I have the terminology wrong, but I thought a cold reset meant > > toggling the PCI reset line. > > That's a very very cold reset line, that resets the PCI bus glue > inside the chip. > > There's a bunch of different reset lines; most of the device can be > reset without resetting the PCI/PCIe host interface. > > So a "cold" reset here is resetting almost everything inside the chip > by pulling down those reset lines and stopping/restarting clocks. > > > Adrian