* [PATCH 0/2] ath10k: warm reset fixes 2014-05-14 @ 2014-05-14 6:57 Michal Kazior 2014-05-14 6:57 ` [PATCH 1/2] ath10k: improve warm reset reliability Michal Kazior ` (3 more replies) 0 siblings, 4 replies; 9+ messages in thread From: Michal Kazior @ 2014-05-14 6:57 UTC (permalink / raw) To: ath10k; +Cc: linux-wireless, Michal Kazior Hi, Warm reset was not able to recover from a device crash so ath10k was falling back to cold reset which would occasionally hang the system or generate a data bus error with some device chips. Warm reset now works a lot more reliably (at least on my T430). It's able to recover from a "hard" simulate fw crash and some firmware asserts I've forced meaning it should recover in most cases without falling back to the cold reset. There's still one case of a crash warm reset is incapable to recover from: IOMMU faults (when supported and enabled and device tries to access non-DMA mapped host memory) trigger device crash and leave CE totally unresponsive before and after the warm reset. Michal Kazior (2): ath10k: improve warm reset reliability ath10k: retry warm reset a few times drivers/net/wireless/ath/ath10k/pci.c | 53 +++++++++++++++++++++++++++++++++-- 1 file changed, 50 insertions(+), 3 deletions(-) -- 1.8.5.3 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] ath10k: improve warm reset reliability 2014-05-14 6:57 [PATCH 0/2] ath10k: warm reset fixes 2014-05-14 Michal Kazior @ 2014-05-14 6:57 ` Michal Kazior 2014-05-15 10:48 ` Kalle Valo 2014-05-14 6:57 ` [PATCH 2/2] ath10k: retry warm reset a few times Michal Kazior ` (2 subsequent siblings) 3 siblings, 1 reply; 9+ messages in thread From: Michal Kazior @ 2014-05-14 6:57 UTC (permalink / raw) To: ath10k; +Cc: linux-wireless, Michal Kazior Warm reset is now able to recover after device crashes which required a cold reset before. This should greatly reduce chances of getting data bus errors or host system freezes due to buggy cold reset on some chips. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> --- drivers/net/wireless/ath/ath10k/pci.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c index 66b1f30..c9b14b4 100644 --- a/drivers/net/wireless/ath/ath10k/pci.c +++ b/drivers/net/wireless/ath/ath10k/pci.c @@ -1802,6 +1802,28 @@ static void ath10k_pci_fw_interrupt_handler(struct ath10k *ar) ath10k_pci_sleep(ar); } +/* this function effectively clears target memory controller assert line */ +static void ath10k_pci_warm_reset_si0(struct ath10k *ar) +{ + u32 val; + + val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + + SOC_RESET_CONTROL_ADDRESS); + ath10k_pci_write32(ar, RTC_SOC_BASE_ADDRESS + SOC_RESET_CONTROL_ADDRESS, + val | SOC_RESET_CONTROL_SI0_RST_MASK); + val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + + SOC_RESET_CONTROL_ADDRESS); + msleep(10); + + val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + + SOC_RESET_CONTROL_ADDRESS); + ath10k_pci_write32(ar, RTC_SOC_BASE_ADDRESS + SOC_RESET_CONTROL_ADDRESS, + val & ~SOC_RESET_CONTROL_SI0_RST_MASK); + val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + + SOC_RESET_CONTROL_ADDRESS); + msleep(10); +} + static int ath10k_pci_warm_reset(struct ath10k *ar) { int ret = 0; @@ -1860,6 +1882,8 @@ static int ath10k_pci_warm_reset(struct ath10k *ar) SOC_RESET_CONTROL_ADDRESS); msleep(10); + ath10k_pci_warm_reset_si0(ar); + /* debug */ val = ath10k_pci_read32(ar, SOC_CORE_BASE_ADDRESS + PCIE_INTR_CAUSE_ADDRESS); -- 1.8.5.3 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] ath10k: improve warm reset reliability 2014-05-14 6:57 ` [PATCH 1/2] ath10k: improve warm reset reliability Michal Kazior @ 2014-05-15 10:48 ` Kalle Valo 2014-05-15 12:42 ` Michal Kazior 0 siblings, 1 reply; 9+ messages in thread From: Kalle Valo @ 2014-05-15 10:48 UTC (permalink / raw) To: Michal Kazior; +Cc: ath10k, linux-wireless Michal Kazior <michal.kazior@tieto.com> writes: > Warm reset is now able to recover after device > crashes which required a cold reset before. > > This should greatly reduce chances of getting data > bus errors or host system freezes due to buggy > cold reset on some chips. > > Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Awesome! This is very much needed. I just have a cosmetic comment: > +/* this function effectively clears target memory controller assert line */ > +static void ath10k_pci_warm_reset_si0(struct ath10k *ar) > +{ > + u32 val; > + > + val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + > + SOC_RESET_CONTROL_ADDRESS); > + ath10k_pci_write32(ar, RTC_SOC_BASE_ADDRESS + SOC_RESET_CONTROL_ADDRESS, > + val | SOC_RESET_CONTROL_SI0_RST_MASK); > + val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + > + SOC_RESET_CONTROL_ADDRESS); We do have the ath10k_pci_soc_ functions for accessing SOC registers, I would prefer to use those here. I now modified your patch with the diff below. Is that ok to you? --- a/drivers/net/wireless/ath/ath10k/pci.c +++ b/drivers/net/wireless/ath/ath10k/pci.c @@ -1807,20 +1807,18 @@ static void ath10k_pci_warm_reset_si0(struct ath10k *ar) { u32 val; - val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + - SOC_RESET_CONTROL_ADDRESS); - ath10k_pci_write32(ar, RTC_SOC_BASE_ADDRESS + SOC_RESET_CONTROL_ADDRESS, - val | SOC_RESET_CONTROL_SI0_RST_MASK); - val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + - SOC_RESET_CONTROL_ADDRESS); + val = ath10k_pci_soc_read32(ar, SOC_RESET_CONTROL_ADDRESS); + ath10k_pci_soc_write32(ar, SOC_RESET_CONTROL_ADDRESS, + val | SOC_RESET_CONTROL_SI0_RST_MASK); + val = ath10k_pci_soc_read32(ar, SOC_RESET_CONTROL_ADDRESS); + msleep(10); - val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + - SOC_RESET_CONTROL_ADDRESS); - ath10k_pci_write32(ar, RTC_SOC_BASE_ADDRESS + SOC_RESET_CONTROL_ADDRESS, - val & ~SOC_RESET_CONTROL_SI0_RST_MASK); - val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + - SOC_RESET_CONTROL_ADDRESS); + val = ath10k_pci_soc_read32(ar, SOC_RESET_CONTROL_ADDRESS); + ath10k_pci_soc_write32(ar, SOC_RESET_CONTROL_ADDRESS, + val & ~SOC_RESET_CONTROL_SI0_RST_MASK); + val = ath10k_pci_soc_read32(ar, SOC_RESET_CONTROL_ADDRESS); + msleep(10); } Full patch here: https://github.com/kvalo/ath/commit/7b52054308a371d479a9e686e1d8411d19a90fd7 -- Kalle Valo ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] ath10k: improve warm reset reliability 2014-05-15 10:48 ` Kalle Valo @ 2014-05-15 12:42 ` Michal Kazior 2014-05-15 12:57 ` Kalle Valo 0 siblings, 1 reply; 9+ messages in thread From: Michal Kazior @ 2014-05-15 12:42 UTC (permalink / raw) To: Kalle Valo; +Cc: linux-wireless, ath10k On 15 May 2014 12:48, Kalle Valo <kvalo@qca.qualcomm.com> wrote: > Michal Kazior <michal.kazior@tieto.com> writes: > >> Warm reset is now able to recover after device >> crashes which required a cold reset before. >> >> This should greatly reduce chances of getting data >> bus errors or host system freezes due to buggy >> cold reset on some chips. >> >> Signed-off-by: Michal Kazior <michal.kazior@tieto.com> > > Awesome! This is very much needed. I just have a cosmetic comment: > >> +/* this function effectively clears target memory controller assert line */ >> +static void ath10k_pci_warm_reset_si0(struct ath10k *ar) >> +{ >> + u32 val; >> + >> + val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + >> + SOC_RESET_CONTROL_ADDRESS); >> + ath10k_pci_write32(ar, RTC_SOC_BASE_ADDRESS + SOC_RESET_CONTROL_ADDRESS, >> + val | SOC_RESET_CONTROL_SI0_RST_MASK); >> + val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + >> + SOC_RESET_CONTROL_ADDRESS); > > We do have the ath10k_pci_soc_ functions for accessing SOC registers, I > would prefer to use those here. I now modified your patch with the diff > below. Is that ok to you? Ah, yeah. I totally forgot about these functions (again). > --- a/drivers/net/wireless/ath/ath10k/pci.c > +++ b/drivers/net/wireless/ath/ath10k/pci.c > @@ -1807,20 +1807,18 @@ static void ath10k_pci_warm_reset_si0(struct ath10k *ar) > { > u32 val; > > - val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + > - SOC_RESET_CONTROL_ADDRESS); > - ath10k_pci_write32(ar, RTC_SOC_BASE_ADDRESS + SOC_RESET_CONTROL_ADDRESS, > - val | SOC_RESET_CONTROL_SI0_RST_MASK); > - val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + > - SOC_RESET_CONTROL_ADDRESS); > + val = ath10k_pci_soc_read32(ar, SOC_RESET_CONTROL_ADDRESS); > + ath10k_pci_soc_write32(ar, SOC_RESET_CONTROL_ADDRESS, > + val | SOC_RESET_CONTROL_SI0_RST_MASK); > + val = ath10k_pci_soc_read32(ar, SOC_RESET_CONTROL_ADDRESS); > + > msleep(10); > > - val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + > - SOC_RESET_CONTROL_ADDRESS); > - ath10k_pci_write32(ar, RTC_SOC_BASE_ADDRESS + SOC_RESET_CONTROL_ADDRESS, > - val & ~SOC_RESET_CONTROL_SI0_RST_MASK); > - val = ath10k_pci_read32(ar, RTC_SOC_BASE_ADDRESS + > - SOC_RESET_CONTROL_ADDRESS); > + val = ath10k_pci_soc_read32(ar, SOC_RESET_CONTROL_ADDRESS); > + ath10k_pci_soc_write32(ar, SOC_RESET_CONTROL_ADDRESS, > + val & ~SOC_RESET_CONTROL_SI0_RST_MASK); > + val = ath10k_pci_soc_read32(ar, SOC_RESET_CONTROL_ADDRESS); > + > msleep(10); > } Looks good to me :-) > Full patch here: > > https://github.com/kvalo/ath/commit/7b52054308a371d479a9e686e1d8411d19a90fd7 You probably mean: https://github.com/kvalo/ath/commit/136ef8110cba61752eee5ef6bd7ce170b15cf491 Michał ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] ath10k: improve warm reset reliability 2014-05-15 12:42 ` Michal Kazior @ 2014-05-15 12:57 ` Kalle Valo 0 siblings, 0 replies; 9+ messages in thread From: Kalle Valo @ 2014-05-15 12:57 UTC (permalink / raw) To: Michal Kazior; +Cc: linux-wireless, ath10k Michal Kazior <michal.kazior@tieto.com> writes: >> Full patch here: >> >> https://github.com/kvalo/ath/commit/7b52054308a371d479a9e686e1d8411d19a90fd7 > > You probably mean: > > https://github.com/kvalo/ath/commit/136ef8110cba61752eee5ef6bd7ce170b15cf491 Yup, sorry for the confusion. -- Kalle Valo ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/2] ath10k: retry warm reset a few times 2014-05-14 6:57 [PATCH 0/2] ath10k: warm reset fixes 2014-05-14 Michal Kazior 2014-05-14 6:57 ` [PATCH 1/2] ath10k: improve warm reset reliability Michal Kazior @ 2014-05-14 6:57 ` Michal Kazior 2014-05-14 14:22 ` [PATCH 0/2] ath10k: warm reset fixes 2014-05-14 Kalle Valo 2014-05-16 13:49 ` Kalle Valo 3 siblings, 0 replies; 9+ messages in thread From: Michal Kazior @ 2014-05-14 6:57 UTC (permalink / raw) To: ath10k; +Cc: linux-wireless, Michal Kazior Sometimes warm reset works upon retry. It might be related to imperfect warm reset routine, but for now let's just do the retries. This should improve the reliability of some chips that hang/crash with cold reset which is used as a last resort if warm reset fails. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> --- drivers/net/wireless/ath/ath10k/pci.c | 29 ++++++++++++++++++++++++++--- 1 file changed, 26 insertions(+), 3 deletions(-) diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c index c9b14b4..c9e482e 100644 --- a/drivers/net/wireless/ath/ath10k/pci.c +++ b/drivers/net/wireless/ath/ath10k/pci.c @@ -59,6 +59,7 @@ MODULE_PARM_DESC(reset_mode, "0: auto, 1: warm only (default: 0)"); /* how long wait to wait for target to initialise, in ms */ #define ATH10K_PCI_TARGET_WAIT 3000 +#define ATH10K_PCI_NUM_WARM_RESET_ATTEMPTS 3 #define QCA988X_2_0_DEVICE_ID (0x003c) @@ -2012,6 +2013,28 @@ err: return ret; } +static int ath10k_pci_hif_power_up_warm(struct ath10k *ar) +{ + int i, ret; + + /* + * Sometime warm reset succeeds after retries. + * + * FIXME: It might be possible to tune ath10k_pci_warm_reset() to work + * at first try. + */ + for (i = 0; i < ATH10K_PCI_NUM_WARM_RESET_ATTEMPTS; i++) { + ret = __ath10k_pci_hif_power_up(ar, false); + if (ret == 0) + break; + + ath10k_warn("failed to warm reset (attempt %d out of %d): %d\n", + i + 1, ATH10K_PCI_NUM_WARM_RESET_ATTEMPTS, ret); + } + + return ret; +} + static int ath10k_pci_hif_power_up(struct ath10k *ar) { int ret; @@ -2023,10 +2046,10 @@ static int ath10k_pci_hif_power_up(struct ath10k *ar) * preferred (and safer) way to perform a device reset is through a * warm reset. * - * Warm reset doesn't always work though (notably after a firmware - * crash) so fall back to cold reset if necessary. + * Warm reset doesn't always work though so fall back to cold reset may + * be necessary. */ - ret = __ath10k_pci_hif_power_up(ar, false); + ret = ath10k_pci_hif_power_up_warm(ar); if (ret) { ath10k_warn("failed to power up target using warm reset: %d\n", ret); -- 1.8.5.3 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 0/2] ath10k: warm reset fixes 2014-05-14 2014-05-14 6:57 [PATCH 0/2] ath10k: warm reset fixes 2014-05-14 Michal Kazior 2014-05-14 6:57 ` [PATCH 1/2] ath10k: improve warm reset reliability Michal Kazior 2014-05-14 6:57 ` [PATCH 2/2] ath10k: retry warm reset a few times Michal Kazior @ 2014-05-14 14:22 ` Kalle Valo 2014-05-14 14:25 ` Kalle Valo 2014-05-16 13:49 ` Kalle Valo 3 siblings, 1 reply; 9+ messages in thread From: Kalle Valo @ 2014-05-14 14:22 UTC (permalink / raw) To: Michal Kazior; +Cc: ath10k, linux-wireless Michal Kazior <michal.kazior@tieto.com> writes: > Warm reset was not able to recover from a device > crash so ath10k was falling back to cold reset > which would occasionally hang the system or > generate a data bus error with some device chips. > > Warm reset now works a lot more reliably (at least > on my T430). It's able to recover from a "hard" > simulate fw crash and some firmware asserts I've > forced meaning it should recover in most cases > without falling back to the cold reset. > > There's still one case of a crash warm reset is > incapable to recover from: IOMMU faults (when > supported and enabled and device tries to access > non-DMA mapped host memory) trigger device crash > and leave CE totally unresponsive before and after > the warm reset. > > > Michal Kazior (2): > ath10k: improve warm reset reliability > ath10k: retry warm reset a few times The patches don't apply anymore and the conflict was not trivial enough for me to fix it. Can you rebase, please? -- Kalle Valo ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 0/2] ath10k: warm reset fixes 2014-05-14 2014-05-14 14:22 ` [PATCH 0/2] ath10k: warm reset fixes 2014-05-14 Kalle Valo @ 2014-05-14 14:25 ` Kalle Valo 0 siblings, 0 replies; 9+ messages in thread From: Kalle Valo @ 2014-05-14 14:25 UTC (permalink / raw) To: Michal Kazior; +Cc: linux-wireless, ath10k Kalle Valo <kvalo@qca.qualcomm.com> writes: > Michal Kazior <michal.kazior@tieto.com> writes: > >> Warm reset was not able to recover from a device >> crash so ath10k was falling back to cold reset >> which would occasionally hang the system or >> generate a data bus error with some device chips. >> >> Warm reset now works a lot more reliably (at least >> on my T430). It's able to recover from a "hard" >> simulate fw crash and some firmware asserts I've >> forced meaning it should recover in most cases >> without falling back to the cold reset. >> >> There's still one case of a crash warm reset is >> incapable to recover from: IOMMU faults (when >> supported and enabled and device tries to access >> non-DMA mapped host memory) trigger device crash >> and leave CE totally unresponsive before and after >> the warm reset. >> >> >> Michal Kazior (2): >> ath10k: improve warm reset reliability >> ath10k: retry warm reset a few times > > The patches don't apply anymore and the conflict was not trivial enough > for me to fix it. Can you rebase, please? Sorry, I replied to the wrong thread. Please ignore what I said above! -- Kalle Valo ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 0/2] ath10k: warm reset fixes 2014-05-14 2014-05-14 6:57 [PATCH 0/2] ath10k: warm reset fixes 2014-05-14 Michal Kazior ` (2 preceding siblings ...) 2014-05-14 14:22 ` [PATCH 0/2] ath10k: warm reset fixes 2014-05-14 Kalle Valo @ 2014-05-16 13:49 ` Kalle Valo 3 siblings, 0 replies; 9+ messages in thread From: Kalle Valo @ 2014-05-16 13:49 UTC (permalink / raw) To: Michal Kazior; +Cc: ath10k, linux-wireless Michal Kazior <michal.kazior@tieto.com> writes: > Hi, > > Warm reset was not able to recover from a device > crash so ath10k was falling back to cold reset > which would occasionally hang the system or > generate a data bus error with some device chips. > > Warm reset now works a lot more reliably (at least > on my T430). It's able to recover from a "hard" > simulate fw crash and some firmware asserts I've > forced meaning it should recover in most cases > without falling back to the cold reset. > > There's still one case of a crash warm reset is > incapable to recover from: IOMMU faults (when > supported and enabled and device tries to access > non-DMA mapped host memory) trigger device crash > and leave CE totally unresponsive before and after > the warm reset. > > > Michal Kazior (2): > ath10k: improve warm reset reliability > ath10k: retry warm reset a few times Thanks, both patches applied. -- Kalle Valo ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-05-16 13:49 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-05-14 6:57 [PATCH 0/2] ath10k: warm reset fixes 2014-05-14 Michal Kazior 2014-05-14 6:57 ` [PATCH 1/2] ath10k: improve warm reset reliability Michal Kazior 2014-05-15 10:48 ` Kalle Valo 2014-05-15 12:42 ` Michal Kazior 2014-05-15 12:57 ` Kalle Valo 2014-05-14 6:57 ` [PATCH 2/2] ath10k: retry warm reset a few times Michal Kazior 2014-05-14 14:22 ` [PATCH 0/2] ath10k: warm reset fixes 2014-05-14 Kalle Valo 2014-05-14 14:25 ` Kalle Valo 2014-05-16 13:49 ` Kalle Valo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).