linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andy Shevchenko <andy.shevchenko@gmail.com>
To: "Jonas Dreßler" <verdre@v0yd.nl>
Cc: "Brian Norris" <briannorris@chromium.org>,
	"Amitkumar Karwar" <amitkarwar@gmail.com>,
	"Ganapathi Bhat" <ganapathi017@gmail.com>,
	"Xinming Hu" <huxinming820@gmail.com>,
	"Kalle Valo" <kvalo@codeaurora.org>,
	"David S. Miller" <davem@davemloft.net>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Tsuchiya Yuto" <kitakar@gmail.com>,
	linux-wireless <linux-wireless@vger.kernel.org>,
	netdev@vger.kernel.org,
	"Linux Kernel" <linux-kernel@vger.kernel.org>,
	linux-pci <linux-pci@vger.kernel.org>,
	"Maximilian Luz" <luzmaximilian@gmail.com>,
	"Andy Shevchenko" <andriy.shevchenko@linux.intel.com>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Pali Rohár" <pali@kernel.org>
Subject: Re: [PATCH 1/2] mwifiex: Use non-posted PCI register writes
Date: Thu, 23 Sep 2021 22:37:40 +0300	[thread overview]
Message-ID: <CAHp75VdRWd9Oj_68BqewAdjtzhRz406eh=4M7FmjRvqhkaWaOw@mail.gmail.com> (raw)
In-Reply-To: <e4cbf804-c374-79a3-53ac-8a0fbd8f75b8@v0yd.nl>

On Thu, Sep 23, 2021 at 6:28 PM Jonas Dreßler <verdre@v0yd.nl> wrote:
>
> On 9/22/21 2:50 PM, Jonas Dreßler wrote:
> > On 9/20/21 7:48 PM, Brian Norris wrote:
> >> On Sat, Sep 18, 2021 at 12:37 AM Jonas Dreßler <verdre@v0yd.nl> wrote:
> >>> Thanks for the pointer to that commit Brian, it turns out this is
> >>> actually the change that causes the "Firmware wakeup failed" issues that
> >>> I'm trying to fix with the second patch here.
> >>
> >> Huh. That's interesting, although I guess it makes some sense given
> >> your theory of "dropped writes". FWIW, this strategy (post a single
> >> write, then wait for wakeup) is the same used by some other
> >> chips/drivers too (e.g., ath10k/pci), although in those cases card
> >> wakeup is much much faster. But if the bus was dropping writes
> >> somehow, those strategies would fail too.
> >>
> >>> Also my approach is a lot messier than just reverting
> >>> 062e008a6e83e7c4da7df0a9c6aefdbc849e2bb3 and also appears to be blocking
> >>> even longer...
> >>
> >> For the record, in case you're talking about my data ("blocking even
> >> longer"): I was only testing patch 1. Patch 2 isn't really relevant to
> >> my particular systems (Rockchip RK3399 + Marvell 8997/PCIe), because
> >> (a) I'm pretty sure my system isn't "dropping" any reads or writes
> >> (b) all my delay is in the read-back; the Rockchip PCIe bus is waiting
> >> indefinitely for the card to wake up, instead of timing out and
> >> reporting all-1's like many x86 systems appear to do (I've tested
> >> this).
> >>
> >> So, the 6ms delay is entirely sitting in the ioread32(), not a delay
> >> loop.
> >>
> >> I haven't yet tried your version 2 (which avoids the blocking read to
> >> wake up; good!), but it sounds like in theory it could solve your
> >> problem while avoiding 6ms delays for me. I intend to test your v2
> >> this week.
> >>
> >
> > With "blocking even longer" I meant that (on my system) the delay-loop
> > blocks even longer than waking up the card via mwifiex_read_reg() (both
> > are in the orders of milliseconds). And given that in certain cases the
> > card wakeup (or a write getting through to the card, I have no idea) can
> > take extremely long, I'd feel more confident going with the
> > mwifiex_read_reg() method to wake up the card.
> >
> > Anyway, you know what's even weirder with all this: I've been testing
> > the first commit of patch v2 (so just the single read-back instead of
> > the big hammer) together with 062e008a6e83e7c4da7df0a9c6aefdbc849e2bb3
> > reverted for a good week now and haven't seen any wakeup failure yet.
> > Otoh I'm fairly sure the big hammer with reading back every write wasn't
> > enough to fix the wakeup failures, otherwise I wouldn't even have
> > started working on the second commit.
> >
> > So that would mean there's a difference between writing and then reading
> > back vs only reading to wake up the card: Only the latter fixes the
> > wakeup failures.
> >
> >>> Does anyone have an idea what could be the reason for the posted write
> >>> not going through, or could that also be a potential firmware bug in the
> >>> chip?
> >>
> >> I have no clue about that. That does sound downright horrible, but so
> >> are many things when dealing with this family of hardware/firmware.
> >> I'm not sure how to prove out whether this is a host bus problem, or
> >> an endpoint/firmware problem, other than perhaps trying the same
> >> module/firmware on another system, if that's possible.
> >>
> >> Anyway, to reiterate: I'm not fundamentally opposed to v2 (pending a
> >> test run here), even if it is a bit ugly and perhaps not 100%
> >> understood.
> >>
> >
> > I'm not 100% sure about all this yet, I think I'm gonna try to confirm
> > my older findings once again now and then we'll see. FTR, would you be
> > fine with using the mwifiex_read_reg() method to wake up the card and
> > somehow quirking your system to use write_reg()?
> >
> >> Brian
> >>
> >
>
> Okay, so I finally managed to find my exact reproducer for the bug again:
>
> 1) Make sure wifi powersaving is enabled (iw dev wlp1s0 set power_save on)
> 2) Connect to any wifi network (makes firmware go into wifi powersaving
> mode, not deep sleep)
> 3) Make sure bluetooth is turned off (to ensure the firmware actually
> enters powersave mode and doesn't keep the radio active doing bluetooth
> stuff)
> 4) To confirm that wifi powersaving is entered ping a device on the LAN,
> pings should be a few ms higher than without powersaving
> 5) Run "while true; do iwconfig; sleep 0.0001; done", this wakes and
> suspends the firmware extremely often
> 6) Wait until things explode, for me it consistently takes <5 minutes
>
> Using this reproducer I was able to clear things up a bit:
>
> - There still are wakeup failures when using (only) mwifiex_read_reg()
> to wake the card, so there's no weird difference between waking up using
> read vs write+read-back
>
> - Just calling mwifiex_write_reg() once and then blocking until the card
> wakes up using my delay-loop doesn't fix the issue, it's actually
> writing multiple times that fixes the issue
>
> These observations sound a lot like writes (and even reads) are actually
> being dropped, don't they?



-- 
With Best Regards,
Andy Shevchenko

  reply	other threads:[~2021-09-23 19:38 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-30 12:37 [PATCH 0/2] mwifiex: Work around firmware bugs on 88W8897 chip Jonas Dreßler
2021-08-30 12:37 ` [PATCH 1/2] mwifiex: Use non-posted PCI register writes Jonas Dreßler
2021-08-30 12:49   ` Andy Shevchenko
2021-09-01 14:01     ` Jonas Dreßler
2021-09-01 15:47       ` Andy Shevchenko
2021-09-01 15:51       ` Pali Rohár
2021-09-01 16:51         ` Heiner Kallweit
2021-09-01 17:07           ` Johannes Berg
2021-09-01 19:07             ` Heiner Kallweit
2021-09-01 22:41             ` Bjorn Helgaas
2021-09-02 14:05               ` Bjorn Helgaas
2021-09-01 19:40   ` Brian Norris
2021-09-01 20:40     ` Andy Shevchenko
2021-09-01 21:04       ` Brian Norris
2021-09-01 21:07         ` Brian Norris
2021-09-18  7:37           ` Jonas Dreßler
2021-09-20 17:48             ` Brian Norris
2021-09-22 12:50               ` Jonas Dreßler
2021-09-23 15:28                 ` Jonas Dreßler
2021-09-23 19:37                   ` Andy Shevchenko [this message]
2021-09-23 19:41                   ` Andy Shevchenko
2021-09-23 20:22                     ` Pali Rohár
2021-09-30 15:38                       ` Jonas Dreßler
2021-09-30 15:42                         ` Pali Rohár
2021-09-30 16:14                           ` Jonas Dreßler
2021-09-30 16:19                             ` Pali Rohár
2021-09-30 16:22                               ` Jonas Dreßler
2021-09-30 16:39                                 ` Pali Rohár
2021-08-30 12:37 ` [PATCH 2/2] mwifiex: Try waking the firmware until we get an interrupt Jonas Dreßler
2021-08-30 12:51   ` Andy Shevchenko
2021-08-30 12:55     ` Andy Shevchenko
2021-09-25 17:32 ` [PATCH 0/2] mwifiex: Work around firmware bugs on 88W8897 chip Pali Rohár

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHp75VdRWd9Oj_68BqewAdjtzhRz406eh=4M7FmjRvqhkaWaOw@mail.gmail.com' \
    --to=andy.shevchenko@gmail.com \
    --cc=amitkarwar@gmail.com \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=bhelgaas@google.com \
    --cc=briannorris@chromium.org \
    --cc=davem@davemloft.net \
    --cc=ganapathi017@gmail.com \
    --cc=huxinming820@gmail.com \
    --cc=kitakar@gmail.com \
    --cc=kuba@kernel.org \
    --cc=kvalo@codeaurora.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=luzmaximilian@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=pali@kernel.org \
    --cc=verdre@v0yd.nl \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).