From: Pkshih <pkshih@realtek.com>
To: James Cameron <quozl@laptop.org>,
Larry Finger <Larry.Finger@lwfinger.net>
Cc: "linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>
Subject: RE: rtl8821ae keep alive not set, connection lost
Date: Fri, 2 Feb 2018 07:50:26 +0000 [thread overview]
Message-ID: <5B2DA6FDDF928F4E855344EE0A5C39D13BE7A25E@RTITMBSV07.realtek.com.tw> (raw)
In-Reply-To: <20180201062202.GH917@us.netrek.org>
> -----Original Message-----
> From: linux-wireless-owner@vger.kernel.org [mailto:linux-wireless-owner@vger.kernel.org] On Behalf
> Of James Cameron
> Sent: Thursday, February 01, 2018 2:22 PM
> To: Larry Finger
> Cc: linux-wireless@vger.kernel.org; Pkshih
> Subject: Re: rtl8821ae keep alive not set, connection lost
>
> On Wed, Jan 31, 2018 at 11:06:12AM -0600, Larry Finger wrote:
> > On 09/12/2017 05:09 PM, James Cameron wrote:
> > >Summary: 40b368af4b75 ("rtlwifi: Fix alignment issues") breaks
> > >rtl8821ae keep alive, causing "Connection to AP lost" and deauth,
> > >but why?
> > >
> > >Wireless connection is lost after a few seconds or minutes, on
> > >every OLPC NL3 laptop with rtl8821ae, with any stable kernel after
> > >4.10.1, and any kernel with 40b368af4b75.
> > >
> > >dmesg contains
> > >
> > > wlp2s0: Connection to AP 2c:b0:5d:a6:86:eb lost
> > >
> > >iw event shows
> > >
> > > wlp2s0: del station 2c:b0:5d:a6:86:eb
> > > wlp2s0 (phy #0): deauth 74:c6:3b:09:b5:0d -> 2c:b0:5d:a6:86:eb reason 4: Disassociated due to
> inactivity
> > > wlp2s0 (phy #0): disconnected (local request)
> > >
> > >Workaround is to bounce the link, then reconnect;
> > >
> > > ip link set wlp2s0 down
> > > ip link set wlp2s0 up
> > > iw dev wlp2s0 connect qz
> > >
> > >A nearby monitor host captures a deauthentication packet sent by
> > >the device.
> > >
> > >Bisection showed cause is 40b368af4b75 ("rtlwifi: Fix alignment
> > >issues") which changes the width of DBI register read.
> > >
> > >On the face of it, 40b368af4b75 looks correct, especially compared
> > >against same function in rtl8723be.
> > >
> > >I've no idea why reverting fixes the problem. I'm hoping someone
> > >here might speculate and suggest ways to test.
> > >
> > >As keep alive is set through this path, my guess is that keep alive
> > >is not being set in the device. Or perhaps reading 16-bits
> > >perturbs another register. Is there a way to test?
> > >
> > >http://dev.laptop.org/~quozl/z/1drtGD.txt dmesg of 4.13
> > >
> > >http://dev.laptop.org/~quozl/z/1drt7c.txt dmesg with 4.13 and
> > >revert of 40b368af4b75
> >
> > James,
> >
> > I'm afraid we are needing to revisit this problem again. Changing
> > that 8-bit read to a 16-bit version causes an unaligned memory
> > reference in AARCH64, thus we will need to re-revert. To prevent
> > problems on systems such as yours, PK plans to turn off ASPM
> > capability and backdoor in certain platforms that will be listed in
> > a quirks table. Please report the output of 'dmidecode -t system'
> > for you affected system(s).
>
> Thanks for letting me know.
>
> We made three production runs, and I'm waiting to get a hold of the
> dmidecode for two of them. This may take some weeks; we have to find
> stock and ship it, or we have to ask our contract manufacturer (CM) if
> they have kept data or units.
>
> I've dmidecode for one production run.
>
> http://dev.laptop.org/~quozl/z/1eh7JF.txt (my unit nl3-e)
>
> I've dmidecode for prototypes, but they have clearly been programmed
> badly. We did not ask our CM for Windows compatibility, so they may
> have had no step to verify the data. We also went through several
> iterations to get serial numbers assigned, so the data I have does not
> have good provenance.
>
> http://dev.laptop.org/~quozl/z/1eh7EE.txt (my unit nl3-c)
> http://dev.laptop.org/~quozl/z/1eh7EV.txt (my unit nl3-d)
> http://dev.laptop.org/~quozl/z/1eh7He.txt (my unit nl3-a)
> http://dev.laptop.org/~quozl/z/1eh8DR.txt (my unit nl3-b)
>
> > We hope you will be able to test any proposed patches.
>
> Yes, can do.
>
> I've just tested v4.15.
>
> However, I'm concerned about your plan to use quirks;
>
> 1. turning off ASPM may decrease run time on battery, which if it is
> significant, across several thousand laptops will yield generator fuel
> or solar budget failure; can the power impact be quantified?
>
> 2. why not keep ASPM enabled, and use 8-bit when quirked, or on
> x86_64, or when not AARCH64?
>
> 3. why not find the underlying problem; PK is in the same company as
> the device firmware engineers, so it should be possible for them to
> find out why 16-bit access causes the device firmware to hang? We
> drew a blank trying to reach firmware engineers through our CM and
> module maker; perhaps we were not large or noisy enough.
>
> 4. it's not just me; there are others who have reported similar
> problems, so won't re-reverting affect them? They haven't engaged in
> the process as thoroughly, and may not be in the quirks table. You
> also reproduced the problem with different hardware.
>
Hi James,
In my experiment, unaligned-word-access may get wrong values that
are different from the value by byte-access. Actually, it can simply
verified by using 'lspci' to check PCI configuration space.
DBI read 0x70f:
_rtl8821ae_dbi_read:1127 r8 0x34f = 0x0017
_rtl8821ae_dbi_read:1131 r8 0x350 = 0x000c
_rtl8821ae_dbi_read:1136 r16 0x350 = 0xffff
DBI read 0x719:
_rtl8821ae_dbi_read:1127 r8 0x34d = 0x0000
_rtl8821ae_dbi_read:1131 r8 0x34e = 0x0002
_rtl8821ae_dbi_read:1136 r16 0x34e = 0x0200
According to the wrong and original value of 0x70f is 0xff, I think
larger L1 latency 0x70f[5:3] may be helpful. Please help to try
below patch. If it works, quirk table won't be necessary.
PK
diff --git a/rtl8821ae/hw.c b/rtl8821ae/hw.c
index 7d43ba002..e53af06ed 100644
--- a/rtl8821ae/hw.c
+++ b/rtl8821ae/hw.c
@@ -1123,7 +1123,8 @@ static u8 _rtl8821ae_dbi_read(struct rtl_priv *rtlpriv, u16 addr)
}
if (0 == tmp) {
read_addr = REG_DBI_RDATA + addr % 4;
- ret = rtl_read_word(rtlpriv, read_addr);
+
+ ret = rtl_read_byte(rtlpriv, read_addr);
}
return ret;
}
@@ -1165,7 +1166,7 @@ static void _rtl8821ae_enable_aspm_back_door(struct ieee80211_hw *hw)
}
tmp = _rtl8821ae_dbi_read(rtlpriv, 0x70f);
- _rtl8821ae_dbi_write(rtlpriv, 0x70f, tmp | BIT(7));
+ _rtl8821ae_dbi_write(rtlpriv, 0x70f, tmp | BIT(7) | 0x38);
tmp = _rtl8821ae_dbi_read(rtlpriv, 0x719);
_rtl8821ae_dbi_write(rtlpriv, 0x719, tmp | BIT(3) | BIT(4));
next prev parent reply other threads:[~2018-02-02 7:51 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-12 22:09 rtl8821ae keep alive not set, connection lost James Cameron
2017-09-13 15:01 ` Larry Finger
2017-09-13 21:46 ` James Cameron
2017-09-14 0:39 ` Larry Finger
2017-09-14 9:27 ` James Cameron
2017-09-19 9:42 ` James Cameron
2017-09-20 9:36 ` James Cameron
2017-09-20 21:48 ` Larry Finger
2017-09-20 23:22 ` James Cameron
2017-09-21 8:07 ` James Cameron
2017-09-21 14:40 ` Larry Finger
2017-09-22 5:35 ` James Cameron
2018-01-31 17:06 ` Larry Finger
2018-02-01 6:22 ` James Cameron
2018-02-02 7:50 ` Pkshih [this message]
2018-02-02 20:13 ` Larry Finger
2018-02-03 4:45 ` Pkshih
2018-02-04 18:18 ` Larry Finger
2018-02-02 20:27 ` Larry Finger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5B2DA6FDDF928F4E855344EE0A5C39D13BE7A25E@RTITMBSV07.realtek.com.tw \
--to=pkshih@realtek.com \
--cc=Larry.Finger@lwfinger.net \
--cc=linux-wireless@vger.kernel.org \
--cc=quozl@laptop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).