linux-wireless.vger.kernel.org archive mirror
From: Pkshih <pkshih@realtek.com>
To: James Cameron <quozl@laptop.org>,
	Larry Finger <Larry.Finger@lwfinger.net>
Cc: "linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>
Subject: RE: rtl8821ae keep alive not set, connection lost
Date: Fri, 2 Feb 2018 07:50:26 +0000
Message-ID: <5B2DA6FDDF928F4E855344EE0A5C39D13BE7A25E@RTITMBSV07.realtek.com.tw>
In-Reply-To: <20180201062202.GH917@us.netrek.org>


> -----Original Message-----
> From: linux-wireless-owner@vger.kernel.org [mailto:linux-wireless-owner@vger.kernel.org] On Behalf
> Of James Cameron
> Sent: Thursday, February 01, 2018 2:22 PM
> To: Larry Finger
> Cc: linux-wireless@vger.kernel.org; Pkshih
> Subject: Re: rtl8821ae keep alive not set, connection lost
> 
> On Wed, Jan 31, 2018 at 11:06:12AM -0600, Larry Finger wrote:
> > On 09/12/2017 05:09 PM, James Cameron wrote:
> > >Summary: 40b368af4b75 ("rtlwifi: Fix alignment issues") breaks
> > >rtl8821ae keep alive, causing "Connection to AP lost" and deauth,
> > >but why?
> > >
> > >Wireless connection is lost after a few seconds or minutes, on
> > >every OLPC NL3 laptop with rtl8821ae, with any stable kernel after
> > >4.10.1, and any kernel with 40b368af4b75.
> > >
> > >dmesg contains
> > >
> > >   wlp2s0: Connection to AP 2c:b0:5d:a6:86:eb lost
> > >
> > >iw event shows
> > >
> > >   wlp2s0: del station 2c:b0:5d:a6:86:eb
> > >   wlp2s0 (phy #0): deauth 74:c6:3b:09:b5:0d -> 2c:b0:5d:a6:86:eb reason 4: Disassociated due to inactivity
> > >   wlp2s0 (phy #0): disconnected (local request)
> > >
> > >Workaround is to bounce the link, then reconnect;
> > >
> > >   ip link set wlp2s0 down
> > >   ip link set wlp2s0 up
> > >   iw dev wlp2s0 connect qz
> > >
> > >A nearby monitor host captures a deauthentication packet sent by
> > >the device.
> > >
> > >Bisection showed cause is 40b368af4b75 ("rtlwifi: Fix alignment
> > >issues") which changes the width of DBI register read.
> > >
> > >On the face of it, 40b368af4b75 looks correct, especially compared
> > >against same function in rtl8723be.
> > >
> > >I've no idea why reverting fixes the problem.  I'm hoping someone
> > >here might speculate and suggest ways to test.
> > >
> > >As keep alive is set through this path, my guess is that keep alive
> > >is not being set in the device.  Or perhaps reading 16-bits
> > >perturbs another register.  Is there a way to test?
> > >
> > >http://dev.laptop.org/~quozl/z/1drtGD.txt dmesg of 4.13
> > >
> > >http://dev.laptop.org/~quozl/z/1drt7c.txt dmesg with 4.13 and
> > >revert of 40b368af4b75
> >
> > James,
> >
> > I'm afraid we are needing to revisit this problem again. Changing
> > that 8-bit read to a 16-bit version causes an unaligned memory
> > reference in AARCH64, thus we will need to re-revert. To prevent
> > problems on systems such as yours, PK plans to turn off ASPM
> > capability and backdoor in certain platforms that will be listed in
> > a quirks table. Please report the output of 'dmidecode -t system'
> > > for your affected system(s).
> 
> Thanks for letting me know.
> 
> We made three production runs, and I'm waiting to get a hold of the
> dmidecode for two of them.  This may take some weeks; we have to find
> stock and ship it, or we have to ask our contract manufacturer (CM) if
> they have kept data or units.
> 
> I've dmidecode for one production run.
> 
> http://dev.laptop.org/~quozl/z/1eh7JF.txt (my unit nl3-e)
> 
> I've dmidecode for prototypes, but they have clearly been programmed
> badly.  We did not ask our CM for Windows compatibility, so they may
> have had no step to verify the data.  We also went through several
> iterations to get serial numbers assigned, so the data I have does not
> have good provenance.
> 
> http://dev.laptop.org/~quozl/z/1eh7EE.txt (my unit nl3-c)
> http://dev.laptop.org/~quozl/z/1eh7EV.txt (my unit nl3-d)
> http://dev.laptop.org/~quozl/z/1eh7He.txt (my unit nl3-a)
> http://dev.laptop.org/~quozl/z/1eh8DR.txt (my unit nl3-b)
> 
> > We hope you will be able to test any proposed patches.
> 
> Yes, can do.
> 
> I've just tested v4.15.
> 
> However, I'm concerned about your plan to use quirks;
> 
> 1.  turning off ASPM may decrease run time on battery, which if it is
> significant, across several thousand laptops will yield generator fuel
> or solar budget failure; can the power impact be quantified?
> 
> 2.  why not keep ASPM enabled, and use 8-bit when quirked, or on
> x86_64, or when not AARCH64?
> 
> 3.  why not find the underlying problem; PK is in the same company as
> the device firmware engineers, so it should be possible for them to
> find out why 16-bit access causes the device firmware to hang?  We
> drew a blank trying to reach firmware engineers through our CM and
> module maker; perhaps we were not large or noisy enough.
> 
> 4.  it's not just me; there are others who have reported similar
> problems, so won't re-reverting affect them?  They haven't engaged in
> the process as thoroughly, and may not be in the quirks table.  You
> also reproduced the problem with different hardware.
> 

Hi James, 

In my experiments, an unaligned word access may return a value that
differs from the value returned by byte access. This can be verified
simply by using 'lspci' to check the PCI configuration space.

DBI read 0x70f:
_rtl8821ae_dbi_read:1127 r8 0x34f = 0x0017
_rtl8821ae_dbi_read:1131 r8 0x350 = 0x000c
_rtl8821ae_dbi_read:1136 r16 0x350 = 0xffff

DBI read 0x719:
_rtl8821ae_dbi_read:1127 r8 0x34d = 0x0000
_rtl8821ae_dbi_read:1131 r8 0x34e = 0x0002
_rtl8821ae_dbi_read:1136 r16 0x34e = 0x0200


Since the wrong value originally read for 0x70f is 0xff, I think a
larger L1 latency, 0x70f[5:3], may be helpful. Please try the patch
below. If it works, a quirk table won't be necessary.

PK


diff --git a/rtl8821ae/hw.c b/rtl8821ae/hw.c
index 7d43ba002..e53af06ed 100644
--- a/rtl8821ae/hw.c
+++ b/rtl8821ae/hw.c
@@ -1123,7 +1123,8 @@ static u8 _rtl8821ae_dbi_read(struct rtl_priv *rtlpriv, u16 addr)
 	}
 	if (0 == tmp) {
 		read_addr = REG_DBI_RDATA + addr % 4;
-		ret = rtl_read_word(rtlpriv, read_addr);
+
+		ret = rtl_read_byte(rtlpriv, read_addr);
 	}
 	return ret;
 }
@@ -1165,7 +1166,7 @@ static void _rtl8821ae_enable_aspm_back_door(struct ieee80211_hw *hw)
 	}
 
 	tmp = _rtl8821ae_dbi_read(rtlpriv, 0x70f);
-	_rtl8821ae_dbi_write(rtlpriv, 0x70f, tmp | BIT(7));
+	_rtl8821ae_dbi_write(rtlpriv, 0x70f, tmp | BIT(7) | 0x38);
 
 	tmp = _rtl8821ae_dbi_read(rtlpriv, 0x719);
 	_rtl8821ae_dbi_write(rtlpriv, 0x719, tmp | BIT(3) | BIT(4));
