linux-kernel.vger.kernel.org archive mirror
* [PATCH net-next v3 0/3] r8169: Implement dynamic ASPM mechanism for recent 1.0/2.5Gbps Realtek NICs
@ 2021-08-19  5:45 Kai-Heng Feng
  2021-08-19  5:45 ` [PATCH net-next v3 1/3] r8169: Implement dynamic ASPM mechanism Kai-Heng Feng
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Kai-Heng Feng @ 2021-08-19  5:45 UTC (permalink / raw)
  To: hkallweit1, nic_swsd, bhelgaas
  Cc: davem, kuba, netdev, linux-pci, linux-kernel, Kai-Heng Feng

The latest Realtek vendor driver and its Windows driver implement a
feature called "dynamic ASPM", which can improve performance on its
Ethernet NICs.

Heiner Kallweit pointed out that the potential root cause may be that the
buffer is too small for its ASPM exit latency.

So bring dynamic ASPM to r8169 so we can have both good performance
and power saving at the same time.

v2:
https://lore.kernel.org/netdev/20210812155341.817031-1-kai.heng.feng@canonical.com/

v1:
https://lore.kernel.org/netdev/20210803152823.515849-1-kai.heng.feng@canonical.com/

Kai-Heng Feng (3):
  r8169: Implement dynamic ASPM mechanism
  PCI/ASPM: Introduce a new helper to report ASPM support status
  r8169: Enable ASPM for selected NICs

 drivers/net/ethernet/realtek/r8169_main.c | 69 ++++++++++++++++++++---
 drivers/pci/pcie/aspm.c                   | 11 ++++
 include/linux/pci.h                       |  2 +
 3 files changed, 74 insertions(+), 8 deletions(-)

-- 
2.32.0


* [PATCH net-next v3 1/3] r8169: Implement dynamic ASPM mechanism
  2021-08-19  5:45 [PATCH net-next v3 0/3] r8169: Implement dynamic ASPM mechanism for recent 1.0/2.5Gbps Realtek NICs Kai-Heng Feng
@ 2021-08-19  5:45 ` Kai-Heng Feng
  2021-08-19 11:42   ` Bjorn Helgaas
  2021-08-19  5:45 ` [PATCH net-next v3 2/3] PCI/ASPM: Introduce a new helper to report ASPM support status Kai-Heng Feng
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Kai-Heng Feng @ 2021-08-19  5:45 UTC (permalink / raw)
  To: hkallweit1, nic_swsd, bhelgaas
  Cc: davem, kuba, netdev, linux-pci, linux-kernel, Kai-Heng Feng

r8169 NICs on some platforms have abysmal speed when ASPM is enabled.
The same issue can be observed with older vendor drivers.

The issue is, however, solved by the latest vendor driver. It has a new
mechanism that disables r8169's internal ASPM when the NIC traffic has
more than 10 packets, and vice versa. The likely reason is that the
buffer on the chip is too small for its ASPM exit latency.
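
As a rough illustration of the mechanism described above (not the driver
code itself), the decision can be modelled in user space. The sketch
assumes C11 atomics in place of the kernel's atomic_t. struct nic_model,
nic_model_count_packets() and nic_model_aspm_tick() are hypothetical names,
and the register write is reduced to a counter. Only the threshold of 10
comes from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define ASPM_PACKET_THRESHOLD 10	/* same value as in the patch */

/* Hypothetical user-space stand-in for the driver's private state. */
struct nic_model {
	atomic_int aspm_packet_count;	/* bumped from the RX/TX paths */
	bool aspm_enabled;		/* last state "written" to the chip */
	unsigned int hw_writes;		/* number of simulated register writes */
};

/* Called by the simulated RX/TX completion paths. */
static void nic_model_count_packets(struct nic_model *nic, int n)
{
	assert(nic != NULL && n >= 0);
	atomic_fetch_add(&nic->aspm_packet_count, n);
}

/* One polling tick: read-and-reset the counter, toggle only on a change. */
static bool nic_model_aspm_tick(struct nic_model *nic)
{
	int count;
	bool enable;

	assert(nic != NULL);
	count = atomic_exchange(&nic->aspm_packet_count, 0);
	enable = count <= ASPM_PACKET_THRESHOLD;

	if (nic->aspm_enabled != enable) {
		nic->aspm_enabled = enable;
		nic->hw_writes++;	/* stands in for the Config2/Config5 update */
	}
	return nic->aspm_enabled;
}
```

Each call to nic_model_aspm_tick() stands for one run of the delayed work.
Exchanging the counter for zero makes read-and-reset a single atomic step,
so packets counted concurrently are not lost between the read and the
reset.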

Realtek confirmed that all their PCIe LAN NICs, r8106, r8168 and r8125
use dynamic ASPM under Windows. So implement the same mechanism here to
resolve the issue.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
v3:
 - Use msecs_to_jiffies() for delay time
 - Use atomic_t instead of mutex for bh
 - Mention the buffer size and ASPM exit latency in commit message

v2: 
 - Use delayed_work instead of timer_list to avoid interrupt context
 - Use mutex to serialize packet counter read/write
 - Wording change

 drivers/net/ethernet/realtek/r8169_main.c | 44 ++++++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 7a69b468584a2..3359509c1c351 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -624,6 +624,10 @@ struct rtl8169_private {
 
 	unsigned supports_gmii:1;
 	unsigned aspm_manageable:1;
+	unsigned rtl_aspm_enabled:1;
+	struct delayed_work aspm_toggle;
+	atomic_t aspm_packet_count;
+
 	dma_addr_t counters_phys_addr;
 	struct rtl8169_counters *counters;
 	struct rtl8169_tc_offsets tc_offset;
@@ -2665,8 +2669,13 @@ static void rtl_pcie_state_l2l3_disable(struct rtl8169_private *tp)
 
 static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable)
 {
+	if (!tp->aspm_manageable && enable)
+		return;
+
+	tp->rtl_aspm_enabled = enable;
+
 	/* Don't enable ASPM in the chip if OS can't control ASPM */
-	if (enable && tp->aspm_manageable) {
+	if (enable) {
 		RTL_W8(tp, Config5, RTL_R8(tp, Config5) | ASPM_en);
 		RTL_W8(tp, Config2, RTL_R8(tp, Config2) | ClkReqEn);
 	} else {
@@ -4415,6 +4424,7 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
 
 	dirty_tx = tp->dirty_tx;
 
+	atomic_add(tp->cur_tx - dirty_tx, &tp->aspm_packet_count);
 	while (READ_ONCE(tp->cur_tx) != dirty_tx) {
 		unsigned int entry = dirty_tx % NUM_TX_DESC;
 		u32 status;
@@ -4559,6 +4569,8 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
 		rtl8169_mark_to_asic(desc);
 	}
 
+	atomic_add(count, &tp->aspm_packet_count);
+
 	return count;
 }
 
@@ -4666,8 +4678,32 @@ static int r8169_phy_connect(struct rtl8169_private *tp)
 	return 0;
 }
 
+#define ASPM_PACKET_THRESHOLD 10
+#define ASPM_TOGGLE_INTERVAL 1000
+
+static void rtl8169_aspm_toggle(struct work_struct *work)
+{
+	struct rtl8169_private *tp = container_of(work, struct rtl8169_private,
+						  aspm_toggle.work);
+	int packet_count;
+	bool enable;
+
+	packet_count = atomic_xchg(&tp->aspm_packet_count, 0);
+	enable = packet_count <= ASPM_PACKET_THRESHOLD;
+
+	if (tp->rtl_aspm_enabled != enable) {
+		rtl_unlock_config_regs(tp);
+		rtl_hw_aspm_clkreq_enable(tp, enable);
+		rtl_lock_config_regs(tp);
+	}
+
+	schedule_delayed_work(&tp->aspm_toggle, msecs_to_jiffies(ASPM_TOGGLE_INTERVAL));
+}
+
 static void rtl8169_down(struct rtl8169_private *tp)
 {
+	cancel_delayed_work_sync(&tp->aspm_toggle);
+
 	/* Clear all task flags */
 	bitmap_zero(tp->wk.flags, RTL_FLAG_MAX);
 
@@ -4694,6 +4730,8 @@ static void rtl8169_up(struct rtl8169_private *tp)
 	rtl_reset_work(tp);
 
 	phy_start(tp->phydev);
+
+	schedule_delayed_work(&tp->aspm_toggle, msecs_to_jiffies(ASPM_TOGGLE_INTERVAL));
 }
 
 static int rtl8169_close(struct net_device *dev)
@@ -5354,6 +5392,10 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	INIT_WORK(&tp->wk.work, rtl_task);
 
+	INIT_DELAYED_WORK(&tp->aspm_toggle, rtl8169_aspm_toggle);
+
+	atomic_set(&tp->aspm_packet_count, 0);
+
 	rtl_init_mac_address(tp);
 
 	dev->ethtool_ops = &rtl8169_ethtool_ops;
-- 
2.32.0


* [PATCH net-next v3 2/3] PCI/ASPM: Introduce a new helper to report ASPM support status
  2021-08-19  5:45 [PATCH net-next v3 0/3] r8169: Implement dynamic ASPM mechanism for recent 1.0/2.5Gbps Realtek NICs Kai-Heng Feng
  2021-08-19  5:45 ` [PATCH net-next v3 1/3] r8169: Implement dynamic ASPM mechanism Kai-Heng Feng
@ 2021-08-19  5:45 ` Kai-Heng Feng
  2021-08-19  5:45 ` [PATCH net-next v3 3/3] r8169: Enable ASPM for selected NICs Kai-Heng Feng
  2021-08-19  6:08 ` [PATCH net-next v3 0/3] r8169: Implement dynamic ASPM mechanism for recent 1.0/2.5Gbps Realtek NICs Heiner Kallweit
  3 siblings, 0 replies; 16+ messages in thread
From: Kai-Heng Feng @ 2021-08-19  5:45 UTC (permalink / raw)
  To: hkallweit1, nic_swsd, bhelgaas
  Cc: davem, kuba, netdev, linux-pci, linux-kernel, Kai-Heng Feng,
	Saheed O. Bolarinwa, Vidya Sagar, Logan Gunthorpe,
	Krzysztof Wilczyński

Introduce a new helper, pcie_aspm_supported(), to report ASPM support
status.

Its user is introduced in the next patch.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
v3:
 - This is a new patch

 drivers/pci/pcie/aspm.c | 11 +++++++++++
 include/linux/pci.h     |  2 ++
 2 files changed, 13 insertions(+)

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 013a47f587cea..eeea6a04ab0cf 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -1201,6 +1201,17 @@ bool pcie_aspm_enabled(struct pci_dev *pdev)
 }
 EXPORT_SYMBOL_GPL(pcie_aspm_enabled);
 
+bool pcie_aspm_supported(struct pci_dev *pdev)
+{
+	struct pcie_link_state *link = pcie_aspm_get_link(pdev);
+
+	if (!link)
+		return false;
+
+	return link->aspm_support;
+}
+EXPORT_SYMBOL_GPL(pcie_aspm_supported);
+
 static ssize_t aspm_attr_show_common(struct device *dev,
 				     struct device_attribute *attr,
 				     char *buf, u8 state)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 540b377ca8f61..b7b71982f2405 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1602,6 +1602,7 @@ int pci_disable_link_state_locked(struct pci_dev *pdev, int state);
 void pcie_no_aspm(void);
 bool pcie_aspm_support_enabled(void);
 bool pcie_aspm_enabled(struct pci_dev *pdev);
+bool pcie_aspm_supported(struct pci_dev *pdev);
 #else
 static inline int pci_disable_link_state(struct pci_dev *pdev, int state)
 { return 0; }
@@ -1610,6 +1611,7 @@ static inline int pci_disable_link_state_locked(struct pci_dev *pdev, int state)
 static inline void pcie_no_aspm(void) { }
 static inline bool pcie_aspm_support_enabled(void) { return false; }
 static inline bool pcie_aspm_enabled(struct pci_dev *pdev) { return false; }
+static inline bool pcie_aspm_supported(struct pci_dev *pdev) { return false; }
 #endif
 
 #ifdef CONFIG_PCIEAER
-- 
2.32.0


* [PATCH net-next v3 3/3] r8169: Enable ASPM for selected NICs
  2021-08-19  5:45 [PATCH net-next v3 0/3] r8169: Implement dynamic ASPM mechanism for recent 1.0/2.5Gbps Realtek NICs Kai-Heng Feng
  2021-08-19  5:45 ` [PATCH net-next v3 1/3] r8169: Implement dynamic ASPM mechanism Kai-Heng Feng
  2021-08-19  5:45 ` [PATCH net-next v3 2/3] PCI/ASPM: Introduce a new helper to report ASPM support status Kai-Heng Feng
@ 2021-08-19  5:45 ` Kai-Heng Feng
  2021-08-19  6:02   ` Heiner Kallweit
  2021-08-19  6:08 ` [PATCH net-next v3 0/3] r8169: Implement dynamic ASPM mechanism for recent 1.0/2.5Gbps Realtek NICs Heiner Kallweit
  3 siblings, 1 reply; 16+ messages in thread
From: Kai-Heng Feng @ 2021-08-19  5:45 UTC (permalink / raw)
  To: hkallweit1, nic_swsd, bhelgaas
  Cc: davem, kuba, netdev, linux-pci, linux-kernel, Kai-Heng Feng

The latest vendor driver enables ASPM for more recent r8168 NICs, so
disable ASPM on older chips and enable ASPM for the rest.

Rename aspm_manageable to pcie_aspm_manageable to indicate it's ASPM
from PCIe, and use rtl_aspm_supported for Realtek NIC's internal ASPM
function.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
v3:
 - Use pcie_aspm_supported() to retrieve ASPM support status
 - Use whitelist for r8169 internal ASPM status

v2:
 - No change

 drivers/net/ethernet/realtek/r8169_main.c | 27 ++++++++++++++++-------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 3359509c1c351..88e015d93e490 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -623,7 +623,8 @@ struct rtl8169_private {
 	} wk;
 
 	unsigned supports_gmii:1;
-	unsigned aspm_manageable:1;
+	unsigned pcie_aspm_manageable:1;
+	unsigned rtl_aspm_supported:1;
 	unsigned rtl_aspm_enabled:1;
 	struct delayed_work aspm_toggle;
 	atomic_t aspm_packet_count;
@@ -702,6 +703,20 @@ static bool rtl_is_8168evl_up(struct rtl8169_private *tp)
 	       tp->mac_version <= RTL_GIGA_MAC_VER_53;
 }
 
+static int rtl_supports_aspm(struct rtl8169_private *tp)
+{
+	switch (tp->mac_version) {
+	case RTL_GIGA_MAC_VER_02 ... RTL_GIGA_MAC_VER_31:
+	case RTL_GIGA_MAC_VER_37:
+	case RTL_GIGA_MAC_VER_39:
+	case RTL_GIGA_MAC_VER_43:
+	case RTL_GIGA_MAC_VER_47:
+		return 0;
+	default:
+		return 1;
+	}
+}
+
 static bool rtl_supports_eee(struct rtl8169_private *tp)
 {
 	return tp->mac_version >= RTL_GIGA_MAC_VER_34 &&
@@ -2669,7 +2684,7 @@ static void rtl_pcie_state_l2l3_disable(struct rtl8169_private *tp)
 
 static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable)
 {
-	if (!tp->aspm_manageable && enable)
+	if (!(tp->pcie_aspm_manageable && tp->rtl_aspm_supported) && enable)
 		return;
 
 	tp->rtl_aspm_enabled = enable;
@@ -5319,12 +5334,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (rc)
 		return rc;
 
-	/* Disable ASPM completely as that cause random device stop working
-	 * problems as well as full system hangs for some PCIe devices users.
-	 */
-	rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S |
-					  PCIE_LINK_STATE_L1);
-	tp->aspm_manageable = !rc;
+	tp->pcie_aspm_manageable = pcie_aspm_supported(pdev);
+	tp->rtl_aspm_supported = rtl_supports_aspm(tp);
 
 	/* enable device (incl. PCI PM wakeup and hotplug setup) */
 	rc = pcim_enable_device(pdev);
-- 
2.32.0


* Re: [PATCH net-next v3 3/3] r8169: Enable ASPM for selected NICs
  2021-08-19  5:45 ` [PATCH net-next v3 3/3] r8169: Enable ASPM for selected NICs Kai-Heng Feng
@ 2021-08-19  6:02   ` Heiner Kallweit
  2021-08-19  6:50     ` Kai-Heng Feng
  0 siblings, 1 reply; 16+ messages in thread
From: Heiner Kallweit @ 2021-08-19  6:02 UTC (permalink / raw)
  To: Kai-Heng Feng, nic_swsd, bhelgaas
  Cc: davem, kuba, netdev, linux-pci, linux-kernel

On 19.08.2021 07:45, Kai-Heng Feng wrote:
> The latest vendor driver enables ASPM for more recent r8168 NICs, so
> disable ASPM on older chips and enable ASPM for the rest.
> 
> Rename aspm_manageable to pcie_aspm_manageable to indicate it's ASPM
> from PCIe, and use rtl_aspm_supported for Realtek NIC's internal ASPM
> function.
> 
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> ---
> v3:
>  - Use pcie_aspm_supported() to retrieve ASPM support status
>  - Use whitelist for r8169 internal ASPM status
> 
> v2:
>  - No change
> 
>  drivers/net/ethernet/realtek/r8169_main.c | 27 ++++++++++++++++-------
>  1 file changed, 19 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index 3359509c1c351..88e015d93e490 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -623,7 +623,8 @@ struct rtl8169_private {
>  	} wk;
>  
>  	unsigned supports_gmii:1;
> -	unsigned aspm_manageable:1;
> +	unsigned pcie_aspm_manageable:1;
> +	unsigned rtl_aspm_supported:1;
>  	unsigned rtl_aspm_enabled:1;
>  	struct delayed_work aspm_toggle;
>  	atomic_t aspm_packet_count;
> @@ -702,6 +703,20 @@ static bool rtl_is_8168evl_up(struct rtl8169_private *tp)
>  	       tp->mac_version <= RTL_GIGA_MAC_VER_53;
>  }
>  
> +static int rtl_supports_aspm(struct rtl8169_private *tp)
> +{
> +	switch (tp->mac_version) {
> +	case RTL_GIGA_MAC_VER_02 ... RTL_GIGA_MAC_VER_31:
> +	case RTL_GIGA_MAC_VER_37:
> +	case RTL_GIGA_MAC_VER_39:
> +	case RTL_GIGA_MAC_VER_43:
> +	case RTL_GIGA_MAC_VER_47:
> +		return 0;
> +	default:
> +		return 1;
> +	}
> +}
> +
>  static bool rtl_supports_eee(struct rtl8169_private *tp)
>  {
>  	return tp->mac_version >= RTL_GIGA_MAC_VER_34 &&
> @@ -2669,7 +2684,7 @@ static void rtl_pcie_state_l2l3_disable(struct rtl8169_private *tp)
>  
>  static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable)
>  {
> -	if (!tp->aspm_manageable && enable)
> +	if (!(tp->pcie_aspm_manageable && tp->rtl_aspm_supported) && enable)
>  		return;
>  
>  	tp->rtl_aspm_enabled = enable;
> @@ -5319,12 +5334,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	if (rc)
>  		return rc;
>  
> -	/* Disable ASPM completely as that cause random device stop working
> -	 * problems as well as full system hangs for some PCIe devices users.
> -	 */
> -	rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S |
> -					  PCIE_LINK_STATE_L1);
> -	tp->aspm_manageable = !rc;
> +	tp->pcie_aspm_manageable = pcie_aspm_supported(pdev);

That's not what I meant, and it's also not correct.

> +	tp->rtl_aspm_supported = rtl_supports_aspm(tp);
>  
>  	/* enable device (incl. PCI PM wakeup and hotplug setup) */
>  	rc = pcim_enable_device(pdev);
> 


* Re: [PATCH net-next v3 0/3] r8169: Implement dynamic ASPM mechanism for recent 1.0/2.5Gbps Realtek NICs
  2021-08-19  5:45 [PATCH net-next v3 0/3] r8169: Implement dynamic ASPM mechanism for recent 1.0/2.5Gbps Realtek NICs Kai-Heng Feng
                   ` (2 preceding siblings ...)
  2021-08-19  5:45 ` [PATCH net-next v3 3/3] r8169: Enable ASPM for selected NICs Kai-Heng Feng
@ 2021-08-19  6:08 ` Heiner Kallweit
  2021-08-19  6:19   ` Kai-Heng Feng
  3 siblings, 1 reply; 16+ messages in thread
From: Heiner Kallweit @ 2021-08-19  6:08 UTC (permalink / raw)
  To: Kai-Heng Feng, nic_swsd, bhelgaas
  Cc: davem, kuba, netdev, linux-pci, linux-kernel

On 19.08.2021 07:45, Kai-Heng Feng wrote:
> The latest Realtek vendor driver and its Windows driver implement a
> feature called "dynamic ASPM", which can improve performance on its
> Ethernet NICs.
> 
This statement would need proof. Which performance improvement
did you measure? And why should performance improve?
On mainline, ASPM is disabled, so I don't think we can see
a performance improvement. Rather the opposite in the scenario
I described: if traffic starts and there's congestion in the chip,
then it may take a second until ASPM gets disabled. This may hurt
performance.
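
To put a number on that concern, here is a small sketch of the detection
delay. It assumes the worker samples the counter at exact multiples of the
1000 ms interval from patch 1/3, and that the burst exceeds the 10-packet
threshold before the next sample. aspm_disable_delay_ms() is a hypothetical
helper, not part of the series.

```c
#define ASPM_TOGGLE_INTERVAL_MS 1000	/* ASPM_TOGGLE_INTERVAL in patch 1/3 */

/*
 * Hypothetical helper: a burst that starts at burst_start_ms is first
 * noticed at the next sample point, so ASPM stays enabled for the
 * returned number of milliseconds after the burst begins.
 */
static unsigned int aspm_disable_delay_ms(unsigned int burst_start_ms)
{
	unsigned int offset = burst_start_ms % ASPM_TOGGLE_INTERVAL_MS;

	return ASPM_TOGGLE_INTERVAL_MS - offset;
}
```

Under these assumptions a burst starting just after a sample waits almost
the full interval, while one starting just before a sample waits only a few
milliseconds.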

> Heiner Kallweit pointed out that the potential root cause may be that the
> buffer is too small for its ASPM exit latency.
> 
> So bring dynamic ASPM to r8169 so we can have both good performance
> and power saving at the same time.
> 
> v2:
> https://lore.kernel.org/netdev/20210812155341.817031-1-kai.heng.feng@canonical.com/
> 
> v1:
> https://lore.kernel.org/netdev/20210803152823.515849-1-kai.heng.feng@canonical.com/
> 
> Kai-Heng Feng (3):
>   r8169: Implement dynamic ASPM mechanism
>   PCI/ASPM: Introduce a new helper to report ASPM support status
>   r8169: Enable ASPM for selected NICs
> 
>  drivers/net/ethernet/realtek/r8169_main.c | 69 ++++++++++++++++++++---
>  drivers/pci/pcie/aspm.c                   | 11 ++++
>  include/linux/pci.h                       |  2 +
>  3 files changed, 74 insertions(+), 8 deletions(-)
> 
This series is meant for your downstream kernel only, and is posted here to
get feedback. It should therefore be annotated as RFC so that it doesn't get
applied accidentally.

* Re: [PATCH net-next v3 0/3] r8169: Implement dynamic ASPM mechanism for recent 1.0/2.5Gbps Realtek NICs
  2021-08-19  6:08 ` [PATCH net-next v3 0/3] r8169: Implement dynamic ASPM mechanism for recent 1.0/2.5Gbps Realtek NICs Heiner Kallweit
@ 2021-08-19  6:19   ` Kai-Heng Feng
  0 siblings, 0 replies; 16+ messages in thread
From: Kai-Heng Feng @ 2021-08-19  6:19 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: nic_swsd, Bjorn Helgaas, David Miller, Jakub Kicinski,
	Linux Netdev List, Linux PCI, LKML

On Thu, Aug 19, 2021 at 2:08 PM Heiner Kallweit <hkallweit1@gmail.com> wrote:
>
> On 19.08.2021 07:45, Kai-Heng Feng wrote:
> > The latest Realtek vendor driver and its Windows driver implement a
> > feature called "dynamic ASPM", which can improve performance on its
> > Ethernet NICs.
> >
> This statement would need proof. Which performance improvement
> did you measure? And why should performance improve?

It refers to what patch 1/3 fixes...

> On mainline, ASPM is disabled, so I don't think we can see
> a performance improvement. Rather the opposite in the scenario
> I described: if traffic starts and there's congestion in the chip,
> then it may take a second until ASPM gets disabled. This may hurt
> performance.

OK. We will know whether the 1-second interval is adequate once it's deployed in the wild.

>
> > Heiner Kallweit pointed out that the potential root cause may be that the
> > buffer is too small for its ASPM exit latency.
> >
> > So bring dynamic ASPM to r8169 so we can have both good performance
> > and power saving at the same time.
> >
> > v2:
> > https://lore.kernel.org/netdev/20210812155341.817031-1-kai.heng.feng@canonical.com/
> >
> > v1:
> > https://lore.kernel.org/netdev/20210803152823.515849-1-kai.heng.feng@canonical.com/
> >
> > Kai-Heng Feng (3):
> >   r8169: Implement dynamic ASPM mechanism
> >   PCI/ASPM: Introduce a new helper to report ASPM support status
> >   r8169: Enable ASPM for selected NICs
> >
> >  drivers/net/ethernet/realtek/r8169_main.c | 69 ++++++++++++++++++++---
> >  drivers/pci/pcie/aspm.c                   | 11 ++++
> >  include/linux/pci.h                       |  2 +
> >  3 files changed, 74 insertions(+), 8 deletions(-)
> >
> This series is meant for your downstream kernel only, and is posted here to
> get feedback. It should therefore be annotated as RFC so that it doesn't get
> applied accidentally.

Noted. Will annotate in next version.

Kai-Heng

* Re: [PATCH net-next v3 3/3] r8169: Enable ASPM for selected NICs
  2021-08-19  6:02   ` Heiner Kallweit
@ 2021-08-19  6:50     ` Kai-Heng Feng
  2021-08-19  9:56       ` Heiner Kallweit
  0 siblings, 1 reply; 16+ messages in thread
From: Kai-Heng Feng @ 2021-08-19  6:50 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: nic_swsd, Bjorn Helgaas, David Miller, Jakub Kicinski,
	Linux Netdev List, Linux PCI, LKML

On Thu, Aug 19, 2021 at 2:08 PM Heiner Kallweit <hkallweit1@gmail.com> wrote:
>
> On 19.08.2021 07:45, Kai-Heng Feng wrote:
> > The latest vendor driver enables ASPM for more recent r8168 NICs, so
> > disable ASPM on older chips and enable ASPM for the rest.
> >
> > Rename aspm_manageable to pcie_aspm_manageable to indicate it's ASPM
> > from PCIe, and use rtl_aspm_supported for Realtek NIC's internal ASPM
> > function.
> >
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > ---
> > v3:
> >  - Use pcie_aspm_supported() to retrieve ASPM support status
> >  - Use whitelist for r8169 internal ASPM status
> >
> > v2:
> >  - No change
> >
> >  drivers/net/ethernet/realtek/r8169_main.c | 27 ++++++++++++++++-------
> >  1 file changed, 19 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> > index 3359509c1c351..88e015d93e490 100644
> > --- a/drivers/net/ethernet/realtek/r8169_main.c
> > +++ b/drivers/net/ethernet/realtek/r8169_main.c
> > @@ -623,7 +623,8 @@ struct rtl8169_private {
> >       } wk;
> >
> >       unsigned supports_gmii:1;
> > -     unsigned aspm_manageable:1;
> > +     unsigned pcie_aspm_manageable:1;
> > +     unsigned rtl_aspm_supported:1;
> >       unsigned rtl_aspm_enabled:1;
> >       struct delayed_work aspm_toggle;
> >       atomic_t aspm_packet_count;
> > @@ -702,6 +703,20 @@ static bool rtl_is_8168evl_up(struct rtl8169_private *tp)
> >              tp->mac_version <= RTL_GIGA_MAC_VER_53;
> >  }
> >
> > +static int rtl_supports_aspm(struct rtl8169_private *tp)
> > +{
> > +     switch (tp->mac_version) {
> > +     case RTL_GIGA_MAC_VER_02 ... RTL_GIGA_MAC_VER_31:
> > +     case RTL_GIGA_MAC_VER_37:
> > +     case RTL_GIGA_MAC_VER_39:
> > +     case RTL_GIGA_MAC_VER_43:
> > +     case RTL_GIGA_MAC_VER_47:
> > +             return 0;
> > +     default:
> > +             return 1;
> > +     }
> > +}
> > +
> >  static bool rtl_supports_eee(struct rtl8169_private *tp)
> >  {
> >       return tp->mac_version >= RTL_GIGA_MAC_VER_34 &&
> > @@ -2669,7 +2684,7 @@ static void rtl_pcie_state_l2l3_disable(struct rtl8169_private *tp)
> >
> >  static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable)
> >  {
> > -     if (!tp->aspm_manageable && enable)
> > +     if (!(tp->pcie_aspm_manageable && tp->rtl_aspm_supported) && enable)
> >               return;
> >
> >       tp->rtl_aspm_enabled = enable;
> > @@ -5319,12 +5334,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> >       if (rc)
> >               return rc;
> >
> > -     /* Disable ASPM completely as that cause random device stop working
> > -      * problems as well as full system hangs for some PCIe devices users.
> > -      */
> > -     rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S |
> > -                                       PCIE_LINK_STATE_L1);
> > -     tp->aspm_manageable = !rc;
> > +     tp->pcie_aspm_manageable = pcie_aspm_supported(pdev);
>
> That's not what I meant, and it's also not correct.

To avoid making another mistake in the next series, let me ask more clearly:
did you mean that I should check both link->aspm_enabled and link->aspm_support?

>
> > +     tp->rtl_aspm_supported = rtl_supports_aspm(tp);

Is rtl_supports_aspm() what you expect for the whitelist?
And what else am I missing?

Kai-Heng

> >
> >       /* enable device (incl. PCI PM wakeup and hotplug setup) */
> >       rc = pcim_enable_device(pdev);
> >
>

* Re: [PATCH net-next v3 3/3] r8169: Enable ASPM for selected NICs
  2021-08-19  6:50     ` Kai-Heng Feng
@ 2021-08-19  9:56       ` Heiner Kallweit
  2021-08-27  6:23         ` Kai-Heng Feng
  0 siblings, 1 reply; 16+ messages in thread
From: Heiner Kallweit @ 2021-08-19  9:56 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: nic_swsd, Bjorn Helgaas, David Miller, Jakub Kicinski,
	Linux Netdev List, Linux PCI, LKML

On 19.08.2021 08:50, Kai-Heng Feng wrote:
> On Thu, Aug 19, 2021 at 2:08 PM Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>
>> On 19.08.2021 07:45, Kai-Heng Feng wrote:
>>> The latest vendor driver enables ASPM for more recent r8168 NICs, so
>>> disable ASPM on older chips and enable ASPM for the rest.
>>>
>>> Rename aspm_manageable to pcie_aspm_manageable to indicate it's ASPM
>>> from PCIe, and use rtl_aspm_supported for Realtek NIC's internal ASPM
>>> function.
>>>
>>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>>> ---
>>> v3:
>>>  - Use pcie_aspm_supported() to retrieve ASPM support status
>>>  - Use whitelist for r8169 internal ASPM status
>>>
>>> v2:
>>>  - No change
>>>
>>>  drivers/net/ethernet/realtek/r8169_main.c | 27 ++++++++++++++++-------
>>>  1 file changed, 19 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
>>> index 3359509c1c351..88e015d93e490 100644
>>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>>> @@ -623,7 +623,8 @@ struct rtl8169_private {
>>>       } wk;
>>>
>>>       unsigned supports_gmii:1;
>>> -     unsigned aspm_manageable:1;
>>> +     unsigned pcie_aspm_manageable:1;
>>> +     unsigned rtl_aspm_supported:1;
>>>       unsigned rtl_aspm_enabled:1;
>>>       struct delayed_work aspm_toggle;
>>>       atomic_t aspm_packet_count;
>>> @@ -702,6 +703,20 @@ static bool rtl_is_8168evl_up(struct rtl8169_private *tp)
>>>              tp->mac_version <= RTL_GIGA_MAC_VER_53;
>>>  }
>>>
>>> +static int rtl_supports_aspm(struct rtl8169_private *tp)
>>> +{
>>> +     switch (tp->mac_version) {
>>> +     case RTL_GIGA_MAC_VER_02 ... RTL_GIGA_MAC_VER_31:
>>> +     case RTL_GIGA_MAC_VER_37:
>>> +     case RTL_GIGA_MAC_VER_39:
>>> +     case RTL_GIGA_MAC_VER_43:
>>> +     case RTL_GIGA_MAC_VER_47:
>>> +             return 0;
>>> +     default:
>>> +             return 1;
>>> +     }
>>> +}
>>> +
>>>  static bool rtl_supports_eee(struct rtl8169_private *tp)
>>>  {
>>>       return tp->mac_version >= RTL_GIGA_MAC_VER_34 &&
>>> @@ -2669,7 +2684,7 @@ static void rtl_pcie_state_l2l3_disable(struct rtl8169_private *tp)
>>>
>>>  static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable)
>>>  {
>>> -     if (!tp->aspm_manageable && enable)
>>> +     if (!(tp->pcie_aspm_manageable && tp->rtl_aspm_supported) && enable)
>>>               return;
>>>
>>>       tp->rtl_aspm_enabled = enable;
>>> @@ -5319,12 +5334,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>>>       if (rc)
>>>               return rc;
>>>
>>> -     /* Disable ASPM completely as that cause random device stop working
>>> -      * problems as well as full system hangs for some PCIe devices users.
>>> -      */
>>> -     rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S |
>>> -                                       PCIE_LINK_STATE_L1);
>>> -     tp->aspm_manageable = !rc;
>>> +     tp->pcie_aspm_manageable = pcie_aspm_supported(pdev);
>>
>> That's not what I meant, and it's also not correct.
> 
> To avoid making another mistake in the next series, let me ask more clearly:
> did you mean that I should check both link->aspm_enabled and link->aspm_support?
> 
aspm_enabled can be changed by the user at any time.
pci_disable_link_state() also considers whether the BIOS forbids the OS
from touching ASPM; see aspm_disabled.
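
The difference between the two checks can be sketched with a simplified
user-space model. Everything here is hypothetical: struct link_model and the
model_* functions only mimic the shape of the PCI core, and the -EINVAL and
-EPERM return values are my assumption about how pci_disable_link_state()
reports a missing link or missing OS control.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one PCIe link. */
struct link_model {
	bool aspm_support;	/* link partners advertise ASPM capability */
};

/* Models the global aspm_disabled flag: firmware keeps ASPM control. */
static bool model_aspm_disabled;

/* Approach in patch 3/3: looks at capability only, ignores policy. */
static bool model_aspm_supported(const struct link_model *link)
{
	return link && link->aspm_support;
}

/* Mainline approach: ask the core to disable ASPM and check the result. */
static int model_disable_link_state(const struct link_model *link)
{
	if (!link)
		return -EINVAL;
	if (model_aspm_disabled)
		return -EPERM;	/* OS is not allowed to touch ASPM */
	return 0;
}

/* Mirrors "tp->aspm_manageable = !rc" from the removed mainline lines. */
static bool model_aspm_manageable(const struct link_model *link)
{
	return model_disable_link_state(link) == 0;
}
```

In this model a link can report ASPM support while the OS still has no
permission to manage it, which is the case the capability-only check
misses.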

>>
>>> +     tp->rtl_aspm_supported = rtl_supports_aspm(tp);
> 
> Is rtl_supports_aspm() what you expect for the whitelist?
> And what else am I missing?
> 
I meant use rtl_supports_aspm() to check when ASPM is relevant at all,
and in addition use a blacklist for chip versions where ASPM is
completely unusable.

> Kai-Heng
> 
>>>
>>>       /* enable device (incl. PCI PM wakeup and hotplug setup) */
>>>       rc = pcim_enable_device(pdev);
>>>
>>


* Re: [PATCH net-next v3 1/3] r8169: Implement dynamic ASPM mechanism
  2021-08-19  5:45 ` [PATCH net-next v3 1/3] r8169: Implement dynamic ASPM mechanism Kai-Heng Feng
@ 2021-08-19 11:42   ` Bjorn Helgaas
  2021-08-19 15:45     ` Heiner Kallweit
  0 siblings, 1 reply; 16+ messages in thread
From: Bjorn Helgaas @ 2021-08-19 11:42 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: hkallweit1, nic_swsd, bhelgaas, davem, kuba, netdev, linux-pci,
	linux-kernel

On Thu, Aug 19, 2021 at 01:45:40PM +0800, Kai-Heng Feng wrote:
> r8169 NICs on some platforms have abysmal speed when ASPM is enabled.
> The same issue can be observed with older vendor drivers.

On some platforms but not on others?  Maybe the PCIe topology is a
factor?  Do you have bug reports with data, e.g., "lspci -vv" output?

> The issue is, however, solved by the latest vendor driver. It has a new
> mechanism that disables r8169's internal ASPM when the NIC traffic has
> more than 10 packets, and vice versa.

Presumably there's a time interval related to the 10 packets?  For
example, do you want to disable ASPM if 10 packets are received (or
sent?) in a certain amount of time?

> The likely reason is that the buffer on the chip is too small for its
> ASPM exit latency.

Maybe this means the chip advertises incorrect exit latencies?  If so,
maybe a quirk could override that?

> Realtek confirmed that all their PCIe LAN NICs, r8106, r8168 and r8125
> use dynamic ASPM under Windows. So implement the same mechanism here to
> resolve the issue.

What exactly is "dynamic ASPM"?

I see Heiner's comment about this being intended only for a downstream
kernel.  But why?

> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> ---
> v3:
>  - Use msecs_to_jiffies() for delay time
>  - Use atomic_t instead of mutex for bh
>  - Mention the buffer size and ASPM exit latency in commit message
> 
> v2: 
>  - Use delayed_work instead of timer_list to avoid interrupt context
>  - Use mutex to serialize packet counter read/write
>  - Wording change
> 
>  drivers/net/ethernet/realtek/r8169_main.c | 44 ++++++++++++++++++++++-
>  1 file changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index 7a69b468584a2..3359509c1c351 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -624,6 +624,10 @@ struct rtl8169_private {
>  
>  	unsigned supports_gmii:1;
>  	unsigned aspm_manageable:1;
> +	unsigned rtl_aspm_enabled:1;
> +	struct delayed_work aspm_toggle;
> +	atomic_t aspm_packet_count;
> +
>  	dma_addr_t counters_phys_addr;
>  	struct rtl8169_counters *counters;
>  	struct rtl8169_tc_offsets tc_offset;
> @@ -2665,8 +2669,13 @@ static void rtl_pcie_state_l2l3_disable(struct rtl8169_private *tp)
>  
>  static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable)
>  {
> +	if (!tp->aspm_manageable && enable)
> +		return;
> +
> +	tp->rtl_aspm_enabled = enable;
> +
>  	/* Don't enable ASPM in the chip if OS can't control ASPM */
> -	if (enable && tp->aspm_manageable) {
> +	if (enable) {
>  		RTL_W8(tp, Config5, RTL_R8(tp, Config5) | ASPM_en);
>  		RTL_W8(tp, Config2, RTL_R8(tp, Config2) | ClkReqEn);
>  	} else {
> @@ -4415,6 +4424,7 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
>  
>  	dirty_tx = tp->dirty_tx;
>  
> +	atomic_add(tp->cur_tx - dirty_tx, &tp->aspm_packet_count);
>  	while (READ_ONCE(tp->cur_tx) != dirty_tx) {
>  		unsigned int entry = dirty_tx % NUM_TX_DESC;
>  		u32 status;
> @@ -4559,6 +4569,8 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
>  		rtl8169_mark_to_asic(desc);
>  	}
>  
> +	atomic_add(count, &tp->aspm_packet_count);
> +
>  	return count;
>  }
>  
> @@ -4666,8 +4678,32 @@ static int r8169_phy_connect(struct rtl8169_private *tp)
>  	return 0;
>  }
>  
> +#define ASPM_PACKET_THRESHOLD 10
> +#define ASPM_TOGGLE_INTERVAL 1000
> +
> +static void rtl8169_aspm_toggle(struct work_struct *work)
> +{
> +	struct rtl8169_private *tp = container_of(work, struct rtl8169_private,
> +						  aspm_toggle.work);
> +	int packet_count;
> +	bool enable;
> +
> +	packet_count = atomic_xchg(&tp->aspm_packet_count, 0);
> +	enable = packet_count <= ASPM_PACKET_THRESHOLD;
> +
> +	if (tp->rtl_aspm_enabled != enable) {
> +		rtl_unlock_config_regs(tp);
> +		rtl_hw_aspm_clkreq_enable(tp, enable);
> +		rtl_lock_config_regs(tp);
> +	}
> +
> +	schedule_delayed_work(&tp->aspm_toggle, msecs_to_jiffies(ASPM_TOGGLE_INTERVAL));
> +}
> +
>  static void rtl8169_down(struct rtl8169_private *tp)
>  {
> +	cancel_delayed_work_sync(&tp->aspm_toggle);
> +
>  	/* Clear all task flags */
>  	bitmap_zero(tp->wk.flags, RTL_FLAG_MAX);
>  
> @@ -4694,6 +4730,8 @@ static void rtl8169_up(struct rtl8169_private *tp)
>  	rtl_reset_work(tp);
>  
>  	phy_start(tp->phydev);
> +
> +	schedule_delayed_work(&tp->aspm_toggle, msecs_to_jiffies(ASPM_TOGGLE_INTERVAL));
>  }
>  
>  static int rtl8169_close(struct net_device *dev)
> @@ -5354,6 +5392,10 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  
>  	INIT_WORK(&tp->wk.work, rtl_task);
>  
> +	INIT_DELAYED_WORK(&tp->aspm_toggle, rtl8169_aspm_toggle);
> +
> +	atomic_set(&tp->aspm_packet_count, 0);
> +
>  	rtl_init_mac_address(tp);
>  
>  	dev->ethtool_ops = &rtl8169_ethtool_ops;
> -- 
> 2.32.0
> 

* Re: [PATCH net-next v3 1/3] r8169: Implement dynamic ASPM mechanism
  2021-08-19 11:42   ` Bjorn Helgaas
@ 2021-08-19 15:45     ` Heiner Kallweit
  2021-08-20 21:03       ` Bjorn Helgaas
  0 siblings, 1 reply; 16+ messages in thread
From: Heiner Kallweit @ 2021-08-19 15:45 UTC (permalink / raw)
  To: Bjorn Helgaas, Kai-Heng Feng
  Cc: nic_swsd, bhelgaas, davem, kuba, netdev, linux-pci, linux-kernel

On 19.08.2021 13:42, Bjorn Helgaas wrote:
> On Thu, Aug 19, 2021 at 01:45:40PM +0800, Kai-Heng Feng wrote:
>> r8169 NICs on some platforms have abysmal speed when ASPM is enabled.
>> Same issue can be observed with older vendor drivers.
> 
> On some platforms but not on others?  Maybe the PCIe topology is a
> factor?  Do you have bug reports with data, e.g., "lspci -vv" output?
> 
>> The issue is however solved by the latest vendor driver. There's a new
>> mechanism, which disables r8169's internal ASPM when the NIC traffic has
>> more than 10 packets, and vice versa. 
> 
> Presumably there's a time interval related to the 10 packets?  For
> example, do you want to disable ASPM if 10 packets are received (or
> sent?) in a certain amount of time?
> 
>> The possible reason for this is
>> likely because the buffer on the chip is too small for its ASPM exit
>> latency.
> 
> Maybe this means the chip advertises incorrect exit latencies?  If so,
> maybe a quirk could override that?
> 
>> Realtek confirmed that all their PCIe LAN NICs, r8106, r8168 and r8125
>> use dynamic ASPM under Windows. So implement the same mechanism here to
>> resolve the issue.
> 
> What exactly is "dynamic ASPM"?
> 
> I see Heiner's comment about this being intended only for a downstream
> kernel.  But why?
> 
We've seen various more or less obvious symptoms caused by the broken
ASPM support on Realtek network chips. Unfortunately Realtek releases
neither datasheets nor errata information.
Last time we attempted to re-enable ASPM, numerous problem reports came
in. These Realtek chips are used on basically every consumer mainboard.
The proposed workaround has potential side effects: in case of
congestion in the chip it may take up to a second until ASPM gets
disabled, which may affect performance, especially in case of alternating
traffic patterns. Also we can't expect support from Realtek.
Having said that, my decision was that it's too risky to re-enable ASPM
in mainline even with this workaround in place. Kai-Heng weighs the
power saving higher and wants to take the risk in his downstream kernel.
If there are no problems downstream after a few months, then this
workaround may make it to mainline.
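
To make the "up to a second" lag concrete, here is a minimal user-space
sketch of the sampling logic from the patch. It is a model under stated
assumptions, not driver code: struct aspm_model, model_toggle() and
model_lagging_intervals() are hypothetical names, each array element
stands for the packets handled in one ASPM_TOGGLE_INTERVAL, and ASPM is
assumed to start out enabled.

```c
#include <stdbool.h>
#include <stddef.h>

#define ASPM_PACKET_THRESHOLD 10	/* same threshold as the patch */

/* Hypothetical user-space model of the toggle work; not driver code. */
struct aspm_model {
	bool aspm_enabled;	/* stands in for tp->rtl_aspm_enabled */
	unsigned int packets;	/* stands in for tp->aspm_packet_count */
};

/* One run of the periodic work: sample the counter, reset it, decide. */
static void model_toggle(struct aspm_model *m)
{
	unsigned int seen = m->packets;

	m->packets = 0;
	m->aspm_enabled = seen <= ASPM_PACKET_THRESHOLD;
}

/*
 * traffic[i] is the number of packets handled during interval i.
 * Returns how many intervals carried more than the threshold while
 * ASPM was still enabled, i.e. intervals where the decision lagged.
 */
static size_t model_lagging_intervals(const unsigned int *traffic, size_t n)
{
	/* Assumption: the chip starts with ASPM enabled. */
	struct aspm_model m = { .aspm_enabled = true, .packets = 0 };
	size_t lagging = 0;

	for (size_t i = 0; i < n; i++) {
		if (m.aspm_enabled && traffic[i] > ASPM_PACKET_THRESHOLD)
			lagging++;
		m.packets += traffic[i];
		model_toggle(&m);	/* fires at the end of the interval */
	}
	return lagging;
}
```

In this model a steady load of {1000, 1000, 1000, 1000} lags only in the
first interval, while an alternating load of {1000, 0, 1000, 0} lags in
every busy interval, because each idle interval re-enables ASPM just
before the next burst arrives.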

>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>> ---
>> v3:
>>  - Use msecs_to_jiffies() for delay time
>>  - Use atomic_t instead of mutex for bh
>>  - Mention the buffer size and ASPM exit latency in commit message
>>
>> v2: 
>>  - Use delayed_work instead of timer_list to avoid interrupt context
>>  - Use mutex to serialize packet counter read/write
>>  - Wording change
>>
>>  drivers/net/ethernet/realtek/r8169_main.c | 44 ++++++++++++++++++++++-
>>  1 file changed, 43 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
>> index 7a69b468584a2..3359509c1c351 100644
>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>> @@ -624,6 +624,10 @@ struct rtl8169_private {
>>  
>>  	unsigned supports_gmii:1;
>>  	unsigned aspm_manageable:1;
>> +	unsigned rtl_aspm_enabled:1;
>> +	struct delayed_work aspm_toggle;
>> +	atomic_t aspm_packet_count;
>> +
>>  	dma_addr_t counters_phys_addr;
>>  	struct rtl8169_counters *counters;
>>  	struct rtl8169_tc_offsets tc_offset;
>> @@ -2665,8 +2669,13 @@ static void rtl_pcie_state_l2l3_disable(struct rtl8169_private *tp)
>>  
>>  static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable)
>>  {
>> +	if (!tp->aspm_manageable && enable)
>> +		return;
>> +
>> +	tp->rtl_aspm_enabled = enable;
>> +
>>  	/* Don't enable ASPM in the chip if OS can't control ASPM */
>> -	if (enable && tp->aspm_manageable) {
>> +	if (enable) {
>>  		RTL_W8(tp, Config5, RTL_R8(tp, Config5) | ASPM_en);
>>  		RTL_W8(tp, Config2, RTL_R8(tp, Config2) | ClkReqEn);
>>  	} else {
>> @@ -4415,6 +4424,7 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
>>  
>>  	dirty_tx = tp->dirty_tx;
>>  
>> +	atomic_add(tp->cur_tx - dirty_tx, &tp->aspm_packet_count);
>>  	while (READ_ONCE(tp->cur_tx) != dirty_tx) {
>>  		unsigned int entry = dirty_tx % NUM_TX_DESC;
>>  		u32 status;
>> @@ -4559,6 +4569,8 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
>>  		rtl8169_mark_to_asic(desc);
>>  	}
>>  
>> +	atomic_add(count, &tp->aspm_packet_count);
>> +
>>  	return count;
>>  }
>>  
>> @@ -4666,8 +4678,32 @@ static int r8169_phy_connect(struct rtl8169_private *tp)
>>  	return 0;
>>  }
>>  
>> +#define ASPM_PACKET_THRESHOLD 10
>> +#define ASPM_TOGGLE_INTERVAL 1000
>> +
>> +static void rtl8169_aspm_toggle(struct work_struct *work)
>> +{
>> +	struct rtl8169_private *tp = container_of(work, struct rtl8169_private,
>> +						  aspm_toggle.work);
>> +	int packet_count;
>> +	bool enable;
>> +
>> +	packet_count = atomic_xchg(&tp->aspm_packet_count, 0);
>> +	enable = packet_count <= ASPM_PACKET_THRESHOLD;
>> +
>> +	if (tp->rtl_aspm_enabled != enable) {
>> +		rtl_unlock_config_regs(tp);
>> +		rtl_hw_aspm_clkreq_enable(tp, enable);
>> +		rtl_lock_config_regs(tp);
>> +	}
>> +
>> +	schedule_delayed_work(&tp->aspm_toggle, msecs_to_jiffies(ASPM_TOGGLE_INTERVAL));
>> +}
>> +
>>  static void rtl8169_down(struct rtl8169_private *tp)
>>  {
>> +	cancel_delayed_work_sync(&tp->aspm_toggle);
>> +
>>  	/* Clear all task flags */
>>  	bitmap_zero(tp->wk.flags, RTL_FLAG_MAX);
>>  
>> @@ -4694,6 +4730,8 @@ static void rtl8169_up(struct rtl8169_private *tp)
>>  	rtl_reset_work(tp);
>>  
>>  	phy_start(tp->phydev);
>> +
>> +	schedule_delayed_work(&tp->aspm_toggle, msecs_to_jiffies(ASPM_TOGGLE_INTERVAL));
>>  }
>>  
>>  static int rtl8169_close(struct net_device *dev)
>> @@ -5354,6 +5392,10 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>>  
>>  	INIT_WORK(&tp->wk.work, rtl_task);
>>  
>> +	INIT_DELAYED_WORK(&tp->aspm_toggle, rtl8169_aspm_toggle);
>> +
>> +	atomic_set(&tp->aspm_packet_count, 0);
>> +
>>  	rtl_init_mac_address(tp);
>>  
>>  	dev->ethtool_ops = &rtl8169_ethtool_ops;
>> -- 
>> 2.32.0
>>


* Re: [PATCH net-next v3 1/3] r8169: Implement dynamic ASPM mechanism
  2021-08-19 15:45     ` Heiner Kallweit
@ 2021-08-20 21:03       ` Bjorn Helgaas
  2021-08-24  7:39         ` Kai-Heng Feng
  0 siblings, 1 reply; 16+ messages in thread
From: Bjorn Helgaas @ 2021-08-20 21:03 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Kai-Heng Feng, nic_swsd, bhelgaas, davem, kuba, netdev,
	linux-pci, linux-kernel

On Thu, Aug 19, 2021 at 05:45:22PM +0200, Heiner Kallweit wrote:
> On 19.08.2021 13:42, Bjorn Helgaas wrote:
> > On Thu, Aug 19, 2021 at 01:45:40PM +0800, Kai-Heng Feng wrote:
> >> r8169 NICs on some platforms have abysmal speed when ASPM is enabled.
> >> Same issue can be observed with older vendor drivers.
> > 
> > On some platforms but not on others?  Maybe the PCIe topology is a
> > factor?  Do you have bug reports with data, e.g., "lspci -vv" output?
> > 
> >> The issue is however solved by the latest vendor driver. There's a new
> >> mechanism, which disables r8169's internal ASPM when the NIC traffic has
> >> more than 10 packets, and vice versa. 
> > 
> > Presumably there's a time interval related to the 10 packets?  For
> > example, do you want to disable ASPM if 10 packets are received (or
> > sent?) in a certain amount of time?
> > 
> >> The possible reason for this is
> >> likely because the buffer on the chip is too small for its ASPM exit
> >> latency.
> > 
> > Maybe this means the chip advertises incorrect exit latencies?  If so,
> > maybe a quirk could override that?
> > 
> >> Realtek confirmed that all their PCIe LAN NICs, r8106, r8168 and r8125
> >> use dynamic ASPM under Windows. So implement the same mechanism here to
> >> resolve the issue.
> > 
> > What exactly is "dynamic ASPM"?
> > 
> > I see Heiner's comment about this being intended only for a downstream
> > kernel.  But why?
> > 
> We've seen various more or less obvious symptoms caused by the broken
> ASPM support on Realtek network chips. Unfortunately Realtek releases
> neither datasheets nor errata information.
> Last time we attempted to re-enable ASPM numerous problem reports came
> in. These Realtek chips are used on basically every consumer mainboard.
> The proposed workaround has potential side effects: In case of a
> congestion in the chip it may take up to a second until ASPM gets
> disabled, what may affect performance, especially in case of alternating
> traffic patterns. Also we can't expect support from Realtek.
> Having said that my decision was that it's too risky to re-enable ASPM
> in mainline even with this workaround in place. Kai-Heng weights the
> power saving higher and wants to take the risk in his downstream kernel.
> If there are no problems downstream after few months, then this
> workaround may make it to mainline.

Since ASPM apparently works well on some platforms but not others, I'd
suspect some incorrect exit latencies.

Ideally we'd have some launchpad/bugzilla links, and a better
understanding of the problem, and maybe a quirk that makes this work
on all platforms without mucking up the driver with ASPM tweaks.

But I'm a little out of turn here because the only direct impact to
the PCI core is the pcie_aspm_supported() interface.  It *looks* like
these patches don't actually touch the PCIe architected ASPM controls
in Link Control; all I see is mucking with Realtek-specific registers.

I think this is more work than it should be and likely to be not as
reliable as it should be.  But I guess that's up to you guys.

Bjorn

* Re: [PATCH net-next v3 1/3] r8169: Implement dynamic ASPM mechanism
  2021-08-20 21:03       ` Bjorn Helgaas
@ 2021-08-24  7:39         ` Kai-Heng Feng
  2021-08-24 14:53           ` Bjorn Helgaas
  0 siblings, 1 reply; 16+ messages in thread
From: Kai-Heng Feng @ 2021-08-24  7:39 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Heiner Kallweit, nic_swsd, Bjorn Helgaas, David Miller,
	Jakub Kicinski, Linux Netdev List, Linux PCI, LKML

On Sat, Aug 21, 2021 at 5:03 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Aug 19, 2021 at 05:45:22PM +0200, Heiner Kallweit wrote:
> > On 19.08.2021 13:42, Bjorn Helgaas wrote:
> > > On Thu, Aug 19, 2021 at 01:45:40PM +0800, Kai-Heng Feng wrote:
> > >> r8169 NICs on some platforms have abysmal speed when ASPM is enabled.
> > >> Same issue can be observed with older vendor drivers.
> > >
> > > On some platforms but not on others?  Maybe the PCIe topology is a
> > > factor?  Do you have bug reports with data, e.g., "lspci -vv" output?
> > >
> > >> The issue is however solved by the latest vendor driver. There's a new
> > >> mechanism, which disables r8169's internal ASPM when the NIC traffic has
> > >> more than 10 packets, and vice versa.
> > >
> > > Presumably there's a time interval related to the 10 packets?  For
> > > example, do you want to disable ASPM if 10 packets are received (or
> > > sent?) in a certain amount of time?
> > >
> > >> The possible reason for this is
> > >> likely because the buffer on the chip is too small for its ASPM exit
> > >> latency.
> > >
> > > Maybe this means the chip advertises incorrect exit latencies?  If so,
> > > maybe a quirk could override that?
> > >
> > >> Realtek confirmed that all their PCIe LAN NICs, r8106, r8168 and r8125
> > >> use dynamic ASPM under Windows. So implement the same mechanism here to
> > >> resolve the issue.
> > >
> > > What exactly is "dynamic ASPM"?
> > >
> > > I see Heiner's comment about this being intended only for a downstream
> > > kernel.  But why?
> > >
> > We've seen various more or less obvious symptoms caused by the broken
> > ASPM support on Realtek network chips. Unfortunately Realtek releases
> > neither datasheets nor errata information.
> > Last time we attempted to re-enable ASPM numerous problem reports came
> > in. These Realtek chips are used on basically every consumer mainboard.
> > The proposed workaround has potential side effects: In case of a
> > congestion in the chip it may take up to a second until ASPM gets
> > disabled, what may affect performance, especially in case of alternating
> > traffic patterns. Also we can't expect support from Realtek.
> > Having said that my decision was that it's too risky to re-enable ASPM
> > in mainline even with this workaround in place. Kai-Heng weights the
> > power saving higher and wants to take the risk in his downstream kernel.
> > If there are no problems downstream after few months, then this
> > workaround may make it to mainline.
>
> Since ASPM apparently works well on some platforms but not others, I'd
> suspect some incorrect exit latencies.

That may be, but if their dynamic ASPM mechanism can work around the issue,
maybe their hardware is just designed that way?

>
> Ideally we'd have some launchpad/bugzilla links, and a better
> understanding of the problem, and maybe a quirk that makes this work
> on all platforms without mucking up the driver with ASPM tweaks.

The tweaks are OS-agnostic and are also implemented in Windows.

>
> But I'm a little out of turn here because the only direct impact to
> the PCI core is the pcie_aspm_supported() interface.  It *looks* like
> these patches don't actually touch the PCIe architected ASPM controls
> in Link Control; all I see is mucking with Realtek-specific registers.

AFAICT, Realtek Ethernet and wireless NICs both have two layers of
ASPM: the regular PCIe ASPM and a Realtek-specific internal ASPM.
Both have to be enabled to really make ASPM work for them.

Kai-Heng

>
> I think this is more work than it should be and likely to be not as
> reliable as it should be.  But I guess that's up to you guys.
>
> Bjorn

* Re: [PATCH net-next v3 1/3] r8169: Implement dynamic ASPM mechanism
  2021-08-24  7:39         ` Kai-Heng Feng
@ 2021-08-24 14:53           ` Bjorn Helgaas
  2021-08-27  4:56             ` Kai-Heng Feng
  0 siblings, 1 reply; 16+ messages in thread
From: Bjorn Helgaas @ 2021-08-24 14:53 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: Heiner Kallweit, nic_swsd, Bjorn Helgaas, David Miller,
	Jakub Kicinski, Linux Netdev List, Linux PCI, LKML

On Tue, Aug 24, 2021 at 03:39:35PM +0800, Kai-Heng Feng wrote:
> On Sat, Aug 21, 2021 at 5:03 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > On Thu, Aug 19, 2021 at 05:45:22PM +0200, Heiner Kallweit wrote:
> > > On 19.08.2021 13:42, Bjorn Helgaas wrote:
> > > > On Thu, Aug 19, 2021 at 01:45:40PM +0800, Kai-Heng Feng wrote:
> > > >> r8169 NICs on some platforms have abysmal speed when ASPM is enabled.
> > > >> Same issue can be observed with older vendor drivers.
> > > >
> > > > On some platforms but not on others?  Maybe the PCIe topology is a
> > > > factor?  Do you have bug reports with data, e.g., "lspci -vv" output?
> > > >
> > > >> The issue is however solved by the latest vendor driver. There's a new
> > > >> mechanism, which disables r8169's internal ASPM when the NIC traffic has
> > > >> more than 10 packets, and vice versa.
> > > >
> > > > Presumably there's a time interval related to the 10 packets?  For
> > > > example, do you want to disable ASPM if 10 packets are received (or
> > > > sent?) in a certain amount of time?
> > > >
> > > >> The possible reason for this is
> > > >> likely because the buffer on the chip is too small for its ASPM exit
> > > >> latency.
> > > >
> > > > Maybe this means the chip advertises incorrect exit latencies?  If so,
> > > > maybe a quirk could override that?
> > > >
> > > >> Realtek confirmed that all their PCIe LAN NICs, r8106, r8168 and r8125
> > > >> use dynamic ASPM under Windows. So implement the same mechanism here to
> > > >> resolve the issue.
> > > >
> > > > What exactly is "dynamic ASPM"?
> > > >
> > > > I see Heiner's comment about this being intended only for a downstream
> > > > kernel.  But why?
> > > >
> > > We've seen various more or less obvious symptoms caused by the broken
> > > ASPM support on Realtek network chips. Unfortunately Realtek releases
> > > neither datasheets nor errata information.
> > > Last time we attempted to re-enable ASPM numerous problem reports came
> > > in. These Realtek chips are used on basically every consumer mainboard.
> > > The proposed workaround has potential side effects: In case of a
> > > congestion in the chip it may take up to a second until ASPM gets
> > > disabled, what may affect performance, especially in case of alternating
> > > traffic patterns. Also we can't expect support from Realtek.
> > > Having said that my decision was that it's too risky to re-enable ASPM
> > > in mainline even with this workaround in place. Kai-Heng weights the
> > > power saving higher and wants to take the risk in his downstream kernel.
> > > If there are no problems downstream after few months, then this
> > > workaround may make it to mainline.
> >
> > Since ASPM apparently works well on some platforms but not others, I'd
> > suspect some incorrect exit latencies.
> 
> Can be, but if their dynamic ASPM mechanism can workaround the issue,
> maybe their hardware is just designed that way?

Designed what way?  You mean the hardware uses the architected ASPM
control bits in the PCIe capability to control some ASPM functionality
that doesn't work like the spec says it should work?

> > Ideally we'd have some launchpad/bugzilla links, and a better
> > understanding of the problem, and maybe a quirk that makes this work
> > on all platforms without mucking up the driver with ASPM tweaks.
> 
> The tweaks is OS-agnostic and is also implemented in Windows.

I assume you mean these tweaks are also implemented in the Windows
*driver* from Realtek.  That's not a very convincing argument that
this is the way it should work.

If ASPM works well on some platforms, we should be able to make it
work well on other platforms, too.  The actual data ("lspci -vvxxx")
from working and problematic platforms might have hints.


> > But I'm a little out of turn here because the only direct impact to
> > the PCI core is the pcie_aspm_supported() interface.  It *looks* like
> > these patches don't actually touch the PCIe architected ASPM controls
> > in Link Control; all I see is mucking with Realtek-specific registers.
> 
> AFAICT, Realtek ethernet NIC and wireless NIC both have two layers of
> ASPM, one is the regular PCIe ASPM, and a Realtek specific internal
> ASPM.
> Both have to be enabled to really make ASPM work for them.

It's common for devices to have chicken bits.  But when a feature is
enabled, it should work as defined by the PCIe spec so it will work
with other spec-compliant devices.

Bjorn

* Re: [PATCH net-next v3 1/3] r8169: Implement dynamic ASPM mechanism
  2021-08-24 14:53           ` Bjorn Helgaas
@ 2021-08-27  4:56             ` Kai-Heng Feng
  0 siblings, 0 replies; 16+ messages in thread
From: Kai-Heng Feng @ 2021-08-27  4:56 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Heiner Kallweit, nic_swsd, Bjorn Helgaas, David Miller,
	Jakub Kicinski, Linux Netdev List, Linux PCI, LKML

On Tue, Aug 24, 2021 at 10:53 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Tue, Aug 24, 2021 at 03:39:35PM +0800, Kai-Heng Feng wrote:
> > On Sat, Aug 21, 2021 at 5:03 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > >
> > > On Thu, Aug 19, 2021 at 05:45:22PM +0200, Heiner Kallweit wrote:
> > > > On 19.08.2021 13:42, Bjorn Helgaas wrote:
> > > > > On Thu, Aug 19, 2021 at 01:45:40PM +0800, Kai-Heng Feng wrote:
> > > > >> r8169 NICs on some platforms have abysmal speed when ASPM is enabled.
> > > > >> Same issue can be observed with older vendor drivers.
> > > > >
> > > > > On some platforms but not on others?  Maybe the PCIe topology is a
> > > > > factor?  Do you have bug reports with data, e.g., "lspci -vv" output?
> > > > >
> > > > >> The issue is however solved by the latest vendor driver. There's a new
> > > > >> mechanism, which disables r8169's internal ASPM when the NIC traffic has
> > > > >> more than 10 packets, and vice versa.
> > > > >
> > > > > Presumably there's a time interval related to the 10 packets?  For
> > > > > example, do you want to disable ASPM if 10 packets are received (or
> > > > > sent?) in a certain amount of time?
> > > > >
> > > > >> The possible reason for this is
> > > > >> likely because the buffer on the chip is too small for its ASPM exit
> > > > >> latency.
> > > > >
> > > > > Maybe this means the chip advertises incorrect exit latencies?  If so,
> > > > > maybe a quirk could override that?
> > > > >
> > > > >> Realtek confirmed that all their PCIe LAN NICs, r8106, r8168 and r8125
> > > > >> use dynamic ASPM under Windows. So implement the same mechanism here to
> > > > >> resolve the issue.
> > > > >
> > > > > What exactly is "dynamic ASPM"?
> > > > >
> > > > > I see Heiner's comment about this being intended only for a downstream
> > > > > kernel.  But why?
> > > > >
> > > > We've seen various more or less obvious symptoms caused by the broken
> > > > ASPM support on Realtek network chips. Unfortunately Realtek releases
> > > > neither datasheets nor errata information.
> > > > Last time we attempted to re-enable ASPM numerous problem reports came
> > > > in. These Realtek chips are used on basically every consumer mainboard.
> > > > The proposed workaround has potential side effects: In case of a
> > > > congestion in the chip it may take up to a second until ASPM gets
> > > > disabled, what may affect performance, especially in case of alternating
> > > > traffic patterns. Also we can't expect support from Realtek.
> > > > Having said that my decision was that it's too risky to re-enable ASPM
> > > > in mainline even with this workaround in place. Kai-Heng weights the
> > > > power saving higher and wants to take the risk in his downstream kernel.
> > > > If there are no problems downstream after few months, then this
> > > > workaround may make it to mainline.
> > >
> > > Since ASPM apparently works well on some platforms but not others, I'd
> > > suspect some incorrect exit latencies.
> >
> > Can be, but if their dynamic ASPM mechanism can workaround the issue,
> > maybe their hardware is just designed that way?
>
> Designed what way?  You mean the hardware uses the architected ASPM
> control bits in the PCIe capability to control some ASPM functionality
> that doesn't work like the spec says it should work?

Yes, it requires both the standard PCIe ASPM control bits and
Realtek-specific register bits to make ASPM really work.
Does the PCI spec mandate that PCIe config space be the only way to enable ASPM?

>
> > > Ideally we'd have some launchpad/bugzilla links, and a better
> > > understanding of the problem, and maybe a quirk that makes this work
> > > on all platforms without mucking up the driver with ASPM tweaks.
> >
> > The tweaks is OS-agnostic and is also implemented in Windows.
>
> I assume you mean these tweaks are also implemented in the Windows
> *driver* from Realtek.  That's not a very convincing argument that
> this is the way it should work.

Since Realtek doesn't publish any errata, following the driver
tweaks is the most practical way to improve the situation under Linux.
The same kind of tweak (i.e. dynamically enabling/disabling ASPM) can also
be found in another driver, drivers/infiniband/hw/hfi1/aspm.c.

>
> If ASPM works well on some platforms, we should be able to make it
> work well on other platforms, too.  The actual data ("lspci -vvxxx")
> from working and problematic platforms might have hints.

OK, I'll ask affected users for their lspci data.

>
>
> > > But I'm a little out of turn here because the only direct impact to
> > > the PCI core is the pcie_aspm_supported() interface.  It *looks* like
> > > these patches don't actually touch the PCIe architected ASPM controls
> > > in Link Control; all I see is mucking with Realtek-specific registers.
> >
> > AFAICT, Realtek ethernet NIC and wireless NIC both have two layers of
> > ASPM, one is the regular PCIe ASPM, and a Realtek specific internal
> > ASPM.
> > Both have to be enabled to really make ASPM work for them.
>
> It's common for devices to have chicken bits.  But when a feature is
> enabled, it should work as defined by the PCIe spec so it will work
> with other spec-compliant devices.

I have no idea why they designed ASPM in two layers. Only Realtek
knows the reason.

>
> Bjorn

* Re: [PATCH net-next v3 3/3] r8169: Enable ASPM for selected NICs
  2021-08-19  9:56       ` Heiner Kallweit
@ 2021-08-27  6:23         ` Kai-Heng Feng
  0 siblings, 0 replies; 16+ messages in thread
From: Kai-Heng Feng @ 2021-08-27  6:23 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: nic_swsd, Bjorn Helgaas, David Miller, Jakub Kicinski,
	Linux Netdev List, Linux PCI, LKML

On Thu, Aug 19, 2021 at 5:56 PM Heiner Kallweit <hkallweit1@gmail.com> wrote:
>
> On 19.08.2021 08:50, Kai-Heng Feng wrote:
> > On Thu, Aug 19, 2021 at 2:08 PM Heiner Kallweit <hkallweit1@gmail.com> wrote:
> >>
> >> On 19.08.2021 07:45, Kai-Heng Feng wrote:
> >>> The latest vendor driver enables ASPM for more recent r8168 NICs, so
> >>> disable ASPM on older chips and enable ASPM for the rest.
> >>>
> >>> Rename aspm_manageable to pcie_aspm_manageable to indicate it's ASPM
> >>> from PCIe, and use rtl_aspm_supported for Realtek NIC's internal ASPM
> >>> function.
> >>>
> >>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> >>> ---
> >>> v3:
> >>>  - Use pcie_aspm_supported() to retrieve ASPM support status
> >>>  - Use whitelist for r8169 internal ASPM status
> >>>
> >>> v2:
> >>>  - No change
> >>>
> >>>  drivers/net/ethernet/realtek/r8169_main.c | 27 ++++++++++++++++-------
> >>>  1 file changed, 19 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> >>> index 3359509c1c351..88e015d93e490 100644
> >>> --- a/drivers/net/ethernet/realtek/r8169_main.c
> >>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> >>> @@ -623,7 +623,8 @@ struct rtl8169_private {
> >>>       } wk;
> >>>
> >>>       unsigned supports_gmii:1;
> >>> -     unsigned aspm_manageable:1;
> >>> +     unsigned pcie_aspm_manageable:1;
> >>> +     unsigned rtl_aspm_supported:1;
> >>>       unsigned rtl_aspm_enabled:1;
> >>>       struct delayed_work aspm_toggle;
> >>>       atomic_t aspm_packet_count;
> >>> @@ -702,6 +703,20 @@ static bool rtl_is_8168evl_up(struct rtl8169_private *tp)
> >>>              tp->mac_version <= RTL_GIGA_MAC_VER_53;
> >>>  }
> >>>
> >>> +static int rtl_supports_aspm(struct rtl8169_private *tp)
> >>> +{
> >>> +     switch (tp->mac_version) {
> >>> +     case RTL_GIGA_MAC_VER_02 ... RTL_GIGA_MAC_VER_31:
> >>> +     case RTL_GIGA_MAC_VER_37:
> >>> +     case RTL_GIGA_MAC_VER_39:
> >>> +     case RTL_GIGA_MAC_VER_43:
> >>> +     case RTL_GIGA_MAC_VER_47:
> >>> +             return 0;
> >>> +     default:
> >>> +             return 1;
> >>> +     }
> >>> +}
> >>> +
> >>>  static bool rtl_supports_eee(struct rtl8169_private *tp)
> >>>  {
> >>>       return tp->mac_version >= RTL_GIGA_MAC_VER_34 &&
> >>> @@ -2669,7 +2684,7 @@ static void rtl_pcie_state_l2l3_disable(struct rtl8169_private *tp)
> >>>
> >>>  static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable)
> >>>  {
> >>> -     if (!tp->aspm_manageable && enable)
> >>> +     if (!(tp->pcie_aspm_manageable && tp->rtl_aspm_supported) && enable)
> >>>               return;
> >>>
> >>>       tp->rtl_aspm_enabled = enable;
> >>> @@ -5319,12 +5334,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> >>>       if (rc)
> >>>               return rc;
> >>>
> >>> -     /* Disable ASPM completely as that cause random device stop working
> >>> -      * problems as well as full system hangs for some PCIe devices users.
> >>> -      */
> >>> -     rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S |
> >>> -                                       PCIE_LINK_STATE_L1);
> >>> -     tp->aspm_manageable = !rc;
> >>> +     tp->pcie_aspm_manageable = pcie_aspm_supported(pdev);
> >>
> >> That's not what I meant, and it's also not correct.
> >
> > In case I make another mistake in next series, let me ask it more clearly...
> > What you meant was to check both link->aspm_enabled and link->aspm_support?
> >
> aspm_enabled can be changed by the user at any time.

OK, will check that too.

> pci_disable_link_state() also considers whether BIOS forbids that OS
> mess with ASPM. See aspm_disabled.

I think aspm_disabled means the BIOS ASPM setting should be left intact?
So if PCIe ASPM is already enabled, we should also enable the
Realtek-specific bits to make ASPM really work.

>
> >>
> >>> +     tp->rtl_aspm_supported = rtl_supports_aspm(tp);
> >
> > Is rtl_supports_aspm() what you expect for the whitelist?
> > And what else am I missing?
> >
> I meant use rtl_supports_aspm() to check when ASPM is relevant at all,

I think that means the relevant bits are link->aspm_capable and
pcie_aspm_support_enabled().
ASPM can already be enabled by the BIOS even with aspm_disabled set.

Then check link->aspm_enabled in the aspm_toggle() routine because it
can be enabled at runtime.
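
A rough user-space model of the split described above, only to check
that we mean the same thing. The struct and both function names are
hypothetical; the fields merely stand in for the chip blacklist,
pcie_aspm_support_enabled(), link->aspm_capable and link->aspm_enabled.
It is a sketch under those assumptions, not what the next revision will
necessarily look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the inputs discussed in this thread. */
struct aspm_inputs {
	bool chip_blacklisted;     /* chip version where ASPM is unusable */
	bool aspm_support_enabled; /* models pcie_aspm_support_enabled() */
	bool link_aspm_capable;    /* models link->aspm_capable != 0 */
	bool link_aspm_enabled;    /* models link->aspm_enabled != 0 */
};

/* Probe-time check: is ASPM relevant for this device at all? */
static bool aspm_relevant(const struct aspm_inputs *in)
{
	assert(in != NULL);
	return !in->chip_blacklisted &&
	       in->aspm_support_enabled &&
	       in->link_aspm_capable;
}

/* Runtime check, re-evaluated on every toggle because the user can
 * change the enabled state at any time.
 */
static bool aspm_toggle_allowed(const struct aspm_inputs *in)
{
	return aspm_relevant(in) && in->link_aspm_enabled;
}
```

The first check would run once in rtl_init_one(), the second each time
the dynamic ASPM logic wants to flip the Realtek-specific bits.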

> and in addition use a blacklist for chip versions where ASPM is
> completely unusable.

Thanks for your suggestion and review.

Kai-Heng

>
> > Kai-Heng
> >
> >>>
> >>>       /* enable device (incl. PCI PM wakeup and hotplug setup) */
> >>>       rc = pcim_enable_device(pdev);
> >>>
> >>
>


end of thread, other threads:[~2021-08-27  6:23 UTC | newest]

Thread overview: 16+ messages
2021-08-19  5:45 [PATCH net-next v3 0/3] r8169: Implement dynamic ASPM mechanism for recent 1.0/2.5Gbps Realtek NICs Kai-Heng Feng
2021-08-19  5:45 ` [PATCH net-next v3 1/3] r8169: Implement dynamic ASPM mechanism Kai-Heng Feng
2021-08-19 11:42   ` Bjorn Helgaas
2021-08-19 15:45     ` Heiner Kallweit
2021-08-20 21:03       ` Bjorn Helgaas
2021-08-24  7:39         ` Kai-Heng Feng
2021-08-24 14:53           ` Bjorn Helgaas
2021-08-27  4:56             ` Kai-Heng Feng
2021-08-19  5:45 ` [PATCH net-next v3 2/3] PCI/ASPM: Introduce a new helper to report ASPM support status Kai-Heng Feng
2021-08-19  5:45 ` [PATCH net-next v3 3/3] r8169: Enable ASPM for selected NICs Kai-Heng Feng
2021-08-19  6:02   ` Heiner Kallweit
2021-08-19  6:50     ` Kai-Heng Feng
2021-08-19  9:56       ` Heiner Kallweit
2021-08-27  6:23         ` Kai-Heng Feng
2021-08-19  6:08 ` [PATCH net-next v3 0/3] r8169: Implement dynamic ASPM mechanism for recent 1.0/2.5Gbps Realtek NICs Heiner Kallweit
2021-08-19  6:19   ` Kai-Heng Feng
