netdev.vger.kernel.org archive mirror
* Performance Regression due to ASPM disable patch
       [not found] <CGME20230712155834epcas5p1140d90c8a0a181930956622728c4dd89@epcas5p1.samsung.com>
@ 2023-07-12 15:55 ` Anuj Gupta
  2023-07-13  5:59   ` Heiner Kallweit
  2023-07-13 12:37   ` Linux regression tracking #adding (Thorsten Leemhuis)
  0 siblings, 2 replies; 5+ messages in thread
From: Anuj Gupta @ 2023-07-12 15:55 UTC
  To: hkallweit1, davem
  Cc: holger, kai.heng.feng, simon.horman, nic_swsd, netdev, linux-nvme

Hi,

I see a performance regression for read/write workloads on our NVMe over
fabrics using TCP as transport setup.
IOPS drop by 23% for 4k-randread [1] and by 18% for 4k-randwrite [2].

I bisected and found that the commit
e1ed3e4d91112027b90c7ee61479141b3f948e6a ("r8169: disable ASPM during
NAPI poll") is the trigger.
When I revert this commit, the performance drop goes away.

The target machine uses a realtek ethernet controller - 
root@testpc:/home/test# lspci | grep -i eth
29:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Device 2600
(rev 21)
2a:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Killer
E3000 2.5GbE Controller (rev 03)

I tried to disable aspm by passing "pcie_aspm=off" as boot parameter and
by setting pcie aspm policy to performance. But it didn't improve the
performance.
I wonder if this is already known, and something different should be
done to handle the original issue? 

[1] fio randread
fio -direct=1 -iodepth=1 -rw=randread -ioengine=psync -bs=4k -numjobs=1
-runtime=30 -group_reporting -filename=/dev/nvme1n1 -name=psync_read
-output=psync_read
[2] fio randwrite
fio -direct=1 -iodepth=1 -rw=randwrite -ioengine=psync -bs=4k -numjobs=1
-runtime=30 -group_reporting -filename=/dev/nvme1n1 -name=psync_read
-output=psync_write
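
As a rough aid for reading the percentages above: with -iodepth=1,
-numjobs=1 and the psync engine, only one I/O is in flight at a time, so
IOPS is approximately the inverse of the mean per-I/O completion latency.
Under that idealization (an assumption, not something measured in this
thread), a relative IOPS drop maps to a relative latency increase as in
the sketch below. The function name is made up for illustration.

```python
def latency_increase_from_iops_drop(iops_drop: float) -> float:
    """Relative latency increase implied by a relative IOPS drop.

    Assumes IOPS = 1 / mean_latency, which only holds when exactly one
    I/O is outstanding at a time (iodepth=1, numjobs=1).
    """
    if not 0.0 <= iops_drop < 1.0:
        raise ValueError("iops_drop must be in [0, 1)")
    return 1.0 / (1.0 - iops_drop) - 1.0


if __name__ == "__main__":
    # Percentages taken from the report; the mapping itself is the assumption.
    for name, drop in (("4k-randread", 0.23), ("4k-randwrite", 0.18)):
        rise = latency_increase_from_iops_drop(drop)
        print(f"{name}: {drop:.0%} fewer IOPS ~ {rise:.1%} higher per-I/O latency")
```

Under this model the reported drops would correspond to roughly 30% and
22% more time per I/O, which is the quantity a per-request overhead in
the network path would affect.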


* Re: Performance Regression due to ASPM disable patch
  2023-07-12 15:55 ` Performance Regression due to ASPM disable patch Anuj Gupta
@ 2023-07-13  5:59   ` Heiner Kallweit
  2023-07-13 12:49     ` Anuj Gupta
  2023-07-13 12:37   ` Linux regression tracking #adding (Thorsten Leemhuis)
  1 sibling, 1 reply; 5+ messages in thread
From: Heiner Kallweit @ 2023-07-13  5:59 UTC
  To: Anuj Gupta, davem
  Cc: holger, kai.heng.feng, simon.horman, nic_swsd, netdev, linux-nvme

On 12.07.2023 17:55, Anuj Gupta wrote:
> Hi,
> 
> I see a performance regression for read/write workloads on our NVMe over
> fabrics using TCP as transport setup.
> IOPS drop by 23% for 4k-randread [1] and by 18% for 4k-randwrite [2].
> 
> I bisected and found that the commit
> e1ed3e4d91112027b90c7ee61479141b3f948e6a ("r8169: disable ASPM during
> NAPI poll") is the trigger.
> When I revert this commit, the performance drop goes away.
> 
> The target machine uses a realtek ethernet controller - 
> root@testpc:/home/test# lspci | grep -i eth
> 29:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Device 2600
> (rev 21)
> 2a:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Killer
> E3000 2.5GbE Controller (rev 03)
> 
> I tried to disable aspm by passing "pcie_aspm=off" as boot parameter and
> by setting pcie aspm policy to performance. But it didn't improve the
> performance.
> I wonder if this is already known, and something different should be
> done to handle the original issue? 
> 
> [1] fio randread
> fio -direct=1 -iodepth=1 -rw=randread -ioengine=psync -bs=4k -numjobs=1
> -runtime=30 -group_reporting -filename=/dev/nvme1n1 -name=psync_read
> -output=psync_read
> [2] fio randwrite
> fio -direct=1 -iodepth=1 -rw=randwrite -ioengine=psync -bs=4k -numjobs=1
> -runtime=30 -group_reporting -filename=/dev/nvme1n1 -name=psync_read
> -output=psync_write
> 
> 
I can imagine a certain performance impact of this commit if there are
lots of small packets handled by individual NAPI polls.
Maybe it's also chip version specific.
You have two NICs, do you see the issue with both of them?
Related: What's your line speed, 1Gbps or 2.5Gbps?
Can you reproduce the performance impact with iperf?
Do you use any network optimization settings for latency vs. performance?
Interrupt coalescing, is TSO(6) enabled?
An ethtool -k output may provide further insight.
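
To illustrate the point about many small packets being handled by
individual NAPI polls: if each poll pays some fixed extra cost, that cost
is amortized over however many packets the poll processes. The sketch
below is a toy model only. The microsecond figures and the function name
are hypothetical and are not measurements from r8169.

```python
def per_packet_cost_us(fixed_per_poll_us: float,
                       per_packet_us: float,
                       packets_per_poll: int) -> float:
    """Average cost per packet when each poll adds a fixed overhead."""
    if packets_per_poll < 1:
        raise ValueError("packets_per_poll must be at least 1")
    return per_packet_us + fixed_per_poll_us / packets_per_poll


if __name__ == "__main__":
    FIXED_PER_POLL_US = 10.0  # hypothetical fixed cost added to every poll
    PER_PACKET_US = 1.0       # hypothetical cost of handling one packet
    for n in (1, 4, 16, 64):
        cost = per_packet_cost_us(FIXED_PER_POLL_US, PER_PACKET_US, n)
        print(f"{n:3d} packets/poll -> {cost:.2f} us per packet")
```

In this model a request/response workload that triggers one poll per
packet pays the full fixed cost every time, while a bulk stream that
batches many packets per poll barely notices it.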



* Re: Performance Regression due to ASPM disable patch
  2023-07-12 15:55 ` Performance Regression due to ASPM disable patch Anuj Gupta
  2023-07-13  5:59   ` Heiner Kallweit
@ 2023-07-13 12:37   ` Linux regression tracking #adding (Thorsten Leemhuis)
  2023-07-25 13:43     ` Linux regression tracking #update (Thorsten Leemhuis)
  1 sibling, 1 reply; 5+ messages in thread
From: Linux regression tracking #adding (Thorsten Leemhuis) @ 2023-07-13 12:37 UTC
  To: Anuj Gupta, hkallweit1, davem
  Cc: holger, kai.heng.feng, simon.horman, nic_swsd, netdev,
	linux-nvme, Linux kernel regressions list

[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 12.07.23 17:55, Anuj Gupta wrote:
> 
> I see a performance regression for read/write workloads on our NVMe over
> fabrics using TCP as transport setup.
> IOPS drop by 23% for 4k-randread [1] and by 18% for 4k-randwrite [2].
> 
> I bisected and found that the commit
> e1ed3e4d91112027b90c7ee61479141b3f948e6a ("r8169: disable ASPM during
> NAPI poll") is the trigger.
> When I revert this commit, the performance drop goes away.
> 
> The target machine uses a realtek ethernet controller
> [...]

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced e1ed3e4d91112027b90c7ee61479141b3f94
#regzbot title net: r8169: performance regression for read/write
workloads on our NVMe over fabrics
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


* Re: Performance Regression due to ASPM disable patch
  2023-07-13  5:59   ` Heiner Kallweit
@ 2023-07-13 12:49     ` Anuj Gupta
  0 siblings, 0 replies; 5+ messages in thread
From: Anuj Gupta @ 2023-07-13 12:49 UTC
  To: Heiner Kallweit
  Cc: davem, holger, kai.heng.feng, simon.horman, nic_swsd, netdev,
	linux-nvme, sagi, hch

On Thu, Jul 13, 2023 at 07:59:32AM +0200, Heiner Kallweit wrote:
> On 12.07.2023 17:55, Anuj Gupta wrote:
> > Hi,
> > 
> > I see a performance regression for read/write workloads on our NVMe over
> > fabrics using TCP as transport setup.
> > IOPS drop by 23% for 4k-randread [1] and by 18% for 4k-randwrite [2].
> > 
> > I bisected and found that the commit
> > e1ed3e4d91112027b90c7ee61479141b3f948e6a ("r8169: disable ASPM during
> > NAPI poll") is the trigger.
> > When I revert this commit, the performance drop goes away.
> > 
> > The target machine uses a realtek ethernet controller - 
> > root@testpc:/home/test# lspci | grep -i eth
> > 29:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Device 2600
> > (rev 21)
> > 2a:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Killer
> > E3000 2.5GbE Controller (rev 03)
> > 
> > I tried to disable aspm by passing "pcie_aspm=off" as boot parameter and
> > by setting pcie aspm policy to performance. But it didn't improve the
> > performance.
> > I wonder if this is already known, and something different should be
> > done to handle the original issue? 
> > 
> > [1] fio randread
> > fio -direct=1 -iodepth=1 -rw=randread -ioengine=psync -bs=4k -numjobs=1
> > -runtime=30 -group_reporting -filename=/dev/nvme1n1 -name=psync_read
> > -output=psync_read
> > [2] fio randwrite
> > fio -direct=1 -iodepth=1 -rw=randwrite -ioengine=psync -bs=4k -numjobs=1
> > -runtime=30 -group_reporting -filename=/dev/nvme1n1 -name=psync_read
> > -output=psync_write
> > 
> > 
> I can imagine a certain performance impact of this commit if there are
> lots of small packets handled by individual NAPI polls.
> Maybe it's also chip version specific.
> You have two NICs, do you see the issue with both of them?

I see this issue with the Realtek Semiconductor Co., Ltd. Killer NIC.
I haven't used the other NIC.

> Related: What's your line speed, 1Gbps or 2.5Gbps?

Speed is 1000Mb/s [1].

> Can you reproduce the performance impact with iperf?

I was not able to reproduce it with iperf [2]. One reason could be that
the performance drop currently happens in the NVMe over Fabrics scenario,
where block I/O processing takes some time before the next I/O, and hence
the next network packets, are sent. I suspect iperf sends packets
continuously rather than at intervals. Let me know if I am missing
something here.

> Do you use any network optimization settings for latency vs. performance?

No, I haven't set any network optimization settings. We are using
default Ubuntu values. If you suspect some particular setting, I can check.

> Interrupt coalescing, is TSO(6) enabled?

I tried this command on a different PC containing the same Realtek NIC and
an Intel NIC. The command worked fine for the Intel NIC but failed for the
Realtek NIC, so the error seems to be specific to the Realtek NIC.
Is there some other way to check for interrupt coalescing?

> An ethtool -k output may provide further insight.

Please see [3].

[1]
# ethtool enp42s0
Settings for enp42s0:
        Speed: 1000Mb/s

[2]

WITH ASPM patch :

------------------------------------------------------------
# iperf -c 107.99.41.147 -l 4096 -i 1 -t 10
------------------------------------------------------------
Client connecting to 107.99.41.147, TCP port 5001
TCP window size:  531 KByte (default)
------------------------------------------------------------
[  3] local 107.99.41.244 port 40340 connected with 107.99.41.147 port
5001
[  3]  0.0-10.0 sec  1.10 GBytes   942 Mbits/sec

-----------------------------------------------------------

WITHOUT ASPM patch :
------------------------------------------------------------
# iperf -c 107.99.41.147 -l 4096 -i 1 -t 10
------------------------------------------------------------
Client connecting to 107.99.41.147, TCP port 5001
TCP window size:  472 KByte (default)
------------------------------------------------------------
[  3] local 107.99.41.244 port 51766 connected with 107.99.41.147 port
5001
[  3]  0.0-10.0 sec  1.10 GBytes   942 Mbits/sec

[3]

# ethtool -k enp42s0
Features for enp42s0:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: on
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: on
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: off
tx-scatter-gather: off
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
tx-tcp-segmentation: off
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off
tx-tcp6-segmentation: off
generic-segmentation-offload: off [requested on]
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-tunnel-remcsum-segmentation: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
tx-gso-list: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]
rx-gro-list: off
macsec-hw-offload: off [fixed]
rx-udp-gro-forwarding: off
hsr-tag-ins-offload: off [fixed]
hsr-tag-rm-offload: off [fixed]
hsr-fwd-offload: off [fixed]
hsr-dup-offload: off [fixed]
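
For anyone who wants to compare a feature dump like [3] between two
machines or kernels, a small parser can help. The sketch below assumes
the usual "name: on|off [note]" line shape shown above; the function
name and the shortened sample text are illustrative, and the sample
contains only a few of the lines from [3].

```python
import re

# One feature per line: "<name>: on|off", optionally followed by "[note]".
_FEATURE_RE = re.compile(r"^\s*([\w-]+):\s*(on|off)\b(?:\s*\[(.+?)\])?\s*$")


def parse_ethtool_features(text):
    """Map feature name -> (enabled, note) from `ethtool -k` style text.

    Lines that do not look like a feature (such as the "Features for"
    header) are skipped.
    """
    features = {}
    for line in text.splitlines():
        match = _FEATURE_RE.match(line)
        if match:
            name, state, note = match.groups()
            features[name] = (state == "on", note)
    return features


# Shortened excerpt of the output in [3], used only as sample input.
SAMPLE = """\
Features for enp42s0:
scatter-gather: off
tcp-segmentation-offload: off
generic-segmentation-offload: off [requested on]
generic-receive-offload: on
large-receive-offload: off [fixed]
"""

if __name__ == "__main__":
    for name, (enabled, note) in parse_ethtool_features(SAMPLE).items():
        suffix = f" [{note}]" if note else ""
        print(f"{name}: {'on' if enabled else 'off'}{suffix}")
```

In the sample, TSO, GSO and scatter-gather all parse as off, matching
what the dump in [3] shows for this interface.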

> 
> 


* Re: Performance Regression due to ASPM disable patch
  2023-07-13 12:37   ` Linux regression tracking #adding (Thorsten Leemhuis)
@ 2023-07-25 13:43     ` Linux regression tracking #update (Thorsten Leemhuis)
  0 siblings, 0 replies; 5+ messages in thread
From: Linux regression tracking #update (Thorsten Leemhuis) @ 2023-07-25 13:43 UTC
  To: Linux kernel regressions list; +Cc: nic_swsd, netdev, linux-nvme

[TLDR: This mail is primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 13.07.23 14:37, Linux regression tracking #adding (Thorsten Leemhuis)
wrote:

> On 12.07.23 17:55, Anuj Gupta wrote:
>>
>> I see a performance regression for read/write workloads on our NVMe over
>> fabrics using TCP as transport setup.
>> IOPS drop by 23% for 4k-randread [1] and by 18% for 4k-randwrite [2].
> 
> #regzbot ^introduced e1ed3e4d91112027b90c7ee61479141b3f94
> #regzbot title net: r8169: performance regression for read/write
> workloads on our NVMe over fabrics
> #regzbot ignore-activity

The fix did not properly link to the report (it only linked to a reply
in the thread), hence regzbot missed it:

#regzbot fix: e31a9fedc7d8d8
#regzbot ignore-activity

/me meanwhile wonders if it's worth teaching regzbot how to handle these
cases

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
