regressions.lists.linux.dev archive mirror
From: "Lazar, Lijo" <Lijo.Lazar@amd.com>
To: James Turner <linuxkernel.foss@dmarc-none.turner.link>
Cc: Alex Deucher <alexdeucher@gmail.com>,
	Thorsten Leemhuis <regressions@leemhuis.info>,
	"Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"regressions@lists.linux.dev" <regressions@lists.linux.dev>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Greg KH <gregkh@linuxfoundation.org>,
	"Pan, Xinhui" <Xinhui.Pan@amd.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	"Koenig, Christian" <Christian.Koenig@amd.com>
Subject: RE: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM
Date: Mon, 24 Jan 2022 14:21:11 +0000
Message-ID: <BYAPR12MB4614E2CFEDDDEAABBAB986A0975E9@BYAPR12MB4614.namprd12.prod.outlook.com>
In-Reply-To: <87sftfqwlx.fsf@dmarc-none.turner.link>

[Public]

I'm not able to see how it would affect gfx/mem DPM alone. Unless Alex has other ideas, could you enable drm debug messages and share the log?

	Enabling verbose debug messages is done through the drm.debug parameter, each category being enabled by a bit:

	drm.debug=0x1 will enable CORE messages
	drm.debug=0x2 will enable DRIVER messages
	drm.debug=0x3 will enable CORE and DRIVER messages
	...
	drm.debug=0x1ff will enable all messages
	An interesting feature is that it's possible to enable verbose logging at run-time by echoing the debug value in its sysfs node:

	# echo 0xf > /sys/module/drm/parameters/debug
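
As a rough illustration of the bit-per-category scheme described above, the sketch below decodes a drm.debug value into category names. The helper name drm_debug_decode is hypothetical, and the names past CORE and DRIVER are an assumption taken from enum drm_debug_category in the kernel's include/drm/drm_print.h; check them against the kernel you are running.

```sh
#!/bin/sh
# Decode a drm.debug bitmask into DRM debug category names.
# ASSUMPTION: bit order follows enum drm_debug_category in
# include/drm/drm_print.h (CORE=0x1, DRIVER=0x2, KMS=0x4, ...).
# Only the nine bits covered by 0x1ff are listed here.
drm_debug_decode() {
    mask=$(( $1 ))      # accepts decimal or 0x-prefixed hex
    out=""
    bit=1
    for name in CORE DRIVER KMS PRIME ATOMIC VBL STATE LEASE DP; do
        if [ $(( mask & bit )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
        bit=$(( bit << 1 ))
    done
    printf '%s\n' "$out"
}

drm_debug_decode 0x3    # prints: CORE DRIVER
drm_debug_decode 0xf    # prints: CORE DRIVER KMS PRIME
```

Under that assumption, the 0xf value written to the sysfs node above would turn on the first four categories rather than everything.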

Thanks,
Lijo

-----Original Message-----
From: James Turner <linuxkernel.foss@dmarc-none.turner.link> 
Sent: Sunday, January 23, 2022 2:41 AM
To: Lazar, Lijo <Lijo.Lazar@amd.com>
Cc: Alex Deucher <alexdeucher@gmail.com>; Thorsten Leemhuis <regressions@leemhuis.info>; Deucher, Alexander <Alexander.Deucher@amd.com>; regressions@lists.linux.dev; kvm@vger.kernel.org; Greg KH <gregkh@linuxfoundation.org>; Pan, Xinhui <Xinhui.Pan@amd.com>; LKML <linux-kernel@vger.kernel.org>; amd-gfx@lists.freedesktop.org; Alex Williamson <alex.williamson@redhat.com>; Koenig, Christian <Christian.Koenig@amd.com>
Subject: Re: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM

Hi Lijo,

> Could you provide the pp_dpm_* values in sysfs with and without the 
> patch? Also, could you try forcing PCIE to gen3 (through pp_dpm_pcie) 
> if it's not in gen3 when the issue happens?

AFAICT, I can't access those values while the AMD GPU PCI devices are bound to `vfio-pci`. However, I can at least access the link speed and width elsewhere in sysfs. So, I gathered what information I could for two different cases:

- With the PCI devices bound to `vfio-pci`. With this configuration, I
  can start the VM, but the `pp_dpm_*` values are not available since
  the devices are bound to `vfio-pci` instead of `amdgpu`.

- Without the PCI devices bound to `vfio-pci` (i.e. after removing the
  `vfio-pci.ids=...` kernel command line argument). With this
  configuration, I can access the `pp_dpm_*` values, since the PCI
  devices are bound to `amdgpu`. However, I cannot use the VM. If I try
  to start the VM, the display (both the external monitors attached to
  the AMD GPU and the built-in laptop display attached to the Intel
  iGPU) completely freezes.
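
To confirm which of the two configurations above is in effect, the `driver` symlink in the device's sysfs directory can be inspected. This is only a sketch: the helper name `bound_driver` is hypothetical, and the device address is the one used elsewhere in this message.

```sh
#!/bin/sh
# Print the name of the driver a PCI device is bound to, or "(none)".
# $1 is the device's sysfs directory; its "driver" entry is a symlink
# to the driver directory when a driver is bound.
bound_driver() {
    dev_dir=$1
    if [ -L "$dev_dir/driver" ]; then
        basename "$(readlink "$dev_dir/driver")"
    else
        echo "(none)"
    fi
}

bound_driver /sys/bus/pci/devices/0000:01:00.0
```

In the first configuration this should report `vfio-pci`, and in the second `amdgpu`.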

The output shown below was identical for both the good commit:
f1688bd69ec4 ("drm/amd/amdgpu:save psp ring wptr to avoid attack")
and the commit which introduced the issue:
f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)")

Note that the PCI link speed increased to 8.0 GT/s when the GPU was under heavy load for both versions, but the clock speeds of the GPU were different under load. (For the good commit, it was 1295 MHz; for the bad commit, it was 501 MHz.)


# With the PCI devices bound to `vfio-pci`

## Before starting the VM

% ls /sys/module/amdgpu/drivers/pci:amdgpu
module  bind  new_id  remove_id  uevent  unbind

% find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
/sys/bus/pci/devices/0000:01:00.0/current_link_width
8
/sys/bus/pci/devices/0000:01:00.0/current_link_speed
8.0 GT/s PCIe

## While running the VM, before placing the AMD GPU under heavy load

% find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
/sys/bus/pci/devices/0000:01:00.0/current_link_width
8
/sys/bus/pci/devices/0000:01:00.0/current_link_speed
2.5 GT/s PCIe

## While running the VM, with the AMD GPU under heavy load

% find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
/sys/bus/pci/devices/0000:01:00.0/current_link_width
8
/sys/bus/pci/devices/0000:01:00.0/current_link_speed
8.0 GT/s PCIe

## While running the VM, after stopping the heavy load on the AMD GPU

% find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
/sys/bus/pci/devices/0000:01:00.0/current_link_width
8
/sys/bus/pci/devices/0000:01:00.0/current_link_speed
2.5 GT/s PCIe

## After stopping the VM

% find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
/sys/bus/pci/devices/0000:01:00.0/current_link_width
8
/sys/bus/pci/devices/0000:01:00.0/current_link_speed
2.5 GT/s PCIe


# Without the PCI devices bound to `vfio-pci`

% ls /sys/module/amdgpu/drivers/pci:amdgpu
0000:01:00.0  module  bind  new_id  remove_id  uevent  unbind

% for f in /sys/module/amdgpu/drivers/pci:amdgpu/*/pp_dpm_*; do echo "$f"; cat "$f"; echo; done
/sys/module/amdgpu/drivers/pci:amdgpu/0000:01:00.0/pp_dpm_mclk
0: 300Mhz
1: 625Mhz
2: 1500Mhz *

/sys/module/amdgpu/drivers/pci:amdgpu/0000:01:00.0/pp_dpm_pcie
0: 2.5GT/s, x8
1: 8.0GT/s, x16 *

/sys/module/amdgpu/drivers/pci:amdgpu/0000:01:00.0/pp_dpm_sclk
0: 214Mhz
1: 501Mhz
2: 850Mhz
3: 1034Mhz
4: 1144Mhz
5: 1228Mhz
6: 1275Mhz
7: 1295Mhz *

% find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
/sys/bus/pci/devices/0000:01:00.0/current_link_width
8
/sys/bus/pci/devices/0000:01:00.0/current_link_speed
8.0 GT/s PCIe
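
In the `pp_dpm_*` listings above, the currently selected level is the line marked with `*`. For comparing kernels, a small filter can pull out just that line. This is a minimal sketch: the helper name `active_dpm_level` is hypothetical, and the sample input is the `pp_dpm_mclk` listing from this message.

```sh
#!/bin/sh
# Print only the active DPM level (the line ending in "*"), with the
# marker stripped. Reads the named file(s), or stdin if none are given.
active_dpm_level() {
    sed -n 's/[[:space:]]*\*[[:space:]]*$//p' "$@"
}

# Sample input copied from the pp_dpm_mclk output above.
active_dpm_level <<'EOF'
0: 300Mhz
1: 625Mhz
2: 1500Mhz *
EOF
# prints: 2: 1500Mhz
```

On a host where the GPU is bound to `amdgpu`, the same function can be pointed at a `pp_dpm_sclk` or `pp_dpm_pcie` file instead of the here-document.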


James
