From: Alex Williamson <alex.williamson@redhat.com>
To: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: "Paul Menzel" <pmenzel@molgen.mpg.de>,
	"James Turner" <linuxkernel.foss@dmarc-none.turner.link>,
	"Xinhui Pan" <Xinhui.Pan@amd.com>,
	regressions@lists.linux.dev, kvm@vger.kernel.org,
	"Greg KH" <gregkh@linuxfoundation.org>,
	"Lijo Lazar" <lijo.lazar@amd.com>,
	LKML <linux-kernel@vger.kernel.org>,
	amd-gfx@lists.freedesktop.org,
	"Alexander Deucher" <Alexander.Deucher@amd.com>,
	"Alex Deucher" <alexdeucher@gmail.com>,
	"Christian König" <Christian.Koenig@amd.com>
Subject: Re: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM
Date: Fri, 18 Mar 2022 08:46:25 -0600
Message-ID: <20220318084625.27d42a51.alex.williamson@redhat.com>
In-Reply-To: <bc714e87-d1dc-cdda-5a29-25820faaff40@leemhuis.info>

On Fri, 18 Mar 2022 08:01:31 +0100
Thorsten Leemhuis <regressions@leemhuis.info> wrote:

> On 18.03.22 06:43, Paul Menzel wrote:
> >
> > Am 17.03.22 um 13:54 schrieb Thorsten Leemhuis:  
> >> On 13.03.22 19:33, James Turner wrote:  
> >>>  
> >>>> My understanding at this point is that the root problem is probably
> >>>> not in the Linux kernel but rather something else (e.g. the machine
> >>>> firmware or AMD Windows driver) and that the change in f9b7f3703ff9
> >>>> ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)") simply
> >>>> exposed the underlying problem.  
> >>
> >> FWIW: that in the end is irrelevant when it comes to the Linux kernel's
> >> 'no regressions' rule. For details see:
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/admin-guide/reporting-regressions.rst
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/process/handling-regressions.rst
> >>
> >>
> >> That being said: sometimes for the greater good it's better to not
> >> insist on that. And I guess that might be the case here.  
> > 
> > But who decides that?  
> 
> In the end afaics: Linus. But he can't watch each and every discussion,
> so it partly falls down to people discussing a regression, as they can
> always decide to get him involved in case they are unhappy with how a
> regression is handled. That obviously includes me in this case. I simply
> use my best judgement in such situations. I'm still undecided if that
> path is appropriate here, that's why I wrote above to see what James
> would say, as he afaics was the only one that reported this regression.
> 
> > Running stuff in a virtual machine is not that uncommon.  
> 
> No, it's about passing through a GPU to a VM, which is a lot less common
> -- and afaics an area where blacklisting GPUs on the host to pass them
> through is not uncommon (a quick internet search confirmed that, but I
> might be wrong there).

Right, interference from host drivers and pre-boot environments is
always a concern with GPU assignment in particular.  AMD GPUs have a
long history of behaving poorly across things like PCI secondary bus
resets, which we use to try to return devices to a clean, reusable
state for assignment.  Here a device is being bound to a host driver
that initiates some sort of power control, then unbound from that
driver and exposed to new drivers far beyond the scope of the kernel's
regression policy.  Perhaps it's possible to undo such power control
when unbinding the device, but it's not a given that this is possible
for this device without a cold reset.

IMO, it's not fair to restrict the kernel from such advancements.  If
the use case is within a VM, don't bind host drivers.  It's difficult
to make promises when dynamically switching between host and userspace
drivers for devices that don't have functional reset mechanisms.
Thanks,

Alex
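
[Editorial illustration, not part of the original message.]  The advice
above, "don't bind host drivers", can be approximated from userspace
with the PCI sysfs `driver_override` attribute.  The sketch below is a
minimal example under stated assumptions: the function name
`reserve_for_vfio`, its return strings, the `sysfs_root` parameter and
the example address 0000:01:00.0 are hypothetical, and it assumes it
runs (as root) before any host GPU driver has probed the device.  It
deliberately refuses to touch a device that a host driver already owns,
since that is the case the message warns about.

```python
import os
from pathlib import Path


def reserve_for_vfio(bdf: str, sysfs_root: str = "/sys") -> str:
    """Ask the PCI core to let only vfio-pci bind the device at `bdf`.

    Returns "reserved" if the override was written while no driver was
    bound, "already-vfio" if vfio-pci already owns the device, and
    raises RuntimeError if some other host driver is currently bound.
    """
    dev = Path(sysfs_root) / "bus" / "pci" / "devices" / bdf
    driver_link = dev / "driver"

    # A bound device exposes a "driver" symlink that points at its driver.
    if driver_link.is_symlink():
        current = os.path.basename(os.readlink(driver_link))
        if current == "vfio-pci":
            return "already-vfio"
        # A host driver has already initialized the device; unbinding it
        # now may not restore a clean state, so do not attempt it here.
        raise RuntimeError(f"{bdf} is already bound to host driver {current}")

    # With driver_override set, only the named driver may bind the device.
    (dev / "driver_override").write_text("vfio-pci\n")
    return "reserved"
```

A call such as `reserve_for_vfio("0000:01:00.0")` only records the
override; the device is bound the next time it is probed (for example
when vfio-pci is loaded).  Whether this runs early enough to beat the
host GPU driver depends on the boot setup, which is outside this sketch.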

