From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from wp530.webpack.hosteurope.de (wp530.webpack.hosteurope.de [80.237.130.52])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by smtp.subspace.kernel.org (Postfix) with ESMTPS id 890893D7B
    for ; Fri, 18 Mar 2022 07:01:36 +0000 (UTC)
Received: from ip4d144895.dynamic.kabel-deutschland.de ([77.20.72.149] helo=[192.168.66.200]);
    authenticated by wp530.webpack.hosteurope.de running ExIM with esmtpsa (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128)
    id 1nV6c9-00066r-PZ; Fri, 18 Mar 2022 08:01:33 +0100
Message-ID:
Date: Fri, 18 Mar 2022 08:01:31 +0100
Precedence: bulk
X-Mailing-List: regressions@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0
Subject: Re: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM
Content-Language: en-US
To: Paul Menzel , James Turner
Cc: Xinhui Pan , regressions@lists.linux.dev, kvm@vger.kernel.org,
    Greg KH , Lijo Lazar , LKML , amd-gfx@lists.freedesktop.org,
    Alexander Deucher , Alex Williamson , Alex Deucher ,
    =?UTF-8?Q?Christian_K=c3=b6nig?=
References: <87ee57c8fu.fsf@turner.link> <87zgnp96a4.fsf@turner.link>
    <87czkk1pmt.fsf@dmarc-none.turner.link> <87sftfqwlx.fsf@dmarc-none.turner.link>
    <87ee4wprsx.fsf@turner.link> <4b3ed7f6-d2b6-443c-970e-d963066ebfe3@amd.com>
    <87pmo8r6ob.fsf@turner.link> <5a68afe4-1e9e-c683-e06d-30afc2156f14@leemhuis.info>
    <87pmnnpmh5.fsf@dmarc-none.turner.link> <092b825a-10ff-e197-18a1-d3e3a097b0e3@leemhuis.info>
    <877d96to55.fsf@dmarc-none.turner.link> <87lexdw8gd.fsf@turner.link>
    <40b3084a-11b8-0962-4b33-34b56d3a87a3@molgen.mpg.de>
From: Thorsten Leemhuis
In-Reply-To: <40b3084a-11b8-0962-4b33-34b56d3a87a3@molgen.mpg.de>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-bounce-key: webpack.hosteurope.de;regressions@leemhuis.info;1647586896;206b6243;
X-HE-SMSGID: 1nV6c9-00066r-PZ

On 18.03.22 06:43, Paul Menzel wrote:
>
> On 17.03.22 at 13:54, Thorsten Leemhuis wrote:
>> On 13.03.22 19:33, James Turner wrote:
>>>
>>>> My understanding at this point is that the root problem is probably
>>>> not in the Linux kernel but rather something else (e.g. the machine
>>>> firmware or AMD Windows driver) and that the change in f9b7f3703ff9
>>>> ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)") simply
>>>> exposed the underlying problem.
>>
>> FWIW: that in the end is irrelevant when it comes to the Linux kernel's
>> 'no regressions' rule. For details see:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/admin-guide/reporting-regressions.rst
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/process/handling-regressions.rst
>>
>>
>> That being said: sometimes for the greater good it's better to not
>> insist on that. And I guess that might be the case here.
>
> But who decides that?

In the end afaics: Linus. But he can't watch each and every discussion,
so it partly falls to the people discussing a regression, as they can
always decide to get him involved in case they are unhappy with how a
regression is handled. That obviously includes me in this case. I simply
use my best judgement in such situations. I'm still undecided whether
that path is appropriate here; that's why I wrote the above, to see what
James would say, as he afaics was the only one who reported this
regression.

> Running stuff in a virtual machine is not that uncommon.

No, it's about passing through a GPU to a VM, which is a lot less common
-- and afaics an area where blacklisting GPUs on the host to pass them
through is not uncommon (a quick internet search confirmed that, but I
might be wrong there).

> Should the commit be reverted, and re-added with a more elaborate commit
> message documenting the downsides?
>
> Could the user be notified somehow? Can PCI passthrough and a loaded
> amdgpu driver be detected, so Linux warns about this?
>
> Also, should this be documented in the code?
>
>>> I'm not sure where to go from here. This issue isn't much of a concern
>>> for me anymore, since blacklisting `amdgpu` works for my machine. At
>>> this point, my understanding is that the root problem needs to be fixed
>>> in AMD's Windows GPU driver or Dell's firmware, not the Linux kernel. If
>>> any of the AMD developers on this thread would like to forward it to the
>>> AMD Windows driver team, I'd be happy to work with AMD to fix the issue
>>> properly.
>
> (Thorsten, your mailer mangled the quote somehow

Kinda, but IIRC it was more me doing something stupid with my mailer.
Sorry about that.

> – I reformatted it –,

thx!

> which is too bad, as this message is shown when clicking on the link
> *marked invalid* in the regzbot Web page [1]. (The link is a very nice
> feature.)
>
>> In that case I'll drop it from the list of regressions, unless what I
>> wrote above makes you change your mind.
>>
>> #regzbot invalid: firmware issue exposed by kernel change, user seems to
>> be happy with a workaround
>>
>> Thx everyone who participated in handling this.
>
> Should the regression issue be re-opened until the questions above are
> answered, and a more user friendly solution is found?

For now I'll just continue to watch this discussion and see what
happens.

> [1]: https://linux-regtracking.leemhuis.info/regzbot/resolved/

Ciao, Thorsten