From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 103370] `vblank_mode=0 DRI_PRIME=1 glxgears` will introduce GPU lock up on Intel Graphics [8086:5917] + AMD Graphics [1002:6665] (rev c3)
Date: Mon, 20 Nov 2017 15:41:11 +0000
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org
X-Bugzilla-URL: http://bugs.freedesktop.org/

https://bugs.freedesktop.org/show_bug.cgi?id=103370

Comment #33 on bug 103370 from Alex Deucher
(In reply to Michel Dänzer from comment #27)
> Thanks for bisecting, but I don't think that commit can be directly
> responsible for a GPU hang. Before that commit, the DRI3 code in Mesa would
> only use one back buffer for glxgears, which means that the GPU could only
> start rendering a new frame after the previous one had finished presenting.
> Maybe that somehow prevented the hang.

That commit "fixed" a performance regression at the time because it ended up
causing enough of a delay that the clocks didn't ramp up.  So it probably
exposed a kernel dpm issue.  Without it, the clocks never ramped up enough to
cause an issue.  With it, they did.


(In reply to Timo Aaltonen from comment #32)
> forwarding a comment from an engineer:
>
> "During viewing the source code of radeon module, I found there is a bug [1]
> related to the dpm and clocks. So I decided to do some experiments.
> Tried to set different max_sclk and max_mclk to see if the issue is gone.
> 1. max_sclk: 70000, max_mclk: 75000 --> have the same issue
> 2. max_sclk: 50000, max_mclk: 60000 --> pass multi-run test (more than 50
> runs)
>
> [1] https://bugs.freedesktop.org/show_bug.cgi?id=76490
> "

I think Sonny fixed this.  It was due to using the wrong firmware.
[    1.827060] [drm] initializing kernel modesetting (HAINAN 0x1002:0x6665
0x1028:0x0844 0xC3).  This chip should be using radeon/banks_k_2_smc.bin smc
firmware.  Is that available on the test system and kernel?
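
One way to answer the firmware question on a test machine is a quick file
check along the lines below.  This is only a sketch under stated assumptions:
the helper name find_firmware is hypothetical, the search roots
(/lib/firmware and /usr/lib/firmware) and the compression suffixes are
guesses about common distribution layouts, and only the firmware file name
comes from the message above.

```python
from pathlib import Path

# Firmware name taken from the message above.
FIRMWARE = "radeon/banks_k_2_smc.bin"

# Assumed search roots; a given system may keep firmware elsewhere.
DEFAULT_ROOTS = ("/lib/firmware", "/usr/lib/firmware")

# Assumed suffixes: some distributions ship firmware files compressed.
SUFFIXES = ("", ".xz", ".zst")


def find_firmware(name=FIRMWARE, roots=DEFAULT_ROOTS):
    """Return every existing file matching `name` under `roots`.

    Hypothetical helper for illustration; it only inspects the filesystem.
    """
    hits = []
    for root in roots:
        for suffix in SUFFIXES:
            candidate = Path(root) / (name + suffix)
            if candidate.is_file():
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    found = find_firmware()
    if found:
        for path in found:
            print(f"found: {path}")
    else:
        print(f"{FIRMWARE} not found under {', '.join(DEFAULT_ROOTS)}")
```

A hit here only shows that the file is installed.  It does not show that the
running kernel requests this file for the chip, or that the file is present
in the initramfs if the radeon module is loaded from there.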


You are receiving this mail because:
  • You are the assignee for the bug.