* [Bug 111229] Unable to unbind GPU from amdgpu
@ 2019-07-27  4:14 bugzilla-daemon
  2019-07-27  4:14 ` bugzilla-daemon
                   ` (18 more replies)
  0 siblings, 19 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-07-27  4:14 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1933 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

            Bug ID: 111229
           Summary: Unable to unbind GPU from amdgpu
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: wedens13@yandex.ru

Created attachment 144877
  --> https://bugs.freedesktop.org/attachment.cgi?id=144877&action=edit
dmesg kernel 5.2.1

Arch Linux
Kernel version: 5.2.1

I have two GPUs in my system: an integrated Intel GPU and a Sapphire Pulse
Vega 56. I boot with Intel as my primary GPU and use the Vega for VFIO (GPU
passthrough) and GPU offloading.
What I'm trying to do is boot with the amdgpu driver bound to the Vega and
rebind it to vfio-pci when I start a VM (QEMU).

The problem occurs when I try to unbind the Vega from the amdgpu driver using
this command:
echo -n "0000:03:00.0" > /sys/bus/pci/drivers/amdgpu/unbind

It results in a segfault with the following error in dmesg (the full dmesg
from boot to shutdown is attached):
[drm:amdgpu_pci_remove [amdgpu]] *ERROR* Device removal is currently not supported outside of fbcon

After that I'm unable to rebind the device to amdgpu or any other driver:
echo "0000:03:00.0" > /sys/bus/pci/drivers/amdgpu/bind
bash: echo: write error: No such device

I'm also unable to shut down properly. The shutdown process gets stuck at some
point, and only holding the power button helps.

I've attached the relevant lspci -vvv output from before and after the unbind
attempt, in case it's useful.

I also tried to unbind on kernel 4.19.60, and it just hangs after executing
the command. I've attached the log of this attempt (the error is different
from the one on 5.2.1).
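
Before and after an unbind attempt like the one above, it can help to confirm
which driver the device is actually bound to. The following is a minimal
sketch, not part of the original report: it reads the standard "driver"
symlink under the PCI device's sysfs directory. The function name
current_driver and the SYSFS_ROOT override (only there so the function can be
tried against a fake tree) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: print the name of the driver a PCI device is bound to, or "none".
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
current_driver() {
    dev="$1"
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$dev/driver"
    if [ -L "$link" ]; then
        # The "driver" symlink points at the bound driver's sysfs directory.
        basename "$(readlink "$link")"
    else
        # No symlink means no driver is currently bound.
        echo "none"
    fi
}

# Example (address taken from the report): current_driver 0000:03:00.0
```

If this prints "none" after a failed unbind and a later bind still fails, the
device is in the stuck state described above.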

-- 
You are receiving this mail because:
You are the assignee for the bug.

[-- Attachment #1.2: Type: text/html, Size: 3367 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
@ 2019-07-27  4:14 ` bugzilla-daemon
  2019-07-27  4:14 ` bugzilla-daemon
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-07-27  4:14 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 299 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #1 from wedens13@yandex.ru ---
Created attachment 144878
  --> https://bugs.freedesktop.org/attachment.cgi?id=144878&action=edit
dmesg kernel 4.19.60

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
  2019-07-27  4:14 ` bugzilla-daemon
@ 2019-07-27  4:14 ` bugzilla-daemon
  2019-07-27  4:15 ` bugzilla-daemon
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-07-27  4:14 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 303 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #2 from wedens13@yandex.ru ---
Created attachment 144879
  --> https://bugs.freedesktop.org/attachment.cgi?id=144879&action=edit
lspci -vvv before unbind

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
  2019-07-27  4:14 ` bugzilla-daemon
  2019-07-27  4:14 ` bugzilla-daemon
@ 2019-07-27  4:15 ` bugzilla-daemon
  2019-07-27  5:47 ` bugzilla-daemon
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-07-27  4:15 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 302 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #3 from wedens13@yandex.ru ---
Created attachment 144880
  --> https://bugs.freedesktop.org/attachment.cgi?id=144880&action=edit
lspci -vvv after unbind

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (2 preceding siblings ...)
  2019-07-27  4:15 ` bugzilla-daemon
@ 2019-07-27  5:47 ` bugzilla-daemon
  2019-07-27  5:47 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-07-27  5:47 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 451 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

wedens13@yandex.ru changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.freedesktop.org/show_bug.cgi?id=106993

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (3 preceding siblings ...)
  2019-07-27  5:47 ` bugzilla-daemon
@ 2019-07-27  5:47 ` bugzilla-daemon
  2019-07-27  5:49 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-07-27  5:47 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 451 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

wedens13@yandex.ru changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.freedesktop.org/show_bug.cgi?id=101946

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (4 preceding siblings ...)
  2019-07-27  5:47 ` bugzilla-daemon
@ 2019-07-27  5:49 ` bugzilla-daemon
  2019-07-28 11:35 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-07-27  5:49 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 293 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #4 from wedens13@yandex.ru ---
My first guess is that unbinding causes a GPU reset, which is known to leave
the GPU in a messy state ("the reset bug").

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (5 preceding siblings ...)
  2019-07-27  5:49 ` bugzilla-daemon
@ 2019-07-28 11:35 ` bugzilla-daemon
  2019-07-28 18:38 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-07-28 11:35 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 970 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #5 from wedens13@yandex.ru ---
Created attachment 144896
  --> https://bugs.freedesktop.org/attachment.cgi?id=144896&action=edit
unbinding without X running

I've attached a log of an attempt to unbind without X running:

systemctl stop sddm
echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind || true

echo "0000:03:00.0" > /sys/bus/pci/devices/0000:03:00.0/driver/unbind

The result is the same, but the backtrace seems a bit different. This was done
with kernel 5.2.1.

I've tried suspend-to-RAM and another reset bug mitigation (which helps in
other cases), but the GPU is still unusable after this failed unbind attempt.
I still can't rebind it to amdgpu or vfio-pci, and a clean shutdown still
doesn't happen.
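
The console-detach step above hardcodes vtcon0 and vtcon1. As a hedged
sketch only, the same step can be written as a loop over every vtconsole
entry, so it still works if the system exposes a different number of them.
The function name detach_vtconsoles and the VTCON_ROOT override (used for dry
runs against a fake directory) are assumptions, not something from the
original commands.

```shell
#!/bin/sh
# Sketch: detach every virtual-terminal console driver, not only vtcon0/1.
# VTCON_ROOT is a hypothetical override; it defaults to /sys/class/vtconsole.
detach_vtconsoles() {
    root="${VTCON_ROOT:-/sys/class/vtconsole}"
    for vt in "$root"/vtcon*; do
        # Skip entries without a writable "bind" attribute (this also covers
        # the case where the glob matched nothing).
        [ -w "$vt/bind" ] || continue
        # Writing 0 unbinds that console driver from the virtual terminals.
        echo 0 > "$vt/bind"
    done
}
```

On a real system this needs root, like the original echo commands.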

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (6 preceding siblings ...)
  2019-07-28 11:35 ` bugzilla-daemon
@ 2019-07-28 18:38 ` bugzilla-daemon
  2019-07-29  8:54 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-07-28 18:38 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 516 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #6 from wedens13@yandex.ru ---
This seems to be a regression.

I can unbind from amdgpu and bind to vfio-pci just fine on kernel
4.19.60-1-lts.

I was able to unbind without the previous error after:

echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind || true
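
For the "unbind from amdgpu and bind to vfio-pci" step itself, one common
approach is the PCI driver_override attribute followed by a re-probe. The
sketch below is an illustration under assumptions, not the exact procedure
used in this report: the function name rebind_to and the SYSFS_ROOT override
are hypothetical, and it assumes the target driver module (for example
vfio-pci) is already loaded.

```shell
#!/bin/sh
# Sketch: move a PCI device from its current driver to another one.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
rebind_to() {
    dev="$1"    # PCI address, e.g. 0000:03:00.0
    drv="$2"    # target driver name, e.g. vfio-pci
    sys="${SYSFS_ROOT:-/sys}"
    devdir="$sys/bus/pci/devices/$dev"

    # Unbind from whatever driver currently owns the device, if any.
    if [ -e "$devdir/driver/unbind" ]; then
        echo "$dev" > "$devdir/driver/unbind"
    fi

    # Restrict the device to the named driver, then ask the PCI core to
    # probe it again so that driver picks it up.
    echo "$drv" > "$devdir/driver_override"
    echo "$dev" > "$sys/bus/pci/drivers_probe"
}

# Example: rebind_to 0000:03:00.0 vfio-pci
```

The same function could be called with amdgpu as the second argument to hand
the card back, assuming the unbind itself completed without the failure
described in this bug.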

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (7 preceding siblings ...)
  2019-07-28 18:38 ` bugzilla-daemon
@ 2019-07-29  8:54 ` bugzilla-daemon
  2019-07-29  8:54 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-07-29  8:54 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 427 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

Michel Dänzer <michel@daenzer.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #144877|text/x-log                  |text/plain
          mime type|                            |

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (8 preceding siblings ...)
  2019-07-29  8:54 ` bugzilla-daemon
@ 2019-07-29  8:54 ` bugzilla-daemon
  2019-07-29 13:09 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-07-29  8:54 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 427 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

Michel Dänzer <michel@daenzer.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #144878|text/x-log                  |text/plain
          mime type|                            |

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (9 preceding siblings ...)
  2019-07-29  8:54 ` bugzilla-daemon
@ 2019-07-29 13:09 ` bugzilla-daemon
  2019-08-06  0:14 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-07-29 13:09 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 543 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #7 from wedens13@yandex.ru ---
Created attachment 144907
  --> https://bugs.freedesktop.org/attachment.cgi?id=144907&action=edit
kernel 5.1

I've narrowed the regression down to kernel 5.1. There are a lot of amdgpu
changes in 5.1 (Vega-related changes specifically).

I hope someone more knowledgeable about amdgpu will be able to find what
exactly in 5.1 breaks unbinding. Let me know if I can help.

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (10 preceding siblings ...)
  2019-07-29 13:09 ` bugzilla-daemon
@ 2019-08-06  0:14 ` bugzilla-daemon
  2019-09-03 19:06 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-08-06  0:14 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 585 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #8 from Eugene Shatsky <eugene@shatsky.net> ---
Created attachment 144952
  --> https://bugs.freedesktop.org/attachment.cgi?id=144952&action=edit
another kernel, another disasterous unbind attempt

I haven't been able to rebind my RX 470 or shut down the system cleanly after
unbinding it on any kernel my NixOS has had since I got it last winter. I
reproduced the OP's method on 4.19.64 and got severe warnings and an oops;
"modprobe -r amdgpu" just hangs.

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (11 preceding siblings ...)
  2019-08-06  0:14 ` bugzilla-daemon
@ 2019-09-03 19:06 ` bugzilla-daemon
  2019-10-05 22:13 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-09-03 19:06 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 720 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #9 from wedens13@yandex.ru ---
I'll do more testing, but it seems that unbind works with kernel 5.3-rc7.

This error still appears in the log:
[drm:amdgpu_pci_remove [amdgpu]] *ERROR* Device removal is currently not supported outside of fbcon
but there are no backtraces, and unbind seems to succeed both with and without
X running (on the other GPU, of course).

It'd be nice to have confirmation from other people.

Note that to bind the GPU to vfio-pci, the reset app must be used after
unbinding from amdgpu:
https://forum.level1techs.com/t/vega-10-and-12-reset-application/145666

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (12 preceding siblings ...)
  2019-09-03 19:06 ` bugzilla-daemon
@ 2019-10-05 22:13 ` bugzilla-daemon
  2019-10-21  7:22 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-10-05 22:13 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 568 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #10 from Eugene Shatsky <eugene@shatsky.net> ---
I confirm that on 5.3-rc7 I could unbind/bind the RX 470 multiple times and
shut the system down cleanly afterwards. I got a warning with a trace in
dmesg; I'm now going to check whether it affects system stability and whether
my goal of switching the Radeon-powered seat between a Linux desktop (without
a persistent session, of course) and a virtual machine is now reachable.

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (13 preceding siblings ...)
  2019-10-05 22:13 ` bugzilla-daemon
@ 2019-10-21  7:22 ` bugzilla-daemon
  2019-11-04  4:08 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-10-21  7:22 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 3614 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #11 from Eugene Shatsky <eugene@shatsky.net> ---
Since my last comment I've used this a dozen times to switch between the Linux
desktop and a Windows VM. One time amdgpu crashed after resume from suspend,
but I'm not sure whether that was related to this bug, and I was still able to
reboot afterwards.
However, I still sometimes get this warning on unbind:

WARNING: CPU: 0 PID: 1109 at drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:929 amdgpu_bo_unpin+0xc8/0xf0 [amdgpu]
Modules linked in: vfio_pci vfio_virqfd vfio_iommu_type1 vfio fuse amdgpu amd_iommu_v2 gpu_sched ttm xt_CHECKSUM xt_MASQUERADE ipt_REJECT nf_rejec>
 nf_conntrack nf_defrag_ipv4 libcrc32c zsmalloc ip6t_rpfilter ipt_rpfilter ip6table_raw iptable_raw xt_pkttype nf_log_ipv6 nf_log_ipv4 nf_log_comm>
CPU: 0 PID: 1109 Comm: .libvirtd-wrapp Tainted: G           O      5.3.0-rc7 #1-NixOS
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./H61M-DGS R2.0, BIOS P1.10 10/01/2013
RIP: 0010:amdgpu_bo_unpin+0xc8/0xf0 [amdgpu]
Code: ff 48 83 c0 0c 48 39 d0 75 ea 48 8d 73 30 48 8d 7b 50 48 8d 54 24 08 e8 46 1f d8 ff 85 c0 74 a1 e9 30 6c 21 00 e8 28 f9 6b f5 <0f> 0b 48 8b >
RSP: 0018:ffffa4df00a4bd28 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8c60449a4800 RCX: 0000000000000002
RDX: ffff8c60423c9b00 RSI: 0000000000000000 RDI: ffff8c60449a4800
RBP: ffff8c6008fa4058 R08: 0000000000000000 R09: ffffffffc0b3c000
R10: ffff8c60449a2800 R11: 0000000000000001 R12: ffff8c6008fa6378
R13: ffff8c6008fa6370 R14: ffff8c6008fa4058 R15: ffff8c6008d7f260
FS:  00007fac9a81f700(0000) GS:ffff8c605f400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffea51ccff8 CR3: 00000004048c4003 CR4: 00000000001606f0
Call Trace:
 amdgpu_bo_free_kernel+0x6b/0x120 [amdgpu]
 amdgpu_gfx_rlc_fini+0x47/0x70 [amdgpu]
 gfx_v8_0_sw_fini+0xa1/0x1a0 [amdgpu]
 amdgpu_device_fini+0x257/0x479 [amdgpu]
 amdgpu_driver_unload_kms+0x4a/0x90 [amdgpu]
 drm_dev_unregister+0x4b/0xb0 [drm]
 amdgpu_pci_remove+0x25/0x50 [amdgpu]
 pci_device_remove+0x3b/0xc0
 device_release_driver_internal+0xd8/0x1b0
 unbind_store+0x94/0x120
 kernfs_fop_write+0x108/0x190
 vfs_write+0xa5/0x1a0
 ksys_write+0x59/0xd0
 do_syscall_64+0x4e/0x120
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7faca4a7b36f
Code: 1f 40 00 41 54 55 49 89 d4 53 48 89 f5 89 fb 48 83 ec 10 e8 53 fd ff ff 4c 89 e2 41 89 c0 48 89 ee 89 df b8 01 00 00 00 0f 05 <48> 3d 00 f0 >
RSP: 002b:00007fac9a81e4d0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000012 RCX: 00007faca4a7b36f
RDX: 000000000000000c RSI: 00007fac84019a20 RDI: 0000000000000012
RBP: 00007fac84019a20 R08: 0000000000000000 R09: 000000000000002f
R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000000c
R13: 0000000000000000 R14: 0000000000000012 R15: 00007fac9a81e568
---[ end trace ffd153eee3d00ec4 ]---
amdgpu 0000:01:00.0: 00000000001146cc unpin not necessary

It's produced by
https://github.com/torvalds/linux/blob/574cc4539762561d96b456dbc0544d8898bd4c6e/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c#L937
I wonder whether the buffer object pin count is something like a reference
count.

Also, it looks like the message

*ERROR* Device removal is currently not supported outside of fbcon

is printed unconditionally, without checking whether DRM nodes are in use by
userspace clients. I wonder if it's possible to implement such a check and
prevent the unbind if they are.
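
The check suggested above would live in the kernel. Purely as a userspace
approximation, and not what the driver does, a script can look for processes
that hold /dev/dri/* nodes open before attempting the unbind. The sketch
below scans the fd symlinks under /proc; the function name drm_node_users and
the PROC_ROOT override (for dry runs against a fake tree) are assumptions. It
lists users of any DRM node, not only those of the card being unbound, and it
needs root to see other users' processes.

```shell
#!/bin/sh
# Sketch: list "PID node" pairs for processes holding a DRM device node open.
# PROC_ROOT is a hypothetical override for dry runs; it defaults to /proc.
drm_node_users() {
    proc="${PROC_ROOT:-/proc}"
    for fd in "$proc"/[0-9]*/fd/*; do
        # Each fd entry is a symlink to the open file; skip unreadable ones.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            /dev/dri/*)
                # Strip the proc prefix and everything after the PID.
                pid=${fd#"$proc"/}
                pid=${pid%%/*}
                echo "$pid $target"
                ;;
        esac
    done | sort -u
}

# Example: refuse to continue while any DRM node is still open.
# [ -z "$(drm_node_users)" ] || echo "DRM nodes still in use" >&2
```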

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (14 preceding siblings ...)
  2019-10-21  7:22 ` bugzilla-daemon
@ 2019-11-04  4:08 ` bugzilla-daemon
  2019-11-04  4:09 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-11-04  4:08 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 985 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #12 from Andrew B <abingham@gmail.com> ---
Fedora 31, 5.3.1 kernel, 5700XT - still seeing problems with unbinding from the
AMDGPU driver.  

I have video=efifb:off in my kernel parameters to keep the efifb from ever
using the card.
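
To double-check that a boot parameter such as video=efifb:off really reached
the running kernel, the command line can be read back from /proc/cmdline. A
minimal sketch, with the function name has_kernel_param and the CMDLINE_FILE
override (for testing against a sample file) being assumptions; it only
matches the parameter as an exact, whole word.

```shell
#!/bin/sh
# Sketch: succeed if the given parameter appears verbatim on the kernel
# command line. CMDLINE_FILE is a hypothetical override; it defaults to
# /proc/cmdline.
has_kernel_param() {
    # Put each space-separated word on its own line, then require an exact,
    # fixed-string, whole-line match.
    tr ' ' '\n' < "${CMDLINE_FILE:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example: has_kernel_param video=efifb:off && echo "efifb disabled at boot"
```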

After stopping X and unbinding vtcon0 and vtcon1, attempting to unbind the
driver from the card yields the following error. I cannot bind a new driver to
the card, and I can't shut down cleanly.

[  140.760872] fbcon: Taking over console
[  140.773454] Console: switching to colour frame buffer device 320x90
[  577.562635] Console: switching to colour dummy device 80x25
[  679.403956] VFIO - User Level meta-driver version: 0.3
[  679.410718] [drm:amdgpu_pci_remove [amdgpu]] *ERROR* Device removal is currently not supported outside of fbcon
[  679.410938] [drm] amdgpu: finishing device.

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (15 preceding siblings ...)
  2019-11-04  4:08 ` bugzilla-daemon
@ 2019-11-04  4:09 ` bugzilla-daemon
  2019-11-05 10:28 ` bugzilla-daemon
  2019-11-19  9:37 ` bugzilla-daemon
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-11-04  4:09 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 258 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #13 from Andrew B <abingham@gmail.com> ---
My comment above should have referenced 5.3.7 as the kernel version.

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (16 preceding siblings ...)
  2019-11-04  4:09 ` bugzilla-daemon
@ 2019-11-05 10:28 ` bugzilla-daemon
  2019-11-19  9:37 ` bugzilla-daemon
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-11-05 10:28 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 397 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

--- Comment #14 from wedens13@yandex.ru ---
(In reply to Andrew B from comment #13)
> My comment above should have reference 5.3.7 as the kernel version.

For Navi, you can try this kernel patch:
https://forum.level1techs.com/t/navi-reset-kernel-patch/147547

* [Bug 111229] Unable to unbind GPU from amdgpu
  2019-07-27  4:14 [Bug 111229] Unable to unbind GPU from amdgpu bugzilla-daemon
                   ` (17 preceding siblings ...)
  2019-11-05 10:28 ` bugzilla-daemon
@ 2019-11-19  9:37 ` bugzilla-daemon
  18 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2019-11-19  9:37 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 806 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111229

Martin Peres <martin.peres@free.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED

--- Comment #15 from Martin Peres <martin.peres@free.fr> ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been
closed from further activity.

You can subscribe and participate further through the new bug through this link
to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/878.
