From: bugzilla-daemon@bugzilla.kernel.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 204559] New: amdgpu: kernel oops with constant gpu resets while using mpv
Date: Mon, 12 Aug 2019 10:53:06 +0000
Message-ID: <bug-204559-2300@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=204559

            Bug ID: 204559
           Summary: amdgpu: kernel oops with constant gpu resets while
                    using mpv
           Product: Drivers
           Version: 2.5
    Kernel Version: 5.2.7
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@kernel-bugs.osdl.org
          Reporter: shoegaze@tutanota.com
        Regression: No

Created attachment 284335
  --> https://bugzilla.kernel.org/attachment.cgi?id=284335&action=edit
oops.txt

While watching a video with mpv (default config), the system eventually
hangs. The hang is actually a kernel oops, and it is preceded by a long run of
GPU resets, one every second or so, over a span of ~5 minutes (playback seems
fine at the beginning):
> Aug 12 00:46:49 mashedpotato kernel: [drm] UVD and UVD ENC initialized
> successfully.
> Aug 12 00:46:49 mashedpotato kernel: [drm] VCE initialized successfully.
> Aug 12 00:46:56 mashedpotato kernel: amdgpu 0000:01:00.0: GPU pci config
> reset
> Aug 12 00:46:59 mashedpotato kernel: [drm] PCIE GART of 256M enabled (table
> at 0x000000F400000000).
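
To put a number on "every second or so", the reset messages can be counted
per minute. The following is only a rough sketch: it assumes the journal has
been exported as plain text with one unwrapped message per line (the quotes in
this mail are wrapped by the mail client), and the function name
resets_per_minute is made up for illustration.

```python
import re
from collections import Counter

# Matches an unwrapped syslog-style line such as:
#   Aug 12 00:46:56 mashedpotato kernel: amdgpu 0000:01:00.0: GPU pci config reset
# Group 1 captures the timestamp truncated to the minute ("Aug 12 00:46").
RESET_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+kernel:\s+"
    r"amdgpu\s+\S+\s+GPU pci config reset"
)

def resets_per_minute(lines):
    """Count 'GPU pci config reset' messages, grouped by minute."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts

if __name__ == "__main__":
    sample = [
        "Aug 12 00:46:49 mashedpotato kernel: [drm] VCE initialized successfully.",
        "Aug 12 00:46:56 mashedpotato kernel: amdgpu 0000:01:00.0: GPU pci config reset",
    ]
    for minute, count in sorted(resets_per_minute(sample).items()):
        print(minute, count)
```

Feeding it the full kernel log instead of the two sample lines would show
whether the reset rate really stays constant until the oops.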


This block of warnings repeats many times, and then this error appears:
> Aug 12 00:52:20 mashedpotato kernel: amdgpu 0000:01:00.0:
> [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110)
> Aug 12 00:52:20 mashedpotato kernel: [drm:amdgpu_device_resume [amdgpu]]
> *ERROR* resume of IP block <sdma_v3_0> failed -110
> Aug 12 00:52:20 mashedpotato kernel: [drm:amdgpu_device_resume [amdgpu]]
> *ERROR* amdgpu_device_ip_resume failed (-110).
> Aug 12 00:52:25 mashedpotato kernel: BUG: kernel NULL pointer dereference,
> address: 0000000000000000
> Aug 12 00:52:25 mashedpotato kernel: #PF: supervisor instruction fetch in
> kernel mode
> Aug 12 00:52:25 mashedpotato kernel: #PF: error_code(0x0010) - not-present
> page
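
For readers unfamiliar with the numeric codes in the messages above, here is a
small decoding sketch. The errno table is a hand-picked subset of the Linux
values and the helper names are hypothetical, not kernel interfaces. The
page-fault bit layout is the standard x86 one. Read this way, -110 is
ETIMEDOUT, and error_code 0x0010 at address 0 is an instruction fetch from a
not-present page in supervisor mode, which would be consistent with a call
through a NULL function pointer (the attached oops is needed to confirm that).

```python
# Small subset of Linux errno values (asm-generic errno headers).
LINUX_ERRNO = {5: "EIO", 12: "ENOMEM", 16: "EBUSY", 22: "EINVAL", 110: "ETIMEDOUT"}

# x86 page-fault error code bits: (mask, meaning if set, meaning if clear).
# A clear-meaning of None means the bit is only reported when it is set.
PF_BITS = [
    (0x01, "protection violation", "not-present page"),
    (0x02, "write access", None),
    (0x04, "user mode", "supervisor mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]

def errno_name(code):
    """Map a kernel-style negative return value (e.g. -110) to its errno name."""
    return LINUX_ERRNO.get(abs(code), "unknown")

def decode_pf_error_code(code):
    """Return human-readable descriptions of the bits in a #PF error code."""
    parts = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            parts.append(if_set)
        elif if_clear is not None:
            parts.append(if_clear)
    return parts

if __name__ == "__main__":
    print(errno_name(-110))
    print(decode_pf_error_code(0x0010))
```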


It ends in a kernel oops; the full log is in the attachment. Afterwards the
system can only be recovered with a hard reset, although the audio from the
video keeps playing just fine.


My system is an ASUS laptop, a TUF FX505-DY with the latest BIOS. lspci:
> 00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Root
> Complex
> 00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 IOMMU
> 00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge
> 00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP
> Bridge [6:0]
> 00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP
> Bridge [6:0]
> 00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP
> Bridge [6:0]
> 00:01.4 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP
> Bridge [6:0]
> 00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge
> 00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Internal
> PCIe GPP Bridge 0 to Bus A
> 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev
> 61)
> 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev
> 51)
> 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device
> 24: Function 0
> 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device
> 24: Function 1
> 00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device
> 24: Function 2
> 00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device
> 24: Function 3
> 00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device
> 24: Function 4
> 00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device
> 24: Function 5
> 00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device
> 24: Function 6
> 00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device
> 24: Function 7
> 01:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Baffin
> [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X] (rev e5)
> 02:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc.
> Device 5008 (rev 01)
> 03:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8821CE
> 802.11ac PCIe Wireless Network Adapter
> 04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
> RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
> 05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
> Picasso (rev c2)
> 05:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI]
> Raven/Raven2/Fenghuang HDMI/DP Audio Controller
> 05:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h
> (Models 10h-1fh) Platform Security Processor
> 05:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1
> 05:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1
> 05:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 10h-1fh) HD Audio Controller

I have amdgpu.gpu_reset=1 on my kernel command line because I am trying to
debug another issue: sometimes the system hangs after locking and turning off
the screen, and I suspect it is related to GPU reset.
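
When comparing setups, it can help to list exactly which amdgpu options were
passed at boot. A minimal sketch follows, assuming the command line is taken
from /proc/cmdline. The function name module_params and the sample string are
illustrative only. The parsing is deliberately simplified: it splits on
whitespace and ignores quoting.

```python
def module_params(cmdline, module):
    """Collect 'module.key=value' options from a kernel command line string."""
    prefix = module + "."
    params = {}
    for token in cmdline.split():
        if token.startswith(prefix):
            # "amdgpu.gpu_reset=1" -> key "gpu_reset", value "1"
            key, _, value = token[len(prefix):].partition("=")
            params[key] = value
    return params

# Hypothetical command line; on a running system, read /proc/cmdline instead.
sample = "root=/dev/sda1 amdgpu.gpu_reset=1 quiet"
print(module_params(sample, "amdgpu"))  # {'gpu_reset': '1'}
```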

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
