* [Bug 99312] Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini
@ 2017-01-07 18:25 bugzilla-daemon
  2017-01-07 18:25 ` bugzilla-daemon
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: bugzilla-daemon @ 2017-01-07 18:25 UTC (permalink / raw)
  To: dri-devel


https://bugs.freedesktop.org/show_bug.cgi?id=99312

            Bug ID: 99312
           Summary: Long-running OpenCL kernels cause ring stalls and GPU
                    lockups on Kabini
           Product: Mesa
           Version: 13.0
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Drivers/Gallium/radeonsi
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: vedran@miletic.net
        QA Contact: dri-devel@lists.freedesktop.org

Running long-lasting OpenCL kernels (e.g. GROMACS with a system of many atoms)
using kernel 4.8.15, Mesa git, and LLVM git on a Kabini APU:

vendor_id       : AuthenticAMD
cpu family      : 22
model           : 0
model name      : AMD Athlon(tm) 5350 APU with Radeon(tm) R3
stepping        : 1
microcode       : 0x700010b

with GPU:

00:01.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc.
[AMD/ATI] Kabini [Radeon HD 8400 / R3 Series] [1002:9830]

causes GPU lockups like:

[338584.980657] radeon 0000:00:01.0: ring 0 stalled for more than 10351msec
[338584.980811] radeon 0000:00:01.0: GPU lockup (current fence id
0x00000000000827c1 last fence id 0x00000000000827c2 on ring 0)
[338585.484633] radeon 0000:00:01.0: ring 0 stalled for more than 10855msec
[338585.484789] radeon 0000:00:01.0: GPU lockup (current fence id
0x00000000000827c1 last fence id 0x00000000000827c2 on ring 0)
[338585.988632] radeon 0000:00:01.0: ring 0 stalled for more than 11359msec
[338585.988787] radeon 0000:00:01.0: GPU lockup (current fence id
0x00000000000827c1 last fence id 0x00000000000827c2 on ring 0)

The machine does not hang. This is reliably reproducible. Is there any other
info I can provide?
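
For anyone comparing logs, the stall messages above can be summarized mechanically. The following is a minimal Python sketch, not part of the original report: the regular expressions and the function name `parse_stalls` are illustrative assumptions based only on the message format shown in the excerpt, and the fence IDs are allowed to wrap onto the next line as they do there.

```python
import re

# Patterns derived from the log excerpt above (assumed, not from driver docs).
STALL_RE = re.compile(r"ring (\d+) stalled for more than (\d+)msec")
FENCE_RE = re.compile(
    r"current fence id\s+(0x[0-9a-fA-F]+)\s+last fence id\s+(0x[0-9a-fA-F]+)"
)


def parse_stalls(log: str):
    """Return ([(ring, stall_ms), ...], {(current_fence, last_fence), ...})."""
    stalls = [(int(ring), int(ms)) for ring, ms in STALL_RE.findall(log)]
    fences = {(int(cur, 16), int(last, 16)) for cur, last in FENCE_RE.findall(log)}
    return stalls, fences


# Two of the reported message pairs, with the same line wrapping as the report.
SAMPLE = """\
[338584.980657] radeon 0000:00:01.0: ring 0 stalled for more than 10351msec
[338584.980811] radeon 0000:00:01.0: GPU lockup (current fence id
0x00000000000827c1 last fence id 0x00000000000827c2 on ring 0)
[338585.484633] radeon 0000:00:01.0: ring 0 stalled for more than 10855msec
[338585.484789] radeon 0000:00:01.0: GPU lockup (current fence id
0x00000000000827c1 last fence id 0x00000000000827c2 on ring 0)
"""

if __name__ == "__main__":
    stalls, fences = parse_stalls(SAMPLE)
    for ring, ms in stalls:
        print(f"ring {ring}: stalled for more than {ms} ms")
    for cur, last in sorted(fences):
        print(f"fence ids: current={cur:#x} last={last:#x}")
```

Every lockup message in the excerpt reports the same pair of fence IDs, so the parsed set contains a single element while the stall duration keeps growing.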

-- 
You are receiving this mail because:
You are the assignee for the bug.


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [Bug 99312] Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini
  2017-01-07 18:25 [Bug 99312] Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini bugzilla-daemon
@ 2017-01-07 18:25 ` bugzilla-daemon
  2017-01-07 18:36 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2017-01-07 18:25 UTC (permalink / raw)
  To: dri-devel


https://bugs.freedesktop.org/show_bug.cgi?id=99312

Vedran Miletić <vedran@miletic.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Hardware|Other                       |x86-64 (AMD64)
           Severity|normal                      |major
            Version|13.0                        |git
                 OS|All                         |Linux (All)


* [Bug 99312] Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini
  2017-01-07 18:25 [Bug 99312] Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini bugzilla-daemon
  2017-01-07 18:25 ` bugzilla-daemon
@ 2017-01-07 18:36 ` bugzilla-daemon
  2017-01-09 17:02 ` [Bug 99312] Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini when radeon.lockup_timeout is enabled bugzilla-daemon
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2017-01-07 18:36 UTC (permalink / raw)
  To: dri-devel


https://bugs.freedesktop.org/show_bug.cgi?id=99312

--- Comment #1 from John Bridgman <john.bridgman@amd.com> ---
If you have not already done so, try disabling the watchdog timer:


MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default 10000 = 10
seconds, 0 = disable)");
module_param_named(lockup_timeout, radeon_lockup_timeout, int, 0444);

As part of HSA/ROC development we dropped the priority of compute work relative
to graphics which improved interactivity and *almost* eliminated timeouts
without having to disable the timer  - when I get back in the office I'll dig
up the changes. In the meantime, I think disabling the timer will do what you
need although you will still have sluggish graphics while long-running kernels
are active.

Lowering the priority of compute waves across the board won't be a fully
general solution, because there are going to be some cases (e.g. Valve's recent
work on using high-priority compute to improve VR smoothness) where compute
will need to be *higher* priority than graphics, but it should cover most cases
other than "simultaneously running GROMACS and VR".
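
For reference, the parameter quoted above is declared with permissions 0444, so once the radeon module is loaded its current value should be readable, but not writable, through sysfs. The following is a minimal Python sketch, assuming the standard `/sys/module/<module>/parameters/<name>` layout; the helper names `describe_lockup_timeout` and `read_lockup_timeout` are hypothetical and exist only for illustration.

```python
from pathlib import Path

# Assumed sysfs location of the radeon module parameter (standard layout for
# readable module parameters; the path itself is not quoted in this thread).
PARAM_PATH = Path("/sys/module/radeon/parameters/lockup_timeout")


def describe_lockup_timeout(value_ms: int) -> str:
    """Interpret a lockup_timeout value per the parameter description above."""
    if value_ms == 0:
        return "watchdog disabled"
    return f"watchdog fires after {value_ms} ms ({value_ms / 1000:g} s)"


def read_lockup_timeout(path: Path = PARAM_PATH):
    """Return the current timeout in ms, or None if the file is unavailable."""
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, PermissionError):
        return None


if __name__ == "__main__":
    value = read_lockup_timeout()
    if value is None:
        print("radeon module parameter not found (module not loaded?)")
    else:
        print(describe_lockup_timeout(value))
```

Because the parameter is read-only at runtime, changing it means setting it when the module loads, for example with `radeon.lockup_timeout=0` on the kernel command line or an `options radeon lockup_timeout=0` line in a modprobe configuration file.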


* [Bug 99312] Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini when radeon.lockup_timeout is enabled
  2017-01-07 18:25 [Bug 99312] Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini bugzilla-daemon
  2017-01-07 18:25 ` bugzilla-daemon
  2017-01-07 18:36 ` bugzilla-daemon
@ 2017-01-09 17:02 ` bugzilla-daemon
  2019-05-11 20:07 ` bugzilla-daemon
  2019-09-25 17:56 ` bugzilla-daemon
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2017-01-09 17:02 UTC (permalink / raw)
  To: dri-devel


https://bugs.freedesktop.org/show_bug.cgi?id=99312

Vedran Miletić <vedran@miletic.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Long-running OpenCL kernels |Long-running OpenCL kernels
                   |cause ring stalls and GPU   |cause ring stalls and GPU
                   |lockups on Kabini           |lockups on Kabini when
                   |                            |radeon.lockup_timeout is
                   |                            |enabled

--- Comment #2 from Vedran Miletić <vedran@miletic.net> ---
(In reply to John Bridgman from comment #1)
> If you have not already done so, try disabling the watchdog timer:
> 
> 
> MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default 10000 =
> 10 seconds, 0 = disable)");
> module_param_named(lockup_timeout, radeon_lockup_timeout, int, 0444);
> 

Yup, that works around the problem.

> As part of HSA/ROC development we dropped the priority of compute work
> relative to graphics which improved interactivity and *almost* eliminated
> timeouts without having to disable the timer  - when I get back in the
> office I'll dig up the changes. In the meantime, I think disabling the timer
> will do what you need although you will still have sluggish graphics while
> long-running kernels are active.
> 

Eager to hear the details.


* [Bug 99312] Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini when radeon.lockup_timeout is enabled
  2017-01-07 18:25 [Bug 99312] Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini bugzilla-daemon
                   ` (2 preceding siblings ...)
  2017-01-09 17:02 ` [Bug 99312] Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini when radeon.lockup_timeout is enabled bugzilla-daemon
@ 2019-05-11 20:07 ` bugzilla-daemon
  2019-09-25 17:56 ` bugzilla-daemon
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2019-05-11 20:07 UTC (permalink / raw)
  To: dri-devel


https://bugs.freedesktop.org/show_bug.cgi?id=99312

Jan Vesely <jv356@scarletmail.rutgers.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |99553


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=99553
[Bug 99553] Tracker bug for runnning OpenCL applications on Clover

* [Bug 99312] Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini when radeon.lockup_timeout is enabled
  2017-01-07 18:25 [Bug 99312] Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini bugzilla-daemon
                   ` (3 preceding siblings ...)
  2019-05-11 20:07 ` bugzilla-daemon
@ 2019-09-25 17:56 ` bugzilla-daemon
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2019-09-25 17:56 UTC (permalink / raw)
  To: dri-devel


https://bugs.freedesktop.org/show_bug.cgi?id=99312

GitLab Migration User <gitlab-migration@fdo.invalid> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |MOVED
             Status|NEW                         |RESOLVED

--- Comment #3 from GitLab Migration User <gitlab-migration@fdo.invalid> ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been
closed from further activity.

You can subscribe and participate further through the new bug through this link
to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1246.

