From: "Luís Mendes" <luis.p.mendes-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: "Michel Dänzer" <michel-otUistvHUpPR7s880joybQ@public.gmane.org>
Cc: "Abramov, Slava" <Slava.Abramov-5C7GfCeVMHo@public.gmane.org>,
	"Koenig, Christian" <Christian.Koenig-5C7GfCeVMHo@public.gmane.org>,
	amd-gfx list <amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>
Subject: Re: GPU hang trying to run OpenCL kernels on x86_64
Date: Tue, 26 Jun 2018 10:03:17 +0100
Message-ID: <CAEzXK1oTc9p5hS97qNQHaW3Z82FXsDprSpfG4qVDhwT6VAaOhQ@mail.gmail.com>
In-Reply-To: <CAEzXK1ob33a28y_jwj_8ovr_ZDiowMAiUrnAQxS4vM5PK3NzzQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>


I've tested Ubuntu 18.04 with kernel 4.17.2, libdrm-2.4.92 and mesa-18.1.0,
and the AMD RX 550 4GB still hangs when running the identified OpenCL
kernels.

[  548.704916] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, last signaled seq=30, last emitted seq=33
[  548.704988] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, last signaled seq=29, last emitted seq=31
[  548.704992] [drm] IP block:gmc_v8_0 is hung!
[  548.704994] [drm] IP block:tonga_ih is hung!
[  548.704996] [drm] IP block:gmc_v8_0 is hung!
[  548.704997] [drm] IP block:gfx_v8_0 is hung!
[  548.704998] [drm] IP block:sdma_v3_0 is hung!
[  548.704999] [drm] IP block:tonga_ih is hung!
[  548.705000] [drm] IP block:uvd_v6_0 is hung!
[  548.705001] [drm] IP block:gfx_v8_0 is hung!
[  548.705002] [drm] IP block:sdma_v3_0 is hung!
[  548.705003] [drm] IP block:uvd_v6_0 is hung!
[  548.705004] [drm] IP block:vce_v3_0 is hung!
[  548.705005] [drm] GPU recovery disabled.
[  548.705006] [drm] IP block:vce_v3_0 is hung!
[  548.705007] [drm] GPU recovery disabled.
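
Each ring timeout line above carries two sequence numbers, and the gap between them is the number of jobs emitted but not yet signaled. Below is a minimal sketch of pulling that out of a dmesg capture, along with the set of IP blocks reported as hung. It assumes the message format seen in this log; the function name `summarize` and the sample text are illustrative only and not part of any driver tooling.

```python
import re

# Patterns assume the amdgpu message format seen in this particular log.
TIMEOUT_RE = re.compile(
    r"ring (\S+) timeout, last signaled seq=(\d+), last emitted seq=(\d+)"
)
HUNG_RE = re.compile(r"IP block:(\w+) is hung!")


def summarize(dmesg_text):
    """Return pending job counts per ring and the sorted set of hung IP blocks."""
    # Collapse all whitespace so lines wrapped by a mail client still match.
    flat = " ".join(dmesg_text.split())
    pending = {
        ring: int(emitted) - int(signaled)
        for ring, signaled, emitted in TIMEOUT_RE.findall(flat)
    }
    hung_blocks = sorted(set(HUNG_RE.findall(flat)))
    return {"pending": pending, "hung_blocks": hung_blocks}


# Sample lines copied from the log above, including the mail-client wrapping.
SAMPLE = """
[  548.704916] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0
timeout, last signaled seq=30, last emitted seq=33
[  548.704988] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1
timeout, last signaled seq=29, last emitted seq=31
[  548.704992] [drm] IP block:gmc_v8_0 is hung!
[  548.704994] [drm] IP block:tonga_ih is hung!
[  548.704997] [drm] IP block:gfx_v8_0 is hung!
[  548.704998] [drm] IP block:sdma_v3_0 is hung!
[  548.705000] [drm] IP block:uvd_v6_0 is hung!
[  548.705004] [drm] IP block:vce_v3_0 is hung!
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

For this log the sketch reports 3 pending jobs on sdma0 and 2 on sdma1.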

Is there any news regarding this issue?

Regards,
Luís

On Fri, May 25, 2018 at 11:23 AM, Luís Mendes <luis.p.mendes-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
wrote:

> I've just tested Ubuntu 18.04 with kernel 4.17-rc6 using libdrm-2.4.92 and
> mesa-18.1.0.
> Now both sdma0 and sdma1 time out, as can be seen in the attached logs.
>
> ~agd5f -b drm-next-4.18 doesn't improve things either.
>
> I have also tried amdgpu-pro 18.20 on both Ubuntu 18.04 and 16.04, with no
> improvement.
> I have tried amdgpu-pro 18.10 and 17.50 as well, also with no improvement.
>
> ./amdgpu-pro-install -opencl=legacy,pal --headless
>
> On Thu, May 24, 2018 at 11:18 AM, Luís Mendes <luis.p.mendes@gmail.com>
> wrote:
>
>> Additional update...
>>
>> I was able to boot and enter X by installing an NVIDIA GTX 1050 Ti as the
>> primary display card and using an AMD RX 550 as the secondary card on the
>> Tyan S7025 with the same Ubuntu 18.04 and the same Linux kernel 4.17-rc6.
>> However, once I try to run an OpenCL kernel on the RX 550 I get an sdma1
>> timeout and the GPU hangs, which is likely what is happening when I boot
>> with the RX 550 as the only GPU in the system.
>>
>> This means it is not an issue introduced in 4.17-rc6; I simply hadn't
>> noticed the difference between the system with two GPUs and the system
>> with a single AMD GPU.
>>
>> The dmesg log is attached.
>>
>> Luís
>>
>> On Thu, May 24, 2018 at 10:13 AM, Luís Mendes <luis.p.mendes@gmail.com>
>> wrote:
>>
>>> Hi Michel,
>>>
>>> I also work as a researcher at a university and we are considering
>>> buying AMD cards to do OpenCL computations for numerical modelling, but
>>> currently I am unable to try out the AMD cards I have at home.
>>> I couldn't find any working driver for them. The amdgpu-pro drivers
>>> don't work either, or at least I have been unable to make them work.
>>>
>>> Regards,
>>> Luís
>>>
>>> On Thu, May 24, 2018 at 10:01 AM, Luís Mendes <luis.p.mendes@gmail.com>
>>> wrote:
>>>
>>>> Hi Michel,
>>>>
>>>> So, to summarize: with Linux kernel 4.17-rc6 on Ubuntu 18.04 using an AMD
>>>> RX 460/RX 550 I am not able to enter X.
>>>> The same system with an AMD Radeon R7 240 not only enters X but also runs
>>>> the OpenCL kernel that the RX 460 / RX 550 are unable to run, for all the
>>>> kernels that I have tested.
>>>> Could this also be a Mesa issue, regarding OpenCL on RX 460?
>>>>
>>>> Regards,
>>>> Luís
>>>>
>>>> On Thu, May 24, 2018 at 9:55 AM, Luís Mendes <luis.p.mendes@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Michel,
>>>>>
>>>>> I will have to check previous rc releases of 4.17 to see whether it was
>>>>> already happening, before trying any possible git bisect.
>>>>> As an update, I can say that an AMD Radeon R7 240 works fine on the
>>>>> same system with the same kernel and I am able to run the OpenCL kernels
>>>>> that I couldn't run with the RX 460/RX 550.
>>>>>
>>>>> Regards,
>>>>> Luís
>>>>>
>>>>> On Thu, May 24, 2018 at 9:30 AM, Michel Dänzer <michel@daenzer.net>
>>>>> wrote:
>>>>>
>>>>>> On 2018-05-24 12:06 AM, Luís Mendes wrote:
>>>>>> > I've tried Linux 4.17-rc6 with Ubuntu 18.04 on Tyan S7002 and I am not even
>>>>>> > able to see lightdm/gdm3, as the system hangs when starting X.
>>>>>> > Having SR-IOV enabled or disabled makes no difference.
>>>>>> > Tested with AMD RX 460.
>>>>>> > When X is supposed to start the system hangs and only a rectangular region
>>>>>> > in the top left corner of the screen remains with console text messages from
>>>>>> > the boot process, while the rest of the screen is just black. I am unable
>>>>>> > to do anything with the keyboard: switching to console does not work,
>>>>>> > and ctrl-alt-del also doesn't work. I have to do a cold reset.
>>>>>>
>>>>>> Can you isolate which change introduced this new issue with git
>>>>>> bisect?
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Earthling Michel Dänzer               |               http://www.amd.com
>>>>>> Libre software enthusiast             |             Mesa and X developer
>>>>>>
