* Re: Kernel and ADM hardware roulette ( was AMD graphics performance regression in 4.15 and later )
@ 2018-06-06 13:33 Gabriel C
From: Gabriel C @ 2018-06-06 13:33 UTC (permalink / raw)
  To: Christian König
  Cc: Jean-Marc Valin, Dave Airlie, alexander.deucher, Felix Kuehling,
	Laura Abbott, Andrew Morton, michel.daenzer, dri-devel, LKML,
	Linus Torvalds

2018-06-06 14:19 GMT+02:00 Christian König <christian.koenig@amd.com>:
> On 06.06.2018 at 14:08, Gabriel C wrote:
>>
>> 2018-06-06 13:33 GMT+02:00 Christian König <christian.koenig@amd.com>:
>>>
>>> On 06.06.2018 at 13:28, Gabriel C wrote:
>>>>
>>>> 2018-04-11 7:02 GMT+02:00 Gabriel C <nix.or.die@gmail.com>:
>>>>>>
>>>>>> 2018-04-11 6:00 GMT+02:00 Gabriel C <nix.or.die@gmail.com>:
>>>>>> 2018-04-09 11:42 GMT+02:00 Christian König
>>>>>> <ckoenig.leichtzumerken@gmail.com>:
>>>>>>>
>>>>>>> On 07.04.2018 at 00:00, Jean-Marc Valin wrote:
>>>>>
>>>>> ...
>>>>>>
>>>>>> I can help test code for 4.17/++ if you wish, but that is a
>>>>>> *different* story.
>>>>>>
>>>>> Quickly tested 4.16.0-11490-gb284d4d5a678; the amdgpu and radeon
>>>>> drivers are both broken in this one.
>>>>>
>>>>> radeon tells:
>>>>>
>>>>> ...
>>>>>
>>>>> [    6.337838] [drm] PCIE GART of 2048M enabled (table at 0x00000000001D6000).
>>>>> [    6.338210] radeon 0000:21:00.0: (-12) create WB bo failed
>>>>> [    6.338214] radeon 0000:21:00.0: disabling GPU acceleration
>>>>>
>>>>> ...
>>>>>
>>>> I have the same issue now on the final 4.17.
>>>
>>>
>>> Actually Michel came up with a fix for the performance regression which is
>>> now backported to older kernels as well.
>>>
>>> So the original issue of this mail thread should be fixed by now.
>>
>> OK, will test as soon as I get the GPU to work :))
>>
>>>> I also played with the BIOS options, which does not fix anything but
>>>> changes the error message.
>>>>
>>>> With IOMMU && SR-IOV disabled, the error changes to this:
>>>>
>>>> [    7.092044] [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x850C)=0xCAFEDEAD)
>>>> [    7.092059] radeon 0000:21:00.0: disabling GPU acceleration
>>>>
>>>>
>>>> While I could work around the SWIOTLB bugs in 4.15 and 4.16, 4.17 seems to
>>>> kill the GPU with no way for me to make it work (at least I have not found
>>>> any workaround so far).
>>>
>>>
>>> That actually sounds like something completely different. Can you provide a
>>> full dmesg of radeon and/or amdgpu?
>>
>> Sure, here from boots with IOMMU/SR-IOV on and off in the BIOS:
>>
>>
>> http://ftp.frugalware.org/pub/other/people/crazy/radeon/dmesg-iommu-sr-iov-off.txt
>>
>> http://ftp.frugalware.org/pub/other/people/crazy/radeon/dmesg-iommu-sr-iov-on.txt
>>
>> Also, nothing else changed in that setup; I am just testing kernel 4.17.
>
>
> That has nothing to do with the driver or the original bug you reported. The
> problem is that SME is active, and that is currently not supported at all
> with that hardware.

OK, so are we now playing kernel and AMD hardware roulette on each release?

SME was set up like this with kernel 4.16.x here and everything worked.

Also, if you no longer support SME at all on that hardware, while it worked
before, please add proper error handling and proper dmesg messages
letting the user know.

radeon: xxxx: SME not supported on that hardware anymore, please disable SME...
radeon: xxxx: Update your GPU <or whatever>

How hard would that be?

No one but developers can guess from these error messages why their
hardware suddenly isn't working anymore after just updating the kernel.
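
To illustrate the kind of message asked for above, here is a minimal
user-space sketch in Python. It is not kernel code and not how the radeon
driver works. It only maps the two failure lines quoted in this mail to an
actionable hint. The function name `radeon_sme_hint` and the hint wording are
hypothetical, and the caller has to say whether SME is active, since this
sketch makes no attempt to detect that.

```python
import re

# Failure signatures copied from the dmesg excerpts quoted in this mail.
FAILURE_PATTERNS = (
    re.compile(r"create WB bo failed"),
    re.compile(r"ring \d+ test failed"),
)


def radeon_sme_hint(dmesg_text, sme_active):
    """Return an actionable hint string, or None if no hint applies.

    sme_active is supplied by the caller (assumption: the caller already
    knows the SME state); this sketch does not detect SME on its own.
    """
    if not sme_active:
        return None
    for line in dmesg_text.splitlines():
        # Only look at lines that mention the radeon driver.
        if "radeon" not in line:
            continue
        if any(p.search(line) for p in FAILURE_PATTERNS):
            return ("radeon: GPU acceleration disabled while SME is active; "
                    "SME is not supported with this hardware, please disable SME")
    return None
```

Feeding it the saved dmesg text with `sme_active=True` would return the hint,
and a log without either failure line returns `None`.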


>
> Try to disable SME either in the BIOS or on the kernel command line.

Yes, that works, but that is not the point.
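
For anyone landing here with the same failure: the kernel command-line
option for this is `mem_encrypt=off`, as documented in the kernel parameters
list. Below is a minimal Python sketch for checking what a given command line
already sets. The helper names `mem_encrypt_values` and `read_cmdline` are
hypothetical, and the sketch only reports what is on the command line. It says
nothing about the kernel's built-in default when the option is absent.

```python
def mem_encrypt_values(cmdline):
    """Return every value given via mem_encrypt= on a kernel command line.

    An empty list means the option is absent, so the kernel's built-in
    default applies (this sketch does not try to determine that default).
    """
    values = []
    for token in cmdline.split():
        if token.startswith("mem_encrypt="):
            values.append(token.split("=", 1)[1])
    return values


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()
```

On a running system, `mem_encrypt_values(read_cmdline())` shows whether the
option was passed at boot.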

Really, you just can't break users' setups like this.
