linux-kernel.vger.kernel.org archive mirror
* [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64
@ 2020-08-08  7:25 Tiezhu Yang
  2020-08-08 13:41 ` Thomas Bogendoerfer
  0 siblings, 1 reply; 7+ messages in thread
From: Tiezhu Yang @ 2020-08-08  7:25 UTC (permalink / raw)
  To: Alex Deucher, christian.koenig
  Cc: Huacai Chen, Jiaxun Yang, linux-mips, amd-gfx, linux-kernel

Loongson processors have a writecombine issue: writes to a framebuffer
used with an ATI Radeon or AMD GPU may sometimes fail to be written
back. Since commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
mapping for MIPS"), this has caused errors such as a blurred screen
and GPU lockups.

Remove the TTM_PL_FLAG_WC flag from VRAM placements on Loongson64 so
that it works well with ATI Radeon or AMD GPUs. This has no influence
on other platforms.

[   60.958721] radeon 0000:03:00.0: ring 0 stalled for more than 10079msec
[   60.965315] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000112 last fence id 0x000000000000011d on ring 0)
[   60.976525] radeon 0000:03:00.0: ring 3 stalled for more than 10086msec
[   60.983156] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000374 last fence id 0x00000000000003a8 on ring 3)

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  7 +++++--
 drivers/gpu/drm/radeon/radeon_object.c     | 20 ++++++++++++++------
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 5ac7b55..9f785f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -136,8 +136,11 @@ void amdgpu_bo_placement_from_domain(struct amdgpu_bo *abo, u32 domain)
 
 		places[c].fpfn = 0;
 		places[c].lpfn = 0;
-		places[c].flags = TTM_PL_FLAG_WC | TTM_PL_FLAG_UNCACHED |
-			TTM_PL_FLAG_VRAM;
+		if (IS_ENABLED(CONFIG_MACH_LOONGSON64))
+			places[c].flags = TTM_PL_FLAG_UNCACHED | TTM_PL_FLAG_VRAM;
+		else
+			places[c].flags = TTM_PL_FLAG_WC | TTM_PL_FLAG_UNCACHED |
+					  TTM_PL_FLAG_VRAM;
 
 		if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
 			places[c].lpfn = visible_pfn;
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index f3dee01..c6cede6 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -112,15 +112,23 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
 		    rbo->rdev->mc.visible_vram_size < rbo->rdev->mc.real_vram_size) {
 			rbo->placements[c].fpfn =
 				rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
-			rbo->placements[c++].flags = TTM_PL_FLAG_WC |
-						     TTM_PL_FLAG_UNCACHED |
-						     TTM_PL_FLAG_VRAM;
+			if (IS_ENABLED(CONFIG_MACH_LOONGSON64))
+				rbo->placements[c++].flags = TTM_PL_FLAG_UNCACHED |
+							     TTM_PL_FLAG_VRAM;
+			else
+				rbo->placements[c++].flags = TTM_PL_FLAG_WC |
+							     TTM_PL_FLAG_UNCACHED |
+							     TTM_PL_FLAG_VRAM;
 		}
 
 		rbo->placements[c].fpfn = 0;
-		rbo->placements[c++].flags = TTM_PL_FLAG_WC |
-					     TTM_PL_FLAG_UNCACHED |
-					     TTM_PL_FLAG_VRAM;
+		if (IS_ENABLED(CONFIG_MACH_LOONGSON64))
+			rbo->placements[c++].flags = TTM_PL_FLAG_UNCACHED |
+						     TTM_PL_FLAG_VRAM;
+		else
+			rbo->placements[c++].flags = TTM_PL_FLAG_WC |
+						     TTM_PL_FLAG_UNCACHED |
+						     TTM_PL_FLAG_VRAM;
 	}
 
 	if (domain & RADEON_GEM_DOMAIN_GTT) {
-- 
2.1.0



* Re: [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64
  2020-08-08  7:25 [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64 Tiezhu Yang
@ 2020-08-08 13:41 ` Thomas Bogendoerfer
  2020-08-08 13:50   ` Jiaxun Yang
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Bogendoerfer @ 2020-08-08 13:41 UTC (permalink / raw)
  To: Tiezhu Yang
  Cc: Alex Deucher, christian.koenig, Huacai Chen, Jiaxun Yang,
	linux-mips, amd-gfx, linux-kernel

On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:
> Loongson processors have a writecombine issue that maybe failed to
> write back framebuffer used with ATI Radeon or AMD GPU at times,
> after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
> mapping for MIPS"), there exists some errors such as blurred screen
> and lockup, and so on.
> 
> Remove the flag TTM_PL_FLAG_WC of VRAM to fix writecombine issue for
> Loongson64 to work well with ATI Radeon or AMD GPU, and it has no any
> influence on the other platforms.

well it's not my call to take or reject this patch, but I already
indicated it might be better to disable writecombine on the CPU
detection side (or do you have other devices where writecombining
works?). Something like below will disable it for all loongson64 CPUs.
If you now find out where it works and where it doesn't, you can even
reduce it to the required minimum of affected CPUs.

Thomas.


diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
index def1659fe262..cdd87009e931 100644
--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -2043,7 +2043,6 @@ static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu)
 			set_isa(c, MIPS_CPU_ISA_M64R2);
 			break;
 		}
-		c->writecombine = _CACHE_UNCACHED_ACCELERATED;
 		c->ases |= (MIPS_ASE_LOONGSON_MMI | MIPS_ASE_LOONGSON_EXT |
 				MIPS_ASE_LOONGSON_EXT2);
 		break;
@@ -2073,7 +2072,6 @@ static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu)
 		 * register, we correct it here.
 		 */
 		c->options |= MIPS_CPU_FTLB | MIPS_CPU_TLBINV | MIPS_CPU_LDPTE;
-		c->writecombine = _CACHE_UNCACHED_ACCELERATED;
 		c->ases |= (MIPS_ASE_LOONGSON_MMI | MIPS_ASE_LOONGSON_CAM |
 			MIPS_ASE_LOONGSON_EXT | MIPS_ASE_LOONGSON_EXT2);
 		c->ases &= ~MIPS_ASE_VZ; /* VZ of Loongson-3A2000/3000 is incomplete */
@@ -2084,7 +2082,6 @@ static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu)
 		set_elf_platform(cpu, "loongson3a");
 		set_isa(c, MIPS_CPU_ISA_M64R2);
 		decode_cpucfg(c);
-		c->writecombine = _CACHE_UNCACHED_ACCELERATED;
 		break;
 	default:
 		panic("Unknown Loongson Processor ID!");

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.                                                [ RFC1925, 2.3 ]


* Re: [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64
  2020-08-08 13:41 ` Thomas Bogendoerfer
@ 2020-08-08 13:50   ` Jiaxun Yang
  2020-08-09 12:13     ` Christian König
  0 siblings, 1 reply; 7+ messages in thread
From: Jiaxun Yang @ 2020-08-08 13:50 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Tiezhu Yang
  Cc: Alex Deucher, christian.koenig, Huacai Chen, linux-mips, amd-gfx,
	linux-kernel



On 2020/8/8 at 9:41 PM, Thomas Bogendoerfer wrote:
> On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:
>> Loongson processors have a writecombine issue that maybe failed to
>> write back framebuffer used with ATI Radeon or AMD GPU at times,
>> after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>> mapping for MIPS"), there exists some errors such as blurred screen
>> and lockup, and so on.
>>
>> Remove the flag TTM_PL_FLAG_WC of VRAM to fix writecombine issue for
>> Loongson64 to work well with ATI Radeon or AMD GPU, and it has no any
>> influence on the other platforms.
> well it's not my call to take or reject this patch, but I already
> indicated it might be better to disable writecombine on the CPU
> detection side (or do you have other devices where writecombining
> works?). Something like below will disable it for all loongson64 CPUs.
> If you now find out where it works and where it doesn't, you can even
> reduce it to the required minimum of affected CPUs.
Hi Tiezhu, Thomas,

Yes, writecombine works well on LS7A's internal GPU....
And even works well with some AMD GPUs (in my case, RX550).

Tiezhu, is it possible to investigate the issue deeper in Loongson?
Probably we just need to add some barrier to maintain the data coherency,
or disable writecombine for AMD GPU's command buffer and leave texture/frame
buffer wc accelerated.
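
The second option above could look roughly like the following. This is only a
userspace sketch under assumptions: the DEMO_* flag values, the demo_bo_kind
enum and demo_vram_flags() are hypothetical names made up for illustration and
are not the TTM, radeon or amdgpu definitions. It only shows the idea of
dropping WC for command buffers while texture and frame buffers keep it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for placement flags; not the kernel's TTM values. */
#define DEMO_PL_FLAG_VRAM     (1u << 2)
#define DEMO_PL_FLAG_UNCACHED (1u << 17)
#define DEMO_PL_FLAG_WC       (1u << 18)

/* Hypothetical classification of what a buffer object is used for. */
enum demo_bo_kind {
	DEMO_BO_COMMAND,
	DEMO_BO_TEXTURE,
	DEMO_BO_FRAMEBUFFER,
};

/*
 * Pick the caching flags for a VRAM placement. WC is dropped only for
 * command buffers, and only when the platform is known to be unreliable
 * with write-combined mappings; everything else keeps WC.
 */
static uint32_t demo_vram_flags(enum demo_bo_kind kind, bool wc_unreliable)
{
	uint32_t flags = DEMO_PL_FLAG_UNCACHED | DEMO_PL_FLAG_VRAM;

	if (!(wc_unreliable && kind == DEMO_BO_COMMAND))
		flags |= DEMO_PL_FLAG_WC;

	return flags;
}
```

Whether command buffers are really the affected case is exactly what would
need to be confirmed first; the sketch assumes it.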

Thanks.

- Jiaxun


* Re: [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64
  2020-08-08 13:50   ` Jiaxun Yang
@ 2020-08-09 12:13     ` Christian König
  2020-08-10  0:58       ` Tiezhu Yang
  2020-08-10 10:50       ` Michel Dänzer
  0 siblings, 2 replies; 7+ messages in thread
From: Christian König @ 2020-08-09 12:13 UTC (permalink / raw)
  To: Jiaxun Yang, Thomas Bogendoerfer, Tiezhu Yang
  Cc: Alex Deucher, Huacai Chen, linux-mips, amd-gfx, linux-kernel

On 08.08.20 at 15:50, Jiaxun Yang wrote:
>
>
> On 2020/8/8 at 9:41 PM, Thomas Bogendoerfer wrote:
>> On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:
>>> Loongson processors have a writecombine issue that maybe failed to
>>> write back framebuffer used with ATI Radeon or AMD GPU at times,
>>> after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>>> mapping for MIPS"), there exists some errors such as blurred screen
>>> and lockup, and so on.
>>>
>>> Remove the flag TTM_PL_FLAG_WC of VRAM to fix writecombine issue for
>>> Loongson64 to work well with ATI Radeon or AMD GPU, and it has no any
>>> influence on the other platforms.
>> well it's not my call to take or reject this patch, but I already
>> indicated it might be better to disable writecombine on the CPU
>> detection side (or do you have other devices where writecombining
>> works?). Something like below will disable it for all loongson64 CPUs.
>> If you now find out where it works and where it doesn't, you can even
>> reduce it to the required minimum of affected CPUs.
> Hi Tiezhu, Thomas,
>
> Yes, writecombine works well on LS7A's internal GPU....
> And even works well with some AMD GPUs (in my case, RX550).

In this case the patch is a clear NAK since you haven't root caused the 
issue and are just working around it in a very questionable manner.

>
> Tiezhu, is it possible to investigate the issue deeper in Loongson?
> Probably we just need to add some barrier to maintain the data coherency,
> or disable writecombine for AMD GPU's command buffer and leave 
> texture/frame
> buffer wc accelerated.

Have you moved any buffer to VRAM and forgotten to add an HDP flush/invalidate?

The acceleration is not much of a problem, but if WC doesn't work in 
general you need to disable it for the whole CPU and not for individual 
drivers.

Regards,
Christian.

>
> Thanks.
>
> - Jiaxun



* Re: [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64
  2020-08-09 12:13     ` Christian König
@ 2020-08-10  0:58       ` Tiezhu Yang
  2020-08-10 10:50       ` Michel Dänzer
  1 sibling, 0 replies; 7+ messages in thread
From: Tiezhu Yang @ 2020-08-10  0:58 UTC (permalink / raw)
  To: Christian König, Jiaxun Yang, Thomas Bogendoerfer
  Cc: Alex Deucher, Huacai Chen, linux-mips, amd-gfx, linux-kernel

On 08/09/2020 08:13 PM, Christian König wrote:
> On 08.08.20 at 15:50, Jiaxun Yang wrote:
>>
>>
>> On 2020/8/8 at 9:41 PM, Thomas Bogendoerfer wrote:
>>> On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:
>>>> Loongson processors have a writecombine issue that maybe failed to
>>>> write back framebuffer used with ATI Radeon or AMD GPU at times,
>>>> after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>>>> mapping for MIPS"), there exists some errors such as blurred screen
>>>> and lockup, and so on.
>>>>
>>>> Remove the flag TTM_PL_FLAG_WC of VRAM to fix writecombine issue for
>>>> Loongson64 to work well with ATI Radeon or AMD GPU, and it has no any
>>>> influence on the other platforms.
>>> well it's not my call to take or reject this patch, but I already
>>> indicated it might be better to disable writecombine on the CPU
>>> detection side (or do you have other devices where writecombining
>>> works?). Something like below will disable it for all loongson64 CPUs.
>>> If you now find out where it works and where it doesn't, you can even
>>> reduce it to the required minimum of affected CPUs.
>> Hi Tiezhu, Thomas,
>>
>> Yes, writecombine works well on LS7A's internal GPU....
>> And even works well with some AMD GPUs (in my case, RX550).
>
> In this case the patch is a clear NAK since you haven't root caused 
> the issue and are just working around it in a very questionable manner.
>
>>
>> Tiezhu, is it possible to investigate the issue deeper in Loongson?
>> Probably we just need to add some barrier to maintain the data 
>> coherency,
>> or disable writecombine for AMD GPU's command buffer and leave 
>> texture/frame
>> buffer wc accelerated.
>
> Have you moved any buffer to VRAM and forgot to add an HDP 
> flush/invalidate?
>
> The acceleration is not much of a problem, but if WC doesn't work in 
> general you need to disable it for the whole CPU and not for 
> individual drivers.

Hi Thomas, Jiaxun and Christian,

Thank you very much for your suggestions.

Actually, this patch is only a temporary solution to make it work;
it is not a proper, final solution.

I understand your points. It will take some time to find the root cause.

Thanks,
Tiezhu

>
> Regards,
> Christian.
>
>>
>> Thanks.
>>
>> - Jiaxun



* Re: [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64
  2020-08-09 12:13     ` Christian König
  2020-08-10  0:58       ` Tiezhu Yang
@ 2020-08-10 10:50       ` Michel Dänzer
  2020-08-10 11:22         ` Christian König
  1 sibling, 1 reply; 7+ messages in thread
From: Michel Dänzer @ 2020-08-10 10:50 UTC (permalink / raw)
  To: Christian König, Jiaxun Yang, Thomas Bogendoerfer, Tiezhu Yang
  Cc: Alex Deucher, Huacai Chen, linux-mips, amd-gfx, linux-kernel

On 2020-08-09 2:13 p.m., Christian König wrote:
> On 08.08.20 at 15:50, Jiaxun Yang wrote:
>> On 2020/8/8 at 9:41 PM, Thomas Bogendoerfer wrote:
>>> On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:
>>>> Loongson processors have a writecombine issue that maybe failed to
>>>> write back framebuffer used with ATI Radeon or AMD GPU at times,
>>>> after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>>>> mapping for MIPS"), there exists some errors such as blurred screen
>>>> and lockup, and so on.
>>>>
>>>> Remove the flag TTM_PL_FLAG_WC of VRAM to fix writecombine issue for
>>>> Loongson64 to work well with ATI Radeon or AMD GPU, and it has no any
>>>> influence on the other platforms.
>>> well it's not my call to take or reject this patch, but I already
>>> indicated it might be better to disable writecombine on the CPU
>>> detection side (or do you have other devices where writecombining
>>> works?). Something like below will disable it for all loongson64 CPUs.
>>> If you now find out where it works and where it doesn't, you can even
>>> reduce it to the required minimum of affected CPUs.
>> Hi Tiezhu, Thomas,
>>
>> Yes, writecombine works well on LS7A's internal GPU....
>> And even works well with some AMD GPUs (in my case, RX550).
> 
> In this case the patch is a clear NAK since you haven't root caused the
> issue and are just working around it in a very questionable manner.

To be fair though, amdgpu & radeon are already disabling write-combining
for system memory pages in 32-bit x86 kernels for similar reasons.


-- 
Earthling Michel Dänzer               |               https://redhat.com
Libre software enthusiast             |             Mesa and X developer


* Re: [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64
  2020-08-10 10:50       ` Michel Dänzer
@ 2020-08-10 11:22         ` Christian König
  0 siblings, 0 replies; 7+ messages in thread
From: Christian König @ 2020-08-10 11:22 UTC (permalink / raw)
  To: Michel Dänzer, Jiaxun Yang, Thomas Bogendoerfer, Tiezhu Yang
  Cc: Alex Deucher, Huacai Chen, linux-mips, amd-gfx, linux-kernel

On 10.08.20 at 12:50, Michel Dänzer wrote:
> On 2020-08-09 2:13 p.m., Christian König wrote:
>> On 08.08.20 at 15:50, Jiaxun Yang wrote:
>>> On 2020/8/8 at 9:41 PM, Thomas Bogendoerfer wrote:
>>>> On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:
>>>>> Loongson processors have a writecombine issue that maybe failed to
>>>>> write back framebuffer used with ATI Radeon or AMD GPU at times,
>>>>> after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>>>>> mapping for MIPS"), there exists some errors such as blurred screen
>>>>> and lockup, and so on.
>>>>>
>>>>> Remove the flag TTM_PL_FLAG_WC of VRAM to fix writecombine issue for
>>>>> Loongson64 to work well with ATI Radeon or AMD GPU, and it has no any
>>>>> influence on the other platforms.
>>>> well it's not my call to take or reject this patch, but I already
>>>> indicated it might be better to disable writecombine on the CPU
>>>> detection side (or do you have other devices where writecombining
>>>> works?). Something like below will disable it for all loongson64 CPUs.
>>>> If you now find out where it works and where it doesn't, you can even
>>>> reduce it to the required minimum of affected CPUs.
>>> Hi Tiezhu, Thomas,
>>>
>>> Yes, writecombine works well on LS7A's internal GPU....
>>> And even works well with some AMD GPUs (in my case, RX550).
>> In this case the patch is a clear NAK since you haven't root caused the
>> issue and are just working around it in a very questionable manner.
> To be fair though, amdgpu & radeon are already disabling write-combining
> for system memory pages in 32-bit x86 kernels for similar reasons.

Yeah, well that is USWC for system memory. But this is about WC for the 
VRAM BAR.

When we don't understand or don't correctly implement something on the 
platform for USWC then this is annoying, but not a serious issue.

But when the hardware doesn't correctly implement WC for PCIe BARs, then 
this is a violation of the PCIe spec and a bit more serious issue for 
the whole platform.

We can work around that by disabling WC for PCIe BARs on the whole 
platform, or behind specific bridges, or in some other way, but patching 
each individual driver so that they work is not really the right approach.
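
To illustrate the difference, here is a minimal userspace sketch of a single
platform-level decision point that every driver would consult instead of
carrying its own check. All names here (demo_platform, demo_bridge,
demo_bar_map_mode) are hypothetical and invented for this example; they are
not existing kernel interfaces.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical platform description: is WC on PCIe BARs broken everywhere? */
struct demo_platform {
	bool bar_wc_broken;
};

/* Hypothetical bridge description: is WC broken only behind this bridge? */
struct demo_bridge {
	const char *name;
	bool wc_broken;
};

enum demo_map_mode {
	DEMO_MAP_UNCACHED,
	DEMO_MAP_WC,
};

/*
 * Central policy: decide once how a BAR may be mapped. Drivers ask for WC
 * and get uncached mappings when the platform or the upstream bridge is
 * flagged as broken. Pass NULL for upstream when there is no bridge quirk
 * to consider.
 */
static enum demo_map_mode
demo_bar_map_mode(const struct demo_platform *plat,
		  const struct demo_bridge *upstream)
{
	if (plat->bar_wc_broken)
		return DEMO_MAP_UNCACHED;
	if (upstream != NULL && upstream->wc_broken)
		return DEMO_MAP_UNCACHED;
	return DEMO_MAP_WC;
}
```

With something of this shape, radeon and amdgpu would keep requesting WC for
VRAM unconditionally, and the quirk would live in one place.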

Cheers,
Christian.
