* [PATCH] drm/amdgpu: Fix a bug that vm size is wrong on Raven
@ 2017-12-14  2:25 Yong Zhao
       [not found] ` <1513218350-29030-1-git-send-email-yong.zhao-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Yong Zhao @ 2017-12-14  2:25 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Yong Zhao

Change-Id: Id522c1cbadb8c069720f4e64a31cff42cd014733
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 709587d..3b9eb1a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2534,7 +2534,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t vm_size,
 	uint64_t tmp;
 
 	/* adjust vm size first */
-	if (amdgpu_vm_size != -1) {
+	if (amdgpu_vm_size != -1 && max_level == 1) {
 		unsigned max_size = 1 << (max_bits - 30);
 
 		vm_size = amdgpu_vm_size;
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH] drm/amdgpu: Fix a bug that vm size is wrong on Raven
       [not found] ` <1513218350-29030-1-git-send-email-yong.zhao-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-14  8:47   ` Christian König
       [not found]     ` <24c4613c-ed30-2ac3-cca2-4aec5dd707d5-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Christian König @ 2017-12-14  8:47 UTC (permalink / raw)
  To: Yong Zhao, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

NAK, that really circumvents the intention of the patch to adjust the 
number of levels based on the vm_size.

Christian.
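
The dependency between vm_size and the number of page-table levels can be illustrated with a small sketch. This is not the driver's code: levels_for_vm_size() is a hypothetical helper, and it assumes 4 KiB pages and 512-entry (9-bit) tables at every level, which may not match the hardware configuration discussed here.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from amdgpu_vm.c): number of page-table
 * levels needed to map vm_size_gb gigabytes.
 * Assumptions: 4 KiB pages, 9 address bits resolved per level.
 */
static unsigned levels_for_vm_size(uint64_t vm_size_gb)
{
	/* 2^30 bytes per GB / 2^12 bytes per page = 2^18 pages per GB */
	uint64_t pages = vm_size_gb << 18;
	unsigned pfn_bits = 0;

	/* smallest pfn_bits such that 2^pfn_bits >= pages */
	while (pfn_bits < 63 && ((uint64_t)1 << pfn_bits) < pages)
		pfn_bits++;

	/* each level resolves 9 bits; round up to whole levels */
	return (pfn_bits + 8) / 9;
}
```

Under these assumptions, levels_for_vm_size(64) yields 3 and levels_for_vm_size(262144) (a 48-bit space) yields 4, so shrinking the VM size lets the driver use fewer levels.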

On 14.12.2017 at 03:25, Yong Zhao wrote:
> Change-Id: Id522c1cbadb8c069720f4e64a31cff42cd014733
> Signed-off-by: Yong Zhao <yong.zhao@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 709587d..3b9eb1a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2534,7 +2534,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t vm_size,
>   	uint64_t tmp;
>   
>   	/* adjust vm size first */
> -	if (amdgpu_vm_size != -1) {
> +	if (amdgpu_vm_size != -1 && max_level == 1) {
>   		unsigned max_size = 1 << (max_bits - 30);
>   
>   		vm_size = amdgpu_vm_size;


* Re: [PATCH] drm/amdgpu: Fix a bug that vm size is wrong on Raven
       [not found]     ` <24c4613c-ed30-2ac3-cca2-4aec5dd707d5-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-12-14 15:56       ` Yong Zhao
       [not found]         ` <9fc57983-8af9-789d-6e60-e3716f846d6c-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Yong Zhao @ 2017-12-14 15:56 UTC (permalink / raw)
  To: christian.koenig-5C7GfCeVMHo, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hi Christian,

I don't know much about the background, but according to my experiments,
as soon as we change the VM size to 64G, ATC memory access on Raven
falls apart. How should we deal with that, or can you come up with a fix?

Regards,

Yong


On 2017-12-14 03:47 AM, Christian König wrote:
> NAK, that really circumvents the intention of the patch to adjust the 
> number of levels based on the vm_size.
>
> Christian.
>
> On 14.12.2017 at 03:25, Yong Zhao wrote:
>> Change-Id: Id522c1cbadb8c069720f4e64a31cff42cd014733
>> Signed-off-by: Yong Zhao <yong.zhao@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 709587d..3b9eb1a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -2534,7 +2534,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device 
>> *adev, uint32_t vm_size,
>>       uint64_t tmp;
>>         /* adjust vm size first */
>> -    if (amdgpu_vm_size != -1) {
>> +    if (amdgpu_vm_size != -1 && max_level == 1) {
>>           unsigned max_size = 1 << (max_bits - 30);
>>             vm_size = amdgpu_vm_size;
>


* Re: [PATCH] drm/amdgpu: Fix a bug that vm size is wrong on Raven
       [not found]         ` <9fc57983-8af9-789d-6e60-e3716f846d6c-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-14 17:23           ` Christian König
  0 siblings, 0 replies; 8+ messages in thread
From: Christian König @ 2017-12-14 17:23 UTC (permalink / raw)
  To: Yong Zhao, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hi Yong,

> How should we deal with that, or can you come up with a fix?
Well, that is the expected effect, so I don't see much we can do here.

Support for the parameter was added so that we can intentionally break
HMM/SVM support in order to test the fallback paths.

I didn't think about ATC while enabling this, but it certainly falls
into the same category. It could be that we can still keep ATC working
while reducing the GPUVM size, but that would require further testing.

Regards,
Christian.
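
A minimal sketch of the override behaviour described above, for illustration only. effective_vm_size_gb() and its param argument are hypothetical stand-ins for the driver function and the amdgpu_vm_size module parameter; the clamp to max_gb is an assumption about how the max_size value from the patch context is used, and the 262144 GB default in the usage note is simply the size of a 48-bit space.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the size selection in the patch context.
 * param:      user override in GB, -1 means "not set" (assumed semantics)
 * default_gb: size the ASIC code would pick on its own
 * max_bits:   address bits supported, must be in the range 30..61
 */
static uint32_t effective_vm_size_gb(int param, uint32_t default_gb,
				     unsigned max_bits)
{
	/* same arithmetic as "1 << (max_bits - 30)" in the diff */
	uint32_t max_gb = 1u << (max_bits - 30);
	uint32_t vm_size = default_gb;

	if (param != -1) {
		/* the user override wins over the ASIC default ... */
		vm_size = (uint32_t)param;
		/* ... but is clamped to what the hardware can address (assumed) */
		if (vm_size > max_gb)
			vm_size = max_gb;
	}
	return vm_size;
}
```

With max_bits = 48, effective_vm_size_gb(-1, 262144, 48) keeps the 262144 GB default, while effective_vm_size_gb(64, 262144, 48) returns 64, which is the reduced size that breaks ATC in Yong's experiment.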

On 14.12.2017 at 16:56, Yong Zhao wrote:
> Hi Christian,
>
> I don't know much about the background, but according to my
> experiments, as soon as we change the VM size to 64G, ATC memory
> access on Raven falls apart. How should we deal with that, or can you
> come up with a fix?
>
> Regards,
>
> Yong
>
>
> On 2017-12-14 03:47 AM, Christian König wrote:
>> NAK, that really circumvents the intention of the patch to adjust the 
>> number of levels based on the vm_size.
>>
>> Christian.
>>
>> On 14.12.2017 at 03:25, Yong Zhao wrote:
>>> Change-Id: Id522c1cbadb8c069720f4e64a31cff42cd014733
>>> Signed-off-by: Yong Zhao <yong.zhao@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 709587d..3b9eb1a 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -2534,7 +2534,7 @@ void amdgpu_vm_adjust_size(struct 
>>> amdgpu_device *adev, uint32_t vm_size,
>>>       uint64_t tmp;
>>>         /* adjust vm size first */
>>> -    if (amdgpu_vm_size != -1) {
>>> +    if (amdgpu_vm_size != -1 && max_level == 1) {
>>>           unsigned max_size = 1 << (max_bits - 30);
>>>             vm_size = amdgpu_vm_size;
>>
>


* Re: [PATCH] drm/amdgpu: Fix a bug that vm size is wrong on Raven
       [not found]         ` <e3d20771-df70-b9e5-345b-9914c54db292-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-12-18 17:00           ` Zhao, Yong
  0 siblings, 0 replies; 8+ messages in thread
From: Zhao, Yong @ 2017-12-18 17:00 UTC (permalink / raw)
  To: Felix Kühling, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Koenig, Christian



Felix,


We ended up with this fix: http://git.amd.com:8080/#/c/122202/

I explored the option of overriding the VM size only for KFD VMs, but that was not easy, because amdgpu_check_vm_size() and amdgpu_vm_adjust_size() run while amdgpu is loading, i.e. before any KFD VMs are created. So, to fix the problem quickly, we ended up not overriding the VM size for any Raven VM.


Regards,

Yong

________________________________
From: Christian König <ckoenig.leichtzumerken-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Sent: Monday, December 18, 2017 4:22:43 AM
To: Felix Kühling; Zhao, Yong; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Re: [PATCH] drm/amdgpu: Fix a bug that vm size is wrong on Raven

The problem was a merge conflict.

No idea what exactly went wrong, but Yong ended up with a branch where the vm_size was always overwritten with the value 64.

So we can drop this patch completely. And yes, a user who overrides the vm_size value should know what the consequences are.

Regards,
Christian.

On 17.12.2017 at 00:21, Felix Kühling wrote:

Is there a way to override the VM size for KFD VMs only? Only they
depend on ATS, so only they need to be forced to 48 bits.

On the other hand, it could be argued that a user who manually sets the
VM size with a module parameter knows what they're doing. So let them
break ATS.

Regards,
  Felix


On 14.12.2017 at 17:45, Yong Zhao wrote:


Change-Id: Id522c1cbadb8c069720f4e64a31cff42cd014733
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 709587d..93500e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2534,7 +2534,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t vm_size,
        uint64_t tmp;

        /* adjust vm size first */
-       if (amdgpu_vm_size != -1) {
+       if (amdgpu_vm_size != -1 && adev->asic_type != CHIP_RAVEN) {
                unsigned max_size = 1 << (max_bits - 30);

                vm_size = amdgpu_vm_size;









* Re: [PATCH] drm/amdgpu: Fix a bug that vm size is wrong on Raven
       [not found]     ` <9ebd38b5-2b12-8797-65e5-3a920ce697aa-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-12-18  9:22       ` Christian König
       [not found]         ` <e3d20771-df70-b9e5-345b-9914c54db292-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Christian König @ 2017-12-18  9:22 UTC (permalink / raw)
  To: Felix Kühling, Yong Zhao, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



The problem was a merge conflict.

No idea what exactly went wrong, but Yong ended up with a branch where 
the vm_size was always overwritten with the value 64.

So we can drop this patch completely. And yes, a user who overrides the
vm_size value should know what the consequences are.

Regards,
Christian.

On 17.12.2017 at 00:21, Felix Kühling wrote:
> Is there a way to override the VM size for KFD VMs only? Only they
> depend on ATS, so only they need to be forced to 48 bits.
>
> On the other hand, it could be argued that a user who manually sets the
> VM size with a module parameter knows what they're doing. So let them
> break ATS.
>
> Regards,
>    Felix
>
>
> On 14.12.2017 at 17:45, Yong Zhao wrote:
>> Change-Id: Id522c1cbadb8c069720f4e64a31cff42cd014733
>> Signed-off-by: Yong Zhao <yong.zhao-5C7GfCeVMHo@public.gmane.org>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 709587d..93500e6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -2534,7 +2534,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t vm_size,
>>   	uint64_t tmp;
>>   
>>   	/* adjust vm size first */
>> -	if (amdgpu_vm_size != -1) {
>> +	if (amdgpu_vm_size != -1 && adev->asic_type != CHIP_RAVEN) {
>>   		unsigned max_size = 1 << (max_bits - 30);
>>   
>>   		vm_size = amdgpu_vm_size;
>
>
>

* Re: [PATCH] drm/amdgpu: Fix a bug that vm size is wrong on Raven
       [not found] ` <1513269904-3385-1-git-send-email-yong.zhao-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-16 23:21   ` Felix Kühling
       [not found]     ` <9ebd38b5-2b12-8797-65e5-3a920ce697aa-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Felix Kühling @ 2017-12-16 23:21 UTC (permalink / raw)
  To: Yong Zhao, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



Is there a way to override the VM size for KFD VMs only? Only they
depend on ATS, so only they need to be forced to 48 bits.

On the other hand, it could be argued that a user who manually sets the
VM size with a module parameter knows what they're doing. So let them
break ATS.

Regards,
  Felix
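
To make the 48-bit requirement concrete, here is a small sketch that converts a VM size in GB into the number of address bits it spans. Both function names are hypothetical and the code is not taken from the driver; it only restates the arithmetic that 1 GB is 2^30 bytes.

```c
#include <stdint.h>

/*
 * Hypothetical helper: address bits spanned by a VM of vm_size_gb
 * gigabytes, rounded up to the next power of two.
 */
static unsigned vm_address_bits(uint64_t vm_size_gb)
{
	unsigned bits = 30; /* 1 GB = 2^30 bytes */

	/* grow until 2^(bits - 30) GB covers the requested size */
	while (bits < 63 && ((uint64_t)1 << (bits - 30)) < vm_size_gb)
		bits++;
	return bits;
}

/* Hypothetical check: does the VM span the full 48 bits mentioned above? */
static int vm_covers_48_bits(uint64_t vm_size_gb)
{
	return vm_address_bits(vm_size_gb) >= 48;
}
```

A 64 GB override spans only 36 bits, so vm_covers_48_bits(64) is false, whereas 262144 GB spans exactly 48 bits.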


On 14.12.2017 at 17:45, Yong Zhao wrote:
> Change-Id: Id522c1cbadb8c069720f4e64a31cff42cd014733
> Signed-off-by: Yong Zhao <yong.zhao-5C7GfCeVMHo@public.gmane.org>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 709587d..93500e6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2534,7 +2534,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t vm_size,
>  	uint64_t tmp;
>  
>  	/* adjust vm size first */
> -	if (amdgpu_vm_size != -1) {
> +	if (amdgpu_vm_size != -1 && adev->asic_type != CHIP_RAVEN) {
>  		unsigned max_size = 1 << (max_bits - 30);
>  
>  		vm_size = amdgpu_vm_size;




* [PATCH] drm/amdgpu: Fix a bug that vm size is wrong on Raven
@ 2017-12-14 16:45 Yong Zhao
       [not found] ` <1513269904-3385-1-git-send-email-yong.zhao-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Yong Zhao @ 2017-12-14 16:45 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Yong Zhao

Change-Id: Id522c1cbadb8c069720f4e64a31cff42cd014733
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 709587d..93500e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2534,7 +2534,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t vm_size,
 	uint64_t tmp;
 
 	/* adjust vm size first */
-	if (amdgpu_vm_size != -1) {
+	if (amdgpu_vm_size != -1 && adev->asic_type != CHIP_RAVEN) {
 		unsigned max_size = 1 << (max_bits - 30);
 
 		vm_size = amdgpu_vm_size;
-- 
2.7.4


Thread overview: 8+ messages
2017-12-14  2:25 [PATCH] drm/amdgpu: Fix a bug that vm size is wrong on Raven Yong Zhao
     [not found] ` <1513218350-29030-1-git-send-email-yong.zhao-5C7GfCeVMHo@public.gmane.org>
2017-12-14  8:47   ` Christian König
     [not found]     ` <24c4613c-ed30-2ac3-cca2-4aec5dd707d5-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-12-14 15:56       ` Yong Zhao
     [not found]         ` <9fc57983-8af9-789d-6e60-e3716f846d6c-5C7GfCeVMHo@public.gmane.org>
2017-12-14 17:23           ` Christian König
2017-12-14 16:45 Yong Zhao
     [not found] ` <1513269904-3385-1-git-send-email-yong.zhao-5C7GfCeVMHo@public.gmane.org>
2017-12-16 23:21   ` Felix Kühling
     [not found]     ` <9ebd38b5-2b12-8797-65e5-3a920ce697aa-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-12-18  9:22       ` Christian König
     [not found]         ` <e3d20771-df70-b9e5-345b-9914c54db292-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-12-18 17:00           ` Zhao, Yong
