* [PATCH] drm/ttm: fix missing NULL check in ttm_device_swapout
@ 2022-06-03 10:46 Christian König
  2022-06-03 19:37 ` Felix Kuehling
  2022-06-06 13:15 ` Felix Kuehling
  0 siblings, 2 replies; 7+ messages in thread
From: Christian König @ 2022-06-03 10:46 UTC
  To: mike, dri-devel; +Cc: Christian König

Resources about to be destructed are not tied to BOs any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_device.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index a0562ab386f5..e7147e304637 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -156,8 +156,12 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
 
 		ttm_resource_manager_for_each_res(man, &cursor, res) {
 			struct ttm_buffer_object *bo = res->bo;
-			uint32_t num_pages = PFN_UP(bo->base.size);
+			uint32_t num_pages;
 
+			if (!bo)
+				continue;
+
+			num_pages = PFN_UP(bo->base.size);
 			ret = ttm_bo_swapout(bo, ctx, gfp_flags);
 			/* ttm_bo_swapout has dropped the lru_lock */
 			if (!ret)
-- 
2.25.1
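
For readers following the diff outside the kernel tree, the pattern can be sketched in plain userspace C. This is only an illustration under stated assumptions: `sketch_bo`, `sketch_res`, `SKETCH_PAGE_SIZE`, `SKETCH_PFN_UP` and `sketch_count_pages` are hypothetical stand-ins, not TTM types or API, and the 4096-byte page size is assumed. It shows why the size computation has to move below the NULL check: a resource whose back-pointer is already cleared must be skipped before anything is read through it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch only. */
#define SKETCH_PAGE_SIZE 4096u
/* Round a byte count up to whole pages, in the spirit of PFN_UP(). */
#define SKETCH_PFN_UP(x) (((x) + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE)

/* Hypothetical stand-in for a buffer object. */
struct sketch_bo {
	size_t size;
};

/* Hypothetical stand-in for a resource; bo may be NULL during teardown. */
struct sketch_res {
	struct sketch_bo *bo;
};

/* Sum the page counts of all resources that are still tied to a BO. */
static uint32_t sketch_count_pages(const struct sketch_res *res, size_t n)
{
	uint32_t total = 0;

	assert(res != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct sketch_bo *bo = res[i].bo;
		uint32_t num_pages;

		/* Skip resources that are no longer tied to a BO. */
		if (!bo)
			continue;

		/* Only dereference bo after the check above. */
		num_pages = (uint32_t)SKETCH_PFN_UP(bo->size);
		total += num_pages;
	}
	return total;
}
```

Initialising `num_pages` in its declaration, as the removed line did, would read `bo->size` before the check and fault on the NULL entry.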



* Re: [PATCH] drm/ttm: fix missing NULL check in ttm_device_swapout
  2022-06-03 10:46 [PATCH] drm/ttm: fix missing NULL check in ttm_device_swapout Christian König
@ 2022-06-03 19:37 ` Felix Kuehling
  2022-06-03 22:44   ` Felix Kuehling
  2022-06-06 13:15 ` Felix Kuehling
  1 sibling, 1 reply; 7+ messages in thread
From: Felix Kuehling @ 2022-06-03 19:37 UTC
  To: Christian König, mike, dri-devel; +Cc: Christian König


On 2022-06-03 06:46, Christian König wrote:
> Resources about to be destructed are not tied to BOs any more.
I've been seeing a backtrace in that area with a patch series I'm
working on, but haven't had time to track it down yet. I'll check
whether this patch fixes it.

Regards,
   Felix


>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_device.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index a0562ab386f5..e7147e304637 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -156,8 +156,12 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>   
>   		ttm_resource_manager_for_each_res(man, &cursor, res) {
>   			struct ttm_buffer_object *bo = res->bo;
> -			uint32_t num_pages = PFN_UP(bo->base.size);
> +			uint32_t num_pages;
>   
> +			if (!bo)
> +				continue;
> +
> +			num_pages = PFN_UP(bo->base.size);
>   			ret = ttm_bo_swapout(bo, ctx, gfp_flags);
>   			/* ttm_bo_swapout has dropped the lru_lock */
>   			if (!ret)


* Re: [PATCH] drm/ttm: fix missing NULL check in ttm_device_swapout
  2022-06-03 19:37 ` Felix Kuehling
@ 2022-06-03 22:44   ` Felix Kuehling
  2022-06-06 10:20     ` Christian König
  0 siblings, 1 reply; 7+ messages in thread
From: Felix Kuehling @ 2022-06-03 22:44 UTC
  To: Christian König, mike, dri-devel, amd-gfx; +Cc: Christian König

[+amd-gfx]


On 2022-06-03 15:37, Felix Kuehling wrote:
>
> On 2022-06-03 06:46, Christian König wrote:
>> Resources about to be destructed are not tied to BOs any more.
> I've been seeing a backtrace in that area with a patch series I'm
> working on, but haven't had time to track it down yet. I'll check
> whether this patch fixes it.

The patch doesn't apply on amd-staging-drm-next. I made the following 
change instead, which fixes my problem (and I do see the pr_err being 
triggered):

--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -157,6 +157,10 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
                         list_for_each_entry(bo, &man->lru[j], lru) {
                                 uint32_t num_pages = PFN_UP(bo->base.size);
  
+                               if (!bo->resource) {
+                                       pr_err("### bo->resource is NULL\n");
+                                       continue;
+                               }
                                 ret = ttm_bo_swapout(bo, ctx, gfp_flags);
                                 /* ttm_bo_swapout has dropped the lru_lock */
                                 if (!ret)

>
> Regards,
>   Felix
>
>
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/ttm/ttm_device.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
>> index a0562ab386f5..e7147e304637 100644
>> --- a/drivers/gpu/drm/ttm/ttm_device.c
>> +++ b/drivers/gpu/drm/ttm/ttm_device.c
>> @@ -156,8 +156,12 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>>   
>>   		ttm_resource_manager_for_each_res(man, &cursor, res) {
>>   			struct ttm_buffer_object *bo = res->bo;
>> -			uint32_t num_pages = PFN_UP(bo->base.size);
>> +			uint32_t num_pages;
>>   
>> +			if (!bo)
>> +				continue;
>> +
>> +			num_pages = PFN_UP(bo->base.size);
>>   			ret = ttm_bo_swapout(bo, ctx, gfp_flags);
>>   			/* ttm_bo_swapout has dropped the lru_lock */
>>   			if (!ret)


* Re: [PATCH] drm/ttm: fix missing NULL check in ttm_device_swapout
  2022-06-03 22:44   ` Felix Kuehling
@ 2022-06-06 10:20     ` Christian König
  0 siblings, 0 replies; 7+ messages in thread
From: Christian König @ 2022-06-06 10:20 UTC
  To: Felix Kuehling, mike, dri-devel, amd-gfx; +Cc: Christian König

On 04.06.22 at 00:44, Felix Kuehling wrote:
> [+amd-gfx]
>
>
> On 2022-06-03 15:37, Felix Kuehling wrote:
>>
>> On 2022-06-03 06:46, Christian König wrote:
>>> Resources about to be destructed are not tied to BOs any more.
>> I've been seeing a backtrace in that area with a patch series I'm
>> working on, but haven't had time to track it down yet. I'll check
>> whether this patch fixes it.
>
> The patch doesn't apply on amd-staging-drm-next. I made the following 
> change instead, which fixes my problem (and I do see the pr_err being 
> triggered):
>
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -157,6 +157,10 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>                         list_for_each_entry(bo, &man->lru[j], lru) {
>                                 uint32_t num_pages = PFN_UP(bo->base.size);
>
> +                               if (!bo->resource) {
> +                                       pr_err("### bo->resource is NULL\n");
> +                                       continue;
> +                               }

Yeah, that should be functionally identical.

Can I get an rb for that? I'm going to provide backports to older kernels
as well, then.

Regards,
Christian.

>                                 ret = ttm_bo_swapout(bo, ctx, gfp_flags);
>                                 /* ttm_bo_swapout has dropped the lru_lock */
>                                 if (!ret)
>
>>
>> Regards,
>>   Felix
>>
>>
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>   drivers/gpu/drm/ttm/ttm_device.c | 6 +++++-
>>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
>>> index a0562ab386f5..e7147e304637 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_device.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_device.c
>>> @@ -156,8 +156,12 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>>>   
>>>   		ttm_resource_manager_for_each_res(man, &cursor, res) {
>>>   			struct ttm_buffer_object *bo = res->bo;
>>> -			uint32_t num_pages = PFN_UP(bo->base.size);
>>> +			uint32_t num_pages;
>>>   
>>> +			if (!bo)
>>> +				continue;
>>> +
>>> +			num_pages = PFN_UP(bo->base.size);
>>>   			ret = ttm_bo_swapout(bo, ctx, gfp_flags);
>>>   			/* ttm_bo_swapout has dropped the lru_lock */
>>>   			if (!ret)



* Re: [PATCH] drm/ttm: fix missing NULL check in ttm_device_swapout
  2022-06-03 10:46 [PATCH] drm/ttm: fix missing NULL check in ttm_device_swapout Christian König
  2022-06-03 19:37 ` Felix Kuehling
@ 2022-06-06 13:15 ` Felix Kuehling
  1 sibling, 0 replies; 7+ messages in thread
From: Felix Kuehling @ 2022-06-06 13:15 UTC
  To: Christian König, mike, dri-devel, amd-gfx list; +Cc: Christian König


On 2022-06-03 at 06:46, Christian König wrote:
> Resources about to be destructed are not tied to BOs any more.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>


> ---
>   drivers/gpu/drm/ttm/ttm_device.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index a0562ab386f5..e7147e304637 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -156,8 +156,12 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>   
>   		ttm_resource_manager_for_each_res(man, &cursor, res) {
>   			struct ttm_buffer_object *bo = res->bo;
> -			uint32_t num_pages = PFN_UP(bo->base.size);
> +			uint32_t num_pages;
>   
> +			if (!bo)
> +				continue;
> +
> +			num_pages = PFN_UP(bo->base.size);
>   			ret = ttm_bo_swapout(bo, ctx, gfp_flags);
>   			/* ttm_bo_swapout has dropped the lru_lock */
>   			if (!ret)


* Re: [PATCH] drm/ttm: fix missing NULL check in ttm_device_swapout
  2022-07-14 21:58 Felix Kuehling
@ 2022-07-15  6:46 ` Christian König
  0 siblings, 0 replies; 7+ messages in thread
From: Christian König @ 2022-07-15  6:46 UTC
  To: Felix Kuehling, amd-gfx

On 14.07.22 at 23:58, Felix Kuehling wrote:
> Backport of Christian's patch 81b0d0e4f811 to amd-staging-drm-next. This
> branch may be nearly obsolete, but this patch may still be worth
> applying as it can serve as a template for backports to some release
> branches. It fixes intermittent kernel oopses when memory is severely
> overcommitted.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

I was hoping that Alex's rebase would land before anybody noticed this
problem.

Anyway, the patch is Reviewed-by: Christian König <christian.koenig@amd.com>.

Regards,
Christian.

> ---
>   drivers/gpu/drm/ttm/ttm_device.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index be24bb6cefd0..165a6cbb45d5 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -157,6 +157,9 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>   			list_for_each_entry(bo, &man->lru[j], lru) {
>   				uint32_t num_pages = PFN_UP(bo->base.size);
>   
> +				if (!bo->resource)
> +					continue;
> +
>   				ret = ttm_bo_swapout(bo, ctx, gfp_flags);
>   				/* ttm_bo_swapout has dropped the lru_lock */
>   				if (!ret)



* [PATCH] drm/ttm: fix missing NULL check in ttm_device_swapout
@ 2022-07-14 21:58 Felix Kuehling
  2022-07-15  6:46 ` Christian König
  0 siblings, 1 reply; 7+ messages in thread
From: Felix Kuehling @ 2022-07-14 21:58 UTC
  To: amd-gfx; +Cc: christian.koenig

Backport of Christian's patch 81b0d0e4f811 to amd-staging-drm-next. This
branch may be nearly obsolete, but this patch may still be worth
applying as it can serve as a template for backports to some release
branches. It fixes intermittent kernel oopses when memory is severely
overcommitted.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/ttm/ttm_device.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index be24bb6cefd0..165a6cbb45d5 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -157,6 +157,9 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
 			list_for_each_entry(bo, &man->lru[j], lru) {
 				uint32_t num_pages = PFN_UP(bo->base.size);
 
+				if (!bo->resource)
+					continue;
+
 				ret = ttm_bo_swapout(bo, ctx, gfp_flags);
 				/* ttm_bo_swapout has dropped the lru_lock */
 				if (!ret)
-- 
2.32.0
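
The backport walks a list of BOs rather than a list of resources, so the check is on the BO's resource pointer instead of the resource's BO pointer. A minimal userspace sketch of that variant, under stated assumptions: `sketch_resource`, `sketch_lru_bo` and `sketch_count_candidates` are hypothetical names for illustration, not TTM types or API, and an array stands in for the kernel's LRU list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a resource. */
struct sketch_resource {
	int placeholder;
};

/* Hypothetical stand-in for a BO on an LRU; resource may be NULL. */
struct sketch_lru_bo {
	size_t size;
	struct sketch_resource *resource;
};

/* Count the BOs that would still be handed on for swapout. */
static size_t sketch_count_candidates(const struct sketch_lru_bo *bos, size_t n)
{
	size_t candidates = 0;

	assert(bos != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* A BO without a resource has nothing to swap out; skip it. */
		if (!bos[i].resource)
			continue;

		candidates++;
	}
	return candidates;
}
```

In this variant the BO itself is always a valid list entry, which is presumably why the backport can leave the `num_pages` initialisation above the check: only `bo->resource` may be NULL, not `bo`.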



Thread overview: 7+ messages
2022-06-03 10:46 [PATCH] drm/ttm: fix missing NULL check in ttm_device_swapout Christian König
2022-06-03 19:37 ` Felix Kuehling
2022-06-03 22:44   ` Felix Kuehling
2022-06-06 10:20     ` Christian König
2022-06-06 13:15 ` Felix Kuehling
2022-07-14 21:58 Felix Kuehling
2022-07-15  6:46 ` Christian König
