* [PATCH] drm/ttm: fix bulk move handling during resource init
@ 2022-06-02 15:47 Christian König
  2022-06-02 16:15 ` Luben Tuikov
  2022-06-02 16:54 ` Alex Deucher
  0 siblings, 2 replies; 7+ messages in thread
From: Christian König @ 2022-06-02 15:47 UTC (permalink / raw)
  To: amd-gfx, Arunpravin.PaneerSelvam

The resource must be on the LRU before ttm_lru_bulk_move_add() is called.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_resource.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 65889b3caf50..928b9140f3c5 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -169,15 +169,17 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
 	res->bus.is_iomem = false;
 	res->bus.caching = ttm_cached;
 	res->bo = bo;
-	INIT_LIST_HEAD(&res->lru);
 
 	man = ttm_manager_type(bo->bdev, place->mem_type);
 	spin_lock(&bo->bdev->lru_lock);
 	man->usage += res->num_pages << PAGE_SHIFT;
-	if (bo->bulk_move)
+	if (bo->bulk_move) {
+		list_add_tail(&res->lru, &man->lru[bo->priority]);
 		ttm_lru_bulk_move_add(bo->bulk_move, res);
-	else
+	} else {
+		INIT_LIST_HEAD(&res->lru);
 		ttm_resource_move_to_lru_tail(res);
+	}
 	spin_unlock(&bo->bdev->lru_lock);
 }
 EXPORT_SYMBOL(ttm_resource_init);
-- 
2.25.1
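
The invariant in the commit message can be illustrated outside the kernel. INIT_LIST_HEAD() only makes a node point at itself, so a freshly initialized res->lru is not linked into any manager LRU; the patch links it with list_add_tail() before calling ttm_lru_bulk_move_add(). The following is a minimal userspace sketch, not TTM code: the list helpers are local re-implementations with the same effect as the kernel ones, and list_contains() is a hypothetical helper added only for the demonstration.

```c
/* Userspace sketch only: mimics the kernel's circular doubly linked list
 * to show why a self-linked node is not "on" a list. Not TTM code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Same effect as the kernel's INIT_LIST_HEAD(): the node points at itself. */
static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Same effect as the kernel's list_add_tail(): link @node just before @head. */
static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Hypothetical helper: walk @head and report whether @node is reachable. */
static bool list_contains(const struct list_head *head,
			  const struct list_head *node)
{
	const struct list_head *it;

	for (it = head->next; it != head; it = it->next)
		if (it == node)
			return true;
	return false;
}
```

With an empty list lru and a node res that has only been passed to init_list_head(), list_contains(&lru, &res) is false. After list_add_tail(&res, &lru) it is true and lru.prev points at res, which is the state the patch establishes before the bulk move bookkeeping runs.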



* Re: [PATCH] drm/ttm: fix bulk move handling during resource init
  2022-06-02 15:47 [PATCH] drm/ttm: fix bulk move handling during resource init Christian König
@ 2022-06-02 16:15 ` Luben Tuikov
  2022-06-02 16:54 ` Alex Deucher
  1 sibling, 0 replies; 7+ messages in thread
From: Luben Tuikov @ 2022-06-02 16:15 UTC (permalink / raw)
  To: Christian König, amd-gfx, Arunpravin.PaneerSelvam

Acked-by: Luben Tuikov <luben.tuikov@amd.com>

Regards,
Luben



* Re: [PATCH] drm/ttm: fix bulk move handling during resource init
  2022-06-02 15:47 [PATCH] drm/ttm: fix bulk move handling during resource init Christian König
  2022-06-02 16:15 ` Luben Tuikov
@ 2022-06-02 16:54 ` Alex Deucher
  2022-06-02 18:08   ` Mike Lothian
  1 sibling, 1 reply; 7+ messages in thread
From: Alex Deucher @ 2022-06-02 16:54 UTC (permalink / raw)
  To: Christian König, Mike Lothian; +Cc: amd-gfx list, Arunpravin

On Thu, Jun 2, 2022 at 11:47 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> The resource must be on the LRU before ttm_lru_bulk_move_add() is called.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>

This should at least fix the NULL pointer dereference in these bugs:

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1992
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2034

Alex



* Re: [PATCH] drm/ttm: fix bulk move handling during resource init
  2022-06-02 16:54 ` Alex Deucher
@ 2022-06-02 18:08   ` Mike Lothian
  2022-06-02 18:11     ` Mike Lothian
  2022-06-02 18:55     ` Christian König
  0 siblings, 2 replies; 7+ messages in thread
From: Mike Lothian @ 2022-06-02 18:08 UTC (permalink / raw)
  To: Alex Deucher; +Cc: Christian König, amd-gfx list, Arunpravin

Hi

I'm still seeing NULL pointer dereferences on Linus's tree and on drm-misc with this patch applied:

Jun 02 19:04:05 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: supervisor write access in kernel mode
Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: error_code(0x0002) - not-present page
Jun 02 19:04:05 axion.fireburn.co.uk kernel: PGD 11ee04067 P4D 11ee04067 PUD 15eccb067 PMD 0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Oops: 0002 [#1] PREEMPT SMP NOPTI
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CPU: 0 PID: 1021 Comm: GravityMark.x64 Tainted: G        W         5.18.0-tip+ #3177
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Hardware name: ASUSTeK COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.318 03/29/2022
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RIP: 0010:ttm_resource_init+0x108/0x210
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Code: 48 8b 74 0a 08 48 39 de 0f 84 82 00 00 00 48 8b 7b 38 4c 8b 4b 40 4c 8d 44 0a 08 48 8d 56 38 4c 89 4f 08 49 89 39 48 8b 4e 38 <48> 89 41 08 48 89 4b 38 48 89 >
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RSP: 0018:ffff888112e73918 EFLAGS: 00010202
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RAX: ffff888206b715d8 RBX: ffff888206b715a0 RCX: 0000000000000000
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RDX: ffff888206b71cf8 RSI: ffff888206b71cc0 RDI: ffff888110605b00
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RBP: ffff88816c848c08 R08: ffff88812235c790 R09: ffff8881306a4bd8
Jun 02 19:04:05 axion.fireburn.co.uk kernel: R10: 0000000000000000 R11: ffffffff81851320 R12: ffff888110605ad0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: R13: ffff888206b715a0 R14: ffff88816c848c58 R15: ffff888110605ad0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: FS: 00007f4c257c1740(0000) GS:ffff888fde400000(0000) knlGS:0000000000000000
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0000000000000008 CR3: 00000001183fc000 CR4: 0000000000350ef0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Call Trace:
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  <TASK>
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_vram_mgr_new+0xbb/0x4b0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? ttm_bo_mem_space+0x89/0x1e0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? ttm_bo_validate+0x80/0x1a0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_bo_validate+0xe9/0x2b0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_syncobj_lookup_and_add_to_sync+0xa0/0xa0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_vm_validate_pt_bos+0xce/0x1c0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_parser_bos+0x522/0x6e0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_ioctl+0x7fe/0xd00
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_report_moved_bytes+0x60/0x60
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? drm_ioctl_kernel+0xcb/0x130
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? drm_ioctl+0x2f5/0x400
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_report_moved_bytes+0x60/0x60
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_drm_ioctl+0x42/0x80
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? __x64_sys_ioctl+0x5e/0xa0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? do_syscall_64+0x6a/0x90
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? exit_to_user_mode_prepare+0x19/0x90
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  </TASK>
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Modules linked in:
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0000000000000008
Jun 02 19:04:05 axion.fireburn.co.uk kernel: ---[ end trace 0000000000000000 ]---
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RIP: 0010:ttm_resource_init+0x108/0x210
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Code: 48 8b 74 0a 08 48 39 de 0f 84 82 00 00 00 48 8b 7b 38 4c 8b 4b 40 4c 8d 44 0a 08 48 8d 56 38 4c 89 4f 08 49 89 39 48 8b 4e 38 <48> 89 41 08 48 89 4b 38 48 89 >
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RSP: 0018:ffff888112e73918 EFLAGS: 00010202
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RAX: ffff888206b715d8 RBX: ffff888206b715a0 RCX: 0000000000000000
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RDX: ffff888206b71cf8 RSI: ffff888206b71cc0 RDI: ffff888110605b00
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RBP: ffff88816c848c08 R08: ffff88812235c790 R09: ffff8881306a4bd8
Jun 02 19:04:05 axion.fireburn.co.uk kernel: R10: 0000000000000000 R11: ffffffff81851320 R12: ffff888110605ad0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: R13: ffff888206b715a0 R14: ffff88816c848c58 R15: ffff888110605ad0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: FS: 00007f4c257c1740(0000) GS:ffff888fde400000(0000) knlGS:0000000000000000
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0000000000000008 CR3: 00000001183fc000 CR4: 0000000000350ef0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: note: GravityMark.x64[1021] exited with preempt_count 1
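
A possible reading of the fault address above (an interpretation, not something stated in the thread): the marked instruction bytes <48> 89 41 08 decode to a 64-bit store to [rcx+8], and RCX is 0 in the register dump. The kernel's struct list_head is two pointers, next then prev, so on x86-64 prev sits at offset 8. A write fault at address 0x8 would therefore be consistent with a list operation storing to next->prev while next is NULL. The userspace sketch below only checks that offset arithmetic; struct list_head is redeclared locally and prev_store_address() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace copy of the kernel's two-pointer list node layout. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical helper: the address a store to base->prev would touch,
 * computed without dereferencing anything. */
static uintptr_t prev_store_address(uintptr_t base)
{
	return base + offsetof(struct list_head, prev);
}
```

For a NULL base, prev_store_address(0) equals sizeof(void *), which is 8 on x86-64 and matches the CR2 value in the trace. It does not say which list node was NULL, only that the address fits a ->prev store.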



* Re: [PATCH] drm/ttm: fix bulk move handling during resource init
  2022-06-02 18:08   ` Mike Lothian
@ 2022-06-02 18:11     ` Mike Lothian
  2022-06-02 18:55     ` Christian König
  1 sibling, 0 replies; 7+ messages in thread
From: Mike Lothian @ 2022-06-02 18:11 UTC (permalink / raw)
  To: Alex Deucher; +Cc: Christian König, amd-gfx list, Arunpravin

Here's the output from drm-misc too:

Jun 02 19:05:32 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer dereference, address: 00000000000000d8
Jun 02 19:05:32 axion.fireburn.co.uk kernel: #PF: supervisor read access in kernel mode
Jun 02 19:05:32 axion.fireburn.co.uk kernel: #PF: error_code(0x0000) - not-present page
Jun 02 19:05:32 axion.fireburn.co.uk kernel: PGD 11dfbb067 P4D 11dfbb067 PUD 170cd7067 PMD 0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Jun 02 19:05:32 axion.fireburn.co.uk kernel: CPU: 1 PID: 1040 Comm: GravityMark.x64 Tainted: G        W         5.18.0-rc5-misc+ #10
Jun 02 19:05:32 axion.fireburn.co.uk kernel: Hardware name: ASUSTeK COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.318 03/29/2022
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RIP: 0010:ttm_device_swapout+0x6a/0x3d0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: Code: 85 ed 74 51 80 7d 01 00 74 4b 48 89 e6 48 89 ef e8 7b dd ff ff 48 85 c0 74 3b 48 89 c3 49 89 e6 48 8b 7b 30 4c 89 ee 44 89 e2 <4c> 8b bf d8 00 00 00 e8 fa a5 >
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RSP: 0000:ffff888125a17c70 EFLAGS: 00010286
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RAX: ffff888104f45aa0 RBX: ffff888104f45aa0 RCX: 0000000000000000
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RDX: 0000000000000cc0 RSI: ffff888125a17d50 RDI: 0000000000000000
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RBP: ffff888104f45118 R08: ffff88812278da68 R09: ffff888102eaa680
Jun 02 19:05:32 axion.fireburn.co.uk kernel: R10: 0000000000000063 R11: ffffffff818212a0 R12: 0000000000000cc0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: R13: ffff888125a17d50 R14: ffff888125a17c70 R15: 0000000000691000
Jun 02 19:05:32 axion.fireburn.co.uk kernel: FS: 00007fd9e0671740(0000) GS:ffff888fde440000(0000) knlGS:0000000000000000
Jun 02 19:05:32 axion.fireburn.co.uk kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 02 19:05:32 axion.fireburn.co.uk kernel: CR2: 00000000000000d8 CR3: 0000000125c5a000 CR4: 0000000000150ee0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: Call Trace:
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  <TASK>
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? ttm_global_swapout+0xae/0xc0
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? ttm_tt_populate+0x7d/0x130
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? ttm_bo_vm_fault_reserved+0x237/0x270
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? amdgpu_gem_fault+0x92/0xd0
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? do_fault+0x28e/0x4b0
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? handle_mm_fault+0x849/0xa80
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? do_user_addr_fault+0x275/0x450
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? asm_exc_page_fault+0x9/0x30
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? exc_page_fault+0x5f/0x150
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? asm_exc_page_fault+0x1f/0x30
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  </TASK>
Jun 02 19:05:32 axion.fireburn.co.uk kernel: Modules linked in:
Jun 02 19:05:32 axion.fireburn.co.uk kernel: CR2: 00000000000000d8
Jun 02 19:05:32 axion.fireburn.co.uk kernel: ---[ end trace 0000000000000000 ]---
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RIP: 0010:ttm_device_swapout+0x6a/0x3d0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: Code: 85 ed 74 51 80 7d 01 00 74 4b 48 89 e6 48 89 ef e8 7b dd ff ff 48 85 c0 74 3b 48 89 c3 49 89 e6 48 8b 7b 30 4c 89 ee 44 89 e2 <4c> 8b bf d8 00 00 00 e8 fa a5 >
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RSP: 0000:ffff888125a17c70 EFLAGS: 00010286
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RAX: ffff888104f45aa0 RBX: ffff888104f45aa0 RCX: 0000000000000000
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RDX: 0000000000000cc0 RSI: ffff888125a17d50 RDI: 0000000000000000
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RBP: ffff888104f45118 R08: ffff88812278da68 R09: ffff888102eaa680
Jun 02 19:05:32 axion.fireburn.co.uk kernel: R10: 0000000000000063 R11: ffffffff818212a0 R12: 0000000000000cc0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: R13: ffff888125a17d50 R14: ffff888125a17c70 R15: 0000000000691000
Jun 02 19:05:32 axion.fireburn.co.uk kernel: FS: 00007fd9e0671740(0000) GS:ffff888fde440000(0000) knlGS:0000000000000000
Jun 02 19:05:32 axion.fireburn.co.uk kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 02 19:05:32 axion.fireburn.co.uk kernel: CR2: 00000000000000d8 CR3: 0000000125c5a000 CR4: 0000000000150ee0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: note: GravityMark.x64[1040] exited with preempt_count 1



* Re: [PATCH] drm/ttm: fix bulk move handling during resource init
  2022-06-02 18:08   ` Mike Lothian
  2022-06-02 18:11     ` Mike Lothian
@ 2022-06-02 18:55     ` Christian König
  2022-06-02 19:02       ` Mike Lothian
  1 sibling, 1 reply; 7+ messages in thread
From: Christian König @ 2022-06-02 18:55 UTC (permalink / raw)
  To: Mike Lothian, Alex Deucher; +Cc: amd-gfx list, Arunpravin

That's because drm-misc-next is currently broken and needs a backmerge.

Please try this patch on top of drm-next.

Regards,
Christian.




* Re: [PATCH] drm/ttm: fix bulk move handling during resource init
  2022-06-02 18:55     ` Christian König
@ 2022-06-02 19:02       ` Mike Lothian
  0 siblings, 0 replies; 7+ messages in thread
From: Mike Lothian @ 2022-06-02 19:02 UTC (permalink / raw)
  To: Christian König; +Cc: Alex Deucher, amd-gfx list, Arunpravin

The NULL pointer dereference I get against drm-next:

Jun 02 19:59:50 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer dereference, address: 00000000000000d8
Jun 02 19:59:50 axion.fireburn.co.uk kernel: #PF: supervisor read access in kernel mode
Jun 02 19:59:50 axion.fireburn.co.uk kernel: #PF: error_code(0x0000) - not-present page
Jun 02 19:59:50 axion.fireburn.co.uk kernel: PGD 118700067 P4D 118700067 PUD 11f116067 PMD 0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Jun 02 19:59:50 axion.fireburn.co.uk kernel: CPU: 4 PID: 1029 Comm: GravityMark.x64 Tainted: G        W         5.18.0-rc5-drm+ #1070
Jun 02 19:59:50 axion.fireburn.co.uk kernel: Hardware name: ASUSTeK COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.318 03/29/2022
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RIP: 0010:ttm_device_swapout+0x6a/0x3d0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: Code: 85 ed 74 51 80 7d 01 00 74 4b 48 89 e6 48 89 ef e8 7b dd ff ff 48 85 c0 74 3b 48 89 c3 49 89 e6 48 8b 7b 30 4c 89 ee 44 89 e2 <4c> 8b bf d8 00 00 00 e8 fa a5
>
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RSP: 0000:ffff8881605dfc70 EFLAGS: 00010282
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RAX: ffff888104f85ac8 RBX: ffff888104f85ac8 RCX: 0000000000000000
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RDX: 0000000000000cc0 RSI: ffff8881605dfd50 RDI: 0000000000000000
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RBP: ffff888104f85140 R08: ffff888101566240 R09: ffff88814e57b880
Jun 02 19:59:50 axion.fireburn.co.uk kernel: R10: 0000000000000063 R11: ffffffff818238a0 R12: 0000000000000cc0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: R13: ffff8881605dfd50 R14: ffff8881605dfc70 R15: 0000000000691000
Jun 02 19:59:50 axion.fireburn.co.uk kernel: FS: 00007f4623fb9740(0000) GS:ffff888fde500000(0000) knlGS:0000000000000000
Jun 02 19:59:50 axion.fireburn.co.uk kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 02 19:59:50 axion.fireburn.co.uk kernel: CR2: 00000000000000d8 CR3: 0000000102e3c000 CR4: 0000000000150ee0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: Call Trace:
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  <TASK>
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? ttm_global_swapout+0xae/0xc0
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? ttm_tt_populate+0x7d/0x130
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? ttm_bo_vm_fault_reserved+0x237/0x270
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? amdgpu_gem_fault+0x92/0xd0
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? do_fault+0x28e/0x4b0
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? handle_mm_fault+0x849/0xa80
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? amdgpu_drm_ioctl+0x68/0x80
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? do_user_addr_fault+0x275/0x450
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? asm_exc_page_fault+0x9/0x30
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? exc_page_fault+0x5f/0x150
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? asm_exc_page_fault+0x1f/0x30
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  </TASK>
Jun 02 19:59:50 axion.fireburn.co.uk kernel: Modules linked in:
Jun 02 19:59:50 axion.fireburn.co.uk kernel: CR2: 00000000000000d8
Jun 02 19:59:50 axion.fireburn.co.uk kernel: ---[ end trace 0000000000000000 ]---
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RIP: 0010:ttm_device_swapout+0x6a/0x3d0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: Code: 85 ed 74 51 80 7d 01 00 74 4b 48 89 e6 48 89 ef e8 7b dd ff ff 48 85 c0 74 3b 48 89 c3 49 89 e6 48 8b 7b 30 4c 89 ee 44 89 e2 <4c> 8b bf d8 00 00 00 e8 fa a5
>
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RSP: 0000:ffff8881605dfc70 EFLAGS: 00010282
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RAX: ffff888104f85ac8 RBX: ffff888104f85ac8 RCX: 0000000000000000
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RDX: 0000000000000cc0 RSI: ffff8881605dfd50 RDI: 0000000000000000
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RBP: ffff888104f85140 R08: ffff888101566240 R09: ffff88814e57b880
Jun 02 19:59:50 axion.fireburn.co.uk kernel: R10: 0000000000000063 R11: ffffffff818238a0 R12: 0000000000000cc0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: R13: ffff8881605dfd50 R14: ffff8881605dfc70 R15: 0000000000691000
Jun 02 19:59:50 axion.fireburn.co.uk kernel: FS: 00007f4623fb9740(0000) GS:ffff888fde500000(0000) knlGS:0000000000000000
Jun 02 19:59:50 axion.fireburn.co.uk kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 02 19:59:50 axion.fireburn.co.uk kernel: CR2: 00000000000000d8 CR3: 0000000102e3c000 CR4: 0000000000150ee0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: note: GravityMark.x64[1029] exited with preempt_count 1
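
For anyone decoding the trace above: the faulting bytes `<4c> 8b bf d8 00 00 00` decode to a 64-bit load from 0xd8(%rdi), and RDI is 0 in the register dump, so the reported address 0xd8 looks like a member offset applied to a NULL base pointer. A minimal C sketch of that arithmetic follows. struct example_obj and its padding are hypothetical and are not the kernel's real layout; the sketch says nothing about which structure or member is actually involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT a real kernel structure: the padding exists
 * only so that the member of interest lands at offset 0xd8. */
struct example_obj {
	unsigned char pad[0xd8];
	void *field;
};

/* Address the CPU would touch when reading obj->field through 'base'. */
static uintptr_t field_address(uintptr_t base)
{
	return base + offsetof(struct example_obj, field);
}

/* With a NULL base the access lands at the bare member offset, which is
 * the value an oops reports as the faulting address (CR2). */
static void check_example(void)
{
	assert(field_address(0) == 0xd8);
}
```

Read this way, the interesting question is which pointer was NULL, not what lives at 0xd8.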

On Thu, 2 Jun 2022 at 19:55, Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> That's because drm-misc-next is currently broken and needs a backmerge.
>
> Please try this patch on top of drm-next.
>
> Regards,
> Christian.
>
> Am 02.06.22 um 20:08 schrieb Mike Lothian:
> > Hi
> >
> > I'm still seeing Null pointers against Linus's tree and drm-misc with this patch
> >
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer
> > dereference, address: 0000000000000008
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: supervisor write
> > access in kernel mode
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: error_code(0x0002) -
> > not-present page
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: PGD 11ee04067 P4D
> > 11ee04067 PUD 15eccb067 PMD 0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: Oops: 0002 [#1] PREEMPT SMP NOPTI
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: CPU: 0 PID: 1021 Comm:
> > GravityMark.x64 Tainted: G        W         5.18.0-tip+ #3177
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: Hardware name: ASUSTeK
> > COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.318
> > 03/29/2022
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: RIP:
> > 0010:ttm_resource_init+0x108/0x210
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: Code: 48 8b 74 0a 08 48
> > 39 de 0f 84 82 00 00 00 48 8b 7b 38 4c 8b 4b 40 4c 8d 44 0a 08 48 8d
> > 56 38 4c 89 4f 08 49 89 39 48 8b 4e 38 <48> 89 41 08 48 89 4b 38 48 89
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: RSP:
> > 0018:ffff888112e73918 EFLAGS: 00010202
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: RAX: ffff888206b715d8
> > RBX: ffff888206b715a0 RCX: 0000000000000000
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: RDX: ffff888206b71cf8
> > RSI: ffff888206b71cc0 RDI: ffff888110605b00
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: RBP: ffff88816c848c08
> > R08: ffff88812235c790 R09: ffff8881306a4bd8
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: R10: 0000000000000000
> > R11: ffffffff81851320 R12: ffff888110605ad0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: R13: ffff888206b715a0
> > R14: ffff88816c848c58 R15: ffff888110605ad0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: FS:
> > 00007f4c257c1740(0000) GS:ffff888fde400000(0000)
> > knlGS:0000000000000000
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: CS:  0010 DS: 0000 ES:
> > 0000 CR0: 0000000080050033
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0000000000000008
> > CR3: 00000001183fc000 CR4: 0000000000350ef0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: Call Trace:
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  <TASK>
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_vram_mgr_new+0xbb/0x4b0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? ttm_bo_mem_space+0x89/0x1e0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? ttm_bo_validate+0x80/0x1a0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_bo_validate+0xe9/0x2b0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
> > amdgpu_syncobj_lookup_and_add_to_sync+0xa0/0xa0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
> > amdgpu_vm_validate_pt_bos+0xce/0x1c0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_parser_bos+0x522/0x6e0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_ioctl+0x7fe/0xd00
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
> > amdgpu_cs_report_moved_bytes+0x60/0x60
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? drm_ioctl_kernel+0xcb/0x130
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? drm_ioctl+0x2f5/0x400
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
> > amdgpu_cs_report_moved_bytes+0x60/0x60
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_drm_ioctl+0x42/0x80
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? __x64_sys_ioctl+0x5e/0xa0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? do_syscall_64+0x6a/0x90
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
> > exit_to_user_mode_prepare+0x19/0x90
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
> > entry_SYSCALL_64_after_hwframe+0x46/0xb0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel:  </TASK>
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: Modules linked in:
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0000000000000008
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: ---[ end trace
> > 0000000000000000 ]---
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: RIP:
> > 0010:ttm_resource_init+0x108/0x210
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: Code: 48 8b 74 0a 08 48
> > 39 de 0f 84 82 00 00 00 48 8b 7b 38 4c 8b 4b 40 4c 8d 44 0a 08 48 8d
> > 56 38 4c 89 4f 08 49 89 39 48 8b 4e 38 <48> 89 41 08 48 89 4b 38 48 89
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: RSP:
> > 0018:ffff888112e73918 EFLAGS: 00010202
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: RAX: ffff888206b715d8
> > RBX: ffff888206b715a0 RCX: 0000000000000000
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: RDX: ffff888206b71cf8
> > RSI: ffff888206b71cc0 RDI: ffff888110605b00
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: RBP: ffff88816c848c08
> > R08: ffff88812235c790 R09: ffff8881306a4bd8
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: R10: 0000000000000000
> > R11: ffffffff81851320 R12: ffff888110605ad0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: R13: ffff888206b715a0
> > R14: ffff88816c848c58 R15: ffff888110605ad0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: FS:
> > 00007f4c257c1740(0000) GS:ffff888fde400000(0000)
> > knlGS:0000000000000000
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: CS:  0010 DS: 0000 ES:
> > 0000 CR0: 0000000080050033
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0000000000000008
> > CR3: 00000001183fc000 CR4: 0000000000350ef0
> > Jun 02 19:04:05 axion.fireburn.co.uk kernel: note:
> > GravityMark.x64[1021] exited with preempt_count 1
> >
> > On Thu, 2 Jun 2022 at 17:54, Alex Deucher <alexdeucher@gmail.com> wrote:
> >> On Thu, Jun 2, 2022 at 11:47 AM Christian König
> >> <ckoenig.leichtzumerken@gmail.com> wrote:
> >>> The resource must be on the LRU before ttm_lru_bulk_move_add() is called.
> >>>
> >>> Signed-off-by: Christian König <christian.koenig@amd.com>
> >> This should at least fix the null pointer in these bugs:
> >>
> >> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1992
> >> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2034
> >>
> >> Alex
> >>
> >>> ---
> >>>   drivers/gpu/drm/ttm/ttm_resource.c | 8 +++++---
> >>>   1 file changed, 5 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
> >>> index 65889b3caf50..928b9140f3c5 100644
> >>> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> >>> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> >>> @@ -169,15 +169,17 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
> >>>          res->bus.is_iomem = false;
> >>>          res->bus.caching = ttm_cached;
> >>>          res->bo = bo;
> >>> -       INIT_LIST_HEAD(&res->lru);
> >>>
> >>>          man = ttm_manager_type(bo->bdev, place->mem_type);
> >>>          spin_lock(&bo->bdev->lru_lock);
> >>>          man->usage += res->num_pages << PAGE_SHIFT;
> >>> -       if (bo->bulk_move)
> >>> +       if (bo->bulk_move) {
> >>> +               list_add_tail(&res->lru, &man->lru[bo->priority]);
> >>>                  ttm_lru_bulk_move_add(bo->bulk_move, res);
> >>> -       else
> >>> +       } else {
> >>> +               INIT_LIST_HEAD(&res->lru);
> >>>                  ttm_resource_move_to_lru_tail(res);
> >>> +       }
> >>>          spin_unlock(&bo->bdev->lru_lock);
> >>>   }
> >>>   EXPORT_SYMBOL(ttm_resource_init);
> >>> --
> >>> 2.25.1
> >>>
>
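
The ordering rule in the patch description (the resource must be on the LRU before ttm_lru_bulk_move_add() is called) can be illustrated with a small userspace model. This is a sketch under assumptions, not the TTM code: list_node mimics the semantics of the kernel's struct list_head, while bulk_range and bulk_range_add() are hypothetical, simplified stand-ins for the bulk move tracking. The point is only that moving a node within a list dereferences its neighbours, so the node has to be linked first.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model of a circular doubly linked list. */
struct list_node {
	struct list_node *next, *prev;
};

/* Make 'n' an empty list (points at itself). */
static void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/* Link 'n' just before 'head', i.e. at the tail of the list. */
static void list_add_tail_model(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink 'n' and re-insert it right after 'pos'. Both neighbours of 'n'
 * are written to, so 'n' must already be linked (or self-linked). */
static void list_move_model(struct list_node *n, struct list_node *pos)
{
	assert(n->next && n->prev); /* a zeroed, never-linked node fails here */
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = pos->next;
	n->prev = pos;
	pos->next->prev = n;
	pos->next = n;
}

/* Hypothetical stand-in for a bulk move range: first and last tracked node. */
struct bulk_range {
	struct list_node *first, *last;
};

/* Track a newly linked node, keeping tracked nodes adjacent on the list. */
static void bulk_range_add(struct bulk_range *r, struct list_node *n)
{
	if (!r->first) {
		r->first = n;
		r->last = n;
		return;
	}
	if (r->last != n) {
		list_move_model(n, r->last);
		r->last = n;
	}
}
```

In this model the caller links the node with list_add_tail_model() and only then calls bulk_range_add(), which mirrors the order the patch establishes in the bulk_move branch. Calling bulk_range_add() on a zeroed, unlinked node would trip the assert in list_move_model().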


Thread overview: 7 messages
2022-06-02 15:47 [PATCH] drm/ttm: fix bulk move handling during resource init Christian König
2022-06-02 16:15 ` Luben Tuikov
2022-06-02 16:54 ` Alex Deucher
2022-06-02 18:08   ` Mike Lothian
2022-06-02 18:11     ` Mike Lothian
2022-06-02 18:55     ` Christian König
2022-06-02 19:02       ` Mike Lothian
