intel-xe.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>,
	Vivi@freedesktop.org, "Chang, Bruce" <yu.bruce.chang@intel.com>,
	intel-xe@lists.freedesktop.org
Subject: Re: [Intel-xe] [PATCH] drm/xe: fix pvc unload issue
Date: Tue, 4 Apr 2023 15:45:05 -0400	[thread overview]
Message-ID: <ZCx+Qfwr7KSaKGiw@intel.com> (raw)
In-Reply-To: <20230404184839.qlbwgv7lvvbi76i4@ldmartin-desk2.lan>

On Tue, Apr 04, 2023 at 11:48:39AM -0700, Lucas De Marchi wrote:
> On Mon, Apr 03, 2023 at 10:20:31PM +0000, Chang, Bruce wrote:
> > Currently, unload pvc driver will generate a null dereference
> > and the call stack is as below.
> > 
> > [ 4850.618000] Call Trace:
> > [ 4850.620740]  <TASK>
> > [ 4850.623134]  ttm_bo_cleanup_memtype_use+0x3f/0x50 [ttm]
> > [ 4850.628661]  ttm_bo_release+0x154/0x2c0 [ttm]
> > [ 4850.633317]  ? drm_buddy_fini+0x62/0x80 [drm_buddy]
> > [ 4850.638487]  ? __kmem_cache_free+0x27d/0x2c0
> > [ 4850.643054]  ttm_bo_put+0x38/0x60 [ttm]
> > [ 4850.647190]  xe_gem_object_free+0x1f/0x30 [xe]
> > [ 4850.651945]  drm_gem_object_free+0x1e/0x30 [drm]
> > [ 4850.656904]  ggtt_fini_noalloc+0x9d/0xe0 [xe]
> > [ 4850.661574]  drm_managed_release+0xb5/0x150 [drm]
> > [ 4850.666617]  drm_dev_release+0x30/0x50 [drm]
> > [ 4850.671209]  devm_drm_dev_init_release+0x3c/0x60 [drm]
> > 
> > There are a couple issues, but the main one is due to TTM has only
> > one TTM_PL_TT region, but since pvc has 2 tiles and tries to setup
> > 1 TTM_PL_TT each tile. The second will overwrite the first one.
> > 
> > During unload time, the first tile will reset the TTM_PL_TT manger
> > and when the second tile is trying to free Bo and it will generate
> > the null reference since the TTM manage is already got reset to 0.
> > 
> > The fix is to use one global TTM_PL_TT manager.
> > 
> > v2: make gtt mgr global and change the name to sys_mgr
> > 
> > Cc: Stuart Summers <stuart.summers@intel.com>
> > Cc: Matthew Brost <matthew.brost@intel.com>
> > Cc: Vivi, Rodrigo <rodrigo.vivi@intel.com>
> > Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com>
> > ---
> > drivers/gpu/drm/xe/Makefile                   |  2 +-
> > drivers/gpu/drm/xe/xe_device.c                |  2 +
> > drivers/gpu/drm/xe/xe_device.h                |  1 +
> > drivers/gpu/drm/xe/xe_device_types.h          |  2 +
> > drivers/gpu/drm/xe/xe_gt.c                    | 18 -----
> > drivers/gpu/drm/xe/xe_gt_types.h              |  2 -
> > drivers/gpu/drm/xe/xe_ttm_gtt_mgr.h           | 16 -----
> > drivers/gpu/drm/xe/xe_ttm_gtt_mgr_types.h     | 18 -----
> > .../xe/{xe_ttm_gtt_mgr.c => xe_ttm_sys_mgr.c} | 67 +++++++------------
> > 9 files changed, 31 insertions(+), 97 deletions(-)
> > delete mode 100644 drivers/gpu/drm/xe/xe_ttm_gtt_mgr.h
> > delete mode 100644 drivers/gpu/drm/xe/xe_ttm_gtt_mgr_types.h
> > rename drivers/gpu/drm/xe/{xe_ttm_gtt_mgr.c => xe_ttm_sys_mgr.c} (52%)
> > 
> > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > index 669c7b6f552c..7e19a15af3dc 100644
> > --- a/drivers/gpu/drm/xe/Makefile
> > +++ b/drivers/gpu/drm/xe/Makefile
> > @@ -83,7 +83,7 @@ xe-y += xe_bb.o \
> > 	xe_step.o \
> > 	xe_sync.o \
> > 	xe_trace.o \
> > -	xe_ttm_gtt_mgr.o \
> > +	xe_ttm_sys_mgr.o \
> > 	xe_ttm_stolen_mgr.o \
> > 	xe_ttm_vram_mgr.o \
> > 	xe_tuning.o \
> > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > index a79f934e3d2d..4fb9aff27686 100644
> > --- a/drivers/gpu/drm/xe/xe_device.c
> > +++ b/drivers/gpu/drm/xe/xe_device.c
> > @@ -279,6 +279,8 @@ int xe_device_probe(struct xe_device *xe)
> > 	if (err)
> > 		goto err_irq_shutdown;
> > 
> > +	xe_ttm_sys_mgr_init(xe);
> > +
> > 	for_each_gt(gt, xe, id) {
> > 		err = xe_gt_init_noalloc(gt);
> > 		if (err)
> > diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> > index d277f8985f7b..d9d1b09a8e38 100644
> > --- a/drivers/gpu/drm/xe/xe_device.h
> > +++ b/drivers/gpu/drm/xe/xe_device.h
> > @@ -116,4 +116,5 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
> > }
> > 
> > u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size);
> > +int xe_ttm_sys_mgr_init(struct xe_device *xe);
> 
>   CC [M]  drivers/gpu/drm/xe/tests/xe_bo_test.o
> ../drivers/gpu/drm/xe/xe_ttm_sys_mgr.c:99:5: error: no previous prototype for ‘xe_ttm_sys_mgr_init’ [-Werror=missing-prototypes]
>    99 | int xe_ttm_sys_mgr_init(struct xe_device *xe)
>       |     ^~~~~~~~~~~~~~~~~~~
> 

that's awkward... I didn't get issue here, not CI got it...

> 
> couple of issues:
> 
> 1) you moved the implementation to a new file and didn't include
> xe_device.h, creating the build issue above

I was wondering about gcc version, but this wouldn't be magic in older gcc...

> 2) The decl shouldn't be here though. You need a xe_ttm_sys_mgr.h, just
> like we have xe_ttm_stolen_mgr.h, xe_ttm_vram_mgr.h, xe_ttm_vram_mgr_types.h

yeap, that sounds the way to go...

> 
> Lucas De Marchi

  reply	other threads:[~2023-04-04 19:45 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-03 22:20 [Intel-xe] [PATCH] drm/xe: fix pvc unload issue Chang, Bruce
2023-04-03 22:31 ` [Intel-xe] ✓ CI.Patch_applied: success for drm/xe: fix pvc unload issue (rev2) Patchwork
2023-04-03 22:32 ` [Intel-xe] ✓ CI.KUnit: " Patchwork
2023-04-03 22:36 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-04-03 22:57 ` [Intel-xe] ○ CI.BAT: info " Patchwork
2023-04-04  3:44 ` [Intel-xe] [PATCH] drm/xe: fix pvc unload issue Matthew Brost
2023-04-04 16:29   ` Rodrigo Vivi
2023-04-04 18:48 ` Lucas De Marchi
2023-04-04 19:45   ` Rodrigo Vivi [this message]
2023-04-04 20:22     ` Lucas De Marchi
  -- strict thread matches above, loose matches on Subject: below --
2023-03-30 15:40 Chang, Bruce
2023-03-31  5:40 ` Matthew Auld
2023-03-31 22:55   ` Chang, Yu bruce

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZCx+Qfwr7KSaKGiw@intel.com \
    --to=rodrigo.vivi@intel.com \
    --cc=Vivi@freedesktop.org \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=lucas.demarchi@intel.com \
    --cc=stuart.summers@intel.com \
    --cc=yu.bruce.chang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).