From: "Zeng, Oak" <oak.zeng@intel.com>
To: "Brost, Matthew" <matthew.brost@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>
Subject: RE: [PATCH v4 15/30] drm/xe: Add xe_vm_pgtable_update_op to xe_vma_ops
Date: Wed, 27 Mar 2024 01:29:16 +0000
Message-ID: <SA1PR11MB699177A96AF72F2B6B24339292342@SA1PR11MB6991.namprd11.prod.outlook.com>
In-Reply-To: <ZgMcfJT5UCcTgMl8@DUT025-TGLU.fm.intel.com>



> -----Original Message-----
> From: Brost, Matthew <matthew.brost@intel.com>
> Sent: Tuesday, March 26, 2024 3:06 PM
> To: Zeng, Oak <oak.zeng@intel.com>
> Cc: intel-xe@lists.freedesktop.org
> Subject: Re: [PATCH v4 15/30] drm/xe: Add xe_vm_pgtable_update_op to xe_vma_ops
> 
> On Mon, Mar 25, 2024 at 03:58:10PM -0600, Zeng, Oak wrote:
> >
> >
> > > -----Original Message-----
> > > From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of Matthew Brost
> > > Sent: Friday, March 8, 2024 12:08 AM
> > > To: intel-xe@lists.freedesktop.org
> > > Cc: Brost, Matthew <matthew.brost@intel.com>
> > > Subject: [PATCH v4 15/30] drm/xe: Add xe_vm_pgtable_update_op to xe_vma_ops
> > >
> > > Will help with the conversion to 1 job per VM bind IOCTL. Only the
> > > allocation is implemented in this patch.
> >
> >
> > Can you explain why you need to introduce pt_update_op? This patch seems
> > to only allocate slots for pt_update_op, but they are not really used for
> > page table updates. I will keep looking at the rest of this series.
> >
> > I also don't get why this helps with 1 job per bind IOCTL.
> >
> 
> Each VMA operation is converted into 0-3 PT ops per tile, e.g. a MAP VMA
> operation with the immediate flag clear (on a fault-mode VM) generates 0
> PT ops, while a REMAP VMA operation with both prev & next generates 3 PT
> ops. PT ops work on the internal / GPU PT state, whereas VMA ops operate
> on the internal GPUVM state.
> 
> The flow is roughly:
> 
> generate VMA ops from IOCTL input
> convert all VMA ops to PT ops, prep and stage PT ops
> run PT ops as an atomic unit (1 job)
> commit PT ops
> commit VMA ops
> 
> If a failure occurs at any point before the commit step, the PT ops can
> be unwound.
> 
> Likewise, when a failure occurs the VMA ops are unwound too; this is just
> a layer above the PT ops.
> 
> In the end, if an error occurs at any point in the IOCTL, the original
> GPUVM and PT state is restored. Compare this to serial execution on tip,
> where if an error occurs we just kill the VM.
> 
> Does this make sense?

Hi Matt,

Thanks for explaining it. Should we also document these implementation details in xe_vm_doc.h?
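
Just to make sure I follow, the whole IOCTL would then look roughly like the
sketch below. This is only my paraphrase of your description -- the helper
names are hypothetical, not the actual functions in this series:

struct xe_vm;
struct xe_vma_ops;

/* Hypothetical helpers standing in for the real stage/run/commit code */
int stage_pt_ops(struct xe_vm *vm, struct xe_vma_ops *vops);
int run_pt_ops_job(struct xe_vm *vm, struct xe_vma_ops *vops);
void commit_pt_ops(struct xe_vm *vm, struct xe_vma_ops *vops);
void commit_vma_ops(struct xe_vm *vm, struct xe_vma_ops *vops);
void unwind_pt_ops(struct xe_vm *vm, struct xe_vma_ops *vops);
void unwind_vma_ops(struct xe_vm *vm, struct xe_vma_ops *vops);

/* Sketch of the 1-job-per-bind-IOCTL flow with unwind on error */
static int bind_ioctl_flow_sketch(struct xe_vm *vm, struct xe_vma_ops *vops)
{
	int err;

	/* Convert VMA ops to PT ops and stage them; nothing committed yet */
	err = stage_pt_ops(vm, vops);
	if (err)
		goto unwind;

	/* Run all staged PT ops as a single atomic job */
	err = run_pt_ops_job(vm, vops);
	if (err)
		goto unwind;

	/* Only now make the new PT and GPUVM state official */
	commit_pt_ops(vm, vops);
	commit_vma_ops(vm, vops);
	return 0;

unwind:
	/* Restore the original PT state, then the GPUVM state above it */
	unwind_pt_ops(vm, vops);
	unwind_vma_ops(vm, vops);
	return err;
}

If that is right, an error anywhere before the commit step leaves the VM
exactly as it was, rather than killing it as the serial path on tip does.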

Oak

> 
> Matt
> 
> >
> > Oak
> >
> > >
> > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > ---
> > >  drivers/gpu/drm/xe/xe_pt_types.h | 12 ++++++
> > >  drivers/gpu/drm/xe/xe_vm.c       | 66 +++++++++++++++++++++++++++++---
> > >  drivers/gpu/drm/xe/xe_vm_types.h |  8 ++++
> > >  3 files changed, 81 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_pt_types.h b/drivers/gpu/drm/xe/xe_pt_types.h
> > > index cee70cb0f014..2093150f461e 100644
> > > --- a/drivers/gpu/drm/xe/xe_pt_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_pt_types.h
> > > @@ -74,4 +74,16 @@ struct xe_vm_pgtable_update {
> > >  	u32 flags;
> > >  };
> > >
> > > +/** struct xe_vm_pgtable_update_op - Page table update operation */
> > > +struct xe_vm_pgtable_update_op {
> > > +	/** @entries: entries to update for this operation */
> > > +	struct xe_vm_pgtable_update entries[XE_VM_MAX_LEVEL * 2 + 1];
> > > +	/** @num_entries: number of entries for this update operation */
> > > +	u32 num_entries;
> > > +	/** @bind: is a bind */
> > > +	bool bind;
> > > +	/** @rebind: is a rebind */
> > > +	bool rebind;
> > > +};
> > > +
> > >  #endif
> > > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > > index 5b93c71fc5e9..72e9bdab79d5 100644
> > > --- a/drivers/gpu/drm/xe/xe_vm.c
> > > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > > @@ -1319,6 +1319,42 @@ static void xe_vma_ops_init(struct xe_vma_ops *vops, struct xe_vm *vm,
> > >  	vops->num_syncs = num_syncs;
> > >  }
> > >
> > > +static int xe_vma_ops_alloc(struct xe_vma_ops *vops)
> > > +{
> > > +	int i;
> > > +
> > > +	for (i = 0; i < XE_MAX_TILES_PER_DEVICE; ++i) {
> > > +		if (!vops->pt_update_ops[i].num_ops)
> > > +			continue;
> > > +
> > > +		vops->pt_update_ops[i].ops =
> > > +			kmalloc_array(vops->pt_update_ops[i].num_ops,
> > > +				      sizeof(*vops->pt_update_ops[i].ops),
> > > +				      GFP_KERNEL);
> > > +		if (!vops->pt_update_ops[i].ops)
> > > +			return -ENOMEM;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static void xe_vma_ops_fini(struct xe_vma_ops *vops)
> > > +{
> > > +	int i;
> > > +
> > > +	for (i = 0; i < XE_MAX_TILES_PER_DEVICE; ++i)
> > > +		kfree(vops->pt_update_ops[i].ops);
> > > +}
> > > +
> > > +static void xe_vma_ops_incr_pt_update_ops(struct xe_vma_ops *vops, u8
> > > tile_mask)
> > > +{
> > > +	int i;
> > > +
> > > +	for (i = 0; i < XE_MAX_TILES_PER_DEVICE; ++i)
> > > +		if (BIT(i) & tile_mask)
> > > +			++vops->pt_update_ops[i].num_ops;
> > > +}
> > > +
> > >  struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
> > >  {
> > >  	struct drm_gem_object *vm_resv_obj;
> > > @@ -1343,6 +1379,11 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
> > >  	xe_vma_ops_init(&vm->dummy_ops.vops, vm, NULL, NULL, 0);
> > >  	INIT_LIST_HEAD(&vm->dummy_ops.op.link);
> > >  	list_add(&vm->dummy_ops.op.link, &vm->dummy_ops.vops.list);
> > > +	for (id = 0; id < XE_MAX_TILES_PER_DEVICE; ++id)
> > > +		vm->dummy_ops.vops.pt_update_ops[id].num_ops = 1;
> > > +	err = xe_vma_ops_alloc(&vm->dummy_ops.vops);
> > > +	if (err)
> > > +		goto err_free;
> > >
> > >  	INIT_LIST_HEAD(&vm->rebind_list);
> > >
> > > @@ -1468,12 +1509,14 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
> > >  	return ERR_PTR(err);
> > >
> > >  err_no_resv:
> > > -	mutex_destroy(&vm->snap_mutex);
> > > +	if (!(flags & XE_VM_FLAG_MIGRATION))
> > > +		xe_device_mem_access_put(xe);
> > >  	for_each_tile(tile, xe, id)
> > >  		xe_range_fence_tree_fini(&vm->rftree[id]);
> > > +err_free:
> > > +	mutex_destroy(&vm->snap_mutex);
> > > +	xe_vma_ops_fini(&vm->dummy_ops.vops);
> > >  	kfree(vm);
> > > -	if (!(flags & XE_VM_FLAG_MIGRATION))
> > > -		xe_device_mem_access_put(xe);
> > >  	return ERR_PTR(err);
> > >  }
> > >
> > > @@ -1611,6 +1654,7 @@ static void vm_destroy_work_func(struct work_struct *w)
> > >
> > >  	trace_xe_vm_free(vm);
> > >  	dma_fence_put(vm->rebind_fence);
> > > +	xe_vma_ops_fini(&vm->dummy_ops.vops);
> > >  	kfree(vm);
> > >  }
> > >
> > > @@ -2284,7 +2328,6 @@ static int xe_vma_op_commit(struct xe_vm *vm, struct xe_vma_op *op)
> > >  	return err;
> > >  }
> > >
> > > -
> > >  static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_exec_queue *q,
> > >  				   struct drm_gpuva_ops *ops,
> > >  				   struct xe_sync_entry *syncs, u32 num_syncs,
> > > @@ -2334,6 +2377,9 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_exec_queue *q,
> > >  				return PTR_ERR(vma);
> > >
> > >  			op->map.vma = vma;
> > > +			if (op->map.immediate || !xe_vm_in_fault_mode(vm))
> > > +				xe_vma_ops_incr_pt_update_ops(vops,
> > > +							      op->tile_mask);
> > >  			break;
> > >  		}
> > >  		case DRM_GPUVA_OP_REMAP:
> > > @@ -2378,6 +2424,8 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_exec_queue *q,
> > > 					vm_dbg(&xe->drm, "REMAP:SKIP_PREV: addr=0x%016llx, range=0x%016llx",
> > >  					       (ULL)op->remap.start,
> > >  					       (ULL)op->remap.range);
> > > +				} else {
> > > +					xe_vma_ops_incr_pt_update_ops(vops,
> > > +								      op->tile_mask);
> > >  				}
> > >  			}
> > >
> > > @@ -2414,13 +2462,16 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_exec_queue *q,
> > > 					vm_dbg(&xe->drm, "REMAP:SKIP_NEXT: addr=0x%016llx, range=0x%016llx",
> > >  					       (ULL)op->remap.start,
> > >  					       (ULL)op->remap.range);
> > > +				} else {
> > > +					xe_vma_ops_incr_pt_update_ops(vops,
> > > +								      op->tile_mask);
> > >  				}
> > >  			}
> > > +			xe_vma_ops_incr_pt_update_ops(vops, op->tile_mask);
> > >  			break;
> > >  		}
> > >  		case DRM_GPUVA_OP_UNMAP:
> > >  		case DRM_GPUVA_OP_PREFETCH:
> > > -			/* Nothing to do */
> > > +			xe_vma_ops_incr_pt_update_ops(vops, op->tile_mask);
> > >  			break;
> > >  		default:
> > >  			drm_warn(&vm->xe->drm, "NOT POSSIBLE");
> > > @@ -3243,11 +3294,16 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> > >  		goto unwind_ops;
> > >  	}
> > >
> > > +	err = xe_vma_ops_alloc(&vops);
> > > +	if (err)
> > > +		goto unwind_ops;
> > > +
> > >  	err = vm_bind_ioctl_ops_execute(vm, &vops);
> > >
> > >  unwind_ops:
> > >  	if (err && err != -ENODATA)
> > >  		vm_bind_ioctl_ops_unwind(vm, ops, args->num_binds);
> > > +	xe_vma_ops_fini(&vops);
> > >  	for (i = args->num_binds - 1; i >= 0; --i)
> > >  		if (ops[i])
> > >  			drm_gpuva_ops_free(&vm->gpuvm, ops[i]);
> > > diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
> > > index f6ea4df64e79..f5d740dcbba3 100644
> > > --- a/drivers/gpu/drm/xe/xe_vm_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_vm_types.h
> > > @@ -22,6 +22,7 @@ struct xe_device;
> > >  struct xe_sync_entry;
> > >  struct xe_user_fence;
> > >  struct xe_vm;
> > > +struct xe_vm_pgtable_update_op;
> > >
> > >  #define XE_VMA_READ_ONLY	DRM_GPUVA_USERBITS
> > >  #define XE_VMA_DESTROYED	(DRM_GPUVA_USERBITS << 1)
> > > @@ -219,6 +220,13 @@ struct xe_vma_ops {
> > >  	struct xe_sync_entry *syncs;
> > >  	/** @num_syncs: number of syncs */
> > >  	u32 num_syncs;
> > > +	/** @pt_update_ops: page table update operations */
> > > +	struct {
> > > +		/** @ops: operations */
> > > +		struct xe_vm_pgtable_update_op *ops;
> > > +		/** @num_ops: number of operations */
> > > +		u32 num_ops;
> > > +	} pt_update_ops[XE_MAX_TILES_PER_DEVICE];
> > >  };
> > >
> > >  struct xe_vm {
> > > --
> > > 2.34.1
> >
