From: "Zeng, Oak" <oak.zeng@intel.com>
To: "Zeng, Oak" <oak.zeng@intel.com>,
	"Welty, Brian" <brian.welty@intel.com>,
	 "intel-xe@lists.freedesktop.org"
	<intel-xe@lists.freedesktop.org>
Cc: "Hellstrom, Thomas" <thomas.hellstrom@intel.com>,
	"Brost, Matthew" <matthew.brost@intel.com>,
	"airlied@gmail.com" <airlied@gmail.com>,
	"Ghimiray, Himal Prasad" <himal.prasad.ghimiray@intel.com>
Subject: RE: [PATCH 1/5] drm/xe/svm: Remap and provide memmap backing for GPU vram
Date: Fri, 15 Mar 2024 03:16:10 +0000
Message-ID: <SA1PR11MB6991CD16EE87B670DE646D4892282@SA1PR11MB6991.namprd11.prod.outlook.com>
In-Reply-To: <SA1PR11MB6991D7E3792624E06A8334C692282@SA1PR11MB6991.namprd11.prod.outlook.com>



> -----Original Message-----
> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of Zeng, Oak
> Sent: Thursday, March 14, 2024 11:10 PM
> To: Welty, Brian <brian.welty@intel.com>; intel-xe@lists.freedesktop.org
> Cc: Hellstrom, Thomas <thomas.hellstrom@intel.com>; Brost, Matthew
> <matthew.brost@intel.com>; airlied@gmail.com; Ghimiray, Himal Prasad
> <himal.prasad.ghimiray@intel.com>
> Subject: RE: [PATCH 1/5] drm/xe/svm: Remap and provide memmap backing for
> GPU vram
> 
> 
> 
> > -----Original Message-----
> > From: Welty, Brian <brian.welty@intel.com>
> > Sent: Thursday, March 14, 2024 9:46 PM
> > To: Zeng, Oak <oak.zeng@intel.com>; intel-xe@lists.freedesktop.org
> > Cc: Hellstrom, Thomas <thomas.hellstrom@intel.com>; Brost, Matthew
> > <matthew.brost@intel.com>; airlied@gmail.com; Ghimiray, Himal Prasad
> > <himal.prasad.ghimiray@intel.com>
> > Subject: Re: [PATCH 1/5] drm/xe/svm: Remap and provide memmap backing
> for
> > GPU vram
> >
> > Hi Oak,
> >
> > On 3/13/2024 8:35 PM, Oak Zeng wrote:
> > > Memory remap GPU vram using devm_memremap_pages, so that each GPU
> > > vram page is backed by a struct page.
> > >
> > > Those struct pages are created to allow hmm to migrate buffers
> > > between GPU vram and CPU system memory using the existing Linux
> > > migration mechanism (i.e., the one that migrates between CPU
> > > system memory and hard disk).
> > >
> > > This is preparation work to enable svm (shared virtual memory)
> > > through the Linux kernel hmm framework. The memory remap's page map
> > > type is set to MEMORY_DEVICE_PRIVATE for now. This means that even
> > > though each GPU vram page gets a struct page and can be mapped in
> > > the CPU page table, such pages are treated as the GPU's private
> > > resource, so the CPU can't access them. If the CPU accesses such a
> > > page, a page fault is triggered and the page is migrated to system
> > > memory.
> > >
> > > For GPU devices which support a coherent memory protocol between
> > > CPU and GPU (such as the CXL and CAPI protocols), we can remap
> > > device memory as MEMORY_DEVICE_COHERENT. This is TBD.
> > >
> > > Signed-off-by: Oak Zeng <oak.zeng@intel.com>
> > > Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> > > Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> > > Cc: Matthew Brost <matthew.brost@intel.com>
> > > Cc: Thomas Hellström <thomas.hellstrom@intel.com>
> > > Cc: Brian Welty <brian.welty@intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/Makefile          |  3 +-
> > >   drivers/gpu/drm/xe/xe_device_types.h |  9 +++
> > >   drivers/gpu/drm/xe/xe_mmio.c         |  8 +++
> > >   drivers/gpu/drm/xe/xe_svm.h          | 14 +++++
> > >   drivers/gpu/drm/xe/xe_svm_devmem.c   | 91 ++++++++++++++++++++++++++++
> > >   5 files changed, 124 insertions(+), 1 deletion(-)
> > >   create mode 100644 drivers/gpu/drm/xe/xe_svm.h
> > >   create mode 100644 drivers/gpu/drm/xe/xe_svm_devmem.c
> > >
> > > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > > index c531210695db..840467080e59 100644
> > > --- a/drivers/gpu/drm/xe/Makefile
> > > +++ b/drivers/gpu/drm/xe/Makefile
> > > @@ -142,7 +142,8 @@ xe-y += xe_bb.o \
> > >   	xe_vram_freq.o \
> > >   	xe_wait_user_fence.o \
> > >   	xe_wa.o \
> > > -	xe_wopcm.o
> > > +	xe_wopcm.o \
> > > +	xe_svm_devmem.o
> >
> > Minor, but maintainers want the list above alphabetically sorted.
> 
> Correct. Matt pointed out the same. Will fix
> >
> > >
> > >   # graphics hardware monitoring (HWMON) support
> > >   xe-$(CONFIG_HWMON) += xe_hwmon.o
> > > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > > index 9785eef2e5a4..f27c3bee8ce7 100644
> > > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > > @@ -99,6 +99,15 @@ struct xe_mem_region {
> > >   	resource_size_t actual_physical_size;
> > >   	/** @mapping: pointer to VRAM mappable space */
> > >   	void __iomem *mapping;
> > > +	/** @pagemap: Used to remap device memory as ZONE_DEVICE */
> > > +	struct dev_pagemap pagemap;
> > > +	/**
> > > +	 * @hpa_base: base host physical address
> > > +	 *
> > > +	 * This is generated when remapping device memory as ZONE_DEVICE
> > > +	 */
> > > +	resource_size_t hpa_base;
> > > +
> > >   };
> > >
> > >   /**
> > > diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
> > > index e3db3a178760..0d795394bc4c 100644
> > > --- a/drivers/gpu/drm/xe/xe_mmio.c
> > > +++ b/drivers/gpu/drm/xe/xe_mmio.c
> > > @@ -22,6 +22,7 @@
> > >   #include "xe_module.h"
> > >   #include "xe_sriov.h"
> > >   #include "xe_tile.h"
> > > +#include "xe_svm.h"
> > >
> > >   #define XEHP_MTCFG_ADDR		XE_REG(0x101800)
> > >   #define TILE_COUNT		REG_GENMASK(15, 8)
> > > @@ -286,6 +287,7 @@ int xe_mmio_probe_vram(struct xe_device *xe)
> > >   		}
> > >
> > >   		io_size -= min_t(u64, tile_size, io_size);
> > > +		xe_svm_devm_add(tile, &tile->mem.vram);
> >
> > I think a slightly more appropriate call site for this might be
> > xe_tile_init_noalloc(), as that function states it is preparing the
> > tile for VRAM allocations.
> > Also, I mention this because we might like the flexibility in future
> > to call this once for xe_device.mem.vram, instead of calling it for
> > each tile, and it is easier to handle that in xe_tile.c instead of
> > xe_mmio.c.
> 
> Good point. Will follow.

Sorry, given my comment below about pagemap.owner, do you still want it called in xe_tile_init_noalloc()?

For UMA, we only need to call it once. If we do it in xe_tile_init_noalloc(), it would be called once per tile, i.e. multiple times. Right? Something like the sketch below is what I have in mind.
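
A rough sketch, untested; the has_numa_vram flag and the xe_tile_devm_init() helper name are made-up placeholders, and xe_devm_add() assumes the rename discussed further below:

static int xe_tile_devm_init(struct xe_device *xe)
{
	struct xe_tile *tile;
	u8 id;
	int err;

	/* UMA: one pagemap covers the whole device vram, add it once */
	if (!xe->info.has_numa_vram)
		return xe_devm_add(xe_device_get_root_tile(xe),
				   &xe->mem.vram);

	/* NUMA: one pagemap per tile's vram region */
	for_each_tile(tile, xe, id) {
		err = xe_devm_add(tile, &tile->mem.vram);
		if (err)
			return err;
	}

	return 0;
}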

Oak

> >
> > Related comment below.
> >
> >
> > >   	}
> > >
> > >   	xe->mem.vram.actual_physical_size = total_size;
> > > @@ -354,10 +356,16 @@ void xe_mmio_probe_tiles(struct xe_device *xe)
> > >   static void mmio_fini(struct drm_device *drm, void *arg)
> > >   {
> > >   	struct xe_device *xe = arg;
> > > +	struct xe_tile *tile;
> > > +	u8 id;
> > >
> > >   	pci_iounmap(to_pci_dev(xe->drm.dev), xe->mmio.regs);
> > >   	if (xe->mem.vram.mapping)
> > >   		iounmap(xe->mem.vram.mapping);
> > > +
> > > +	for_each_tile(tile, xe, id)
> > > +		xe_svm_devm_remove(xe, &tile->mem.vram);
> > > +
> > >   }
> > >
> > >   static int xe_verify_lmem_ready(struct xe_device *xe)
> > > diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
> > > new file mode 100644
> > > index 000000000000..09f9afb0e7d4
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/xe/xe_svm.h
> > > @@ -0,0 +1,14 @@
> > > +/* SPDX-License-Identifier: MIT */
> > > +/*
> > > + * Copyright © 2023 Intel Corporation
> > > + */
> > > +
> > > +#ifndef __XE_SVM_H
> > > +#define __XE_SVM_H
> > > +
> > > +#include "xe_device_types.h"
> > > +
> > > +int xe_svm_devm_add(struct xe_tile *tile, struct xe_mem_region *mem);
> > > +void xe_svm_devm_remove(struct xe_device *xe, struct xe_mem_region *mem);
> > > +
> > > +#endif
> > > diff --git a/drivers/gpu/drm/xe/xe_svm_devmem.c b/drivers/gpu/drm/xe/xe_svm_devmem.c
> > > new file mode 100644
> > > index 000000000000..63b7a1961cc6
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/xe/xe_svm_devmem.c
> > > @@ -0,0 +1,91 @@
> > > +// SPDX-License-Identifier: MIT
> > > +/*
> > > + * Copyright © 2023 Intel Corporation
> > > + */
> > > +
> > > +#include <linux/mm_types.h>
> > > +#include <linux/sched/mm.h>
> > > +
> > > +#include "xe_device_types.h"
> > > +#include "xe_trace.h"
> > > +#include "xe_svm.h"
> > > +
> > > +
> > > +static vm_fault_t xe_devm_migrate_to_ram(struct vm_fault *vmf)
> > > +{
> > > +	return 0;
> > > +}
> > > +
> > > +static void xe_devm_page_free(struct page *page)
> > > +{
> > > +}
> > > +
> > > +static const struct dev_pagemap_ops xe_devm_pagemap_ops = {
> > > +	.page_free = xe_devm_page_free,
> > > +	.migrate_to_ram = xe_devm_migrate_to_ram,
> > > +};
> > > +
> > > +/**
> > > + * xe_svm_devm_add: Remap and provide memmap backing for device memory
> >
> > Do we really need 'svm' in function name?
> 
> Good point. I will remove svm.
> >
> > > + * @tile: tile that the memory region belongs to
> >
> > We might like flexibility in future to call this once for
> > xe_device.mem.vram, instead of calling for each tile.
> > So can we remove the tile argument, and just pass the xe_device pointer
> > and tile->id ?
> 
> This is interesting.
> 
> First of all, I programmed this wrong below:
> mr->pagemap.owner = tile->xe->drm.dev;
> 
> For a NUMA vram system, this should be: mr->pagemap.owner = tile;
> 
> For UMA vram, this should be: mr->pagemap.owner = tile_to_xe(tile);
> 
> This owner is important. It is used later by hmm to decide migration;
> we need to set the owner so that hmm can identify the different vram
> regions.
> 
> Based on the above, I think the tile parameter is better. For UMA, the
> caller needs to make sure to call it only once; any tile pointer should
> work. This does sound a little weird, but I don't have a better idea.
> The owner selection would look roughly like the sketch below.
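> 
> A minimal sketch (untested, and using tile_count > 1 as the NUMA
> condition is my assumption):
> 
> 	/*
> 	 * hmm and the migrate code match this owner against
> 	 * hmm_range.dev_private_owner / migrate_vma.pgmap_owner, so it
> 	 * must uniquely identify the vram region a page belongs to.
> 	 */
> 	if (tile->xe->info.tile_count > 1)
> 		mr->pagemap.owner = tile;		/* NUMA: per-tile vram */
> 	else
> 		mr->pagemap.owner = tile_to_xe(tile);	/* UMA: one vram region */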
> 
> Oak
> >
> >
> > > + * @mr: memory region to remap
> > > + *
> > > + * This remaps device memory to the host physical address space and
> > > + * creates struct pages to back the device memory.
> > > + *
> > > + * Return: 0 on success, standard error code otherwise.
> > > + */
> > > +int xe_svm_devm_add(struct xe_tile *tile, struct xe_mem_region *mr)
> > > +{
> > > +	struct device *dev = &to_pci_dev(tile->xe->drm.dev)->dev;
> > > +	struct resource *res;
> > > +	void *addr;
> > > +	int ret;
> > > +
> > > +	res = devm_request_free_mem_region(dev, &iomem_resource,
> > > +					   mr->usable_size);
> > > +	if (IS_ERR(res)) {
> > > +		ret = PTR_ERR(res);
> > > +		return ret;
> > > +	}
> > > +
> > > +	mr->pagemap.type = MEMORY_DEVICE_PRIVATE;
> > > +	mr->pagemap.range.start = res->start;
> > > +	mr->pagemap.range.end = res->end;
> > > +	mr->pagemap.nr_range = 1;
> > > +	mr->pagemap.ops = &xe_devm_pagemap_ops;
> > > +	mr->pagemap.owner = tile->xe->drm.dev;
> > > +	addr = devm_memremap_pages(dev, &mr->pagemap);
> > > +	if (IS_ERR(addr)) {
> > > +		devm_release_mem_region(dev, res->start, resource_size(res));
> > > +		ret = PTR_ERR(addr);
> > > +		drm_err(&tile->xe->drm, "Failed to remap tile %d memory, errno %d\n",
> > > +			tile->id, ret);
> > > +		return ret;
> > > +	}
> > > +	mr->hpa_base = res->start;
> > > +
> > > +	drm_info(&tile->xe->drm, "Added tile %d memory [%llx-%llx] to devm, remapped to %pr\n",
> > > +		 tile->id, mr->io_start, mr->io_start + mr->usable_size, res);
> > > +	return 0;
> > > +}
> > > +
> > > +/**
> > > + * xe_svm_devm_remove: Unmap device memory and free resources
> > > + * @xe: xe device
> > > + * @mr: memory region to remove
> > > + */
> > > +void xe_svm_devm_remove(struct xe_device *xe, struct xe_mem_region *mr)
> > > +{
> > > +	/* FIXME: the code below causes a kernel hang during module remove */
> > > +#if 0
> > > +	struct device *dev = &to_pci_dev(xe->drm.dev)->dev;
> > > +
> > > +	if (mr->hpa_base) {
> > > +		devm_memunmap_pages(dev, &mr->pagemap);
> > > +		devm_release_mem_region(dev, mr->pagemap.range.start,
> > > +			mr->pagemap.range.end - mr->pagemap.range.start + 1);
> > > +	}
> > > +#endif
> > > +}
> > > +
