linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>
To: akpm@linux-foundation.org, Felix.Kuehling@amd.com,
	linux-mm@kvack.org, rcampbell@nvidia.com,
	linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	hch@lst.de, jgg@nvidia.com, jglisse@redhat.com
Subject: Re: [PATCH v3 0/8] Support DEVICE_GENERIC memory in migrate_vma_*
Date: Thu, 17 Jun 2021 10:56:03 -0500	[thread overview]
Message-ID: <6c49988d-e158-8297-d7e4-97db279458cd@amd.com> (raw)
In-Reply-To: <20210617151705.15367-1-alex.sierra@amd.com>


On 6/17/2021 10:16 AM, Alex Sierra wrote:
> v1:
> AMD is building a system architecture for the Frontier supercomputer with a
> coherent interconnect between CPUs and GPUs. This hardware architecture allows
> the CPUs to coherently access GPU device memory. We have hardware in our labs
> and we are working with our partner HPE on the BIOS, firmware and software
> for delivery to the DOE.
>
> The system BIOS advertises the GPU device memory (aka VRAM) as SPM
> (special purpose memory) in the UEFI system address map. The amdgpu driver looks
> it up with lookup_resource and registers it with devmap as MEMORY_DEVICE_GENERIC
> using devm_memremap_pages.
>
> Now we're trying to migrate data to and from that memory using the migrate_vma_*
> helpers so we can support page-based migration in our unified memory allocations,
> while also supporting CPU access to those pages.
>
> This patch series makes a few changes to make MEMORY_DEVICE_GENERIC pages behave
> correctly in the migrate_vma_* helpers. We are looking for feedback about this
> approach. If we're close, what's needed to make our patches acceptable upstream?
> If we're not close, any suggestions how else to achieve what we are trying to do
> (i.e. page migration and coherent CPU access to VRAM)?
>
> This work is based on HMM and our SVM memory manager that was recently upstreamed
> to Dave Airlie's drm-next branch
> https://lore.kernel.org/dri-devel/20210527205606.2660-6-Felix.Kuehling@amd.com/T/#r996356015e295780eb50453e7dbd5d0d68b47cbc
Corrected link:

https://cgit.freedesktop.org/drm/drm/log/?h=drm-next

Regards,
Alex Sierra

> On top of that we did some rework of our VRAM management for migrations to remove
> some incorrect assumptions, allow partially successful migrations and GPU memory
> mappings that mix pages in VRAM and system memory.
> https://patchwork.kernel.org/project/dri-devel/list/?series=489811

Corrected link:

https://lore.kernel.org/dri-devel/20210527205606.2660-6-Felix.Kuehling@amd.com/T/#r996356015e295780eb50453e7dbd5d0d68b47cbc

Regards,
Alex Sierra

>
> v2:
> This patch series version has merged "[RFC PATCH v3 0/2]
> mm: remove extra ZONE_DEVICE struct page refcount" patch series made by
> Ralph Campbell. It also applies at the top of these series, our changes
> to support device generic type in migration_vma helpers.
> This has been tested in systems with device memory that has coherent
> access by CPU.
>
> Also addresses the following feedback made in v1:
> - Isolate in one patch kernel/resource.c modification, based
> on Christoph's feedback.
> - Add helpers check for generic and private type to avoid
> duplicated long lines.
>
> v3:
> - Include cover letter from v1
> - Rename dax_layout_is_idle_page func to dax_page_unused in patch
> ext4/xfs: add page refcount helper
>
> Patches 1-2 Rebased Ralph Campbell's ZONE_DEVICE page refcounting patches
> Patches 4-5 are for context to show how we are looking up the SPM
> memory and registering it with devmap.
> Patches 3,6-8 are the changes we are trying to upstream or rework to
> make them acceptable upstream.
>
> Alex Sierra (6):
>    kernel: resource: lookup_resource as exported symbol
>    drm/amdkfd: add SPM support for SVM
>    drm/amdkfd: generic type as sys mem on migration to ram
>    include/linux/mm.h: helpers to check zone device generic type
>    mm: add generic type support to migrate_vma helpers
>    mm: call pgmap->ops->page_free for DEVICE_GENERIC pages
>
> Ralph Campbell (2):
>    ext4/xfs: add page refcount helper
>    mm: remove extra ZONE_DEVICE struct page refcount
>
>   arch/powerpc/kvm/book3s_hv_uvmem.c       |  2 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 15 ++++--
>   drivers/gpu/drm/nouveau/nouveau_dmem.c   |  2 +-
>   fs/dax.c                                 |  8 +--
>   fs/ext4/inode.c                          |  5 +-
>   fs/xfs/xfs_file.c                        |  4 +-
>   include/linux/dax.h                      | 10 ++++
>   include/linux/memremap.h                 |  7 +--
>   include/linux/mm.h                       | 52 +++---------------
>   kernel/resource.c                        |  2 +-
>   lib/test_hmm.c                           |  2 +-
>   mm/internal.h                            |  8 +++
>   mm/memremap.c                            | 69 +++++++-----------------
>   mm/migrate.c                             | 13 ++---
>   mm/page_alloc.c                          |  3 ++
>   mm/swap.c                                | 45 ++--------------
>   16 files changed, 83 insertions(+), 164 deletions(-)
>


  parent reply	other threads:[~2021-06-17 15:56 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-17 15:16 [PATCH v3 0/8] Support DEVICE_GENERIC memory in migrate_vma_* Alex Sierra
2021-06-17 15:16 ` [PATCH v3 1/8] ext4/xfs: add page refcount helper Alex Sierra
2021-06-17 16:11   ` Darrick J. Wong
2021-06-17 23:52   ` Dave Chinner
2021-06-19 16:14     ` Sierra Guiza, Alejandro (Alex)
2021-06-19 18:31   ` Alex Sierra
2021-06-17 15:16 ` [PATCH v3 2/8] mm: remove extra ZONE_DEVICE struct page refcount Alex Sierra
2021-06-17 19:16   ` Ralph Campbell
2021-06-28 16:46     ` Felix Kuehling
2021-06-30  0:23       ` Ralph Campbell
2021-06-17 15:17 ` [PATCH v3 3/8] kernel: resource: lookup_resource as exported symbol Alex Sierra
2021-06-17 15:17 ` [PATCH v3 4/8] drm/amdkfd: add SPM support for SVM Alex Sierra
2021-06-17 15:17 ` [PATCH v3 5/8] drm/amdkfd: generic type as sys mem on migration to ram Alex Sierra
2021-06-17 15:17 ` [PATCH v3 6/8] include/linux/mm.h: helpers to check zone device generic type Alex Sierra
2021-06-17 15:17 ` [PATCH v3 7/8] mm: add generic type support to migrate_vma helpers Alex Sierra
2021-06-17 15:17 ` [PATCH v3 8/8] mm: call pgmap->ops->page_free for DEVICE_GENERIC pages Alex Sierra
2021-06-17 15:56 ` Sierra Guiza, Alejandro (Alex) [this message]
2021-06-20 14:14 ` [PATCH v3 0/8] Support DEVICE_GENERIC memory in migrate_vma_* Theodore Ts'o
2021-06-23 21:49   ` Felix Kuehling
2021-06-24  5:30     ` Christoph Hellwig
2021-06-24 15:08       ` Felix Kuehling
2021-07-16 15:07     ` Theodore Y. Ts'o
2021-07-16 22:14       ` Felix Kuehling
2021-07-17 19:54         ` Sierra Guiza, Alejandro (Alex)
2021-07-23 22:46           ` Sierra Guiza, Alejandro (Alex)
2021-07-30 19:02             ` Felix Kuehling

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6c49988d-e158-8297-d7e4-97db279458cd@amd.com \
    --to=alex.sierra@amd.com \
    --cc=Felix.Kuehling@amd.com \
    --cc=akpm@linux-foundation.org \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hch@lst.de \
    --cc=jgg@nvidia.com \
    --cc=jglisse@redhat.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=rcampbell@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).