From: Felix Kuehling <felix.kuehling@gmail.com>
To: felix.kuehling@amd.com, akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hch@lst.de, jglisse@redhat.com, jgg@nvidia.com,
	dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org
Subject: [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_*
Date: Thu, 27 May 2021 19:08:04 -0400
Message-ID: <20210527230809.3701-1-Felix.Kuehling@amd.com>

AMD is building a system architecture for the Frontier supercomputer with
a coherent interconnect between CPUs and GPUs. This hardware architecture
allows the CPUs to coherently access GPU device memory. We have hardware
in our labs and we are working with our partner HPE on the BIOS, firmware
and software for delivery to the DOE.

The system BIOS advertises the GPU device memory (aka VRAM) as SPM
(special purpose memory) in the UEFI system address map. The amdgpu driver
looks it up with lookup_resource() and registers it with devmap as
MEMORY_DEVICE_GENERIC using devm_memremap_pages().
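
The lookup-then-register flow described above can be modeled in userspace
C. This is only a sketch: every sk_* name is a hypothetical stand-in, not a
kernel symbol. The real lookup_resource() walks the kernel resource tree,
and the real devm_memremap_pages() takes a struct dev_pagemap and creates
struct pages for the range. The model only shows the order of operations
and where the memory type is set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel types (userspace model only). */
enum sk_memory_type { SK_DEVICE_NONE = 0, SK_DEVICE_PRIVATE, SK_DEVICE_GENERIC };

struct sk_resource {
	uint64_t start;
	uint64_t end;		/* inclusive, like the kernel's struct resource */
	const char *name;
};

struct sk_pagemap {
	enum sk_memory_type type;
	uint64_t start;
	uint64_t end;
	int registered;
};

/* Model of a lookup by start address: return the entry whose range
 * begins exactly at 'start', or NULL if there is none. */
static const struct sk_resource *
sk_lookup_resource(const struct sk_resource *map, size_t n, uint64_t start)
{
	for (size_t i = 0; i < n; i++)
		if (map[i].start == start)
			return &map[i];
	return NULL;
}

/* Model of registration: tag the range as generic device memory.
 * Returns 0 on success, -1 on a missing or malformed resource. */
static int sk_register_generic(struct sk_pagemap *pgmap,
			       const struct sk_resource *res)
{
	assert(pgmap != NULL);
	if (!res || res->end < res->start)
		return -1;
	pgmap->type = SK_DEVICE_GENERIC;
	pgmap->start = res->start;
	pgmap->end = res->end;
	pgmap->registered = 1;
	return 0;
}
```
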

Now we're trying to migrate data to and from that memory using the
migrate_vma_* helpers so we can support page-based migration in our
unified memory allocations, while also supporting CPU access to those
pages.
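
As a rough illustration of the source and destination arrays that the
migrate_vma_* helpers exchange with a driver, here is a userspace model in
C. The SK_* and sk_* names are hypothetical stand-ins. The flag values are
modeled on the kernel's MIGRATE_PFN_* encoding of that era and should be
treated as an assumption. Page locking and the actual data copy, which a
real driver must handle, are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; values are assumptions modeled on MIGRATE_PFN_*. */
#define SK_PFN_VALID	(1UL << 0)
#define SK_PFN_MIGRATE	(1UL << 1)
#define SK_PFN_SHIFT	6

/* Encode a page frame number as a valid entry of a src/dst array. */
static unsigned long sk_migrate_pfn(unsigned long pfn)
{
	return (pfn << SK_PFN_SHIFT) | SK_PFN_VALID;
}

/*
 * Fill dst[] for every src[] entry that was marked migratable, taking
 * destination pfns from 'pool'. Entries that were not collected, or for
 * which the pool has run dry, get dst[i] = 0 and stay where they are.
 * Returns the number of destination pages handed out.
 */
static size_t sk_fill_dst(const unsigned long *src, unsigned long *dst,
			  size_t npages,
			  const unsigned long *pool, size_t pool_len)
{
	size_t used = 0;

	assert(src != NULL && dst != NULL);
	for (size_t i = 0; i < npages; i++) {
		dst[i] = 0;
		if (!(src[i] & SK_PFN_MIGRATE))
			continue;	/* page was not collected for migration */
		if (used == pool_len)
			continue;	/* no destination page left: skip it */
		dst[i] = sk_migrate_pfn(pool[used++]);
	}
	return used;
}
```

Leaving dst[i] at zero for some entries is how this model represents a
partially successful migration: those pages are simply not moved.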

This patch series makes a few changes to make MEMORY_DEVICE_GENERIC pages
behave correctly in the migrate_vma_* helpers. We are looking for feedback
about this approach. If we're close, what's needed to make our patches
acceptable upstream? If we're not close, are there any suggestions for how
else to achieve what we are trying to do (i.e. page migration and coherent
CPU access to VRAM)?

This work is based on HMM and our SVM memory manager, which was recently
upstreamed to Dave Airlie's drm-next branch
[https://cgit.freedesktop.org/drm/drm/log/?h=drm-next]. On top of that we
reworked our VRAM management for migrations to remove some incorrect
assumptions, allow partially successful migrations, and allow GPU memory
mappings that mix pages in VRAM and system memory
[https://patchwork.kernel.org/project/dri-devel/list/?series=489811].

In this RFC, patches 1 and 2 are for context to show how we are looking up
the SPM memory and registering it with devmap.

Patches 3-5 are the changes we want to upstream, reworking them as needed
to make them acceptable.
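
For readers unfamiliar with page-type checks, a helper of the kind that
the subject of patch 3 describes might look like the following userspace
model. This is not the code from the patch: the sk_* names are
hypothetical, and the shape is only patterned after the existing
is_device_private_page() style of predicate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types (userspace model only). */
enum sk_memory_type { SK_DEVICE_PRIVATE = 1, SK_DEVICE_GENERIC };

struct sk_dev_pagemap {
	enum sk_memory_type type;
};

struct sk_page {
	bool zone_device;			/* page lives in the device zone */
	const struct sk_dev_pagemap *pgmap;	/* owning pagemap, if any */
};

/* True only for device-zone pages whose pagemap is of the generic type. */
static bool sk_is_device_generic_page(const struct sk_page *page)
{
	return page && page->zone_device && page->pgmap &&
	       page->pgmap->type == SK_DEVICE_GENERIC;
}
```
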

Alex Sierra (5):
  drm/amdkfd: add SPM support for SVM
  drm/amdkfd: generic type as sys mem on migration to ram
  include/linux/mm.h: helper to check zone device generic type
  mm: add generic type support for device zone page migration
  mm: changes to unref pages with Generic type

 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 15 +++++++++++----
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h     |  1 -
 include/linux/mm.h                       |  8 ++++++++
 kernel/resource.c                        |  2 +-
 mm/memremap.c                            |  5 ++++-
 mm/migrate.c                             | 13 ++++++++-----
 6 files changed, 32 insertions(+), 12 deletions(-)

-- 
2.31.1




Thread overview: 15+ messages
2021-05-27 23:08 Felix Kuehling [this message]
2021-05-27 23:08 ` [RFC PATCH 1/5] drm/amdkfd: add SPM support for SVM Felix Kuehling
2021-05-29  6:38   ` Christoph Hellwig
2021-05-29 18:42     ` Felix Kuehling
2021-05-27 23:08 ` [RFC PATCH 2/5] drm/amdkfd: generic type as sys mem on migration to ram Felix Kuehling
2021-05-27 23:08 ` [RFC PATCH 3/5] include/linux/mm.h: helper to check zone device generic type Felix Kuehling
2021-05-27 23:08 ` [RFC PATCH 4/5] mm: add generic type support for device zone page migration Felix Kuehling
2021-05-29  6:40   ` Christoph Hellwig
2021-05-27 23:08 ` [RFC PATCH 5/5] mm: changes to unref pages with Generic type Felix Kuehling
2021-05-29  6:42   ` Christoph Hellwig
2021-05-29 18:44     ` Felix Kuehling
2021-05-28 13:08 ` [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_* Jason Gunthorpe
2021-05-28 15:56   ` Felix Kuehling
2021-05-29  6:41     ` Christoph Hellwig
2021-05-29 18:37       ` Felix Kuehling
