From: Jerome Glisse <jglisse@redhat.com>
To: "Figo.zhang" <figo1802@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>, John Hubbard <jhubbard@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>,
	David Nellans <dnellans@nvidia.com>,
	Balbir Singh <bsingharora@gmail.com>
Subject: Re: [HMM-v25 00/19] HMM (Heterogeneous Memory Management) v25
Date: Wed, 13 Dec 2017 22:16:08 -0500	[thread overview]
Message-ID: <20171214031607.GA17710@redhat.com> (raw)
In-Reply-To: <CAF7GXvrxo2xj==wA_=fXr+9nF0k0Ed123kZXeKWKBHS6TKYNdA@mail.gmail.com>

On Thu, Dec 14, 2017 at 10:48:36AM +0800, Figo.zhang wrote:
> 2017-12-14 0:12 GMT+08:00 Jerome Glisse <jglisse@redhat.com>:
> 
> > On Wed, Dec 13, 2017 at 08:10:42PM +0800, Figo.zhang wrote:

[...]

> > Basic example is without HMM:
> >     mul_mat_on_gpu(float *r, float *a, float *b, unsigned m)
> >     {
> >         gpu_buffer_t gpu_r, gpu_a, gpu_b;
> >
> >         gpu_r = gpu_alloc(m*m*sizeof(float));
> >         gpu_a = gpu_alloc(m*m*sizeof(float));
> >         gpu_b = gpu_alloc(m*m*sizeof(float));
> >         gpu_copy_to(gpu_a, a, m*m*sizeof(float));
> >         gpu_copy_to(gpu_b, b, m*m*sizeof(float));
> >
> >         gpu_mul_mat(gpu_r, gpu_a, gpu_b, m);
> >
> >         gpu_copy_from(gpu_r, r, m*m*sizeof(float));
> >     }
> >
> 
> The traditional workflow is:
> 1. the pointers a, b and r all point to CPU memory
> 2. create/alloc three GPU buffers: gpu_a, gpu_b, gpu_r
> 3. copy CPU memory a and b to GPU memory gpu_a and gpu_b
> 4. let the GPU do the calculation
> 5. copy the result from the GPU buffer (gpu_r) to the CPU buffer (r)
> 
> is that right?

Right.


> > With HMM:
> >     mul_mat_on_gpu(float *r, float *a, float *b, unsigned m)
> >     {
> >         gpu_mul_mat(r, a, b, m);
> >     }
> >
> 
> with HMM the workflow is:
> 1. the CPU has three buffers a, b, r, whose physical addresses are pa, pb, pr,
>      and the GPU has three physical buffers: gpu_a, gpu_b, gpu_r
> 2. the GPU wants to access buffers a and b, which causes a GPU page fault
> 3. the GPU reports the page fault to the CPU
> 4. the CPU handles the GPU page fault:
>                 * unmap the buffers a, b, r (who does it? the GPU driver?)
>                 * copy the contents of buffers a and b to the GPU physical
> buffers: gpu_a, gpu_b
>                 * fill the GPU page table entries for the CPU virtual
> addresses a, b, r with these pages (gpu_a, gpu_b, gpu_r)
> 
> 5. the GPU does the calculation
> 6. the CPU wants to get the result from buffer r, which causes a CPU page fault
> 7. in the CPU page fault handler:
>              * unmap the GPU page table entries for virtual addresses a, b, r
> (who does the unmap? the GPU driver?)
>              * copy the GPU's buffer contents (gpu_a, gpu_b, gpu_r) to the
> CPU buffers (a, b, r)
>              * fill the CPU page table entries: virtual_addr -> buffer
> (pa, pb, pr)
> 8. so the CPU can get the result from buffer r.
> 
> is my guessed workflow right?
> it seems to need two copies, from CPU to GPU, and then GPU to CPU for the result.
> * is it that the CPU and the GPU have the page table concurrently, so
> no page fault occurs?
> * how about the performance? it sounds like it will create lots of page faults.

This is not what happens. Here is the workflow with HMM mirror (note that
physical addresses do not matter here, so I do not even reference them; it is
all about virtual addresses):
 1 There are 3 buffers a, b and r at given virtual addresses; both the CPU and
   the GPU can access them (concurrently or not, this does not matter).
 2 The GPU can fault, so if any virtual address does not have a page table
   entry inside the GPU page table, this triggers a page fault that calls the
   HMM mirror helper to snapshot the CPU page table into the GPU page table.
   If there is no physical memory backing the virtual address (ie the CPU page
   table is also empty for the given virtual address), then the regular page
   fault handler of the kernel is invoked (see the rough sketch below).
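
To make that fault path concrete, here is a rough C-style sketch of what a
driver's GPU page fault handler could look like when it mirrors the CPU page
table. The names (gpu_handle_page_fault, hmm_mirror_snapshot_range,
gpu_update_page_table, struct gpu_device, MAX_FAULT_PAGES) are hypothetical
placeholders for illustration, not the actual HMM or driver API:

    /*
     * Hypothetical sketch of a driver GPU page fault handler following an
     * HMM-mirror-style flow. All names below are placeholders.
     */
    static int gpu_handle_page_fault(struct gpu_device *gpu,
                                     unsigned long start, unsigned long end)
    {
        unsigned long pfns[MAX_FAULT_PAGES];
        int ret;

        /*
         * Snapshot the CPU page table for [start, end). If some virtual
         * addresses have no backing memory yet, the regular kernel page
         * fault path is invoked to populate them first.
         */
        ret = hmm_mirror_snapshot_range(gpu->mirror, start, end, pfns);
        if (ret)
            return ret;

        /*
         * Fill the GPU page table with the same physical pages the CPU
         * page table points to, so both always see the same memory.
         */
        return gpu_update_page_table(gpu, start, end, pfns);
    }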

Without HMM mirror but with ATS/PASID (CCIX or CAPI):
 1 There are 3 buffers a, b and r at given virtual addresses; both the CPU and
   the GPU can access them (concurrently or not, this does not matter).
 2 The GPU uses the exact same page table as the CPU and faults exactly like
   the CPU does on an empty page table entry.

So in the end, with HMM mirror or ATS/PASID you get the same behavior. There
is no complexity like you seem to assume. This is all about virtual addresses.
At any point in time, any given valid virtual address of a process points to a
given physical memory address, and that physical memory address is the same on
both the CPU and the GPU; they are never out of sync (both in the HMM mirror
and in the ATS/PASID case).

The exception is for platforms that do not have the CAPI or CCIX property, ie
cache coherency for CPU access to device memory. On such platforms, when you
migrate a virtual address to use device physical memory, you update the CPU
page table with a special entry. If the CPU tries to access a virtual address
with such a special entry, it triggers a fault and HMM migrates the virtual
address back to regular memory. But this does not apply to CAPI or CCIX
platforms.
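
As an illustration of that non-coherent case, here is a rough sketch of what
happens when the CPU touches an address whose data currently sits in device
memory. The function names (alloc_system_page, device_copy_back,
map_regular_page) are hypothetical placeholders; the real kernel path goes
through the device-private entry fault handling and the driver's migration
callbacks:

    /*
     * Hypothetical sketch: CPU faults on a special (device private) entry
     * and the data is migrated back to regular memory. Placeholder names.
     */
    static int cpu_fault_on_special_entry(struct vm_area_struct *vma,
                                          unsigned long addr,
                                          struct page *device_page)
    {
        struct page *cpu_page;

        /* Allocate regular system memory to migrate the data back into. */
        cpu_page = alloc_system_page(vma, addr);
        if (!cpu_page)
            return VM_FAULT_OOM;

        /* Have the device driver copy the data out of device memory. */
        device_copy_back(device_page, cpu_page);

        /*
         * Replace the special CPU page table entry with a regular entry
         * pointing at the freshly migrated system memory page.
         */
        map_regular_page(vma, addr, cpu_page);
        return 0;
    }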


To minimize page faults, the device driver is encouraged to pre-fault and
prepopulate its page table (in the HMM mirror case). Often the device driver
has enough context information to guess what range of virtual addresses is
about to be accessed by the device, and thus pre-fault it, as in the sketch
below.
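
For example, before submitting a job the driver could walk the ranges it knows
the job will touch and fault them in up front. This reuses the hypothetical
gpu_handle_page_fault() helper from the earlier sketch; struct gpu_job and its
fields are likewise made up for illustration:

    /*
     * Hypothetical sketch: pre-fault the buffers a job is known to use so
     * the device takes few or no page faults while it runs.
     */
    static int gpu_prefault_job_buffers(struct gpu_device *gpu,
                                        struct gpu_job *job)
    {
        int i, ret;

        for (i = 0; i < job->nr_ranges; i++) {
            ret = gpu_handle_page_fault(gpu, job->range[i].start,
                                        job->range[i].end);
            if (ret)
                return ret;
        }
        return 0;
    }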


Hope this clarifies things for you.

Cheers,
Jerome

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: dont@kvack.org
