From: Jerome Glisse <jglisse@redhat.com>
To: "Figo.zhang" <figo1802@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>, John Hubbard <jhubbard@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>,
	David Nellans <dnellans@nvidia.com>,
	Balbir Singh <bsingharora@gmail.com>
Subject: Re: [HMM-v25 00/19] HMM (Heterogeneous Memory Management) v25
Date: Thu, 14 Dec 2017 10:28:35 -0500	[thread overview]
Message-ID: <20171214152834.GA25092@redhat.com> (raw)
In-Reply-To: <CAF7GXvpuvrfRHBBrQ4ADz+ma_=z6T0+9j3As-GBTtS+gNqfZXA@mail.gmail.com>

On Thu, Dec 14, 2017 at 03:05:39PM +0800, Figo.zhang wrote:
> 2017-12-14 12:16 GMT+08:00 Jerome Glisse <jglisse@redhat.com>:
> > On Thu, Dec 14, 2017 at 11:53:40AM +0800, Figo.zhang wrote:
> > > 2017-12-14 11:16 GMT+08:00 Jerome Glisse <jglisse@redhat.com>:
> > > > On Thu, Dec 14, 2017 at 10:48:36AM +0800, Figo.zhang wrote:
> > > > > 2017-12-14 0:12 GMT+08:00 Jerome Glisse <jglisse@redhat.com>:
> > > > > > On Wed, Dec 13, 2017 at 08:10:42PM +0800, Figo.zhang wrote:

[...]

> > This slide is for the case where you use device memory on a PCIe platform.
> > When that happens, only the device can access the virtual addresses backed
> > by device memory. If the CPU tries to access such an address, a page fault
> > is triggered and the data is migrated back to regular memory, where both
> > GPU and CPU can access it concurrently.
> >
> > And again, this behavior only happens if you use the HMM non cache coherent
> > device memory model. If you use the cache coherent device memory model with
> > HMM, then the CPU can access the device memory directly too and the above
> > scenario never happens.
> >
> > Note that memory copies when data moves from device to system memory, or
> > from system to device memory, are inevitable. This is exactly as with
> > autoNUMA. Also note that in some cases things can get allocated directly on
> > the GPU and never copied back to regular memory (only used by the GPU and
> > freed once the GPU is done with them): the zero copy case. But I want to
> > stress that the zero copy case is unlikely to happen for input buffers.
> > Usually you do not get your input data set directly on the GPU but from
> > network or disk, and you might do pre-processing on the CPU (uncompress the
> > input, or do something else that is better done on the CPU). Then you feed
> > your data to the GPU and do the computation there.
> >
> 
> Great, very detailed explanation of HMM, thanks a lot.
> So would you check whether my conclusions are correct?
> * if CCIX/CAPI is supported, the CPU can access GPU memory directly, and the
> GPU can also access CPU memory directly,
> so there is no need to copy in kernel space in the HMM solution.

Yes, but migration does imply a copy. The physical address backing a virtual
address can change over the lifetime of that virtual address (between mmap and
munmap), as a result of various activity (autoNUMA, compaction, swap out then
swap back in, ...) and, in the case that interests us, as the result of a
device driver migrating things to its device memory.
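
(As an illustration, and not part of this patch series: you can observe this
from user space by reading /proc/self/pagemap, which reports the physical
frame currently backing a virtual address. After swap out/in, compaction,
autoNUMA or a driver migration, the same virtual address can report a
different PFN. Note the kernel zeroes the PFN field unless you have
CAP_SYS_ADMIN.)

/* pagemap_peek.c - print the physical frame backing a virtual address */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static uint64_t pfn_of(void *addr)
{
	uint64_t entry = 0;
	long pagesize = sysconf(_SC_PAGESIZE);
	off_t off = ((uintptr_t)addr / pagesize) * sizeof(entry);
	int fd = open("/proc/self/pagemap", O_RDONLY);

	if (fd < 0 || pread(fd, &entry, sizeof(entry), off) != sizeof(entry)) {
		perror("pagemap");
		exit(1);
	}
	close(fd);
	/* bit 63 = page present, bits 0-54 = PFN (0 without CAP_SYS_ADMIN) */
	return (entry >> 63) ? (entry & ((1ULL << 55) - 1)) : 0;
}

int main(void)
{
	char *buf = malloc(4096);

	memset(buf, 1, 4096);	/* fault the page in */
	printf("PFN now backing buf: %llx\n", (unsigned long long)pfn_of(buf));
	/* run again after heavy memory pressure, compaction or a device
	 * driver migration and the reported PFN may differ */
	return 0;
}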


> * if CCIX/CAPI is not supported, the CPU cannot access GPU memory in a cache
> coherent way, and the GPU also cannot access CPU memory with
> cache coherency. It needs some copies as in John Hubbard's slides.
>    * when a GPU page fault occurs, data needs to be copied from the CPU page
> to the GPU page, and HMM unmaps the CPU page...
>    * when a CPU page fault occurs, data needs to be copied from the GPU page
> to the CPU page, the GPU page is unmapped and the CPU page is mapped...

No, the GPU can access main memory just fine (it snoops PCIe transactions and
is fully cache coherent with the CPU). Only the CPU can not access the device
memory. So there is a special case only when migrating some virtual addresses
to use device memory.

What is described in John's slides is what happens when you migrate some
virtual addresses to device memory that the CPU can not access. This
migration is not necessary for the GPU to access memory. It only happens as
an optimization, when the device driver suspects it will make frequent
accesses to that memory and that the CPU will not try to access it.
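
To restate the non cache coherent case as pseudo-code (everything below is
made up for illustration, it is not kernel or driver code, just the policy
described above):

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum residency { SYSTEM_RAM, DEVICE_MEM };

struct vpage {
	enum residency where;
	char data[64];
};

/* GPU access always works: system RAM is reached coherently over PCIe,
 * device memory is local to the GPU. No migration is required for this. */
static void gpu_access(struct vpage *p)
{
	printf("GPU reads page in %s\n",
	       p->where == SYSTEM_RAM ? "system RAM" : "device memory");
}

/* Driver heuristic: migrate to device memory only as an optimization,
 * eg when it expects many GPU accesses and no CPU accesses. */
static void maybe_migrate_to_device(struct vpage *p, bool gpu_hot)
{
	if (gpu_hot && p->where == SYSTEM_RAM) {
		p->where = DEVICE_MEM;		/* the copy happens here */
		printf("driver migrated page to device memory\n");
	}
}

/* CPU access: device memory is unaddressable by the CPU, so touching it
 * faults and the data is first migrated back to system RAM. */
static void cpu_access(struct vpage *p)
{
	if (p->where == DEVICE_MEM) {
		p->where = SYSTEM_RAM;		/* fault -> migrate back */
		printf("CPU fault: page migrated back to system RAM\n");
	}
	printf("CPU reads page in system RAM\n");
}

int main(void)
{
	struct vpage page = { .where = SYSTEM_RAM };

	strcpy(page.data, "input prepared on the CPU");
	gpu_access(&page);			/* fine, no migration needed */
	maybe_migrate_to_device(&page, true);	/* optional optimization */
	gpu_access(&page);			/* now served from device memory */
	cpu_access(&page);			/* triggers the migrate back */
	return 0;
}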

[...]

> > No. Here we are always talking about virtual addresses that are the outcome
> > of an mmap syscall, either as private anonymous memory or as an mmap of a
> > regular file (ie not a device file but a regular file on a filesystem).
> >
> > A device driver can migrate any virtual address to use device memory for
> > performance reasons (how, why and when such migration happens is totally
> > opaque to HMM; it is under the control of the device driver).
> >
> > So if you do:
> >    BUFA = malloc(size);
> > and then do something with BUFA on the CPU (like reading input from disk or
> > network, ...), the memory is likely to be allocated from regular main memory
> > (like DDR).
> >
> > Now if you start some job on your GPU that accesses BUFA, the device driver
> > might call the migrate_vma() helper to migrate the memory to device memory.
> > At that point the virtual addresses of BUFA point to physical device memory
> > in the CAPI or CCIX case. If it is not CAPI/CCIX, then the GPU page table
> > points to device memory while the CPU page table points to an invalid
> > special entry. The GPU can work on BUFA, which now resides inside the
> > device memory. Finally, in the non CAPI/CCIX case, if the CPU tries to
> > access that memory then a migration back to regular memory happens.
> >
> 
> in this scenario:
> * if CAPI/CCIX is supported, do both the CPU's page table and the GPU's
> point to the device physical page?
> in this case, does it still need the ZONE_DEVICE infrastructure for the
> CPU page table?

Correct, in the CAPI/CCIX case there is only one page table, and thus after
migration both CPU and GPU point to the same physical addresses for the
virtual addresses of BUFA.

> * if there is no CAPI/CCIX support, the CPU's page table is filled with an
> invalid special pte.

Correct. This is the case described by John's slides.


The physical memory backing a virtual address can change at any time for many
different reasons (autoNUMA, compaction, swap out followed by swap in, ...),
and migration (from one physical memory type to another) for accelerator
purposes is just a new reason on that list.
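
For completeness, a rough skeleton of how a driver drives the migrate_vma()
helper mentioned above (patch 14/19) looks like the following. The callback
and struct names follow the hmm documentation added by this series, from
memory, so treat the exact signatures as approximate; the interesting work
(allocating device pages and DMAing the data over) is elided:

#include <linux/migrate.h>
#include <linux/mm.h>

/* Allocate device pages for dst[] and copy the src[] pages into them.
 * Entries the driver does not want to migrate are simply left without
 * MIGRATE_PFN_MIGRATE set in dst[]. */
static void my_alloc_and_copy(struct vm_area_struct *vma,
			      const unsigned long *src, unsigned long *dst,
			      unsigned long start, unsigned long end,
			      void *private)
{
	/* for each page: allocate device memory, DMA the data over, then
	 * dst[i] = migrate_pfn(device_pfn) | MIGRATE_PFN_LOCKED; */
}

/* Called once the CPU page tables have been updated; point the device
 * page tables at the new pages and clean up anything not migrated. */
static void my_finalize_and_map(struct vm_area_struct *vma,
				const unsigned long *src,
				const unsigned long *dst,
				unsigned long start, unsigned long end,
				void *private)
{
}

static const struct migrate_vma_ops my_migrate_ops = {
	.alloc_and_copy		= my_alloc_and_copy,
	.finalize_and_map	= my_finalize_and_map,
};

/* Opportunistically move [start, end) of a process VMA to device memory.
 * A real driver sizes src/dst to (end - start) / PAGE_SIZE entries. */
static int my_migrate_range(struct vm_area_struct *vma,
			    unsigned long start, unsigned long end)
{
	unsigned long src[64], dst[64];	/* enough for a 256KB chunk */

	return migrate_vma(&my_migrate_ops, vma, start, end, src, dst, NULL);
}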

Cheers,
Jérôme
