linux-mm.kvack.org archive mirror
From: Jerome Glisse <jglisse@redhat.com>
To: chetan L <loke.chetan@gmail.com>
Cc: Bob Liu <lliubbo@gmail.com>, David Nellans <dnellans@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Balbir Singh <bsingharora@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@kernel.org>, Linux MM <linux-mm@kvack.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-accelerators@lists.ozlabs.org
Subject: Re: [PATCH 0/6] Cache coherent device memory (CDM) with HMM v5
Date: Thu, 16 Nov 2017 16:29:04 -0500
Message-ID: <20171116212904.GA4823@redhat.com>
In-Reply-To: <CAAsGZS43n2_f9sQXGH5Ap=eEx2f099CDwHC0aTTgOEbw7Dc=zg@mail.gmail.com>

On Wed, Nov 15, 2017 at 07:29:10PM -0800, chetan L wrote:
> On Wed, Nov 15, 2017 at 7:23 PM, chetan L <loke.chetan@gmail.com> wrote:
> > On Wed, Nov 15, 2017 at 6:44 PM, Jerome Glisse <jglisse@redhat.com> wrote:
> >> On Wed, Nov 15, 2017 at 06:10:08PM -0800, chet l wrote:
> >>> >> You may think it as a CCIX device or CAPI device.
> >>> >> The requirement is eliminate any extra copy.
> >>> >> A typical usecase/requirement is malloc() and madvise() allocate from
> >>> >> device memory, then CPU write data to device memory directly and
> >>> >> trigger device to read the data/do calculation.
> >>> >
> >>> > I suggest you rely on the device driver userspace API to do a migration after malloc
> >>> > then. Something like:
> >>> >   ptr = malloc(size);
> >>> >   my_device_migrate(ptr, size);
> >>> >
> >>> > Which would call an ioctl of the device driver which itself would migrate memory or
> >>> > allocate device memory for the range if pointer return by malloc is not yet back by
> >>> > any pages.
> >>> >
> >>>
> >>> So for CCIX, I don't think there is going to be an inline device
> >>> driver that would allocate any memory for you. The expansion memory
> >>> will become part of the system memory as part of the boot process. So,
> >>> if the host DDR is 256GB and the CCIX expansion memory is 4GB, the
> >>> total system mem will be 260GB.
> >>>
> >>> Assume that the 'mm' is taught to mark/anoint the ZONE_DEVICE(or
> >>> ZONE_XXX) range from 256 to 260 GB. Then, for kmalloc it(mm) won't use
> >>> the ZONE_DEV range. But for a malloc, it will/can use that range.
> >>
> >> HMM zone device memory would work with that, you just need to teach the
> >> platform to identify this memory zone and not hotplug it. Again you
> >> should rely on specific device driver API to allocate this memory.
> >>
> >
> > @Jerome - a new linux-accelerator's list has just been created. I have
> > CC'd that list since we have overlapping interests w.r.t CCIX.
> >
> > I cannot comment on surprise add/remove as of now ... will cross the
> > bridge later.

Note that this is not hotplug strictly speaking. The design today is that
the device driver registers the memory. From the kernel's point of view
this is a hotplug, but on many of the target architectures there is no
real hotplug, i.e. the device and its memory were present at boot time.

Like I said, I think for now we are better off having each device manage
and register its own memory. HMM provides a toolbox for that. If we see a
common trend across multiple devices, then we can think about making
something more generic.
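
To make the driver-API approach quoted above concrete (malloc() followed
by my_device_migrate()), here is a minimal userspace sketch. Everything
device specific in it is an assumption made for illustration: the
struct my_device_migrate_args layout, the MY_DEVICE_IOC_MIGRATE ioctl
number and the device node are hypothetical and do not describe any
existing driver ABI. Only ioctl() and the _IOW() macro are real.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>

/* Hypothetical request: migrate [addr, addr + len) to device memory. */
struct my_device_migrate_args {
	uint64_t addr;
	uint64_t len;
};

/* Hypothetical ioctl number; 'M' and 0x01 are placeholders. */
#define MY_DEVICE_IOC_MIGRATE _IOW('M', 0x01, struct my_device_migrate_args)

/* Fill the request for a malloc()ed range; reject NULL or empty ranges. */
static int my_device_migrate_prepare(struct my_device_migrate_args *args,
				     const void *ptr, size_t size)
{
	if (!args || !ptr || size == 0) {
		errno = EINVAL;
		return -1;
	}
	args->addr = (uint64_t)(uintptr_t)ptr;
	args->len = (uint64_t)size;
	return 0;
}

/*
 * Ask the (hypothetical) driver behind fd to migrate the range, or to
 * back it with device memory if no pages have been allocated yet.
 * Returns the ioctl() result: 0 on success, -1 with errno set on error.
 */
static int my_device_migrate(int fd, void *ptr, size_t size)
{
	struct my_device_migrate_args args;

	if (my_device_migrate_prepare(&args, ptr, size) < 0)
		return -1;
	return ioctl(fd, MY_DEVICE_IOC_MIGRATE, &args);
}
```

A caller would open the driver's device node (say a hypothetical
/dev/my_device), malloc() a buffer, call my_device_migrate(fd, ptr, size)
and then write to the buffer directly from the CPU. Keeping the policy in
the driver ioctl is what lets each device decide how its memory is
managed.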


The NUMA discussion is related to CPU-less nodes, i.e. not wanting to add
any more CPU-less nodes (nodes with only memory), and there are other
aspects too. For instance, you do not necessarily have good information
from the device to know whether a page is accessed a lot by the device
(this kind of information is often only accessible to the device driver).
Thus automatic NUMA placement is useless here. Not to mention that for it
to work we would need to change how it currently works (IIRC there is an
issue when you do not have a CPU id you can use).
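
As an illustration of what a CPU-less node looks like from userspace,
the sketch below reads the per-node cpulist file that the kernel exposes
in sysfs; a node with only memory has an empty list. The helper names
are made up for this example, and it assumes a Linux system with sysfs
mounted at /sys.

```c
#include <stdio.h>
#include <string.h>

/* An empty or whitespace-only cpulist means the node has no CPUs. */
static int cpulist_is_empty(const char *cpulist)
{
	return cpulist[strspn(cpulist, " \t\r\n")] == '\0';
}

/*
 * Returns 1 if the node has no CPUs (memory only), 0 if it has CPUs,
 * and -1 if the node's cpulist file cannot be opened.
 */
static int node_is_cpuless(int node)
{
	char path[64];
	char buf[4096];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/node/node%d/cpulist", node);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);
	return cpulist_is_empty(buf);
}
```

This only tells you that a node has no CPUs; it says nothing about how
often a device touches a given page, which is the information automatic
NUMA placement would be missing.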

Cheers,
Jerome
