From: Mel Gorman <mgorman@techsingularity.net>
To: Jérôme Glisse <jglisse@redhat.com>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, John Hubbard <jhubbard@nvidia.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	David Nellans <dnellans@nvidia.com>
Subject: Re: [HMM 00/16] HMM (Heterogeneous Memory Management) v18
Date: Sun, 19 Mar 2017 20:09:56 +0000
Message-ID: <20170319200956.GJ2774@techsingularity.net>
In-Reply-To: <1489680335-6594-1-git-send-email-jglisse@redhat.com>

On Thu, Mar 16, 2017 at 12:05:19PM -0400, Jérôme Glisse wrote:
> Cliff note: HMM offers two things, each standing on its own. First,
> it allows device memory to be used transparently inside any process
> without any modification to the process's program code. Second, it
> allows a process address space to be mirrored on a device.
> 
> Changes since v17:
>   - typos
>   - ZONE_DEVICE page refcount decrement moved to put_zone_device_page()
> 
> Work is still underway to use this feature inside the upstream
> nouveau driver. It has been tested with a closed source driver, and
> tests are still underway on top of the new kernel. So far we have
> found no issues. I expect to get a tested-by soon. Also, this
> feature is not only useful for NVidia GPUs; I expect AMD GPUs will
> need it too if they want to support some of the new industry APIs.
> I also expect some FPGA companies to use it, and probably other
> hardware.
> 
> That being said, I don't expect I will ever get a reviewed-by from
> anyone, for reasons beyond my control.

I spent the length of time a battery lasts reading the patches during my
flight to LSF/MM, showing that you can get people to review anything if
you lock them in a metal box for a few hours.

I only got as far as patch 13 before running low on time, but decided to
send what I have anyway so you have the feedback before the LSF/MM topic.
The remaining patches are HMM-specific; the intent was to review how much
the core mm is affected and how hard this would be to maintain. I was
less concerned with the HMM internals themselves, but I assume that the
authors writing driver support can supply tested-by's.
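
For reviewers unfamiliar with what that driver support hooks into, the
contract is essentially the MMU notifier one. A minimal sketch, written
against the generic mmu_notifier API of this kernel rather than the HMM
helpers themselves, with made-up "mydev" names:

	#include <linux/mmu_notifier.h>

	/*
	 * Called before the CPU page tables for [start, end) change;
	 * the device must drop its own mappings for the range.
	 */
	static void mydev_invalidate_range_start(struct mmu_notifier *mn,
						 struct mm_struct *mm,
						 unsigned long start,
						 unsigned long end)
	{
		/* shoot down the device page tables for the range */
	}

	static const struct mmu_notifier_ops mydev_mirror_ops = {
		.invalidate_range_start = mydev_invalidate_range_start,
	};

	static struct mmu_notifier mydev_mn = {
		.ops = &mydev_mirror_ops,
	};

	/* call when the device starts mirroring the current process */
	static int mydev_mirror_start(void)
	{
		return mmu_notifier_register(&mydev_mn, current->mm);
	}

Any bug in how a device reacts to those callbacks stays confined to the
driver, which matters for the next point.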

Overall HMM is fairly well isolated. The drivers can cause new and
interesting damage through the MMU notifiers and fault handling, but that
is a driver, not a core, issue. There is new core code, but most of it is
active only if a driver is loaded, so most people won't notice. Fast paths
generally remain unaffected except for one major case covered in the
review. I also didn't like the migrate_page API update and suggested an
alternative; a sketch of the change in question follows this paragraph.
Most of the other overhead is very minor. My expectation is that most
core code does not have to care about HMM, and while there is a risk that
a driver can cause damage through the notifiers, that is completely the
responsibility of the driver. Maybe some buglets exist in the new core
migration code but, again, most people won't notice unless a suitable
driver is loaded.
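
For reference, the migratepage() update I'm referring to goes in this
direction; this is a sketch of the shape of the change as I read patch 6,
not the exact diff:

	/* current callback in struct address_space_operations */
	int (*migratepage)(struct address_space *mapping,
			   struct page *newpage, struct page *page,
			   enum migrate_mode mode);

	/*
	 * After the series, a boolean is added so device memory can
	 * decide whether the data copy happens in the callback, and
	 * every existing implementation has to grow the extra argument.
	 */
	int (*migratepage)(struct address_space *mapping,
			   struct page *newpage, struct page *page,
			   enum migrate_mode mode, bool copy);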

On that basis, if you address the major aspects of this review, I don't
have an objection at the moment to HMM being merged, unlike the
objections I had to the CDM preparation patches that modified zonelist
handling, nodes, and the page allocator fast paths.

It still leaves the problem of no in-kernel user of the API. The catch-22
has now existed for years: driver support won't exist until HMM is
merged, and HMM won't get merged without drivers. I won't object strongly
on that basis any more, but others might. Maybe if this passes Andrew's
review it could be staged in mmotm until a driver or something like CDM
is ready? That would at least give driver authors a tree to work against,
with the reasonable expectation that HMM and a driver would go in at the
same time.

Thread overview: 90+ messages
2017-03-16 16:05 [HMM 00/16] HMM (Heterogeneous Memory Management) v18 Jérôme Glisse
2017-03-16 16:05 ` [HMM 01/16] mm/memory/hotplug: convert device bool to int to allow for more flags v3 Jérôme Glisse
2017-03-19 20:08   ` Mel Gorman
2017-03-16 16:05 ` [HMM 02/16] mm/put_page: move ref decrement to put_zone_device_page() Jérôme Glisse
2017-03-19 20:08   ` Mel Gorman
2017-03-16 16:05 ` [HMM 03/16] mm/ZONE_DEVICE/free-page: callback when page is freed v3 Jérôme Glisse
2017-03-19 20:08   ` Mel Gorman
2017-03-16 16:05 ` [HMM 04/16] mm/ZONE_DEVICE/unaddressable: add support for un-addressable device memory v3 Jérôme Glisse
2017-03-19 20:09   ` Mel Gorman
2017-03-16 16:05 ` [HMM 05/16] mm/ZONE_DEVICE/x86: add support for un-addressable device memory Jérôme Glisse
2017-03-16 16:05 ` [HMM 06/16] mm/migrate: add new boolean copy flag to migratepage() callback Jérôme Glisse
2017-03-19 20:09   ` Mel Gorman
2017-03-16 16:05 ` [HMM 07/16] mm/migrate: new memory migration helper for use with device memory v4 Jérôme Glisse
2017-03-16 16:24   ` Reza Arbab
2017-03-16 20:58     ` Balbir Singh
2017-03-16 23:05   ` Andrew Morton
2017-03-17  0:22     ` John Hubbard
2017-03-17  0:45       ` Balbir Singh
2017-03-17  0:57         ` John Hubbard
2017-03-17  1:52           ` Jerome Glisse
2017-03-17  3:32             ` Andrew Morton
2017-03-17  3:42           ` Balbir Singh
2017-03-17  4:51             ` Balbir Singh
2017-03-17  7:17               ` John Hubbard
2017-03-16 16:05 ` [HMM 08/16] mm/migrate: migrate_vma() unmap page from vma while collecting pages Jérôme Glisse
2017-03-16 16:05 ` [HMM 09/16] mm/hmm: heterogeneous memory management (HMM for short) Jérôme Glisse
2017-03-19 20:09   ` Mel Gorman
2017-03-16 16:05 ` [HMM 10/16] mm/hmm/mirror: mirror process address space on device with HMM helpers Jérôme Glisse
2017-03-19 20:09   ` Mel Gorman
2017-03-16 16:05 ` [HMM 11/16] mm/hmm/mirror: helper to snapshot CPU page table v2 Jérôme Glisse
2017-03-19 20:09   ` Mel Gorman
2017-03-16 16:05 ` [HMM 12/16] mm/hmm/mirror: device page fault handler Jérôme Glisse
2017-03-16 16:05 ` [HMM 13/16] mm/hmm/migrate: support un-addressable ZONE_DEVICE page in migration Jérôme Glisse
2017-03-16 16:05 ` [HMM 14/16] mm/migrate: allow migrate_vma() to alloc new page on empty entry Jérôme Glisse
2017-03-16 16:05 ` [HMM 15/16] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE Jérôme Glisse
2017-03-16 16:05 ` [HMM 16/16] mm/hmm/devmem: dummy HMM device for ZONE_DEVICE memory v2 Jérôme Glisse
2017-03-17  6:55   ` Bob Liu
2017-03-17 16:53     ` Jerome Glisse
2017-03-16 20:43 ` [HMM 00/16] HMM (Heterogeneous Memory Management) v18 Andrew Morton
2017-03-16 23:49   ` Jerome Glisse
2017-03-17  8:29     ` Bob Liu
2017-03-17 15:57       ` Jerome Glisse
2017-03-17  8:39     ` Bob Liu
2017-03-17 15:52       ` Jerome Glisse
2017-03-19 20:09 ` Mel Gorman [this message]
