linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* HMM (Heterogeneous Memory Management) v8
@ 2015-05-21 19:31 j.glisse
  2015-05-30  3:01 ` John Hubbard
  2015-05-31  6:56 ` Haggai Eran
  0 siblings, 2 replies; 4+ messages in thread
From: j.glisse @ 2015-05-21 19:31 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, Linus Torvalds, joro, Mel Gorman,
	H. Peter Anvin, Peter Zijlstra, Andrea Arcangeli,
	Johannes Weiner, Larry Woodman, Rik van Riel, Dave Airlie,
	Brendan Conoboy, Joe Donohue, Duncan Poole, Sherry Cheung,
	Subhash Gutti, John Hubbard, Mark Hairgrove, Lucien Dunning,
	Cameron Buschardt, Arvind Gopalakrishnan, Haggai Eran,
	Shachar Raindel, Liran Liss


So sorry had to resend because i stupidly forgot to cc mailing list.
Ignore private send done before.


HMM (Heterogeneous Memory Management) is an helper layer for device
that want to mirror a process address space into their own mmu. Main
target is GPU but other hardware, like network device can take also
use HMM.

There is two side to HMM, first one is mirroring of process address
space on behalf of a device. HMM will manage a secondary page table
for the device and keep it synchronize with the CPU page table. HMM
also do DMA mapping on behalf of the device (which would allow new
kind of optimization further down the road (1)).

Second side is allowing to migrate process memory to device memory
where device memory is unmappable by the CPU. Any CPU access will
trigger special fault that will migrate memory back.

>From design point of view not much changed since last patchset (2).
Most of the change are in small details of the API expose to device
driver. This version also include device driver change for Mellanox
hardware to use HMM as an alternative to ODP (which provide a subset
of HMM functionality specificaly for RDMA devices). Long term plan
is to have HMM completely replace ODP.



Why doing this ?

Mirroring a process address space is mandatory with OpenCL 2.0 and
with other GPU compute API. OpenCL 2.0 allow different level of
implementation and currently only the lowest 2 are supported on
Linux. To implement the highest level, where CPU and GPU access
can happen concurently and are cache coherent, HMM is needed, or
something providing same functionality, for instance through
platform hardware.

Hardware solution such as PCIE ATS/PASID is limited to mirroring
system memory and does not provide way to migrate memory to device
memory (which offer significantly more bandwidth up to 10 times
faster than regular system memory with discret GPU, also have
lower latency than PCIE transaction).

Current CPU with GPU on same die (AMD or Intel) use the ATS/PASID
and for Intel a special level of cache (backed by a large pool of
fast memory).

For foreseeable futur, discrete GPU will remain releveant as they
can have a large quantity of faster memory than integrated GPU.

Thus we believe HMM will allow to leverage discret GPU memory in
a transparent fashion to the application, with minimum disruption
to the linux kernel mm code. Also HMM can work along hardware
solution such as PCIE ATS/PASID (leaving regular case to ATS/PASID
while HMM handles the migrated memory case).



Design :

The patch 1, 2, 3 and 4 augment the mmu notifier API with new
informations to more efficiently mirror CPU page table updates.

The first side of HMM, process address space mirroring, is
implemented in patch 5 through 12. This use a secondary page
table, in which HMM mirror memory actively use by the device.
HMM does not take a reference on any of the page, it use the
mmu notifier API to track changes to the CPU page table and to
update the mirror page table. All this while providing a simple
API to device driver.

To implement this we use a "generic" page table and not a radix
tree because we need to store more flags than radix allows and
we need to store dma address (sizeof(dma_addr_t) > sizeof(long)
on some platform). All this is

Patch 14 pass down the lane the new child mm struct of a parent
process being forked. This is necessary to properly handle fork
when parent process have migrated memory (more on that below).

Patch 15 allow to get the current memcg against which anonymous
memory of a process should be accounted. It usefull because in
HMM we do bulk transaction on address space and we wish to avoid
storing a pointer to memcg for each single page. All operation
dealing with memcg happens under the protection of the mmap
semaphore.


Second side of HMM, migration to device memory, is implemented
in patch 16 to 28. This only deal with anonymous memory. A new
special swap type is introduced. Migrated memory will have there
CPU page table entry set to this special swap entry (like the
migration entry but unlike migration this is not a short lived
state).

All the patches are then set of functions that deals with those
special entry in the various code path that might face them.

Memory migration require several steps, first the memory is un-
mapped from CPU and replace with special "locked" entry, HMM
locked entry is a short lived transitional state, this is to
avoid two threads to fight over migration entry.

Once unmapped HMM can determine what can be migrated or not by
comparing mapcount and page count. If something holds a reference
then the page is not migrated and CPU page table is restored.
Next step is to schedule the copy to device memory and update
the CPU page table to regular HMM entry.

Migration back follow the same pattern, replace with special
lock entry, then copy back, then update CPU page table.


(1) Because HMM keeps a secondary page table which keeps track of
    DMA mapping, there is room for new optimization. We want to
    add a new DMA API to allow to manage DMA page table mapping
    at directory level. This would allow to minimize memory
    consumption of mirror page table and also over head of doing
    DMA mapping page per page. This is a future feature we want
    to work on and hope the idea will proove usefull not only to
    HMM users.

(2) Previous patchset posting :
    v1 http://lwn.net/Articles/597289/
    v2 https://lkml.org/lkml/2014/6/12/559
    v3 https://lkml.org/lkml/2014/6/13/633
    v4 https://lkml.org/lkml/2014/8/29/423
    v5 https://lkml.org/lkml/2014/11/3/759
    v6 http://lwn.net/Articles/619737/
    v7 http://lwn.net/Articles/627316/


Cheers,
Jérôme

To: "Andrew Morton" <akpm@linux-foundation.org>,
Cc: <linux-kernel@vger.kernel.org>,
Cc: linux-mm <linux-mm@kvack.org>,
Cc: <linux-fsdevel@vger.kernel.org>,
Cc: "Linus Torvalds" <torvalds@linux-foundation.org>,
Cc: "Mel Gorman" <mgorman@suse.de>,
Cc: "H. Peter Anvin" <hpa@zytor.com>,
Cc: "Peter Zijlstra" <peterz@infradead.org>,
Cc: "Linda Wang" <lwang@redhat.com>,
Cc: "Kevin E Martin" <kem@redhat.com>,
Cc: "Andrea Arcangeli" <aarcange@redhat.com>,
Cc: "Johannes Weiner" <jweiner@redhat.com>,
Cc: "Larry Woodman" <lwoodman@redhat.com>,
Cc: "Rik van Riel" <riel@redhat.com>,
Cc: "Dave Airlie" <airlied@redhat.com>,
Cc: "Jeff Law" <law@redhat.com>,
Cc: "Brendan Conoboy" <blc@redhat.com>,
Cc: "Joe Donohue" <jdonohue@redhat.com>,
Cc: "Duncan Poole" <dpoole@nvidia.com>,
Cc: "Sherry Cheung" <SCheung@nvidia.com>,
Cc: "Subhash Gutti" <sgutti@nvidia.com>,
Cc: "John Hubbard" <jhubbard@nvidia.com>,
Cc: "Mark Hairgrove" <mhairgrove@nvidia.com>,
Cc: "Lucien Dunning" <ldunning@nvidia.com>,
Cc: "Cameron Buschardt" <cabuschardt@nvidia.com>,
Cc: "Arvind Gopalakrishnan" <arvindg@nvidia.com>,
Cc: "Haggai Eran" <haggaie@mellanox.com>,
Cc: "Or Gerlitz" <ogerlitz@mellanox.com>,
Cc: "Sagi Grimberg" <sagig@mellanox.com>
Cc: "Shachar Raindel" <raindel@mellanox.com>,
Cc: "Liran Liss" <liranl@mellanox.com>,
Cc: "Roland Dreier" <roland@purestorage.com>,
Cc: "Sander, Ben" <ben.sander@amd.com>,
Cc: "Stoner, Greg" <Greg.Stoner@amd.com>,
Cc: "Bridgman, John" <John.Bridgman@amd.com>,
Cc: "Mantor, Michael" <Michael.Mantor@amd.com>,
Cc: "Blinzer, Paul" <Paul.Blinzer@amd.com>,
Cc: "Morichetti, Laurent" <Laurent.Morichetti@amd.com>,
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>,
Cc: "Gabbay, Oded" <Oded.Gabbay@amd.com>,

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: HMM (Heterogeneous Memory Management) v8
  2015-05-21 19:31 HMM (Heterogeneous Memory Management) v8 j.glisse
@ 2015-05-30  3:01 ` John Hubbard
  2015-05-31  6:56 ` Haggai Eran
  1 sibling, 0 replies; 4+ messages in thread
From: John Hubbard @ 2015-05-30  3:01 UTC (permalink / raw)
  To: j.glisse
  Cc: akpm, linux-kernel, linux-mm, Linus Torvalds, joro, Mel Gorman,
	H. Peter Anvin, Peter Zijlstra, Andrea Arcangeli,
	Johannes Weiner, Larry Woodman, Rik van Riel, Dave Airlie,
	Brendan Conoboy, Joe Donohue, Duncan Poole, Sherry Cheung,
	Subhash Gutti, Mark Hairgrove, Lucien Dunning, Cameron Buschardt,
	Arvind Gopalakrishnan, Haggai Eran, Shachar Raindel, Liran Liss

[-- Attachment #1: Type: text/plain, Size: 9790 bytes --]

On Thu, 21 May 2015, j.glisse@gmail.com wrote:

> 
> So sorry had to resend because i stupidly forgot to cc mailing list.
> Ignore private send done before.
> 
> 
> HMM (Heterogeneous Memory Management) is an helper layer for device
> that want to mirror a process address space into their own mmu. Main
> target is GPU but other hardware, like network device can take also
> use HMM.
> 
> There is two side to HMM, first one is mirroring of process address
> space on behalf of a device. HMM will manage a secondary page table
> for the device and keep it synchronize with the CPU page table. HMM
> also do DMA mapping on behalf of the device (which would allow new
> kind of optimization further down the road (1)).
> 
> Second side is allowing to migrate process memory to device memory
> where device memory is unmappable by the CPU. Any CPU access will
> trigger special fault that will migrate memory back.
> 
> From design point of view not much changed since last patchset (2).
> Most of the change are in small details of the API expose to device
> driver. This version also include device driver change for Mellanox
> hardware to use HMM as an alternative to ODP (which provide a subset
> of HMM functionality specificaly for RDMA devices). Long term plan
> is to have HMM completely replace ODP.
> 

Hi Jerome!

OK, seeing as how there is so much material to review here, I'll start 
with the easiest part first: documentation.

There is a lot of information spread throughout this patchset that needs 
to be preserved and made readily accessible, but some of it is only found 
in the comments in patch headers. It would be better if the information 
were right there in the source tree, not just in git history. Also, the 
comment blocks that are in the code itself are useful, but maybe not 
quite sufficient to provide the big picture.

With that in mind, I think that a Documentation/vm/hmm.txt file should be 
provided. It could capture all of this. We can refer to it from within the 
code, thus providing a higher level of quality (because we only have to 
update one place, for big-picture documentation comments).

If it helps, I'll volunteer to piece something together from the material 
that you have created, plus maybe a few notes about what a typical calling 
sequence looks like (since I have actual backtraces here from the 
ConnectIB cards).

Also, there are a lot of typographical errors that we can fix up as part 
of that effort. We want to ensure that such tiny issues don't distract 
people from the valuable content, so those need to be fixed. I'll let 
others decide as to whether that sort of fit-and-finish needs to happen 
now, or as a follow-up patch or two.

And finally, a critical part of good documentation is the naming of 
things. We're sort of still in the "wow, it works" phase of this project, 
and so now is a good time to start fussing about names. Therefore, you'll 
see a bunch of small and large naming recommendations coming from me, for 
the various patches here.

thanks,
John Hubbard

> 
> 
> Why doing this ?
> 
> Mirroring a process address space is mandatory with OpenCL 2.0 and
> with other GPU compute API. OpenCL 2.0 allow different level of
> implementation and currently only the lowest 2 are supported on
> Linux. To implement the highest level, where CPU and GPU access
> can happen concurently and are cache coherent, HMM is needed, or
> something providing same functionality, for instance through
> platform hardware.
> 
> Hardware solution such as PCIE ATS/PASID is limited to mirroring
> system memory and does not provide way to migrate memory to device
> memory (which offer significantly more bandwidth up to 10 times
> faster than regular system memory with discret GPU, also have
> lower latency than PCIE transaction).
> 
> Current CPU with GPU on same die (AMD or Intel) use the ATS/PASID
> and for Intel a special level of cache (backed by a large pool of
> fast memory).
> 
> For foreseeable futur, discrete GPU will remain releveant as they
> can have a large quantity of faster memory than integrated GPU.
> 
> Thus we believe HMM will allow to leverage discret GPU memory in
> a transparent fashion to the application, with minimum disruption
> to the linux kernel mm code. Also HMM can work along hardware
> solution such as PCIE ATS/PASID (leaving regular case to ATS/PASID
> while HMM handles the migrated memory case).
> 
> 
> 
> Design :
> 
> The patch 1, 2, 3 and 4 augment the mmu notifier API with new
> informations to more efficiently mirror CPU page table updates.
> 
> The first side of HMM, process address space mirroring, is
> implemented in patch 5 through 12. This use a secondary page
> table, in which HMM mirror memory actively use by the device.
> HMM does not take a reference on any of the page, it use the
> mmu notifier API to track changes to the CPU page table and to
> update the mirror page table. All this while providing a simple
> API to device driver.
> 
> To implement this we use a "generic" page table and not a radix
> tree because we need to store more flags than radix allows and
> we need to store dma address (sizeof(dma_addr_t) > sizeof(long)
> on some platform). All this is
> 
> Patch 14 pass down the lane the new child mm struct of a parent
> process being forked. This is necessary to properly handle fork
> when parent process have migrated memory (more on that below).
> 
> Patch 15 allow to get the current memcg against which anonymous
> memory of a process should be accounted. It usefull because in
> HMM we do bulk transaction on address space and we wish to avoid
> storing a pointer to memcg for each single page. All operation
> dealing with memcg happens under the protection of the mmap
> semaphore.
> 
> 
> Second side of HMM, migration to device memory, is implemented
> in patch 16 to 28. This only deal with anonymous memory. A new
> special swap type is introduced. Migrated memory will have there
> CPU page table entry set to this special swap entry (like the
> migration entry but unlike migration this is not a short lived
> state).
> 
> All the patches are then set of functions that deals with those
> special entry in the various code path that might face them.
> 
> Memory migration require several steps, first the memory is un-
> mapped from CPU and replace with special "locked" entry, HMM
> locked entry is a short lived transitional state, this is to
> avoid two threads to fight over migration entry.
> 
> Once unmapped HMM can determine what can be migrated or not by
> comparing mapcount and page count. If something holds a reference
> then the page is not migrated and CPU page table is restored.
> Next step is to schedule the copy to device memory and update
> the CPU page table to regular HMM entry.
> 
> Migration back follow the same pattern, replace with special
> lock entry, then copy back, then update CPU page table.
> 
> 
> (1) Because HMM keeps a secondary page table which keeps track of
>     DMA mapping, there is room for new optimization. We want to
>     add a new DMA API to allow to manage DMA page table mapping
>     at directory level. This would allow to minimize memory
>     consumption of mirror page table and also over head of doing
>     DMA mapping page per page. This is a future feature we want
>     to work on and hope the idea will proove usefull not only to
>     HMM users.
> 
> (2) Previous patchset posting :
>     v1 http://lwn.net/Articles/597289/
>     v2 https://lkml.org/lkml/2014/6/12/559
>     v3 https://lkml.org/lkml/2014/6/13/633
>     v4 https://lkml.org/lkml/2014/8/29/423
>     v5 https://lkml.org/lkml/2014/11/3/759
>     v6 http://lwn.net/Articles/619737/
>     v7 http://lwn.net/Articles/627316/
> 
> 
> Cheers,
> Jérôme
> 
> To: "Andrew Morton" <akpm@linux-foundation.org>,
> Cc: <linux-kernel@vger.kernel.org>,
> Cc: linux-mm <linux-mm@kvack.org>,
> Cc: <linux-fsdevel@vger.kernel.org>,
> Cc: "Linus Torvalds" <torvalds@linux-foundation.org>,
> Cc: "Mel Gorman" <mgorman@suse.de>,
> Cc: "H. Peter Anvin" <hpa@zytor.com>,
> Cc: "Peter Zijlstra" <peterz@infradead.org>,
> Cc: "Linda Wang" <lwang@redhat.com>,
> Cc: "Kevin E Martin" <kem@redhat.com>,
> Cc: "Andrea Arcangeli" <aarcange@redhat.com>,
> Cc: "Johannes Weiner" <jweiner@redhat.com>,
> Cc: "Larry Woodman" <lwoodman@redhat.com>,
> Cc: "Rik van Riel" <riel@redhat.com>,
> Cc: "Dave Airlie" <airlied@redhat.com>,
> Cc: "Jeff Law" <law@redhat.com>,
> Cc: "Brendan Conoboy" <blc@redhat.com>,
> Cc: "Joe Donohue" <jdonohue@redhat.com>,
> Cc: "Duncan Poole" <dpoole@nvidia.com>,
> Cc: "Sherry Cheung" <SCheung@nvidia.com>,
> Cc: "Subhash Gutti" <sgutti@nvidia.com>,
> Cc: "John Hubbard" <jhubbard@nvidia.com>,
> Cc: "Mark Hairgrove" <mhairgrove@nvidia.com>,
> Cc: "Lucien Dunning" <ldunning@nvidia.com>,
> Cc: "Cameron Buschardt" <cabuschardt@nvidia.com>,
> Cc: "Arvind Gopalakrishnan" <arvindg@nvidia.com>,
> Cc: "Haggai Eran" <haggaie@mellanox.com>,
> Cc: "Or Gerlitz" <ogerlitz@mellanox.com>,
> Cc: "Sagi Grimberg" <sagig@mellanox.com>
> Cc: "Shachar Raindel" <raindel@mellanox.com>,
> Cc: "Liran Liss" <liranl@mellanox.com>,
> Cc: "Roland Dreier" <roland@purestorage.com>,
> Cc: "Sander, Ben" <ben.sander@amd.com>,
> Cc: "Stoner, Greg" <Greg.Stoner@amd.com>,
> Cc: "Bridgman, John" <John.Bridgman@amd.com>,
> Cc: "Mantor, Michael" <Michael.Mantor@amd.com>,
> Cc: "Blinzer, Paul" <Paul.Blinzer@amd.com>,
> Cc: "Morichetti, Laurent" <Laurent.Morichetti@amd.com>,
> Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>,
> Cc: "Gabbay, Oded" <Oded.Gabbay@amd.com>,
> 
> 

thanks,
John H.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: HMM (Heterogeneous Memory Management) v8
  2015-05-21 19:31 HMM (Heterogeneous Memory Management) v8 j.glisse
  2015-05-30  3:01 ` John Hubbard
@ 2015-05-31  6:56 ` Haggai Eran
  1 sibling, 0 replies; 4+ messages in thread
From: Haggai Eran @ 2015-05-31  6:56 UTC (permalink / raw)
  To: j.glisse, akpm
  Cc: linux-kernel, linux-mm, Linus Torvalds, joro, Mel Gorman,
	H. Peter Anvin, Peter Zijlstra, Andrea Arcangeli,
	Johannes Weiner, Larry Woodman, Rik van Riel, Dave Airlie,
	Brendan Conoboy, Joe Donohue, Duncan Poole, Sherry Cheung,
	Subhash Gutti, John Hubbard, Mark Hairgrove, Lucien Dunning,
	Cameron Buschardt, Arvind Gopalakrishnan, Shachar Raindel,
	Liran Liss, Roland Dreier

On 21/05/2015 22:31, j.glisse@gmail.com wrote:
> From design point of view not much changed since last patchset (2).
> Most of the change are in small details of the API expose to device
> driver. This version also include device driver change for Mellanox
> hardware to use HMM as an alternative to ODP (which provide a subset
> of HMM functionality specificaly for RDMA devices). Long term plan
> is to have HMM completely replace ODP.

Hi,

I think HMM would be a good long term solution indeed. For now I would
want to keep ODP and HMM side by side (as the patchset seem to do)
mainly since HMM is introduced as a STAGING feature and ODP is part of
the mainline kernel.

It would be nice if you could provide a git repository to access the
patches. I couldn't apply them to the current linux-next tree.

A minor thing: I noticed some style issues in the patches. You should
run checkpatch.pl on the patches and get them to match the coding style.

Regards,
Haggai

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* HMM (Heterogeneous Memory Management) v8
@ 2015-01-05 22:44 j.glisse
  0 siblings, 0 replies; 4+ messages in thread
From: j.glisse @ 2015-01-05 22:44 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, Linus Torvalds, joro, Mel Gorman,
	H. Peter Anvin, Peter Zijlstra, Andrea Arcangeli,
	Johannes Weiner, Larry Woodman, Rik van Riel, Dave Airlie,
	Brendan Conoboy, Joe Donohue, Duncan Poole, Sherry Cheung,
	Subhash Gutti, John Hubbard, Mark Hairgrove, Lucien Dunning,
	Cameron Buschardt, Arvind Gopalakrishnan, Shachar Raindel,
	Liran Liss, Roland Dreier

So a resend with corrections base on Haggai comments. This patchset is just
the ground foundation on to which we want to build our features set. Main
feature being migrating memory to device memory. The very first version of
this patchset already show cased proof of concept of much of the features.

Below is previous patchset cover letter pretty much unchanged as background
and motivation for it did not.


What it is ?

In a nutshell HMM is a subsystem that provide an easy to use api to mirror a
process address on a device with minimal hardware requirement (mainly device
page fault and read only page mapping). This does not rely on ATS and PASID
PCIE extensions. It intends to supersede those extensions by allowing to move
system memory to device memory in a transparent fashion for core kernel mm
code (ie cpu page fault on page residing in device memory will trigger
migration back to system memory).


Why doing this ?

We want to be able to mirror a process address space so that compute api such
as OpenCL or other similar api can start using the exact same address space on
the GPU as on the CPU. This will greatly simplify usages of those api. Moreover
we believe that we will see more and more specialize unit functions that will
want to mirror process address using their own mmu.

The migration side is simply because GPU memory bandwidth is far beyond than
system memory bandwith and there is no sign that this gap is closing (quite the
opposite).


Current status and future features :

None of this core code change in any major way core kernel mm code. This
is simple ground work with no impact on existing code path. Features that
will be implemented on top of this are :
  1 - Tansparently handle page mapping on behalf of device driver (DMA).
  2 - Improve DMA api to better match new usage pattern of HMM.
  3 - Migration of anonymous memory to device memory.
  4 - Locking memory to remote memory (CPU access trigger SIGBUS).
  5 - Access exclusion btw CPU and device for atomic operations.
  6 - Migration of file backed memory to device memory.


How future features will be implemented :
1 - Simply use existing DMA api to map page on behalf of a device.
2 - Introduce new DMA api to match new semantic of HMM. It is no longer page
    we map but address range and managing which page is effectively backing
    an address should be easy to update. I gave a presentation about that
    during this LPC.
3 - Requires change to cpu page fault code path to handle migration back to
    system memory on cpu access. An implementation of this was already sent
    as part of v1. This will be low impact and only add a new special swap
    type handling to existing fault code.
4 - Require a new syscall as i can not see which current syscall would be
    appropriate for this. My first feeling was to use mbind as it has the
    right semantic (binding a range of address to a device) but mbind is
    too numa centric.

    Second one was madvise, but semantic does not match, madvise does allow
    kernel to ignore them while we do want to block cpu access for as long
    as the range is bind to a device.

    So i do not think any of existing syscall can be extended with new flags
    but maybe i am wrong.
5 - Allowing to map a page as read only on the CPU while a device perform
    some atomic operation on it (this is mainly to work around system bus
    that do not support atomic memory access and sadly there is a large
    base of hardware without that feature).

    Easiest implementation would be using some page flags but there is none
    left. So it must be some flags in vma to know if there is a need to query
    HMM for write protection.

6 - This is the trickiest one to implement and while i showed a proof of
    concept with v1, i am still have a lot of conflictual feeling about how
    to achieve this.


As usual comments are more then welcome. Thanks in advance to anyone that
take a look at this code.

Previous patchset posting :
  v1 http://lwn.net/Articles/597289/
  v2 https://lkml.org/lkml/2014/6/12/559 (cover letter did not make it to ml)
  v3 https://lkml.org/lkml/2014/6/13/633
  v4 https://lkml.org/lkml/2014/8/29/423
  v5 https://lkml.org/lkml/2014/11/3/759
  v6 http://lwn.net/Articles/619737/

Cheers,
Jérôme

To: "Andrew Morton" <akpm@linux-foundation.org>,
Cc: <linux-kernel@vger.kernel.org>,
Cc: linux-mm <linux-mm@kvack.org>,
Cc: <linux-fsdevel@vger.kernel.org>,
Cc: "Linus Torvalds" <torvalds@linux-foundation.org>,
Cc: "Mel Gorman" <mgorman@suse.de>,
Cc: "H. Peter Anvin" <hpa@zytor.com>,
Cc: "Peter Zijlstra" <peterz@infradead.org>,
Cc: "Linda Wang" <lwang@redhat.com>,
Cc: "Kevin E Martin" <kem@redhat.com>,
Cc: "Jerome Glisse" <jglisse@redhat.com>,
Cc: "Andrea Arcangeli" <aarcange@redhat.com>,
Cc: "Johannes Weiner" <jweiner@redhat.com>,
Cc: "Larry Woodman" <lwoodman@redhat.com>,
Cc: "Rik van Riel" <riel@redhat.com>,
Cc: "Dave Airlie" <airlied@redhat.com>,
Cc: "Jeff Law" <law@redhat.com>,
Cc: "Brendan Conoboy" <blc@redhat.com>,
Cc: "Joe Donohue" <jdonohue@redhat.com>,
Cc: "Duncan Poole" <dpoole@nvidia.com>,
Cc: "Sherry Cheung" <SCheung@nvidia.com>,
Cc: "Subhash Gutti" <sgutti@nvidia.com>,
Cc: "John Hubbard" <jhubbard@nvidia.com>,
Cc: "Mark Hairgrove" <mhairgrove@nvidia.com>,
Cc: "Lucien Dunning" <ldunning@nvidia.com>,
Cc: "Cameron Buschardt" <cabuschardt@nvidia.com>,
Cc: "Arvind Gopalakrishnan" <arvindg@nvidia.com>,
Cc: "Haggai Eran" <haggaie@mellanox.com>,
Cc: "Or Gerlitz" <ogerlitz@mellanox.com>,
Cc: "Sagi Grimberg" <sagig@mellanox.com>
Cc: "Shachar Raindel" <raindel@mellanox.com>,
Cc: "Liran Liss" <liranl@mellanox.com>,
Cc: "Roland Dreier" <roland@purestorage.com>,
Cc: "Sander, Ben" <ben.sander@amd.com>,
Cc: "Stoner, Greg" <Greg.Stoner@amd.com>,
Cc: "Bridgman, John" <John.Bridgman@amd.com>,
Cc: "Mantor, Michael" <Michael.Mantor@amd.com>,
Cc: "Blinzer, Paul" <Paul.Blinzer@amd.com>,
Cc: "Morichetti, Laurent" <Laurent.Morichetti@amd.com>,
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>,
Cc: "Gabbay, Oded" <Oded.Gabbay@amd.com>,

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-05-31  6:56 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-21 19:31 HMM (Heterogeneous Memory Management) v8 j.glisse
2015-05-30  3:01 ` John Hubbard
2015-05-31  6:56 ` Haggai Eran
  -- strict thread matches above, loose matches on Subject: below --
2015-01-05 22:44 j.glisse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).