* [PATCH v10 00/10] Add support for SVM atomics in Nouveau
From: Alistair Popple @ 2021-06-07  7:58 UTC
  To: linux-mm, akpm
  Cc: rcampbell, linux-doc, nouveau, hughd, linux-kernel, dri-devel,
	hch, bskeggs, jgg, peterx, shakeelb, jhubbard, willy,
	Alistair Popple

Hi Andrew,

This is an update to address some comments on the previous version of
this series. Most are code comment updates, although there were a couple
of code changes as well. The most significant are:

 - Re-introduce the check of VM_LOCKED under the PTL in
   page_mlock_one(). This was present in an earlier version of the series
   but was removed because we thought it was redundant. However, Shakeel
   provided some background making it clear that it is needed.

 - Reworked the return codes in copy_pte_range() based on suggestions
   from Peter Xu to hopefully make the code clearer and less error-prone.

 - Integrated a fix to the Nouveau code reported by Colin King.

As discussed, to minimise impact I have also made this dependent on
CONFIG_DEVICE_PRIVATE. Hopefully these changes don't break any other series that
may have been based on the previous version. I see there has been some
discussion from Hugh and others around patch order, so if you need me to rebase
these onto a different branch let me know.

Introduction
============

Some devices have features such as atomic PTE bits that can be used to
implement atomic access to system memory. To support atomic operations to a
shared virtual memory page such a device needs access to that page which is
exclusive of the CPU. This series introduces a mechanism to temporarily
unmap pages granting exclusive access to a device.

These changes are required to support OpenCL atomic operations in Nouveau
to shared virtual memory (SVM) regions allocated with the
CL_MEM_SVM_ATOMICS clSVMAlloc flag. A more complete description of the
OpenCL SVM feature is available at
https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_API.html#_shared_virtual_memory .

Implementation
==============

Exclusive device access is implemented by adding a new swap entry type
(SWAP_DEVICE_EXCLUSIVE) which is similar to a migration entry. The main
difference is that on fault the original entry is immediately restored by
the fault handler instead of waiting.

Restoring the entry triggers calls to MMU notifiers, which allow a device
driver to revoke the atomic access permission from the GPU prior to the CPU
finalising the entry.
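
To make the flow above concrete, here is a rough userspace model of it.
This is only a sketch and not the code in the series: every name
(model_page, model_make_exclusive(), model_cpu_fault() and so on) is
hypothetical, and the notifier is reduced to a plain function pointer. It
only tries to show the ordering described above: on a CPU fault the driver
is notified and revokes device access first, then the original mapping is
restored straight away rather than the fault waiting on the entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum model_pte_state {
	MODEL_PTE_PRESENT,          /* CPU mapping is valid */
	MODEL_PTE_DEVICE_EXCLUSIVE, /* CPU mapping replaced by a special entry */
};

struct model_page;
typedef void (*model_notifier_fn)(struct model_page *page);

struct model_page {
	enum model_pte_state state;
	bool device_has_atomic_access;
	model_notifier_fn notifier; /* stands in for an MMU notifier callback */
	int notifier_calls;
};

/* Stand-in for a driver callback: revoke the device's atomic permission. */
static void model_driver_revoke(struct model_page *page)
{
	page->device_has_atomic_access = false;
	page->notifier_calls++;
}

/* Unmap the page from the CPU and grant the device exclusive access. */
static bool model_make_exclusive(struct model_page *page)
{
	if (page->state != MODEL_PTE_PRESENT)
		return false;
	page->state = MODEL_PTE_DEVICE_EXCLUSIVE;
	page->device_has_atomic_access = true;
	return true;
}

/* CPU touches the page: notify first, then restore the original mapping. */
static void model_cpu_fault(struct model_page *page)
{
	if (page->state != MODEL_PTE_DEVICE_EXCLUSIVE)
		return; /* nothing to restore */
	if (page->notifier)
		page->notifier(page);
	page->state = MODEL_PTE_PRESENT;
}
```

A caller would initialise a model_page with model_driver_revoke as its
notifier, call model_make_exclusive() and then model_cpu_fault(). After the
fault the page is present again and the device no longer holds atomic
access.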

Patches
=======

Patches 1 & 2 refactor existing migration and device private entry
functions.

Patches 3 & 4 rework try_to_unmap_one() by splitting out unrelated
functionality into separate functions - try_to_migrate_one() and
try_to_munlock_one().

Patch 5 renames some existing code but does not introduce any new functionality.

Patch 6 is a small clean-up to swap entry handling in copy_pte_range().

Patch 7 contains the bulk of the implementation for device exclusive
memory.

Patch 8 contains some additions to the HMM selftests to ensure everything
works as expected.

Patch 9 is a cleanup for the Nouveau SVM implementation.

Patch 10 contains the implementation of atomic access for the Nouveau
driver.

Testing
=======

This has been tested with upstream Mesa 21.1.0 and a simple OpenCL program
which checks that GPU atomic accesses to system memory are atomic. Without
this series the test fails, as there is no way of write-protecting the page
mapping, which results in the device clobbering CPU writes. For reference,
the test is available at https://ozlabs.org/~apopple/opencl_svm_atomics/

Further testing has been performed by adding support for testing exclusive
access to the hmm-tests kselftests.
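
For readers without the hardware, the property being checked can be
illustrated with a CPU-only analogue. This is not the OpenCL test linked
above and it does not exercise this series. It is a sketch using C11
atomics and pthreads, with made-up names (model_worker(),
model_run_atomic_test()), in which several threads increment a shared
counter atomically and no update may be lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MODEL_THREADS 4
#define MODEL_ITERS   100000

/* Shared counter; stands in for a location in shared virtual memory. */
static atomic_long model_counter;

/* Each worker performs MODEL_ITERS atomic increments. */
static void *model_worker(void *arg)
{
	(void)arg;
	for (long i = 0; i < MODEL_ITERS; i++)
		atomic_fetch_add(&model_counter, 1);
	return NULL;
}

/* Returns the final counter value, or -1 if a thread could not be created. */
static long model_run_atomic_test(void)
{
	pthread_t threads[MODEL_THREADS];
	int started = 0;
	long result;

	atomic_store(&model_counter, 0);
	for (int i = 0; i < MODEL_THREADS; i++) {
		if (pthread_create(&threads[i], NULL, model_worker, NULL) != 0)
			break;
		started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	result = atomic_load(&model_counter);
	return started == MODEL_THREADS ? result : -1;
}
```

With atomic increments the result should equal MODEL_THREADS * MODEL_ITERS.
Replacing atomic_fetch_add() with a plain read-modify-write would allow
updates to be lost, which is the CPU-side analogue of the clobbering
described above.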


Alistair Popple (10):
  mm: Remove special swap entry functions
  mm/swapops: Rework swap entry manipulation code
  mm/rmap: Split try_to_munlock from try_to_unmap
  mm/rmap: Split migration into its own function
  mm: Rename migrate_pgmap_owner
  mm/memory.c: Allow different return codes for copy_nonpresent_pte()
  mm: Device exclusive memory access
  mm: Selftests for exclusive device memory
  nouveau/svm: Refactor nouveau_range_fault
  nouveau/svm: Implement atomic SVM access

 Documentation/vm/hmm.rst                      |  19 +-
 Documentation/vm/unevictable-lru.rst          |  33 +-
 arch/s390/mm/pgtable.c                        |   2 +-
 drivers/gpu/drm/nouveau/include/nvif/if000c.h |   1 +
 drivers/gpu/drm/nouveau/nouveau_svm.c         | 156 ++++-
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |   1 +
 .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c    |   6 +
 fs/proc/task_mmu.c                            |  23 +-
 include/linux/mmu_notifier.h                  |  26 +-
 include/linux/rmap.h                          |  11 +-
 include/linux/swap.h                          |  13 +-
 include/linux/swapops.h                       | 123 ++--
 lib/test_hmm.c                                | 126 +++-
 lib/test_hmm_uapi.h                           |   2 +
 mm/debug_vm_pgtable.c                         |  12 +-
 mm/hmm.c                                      |  12 +-
 mm/huge_memory.c                              |  45 +-
 mm/hugetlb.c                                  |  10 +-
 mm/memcontrol.c                               |   2 +-
 mm/memory.c                                   | 173 ++++-
 mm/migrate.c                                  |  51 +-
 mm/mlock.c                                    |  12 +-
 mm/mprotect.c                                 |  18 +-
 mm/page_vma_mapped.c                          |  15 +-
 mm/rmap.c                                     | 602 +++++++++++++++---
 tools/testing/selftests/vm/hmm-tests.c        | 158 +++++
 26 files changed, 1328 insertions(+), 324 deletions(-)

-- 
2.20.1


* Re: [PATCH v10 07/10] mm: Device exclusive memory access
From: kernel test robot @ 2021-06-09 23:02 UTC
  To: kbuild


CC: kbuild-all(a)lists.01.org
In-Reply-To: <20210607075855.5084-8-apopple@nvidia.com>
References: <20210607075855.5084-8-apopple@nvidia.com>
TO: Alistair Popple <apopple@nvidia.com>
TO: linux-mm(a)kvack.org
TO: akpm(a)linux-foundation.org
CC: rcampbell(a)nvidia.com
CC: linux-doc(a)vger.kernel.org
CC: nouveau(a)lists.freedesktop.org
CC: hughd(a)google.com
CC: linux-kernel(a)vger.kernel.org
CC: dri-devel(a)lists.freedesktop.org
CC: hch(a)infradead.org
CC: bskeggs(a)redhat.com

Hi Alistair,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on s390/features]
[also build test WARNING on kselftest/next linus/master v5.13-rc5]
[cannot apply to hnaz-linux-mm/master next-20210609]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Alistair-Popple/Add-support-for-SVM-atomics-in-Nouveau/20210607-160056
base:   https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features
:::::: branch date: 3 days ago
:::::: commit date: 3 days ago
config: parisc-randconfig-s032-20210607 (attached as .config)
compiler: hppa64-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.3-341-g8af24329-dirty
        # https://github.com/0day-ci/linux/commit/e54198410efccc00d5c3075e55dc424809de198b
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Alistair-Popple/Add-support-for-SVM-atomics-in-Nouveau/20210607-160056
        git checkout e54198410efccc00d5c3075e55dc424809de198b
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' W=1 ARCH=parisc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)
>> mm/memory.c:726:21: sparse: sparse: context imbalance in 'restore_exclusive_pte' - different lock contexts for basic block
   mm/memory.c:772:1: sparse: sparse: context imbalance in 'copy_nonpresent_pte' - different lock contexts for basic block
   mm/memory.c:981:9: sparse: sparse: context imbalance in 'copy_pte_range' - different lock contexts for basic block
   mm/memory.c: note: in included file (through include/linux/pgtable.h, arch/parisc/include/asm/io.h, include/linux/io.h, ...):
   arch/parisc/include/asm/pgtable.h:451:9: sparse: sparse: context imbalance in 'zap_pte_range' - different lock contexts for basic block
   mm/memory.c:1725:16: sparse: sparse: context imbalance in '__get_locked_pte' - different lock contexts for basic block
   mm/memory.c:1746:9: sparse: sparse: context imbalance in 'insert_page_into_pte_locked' - different lock contexts for basic block
   mm/memory.c:1774:9: sparse: sparse: context imbalance in 'insert_page' - different lock contexts for basic block
   mm/memory.c:2066:9: sparse: sparse: context imbalance in 'insert_pfn' - different lock contexts for basic block
   mm/memory.c:2285:17: sparse: sparse: context imbalance in 'remap_pte_range' - different lock contexts for basic block
   mm/memory.c:2532:17: sparse: sparse: context imbalance in 'apply_to_pte_range' - unexpected unlock
   mm/memory.c:3056:17: sparse: sparse: context imbalance in 'wp_page_copy' - different lock contexts for basic block
   mm/memory.c:3166:17: sparse: sparse: context imbalance in 'wp_pfn_shared' - unexpected unlock
   mm/memory.c:3229:19: sparse: sparse: context imbalance in 'do_wp_page' - different lock contexts for basic block
   mm/memory.c:3760:9: sparse: sparse: context imbalance in 'do_anonymous_page' - different lock contexts for basic block
   mm/memory.c:3928:9: sparse: sparse: context imbalance in 'do_set_pte' - different lock contexts for basic block
   mm/memory.c:4356:32: sparse: sparse: context imbalance in 'do_numa_page' - different lock contexts for basic block
   mm/memory.c:4533:9: sparse: sparse: context imbalance in 'handle_pte_fault' - different lock contexts for basic block
   mm/memory.c:4819:5: sparse: sparse: context imbalance in 'follow_invalidate_pte' - different lock contexts for basic block
   mm/memory.c:4940:9: sparse: sparse: context imbalance in 'follow_pfn' - unexpected unlock
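
As background on this class of warning: sparse tracks lock "context"
through annotations, and typically reports "different lock contexts for
basic block" when two paths reach the same point holding a different
number of locks. The sketch below is a standalone illustration with a toy
lock, and all toy_* names are hypothetical. The annotation macros mirror
the kernel's definitions under __CHECKER__ and compile away otherwise. It
does not claim to identify which call causes the imbalance in
restore_exclusive_pte(), as the report above does not say.

```c
#include <assert.h>

/*
 * Under sparse (__CHECKER__) these expand to context-tracking annotations,
 * mirroring the kernel's definitions; a normal compiler sees nothing.
 */
#ifdef __CHECKER__
# define __acquires(x) __attribute__((context(x, 0, 1)))
# define __releases(x) __attribute__((context(x, 1, 0)))
# define __acquire(x)  __context__(x, 1)
# define __release(x)  __context__(x, -1)
#else
# define __acquires(x)
# define __releases(x)
# define __acquire(x)  (void)0
# define __release(x)  (void)0
#endif

/* Toy lock, for illustration only; not a real spinlock. */
struct toy_lock { int held; };

static void toy_lock_acquire(struct toy_lock *l) __acquires(l)
{
	__acquire(l);
	l->held = 1;
}

static void toy_lock_release(struct toy_lock *l) __releases(l)
{
	l->held = 0;
	__release(l);
}

/*
 * Conditional locking: the increment is reached with either zero or one
 * lock held, which is the kind of pattern sparse tends to flag.
 */
static int toy_update_unbalanced(struct toy_lock *l, int *value, int take_lock)
{
	if (take_lock)
		toy_lock_acquire(l);
	*value += 1;
	if (take_lock)
		toy_lock_release(l);
	return *value;
}

/* Every path holds the lock exactly once, so the contexts match. */
static int toy_update_balanced(struct toy_lock *l, int *value)
{
	toy_lock_acquire(l);
	*value += 1;
	toy_lock_release(l);
	return *value;
}
```

Both functions behave the same at run time. The difference is only visible
to the checker, which is why these reports are warnings to review rather
than proof of a locking bug.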

vim +/restore_exclusive_pte +726 mm/memory.c

28093f9f34cede Gerald Schaefer 2016-04-28  702  
e54198410efccc Alistair Popple 2021-06-07  703  static void restore_exclusive_pte(struct vm_area_struct *vma,
e54198410efccc Alistair Popple 2021-06-07  704  				  struct page *page, unsigned long address,
e54198410efccc Alistair Popple 2021-06-07  705  				  pte_t *ptep)
e54198410efccc Alistair Popple 2021-06-07  706  {
e54198410efccc Alistair Popple 2021-06-07  707  	pte_t pte;
e54198410efccc Alistair Popple 2021-06-07  708  	swp_entry_t entry;
e54198410efccc Alistair Popple 2021-06-07  709  
e54198410efccc Alistair Popple 2021-06-07  710  	pte = pte_mkold(mk_pte(page, READ_ONCE(vma->vm_page_prot)));
e54198410efccc Alistair Popple 2021-06-07  711  	if (pte_swp_soft_dirty(*ptep))
e54198410efccc Alistair Popple 2021-06-07  712  		pte = pte_mksoft_dirty(pte);
e54198410efccc Alistair Popple 2021-06-07  713  
e54198410efccc Alistair Popple 2021-06-07  714  	entry = pte_to_swp_entry(*ptep);
e54198410efccc Alistair Popple 2021-06-07  715  	if (pte_swp_uffd_wp(*ptep))
e54198410efccc Alistair Popple 2021-06-07  716  		pte = pte_mkuffd_wp(pte);
e54198410efccc Alistair Popple 2021-06-07  717  	else if (is_writable_device_exclusive_entry(entry))
e54198410efccc Alistair Popple 2021-06-07  718  		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
e54198410efccc Alistair Popple 2021-06-07  719  
e54198410efccc Alistair Popple 2021-06-07  720  	set_pte_at(vma->vm_mm, address, ptep, pte);
e54198410efccc Alistair Popple 2021-06-07  721  
e54198410efccc Alistair Popple 2021-06-07  722  	/*
e54198410efccc Alistair Popple 2021-06-07  723  	 * No need to take a page reference as one was already
e54198410efccc Alistair Popple 2021-06-07  724  	 * created when the swap entry was made.
e54198410efccc Alistair Popple 2021-06-07  725  	 */
e54198410efccc Alistair Popple 2021-06-07 @726  	if (PageAnon(page))
e54198410efccc Alistair Popple 2021-06-07  727  		page_add_anon_rmap(page, vma, address, false);
e54198410efccc Alistair Popple 2021-06-07  728  	else
e54198410efccc Alistair Popple 2021-06-07  729  		/*
e54198410efccc Alistair Popple 2021-06-07  730  		 * Currently device exclusive access only supports anonymous
e54198410efccc Alistair Popple 2021-06-07  731  		 * memory so the entry shouldn't point to a filebacked page.
e54198410efccc Alistair Popple 2021-06-07  732  		 */
e54198410efccc Alistair Popple 2021-06-07  733  		WARN_ON_ONCE(!PageAnon(page));
e54198410efccc Alistair Popple 2021-06-07  734  
e54198410efccc Alistair Popple 2021-06-07  735  	if (vma->vm_flags & VM_LOCKED)
e54198410efccc Alistair Popple 2021-06-07  736  		mlock_vma_page(page);
e54198410efccc Alistair Popple 2021-06-07  737  
e54198410efccc Alistair Popple 2021-06-07  738  	/*
e54198410efccc Alistair Popple 2021-06-07  739  	 * No need to invalidate - it was non-present before. However
e54198410efccc Alistair Popple 2021-06-07  740  	 * secondary CPUs may have mappings that need invalidating.
e54198410efccc Alistair Popple 2021-06-07  741  	 */
e54198410efccc Alistair Popple 2021-06-07  742  	update_mmu_cache(vma, address, ptep);
e54198410efccc Alistair Popple 2021-06-07  743  }
e54198410efccc Alistair Popple 2021-06-07  744  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 37498 bytes --]
