All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/9] Emulated coherent graphics memory take 2
@ 2019-10-08  9:14 Thomas Hellström (VMware)
  2019-10-08  9:15 ` [PATCH v4 1/9] mm: Remove BUG_ON mmap_sem not held from xxx_trans_huge_lock() Thomas Hellström (VMware)
                   ` (8 more replies)
  0 siblings, 9 replies; 33+ messages in thread
From: Thomas Hellström (VMware) @ 2019-10-08  9:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: torvalds, Thomas Hellström, Andrew Morton, Matthew Wilcox,
	Will Deacon, Peter Zijlstra, Rik van Riel, Minchan Kim,
	Michal Hocko, Huang Ying, Jérôme Glisse,
	Kirill A . Shutemov

From: Thomas Hellström <thellstrom@vmware.com>

Graphics APIs like OpenGL 4.4 and Vulkan require the graphics driver
to provide coherent graphics memory, meaning that the GPU sees any
content written to the coherent memory on the next GPU operation that
touches that memory, and the CPU sees any content written by the GPU
to that memory immediately after any fence object trailing the GPU
operation is signaled.

Paravirtual drivers that otherwise require explicit synchronization
needs to do this by hooking up dirty tracking to pagefault handlers
and buffer object validation.

Provide mm helpers needed for this and that also allow for huge pmd-
and pud entries (patch 1-3), and the associated vmwgfx code (patch 4-7).

The code has been tested and exercised by a tailored version of mesa
where we disable all explicit synchronization and assume graphics memory
is coherent. The performance loss varies of course; a typical number is
around 5%.

I would like to merge this code through the DRM tree, so an ack to include
the new mm helpers in that merge would be greatly appreciated.

Changes since RFC:
- Merge conflict changes moved to the correct patch. Fixes intra-patchset
  compile errors.
- Be more aggressive when turning ttm vm code into helpers. This makes sure
  we can use a const qualifier on the vmwgfx vm_ops.
- Reinstate a lost comment an fix an error path that was broken when turning
  the ttm vm code into helpers.
- Remove explicit type-casts of struct vm_area_struct::vm_private_data
- Clarify the locking inversion that makes us not being able to use the mm
  pagewalk code.

Changes since v1:
- Removed the vmwgfx maintainer entry for as_dirty_helpers.c, updated
  commit message accordingly
- Removed the TTM patches from the series as they are merged separately
  through DRM.
Changes since v2:
- Split out the pagewalk code from as_dirty_helpers.c and document locking.
- Add pre_vma and post_vma callbacks to the pagewalk code.
- Remove huge pmd and -pud asserts that would trip when we protect vmas with
  struct address_space::i_mmap_rwsem rather than with
  struct vm_area_struct::mmap_sem.
- Do some naming cleanup in as_dirty_helpers.c
Changes since v3:
- Extensive renaming of the dirty helpers including the filename.
- Update walk_page_mapping() doc.
- Update the pagewalk code to not unconditionally split pmds if a pte_entry()
  callback is present. Update the dirty helper pmd_entry accordingly.
- Use separate walk ops for the dirty helpers.
- Update the pagewalk code to take the pagetable lock in walk_pte_range.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>

^ permalink raw reply	[flat|nested] 33+ messages in thread
* [PATCH v4 0/9] Emulated coherent graphics memory
@ 2019-06-11 12:24 Thomas Hellström (VMware)
  2019-06-11 12:24 ` [PATCH v4 7/9] drm/vmwgfx: Use an RBtree instead of linked list for MOB resources Thomas Hellström (VMware)
  0 siblings, 1 reply; 33+ messages in thread
From: Thomas Hellström (VMware) @ 2019-06-11 12:24 UTC (permalink / raw)
  To: dri-devel
  Cc: linux-graphics-maintainer, pv-drivers, linux-kernel,
	Andrew Morton, Matthew Wilcox, Will Deacon, Peter Zijlstra,
	Rik van Riel, Minchan Kim, Michal Hocko, Huang Ying,
	Souptick Joarder, Jérôme Glisse, Christian König,
	linux-mm

Planning to merge this through the drm/vmwgfx tree soon, so if there
are any objections, please speak up.

Graphics APIs like OpenGL 4.4 and Vulkan require the graphics driver
to provide coherent graphics memory, meaning that the GPU sees any
content written to the coherent memory on the next GPU operation that
touches that memory, and the CPU sees any content written by the GPU
to that memory immediately after any fence object trailing the GPU
operation has signaled.

Paravirtual drivers that otherwise require explicit synchronization
needs to do this by hooking up dirty tracking to pagefault handlers
and buffer object validation. This is a first attempt to do that for
the vmwgfx driver.

The mm patches has been out for RFC. I think I have addressed all the
feedback I got, except a possible softdirty breakage. But although the
dirty-tracking and softdirty may write-protect PTEs both care about,
that shouldn't really cause any operation interference. In particular
since we use the hardware dirty PTE bits and softdirty uses other PTE bits.

For the TTM changes they are hopefully in line with the long-term
strategy of making helpers out of what's left of TTM.

The code has been tested and excercised by a tailored version of mesa
where we disable all explicit synchronization and assume graphics memory
is coherent. The performance loss varies of course; a typical number is
around 5%.

Changes v1-v2:
- Addressed a number of typos and formatting issues.
- Added a usage warning for apply_to_pfn_range() and apply_to_page_range()
- Re-evaluated the decision to use apply_to_pfn_range() rather than
  modifying the pagewalk.c. It still looks like generically handling the
  transparent huge page cases requires the mmap_sem to be held at least
  in read mode, so sticking with apply_to_pfn_range() for now.
- The TTM page-fault helper vma copy argument was scratched in favour of
  a pageprot_t argument.
Changes v3:
- Adapted to upstream API changes.
Changes v4:
- Adapted to upstream mmu_notifier changes. (Jerome?)
- Fixed a couple of warnings on 32-bit x86
- Fixed image offset computation on multisample images.
  
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-mm@kvack.org


^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2019-10-13  4:21 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-08  9:14 [PATCH v4 0/9] Emulated coherent graphics memory take 2 Thomas Hellström (VMware)
2019-10-08  9:15 ` [PATCH v4 1/9] mm: Remove BUG_ON mmap_sem not held from xxx_trans_huge_lock() Thomas Hellström (VMware)
2019-10-08  9:15 ` [PATCH v4 2/9] mm: pagewalk: Take the pagetable lock in walk_pte_range() Thomas Hellström (VMware)
2019-10-09 15:14   ` Kirill A. Shutemov
2019-10-09 16:07     ` Linus Torvalds
2019-10-08  9:15 ` [PATCH v4 3/9] mm: pagewalk: Don't split transhuge pmds when a pmd_entry is present Thomas Hellström (VMware)
2019-10-09 15:27   ` Kirill A. Shutemov
2019-10-09 15:27     ` Kirill A. Shutemov
2019-10-09 16:20     ` Thomas Hellström (VMware)
2019-10-09 16:20       ` Thomas Hellström (VMware)        
2019-10-09 16:21     ` Linus Torvalds
2019-10-09 17:03       ` Thomas Hellström (VMware)
2019-10-09 17:16         ` Linus Torvalds
2019-10-09 18:52           ` Thomas Hellstrom
2019-10-09 19:20             ` Linus Torvalds
2019-10-09 20:06               ` Thomas Hellström (VMware)
2019-10-09 20:20                 ` Linus Torvalds
2019-10-09 22:30                   ` Thomas Hellström (VMware)
2019-10-09 23:50                     ` Thomas Hellström (VMware)
2019-10-09 23:51                     ` Linus Torvalds
2019-10-10  0:18                       ` Linus Torvalds
2019-10-10  1:09                       ` Thomas Hellström (VMware)
2019-10-10  2:07                         ` Linus Torvalds
2019-10-10  6:15                           ` Thomas Hellström (VMware)
2019-10-08  9:15 ` [PATCH v4 4/9] mm: Add a walk_page_mapping() function to the pagewalk code Thomas Hellström (VMware)
2019-10-08  9:15 ` [PATCH v4 5/9] mm: Add write-protect and clean utilities for address space ranges Thomas Hellström (VMware)
2019-10-08 17:06   ` Linus Torvalds
2019-10-08 18:25     ` Thomas Hellstrom
2019-10-08  9:15 ` [PATCH v4 6/9] drm/vmwgfx: Implement an infrastructure for write-coherent resources Thomas Hellström (VMware)
2019-10-08  9:15 ` [PATCH v4 7/9] drm/vmwgfx: Use an RBtree instead of linked list for MOB resources Thomas Hellström (VMware)
2019-10-08  9:15 ` [PATCH v4 8/9] drm/vmwgfx: Implement an infrastructure for read-coherent resources Thomas Hellström (VMware)
2019-10-08  9:15 ` [PATCH v4 9/9] drm/vmwgfx: Add surface dirty-tracking callbacks Thomas Hellström (VMware)
  -- strict thread matches above, loose matches on Subject: below --
2019-06-11 12:24 [PATCH v4 0/9] Emulated coherent graphics memory Thomas Hellström (VMware)
2019-06-11 12:24 ` [PATCH v4 7/9] drm/vmwgfx: Use an RBtree instead of linked list for MOB resources Thomas Hellström (VMware)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.