From: "Thomas Hellström (VMware)" <thomas_os@shipmail.org>
To: linux-mm@kvack.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Cc: "Ralph Campbell" <rcampbell@nvidia.com>,
	"Michal Hocko" <mhocko@suse.com>,
	pv-drivers@vmware.com, "Dan Williams" <dan.j.williams@intel.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	linux-graphics-maintainer@vmware.com,
	"Christian König" <christian.koenig@amd.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Ack to merge through DRM? WAS [PATCH v6 0/9] Huge page-table entries for TTM
Date: Mon, 16 Mar 2020 13:32:08 +0100
Message-ID: <9eb1acd3-cded-65f0-ed75-10173dc3a41c@shipmail.org>
In-Reply-To: <20200304102840.2801-1-thomas_os@shipmail.org>

On 3/4/20 11:28 AM, Thomas Hellström (VMware) wrote:
> In order to reduce CPU usage [1] and, in theory, TLB misses, this patchset
> enables huge- and giant page-table entries for TTM and TTM-enabled graphics
> drivers.
>
> Patches 1 and 2 introduce a vma_is_special_huge() function to make the mm code
> take the same path as DAX when splitting huge- and giant page-table entries
> (which currently means zapping the page-table entry and relying on
> re-faulting).
>
> Patch 3 makes the mm code split existing huge page-table entries
> on huge_fault fallbacks, typically on COW or for buffer objects that want
> write-notify. COW and write-notification are always done at the lowest
> page-table level. See the patch log message for additional considerations.
>
> Patch 4 introduces functions to allow the graphics drivers to manipulate
> the caching- and encryption flags of huge page-table entries without ugly
> hacks.
>
> Patch 5 implements the huge_fault handler in TTM.
> This enables huge page-table entries, provided that the kernel is configured
> to support transhuge pages, either by default or using madvise().
> However, they are unlikely to be inserted unless the kernel buffer object
> pfns and user-space addresses align perfectly. There are various options
> here, but since buffer objects that reside in system pages typically start
> at huge page boundaries if they are backed by huge pages, we try to enforce
> buffer object starting pfns and user-space addresses to be huge page-size
> aligned if their size exceeds a huge page-size. If pud-size transhuge
> ("giant") pages are enabled by the arch, the same holds for those.
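The alignment condition described for patch 5 can be sketched in plain C.
This is only an illustration under stated assumptions, not code from the
series: the helper name sketch_pmd_fits(), the 4 KiB page size and the 2 MiB
PMD size are hypothetical stand-ins (typical x86-64 values), and a real fault
handler would use the kernel's own page-size definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes for illustration only (4 KiB pages, 2 MiB PMD entries). */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PMD_SIZE   (UINT64_C(1) << 21)

/*
 * Hypothetical helper: could a single PMD-size entry map the huge-page-sized
 * region that contains fault_addr?  first_pfn is the pfn backing vma_start.
 */
static inline bool sketch_pmd_fits(uint64_t vma_start, uint64_t vma_end,
                                   uint64_t first_pfn, uint64_t fault_addr)
{
    uint64_t huge_start = fault_addr & ~(SKETCH_PMD_SIZE - 1);
    uint64_t pfn_at_huge_start;

    assert(fault_addr >= vma_start && fault_addr < vma_end);

    /* The whole huge-page range must lie inside the mapping. */
    if (huge_start < vma_start || huge_start + SKETCH_PMD_SIZE > vma_end)
        return false;

    /* The backing pfn must be aligned the same way as the user address. */
    pfn_at_huge_start = first_pfn +
        ((huge_start - vma_start) >> SKETCH_PAGE_SHIFT);
    return (pfn_at_huge_start &
            ((SKETCH_PMD_SIZE >> SKETCH_PAGE_SHIFT) - 1)) == 0;
}
```

For a mapping at 0x200000..0x600000 whose first pfn is 0x400, a fault at
0x234000 qualifies. Shifting either the user address or the pfn by a single
page disqualifies it, which is why both are aligned when the buffer object
exceeds a huge page in size.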
>
> Patch 6 implements a specialized huge_fault handler for vmwgfx.
> The vmwgfx driver may perform dirty-tracking and needs some special code
> to handle that correctly.
>
> Patch 7 implements a drm helper to align user-space addresses according
> to the above scheme, if possible.
>
> Patch 8 implements a TTM range manager for vmwgfx that does the same for
> graphics IO memory. This may later be reused by other graphics drivers
> if necessary.
>
> Patch 9 finally hooks up the helpers of patch 7 and 8 to the vmwgfx driver.
> A similar change is needed for graphics drivers that want a reasonable
> likelihood of actually using huge page-table entries.
>
> If a buffer object size is not huge-page or giant-page aligned,
> its size will NOT be inflated by this patchset. This means that the buffer
> object tail will use smaller size page-table entries and thus no memory
> overhead occurs. Drivers that want to pay the memory overhead price need to
> implement their own scheme to inflate buffer-object sizes.
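As a worked example of the "no inflation" behaviour, the arithmetic below
splits a buffer size into huge entries plus a small-entry tail. It is a
sketch only: sketch_count_entries() is a hypothetical name, and it assumes
the buffer starts huge-page aligned in both its user-space address and its
pfn, so that every full huge-page-sized chunk can use a huge entry.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_entry_count {
    uint64_t huge;   /* huge (e.g. PMD-size) entries */
    uint64_t small;  /* ordinary page-size entries covering the tail */
};

/* Hypothetical helper: entries needed to map `size` bytes without padding. */
static inline struct sketch_entry_count
sketch_count_entries(uint64_t size, uint64_t page_size, uint64_t huge_size)
{
    struct sketch_entry_count n;

    assert(page_size != 0 && huge_size % page_size == 0);

    n.huge = size / huge_size;
    /* The remainder is mapped with small pages instead of being rounded up. */
    n.small = (size % huge_size + page_size - 1) / page_size;
    return n;
}
```

With 4 KiB pages and 2 MiB huge pages, a 5 MiB buffer object would need 2
huge entries and 256 small ones, rather than being inflated to 6 MiB.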
>
> PMD size huge page-table-entries have been tested with vmwgfx and found to
> work well both with system memory backed and IO memory backed buffer objects.
>
> PUD size giant page-table-entries have seen limited (fault and COW) testing
> using a modified kernel (to support 1GB page allocations) and a fake vmwgfx
> TTM memory type. The vmwgfx driver does not otherwise support 1GB-size IO
> memory resources.
>
> Comments and suggestions welcome.
> Thomas
>
> Changes since RFC:
> * Check for buffer objects present in contiguous IO Memory (Christian König)
> * Rebased on the vmwgfx emulated coherent memory functionality. That rebase
>    adds patch 5.
> Changes since v1:
> * Make the new TTM range manager vmwgfx-specific. (Christian König)
> * Minor fixes for configs that don't support or only partially support
>    transhuge pages.
> Changes since v2:
> * Minor coding style and doc fixes in patch 5/9 (Christian König)
> * Patch 5/9 doesn't touch mm. Remove from the patch title.
> Changes since v3:
> * Added reviews and acks
> * Implemented ugly but generic ttm_pgprot_is_wrprotecting() instead of arch
>    specific code.
> Changes since v4:
> * Added timings (Andrew Morton)
> * Updated function documentation (Andrew Morton)
> Changes since v5:
> * Fix drm build error with !CONFIG_MMU
>
> [1]
> The test program below generates the following GNU time output when run on a
> vmwgfx-enabled kernel without the patch series:
>
> 4.78user 6.02system 0:10.91elapsed 99%CPU (0avgtext+0avgdata 1624maxresident)k
> 0inputs+0outputs (0major+640077minor)pagefaults 0swaps
>
> and with the patch series:
>
> 1.71user 3.60system 0:05.40elapsed 98%CPU (0avgtext+0avgdata 1656maxresident)k
> 0inputs+0outputs (0major+20079minor)pagefaults 0swaps
>
> A consistent reduction in the number of graphics page-faults can be seen with
> normal graphics applications, but due to the aggressive buffer object caching
> in vmwgfx user-space drivers the CPU time reduction is within the error margin.
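The minor-fault counts quoted in [1] can be cross-checked with a little
arithmetic. The sketch below assumes each 1024x1024, 32 bpp dumb buffer is
exactly 4 MiB (no pitch padding) and that, without huge entries, each fault
maps 16 pages of 4 KiB. That 16-page figure is an assumption inferred from
the numbers, not something stated in the cover letter, and
sketch_faults_per_buffer() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Faults needed to touch `size` bytes if each fault maps bytes_per_fault. */
static inline uint64_t sketch_faults_per_buffer(uint64_t size,
                                                uint64_t bytes_per_fault)
{
    assert(bytes_per_fault != 0);
    return (size + bytes_per_fault - 1) / bytes_per_fault;
}
```

Under those assumptions, 64 KiB per fault gives 64 faults per buffer, or
640000 over the 10000 iterations, against the reported 640077. With 2 MiB
entries it is 2 faults per buffer, or 20000 in total, against the reported
20079. The small remainders are plausibly process start-up faults.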
>
> #include <fcntl.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <unistd.h>
> #include <sys/mman.h>
> #include <xf86drm.h>
>
> /* Report the failing step via perror() and exit on a negative return value. */
> static void checkerr(int ret, const char *name)
> {
>    if (ret < 0) {
>      perror(name);
>      exit(-1);
>    }
> }
>
> int main(int argc, const char *argv[])
> {
>      struct drm_mode_create_dumb c_arg = {0};
>      struct drm_mode_map_dumb m_arg = {0};
>      struct drm_mode_destroy_dumb d_arg = {0};
>      int ret, i, fd;
>      void *map;
>
>      fd = open("/dev/dri/card0", O_RDWR);
>      checkerr(fd, argv[0]);
>
>      for (i = 0; i < 10000; ++i) {
>        /* Create a 1024x1024, 32 bpp dumb buffer. */
>        c_arg.bpp = 32;
>        c_arg.width = 1024;
>        c_arg.height = 1024;
>        ret = drmIoctl(fd, DRM_IOCTL_MODE_CREATE_DUMB, &c_arg);
>        checkerr(ret, argv[0]);
>
>        /* Look up the mmap offset for the buffer. */
>        m_arg.handle = c_arg.handle;
>        ret = drmIoctl(fd, DRM_IOCTL_MODE_MAP_DUMB, &m_arg);
>        checkerr(ret, argv[0]);
>
>        map = mmap(NULL, c_arg.size, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
>                   m_arg.offset);
>        checkerr(map == MAP_FAILED ? -1 : 0, argv[0]);
>
>        /* Ask for huge pages, then touch every page to trigger the faults. */
>        (void) madvise(map, c_arg.size, MADV_HUGEPAGE);
>        memset(map, 0x67, c_arg.size);
>        munmap(map, c_arg.size);
>
>        d_arg.handle = c_arg.handle;
>        ret = drmIoctl(fd, DRM_IOCTL_MODE_DESTROY_DUMB, &d_arg);
>        checkerr(ret, argv[0]);
>      }
>
>      close(fd);
>      return 0;
> }
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Ralph Campbell <rcampbell@nvidia.com>
> Cc: "Jérôme Glisse" <jglisse@redhat.com>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
>
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Andrew, would it be possible to get your ack to merge the -mm patches
through a DRM tree?

Thanks,

Thomas





Thread overview:
2020-03-04 10:28 [PATCH v6 0/9] Huge page-table entries for TTM Thomas Hellström (VMware)
2020-03-04 10:28 ` [PATCH v6 1/9] fs: Constify vma argument to vma_is_dax Thomas Hellström (VMware)
2020-03-04 10:28 ` [PATCH v6 2/9] mm: Introduce vma_is_special_huge Thomas Hellström (VMware)
2020-03-04 10:28 ` [PATCH v6 3/9] mm: Split huge pages on write-notify or COW Thomas Hellström (VMware)
2020-03-04 10:28 ` [PATCH v6 4/9] mm: Add vmf_insert_pfn_xxx_prot() for huge page-table entries Thomas Hellström (VMware)
2020-03-04 10:28 ` [PATCH v6 5/9] drm/ttm, drm/vmwgfx: Support huge TTM pagefaults Thomas Hellström (VMware)
2020-03-04 10:28 ` [PATCH v6 6/9] drm/vmwgfx: Support huge page faults Thomas Hellström (VMware)
2020-03-04 10:28 ` [PATCH v6 7/9] drm: Add a drm_get_unmapped_area() helper Thomas Hellström (VMware)
2020-03-04 10:28 ` [PATCH v6 8/9] drm/vmwgfx: Introduce a huge page aligning TTM range manager Thomas Hellström (VMware)
2020-03-04 10:28 ` [PATCH v6 9/9] drm/vmwgfx: Hook up the helpers to align buffer objects Thomas Hellström (VMware)
2020-03-16 12:32 ` Ack to merge through DRM? WAS [PATCH v6 0/9] Huge page-table entries for TTM Thomas Hellström (VMware) [this message]
2020-03-18 23:27   ` Andrew Morton
2020-03-19 10:20     ` Thomas Hellström (VMware)
2020-03-21  1:58       ` Andrew Morton
2020-03-24 10:03 ` Separate pull request? WAS: " Thomas Hellström (VMware)
2020-03-24 10:31   ` Koenig, Christian