linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap
@ 2021-02-05  8:06 John Stultz
  2021-02-05  8:06 ` [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation John Stultz
                   ` (7 more replies)
  0 siblings, 8 replies; 33+ messages in thread
From: John Stultz @ 2021-02-05  8:06 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Daniel Vetter, Christian Koenig, Sumit Semwal,
	Liam Mark, Chris Goldsworthy, Laura Abbott, Brian Starkey,
	Hridya Valsaraju, Suren Baghdasaryan, Sandeep Patil,
	Daniel Mentz, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

This series is starting to get long, so I figured I'd add a
short cover letter for context.

The point of this series is trying to add both deferred-freeing
logic as well as a page pool to the DMA-BUF system heap.

This is desired, as the combination of deferred freeing along
with the page pool allows us to offload page-zeroing out of
the allocation hot path. This was done originally with ION
and this patch series allows the DMA-BUF system heap to match
ION's system heap allocation performance in a simple
microbenchmark [1] (ION re-added to the kernel for comparision,
running on an x86 vm image):

./dmabuf-heap-bench -i 0 1 system                     
Testing dmabuf system vs ion heaptype 0 (flags: 0x1)
---------------------------------------------
dmabuf heap: alloc 4096 bytes 5000 times in 86572223 ns          17314 ns/call
ion heap:    alloc 4096 bytes 5000 times in 97442526 ns          19488 ns/call
dmabuf heap: alloc 1048576 bytes 5000 times in 196635057 ns      39327 ns/call
ion heap:    alloc 1048576 bytes 5000 times in 357323629 ns      71464 ns/call
dmabuf heap: alloc 8388608 bytes 5000 times in 3165445534 ns     633089 ns/call
ion heap:    alloc 8388608 bytes 5000 times in 3699591271 ns     739918 ns/call
dmabuf heap: alloc 33554432 bytes 5000 times in 13327402517 ns   2665480 ns/call
ion heap:    alloc 33554432 bytes 5000 times in 15292352796 ns   3058470 ns/call

Daniel didn't like earlier attempts to re-use the network
page-pool code to achieve this, and suggested the ttm_pool be
used instead. This required pulling the fairly tightly knit
ttm_pool logic apart, but after many failed attmempts I think
I found a workable abstraction to split out shared logic.

So this series contains a new generic drm_page_pool helper
library, converts the ttm_pool to use it, and then adds the
dmabuf deferred freeing and adds support to the dmabuf system
heap to use both deferred freeing and the new drm_page_pool.

Input would be greatly appreciated. Testing as well, as I don't
have any development hardware that utilizes the ttm pool.

thanks
-john

[1] https://android.googlesource.com/platform/system/memory/libdmabufheap/+/refs/heads/master/tests/dmabuf_heap_bench.c

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org

John Stultz (7):
  drm: Add a sharable drm page-pool implementation
  drm: ttm_pool: Rename the ttm_pool_dma structure to ttm_pool_page_dat
  drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a
    function pointer
  drm: ttm_pool: Rework ttm_pool to use drm_page_pool
  dma-buf: heaps: Add deferred-free-helper library code
  dma-buf: system_heap: Add drm pagepool support to system heap
  dma-buf: system_heap: Add deferred freeing to the system heap

 drivers/dma-buf/heaps/Kconfig                |   5 +
 drivers/dma-buf/heaps/Makefile               |   1 +
 drivers/dma-buf/heaps/deferred-free-helper.c | 145 ++++++++++
 drivers/dma-buf/heaps/deferred-free-helper.h |  55 ++++
 drivers/dma-buf/heaps/system_heap.c          |  77 ++++-
 drivers/gpu/drm/Kconfig                      |   5 +
 drivers/gpu/drm/Makefile                     |   1 +
 drivers/gpu/drm/page_pool.c                  | 220 +++++++++++++++
 drivers/gpu/drm/ttm/ttm_pool.c               | 278 ++++++-------------
 include/drm/page_pool.h                      |  54 ++++
 include/drm/ttm/ttm_pool.h                   |  23 +-
 11 files changed, 639 insertions(+), 225 deletions(-)
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.c
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.h
 create mode 100644 drivers/gpu/drm/page_pool.c
 create mode 100644 include/drm/page_pool.h

-- 
2.25.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-05  8:06 [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap John Stultz
@ 2021-02-05  8:06 ` John Stultz
  2021-02-05  8:46   ` Christian König
  2021-02-05  8:06 ` [RFC][PATCH v6 2/7] drm: ttm_pool: Rename the ttm_pool_dma structure to ttm_pool_page_dat John Stultz
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2021-02-05  8:06 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Daniel Vetter, Christian Koenig, Sumit Semwal,
	Liam Mark, Chris Goldsworthy, Laura Abbott, Brian Starkey,
	Hridya Valsaraju, Suren Baghdasaryan, Sandeep Patil,
	Daniel Mentz, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

This adds a shrinker controlled page pool, closely
following the ttm_pool logic, which is abstracted out
a bit so it can be used by other non-ttm drivers.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 drivers/gpu/drm/Kconfig     |   4 +
 drivers/gpu/drm/Makefile    |   1 +
 drivers/gpu/drm/page_pool.c | 220 ++++++++++++++++++++++++++++++++++++
 include/drm/page_pool.h     |  54 +++++++++
 4 files changed, 279 insertions(+)
 create mode 100644 drivers/gpu/drm/page_pool.c
 create mode 100644 include/drm/page_pool.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 0973f408d75f..d16bf340ed2e 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -174,6 +174,10 @@ config DRM_DP_CEC
 	  Note: not all adapters support this feature, and even for those
 	  that do support this they often do not hook up the CEC pin.
 
+config DRM_PAGE_POOL
+	bool
+	depends on DRM
+
 config DRM_TTM
 	tristate
 	depends on DRM && MMU
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index fefaff4c832d..877e0111ed34 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -32,6 +32,7 @@ drm-$(CONFIG_AGP) += drm_agpsupport.o
 drm-$(CONFIG_PCI) += drm_pci.o
 drm-$(CONFIG_DEBUG_FS) += drm_debugfs.o drm_debugfs_crc.o
 drm-$(CONFIG_DRM_LOAD_EDID_FIRMWARE) += drm_edid_load.o
+drm-$(CONFIG_DRM_PAGE_POOL) += page_pool.o
 
 drm_vram_helper-y := drm_gem_vram_helper.o
 obj-$(CONFIG_DRM_VRAM_HELPER) += drm_vram_helper.o
diff --git a/drivers/gpu/drm/page_pool.c b/drivers/gpu/drm/page_pool.c
new file mode 100644
index 000000000000..2139f86e6ca7
--- /dev/null
+++ b/drivers/gpu/drm/page_pool.c
@@ -0,0 +1,220 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * DRM page pool system
+ *
+ * Copyright (C) 2020 Linaro Ltd.
+ *
+ * Based on the ION page pool code
+ * Copyright (C) 2011 Google, Inc.
+ * As well as the ttm_pool code
+ * Copyright (C) 2020 Advanced Micro Devices, Inc.
+ */
+
+#include <linux/freezer.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/swap.h>
+#include <linux/sched/signal.h>
+#include <drm/page_pool.h>
+
+static LIST_HEAD(pool_list);
+static DEFINE_MUTEX(pool_list_lock);
+static atomic_long_t total_pages;
+static unsigned long page_pool_max;
+MODULE_PARM_DESC(page_pool_max, "Number of pages in the WC/UC/DMA pool");
+module_param(page_pool_max, ulong, 0644);
+
+void drm_page_pool_set_max(unsigned long max)
+{
+	/* only write once */
+	if (!page_pool_max)
+		page_pool_max = max;
+}
+
+unsigned long drm_page_pool_get_max(void)
+{
+	return page_pool_max;
+}
+
+unsigned long drm_page_pool_get_total(void)
+{
+	return atomic_long_read(&total_pages);
+}
+
+int drm_page_pool_get_size(struct drm_page_pool *pool)
+{
+	int ret;
+
+	spin_lock(&pool->lock);
+	ret = pool->count;
+	spin_unlock(&pool->lock);
+	return ret;
+}
+
+static inline unsigned int drm_page_pool_free_pages(struct drm_page_pool *pool,
+						    struct page *page)
+{
+	return pool->free(page, pool->order);
+}
+
+static int drm_page_pool_shrink_one(void);
+
+void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
+{
+	spin_lock(&pool->lock);
+	list_add_tail(&page->lru, &pool->items);
+	pool->count++;
+	atomic_long_add(1 << pool->order, &total_pages);
+	spin_unlock(&pool->lock);
+
+	mod_node_page_state(page_pgdat(page), NR_KERNEL_MISC_RECLAIMABLE,
+			    1 << pool->order);
+
+	/* make sure we don't grow too large */
+	while (page_pool_max && atomic_long_read(&total_pages) > page_pool_max)
+		drm_page_pool_shrink_one();
+}
+EXPORT_SYMBOL_GPL(drm_page_pool_add);
+
+static struct page *drm_page_pool_remove(struct drm_page_pool *pool)
+{
+	struct page *page;
+
+	if (!pool->count)
+		return NULL;
+
+	page = list_first_entry(&pool->items, struct page, lru);
+	pool->count--;
+	atomic_long_sub(1 << pool->order, &total_pages);
+
+	list_del(&page->lru);
+	mod_node_page_state(page_pgdat(page), NR_KERNEL_MISC_RECLAIMABLE,
+			    -(1 << pool->order));
+	return page;
+}
+
+struct page *drm_page_pool_fetch(struct drm_page_pool *pool)
+{
+	struct page *page = NULL;
+
+	if (!pool) {
+		WARN_ON(!pool);
+		return NULL;
+	}
+
+	spin_lock(&pool->lock);
+	page = drm_page_pool_remove(pool);
+	spin_unlock(&pool->lock);
+
+	return page;
+}
+EXPORT_SYMBOL_GPL(drm_page_pool_fetch);
+
+struct drm_page_pool *drm_page_pool_create(unsigned int order,
+					   int (*free_page)(struct page *p, unsigned int order))
+{
+	struct drm_page_pool *pool = kmalloc(sizeof(*pool), GFP_KERNEL);
+
+	if (!pool)
+		return NULL;
+
+	pool->count = 0;
+	INIT_LIST_HEAD(&pool->items);
+	pool->order = order;
+	pool->free = free_page;
+	spin_lock_init(&pool->lock);
+	INIT_LIST_HEAD(&pool->list);
+
+	mutex_lock(&pool_list_lock);
+	list_add(&pool->list, &pool_list);
+	mutex_unlock(&pool_list_lock);
+
+	return pool;
+}
+EXPORT_SYMBOL_GPL(drm_page_pool_create);
+
+void drm_page_pool_destroy(struct drm_page_pool *pool)
+{
+	struct page *page;
+
+	/* Remove us from the pool list */
+	mutex_lock(&pool_list_lock);
+	list_del(&pool->list);
+	mutex_unlock(&pool_list_lock);
+
+	/* Free any remaining pages in the pool */
+	spin_lock(&pool->lock);
+	while (pool->count) {
+		page = drm_page_pool_remove(pool);
+		spin_unlock(&pool->lock);
+		drm_page_pool_free_pages(pool, page);
+		spin_lock(&pool->lock);
+	}
+	spin_unlock(&pool->lock);
+
+	kfree(pool);
+}
+EXPORT_SYMBOL_GPL(drm_page_pool_destroy);
+
+static int drm_page_pool_shrink_one(void)
+{
+	struct drm_page_pool *pool;
+	struct page *page;
+	int nr_freed = 0;
+
+	mutex_lock(&pool_list_lock);
+	pool = list_first_entry(&pool_list, typeof(*pool), list);
+
+	spin_lock(&pool->lock);
+	page = drm_page_pool_remove(pool);
+	spin_unlock(&pool->lock);
+
+	if (page)
+		nr_freed = drm_page_pool_free_pages(pool, page);
+
+	list_move_tail(&pool->list, &pool_list);
+	mutex_unlock(&pool_list_lock);
+
+	return nr_freed;
+}
+
+static unsigned long drm_page_pool_shrink_count(struct shrinker *shrinker,
+						struct shrink_control *sc)
+{
+	unsigned long count =  atomic_long_read(&total_pages);
+
+	return count ? count : SHRINK_EMPTY;
+}
+
+static unsigned long drm_page_pool_shrink_scan(struct shrinker *shrinker,
+					       struct shrink_control *sc)
+{
+	int to_scan = sc->nr_to_scan;
+	int nr_total = 0;
+
+	if (to_scan == 0)
+		return 0;
+
+	do {
+		int nr_freed = drm_page_pool_shrink_one();
+
+		to_scan -= nr_freed;
+		nr_total += nr_freed;
+	} while (to_scan >= 0 && atomic_long_read(&total_pages));
+
+	return nr_total;
+}
+
+static struct shrinker pool_shrinker = {
+	.count_objects = drm_page_pool_shrink_count,
+	.scan_objects = drm_page_pool_shrink_scan,
+	.seeks = 1,
+	.batch = 0,
+};
+
+int drm_page_pool_init_shrinker(void)
+{
+	return register_shrinker(&pool_shrinker);
+}
+module_init(drm_page_pool_init_shrinker);
+MODULE_LICENSE("GPL v2");
diff --git a/include/drm/page_pool.h b/include/drm/page_pool.h
new file mode 100644
index 000000000000..47e240b2bc69
--- /dev/null
+++ b/include/drm/page_pool.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright 2020 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Christian König
+ */
+
+#ifndef _DRM_PAGE_POOL_H_
+#define _DRM_PAGE_POOL_H_
+
+#include <linux/mmzone.h>
+#include <linux/llist.h>
+#include <linux/spinlock.h>
+
+struct drm_page_pool {
+	int count;
+	struct list_head items;
+
+	int order;
+	int (*free)(struct page *p, unsigned int order);
+
+	spinlock_t lock;
+	struct list_head list;
+};
+
+void drm_page_pool_set_max(unsigned long max);
+unsigned long drm_page_pool_get_max(void);
+unsigned long drm_page_pool_get_total(void);
+int drm_page_pool_get_size(struct drm_page_pool *pool);
+struct page *drm_page_pool_fetch(struct drm_page_pool *pool);
+void drm_page_pool_add(struct drm_page_pool *pool, struct page *page);
+struct drm_page_pool *drm_page_pool_create(unsigned int order,
+					   int (*free_page)(struct page *p, unsigned int order));
+void drm_page_pool_destroy(struct drm_page_pool *pool);
+
+#endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC][PATCH v6 2/7] drm: ttm_pool: Rename the ttm_pool_dma structure to ttm_pool_page_dat
  2021-02-05  8:06 [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap John Stultz
  2021-02-05  8:06 ` [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation John Stultz
@ 2021-02-05  8:06 ` John Stultz
  2021-02-05  8:06 ` [RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer John Stultz
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 33+ messages in thread
From: John Stultz @ 2021-02-05  8:06 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Daniel Vetter, Christian Koenig, Sumit Semwal,
	Liam Mark, Chris Goldsworthy, Laura Abbott, Brian Starkey,
	Hridya Valsaraju, Suren Baghdasaryan, Sandeep Patil,
	Daniel Mentz, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

This patch simply renames the ttm_pool_dma structure to
ttm_pool_page_dat, as we will extend it to store more then just
dma related info in it.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 drivers/gpu/drm/ttm/ttm_pool.c | 37 +++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index 11e0313db0ea..c0274e256be3 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -37,18 +37,17 @@
 #ifdef CONFIG_X86
 #include <asm/set_memory.h>
 #endif
-
 #include <drm/ttm/ttm_pool.h>
 #include <drm/ttm/ttm_bo_driver.h>
 #include <drm/ttm/ttm_tt.h>
 
 /**
- * struct ttm_pool_dma - Helper object for coherent DMA mappings
+ * struct ttm_pool_page_dat - Helper object for coherent DMA mappings
  *
  * @addr: original DMA address returned for the mapping
  * @vaddr: original vaddr return for the mapping and order in the lower bits
  */
-struct ttm_pool_dma {
+struct ttm_pool_page_dat {
 	dma_addr_t addr;
 	unsigned long vaddr;
 };
@@ -75,7 +74,7 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 					unsigned int order)
 {
 	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
-	struct ttm_pool_dma *dma;
+	struct ttm_pool_page_dat *dat;
 	struct page *p;
 	void *vaddr;
 
@@ -94,15 +93,15 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 		return p;
 	}
 
-	dma = kmalloc(sizeof(*dma), GFP_KERNEL);
-	if (!dma)
+	dat = kmalloc(sizeof(*dat), GFP_KERNEL);
+	if (!dat)
 		return NULL;
 
 	if (order)
 		attr |= DMA_ATTR_NO_WARN;
 
 	vaddr = dma_alloc_attrs(pool->dev, (1ULL << order) * PAGE_SIZE,
-				&dma->addr, gfp_flags, attr);
+				&dat->addr, gfp_flags, attr);
 	if (!vaddr)
 		goto error_free;
 
@@ -114,12 +113,12 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 	else
 		p = virt_to_page(vaddr);
 
-	dma->vaddr = (unsigned long)vaddr | order;
-	p->private = (unsigned long)dma;
+	dat->vaddr = (unsigned long)vaddr | order;
+	p->private = (unsigned long)dat;
 	return p;
 
 error_free:
-	kfree(dma);
+	kfree(dat);
 	return NULL;
 }
 
@@ -128,7 +127,7 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
 			       unsigned int order, struct page *p)
 {
 	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
-	struct ttm_pool_dma *dma;
+	struct ttm_pool_page_dat *dat;
 	void *vaddr;
 
 #ifdef CONFIG_X86
@@ -147,11 +146,11 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
 	if (order)
 		attr |= DMA_ATTR_NO_WARN;
 
-	dma = (void *)p->private;
-	vaddr = (void *)(dma->vaddr & PAGE_MASK);
-	dma_free_attrs(pool->dev, (1UL << order) * PAGE_SIZE, vaddr, dma->addr,
+	dat = (void *)p->private;
+	vaddr = (void *)(dat->vaddr & PAGE_MASK);
+	dma_free_attrs(pool->dev, (1UL << order) * PAGE_SIZE, vaddr, dat->addr,
 		       attr);
-	kfree(dma);
+	kfree(dat);
 }
 
 /* Apply a new caching to an array of pages */
@@ -184,9 +183,9 @@ static int ttm_pool_map(struct ttm_pool *pool, unsigned int order,
 	unsigned int i;
 
 	if (pool->use_dma_alloc) {
-		struct ttm_pool_dma *dma = (void *)p->private;
+		struct ttm_pool_page_dat *dat = (void *)p->private;
 
-		addr = dma->addr;
+		addr = dat->addr;
 	} else {
 		size_t size = (1ULL << order) * PAGE_SIZE;
 
@@ -324,9 +323,9 @@ static unsigned int ttm_pool_shrink(void)
 static unsigned int ttm_pool_page_order(struct ttm_pool *pool, struct page *p)
 {
 	if (pool->use_dma_alloc) {
-		struct ttm_pool_dma *dma = (void *)p->private;
+		struct ttm_pool_page_dat *dat = (void *)p->private;
 
-		return dma->vaddr & ~PAGE_MASK;
+		return dat->vaddr & ~PAGE_MASK;
 	}
 
 	return p->private;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer
  2021-02-05  8:06 [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap John Stultz
  2021-02-05  8:06 ` [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation John Stultz
  2021-02-05  8:06 ` [RFC][PATCH v6 2/7] drm: ttm_pool: Rename the ttm_pool_dma structure to ttm_pool_page_dat John Stultz
@ 2021-02-05  8:06 ` John Stultz
  2021-02-05  8:28   ` Christian König
  2021-02-05  8:06 ` [RFC][PATCH v6 4/7] drm: ttm_pool: Rework ttm_pool to use drm_page_pool John Stultz
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2021-02-05  8:06 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Daniel Vetter, Christian Koenig, Sumit Semwal,
	Liam Mark, Chris Goldsworthy, Laura Abbott, Brian Starkey,
	Hridya Valsaraju, Suren Baghdasaryan, Sandeep Patil,
	Daniel Mentz, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

This refactors ttm_pool_free_page(), and by adding extra entries
to ttm_pool_page_dat, we then use it for all allocations, which
allows us to simplify the arguments needed to be passed to
ttm_pool_free_page().

This is critical for allowing the free function to be called
by the sharable drm_page_pool logic.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 drivers/gpu/drm/ttm/ttm_pool.c | 60 ++++++++++++++++++----------------
 1 file changed, 32 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index c0274e256be3..eca36678f967 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -44,10 +44,14 @@
 /**
  * struct ttm_pool_page_dat - Helper object for coherent DMA mappings
  *
+ * @pool: ttm_pool pointer the page was allocated by
+ * @caching: the caching value the allocated page was configured for
  * @addr: original DMA address returned for the mapping
  * @vaddr: original vaddr return for the mapping and order in the lower bits
  */
 struct ttm_pool_page_dat {
+	struct ttm_pool *pool;
+	enum ttm_caching caching;
 	dma_addr_t addr;
 	unsigned long vaddr;
 };
@@ -71,13 +75,20 @@ static struct shrinker mm_shrinker;
 
 /* Allocate pages of size 1 << order with the given gfp_flags */
 static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
-					unsigned int order)
+					unsigned int order, enum ttm_caching caching)
 {
 	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
 	struct ttm_pool_page_dat *dat;
 	struct page *p;
 	void *vaddr;
 
+	dat = kmalloc(sizeof(*dat), GFP_KERNEL);
+	if (!dat)
+		return NULL;
+
+	dat->pool = pool;
+	dat->caching = caching;
+
 	/* Don't set the __GFP_COMP flag for higher order allocations.
 	 * Mapping pages directly into an userspace process and calling
 	 * put_page() on a TTM allocated page is illegal.
@@ -88,15 +99,13 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 
 	if (!pool->use_dma_alloc) {
 		p = alloc_pages(gfp_flags, order);
-		if (p)
-			p->private = order;
+		if (!p)
+			goto error_free;
+		dat->vaddr = order;
+		p->private = (unsigned long)dat;
 		return p;
 	}
 
-	dat = kmalloc(sizeof(*dat), GFP_KERNEL);
-	if (!dat)
-		return NULL;
-
 	if (order)
 		attr |= DMA_ATTR_NO_WARN;
 
@@ -123,34 +132,34 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 }
 
 /* Reset the caching and pages of size 1 << order */
-static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
-			       unsigned int order, struct page *p)
+static int ttm_pool_free_page(struct page *p, unsigned int order)
 {
 	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
-	struct ttm_pool_page_dat *dat;
+	struct ttm_pool_page_dat *dat = (void *)p->private;
 	void *vaddr;
 
 #ifdef CONFIG_X86
 	/* We don't care that set_pages_wb is inefficient here. This is only
 	 * used when we have to shrink and CPU overhead is irrelevant then.
 	 */
-	if (caching != ttm_cached && !PageHighMem(p))
+	if (dat->caching != ttm_cached && !PageHighMem(p))
 		set_pages_wb(p, 1 << order);
 #endif
 
-	if (!pool || !pool->use_dma_alloc) {
+	if (!dat->pool || !dat->pool->use_dma_alloc) {
 		__free_pages(p, order);
-		return;
+		goto out;
 	}
 
 	if (order)
 		attr |= DMA_ATTR_NO_WARN;
 
-	dat = (void *)p->private;
 	vaddr = (void *)(dat->vaddr & PAGE_MASK);
-	dma_free_attrs(pool->dev, (1UL << order) * PAGE_SIZE, vaddr, dat->addr,
+	dma_free_attrs(dat->pool->dev, (1UL << order) * PAGE_SIZE, vaddr, dat->addr,
 		       attr);
+out:
 	kfree(dat);
+	return 1 << order;
 }
 
 /* Apply a new caching to an array of pages */
@@ -264,7 +273,7 @@ static void ttm_pool_type_fini(struct ttm_pool_type *pt)
 	mutex_unlock(&shrinker_lock);
 
 	list_for_each_entry_safe(p, tmp, &pt->pages, lru)
-		ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
+		ttm_pool_free_page(p, pt->order);
 }
 
 /* Return the pool_type to use for the given caching and order */
@@ -307,7 +316,7 @@ static unsigned int ttm_pool_shrink(void)
 
 	p = ttm_pool_type_take(pt);
 	if (p) {
-		ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
+		ttm_pool_free_page(p, pt->order);
 		num_freed = 1 << pt->order;
 	} else {
 		num_freed = 0;
@@ -322,13 +331,9 @@ static unsigned int ttm_pool_shrink(void)
 /* Return the allocation order based for a page */
 static unsigned int ttm_pool_page_order(struct ttm_pool *pool, struct page *p)
 {
-	if (pool->use_dma_alloc) {
-		struct ttm_pool_page_dat *dat = (void *)p->private;
-
-		return dat->vaddr & ~PAGE_MASK;
-	}
+	struct ttm_pool_page_dat *dat = (void *)p->private;
 
-	return p->private;
+	return dat->vaddr & ~PAGE_MASK;
 }
 
 /**
@@ -379,7 +384,7 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
 		if (p) {
 			apply_caching = true;
 		} else {
-			p = ttm_pool_alloc_page(pool, gfp_flags, order);
+			p = ttm_pool_alloc_page(pool, gfp_flags, order, tt->caching);
 			if (p && PageHighMem(p))
 				apply_caching = true;
 		}
@@ -428,13 +433,13 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
 	ttm_mem_global_free_page(&ttm_mem_glob, p, (1 << order) * PAGE_SIZE);
 
 error_free_page:
-	ttm_pool_free_page(pool, tt->caching, order, p);
+	ttm_pool_free_page(p, order);
 
 error_free_all:
 	num_pages = tt->num_pages - num_pages;
 	for (i = 0; i < num_pages; ) {
 		order = ttm_pool_page_order(pool, tt->pages[i]);
-		ttm_pool_free_page(pool, tt->caching, order, tt->pages[i]);
+		ttm_pool_free_page(tt->pages[i], order);
 		i += 1 << order;
 	}
 
@@ -470,8 +475,7 @@ void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt)
 		if (pt)
 			ttm_pool_type_give(pt, tt->pages[i]);
 		else
-			ttm_pool_free_page(pool, tt->caching, order,
-					   tt->pages[i]);
+			ttm_pool_free_page(tt->pages[i], order);
 
 		i += num_pages;
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC][PATCH v6 4/7] drm: ttm_pool: Rework ttm_pool to use drm_page_pool
  2021-02-05  8:06 [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap John Stultz
                   ` (2 preceding siblings ...)
  2021-02-05  8:06 ` [RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer John Stultz
@ 2021-02-05  8:06 ` John Stultz
  2021-02-05  8:50   ` Christian König
  2021-02-05  8:06 ` [RFC][PATCH v6 5/7] dma-buf: heaps: Add deferred-free-helper library code John Stultz
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2021-02-05  8:06 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Daniel Vetter, Christian Koenig, Sumit Semwal,
	Liam Mark, Chris Goldsworthy, Laura Abbott, Brian Starkey,
	Hridya Valsaraju, Suren Baghdasaryan, Sandeep Patil,
	Daniel Mentz, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

This patch reworks the ttm_pool logic to utilize the recently
added drm_page_pool code.

Basically all the ttm_pool_type structures are replaced with
drm_page_pool pointers, and since the drm_page_pool logic has
its own shrinker, we can remove all of the ttm_pool shrinker
logic.

NOTE: There is one mismatch in the interfaces I'm not totally
happy with. The ttm_pool tracks all of its pooled pages across
a number of different pools, and tries to keep this size under
the specified page_pool_size value. With the drm_page_pool,
there may other users, however there is still one global
shrinker list of pools. So we can't easily reduce the ttm
pool under the ttm specified size without potentially doing
a lot of shrinking to other non-ttm pools. So either we can:
  1) Try to split it so each user of drm_page_pools manages its
     own pool shrinking.
  2) Push the max value into the drm_page_pool, and have it
     manage shrinking to fit under that global max. Then share
     those size/max values out so the ttm_pool debug output
     can have more context.

I've taken the second path in this patch set, but wanted to call
it out so folks could look closely.

Thoughts would be greatly appreciated here!

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 drivers/gpu/drm/Kconfig        |   1 +
 drivers/gpu/drm/ttm/ttm_pool.c | 199 +++++++--------------------------
 include/drm/ttm/ttm_pool.h     |  23 +---
 3 files changed, 41 insertions(+), 182 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index d16bf340ed2e..d427abefabfb 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -181,6 +181,7 @@ config DRM_PAGE_POOL
 config DRM_TTM
 	tristate
 	depends on DRM && MMU
+	select DRM_PAGE_POOL
 	help
 	  GPU memory management subsystem for devices with multiple
 	  GPU memory types. Will be enabled automatically if a device driver
diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index eca36678f967..dbbaf55ca5df 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -37,6 +37,7 @@
 #ifdef CONFIG_X86
 #include <asm/set_memory.h>
 #endif
+#include <drm/page_pool.h>
 #include <drm/ttm/ttm_pool.h>
 #include <drm/ttm/ttm_bo_driver.h>
 #include <drm/ttm/ttm_tt.h>
@@ -63,15 +64,13 @@ module_param(page_pool_size, ulong, 0644);
 
 static atomic_long_t allocated_pages;
 
-static struct ttm_pool_type global_write_combined[MAX_ORDER];
-static struct ttm_pool_type global_uncached[MAX_ORDER];
+static struct drm_page_pool *global_write_combined[MAX_ORDER];
+static struct drm_page_pool *global_uncached[MAX_ORDER];
 
-static struct ttm_pool_type global_dma32_write_combined[MAX_ORDER];
-static struct ttm_pool_type global_dma32_uncached[MAX_ORDER];
+static struct drm_page_pool *global_dma32_write_combined[MAX_ORDER];
+static struct drm_page_pool *global_dma32_uncached[MAX_ORDER];
 
 static struct mutex shrinker_lock;
-static struct list_head shrinker_list;
-static struct shrinker mm_shrinker;
 
 /* Allocate pages of size 1 << order with the given gfp_flags */
 static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
@@ -223,79 +222,26 @@ static void ttm_pool_unmap(struct ttm_pool *pool, dma_addr_t dma_addr,
 		       DMA_BIDIRECTIONAL);
 }
 
-/* Give pages into a specific pool_type */
-static void ttm_pool_type_give(struct ttm_pool_type *pt, struct page *p)
-{
-	spin_lock(&pt->lock);
-	list_add(&p->lru, &pt->pages);
-	spin_unlock(&pt->lock);
-	atomic_long_add(1 << pt->order, &allocated_pages);
-}
-
-/* Take pages from a specific pool_type, return NULL when nothing available */
-static struct page *ttm_pool_type_take(struct ttm_pool_type *pt)
-{
-	struct page *p;
-
-	spin_lock(&pt->lock);
-	p = list_first_entry_or_null(&pt->pages, typeof(*p), lru);
-	if (p) {
-		atomic_long_sub(1 << pt->order, &allocated_pages);
-		list_del(&p->lru);
-	}
-	spin_unlock(&pt->lock);
-
-	return p;
-}
-
-/* Initialize and add a pool type to the global shrinker list */
-static void ttm_pool_type_init(struct ttm_pool_type *pt, struct ttm_pool *pool,
-			       enum ttm_caching caching, unsigned int order)
-{
-	pt->pool = pool;
-	pt->caching = caching;
-	pt->order = order;
-	spin_lock_init(&pt->lock);
-	INIT_LIST_HEAD(&pt->pages);
-
-	mutex_lock(&shrinker_lock);
-	list_add_tail(&pt->shrinker_list, &shrinker_list);
-	mutex_unlock(&shrinker_lock);
-}
-
-/* Remove a pool_type from the global shrinker list and free all pages */
-static void ttm_pool_type_fini(struct ttm_pool_type *pt)
-{
-	struct page *p, *tmp;
-
-	mutex_lock(&shrinker_lock);
-	list_del(&pt->shrinker_list);
-	mutex_unlock(&shrinker_lock);
-
-	list_for_each_entry_safe(p, tmp, &pt->pages, lru)
-		ttm_pool_free_page(p, pt->order);
-}
-
 /* Return the pool_type to use for the given caching and order */
-static struct ttm_pool_type *ttm_pool_select_type(struct ttm_pool *pool,
+static struct drm_page_pool *ttm_pool_select_type(struct ttm_pool *pool,
 						  enum ttm_caching caching,
 						  unsigned int order)
 {
 	if (pool->use_dma_alloc)
-		return &pool->caching[caching].orders[order];
+		return pool->caching[caching].orders[order];
 
 #ifdef CONFIG_X86
 	switch (caching) {
 	case ttm_write_combined:
 		if (pool->use_dma32)
-			return &global_dma32_write_combined[order];
+			return global_dma32_write_combined[order];
 
-		return &global_write_combined[order];
+		return global_write_combined[order];
 	case ttm_uncached:
 		if (pool->use_dma32)
-			return &global_dma32_uncached[order];
+			return global_dma32_uncached[order];
 
-		return &global_uncached[order];
+		return global_uncached[order];
 	default:
 		break;
 	}
@@ -304,30 +250,6 @@ static struct ttm_pool_type *ttm_pool_select_type(struct ttm_pool *pool,
 	return NULL;
 }
 
-/* Free pages using the global shrinker list */
-static unsigned int ttm_pool_shrink(void)
-{
-	struct ttm_pool_type *pt;
-	unsigned int num_freed;
-	struct page *p;
-
-	mutex_lock(&shrinker_lock);
-	pt = list_first_entry(&shrinker_list, typeof(*pt), shrinker_list);
-
-	p = ttm_pool_type_take(pt);
-	if (p) {
-		ttm_pool_free_page(p, pt->order);
-		num_freed = 1 << pt->order;
-	} else {
-		num_freed = 0;
-	}
-
-	list_move_tail(&pt->shrinker_list, &shrinker_list);
-	mutex_unlock(&shrinker_lock);
-
-	return num_freed;
-}
-
 /* Return the allocation order based for a page */
 static unsigned int ttm_pool_page_order(struct ttm_pool *pool, struct page *p)
 {
@@ -377,10 +299,10 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
 	for (order = min(MAX_ORDER - 1UL, __fls(num_pages)); num_pages;
 	     order = min_t(unsigned int, order, __fls(num_pages))) {
 		bool apply_caching = false;
-		struct ttm_pool_type *pt;
+		struct drm_page_pool *pt;
 
 		pt = ttm_pool_select_type(pool, tt->caching, order);
-		p = pt ? ttm_pool_type_take(pt) : NULL;
+		p = pt ? drm_page_pool_fetch(pt) : NULL;
 		if (p) {
 			apply_caching = true;
 		} else {
@@ -462,7 +384,7 @@ void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt)
 	for (i = 0; i < tt->num_pages; ) {
 		struct page *p = tt->pages[i];
 		unsigned int order, num_pages;
-		struct ttm_pool_type *pt;
+		struct drm_page_pool *pt;
 
 		order = ttm_pool_page_order(pool, p);
 		num_pages = 1ULL << order;
@@ -473,15 +395,12 @@ void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt)
 
 		pt = ttm_pool_select_type(pool, tt->caching, order);
 		if (pt)
-			ttm_pool_type_give(pt, tt->pages[i]);
+			drm_page_pool_add(pt, tt->pages[i]);
 		else
 			ttm_pool_free_page(tt->pages[i], order);
 
 		i += num_pages;
 	}
-
-	while (atomic_long_read(&allocated_pages) > page_pool_size)
-		ttm_pool_shrink();
 }
 EXPORT_SYMBOL(ttm_pool_free);
 
@@ -508,8 +427,8 @@ void ttm_pool_init(struct ttm_pool *pool, struct device *dev,
 
 	for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i)
 		for (j = 0; j < MAX_ORDER; ++j)
-			ttm_pool_type_init(&pool->caching[i].orders[j],
-					   pool, i, j);
+			pool->caching[i].orders[j] = drm_page_pool_create(j,
+							ttm_pool_free_page);
 }
 
 /**
@@ -526,33 +445,18 @@ void ttm_pool_fini(struct ttm_pool *pool)
 
 	for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i)
 		for (j = 0; j < MAX_ORDER; ++j)
-			ttm_pool_type_fini(&pool->caching[i].orders[j]);
+			drm_page_pool_destroy(pool->caching[i].orders[j]);
 }
 
 #ifdef CONFIG_DEBUG_FS
-/* Count the number of pages available in a pool_type */
-static unsigned int ttm_pool_type_count(struct ttm_pool_type *pt)
-{
-	unsigned int count = 0;
-	struct page *p;
-
-	spin_lock(&pt->lock);
-	/* Only used for debugfs, the overhead doesn't matter */
-	list_for_each_entry(p, &pt->pages, lru)
-		++count;
-	spin_unlock(&pt->lock);
-
-	return count;
-}
-
 /* Dump information about the different pool types */
-static void ttm_pool_debugfs_orders(struct ttm_pool_type *pt,
+static void ttm_pool_debugfs_orders(struct drm_page_pool **pt,
 				    struct seq_file *m)
 {
 	unsigned int i;
 
 	for (i = 0; i < MAX_ORDER; ++i)
-		seq_printf(m, " %8u", ttm_pool_type_count(&pt[i]));
+		seq_printf(m, " %8u", drm_page_pool_get_size(pt[i]));
 	seq_puts(m, "\n");
 }
 
@@ -602,7 +506,10 @@ int ttm_pool_debugfs(struct ttm_pool *pool, struct seq_file *m)
 	}
 
 	seq_printf(m, "\ntotal\t: %8lu of %8lu\n",
-		   atomic_long_read(&allocated_pages), page_pool_size);
+		   atomic_long_read(&allocated_pages),
+		   drm_page_pool_get_max());
+	seq_printf(m, "(%8lu in non-ttm pools)\n", drm_page_pool_get_total() -
+					atomic_long_read(&allocated_pages));
 
 	mutex_unlock(&shrinker_lock);
 
@@ -612,28 +519,6 @@ EXPORT_SYMBOL(ttm_pool_debugfs);
 
 #endif
 
-/* As long as pages are available make sure to release at least one */
-static unsigned long ttm_pool_shrinker_scan(struct shrinker *shrink,
-					    struct shrink_control *sc)
-{
-	unsigned long num_freed = 0;
-
-	do
-		num_freed += ttm_pool_shrink();
-	while (!num_freed && atomic_long_read(&allocated_pages));
-
-	return num_freed;
-}
-
-/* Return the number of pages available or SHRINK_EMPTY if we have none */
-static unsigned long ttm_pool_shrinker_count(struct shrinker *shrink,
-					     struct shrink_control *sc)
-{
-	unsigned long num_pages = atomic_long_read(&allocated_pages);
-
-	return num_pages ? num_pages : SHRINK_EMPTY;
-}
-
 /**
  * ttm_pool_mgr_init - Initialize globals
  *
@@ -648,24 +533,21 @@ int ttm_pool_mgr_init(unsigned long num_pages)
 	if (!page_pool_size)
 		page_pool_size = num_pages;
 
+	drm_page_pool_set_max(page_pool_size);
+
 	mutex_init(&shrinker_lock);
-	INIT_LIST_HEAD(&shrinker_list);
 
 	for (i = 0; i < MAX_ORDER; ++i) {
-		ttm_pool_type_init(&global_write_combined[i], NULL,
-				   ttm_write_combined, i);
-		ttm_pool_type_init(&global_uncached[i], NULL, ttm_uncached, i);
-
-		ttm_pool_type_init(&global_dma32_write_combined[i], NULL,
-				   ttm_write_combined, i);
-		ttm_pool_type_init(&global_dma32_uncached[i], NULL,
-				   ttm_uncached, i);
+		global_write_combined[i] = drm_page_pool_create(i,
+							ttm_pool_free_page);
+		global_uncached[i] = drm_page_pool_create(i,
+							ttm_pool_free_page);
+		global_dma32_write_combined[i] = drm_page_pool_create(i,
+							ttm_pool_free_page);
+		global_dma32_uncached[i] = drm_page_pool_create(i,
+							ttm_pool_free_page);
 	}
-
-	mm_shrinker.count_objects = ttm_pool_shrinker_count;
-	mm_shrinker.scan_objects = ttm_pool_shrinker_scan;
-	mm_shrinker.seeks = 1;
-	return register_shrinker(&mm_shrinker);
+	return 0;
 }
 
 /**
@@ -678,13 +560,10 @@ void ttm_pool_mgr_fini(void)
 	unsigned int i;
 
 	for (i = 0; i < MAX_ORDER; ++i) {
-		ttm_pool_type_fini(&global_write_combined[i]);
-		ttm_pool_type_fini(&global_uncached[i]);
+		drm_page_pool_destroy(global_write_combined[i]);
+		drm_page_pool_destroy(global_uncached[i]);
 
-		ttm_pool_type_fini(&global_dma32_write_combined[i]);
-		ttm_pool_type_fini(&global_dma32_uncached[i]);
+		drm_page_pool_destroy(global_dma32_write_combined[i]);
+		drm_page_pool_destroy(global_dma32_uncached[i]);
 	}
-
-	unregister_shrinker(&mm_shrinker);
-	WARN_ON(!list_empty(&shrinker_list));
 }
diff --git a/include/drm/ttm/ttm_pool.h b/include/drm/ttm/ttm_pool.h
index 4321728bdd11..460ab232fd22 100644
--- a/include/drm/ttm/ttm_pool.h
+++ b/include/drm/ttm/ttm_pool.h
@@ -36,27 +36,6 @@ struct ttm_tt;
 struct ttm_pool;
 struct ttm_operation_ctx;
 
-/**
- * ttm_pool_type - Pool for a certain memory type
- *
- * @pool: the pool we belong to, might be NULL for the global ones
- * @order: the allocation order our pages have
- * @caching: the caching type our pages have
- * @shrinker_list: our place on the global shrinker list
- * @lock: protection of the page list
- * @pages: the list of pages in the pool
- */
-struct ttm_pool_type {
-	struct ttm_pool *pool;
-	unsigned int order;
-	enum ttm_caching caching;
-
-	struct list_head shrinker_list;
-
-	spinlock_t lock;
-	struct list_head pages;
-};
-
 /**
  * ttm_pool - Pool for all caching and orders
  *
@@ -71,7 +50,7 @@ struct ttm_pool {
 	bool use_dma32;
 
 	struct {
-		struct ttm_pool_type orders[MAX_ORDER];
+		struct drm_page_pool *orders[MAX_ORDER];
 	} caching[TTM_NUM_CACHING_TYPES];
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC][PATCH v6 5/7] dma-buf: heaps: Add deferred-free-helper library code
  2021-02-05  8:06 [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap John Stultz
                   ` (3 preceding siblings ...)
  2021-02-05  8:06 ` [RFC][PATCH v6 4/7] drm: ttm_pool: Rework ttm_pool to use drm_page_pool John Stultz
@ 2021-02-05  8:06 ` John Stultz
  2021-02-05  8:06 ` [RFC][PATCH v6 6/7] dma-buf: system_heap: Add drm pagepool support to system heap John Stultz
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 33+ messages in thread
From: John Stultz @ 2021-02-05  8:06 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Daniel Vetter, Christian Koenig, Sumit Semwal,
	Liam Mark, Chris Goldsworthy, Laura Abbott, Brian Starkey,
	Hridya Valsaraju, Suren Baghdasaryan, Sandeep Patil,
	Daniel Mentz, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

This patch provides infrastructure for deferring buffer frees.

This is a feature ION provided which when used with some form
of a page pool, provides a nice performance boost in an
allocation microbenchmark. The reason it helps is it allows the
page-zeroing to be done out of the normal allocation/free path,
and pushed off to a kthread.

As not all heaps will find this useful, its implemented as
a optional helper library that heaps can utilize.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
v2:
* Fix sleep in atomic issue from using a mutex, by switching
  to a spinlock as Reported-by: kernel test robot <oliver.sang@intel.com>
* Cleanup API to use a reason enum for clarity and add some documentation
  comments as suggested by Suren Baghdasaryan.
v3:
* Minor tweaks so it can be built as a module
* A few small fixups suggested by Daniel Mentz
v4:
* Tweak from Daniel Mentz to make sure the shrinker
  count/freed values are tracked in pages not bytes
v5:
* Fix up page count tracking as suggested by Suren Baghdasaryan
---
 drivers/dma-buf/heaps/Kconfig                |   3 +
 drivers/dma-buf/heaps/Makefile               |   1 +
 drivers/dma-buf/heaps/deferred-free-helper.c | 145 +++++++++++++++++++
 drivers/dma-buf/heaps/deferred-free-helper.h |  55 +++++++
 4 files changed, 204 insertions(+)
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.c
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.h

diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index a5eef06c4226..f7aef8bc7119 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -1,3 +1,6 @@
+config DMABUF_HEAPS_DEFERRED_FREE
+	tristate
+
 config DMABUF_HEAPS_SYSTEM
 	bool "DMA-BUF System Heap"
 	depends on DMABUF_HEAPS
diff --git a/drivers/dma-buf/heaps/Makefile b/drivers/dma-buf/heaps/Makefile
index 974467791032..4e7839875615 100644
--- a/drivers/dma-buf/heaps/Makefile
+++ b/drivers/dma-buf/heaps/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_DMABUF_HEAPS_DEFERRED_FREE) += deferred-free-helper.o
 obj-$(CONFIG_DMABUF_HEAPS_SYSTEM)	+= system_heap.o
 obj-$(CONFIG_DMABUF_HEAPS_CMA)		+= cma_heap.o
diff --git a/drivers/dma-buf/heaps/deferred-free-helper.c b/drivers/dma-buf/heaps/deferred-free-helper.c
new file mode 100644
index 000000000000..672c3d5872e9
--- /dev/null
+++ b/drivers/dma-buf/heaps/deferred-free-helper.c
@@ -0,0 +1,145 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Deferred dmabuf freeing helper
+ *
+ * Copyright (C) 2020 Linaro, Ltd.
+ *
+ * Based on the ION page pool code
+ * Copyright (C) 2011 Google, Inc.
+ */
+
+#include <linux/freezer.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/swap.h>
+#include <linux/sched/signal.h>
+
+#include "deferred-free-helper.h"
+
+static LIST_HEAD(free_list);
+static size_t list_size_pages;
+wait_queue_head_t freelist_waitqueue;
+struct task_struct *freelist_task;
+static DEFINE_SPINLOCK(free_list_lock);
+
+static inline size_t size_to_pages(size_t size)
+{
+	if (!size)
+		return 0;
+	return ((size - 1) >> PAGE_SHIFT) + 1;
+}
+
+void deferred_free(struct deferred_freelist_item *item,
+		   void (*free)(struct deferred_freelist_item*,
+				enum df_reason),
+		   size_t size)
+{
+	unsigned long flags;
+
+	INIT_LIST_HEAD(&item->list);
+	item->size = size;
+	item->free = free;
+
+	spin_lock_irqsave(&free_list_lock, flags);
+	list_add(&item->list, &free_list);
+	list_size_pages += size_to_pages(size);
+	spin_unlock_irqrestore(&free_list_lock, flags);
+	wake_up(&freelist_waitqueue);
+}
+EXPORT_SYMBOL_GPL(deferred_free);
+
+static size_t free_one_item(enum df_reason reason)
+{
+	unsigned long flags;
+	size_t nr_pages;
+	struct deferred_freelist_item *item;
+
+	spin_lock_irqsave(&free_list_lock, flags);
+	if (list_empty(&free_list)) {
+		spin_unlock_irqrestore(&free_list_lock, flags);
+		return 0;
+	}
+	item = list_first_entry(&free_list, struct deferred_freelist_item, list);
+	list_del(&item->list);
+	nr_pages = size_to_pages(item->size);
+	list_size_pages -= nr_pages;
+	spin_unlock_irqrestore(&free_list_lock, flags);
+
+	item->free(item, reason);
+	return nr_pages;
+}
+
+static unsigned long get_freelist_size_pages(void)
+{
+	unsigned long size;
+	unsigned long flags;
+
+	spin_lock_irqsave(&free_list_lock, flags);
+	size = list_size_pages;
+	spin_unlock_irqrestore(&free_list_lock, flags);
+	return size;
+}
+
+static unsigned long freelist_shrink_count(struct shrinker *shrinker,
+					   struct shrink_control *sc)
+{
+	return get_freelist_size_pages();
+}
+
+static unsigned long freelist_shrink_scan(struct shrinker *shrinker,
+					  struct shrink_control *sc)
+{
+	unsigned long total_freed = 0;
+
+	if (sc->nr_to_scan == 0)
+		return 0;
+
+	while (total_freed < sc->nr_to_scan) {
+		size_t pages_freed = free_one_item(DF_UNDER_PRESSURE);
+
+		if (!pages_freed)
+			break;
+
+		total_freed += pages_freed;
+	}
+
+	return total_freed;
+}
+
+static struct shrinker freelist_shrinker = {
+	.count_objects = freelist_shrink_count,
+	.scan_objects = freelist_shrink_scan,
+	.seeks = DEFAULT_SEEKS,
+	.batch = 0,
+};
+
+static int deferred_free_thread(void *data)
+{
+	while (true) {
+		wait_event_freezable(freelist_waitqueue,
+				     get_freelist_size_pages() > 0);
+
+		free_one_item(DF_NORMAL);
+	}
+
+	return 0;
+}
+
+static int deferred_freelist_init(void)
+{
+	list_size_pages = 0;
+
+	init_waitqueue_head(&freelist_waitqueue);
+	freelist_task = kthread_run(deferred_free_thread, NULL,
+				    "%s", "dmabuf-deferred-free-worker");
+	if (IS_ERR(freelist_task)) {
+		pr_err("Creating thread for deferred free failed\n");
+		return -1;
+	}
+	sched_set_normal(freelist_task, 19);
+
+	return register_shrinker(&freelist_shrinker);
+}
+module_init(deferred_freelist_init);
+MODULE_LICENSE("GPL v2");
+
diff --git a/drivers/dma-buf/heaps/deferred-free-helper.h b/drivers/dma-buf/heaps/deferred-free-helper.h
new file mode 100644
index 000000000000..18b44ac86ef6
--- /dev/null
+++ b/drivers/dma-buf/heaps/deferred-free-helper.h
@@ -0,0 +1,55 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef DEFERRED_FREE_HELPER_H
+#define DEFERRED_FREE_HELPER_H
+
+/**
+ * df_reason - enum for reason why item was freed
+ *
+ * This provides a reason for why the free function was called
+ * on the item. This is useful when deferred_free is used in
+ * combination with a pagepool, so under pressure the page can
+ * be immediately freed.
+ *
+ * DF_NORMAL:         Normal deferred free
+ *
+ * DF_UNDER_PRESSURE: Free was called because the system
+ *                    is under memory pressure. Usually
+ *                    from a shrinker. Avoid allocating
+ *                    memory in the free call, as it may
+ *                    fail.
+ */
+enum df_reason {
+	DF_NORMAL,
+	DF_UNDER_PRESSURE,
+};
+
+/**
+ * deferred_freelist_item - item structure for deferred freelist
+ *
+ * This is to be added to the structure for whatever you want to
+ * defer freeing on.
+ *
+ * @size: size of the item to be freed
+ * @free: function pointer to be called when freeing the item
+ * @list: list entry for the deferred list
+ */
+struct deferred_freelist_item {
+	size_t size;
+	void (*free)(struct deferred_freelist_item *i,
+		     enum df_reason reason);
+	struct list_head list;
+};
+
+/**
+ * deferred_free - call to add item to the deferred free list
+ *
+ * @item: Pointer to deferred_freelist_item field of a structure
+ * @free: Function pointer to the free call
+ * @size: Size of the item to be freed
+ */
+void deferred_free(struct deferred_freelist_item *item,
+		   void (*free)(struct deferred_freelist_item *i,
+				enum df_reason reason),
+		   size_t size);
+#endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC][PATCH v6 6/7] dma-buf: system_heap: Add drm pagepool support to system heap
  2021-02-05  8:06 [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap John Stultz
                   ` (4 preceding siblings ...)
  2021-02-05  8:06 ` [RFC][PATCH v6 5/7] dma-buf: heaps: Add deferred-free-helper library code John Stultz
@ 2021-02-05  8:06 ` John Stultz
  2021-02-05  8:06 ` [RFC][PATCH v6 7/7] dma-buf: system_heap: Add deferred freeing to the " John Stultz
  2021-02-05 10:36 ` [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap Christian König
  7 siblings, 0 replies; 33+ messages in thread
From: John Stultz @ 2021-02-05  8:06 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Daniel Vetter, Christian Koenig, Sumit Semwal,
	Liam Mark, Chris Goldsworthy, Laura Abbott, Brian Starkey,
	Hridya Valsaraju, Suren Baghdasaryan, Sandeep Patil,
	Daniel Mentz, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

Utilize the drm pagepool code to speed up allocation
performance.

This is similar to the ION pagepool usage, but tries to
utilize generic code instead of a custom implementation.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
v2:
* Fix build issue caused by selecting PAGE_POOL w/o NET
  as Reported-by: kernel test robot <lkp@intel.com>
v3:
* Simplify the page zeroing logic a bit by using kmap_atomic
  instead of vmap as suggested by Daniel Mentz
v5:
* Shift away from networking page pool completely to
  dmabuf page pool implementation
v6:
* Switch again to using the drm_page_pool code shared w/
  ttm_pool
---
 drivers/dma-buf/heaps/Kconfig       |  1 +
 drivers/dma-buf/heaps/system_heap.c | 56 +++++++++++++++++++++++++++--
 2 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index f7aef8bc7119..7e28934e0def 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -4,6 +4,7 @@ config DMABUF_HEAPS_DEFERRED_FREE
 config DMABUF_HEAPS_SYSTEM
 	bool "DMA-BUF System Heap"
 	depends on DMABUF_HEAPS
+	select DRM_PAGE_POOL
 	help
 	  Choose this option to enable the system dmabuf heap. The system heap
 	  is backed by pages from the buddy allocator. If in doubt, say Y.
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 17e0e9a68baf..6d39e9f32e36 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -21,6 +21,8 @@
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
 
+#include <drm/page_pool.h>
+
 static struct dma_heap *sys_heap;
 
 struct system_heap_buffer {
@@ -53,6 +55,7 @@ static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP};
  */
 static const unsigned int orders[] = {8, 4, 0};
 #define NUM_ORDERS ARRAY_SIZE(orders)
+struct drm_page_pool *pools[NUM_ORDERS];
 
 static struct sg_table *dup_sg_table(struct sg_table *table)
 {
@@ -281,18 +284,49 @@ static void system_heap_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map)
 	dma_buf_map_clear(map);
 }
 
+static int system_heap_free_pages(struct page *p, unsigned int order)
+{
+	__free_pages(p, order);
+	return 1 << order;
+}
+
+static int system_heap_zero_buffer(struct system_heap_buffer *buffer)
+{
+	struct sg_table *sgt = &buffer->sg_table;
+	struct sg_page_iter piter;
+	struct page *p;
+	void *vaddr;
+	int ret = 0;
+
+	for_each_sgtable_page(sgt, &piter, 0) {
+		p = sg_page_iter_page(&piter);
+		vaddr = kmap_atomic(p);
+		memset(vaddr, 0, PAGE_SIZE);
+		kunmap_atomic(vaddr);
+	}
+
+	return ret;
+}
+
 static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
 {
 	struct system_heap_buffer *buffer = dmabuf->priv;
 	struct sg_table *table;
 	struct scatterlist *sg;
-	int i;
+	int i, j;
+
+	/* Zero the buffer pages before adding back to the pool */
+	system_heap_zero_buffer(buffer);
 
 	table = &buffer->sg_table;
 	for_each_sg(table->sgl, sg, table->nents, i) {
 		struct page *page = sg_page(sg);
 
-		__free_pages(page, compound_order(page));
+		for (j = 0; j < NUM_ORDERS; j++) {
+			if (compound_order(page) == orders[j])
+				break;
+		}
+		drm_page_pool_add(pools[j], page);
 	}
 	sg_free_table(table);
 	kfree(buffer);
@@ -323,7 +357,9 @@ static struct page *alloc_largest_available(unsigned long size,
 		if (max_order < orders[i])
 			continue;
 
-		page = alloc_pages(order_flags[i], orders[i]);
+		page = drm_page_pool_fetch(pools[i]);
+		if (!page)
+			page = alloc_pages(order_flags[i], orders[i]);
 		if (!page)
 			continue;
 		return page;
@@ -428,6 +464,20 @@ static const struct dma_heap_ops system_heap_ops = {
 static int system_heap_create(void)
 {
 	struct dma_heap_export_info exp_info;
+	int i;
+
+	for (i = 0; i < NUM_ORDERS; i++) {
+		pools[i] = drm_page_pool_create(orders[i],
+						system_heap_free_pages);
+		if (IS_ERR(pools[i])) {
+			int j;
+
+			pr_err("%s: page pool creation failed!\n", __func__);
+			for (j = 0; j < i; j++)
+				drm_page_pool_destroy(pools[j]);
+			return PTR_ERR(pools[i]);
+		}
+	}
 
 	exp_info.name = "system";
 	exp_info.ops = &system_heap_ops;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC][PATCH v6 7/7] dma-buf: system_heap: Add deferred freeing to the system heap
  2021-02-05  8:06 [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap John Stultz
                   ` (5 preceding siblings ...)
  2021-02-05  8:06 ` [RFC][PATCH v6 6/7] dma-buf: system_heap: Add drm pagepool support to system heap John Stultz
@ 2021-02-05  8:06 ` John Stultz
  2021-02-05 10:36 ` [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap Christian König
  7 siblings, 0 replies; 33+ messages in thread
From: John Stultz @ 2021-02-05  8:06 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Daniel Vetter, Christian Koenig, Sumit Semwal,
	Liam Mark, Chris Goldsworthy, Laura Abbott, Brian Starkey,
	Hridya Valsaraju, Suren Baghdasaryan, Sandeep Patil,
	Daniel Mentz, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

Utilize the deferred free helper library in the system heap.

This provides a nice performance bump and puts the
system heap performance on par with ION.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
v2:
* Rework deferred-free api to use reason enum as suggested by
  Suren Baghdasaryan
---
 drivers/dma-buf/heaps/Kconfig       |  1 +
 drivers/dma-buf/heaps/system_heap.c | 31 ++++++++++++++++++++++-------
 2 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index 7e28934e0def..10632ccfb4a5 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -5,6 +5,7 @@ config DMABUF_HEAPS_SYSTEM
 	bool "DMA-BUF System Heap"
 	depends on DMABUF_HEAPS
 	select DRM_PAGE_POOL
+	select DMABUF_HEAPS_DEFERRED_FREE
 	help
 	  Choose this option to enable the system dmabuf heap. The system heap
 	  is backed by pages from the buddy allocator. If in doubt, say Y.
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 6d39e9f32e36..042244407db5 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -22,6 +22,7 @@
 #include <linux/vmalloc.h>
 
 #include <drm/page_pool.h>
+#include "deferred-free-helper.h"
 
 static struct dma_heap *sys_heap;
 
@@ -33,6 +34,7 @@ struct system_heap_buffer {
 	struct sg_table sg_table;
 	int vmap_cnt;
 	void *vaddr;
+	struct deferred_freelist_item deferred_free;
 };
 
 struct dma_heap_attachment {
@@ -308,30 +310,45 @@ static int system_heap_zero_buffer(struct system_heap_buffer *buffer)
 	return ret;
 }
 
-static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
+static void system_heap_buf_free(struct deferred_freelist_item *item,
+				 enum df_reason reason)
 {
-	struct system_heap_buffer *buffer = dmabuf->priv;
+	struct system_heap_buffer *buffer;
 	struct sg_table *table;
 	struct scatterlist *sg;
 	int i, j;
 
+	buffer = container_of(item, struct system_heap_buffer, deferred_free);
 	/* Zero the buffer pages before adding back to the pool */
-	system_heap_zero_buffer(buffer);
+	if (reason == DF_NORMAL)
+		if (system_heap_zero_buffer(buffer))
+			reason = DF_UNDER_PRESSURE; // On failure, just free
 
 	table = &buffer->sg_table;
 	for_each_sg(table->sgl, sg, table->nents, i) {
 		struct page *page = sg_page(sg);
 
-		for (j = 0; j < NUM_ORDERS; j++) {
-			if (compound_order(page) == orders[j])
-				break;
+		if (reason == DF_UNDER_PRESSURE) {
+			__free_pages(page, compound_order(page));
+		} else {
+			for (j = 0; j < NUM_ORDERS; j++) {
+				if (compound_order(page) == orders[j])
+					break;
+			}
+			drm_page_pool_add(pools[j], page);
 		}
-		drm_page_pool_add(pools[j], page);
 	}
 	sg_free_table(table);
 	kfree(buffer);
 }
 
+static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
+{
+	struct system_heap_buffer *buffer = dmabuf->priv;
+
+	deferred_free(&buffer->deferred_free, system_heap_buf_free, buffer->len);
+}
+
 static const struct dma_buf_ops system_heap_buf_ops = {
 	.attach = system_heap_attach,
 	.detach = system_heap_detach,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer
  2021-02-05  8:06 ` [RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer John Stultz
@ 2021-02-05  8:28   ` Christian König
  2021-02-05 19:47     ` John Stultz
  0 siblings, 1 reply; 33+ messages in thread
From: Christian König @ 2021-02-05  8:28 UTC (permalink / raw)
  To: John Stultz, lkml
  Cc: Daniel Vetter, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Suren Baghdasaryan, Sandeep Patil, Daniel Mentz, Ørjan Eide,
	Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones,
	linux-media, dri-devel

Am 05.02.21 um 09:06 schrieb John Stultz:
> This refactors ttm_pool_free_page(), and by adding extra entries
> to ttm_pool_page_dat, we then use it for all allocations, which
> allows us to simplify the arguments needed to be passed to
> ttm_pool_free_page().

This is a clear NAK since the peer page data is just a workaround for 
the DMA-API hack to grab pages from there.

Adding this to all pages would increase the memory footprint drastically.

christian.

>
> This is critical for allowing the free function to be called
> by the sharable drm_page_pool logic.
>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: Liam Mark <lmark@codeaurora.org>
> Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
> Cc: Laura Abbott <labbott@kernel.org>
> Cc: Brian Starkey <Brian.Starkey@arm.com>
> Cc: Hridya Valsaraju <hridya@google.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Sandeep Patil <sspatil@google.com>
> Cc: Daniel Mentz <danielmentz@google.com>
> Cc: Ørjan Eide <orjan.eide@arm.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Ezequiel Garcia <ezequiel@collabora.com>
> Cc: Simon Ser <contact@emersion.fr>
> Cc: James Jones <jajones@nvidia.com>
> Cc: linux-media@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
>   drivers/gpu/drm/ttm/ttm_pool.c | 60 ++++++++++++++++++----------------
>   1 file changed, 32 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> index c0274e256be3..eca36678f967 100644
> --- a/drivers/gpu/drm/ttm/ttm_pool.c
> +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> @@ -44,10 +44,14 @@
>   /**
>    * struct ttm_pool_page_dat - Helper object for coherent DMA mappings
>    *
> + * @pool: ttm_pool pointer the page was allocated by
> + * @caching: the caching value the allocated page was configured for
>    * @addr: original DMA address returned for the mapping
>    * @vaddr: original vaddr return for the mapping and order in the lower bits
>    */
>   struct ttm_pool_page_dat {
> +	struct ttm_pool *pool;
> +	enum ttm_caching caching;
>   	dma_addr_t addr;
>   	unsigned long vaddr;
>   };
> @@ -71,13 +75,20 @@ static struct shrinker mm_shrinker;
>   
>   /* Allocate pages of size 1 << order with the given gfp_flags */
>   static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
> -					unsigned int order)
> +					unsigned int order, enum ttm_caching caching)
>   {
>   	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
>   	struct ttm_pool_page_dat *dat;
>   	struct page *p;
>   	void *vaddr;
>   
> +	dat = kmalloc(sizeof(*dat), GFP_KERNEL);
> +	if (!dat)
> +		return NULL;
> +
> +	dat->pool = pool;
> +	dat->caching = caching;
> +
>   	/* Don't set the __GFP_COMP flag for higher order allocations.
>   	 * Mapping pages directly into an userspace process and calling
>   	 * put_page() on a TTM allocated page is illegal.
> @@ -88,15 +99,13 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
>   
>   	if (!pool->use_dma_alloc) {
>   		p = alloc_pages(gfp_flags, order);
> -		if (p)
> -			p->private = order;
> +		if (!p)
> +			goto error_free;
> +		dat->vaddr = order;
> +		p->private = (unsigned long)dat;
>   		return p;
>   	}
>   
> -	dat = kmalloc(sizeof(*dat), GFP_KERNEL);
> -	if (!dat)
> -		return NULL;
> -
>   	if (order)
>   		attr |= DMA_ATTR_NO_WARN;
>   
> @@ -123,34 +132,34 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
>   }
>   
>   /* Reset the caching and pages of size 1 << order */
> -static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
> -			       unsigned int order, struct page *p)
> +static int ttm_pool_free_page(struct page *p, unsigned int order)
>   {
>   	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
> -	struct ttm_pool_page_dat *dat;
> +	struct ttm_pool_page_dat *dat = (void *)p->private;
>   	void *vaddr;
>   
>   #ifdef CONFIG_X86
>   	/* We don't care that set_pages_wb is inefficient here. This is only
>   	 * used when we have to shrink and CPU overhead is irrelevant then.
>   	 */
> -	if (caching != ttm_cached && !PageHighMem(p))
> +	if (dat->caching != ttm_cached && !PageHighMem(p))
>   		set_pages_wb(p, 1 << order);
>   #endif
>   
> -	if (!pool || !pool->use_dma_alloc) {
> +	if (!dat->pool || !dat->pool->use_dma_alloc) {
>   		__free_pages(p, order);
> -		return;
> +		goto out;
>   	}
>   
>   	if (order)
>   		attr |= DMA_ATTR_NO_WARN;
>   
> -	dat = (void *)p->private;
>   	vaddr = (void *)(dat->vaddr & PAGE_MASK);
> -	dma_free_attrs(pool->dev, (1UL << order) * PAGE_SIZE, vaddr, dat->addr,
> +	dma_free_attrs(dat->pool->dev, (1UL << order) * PAGE_SIZE, vaddr, dat->addr,
>   		       attr);
> +out:
>   	kfree(dat);
> +	return 1 << order;
>   }
>   
>   /* Apply a new caching to an array of pages */
> @@ -264,7 +273,7 @@ static void ttm_pool_type_fini(struct ttm_pool_type *pt)
>   	mutex_unlock(&shrinker_lock);
>   
>   	list_for_each_entry_safe(p, tmp, &pt->pages, lru)
> -		ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
> +		ttm_pool_free_page(p, pt->order);
>   }
>   
>   /* Return the pool_type to use for the given caching and order */
> @@ -307,7 +316,7 @@ static unsigned int ttm_pool_shrink(void)
>   
>   	p = ttm_pool_type_take(pt);
>   	if (p) {
> -		ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
> +		ttm_pool_free_page(p, pt->order);
>   		num_freed = 1 << pt->order;
>   	} else {
>   		num_freed = 0;
> @@ -322,13 +331,9 @@ static unsigned int ttm_pool_shrink(void)
>   /* Return the allocation order based for a page */
>   static unsigned int ttm_pool_page_order(struct ttm_pool *pool, struct page *p)
>   {
> -	if (pool->use_dma_alloc) {
> -		struct ttm_pool_page_dat *dat = (void *)p->private;
> -
> -		return dat->vaddr & ~PAGE_MASK;
> -	}
> +	struct ttm_pool_page_dat *dat = (void *)p->private;
>   
> -	return p->private;
> +	return dat->vaddr & ~PAGE_MASK;
>   }
>   
>   /**
> @@ -379,7 +384,7 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
>   		if (p) {
>   			apply_caching = true;
>   		} else {
> -			p = ttm_pool_alloc_page(pool, gfp_flags, order);
> +			p = ttm_pool_alloc_page(pool, gfp_flags, order, tt->caching);
>   			if (p && PageHighMem(p))
>   				apply_caching = true;
>   		}
> @@ -428,13 +433,13 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
>   	ttm_mem_global_free_page(&ttm_mem_glob, p, (1 << order) * PAGE_SIZE);
>   
>   error_free_page:
> -	ttm_pool_free_page(pool, tt->caching, order, p);
> +	ttm_pool_free_page(p, order);
>   
>   error_free_all:
>   	num_pages = tt->num_pages - num_pages;
>   	for (i = 0; i < num_pages; ) {
>   		order = ttm_pool_page_order(pool, tt->pages[i]);
> -		ttm_pool_free_page(pool, tt->caching, order, tt->pages[i]);
> +		ttm_pool_free_page(tt->pages[i], order);
>   		i += 1 << order;
>   	}
>   
> @@ -470,8 +475,7 @@ void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt)
>   		if (pt)
>   			ttm_pool_type_give(pt, tt->pages[i]);
>   		else
> -			ttm_pool_free_page(pool, tt->caching, order,
> -					   tt->pages[i]);
> +			ttm_pool_free_page(tt->pages[i], order);
>   
>   		i += num_pages;
>   	}


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-05  8:06 ` [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation John Stultz
@ 2021-02-05  8:46   ` Christian König
  2021-02-05 20:46     ` John Stultz
  0 siblings, 1 reply; 33+ messages in thread
From: Christian König @ 2021-02-05  8:46 UTC (permalink / raw)
  To: John Stultz, lkml
  Cc: Daniel Vetter, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Suren Baghdasaryan, Sandeep Patil, Daniel Mentz, Ørjan Eide,
	Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones,
	linux-media, dri-devel

Am 05.02.21 um 09:06 schrieb John Stultz:
> This adds a shrinker controlled page pool, closely
> following the ttm_pool logic, which is abstracted out
> a bit so it can be used by other non-ttm drivers.
>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: Liam Mark <lmark@codeaurora.org>
> Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
> Cc: Laura Abbott <labbott@kernel.org>
> Cc: Brian Starkey <Brian.Starkey@arm.com>
> Cc: Hridya Valsaraju <hridya@google.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Sandeep Patil <sspatil@google.com>
> Cc: Daniel Mentz <danielmentz@google.com>
> Cc: Ørjan Eide <orjan.eide@arm.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Ezequiel Garcia <ezequiel@collabora.com>
> Cc: Simon Ser <contact@emersion.fr>
> Cc: James Jones <jajones@nvidia.com>
> Cc: linux-media@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
>   drivers/gpu/drm/Kconfig     |   4 +
>   drivers/gpu/drm/Makefile    |   1 +
>   drivers/gpu/drm/page_pool.c | 220 ++++++++++++++++++++++++++++++++++++
>   include/drm/page_pool.h     |  54 +++++++++
>   4 files changed, 279 insertions(+)
>   create mode 100644 drivers/gpu/drm/page_pool.c
>   create mode 100644 include/drm/page_pool.h
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 0973f408d75f..d16bf340ed2e 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -174,6 +174,10 @@ config DRM_DP_CEC
>   	  Note: not all adapters support this feature, and even for those
>   	  that do support this they often do not hook up the CEC pin.
>   
> +config DRM_PAGE_POOL
> +	bool
> +	depends on DRM
> +
>   config DRM_TTM
>   	tristate
>   	depends on DRM && MMU
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index fefaff4c832d..877e0111ed34 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -32,6 +32,7 @@ drm-$(CONFIG_AGP) += drm_agpsupport.o
>   drm-$(CONFIG_PCI) += drm_pci.o
>   drm-$(CONFIG_DEBUG_FS) += drm_debugfs.o drm_debugfs_crc.o
>   drm-$(CONFIG_DRM_LOAD_EDID_FIRMWARE) += drm_edid_load.o
> +drm-$(CONFIG_DRM_PAGE_POOL) += page_pool.o
>   
>   drm_vram_helper-y := drm_gem_vram_helper.o
>   obj-$(CONFIG_DRM_VRAM_HELPER) += drm_vram_helper.o
> diff --git a/drivers/gpu/drm/page_pool.c b/drivers/gpu/drm/page_pool.c
> new file mode 100644
> index 000000000000..2139f86e6ca7
> --- /dev/null
> +++ b/drivers/gpu/drm/page_pool.c
> @@ -0,0 +1,220 @@
> +// SPDX-License-Identifier: GPL-2.0

Please use a BSD/MIT compatible license if you want to copy this from 
the TTM code.

> +/*
> + * DRM page pool system
> + *
> + * Copyright (C) 2020 Linaro Ltd.
> + *
> + * Based on the ION page pool code
> + * Copyright (C) 2011 Google, Inc.
> + * As well as the ttm_pool code
> + * Copyright (C) 2020 Advanced Micro Devices, Inc.
> + */
> +
> +#include <linux/freezer.h>
> +#include <linux/list.h>
> +#include <linux/slab.h>
> +#include <linux/swap.h>
> +#include <linux/sched/signal.h>
> +#include <drm/page_pool.h>
> +
> +static LIST_HEAD(pool_list);
> +static DEFINE_MUTEX(pool_list_lock);
> +static atomic_long_t total_pages;
> +static unsigned long page_pool_max;
> +MODULE_PARM_DESC(page_pool_max, "Number of pages in the WC/UC/DMA pool");
> +module_param(page_pool_max, ulong, 0644);
> +
> +void drm_page_pool_set_max(unsigned long max)
> +{
> +	/* only write once */
> +	if (!page_pool_max)
> +		page_pool_max = max;
> +}
> +
> +unsigned long drm_page_pool_get_max(void)
> +{
> +	return page_pool_max;
> +}
> +
> +unsigned long drm_page_pool_get_total(void)
> +{
> +	return atomic_long_read(&total_pages);
> +}
> +
> +int drm_page_pool_get_size(struct drm_page_pool *pool)
> +{
> +	int ret;
> +
> +	spin_lock(&pool->lock);
> +	ret = pool->count;
> +	spin_unlock(&pool->lock);

Maybe use an atomic for the count instead?

> +	return ret;
> +}
> +
> +static inline unsigned int drm_page_pool_free_pages(struct drm_page_pool *pool,
> +						    struct page *page)
> +{
> +	return pool->free(page, pool->order);
> +}
> +
> +static int drm_page_pool_shrink_one(void);
> +
> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
> +{
> +	spin_lock(&pool->lock);
> +	list_add_tail(&page->lru, &pool->items);
> +	pool->count++;
> +	atomic_long_add(1 << pool->order, &total_pages);
> +	spin_unlock(&pool->lock);
> +
> +	mod_node_page_state(page_pgdat(page), NR_KERNEL_MISC_RECLAIMABLE,
> +			    1 << pool->order);

Hui what? What should that be good for?

> +
> +	/* make sure we don't grow too large */
> +	while (page_pool_max && atomic_long_read(&total_pages) > page_pool_max)
> +		drm_page_pool_shrink_one();
> +}
> +EXPORT_SYMBOL_GPL(drm_page_pool_add);
> +
> +static struct page *drm_page_pool_remove(struct drm_page_pool *pool)
> +{
> +	struct page *page;
> +
> +	if (!pool->count)
> +		return NULL;

Better use list_first_entry_or_null instead of checking the count.

This way you can also pull the lock into the function.

> +
> +	page = list_first_entry(&pool->items, struct page, lru);
> +	pool->count--;
> +	atomic_long_sub(1 << pool->order, &total_pages);
> +
> +	list_del(&page->lru);
> +	mod_node_page_state(page_pgdat(page), NR_KERNEL_MISC_RECLAIMABLE,
> +			    -(1 << pool->order));
> +	return page;
> +}
> +
> +struct page *drm_page_pool_fetch(struct drm_page_pool *pool)
> +{
> +	struct page *page = NULL;
> +
> +	if (!pool) {
> +		WARN_ON(!pool);
> +		return NULL;
> +	}
> +
> +	spin_lock(&pool->lock);
> +	page = drm_page_pool_remove(pool);
> +	spin_unlock(&pool->lock);
> +
> +	return page;
> +}
> +EXPORT_SYMBOL_GPL(drm_page_pool_fetch);
> +
> +struct drm_page_pool *drm_page_pool_create(unsigned int order,
> +					   int (*free_page)(struct page *p, unsigned int order))
> +{
> +	struct drm_page_pool *pool = kmalloc(sizeof(*pool), GFP_KERNEL);

Why not making this an embedded object? We should not see much dynamic 
pool creation.

> +
> +	if (!pool)
> +		return NULL;
> +
> +	pool->count = 0;
> +	INIT_LIST_HEAD(&pool->items);
> +	pool->order = order;
> +	pool->free = free_page;
> +	spin_lock_init(&pool->lock);
> +	INIT_LIST_HEAD(&pool->list);
> +
> +	mutex_lock(&pool_list_lock);
> +	list_add(&pool->list, &pool_list);
> +	mutex_unlock(&pool_list_lock);
> +
> +	return pool;
> +}
> +EXPORT_SYMBOL_GPL(drm_page_pool_create);
> +
> +void drm_page_pool_destroy(struct drm_page_pool *pool)
> +{
> +	struct page *page;
> +
> +	/* Remove us from the pool list */
> +	mutex_lock(&pool_list_lock);
> +	list_del(&pool->list);
> +	mutex_unlock(&pool_list_lock);
> +
> +	/* Free any remaining pages in the pool */
> +	spin_lock(&pool->lock);

Locking should be unnecessary when the pool is destroyed anyway.

> +	while (pool->count) {
> +		page = drm_page_pool_remove(pool);
> +		spin_unlock(&pool->lock);
> +		drm_page_pool_free_pages(pool, page);
> +		spin_lock(&pool->lock);
> +	}
> +	spin_unlock(&pool->lock);
> +
> +	kfree(pool);
> +}
> +EXPORT_SYMBOL_GPL(drm_page_pool_destroy);
> +
> +static int drm_page_pool_shrink_one(void)
> +{
> +	struct drm_page_pool *pool;
> +	struct page *page;
> +	int nr_freed = 0;
> +
> +	mutex_lock(&pool_list_lock);
> +	pool = list_first_entry(&pool_list, typeof(*pool), list);
> +
> +	spin_lock(&pool->lock);
> +	page = drm_page_pool_remove(pool);
> +	spin_unlock(&pool->lock);
> +
> +	if (page)
> +		nr_freed = drm_page_pool_free_pages(pool, page);
> +
> +	list_move_tail(&pool->list, &pool_list);

Better to move this up, directly after the list_first_entry().

> +	mutex_unlock(&pool_list_lock);
> +
> +	return nr_freed;
> +}
> +
> +static unsigned long drm_page_pool_shrink_count(struct shrinker *shrinker,
> +						struct shrink_control *sc)
> +{
> +	unsigned long count =  atomic_long_read(&total_pages);
> +
> +	return count ? count : SHRINK_EMPTY;
> +}
> +
> +static unsigned long drm_page_pool_shrink_scan(struct shrinker *shrinker,
> +					       struct shrink_control *sc)
> +{
> +	int to_scan = sc->nr_to_scan;
> +	int nr_total = 0;
> +
> +	if (to_scan == 0)
> +		return 0;
> +
> +	do {
> +		int nr_freed = drm_page_pool_shrink_one();
> +
> +		to_scan -= nr_freed;
> +		nr_total += nr_freed;
> +	} while (to_scan >= 0 && atomic_long_read(&total_pages));
> +
> +	return nr_total;
> +}
> +
> +static struct shrinker pool_shrinker = {
> +	.count_objects = drm_page_pool_shrink_count,
> +	.scan_objects = drm_page_pool_shrink_scan,
> +	.seeks = 1,
> +	.batch = 0,
> +};
> +
> +int drm_page_pool_init_shrinker(void)
> +{
> +	return register_shrinker(&pool_shrinker);
> +}
> +module_init(drm_page_pool_init_shrinker);
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/drm/page_pool.h b/include/drm/page_pool.h
> new file mode 100644
> index 000000000000..47e240b2bc69
> --- /dev/null
> +++ b/include/drm/page_pool.h
> @@ -0,0 +1,54 @@
> +/* SPDX-License-Identifier: GPL-2.0 OR MIT */
> +/*
> + * Copyright 2020 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + *
> + * Authors: Christian König
> + */
> +
> +#ifndef _DRM_PAGE_POOL_H_
> +#define _DRM_PAGE_POOL_H_
> +
> +#include <linux/mmzone.h>
> +#include <linux/llist.h>
> +#include <linux/spinlock.h>
> +
> +struct drm_page_pool {
> +	int count;
> +	struct list_head items;
> +
> +	int order;
> +	int (*free)(struct page *p, unsigned int order);
> +
> +	spinlock_t lock;
> +	struct list_head list;
> +};
> +
> +void drm_page_pool_set_max(unsigned long max);
> +unsigned long drm_page_pool_get_max(void);
> +unsigned long drm_page_pool_get_total(void);
> +int drm_page_pool_get_size(struct drm_page_pool *pool);
> +struct page *drm_page_pool_fetch(struct drm_page_pool *pool);
> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page);
> +struct drm_page_pool *drm_page_pool_create(unsigned int order,
> +					   int (*free_page)(struct page *p, unsigned int order));
> +void drm_page_pool_destroy(struct drm_page_pool *pool);
> +
> +#endif


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 4/7] drm: ttm_pool: Rework ttm_pool to use drm_page_pool
  2021-02-05  8:06 ` [RFC][PATCH v6 4/7] drm: ttm_pool: Rework ttm_pool to use drm_page_pool John Stultz
@ 2021-02-05  8:50   ` Christian König
  0 siblings, 0 replies; 33+ messages in thread
From: Christian König @ 2021-02-05  8:50 UTC (permalink / raw)
  To: John Stultz, lkml
  Cc: Daniel Vetter, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Suren Baghdasaryan, Sandeep Patil, Daniel Mentz, Ørjan Eide,
	Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones,
	linux-media, dri-devel

Am 05.02.21 um 09:06 schrieb John Stultz:
> This patch reworks the ttm_pool logic to utilize the recently
> added drm_page_pool code.
>
> Basically all the ttm_pool_type structures are replaced with
> drm_page_pool pointers, and since the drm_page_pool logic has
> its own shrinker, we can remove all of the ttm_pool shrinker
> logic.
>
> NOTE: There is one mismatch in the interfaces I'm not totally
> happy with. The ttm_pool tracks all of its pooled pages across
> a number of different pools, and tries to keep this size under
> the specified page_pool_size value. With the drm_page_pool,
> there may other users, however there is still one global
> shrinker list of pools. So we can't easily reduce the ttm
> pool under the ttm specified size without potentially doing
> a lot of shrinking to other non-ttm pools. So either we can:
>    1) Try to split it so each user of drm_page_pools manages its
>       own pool shrinking.
>    2) Push the max value into the drm_page_pool, and have it
>       manage shrinking to fit under that global max. Then share
>       those size/max values out so the ttm_pool debug output
>       can have more context.
>
> I've taken the second path in this patch set, but wanted to call
> it out so folks could look closely.

Option 3: Move the debugging code into the drm_page_pool as well.

I strong think that will be a hard requirement from Daniel for 
upstreaming this.

Christian.

>
> Thoughts would be greatly appreciated here!
>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: Liam Mark <lmark@codeaurora.org>
> Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
> Cc: Laura Abbott <labbott@kernel.org>
> Cc: Brian Starkey <Brian.Starkey@arm.com>
> Cc: Hridya Valsaraju <hridya@google.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Sandeep Patil <sspatil@google.com>
> Cc: Daniel Mentz <danielmentz@google.com>
> Cc: Ørjan Eide <orjan.eide@arm.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Ezequiel Garcia <ezequiel@collabora.com>
> Cc: Simon Ser <contact@emersion.fr>
> Cc: James Jones <jajones@nvidia.com>
> Cc: linux-media@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
>   drivers/gpu/drm/Kconfig        |   1 +
>   drivers/gpu/drm/ttm/ttm_pool.c | 199 +++++++--------------------------
>   include/drm/ttm/ttm_pool.h     |  23 +---
>   3 files changed, 41 insertions(+), 182 deletions(-)
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index d16bf340ed2e..d427abefabfb 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -181,6 +181,7 @@ config DRM_PAGE_POOL
>   config DRM_TTM
>   	tristate
>   	depends on DRM && MMU
> +	select DRM_PAGE_POOL
>   	help
>   	  GPU memory management subsystem for devices with multiple
>   	  GPU memory types. Will be enabled automatically if a device driver
> diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> index eca36678f967..dbbaf55ca5df 100644
> --- a/drivers/gpu/drm/ttm/ttm_pool.c
> +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> @@ -37,6 +37,7 @@
>   #ifdef CONFIG_X86
>   #include <asm/set_memory.h>
>   #endif
> +#include <drm/page_pool.h>
>   #include <drm/ttm/ttm_pool.h>
>   #include <drm/ttm/ttm_bo_driver.h>
>   #include <drm/ttm/ttm_tt.h>
> @@ -63,15 +64,13 @@ module_param(page_pool_size, ulong, 0644);
>   
>   static atomic_long_t allocated_pages;
>   
> -static struct ttm_pool_type global_write_combined[MAX_ORDER];
> -static struct ttm_pool_type global_uncached[MAX_ORDER];
> +static struct drm_page_pool *global_write_combined[MAX_ORDER];
> +static struct drm_page_pool *global_uncached[MAX_ORDER];
>   
> -static struct ttm_pool_type global_dma32_write_combined[MAX_ORDER];
> -static struct ttm_pool_type global_dma32_uncached[MAX_ORDER];
> +static struct drm_page_pool *global_dma32_write_combined[MAX_ORDER];
> +static struct drm_page_pool *global_dma32_uncached[MAX_ORDER];
>   
>   static struct mutex shrinker_lock;
> -static struct list_head shrinker_list;
> -static struct shrinker mm_shrinker;
>   
>   /* Allocate pages of size 1 << order with the given gfp_flags */
>   static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
> @@ -223,79 +222,26 @@ static void ttm_pool_unmap(struct ttm_pool *pool, dma_addr_t dma_addr,
>   		       DMA_BIDIRECTIONAL);
>   }
>   
> -/* Give pages into a specific pool_type */
> -static void ttm_pool_type_give(struct ttm_pool_type *pt, struct page *p)
> -{
> -	spin_lock(&pt->lock);
> -	list_add(&p->lru, &pt->pages);
> -	spin_unlock(&pt->lock);
> -	atomic_long_add(1 << pt->order, &allocated_pages);
> -}
> -
> -/* Take pages from a specific pool_type, return NULL when nothing available */
> -static struct page *ttm_pool_type_take(struct ttm_pool_type *pt)
> -{
> -	struct page *p;
> -
> -	spin_lock(&pt->lock);
> -	p = list_first_entry_or_null(&pt->pages, typeof(*p), lru);
> -	if (p) {
> -		atomic_long_sub(1 << pt->order, &allocated_pages);
> -		list_del(&p->lru);
> -	}
> -	spin_unlock(&pt->lock);
> -
> -	return p;
> -}
> -
> -/* Initialize and add a pool type to the global shrinker list */
> -static void ttm_pool_type_init(struct ttm_pool_type *pt, struct ttm_pool *pool,
> -			       enum ttm_caching caching, unsigned int order)
> -{
> -	pt->pool = pool;
> -	pt->caching = caching;
> -	pt->order = order;
> -	spin_lock_init(&pt->lock);
> -	INIT_LIST_HEAD(&pt->pages);
> -
> -	mutex_lock(&shrinker_lock);
> -	list_add_tail(&pt->shrinker_list, &shrinker_list);
> -	mutex_unlock(&shrinker_lock);
> -}
> -
> -/* Remove a pool_type from the global shrinker list and free all pages */
> -static void ttm_pool_type_fini(struct ttm_pool_type *pt)
> -{
> -	struct page *p, *tmp;
> -
> -	mutex_lock(&shrinker_lock);
> -	list_del(&pt->shrinker_list);
> -	mutex_unlock(&shrinker_lock);
> -
> -	list_for_each_entry_safe(p, tmp, &pt->pages, lru)
> -		ttm_pool_free_page(p, pt->order);
> -}
> -
>   /* Return the pool_type to use for the given caching and order */
> -static struct ttm_pool_type *ttm_pool_select_type(struct ttm_pool *pool,
> +static struct drm_page_pool *ttm_pool_select_type(struct ttm_pool *pool,
>   						  enum ttm_caching caching,
>   						  unsigned int order)
>   {
>   	if (pool->use_dma_alloc)
> -		return &pool->caching[caching].orders[order];
> +		return pool->caching[caching].orders[order];
>   
>   #ifdef CONFIG_X86
>   	switch (caching) {
>   	case ttm_write_combined:
>   		if (pool->use_dma32)
> -			return &global_dma32_write_combined[order];
> +			return global_dma32_write_combined[order];
>   
> -		return &global_write_combined[order];
> +		return global_write_combined[order];
>   	case ttm_uncached:
>   		if (pool->use_dma32)
> -			return &global_dma32_uncached[order];
> +			return global_dma32_uncached[order];
>   
> -		return &global_uncached[order];
> +		return global_uncached[order];
>   	default:
>   		break;
>   	}
> @@ -304,30 +250,6 @@ static struct ttm_pool_type *ttm_pool_select_type(struct ttm_pool *pool,
>   	return NULL;
>   }
>   
> -/* Free pages using the global shrinker list */
> -static unsigned int ttm_pool_shrink(void)
> -{
> -	struct ttm_pool_type *pt;
> -	unsigned int num_freed;
> -	struct page *p;
> -
> -	mutex_lock(&shrinker_lock);
> -	pt = list_first_entry(&shrinker_list, typeof(*pt), shrinker_list);
> -
> -	p = ttm_pool_type_take(pt);
> -	if (p) {
> -		ttm_pool_free_page(p, pt->order);
> -		num_freed = 1 << pt->order;
> -	} else {
> -		num_freed = 0;
> -	}
> -
> -	list_move_tail(&pt->shrinker_list, &shrinker_list);
> -	mutex_unlock(&shrinker_lock);
> -
> -	return num_freed;
> -}
> -
>   /* Return the allocation order based for a page */
>   static unsigned int ttm_pool_page_order(struct ttm_pool *pool, struct page *p)
>   {
> @@ -377,10 +299,10 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
>   	for (order = min(MAX_ORDER - 1UL, __fls(num_pages)); num_pages;
>   	     order = min_t(unsigned int, order, __fls(num_pages))) {
>   		bool apply_caching = false;
> -		struct ttm_pool_type *pt;
> +		struct drm_page_pool *pt;
>   
>   		pt = ttm_pool_select_type(pool, tt->caching, order);
> -		p = pt ? ttm_pool_type_take(pt) : NULL;
> +		p = pt ? drm_page_pool_fetch(pt) : NULL;
>   		if (p) {
>   			apply_caching = true;
>   		} else {
> @@ -462,7 +384,7 @@ void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt)
>   	for (i = 0; i < tt->num_pages; ) {
>   		struct page *p = tt->pages[i];
>   		unsigned int order, num_pages;
> -		struct ttm_pool_type *pt;
> +		struct drm_page_pool *pt;
>   
>   		order = ttm_pool_page_order(pool, p);
>   		num_pages = 1ULL << order;
> @@ -473,15 +395,12 @@ void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt)
>   
>   		pt = ttm_pool_select_type(pool, tt->caching, order);
>   		if (pt)
> -			ttm_pool_type_give(pt, tt->pages[i]);
> +			drm_page_pool_add(pt, tt->pages[i]);
>   		else
>   			ttm_pool_free_page(tt->pages[i], order);
>   
>   		i += num_pages;
>   	}
> -
> -	while (atomic_long_read(&allocated_pages) > page_pool_size)
> -		ttm_pool_shrink();
>   }
>   EXPORT_SYMBOL(ttm_pool_free);
>   
> @@ -508,8 +427,8 @@ void ttm_pool_init(struct ttm_pool *pool, struct device *dev,
>   
>   	for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i)
>   		for (j = 0; j < MAX_ORDER; ++j)
> -			ttm_pool_type_init(&pool->caching[i].orders[j],
> -					   pool, i, j);
> +			pool->caching[i].orders[j] = drm_page_pool_create(j,
> +							ttm_pool_free_page);
>   }
>   
>   /**
> @@ -526,33 +445,18 @@ void ttm_pool_fini(struct ttm_pool *pool)
>   
>   	for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i)
>   		for (j = 0; j < MAX_ORDER; ++j)
> -			ttm_pool_type_fini(&pool->caching[i].orders[j]);
> +			drm_page_pool_destroy(pool->caching[i].orders[j]);
>   }
>   
>   #ifdef CONFIG_DEBUG_FS
> -/* Count the number of pages available in a pool_type */
> -static unsigned int ttm_pool_type_count(struct ttm_pool_type *pt)
> -{
> -	unsigned int count = 0;
> -	struct page *p;
> -
> -	spin_lock(&pt->lock);
> -	/* Only used for debugfs, the overhead doesn't matter */
> -	list_for_each_entry(p, &pt->pages, lru)
> -		++count;
> -	spin_unlock(&pt->lock);
> -
> -	return count;
> -}
> -
>   /* Dump information about the different pool types */
> -static void ttm_pool_debugfs_orders(struct ttm_pool_type *pt,
> +static void ttm_pool_debugfs_orders(struct drm_page_pool **pt,
>   				    struct seq_file *m)
>   {
>   	unsigned int i;
>   
>   	for (i = 0; i < MAX_ORDER; ++i)
> -		seq_printf(m, " %8u", ttm_pool_type_count(&pt[i]));
> +		seq_printf(m, " %8u", drm_page_pool_get_size(pt[i]));
>   	seq_puts(m, "\n");
>   }
>   
> @@ -602,7 +506,10 @@ int ttm_pool_debugfs(struct ttm_pool *pool, struct seq_file *m)
>   	}
>   
>   	seq_printf(m, "\ntotal\t: %8lu of %8lu\n",
> -		   atomic_long_read(&allocated_pages), page_pool_size);
> +		   atomic_long_read(&allocated_pages),
> +		   drm_page_pool_get_max());
> +	seq_printf(m, "(%8lu in non-ttm pools)\n", drm_page_pool_get_total() -
> +					atomic_long_read(&allocated_pages));
>   
>   	mutex_unlock(&shrinker_lock);
>   
> @@ -612,28 +519,6 @@ EXPORT_SYMBOL(ttm_pool_debugfs);
>   
>   #endif
>   
> -/* As long as pages are available make sure to release at least one */
> -static unsigned long ttm_pool_shrinker_scan(struct shrinker *shrink,
> -					    struct shrink_control *sc)
> -{
> -	unsigned long num_freed = 0;
> -
> -	do
> -		num_freed += ttm_pool_shrink();
> -	while (!num_freed && atomic_long_read(&allocated_pages));
> -
> -	return num_freed;
> -}
> -
> -/* Return the number of pages available or SHRINK_EMPTY if we have none */
> -static unsigned long ttm_pool_shrinker_count(struct shrinker *shrink,
> -					     struct shrink_control *sc)
> -{
> -	unsigned long num_pages = atomic_long_read(&allocated_pages);
> -
> -	return num_pages ? num_pages : SHRINK_EMPTY;
> -}
> -
>   /**
>    * ttm_pool_mgr_init - Initialize globals
>    *
> @@ -648,24 +533,21 @@ int ttm_pool_mgr_init(unsigned long num_pages)
>   	if (!page_pool_size)
>   		page_pool_size = num_pages;
>   
> +	drm_page_pool_set_max(page_pool_size);
> +
>   	mutex_init(&shrinker_lock);
> -	INIT_LIST_HEAD(&shrinker_list);
>   
>   	for (i = 0; i < MAX_ORDER; ++i) {
> -		ttm_pool_type_init(&global_write_combined[i], NULL,
> -				   ttm_write_combined, i);
> -		ttm_pool_type_init(&global_uncached[i], NULL, ttm_uncached, i);
> -
> -		ttm_pool_type_init(&global_dma32_write_combined[i], NULL,
> -				   ttm_write_combined, i);
> -		ttm_pool_type_init(&global_dma32_uncached[i], NULL,
> -				   ttm_uncached, i);
> +		global_write_combined[i] = drm_page_pool_create(i,
> +							ttm_pool_free_page);
> +		global_uncached[i] = drm_page_pool_create(i,
> +							ttm_pool_free_page);
> +		global_dma32_write_combined[i] = drm_page_pool_create(i,
> +							ttm_pool_free_page);
> +		global_dma32_uncached[i] = drm_page_pool_create(i,
> +							ttm_pool_free_page);
>   	}
> -
> -	mm_shrinker.count_objects = ttm_pool_shrinker_count;
> -	mm_shrinker.scan_objects = ttm_pool_shrinker_scan;
> -	mm_shrinker.seeks = 1;
> -	return register_shrinker(&mm_shrinker);
> +	return 0;
>   }
>   
>   /**
> @@ -678,13 +560,10 @@ void ttm_pool_mgr_fini(void)
>   	unsigned int i;
>   
>   	for (i = 0; i < MAX_ORDER; ++i) {
> -		ttm_pool_type_fini(&global_write_combined[i]);
> -		ttm_pool_type_fini(&global_uncached[i]);
> +		drm_page_pool_destroy(global_write_combined[i]);
> +		drm_page_pool_destroy(global_uncached[i]);
>   
> -		ttm_pool_type_fini(&global_dma32_write_combined[i]);
> -		ttm_pool_type_fini(&global_dma32_uncached[i]);
> +		drm_page_pool_destroy(global_dma32_write_combined[i]);
> +		drm_page_pool_destroy(global_dma32_uncached[i]);
>   	}
> -
> -	unregister_shrinker(&mm_shrinker);
> -	WARN_ON(!list_empty(&shrinker_list));
>   }
> diff --git a/include/drm/ttm/ttm_pool.h b/include/drm/ttm/ttm_pool.h
> index 4321728bdd11..460ab232fd22 100644
> --- a/include/drm/ttm/ttm_pool.h
> +++ b/include/drm/ttm/ttm_pool.h
> @@ -36,27 +36,6 @@ struct ttm_tt;
>   struct ttm_pool;
>   struct ttm_operation_ctx;
>   
> -/**
> - * ttm_pool_type - Pool for a certain memory type
> - *
> - * @pool: the pool we belong to, might be NULL for the global ones
> - * @order: the allocation order our pages have
> - * @caching: the caching type our pages have
> - * @shrinker_list: our place on the global shrinker list
> - * @lock: protection of the page list
> - * @pages: the list of pages in the pool
> - */
> -struct ttm_pool_type {
> -	struct ttm_pool *pool;
> -	unsigned int order;
> -	enum ttm_caching caching;
> -
> -	struct list_head shrinker_list;
> -
> -	spinlock_t lock;
> -	struct list_head pages;
> -};
> -
>   /**
>    * ttm_pool - Pool for all caching and orders
>    *
> @@ -71,7 +50,7 @@ struct ttm_pool {
>   	bool use_dma32;
>   
>   	struct {
> -		struct ttm_pool_type orders[MAX_ORDER];
> +		struct drm_page_pool *orders[MAX_ORDER];
>   	} caching[TTM_NUM_CACHING_TYPES];
>   };
>   


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap
  2021-02-05  8:06 [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap John Stultz
                   ` (6 preceding siblings ...)
  2021-02-05  8:06 ` [RFC][PATCH v6 7/7] dma-buf: system_heap: Add deferred freeing to the " John Stultz
@ 2021-02-05 10:36 ` Christian König
  2021-02-05 20:57   ` John Stultz
  7 siblings, 1 reply; 33+ messages in thread
From: Christian König @ 2021-02-05 10:36 UTC (permalink / raw)
  To: John Stultz, lkml
  Cc: Daniel Vetter, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Suren Baghdasaryan, Sandeep Patil, Daniel Mentz, Ørjan Eide,
	Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones,
	linux-media, dri-devel

Am 05.02.21 um 09:06 schrieb John Stultz:
> This series is starting to get long, so I figured I'd add a
> short cover letter for context.
>
> The point of this series is trying to add both deferred-freeing
> logic as well as a page pool to the DMA-BUF system heap.
>
> This is desired, as the combination of deferred freeing along
> with the page pool allows us to offload page-zeroing out of
> the allocation hot path. This was done originally with ION
> and this patch series allows the DMA-BUF system heap to match
> ION's system heap allocation performance in a simple
> microbenchmark [1] (ION re-added to the kernel for comparision,
> running on an x86 vm image):
>
> ./dmabuf-heap-bench -i 0 1 system
> Testing dmabuf system vs ion heaptype 0 (flags: 0x1)
> ---------------------------------------------
> dmabuf heap: alloc 4096 bytes 5000 times in 86572223 ns          17314 ns/call
> ion heap:    alloc 4096 bytes 5000 times in 97442526 ns          19488 ns/call
> dmabuf heap: alloc 1048576 bytes 5000 times in 196635057 ns      39327 ns/call
> ion heap:    alloc 1048576 bytes 5000 times in 357323629 ns      71464 ns/call
> dmabuf heap: alloc 8388608 bytes 5000 times in 3165445534 ns     633089 ns/call
> ion heap:    alloc 8388608 bytes 5000 times in 3699591271 ns     739918 ns/call
> dmabuf heap: alloc 33554432 bytes 5000 times in 13327402517 ns   2665480 ns/call
> ion heap:    alloc 33554432 bytes 5000 times in 15292352796 ns   3058470 ns/call
>
> Daniel didn't like earlier attempts to re-use the network
> page-pool code to achieve this, and suggested the ttm_pool be
> used instead. This required pulling the fairly tightly knit
> ttm_pool logic apart, but after many failed attmempts I think
> I found a workable abstraction to split out shared logic.
>
> So this series contains a new generic drm_page_pool helper
> library, converts the ttm_pool to use it, and then adds the
> dmabuf deferred freeing and adds support to the dmabuf system
> heap to use both deferred freeing and the new drm_page_pool.
>
> Input would be greatly appreciated. Testing as well, as I don't
> have any development hardware that utilizes the ttm pool.

We can easily do the testing and the general idea sounds solid to me.

I see three major things we need to clean up here.
1. The licensing, you are moving from BSD/MIT to GPL2.
2. Don't add any new overhead to the TTM pool, especially allocating a 
private object per page is a no-go.
3. What are you doing with the reclaim stuff and why?
4. Keeping the documentation would be nice to have.

Regards,
Christian.

>
> thanks
> -john
>
> [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fandroid.googlesource.com%2Fplatform%2Fsystem%2Fmemory%2Flibdmabufheap%2F%2B%2Frefs%2Fheads%2Fmaster%2Ftests%2Fdmabuf_heap_bench.c&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C2dc4d6cb3ee246558b9e08d8c9acef9a%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637481091933715561%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=oMgVsrdlwS%2BqZuuatjTiWDzMU9SiUW5eRar5xwT%2BHYQ%3D&amp;reserved=0
>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: Liam Mark <lmark@codeaurora.org>
> Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
> Cc: Laura Abbott <labbott@kernel.org>
> Cc: Brian Starkey <Brian.Starkey@arm.com>
> Cc: Hridya Valsaraju <hridya@google.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Sandeep Patil <sspatil@google.com>
> Cc: Daniel Mentz <danielmentz@google.com>
> Cc: Ørjan Eide <orjan.eide@arm.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Ezequiel Garcia <ezequiel@collabora.com>
> Cc: Simon Ser <contact@emersion.fr>
> Cc: James Jones <jajones@nvidia.com>
> Cc: linux-media@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
>
> John Stultz (7):
>    drm: Add a sharable drm page-pool implementation
>    drm: ttm_pool: Rename the ttm_pool_dma structure to ttm_pool_page_dat
>    drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a
>      function pointer
>    drm: ttm_pool: Rework ttm_pool to use drm_page_pool
>    dma-buf: heaps: Add deferred-free-helper library code
>    dma-buf: system_heap: Add drm pagepool support to system heap
>    dma-buf: system_heap: Add deferred freeing to the system heap
>
>   drivers/dma-buf/heaps/Kconfig                |   5 +
>   drivers/dma-buf/heaps/Makefile               |   1 +
>   drivers/dma-buf/heaps/deferred-free-helper.c | 145 ++++++++++
>   drivers/dma-buf/heaps/deferred-free-helper.h |  55 ++++
>   drivers/dma-buf/heaps/system_heap.c          |  77 ++++-
>   drivers/gpu/drm/Kconfig                      |   5 +
>   drivers/gpu/drm/Makefile                     |   1 +
>   drivers/gpu/drm/page_pool.c                  | 220 +++++++++++++++
>   drivers/gpu/drm/ttm/ttm_pool.c               | 278 ++++++-------------
>   include/drm/page_pool.h                      |  54 ++++
>   include/drm/ttm/ttm_pool.h                   |  23 +-
>   11 files changed, 639 insertions(+), 225 deletions(-)
>   create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.c
>   create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.h
>   create mode 100644 drivers/gpu/drm/page_pool.c
>   create mode 100644 include/drm/page_pool.h
>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer
  2021-02-05  8:28   ` Christian König
@ 2021-02-05 19:47     ` John Stultz
  2021-02-09 12:13       ` Christian König
  0 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2021-02-05 19:47 UTC (permalink / raw)
  To: Christian König
  Cc: lkml, Daniel Vetter, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Suren Baghdasaryan, Sandeep Patil, Daniel Mentz, Ørjan Eide,
	Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones,
	linux-media, dri-devel

On Fri, Feb 5, 2021 at 12:28 AM Christian König
<christian.koenig@amd.com> wrote:
> Am 05.02.21 um 09:06 schrieb John Stultz:
> > This refactors ttm_pool_free_page(), and by adding extra entries
> > to ttm_pool_page_dat, we then use it for all allocations, which
> > allows us to simplify the arguments needed to be passed to
> > ttm_pool_free_page().
>
> This is a clear NAK since the peer page data is just a workaround for
> the DMA-API hack to grab pages from there.
>
> Adding this to all pages would increase the memory footprint drastically.

Yea, that's a good point!  Hrm... bummer. I'll have to see if there's
some other way I can get the needed context for the free from the
generic page-pool side.

Thanks so much for the review!
-john

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-05  8:46   ` Christian König
@ 2021-02-05 20:46     ` John Stultz
  2021-02-05 22:38       ` Suren Baghdasaryan
  2021-02-09 12:11       ` Christian König
  0 siblings, 2 replies; 33+ messages in thread
From: John Stultz @ 2021-02-05 20:46 UTC (permalink / raw)
  To: Christian König
  Cc: lkml, Daniel Vetter, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Suren Baghdasaryan, Sandeep Patil, Daniel Mentz, Ørjan Eide,
	Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones,
	linux-media, dri-devel

On Fri, Feb 5, 2021 at 12:47 AM Christian König
<christian.koenig@amd.com> wrote:
> Am 05.02.21 um 09:06 schrieb John Stultz:
> > diff --git a/drivers/gpu/drm/page_pool.c b/drivers/gpu/drm/page_pool.c
> > new file mode 100644
> > index 000000000000..2139f86e6ca7
> > --- /dev/null
> > +++ b/drivers/gpu/drm/page_pool.c
> > @@ -0,0 +1,220 @@
> > +// SPDX-License-Identifier: GPL-2.0
>
> Please use a BSD/MIT compatible license if you want to copy this from
> the TTM code.

Hrm. This may be problematic, as it's not just TTM code, but some of
the TTM logic integrated into a page-pool implementation I wrote based
on logic from the ION code (which was GPL-2.0 before it was dropped).
So I don't think I can just make it MIT.  Any extra context on the
need for MIT, or suggestions on how to best resolve this?

> > +int drm_page_pool_get_size(struct drm_page_pool *pool)
> > +{
> > +     int ret;
> > +
> > +     spin_lock(&pool->lock);
> > +     ret = pool->count;
> > +     spin_unlock(&pool->lock);
>
> Maybe use an atomic for the count instead?
>

I can do that, but am curious as to the benefit? We are mostly using
count where we already have to take the pool->lock anyway, and this
drm_page_pool_get_size() is only used for debugfs output so far, so I
don't expect it to be a hot path.


> > +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
> > +{
> > +     spin_lock(&pool->lock);
> > +     list_add_tail(&page->lru, &pool->items);
> > +     pool->count++;
> > +     atomic_long_add(1 << pool->order, &total_pages);
> > +     spin_unlock(&pool->lock);
> > +
> > +     mod_node_page_state(page_pgdat(page), NR_KERNEL_MISC_RECLAIMABLE,
> > +                         1 << pool->order);
>
> Hui what? What should that be good for?

This is a carryover from the ION page pool implementation:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ion/ion_page_pool.c?h=v5.10#n28

My sense is it helps with the vmstat/meminfo accounting so folks can
see the cached pages are shrinkable/freeable. This maybe falls under
other dmabuf accounting/stats discussions, so I'm happy to remove it
for now, or let the drivers using the shared page pool logic handle
the accounting themselves?


> > +static struct page *drm_page_pool_remove(struct drm_page_pool *pool)
> > +{
> > +     struct page *page;
> > +
> > +     if (!pool->count)
> > +             return NULL;
>
> Better use list_first_entry_or_null instead of checking the count.
>
> This way you can also pull the lock into the function.

Yea, that cleans a number of things up nicely. Thank you!


> > +struct drm_page_pool *drm_page_pool_create(unsigned int order,
> > +                                        int (*free_page)(struct page *p, unsigned int order))
> > +{
> > +     struct drm_page_pool *pool = kmalloc(sizeof(*pool), GFP_KERNEL);
>
> Why not making this an embedded object? We should not see much dynamic
> pool creation.

Yea, it felt cleaner at the time this way, but I think I will need to
switch to an embedded object in order to resolve the memory usage
issue you pointed out with growing the ttm_pool_dma, so thank you for
the suggestion!


> > +void drm_page_pool_destroy(struct drm_page_pool *pool)
> > +{
> > +     struct page *page;
> > +
> > +     /* Remove us from the pool list */
> > +     mutex_lock(&pool_list_lock);
> > +     list_del(&pool->list);
> > +     mutex_unlock(&pool_list_lock);
> > +
> > +     /* Free any remaining pages in the pool */
> > +     spin_lock(&pool->lock);
>
> Locking should be unnecessary when the pool is destroyed anyway.

I guess if we've already pruned ourself from the pool list, then your
right, we can't race with the shrinker and it's maybe not necessary.
But it also seems easier to consistently follow the locking rules in a
very unlikely path rather than leaning on subtlety.  Either way, I
think this becomes moot if I make the improvements you suggest to
drm_page_pool_remove().

> > +static int drm_page_pool_shrink_one(void)
> > +{
> > +     struct drm_page_pool *pool;
> > +     struct page *page;
> > +     int nr_freed = 0;
> > +
> > +     mutex_lock(&pool_list_lock);
> > +     pool = list_first_entry(&pool_list, typeof(*pool), list);
> > +
> > +     spin_lock(&pool->lock);
> > +     page = drm_page_pool_remove(pool);
> > +     spin_unlock(&pool->lock);
> > +
> > +     if (page)
> > +             nr_freed = drm_page_pool_free_pages(pool, page);
> > +
> > +     list_move_tail(&pool->list, &pool_list);
>
> Better to move this up, directly after the list_first_entry().

Sounds good!

Thanks so much for your review and feedback! I'll try to get some of
the easy suggestions integrated, and will have to figure out what to
do about the re-licensing request.

thanks
-john

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap
  2021-02-05 10:36 ` [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap Christian König
@ 2021-02-05 20:57   ` John Stultz
  0 siblings, 0 replies; 33+ messages in thread
From: John Stultz @ 2021-02-05 20:57 UTC (permalink / raw)
  To: Christian König
  Cc: lkml, Daniel Vetter, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Suren Baghdasaryan, Sandeep Patil, Daniel Mentz, Ørjan Eide,
	Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones,
	linux-media, dri-devel

On Fri, Feb 5, 2021 at 2:36 AM Christian König <christian.koenig@amd.com> wrote:
> Am 05.02.21 um 09:06 schrieb John Stultz:
> > Input would be greatly appreciated. Testing as well, as I don't
> > have any development hardware that utilizes the ttm pool.
>
> We can easily do the testing and the general idea sounds solid to me.
>

Thanks so much again for the feedback!

> I see three major things we need to clean up here.
> 1. The licensing, you are moving from BSD/MIT to GPL2.

Yea, this may be sticky, as it's not just code re-used from one dual
licensed file, but combination of GPL2 work, so advice here would be
appreciated.

> 2. Don't add any new overhead to the TTM pool, especially allocating a
> private object per page is a no-go.

This will need some more series rework to solve. I've got some ideas,
but we'll see if they work.

> 3. What are you doing with the reclaim stuff and why?

As I mentioned, it's a holdover from earlier code, and I'm happy to
drop it and defer to other accounting/stats discussions that are
ongoing.

> 4. Keeping the documentation would be nice to have.

True. I didn't spend much time with documentation, as I worried folks
may have disagreed with the whole approach. I'll work to improve it
for the next go around.

Thanks so much again for the review and feedback! I really appreciate
your time here.
-john

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-05 20:46     ` John Stultz
@ 2021-02-05 22:38       ` Suren Baghdasaryan
  2021-02-09 12:11       ` Christian König
  1 sibling, 0 replies; 33+ messages in thread
From: Suren Baghdasaryan @ 2021-02-05 22:38 UTC (permalink / raw)
  To: John Stultz
  Cc: Christian König, lkml, Daniel Vetter, Sumit Semwal,
	Liam Mark, Chris Goldsworthy, Laura Abbott, Brian Starkey,
	Hridya Valsaraju, Sandeep Patil, Daniel Mentz, Ørjan Eide,
	Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones,
	linux-media, dri-devel

On Fri, Feb 5, 2021 at 12:47 PM John Stultz <john.stultz@linaro.org> wrote:
>
> On Fri, Feb 5, 2021 at 12:47 AM Christian König
> <christian.koenig@amd.com> wrote:
> > Am 05.02.21 um 09:06 schrieb John Stultz:
> > > diff --git a/drivers/gpu/drm/page_pool.c b/drivers/gpu/drm/page_pool.c
> > > new file mode 100644
> > > index 000000000000..2139f86e6ca7
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/page_pool.c
> > > @@ -0,0 +1,220 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> >
> > Please use a BSD/MIT compatible license if you want to copy this from
> > the TTM code.
>
> Hrm. This may be problematic, as it's not just TTM code, but some of
> the TTM logic integrated into a page-pool implementation I wrote based
> on logic from the ION code (which was GPL-2.0 before it was dropped).
> So I don't think I can just make it MIT.  Any extra context on the
> need for MIT, or suggestions on how to best resolve this?
>
> > > +int drm_page_pool_get_size(struct drm_page_pool *pool)
> > > +{
> > > +     int ret;
> > > +
> > > +     spin_lock(&pool->lock);
> > > +     ret = pool->count;
> > > +     spin_unlock(&pool->lock);
> >
> > Maybe use an atomic for the count instead?
> >
>
> I can do that, but am curious as to the benefit? We are mostly using
> count where we already have to take the pool->lock anyway, and this
> drm_page_pool_get_size() is only used for debugfs output so far, so I
> don't expect it to be a hot path.
>
>
> > > +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
> > > +{
> > > +     spin_lock(&pool->lock);
> > > +     list_add_tail(&page->lru, &pool->items);
> > > +     pool->count++;
> > > +     atomic_long_add(1 << pool->order, &total_pages);
> > > +     spin_unlock(&pool->lock);
> > > +
> > > +     mod_node_page_state(page_pgdat(page), NR_KERNEL_MISC_RECLAIMABLE,
> > > +                         1 << pool->order);
> >
> > Hui what? What should that be good for?
>
> This is a carryover from the ION page pool implementation:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ion/ion_page_pool.c?h=v5.10#n28
>
> My sense is it helps with the vmstat/meminfo accounting so folks can
> see the cached pages are shrinkable/freeable. This maybe falls under
> other dmabuf accounting/stats discussions, so I'm happy to remove it
> for now, or let the drivers using the shared page pool logic handle
> the accounting themselves?

Yep, ION pools were accounted for as reclaimable kernel memory because
they could be dropped when the system is under memory pressure.

>
>
> > > +static struct page *drm_page_pool_remove(struct drm_page_pool *pool)
> > > +{
> > > +     struct page *page;
> > > +
> > > +     if (!pool->count)
> > > +             return NULL;
> >
> > Better use list_first_entry_or_null instead of checking the count.
> >
> > This way you can also pull the lock into the function.
>
> Yea, that cleans a number of things up nicely. Thank you!
>
>
> > > +struct drm_page_pool *drm_page_pool_create(unsigned int order,
> > > +                                        int (*free_page)(struct page *p, unsigned int order))
> > > +{
> > > +     struct drm_page_pool *pool = kmalloc(sizeof(*pool), GFP_KERNEL);
> >
> > Why not making this an embedded object? We should not see much dynamic
> > pool creation.
>
> Yea, it felt cleaner at the time this way, but I think I will need to
> switch to an embedded object in order to resolve the memory usage
> issue you pointed out with growing the ttm_pool_dma, so thank you for
> the suggestion!
>
>
> > > +void drm_page_pool_destroy(struct drm_page_pool *pool)
> > > +{
> > > +     struct page *page;
> > > +
> > > +     /* Remove us from the pool list */
> > > +     mutex_lock(&pool_list_lock);
> > > +     list_del(&pool->list);
> > > +     mutex_unlock(&pool_list_lock);
> > > +
> > > +     /* Free any remaining pages in the pool */
> > > +     spin_lock(&pool->lock);
> >
> > Locking should be unnecessary when the pool is destroyed anyway.
>
> I guess if we've already pruned ourself from the pool list, then your
> right, we can't race with the shrinker and it's maybe not necessary.
> But it also seems easier to consistently follow the locking rules in a
> very unlikely path rather than leaning on subtlety.  Either way, I
> think this becomes moot if I make the improvements you suggest to
> drm_page_pool_remove().
>
> > > +static int drm_page_pool_shrink_one(void)
> > > +{
> > > +     struct drm_page_pool *pool;
> > > +     struct page *page;
> > > +     int nr_freed = 0;
> > > +
> > > +     mutex_lock(&pool_list_lock);
> > > +     pool = list_first_entry(&pool_list, typeof(*pool), list);
> > > +
> > > +     spin_lock(&pool->lock);
> > > +     page = drm_page_pool_remove(pool);
> > > +     spin_unlock(&pool->lock);
> > > +
> > > +     if (page)
> > > +             nr_freed = drm_page_pool_free_pages(pool, page);
> > > +
> > > +     list_move_tail(&pool->list, &pool_list);
> >
> > Better to move this up, directly after the list_first_entry().
>
> Sounds good!
>
> Thanks so much for your review and feedback! I'll try to get some of
> the easy suggestions integrated, and will have to figure out what to
> do about the re-licensing request.
>
> thanks
> -john

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-05 20:46     ` John Stultz
  2021-02-05 22:38       ` Suren Baghdasaryan
@ 2021-02-09 12:11       ` Christian König
  2021-02-09 12:57         ` Christian König
  2021-02-09 17:51         ` John Stultz
  1 sibling, 2 replies; 33+ messages in thread
From: Christian König @ 2021-02-09 12:11 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Daniel Vetter, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Suren Baghdasaryan, Sandeep Patil, Daniel Mentz, Ørjan Eide,
	Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones,
	linux-media, dri-devel



Am 05.02.21 um 21:46 schrieb John Stultz:
> On Fri, Feb 5, 2021 at 12:47 AM Christian König
> <christian.koenig@amd.com> wrote:
>> Am 05.02.21 um 09:06 schrieb John Stultz:
>>> diff --git a/drivers/gpu/drm/page_pool.c b/drivers/gpu/drm/page_pool.c
>>> new file mode 100644
>>> index 000000000000..2139f86e6ca7
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/page_pool.c
>>> @@ -0,0 +1,220 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>> Please use a BSD/MIT compatible license if you want to copy this from
>> the TTM code.
> Hrm. This may be problematic, as it's not just TTM code, but some of
> the TTM logic integrated into a page-pool implementation I wrote based
> on logic from the ION code (which was GPL-2.0 before it was dropped).
> So I don't think I can just make it MIT.  Any extra context on the
> need for MIT, or suggestions on how to best resolve this?

Most of the DRM/TTM code is also license able under the BSD/MIT style 
license since we want to enable the BSD guys to port it over as well.

What stuff do you need from the ION code that you can't just code 
yourself? As far as I have seen this is like 99% code from the TTM pool.

Christian.

>
>>> +int drm_page_pool_get_size(struct drm_page_pool *pool)
>>> +{
>>> +     int ret;
>>> +
>>> +     spin_lock(&pool->lock);
>>> +     ret = pool->count;
>>> +     spin_unlock(&pool->lock);
>> Maybe use an atomic for the count instead?
>>
> I can do that, but am curious as to the benefit? We are mostly using
> count where we already have to take the pool->lock anyway, and this
> drm_page_pool_get_size() is only used for debugfs output so far, so I
> don't expect it to be a hot path.

Yeah, it's not really important. IIRC I even walked over the linked list 
to count how many pages we had.

Christian.

>
>
>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
>>> +{
>>> +     spin_lock(&pool->lock);
>>> +     list_add_tail(&page->lru, &pool->items);
>>> +     pool->count++;
>>> +     atomic_long_add(1 << pool->order, &total_pages);
>>> +     spin_unlock(&pool->lock);
>>> +
>>> +     mod_node_page_state(page_pgdat(page), NR_KERNEL_MISC_RECLAIMABLE,
>>> +                         1 << pool->order);
>> Hui what? What should that be good for?
> This is a carryover from the ION page pool implementation:
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cc4eadb0a9cf6491d99ba08d8ca173457%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637481548325174885%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=FUjZK5NSDMUYfU7vGeE4fDU2HCF%2FYyNBwc30aoLLPQ4%3D&amp;reserved=0
>
> My sense is it helps with the vmstat/meminfo accounting so folks can
> see the cached pages are shrinkable/freeable. This maybe falls under
> other dmabuf accounting/stats discussions, so I'm happy to remove it
> for now, or let the drivers using the shared page pool logic handle
> the accounting themselves?
>
>
>>> +static struct page *drm_page_pool_remove(struct drm_page_pool *pool)
>>> +{
>>> +     struct page *page;
>>> +
>>> +     if (!pool->count)
>>> +             return NULL;
>> Better use list_first_entry_or_null instead of checking the count.
>>
>> This way you can also pull the lock into the function.
> Yea, that cleans a number of things up nicely. Thank you!
>
>
>>> +struct drm_page_pool *drm_page_pool_create(unsigned int order,
>>> +                                        int (*free_page)(struct page *p, unsigned int order))
>>> +{
>>> +     struct drm_page_pool *pool = kmalloc(sizeof(*pool), GFP_KERNEL);
>> Why not making this an embedded object? We should not see much dynamic
>> pool creation.
> Yea, it felt cleaner at the time this way, but I think I will need to
> switch to an embedded object in order to resolve the memory usage
> issue you pointed out with growing the ttm_pool_dma, so thank you for
> the suggestion!
>
>
>>> +void drm_page_pool_destroy(struct drm_page_pool *pool)
>>> +{
>>> +     struct page *page;
>>> +
>>> +     /* Remove us from the pool list */
>>> +     mutex_lock(&pool_list_lock);
>>> +     list_del(&pool->list);
>>> +     mutex_unlock(&pool_list_lock);
>>> +
>>> +     /* Free any remaining pages in the pool */
>>> +     spin_lock(&pool->lock);
>> Locking should be unnecessary when the pool is destroyed anyway.
> I guess if we've already pruned ourself from the pool list, then your
> right, we can't race with the shrinker and it's maybe not necessary.
> But it also seems easier to consistently follow the locking rules in a
> very unlikely path rather than leaning on subtlety.  Either way, I
> think this becomes moot if I make the improvements you suggest to
> drm_page_pool_remove().
>
>>> +static int drm_page_pool_shrink_one(void)
>>> +{
>>> +     struct drm_page_pool *pool;
>>> +     struct page *page;
>>> +     int nr_freed = 0;
>>> +
>>> +     mutex_lock(&pool_list_lock);
>>> +     pool = list_first_entry(&pool_list, typeof(*pool), list);
>>> +
>>> +     spin_lock(&pool->lock);
>>> +     page = drm_page_pool_remove(pool);
>>> +     spin_unlock(&pool->lock);
>>> +
>>> +     if (page)
>>> +             nr_freed = drm_page_pool_free_pages(pool, page);
>>> +
>>> +     list_move_tail(&pool->list, &pool_list);
>> Better to move this up, directly after the list_first_entry().
> Sounds good!
>
> Thanks so much for your review and feedback! I'll try to get some of
> the easy suggestions integrated, and will have to figure out what to
> do about the re-licensing request.
>
> thanks
> -john


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer
  2021-02-05 19:47     ` John Stultz
@ 2021-02-09 12:13       ` Christian König
  2021-02-09 17:39         ` John Stultz
  0 siblings, 1 reply; 33+ messages in thread
From: Christian König @ 2021-02-09 12:13 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Daniel Vetter, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Suren Baghdasaryan, Sandeep Patil, Daniel Mentz, Ørjan Eide,
	Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones,
	linux-media, dri-devel



Am 05.02.21 um 20:47 schrieb John Stultz:
> On Fri, Feb 5, 2021 at 12:28 AM Christian König
> <christian.koenig@amd.com> wrote:
>> Am 05.02.21 um 09:06 schrieb John Stultz:
>>> This refactors ttm_pool_free_page(), and by adding extra entries
>>> to ttm_pool_page_dat, we then use it for all allocations, which
>>> allows us to simplify the arguments needed to be passed to
>>> ttm_pool_free_page().
>> This is a clear NAK since the peer page data is just a workaround for
>> the DMA-API hack to grab pages from there.
>>
>> Adding this to all pages would increase the memory footprint drastically.
> Yea, that's a good point!  Hrm... bummer. I'll have to see if there's
> some other way I can get the needed context for the free from the
> generic page-pool side.

What exactly is the problem here? As far as I can see we just have the 
lru entry (list_head) and the pool.

How the lru is cast to the page can be completely pool implementation 
specific.

Regards,
Christian.

>
> Thanks so much for the review!
> -john


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-09 12:11       ` Christian König
@ 2021-02-09 12:57         ` Christian König
  2021-02-09 17:33           ` Suren Baghdasaryan
  2021-02-09 17:51         ` John Stultz
  1 sibling, 1 reply; 33+ messages in thread
From: Christian König @ 2021-02-09 12:57 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Daniel Vetter, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Suren Baghdasaryan, Sandeep Patil, Daniel Mentz, Ørjan Eide,
	Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones,
	linux-media, dri-devel

Am 09.02.21 um 13:11 schrieb Christian König:
> [SNIP]
>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
>>>> +{
>>>> +     spin_lock(&pool->lock);
>>>> +     list_add_tail(&page->lru, &pool->items);
>>>> +     pool->count++;
>>>> +     atomic_long_add(1 << pool->order, &total_pages);
>>>> +     spin_unlock(&pool->lock);
>>>> +
>>>> +     mod_node_page_state(page_pgdat(page), 
>>>> NR_KERNEL_MISC_RECLAIMABLE,
>>>> +                         1 << pool->order);
>>> Hui what? What should that be good for?
>> This is a carryover from the ION page pool implementation:
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cc4eadb0a9cf6491d99ba08d8ca173457%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637481548325174885%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=FUjZK5NSDMUYfU7vGeE4fDU2HCF%2FYyNBwc30aoLLPQ4%3D&amp;reserved=0 
>>
>>
>> My sense is it helps with the vmstat/meminfo accounting so folks can
>> see the cached pages are shrinkable/freeable. This maybe falls under
>> other dmabuf accounting/stats discussions, so I'm happy to remove it
>> for now, or let the drivers using the shared page pool logic handle
>> the accounting themselves?

Intentionally separated the discussion for that here.

As far as I can see this is just bluntly incorrect.

Either the page is reclaimable or it is part of our pool and freeable 
through the shrinker, but never ever both.

In the best case this just messes up the accounting, in the worst case 
it can cause memory corruption.

Christian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-09 12:57         ` Christian König
@ 2021-02-09 17:33           ` Suren Baghdasaryan
  2021-02-09 17:46             ` Christian König
  0 siblings, 1 reply; 33+ messages in thread
From: Suren Baghdasaryan @ 2021-02-09 17:33 UTC (permalink / raw)
  To: Christian König
  Cc: John Stultz, lkml, Daniel Vetter, Sumit Semwal, Liam Mark,
	Chris Goldsworthy, Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Sandeep Patil, Daniel Mentz, Ørjan Eide, Robin Murphy,
	Ezequiel Garcia, Simon Ser, James Jones, linux-media, dri-devel

On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig@amd.com> wrote:
>
> Am 09.02.21 um 13:11 schrieb Christian König:
> > [SNIP]
> >>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
> >>>> +{
> >>>> +     spin_lock(&pool->lock);
> >>>> +     list_add_tail(&page->lru, &pool->items);
> >>>> +     pool->count++;
> >>>> +     atomic_long_add(1 << pool->order, &total_pages);
> >>>> +     spin_unlock(&pool->lock);
> >>>> +
> >>>> +     mod_node_page_state(page_pgdat(page),
> >>>> NR_KERNEL_MISC_RECLAIMABLE,
> >>>> +                         1 << pool->order);
> >>> Hui what? What should that be good for?
> >> This is a carryover from the ION page pool implementation:
> >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cc4eadb0a9cf6491d99ba08d8ca173457%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637481548325174885%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=FUjZK5NSDMUYfU7vGeE4fDU2HCF%2FYyNBwc30aoLLPQ4%3D&amp;reserved=0
> >>
> >>
> >> My sense is it helps with the vmstat/meminfo accounting so folks can
> >> see the cached pages are shrinkable/freeable. This maybe falls under
> >> other dmabuf accounting/stats discussions, so I'm happy to remove it
> >> for now, or let the drivers using the shared page pool logic handle
> >> the accounting themselves?
>
> Intentionally separated the discussion for that here.
>
> As far as I can see this is just bluntly incorrect.
>
> Either the page is reclaimable or it is part of our pool and freeable
> through the shrinker, but never ever both.

IIRC the original motivation for counting ION pooled pages as
reclaimable was to include them into /proc/meminfo's MemAvailable
calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
non-slab kernel pages" seems like a good place to account for them but
I might be wrong.

>
> In the best case this just messes up the accounting, in the worst case
> it can cause memory corruption.
>
> Christian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer
  2021-02-09 12:13       ` Christian König
@ 2021-02-09 17:39         ` John Stultz
  0 siblings, 0 replies; 33+ messages in thread
From: John Stultz @ 2021-02-09 17:39 UTC (permalink / raw)
  To: Christian König
  Cc: lkml, Daniel Vetter, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Suren Baghdasaryan, Sandeep Patil, Daniel Mentz, Ørjan Eide,
	Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones,
	linux-media, dri-devel

On Tue, Feb 9, 2021 at 4:14 AM Christian König <christian.koenig@amd.com> wrote:
> Am 05.02.21 um 20:47 schrieb John Stultz:
> > On Fri, Feb 5, 2021 at 12:28 AM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Adding this to all pages would increase the memory footprint drastically.
> > Yea, that's a good point!  Hrm... bummer. I'll have to see if there's
> > some other way I can get the needed context for the free from the
> > generic page-pool side.
>
> What exactly is the problem here?

Me, usually. :)

> As far as I can see we just have the
> lru entry (list_head) and the pool.

Yea, I reworked it to an embedded drm_page_pool struct, but that is
mostly a list_head.

> How the lru is cast to the page can be completely pool implementation
> specific.

Yea, I had it do container_of(), just haven't gotten around to sending
it out yet.

Thanks so much for the feedback and ideas!

thanks
-john

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-09 17:33           ` Suren Baghdasaryan
@ 2021-02-09 17:46             ` Christian König
  2021-02-09 17:52               ` Suren Baghdasaryan
  2021-02-09 20:03               ` Daniel Vetter
  0 siblings, 2 replies; 33+ messages in thread
From: Christian König @ 2021-02-09 17:46 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: John Stultz, lkml, Daniel Vetter, Sumit Semwal, Liam Mark,
	Chris Goldsworthy, Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Sandeep Patil, Daniel Mentz, Ørjan Eide, Robin Murphy,
	Ezequiel Garcia, Simon Ser, James Jones, linux-media, dri-devel



Am 09.02.21 um 18:33 schrieb Suren Baghdasaryan:
> On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig@amd.com> wrote:
>> Am 09.02.21 um 13:11 schrieb Christian König:
>>> [SNIP]
>>>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
>>>>>> +{
>>>>>> +     spin_lock(&pool->lock);
>>>>>> +     list_add_tail(&page->lru, &pool->items);
>>>>>> +     pool->count++;
>>>>>> +     atomic_long_add(1 << pool->order, &total_pages);
>>>>>> +     spin_unlock(&pool->lock);
>>>>>> +
>>>>>> +     mod_node_page_state(page_pgdat(page),
>>>>>> NR_KERNEL_MISC_RECLAIMABLE,
>>>>>> +                         1 << pool->order);
>>>>> Hui what? What should that be good for?
>>>> This is a carryover from the ION page pool implementation:
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cdccccff8edcd4d147a5b08d8cd20cff2%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637484888114923580%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=9%2BIBC0tezSV6Ci4S3kWfW%2BQvJm4mdunn3dF6C0kyfCw%3D&amp;reserved=0
>>>>
>>>>
>>>> My sense is it helps with the vmstat/meminfo accounting so folks can
>>>> see the cached pages are shrinkable/freeable. This maybe falls under
>>>> other dmabuf accounting/stats discussions, so I'm happy to remove it
>>>> for now, or let the drivers using the shared page pool logic handle
>>>> the accounting themselves?
>> Intentionally separated the discussion for that here.
>>
>> As far as I can see this is just bluntly incorrect.
>>
>> Either the page is reclaimable or it is part of our pool and freeable
>> through the shrinker, but never ever both.
> IIRC the original motivation for counting ION pooled pages as
> reclaimable was to include them into /proc/meminfo's MemAvailable
> calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
> non-slab kernel pages" seems like a good place to account for them but
> I might be wrong.

Yeah, that's what I see here as well. But exactly that is utterly nonsense.

Those pages are not "free" in the sense that get_free_page could return 
them directly.

Regards,
Christian.

>
>> In the best case this just messes up the accounting, in the worst case
>> it can cause memory corruption.
>>
>> Christian.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-09 12:11       ` Christian König
  2021-02-09 12:57         ` Christian König
@ 2021-02-09 17:51         ` John Stultz
  1 sibling, 0 replies; 33+ messages in thread
From: John Stultz @ 2021-02-09 17:51 UTC (permalink / raw)
  To: Christian König
  Cc: lkml, Daniel Vetter, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Suren Baghdasaryan, Sandeep Patil, Daniel Mentz, Ørjan Eide,
	Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones,
	linux-media, dri-devel

On Tue, Feb 9, 2021 at 4:11 AM Christian König <christian.koenig@amd.com> wrote:
>
>
>
> Am 05.02.21 um 21:46 schrieb John Stultz:
> > On Fri, Feb 5, 2021 at 12:47 AM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Am 05.02.21 um 09:06 schrieb John Stultz:
> >>> diff --git a/drivers/gpu/drm/page_pool.c b/drivers/gpu/drm/page_pool.c
> >>> new file mode 100644
> >>> index 000000000000..2139f86e6ca7
> >>> --- /dev/null
> >>> +++ b/drivers/gpu/drm/page_pool.c
> >>> @@ -0,0 +1,220 @@
> >>> +// SPDX-License-Identifier: GPL-2.0
> >> Please use a BSD/MIT compatible license if you want to copy this from
> >> the TTM code.
> > Hrm. This may be problematic, as it's not just TTM code, but some of
> > the TTM logic integrated into a page-pool implementation I wrote based
> > on logic from the ION code (which was GPL-2.0 before it was dropped).
> > So I don't think I can just make it MIT.  Any extra context on the
> > need for MIT, or suggestions on how to best resolve this?
>
> Most of the DRM/TTM code is also license able under the BSD/MIT style
> license since we want to enable the BSD guys to port it over as well.
>
> What stuff do you need from the ION code that you can't just code
> yourself? As far as I have seen this is like 99% code from the TTM pool.

Yea, it evolved into being mostly logic from the TTM pool (or code
that was very similar to begin with), but it's not where it started.
My old days at IBM makes me wary of claiming it's no longer the Ship
of Theseus.

So instead I think I'll have to just throw out my patch and rewrite it
in full (so apologies in advance for any review issues I
introduce/reintroduce).

thanks
-john

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-09 17:46             ` Christian König
@ 2021-02-09 17:52               ` Suren Baghdasaryan
  2021-02-09 20:03               ` Daniel Vetter
  1 sibling, 0 replies; 33+ messages in thread
From: Suren Baghdasaryan @ 2021-02-09 17:52 UTC (permalink / raw)
  To: Christian König
  Cc: John Stultz, lkml, Daniel Vetter, Sumit Semwal, Liam Mark,
	Chris Goldsworthy, Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Sandeep Patil, Daniel Mentz, Ørjan Eide, Robin Murphy,
	Ezequiel Garcia, Simon Ser, James Jones, linux-media, dri-devel

On Tue, Feb 9, 2021 at 9:46 AM Christian König <christian.koenig@amd.com> wrote:
>
>
>
> Am 09.02.21 um 18:33 schrieb Suren Baghdasaryan:
> > On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig@amd.com> wrote:
> >> Am 09.02.21 um 13:11 schrieb Christian König:
> >>> [SNIP]
> >>>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
> >>>>>> +{
> >>>>>> +     spin_lock(&pool->lock);
> >>>>>> +     list_add_tail(&page->lru, &pool->items);
> >>>>>> +     pool->count++;
> >>>>>> +     atomic_long_add(1 << pool->order, &total_pages);
> >>>>>> +     spin_unlock(&pool->lock);
> >>>>>> +
> >>>>>> +     mod_node_page_state(page_pgdat(page),
> >>>>>> NR_KERNEL_MISC_RECLAIMABLE,
> >>>>>> +                         1 << pool->order);
> >>>>> Hui what? What should that be good for?
> >>>> This is a carryover from the ION page pool implementation:
> >>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cdccccff8edcd4d147a5b08d8cd20cff2%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637484888114923580%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=9%2BIBC0tezSV6Ci4S3kWfW%2BQvJm4mdunn3dF6C0kyfCw%3D&amp;reserved=0
> >>>>
> >>>>
> >>>> My sense is it helps with the vmstat/meminfo accounting so folks can
> >>>> see the cached pages are shrinkable/freeable. This maybe falls under
> >>>> other dmabuf accounting/stats discussions, so I'm happy to remove it
> >>>> for now, or let the drivers using the shared page pool logic handle
> >>>> the accounting themselves?
> >> Intentionally separated the discussion for that here.
> >>
> >> As far as I can see this is just bluntly incorrect.
> >>
> >> Either the page is reclaimable or it is part of our pool and freeable
> >> through the shrinker, but never ever both.
> > IIRC the original motivation for counting ION pooled pages as
> > reclaimable was to include them into /proc/meminfo's MemAvailable
> > calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
> > non-slab kernel pages" seems like a good place to account for them but
> > I might be wrong.
>
> Yeah, that's what I see here as well. But exactly that is utterly nonsense.
>
> Those pages are not "free" in the sense that get_free_page could return
> them directly.

Any ideas where these pages would fit better? We do want to know that
under memory pressure these pages can be made available (which is I
think what MemAvailable means).

>
> Regards,
> Christian.
>
> >
> >> In the best case this just messes up the accounting, in the worst case
> >> it can cause memory corruption.
> >>
> >> Christian.
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-09 17:46             ` Christian König
  2021-02-09 17:52               ` Suren Baghdasaryan
@ 2021-02-09 20:03               ` Daniel Vetter
  2021-02-09 20:16                 ` Suren Baghdasaryan
  1 sibling, 1 reply; 33+ messages in thread
From: Daniel Vetter @ 2021-02-09 20:03 UTC (permalink / raw)
  To: Christian König
  Cc: Suren Baghdasaryan, John Stultz, lkml, Sumit Semwal, Liam Mark,
	Chris Goldsworthy, Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Sandeep Patil, Daniel Mentz, Ørjan Eide, Robin Murphy,
	Ezequiel Garcia, Simon Ser, James Jones, linux-media, dri-devel

On Tue, Feb 9, 2021 at 6:46 PM Christian König <christian.koenig@amd.com> wrote:
>
>
>
> Am 09.02.21 um 18:33 schrieb Suren Baghdasaryan:
> > On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig@amd.com> wrote:
> >> Am 09.02.21 um 13:11 schrieb Christian König:
> >>> [SNIP]
> >>>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
> >>>>>> +{
> >>>>>> +     spin_lock(&pool->lock);
> >>>>>> +     list_add_tail(&page->lru, &pool->items);
> >>>>>> +     pool->count++;
> >>>>>> +     atomic_long_add(1 << pool->order, &total_pages);
> >>>>>> +     spin_unlock(&pool->lock);
> >>>>>> +
> >>>>>> +     mod_node_page_state(page_pgdat(page),
> >>>>>> NR_KERNEL_MISC_RECLAIMABLE,
> >>>>>> +                         1 << pool->order);
> >>>>> Hui what? What should that be good for?
> >>>> This is a carryover from the ION page pool implementation:
> >>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cdccccff8edcd4d147a5b08d8cd20cff2%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637484888114923580%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=9%2BIBC0tezSV6Ci4S3kWfW%2BQvJm4mdunn3dF6C0kyfCw%3D&amp;reserved=0
> >>>>
> >>>>
> >>>> My sense is it helps with the vmstat/meminfo accounting so folks can
> >>>> see the cached pages are shrinkable/freeable. This maybe falls under
> >>>> other dmabuf accounting/stats discussions, so I'm happy to remove it
> >>>> for now, or let the drivers using the shared page pool logic handle
> >>>> the accounting themselves?
> >> Intentionally separated the discussion for that here.
> >>
> >> As far as I can see this is just bluntly incorrect.
> >>
> >> Either the page is reclaimable or it is part of our pool and freeable
> >> through the shrinker, but never ever both.
> > IIRC the original motivation for counting ION pooled pages as
> > reclaimable was to include them into /proc/meminfo's MemAvailable
> > calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
> > non-slab kernel pages" seems like a good place to account for them but
> > I might be wrong.
>
> Yeah, that's what I see here as well. But exactly that is utterly nonsense.
>
> Those pages are not "free" in the sense that get_free_page could return
> them directly.

Well on Android that is kinda true, because Android has it's
oom-killer (way back it was just a shrinker callback, not sure how it
works now), which just shot down all the background apps. So at least
some of that (everything used by background apps) is indeed
reclaimable on Android.

But that doesn't hold on Linux in general, so we can't really do this
for common code.

Also I had a long meeting with Suren, John and other googles
yesterday, and the aim is now to try and support all the Android gpu
memory accounting needs with cgroups. That should work, and it will
allow Android to handle all the Android-ism in a clean way in upstream
code. Or that's at least the plan.

I think the only thing we identified that Android still needs on top
is the dma-buf sysfs stuff, so that shared buffers (which on Android
are always dma-buf, and always stay around as dma-buf fd throughout
their lifetime) can be listed/analyzed with full detail.

But aside from this the plan for all the per-process or per-heap
account, oom-killer integration and everything else is planned to be
done with cgroups. Android (for now) only needs to account overall gpu
memory since none of it is swappable on android drivers anyway, plus
no vram, so not much needed.

Cheers, Daniel

>
> Regards,
> Christian.
>
> >
> >> In the best case this just messes up the accounting, in the worst case
> >> it can cause memory corruption.
> >>
> >> Christian.
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-09 20:03               ` Daniel Vetter
@ 2021-02-09 20:16                 ` Suren Baghdasaryan
  2021-02-10 13:06                   ` Daniel Vetter
  0 siblings, 1 reply; 33+ messages in thread
From: Suren Baghdasaryan @ 2021-02-09 20:16 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, John Stultz, lkml, Sumit Semwal, Liam Mark,
	Chris Goldsworthy, Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Sandeep Patil, Daniel Mentz, Ørjan Eide, Robin Murphy,
	Ezequiel Garcia, Simon Ser, James Jones, linux-media, dri-devel

On Tue, Feb 9, 2021 at 12:03 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Tue, Feb 9, 2021 at 6:46 PM Christian König <christian.koenig@amd.com> wrote:
> >
> >
> >
> > Am 09.02.21 um 18:33 schrieb Suren Baghdasaryan:
> > > On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig@amd.com> wrote:
> > >> Am 09.02.21 um 13:11 schrieb Christian König:
> > >>> [SNIP]
> > >>>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
> > >>>>>> +{
> > >>>>>> +     spin_lock(&pool->lock);
> > >>>>>> +     list_add_tail(&page->lru, &pool->items);
> > >>>>>> +     pool->count++;
> > >>>>>> +     atomic_long_add(1 << pool->order, &total_pages);
> > >>>>>> +     spin_unlock(&pool->lock);
> > >>>>>> +
> > >>>>>> +     mod_node_page_state(page_pgdat(page),
> > >>>>>> NR_KERNEL_MISC_RECLAIMABLE,
> > >>>>>> +                         1 << pool->order);
> > >>>>> Hui what? What should that be good for?
> > >>>> This is a carryover from the ION page pool implementation:
> > >>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cdccccff8edcd4d147a5b08d8cd20cff2%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637484888114923580%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=9%2BIBC0tezSV6Ci4S3kWfW%2BQvJm4mdunn3dF6C0kyfCw%3D&amp;reserved=0
> > >>>>
> > >>>>
> > >>>> My sense is it helps with the vmstat/meminfo accounting so folks can
> > >>>> see the cached pages are shrinkable/freeable. This maybe falls under
> > >>>> other dmabuf accounting/stats discussions, so I'm happy to remove it
> > >>>> for now, or let the drivers using the shared page pool logic handle
> > >>>> the accounting themselves?
> > >> Intentionally separated the discussion for that here.
> > >>
> > >> As far as I can see this is just bluntly incorrect.
> > >>
> > >> Either the page is reclaimable or it is part of our pool and freeable
> > >> through the shrinker, but never ever both.
> > > IIRC the original motivation for counting ION pooled pages as
> > > reclaimable was to include them into /proc/meminfo's MemAvailable
> > > calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
> > > non-slab kernel pages" seems like a good place to account for them but
> > > I might be wrong.
> >
> > Yeah, that's what I see here as well. But exactly that is utterly nonsense.
> >
> > Those pages are not "free" in the sense that get_free_page could return
> > them directly.
>
> Well on Android that is kinda true, because Android has it's
> oom-killer (way back it was just a shrinker callback, not sure how it
> works now), which just shot down all the background apps. So at least
> some of that (everything used by background apps) is indeed
> reclaimable on Android.
>
> But that doesn't hold on Linux in general, so we can't really do this
> for common code.
>
> Also I had a long meeting with Suren, John and other googles
> yesterday, and the aim is now to try and support all the Android gpu
> memory accounting needs with cgroups. That should work, and it will
> allow Android to handle all the Android-ism in a clean way in upstream
> code. Or that's at least the plan.
>
> I think the only thing we identified that Android still needs on top
> is the dma-buf sysfs stuff, so that shared buffers (which on Android
> are always dma-buf, and always stay around as dma-buf fd throughout
> their lifetime) can be listed/analyzed with full detail.
>
> But aside from this the plan for all the per-process or per-heap
> account, oom-killer integration and everything else is planned to be
> done with cgroups.

Until cgroups are ready we probably will need to add a sysfs node to
report the total dmabuf pool size and I think that would cover our
current accounting need here.
As I mentioned, not including dmabuf pools into MemAvailable would
affect that stat and I'm wondering if pools should be considered as
part of MemAvailable or not. Since MemAvailable includes SReclaimable
I think it makes sense to include them but maybe there are other
considerations that I'm missing?

> Android (for now) only needs to account overall gpu
> memory since none of it is swappable on android drivers anyway, plus
> no vram, so not much needed.
>
> Cheers, Daniel
>
> >
> > Regards,
> > Christian.
> >
> > >
> > >> In the best case this just messes up the accounting, in the worst case
> > >> it can cause memory corruption.
> > >>
> > >> Christian.
> >
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-09 20:16                 ` Suren Baghdasaryan
@ 2021-02-10 13:06                   ` Daniel Vetter
  2021-02-10 16:39                     ` Suren Baghdasaryan
  0 siblings, 1 reply; 33+ messages in thread
From: Daniel Vetter @ 2021-02-10 13:06 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: Daniel Vetter, Christian König, John Stultz, lkml,
	Sumit Semwal, Liam Mark, Chris Goldsworthy, Laura Abbott,
	Brian Starkey, Hridya Valsaraju, Sandeep Patil, Daniel Mentz,
	Ørjan Eide, Robin Murphy, Ezequiel Garcia, Simon Ser,
	James Jones, linux-media, dri-devel

On Tue, Feb 09, 2021 at 12:16:51PM -0800, Suren Baghdasaryan wrote:
> On Tue, Feb 9, 2021 at 12:03 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Tue, Feb 9, 2021 at 6:46 PM Christian König <christian.koenig@amd.com> wrote:
> > >
> > >
> > >
> > > Am 09.02.21 um 18:33 schrieb Suren Baghdasaryan:
> > > > On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig@amd.com> wrote:
> > > >> Am 09.02.21 um 13:11 schrieb Christian König:
> > > >>> [SNIP]
> > > >>>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
> > > >>>>>> +{
> > > >>>>>> +     spin_lock(&pool->lock);
> > > >>>>>> +     list_add_tail(&page->lru, &pool->items);
> > > >>>>>> +     pool->count++;
> > > >>>>>> +     atomic_long_add(1 << pool->order, &total_pages);
> > > >>>>>> +     spin_unlock(&pool->lock);
> > > >>>>>> +
> > > >>>>>> +     mod_node_page_state(page_pgdat(page),
> > > >>>>>> NR_KERNEL_MISC_RECLAIMABLE,
> > > >>>>>> +                         1 << pool->order);
> > > >>>>> Hui what? What should that be good for?
> > > >>>> This is a carryover from the ION page pool implementation:
> > > >>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cdccccff8edcd4d147a5b08d8cd20cff2%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637484888114923580%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=9%2BIBC0tezSV6Ci4S3kWfW%2BQvJm4mdunn3dF6C0kyfCw%3D&amp;reserved=0
> > > >>>>
> > > >>>>
> > > >>>> My sense is it helps with the vmstat/meminfo accounting so folks can
> > > >>>> see the cached pages are shrinkable/freeable. This maybe falls under
> > > >>>> other dmabuf accounting/stats discussions, so I'm happy to remove it
> > > >>>> for now, or let the drivers using the shared page pool logic handle
> > > >>>> the accounting themselves?
> > > >> Intentionally separated the discussion for that here.
> > > >>
> > > >> As far as I can see this is just bluntly incorrect.
> > > >>
> > > >> Either the page is reclaimable or it is part of our pool and freeable
> > > >> through the shrinker, but never ever both.
> > > > IIRC the original motivation for counting ION pooled pages as
> > > > reclaimable was to include them into /proc/meminfo's MemAvailable
> > > > calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
> > > > non-slab kernel pages" seems like a good place to account for them but
> > > > I might be wrong.
> > >
> > > Yeah, that's what I see here as well. But exactly that is utterly nonsense.
> > >
> > > Those pages are not "free" in the sense that get_free_page could return
> > > them directly.
> >
> > Well on Android that is kinda true, because Android has it's
> > oom-killer (way back it was just a shrinker callback, not sure how it
> > works now), which just shot down all the background apps. So at least
> > some of that (everything used by background apps) is indeed
> > reclaimable on Android.
> >
> > But that doesn't hold on Linux in general, so we can't really do this
> > for common code.
> >
> > Also I had a long meeting with Suren, John and other googles
> > yesterday, and the aim is now to try and support all the Android gpu
> > memory accounting needs with cgroups. That should work, and it will
> > allow Android to handle all the Android-ism in a clean way in upstream
> > code. Or that's at least the plan.
> >
> > I think the only thing we identified that Android still needs on top
> > is the dma-buf sysfs stuff, so that shared buffers (which on Android
> > are always dma-buf, and always stay around as dma-buf fd throughout
> > their lifetime) can be listed/analyzed with full detail.
> >
> > But aside from this the plan for all the per-process or per-heap
> > account, oom-killer integration and everything else is planned to be
> > done with cgroups.
> 
> Until cgroups are ready we probably will need to add a sysfs node to
> report the total dmabuf pool size and I think that would cover our
> current accounting need here.
> As I mentioned, not including dmabuf pools into MemAvailable would
> affect that stat and I'm wondering if pools should be considered as
> part of MemAvailable or not. Since MemAvailable includes SReclaimable
> I think it makes sense to include them but maybe there are other
> considerations that I'm missing?

On Android, yes, on upstream, not so much. Because upstream doesn't have
the android low memory killer cleanup up all the apps, so effectively we
can't reclaim that memory, and we shouldn't report it as such.
-Daniel

> 
> > Android (for now) only needs to account overall gpu
> > memory since none of it is swappable on android drivers anyway, plus
> > no vram, so not much needed.
> >
> > Cheers, Daniel
> >
> > >
> > > Regards,
> > > Christian.
> > >
> > > >
> > > >> In the best case this just messes up the accounting, in the worst case
> > > >> it can cause memory corruption.
> > > >>
> > > >> Christian.
> > >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-10 13:06                   ` Daniel Vetter
@ 2021-02-10 16:39                     ` Suren Baghdasaryan
  2021-02-10 17:21                       ` Daniel Vetter
  2021-02-10 18:32                       ` Christian König
  0 siblings, 2 replies; 33+ messages in thread
From: Suren Baghdasaryan @ 2021-02-10 16:39 UTC (permalink / raw)
  To: Suren Baghdasaryan, Christian König, John Stultz, lkml,
	Sumit Semwal, Liam Mark, Chris Goldsworthy, Laura Abbott,
	Brian Starkey, Hridya Valsaraju, Sandeep Patil, Daniel Mentz,
	Ørjan Eide, Robin Murphy, Ezequiel Garcia, Simon Ser,
	James Jones, linux-media, dri-devel
  Cc: Daniel Vetter

On Wed, Feb 10, 2021 at 5:06 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Tue, Feb 09, 2021 at 12:16:51PM -0800, Suren Baghdasaryan wrote:
> > On Tue, Feb 9, 2021 at 12:03 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Tue, Feb 9, 2021 at 6:46 PM Christian König <christian.koenig@amd.com> wrote:
> > > >
> > > >
> > > >
> > > > Am 09.02.21 um 18:33 schrieb Suren Baghdasaryan:
> > > > > On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig@amd.com> wrote:
> > > > >> Am 09.02.21 um 13:11 schrieb Christian König:
> > > > >>> [SNIP]
> > > > >>>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
> > > > >>>>>> +{
> > > > >>>>>> +     spin_lock(&pool->lock);
> > > > >>>>>> +     list_add_tail(&page->lru, &pool->items);
> > > > >>>>>> +     pool->count++;
> > > > >>>>>> +     atomic_long_add(1 << pool->order, &total_pages);
> > > > >>>>>> +     spin_unlock(&pool->lock);
> > > > >>>>>> +
> > > > >>>>>> +     mod_node_page_state(page_pgdat(page),
> > > > >>>>>> NR_KERNEL_MISC_RECLAIMABLE,
> > > > >>>>>> +                         1 << pool->order);
> > > > >>>>> Hui what? What should that be good for?
> > > > >>>> This is a carryover from the ION page pool implementation:
> > > > >>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cdccccff8edcd4d147a5b08d8cd20cff2%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637484888114923580%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=9%2BIBC0tezSV6Ci4S3kWfW%2BQvJm4mdunn3dF6C0kyfCw%3D&amp;reserved=0
> > > > >>>>
> > > > >>>>
> > > > >>>> My sense is it helps with the vmstat/meminfo accounting so folks can
> > > > >>>> see the cached pages are shrinkable/freeable. This maybe falls under
> > > > >>>> other dmabuf accounting/stats discussions, so I'm happy to remove it
> > > > >>>> for now, or let the drivers using the shared page pool logic handle
> > > > >>>> the accounting themselves?
> > > > >> Intentionally separated the discussion for that here.
> > > > >>
> > > > >> As far as I can see this is just bluntly incorrect.
> > > > >>
> > > > >> Either the page is reclaimable or it is part of our pool and freeable
> > > > >> through the shrinker, but never ever both.
> > > > > IIRC the original motivation for counting ION pooled pages as
> > > > > reclaimable was to include them into /proc/meminfo's MemAvailable
> > > > > calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
> > > > > non-slab kernel pages" seems like a good place to account for them but
> > > > > I might be wrong.
> > > >
> > > > Yeah, that's what I see here as well. But exactly that is utterly nonsense.
> > > >
> > > > Those pages are not "free" in the sense that get_free_page could return
> > > > them directly.
> > >
> > > Well on Android that is kinda true, because Android has it's
> > > oom-killer (way back it was just a shrinker callback, not sure how it
> > > works now), which just shot down all the background apps. So at least
> > > some of that (everything used by background apps) is indeed
> > > reclaimable on Android.
> > >
> > > But that doesn't hold on Linux in general, so we can't really do this
> > > for common code.
> > >
> > > Also I had a long meeting with Suren, John and other googles
> > > yesterday, and the aim is now to try and support all the Android gpu
> > > memory accounting needs with cgroups. That should work, and it will
> > > allow Android to handle all the Android-ism in a clean way in upstream
> > > code. Or that's at least the plan.
> > >
> > > I think the only thing we identified that Android still needs on top
> > > is the dma-buf sysfs stuff, so that shared buffers (which on Android
> > > are always dma-buf, and always stay around as dma-buf fd throughout
> > > their lifetime) can be listed/analyzed with full detail.
> > >
> > > But aside from this the plan for all the per-process or per-heap
> > > account, oom-killer integration and everything else is planned to be
> > > done with cgroups.
> >
> > Until cgroups are ready we probably will need to add a sysfs node to
> > report the total dmabuf pool size and I think that would cover our
> > current accounting need here.
> > As I mentioned, not including dmabuf pools into MemAvailable would
> > affect that stat and I'm wondering if pools should be considered as
> > part of MemAvailable or not. Since MemAvailable includes SReclaimable
> > I think it makes sense to include them but maybe there are other
> > considerations that I'm missing?
>
> On Android, yes, on upstream, not so much. Because upstream doesn't have
> the android low memory killer cleanup up all the apps, so effectively we
> can't reclaim that memory, and we shouldn't report it as such.
> -Daniel

Hmm. Sorry, I fail to see why Android's low memory killer makes a
difference here. In my mind, the pages in the pools are not used but
kept there in case heaps need them (maybe that's the part I'm wrong?).
These pages can be freed by the shrinker if memory pressure rises. In
that sense I think it's very similar to reclaimable slabs which are
already accounted as part of MemAvailable. So it seems logical to me
to include unused pages in the pools here as well. What am I missing?

>
> >
> > > Android (for now) only needs to account overall gpu
> > > memory since none of it is swappable on android drivers anyway, plus
> > > no vram, so not much needed.
> > >
> > > Cheers, Daniel
> > >
> > > >
> > > > Regards,
> > > > Christian.
> > > >
> > > > >
> > > > >> In the best case this just messes up the accounting, in the worst case
> > > > >> it can cause memory corruption.
> > > > >>
> > > > >> Christian.
> > > >
> > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-10 16:39                     ` Suren Baghdasaryan
@ 2021-02-10 17:21                       ` Daniel Vetter
  2021-02-10 17:28                         ` Suren Baghdasaryan
  2021-02-10 18:32                       ` Christian König
  1 sibling, 1 reply; 33+ messages in thread
From: Daniel Vetter @ 2021-02-10 17:21 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: Christian König, John Stultz, lkml, Sumit Semwal, Liam Mark,
	Chris Goldsworthy, Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Sandeep Patil, Daniel Mentz, Ørjan Eide, Robin Murphy,
	Ezequiel Garcia, Simon Ser, James Jones, linux-media, dri-devel

On Wed, Feb 10, 2021 at 5:39 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Wed, Feb 10, 2021 at 5:06 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Tue, Feb 09, 2021 at 12:16:51PM -0800, Suren Baghdasaryan wrote:
> > > On Tue, Feb 9, 2021 at 12:03 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Tue, Feb 9, 2021 at 6:46 PM Christian König <christian.koenig@amd.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > Am 09.02.21 um 18:33 schrieb Suren Baghdasaryan:
> > > > > > On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig@amd.com> wrote:
> > > > > >> Am 09.02.21 um 13:11 schrieb Christian König:
> > > > > >>> [SNIP]
> > > > > >>>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
> > > > > >>>>>> +{
> > > > > >>>>>> +     spin_lock(&pool->lock);
> > > > > >>>>>> +     list_add_tail(&page->lru, &pool->items);
> > > > > >>>>>> +     pool->count++;
> > > > > >>>>>> +     atomic_long_add(1 << pool->order, &total_pages);
> > > > > >>>>>> +     spin_unlock(&pool->lock);
> > > > > >>>>>> +
> > > > > >>>>>> +     mod_node_page_state(page_pgdat(page),
> > > > > >>>>>> NR_KERNEL_MISC_RECLAIMABLE,
> > > > > >>>>>> +                         1 << pool->order);
> > > > > >>>>> Hui what? What should that be good for?
> > > > > >>>> This is a carryover from the ION page pool implementation:
> > > > > >>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cdccccff8edcd4d147a5b08d8cd20cff2%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637484888114923580%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=9%2BIBC0tezSV6Ci4S3kWfW%2BQvJm4mdunn3dF6C0kyfCw%3D&amp;reserved=0
> > > > > >>>>
> > > > > >>>>
> > > > > >>>> My sense is it helps with the vmstat/meminfo accounting so folks can
> > > > > >>>> see the cached pages are shrinkable/freeable. This maybe falls under
> > > > > >>>> other dmabuf accounting/stats discussions, so I'm happy to remove it
> > > > > >>>> for now, or let the drivers using the shared page pool logic handle
> > > > > >>>> the accounting themselves?
> > > > > >> Intentionally separated the discussion for that here.
> > > > > >>
> > > > > >> As far as I can see this is just bluntly incorrect.
> > > > > >>
> > > > > >> Either the page is reclaimable or it is part of our pool and freeable
> > > > > >> through the shrinker, but never ever both.
> > > > > > IIRC the original motivation for counting ION pooled pages as
> > > > > > reclaimable was to include them into /proc/meminfo's MemAvailable
> > > > > > calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
> > > > > > non-slab kernel pages" seems like a good place to account for them but
> > > > > > I might be wrong.
> > > > >
> > > > > Yeah, that's what I see here as well. But exactly that is utterly nonsense.
> > > > >
> > > > > Those pages are not "free" in the sense that get_free_page could return
> > > > > them directly.
> > > >
> > > > Well on Android that is kinda true, because Android has it's
> > > > oom-killer (way back it was just a shrinker callback, not sure how it
> > > > works now), which just shot down all the background apps. So at least
> > > > some of that (everything used by background apps) is indeed
> > > > reclaimable on Android.
> > > >
> > > > But that doesn't hold on Linux in general, so we can't really do this
> > > > for common code.
> > > >
> > > > Also I had a long meeting with Suren, John and other googles
> > > > yesterday, and the aim is now to try and support all the Android gpu
> > > > memory accounting needs with cgroups. That should work, and it will
> > > > allow Android to handle all the Android-ism in a clean way in upstream
> > > > code. Or that's at least the plan.
> > > >
> > > > I think the only thing we identified that Android still needs on top
> > > > is the dma-buf sysfs stuff, so that shared buffers (which on Android
> > > > are always dma-buf, and always stay around as dma-buf fd throughout
> > > > their lifetime) can be listed/analyzed with full detail.
> > > >
> > > > But aside from this the plan for all the per-process or per-heap
> > > > account, oom-killer integration and everything else is planned to be
> > > > done with cgroups.
> > >
> > > Until cgroups are ready we probably will need to add a sysfs node to
> > > report the total dmabuf pool size and I think that would cover our
> > > current accounting need here.
> > > As I mentioned, not including dmabuf pools into MemAvailable would
> > > affect that stat and I'm wondering if pools should be considered as
> > > part of MemAvailable or not. Since MemAvailable includes SReclaimable
> > > I think it makes sense to include them but maybe there are other
> > > considerations that I'm missing?
> >
> > On Android, yes, on upstream, not so much. Because upstream doesn't have
> > the android low memory killer cleanup up all the apps, so effectively we
> > can't reclaim that memory, and we shouldn't report it as such.
> > -Daniel
>
> Hmm. Sorry, I fail to see why Android's low memory killer makes a
> difference here. In my mind, the pages in the pools are not used but
> kept there in case heaps need them (maybe that's the part I'm wrong?).
> These pages can be freed by the shrinker if memory pressure rises. In
> that sense I think it's very similar to reclaimable slabs which are
> already accounted as part of MemAvailable. So it seems logical to me
> to include unused pages in the pools here as well. What am I missing?

Ah yes, those page pool pages we can list. But conceptually (at least
in the internals) they're shrunk through mm shrinker callbacks, like
slab cache memory. So not exactly sure where to list that.

Since we have the same pools for gpu allocations on the ttm side and
John is looking into unifying those, maybe we could add that as a
patch on top? For some nice consistency across all gpu drivers from
android to discrete. I think if you, John and Christian from ttm side
can figure out how these page pools should be reported we'll have
something that fits? Maybe John can ping you on the other thread with
the shared pool rfc between ttm and dma-buf heaps (there's so much
going right now all over I'm a bit lost).

Cheers, Daniel

>
> >
> > >
> > > > Android (for now) only needs to account overall gpu
> > > > memory since none of it is swappable on android drivers anyway, plus
> > > > no vram, so not much needed.
> > > >
> > > > Cheers, Daniel
> > > >
> > > > >
> > > > > Regards,
> > > > > Christian.
> > > > >
> > > > > >
> > > > > >> In the best case this just messes up the accounting, in the worst case
> > > > > >> it can cause memory corruption.
> > > > > >>
> > > > > >> Christian.
> > > > >
> > > >
> > > >
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-10 17:21                       ` Daniel Vetter
@ 2021-02-10 17:28                         ` Suren Baghdasaryan
  0 siblings, 0 replies; 33+ messages in thread
From: Suren Baghdasaryan @ 2021-02-10 17:28 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, John Stultz, lkml, Sumit Semwal, Liam Mark,
	Chris Goldsworthy, Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Sandeep Patil, Daniel Mentz, Ørjan Eide, Robin Murphy,
	Ezequiel Garcia, Simon Ser, James Jones, linux-media, dri-devel

On Wed, Feb 10, 2021 at 9:21 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Wed, Feb 10, 2021 at 5:39 PM Suren Baghdasaryan <surenb@google.com> wrote:
> >
> > On Wed, Feb 10, 2021 at 5:06 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Tue, Feb 09, 2021 at 12:16:51PM -0800, Suren Baghdasaryan wrote:
> > > > On Tue, Feb 9, 2021 at 12:03 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >
> > > > > On Tue, Feb 9, 2021 at 6:46 PM Christian König <christian.koenig@amd.com> wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > Am 09.02.21 um 18:33 schrieb Suren Baghdasaryan:
> > > > > > > On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig@amd.com> wrote:
> > > > > > >> Am 09.02.21 um 13:11 schrieb Christian König:
> > > > > > >>> [SNIP]
> > > > > > >>>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
> > > > > > >>>>>> +{
> > > > > > >>>>>> +     spin_lock(&pool->lock);
> > > > > > >>>>>> +     list_add_tail(&page->lru, &pool->items);
> > > > > > >>>>>> +     pool->count++;
> > > > > > >>>>>> +     atomic_long_add(1 << pool->order, &total_pages);
> > > > > > >>>>>> +     spin_unlock(&pool->lock);
> > > > > > >>>>>> +
> > > > > > >>>>>> +     mod_node_page_state(page_pgdat(page),
> > > > > > >>>>>> NR_KERNEL_MISC_RECLAIMABLE,
> > > > > > >>>>>> +                         1 << pool->order);
> > > > > > >>>>> Hui what? What should that be good for?
> > > > > > >>>> This is a carryover from the ION page pool implementation:
> > > > > > >>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cdccccff8edcd4d147a5b08d8cd20cff2%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637484888114923580%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=9%2BIBC0tezSV6Ci4S3kWfW%2BQvJm4mdunn3dF6C0kyfCw%3D&amp;reserved=0
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>> My sense is it helps with the vmstat/meminfo accounting so folks can
> > > > > > >>>> see the cached pages are shrinkable/freeable. This maybe falls under
> > > > > > >>>> other dmabuf accounting/stats discussions, so I'm happy to remove it
> > > > > > >>>> for now, or let the drivers using the shared page pool logic handle
> > > > > > >>>> the accounting themselves?
> > > > > > >> Intentionally separated the discussion for that here.
> > > > > > >>
> > > > > > >> As far as I can see this is just bluntly incorrect.
> > > > > > >>
> > > > > > >> Either the page is reclaimable or it is part of our pool and freeable
> > > > > > >> through the shrinker, but never ever both.
> > > > > > > IIRC the original motivation for counting ION pooled pages as
> > > > > > > reclaimable was to include them into /proc/meminfo's MemAvailable
> > > > > > > calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
> > > > > > > non-slab kernel pages" seems like a good place to account for them but
> > > > > > > I might be wrong.
> > > > > >
> > > > > > Yeah, that's what I see here as well. But exactly that is utterly nonsense.
> > > > > >
> > > > > > Those pages are not "free" in the sense that get_free_page could return
> > > > > > them directly.
> > > > >
> > > > > Well on Android that is kinda true, because Android has it's
> > > > > oom-killer (way back it was just a shrinker callback, not sure how it
> > > > > works now), which just shot down all the background apps. So at least
> > > > > some of that (everything used by background apps) is indeed
> > > > > reclaimable on Android.
> > > > >
> > > > > But that doesn't hold on Linux in general, so we can't really do this
> > > > > for common code.
> > > > >
> > > > > Also I had a long meeting with Suren, John and other googles
> > > > > yesterday, and the aim is now to try and support all the Android gpu
> > > > > memory accounting needs with cgroups. That should work, and it will
> > > > > allow Android to handle all the Android-ism in a clean way in upstream
> > > > > code. Or that's at least the plan.
> > > > >
> > > > > I think the only thing we identified that Android still needs on top
> > > > > is the dma-buf sysfs stuff, so that shared buffers (which on Android
> > > > > are always dma-buf, and always stay around as dma-buf fd throughout
> > > > > their lifetime) can be listed/analyzed with full detail.
> > > > >
> > > > > But aside from this the plan for all the per-process or per-heap
> > > > > account, oom-killer integration and everything else is planned to be
> > > > > done with cgroups.
> > > >
> > > > Until cgroups are ready we probably will need to add a sysfs node to
> > > > report the total dmabuf pool size and I think that would cover our
> > > > current accounting need here.
> > > > As I mentioned, not including dmabuf pools into MemAvailable would
> > > > affect that stat and I'm wondering if pools should be considered as
> > > > part of MemAvailable or not. Since MemAvailable includes SReclaimable
> > > > I think it makes sense to include them but maybe there are other
> > > > considerations that I'm missing?
> > >
> > > On Android, yes, on upstream, not so much. Because upstream doesn't have
> > > the android low memory killer cleanup up all the apps, so effectively we
> > > can't reclaim that memory, and we shouldn't report it as such.
> > > -Daniel
> >
> > Hmm. Sorry, I fail to see why Android's low memory killer makes a
> > difference here. In my mind, the pages in the pools are not used but
> > kept there in case heaps need them (maybe that's the part I'm wrong?).
> > These pages can be freed by the shrinker if memory pressure rises. In
> > that sense I think it's very similar to reclaimable slabs which are
> > already accounted as part of MemAvailable. So it seems logical to me
> > to include unused pages in the pools here as well. What am I missing?
>
> Ah yes, those page pool pages we can list. But conceptually (at least
> in the internals) they're shrunk through mm shrinker callbacks, like
> slab cache memory. So not exactly sure where to list that.
>
> Since we have the same pools for gpu allocations on the ttm side and
> John is looking into unifying those, maybe we could add that as a
> patch on top? For some nice consistency across all gpu drivers from
> android to discrete. I think if you, John and Christian from ttm side
> can figure out how these page pools should be reported we'll have
> something that fits? Maybe John can ping you on the other thread with
> the shared pool rfc between ttm and dma-buf heaps (there's so much
> going right now all over I'm a bit lost).

Sounds good. I'll follow up with John to see where this discussion fits better.
Thanks!

>
> Cheers, Daniel
>
> >
> > >
> > > >
> > > > > Android (for now) only needs to account overall gpu
> > > > > memory since none of it is swappable on android drivers anyway, plus
> > > > > no vram, so not much needed.
> > > > >
> > > > > Cheers, Daniel
> > > > >
> > > > > >
> > > > > > Regards,
> > > > > > Christian.
> > > > > >
> > > > > > >
> > > > > > >> In the best case this just messes up the accounting, in the worst case
> > > > > > >> it can cause memory corruption.
> > > > > > >>
> > > > > > >> Christian.
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-10 16:39                     ` Suren Baghdasaryan
  2021-02-10 17:21                       ` Daniel Vetter
@ 2021-02-10 18:32                       ` Christian König
  2021-02-10 19:12                         ` Suren Baghdasaryan
  1 sibling, 1 reply; 33+ messages in thread
From: Christian König @ 2021-02-10 18:32 UTC (permalink / raw)
  To: Suren Baghdasaryan, John Stultz, lkml, Sumit Semwal, Liam Mark,
	Chris Goldsworthy, Laura Abbott, Brian Starkey, Hridya Valsaraju,
	Sandeep Patil, Daniel Mentz, Ørjan Eide, Robin Murphy,
	Ezequiel Garcia, Simon Ser, James Jones, linux-media, dri-devel
  Cc: Daniel Vetter



Am 10.02.21 um 17:39 schrieb Suren Baghdasaryan:
> On Wed, Feb 10, 2021 at 5:06 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Tue, Feb 09, 2021 at 12:16:51PM -0800, Suren Baghdasaryan wrote:
>>> On Tue, Feb 9, 2021 at 12:03 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>> On Tue, Feb 9, 2021 at 6:46 PM Christian König <christian.koenig@amd.com> wrote:
>>>>>
>>>>>
>>>>> Am 09.02.21 um 18:33 schrieb Suren Baghdasaryan:
>>>>>> On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig@amd.com> wrote:
>>>>>>> Am 09.02.21 um 13:11 schrieb Christian König:
>>>>>>>> [SNIP]
>>>>>>>>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
>>>>>>>>>>> +{
>>>>>>>>>>> +     spin_lock(&pool->lock);
>>>>>>>>>>> +     list_add_tail(&page->lru, &pool->items);
>>>>>>>>>>> +     pool->count++;
>>>>>>>>>>> +     atomic_long_add(1 << pool->order, &total_pages);
>>>>>>>>>>> +     spin_unlock(&pool->lock);
>>>>>>>>>>> +
>>>>>>>>>>> +     mod_node_page_state(page_pgdat(page),
>>>>>>>>>>> NR_KERNEL_MISC_RECLAIMABLE,
>>>>>>>>>>> +                         1 << pool->order);
>>>>>>>>>> Hui what? What should that be good for?
>>>>>>>>> This is a carryover from the ION page pool implementation:
>>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cbb7155447ee149a49f3a08d8cde2685d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485719618339413%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=IYsJoAd7SUo12V7tS3CCRqNVm569iy%2FtoXQqm2MdC1g%3D&amp;reserved=0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> My sense is it helps with the vmstat/meminfo accounting so folks can
>>>>>>>>> see the cached pages are shrinkable/freeable. This maybe falls under
>>>>>>>>> other dmabuf accounting/stats discussions, so I'm happy to remove it
>>>>>>>>> for now, or let the drivers using the shared page pool logic handle
>>>>>>>>> the accounting themselves?
>>>>>>> Intentionally separated the discussion for that here.
>>>>>>>
>>>>>>> As far as I can see this is just bluntly incorrect.
>>>>>>>
>>>>>>> Either the page is reclaimable or it is part of our pool and freeable
>>>>>>> through the shrinker, but never ever both.
>>>>>> IIRC the original motivation for counting ION pooled pages as
>>>>>> reclaimable was to include them into /proc/meminfo's MemAvailable
>>>>>> calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
>>>>>> non-slab kernel pages" seems like a good place to account for them but
>>>>>> I might be wrong.
>>>>> Yeah, that's what I see here as well. But exactly that is utterly nonsense.
>>>>>
>>>>> Those pages are not "free" in the sense that get_free_page could return
>>>>> them directly.
>>>> Well on Android that is kinda true, because Android has it's
>>>> oom-killer (way back it was just a shrinker callback, not sure how it
>>>> works now), which just shot down all the background apps. So at least
>>>> some of that (everything used by background apps) is indeed
>>>> reclaimable on Android.
>>>>
>>>> But that doesn't hold on Linux in general, so we can't really do this
>>>> for common code.
>>>>
>>>> Also I had a long meeting with Suren, John and other googles
>>>> yesterday, and the aim is now to try and support all the Android gpu
>>>> memory accounting needs with cgroups. That should work, and it will
>>>> allow Android to handle all the Android-ism in a clean way in upstream
>>>> code. Or that's at least the plan.
>>>>
>>>> I think the only thing we identified that Android still needs on top
>>>> is the dma-buf sysfs stuff, so that shared buffers (which on Android
>>>> are always dma-buf, and always stay around as dma-buf fd throughout
>>>> their lifetime) can be listed/analyzed with full detail.
>>>>
>>>> But aside from this the plan for all the per-process or per-heap
>>>> account, oom-killer integration and everything else is planned to be
>>>> done with cgroups.
>>> Until cgroups are ready we probably will need to add a sysfs node to
>>> report the total dmabuf pool size and I think that would cover our
>>> current accounting need here.
>>> As I mentioned, not including dmabuf pools into MemAvailable would
>>> affect that stat and I'm wondering if pools should be considered as
>>> part of MemAvailable or not. Since MemAvailable includes SReclaimable
>>> I think it makes sense to include them but maybe there are other
>>> considerations that I'm missing?
>> On Android, yes, on upstream, not so much. Because upstream doesn't have
>> the android low memory killer cleanup up all the apps, so effectively we
>> can't reclaim that memory, and we shouldn't report it as such.
>> -Daniel
> Hmm. Sorry, I fail to see why Android's low memory killer makes a
> difference here. In my mind, the pages in the pools are not used but
> kept there in case heaps need them (maybe that's the part I'm wrong?).
> These pages can be freed by the shrinker if memory pressure rises.

And exactly that's the difference. They *can* be freed is not the same 
thing as they *are* free.

> In that sense I think it's very similar to reclaimable slabs which are
> already accounted as part of MemAvailable. So it seems logical to me
> to include unused pages in the pools here as well. What am I missing?

See the shrinkers are there because you need to do some action before 
you can re-use the memory.

In the case of the TTM/DRM pool for example you need to change the 
caching attributes which might cause sleeping for a TLB flush to finish.

By accounting those pages as free you mess up (for example) the handling 
which makes sure that there are enough emergency reserves. I can only 
strongly recommend to not do that.

What you could do is to add a sysfs interface to expose the different 
shrinkers and the amount of pages in them to userspace. Similar to what 
/proc/slabinfo is doing.

Regards,
Christian.

>
>>>> Android (for now) only needs to account overall gpu
>>>> memory since none of it is swappable on android drivers anyway, plus
>>>> no vram, so not much needed.
>>>>
>>>> Cheers, Daniel
>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>>> In the best case this just messes up the accounting, in the worst case
>>>>>>> it can cause memory corruption.
>>>>>>>
>>>>>>> Christian.
>>>>
>>>> --
>>>> Daniel Vetter
>>>> Software Engineer, Intel Corporation
>>>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cbb7155447ee149a49f3a08d8cde2685d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485719618349407%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=11ENl28PIoSoUx6FhkEK9u4G6yiLc3YhsYsl1DIzsv8%3D&amp;reserved=0
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cbb7155447ee149a49f3a08d8cde2685d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485719618349407%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=11ENl28PIoSoUx6FhkEK9u4G6yiLc3YhsYsl1DIzsv8%3D&amp;reserved=0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-10 18:32                       ` Christian König
@ 2021-02-10 19:12                         ` Suren Baghdasaryan
  2021-02-10 19:23                           ` Christian König
  0 siblings, 1 reply; 33+ messages in thread
From: Suren Baghdasaryan @ 2021-02-10 19:12 UTC (permalink / raw)
  To: Christian König
  Cc: John Stultz, lkml, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju, Sandeep Patil,
	Daniel Mentz, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel, Daniel Vetter

On Wed, Feb 10, 2021 at 10:32 AM Christian König
<christian.koenig@amd.com> wrote:
>
>
>
> Am 10.02.21 um 17:39 schrieb Suren Baghdasaryan:
> > On Wed, Feb 10, 2021 at 5:06 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >> On Tue, Feb 09, 2021 at 12:16:51PM -0800, Suren Baghdasaryan wrote:
> >>> On Tue, Feb 9, 2021 at 12:03 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >>>> On Tue, Feb 9, 2021 at 6:46 PM Christian König <christian.koenig@amd.com> wrote:
> >>>>>
> >>>>>
> >>>>> Am 09.02.21 um 18:33 schrieb Suren Baghdasaryan:
> >>>>>> On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig@amd.com> wrote:
> >>>>>>> Am 09.02.21 um 13:11 schrieb Christian König:
> >>>>>>>> [SNIP]
> >>>>>>>>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
> >>>>>>>>>>> +{
> >>>>>>>>>>> +     spin_lock(&pool->lock);
> >>>>>>>>>>> +     list_add_tail(&page->lru, &pool->items);
> >>>>>>>>>>> +     pool->count++;
> >>>>>>>>>>> +     atomic_long_add(1 << pool->order, &total_pages);
> >>>>>>>>>>> +     spin_unlock(&pool->lock);
> >>>>>>>>>>> +
> >>>>>>>>>>> +     mod_node_page_state(page_pgdat(page),
> >>>>>>>>>>> NR_KERNEL_MISC_RECLAIMABLE,
> >>>>>>>>>>> +                         1 << pool->order);
> >>>>>>>>>> Hui what? What should that be good for?
> >>>>>>>>> This is a carryover from the ION page pool implementation:
> >>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cbb7155447ee149a49f3a08d8cde2685d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485719618339413%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=IYsJoAd7SUo12V7tS3CCRqNVm569iy%2FtoXQqm2MdC1g%3D&amp;reserved=0
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> My sense is it helps with the vmstat/meminfo accounting so folks can
> >>>>>>>>> see the cached pages are shrinkable/freeable. This maybe falls under
> >>>>>>>>> other dmabuf accounting/stats discussions, so I'm happy to remove it
> >>>>>>>>> for now, or let the drivers using the shared page pool logic handle
> >>>>>>>>> the accounting themselves?
> >>>>>>> Intentionally separated the discussion for that here.
> >>>>>>>
> >>>>>>> As far as I can see this is just bluntly incorrect.
> >>>>>>>
> >>>>>>> Either the page is reclaimable or it is part of our pool and freeable
> >>>>>>> through the shrinker, but never ever both.
> >>>>>> IIRC the original motivation for counting ION pooled pages as
> >>>>>> reclaimable was to include them into /proc/meminfo's MemAvailable
> >>>>>> calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
> >>>>>> non-slab kernel pages" seems like a good place to account for them but
> >>>>>> I might be wrong.
> >>>>> Yeah, that's what I see here as well. But exactly that is utterly nonsense.
> >>>>>
> >>>>> Those pages are not "free" in the sense that get_free_page could return
> >>>>> them directly.
> >>>> Well on Android that is kinda true, because Android has it's
> >>>> oom-killer (way back it was just a shrinker callback, not sure how it
> >>>> works now), which just shot down all the background apps. So at least
> >>>> some of that (everything used by background apps) is indeed
> >>>> reclaimable on Android.
> >>>>
> >>>> But that doesn't hold on Linux in general, so we can't really do this
> >>>> for common code.
> >>>>
> >>>> Also I had a long meeting with Suren, John and other googles
> >>>> yesterday, and the aim is now to try and support all the Android gpu
> >>>> memory accounting needs with cgroups. That should work, and it will
> >>>> allow Android to handle all the Android-ism in a clean way in upstream
> >>>> code. Or that's at least the plan.
> >>>>
> >>>> I think the only thing we identified that Android still needs on top
> >>>> is the dma-buf sysfs stuff, so that shared buffers (which on Android
> >>>> are always dma-buf, and always stay around as dma-buf fd throughout
> >>>> their lifetime) can be listed/analyzed with full detail.
> >>>>
> >>>> But aside from this the plan for all the per-process or per-heap
> >>>> account, oom-killer integration and everything else is planned to be
> >>>> done with cgroups.
> >>> Until cgroups are ready we probably will need to add a sysfs node to
> >>> report the total dmabuf pool size and I think that would cover our
> >>> current accounting need here.
> >>> As I mentioned, not including dmabuf pools into MemAvailable would
> >>> affect that stat and I'm wondering if pools should be considered as
> >>> part of MemAvailable or not. Since MemAvailable includes SReclaimable
> >>> I think it makes sense to include them but maybe there are other
> >>> considerations that I'm missing?
> >> On Android, yes, on upstream, not so much. Because upstream doesn't have
> >> the android low memory killer cleanup up all the apps, so effectively we
> >> can't reclaim that memory, and we shouldn't report it as such.
> >> -Daniel
> > Hmm. Sorry, I fail to see why Android's low memory killer makes a
> > difference here. In my mind, the pages in the pools are not used but
> > kept there in case heaps need them (maybe that's the part I'm wrong?).
> > These pages can be freed by the shrinker if memory pressure rises.
>
> And exactly that's the difference. They *can* be freed is not the same
> thing as they *are* free.

No argument there. That's why I think meminfo has two separate stats
for MemFree and MemAvailable. MemFree is self-explanatory. The
description of MemAvailable in
https://www.kernel.org/doc/Documentation/filesystems/proc.txt says "An
estimate of how much memory is available for starting new
applications, without swapping.". Since dropping unused pages from
slabs, caches and pools is less expensive than swapping, I would
assume that a well-behaved system would do that before resorting to
swapping. And if so, such memory should be included in MemAvailable
because VM will make it available before swapping. But again, that's
my interpretation. WDYT?

>
> > In that sense I think it's very similar to reclaimable slabs which are
> > already accounted as part of MemAvailable. So it seems logical to me
> > to include unused pages in the pools here as well. What am I missing?
>
> See the shrinkers are there because you need to do some action before
> you can re-use the memory.
>
> In the case of the TTM/DRM pool for example you need to change the
> caching attributes which might cause sleeping for a TLB flush to finish.

I see your point here. But how about caches/pools which can be easily
dropped? Shouldn't they be part of MemAvailable?

>
> By accounting those pages as free you mess up (for example) the handling
> which makes sure that there are enough emergency reserves. I can only
> strongly recommend to not do that.
>
> What you could do is to add a sysfs interface to expose the different
> shrinkers and the amount of pages in them to userspace. Similar to what
> /proc/slabinfo is doing.

True, we can work around this. Just want to make sure whatever we do
really makes sense.
Thanks,
Suren.

>
> Regards,
> Christian.
>
> >
> >>>> Android (for now) only needs to account overall gpu
> >>>> memory since none of it is swappable on android drivers anyway, plus
> >>>> no vram, so not much needed.
> >>>>
> >>>> Cheers, Daniel
> >>>>
> >>>>> Regards,
> >>>>> Christian.
> >>>>>
> >>>>>>> In the best case this just messes up the accounting, in the worst case
> >>>>>>> it can cause memory corruption.
> >>>>>>>
> >>>>>>> Christian.
> >>>>
> >>>> --
> >>>> Daniel Vetter
> >>>> Software Engineer, Intel Corporation
> >>>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cbb7155447ee149a49f3a08d8cde2685d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485719618349407%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=11ENl28PIoSoUx6FhkEK9u4G6yiLc3YhsYsl1DIzsv8%3D&amp;reserved=0
> >> --
> >> Daniel Vetter
> >> Software Engineer, Intel Corporation
> >> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cbb7155447ee149a49f3a08d8cde2685d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485719618349407%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=11ENl28PIoSoUx6FhkEK9u4G6yiLc3YhsYsl1DIzsv8%3D&amp;reserved=0
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
  2021-02-10 19:12                         ` Suren Baghdasaryan
@ 2021-02-10 19:23                           ` Christian König
  0 siblings, 0 replies; 33+ messages in thread
From: Christian König @ 2021-02-10 19:23 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: John Stultz, lkml, Sumit Semwal, Liam Mark, Chris Goldsworthy,
	Laura Abbott, Brian Starkey, Hridya Valsaraju, Sandeep Patil,
	Daniel Mentz, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel, Daniel Vetter



Am 10.02.21 um 20:12 schrieb Suren Baghdasaryan:
> On Wed, Feb 10, 2021 at 10:32 AM Christian König
> <christian.koenig@amd.com> wrote:
>>
>>
>> Am 10.02.21 um 17:39 schrieb Suren Baghdasaryan:
>>> On Wed, Feb 10, 2021 at 5:06 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>> On Tue, Feb 09, 2021 at 12:16:51PM -0800, Suren Baghdasaryan wrote:
>>>>> On Tue, Feb 9, 2021 at 12:03 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>> On Tue, Feb 9, 2021 at 6:46 PM Christian König <christian.koenig@amd.com> wrote:
>>>>>>>
>>>>>>> Am 09.02.21 um 18:33 schrieb Suren Baghdasaryan:
>>>>>>>> On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig@amd.com> wrote:
>>>>>>>>> Am 09.02.21 um 13:11 schrieb Christian König:
>>>>>>>>>> [SNIP]
>>>>>>>>>>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +     spin_lock(&pool->lock);
>>>>>>>>>>>>> +     list_add_tail(&page->lru, &pool->items);
>>>>>>>>>>>>> +     pool->count++;
>>>>>>>>>>>>> +     atomic_long_add(1 << pool->order, &total_pages);
>>>>>>>>>>>>> +     spin_unlock(&pool->lock);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +     mod_node_page_state(page_pgdat(page),
>>>>>>>>>>>>> NR_KERNEL_MISC_RECLAIMABLE,
>>>>>>>>>>>>> +                         1 << pool->order);
>>>>>>>>>>>> Hui what? What should that be good for?
>>>>>>>>>>> This is a carryover from the ION page pool implementation:
>>>>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C4c92877de38440ef18ee08d8cdf7c8ab%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485811429326241%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=Ghp48YBj6R3z4yKsj8ecjXxhZp8g5sJOL0GJRUtCFKY%3D&amp;reserved=0
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> My sense is it helps with the vmstat/meminfo accounting so folks can
>>>>>>>>>>> see the cached pages are shrinkable/freeable. This maybe falls under
>>>>>>>>>>> other dmabuf accounting/stats discussions, so I'm happy to remove it
>>>>>>>>>>> for now, or let the drivers using the shared page pool logic handle
>>>>>>>>>>> the accounting themselves?
>>>>>>>>> Intentionally separated the discussion for that here.
>>>>>>>>>
>>>>>>>>> As far as I can see this is just bluntly incorrect.
>>>>>>>>>
>>>>>>>>> Either the page is reclaimable or it is part of our pool and freeable
>>>>>>>>> through the shrinker, but never ever both.
>>>>>>>> IIRC the original motivation for counting ION pooled pages as
>>>>>>>> reclaimable was to include them into /proc/meminfo's MemAvailable
>>>>>>>> calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
>>>>>>>> non-slab kernel pages" seems like a good place to account for them but
>>>>>>>> I might be wrong.
>>>>>>> Yeah, that's what I see here as well. But exactly that is utterly nonsense.
>>>>>>>
>>>>>>> Those pages are not "free" in the sense that get_free_page could return
>>>>>>> them directly.
>>>>>> Well on Android that is kinda true, because Android has it's
>>>>>> oom-killer (way back it was just a shrinker callback, not sure how it
>>>>>> works now), which just shot down all the background apps. So at least
>>>>>> some of that (everything used by background apps) is indeed
>>>>>> reclaimable on Android.
>>>>>>
>>>>>> But that doesn't hold on Linux in general, so we can't really do this
>>>>>> for common code.
>>>>>>
>>>>>> Also I had a long meeting with Suren, John and other googles
>>>>>> yesterday, and the aim is now to try and support all the Android gpu
>>>>>> memory accounting needs with cgroups. That should work, and it will
>>>>>> allow Android to handle all the Android-ism in a clean way in upstream
>>>>>> code. Or that's at least the plan.
>>>>>>
>>>>>> I think the only thing we identified that Android still needs on top
>>>>>> is the dma-buf sysfs stuff, so that shared buffers (which on Android
>>>>>> are always dma-buf, and always stay around as dma-buf fd throughout
>>>>>> their lifetime) can be listed/analyzed with full detail.
>>>>>>
>>>>>> But aside from this the plan for all the per-process or per-heap
>>>>>> account, oom-killer integration and everything else is planned to be
>>>>>> done with cgroups.
>>>>> Until cgroups are ready we probably will need to add a sysfs node to
>>>>> report the total dmabuf pool size and I think that would cover our
>>>>> current accounting need here.
>>>>> As I mentioned, not including dmabuf pools into MemAvailable would
>>>>> affect that stat and I'm wondering if pools should be considered as
>>>>> part of MemAvailable or not. Since MemAvailable includes SReclaimable
>>>>> I think it makes sense to include them but maybe there are other
>>>>> considerations that I'm missing?
>>>> On Android, yes, on upstream, not so much. Because upstream doesn't have
>>>> the android low memory killer cleanup up all the apps, so effectively we
>>>> can't reclaim that memory, and we shouldn't report it as such.
>>>> -Daniel
>>> Hmm. Sorry, I fail to see why Android's low memory killer makes a
>>> difference here. In my mind, the pages in the pools are not used but
>>> kept there in case heaps need them (maybe that's the part I'm wrong?).
>>> These pages can be freed by the shrinker if memory pressure rises.
>> And exactly that's the difference. They *can* be freed is not the same
>> thing as they *are* free.
> No argument there. That's why I think meminfo has two separate stats
> for MemFree and MemAvailable. MemFree is self-explanatory. The
> description of MemAvailable in
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.kernel.org%2Fdoc%2FDocumentation%2Ffilesystems%2Fproc.txt&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C4c92877de38440ef18ee08d8cdf7c8ab%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485811429326241%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=0MI4k9YtAnGvP0QlBYHC6QMCJvQpZoaWMRlqjptvwsA%3D&amp;reserved=0 says "An
> estimate of how much memory is available for starting new
> applications, without swapping.". Since dropping unused pages from
> slabs, caches and pools is less expensive than swapping, I would
> assume that a well-behaved system would do that before resorting to
> swapping. And if so, such memory should be included in MemAvailable
> because VM will make it available before swapping. But again, that's
> my interpretation. WDYT?
>
>>> In that sense I think it's very similar to reclaimable slabs which are
>>> already accounted as part of MemAvailable. So it seems logical to me
>>> to include unused pages in the pools here as well. What am I missing?
>> See the shrinkers are there because you need to do some action before
>> you can re-use the memory.
>>
>> In the case of the TTM/DRM pool for example you need to change the
>> caching attributes which might cause sleeping for a TLB flush to finish.
> I see your point here. But how about caches/pools which can be easily
> dropped? Shouldn't they be part of MemAvailable?

Well if it can be easily dropped and reallocated we wouldn't have a pool 
for that in the first place :)

I mean we have created this page pool because the TLB shot down for the 
caching change is extremely costly.

It could make sense for slap, but not sure about that.

Regards,
Christian.

>
>> By accounting those pages as free you mess up (for example) the handling
>> which makes sure that there are enough emergency reserves. I can only
>> strongly recommend to not do that.
>>
>> What you could do is to add a sysfs interface to expose the different
>> shrinkers and the amount of pages in them to userspace. Similar to what
>> /proc/slabinfo is doing.
> True, we can work around this. Just want to make sure whatever we do
> really makes sense.
> Thanks,
> Suren.
>
>> Regards,
>> Christian.
>>
>>>>>> Android (for now) only needs to account overall gpu
>>>>>> memory since none of it is swappable on android drivers anyway, plus
>>>>>> no vram, so not much needed.
>>>>>>
>>>>>> Cheers, Daniel
>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>>> In the best case this just messes up the accounting, in the worst case
>>>>>>>>> it can cause memory corruption.
>>>>>>>>>
>>>>>>>>> Christian.
>>>>>> --
>>>>>> Daniel Vetter
>>>>>> Software Engineer, Intel Corporation
>>>>>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C4c92877de38440ef18ee08d8cdf7c8ab%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485811429326241%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=vqR0DAK5XkzanA4fqN0hxDm8ZbCazpvlqTDMUqxhDoA%3D&amp;reserved=0
>>>> --
>>>> Daniel Vetter
>>>> Software Engineer, Intel Corporation
>>>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C4c92877de38440ef18ee08d8cdf7c8ab%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485811429326241%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=vqR0DAK5XkzanA4fqN0hxDm8ZbCazpvlqTDMUqxhDoA%3D&amp;reserved=0


^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2021-02-10 19:24 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-05  8:06 [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap John Stultz
2021-02-05  8:06 ` [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation John Stultz
2021-02-05  8:46   ` Christian König
2021-02-05 20:46     ` John Stultz
2021-02-05 22:38       ` Suren Baghdasaryan
2021-02-09 12:11       ` Christian König
2021-02-09 12:57         ` Christian König
2021-02-09 17:33           ` Suren Baghdasaryan
2021-02-09 17:46             ` Christian König
2021-02-09 17:52               ` Suren Baghdasaryan
2021-02-09 20:03               ` Daniel Vetter
2021-02-09 20:16                 ` Suren Baghdasaryan
2021-02-10 13:06                   ` Daniel Vetter
2021-02-10 16:39                     ` Suren Baghdasaryan
2021-02-10 17:21                       ` Daniel Vetter
2021-02-10 17:28                         ` Suren Baghdasaryan
2021-02-10 18:32                       ` Christian König
2021-02-10 19:12                         ` Suren Baghdasaryan
2021-02-10 19:23                           ` Christian König
2021-02-09 17:51         ` John Stultz
2021-02-05  8:06 ` [RFC][PATCH v6 2/7] drm: ttm_pool: Rename the ttm_pool_dma structure to ttm_pool_page_dat John Stultz
2021-02-05  8:06 ` [RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer John Stultz
2021-02-05  8:28   ` Christian König
2021-02-05 19:47     ` John Stultz
2021-02-09 12:13       ` Christian König
2021-02-09 17:39         ` John Stultz
2021-02-05  8:06 ` [RFC][PATCH v6 4/7] drm: ttm_pool: Rework ttm_pool to use drm_page_pool John Stultz
2021-02-05  8:50   ` Christian König
2021-02-05  8:06 ` [RFC][PATCH v6 5/7] dma-buf: heaps: Add deferred-free-helper library code John Stultz
2021-02-05  8:06 ` [RFC][PATCH v6 6/7] dma-buf: system_heap: Add drm pagepool support to system heap John Stultz
2021-02-05  8:06 ` [RFC][PATCH v6 7/7] dma-buf: system_heap: Add deferred freeing to the " John Stultz
2021-02-05 10:36 ` [RFC][PATCH v6 0/7] Generic page pool & deferred freeing for system dmabuf heap Christian König
2021-02-05 20:57   ` John Stultz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).