All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] drm/ttm: hard to swim in an empty pool
@ 2014-08-13  3:52 Jérôme Glisse
  2014-08-13  3:52 ` [PATCH 1/3] drm/ttm: set sensible pool size limit Jérôme Glisse
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Jérôme Glisse @ 2014-08-13  3:52 UTC (permalink / raw)
  To: dri-devel

So it seems there was several issue with the various ttm pool. The obvious
one is fixed in patch 2 where the always empty pool syndrom is addressed.
However the pool size are kind of crazy and because before some pool were
never actualy fill we might never have experience the hill effect of the
crazy maximum limit. This is what is addressed by first patch.

Last patch cook it up further so that under memory pressure the pool size
is divided by 2 each time a shrinker is run on a pool. There is a timeout
to restore the pool size on next allocation. Idea here is that memory should
not last and if it last then shrinker will keep minimize the pool size and
anyway thing are probably already sluggish once we it the shrinker path.

Of course because this fix thing in ttm memory allocation this need careful
testing. So before pushing anything i would like to see more people testing
this.

Cheers,
Jérôme

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/3] drm/ttm: set sensible pool size limit.
  2014-08-13  3:52 [PATCH 0/3] drm/ttm: hard to swim in an empty pool Jérôme Glisse
@ 2014-08-13  3:52 ` Jérôme Glisse
  2014-08-13  6:24   ` Michel Dänzer
  2014-08-13  3:52 ` [PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool Jérôme Glisse
  2014-08-13  3:52 ` [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool Jérôme Glisse
  2 siblings, 1 reply; 29+ messages in thread
From: Jérôme Glisse @ 2014-08-13  3:52 UTC (permalink / raw)
  To: dri-devel
  Cc: Michel Dänzer, Jérôme Glisse,
	Konrad Rzeszutek Wilk, Thomas Hellstrom

From: Jérôme Glisse <jglisse@redhat.com>

Due to bug in code it appear that some of the pool where never properly
use and always empty. Before fixing that bug this patch set sensible
limit on pool size. The magic 64MB number was nominated.

This is obviously a some what arbitrary number but the intend of ttm pool
is to minimize page alloc cost especialy when allocating page that will be
mark to be excluded from cpu cache mecanisms. We assume that mostly small
buffer that are constantly allocated/deallocated might suffer from core
memory allocation overhead as well as cache status change. This are the
assumptions behind the 64MB value.

This obviously need some serious testing including monitoring pool size.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/gpu/drm/ttm/ttm_memory.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_memory.c b/drivers/gpu/drm/ttm/ttm_memory.c
index dbc2def..73b2ded 100644
--- a/drivers/gpu/drm/ttm/ttm_memory.c
+++ b/drivers/gpu/drm/ttm/ttm_memory.c
@@ -38,6 +38,16 @@
 #include <linux/slab.h>
 
 #define TTM_MEMORY_ALLOC_RETRIES 4
+/* Have a maximum of 64MB of memory inside the pool.
+ *
+ * This is obviously a some what arbitrary number but the intend of ttm pool
+ * is to minimize page alloc cost especialy when allocating page that will be
+ * mark to be excluded from cpu cache mecanisms. We assume that mostly small
+ * buffer that are constantly allocated/deallocated might suffer from core
+ * memory allocation overhead as well as cache status change. This are the
+ * assumptions behind the 64MB value.
+ */
+#define MAX_POOL_SIZE (64UL << 20UL)
 
 struct ttm_mem_zone {
 	struct kobject kobj;
@@ -363,6 +373,7 @@ int ttm_mem_global_init(struct ttm_mem_global *glob)
 	int ret;
 	int i;
 	struct ttm_mem_zone *zone;
+	unsigned long max_pool_size;
 
 	spin_lock_init(&glob->lock);
 	glob->swap_queue = create_singlethread_workqueue("ttm_swap");
@@ -393,8 +404,9 @@ int ttm_mem_global_init(struct ttm_mem_global *glob)
 		pr_info("Zone %7s: Available graphics memory: %llu kiB\n",
 			zone->name, (unsigned long long)zone->max_mem >> 10);
 	}
-	ttm_page_alloc_init(glob, glob->zone_kernel->max_mem/(2*PAGE_SIZE));
-	ttm_dma_page_alloc_init(glob, glob->zone_kernel->max_mem/(2*PAGE_SIZE));
+	max_pool_size = min(glob->zone_kernel->max_mem >> 3UL, MAX_POOL_SIZE);
+	ttm_page_alloc_init(glob, max_pool_size / (2 * PAGE_SIZE));
+	ttm_dma_page_alloc_init(glob, max_pool_size / (2 * PAGE_SIZE));
 	return 0;
 out_no_zone:
 	ttm_mem_global_release(glob);
-- 
1.9.3

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool.
  2014-08-13  3:52 [PATCH 0/3] drm/ttm: hard to swim in an empty pool Jérôme Glisse
  2014-08-13  3:52 ` [PATCH 1/3] drm/ttm: set sensible pool size limit Jérôme Glisse
@ 2014-08-13  3:52 ` Jérôme Glisse
  2015-03-25 19:06   ` Konrad Rzeszutek Wilk
  2015-07-06  9:11   ` Michel Dänzer
  2014-08-13  3:52 ` [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool Jérôme Glisse
  2 siblings, 2 replies; 29+ messages in thread
From: Jérôme Glisse @ 2014-08-13  3:52 UTC (permalink / raw)
  To: dri-devel
  Cc: Jérôme Glisse, Thomas Hellstrom, Konrad Rzeszutek Wilk

From: Jérôme Glisse <jglisse@redhat.com>

Current code never allowed the page pool to actualy fill in anyway. This fix
it and also allow it to grow over its limit until it grow beyond the batch
size for allocation and deallocation.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Tested-by: Michel Dänzer <michel@daenzer.net>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
index c96db43..a076ff3 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
@@ -953,14 +953,9 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
 	} else {
 		pool->npages_free += count;
 		list_splice(&ttm_dma->pages_list, &pool->free_list);
-		npages = count;
-		if (pool->npages_free > _manager->options.max_size) {
+		if (pool->npages_free >= (_manager->options.max_size +
+					  NUM_PAGES_TO_ALLOC))
 			npages = pool->npages_free - _manager->options.max_size;
-			/* free at least NUM_PAGES_TO_ALLOC number of pages
-			 * to reduce calls to set_memory_wb */
-			if (npages < NUM_PAGES_TO_ALLOC)
-				npages = NUM_PAGES_TO_ALLOC;
-		}
 	}
 	spin_unlock_irqrestore(&pool->lock, irq_flags);
 
-- 
1.9.3

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool
  2014-08-13  3:52 [PATCH 0/3] drm/ttm: hard to swim in an empty pool Jérôme Glisse
  2014-08-13  3:52 ` [PATCH 1/3] drm/ttm: set sensible pool size limit Jérôme Glisse
  2014-08-13  3:52 ` [PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool Jérôme Glisse
@ 2014-08-13  3:52 ` Jérôme Glisse
  2014-08-13  6:32   ` Michel Dänzer
  2014-08-13  9:06   ` Thomas Hellstrom
  2 siblings, 2 replies; 29+ messages in thread
From: Jérôme Glisse @ 2014-08-13  3:52 UTC (permalink / raw)
  To: dri-devel
  Cc: Michel Dänzer, Jérôme Glisse,
	Konrad Rzeszutek Wilk, Thomas Hellstrom

From: Jérôme Glisse <jglisse@redhat.com>

When experiencing memory pressure we want to minimize pool size so that
memory we just shrinked is not added back again just as the next thing.

This will divide by 2 the maximum pool size for each device each time
the pool have to shrink. The limit is bumped again is next allocation
happen after one second since the last shrink. The one second delay is
obviously an arbitrary choice.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/gpu/drm/ttm/ttm_page_alloc.c     | 35 +++++++++++++++++++++++++-------
 drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 27 ++++++++++++++++++++++--
 2 files changed, 53 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 09874d6..ab41adf 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -68,6 +68,8 @@
  * @list: Pool of free uc/wc pages for fast reuse.
  * @gfp_flags: Flags to pass for alloc_page.
  * @npages: Number of pages in pool.
+ * @cur_max_size: Current maximum size for the pool.
+ * @shrink_timeout: Timeout for pool maximum size restriction.
  */
 struct ttm_page_pool {
 	spinlock_t		lock;
@@ -76,6 +78,8 @@ struct ttm_page_pool {
 	gfp_t			gfp_flags;
 	unsigned		npages;
 	char			*name;
+	unsigned		cur_max_size;
+	unsigned long		last_shrink;
 	unsigned long		nfrees;
 	unsigned long		nrefills;
 };
@@ -289,6 +293,16 @@ static void ttm_pool_update_free_locked(struct ttm_page_pool *pool,
 	pool->nfrees += freed_pages;
 }
 
+static inline void ttm_pool_update_max_size(struct ttm_page_pool *pool)
+{
+	if (time_before(jiffies, pool->shrink_timeout))
+		return;
+	/* In case we reached zero bounce back to 512 pages. */
+	pool->cur_max_size = max(pool->cur_max_size << 1, 512);
+	pool->cur_max_size = min(pool->cur_max_size,
+				 _manager->options.max_size);
+}
+
 /**
  * Free pages from pool.
  *
@@ -407,6 +421,9 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 		if (shrink_pages == 0)
 			break;
 		pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
+		/* No matter what make sure the pool do not grow in next second. */
+		pool->cur_max_size = pool->cur_max_size >> 1;
+		pool->shrink_timeout = jiffies + HZ;
 		shrink_pages = ttm_page_pool_free(pool, nr_free,
 						  sc->gfp_mask);
 		freed += nr_free - shrink_pages;
@@ -701,13 +718,12 @@ static void ttm_put_pages(struct page **pages, unsigned npages, int flags,
 	}
 	/* Check that we don't go over the pool limit */
 	npages = 0;
-	if (pool->npages > _manager->options.max_size) {
-		npages = pool->npages - _manager->options.max_size;
-		/* free at least NUM_PAGES_TO_ALLOC number of pages
-		 * to reduce calls to set_memory_wb */
-		if (npages < NUM_PAGES_TO_ALLOC)
-			npages = NUM_PAGES_TO_ALLOC;
-	}
+	/*
+	 * Free at least NUM_PAGES_TO_ALLOC number of pages to reduce calls to
+	 * set_memory_wb.
+	 */
+	if (pool->npages > (pool->cur_max_size + NUM_PAGES_TO_ALLOC))
+		npages = pool->npages - pool->cur_max_size;
 	spin_unlock_irqrestore(&pool->lock, irq_flags);
 	if (npages)
 		ttm_page_pool_free(pool, npages, GFP_KERNEL);
@@ -751,6 +767,9 @@ static int ttm_get_pages(struct page **pages, unsigned npages, int flags,
 		return 0;
 	}
 
+	/* Update pool size in case shrinker limited it. */
+	ttm_pool_update_max_size(pool);
+
 	/* combine zero flag to pool flags */
 	gfp_flags |= pool->gfp_flags;
 
@@ -803,6 +822,8 @@ static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, gfp_t flags,
 	pool->npages = pool->nfrees = 0;
 	pool->gfp_flags = flags;
 	pool->name = name;
+	pool->cur_max_size = _manager->options.max_size;
+	pool->shrink_timeout = jiffies;
 }
 
 int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
index a076ff3..80b10aa 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
@@ -93,6 +93,8 @@ enum pool_type {
  * @size: Size used during DMA allocation.
  * @npages_free: Count of available pages for re-use.
  * @npages_in_use: Count of pages that are in use.
+ * @cur_max_size: Current maximum size for the pool.
+ * @shrink_timeout: Timeout for pool maximum size restriction.
  * @nfrees: Stats when pool is shrinking.
  * @nrefills: Stats when the pool is grown.
  * @gfp_flags: Flags to pass for alloc_page.
@@ -110,6 +112,8 @@ struct dma_pool {
 	unsigned size;
 	unsigned npages_free;
 	unsigned npages_in_use;
+	unsigned cur_max_size;
+	unsigned long last_shrink;
 	unsigned long nfrees; /* Stats when shrunk. */
 	unsigned long nrefills; /* Stats when grown. */
 	gfp_t gfp_flags;
@@ -331,6 +335,17 @@ static void __ttm_dma_free_page(struct dma_pool *pool, struct dma_page *d_page)
 	kfree(d_page);
 	d_page = NULL;
 }
+
+static inline void ttm_dma_pool_update_max_size(struct dma_pool *pool)
+{
+	if (time_before(jiffies, pool->shrink_timeout))
+		return;
+	/* In case we reached zero bounce back to 512 pages. */
+	pool->cur_max_size = max(pool->cur_max_size << 1, 512);
+	pool->cur_max_size = min(pool->cur_max_size,
+				 _manager->options.max_size);
+}
+
 static struct dma_page *__ttm_dma_alloc_page(struct dma_pool *pool)
 {
 	struct dma_page *d_page;
@@ -606,6 +621,8 @@ static struct dma_pool *ttm_dma_pool_init(struct device *dev, gfp_t flags,
 	pool->size = PAGE_SIZE;
 	pool->type = type;
 	pool->nrefills = 0;
+	pool->cur_max_size = _manager->options.max_size;
+	pool->shrink_timeout = jiffies;
 	p = pool->name;
 	for (i = 0; i < 5; i++) {
 		if (type & t[i]) {
@@ -892,6 +909,9 @@ int ttm_dma_populate(struct ttm_dma_tt *ttm_dma, struct device *dev)
 		}
 	}
 
+	/* Update pool size in case shrinker limited it. */
+	ttm_dma_pool_update_max_size(pool);
+
 	INIT_LIST_HEAD(&ttm_dma->pages_list);
 	for (i = 0; i < ttm->num_pages; ++i) {
 		ret = ttm_dma_pool_get_pages(pool, ttm_dma, i);
@@ -953,9 +973,9 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
 	} else {
 		pool->npages_free += count;
 		list_splice(&ttm_dma->pages_list, &pool->free_list);
-		if (pool->npages_free >= (_manager->options.max_size +
+		if (pool->npages_free >= (pool->cur_max_size +
 					  NUM_PAGES_TO_ALLOC))
-			npages = pool->npages_free - _manager->options.max_size;
+			npages = pool->npages_free - pool->cur_max_size;
 	}
 	spin_unlock_irqrestore(&pool->lock, irq_flags);
 
@@ -1024,6 +1044,9 @@ ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 		/* Do it in round-robin fashion. */
 		if (++idx < pool_offset)
 			continue;
+		/* No matter what make sure the pool do not grow in next second. */
+		p->pool->cur_max_size = p->pool->cur_max_size >> 1;
+		p->pool->shrink_timeout = jiffies + HZ;
 		nr_free = shrink_pages;
 		shrink_pages = ttm_dma_page_pool_free(p->pool, nr_free,
 						      sc->gfp_mask);
-- 
1.9.3

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/3] drm/ttm: set sensible pool size limit.
  2014-08-13  3:52 ` [PATCH 1/3] drm/ttm: set sensible pool size limit Jérôme Glisse
@ 2014-08-13  6:24   ` Michel Dänzer
  0 siblings, 0 replies; 29+ messages in thread
From: Michel Dänzer @ 2014-08-13  6:24 UTC (permalink / raw)
  To: Jérôme Glisse, dri-devel
  Cc: Jérôme Glisse, Thomas Hellstrom, Konrad Rzeszutek Wilk

On 13.08.2014 12:52, Jérôme Glisse wrote:
> From: Jérôme Glisse <jglisse@redhat.com>
> 
> Due to bug in code it appear that some of the pool where never properly
> use and always empty. Before fixing that bug this patch set sensible
> limit on pool size. The magic 64MB number was nominated.
> 
> This is obviously a some what arbitrary number but the intend of ttm pool
> is to minimize page alloc cost especialy when allocating page that will be
> mark to be excluded from cpu cache mecanisms. We assume that mostly small
> buffer that are constantly allocated/deallocated might suffer from core
> memory allocation overhead as well as cache status change. This are the
> assumptions behind the 64MB value.
> 
> This obviously need some serious testing including monitoring pool size.
> 
> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>

[...]

> @@ -393,8 +404,9 @@ int ttm_mem_global_init(struct ttm_mem_global *glob)
>  		pr_info("Zone %7s: Available graphics memory: %llu kiB\n",
>  			zone->name, (unsigned long long)zone->max_mem >> 10);
>  	}
> -	ttm_page_alloc_init(glob, glob->zone_kernel->max_mem/(2*PAGE_SIZE));
> -	ttm_dma_page_alloc_init(glob, glob->zone_kernel->max_mem/(2*PAGE_SIZE));
> +	max_pool_size = min(glob->zone_kernel->max_mem >> 3UL, MAX_POOL_SIZE);

This introduces a 'comparison of distinct pointer types lacks a cast'
warning for me.


-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool
  2014-08-13  3:52 ` [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool Jérôme Glisse
@ 2014-08-13  6:32   ` Michel Dänzer
  2014-08-13  9:06   ` Thomas Hellstrom
  1 sibling, 0 replies; 29+ messages in thread
From: Michel Dänzer @ 2014-08-13  6:32 UTC (permalink / raw)
  To: Jérôme Glisse, dri-devel
  Cc: Jérôme Glisse, Thomas Hellstrom, Konrad Rzeszutek Wilk

On 13.08.2014 12:52, Jérôme Glisse wrote:
> From: Jérôme Glisse <jglisse@redhat.com>
> 
> When experiencing memory pressure we want to minimize pool size so that
> memory we just shrinked is not added back again just as the next thing.
> 
> This will divide by 2 the maximum pool size for each device each time
> the pool have to shrink. The limit is bumped again is next allocation
> happen after one second since the last shrink. The one second delay is
> obviously an arbitrary choice.
> 
> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>

[...]

> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> index 09874d6..ab41adf 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> @@ -68,6 +68,8 @@
>   * @list: Pool of free uc/wc pages for fast reuse.
>   * @gfp_flags: Flags to pass for alloc_page.
>   * @npages: Number of pages in pool.
> + * @cur_max_size: Current maximum size for the pool.
> + * @shrink_timeout: Timeout for pool maximum size restriction.
>   */
>  struct ttm_page_pool {
>  	spinlock_t		lock;
> @@ -76,6 +78,8 @@ struct ttm_page_pool {
>  	gfp_t			gfp_flags;
>  	unsigned		npages;
>  	char			*name;
> +	unsigned		cur_max_size;
> +	unsigned long		last_shrink;

s/last_shrink/shrink_timeout/

Looks like maybe you posted an untested stale set of patches?


> @@ -289,6 +293,16 @@ static void ttm_pool_update_free_locked(struct ttm_page_pool *pool,
>  	pool->nfrees += freed_pages;
>  }
>  
> +static inline void ttm_pool_update_max_size(struct ttm_page_pool *pool)
> +{
> +	if (time_before(jiffies, pool->shrink_timeout))
> +		return;
> +	/* In case we reached zero bounce back to 512 pages. */
> +	pool->cur_max_size = max(pool->cur_max_size << 1, 512);

Another 'comparison of distinct pointer types lacks a cast' warning.


Both issues apply to ttm_page_alloc_dma.c as well.


-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool
  2014-08-13  3:52 ` [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool Jérôme Glisse
  2014-08-13  6:32   ` Michel Dänzer
@ 2014-08-13  9:06   ` Thomas Hellstrom
  2014-08-13 10:42     ` Daniel Vetter
  1 sibling, 1 reply; 29+ messages in thread
From: Thomas Hellstrom @ 2014-08-13  9:06 UTC (permalink / raw)
  To: Jérôme Glisse
  Cc: Michel Dänzer, Jérôme Glisse, dri-devel,
	Konrad Rzeszutek Wilk


On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
> From: Jérôme Glisse <jglisse@redhat.com>
>
> When experiencing memory pressure we want to minimize pool size so that
> memory we just shrinked is not added back again just as the next thing.
>
> This will divide by 2 the maximum pool size for each device each time
> the pool have to shrink. The limit is bumped again is next allocation
> happen after one second since the last shrink. The one second delay is
> obviously an arbitrary choice.

Jérôme,

I don't like this patch. It adds extra complexity and its usefulness is
highly questionable.
There are a number of caches in the system, and if all of them added
some sort of voluntary shrink heuristics like this, we'd end up with
impossible-to-debug unpredictable performance issues.

We should let the memory subsystem decide when to reclaim pages from
caches and what caches to reclaim them from.

/Thomas
>
> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
> Cc: Michel Dänzer <michel@daenzer.net>
> Cc: Thomas Hellstrom <thellstrom@vmware.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  drivers/gpu/drm/ttm/ttm_page_alloc.c     | 35 +++++++++++++++++++++++++-------
>  drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 27 ++++++++++++++++++++++--
>  2 files changed, 53 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> index 09874d6..ab41adf 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> @@ -68,6 +68,8 @@
>   * @list: Pool of free uc/wc pages for fast reuse.
>   * @gfp_flags: Flags to pass for alloc_page.
>   * @npages: Number of pages in pool.
> + * @cur_max_size: Current maximum size for the pool.
> + * @shrink_timeout: Timeout for pool maximum size restriction.
>   */
>  struct ttm_page_pool {
>  	spinlock_t		lock;
> @@ -76,6 +78,8 @@ struct ttm_page_pool {
>  	gfp_t			gfp_flags;
>  	unsigned		npages;
>  	char			*name;
> +	unsigned		cur_max_size;
> +	unsigned long		last_shrink;
>  	unsigned long		nfrees;
>  	unsigned long		nrefills;
>  };
> @@ -289,6 +293,16 @@ static void ttm_pool_update_free_locked(struct ttm_page_pool *pool,
>  	pool->nfrees += freed_pages;
>  }
>  
> +static inline void ttm_pool_update_max_size(struct ttm_page_pool *pool)
> +{
> +	if (time_before(jiffies, pool->shrink_timeout))
> +		return;
> +	/* In case we reached zero bounce back to 512 pages. */
> +	pool->cur_max_size = max(pool->cur_max_size << 1, 512);
> +	pool->cur_max_size = min(pool->cur_max_size,
> +				 _manager->options.max_size);
> +}
> +
>  /**
>   * Free pages from pool.
>   *
> @@ -407,6 +421,9 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  		if (shrink_pages == 0)
>  			break;
>  		pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
> +		/* No matter what make sure the pool do not grow in next second. */
> +		pool->cur_max_size = pool->cur_max_size >> 1;
> +		pool->shrink_timeout = jiffies + HZ;
>  		shrink_pages = ttm_page_pool_free(pool, nr_free,
>  						  sc->gfp_mask);
>  		freed += nr_free - shrink_pages;
> @@ -701,13 +718,12 @@ static void ttm_put_pages(struct page **pages, unsigned npages, int flags,
>  	}
>  	/* Check that we don't go over the pool limit */
>  	npages = 0;
> -	if (pool->npages > _manager->options.max_size) {
> -		npages = pool->npages - _manager->options.max_size;
> -		/* free at least NUM_PAGES_TO_ALLOC number of pages
> -		 * to reduce calls to set_memory_wb */
> -		if (npages < NUM_PAGES_TO_ALLOC)
> -			npages = NUM_PAGES_TO_ALLOC;
> -	}
> +	/*
> +	 * Free at least NUM_PAGES_TO_ALLOC number of pages to reduce calls to
> +	 * set_memory_wb.
> +	 */
> +	if (pool->npages > (pool->cur_max_size + NUM_PAGES_TO_ALLOC))
> +		npages = pool->npages - pool->cur_max_size;
>  	spin_unlock_irqrestore(&pool->lock, irq_flags);
>  	if (npages)
>  		ttm_page_pool_free(pool, npages, GFP_KERNEL);
> @@ -751,6 +767,9 @@ static int ttm_get_pages(struct page **pages, unsigned npages, int flags,
>  		return 0;
>  	}
>  
> +	/* Update pool size in case shrinker limited it. */
> +	ttm_pool_update_max_size(pool);
> +
>  	/* combine zero flag to pool flags */
>  	gfp_flags |= pool->gfp_flags;
>  
> @@ -803,6 +822,8 @@ static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, gfp_t flags,
>  	pool->npages = pool->nfrees = 0;
>  	pool->gfp_flags = flags;
>  	pool->name = name;
> +	pool->cur_max_size = _manager->options.max_size;
> +	pool->shrink_timeout = jiffies;
>  }
>  
>  int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> index a076ff3..80b10aa 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> @@ -93,6 +93,8 @@ enum pool_type {
>   * @size: Size used during DMA allocation.
>   * @npages_free: Count of available pages for re-use.
>   * @npages_in_use: Count of pages that are in use.
> + * @cur_max_size: Current maximum size for the pool.
> + * @shrink_timeout: Timeout for pool maximum size restriction.
>   * @nfrees: Stats when pool is shrinking.
>   * @nrefills: Stats when the pool is grown.
>   * @gfp_flags: Flags to pass for alloc_page.
> @@ -110,6 +112,8 @@ struct dma_pool {
>  	unsigned size;
>  	unsigned npages_free;
>  	unsigned npages_in_use;
> +	unsigned cur_max_size;
> +	unsigned long last_shrink;
>  	unsigned long nfrees; /* Stats when shrunk. */
>  	unsigned long nrefills; /* Stats when grown. */
>  	gfp_t gfp_flags;
> @@ -331,6 +335,17 @@ static void __ttm_dma_free_page(struct dma_pool *pool, struct dma_page *d_page)
>  	kfree(d_page);
>  	d_page = NULL;
>  }
> +
> +static inline void ttm_dma_pool_update_max_size(struct dma_pool *pool)
> +{
> +	if (time_before(jiffies, pool->shrink_timeout))
> +		return;
> +	/* In case we reached zero bounce back to 512 pages. */
> +	pool->cur_max_size = max(pool->cur_max_size << 1, 512);
> +	pool->cur_max_size = min(pool->cur_max_size,
> +				 _manager->options.max_size);
> +}
> +
>  static struct dma_page *__ttm_dma_alloc_page(struct dma_pool *pool)
>  {
>  	struct dma_page *d_page;
> @@ -606,6 +621,8 @@ static struct dma_pool *ttm_dma_pool_init(struct device *dev, gfp_t flags,
>  	pool->size = PAGE_SIZE;
>  	pool->type = type;
>  	pool->nrefills = 0;
> +	pool->cur_max_size = _manager->options.max_size;
> +	pool->shrink_timeout = jiffies;
>  	p = pool->name;
>  	for (i = 0; i < 5; i++) {
>  		if (type & t[i]) {
> @@ -892,6 +909,9 @@ int ttm_dma_populate(struct ttm_dma_tt *ttm_dma, struct device *dev)
>  		}
>  	}
>  
> +	/* Update pool size in case shrinker limited it. */
> +	ttm_dma_pool_update_max_size(pool);
> +
>  	INIT_LIST_HEAD(&ttm_dma->pages_list);
>  	for (i = 0; i < ttm->num_pages; ++i) {
>  		ret = ttm_dma_pool_get_pages(pool, ttm_dma, i);
> @@ -953,9 +973,9 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
>  	} else {
>  		pool->npages_free += count;
>  		list_splice(&ttm_dma->pages_list, &pool->free_list);
> -		if (pool->npages_free >= (_manager->options.max_size +
> +		if (pool->npages_free >= (pool->cur_max_size +
>  					  NUM_PAGES_TO_ALLOC))
> -			npages = pool->npages_free - _manager->options.max_size;
> +			npages = pool->npages_free - pool->cur_max_size;
>  	}
>  	spin_unlock_irqrestore(&pool->lock, irq_flags);
>  
> @@ -1024,6 +1044,9 @@ ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  		/* Do it in round-robin fashion. */
>  		if (++idx < pool_offset)
>  			continue;
> +		/* No matter what make sure the pool do not grow in next second. */
> +		p->pool->cur_max_size = p->pool->cur_max_size >> 1;
> +		p->pool->shrink_timeout = jiffies + HZ;
>  		nr_free = shrink_pages;
>  		shrink_pages = ttm_dma_page_pool_free(p->pool, nr_free,
>  						      sc->gfp_mask);

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool
  2014-08-13  9:06   ` Thomas Hellstrom
@ 2014-08-13 10:42     ` Daniel Vetter
  2014-08-13 12:35       ` GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool) Thomas Hellstrom
  0 siblings, 1 reply; 29+ messages in thread
From: Daniel Vetter @ 2014-08-13 10:42 UTC (permalink / raw)
  To: Thomas Hellstrom
  Cc: Michel Dänzer, Jérôme Glisse, dri-devel,
	Konrad Rzeszutek Wilk

On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
> 
> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
> > From: Jérôme Glisse <jglisse@redhat.com>
> >
> > When experiencing memory pressure we want to minimize pool size so that
> > memory we just shrinked is not added back again just as the next thing.
> >
> > This will divide by 2 the maximum pool size for each device each time
> > the pool have to shrink. The limit is bumped again is next allocation
> > happen after one second since the last shrink. The one second delay is
> > obviously an arbitrary choice.
> 
> Jérôme,
> 
> I don't like this patch. It adds extra complexity and its usefulness is
> highly questionable.
> There are a number of caches in the system, and if all of them added
> some sort of voluntary shrink heuristics like this, we'd end up with
> impossible-to-debug unpredictable performance issues.
> 
> We should let the memory subsystem decide when to reclaim pages from
> caches and what caches to reclaim them from.

Yeah, artificially limiting your cache from growing when your shrinker
gets called will just break the equal-memory pressure the core mm uses to
rebalance between all caches when workload changes. In i915 we let
everything grow without artificial bounds and only rely upon the shrinker
callbacks to ensure we don't consume more than our fair share of available
memory overall.
-Daniel

> 
> /Thomas
> >
> > Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> > Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
> > Cc: Michel Dänzer <michel@daenzer.net>
> > Cc: Thomas Hellstrom <thellstrom@vmware.com>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > ---
> >  drivers/gpu/drm/ttm/ttm_page_alloc.c     | 35 +++++++++++++++++++++++++-------
> >  drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 27 ++++++++++++++++++++++--
> >  2 files changed, 53 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> > index 09874d6..ab41adf 100644
> > --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
> > +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> > @@ -68,6 +68,8 @@
> >   * @list: Pool of free uc/wc pages for fast reuse.
> >   * @gfp_flags: Flags to pass for alloc_page.
> >   * @npages: Number of pages in pool.
> > + * @cur_max_size: Current maximum size for the pool.
> > + * @shrink_timeout: Timeout for pool maximum size restriction.
> >   */
> >  struct ttm_page_pool {
> >  	spinlock_t		lock;
> > @@ -76,6 +78,8 @@ struct ttm_page_pool {
> >  	gfp_t			gfp_flags;
> >  	unsigned		npages;
> >  	char			*name;
> > +	unsigned		cur_max_size;
> > +	unsigned long		last_shrink;
> >  	unsigned long		nfrees;
> >  	unsigned long		nrefills;
> >  };
> > @@ -289,6 +293,16 @@ static void ttm_pool_update_free_locked(struct ttm_page_pool *pool,
> >  	pool->nfrees += freed_pages;
> >  }
> >  
> > +static inline void ttm_pool_update_max_size(struct ttm_page_pool *pool)
> > +{
> > +	if (time_before(jiffies, pool->shrink_timeout))
> > +		return;
> > +	/* In case we reached zero bounce back to 512 pages. */
> > +	pool->cur_max_size = max(pool->cur_max_size << 1, 512);
> > +	pool->cur_max_size = min(pool->cur_max_size,
> > +				 _manager->options.max_size);
> > +}
> > +
> >  /**
> >   * Free pages from pool.
> >   *
> > @@ -407,6 +421,9 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> >  		if (shrink_pages == 0)
> >  			break;
> >  		pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
> > +		/* No matter what make sure the pool do not grow in next second. */
> > +		pool->cur_max_size = pool->cur_max_size >> 1;
> > +		pool->shrink_timeout = jiffies + HZ;
> >  		shrink_pages = ttm_page_pool_free(pool, nr_free,
> >  						  sc->gfp_mask);
> >  		freed += nr_free - shrink_pages;
> > @@ -701,13 +718,12 @@ static void ttm_put_pages(struct page **pages, unsigned npages, int flags,
> >  	}
> >  	/* Check that we don't go over the pool limit */
> >  	npages = 0;
> > -	if (pool->npages > _manager->options.max_size) {
> > -		npages = pool->npages - _manager->options.max_size;
> > -		/* free at least NUM_PAGES_TO_ALLOC number of pages
> > -		 * to reduce calls to set_memory_wb */
> > -		if (npages < NUM_PAGES_TO_ALLOC)
> > -			npages = NUM_PAGES_TO_ALLOC;
> > -	}
> > +	/*
> > +	 * Free at least NUM_PAGES_TO_ALLOC number of pages to reduce calls to
> > +	 * set_memory_wb.
> > +	 */
> > +	if (pool->npages > (pool->cur_max_size + NUM_PAGES_TO_ALLOC))
> > +		npages = pool->npages - pool->cur_max_size;
> >  	spin_unlock_irqrestore(&pool->lock, irq_flags);
> >  	if (npages)
> >  		ttm_page_pool_free(pool, npages, GFP_KERNEL);
> > @@ -751,6 +767,9 @@ static int ttm_get_pages(struct page **pages, unsigned npages, int flags,
> >  		return 0;
> >  	}
> >  
> > +	/* Update pool size in case shrinker limited it. */
> > +	ttm_pool_update_max_size(pool);
> > +
> >  	/* combine zero flag to pool flags */
> >  	gfp_flags |= pool->gfp_flags;
> >  
> > @@ -803,6 +822,8 @@ static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, gfp_t flags,
> >  	pool->npages = pool->nfrees = 0;
> >  	pool->gfp_flags = flags;
> >  	pool->name = name;
> > +	pool->cur_max_size = _manager->options.max_size;
> > +	pool->shrink_timeout = jiffies;
> >  }
> >  
> >  int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)
> > diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> > index a076ff3..80b10aa 100644
> > --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> > +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> > @@ -93,6 +93,8 @@ enum pool_type {
> >   * @size: Size used during DMA allocation.
> >   * @npages_free: Count of available pages for re-use.
> >   * @npages_in_use: Count of pages that are in use.
> > + * @cur_max_size: Current maximum size for the pool.
> > + * @shrink_timeout: Timeout for pool maximum size restriction.
> >   * @nfrees: Stats when pool is shrinking.
> >   * @nrefills: Stats when the pool is grown.
> >   * @gfp_flags: Flags to pass for alloc_page.
> > @@ -110,6 +112,8 @@ struct dma_pool {
> >  	unsigned size;
> >  	unsigned npages_free;
> >  	unsigned npages_in_use;
> > +	unsigned cur_max_size;
> > +	unsigned long last_shrink;
> >  	unsigned long nfrees; /* Stats when shrunk. */
> >  	unsigned long nrefills; /* Stats when grown. */
> >  	gfp_t gfp_flags;
> > @@ -331,6 +335,17 @@ static void __ttm_dma_free_page(struct dma_pool *pool, struct dma_page *d_page)
> >  	kfree(d_page);
> >  	d_page = NULL;
> >  }
> > +
> > +static inline void ttm_dma_pool_update_max_size(struct dma_pool *pool)
> > +{
> > +	if (time_before(jiffies, pool->shrink_timeout))
> > +		return;
> > +	/* In case we reached zero bounce back to 512 pages. */
> > +	pool->cur_max_size = max(pool->cur_max_size << 1, 512);
> > +	pool->cur_max_size = min(pool->cur_max_size,
> > +				 _manager->options.max_size);
> > +}
> > +
> >  static struct dma_page *__ttm_dma_alloc_page(struct dma_pool *pool)
> >  {
> >  	struct dma_page *d_page;
> > @@ -606,6 +621,8 @@ static struct dma_pool *ttm_dma_pool_init(struct device *dev, gfp_t flags,
> >  	pool->size = PAGE_SIZE;
> >  	pool->type = type;
> >  	pool->nrefills = 0;
> > +	pool->cur_max_size = _manager->options.max_size;
> > +	pool->shrink_timeout = jiffies;
> >  	p = pool->name;
> >  	for (i = 0; i < 5; i++) {
> >  		if (type & t[i]) {
> > @@ -892,6 +909,9 @@ int ttm_dma_populate(struct ttm_dma_tt *ttm_dma, struct device *dev)
> >  		}
> >  	}
> >  
> > +	/* Update pool size in case shrinker limited it. */
> > +	ttm_dma_pool_update_max_size(pool);
> > +
> >  	INIT_LIST_HEAD(&ttm_dma->pages_list);
> >  	for (i = 0; i < ttm->num_pages; ++i) {
> >  		ret = ttm_dma_pool_get_pages(pool, ttm_dma, i);
> > @@ -953,9 +973,9 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
> >  	} else {
> >  		pool->npages_free += count;
> >  		list_splice(&ttm_dma->pages_list, &pool->free_list);
> > -		if (pool->npages_free >= (_manager->options.max_size +
> > +		if (pool->npages_free >= (pool->cur_max_size +
> >  					  NUM_PAGES_TO_ALLOC))
> > -			npages = pool->npages_free - _manager->options.max_size;
> > +			npages = pool->npages_free - pool->cur_max_size;
> >  	}
> >  	spin_unlock_irqrestore(&pool->lock, irq_flags);
> >  
> > @@ -1024,6 +1044,9 @@ ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> >  		/* Do it in round-robin fashion. */
> >  		if (++idx < pool_offset)
> >  			continue;
> > +		/* No matter what make sure the pool do not grow in next second. */
> > +		p->pool->cur_max_size = p->pool->cur_max_size >> 1;
> > +		p->pool->shrink_timeout = jiffies + HZ;
> >  		nr_free = shrink_pages;
> >  		shrink_pages = ttm_dma_page_pool_free(p->pool, nr_free,
> >  						      sc->gfp_mask);
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 29+ messages in thread

* GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 10:42     ` Daniel Vetter
@ 2014-08-13 12:35       ` Thomas Hellstrom
  2014-08-13 12:40         ` David Herrmann
  2014-08-13 13:01         ` Daniel Vetter
  0 siblings, 2 replies; 29+ messages in thread
From: Thomas Hellstrom @ 2014-08-13 12:35 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Michel Dänzer, Jérôme Glisse, dri-devel,
	Konrad Rzeszutek Wilk

On 08/13/2014 12:42 PM, Daniel Vetter wrote:
> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
>>> From: Jérôme Glisse <jglisse@redhat.com>
>>>
>>> When experiencing memory pressure we want to minimize pool size so that
>>> memory we just shrinked is not added back again just as the next thing.
>>>
>>> This will divide by 2 the maximum pool size for each device each time
>>> the pool have to shrink. The limit is bumped again is next allocation
>>> happen after one second since the last shrink. The one second delay is
>>> obviously an arbitrary choice.
>> Jérôme,
>>
>> I don't like this patch. It adds extra complexity and its usefulness is
>> highly questionable.
>> There are a number of caches in the system, and if all of them added
>> some sort of voluntary shrink heuristics like this, we'd end up with
>> impossible-to-debug unpredictable performance issues.
>>
>> We should let the memory subsystem decide when to reclaim pages from
>> caches and what caches to reclaim them from.
> Yeah, artificially limiting your cache from growing when your shrinker
> gets called will just break the equal-memory pressure the core mm uses to
> rebalance between all caches when workload changes. In i915 we let
> everything grow without artificial bounds and only rely upon the shrinker
> callbacks to ensure we don't consume more than our fair share of available
> memory overall.
> -Daniel

Now when you bring i915 memory usage up, Daniel,
I can't refrain from bringing up the old user-space unreclaimable kernel
memory issue, for which gem open is a good example ;) Each time
user-space opens a gem handle, some un-reclaimable kernel memory is
allocated, for which there is no accounting, so theoretically I think a
user can bring a system to unusability this way.

Typically there are various limits on unreclaimable objects like this,
like open file descriptors, and IIRC the kernel even has an internal
limit on the number of struct files you initialize, based on the
available system memory, so dma-buf / prime should already have some
sort of protection.

/Thomas


>> /Thomas
>>> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
>>> Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
>>> Cc: Michel Dänzer <michel@daenzer.net>
>>> Cc: Thomas Hellstrom <thellstrom@vmware.com>
>>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>> ---
>>>  drivers/gpu/drm/ttm/ttm_page_alloc.c     | 35 +++++++++++++++++++++++++-------
>>>  drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 27 ++++++++++++++++++++++--
>>>  2 files changed, 53 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
>>> index 09874d6..ab41adf 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
>>> @@ -68,6 +68,8 @@
>>>   * @list: Pool of free uc/wc pages for fast reuse.
>>>   * @gfp_flags: Flags to pass for alloc_page.
>>>   * @npages: Number of pages in pool.
>>> + * @cur_max_size: Current maximum size for the pool.
>>> + * @shrink_timeout: Timeout for pool maximum size restriction.
>>>   */
>>>  struct ttm_page_pool {
>>>  	spinlock_t		lock;
>>> @@ -76,6 +78,8 @@ struct ttm_page_pool {
>>>  	gfp_t			gfp_flags;
>>>  	unsigned		npages;
>>>  	char			*name;
>>> +	unsigned		cur_max_size;
>>> +	unsigned long		last_shrink;
>>>  	unsigned long		nfrees;
>>>  	unsigned long		nrefills;
>>>  };
>>> @@ -289,6 +293,16 @@ static void ttm_pool_update_free_locked(struct ttm_page_pool *pool,
>>>  	pool->nfrees += freed_pages;
>>>  }
>>>  
>>> +static inline void ttm_pool_update_max_size(struct ttm_page_pool *pool)
>>> +{
>>> +	if (time_before(jiffies, pool->shrink_timeout))
>>> +		return;
>>> +	/* In case we reached zero bounce back to 512 pages. */
>>> +	pool->cur_max_size = max(pool->cur_max_size << 1, 512);
>>> +	pool->cur_max_size = min(pool->cur_max_size,
>>> +				 _manager->options.max_size);
>>> +}
>>> +
>>>  /**
>>>   * Free pages from pool.
>>>   *
>>> @@ -407,6 +421,9 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>>>  		if (shrink_pages == 0)
>>>  			break;
>>>  		pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
>>> +		/* No matter what make sure the pool do not grow in next second. */
>>> +		pool->cur_max_size = pool->cur_max_size >> 1;
>>> +		pool->shrink_timeout = jiffies + HZ;
>>>  		shrink_pages = ttm_page_pool_free(pool, nr_free,
>>>  						  sc->gfp_mask);
>>>  		freed += nr_free - shrink_pages;
>>> @@ -701,13 +718,12 @@ static void ttm_put_pages(struct page **pages, unsigned npages, int flags,
>>>  	}
>>>  	/* Check that we don't go over the pool limit */
>>>  	npages = 0;
>>> -	if (pool->npages > _manager->options.max_size) {
>>> -		npages = pool->npages - _manager->options.max_size;
>>> -		/* free at least NUM_PAGES_TO_ALLOC number of pages
>>> -		 * to reduce calls to set_memory_wb */
>>> -		if (npages < NUM_PAGES_TO_ALLOC)
>>> -			npages = NUM_PAGES_TO_ALLOC;
>>> -	}
>>> +	/*
>>> +	 * Free at least NUM_PAGES_TO_ALLOC number of pages to reduce calls to
>>> +	 * set_memory_wb.
>>> +	 */
>>> +	if (pool->npages > (pool->cur_max_size + NUM_PAGES_TO_ALLOC))
>>> +		npages = pool->npages - pool->cur_max_size;
>>>  	spin_unlock_irqrestore(&pool->lock, irq_flags);
>>>  	if (npages)
>>>  		ttm_page_pool_free(pool, npages, GFP_KERNEL);
>>> @@ -751,6 +767,9 @@ static int ttm_get_pages(struct page **pages, unsigned npages, int flags,
>>>  		return 0;
>>>  	}
>>>  
>>> +	/* Update pool size in case shrinker limited it. */
>>> +	ttm_pool_update_max_size(pool);
>>> +
>>>  	/* combine zero flag to pool flags */
>>>  	gfp_flags |= pool->gfp_flags;
>>>  
>>> @@ -803,6 +822,8 @@ static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, gfp_t flags,
>>>  	pool->npages = pool->nfrees = 0;
>>>  	pool->gfp_flags = flags;
>>>  	pool->name = name;
>>> +	pool->cur_max_size = _manager->options.max_size;
>>> +	pool->shrink_timeout = jiffies;
>>>  }
>>>  
>>>  int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)
>>> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>>> index a076ff3..80b10aa 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>>> @@ -93,6 +93,8 @@ enum pool_type {
>>>   * @size: Size used during DMA allocation.
>>>   * @npages_free: Count of available pages for re-use.
>>>   * @npages_in_use: Count of pages that are in use.
>>> + * @cur_max_size: Current maximum size for the pool.
>>> + * @shrink_timeout: Timeout for pool maximum size restriction.
>>>   * @nfrees: Stats when pool is shrinking.
>>>   * @nrefills: Stats when the pool is grown.
>>>   * @gfp_flags: Flags to pass for alloc_page.
>>> @@ -110,6 +112,8 @@ struct dma_pool {
>>>  	unsigned size;
>>>  	unsigned npages_free;
>>>  	unsigned npages_in_use;
>>> +	unsigned cur_max_size;
>>> +	unsigned long last_shrink;
>>>  	unsigned long nfrees; /* Stats when shrunk. */
>>>  	unsigned long nrefills; /* Stats when grown. */
>>>  	gfp_t gfp_flags;
>>> @@ -331,6 +335,17 @@ static void __ttm_dma_free_page(struct dma_pool *pool, struct dma_page *d_page)
>>>  	kfree(d_page);
>>>  	d_page = NULL;
>>>  }
>>> +
>>> +static inline void ttm_dma_pool_update_max_size(struct dma_pool *pool)
>>> +{
>>> +	if (time_before(jiffies, pool->shrink_timeout))
>>> +		return;
>>> +	/* In case we reached zero bounce back to 512 pages. */
>>> +	pool->cur_max_size = max(pool->cur_max_size << 1, 512);
>>> +	pool->cur_max_size = min(pool->cur_max_size,
>>> +				 _manager->options.max_size);
>>> +}
>>> +
>>>  static struct dma_page *__ttm_dma_alloc_page(struct dma_pool *pool)
>>>  {
>>>  	struct dma_page *d_page;
>>> @@ -606,6 +621,8 @@ static struct dma_pool *ttm_dma_pool_init(struct device *dev, gfp_t flags,
>>>  	pool->size = PAGE_SIZE;
>>>  	pool->type = type;
>>>  	pool->nrefills = 0;
>>> +	pool->cur_max_size = _manager->options.max_size;
>>> +	pool->shrink_timeout = jiffies;
>>>  	p = pool->name;
>>>  	for (i = 0; i < 5; i++) {
>>>  		if (type & t[i]) {
>>> @@ -892,6 +909,9 @@ int ttm_dma_populate(struct ttm_dma_tt *ttm_dma, struct device *dev)
>>>  		}
>>>  	}
>>>  
>>> +	/* Update pool size in case shrinker limited it. */
>>> +	ttm_dma_pool_update_max_size(pool);
>>> +
>>>  	INIT_LIST_HEAD(&ttm_dma->pages_list);
>>>  	for (i = 0; i < ttm->num_pages; ++i) {
>>>  		ret = ttm_dma_pool_get_pages(pool, ttm_dma, i);
>>> @@ -953,9 +973,9 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
>>>  	} else {
>>>  		pool->npages_free += count;
>>>  		list_splice(&ttm_dma->pages_list, &pool->free_list);
>>> -		if (pool->npages_free >= (_manager->options.max_size +
>>> +		if (pool->npages_free >= (pool->cur_max_size +
>>>  					  NUM_PAGES_TO_ALLOC))
>>> -			npages = pool->npages_free - _manager->options.max_size;
>>> +			npages = pool->npages_free - pool->cur_max_size;
>>>  	}
>>>  	spin_unlock_irqrestore(&pool->lock, irq_flags);
>>>  
>>> @@ -1024,6 +1044,9 @@ ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>>>  		/* Do it in round-robin fashion. */
>>>  		if (++idx < pool_offset)
>>>  			continue;
>>> +		/* No matter what make sure the pool do not grow in next second. */
>>> +		p->pool->cur_max_size = p->pool->cur_max_size >> 1;
>>> +		p->pool->shrink_timeout = jiffies + HZ;
>>>  		nr_free = shrink_pages;
>>>  		shrink_pages = ttm_dma_page_pool_free(p->pool, nr_free,
>>>  						      sc->gfp_mask);
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://urldefense.proofpoint.com/v1/url?u=http://lists.freedesktop.org/mailman/listinfo/dri-devel&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=l5Ago9ekmVFZ3c4M6eauqrJWGwjf6fTb%2BP3CxbBFkVM%3D%0A&m=O3%2ForLs9%2FKcgsMAkAfqRcjXW1tOmP7AUEsPkaztSSrE%3D%0A&s=a7ca6074f66b92976893c6c6e18d96b96f905d4fc8ec0b8503614728fc387dc5

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 12:35       ` GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool) Thomas Hellstrom
@ 2014-08-13 12:40         ` David Herrmann
  2014-08-13 12:48           ` Thomas Hellstrom
  2014-08-13 13:01         ` Daniel Vetter
  1 sibling, 1 reply; 29+ messages in thread
From: David Herrmann @ 2014-08-13 12:40 UTC (permalink / raw)
  To: Thomas Hellstrom
  Cc: dri-devel, Michel Dänzer, Jérôme Glisse,
	Konrad Rzeszutek Wilk

Hi

On Wed, Aug 13, 2014 at 2:35 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
>>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
>>>> From: Jérôme Glisse <jglisse@redhat.com>
>>>>
>>>> When experiencing memory pressure we want to minimize pool size so that
>>>> memory we just shrinked is not added back again just as the next thing.
>>>>
>>>> This will divide by 2 the maximum pool size for each device each time
>>>> the pool have to shrink. The limit is bumped again is next allocation
>>>> happen after one second since the last shrink. The one second delay is
>>>> obviously an arbitrary choice.
>>> Jérôme,
>>>
>>> I don't like this patch. It adds extra complexity and its usefulness is
>>> highly questionable.
>>> There are a number of caches in the system, and if all of them added
>>> some sort of voluntary shrink heuristics like this, we'd end up with
>>> impossible-to-debug unpredictable performance issues.
>>>
>>> We should let the memory subsystem decide when to reclaim pages from
>>> caches and what caches to reclaim them from.
>> Yeah, artificially limiting your cache from growing when your shrinker
>> gets called will just break the equal-memory pressure the core mm uses to
>> rebalance between all caches when workload changes. In i915 we let
>> everything grow without artificial bounds and only rely upon the shrinker
>> callbacks to ensure we don't consume more than our fair share of available
>> memory overall.
>> -Daniel
>
> Now when you bring i915 memory usage up, Daniel,
> I can't refrain from bringing up the old user-space unreclaimable kernel
> memory issue, for which gem open is a good example ;) Each time
> user-space opens a gem handle, some un-reclaimable kernel memory is
> allocated, for which there is no accounting, so theoretically I think a
> user can bring a system to unusability this way.
>
> Typically there are various limits on unreclaimable objects like this,
> like open file descriptors, and IIRC the kernel even has an internal
> limit on the number of struct files you initialize, based on the
> available system memory, so dma-buf / prime should already have some
> sort of protection.

gem->filp points to a fresh shmem file, which itself is limited like
dmabuf. That should suffice, right?

Thanks
David
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 12:40         ` David Herrmann
@ 2014-08-13 12:48           ` Thomas Hellstrom
  0 siblings, 0 replies; 29+ messages in thread
From: Thomas Hellstrom @ 2014-08-13 12:48 UTC (permalink / raw)
  To: David Herrmann
  Cc: dri-devel, Michel Dänzer, Jérôme Glisse,
	Konrad Rzeszutek Wilk

On 08/13/2014 02:40 PM, David Herrmann wrote:
> Hi
>
> On Wed, Aug 13, 2014 at 2:35 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
>>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
>>>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
>>>>> From: Jérôme Glisse <jglisse@redhat.com>
>>>>>
>>>>> When experiencing memory pressure we want to minimize pool size so that
>>>>> memory we just shrinked is not added back again just as the next thing.
>>>>>
>>>>> This will divide by 2 the maximum pool size for each device each time
>>>>> the pool have to shrink. The limit is bumped again is next allocation
>>>>> happen after one second since the last shrink. The one second delay is
>>>>> obviously an arbitrary choice.
>>>> Jérôme,
>>>>
>>>> I don't like this patch. It adds extra complexity and its usefulness is
>>>> highly questionable.
>>>> There are a number of caches in the system, and if all of them added
>>>> some sort of voluntary shrink heuristics like this, we'd end up with
>>>> impossible-to-debug unpredictable performance issues.
>>>>
>>>> We should let the memory subsystem decide when to reclaim pages from
>>>> caches and what caches to reclaim them from.
>>> Yeah, artificially limiting your cache from growing when your shrinker
>>> gets called will just break the equal-memory pressure the core mm uses to
>>> rebalance between all caches when workload changes. In i915 we let
>>> everything grow without artificial bounds and only rely upon the shrinker
>>> callbacks to ensure we don't consume more than our fair share of available
>>> memory overall.
>>> -Daniel
>> Now when you bring i915 memory usage up, Daniel,
>> I can't refrain from bringing up the old user-space unreclaimable kernel
>> memory issue, for which gem open is a good example ;) Each time
>> user-space opens a gem handle, some un-reclaimable kernel memory is
>> allocated, for which there is no accounting, so theoretically I think a
>> user can bring a system to unusability this way.
>>
>> Typically there are various limits on unreclaimable objects like this,
>> like open file descriptors, and IIRC the kernel even has an internal
>> limit on the number of struct files you initialize, based on the
>> available system memory, so dma-buf / prime should already have some
>> sort of protection.
> gem->filp points to a fresh shmem file, which itself is limited like
> dmabuf. That should suffice, right?
>
> Thanks
> David
I'm thinking of situations where you have a gem name and open a new
handle. It allocates a new unaccounted idr object. Admittedly you'd have
to open a hell of a lot of new handles to stress the system, but that's
an example of the situation I'm thinking of. Similarly perhaps if you
create a gem handle from a prime file-descriptor but I haven't looked at
that code in detail.

Thanks

/Thomas

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 12:35       ` GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool) Thomas Hellstrom
  2014-08-13 12:40         ` David Herrmann
@ 2014-08-13 13:01         ` Daniel Vetter
  2014-08-13 14:09           ` Oded Gabbay
  2014-08-13 15:13           ` Thomas Hellstrom
  1 sibling, 2 replies; 29+ messages in thread
From: Daniel Vetter @ 2014-08-13 13:01 UTC (permalink / raw)
  To: Thomas Hellstrom
  Cc: Konrad Rzeszutek Wilk, Michel Dänzer, dri-devel,
	Jérôme Glisse

On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
> > On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
> >> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
> >>> From: Jérôme Glisse <jglisse@redhat.com>
> >>>
> >>> When experiencing memory pressure we want to minimize pool size so that
> >>> memory we just shrinked is not added back again just as the next thing.
> >>>
> >>> This will divide by 2 the maximum pool size for each device each time
> >>> the pool have to shrink. The limit is bumped again is next allocation
> >>> happen after one second since the last shrink. The one second delay is
> >>> obviously an arbitrary choice.
> >> Jérôme,
> >>
> >> I don't like this patch. It adds extra complexity and its usefulness is
> >> highly questionable.
> >> There are a number of caches in the system, and if all of them added
> >> some sort of voluntary shrink heuristics like this, we'd end up with
> >> impossible-to-debug unpredictable performance issues.
> >>
> >> We should let the memory subsystem decide when to reclaim pages from
> >> caches and what caches to reclaim them from.
> > Yeah, artificially limiting your cache from growing when your shrinker
> > gets called will just break the equal-memory pressure the core mm uses to
> > rebalance between all caches when workload changes. In i915 we let
> > everything grow without artificial bounds and only rely upon the shrinker
> > callbacks to ensure we don't consume more than our fair share of available
> > memory overall.
> > -Daniel
> 
> Now when you bring i915 memory usage up, Daniel,
> I can't refrain from bringing up the old user-space unreclaimable kernel
> memory issue, for which gem open is a good example ;) Each time
> user-space opens a gem handle, some un-reclaimable kernel memory is
> allocated, for which there is no accounting, so theoretically I think a
> user can bring a system to unusability this way.
> 
> Typically there are various limits on unreclaimable objects like this,
> like open file descriptors, and IIRC the kernel even has an internal
> limit on the number of struct files you initialize, based on the
> available system memory, so dma-buf / prime should already have some
> sort of protection.

Oh yeah, we have zero cgroups limits or similar stuff for gem allocations,
so there's not really a way to isolate gpu memory usage in a sane way for
specific processes. But there's also zero limits on actual gpu usage
itself (timeslices or whatever) so I guess no one asked for this yet.

My comment really was about balancing mm users under the assumption that
they're all unlimited.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 13:01         ` Daniel Vetter
@ 2014-08-13 14:09           ` Oded Gabbay
  2014-08-13 15:19             ` Thomas Hellstrom
  2014-08-13 16:30             ` Daniel Vetter
  2014-08-13 15:13           ` Thomas Hellstrom
  1 sibling, 2 replies; 29+ messages in thread
From: Oded Gabbay @ 2014-08-13 14:09 UTC (permalink / raw)
  To: Daniel Vetter, Thomas Hellstrom, Jérôme Glisse
  Cc: Michel Dänzer, dri-devel, Konrad Rzeszutek Wilk



On 13/08/14 16:01, Daniel Vetter wrote:
> On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
>> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
>>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
>>>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
>>>>> From: Jérôme Glisse <jglisse@redhat.com>
>>>>>
>>>>> When experiencing memory pressure we want to minimize pool size so that
>>>>> memory we just shrinked is not added back again just as the next thing.
>>>>>
>>>>> This will divide by 2 the maximum pool size for each device each time
>>>>> the pool have to shrink. The limit is bumped again is next allocation
>>>>> happen after one second since the last shrink. The one second delay is
>>>>> obviously an arbitrary choice.
>>>> Jérôme,
>>>>
>>>> I don't like this patch. It adds extra complexity and its usefulness is
>>>> highly questionable.
>>>> There are a number of caches in the system, and if all of them added
>>>> some sort of voluntary shrink heuristics like this, we'd end up with
>>>> impossible-to-debug unpredictable performance issues.
>>>>
>>>> We should let the memory subsystem decide when to reclaim pages from
>>>> caches and what caches to reclaim them from.
>>> Yeah, artificially limiting your cache from growing when your shrinker
>>> gets called will just break the equal-memory pressure the core mm uses to
>>> rebalance between all caches when workload changes. In i915 we let
>>> everything grow without artificial bounds and only rely upon the shrinker
>>> callbacks to ensure we don't consume more than our fair share of available
>>> memory overall.
>>> -Daniel
>>
>> Now when you bring i915 memory usage up, Daniel,
>> I can't refrain from bringing up the old user-space unreclaimable kernel
>> memory issue, for which gem open is a good example ;) Each time
>> user-space opens a gem handle, some un-reclaimable kernel memory is
>> allocated, for which there is no accounting, so theoretically I think a
>> user can bring a system to unusability this way.
>>
>> Typically there are various limits on unreclaimable objects like this,
>> like open file descriptors, and IIRC the kernel even has an internal
>> limit on the number of struct files you initialize, based on the
>> available system memory, so dma-buf / prime should already have some
>> sort of protection.
>
> Oh yeah, we have zero cgroups limits or similar stuff for gem allocations,
> so there's not really a way to isolate gpu memory usage in a sane way for
> specific processes. But there's also zero limits on actual gpu usage
> itself (timeslices or whatever) so I guess no one asked for this yet.
>
> My comment really was about balancing mm users under the assumption that
> they're all unlimited.
> -Daniel
>
I think the point you brought up becomes very important for compute (HSA) 
processes. I still don't know how to distinguish between legitimate use of GPU 
local memory and misbehaving/malicious processes.

We have a requirement that HSA processes will be allowed to allocate and pin GPU 
local memory. They do it through an ioctl.
In the kernel driver, we have an accounting of those memory allocations, meaning 
that I can print a list of all the objects that were allocated by a certain 
process, per device.
Therefore, in theory, I can reclaim any object, but that will probably break the 
userspace app. If the app is misbehaving/malicious than that's ok, I guess. But 
how do I know that ? And what prevents that malicious app to re-spawn and do the 
same allocation again ?

	Oded

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 13:01         ` Daniel Vetter
  2014-08-13 14:09           ` Oded Gabbay
@ 2014-08-13 15:13           ` Thomas Hellstrom
  2014-08-13 16:24             ` Daniel Vetter
  2014-08-14 22:29             ` Jesse Barnes
  1 sibling, 2 replies; 29+ messages in thread
From: Thomas Hellstrom @ 2014-08-13 15:13 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Michel Dänzer, Jérôme Glisse, dri-devel,
	Konrad Rzeszutek Wilk

On 08/13/2014 03:01 PM, Daniel Vetter wrote:
> On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
>> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
>>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
>>>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
>>>>> From: Jérôme Glisse <jglisse@redhat.com>
>>>>>
>>>>> When experiencing memory pressure we want to minimize pool size so that
>>>>> memory we just shrinked is not added back again just as the next thing.
>>>>>
>>>>> This will divide by 2 the maximum pool size for each device each time
>>>>> the pool have to shrink. The limit is bumped again is next allocation
>>>>> happen after one second since the last shrink. The one second delay is
>>>>> obviously an arbitrary choice.
>>>> Jérôme,
>>>>
>>>> I don't like this patch. It adds extra complexity and its usefulness is
>>>> highly questionable.
>>>> There are a number of caches in the system, and if all of them added
>>>> some sort of voluntary shrink heuristics like this, we'd end up with
>>>> impossible-to-debug unpredictable performance issues.
>>>>
>>>> We should let the memory subsystem decide when to reclaim pages from
>>>> caches and what caches to reclaim them from.
>>> Yeah, artificially limiting your cache from growing when your shrinker
>>> gets called will just break the equal-memory pressure the core mm uses to
>>> rebalance between all caches when workload changes. In i915 we let
>>> everything grow without artificial bounds and only rely upon the shrinker
>>> callbacks to ensure we don't consume more than our fair share of available
>>> memory overall.
>>> -Daniel
>> Now when you bring i915 memory usage up, Daniel,
>> I can't refrain from bringing up the old user-space unreclaimable kernel
>> memory issue, for which gem open is a good example ;) Each time
>> user-space opens a gem handle, some un-reclaimable kernel memory is
>> allocated, for which there is no accounting, so theoretically I think a
>> user can bring a system to unusability this way.
>>
>> Typically there are various limits on unreclaimable objects like this,
>> like open file descriptors, and IIRC the kernel even has an internal
>> limit on the number of struct files you initialize, based on the
>> available system memory, so dma-buf / prime should already have some
>> sort of protection.
> Oh yeah, we have zero cgroups limits or similar stuff for gem allocations,
> so there's not really a way to isolate gpu memory usage in a sane way for
> specific processes. But there's also zero limits on actual gpu usage
> itself (timeslices or whatever) so I guess no one asked for this yet.

In its simplest form (like in TTM if correctly implemented by drivers)
this type of accounting stops non-privileged malicious GPU-users from
exhausting all system physical memory causing grief for other kernel
systems but not from causing grief for other GPU users. I think that's
the minimum level that's intended also for example also for the struct
file accounting.

> My comment really was about balancing mm users under the assumption that
> they're all unlimited.

Yeah, sorry for stealing the thread. I usually bring this up now and
again but nowadays with an exponential backoff.


> -Daniel

Thomas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 14:09           ` Oded Gabbay
@ 2014-08-13 15:19             ` Thomas Hellstrom
  2014-08-13 16:30             ` Daniel Vetter
  1 sibling, 0 replies; 29+ messages in thread
From: Thomas Hellstrom @ 2014-08-13 15:19 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: dri-devel, Jérôme Glisse, Michel Dänzer,
	Konrad Rzeszutek Wilk

On 08/13/2014 04:09 PM, Oded Gabbay wrote:
>
>
> On 13/08/14 16:01, Daniel Vetter wrote:
>> On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
>>> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
>>>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
>>>>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
>>>>>> From: Jérôme Glisse <jglisse@redhat.com>
>>>>>>
>>>>>> When experiencing memory pressure we want to minimize pool size
>>>>>> so that
>>>>>> memory we just shrinked is not added back again just as the next
>>>>>> thing.
>>>>>>
>>>>>> This will divide by 2 the maximum pool size for each device each
>>>>>> time
>>>>>> the pool have to shrink. The limit is bumped again is next
>>>>>> allocation
>>>>>> happen after one second since the last shrink. The one second
>>>>>> delay is
>>>>>> obviously an arbitrary choice.
>>>>> Jérôme,
>>>>>
>>>>> I don't like this patch. It adds extra complexity and its
>>>>> usefulness is
>>>>> highly questionable.
>>>>> There are a number of caches in the system, and if all of them added
>>>>> some sort of voluntary shrink heuristics like this, we'd end up with
>>>>> impossible-to-debug unpredictable performance issues.
>>>>>
>>>>> We should let the memory subsystem decide when to reclaim pages from
>>>>> caches and what caches to reclaim them from.
>>>> Yeah, artificially limiting your cache from growing when your shrinker
>>>> gets called will just break the equal-memory pressure the core mm
>>>> uses to
>>>> rebalance between all caches when workload changes. In i915 we let
>>>> everything grow without artificial bounds and only rely upon the
>>>> shrinker
>>>> callbacks to ensure we don't consume more than our fair share of
>>>> available
>>>> memory overall.
>>>> -Daniel
>>>
>>> Now when you bring i915 memory usage up, Daniel,
>>> I can't refrain from bringing up the old user-space unreclaimable
>>> kernel
>>> memory issue, for which gem open is a good example ;) Each time
>>> user-space opens a gem handle, some un-reclaimable kernel memory is
>>> allocated, for which there is no accounting, so theoretically I think a
>>> user can bring a system to unusability this way.
>>>
>>> Typically there are various limits on unreclaimable objects like this,
>>> like open file descriptors, and IIRC the kernel even has an internal
>>> limit on the number of struct files you initialize, based on the
>>> available system memory, so dma-buf / prime should already have some
>>> sort of protection.
>>
>> Oh yeah, we have zero cgroups limits or similar stuff for gem
>> allocations,
>> so there's not really a way to isolate gpu memory usage in a sane way
>> for
>> specific processes. But there's also zero limits on actual gpu usage
>> itself (timeslices or whatever) so I guess no one asked for this yet.
>>
>> My comment really was about balancing mm users under the assumption that
>> they're all unlimited.
>> -Daniel
>>
> I think the point you brought up becomes very important for compute
> (HSA) processes. I still don't know how to distinguish between
> legitimate use of GPU local memory and misbehaving/malicious processes.
>
> We have a requirement that HSA processes will be allowed to allocate
> and pin GPU local memory. They do it through an ioctl.
> In the kernel driver, we have an accounting of those memory
> allocations, meaning that I can print a list of all the objects that
> were allocated by a certain process, per device.
> Therefore, in theory, I can reclaim any object, but that will probably
> break the userspace app. If the app is misbehaving/malicious than
> that's ok, I guess. But how do I know that ? And what prevents that
> malicious app to re-spawn and do the same allocation again ?

If you have per-process accounting of all those memory allocations and
you need to reclaim and there's no nice way of doing it, you should
probably do it like the kernel OOM killer: You simply kill the process
that is most likely to bring back most memory (or use any other
heuristic). Typically the kernel OOM killer does that when all swap
space is exhausted, probably assuming that the process that uses most
swap space is most likely to be malicious, if there are any malicious
processes.

If the process respawns, and you run out of resources again, repeat the
process.

/Thomas


>
>     Oded

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 15:13           ` Thomas Hellstrom
@ 2014-08-13 16:24             ` Daniel Vetter
  2014-08-13 16:30               ` Alex Deucher
  2014-08-13 17:20               ` Thomas Hellstrom
  2014-08-14 22:29             ` Jesse Barnes
  1 sibling, 2 replies; 29+ messages in thread
From: Daniel Vetter @ 2014-08-13 16:24 UTC (permalink / raw)
  To: Thomas Hellstrom
  Cc: Konrad Rzeszutek Wilk, Michel Dänzer, dri-devel,
	Jérôme Glisse

On Wed, Aug 13, 2014 at 05:13:56PM +0200, Thomas Hellstrom wrote:
> On 08/13/2014 03:01 PM, Daniel Vetter wrote:
> > On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
> >> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
> >>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
> >>>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
> >>>>> From: Jérôme Glisse <jglisse@redhat.com>
> >>>>>
> >>>>> When experiencing memory pressure we want to minimize pool size so that
> >>>>> memory we just shrinked is not added back again just as the next thing.
> >>>>>
> >>>>> This will divide by 2 the maximum pool size for each device each time
> >>>>> the pool have to shrink. The limit is bumped again is next allocation
> >>>>> happen after one second since the last shrink. The one second delay is
> >>>>> obviously an arbitrary choice.
> >>>> Jérôme,
> >>>>
> >>>> I don't like this patch. It adds extra complexity and its usefulness is
> >>>> highly questionable.
> >>>> There are a number of caches in the system, and if all of them added
> >>>> some sort of voluntary shrink heuristics like this, we'd end up with
> >>>> impossible-to-debug unpredictable performance issues.
> >>>>
> >>>> We should let the memory subsystem decide when to reclaim pages from
> >>>> caches and what caches to reclaim them from.
> >>> Yeah, artificially limiting your cache from growing when your shrinker
> >>> gets called will just break the equal-memory pressure the core mm uses to
> >>> rebalance between all caches when workload changes. In i915 we let
> >>> everything grow without artificial bounds and only rely upon the shrinker
> >>> callbacks to ensure we don't consume more than our fair share of available
> >>> memory overall.
> >>> -Daniel
> >> Now when you bring i915 memory usage up, Daniel,
> >> I can't refrain from bringing up the old user-space unreclaimable kernel
> >> memory issue, for which gem open is a good example ;) Each time
> >> user-space opens a gem handle, some un-reclaimable kernel memory is
> >> allocated, for which there is no accounting, so theoretically I think a
> >> user can bring a system to unusability this way.
> >>
> >> Typically there are various limits on unreclaimable objects like this,
> >> like open file descriptors, and IIRC the kernel even has an internal
> >> limit on the number of struct files you initialize, based on the
> >> available system memory, so dma-buf / prime should already have some
> >> sort of protection.
> > Oh yeah, we have zero cgroups limits or similar stuff for gem allocations,
> > so there's not really a way to isolate gpu memory usage in a sane way for
> > specific processes. But there's also zero limits on actual gpu usage
> > itself (timeslices or whatever) so I guess no one asked for this yet.
> 
> In its simplest form (like in TTM if correctly implemented by drivers)
> this type of accounting stops non-privileged malicious GPU-users from
> exhausting all system physical memory causing grief for other kernel
> systems but not from causing grief for other GPU users. I think that's
> the minimum level that's intended also for example also for the struct
> file accounting.

I think in i915 we're fairly close on that minimal standard - interactions
with shrinkers and oom logic work decently. It starts to fall apart though
when we've actually run out of memory - if the real memory hog is a gpu
process the oom killer won't notice all that memory since it's not
accounted against processes correctly.

I don't agree that gpu process should be punished in general compared to
other subsystems in the kernel. If the user wants to use 90% of all memory
for gpu tasks then I want to make that possible, even if it means that
everything else thrashes horribly. And as long as the system recovers and
rebalances after that gpu memory hog is gone ofc. Iirc ttm currently has a
fairly arbitrary (tunable) setting to limit system memory consumption, but
I might be wrong on that.

> > My comment really was about balancing mm users under the assumption that
> > they're all unlimited.
> 
> Yeah, sorry for stealing the thread. I usually bring this up now and
> again but nowadays with an exponential backoff.

Oh I'd love to see some cgroups or similar tracking so that server users
could set sane per-process/user/task limits on how much memory/gpu time
that group is allowed to consume. It's just that I haven't seen real
demand for this and so couldn't make the time available to implement it.
So thus far my goal is to make everything work nicely for unlimited tasks
right up to the point where the OOM killer needs to step in. Past that
everything starts to fall apart, but thus far that was good enough for
desktop usage.

Maybe WebGL will finally make this important enough so that we can fix it
for real ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 14:09           ` Oded Gabbay
  2014-08-13 15:19             ` Thomas Hellstrom
@ 2014-08-13 16:30             ` Daniel Vetter
  1 sibling, 0 replies; 29+ messages in thread
From: Daniel Vetter @ 2014-08-13 16:30 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: Thomas Hellstrom, Konrad Rzeszutek Wilk, Michel Dänzer,
	dri-devel, Jérôme Glisse

On Wed, Aug 13, 2014 at 05:09:49PM +0300, Oded Gabbay wrote:
> 
> 
> On 13/08/14 16:01, Daniel Vetter wrote:
> >On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
> >>On 08/13/2014 12:42 PM, Daniel Vetter wrote:
> >>>On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
> >>>>On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
> >>>>>From: Jérôme Glisse <jglisse@redhat.com>
> >>>>>
> >>>>>When experiencing memory pressure we want to minimize pool size so that
> >>>>>memory we just shrinked is not added back again just as the next thing.
> >>>>>
> >>>>>This will divide by 2 the maximum pool size for each device each time
> >>>>>the pool have to shrink. The limit is bumped again is next allocation
> >>>>>happen after one second since the last shrink. The one second delay is
> >>>>>obviously an arbitrary choice.
> >>>>Jérôme,
> >>>>
> >>>>I don't like this patch. It adds extra complexity and its usefulness is
> >>>>highly questionable.
> >>>>There are a number of caches in the system, and if all of them added
> >>>>some sort of voluntary shrink heuristics like this, we'd end up with
> >>>>impossible-to-debug unpredictable performance issues.
> >>>>
> >>>>We should let the memory subsystem decide when to reclaim pages from
> >>>>caches and what caches to reclaim them from.
> >>>Yeah, artificially limiting your cache from growing when your shrinker
> >>>gets called will just break the equal-memory pressure the core mm uses to
> >>>rebalance between all caches when workload changes. In i915 we let
> >>>everything grow without artificial bounds and only rely upon the shrinker
> >>>callbacks to ensure we don't consume more than our fair share of available
> >>>memory overall.
> >>>-Daniel
> >>
> >>Now when you bring i915 memory usage up, Daniel,
> >>I can't refrain from bringing up the old user-space unreclaimable kernel
> >>memory issue, for which gem open is a good example ;) Each time
> >>user-space opens a gem handle, some un-reclaimable kernel memory is
> >>allocated, for which there is no accounting, so theoretically I think a
> >>user can bring a system to unusability this way.
> >>
> >>Typically there are various limits on unreclaimable objects like this,
> >>like open file descriptors, and IIRC the kernel even has an internal
> >>limit on the number of struct files you initialize, based on the
> >>available system memory, so dma-buf / prime should already have some
> >>sort of protection.
> >
> >Oh yeah, we have zero cgroups limits or similar stuff for gem allocations,
> >so there's not really a way to isolate gpu memory usage in a sane way for
> >specific processes. But there's also zero limits on actual gpu usage
> >itself (timeslices or whatever) so I guess no one asked for this yet.
> >
> >My comment really was about balancing mm users under the assumption that
> >they're all unlimited.
> >-Daniel
> >
> I think the point you brought up becomes very important for compute (HSA)
> processes. I still don't know how to distinguish between legitimate use of
> GPU local memory and misbehaving/malicious processes.
> 
> We have a requirement that HSA processes will be allowed to allocate and pin
> GPU local memory. They do it through an ioctl.
> In the kernel driver, we have an accounting of those memory allocations,
> meaning that I can print a list of all the objects that were allocated by a
> certain process, per device.
> Therefore, in theory, I can reclaim any object, but that will probably break
> the userspace app. If the app is misbehaving/malicious than that's ok, I
> guess. But how do I know that ? And what prevents that malicious app to
> re-spawn and do the same allocation again ?

You can't do that in the kernel, this is policy decisions which is
userspaces job. But what we instead need to allow is to properly track
memory allocations so that memory limits can be set with cgroups. With SVM
you get that for free. Without SVM we need some work in that area since
currently the memory accounting for gem/ttm drivers is broken.

The other bit is limits for wasting gpu time, and I guess for that we want
a new gpu time cgroup system so that users can set soft/hard limits for
different gpgpu tasks on servers.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 16:24             ` Daniel Vetter
@ 2014-08-13 16:30               ` Alex Deucher
  2014-08-13 16:38                 ` Daniel Vetter
  2014-08-13 17:38                 ` Thomas Hellstrom
  2014-08-13 17:20               ` Thomas Hellstrom
  1 sibling, 2 replies; 29+ messages in thread
From: Alex Deucher @ 2014-08-13 16:30 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jérôme Glisse, Thomas Hellstrom, Michel Dänzer,
	Maling list - DRI developers, Konrad Rzeszutek Wilk

On Wed, Aug 13, 2014 at 12:24 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Wed, Aug 13, 2014 at 05:13:56PM +0200, Thomas Hellstrom wrote:
>> On 08/13/2014 03:01 PM, Daniel Vetter wrote:
>> > On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
>> >> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
>> >>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
>> >>>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
>> >>>>> From: Jérôme Glisse <jglisse@redhat.com>
>> >>>>>
>> >>>>> When experiencing memory pressure we want to minimize pool size so that
>> >>>>> memory we just shrinked is not added back again just as the next thing.
>> >>>>>
>> >>>>> This will divide by 2 the maximum pool size for each device each time
>> >>>>> the pool have to shrink. The limit is bumped again is next allocation
>> >>>>> happen after one second since the last shrink. The one second delay is
>> >>>>> obviously an arbitrary choice.
>> >>>> Jérôme,
>> >>>>
>> >>>> I don't like this patch. It adds extra complexity and its usefulness is
>> >>>> highly questionable.
>> >>>> There are a number of caches in the system, and if all of them added
>> >>>> some sort of voluntary shrink heuristics like this, we'd end up with
>> >>>> impossible-to-debug unpredictable performance issues.
>> >>>>
>> >>>> We should let the memory subsystem decide when to reclaim pages from
>> >>>> caches and what caches to reclaim them from.
>> >>> Yeah, artificially limiting your cache from growing when your shrinker
>> >>> gets called will just break the equal-memory pressure the core mm uses to
>> >>> rebalance between all caches when workload changes. In i915 we let
>> >>> everything grow without artificial bounds and only rely upon the shrinker
>> >>> callbacks to ensure we don't consume more than our fair share of available
>> >>> memory overall.
>> >>> -Daniel
>> >> Now when you bring i915 memory usage up, Daniel,
>> >> I can't refrain from bringing up the old user-space unreclaimable kernel
>> >> memory issue, for which gem open is a good example ;) Each time
>> >> user-space opens a gem handle, some un-reclaimable kernel memory is
>> >> allocated, for which there is no accounting, so theoretically I think a
>> >> user can bring a system to unusability this way.
>> >>
>> >> Typically there are various limits on unreclaimable objects like this,
>> >> like open file descriptors, and IIRC the kernel even has an internal
>> >> limit on the number of struct files you initialize, based on the
>> >> available system memory, so dma-buf / prime should already have some
>> >> sort of protection.
>> > Oh yeah, we have zero cgroups limits or similar stuff for gem allocations,
>> > so there's not really a way to isolate gpu memory usage in a sane way for
>> > specific processes. But there's also zero limits on actual gpu usage
>> > itself (timeslices or whatever) so I guess no one asked for this yet.
>>
>> In its simplest form (like in TTM if correctly implemented by drivers)
>> this type of accounting stops non-privileged malicious GPU-users from
>> exhausting all system physical memory causing grief for other kernel
>> systems but not from causing grief for other GPU users. I think that's
>> the minimum level that's intended also for example also for the struct
>> file accounting.
>
> I think in i915 we're fairly close on that minimal standard - interactions
> with shrinkers and oom logic work decently. It starts to fall apart though
> when we've actually run out of memory - if the real memory hog is a gpu
> process the oom killer won't notice all that memory since it's not
> accounted against processes correctly.
>
> I don't agree that gpu process should be punished in general compared to
> other subsystems in the kernel. If the user wants to use 90% of all memory
> for gpu tasks then I want to make that possible, even if it means that
> everything else thrashes horribly. And as long as the system recovers and
> rebalances after that gpu memory hog is gone ofc. Iirc ttm currently has a
> fairly arbitrary (tunable) setting to limit system memory consumption, but
> I might be wrong on that.

Yes, it currently limits you to half of memory, but at least we would
like to make it tuneable since there are a lot of user cases where the
user wants to use 90% of memory for GPU tasks at the expense of
everything else.

Alex

>
>> > My comment really was about balancing mm users under the assumption that
>> > they're all unlimited.
>>
>> Yeah, sorry for stealing the thread. I usually bring this up now and
>> again but nowadays with an exponential backoff.
>
> Oh I'd love to see some cgroups or similar tracking so that server users
> could set sane per-process/user/task limits on how much memory/gpu time
> that group is allowed to consume. It's just that I haven't seen real
> demand for this and so couldn't make the time available to implement it.
> So thus far my goal is to make everything work nicely for unlimited tasks
> right up to the point where the OOM killer needs to step in. Past that
> everything starts to fall apart, but thus far that was good enough for
> desktop usage.
>
> Maybe WebGL will finally make this important enough so that we can fix it
> for real ...
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 16:30               ` Alex Deucher
@ 2014-08-13 16:38                 ` Daniel Vetter
  2014-08-13 16:45                   ` Daniel Vetter
  2014-08-13 17:38                 ` Thomas Hellstrom
  1 sibling, 1 reply; 29+ messages in thread
From: Daniel Vetter @ 2014-08-13 16:38 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Thomas Hellstrom, Konrad Rzeszutek Wilk, Michel Dänzer,
	Maling list - DRI developers, Jérôme Glisse

On Wed, Aug 13, 2014 at 12:30:45PM -0400, Alex Deucher wrote:
> On Wed, Aug 13, 2014 at 12:24 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Wed, Aug 13, 2014 at 05:13:56PM +0200, Thomas Hellstrom wrote:
> >> On 08/13/2014 03:01 PM, Daniel Vetter wrote:
> >> > On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
> >> >> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
> >> >>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
> >> >>>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
> >> >>>>> From: Jérôme Glisse <jglisse@redhat.com>
> >> >>>>>
> >> >>>>> When experiencing memory pressure we want to minimize pool size so that
> >> >>>>> memory we just shrinked is not added back again just as the next thing.
> >> >>>>>
> >> >>>>> This will divide by 2 the maximum pool size for each device each time
> >> >>>>> the pool have to shrink. The limit is bumped again is next allocation
> >> >>>>> happen after one second since the last shrink. The one second delay is
> >> >>>>> obviously an arbitrary choice.
> >> >>>> Jérôme,
> >> >>>>
> >> >>>> I don't like this patch. It adds extra complexity and its usefulness is
> >> >>>> highly questionable.
> >> >>>> There are a number of caches in the system, and if all of them added
> >> >>>> some sort of voluntary shrink heuristics like this, we'd end up with
> >> >>>> impossible-to-debug unpredictable performance issues.
> >> >>>>
> >> >>>> We should let the memory subsystem decide when to reclaim pages from
> >> >>>> caches and what caches to reclaim them from.
> >> >>> Yeah, artificially limiting your cache from growing when your shrinker
> >> >>> gets called will just break the equal-memory pressure the core mm uses to
> >> >>> rebalance between all caches when workload changes. In i915 we let
> >> >>> everything grow without artificial bounds and only rely upon the shrinker
> >> >>> callbacks to ensure we don't consume more than our fair share of available
> >> >>> memory overall.
> >> >>> -Daniel
> >> >> Now when you bring i915 memory usage up, Daniel,
> >> >> I can't refrain from bringing up the old user-space unreclaimable kernel
> >> >> memory issue, for which gem open is a good example ;) Each time
> >> >> user-space opens a gem handle, some un-reclaimable kernel memory is
> >> >> allocated, for which there is no accounting, so theoretically I think a
> >> >> user can bring a system to unusability this way.
> >> >>
> >> >> Typically there are various limits on unreclaimable objects like this,
> >> >> like open file descriptors, and IIRC the kernel even has an internal
> >> >> limit on the number of struct files you initialize, based on the
> >> >> available system memory, so dma-buf / prime should already have some
> >> >> sort of protection.
> >> > Oh yeah, we have zero cgroups limits or similar stuff for gem allocations,
> >> > so there's not really a way to isolate gpu memory usage in a sane way for
> >> > specific processes. But there's also zero limits on actual gpu usage
> >> > itself (timeslices or whatever) so I guess no one asked for this yet.
> >>
> >> In its simplest form (like in TTM if correctly implemented by drivers)
> >> this type of accounting stops non-privileged malicious GPU-users from
> >> exhausting all system physical memory causing grief for other kernel
> >> systems but not from causing grief for other GPU users. I think that's
> >> the minimum level that's intended also for example also for the struct
> >> file accounting.
> >
> > I think in i915 we're fairly close on that minimal standard - interactions
> > with shrinkers and oom logic work decently. It starts to fall apart though
> > when we've actually run out of memory - if the real memory hog is a gpu
> > process the oom killer won't notice all that memory since it's not
> > accounted against processes correctly.
> >
> > I don't agree that gpu process should be punished in general compared to
> > other subsystems in the kernel. If the user wants to use 90% of all memory
> > for gpu tasks then I want to make that possible, even if it means that
> > everything else thrashes horribly. And as long as the system recovers and
> > rebalances after that gpu memory hog is gone ofc. Iirc ttm currently has a
> > fairly arbitrary (tunable) setting to limit system memory consumption, but
> > I might be wrong on that.
> 
> Yes, it currently limits you to half of memory, but at least we would
> like to make it tuneable since there are a lot of user cases where the
> user wants to use 90% of memory for GPU tasks at the expense of
> everything else.

Ime a lot of fun stuff starts to happen when you go there. We have piles
of memory thrashing testcases and generally had lots of fun with our
shrinker, so I think until you've really beaten onto those paths in
ttm+radeon I'd keep the limit where it is.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 16:38                 ` Daniel Vetter
@ 2014-08-13 16:45                   ` Daniel Vetter
  0 siblings, 0 replies; 29+ messages in thread
From: Daniel Vetter @ 2014-08-13 16:45 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Thomas Hellstrom, Konrad Rzeszutek Wilk, Michel Dänzer,
	Maling list - DRI developers, Jérôme Glisse

On Wed, Aug 13, 2014 at 6:38 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>> Yes, it currently limits you to half of memory, but at least we would
>> like to make it tuneable since there are a lot of user cases where the
>> user wants to use 90% of memory for GPU tasks at the expense of
>> everything else.
>
> Ime a lot of fun stuff starts to happen when you go there. We have piles
> of memory thrashing testcases and generally had lots of fun with our
> shrinker, so I think until you've really beaten onto those paths in
> ttm+radeon I'd keep the limit where it is.

One example that already starts if you go above 50% is that by default
the dirty pagecache is limited to 40% of memory. Above that you start
to stall in writeback, but gpus are really fast at re-dirtying memory.
So we've seen cases where the core mm OOMed with essentially 90% of
memory on writeback and piles of free swap. Waiting a few seconds for
the SSD to catch up would have gotten it out of that tight spot
without killing any process. One side-effect of such fun is that
memory allocations start to fail in really interesting places, and you
need to pile in hacks so make it all a bit more synchronous to avoid
the core mm freaking out.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 16:24             ` Daniel Vetter
  2014-08-13 16:30               ` Alex Deucher
@ 2014-08-13 17:20               ` Thomas Hellstrom
  1 sibling, 0 replies; 29+ messages in thread
From: Thomas Hellstrom @ 2014-08-13 17:20 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Michel Dänzer, Jérôme Glisse, dri-devel,
	Konrad Rzeszutek Wilk

On 08/13/2014 06:24 PM, Daniel Vetter wrote:
> On Wed, Aug 13, 2014 at 05:13:56PM +0200, Thomas Hellstrom wrote:
>> On 08/13/2014 03:01 PM, Daniel Vetter wrote:
>>> On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
>>>> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
>>>>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
>>>>>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
>>>>>>> From: Jérôme Glisse <jglisse@redhat.com>
>>>>>>>
>>>>>>> When experiencing memory pressure we want to minimize pool size so that
>>>>>>> memory we just shrinked is not added back again just as the next thing.
>>>>>>>
>>>>>>> This will divide by 2 the maximum pool size for each device each time
>>>>>>> the pool have to shrink. The limit is bumped again is next allocation
>>>>>>> happen after one second since the last shrink. The one second delay is
>>>>>>> obviously an arbitrary choice.
>>>>>> Jérôme,
>>>>>>
>>>>>> I don't like this patch. It adds extra complexity and its usefulness is
>>>>>> highly questionable.
>>>>>> There are a number of caches in the system, and if all of them added
>>>>>> some sort of voluntary shrink heuristics like this, we'd end up with
>>>>>> impossible-to-debug unpredictable performance issues.
>>>>>>
>>>>>> We should let the memory subsystem decide when to reclaim pages from
>>>>>> caches and what caches to reclaim them from.
>>>>> Yeah, artificially limiting your cache from growing when your shrinker
>>>>> gets called will just break the equal-memory pressure the core mm uses to
>>>>> rebalance between all caches when workload changes. In i915 we let
>>>>> everything grow without artificial bounds and only rely upon the shrinker
>>>>> callbacks to ensure we don't consume more than our fair share of available
>>>>> memory overall.
>>>>> -Daniel
>>>> Now when you bring i915 memory usage up, Daniel,
>>>> I can't refrain from bringing up the old user-space unreclaimable kernel
>>>> memory issue, for which gem open is a good example ;) Each time
>>>> user-space opens a gem handle, some un-reclaimable kernel memory is
>>>> allocated, for which there is no accounting, so theoretically I think a
>>>> user can bring a system to unusability this way.
>>>>
>>>> Typically there are various limits on unreclaimable objects like this,
>>>> like open file descriptors, and IIRC the kernel even has an internal
>>>> limit on the number of struct files you initialize, based on the
>>>> available system memory, so dma-buf / prime should already have some
>>>> sort of protection.
>>> Oh yeah, we have zero cgroups limits or similar stuff for gem allocations,
>>> so there's not really a way to isolate gpu memory usage in a sane way for
>>> specific processes. But there's also zero limits on actual gpu usage
>>> itself (timeslices or whatever) so I guess no one asked for this yet.
>> In its simplest form (like in TTM if correctly implemented by drivers)
>> this type of accounting stops non-privileged malicious GPU-users from
>> exhausting all system physical memory causing grief for other kernel
>> systems but not from causing grief for other GPU users. I think that's
>> the minimum level that's intended also for example also for the struct
>> file accounting.
> I think in i915 we're fairly close on that minimal standard - interactions
> with shrinkers and oom logic work decently. It starts to fall apart though
> when we've actually run out of memory - if the real memory hog is a gpu
> process the oom killer won't notice all that memory since it's not
> accounted against processes correctly.
>
> I don't agree that gpu process should be punished in general compared to
> other subsystems in the kernel. If the user wants to use 90% of all memory
> for gpu tasks then I want to make that possible, even if it means that
> everything else thrashes horribly. And as long as the system recovers and
> rebalances after that gpu memory hog is gone ofc. Iirc ttm currently has a
> fairly arbitrary (tunable) setting to limit system memory consumption, but
> I might be wrong on that.

No, that's correct, or rather it's intended to limit pinned
unreclaimable system memory (though part of what's unreclaimable could
actually be made reclaimable if we'd implement another shrinker level).

>>> My comment really was about balancing mm users under the assumption that
>>> they're all unlimited.
>> Yeah, sorry for stealing the thread. I usually bring this up now and
>> again but nowadays with an exponential backoff.
> Oh I'd love to see some cgroups or similar tracking so that server users
> could set sane per-process/user/task limits on how much memory/gpu time
> that group is allowed to consume. It's just that I haven't seen real
> demand for this and so couldn't make the time available to implement it.
> So thus far my goal is to make everything work nicely for unlimited tasks
> right up to the point where the OOM killer needs to step in. Past that
> everything starts to fall apart, but thus far that was good enough for
> desktop usage.

Well I'm not sure if things have changed but last time (a couple of
years ago) I looked at this situation (kernel out of physical memory but
a fair amount of swap space left) the OOM killer was never invoked, so a
number of more or less critical kernel systems (disk I/O, paging,
networking) where getting -ENOMEM and hitting rarely tested error paths.
A state you don't want to have the kernel in. Now the OOM algorithm may
of course have changed since then.

My point is that with unaccounted constructs like gem-open-from-name it
should be easy for any unpriviliged authenticated gem client to pin all
kernel physical memory, put the kernel in that state and keep it there,
and IMO a kernel-user space interface shouldn't allow that.

/Thomas

>
> Maybe WebGL will finally make this important enough so that we can fix it
> for real ...
> -Daniel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 16:30               ` Alex Deucher
  2014-08-13 16:38                 ` Daniel Vetter
@ 2014-08-13 17:38                 ` Thomas Hellstrom
  1 sibling, 0 replies; 29+ messages in thread
From: Thomas Hellstrom @ 2014-08-13 17:38 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Jérôme Glisse, Michel Dänzer,
	Maling list - DRI developers, Konrad Rzeszutek Wilk

On 08/13/2014 06:30 PM, Alex Deucher wrote:
> On Wed, Aug 13, 2014 at 12:24 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Wed, Aug 13, 2014 at 05:13:56PM +0200, Thomas Hellstrom wrote:
>>> On 08/13/2014 03:01 PM, Daniel Vetter wrote:
>>>> On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
>>>>> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
>>>>>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
>>>>>>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
>>>>>>>> From: Jérôme Glisse <jglisse@redhat.com>
>>>>>>>>
>>>>>>>> When experiencing memory pressure we want to minimize pool size so that
>>>>>>>> memory we just shrinked is not added back again just as the next thing.
>>>>>>>>
>>>>>>>> This will divide by 2 the maximum pool size for each device each time
>>>>>>>> the pool have to shrink. The limit is bumped again is next allocation
>>>>>>>> happen after one second since the last shrink. The one second delay is
>>>>>>>> obviously an arbitrary choice.
>>>>>>> Jérôme,
>>>>>>>
>>>>>>> I don't like this patch. It adds extra complexity and its usefulness is
>>>>>>> highly questionable.
>>>>>>> There are a number of caches in the system, and if all of them added
>>>>>>> some sort of voluntary shrink heuristics like this, we'd end up with
>>>>>>> impossible-to-debug unpredictable performance issues.
>>>>>>>
>>>>>>> We should let the memory subsystem decide when to reclaim pages from
>>>>>>> caches and what caches to reclaim them from.
>>>>>> Yeah, artificially limiting your cache from growing when your shrinker
>>>>>> gets called will just break the equal-memory pressure the core mm uses to
>>>>>> rebalance between all caches when workload changes. In i915 we let
>>>>>> everything grow without artificial bounds and only rely upon the shrinker
>>>>>> callbacks to ensure we don't consume more than our fair share of available
>>>>>> memory overall.
>>>>>> -Daniel
>>>>> Now when you bring i915 memory usage up, Daniel,
>>>>> I can't refrain from bringing up the old user-space unreclaimable kernel
>>>>> memory issue, for which gem open is a good example ;) Each time
>>>>> user-space opens a gem handle, some un-reclaimable kernel memory is
>>>>> allocated, for which there is no accounting, so theoretically I think a
>>>>> user can bring a system to unusability this way.
>>>>>
>>>>> Typically there are various limits on unreclaimable objects like this,
>>>>> like open file descriptors, and IIRC the kernel even has an internal
>>>>> limit on the number of struct files you initialize, based on the
>>>>> available system memory, so dma-buf / prime should already have some
>>>>> sort of protection.
>>>> Oh yeah, we have zero cgroups limits or similar stuff for gem allocations,
>>>> so there's not really a way to isolate gpu memory usage in a sane way for
>>>> specific processes. But there's also zero limits on actual gpu usage
>>>> itself (timeslices or whatever) so I guess no one asked for this yet.
>>> In its simplest form (like in TTM if correctly implemented by drivers)
>>> this type of accounting stops non-privileged malicious GPU-users from
>>> exhausting all system physical memory causing grief for other kernel
>>> systems but not from causing grief for other GPU users. I think that's
>>> the minimum level that's intended also for example also for the struct
>>> file accounting.
>> I think in i915 we're fairly close on that minimal standard - interactions
>> with shrinkers and oom logic work decently. It starts to fall apart though
>> when we've actually run out of memory - if the real memory hog is a gpu
>> process the oom killer won't notice all that memory since it's not
>> accounted against processes correctly.
>>
>> I don't agree that gpu process should be punished in general compared to
>> other subsystems in the kernel. If the user wants to use 90% of all memory
>> for gpu tasks then I want to make that possible, even if it means that
>> everything else thrashes horribly. And as long as the system recovers and
>> rebalances after that gpu memory hog is gone ofc. Iirc ttm currently has a
>> fairly arbitrary (tunable) setting to limit system memory consumption, but
>> I might be wrong on that.
> Yes, it currently limits you to half of memory, but at least we would
> like to make it tuneable since there are a lot of user cases where the
> user wants to use 90% of memory for GPU tasks at the expense of
> everything else.
>
> Alex
>

It's in /sys/devices/virtual/drm/ttm/memory_accounting/*

Run-time tunable, but you should probably write an app to tune if you
want to hand out to users, since if you up the limit, you probably want
to modify a number of values.

zone_memory: ro: Total memory in the zone.
used_memory: ro: Currently pinned memory.
available_memory: rw: Allocation limit.
emergency_memory: rw: Allocation limit for CAP_SYS_ADMIN
swap_limit: rw: Swapper thread starts at this limit.

/Thomas








_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
  2014-08-13 15:13           ` Thomas Hellstrom
  2014-08-13 16:24             ` Daniel Vetter
@ 2014-08-14 22:29             ` Jesse Barnes
  1 sibling, 0 replies; 29+ messages in thread
From: Jesse Barnes @ 2014-08-14 22:29 UTC (permalink / raw)
  To: Thomas Hellstrom
  Cc: dri-devel, Michel Dänzer, Jérôme Glisse,
	Konrad Rzeszutek Wilk

On Wed, 13 Aug 2014 17:13:56 +0200
Thomas Hellstrom <thellstrom@vmware.com> wrote:

> On 08/13/2014 03:01 PM, Daniel Vetter wrote:
> > On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
> >> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
> >>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
> >>>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
> >>>>> From: Jérôme Glisse <jglisse@redhat.com>
> >>>>>
> >>>>> When experiencing memory pressure we want to minimize pool size so that
> >>>>> memory we just shrinked is not added back again just as the next thing.
> >>>>>
> >>>>> This will divide by 2 the maximum pool size for each device each time
> >>>>> the pool have to shrink. The limit is bumped again is next allocation
> >>>>> happen after one second since the last shrink. The one second delay is
> >>>>> obviously an arbitrary choice.
> >>>> Jérôme,
> >>>>
> >>>> I don't like this patch. It adds extra complexity and its usefulness is
> >>>> highly questionable.
> >>>> There are a number of caches in the system, and if all of them added
> >>>> some sort of voluntary shrink heuristics like this, we'd end up with
> >>>> impossible-to-debug unpredictable performance issues.
> >>>>
> >>>> We should let the memory subsystem decide when to reclaim pages from
> >>>> caches and what caches to reclaim them from.
> >>> Yeah, artificially limiting your cache from growing when your shrinker
> >>> gets called will just break the equal-memory pressure the core mm uses to
> >>> rebalance between all caches when workload changes. In i915 we let
> >>> everything grow without artificial bounds and only rely upon the shrinker
> >>> callbacks to ensure we don't consume more than our fair share of available
> >>> memory overall.
> >>> -Daniel
> >> Now when you bring i915 memory usage up, Daniel,
> >> I can't refrain from bringing up the old user-space unreclaimable kernel
> >> memory issue, for which gem open is a good example ;) Each time
> >> user-space opens a gem handle, some un-reclaimable kernel memory is
> >> allocated, for which there is no accounting, so theoretically I think a
> >> user can bring a system to unusability this way.
> >>
> >> Typically there are various limits on unreclaimable objects like this,
> >> like open file descriptors, and IIRC the kernel even has an internal
> >> limit on the number of struct files you initialize, based on the
> >> available system memory, so dma-buf / prime should already have some
> >> sort of protection.
> > Oh yeah, we have zero cgroups limits or similar stuff for gem allocations,
> > so there's not really a way to isolate gpu memory usage in a sane way for
> > specific processes. But there's also zero limits on actual gpu usage
> > itself (timeslices or whatever) so I guess no one asked for this yet.
> 
> In its simplest form (like in TTM if correctly implemented by drivers)
> this type of accounting stops non-privileged malicious GPU-users from
> exhausting all system physical memory causing grief for other kernel
> systems but not from causing grief for other GPU users. I think that's
> the minimum level that's intended also for example also for the struct
> file accounting.
> 
> > My comment really was about balancing mm users under the assumption that
> > they're all unlimited.
> 
> Yeah, sorry for stealing the thread. I usually bring this up now and
> again but nowadays with an exponential backoff.

Yeah I agree we're missing some good limits stuff in i915 and DRM in
general probably.  Chris started looking at this awhile back, but I
haven't seen anything recently.  Tying into the ulimits/rlimits might
make sense, and at the very least we need to account for things
properly so we can add new limits where needed.

-- 
Jesse Barnes, Intel Open Source Technology Center
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool.
  2014-08-13  3:52 ` [PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool Jérôme Glisse
@ 2015-03-25 19:06   ` Konrad Rzeszutek Wilk
  2015-07-06  9:11   ` Michel Dänzer
  1 sibling, 0 replies; 29+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-03-25 19:06 UTC (permalink / raw)
  To: Jérôme Glisse
  Cc: Jérôme Glisse, Thomas Hellstrom, dri-devel

On Tue, Aug 12, 2014 at 11:52:05PM -0400, Jérôme Glisse wrote:
> From: Jérôme Glisse <jglisse@redhat.com>
> 
> Current code never allowed the page pool to actualy fill in anyway. This fix
> it and also allow it to grow over its limit until it grow beyond the batch
> size for allocation and deallocation.
> 
> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> Tested-by: Michel Dänzer <michel@daenzer.net>
> Cc: Thomas Hellstrom <thellstrom@vmware.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> index c96db43..a076ff3 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> @@ -953,14 +953,9 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
>  	} else {
>  		pool->npages_free += count;
>  		list_splice(&ttm_dma->pages_list, &pool->free_list);
> -		npages = count;
> -		if (pool->npages_free > _manager->options.max_size) {
> +		if (pool->npages_free >= (_manager->options.max_size +
> +					  NUM_PAGES_TO_ALLOC))
>  			npages = pool->npages_free - _manager->options.max_size;
> -			/* free at least NUM_PAGES_TO_ALLOC number of pages
> -			 * to reduce calls to set_memory_wb */
> -			if (npages < NUM_PAGES_TO_ALLOC)
> -				npages = NUM_PAGES_TO_ALLOC;
> -		}
>  	}
>  	spin_unlock_irqrestore(&pool->lock, irq_flags);
>  
> -- 
> 1.9.3
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool.
  2014-08-13  3:52 ` [PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool Jérôme Glisse
  2015-03-25 19:06   ` Konrad Rzeszutek Wilk
@ 2015-07-06  9:11   ` Michel Dänzer
  2015-07-06 16:10     ` Jerome Glisse
  1 sibling, 1 reply; 29+ messages in thread
From: Michel Dänzer @ 2015-07-06  9:11 UTC (permalink / raw)
  To: Jérôme Glisse
  Cc: Thomas Hellstrom, Jack.Xiao, dri-devel, Konrad Rzeszutek Wilk


Hi Jérôme,


On 13.08.2014 12:52, Jérôme Glisse wrote:
> From: Jérôme Glisse <jglisse@redhat.com>
> 
> Current code never allowed the page pool to actualy fill in anyway. This fix
> it and also allow it to grow over its limit until it grow beyond the batch
> size for allocation and deallocation.
> 
> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> Tested-by: Michel Dänzer <michel@daenzer.net>
> Cc: Thomas Hellstrom <thellstrom@vmware.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> index c96db43..a076ff3 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> @@ -953,14 +953,9 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
>  	} else {
>  		pool->npages_free += count;
>  		list_splice(&ttm_dma->pages_list, &pool->free_list);
> -		npages = count;
> -		if (pool->npages_free > _manager->options.max_size) {
> +		if (pool->npages_free >= (_manager->options.max_size +
> +					  NUM_PAGES_TO_ALLOC))
>  			npages = pool->npages_free - _manager->options.max_size;
> -			/* free at least NUM_PAGES_TO_ALLOC number of pages
> -			 * to reduce calls to set_memory_wb */
> -			if (npages < NUM_PAGES_TO_ALLOC)
> -				npages = NUM_PAGES_TO_ALLOC;
> -		}
>  	}
>  	spin_unlock_irqrestore(&pool->lock, irq_flags);
>  
> 

Colleagues of mine have measured significant performance gains for some
workloads with this patch. Without it, a lot of CPU cycles are spent
changing the caching attributes of pages on allocation.

Note that the performance effect seems to mostly disappear when applying
patch 1 as well, so apparently 64MB is too small for the max pool size.

Is there any chance this patch could be applied without the
controversial patch 3? If not, do you have ideas for addressing the
concerns raised against patch 3?


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool.
  2015-07-06  9:11   ` Michel Dänzer
@ 2015-07-06 16:10     ` Jerome Glisse
  2015-07-07  6:39       ` Michel Dänzer
  0 siblings, 1 reply; 29+ messages in thread
From: Jerome Glisse @ 2015-07-06 16:10 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Thomas Hellstrom, Jack.Xiao, dri-devel, Konrad Rzeszutek Wilk

On Mon, Jul 06, 2015 at 06:11:29PM +0900, Michel Dänzer wrote:
> 
> Hi Jérôme,
> 
> On 13.08.2014 12:52, Jérôme Glisse wrote:
> > From: Jérôme Glisse <jglisse@redhat.com>
> > 
> > Current code never allowed the page pool to actualy fill in anyway. This fix
> > it and also allow it to grow over its limit until it grow beyond the batch
> > size for allocation and deallocation.
> > 
> > Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> > Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> > Tested-by: Michel Dänzer <michel@daenzer.net>
> > Cc: Thomas Hellstrom <thellstrom@vmware.com>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > ---
> >  drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 9 ++-------
> >  1 file changed, 2 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> > index c96db43..a076ff3 100644
> > --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> > +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> > @@ -953,14 +953,9 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
> >  	} else {
> >  		pool->npages_free += count;
> >  		list_splice(&ttm_dma->pages_list, &pool->free_list);
> > -		npages = count;
> > -		if (pool->npages_free > _manager->options.max_size) {
> > +		if (pool->npages_free >= (_manager->options.max_size +
> > +					  NUM_PAGES_TO_ALLOC))
> >  			npages = pool->npages_free - _manager->options.max_size;
> > -			/* free at least NUM_PAGES_TO_ALLOC number of pages
> > -			 * to reduce calls to set_memory_wb */
> > -			if (npages < NUM_PAGES_TO_ALLOC)
> > -				npages = NUM_PAGES_TO_ALLOC;
> > -		}
> >  	}
> >  	spin_unlock_irqrestore(&pool->lock, irq_flags);
> >  
> > 
> 
> Colleagues of mine have measured significant performance gains for some
> workloads with this patch. Without it, a lot of CPU cycles are spent
> changing the caching attributes of pages on allocation.
> 
> Note that the performance effect seems to mostly disappear when applying
> patch 1 as well, so apparently 64MB is too small for the max pool size.
> 
> Is there any chance this patch could be applied without the
> controversial patch 3? If not, do you have ideas for addressing the
> concerns raised against patch 3?

Wahou, now i need to find the keys to the DeLorean to travel back in time.

This patch is a fix and should be applied without 1 or 3. Because today
basicly the pool is always emptied and never fill up. But as Thomas pointed
out there is already bit too much pool accross the stack. Proper solution
would be to work something inside the mm level or the architecture (i assume
AMD is mostly interested in x86 on that front).

Here the whole issue is really about allocating page with WC/UC cache
properties. Changing cache properties on page is really costly on several
level, like the kernel needs to break the huge linear mapping and populate
lower level to remap the page with proper cache attribute inside the kernel
mapping.

As far as i remember the kernel never goes back to huge page mapping when
restoring page cache attribute, which meaning that little by litte with
uptime you loose the whole huge page mapping benefit for everything and you
waste more memory.

Anyway just wanted to dump here my recolection and how i think patch 1 & 3
should be replaced by simply moving down this allocation infrastructure
inside core mm code. Where it should always have been.

In meantime i think we need to apply this patch as it is really a fix to
make the code actually do what the comment and design pretends it does.

Cheers,
Jérôme
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool.
  2015-07-06 16:10     ` Jerome Glisse
@ 2015-07-07  6:39       ` Michel Dänzer
  2015-07-07 17:41         ` Jerome Glisse
  0 siblings, 1 reply; 29+ messages in thread
From: Michel Dänzer @ 2015-07-07  6:39 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Thomas Hellstrom, Jack.Xiao, dri-devel, Konrad Rzeszutek Wilk

On 07.07.2015 01:10, Jerome Glisse wrote:
> On Mon, Jul 06, 2015 at 06:11:29PM +0900, Michel Dänzer wrote:
>>
>> Hi Jérôme,
>>
>> On 13.08.2014 12:52, Jérôme Glisse wrote:
>>> From: Jérôme Glisse <jglisse@redhat.com>
>>>
>>> Current code never allowed the page pool to actualy fill in anyway. This fix
>>> it and also allow it to grow over its limit until it grow beyond the batch
>>> size for allocation and deallocation.
>>>
>>> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
>>> Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
>>> Tested-by: Michel Dänzer <michel@daenzer.net>
>>> Cc: Thomas Hellstrom <thellstrom@vmware.com>
>>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>> ---
>>>  drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 9 ++-------
>>>  1 file changed, 2 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>>> index c96db43..a076ff3 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>>> @@ -953,14 +953,9 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
>>>  	} else {
>>>  		pool->npages_free += count;
>>>  		list_splice(&ttm_dma->pages_list, &pool->free_list);
>>> -		npages = count;
>>> -		if (pool->npages_free > _manager->options.max_size) {
>>> +		if (pool->npages_free >= (_manager->options.max_size +
>>> +					  NUM_PAGES_TO_ALLOC))
>>>  			npages = pool->npages_free - _manager->options.max_size;
>>> -			/* free at least NUM_PAGES_TO_ALLOC number of pages
>>> -			 * to reduce calls to set_memory_wb */
>>> -			if (npages < NUM_PAGES_TO_ALLOC)
>>> -				npages = NUM_PAGES_TO_ALLOC;
>>> -		}
>>>  	}
>>>  	spin_unlock_irqrestore(&pool->lock, irq_flags);
>>>  
>>>
>>
>> Colleagues of mine have measured significant performance gains for some
>> workloads with this patch. Without it, a lot of CPU cycles are spent
>> changing the caching attributes of pages on allocation.
>>
>> Note that the performance effect seems to mostly disappear when applying
>> patch 1 as well, so apparently 64MB is too small for the max pool size.
>>
>> Is there any chance this patch could be applied without the
>> controversial patch 3? If not, do you have ideas for addressing the
>> concerns raised against patch 3?
> 
> Wahou, now i need to find the keys to the DeLorean to travel back in time.
> 
> This patch is a fix and should be applied without 1 or 3. Because today
> basicly the pool is always emptied and never fill up. But as Thomas pointed
> out there is already bit too much pool accross the stack. Proper solution
> would be to work something inside the mm level or the architecture (i assume
> AMD is mostly interested in x86 on that front).
> 
> Here the whole issue is really about allocating page with WC/UC cache
> properties. Changing cache properties on page is really costly on several
> level, like the kernel needs to break the huge linear mapping and populate
> lower level to remap the page with proper cache attribute inside the kernel
> mapping.
> 
> As far as i remember the kernel never goes back to huge page mapping when
> restoring page cache attribute, which meaning that little by litte with
> uptime you loose the whole huge page mapping benefit for everything and you
> waste more memory.

That sounds pretty bad, but this patch might actually help a little for
that by reducing the amount of huge page mappings that need to be broken up?


> In meantime i think we need to apply this patch as it is really a fix to
> make the code actually do what the comment and design pretends it does.

I agree.

BTW, maybe it should be split up into the actual fix (removing the
npages assignment) and the NUM_PAGES_TO_ALLOC related simplification?


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool.
  2015-07-07  6:39       ` Michel Dänzer
@ 2015-07-07 17:41         ` Jerome Glisse
  2015-07-08  2:34           ` Michel Dänzer
  0 siblings, 1 reply; 29+ messages in thread
From: Jerome Glisse @ 2015-07-07 17:41 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Thomas Hellstrom, Jack.Xiao, dri-devel, Konrad Rzeszutek Wilk

On Tue, Jul 07, 2015 at 03:39:29PM +0900, Michel Dänzer wrote:
> On 07.07.2015 01:10, Jerome Glisse wrote:
> > On Mon, Jul 06, 2015 at 06:11:29PM +0900, Michel Dänzer wrote:
> >>
> >> Hi Jérôme,
> >>
> >> On 13.08.2014 12:52, Jérôme Glisse wrote:
> >>> From: Jérôme Glisse <jglisse@redhat.com>
> >>>
> >>> Current code never allowed the page pool to actualy fill in anyway. This fix
> >>> it and also allow it to grow over its limit until it grow beyond the batch
> >>> size for allocation and deallocation.
> >>>
> >>> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> >>> Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> >>> Tested-by: Michel Dänzer <michel@daenzer.net>
> >>> Cc: Thomas Hellstrom <thellstrom@vmware.com>
> >>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> >>> ---
> >>>  drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 9 ++-------
> >>>  1 file changed, 2 insertions(+), 7 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> >>> index c96db43..a076ff3 100644
> >>> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> >>> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> >>> @@ -953,14 +953,9 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
> >>>  	} else {
> >>>  		pool->npages_free += count;
> >>>  		list_splice(&ttm_dma->pages_list, &pool->free_list);
> >>> -		npages = count;
> >>> -		if (pool->npages_free > _manager->options.max_size) {
> >>> +		if (pool->npages_free >= (_manager->options.max_size +
> >>> +					  NUM_PAGES_TO_ALLOC))
> >>>  			npages = pool->npages_free - _manager->options.max_size;
> >>> -			/* free at least NUM_PAGES_TO_ALLOC number of pages
> >>> -			 * to reduce calls to set_memory_wb */
> >>> -			if (npages < NUM_PAGES_TO_ALLOC)
> >>> -				npages = NUM_PAGES_TO_ALLOC;
> >>> -		}
> >>>  	}
> >>>  	spin_unlock_irqrestore(&pool->lock, irq_flags);
> >>>  
> >>>
> >>
> >> Colleagues of mine have measured significant performance gains for some
> >> workloads with this patch. Without it, a lot of CPU cycles are spent
> >> changing the caching attributes of pages on allocation.
> >>
> >> Note that the performance effect seems to mostly disappear when applying
> >> patch 1 as well, so apparently 64MB is too small for the max pool size.
> >>
> >> Is there any chance this patch could be applied without the
> >> controversial patch 3? If not, do you have ideas for addressing the
> >> concerns raised against patch 3?
> > 
> > Wahou, now i need to find the keys to the DeLorean to travel back in time.
> > 
> > This patch is a fix and should be applied without 1 or 3. Because today
> > basicly the pool is always emptied and never fill up. But as Thomas pointed
> > out there is already bit too much pool accross the stack. Proper solution
> > would be to work something inside the mm level or the architecture (i assume
> > AMD is mostly interested in x86 on that front).
> > 
> > Here the whole issue is really about allocating page with WC/UC cache
> > properties. Changing cache properties on page is really costly on several
> > level, like the kernel needs to break the huge linear mapping and populate
> > lower level to remap the page with proper cache attribute inside the kernel
> > mapping.
> > 
> > As far as i remember the kernel never goes back to huge page mapping when
> > restoring page cache attribute, which meaning that little by litte with
> > uptime you loose the whole huge page mapping benefit for everything and you
> > waste more memory.
> 
> That sounds pretty bad, but this patch might actually help a little for
> that by reducing the amount of huge page mappings that need to be broken up?

Not really, for limiting huge page mapping breakage you would need to allocate
page in same physical cluster so that only 1 huge page mapping needs to be
broken. It would be a bit like DMA32 or DMA16 physical range. So this would
obviously need some work at the MM level. At ttm level this can be more or
less implemented using GFP_DMA32 flag on page allocation but at the same time
doing that you kind of put more pressure on the first 4G of memory and i
think nowadays with the whole webbrowser consuming several GB of texture you
probably do not want to do that.


> > In meantime i think we need to apply this patch as it is really a fix to
> > make the code actually do what the comment and design pretends it does.
> 
> I agree.
> 
> BTW, maybe it should be split up into the actual fix (removing the
> npages assignment) and the NUM_PAGES_TO_ALLOC related simplification?

This would make 2 really small patch, patch is small as it is :) But why
not.

Cheers,
Jérôme
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool.
  2015-07-07 17:41         ` Jerome Glisse
@ 2015-07-08  2:34           ` Michel Dänzer
  0 siblings, 0 replies; 29+ messages in thread
From: Michel Dänzer @ 2015-07-08  2:34 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Thomas Hellstrom, Jack.Xiao, dri-devel, Konrad Rzeszutek Wilk

On 08.07.2015 02:41, Jerome Glisse wrote:
> On Tue, Jul 07, 2015 at 03:39:29PM +0900, Michel Dänzer wrote:
>> On 07.07.2015 01:10, Jerome Glisse wrote:
>>>
>>> In meantime i think we need to apply this patch as it is really a fix to
>>> make the code actually do what the comment and design pretends it does.
>>
>> I agree.
>>
>> BTW, maybe it should be split up into the actual fix (removing the
>> npages assignment) and the NUM_PAGES_TO_ALLOC related simplification?
> 
> This would make 2 really small patch, patch is small as it is :) But why
> not.

It's not about size but about having one commit for each logical change.
It took me a while to realize that the NUM_PAGES_TO_ALLOC changes aren't
directly related to the fix, and I've seen the same thing happen to others.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2015-07-08  2:34 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-13  3:52 [PATCH 0/3] drm/ttm: hard to swim in an empty pool Jérôme Glisse
2014-08-13  3:52 ` [PATCH 1/3] drm/ttm: set sensible pool size limit Jérôme Glisse
2014-08-13  6:24   ` Michel Dänzer
2014-08-13  3:52 ` [PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool Jérôme Glisse
2015-03-25 19:06   ` Konrad Rzeszutek Wilk
2015-07-06  9:11   ` Michel Dänzer
2015-07-06 16:10     ` Jerome Glisse
2015-07-07  6:39       ` Michel Dänzer
2015-07-07 17:41         ` Jerome Glisse
2015-07-08  2:34           ` Michel Dänzer
2014-08-13  3:52 ` [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool Jérôme Glisse
2014-08-13  6:32   ` Michel Dänzer
2014-08-13  9:06   ` Thomas Hellstrom
2014-08-13 10:42     ` Daniel Vetter
2014-08-13 12:35       ` GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool) Thomas Hellstrom
2014-08-13 12:40         ` David Herrmann
2014-08-13 12:48           ` Thomas Hellstrom
2014-08-13 13:01         ` Daniel Vetter
2014-08-13 14:09           ` Oded Gabbay
2014-08-13 15:19             ` Thomas Hellstrom
2014-08-13 16:30             ` Daniel Vetter
2014-08-13 15:13           ` Thomas Hellstrom
2014-08-13 16:24             ` Daniel Vetter
2014-08-13 16:30               ` Alex Deucher
2014-08-13 16:38                 ` Daniel Vetter
2014-08-13 16:45                   ` Daniel Vetter
2014-08-13 17:38                 ` Thomas Hellstrom
2014-08-13 17:20               ` Thomas Hellstrom
2014-08-14 22:29             ` Jesse Barnes

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.