* [RFC] Per file OOM badness
@ 2018-01-18 16:47 ` Andrey Grodzovsky
  0 siblings, 0 replies; 170+ messages in thread
From: Andrey Grodzovsky @ 2018-01-18 16:47 UTC (permalink / raw)
  To: linux-kernel, linux-mm, dri-devel, amd-gfx; +Cc: Christian.Koenig

Hi, this series is a revised version of an RFC sent by Christian König
a few years ago. The original RFC can be found at 
https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html

This is the same idea; I've just addressed his concern from the original RFC
and switched to a callback in struct file_operations instead of a new member in struct file.

Thanks,
Andrey

^ permalink raw reply	[flat|nested] 170+ messages in thread

* [PATCH 1/4] fs: add OOM badness callback in file_operations struct.
  2018-01-18 16:47 ` Andrey Grodzovsky
  (?)
@ 2018-01-18 16:47   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 170+ messages in thread
From: Andrey Grodzovsky @ 2018-01-18 16:47 UTC (permalink / raw)
  To: linux-kernel, linux-mm, dri-devel, amd-gfx
  Cc: Christian.Koenig, Andrey Grodzovsky

This allows device drivers to report additional OOM badness for the memory
they allocate on behalf of userspace.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 include/linux/fs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 511fbaa..938394a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1728,6 +1728,7 @@ struct file_operations {
 			u64);
 	ssize_t (*dedupe_file_range)(struct file *, u64, u64, struct file *,
 			u64);
+	long (*oom_file_badness)(struct file *);
 } __randomize_layout;
 
 struct inode_operations {
-- 
2.7.4
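
To illustrate how a driver would use the new hook, here is a minimal sketch; the foo_* names and the per-file page counter are hypothetical and not part of this series:

struct foo_file_priv {
	atomic_long_t alloc_pages;	/* pages allocated on behalf of this file */
};

static long foo_oom_file_badness(struct file *f)
{
	struct foo_file_priv *priv = f->private_data;

	/* Report how many pages this open file pins, in the same units
	 * (pages) that oom_badness() already works with. */
	return priv ? atomic_long_read(&priv->alloc_pages) : 0;
}

static const struct file_operations foo_fops = {
	.owner            = THIS_MODULE,
	.open             = foo_open,
	.release          = foo_release,
	.unlocked_ioctl   = foo_ioctl,
	.oom_file_badness = foo_oom_file_badness,
};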

^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 2/4] oom: take per file badness into account
  2018-01-18 16:47 ` Andrey Grodzovsky
  (?)
@ 2018-01-18 16:47   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 170+ messages in thread
From: Andrey Grodzovsky @ 2018-01-18 16:47 UTC (permalink / raw)
  To: linux-kernel, linux-mm, dri-devel, amd-gfx
  Cc: Christian.Koenig, Andrey Grodzovsky

Try to make better decisions about which process to kill, based on
per-file OOM badness.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 mm/oom_kill.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 29f8555..825ed52 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -49,6 +49,8 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/oom.h>
 
+#include <linux/fdtable.h>
+
 int sysctl_panic_on_oom;
 int sysctl_oom_kill_allocating_task;
 int sysctl_oom_dump_tasks = 1;
@@ -182,6 +184,21 @@ static bool is_dump_unreclaim_slabs(void)
 }
 
 /**
+ * oom_file_badness - add per file badness
+ * @points: pointer to summed up badness points
+ * @file: task's open file
+ * @n: file descriptor id (unused)
+ */
+static int oom_file_badness(const void *points, struct file *file, unsigned n)
+{
+	if (file->f_op->oom_file_badness)
+		*((long *)points) += file->f_op->oom_file_badness(file);
+
+	return 0;
+}
+
+
+/**
  * oom_badness - heuristic function to determine which candidate task to kill
  * @p: task struct of which task we should calculate
  * @totalpages: total present RAM allowed for page allocation
@@ -222,6 +239,12 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 	 */
 	points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
 		mm_pgtables_bytes(p->mm) / PAGE_SIZE;
+
+	/*
+	 * Add how much memory a task uses in opened files, e.g. device drivers.
+	 */
+	iterate_fd(p->files, 0, oom_file_badness, &points);
+
 	task_unlock(p);
 
 	/*
-- 
2.7.4
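
To make the weighting concrete, a rough worked example (illustrative numbers, 4 KiB pages): a process with 512 MiB of RSS contributes 131072 points from get_mm_rss(); if its DRM file descriptor reports 2 GiB of GEM buffers through ->oom_file_badness(), iterate_fd() adds another 524288 points, so the GPU-heavy process scores roughly five times higher than it would from its CPU-visible memory alone.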

^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 3/4] drm/gem: adjust per file OOM badness on handling buffers
  2018-01-18 16:47 ` Andrey Grodzovsky
  (?)
@ 2018-01-18 16:47   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 170+ messages in thread
From: Andrey Grodzovsky @ 2018-01-18 16:47 UTC (permalink / raw)
  To: linux-kernel, linux-mm, dri-devel, amd-gfx
  Cc: Christian.Koenig, Andrey Grodzovsky

Large amounts of VRAM are usually not CPU accessible, so they are not mapped
into the process's address space. But since device drivers usually support
swapping buffers from VRAM to system memory, we can still run into an
out-of-memory situation when userspace starts to allocate too much.

This patch gives the OOM killer another hint about which process is
holding how many resources.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/drm_file.c | 12 ++++++++++++
 drivers/gpu/drm/drm_gem.c  |  8 ++++++++
 include/drm/drm_file.h     |  4 ++++
 3 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index b3c6e99..626cc76 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -747,3 +747,15 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
 	spin_unlock_irqrestore(&dev->event_lock, irqflags);
 }
 EXPORT_SYMBOL(drm_send_event);
+
+long drm_oom_badness(struct file *f)
+{
+
+	struct drm_file *file_priv = f->private_data;
+
+	if (file_priv)
+		return atomic_long_read(&file_priv->f_oom_badness);
+
+	return 0;
+}
+EXPORT_SYMBOL(drm_oom_badness);
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 01f8d94..ffbadc8 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -264,6 +264,9 @@ drm_gem_object_release_handle(int id, void *ptr, void *data)
 		drm_gem_remove_prime_handles(obj, file_priv);
 	drm_vma_node_revoke(&obj->vma_node, file_priv);
 
+	atomic_long_sub(obj->size >> PAGE_SHIFT,
+				&file_priv->f_oom_badness);
+
 	drm_gem_object_handle_put_unlocked(obj);
 
 	return 0;
@@ -299,6 +302,8 @@ drm_gem_handle_delete(struct drm_file *filp, u32 handle)
 	idr_remove(&filp->object_idr, handle);
 	spin_unlock(&filp->table_lock);
 
+	atomic_long_sub(obj->size >> PAGE_SHIFT, &filp->f_oom_badness);
+
 	return 0;
 }
 EXPORT_SYMBOL(drm_gem_handle_delete);
@@ -417,6 +422,9 @@ drm_gem_handle_create_tail(struct drm_file *file_priv,
 	}
 
 	*handlep = handle;
+
+	atomic_long_add(obj->size >> PAGE_SHIFT,
+				&file_priv->f_oom_badness);
 	return 0;
 
 err_revoke:
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index 0e0c868..ac3aa75 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -317,6 +317,8 @@ struct drm_file {
 
 	/* private: */
 	unsigned long lock_count; /* DRI1 legacy lock count */
+
+	atomic_long_t		f_oom_badness;
 };
 
 /**
@@ -378,4 +380,6 @@ void drm_event_cancel_free(struct drm_device *dev,
 void drm_send_event_locked(struct drm_device *dev, struct drm_pending_event *e);
 void drm_send_event(struct drm_device *dev, struct drm_pending_event *e);
 
+long drm_oom_badness(struct file *f);
+
 #endif /* _DRM_FILE_H_ */
-- 
2.7.4
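
As a rough illustration of the accounting above (4 KiB pages assumed): creating a handle to a 256 MiB GEM object adds 65536 to f_oom_badness, and deleting the handle (or closing the fd, which releases all handles) subtracts the same amount. Note that the counter tracks pages reachable through this file's handles, not pages currently resident in VRAM versus system memory; that distinction comes up in the replies below.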

^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 4/4] drm/amdgpu: Use drm_oom_badness for amdgpu.
  2018-01-18 16:47 ` Andrey Grodzovsky
  (?)
@ 2018-01-18 16:47   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 170+ messages in thread
From: Andrey Grodzovsky @ 2018-01-18 16:47 UTC (permalink / raw)
  To: linux-kernel, linux-mm, dri-devel, amd-gfx
  Cc: Christian.Koenig, Andrey Grodzovsky

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 46a0c93..6a733cdc8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -828,6 +828,7 @@ static const struct file_operations amdgpu_driver_kms_fops = {
 #ifdef CONFIG_COMPAT
 	.compat_ioctl = amdgpu_kms_compat_ioctl,
 #endif
+	.oom_file_badness = drm_oom_badness,
 };
 
 static bool
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-18 16:47 ` Andrey Grodzovsky
  (?)
@ 2018-01-18 17:00   ` Michal Hocko
  -1 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-18 17:00 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: linux-kernel, linux-mm, dri-devel, amd-gfx, Christian.Koenig

On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
> Hi, this series is a revised version of an RFC sent by Christian König
> a few years ago. The original RFC can be found at 
> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
> 
> This is the same idea and I've just adressed his concern from the original RFC 
> and switched to a callback into file_ops instead of a new member in struct file.

Please add the full description to the cover letter and do not make
people hunt for links.

Here is the original cover letter text:
: I'm currently working on the issue that when device drivers allocate memory on
: behalf of an application, the OOM killer usually doesn't know about that unless
: the application also gets this memory mapped into its address space.
: 
: This is especially annoying for graphics drivers, where a lot of the VRAM
: usually isn't CPU accessible and so doesn't make sense to map into the
: address space of the process using it.
: 
: The problem now is that when an application starts to use a lot of VRAM, those
: buffer objects sooner or later get swapped out to system memory, but when we
: now run into an out of memory situation the OOM killer obviously doesn't know
: anything about that memory and so usually kills the wrong process.
: 
: The following set of patches tries to address this problem by introducing a per
: file OOM badness score, which device drivers can use to give the OOM killer a
: hint about how many resources are bound to a file descriptor so that it can make
: better decisions about which process to kill.
: 
: So, a question to everyone: What do you think about this approach?
: 
: My biggest concern right now is that the patches are messing with a core kernel
: structure (adding a field to struct file). Any better idea? I'm considering
: putting a callback into file_ops instead.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-18 17:00   ` Michal Hocko
@ 2018-01-18 17:13     ` Michal Hocko
  -1 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-18 17:13 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: linux-kernel, linux-mm, dri-devel, amd-gfx, Christian.Koenig

On Thu 18-01-18 18:00:06, Michal Hocko wrote:
> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
> > Hi, this series is a revised version of an RFC sent by Christian König
> > a few years ago. The original RFC can be found at 
> > https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
> > 
> > This is the same idea and I've just adressed his concern from the original RFC 
> > and switched to a callback into file_ops instead of a new member in struct file.
> 
> Please add the full description to the cover letter and do not make
> people hunt links.
> 
> Here is the origin cover letter text
> : I'm currently working on the issue that when device drivers allocate memory on
> : behalf of an application the OOM killer usually doesn't knew about that unless
> : the application also get this memory mapped into their address space.
> : 
> : This is especially annoying for graphics drivers where a lot of the VRAM
> : usually isn't CPU accessible and so doesn't make sense to map into the
> : address space of the process using it.
> : 
> : The problem now is that when an application starts to use a lot of VRAM those
> : buffers objects sooner or later get swapped out to system memory, but when we
> : now run into an out of memory situation the OOM killer obviously doesn't knew
> : anything about that memory and so usually kills the wrong process.

OK, but how do you attribute that memory to a particular OOM-killable
entity? And how do you actually enforce that those resources get freed
when the OOM killer acts?

> : The following set of patches tries to address this problem by introducing a per
> : file OOM badness score, which device drivers can use to give the OOM killer a
> : hint how many resources are bound to a file descriptor so that it can make
> : better decisions which process to kill.

But files are not killable; they can be shared... In other words, this
doesn't help the OOM killer make an educated guess at all.

> : 
> : So question at every one: What do you think about this approach?

I think this is just wrong semantically. Non-reclaimable memory is a
pain, especially when there is way too much of it. If you can free that
memory somehow then you can hook into the slab shrinker API and react to
memory pressure. If you can account such memory to a particular
process and make sure that the consumption is bounded by the process
lifetime, then we can think of an accounting that oom_badness can
consider when selecting a victim.
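
(For context, the slab shrinker hook referred to here looks roughly like this; a minimal sketch against the 4.x API, with hypothetical foo_* driver helpers:)

#include <linux/shrinker.h>

static unsigned long foo_shrink_count(struct shrinker *s,
				      struct shrink_control *sc)
{
	/* Number of driver objects that could be freed under pressure. */
	return foo_count_evictable_pages();
}

static unsigned long foo_shrink_scan(struct shrinker *s,
				     struct shrink_control *sc)
{
	unsigned long freed = foo_evict_pages(sc->nr_to_scan);

	return freed ? freed : SHRINK_STOP;
}

static struct shrinker foo_shrinker = {
	.count_objects = foo_shrink_count,
	.scan_objects  = foo_shrink_scan,
	.seeks         = DEFAULT_SEEKS,
};

/* register_shrinker(&foo_shrinker) at driver init,
 * unregister_shrinker(&foo_shrinker) at teardown. */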

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-18 17:13     ` Michal Hocko
  (?)
@ 2018-01-18 20:01     ` Eric Anholt
  2018-01-19  8:20         ` Michal Hocko
  2018-01-19  8:35         ` Christian König
  -1 siblings, 2 replies; 170+ messages in thread
From: Eric Anholt @ 2018-01-18 20:01 UTC (permalink / raw)
  To: Michal Hocko, Andrey Grodzovsky
  Cc: linux-kernel, linux-mm, dri-devel, amd-gfx, Christian.Koenig

Michal Hocko <mhocko@kernel.org> writes:

> On Thu 18-01-18 18:00:06, Michal Hocko wrote:
>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>> > Hi, this series is a revised version of an RFC sent by Christian König
>> > a few years ago. The original RFC can be found at 
>> > https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>> > 
>> > This is the same idea and I've just adressed his concern from the original RFC 
>> > and switched to a callback into file_ops instead of a new member in struct file.
>> 
>> Please add the full description to the cover letter and do not make
>> people hunt links.
>> 
>> Here is the origin cover letter text
>> : I'm currently working on the issue that when device drivers allocate memory on
>> : behalf of an application the OOM killer usually doesn't knew about that unless
>> : the application also get this memory mapped into their address space.
>> : 
>> : This is especially annoying for graphics drivers where a lot of the VRAM
>> : usually isn't CPU accessible and so doesn't make sense to map into the
>> : address space of the process using it.
>> : 
>> : The problem now is that when an application starts to use a lot of VRAM those
>> : buffers objects sooner or later get swapped out to system memory, but when we
>> : now run into an out of memory situation the OOM killer obviously doesn't knew
>> : anything about that memory and so usually kills the wrong process.
>
> OK, but how do you attribute that memory to a particular OOM killable
> entity? And how do you actually enforce that those resources get freed
> on the oom killer action?
>
>> : The following set of patches tries to address this problem by introducing a per
>> : file OOM badness score, which device drivers can use to give the OOM killer a
>> : hint how many resources are bound to a file descriptor so that it can make
>> : better decisions which process to kill.
>
> But files are not killable, they can be shared... In other words this
> doesn't help the oom killer to make an educated guess at all.

Maybe some more context would help the discussion?

The struct file in patch 3 is the DRM fd.  That's effectively "my
process's interface to talking to the GPU" not "a single GPU resource".
Once that file is closed, all of the process's private, idle GPU buffers
will be immediately freed (this will be most of their allocations), and
some will be freed once the GPU completes some work (this will be most
of the rest of their allocations).

Some GEM BOs won't be freed just by closing the fd, if they've been
shared between processes.  Those are usually about 8-24MB total in a
process, rather than the GBs that modern apps use (or that our testcases
like to allocate and thus trigger oomkilling of the test harness instead
of the offending testcase...)

Even if we just had the private+idle buffers being accounted in OOM
badness, that would be a huge step forward in system reliability.

>> : So question at every one: What do you think about this approach?
>
> I thing is just just wrong semantically. Non-reclaimable memory is a
> pain, especially when there is way too much of it. If you can free that
> memory somehow then you can hook into slab shrinker API and react on the
> memory pressure. If you can account such a memory to a particular
> process and make sure that the consumption is bound by the process life
> time then we can think of an accounting that oom_badness can consider
> when selecting a victim.

For graphics, we can't free most of our memory without also effectively
killing the process.  i915 and vc4 have "purgeable" interfaces for
userspace (on i915 this is exposed all the way to GL applications and is
hooked into shrinker, and on vc4 this is so far just used for
userspace-internal buffer caches to be purged when a CMA allocation
fails).  However, those purgeable pools are expected to be a tiny
fraction of the GPU allocations by the process.
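
(The purgeable interface mentioned here is, on i915, the GEM madvise ioctl. A rough userspace sketch, error handling omitted and the fd/handle assumed to come from the usual GEM buffer setup:)

#include <stdbool.h>
#include <stdint.h>
#include <xf86drm.h>
#include <drm/i915_drm.h>

/* Mark an idle cached buffer as purgeable: the kernel may discard its
 * backing pages under memory pressure instead of invoking the OOM killer. */
static void mark_purgeable(int fd, uint32_t handle)
{
	struct drm_i915_gem_madvise madv = {
		.handle = handle,
		.madv = I915_MADV_DONTNEED,
	};

	drmIoctl(fd, DRM_IOCTL_I915_GEM_MADVISE, &madv);
}

/* Reclaim the buffer before reuse; returns false if the pages were
 * discarded and the contents must be regenerated. */
static bool reclaim_buffer(int fd, uint32_t handle)
{
	struct drm_i915_gem_madvise madv = {
		.handle = handle,
		.madv = I915_MADV_WILLNEED,
	};

	drmIoctl(fd, DRM_IOCTL_I915_GEM_MADVISE, &madv);
	return madv.retained;
}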

^ permalink raw reply	[flat|nested] 170+ messages in thread

* RE: [RFC] Per file OOM badness
  2018-01-18 16:47 ` Andrey Grodzovsky
                   ` (6 preceding siblings ...)
  (?)
@ 2018-01-19  5:39 ` He, Roger
  2018-01-19  8:17     ` Christian König
  -1 siblings, 1 reply; 170+ messages in thread
From: He, Roger @ 2018-01-19  5:39 UTC (permalink / raw)
  To: Grodzovsky, Andrey, linux-kernel, linux-mm, dri-devel, amd-gfx
  Cc: Koenig, Christian

Basically the idea seems right to me.

1. But we need finer granularity to control the contribution to OOM badness.
     When the TTM buffer resides in VRAM, rather than having been evicted to system memory, we should not count it toward badness.
     But I think that is not easy to implement.

2. If the TTM buffer (GTT here) is mapped to userspace for CPU access, I am not quite sure whether the buffer size is already taken into account by the kernel.
     If it is, the size will end up being counted again by your patches.

So I am thinking we could count the TTM buffer size into:
struct mm_rss_stat {
	atomic_long_t count[NR_MM_COUNTERS];
};
which is maintained by the kernel based on the CPU VM (page tables).

Something like this (see the consolidated sketch below):
When a GTT allocation succeeds:
add_mm_counter(vma->vm_mm, MM_ANONPAGES, buffer_size);

When the GTT buffer is swapped out:
subtract it from MM_ANONPAGES first, then
add_mm_counter(vma->vm_mm, MM_SWAPENTS, buffer_size);  // or MM_SHMEMPAGES, or add a new item.

Always update the corresponding item in mm_rss_stat.
That way we can keep the accounting accurate.
What do you think about that?
And are there any side effects to this approach?
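
A consolidated sketch of the counter updates described above; it assumes the charge is kept in pages (which is what the mm_rss_stat counters actually hold, so a byte size needs a PAGE_SHIFT conversion) and a hypothetical helper that is handed the owning mm_struct:

#include <linux/mm.h>

static void ttm_account_gtt_pages(struct mm_struct *mm, size_t size,
				  bool swapped_out)
{
	long pages = size >> PAGE_SHIFT;

	if (!swapped_out) {
		/* GTT allocation succeeded: charge it like anonymous memory. */
		add_mm_counter(mm, MM_ANONPAGES, pages);
	} else {
		/* Buffer swapped out: move the charge to the swap counter. */
		add_mm_counter(mm, MM_ANONPAGES, -pages);
		add_mm_counter(mm, MM_SWAPENTS, pages);
	}
}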


Thanks
Roger(Hongbo.He)

-----Original Message-----
From: dri-devel [mailto:dri-devel-bounces@lists.freedesktop.org] On Behalf Of Andrey Grodzovsky
Sent: Friday, January 19, 2018 12:48 AM
To: linux-kernel@vger.kernel.org; linux-mm@kvack.org; dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian <Christian.Koenig@amd.com>
Subject: [RFC] Per file OOM badness

Hi, this series is a revised version of an RFC sent by Christian König a few years ago. The original RFC can be found at https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html

This is the same idea and I've just adressed his concern from the original RFC and switched to a callback into file_ops instead of a new member in struct file.

Thanks,
Andrey


^ permalink raw reply	[flat|nested] 170+ messages in thread

* RE: [RFC] Per file OOM badness
  2018-01-18 17:13     ` Michal Hocko
  (?)
  (?)
@ 2018-01-19  6:01     ` He, Roger
  2018-01-19  8:25         ` Michal Hocko
  -1 siblings, 1 reply; 170+ messages in thread
From: He, Roger @ 2018-01-19  6:01 UTC (permalink / raw)
  To: Michal Hocko, Grodzovsky, Andrey
  Cc: linux-mm, amd-gfx, linux-kernel, dri-devel, Koenig, Christian



-----Original Message-----
From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Michal Hocko
Sent: Friday, January 19, 2018 1:14 AM
To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Cc: linux-mm@kvack.org; amd-gfx@lists.freedesktop.org; linux-kernel@vger.kernel.org; dri-devel@lists.freedesktop.org; Koenig, Christian <Christian.Koenig@amd.com>
Subject: Re: [RFC] Per file OOM badness

On Thu 18-01-18 18:00:06, Michal Hocko wrote:
> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
> > Hi, this series is a revised version of an RFC sent by Christian 
> > König a few years ago. The original RFC can be found at 
> > https://lists.freedesktop.org/archives/dri-devel/2015-September/0897
> > 78.html
> > 
> > This is the same idea and I've just adressed his concern from the 
> > original RFC and switched to a callback into file_ops instead of a new member in struct file.
> 
> Please add the full description to the cover letter and do not make 
> people hunt links.
> 
> Here is the origin cover letter text
> : I'm currently working on the issue that when device drivers allocate 
> memory on
> : behalf of an application the OOM killer usually doesn't knew about 
> that unless
> : the application also get this memory mapped into their address space.
> : 
> : This is especially annoying for graphics drivers where a lot of the 
> VRAM
> : usually isn't CPU accessible and so doesn't make sense to map into 
> the
> : address space of the process using it.
> : 
> : The problem now is that when an application starts to use a lot of 
> VRAM those
> : buffers objects sooner or later get swapped out to system memory, 
> but when we
> : now run into an out of memory situation the OOM killer obviously 
> doesn't knew
> : anything about that memory and so usually kills the wrong process.

	OK, but how do you attribute that memory to a particular OOM killable entity? And how do you actually enforce that those resources get freed on the oom killer action?

Here I think we need finer granularity to distinguish whether the buffer is occupying VRAM or system memory.

> : The following set of patches tries to address this problem by 
> introducing a per
> : file OOM badness score, which device drivers can use to give the OOM 
> killer a
> : hint how many resources are bound to a file descriptor so that it 
> can make
> : better decisions which process to kill.

But files are not killable, they can be shared... In other words this doesn't help the oom killer to make an educated guess at all.

> : 
> : So question at every one: What do you think about this approach?

	I thing is just just wrong semantically. Non-reclaimable memory is a pain, especially when there is way too much of it. If you can free that memory somehow then you can hook into slab shrinker API and react on the memory pressure. If you can account such a 	memory to a particular process and make sure that the consumption is bound by the process life time then we can think of an accounting that oom_badness can consider when selecting a victim.

I think you are misunderstanding here.
Actually, the memory in the TTM pools already has a shrinker, implemented in ttm_pool_mm_shrink_init.
And the memory we want to contribute to OOM badness here is not in the TTM pools,
because once a TTM buffer allocation succeeds, the memory has already been removed from the pools.

Thanks
Roger(Hongbo.He)

--
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH 3/4] drm/gem: adjust per file OOM badness on handling buffers
  2018-01-18 16:47   ` Andrey Grodzovsky
  (?)
@ 2018-01-19  6:01     ` Chunming Zhou
  -1 siblings, 0 replies; 170+ messages in thread
From: Chunming Zhou @ 2018-01-19  6:01 UTC (permalink / raw)
  To: Andrey Grodzovsky, linux-kernel, linux-mm, dri-devel, amd-gfx
  Cc: Christian.Koenig



On 2018年01月19日 00:47, Andrey Grodzovsky wrote:
> Large amounts of VRAM are usually not CPU accessible, so they are not mapped
> into the processes address space. But since the device drivers usually support
> swapping buffers from VRAM to system memory we can still run into an out of
> memory situation when userspace starts to allocate to much.
>
> This patch gives the OOM another hint which process is
> holding how many resources.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/drm_file.c | 12 ++++++++++++
>   drivers/gpu/drm/drm_gem.c  |  8 ++++++++
>   include/drm/drm_file.h     |  4 ++++
>   3 files changed, 24 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index b3c6e99..626cc76 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++ b/drivers/gpu/drm/drm_file.c
> @@ -747,3 +747,15 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
>   	spin_unlock_irqrestore(&dev->event_lock, irqflags);
>   }
>   EXPORT_SYMBOL(drm_send_event);
> +
> +long drm_oom_badness(struct file *f)
> +{
> +
> +	struct drm_file *file_priv = f->private_data;
> +
> +	if (file_priv)
> +		return atomic_long_read(&file_priv->f_oom_badness);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(drm_oom_badness);
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 01f8d94..ffbadc8 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -264,6 +264,9 @@ drm_gem_object_release_handle(int id, void *ptr, void *data)
>   		drm_gem_remove_prime_handles(obj, file_priv);
>   	drm_vma_node_revoke(&obj->vma_node, file_priv);
>   
> +	atomic_long_sub(obj->size >> PAGE_SHIFT,
> +				&file_priv->f_oom_badness);
> +
>   	drm_gem_object_handle_put_unlocked(obj);
>   
>   	return 0;
> @@ -299,6 +302,8 @@ drm_gem_handle_delete(struct drm_file *filp, u32 handle)
>   	idr_remove(&filp->object_idr, handle);
>   	spin_unlock(&filp->table_lock);
>   
> +	atomic_long_sub(obj->size >> PAGE_SHIFT, &filp->f_oom_badness);
> +
>   	return 0;
>   }
>   EXPORT_SYMBOL(drm_gem_handle_delete);
> @@ -417,6 +422,9 @@ drm_gem_handle_create_tail(struct drm_file *file_priv,
>   	}
>   
>   	*handlep = handle;
> +
> +	atomic_long_add(obj->size >> PAGE_SHIFT,
> +				&file_priv->f_oom_badness);
For the VRAM case, it should be counted only when the VRAM BO is evicted to
system memory.
For example, if VRAM totals 8GB and system memory totals 8GB, one
application may allocate 7GB of VRAM and 7GB of system memory, which is allowed,
but following your idea this application would then be killed by the OOM killer,
right?

Regards,
David Zhou
>   	return 0;
>   
>   err_revoke:
> diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
> index 0e0c868..ac3aa75 100644
> --- a/include/drm/drm_file.h
> +++ b/include/drm/drm_file.h
> @@ -317,6 +317,8 @@ struct drm_file {
>   
>   	/* private: */
>   	unsigned long lock_count; /* DRI1 legacy lock count */
> +
> +	atomic_long_t		f_oom_badness;
>   };
>   
>   /**
> @@ -378,4 +380,6 @@ void drm_event_cancel_free(struct drm_device *dev,
>   void drm_send_event_locked(struct drm_device *dev, struct drm_pending_event *e);
>   void drm_send_event(struct drm_device *dev, struct drm_pending_event *e);
>   
> +long drm_oom_badness(struct file *f);
> +
>   #endif /* _DRM_FILE_H_ */

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH 3/4] drm/gem: adjust per file OOM badness on handling buffers
@ 2018-01-19  6:01     ` Chunming Zhou
  0 siblings, 0 replies; 170+ messages in thread
From: Chunming Zhou @ 2018-01-19  6:01 UTC (permalink / raw)
  To: Andrey Grodzovsky, linux-kernel, linux-mm, dri-devel, amd-gfx
  Cc: Christian.Koenig



On 2018a1'01ae??19ae?JPY 00:47, Andrey Grodzovsky wrote:
> Large amounts of VRAM are usually not CPU accessible, so they are not mapped
> into the processes address space. But since the device drivers usually support
> swapping buffers from VRAM to system memory we can still run into an out of
> memory situation when userspace starts to allocate to much.
>
> This patch gives the OOM another hint which process is
> holding how many resources.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/drm_file.c | 12 ++++++++++++
>   drivers/gpu/drm/drm_gem.c  |  8 ++++++++
>   include/drm/drm_file.h     |  4 ++++
>   3 files changed, 24 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index b3c6e99..626cc76 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++ b/drivers/gpu/drm/drm_file.c
> @@ -747,3 +747,15 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
>   	spin_unlock_irqrestore(&dev->event_lock, irqflags);
>   }
>   EXPORT_SYMBOL(drm_send_event);
> +
> +long drm_oom_badness(struct file *f)
> +{
> +
> +	struct drm_file *file_priv = f->private_data;
> +
> +	if (file_priv)
> +		return atomic_long_read(&file_priv->f_oom_badness);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(drm_oom_badness);
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 01f8d94..ffbadc8 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -264,6 +264,9 @@ drm_gem_object_release_handle(int id, void *ptr, void *data)
>   		drm_gem_remove_prime_handles(obj, file_priv);
>   	drm_vma_node_revoke(&obj->vma_node, file_priv);
>   
> +	atomic_long_sub(obj->size >> PAGE_SHIFT,
> +				&file_priv->f_oom_badness);
> +
>   	drm_gem_object_handle_put_unlocked(obj);
>   
>   	return 0;
> @@ -299,6 +302,8 @@ drm_gem_handle_delete(struct drm_file *filp, u32 handle)
>   	idr_remove(&filp->object_idr, handle);
>   	spin_unlock(&filp->table_lock);
>   
> +	atomic_long_sub(obj->size >> PAGE_SHIFT, &filp->f_oom_badness);
> +
>   	return 0;
>   }
>   EXPORT_SYMBOL(drm_gem_handle_delete);
> @@ -417,6 +422,9 @@ drm_gem_handle_create_tail(struct drm_file *file_priv,
>   	}
>   
>   	*handlep = handle;
> +
> +	atomic_long_add(obj->size >> PAGE_SHIFT,
> +				&file_priv->f_oom_badness);
For VRAM case, it should be counted only when vram bo is evicted to 
system memory.
For example, vram total is 8GB, system memory total is 8GB, one 
application allocates 7GB vram and 7GB system memory, which is allowed, 
but if following your idea, then this application will be killed by OOM, 
right?

Regards,
David Zhou
>   	return 0;
>   
>   err_revoke:
> diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
> index 0e0c868..ac3aa75 100644
> --- a/include/drm/drm_file.h
> +++ b/include/drm/drm_file.h
> @@ -317,6 +317,8 @@ struct drm_file {
>   
>   	/* private: */
>   	unsigned long lock_count; /* DRI1 legacy lock count */
> +
> +	atomic_long_t		f_oom_badness;
>   };
>   
>   /**
> @@ -378,4 +380,6 @@ void drm_event_cancel_free(struct drm_device *dev,
>   void drm_send_event_locked(struct drm_device *dev, struct drm_pending_event *e);
>   void drm_send_event(struct drm_device *dev, struct drm_pending_event *e);
>   
> +long drm_oom_badness(struct file *f);
> +
>   #endif /* _DRM_FILE_H_ */

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-19  5:39 ` He, Roger
  2018-01-19  8:17     ` Christian König
@ 2018-01-19  8:17     ` Christian König
  0 siblings, 0 replies; 170+ messages in thread
From: Christian König @ 2018-01-19  8:17 UTC (permalink / raw)
  To: He, Roger, Grodzovsky, Andrey, linux-kernel, linux-mm, dri-devel,
	amd-gfx
  Cc: Koenig, Christian

On 2018-01-19 06:39, He, Roger wrote:
> Basically the idea looks right to me.
>
> 1. But we need smaller granularity to control the contribution to OOM badness.
>       Because when the TTM buffer resides in VRAM rather than being evicted to system memory, we should not count it towards the badness.
>       But I think it is not easy to implement.

I was considering that as well when I wrote the original patch set, but 
then decided against it at least for now.

Basically all VRAM buffers can be swapped to system memory, so they 
potentially need system memory as well. That is especially important 
during suspend/resume.

>
> 2. If the TTM buffer (GTT here) is mapped to userspace for CPU access, I am not quite sure whether the buffer size is already taken into account by the kernel.
>       If yes, the size will in the end be counted again by your patches.

No that isn't accounted for as far as I know.

>
> So, I am wondering whether we can count the TTM buffer size into:
> struct mm_rss_stat {
> 	atomic_long_t count[NR_MM_COUNTERS];
> };
> which is maintained by the kernel based on the CPU VM (page tables).
>
> Something like this:
> When the GTT allocation succeeds:
> add_mm_counter(vma->vm_mm, MM_ANONPAGES, buffer_size);
>
> When the GTT buffer is swapped out:
> dec_mm_counter from MM_ANONPAGES first, then
> add_mm_counter(vma->vm_mm, MM_SWAPENTS, buffer_size);  // or MM_SHMEMPAGES, or add a new item.
>
> Always update the corresponding item in mm_rss_stat.
> That way we can control the accounting accurately.
> What do you think about that?
> And is there any side-effect for this approach?

I already tried this when I originally worked on the issue, and that
approach didn't work because the allocated buffers are often not created
by the process which actually uses them.

E.g. most display surfaces are created by the X server, but used by
client processes. So if you account the BO to the process which created
it, we would start to kill X again, and that is exactly what we try to
avoid.

Regards,
Christian.

>
>
> Thanks
> Roger(Hongbo.He)
>
> -----Original Message-----
> From: dri-devel [mailto:dri-devel-bounces@lists.freedesktop.org] On Behalf Of Andrey Grodzovsky
> Sent: Friday, January 19, 2018 12:48 AM
> To: linux-kernel@vger.kernel.org; linux-mm@kvack.org; dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Cc: Koenig, Christian <Christian.Koenig@amd.com>
> Subject: [RFC] Per file OOM badness
>
> Hi, this series is a revised version of an RFC sent by Christian König a few years ago. The original RFC can be found at https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>
> This is the same idea and I've just adressed his concern from the original RFC and switched to a callback into file_ops instead of a new member in struct file.
>
> Thanks,
> Andrey
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-18 20:01     ` Eric Anholt
  2018-01-19  8:20         ` Michal Hocko
@ 2018-01-19  8:20         ` Michal Hocko
  1 sibling, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-19  8:20 UTC (permalink / raw)
  To: Eric Anholt
  Cc: Andrey Grodzovsky, linux-kernel, linux-mm, dri-devel, amd-gfx,
	Christian.Koenig

On Thu 18-01-18 12:01:32, Eric Anholt wrote:
> Michal Hocko <mhocko@kernel.org> writes:
> 
> > On Thu 18-01-18 18:00:06, Michal Hocko wrote:
> >> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
> >> > Hi, this series is a revised version of an RFC sent by Christian König
> >> > a few years ago. The original RFC can be found at 
> >> > https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
> >> > 
> >> > This is the same idea and I've just adressed his concern from the original RFC 
> >> > and switched to a callback into file_ops instead of a new member in struct file.
> >> 
> >> Please add the full description to the cover letter and do not make
> >> people hunt links.
> >> 
> >> Here is the origin cover letter text
> >> : I'm currently working on the issue that when device drivers allocate memory on
> >> : behalf of an application the OOM killer usually doesn't knew about that unless
> >> : the application also get this memory mapped into their address space.
> >> : 
> >> : This is especially annoying for graphics drivers where a lot of the VRAM
> >> : usually isn't CPU accessible and so doesn't make sense to map into the
> >> : address space of the process using it.
> >> : 
> >> : The problem now is that when an application starts to use a lot of VRAM those
> >> : buffers objects sooner or later get swapped out to system memory, but when we
> >> : now run into an out of memory situation the OOM killer obviously doesn't knew
> >> : anything about that memory and so usually kills the wrong process.
> >
> > OK, but how do you attribute that memory to a particular OOM killable
> > entity? And how do you actually enforce that those resources get freed
> > on the oom killer action?
> >
> >> : The following set of patches tries to address this problem by introducing a per
> >> : file OOM badness score, which device drivers can use to give the OOM killer a
> >> : hint how many resources are bound to a file descriptor so that it can make
> >> : better decisions which process to kill.
> >
> > But files are not killable, they can be shared... In other words this
> > doesn't help the oom killer to make an educated guess at all.
> 
> Maybe some more context would help the discussion?
> 
> The struct file in patch 3 is the DRM fd.  That's effectively "my
> process's interface to talking to the GPU" not "a single GPU resource".
> Once that file is closed, all of the process's private, idle GPU buffers
> will be immediately freed (this will be most of their allocations), and
> some will be freed once the GPU completes some work (this will be most
> of the rest of their allocations).
> 
> Some GEM BOs won't be freed just by closing the fd, if they've been
> shared between processes.  Those are usually about 8-24MB total in a
> process, rather than the GBs that modern apps use (or that our testcases
> like to allocate and thus trigger oomkilling of the test harness instead
> of the offending testcase...)
> 
> Even if we just had the private+idle buffers being accounted in OOM
> badness, that would be a huge step forward in system reliability.

OK, in that case I would propose a different approach. We already
have rss_stat. So why don't we simply add a new counter there,
MM_KERNELPAGES, and consider those pages in oom_badness? The rule would
be that such memory is bound to the process lifetime. I guess we will
find more users for this later.
-- 
Michal Hocko
SUSE Labs
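
A rough sketch of what such a counter could look like; the enum position,
the driver-side call sites and the oom_badness() hunk are illustrative
assumptions rather than a concrete proposal:

/* new process-lifetime counter in mm_rss_stat (sketch only) */
enum {
	MM_FILEPAGES,
	MM_ANONPAGES,
	MM_SWAPENTS,
	MM_SHMEMPAGES,
	MM_KERNELPAGES,		/* driver-owned pages tied to this mm */
	NR_MM_COUNTERS
};

/* driver side, when creating/destroying a buffer on behalf of current: */
add_mm_counter(current->mm, MM_KERNELPAGES, size >> PAGE_SHIFT);
add_mm_counter(current->mm, MM_KERNELPAGES, -(size >> PAGE_SHIFT));

/* oom_badness(), roughly: */
points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
	 get_mm_counter(p->mm, MM_KERNELPAGES);

The open question with any mm-based counter, raised in the replies below,
is which mm to charge when a buffer is created by one process (e.g. the X
server) but used by another.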

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-19  6:01     ` He, Roger
@ 2018-01-19  8:25         ` Michal Hocko
  0 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-19  8:25 UTC (permalink / raw)
  To: He, Roger
  Cc: Grodzovsky, Andrey, linux-mm, amd-gfx, linux-kernel, dri-devel,
	Koenig, Christian

[removed the broken quoting - please try to use an email client which
doesn't mess up the quoted text]

On Fri 19-01-18 06:01:26, He, Roger wrote:
[...]
> I think you are misunderstanding here.
> Actually, for now the memory in the TTM pools already has mm_shrink, which is implemented in ttm_pool_mm_shrink_init.
> And here the memory we want to make contribute to OOM badness is not in the TTM pools.
> Because when a TTM buffer allocation succeeds, the memory has already been removed from the TTM pools.

I have no idea what TTM buffers are. But this smells like something
rather specific to the particular subsystem. And my main objection here
is that struct file is not a proper vehicle to carry such
information. So whatever the TTM subsystem does, it should contribute to
generic counters rather than abuse the fd just because it happens to use
it to communicate with userspace.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-18 20:01     ` Eric Anholt
@ 2018-01-19  8:35         ` Christian König
  2018-01-19  8:35         ` Christian König
  1 sibling, 0 replies; 170+ messages in thread
From: Christian König @ 2018-01-19  8:35 UTC (permalink / raw)
  To: Eric Anholt, Michal Hocko, Andrey Grodzovsky
  Cc: linux-kernel, linux-mm, dri-devel, amd-gfx

On 2018-01-18 21:01, Eric Anholt wrote:
> Michal Hocko <mhocko@kernel.org> writes:
>
>> [SNIP]
>> But files are not killable, they can be shared... In other words this
>> doesn't help the oom killer to make an educated guess at all.
> Maybe some more context would help the discussion?

Thanks for doing this. Wanted to reply yesterday with that information 
as well, but was unfortunately on sick leave.

>
> The struct file in patch 3 is the DRM fd.  That's effectively "my
> process's interface to talking to the GPU" not "a single GPU resource".
> Once that file is closed, all of the process's private, idle GPU buffers
> will be immediately freed (this will be most of their allocations), and
> some will be freed once the GPU completes some work (this will be most
> of the rest of their allocations).
>
> Some GEM BOs won't be freed just by closing the fd, if they've been
> shared between processes.  Those are usually about 8-24MB total in a
> process, rather than the GBs that modern apps use (or that our testcases
> like to allocate and thus trigger oomkilling of the test harness instead
> of the offending testcase...)
>
> Even if we just had the private+idle buffers being accounted in OOM
> badness, that would be a huge step forward in system reliability.

Yes, and that's exactly the intention here, because currently the OOM
killer usually kills X when a graphics-related application allocates too
much memory, and that is highly undesirable.

>>> : So question at every one: What do you think about this approach?
>> I think this is just wrong semantically. Non-reclaimable memory is a
>> pain, especially when there is way too much of it. If you can free that
>> memory somehow then you can hook into slab shrinker API and react on the
>> memory pressure. If you can account such a memory to a particular
>> process and make sure that the consumption is bound by the process life
>> time then we can think of an accounting that oom_badness can consider
>> when selecting a victim.
> For graphics, we can't free most of our memory without also effectively
> killing the process.  i915 and vc4 have "purgeable" interfaces for
> userspace (on i915 this is exposed all the way to GL applications and is
> hooked into shrinker, and on vc4 this is so far just used for
> userspace-internal buffer caches to be purged when a CMA allocation
> fails).  However, those purgeable pools are expected to be a tiny
> fraction of the GPU allocations by the process.

Same thing with TTM and amdgpu/radeon. We already have a shrinker hook
as well and make as much room as we can when needed.

But I think Michal's concerns are valid as well and I thought about them 
when I created the initial patch.

One possible solution which came to my mind is that (IIRC) we not only
store the usual reference count per GEM object, but also how many
handles were created for it.

So what we could do is iterate over all GEM handles of a client and
account only size/num_handles as badness for the client.

The end result would be that X and the client application would each get
1/2 of the GEM object's size accounted for.

Regards,
Christian.
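
A minimal sketch of that size/num_handles idea, reusing the existing
per-object handle count and the per-client object_idr; the helper name and
the simplified locking are assumptions for illustration, not a tested
replacement for the patch:

static int drm_gem_one_handle_badness(int id, void *ptr, void *data)
{
	struct drm_gem_object *obj = ptr;
	long *badness = data;

	/* split the cost of shared BOs between all clients holding a handle;
	 * handle_count is read unlocked here, good enough for a heuristic */
	*badness += (obj->size >> PAGE_SHIFT) /
		    max_t(unsigned int, 1, obj->handle_count);
	return 0;
}

long drm_oom_badness(struct file *f)
{
	struct drm_file *file_priv = f->private_data;
	long badness = 0;

	if (!file_priv)
		return 0;

	spin_lock(&file_priv->table_lock);
	idr_for_each(&file_priv->object_idr, drm_gem_one_handle_badness,
		     &badness);
	spin_unlock(&file_priv->table_lock);

	return badness;
}

With something like this, the f_oom_badness counter from the patch would no
longer be needed; the value would be computed on demand when the OOM killer
asks.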

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-19  8:20         ` Michal Hocko
  (?)
@ 2018-01-19  8:39           ` Christian König
  -1 siblings, 0 replies; 170+ messages in thread
From: Christian König @ 2018-01-19  8:39 UTC (permalink / raw)
  To: Michal Hocko, Eric Anholt
  Cc: Andrey Grodzovsky, linux-kernel, linux-mm, dri-devel, amd-gfx

On 2018-01-19 09:20, Michal Hocko wrote:
> On Thu 18-01-18 12:01:32, Eric Anholt wrote:
>> Michal Hocko <mhocko@kernel.org> writes:
>>
>>> On Thu 18-01-18 18:00:06, Michal Hocko wrote:
>>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>>> Hi, this series is a revised version of an RFC sent by Christian König
>>>>> a few years ago. The original RFC can be found at
>>>>> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>>>>>
>>>>> This is the same idea and I've just adressed his concern from the original RFC
>>>>> and switched to a callback into file_ops instead of a new member in struct file.
>>>> Please add the full description to the cover letter and do not make
>>>> people hunt links.
>>>>
>>>> Here is the origin cover letter text
>>>> : I'm currently working on the issue that when device drivers allocate memory on
>>>> : behalf of an application the OOM killer usually doesn't knew about that unless
>>>> : the application also get this memory mapped into their address space.
>>>> :
>>>> : This is especially annoying for graphics drivers where a lot of the VRAM
>>>> : usually isn't CPU accessible and so doesn't make sense to map into the
>>>> : address space of the process using it.
>>>> :
>>>> : The problem now is that when an application starts to use a lot of VRAM those
>>>> : buffers objects sooner or later get swapped out to system memory, but when we
>>>> : now run into an out of memory situation the OOM killer obviously doesn't knew
>>>> : anything about that memory and so usually kills the wrong process.
>>> OK, but how do you attribute that memory to a particular OOM killable
>>> entity? And how do you actually enforce that those resources get freed
>>> on the oom killer action?
>>>
>>>> : The following set of patches tries to address this problem by introducing a per
>>>> : file OOM badness score, which device drivers can use to give the OOM killer a
>>>> : hint how many resources are bound to a file descriptor so that it can make
>>>> : better decisions which process to kill.
>>> But files are not killable, they can be shared... In other words this
>>> doesn't help the oom killer to make an educated guess at all.
>> Maybe some more context would help the discussion?
>>
>> The struct file in patch 3 is the DRM fd.  That's effectively "my
>> process's interface to talking to the GPU" not "a single GPU resource".
>> Once that file is closed, all of the process's private, idle GPU buffers
>> will be immediately freed (this will be most of their allocations), and
>> some will be freed once the GPU completes some work (this will be most
>> of the rest of their allocations).
>>
>> Some GEM BOs won't be freed just by closing the fd, if they've been
>> shared between processes.  Those are usually about 8-24MB total in a
>> process, rather than the GBs that modern apps use (or that our testcases
>> like to allocate and thus trigger oomkilling of the test harness instead
>> of the offending testcase...)
>>
>> Even if we just had the private+idle buffers being accounted in OOM
>> badness, that would be a huge step forward in system reliability.
> OK, in that case I would propose a different approach. We already
> have rss_stat. So why do not we simply add a new counter there
> MM_KERNELPAGES and consider those in oom_badness? The rule would be
> that such a memory is bound to the process life time. I guess we will
> find more users for this later.

I already tried that, and the problem with that approach is that some
buffers are not created by the application which actually uses them.

For example, X/Wayland creates and hands out render buffers to
applications which want to use OpenGL.

So the result is that if you always account the buffer to the application
which created it, the OOM killer will certainly reap X/Wayland first. And
that is exactly what we want to avoid here.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-19  8:39           ` Christian König
  (?)
@ 2018-01-19  9:32             ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-19  9:32 UTC (permalink / raw)
  To: Christian König, Michal Hocko, Eric Anholt
  Cc: Andrey Grodzovsky, amd-gfx, linux-kernel, dri-devel, linux-mm

On 2018-01-19 09:39 AM, Christian König wrote:
> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
>> On Thu 18-01-18 12:01:32, Eric Anholt wrote:
>>> Michal Hocko <mhocko@kernel.org> writes:
>>>
>>>> On Thu 18-01-18 18:00:06, Michal Hocko wrote:
>>>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>>>> Hi, this series is a revised version of an RFC sent by Christian
>>>>>> König
>>>>>> a few years ago. The original RFC can be found at
>>>>>> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>>>>>>
>>>>>>
>>>>>> This is the same idea and I've just adressed his concern from the
>>>>>> original RFC
>>>>>> and switched to a callback into file_ops instead of a new member
>>>>>> in struct file.
>>>>> Please add the full description to the cover letter and do not make
>>>>> people hunt links.
>>>>>
>>>>> Here is the origin cover letter text
>>>>> : I'm currently working on the issue that when device drivers
>>>>> allocate memory on
>>>>> : behalf of an application the OOM killer usually doesn't knew
>>>>> about that unless
>>>>> : the application also get this memory mapped into their address
>>>>> space.
>>>>> :
>>>>> : This is especially annoying for graphics drivers where a lot of
>>>>> the VRAM
>>>>> : usually isn't CPU accessible and so doesn't make sense to map
>>>>> into the
>>>>> : address space of the process using it.
>>>>> :
>>>>> : The problem now is that when an application starts to use a lot
>>>>> of VRAM those
>>>>> : buffers objects sooner or later get swapped out to system memory,
>>>>> but when we
>>>>> : now run into an out of memory situation the OOM killer obviously
>>>>> doesn't knew
>>>>> : anything about that memory and so usually kills the wrong process.
>>>> OK, but how do you attribute that memory to a particular OOM killable
>>>> entity? And how do you actually enforce that those resources get freed
>>>> on the oom killer action?
>>>>
>>>>> : The following set of patches tries to address this problem by
>>>>> introducing a per
>>>>> : file OOM badness score, which device drivers can use to give the
>>>>> OOM killer a
>>>>> : hint how many resources are bound to a file descriptor so that it
>>>>> can make
>>>>> : better decisions which process to kill.
>>>> But files are not killable, they can be shared... In other words this
>>>> doesn't help the oom killer to make an educated guess at all.
>>> Maybe some more context would help the discussion?
>>>
>>> The struct file in patch 3 is the DRM fd.  That's effectively "my
>>> process's interface to talking to the GPU" not "a single GPU resource".
>>> Once that file is closed, all of the process's private, idle GPU buffers
>>> will be immediately freed (this will be most of their allocations), and
>>> some will be freed once the GPU completes some work (this will be most
>>> of the rest of their allocations).
>>>
>>> Some GEM BOs won't be freed just by closing the fd, if they've been
>>> shared between processes.  Those are usually about 8-24MB total in a
>>> process, rather than the GBs that modern apps use (or that our testcases
>>> like to allocate and thus trigger oomkilling of the test harness instead
>>> of the offending testcase...)
>>>
>>> Even if we just had the private+idle buffers being accounted in OOM
>>> badness, that would be a huge step forward in system reliability.
>> OK, in that case I would propose a different approach. We already
>> have rss_stat. So why do not we simply add a new counter there
>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>> that such a memory is bound to the process life time. I guess we will
>> find more users for this later.
> 
> I already tried that and the problem with that approach is that some
> buffers are not created by the application which actually uses them.
> 
> For example X/Wayland is creating and handing out render buffers to
> application which want to use OpenGL.
> 
> So the result is when you always account the application who created the
> buffer the OOM killer will certainly reap X/Wayland first. And that is
> exactly what we want to avoid here.

FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
anymore. With DRI3 and Wayland, buffers are allocated by the clients and
then shared with the X / Wayland server.

Also, in all cases, the amount of memory allocated for buffers shared
between DRI/Wayland clients and the server should be relatively small
compared to the amount of memory allocated for buffers used only locally
in the client, particularly for clients which create significant memory
pressure.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-19  9:32             ` Michel Dänzer
  0 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-19  9:32 UTC (permalink / raw)
  To: Christian König, Michal Hocko, Eric Anholt
  Cc: Andrey Grodzovsky, amd-gfx, linux-kernel, dri-devel, linux-mm

On 2018-01-19 09:39 AM, Christian König wrote:
> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
>> On Thu 18-01-18 12:01:32, Eric Anholt wrote:
>>> Michal Hocko <mhocko@kernel.org> writes:
>>>
>>>> On Thu 18-01-18 18:00:06, Michal Hocko wrote:
>>>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>>>> Hi, this series is a revised version of an RFC sent by Christian
>>>>>> König
>>>>>> a few years ago. The original RFC can be found at
>>>>>> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>>>>>>
>>>>>>
>>>>>> This is the same idea and I've just adressed his concern from the
>>>>>> original RFC
>>>>>> and switched to a callback into file_ops instead of a new member
>>>>>> in struct file.
>>>>> Please add the full description to the cover letter and do not make
>>>>> people hunt links.
>>>>>
>>>>> Here is the origin cover letter text
>>>>> : I'm currently working on the issue that when device drivers
>>>>> allocate memory on
>>>>> : behalf of an application the OOM killer usually doesn't knew
>>>>> about that unless
>>>>> : the application also get this memory mapped into their address
>>>>> space.
>>>>> :
>>>>> : This is especially annoying for graphics drivers where a lot of
>>>>> the VRAM
>>>>> : usually isn't CPU accessible and so doesn't make sense to map
>>>>> into the
>>>>> : address space of the process using it.
>>>>> :
>>>>> : The problem now is that when an application starts to use a lot
>>>>> of VRAM those
>>>>> : buffers objects sooner or later get swapped out to system memory,
>>>>> but when we
>>>>> : now run into an out of memory situation the OOM killer obviously
>>>>> doesn't knew
>>>>> : anything about that memory and so usually kills the wrong process.
>>>> OK, but how do you attribute that memory to a particular OOM killable
>>>> entity? And how do you actually enforce that those resources get freed
>>>> on the oom killer action?
>>>>
>>>>> : The following set of patches tries to address this problem by
>>>>> introducing a per
>>>>> : file OOM badness score, which device drivers can use to give the
>>>>> OOM killer a
>>>>> : hint how many resources are bound to a file descriptor so that it
>>>>> can make
>>>>> : better decisions which process to kill.
>>>> But files are not killable, they can be shared... In other words this
>>>> doesn't help the oom killer to make an educated guess at all.
>>> Maybe some more context would help the discussion?
>>>
>>> The struct file in patch 3 is the DRM fd.  That's effectively "my
>>> process's interface to talking to the GPU" not "a single GPU resource".
>>> Once that file is closed, all of the process's private, idle GPU buffers
>>> will be immediately freed (this will be most of their allocations), and
>>> some will be freed once the GPU completes some work (this will be most
>>> of the rest of their allocations).
>>>
>>> Some GEM BOs won't be freed just by closing the fd, if they've been
>>> shared between processes.  Those are usually about 8-24MB total in a
>>> process, rather than the GBs that modern apps use (or that our testcases
>>> like to allocate and thus trigger oomkilling of the test harness instead
>>> of the offending testcase...)
>>>
>>> Even if we just had the private+idle buffers being accounted in OOM
>>> badness, that would be a huge step forward in system reliability.
>> OK, in that case I would propose a different approach. We already
>> have rss_stat. So why do not we simply add a new counter there
>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>> that such a memory is bound to the process life time. I guess we will
>> find more users for this later.
> 
> I already tried that and the problem with that approach is that some
> buffers are not created by the application which actually uses them.
> 
> For example X/Wayland is creating and handing out render buffers to
> application which want to use OpenGL.
> 
> So the result is when you always account the application who created the
> buffer the OOM killer will certainly reap X/Wayland first. And that is
> exactly what we want to avoid here.

FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
anymore. With DRI3 and Wayland, buffers are allocated by the clients and
then shared with the X / Wayland server.

Also, in all cases, the amount of memory allocated for buffers shared
between DRI/Wayland clients and the server should be relatively small
compared to the amount of memory allocated for buffers used only locally
in the client, particularly for clients which create significant memory
pressure.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-19  9:32             ` Michel Dänzer
  0 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-19  9:32 UTC (permalink / raw)
  To: Christian König, Michal Hocko, Eric Anholt
  Cc: Andrey Grodzovsky, amd-gfx, linux-kernel, dri-devel, linux-mm

On 2018-01-19 09:39 AM, Christian König wrote:
> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
>> On Thu 18-01-18 12:01:32, Eric Anholt wrote:
>>> Michal Hocko <mhocko@kernel.org> writes:
>>>
>>>> On Thu 18-01-18 18:00:06, Michal Hocko wrote:
>>>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>>>> Hi, this series is a revised version of an RFC sent by Christian
>>>>>> König
>>>>>> a few years ago. The original RFC can be found at
>>>>>> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>>>>>>
>>>>>>
>>>>>> This is the same idea and I've just adressed his concern from the
>>>>>> original RFC
>>>>>> and switched to a callback into file_ops instead of a new member
>>>>>> in struct file.
>>>>> Please add the full description to the cover letter and do not make
>>>>> people hunt links.
>>>>>
>>>>> Here is the origin cover letter text
>>>>> : I'm currently working on the issue that when device drivers
>>>>> allocate memory on
>>>>> : behalf of an application the OOM killer usually doesn't knew
>>>>> about that unless
>>>>> : the application also get this memory mapped into their address
>>>>> space.
>>>>> :
>>>>> : This is especially annoying for graphics drivers where a lot of
>>>>> the VRAM
>>>>> : usually isn't CPU accessible and so doesn't make sense to map
>>>>> into the
>>>>> : address space of the process using it.
>>>>> :
>>>>> : The problem now is that when an application starts to use a lot
>>>>> of VRAM those
>>>>> : buffers objects sooner or later get swapped out to system memory,
>>>>> but when we
>>>>> : now run into an out of memory situation the OOM killer obviously
>>>>> doesn't knew
>>>>> : anything about that memory and so usually kills the wrong process.
>>>> OK, but how do you attribute that memory to a particular OOM killable
>>>> entity? And how do you actually enforce that those resources get freed
>>>> on the oom killer action?
>>>>
>>>>> : The following set of patches tries to address this problem by
>>>>> introducing a per
>>>>> : file OOM badness score, which device drivers can use to give the
>>>>> OOM killer a
>>>>> : hint how many resources are bound to a file descriptor so that it
>>>>> can make
>>>>> : better decisions which process to kill.
>>>> But files are not killable, they can be shared... In other words this
>>>> doesn't help the oom killer to make an educated guess at all.
>>> Maybe some more context would help the discussion?
>>>
>>> The struct file in patch 3 is the DRM fd.  That's effectively "my
>>> process's interface to talking to the GPU" not "a single GPU resource".
>>> Once that file is closed, all of the process's private, idle GPU buffers
>>> will be immediately freed (this will be most of their allocations), and
>>> some will be freed once the GPU completes some work (this will be most
>>> of the rest of their allocations).
>>>
>>> Some GEM BOs won't be freed just by closing the fd, if they've been
>>> shared between processes.  Those are usually about 8-24MB total in a
>>> process, rather than the GBs that modern apps use (or that our testcases
>>> like to allocate and thus trigger oomkilling of the test harness instead
>>> of the offending testcase...)
>>>
>>> Even if we just had the private+idle buffers being accounted in OOM
>>> badness, that would be a huge step forward in system reliability.
>> OK, in that case I would propose a different approach. We already
>> have rss_stat. So why do not we simply add a new counter there
>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>> that such a memory is bound to the process life time. I guess we will
>> find more users for this later.
> 
> I already tried that and the problem with that approach is that some
> buffers are not created by the application which actually uses them.
> 
> For example X/Wayland is creating and handing out render buffers to
> application which want to use OpenGL.
> 
> So the result is when you always account the application who created the
> buffer the OOM killer will certainly reap X/Wayland first. And that is
> exactly what we want to avoid here.

FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
anymore. With DRI3 and Wayland, buffers are allocated by the clients and
then shared with the X / Wayland server.

Also, in all cases, the amount of memory allocated for buffers shared
between DRI/Wayland clients and the server should be relatively small
compared to the amount of memory allocated for buffers used only locally
in the client, particularly for clients which create significant memory
pressure.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-19  9:32             ` Michel Dänzer
  (?)
@ 2018-01-19  9:58               ` Christian König
  -1 siblings, 0 replies; 170+ messages in thread
From: Christian König @ 2018-01-19  9:58 UTC (permalink / raw)
  To: Michel Dänzer, Christian König, Michal Hocko, Eric Anholt
  Cc: Andrey Grodzovsky, dri-devel, linux-kernel, amd-gfx, linux-mm

Am 19.01.2018 um 10:32 schrieb Michel Dänzer:
> On 2018-01-19 09:39 AM, Christian König wrote:
>> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
>>> On Thu 18-01-18 12:01:32, Eric Anholt wrote:
>>>> Michal Hocko <mhocko@kernel.org> writes:
>>>>
>>>>> On Thu 18-01-18 18:00:06, Michal Hocko wrote:
>>>>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>>>>> Hi, this series is a revised version of an RFC sent by Christian
>>>>>>> König
>>>>>>> a few years ago. The original RFC can be found at
>>>>>>> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>>>>>>>
>>>>>>>
>>>>>>> This is the same idea and I've just adressed his concern from the
>>>>>>> original RFC
>>>>>>> and switched to a callback into file_ops instead of a new member
>>>>>>> in struct file.
>>>>>> Please add the full description to the cover letter and do not make
>>>>>> people hunt links.
>>>>>>
>>>>>> Here is the origin cover letter text
>>>>>> : I'm currently working on the issue that when device drivers
>>>>>> allocate memory on
>>>>>> : behalf of an application the OOM killer usually doesn't knew
>>>>>> about that unless
>>>>>> : the application also get this memory mapped into their address
>>>>>> space.
>>>>>> :
>>>>>> : This is especially annoying for graphics drivers where a lot of
>>>>>> the VRAM
>>>>>> : usually isn't CPU accessible and so doesn't make sense to map
>>>>>> into the
>>>>>> : address space of the process using it.
>>>>>> :
>>>>>> : The problem now is that when an application starts to use a lot
>>>>>> of VRAM those
>>>>>> : buffers objects sooner or later get swapped out to system memory,
>>>>>> but when we
>>>>>> : now run into an out of memory situation the OOM killer obviously
>>>>>> doesn't knew
>>>>>> : anything about that memory and so usually kills the wrong process.
>>>>> OK, but how do you attribute that memory to a particular OOM killable
>>>>> entity? And how do you actually enforce that those resources get freed
>>>>> on the oom killer action?
>>>>>
>>>>>> : The following set of patches tries to address this problem by
>>>>>> introducing a per
>>>>>> : file OOM badness score, which device drivers can use to give the
>>>>>> OOM killer a
>>>>>> : hint how many resources are bound to a file descriptor so that it
>>>>>> can make
>>>>>> : better decisions which process to kill.
>>>>> But files are not killable, they can be shared... In other words this
>>>>> doesn't help the oom killer to make an educated guess at all.
>>>> Maybe some more context would help the discussion?
>>>>
>>>> The struct file in patch 3 is the DRM fd.  That's effectively "my
>>>> process's interface to talking to the GPU" not "a single GPU resource".
>>>> Once that file is closed, all of the process's private, idle GPU buffers
>>>> will be immediately freed (this will be most of their allocations), and
>>>> some will be freed once the GPU completes some work (this will be most
>>>> of the rest of their allocations).
>>>>
>>>> Some GEM BOs won't be freed just by closing the fd, if they've been
>>>> shared between processes.  Those are usually about 8-24MB total in a
>>>> process, rather than the GBs that modern apps use (or that our testcases
>>>> like to allocate and thus trigger oomkilling of the test harness instead
>>>> of the offending testcase...)
>>>>
>>>> Even if we just had the private+idle buffers being accounted in OOM
>>>> badness, that would be a huge step forward in system reliability.
>>> OK, in that case I would propose a different approach. We already
>>> have rss_stat. So why do not we simply add a new counter there
>>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>>> that such a memory is bound to the process life time. I guess we will
>>> find more users for this later.
>> I already tried that and the problem with that approach is that some
>> buffers are not created by the application which actually uses them.
>>
>> For example X/Wayland is creating and handing out render buffers to
>> application which want to use OpenGL.
>>
>> So the result is when you always account the application who created the
>> buffer the OOM killer will certainly reap X/Wayland first. And that is
>> exactly what we want to avoid here.
> FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
> anymore. With DRI3 and Wayland, buffers are allocated by the clients and
> then shared with the X / Wayland server.

Good point, when I initially looked at that problem DRI3 wasn't widely 
used yet.

> Also, in all cases, the amount of memory allocated for buffers shared
> between DRI/Wayland clients and the server should be relatively small
> compared to the amount of memory allocated for buffers used only locally
> in the client, particularly for clients which create significant memory
> pressure.

That is unfortunately only partially true. When you have a single 
runaway application which tries to allocate everything it would indeed 
work as you described.

But when I tested this a few years ago with an X based desktop, the 
applications which actually used most of the memory were Firefox and 
Thunderbird. Unfortunately they never got accounted for that.

Now, on my current Wayland based desktop it actually doesn't look much 
better. Taking a look at radeon_gem_info/amdgpu_gem_info the majority of 
all memory was allocated either by gnome-shell or Xwayland.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-19  9:58               ` Christian König
  0 siblings, 0 replies; 170+ messages in thread
From: Christian König @ 2018-01-19  9:58 UTC (permalink / raw)
  To: Michel Dänzer, Christian König, Michal Hocko, Eric Anholt
  Cc: Andrey Grodzovsky, dri-devel, linux-kernel, amd-gfx, linux-mm

Am 19.01.2018 um 10:32 schrieb Michel Dänzer:
> On 2018-01-19 09:39 AM, Christian König wrote:
>> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
>>> On Thu 18-01-18 12:01:32, Eric Anholt wrote:
>>>> Michal Hocko <mhocko@kernel.org> writes:
>>>>
>>>>> On Thu 18-01-18 18:00:06, Michal Hocko wrote:
>>>>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>>>>> Hi, this series is a revised version of an RFC sent by Christian
>>>>>>> König
>>>>>>> a few years ago. The original RFC can be found at
>>>>>>> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>>>>>>>
>>>>>>>
>>>>>>> This is the same idea and I've just adressed his concern from the
>>>>>>> original RFC
>>>>>>> and switched to a callback into file_ops instead of a new member
>>>>>>> in struct file.
>>>>>> Please add the full description to the cover letter and do not make
>>>>>> people hunt links.
>>>>>>
>>>>>> Here is the origin cover letter text
>>>>>> : I'm currently working on the issue that when device drivers
>>>>>> allocate memory on
>>>>>> : behalf of an application the OOM killer usually doesn't knew
>>>>>> about that unless
>>>>>> : the application also get this memory mapped into their address
>>>>>> space.
>>>>>> :
>>>>>> : This is especially annoying for graphics drivers where a lot of
>>>>>> the VRAM
>>>>>> : usually isn't CPU accessible and so doesn't make sense to map
>>>>>> into the
>>>>>> : address space of the process using it.
>>>>>> :
>>>>>> : The problem now is that when an application starts to use a lot
>>>>>> of VRAM those
>>>>>> : buffers objects sooner or later get swapped out to system memory,
>>>>>> but when we
>>>>>> : now run into an out of memory situation the OOM killer obviously
>>>>>> doesn't knew
>>>>>> : anything about that memory and so usually kills the wrong process.
>>>>> OK, but how do you attribute that memory to a particular OOM killable
>>>>> entity? And how do you actually enforce that those resources get freed
>>>>> on the oom killer action?
>>>>>
>>>>>> : The following set of patches tries to address this problem by
>>>>>> introducing a per
>>>>>> : file OOM badness score, which device drivers can use to give the
>>>>>> OOM killer a
>>>>>> : hint how many resources are bound to a file descriptor so that it
>>>>>> can make
>>>>>> : better decisions which process to kill.
>>>>> But files are not killable, they can be shared... In other words this
>>>>> doesn't help the oom killer to make an educated guess at all.
>>>> Maybe some more context would help the discussion?
>>>>
>>>> The struct file in patch 3 is the DRM fd.  That's effectively "my
>>>> process's interface to talking to the GPU" not "a single GPU resource".
>>>> Once that file is closed, all of the process's private, idle GPU buffers
>>>> will be immediately freed (this will be most of their allocations), and
>>>> some will be freed once the GPU completes some work (this will be most
>>>> of the rest of their allocations).
>>>>
>>>> Some GEM BOs won't be freed just by closing the fd, if they've been
>>>> shared between processes.  Those are usually about 8-24MB total in a
>>>> process, rather than the GBs that modern apps use (or that our testcases
>>>> like to allocate and thus trigger oomkilling of the test harness instead
>>>> of the offending testcase...)
>>>>
>>>> Even if we just had the private+idle buffers being accounted in OOM
>>>> badness, that would be a huge step forward in system reliability.
>>> OK, in that case I would propose a different approach. We already
>>> have rss_stat. So why do not we simply add a new counter there
>>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>>> that such a memory is bound to the process life time. I guess we will
>>> find more users for this later.
>> I already tried that and the problem with that approach is that some
>> buffers are not created by the application which actually uses them.
>>
>> For example X/Wayland is creating and handing out render buffers to
>> application which want to use OpenGL.
>>
>> So the result is when you always account the application who created the
>> buffer the OOM killer will certainly reap X/Wayland first. And that is
>> exactly what we want to avoid here.
> FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
> anymore. With DRI3 and Wayland, buffers are allocated by the clients and
> then shared with the X / Wayland server.

Good point, when I initially looked at that problem DRI3 wasn't widely 
used yet.

> Also, in all cases, the amount of memory allocated for buffers shared
> between DRI/Wayland clients and the server should be relatively small
> compared to the amount of memory allocated for buffers used only locally
> in the client, particularly for clients which create significant memory
> pressure.

That is unfortunately only partially true. When you have a single 
runaway application which tries to allocate everything it would indeed 
work as you described.

But when I tested this a few years ago with an X based desktop, the 
applications which actually used most of the memory were Firefox and 
Thunderbird. Unfortunately they never got accounted for that.

Now, on my current Wayland based desktop it actually doesn't look much 
better. Taking a look at radeon_gem_info/amdgpu_gem_info the majority of 
all memory was allocated either by gnome-shell or Xwayland.

Regards,
Christian.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-19  9:58               ` Christian König
  0 siblings, 0 replies; 170+ messages in thread
From: Christian König @ 2018-01-19  9:58 UTC (permalink / raw)
  To: Michel Dänzer, Christian König, Michal Hocko, Eric Anholt
  Cc: amd-gfx, linux-kernel, dri-devel, linux-mm

Am 19.01.2018 um 10:32 schrieb Michel Dänzer:
> On 2018-01-19 09:39 AM, Christian König wrote:
>> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
>>> On Thu 18-01-18 12:01:32, Eric Anholt wrote:
>>>> Michal Hocko <mhocko@kernel.org> writes:
>>>>
>>>>> On Thu 18-01-18 18:00:06, Michal Hocko wrote:
>>>>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>>>>> Hi, this series is a revised version of an RFC sent by Christian
>>>>>>> König
>>>>>>> a few years ago. The original RFC can be found at
>>>>>>> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>>>>>>>
>>>>>>>
>>>>>>> This is the same idea and I've just adressed his concern from the
>>>>>>> original RFC
>>>>>>> and switched to a callback into file_ops instead of a new member
>>>>>>> in struct file.
>>>>>> Please add the full description to the cover letter and do not make
>>>>>> people hunt links.
>>>>>>
>>>>>> Here is the origin cover letter text
>>>>>> : I'm currently working on the issue that when device drivers
>>>>>> allocate memory on
>>>>>> : behalf of an application the OOM killer usually doesn't knew
>>>>>> about that unless
>>>>>> : the application also get this memory mapped into their address
>>>>>> space.
>>>>>> :
>>>>>> : This is especially annoying for graphics drivers where a lot of
>>>>>> the VRAM
>>>>>> : usually isn't CPU accessible and so doesn't make sense to map
>>>>>> into the
>>>>>> : address space of the process using it.
>>>>>> :
>>>>>> : The problem now is that when an application starts to use a lot
>>>>>> of VRAM those
>>>>>> : buffers objects sooner or later get swapped out to system memory,
>>>>>> but when we
>>>>>> : now run into an out of memory situation the OOM killer obviously
>>>>>> doesn't knew
>>>>>> : anything about that memory and so usually kills the wrong process.
>>>>> OK, but how do you attribute that memory to a particular OOM killable
>>>>> entity? And how do you actually enforce that those resources get freed
>>>>> on the oom killer action?
>>>>>
>>>>>> : The following set of patches tries to address this problem by
>>>>>> introducing a per
>>>>>> : file OOM badness score, which device drivers can use to give the
>>>>>> OOM killer a
>>>>>> : hint how many resources are bound to a file descriptor so that it
>>>>>> can make
>>>>>> : better decisions which process to kill.
>>>>> But files are not killable, they can be shared... In other words this
>>>>> doesn't help the oom killer to make an educated guess at all.
>>>> Maybe some more context would help the discussion?
>>>>
>>>> The struct file in patch 3 is the DRM fd.  That's effectively "my
>>>> process's interface to talking to the GPU" not "a single GPU resource".
>>>> Once that file is closed, all of the process's private, idle GPU buffers
>>>> will be immediately freed (this will be most of their allocations), and
>>>> some will be freed once the GPU completes some work (this will be most
>>>> of the rest of their allocations).
>>>>
>>>> Some GEM BOs won't be freed just by closing the fd, if they've been
>>>> shared between processes.  Those are usually about 8-24MB total in a
>>>> process, rather than the GBs that modern apps use (or that our testcases
>>>> like to allocate and thus trigger oomkilling of the test harness instead
>>>> of the offending testcase...)
>>>>
>>>> Even if we just had the private+idle buffers being accounted in OOM
>>>> badness, that would be a huge step forward in system reliability.
>>> OK, in that case I would propose a different approach. We already
>>> have rss_stat. So why do not we simply add a new counter there
>>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>>> that such a memory is bound to the process life time. I guess we will
>>> find more users for this later.
>> I already tried that and the problem with that approach is that some
>> buffers are not created by the application which actually uses them.
>>
>> For example X/Wayland is creating and handing out render buffers to
>> application which want to use OpenGL.
>>
>> So the result is when you always account the application who created the
>> buffer the OOM killer will certainly reap X/Wayland first. And that is
>> exactly what we want to avoid here.
> FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
> anymore. With DRI3 and Wayland, buffers are allocated by the clients and
> then shared with the X / Wayland server.

Good point, when I initially looked at that problem DRI3 wasn't widely 
used yet.

> Also, in all cases, the amount of memory allocated for buffers shared
> between DRI/Wayland clients and the server should be relatively small
> compared to the amount of memory allocated for buffers used only locally
> in the client, particularly for clients which create significant memory
> pressure.

That is unfortunately only partially true. When you have a single 
runaway application which tries to allocate everything it would indeed 
work as you described.

But when I tested this a few years ago with an X based desktop, the 
applications which actually used most of the memory were Firefox and 
Thunderbird. Unfortunately they never got accounted for that.

Now, on my current Wayland based desktop it actually doesn't look much 
better. Taking a look at radeon_gem_info/amdgpu_gem_info the majority of 
all memory was allocated either by gnome-shell or Xwayland.

Regards,
Christian.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-19  8:25         ` Michal Hocko
@ 2018-01-19 10:02           ` roger
  -1 siblings, 0 replies; 170+ messages in thread
From: roger @ 2018-01-19 10:02 UTC (permalink / raw)
  To: Michal Hocko, He, Roger
  Cc: Grodzovsky, Andrey, linux-mm, amd-gfx, linux-kernel, dri-devel,
	Koenig, Christian

[-- Attachment #1: Type: text/plain, Size: 991 bytes --]



On 2018年01月19日 16:25, Michal Hocko wrote:
> [removed the broken quoting - please try to use an email client which
> doesn't mess up the quoted text]
>
> On Fri 19-01-18 06:01:26, He, Roger wrote:
> [...]
>> I think you are misunderstanding here.
>> Actually for now, the memory in TTM Pools already has mm_shrink which is implemented in ttm_pool_mm_shrink_init.
>> And here the memory we want to make it contribute to OOM badness is not in TTM Pools.
>> Because when TTM buffer allocation success, the memory already is removed from TTM Pools.
>
>     I have no idea what TTM buffers are. But this smells like something
>     rather specific to the particular subsytem. And my main objection here
>     is that struct file is not a proper vehicle to carry such an
>     information. So whatever the TTM subsystem does it should contribute to
>     generic counters rather than abuse fd because it happens to use it to
>     communicate with userspace.
>
    got it. thanks.
>


[-- Attachment #2: Type: text/html, Size: 1627 bytes --]

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-19 10:02           ` roger
  0 siblings, 0 replies; 170+ messages in thread
From: roger @ 2018-01-19 10:02 UTC (permalink / raw)
  To: Michal Hocko, He, Roger
  Cc: linux-kernel, dri-devel, linux-mm, amd-gfx, Koenig, Christian


[-- Attachment #1.1: Type: text/plain, Size: 987 bytes --]



On 2018年01月19日 16:25, Michal Hocko wrote:
> [removed the broken quoting - please try to use an email client which
> doesn't mess up the quoted text]
>
> On Fri 19-01-18 06:01:26, He, Roger wrote:
> [...]
>> I think you are misunderstanding here.
>> Actually for now, the memory in TTM Pools already has mm_shrink which is implemented in ttm_pool_mm_shrink_init.
>> And here the memory we want to make it contribute to OOM badness is not in TTM Pools.
>> Because when TTM buffer allocation success, the memory already is removed from TTM Pools.
>
>     I have no idea what TTM buffers are. But this smells like something
>     rather specific to the particular subsytem. And my main objection here
>     is that struct file is not a proper vehicle to carry such an
>     information. So whatever the TTM subsystem does it should contribute to
>     generic counters rather than abuse fd because it happens to use it to
>     communicate with userspace.
>
    got it. thanks.
>


[-- Attachment #1.2: Type: text/html, Size: 1623 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-19  9:58               ` Christian König
  (?)
@ 2018-01-19 10:02                 ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-19 10:02 UTC (permalink / raw)
  To: christian.koenig, Michal Hocko, Eric Anholt
  Cc: amd-gfx, linux-kernel, dri-devel, linux-mm

On 2018-01-19 10:58 AM, Christian König wrote:
> Am 19.01.2018 um 10:32 schrieb Michel Dänzer:
>> On 2018-01-19 09:39 AM, Christian König wrote:
>>> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
>>>> On Thu 18-01-18 12:01:32, Eric Anholt wrote:
>>>>> Michal Hocko <mhocko@kernel.org> writes:
>>>>>
>>>>>> On Thu 18-01-18 18:00:06, Michal Hocko wrote:
>>>>>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>>>>>> Hi, this series is a revised version of an RFC sent by Christian
>>>>>>>> König
>>>>>>>> a few years ago. The original RFC can be found at
>>>>>>>> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> This is the same idea and I've just adressed his concern from the
>>>>>>>> original RFC
>>>>>>>> and switched to a callback into file_ops instead of a new member
>>>>>>>> in struct file.
>>>>>>> Please add the full description to the cover letter and do not make
>>>>>>> people hunt links.
>>>>>>>
>>>>>>> Here is the origin cover letter text
>>>>>>> : I'm currently working on the issue that when device drivers
>>>>>>> allocate memory on
>>>>>>> : behalf of an application the OOM killer usually doesn't knew
>>>>>>> about that unless
>>>>>>> : the application also get this memory mapped into their address
>>>>>>> space.
>>>>>>> :
>>>>>>> : This is especially annoying for graphics drivers where a lot of
>>>>>>> the VRAM
>>>>>>> : usually isn't CPU accessible and so doesn't make sense to map
>>>>>>> into the
>>>>>>> : address space of the process using it.
>>>>>>> :
>>>>>>> : The problem now is that when an application starts to use a lot
>>>>>>> of VRAM those
>>>>>>> : buffers objects sooner or later get swapped out to system memory,
>>>>>>> but when we
>>>>>>> : now run into an out of memory situation the OOM killer obviously
>>>>>>> doesn't knew
>>>>>>> : anything about that memory and so usually kills the wrong process.
>>>>>> OK, but how do you attribute that memory to a particular OOM killable
>>>>>> entity? And how do you actually enforce that those resources get
>>>>>> freed
>>>>>> on the oom killer action?
>>>>>>
>>>>>>> : The following set of patches tries to address this problem by
>>>>>>> introducing a per
>>>>>>> : file OOM badness score, which device drivers can use to give the
>>>>>>> OOM killer a
>>>>>>> : hint how many resources are bound to a file descriptor so that it
>>>>>>> can make
>>>>>>> : better decisions which process to kill.
>>>>>> But files are not killable, they can be shared... In other words this
>>>>>> doesn't help the oom killer to make an educated guess at all.
>>>>> Maybe some more context would help the discussion?
>>>>>
>>>>> The struct file in patch 3 is the DRM fd.  That's effectively "my
>>>>> process's interface to talking to the GPU" not "a single GPU
>>>>> resource".
>>>>> Once that file is closed, all of the process's private, idle GPU
>>>>> buffers
>>>>> will be immediately freed (this will be most of their allocations),
>>>>> and
>>>>> some will be freed once the GPU completes some work (this will be most
>>>>> of the rest of their allocations).
>>>>>
>>>>> Some GEM BOs won't be freed just by closing the fd, if they've been
>>>>> shared between processes.  Those are usually about 8-24MB total in a
>>>>> process, rather than the GBs that modern apps use (or that our
>>>>> testcases
>>>>> like to allocate and thus trigger oomkilling of the test harness
>>>>> instead
>>>>> of the offending testcase...)
>>>>>
>>>>> Even if we just had the private+idle buffers being accounted in OOM
>>>>> badness, that would be a huge step forward in system reliability.
>>>> OK, in that case I would propose a different approach. We already
>>>> have rss_stat. So why do not we simply add a new counter there
>>>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>>>> that such a memory is bound to the process life time. I guess we will
>>>> find more users for this later.
>>> I already tried that and the problem with that approach is that some
>>> buffers are not created by the application which actually uses them.
>>>
>>> For example X/Wayland is creating and handing out render buffers to
>>> application which want to use OpenGL.
>>>
>>> So the result is when you always account the application who created the
>>> buffer the OOM killer will certainly reap X/Wayland first. And that is
>>> exactly what we want to avoid here.
>> FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
>> anymore. With DRI3 and Wayland, buffers are allocated by the clients and
>> then shared with the X / Wayland server.
> 
> Good point, when I initially looked at that problem DRI3 wasn't widely
> used yet.
> 
>> Also, in all cases, the amount of memory allocated for buffers shared
>> between DRI/Wayland clients and the server should be relatively small
>> compared to the amount of memory allocated for buffers used only locally
>> in the client, particularly for clients which create significant memory
>> pressure.
> 
> That is unfortunately only partially true. When you have a single
> runaway application which tries to allocate everything it would indeed
> work as you described.
> 
> But when I tested this a few years ago with an X based desktop, the
> applications which actually used most of the memory were Firefox and
> Thunderbird. Unfortunately they never got accounted for that.
> 
> Now, on my current Wayland based desktop it actually doesn't look much
> better. Taking a look at radeon_gem_info/amdgpu_gem_info the majority of
> all memory was allocated either by gnome-shell or Xwayland.

My guess would be this is due to pixmaps, which allow X clients to cause
the X server to allocate essentially unlimited amounts of memory. It's a
separate issue, which would require a different solution than what we're
discussing in this thread. Maybe something that would allow the X server
to tell the kernel that some of the memory it allocates is for the
client process.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-19 10:02                 ` Michel Dänzer
  0 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-19 10:02 UTC (permalink / raw)
  To: christian.koenig, Michal Hocko, Eric Anholt
  Cc: amd-gfx, linux-kernel, dri-devel, linux-mm

On 2018-01-19 10:58 AM, Christian König wrote:
> Am 19.01.2018 um 10:32 schrieb Michel Dänzer:
>> On 2018-01-19 09:39 AM, Christian König wrote:
>>> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
>>>> On Thu 18-01-18 12:01:32, Eric Anholt wrote:
>>>>> Michal Hocko <mhocko@kernel.org> writes:
>>>>>
>>>>>> On Thu 18-01-18 18:00:06, Michal Hocko wrote:
>>>>>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>>>>>> Hi, this series is a revised version of an RFC sent by Christian
>>>>>>>> König
>>>>>>>> a few years ago. The original RFC can be found at
>>>>>>>> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> This is the same idea and I've just adressed his concern from the
>>>>>>>> original RFC
>>>>>>>> and switched to a callback into file_ops instead of a new member
>>>>>>>> in struct file.
>>>>>>> Please add the full description to the cover letter and do not make
>>>>>>> people hunt links.
>>>>>>>
>>>>>>> Here is the origin cover letter text
>>>>>>> : I'm currently working on the issue that when device drivers
>>>>>>> allocate memory on
>>>>>>> : behalf of an application the OOM killer usually doesn't knew
>>>>>>> about that unless
>>>>>>> : the application also get this memory mapped into their address
>>>>>>> space.
>>>>>>> :
>>>>>>> : This is especially annoying for graphics drivers where a lot of
>>>>>>> the VRAM
>>>>>>> : usually isn't CPU accessible and so doesn't make sense to map
>>>>>>> into the
>>>>>>> : address space of the process using it.
>>>>>>> :
>>>>>>> : The problem now is that when an application starts to use a lot
>>>>>>> of VRAM those
>>>>>>> : buffers objects sooner or later get swapped out to system memory,
>>>>>>> but when we
>>>>>>> : now run into an out of memory situation the OOM killer obviously
>>>>>>> doesn't knew
>>>>>>> : anything about that memory and so usually kills the wrong process.
>>>>>> OK, but how do you attribute that memory to a particular OOM killable
>>>>>> entity? And how do you actually enforce that those resources get
>>>>>> freed
>>>>>> on the oom killer action?
>>>>>>
>>>>>>> : The following set of patches tries to address this problem by
>>>>>>> introducing a per
>>>>>>> : file OOM badness score, which device drivers can use to give the
>>>>>>> OOM killer a
>>>>>>> : hint how many resources are bound to a file descriptor so that it
>>>>>>> can make
>>>>>>> : better decisions which process to kill.
>>>>>> But files are not killable, they can be shared... In other words this
>>>>>> doesn't help the oom killer to make an educated guess at all.
>>>>> Maybe some more context would help the discussion?
>>>>>
>>>>> The struct file in patch 3 is the DRM fd.  That's effectively "my
>>>>> process's interface to talking to the GPU" not "a single GPU
>>>>> resource".
>>>>> Once that file is closed, all of the process's private, idle GPU
>>>>> buffers
>>>>> will be immediately freed (this will be most of their allocations),
>>>>> and
>>>>> some will be freed once the GPU completes some work (this will be most
>>>>> of the rest of their allocations).
>>>>>
>>>>> Some GEM BOs won't be freed just by closing the fd, if they've been
>>>>> shared between processes.  Those are usually about 8-24MB total in a
>>>>> process, rather than the GBs that modern apps use (or that our
>>>>> testcases
>>>>> like to allocate and thus trigger oomkilling of the test harness
>>>>> instead
>>>>> of the offending testcase...)
>>>>>
>>>>> Even if we just had the private+idle buffers being accounted in OOM
>>>>> badness, that would be a huge step forward in system reliability.
>>>> OK, in that case I would propose a different approach. We already
>>>> have rss_stat. So why do not we simply add a new counter there
>>>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>>>> that such a memory is bound to the process life time. I guess we will
>>>> find more users for this later.
>>> I already tried that and the problem with that approach is that some
>>> buffers are not created by the application which actually uses them.
>>>
>>> For example X/Wayland is creating and handing out render buffers to
>>> application which want to use OpenGL.
>>>
>>> So the result is when you always account the application who created the
>>> buffer the OOM killer will certainly reap X/Wayland first. And that is
>>> exactly what we want to avoid here.
>> FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
>> anymore. With DRI3 and Wayland, buffers are allocated by the clients and
>> then shared with the X / Wayland server.
> 
> Good point, when I initially looked at that problem DRI3 wasn't widely
> used yet.
> 
>> Also, in all cases, the amount of memory allocated for buffers shared
>> between DRI/Wayland clients and the server should be relatively small
>> compared to the amount of memory allocated for buffers used only locally
>> in the client, particularly for clients which create significant memory
>> pressure.
> 
> That is unfortunately only partially true. When you have a single
> runaway application which tries to allocate everything it would indeed
> work as you described.
> 
> But when I tested this a few years ago with an X based desktop, the
> applications which actually used most of the memory were Firefox and
> Thunderbird. Unfortunately they never got accounted for that.
> 
> Now, on my current Wayland based desktop it actually doesn't look much
> better. Taking a look at radeon_gem_info/amdgpu_gem_info the majority of
> all memory was allocated either by gnome-shell or Xwayland.

My guess would be this is due to pixmaps, which allow X clients to cause
the X server to allocate essentially unlimited amounts of memory. It's a
separate issue, which would require a different solution than what we're
discussing in this thread. Maybe something that would allow the X server
to tell the kernel that some of the memory it allocates is for the
client process.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-19 10:02                 ` Michel Dänzer
  0 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-19 10:02 UTC (permalink / raw)
  To: christian.koenig-5C7GfCeVMHo, Michal Hocko, Eric Anholt
  Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2018-01-19 10:58 AM, Christian König wrote:
> Am 19.01.2018 um 10:32 schrieb Michel Dänzer:
>> On 2018-01-19 09:39 AM, Christian König wrote:
>>> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
>>>> On Thu 18-01-18 12:01:32, Eric Anholt wrote:
>>>>> Michal Hocko <mhocko@kernel.org> writes:
>>>>>
>>>>>> On Thu 18-01-18 18:00:06, Michal Hocko wrote:
>>>>>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>>>>>> Hi, this series is a revised version of an RFC sent by Christian
>>>>>>>> König
>>>>>>>> a few years ago. The original RFC can be found at
>>>>>>>> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> This is the same idea and I've just adressed his concern from the
>>>>>>>> original RFC
>>>>>>>> and switched to a callback into file_ops instead of a new member
>>>>>>>> in struct file.
>>>>>>> Please add the full description to the cover letter and do not make
>>>>>>> people hunt links.
>>>>>>>
>>>>>>> Here is the origin cover letter text
>>>>>>> : I'm currently working on the issue that when device drivers
>>>>>>> allocate memory on
>>>>>>> : behalf of an application the OOM killer usually doesn't knew
>>>>>>> about that unless
>>>>>>> : the application also get this memory mapped into their address
>>>>>>> space.
>>>>>>> :
>>>>>>> : This is especially annoying for graphics drivers where a lot of
>>>>>>> the VRAM
>>>>>>> : usually isn't CPU accessible and so doesn't make sense to map
>>>>>>> into the
>>>>>>> : address space of the process using it.
>>>>>>> :
>>>>>>> : The problem now is that when an application starts to use a lot
>>>>>>> of VRAM those
>>>>>>> : buffers objects sooner or later get swapped out to system memory,
>>>>>>> but when we
>>>>>>> : now run into an out of memory situation the OOM killer obviously
>>>>>>> doesn't knew
>>>>>>> : anything about that memory and so usually kills the wrong process.
>>>>>> OK, but how do you attribute that memory to a particular OOM killable
>>>>>> entity? And how do you actually enforce that those resources get
>>>>>> freed
>>>>>> on the oom killer action?
>>>>>>
>>>>>>> : The following set of patches tries to address this problem by
>>>>>>> introducing a per
>>>>>>> : file OOM badness score, which device drivers can use to give the
>>>>>>> OOM killer a
>>>>>>> : hint how many resources are bound to a file descriptor so that it
>>>>>>> can make
>>>>>>> : better decisions which process to kill.
>>>>>> But files are not killable, they can be shared... In other words this
>>>>>> doesn't help the oom killer to make an educated guess at all.
>>>>> Maybe some more context would help the discussion?
>>>>>
>>>>> The struct file in patch 3 is the DRM fd.  That's effectively "my
>>>>> process's interface to talking to the GPU" not "a single GPU
>>>>> resource".
>>>>> Once that file is closed, all of the process's private, idle GPU
>>>>> buffers
>>>>> will be immediately freed (this will be most of their allocations),
>>>>> and
>>>>> some will be freed once the GPU completes some work (this will be most
>>>>> of the rest of their allocations).
>>>>>
>>>>> Some GEM BOs won't be freed just by closing the fd, if they've been
>>>>> shared between processes.  Those are usually about 8-24MB total in a
>>>>> process, rather than the GBs that modern apps use (or that our
>>>>> testcases
>>>>> like to allocate and thus trigger oomkilling of the test harness
>>>>> instead
>>>>> of the offending testcase...)
>>>>>
>>>>> Even if we just had the private+idle buffers being accounted in OOM
>>>>> badness, that would be a huge step forward in system reliability.
>>>> OK, in that case I would propose a different approach. We already
>>>> have rss_stat. So why do not we simply add a new counter there
>>>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>>>> that such a memory is bound to the process life time. I guess we will
>>>> find more users for this later.
>>> I already tried that and the problem with that approach is that some
>>> buffers are not created by the application which actually uses them.
>>>
>>> For example X/Wayland is creating and handing out render buffers to
>>> application which want to use OpenGL.
>>>
>>> So the result is when you always account the application who created the
>>> buffer the OOM killer will certainly reap X/Wayland first. And that is
>>> exactly what we want to avoid here.
>> FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
>> anymore. With DRI3 and Wayland, buffers are allocated by the clients and
>> then shared with the X / Wayland server.
> 
> Good point, when I initially looked at that problem DRI3 wasn't widely
> used yet.
> 
>> Also, in all cases, the amount of memory allocated for buffers shared
>> between DRI/Wayland clients and the server should be relatively small
>> compared to the amount of memory allocated for buffers used only locally
>> in the client, particularly for clients which create significant memory
>> pressure.
> 
> That is unfortunately only partially true. When you have a single
> runaway application which tries to allocate everything it would indeed
> work as you described.
> 
> But when I tested this a few years ago with an X based desktop, the
> applications which actually used most of the memory were Firefox and
> Thunderbird. Unfortunately they never got accounted for that.
> 
> Now, on my current Wayland based desktop it actually doesn't look much
> better. Taking a look at radeon_gem_info/amdgpu_gem_info the majority of
> all memory was allocated either by gnome-shell or Xwayland.

My guess would be this is due to pixmaps, which allow X clients to cause
the X server to allocate essentially unlimited amounts of memory. It's a
separate issue, which would require a different solution than what we're
discussing in this thread. Maybe something that would allow the X server
to tell the kernel that some of the memory it allocates is for the
client process.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-19  8:39           ` Christian König
  (?)
@ 2018-01-19 10:40             ` Michal Hocko
  -1 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-19 10:40 UTC (permalink / raw)
  To: Christian König
  Cc: Eric Anholt, Andrey Grodzovsky, linux-kernel, linux-mm,
	dri-devel, amd-gfx

On Fri 19-01-18 09:39:03, Christian König wrote:
> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
[...]
> > OK, in that case I would propose a different approach. We already
> > have rss_stat. So why do not we simply add a new counter there
> > MM_KERNELPAGES and consider those in oom_badness? The rule would be
> > that such a memory is bound to the process life time. I guess we will
> > find more users for this later.
> 
> I already tried that and the problem with that approach is that some buffers
> are not created by the application which actually uses them.
> 
> For example X/Wayland is creating and handing out render buffers to
> application which want to use OpenGL.
> 
> So the result is when you always account the application who created the
> buffer the OOM killer will certainly reap X/Wayland first. And that is
> exactly what we want to avoid here.

Then you have to find the target allocation context at the time of the
allocation and account it. As follow-up emails show, implementations
might differ, and any robust oom solution has to rely on the common
counters.
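
To make the idea more concrete, here is a minimal sketch of what I have in
mind (purely illustrative, not even compile tested; the counter name and
the exact oom_badness() change are just placeholders):

/* include/linux/mm_types_task.h: add the new counter next to the
 * existing per-mm rss counters */
enum {
	MM_FILEPAGES,	/* Resident file mapping pages */
	MM_ANONPAGES,	/* Resident anonymous pages */
	MM_SWAPENTS,	/* Anonymous swap entries */
	MM_SHMEMPAGES,	/* Resident shared memory pages */
	MM_KERNELPAGES,	/* driver memory bound to this mm (new) */
	NR_MM_COUNTERS
};

/* helpers a driver would call when it allocates/frees pages whose
 * lifetime is bound to the process */
static inline void mm_charge_kernel_pages(struct mm_struct *mm, long nr)
{
	add_mm_counter(mm, MM_KERNELPAGES, nr);
}

static inline void mm_uncharge_kernel_pages(struct mm_struct *mm, long nr)
{
	add_mm_counter(mm, MM_KERNELPAGES, -nr);
}

/* mm/oom_kill.c: oom_badness() then simply adds the new counter to the
 * existing rss/swap based terms, e.g. */
	points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS);
	points += get_mm_counter(p->mm, MM_KERNELPAGES);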
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-19 10:40             ` Michal Hocko
  0 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-19 10:40 UTC (permalink / raw)
  To: Christian König
  Cc: Eric Anholt, Andrey Grodzovsky, linux-kernel, linux-mm,
	dri-devel, amd-gfx

On Fri 19-01-18 09:39:03, Christian König wrote:
> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
[...]
> > OK, in that case I would propose a different approach. We already
> > have rss_stat. So why do not we simply add a new counter there
> > MM_KERNELPAGES and consider those in oom_badness? The rule would be
> > that such a memory is bound to the process life time. I guess we will
> > find more users for this later.
> 
> I already tried that and the problem with that approach is that some buffers
> are not created by the application which actually uses them.
> 
> For example X/Wayland is creating and handing out render buffers to
> application which want to use OpenGL.
> 
> So the result is when you always account the application who created the
> buffer the OOM killer will certainly reap X/Wayland first. And that is
> exactly what we want to avoid here.

Then you have to find the target allocation context at the time of the
allocation and account it. As follow-up emails show, implementations
might differ, and any robust oom solution has to rely on the common
counters.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-19 10:40             ` Michal Hocko
  0 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-19 10:40 UTC (permalink / raw)
  To: Christian König
  Cc: Eric Anholt, Andrey Grodzovsky, linux-kernel, linux-mm,
	dri-devel, amd-gfx

On Fri 19-01-18 09:39:03, Christian König wrote:
> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
[...]
> > OK, in that case I would propose a different approach. We already
> > have rss_stat. So why do not we simply add a new counter there
> > MM_KERNELPAGES and consider those in oom_badness? The rule would be
> > that such a memory is bound to the process life time. I guess we will
> > find more users for this later.
> 
> I already tried that and the problem with that approach is that some buffers
> are not created by the application which actually uses them.
> 
> For example X/Wayland is creating and handing out render buffers to
> application which want to use OpenGL.
> 
> So the result is when you always account the application who created the
> buffer the OOM killer will certainly reap X/Wayland first. And that is
> exactly what we want to avoid here.

Then you have to find the target allocation context at the time of the
allocation and account it. As follow-up emails show, implementations
might differ, and any robust oom solution has to rely on the common
counters.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-19 10:40             ` Michal Hocko
  (?)
@ 2018-01-19 11:37               ` Christian König
  -1 siblings, 0 replies; 170+ messages in thread
From: Christian König @ 2018-01-19 11:37 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Eric Anholt, Andrey Grodzovsky, linux-kernel, linux-mm,
	dri-devel, amd-gfx

Am 19.01.2018 um 11:40 schrieb Michal Hocko:
> On Fri 19-01-18 09:39:03, Christian König wrote:
>> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
> [...]
>>> OK, in that case I would propose a different approach. We already
>>> have rss_stat. So why do not we simply add a new counter there
>>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>>> that such a memory is bound to the process life time. I guess we will
>>> find more users for this later.
>> I already tried that and the problem with that approach is that some buffers
>> are not created by the application which actually uses them.
>>
>> For example X/Wayland is creating and handing out render buffers to
>> application which want to use OpenGL.
>>
>> So the result is when you always account the application who created the
>> buffer the OOM killer will certainly reap X/Wayland first. And that is
>> exactly what we want to avoid here.
> Then you have to find the target allocation context at the time of the
> allocation and account it.

And that's exactly the root of the problem: the target allocation 
context isn't known at the time of the allocation.

We could add callbacks which are invoked when the memory is passed from the 
allocator to the actual user of the memory. In other words, when the 
memory is passed from the X server to the client, the driver would need 
to decrement the X server's accounting and increment the client's accounting.
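
Something like the following (purely illustrative, assuming the
MM_KERNELPAGES counter suggested above) would then have to run on every
handover of a buffer from one process to another:

#include <linux/mm.h>
#include <linux/sched/mm.h>

/* Illustrative only: move the accounting for nr_pages of buffer memory
 * from the task which allocated the buffer to the task which actually
 * uses it. */
static void buffer_accounting_handover(struct task_struct *from,
				       struct task_struct *to,
				       long nr_pages)
{
	struct mm_struct *mm;

	mm = get_task_mm(from);
	if (mm) {
		add_mm_counter(mm, MM_KERNELPAGES, -nr_pages);
		mmput(mm);
	}

	mm = get_task_mm(to);
	if (mm) {
		add_mm_counter(mm, MM_KERNELPAGES, nr_pages);
		mmput(mm);
	}
}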

But I think that would go deep into the file descriptor handling (we 
would at least need to handle dup/dup2 and passing the fd using unix 
domain sockets) and most likely would be rather error prone.

The per file descriptor badness is/was just the much easier approach to 
solve the issue, because the drivers already knew which client is 
currently using which buffer objects.

I of course agree that file descriptors can be shared between processes 
and are by themselves not killable. But at least for our graphics driven 
use case I don't see much of a problem killing all processes when a file 
descriptor is used by more than one at the same time.

Regards,
Christian.

> As follow-up emails show, implementations
> might differ and any robust oom solution has to rely on the common
> counters.

^ permalink raw reply	[flat|nested] 170+ messages in thread
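
For context, the hand-over Christian describes could look roughly like the
helper below. It assumes the hypothetical MM_KERNELPAGES counter sketched
earlier and a made-up function name; as the mail notes, the hard part is
finding a reliable place to call it when a buffer changes hands (dup,
fd passing over unix sockets, etc.).

static void bo_transfer_accounting(struct drm_gem_object *obj,
				   struct mm_struct *from,
				   struct mm_struct *to)
{
	long pages = obj->size >> PAGE_SHIFT;

	/* uncharge the allocator (e.g. the X server) ... */
	add_mm_counter(from, MM_KERNELPAGES, -pages);
	/* ... and charge the process that actually uses the buffer */
	add_mm_counter(to, MM_KERNELPAGES, pages);
}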

* Re: [RFC] Per file OOM badness
  2018-01-19 11:37               ` Christian König
  (?)
@ 2018-01-19 12:13                 ` Michal Hocko
  -1 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-19 12:13 UTC (permalink / raw)
  To: Christian König
  Cc: Eric Anholt, Andrey Grodzovsky, linux-kernel, linux-mm,
	dri-devel, amd-gfx

On Fri 19-01-18 12:37:51, Christian König wrote:
[...]
> The per file descriptor badness is/was just the much easier approach to
> solve the issue, because the drivers already knew which client is currently
> using which buffer objects.
> 
> I of course agree that file descriptors can be shared between processes and
> are by themselves not killable. But at least for our graphics driven use
> case I don't see much of a problem killing all processes when a file
> descriptor is used by more than one at the same time.

Ohh, I absolutely see why you have chosen this way for your particular
usecase. I am just arguing that this would rather be more generic to be
merged. If there is absolutely no other way around we can consider it
but right now I do not see that all other options have been considered
properly. Especially when the fd based approach is basically wrong for
almost anybody else.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-19 12:13                 ` Michal Hocko
  (?)
@ 2018-01-19 12:20                   ` Michal Hocko
  -1 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-19 12:20 UTC (permalink / raw)
  To: Christian König
  Cc: Eric Anholt, Andrey Grodzovsky, linux-kernel, linux-mm,
	dri-devel, amd-gfx

On Fri 19-01-18 13:13:51, Michal Hocko wrote:
> On Fri 19-01-18 12:37:51, Christian König wrote:
> [...]
> > The per file descriptor badness is/was just the much easier approach to
> > solve the issue, because the drivers already knew which client is currently
> > using which buffer objects.
> > 
> > I of course agree that file descriptors can be shared between processes and
> > are by themselves not killable. But at least for our graphics driven use
> > case I don't see much of a problem killing all processes when a file
> > descriptor is used by more than one at the same time.
> 
> Ohh, I absolutely see why you have chosen this way for your particular
> usecase. I am just arguing that this would rather be more generic to be
> merged. If there is absolutely no other way around we can consider it
> but right now I do not see that all other options have been considered
> properly. Especially when the fd based approach is basically wrong for
> almost anybody else.

And more importantly: iterating over _all_ fds, which is what your
approach is based on AFAIU, is not acceptable for the OOM path. Even
though oom_badness is not a hot path we do not really want it to take a
lot of time either. Even the current iteration over all processes is
quite time consuming. Now you want to multiply that by the number of
opened files, and that might be quite many per process.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 170+ messages in thread
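
To make the cost concrete: the per-fd approach implies roughly the walk
below for every candidate task. oom_file_badness is the callback added in
patch 1/4 of this series; the helper itself and its locking are only a
sketch (a real version would also need get_files_struct() or task_lock()).

static long oom_badness_from_files(struct task_struct *p)
{
	struct files_struct *files = p->files;
	struct fdtable *fdt;
	unsigned int fd;
	long points = 0;

	if (!files)
		return 0;

	rcu_read_lock();
	fdt = files_fdtable(files);
	for (fd = 0; fd < fdt->max_fds; fd++) {
		struct file *file = fcheck_files(files, fd);

		/* ask the driver how much memory this file pins */
		if (file && file->f_op->oom_file_badness)
			points += file->f_op->oom_file_badness(file);
	}
	rcu_read_unlock();

	return points;
}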

* Re: [RFC] Per file OOM badness
  2018-01-19 10:02                 ` Michel Dänzer
@ 2018-01-19 15:07                   ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-19 15:07 UTC (permalink / raw)
  To: christian.koenig, Michal Hocko, Eric Anholt
  Cc: linux-mm, dri-devel, linux-kernel, amd-gfx

On 2018-01-19 11:02 AM, Michel Dänzer wrote:
> On 2018-01-19 10:58 AM, Christian König wrote:
>> Am 19.01.2018 um 10:32 schrieb Michel Dänzer:
>>> On 2018-01-19 09:39 AM, Christian König wrote:
>>>> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
>>>>> OK, in that case I would propose a different approach. We already
>>>>> have rss_stat. So why do not we simply add a new counter there
>>>>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>>>>> that such a memory is bound to the process life time. I guess we will
>>>>> find more users for this later.
>>>> I already tried that and the problem with that approach is that some
>>>> buffers are not created by the application which actually uses them.
>>>>
>>>> For example X/Wayland is creating and handing out render buffers to
>>>> application which want to use OpenGL.
>>>>
>>>> So the result is when you always account the application who created the
>>>> buffer the OOM killer will certainly reap X/Wayland first. And that is
>>>> exactly what we want to avoid here.
>>> FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
>>> anymore. With DRI3 and Wayland, buffers are allocated by the clients and
>>> then shared with the X / Wayland server.
>>
>> Good point, when I initially looked at that problem DRI3 wasn't widely
>> used yet.
>>
>>> Also, in all cases, the amount of memory allocated for buffers shared
>>> between DRI/Wayland clients and the server should be relatively small
>>> compared to the amount of memory allocated for buffers used only locally
>>> in the client, particularly for clients which create significant memory
>>> pressure.
>>
>> That is unfortunately only partially true. When you have a single
>> runaway application which tries to allocate everything it would indeed
>> work as you described.
>>
>> But when I tested this a few years ago with an X based desktop the
>> applications which actually used most of the memory were Firefox and
>> Thunderbird. Unfortunately they never got accounted for that.
>>
>> Now, on my current Wayland based desktop it actually doesn't look much
>> better. Taking a look at radeon_gem_info/amdgpu_gem_info the majority of
>> all memory was allocated either by gnome-shell or Xwayland.
> 
> My guess would be this is due to pixmaps, which allow X clients to cause
> the X server to allocate essentially unlimited amounts of memory. It's a
> separate issue, which would require a different solution than what we're
> discussing in this thread. Maybe something that would allow the X server
> to tell the kernel that some of the memory it allocates is for the
> client process.

Of course, such a mechanism could probably be abused to incorrectly
blame other processes for one's own memory consumption...


I'm not sure if the pixmap issue can be solved for the OOM killer. It's
an X design issue which is fixed with Wayland. So it's probably better
to ignore it for this discussion.

Also, I really think the issue with DRM buffers being shared between
processes isn't significant for the OOM killer compared to DRM buffers
only used in the same process that allocates them. So I suggest focusing
on the latter.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-19 11:37               ` Christian König
  (?)
@ 2018-01-19 16:48                 ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-19 16:48 UTC (permalink / raw)
  To: Christian König, Michal Hocko
  Cc: linux-kernel, amd-gfx, linux-mm, dri-devel

On 2018-01-19 12:37 PM, Christian König wrote:
> Am 19.01.2018 um 11:40 schrieb Michal Hocko:
>> On Fri 19-01-18 09:39:03, Christian König wrote:
>>> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
>> [...]
>>>> OK, in that case I would propose a different approach. We already
>>>> have rss_stat. So why do not we simply add a new counter there
>>>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>>>> that such a memory is bound to the process life time. I guess we will
>>>> find more users for this later.
>>> I already tried that and the problem with that approach is that some
>>> buffers
>>> are not created by the application which actually uses them.
>>>
>>> For example X/Wayland is creating and handing out render buffers to
>>> application which want to use OpenGL.
>>>
>>> So the result is when you always account the application who created the
>>> buffer the OOM killer will certainly reap X/Wayland first. And that is
>>> exactly what we want to avoid here.
>> Then you have to find the target allocation context at the time of the
>> allocation and account it.
> 
> And exactly that's the root of the problem: The target allocation
> context isn't known at the time of the allocation.
> 
> We could add callbacks to handle the case where the memory is passed on
> from the allocator to the actual user of the memory. In other words, when
> the memory is passed from the X server to the client, the driver would need
> to decrement the X server's accounting and increment the client's accounting.
> 
> But I think that would go deep into the file descriptor handling (we
> would at least need to handle dup/dup2 and passing the fd using unix
> domain sockets) and most likely would be rather error prone.
> 
> The per file descriptor badness is/was just the much easier approach to
> solve the issue, because the drivers already knew which client is
> currently using which buffer objects.
> 
> I of course agree that file descriptors can be shared between processes
> and are by themselves not killable. But at least for our graphics driven
> use case I don't see much of a problem killing all processes when a file
> descriptor is used by more than one at the same time.

In that case, accounting a BO as suggested by Michal above, in every
process that shares it, should work fine, shouldn't it?

The OOM killer will first select the process which has more memory
accounted for other things than the BOs shared with another process.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread
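
Accounting a BO in every process that holds a handle to it, as suggested
above, could be done from a driver's gem_open_object/gem_close_object
hooks, which run when a handle is created or dropped in a DRM file. A rough
sketch with a made-up "foo" driver prefix, again assuming the hypothetical
MM_KERNELPAGES counter (and glossing over the fact that close may not
always run in the same mm that opened):

static int foo_gem_open_object(struct drm_gem_object *obj,
			       struct drm_file *file_priv)
{
	/* charge the process that just obtained a handle to the BO */
	add_mm_counter(current->mm, MM_KERNELPAGES, obj->size >> PAGE_SHIFT);
	return 0;
}

static void foo_gem_close_object(struct drm_gem_object *obj,
				 struct drm_file *file_priv)
{
	/* drop the charge when the handle goes away */
	add_mm_counter(current->mm, MM_KERNELPAGES,
		       -(long)(obj->size >> PAGE_SHIFT));
}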

* Re: [RFC] Per file OOM badness
  2018-01-19 12:20                   ` Michal Hocko
  (?)
@ 2018-01-19 16:54                     ` Christian König
  -1 siblings, 0 replies; 170+ messages in thread
From: Christian König @ 2018-01-19 16:54 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Eric Anholt, Andrey Grodzovsky, linux-kernel, linux-mm,
	dri-devel, amd-gfx

Am 19.01.2018 um 13:20 schrieb Michal Hocko:
> On Fri 19-01-18 13:13:51, Michal Hocko wrote:
>> On Fri 19-01-18 12:37:51, Christian König wrote:
>> [...]
>>> The per file descriptor badness is/was just the much easier approach to
>>> solve the issue, because the drivers already knew which client is currently
>>> using which buffer objects.
>>>
>>> I of course agree that file descriptors can be shared between processes and
>>> are by themselves not killable. But at least for our graphics driven use
>>> case I don't see much of a problem killing all processes when a file
>>> descriptor is used by more than one at the same time.
>> Ohh, I absolutely see why you have chosen this way for your particular
>> usecase. I am just arguing that this would rather be more generic to be
>> merged. If there is absolutely no other way around we can consider it
>> but right now I do not see that all other options have been considered
>> properly. Especially when the fd based approach is basically wrong for
>> almost anybody else.
> And more importantly: iterating over _all_ fds, which is what your
> approach is based on AFAIU, is not acceptable for the OOM path. Even
> though oom_badness is not a hot path we do not really want it to take a
> lot of time either. Even the current iteration over all processes is
> quite time consuming. Now you want to multiply that by the number of
> opened files, and that might be quite many per process.

Mhm, crap that is a really good argument.

How about adding a linked list of callbacks for the OOM killer to check 
for each process?

This way we can avoid finding the process where we need to account 
things on when memory is allocated and still allow the OOM killer to 
only check the specific callbacks it needs to determine the score of a 
process?

Would still require some changes in the fs layer, but I think that 
should be doable.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 170+ messages in thread
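
The shape of the callback-list idea might be something like the fragment
below. Everything here is made up for illustration: the structure, the
oom_score_callbacks list hanging off the task, and where drivers would
register.

struct oom_score_cb {
	struct list_head node;
	long (*badness)(struct oom_score_cb *cb);	/* driver-provided scorer */
};

/* driver: register a scorer when a client opens the device */
list_add_tail(&cb->node, &current->oom_score_callbacks);

/* oom_badness(): walk only the short per-task list, not every fd */
list_for_each_entry(cb, &p->oom_score_callbacks, node)
	points += cb->badness(cb);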

* Re: [RFC] Per file OOM badness
  2018-01-19 15:07                   ` Michel Dänzer
  (?)
@ 2018-01-21  6:50                   ` Eric Anholt
  -1 siblings, 0 replies; 170+ messages in thread
From: Eric Anholt @ 2018-01-21  6:50 UTC (permalink / raw)
  To: Michel Dänzer, christian.koenig, Michal Hocko
  Cc: linux-mm, dri-devel, linux-kernel, amd-gfx

Michel Dänzer <michel@daenzer.net> writes:

> On 2018-01-19 11:02 AM, Michel Dänzer wrote:
>> On 2018-01-19 10:58 AM, Christian König wrote:
>>> Am 19.01.2018 um 10:32 schrieb Michel Dänzer:
>>>> On 2018-01-19 09:39 AM, Christian König wrote:
>>>>> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
>>>>>> OK, in that case I would propose a different approach. We already
>>>>>> have rss_stat. So why do not we simply add a new counter there
>>>>>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>>>>>> that such a memory is bound to the process life time. I guess we will
>>>>>> find more users for this later.
>>>>> I already tried that and the problem with that approach is that some
>>>>> buffers are not created by the application which actually uses them.
>>>>>
>>>>> For example X/Wayland is creating and handing out render buffers to
>>>>> application which want to use OpenGL.
>>>>>
>>>>> So the result is when you always account the application who created the
>>>>> buffer the OOM killer will certainly reap X/Wayland first. And that is
>>>>> exactly what we want to avoid here.
>>>> FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
>>>> anymore. With DRI3 and Wayland, buffers are allocated by the clients and
>>>> then shared with the X / Wayland server.
>>>
>>> Good point, when I initially looked at that problem DRI3 wasn't widely
>>> used yet.
>>>
>>>> Also, in all cases, the amount of memory allocated for buffers shared
>>>> between DRI/Wayland clients and the server should be relatively small
>>>> compared to the amount of memory allocated for buffers used only locally
>>>> in the client, particularly for clients which create significant memory
>>>> pressure.
>>>
>>> That is unfortunately only partially true. When you have a single
>>> runaway application which tries to allocate everything it would indeed
>>> work as you described.
>>>
>>> But when I tested this a few years ago with an X based desktop the
>>> applications which actually used most of the memory were Firefox and
>>> Thunderbird. Unfortunately they never got accounted for that.
>>>
>>> Now, on my current Wayland based desktop it actually doesn't look much
>>> better. Taking a look at radeon_gem_info/amdgpu_gem_info the majority of
>>> all memory was allocated either by gnome-shell or Xwayland.
>> 
>> My guess would be this is due to pixmaps, which allow X clients to cause
>> the X server to allocate essentially unlimited amounts of memory. It's a
>> separate issue, which would require a different solution than what we're
>> discussing in this thread. Maybe something that would allow the X server
>> to tell the kernel that some of the memory it allocates is for the
>> client process.
>
> Of course, such a mechanism could probably be abused to incorrectly
> blame other processes for one's own memory consumption...
>
>
> I'm not sure if the pixmap issue can be solved for the OOM killer. It's
> an X design issue which is fixed with Wayland. So it's probably better
> to ignore it for this discussion.
>
> Also, I really think the issue with DRM buffers being shared between
> processes isn't significant for the OOM killer compared to DRM buffers
> only used in the same process that allocates them. So I suggest focusing
> on the latter.

Agreed.  The 95% case is non-shared buffers, so just don't account for
the shared ones and we'll have a solution good enough that we probably
never need to handle the shared case.  On the DRM side, removing buffers
from the accounting once they get shared would be easy.


^ permalink raw reply	[flat|nested] 170+ messages in thread
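
Dropping a buffer from the accounting the moment it becomes shared, as
suggested above, might look like the fragment below on the driver side
(called from the dma-buf export or flink paths). foo_bo and its accounted
flag are hypothetical driver-private state, again assuming the
MM_KERNELPAGES counter.

struct foo_bo {
	struct drm_gem_object base;
	bool accounted;		/* still charged to the allocating process? */
};

static void foo_bo_mark_shared(struct foo_bo *bo)
{
	if (!bo->accounted)
		return;

	/* buffer is now shared: stop counting it against this process */
	add_mm_counter(current->mm, MM_KERNELPAGES,
		       -(long)(bo->base.size >> PAGE_SHIFT));
	bo->accounted = false;
}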

* Re: [RFC] Per file OOM badness
  2018-01-18 16:47 ` Andrey Grodzovsky
  (?)
@ 2018-01-22 23:23   ` Andrew Morton
  -1 siblings, 0 replies; 170+ messages in thread
From: Andrew Morton @ 2018-01-22 23:23 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: linux-kernel, linux-mm, dri-devel, amd-gfx, Christian.Koenig

On Thu, 18 Jan 2018 11:47:48 -0500 Andrey Grodzovsky <andrey.grodzovsky@amd.com> wrote:

> Hi, this series is a revised version of an RFC sent by Christian König
> a few years ago. The original RFC can be found at 
> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
> 
> This is the same idea and I've just addressed his concern from the original RFC 
> and switched to a callback into file_ops instead of a new member in struct file.

Should be in address_space_operations, I suspect.  If an application
opens a file twice, we only want to count it once?

But we're putting the cart ahead of the horse here.  Please provide us
with a detailed description of the problem which you are addressing so
that the MM developers can better consider how to address your
requirements.

^ permalink raw reply	[flat|nested] 170+ messages in thread
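
For comparison with patch 1/4, the variant Andrew suggests would hang the
callback off the mapping rather than the file, so that opening the same
file twice is only counted once. A small sketch (hypothetical, not a
posted patch):

/* include/linux/fs.h, struct address_space_operations */
	long (*oom_badness)(struct address_space *mapping);

/* caller side: score via the inode's mapping instead of the struct file */
if (file->f_mapping->a_ops && file->f_mapping->a_ops->oom_badness)
	points += file->f_mapping->a_ops->oom_badness(file->f_mapping);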

* Re: [RFC] Per file OOM badness
@ 2018-01-23  1:59     ` Andrey Grodzovsky
  0 siblings, 0 replies; 170+ messages in thread
From: Andrey Grodzovsky @ 2018-01-23  1:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, dri-devel, amd-gfx, Christian.Koenig


On 01/22/2018 06:23 PM, Andrew Morton wrote:
> On Thu, 18 Jan 2018 11:47:48 -0500 Andrey Grodzovsky <andrey.grodzovsky@amd.com> wrote:
>
>> Hi, this series is a revised version of an RFC sent by Christian König
>> a few years ago. The original RFC can be found at
>> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>>
>> This is the same idea and I've just addressed his concern from the original RFC 
>> and switched to a callback into file_ops instead of a new member in struct file.
> Should be in address_space_operations, I suspect.  If an application
> opens a file twice, we only want to count it once?

Makes sense

>
> But we're putting the cart ahead of the horse here.  Please provide us
> with a detailed description of the problem which you are addressing so
> that the MM developers can better consider how to address your
> requirements.

I will just reiterate the problem statement from the original RFC; I
should have put it in the body of the RFC and not just a link, as already
commented by Michal. Below is the quoted RFC.

Thanks,
Andrey

P.S. You can also check the follow-up discussion after this first email.

"

I'm currently working on the issue that when device drivers allocate memory on
behalf of an application the OOM killer usually doesn't know about that unless
the application also gets this memory mapped into its address space.

This is especially annoying for graphics drivers where a lot of the VRAM
usually isn't CPU accessible and so doesn't make sense to map into the
address space of the process using it.

The problem now is that when an application starts to use a lot of VRAM, those
buffer objects sooner or later get swapped out to system memory, but when we
now run into an out of memory situation the OOM killer obviously doesn't know
anything about that memory and so usually kills the wrong process.

"





^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-19 16:54                     ` Christian König
  (?)
@ 2018-01-23 11:39                       ` Michal Hocko
  -1 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-23 11:39 UTC (permalink / raw)
  To: Christian König
  Cc: Eric Anholt, Andrey Grodzovsky, linux-kernel, linux-mm,
	dri-devel, amd-gfx

On Fri 19-01-18 17:54:36, Christian König wrote:
> Am 19.01.2018 um 13:20 schrieb Michal Hocko:
> > On Fri 19-01-18 13:13:51, Michal Hocko wrote:
> > > On Fri 19-01-18 12:37:51, Christian König wrote:
> > > [...]
> > > > The per file descriptor badness is/was just the much easier approach to
> > > > solve the issue, because the drivers already knew which client is currently
> > > > using which buffer objects.
> > > > 
> > > > I of course agree that file descriptors can be shared between processes and
> > > > are by themselves not killable. But at least for our graphics driven use
> > > > case I don't see much of a problem killing all processes when a file
> > > > descriptor is used by more than one at the same time.
> > > Ohh, I absolutely see why you have chosen this way for your particular
> > > usecase. I am just arguing that this would rather be more generic to be
> > > merged. If there is absolutely no other way around we can consider it
> > > but right now I do not see that all other options have been considered
> > > properly. Especially when the fd based approach is basically wrong for
> > > almost anybody else.
> > And more importantly: iterating over _all_ fds, which is what your
> > approach is based on AFAIU, is not acceptable for the OOM path. Even
> > though oom_badness is not a hot path we do not really want it to take a
> > lot of time either. Even the current iteration over all processes is
> > quite time consuming. Now you want to multiply that by the number of
> > opened files, and that might be quite many per process.
> 
> Mhm, crap that is a really good argument.
> 
> How about adding a linked list of callbacks for the OOM killer to check
> for each process?
> 
> This way we can avoid finding the process where we need to account things on
> when memory is allocated and still allow the OOM killer to only check the
> specific callbacks it needs to determine the score of a process?

I might be oversimplifying but there really has to be a boundary when
you have the target user context, no? Then do the accounting when you
get the data to the user.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-23 11:39                       ` Michal Hocko
  0 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-23 11:39 UTC (permalink / raw)
  To: Christian König
  Cc: Eric Anholt, Andrey Grodzovsky, linux-kernel, linux-mm,
	dri-devel, amd-gfx

On Fri 19-01-18 17:54:36, Christian König wrote:
> Am 19.01.2018 um 13:20 schrieb Michal Hocko:
> > On Fri 19-01-18 13:13:51, Michal Hocko wrote:
> > > On Fri 19-01-18 12:37:51, Christian König wrote:
> > > [...]
> > > > The per file descriptor badness is/was just the much easier approach to
> > > > solve the issue, because the drivers already knew which client is currently
> > > > using which buffer objects.
> > > > 
> > > > I of course agree that file descriptors can be shared between processes and
> > > > are by themselves not killable. But at least for our graphics driven use
> > > > case I don't see much of a problem killing all processes when a file
> > > > descriptor is used by more than one at the same time.
> > > Ohh, I absolutely see why you have chosen this way for your particular
> > > usecase. I am just arguing that this would rather be more generic to be
> > > merged. If there is absolutely no other way around we can consider it
> > > but right now I do not see that all other options have been considered
> > > properly. Especially when the fd based approach is basically wrong for
> > > almost anybody else.
> > And more importantly. Iterating over _all_ fd which is what is your
> > approach is based on AFAIU is not acceptable for the OOM path. Even
> > though oom_badness is not a hot path we do not really want it to take a
> > lot of time either. Even the current iteration over all processes is
> > quite time consuming. Now you want to add the number of opened files and
> > that might be quite many per process.
> 
> Mhm, crap that is a really good argument.
> 
> How about adding a linked list of callbacks to check for the OOM killer to
> check for each process?
> 
> This way we can avoid finding the process where we need to account things on
> when memory is allocated and still allow the OOM killer to only check the
> specific callbacks it needs to determine the score of a process?

I might be oversimplifying but there really have to be a boundary when
you have the target user context, no? Then do the accounting when you
get data to the user.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-18 17:00   ` Michal Hocko
  (?)
@ 2018-01-23 15:27     ` Roman Gushchin
  -1 siblings, 0 replies; 170+ messages in thread
From: Roman Gushchin @ 2018-01-23 15:27 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Andrey Grodzovsky, linux-kernel, linux-mm, dri-devel, amd-gfx,
	Christian.Koenig

On Thu, Jan 18, 2018 at 06:00:06PM +0100, Michal Hocko wrote:
> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
> > Hi, this series is a revised version of an RFC sent by Christian König
> > a few years ago. The original RFC can be found at 
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.freedesktop.org_archives_dri-2Ddevel_2015-2DSeptember_089778.html&d=DwIDAw&c=5VD0RTtNlTh3ycd41b3MUw&r=jJYgtDM7QT-W-Fz_d29HYQ&m=R-JIQjy8rqmH5qD581_VYL0Q7cpWSITKOnBCE-3LI8U&s=QZGqKpKuJ2BtioFGSy8_721owcWJ0J6c6d4jywOwN4w&
> Here is the origin cover letter text
> : I'm currently working on the issue that when device drivers allocate memory on
> : behalf of an application the OOM killer usually doesn't knew about that unless
> : the application also get this memory mapped into their address space.
> : 
> : This is especially annoying for graphics drivers where a lot of the VRAM
> : usually isn't CPU accessible and so doesn't make sense to map into the
> : address space of the process using it.
> : 
> : The problem now is that when an application starts to use a lot of VRAM those
> : buffers objects sooner or later get swapped out to system memory, but when we
> : now run into an out of memory situation the OOM killer obviously doesn't knew
> : anything about that memory and so usually kills the wrong process.
> : 
> : The following set of patches tries to address this problem by introducing a per
> : file OOM badness score, which device drivers can use to give the OOM killer a
> : hint how many resources are bound to a file descriptor so that it can make
> : better decisions which process to kill.
> : 
> : So question at every one: What do you think about this approach?
> : 
> : My biggest concern right now is the patches are messing with a core kernel
> : structure (adding a field to struct file). Any better idea? I'm considering
> : to put a callback into file_ops instead.

Hello!

I wonder if groupoom (aka cgroup-aware OOM killer) can work for you?
We do have kmem accounting on the memory cgroup level, and the cgroup-aware
OOM selection logic takes cgroup's kmem size into account. So, you don't
need to introduce another accounting mechanism for OOM.

You can find the current implementation in the mm tree.

Thanks!

Roman

^ permalink raw reply	[flat|nested] 170+ messages in thread
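
For reference, the driver side of the per file badness idea quoted above
would boil down to something like the sketch below, assuming the
file_operations hook proposed by the series. The example_gpu_* names and
the bytes_allocated counter are hypothetical and only illustrate the shape
of such a callback.

/* Report how much system memory the driver holds for this open file,
 * in pages, so the OOM killer can add it to the owning process' score. */
static long example_gpu_oom_file_badness(struct file *filp)
{
	struct example_gpu_fpriv *fpriv = filp->private_data;

	return atomic64_read(&fpriv->bytes_allocated) >> PAGE_SHIFT;
}

static const struct file_operations example_gpu_fops = {
	.owner		  = THIS_MODULE,
	/* ... the usual open/release/unlocked_ioctl/mmap entries ... */
	.oom_file_badness = example_gpu_oom_file_badness,
};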

* Re: [RFC] Per file OOM badness
  2018-01-23 15:27     ` Roman Gushchin
  (?)
@ 2018-01-23 15:36       ` Michal Hocko
  -1 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-23 15:36 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Andrey Grodzovsky, linux-kernel, linux-mm, dri-devel, amd-gfx,
	Christian.Koenig

On Tue 23-01-18 15:27:00, Roman Gushchin wrote:
> On Thu, Jan 18, 2018 at 06:00:06PM +0100, Michal Hocko wrote:
> > On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
> > > Hi, this series is a revised version of an RFC sent by Christian König
> > > a few years ago. The original RFC can be found at 
> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.freedesktop.org_archives_dri-2Ddevel_2015-2DSeptember_089778.html&d=DwIDAw&c=5VD0RTtNlTh3ycd41b3MUw&r=jJYgtDM7QT-W-Fz_d29HYQ&m=R-JIQjy8rqmH5qD581_VYL0Q7cpWSITKOnBCE-3LI8U&s=QZGqKpKuJ2BtioFGSy8_721owcWJ0J6c6d4jywOwN4w&
> > Here is the origin cover letter text
> > : I'm currently working on the issue that when device drivers allocate memory on
> > : behalf of an application the OOM killer usually doesn't knew about that unless
> > : the application also get this memory mapped into their address space.
> > : 
> > : This is especially annoying for graphics drivers where a lot of the VRAM
> > : usually isn't CPU accessible and so doesn't make sense to map into the
> > : address space of the process using it.
> > : 
> > : The problem now is that when an application starts to use a lot of VRAM those
> > : buffers objects sooner or later get swapped out to system memory, but when we
> > : now run into an out of memory situation the OOM killer obviously doesn't knew
> > : anything about that memory and so usually kills the wrong process.
> > : 
> > : The following set of patches tries to address this problem by introducing a per
> > : file OOM badness score, which device drivers can use to give the OOM killer a
> > : hint how many resources are bound to a file descriptor so that it can make
> > : better decisions which process to kill.
> > : 
> > : So question at every one: What do you think about this approach?
> > : 
> > : My biggest concern right now is the patches are messing with a core kernel
> > : structure (adding a field to struct file). Any better idea? I'm considering
> > : to put a callback into file_ops instead.
> 
> Hello!
> 
> I wonder if groupoom (aka cgroup-aware OOM killer) can work for you?

I do not think so. The problem is that the allocating context is not
identical with the end consumer.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-23 15:36       ` Michal Hocko
  (?)
@ 2018-01-23 16:39         ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-23 16:39 UTC (permalink / raw)
  To: Michal Hocko, Roman Gushchin
  Cc: Andrey Grodzovsky, linux-kernel, amd-gfx, linux-mm, dri-devel,
	Christian.Koenig

On 2018-01-23 04:36 PM, Michal Hocko wrote:
> On Tue 23-01-18 15:27:00, Roman Gushchin wrote:
>> On Thu, Jan 18, 2018 at 06:00:06PM +0100, Michal Hocko wrote:
>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>> Hi, this series is a revised version of an RFC sent by Christian König
>>>> a few years ago. The original RFC can be found at 
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.freedesktop.org_archives_dri-2Ddevel_2015-2DSeptember_089778.html&d=DwIDAw&c=5VD0RTtNlTh3ycd41b3MUw&r=jJYgtDM7QT-W-Fz_d29HYQ&m=R-JIQjy8rqmH5qD581_VYL0Q7cpWSITKOnBCE-3LI8U&s=QZGqKpKuJ2BtioFGSy8_721owcWJ0J6c6d4jywOwN4w&
>>> Here is the origin cover letter text
>>> : I'm currently working on the issue that when device drivers allocate memory on
>>> : behalf of an application the OOM killer usually doesn't knew about that unless
>>> : the application also get this memory mapped into their address space.
>>> : 
>>> : This is especially annoying for graphics drivers where a lot of the VRAM
>>> : usually isn't CPU accessible and so doesn't make sense to map into the
>>> : address space of the process using it.
>>> : 
>>> : The problem now is that when an application starts to use a lot of VRAM those
>>> : buffers objects sooner or later get swapped out to system memory, but when we
>>> : now run into an out of memory situation the OOM killer obviously doesn't knew
>>> : anything about that memory and so usually kills the wrong process.
>>> : 
>>> : The following set of patches tries to address this problem by introducing a per
>>> : file OOM badness score, which device drivers can use to give the OOM killer a
>>> : hint how many resources are bound to a file descriptor so that it can make
>>> : better decisions which process to kill.
>>> : 
>>> : So question at every one: What do you think about this approach?
>>> : 
>>> : My biggest concern right now is the patches are messing with a core kernel
>>> : structure (adding a field to struct file). Any better idea? I'm considering
>>> : to put a callback into file_ops instead.
>>
>> Hello!
>>
>> I wonder if groupoom (aka cgroup-aware OOM killer) can work for you?
> 
> I do not think so. The problem is that the allocating context is not
> identical with the end consumer.

That's actually not really true. Even in cases where a BO is shared with
a different process, it is still used at least occasionally in the
process which allocated it as well. Otherwise there would be no point in
sharing it between processes.


There should be no problem if the memory of a shared BO is accounted for
in each process sharing it. It might be nice to scale each process'
"debt" by 1 / (number of processes sharing it) if possible, but in the
worst case accounting it fully in each process should be fine.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread
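
To make the accounting options from Michel's last paragraph concrete: a
300 page BO shared by three processes would add 100 pages to each process'
score when scaled, or 300 pages to each in the simpler worst-case variant.
A hypothetical helper, not taken from any posted patch:

static unsigned long shared_bo_charge(unsigned long bo_pages,
				      unsigned int nr_sharers,
				      bool scale_by_sharers)
{
	if (scale_by_sharers && nr_sharers > 1)
		return bo_pages / nr_sharers;	/* split the "debt" between sharers */

	return bo_pages;			/* worst case: charge each sharer in full */
}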

* Re: [RFC] Per file OOM badness
  2018-01-23 16:39         ` Michel Dänzer
  (?)
@ 2018-01-24  9:28           ` Michal Hocko
  -1 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-24  9:28 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Roman Gushchin, Andrey Grodzovsky, linux-kernel, amd-gfx,
	linux-mm, dri-devel, Christian.Koenig

On Tue 23-01-18 17:39:19, Michel Dänzer wrote:
> On 2018-01-23 04:36 PM, Michal Hocko wrote:
> > On Tue 23-01-18 15:27:00, Roman Gushchin wrote:
> >> On Thu, Jan 18, 2018 at 06:00:06PM +0100, Michal Hocko wrote:
> >>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
> >>>> Hi, this series is a revised version of an RFC sent by Christian König
> >>>> a few years ago. The original RFC can be found at 
> >>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.freedesktop.org_archives_dri-2Ddevel_2015-2DSeptember_089778.html&d=DwIDAw&c=5VD0RTtNlTh3ycd41b3MUw&r=jJYgtDM7QT-W-Fz_d29HYQ&m=R-JIQjy8rqmH5qD581_VYL0Q7cpWSITKOnBCE-3LI8U&s=QZGqKpKuJ2BtioFGSy8_721owcWJ0J6c6d4jywOwN4w&
> >>> Here is the origin cover letter text
> >>> : I'm currently working on the issue that when device drivers allocate memory on
> >>> : behalf of an application the OOM killer usually doesn't knew about that unless
> >>> : the application also get this memory mapped into their address space.
> >>> : 
> >>> : This is especially annoying for graphics drivers where a lot of the VRAM
> >>> : usually isn't CPU accessible and so doesn't make sense to map into the
> >>> : address space of the process using it.
> >>> : 
> >>> : The problem now is that when an application starts to use a lot of VRAM those
> >>> : buffers objects sooner or later get swapped out to system memory, but when we
> >>> : now run into an out of memory situation the OOM killer obviously doesn't knew
> >>> : anything about that memory and so usually kills the wrong process.
> >>> : 
> >>> : The following set of patches tries to address this problem by introducing a per
> >>> : file OOM badness score, which device drivers can use to give the OOM killer a
> >>> : hint how many resources are bound to a file descriptor so that it can make
> >>> : better decisions which process to kill.
> >>> : 
> >>> : So question at every one: What do you think about this approach?
> >>> : 
> >>> : My biggest concern right now is the patches are messing with a core kernel
> >>> : structure (adding a field to struct file). Any better idea? I'm considering
> >>> : to put a callback into file_ops instead.
> >>
> >> Hello!
> >>
> >> I wonder if groupoom (aka cgroup-aware OOM killer) can work for you?
> > 
> > I do not think so. The problem is that the allocating context is not
> > identical with the end consumer.
> 
> That's actually not really true. Even in cases where a BO is shared with
> a different process, it is still used at least occasionally in the
> process which allocated it as well. Otherwise there would be no point in
> sharing it between processes.

OK, but somebody has to be made responsible. Otherwise you are just
killing a process which doesn't really release any memory.
 
> There should be no problem if the memory of a shared BO is accounted for
> in each process sharing it. It might be nice to scale each process'
> "debt" by 1 / (number of processes sharing it) if possible, but in the
> worst case accounting it fully in each process should be fine.

So how exactly does it help to kill one of those processes then? The memory
stays pinned behind, or do I still misunderstand?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-24  9:28           ` Michal Hocko
  (?)
@ 2018-01-24 10:27             ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-24 10:27 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

On 2018-01-24 10:28 AM, Michal Hocko wrote:
> On Tue 23-01-18 17:39:19, Michel Dänzer wrote:
>> On 2018-01-23 04:36 PM, Michal Hocko wrote:
>>> On Tue 23-01-18 15:27:00, Roman Gushchin wrote:
>>>> On Thu, Jan 18, 2018 at 06:00:06PM +0100, Michal Hocko wrote:
>>>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>>>> Hi, this series is a revised version of an RFC sent by Christian König
>>>>>> a few years ago. The original RFC can be found at 
>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.freedesktop.org_archives_dri-2Ddevel_2015-2DSeptember_089778.html&d=DwIDAw&c=5VD0RTtNlTh3ycd41b3MUw&r=jJYgtDM7QT-W-Fz_d29HYQ&m=R-JIQjy8rqmH5qD581_VYL0Q7cpWSITKOnBCE-3LI8U&s=QZGqKpKuJ2BtioFGSy8_721owcWJ0J6c6d4jywOwN4w&
>>>>> Here is the origin cover letter text
>>>>> : I'm currently working on the issue that when device drivers allocate memory on
>>>>> : behalf of an application the OOM killer usually doesn't knew about that unless
>>>>> : the application also get this memory mapped into their address space.
>>>>> : 
>>>>> : This is especially annoying for graphics drivers where a lot of the VRAM
>>>>> : usually isn't CPU accessible and so doesn't make sense to map into the
>>>>> : address space of the process using it.
>>>>> : 
>>>>> : The problem now is that when an application starts to use a lot of VRAM those
>>>>> : buffers objects sooner or later get swapped out to system memory, but when we
>>>>> : now run into an out of memory situation the OOM killer obviously doesn't knew
>>>>> : anything about that memory and so usually kills the wrong process.
>>>>> : 
>>>>> : The following set of patches tries to address this problem by introducing a per
>>>>> : file OOM badness score, which device drivers can use to give the OOM killer a
>>>>> : hint how many resources are bound to a file descriptor so that it can make
>>>>> : better decisions which process to kill.
>>>>> : 
>>>>> : So question at every one: What do you think about this approach?
>>>>> : 
>>>>> : My biggest concern right now is the patches are messing with a core kernel
>>>>> : structure (adding a field to struct file). Any better idea? I'm considering
>>>>> : to put a callback into file_ops instead.
>>>>
>>>> Hello!
>>>>
>>>> I wonder if groupoom (aka cgroup-aware OOM killer) can work for you?
>>>
>>> I do not think so. The problem is that the allocating context is not
>>> identical with the end consumer.
>>
>> That's actually not really true. Even in cases where a BO is shared with
>> a different process, it is still used at least occasionally in the
>> process which allocated it as well. Otherwise there would be no point in
>> sharing it between processes.
> 
> OK, but somebody has to be made responsible. Otherwise you are just
> killing a process which doesn't really release any memory.
>  
>> There should be no problem if the memory of a shared BO is accounted for
>> in each process sharing it. It might be nice to scale each process'
>> "debt" by 1 / (number of processes sharing it) if possible, but in the
>> worst case accounting it fully in each process should be fine.
> 
> So how exactly then helps to kill one of those processes? The memory
> stays pinned behind or do I still misunderstand?

Fundamentally, the memory is only released once all references to the
BOs are dropped. That's true no matter how the memory is accounted for
between the processes referencing the BO.


In practice, this should be fine:

1. The amount of memory used for shared BOs is normally small compared
to the amount of memory used for non-shared BOs (and other things). So
regardless of how shared BOs are accounted for, the OOM killer should
first target the process which is responsible for the most memory overall.

2. If the OOM killer kills a process which is sharing BOs with another
process, this should result in the other process dropping its references
to the BOs as well, at which point the memory is released.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread
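
Point 2 above relies on ordinary reference counting of the buffer objects:
when a process is killed, its file is released, it drops its references,
and the backing pages are only freed once the last reference is gone.
Roughly, with purely hypothetical type and function names:

struct example_bo {
	struct kref refcount;
	/* backing pages, placement, etc. */
};

struct example_bo_ref {			/* one per (file, BO) pair */
	struct list_head node;
	struct example_bo *bo;
};

static void example_bo_free(struct kref *kref)
{
	struct example_bo *bo = container_of(kref, struct example_bo, refcount);

	example_free_backing_pages(bo);	/* memory actually returned here */
	kfree(bo);
}

static int example_gpu_release(struct inode *inode, struct file *filp)
{
	struct example_gpu_fpriv *fpriv = filp->private_data;
	struct example_bo_ref *ref, *tmp;

	/* drop every reference this client still holds */
	list_for_each_entry_safe(ref, tmp, &fpriv->bo_list, node) {
		list_del(&ref->node);
		kref_put(&ref->bo->refcount, example_bo_free);
		kfree(ref);
	}

	kfree(fpriv);
	return 0;
}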

* Re: [RFC] Per file OOM badness
  2018-01-24 10:27             ` Michel Dänzer
  (?)
@ 2018-01-24 11:01               ` Michal Hocko
  -1 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-24 11:01 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> On 2018-01-24 10:28 AM, Michal Hocko wrote:
[...]
> > So how exactly then helps to kill one of those processes? The memory
> > stays pinned behind or do I still misunderstand?
> 
> Fundamentally, the memory is only released once all references to the
> BOs are dropped. That's true no matter how the memory is accounted for
> between the processes referencing the BO.
> 
> 
> In practice, this should be fine:
> 
> 1. The amount of memory used for shared BOs is normally small compared
> to the amount of memory used for non-shared BOs (and other things). So
> regardless of how shared BOs are accounted for, the OOM killer should
> first target the process which is responsible for more memory overall.

OK. So this is essentially the same as with the normal shared memory
which is a part of the RSS in general.

> 2. If the OOM killer kills a process which is sharing BOs with another
> process, this should result in the other process dropping its references
> to the BOs as well, at which point the memory is released.

OK. How exactly are those BOs mapped to the userspace?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-24 11:01               ` Michal Hocko
  0 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-24 11:01 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> On 2018-01-24 10:28 AM, Michal Hocko wrote:
[...]
> > So how exactly then helps to kill one of those processes? The memory
> > stays pinned behind or do I still misunderstand?
> 
> Fundamentally, the memory is only released once all references to the
> BOs are dropped. That's true no matter how the memory is accounted for
> between the processes referencing the BO.
> 
> 
> In practice, this should be fine:
> 
> 1. The amount of memory used for shared BOs is normally small compared
> to the amount of memory used for non-shared BOs (and other things). So
> regardless of how shared BOs are accounted for, the OOM killer should
> first target the process which is responsible for more memory overall.

OK. So this is essentially the same as with the normal shared memory
which is a part of the RSS in general.

> 2. If the OOM killer kills a process which is sharing BOs with another
> process, this should result in the other process dropping its references
> to the BOs as well, at which point the memory is released.

OK. How exactly are those BOs mapped to the userspace?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-24 11:01               ` Michal Hocko
  0 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-24 11:01 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> On 2018-01-24 10:28 AM, Michal Hocko wrote:
[...]
> > So how exactly then helps to kill one of those processes? The memory
> > stays pinned behind or do I still misunderstand?
> 
> Fundamentally, the memory is only released once all references to the
> BOs are dropped. That's true no matter how the memory is accounted for
> between the processes referencing the BO.
> 
> 
> In practice, this should be fine:
> 
> 1. The amount of memory used for shared BOs is normally small compared
> to the amount of memory used for non-shared BOs (and other things). So
> regardless of how shared BOs are accounted for, the OOM killer should
> first target the process which is responsible for more memory overall.

OK. So this is essentially the same as with the normal shared memory
which is a part of the RSS in general.

> 2. If the OOM killer kills a process which is sharing BOs with another
> process, this should result in the other process dropping its references
> to the BOs as well, at which point the memory is released.

OK. How exactly are those BOs mapped to the userspace?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-24 11:01               ` Michal Hocko
  (?)
@ 2018-01-24 11:23                 ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-24 11:23 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, amd-gfx, Christian.Koenig, linux-mm, dri-devel,
	Roman Gushchin

On 2018-01-24 12:01 PM, Michal Hocko wrote:
> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
>> On 2018-01-24 10:28 AM, Michal Hocko wrote:
> [...]
>>> So how exactly then helps to kill one of those processes? The memory
>>> stays pinned behind or do I still misunderstand?
>>
>> Fundamentally, the memory is only released once all references to the
>> BOs are dropped. That's true no matter how the memory is accounted for
>> between the processes referencing the BO.
>>
>>
>> In practice, this should be fine:
>>
>> 1. The amount of memory used for shared BOs is normally small compared
>> to the amount of memory used for non-shared BOs (and other things). So
>> regardless of how shared BOs are accounted for, the OOM killer should
>> first target the process which is responsible for more memory overall.
> 
> OK. So this is essentially the same as with the normal shared memory
> which is a part of the RSS in general.

Right.


>> 2. If the OOM killer kills a process which is sharing BOs with another
>> process, this should result in the other process dropping its references
>> to the BOs as well, at which point the memory is released.
> 
> OK. How exactly are those BOs mapped to the userspace?

I'm not sure what you're asking. Userspace mostly uses a GEM handle to
refer to a BO. There can also be userspace CPU mappings of the BO's
memory, but userspace doesn't need CPU mappings for all BOs and only
creates them as needed.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-24 11:23                 ` Michel Dänzer
  0 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-24 11:23 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, amd-gfx, Christian.Koenig, linux-mm, dri-devel,
	Roman Gushchin

On 2018-01-24 12:01 PM, Michal Hocko wrote:
On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
>> On 2018-01-24 10:28 AM, Michal Hocko wrote:
> [...]
>>> So how exactly then helps to kill one of those processes? The memory
>>> stays pinned behind or do I still misunderstand?
>>
>> Fundamentally, the memory is only released once all references to the
>> BOs are dropped. That's true no matter how the memory is accounted for
>> between the processes referencing the BO.
>>
>>
>> In practice, this should be fine:
>>
>> 1. The amount of memory used for shared BOs is normally small compared
>> to the amount of memory used for non-shared BOs (and other things). So
>> regardless of how shared BOs are accounted for, the OOM killer should
>> first target the process which is responsible for more memory overall.
> 
> OK. So this is essentially the same as with the normal shared memory
> which is a part of the RSS in general.

Right.


>> 2. If the OOM killer kills a process which is sharing BOs with another
>> process, this should result in the other process dropping its references
>> to the BOs as well, at which point the memory is released.
> 
> OK. How exactly are those BOs mapped to the userspace?

I'm not sure what you're asking. Userspace mostly uses a GEM handle to
refer to a BO. There can also be userspace CPU mappings of the BO's
memory, but userspace doesn't need CPU mappings for all BOs and only
creates them as needed.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-24 11:23                 ` Michel Dänzer
  0 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-24 11:23 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

On 2018-01-24 12:01 PM, Michal Hocko wrote:
> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
>> On 2018-01-24 10:28 AM, Michal Hocko wrote:
> [...]
>>> So how exactly then helps to kill one of those processes? The memory
>>> stays pinned behind or do I still misunderstand?
>>
>> Fundamentally, the memory is only released once all references to the
>> BOs are dropped. That's true no matter how the memory is accounted for
>> between the processes referencing the BO.
>>
>>
>> In practice, this should be fine:
>>
>> 1. The amount of memory used for shared BOs is normally small compared
>> to the amount of memory used for non-shared BOs (and other things). So
>> regardless of how shared BOs are accounted for, the OOM killer should
>> first target the process which is responsible for more memory overall.
> 
> OK. So this is essentially the same as with the normal shared memory
> which is a part of the RSS in general.

Right.


>> 2. If the OOM killer kills a process which is sharing BOs with another
>> process, this should result in the other process dropping its references
>> to the BOs as well, at which point the memory is released.
> 
> OK. How exactly are those BOs mapped to the userspace?

I'm not sure what you're asking. Userspace mostly uses a GEM handle to
refer to a BO. There can also be userspace CPU mappings of the BO's
memory, but userspace doesn't need CPU mappings for all BOs and only
creates them as needed.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-24 11:23                 ` Michel Dänzer
  (?)
@ 2018-01-24 11:50                   ` Michal Hocko
  -1 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-24 11:50 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: linux-kernel, amd-gfx, Christian.Koenig, linux-mm, dri-devel,
	Roman Gushchin

On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
> On 2018-01-24 12:01 PM, Michal Hocko wrote:
> > On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
[...]
> >> 2. If the OOM killer kills a process which is sharing BOs with another
> >> process, this should result in the other process dropping its references
> >> to the BOs as well, at which point the memory is released.
> > 
> > OK. How exactly are those BOs mapped to the userspace?
> 
> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
> refer to a BO. There can also be userspace CPU mappings of the BO's
> memory, but userspace doesn't need CPU mappings for all BOs and only
> creates them as needed.

OK, I guess you have to bear with me some more. This whole stack is a
complete unknown. I am mostly after finding a boundary where you can
charge the allocated memory to the process so that the oom killer can
consider it. Is there anything like that? Except for the proposed file
handle hack?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-24 11:50                   ` Michal Hocko
  0 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-24 11:50 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: linux-kernel, amd-gfx, Christian.Koenig, linux-mm, dri-devel,
	Roman Gushchin

On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
> On 2018-01-24 12:01 PM, Michal Hocko wrote:
> > On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
[...]
> >> 2. If the OOM killer kills a process which is sharing BOs with another
> >> process, this should result in the other process dropping its references
> >> to the BOs as well, at which point the memory is released.
> > 
> > OK. How exactly are those BOs mapped to the userspace?
> 
> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
> refer to a BO. There can also be userspace CPU mappings of the BO's
> memory, but userspace doesn't need CPU mappings for all BOs and only
> creates them as needed.

OK, I guess you have to bear with me some more. This whole stack is a
complete unknown. I am mostly after finding a boundary where you can
charge the allocated memory to the process so that the oom killer can
consider it. Is there anything like that? Except for the proposed file
handle hack?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-24 11:50                   ` Michal Hocko
  0 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-24 11:50 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Christian.Koenig-5C7GfCeVMHo, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Roman Gushchin

On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
> On 2018-01-24 12:01 PM, Michal Hocko wrote:
> > On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
[...]
> >> 2. If the OOM killer kills a process which is sharing BOs with another
> >> process, this should result in the other process dropping its references
> >> to the BOs as well, at which point the memory is released.
> > 
> > OK. How exactly are those BOs mapped to the userspace?
> 
> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
> refer to a BO. There can also be userspace CPU mappings of the BO's
> memory, but userspace doesn't need CPU mappings for all BOs and only
> creates them as needed.

OK, I guess you have to bear with me some more. This whole stack is a
complete unknown. I am mostly after finding a boundary where you can
charge the allocated memory to the process so that the oom killer can
consider it. Is there anything like that? Except for the proposed file
handle hack?
-- 
Michal Hocko
SUSE Labs
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-24 11:50                   ` Michal Hocko
@ 2018-01-24 12:11                     ` Christian König
  -1 siblings, 0 replies; 170+ messages in thread
From: Christian König @ 2018-01-24 12:11 UTC (permalink / raw)
  To: Michal Hocko, Michel Dänzer
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

Am 24.01.2018 um 12:50 schrieb Michal Hocko:
> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> [...]
>>>> 2. If the OOM killer kills a process which is sharing BOs with another
>>>> process, this should result in the other process dropping its references
>>>> to the BOs as well, at which point the memory is released.
>>> OK. How exactly are those BOs mapped to the userspace?
>> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
>> refer to a BO. There can also be userspace CPU mappings of the BO's
>> memory, but userspace doesn't need CPU mappings for all BOs and only
>> creates them as needed.
> OK, I guess you have to bear with me some more. This whole stack is a
> complete uknonwn. I am mostly after finding a boundary where you can
> charge the allocated memory to the process so that the oom killer can
> consider it. Is there anything like that? Except for the proposed file
> handle hack?

Not that I know of.

As I said before, we need some kind of callback for when a process starts
to use a file descriptor, even without anything from that file descriptor
being mapped into its address space.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-24 12:11                     ` Christian König
  0 siblings, 0 replies; 170+ messages in thread
From: Christian König @ 2018-01-24 12:11 UTC (permalink / raw)
  To: Michal Hocko, Michel Dänzer
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

Am 24.01.2018 um 12:50 schrieb Michal Hocko:
> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> [...]
>>>> 2. If the OOM killer kills a process which is sharing BOs with another
>>>> process, this should result in the other process dropping its references
>>>> to the BOs as well, at which point the memory is released.
>>> OK. How exactly are those BOs mapped to the userspace?
>> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
>> refer to a BO. There can also be userspace CPU mappings of the BO's
>> memory, but userspace doesn't need CPU mappings for all BOs and only
>> creates them as needed.
> OK, I guess you have to bear with me some more. This whole stack is a
> complete uknonwn. I am mostly after finding a boundary where you can
> charge the allocated memory to the process so that the oom killer can
> consider it. Is there anything like that? Except for the proposed file
> handle hack?

Not that I know of.

As I said before, we need some kind of callback for when a process starts
to use a file descriptor, even without anything from that file descriptor
being mapped into its address space.

Regards,
Christian.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-24 11:50                   ` Michal Hocko
  (?)
@ 2018-01-24 14:31                     ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-24 14:31 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

On 2018-01-24 12:50 PM, Michal Hocko wrote:
> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> [...]
>>>> 2. If the OOM killer kills a process which is sharing BOs with another
>>>> process, this should result in the other process dropping its references
>>>> to the BOs as well, at which point the memory is released.
>>>
>>> OK. How exactly are those BOs mapped to the userspace?
>>
>> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
>> refer to a BO. There can also be userspace CPU mappings of the BO's
>> memory, but userspace doesn't need CPU mappings for all BOs and only
>> creates them as needed.
> 
> OK, I guess you have to bear with me some more. This whole stack is a
> complete uknonwn. I am mostly after finding a boundary where you can
> charge the allocated memory to the process so that the oom killer can
> consider it. Is there anything like that?

I think something like charging the memory of a BO to the process when a
userspace handle is created for it, and "uncharging" when a handle is
destroyed, could be a good start.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread
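
A minimal sketch of that starting point (not part of the posted series): hook the
charge into the GEM handle lifetime, since handle creation and deletion run in the
context of the process that gains or loses access to the BO. The helpers
oom_charge_pages()/oom_uncharge_pages() are hypothetical placeholders for whatever
accounting API the core would provide.

#include <linux/kernel.h>
#include <linux/sched.h>
#include <drm/drm_gem.h>

/* called from drm_gem_handle_create(), in the context of the process
 * that just gained a handle to the BO */
static void drm_gem_charge_handle(struct drm_gem_object *obj)
{
	oom_charge_pages(current->mm, DIV_ROUND_UP(obj->size, PAGE_SIZE));
}

/* called from drm_gem_handle_delete() and drm_gem_release(), when the
 * handle (or the whole fd) goes away */
static void drm_gem_uncharge_handle(struct drm_gem_object *obj)
{
	oom_uncharge_pages(current->mm, DIV_ROUND_UP(obj->size, PAGE_SIZE));
}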

* Re: [RFC] Per file OOM badness
@ 2018-01-24 14:31                     ` Michel Dänzer
  0 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-24 14:31 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

On 2018-01-24 12:50 PM, Michal Hocko wrote:
> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> [...]
>>>> 2. If the OOM killer kills a process which is sharing BOs with another
>>>> process, this should result in the other process dropping its references
>>>> to the BOs as well, at which point the memory is released.
>>>
>>> OK. How exactly are those BOs mapped to the userspace?
>>
>> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
>> refer to a BO. There can also be userspace CPU mappings of the BO's
>> memory, but userspace doesn't need CPU mappings for all BOs and only
>> creates them as needed.
> 
> OK, I guess you have to bear with me some more. This whole stack is a
> complete uknonwn. I am mostly after finding a boundary where you can
> charge the allocated memory to the process so that the oom killer can
> consider it. Is there anything like that?

I think something like charging the memory of a BO to the process when a
userspace handle is created for it, and "uncharging" when a handle is
destroyed, could be a good start.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-24 14:31                     ` Michel Dänzer
  0 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-24 14:31 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Christian.Koenig-5C7GfCeVMHo, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Roman Gushchin

On 2018-01-24 12:50 PM, Michal Hocko wrote:
> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> [...]
>>>> 2. If the OOM killer kills a process which is sharing BOs with another
>>>> process, this should result in the other process dropping its references
>>>> to the BOs as well, at which point the memory is released.
>>>
>>> OK. How exactly are those BOs mapped to the userspace?
>>
>> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
>> refer to a BO. There can also be userspace CPU mappings of the BO's
>> memory, but userspace doesn't need CPU mappings for all BOs and only
>> creates them as needed.
> 
> OK, I guess you have to bear with me some more. This whole stack is a
> complete uknonwn. I am mostly after finding a boundary where you can
> charge the allocated memory to the process so that the oom killer can
> consider it. Is there anything like that?

I think something like charging the memory of a BO to the process when a
userspace handle is created for it, and "uncharging" when a handle is
destroyed, could be a good start.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH 4/4] drm/amdgpu: Use drm_oom_badness for amdgpu.
  2018-01-18 16:47   ` Andrey Grodzovsky
@ 2018-01-30  9:24     ` Daniel Vetter
  -1 siblings, 0 replies; 170+ messages in thread
From: Daniel Vetter @ 2018-01-30  9:24 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: linux-kernel, linux-mm, dri-devel, amd-gfx, Christian.Koenig

On Thu, Jan 18, 2018 at 11:47:52AM -0500, Andrey Grodzovsky wrote:
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 46a0c93..6a733cdc8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -828,6 +828,7 @@ static const struct file_operations amdgpu_driver_kms_fops = {
>  #ifdef CONFIG_COMPAT
>  	.compat_ioctl = amdgpu_kms_compat_ioctl,
>  #endif
> +	.oom_file_badness = drm_oom_badness,

Would be neat if we could roll this out for all gem drivers (once it's no
longer an RFC ofc).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 170+ messages in thread
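
The drm_oom_badness() implementation itself is not quoted in this part of the thread.
As a rough illustration only (this is not the posted patch), such a callback could
walk the DRM file's GEM handle table and report the total BO size in pages:

#include <linux/fs.h>
#include <linux/idr.h>
#include <drm/drm_file.h>
#include <drm/drm_gem.h>

static int drm_oom_count_object(int id, void *ptr, void *data)
{
	struct drm_gem_object *obj = ptr;
	size_t *size = data;

	*size += obj->size;
	return 0;
}

long drm_oom_badness(struct file *f)
{
	struct drm_file *file_priv = f->private_data;
	size_t size = 0;

	spin_lock(&file_priv->table_lock);
	idr_for_each(&file_priv->object_idr, drm_oom_count_object, &size);
	spin_unlock(&file_priv->table_lock);

	/* report in pages, like the other quantities the OOM killer sums up */
	return DIV_ROUND_UP(size, PAGE_SIZE);
}

Note that shared or imported BOs would be counted once per file that holds a handle
to them, which matches the per-process accounting discussed elsewhere in the thread.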

* Re: [PATCH 4/4] drm/amdgpu: Use drm_oom_badness for amdgpu.
@ 2018-01-30  9:24     ` Daniel Vetter
  0 siblings, 0 replies; 170+ messages in thread
From: Daniel Vetter @ 2018-01-30  9:24 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: linux-kernel, linux-mm, dri-devel, amd-gfx, Christian.Koenig

On Thu, Jan 18, 2018 at 11:47:52AM -0500, Andrey Grodzovsky wrote:
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 46a0c93..6a733cdc8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -828,6 +828,7 @@ static const struct file_operations amdgpu_driver_kms_fops = {
>  #ifdef CONFIG_COMPAT
>  	.compat_ioctl = amdgpu_kms_compat_ioctl,
>  #endif
> +	.oom_file_badness = drm_oom_badness,

Would be neat if we could roll this out for all gem drivers (once it's no
longer an RFC ofc).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-24 11:50                   ` Michal Hocko
  (?)
@ 2018-01-30  9:29                     ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-30  9:29 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

On 2018-01-24 12:50 PM, Michal Hocko wrote:
> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> [...]
>>>> 2. If the OOM killer kills a process which is sharing BOs with another
>>>> process, this should result in the other process dropping its references
>>>> to the BOs as well, at which point the memory is released.
>>>
>>> OK. How exactly are those BOs mapped to the userspace?
>>
>> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
>> refer to a BO. There can also be userspace CPU mappings of the BO's
>> memory, but userspace doesn't need CPU mappings for all BOs and only
>> creates them as needed.
> 
> OK, I guess you have to bear with me some more. This whole stack is a
> complete uknonwn. I am mostly after finding a boundary where you can
> charge the allocated memory to the process so that the oom killer can
> consider it. Is there anything like that? Except for the proposed file
> handle hack?

How about the other way around: what APIs can we use to charge /
"uncharge" memory to a process? If we have those, we can experiment with
different places to call them.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-30  9:29                     ` Michel Dänzer
  0 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-30  9:29 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

On 2018-01-24 12:50 PM, Michal Hocko wrote:
> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> [...]
>>>> 2. If the OOM killer kills a process which is sharing BOs with another
>>>> process, this should result in the other process dropping its references
>>>> to the BOs as well, at which point the memory is released.
>>>
>>> OK. How exactly are those BOs mapped to the userspace?
>>
>> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
>> refer to a BO. There can also be userspace CPU mappings of the BO's
>> memory, but userspace doesn't need CPU mappings for all BOs and only
>> creates them as needed.
> 
> OK, I guess you have to bear with me some more. This whole stack is a
> complete uknonwn. I am mostly after finding a boundary where you can
> charge the allocated memory to the process so that the oom killer can
> consider it. Is there anything like that? Except for the proposed file
> handle hack?

How about the other way around: what APIs can we use to charge /
"uncharge" memory to a process? If we have those, we can experiment with
different places to call them.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-30  9:29                     ` Michel Dänzer
  0 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-30  9:29 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, amd-gfx, Christian.Koenig, linux-mm, dri-devel,
	Roman Gushchin

On 2018-01-24 12:50 PM, Michal Hocko wrote:
> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> [...]
>>>> 2. If the OOM killer kills a process which is sharing BOs with another
>>>> process, this should result in the other process dropping its references
>>>> to the BOs as well, at which point the memory is released.
>>>
>>> OK. How exactly are those BOs mapped to the userspace?
>>
>> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
>> refer to a BO. There can also be userspace CPU mappings of the BO's
>> memory, but userspace doesn't need CPU mappings for all BOs and only
>> creates them as needed.
> 
> OK, I guess you have to bear with me some more. This whole stack is a
> complete uknonwn. I am mostly after finding a boundary where you can
> charge the allocated memory to the process so that the oom killer can
> consider it. Is there anything like that? Except for the proposed file
> handle hack?

How about the other way around: what APIs can we use to charge /
"uncharge" memory to a process? If we have those, we can experiment with
different places to call them.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-24 12:11                     ` Christian König
  (?)
@ 2018-01-30  9:31                       ` Daniel Vetter
  -1 siblings, 0 replies; 170+ messages in thread
From: Daniel Vetter @ 2018-01-30  9:31 UTC (permalink / raw)
  To: christian.koenig
  Cc: Michal Hocko, Michel Dänzer, linux-kernel, dri-devel,
	linux-mm, amd-gfx, Roman Gushchin

On Wed, Jan 24, 2018 at 01:11:09PM +0100, Christian König wrote:
> Am 24.01.2018 um 12:50 schrieb Michal Hocko:
> > On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
> > > On 2018-01-24 12:01 PM, Michal Hocko wrote:
> > > > On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> > [...]
> > > > > 2. If the OOM killer kills a process which is sharing BOs with another
> > > > > process, this should result in the other process dropping its references
> > > > > to the BOs as well, at which point the memory is released.
> > > > OK. How exactly are those BOs mapped to the userspace?
> > > I'm not sure what you're asking. Userspace mostly uses a GEM handle to
> > > refer to a BO. There can also be userspace CPU mappings of the BO's
> > > memory, but userspace doesn't need CPU mappings for all BOs and only
> > > creates them as needed.
> > OK, I guess you have to bear with me some more. This whole stack is a
> > complete uknonwn. I am mostly after finding a boundary where you can
> > charge the allocated memory to the process so that the oom killer can
> > consider it. Is there anything like that? Except for the proposed file
> > handle hack?
> 
> Not that I knew of.
> 
> As I said before we need some kind of callback that a process now starts to
> use a file descriptor, but without anything from that file descriptor mapped
> into the address space.

For more context: With DRI3 and wayland the compositor opens the DRM fd
and then passes it to the client, which then starts allocating stuff. That
makes book-keeping rather annoying.

I guess a good first order approximation would be if we simply charge any
newly allocated buffers to the process that created them, but that means
hanging onto lots of mm_struct pointers since we want to make sure we then
release those pages to the right mm again (since the process that drops
the last ref might be a totally different one, depending upon how the
buffers or DRM fd have been shared).

Would it be ok to hang onto potentially arbitrary mmget references
essentially forever? If that's ok I think we can do your process based
account (minus a few minor inaccuracies for shared stuff perhaps, but no
one cares about that).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 170+ messages in thread
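
One way the bookkeeping described above could look, assuming it is acceptable to pin
the mm_struct itself for the lifetime of the buffer (mmgrab()/mmdrop(), which only
pin the structure, rather than a full mmget()), and assuming an additional per-mm
counter such as the MM_KERNEL_PAGES idea raised later in the thread (not in mainline):

#include <linux/mm.h>
#include <linux/sched/mm.h>

struct bo_charge {
	struct mm_struct *mm;	/* mm that was charged when the BO was created */
	long pages;
};

static void bo_charge(struct bo_charge *c, long pages)
{
	c->mm = current->mm;
	c->pages = pages;
	mmgrab(c->mm);		/* keep the mm_struct alive until we uncharge */
	add_mm_counter(c->mm, MM_KERNEL_PAGES, pages);	/* hypothetical counter */
}

static void bo_uncharge(struct bo_charge *c)
{
	/* may run from a different process than the one that was charged */
	add_mm_counter(c->mm, MM_KERNEL_PAGES, -c->pages);
	mmdrop(c->mm);
	c->mm = NULL;
}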

* Re: [RFC] Per file OOM badness
@ 2018-01-30  9:31                       ` Daniel Vetter
  0 siblings, 0 replies; 170+ messages in thread
From: Daniel Vetter @ 2018-01-30  9:31 UTC (permalink / raw)
  To: christian.koenig
  Cc: Michal Hocko, Michel Dänzer, linux-kernel, dri-devel,
	linux-mm, amd-gfx, Roman Gushchin

On Wed, Jan 24, 2018 at 01:11:09PM +0100, Christian König wrote:
> Am 24.01.2018 um 12:50 schrieb Michal Hocko:
> > On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
> > > On 2018-01-24 12:01 PM, Michal Hocko wrote:
> > > > On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> > [...]
> > > > > 2. If the OOM killer kills a process which is sharing BOs with another
> > > > > process, this should result in the other process dropping its references
> > > > > to the BOs as well, at which point the memory is released.
> > > > OK. How exactly are those BOs mapped to the userspace?
> > > I'm not sure what you're asking. Userspace mostly uses a GEM handle to
> > > refer to a BO. There can also be userspace CPU mappings of the BO's
> > > memory, but userspace doesn't need CPU mappings for all BOs and only
> > > creates them as needed.
> > OK, I guess you have to bear with me some more. This whole stack is a
> > complete uknonwn. I am mostly after finding a boundary where you can
> > charge the allocated memory to the process so that the oom killer can
> > consider it. Is there anything like that? Except for the proposed file
> > handle hack?
> 
> Not that I knew of.
> 
> As I said before we need some kind of callback that a process now starts to
> use a file descriptor, but without anything from that file descriptor mapped
> into the address space.

For more context: With DRI3 and wayland the compositor opens the DRM fd
and then passes it to the client, which then starts allocating stuff. That
makes book-keeping rather annoying.

I guess a good first order approximation would be if we simply charge any
newly allocated buffers to the process that created them, but that means
hanging onto lots of mm_struct pointers since we want to make sure we then
release those pages to the right mm again (since the process that drops
the last ref might be a totally different one, depending upon how the
buffers or DRM fd have been shared).

Would it be ok to hang onto potentially arbitrary mmget references
essentially forever? If that's ok I think we can do your process based
account (minus a few minor inaccuracies for shared stuff perhaps, but no
one cares about that).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-30  9:31                       ` Daniel Vetter
  0 siblings, 0 replies; 170+ messages in thread
From: Daniel Vetter @ 2018-01-30  9:31 UTC (permalink / raw)
  To: christian.koenig
  Cc: Michal Hocko, Michel Dänzer, linux-kernel, dri-devel,
	linux-mm, amd-gfx, Roman Gushchin

On Wed, Jan 24, 2018 at 01:11:09PM +0100, Christian König wrote:
> Am 24.01.2018 um 12:50 schrieb Michal Hocko:
> > On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
> > > On 2018-01-24 12:01 PM, Michal Hocko wrote:
> > > > On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> > [...]
> > > > > 2. If the OOM killer kills a process which is sharing BOs with another
> > > > > process, this should result in the other process dropping its references
> > > > > to the BOs as well, at which point the memory is released.
> > > > OK. How exactly are those BOs mapped to the userspace?
> > > I'm not sure what you're asking. Userspace mostly uses a GEM handle to
> > > refer to a BO. There can also be userspace CPU mappings of the BO's
> > > memory, but userspace doesn't need CPU mappings for all BOs and only
> > > creates them as needed.
> > OK, I guess you have to bear with me some more. This whole stack is a
> > complete uknonwn. I am mostly after finding a boundary where you can
> > charge the allocated memory to the process so that the oom killer can
> > consider it. Is there anything like that? Except for the proposed file
> > handle hack?
> 
> Not that I knew of.
> 
> As I said before we need some kind of callback that a process now starts to
> use a file descriptor, but without anything from that file descriptor mapped
> into the address space.

For more context: With DRI3 and wayland the compositor opens the DRM fd
and then passes it to the client, which then starts allocating stuff. That
makes book-keeping rather annoying.

I guess a good first order approximation would be if we simply charge any
newly allocated buffers to the process that created them, but that means
hanging onto lots of mm_struct pointers since we want to make sure we then
release those pages to the right mm again (since the process that drops
the last ref might be a totally different one, depending upon how the
buffers or DRM fd have been shared).

Would it be ok to hang onto potentially arbitrary mmget references
essentially forever? If that's ok I think we can do your process based
account (minus a few minor inaccuracies for shared stuff perhaps, but no
one cares about that).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-30  9:31                       ` Daniel Vetter
  (?)
@ 2018-01-30  9:43                         ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-30  9:43 UTC (permalink / raw)
  To: christian.koenig, Michal Hocko, dri-devel, Roman Gushchin
  Cc: linux-kernel, linux-mm, amd-gfx

On 2018-01-30 10:31 AM, Daniel Vetter wrote:
> On Wed, Jan 24, 2018 at 01:11:09PM +0100, Christian König wrote:
>> Am 24.01.2018 um 12:50 schrieb Michal Hocko:
>>> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
>>>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
>>>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
>>> [...]
>>>>>> 2. If the OOM killer kills a process which is sharing BOs with another
>>>>>> process, this should result in the other process dropping its references
>>>>>> to the BOs as well, at which point the memory is released.
>>>>> OK. How exactly are those BOs mapped to the userspace?
>>>> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
>>>> refer to a BO. There can also be userspace CPU mappings of the BO's
>>>> memory, but userspace doesn't need CPU mappings for all BOs and only
>>>> creates them as needed.
>>> OK, I guess you have to bear with me some more. This whole stack is a
>>> complete uknonwn. I am mostly after finding a boundary where you can
>>> charge the allocated memory to the process so that the oom killer can
>>> consider it. Is there anything like that? Except for the proposed file
>>> handle hack?
>>
>> Not that I knew of.
>>
>> As I said before we need some kind of callback that a process now starts to
>> use a file descriptor, but without anything from that file descriptor mapped
>> into the address space.
> 
> For more context: With DRI3 and wayland the compositor opens the DRM fd
> and then passes it to the client, which then starts allocating stuff. That
> makes book-keeping rather annoying.

Actually, what you're describing is only true for the buffers shared by
an X server with an X11 compositor. For the actual applications, the
buffers are created on the client side and then shared with the X server
/ Wayland compositor.

Anyway, it doesn't really matter. In all cases, the buffers are actually
used by all parties that are sharing them, so charging the memory to all
of them is perfectly appropriate.


> I guess a good first order approximation would be if we simply charge any
> newly allocated buffers to the process that created them, but that means
> hanging onto lots of mm_struct pointers since we want to make sure we then
> release those pages to the right mm again (since the process that drops
> the last ref might be a totally different one, depending upon how the
> buffers or DRM fd have been shared).
> 
> Would it be ok to hang onto potentially arbitrary mmget references
> essentially forever? If that's ok I think we can do your process based
> account (minus a few minor inaccuracies for shared stuff perhaps, but no
> one cares about that).

Honestly, I think you and Christian are overthinking this. Let's try
charging the memory to every process which shares a buffer, and go from
there.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-30  9:43                         ` Michel Dänzer
  0 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-30  9:43 UTC (permalink / raw)
  To: christian.koenig, Michal Hocko, dri-devel, Roman Gushchin
  Cc: linux-kernel, linux-mm, amd-gfx

On 2018-01-30 10:31 AM, Daniel Vetter wrote:
> On Wed, Jan 24, 2018 at 01:11:09PM +0100, Christian König wrote:
>> Am 24.01.2018 um 12:50 schrieb Michal Hocko:
>>> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
>>>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
>>>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
>>> [...]
>>>>>> 2. If the OOM killer kills a process which is sharing BOs with another
>>>>>> process, this should result in the other process dropping its references
>>>>>> to the BOs as well, at which point the memory is released.
>>>>> OK. How exactly are those BOs mapped to the userspace?
>>>> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
>>>> refer to a BO. There can also be userspace CPU mappings of the BO's
>>>> memory, but userspace doesn't need CPU mappings for all BOs and only
>>>> creates them as needed.
>>> OK, I guess you have to bear with me some more. This whole stack is a
>>> complete uknonwn. I am mostly after finding a boundary where you can
>>> charge the allocated memory to the process so that the oom killer can
>>> consider it. Is there anything like that? Except for the proposed file
>>> handle hack?
>>
>> Not that I knew of.
>>
>> As I said before we need some kind of callback that a process now starts to
>> use a file descriptor, but without anything from that file descriptor mapped
>> into the address space.
> 
> For more context: With DRI3 and wayland the compositor opens the DRM fd
> and then passes it to the client, which then starts allocating stuff. That
> makes book-keeping rather annoying.

Actually, what you're describing is only true for the buffers shared by
an X server with an X11 compositor. For the actual applications, the
buffers are created on the client side and then shared with the X server
/ Wayland compositor.

Anyway, it doesn't really matter. In all cases, the buffers are actually
used by all parties that are sharing them, so charging the memory to all
of them is perfectly appropriate.


> I guess a good first order approximation would be if we simply charge any
> newly allocated buffers to the process that created them, but that means
> hanging onto lots of mm_struct pointers since we want to make sure we then
> release those pages to the right mm again (since the process that drops
> the last ref might be a totally different one, depending upon how the
> buffers or DRM fd have been shared).
> 
> Would it be ok to hang onto potentially arbitrary mmget references
> essentially forever? If that's ok I think we can do your process based
> account (minus a few minor inaccuracies for shared stuff perhaps, but no
> one cares about that).

Honestly, I think you and Christian are overthinking this. Let's try
charging the memory to every process which shares a buffer, and go from
there.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-30  9:43                         ` Michel Dänzer
  0 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-30  9:43 UTC (permalink / raw)
  To: christian.koenig-5C7GfCeVMHo, Michal Hocko,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Roman Gushchin
  Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2018-01-30 10:31 AM, Daniel Vetter wrote:
> On Wed, Jan 24, 2018 at 01:11:09PM +0100, Christian König wrote:
>> Am 24.01.2018 um 12:50 schrieb Michal Hocko:
>>> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
>>>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
>>>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
>>> [...]
>>>>>> 2. If the OOM killer kills a process which is sharing BOs with another
>>>>>> process, this should result in the other process dropping its references
>>>>>> to the BOs as well, at which point the memory is released.
>>>>> OK. How exactly are those BOs mapped to the userspace?
>>>> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
>>>> refer to a BO. There can also be userspace CPU mappings of the BO's
>>>> memory, but userspace doesn't need CPU mappings for all BOs and only
>>>> creates them as needed.
>>> OK, I guess you have to bear with me some more. This whole stack is a
>>> complete uknonwn. I am mostly after finding a boundary where you can
>>> charge the allocated memory to the process so that the oom killer can
>>> consider it. Is there anything like that? Except for the proposed file
>>> handle hack?
>>
>> Not that I knew of.
>>
>> As I said before we need some kind of callback that a process now starts to
>> use a file descriptor, but without anything from that file descriptor mapped
>> into the address space.
> 
> For more context: With DRI3 and wayland the compositor opens the DRM fd
> and then passes it to the client, which then starts allocating stuff. That
> makes book-keeping rather annoying.

Actually, what you're describing is only true for the buffers shared by
an X server with an X11 compositor. For the actual applications, the
buffers are created on the client side and then shared with the X server
/ Wayland compositor.

Anyway, it doesn't really matter. In all cases, the buffers are actually
used by all parties that are sharing them, so charging the memory to all
of them is perfectly appropriate.


> I guess a good first order approximation would be if we simply charge any
> newly allocated buffers to the process that created them, but that means
> hanging onto lots of mm_struct pointers since we want to make sure we then
> release those pages to the right mm again (since the process that drops
> the last ref might be a totally different one, depending upon how the
> buffers or DRM fd have been shared).
> 
> Would it be ok to hang onto potentially arbitrary mmget references
> essentially forever? If that's ok I think we can do your process based
> account (minus a few minor inaccuracies for shared stuff perhaps, but no
> one cares about that).

Honestly, I think you and Christian are overthinking this. Let's try
charging the memory to every process which shares a buffer, and go from
there.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-30  9:29                     ` Michel Dänzer
  (?)
@ 2018-01-30 10:28                       ` Michal Hocko
  -1 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-30 10:28 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

On Tue 30-01-18 10:29:10, Michel Dänzer wrote:
> On 2018-01-24 12:50 PM, Michal Hocko wrote:
> > On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
> >> On 2018-01-24 12:01 PM, Michal Hocko wrote:
> >>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> > [...]
> >>>> 2. If the OOM killer kills a process which is sharing BOs with another
> >>>> process, this should result in the other process dropping its references
> >>>> to the BOs as well, at which point the memory is released.
> >>>
> >>> OK. How exactly are those BOs mapped to the userspace?
> >>
> >> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
> >> refer to a BO. There can also be userspace CPU mappings of the BO's
> >> memory, but userspace doesn't need CPU mappings for all BOs and only
> >> creates them as needed.
> > 
> > OK, I guess you have to bear with me some more. This whole stack is a
> > complete uknonwn. I am mostly after finding a boundary where you can
> > charge the allocated memory to the process so that the oom killer can
> > consider it. Is there anything like that? Except for the proposed file
> > handle hack?
> 
> How about the other way around: what APIs can we use to charge /
> "uncharge" memory to a process? If we have those, we can experiment with
> different places to call them.

add_mm_counter() and I would add a new counter e.g. MM_KERNEL_PAGES.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 170+ messages in thread
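
As a sketch of what that could mean in practice (MM_KERNEL_PAGES does not exist in
mainline; adding it to the per-mm RSS counters and folding it into oom_badness() are
assumptions made only for illustration):

#include <linux/mm.h>

/* assumes MM_KERNEL_PAGES has been added next to MM_FILEPAGES/MM_ANONPAGES/...
 * in the per-mm rss counter enum, so the existing helpers just work with it */
static inline void mm_charge_kernel_pages(struct mm_struct *mm, long pages)
{
	add_mm_counter(mm, MM_KERNEL_PAGES, pages);	/* charge */
}

static inline void mm_uncharge_kernel_pages(struct mm_struct *mm, long pages)
{
	add_mm_counter(mm, MM_KERNEL_PAGES, -pages);	/* uncharge */
}

/*
 * oom_badness() already sums up a task's RSS, swap entries and page table
 * pages; the new counter would simply be added to that sum, roughly:
 *
 *	points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
 *		 get_mm_counter(p->mm, MM_KERNEL_PAGES) + ...;
 */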

* Re: [RFC] Per file OOM badness
@ 2018-01-30 10:28                       ` Michal Hocko
  0 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-30 10:28 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

On Tue 30-01-18 10:29:10, Michel Dänzer wrote:
> On 2018-01-24 12:50 PM, Michal Hocko wrote:
> > On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
> >> On 2018-01-24 12:01 PM, Michal Hocko wrote:
> >>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> > [...]
> >>>> 2. If the OOM killer kills a process which is sharing BOs with another
> >>>> process, this should result in the other process dropping its references
> >>>> to the BOs as well, at which point the memory is released.
> >>>
> >>> OK. How exactly are those BOs mapped to the userspace?
> >>
> >> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
> >> refer to a BO. There can also be userspace CPU mappings of the BO's
> >> memory, but userspace doesn't need CPU mappings for all BOs and only
> >> creates them as needed.
> > 
> > OK, I guess you have to bear with me some more. This whole stack is a
> > complete uknonwn. I am mostly after finding a boundary where you can
> > charge the allocated memory to the process so that the oom killer can
> > consider it. Is there anything like that? Except for the proposed file
> > handle hack?
> 
> How about the other way around: what APIs can we use to charge /
> "uncharge" memory to a process? If we have those, we can experiment with
> different places to call them.

add_mm_counter() and I would add a new counter e.g. MM_KERNEL_PAGES.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
@ 2018-01-30 10:28                       ` Michal Hocko
  0 siblings, 0 replies; 170+ messages in thread
From: Michal Hocko @ 2018-01-30 10:28 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: linux-kernel, amd-gfx, Christian.Koenig, linux-mm, dri-devel,
	Roman Gushchin

On Tue 30-01-18 10:29:10, Michel Dänzer wrote:
> On 2018-01-24 12:50 PM, Michal Hocko wrote:
> > On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
> >> On 2018-01-24 12:01 PM, Michal Hocko wrote:
> >>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> > [...]
> >>>> 2. If the OOM killer kills a process which is sharing BOs with another
> >>>> process, this should result in the other process dropping its references
> >>>> to the BOs as well, at which point the memory is released.
> >>>
> >>> OK. How exactly are those BOs mapped to the userspace?
> >>
> >> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
> >> refer to a BO. There can also be userspace CPU mappings of the BO's
> >> memory, but userspace doesn't need CPU mappings for all BOs and only
> >> creates them as needed.
> > 
> > OK, I guess you have to bear with me some more. This whole stack is a
> > complete uknonwn. I am mostly after finding a boundary where you can
> > charge the allocated memory to the process so that the oom killer can
> > consider it. Is there anything like that? Except for the proposed file
> > handle hack?
> 
> How about the other way around: what APIs can we use to charge /
> "uncharge" memory to a process? If we have those, we can experiment with
> different places to call them.

add_mm_counter() and I would add a new counter e.g. MM_KERNEL_PAGES.

-- 
Michal Hocko
SUSE Labs
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-30  9:43                         ` Michel Dänzer
  (?)
@ 2018-01-30 10:40                           ` Christian König
  -1 siblings, 0 replies; 170+ messages in thread
From: Christian König @ 2018-01-30 10:40 UTC (permalink / raw)
  To: Michel Dänzer, Michal Hocko, dri-devel, Roman Gushchin
  Cc: linux-kernel, linux-mm, amd-gfx

On 30.01.2018 10:43, Michel Dänzer wrote:
> [SNIP]
>> Would it be ok to hang onto potentially arbitrary mmget references
>> essentially forever? If that's ok I think we can do your process based
>> account (minus a few minor inaccuracies for shared stuff perhaps, but no
>> one cares about that).
> Honestly, I think you and Christian are overthinking this. Let's try
> charging the memory to every process which shares a buffer, and go from
> there.

My problem is that this needs to be bulletproof.

For example, imagine an application which allocates a lot of BOs, then 
calls fork() and lets the parent process die. The file descriptor lives 
on in the child process, but the memory is not accounted against the child.

Otherwise we would allow easy construction of denial-of-service problems.

To avoid that, I think we need to add something like new file_operations 
callbacks which inform a file descriptor that it is going to be used in 
a new process, or has stopped being used in a process.
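
A rough sketch of such hooks, purely illustrative (neither callback
exists in struct file_operations today, and the names proc_attach /
proc_detach are invented here):

/* Illustrative only -- no such members exist in mainline. */
struct file_operations {
	/* ... existing callbacks ... */

	/* a new process starts using this file (fork(), fd passing, ...) */
	void (*proc_attach)(struct file *file, struct task_struct *task);
	/* a process stops using this file (exit, last reference dropped) */
	void (*proc_detach)(struct file *file, struct task_struct *task);
};

A driver implementing these could move its per-process accounting from
the old task to the new one whenever the callbacks fire.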

Regards,
Christian.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-30  9:43                         ` Michel Dänzer
  (?)
@ 2018-01-30 10:42                           ` Daniel Vetter
  -1 siblings, 0 replies; 170+ messages in thread
From: Daniel Vetter @ 2018-01-30 10:42 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: christian.koenig, Michal Hocko, dri-devel, Roman Gushchin,
	linux-kernel, linux-mm, amd-gfx

On Tue, Jan 30, 2018 at 10:43:10AM +0100, Michel Dänzer wrote:
> On 2018-01-30 10:31 AM, Daniel Vetter wrote:
> > On Wed, Jan 24, 2018 at 01:11:09PM +0100, Christian König wrote:
> >> On 24.01.2018 12:50, Michal Hocko wrote:
> >>> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
> >>>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
> >>>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> >>> [...]
> >>>>>> 2. If the OOM killer kills a process which is sharing BOs with another
> >>>>>> process, this should result in the other process dropping its references
> >>>>>> to the BOs as well, at which point the memory is released.
> >>>>> OK. How exactly are those BOs mapped to the userspace?
> >>>> I'm not sure what you're asking. Userspace mostly uses a GEM handle to
> >>>> refer to a BO. There can also be userspace CPU mappings of the BO's
> >>>> memory, but userspace doesn't need CPU mappings for all BOs and only
> >>>> creates them as needed.
> >>> OK, I guess you have to bear with me some more. This whole stack is a
> >>> complete unknown. I am mostly after finding a boundary where you can
> >>> charge the allocated memory to the process so that the oom killer can
> >>> consider it. Is there anything like that? Except for the proposed file
> >>> handle hack?
> >>
> >> Not that I knew of.
> >>
> >> As I said before we need some kind of callback that a process now starts to
> >> use a file descriptor, but without anything from that file descriptor mapped
> >> into the address space.
> > 
> > For more context: With DRI3 and wayland the compositor opens the DRM fd
> > and then passes it to the client, which then starts allocating stuff. That
> > makes book-keeping rather annoying.
> 
> Actually, what you're describing is only true for the buffers shared by
> an X server with an X11 compositor. For the actual applications, the
> buffers are created on the client side and then shared with the X server
> / Wayland compositor.
> 
> Anyway, it doesn't really matter. In all cases, the buffers are actually
> used by all parties that are sharing them, so charging the memory to all
> of them is perfectly appropriate.
> 
> 
> > I guess a good first order approximation would be if we simply charge any
> > newly allocated buffers to the process that created them, but that means
> > hanging onto lots of mm_struct pointers since we want to make sure we then
> > release those pages to the right mm again (since the process that drops
> > the last ref might be a totally different one, depending upon how the
> > buffers or DRM fd have been shared).
> > 
> > Would it be ok to hang onto potentially arbitrary mmget references
> > essentially forever? If that's ok I think we can do your process based
> > account (minus a few minor inaccuracies for shared stuff perhaps, but no
> > one cares about that).
> 
> Honestly, I think you and Christian are overthinking this. Let's try
> charging the memory to every process which shares a buffer, and go from
> there.

I'm not concerned about wrongly accounting shared buffers (they don't
matter), but imbalanced accounting. I.e. allocate a buffer in the client,
share it, but then the compositor drops the last reference.

If we store the mm_struct pointer in drm_gem_object, we don't need any
callback from the vfs when fds are shared or anything like that. We can
simply account any newly allocated buffers to the current->mm, store that
pointer, and use it later to drop the accounting when the gem obj is
released. This would entirely ignore any complications with shared
buffers, which I think we can do because even when we pass the DRM fd to a
different process, the actual buffer allocations are not passed around
like that for private buffers. And private buffers are the only ones that
really matter.
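
A sketch of that idea, with the caveats that drm_gem_object has no "mm"
member today, that MM_KERNEL_PAGES is only the counter proposed earlier
in this thread, and that the usual mm/sched/DRM headers are assumed:

/* Sketch only: obj->mm is a hypothetical member of drm_gem_object. */
static void gem_charge_creator(struct drm_gem_object *obj)
{
	long nr_pages = obj->size >> PAGE_SHIFT;

	obj->mm = current->mm;
	mmgrab(obj->mm);	/* the buffer may outlive its creator */
	add_mm_counter(obj->mm, MM_KERNEL_PAGES, nr_pages);
}

static void gem_uncharge_creator(struct drm_gem_object *obj)
{
	long nr_pages = obj->size >> PAGE_SHIFT;

	add_mm_counter(obj->mm, MM_KERNEL_PAGES, -nr_pages);
	mmdrop(obj->mm);
	obj->mm = NULL;
}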
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-30 10:42                           ` Daniel Vetter
  (?)
@ 2018-01-30 10:48                             ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-30 10:48 UTC (permalink / raw)
  To: christian.koenig, Michal Hocko, Roman Gushchin
  Cc: dri-devel, linux-kernel, linux-mm, amd-gfx

On 2018-01-30 11:42 AM, Daniel Vetter wrote:
> On Tue, Jan 30, 2018 at 10:43:10AM +0100, Michel Dänzer wrote:
>> On 2018-01-30 10:31 AM, Daniel Vetter wrote:
>>
>>> I guess a good first order approximation would be if we simply charge any
>>> newly allocated buffers to the process that created them, but that means
>>> hanging onto lots of mm_struct pointers since we want to make sure we then
>>> release those pages to the right mm again (since the process that drops
>>> the last ref might be a totally different one, depending upon how the
>>> buffers or DRM fd have been shared).
>>>
>>> Would it be ok to hang onto potentially arbitrary mmget references
>>> essentially forever? If that's ok I think we can do your process based
>>> account (minus a few minor inaccuracies for shared stuff perhaps, but no
>>> one cares about that).
>>
>> Honestly, I think you and Christian are overthinking this. Let's try
>> charging the memory to every process which shares a buffer, and go from
>> there.
> 
> I'm not concerned about wrongly accounting shared buffers (they don't
> matter), but imbalanced accounting. I.e. allocate a buffer in the client,
> share it, but then the compositor drops the last reference.

I don't think the order matters. The memory is "uncharged" in each
process when it drops its reference.
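
Sketched against the GEM handle lifetime instead, so that every process
holding a handle is charged and uncharged independently (file_priv->mm
is a hypothetical pointer resolved when the DRM file was opened, and
MM_KERNEL_PAGES is the counter proposed earlier):

/* Sketch only: the order in which the sharing processes drop their
 * handles does not matter, each one only undoes its own charge. */
static void gem_handle_charge(struct drm_file *file_priv,
			      struct drm_gem_object *obj)
{
	long nr_pages = obj->size >> PAGE_SHIFT;

	add_mm_counter(file_priv->mm, MM_KERNEL_PAGES, nr_pages);
}

static void gem_handle_uncharge(struct drm_file *file_priv,
				struct drm_gem_object *obj)
{
	long nr_pages = obj->size >> PAGE_SHIFT;

	add_mm_counter(file_priv->mm, MM_KERNEL_PAGES, -nr_pages);
}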


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-30 10:40                           ` Christian König
  (?)
@ 2018-01-30 11:02                             ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-30 11:02 UTC (permalink / raw)
  To: Christian König, Michal Hocko, Roman Gushchin
  Cc: dri-devel, linux-mm, linux-kernel, amd-gfx

On 2018-01-30 11:40 AM, Christian König wrote:
> On 30.01.2018 10:43, Michel Dänzer wrote:
>> [SNIP]
>>> Would it be ok to hang onto potentially arbitrary mmget references
>>> essentially forever? If that's ok I think we can do your process based
>>> account (minus a few minor inaccuracies for shared stuff perhaps, but no
>>> one cares about that).
>> Honestly, I think you and Christian are overthinking this. Let's try
>> charging the memory to every process which shares a buffer, and go from
>> there.
> 
> My problem is that this needs to be bulletproof.
> 
> For example imagine an application which allocates a lot of BOs, then
> calls fork() and lets the parent process die. The file descriptor lives
> on in the child process, but the memory is not accounted against the child.

What exactly are you referring to by "the file descriptor" here?


What happens to BO handles in general in this case? If both parent and
child process keep the same handle for the same BO, one of them
destroying the handle will result in the other one not being able to use
it anymore either, won't it?


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-30 11:02                             ` Michel Dänzer
@ 2018-01-30 11:28                               ` Christian König
  -1 siblings, 0 replies; 170+ messages in thread
From: Christian König @ 2018-01-30 11:28 UTC (permalink / raw)
  To: Michel Dänzer, Christian König, Michal Hocko, Roman Gushchin
  Cc: linux-mm, amd-gfx, linux-kernel, dri-devel

On 30.01.2018 12:02, Michel Dänzer wrote:
> On 2018-01-30 11:40 AM, Christian König wrote:
>> On 30.01.2018 10:43, Michel Dänzer wrote:
>>> [SNIP]
>>>> Would it be ok to hang onto potentially arbitrary mmget references
>>>> essentially forever? If that's ok I think we can do your process based
>>>> account (minus a few minor inaccuracies for shared stuff perhaps, but no
>>>> one cares about that).
>>> Honestly, I think you and Christian are overthinking this. Let's try
>>> charging the memory to every process which shares a buffer, and go from
>>> there.
>> My problem is that this needs to be bulletproof.
>>
>> For example imagine an application which allocates a lot of BOs, then
>> calls fork() and lets the parent process die. The file descriptor lives
>> on in the child process, but the memory is not accounted against the child.
> What exactly are you referring to by "the file descriptor" here?

The file descriptor used to identify the connection to the driver. In 
other words our drm_file structure in the kernel.

> What happens to BO handles in general in this case? If both parent and
> child process keep the same handle for the same BO, one of them
> destroying the handle will result in the other one not being able to use
> it anymore either, won't it?
Correct.

That usage is actually not useful at all, but we already had 
applications which did exactly that by accident.

Not to mention that somebody could do it on purpose.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-30 11:28                               ` Christian König
  (?)
@ 2018-01-30 11:34                                 ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-30 11:34 UTC (permalink / raw)
  To: christian.koenig, Michal Hocko, Roman Gushchin
  Cc: linux-mm, dri-devel, linux-kernel, amd-gfx

On 2018-01-30 12:28 PM, Christian König wrote:
> On 30.01.2018 12:02, Michel Dänzer wrote:
>> On 2018-01-30 11:40 AM, Christian König wrote:
>>> On 30.01.2018 10:43, Michel Dänzer wrote:
>>>> [SNIP]
>>>>> Would it be ok to hang onto potentially arbitrary mmget references
>>>>> essentially forever? If that's ok I think we can do your process based
>>>>> account (minus a few minor inaccuracies for shared stuff perhaps,
>>>>> but no
>>>>> one cares about that).
>>>> Honestly, I think you and Christian are overthinking this. Let's try
>>>> charging the memory to every process which shares a buffer, and go from
>>>> there.
>>> My problem is that this needs to be bulletproof.
>>>
>>> For example imagine an application which allocates a lot of BOs, then
>>> calls fork() and lets the parent process die. The file descriptor lives
>>> on in the child process, but the memory is not accounted against the
>>> child.
>> What exactly are you referring to by "the file descriptor" here?
> 
> The file descriptor used to identify the connection to the driver. In
> other words our drm_file structure in the kernel.
> 
>> What happens to BO handles in general in this case? If both parent and
>> child process keep the same handle for the same BO, one of them
>> destroying the handle will result in the other one not being able to use
>> it anymore either, won't it?
> Correct.
> 
> That usage is actually not useful at all, but we already had
> applications which did exactly that by accident.
> 
> Not to mention that somebody could do it on purpose.

Can we just prevent child processes from using their parent's DRM file
descriptors altogether? Allowing it seems like a bad idea all around.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-30 10:48                             ` Michel Dänzer
  (?)
@ 2018-01-30 11:35                               ` Nicolai Hähnle
  -1 siblings, 0 replies; 170+ messages in thread
From: Nicolai Hähnle @ 2018-01-30 11:35 UTC (permalink / raw)
  To: Michel Dänzer, christian.koenig, Michal Hocko, Roman Gushchin
  Cc: linux-mm, amd-gfx, linux-kernel, dri-devel

On 30.01.2018 11:48, Michel Dänzer wrote:
> On 2018-01-30 11:42 AM, Daniel Vetter wrote:
>> On Tue, Jan 30, 2018 at 10:43:10AM +0100, Michel Dänzer wrote:
>>> On 2018-01-30 10:31 AM, Daniel Vetter wrote:
>>>
>>>> I guess a good first order approximation would be if we simply charge any
>>>> newly allocated buffers to the process that created them, but that means
>>>> hanging onto lots of mm_struct pointers since we want to make sure we then
>>>> release those pages to the right mm again (since the process that drops
>>>> the last ref might be a totally different one, depending upon how the
>>>> buffers or DRM fd have been shared).
>>>>
>>>> Would it be ok to hang onto potentially arbitrary mmget references
>>>> essentially forever? If that's ok I think we can do your process based
>>>> account (minus a few minor inaccuracies for shared stuff perhaps, but no
>>>> one cares about that).
>>>
>>> Honestly, I think you and Christian are overthinking this. Let's try
>>> charging the memory to every process which shares a buffer, and go from
>>> there.
>>
>> I'm not concerned about wrongly accounting shared buffers (they don't
>> matter), but imbalanced accounting. I.e. allocate a buffer in the client,
>> share it, but then the compositor drops the last reference.
> 
> I don't think the order matters. The memory is "uncharged" in each
> process when it drops its reference.

Daniel made a fair point about passing DRM fds between processes, though.

It's not a problem with how the fds are currently used, but somebody 
could do the following:

1. Create a DRM fd in process A, allocate lots of buffers.
2. Pass the fd to process B via some IPC mechanism.
3. Exit process A.

There needs to be some assurance that the BOs are accounted as belonging 
to process B in the end.
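
One way to picture the assurance needed here, building on the purely
hypothetical proc_attach/proc_detach callbacks and the file_priv->mm /
MM_KERNEL_PAGES assumptions sketched earlier in this thread (locking of
the handle table is omitted):

/* Sketch only: move the accounting for every BO handle owned by this
 * drm_file from the process that created it to the one now using it. */
static void sketch_drm_move_charge(struct drm_file *file_priv,
				   struct mm_struct *new_mm)
{
	struct mm_struct *old_mm = file_priv->mm;	/* hypothetical member */
	struct drm_gem_object *obj;
	int handle;

	mmgrab(new_mm);
	idr_for_each_entry(&file_priv->object_idr, obj, handle) {
		long nr_pages = obj->size >> PAGE_SHIFT;

		add_mm_counter(old_mm, MM_KERNEL_PAGES, -nr_pages);
		add_mm_counter(new_mm, MM_KERNEL_PAGES, nr_pages);
	}
	file_priv->mm = new_mm;
	mmdrop(old_mm);
}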

Cheers,
Nicolai
-- 
Learn what the world is really like,
but never forget what it ought to be.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-30 11:34                                 ` Michel Dänzer
  (?)
@ 2018-01-30 11:36                                   ` Nicolai Hähnle
  -1 siblings, 0 replies; 170+ messages in thread
From: Nicolai Hähnle @ 2018-01-30 11:36 UTC (permalink / raw)
  To: Michel Dänzer, christian.koenig, Michal Hocko, Roman Gushchin
  Cc: linux-mm, amd-gfx, linux-kernel, dri-devel

On 30.01.2018 12:34, Michel Dänzer wrote:
> On 2018-01-30 12:28 PM, Christian König wrote:
>> On 30.01.2018 12:02, Michel Dänzer wrote:
>>> On 2018-01-30 11:40 AM, Christian König wrote:
>>>> On 30.01.2018 10:43, Michel Dänzer wrote:
>>>>> [SNIP]
>>>>>> Would it be ok to hang onto potentially arbitrary mmget references
>>>>>> essentially forever? If that's ok I think we can do your process based
>>>>>> account (minus a few minor inaccuracies for shared stuff perhaps,
>>>>>> but no
>>>>>> one cares about that).
>>>>> Honestly, I think you and Christian are overthinking this. Let's try
>>>>> charging the memory to every process which shares a buffer, and go from
>>>>> there.
>>>> My problem is that this needs to be bulletproof.
>>>>
>>>> For example imagine an application which allocates a lot of BOs, then
>>>> calls fork() and lets the parent process die. The file descriptor lives
>>>> on in the child process, but the memory is not accounted against the
>>>> child.
>>> What exactly are you referring to by "the file descriptor" here?
>>
>> The file descriptor used to identify the connection to the driver. In
>> other words our drm_file structure in the kernel.
>>
>>> What happens to BO handles in general in this case? If both parent and
>>> child process keep the same handle for the same BO, one of them
>>> destroying the handle will result in the other one not being able to use
>>> it anymore either, won't it?
>> Correct.
>>
>> That usage is actually not useful at all, but we already had
>> applications which did exactly that by accident.
>>
>> Not to mention that somebody could do it on purpose.
> 
> Can we just prevent child processes from using their parent's DRM file
> descriptors altogether? Allowing it seems like a bad idea all around.

Existing protocols pass DRM fds between processes though, don't they?

Not child processes perhaps, but special-casing that seems like awful 
design.

Cheers,
Nicolai
-- 
Learn what the world is really like,
but never forget what it ought to be.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-30 11:36                                   ` Nicolai Hähnle
  (?)
@ 2018-01-30 11:42                                     ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-30 11:42 UTC (permalink / raw)
  To: Nicolai Hähnle, christian.koenig, Michal Hocko, Roman Gushchin
  Cc: linux-mm, dri-devel, linux-kernel, amd-gfx

On 2018-01-30 12:36 PM, Nicolai Hähnle wrote:
> On 30.01.2018 12:34, Michel Dänzer wrote:
>> On 2018-01-30 12:28 PM, Christian König wrote:
>>> On 30.01.2018 12:02, Michel Dänzer wrote:
>>>> On 2018-01-30 11:40 AM, Christian König wrote:
>>>>> On 30.01.2018 10:43, Michel Dänzer wrote:
>>>>>> [SNIP]
>>>>>>> Would it be ok to hang onto potentially arbitrary mmget references
>>>>>>> essentially forever? If that's ok I think we can do your process
>>>>>>> based
>>>>>>> account (minus a few minor inaccuracies for shared stuff perhaps,
>>>>>>> but no
>>>>>>> one cares about that).
>>>>>> Honestly, I think you and Christian are overthinking this. Let's try
>>>>>> charging the memory to every process which shares a buffer, and go
>>>>>> from
>>>>>> there.
>>>>> My problem is that this needs to be bullet prove.
>>>>>
>>>>> For example imagine an application which allocates a lot of BOs, then
>>>>> calls fork() and let the parent process die. The file descriptor lives
>>>>> on in the child process, but the memory is not accounted against the
>>>>> child.
>>>> What exactly are you referring to by "the file descriptor" here?
>>>
>>> The file descriptor used to identify the connection to the driver. In
>>> other words our drm_file structure in the kernel.
>>>
>>>> What happens to BO handles in general in this case? If both parent and
>>>> child process keep the same handle for the same BO, one of them
>>>> destroying the handle will result in the other one not being able to
>>>> use
>>>> it anymore either, won't it?
>>> Correct.
>>>
>>> That usage is actually not useful at all, but we already had
>>> applications which did exactly that by accident.
>>>
>>> Not to mention that somebody could do it on purpose.
>>
>> Can we just prevent child processes from using their parent's DRM file
>> descriptors altogether? Allowing it seems like a bad idea all around.
> 
> Existing protocols pass DRM fds between processes though, don't they?
> 
> Not child processes perhaps, but special-casing that seems like awful
> design.

Fair enough.

Can we disallow passing DRM file descriptors which have any buffers
allocated? :)


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-01-30 11:42                                     ` Michel Dänzer
  (?)
@ 2018-01-30 11:56                                       ` Christian König
  -1 siblings, 0 replies; 170+ messages in thread
From: Christian König @ 2018-01-30 11:56 UTC (permalink / raw)
  To: Michel Dänzer, Nicolai Hähnle, Michal Hocko, Roman Gushchin
  Cc: linux-mm, dri-devel, linux-kernel, amd-gfx

Am 30.01.2018 um 12:42 schrieb Michel Dänzer:
> On 2018-01-30 12:36 PM, Nicolai Hähnle wrote:
>> On 30.01.2018 12:34, Michel Dänzer wrote:
>>> On 2018-01-30 12:28 PM, Christian König wrote:
>>>> Am 30.01.2018 um 12:02 schrieb Michel Dänzer:
>>>>> On 2018-01-30 11:40 AM, Christian König wrote:
>>>>>> Am 30.01.2018 um 10:43 schrieb Michel Dänzer:
>>>>>>> [SNIP]
>>>>>>>> Would it be ok to hang onto potentially arbitrary mmget references
>>>>>>>> essentially forever? If that's ok I think we can do your process
>>>>>>>> based
>>>>>>>> account (minus a few minor inaccuracies for shared stuff perhaps,
>>>>>>>> but no
>>>>>>>> one cares about that).
>>>>>>> Honestly, I think you and Christian are overthinking this. Let's try
>>>>>>> charging the memory to every process which shares a buffer, and go
>>>>>>> from
>>>>>>> there.
>>>>>> My problem is that this needs to be bullet prove.
>>>>>>
>>>>>> For example imagine an application which allocates a lot of BOs, then
>>>>>> calls fork() and let the parent process die. The file descriptor lives
>>>>>> on in the child process, but the memory is not accounted against the
>>>>>> child.
>>>>> What exactly are you referring to by "the file descriptor" here?
>>>> The file descriptor used to identify the connection to the driver. In
>>>> other words our drm_file structure in the kernel.
>>>>
>>>>> What happens to BO handles in general in this case? If both parent and
>>>>> child process keep the same handle for the same BO, one of them
>>>>> destroying the handle will result in the other one not being able to
>>>>> use
>>>>> it anymore either, won't it?
>>>> Correct.
>>>>
>>>> That usage is actually not useful at all, but we already had
>>>> applications which did exactly that by accident.
>>>>
>>>> Not to mention that somebody could do it on purpose.
>>> Can we just prevent child processes from using their parent's DRM file
>>> descriptors altogether? Allowing it seems like a bad idea all around.
>> Existing protocols pass DRM fds between processes though, don't they?
>>
>> Not child processes perhaps, but special-casing that seems like awful
>> design.
> Fair enough.
>
> Can we disallow passing DRM file descriptors which have any buffers
> allocated? :)

Hehe good point, but I'm sorry I have to ruin that.

The root VM page table is allocated when the DRM file descriptor is 
created and we want to account those to whoever uses the file descriptor 
as well.

We could now make an exception for the root VM page table to not be 
accounted (shouldn't be that much compared to the rest of the VM tree), 
but Nicolai is right all those exceptions are just an awful design :)

Looking into the fs layer, there actually only seem to be two functions 
involved when a file descriptor is installed into or removed from a 
process. So we just need to add some callbacks there.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 170+ messages in thread
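
A minimal sketch of the hook points Christian describes above, assuming two
new optional file_operations callbacks; the callback names and their exact
placement are assumptions for illustration, not code from the posted series:

        /* include/linux/fs.h (sketch): let a driver react when one of its
         * files is installed into or removed from a process' file table,
         * e.g. to charge or uncharge the memory it holds for that process. */
        struct file_operations {
                /* ... existing members ... */
                void (*fd_installed)(struct file *file, struct task_struct *task);
                void (*fd_removed)(struct file *file, struct task_struct *task);
        };

        /* fs/file.c (sketch): notify the driver on install; the close and
         * process-exit paths would call ->fd_removed() symmetrically. */
        void fd_install(unsigned int fd, struct file *file)
        {
                __fd_install(current->files, fd, file);
                if (file->f_op->fd_installed)
                        file->f_op->fd_installed(file, current);
        }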

* Re: [PATCH 4/4] drm/amdgpu: Use drm_oom_badness for amdgpu.
  2018-01-30  9:24     ` Daniel Vetter
  (?)
@ 2018-01-30 12:42       ` Andrey Grodzovsky
  -1 siblings, 0 replies; 170+ messages in thread
From: Andrey Grodzovsky @ 2018-01-30 12:42 UTC (permalink / raw)
  To: linux-kernel, linux-mm, dri-devel, amd-gfx, Christian.Koenig

That's definitely what I planned, I just didn't want to clutter the RFC 
with multiple repeated changes.

Thanks,

Andrey



On 01/30/2018 04:24 AM, Daniel Vetter wrote:
> On Thu, Jan 18, 2018 at 11:47:52AM -0500, Andrey Grodzovsky wrote:
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index 46a0c93..6a733cdc8 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -828,6 +828,7 @@ static const struct file_operations amdgpu_driver_kms_fops = {
>>   #ifdef CONFIG_COMPAT
>>   	.compat_ioctl = amdgpu_kms_compat_ioctl,
>>   #endif
>> +	.oom_file_badness = drm_oom_badness,
> Would be neat if we could roll this out for all gem drivers (once it's no
> longer an RFC ofc).
> -Daniel

^ permalink raw reply	[flat|nested] 170+ messages in thread
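
For context, a rough sketch of what a generic drm_oom_badness() helper could
look like if it were shared through the DRM core as Daniel suggests; the
per-client counter in struct drm_file (oom_badness_pages) is an assumption
of this sketch, not necessarily how the posted series computes the value:

        long drm_oom_badness(struct file *f)
        {
                struct drm_file *file_priv = f->private_data;

                /* pages allocated on behalf of this client (assumed counter) */
                if (file_priv)
                        return atomic_long_read(&file_priv->oom_badness_pages);

                return 0;
        }

Any GEM driver could then reuse the helper by pointing .oom_file_badness at
it in its fops, like the amdgpu hunk quoted above does.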

* Re: [RFC] Per file OOM badness
  2018-01-30 11:56                                       ` Christian König
  (?)
@ 2018-01-30 15:52                                         ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-01-30 15:52 UTC (permalink / raw)
  To: Christian König, Nicolai Hähnle, Michal Hocko, Roman Gushchin
  Cc: linux-mm, amd-gfx, linux-kernel, dri-devel

On 2018-01-30 12:56 PM, Christian König wrote:
> Am 30.01.2018 um 12:42 schrieb Michel Dänzer:
>> On 2018-01-30 12:36 PM, Nicolai Hähnle wrote:
>>> On 30.01.2018 12:34, Michel Dänzer wrote:
>>>> On 2018-01-30 12:28 PM, Christian König wrote:
>>>>> Am 30.01.2018 um 12:02 schrieb Michel Dänzer:
>>>>>> On 2018-01-30 11:40 AM, Christian König wrote:
>>>>>>> Am 30.01.2018 um 10:43 schrieb Michel Dänzer:
>>>>>>>> [SNIP]
>>>>>>>>> Would it be ok to hang onto potentially arbitrary mmget references
>>>>>>>>> essentially forever? If that's ok I think we can do your process
>>>>>>>>> based
>>>>>>>>> account (minus a few minor inaccuracies for shared stuff perhaps,
>>>>>>>>> but no
>>>>>>>>> one cares about that).
>>>>>>>> Honestly, I think you and Christian are overthinking this. Let's
>>>>>>>> try
>>>>>>>> charging the memory to every process which shares a buffer, and go
>>>>>>>> from
>>>>>>>> there.
>>>>>>> My problem is that this needs to be bullet prove.
>>>>>>>
>>>>>>> For example imagine an application which allocates a lot of BOs,
>>>>>>> then
>>>>>>> calls fork() and let the parent process die. The file descriptor
>>>>>>> lives
>>>>>>> on in the child process, but the memory is not accounted against the
>>>>>>> child.
>>>>>> What exactly are you referring to by "the file descriptor" here?
>>>>> The file descriptor used to identify the connection to the driver. In
>>>>> other words our drm_file structure in the kernel.
>>>>>
>>>>>> What happens to BO handles in general in this case? If both parent
>>>>>> and
>>>>>> child process keep the same handle for the same BO, one of them
>>>>>> destroying the handle will result in the other one not being able to
>>>>>> use
>>>>>> it anymore either, won't it?
>>>>> Correct.
>>>>>
>>>>> That usage is actually not useful at all, but we already had
>>>>> applications which did exactly that by accident.
>>>>>
>>>>> Not to mention that somebody could do it on purpose.
>>>> Can we just prevent child processes from using their parent's DRM file
>>>> descriptors altogether? Allowing it seems like a bad idea all around.
>>> Existing protocols pass DRM fds between processes though, don't they?
>>>
>>> Not child processes perhaps, but special-casing that seems like awful
>>> design.
>> Fair enough.
>>
>> Can we disallow passing DRM file descriptors which have any buffers
>> allocated? :)
> 
> Hehe good point, but I'm sorry I have to ruin that.
> 
> The root VM page table is allocated when the DRM file descriptor is
> created and we want to account those to whoever uses the file descriptor
> as well.

Alternatively, since the file descriptor is closed in the sending
process in this case, maybe we can "uncharge" the buffer memory from the
sending process and charge it to the receiving one during the transfer?


> Looking into the fs layer, there actually only seem to be two functions
> involved when a file descriptor is installed into or removed from a
> process. So we just need to add some callbacks there.

That could work for file descriptor passing, but I'm not sure it really
helps for the fork case. Let's say we charge the buffer memory to the
child process as well. If either process later destroys a buffer handle,
the buffer becomes inaccessible to the other process as well; however,
its memory remains charged to that process (even though it may already
be freed).

I think using a DRM file descriptor in both parent and child processes
is a pathological case that we really want to prevent rather than
worrying about how to make it work well. It doesn't seem to be working
well in general already anyway.


Maybe we could keep track of which process "owns" a DRM file descriptor,
and return an error from any relevant system calls for it from another
process. When passing an fd, its ownership would transfer to the
receiving process. When forking, the ownership would remain with the
parent process.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread
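
A minimal sketch of the ownership check floated at the end of the message
above, assuming struct drm_file gains an owner field that is set at open
time and updated when the fd is passed to another process; both the field
and the helper are assumptions for illustration only:

        /* struct drm_file would gain:  struct pid *owner; */

        static int drm_check_fd_owner(struct drm_file *file_priv)
        {
                /* reject use of the fd by a process that does not own it */
                if (file_priv->owner != task_tgid(current))
                        return -EACCES;
                return 0;
        }

Called early in the ioctl and mmap paths, a check like this would make the
fork case fail loudly instead of producing unaccounted memory.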

* Re: [RFC] Per file OOM badness
  2018-01-30 10:28                       ` Michal Hocko
  (?)
@ 2018-03-26 14:36                         ` Lucas Stach
  -1 siblings, 0 replies; 170+ messages in thread
From: Lucas Stach @ 2018-03-26 14:36 UTC (permalink / raw)
  To: Michal Hocko, Michel Dänzer
  Cc: linux-kernel, amd-gfx, Christian.Koenig, linux-mm, dri-devel,
	Roman Gushchin

Hi all,

Am Dienstag, den 30.01.2018, 11:28 +0100 schrieb Michal Hocko:
> On Tue 30-01-18 10:29:10, Michel Dänzer wrote:
> > On 2018-01-24 12:50 PM, Michal Hocko wrote:
> > > On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
> > > > On 2018-01-24 12:01 PM, Michal Hocko wrote:
> > > > > On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> > > 
> > > [...]
> > > > > > 2. If the OOM killer kills a process which is sharing BOs
> > > > > > with another
> > > > > > process, this should result in the other process dropping
> > > > > > its references
> > > > > > to the BOs as well, at which point the memory is released.
> > > > > 
> > > > > OK. How exactly are those BOs mapped to the userspace?
> > > > 
> > > > I'm not sure what you're asking. Userspace mostly uses a GEM
> > > > handle to
> > > > refer to a BO. There can also be userspace CPU mappings of the
> > > > BO's
> > > > memory, but userspace doesn't need CPU mappings for all BOs and
> > > > only
> > > > creates them as needed.
> > > 
> > > OK, I guess you have to bear with me some more. This whole stack
> > > is a
> > > complete uknonwn. I am mostly after finding a boundary where you
> > > can
> > > charge the allocated memory to the process so that the oom killer
> > > can
> > > consider it. Is there anything like that? Except for the proposed
> > > file
> > > handle hack?
> > 
> > How about the other way around: what APIs can we use to charge /
> > "uncharge" memory to a process? If we have those, we can experiment
> > with
> > different places to call them.
> 
> add_mm_counter() and I would add a new counter e.g. MM_KERNEL_PAGES.

So is anyone still working on this? This is hurting us bad enough that
I don't want to keep this topic rotting for another year.

If no one is currently working on this I would volunteer to give the
simple "just account private, non-shared buffers in process RSS" a
spin.

Regards,
Lucas

^ permalink raw reply	[flat|nested] 170+ messages in thread
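
To make the add_mm_counter() suggestion quoted above concrete, a minimal
sketch of how a driver could charge and uncharge pages it allocates on
behalf of userspace; MM_KERNEL_PAGES is the proposed (not yet existing)
counter, and the helper names are illustrative only:

        /* charge when pages are allocated for a userspace buffer object */
        static void gpu_mem_charge(struct mm_struct *mm, long npages)
        {
                add_mm_counter(mm, MM_KERNEL_PAGES, npages);
        }

        /* uncharge when the buffer object is destroyed */
        static void gpu_mem_uncharge(struct mm_struct *mm, long npages)
        {
                add_mm_counter(mm, MM_KERNEL_PAGES, -npages);
        }

oom_badness() could then add get_mm_counter(p->mm, MM_KERNEL_PAGES) to the
per-process score, the same way the existing RSS counters are summed today.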

* Re: [RFC] Per file OOM badness
  2018-03-26 14:36                         ` Lucas Stach
  (?)
@ 2018-04-04  9:09                           ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-04-04  9:09 UTC (permalink / raw)
  To: Lucas Stach, Michal Hocko
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

On 2018-03-26 04:36 PM, Lucas Stach wrote:
> Am Dienstag, den 30.01.2018, 11:28 +0100 schrieb Michal Hocko:
>> On Tue 30-01-18 10:29:10, Michel Dänzer wrote:
>>> On 2018-01-24 12:50 PM, Michal Hocko wrote:
>>>> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
>>>>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
>>>>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
>>>>
>>>> [...]
>>>>>>> 2. If the OOM killer kills a process which is sharing BOs
>>>>>>> with another
>>>>>>> process, this should result in the other process dropping
>>>>>>> its references
>>>>>>> to the BOs as well, at which point the memory is released.
>>>>>>
>>>>>> OK. How exactly are those BOs mapped to the userspace?
>>>>>
>>>>> I'm not sure what you're asking. Userspace mostly uses a GEM
>>>>> handle to
>>>>> refer to a BO. There can also be userspace CPU mappings of the
>>>>> BO's
>>>>> memory, but userspace doesn't need CPU mappings for all BOs and
>>>>> only
>>>>> creates them as needed.
>>>>
>>>> OK, I guess you have to bear with me some more. This whole stack
>>>> is a
>>>> complete uknonwn. I am mostly after finding a boundary where you
>>>> can
>>>> charge the allocated memory to the process so that the oom killer
>>>> can
>>>> consider it. Is there anything like that? Except for the proposed
>>>> file
>>>> handle hack?
>>>
>>> How about the other way around: what APIs can we use to charge /
>>> "uncharge" memory to a process? If we have those, we can experiment
>>> with
>>> different places to call them.
>>
>> add_mm_counter() and I would add a new counter e.g. MM_KERNEL_PAGES.
> 
> So is anyone still working on this? This is hurting us bad enough that
> I don't want to keep this topic rotting for another year.
> 
> If no one is currently working on this I would volunteer to give the
> simple "just account private, non-shared buffers in process RSS" a
> spin.

Sounds good. FWIW, I think shared buffers can also be easily handled by
accounting them in each process which has a reference. But that's more
of a detail, shouldn't make a big difference overall either way.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC] Per file OOM badness
  2018-04-04  9:09                           ` Michel Dänzer
  (?)
@ 2018-04-04  9:36                             ` Lucas Stach
  -1 siblings, 0 replies; 170+ messages in thread
From: Lucas Stach @ 2018-04-04  9:36 UTC (permalink / raw)
  To: Michel Dänzer, Michal Hocko
  Cc: linux-kernel, dri-devel, Christian.Koenig, linux-mm, amd-gfx,
	Roman Gushchin

Am Mittwoch, den 04.04.2018, 11:09 +0200 schrieb Michel Dänzer:
> On 2018-03-26 04:36 PM, Lucas Stach wrote:
> > Am Dienstag, den 30.01.2018, 11:28 +0100 schrieb Michal Hocko:
> > > On Tue 30-01-18 10:29:10, Michel Dänzer wrote:
> > > > On 2018-01-24 12:50 PM, Michal Hocko wrote:
> > > > > On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
> > > > > > On 2018-01-24 12:01 PM, Michal Hocko wrote:
> > > > > > > On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> > > > > 
> > > > > [...]
> > > > > > > > 2. If the OOM killer kills a process which is sharing BOs
> > > > > > > > with another
> > > > > > > > process, this should result in the other process dropping
> > > > > > > > its references
> > > > > > > > to the BOs as well, at which point the memory is released.
> > > > > > > 
> > > > > > > OK. How exactly are those BOs mapped to the userspace?
> > > > > > 
> > > > > > I'm not sure what you're asking. Userspace mostly uses a GEM
> > > > > > handle to
> > > > > > refer to a BO. There can also be userspace CPU mappings of the
> > > > > > BO's
> > > > > > memory, but userspace doesn't need CPU mappings for all BOs and
> > > > > > only
> > > > > > creates them as needed.
> > > > > 
> > > > > OK, I guess you have to bear with me some more. This whole stack
> > > > > is a
> > > > > complete uknonwn. I am mostly after finding a boundary where you
> > > > > can
> > > > > charge the allocated memory to the process so that the oom killer
> > > > > can
> > > > > consider it. Is there anything like that? Except for the proposed
> > > > > file
> > > > > handle hack?
> > > > 
> > > > How about the other way around: what APIs can we use to charge /
> > > > "uncharge" memory to a process? If we have those, we can experiment
> > > > with
> > > > different places to call them.
> > > 
> > > add_mm_counter() and I would add a new counter e.g. MM_KERNEL_PAGES.
> > 
> > So is anyone still working on this? This is hurting us bad enough that
> > I don't want to keep this topic rotting for another year.
> > 
> > If no one is currently working on this I would volunteer to give the
> > simple "just account private, non-shared buffers in process RSS" a
> > spin.
> 
> Sounds good. FWIW, I think shared buffers can also be easily handled by
> accounting them in each process which has a reference. But that's more
> of a detail, shouldn't make a big difference overall either way.

Yes, both options, either never accounting shared buffers or always
accounting them to every process holding a reference, should be pretty
easy. Where it gets hard is when trying to account the buffer only in
the last process holding a reference, or something like that.

For the OOM case I think it makes more sense to never account shared
buffers, as doing so could leave a process (like the compositor) with
its RSS inflated by shared buffers, making it the likely victim of the
OOM killer/reaper, while killing that process would not free any shared
graphics memory, at least as long as the clients sharing the buffers
survive the compositor being killed.

This opens up the possibility of "hiding" buffers from the accounting by
sharing them, but I guess it's still much better than the complete lack
of graphics buffer accounting we have today.

Regards,
Lucas

^ permalink raw reply	[flat|nested] 170+ messages in thread
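
A minimal sketch of the "only account private, non-shared buffers" rule,
assuming the charge is applied when a GEM handle is created and reuses the
MM_KERNEL_PAGES counter sketched earlier; the shared-buffer test and the
helper names are assumptions for illustration:

        static bool drm_gem_object_is_shared(struct drm_gem_object *obj)
        {
                /* exported via dma-buf or reachable through more than one handle */
                return obj->dma_buf || obj->handle_count > 1;
        }

        static void drm_gem_account_pages(struct drm_gem_object *obj,
                                          struct mm_struct *mm)
        {
                if (!drm_gem_object_is_shared(obj))
                        add_mm_counter(mm, MM_KERNEL_PAGES, obj->size >> PAGE_SHIFT);
        }

The matching uncharge would run from the handle-deletion path using the
same test.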

* Re: [RFC] Per file OOM badness
  2018-04-04  9:36                             ` Lucas Stach
  (?)
@ 2018-04-04  9:46                               ` Michel Dänzer
  -1 siblings, 0 replies; 170+ messages in thread
From: Michel Dänzer @ 2018-04-04  9:46 UTC (permalink / raw)
  To: Lucas Stach, Michal Hocko
  Cc: linux-kernel, amd-gfx, Christian.Koenig, linux-mm, dri-devel,
	Roman Gushchin

On 2018-04-04 11:36 AM, Lucas Stach wrote:
> Am Mittwoch, den 04.04.2018, 11:09 +0200 schrieb Michel Dänzer:
>> On 2018-03-26 04:36 PM, Lucas Stach wrote:
>>> Am Dienstag, den 30.01.2018, 11:28 +0100 schrieb Michal Hocko:
>>>> On Tue 30-01-18 10:29:10, Michel Dänzer wrote:
>>>>> On 2018-01-24 12:50 PM, Michal Hocko wrote:
>>>>>> On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
>>>>>>> On 2018-01-24 12:01 PM, Michal Hocko wrote:
>>>>>>>> On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
>>>>>>
>>>>>> [...]
>>>>>>>>> 2. If the OOM killer kills a process which is sharing BOs
>>>>>>>>> with another
>>>>>>>>> process, this should result in the other process dropping
>>>>>>>>> its references
>>>>>>>>> to the BOs as well, at which point the memory is released.
>>>>>>>>
>>>>>>>> OK. How exactly are those BOs mapped to the userspace?
>>>>>>>
>>>>>>> I'm not sure what you're asking. Userspace mostly uses a GEM
>>>>>>> handle to
>>>>>>> refer to a BO. There can also be userspace CPU mappings of the
>>>>>>> BO's
>>>>>>> memory, but userspace doesn't need CPU mappings for all BOs and
>>>>>>> only
>>>>>>> creates them as needed.
>>>>>>
>>>>>> OK, I guess you have to bear with me some more. This whole stack
>>>>>> is a
>>>>>> complete unknown. I am mostly after finding a boundary where you
>>>>>> can
>>>>>> charge the allocated memory to the process so that the oom killer
>>>>>> can
>>>>>> consider it. Is there anything like that? Except for the proposed
>>>>>> file
>>>>>> handle hack?
>>>>>
>>>>> How about the other way around: what APIs can we use to charge /
>>>>> "uncharge" memory to a process? If we have those, we can experiment
>>>>> with
>>>>> different places to call them.
>>>>
>>>> add_mm_counter() and I would add a new counter e.g. MM_KERNEL_PAGES.
>>>
>>> So is anyone still working on this? This is hurting us bad enough that
>>> I don't want to keep this topic rotting for another year.
>>>
>>> If no one is currently working on this I would volunteer to give the
>>> simple "just account private, non-shared buffers in process RSS" a
>>> spin.
>>
>> Sounds good. FWIW, I think shared buffers can also be easily handled by
>> accounting them in each process which has a reference. But that's more
>> of a detail, shouldn't make a big difference overall either way.
> 
> Yes, both options, either never accounting shared buffers or always
> accounting them in every process holding a reference, should be pretty
> easy. Where it gets hard is when trying to account the buffer only in
> the last process holding a reference or something like this.

FWIW, I don't think that would make sense anyway. A shared buffer is
actually used by all processes which have a reference to it, so it
should be accounted the same in all of them.
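
As a rough illustration of that point (sketch only: MM_KERNEL_PAGES and
the exact hook points are hypothetical), charging at GEM handle creation
and uncharging at handle deletion would naturally account a shared BO in
every process that holds a reference to it:

#include <drm/drm_gem.h>
#include <linux/mm.h>
#include <linux/sched.h>
#include <linux/types.h>

/* Charge / uncharge a BO against the mm of the process taking or
 * dropping a handle, so shared buffers show up in each referencing
 * process the same way. */
static void gem_handle_account(struct drm_gem_object *obj, bool charge)
{
	long nr_pages = obj->size >> PAGE_SHIFT;

	add_mm_counter(current->mm, MM_KERNEL_PAGES,
		       charge ? nr_pages : -nr_pages);
}

This would be called from somewhere around drm_gem_handle_create() and
the corresponding handle release path; where exactly is a detail to be
worked out.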


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 170+ messages in thread

* [RFC] Per file OOM badness
@ 2015-09-04 12:53 Christian König
  0 siblings, 0 replies; 170+ messages in thread
From: Christian König @ 2015-09-04 12:53 UTC (permalink / raw)
  To: lkml, dri-devel

Hello everyone,

I'm currently working on the issue that when device drivers allocate memory on
behalf of an application, the OOM killer usually doesn't know about it unless
the application also gets this memory mapped into its address space.

This is especially annoying for graphics drivers, where a lot of the VRAM
usually isn't CPU accessible, so it doesn't make sense to map it into the
address space of the process using it.

The problem is that when an application starts to use a lot of VRAM, those
buffer objects sooner or later get swapped out to system memory. But when we
then run into an out-of-memory situation, the OOM killer obviously doesn't know
anything about that memory and so usually kills the wrong process.

The following set of patches tries to address this problem by introducing a per
file OOM badness score, which device drivers can use to give the OOM killer a
hint about how many resources are bound to a file descriptor, so that it can
make better decisions about which process to kill.

So question at every one: What do you think about this approach?

My biggest concern right now is that the patches are messing with a core kernel
structure (adding a field to struct file). Any better ideas? I'm considering
putting a callback into file_ops instead.
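
To give an idea of the direction, a rough sketch of what such a file_ops
callback could look like on the driver side (the names, the per-file
tracking structure and the exact units are placeholders, not the actual
patches):

#include <linux/fs.h>
#include <linux/atomic.h>
#include <linux/mm.h>

/* Hypothetical per-open-file accounting the driver keeps up to date on
 * buffer allocation and destruction. */
struct drv_file_priv {
	atomic64_t bytes_allocated;
};

/* Report, in pages, how much memory is bound to this file descriptor so
 * the OOM killer can add it to the owning process' badness score. */
static long drv_oom_file_badness(struct file *f)
{
	struct drv_file_priv *priv = f->private_data;

	return atomic64_read(&priv->bytes_allocated) >> PAGE_SHIFT;
}

The OOM killer would then sum this value over all files a task has open
when computing its badness.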

Best regards and feel free to tear this idea apart,
Christian.

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 170+ messages in thread

end of thread, other threads:[~2018-04-04  9:46 UTC | newest]

Thread overview: 170+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-18 16:47 [RFC] Per file OOM badness Andrey Grodzovsky
2018-01-18 16:47 ` Andrey Grodzovsky
2018-01-18 16:47 ` Andrey Grodzovsky
2018-01-18 16:47 ` [PATCH 1/4] fs: add OOM badness callback in file_operatrations struct Andrey Grodzovsky
2018-01-18 16:47   ` Andrey Grodzovsky
2018-01-18 16:47   ` Andrey Grodzovsky
2018-01-18 16:47 ` [PATCH 2/4] oom: take per file badness into account Andrey Grodzovsky
2018-01-18 16:47   ` Andrey Grodzovsky
2018-01-18 16:47   ` Andrey Grodzovsky
2018-01-18 16:47 ` [PATCH 3/4] drm/gem: adjust per file OOM badness on handling buffers Andrey Grodzovsky
2018-01-18 16:47   ` Andrey Grodzovsky
2018-01-18 16:47   ` Andrey Grodzovsky
2018-01-19  6:01   ` Chunming Zhou
2018-01-19  6:01     ` Chunming Zhou
2018-01-19  6:01     ` Chunming Zhou
2018-01-18 16:47 ` [PATCH 4/4] drm/amdgpu: Use drm_oom_badness for amdgpu Andrey Grodzovsky
2018-01-18 16:47   ` Andrey Grodzovsky
2018-01-18 16:47   ` Andrey Grodzovsky
2018-01-30  9:24   ` Daniel Vetter
2018-01-30  9:24     ` Daniel Vetter
2018-01-30 12:42     ` Andrey Grodzovsky
2018-01-30 12:42       ` Andrey Grodzovsky
2018-01-30 12:42       ` Andrey Grodzovsky
2018-01-18 17:00 ` [RFC] Per file OOM badness Michal Hocko
2018-01-18 17:00   ` Michal Hocko
2018-01-18 17:00   ` Michal Hocko
2018-01-18 17:13   ` Michal Hocko
2018-01-18 17:13     ` Michal Hocko
2018-01-18 20:01     ` Eric Anholt
2018-01-19  8:20       ` Michal Hocko
2018-01-19  8:20         ` Michal Hocko
2018-01-19  8:20         ` Michal Hocko
2018-01-19  8:39         ` Christian König
2018-01-19  8:39           ` Christian König
2018-01-19  8:39           ` Christian König
2018-01-19  9:32           ` Michel Dänzer
2018-01-19  9:32             ` Michel Dänzer
2018-01-19  9:32             ` Michel Dänzer
2018-01-19  9:58             ` Christian König
2018-01-19  9:58               ` Christian König
2018-01-19  9:58               ` Christian König
2018-01-19 10:02               ` Michel Dänzer
2018-01-19 10:02                 ` Michel Dänzer
2018-01-19 10:02                 ` Michel Dänzer
2018-01-19 15:07                 ` Michel Dänzer
2018-01-19 15:07                   ` Michel Dänzer
2018-01-21  6:50                   ` Eric Anholt
2018-01-19 10:40           ` Michal Hocko
2018-01-19 10:40             ` Michal Hocko
2018-01-19 10:40             ` Michal Hocko
2018-01-19 11:37             ` Christian König
2018-01-19 11:37               ` Christian König
2018-01-19 11:37               ` Christian König
2018-01-19 12:13               ` Michal Hocko
2018-01-19 12:13                 ` Michal Hocko
2018-01-19 12:13                 ` Michal Hocko
2018-01-19 12:20                 ` Michal Hocko
2018-01-19 12:20                   ` Michal Hocko
2018-01-19 12:20                   ` Michal Hocko
2018-01-19 16:54                   ` Christian König
2018-01-19 16:54                     ` Christian König
2018-01-19 16:54                     ` Christian König
2018-01-23 11:39                     ` Michal Hocko
2018-01-23 11:39                       ` Michal Hocko
2018-01-23 11:39                       ` Michal Hocko
2018-01-19 16:48               ` Michel Dänzer
2018-01-19 16:48                 ` Michel Dänzer
2018-01-19 16:48                 ` Michel Dänzer
2018-01-19  8:35       ` Christian König
2018-01-19  8:35         ` Christian König
2018-01-19  6:01     ` He, Roger
2018-01-19  8:25       ` Michal Hocko
2018-01-19  8:25         ` Michal Hocko
2018-01-19 10:02         ` roger
2018-01-19 10:02           ` roger
2018-01-23 15:27   ` Roman Gushchin
2018-01-23 15:27     ` Roman Gushchin
2018-01-23 15:27     ` Roman Gushchin
2018-01-23 15:36     ` Michal Hocko
2018-01-23 15:36       ` Michal Hocko
2018-01-23 15:36       ` Michal Hocko
2018-01-23 16:39       ` Michel Dänzer
2018-01-23 16:39         ` Michel Dänzer
2018-01-23 16:39         ` Michel Dänzer
2018-01-24  9:28         ` Michal Hocko
2018-01-24  9:28           ` Michal Hocko
2018-01-24  9:28           ` Michal Hocko
2018-01-24 10:27           ` Michel Dänzer
2018-01-24 10:27             ` Michel Dänzer
2018-01-24 10:27             ` Michel Dänzer
2018-01-24 11:01             ` Michal Hocko
2018-01-24 11:01               ` Michal Hocko
2018-01-24 11:01               ` Michal Hocko
2018-01-24 11:23               ` Michel Dänzer
2018-01-24 11:23                 ` Michel Dänzer
2018-01-24 11:23                 ` Michel Dänzer
2018-01-24 11:50                 ` Michal Hocko
2018-01-24 11:50                   ` Michal Hocko
2018-01-24 11:50                   ` Michal Hocko
2018-01-24 12:11                   ` Christian König
2018-01-24 12:11                     ` Christian König
2018-01-30  9:31                     ` Daniel Vetter
2018-01-30  9:31                       ` Daniel Vetter
2018-01-30  9:31                       ` Daniel Vetter
2018-01-30  9:43                       ` Michel Dänzer
2018-01-30  9:43                         ` Michel Dänzer
2018-01-30  9:43                         ` Michel Dänzer
2018-01-30 10:40                         ` Christian König
2018-01-30 10:40                           ` Christian König
2018-01-30 10:40                           ` Christian König
2018-01-30 11:02                           ` Michel Dänzer
2018-01-30 11:02                             ` Michel Dänzer
2018-01-30 11:02                             ` Michel Dänzer
2018-01-30 11:28                             ` Christian König
2018-01-30 11:28                               ` Christian König
2018-01-30 11:34                               ` Michel Dänzer
2018-01-30 11:34                                 ` Michel Dänzer
2018-01-30 11:34                                 ` Michel Dänzer
2018-01-30 11:36                                 ` Nicolai Hähnle
2018-01-30 11:36                                   ` Nicolai Hähnle
2018-01-30 11:36                                   ` Nicolai Hähnle
2018-01-30 11:42                                   ` Michel Dänzer
2018-01-30 11:42                                     ` Michel Dänzer
2018-01-30 11:42                                     ` Michel Dänzer
2018-01-30 11:56                                     ` Christian König
2018-01-30 11:56                                       ` Christian König
2018-01-30 11:56                                       ` Christian König
2018-01-30 15:52                                       ` Michel Dänzer
2018-01-30 15:52                                         ` Michel Dänzer
2018-01-30 15:52                                         ` Michel Dänzer
2018-01-30 10:42                         ` Daniel Vetter
2018-01-30 10:42                           ` Daniel Vetter
2018-01-30 10:42                           ` Daniel Vetter
2018-01-30 10:48                           ` Michel Dänzer
2018-01-30 10:48                             ` Michel Dänzer
2018-01-30 10:48                             ` Michel Dänzer
2018-01-30 11:35                             ` Nicolai Hähnle
2018-01-30 11:35                               ` Nicolai Hähnle
2018-01-30 11:35                               ` Nicolai Hähnle
2018-01-24 14:31                   ` Michel Dänzer
2018-01-24 14:31                     ` Michel Dänzer
2018-01-24 14:31                     ` Michel Dänzer
2018-01-30  9:29                   ` Michel Dänzer
2018-01-30  9:29                     ` Michel Dänzer
2018-01-30  9:29                     ` Michel Dänzer
2018-01-30 10:28                     ` Michal Hocko
2018-01-30 10:28                       ` Michal Hocko
2018-01-30 10:28                       ` Michal Hocko
2018-03-26 14:36                       ` Lucas Stach
2018-03-26 14:36                         ` Lucas Stach
2018-03-26 14:36                         ` Lucas Stach
2018-04-04  9:09                         ` Michel Dänzer
2018-04-04  9:09                           ` Michel Dänzer
2018-04-04  9:09                           ` Michel Dänzer
2018-04-04  9:36                           ` Lucas Stach
2018-04-04  9:36                             ` Lucas Stach
2018-04-04  9:36                             ` Lucas Stach
2018-04-04  9:46                             ` Michel Dänzer
2018-04-04  9:46                               ` Michel Dänzer
2018-04-04  9:46                               ` Michel Dänzer
2018-01-19  5:39 ` He, Roger
2018-01-19  8:17   ` Christian König
2018-01-19  8:17     ` Christian König
2018-01-19  8:17     ` Christian König
2018-01-22 23:23 ` Andrew Morton
2018-01-22 23:23   ` Andrew Morton
2018-01-22 23:23   ` Andrew Morton
2018-01-23  1:59   ` Andrey Grodzovsky
2018-01-23  1:59     ` Andrey Grodzovsky
  -- strict thread matches above, loose matches on Subject: below --
2015-09-04 12:53 Christian König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.